
FUCKING ENCODING SHITHEADS!
WHY ISN'T UTF-8 THE FUCKING DEFAULT EVERYWHERE!
§¥~@#&•…≈!

Comments
  • 0
    I think it has something to do with storage and the fact that there are billions of characters out there in all kinds of languages. Not sure though, but I've read somewhere that UTF-8 reserves 2 to 4 bytes for each character. And UTF-16 even more. Something with databases? But to be honest: I haven't got a clue.

    But everything exists for a reason and everything has pros and cons. Is there a charset expert in the house?
  • 2
    @kanduvisla UTF-8 has variable character length. ASCII characters take one byte. Extended Latin characters like 'čřá' take two bytes. Most Asian characters take three, and emoji and other rare characters take four. The original design allowed up to six bytes, but it's capped at four nowadays. So depending on what you write, there is some overhead, and processing is a little harder due to the variable length, but it's nice from a backward-compatibility point of view. UTF-16 uses two bytes all the time, even for 7-bit ASCII (and four, via surrogate pairs, for anything outside the Basic Multilingual Plane). UTF-32 always uses four. So for some Asian languages UTF-16 might be more effective, as it uses less space and is easier to process, and maybe there's a case for UTF-32 as well. And old encodings like ISO8859-[2-...] managed to squeeze all the characters people cared about into one byte. (The sketch below the thread shows the actual byte counts.)
  • 0
    @miska thanks! By the way, reading your comment it looks like the devRant database has some encoding issues as well. 😁
  • 1
    @kanduvisla I just put in examples from my native language :-) Those are real characters. I don't have the means to type kanji :-D I would have to search and copy-paste.
  • 3
    7-bit ASCII or die.
  • 3
    Why isn't UTF-16 the standard? Stop oppressing the Mandarin and Cyrillic typefaces, shitlord.
  • 0
    Using Python? :p
  • 0
    @hiestaa Sometimes. I guess I need to switch to Python 3 (see the second sketch below).
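
A quick Python 3 sketch of the byte counts @miska describes above; the sample characters are my own picks (one ASCII, one extended Latin, one CJK, one emoji):

    # Compare how many bytes the same character costs in each encoding.
    # The "-le" variants skip the 2-byte byte-order mark, so the counts
    # reflect the characters alone.
    for ch in ["A", "č", "中", "😁"]:
        sizes = {enc: len(ch.encode(enc))
                 for enc in ("utf-8", "utf-16-le", "utf-32-le")}
        print(ch, sizes)
    # "A" costs 1/2/4 bytes, "č" 2/2/4, "中" 3/2/4, "😁" 4/4/4:
    # UTF-8 wins for ASCII, UTF-16 wins for CJK, and even UTF-16 needs
    # four bytes (a surrogate pair) outside the Basic Multilingual Plane.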
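And on the Python 2 vs. 3 exchange, a minimal sketch of what changed (the file name "rant.txt" is made up for the example):

    # Python 3: str is Unicode everywhere and encode() defaults to UTF-8.
    s = "čřá 中 😁"
    assert s.encode() == s.encode("utf-8")

    # open() still falls back to the platform's locale encoding, so pass
    # encoding="utf-8" explicitly to get the same bytes on every OS.
    with open("rant.txt", "w", encoding="utf-8") as f:
        f.write(s)

    # Python 2's implicit codec was ASCII, which is where the classic
    # UnicodeDecodeError rants came from.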