
On Sat, Aug 03, 2019 at 06:18:51PM -0400, D. Hugh Redelmeier via talk wrote:
https://news.ycombinator.com/item?id=20600195
There are so many hairy details!
UTF-8 gets a bit less coverage since it has fewer hairy details.
From this I learned that Java and JavaScript now have optimizations to use Latin-1 when they can. Normally they use UTF-16 (originally UCS-2). I take it that using Latin-1 is an opportunistic optimization hidden from the program. I don't think Python 3 uses this.
I think that Linux does this right and needs no such hack: just use UTF-8. Of course Java, JavaScript, Python 2, and Python 3 on Linux don't get it right.
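The space saving behind that Latin-1 trick is easy to demonstrate (a quick Python sketch; the byte counts are the same whatever language does the compaction): a string whose code points all fit at or below U+00FF takes one byte per character in Latin-1 but two in UTF-16.

```python
# Every code point in "héllo" is <= U+00FF, so Latin-1 stores it
# in one byte per character; UTF-16 always spends at least two.
compact = "héllo"
print(len(compact.encode("latin-1")))    # 5 bytes
print(len(compact.encode("utf-16-le")))  # 10 bytes

# Outside Latin-1's range the compact form is impossible,
# which is why the optimization has to be opportunistic:
try:
    "h\N{SNOWMAN}".encode("latin-1")
except UnicodeEncodeError:
    print("U+2603 does not fit in Latin-1")
```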
UTF-8 just makes much more sense: backwards compatible with ASCII, no endianness issues, stupidly simple. It just makes sense. 16-bit characters are just all sorts of pain. :) Of course, given who invented UTF-8, it is no wonder it is brilliant and simple. -- Len Sorensen
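The first two points are easy to check (again a Python sketch, though the bytes are the same in any language): ASCII text is byte-for-byte valid UTF-8, while UTF-16 output depends on a byte-order decision that UTF-8 never has to make.

```python
text = "plain ASCII"
# Backwards compatible: ASCII bytes *are* UTF-8 bytes.
assert text.encode("utf-8") == text.encode("ascii")

# UTF-16 comes in two byte orders; UTF-8 has only one form.
print("é".encode("utf-16-be").hex())  # 00e9
print("é".encode("utf-16-le").hex())  # e900
print("é".encode("utf-8").hex())      # c3a9, identical on every machine
```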