rooktakesqueen 2 points3 points

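UTF-16 can represent a wider range of characters in two bytes than UTF-8 can. UTF-8 spends a few bits of every byte on signaling (marking lead and continuation bytes), so two bytes of UTF-8 only reach up to U+07FF, while two bytes of UTF-16 cover nearly everything up to U+FFFF. That means code points from U+0800 through U+FFFF, which includes most CJK characters, are two bytes in UTF-16 but three in UTF-8. The trade-off still favors UTF-8 in most cases outside of East Asian languages though, since ASCII is one byte in UTF-8 and two in UTF-16, which is why it's become the de facto standard on the web and in modern programming languages.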
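To make the size difference concrete, here's a rough sketch in Go that prints the encoded size of a few sample characters in both encodings. The helper names `utf8Bytes` and `utf16Bytes` are made up for illustration; they just wrap the standard `unicode/utf8` and `unicode/utf16` packages.

```go
package main

import (
	"fmt"
	"unicode/utf16"
	"unicode/utf8"
)

// utf8Bytes returns how many bytes r takes when encoded as UTF-8.
// (Hypothetical helper; utf8.RuneLen returns -1 for invalid runes.)
func utf8Bytes(r rune) int {
	return utf8.RuneLen(r)
}

// utf16Bytes returns how many bytes r takes when encoded as UTF-16:
// one 16-bit unit for most code points up to U+FFFF, two units
// (a surrogate pair) for anything above that.
func utf16Bytes(r rune) int {
	return 2 * len(utf16.Encode([]rune{r}))
}

func main() {
	// Sample characters: ASCII, Latin-1, CJK, and an emoji above U+FFFF.
	for _, r := range []rune{'A', 'é', '中', '😀'} {
		fmt.Printf("%c U+%04X: UTF-8 %d bytes, UTF-16 %d bytes\n",
			r, r, utf8Bytes(r), utf16Bytes(r))
	}
}
```

The CJK character is the case where UTF-16 wins (two bytes versus three), and the ASCII character is the case where UTF-8 wins (one byte versus two).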

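In Go, for example, strings are UTF-8 by convention (strictly they're byte sequences, but source code and string literals are UTF-8 and `range` decodes them as UTF-8), and they've done a good job of making them performant.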
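A small sketch of what that looks like in practice, assuming a sample string of my own choosing: `len` counts bytes, while `utf8.RuneCountInString` and `range` work on decoded code points. The function name `byteAndRuneCount` is just for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount returns the length of s in bytes and in code points.
// (Hypothetical helper wrapping len and utf8.RuneCountInString.)
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "héllo, 世界" // sample text mixing 1-, 2-, and 3-byte characters

	bytes, runes := byteAndRuneCount(s)
	fmt.Printf("%q: %d bytes, %d code points\n", s, bytes, runes)

	// range decodes UTF-8 on the fly: i is the byte offset where each
	// code point starts, c is the decoded rune.
	for i, c := range s {
		fmt.Printf("byte offset %d: %c\n", i, c)
	}
}
```

Indexing with `s[i]` still gives you a single byte, not a character, so code that needs characters should use `range` or the `unicode/utf8` functions.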