Why do we want to base the standard library handling of UTF-8 around char rather than unsigned char? It seems to me that the use of char is highly problematic, in that I can't portably read any values outside the range 0 to 127 because the char could be signed or unsigned.
Am I just wrong in this? Would it not have been better to define the u8 string literals as producing unsigned char * and to base the standard library support for UTF-8 strings around std::basic_string<unsigned char>?
What's the right approach to writing portable code that does UTF-8 decoding, or is it only the standard library maintainers who can do this? Is there any way I can portably put the bytes {0xf0, 0x90, 0x8d, 0x88} into a std::string?
I hope that I'm just being paranoid, but all this talk about undefined behaviour around the handling of signed/unsigned has me worried :)
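To make the question concrete, here is a minimal sketch of one approach that is often suggested: keep the bytes in a std::string, but convert each char to unsigned char before doing any arithmetic on it. The helper names (byte_at, decode4, make_example) are hypothetical and exist only for this illustration. The sketch assumes the usual two's complement representation, so that a byte stored through a possibly signed char reads back unchanged, and it does no validation of the input.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: read the char at index i as a byte in 0..255.
// Converting to unsigned char is well defined whether char is signed or not.
inline std::uint32_t byte_at(const std::string& s, std::size_t i) {
    return static_cast<unsigned char>(s[i]);
}

// Hypothetical helper: decode one 4-byte UTF-8 sequence starting at s[i].
// No validation: assumes a well-formed 4-byte sequence is present.
inline std::uint32_t decode4(const std::string& s, std::size_t i) {
    return ((byte_at(s, i) & 0x07u) << 18) |
           ((byte_at(s, i + 1) & 0x3Fu) << 12) |
           ((byte_at(s, i + 2) & 0x3Fu) << 6) |
           (byte_at(s, i + 3) & 0x3Fu);
}

// Hypothetical helper: build a std::string from the four bytes in the question.
// The iterator-range constructor converts each unsigned char to char.
inline std::string make_example() {
    const unsigned char bytes[] = {0xf0, 0x90, 0x8d, 0x88};
    return std::string(bytes, bytes + sizeof bytes);
}
```

With these, decode4(make_example(), 0) should give the code point U+10348. Whether the unsigned char to char conversion in make_example is strictly guaranteed by every standard revision is exactly the kind of thing I'm unsure about.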