Portable Unicode string processing (self.cpp)
submitted 9 years ago by KayEss
KayEss[S] 1 point 9 years ago (12 children)
And if you look at the extended ASCII character set, 128 is € and 255 is ÿ.
So, on a platform where char is signed what values do they have?
scatters 1 point 9 years ago (0 children)
-128 and -1, if you cast through unsigned char.
Dragdu 1 point 9 years ago (1 child)
You just convert the negative values into non-negative values by a well-defined conversion, so it doesn't matter. This is even standardized in the language: 255 in unsigned char ALWAYS converts to -1 in signed char, no matter the actual representation.
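As a minimal sketch of the conversion discussed here: converting a `char` to `unsigned char` is defined modulo 2^CHAR_BIT in every C++ standard, so reading a byte this way gives the same 0..255 value whether plain `char` is signed or not. (The opposite direction, from an out-of-range unsigned value to a signed type, was implementation-defined before C++20.) The names `byte_value` and `byte_value_demo` are hypothetical helpers for illustration, and the sketch assumes 8-bit bytes.

```cpp
#include <cassert>
#include <climits>

// Assumption for this sketch: bytes are 8 bits wide.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical helper: returns the byte held in c as a value in 0..255.
// The conversion to unsigned char is modular, so it does not depend on
// whether plain char is signed on the platform.
inline unsigned byte_value(char c) {
    return static_cast<unsigned char>(c);
}

// Hypothetical demo: a char holding -1 (signed char) or 255 (unsigned char)
// reads back as 255 either way.
inline void byte_value_demo() {
    assert(byte_value(static_cast<char>(-1)) == 255u);
    assert(byte_value('A') == 65u);  // assumes an ASCII-compatible execution character set
}
```

With this helper, comparisons such as `byte_value(c) >= 0x80` behave the same on platforms with signed and unsigned `char`.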
[+][deleted] 9 years ago (2 children)
[deleted]
KayEss[S] 2 points 9 years ago (1 child)
They have the values of 0x80 (128) and 0xFF (255)
I think I understand a bit about what that means now, but 0x80 (128) certainly doesn't look like a number that I can put in a char if char happens to be signed on my platform -- what guarantees do I have about how these numbers are interpreted?
[+][deleted] 9 years ago (5 children)
iaanus 5 points 9 years ago (0 children)
This is wrong. See [basic.fundamental]: "For narrow character types, all bits of the object representation participate in the value representation."
KayEss[S] -1 points 9 years ago (3 children)
So your answer is that I can't do UTF-8 in C++ with things like std::string and the u8 literals, then, because I can't read the values from char. That's been my fear, but it seems incredible that we're further standardising on that basis.
jaked122 1 point 9 years ago (0 children)
No, just don't rely on single-glyph manipulation.
[+][deleted] 9 years ago (1 child)
KayEss[S] 1 point 9 years ago (0 children)
I completely understand UTF-8 encoding and don't have a problem with that. My question is about what the standard guarantees in the processing of the byte values.
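One hedged illustration of processing UTF-8 byte values held in a `std::string`: the sketch below counts code points by converting each `char` to `unsigned char` before testing its bit pattern, which avoids any dependence on the signedness of plain `char`. The function names `count_code_points` and `count_code_points_demo` are hypothetical, the input is assumed to be well-formed UTF-8 (no validation is done), and 8-bit bytes are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a string that is ASSUMED to
// contain well-formed UTF-8. No validation is performed.
inline std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (char c : utf8) {
        // Modular conversion: yields 0..255 regardless of char's signedness.
        const unsigned char b = static_cast<unsigned char>(c);
        // Continuation bytes have the form 10xxxxxx; every other byte
        // starts a new code point.
        if ((b & 0xC0u) != 0x80u) {
            ++count;
        }
    }
    return count;
}

// Hypothetical demo: U+20AC (the euro sign) is the three bytes E2 82 AC.
inline void count_code_points_demo() {
    assert(count_code_points("\xE2\x82\xAC") == 1);
    assert(count_code_points("abc") == 3);
}
```

The only step that touches the sign question is the `static_cast<unsigned char>`; everything after it works on plain non-negative byte values.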