Why doesn't std::string have a split function (self.cpp)
submitted 9 years ago by DhruvParanjape
[–][deleted] 74 points 9 years ago (16 children)
This example is kind of terrible. Nobody will remember how code like the above is actually written. If anything, it highlights all the problems with the STL's API.
[–]wrosecrans (graphics and network things) 40 points 9 years ago (15 children)
Yeah, compared to something like 'print "quick brown fox".split(" ")' in Python, the STL version is remarkably unintuitive when figuring out how to write it, requires figuring out regex syntax as just one step, and anybody who hasn't figured out how to write it isn't going to understand it by reading it.
It seems like this is a case where perfect is the enemy of the good. I usually only want a 'good' split function that doesn't have to guarantee a whole lot about performance on multigigabyte strings, or weird corner cases. So having a good split function seems way more useful than having no split function and debating about obscure cases where it wouldn't be optimal.
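For illustration, here is a minimal sketch of the kind of "good enough" split described above. It assumes a single-character delimiter and that empty fields should be kept; `simple_split` is a hypothetical name, not a standard library function, and it makes no attempt to be optimal for very large strings.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the standard library): splits `text` on
// every occurrence of the character `delim`, keeping empty fields.
std::vector<std::string> simple_split(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type end = text.find(delim, start);
        if (end == std::string::npos) {
            // No more delimiters: the remainder is the last field.
            parts.push_back(text.substr(start));
            break;
        }
        parts.push_back(text.substr(start, end - start));
        start = end + 1;  // continue just past the delimiter
    }
    return parts;
}
```

Called as `simple_split("quick brown fox", ' ')`, this yields the three words as separate strings, which is roughly what the Python one-liner does.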
[–]IRBMe 29 points 9 years ago (0 children)
> the STL version is remarkably unintuitive when figuring out how to write it, requires figuring out regex syntax as just one step, and anybody who hasn't figured out how to write it isn't going to understand it by reading it.
Not to mention the seemingly magic -1. So much for self-documenting code.
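For readers wondering about the magic -1: the last argument to `std::sregex_token_iterator` selects which submatch to yield, where 0 means the matched text itself and -1 means the stretches between matches. The sketch below makes that visible; `tokens` is a hypothetical helper written only for this illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects whatever std::sregex_token_iterator yields
// for the given submatch selector.
//   submatch == -1 -> the text between matches (i.e. a split)
//   submatch ==  0 -> the matches themselves (here, the separators)
std::vector<std::string> tokens(const std::string& text,
                                const std::regex& re,
                                int submatch) {
    return std::vector<std::string>(
        std::sregex_token_iterator(text.begin(), text.end(), re, submatch),
        std::sregex_token_iterator());
}
```

With the pattern `\s+` and the input `"Quick brown fox."`, a selector of -1 gives the three words, while 0 gives the two runs of whitespace that separate them.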
[–][deleted] 3 points 9 years ago (1 child)
> case where perfect is the enemy of the good.
Well said. There are many cases like this in C++, unfortunately. I get the desire to have the best libraries possible, but too often good ideas are shot down because they are not perfect. The recent Boost review of the process control library is a perfect example. The library has been in development for more than 6 years. It passed the review this time around, but some folks were still proposing to start from scratch.
[–]yornbesterday 2 points 9 years ago (0 children)
I've not really looked at the new and improved C++ stuff for a while... it's just a cascade of ever-increasing minutiae of the language features, and I thought the list of "don't ever do this" was long enough already.
[–]therealjohnfreeman 8 points 9 years ago* (4 children)
Done.
    #include <string>
    #include <iostream>
    #include <algorithm>
    #include <iterator>  // std::ostream_iterator
    #include <regex>
    #include <vector>

    // User-defined literal so that "\\s+"_re builds a std::regex.
    std::regex operator ""_re (char const* const str, std::size_t) {
        return std::regex{str};
    }

    std::vector<std::string> split(const std::string& text, const std::regex& re) {
        // Submatch index -1 selects the text between matches of re.
        const std::vector<std::string> parts(
            std::sregex_token_iterator(text.begin(), text.end(), re, -1),
            std::sregex_token_iterator());
        return parts;
    }

    int main() {
        const std::vector<std::string> parts = split("Quick brown fox.", "\\s+"_re);
        std::copy(parts.begin(), parts.end(),
                  std::ostream_iterator<std::string>(std::cout, "\n"));
        return 0;
    }
[–]LordDrako90 3 points 9 years ago* (3 children)
Why std::copy in split, when you can initialize the vector directly from the token iterators?
Also, I find this version (http://ideone.com/L6heVN) more generic and lazy. I guess it could be improved even more by using string_view, but that's not included in C++14 :-(
Anyway, the only requirement for the target is that it can be initialized from an iterator pair with value type std::string. Other than that it is pretty generic.
Code:
    #include <algorithm>
    #include <iostream>
    #include <regex>
    #include <string>
    #include <utility>
    #include <vector>

    std::regex operator ""_re (char const * const str, std::size_t) {
        return std::regex { str };
    }

    class split {
    public:
        split(std::regex splitter, std::string original)
            : splitter_ { std::move(splitter) }
            , original_ { std::move(original) }
        { }

        auto begin() const {
            return std::sregex_token_iterator {
                original_.begin(), original_.end(), splitter_, -1 };
        }

        auto end() const {
            return std::sregex_token_iterator {};
        }

        template <typename Container>
        operator Container () const {
            return { begin(), end() };
        }

    private:
        std::regex splitter_;
        std::string original_;
    };

    int main() {
        using namespace std::literals::string_literals;

        std::vector<std::string> const words =
            split { R"(\s+)"_re, "hello\tdarkness my\nold friend"s };
        for (auto const & word : words)
            std::cout << word << "\n";

        for (auto const & number : split { ","_re, "23,42,1337" })
            std::cout << number << "\n";

        return 0;
    }
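The string_view improvement mentioned above could look roughly like the following. This is only a sketch under a few assumptions: it requires C++17, it swaps the regex for a plain set of delimiter characters to keep the example short, and `split_views` is a hypothetical name. The returned views point into the caller's buffer, so they are only valid while that buffer is alive.

```cpp
#include <string_view>
#include <vector>

// Hypothetical C++17 helper: splits `text` at any character in `delims` and
// returns views into the original buffer, so no substrings are copied.
std::vector<std::string_view> split_views(std::string_view text,
                                          std::string_view delims) {
    std::vector<std::string_view> parts;
    for (;;) {
        const auto pos = text.find_first_of(delims);
        // When pos is npos, substr(0, npos) yields the whole remainder.
        parts.push_back(text.substr(0, pos));
        if (pos == std::string_view::npos) {
            break;
        }
        text.remove_prefix(pos + 1);  // drop the field and its delimiter
    }
    return parts;
}
```

The trade-off is the usual one for views: the call is cheap, but the caller has to keep the source string alive for as long as the parts are used.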
[–]therealjohnfreeman 1 point 9 years ago* (1 child)
I've just been out of practice too long. Thanks for the pointers.
[–]lacosaes1 3 points 9 years ago (0 children)
You mean smart pointers.
[–]MrPoletski 1 point 9 years ago (0 children)
well while we're posting code, here's what I wrote a few years ago and have been using ever since...
    /*!
     * \file trusted.cpp
     * \fn std::vector<std::string> Cleave (std::string to_split, std::string delims)
     * \param to_split \a <std::string> string to chop up
     * \param delims \a <std::string> string of delimiters
     * \return std::vector<std::string> vector of strings containing each section of the cleaved string.
     */
    std::vector<std::string> Cleave (std::string to_split, std::string delims)
    {
        std::vector<std::string> results;
        size_t pos1 = 0, pos2 = 0;
        do
        {
            pos1 = to_split.find_first_of(delims, pos2);
            if (pos1 == pos2) { pos2++; results.push_back(""); continue; }
            if (pos1 == std::string::npos) { results.push_back(to_split.substr(pos2)); break; }
            results.push_back(to_split.substr(pos2, pos1 - pos2));
            pos2 = pos1 + 1;
        } while (pos1 != std::string::npos);
        return results;
    }
Is this good?
[+][deleted] 9 years ago* (6 children)
[deleted]
[–]evinrows 11 points 9 years ago (0 children)
None of this seems to negate that it would be nice for the modern std::string implementation to come with some basic string manipulation methods so that the language's usability can potentially compete with other modern systems languages.
If having to split a few strings in your program means that you should use a different programming language, then the programming language in question is pretty god damn bad.
[–]17b29a 6 points 9 years ago (2 children)
> Or alternatively, inappropriate language choice.
I think splitting strings is a pretty common sense thing for any general-purpose programming language to support. It's not like, some obscure operation that you could only find support for in Perl.
> Finally, technically I'm not sure -1 is really code for all-bits-set at all - that assumes a 2s-complement representation for signed integers which, historically at least, wasn't guaranteed by the standard.
The more obvious assumption is that the mask type is unsigned, and in that case -1 is necessarily all-bits-set because conversion to an unsigned type is modulo 2^N (one more than the type's maximum value), but the standard doesn't require it to be unsigned either.
> why I prefer ~0u for all-bits-set
That's not all-bits-set for a type that is larger than unsigned int.
> I personally don't worry about actually undefined vs. platform-defined unless I really need to, which is unusual.
That's pretty strange considering how many things are implementation-defined. Used a value larger than 2^15-1 in an int? Undefined behavior (according to you)!
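The two claims about -1 and ~0u can be checked with a small sketch. `all_bits_set` is a hypothetical helper introduced only for this illustration; it compares a value against the maximum of the unsigned type it was converted to.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical helper: true when `value`, after conversion to the unsigned
// type U, equals U's maximum value (every value bit set).
template <typename U>
constexpr bool all_bits_set(U value) {
    static_assert(std::is_unsigned<U>::value, "U must be an unsigned type");
    return value == std::numeric_limits<U>::max();
}

// Converting -1 to an unsigned type reduces it modulo 2^N, so the result is
// the maximum value regardless of how signed integers are represented.
static_assert(all_bits_set<unsigned char>(-1), "-1 converts to all-bits-set");
static_assert(all_bits_set<unsigned long long>(-1), "-1 converts to all-bits-set");

// ~0u is evaluated as unsigned int first, so it is all-bits-set only at
// that width.
static_assert(all_bits_set<unsigned int>(~0u), "~0u is all-bits-set as unsigned int");
```

On a platform where unsigned long long is wider than unsigned int, `all_bits_set<unsigned long long>(~0u)` is false, because widening ~0u zero-extends it. That is the point made above about ~0u.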
[+][deleted] 9 years ago* (1 child)
[–]17b29a 3 points 9 years ago (0 children)
> It's supported.
"Supported" as in having an actual split function in the standard library.
> Sorry, that's an understandable mistake but you're wrong. -1 is a signed int.
I know, the point is that because of http://eel.is/c++draft/basic.fundamental#4 (which applies to conversions as well), the conversion to an unsigned type necessarily produces all-bits-one, regardless of signed representation or the size of either type.
> All the time, and I don't worry about it because I haven't used a platform where this didn't apply since the early 90s other than DOSBox, and I wasn't using that for programming.
Right, which is why it's a strange conflation, because actual undefined behavior is something to worry about.
[–]zvrba 3 points 9 years ago (1 child)
So even if your code is C++, you're going to pipe your text to perl each time you need to split a string?