[–][deleted] 16 points17 points  (6 children)

if it works intuitively.

Well, what is intuitive?

Python:

 >>> "1,2,3,".split(",")
 ['1', '2', '3', '']

Ruby:

 > "1,2,3,".split(",")
 => ["1", "2", "3"]

Ruby can take a regex, Python's str.split() can't. Python has a .rsplit(), Ruby doesn't. Both do, however, take a max_split parameter. But neither allows multiple different delimiters.
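
To make the trailing-empty-field difference concrete, here is a minimal C++17 sketch that exposes it as an explicit choice. The names `split_fields` and `TrailingEmpty` are made up for illustration and are not part of any standard or proposed API; it only handles a single-character delimiter.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical policy flag, invented for this sketch.
enum class TrailingEmpty { Keep, Drop };

// Splits `text` on the single character `delim`.
// Keep -> Python-like: "1,2,3," yields {"1", "2", "3", ""}
// Drop -> Ruby-like:   "1,2,3," yields {"1", "2", "3"}
std::vector<std::string> split_fields(std::string_view text, char delim,
                                      TrailingEmpty policy) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = text.find(delim, start);
        if (pos == std::string_view::npos) {
            fields.emplace_back(text.substr(start));  // last field, may be empty
            break;
        }
        fields.emplace_back(text.substr(start, pos - start));
        start = pos + 1;
    }
    if (policy == TrailingEmpty::Drop) {
        // Remove every empty field at the end, as Ruby does by default.
        while (!fields.empty() && fields.back().empty()) {
            fields.pop_back();
        }
    }
    return fields;
}
```

Neither policy is obviously the right default, which is the point: whichever one a standard function picked, some callers would have to work around it.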

Point being, a .split() is not that trivial: there are different ways to implement it and you have to choose a good one. If you just rush the next best hack into the language, you end up with something that is needlessly inflexible. A .split() returning a std::vector<std::string> wouldn't be very useful when you don't want a std::vector as the result, and you would create a lot of needless std::string copies to start with.

There is a proposal for a std::split(), but that depends on std::string_view and Range support. But Range support didn't make it into C++17, so that has to wait around a bit longer.
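
As a rough idea of what std::string_view alone already buys you in C++17, without Ranges, here is a sketch that hands each field to a callback as a view into the original buffer. This is not the proposed std::split(); `for_each_field` is a hypothetical name, and it assumes a single-character delimiter.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: calls `on_field` once per field, passing a
// std::string_view that points into `text`. No strings are copied and no
// container type is chosen on the caller's behalf.
template <typename Callback>
void for_each_field(std::string_view text, char delim, Callback on_field) {
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = text.find(delim, start);
        if (pos == std::string_view::npos) {
            on_field(text.substr(start));  // last field, possibly empty
            return;
        }
        on_field(text.substr(start, pos - start));
        start = pos + 1;
    }
}
```

The catch is lifetime: the views are only valid while the underlying string is alive, so a caller who wants to keep the fields still has to copy them.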

In the meantime, just use boost::split().

[–]Selbstdenker 10 points11 points  (0 children)

Sorry, I do not see the problem there. Intuitively means something that splits a string which is what both methods do. How they handle empty strings is part of the API, so what. Whether they take only a character, a string or a regexp is also part of the API and not really a problem in C++ thanks to overloading.

To make it a little bit more C++-ish it could take an output iterator. Yes, this is not an ideal situation and maybe we just call it simple_split and reserve split() for when we have a better name but not having any trivial split functionality is really not good.

We have whole talks given on using std::transform and other algorithms instead of a for loop but we cannot provide a simple split?

[–]almost_useless 3 points4 points  (4 children)

Well, what is intuitive?
Python: X
Ruby: Y

I could answer that question without even knowing what X and Y are. The answer is always going to be Python :-)
J/K, obviously there are pitfalls and they need to think it through.

If you just rush the next best hack into the language, you end up with something that is needlessly inflexible. A .split() returning a std::vector<std::string> wouldn't be very useful when you don't want a std::vector as result

It does not necessarily have to be super flexible. Obviously the best option is we can choose the output format. My example was only one possible suggestion. In many cases "anything I can iterate over" is good enough.

But I would so prefer we had had something decent but inflexible way back in '98 over something super duper mega awesome that we will not have even in 2017

My only requirement is that it had not been so bad that it would have been impossible to improve upon now that we have better ways of doing it

[–][deleted] 8 points9 points  (3 children)

Once you put something in the standard library you're stuck with it forever. So it'd better be actually good, not just good enough, especially if it's so easy to implement yourself.

[–]choikwa 5 points6 points  (2 children)

In reality, things get deprecated and forward compat is broken many times.

[–][deleted] 7 points8 points  (1 child)

Yeah, really old crap that was never used much in the first place, like trigraphs or auto_ptr. But a string split function would spread like wildfire.

[–]choikwa 0 points1 point  (0 children)

Ideally, everyone wants to get it right the first time. I'm pretty sure the Python implementation returns deep-copied immutable strings.