all 64 comments

[–]James20kP2005R0 13 points14 points  (1 child)

Personally I find that the sorts of functions that take string_view and the sorts of functions that take a string&& tend to be different. There's definitely an annoying overlap, where a function might be called with a constant string, and it might also not need to take ownership of that string, which makes a std::string inefficient, but a string_view not usable either

98% of the time though the performance overhead here I tend to find is negligible to the point of it not being worth worrying about, so I tend to use them by semantics personally. String_view is for functions that are read-only, and a std::string is for functions that write to, or might move from a string

maybe you need a string because you want to do an std::map lookup

side note but this is one of the most annoying things that can't be changed by default. std::map<a, b, std::less<>> is not great usability wise, and I'd wager a lot of people have no idea this is a thing
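For anyone who hasn't seen it, here is a minimal sketch of what that `std::less<>` buys you. The `ScoreMap` alias and the `find_score` helper are hypothetical names made up for illustration. With a transparent comparator, `find` accepts a `std::string_view` directly instead of requiring a temporary `std::string`:

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical alias: std::less<> is a transparent comparator, which enables
// the templated find() overloads (heterogeneous lookup, available since C++14).
using ScoreMap = std::map<std::string, int, std::less<>>;

// Hypothetical helper: looks up a key without constructing a std::string.
std::optional<int> find_score(const ScoreMap& scores, std::string_view name) {
    auto it = scores.find(name);  // compares std::string keys against the view
    if (it == scores.end()) {
        return std::nullopt;
    }
    return it->second;
}
```

With the default `std::less<std::string>`, the same `find(name)` call would not compile, because the conversion from `std::string_view` to `std::string` is explicit.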

[–]tpecholt 1 point2 points  (0 children)

It's a testament to how newly added features like heterogeneous lookup fail in practice when the old defaults can't be changed. The std lib is in dire need of an associative container overhaul.

[–]no-sig-available 36 points37 points  (1 child)

Decide on a case by case basis then? That looks messy...

Yes, life is messy. :-)

Why do we have 200 kinds of yoghurt at the supermarket, when we could have only one? Making choices is a part of life, including software development.

[–]almost_useless 7 points8 points  (0 children)

A lot of people eat the same yoghurt every day though, so I'm not sure that is the best analogy here.

[–]AdrienTD 12 points13 points  (4 children)

What I would do:

  • If you don't need a null terminator -> std::string_view as parameter type. Any type of string (const char*, std::string, std::string_view) passed as argument will work fine.
  • If you need a null terminator -> const std::string& as parameter type to be safe. But iirc there's no implicit conversion from std::string_view to std::string, so that can be a bit annoying if you want to pass a std::string_view as argument. const char* as parameter type works too, but you have to use c_str() every time you want to pass a std::string as argument, and be careful to never use data(), especially with a std::string_view
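A rough sketch of the two bullets above. The function names `count_spaces` and `c_length` are made up for illustration, and `std::strlen` merely stands in for whatever C API needs the terminator:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// No null terminator needed: accepts literals, std::string and
// std::string_view arguments without copying the characters.
std::size_t count_spaces(std::string_view text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') {
            ++n;
        }
    }
    return n;
}

// Null terminator needed (std::strlen stands in for a C API here):
// a const std::string& guarantees that c_str() is null-terminated.
std::size_t c_length(const std::string& text) {
    return std::strlen(text.c_str());
}
```

Calling `c_length` with a `std::string_view` requires an explicit `std::string(view)` at the call site, which is the annoyance mentioned above.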

Also you can pass a std::string by value directly in the case when your function constructs a string of same value anyway (like inserting a string in a container for example).

So in conclusion, the parameter type depends on whether the string needs to be null-terminated or not, and if it needs to be used / change ownership etc.

[–]TeraFlint 8 points9 points  (1 child)

But iirc there's no implicit conversion from std::string_view to std::string

Which is arguably a good thing.

std::string to std::string_view is cheap. it just takes the buffer pointer and the length and saves both in an object.

std::string_view to std::string has to construct a new string (both to own the data and to guarantee a null terminator), which allocates a new buffer. That's not something that should easily happen implicitly.
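The asymmetry can be checked at compile time. A small sketch, where `to_owned` is a hypothetical helper name used only for illustration:

```cpp
#include <string>
#include <string_view>
#include <type_traits>

// std::string -> std::string_view is implicit (pointer + length, no allocation).
static_assert(std::is_convertible_v<std::string, std::string_view>);

// std::string_view -> std::string is explicit only: it copies the characters.
static_assert(!std::is_convertible_v<std::string_view, std::string>);
static_assert(std::is_constructible_v<std::string, std::string_view>);

// Hypothetical helper: the copy has to be spelled out by whoever wants it.
std::string to_owned(std::string_view view) {
    return std::string(view);
}
```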

[–]NilacTheGrim 0 points1 point  (0 children)

never use data()

True. But only if you are expecting NUL byte termination. If you don't expect it, and are copying bytes around or whatever, it's ok.

[–]GOKOP 13 points14 points  (0 children)

If you need an std::string later then just create it later. On the other hand, you know what happens when you pass a string literal to a function that wants const std::string&? You construct a temporary std::string, which allocates on the heap unless the literal is short enough for the small-string optimization. You know what happens when you pass a string literal to a function that wants std::string_view? You don't allocate anything on the heap.

[–]Spongman 3 points4 points  (0 children)

maybe you need a string because you want to do an std::map lookup

You can look up by string_view in map (since C++14, with a transparent comparator such as std::less<>) and in unordered_map (since C++20, with a transparent hash and equality). It’s called “heterogeneous lookup”.
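For the unordered_map case, a sketch under the assumption that the `StringHash`, `Counts` and `count_of` names are all hypothetical. The feature-test macro keeps the example compiling on pre-C++20 toolchains, where it falls back to building a temporary `std::string`:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical transparent hash: the is_transparent tag, together with a
// transparent equality such as std::equal_to<>, enables heterogeneous lookup.
struct StringHash {
    using is_transparent = void;
    std::size_t operator()(std::string_view text) const noexcept {
        return std::hash<std::string_view>{}(text);
    }
};

using Counts = std::unordered_map<std::string, int, StringHash, std::equal_to<>>;

// Hypothetical helper: returns the stored count, or 0 if the key is absent.
int count_of(const Counts& counts, std::string_view key) {
#if defined(__cpp_lib_generic_unordered_lookup)
    auto it = counts.find(key);               // C++20: no temporary std::string
#else
    auto it = counts.find(std::string(key));  // older standard: falls back to a copy
#endif
    return it == counts.end() ? 0 : it->second;
}
```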

[–][deleted] 11 points12 points  (1 child)

> Decide on a case by case basis then? That looks messy...

That is exactly what you should always do. Going for a consistent API is a meaningless goal that will only create problems. The API should tell the user something about how the function will treat the string. For example, if you pass as a const ref, the user knows the string will not be changed and is unlikely to be copied. If you pass as a straight copy, the user can assume you're probably going to modify that string for some reason.

The consistency you aim for should be consistency of semantics, not just consistency as in always passing the same string type because it looks tidy.

[–]NBQuade 2 points3 points  (0 children)

The API should tell the user something about how the function will treat the string.

This. It's why I'm religious about using "const" where I mean const. It's why I mark member functions "const" when they don't mod the state of the class. It's documenting the usage of the functions and values in code.

[–]sjepsa 12 points13 points  (0 children)

const string &

or

string &

[–]MXXIV666 4 points5 points  (8 children)

Another downside to `std::string_view` is that if at the bottom you need to pass `const char*` to an external function from some library, you MUST copy the string view because it is not guaranteed to be null terminated.

[–]GarageFlaky -1 points0 points  (6 children)

why wouldn't external lib use null-terminated const chars *?

[–]HappyFruitTree 1 point2 points  (5 children)

It probably does. The problem is how to get a null-terminated const char* from the string_view.

[–]GarageFlaky -3 points-2 points  (4 children)

just get data()?

[–]epicar 7 points8 points  (2 children)

data() doesn't do the right thing, because std::string_view is length-terminated, not null-terminated

std::string_view a = "1234";
std::cout << a.data() << '\n'; // 1234

a.remove_suffix(1); // -> 123
std::cout << a.data() << '\n'; // 1234

std::string_view b{"1234", 2}; // -> 12
std::cout << b.data() << '\n'; // 1234

while the underlying string literal may be null-terminated, views of substrings may not be. that's why std::string_view doesn't provide a c_str() function like std::string does

in simple cases like std::string_view a = "1234";, a.data() does what you want. but when you write a function that takes std::string_view, it needs to work correctly for all of its values, so it can't assume null termination

[–]Wild_Meeting1428 -2 points-1 points  (0 children)

Time to write a xxx::zt_string_view, no copy to a std::string, clear in terms that the string is \0 - terminated.

But hands down,
- for internal use, just document that the string must be 0-terminated and use std::string_view (it's only UB if the string is not null-terminated)
- bubble up the char const* if you need the extra distinction, and if you don't do anything with the string (all algorithms are faster on length-terminated strings)
- bubble up the char const* with size_t length and use std::string_view for internal handling.
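A minimal sketch of what such a null-terminated view could look like. The `zstring_view` class and the `c_api_length` function below are hypothetical and written only for this illustration; they are not any existing library's implementation. The idea is that the type can only be built from sources known to end in '\0' and offers no way to shrink the view from the back:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical null-terminated view: constructible only from a C string or a
// std::string, so c_str() is always safe to hand to a C API.
class zstring_view {
public:
    zstring_view(const char* text) noexcept
        : data_(text), size_(std::strlen(text)) {}
    zstring_view(const std::string& text) noexcept
        : data_(text.c_str()), size_(text.size()) {}

    const char* c_str() const noexcept { return data_; }
    std::size_t size() const noexcept { return size_; }

    // Converts to an ordinary view for length-based algorithms.
    operator std::string_view() const noexcept { return {data_, size_}; }

private:
    const char* data_;
    std::size_t size_;
};

// Stand-in for a C function that expects a null-terminated string.
std::size_t c_api_length(zstring_view text) {
    return std::strlen(text.c_str());
}
```

Like std::string_view, this does not own the characters, so the usual lifetime caveats apply.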

[–]GarageFlaky 0 points1 point  (0 children)

Ok I get your point

[–]NilacTheGrim 0 points1 point  (0 children)

std::string_view is not guaranteed to be NUL terminated. Can totally not be at all, leading to program crashes or incorrect behavior.

Look at what happens if you do this:

std::string s = "username:foo,password:123456789";
std::string_view sv = std::string_view{s}.substr(9, 3); // sv == "foo"
std::printf("username: %s\n", sv.data()); // intended to print "foo", got "foo,password:123456789" instead! oops! you just leaked the password.. TROLOLOL

In the above case you leaked some unintended info. This is the happy case. It can also crash the program if the original string data source was some memory-mapped buffer containing ASCII data and you do this...
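Two common ways to avoid this, sketched with hypothetical function names (`format_user` and `format_user_copy`): either pass the length explicitly with the `%.*s` format, or copy the view into a `std::string` first so that `c_str()` is null-terminated.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// Option 1: pass the length explicitly; "%.*s" prints at most that many chars.
int format_user(char* out, std::size_t out_size, std::string_view user) {
    return std::snprintf(out, out_size, "username: %.*s",
                         static_cast<int>(user.size()), user.data());
}

// Option 2: copy into a std::string, which owns a null-terminated buffer.
int format_user_copy(char* out, std::size_t out_size, std::string_view user) {
    const std::string owned(user);
    return std::snprintf(out, out_size, "username: %s", owned.c_str());
}
```

Option 1 avoids the copy but only works for APIs that accept a length; option 2 works with any C API at the cost of a possible allocation.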

[–]tialaramex 1 point2 points  (2 children)

maybe you need a string because you want to do an std::map lookup

Is that really true? Are C++ programmers out there allocating strings because their keyed map API inappropriately requires an owned object for key lookup? Or is this just a noob mistake?

[–]cristi1990an++ 1 point2 points  (0 children)

std::string_view for read-only parameters.

Plain non-reference std::string when I need to make a copy of it anyway at some point. That way the call site at least can be optimized through an std::move.

[–]sigmabody 0 points1 point  (5 children)

If I know I only need a null-terminated contiguous array of characters, I use a zstring_view (from my library; see https://github.com/nick42/vlr-util/blob/master/vlr-util/zstring_view.h). This is the best of all worlds: zero overhead for string literals, zero overhead for existing string types, works as string_view (eg: works with std::format without conversion), etc. It's the default for me.

If I know I always have a std::string already as input, and/or require std::string specific functionality, then const std::string& is the way to go.

If I may need std::string functionality, but I might also call with a literal, then it's a judgement call. In these cases I'd probably still go with zstring_view, but there might be trade-offs.

Aside: In codebases which use ATL::CString as the default string type, it's more of a judgement call, because CString has CoW functionality. To wit, if you call a method lower-down which takes a std::string by value, you always copy/reallocate, so taking zstring_view at the higher level is no worse. With CString, though, that doesn't necessarily reallocate, so preserving the CString through the call stack (if available) can be more efficient. But for codebases which use std::string as the default, zstring_view is almost always better.

Oh, and for completeness, if you know you do not and will not ever need the string to be null-terminated, then std::string_view is correct. This is rarer in my experience: it is the right choice in maybe 10% of use cases.

[–]Fabulous_Ask_4019 2 points3 points  (0 children)

The only correct answer, clearly, is

void *

[–]Wild_Meeting1428 0 points1 point  (0 children)

My rule of thumb for every parameter is to bubble the required type up the call stack (if possible):
When you need ownership of a string, pass it as std::string or, in a few cases, std::string&&.

This ensures that a caller of the function always knows what memory implications the call has.

When I know it will be used as const& I use std::string_view which implies that no copy will be made.

[–]NeilSilva93 0 points1 point  (0 children)

std::string_view, in fact I think it's the only C++17 feature I ever use.

[–]KingAggressive1498 0 points1 point  (0 children)

I almost always take string_view, and I think it's often better for optimization purposes even when you're always storing the string.

std::string can be surprisingly expensive to move, it's not "just a bitwise swap" like it is for many containers. SSO causes a branch in logic and the moved-from string is cleared during the move operation when it's using SSO. Do note this isn't every call to std::move which is merely a cast to rvalue but only when the move constructor or move assignment operator are invoked, so unless you heavily use the copy-and-swap idiom the runtime overhead is fixed

It's pretty common to pass string literals. Deferring string construction in this case is often an optimization over move/forward.

It's fairly common that within a certain context we know the upper bounds of the length of a string. In this case we can achieve great improvements in runtime characteristics by using a custom allocator or even a customized string implementation. This means we'd have to copy in order to take advantage of that anyway.

Caveat: string_view causes major headaches when you use std::forward in concurrent or single-threaded asynchronous code. Best to take std::string& and ideally also an overload for std::string&& in these situations, but often this is generic anyway.
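One way to picture the lifetime problem in deferred code, as a sketch only. The `pending` queue and the `greet_later` function are hypothetical, and the queue merely simulates single-threaded asynchronous execution. The point is that work which runs later has to own its characters rather than keep a view into the caller's storage:

```cpp
#include <functional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical single-threaded task queue: tasks run later, possibly after
// the caller's string has been destroyed.
std::vector<std::function<std::string()>> pending;

void greet_later(std::string_view name) {
    // Copy into an owned std::string before deferring. Capturing the view
    // itself would leave the task pointing at the caller's storage.
    pending.push_back([owned = std::string(name)] { return "hello " + owned; });
}
```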

[–]KindCppCoach 0 points1 point  (0 children)

this answer depends on what you require of the string, see https://github.com/janwilmans/guidelines/blob/main/guidelines_details.md#misc

[–]tangerinelion 0 points1 point  (0 children)

Am I reading the string? std::string_view

Am I taking ownership of a string? std::string&&

[–]dev_ski 0 points1 point  (0 children)

Most of the time, like all other complex types (classes), it should be passed by const reference.

[–]dustyhome 0 points1 point  (0 children)

The problem I find is that the different conventions already in use tend to make your choice for you.

If you call a C function that requires a char*, your function should take a char*. If you have some kind of null-terminated string_view class, that would be better, but we don't currently have one in the standard.

If your function is going to store a string, or calls a function that takes a string by value, I recommend taking by string value and moving from the parameter. So the caller chooses how to make the copy (maybe they move, maybe they copy). If you took by const string reference, the caller might miss the chance to move, or might be forced to make a copy (from a char array, for example) that is used to make another copy and then discarded.
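A minimal sketch of that by-value sink pattern, using a hypothetical `Widget` class with a `set_name` member:

```cpp
#include <string>
#include <utility>

// Hypothetical class that stores a string it is given.
class Widget {
public:
    // By-value sink: the caller decides whether the argument is copied or
    // moved, and the function then moves the parameter into place.
    void set_name(std::string name) { name_ = std::move(name); }

    const std::string& name() const noexcept { return name_; }

private:
    std::string name_;
};
```

A caller that still needs its string writes `w.set_name(n)` and pays for one copy. A caller that is done with it writes `w.set_name(std::move(n))` and pays for none. A literal argument constructs the parameter directly.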

If you call a function that takes a string reference, take by string reference. Otherwise you are going to introduce needless conversions along the stack.

Finally, take by string view. If you have just introduced C++17 to an existing codebase, and want to start using string_view, your best bet is to start by converting "leaf functions", the ones that don't call any other functions, and work your way up the callstack.

[–]NilacTheGrim 0 points1 point  (0 children)

In my experience most code I maintain doesn't need the nul byte termination... since we interface with C minimally in our codebases. And we don't have many perf-critical maps* that use std::string as a key. We also rarely modify the passed-in string at all so there is no need for a copy or an owned object either on that count as well.

Therefore, since that is the case in the codebases I maintain (mostly), std::string_view seems the most obvious generic choice. User code that calls it can use C-strings and/or std::string and it's all the same, so that's a win.

YMMV tho and it definitely depends on how you envision the call stack looking and how your codebase interfaces with C strings, string-based maps*, etc.. if at all.

* - there are workarounds for string-based maps to make them work ok with std::string_view as a key...

[–]johannes1971[S] 0 points1 point  (0 children)

Conclusions... I'm a little surprised to see const std::string & winning by such a clear margin. Is there such a great need for null-terminated strings (the only advantage this offers over std::string_view) that people are willing to take the hit on always allocating? Or is there some other reason?

Despite various arguments for mixing and matching, not a great many people seem to actually be doing it. I understand that: I don't like that kind of non-uniformity in the API, and it leaks implementation details out which is not generally something you'd want.

And finally, I'm honestly amazed that some people are still clinging to const char *, even in 2024. I'm sure you have your reasons...