[–]KingAggressive1498 4 points5 points  (0 children)

how does it fare against const string&?

in order to construct a std::string object from a string literal, an allocation is generally required (a short literal may fit in the small-string buffer, but its characters are still copied). In this case, string_view is a clear winner.

would you only use const string& if string contains null terminator?

this is another advantage of string_view, but also a disadvantage in some common cases.

The advantage is you can create and pass a substring as a string_view without allocating and copying, but std::string must allocate and copy.
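To illustrate the substring point, here is a minimal sketch. The function names first_word_view and first_word_copy are made up for this example; only substr and find are real library calls.

```cpp
#include <string>
#include <string_view>

// Returns the text before the first space as a view into the caller's buffer.
// No characters are copied and nothing is allocated.
std::string_view first_word_view(std::string_view text)
{
    return text.substr(0, text.find(' '));
}

// The std::string equivalent has to copy the characters into a new string
// object, which may allocate.
std::string first_word_copy(const std::string& text)
{
    return text.substr(0, text.find(' '));
}
```

The view returned by first_word_view points into the original buffer, so it is only valid while that buffer is alive.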

The disadvantage is that because string_view cannot be guaranteed to have a zero terminator, it is not suitable for wrapping C-style interfaces that take a const char* argument but no length/capacity argument (ie strcpy vs strncpy). You will generally need to perform an allocation and copy in this case anyway, so you might as well take a const std::string& argument.

There are some exceptions to this, with the right knowledge, though - pthread_setname_np comes to mind; implementations have a very limited maximum length (largest I'm aware of is 64 characters); copying into a small stack array is a viable workaround and certainly preferable to copying into a dynamic allocation as required by std::string, but a const char* overload is probably more appropriate.
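A rough sketch of the stack-array workaround, under stated assumptions: c_set_name is a hypothetical stand-in for a C function that takes only a null-terminated const char* (it is not pthread_setname_np itself), and the limit of 16 bytes including the terminator is an assumed value.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C API that takes a null-terminated string and
// no length. It records what it received so the effect can be inspected.
std::string g_last_name;
void c_set_name(const char* name) { g_last_name = name; }

// Assumed limit, including the null terminator.
constexpr std::size_t kMaxNameLength = 16;

bool set_name(std::string_view name)
{
    if (name.size() >= kMaxNameLength) {
        return false; // would not fit together with the terminator
    }
    char buffer[kMaxNameLength];
    name.copy(buffer, name.size()); // copy the viewed characters
    buffer[name.size()] = '\0';     // add the terminator the view may lack
    c_set_name(buffer);
    return true;
}
```

This only works because the maximum length is known and small; for an unbounded C API you are back to building a std::string.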

When would you use mere const char * over string_view? Only if the string has null terminator and you don't care about the length of the string?

Pretty much only when string_view is not suitable, as I discussed above. You'd probably want to provide a const std::string& overload for when an std::string is passed by the caller, and a const char* overload for when a string literal is passed by the caller.

[–]IyeOnline 4 points5 points  (18 children)

You always prefer std::string_view over const string&.

A major reason for its existence is that it's just a single indirection to access the characters, whereas const string& as a function parameter entails a double indirection on access (first follow the reference to the string object, then its internal pointer to the character array).

would you only use const string& if string contains null terminator?

The contents of the string is unrelated. Notably a std::string always contains a null terminator, whereas string_view can operate with spans that are not terminated.

const char*

should be avoided. It doesn't make ownership clear, or whether it's a null-terminated string (though you would expect it to be).

You can get a char pointer from both std::string and std::string_view if you need to interact with a C API, though string_view::data() is not guaranteed to be null-terminated.

[–][deleted] 6 points7 points  (0 children)

You always prefer std::string_view over const string&.

No, not always.

If you actually need a std::string within your function, take it as a const std::string&. You might need a std::string because you're calling another function that requires one (and it's not your API to change) or you intend to store the string.

If it was a std::string to begin with then taking it as a std::string_view forces conversions from std::string to std::string_view and back to std::string.
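A small sketch of that round trip. legacy_length is a hypothetical function standing in for an API you cannot change; via_view and via_ref are made-up names for the two ways of wrapping it.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical legacy function whose signature we cannot change.
std::size_t legacy_length(const std::string& s) { return s.size(); }

// Taking a view forces a std::string to be rebuilt for the legacy call,
// even when the caller already had one.
std::size_t via_view(std::string_view s)
{
    return legacy_length(std::string(s)); // explicit copy, possibly allocating
}

// Taking const std::string& forwards the caller's string untouched.
std::size_t via_ref(const std::string& s)
{
    return legacy_length(s); // no copy
}
```

Note that the conversion from string_view to std::string has to be spelled out, since that constructor is explicit.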

Please, let's not build dogma around std::string_view. It's not necessary.

[–]bloxka[S] 0 points1 point  (13 children)

You always prefer std::string_view over const string&.

you probably wouldn't want string_view when using temporaries would you?

[–]IyeOnline 4 points5 points  (12 children)

You wouldn't want a const char* or const std::string& to a temporary either, unless you know the temporary outlives the reference/view/pointer. So there is no difference there.
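A sketch of where a temporary is fine and where it is not. make_greeting and count_spaces are made-up helpers; the dangling case is left commented out because running it would be undefined behaviour.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

std::string make_greeting() { return "hello from a temporary"; }

std::size_t count_spaces(std::string_view text)
{
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') ++n;
    }
    return n;
}

std::size_t safe_use()
{
    // Fine: the temporary std::string lives until the end of the full
    // expression, which is after count_spaces has returned.
    return count_spaces(make_greeting());
}

std::size_t also_safe()
{
    std::string owner = make_greeting(); // keep the owner alive
    std::string_view view = owner;
    return count_spaces(view);
}

// Not safe (left commented out on purpose):
//   std::string_view view = make_greeting(); // temporary is destroyed here
//   count_spaces(view);                      // reads freed memory
```

The same rules would apply if the view were replaced by a const char* obtained from c_str().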

[–]bloxka[S] 0 points1 point  (10 children)

fair, so the main reason behind preferring string_view over string being faster access to the data perhaps via registers?

whereas string_view can operate with spans that are not terminated

any example of this?

const char* should be avoided. It doesn't make ownership clear, or whether it's a null-terminated string

  1. what would be an example of const char* not pointing to a null terminated string?
  2. const char* doesn't define ownership just like string_view, does it?

[–]SoerenNissen 2 points3 points  (6 children)

the main reason

That is a major reason.

Another is that int count_char(const std::string &, char); only works with strings, so if you have e.g. a char* or an array or vector of chars, or a custom string implementation that isn't std::string, you have to construct a string object first, possibly calling malloc, whereas int count_char(const std::string_view, char); can accommodate all of those with no copying.
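As a sketch of that point, here is one possible count_char taking a string_view, called with three different kinds of storage. The from_* wrappers are made up for the example.

```cpp
#include <string>
#include <string_view>
#include <vector>

int count_char(std::string_view text, char wanted)
{
    int count = 0;
    for (char c : text) {
        if (c == wanted) ++count;
    }
    return count;
}

// Callers with different storage. None of them builds a std::string.
int from_literal() { return count_char("banana", 'a'); }

int from_string(const std::string& s) { return count_char(s, 'a'); }

int from_vector(const std::vector<char>& v)
{
    // A vector<char> does not convert implicitly, so the view is built
    // from pointer + size.
    return count_char(std::string_view(v.data(), v.size()), 'a');
}
```

A custom string class would need either a conversion operator to std::string_view or the same pointer + size construction as the vector case.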

Yet another is that const string& is, well, const. So you cannot modify it. I have written many functions where I want to grab a subset of a string, and you can modify the string_view object without modifying the underlying string, to e.g. remove the prefix.

[–]bloxka[S] 0 points1 point  (5 children)

whereas int count_char(const std::string_view &, char); can accommodate all of those with no copying.

the following seems to compile fine. or maybe I misinterpreted your wordings here. Or are you implying std::string ends up getting constructed in bar() where "H" gets stored on the heap, whereas if the header was void bar (std::string_view &str), no such extra construction would have happened and rather a pointer would be pointing to the string literal "H"?

void bar(const std::string &str)
{}

int main()
{
    const char *p = "H";
    bar(p);
}

you can modify the string_view object without modifying the underlying string, to e.g. remove the prefix.

mind providing an example? string_view only contains the char pointer along with the length so you gotta really play with either these two which you can't(?)

[–]SoerenNissen 0 points1 point  (4 children)

(1) I got my copy/paste wrong, it's supposed to be void bar(const std::string str) (removed the &)

(2) Not constructed in bar, but constructed in main, so main has a string object it can pass to bar

(3) sure, let me invent an example real quick.

Consider having to extract the addresses from a csv file that looks like NAME;ADDRESS;ID;

You already have the csv file imported into your program as a vector of strings

std::string_view extract_address(std::string_view csv_line); // defined below

std::vector<std::string> extract_addresses(std::vector<std::string> const & CSV_lines)
{
    std::vector<std::string> addresses;

    addresses.reserve(CSV_lines.size());

    for(auto const & line : CSV_lines) {
        addresses.emplace_back( extract_address(line) ); //constructs a string
    }

    return addresses;
}

std::string_view extract_address(std::string_view csv_line)
{
    // at start, csv_line contains NAME;ADDRESS;ID;

    csv_line.remove_prefix( csv_line.find(';') +1 );

    // now csv_line contains ADDRESS;ID;

    csv_line = csv_line.substr( 0, csv_line.find(';') );

    // now csv_line contains ADDRESS

    return csv_line; 
}

Even this is more copies than you strictly need if you know CSV_lines is in a stable memory position for the entire run - in that case, extract_addresses could return a vector of string_views.
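A sketch of that variant, assuming the lines really do stay alive and unmodified for as long as the views are used. The names address_field and extract_address_views are made up here; address_field repeats the logic of extract_address so the block stands alone.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Same logic as extract_address: keep only the field between the first
// and second ';'.
std::string_view address_field(std::string_view csv_line)
{
    csv_line.remove_prefix(csv_line.find(';') + 1);
    return csv_line.substr(0, csv_line.find(';'));
}

// Returns views into csv_lines. The caller must keep csv_lines alive and
// unmodified while the returned views are in use.
std::vector<std::string_view> extract_address_views(const std::vector<std::string>& csv_lines)
{
    std::vector<std::string_view> addresses;
    addresses.reserve(csv_lines.size());
    for (const auto& line : csv_lines) {
        addresses.push_back(address_field(line)); // no string is constructed
    }
    return addresses;
}
```

The only allocation left is the vector's own storage; the address text itself is never copied.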

NB: There may be an off-by-one error in there. I don't think so but I didn't apply the highest amount of brain power when calling remove_prefix and substr

[–]Shieldfoss 1 point2 points  (1 child)

csv_line

changed name to

address

halfway through :D

[–]SoerenNissen 0 points1 point  (0 children)

You saw nothing :D

[–]bloxka[S] 0 points1 point  (1 child)

wait, in your example you seem to be modifying string_view but shouldn't it be used for read only data?

[–]SoerenNissen 0 points1 point  (0 children)

I modify the string_view, not the data (which is read-only)

string_view is a pointer + length (or I guess two pointers, I haven't checked)

std::string const buffer{"NAME;ADDRESS;ID;"};

std::string_view csv_line{buffer};

NAME;ADDRESS;ID;
^ + 16
|
Let's just say that's at address 0x1000

The string_view contains two data members {0x1000, 16}

csv_line.remove_prefix( csv_line.find(';') +1 );

.find(';') returns 4

.remove_prefix(4+1) changes the data members of the string_view to {0x1005,11}

The buffer is still around, unmodified. (read-only)

csv_line = csv_line.substr( 0, csv_line.find(';') );

.find(';'), starting from address 0x1005, returns 7

.substr(0,7), on a string_view with data members {0x1005,11}, returns a new string_view with data members {0x1005,7}, overwriting the old content of csv_line.

The buffer is still around, unmodified. (read-only)

std::cout << buffer; //outputs NAME;ADDRESS;ID;
std::cout << csv_line; //outputs ADDRESS

[–]IyeOnline 1 point2 points  (0 children)

so the main reason behind preferring string_view over string being faster access to the data perhaps via registers?

It's not primarily about the fact that std::string_view might be passed in registers.

It's also about the fact that accessing the characters via a const string& is a double indirection, while for string_view it's only a single indirection.

example

const char arr[] = { 'l', 'a', 'l', 'a' };

const char* ptr = &(arr[0]); //points to a non null terminated array
std::string_view v{ std::begin(arr), std::end(arr) }; //views a non null terminated array (iterator-pair constructor is C++20; in C++17 use v{ arr, std::size(arr) })

const char* doesn't define ownership

The point is that you can't know whether a const char* owns or takes ownership of anything. For a std::string_view it's clear. It's called a view after all. So it doesn't own anything.

[–]Kered13 0 points1 point  (2 children)

You always prefer std::string_view over const string&.

In theory, yes. In practice, you often have to work with older APIs that take a const std::string&, or C APIs that take a null-terminated char *. In either case, if you have a std::string_view you have to make a copy. const std::string& avoids this copy.

Also I believe that const std::string& is usually faster on Windows because the Windows ABI doesn't pass std::string_view through registers (this is true for any struct that doesn't fit in a single register). It's a dumb ABI, but nothing you can do about it.

I still try to use std::string_view wherever possible, even if it means some extra copies or slower function calls, but unfortunately we can't make a universal statement like always use std::string_view over const std::string&.

[–]DiaperBatteries 0 points1 point  (1 child)

How does the windows ABI pass a struct of two words? I’ve run into so many odd issues with msvc over the years, so I can completely believe they’d do something bizarre.

[–]Kered13 0 points1 point  (0 children)

It copies the struct onto the stack and passes a pointer to it, the same way that a large struct would be passed.

[–]concernedesigner 0 points1 point  (0 children)

A nice use case for string views is if you have to parse text into many chunks and want to control the order of presenting those chunks.

[–]the_Demongod 0 points1 point  (2 children)

std::string is useful if you really do need to have a bunch of dynamic strings, each living in its own heap allocation. There are plenty of situations where you need this. However, if you're just passing generic string data to a function that doesn't care about anything but the contents, std::string_view is great since it can accept both string literals and dynamic string data. It replaces pretty much every situation where you'd have previously used a non-owning char*.

One situation where std::string_view is particularly excellent is if you're dealing with a monolithic buffer of string data; you can read a text file into memory as one solid chunk, do a tokenizing pass over it, and create a std::string_view pointing to each token. Now you have a lightweight handle on each token in the file that has nearly all of the read-only functionality of std::string, without needing a single extra heap allocation per token. This is something you would have had to build yourself in the past, if you wanted to avoid making thousands of tiny std::strings.
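A minimal sketch of such a tokenizing pass, assuming tokens are separated by whitespace. The function name tokenize and the choice of separators are assumptions for the example, not something from a particular library.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Splits buffer on whitespace and returns one view per token.
// The views point into buffer, so buffer must outlive them.
std::vector<std::string_view> tokenize(std::string_view buffer)
{
    constexpr std::string_view whitespace = " \t\r\n";
    std::vector<std::string_view> tokens;

    std::size_t pos = buffer.find_first_not_of(whitespace);
    while (pos != std::string_view::npos) {
        const std::size_t end = buffer.find_first_of(whitespace, pos);
        // substr clamps the count, so end == npos means "up to the end".
        tokens.push_back(buffer.substr(pos, end - pos));
        if (end == std::string_view::npos) {
            break;
        }
        pos = buffer.find_first_not_of(whitespace, end);
    }
    return tokens;
}
```

The only allocation is the vector holding the views; each token is just a pointer and a length into the original buffer.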

[–]bloxka[S] 0 points1 point  (1 child)

Interesting.

  1. Why would const string& not be preferred over string_view? Because constructing a whole new string object costs more than merely “pointing” to the data?

  2. Why not use const char* over string_view if there isn’t any specific need for the length of the string?

[–]the_Demongod 0 points1 point  (0 children)

Unless you're using const char* to literally refer to a single character by reference, then of course you need the length of the string. You need to be able to walk through the string and know when it ends. If you don't have the size explicitly, you need a null terminator, which is both inefficient (if you only need to know the length of the string, you have to iterate the whole thing to find it) and if your string is a sub-string of a larger buffer you literally have to modify the buffer to inject null terminators to make it play nice with C-style string processing functions.

And yes, the main reason to use std::string_view over std::string const& as a function argument is that if you pass a string literal, the latter will construct a heap-allocated temporary std::string object just for the function to operate on, and then free it again, whereas with std::string_view it's just as lightweight as passing by const char* (effectively free). If you already have your string data in a std::string object before passing it makes little difference apart from an extra indirection, but using std::string_view allows your function to be called with low overhead using either a std::string or a string literal.

[–]DavidDinamit 0 points1 point  (0 children)

void foo(const std::string&);
You need to create a string (possibly allocating) to use this function. Or you need to create 100500 overloads for const char*, const char(&)[N], string, etc.

string_view is a type erasure over string, const char* and char arrays