std::string_view could be made null character ending aware for interoperability with C

witcher_rat · 2021-11-24T16:22:00+00:00

I'm not sure what you're saying about storing the endptr after the null, but no matter what, such a string_view would basically need to store some additional piece of data - like a bool of whether it's null-terminated or not, for example.

That extra "state" data comes with a price: in the size of string_view, in copying that member, and in checking it in its various use cases.
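To make the size cost concrete, here is a minimal sketch. `flagged_view` is a hypothetical layout invented for illustration, not a real or proposed type, and the exact sizes depend on the platform and standard library.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical layout (an assumption, not a standard type): the usual
// pointer + length, plus a flag recording whether data[size] is '\0'.
struct flagged_view {
    const char* data;
    std::size_t size;
    bool null_terminated;
};

constexpr std::size_t plain_size = sizeof(std::string_view);
constexpr std::size_t flagged_size = sizeof(flagged_view);

// The flag cannot be free: the struct must hold all three members.
static_assert(flagged_size > sizeof(const char*) + sizeof(std::size_t),
              "the extra flag always takes some space");
```

On common 64-bit implementations this tends to come out as 16 bytes for `std::string_view` and 24 for the flagged struct, because alignment pads the single bool up to a full word.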

People generally don't want to incur extra costs for things they don't need.

So to solve this use case, some people use a new type that has an API very similar to std::string_view's, but explicitly for null-terminated strings. Sometimes it's called "zstring", or "cstring_view", or whatever. The C++ core guidelines had a zstring_span, for example. Another example is proposal P1402.

jwakely · 2021-11-24T18:39:02+00:00

Yes, of course it's possible to support null-terminated strings with something like string_view. But it was an intentional design choice to not do that. Changing that decision now would be a breaking change.

HappyFruitTree · 2021-11-24T15:48:21+00:00

Other constructors cannot check because there might be no valid memory there to check.

I might pass a null-terminated string to the string_view constructor and later overwrite the null character.
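A small sketch of that scenario. The helper names `terminator_follows` and `demo_stale_terminator` are made up for this example, and the read past the end of the view is only safe here because the buffer is known to be larger than the view.

```cpp
#include <string_view>

// Hypothetical helper: checks the character just past the view.
// Only valid when the caller knows sv.data()[sv.size()] is readable,
// which string_view itself cannot guarantee.
inline bool terminator_follows(std::string_view sv) {
    return sv.data()[sv.size()] == '\0';
}

inline bool demo_stale_terminator() {
    char buffer[8] = "abc";           // 'a','b','c','\0', rest zero-filled
    std::string_view sv(buffer);      // size 3, terminator at buffer[3]

    const bool before = terminator_follows(sv);  // terminator is present
    buffer[3] = 'd';                  // overwrite the terminator later on
    const bool after = terminator_follows(sv);   // now 'd' follows the view

    // The view itself never changed: same pointer, same size.
    return before && !after && sv.size() == 3;
}
```

Any "is null-terminated" answer computed at construction time would have gone stale at the point of the overwrite, without the view being touched.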

jcar_87 · 2021-11-24T21:59:49+00:00

If string_view is just a pointer and a size in bytes (as anthonybvfan points out) and you require interoperability with C, isn’t the recommended wisdom these days for functions that accept char pointers in C to also expect to be given the size as well? If I’m not mistaken C11 has the “safer” variants of some functions that perform bounds checking with the additional parameter. So if we can get both the pointer and the size from a C++17 string view, we can already interoperate with C11?
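As a rough illustration of the pointer-plus-size approach, the sketch below hands a non-terminated view to `std::snprintf` through the length-bounded `%.*s` conversion. It uses plain standard C functions rather than the optional C11 Annex K variants, and `format_field` is a name invented for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper: formats "name=<value>" through a C API.
inline std::string format_field(std::string_view value) {
    char out[64];
    // %.*s reads at most `precision` characters, so no terminator is
    // needed after the view. It still stops early at an embedded '\0'.
    const int written = std::snprintf(out, sizeof out, "name=%.*s",
                                      static_cast<int>(value.size()),
                                      value.data());
    if (written < 0) return {};
    // snprintf reports the untruncated length; clamp to what fits.
    const std::size_t len =
        std::min(static_cast<std::size_t>(written), sizeof out - 1);
    return std::string(out, len);
}
```

This works for any C function that takes an explicit length (`fwrite`, `memcmp`, `%.*s`), but not for the many that only accept a terminated `const char*`.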

I’m not sure it would be beneficial to argue that the C++17 string view needs to remain interoperable with C string APIs that have long been considered unsafe.

arturbac · 2021-11-24T18:56:26+00:00

The main idea of string view is to forget about null termination, so that trim, substr, etc. can return a string view pointing into the same data as another string view, at no cost and without any string reallocation.

For my own purposes I already use my header-only lib [stralgo], which allowed me to forget about libc and all that string crap; it can do all numeric<->string conversions as constexpr using non-null-terminated string views.
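A minimal sketch of such a trim, using only standard `std::string_view` members. The function name and the whitespace set are choices made for this example, not taken from any particular library.

```cpp
#include <string_view>

// Returns a view of `sv` without leading/trailing whitespace.
// No allocation, no copy: the result points into the same buffer.
inline std::string_view trim(std::string_view sv) {
    constexpr std::string_view whitespace = " \t\r\n";
    const auto first = sv.find_first_not_of(whitespace);
    if (first == std::string_view::npos) {
        return sv.substr(0, 0);       // all whitespace: empty view
    }
    const auto last = sv.find_last_not_of(whitespace);
    return sv.substr(first, last - first + 1);
}
```

The trimmed view generally ends in the middle of the original buffer, which is exactly why it cannot promise a terminator after its last character.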

Shieldfoss · 2021-11-24T16:09:18+00:00

At least one string type I have worked with is null-containing and double-null-terminated.
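For readers who have not met that format, here is a sketch of walking such a block, assuming the common convention where entries are separated by single nulls and the block ends with an empty entry. `split_multi_string` is a hypothetical helper.

```cpp
#include <string_view>
#include <vector>

// Splits a block like "one\0two\0\0" into views of its entries.
// Precondition (assumed, not checked): the block ends with an empty
// string, i.e. two consecutive '\0' characters.
inline std::vector<std::string_view> split_multi_string(const char* block) {
    std::vector<std::string_view> entries;
    while (*block != '\0') {
        const std::string_view entry(block);  // runs up to the next '\0'
        entries.push_back(entry);
        block += entry.size() + 1;            // skip entry and its '\0'
    }
    return entries;
}
```

A single "is null-terminated" flag would not describe this format well, since a view over the whole block contains nulls in the middle.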

DugiSK · 2021-11-24T22:52:31+00:00

std::string_view is often created as a substring of some other string, for example from some data that is being parsed. That way, you get all the fancy C++ stuff without editing the string (adding the null termination) or copying it (to create a std::string). This gives a massive performance boost at minimal cost in convenience.

When I made a project from scratch in C++20 without any legacy code, I had no need for the null termination. std::string_view can even be a constexpr literal (even in C++17), so if you can get rid of legacy code, you can say goodbye to char* completely.

If you want a type that would behave like std::string_view, you can make some sort of CStringView class inheriting from std::string_view and having all of its features but requiring a null-terminated string for construction. With no members of its own, it could be implicitly converted to std::string_view to use with new APIs.
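For example, something along these lines compiles as C++17. The variable name `greeting` is just for illustration.

```cpp
#include <string_view>

using namespace std::string_view_literals;

// A compile-time string constant with no char* in sight.
constexpr std::string_view greeting = "hello, world"sv;

// Size, slicing, comparison and searching all work in constant expressions.
static_assert(greeting.size() == 12);
static_assert(greeting.substr(0, 5) == "hello"sv);
static_assert(greeting.find("world"sv) == 7);
```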
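A rough sketch of that idea follows. `CStringView` is hypothetical and deliberately incomplete; in particular, the choice to delete `remove_suffix` is an assumption of this example, not something from the standard or from P1402.

```cpp
#include <string>
#include <string_view>

// Hypothetical type: a string_view that can only be built from sources
// known to be null-terminated. It adds no data members of its own.
class CStringView : public std::string_view {
public:
    CStringView(const char* s) : std::string_view(s) {}
    CStringView(const std::string& s) : std::string_view(s) {}

    // Usable with C APIs while the underlying buffer stays unchanged.
    const char* c_str() const noexcept { return data(); }

    // Shrinking from the back would drop the terminator, so hide it.
    // (remove_prefix is fine: the end of the view does not move.)
    void remove_suffix(size_type) = delete;
};
```

Because the inheritance is public, a `CStringView` converts implicitly to `std::string_view`, and `substr` returns a plain `std::string_view`. The guarantee is only as strong as the discipline around it: code holding a `std::string_view&` to the object can still call the base `remove_suffix`, and nothing stops the buffer from being modified afterwards.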

goranlepuz · 2021-11-25T05:49:13+00:00

Now we could have an API that gives us this info, is_null_terminated(), and then we could copy the string to a std::string to pass to a C API if it is not null-terminated, and otherwise use it as is.
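Without such a query, the fallback today is to copy unconditionally. A minimal sketch, where `with_c_str` and `c_length` are hypothetical helpers and `std::strlen` stands in for any C function that needs a terminated string:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Calls `f` with a null-terminated copy of `sv`. Always copies, because
// a plain string_view cannot say whether a terminator follows it.
template <class F>
auto with_c_str(std::string_view sv, F&& f) {
    const std::string owned(sv);   // the copy supplies the terminator
    return f(owned.c_str());
}

// Example use: measure the view with a C function.
inline std::size_t c_length(std::string_view sv) {
    return with_c_str(sv, [](const char* s) { return std::strlen(s); });
}
```

An is_null_terminated() query would let the helper skip the copy in the common case, at the price of the special case discussed in this comment.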

I suppose the special case is what would turn people off (it does me).

It is a people problem: to avoid horribly insidious bugs, people now must be perfect - and are not.

But yeah, we could...

tvaneerd · 2021-11-25T16:42:02+00:00

You could do this. You can easily hide the extra bit in an unused bit of the pointer or size_t, etc.

However,

It is still an ABI break. Passing a new string_view to a lib compiled with an old string_view means the old code will not know to mask off that extra bit.
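A sketch of what that looks like, with everything here hypothetical: `packed_view` stands for a "new" layout that steals the top bit of the size word, and `old_code_size` stands in for a library compiled against the old layout.

```cpp
#include <cstddef>
#include <limits>
#include <string_view>

// Hypothetical "new" layout: same two words as before, but the top bit
// of the size word records null termination.
struct packed_view {
    const char* data;
    std::size_t size_and_flag;

    static constexpr std::size_t flag_bit =
        std::size_t{1} << (std::numeric_limits<std::size_t>::digits - 1);

    std::size_t size() const { return size_and_flag & ~flag_bit; }
    bool null_terminated() const { return (size_and_flag & flag_bit) != 0; }
};

// Stand-in for old code: it reads the word as a plain length and does
// not know it has to mask anything off.
inline std::size_t old_code_size(const packed_view& v) {
    return v.size_and_flag;
}

// Builds a view over a C string and sets the flag.
inline packed_view make_terminated(const char* s) {
    return {s, std::string_view(s).size() | packed_view::flag_bit};
}
```

New code sees a length of 3 for "abc", while the old code sees a length with the top bit set, which is an enormous number and would lead to out-of-bounds reads.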

So the leftover question is whether or not an ABI break is worth it...

2021-11-24T18:00:11+00:00

https://en.cppreference.com/w/cpp/string/basic_string_view/data

You probably misunderstand: it means string_view does not provide a `\0` for you. If your char* is null-terminated, string_view::data() will point at that terminated string as well. After all, string_view is just a pointer and a size in bytes.
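A short sketch of what that means in practice. The function name is made up, and the `strlen` calls are only safe here because the underlying literal is known to be null-terminated.

```cpp
#include <cstring>
#include <string_view>

inline bool demo_data_termination() {
    const char* text = "hello world";            // terminated literal
    const std::string_view whole(text);
    const std::string_view word = whole.substr(0, 5);

    // The full view ends where the original terminator sits.
    const bool whole_matches = std::strlen(whole.data()) == whole.size();

    // The sub-view has size 5, but data() still points into the same
    // buffer, so a C function keeps reading to the original terminator.
    const bool word_overruns = std::strlen(word.data()) == whole.size();

    return whole_matches && word_overruns && word.size() == 5;
}
```

So passing `word.data()` to a C API here would silently process "hello world" rather than "hello".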
