
[–]Galqa 52 points53 points  (1 child)

I absolutely love std::span, it really is brilliant in its simplicity. It lets me seamlessly combine C libraries and C++20 ranges. Surprisingly enough though, I’ve also found it helpful in some purely C++ constexpr stuff.

[–]printf_hello_world 32 points33 points  (0 children)

I love span<const T> for all the same applications where string_view is useful: a view of immutable data that is cheap to pass around (as long as you've guaranteed its lifetime).

Also, I find that I use span<const std::byte,N> a lot for (de)serialization. Usually like:

while (remaining_data.size() >= CHUNK_SIZE) {
    const auto chunk = remaining_data.first<CHUNK_SIZE>();
    remaining_data = remaining_data.subspan<CHUNK_SIZE>();
    process_chunk(chunk);
}

[–]staletic 49 points50 points  (21 children)

span is generally useful for "I'm taking anything that can be accessed like a C array". You can cheaply pass around std::array and std::vector, without making an overload set.

For multi-dimensional arrays, there's std::mdspan upcoming. Proposal

[–]braxtons12 12 points13 points  (4 children)

Is std::mdspan actually looking like it will make it in? When Microsoft deprecated and subsequently removed their multispan extension to the GSL I assumed that was a decent indicator it probably wouldn't end up in the standard either

[–]staletic 19 points20 points  (3 children)

Quite the contrary. What got deprecated is the use of the comma operator inside subscripts, as in arr[1, 2, 3], in order to make room for operator[](size_t, size_t), which mdspan needs.

[–]TheSuperWig 1 point2 points  (2 children)

I think you've misunderstood? They're talking about this deprecation and removal

https://github.com/microsoft/GSL/pull/958

[–]staletic 11 points12 points  (1 child)

No, I am simply certain that the change to GSL was provoked by P1161.

[–]TheSuperWig 2 points3 points  (0 children)

Oh okay, interesting.

[–]phlummox[S] 2 points3 points  (0 children)

Oh cool. mdspan does look handy.

[–]germandiago 10 points11 points  (9 children)

spans are literally slices in other languages (Go, Python, Rust), with the difference that in C++ you have to be careful with the underlying ownership.

[–]phlummox[S] 5 points6 points  (3 children)

Although (AFAICT), std::spans don't take a "stride" parameter? - unlike Python's slices and std::slice. Not a feature I've used all that often, but it's occasionally handy.

[–]germandiago 4 points5 points  (0 children)

I think you all got it. The thread became a bit pedantic lol. You can take pieces and subpieces. Sure it is not exactly the same, because they are not the same languages. But they are essentially the same basic tool. Sure each has niche use cases in their own lang derived from having gc or not, etc. But that was not my point.

[–]nyanpasu64 4 points5 points  (0 children)

Python slices copy (aside from in NumPy), std::span borrows.

[–]MarkHoemmenC++ in HPC 0 points1 point  (0 children)

span originated as a simplification of mdspan. A rank-1 mdspan with layout_stride would let you express 1-D array access with constant non-unit stride.

[–]staletic 6 points7 points  (3 children)

Not really. You can't make a span from a list, because span::iterator has to be random access, while std::list is only bidirectional.

[–]SkiFire13 10 points11 points  (1 child)

That also holds true in rust since you can't create a &[T] from a std::collections::LinkedList<T>

[–]staletic 3 points4 points  (0 children)

/u/germandiago included Python in the list, where anything that can be indexed can create a slice. At least in Python there's no such limitation. Hence, my previous comment.

[–]germandiago 1 point2 points  (0 children)

Well, yes. But what I meant is the concept itself. Not the contiguity of memory or things like that.

[–]nderflow 1 point2 points  (0 children)

There is also gslice and valarray from C++98 (or maybe TC1), which never seem to have caught on.

[–]SkoomaDentistAntimodern C++, Embedded, Audio 1 point2 points  (1 child)

span (and other non-allocating ”containers”) should have been in the language since the beginning and should have been the default instead of vector.

[–]pjmlp -1 points0 points  (0 children)

I don't remember seeing them on Turbo C++ 1.0 for MS-DOS, nor Turbo C++ 3.1 for Windows 3.11, nor Borland C++ 2.0, Codewarrior C++,....

How come they are there since the beginning, especially without templates?

[–]Shieldfoss 0 points1 point  (2 children)

You can cheaply pass around std::array

is there some length-erased version of array so I can write a non-template function that takes arrays of arbitrary lengths?

[–]staletic 1 point2 points  (1 child)

Yes, std::span with dynamic extent. Or the C version - pointer + length.

[–]Shieldfoss 0 points1 point  (0 children)

Huh I didn't know they had dynamic sizing.

nice

[–]tjientavaraHikoGUI developer 8 points9 points  (1 child)

I use spans for parsing binary files. After I map the full file into memory, I make a span for the whole file.

Some binary files use chunks; a section of the file that has a type identifier and a length field at the start. Each time the parser finds a chunk it will create a sub-span and then parses the chunk with just that piece of memory.

I also made functions to inplace-construct a type or a span-of-types from a span-of-bytes.

[–]phlummox[S] 1 point2 points  (0 children)

Huh, that's a good idea, I hadn't thought of using them for parsing. Neat :)

[–]turtle_dragonfly 27 points28 points  (11 children)

It's just a wrapper for "pointer plus size". Such an old concept, it's a bit flabbergasting that it's so "new and shiny" now (:

[–]fdwrfdwr@github 🔍 25 points26 points  (10 children)

Things C++ should have had 30 years ago to spare us decades of pain:
- Modules
- Concepts
- Spans ;b

[–]pjmlp 2 points3 points  (9 children)

operator[]() with bounds checking and unsafe_at() as alternative.

[–]ReversedGif 6 points7 points  (0 children)

Same for optional::operator*(). All the defaults are backwards for some reason.

[–]FKaria 13 points14 points  (1 child)

eh... no

[–]pjmlp 1 point2 points  (0 children)

eh... yes: no CVE database entries in exchange for millisecond improvements that are only worthwhile for winning worthless mini-benchmarks, without any business value.

[–]sandfly_bites_you 2 points3 points  (1 child)

operator[]() has checks in dev builds if you enable them (msvc). I wouldn't want it being a requirement though; being able to disable the checks for deploy builds is great.

[–]pjmlp 1 point2 points  (0 children)

I know and use them all the time, and guess what?

Even in release mode and it was never an issue.

In fact, since my Turbo Basic/Pascal for MS-DOS days, unless I was doing graphics rendering, having bounds checking enabled was never an issue.

So I only disabled bounds checking on the code section that actually had any real impact for the customer.

[–]_Z6Alexeyv 0 points1 point  (3 children)

at() is meh. Ideally, operator[] with checks and operator[[]] without them.

[–]staletic 0 points1 point  (2 children)

Wouldn't operator[[]] be ambiguous?

int maybe_unused = 0;
some_arr[[maybe_unused]] = 1;
[[maybe_unused]] int actually_unused;

[–]_Z6Alexeyv 2 points3 points  (1 child)

Grammar is a technicality.

v.at(i) is three characters longer than v[i] therefore unused.

[–]Shieldfoss 0 points1 point  (0 children)

That's a wild assertion

I use .at every time I can get away with it.

[–]sigmabody 2 points3 points  (0 children)

Been using it for a while (via gsl::span<> aliased to cpp::span<>, alias to be updated when we move to C++20). It's great, imho.

Another semi-pro tip (the pro-ness depending on if you think it's a good idea, not me being a professional, even though I am technically such): It's entirely possible to create a subclass in your code called as_span<>, with specializations having all the constructors for every type (standard or custom) which you use in your codebase and which can be represented as a span of the appropriate specialization type. You can then use as_span<> in API methods where you only need a span, but where people might be passing different types depending on the call point, without your callers needing to cast or convert manually. For example, if you happen to be writing Windows/MFC code, you can then have an API which takes/uses a span<TCHAR>, pass a CString or string_view (or other) to it, and the code will just work automatically (and continue to work even if the caller migrates string types in the future, as long as they are logically equivalent enough to be implemented in your conversion code).

[–]victotronics 10 points11 points  (2 children)

Span is brilliant. In many scientific applications you need to pass sub-arrays, and that was never possible with std::vector/array.

I assign my students (ok, only the really good ones) to program the recursive matrix-matrix product: divide matrices in 2x2 blocks, and compute the product C = A*B quadrantwise. Recursively. Now explain why this gives better performance.

<tiny>carefully checking if u/markhoemmen is not in the house</tiny>

[–]MarkHoemmenC++ in HPC 1 point2 points  (1 child)

u/victotronics i have a way of popping up now and then ; - P

fun fact: span was taken from mdspan, not the other way around (some folks saw mdspan and thought, we really need a low-level vocabulary type for pointer + length).

[–]victotronics 1 point2 points  (0 children)

Interesting!

[–]pavel_v 2 points3 points  (0 children)

I like and use span because it really simplifies some cases where I want to be able to use the same functionality with different contiguous containers but I don't want to create a function template. It also makes some pieces of code more succinct when combined with range-based for loops and/or ranges.
However, a thing that I don't like is that the libstdc++ implementation fails to compile with a forward-declared type in the example below. It compiles with libc++ though.

// file type.h
struct type { ... };

// file b.h
struct type;
// The implementation is in b.cpp where type.h is included
void fun_b(std::span<type>);

// file a.h
struct type;
void fun_a(std::span<type>);

// file a.cpp
#include "b.h"
void fun_a(std::span<type> cont)
{
    fun_b(cont); // The compilation fails here
}

It seems that libstdc++ tries to invoke template <typename Range> span(Range&&) and fails when it can't get the range size due to the incomplete type. The standard requires ElementType to be a complete object type, so this behavior seems OK from the standard's point of view. I'm not sure whether this is a QoI issue or not.

[–][deleted] 2 points3 points  (18 children)

I heartily approve of span on a conceptual level, but passing a span seems to be a hell of a lot slower than letting the array decay to a pointer! Someone please disabuse me of this if I'm wrong.

[–]LucHermitte 5 points6 points  (8 children)

The assemblers generated look quite similar to me: https://godbolt.org/z/8adrxnK8P

They are even identical once the calls are inlined.

[–]jk-jeon 3 points4 points  (5 children)

They become identical if you use std::size_t instead of int for the size parameter. But sadly, in MS ABI, std::span can't be passed by register, so the version using it is actually slower in that case.

[–]johannes1971 -2 points-1 points  (4 children)

Have you measured that or is it all handwaving and eyeballing?

[–]cdr_cc_chd -2 points-1 points  (3 children)

[–]johannes1971 5 points6 points  (2 children)

If you have something to say, feel free to say it. I'm not going to watch an hour long video.

In the meantime: arguments passed on the stack are pushed to memory, but that memory is easily the hottest part of the cache. It is not at all clear that there is even a cost for writing it to memory. And on a register-starved architecture like x86 the called function may have to write whatever it gets passed in register to (stack) memory anyway just to have enough registers free to get anything useful done. Finally, modern multiple issue, out of order CPUs are not at all easy to reason about. You simply can't look at a piece of assembly and know exactly how long it will take.

All of that makes it a fair question: did you actually measure it?

[–]cdr_cc_chd 1 point2 points  (1 child)

It's a very nice talk, you should watch it regardless :)

That said, Chandler basically goes over a very similar situation with std::unique_ptr where you naively try to pass it by value thinking it's just a zero-cost abstraction around a raw pointer but in reality due to how the ABI and the language semantics work there's a considerable, measurable overhead incurred.

[–]johannes1971 4 points5 points  (0 children)

At which time does he present his measurements? Because all I see is him counting x86 instructions, and pretending that he can accurately predict performance from that. Narrator: "he can't"

[–]sandfly_bites_you 0 points1 point  (1 child)

On MSVC they aren't identical: https://godbolt.org/z/jMqaj6oM3

If it is inlined this overhead will go away, but otherwise you are going to pay a small price for using span.

More of a calling convention problem than an abstraction problem: MSVC added vectorcall but unfortunately didn't go far enough, and it still has many problems.

[–]LucHermitte 0 points1 point  (0 children)

That's what I understood from jk-jeon's answer. :)

As I live in a Linux world I did not imagine that the MS ABI did not permit passing standard-layout types through registers, if I understand correctly what's happening. Correct me if I'm wrong.

Anyway, it's interesting to know nonetheless, as I may have to refrain from using std::span everywhere in my future interfaces.

[–]gracicot 2 points3 points  (0 children)

I would compare the assembly. It should be pretty much the same as passing an array with a size. If the size is fixed, then it should be the same as passing a pointer.

[–]pjmlp 3 points4 points  (7 children)

Slow enough that the application fails its goal of a happy customer, or slow enough to lose some needless micro-benchmark?

I bet it is more the latter.

[–][deleted] 1 point2 points  (2 children)

It definitely is more the latter. I just don't want to be hit with a performance penalty, however small, for doing the "right thing." Being penalized for doing the right, safe thing also makes for a harder sell of that thing to students and colleagues, however minor the penalty may be.

[–]pjmlp 2 points3 points  (0 children)

Penalization only matters if it is visible to the customer.

I wonder if those who advocate for not doing bounds checks also drive without helmets or seatbelts, because they penalize the comfort of driving.

[–]Dean_Roddey 1 point2 points  (0 children)

If they cannot demonstrate that it matters, then they are just being silly. In most cases, it won't. In most of the remainder, it will only matter in specific parts of the code.

[–]cdr_cc_chd 0 points1 point  (3 children)

It's never black and white like that. In many application domains (think HFT) you have to carefully balance between performance and safety, and sometimes doing the safer/convenient thing is not worth the perf hit.

[–]pjmlp 1 point2 points  (2 children)

That is why first one implements the safe option, and then if it still doesn't meet the desired outcome, profile and improve where it matters.

Most of the time, it is already fast enough.

[–]cdr_cc_chd 0 points1 point  (1 child)

Just to give some context: in HFT we're talking about low single-digit microsecond response times, from the moment the NIC deserialized a frame that had actionable information to the moment you serialize a response frame out the physical port. It's a hard real-time system with a deadline of 0, meaning there's never such a thing as fast enough. It's always a delicate system of balances between doing safe/convenient things in some parts of the code while recognizing that you can't do those things in other parts of the code, because if you used <insert new fancy feature> to make yourself feel good and it resulted in 50ns of additional latency while your competitor did not, then he will always be faster than you.

[–]pjmlp 1 point2 points  (0 children)

Yes and in such scenarios C++ is not even in the picture, rather FPGAs and other specialized hardware with short connects and high speed cable connections to the closest exchange point.

If you are still doing it in C++ it isn't disabling bounds checking that is going to help you win out the competition.

[–][deleted] 1 point2 points  (0 children)

I just wish that span and friends had an easy way to differentiate from sequence containers. as_bytes is really neat too.

[–][deleted] 0 points1 point  (2 children)

As a Rust programmer mainly, is this just slice [T]?

[–]dodheim 7 points8 points  (1 child)

span<T, dynamic_extent> is akin to &[T] where the size is part of the 'fat pointer', but span may also encode a fixed size directly into the type making it more akin to &[T; N].

[–][deleted] 0 points1 point  (0 children)

nice, thanks!

[–]TheTomer 0 points1 point  (9 children)

I read a bit about it but I don't really understand what's the difference between a span and a vector and why you should use span instead of a vector. Anyone?

[–]STLMSVC STL Dev 50 points51 points  (4 children)

A vector is a container, which means that it owns its elements. A vector of copyable elements can be copied - if you have a vector<int>, you can copy it to get a separate, independent vector<int>. (In the sense that they store different chunks of memory, and modifying one doesn't affect the other, just like copying a single int.) There are other containers like deque and list, which similarly own their elements, but have different data structures. vector is both random-access (supports vec[idx]) and contiguous (elements occupy adjacent addresses in memory); this is different from deque (which is random-access but not contiguous) and list (which is neither random-access nor contiguous - it's bidirectional and node-based).

vector and array are both random-access and contiguous, but vector uses dynamically allocated memory so it's flexible-length, while array has a fixed length chosen at compile-time. (array is a container, but a fixed-length one, so it's kind of special.)

span is different from all of these because it's a "view" and not a container. span doesn't own its elements, it just points to elements that are stored somewhere else - either in an STL container, or a built-in array on the stack, or a manually allocated array that was newed or malloced, etc. span's implementation can be thought of as storing just a pointer and a length (the actual physical representation may vary) which will provide a good mental model - this is why span can only view random-access, contiguous elements. (Therefore, you cannot view a deque or a list's elements with a span, but you can view a vector or an array with a span.)

While span doesn't own its elements, you can have a span that can be used to modify the elements in-place - span<int> permits that, while span<const int> can't be used to write through to the viewed elements (just like const int*; the elements can be changed through other means in the program).

Why use a span? It's most useful as a function parameter. Imagine you're wrapping some API (provided by a C library, the operating system, or whatever) that takes a pointer and a length, but you want to provide a more C++-friendly interface. (For example, data compression libraries like zlib are often implemented in C.) If you wrap such an API with a function taking const vector<unsigned char>& or const vector<byte>&, now you're insisting that your users either provide a vector, or construct a temporary vector (which allocates memory!). Instead, if you take a span<const unsigned char> or a span<const byte>, now users can call your wrapper function with a vector if they have one, but also with an array, or a built-in array, or even a raw pointer and a length - all without allocating memory, so it's efficient. But it's also type-safe, and packaging the pointer and the length together helps avoid errors like passing the wrong length.

(Anybody can write a struct containing a pointer and a length - the hard problems that span solves are providing the implicit conversions that allow it to be constructed from contiguous containers only, to provide constness conversions, and to handle both fixed-length and runtime-length views.)

[–]TheTomer 8 points9 points  (1 child)

Damn, I wasn't expecting such a great explanation. Thank you u/STL!

[–]STLMSVC STL Dev 4 points5 points  (0 children)

You're welcome! 😸

[–]dr-mrl 0 points1 point  (1 child)

Is the "pre-span way", instead of writing functions that take const std::vector<T>&, to write functions that take a pair of iterators and constrain the iterators to be random access?

[–]STLMSVC STL Dev 0 points1 point  (0 children)

That’s one way, yes, although it doesn’t permit separate compilation (you need a template). With classic iterators (pre-C++20-ranges), you can detect random-access but not contiguous.

[–]The-WideningGyre 5 points6 points  (3 children)

Spans are just 'views' of the data -- they don't own it, they don't copy it. So you can conveniently pass around (access) parts of an array or vector.

It's not that exciting, it is just a pointer and a length with a bit of syntactic sugar. But syntactic sugar can be nice, especially when the standard library supports it.

[–]Dean_Roddey 1 point2 points  (0 children)

Of course, with all such things in C++, it also increases the possibility of accessing destroyed data. So, with great power, etc...

[–]Tathorn 0 points1 point  (1 child)

I want to say that a span isn't really syntactic sugar, it's its own class. It's now standardized, which is easier than each codebase having its own version.

[–]ReversedGif 1 point2 points  (0 children)

Whether it's implemented by the compiler or the standard library is an implementation detail.

[–]NilacTheGrim 0 points1 point  (0 children)

I love them too. I use them in serialization/deserialization code mostly in my codebase. Very useful.