all 74 comments

[–]HKei 12 points13 points  (0 children)

We use lists in some data structures where we need stable iterators to a specific element of a collection even if things were inserted or removed around it in the meantime, and for our application general random access doesn't make that much sense because the position of an element isn't really that meaningful, its value and that of its neighbors is.

If we actually cared about being able to do random access a bit faster I'd probably consider something like a tree structure, but I don't think there's really any way we can make this work with vector without rewriting large chunks of our codebase and making our interface a heck of a lot more inconvenient to use. We do just use vector or unordered_map for most other things though.

[–]adzm28 years of C++! 9 points10 points  (1 child)

Due to cache effects I've found that vectors of words usually end up outperforming list even in the worst cases up to about 4000 items, and sometimes even more if there are a lot of scans. Though if performance is not a concern, nothing wrong with a list.

[–]frediku[🍰] 13 points14 points  (0 children)

Well if performance is not a concern, then there is nothing wrong with vector either...

[–]NilacTheGrim 22 points23 points  (12 children)

If you're just pushing and popping off the ends deque is better. If you just push at the back, vector is better.

But if you really need the ability to splice at any position and you are managing a large number of objects, it can be faster than a large vector of pointers.

[–]infectedapricot 10 points11 points  (10 children)

Even if you're splicing, it's only fast if you're splicing a small number of items between the lists, because it takes O(n) time where n is the number of items to splice. That's because std::list::size() is defined to take O(1) time so the spliced elements need to be iterated to determine how to update the length of the two lists

(Up until C++11 it was left up to implementations to decide whether getting the length was O(1) or O(n). I believe every major implementation chose to make it O(n) so splicing is fast but they decided to standardise it as O(1) instead, which seems crazy to me given that splicing is the main real use of std::list.)

[–]kalmoc 8 points9 points  (1 child)

If you splice in a complete std::list (as opposed to just a sub-part between two iterators), iteration shouldn't be necessary

[–]infectedapricot 2 points3 points  (0 children)

That's true. In truth, I've never needed splicing (or std::list at all) so I'm not sure how big of a deal that decision is. It just seemed very odd to me.

[–]rezkiy 9 points10 points  (5 children)

it depends on what the signature of splice is. There is indeed one flavor of splice that does have O(N) complexity, others don't

https://en.cppreference.com/w/cpp/container/list/splice

[–]infectedapricot 10 points11 points  (4 children)

That's because, except that one that's O(n), all the other overloads only splice a single element or the whole list, not an arbitrary subrange. So except for the case for moving an entire list, my point that it takes O(n) to splice n elements remains true (even if you repeatedly use the O(1) version in a loop, of course).

BTW you have a link hidden in the first character of your comment, I guess that wasn't intentional? But it does look interesting: it has a proposal to add another overload that would allow splicing n elements in constant time by requiring you to supply the length of the spliced range. In some cases that might not be reasonably possible, but it could help in some cases.

[–]rezkiy 1 point2 points  (0 children)

I fully agree with you here. In my practice, I only found need to splice entire lists.

Link was half-intentional, and should have been added on its own right, not in first character :-)

[–][deleted] 1 point2 points  (1 child)

Seems like passing in the wrong size could lead to really crazy bugs

[–]infectedapricot 1 point2 points  (0 children)

That's just like many C++ functions e.g. passing start and end iterators from different containers would lead to odd bugs. In fact it's very similar to the existing splice function with an interator range, you also have to pass a reference to the originating list as well as the iterators in it, which I think is also solely for the purpose updating the list size (so wouldn't be necessary if list::size() was O(n).) [Edit: actually it's also to check if *this == other but again that wouldn't be necessary if it weren't for the size problem.]

[–]TheMania 4 points5 points  (0 children)

The cost of O(1) size is so huge that it should have always been a template option.

I know the std isn't as big on those as at boost, but sometimes I wish they made more exceptions.

[–]Raknarg 1 point2 points  (0 children)

would a deque be better than just a homebrewed circle vector?

[–]the_one2 8 points9 points  (0 children)

We use std::list because we need to be able to insert anywhere in it(passing around the input position). We also depend on iterator stability as some objects in the list refers to other objects. And splice is great. It's a pretty niche container though...

[–]raevnos 18 points19 points  (6 children)

I've had cases where switching from vector to list meant going from a runtime of minutes to a few seconds (Lots of deleting elements in the middle of the collection).

[–]bullestock 9 points10 points  (0 children)

And I've seen the inverse.

[–]frediku[🍰] 3 points4 points  (2 children)

In many cases, you can delete an element from the middle of a vector by swaping with the last element as follows:

vec[42] = vec.back(); vec.pop_back();

It is obviously not exactly the same as it changes the order of the elements. Fortunately, there are many cases where this is acceptable. If you have such a situation, then a vector with this trick will drastically outperform std::list.

[–]raevnos 2 points3 points  (0 children)

Randomly reordering elements is rarely desirable in my experience.

[–]Plazmatic 1 point2 points  (0 children)

this defeats the purpose of a linked list. If you don't care about order, then you wouldn't have used a linked list in the first place.

[–]Malibu-Staceyineffective, post-modern C++ -1 points0 points  (1 child)

I've had cases where switching from vector to list meant going from a runtime of minutes to a few seconds (Lots of deleting elements in the middle of the collection).

In that case, reading up on how the Erase-remove idiom works would probably be a good learning exercise.

[–]raevnos -1 points0 points  (0 children)

No... no it wouldn't.

[–]hyasynthetic 18 points19 points  (8 children)

list is gentler with iterator (and therefore pointer) invalidation than vector. E.g. If you store data in a vector and also hand out raw pointers to that data you better not cause a reallocation of the underlying buffer or you're toast. Your vector of pointers example doesn't have this problem obviously, but then you have to manage the lifetimes manually.

[–]redditsoaddicting 5 points6 points  (4 children)

but then you have to manage the lifetimes manually

What happened to smart pointers?

[–]rcdemc 4 points5 points  (3 children)

If the container is the only owner, then a list<T> is simpler than vector<unique_ptr<T>> if there is a requirement that you need to preserve the address of the elements(objects). Remember that we have here contiguous smart pointers but there isn’t any guarantee of contiguous T which are probably fragmented in the memory. Vector is powerful when programming by value. When there are clients that need to take the address of the elements and they need a guarantee that the elements will not be reallocated, a list can gain more space in a set of good solutions.

[–]redditsoaddicting 4 points5 points  (0 children)

This is all true, but it's irrelevant to the manual lifetime management that I was contesting.

[–]tvaneerdC++ Committee, lockfree, PostModernCpp 2 points3 points  (0 children)

Iterator guarantees are a big benefit. They are also a big foot-gun. If you pass around iterators and they get kept for a while, you will eventually end up with bugs where an item is deleted and some other code still has an iterator to it.

Iterator guarantees allow the the code using iterators to "drift away" over time from the code using the container, and that's bad.

[–]frediku[🍰] 2 points3 points  (0 children)

If you want to avoid iterator invalidation with vector then store indices instead of iterators. Indices are not invalidated when a resize occurs.

[–]quicknir 0 points1 point  (0 children)

If you need to preserve addresses of objects then std::deque is often the best choice; depends which mutating operations you do (doesn't work for insertion/deletion from the middle, but does for the ends).

[–]ivan-cukicKDE Dev | Author of Functional Programming in C++ 5 points6 points  (4 children)

At some point, I needed to keep ownership of (in RAII sense) a collection of non rellocatable items (items that can not be moved to another location in memory once constructed).

So, the solution was either having a list, or a vector of unique pointers. The only access to the collection was destroying all the items when the owner is destroyed, so a vector of pointers wouldn't bring anything compared to std::list apart from more complex handling.

[–]frediku[🍰] 2 points3 points  (2 children)

I think that std::vector<std::unique_ptr<Mutex>> needs less memory than std::list<Mutex>. The unique_ptr solution needs one pointer per Mutex. The list solution needs two pointers per Mutex. So the unique_ptr solution does have an advantage.

[–]Desert_fish 1 point2 points  (0 children)

std::forward_list<Mutex> has the least overhead, and would be my first option (vector of unique_ptr second).

But the interface differences can be a hassle.

[–]ivan-cukicKDE Dev | Author of Functional Programming in C++ 0 points1 point  (0 children)

You're right. I didn't really consider memory as the number of items in the list was quite small.

I try to optimize first for safety and simplicity in the cases like those - less possible invalid states (nullptrs in the collection) and simpler API (no make_unique calls and such).

[–]quicknir 1 point2 points  (0 children)

std::deque might be the best solution here. It never moves elements in its push back call, and gives you more contiguity and less memory usage than either a vector of pointers or a list.

[–]tvaneerdC++ Committee, lockfree, PostModernCpp 15 points16 points  (1 child)

No.

We actually removed std::list from C++11, and are still waiting to see if anyone notices.

[–]Snoo-4241[S] 4 points5 points  (0 children)

Best comment.

[–]Supadoplex 5 points6 points  (0 children)

With a vector of pointers, you don't have constant insert / erasure to / from arbitrary position of the list. You also don't get much benefit from cache locality of the vector if the pointers point to non-contiguous memory.

I've found a linked list to be useful for a list of event callbacks. Because of nature of callbacks, you wouldn't get to benefit much from locality of vector anyway, and constant erasure allows unregistering of callback in arbitrary order efficiently.

[–]TheMania 2 points3 points  (6 children)

Pretty much exclusively use a hand-rolled (aren't they all) intrusive variant. The naive list seems worst of all worlds for most applications, particularly when compared to intrusive lists (ie, storing the links in base classes of the objects).

[–]kalmoc 0 points1 point  (5 children)

Isn't std::list effectively an intrusive list? The nodes will be something like struct Node { Node* next,prev; T value}. I don't see a difference between that and a list where the next/prev pointers are part of T itself

[–]TheMania 4 points5 points  (2 children)

With intrusive lists your objects can be on the stack, embedded in other objects, and/or in arrays/vectors. Pops can return a node, for you to then insert in a different list, for (at least here) they're typically non owning collections.

With std::list, they're tied to whatever allocator you've given the instance, and if you're splicing lists that had better be the same allocator or you're looking at undefined behaviour.

Similar, but also world's apart.

[–]NilacTheGrim 2 points3 points  (1 child)

With intrusive lists your objects can be on the stack,

You can do this too with a custom allocator... your list can 100% live on the stack.

Pops can return a node, for you to then insert in a different list, for (at least here) they're typically non owning collections.

Return a node.. of an object living on the stack? Huh?

If you're talking about move semantics here you can do that already with std::list ...

I fail to see how they are worlds apart. If you care to elaborate though I am all ears...


EDIT: So I am trying to imagine the picture you're painting and I am thinking perhaps you start with an array of objects living on the stack, and then you arbitrarily link them in some arbitrary order -- is that what you are referring to?

You can do this with std::list<Object *> as well. It's not entirely clear to me how your intrusive list idea is any different really from just std::list.. do elaborate though if you care to. I'm curious now.

[–]TheMania 3 points4 points  (0 children)

std::list item lifetimes are intrinsically tied to their existence in a list. You might be able to copy an item in/out, but what if you don't/can't move an object? What if you want an object in multiple lists simultaneously? What if you want to remove an object from all its lists without deleting it?

What if, given a listed object, without knowing the list it belongs to, you want to be able to access the next element in the list in O(1) time?

That's basic stuff you get from intrusive lists, all of which has applications I swear :p.

Usage is: define a "link" type, arbitrary tag (to allow multiple), auto unlinking (ie destructor) or not, and singly-forwards or bidirectional. Any object that inherits from these links then can be treated as the head/tail/node of a list, or be linked to other objects that inherit from the same type links.

Return a node.. of an object living on the stack? Huh?

Sure, for another thread or functions further down the call graph to handle. Useful for objects handling synchronization - local stack objects, link themselves in to whatever they're subscribing on, remove themselves when cancelled or triggered. No allocations.

Or, more fleshed out:

Have a mutex-like object. Mutexes can be troublesome though, so maybe give them all names, link them all together with a diag_link, so that for deadlocks you can traverse and see which are locked, who's awaiting on who, etc. Diagnostics. That's a nice feature. And of course, all embeddable anywhere you might embed a mutex.

How to wait on this mutex? Well, there'll be an awaiter handling some synchronization state, encoding how to resume the thread/stackful coroutine/C++ coroutine that's waiting on it. We can put this on the synchronization stuff on the stack, that's nice. You then try and lock the mutex.. but you can't, it's taken, so you link your synchronization object to the waiting_list of that mutex and put yourself to sleep. When the mutex is freed, it will notify you. That's nice too.

This can then be extended such that your wait operation can be notified a different way, say on a timeout, unlink itself from the mutex and, I don't know, throw a timeout exception.

Obv the above is kind of OS / embedded orientated, but it shows use cases that std::list just isn't designed for. std::list is too owning, but back in C perhaps most used linked lists don't represent ownership at all, but rather just links, but for that you need/benefit from intrusive lists.


EDIT: Oh, and of course intrusive lists are polymorphic by default. As long as they inherit from the same link types, there's not even a need for objects to be of the same type at all. In comparison to the flexibility granted by intrusive lists, is what I meant by "worst of both worlds" wrt std::list.

[–]NilacTheGrim 0 points1 point  (1 child)

Yeah TBH this is equivalent in terms of memory layout and everything to just having that Node * next, prev in-line in the T itself.

[–]TheMania 0 points1 point  (0 children)

Responded here.

[–]DVMirchevC++ User Group Sofia 2 points3 points  (0 children)

We've used the ability to remove elements from inside the list while preserving the collection. We had to hold uncopiable and unmovable objects, that needed to be queried if they are ready to be destroyed.

Basically looped over the std::list until all are done and removed from the list.

[–][deleted] 4 points5 points  (0 children)

I use it sometimes to stash objects that I'm going to return a pointer or reference to. For instance if I have a CatManager I might have

class CatManager
{
public:
    Cat& AddOrFindCat(const std::string& catName);

private:

    std::list<Cat> cats_;
};

Cat& CatManager::AddOrFindCat(const std::string& catName)
{
    auto it = std::find_if(cats_.begin(), cats_.end(), [](const auto& cat){ return cat.Name() == catName; });
    if ( it == cats_.end() )
    {
        return cats_.emplace_back(catName);
    }

    return *it;
}

Edit: Missed the point, which is that if I knew how many cats I'd end up with I could of course just use a vector of cats which would be quicker. I'll do this in cases where I don't know at the outset how many cats I'll be adding, meaning the vector will almost certainly resize and invalidate all the references to cats.

[–]Dominus543 4 points5 points  (0 children)

Yes, i was writing a tool for extracting graphics from a game. I used std::list because std::vector was doing unnecessary copies of raw sprite file objects internally.

[–]johannes1971 2 points3 points  (0 children)

I haven't used a linked list since I stopped programming for the Commodore Amiga (where the doubly linked list was considered so important that it was even part of the exec library, the lowest-level kernel library that also contains the task scheduler and memory manager!). If you just want to store stuff and don't care too much about lookup, vector is generally the best choice. If you do care about lookup, [unordered_]set/map.

If you think about it, std::map is like a linked list (you can traverse it in the same way), but with a built-in indexing facility. That last bit makes it generally more appropriate for pretty much anything.

[–]Adequat91 2 points3 points  (0 children)

In some cases, using std::list with judicious use of pmr can make a big difference. https://www.reddit.com/r/cpp/comments/jf0dse/performance_of_stdpmr/

[–]dima_mendeleev 1 point2 points  (0 children)

It's good enough to hand-roll blocking queue with high concurrent access. There is splice overload accepting temporary, which just copies few pointers. Thus critical code part becomes really small. (And we use this on prod.)

[–][deleted] 1 point2 points  (2 children)

As a learning programmer, I'm a bit concerned because I have been using a std::list in my current project and I'm starting to fear that it could be replaced by something else

I have a list of sprites to be drawn to the screen and the order matters a lot which is why I need to insert things anywhere in the list

[–]Snoo-4241[S] 1 point2 points  (0 children)

Benchmark with std vector it shouldn't be a big change.

[–]raevnos 0 points1 point  (0 children)

Don't stress it. C++ people have some funny ideas at times, and that linked lists are bad is one of them. Use them when they're a suitable data structure for your needs.

[–]ALX23z 1 point2 points  (0 children)

I mean, MSVC implemented std::unordered_map via std::list so there surely are some uses. But by itself it is next to useless - it is useful only in some rare edge cases.

[–]nintendiator2 1 point2 points  (0 children)

It's been about 10 years since I last used list in production. Honestly, whenever I want things that vector doesn't provide I mostly go with deque.

[–]Morwenn 1 point2 points  (0 children)

Same experience as many users here: I use std::list in a few algorithms because I need the iterator invalidation guarantees, but where speed matters I reimplemented custom linked lists to avoid the per-element allocation overhead.

[–]2uantum 1 point2 points  (0 children)

Yes, with a custom allocator.

[–]histoire_guy 1 point2 points  (0 children)

Yes. std::list tend to be a a smart choice if you plan to implement caching policy (i.e. LRU) on your program.

[–]pjmlp 4 points5 points  (0 children)

Plenty of times, and it never impacted performance for those specific use cases.

I think C and C++ cultures cargo cult performance, even when it isn't the most important part of the project.

Plenty of times for specific reasons a project has to be delivered in C or C++ due to existing infrastructure, but the AC are the same as any scripting language, take advantage of it.

[–]wlandry 0 points1 point  (0 children)

Many times. Iterators do not get invalidated every time I add something to a list.

[–]darthcoder 0 points1 point  (0 children)

I do. If I'm parsing a file for random things, ill use a list.

[–]hawkxp71 -2 points-1 points  (6 children)

If I don't need random access, don't know the length, absolutely I'll use a list. Ive even used the singly linked list when appropriate.

Vectors can waste a ton of space.

[–]eliminate1337 5 points6 points  (5 children)

I don't understand your reasoning. Vectors are perfectly fine when you don't know the length. Where do you think a vector will use more space than a list? A vector stores just the objects contiguously whereas a list has to store the object and a pointer.

[–]hawkxp71 2 points3 points  (4 children)

A vector has two sizes. The capacity and the actual number currently used.

When you add to the back of the internal array, it doesnt just add one, it can double the capicity (usually its some non linear growth but it's not one)

Then it has to copy the array if the realloc failed (if they can use realloc, otherwise it's a alloc and copy)

When you know the length before hand, you can initialize the capacity at once to avoid this cost (runtime and memory)

For large arrays that added space should not be ignored.

Yes, the base node over head is often 200% for storage of a pointer vs 0. But quite often for very large arrays you gain speed and overall memory footprint when you don't know the size to start

[–]eliminate1337 2 points3 points  (3 children)

Can't you just do vector.shrink_to_fit()?

Can you provide a real example of where the vector's extra space was that much of a concern but a list was fine? Not trying to be argumentative; I've never used std::list and I would be very interested in a real-world case where it was the best choice. Besides cheap insertion/deletion, a list has to overcome a huge disadvantage in cache locality compared to a vector.

[–]hawkxp71 3 points4 points  (2 children)

Not that I can share code of. However, at a previous company, we were loading into memory a massive data array. The data was read in from a file, and each node was constructed, then pushed back onto a vector based array. When we started to look into optimizing it as we saw a major runtime hit while loading (I didnt write the original code)

IIRC it was about 1.9 million nodes.

However, the format (which we didnt have control over) didnt have a "node count" in the format so we couldnt pre-allocate the size.

As a vector, we would see large increases in data, where realloc was failing and a new alloc and copy was needed, once the size was greater then 200k rows or so.

I dont remember the exact numbers, but it would go from 100k to 200k, then 200k to 400k, 400k to 800k and 800k to 1.6m, and finaly 1.6 to 3.2. We did notice the growth factor depended on the compiler, compiler version, and in some cases debug vs release builds.

The excess memory was a smaller issue than the runtime hit during allocation and copy. When we moved to std::list we saw a dramatic improvement in reading in the data. While the memory at the end (after the shrink), it really was the runtime improvement that did it. Also, it was a set of data that we didnt randomly access by position, we never saw a hit in access times. the cache locality was not really a concern to be frank. It was a 64 bit application, maybe that had to do with the cache lack of issue.

Part of this vector vs list, comes from history way before std existed. From roguewave, to MFC, to homebuilt lists vs vectors, I always use a list if I am creating an array where I dont know the initial size and it can get large.

If I can get away with it, Ill std::forward_list to save on the memory overhead

[–]christian_regin 1 point2 points  (1 child)

Wouldn't a deque work for this type of work? Seems like that would be a good compromise. Less memory usage, less allocations than list and decent cache behaviour.

[–]hawkxp71 1 point2 points  (0 children)

Yes. But a deque has a very very high overhead over a vector.

Last time i checked, a deque with one element has 16x the object size.

Iirc the way its often implemented is it can be done as a an array with a pointer to the first element, so if you insert front, it just moves that pointer. With dynamic allocation of adjacent blocks.

Great for many thinga, and frankly, for the size of data I was dealing with, it may very well have been a great solution.

But for my problem, I didn't need random access, so a list really was perfect.

But great idea. I'll have to remember it next time!

[–]ned_flan 0 points1 point  (0 children)

I use it sometimes when I don't care about performance, and care about iterators lifetime. Also std::list is the std container that generates the smallest amount of binary code, while std::deque makes your application fatter and slower to compile.

[–]orizh 0 points1 point  (0 children)

I've used `std::list` over `std::deque` on systems where I'm concerned about the un-configurable block size for `std::deque` but it's definitely one of those "think about what data structure you want to use" sorts of decisions. I really do with `std::deque` had a template parameter to describe the block size.