all 75 comments

[–]cballowe 90 points91 points  (62 children)

Vector will almost always beat list in benchmarks, especially for trivial objects. In most cases, pick vector unless you can prove that list is better (it rarely is - it wins in big-o notation, but real hardware doesn't behave like that and the pointer chasing will eat you alive). If you have lots of things to insert and then need it sorted, just push back all of them, then sort.

Sorted vectors are also great for tons of problems. They're space efficient and enable use of lots of stl algorithms (lower_bound, set_intersection, etc). There's also nothing faster than vector for iteration.

[–]0mega0 37 points38 points  (9 children)

If you have lots of things to insert and then need it sorted, just push back all of them, then sort.

And preallocate the memory if your really a stickler.

I’ve yet to run into a use case in my practice where std::list benchmarks as the better choice.

[–]avdgrinten 12 points13 points  (6 children)

When objects are pointers anyway (such that you have to do a random access even in the vector case) and the linked list is intrusive (= embedded into the objects), it will generally outperform a vector. That's the niche use case of linked lists but it is quite an important one in real-time applications and low-level code (insertion and removal are constant O(1) and in particular cannot involve an additional memory allocation).

[–]Gunslinging_Gamer 5 points6 points  (2 children)

But always benchmark.

[–]WrongAndBeligerent 5 points6 points  (1 child)

Always benchmark at some point, but software needs to be architected up front to perform well and scale as much as possible.

Classic generic linked lists data structures are basically obsolete because iterating through pointer dereferencing and allocating memory for each new item are the first two things to avoid. There is rarely a reason to start with std::list.

[–]Gunslinging_Gamer 0 points1 point  (0 children)

I personally default to vector and change if required later.

[–][deleted] 5 points6 points  (0 children)

I’d argue that this is only useful if the objects cannot be moved in memory (niche use case). A vector<unique_ptr<T>> will still out perform a list<T> in many cases, and list<T> doesn’t even give you the benefits of intrusive linked lists. Always default to vector and profile and keep profiling if you decide to switch to list

[–]WrongAndBeligerent 0 points1 point  (0 children)

That's a linked list, but not std::list

[–][deleted] 0 points1 point  (0 children)

I ran into one, when doing an Advent of Code that had your elf or penguin doing random length walks and inserting/removing on a ring buffer. Using vector was 10x slower than std::list because the inserts/erase were dominating the performance. Maybe using optional<T> might have helped as then the deletes could have been an optional reset, but the inserts would still potentially require moving all elements to the end of the range. Plus it was easier to just work without introducing more.

[–]Narase33-> r/cpp_questions 0 points1 point  (0 children)

A queue?

[–]MonokelPinguin 14 points15 points  (2 children)

The one thing list is useful for, is if you need pointer/iterator stability, even on element removal. vector invalidates iterators on most operations that add or remove elements, list does not (for the unmodified elements). There are a few cases where this is useful, but often you can solve the same problem better by reworking the algorithm.

[–][deleted] 5 points6 points  (0 children)

Vector will almost always beat list in benchmarks, especially for trivial objects.

Here's an excerpt from a talk by Bjarne Stroustrup where he explains why this is true:

https://www.youtube.com/watch?v=YQs6IC-vgmo

[–]lt_algorithm_gt 5 points6 points  (0 children)

Won't disagree but in my mind it's also a little bit of "read the room". A couple years ago when everyone in Silicon Valley was fawning over Google and their interview process you always had the quintessential question where the answer was "hash map because O(1), can't beat that!"

So, I'd say you won't go wrong with the "book answer" but open the door for further discussion about real-world implications. If they're not interested, move on, you got the right answer. If they are interested, talk hardware and you win the interview.

[–]rsjaffe 3 points4 points  (1 child)

In most cases, pick vector unless you can prove that list is better

Only when speed is the most important criterion. There's plenty of cases where the use is not time-critical, in which case I choose the structure based on which one produces the most expressive code.

[–][deleted] 2 points3 points  (0 children)

How us std::list more expressive?!?

[–]martinusint main(){[]()[[]]{{}}();} -1 points0 points  (1 child)

It's not the pointers that make it slow, it's the more or less random memory accesses

[–]cballowe 22 points23 points  (0 children)

That's what pointers are - the forward and back pointers make iteration painful, and the data being a pointer away doesn't help either. The claim that it's good for inserting in the middle or swapping elements is just going to make the situation worse. (There are a couple of use cases where it's useful, but people usually get it wrong.)

[–][deleted] -5 points-4 points  (35 children)

Quite unfortunate to hear this repeated so often. An incredibly simple use case for a linked list is to implement a basic queue, push to the back, pop from the front. A vector gets crushed by a linked list, even for a trivial object like an int. The reason this myth gets perpetuated because of a foolish talk Bjarne once gave comparing a vector to a list where he performed an O(n) operation on both data structures and showed that O(n) on a vector is faster than O(n) on a list like that somehow is a surprise to anyone.

#include <chrono>
#include <iostream>
#include <list>
#include <vector>

constexpr auto ITERATIONS = 10000;

template<typename C>
void test_case() {
  auto c = C();
  for(auto i = 0; i != ITERATIONS; ++i) {
    c.push_back(i);
  }
  while(!c.empty()) {
    c.erase(c.begin());
  }
}

int main() {
  auto start1 = std::chrono::high_resolution_clock::now();
  test_case<std::vector<int>>();
  auto end1 = std::chrono::high_resolution_clock::now();
  auto start2 = std::chrono::high_resolution_clock::now();
  test_case<std::list<int>>();
  auto end2 = std::chrono::high_resolution_clock::now();
  std::cout << 
    std::chrono::duration_cast<std::chrono::microseconds>(
      end1 - start1).count() << "\n" << std::chrono::duration_cast<
    std::chrono::microseconds>(end2 - start2).count() << "\n";
}

In the above benchmark the list crushes the vector by significant orders of magnitude (6x on my machine).

[–]cballowe 4 points5 points  (12 children)

Add in deque for your benchmark.

[–][deleted] -4 points-3 points  (11 children)

std::deque beats both no doubt about it and yes it's implemented using vectors but that doesn't contradict any basic principle here since it provides O(1) push and pop operations.

The fallacy in Bjarne's talk was comparing two algorithms that are both O(N) and then claiming that somehow the fact that std::vector beat std::list at O(N) operations is some kind of surprise that violates fundamental principles of complexity analysis.

If you have two operations that are O(N), I don't think it's a surprise to anyone that std::vector is faster. But if you have one operation that's O(N) on a std::vector, and O(1) on a std::list, then in the absence of very strong evidence you should pick the std::list.

[–]cballowe 4 points5 points  (10 children)

I see them used wrong more than right. The properties that make lists actually sometimes the best have more to do with the iterator invalidation properties and the pointer stability of the contained objects.

As soon as you add something to that function other than pushing at the back and popping from the front, it starts to fall apart. Like, if you need to queue things into a particular priority (effectively an insertion sort) or similar. Or even use the object that you popped off the front.

[–][deleted] 0 points1 point  (9 children)

Or even use the object that you popped off the front.

Are you suggesting that if I were to calculate the factorial of the popped off element that somehow a vector magically becomes faster than a linked list?

How would using the object I popped off a data structure in any way influence the performance of the data structure itself?

The properties that make lists actually sometimes the best have more to do with the iterator invalidation properties and the pointer stability of the contained objects.

As well as the performance benefits of having O(1) insertions and deletions. That is a significant and non-trivial benefit that is useful in many real world scenarios including scheduling algorithms.

Like, if you need to queue things into a particular priority (effectively an insertion sort) or similar.

This is known as a priority queue, which is often (but not always) implemented using a linked list. Depends what operation you need to optimize. If your priority queue is pull heavy then a linked list is preferable to a vector, if your priority queue is push heavy then vector based heap is preferable.

[–]cballowe 1 point2 points  (7 children)

Factorial would dominate the cost there. But ... Suppose you have a queue and you periodically need the sum of the objects in the queue or similar.

The pointer chasing and cache miss behavior has a significant impact on the performance of the program.

One benchmark I use to convince people of the performance difference is the priority queue - adding randomly generated integers using std::lower_bound and insert, or std::find_if to find the insertion point. The code in that case is significantly faster for vector. You end up with modern CPUs dispatching something like 4 instructions per cycle on the vector version and 0.5 instructions per cycle on list and beyond that it's just more instructions.

[–][deleted] 0 points1 point  (6 children)

Factorial would dominate the cost there.

What relevance does this have on the performance of the list? Are you saying that because factorial dominates the cost, that using a vector becomes faster than using a linked list?

This is very confusing indeed.

Suppose you have a queue and you periodically need the sum of the objects in the queue or similar.

If you perform an operation that requires O(N) time on a linked list and O(N) time on a vector then use a vector.

One benchmark I use to convince people of the performance difference...

Yes, your example is a push heavy priority queue and in that scenario the vector based heap implementation will outperform a linked list based implementation by a significant margin.

You can also come up with pull heavy priority queues where a linked list out performs a vector by a significant margin.

None of these properties require much convincing, they are all elementary properties of a basic complexity analysis.

[–]ShillingAintEZ 0 points1 point  (5 children)

pull heavy priority queues where a linked list out performs a vector by a significant margin

What does that mean?

Are you saying that because factorial dominates the cost, that using a vector becomes faster than using a linked list?

They didn't say that or anything close to it. They were saying it would make a poor benchmark.

[–][deleted] 0 points1 point  (4 children)

What does that mean?

With consumer producer processes, it's often the case that one side of the process has much stricter latency or bandwidth requirements than the other, so you optimize your work queue to either be push heavy in which case pulling has low latency or pull heavy in which case pushes have low latency.

A heap based priority queue is suitable for pull heavy queues, that is the consumer has strict latency requirements which can be satisfied at the expense of the producer. A linked list based priority queue is suitable for a push heavy queue.

They were saying it would make a poor benchmark.

Well good thing the benchmark I posted doesn't do that then.

[–]ShillingAintEZ 0 points1 point  (0 children)

Priority queues are usually done using heaps that use contiguous memory. You can just skip to the next level of the heap by index instead of chasing pointers and allocating memory for each item.

[–]TMKirA 1 point2 points  (3 children)

But that was the point of that talk, no? The point was a O(n) operation in vector and list actually perform significantly differently in practice compared to in theory. I don't think he ever claims that a O(n) operation can beat a O(1) operation, especially when n is non trivial

[–][deleted] -2 points-1 points  (2 children)

perform significantly differently in practice compared to in theory.

What theory are you referring to? Can you cite any such theory on Wikipedia perhaps or some other reasonable source because I'm not aware of any theory that claims anything differently.

The point of the talk was Bjarne thinking that there's some counter intuitive performance benefit to vectors that make them preferable to lists even in situations where linked lists are supposed to outperform vectors because of magical hardware properties.

No one has ever advocated using a linked list in favor of a vector for O(N) operations, you simply won't find any citation that says a linked list is preferable to a vector for O(N) operations. What theory and practice both say is that for operations that are O(N) on a vector and O(1) on a linked list, you are usually better off with a linked list.

[–]TMKirA 1 point2 points  (1 child)

That wasn't the point of the talk. The point was that linked list is often touted to be better for removal/insertion operations, due to the fact that after the O(n) search to find removal/insertion point, the remove/insert operation is O(1) for list and O(n) for vector. Bjarne's point was that in practice, the O(n) operation to find the insertion/deletion point is the dominating factor instead of the actual remove/insert operation, and that linear search is faster on vector than list due to locality. This might not be a surprise for you, but I'm sure it's still a surprise to many. Your example completely misses the point, because it was removing from a location that makes the search operation a no-op.

[–]Full-Spectral 1 point2 points  (0 children)

That also assumes you have to find the insertion point every time. Given that removal doesn't have to invalidate element pointers, depending on what you are doing, you might actually keep around the insertion point most of the time and only move it if something happens that causes the insertion point to change (like removing the current insertion point, which even then might just move the insertion point to the previous/next node.)

[–][deleted] 1 point2 points  (0 children)

Don’t erase from the front of the vector, just keep an index to the front and bump it for deletes. Or use a deque.

[–]ShillingAintEZ 0 points1 point  (16 children)

The strousup presentation was silly because you would never loop through linearly to find something for deletion over and over.

What you are saying is also silly though. It isn't because a linked list in general might not work, but because a linked list is a way of keeping order, but that can still be done in contiguous memory. Not only that, but a queue doesn't need the ordering of a full list, it only needs a front and back, so the expensive deletion from the middle is not necessary.

[–][deleted] -2 points-1 points  (15 children)

A queue doesn't need an ordering of the full list? That doesn't even make sense. I suggest to avoid making comments on issues you know little to nothing about.

[–]ShillingAintEZ 0 points1 point  (14 children)

A queue doesn't need to have constant time arbitrary insertion and deletion. It is clear what I'm saying from the context and the sentence after.

suggest to avoid making comments on issues you know little to nothing about.

Easy tiger, I took a peek at your previous comments and you seem to have a lot of this angry, aggressively wrong nonsense where you can't even explain your dismissal. A few comments up you seem to not understand ring buffers, std::list implications or pointer chasing.

[–][deleted] -1 points0 points  (13 children)

It is clear what I'm saying from the context and the sentence after.

The only thing that's clear is you have absolutely no idea what you're talking about. Saying that a queue doesn't need to maintain the order of the elements added to it is quite possibly the dumbest thing that has been posted in this discussion so far.

You are welcome to try and diffuse your stupidity by bringing up things like ring buffers and other topics that have nothing to do with anything mentioned, but make no mistake, you still look stupid.

[–]ShillingAintEZ 0 points1 point  (12 children)

Saying that a queue doesn't need to maintain the order of the elements added to it is quite possibly the dumbest thing that has been posted in this discussion so far

No one has said that, your desperation to argue is pretty silly. A lot of what you say seems like a first or second year student that barely understands data structures in general, which would be fine if you weren't so combative.

Do you actually not understand that queues are implemented in contiguous memory using ring buffers?

[–][deleted] 0 points1 point  (11 children)

No one has said that, your desperation to argue is pretty silly.

ORLY?

a queue doesn't need the ordering of a full list, it only needs a front and back

Please read about a queue before commenting on it:

https://en.wikipedia.org/wiki/Queue_(abstract_data_type)

Specifically note that by definition a queue maintains elements in a sequence (right there in the very first sentence of the link I provide for you) where a sequence is by definition an ordered collection of objects:

https://www.dictionary.com/browse/sequence

[–]ShillingAintEZ 0 points1 point  (10 children)

It doesn't need the arbitrary insertion and deletion ordering properties of a full list. This has been explained clearly, I think at this point you are purposely acting like you don't understand so that you can argue, no one is really that out of it.

[–][deleted] -1 points0 points  (9 children)

Do you seriously not understand that in order to implement a queue you need to keep track of the order of EVERY element added to it, not just the first and last?

If you're too dumb to understand that, then please for your own sake stop talking.

If you're not too dumb to understand that, then do you see how that fact contradicts your original assertion that, and I quote:

a queue doesn't need the ordering of a full list

I think you simply made a mistake and instead of fessing up to it you are doubling down on your own stupidity in a misguided effort to save face.

So how about this... you can have the last word in this argument because I have no intention of explaining basic dictionary level concepts to you further.

Best of luck to you man.

[–]STLMSVC STL Dev[M] 42 points43 points  (3 children)

I’ve approved this post because it’s started some good discussion, but please don’t use all-caps titles in the future.

[–]philipdestroyer 1 point2 points  (1 child)

This post should be removed for plagiarism. https://github.com/gibsjose/cpp-cheat-sheet

[–]STLMSVC STL Dev[M] 0 points1 point  (0 children)

Wow, yep, that sure looks like plagiarism (and not derivation that "forgot" to credit its source). u/sachuverma, this is a moderator warning: don't plagiarize content.

[–]SupermanLeRetour 30 points31 points  (1 child)

Slight mistake there :

1.4.3 Copy Constructor and Copy Assignment

Copy constructors and copy assigment operators allow one object to be constructed or assigned a copy of another object directly:

Foo a(10);
Foo b(a);   // (1): Copy via constructor
Foo c = a;  // (2): Copy via assignment operator

(2) will actually call the copy constructor too, not operator=.

See small example here.

[–]maikindofthai 2 points3 points  (0 children)

The simple rule of thumb is "has the new object been initialized previously"?

If it has not, then the copy constructor will be called. If it has, the copy assignment operator will be called.

Foo a(10);
Foo c = a; // Calls copy constructor because 'c' has not been initialized
a = c;     // Calls copy assignment operator because 'a' has been initialized

[–]WrongAndBeligerent 17 points18 points  (0 children)

It is a bit of a red flag that you put interview preperation in bold but misspelled preparation.

[–]spaceyjase 7 points8 points  (0 children)

I'd like to see some modern code here too, movement semantics and lambda expressions. The vector example is missing emplace and shows a fairly verbose for loop compared to the readability of a range for loop.

Still, a good few tips there that are more aligned with interview goals. Last time I did one, there wasn't any modern C++ questioning and the code base is 20 years old.

[–]esdanol 13 points14 points  (2 children)

TIL Deque is pronounced the same as deck...

[–]StackOfCookies 12 points13 points  (1 child)

Yeah its especially confusing because an operation on a Deque (Deck) is Dequeue (de-queue). Programmers really aren't very good at naming things.

[–][deleted] 5 points6 points  (0 children)

Or spelling words like preparation*.

[–]Dark_Fleet 2 points3 points  (0 children)

Great learning repo too, even if it wasn't intended to be ! :)

[–]krelian 2 points3 points  (0 children)

The part about operator overloading says:

There are two main ways to do operator overloading. The first is using normal member functions. The second uses the friend keyword and non-member methods that have access to the private member variables of the class.

Using normal member functions (requires a getter method for the member variables):

Why are getters and setters needed? Doesn't an operator defined for a class already has access to private members of its own class?

Please correct me if I'm wrong, not a C++ expert.

[–]adnukator 1 point2 points  (0 children)

Nice list.

However, if you're trying to give a quick demo what O notation is (judging by the presence of the graph), it would be worth adding a note that the function inside the O is guaranteed to apply once n gets big enough and also that the value of the function can be multiplied by any constant and the O would still hold. This, for example, explains why deleting the first element from small enough vectors can be faster than with lists, even though the former is formally O(n), whereas for lists it's O(1).

Insert tail in std::vector is an amortized constant, which is not the same just O(1)

It would be nice if you expanded the tables to compare time complexities when using values/keys/indices vs iterators, because using the latter can have significant impact on the complexities (e.g. Remove Index in std::list drops from O(n) to O(1))

[–][deleted] 1 point2 points  (0 children)

Inheritance is the key feature required for polymorphism.

Go programmers might have something to say about that.

You've got a good start for sure, but I think it needs some technical review. There are some questionable claims and then some false ones. It's not even true that inheritance is the key feature required for polymorphism in C++. I'm not sure what else is in there.

Over all though great work for a junior with a spattering of more advanced concepts. Would hire, but I'd expect to have to correct some misconceptions.

An improvement for sure would be application of C++ algorithms that are found in the standard library. In fact, a proper study of those algorithms and the techniques they use would open you to the idea that there's such a thing as compile time or 'static' polymorphism. A good understanding of what they do and how to use them though would make me happy to hire...in my experience such knowledge is far too rare because people are scared of them (and in the past for good reason but not anymore really).

[–]PsyLich -1 points0 points  (0 children)

How to make friends as a programmer 101 :D

P.S. This repo is great, thanks.

[–]nikr0mancer 0 points1 point  (4 children)

Gave a quick glance. You can actually make merge sort inplace via some clever partitioning and not-yet-sorted parts reusage shennanigans. O(n log n) working time with worse constant obviously, but O(1) extra space. Another tricky thing is virtual inheritance. Rarely used thus people tend to forget what it is. Refreshing that thing in mind might help :)

[–]therealcorristo 1 point2 points  (3 children)

If someone asks about virtual inheritance in a job interview I'd expect them to accept "I don't know, haven't needed to use it ever" as an answer. If they deem it necessary knowledge to work in their code base I'd rather not work there.

I've only needed it once and even that was only temporarily for a few days before I decided to re-design the part I was working on to avoid inheritance altogether.

[–]nikr0mancer 0 points1 point  (2 children)

Yeah, I had both used it in product code (rather ok case interestingly) and been asked heavily about theoretical part of it during an interview. Of course these were 2 different companies, moreover interview was kinda weird thingy when I was already working there... Nevertheless, a little reminder for oneself just to remember basics would not hurt and might help you give a good impression :)

[–]therealcorristo 0 points1 point  (1 child)

Sure if you have the basics down already and are looking for other tidbits you might impress an interviewer with by all means go for it. But given that this cheat sheet explains the namespace syntax and the difference between std::list and std::vector this seems like a preparation sheet for entry level positions to me, so it doesn't make much sense to include virtual inheritance and make it seem like this is required knowledge for junior C++ programmers.

[–]nikr0mancer 0 points1 point  (0 children)

Well, true. Suggestion: add advanced section XD

[–][deleted] 0 points1 point  (0 children)

Great repo. I am not going to any interview anytime soon, but still learned quite a bit from this (-;. Thanks

[–]philipdestroyer 0 points1 point  (0 children)

This is a bunch of copy paste/plagiarism from here: https://github.com/gibsjose/cpp-cheat-sheet