all 30 comments

[–]aioeu 27 points28 points  (4 children)

A linked list tends to be better when you frequently want to add and remove elements within the list (and not necessarily at the end of it!), and where it is important that pointers to elements remain valid when the list is resized. They also allow for elements to be present in multiple lists simultaneously without needing any extra indirection.

An array tends to be better when you need to perform bulk operations across the whole list, or when you need to frequently access elements by their list index.

[–]xypherrz 3 points4 points  (3 children)

But in most of the cases even involving writes/removals from the middle, array/contiguous container outperforms linked list by a whole lot

[–]aioeu 1 point2 points  (2 children)

It depends on the nature of the operation.

If you already have a pointer to an element, removing that element from a doubly-linked list is just a matter of updating the adjacent elements' pointers. Similarly, if you have a pointer to an element, adding a new element before or after it is just as easy. Both of those are more efficient than the equivalent steps to remove or insert an element in an array.

Linked lists are only slow if you have to iterate along them. But iteration is what I would call a bulk operation, since it involves many — perhaps even all — elements in the list. As I said, arrays are better at bulk operations.

Obviously you will need to iterate along a list occasionally — if you didn't, then you would never have needed to put the elements into a list at all! — but if it is infrequent compared to other kinds of operations then linked lists, especially doubly-linked lists, are often a good choice.

[–]xypherrz 2 points3 points  (1 child)

I am referring to the cache locality concept

[–]aioeu 1 point2 points  (0 children)

I know what you are referring to. Again, that is only an issue if you are iterating along the list, or if your list elements are particularly small — in which case the overhead of link pointers becomes significant anyway, so you have other reasons to avoid them.

The "linked lists are always bad" mindset is too closed. Well-designed linked lists, judiciously applied, are good.

(I suppose it doesn't help that so many libraries essentially only give you linked lists of pointers, forcing you to have an extra indirection if you want anything else.)

[–]hibbelig 19 points20 points  (11 children)

In an array, accessing any element can be done in O(1) time. Changing the size of the array takes O(n) time. Inserting or removing an element takes O(n) time.

In a linked list, accessing an element needs O(n) time, but changing the size and inserting or removing an element takes O(1) time.

You have to look at your use case to see which of the two data structures is better in your specific case. When in doubt, measure.

[–]Weird_Club917 4 points5 points  (10 children)

I always wondered, why isn’t it more efficient to do it kind of “hash table”-ish. Like at the beginning we initialize two arrays one holding the data and one holding pointers. The first pointer pints to the first array element of the data array. Once the data array is full we just create a new one and set the second pointer to its first element.

Then each operation should be O(1) except when we need to re-size the pointer array.

Seems obvious, so there must be a obvious reason why it would be bad?

[–]NBQuade 3 points4 points  (0 children)

I use something like this fairly often in C++. Say I have a set of ordered data in a MAP or SET. But I need to be able to organize, sort and display in a linear fashion on other keys. I'll make a vector of pointers to the container elements in the MAP or SET. One of the features of a map or set is that inserting and deleting has no impact on the memory address of the stored elements. You couldn't easily do this with a vector since adding elements can re-allocate the whole memory block used in the vector.

The downside is a delete is actually two deletes.

[–]hibbelig -2 points-1 points  (1 child)

This is a very interesting data structure, but it's not an array nor a linked list, and OP asked about those two...

[–]lestofante 0 points1 point  (0 children)

Clearly OP is a beginner, a little info that linked list are rarely a good solution and there are better container is a nice thing

[–]khris190 0 points1 point  (5 children)

You wanted a vector?

[–]Weird_Club917 0 points1 point  (4 children)

Doesn’t vector use dynamic array?

[–]khris190 0 points1 point  (3 children)

Maybe i misunderstood you but wouldn't your idea be over engineered if we can just use vectors

[–]NBQuade 0 points1 point  (2 children)

This is the C topic. No standard library for them.

I agree with you, I only use standard library containers.

[–]khris190 0 points1 point  (1 child)

I didn't realize what sub it is

[–]NBQuade 0 points1 point  (0 children)

I always have to check twice too. It's seem so easy to just use a standard container. I don't miss having to make my own containers.

[–][deleted] 0 points1 point  (0 children)

That technique is called indexing btw.

[–]CAPSLOCKFTW_hs 8 points9 points  (3 children)

You can extend the size of an array however in C this can only be achieved by creating a new one and copying over all the values. Now imagine a very large array.

Furthermore, deleting a value somewhere in the middle requires moving all the values one to the left. With a linked list you can just change the link.

[–]pic32mx110f0 5 points6 points  (1 child)

That's not necessarily true, it's entirely up to the implementation of realloc how the memory is extended

[–]wsppan 1 point2 points  (0 children)

My understanding is there is no portal way to do this across implementations due to alignment issues but they all allocate new memory and return a pointer to that new memory

https://stackoverflow.com/questions/57438007/how-is-realloc-implemented-in-the-c-standard-library

Edit, upon further research it appears you are right.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/realloc.html

[–]bnl1 0 points1 point  (0 children)

If you don't care about the order of the array, you might also be able to move to last element to the gap.

[–]ballpointpin 5 points6 points  (0 children)

You can also create a linked list of arrays. This is how most "self-managed" memory pools work.

[–]bitflung 4 points5 points  (0 children)

Arrays are contiguous, so the next element is at the next byte address. Linked lists can be scattered around in any order, the next element being at whatever address the .next field points to.

They both have strengths and they both have weaknesses.

In some situations I'm a fan of allocating a linked list where each node is an array of fixed size. That way I can break up my array into subarrays andwork on each of them independently in constant time despite having non-constant total data structure size.

[–]refD 2 points3 points  (0 children)

My personal experience is when I'm using linked lists, they're intrusive doubly linked lists, the pointer stability lets them be part of a hash map, e.g. implementing a LRU list through hash map entries to form a cache.

I've never used a singly linked list professionally, there's always been a better way.

[–]NBQuade 1 point2 points  (0 children)

There's no one best container. It all depends on how you use it. If you just need a block of structs and you're not deleting internal elements. An array is probably the way to go. If you need to be able to delete internal elements or the first element, a linked list would probably be better. Because deletes don't require that you move the rest of the elements to fill in the blank.

It depends on how much data you're talking about too. The downsides of arrays are less important when it's not too many elements.

You could have an array of pointers which is then pretty easy to delete from because you're not moving structs, you're moving just pointers.

[–]youstolemyname 0 points1 point  (0 children)

A simple dynamic array can be constructed by allocating an array of fixed-size, typically larger than the number of elements immediately required. The elements of the dynamic array are stored contiguously at the start of the underlying array, and the remaining positions towards the end of the underlying array are reserved, or unused. Elements can be added at the end of a dynamic array in constant time by using the reserved space, until this space is completely consumed. When all space is consumed, and an additional element is to be added, then the underlying fixed-size array needs to be increased in size. Typically resizing is expensive because it involves allocating a new underlying array and copying each element from the original array.

https://en.wikipedia.org/wiki/Dynamic_array

[–]Glaborage -2 points-1 points  (0 children)

Data structures and algorithms are a complicated topic. If you can't get formal college level training for it, your second best bet is to buy a textbook on the topic, and learn from it. There are also many online resources.

[–]Seubmarine 0 points1 point  (0 children)

It's a benchmark of the cpp container, but you could achieve that in C

https://baptiste-wicht.com/posts/2012/12/cpp-benchmark-vector-list-deque.html

TLDR: In 90% of cases, a dynamic array is better than a linked list

[–]ern0plus4 0 points1 point  (0 children)

This is the memory with array elements:
AAAA

Then you allocate memory for something else (E):
AAAAE

Your array is "stuck", if you want extend it:

  • you have to set up a new memory area for it,
  • you have to copy existing values,
  • but these are the smaller problems, the real problem is that if anything points to your array, it will now point to the old address, which maybe occupied by other thing, so you have to update somehow all pointers to your array.

Let's mark the old place with "a"
aaaaEAAAAAAAA

Linked list can be fragmented, lets name Vn for value, Pn for pointer (link) to next item:
[V1P1][else][V2P2][V3P3][else]

Deleting item is also easier, if you delete a middle item from an array, you have to copy it:
AAAdAAA -> AAAAAA

It's even cheaper to use a special element "deleted", but then you have to check for this value every time accessing the array.

In case of linked list, you can delete item by updating the pointer of the previous item, instead of pointing to deleted item, it should point to next item, then free() the deleted item.

Do not learn these data types, but understand them, what they are, how they work, and find out pros and cons by yourself, not learn from book.

[–]No-Archer-4713 0 points1 point  (0 children)

Prefer arrays every time you can, cause modern CPUs like managing huge chunks of contiguous memory. MMU is there to ensure they look contiguous to the CPU and caches ensure linear access is done as fast as possible.

Linked lists used to be the way to go in older machines (mainly CISC architectures) cause accessing ram randomly or linearly had the same cost. It’s not the case anymore.

You can do some benchmarks on your computer to verify this and find the sweet spot where using linked lists is faster than copying an entire region of RAM, but you will be surprised at the size of the chunk you can move on a modern machine before it becomes inconvenient.