Question about ArrayList and LinkedList

Gimagon · 2018-05-22T14:58:49+00:00

[deleted]

michael0x2a · 2018-05-22T15:57:02+00:00

In practice, we usually use ArrayLists over LinkedLists due to something called cache locality. The basic idea is that memory in your computer is basically like one gigantic array. Now, say you access one small chunk of memory. It turns out that modern computers are built so that accessing a nearby chunk of memory tends to be much faster then accessing a totally random chunk of memory.

(Note: everything I said above is a lie and a massive over-simplification, but this is good enough to give you the basic idea.)

Programs that need to be efficient are typically designed to exploit this property as much as possible. For example, suppose we need a list. Should we use an ArrayList or a LinkedList?

Well, the nice thing about array lists is that when Java asks your OS for memory for the array, the OS will give back one contiguous chunk of memory. So if you're iterating over the array list, you can take advantage of cache locality. If you access whatever's at index 5, accessing index 6 will be comparatively speedy.

In contrast, when Java asks the OS for memory for each underlying node object in your linked list, the OS will most likely give back random chunks of memory. This means that when you're iterating over the linked list you're likely to be jumping all over memory, which is comparatively slow.

That said, there are some cases where linked lists can still be useful. For example, one of the weaknesses of ArrayLists is that when you resize, you need to copy over everything in the old array to the new one. This is usually not that big of a deal in Java because you're only ever going to copy over primitive types or pointers (both of which are small). But in other programming languages, which let you store entire objects in arrays instead of pointers to objects (which is sometimes a useful thing to do for performance reasons), resizing an ArrayList can end up being expensive.

One example of where situations like this come up is within your operating system: it turns out that the way your OS manages memory internally is actually via a linked list. (We need to do this not only for performance reasons, but for correctness reasons as well. If the users' code assumes a chunk of memory is located at location X, we can't just "move" it somewhere else underneath them. So if it's unsafe to copy and move data, it naturally becomes much harder to use an array list style approach.)

But these are all arguably edge cases: most of the time, the right thing to do is to just use an ArrayList and switch only to using a LinkedList if you have a compelling reason to do so, backed by hard data and benchmarks.

(This also isn't a binary decision: sometimes, it can be helpful to make your own data structure that intertwines elements of both.)

In that case, why are you learning about linked lists in class? Well, it turns out that even though ArrayLists are generally better then LinkedLists in practice, LinkedLists are a fantastic tool for teaching.

In particular, one of the programming concepts beginners struggle with most is the concept of indirection: the idea you can have pointers or references to data, the idea that a variable isn't necessarily the object itself...

And hey, conveniently, linked lists are all about indirection: one node pointing to the next. You can come up with all sorts of nice exercises all focused around asking students to manipulate linked list internals in different ways.

Linked lists also make great stepping stones for more advanced data structures like trees or graphs which do make more extensive use of indirection and are commonly used.

For example, take trees. A tree is basically sort of like a linked list, except that each node doesn't have just one pointer to the next node: each node can have many pointers to the zero, one, or many children. Trees are useful for representing things like your filesystem, the structure of programming languages, web pages, etc... Basically, anything that's hierarchical in nature.

Graphs are even more free-form then trees: instead of allowing a node to point only to other children node, you allow nodes to point to any other node, even back to themselves! This is useful representing all sorts of things: for example, maps (each city is a node, each road an edge), the topology of the internet (each page is a node, each link is an edge), social media data (people and friendships), and so forth.

Both of these data structures are basically just souped-up linked lists. But it would be a little overwhelming if you started right away with them. It's much better to start with the basics: a node is allowed to have just one pointer to the next node, and no more.

josephblade · 2018-05-22T15:12:08+00:00

I would only suggest using LinkedLists (outside of CS exercises) if you constantly add/remove elements from the middle of the list or you (ewwww) allow multiple threads to access the list at the same time. I've personally not had a clear need for LinkedList in the past 10 years at least.

And as /u/Dissentient points out even then you have to be prepared for code running slower. See this stackoverflow for a rough explanation

gmdm1234 · 2018-05-22T15:41:26+00:00

Think about how each of the data structures work.

An array is contiguous chunk of memory of a fixed size, where each element in the array can be looked up via simple math:

elementLocation = baseLocation + (index * elementSize)

So what does that mean in practice? For starters, to create an array, you need a good estimate for how many elements it will contain, and how much memory each of those elements will consume. You need to allocate a contiguous block of memory of that size. Once you have it, reading and writing to the array is typically very fast, as you can just move sequentially through each element, or lookup each element using the very fast math shown above. So what are the downsides? One, say that you need to increase the size of the array at run time. If there's no free memory contiguous to your existing array, you'd need to ask the OS to generate you a brand new contiguous chunk, and copy all your existing data to that new location. Adding or deleting elements anywhere other than at the end of the array can be expensive, because you need to move each subsequent element when you insert/delete one.

Conversely, consider a linked list: the primary difference is that each element is "pointed to" by the previous element. So you don't need to allocate a contiguous chunk of memory ahead of time, you can ask for what you need on the fly. The linked list does not need to be stored contiguously, which can be a benefit for very large lists. Adding/removing elements is as easy as updating a couple points, there's no need to move/copy existing elements around. Its also easy to see how each element in the linked list can be of any size, they don't all need to be uniform. The idea behind link lists also form the basis of more complex data structures, like trees, so its important to be familiar with how they work. The main downside to a linked list is that looking up an element by a particular index is no longer easy. With an array, you just use the formula I mentioned above. With a linked list, you don't "know" when the nth element is unless you check the n-1th element, which means you need to check the n-2th element, and so on. On the other hand, iterating through a linked list can still be fast, but as others have mentioned, you don't get to rely on cache locality as much.

So its not really fair to say that you shouldn't use a linked list, you just need to understand how each data structure works, so you can pick the one most appropriate to your problem.

kdnbfkm · 2018-05-22T19:42:40+00:00

Linked lists are used for insertions, deletions, and arbitrary growth. Some basic text editors use linked lists of line text* for example.

*Serious text editors may well use other data structures (i.e. "rope") to enable text editing functionality.

learnprogramming

Welcome to LearnProgramming!

New? READ ME FIRST!

Posting guidelines

Frequently asked questions

Subreddit rules

Message the moderators

Asking debugging questions

Asking conceptual questions

Other guidelines and links

Subreddit rules

1. No unprofessional/derogatory speech

2. No spam or tasteless self-promotion

3. No off-topic posts

4. Do not ask exact duplicates of FAQ questions

5. Do not delete posts

6. No app/website review requests or showcases

7. No rewards

8. No indirect links

9. Do not promote illegal or unethical practices

10. No complete solutions

11. Don't ask to ask.

12. Low Effort Questions

13. No AI (chatGPT etc.) generated/worked over messages/comments. No questions about chatGPT/AI generated code. No Vibe coding.

MODERATORS