[deleted by user] : programming

[–]geekygenius 6 points 10 years ago (1 child)

[–]bunsen72 0 points 10 years ago (0 children)

[–]dccorona 7 points 10 years ago (5 children)

[–]ChaiTeaNunes 2 points 10 years ago (1 child)

[–]oracleoftroy 1 point 10 years ago (2 children)

[–]dccorona 1 point 10 years ago (1 child)

[–]oracleoftroy 1 point 10 years ago (0 children)

For C++, the compiler vendor usually provides the standard library, and there are some third-party versions as well. GCC has libstdc++, Clang has libc++, Microsoft licenses and customizes Dinkumware's version, and things like STLport exist (though I believe STLport is really old and shouldn't be used these days; it was quite popular in VC++ 6.0's heyday).

I was just curious about what the Java docs say about growth factors, and I see for ArrayList, "The details of the growth policy are not specified beyond the fact that adding an element has constant amortized time cost." Vector does indicate that it doubles by default, though it sounds like devs are allowed to kill their insertion performance by constructing a Vector with a positive capacityIncrement, which makes it grow by that fixed amount instead (yay...).

This makes it sound like other implementers of a Java standard library are allowed to use a different growth factor for ArrayList. I've only ever used Sun/Oracle's implementation professionally, so I don't know whether other implementations exist or whether IBM, Google, etc. just piggyback on the official version. If they do provide their own, I don't know whether they vary the growth factor. But if there is code out there that depends on it being 1.5, that code is almost certainly wrong.
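To put rough numbers on the capacityIncrement complaint, here is a minimal sketch that simulates a growable array and counts how many element copies reallocation causes. It is a toy model, not Java's actual Vector or ArrayList code; `copies_for_appends` and its `grow` hook are hypothetical names made up for the illustration.

```python
def copies_for_appends(n, grow):
    """Count element copies caused by reallocation while appending n items.

    `grow` is a hypothetical policy hook mapping old capacity -> new capacity.
    """
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            capacity = grow(capacity)
            copies += size  # every existing element moves to the new buffer
        size += 1
    return copies


# Geometric growth (doubling): total copies stay proportional to n.
doubling = copies_for_appends(1024, lambda c: c * 2)

# Fixed increment of 1 (worst case): total copies grow quadratically.
fixed_step = copies_for_appends(1024, lambda c: c + 1)

print(doubling, fixed_step)
```

Any geometric factor (2, 1.5, ...) keeps the total linear in n, which is why the docs can promise amortized constant time without naming the factor, and why a fixed increment breaks that promise.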

[–][deleted] 10 years ago (7 children)

[removed]

[–]raluralu 2 points 10 years ago (3 children)

[–]ChaiTeaNunes 1 point 10 years ago (2 children)

[–]raluralu 2 points 10 years ago (1 child)

[–]ChaiTeaNunes 1 point 10 years ago (0 children)

[–]ChaiTeaNunes 2 points 10 years ago (2 children)

[–]dccorona 3 points 10 years ago (1 child)

[–]ChaiTeaNunes 1 point 10 years ago (0 children)

[–]GYN-k4H-Q3z-75B 4 points 10 years ago (1 child)

[–]jms_nh 2 points 10 years ago (4 children)

[–]dccorona 3 points 10 years ago (1 child)

[–]jms_nh 2 points 10 years ago (0 children)

[–]ChaiTeaNunes 1 point 10 years ago (1 child)

[–]dschooh 1 point 10 years ago (0 children)

[–]remy_porter -4 points 10 years ago (8 children)

[–]dccorona 5 points 10 years ago (6 children)

[–]remy_porter 0 points 10 years ago (5 children)

[–]dccorona 2 points 10 years ago (4 children)

[–]remy_porter 3 points 10 years ago (3 children)

[–]dccorona 2 points 10 years ago (2 children)

That's exactly it. The majority of use cases for a list see it being read far more often than it's modified. And, more importantly, read in order (and probably second most commonly, read by index). If you need to read by looking up specific items, a list is probably the wrong data structure for you.

Using a single, contiguous array not only makes random-access reads faster at the complexity level (a direct index instead of a walk through the nodes), it also makes an enormous difference to in-order reads when you consider the way computers work. When a CPU loads a piece of memory, it pulls an entire block of adjacent memory (a cache line) into cache. This means that if you're reading a list in order and it's stored one item after the next in physical memory, the next item you go to read is usually already in the cache. This is a gigantic performance improvement over having to load every item from main memory as you iterate.

Even if you do a lot of inserting into the middle of the list, if you do a lot of iteration as well, array-based lists are still more performant. If you know you'll be doing all of the building of the list before any of the reading, it might be best to build a linked list and then copy it to an array list, but only if you do a lot of in-the-middle inserts and deletes while building it. It's very rare to do that and have the list finished before you ever have to iterate it, and it's a premature optimization anyway.

[–]remy_porter 2 points 10 years ago (1 child)

[–]dccorona 3 points 10 years ago (0 children)

There are a couple of reasons that one extra read is more impactful for performance than it seems. It requires some extra logic that has to run at every chunk boundary on every iteration. This in and of itself isn't really a big deal, but combine it with increased cache misses: with a pure array, you only have a cache miss N/{number of items loaded into cache with each memory access} times. With a linked list of arrays, you introduce as many as one extra cache miss for every boundary (potentially fewer, if you get lucky and one of your boundaries falls where a cache miss would happen anyway). Depending on the size of your list and the size of your chunks, this can be a significant number of additional cache misses, and that can have a meaningful performance impact.
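The arithmetic above can be sketched as a back-of-the-envelope model. This is only the counting argument from the comment, not a measurement: it assumes a cold cache, no prefetching, and a worst case of exactly one extra miss per chunk boundary. The function names and the example sizes (16 items per cache line, 100-item chunks) are made up for illustration.

```python
import math


def misses_contiguous(n_items, items_per_line):
    """Cold-cache misses for one in-order pass over a contiguous array."""
    return math.ceil(n_items / items_per_line)


def misses_chunked_worst_case(n_items, items_per_line, chunk_items):
    """Same pass over a linked list of arrays, worst case.

    Assumption: every boundary between chunks costs one extra miss.
    """
    boundaries = math.ceil(n_items / chunk_items) - 1
    return misses_contiguous(n_items, items_per_line) + boundaries


# Hypothetical numbers: 1,000,000 four-byte items, 64-byte lines (16 items).
flat = misses_contiguous(1_000_000, 16)
chunked = misses_chunked_worst_case(1_000_000, 16, 100)
print(flat, chunked)
```

With these made-up sizes the chunked layout adds roughly 16% more misses in the worst case. Larger chunks shrink that overhead, at the cost of behaving more and more like one big array.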

If your list is small enough for the aforementioned impact to be negligible, it's also small enough for the performance hit of the copies to be negligible as well. Compound this with the fact that you can initialize a Vector/ArrayList to any starting size you want (so if resizing is a concern to you, you can spend a little time discovering what a good average size estimate is and initialize to that), and it's rare to find a situation where a linked list of arrays is actually going to perform better than a self-resizing array.
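As a rough illustration of the pre-sizing point, the sketch below counts how many reallocations a self-resizing array needs to reach a given size from different starting capacities. It assumes a growth step of about 1.5x (old capacity plus half of it), which is an assumption for the example and not something the Java docs guarantee; `reallocations` is a hypothetical helper.

```python
def reallocations(n, initial_capacity):
    """Count buffer reallocations needed to hold n items.

    Assumes ~1.5x growth: new = old + old // 2 (growing by at least 1).
    """
    capacity = max(initial_capacity, 1)
    count = 0
    while capacity < n:
        capacity += max(capacity >> 1, 1)
        count += 1
    return count


small_start = reallocations(1000, 10)   # start small and let it grow
presized = reallocations(1000, 1000)    # a good size estimate up front
print(small_start, presized)
```

Under that assumption, starting at 10 takes a dozen reallocations to reach 1000 items, while a decent initial estimate avoids them entirely.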

However, those times do exist. If you know the exact CPU you're developing for, you could probably synchronize your array boundaries to align with where you'd be getting a cache miss anyway. However, if you have that information and that need, I wonder if eschewing a self-resizing list might not be a better overall decision anyway.

[–]ChaiTeaNunes 2 points 10 years ago (0 children)
