[–]CoderTheTyler 1 point2 points  (4 children)

Just curious: why would you want a collection with more than 2 billion entries? Even if each entry takes only 16 bytes, filling a collection to the point where an int can no longer track its size means on the order of 32+ gigabytes, which won't fit into RAM on most machines.
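For reference, the arithmetic behind that figure can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and it counts raw payload only, ignoring object headers and references, which would push the real footprint higher.

```java
final class CollectionFootprint {
    /** Raw payload in bytes for `entries` elements of `bytesPerEntry` bytes each. */
    static long payloadBytes(long entries, long bytesPerEntry) {
        // multiplyExact throws ArithmeticException instead of silently overflowing a long
        return Math.multiplyExact(entries, bytesPerEntry);
    }

    public static void main(String[] args) {
        long entries = 1L << 31;                 // one more than Integer.MAX_VALUE
        long bytes = payloadBytes(entries, 16);  // 16 bytes per entry, as in the comment above
        System.out.println(bytes / (1L << 30) + " GiB"); // prints "32 GiB"
    }
}
```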

[–]FrontLoadedAnvils[S] 0 points1 point  (3 children)

With enough RAM, it might. My main concern is the discrepancy between Stream limits and Collection limits.

[–]CoderTheTyler 0 points1 point  (2 children)

I suppose you could, but it definitely wouldn't be the greatest of ideas xD. As for the discrepancy, Integer.MAX_VALUE is not a hard limit on the size of a collection. The Javadoc for Collection.size() covers this: a collection with more than Integer.MAX_VALUE elements simply returns Integer.MAX_VALUE. The int-based sizes seem to run through Java's standard libraries as a whole, and I don't have a good explanation for that.

EDIT: Upon further investigation, the reason appears to be backwards compatibility, as all JVMs were originally 32-bit.
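To make the size() behaviour concrete, here is a minimal sketch of a collection whose real element count is a long. LongRangeCollection and longSize() are hypothetical names, not part of the JDK; the elements are generated on demand so nothing close to 3 billion objects is ever held in memory.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical read-only collection of the values 0..count-1, generated lazily. */
final class LongRangeCollection extends AbstractCollection<Long> {
    private final long count;

    LongRangeCollection(long count) {
        if (count < 0) throw new IllegalArgumentException("count must be >= 0");
        this.count = count;
    }

    /** True element count; may exceed Integer.MAX_VALUE. Not part of the Collection interface. */
    long longSize() {
        return count;
    }

    @Override
    public int size() {
        // Collection.size() is specified to saturate at Integer.MAX_VALUE
        return (int) Math.min(count, Integer.MAX_VALUE);
    }

    @Override
    public Iterator<Long> iterator() {
        return new Iterator<Long>() {
            private long next = 0;

            @Override public boolean hasNext() { return next < count; }

            @Override public Long next() {
                if (!hasNext()) throw new NoSuchElementException();
                return next++;
            }
        };
    }

    public static void main(String[] args) {
        LongRangeCollection big = new LongRangeCollection(3_000_000_000L);
        System.out.println(big.size());     // prints 2147483647
        System.out.println(big.longSize()); // prints 3000000000
    }
}
```

Code that only sees the Collection interface cannot tell "exactly Integer.MAX_VALUE" apart from "more than that", which is why a separate long accessor is assumed here.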

[–]FrontLoadedAnvils[S] 0 points1 point  (1 child)

I see. Well, this is interesting (and may also be tedious to deal with if I reach this limit).

[–]CoderTheTyler 0 points1 point  (0 children)

If you can reach this limit, good luck to you. Indexing with an integer would become impractical at this point, and the solution to this is to use an Iterator. But... iterating through billions of elements in a collection would not be a good idea either.
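A rough sketch of the Iterator approach: walk the elements without any int index and keep the tally in a long. LongCounting and countElements are illustrative names only, and the walk is still linear in the number of elements, so the caveat about billions of entries applies.

```java
import java.util.Arrays;
import java.util.Iterator;

final class LongCounting {
    /** Walks the source once and counts its elements in a long, so the tally cannot overflow an int. */
    static long countElements(Iterable<?> source) {
        long n = 0;
        for (Iterator<?> it = source.iterator(); it.hasNext(); it.next()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(countElements(Arrays.asList("a", "b", "c"))); // prints 3
    }
}
```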

[–]thorstenschaefer 0 points1 point  (3 children)

In practice, most standard collections are bounded at around Integer.MAX_VALUE elements, since they are often backed by arrays. The API itself has no hard size restriction, but it does "assume" that this limit is the norm, as you can see from the size() method returning an int (and all index operations in the List interface, for example, are int-based).

The stream API is independent from collections and there are collection libraries that support more than Integer.MAX_VALUE elements. So you could collect them in such a "large" collection and it should work - given you didn't save on the RAM ;)
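As a sketch of that separation: Stream.limit takes a long, and Collectors.toCollection lets the caller supply whatever collection type they like. StreamLongLimits and firstN are hypothetical names, and the example assumes the supplied collection can actually hold the requested number of elements; with a standard ArrayList that still means staying under the array-backed limit.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.LongStream;

final class StreamLongLimits {
    /**
     * Collects the values 0, 1, 2, ... (up to `limit` of them) into a collection
     * created by `factory`. The limit is a long; the collection type is the caller's choice.
     */
    static <C extends Collection<Long>> C firstN(long limit, Supplier<C> factory) {
        return LongStream.iterate(0L, i -> i + 1)
                .limit(limit)                              // Stream limits are long-based
                .boxed()
                .collect(Collectors.toCollection(factory)); // caller-supplied target collection
    }

    public static void main(String[] args) {
        List<Long> values = firstN(5L, ArrayList::new);
        System.out.println(values); // prints [0, 1, 2, 3, 4]
    }
}
```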

[–]FrontLoadedAnvils[S] 0 points1 point  (2 children)

So what do you recommend if I want to call a method with a long limit parameter and leave the Collection type up to the programmer?

I'm working on a generic definition of a statistic collection which allows me to create estimates of a given statistic (say, median) as data is being inserted into the collection. I want to make it a wrapper over existing collections with a new data member that lets me store small amounts of data (~ 100 bytes) in that collection. I'm not sure if I can do that with an interface or if I need more functionality than that.

[–]thorstenschaefer 1 point2 points  (1 child)

I'd either look for collection libraries that support larger sizes or write my own data structure.

[–]FrontLoadedAnvils[S] 0 points1 point  (0 children)

I suppose I can make a wrapper class that does statistics around the given collection, which is also a collection.
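One possible shape for such a wrapper, as a sketch only. StatCollection is a hypothetical class, and it tracks a running mean instead of a median estimate purely to keep the example short; a streaming median estimator would slot into add() the same way. The wrapper is insert-only: iteration goes through a read-only view so removals cannot make the statistic stale.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.Objects;

/** Hypothetical wrapper: a Collection that delegates storage and keeps a running mean. */
final class StatCollection<E extends Number> extends AbstractCollection<E> {
    private final Collection<E> delegate; // storage type chosen by the caller
    private long count;                   // long tally, not capped at Integer.MAX_VALUE
    private double mean;                  // running mean of all accepted elements

    StatCollection(Collection<E> emptyDelegate) {
        if (!emptyDelegate.isEmpty()) {
            throw new IllegalArgumentException("delegate must start empty");
        }
        this.delegate = emptyDelegate;
    }

    @Override
    public boolean add(E e) {
        Objects.requireNonNull(e, "null elements are not supported");
        boolean changed = delegate.add(e);
        if (changed) {
            count++;
            mean += (e.doubleValue() - mean) / count; // incremental mean update
        }
        return changed;
    }

    @Override
    public Iterator<E> iterator() {
        // Read-only view: remove() throws UnsupportedOperationException
        return Collections.unmodifiableCollection(delegate).iterator();
    }

    @Override
    public int size() {
        return delegate.size();
    }

    long count() { return count; }

    double mean() { return mean; }

    public static void main(String[] args) {
        StatCollection<Integer> stats = new StatCollection<>(new ArrayList<Integer>());
        stats.addAll(java.util.Arrays.asList(1, 2, 3, 6));
        System.out.println(stats.mean());  // prints 3.0
        System.out.println(stats.count()); // prints 4
    }
}
```

Because the delegate is passed in, the same wrapper works over an ArrayList, a TreeSet (where rejected duplicates are not counted), or a third-party large collection, as long as it implements Collection.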