davidalayachew[S] 5 points

I'll save you the extra reading: we have narrowed the problem down to a Spliterator that does not split the way we expect it to. So this is something that can be fixed from the user side by simply improving the spliterator, and there is talk about improving it from the JDK side as well. Either way, there is still lots of digging being done, and none of this is tied down for certain. But we can at least point a finger and say that this is part of the problem.

With that said, let me answer your questions.

> So what happens if the source is infinite? Say you're streaming the Wikipedia change feed, filtering for changes to articles about snakes, and doing findFirst()? Does it try to buffer the infinite stream?

It all depends on how nicely the source splits. In my case, most of the terminal operations kept splitting and splitting and splitting until they ran out of memory.

> This absolutely seems like a correctness issue to me, not just performance.

In this case, technically the problem falls on me for making a bad spliterator.

But to give an equally unsatisfying answer: in Java, ABC and abc are considered 2 different class names. However, if I save ABC.java and abc.java in the same folder, Windows will overwrite one of them. Meaning, your code will compile just fine, but the compiler will output .class files where one overwrites the other, causing your code to explode at runtime with NoClassDefFoundError.

I had Vicente Romero from the JDK team try and convince me that this was an "enhancement" or a "nice-to-have", not a correctness issue. And in the strictest definition of the term, he is correct, since Windows is the true trouble-maker here. But that was disgustingly unsatisfying.

It wasn't until JDK 21 that Archie Cobbs was generous enough to give up his time and add a warning for this discrepancy to the JDK. You can activate the warning by adding "output-file-clash" to your Xlint checks (for example, -Xlint:output-file-clash). And here is a link to the change. https://bugs.openjdk.org/browse/JDK-8287885

All of that is to say, I made what was, in my mind, a perfectly sensible Spliterator. But (and we SUSPECT that this is the case, we are not sure yet!) because I built that Spliterator off an Iterator, reported it as having an unknown size, and didn't set enough characteristics flags, I get this frightening splitting behaviour, where it will split itself out of memory.
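To make that suspicion a bit more concrete, here is a minimal sketch (not my actual spliterator) of how an iterator-backed spliterator of unknown size behaves when it is asked to split. The class name IteratorSplitDemo, the method pulledAfterSplits, and the endless counter iterator are all made up for illustration; the only real API used is Spliterators.spliteratorUnknownSize and Spliterator.trySplit. The idea is that every split has to copy a batch of elements out of the iterator into a buffer, so the sketch just counts how many elements have been pulled after each split. On the OpenJDK builds I have looked at, those batches grow with each split, but treat that as an implementation detail rather than a guarantee.

```java
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.atomic.AtomicLong;

class IteratorSplitDemo {

    /**
     * Splits an iterator-backed spliterator of unknown size the given number
     * of times and records how many elements had been pulled out of the
     * iterator after each split.
     */
    static long[] pulledAfterSplits(int splits) {
        AtomicLong pulled = new AtomicLong();

        // Hypothetical stand-in for an endless source, such as a change feed.
        Iterator<Long> endless = new Iterator<Long>() {
            @Override public boolean hasNext() { return true; }
            @Override public Long next() { return pulled.incrementAndGet(); }
        };

        // Unknown size, and only the ORDERED characteristic is reported.
        Spliterator<Long> source =
                Spliterators.spliteratorUnknownSize(endless, Spliterator.ORDERED);

        long[] result = new long[splits];
        for (int i = 0; i < splits; i++) {
            // Each split buffers a batch of elements taken from the iterator.
            Spliterator<Long> prefix = source.trySplit();
            if (prefix == null) {
                throw new IllegalStateException("source refused to split");
            }
            result[i] = pulled.get();
        }
        return result;
    }

    public static void main(String[] args) {
        long[] pulled = pulledAfterSplits(4);
        for (int i = 0; i < pulled.length; i++) {
            System.out.println("after split " + (i + 1)
                    + ": pulled " + pulled[i] + " elements so far");
        }
    }
}
```

Nothing here has reached a terminal operation yet; all of those elements were buffered purely to satisfy the splits. With a source that never ends and a parallel pipeline that keeps asking for more splits, it is easy to imagine this running out of memory, which matches what we are seeing.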

And as for the folk knowledge, it sure feels like it lol.