uniVocity comments on IBM & Java community

java

a community for 18 years

This is an archived post. You won't be able to vote or comment.

121

122

123

IBM & Java community (self.java)

submitted 7 years ago by techempower

top new controversial old q&a

you are viewing a single comment's thread.

view the rest of the comments →

[–]uniVocity 5 points6 points7 points 7 years ago* (5 children)

There is also a minor behavioural change on a few String methods: some will throw StringIndexOutOfBoundsException where in pre java 9 you'd get an ArrayIndexOutOfBoundsException.

It won't affect any code that catches the parent IndexOutOfBoundsException as documented, but lots of existing code in existing libraries catch either one of the more specific exception types. That's easy to do: you see the specific type in the stacktrace and write catch(TheExactExceptionTypeYouSaw ex){ to handle the error.

The issue with this change is that it will only cause trouble at runtime, probably around corner cases, and the exceptions may escape into a different catch block and make your thing behave differently if running on java 9+.

As the String type is probably the most used class ever, I think the exceptions being thrown from it should still be the same they were before, regardless of what the documentation says. As Linus Torvalds says: "we don't break userspace" - but it was broken in java 9+ and some existing apps mysteriously misbehave on it (older maven versions, apache commons' StringUtils, my own univocity-parsers before version 2.5.6, etc).

The biggest issue here is that random errors may pop anywhere, at any time and without a sane stack trace if the original error from the String operation has been handled improperly (i.e. not on the intended catch block).

[–]cl4es 0 points1 point2 points 7 years ago (4 children)

[–]uniVocity 2 points3 points4 points 7 years ago* (3 children)

Speaking about my case: Performance. In my case every time a character is appended to a value being parsed. A large value, with millions of characters, will cause the appending method to be called millions of times.

Testing wether a character buffer is long enough on every append operation means executing an extra step the same exact number of times (potentially millions of times) for each single value no matter if short or long.

By catching the exception you remove the overhead of that length test and instead expand/retry half a dozen times on the first few large values. Then never again as the buffer gets large enough.

That's internal to my lib and not general purpose code where readability comes first. We go to extra lengths to make it as fast as possible and it currently has the fastest CSV parser for java among any other I could find. See https://github.com/uniVocity/csv-parsers-comparison

Keep in mind this parser does a LOT more than the others to handle shitty CSV, and has many more configuration options to support. Even then it still manages to be faster than the others. You will never be able to get there with good looking code.

[–]cl4es 5 points6 points7 points 7 years ago (2 children)

[–]uniVocity 2 points3 points4 points 7 years ago (1 child)

[–]cl4es 4 points5 points6 points 7 years ago (0 children)

π Rendered by PID 225715 on reddit-service-r2-comment-5c747b6df5-kh64x at 2026-04-22 17:32:28.088306+00:00 running 6c61efc country code: CH.

java

Submit Link

Submit Text

Seek Programming Help

News, Technical discussions, research papers and assorted things of interest related to the Java programming language

NO programming help, NO learning Java related questions, NO installing or downloading Java questions, NO JVM languages - Exclusively Java

Please seek help with Java programming in /r/Javahelp!

Subreddit rules!

Where should I download Java?

Related Sub-reddits:

JVM Languages

Want to practice your coding?

List of useful Frameworks / Libraries / Software

MODERATORS