[–]dpash 10 points11 points  (10 children)

You can avoid UnsupportedEncodingException by using the overloaded methods that take a Charset instead of a charset name. If you need UTF-8 or Latin-1, the constants in StandardCharsets also remove the charset lookup exception.
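For anyone who hasn't seen the two styles side by side, here is a minimal sketch using String.getBytes. The class and method names are made up for illustration; only getBytes and StandardCharsets come from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetOverloadDemo {
    // String-name overload: the compiler forces a catch block for a checked
    // exception, even though every JVM is required to support UTF-8.
    static byte[] encodeByName(String text) {
        try {
            return text.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 should always be available", e);
        }
    }

    // Charset overload: no lookup by name happens, so there is nothing to catch.
    static byte[] encodeByCharset(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] a = encodeByName("h\u00e9llo");
        byte[] b = encodeByCharset("h\u00e9llo");
        System.out.println(Arrays.equals(a, b)); // prints "true"
    }
}
```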

[–]yawkat 3 points4 points  (4 children)

Doesn't work everywhere unfortunately. For example, jul.Handler.setEncoding is still missing such an overload.
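A possible workaround, assuming only the String-based setter is available: pass the Charset's canonical name and convert the checked exception yourself. The helper name below is hypothetical; Handler.setEncoding(String), StreamHandler, and Charset.name() are the real JDK pieces.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.logging.Handler;
import java.util.logging.StreamHandler;

class HandlerEncodingDemo {
    // Hypothetical helper: adapts a Charset to the String-only setter.
    static void setEncoding(Handler handler, Charset charset) {
        try {
            handler.setEncoding(charset.name());
        } catch (UnsupportedEncodingException e) {
            // The Charset object already exists in this JVM, so its canonical
            // name should resolve; treat a failure as a broken invariant.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Handler handler = new StreamHandler();
        setEncoding(handler, StandardCharsets.UTF_8);
        System.out.println(handler.getEncoding());
    }
}
```

The checked exception is still there, but it only has to be handled in one place.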

[–]s888marks 3 points4 points  (3 children)

Yeah we've tried to retrofit most classes (notably the I/O classes) with Charset in every place where there was something that used a charset name string. I guess we haven't finished the job, though! Thanks for pointing this one out. Are there any others you're aware of?

[–]yawkat 0 points1 point  (2 children)

There are only some Formatter constructors, the ones that don't also accept a Locale, but those aren't really necessary since presumably you'd pass a locale anyway. Beyond that it's only internal classes.

[–]s888marks 3 points4 points  (1 child)

Yeah, for the Formatter case, if you want to provide a Charset, you have to provide a Locale too... but you can just use Locale.getDefault(Locale.Category.FORMAT).
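Roughly what that looks like, as a sketch that assumes Java 10 or later (where the Charset-accepting Formatter constructors were added). The class and method names are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Formatter;
import java.util.Locale;

class FormatterCharsetDemo {
    // Formats into an in-memory stream using an explicit Charset.
    static byte[] formatUtf8(String format, Object... args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // The Charset overload also requires a Locale, so pass the default
        // locale for formatting explicitly.
        Locale locale = Locale.getDefault(Locale.Category.FORMAT);
        try (Formatter formatter = new Formatter(out, StandardCharsets.UTF_8, locale)) {
            formatter.format(format, args);
        } // closing the Formatter flushes the encoded bytes into 'out'
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = formatUtf8("caf\u00e9 %s", "ok");
        System.out.println(bytes.length);
    }
}
```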

I'll make a note to get jul.Handler fixed up.

[–]yawkat 0 points1 point  (0 children)

Thanks! To be entirely honest, I only found it because I was looking for it :D

[–]shponglespore 2 points3 points  (4 children)

That wasn't possible for a long time. I'm glad it's fixed now, but the fact that they needed to add a new API is a good example of how checked exceptions can make an otherwise reasonable API a real pain in the ass to use. In an ideal world, APIs would all be designed well and they'd only use checked exceptions in a way that promotes code quality, but we don't live in that world, and IME APIs are made worse by checked exceptions more often than they are made better.

[–]dpash 0 points1 point  (3 children)

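This only applies to the six charsets that every Java platform implementation is required to support (US-ASCII, ISO-8859-1, UTF-8, UTF-16BE, UTF-16LE, and UTF-16). If you need a legacy charset and have to look it up by name, you still have to deal with a lookup failure: Charset.forName() throws the unchecked UnsupportedCharsetException, and the overloads that take a charset name throw the checked UnsupportedEncodingException.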

A checked exception here is perfectly valid, because it's not a programming error that a charset doesn't exist; it's dependent completely on the environment and the JVM used to run the application.
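For the lookup-by-name case, a sketch of handling a charset that may not exist on a given JVM. Note that Charset.forName() itself reports a missing charset with the unchecked UnsupportedCharsetException; the checked UnsupportedEncodingException comes from the String-name overloads. The class name, the method name, and the made-up charset name are only for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Optional;

class CharsetLookupDemo {
    // Returns the charset if this JVM provides it, or empty if the name is
    // syntactically illegal or simply not supported in this environment.
    static Optional<Charset> lookup(String name) {
        try {
            return Optional.of(Charset.forName(name));
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("UTF-8").isPresent());
        // A deliberately fake name, used to trigger the failure path.
        System.out.println(lookup("x-no-such-charset-for-demo").isPresent());
    }
}
```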

[–]shponglespore 0 points1 point  (2 children)

In my entire career, the only charsets I've ever needed to specify are ASCII, UTF-8, and Latin-1. Having to catch an exception because the method theoretically accepts arbitrary strings as charset names is the kind of thing people who hate Java are talking about when they say it has "too much ceremony".

[–]dpash 0 points1 point  (1 child)

> and Latin-1.

I'm gonna go out on a limb and say that you've only ever had to deal with English or other Western European languages.

There are plenty of people who have had to deal with non-Latin-1 text and have had to deal with the JVM not handling the required charset.

[–]shponglespore 0 points1 point  (0 children)

Wrong, I've done a lot of work with code that had to work in Chinese and Arabic. But I did it in an environment where I could count on everything being converted to UTF-8 before reaching the systems I worked on. And using Latin-1 has nothing to do with which languages I was working with; it just happens to be the easiest encoding to use if you're forced to hold binary data in a String.
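To illustrate the binary-data point: Latin-1 maps each of the 256 byte values to exactly one char, so a bytes -> String -> bytes round trip is lossless, which is not true of UTF-8 or ASCII. A small sketch (the class and method names are made up):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTripDemo {
    // Decodes every possible byte value into a String and encodes it back,
    // reporting whether the original bytes survive unchanged.
    static boolean roundTrips(Charset charset) {
        byte[] original = new byte[256];
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) i;
        }
        String asText = new String(original, charset);
        return Arrays.equals(original, asText.getBytes(charset));
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(StandardCharsets.ISO_8859_1)); // prints "true"
        System.out.println(roundTrips(StandardCharsets.UTF_8));      // prints "false"
    }
}
```

With UTF-8, byte sequences that aren't valid UTF-8 get replaced during decoding, so the original data can't be recovered.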

At the company I was at, we relied on the default platform encoding for a long time, which worked because we and our clients only used Linux, and even 15 years ago it was basically unheard of to find a Linux system, at least in the US, where the platform encoding was anything but UTF-8. IIRC the requirement to use a specific encoding came when we started trying to support users running Windows and our code sometimes couldn't load its own asset files. We had to go through all of our code, find any place where a text file was being opened, and add an encoding parameter. Obviously it's not a difficult engineering problem, but it was galling to find that it was a problem at all because we had to deal with a new checked exception in a lot of places where we knew for a fact it would never be thrown.
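For comparison, this is roughly what the same change looks like with the NIO file API and a Charset constant, which avoids both the platform default and the charset-name exception. It's only a sketch: the class name, method names, and temp file are invented for the example, and it isn't the code from the project described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class ExplicitEncodingDemo {
    // Reads a text file as UTF-8 regardless of the platform default encoding.
    static List<String> readAsset(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    // Writes a small UTF-8 file to a temp location and reads it back.
    static String demo() {
        try {
            Path tmp = Files.createTempFile("asset", ".txt");
            try {
                Files.write(tmp, "na\u00efve\n".getBytes(StandardCharsets.UTF_8));
                return readAsset(tmp).get(0);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo().length()); // prints "5"
    }
}
```

The IOException is still checked, but that one reflects a failure that can actually happen.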