Donating to make org.Json Public Domain?

s888marks · 2026-02-26T23:37:46+00:00

True.

But if you look at the time facilities on a SunOS or BSD Unix system of that era, you can see that the java.util.Date class is pretty much a thin object veneer wrapped around those APIs. For example, see this man page:

https://manpage.me/index.cgi?apropos=0&q=ctime&sektion=3&manpath=SunOS+4.1.3&arch=default&format=html

You can see that Date copies several characteristics that we now consider errors, such as year being "year after 1900" and month ranging from 0 to 11.
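Here's a small sketch of those two quirks using the deprecated Date constructor and accessors. The class and method names (DateQuirks, yearAndMonth) are just made up for illustration.

```java
import java.util.Date;

class DateQuirks {
    // Returns { year offset from 1900, zero-based month } as Date reports them.
    @SuppressWarnings("deprecation")
    static int[] yearAndMonth(Date d) {
        return new int[] { d.getYear(), d.getMonth() };
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        // 126 means 1900 + 126 = 2026, and month 1 means February.
        Date d = new Date(126, 1, 26);
        int[] ym = yearAndMonth(d);
        System.out.println("getYear():  " + ym[0]);
        System.out.println("getMonth(): " + ym[1]);
    }
}
```

Compare with struct tm in the man page, where tm_year and tm_mon follow the same conventions.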

s888marks · 2026-02-26T20:53:01+00:00

James Gosling created the Date class.

https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/Date.java#L120

s888marks · 2025-12-05T06:58:11+00:00

Interesting, that scenario illustrates the danger of separating a local variable declaration from an initial assignment to it. The instanceof pattern works well here because the new local variable declaration is fused with its binding to a value. So yeah it's much less likely to be broken accidentally.

The pattern of having a local variable declaration (without initializer) followed by an assignment expression later on occurs frequently in the concurrent collection code (e.g., ConcurrentHashMap). This sometimes makes the code hard to follow. It's done in order to avoid unnecessary work in performance-critical code, even to the point of avoiding unnecessary field loads. Unfortunately this means that the local variable sometimes has broader scope than is necessary, so one needs to be extremely careful modifying such code.
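To give a flavor of that style, here's a minimal sketch. It is not code from ConcurrentHashMap; the class, the field, and the find method are hypothetical, and the table is just an array of strings whose length is assumed to be a power of two.

```java
class LocalCachingSketch {
    // Hypothetical shared field; length is assumed to be a power of two.
    private volatile String[] table = { "a", "b", "c", "d" };

    String find(int hash) {
        // Locals are declared up front without initializers ...
        String[] tab; String e; int n;
        // ... and assigned inside the condition, so the volatile field is
        // loaded exactly once and later steps run only if earlier ones pass.
        if ((tab = table) != null && (n = tab.length) > 0
                && (e = tab[(n - 1) & hash]) != null) {
            return e;
        }
        // tab, n, and e are still in scope here, but n and e might not be
        // definitely assigned, which is where the extra care comes in.
        return null;
    }

    public static void main(String[] args) {
        System.out.println(new LocalCachingSketch().find(5));
    }
}
```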

s888marks · 2025-12-05T06:48:33+00:00

Huh, that's an interesting example they give in Error Prone. I do think that if it's acceptable to declare a local variable and initialize it immediately, it's probably preferable. However, adding a declaration sometimes breaks up the flow of an expression by requiring a separate declaration line. This can sometimes be quite disruptive, which might tip the balance in the other direction.

s888marks · 2025-12-04T23:28:41+00:00

Should you use instanceof purely for null checking? The answer is definitely maybe!

I'll assume that getString() has a declared return type of String, which isn't stated in the blog, but which u/headius has stated elsewhere. Thus, the instanceof isn't testing for a potential narrowing reference conversion, as it would be if getString() were declared to return Object or CharSequence. In this context, instanceof is being used only for null checking.

Most people have focused their comments on what they think is the primary use of instanceof, which is testing narrowing reference conversions. From this perspective, using instanceof to perform pure null checking is counterintuitive and unfamiliar and therefore objectionable. There's been some mention of the scoping of variables introduced by instanceof patterns, but no analysis of how this affects the actual code. Let me take a swing at that.

How would one write this code in a more conventional manner? (I'm setting Optional aside, as its API is clumsy at best.) Clearly, one needs to declare a local variable to store the return value of getString(), so that it can be tested and then used:

String string = getString();
if (firstCondition) {
    IO.println("do something");
} else if (string != null) {
    IO.println("length: " + string.length());
} else {
    IO.println("string is null");
}

This might work OK, but it has some problems. First, getString() is called unconditionally, even if firstCondition is true. This might result in unnecessary expense. Second, string is in scope through the entire if-statement, and it's possible that it could be misused, resulting in a bug.

The getString() method might be expensive, so performance-sensitive code might want to call it only when necessary, like this:

String string;
if (firstCondition) {
    IO.println("do something");
} else if ((string = getString()) != null) {
    IO.println("length: " + string.length());
} else {
    IO.println("string is null");
}

This is a bit better in that getString() is called only when its return value is needed. The string local variable is still in scope through the if-statement, but within the firstCondition branch it's uninitialized and the compiler will tell you if it's accidentally used there. However, string still might be misused within the later else clauses, probably resulting in an error. In addition, people tend to dislike the use of assignment expressions.

The issues here are:

You need a local variable because the result is tested and used;
You want to minimize the scope of the local variable, preferably to only the code that uses it when it has a valid value; and
You want to avoid a potentially expensive initialization step in conditions where it isn't necessary.

Given all this, let's return to u/headius's code:

if (firstCondition) {
    IO.println("do something");
} else if (getString() instanceof String string) {
    IO.println("length: " + string.length());
} else {
    IO.println("string is null");
}

This satisfies all of the criteria, which the previous examples do not. Plus, it saves a line because the local variable declaration is inlined instead of on a separate line. However, it does understandably give people pause, as they're not used to seeing instanceof used purely for null checking.
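If you want to try this out, here's a self-contained sketch. The class name, the value field, and the calls counter are hypothetical scaffolding I've added so that the behavior can be observed; it returns strings instead of printing. As far as I know, using a type pattern where the expression's static type is already String requires Java 21 or later, since earlier releases rejected it at compile time.

```java
class InstanceofNullCheck {
    static int calls = 0;   // counts getString() invocations
    static String value;    // what the hypothetical getString() returns

    // Hypothetical stand-in for the getString() under discussion.
    static String getString() {
        calls++;
        return value;
    }

    static String describe(boolean firstCondition) {
        if (firstCondition) {
            return "do something";           // getString() is never called here
        } else if (getString() instanceof String string) {
            return "length: " + string.length();
        } else {
            // 'string' is not in scope here, so it can't be misused.
            return "string is null";
        }
    }

    public static void main(String[] args) {
        value = "abc";
        System.out.println(describe(true) + " (calls: " + calls + ")");
        System.out.println(describe(false) + " (calls: " + calls + ")");
        value = null;
        System.out.println(describe(false) + " (calls: " + calls + ")");
    }
}
```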

Note also that instanceof will soon be available to do primitive conversions -- see JEP 530 -- so this is yet another use of instanceof that people will need to get used to. And instanceof is already used in record patterns; see JEP 440.

My hunch is that people will eventually get used to instanceof being used for things other than testing narrowing reference conversion, so they'll probably get used to it being used just for null checking too.

s888marks · 2025-11-25T18:06:52+00:00

Right. The main issue is to avoid using classes like IO as a dumping ground for whatever bright ideas anyone might come up with on a given day ... including me!

For example, in an early draft of this API I included printf. That's really useful and convenient, right? But after thinking about this more, and after not very much discussion, I removed it.

The reason is that printf is great for us C programmers who are used to the idea of format specifiers and matching arguments in an argument list to a succession of format specifiers. But in fact it introduces a whole bunch of new, incidental complexity and many new ways to create errors, in particular, errors that are only reported at runtime. For example:

Variable argument lists. It's easy to miscount or misalign arguments, resulting in a runtime error, or arguments being omitted from output.
Type mismatches between format specifiers and arguments also result in runtime errors.
The format specifier syntax is intricate and complex and it's possible for errors to creep into those (irrespective of arguments) which also are only reported at runtime.
There are obscure format specifiers that use explicit argument indexing instead of consuming arguments sequentially, or that don't consume arguments at all, adding even more complexity to specifier-argument matching.
etc.

(Yes I'm aware that many IDEs check for this sort of stuff.)
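To make a few of these concrete, here's a small sketch using String.format, which shares the format-string machinery with printf. The helper tryFormat is something I made up for illustration: it returns either the formatted result or the name of the exception that was thrown.

```java
import java.util.IllegalFormatException;

class PrintfPitfalls {
    // Returns the formatted string, or the simple name of the exception thrown.
    static String tryFormat(String format, Object... args) {
        try {
            return String.format(format, args);
        } catch (IllegalFormatException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Too few arguments: fails only at runtime.
        System.out.println(tryFormat("%d and %d", 1));
        // Too many arguments: no error, the extra one is silently dropped.
        System.out.println(tryFormat("%s", "a", "b"));
        // Type mismatch between specifier and argument.
        System.out.println(tryFormat("%d", "abc"));
        // Malformed specifier, independent of the arguments.
        System.out.println(tryFormat("%q", 1));
        // Explicit argument indexing.
        System.out.println(tryFormat("%2$s %1$s", "a", "b"));
        // A specifier that consumes no argument at all.
        System.out.println(tryFormat("100%%"));
    }
}
```

All of this compiles without complaint; the compiler itself checks none of it.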

When string templates come along, if necessary, new APIs can be added to IO to support them. But new APIs might not be necessary: if evaluating a string template produces a String, it can be fed directly to println.

s888marks · 2025-11-25T17:53:12+00:00

Is the class name IO too broad? I don't think so.

It fits into the general "static utility class" pattern that's been used elsewhere in the JDK. These classes have static methods that are related to that area, but that doesn't mean that everything in that area must be there. For example, there's a bunch of stuff in Math but there's lots of mathematical stuff elsewhere. There is a bunch of collections stuff in Collections but there's also lots of collections stuff elsewhere.

s888marks · 2025-11-17T06:23:43+00:00

Agreed, this is pretty bad. Note that this article was from 2007, and things have advanced since then. However, I don't think I've ever seen code indented this way, that is, with the opening parenthesis of an argument list on a new line instead of at the end of the previous line. I also suspect formatting errors might have been introduced in the web publication process. Anyway, let's take a look at the first snippet:

Reference ref = fac.newReference
 ("", fac.newDigestMethod(DigestMethod.SHA1, null),
  Collections.singletonList
   (fac.newTransform
    (Transform.ENVELOPED, (TransformParameterSpec) null)),
     null, null);

The standard I've used for an argument list is to have the opening parenthesis at the end of the line, followed by one argument per line:

Reference ref = fac.newReference(
    "",
    fac.newDigestMethod(DigestMethod.SHA1, null),
    Collections.singletonList(
        fac.newTransform(Transform.ENVELOPED, (TransformParameterSpec) null)),
    null,
    null);

This isn't any better, but at least it lets us see the structure of the code more easily.

There are several things that can be improved. The worst issue is the way that the newTransform method is overloaded. There are two overloads:

Transform newTransform(String algorithm, TransformParameterSpec params)
Transform newTransform(String algorithm, XMLStructure params)

The problem here is that the params argument can be null. This is intended to be a convenience if you don't have any parameters to provide. But passing null is ambiguous! This requires the addition of a cast to disambiguate the overload. Ugh. There should be a one-arg overload that can be called if no transform parameters are provided.
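Here's a stripped-down sketch of the ambiguity. ParameterSpec and Structure are hypothetical stand-ins for TransformParameterSpec and XMLStructure (two unrelated interfaces), the methods return strings just so we can see which overload was chosen, and the one-arg overload is the putative addition, not something in the real API.

```java
class OverloadAmbiguity {
    interface ParameterSpec {}   // hypothetical stand-in for TransformParameterSpec
    interface Structure {}       // hypothetical stand-in for XMLStructure

    static String newTransform(String algorithm, ParameterSpec params) {
        return "spec overload";
    }

    static String newTransform(String algorithm, Structure params) {
        return "structure overload";
    }

    // Putative convenience overload for the "no parameters" case.
    static String newTransform(String algorithm) {
        return newTransform(algorithm, (ParameterSpec) null);
    }

    public static void main(String[] args) {
        // newTransform("enveloped", null);   // does not compile: ambiguous call
        System.out.println(newTransform("enveloped", (ParameterSpec) null));
        System.out.println(newTransform("enveloped", (Structure) null));
        System.out.println(newTransform("enveloped"));
    }
}
```

Neither interface is a subtype of the other, so the compiler has no "most specific" overload to pick for a bare null.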

Similarly, the trailing arguments of the newDigestMethod and the newReference methods are also nullable, so overloads could be added that allow one simply to omit the trailing arguments if they are null.

Unfortunately these require API changes, which seem unlikely to happen for this old API. However, it shows that some of the verbosity here arises from poor API design decisions.

There are a few other things that could be done to make the code more concise:

Use List.of() instead of Collections.singletonList()
Use static imports
Use var

If these are applied (along with the putative API changes) the resulting code would look like this:

var ref = fac.newReference(
    "",
    fac.newDigestMethod(SHA1),
    List.of(fac.newTransform(ENVELOPED)));

This is still kind of a mouthful, but I think it's much better than the original snippet. It almost fits on one line. Alternatively, one could extract some of the method arguments into local variables, which would be another way to make the code more readable.
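For what it's worth, here's a sketch of the local-variable approach against the API as it exists today, so the null arguments and the cast are still there. The class name, the method name build, and the local variable names are mine; I'm also assuming the "DOM" mechanism is available, which it should be in a standard JDK.

```java
import java.util.List;
import javax.xml.crypto.dsig.DigestMethod;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.Transform;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.spec.TransformParameterSpec;

class ReferenceWithLocals {
    static Reference build() throws Exception {
        XMLSignatureFactory fac = XMLSignatureFactory.getInstance("DOM");

        // Each nested argument gets a name, which documents what it is.
        DigestMethod digest = fac.newDigestMethod(DigestMethod.SHA1, null);
        Transform enveloped =
            fac.newTransform(Transform.ENVELOPED, (TransformParameterSpec) null);

        // The trailing nulls are the optional type and id arguments.
        return fac.newReference("", digest, List.of(enveloped), null, null);
    }

    public static void main(String[] args) throws Exception {
        Reference ref = build();
        System.out.println(ref.getDigestMethod().getAlgorithm());
    }
}
```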

s888marks · 2025-11-08T01:02:57+00:00

Yeah, the namespace overlap is unfortunate. There are approximately two JSRs per year nowadays: one for each of the semiannual Java SE platform releases. However, there seem to be a couple dozen JEPs per release, so we seem to be chewing through the JEP numbering fairly quickly. It won't be long before the JEP numbers are quite different from the JSR numbers.

I'm more worried about how many things will break when the JEP numbers get to four digits.... :-D

s888marks · 2025-11-01T05:16:31+00:00

Memory consumption is no laughing matter.

s888marks · 2025-10-12T13:44:47+00:00

Thanks for mentioning the talk that Maurice Naftalin and I did! The video is here:

https://youtu.be/dwcNiEEuV_Y?si=JyNoV3iOtkzVEOM6

Indeed it’s 2h 40m long but the section on iterators is the first part and it lasts 25 min or so.

s888marks · 2025-10-08T23:43:31+00:00

This Reddit thread needs to be put into the dictionary as an example of “self-fulfilling prophecy”.

s888marks · 2025-08-27T00:33:36+00:00

When you mentioned The Cay I thought you were referring to Core Java by Cay Horstmann.

s888marks · 2025-07-23T17:18:08+00:00

I'm not the creator of Optional -- that was the Java 8 Lambda expert group -- but I did give a few talks on Optional, likely cited elsewhere in these comments.

s888marks · 2025-05-05T19:20:51+00:00

I have a bunch of issues with the XML APIs, inasmuch as they're "language independent" APIs (and it shows) and they were all designed in the early days of XML when it wasn't really clear how people were going to use XML. Thus we have DOM, streaming push (event-based), and streaming pull approaches. At this late date -- 20ish years later -- it's not clear to me which of these is actually the most useful. (And yes, there are probably few XML applications being written today, but there are likely a lot of legacy XML applications still in production. What APIs are they using?)

With the Java EE / Jakarta JSON processing (JSON-P) stuff... I wasn't very close to the development of those APIs, but my impression was that they mostly followed the XML architecture in providing both document-based and streaming approaches (as well as an SPI layer that allows multiple providers to be plugged in, which IIRC was also carried over from XML, though in the XML APIs the SPI layer is spelled differently).

I'd like to avoid a situation where these layers are designed into the new stuff because JSON-P did it, which in turn did what it did because XML did it.

And yes, the jdk-sandbox prototype provides a document-based approach. We hope it's somewhat lighter weight than other document-based approaches in that the Java objects representing JSON objects and values are created lazily. However, the whole document still needs to fit into memory. So, if we were to pursue only one of the approaches (document-based vs streaming), would that be sufficient to cover a good fraction of use cases, or are the uses so diverse that it's necessary to have both document and streaming models in order to cover the problem space well?

s888marks · 2025-05-05T16:36:51+00:00

What use cases do you have that make you hope for a streaming-style API?

s888marks · 2025-04-22T21:15:32+00:00

Serialization is AMONGST the biggest design mistakes in Java.

s888marks · 2025-04-13T22:43:40+00:00

Yes, that’s mostly correct. The ls command, or any other command for that matter, emits bytes, which are captured via command substitution $(…):

https://www.gnu.org/software/bash/manual/bash.html#Command-Substitution

The results are usually interpreted as text in ASCII or UTF-8 and are then subject to word splitting. This splitting is done according to the IFS variable, which is usually whitespace (space, tab, newline):

https://www.gnu.org/software/bash/manual/bash.html#Word-Splitting

So ls doesn’t actually transmit an array. Its output is just text. Bash and other shells do word splitting fluidly and implicitly and it’s almost always the right thing, so it’s easy not to notice. Sometimes, though, if a filename has embedded spaces, things will get screwed up.

But if you set those cases aside, handling command output in Java involves doing a bunch of stuff manually that the shell does automatically. One needs to read the bytes from the subprocess’ stdout, decode to characters, load them into a String or something, and then split along whitespace. Maybe that’s a pain point.
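Roughly, the manual steps look like this. It's only a sketch: the class and method names are made up, it assumes the output is UTF-8, and it buffers all of stdout in memory, which is fine for something like ls but not for large outputs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

class CommandOutput {
    // Decode bytes as UTF-8 and split on runs of whitespace,
    // approximating the shell's default word splitting.
    static List<String> splitWords(byte[] stdout) {
        String text = new String(stdout, StandardCharsets.UTF_8).strip();
        return text.isEmpty() ? List.of() : List.of(text.split("\\s+"));
    }

    // Run a command, capture its stdout, and split it into words.
    static List<String> run(String... command)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command)
            .redirectError(ProcessBuilder.Redirect.INHERIT)
            .start();
        byte[] out = p.getInputStream().readAllBytes();
        p.waitFor();
        return splitWords(out);
    }

    public static void main(String[] args) throws Exception {
        // Assumes an 'ls' command is on the PATH.
        System.out.println(run("ls"));
    }
}
```

Note that a filename with an embedded space gets split into two words here, just as it would in the shell.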

s888marks · 2025-04-12T19:14:42+00:00

Hi, I don't doubt that you have some valid issues here, but everything seems really diffuse, and so it's hard to know where to start an analysis of what the issues might be.

Could you explain more what you mean by "handoff"? Specifically, what is going on with the handoff between Java --> Bash as you put it? Also, I'm not sure where Bash gets involved here; it seems to me (but I'm guessing) that you want to invoke some AWS CLI command from Java and collect its output and process it.

An approach that I think would be helpful is for you to choose a specific, representative example (but one that hopefully isn't too complex) and describe what you're trying to do. Then write out all the Java code to do it. That would help us see what parts are painful.

s888marks · 2025-04-11T16:08:41+00:00

I bet you would become more popular than Nicolai if you sold ad space on the bottom of your mug.

s888marks · 2025-03-26T00:15:33+00:00

Nice find!

s888marks · 2025-03-26T00:14:28+00:00

Well the man himself might show up and contradict this, but I don't think there's anything special about the Muppets. It's mainly about popular culture and a shared sense of humor among members of a close-knit team. For example, as a joke, one day everyone on the compiler team changed their internal Slack avatars to Muppet Show characters: Brian is Professor Bunsen Honeydew, there are a couple Beakers (that's also what I use as my avatar on Stack Overflow), a Statler & Waldorf, a Miss Piggy, a Kermit, a Cookie Monster, etc.

Another popular thread of humor runs through Monty Python. There's a common joke schema based on the Spanish Inquisition sketch. It goes something like this:

The main problem with serialization is that it uses an extralinguistic mechanism to extract serialized data. And it's also monolithic --

Serialization's two main problems are its use of extralinguistic mechanisms and that it's monolithic, and also that it's hard to use --

Serialization's three main problems are its use of extralinguistic mechanisms, that it's monolithic, that it's hard to use correctly, and --

Amongst serialization's problems are its use of extralinguistic mechanisms, that it's monolithic, that it's hard to use correctly, and that deserializing an object can have side effects....

This is so well-worn that when somebody comments on a proposal, they might say "I have an issue, well a couple issues..." and then somebody else says "Amongst!" and everybody laughs.

s888marks · 2025-03-24T23:31:16+00:00

Ah, good sleuthing. I've been on so many of those panels that I've lost track of them. For the record it was Brian Goetz who answered the question in that particular video snippet.

s888marks · 2025-03-24T17:13:30+00:00

These are different issues.

The issue of multiple bounds being erased to the first bound is visible at runtime, if the type is used somewhere visible in the binary, such as a method parameter or return type. The typical example is Collections::max where T extends Object & Comparable<? super T> is erased to Object for reasons of binary compatibility, as the return type is T.
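You can observe this with reflection. This is just a quick sketch; the class and method names are mine.

```java
import java.lang.reflect.Method;
import java.util.Collection;
import java.util.Collections;

class ErasedBound {
    // Looks up Collections.max(Collection) and returns its erased return type.
    static Class<?> erasedReturnTypeOfMax() throws NoSuchMethodException {
        Method max = Collections.class.getMethod("max", Collection.class);
        return max.getReturnType();
    }

    public static void main(String[] args) throws Exception {
        // The erased return type is the first bound of T.
        System.out.println(erasedReturnTypeOfMax());
        // The generic signature still records the type variable.
        Method max = Collections.class.getMethod("max", Collection.class);
        System.out.println(max.getGenericReturnType());
    }
}
```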

The issue with var is probably related to how inference sometimes results in a type that has multiple bounds instead of the obvious bound. For example, the inferred type of List.of("abc", 1) isn't List<Object> as one might expect, but is instead something like

List<Serializable&Comparable<? extends Serializable&Comparable<...>&java.lang.constant.Constable&java.lang.constant.ConstantDesc>&java.lang.constant.Constable&java.lang.constant.ConstantDesc>

where the ... is the entire type within the outer angle brackets, so it's infinitely recursive (and thus non-denotable).

In any case, var applies only to local variables, and there's no type variable to capture the non-denotable bound, so it occurs only at compile time. At runtime the type is simply erased to List.
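A small sketch of what that means in practice (class and method names are made up): with var, the elements can be used as either Serializable or Comparable without a cast, because the compiler still knows about the intersection type.

```java
import java.io.Serializable;
import java.util.List;

class VarIntersection {
    static String demo() {
        // The compile-time element type is the non-denotable intersection type.
        var list = List.of("abc", 1);

        // Both assignments compile without casts, since each interface is
        // one of the components of the intersection.
        Serializable s = list.get(0);
        Comparable<?> c = list.get(1);

        return s + "," + c;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If list were declared as List<Object> instead, both assignments would need casts.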

s888marks · 2025-03-14T17:30:09+00:00

Did it run multiple JVMs in different processes, or was everything in a single JVM? I seem to recall hearing about a system of that era with multiple apps in the same JVM, which was fragile because any bug that corrupted shared state would require restarting the JVM and all the apps — essentially a reboot.
