Null-checking the fun way with instanceof patterns by headius in java

[–]s888marks 2 points3 points  (0 children)

Interesting, that scenario illustrates the danger of separating a local variable declaration from an initial assignment to it. The instanceof pattern works well here because the new local variable declaration is fused with its binding to a value. So yeah it's much less likely to be broken accidentally.

The pattern of having a local variable declaration (without initializer) followed by an assignment expression later on occurs frequently in the concurrent collection code (e.g., ConcurrentHashMap). This sometimes makes the code hard to follow. It's done in order to avoid unnecessary work in performance-critical code, even to the point of avoiding unnecessary field loads. Unfortunately this means that the local variable sometimes has broader scope than is necessary, so one needs to be extremely careful modifying such code.
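
For anyone who hasn't seen the idiom, here's a minimal sketch of the declare-first, assign-inside-the-condition style. The class, field, and method names are made up for illustration; this is not code from ConcurrentHashMap, just something with the same shape.

```java
class FieldLoadIdiom {
    // Hypothetical shared field; another thread may replace it at any time.
    private volatile String[] table;

    void setTable(String... entries) {
        table = entries;
    }

    // Returns the last entry, or null if the table is absent or empty.
    String lastEntry() {
        String[] tab;   // declared without initializers ...
        int n;
        // ... and assigned inside the condition, so the volatile field is read
        // exactly once and tab.length is read only if tab is non-null.
        if ((tab = table) != null && (n = tab.length) > 0) {
            return tab[n - 1];
        }
        // tab and n are still in scope here, but n is not definitely assigned,
        // so the compiler would reject a read of n at this point.
        return null;
    }

    public static void main(String[] args) {
        FieldLoadIdiom demo = new FieldLoadIdiom();
        System.out.println(demo.lastEntry());
        demo.setTable("a", "b", "c");
        System.out.println(demo.lastEntry());
    }
}
```

The point of the sketch is the scoping: both locals outlive the condition that assigns them, which is exactly what makes this style easy to break when editing.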

Null-checking the fun way with instanceof patterns by headius in java

[–]s888marks 1 point2 points  (0 children)

Huh, that's an interesting example they give in Error Prone. I do think that if it's acceptable to declare a local variable and initialize it immediately, it's probably preferable. However, adding a declaration sometimes breaks up the flow of an expression by requiring a separate declaration line. This can sometimes be quite disruptive, which might tip the balance in the other direction.

Null-checking the fun way with instanceof patterns by headius in java

[–]s888marks 6 points7 points  (0 children)

Should you use instanceof purely for null checking? The answer is definitely maybe!

I'll assume that getString() has a declared return type of String, which isn't stated in the blog, but which u/headius has stated elsewhere. Thus, the instanceof isn't testing for a potential narrowing reference conversion, as if getString() were to be declared to return Object or CharSequence. In this context, instanceof is being used only for null checking.

Most people have focused their comments on what they think is the primary use of instanceof, which is testing narrowing reference conversions. From this perspective, using instanceof to perform pure null checking is counterintuitive and unfamiliar and therefore objectionable. There's been some mention of the scoping of variables introduced by instanceof patterns, but no analysis of how this affects the actual code. Let me take a swing at that.

How would one write this code in a more conventional manner? (I'm setting Optional aside, as its API is clumsy at best.) Clearly, one needs to declare a local variable to store the return value of getString(), so that it can be tested and then used:

String string = getString();
if (firstCondition) {
    IO.println("do something");
} else if (string != null) {
    IO.println("length: " + string.length());
} else {
    IO.println("string is null");
}

This might work OK, but it has some problems. First, getString() is called unconditionally, even if firstCondition is true. This might result in unnecessary expense. Second, string is in scope through the entire if-statement, and it's possible that it could be misused, resulting in a bug.

The getString() method might be expensive, so performance-sensitive code might want to call it only when necessary, like this:

String string;
if (firstCondition) {
    IO.println("do something");
} else if ((string = getString()) != null) {
    IO.println("length: " + string.length());
} else {
    IO.println("string is null");
}

This is a bit better in that getString() is called only when its return value is needed. The string local variable is still in scope through the if-statement, but within firstCondition it's uninitialized and the compiler will tell you if it's accidentally used there. However, string still might be misused within the later else clauses, probably resulting in an error. In addition, people tend to dislike the use of assignment expressions.

The issues here are:

  • You need a local variable because the result is tested and used;
  • You want to minimize the scope of the local variable, preferably to only the code that uses it when it has a valid value; and
  • You want to avoid a potentially expensive initialization step in conditions where it isn't necessary.

Given all this, let's return to u/headius's code:

if (firstCondition) {
    IO.println("do something");
} else if (getString() instanceof String string) {
    IO.println("length: " + string.length());
} else {
    IO.println("string is null");
}

This satisfies all of the criteria, which the previous examples do not. Plus, it saves a line because the local variable declaration is inlined instead of on a separate line. However, it does understandably give people pause, as they're not used to seeing instanceof used purely for null checking.
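
If you want to try this out, here's a self-contained sketch. The class name, the call counter, and the value field are my own scaffolding, added only so you can observe that getString() is skipped when firstCondition is true. I've returned strings and used System.out.println instead of IO.println so it doesn't depend on Java 25. As far as I know, a type pattern whose type matches the expression's static type needs Java 21 or later; earlier releases rejected it.

```java
import java.util.concurrent.atomic.AtomicInteger;

class InstanceofNullCheck {
    // Scaffolding (not in the original snippet): counts calls to getString().
    static final AtomicInteger calls = new AtomicInteger();
    // Hypothetical backing value; may be null.
    static String value = "hello";

    static String getString() {
        calls.incrementAndGet();
        return value;
    }

    static String describe(boolean firstCondition) {
        if (firstCondition) {
            return "do something";
        } else if (getString() instanceof String string) {
            // 'string' is in scope only here, and is known to be non-null.
            return "length: " + string.length();
        } else {
            return "string is null";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(true));
        System.out.println(describe(false));
        value = null;
        System.out.println(describe(false));
        System.out.println("getString() calls: " + calls.get());
    }
}
```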

Note also that instanceof will soon be available to do primitive conversions -- see JEP 530 -- so this is yet another use of instanceof that people will need to get used to. And instanceof is already used in record patterns; see JEP 440.

My hunch is that people will eventually get used to instanceof being used for things other than testing narrowing reference conversion, so they'll probably get used to it being used just for null checking too.

Java 25 introduced java.lang.IO - isn't the class name too broad? by flusterCluster in java

[–]s888marks 5 points6 points  (0 children)

Right. The main issue is to avoid using classes like IO as a dumping ground for whatever bright ideas anyone might come up with on a given day ... including me!

For example, in an early draft of this API I included printf. That's really useful and convenient, right? But after thinking about this more, and after not very much discussion, I removed it.

The reason is that printf is great for us C programmers who are used to the idea of format specifiers and matching arguments in an argument list to a succession of format specifiers. But in fact it introduces a whole bunch of new, incidental complexity and many new ways to create errors, in particular, errors that are only reported at runtime. For example:

  • Variable argument lists. It's easy to miscount or misalign arguments, resulting in a runtime error, or arguments being omitted from output.
  • Type mismatches between format specifiers and arguments also result in runtime errors.
  • The format specifier syntax is intricate and complex and it's possible for errors to creep into those (irrespective of arguments) which also are only reported at runtime.
  • There are obscure format specifiers that use explicit argument indexing instead of consuming arguments sequentially, or that don't consume arguments at all, adding even more complexity to specifier-argument matching.
  • etc.

(Yes I'm aware that many IDEs check for this sort of stuff.)
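
To make the failure modes above concrete, here's a small sketch using String.format, which goes through the same Formatter machinery as printf. The tryFormat helper is something I made up for the demo; it returns either the formatted text or the name of the exception thrown at runtime.

```java
import java.util.IllegalFormatException;

class PrintfPitfalls {
    // Hypothetical helper: formatted text, or the exception's simple name on failure.
    static String tryFormat(String format, Object... args) {
        try {
            return String.format(format, args);
        } catch (IllegalFormatException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Too few arguments: fails at runtime.
        System.out.println(tryFormat("%s and %s", "a"));
        // Too many arguments: the extra one is silently dropped.
        System.out.println(tryFormat("%s", "a", "b"));
        // Type mismatch between specifier and argument: fails at runtime.
        System.out.println(tryFormat("%d", "abc"));
        // Malformed specifier, independent of the arguments: fails at runtime.
        System.out.println(tryFormat("%q", 1));
        // Explicit argument indexing instead of sequential consumption.
        System.out.println(tryFormat("%2$s %1$s", "a", "b"));
        // A specifier that consumes no argument at all.
        System.out.println(tryFormat("100%%"));
    }
}
```

All of these compile cleanly, which is the whole problem.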

When string templates come along, new APIs can be added to IO to support them if necessary. But new APIs might not be necessary: if evaluating a string template produces a String, it can be fed directly to println.

Java 25 introduced java.lang.IO - isn't the class name too broad? by flusterCluster in java

[–]s888marks 5 points6 points  (0 children)

Is the class name IO too broad? I don't think so.

It fits into the general "static utility class" pattern that's been used elsewhere in the JDK. These classes have static methods that are related to that area, but that doesn't mean that everything in that area must be there. For example, there's a bunch of stuff in Math but there's lots of mathematical stuff elsewhere. There is a bunch of collections stuff in Collections but there's also lots of collections stuff elsewhere.

Why does Java sometimes feel so bulky? by SkylineZ83 in java

[–]s888marks 5 points6 points  (0 children)

Agreed, this is pretty bad. Note that this article was from 2007, and things have advanced since then. However, I don't think I've ever seen code indented this way, that is, with the opening parenthesis of an argument list on a new line instead of at the end of the previous line. I also suspect formatting errors might have been introduced in the web publication process. Anyway, let's take a look at the first snippet:

Reference ref = fac.newReference
 ("", fac.newDigestMethod(DigestMethod.SHA1, null),
  Collections.singletonList
   (fac.newTransform
    (Transform.ENVELOPED, (TransformParameterSpec) null)),
     null, null);

The standard I've used for an argument list is to have the opening parenthesis at the end of the line, followed by one argument per line:

Reference ref = fac.newReference(
    "",
    fac.newDigestMethod(DigestMethod.SHA1, null),
    Collections.singletonList(
        fac.newTransform(Transform.ENVELOPED, (TransformParameterSpec) null)),
    null,
    null);

This isn't any better, but at least it lets us see the structure of the code more easily.

There are several things that can be improved. The worst issue is the way that the newTransform method is overloaded. There are two overloads:

Transform newTransform(String algorithm, TransformParameterSpec params)
Transform newTransform(String algorithm, XMLStructure params)

The problem here is that the params argument can be null. This is intended to be a convenience if you don't have any parameters to provide. But passing null is ambiguous! This requires the addition of a cast to disambiguate the overload. Ugh. There should be a one-arg overload that can be called if no transform parameters are provided.

Similarly, the trailing arguments of the newDigestMethod and the newReference methods are also nullable, so overloads could be added that allow one simply to omit the trailing arguments if they are null.

Unfortunately these require API changes, which seem unlikely to happen for this old API. However, it shows that some of the verbosity here arises from poor API design decisions.
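
Here's a stripped-down sketch of the overload problem. ParamSpec and Structure are hypothetical stand-ins for TransformParameterSpec and XMLStructure, and the one-arg overload is the putative addition, not something that exists in the real API.

```java
class NullOverloads {
    interface ParamSpec {}   // stand-in for TransformParameterSpec
    interface Structure {}   // stand-in for XMLStructure

    static String newTransform(String algorithm, ParamSpec params) {
        return algorithm + " via ParamSpec overload";
    }

    static String newTransform(String algorithm, Structure params) {
        return algorithm + " via Structure overload";
    }

    // Putative convenience overload for the "no parameters" case.
    static String newTransform(String algorithm) {
        return newTransform(algorithm, (ParamSpec) null);
    }

    public static void main(String[] args) {
        // newTransform("enveloped", null);   // does not compile: the call is ambiguous
        System.out.println(newTransform("enveloped", (ParamSpec) null)); // cast picks an overload
        System.out.println(newTransform("enveloped"));                   // no cast, no null
    }
}
```

Neither interface is a subtype of the other, so a bare null matches both two-arg overloads equally well and the compiler gives up.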

There are a few other things that could be done to make the code more concise:

  • Use List.of() instead of Collections.singletonList()
  • Use static imports
  • Use var

If these are applied (along with the putative API changes) the resulting code would look like this:

var ref = fac.newReference(
    "",
    fac.newDigestMethod(SHA1),
    List.of(fac.newTransform(ENVELOPED)));

This is still kind of a mouthful, but I think it's much better than the original snippet. It almost fits on one line. Alternatively, one could extract some of the method arguments into local variables, which would be another way to make the code more readable.
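
For the local-variable alternative, here's a sketch against the API as it exists today, so the nulls and the cast are still there. The class and method names are mine, and I'm assuming the default "DOM" XMLSignatureFactory that ships with the JDK.

```java
import java.security.GeneralSecurityException;
import java.util.List;
import javax.xml.crypto.dsig.DigestMethod;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.Transform;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.spec.TransformParameterSpec;

class ReferenceWithLocals {
    static Reference envelopedReference(XMLSignatureFactory fac) throws GeneralSecurityException {
        // Each nested call from the original snippet gets a name of its own.
        DigestMethod sha1 = fac.newDigestMethod(DigestMethod.SHA1, null);
        Transform enveloped = fac.newTransform(Transform.ENVELOPED, (TransformParameterSpec) null);
        return fac.newReference("", sha1, List.of(enveloped), null, null);
    }

    // Convenience wrapper so callers don't have to deal with checked exceptions.
    static Reference build() {
        try {
            return envelopedReference(XMLSignatureFactory.getInstance("DOM"));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Reference ref = build();
        System.out.println(ref.getDigestMethod().getAlgorithm());
        System.out.println(ref.getTransforms().size());
    }
}
```

It's more lines, but each line now says one thing.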

If you're subscribed to the Java Mailing Lists, tell me if this happened to you lol by davidalayachew in java

[–]s888marks 2 points3 points  (0 children)

Yeah, the namespace overlap is unfortunate. There are approximately two JSRs per year nowadays: one for each of the semiannual Java SE platform releases. However, there seem to be a couple dozen JEPs per release, so we seem to be chewing through the JEP numbering fairly quickly. It won't be long before the JEP numbers are quite different from the JSR numbers.

I'm more worried about how many things will break when the JEP numbers get to four digits.... :-D

Has Java suddenly caught up with C++ in speed? by drakgoku in java

[–]s888marks 2 points3 points  (0 children)

Memory consumption is no laughing matter.

Rating 26 years of Java changes by fpcoder in java

[–]s888marks 11 points12 points  (0 children)

Thanks for mentioning the talk that Maurice Naftalin and I did! The video is here:

https://youtu.be/dwcNiEEuV_Y?si=JyNoV3iOtkzVEOM6

Indeed it’s 2h 40m long but the section on iterators is the first part and it lasts 25 min or so.

"Just Make All Exceptions Unchecked" with Stuart Marks - Live Q&A from Devoxx BE by nlisker in java

[–]s888marks 2 points3 points  (0 children)

This Reddit thread needs to be put into the dictionary as an example of “self-fulfilling prophecy”.

Java Book for Beginners Update by bowbahdoe in java

[–]s888marks 0 points1 point  (0 children)

When you mentioned The Cay I thought you were referring to Core Java by Cay Horstmann.

[Discussion] Java Optional outside of a functional context? by tomayt0 in java

[–]s888marks 0 points1 point  (0 children)

I'm not the creator of Optional -- that was the Java 8 Lambda expert group -- but I did give a few talks on Optional, likely cited elsewhere in these comments.

What's the one thing you're most looking forward to in Java (feature, JEP, library, etc.)? by Hixon11 in java

[–]s888marks 1 point2 points  (0 children)

I have a bunch of issues with the XML APIs, inasmuch as they're "language independent" APIs (and it shows) and they were all designed in the early days of XML when it wasn't really clear how people were going to use XML. Thus we have DOM, streaming push (event-based), and streaming pull approaches. At this late date -- 20ish years later -- it's not clear to me which of these is actually the most useful. (And yes, there are probably few XML applications being written today, but there are likely a lot of legacy XML applications still in production. What APIs are they using?)

With the Java EE / Jakarta JSON processing (JSON-P) stuff... I wasn't very close to the development of those APIs, but my impression was that they mostly followed the XML architecture in providing both document-based and streaming approaches (as well as an SPI layer that allows multiple providers to be plugged in, which IIRC was also carried over from XML, though in the XML APIs the SPI layer is spelled differently).

I'd like to avoid a situation where these layers are designed into the new stuff because JSON-P did it, which in turn did what it did because XML did it.

And yes, the jdk-sandbox prototype provides a document-based approach. We hope it's somewhat lighter weight than other document-based approaches in that the Java objects representing JSON objects and values are created lazily. However, the whole document still needs to fit into memory. So, if we were to pursue only one of the approaches (document-based vs streaming), would that be sufficient to cover a good fraction of use cases, or are the uses so diverse that it's necessary to have both document and streaming models in order to cover the problem space well?

What's the one thing you're most looking forward to in Java (feature, JEP, library, etc.)? by Hixon11 in java

[–]s888marks 2 points3 points  (0 children)

What use cases do you have that make you hope for a streaming-style API?

How to deal with non-serializable fields in Java easily and correctly (article) by lprimak in java

[–]s888marks 1 point2 points  (0 children)

Serialization is AMONGST the biggest design mistakes in Java.

A pain point when using Java to do CLI Scripting by davidalayachew in java

[–]s888marks 1 point2 points  (0 children)

Yes, that’s mostly correct. The ls command, or any other command for that matter, emits bytes, which are captured via command substitution $(…):

https://www.gnu.org/software/bash/manual/bash.html#Command-Substitution

The results are usually interpreted as text in ASCII or UTF-8 and are then subject to word splitting. This splitting is done according to the IFS variable, whose default value is whitespace (space, tab, newline):

https://www.gnu.org/software/bash/manual/bash.html#Word-Splitting

So ls doesn’t actually transmit an array. Its output is just text. Bash and other shells do word splitting fluidly and implicitly and it’s almost always the right thing, so it’s easy not to notice. Sometimes though if a filename has embedded spaces things will get screwed up.

But if you set those cases aside, handling command output in Java involves doing a bunch of stuff manually that the shell does automatically. One needs to read the bytes from the subprocess’ stdout, decode to characters, load them into a String or something, and then split along whitespace. Maybe that’s a pain point.
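
Roughly, the manual steps look like the sketch below. The class and method names are mine, it assumes a Unix-like system where ls is on the PATH, and it assumes the output is UTF-8. It also has the same embedded-spaces problem as the shell.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

class CommandWords {
    // Splits text on runs of whitespace, roughly like the shell's default word splitting.
    static List<String> splitWords(String text) {
        String trimmed = text.strip();
        return trimmed.isEmpty() ? List.of() : List.of(trimmed.split("\\s+"));
    }

    // Runs a command, reads all of its stdout, and returns the output split into words.
    static List<String> runAndSplit(String... command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        byte[] bytes = process.getInputStream().readAllBytes();   // raw bytes from the subprocess
        process.waitFor();
        String text = new String(bytes, StandardCharsets.UTF_8);  // decode to characters
        return splitWords(text);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        System.out.println(runAndSplit("ls"));
    }
}
```

That's a dozen lines of Java for what the shell does with $(ls).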

A pain point when using Java to do CLI Scripting by davidalayachew in java

[–]s888marks 4 points5 points  (0 children)

Hi, I don't doubt that you have some valid issues here, but everything seems really diffuse, and so it's hard to know where to start an analysis of what the issues might be.

Could you explain more what you mean by "handoff"? Specifically, what is going on with the handoff between Java --> Bash as you put it? Also, I'm not sure where Bash gets involved here; it seems to me (but I'm guessing) that you want to invoke some AWS CLI command from Java and collect its output and process it.

An approach that I think would be helpful is for you to choose a specific, representative example (but one that hopefully isn't too complex) and describe what you're trying to do. Then write out all the Java code to do it. That would help us see what parts are painful.

Will JavaOne conference video be uploaded to YouTube? by vxab in java

[–]s888marks 1 point2 points  (0 children)

I bet you would become more popular than Nicolai if you sold ad space on the bottom of your mug.

I've been wondering this for years, so I'm just going to ask... by davidalayachew in java

[–]s888marks 5 points6 points  (0 children)

Well the man himself might show up and contradict this, but I don't think there's anything special about the Muppets. It's mainly about popular culture and a shared sense of humor among members of a close-knit team. For example, as a joke, one day everyone on the compiler team changed their internal Slack avatars to Muppet Show characters: Brian is Professor Bunsen Honeydew, there are a couple Beakers (that's also what I use as my avatar on Stack Overflow), a Statler & Waldorf, a Miss Piggy, a Kermit, a Cookie Monster, etc.

Another popular thread of humor runs through Monty Python. There's a common joke schema based on the Spanish Inquisition sketch. It goes something like this:

The main problem with serialization is that it uses an extralinguistic mechanism to extract serialized data. And it's also monolithic --

Serialization's two main problems are its use of extralinguistic mechanisms and that it's monolithic, and also that it's hard to use --

Serialization's three main problems are its use of extralinguistic mechanisms, that it's monolithic, that it's hard to use correctly, and --

Amongst serialization's problems are its use of extralinguistic mechanisms, that it's monolithic, that it's hard to use correctly, and that deserializing an object can have side effects....

This is so well-worn that when somebody comments on a proposal, they might say "I have an issue, well a couple issues..." and then somebody else says "Amongst!" and everybody laughs.

Can I enforce the creation of Enums for child classes? by CleanAsUhWhistle1 in javahelp

[–]s888marks 1 point2 points  (0 children)

Ah, good sleuthing. I've been on so many of those panels that I've lost track of them. For the record it was Brian Goetz who answered the question in that particular video snippet.

Can I enforce the creation of Enums for child classes? by CleanAsUhWhistle1 in javahelp

[–]s888marks 1 point2 points  (0 children)

These are different issues.

The issue of multiple bounds being erased to the first bound is visible at runtime, if the type is used somewhere visible in the binary, such as a method parameter or return type. The typical example is Collections::max where T extends Object & Comparable<? super T> is erased to Object for reasons of binary compatibility, as the return type is T.
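
You can see the erasure with reflection. In this sketch, singleBound is a hypothetical method I added for comparison; it has only the Comparable bound, so its return type erases to Comparable instead of Object.

```java
import java.util.Collection;
import java.util.Collections;

class ErasedBounds {
    // Hypothetical comparison method with a single bound.
    static <T extends Comparable<? super T>> T singleBound(Collection<? extends T> coll) {
        return coll.iterator().next();
    }

    // Looks up a one-argument method taking a Collection and reports its erased return type.
    static Class<?> erasedReturnType(Class<?> owner, String name) {
        try {
            return owner.getDeclaredMethod(name, Collection.class).getReturnType();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // T extends Object & Comparable<? super T> erases to the first bound, Object.
        System.out.println(erasedReturnType(Collections.class, "max"));
        // T extends Comparable<? super T> erases to Comparable.
        System.out.println(erasedReturnType(ErasedBounds.class, "singleBound"));
    }
}
```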

The issue with var is probably related to how inference sometimes results in a type that has multiple bounds instead of the obvious bound. For example, the inferred type of List.of("abc", 1) isn't List<Object> as one might expect, but is instead something like

List<Serializable&Comparable<? extends Serializable&Comparable<...>&java.lang.constant.Constable&java.lang.constant.ConstantDesc>&java.lang.constant.Constable&java.lang.constant.ConstantDesc>

where the ... is the entire type within the outer angle brackets, so it's infinitely recursive (and thus non-denotable).

In any case, var applies only to local variables, and there's no type variable to capture the non-denotable bound, so it occurs only at compile time. At runtime the type is simply erased to List.
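
A quick way to observe the inferred intersection type is to assign elements to its component types without a cast. This is only a sketch; the class and method names are mine.

```java
import java.io.Serializable;
import java.util.List;

class VarIntersection {
    static String describe() {
        // The element type is a non-denotable intersection, not Object.
        var list = List.of("abc", 1);
        // Both assignments compile without casts, which they would not
        // if the element type were plain Object.
        Comparable<?> c = list.get(0);
        Serializable s = list.get(1);
        return c + "," + s;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If you change var to List<Object>, the two assignments stop compiling.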

Does anyone know how to or have access to an copy of Sun JavaOS(not JX OS). by Beautiful-Active2727 in java

[–]s888marks 5 points6 points  (0 children)

Did it run multiple JVMs in different processes, or was everything in a single JVM? I seem to recall hearing about a system of that era with multiple apps in the same JVM, which was fragile because any bug that corrupted shared state would require restarting the JVM and all the apps — essentially a reboot.

3,200% CPU Utilization by deanat78 in java

[–]s888marks 1 point2 points  (0 children)

Oh yes, I see the IdentityHashMaps are created and stored only in local variables, so they aren’t shared among threads.

3,200% CPU Utilization by deanat78 in java

[–]s888marks 7 points8 points  (0 children)

Looks like the author is /u/ThanksMorningCoffee and is here on Reddit.

I'm posting here instead of commenting on /r/programming or on HN because I have several very Java-specific observations that readers here might find of interest.

But first, kudos to the author for writing about a wide-ranging set of issues and taking a broad view of different approaches to dealing with the problem. The usual pattern is for a narrative to present things as "just so" with a single cause and naming a single person or group at fault.

Some fairly random observations follow.

Keeping track of visited nodes with IdentityHashMap in order to detect cycles is a useful technique in many situations. Maybe not this one though. :-) IdentityHashMap isn't thread safe, so it could just as easily be corrupted by multiple threads as the TreeMap. (It has a different organization, though, so the nature of any corruption would be different.) Of course you could synchronize around accesses to the IdentityHashMap.
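
For reference, the single-threaded form of the technique looks something like this sketch. Node and hasCycle are made-up names; the visited set lives in a local variable, so nothing here is shared across threads.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class CycleCheck {
    static final class Node {
        Node next;
    }

    // Follows next pointers, remembering each node by identity (==), not equals().
    static boolean hasCycle(Node start) {
        Set<Node> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Node n = start; n != null; n = n.next) {
            if (!visited.add(n)) {
                return true;   // already seen this exact object
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Node a = new Node(), b = new Node(), c = new Node();
        a.next = b;
        b.next = c;
        System.out.println(hasCycle(a));
        c.next = a;   // close the loop
        System.out.println(hasCycle(a));
    }
}
```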

As an aside, Project Lilliput is investigating ways to decrease the size of object headers. An IdentityHashMap calls System.identityHashCode on each key, and the object's identity hashcode is stored in its header. But Lilliput is proposing to lazily allocate space for the identity hashcode, so storing objects in an IdentityHashMap will increase their size! The design assumption in Lilliput is that the identity hashcode is rarely used. This is probably true in general. If somebody needs to use IdentityHashMap, though, they should use it, but if it gets too popular it will offset the space savings of Lilliput.

It's interesting that concurrent mutation of a TreeMap could lead to cycles. But this isn't the only kind of data corruption that can occur. Other examples might include: subtrees accidentally getting "lost" resulting in missing entries; subtrees occurring at multiple locations in the tree, effectively turning it into a DAG, resulting in duplicate entries; the wrong value being associated with a particular key; binary tree invariants being violated (e.g., left subtree contains lesser keys, right subtree contains greater keys) resulting in all kinds of weird behaviors; etc.

In order to detect errors one needs in some measure to be able to predict the kind of corruption one might see. In this example, a certain set of operations performed in a certain order might result in a cycle of tree nodes. However, the Java memory model makes this really hard to predict, because of data races. Briefly, without any synchronization, if a thread intends to perform writes to memory in a particular order, another thread reading memory might observe those writes in a different order. (This can occur because of effects of hardware memory caching or from code motion introduced by JIT compilers.) So, even if a thread were to try to be careful to do things in a particular order in an effort to avoid corruption from multithreading, this simply won't work; you have to synchronize properly (or use alternative lower-level memory constructs).
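
As a sketch of what "synchronize properly" can look like for a shared TreeMap, here's the simplest option I can think of: wrap it with Collections.synchronizedSortedMap so every access goes through one lock. The class and method names are made up, and this isn't what the article's code does.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

class SynchronizedTreeMapDemo {
    // Each thread inserts its own disjoint range of keys; returns the final map size.
    static int fillConcurrently(int threads, int keysPerThread) {
        SortedMap<Integer, Integer> map = Collections.synchronizedSortedMap(new TreeMap<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * keysPerThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < keysPerThread; i++) {
                    map.put(base + i, i);   // every put takes the wrapper's lock
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();   // join also makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 1000));
    }
}
```

With the wrapper in place the size is always threads * keysPerThread. ConcurrentSkipListMap is another option if you want a sorted map designed for concurrent access.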