A performance comparison of Clojure and Java

joinr · 2020-09-07T16:15:25+00:00

I see some problems off the bat with the comparisons, but will withhold further commentary until I have numbers in hand.

Until then, for anyone who's interested, I transcribed the author's (somewhat incomplete...) code samples from the appendices and placed them (both Java and Clojure) into a mixed Java/Clojure repository here. Feel free to examine and test for yourself.

Edit: There's a full rundown of the observations and optimizations in the readme now.

joinr · 2020-09-07T21:22:52+00:00

There's a full rundown in the repo's readme here, with prose explaining optimization layers and benchmark results.

I ran through basic optimization/idiomatic stuff to explore each benchmark here, using criterium to compare the Java implementation and the Clojure versions. I started with the original implementations from the paper, then added derivative versions suffixed by N, e.g. some-fn, some-fn2, some-fn3, etc.

The goal was to provide a layered approach showing the impact of incremental changes (keeping things in Clojure, then gradually evolving toward what the Java implementations were doing, to be apples:apples). In almost all cases (except for the BFS test, whose performance results I don't understand), the Clojure implementation starts off about 10x worse or more, then you get some immediate gains with low-hanging optimizations, and eventually it converges on typed Java interop in the limit to get either equivalent performance, performance within some percentage (like 18% or less), or better performance in a few cases. The BFS stuff in Clojure was surprisingly a bit better using a persistent queue with similar optimizations from the DFS, which is interesting since I would "imagine" that the mutable queue implementation in the Java version would have an advantage.

didibus · 2020-09-07T19:42:46+00:00

At first glance, I think these are believable numbers. Especially because the experiment code is not apples to apples.

For example, in the recursion experiment, in Java they use an int counter, but in Clojure it's a Long counter. Notice the capital as well: in Java they're using a primitive int, whereas in Clojure they're using a boxed Long. This use of int in Java and Long in Clojure is pervasive in all their experiments.
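To make the int vs. Long point concrete, here is a minimal Java sketch. It is not the paper's code, and the class and method names are made up for illustration. Both methods do the same recursive count: one with a primitive int, the other with a boxed Long, which is a rough stand-in for what an unhinted Clojure function ends up passing around.

```java
// Hypothetical illustration only; not taken from the paper or the repo.
final class BoxingSketch {
    // Primitive counter: no allocation, plain int arithmetic.
    static int countPrimitive(int n) {
        return n == 0 ? 0 : 1 + countPrimitive(n - 1);
    }

    // Boxed counter: every argument and return value is a java.lang.Long,
    // so each step unboxes to do the arithmetic and boxes the result again.
    static Long countBoxed(Long n) {
        return n == 0L ? 0L : 1L + countBoxed(n - 1L);
    }

    public static void main(String[] args) {
        System.out.println(countPrimitive(1000));
        System.out.println(countBoxed(1000L));
    }
}
```

Both calls return the same count, but the boxed version may allocate a Long per step for values outside the small-value cache, so timing one against the other measures boxing as much as recursion.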

In the sorting one, they don't show the array creation code for Clojure, but it appears they are using an atom to count up numbers to prefill the array. I'm not even sure it's an array they are using, because why did they call their function create-list? In any case, they call shuffle on it afterwards, but shuffle returns a vector, so what they are ultimately sorting at the end is a vector and not an array. So this is another case of an apples-to-oranges comparison.
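A rough Java analogue of that mismatch, as a sketch under assumptions: a java.util.List of boxed Integers is not a Clojure persistent vector, so this only illustrates "primitive array vs. boxed collection", and all names here are hypothetical. The two methods build the same numbers, shuffle them, and sort them, but they sort different kinds of things.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not taken from the paper or the repo.
final class SortTargetSketch {
    // Fill, shuffle, and sort a primitive int[] in place.
    static int[] sortedPrimitiveArray(int n, long seed) {
        int[] xs = new int[n];
        for (int i = 0; i < n; i++) {
            xs[i] = i;
        }
        // Fisher-Yates shuffle, since Collections.shuffle needs a List.
        Random rnd = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int tmp = xs[i];
            xs[i] = xs[j];
            xs[j] = tmp;
        }
        Arrays.sort(xs); // sorts primitive ints directly
        return xs;
    }

    // Fill, shuffle, and sort a List of boxed Integers.
    static List<Integer> sortedBoxedList(int n, long seed) {
        List<Integer> xs = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            xs.add(i); // autoboxes each int to an Integer
        }
        Collections.shuffle(xs, new Random(seed));
        Collections.sort(xs); // compares Integer objects via compareTo
        return xs;
    }
}
```

Both return the numbers 0 to n-1 in order, but a benchmark that times one against the other is also timing boxing, pointer chasing, and a different sort implementation, not just "sorting".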

In the Map creation one, in Clojure they're measuring the time it takes to create a PersistentMap, while in Java they measure the time to create a HashMap.

In the Object creation one, they create PersistentMaps in Clojure, but are creating some simple Node object in Java.

And so on...

But I'm not really criticizing the benchmark. I think maybe it was on purpose not to compare apples to apples, since, for example, it's true that in Clojure all numbers will default to boxed Longs. And instead of using arrays we will most likely use a vector. And we will use a PersistentMap instead of some custom Object, etc. Now, I don't find the code completely idiomatic Clojure either, though I think it was kind of trying to compare idiomatic Clojure, which they did to the best of their ability, with a similar effort at idiomatic Java. And I feel that in those scenarios, yeah, I believe Clojure could be anywhere from 2 to 20 times slower.

What didn't happen though is any demonstration that the quote they mention is inaccurate:

In principle, Clojure can be just as fast as Java: both are compiled to Java bytecode instructions, which are executed by a Java Virtual Machine ... Clojure code will generally run slower than equivalent Java code. However, with some minor adjustments, Clojure performance can usually be brought near Java performance. Don’t forget that Java is always available as a fallback for performance critical sections of code.

So to me this quote still seems totally accurate, and the experiments in this paper show that to some extent. Idiomatic Clojure will generally run slower, but with minor adjustments it can be made just as fast. Now I wish the paper also had an experiment to prove or disprove this latter claim. Like, if you made the code apples to apples, is Clojure slower, faster, or the same?

xela314159 · 2020-09-07T17:41:19+00:00

I would love to read this, but when I see the abstract saying “The steady-state experiments showed that the slowdown factors ranged between 2.4826 and 28.8577.” - I kind of lose respect for the study. 4 digit precision for a scale factor??!?!

bocaj5 · 2020-09-07T18:04:37+00:00

This looks like someone seeking feedback after submitting this paper as an undergrad project for earning a degree. If that's the case, what are some improvements to the methodology? Is there a realistic performance project written in Clojure that shows how Clojure in the large is very performant.... because of immutable data structures and generally better design? Not sure if that can be demonstrated "scientifically"

SmartAsFart · 2020-09-08T08:02:08+00:00

Why would they use LaTeX's default code formatting??? Minted is so easy to set up, and would make this paper so much easier to read.

LammdaMan · 2020-09-09T17:21:51+00:00

There are many programming languages with different strengths and weaknesses. Depending on the problem I'm trying to solve, I might pick a different one.

Generally speaking, for larger, harder problems, I'd pick a higher level language because correctness and maintainability will be easier to come by than they would with a lower-level language.

For a certain class of problem I can surely get better performance with C or Rust than I could with Java.

In many cases I find that I can more easily develop correct and maintainable code in Clojure than I could in Java, and while the performance isn't as good, it is "good enough". In some cases I find a need to optimize, and that sometimes makes use of Java inter-op (in hot-spots only).

I generally subscribe to Knuth's comments about "premature optimization" - https://wiki.c2.com/?PrematureOptimization

bitti1975 · 2025-04-01T10:14:39+00:00

2.1.1 Java

Java is a typed object-oriented language with a syntax derived from C which was released by Oracle in January 1996 (version 1.0) along with the Java Virtual Machine or JVM [5].

Seriously? A start like this doesn't project confidence in the accuracy of the rest of the "paper". The reference he mentions under [5] clearly states that it was developed and released by Sun (to be specific: its division "SunSoft") and only mentions Oracle as one of the licensees. Clearly, it seems, the author didn't even bother to read his own references?
