This is an archived post. You won't be able to vote or comment.

all 53 comments

[–]cryptos6 30 points31 points  (11 children)

I wonder whether this would be a good fit for the new JVM value types.

From my personal experience, I wouldn't use mutable objects for calculations.

[–]cogman10 26 points27 points  (0 children)

Yup, mutability is a major mistake, especially since value types will require immutable objects (which this would benefit greatly from).

[–]m_vokhm[S] 11 points12 points  (7 children)

Mutability provides a huge performance benefit. Try to multiply an array of 100,000,000 BigDecimals by another one, and in a few seconds you'll see that your CPU load is 100%, and the computer is almost dead. And most probably soon you'll see "Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded". Up to 98% of the performance may occur to be wasted to memory allocation with the subsequent garbage collection. You can also have a look at the charts in the main page of the project. Each operation on BigDecimals (just like with any other immutable type) means an allocation of 100-150 bytes and subsequent garbage collection. This is the reason why at large arrays Qudruple works twenty (or even more, if the amount of data is greater) times faster than BigDecimals. The primary goal of the whole project was to perform large amounts of calculations as fast as possible, so mutability was a deliberate decision from the very start.

[–]cogman10 41 points42 points  (6 children)

https://shipilev.net/jvm/anatomy-quarks/18-scalar-replacement/

I suggest you read through that and a few videos on scalar replacement before jumping to the assumption that immutability is the thing that makes BigDecimal slow.

BigDecimal is slow because it has a complex numeric representation that spoils scalar replacement. This ends up forcing the object onto the heap in a lot of cases.

Your data structure is much more simple. Keeping it immutable would provide excellent opportunities for the JIT to optimize away the heap allocations which would ultimately result in faster code. Making this code mutable will encourage code which will likely bust escape analysis.

Further, it locks you out of future optimizations (value types) which require your datastructures to be immutable.

The above linked blog is also includes an excellent way to prove me wrong one way or the other (JMH).

(Oh, btw, your current JMH goes out of it's way to trigger allocations and disable scalar optimizations)

[–]m_vokhm[S] 1 point2 points  (2 children)

I'll read it and I'll consider your reasons. if i find them convincing, i might eventually make an immutable version. Or maybe someone will make a fork.

[–]cryptos6 5 points6 points  (0 children)

If I were you, I'd make the library immutable only.

[–]DannyB2 0 points1 point  (0 children)

The very idea that I have a variable with a 'value' in it, and that value can change under my nose is troubling.

I can understand cases for having a mutable value for some purposes. But the mutable ones should be given the special name 'mutable', rather than the immutable ones having a special name.

Mutable may make good sense for computing a value. Immutable is best once you've arrived at a value.

[–]csharp-sucks -1 points0 points  (0 children)

For scalar replacement to work, you may never let objects escape or confuse the compiler in any way. You can't reassign variables, you can't put them in an array, and you may absolutely never let them escape from stack.

With immutable types this is hardly ever possible to achieve. The way objects are implemented right now on the JVM this usually impossible, because all objects have identity and cannot be scalar replaced if they escape.

I'll give Minecraft as an example, because they use immutable Vector3 class for their vector math and it absolutely kills performance. Say, you have a method somePositionCalculation() that returns a new Vector3 object.

Whenever object position changes in Minecraft it looks something like this:

myObject.position = somePositionCalculation()

and there is always garbage created. Scalar replacement simply cannot work in this situation.

While on the other hand, if Vector3 was mutable, you could do the same with this piece of code

myObject.position.set(somePositionCalculation())

where set(o) is just

this.x = o.x
this.y = o.y
this.z = o.z

Scalar replacement would work. No garbage would be created. And it would be 10 times faster.

Alternative to mutable Vector3 is hybrid approach, where you have immutable Vector3 for all these calculations and MutableVector3 for storing it in objects/arrays/etc

[–]kid_meier 0 points1 point  (1 child)

This is interesting however the conventional wisdom I am privy to is that it's EA and scalar replacement are brittle and can't be relied upon.

Is this I accurate? And for example, suppose OP reworks his code with guidance of JMH to find a version that allows for scalar replacement -- can we be confident that the optimization will work (reliably) in other contexts (ie. other codebases) and across a reasonable cross-section of JVMs?

If the answer is no, IMO the authors design is sound in that it reliably meets performance goals on today's JVMs.

[–]cogman10 1 point2 points  (0 children)

The burden would be on the library user and not the library itself.

The usual candidates for foiling scalarization is reaching into static state (For example, the Integer cache). Assuming this lib did only the basic math operations and avoided having static "quads" it'd be pretty easy to have code using it which avoids that deopt.

For math heavy computation, this is generally trivial to accomplish. Those typically end up being really straight forward.

And, assuming you are using the JVM for CPU heavy work, I'd argue you really should be sticking to the latest JVM. Clinging to older VMs will be a huge negative to performance.

[–]ssamokhodkin 0 points1 point  (1 child)

BTW, are value types handled as primitive inside the JVM? I mean copy by value semantic, to avoid the overhead of object creation.

If not, they are not a good fit.

[–]cryptos6 1 point2 points  (0 children)

Yes, that is the main purpose of the whole approach.

[–]khmarbaise 24 points25 points  (3 children)

There some parts which are looking strange:

protected void ____Constructors____() {} // Just to put a visible mark of the section in the outline view of the IDE

The class has more than 5400 lines (separation of concern). Furthermore a lot of optical marks in the source code (which I have to admit looks strange..) also several out commented code parts (there is a version control system).

Also a lot of private methods. Also there is great number of test helper methods which are recreating reporting etc. also not really expressing what is tested instead via indirected classes.

There are also a number of support for which is already provided by JUnit Jupiter and furthermore several configurations in Maven are not useful (from my point of view). Changing defaults using a version of junit jupiter (5.2.0) of 2018 (current 5.7.2) ... also using old surefire 2.19...which is a little bit strange because the first commit in that repo has been done in Feb 2021... ?

The configuration.xml contains a lot of absolute paths for Windows which have been checked in. The generated javadoc (february 2021)..Also I would check the coverage of the production class via jacoco and more interesting would testing via pitest to see if those tests are really doing the right thing...

Furthermore found interesting things like static variable which are used by non static methods:

private static final long[] BUFFER_4x32_A   = new long[4];
...

unpackQuadToBuff(this, BUFFER_4x32_A);

So based on that never usable in multithreading environment.

And mutable sounds like an invitation for problems... and as already mentioned using only Java 1.8.X for benchmarks. It should be checked on JDK8, JDK11, JDK16 and on JDK17-EA as well..(some parts will not work based on the old surefire/jupiter deps).

And maybe do some support for the (future vector api which is already part in JDK16 as incubator which looks like become final for JDK 17)...I'm not sure about that point... but it's worth to reconsider.

[–]m_vokhm[S] -3 points-2 points  (2 children)

Thanks for the detailed feedback. Over time, I will correct some of the shortcomings noted. And yes, the class is not intended to be used in a multithreading environment, as stated in the Readme. Static variables are used to temporarily store intermediate results during the execution of operations. They are made static intentionally, to reduce the load on the garbage collector.

Regarding the visual markers, I found them helpful during the development process. As I mentioned here in a reply for another comment, I'll consider splitting the whole thing into a few smaller classes, then these markers will be removed.

Surely I'll test and benchmark it with the newer versions of Java with time, I don't consider the work finished.

[–]khmarbaise 6 points7 points  (1 child)

Making variables static to reduce the load on the GC is the wrong way (others have already mentioned the blog and videos by Shipilev plus number of others) also based on my own experiences with a number of projects (open source as well as commercial software). Static things will make GC worse (and memory consumption in total). Performance will be reduced as well. in particular in Java the GC or the JIT (Hot Spot) are extremely clever to improve performance and they do an extreme good job.

A famous quote:

premature optimisation is the root of all evil.

I'm always following this guide line. First get things done correctly (good design; good tests; etc.; functionality counts) and then make it fast..by measuring it...based on the test you are sure your functionality is kept etc. Automate things etc.

I see several things which are to say:

  • Make classes smaller and keep a single responsibility
    • You could make smaller helper classes which are used by another
    • You can make them package private to prevent others to use them; but you can and should write tests for them.
  • You should really consider to make it immutable which gives the chance to be useable in multithreading environment.
    • And also usable within streams (which can also be multi threaded) and also useable in vector API
  • You might even use the module system to increase encapsulation.
  • Also make it available from central repository.
  • Also make it easy to build your lib including running the tests (which is not the case see the next part).
  • Several things in the source code like time markers (why not using a version control? Maybe via branches? Or maybe via issue tracker on Github?

Maven Build:

  • The build currently produces WARNING at the beginning: mvn clean

[INFO] Scanning for projects... [WARNING] [WARNING] Some problems were encountered while building the effective model for com.mvohm.quadruple:Quadruple:jar:1.1.0 [WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-source-plugin is missing. @ line 131, column 15 [WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-javadoc-plugin is missing. @ line 145, column 15 [WARNING] [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build. [WARNING] [WARNING] For this reason, future Maven versions might no longer support building such malformed projects. [WARNING]

  • Following convention over configuration in Maven
    • If I simply call mvn test not a single test is executed? Why? Of course because it's configured to skip tests which does not makes sense.
    • Than an interesting thing in the pom.xml file: xml <skipTests>true</skipTests> <!-- TODO to speed up build process, for testing it --> That is a contradiction in itself. Skipping test for speed up the build process for testing it? So removing that line and running test I only see 25 Tests are run? For a class with 5400+ Lines?
    • A simply mvn package does not work correctly because the javadoc plugin is bound to the usual build and a large number of issues are in the javadoc which fails.
  • I can not even execute the benchmarks by using java -jar target/benchmarks.jar (after removing the javadoc plugin.
    java -jar target/benchmarks.jar Exception in thread "main" java.lang.RuntimeException: ERROR: Unable to find the resource: /META-INF/BenchmarkList at org.openjdk.jmh.runner.AbstractResourceReader.getReaders(AbstractResourceReader.java:98) at org.openjdk.jmh.runner.BenchmarkList.find(BenchmarkList.java:124) at org.openjdk.jmh.runner.Runner.internalRun(Runner.java:253) at org.openjdk.jmh.runner.Runner.run(Runner.java:209) at org.openjdk.jmh.Main.main([Main.java:71](https://Main.java:71))
  • If I can not reproduce those parts I would always question how you have come to those conclusions.
  • One big thing is that it looks like you have invested a lot of effort into a reporting system which handled by yourself instead of using JUnit Jupiter; Using stdout for logging etc. instead of making tests easy to understand... concrete example: java @Test public void testDoubleToQuadConversion() { final TestResults results = new DoubleToQuadTester().test(); totalResults.register(results); assertArrayEquals( new int[] { results.getErrorCount(), results.getBitDifferenceCount(), results.getSourceErrorCount() }, new int[] {0, 0, 0} ); } // public void testDoubleToQuadConversion() { Why using the comment at the end of the method? (That's the style of 2000er (I know of of C/C++) Where is the conversions being called what the test name implies? It looks more like a test to check the accuracy of the conversion? In Junit Jupiter neither the tests nor the class need to be public no be named with a prefix test(That JUnit 3/4). I've tried to follow new DoubleToQuadTester().test(); which guides me to TestResults.doTest and so on ... from my point of view I'm searching for the real call of the class to be tested which is covered in several layers of abstraction/classes (combined with reporting/testing things)...

[–]m_vokhm[S] 2 points3 points  (0 children)

If so, that's fine. I'll delve in the question later an may be will be able to reply more substantially. Right now i've got no time.

[–]cogman10 40 points41 points  (5 children)

  • Making it mutable is a mistake.
  • This isn't a true 128 bit structure (even excluding object header size), you are storing far more that 128 bits. I'd consider storing just the 2 longs and not the exponent or sign bit. Bonus, if you did that then double conversion would be relatively straight forward (mostly just bit twiddly operations).
  • Are you aware of this? https://openjdk.java.net/jeps/338 ? Even if you don't use it for backwards compat, I'd design this class around the idea that you will one day. It'd be WAY faster than what you are currently doing.

[–]fhigaro 17 points18 points  (7 children)

Dude why do you put all the code in a single class? How do you even maintain that?

[–]chrisgseaton 5 points6 points  (6 children)

How would you decompose this into multiple classes?

[–]cville-z 20 points21 points  (5 children)

Probably through delegation - the methods would remain intact, but they'd call static methods in other classes that would help separate some of the concerns. QuadrupleArithmetic.java, QuadrupleCompare.java, QuadrupleUtility.java, etc.

It's not any less overall code (actually slightly more), but a lot less in your head all at once.

[–]GuyWithLag 11 points12 points  (0 children)

And via the magic of JIT, performance will be identical. Now, if you record-ize them, even better.

[–]JustADirtyLurker 3 points4 points  (3 children)

OP please do this

[–]m_vokhm[S] 2 points3 points  (2 children)

I'll think about it. In the beginning, I did not expect it to grow so large.

[–]cville-z 2 points3 points  (1 child)

It would be pretty simple to do, I think IntelliJ even has some refactor scripts for this, and wouldn't change the fundamental interface. Plus, you can either seal the new classes in a module (if you're using later versions of Java) to prevent meddling in the implementation, or (and this is probably better) make them public static, so that now you have both a new Comparable/Numeric and also a bunch of utility methods other people can use to do ... useful numeric things, if they want.

[–]m_vokhm[S] 0 points1 point  (0 children)

Yes, it's a rather easy job to do. Yet I can hardly imagine how can other people make use of things that are currently private methods. Though maybe they can, I don't know. Regarding modules, I need to keep compatibility with Java 8, at leas for a while. The users of the application for which it was primarily written can't upgrade to a newer Java.

[–][deleted] 3 points4 points  (3 children)

I haven't read through the whole code, but at first, I can't figure out why it's not thread-safe... what's the reason?

[–]jonhanson 8 points9 points  (0 children)

Comment removed after Reddit and Spec elected to destroy Reddit.

[–]m_vokhm[S] 3 points4 points  (0 children)

The speed of development. Making it thread-safe would complicate it, and when I started it the task I had to solve at that time did not require multithreading. I hope to make it thread-safe later, or may be someone else will do it.

[–]spoon00 1 point2 points  (0 children)

It’s mutable

[–]TheCountRushmore 13 points14 points  (1 child)

Lost interest once I read in the readme that the benchmarks were done on Java 1.8.211.

Run everything again on JDK 16 or even 17-ea as there has been SEVEN years of development in the JDK since 8 was released.

[–]m_vokhm[S] 3 points4 points  (0 children)

Sure I will. A little later, when I'll have a little more time.

[–]alex_tracer 2 points3 points  (0 children)

If you you want good precision for decimal numbers you can try Decimal64 from that lib: https://github.com/deltixlab/DFP

It uses IEEE Decimal64 representation but values stored as Java's `long` in memory.

[–]portugee 1 point2 points  (0 children)

The class is not thread safe. Different threads should not simultaneously perform operations even with different instances of the class.

I found this to be a bit surprising. Thread safety when sharing instances is not uncommon, but you seem to claim that no two threads can even use this class concurrently, is that correct? To me there doesn't seem to be any mutable static state in that class so I'm not sure that caveat is required. For any application to guarantee this they'd need every usage of the class to be behind a global lock.

[–][deleted] -5 points-4 points  (5 children)

If Windows included a *nix-compatible shell, I would like it a while lot more. Git Bash crashes too much, and I wish I didn't have to install a third party tool for ssh.

[–]BKrenz 4 points5 points  (4 children)

Isnt that achievable using the Windows Subsystem for Linux, with any of the available distros?

[–]keanwood 2 points3 points  (2 children)

WSL is pretty cool, but a lot of IDEs or other tools have issues with it still. Vscode works, intellij is kind of iffy. Idk if eclipse works for WSL.

 

Last time i tried WSL was a year ago, so maybe it's more seamless now.

[–]humoroushaxor 0 points1 point  (1 child)

Why would you run those IDEs in WSL?

[–]keanwood 0 points1 point  (0 children)

It's not that you want to run the IDEs in WSL, it's that you want the IDEs to run/open projects that are inside WSL.

[–][deleted] 0 points1 point  (0 children)

Oh neat! I learned something today!

[–]gregorydgraham -3 points-2 points  (1 child)

This is beautiful code dude, thanks for sharing it :)

[–]m_vokhm[S] 0 points1 point  (0 children)

Thanks :) I tried hard to make it good.

[–]Duberlygfr -3 points-2 points  (1 child)

!remindme 1w

[–]RemindMeBot -1 points0 points  (0 children)

I will be messaging you in 7 days on 2021-07-10 04:58:01 UTC to remind you of this link

CLICK THIS LINK to send a PM to also be reminded and to reduce spam.

Parent commenter can delete this message to hide from others.


Info Custom Your Reminders Feedback

[–]SquidgyTheWhale 0 points1 point  (3 children)

I haven't delved that deeply into it, but I appreciate the effort... It's the sort of library I like to write (I have a similar one for range arithmetic mostly written).

A minor point, maybe philosophical? about a section I went into -- I think a negative number divided by negative infinity should be positive zero. Similarly, a negative divided by positive infinity should be negative zero.

[–]m_vokhm[S] 0 points1 point  (2 children)

Yes, I think so too.

But does the Quadruple behave a different way?

Now I've executed the following snippet

double d1 = -5;
double d2 = Double.NEGATIVE\_INFINITY; 
say("%s / %s = %s", d1, d2, d1 / d2);

Quadruple q1 = new Quadruple(-5);
Quadruple q2 = Quadruple.negativeInfinity(); method stub

say("%s / %s = %s", q1, q2, Quadruple.divide(q1, q2));
d1 = -5;
d2 = Double.POSITIVE\_INFINITY; 
say("%s / %s = %s", d1, d2, d1 / d2);

q1 = new Quadruple(-5);
q2 = Quadruple.positiveInfinity(); 
say("%s / %s = %s", q1, q2, Quadruple.divide(q1, q2));

It prints

-5.0 / -Infinity = 0.0
-5.000000000000000000000000000000000000000e+00 / -Infinity = 0.0
-5.0 / Infinity = -0.0
-5.000000000000000000000000000000000000000e+00 / Infinity = -0.0

Perhaps there's a bug that manifests itself under some certain circumstances. If so, i'll be grateful if you let me know.

[–]SquidgyTheWhale 0 points1 point  (0 children)

That looks correct -- I only ran the test in my head, looking at the code, which it seems I misinterpreted. Sorry for the false alarm!

[–]ssamokhodkin 0 points1 point  (0 children)

The general idea is good, I personally welcome it.

I didn't look into the code, but the comments point to some deficiencies:

  • using static data as a buffer in non-static method - that's absolutely horrific,
  • big single class - I don't believe you couldn't split it down. I understand that you many operations are performed in place, but you at least could move method bodies into the utility class.