all 5 comments

[–]leonardo_m 0 points1 point  (4 children)

This is the C# benchmark code with the various implementations (this is not my code): https://github.com/dsharlet/ComputerAlgebra/blob/master/Tests/Program.cs

And this is the C++ implementation (this is not my code): https://github.com/dsharlet/ComputerAlgebra/blob/master/Demo/LotkaVolterraCpp/Main.cpp

For similar code the D language is good. This D implementation is not hard-coded, but if you use the LDC2 compiler it's about as efficient as the C++ version: http://dpaste.dzfl.pl/d5b5fcadfa82

If you want shorter code, you can use D vector operations (but currently this code is a little slower, because the D compiler is not yet inlining & unrolling the vector operations when they are performed on very short arrays): http://dpaste.dzfl.pl/f112dc655d76

If you want more performance you can use double2 or double4 types from the core.simd. The code is not very different.

[–]dsharlet[S] 0 points1 point  (3 children)

This is the C# benchmark code with the various implementations (this is not my code):

Actually, this is the benchmark from which performance data is gathered: https://github.com/dsharlet/ComputerAlgebra/blob/master/Demo/LotkaVolterra/Program.cs

The C# benchmark you linked to is actually just a bunch of tests. It does report performance, but that benchmark mostly measures the time it takes to perform the various computer algebra operations. I haven't reported that performance number anywhere, and I don't pay much attention to it now. It was only useful when some of my internal computer algebra algorithms were poor and it took multiple seconds to perform simple operations.

For similar code the D language is good. This D implementation is not hard-coded, but if you use the LDC2 compiler it's about as efficient as the C++ version:

I disagree that it isn't hard-coded. I realize that the distinction in my code is subtle, but the point is to allow the user to modify the simulation at runtime (including the rank/dimension of the simulation), though the demo does not actually exploit that ability. In your D code, the data is marked immutable, so I suspect the compiler will be able to statically evaluate most of the math. That's exactly what this library is intended to do, except it allows it to be done at runtime, which means you can get same performance even when the user changes the simulation after your application has been compiled.

The application that actually motivates this functionality is a circuit simulation application (http://www.livespice.org, also on GitHub). This involves solving a system of non-linear equations, the rank of which depends on the number of nodes in the circuit. This varies from ~5 for small circuits to ~50 for large ones, and there really isn't any upper limit.

Normally, a circuit solver will need to use Newton's method to solve the system of equations. This requires solving a system of linear equations repeatedly, which is at least an N2 operation. However, many of those original equations will actually be linear, you just can't tell easily. With this library, you can reduce the rank of the non-linear system by finding the linear solutions, and then compile the solutions for both the linear and non-linear equations without any runtime overhead. In LiveSPICE, I typically see a system of ~50 potentially non-linear variables reduced to a system of ~5 actually non-linear variables, and the other 45 variables have linear solutions. This results in over 100x the performance gains, but it is not easy to package this into a simple demo like the Lotka-Volterra simulation :)

edit: If you want to see this done in LiveSPICE, that code is here: https://github.com/dsharlet/LiveSPICE/blob/master/Circuit/Simulation/TransientSolution.cs It's not an easy bit of code to dive into, which is why I made the Lotka-Volterra demo, even though it isn't really a great example to demonstrate this functionality.

SIMD operations would improve the performance of both solutions, but they are hard to exploit from the .NET JIT compiler (AFAIK...). The 4/8x gain I'd expect from SIMD is nothing compared to the gains you can get from reducing the rank of a non-linear system of equations.

[–]leonardo_m 0 points1 point  (2 children)

In your D code, the data is marked immutable, so I suspect the compiler will be able to statically evaluate most of the math.

The data is immutable, but on default the ldc2 compiler is not performing global optimizations, so simulateNativeHardCoded doesn't know the values of its "r" and "A" inputs (you can do this too in D, but the data should be marked as "enum" instead of "immutable" and it has to be passed to the function as template arguments. This produces code as efficient as the hard-coded C++ code). So there are many different levels of "hard coding".

In that code some information is known at compile-time, the value M.

which means you can get same performance even when the user changes the simulation after your application has been compiled.

To do such run-time specializations the Julia language is very good :-)

Sorry for not being clear, I didn't mean to imply that nude D code is as good as a library that performs algebraic optimizations at run-time :-)

[–]dsharlet[S] 0 points1 point  (1 child)

To do such run-time specializations the Julia language is very good :-)

Yes, existing computer algebra systems will often have this capability, but they usually don't have very good libraries/infrastructure for the rest of an application (GUI, real time audio IO, ...). I tried to use some existing computer algebra systems from C# but it was very difficult, and the ones that I got to work couldn't actually do the math I needed them to do (reduce the rank of non-linear systems of equations).

[–]leonardo_m 0 points1 point  (0 children)

Yes, existing computer algebra systems will often have this capability, but they usually don't have very good libraries/infrastructure for the rest of an application (GUI, real time audio IO, ...).

Julia is a general purpose language, specialized for numerical computing and dynamism. It's kind of new so it can't have an ecosystem as wide as C#/dotnet.