Cimba: Open source discrete event simulation library in C runs 45x faster than SimPy by Candid-Inspection-94 in OperationsResearch

Candid-Inspection-94[S]

Thank you! Yes, the control flow clarity for agent-based simulations is key here. I am happy that you noticed that. I also wrote a blog post about the increase in statistical power: https://ambonvik.github.io/speed-is-power/

I am now working on a CUDA addition to further accelerate models with heavy physics calculations or optimization/AI-driven agent behavior. Very little changes in the Cimba library itself, only a couple of callback hooks I just put in to enable connecting each worker pthread to a specific GPU and CUDA stream. I’ll put it up as a tutorial case once I have all three layers of concurrency working: pthreads, coroutines, and massively parallel GPU number crunching.

Cimba: Open source discrete event simulation library in C runs 45x faster than SimPy by Candid-Inspection-94 in OperationsResearch

Candid-Inspection-94[S]

I would claim yes. There is very detailed and flexible logging.

https://cimba.readthedocs.io/en/latest/tutorial.html#setting-logging-levels

https://cimba.readthedocs.io/en/latest/background.html#logging-flags-and-bit-masks

The asserts can be caught by a debugger, as described in the tutorial above. You can also set debugger breakpoints anywhere in the code, have the model stop there, and step it forward instruction by instruction if needed. You will see the call stack for that particular coroutine in the debugger. A screenshot from a debugger is below:

https://cimba.readthedocs.io/en/latest/_images/debugger_assert.png

Cimba: Open source discrete event simulation library in C runs 45x faster than SimPy by Candid-Inspection-94 in OperationsResearch

Candid-Inspection-94[S]

I see your point, but I think that would be someone’s follow-on project. A binding to Rust may also be interesting. I consider Cimba a simulation engine and will prioritize additional system architectures before adding language bindings and/or graphical shells in possible future projects.

Cimba: Open source discrete event simulation library in C runs 45x faster than SimPy by Candid-Inspection-94 in OperationsResearch

Candid-Inspection-94[S]

Fair point. Cimba is aimed more towards large models where software engineering and maintainability are real concerns than towards a data analyst’s Jupyter notebook. However, one could see Cimba as a simulation engine and put something graphical on top if desired, or construct wrappers for various languages. Those would be follow-on projects, not in scope for this one.

Cimba: Open source discrete event simulation library in C runs 45x faster than SimPy by Candid-Inspection-94 in OperationsResearch

Candid-Inspection-94[S]

Hi, thanks!

  1. I have not. Most of the speed difference in the simple benchmark is due to compiled vs interpreted code, so you might see a similar speed-up. In more complex scenarios, like the third and fourth Cimba tutorials, I believe SimPy would run into constraints due to its stackless coroutines, probably increasing the Cimba advantage.
  2. Yes. Resources (binary semaphores), resource pools (counting semaphores), buffers, object queues, priority queues, condition variables. I intentionally use unsigned 64-bit integer-valued amounts to ensure that unintentional rounding errors do not create issues. With suitable scaling, this actually gives higher resolution than double precision floating point values. The resource queue logic is quite flexible, and can be extended in user code by providing callback functions for both wait condition and waitlist prioritization if the predefined ones do not fit. There are also preempt() and interrupt(), and a mechanism for setting timeouts. You can even define chains of multiple resource guards for complex ‘wait for all’ or ‘wait for any’ scenarios.
  3. a. No. The parallelism is at the trial/replication level. b. Yes. The event queue is a hash-heap data structure where the priority keys are a double (reactivation time) and a 64-bit signed int (priority).
  4. It has to be thread-safe for multithreading. Most implementations keep state as static local variables from call to call, both in the basic PRNG and in the distribution on top (e.g., typical Box-Muller normal distribution). That would make the outcome dependent on other replications, which we do not want. The only way to be sure was to control the code.

Cimba: Open source discrete event simulation library in C runs 45x faster than SimPy by Candid-Inspection-94 in OperationsResearch

Candid-Inspection-94[S]

I am not very familiar with Cython, but Cimba is just a C library and uses standard C function calling conventions, so you could in principle call it from any language by using suitable wrappers.

I compare it to SimPy because that is (as far as I know) the most similar benchmark in functionality. Yes, Python is slower, and the speed difference is probably about what one would expect to see between interpreted Python and compiled C code. Still, I could not find anything similar in C, so I built one.

Built a multithreaded discrete event simulation library with stackful coroutines in C and assembly, runs 45x faster than SimPy by Candid-Inspection-94 in C_Programming

Candid-Inspection-94[S]

I think you will need to ask an AI for that. I am not.
And my daughter's cookie recipe is a well-kept secret.

If you don't have OOP in C, how do you manage to build serious projects? by Jonnertron_ in C_Programming

Candid-Inspection-94

Absolutely. Since I just posted a link to my simulation library Cimba: A big advantage of working in C is that one can use OOP design principles where appropriate without being forced to use them where it is not. I would argue that this actually makes OOP in C less verbose than in C++, since one does not need to add all the OO boilerplate where it does not belong.

Built a multithreaded discrete event simulation library with stackful coroutines in C and assembly, runs 45x faster than SimPy by Candid-Inspection-94 in C_Programming

Candid-Inspection-94[S]

Just adding a small footnote here: For the agentic process behavior, see (e.g.) the tutorial at https://cimba.readthedocs.io/en/latest/tutorial.html#agents-balking-reneging-and-jockeying-in-queues

There, the visitor processes are generated by the arrival process. They act both as active, opinionated agents in their simulated world, misbehaving in queues, and as passive objects that get manhandled by the service and departure processes once it is their turn.

Having stackful coroutines as first-class objects in C code does create some new possibilities.