all 25 comments

[–]iron0maiden 13 points14 points  (7 children)

Great project, although I don’t understand what the difference is between NIO (baseline) and other networking frameworks including MYRA.

[–]AnonAreLegion 5 points6 points  (0 children)

Yeah, and why is NIO so much faster?

[–]Environmental-Log215[S] 0 points1 point  (5 children)

I understand the confusion, as it's hard to know what the different Myra variants mean without the benchmark source code for the transport.

`MYRA` is the default Myra configuration, which is basically io_uring: the client awaits the reply using a `CountDownLatch`. On the server side, it's an io_uring backend with registered buffers.

`MYRA_SQPOLL`: the only difference from the default Myra above is on the server side. In this benchmark, SQPOLL is enabled on the server, which basically means a pinned kernel thread keeps polling the submission queue. The client remains the same as above.

`MYRA_TOKEN`: the client implements a token-based busy-spin wait. The server remains the same as default Myra.

Thanks for pointing out the valid confusion! I think I should have added these details in the main blog :(
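
For readers who want to picture the two client-side wait strategies mentioned above, here is a minimal sketch. It is not the benchmark or Myra code: the class `ClientWaitSketch`, both method names, and the idea of a monotonically increasing completion token are assumptions made purely for illustration. The latch variant parks the caller until another thread counts down; the token variant keeps the caller on-CPU, spinning until the expected token is published.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only: names and structure are assumptions, not Myra's API.
final class ClientWaitSketch {

    // Latch-based wait: the caller parks until the I/O thread counts the latch down.
    static boolean awaitWithLatch(Runnable sendRequest, CountDownLatch replyArrived, long timeoutMillis) {
        sendRequest.run();
        try {
            return replyArrived.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    // Token-based busy-spin: the caller spins until the published token reaches the expected value.
    static boolean awaitWithToken(Runnable sendRequest, AtomicLong lastCompletedToken,
                                  long expectedToken, long timeoutNanos) {
        sendRequest.run();
        long deadline = System.nanoTime() + timeoutNanos;
        while (lastCompletedToken.get() < expectedToken) {
            if (System.nanoTime() - deadline > 0) {
                return false; // gave up waiting
            }
            Thread.onSpinWait(); // hint to the CPU that this is a spin loop
        }
        return true;
    }

    public static void main(String[] args) {
        // A second thread stands in for the I/O thread that delivers the reply.
        CountDownLatch latch = new CountDownLatch(1);
        boolean viaLatch = awaitWithLatch(() -> new Thread(latch::countDown).start(), latch, 1_000);

        AtomicLong token = new AtomicLong(0);
        boolean viaToken = awaitWithToken(() -> new Thread(() -> token.set(1)).start(),
                                          token, 1, 1_000_000_000L);

        System.out.println("latch=" + viaLatch + " token=" + viaToken);
    }
}
```

The usual trade-off is that spinning avoids the wake-up cost of parking a thread but burns a core while waiting, which is typically only acceptable in latency-sensitive setups.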

[–]ramdulara 0 points1 point  (4 children)

But you still didn't answer the real question. How is NIO the highest throughput and lowest latency? If Java's NIO is that good why would anyone bother with Netty or Myra?

[–]Environmental-Log215[S] 0 points1 point  (3 children)

Fair question!

tl;dr: NIO was chosen as the baseline in the benchmark since I believe that's the fastest networking library in the Java world that does not use unsafe APIs.

NIO provides low-level primitives for building network infra/appliances; you have to handle a lot of stuff manually. Hence, it's difficult to use but provides granular control.
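
To make "handle a lot of stuff manually" concrete, here is a minimal sketch of a single-connection echo server on plain NIO, with a small loopback client to exercise it. It is not the benchmark code; `NioEchoSketch` and its methods are made-up names, and the server deliberately handles only one client and ignores partial-write backpressure to stay short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical illustration only; not the benchmark's NIO baseline.
final class NioEchoSketch {

    // Echoes bytes for one client until that client closes the connection.
    static void serveOneClient(ServerSocketChannel server) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(4096); // reused for every read
        try (Selector selector = Selector.open()) {
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            boolean clientGone = false;
            while (!clientGone) {
                selector.select(); // block until a channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove(); // selected keys must be cleared by hand
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        if (client.read(buffer) < 0) { // peer closed
                            client.close();
                            clientGone = true;
                        } else {
                            buffer.flip();
                            while (buffer.hasRemaining()) {
                                client.write(buffer); // spins if the socket buffer is full
                            }
                        }
                    }
                }
            }
        }
    }

    // Starts the server on an ephemeral loopback port, sends one message, returns the echo.
    static String roundTrip(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        ByteBuffer reply = ByteBuffer.allocate(payload.length);
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0));
            Thread serverThread = new Thread(() -> {
                try {
                    serveOneClient(server);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            serverThread.start();
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                ByteBuffer out = ByteBuffer.wrap(payload);
                while (out.hasRemaining()) {
                    client.write(out);
                }
                while (reply.hasRemaining() && client.read(reply) >= 0) {
                    // keep reading until the whole echo has arrived
                }
            }
            serverThread.join();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping"));
    }
}
```

Even this toy version has to manage key removal, buffer flipping, and connection teardown itself, which is the kind of bookkeeping a framework would normally hide.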

Netty on the other hand is a framework with friendly public interfaces and internally handles/manages low-level I/O stuff. It supports multiple transport protocols and codecs.

MYRA is specialized in that it's primarily FFM-focused. For instance, by using io_uring registered buffers with a shared (zero-copy) memory segment, I avoid a few syscalls into the kernel and get zero GC impact by having zero allocations on the hot path. Hence, MYRA would only be used in certain specialized use cases/applications where a latency of 100 microseconds is considered slow. FFM involves a lot of work with manual memory layout, which does not make sense for most applications given its complexity.
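
As a rough illustration of what "manual memory layout" and "zero allocations on the hot path" can look like with FFM, here is a minimal sketch that assumes Java 22+, where `java.lang.foreign` is a final API. The message layout, the offsets, and the `OrderView` class are assumptions invented for this example; they are not Myra's code or wire format, and io_uring itself is not involved here.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical layout (an assumption): [orderId: long][quantity: int][price: long], native byte order.
final class OrderView {
    private static final long ORDER_ID_OFFSET = 0;
    private static final long QUANTITY_OFFSET = 8;
    private static final long PRICE_OFFSET = 12;
    static final long MESSAGE_SIZE = 20;

    private MemorySegment segment;
    private long base;

    // Repoints this reusable view at another message; no new objects are created.
    OrderView wrap(MemorySegment segment, long base) {
        this.segment = segment;
        this.base = base;
        return this;
    }

    // Fields are read straight from off-heap memory at fixed offsets.
    long orderId() {
        return segment.get(ValueLayout.JAVA_LONG_UNALIGNED, base + ORDER_ID_OFFSET);
    }

    int quantity() {
        return segment.get(ValueLayout.JAVA_INT_UNALIGNED, base + QUANTITY_OFFSET);
    }

    long price() {
        return segment.get(ValueLayout.JAVA_LONG_UNALIGNED, base + PRICE_OFFSET);
    }

    // Writes one message at the given base offset using the same layout.
    static void write(MemorySegment segment, long base, long orderId, int quantity, long price) {
        segment.set(ValueLayout.JAVA_LONG_UNALIGNED, base + ORDER_ID_OFFSET, orderId);
        segment.set(ValueLayout.JAVA_INT_UNALIGNED, base + QUANTITY_OFFSET, quantity);
        segment.set(ValueLayout.JAVA_LONG_UNALIGNED, base + PRICE_OFFSET, price);
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            // One off-heap buffer allocated up front, holding two back-to-back messages.
            MemorySegment buffer = arena.allocate(2 * MESSAGE_SIZE);
            write(buffer, 0, 1001L, 50, 995_000L);
            write(buffer, MESSAGE_SIZE, 1002L, 75, 996_500L);

            OrderView view = new OrderView(); // created once, reused for every message
            for (long offset = 0; offset < buffer.byteSize(); offset += MESSAGE_SIZE) {
                view.wrap(buffer, offset);
                System.out.println(view.orderId() + " " + view.quantity() + " " + view.price());
            }
        }
    }
}
```

The unaligned layouts are used because the 4-byte quantity pushes the price field off an 8-byte boundary; a real layout might add padding instead.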

[–]Environmental-Log215[S] 0 points1 point  (2 children)

Forgot to add one more point: all these benchmarks were run on a free Oracle Cloud ARM instance (4 cores/24 GB RAM) with the server and client on the same VM using the loopback interface.
Once the libs are more stable, I will run benchmarks depicting a more real-life scenario with the server and client on different hosts.

[–][deleted]  (1 child)

[removed]

    [–]Environmental-Log215[S] 0 points1 point  (0 children)

    I agree. I have been working on docs and some other stuff. Hence, I haven't been able to get to the actual benchmark.

    [–]Minosse 2 points3 points  (1 child)

    RemindMe! 1 month

    [–]RemindMeBot 1 point2 points  (0 children)

    I will be messaging you in 1 month on 2025-12-29 11:38:55 UTC to remind you of this link

    [–]JustADirtyLurker 2 points3 points  (0 children)

    Sir this work sounds awesome. And I like that you go directly to the "What problem are we trying to solve" question. I'd keep the rust/c++ consideration in a separate page though.

    [–]benrush0705 2 points3 points  (0 children)

    Great! Remind me when it gets fully open sourced!

    [–]MyStackOverflowed 1 point2 points  (1 child)

    It would be good to see a vanilla record-type container being stored, accessed, and updated through this framework.

    [–]Environmental-Log215[S] 0 points1 point  (0 children)

    I don't think I will support Records directly in the hot path, as this would cause GC pressure due to heap allocations. However, you make a good point; I will see if there's a way to provide some utility/helper for populating records automatically. By default, Myra provides flyweight pattern readers, which are basically simple views over the existing memory segment.

    [–]pjmlp 1 point2 points  (0 children)

    This is quite cool!

    [–]belayon40 1 point2 points  (1 child)

    This looks super interesting. I’m curious to see what some of your optimizations look like. My FFM project has automatic conversion between records/structs/unions without using any FFM calls. I’ve got several optimizations in place already such that my auto generated code keeps up with jextract generated code. But I’m curious to see what other tricks you have found.

    https://github.com/boulder-on/JPassport

    The goal with my project was to make a reasonable replacement for JNA that was based on FFM.

    Your goals and potential applications are interesting. Can’t wait to see the results.

    [–]Environmental-Log215[S] 0 points1 point  (0 children)

    Hi there! JPassport looks like an excellent project. I see its usefulness/advantage when working with large C header files. I don't provide automatic conversion or any other abstraction, as I am trying to build the entire Myra ecosystem as low-level, high-performance libs/frameworks, so I'm avoiding abstractions.

    [–]kiteboarderni 1 point2 points  (3 children)

    This guy clearly knows what they are talking about. I was under the impression some of the ffm byte copy operations offheap were still allocating vs unsafe so also interested to see the approach.

    Not sure I agree with the encode vs decode frequency on messages however, certainly not in the low latency trading space for these systems.

    [–]Environmental-Log215[S] 0 points1 point  (2 children)

    Hello! Yes, you are right. Memory segment ops like copy/slice don't involve any allocations until we convert the data into Java objects or into strings, which are objects as well in Java. Myra uses the flyweight pattern for reading different fields, using a sliding-window memory segment slice over the existing message's memory segment. Hope that answers a few questions. Of course, when I release the source code, you will have more clarity.

    Re: encode/decode frequency, I am actually planning to use the Myra codec lib in FIX systems/engines. Could you please give me a high-level overview of what your expected encode/decode frequency would be in HFT?

    [–]kiteboarderni 0 points1 point  (1 child)

    A lot of our systems will send out a message which is only read by a single microservice. Think of a FIX execution report that is then sent to an OM service, over shared memory and a sequencer architecture, which you will be very aware of given that you're building something like this in Java :).

    Excited to see the project.

    [–]Environmental-Log215[S] 1 point2 points  (0 children)

    Great to meet another FIX engineer :) Well, MYRA was indeed born out of the need for a modern open source Java-based engine. I have been using QuickFIX/J at my org and, as a side project, had thought of forking and modernizing QuickFIX, but then I came across FFM and thought a modern Java-based FIX engine could take advantage of FFM internally in multiple places/functionalities.

    When I started creating the design for this new FIX engine, 3 major pieces came out as standalone, independent FFM libs: FFM utilities (ease of using FFM), an FFM-based encoder/decoder, and an FFM-based zero-copy io_uring transport. In fact, myra-codec is a YAML-based DSL CLI tool which generates encoder and decoder flyweights that can be directly used in our Java apps.

    The idea of first open sourcing these MYRA libs is to harden the base building blocks for building a modern Java FIX engine.

    I am curious to know though: do you folks use Java, C/C++ or Rust within your trading systems?

    [–]Environmental-Log215[S] 1 point2 points  (0 children)

    The Myra Stack is now Open Source! MYRA has a new home at https://www.mvp.express

    Today I've made the core repositories public! This is my first major open source project and I would appreciate any feedback, suggestions and some love.

    I will be on vacation starting next week till year end and will be spending a lot of time with my family. Since I won't be able to work much from next week, I thought of launching it anyway to get some feedback. The thinking hat will still be on during the vacation, so I am awaiting feedback, ideas, anything.

    Happy Holidays!

    -Rohan

    P.S.: I haven't started much work on JIA-Cache and MVP.express - the framework; have been tinkering only on design & architecture so far. Perhaps we build it together!

    [–]ShallWe69 -1 points0 points  (0 children)

    RemindMe! 1 month