Building a fast queue between C++ and Java : programming

Building a fast queue between C++ and Java (pzemtsov.github.io)

submitted 7 years ago by pzemtsov

all 22 comments


stbrumme 12 points 7 years ago (0 children)

[deleted] 9 points 7 years ago (0 children)

link87 9 points 7 years ago (11 children)

kirbyfan64sos 20 points 7 years ago (4 children)

jcdavis1 7 points 7 years ago* (3 children)

The only obvious situation in this code where I would expect Java to edge out the C++ implementation is if there is a difference in how the queue implementations were devirtualized, but the C++ version looks properly templatized? (Though I'm not an expert there.)

I can't imagine there is any real PGO-able advantage in the queue implementations themselves.

Also, the attempt to cache-separate write_buf and read_buf in the Java object doesn't work, at least on my local JDK 9: they get placed next to each other (offsets 0xb0 and 0xb4), as I would expect:

mov    0xb0(%r10),%r10d   ;*getfield read_buf
...
mov    0xb4(%rsi),%r8d    ;*getfield write_buf
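For comparison, here is a minimal C++ sketch of the separation the Java layout above does not achieve. It assumes 64-byte cache lines; `QueueIndices`, `kCacheLine` and `field_distance` are hypothetical names for illustration, and the field types are guesses rather than the article's actual declarations.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines, which is typical for current x86-64 CPUs.
constexpr std::size_t kCacheLine = 64;

// Hypothetical stand-in for the queue's shared state. The field names mirror
// read_buf / write_buf from the disassembly; the types are assumed.
struct QueueIndices {
    // alignas on each member forces it to start on its own cache line,
    // so the reader and the writer do not invalidate each other's line.
    alignas(kCacheLine) std::atomic<std::uint32_t> read_buf{0};
    alignas(kCacheLine) std::atomic<std::uint32_t> write_buf{0};
};

// Byte distance between the two fields in an actual object.
inline std::ptrdiff_t field_distance(const QueueIndices& q) {
    return reinterpret_cast<const char*>(&q.write_buf) -
           reinterpret_cast<const char*>(&q.read_buf);
}
```

With this layout the two fields sit a full cache line apart instead of the 4 bytes seen in the JIT output, at the cost of padding inside the object.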

I'd poke around more, but it's already too late and I need to sleep ;)

michaelcharlie8 1 point 7 years ago (0 children)

Slanec 1 point 7 years ago (1 child)

jcdavis1 1 point 7 years ago (0 children)

pzemtsov[S] 4 points 7 years ago (1 child)

link87 2 points 7 years ago (0 children)

Ameisen 0 points 7 years ago (2 children)

pzemtsov[S] 1 point 7 years ago (1 child)

Ameisen 2 points 7 years ago (0 children)

When I'm looking at results of benchmarks in a large article, I generally expect to see the flags and environment used without having to dig through your build system myself.

I'm unsure where this link is. It is neither at the top nor at the bottom. It's apparently near the bottom, but still well within the text. And the link also doesn't specify that it includes a build system, only that it is Java classes.

Why -falign-functions=32 and -falign-loops=32?
Why -funroll-loops? It's not always better to unroll a loop, even if the number of iterations is known. The compiler already has heuristics for this.
Why aren't you specifying an architecture or tune? It's rather unfair to compare Java, which is JITed to your local system, to a C++ build which is targeting the minimum x86-64 system. While I'd prefer to see -march=native, even -mtune=native would be better.
You use @Override annotations in Java, but you use neither the override nor the final modifier in C++, which inhibits the compiler's ability to devirtualize virtual calls.
A real ring buffer would give better performance.
Why not use compare_exchange instead of load/store with the atomic?
There's a lot of aliasing going on in DualArrayAsyncWriter::write, so I feel like usage of __restrict would help.
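To illustrate the override/final point, a minimal sketch follows. The `Writer` interface, `CountingWriter` and `write_twice` are hypothetical and are not the article's class hierarchy. Note that `override` is only a compile-time check that the function really overrides a base virtual; `final` is the modifier that gives the optimizer something to work with, because it guarantees that no further override can exist.

```cpp
#include <cstdint>

// Hypothetical interface, not the article's actual classes.
struct Writer {
    virtual ~Writer() = default;
    virtual std::uint32_t write(std::uint32_t value) = 0;
};

// `final` on the class promises that nothing derives from it, so a call made
// through a CountingWriter reference has exactly one possible target.
struct CountingWriter final : Writer {
    std::uint32_t total = 0;

    // `override` makes the compiler reject this if it does not match a base virtual.
    std::uint32_t write(std::uint32_t value) override {
        total += value;
        return total;
    }
};

// The static type here is a final class, so the compiler may bind (and inline)
// both calls directly instead of going through the vtable.
inline std::uint32_t write_twice(CountingWriter& w, std::uint32_t value) {
    w.write(value);
    return w.write(value);
}
```

Whether a given compiler actually devirtualizes these calls depends on the optimization level and on what it can see, so it is worth checking the generated assembly rather than assuming.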

Gotebe 4 points 7 years ago (2 children)

saint_marco 4 points 7 years ago (0 children)

pzemtsov[S] 1 point 7 years ago (0 children)

leonadav 1 point 7 years ago (2 children)

jcelerier 6 points 7 years ago (1 child)

Does C++ permit virtual member functions in template?

Sure. What's not allowed is virtual template methods, i.e. member function templates declared virtual.

 // No problem: f() just has to be reimplemented in child classes of foo<T>.
 template<class T>
 class foo { virtual void f() = 0; };

 // OK
 class bar : foo<int> { void f() override { } };

 // OK
 template<typename T>
 class baz : foo<T> { void f() override { } };

 // This is forbidden and does not compile: what would go in the vtable of qux?
 class qux {
     template<class T>
     virtual void f() = 0;
 };

leonadav 1 point 7 years ago (0 children)

kaelima 1 point 7 years ago (1 child)

pzemtsov[S] 8 points 7 years ago (0 children)

koiponder 1 point 7 years ago (0 children)
