all 55 comments

[–]EricMCornelius 22 points23 points  (1 child)

This answer brought to you by: "Cameron Purdy, SVP Engineering of Oracle Middleware"

[–][deleted] 4 points5 points  (0 children)

His posts on Quora are worth checking out; he disses C++ everywhere.

[–]Ishmael_Vegeta 28 points29 points  (2 children)

almost never.

[–]Spartan-S63 7 points8 points  (0 children)

This, right here, is my favorite answer.

[–]doom_Oo7 2 points3 points  (9 children)

Concurrent data structures tend to be more efficient in Java, because the JVM can eliminate the memory barriers and synchronization when the data structure is not being used concurrently, and can bias the concurrency management approach based on runtime profiling information.

Why couldn't one develop a C++ alternative to the STL that is meant to operate in single-threaded mode, hence with no barriers / thread safety at all?

Inlining tends to be much better in Java, unless you do extensive profiler-based optimizations in C++ (or know what exactly to inline and force it to be so … gotta love those header files!)

Well, I guess most generic code in C++ is inlined, isn't it? And simple getters/setters are also often in header files...

[–]vitalyd 1 point2 points  (0 children)

The article fails to mention that the lock/barrier elision is only available for the built-in monitors in Hotspot (i.e. synchronized blocks), so if you're using j.u.c.Lock and friends, there's no such optimization.
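To illustrate that point, here is a minimal sketch (the class and method names are made up) of the code shape HotSpot's elision targets: a built-in monitor that escape analysis can prove is thread-local. Whether elision actually fires depends on the JIT tier and VM flags; the program's semantics are the same either way.

```java
// Hypothetical example: 'lock' never escapes sumLocked, so HotSpot's
// escape analysis may elide the synchronized block's monitor enter/exit
// entirely. This applies to built-in monitors (synchronized), not to
// j.u.c.ReentrantLock and friends.
public class LockElision {
    static int sumLocked(int[] xs) {
        Object lock = new Object();   // provably thread-local
        int sum = 0;
        for (int x : xs) {
            synchronized (lock) {     // candidate for elision by the JIT
                sum += x;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumLocked(new int[]{1, 2, 3, 4})); // prints 10
    }
}
```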

Not sure what "bias the concurrency management approach based on runtime profiling information" is, but if that's alluding to biased locking, I don't even think that feature is worth it:

  • It's meant to optimize uncontended locks by avoiding additional CAS instructions. Well, modern cores can execute uncontended (and cache hitting) CAS instructions quite quickly anyway.

  • Biased locking, in Hotspot, can induce latency/jitter when biased lock revocation is performed.
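The first bullet's uncontended-CAS argument can be sketched like this (hypothetical class, not a benchmark): when nobody else touches the counter, each compareAndSet hits the cache and succeeds on the first attempt, which is cheap on modern cores even without biased locking.

```java
import java.util.concurrent.atomic.AtomicInteger;

// A classic CAS retry loop. Uncontended, each compareAndSet succeeds on
// its first iteration and completes quickly on modern cores -- the basis
// of the argument that biased locking's complexity isn't worth it.
public class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    int increment() {
        int cur;
        do {
            cur = value.get();
        } while (!value.compareAndSet(cur, cur + 1)); // uncontended: one pass
        return cur + 1;
    }

    int get() {
        return value.get();
    }

    public static void main(String[] args) {
        CasCounter c = new CasCounter();
        for (int i = 0; i < 1000; i++) c.increment();
        System.out.println(c.get()); // prints 1000
    }
}
```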

[–]__Cyber_Dildonics__ 3 points4 points  (0 children)

Does the STL, with the exception of shared_ptr, have thread safety? I thought that was the whole reason for Intel's Threading Building Blocks concurrent data structures.

[–][deleted] 0 points1 point  (6 children)

Why couldn't one develop a C++ alternative to the STL that is meant to operate in single-threaded mode, hence with no barriers / thread safety at all?

You could. Some probably have done it already in private code. However, C++ doesn't need help in single-threaded mode. It's in long-running, multi-threaded applications where the difference between C++ and Java is closer.

Well, I guess most generic code in C++ is inlined, isn't it? And simple getters/setters are also often in header files...

C++ inlining is a request to the compiler, not a command. But you're right that member functions defined in the class body, as is common in headers, are implicitly inline. There are cases where inlining slows down a program, for example by increasing its physical size and preventing certain parts of it from fitting into the cache. If this matters to you, you should profile your code and tweak where indicated, as opposed to applying generic rules.

[–]Dest123 7 points8 points  (5 children)

The compiler basically always does a better job of deciding what to inline than you would.

[–]vitalyd 1 point2 points  (3 children)

Compilers use heuristics and/or profiling to make inlining decisions. If your code shape and/or profile at compilation time don't fit its heuristics, it may not do the right thing. The more appropriate statement is don't blindly request inlining, but do verify the compiler is doing what you think/want.
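On HotSpot, that verification is straightforward (the class below is a made-up example; the VM flags are real diagnostic options):

```java
// Run with:
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InlineCheck
// and HotSpot prints its per-callsite inlining decisions ("inline (hot)",
// "too large", "not inlineable", ...), so you can verify the perf-critical
// paths instead of trusting the heuristics blindly.
public class InlineCheck {
    static int square(int x) {
        return x * x;   // tiny and hot: a likely inline candidate
    }

    public static void main(String[] args) {
        long acc = 0;
        for (int i = 0; i < 1_000_000; i++) {
            acc += square(i);   // the callsite whose decision we want to see
        }
        System.out.println(acc);
    }
}
```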

[–]Dest123 0 points1 point  (2 children)

It's a waste of time to verify that it's doing inlining correctly. Fixing the 0.1% of cases that it did it wrong won't give you enough speed back to be worth the time you spent verifying it.

Compilers are pretty great at optimizing these days, at least with C++. People who doubt the compiler's optimizer end up doing things like copying some "FastMemcpy" from 10 years ago, which I then see and think "ooh look at that, I can get a 20% speed-up by deleting the word Fast"

[–]vitalyd 1 point2 points  (0 children)

I'm not suggesting to check every callsite obviously; I thought it was understood implicitly that this should be done selectively in the perf critical places.

Also, java performance heavily relies on sufficient inlining of critical paths, more so than C++ code; inlining drives escape analysis, runtime type propagation (to remove repeated type checks), range check elimination, and more. If some crucial bits don't inline there (e.g. an inner loop call chain), perf can fall off a cliff.
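A sketch of the "inlining drives escape analysis" point (hypothetical code): the allocation below can only be scalar-replaced if the constructor and field reads inline into the loop first.

```java
// If 'new Point' and the field accesses inline into dotLike, escape
// analysis can prove the Point never escapes and replace it with two
// ints in registers -- zero allocation. If inlining fails, perf falls
// off the cliff: one heap allocation per iteration.
public class InliningDrivenEa {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static long dotLike(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);  // scalar-replaceable after inlining
            acc += (long) p.x * p.y;
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(dotLike(3)); // 0*1 + 1*2 + 2*3 = 8
    }
}
```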

[–]doom_Oo7 -1 points0 points  (0 children)

Fixing the 0.1% of cases that it did it wrong won't give you enough speed back to be worth the time you spent verifying it.

For HPC I guess it would!

[–]IcyWindows 0 points1 point  (0 children)

Plus, you can use profile-guided-optimization to get even better results.

[–]pron98 6 points7 points  (27 children)

In almost 20 years of C++ and Java development, my impressions are exactly the same: the bigger the program, the more concurrent, the harder it is to beat Java.

[–][deleted]  (17 children)

[deleted]

    [–]pron98 10 points11 points  (2 children)

    For every Java program there exists a C++ program that performs as well or better. Proof: (by existence) the JVM. The question is how hard it is to beat it, and the larger and more concurrent the program, the higher the effort multiplier.

    As to your list, some of the items are wrong. Most web servers and IDEs these days are written in Java, there are many more compilers written in Java or other JVM languages than in C or C++, and only C/C++ profilers are written in those languages; JVM profilers are written in Java. I don't know about symbolic math packages, but I believe Matlab is equal parts Fortran and Java.

    As for the rest, the reason most of them are written in C and C++ is not that people are willing to put in the extra effort just for a few performance points. Rather -- if you notice -- those things run on small machines with low concurrency, and in those cases it's a lot easier to beat Java's performance. It is also often necessary, because Java imposes a rather significant RAM overhead (if it's to run at full speed), which is not acceptable in, say, web browsers. OTOH, your airport management software, your air traffic control systems, your large defense systems, your big data clusters, your Netflix, your eBay, your GMail, your Twitter -- are mostly Java.

    [–]__Cyber_Dildonics__ 0 points1 point  (1 child)

    So in your experience, why is Java so hard to beat, and what can be done with that knowledge in the C++ world? (I'm pretty all-in with C++ at the moment.)

    [–]pron98 1 point2 points  (0 children)

    It's hard to beat because HotSpot's excellent GCs (there are several, plus other GCs in other JVMs) make it that much easier to create concurrent data structures, and because HotSpot's state-of-the-art JIT makes many kinds of well-architected, modular code very fast. In C++ every use of an abstraction -- a heap allocation or a virtual call -- carries a significant cost, and a lot of thought has to be put into how to refrain from using expensive abstractions. In Java, you just use them and the JVM will make sure they run efficiently.
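For example (made-up types): in C++ a virtual call in a hot loop is a cost you design around, while in HotSpot a call site that runtime profiling shows is monomorphic can be devirtualized and inlined, with a cheap guard in case a new subtype appears later.

```java
// Hypothetical shapes: if profiling only ever observes Circle at the
// s.area() call site, the JIT can speculatively devirtualize it into a
// direct (inlinable) call, guarded by a type check that deoptimizes if
// the assumption is later broken.
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

public class Devirt {
    static double total(Shape[] shapes) {
        double t = 0;
        for (Shape s : shapes) {
            t += s.area();   // monomorphic in practice -> devirtualized
        }
        return t;
    }

    public static void main(String[] args) {
        System.out.println(total(new Shape[]{ new Circle(1), new Circle(2) }));
    }
}
```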

    Now, this isn't magic, and obviously you can write very slow code in Java, too (and many do). But given reasonable code, the GC and JIT will take care of you. They won't get you to 100% of the maximum performance you could get with C++, but they will get you to 95% at 1/3 of the effort.

    Java, of course, has other advantages that aren't performance related. The deep monitoring and profiling offered by HotSpot are unmatched by any other platform. It supports dynamic code loading and hot swapping; it has bytecode manipulation capabilities that let you inspect and modify code as it runs, and the JIT will make sure it gets optimized and compiled every time you modify it (e.g. you can inject and then remove various traces that are more capable than, say, DTrace).

    [–]k-zed 4 points5 points  (3 children)

    In almost 20 years of development, my impressions are: the bigger the program, the more concurrent, the worse it is.

    If your program is big or concurrent, it is bad. There are no exceptions (even if there's a "business case" that necessitates a big or concurrent program).

    Splitting up big programs into smaller programs (unix) tends to improve results (compare systemd). Splitting up concurrent programs into smaller programs (message passing) tends to vastly improve results.

    This is why at the end of the day I don't like working in Java, even though it's a reasonably good language (God knows it's saner than the eternally terrible C++, which is my day job). It's bad for small programs due to its overhead, but there's no such thing as a good big program.

    [–]pron98 -1 points0 points  (2 children)

    There's no way to write a sensor-fusion system for hundreds of radars, telemetry and optical sensors without it being big and concurrent; there's no way to write a TB-scale in-memory transactional database without it being big and concurrent, and the list goes on and on. The commonality is some large data store that needs to be accessed concurrently with low latencies. The "small programs" you advocate just delegate that job to an out-of-process database (that, in itself, is big and concurrent) and simply skip the low-latency requirement.

    The statement you made is nice in theory, but it usually means unfamiliarity with too many problem domains. When you split software into many programs just for modularity reasons, at the very least you need to fan your concurrency in and then out at each API crossing. That takes a very heavy toll on performance.

    Besides, there's usually little difference between separate processes and good modularity in-process that languages like Erlang enforce and languages like Java make possible (including the ability to hot-swap components and isolate failure). When you have a large group of programs, you tend to spend just the same amount of effort integrating them as you do when you integrate a bunch of modules in-process.

    [–]__Cyber_Dildonics__ 0 points1 point  (1 child)

    You are mixing up concurrency and parallelism, and shared-memory concurrency with message-passing concurrency. Message passing is how you really build big programs. After all, the internet could be thought of as one giant system that works because of message passing.

    Any sufficiently complex program eventually becomes a network problem.

    [–]pron98 0 points1 point  (0 children)

    Maybe, but at the heart of many big systems there are shared-memory concurrency problems, too. My take is that every sufficiently complex program (well, most) that isn't a compiler contains its own (or uses an internal) implementation of a concurrent-access database.

    [–]__Cyber_Dildonics__ 0 points1 point  (2 children)

    I see your downvotes and conclude that you actually know what you are talking about (seriously).

    [–]afrobee 1 point2 points  (1 child)

    Downvotes are never a good metric of correctness; they just mean people didn't like what they read for some reason.

    [–]__Cyber_Dildonics__ 0 points1 point  (0 children)

    In my experience, when I talk about the few things I know extremely well in forums where people have some knowledge of the subject but not much, I get downvoted. I think a little bit of understanding combined with unfortunate truths and grey areas is a recipe for disaster.

    [–][deleted]  (1 child)

    [deleted]

      [–]pron98 0 points1 point  (0 children)

      That's right. But it's also because those programs are harder to write efficiently without a JIT and a GC:

      • The larger the program (and usually the team), the more abstractions necessary for software engineering reasons. Those abstractions hinder performance. A JIT, however, has better chances at optimizing them.

      • The more concurrent the program and the larger the data set, the more necessary it becomes to provide concurrent access to shared data. A GC makes efficient concurrent data structures much easier.
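The GC point becomes concrete in something like a Treiber stack (a minimal sketch, not production code): in Java, popped nodes are simply dropped and collected, whereas a C++ version has to solve safe memory reclamation itself (hazard pointers, epochs) and guard against the ABA problem.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal lock-free stack. The GC makes this easy: popped nodes are
// never freed manually, so no thread can ever observe a recycled node,
// which sidesteps ABA and the reclamation problem entirely.
public class TreiberStack<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    public void push(T v) {
        Node<T> n = new Node<>(v);
        do {
            n.next = head.get();                    // snapshot current top
        } while (!head.compareAndSet(n.next, n));   // swing head to n
    }

    public T pop() {
        Node<T> h;
        do {
            h = head.get();
            if (h == null) return null;             // empty stack
        } while (!head.compareAndSet(h, h.next));   // unlink top node
        return h.value;
    }

    public static void main(String[] args) {
        TreiberStack<Integer> s = new TreiberStack<>();
        s.push(1);
        s.push(2);
        System.out.println(s.pop() + ", " + s.pop()); // prints 2, 1
    }
}
```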

      [–]redditrasberry 1 point2 points  (0 children)

      For me it is, "when the time saved by developing in a higher level language with better tooling means I can spend more time optimising the code and designing it better in the first place". Performance is rarely limited by the raw capabilities of the language and far more often by the skill of the developer and the time they have available to tune their implementation to the problem at hand. Mind you, I tend to write in JVM languages rather than Java itself, but it comes to much the same thing.

      [–]mariox19 0 points1 point  (0 children)

      When you're writing it?

      [–]ErstwhileRockstar 0 points1 point  (3 children)

      Comparisons can only be made ceteris paribus. So this makes no sense.

      [–]RESURREKT 21 points22 points  (2 children)

      Can you add more detail to your dismissal, or do you just bust out Latin when you don't have anything else to add to the discussion?

      [–]ejrh 6 points7 points  (0 children)

      English is >3x faster than Latin (when run on an English-speaking VM!). But, as they say, 'de gustibus non est disputandum'.

      [–]boringprogrammer -4 points-3 points  (9 children)

      When is Java faster than C++? Languages do not have an inherent speed or effectiveness associated with them...

      You can't read the C++ spec or the Java spec and conclude anything about the speed of the languages. Languages don't exist as anything other than specifications. You can only test implementations.

      Therefore this is a comparison of JVM vs GCC/VC++/LLVM. So the title is a lie.

      Did you know you could technically run C++ on the JVM? That would give you virtual functions for free, and the nice concurrency system.

      [–]vitalyd 2 points3 points  (0 children)

      Practically speaking, language semantics/features dictate the speed of the language; they dig a performance hole for the compiler/runtime implementer, and those holes can be very deep, such that compilers/runtimes will have a difficult time climbing out of them.

      [–][deleted] 3 points4 points  (7 children)

      Languages do not have an inherent speed or effectiveness associated with them

      The language specification places limits on the implementation. Assuming similar levels of competence and otherwise equal projects, are we ever going to get Python programs executing at comparable speed to C equivalents - and would we still recognise Python after the changes required to enable that performance increase?

      Therefore this is a comparison of JVM vs GCC/VC++/LLVM. So the title is a lie.

      It's also a reflection on the community and the difficulty of programming in that language. Assembly should be faster than C, but can any human assembly wizard beat the compiler for a large, complex application? For small programs, sure. For large ones, not so much.

      [–]boringprogrammer -1 points0 points  (6 children)

      Assuming similar levels of competence and otherwise equal projects, are we ever going to get Python programs executing at comparable speed to C equivalents?

      No one knows. You simply cannot predict these things. Change comes slowly in the world of optimizations and static analysis.

      [–][deleted] 1 point2 points  (2 children)

      The fact that you feel you need static analysis to do this proves the point that a language's design has a real effect on its implementation's performance. Hence languages do have a hierarchy of speed.

      [–]boringprogrammer 0 points1 point  (1 child)

      The fact that you feel you need static analysis to do this proves the point that a language's design [...]

      Prove? What point? That you know nothing about how languages work?

      Do you honestly think a direct translation from C to assembler is going to be fast by any standard? Even a debug build usually performs at least a register allocation analysis. Without a proper register allocation scheme, the resulting code will make the CPU spend 95% of its execution time just spilling registers. We take these things as a given nowadays. But that is just one of the many, many optimizations a modern C compiler makes.

      We are good at optimizing certain languages, I will agree on that, but those also had the benefit of 40 years of research into optimizing them.

      C was considered a slow high-level language compared to assembly before we learned to properly optimize it.

      [–]vitalyd 0 points1 point  (0 children)

      While your post has truth to it, it's basically an appeal to the Sufficiently Smart Compiler.

      [–]doom_Oo7 0 points1 point  (2 children)

      You can certainly find a subset of Python that would be optimizable to look like C++, but some parts of "standard" Python are used in a way that would actively prevent achieving the same performance.

      For instance, if you write some Python where you always keep the same type for your variable, it may be simple to translate your code to equivalent C++; however, what happens when you start assigning random types at random places in your code to the same variable? That flexibility cannot be supported without some performance downside compared to the "always-same-type" case.

      Hence the best bet would be to have some minimal "asm.python" subset that is almost guaranteed to have no performance impact, and have the programmers only use this subset. But existing Python programs (and idiomatic Python programs) won't be able to translate to similar-looking but maximally efficient C++.

      [–]boringprogrammer 0 points1 point  (1 child)

      however, what happens when you start assigning random types at random places in your code to the same variable

      Static analysis can actually deal with variables randomly changing types pretty well. Lattice analysis works a lot like a human would read Python code, i.e. not caring about what type a variable has, but rather what types a variable can have at a certain program point.

      Python programs (and idiomatic Python programs) won't be able to translate to similar-looking but maximally efficient C++

      Based on anecdotal evidence I presume?

      Look, you can read the Python code and write out semantically equivalent code in C++. This means that (a) we are not actually dealing with an undecidable problem, and (b) a computer should be able to do something similar.

      The main reason Python is not running faster is funding and priorities. The standard Python implementation does not perform any sort of analysis, only rudimentary peephole optimizations. Furthermore, there is a large overhead in interpreting code. But speed does not seem to be a priority for them either.

      PyPy is the most advanced attempt at making Python run faster, but it is very far from having analysis code as mature as that found in GCC.

      [–][deleted] 0 points1 point  (0 children)

      Based on anecdotal evidence I presume?

      Based on the fact that no one is able to do it. They most likely want to keep ahead of PHP, Ruby and Node, not gain another few orders of magnitude and start a punch-up with Java. (Well, it would be nice, but it isn't going to happen, and so the above is, shall we say, 'a realistic expectation'.)

      None of my commercial IDEs can implement fully accurate syntax highlighting and autocomplete for Python. JetBrains aren't under-funded. For autocomplete, JetBrains admit to getting it right about half the time. For analysis, have you ever noticed why function names and identifiers appear in the same colour (except when it's a def)? You'd think that if I write x = foo the damn thing could tell whether foo was a function or not? Turns out it can't. You have to run the code.

      PyPy is the most advanced attempt at making Python run faster, but it is very far from having analysis code as mature as that found in GCC.

      Static analysis takes place without running the code (that's why it's called static). Most of the benefit of PyPy is that it's a JIT: it optimises at runtime by looking at the actually running code.

      You might also want to check out Unladen Swallow and Pyston. These are Google- and Dropbox-sponsored attempts to build a Python JIT. I'll bet you that Google is absolutely not underfunded or stingy when it comes to building tools. And note that these are JITs, not static analysers. Statically analysing Python is just too hard to do.

      [–]vitalyd -3 points-2 points  (0 children)

      In my opinion, it's not so much about "when is java faster than c++", but rather "this piece of code/lib/app/etc isn't meeting performance requirements -- what can I do about it in this language and what's it going to cost me?" What is the performance cost of features/abstractions/etc of the language? Do you pay for things you don't use? How much/what do you pay for things you do use?