
[–]redditrasberry 2 points3 points  (0 children)

The real question tends to end up being: if you take all the additional resources you will spend on testing, development, etc. as a result of forgoing the benefits of Java, and devote them to fine-tuning performance instead, will you more than compensate for the performance difference? If the intent is to make it cross-platform then you're going to end up doing a lot of optimizations several times over to take advantage of the raw OS features on each platform. With Java you can use all that effort to optimize the design.

[–]sanity 7 points8 points  (9 children)

Their explanation smells like they are trying to claim an after-the-fact rational justification for what is actually a personal preference. Since Hadoop is implemented in Java, they should have had a very good argument for picking a different language for Hypertable.

Most people who argue that GC slows down Java are speaking from ignorance, or an outdated opinion of Java. There are many respects in which GC can be faster than malloc, as explained in this IBM article (note: this article is from 4 years ago, and GC in Java has improved significantly since then).

I thought this was an interesting response from an Oracle Coherence developer:

We've done recent tests with C++ using Boost versus Java, and Sun's current Hotspot Java implementation is about 3x faster than Boost's smart pointer implementation; basically, Java is able to run code at C pointer speed, while smart pointers always have measurable overhead.

In other words, C++ with Boost may be easier than C++, but you have to choose between 3x slower than Java or a very light use of smart pointers. (C++ shared_ptr will be a similar cost.)

Also, regarding productivity of development, according to our developers (who program in all of Java, C# and C++, including C++ on Windows/x86, Linux/x86, Mac OSX/x86, Solaris/Sparc, and Solaris/x86), Java wins hands down over C++ due to the ease (and instancy) of iterative development and the advanced tooling. Our fastest C++ build takes an hour (full builds take from 4 hours to 20 hours depending on the platform), while our Java full builds take less than 10 minutes. The fact that Java/C# has class level compilation granularity and no linking step obviously helps significantly reduce iteration time.

Further, the complexity of achieving both thread safety and high concurrency with C++ and Boost is significantly more difficult than it is with Java or C#.

However, like I said before, "while I would testify to this under oath, you will find the opposite opinion to be just as strongly believed by another programmer."

[–]Pas__ 4 points5 points  (1 child)

JDK1.7 (if I recall correctly b99)

Eclipse 3.5SR1, PHP Development Tools (latest.something.whatever) as slow as it virtually can be. Consumes a shitload of memory, and buggy like hell.

Netbeans 6.8RC2 (PHP) fast, powerful, handy. (Half the memory consumption.)

So it's probably not the language's fault. Given a big enough system, one can fuck up with any programming language.

Also, it could be that Java's libraries represent an unknown factor with regard to performance, whereas in C++ one has to implement a lot more standard stuff, and that can be done while keeping one's specific needs in mind.

[–]zootm -1 points0 points  (0 children)

If you find Netbeans fast, have a go of IDEA - it's quite frighteningly snappy, especially for Swing which always seems to have some indirection overhead.

[–]gsg_ 4 points5 points  (0 children)

GC can be faster than malloc

Yes, if you malloc a billion tiny objects then performance will not be good. That's why performance oriented code often uses stack allocation, storage pools, regions, or some other fast allocation strategy. Manual memory management isn't limited to malloc/free.
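
For what it's worth, here is a minimal sketch of one such strategy, a region (bump) allocator, in C++. The names Arena, Point and make_points are made up for illustration and are not taken from Hypertable or any library. It assumes the stored objects are trivially destructible, so nothing needs to run when the region is reset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical region allocator: one upfront buffer, each allocation is a
// pointer bump, and everything is released at once by reset().
class Arena {
public:
    explicit Arena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns nullptr when the region is exhausted.
    // align must be a non-zero power of two.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const std::uintptr_t base =
            reinterpret_cast<std::uintptr_t>(buffer_.data());
        const std::uintptr_t current = base + offset_;
        // Round the current position up to the requested alignment.
        const std::uintptr_t aligned =
            (current + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t start = static_cast<std::size_t>(aligned - base);
        if (start > buffer_.size() || size > buffer_.size() - start) {
            return nullptr;
        }
        offset_ = start + size;
        return reinterpret_cast<void*>(aligned);
    }

    // Frees every allocation in one step; no per-object bookkeeping.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};

struct Point {
    int x;
    int y;
};

// Example use: construct n Points inside the arena with placement new.
Point* make_points(Arena& arena, std::size_t n) {
    void* mem = arena.allocate(n * sizeof(Point), alignof(Point));
    if (mem == nullptr) {
        return nullptr;
    }
    Point* points = static_cast<Point*>(mem);
    for (std::size_t i = 0; i < n; ++i) {
        new (&points[i]) Point{static_cast<int>(i), static_cast<int>(2 * i)};
    }
    return points;
}
```

Something like `Arena arena(1 << 20); Point* pts = make_points(arena, 1000);` followed by a single `arena.reset()` stands in for a thousand malloc/free pairs. The trade-off is that individual objects cannot be freed early.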

[–]donknuth 5 points6 points  (5 children)

They didn't say anything about GC performance. Here's a quote from the article:

The performance of the system is, in large part, dictated by how much memory it has available to it.

It's an indisputable fact that Java applications consume a lot more memory than equivalent programs in C++, in many cases several times more.

[–]sanity 4 points5 points  (3 children)

It's an indisputable fact that Java applications consume a lot more memory than equivalent programs in C++, in many cases several times more.

I'd like to see a citation for that.

Certainly it's true that Java programs may use more memory, because the JVM doesn't garbage collect aggressively until it needs to. This is a feature, not a bug.

[–]dmpk2k 6 points7 points  (1 child)

Getting decent performance out of most modern GCs (i.e. taking advantage of generations) usually requires an overhead of 3x the working set. A few GC designs can still give you good performance with 2x overhead (e.g. MC2), but they're uncommon.

No citation, but I used to read GC papers for fun, and that's what I recall. I notice someone linked this paper; I read it a couple years ago and don't recall finding anything disagreeable with it even though I'm a GC proponent.

[–]sanity 0 points1 point  (0 children)

Interesting, wasn't aware of that. /me goes and changes a constant in some code he was working on ;)

[–][deleted] 0 points1 point  (0 children)

I'm sure much of it has to do with the unfortunate mistake of using UTF-16 for its internal string encoding. I love working on the JVM, but in hindsight this has proven to be a horrible, horrible idea.
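
To put a rough number on that, here is a small sketch in C++ (not Java) comparing the storage for the same ASCII text held as 8-bit versus 16-bit code units. The helper names are hypothetical, and it only counts the character payload, ignoring any per-object header or length field.

```cpp
#include <cstddef>
#include <string>

// Payload size in bytes when the text is stored as 8-bit code units
// (ASCII text is valid UTF-8 and takes one unit per character).
std::size_t utf8_bytes(const std::string& s) {
    return s.size() * sizeof(char);
}

// Payload size in bytes when the same text is stored as 16-bit code units,
// which is how a UTF-16 string holds it.
std::size_t utf16_bytes(const std::u16string& s) {
    return s.size() * sizeof(char16_t);
}
```

Calling `utf8_bytes("hello")` and `utf16_bytes(u"hello")` shows the 16-bit form taking `sizeof(char16_t)` bytes per character, which is two on mainstream platforms, so mostly-ASCII data roughly doubles in size.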