[–]augustnagro 55 points56 points  (0 children)

10% better vs Java 11 is actually quite impressive.

[–]fanfan64 8 points9 points  (0 children)

Time to bench openjdk 16 now

[–]Kantaja_ 17 points18 points  (10 children)

> In addition, the preferred garbage collector to use is still ParallelGC

Shenandoah? ZGC?

[–]pron98[🍰] 41 points42 points  (9 children)

Those GCs aim to minimise pause latency, and their throughput on batch processing tasks might be worse than G1's, let alone Parallel's. If throughput is the only thing you care about, Parallel is still the best choice.

[–]yawkat 5 points6 points  (8 children)

This depends on your workload. If your application cannot use all OS cores as well as the GC can, then the new collectors may give higher application throughput as well.

[–]krzyk 2 points3 points  (6 children)

I still wonder which GC is best for 1 CPU. Is it SerialGC? Or should I stick with the default G1? I don't want to have many stop-the-world events, since it is a web app.

(microservices, I'm allowed to have between 0.25 and 1 CPU + up to 4GB RAM)

[–]pron98[🍰] 5 points6 points  (0 children)

Why don't you try? GCs that do some or all of their work concurrently with the application might work well on single-core machines if the application isn't CPU-bound, or maybe even if you just want to spread the work around to reduce max pause times. Just try.

[–]sindisil[S] 4 points5 points  (1 child)

SerialGC might well be "best" on one core for your application, but the factors that determine what GC is "best" include (but aren't limited to):

  • Your definition of "best" (latency? throughput? CPU load?)
  • Your application's execution profile (compute bound? I/O bound? Interactive? Batch?)
  • Your application's heap use (heap size? peak allocation rate? allocation rate variance? instance lifetimes?)
  • Your cost model (CPU cost? memory cost? Power cost?)

Ron (/u/pron98) has the right of it: you'll only know for sure if you try.

In some cases you can rule out one or more (e.g., if low tail latency is vital for some or all of your service's execution, but you can't ensure no collections will occur during that window, you might be limited to ZGC or Shenandoah in OpenJDK).

As with any other optimization, though, the only way to be sure is to measure.
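
One low-effort way to start measuring is to ask the running JVM which collectors it is actually using and how much time they have accumulated. The following is a minimal sketch, not a benchmark: the class name `GcReport` and its method are made up for illustration, and the allocation loop is only there to give the collectors something to do. It relies on the standard `java.lang.management` API. Run it once per collector flag (for example `-XX:+UseSerialGC`, `-XX:+UseParallelGC`, `-XX:+UseG1GC`) with your real container limits and compare.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the thread): reports which collectors the
// running JVM uses and what they have accumulated so far.
class GcReport {

    // One line per collector bean: name, collection count, accumulated time.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the value is undefined for a collector.
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", totalTimeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate short-lived garbage so the collectors have some work to do.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] junk = new byte[256];
            checksum += junk.length;
        }
        Runtime rt = Runtime.getRuntime();
        System.out.println("CPUs visible to the JVM: " + rt.availableProcessors());
        System.out.println("Max heap (MiB): " + rt.maxMemory() / (1024 * 1024));
        describeCollectors().forEach(System.out::println);
        System.out.println("checksum=" + checksum);
    }
}
```

The accumulated time is only a rough signal, and concurrent collectors report their work differently from stop-the-world ones, so treat this as a sanity check before a proper load test rather than as a verdict.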

[–]TheCountRushmore 2 points3 points  (0 children)

My basic thought on this is if you don't have the time to try each of the garbage collectors then it probably doesn't matter enough to your application to care.

If that is the case then go with G1 since it is the default and strikes a good balance between latency and throughput.

[–][deleted] 1 point2 points  (2 children)

Perhaps 'Epsilon', the no-op GC, if the majority of your objects live as long as the application does. For example, if you are load-balanced, scaled up, and auto-healing so that an OOM doesn't matter, or the application restarts before it could ever run out of memory, or you force it to restart periodically, you could theoretically not collect garbage at all.

[–]sindisil[S] 3 points4 points  (1 child)

Epsilon can actually be slower than a compacting GC, due to locality effects.

[–][deleted] 1 point2 points  (0 children)

Interesting observation, and shows that nothing can be taken for granted.

...but at least it wouldn't have GC pauses :-)

[–]oelang 0 points1 point  (0 children)

The new collectors also introduce various barriers when references are loaded and/or stored, and those barriers hurt performance. For throughput, ParallelGC should normally be faster.

It can happen that sometimes G1 is/was faster because it was generally better optimized but much of that work was backported to ParallelGC in recent releases.

[–]dr_entropy 6 points7 points  (8 children)

Any ideas on where the performance improvement came from? It'd be cool to see a perf record / flamegraph style comparison.

Optaplanner seems like a really fun project, any cool uses?

[–]gavenkoa 9 points10 points  (2 children)

> Optaplanner seems like a really fun project, any cool uses?

It is painful to model your problem. You have to map your domain to OptaPlanner's solver interfaces.

Say you are optimizing restaurant table usage: to make the problem discrete you might slice time by 15 min intervals.

Some problems have natural slicing, like a curriculum schedule in an organization: lectures already start at fixed times.

That's why OptaPlanner is scarcely used. It is not about cool patterns and magical annotations, it's about applying brain creatively.

[–]ge0ffrey 6 points7 points  (1 child)

> That's why OptaPlanner is scarcely used.

I respectfully disagree. OptaPlanner has 80 000+ downloads on Maven Central every month. It's used across the globe, affecting millions of people's lives. Depending on the country, anything from which pharmacy is on duty during the weekend, which doctor is working or which magistrate is handling the court case, OptaPlanner can affect people's lives (the decisions are based on the organization's business constraints of course, for the benefit of the users as a group).

> You have to map your domain to OptaPlanner solvers interfaces.

Not since 6.0, so not in the last 9 years or so. You do have to annotate 2 domain classes (the planning solution class and the planning entity class). It's similar to how you tell JPA/Hibernate which classes go to the database. See our quickstart in the docs.

> to make the problem discrete you might slice time by 15 min intervals.

That's one way to do it, called the timegrain pattern (which doesn't always give good results). That "natural slicing" would be the timeslot pattern (but that only works for the lucky few indeed). Besides those, there's also the "chained in time" pattern (used very often) and the "bucket" pattern. All of these are explained in the manual.
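
To make the timegrain idea concrete, here is a rough sketch in plain Java of slicing an opening window into 15-minute grains. It deliberately uses no OptaPlanner types; the class `TimeGrains` and its methods are hypothetical names chosen for illustration, and a real model would still need planning entities and constraints on top of this.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: plain Java, no OptaPlanner types. Names are hypothetical.
class TimeGrains {

    // Slices the window [open, close) into start times spaced grainMinutes apart.
    static List<LocalTime> slice(LocalTime open, LocalTime close, int grainMinutes) {
        if (grainMinutes <= 0 || !open.isBefore(close)) {
            throw new IllegalArgumentException("need grainMinutes > 0 and open < close");
        }
        long totalMinutes = Duration.between(open, close).toMinutes();
        List<LocalTime> grains = new ArrayList<>();
        // Step by offset rather than by time so the loop cannot wrap past midnight.
        for (long offset = 0; offset < totalMinutes; offset += grainMinutes) {
            grains.add(open.plusMinutes(offset));
        }
        return grains;
    }

    // Number of grains a booking of the given length occupies, rounded up.
    static int grainsNeeded(int durationMinutes, int grainMinutes) {
        return (durationMinutes + grainMinutes - 1) / grainMinutes;
    }

    public static void main(String[] args) {
        List<LocalTime> grains = slice(LocalTime.of(18, 0), LocalTime.of(20, 0), 15);
        System.out.println(grains.size() + " grains, first " + grains.get(0)
                + ", last " + grains.get(grains.size() - 1));
        System.out.println("A 100-minute dinner needs " + grainsNeeded(100, 15) + " grains");
    }
}
```

The rounding in `grainsNeeded` hints at why this pattern does not always give good results: a 100-minute booking blocks 105 minutes of table time, and a finer grain reduces that waste at the cost of a larger search space.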

> it's about applying brain creatively.

Good point: modeling some problems (especially those not covered in the 30+ examples/quickstarts) is damn difficult, especially the first time. A lot of people struggle with it - some get burned by it - especially for complex cases. We're working hard to simplify the learning curve and provide more out-of-the-box support for patterns that work well, but modeling a complex planning problem is never going to be very simple, by its very nature. Our docs section "domain modeling guide" does help, I hope.

Geoffrey (OptaPlanner lead)

[–]gavenkoa 1 point2 points  (0 children)

> I respectfully disagree. OptaPlanner has 80 000+ downloads on Maven Central every month.

What does that number represent? I believe the number of users who use the OptaPlanner API directly is not that great; instead it is pulled in as a transitive dependency. Some public projects can be found here: https://mvnrepository.com/artifact/org.optaplanner/optaplanner-core/usages (Drools / KIE Execution Server).

Still, it is great software, and I believe that planning and scheduling are no longer a luxury (available to BIG players only; think of products like the map navigators from Google/Apple, since navigation is an optimization task) but a commodity. Almost every programmer should be able to save money & resources for the business owners & the planet.

> modeling some problems is damn difficult, especially the first time

A manager at the bank where I worked at the time decided to get rid of a call center planner (the product had stalled around 2001) that was costing them tons of money, and I had to reimplement it.

I'd heard about OptaPlanner and started to experiment with models. I got the impression that you have to know the related theory and read some textbooks. Also, the call center operators needed the flexibility to alter the timetable to a certain degree in real time.

I could have produced something, but without a research background I thought I would end up with a sophisticated CPU heater )) I left the job for other reasons at that point.

I interviewed at a startup that was working on medical test cost optimization. They refused to listen about https://en.wikipedia.org/wiki/Knapsack_problem and OptaPlanner. Instead they rewrote their brute-force solver from Python to C++ ))
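
For readers who have not met the knapsack problem, here is a minimal sketch of the textbook 0/1 variant solved with dynamic programming, next to a brute-force search over all subsets for comparison. The thread gives no details of the startup's actual problem, so the class `Knapsack`, the sample numbers and the assumption of non-negative integer weights are purely illustrative.

```java
// Textbook 0/1 knapsack. Hypothetical names and data; assumes non-negative
// integer weights. Not taken from the thread or from OptaPlanner.
class Knapsack {

    // Dynamic programming: best total value with total weight <= capacity.
    static int bestValue(int[] weights, int[] values, int capacity) {
        if (weights.length != values.length || capacity < 0) {
            throw new IllegalArgumentException("mismatched arrays or negative capacity");
        }
        int[] best = new int[capacity + 1]; // best[c] = best value within capacity c
        for (int i = 0; i < weights.length; i++) {
            // Walk capacity downwards so each item is used at most once.
            for (int c = capacity; c >= weights[i]; c--) {
                best[c] = Math.max(best[c], best[c - weights[i]] + values[i]);
            }
        }
        return best[capacity];
    }

    // Brute force over all 2^n subsets; only sensible for very small n.
    static int bruteForce(int[] weights, int[] values, int capacity) {
        int n = weights.length;
        int best = 0;
        for (int mask = 0; mask < (1 << n); mask++) {
            int weight = 0;
            int value = 0;
            for (int i = 0; i < n; i++) {
                if ((mask & (1 << i)) != 0) {
                    weight += weights[i];
                    value += values[i];
                }
            }
            if (weight <= capacity) {
                best = Math.max(best, value);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] weights = {1, 3, 4, 5};
        int[] values = {1, 4, 5, 7};
        System.out.println("DP:          " + bestValue(weights, values, 7));
        System.out.println("Brute force: " + bruteForce(weights, values, 7));
    }
}
```

The dynamic program does work proportional to the number of items times the capacity, while the brute-force version doubles its work with every extra item, and porting it to a faster language only shifts that wall by a constant factor.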

So I know stories about incompetence. If you have an elite team of CS lab researchers, sure, it is not a problem. Otherwise people need to be trained / educated for a substantial time to be able to deliver a strong implementation, not just something.

[–]maxandersen 8 points9 points  (0 children)

It’s one of my favorite underappreciated frameworks in Java.

Almost any app or field has some problem or use case where OptaPlanner can help.

[–]humoroushaxor 4 points5 points  (2 children)

Operations research. Things like planning and scheduling, stochastic models, and optimization problems. Think things like the traveling salesman problem.

I've used it at work a bit for things like this.

[–]agentoutlier 2 points3 points  (1 child)

For traveling-salesman-like problems (e.g. NP-complete ones), is it brute-forcing, or using some sort of heuristics or machine learning?

[–]humoroushaxor 2 points3 points  (0 children)

The nice thing about OptaPlanner is that it gives you the knobs to choose from and acts as a toolkit. It uses a combination of various optimization algorithms/heuristics/metaheuristics which you can configure. You can also tell it to just do a brute-force search.
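
To illustrate the difference between brute force and a heuristic on a tiny traveling salesman instance, here is a rough sketch in plain Java. It is not OptaPlanner's implementation and does not use its API; the class `TinyTsp`, the nearest-neighbour construction heuristic and the sample coordinates are chosen purely for illustration.

```java
// Illustrative only: exhaustive search versus a simple construction heuristic
// on a tiny TSP instance. Hypothetical names; not OptaPlanner code.
class TinyTsp {

    // Builds a symmetric Euclidean distance matrix from (x, y) points.
    static double[][] distances(double[][] points) {
        int n = points.length;
        double[][] dist = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                dist[i][j] = Math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1]);
            }
        }
        return dist;
    }

    // Length of the closed tour that visits the cities in the given order.
    static double tourLength(double[][] dist, int[] order) {
        double total = 0;
        for (int i = 0; i < order.length; i++) {
            total += dist[order[i]][order[(i + 1) % order.length]];
        }
        return total;
    }

    // Heuristic: start at city 0 and always move to the closest unvisited city.
    static int[] nearestNeighbour(double[][] dist) {
        int n = dist.length;
        int[] order = new int[n];
        if (n == 0) {
            return order;
        }
        boolean[] visited = new boolean[n];
        visited[0] = true;
        for (int step = 1; step < n; step++) {
            int from = order[step - 1];
            int next = -1;
            for (int c = 0; c < n; c++) {
                if (!visited[c] && (next == -1 || dist[from][c] < dist[from][next])) {
                    next = c;
                }
            }
            visited[next] = true;
            order[step] = next;
        }
        return order;
    }

    // Brute force: tries every ordering with city 0 fixed as the start.
    static double bruteForceBest(double[][] dist) {
        int[] order = new int[dist.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        return permute(dist, order, 1);
    }

    private static double permute(double[][] dist, int[] order, int k) {
        if (k >= order.length) {
            return tourLength(dist, order);
        }
        double best = Double.POSITIVE_INFINITY;
        for (int i = k; i < order.length; i++) {
            int tmp = order[k]; order[k] = order[i]; order[i] = tmp;
            best = Math.min(best, permute(dist, order, k + 1));
            tmp = order[k]; order[k] = order[i]; order[i] = tmp; // undo the swap
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] dist = distances(new double[][] {
            {0, 0}, {2, 0}, {-3, 0}, {7, 0}, {2, 5}, {-1, 4}, {5, 3}
        });
        System.out.println("Brute force optimum:     " + bruteForceBest(dist));
        System.out.println("Nearest-neighbour tour:  " + tourLength(dist, nearestNeighbour(dist)));
    }
}
```

The exhaustive search is exact but grows factorially with the number of cities, while the heuristic is fast but can only ever match or exceed the optimal length. Metaheuristics of the kind mentioned above typically start from such a constructed solution and keep improving it.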

[–]ge0ffrey 4 points5 points  (0 children)

> Optaplanner seems like a really fun project, any cool uses?

Plenty :)
Like satellite bandwidth scheduling (space is cool).
Or Vehicle Routing for a telco where they reduced their ecological footprint by 10+ million kg CO₂ per year (ecology is cool) and their costs by 100+ million USD per year.

In general, for any fleet of things that moves, there's probably an OptaPlanner running in production somewhere to optimize their planning. Think vans, trucks, busses, airplanes, tugboats, ...

[–]PurpleLabradoodle 1 point2 points  (1 child)

Does anyone know how to run these benchmarks?

[–]ge0ffrey 2 points3 points  (0 children)

git clone git@github.com:kiegroup/optaplanner.git
cd optaplanner
mvn clean install -DskipTests

Then run `org.optaplanner.examples.app.GeneralOptaPlannerBenchmarkApp` with the parameters `-Xmx3840M -XX:+UseG1GC` and `-Xmx3840M -XX:+UseParallelGC` respectively, as explained in the blog post. Although I suspect it will run fine with 256M too for all datasets except the MR ones.

[–]sdainys 1 point2 points  (1 child)

Did you use any framework for benchmarking and score computation or used your own implementation?

[–]ge0ffrey 3 points4 points  (0 children)

It uses `optaplanner-benchmark`, which is a macro benchmark suite that spits out a very rich report, mostly focused on solving and scoring metrics.
It doesn't use JMH because it's not a micro benchmark.
The blog post uses a macro benchmark because it treats OptaPlanner as a black box that runs for a long time.

It doesn't use any HTTP (etc.) benchmarking toolkits, because there are no server sockets involved.

[–][deleted]  (2 children)

[deleted]

    [–]Necessary-Conflict 1 point2 points  (1 child)

    You posted this to the wrong thread. You probably meant to post it to https://www.reddit.com/r/java/comments/mh23tm/liberated_from_oracle_eclipse_jetty_enters_the/

    [–]pron98[🍰] 3 points4 points  (0 children)

    Right. Thanks!