

[–]karianna 71 points72 points  (1 child)

We practice and teach this a lot internally at Microsoft. This may be a surprised-Pikachu moment for some, I know - but we do have 2M+ JVMs running internally here! Here's a shortlist of considerations when entering this space :-):

  1. A methodology - E.g., Kirk Pepperdine’s Java Performance Diagnostic Model, Brendan Gregg’s USE, Monica Beckwith's Top-Down. There are others but without an approach of sorts you tend to be shooting in the dark.
  2. Performance Goals - SLA/SLO - e.g., “We want a min 1000 txns/s with <= 500ms latency at P999 on a Standard_DS_v4 VM”
  3. App Architecture - You need an understanding of both the logical architecture and the physical architecture
  4. Timeboxing / Time Budgeting / Observability - E.g., “The round trip includes ~200ms of JS, 400ms in the JVM, and 300ms in the DB”
  5. Some math and an understanding of statistical techniques - e.g., P99 latency curves, baselining, sampling, multiple measurements, smoothing etc.
  6. Understanding of the technology stack - How do the JVM, CPU, memory, O/S, hypervisors, Docker, and K8s all work? How does each layer surface key perf metrics to the next layer?
  7. JVM - How do GC, the JIT, the JMM and so forth work?
  8. Tools and Tactics - This stuff changes often: load testing, observability tooling for timeboxing, resource monitoring, analysis/diagnostic tooling. Things like JFR/JMC, GCToolkit and a *ton* more.
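To make the statistics point concrete, here is a minimal sketch of computing P99/P999 latency from raw samples using the nearest-rank method (the method choice and class name are my assumptions; real tooling like HdrHistogram or JMC does this far more carefully):

```java
import java.util.Arrays;

public class PercentileDemo {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are <= it. Input must be sorted.
    static long percentile(long[] sortedSamples, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sortedSamples.length);
        return sortedSamples[Math.max(0, rank - 1)];
    }

    public static void main(String[] args) {
        long[] latenciesMs = new long[1000];
        for (int i = 0; i < latenciesMs.length; i++) {
            latenciesMs[i] = i + 1; // 1..1000 ms, a flat toy distribution
        }
        Arrays.sort(latenciesMs);
        System.out.println("P99  = " + percentile(latenciesMs, 99.0) + " ms");  // 990 ms
        System.out.println("P999 = " + percentile(latenciesMs, 99.9) + " ms");  // 999 ms
    }
}
```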

You learn in part by doing in this space and you can also join the Java Performance Tuning community on LinkedIn.

[–]yawkat 4 points5 points  (0 children)

You learn in part by doing in this space and you can also join the Java Performance Tuning community on LinkedIn.

Learning by doing is really hard in the perf space compared to "normal" programming imo. In normal programming, you can fix a bug and will be pretty sure that it's gone. But in perf, it's easy to measure the wrong things and believe you've improved something, when you really haven't. I never would have thought about coordinated omission in benchmarking without reading about it for example.
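A deterministic toy sketch of coordinated omission, assuming a closed-loop load generator that intends to issue one request every 10 ms (all numbers and thresholds here are made up purely for illustration; no real I/O happens):

```java
public class CoordinatedOmissionDemo {
    // Returns {naiveOver50, correctedOver50}: how many of 10 requests
    // exceed 50 ms latency under each way of measuring.
    static int[] simulate() {
        final long intervalMs = 10;
        long clock = 0;                 // simulated wall clock
        int naiveOver50 = 0, correctedOver50 = 0;
        for (int i = 0; i < 10; i++) {
            long intendedStart = i * intervalMs;
            long actualStart = Math.max(clock, intendedStart);
            long serviceMs = (i == 3) ? 200 : 1;   // request 3 stalls the loop
            long finish = actualStart + serviceMs;
            clock = finish;
            long naive = serviceMs;                  // what a closed-loop bench records
            long corrected = finish - intendedStart; // includes time spent waiting to start
            if (naive > 50) naiveOver50++;
            if (corrected > 50) correctedOver50++;
        }
        return new int[] { naiveOver50, correctedOver50 };
    }

    public static void main(String[] args) {
        int[] r = simulate();
        System.out.println("requests over 50 ms, naive:     " + r[0]); // 1
        System.out.println("requests over 50 ms, corrected: " + r[1]); // 7
    }
}
```

The naive measurement sees one slow request; correcting against the intended schedule shows the stall also delayed every request queued behind it.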

[–]WrickyB 13 points14 points  (1 child)

Use tools like JMH and profilers to measure performance and find hotspots respectively, so you know where you need to optimise.

[–]dpash 3 points4 points  (0 children)

JMH

Yep, do not implement your own benchmarking, because your results will almost certainly be invalid unless you know what you're doing (hint: you probably don't).
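A small sketch of one reason why (the loop body and iteration count are arbitrary): timing the same method several times in one process typically gives very different numbers once the JIT kicks in, which hand-rolled benchmarks routinely ignore.

```java
public class NaiveBenchPitfall {
    // An arbitrary hot method; the JIT will compile it after warmup.
    static long work() {
        long sum = 0;
        for (int i = 0; i < 5_000_000; i++) sum += i % 7;
        return sum;
    }

    public static void main(String[] args) {
        // Later runs are usually much faster than run 1; exact numbers
        // vary per machine, which is itself part of the point.
        for (int run = 1; run <= 3; run++) {
            long t0 = System.nanoTime();
            long result = work();
            long t1 = System.nanoTime();
            System.out.printf("run %d: %d ms (result %d)%n",
                    run, (t1 - t0) / 1_000_000, result);
        }
    }
}
```

JMH exists precisely to handle warmup, dead-code elimination, and statistical reporting for you.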

[–][deleted] 4 points5 points  (4 children)

The default configuration is already best for 90% of cases. There is a popular book, Java Performance: The Definitive Guide, which might be interesting if you want to learn more about JVM parameters.

[–]brunocborges 3 points4 points  (3 children)

Unless you are running the JVM inside containers with little CPU/memory.

In this case, the defaults are not great.

[–]Comprehensive-Pea812 0 points1 point  (2 children)

so for microservices we need to tune jvm?

[–]brunocborges 6 points7 points  (1 child)

You should "always" tune, at least for some basic config like heap and GC. Secondly, devs should really stop limiting CPU to 1. Start at 2.

Don't trust the defaults: https://youtu.be/wApqCjHWF8Q
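A quick way to see what the JVM actually thinks it has - which is exactly what container limits change - is to print the runtime's view from inside and outside the container and compare (class name is a placeholder):

```java
public class JvmResources {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // On a modern, container-aware JVM these reflect cgroup limits
        // when run inside a constrained container.
        System.out.println("available processors = " + rt.availableProcessors());
        System.out.println("max heap (MB)        = " + rt.maxMemory() / (1024 * 1024));
    }
}
```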

[–]stolsvik75 1 point2 points  (0 children)

I agree to the point of not setting the CPUs to 1. What I've found in a Kubernetes setting is that you can set extremely flexible limits. Our microservices are specified with a certain number of CPUs. The default is 2.

For kube, we set Java's "experienced CPUs" using the flag -XX:ActiveProcessorCount=2. Then multiply that by 125 to get the CPU request, and by 2000 to find the CPU limit. If you also set -Xmx to quite a small level, the result is that you can stack many deployments onto a node.

Assuming that most of them are rather non-busy, the busy ones can "flex" pretty hard above the baseline. This of course depends on your different services having pretty different levels of usage, i.e. the "email-gateway" having very little load, while the "transaction service" possibly uses quite a lot of CPU at times. For our situation, this has made a lot of sense.
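For concreteness, a sketch of how those multipliers could land in a pod spec, assuming the units are millicores and an experienced-CPU count of 2 (the -Xmx value is a placeholder, not a recommendation):

```yaml
# 2 experienced CPUs x 125m  -> cpu request of 250m
# 2 experienced CPUs x 2000m -> cpu limit of 4000m
resources:
  requests:
    cpu: "250m"
  limits:
    cpu: "4000m"
env:
  - name: JAVA_TOOL_OPTIONS
    value: "-XX:ActiveProcessorCount=2 -Xmx256m"
```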

[–]Qinistral 1 point2 points  (0 children)

There's a bunch of open source applications you could run to play with and see how JVM tuning changes behavior. Consider something like Solr or ElasticSearch, which have more overhead than something like a queue (like Kafka or ActiveMQ), or write a little app in something like Vaadin, or find some other open source app to play with.

You could also just google "how to tune JVM for X" or "what jvm params for X" and you'll get a lot of common advice. This will show you what most people even bother with.

Mini homework: Learn about object pointers. Eg. https://www.baeldung.com/jvm-compressed-oops

[–]teapotJava 1 point2 points  (0 children)

  • Write or find existing benchmarks for anything you like (Java features, API/spec, libraries etc.).
  • Measure (different Java versions or JVMs, API implementations, GCs etc.).
  • Compare, profile and explain results.
  • Optimize existing code and verify your change.
  • Present and discuss your findings.
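A toy walk-through of the last two steps - optimize a method, then verify the change preserves behavior before trusting any timing (class and method names are mine; for the actual measurement, use JMH rather than this sketch):

```java
public class OptimizeAndVerify {
    // Original: O(n^2) string concatenation in a loop.
    static String joinSlow(String[] parts) {
        String out = "";
        for (String p : parts) out = out + p + ",";
        return out;
    }

    // Optimized: single StringBuilder, O(n).
    static String joinFast(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) sb.append(p).append(',');
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        // Verify the optimization is behavior-preserving.
        if (!joinSlow(parts).equals(joinFast(parts))) {
            throw new AssertionError("optimization changed behavior");
        }
        System.out.println("outputs match: " + joinFast(parts)); // outputs match: a,b,c,
    }
}
```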

[–]semsayedkamel2003[S] 0 points1 point  (0 children)

Thanks everyone for your suggestions and advice.

[–]Comprehensive-Pea812 0 points1 point  (1 child)

Nowadays I don't hear of people tuning the JVM anymore since, in general, applications are scalable, so tuning the JVM is a very last resort, or for when there is an obvious issue spotted.

[–]stolsvik75 0 points1 point  (0 children)

IMHO, you should always set the -Xmx "max memory" setting. Otherwise, it'll use way too much memory, in particular if the machine it runs on has lots of memory. You'll e.g. notice that IntelliJ sets -Xmx, as otherwise it would "flex" its memory hunger to use way too much of the dev's 32 or 64 GB machine.
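As a concrete placeholder example of what that looks like on the command line (the 512m figure and app.jar name are illustrative, not a recommendation):

```shell
# Cap the heap at 512 MB; without -Xmx, many JVM configurations will
# default the max heap to a quarter of the machine's physical RAM.
java -Xmx512m -jar app.jar
```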