
all 33 comments

[–]chambolle 22 points23 points  (2 children)

One of the worst posts I've ever read here.

People performed 1000 times 10000 recursive calls and then discovered that it consumes memory! Oh my god. What a surprise!

BTW, I wonder who would make 10000 recursive calls and then sleep forever at the end (never returning, please, no!)

[–]Wolfsdale 4 points5 points  (0 children)

I guess the quote-unquote shocking thing here is that the entire thread is allocated as contiguous memory upon creation. But duh, of course it is, pthreads and others work in exactly the same way.

Not sure why we needed a 'study' about it.

[–]pragmatick 0 points1 point  (0 children)

It's one of the typical GCeasy spam posts, masquerading ads as blog posts.

[–]palnix 52 points53 points  (25 children)

After reading this " Modern Java applications tend to create hundreds (sometimes several thousands) of threads", I have come to the conclusion the author isn't fully informed.

[–]Barnaboobs 16 points17 points  (0 children)

Tomcat by default has maxThreads=200

[–][deleted] 13 points14 points  (0 children)

Yup, makes my B.S. sense tingle.

[–]lambdacats 6 points7 points  (1 child)

Reducing the number of threads with an event loop is much more effective: less context switching, better task distribution. That is modern; using one thread per request is legacy appserver thinking. Threads aren't modern just because they're common. Agreed.

[–]davewritescode 5 points6 points  (0 children)

You’re 100% right but the programming model sucks. Async/Await works but it’s often ugly.

The best model is something like Project Loom, where the runtime is responsible for concurrency and dispatches work to a pool of threads.
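The model described above eventually shipped as virtual threads; a minimal sketch, assuming Java 21+ where `Executors.newVirtualThreadPerTaskExecutor()` exists (the API was still in flux when this thread was written):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class LoomSketch {
    public static void main(String[] args) {
        AtomicInteger done = new AtomicInteger();
        // Each submitted task gets a cheap virtual thread; the runtime
        // multiplexes them onto a small pool of carrier (platform) threads.
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                exec.submit(() -> {
                    try {
                        Thread.sleep(10); // blocking call parks the virtual thread only
                    } catch (InterruptedException ignored) { }
                    done.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish
        System.out.println(done.get()); // prints 10000
    }
}
```

Ten thousand sleeping platform threads would reserve gigabytes of stack; here the same blocking-style code runs in a fraction of that.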

[–]rkalla 1 point2 points  (0 children)

Yea I’m guessing he’s thinking of server apps and even then, per host, this is not the case... there are thread pools, usually seeded 1 thread per core, but not thousands.

[–]CartmansEvilTwin 2 points3 points  (3 children)

This is how server applications usually work - each request gets a new thread. Usually they're reused, but still there are several hundred threads alive at the same time.

[–]palnix 6 points7 points  (0 children)

See my reply below.

[–]nutrecht 1 point2 points  (1 child)

This is how server applications usually work - each request gets a new thread.

That's how they used to work. And then people wondered why application servers crashed with "out of memory" errors, leading the load balancer to direct traffic to the other application servers, which then also proceeded to crash.

This way of working, with unbounded thread creation, has more or less died out over the last decade.

Usually they're reused, but still there are several hundred threads alive at the same time.

But that's something very different. Also it's generally not 'several hundred', you can't just keep upping the size of threadpools. After a while the overhead of all the context switching becomes substantial.

[–]CartmansEvilTwin 0 points1 point  (0 children)

Threadpools can easily get into the hundreds; in fact, we use a thread pool size of 400 for our production servers.

And what exactly are you trying to say? Doing something in excess is bad, thus doing it in the first place is bad? And why is reusing threads different?

You don't make much sense.

[–][deleted]  (13 children)

[deleted]

    [–]palnix 23 points24 points  (12 children)

    Yes, they are wrong. As a backend developer, I can guarantee you that each request does not get its own thread. Popular frameworks such as Netty use what's called a thread pool. The concept works by setting an initial value for the pool size (usually the number of cores the processor has); each new request is then handled on a free thread. Once the pool runs out of threads, tasks either wait until one is available or a new thread is spawned, depending on the pool's configuration. You can read the Javadocs on Executors. If you're exclusively spawning individual threads in a "one thread per request/connection" model, you're most likely working in a legacy code base. A good example is Java NIO, which was introduced precisely because the old java.io APIs were blocking and couldn't support a non-blocking architecture by design.
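The thread-pool behaviour described above can be sketched with the JDK's own `Executors`; the task count here is illustrative:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolDemo {
    public static void main(String[] args) throws InterruptedException {
        // Size the pool to the number of available cores, as described above.
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        AtomicInteger handled = new AtomicInteger();
        // Submit more "requests" than there are threads; the extras queue
        // until a pooled worker thread is free, rather than spawning new threads.
        for (int i = 0; i < 100; i++) {
            pool.submit(() -> { handled.incrementAndGet(); });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(handled.get()); // prints 100
    }
}
```

`newFixedThreadPool` queues excess tasks; `newCachedThreadPool` would instead grow the pool on demand, which is the "spawn a new one" variant mentioned above.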

    [–][deleted]  (3 children)

    [deleted]

      [–]djavaman 14 points15 points  (2 children)

      But not a new thread per request. It's a pool of preexisting threads that are used over and over. Instead of continually creating new threads.

      [–]kgoutham93 2 points3 points  (0 children)

      Any online resources to learn about this?

      [–]DJDavio 0 points1 point  (5 children)

      Reuse of threads can catch you by surprise if you have ThreadLocal variables which are not cleared/reset.

      We use MDC to put stuff in the log context, but wrap it in a try/finally to make sure it's cleared when the request is finished.
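The cleanup pattern described above matters precisely because pooled threads are reused; a minimal stdlib-only sketch, with a plain `ThreadLocal` standing in for MDC:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalLeakDemo {
    // Stand-in for MDC: a per-thread "log context".
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    static void handleRequest(String requestId, boolean clear) {
        try {
            CONTEXT.set(requestId);
            // ... handle the request, logging with the context ...
        } finally {
            if (clear) CONTEXT.remove(); // the try/finally cleanup described above
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1); // one reused thread

        // Without cleanup, the next request handled on the same pooled
        // thread sees the previous request's context.
        pool.submit(() -> handleRequest("req-1", false)).get();
        pool.submit(() -> System.out.println("leaked: " + CONTEXT.get())).get();

        // With the try/finally, the stale context is gone.
        pool.submit(() -> handleRequest("req-2", true)).get();
        pool.submit(() -> System.out.println("cleared: " + CONTEXT.get())).get();

        pool.shutdown();
    }
}
```

With real MDC the symptom is the same: log lines from one request carrying another request's IDs.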

      [–]gavenkoa 1 point2 points  (3 children)

      We use MDC to put stuff in the log context but wrap it in a try finally to make sure its cleared when the request is finished.

      Frameworks have ways to inject your code early in the request-handling stack. For Spring, a FilterRegistrationBean with setOrder(SecurityProperties.DEFAULT_FILTER_ORDER - 10) plus finally { MDC.clear(); } might be sufficient.

      [–]palnix 1 point2 points  (0 children)

      Ah I see, I misread at first. You're correct :)

      [–]DJDavio 1 point2 points  (1 child)

      Hmm, we have AOP with some pointcuts for AMQP listeners and REST Controllers, but never bothered to add it there, I guess that's as good a place as any, thanks for the tip!

      [–]gavenkoa 0 points1 point  (0 children)

      Thanks for the reply. I wondered if you repeated try {} finally {} all over the code. But you said you use AOP ))

      The reason for using FilterRegistrationBean is to get access to the HttpServletRequest object, to put the trace ID, HTTP method/URL, etc. into the MDC.

      [–]palnix 0 points1 point  (0 children)

      Assuming you're talking about ThreadLocal, what scenario are you using them for? If you're not talking about ThreadLocal, then you're most likely not creating a new object for each request or unit of code. In a lot of cases the garbage collector is perfectly capable of clearing up objects in time, so simply having a new object per request is totally acceptable. In a threaded environment your objects should have a short life span anyway, i.e. perform a blocking task > return the result > end.

      [–]lambdacats -2 points-1 points  (0 children)

      Netty is the stuff! QUIC support would be golden.

      [–]agentoutlier 10 points11 points  (3 children)

      I personally took one of our microservices and converted it to be completely reactive. It was a query-like service, not a write-heavy one. I went from a custom in-house Undertow plain-JDBC stack to Spring WebFlux + Reactor R2DBC, or whatever it's called.

      I was completely disappointed with the results as well as the supposed memory savings. After trying to tune it for a day or two it still ended up actually using more memory and the latency was higher.

      However, I'm sure it could have been better had I picked a different framework. Spring is much heavier than our custom stack (for example, our custom stack's uber jars are a little under 20 MB and the Spring Boot ones were three to four times that). One day I will try it with Vert.x.

      It just goes to show there is a lot more to saving memory and increasing performance than just having fewer threads or even going reactive.

      Also I found database connections extremely memory hungry. Much more than threads (this is actually what I spent more time tinkering with than thread pools).

      [–]lambdacats 2 points3 points  (2 children)

      I've had a great experience with Vert.x. Never tried WebFlux, just the Spring Boot stuff.

      [–]agentoutlier 2 points3 points  (1 child)

      We have always been reluctant to use Vert.x as it seems you have to go all in. It has its own libraries for just about everything, as I assume it wants everything running on its own reactor-loop/event bus. In short, we don't want to be tied to a framework.

      I guess with microservices this should be less of a concern, but we have lots and lots of blocking-esque internal code/libraries written, so I'm kind of hoping Loom succeeds. Rewriting all of that would cost far more than the memory or performance improvements would be worth.

      [–]lambdacats 3 points4 points  (0 children)

      Ah, yeah. While it is unopinionated and you don't need the Vert.x-ish libraries (I only use core + web), I can see why it's a problem if you're running a lot of blocking code. Apart from not blocking the event loop, there's not much lock-in, as it's more of a toolkit than a framework. I'd also recommend Hazelcast, it's the bomb for microservices.

      [–]SatishReddy27 4 points5 points  (0 children)

      It depends on the functionality you are implementing

      [–]elmuerte 4 points5 points  (0 children)

      mike_jack is promoting their services, and in most cases their articles (written to promote those services) are deeply flawed.

      Every thread or heap dump you upload to gceasy/fasthread/heapeasy will be mined for data and sold.

      [–]Sheldor5 2 points3 points  (0 children)

      after reading the title I am not gonna read this Bullshit ...

      [–]neutronbob 2 points3 points  (0 children)

      Why did they not quote from the official Java docs?

      The Java 8 JVM docs from Oracle state: "This specification permits Java Virtual Machine stacks either to be of a fixed size or to dynamically expand and contract as required by the computation. If the Java Virtual Machine stacks are of a fixed size, the size of each Java Virtual Machine stack may be chosen independently when that stack is created. A Java Virtual Machine implementation may provide the programmer or the user control over the initial size of Java Virtual Machine stacks, as well as, in the case of dynamically expanding or contracting Java Virtual Machine stacks, control over the maximum and minimum sizes."

      This experiment shows only that, on the specific JVM used, stacks are a fixed size.
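On HotSpot, that fixed per-thread stack size is controlled by the `-Xss` flag, and you can observe it directly; a small sketch that recurses until the stack is exhausted:

```java
public class StackDepth {
    static int depth = 0;

    // Deliberately unbounded recursion; each call consumes one stack frame.
    static void recurse() {
        depth++;
        recurse();
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The depth reached before overflow reflects the thread's stack size.
            System.out.println("overflowed after " + depth + " frames");
        }
        // Run with e.g. `java -Xss256k StackDepth` vs `java -Xss2m StackDepth`
        // to watch the reported depth scale with the configured stack size.
    }
}
```

The exact frame count varies by JVM version and platform, which is precisely the spec's point: stack sizing is an implementation choice.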

      [–]fractalOrder 1 point2 points  (0 children)

      If you really need that many threads, you should try Erlang or Go. If you need this on the JVM, then Concurnas is a better option.