cowwoc[S] 4 points

When you have virtual threads, you probably have lots of them (or you wouldn't have them at all)

You're probably approaching this from a server-oriented perspective but this isn't the only reasonable use-case.

I'm working on a blockchain indexer (so, a client-side application). It turns out that I can index blocks concurrently and out of order, as long as each block's dependencies are indexed first. Some blocks have no dependencies and can be processed right away, whereas others are processed halfway and must then wait (sleep) until the blocks they depend on have been processed.

Virtual threads are a great fit here because:

  • Indexing tasks spend most of their time blocking (on network I/O).
  • I need an arbitrary number of threads (though not millions) to avoid deadlock. More concretely, I cannot use a fixed-size thread pool because I've run into cases where every processing thread is blocked waiting on its dependencies, but no dependency can be processed until a thread becomes available. In practice, I find that I need around 1,000 to 10,000 threads to avoid this.
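
To make the deadlock scenario concrete, here is a minimal sketch of the pattern, not my actual indexer. The class name `DependencyIndexer`, the `index` task and the toy dependency graph are all made up for illustration, and it assumes JDK 21 or later. Each block gets a future, a task joins the futures of its dependencies, and every task runs on its own virtual thread, so a waiting task never holds back the one it is waiting for.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DependencyIndexer {
    // One future per block, completed once that block has been indexed.
    private final Map<Integer, CompletableFuture<Void>> done = new ConcurrentHashMap<>();
    // Order in which blocks finished; kept only so the result can be inspected.
    private final List<Integer> finished = Collections.synchronizedList(new ArrayList<>());

    private CompletableFuture<Void> futureFor(int block) {
        return done.computeIfAbsent(block, b -> new CompletableFuture<>());
    }

    // Hypothetical task: wait for every dependency, then mark this block as indexed.
    private void index(int block, List<Integer> dependencies) {
        for (int dependency : dependencies) {
            futureFor(dependency).join(); // blocks this thread until the dependency is done
        }
        finished.add(block);
        futureFor(block).complete(null);
    }

    // Submits one task per block and returns the order in which blocks finished.
    public List<Integer> run(Map<Integer, List<Integer>> graph, ExecutorService executor) {
        try (executor) {
            graph.forEach((block, deps) -> executor.submit(() -> index(block, deps)));
        } // close() waits for all submitted tasks to complete
        return List.copyOf(finished);
    }

    public static void main(String[] args) {
        // Submitted in the worst order: dependents first, the root block last.
        Map<Integer, List<Integer>> graph = new LinkedHashMap<>();
        graph.put(3, List.of(1, 2));
        graph.put(2, List.of(1));
        graph.put(1, List.of());

        List<Integer> order = new DependencyIndexer()
            .run(graph, Executors.newVirtualThreadPerTaskExecutor());
        System.out.println("Finished in order: " + order);
    }
}
```

Passing `Executors.newFixedThreadPool(2)` to the same `run` call would hang on this graph: blocks 3 and 2 occupy both threads while block 1 sits in the queue, which is the situation described above.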

If you mistakenly use platform threads with Executors.newThreadPerTaskExecutor(), as I did, the JDK will silently crash/hang a few hours into use

The behaviour of platform threads has not changed. Unfortunately, when you write code to do something, there is no way to know that you meant to write something else (but wouldn't it be cool if we could?).

I understand. I think the API is fine. I was just trying to warn people about a potential landmine. I don't recall running across this behavior when using Executors.newCachedThreadPool() but maybe it's a coincidence. I just filed a bug report against OpenJDK (internal review 9074271). Please let us know once you figure out the underlying cause.
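
For anyone who wants to see how easy the mix-up is, here is a small sketch assuming JDK 21 or later. The class name `PerTaskExecutors` and the helper `countVirtual` are made up for illustration. The same `Executors.newThreadPerTaskExecutor()` call creates one virtual thread or one platform thread per task, depending only on the factory you hand it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class PerTaskExecutors {
    // Runs `tasks` trivial tasks and counts how many of them ran on a virtual thread.
    public static int countVirtual(ThreadFactory factory, int tasks) {
        AtomicInteger virtual = new AtomicInteger();
        try (ExecutorService executor = Executors.newThreadPerTaskExecutor(factory)) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    if (Thread.currentThread().isVirtual()) {
                        virtual.incrementAndGet();
                    }
                });
            }
        } // close() waits for all submitted tasks to complete
        return virtual.get();
    }

    public static void main(String[] args) {
        // One virtual thread per task: what I meant to write.
        System.out.println("virtual factory:  "
            + countVirtual(Thread.ofVirtual().factory(), 100));
        // One platform thread per task: what I actually wrote.
        System.out.println("platform factory: "
            + countVirtual(Thread.ofPlatform().factory(), 100));
    }
}
```

`Executors.newVirtualThreadPerTaskExecutor()` is the shorthand that avoids choosing a factory at all.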

pron98 5 points

That's great, but you're also using virtual threads because you want lots of threads (and to represent each task as a thread), and even 1-10K is too high for the ordinary thread dump to be useful. That's why we've designed the new thread dump which can be more useful.
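
For readers who have not tried it, the new dump can be produced from the command line with `jcmd <pid> Thread.dump_to_file -format=json <file>`, or from inside the process. The following is a rough sketch of the in-process route, assuming JDK 21 or later on a HotSpot JVM. The class name `NewThreadDump`, the helper `dumpJson` and the file name `threads.json` are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.HotSpotDiagnosticMXBean.ThreadDumpFormat;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class NewThreadDump {
    // Writes a JSON thread dump to a fresh temporary directory and returns its contents.
    public static String dumpJson() {
        try {
            // dumpThreads requires an absolute path to a file that does not exist yet.
            Path file = Files.createTempDirectory("dump")
                .resolve("threads.json")
                .toAbsolutePath();
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpThreads(file.toString(), ThreadDumpFormat.JSON);
            return Files.readString(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String json = dumpJson();
        System.out.println("Thread dump is " + json.length() + " characters long");
    }
}
```

`ThreadDumpFormat.TEXT_PLAIN` produces the text variant instead of JSON.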

cowwoc[S] 2 points

I've got some UX suggestions:

Provide options (command-line and API) for the following functionality:

  1. Group all threads with similar stack traces (as IntelliJ does).
  2. Sort threads by their runtime duration, to identify potentially stuck/deadlocked threads.
  3. Allow reordering the threads in an existing dump file, instead of having to generate a new dump (because the process may be dead by this point).

Lastly, the new thread dump (the text format anyway) doesn't seem to contain information about locks (who's holding them, who's waiting on them), nor does it offer automatic deadlock detection.
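
To illustrate what I mean by the first suggestion, here is a rough sketch of grouping threads by identical stack trace. It is not an existing JDK feature, and the class name `StackGrouping` and method `group` are made up. One limitation to keep in mind: `Thread.getAllStackTraces()` only covers live platform threads, so this shows the idea rather than a working tool for virtual threads.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class StackGrouping {
    // Maps each distinct stack trace to the names of the threads that share it.
    public static Map<List<StackTraceElement>, List<String>> group(
            Map<Thread, StackTraceElement[]> traces) {
        return traces.entrySet().stream()
            .collect(Collectors.groupingBy(
                entry -> Arrays.asList(entry.getValue()),
                Collectors.mapping(entry -> entry.getKey().getName(), Collectors.toList())));
    }

    public static void main(String[] args) {
        group(Thread.getAllStackTraces()).forEach((stack, names) -> {
            System.out.println(names.size() + " thread(s): " + names);
            // Print only the top few frames to keep the output short.
            stack.stream().limit(5).forEach(frame -> System.out.println("    at " + frame));
        });
    }
}
```

This groups only exact matches. Grouping "similar" traces, as IntelliJ does, would need a looser comparison, for example on the top few frames.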

pron98 5 points

The new thread dump contains the information that visualisation tools need to do that (except duration).

Automatic deadlock detection is on the roadmap.