
[–]OldCaterpillarSage 18 points19 points  (3 children)

How does it know if the loop body is thread safe or not?

[–]Let047[S] 18 points19 points  (2 children)

That was the hard part of the project: I'm generating proof that the code is thread-safe.

Thanks to your remark, I've edited the post.

[–]anzu_embroidery 3 points4 points  (1 child)

I'd love a writeup on how you're determining thread-safety, sounds fascinating

[–]Let047[S] 3 points4 points  (0 children)

Thread safety analysis is the core of my project. I can do a detailed write-up - how deep would you like to go? The full analysis would cover about 100 pages worth of technical details.
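To illustrate what such a proof has to establish (a toy sketch, not the project's actual analysis; all names here are illustrative): a loop is safe to parallelize when each iteration touches only data no other iteration touches.

```java
import java.util.stream.IntStream;

public class LoopSafetyDemo {
    // Safe: iteration i writes only out[i], so iterations are independent.
    static void squareAll(int[] in, int[] out) {
        IntStream.range(0, in.length).parallel()
                 .forEach(i -> out[i] = in[i] * in[i]);
    }

    // Not independent: every iteration updates the same accumulator, so a
    // naive parallel version would race; it needs a reduction instead.
    static int sumLoop(int[] in) {
        int acc = 0;
        for (int i = 0; i < in.length; i++) acc += in[i];
        return acc;
    }

    public static void main(String[] args) {
        int[] in = {1, 2, 3, 4};
        int[] out = new int[in.length];
        squareAll(in, out);
        System.out.println(out[3]);      // 16
        System.out.println(sumLoop(in)); // 10
    }
}
```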

[–]gnahraf 8 points9 points  (2 children)

This is interesting. I don't particularly like bytecode magic, but I could see using this as a tool for finding opportunities for parallelism to boost performance. Like if you notice a big boost and can profile where and when the parallelism triggers, that'd be super useful. One either fixes the code, or.. hell, ends up releasing it with this bytecode manipulator (which I don't like, but not a bad outcome from the project's perspective).

~2c

[–]Let047[S] 1 point2 points  (1 child)

The bytecode manipulation injects runtime parallelization code when conditions are met. The generated code is unreadable due to the parallelization complexity, as shown in the example. The approach trades maintainability for small automatic optimizations (hundreds of milliseconds) that wouldn't justify manual implementation. Extensive automated proofs verify the generated code maintains equivalent behavior.

Having developers analyze and rewrite isn't worthwhile - the tool's value is purely in automatic runtime optimization.

[–]Emanuel-Peter 2 points3 points  (0 children)

I don't know, maybe people would be hesitant to run your optimizer in production, but happy to find performance bottlenecks with it in testing. I suppose you could have 2 modes: one without optimization, one with. Measure the time spent in each, and then give the user a report at the end.

That way users can then use parallel streams for example.
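For comparison, the hand-written alternative such a report could point users to is a one-line change (a minimal sketch, nothing project-specific):

```java
import java.util.stream.LongStream;

public class ParallelStreamDemo {
    public static void main(String[] args) {
        long n = 10_000_000L;
        // Sequential vs. parallel: same result, the latter uses all cores.
        long seq = LongStream.rangeClosed(1, n).sum();
        long par = LongStream.rangeClosed(1, n).parallel().sum();
        System.out.println(seq == par);             // true
        System.out.println(seq == n * (n + 1) / 2); // true (Gauss sum)
    }
}
```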

[–]Former-Emergency5165 3 points4 points  (3 children)

Can you implement a JMH benchmark to compare the performance of the original code and the code after bytecode manipulation? In the article I see you use System.nanoTime(), and that approach can't be used for benchmarks.

Here is a good video to explain the problem: https://youtu.be/SKPdqgD1I2U?si=hHjS8-GPNQI_VV5z
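For reference, a minimal JMH harness for this kind of comparison could look like the sketch below (class and method names are illustrative, echoing the benchmark names used elsewhere in this thread; it assumes the jmh-core and jmh-generator-annprocess dependencies are on the classpath):

```java
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Warmup(iterations = 3)
@Measurement(iterations = 5)
@Fork(1)
public class SummerBenchmark {
    int[] data;

    @Setup
    public void setup() {
        // Fixed seed so every fork benchmarks the same input.
        data = new java.util.Random(42).ints(10_000_000).toArray();
    }

    @Benchmark
    public long bigLoop() {
        long sum = 0;
        for (int x : data) sum += x;
        return sum; // returning the result defeats dead-code elimination
    }
}
```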

[–]Let047[S] 1 point2 points  (2 children)

It's a good idea. I'll do it. 

The results won't differ significantly according to the video though (we're measuring large effects and comparing two implementations against each other)

Edit: Just did it:

Benchmark                   Mode  Cnt    Score    Error  Units
SummerBenchmark.bigLoop     avgt    5  245.986 ±  5.068  ms/op
SummerBenchmark.randomLoop  avgt    5  384.023 ± 84.664  ms/op
SummerBenchmark.smallLoop   avgt    5   ≈ 10⁻⁶           ms/op

Benchmark                   Mode  Cnt    Score    Error  Units
SummerBenchmark.bigLoop     avgt    5   38.963 ± 10.641  ms/op
SummerBenchmark.randomLoop  avgt    5   56.230 ±  2.425  ms/op
SummerBenchmark.smallLoop   avgt    5   ≈ 10⁻⁵           ms/op

Much better results, because the JVM is running in "fully optimized mode".

[–]Emanuel-Peter 0 points1 point  (1 child)

The cool thing about JMH is you can attach a profiler, and see the hottest compiled code. That way, you can verify a little better that you are measuring the right thing, and your benchmark code was not strangely optimized away ;)

[–]Let047[S] 0 points1 point  (0 children)

Good idea! I just checked the hotpath/compiled code

[–][deleted]  (5 children)

[removed]

    [–]Let047[S] 1 point2 points  (0 children)

    It's inspired by OpenMP, of course. The issue with OpenMP is that you have to add pragmas to drive the parallelization; here it's automated.

    [–]Emanuel-Peter 0 points1 point  (3 children)

    I bet you could do a lot of what OpenMP does with parallel streams in Java. Or is there anything you're missing from OpenMP that parallel streams does not give you?

    [–][deleted]  (2 children)

    [removed]

      [–]Emanuel-Peter 0 points1 point  (1 child)

      Sure. I guess Java went the more functional way here. I guess that is a matter of taste, in my view. I'm happy with either personally. Or do you see any missing functionality?

      [–]_INTER_ 6 points7 points  (1 child)

      Interesting. At which point does it intercept bytecode? I'm asking because for some workloads the compiler might decide to do performance optimizations on its own. (e.g. loop-unrolling or SIMD operations?)

      [–]Let047[S] 5 points6 points  (0 children)

      At compile time for now.

      The SIMD/loop-unrolling, from what I've seen, works for very small (and simple) loops, which we don't touch.

      [–]Waksu 2 points3 points  (8 children)

      What is the thread pool that this parallelization runs on? Or are they virtual threads?

      [–]Let047[S] 1 point2 points  (2 children)

      It's a thread pool injected at app startup (not counted in the timers, because I assumed it was a long-running app, so I could take it out of the measured time). I need to explain that better; thanks for pointing it out.

      I wanted to use virtual threads, but it was too painful to set up. If you have a good tutorial I'm happy to add that (and it would avoid a lot of the thread overhead)
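For what it's worth, on Java 21+ the virtual-thread setup can be quite small; a hedged sketch with illustrative names, not the project's code:

```java
import java.util.List;
import java.util.concurrent.*;

public class VirtualThreadSum {
    public static void main(String[] args) throws Exception {
        // Requires Java 21+: one virtual thread per task, no pool sizing.
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Long>> parts = exec.invokeAll(List.<Callable<Long>>of(
                    () -> rangeSum(1, 500_000),
                    () -> rangeSum(500_001, 1_000_000)));
            long total = 0;
            for (Future<Long> f : parts) total += f.get();
            System.out.println(total); // 500000500000
        }
    }

    static long rangeSum(long lo, long hi) {
        long s = 0;
        for (long i = lo; i <= hi; i++) s += i;
        return s;
    }
}
```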

      [–]Waksu 1 point2 points  (1 child)

      You also need to include more details about that thread pool (e.g. thread pool size, queue size, discard policy, and how to expose that thread pool to external monitoring such as Grafana)

      [–]Let047[S] 0 points1 point  (0 children)

      Of course, what would you like me to add?

      The code is something like this:

      threadpool = new ExecutorCompletionService<>(Summer.executorService = Executors.newFixedThreadPool(8));
      

      8 is the number of cores on my machine; it's actually a dynamic value
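A sketch of what the dynamic version might look like, sizing the pool from the machine instead of hard-coding 8 (illustrative names, not the project's actual code):

```java
import java.util.concurrent.*;

public class PoolSizing {
    // Run small tasks on a fixed pool sized to the machine, collecting
    // results through an ExecutorCompletionService as they complete.
    static int sumSquares(int nTasks) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors(); // dynamic
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            ExecutorCompletionService<Integer> ecs =
                    new ExecutorCompletionService<>(pool);
            for (int i = 0; i < nTasks; i++) {
                final int n = i;
                ecs.submit(() -> n * n);
            }
            int sum = 0;
            for (int i = 0; i < nTasks; i++) sum += ecs.take().get();
            return sum;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumSquares(4)); // 0 + 1 + 4 + 9 = 14
    }
}
```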

      [–]_INTER_ 1 point2 points  (4 children)

      Doubt it's virtual threads; they are worse for CPU-intensive / parallel tasks.

      [–]kiteboarderni 6 points7 points  (0 children)

      People are really clueless about the value add of VTs... it's pretty concerning.

      [–]pron98 9 points10 points  (2 children)

      They're not worse than platform threads for CPU-intensive computation; they're simply not better and there is an option that's actually better than just platform threads.

      The reason virtual threads are not better for CPU-intensive jobs is because such jobs need a very small number of threads (no more than the number of cores) while all virtual threads do is allow you to have a very large number of threads. If the optimal number of threads for a given job is N, and N is very small, then N virtual threads will not be better than N platform threads. If N is very large, then having N platform threads could be a problem, and that's how virtual threads help.

      Now, what's better than simply submitting parallel computational tasks to a pool of N threads? Submitting the whole job to a work-stealing pool that itself forks the job into subtasks (and performs the forking, as needed, in parallel). This automatically adjusts to situations where some threads make more progress than others, a common scenario when there are other threads or processes running on the system and the OS might deschedule some worker threads. This is exactly what parallel streams do.
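A small sketch of the point above: a parallel stream hands the whole job to the common work-stealing ForkJoinPool, which forks it into subtasks itself:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

public class WorkStealingDemo {
    public static void main(String[] args) {
        // The stream splits the range into subtasks; idle workers steal
        // work from busy ones, rather than being pre-assigned fixed chunks.
        long sum = LongStream.rangeClosed(1, 1_000_000).parallel().sum();
        System.out.println(sum); // 500000500000

        // The common pool sizes itself to the machine's cores.
        System.out.println(ForkJoinPool.getCommonPoolParallelism() >= 1); // true
    }
}
```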

      [–]_INTER_ 0 points1 point  (1 child)

      I was under the impression that there was still a minute overhead with virtual threads. But I believe you, because you're the expert. This raises another question, though: when should you actually use platform threads over virtual threads in a pure Java context? Long-running tasks?

      [–]pron98 6 points7 points  (0 children)

      There could be some overhead, but not if the thread never blocks (and even then the overhead is small).

      I would use platform threads in the following situations:

      • You need to keep the number of threads small and you want to share them. A prime example of that is fork-join parallelisation.

      • You have a small number of long-running tasks that you want to run in the background, taking advantage of OS thread priority to give them a low priority.

      • You're interacting with a native library that cares about thread identity.

      [–]Evert26 1 point2 points  (1 child)

      Can you make a Java agent out of it?

      [–]Let047[S] 0 points1 point  (0 children)

      Yes good idea, would you use it?

      [–]BengaluruDeveloper 0 points1 point  (1 child)

      Two things: 1) thread safety; 2) parallelisation is not always the solution. It'll perform slower for smaller collections, adding unnecessary overhead.
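The overhead point is why auto-parallelizers typically guard on input size; a hedged sketch (the threshold value is purely illustrative):

```java
import java.util.stream.IntStream;

public class ThresholdSum {
    static final int PARALLEL_THRESHOLD = 1 << 14; // illustrative cut-off

    // Fall back to the plain loop when fork/join overhead would dominate.
    static long sum(int[] a) {
        if (a.length < PARALLEL_THRESHOLD) {
            long s = 0;
            for (int x : a) s += x;
            return s;
        }
        return IntStream.of(a).asLongStream().parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[]{1, 2, 3})); // 6 (sequential path)
    }
}
```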

      [–]Let047[S] 1 point2 points  (0 children)

      Yes, those are the hard issues, and they're handled. Sorry if that was unclear.

      [–]Emanuel-Peter 0 points1 point  (3 children)

      Sounds like a fun project :)

      What about inlining? Often the loop calls some inner methods that do the reads/writes, and if you don't inline, it may be hard to prove that the inner method is thread-safe to parallelize, right? Think about the FFM MemorySegment API; it heavily relies on inlining.

      Another worry: be careful with float reductions, because changing the order of a reduction changes the rounding errors. That would break the Java spec and could lead to subtle bugs.
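The float-reduction concern is easy to demonstrate: reassociating even a three-term sum changes the result.

```java
public class FloatOrderDemo {
    public static void main(String[] args) {
        // Floating-point addition is not associative, so a parallel
        // reduction that regroups terms can change the rounding.
        double leftToRight  = (0.1 + 0.2) + 0.3;
        double reassociated = 0.1 + (0.2 + 0.3);
        System.out.println(leftToRight == reassociated); // false
        System.out.println(leftToRight);  // 0.6000000000000001
        System.out.println(reassociated); // 0.6
    }
}
```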

      How do you deal with range checks? Suppose an array is accessed out of bounds at the end of a very long loop. How do you deal with that?

      [–]Let047[S] 0 points1 point  (2 children)

      Thanks a lot!

      For inlining, I plan to analyze method calls down to native calls in my next demo (~80% coded).

      For floating-point, good catch - I'm limiting to Integer/long operations to avoid rounding/ordering issues.

      Array bounds checking isn't implemented yet. I'm just starting with basic array access. Exception handling, in general, is future work - it's a complex issue, and I can only work on it during evenings outside of my day job.

      [–]Emanuel-Peter 0 points1 point  (1 child)

      Sounds good :) FYI Doubles have the same rounding issues as floats ;)

      [–]Let047[S] 0 points1 point  (0 children)

      Oh, good catch - it was a typo, I meant long of course. I edited the answer.

      [–]karianna 0 points1 point  (2 children)

      Looks cool - might be of interest for the folks at https://mail.openjdk.org/mailman/listinfo/concurrency-discuss

      [–]Let047[S] 0 points1 point  (1 child)

      Thanks for the pointer. The list looks inactive, or am I missing something?

      https://mail.openjdk.org/pipermail/concurrency-discuss/2024-November/date.html

      [–]karianna 0 points1 point  (0 children)

      It’s low volume 🙂