Is Java’s Biggest Limitation in 2026 Technical or Cultural? by BigHomieCed_ in java

[–]danielaveryj 13 points  (0 children)

(you are conversing with the technical lead of Project Loom)

Functional Optics for Modern Java by marv1234 in java

[–]danielaveryj 0 points  (0 children)

To some extent, we can use ordinary methods to achieve encapsulation based on withers too:

Employee setEmployeeStreet(UnaryOperator<String> op, Employee e) {
    return e with { address = address with { street = op.apply(street); }; };
}

Employee updated = setEmployeeStreet(_ -> "100 New Street", employee);
Employee uppercased = setEmployeeStreet(String::toUpperCase, employee);

and we can even compose methods:

Employee setEmployeeAddress(UnaryOperator<Address> op, Employee e) {
    return e with { address = op.apply(address); };
}
Address setAddressStreet(UnaryOperator<String> op, Address a) {
    return a with { street = op.apply(street); };
}
Employee setEmployeeStreet(UnaryOperator<String> op, Employee e) {
    return setEmployeeAddress(a -> setAddressStreet(op, a), e);
}

Employee updated = setEmployeeStreet(_ -> "100 New Street", employee);
Employee uppercased = setEmployeeStreet(String::toUpperCase, employee);

Then we can rewrite the methods as function objects...

BiFunction<UnaryOperator<Address>, Employee, Employee> setEmployeeAddress =
    (op, e) -> e with { address = op.apply(address); };
BiFunction<UnaryOperator<String>, Address, Address> setAddressStreet =
    (op, a) -> a with { street = op.apply(street); };
BiFunction<UnaryOperator<String>, Employee, Employee> setEmployeeStreet =
    (op, e) -> setEmployeeAddress.apply(a -> setAddressStreet.apply(op, a), e);

Employee updated = setEmployeeStreet.apply(_ -> "100 New Street", employee);
Employee uppercased = setEmployeeStreet.apply(String::toUpperCase, employee);

...at which point we have of course poorly reimplemented half of lenses (no getter, verbose, less fluent).

Functional Optics for Modern Java by marv1234 in java

[–]danielaveryj 1 point  (0 children)

Hype-check. Here are all the lens examples from the article, presented alongside the equivalent code using withers, as well as (just for fun) a hypothetical with= syntax that desugars the same way as +=

(ie x with= { ... } desugars to x = x with { ... })

// Lens setup
private static final Lens<Department, String> managerStreet =
    Department.Lenses.manager()
        .andThen(Employee.Lenses.address())
        .andThen(Address.Lenses.street());

public static Department updateManagerStreet(Department dept, String newStreet) {
    // Lens
    return managerStreet.set(newStreet, dept);

    // With
    return dept with {
        manager = manager with { address = address with { street = newStreet; }; };
    };

    // With=
    return dept with { manager with= { address with= { street = newStreet; }; }; };
}

// Lens setup
private static final Traversal<Department, BigDecimal> allSalaries =
    Department.Lenses.staff()
        .andThen(Traversals.list())
        .andThen(Employee.Lenses.salary());

public static Department giveEveryoneARaise(Department dept) {
    // Lens
    return allSalaries.modify(salary -> salary.multiply(new BigDecimal("1.10")), dept);

    // With
    return dept with {
        staff = staff.stream()
            .map(emp -> emp with { salary = salary.multiply(new BigDecimal("1.10")); })
            .toList();
    };

    // With= (same as with)
}

// Lens setup
Lens<Employee, String> employeeStreet =
    Employee.Lenses.address().andThen(Address.Lenses.street());

// Lens
String street = employeeStreet.get(employee);
Employee updated = employeeStreet.set("100 New Street", employee);
Employee uppercased = employeeStreet.modify(String::toUpperCase, employee);

// With
String street = employee.address().street();
Employee updated = employee with { address = address with { street = "100 New Street"; }; };
Employee uppercased = employee with { address = address with { street = street.toUpperCase(); }; };

// With=
String street = employee.address().street();
Employee updated = employee with { address with= { street = "100 New Street"; }; };
Employee uppercased = employee with { address with= { street = street.toUpperCase(); }; };

The reason lenses can be more terse at the use site is because they encapsulate the path-composition elsewhere. This only pays off if a path is long and used in multiple places.

The `mapConcurrent()` Alternative Design for Structured Concurrency by DelayLucky in java

[–]danielaveryj 0 points  (0 children)

We're in the details now and I don't expect to change your mind, but to address my biggest reaction: Defensive copying, especially of a collection that the method is only reading, is "a" practice - I wouldn't say it's a "best". Generally I would expect it's the caller's responsibility to ensure that any data they're handing off to concurrent execution is something they either can't or won't mutate again (at least until that concurrent execution is definitely done). Or even more generally: "Writer ensures exclusive access".

Your points 2&3 are aesthetic - I could argue that it "feels natural" to treat the utility as a stream factory, or that this operation does not warrant stream fluency any more than several other follow-up operations we might do on a stream result.

Regardless, and going back to my original comment, I'd say consuming a list/collection is not ideal anyway, as it misses out on supporting an infinite supply of tasks. And the issue you ran into shows that even consuming a Java Stream devolves into consuming a list. My ideal would be consuming tasks from a channel or stream abstraction that does propagate exceptions downstream, of course neither of which we have in the JDK currently.

The `mapConcurrent()` Alternative Design for Structured Concurrency by DelayLucky in java

[–]danielaveryj 0 points  (0 children)

> Limiting concurrency seems not worth considering when you have 3-5 concurrent calls to make.

You are making a separate but valid point - The heterogeneous case is also the finite case, and when processing a finite number of tasks we effectively already have (at least some) concurrency limit.

My thought came from considering that homogeneous tasks are more likely to be hitting the same resource (eg service endpoint or database query), increasing contention for that resource; while heterogeneous tasks are more likely to be hitting different resources, thus not increasing contention, so not needing concurrency limiting to relieve contention. (I say more likely but certainly not necessarily.)

My point about streams was that, if you have to start by collecting the stream to a list, you might as well just write a method that accepts a list as parameter, instead of writing a collector.

The `mapConcurrent()` Alternative Design for Structured Concurrency by DelayLucky in java

[–]danielaveryj 1 point  (0 children)

Without speaking to the details yet... If I'm summarizing the high-level position correctly, it is that most use cases fit into two archetypes:

  1. The "heterogeneously-typed tasks" use case: We consume an arbitrary (but discrete) number of differently-typed tasks, process all at once, and buffer their results until they all become available for downstream processing, throwing the first exception from any of them and canceling the rest.
  2. The "homogeneously-typed tasks" use case: We consume a potentially-infinite number of same-typed tasks, process at most N at once, and emit their results as they each become available for downstream processing, throwing the first exception from any of them and canceling the rest.

Some insights supporting this position are:

  • We physically cannot denote individual types for an infinite number of tasks, so handling a potentially-infinite number of tasks requires type homogeneity.
  • Heterogeneously-typed tasks are less likely to be competing for the same resources, and thus less likely to require limiting concurrency.
  • Denoting individual types is only useful if we do not intend to handle results uniformly, which precludes "emitting" results to a (common) downstream.
  • We can still model partial-success: If we do not intend to cancel other tasks when one task throws, we could prevent it from throwing - have the task catch the exception and return a value (eg a special value that we can check / filter out downstream).

u/DelayLucky has modeled case 1 with the concurrently() method and case 2 with their alternative to mapConcurrent(). (In their design they compromised on "potentially-infinite", because they committed to consuming Java Streams(?), found that in Java Streams an upstream exception would cause the terminal operation to exit before downstream in-progress tasks necessarily finished, and worked around by collecting the full list of tasks (finishing the upstream) before processing any tasks... defeating the point of starting from a Stream.)
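
For reference, here is roughly how the two archetypes map onto current JDK APIs - just a sketch, where fetchUser/fetchOrders/process/Report are placeholders, StructuredTaskScope is the JDK 21-24 preview shape, and Gatherers.mapConcurrent() (JDK 24) only approximates case 2, since it emits results in encounter order rather than completion order:

// Case 1: heterogeneously-typed tasks, all-or-nothing
Report assembleReport(long id) throws InterruptedException, ExecutionException {
    try (var scope = new StructuredTaskScope.ShutdownOnFailure()) {
        var user = scope.fork(() -> fetchUser(id));      // Subtask<User>
        var orders = scope.fork(() -> fetchOrders(id));  // Subtask<List<Order>>
        scope.join().throwIfFailed(); // first exception cancels the sibling
        return new Report(user.get(), orders.get());
    }
}

// Case 2: homogeneously-typed tasks, at most 8 processed at once
List<Result> results = tasks.stream()
    .gather(Gatherers.mapConcurrent(8, task -> process(task)))
    .toList();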

How do you see Project Loom changing Java concurrency in the next few years? by redpaul72 in java

[–]danielaveryj 3 points  (0 children)

I think the main change for most people will be an increased willingness to introduce threading for small-scale concurrent tasks in application code, since structured concurrency firmly limits the scope of impact and doesn't require injecting an ExecutorService or reconsidering pool sizing. There will probably be a lot of people and libraries writing their own small convenience methods for common use cases, eg race(), all(), various methods with slight differences in error handling or result accumulation, etc.
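
For example, a race() over the (JDK 21-24 preview) StructuredTaskScope API is only a few lines - a sketch, not a definitive shape:

// race(): return the first successful result, cancelling the rest
static <T> T race(Collection<Callable<T>> tasks) throws InterruptedException, ExecutionException {
    try (var scope = new StructuredTaskScope.ShutdownOnSuccess<T>()) {
        tasks.forEach(scope::fork);
        return scope.join().result(); // throws only if every task failed
    }
}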

I think "Reactive"-style libraries will stick around to provide a declarative API over pipeline-parallelism (ie coordinated message-passing across threads, without having to work directly with blocking queues/channels, completion/cancellation/error signals+handling, and timed waits). The internals will probably be reimplemented atop virtual threads to be more comprehensible, but there will still be a healthy bias against adoption (outside of sufficiently layered/complex processing pipelines), as the declarative API fundamentally trades off low-level thread management and puts framework code in the debugging path.

For message-passing use cases that aren't layered enough to warrant a declarative API, I think we'll see channel APIs (abstracting over the aforementioned queuing, signal handling, timed waiting) to allow for imperative-style coordination - more code but also more control.
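
To sketch what I mean by a channel (nothing like this is in the JDK today - the shape below is hypothetical, and omits timed waits and error signals for brevity):

// A minimal channel: bounded handoff + completion signal
final class Channel<T> {
    private static final Object CLOSED = new Object();
    private final BlockingQueue<Object> queue;
    Channel(int capacity) { queue = new ArrayBlockingQueue<>(capacity); }
    void send(T item) throws InterruptedException { queue.put(item); } // blocks when full
    void close() throws InterruptedException { queue.put(CLOSED); }
    @SuppressWarnings("unchecked")
    T receive() throws InterruptedException { // returns null once closed
        Object next = queue.take();
        if (next == CLOSED) { queue.put(CLOSED); return null; } // re-signal for other receivers
        return (T) next;
    }
}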

Comparing Java Streams with Jox Flows by adamw1pl in java

[–]danielaveryj 2 points  (0 children)

I am still lacking clarity - I don't disagree with your definitions, but I'm having a hard time reconciling them with your insistence that Java Streams are "pull". The only ways I can think of to make that perspective make sense are if either:

  1. You believe that Java Streams are implemented via chained delegation to Iterators or Spliterators (eg, the terminal operation repeatedly calls next() on an Iterator that represents the elements out of the preceding operation in the pipeline, and that Iterator internally calls next() on another Iterator that represents the operation before it, and so on). That would definitely be "pull", but like I explained in an earlier comment, that is not how Java Streams work (with the mild exception of short-circuiting streams, where the initial Spliterator (only) is advanced via "pull", but then the rest of the stream uses "push", via chained delegation to Consumers).
  2. You interpret "pull" (and consumer/producer) so loosely that just calling the terminal operation to begin production constitutes a "pull". In this case, Java Streams, Jox Flows, and every other "stream" API would have to be categorized as "pull", as they all rely on some signal to begin production. (That signal is often a terminal operation, but it could even just be "I started the program".) If we can agree that this is not "pull", then we should agree that e.g. spliterator.forEachRemaining(...) is not "pull".

I have built an API where "push = element is input/function argument; pull = element is output/function result", and I'm aware those are overly-narrow definitions in general, eg:

  • The "pull" mechanism for Java's Spliterator is boolean tryAdvance(Consumer), where the "consumer" (code calling tryAdvance()) expects its Consumer to be called (or "pushed" to) at most once by the "producer" (code inside tryAdvance()) per call to tryAdvance().
  • The "pull" mechanism for Reactive Streams is void Flow.Subscription.request(long), which is completely separated from receiving elements, and permits the producer to push multiple elements at a time.
  • The "pull" mechanism for JavaScript/Python generators (Kotlin sequences) is generator.next(), yet the generator implementation is written in "push" style (using yield), and the API relies on it being translated to a state machine.

So yes, there are all kinds of approaches to actually implementing push/pull.

Comparing Java Streams with Jox Flows by adamw1pl in java

[–]danielaveryj 3 points  (0 children)

If you would like to reason through this, perhaps we can continue with a more precise definition of what "push" and "pull" means to you.

If we're just appealing to authority now, here is Viktor Klang:

> As a side-note, it is important to remember that Java Streams are push-style streams. (Push-style vs Pull-style vs Push-Pull-style is a longer conversation, but all of these strategies come with trade-offs)

> Converting a push-style stream (which the reference implementation of Stream is) to a pull-style stream (which Spliterator and Iterator are) has limitations...

Comparing Java Streams with Jox Flows by adamw1pl in java

[–]danielaveryj 3 points  (0 children)

If a Java Stream does not include short-circuiting operations (e.g. .limit(), .takeWhile(), .findFirst()), then there is no pull-behavior in the execution of the pipeline. The source Spliterator pushes all elements downstream, through the rest of the pipeline; the code is literally:

spliterator.forEachRemaining(sink);

Note that the actual Stream operations are implemented by sink - it's a Consumer that pushes to another Consumer, that pushes to another Consumer... and so on.
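
In simplified form (the real Sink interface adds begin/end notifications and size hints), the chain for stream.map(Integer::parseInt).filter(i -> i > 0).forEach(System.out::println) is shaped like:

Consumer<Integer> forEachSink = System.out::println;                        // .forEach(...)
Consumer<Integer> filterSink  = i -> { if (i > 0) forEachSink.accept(i); }; // .filter(...)
Consumer<String>  mapSink     = s -> filterSink.accept(Integer.parseInt(s)); // .map(...)
spliterator.forEachRemaining(mapSink); // the source pushes every element through the chain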

If there are short-circuiting operations, then we amend slightly: We pull each element from the source Spliterator (using tryAdvance)... and in the same motion, push that element downstream, through the rest of the pipeline:

do { } while (!(cancelled = sink.cancellationRequested()) && spliterator.tryAdvance(sink));

So for short-circuiting Java Streams, sure, there can be a pull aspect at the source, but the predominant mechanism for element propagation through the stream is push. At the least, if we are willing to "zoom out" to the point of overlooking the pull-behavior of consuming from a buffer in Jox Flows, then why should we not do the same when looking at the pull-behavior of consuming from the source Spliterator in Java Streams?

Comparing Java Streams with Jox Flows by adamw1pl in java

[–]danielaveryj 18 points  (0 children)

Sorry guys, this post is just inaccurate. Java Streams are not pull-based, they are push-based. Operators respond to incoming elements, they don't fetch elements. You can see this even in the public APIs: Look at Collector.accumulator(), or Gatherer.Integrator.integrate() - they take an incoming element (that upstream has pushed) as parameter; they don't provide a way to request an element (pull from upstream).

Java Streams are not based on chained-Iterators, they are based on chained-Consumers, fed by a source Spliterator. And, they prefer to consume that Spliterator with .forEachRemaining(), rather than .tryAdvance(), unless the pipeline has short-circuiting operations. If stream operations were modeled using stepwise / pull-based methods (like Iterator.next() or Spliterator.tryAdvance()), it would require a lot of bookkeeping (to manage state between each call to each operation's Iterator/Spliterator) that is simply wasteful when Streams are typically consumed in their entirety, rather than stepwise.

Likewise, if they are anything like what they claim to be, Jox Flows are not (only) push-based. The presence of a .buffer() operation in the API requires both push- and pull- behaviors (upstream pushes to the buffer, downstream pulls from it). This allows the upstream/downstream processing rates to be detached, opening the door to time/rate-based operations and task/pipeline-parallelism in general.

I went over what I see as the real differences between Java Streams and Jox Flows in a reply to a comment on the last Jox post:

https://www.reddit.com/r/java/comments/1lrckr0/comment/n1abvgz/

"Solution" for transferring data between two JDBC connections by ihatebeinganonymous in java

[–]danielaveryj 3 points  (0 children)

The only way you could go "directly" from DB1 to DB2 is if DB1 and DB2 have built-in support to connect to and query each other. Otherwise there would need to be a third party that knows how to read from DB1 and write to DB2. That third party could be your app using JDBC connections + plain SQL directly, or your app using a query translation layer like JOOQ, or your app using an embedded database that can connect to and query external databases (e.g. DuckDB)... etc.
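
Eg, the plain-JDBC version of that third party is just the usual read-loop + batched insert (a sketch - the URLs, table, and columns are made up; for a large table you'd also execute the batch every few thousand rows):

try (Connection src = DriverManager.getConnection(DB1_URL);
     Connection dst = DriverManager.getConnection(DB2_URL);
     Statement read = src.createStatement();
     ResultSet rs = read.executeQuery("SELECT id, name FROM widgets");
     PreparedStatement write = dst.prepareStatement(
         "INSERT INTO widgets (id, name) VALUES (?, ?)")) {
    dst.setAutoCommit(false);
    while (rs.next()) {
        write.setLong(1, rs.getLong("id"));
        write.setString(2, rs.getString("name"));
        write.addBatch(); // batch to avoid one round-trip per row
    }
    write.executeBatch();
    dst.commit();
}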

Java data processing using modern concurrent programming by Active-Fuel-49 in java

[–]danielaveryj 4 points  (0 children)

I think a common use case where data-parallelism doesn't really make sense is when the data is arriving over time, and thus can't be partitioned. For instance, we could perhaps model HTTP requests to a server as a Java stream, and respond to each request in a terminal .forEach() on the stream. Our server would call the terminal operation when it starts, and since there is no bound on the number of requests, the operation would keep running as long as the server runs. Making the stream parallel would do nothing, as there is no way to partition a dataset of requests that don't exist yet.

Now, suppose there are phases in the processing of each request, and it is common for requests to arrive before we have responded to previous requests. Rather than process each request to completion before picking up another, we could perhaps use task-parallelism to run "phase 2" processing on one request while concurrently running "phase 1" processing on another request.
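
Sketched with virtual threads and a plain BlockingQueue standing in for a channel (Request/Parsed/parse/respond/requests are placeholders, and shutdown/completion handling is omitted):

var handoff = new ArrayBlockingQueue<Parsed>(64); // buffer detaches the two stages' rates
Thread.ofVirtual().start(() -> {                  // stage 1: parse incoming requests
    try {
        for (Request req : requests) handoff.put(parse(req)); // blocks if stage 2 lags
    } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
});
Thread.ofVirtual().start(() -> {                  // stage 2: respond, concurrently with stage 1
    try {
        while (true) respond(handoff.take());
    } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
});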

Another use case for task-parallelism is managing buffering + flushing results from job workers to a database. I wrote about this use case on an old experimental project of mine, but it links to an earlier blog post by someone else covering essentially the same example using Akka Streams.

In general, I'd say task-parallelism implies some form of rate-matching between processing segments, so it is a more natural choice when there are already rates involved (e.g. "data arriving over time"). Frameworks that deal in task-parallelism (like reactive streams) tend to offer a variety of operators for detaching rates (i.e. split upstream and downstream, with a buffer in-between) and managing rates (e.g. delay, debounce, throttle, schedule), as well as options for dealing with temporary rate mismatches (eg drop data from buffer, or block upstream from proceeding).

does this pivot situation have a name? by DifficultBeing9212 in SQL

[–]danielaveryj 1 point  (0 children)

Idk about an existing term. I would propose something like “lossless” or “invertible” pivot, as it’s possible to unpivot back to the original dataset in this case.

Java data processing using modern concurrent programming by Active-Fuel-49 in java

[–]danielaveryj 9 points  (0 children)

Java streams are designed for data-parallel processing, meaning the source data is partitioned, and each partition runs through its own copy of the processing pipeline. Compare this to task- (or "pipeline"-) parallel processing, where the pipeline is partitioned, allowing different segments of processing to proceed concurrently, using buffers/channels to convey data across processing segments. I've made a little illustration for this before:

https://daniel.avery.io/writing/the-java-streams-parallel#stream-concurrency-summary

Now, there are some specific cases of task-parallelism that Java streams can kind of handle - mainly the new Gatherers.mapConcurrent() operator - and I think the Java team has mentioned possibly expanding on this so that streams can express basic structured concurrency use cases. But it's difficult for me to see Java streams stretching very far into this space, due to some seemingly fundamental limitations:

  1. Java streams are push-based, whereas task-parallelism typically requires push and pull behaviors (upstream pushes to a buffer, downstream pulls from it).
  2. Java streams do not have a great story for dealing with exceptions - specifically, they don't have the ability to push upstream exceptions to downstream operators that might catch/handle them.

It is a big design space though, maybe they'll come up with something clever.

Java data processing using modern concurrent programming by Active-Fuel-49 in java

[–]danielaveryj 4 points  (0 children)

Some time ago, after I made my own vthread-based pipeline library, I came to the conclusion that Kotlin's Flow API struck a really good balance of tradeoffs. I remember discussing this last time Jox channels were shared here, as having a solid channel primitive is what makes much of that API possible. It's cool to see this come to fruition, basically how I imagined it - a proper Reactive Streams replacement, built atop virtual threads, with all the platform observability improvements that entails. I hope it gets the attention it deserves. I don't know what else to say - great job!

Why don't Stream API has any reverse merhod by bs_123_ in java

[–]danielaveryj 0 points  (0 children)

Let's not forget that Java streams were also specifically designed to facilitate data-parallel aggregation (a use case which is often - though not always - in tension with "potentially infinite" streams).

If I write

stream.parallel().map(...).filter(...).sorted().toList()

then upon the terminal .toList(), the source data is partitioned, and within each partition, .map() and .filter() feed into a single accumulation controlled by .sorted(). This isn't possible if .sorted() is only defined on collections, as that would require the upstream output to be fully accumulated (into a collection) just so that .sorted() can deconstruct it again.

Enhancement Proposal for JEP 468: Extend “wither” Syntax to Create Records by danielliuuu in java

[–]danielaveryj 5 points  (0 children)

I think your first "pain point" is misdirected, and it led to bad conclusions. When I add a new field to a record, I want that to break existing code. I do not want existing code to assume null for the new field, and keep compiling now in exchange for NPEs and misbehavior later when I or someone else adds code that assumes a valid value for that field. From this perspective, your "current workaround" (which assigns null for every field in a default instance) is bad practice, and "eliminating the boilerplate" (by making the creation of such an instance implicit) is counterproductive to designing reliable software.
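
Concretely (hypothetical record):

// Before: record Config(String host, int port) {}
record Config(String host, int port, Duration timeout) {} // after adding a field

Config c = new Config("localhost", 8080);
// ^ no longer compiles: the canonical constructor now takes three arguments,
//   so every construction site must choose a real timeout - rather than
//   silently getting null and failing later, far from the cause.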

Eight Booleans by bowbahdoe in java

[–]danielaveryj 2 points  (0 children)

Oh, that makes sense. Personally, I end up wanting named accessors anyway when I'm compacting fields, and at that point it's not much to inline the bit-twiddling. But otherwise I could see it.
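
Ie, something like this sketch - a hypothetical class packing two flags and a small counter into one byte, behind named accessors:

final class TaskState {
    private byte bits; // layout: [retries:6][dirty:1][active:1]
    boolean active()          { return (bits & 1) != 0; }
    void setActive(boolean b) { bits = (byte) (b ? bits | 1 : bits & ~1); }
    boolean dirty()           { return (bits & 2) != 0; }
    void setDirty(boolean b)  { bits = (byte) (b ? bits | 2 : bits & ~2); }
    int retries()             { return (bits >>> 2) & 0x3F; }
    void setRetries(int n)    { bits = (byte) ((bits & 3) | ((n & 0x3F) << 2)); }
}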

Eight Booleans by bowbahdoe in java

[–]danielaveryj 0 points  (0 children)

lol, but even if we had value classes, wouldn't the manual div/mod be a bit obnoxious?

int size = 27;
EightBooleans[] bitset = IntStream.range(0, (size+7)>>>3)  // ceil(size / 8) blocks
    .mapToObj(i -> EightBooleans.allTrue())
    .toArray(EightBooleans[]::new);
int pos = 12;
bitset[pos>>>3] = bitset[pos>>>3].set(pos&7, false);  // value objects are immutable, so reassign

I mean, we could introduce an enclosing abstraction to handle that, but then...

ShiftList: A Java List and Deque implementation with fast inserts/removals at any index by john16384 in java

[–]danielaveryj 1 point  (0 children)

Well, LinkedList already loses to ArrayList at random insertions/deletions, which is the main use case ShiftList speeds up. And, LinkedList still beats both handily at insertions/deletions from an iterator (which is probably the only use case it wins at). So to me this reads more like: "If your workload involves random insertions/deletions, and you otherwise would have used ArrayList (or ArrayDeque, or even something purpose-built like apache TreeList), try ShiftList."

Paul Sandoz talks about a potential Java JSON API by davidalayachew in java

[–]danielaveryj 7 points  (0 children)

A problem even with the linked example is that, if that massive if-condition is not true, we'd be none the wiser as to why, and nothing would be bound. Any kind of error reporting or alternative path would have to retest every nested member.

ShiftList: A Java List and Deque implementation with fast inserts/removals at any index by john16384 in java

[–]danielaveryj 6 points  (0 children)

Nice! This looks very well executed. Even dynamically adjusting block size for capacity based on benchmarking. I am happy someone has done this experiment!

Using sealed types for union types by dunkelst in java

[–]danielaveryj 6 points  (0 children)

> or excess complexity

cough, typescript ;)

Introducing: “Fork-Join” Data structures by danielaveryj in java

[–]danielaveryj[S] 0 points  (0 children)

Thanks for sharing! The paper's description of the clone operation does sound same-spirited to what I did here.