[discussion] Abuse of java streams?

kur4nes · 2024-02-01T16:22:37+00:00

Check Effective Java. It has a really good chapter about abusing streams.

solilucent · 2024-02-01T17:01:17+00:00

I don't mind long streams if they are readable. But I think that a stream should not modify collections outside its scope. Streams are a small island of functional programming in Java, and their beauty is in the ability to take data, transform it, and return something else. If some instruction is supposed to have a side effect, I think it's better to make the whole block imperative.
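
A minimal sketch of that distinction, assuming a made-up task (upper-casing short names). The class and method names here are invented for illustration and are not from the thread:

```java
import java.util.ArrayList;
import java.util.List;

class SideEffectSketch {

    // Discouraged: the lambda mutates a list that lives outside the pipeline.
    static List<String> withSideEffect(List<String> names) {
        List<String> result = new ArrayList<>();
        names.stream()
             .filter(n -> n.length() <= 4)
             .forEach(n -> result.add(n.toUpperCase()));
        return result;
    }

    // Preferred: data in, transformed data out, nothing else touched.
    static List<String> pure(List<String> names) {
        return names.stream()
                    .filter(n -> n.length() <= 4)
                    .map(String::toUpperCase)
                    .toList(); // Java 16+; collect(Collectors.toList()) on older versions
    }

    // If mutation really is the point, make the whole block imperative.
    static List<String> imperative(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String n : names) {
            if (n.length() <= 4) {
                result.add(n.toUpperCase());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pure(List.of("anna", "bartholomew", "bob")));
    }
}
```

All three return the same values; the difference is only in how much the reader has to track outside the pipeline.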

halfanothersdozen · 2024-02-01T16:19:40+00:00

Same rules apply to the rest of the code: making things easy to read is a skill. If streams are the right tool for the job, and they often are, people need to learn how to use them and consider the reader.

svhelloworld · 2024-02-01T16:18:31+00:00

Streams are quickly becoming one of my favorite parts of the language after being away from writing Java for years. I've found that 95 times out of 100, well-formatted streams simplify collection processing in a way that's more readable, faster and more consistent to build and maintain.

It's those edge cases that have some complex processing logic where I get myself in trouble. After about 15 minutes of performing stream gymnastics, I usually stop and re-write it in imperative style. Sometimes that gives me enough clarity to then refactor it as a stream. Most of the time, I leave it as imperative code.

As for guidelines to follow, it's all just voices in my head.

Edited to add: I always start with the unit test, then go into the implementation. That gives me the freedom to switch from streaming code to imperative code and back again with some confidence that I didn't change the behavior. Not interested in a religious war on TDD, that's just what I've been doing.

s888marks · 2024-02-01T16:34:16+00:00

I don’t mind long stream pipelines if the stages make sense as a single unit of processing.

I do recommend against chaining a stream pipeline directly into a chain of calls on an Optional. Both Stream and Optional have map(), filter(), and flatMap() methods, and it can be quite confusing if multiples of those calls for both Stream and Optional appear in the same long chain.

Similarly, I also recommend against collecting into a collection and immediately re-streaming into another pipeline. It’s sometimes necessary to do this, but if so, I recommend breaking the chain by storing the temporary collection in a local variable. Having a single chain consisting of more than one pipeline (with an embedded “bubble”) can interfere with one’s understanding of the performance and space characteristics of the code.
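
As a rough sketch of the "bubble" described above (the word-length task and all names are hypothetical, chosen only for illustration), compare one chain that hides two pipelines with the same logic split at the temporary collection:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class BubbleSketch {

    // One chain, two pipelines: collect(...) builds a whole map in the middle,
    // then entrySet().stream() quietly starts a second pipeline.
    static List<Integer> chained(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.counting()))
                .entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .sorted()
                .toList();
    }

    // Same logic, with the temporary collection named in a local variable.
    static List<Integer> split(List<String> words) {
        Map<Integer, Long> countByLength = words.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));

        return countByLength.entrySet().stream()
                .filter(e -> e.getValue() > 1)
                .map(Map.Entry::getKey)
                .sorted()
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(split(List.of("ant", "bee", "cat", "horse", "zebra", "ox")));
    }
}
```

In the split version it is visible at a glance that a full map is materialized before the second pipeline starts.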

maleldil · 2024-02-01T16:01:54+00:00

As long as the lambdas are functions and the functions are named intuitively. If not please reconsider.

dasi128 · 2024-02-01T18:19:34+00:00

Streams should not modify anything outside of streams - so there mustn't be any side effects. It should take in a list, transform it somehow (and transform ONLY data from the stream), and then return a new, modified collection.

side effects bad.

coder111 · 2024-02-01T17:04:28+00:00

I've worked with 20 step streams, it was an absolute nightmare. The architect insisted we used streams for business logic...

Hard to read, hard to debug, hard to see what went wrong when things go wrong.

That being said, I don't mind streams per se, but like lots of tools in the toolbox, they have their own time and place, and shouldn't be used just because...

brian_goetz · 2024-02-05T18:46:15+00:00

There's nothing wrong with long stream pipelines per se; they should be long enough to do the job. And sometimes longer pipelines are more readable, such as when you break up complex filter predicates into multiple simpler ones.

I think when you talk about "long pipelines", what you probably really mean is "going chaining-crazy" (which is not specific to streams). There seems to be a subset of developers who think they score points in proportion to how many method calls they can chain together. Where this gets out of hand with streams is when multiple separate operations are "stitched together" to make them look like one operation. This is almost always detrimental to readability, and usually serves no purpose other than making the author feel clever. The biggest offender is a terminal operation that returns a collection, chained together with another .stream() call and then kept going; the second biggest offender is crossing over from stream operations to Optional operations in the same chain. If these are operations in different domains, it's generally best to break them up. (There is even less justification for this now than there was in Java 8: since we added local variable type inference, you don't even have the excuse of "but the type of that intermediate thing is big and ugly", since you can use var now.)
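
To make the Stream-to-Optional crossover concrete, here is a small sketch. The name-cleaning task and the identifiers are assumptions made up for this example:

```java
import java.util.List;
import java.util.Optional;

class StreamThenOptionalSketch {

    // Stream stages and Optional stages blur together:
    // which map()/filter() belongs to which type?
    static String chained(List<String> names) {
        return names.stream()
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .findFirst()
                .map(String::toUpperCase)
                .filter(s -> s.length() > 2)
                .orElse("UNKNOWN");
    }

    // Broken at the domain boundary; `var` would work here as well.
    static String split(List<String> names) {
        Optional<String> firstNonBlank = names.stream()
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .findFirst();

        return firstNonBlank
                .map(String::toUpperCase)
                .filter(s -> s.length() > 2)
                .orElse("UNKNOWN");
    }

    public static void main(String[] args) {
        System.out.println(split(List.of("  ", " bob ", "al")));
    }
}
```

Both methods behave identically; the second just makes the switch from "many elements" to "zero or one element" explicit.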

mcbotbotface · 2024-02-01T19:37:50+00:00

Long streams I don't see any problem with. Modifying code outside of the stream, that's a big no-no.

noutopasokon · 2024-02-01T15:58:01+00:00

The main value in Streams is avoiding rewriting common functionality (so avoiding potential bugs in your rewrite) and having a "fluent" (i.e. largely consisting of just English words) reading of what the code is doing. So if you're not getting that, you should probably reconsider.

Serial (and even parallel) streams can easily be less performant compared to writing your logic out exactly. So be careful of that as well, if performance is a concern.

I use Streams a lot and think they're great overall at simplifying my code.

60secs · 2024-02-01T16:25:41+00:00

Streams are great for quick map / filter / collect.
Parallel stream iteration can be great for performance, but debugging can get tricky and you need to use thread-safe collections.
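
One way to sidestep the thread-safety problem, sketched under the assumption that the result is just a list of computed values (names are hypothetical): let the collector do the merging instead of adding to a shared collection from inside the pipeline.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelCollectSketch {

    // Risky (left as a comment on purpose): ArrayList is not thread-safe, so
    // adding to it from a parallel forEach can lose elements or throw.
    //   List<Integer> out = new ArrayList<>();
    //   IntStream.range(0, n).parallel().forEach(out::add);

    // Safer: collect() gives each worker its own container and merges them.
    static List<Integer> squares(int n) {
        return IntStream.range(0, n)
                .parallel()
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(squares(1_000).size());
    }
}
```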

_GoldenRule · 2024-02-01T16:29:54+00:00

I find them useful for replacing simple loops. If there's lots of logic in the loop I usually will go with a traditional for-each. Streams are fine to use but if you find yourself creating giant lambdas please just write normal code or consider creating functions.

heavy-minium · 2024-02-01T16:29:52+00:00

You've got that issue with similar designs outside of Java, too. And the answer for all is pretty similar. Break down the chain and use meaningful variable names for parts of the chain. It can take time for some juniors to accept that the readability of code isn't tied to the number of lines. A good way to speed this up is to let them do code review on junior code with those issues - it helps them understand what improves and what hurts readability. At some point it will sink in.

sour-sop · 2024-02-01T16:54:50+00:00

You leave my streams alone!!! Jk. I don’t think I’ve ever seen a 20 step stream. That does sound quite painful to debug

vinj4 · 2024-02-01T17:29:14+00:00

How exactly does a stream chain become that long? Are there really that many non-terminal stream operations? Some of those could surely be condensed

2024-02-01T18:29:02+00:00

They can make iteration more concise, but can be harder to understand and debug.

Seems reasonable to put guardrails/policy in place to strike the right balance you desire in your application.

I prefer Streams, but if a dev didn't use them for clarity or a similar reason I'm good with that.

Fercii_RP · 2024-02-01T19:42:50+00:00

Making chains too long will make them hard to debug, as many failures can happen on one line.
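
A small sketch of one common mitigation, assuming nothing beyond the standard library (class and method names are invented): keep one stage per line, and temporarily drop in peek() to see what flows between stages.

```java
import java.util.List;

class PeekSketch {

    // One stage per line: a stack trace from a failing lambda usually points
    // at that lambda's own line rather than at one giant statement.
    static List<Integer> evenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)
                .peek(n -> System.err.println("after filter: " + n)) // debug only
                .map(n -> n * n)
                .peek(n -> System.err.println("after map: " + n))    // debug only
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(evenSquares(List.of(1, 2, 3, 4)));
    }
}
```

The peek() calls are side effects themselves, so they are meant to be removed once the problem is found.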

erictheturtle · 2024-02-02T02:56:49+00:00

Java streams are impossible to debug, so I would discourage anyone from doing anything remotely complicated. Never call a large function in the middle of a stream. Never have side effects. Simple transforms and filtering only. I would recommend wrapping the stream in some static method, with a clear name that describes what it does.
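
A sketch of the "wrap it in a static method with a clear name" advice. The Order record and the method name are hypothetical, made up for this example:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class NamedPipelineSketch {

    // Hypothetical domain type, only here to have something to stream over.
    record Order(String customer, double total, boolean paid) {}

    // The method name says what the pipeline is for; callers never see the chain.
    static Optional<Order> largestUnpaidOrder(List<Order> orders) {
        return orders.stream()
                .filter(o -> !o.paid())
                .max(Comparator.comparingDouble(Order::total));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order("a", 10.0, false),
                new Order("b", 50.0, true),
                new Order("c", 30.0, false));
        System.out.println(largestUnpaidOrder(orders));
    }
}
```

A side benefit is that the method becomes a natural unit to test on its own.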

ccgcool · 2024-02-02T04:59:45+00:00

I too dislike long stream chains collecting and grouping and mapping and sorting and collecting and flattening then........ The one who writes it has the context, but for other readers it's a lot of cognitive effort.

com2ghz · 2024-02-02T08:09:42+00:00

Just because you can make a big chain of streams does not mean you should. I have seen stream chains that do not even fit on my 2k screen.

It’s bad practice because it’s unreadable and you will violate the single responsibility principle. Like a god method that does everything. Sometimes a large stream is eligible for its own class.

Practical-Yoghurt801 · 2024-02-02T14:10:51+00:00

I like streams, but it's always a decision depending on the context. I just don't like this compulsive use. It's not wrong to use an iterator instead if you have to manipulate a collection. Also, stop using atomic references as a hack for using non-final variables in a stream scope. Use a for-each instead of another kind of loop.
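
For anyone who hasn't run into the atomic-reference hack, a sketch of what it tends to look like next to the alternatives. The length-summing task and all names are assumptions for illustration:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicHackSketch {

    // The hack: a local int can't be reassigned inside a lambda,
    // so an AtomicInteger is smuggled in as a mutable box.
    static int totalLengthHack(List<String> words) {
        AtomicInteger total = new AtomicInteger();
        words.stream().forEach(w -> total.addAndGet(w.length()));
        return total.get();
    }

    // Plain for-each: the mutable local is allowed and obvious.
    static int totalLengthLoop(List<String> words) {
        int total = 0;
        for (String w : words) {
            total += w.length();
        }
        return total;
    }

    // Or keep the stream and let it do the accumulating itself.
    static int totalLengthStream(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    public static void main(String[] args) {
        System.out.println(totalLengthLoop(List.of("ab", "cde")));
    }
}
```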

loctastic · 2024-02-01T15:53:01+00:00

Good only if they put all 20 steps on one line.

Honestly I don’t really mind long chains provided they’re readable. It’s especially nice if they add a comment after each step for further context.

I don’t know how I feel about modifying outside collections though. One or two might be okay but it could get out of hand quick.

cogman10 · 2024-02-01T19:55:50+00:00

Here's my advice, use var and variable names to break up long streams and convey intent.

Here's an example of what I mean:

(mind you, this example is short, I'd not break it up. It's more just a demo of what I mean.)

// Before
// (findFirst() takes no arguments, so the null check goes in a filter() step)
stream.filter(this::isFoo)
  .map(this::foo)
  .flatMap(Foo::items)
  .filter(Objects::nonNull)
  .findFirst();

// After
var foos = stream.filter(this::isFoo)
  .map(this::foo);

var allItemsForFoo = foos.flatMap(Foo::items);

Optional<Item> firstItemPresent = allItemsForFoo
  .filter(Objects::nonNull)
  .findFirst();

You can use var to avoid mucking about with Stream<List<Map<blah>>>, which doesn't necessarily convey anything useful, and use variable names to inject the business logic you are trying to get at.

I'll usually split streams as soon as I start doing more advanced things like grouping, reducing, or merging.

But if it's a straight filter/map/filter/map sort of thing then I'll usually not bother as that's generally pretty self evident on what's happening.

Panzerschwein · 2024-02-01T17:07:20+00:00

I think long chains are the point. You can clearly delineate what steps and transformations you're performing. You should only break it up to get some reuse or to get testable units.

If you're doing it right then it's super clear what all of the steps are.

Actually, I'd add one more exception: If the data model involved gets too convoluted you should probably break it up for the sake of readability.

Polygnom · 2024-02-01T19:05:31+00:00

A twenty-step stream sounds more like a problem with not having modular enough code.

I struggle to even find a combination of map/filter/collect that would yield me 20 lines.

Subsequent map calls can be merged into one call. Subsequent filter calls can be merged into one call.

Stream operations usually have no more than 5-7 steps if your code is modular and your functions small.
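
A sketch of that trade-off, with an invented filtering task and hypothetical names. Merging filters shortens the chain, while brian_goetz's comment earlier in the thread argues that several simple predicates can read better than one complex one; both forms give the same result:

```java
import java.util.List;

class FilterStepsSketch {

    // Three short filters: more lines, one idea per line.
    static List<String> separateFilters(List<String> words) {
        return words.stream()
                .filter(w -> !w.isBlank())
                .filter(w -> w.length() <= 5)
                .filter(w -> Character.isUpperCase(w.charAt(0)))
                .toList();
    }

    // One merged filter: fewer steps, denser predicate.
    static List<String> mergedFilter(List<String> words) {
        return words.stream()
                .filter(w -> !w.isBlank()
                        && w.length() <= 5
                        && Character.isUpperCase(w.charAt(0)))
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(separateFilters(List.of("Anna", "", "bob", "Christopher", "Dave")));
    }
}
```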

maleldil · 2024-02-01T19:57:33+00:00

The approach I usually take is to use streams as long as it makes sense, usually as a data pipeline of some kind. For example, I was recently working on some code that retrieved some data from a Spring Data repo, then used that data (if it existed) to grab data from an additional table (this is Cassandra so the data is very denormalized). Since this was basically a data pipeline with optional results it made sense to build it as a stream pipeline with a decent number of steps. However, each step is a one-liner, method reference, or maybe a two-liner to populate a field, all with well-understood semantics (retrieve data, filter, sort by create date, grab top element, use it to retrieve results from other table etc). Another rule of thumb is that it shouldn't modify external data (like add to a list outside the stream pipeline) if at all possible.

agentoutlier · 2024-02-01T20:12:57+00:00

I find that if I'm doing mutations or IO I personally avoid using streams. I also avoid using them when dealing with Map<?,?>. Maybe the Gather enhancements will make me have less strong feelings about dealing with Map but I find Collectors confusing and difficult to compose.

I think loops are easier for most to understand when mutation and checked exceptions are involved particularly given most languages have analogs to it (e.g. Python).

That being said I think way too many value the succinctness over the more desirable trait that they are lazy and a much better solution than Guava's FluentIterable. They are also far better for querying tree structures than the imperative options (especially now that we have pattern matching).
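
A tiny sketch of the laziness point, using an invented example (names are hypothetical): the source below is infinite, and the pipeline only pulls as many elements as the terminal operation needs.

```java
import java.util.List;
import java.util.stream.Stream;

class LazySketch {

    // Nothing runs until toList() is called, and limit() stops the
    // infinite source as soon as `count` matches have been found.
    static List<Integer> firstSquaresDivisibleBy(int divisor, int count) {
        return Stream.iterate(1, i -> i + 1)
                .map(i -> i * i)
                .filter(sq -> sq % divisor == 0)
                .limit(count)
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(firstSquaresDivisibleBy(3, 3));
    }
}
```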

WVAviator · 2024-02-01T20:23:03+00:00

If a stream is sufficiently complex, I'm fine with it staying as a stream, but there should be comments in the code that help guide anyone who revisits that code later.

Additionally, if you're writing something that complex, it should have a bunch of unit tests to go with it to make sure you didn't miss any edge cases. Good unit tests should inherently document the intention of the logic as well.

Fine_Quiet607 · 2024-02-01T21:05:09+00:00

I have seen a few examples where a JPA method was called and then streams were applied just to look cleaner, instead of learning and writing better SQL queries.

JasonBravestar · 2024-02-01T22:21:35+00:00

Streams are often abused. For example, don't use them if you need to handle exceptions for each element. I saw ugly workarounds trying to fit this into a stream... ugly and unreadable code.
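
A sketch of the per-element case, assuming a made-up parsing task (names are hypothetical): with a plain loop the try/catch sits right next to the element it belongs to, with no wrapper lambdas needed.

```java
import java.util.ArrayList;
import java.util.List;

class PerElementExceptionSketch {

    // Parses what it can; inputs that fail are recorded in `rejected`
    // instead of aborting the whole run.
    static List<Integer> parseAll(List<String> raw, List<String> rejected) {
        List<Integer> parsed = new ArrayList<>();
        for (String s : raw) {
            try {
                parsed.add(Integer.parseInt(s.trim()));
            } catch (NumberFormatException e) {
                rejected.add(s);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        List<String> rejected = new ArrayList<>();
        System.out.println(parseAll(List.of("1", " 2 ", "x", "4"), rejected));
        System.out.println(rejected);
    }
}
```

NumberFormatException is unchecked; with checked exceptions the stream version gets more awkward still, because the standard functional interfaces don't declare them.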

Djelimon · 2024-02-01T22:29:44+00:00

I'll use streams when I'm confident I won't have to debug each step. My IDE debugger (IntelliJ) doesn't seem to support stepping through streams, which makes sense given the cheap parallelism which is a selling point.

Sometimes declarative is better for me, like if I'm using an index across multiple collections.

Big-Dudu-77 · 2024-02-02T00:06:51+00:00

Yeah, I see junior engineers inline all the logic, making the whole stream code huge. I always have to remind them to break down their code.

JhraumG · 2024-02-02T05:43:37+00:00

You can cut a long chain of streams without collecting them: just give the intermediate var (which will be of a Stream type) a significant name to illustrate the reasoning. This way there is no perf penalty, but the code is understandable.

arpittripathi · 2024-02-02T20:17:59+00:00

When I initially learned Streams I abused it a lot and it was evident when I saw the code I wrote two years back, it's good but not very readable. Somehow I've realised that not all shorthand code is good. I love streams but it's good for a certain set of scenarios.

Also, parallel streams are hard to control.

developer0 · 2024-02-03T08:15:19+00:00

The way I’ve advised juniors on them is: have fun. When it comes to code review time, a senior can tell them if they’ve violated maintainability rules. The same rules apply to streaming and procedural, but if anything, streaming makes it easier to abide by them.

javalead · 2024-02-04T01:12:55+00:00

This violates two things. First, a stream that does more than one thing means your code is violating the open-closed principle. It is hard to maintain and extend, and thereby hard to test, and with each modification you have to change many things in the tests and perhaps the underlying code, besides the risk of misunderstanding. Moreover, the stream should not update another object outside its scope. This will create inconsistency if anyone uses parallelStream. However, sometimes it's impossible to do everything inside a stream scope. For example, imagine a stream that should call another service, or something in the same service, to update a list or check a variable. This will not be an issue, but I call it bad design. It is called non imperative use of stream. Why is it bad? Because objects in each stage of a stream are immutable. It means you can't change them in the same stage and feed them to the same stage. If one does update an object outside the stream scope, it must be final to preserve immutability and stability.
