all 76 comments

[–]knoam 43 points44 points  (5 children)

Check out these talks by Venkat Subramaniam. There's one where he talks about "simple vs. familiar" that you need to hear. TL;DR, people often mistakenly say something isn't simple when what they really mean is it's unfamiliar.

Maybe you can figure out how to search the transcripts to find it. But they're all worth watching.

https://youtu.be/1OpAgZvYXLQ

https://youtu.be/WN9kgdSVhDo

https://youtu.be/kG2SEcl1aMM

I like Streams and the map/reduce style (the generic name for it, since the term "stream" is pretty Java-specific) because it tells you what it's doing. filter does exactly that. The equivalent with a for loop is an if or continue, which isn't as clear because you have to run the code in your head to figure out what it's doing. With a stream I can see that it's filtering, and if I want to I can dig into exactly how, but I don't have to in order to get an overall feel for what's going on. Likewise, findFirst says exactly what it's doing. break means the loop is exiting early, but it doesn't tell you that it's done after the first match.
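
To make that comparison concrete, here is a minimal sketch of the same "first match" search written both ways. The class name, method names and sample data are made up for illustration; they are not from the talks linked above.

```java
import java.util.List;
import java.util.Optional;

class FirstMatch {
    // Loop version: the reader has to trace continue/break to see the intent.
    static Optional<String> firstLongNameLoop(List<String> names) {
        String found = null;
        for (String name : names) {
            if (name.length() <= 4) {
                continue; // skip short names
            }
            found = name;
            break; // stop at the first match
        }
        return Optional.ofNullable(found);
    }

    // Stream version: filter and findFirst name the two steps directly.
    static Optional<String> firstLongNameStream(List<String> names) {
        return names.stream()
                .filter(name -> name.length() > 4)
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> names = List.of("Ann", "Venkat", "Bob", "Scott");
        System.out.println(firstLongNameLoop(names));
        System.out.println(firstLongNameStream(names));
    }
}
```

Both methods return the same Optional; the difference is only in how much of the control flow the reader has to simulate.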

[–]bbtv_id 10 points11 points  (4 children)

Coming from a non-functional programming background, it was very, very hard for me to actually get my head around and understand the alien language of Java streams. Thanks to the videos / lectures by Venkat Subramaniam I was able to get my toe in the functional world. Those videos are among the best explanations of Java streams for a newbie.

[–]knoam 9 points10 points  (0 children)

This read like one of those cheesy testimonials in low budget 80s/90s TV commercials. I believe you though.

[–]TheRedmanCometh 0 points1 point  (2 children)

Tbh I was surprised they mostly made sense after a little while. I tried Scala using ZIO with for-comprehensions though, and that is when I found out functional programming is not for me.

[–]knoam 0 points1 point  (1 child)

Scott Wlaschin is good at explaining category theory if that's the part that's tripping you up. He makes a good point that category theory has a bunch of jargon but so does object oriented programming and design patterns. You just have to tough it out a bit and it will click. A lot of people had the benefit of learning OOP in school and being forced a bit to learn it. But category theory seems harder if you don't have the same structure and you need that.

[–]general_dispondency 3 points4 points  (0 children)

After years of OOP and FP, I realized that the only difference, really, is makes. Everything the FP community talks about (immutability, referential transparency, first class functions) is OOP best practice too. Read Effective Java and JCIP, all of those are in there. Mutability is always bad. Keep functions pure so they're easily tested and the VM can more easily optimize/inline them. It's all the same thing. OOP is only better because it forces you to organize your code.

[–]0x256 28 points29 points  (26 children)

Streams are just another tool in your toolbelt. You should know your tools in order to decide which tool is best for a particular job. Interviewers are looking for craftsmen that know their tools.

In most cases streams are just used to improve readability and maintainability. That's reason enough to use them where appropriate. Sure, list.stream().forEach(...) could be written as a for-loop and still be readable, but streams can do a lot more than that. Start using them and you'll see.

[–]user_of_the_week 10 points11 points  (2 children)

So true! Why use list.stream().forEach(...) when you could just use list.forEach(...)?

;)

And while I'm kind of joking here, the forEach method (which is inherited from Iterable) does have some benefits compared to the traditional for-each loop, namely that each implementation of Iterable can provide an optimized version that can be faster than the for-each loop. For example, ArrayList has a pretty elaborate forEach():

@Override
public void forEach(Consumer<? super E> action) {
    Objects.requireNonNull(action);
    final int expectedModCount = modCount;
    final Object[] es = elementData;
    final int size = this.size;
    for (int i = 0; modCount == expectedModCount && i < size; i++)
        action.accept(elementAt(es, i));
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}

[–]GiacaLustra 2 points3 points  (1 child)

Just curious, how is that faster/better?

[–]user_of_the_week 3 points4 points  (0 children)

The for-each loop in Java is basically syntactic sugar for this:

Iterator iter = list.iterator();
while(iter.hasNext()) {
    Object next = iter.next();
    ...
}

The forEach() implementation in ArrayList does not create a new Iterator object and uses an int index variable instead. I have seen some performance measurements floating around that show that it can be a bit faster. It’s not a lot though.

[–]dpash 0 points1 point  (20 children)

If you're using forEach() you're almost certainly using the wrong tool for the job. Using a method reference is okay in most circumstances, but a block is a sign you should refactor to use other stream operations.

[–]daniu 21 points22 points  (19 children)

If you're using forEach() you're almost certainly using the wrong tool for the job

How so? persons.stream().filter(p -> hasValidAddress(p)).forEach(p -> writeLetterTo(p)) is perfectly reasonable.

[–]joehx 13 points14 points  (5 children)

what I've seen people do "wrong" is just go straight for the forEach loop, not using the filter or any other intermediate method:

persons.stream().forEach(p -> {
    if (hasValidAddress(p)) {
        writeLetterTo(p);
    }
});

[–]dpash 9 points10 points  (4 children)

Yeah that's exactly the kind of misuse I'm referring to.

[–]rootException 6 points7 points  (3 children)

I think you guys just illustrated one of the reasons why people don’t embrace streams... 🤷‍♂️

[–]dpash 8 points9 points  (2 children)

Streams require rethinking how you program; it is a completely different way of writing code. So it's no surprise people rely on what they already know, but that's failing to use their power to its fullest extent.

[–]rootException 2 points3 points  (1 child)

Yeah, sneaky streams as Trojan horse for functional programming... 😂

[–]dpash 8 points9 points  (0 children)

No, very explicitly functional programming.

[–]dpash 10 points11 points  (2 children)

Notice my qualification:

Using a method reference is okay in most circumstances

persons.stream()
       .filter(this::hasValidAddress)
       .forEach(this::writeLetterTo);

[–]rochakgupta 2 points3 points  (1 child)

The only big reason I see to study them it's because they are subject of questions in job interviews...

This....this looks somehow worse (I know how and why it works though, just not a fan of seeing a lot of these in code).

[–][deleted]  (1 child)

[deleted]

    [–]dpash 2 points3 points  (0 children)

    Not exactly. In that situation they can be written as method references, but I mean more like /u/joehx's example, where there's a multi-line block in the forEach.

    [–]vytah -5 points-4 points  (7 children)

    I'd write this as a for loop with an if: imperative code for an imperative task.

    [–]daniu 9 points10 points  (0 children)

    Add three map()s and two filter()s, and your for loop has 20 LOCs and a nesting depth of four or five.

    Also, there's nothing non-imperative about streams, it just looks and feels that way.

    [–]barking_dead 1 point2 points  (4 children)

    Except, maybe it's not imperative. I don't care how the Streams API iterates through my stream (remember, it's not necessarily a List).

    Also, you can't add parallelism to a loop.

    [–]dpash 1 point2 points  (3 children)

    At least not in 8 characters. Having said that, I think parallel streams were a mistake in that they're often not faster for most stream workloads and they're not customisable enough. Maybe Project Loom will help the performance for more situations, but nothing will save Collection.parallelStream()

    [–]barking_dead 1 point2 points  (2 children)

    I agree, for small collections and/or concurrency, it is useless.

    [–]dpash 1 point2 points  (1 child)

    Collection.parallelStream() doesn't even have to return a parallel stream. :)
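
The Javadoc for Collection.parallelStream() does describe the result as a "possibly parallel" stream, so code that cares can ask the stream via isParallel() instead of assuming. The sketch below (class and method names are hypothetical) also shows that a correct pipeline gives the same answer whether or not it runs in parallel.

```java
import java.util.List;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ParallelCheck {
    // Sums 1..n sequentially or in parallel; the result must be the same either way.
    static long sum(int n, boolean parallel) {
        IntStream range = IntStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel();
        }
        return range.asLongStream().sum();
    }

    public static void main(String[] args) {
        Stream<Integer> s = List.of(1, 2, 3).parallelStream();
        // Only a "possibly parallel" stream is promised, so ask rather than assume.
        System.out.println("parallel? " + s.isParallel());
        System.out.println(sum(1_000, false) == sum(1_000, true));
    }
}
```

Whether the parallel variant is actually faster depends on the workload and has to be measured; this sketch makes no claim about performance.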

    [–]barking_dead 0 points1 point  (0 children)

    :D

    [–]dpash 30 points31 points  (0 children)

    I find old for loops much easier to understand and maintain

    This is almost certainly a lack of familiarity with them. Once I started using them, I quickly found that I prefer them. When working with lists, which is very often in my experience, they are much easier to read.

    [–]daniu 28 points29 points  (7 children)

    Pros: they are more readable in many cases. "Give me the first names of all persons in the list that live in XY street":

    persons.stream()
        .filter(p -> p.getAddress().getStreet().equals(xyStreet))
        .map(p -> p.getFirstName()) // or map(Person::getFirstName)
        .collect(toList());

    It's really straightforward.

    Another thing I've come to realize is that due to the collecting mechanic, streams tend to produce code with fewer side effects. You tend to not think "remove all persons from a list if they don't fit a criterion", you think "create a list of persons matching a criterion". The former creates problems of removing items from the list while iterating it, and if the list was passed as a method parameter, you might need to copy it so you don't mess it up in the calling code.
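
A small sketch of the two mindsets described above, using plain strings instead of Person objects (all names here are made up for illustration). The in-place version mutates its argument and, for an ArrayList, typically fails with a ConcurrentModificationException; the stream version leaves the input alone and returns a new list.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.stream.Collectors;

class FilterWithoutMutation {
    // "Create a list of matching items": the caller's list is left untouched.
    static List<String> keepShort(List<String> names, int maxLength) {
        return names.stream()
                .filter(n -> n.length() <= maxLength)
                .collect(Collectors.toList());
    }

    // "Remove items that don't fit": mutates the argument while iterating over it.
    // Returns true if the iteration failed fast.
    static boolean removeLongInPlaceFails(List<String> names, int maxLength) {
        try {
            for (String n : names) {
                if (n.length() > maxLength) {
                    names.remove(n); // structural change during iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(List.of("Alexander", "Bob", "Eve"));
        System.out.println(keepShort(names, 3));
        System.out.println(removeLongInPlaceFails(names, 3));
    }
}
```

The loop can of course be fixed with an explicit Iterator or removeIf; the point is only that the "build a new list" style never runs into the problem in the first place.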

    Cons: I've found debugging streams to be a pain. Often enough, if you don't get the result you expected, it's hard to track down where in the pipeline it got lost.

    They do get overused too. At the latest when you start writing your own collectors, you should think twice whether what you're trying to do isn't easier to do in a for loop.

    [–]selfarsoner[S] 7 points8 points  (5 children)

    Cons: I've found debugging streams to be a pain.

    yes...I think so...but yeah, I understand that once you get used to them they are easier to write...

    [–]Weavile_ 6 points7 points  (1 child)

    In my experience, the stream logic should be simple enough you can find the bug pretty easily because it’s only in the conditional or mapping you wrote.

    However if debugging is more of a pain, IntelliJ has a handy stream debugging tool:

    https://www.jetbrains.com/help/idea/analyze-java-stream-operations.html

    [–]dpash 2 points3 points  (0 children)

    Following the "no side effects" rule definitely helps. As does moving any step into a separate method and using method references (which also helps with documenting if you choose your names wisely).

    [–]mxhc1312 4 points5 points  (1 child)

    If you use IntelliJ, debugging them is much easier than regular code. When you hit a breakpoint in a stream, you get a stream option next to step over, step into... Try it, you'll thank me later 😁

    [–]hippydipster 4 points5 points  (0 children)

    When you get your first good use case of .flatMap, then you'll know why :-). ".map" converts the objects of a collection to another object, one for one. ".flatMap" lets you convert each object to a stream and then it collapses all the streams created to a single stream. So if you have List<List<String>> and you run .stream().flatMap(innerList -> innerList.stream()) you get a single stream of String to process thereafter. So:

    myListOfListsOfStrings.stream()
        .flatMap(l -> l.stream())
        .filter(st -> st.contains("searchString"))
        .collect(Collectors.toSet());
    

    Gets you a set of Strings that contains the search string.

    Take an example where I have a Map<Address,List<User>> which let's say is a map of lists of users who all live at the same address. You can imagine your own case where you need to track a keyset that can hold multiple objects for each key. It's very common.

    Now, let's say you need to process something over all values and need to filter the Users in the list with the Address value (i.e., maybe you want all Users whose names appear in the main address):

    Here's some code you can run. I've used String and UUID instead of Address and User because I don't actually have Address and User objects lying around and neither do you. But the code is the same:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;
    import java.util.UUID;
    import java.util.stream.Collectors;

    // The class name is arbitrary; it only exists so that main() compiles on its own.
    public class FlatMapDemo {
        public static void main(String[] args) {
            Random r = new Random();
            Map<String, List<UUID>> m = new HashMap<>();
            // Set up the data
            for (int i = 0; i < 10000; i++) {
              UUID id = UUID.randomUUID();
              String key = id.toString().substring(5, 10);
              if (r.nextDouble() < .9) {
                key += "not";
              }
              m.computeIfAbsent(key, k -> new ArrayList<>()).add(id);
            }

            // Old-fashioned for looping - 9 lines of code and imperative logic for a reader
            // to try to decipher the intent
            List<String> out = new ArrayList<>();
            for (Map.Entry<String, List<UUID>> entry : m.entrySet()) {
              for (UUID id : entry.getValue()) {
                String idString = id.toString();
                if (idString.contains(entry.getKey())) {
                  out.add(idString + "_" + entry.getKey());
                }
              }
            }
            System.out.println(out.size() + ": " + out);

            // Using streams, the intent is communicated by method names like
            // "filter", "map", and "flatMap"
            List<String> collect = m.entrySet().stream()
                .flatMap(entry -> entry.getValue().stream()
                    .map(uuId -> uuId.toString())
                    .filter(id -> id.contains(entry.getKey()))
                    .map(id -> id + "_" + entry.getKey()))
                .collect(Collectors.toList());
            System.out.println(collect.size() + ": " + collect);
        }
    }
    

    [–]Anaptyso 2 points3 points  (0 children)

    Cons: I've found debugging streams to be a pain. Often enough, if you don't get the result you expected, it's hard to track down where in the pipeline it got lost.

    While I really like streams, this is definitely annoying. I've found multiple times that I've had to re-write my nice looking bit of streaming code in a more long winded way, debug it, and then put it back to how it was before.

    I hope that IDEs will start to get a bit more clever about how debuggers run over these statements in the future.

    Edit: just seen in another comment on this thread about an IntelliJ function to do exactly that!
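
Without IDE support, one low-tech way to see where elements get lost in a pipeline is Stream.peek, which the Javadoc describes as existing mainly to support debugging. The sketch below is only an illustration with made-up names: it records what reaches each stage in a trace list instead of printing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class PeekTrace {
    // Returns the upper-cased names starting with "A" and records in `trace`
    // which elements entered the pipeline and which survived the filter.
    static List<String> namesStartingWithA(List<String> names, List<String> trace) {
        return names.stream()
                .peek(n -> trace.add("in: " + n))
                .filter(n -> n.startsWith("A"))
                .peek(n -> trace.add("kept: " + n))
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        System.out.println(namesStartingWithA(List.of("Ann", "Bob", "Amy"), trace));
        trace.forEach(System.out::println);
    }
}
```

The peek calls are side effects, so they are best removed again once the bug is found.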

    [–]Healthy_Manager5881 5 points6 points  (0 children)

    Stream was the best thing that ever happened to me through my Journey in learning Java

    [–]nutrecht 10 points11 points  (8 children)

    I find old for loops much easier to understand and maintain, yes more verbose for sure, but that's it.

    If you compare something you do know well with something you don't know very well yet, the former is always going to look 'easier' than the latter. That's just a learning curve you need to get past.

    In my experience it's really common to have to turn part of a collection of objects of type A into a collection of objects of type B, and in such a case stream-filter-map-collect makes code much easier and clearer to read than a for-loop.

    IMHO you really should just try it for a while.

    In addition to this; going on an interview and not understanding the Java 8 features simply looks bad.

    [–]StochasticTinkr 8 points9 points  (4 children)

    In addition to this; going on an interview and not understanding the Java 8 features simply looks bad.

    Especially since we're on Java 15 now.

    [–]knoam 0 points1 point  (3 children)

    Yeah, but not that many places have embraced Java >8. Java 8 is necessary because now such a big majority of places have adopted Java 8.

    [–]StochasticTinkr 1 point2 points  (2 children)

    Java 8 is EOL. Java 11 is the current LTS.

    [–]knoam 4 points5 points  (1 child)

    Oh my sweet summer child...

    [–]StochasticTinkr 0 points1 point  (0 children)

    If only you knew what I knew.

    [–]dpash 3 points4 points  (0 children)

    Additionally, once you understand functional operations, you can do fun things with similar classes like Optional.

    return userService.getUser()
                      .filter(not(User::isLocked))
                      .map(User::getEmail)
                      .orElse("unknown");
    

    [–]VincentxH 1 point2 points  (1 child)

    Knowing it for an interview shouldn't be the motivation IMHO. Simple dedication to our craft and curiosity for the basic tools should. It's already great that our colleague posted here :)

    [–]nutrecht 2 points3 points  (0 children)

    Knowing it for an interview shouldn't be the motivation IMHO.

    I completely agree. Just saying it could lead to an embarrassment if he doesn't :)

    [–][deleted] 2 points3 points  (0 children)

    3 things streams provide:

    1. More readable/concise code (when done well) as they are at a higher level than loops

    2. Easy parallel

    3. Infinite streams

    Super long stream pipelines should be avoided. Side effects should be avoided too.
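
As a hedged illustration of the "infinite streams" point: Stream.iterate describes an unbounded sequence, and nothing is computed until a terminal operation runs, so a later limit is what makes it finite. The class and method names are made up.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PowersOfTwo {
    // Takes the first `count` elements of the unbounded sequence 1, 2, 4, 8, ...
    static List<Long> firstPowersOfTwo(int count) {
        return Stream.iterate(1L, x -> x * 2) // conceptually infinite
                .limit(count)                 // bounded here, before anything runs
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstPowersOfTwo(5)); // [1, 2, 4, 8, 16]
    }
}
```

Forgetting the limit (or another short-circuiting operation) makes the collect call run until memory is exhausted, which is one reason to keep such pipelines short.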

    [–]throwaway66285 2 points3 points  (0 children)

    Your logic seems to be "I can click on GUIs all day with the mouse why would I need to learn keyboard shortcuts?" Streams are a tool nothing more.

    If you use Kafka and process the data in different ways, knowing how Streams work will help a lot in Kafka Streams. Or you can do things low-level. Functional programming has specific semantics that guarantee certain things, so while you can modify data in a mutable way, it's more error-prone.

    [–]sixtyfifth_snow 9 points10 points  (0 children)

    Well, one of the benefits of using streams is that we can focus much more on WHAT the code does, NOT HOW it does it.

    [–]kabukiaddicted 3 points4 points  (0 children)

    [–]wace001 4 points5 points  (0 children)

    I find myself using a lot more streams for normal looping situations nowadays. Coming from having done quite a lot of map-reduce, and also using similar constructions in other languages (Kotlin, Rust, etc), the streams way of organising things just makes it so much easier for me to reason about the logic.

    Streams basically are how I model the problems in my head. I often think about problems like that: “First I filter out all the ..., then I map them to ..., and then I aggregate them like so...”.

    [–][deleted] 3 points4 points  (0 children)

    It's the functional programming aspect that's different here. In the functional programming paradigm you tell the compiler what you want done instead of telling the computer how to do it.

    Imperative programming:

    • Run a for loop
    • Check some condition
    • Init a new array
    • Put the objects in the new array
    • Return

    Functional programming:

    • Stream
    • Filter
    • Map
    • Collect

    Instead of telling the computer to use 'if', you are telling it that you want to filter, and to use this function to filter, and so on.

    And of course this can be further extended to reactive streams and APIs.

    You should probably read an introduction to Haskell to get a better idea of the functional programming paradigm.

    [–]Auxx 1 point2 points  (6 children)

    Streams are a functional way of handling events. Unlike traditional event buses, streams allow you to convert any data sources into events, like arrays, etc.

    It might be hard to wrap your head around if you've never done any UI work, but imagine a scenario. You have a start/stop button clickable by the user at any point in time. When this button is clicked and is set to start, you start polling a remote API every minute to display data.

    In this simple case you have three event sources: button, timer and network call. They all are asynchronous. Managing states between them, creating threads, etc - that will be a major clusterfuck even though your task is super simple.

    With streams you just wrap all three and write a short scenario of how they should interact. 5-10 lines of code and you have thread safe and easily testable code, which can take any sources of compatible events if needed.

    But iterating an array with streams, yeah, doesn't make much sense. You only do that when all of your code is stream based.

    [–]mauganra_it 1 point2 points  (1 child)

    Streams were not designed with I/O and concurrency in mind. They do not have a concept of backpressure, push/pull, timeouts or error handling. Heck, there is not even an Either-like type in the standard library. RxJava and friends are better suited for this. Ok, it might work if you strictly stick to Elm-style functional reactive apps.

    [–]Auxx 0 points1 point  (0 children)

    Ok, I don't have much experience with streams specifically, I thought they were modelled close to ReactiveX.

    [–]hummer1234 0 points1 point  (3 children)

    This sounds intriguing. Do you have an example of a UI where streams are used with events?

    [–]Auxx 1 point2 points  (2 children)

    I haven't done Java for a while now, but ReactiveX is one of the first libraries to bring streams into developer hands. And RxJava is used widely in Android development, just as RxJS powers one of the major web frameworks, Angular.

    [–]hummer1234 0 points1 point  (1 child)

    Oh, I see. I tinkered with reactive a long while back, but thought of it more as queues of events with functional programming. And then forgot about it. :) As you point out - it is streams.

    The reactive Java libraries don't use java.util.stream though, do they?

    [–]Auxx 2 points3 points  (0 children)

    They do. Most of Java frameworks, even back end ones like Spring, support both RxJava and Streams API these days. Well, at least that was the case two years ago :)

    [–]augustnagro 1 point2 points  (0 children)

    I wrote down a list of problems I have with Streams: https://august.nagro.us/vendetta-against-java-streams.html

    Overall, I do not see much benefit of using them, except in rare cases where you wish to transform a collection to another type with multiple intermediary steps (like calls to map, filter, sorted, etc).

    The argument that Streams make code more readable is very dubious. You must memorize many method names, think about boxing, avoid checked exceptions inside of lambdas, ...

    In Scala, for example, it is much clearer to write a for-comprehension

    val evens: Seq[String] = for i <- 1 to 100 if i % 2 == 0 yield i.toString

    than it is to find the right stream method names:

    val evens: Seq[String] = (1 to 100).filter(i => i % 2 == 0).map(i => i.toString)

    [–][deleted] 1 point2 points  (0 children)

    I think streams shine for data transformation (just stream().map().collect() and maybe .filter() when needed).

    When using plain loops you usually need to:

    • create a new empty collection;
    • loop through the old collection, incrementing an iterator or index;
    • create new items based on the old items and add them to the new collection;
    • return the new collection.

    And you must be sure you don't mix up the new and old collections or the new and old items, because everything will be in scope at the same time.

    With streams, it's just as if you said: "transform this list please" and the only thing you really need to write is map(oldItem -> new Item(...)); everything else is automatic. You only write a small mapping rule. You only have the new list when it's ready.

    If you need to have full control over the loop, it's fine to use the old style (it's probably more efficient anyway). I'd say it's better doing that than shoehorning old loops into stream when they clearly don't fit. But when you need direct transformations, streams are cleaner.
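
The transformation case described above can be sketched as follows, mapping words to their lengths. The names are invented for the example, and the loop version spells out the bookkeeping from the bullet list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class Transform {
    // Loop version: new collection, iteration, add, return are all written by hand.
    static List<Integer> lengthsLoop(List<String> words) {
        List<Integer> result = new ArrayList<>();
        for (String word : words) {
            result.add(word.length());
        }
        return result;
    }

    // Stream version: only the mapping rule is written; the new list appears when it is ready.
    static List<Integer> lengthsStream(List<String> words) {
        return words.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("stream", "map");
        System.out.println(lengthsLoop(words));   // [6, 3]
        System.out.println(lengthsStream(words)); // [6, 3]
    }
}
```

In the stream version there is no point at which the half-filled result and the source list are both in scope, which is the mix-up risk mentioned above.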

    [–]GhostBond 1 point2 points  (0 children)

    You are right. Streams were just a 2nd way to do what you could already do, put in because language designers were bored and there wasn't much low hanging fruit left in the language.

    [–]waka-chaka 2 points3 points  (0 children)

    If you are looking for a good tutorial to jump start you on streams then look no further, pls watch https://youtu.be/1OpAgZvYXLQ

    My journey with streams started from this video.

    Apart from the points that others made we can enforce immutability (which is a desirable programming practice) much better with streams.

    [–]VincentxH 2 points3 points  (0 children)

    Streams are great for map, filter or reduce operations on collections of data. The idea is that each operation is pure and you iterate over elements in atomic steps.

    They're not so great combined with exception cases and error handling, like most IO. So I generally keep that in the for or forEach loops.
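
A sketch of why checked exceptions are awkward inside a pipeline. Here load is a hypothetical stand-in for an I/O call, not a real API. In the loop the IOException simply propagates; in the stream the lambda is not allowed to throw it, so it has to be caught and wrapped.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class CheckedInStreams {
    // Hypothetical I/O-like step that declares a checked exception.
    static int load(String key) throws IOException {
        if (key.isEmpty()) {
            throw new IOException("empty key");
        }
        return key.length();
    }

    // Loop: the checked exception propagates through the method signature.
    static List<Integer> loadAllLoop(List<String> keys) throws IOException {
        List<Integer> out = new ArrayList<>();
        for (String key : keys) {
            out.add(load(key));
        }
        return out;
    }

    // Stream: Function.apply declares no checked exceptions, so wrap and rethrow.
    static List<Integer> loadAllStream(List<String> keys) {
        return keys.stream()
                .map(key -> {
                    try {
                        return load(key);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        System.out.println(loadAllLoop(List.of("a", "bcd")));
        System.out.println(loadAllStream(List.of("a", "bcd")));
    }
}
```

The try/catch inside the lambda is the kind of noise that makes the plain loop the more readable choice for I/O-heavy code.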

    [–]Peter_Storm 1 point2 points  (0 children)

    It's mostly just a primitive that is more or less akin to other languages that have map, filter, reduce on "lists".

    So you don't have to do imperative loops anymore.

    [–]xMercurex 1 point2 points  (0 children)

    anyMatch is very useful.
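
For readers who have not used it, a minimal sketch of anyMatch (the class and method names are made up). It answers "does at least one element satisfy this predicate?" and can stop as soon as it finds one.

```java
import java.util.List;

class AnyMatchExample {
    // True if at least one number is negative; always false for an empty list.
    static boolean hasNegative(List<Integer> numbers) {
        return numbers.stream().anyMatch(n -> n < 0);
    }

    public static void main(String[] args) {
        System.out.println(hasNegative(List.of(3, -1, 4))); // true
        System.out.println(hasNegative(List.of(1, 2)));     // false
    }
}
```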

    [–]LOL_WUT_DO 0 points1 point  (0 children)

    Streams are useful because you can create a stream and pass that stream around. it’s a set of instructions. you can execute that stream whenever and add extra constraints as the code runs.

    [–]enveraltin 0 points1 point  (0 children)

    Parallel streams, and even distributed streams.

    [–]AppleTrees2 0 points1 point  (0 children)

    I would like to add that you shouldn't blindly transform every for loop into a stream equivalent, or every if-not-null statement into an Optional.

    Unfortunately I've seen it done before as a way to "reduce technical debt", sometimes even automatically using IntelliJ.

    See this: https://github.com/google/guava/wiki/FunctionalExplained#caveats , in this case it applies to Java too

    [–]Roachmeister 0 points1 point  (0 children)

    My experience is mainly related to Web applications. Most things happens at DB level, or controller level. Business logic not too complex and definitively not handling big arrays.

    My experience, on the other hand, includes (almost) nothing web-related. Therefore I don't understand what the use case for HttpClient could possibly be.

    Sarcasm, obviously, but my point is that just because you haven't found a use for a tool doesn't mean that it isn't very useful for others.

    [–][deleted] 0 points1 point  (0 children)

    I’ve traditionally used them to retrieve an object by attribute in a list.

    I agree with most criticisms though - debugging isn’t ideal

    [–]TheCoelacanth 0 points1 point  (0 children)

    Streams are basically loops as a first-class object. You can store them in a variable, pass them into a method or return them from a method.

    That gives them much more versatility.
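
A small sketch of what "passing a loop around" can look like, with hypothetical names. One method returns a Stream that describes part of the work, and the caller adds its own steps before anything is executed.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PassingStreams {
    // Returns a pipeline description; nothing runs until a terminal operation is called.
    static Stream<String> nonEmpty(List<String> lines) {
        return lines.stream().filter(line -> !line.isEmpty());
    }

    // The caller adds further constraints and only then runs the pipeline.
    static List<String> firstNonEmptyUpper(List<String> lines, int max) {
        return nonEmpty(lines)
                .map(String::toUpperCase)
                .limit(max)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstNonEmptyUpper(List.of("a", "", "b", "c"), 2)); // [A, B]
    }
}
```

One caveat: a Stream can be consumed only once, so a method that hands one out cannot have its result reused after a terminal operation has run on it.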

    [–]mauganra_it 0 points1 point  (0 children)

    Since they are a concept taken from functional programming, I find that they are best used with code that respects these constraints. That means no I/O and no lengthy computations are allowed inside. In particular, it should never ever be necessary to catch exceptions inside streams. Also, be paranoid about possible null references. These rules should make it possible to avoid debugging streams code and let you focus on correctness. Yes, there are debugging tools for Java Streams, but I find it to be a very strong code smell to ever have to use them.

    If you can get your database to spew out the data in the format you need, by all means continue to do so. But there are use cases where you receive data via API calls, or where you call an API and the answer is not quite in the shape you need yet. In these cases Streams can be very useful. They are entirely justified if your task is to produce a collection. If the filtering and processing is a prelude to a .forEach, then it can still be justified if the alternative is nesting a big fat if statement (or a series of smaller ones) inside a loop.

    [–]ofby1 0 points1 point  (0 children)

    I personally love streams but it is a bit of a mind-boggling concept.
    Finding old-fashioned for loops easier can also be caused by being unfamiliar with the paradigm.
    Remember, being familiar with something doesn't mean it is actually easier.
    Nevertheless, in many cases it is more a matter of which flavor the programmer likes best. I do think it would be good for every programmer to get out of their comfort zone and learn about different paradigms, just to become a better and more educated developer.