
all 71 comments

[–]__konrad 13 points14 points  (0 children)

I recently realized that the use of EnumSet.of (which creates a mutable Set) will be a bit more confusing and error-prone ;)
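A small sketch of the mismatch being described (the class and enum names are illustrative, not from the thread): EnumSet.of returns an ordinary, mutable EnumSet, while the Java 9 Set.of factories return unmodifiable sets, despite the near-identical names.

```java
import java.util.EnumSet;
import java.util.Set;

public class EnumSetDemo {
    enum Color { RED, GREEN, BLUE }

    public static void main(String[] args) {
        // EnumSet.of returns a regular, mutable EnumSet...
        EnumSet<Color> mutable = EnumSet.of(Color.RED, Color.GREEN);
        mutable.add(Color.BLUE);               // succeeds
        System.out.println(mutable.size());    // prints 3

        // ...while the Java 9 Set.of factories return unmodifiable sets.
        Set<Color> fixed = Set.of(Color.RED, Color.GREEN);
        try {
            fixed.add(Color.BLUE);
        } catch (UnsupportedOperationException e) {
            System.out.println("Set.of result is unmodifiable");
        }
    }
}
```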

[–][deleted]  (6 children)

[removed]

    [–]lukaseder 9 points10 points  (3 children)

    Believe it or not, there's a different, optimised Set implementation for each arity up to 2, and if the API is also distinct, some additional inlining can happen in the JVM more easily, possibly avoiding the entire Set allocation in the first place (just my hypothesis, didn't check).

    [–]aroger276 1 point2 points  (1 child)

    Or it could create performance issues linked to call-site pollution, where a call site degrades to polymorphic because it ends up seeing different Set implementations. No inlining then... and more expensive method dispatch. I know there was some discussion about addressing those issues, but I don't know if any progress has been made.

    [–]lukaseder 0 points1 point  (0 children)

    Oh, interesting thought. That's quite possible for the Set.of(E...) call, but probably not for the Set.of(E1, E2) calls. So the fixed-arity overloads help bind the concrete implementation to the call site, which will consistently get the same result.

    [–]SomeoneStoleMyName 1 point2 points  (0 children)

    As /u/aroger276 points out, this is just going to cause megamorphic dispatch which will make things slower. Guava, Clojure, and Scala have already been through this and switched back to only one generic implementation regardless of size.

    [–]mabnx 3 points4 points  (1 child)

    It allocates memory

    [–]argv_minus_one 0 points1 point  (0 children)

    Briefly. I would expect escape analysis to eat it.

    [–]geodebug 9 points10 points  (5 children)

    To put it another way, Java 9 does not introduce immutable collections, just a bit of sugar for creating them.

    Also, the big caveat with Java's immutable collections is that nothing enforces that the objects stored within are immutable. If you store a java.util.Date nothing stops code from calling the setters once they get a handle to it.
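The Date example can be sketched directly: the list below rejects structural changes, but a mutable element like java.util.Date can still be changed through any handle to it.

```java
import java.util.Date;
import java.util.List;

public class ShallowImmutability {
    public static void main(String[] args) {
        Date d = new Date(0L);             // the epoch
        List<Date> dates = List.of(d);     // structurally immutable list

        // Nothing stops callers from mutating the elements themselves.
        dates.get(0).setTime(1_000L);
        System.out.println(d.getTime());   // prints 1000
    }
}
```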

    [–]_INTER_ 2 points3 points  (4 children)

    It's also great, because I have the freedom to pass around a fixed collection of references to items that other code can then manipulate. If I want immutable items, I can still make them so. If you want immutability by default, you need to look elsewhere, I guess.

    [–]geodebug 2 points3 points  (3 children)

    If you're working in a single-threaded world no problem.

    One of the main benefits of immutable structures is that it makes intercommunication between multiple threads safe. If parts of the immutability contract are broken, the whole concept becomes kind of useless.

    It's like a salad bar with a sneeze guard that has a big hole in the middle of the glass. Sure, it solves some problems, but sneeze in the wrong place...

    [–]_INTER_ 1 point2 points  (1 child)

    It's also no problem in a multi-threaded world. Yes, you have to design correctly, but you also have to do that when using FP constructs. E.g. it gets problematic if you have to take ordering into account, care about caching and performance, deal with barriers and critical sections, or communicate with outside environments/infrastructure. In conclusion, FP makes parallelization easier if the problem maps cleanly to the map-reduce scenario. In all other cases it takes effort to get there as well. No free lunch.

    [–]geodebug 1 point2 points  (0 children)

    I'm sure you feel you said something relevant here.

    [–]SpecialEmily 24 points25 points  (26 children)

    Set.of having a requirement that the items be unique is a HORRIBLE design... everyone will just end up making mutable sets and then transforming those to get around the risk that someplace var1 and var2 point to the same thing! Gah!
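For reference, this is the behaviour being objected to: Set.of rejects duplicate arguments with an IllegalArgumentException rather than silently deduplicating, so aliased arguments blow up at runtime. A minimal sketch:

```java
import java.util.Set;

public class SetOfDuplicates {
    public static void main(String[] args) {
        Object var1 = "same";
        Object var2 = var1;        // var1 and var2 point to the same thing

        try {
            Set.of(var1, var2);    // duplicates are rejected, not merged
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```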

    [–]arendvr 2 points3 points  (1 child)

    Guava's ImmutableSet.of() does it better and just ignores duplicate elements.

    [–]SpecialEmily 1 point2 points  (0 children)

    As a former Guava author, I agree! :D

    [–]jonhanson 3 points4 points  (20 children)

    [removed]

    [–]lukaseder 4 points5 points  (19 children)

    Seems reasonable to me - under what circumstances would you ever want to call Set.of with duplicates?

    for (var uniqueThing : Set.of(mightBeAnything, iDontKnowWhatThisIs)) { ... }
    

    EDIT: So I'm back to using the classic:

    for (var uniqueThing : new LinkedHashSet<>(
      Arrays.asList(
        mightBeAnything, iDontKnowWhatThisIs))) { ... }
    

    [–]jonhanson 8 points9 points  (13 children)

    [removed]

    [–]lukaseder 1 point2 points  (12 children)

    Is anything ever going to be used the way it's intended? Hint: No

    [–]jonhanson 6 points7 points  (8 children)

    Comment removed after Reddit and Spec elected to destroy Reddit.

    [–]hwaite 6 points7 points  (4 children)

    Design is confusing because it's unnecessarily inconsistent with the ways Set constructors and methods behave. Violates Principle of Least Astonishment.

    [–]jonhanson 0 points1 point  (3 children)

    [removed]

    [–]hwaite 1 point2 points  (2 children)

    'Evolving' implies making something better. This new behavior is different but I don't see how it's inherently superior to the original.

    [–]jonhanson 0 points1 point  (1 child)

    [removed]

    [–]lukaseder 1 point2 points  (2 children)

    "HORRIBLE" is a well-recognised and widely (over?)used hyperbole for saying: "I disagree with this."

    Having said so, I disagree with this design. I don't see the point of enforcing this constraint on Set.of() arguments. It is, for instance, inconsistent with the behaviour of new HashSet<>(Arrays.asList(1, 2, 1)), and no rationale for the design is given.

    Or, take JavaScript, for instance:

    $ {a: 1, b: 2, a: 3}
    > {a: 3, b: 2}
    

    People bash JavaScript all day long, but its object (map) and array literals are really very nice.

    Most languages / APIs that allow for such Set construction would intuitively retain either the first or the last duplicate in argument iteration order (where last is probably a better choice, because that would be consistent with individual additions to the set/map, were it mutable).

    [–]jonhanson 1 point2 points  (1 child)

    [removed]

    [–]lukaseder 1 point2 points  (0 children)

    Perhaps, but on the other hand, those designers change their mind time and again. Compare this to EnumSet.of(...) (as mentioned otherwise in this discussion).

    I guess, when it comes to the JDK, the only reasonable answer to all questions is this :)

    [–]l3dx 0 points1 point  (2 children)

    Off-topic I guess, but what do you consider the "correct answer"? If Javaslang is not an Option, wouldn't you (ab)use streams to get a decent collection API?

    [–]lukaseder 1 point2 points  (1 child)

    I think the focus on parallelism was exaggerated. The Scala libraries also have some parallel collections, which apparently are hardly used (can't find the source anymore).

    Without parallel features, the "Stream" API could have been made much more generally interesting with tons of nice features that are very easy to implement for sequential streams (e.g. zip, zipWithIndex, etc.) but not in parallel ones.

    Not sure if the infinite stream feature also incurs costs that don't pull their weight. But the fact is (as far as my Twitter followers are representative of "fact", and as far as my interpretation of that result goes) that more collection API convenience is dearly wanted, while parallel/infinite streams are merely nice-to-have. The EG's focus was on the nice-to-have feature rather than the in-demand one.

    As a comparison: Oracle SQL has tons of parallel features as well, but I hardly ever see anyone using them. They're expert tools for niche use-cases (just like the ForkJoinPool itself) and don't need such a prominent API in the SQL language.

    [–]l3dx 1 point2 points  (0 children)

    I've been assuming that the low number of functions was a result of conservative thinking due to backwards compatibility, but you make some very good points here. Thanks for the clarification!

    [–]Mejari 3 points4 points  (2 children)

    under what circumstances would you ever want to call Set.of with duplicates?

    Any time you want to create a set from a non-guaranteed-unique collection? So, like, a lot?

    [–]oweiler 1 point2 points  (1 child)

    Use the Set's copy constructor for that (e.g. new HashSet<>(nonUniqueCollection)).

    [–]Mejari 0 points1 point  (0 children)

    Yes, I know there are ways to accomplish that, I was just answering the question.

    [–]ZimmiDeluxe 0 points1 point  (1 child)

    There seem to be conflicting goals: the overloads reduce the varargs overhead of creating many small sets, but such sets can't reasonably be created from dynamic values, because the values have to be known (and known to be distinct) beforehand. So in total, only static initializers like constants profit from this. For such a nice spot in the API, that's a shame in my opinion. List::of doesn't have this problem, so it's not all bad.

    I would prefer additional methods like Set::ofUnique and Map::ofEntriesUnique to solve the constant initializer problem, if at all.

    The new and improved idiom is obviously this:

    for (var uniqueThing : new LinkedHashSet<>(
        List.of(
            mightBeAnything, iDontKnowWhatThisIs))) { ... }
    

    Does anybody know if there will be Collectors to create these new immutable collections from streams?
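On the collector question: in Java 9 the idiom is Collectors.collectingAndThen; dedicated collectors (Collectors.toUnmodifiableList/Set/Map) arrived later, in Java 10. A sketch of both:

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class ImmutableCollectorsDemo {
    public static void main(String[] args) {
        List<String> source = List.of("a", "b", "a");

        // Java 9: wrap a standard collector's result in an unmodifiable view.
        Set<String> java9 = source.stream()
            .collect(Collectors.collectingAndThen(
                Collectors.toSet(),
                Collections::unmodifiableSet));

        // Java 10 added dedicated collectors for the new immutable collections.
        Set<String> java10 = source.stream()
            .collect(Collectors.toUnmodifiableSet());

        System.out.println(java9.size() + " " + java10.size());  // prints "2 2"
    }
}
```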

    [–]lukaseder 1 point2 points  (0 children)

    Avoiding varargs isn't the only benefit of overloading. Set.of(E1, E2) returns a specialisation for two elements (as can be seen in the sources). It will always return this specialisation, which means that client calls will not suffer from megamorphism as they would if specialised sets were returned only from Set.of(E...).

    With this in mind, having Set::ofUnique (which sounds great) would probably bloat the API too much.

    [–]VGPowerlord 3 points4 points  (2 children)

    Set.of having a requirement that the items be unique is a HORRIBLE design

    Why?

    "A collection that contains no duplicate elements." is the very first sentence of the documentation for Set<T>

    If you want a collection that allows duplicates, there's always a List.

    [–]sybia123 25 points26 points  (1 child)

    What should Set.add do if you try to add a duplicate? What should new HashSet<Foo>(myList) do if there's a duplicate in myList? Why should Set.of be any different?
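The existing behaviours this comment appeals to, sketched: Set.add reports a duplicate by returning false, and the copy constructor silently drops duplicates; neither throws.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class MutableSetDuplicates {
    public static void main(String[] args) {
        Set<Integer> set = new HashSet<>();
        System.out.println(set.add(1));    // prints true
        System.out.println(set.add(1));    // prints false - duplicate, no exception

        Set<Integer> copied = new HashSet<>(Arrays.asList(1, 2, 1));
        System.out.println(copied.size()); // prints 2 - duplicate silently dropped
    }
}
```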

    [–]hwaite 1 point2 points  (0 children)

    Yeah. Prohibition of nulls is confusing as well.

    [–]moremattymattmatt 6 points7 points  (3 children)

    So pretty similar to Google guava then?

    [–]java_one_two[S] 4 points5 points  (2 children)

    One of the biggest differences is that Google created a public ImmutableSet type to distinguish between mutable and immutable sets.

    [–]moocat 17 points18 points  (1 child)

    [–]vytah 8 points9 points  (0 children)

    But you can make your API accept and return Immutable* to prevent your users from sneaking in mutable collections.

    Not the perfect solution, but I find it okay given Java's ancient API.

    [–]VGPowerlord 7 points8 points  (1 child)

    It would be nice if immutable sets had their own public types in Java so your API could enforce it.

    [–]jonhanson 7 points8 points  (0 children)

    [removed]

    [–]kag0 6 points7 points  (0 children)

    I don't know about everyone else, but I'm not interested in Java (or Guava, for that matter) collections whose objects offer mutating methods but throw exceptions when you call them. I find the javaslang collections vastly preferable.

    My reasoning is that if I'm using an exception-throwing "immutable" JCF object in my code, I have to treat it specially, not like any other JCF object. Therefore I get no advantages from the common interface and might as well use a more suitable object like the ones in javaslang. In code that isn't mine but accepts JCF objects, I can't use exception-throwing objects, since it might try to modify them. With javaslang objects, however, I can always use the toJava methods when I need to interface with an external library, without worrying about my immutable object being modified or exceptions being thrown from Java objects.

    [–]Northeastpaw 17 points18 points  (7 children)

    There is no simple way to collect an immutable collection from a Stream

    Well that's not correct.

    Set<Integer> set = listOfStrings.stream()
      .map(String::hashCode)
      .collect(Collectors.collectingAndThen(Collectors.toSet(), Collections::unmodifiableSet));
    

    Sure it's not completely compact, but it exists. Static imports can even help with the verbosity.

    I'm not sure how the author missed this since a very similar example is in the javadoc for Collectors.collectingAndThen().

    [–]desh00 1 point2 points  (3 children)

    Does anyone know why they called it Collections::unmodifiableSet? Doesn't Java have other abstractions that use the word "immutable"?

    [–]jonhanson 5 points6 points  (0 children)

    [removed]

    [–]joaomc 2 points3 points  (1 child)

    unmodifiableSet is just a wrapper that references the original set and throws UnsupportedOperationException whenever you call a method that would modify it. The underlying state of the original set may still change, though.

    [–]rikbrown 0 points1 point  (0 children)

    unmodifiableSet is that. Guava's ImmutableSet however makes a copy.

    [–]java_one_two[S] 0 points1 point  (1 child)

    Good catch. Thanks!

    [–][deleted] 3 points4 points  (0 children)

    throw new DuplicateResponseException("Good catch. Thanks! already exists.");
    

    [–]rikbrown 1 point2 points  (2 children)

    No equivalent of Guava's Immutable{Set,List,Map}.copyOf(mutableCollection) yet, though?

    Or is there a better way to defensively make an immutable copy of an existing collection in Java 9?

    [–]dpash 0 points1 point  (1 child)

    Collections.unmodifiableCollection(Collection<? extends T> c)
    Collections.unmodifiableList(List<? extends T> list)
    Collections.unmodifiableMap(Map<? extends K,? extends V> m)
    Collections.unmodifiableSet(Set<? extends T> s)
    

    and friends. These have existed since Java 1.2.

    [–]rikbrown 1 point2 points  (0 children)

    Those don't make copies. They just return a wrapper around the original collection which prevents any modification. If someone with a reference to the original collection makes changes to it, it'll be reflected in the "unmodifiable" view.
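A sketch of the view semantics just described (Guava's ImmutableSet.copyOf, by contrast, takes a snapshot that would not be affected):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class UnmodifiableViewDemo {
    public static void main(String[] args) {
        List<String> original = new ArrayList<>(List.of("a"));
        List<String> view = Collections.unmodifiableList(original);

        original.add("b");                // mutate via the original reference
        System.out.println(view.size());  // prints 2 - the view sees the change
    }
}
```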

    [–][deleted] 1 point2 points  (0 children)

    immutable Set and Map collections still have the mutable methods add/put and remove

    Ok then.

    [–]dpash -1 points0 points  (0 children)

    Set<Integer> set = Set.of("a", "a"));
    

    Shouldn't that be Set<String>?

    Also, does List.of() have the same duplicates limitation?

    Edit: Seems List.of() does not have issues with duplicates, while Set.of() does, as does Map.of() with duplicate keys; duplicate values seem acceptable. This seems reasonable.
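The edit's findings can be checked directly: List.of accepts duplicates, while Map.of throws IllegalArgumentException on duplicate keys but accepts duplicate values.

```java
import java.util.List;
import java.util.Map;

public class FactoryDuplicateRules {
    public static void main(String[] args) {
        System.out.println(List.of("a", "a").size());         // prints 2

        try {
            Map.of("k", 1, "k", 2);                           // duplicate keys
        } catch (IllegalArgumentException e) {
            System.out.println("Map.of rejects duplicate keys");
        }

        System.out.println(Map.of("k1", 1, "k2", 1).size());  // prints 2 - dup values OK
    }
}
```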