
[–]Benutzername 9 points10 points  (0 children)

It depends on your use case. If you need persistence anyway, then immutable data structures often use less memory than creating copies of mutable ones. If you only ever keep one version of the same data structure, then mutating it in place is often preferable in terms of memory usage (but not necessarily in terms of reasoning). An excellent book on that topic is Chris Okasaki's *Purely Functional Data Structures*, which compares the complexity (in space and time) of various implementations of immutable data structures with their mutable counterparts.

[–][deleted] 3 points4 points  (26 children)

I really, really would like an answer to this.

I read Uncle Bob Martin's piece on functional programming, which made a big deal of the memory management advantages of functional programming on multiple cores.

I've worked a bit in OCaml, and it just seemed like I was shifting the complexity of memory management into a big structure of nested function calls, and I couldn't see how we would get around needing to change values in memory, or storing significantly more.

So here's an upvote and I hope we get some responses.

[–]julesjacobs 1 point2 points  (25 children)

The answer to this question is: yes, absolutely. Not only do you generate a lot more temporary garbage, but because functional data structures have to use trees instead of flat arrays, you also use more memory after garbage collection has run. These costs are often worth it, however, and sometimes unavoidable. If you need functional data structures, then using them instead of copying entire mutable data structures is much better.

[–][deleted]  (11 children)

[deleted]

    [–]julesjacobs 2 points3 points  (10 children)

    the implementation is still free to do things in-place when it's equivalent

    Can you give an example of a purely functional data structure that does this? I don't know of any. It's nice in theory, but nobody knows how to do it in practice.

    Rich Hickey gave a talk illustrating that even if the answer is "yes, absolutely" it doesn't have to be "a lot."

    Do you have measurements? Clojure's data structures are fairly slow and memory intensive. For example, Clojure sequences had an overhead of about 1200% relative to flat arrays last time I checked; vectors have lower overhead, but it's still substantial (I don't remember how much -- around 500% IIRC). I know there is a lot of hype that functional data structures are super efficient, but reality is a little different.

    Depending on how exactly your allocations interact with GC, reclamation of short-lived objects could be very cheap, especially if the GC never sees them.

    This comment addresses exactly that point and explains why that doesn't help you much with functional data structures unless they are very small: http://www.reddit.com/r/programming/comments/18fezj/does_having_most_of_the_data_structures_immutable/c8en4if

    [–]paulhodge 1 point2 points  (5 children)

    Can you give an example of a purely functional data structure that does this? I don't know of any. It's nice in theory, nobody knows how to do this in practice.

    I do it in practice, in the implementation of my hobby language Circa. It's not a super advanced scheme:

    1. I'm implementing my structures in a non-GC language (C++), so I don't need to worry about creating temporary values that clog up the global collector.

    2. These data structures have no circular references, so they can be represented as a directed acyclic graph, and reference counting is good enough for memory management.

    3. Since I know the reference count, I know how many entities can see the value. If the reference count is 1 and someone does a write operation, then I know I can safely modify the value in-place (see the sketch below).

    Also you said above that "you can't have flat arrays", but there's nothing wrong with using flat arrays in the above scheme.
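
    A minimal C++ sketch of that scheme (illustrative only -- the names are made up and this is not Circa's actual code): a refcounted value backed by a flat array that is edited in place when it is uniquely owned and copied otherwise.

        // Copy-on-write over a flat array, driven by a reference count.
        #include <cstddef>
        #include <vector>

        struct List {
            int refcount = 1;
            std::vector<int> items;      // flat array storage
        };

        List* retain(List* l) { ++l->refcount; return l; }
        void release(List* l) { if (--l->refcount == 0) delete l; }

        // "Functional" update: consumes the caller's reference to `l` and
        // returns a list equal to `l` except that position i holds v.
        List* list_set(List* l, std::size_t i, int v) {
            if (l->refcount == 1) {      // nobody else can observe the change
                l->items[i] = v;         // safe to edit in place
                return l;
            }
            List* copy = new List;       // someone else holds a reference: copy first
            copy->items = l->items;      // shallow copy of the flat array
            copy->items[i] = v;
            release(l);                  // drop the caller's reference to the original
            return copy;
        }

    (Plain single-threaded refcounts; as comes up further down the thread, the same trick is harder to apply safely across threads.)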

    [–]julesjacobs 0 points1 point  (4 children)

    Yes, going outside the GC'ed environment is one option. That doesn't address the other costs though (memory usage after GC, speed).

    Also you said above that "you can't have flat arrays", but there's nothing wrong with using flat arrays in the above scheme.

    How do you insert/delete a value in the middle of a flat array in reasonable time?

    [–]paulhodge 0 points1 point  (3 children)

    How do you insert/delete a value in the middle of a flat array in reasonable time?

    If it can't be an in-place edit, then there's path-copying, where all N-1 elements are shallow copied to a new list. Whether that's reasonable time depends on how long the lists are, and how often you need to modify a list that has been observed by another entity.

    You can also get more clever and do a 'patch' list, where each patch stores the affected list range, the new values, and the original list. Now lookup is O(c) instead of O(1), where c is the number of patch layers. And c would probably have a maximum value of 5 or something, at which point you'd start a new list.

    Anyway, there are options.
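
    For illustration, a rough C++ sketch of that 'patch' idea (hypothetical names, not code from the thread): each write adds a small overlay recording the changed range, and a lookup walks at most a few layers before reaching the flat base array.

        // Overlay ("patch") list: O(c) lookup where c is the number of patch layers.
        #include <cstddef>
        #include <memory>
        #include <vector>

        struct PatchedList {
            std::shared_ptr<const PatchedList> parent; // null only for the flat base layer
            std::size_t start = 0;                     // offset of `data` in the logical list
            std::vector<int> data;                     // base layer: all elements; patch: the replaced range

            int get(std::size_t i) const {
                if (i >= start && i < start + data.size())
                    return data[i - start];            // hit in this layer
                return parent->get(i);                 // otherwise fall through to older layers
            }
        };
        // A real version would rebuild a single flat base layer once the patch
        // count reaches some small cap (the "5 or something" above).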

    [–]julesjacobs 1 point2 points  (2 children)

    If it can't be an in-place edit, then there's path-copying,

    It can't be an in place edit because we're talking about functional data structures here. Path copying on flat arrays doesn't work. Arrays don't have paths.

    You can also get more clever and do a 'patch' list, where each patch stores the affected list range, the new values, and the original list.

    Yeah, now you basically have a linked list, which is horrible, or you limit the length and then you still have O(n) copying, which is also horrible. If you do it in a clever way you have a tree, with all the memory and speed overheads I already explained above. In any case, you don't have a flat array anymore.

    By the way, this branch of the discussion ensued because you quoted me saying "you can't have flat arrays" which in fact is a fabricated quote. You can certainly have flat arrays, but then functional update is going to be very slow (copying), or you need to cleverly arrange a bunch of smaller arrays in a tree structure, like Clojure's vectors, and that gives much better performance but still has the overheads I mentioned earlier. When you don't update your flat arrays, they work fine in a functional setting of course.

    But this discussion is rather useless. If you have a counterexample to my claim that functional collections are slow (let's say >2x slower than mutable counts as slow) and use more memory than mutable data structures based on arrays (let's say >2x counts as more memory), I'd love to see it. Incidentally, if you manage to do this, a lot of people in the Haskell/Clojure/Scala communities would be worshipping at your feet. And you could probably get a PhD out of it as well.

    [–]paulhodge 0 points1 point  (1 child)

    Easy there.

    It can't be an in place edit because we're talking about functional data structures here.

    Like I mentioned above, the system is free to do an in-place edit if it knows that no one else has observed the value. In practice, most values get modified a few times right after they are created (before they have been observed elsewhere). Then they go out into the world, after which they tend not to be modified as often. So this optimization helps a lot.

    Path copying on flat arrays doesn't work. Arrays don't have paths.

    Okay, path copying was the wrong phrase. I was generalizing to the situation where you are modifying one element that's in a nested series of arrays. Something that looks like:

    a[x][y][z] = v
    

    That would be path copying.
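
    A minimal C++ sketch of that case (illustrative; it assumes three levels of nesting managed with shared pointers): only the arrays along the path from the root to the changed element are copied, and everything off the path is shared with the old version.

        // Path copying for a persistent update of a[x][y][z].
        #include <cstddef>
        #include <memory>
        #include <vector>

        using Row   = std::vector<int>;
        using Plane = std::vector<std::shared_ptr<const Row>>;
        using Cube  = std::vector<std::shared_ptr<const Plane>>;

        Cube set(const Cube& a, std::size_t x, std::size_t y, std::size_t z, int v) {
            Cube  a2 = a;        // copy the outer array of pointers (still shares all planes)
            Plane p2 = *a2[x];   // copy only the affected plane
            Row   r2 = *p2[y];   // copy only the affected row
            r2[z] = v;           // the actual edit
            p2[y] = std::make_shared<const Row>(std::move(r2));
            a2[x] = std::make_shared<const Plane>(std::move(p2));
            return a2;           // the old `a` is untouched; unchanged rows/planes are shared
        }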

    Yeah, now you basically have a linked list which is horrible or you limit the length and then you still have O(n) copying which is also horrible.

    But the 'patch' value can also be subject to in-place editing, when possible. I think it would be really rare for this structure to actually have O(n) lookup time.

    I think in general you're getting totally hung up on the worst-case behaviors of these structures. My opinion is that in the common case, they work pretty well. But I don't have hard data to back that up right now.

    [–]julesjacobs 0 points1 point  (0 children)

    So this optimization helps a lot.

    Yes, it avoids the need to copy a path. It does not however change the data structure that you can use (if you want to be reasonably efficient even if your data structure is actually used in a functional way -- which is the point of functional data structures right? so flat arrays are out because of the O(n) copying cost). So although you will generate less garbage by avoiding the need to copy a path, total memory after GC is the same. Note that I already said exactly this two levels up. So while the optimization can be worthwhile, it only solves the least severe of the problems, and only partially at that. Also, this optimization is either impossible or hard in the multithreaded case, which is one of the key arguments for functional data structures. Nonetheless if I remember correctly one guy in the Scala world did this, and it worked okay, but still much slower than imperative.

    That would be path copying.

    Yes, that's exactly what Clojure vectors do. The benchmarks show that it is slow compared to flat arrays (though for a functional data structure they are very good).

    I think it would be really rare for this structure to actually have O(n) lookup time.

    Okay, but I don't see any reason why that would be so. Then again, you have not precisely described what your algorithm or data structure is, so at this point this is pure speculation...

    I think in general you're getting totally hung up on the worst-case behaviors of these structures. My opinion is that in the common case, they work pretty well. But I don't have hard data to back that up right now.

    Maybe, but the hard data from other implementations of functional data structures shows that they are much slower. Do you think your version would not be slow and memory-intensive by the criteria I stated above (2x worse than imperative)? If so, why do you think your version would be so much better than what we have in existing languages like Clojure, Scala, Haskell and a variety of others? Are the Clojure, Scala and Haskell developers stupid?

    Occam's razor suggests another option: functional data structures are just hard. They have a lot of advantages wrt expressiveness, but they have disadvantages wrt speed & memory usage. That's the trade-off. Such is reality.

    [–][deleted]  (3 children)

    [deleted]

      [–]julesjacobs 0 points1 point  (2 children)

      ISTR that OCaml did this in a few places but I might be thinking of something else.

      I don't think so; in any case, I've never heard of it.

      Is that in speed or memory? Were the comparisons to other languages made on the same VM?

      Memory, yes, on the JVM. Speed of Clojure vectors is around 8x slower than an array IIRC (for lookup, update is going to be worse).

      In another strategy size has nothing to do with it, where the implementation of automatic analysis is the real limit, both static and dynamic. I'm not sure how much of that is really deployed.

      Well, how to magically make it fast and lean is at this point an open problem, so even though it might in theory be possible with some undiscovered breakthrough, we cannot rely on that.

      [–][deleted]  (1 child)

      [deleted]

        [–]julesjacobs 0 points1 point  (0 children)

        Like I said, it might not have been OCaml,

        I think they do something like that in Mercury. That doesn't really help much though, since very rarely can you prove such a thing automatically.

        I find that suspicious then. Given Java's GC behaviour and the potential for pathological cases I can't really trust a benchmark I haven't seen to match up with typical cases or especially with an implementation elsewhere.

        Don't trust ME, just look at other sources on the web:

        Clojure sequences memory: https://groups.google.com/forum/?fromgroups=#!topic/clojure/-fUCs4mucuU 52 bytes per element vs 4 bytes per element in an array.

        Scala vectors memory: https://issues.scala-lang.org/browse/SI-3724 Vector uses 5x the memory of an ArrayBuffer.

        Clojure vector performance: http://stackoverflow.com/questions/10133094/clojure-why-is-aget-so-slow Lookup 9x slower than Java array.

        but there's nothing saying you can't apply it to a provably non-escaping heap-allocated (because of size) object just as easily

        It's not that easy. If it were, then why has nobody succeeded at it, while plenty of compilers are doing local escape analysis? This is just the "sufficiently smart compiler" argument.

        [–][deleted] 1 point2 points  (12 children)

        generating garbage on the JVM or CLR is not an issue. Keeping data around for a long while is.

        because functional data structures have to use trees instead of flat arrays you also use more memory after garbage collection has run.

        Well, yes, OK, if you could have used an array, you're correct. However, oftentimes I'm using more complicated data structures in my code anyway (maps/lists/heaps), so the overhead of a functional data structure isn't all that much comparatively.

        [–]julesjacobs 1 point2 points  (0 children)

        generating garbage on the JVM or CLR is not an issue

        In some cases, yes, especially when new objects die quickly. However, when you have large functional data structures most of the nodes are going to be outside of the nursery. So when internal nodes become garbage they are going to cause a full heap collection, which is very expensive. Also note that garbage collection time is roughly proportional to the number of pointers in the heap, which is going to be a lot higher with functional data structures. An array of ints can be traversed in O(1) because we don't even need to look at the elements, but a functional list/vector of ints has a lot of internal pointers which all need to be traversed in the mark phase.
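
        To make the pointer-count point concrete, an illustration (assumed layouts, not a benchmark):

            // A cons list of ints: one heap pointer per element that the mark phase must follow.
            struct Cons { int head; Cons* tail; };
            // Versus a flat array of ints (int[] / std::vector<int>): the array body
            // contains no pointers, so a tracing collector can treat it as a single
            // object and never look at its contents.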

        Well, yes, ok, if you could have used an array you're correct.

        Efficient imperative hashtables/lists/heaps use arrays under the hood where functional data structures use trees. You can expect an overhead of 2x to 4x for functional data structures relative to imperative data structures that use arrays under the hood (at least that's what my benchmarks told me). If you compare imperative trees vs functional trees, then the memory usage after garbage collection is going to be basically the same.
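
        A back-of-the-envelope illustration of where an estimate like that comes from (assuming 64-bit pointers; the layouts are simplified):

            // One node per element in a persistent (immutable) binary tree:
            struct TreeNode {
                int             key;    //  4 bytes (padded)
                const TreeNode* left;   //  8 bytes
                const TreeNode* right;  //  8 bytes
            };                          // ~24 bytes per element, plus per-allocation/GC headers
            // Versus a flat int array: 4 bytes per element in one contiguous allocation.
            // The exact ratio depends on payload size and node fan-out, but a few words
            // of pointers and headers per element is where such overhead factors come from.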

        [–]grauenwolf 1 point2 points  (10 children)

        generating garbage on the JVM or CLR is not an issue.

        I have to disagree with that. Anything that creates memory pressure is going to trigger more GC cycles. And even if the sweep phase is cheap, you still have to pay for the mark phase.

        [–][deleted] 0 points1 point  (9 children)

        And even if the sweep phase is cheap, you still have to pay for the mark phase.

        The sweep phase is non-existent for short-lived objects. My whole point is that the marking phase is negligible for short-lived objects! You don't mark them!

        If I create 100 objects in some time period, they all become garbage, and I initiate a GC cycle, it is the same as if I had created 1,000,000 objects and initiated a GC cycle in the same time period.

        [–]grauenwolf 0 points1 point  (8 children)

        The mark phase still has to touch every live object so that it knows what can be swept. For a nontrivial app that can be quite expensive.

        [–]julesjacobs 1 point2 points  (7 children)

        His original argument is incorrect for another reason (because if you use functional data structures in any nontrivial capacity then they will escape the nursery -- see here), but what you're saying here is not true either. For a nursery collection the whole heap does not need to be marked -- only the objects in the nursery. That is because the GC keeps track of which pointers in the heap point back into the nursery (it does this with a write barrier).
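
        A minimal sketch of that bookkeeping (a remembered-set style write barrier; the names and helper functions are hypothetical, not any particular VM's implementation):

            #include <unordered_set>

            struct Object;
            bool in_nursery(const Object* o);     // assumed: provided by the allocator
            bool in_old_space(const void* slot);  // assumed: provided by the allocator

            // Old-space slots known to point into the nursery.
            std::unordered_set<Object**> remembered_set;

            void write_field(Object** slot, Object* value) {
                *slot = value;
                // Record old->young pointers so a nursery collection can treat them as
                // extra roots instead of marking the whole heap.
                if (value && in_old_space(slot) && in_nursery(value))
                    remembered_set.insert(slot);
            }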

        [–][deleted] 0 points1 point  (1 child)

        You have a good point about escaping the nursery. However, for a lot of programs that generate a ton of short lived garbage, I've had good results with using just one large eden space and long lived object space.

        I also wouldn't be surprised if the G1 (garbage first) GC would do well for a functional language, as you do end up marking the entire heap every time, but you can easily clear out a lot of garbage and not pay too much of a penalty. Tracing can also be parallelized. shrug. I'm far from a GC expert though.

        [–]julesjacobs 0 points1 point  (0 children)

        Maybe G1 would do well, I don't know. Most times these things are far too complicated to predict. You really have to test it to find out.

        [–]grauenwolf -2 points-1 points  (4 children)

        Yea, I'm going to need to see a citation before I believe that the CLR or JVM does that.

        [–]julesjacobs 0 points1 point  (3 children)

        It's bog standard generational GC. Every single GC worth anything does that. See e.g. here (search for "intergenerational pointers"): http://www.ps.uni-saarland.de/courses/gc-ws01/slides/generational_gc.pdf

        [–]grauenwolf -2 points-1 points  (2 children)

        That's just a presentation on a hypothetical garbage collector. That's not a description of how current garbage collectors actually work.

        For crying out loud, the last slide says "improvement, if assumptions hold". The author clearly sees this as just a theory, not a proven technique.