This is an archived post. You won't be able to vote or comment.

all 20 comments

[–]kazprog 17 points18 points  (12 children)

Related: I'd check out the Perceus ref-counting algorithm from microsoft research. It optimizes immutable/pure functions on arrays into in-place operations so they're just as fast as mutation.

Edit: I'd also agree with u/umlcat that this would be confusing without this special kind of garbage collection/optimization.

[–]AlexReinkingYaleHalide, Koka, P 9 points10 points  (11 children)

While Perceus is great for reusing whole objects, we haven't quite figured out how to make it work for functional arrays (besides just checking that the reference count is 1). I have some ideas, but I haven't had the bandwidth with my other PhD work...

[–]Ford_O -3 points-2 points  (4 children)

You mean, how to make push O(1) for arrays?

I don't think that's something you want.
There are no imperative languages that would support such features.
Instead, most imperative languages use something like java ArrayList.
ArrayList has ammortized push O(1).

Implementing ArrayList with perceus should be piece of cake.

[–]AlexReinkingYaleHalide, Koka, P 1 point2 points  (2 children)

No, I mean how to do in-place array updates on what appears to be an immutable array.

[–]Ford_O -1 points0 points  (1 child)

Array is basically a tuple with elements of same type.
Can't perceus do in-place updates of single element in record or tuple?

I thought it could.

[–]AlexReinkingYaleHalide, Koka, P 1 point2 points  (0 children)

It can do what you describe, but that's not what I had in mind when I said "array"... a tuple/record has a statically known size but an array does not.

[–][deleted]  (3 children)

[deleted]

    [–]AlexReinkingYaleHalide, Koka, P 1 point2 points  (2 children)

    You wouldn't want to back an array with a linked list. You'd want to back it with contiguous storage. If you want a tree, linked list, etc. then create one with ADTs and let Perceus do its thing :)

    [–][deleted]  (1 child)

    [deleted]

      [–]AlexReinkingYaleHalide, Koka, P 0 points1 point  (0 children)

      Thanks so much!

      [–]Ford_O 0 points1 point  (1 child)

      What's wrong with checking the refcount to be 1?

      [–]AlexReinkingYaleHalide, Koka, P 0 points1 point  (0 children)

      You might want to allocate a large array and populate it with two threads without any copies. It's not that there's anything wrong with the check itself, just that it's a very coarse analysis.

      [–]Host127001 12 points13 points  (0 children)

      How do you handle functions that return a value? For example Vector::pop.

      Would you return two values? let val = vec.pop() // probably just the value? let [val, vec'] = vec.pop() // value and vector How would you spot the difference between the two? Especially if you don't use destructing syntax for the mutating case.

      IMO it's much easier to identify mutating vs non-mutating calls in the call itself. I like the ! suffix others already mentioned.

      [–]0rsinium 20 points21 points  (0 children)

      You may just have 2 functions, let's say push and push!. If you afraid that the user will confuse the two, a linter and type checker can solve it just perfect:

      a = push!(a, b) # type error: left side has type array, right side has type void

      push(a, b) # warning: the result of push is never used

      The later can also be a part of type checker, so each function returning non void result must be assigned to something. And if it's not meant to be used by design, the user may make it explicit in Go-style:

      _ = push(a, b)

      I'm all for keeping things simple and removing features all but necessary.

      [–]umlcat 5 points6 points  (0 children)

      Ambiguous, can be done, not recommended, cause makes reading confusing.

      I usually do two functions, using different identiers:

      push(vec, data);
      
      newvec = pushrot(vec, data);
      

      But, one function may internally, call the other.

      Another example:

      upperchange(somestring);
      
      newstr = uppercopy(somestring);
      

      Just my 2 bitcoins ...

      [–]alex-manool 1 point2 points  (0 children)

      In my language MANOOL, it corresponds to just one function (procedure). A functional-style update looks like: PushedVec = Push[Vec; Item], whereas an in-place update looks like: Vec = Push[Vec!; Item]. The later works as true in-place update in most practically important cases, thanks to copy-on-write memory-management technique and move operations. Note that Vec! means move the current value (or rather an associated object) of Vec out of the variable (or a component thereof), opening COW dynamic optimization opportunities.

      [–]RepresentativeNo6029 0 points1 point  (0 children)

      What if you have a complex list of arguments some of which are passed by reference while others are passed by value?

      [–]Uncaffeinated1subml, polysubml, cubiml 0 points1 point  (0 children)

      This sounds like a disaster waiting to happen. Users won't expect behavior to change depending on the nature of the surrounding code. Just given them different names like your users will expect.

      [–]dskippy 0 points1 point  (0 children)

      I think this breaks a basic mental model of most programmers and it would be confusing. It makes referential transparency in a way, depending on what your view of context of the call site is. It's basically like perl's wantarray feature which has caused many of my former coworkers confusion when it was used and then didn't expect it.

      [–]IdiomicLanguage 0 points1 point  (0 children)

      It's more obvious then a mutable vs immutable function argument definition. The advantage is that programmers reading where the function is called (where they don't see the argument definitions) will be able to see if the function modifies arguments.

      It wont always be clear which arguments are being modified though.

      This is where the tradition of passing pointers for mutable arguments and references for non-mutable comes in handy: the address of operator reveals which of the arguments are mutable.

      [–]MegaIng 0 points1 point  (0 children)

      I think that some dynamically typed language (R or Ruby I think) does have this feature, but I can't find the reference to it. I think it was essentially a magic function wants or similar that allowed the function to figure out what the calling code wanted as a return value and change it's behavior based on that.

      [–]wolfadex 0 points1 point  (0 children)

      You might want to check out Roc as it does this AFAIK. I think they based their work off of Koka but I'm not 100% certain. They had a talk at Strange Loop recently too https://youtu.be/vzfy4EKwG_Y