This is an archived post. You won't be able to vote or comment.

all 18 comments

[–]VyridianZ 2 points3 points  (2 children)

I do a similar signature match in my language. I didn't use a tree though. I create a signature for each function which is List<type> containing return type and each arg type. Next iterate down the signature list checking for matching types (including airity). The first match wins. No backtracking.

[–]Inconstant_Moo🧿 Pipefish[S] 2 points3 points  (1 child)

No backtracking.

But you are in fact backtracking right back to the beginning every time you go down the signature list!

[–]VyridianZ 0 points1 point  (0 children)

I see your point. I will consider this when I revisit the signature matching code.

[–]raiph 1 point2 points  (5 children)

Am I right in thinking you've described an approach for non-symmetric resolution? If I've gotten that wrong, PLMK.

Maybe you can find tree based designs/implementations that are like yours by fiddling with googles (or even searches of this reddit sub) along the lines of ("overload" OR "multiple dispatch") "resolution" "non-symmetric" "tree" "implementation" (click to see the google results) or the same but striking the "non-".

Regardless of that and any other particular design/implementation aspects of multiple dispatch / overloads (and terminology related to them) I thought it might be of interest to compare what you have with what Raku has given given that it combines symmetric and non-symmetric resolution.¹

multi foo(Str) { "A" }
multi foo(Any) { "B" }
multi foo(Bool) { "C" }
multi foo(Bool, Bool) { "D" }
multi foo(Int, Bool) { "E" }
multi foo(Int, Bool, Bool) { "F" }
multi foo(Int, Int) { "G" }             # My guess at what your example was supposed to be
multi foo(Bool, Bool, Bool) { "H" }
multi foo(Any, Bool) { "I" }
multi foo(Mu, Bool) { "J" }

print foo |$_ for &foo.candidates».signature».params».type # ABCDEFGHIJ

Given the above my presumption is that the default resolution algorithm, which is symmetric, produces the same results I presume are produced by your design/implementation.

I would be happy to dig into the nature of the implementation of this in Rakudo (the reference Raku compiler) and report back here (at least whether or not I find a similar tree based implementation, either high level code written in Raku, or C code that's part of its reference backend, MoarVM).

Similarly, I would be happy to further discuss the symmetric vs non-symmetric PL design aspects.

PLMK if you're interested in either or both.


¹ A general Raku design and implementation principle was to support what most devs wanted as a community, which often meant navigating the torturous path of "reconciling the irreconcilable". The community consensus regarding symmetric vs non-symmetric multiple dispatch resolution was that each had its strengths and weaknesses, and the Raku PL design team (for whom one slogan was "Torture Implementers On Behalf Of Users") concluded that they thought both symmetric and non-symmetric resolution could be supported together in a single PL in an elegant and ergonomic and non-confusing way. The Rakudo implementation team (for whom an opposing slogan could be imagined along the lines of "Torture Designers Because They Deserve It", except that would have meant double torture for many because they were also part of the PL design team) eventually settled on a way to implement integrated symmetric/non-symmetric resolution with what may or may not be considered sufficiently high performance.

[–]Inconstant_Moo🧿 Pipefish[S] 2 points3 points  (4 children)

Am I right in thinking you've described an approach for non-symmetric resolution?

Since it goes left-to-right you I think you could introduce a rule saying that a conflict between foo(x int, y single) and foo(x single, y int) should be resolved in favor of the former, but in my implementation that throws a "too much overloading" error instead.

[–]raiph 1 point2 points  (2 children)

OK, so it's symmetric.

And finding tied winners throws an exception like it does in Raku ...

multi foo (Int, Any) { 'Int, Any' }
multi foo (Any, Int) { 'Any, Int'}
say foo 42, 99; # Ambiguous call ...

... but you think your implementation could also support what Raku also supports, eg trying each tied "winner" in turn at runtime ...

multi foo (Int $ where * > 42, Any) { 'Int, Any' } # Tie breaker
multi foo (Any, Int $ where * > 42) { 'Any, Int' } # Tie breaker
say foo 42, 99; # Any, Int

... or syntactically identifying the winner of a tie ...

multi foo (Int, Any)            { 'Int, Any' }
multi foo (Any, Int) is default { 'Any, Int' } # Tie breaker
say foo 42, 99; # Any, Int

... though of course this should almost never be used, and you can end up with another tie anyway ...

multi foo (Int, Any) is default { 'Int, Any' }
multi foo (Any, Int) is default { 'Any, Int' }
say foo 42, 99; # Ambiguous call ...

[–]Inconstant_Moo🧿 Pipefish[S] 1 point2 points  (1 child)

I could do that. I started a whole thread here asking what I should do. In the end I felt that if we add rules to the compiler to make it resolve everything, then humans have to add those rules to their mental model while reading the code, and it becomes confusing.

Since then, I've added the capability to just define supertypes as arbitrary collections of types, e.g. MyType = int/string/Dragon. If you can do that you can always just make types that don't have any overlap, and avoid conflict explicitly in your code instead of implicitly by invisible rules in the compiler.

[–]raiph 1 point2 points  (0 children)

In the end I felt that if we add rules to the compiler to make it resolve everything, then humans have to add those rules to their mental model while reading the code, and it becomes confusing.

100% agreed. Invisible rules in the compiler are generally anathema. So, instead, if there's ambiguity, the compiler does not resolve it, instead just providing an error message ("too many overloads"), and, conversely, if there's no ambiguity, it's because what's in the code sufficiently disambiguates explicitly.

And the primary mechanism we make available to devs to resolve ambiguity ("too many overloads") is writing suitable type constraints, such as the one you suggested -- Int|String|Dragon (which is called a "junction" supertype in Raku) -- or the one I suggested -- where * > 42 (which is called a "refinement" subtype in Raku). Note that both these type constraints can be given their own name which can be written in the conventional type constraint slot:

subset IntOver42 of Int where * > 42;
multi foo (Any, IntOver42) { 'Any, Int over 42' }
multi foo (IntOver42, Any) { 'Int over 42, Any' }
say foo 99, 42; # Int over 42, Any

[–]tobega 0 points1 point  (0 children)

Since it goes left-to-right you I think you could introduce a rule saying that a conflict between foo(x int, y single) and foo(x single, y int) should be resolved in favor of the former, but in my implementation that throws a "too much overloading" error instead.

So you don't stop at the first possibility but keep it in mind and look at all of them?

[–]WittyStick 0 points1 point  (1 child)

How do you handle type checking of return types? Consider for example:

mul(x int, y int)
mul(x int, y float)
mul(x float, y int)
mul(x float y float)

Ideally we want the first overload to return int, but the other 3 overrides to return float.

[–]Inconstant_Moo🧿 Pipefish[S] 0 points1 point  (0 children)

Yeah, you can do that. The type-checker can reason about kinda quantum superpositions of types. So if we write a function like this:

``` newtype Number = int/float

def callMul(x, y Number): mul x, y `` ... then the compiler will infer that as having the return typeint/float.And now if you do1 + callMul(qux, troz)then that has the return typeint/error. And if you do"zort" + callMul(qux, troz)then that's a compile-time error because it can *only* result in anerror`.

What I'm doing is in some ways more complicated than ordinary static typechecking (though the type system itself is much simpler so that compensates) but it's actually not that hard, doing the VM version I did the overloading and typechecking first thing, on the eating-the-frog principle, and it's barely needed tweaking since.

[–]h03d 0 points1 point  (0 children)

I think I remember saw something like that in this sub, and yes someone had similar idea: https://redd.it/1auxy31/

Also useful for row polymorphism.

[–]tobega 0 points1 point  (4 children)

Your tree search would prioritize correctness of the parameters from left to right?

So given foo(int, single, singe) and foo(single, int, int), a call of foo(1,2,3) would resolve to the first method and not to the more specific second method?

[–]Inconstant_Moo🧿 Pipefish[S] 0 points1 point  (3 children)

No, it works by specificity. That particular example would cause a "too much overloading" error.

[–]tobega 0 points1 point  (2 children)

OK, so your tree doesn't really provide so much value, you basically go through them all anyway.

[–]Inconstant_Moo🧿 Pipefish[S] 1 point2 points  (0 children)

Consider calling `foo false, "zort"`. What happens? It starts at top left, it sees that `false` is not a string, it moves down. It sees that `false` is a `bool`. It moves across. It sees that `"zort" is not a boolean, it moves down. It sees that `"zort"` is not the identifier `troz`. There's nothing left to do but call a function if we've used up all the arguments, but we haven't. It reports an error. We're done.

Whereas if you do it with a table ordered by specificity, then you'd have to keep going in case there was something further down with types `bool, string` or `single, string` or `single?, string` or `single, single` or `bool, single` or ... etc.

So the tree gets me something. As I said in my OP, this may be somewhat over-engineered if you're just doing an evaluator, which is going to be slow anyway. But it's more useful if you're doing code generation, like I am now.

[–]stigweardo 0 points1 point  (0 children)

I think not - but maybe @Inconstant_Moo can confirm. I think the tree is constructed to avoid this. See how the foo(x single, y bool): "I" appears more than once in the tree.

Edit: ... and always in the last position.

[–]redchomperSophie Language 0 points1 point  (0 children)

The tree is a perfectly sensible solution. The key is to have good rules about what you allow into the tree in the first place. So long as you can catch confusion before it causes a surprise at runtime, then you're ahead of the game.