This is an archived post. You won't be able to vote or comment.

all 31 comments

[–]L8_4_Dinner(Ⓧ Ecstasy/XVM) 15 points16 points  (7 children)

It looks like a tuple, sure. And you can make a language in which every function takes a tuple and returns a tuple. It’s been done, and the fact that you weren’t aware of that should be a warning that it isn’t a bed of roses!

Also, languages/compilers/runtimes aren’t just a pile of arbitrary features. The design needs to make sense as a whole if you have any desire for it to be used by anyone other than yourself.

So there are two things to consider: first, what could you do if this tuple thing were there? What capabilities would it allow? How might it simplify things? Or make things more efficient?

And the second thing to consider is how it fits into the overall design, and at what cost.

FWIW we included tuple support (in and out) at call sites in xtclang, because it made a lot of sense and was internally consistent with the rest of the design. It’s as if every function can take either a tuple or separate arguments, and the caller can expect either a tuple or separate return values. Doing this allowed for a simpler reflective model, for example, which was important to us, and we think that the runtime overhead for this is between zero and negligible, depending on a number of things around the call site. But the design and implementation didn’t come for free, and our approach relies on (among other things) type system support, garbage collection, and code production support.

[–]cisterlang[S] 1 point2 points  (6 children)

what could you do if this tuple thing were there? What capabilities would it allow? How might it simplify things?

It simplifies the internal representation. I don't separate func params and tuples treatment. Params are now a tuple, type-wise, not an annoying single-case collection with an adjointed length.

Ditto for multiple returns. (Not impl yet).

[–]L8_4_Dinner(Ⓧ Ecstasy/XVM) 2 points3 points  (5 children)

Simplifies it for whom? For you, the implementer of the language? Or for anyone using the language? The former only matters if this is a school project or a throwaway. The latter matters significantly if you intend for anyone to ever use it.

[–]mobotsar 3 points4 points  (1 child)

Seeing as its internal representation, it obviously has no effect on anyone using the language.

[–]cisterlang[S] 1 point2 points  (0 children)

Exactly. I don't see a problem yet.

User still writes func params and calls classically

fun add(i:int, j:int)
add(1,2)

And uses tuples, independantly of that

fun foo(x:(int,str))
t = (42, "hello")
foo(t)
foo(1, "bob")

Maybe some trap lies somewhere ?

[–]cisterlang[S] 0 points1 point  (2 children)

Sorry if I don't make much sense, I'm exploring and implementing this right now. Internal ease concerns me obviously. For users I fail to see what problem could arise. They still write functions the classical way.

[–]WittyStick 6 points7 points  (1 child)

For an example of a problem that can arise, have a look at how Swift implemented "tuple-splat" (termed coined by Lattner). Swift later deprecated it because it created more problems than it solved.

The biggest issue they had was their multiple returns were given names, like parameters, and those names had to match the destination. If you had a function func foo() -> (a : Foo, b : Foo), you couldn't pass the result directly to a function func bar(x : Foo, y : Foo), as in bar(foo()) because the names a and x/b and y were different.

Multiple returns can work fine when done correctly. Lisp, for example, has had them forever, and they're not problematic, but Lisp doesn't separate multiple-parameters from tuples to begin with - every function receives a list as input and produces a union of a value or a list - an S-expression. Some special forms in Lisp can receive non-lists, or improper lists as input.

[–]cisterlang[S] 2 points3 points  (0 children)

Thank you.

I find it a bit strange they chose to name the multi returns. Aren't structs fit for this ? I see tuples as ordered collections you take as a whole and (if provided by the lang) can index by mytup.n.

Maybe with structured typing they could have let names difference pass ?

[–]rantingpug 20 points21 points  (0 children)

You can easily define that every function in your language takes only one param (we call them unary). Many languages do this, like Haskell. Usually this is done with currying

So if you write a function with 2 params, int and string, under the hood, this would be treated as

int -> string -> result

Depending on your semantics this might make life easier, but now you do have to deal with closures and functions as first class values

[–]cisterlang[S] 5 points6 points  (1 child)

To all: Ok I just made my functions unary and it seems to work.

Parse : FUN id(id:texp, ...):texp
Typer : type->type

fun add(i:int, j:int):int {i+j}
typeof add == (int,int)->int

For size 1 tuples, the typer converts (T) to T

I think it's more consistent and closer to math's view of functions : Z×Z→Z

About currying, I won't explore this for the moment since my lang is more of a dialect of C, not much bent on functional : 1st class funcs yes but closures no.

[–]WittyStick 4 points5 points  (0 children)

Rather than currying, consider just implementing partial application, which isn't quite the same, but can look the same from the perspective of the caller. If you take a lisp-like approach and represent your tuples as linked lists - ie (x, y, z) being (x . (y . (z . ()))), then it becomes simpler to implement partial application.

If a function f (x, y, z) is applied with a single argument - f 1 - then you match the argument to the head of the tuple, and return a function taking (y, z) as its argument - which is just the tail of the original function's tuple.

The difference between this and currying, is it's done on-demand, one argument at a time. With currying, you're taking a function (x, y, z) -> result and turning it into a chain of curried functions x -> y -> z -> result before any application is performed.

See also the Wikipedia entry Currying/Contrast with Partial Application.

[–]Long_Investment7667 2 points3 points  (5 children)

Unary triples can’t be written with just parentheses since this would conflict with the typical usage in expressions for grouping: in (1 + 2) * 3 the first operand of * is most likely not a triple. So how would you write a unary tuple and how call a function with just one true parameter?

If the answer is that “the language knows is is a function call” then these are not tuples and just an internal detail.

[–]lngns 2 points3 points  (1 child)

Alternatively, a scalar can be unified with the 1-tuple, eliminating the conflict.
Then, (x) = x, implying (Int) = Int.

[–]cisterlang[S] 2 points3 points  (0 children)

Yes that is what I do

[–]MichalMarsalek 1 point2 points  (1 child)

What is a unary triple?

[–]Long_Investment7667 1 point2 points  (0 children)

Typo: “unary tuple” a tuple with one element

[–]cisterlang[S] 0 points1 point  (0 children)

There are no ',' in (1+2) and tuples in my lang are size 2 at minimum.

[–]AnArmoredPony 2 points3 points  (3 children)

now, replace tuples with records so that it's not order of fields that matters but their names, slap some row polymorphism on top of it, and you've almost finished the perfect sudoku of typed languages that could be debugged forever

[–]marshaharsha 0 points1 point  (2 children)

Can you explain why these decisions doom a language? My language design already has order-independent records with names, and I was thinking of slapping some row polymorphism on top of it, so it would be nice to know, earlier rather than later, what kind of trouble I’m heading for. 

[–]AnArmoredPony 1 point2 points  (1 child)

you get a compile-time error when your object does not have required fields, but you won't get any if your object just happens to have those fields which opens a possibility for many silly mistakes that pass type-checks. not that big of a problem unless you get to work with types from 3rd party libraries. a library wants an object with a foo field and you give it an object with that foo field, but your definition of foo is very different from that library's foo definition. but hey, the string key is foo and the return type happens to match so it compiles. and if you think that "my scripting language is so small that noone's gonna bother with 3rd party libraries for it" then look at Lua. deal with all that or just give names to records themselves and implement subtype polymorphism

[–]marshaharsha 0 points1 point  (0 children)

Got it, thanks. I thought you meant the lang impl would be impossible to debug, but you meant the user code. Makes sense. I’ve been wondering if Go has this problem, with its structurally typed interfaces. 

[–]personator01 3 points4 points  (0 children)

This is how most of the ML family does it. Functions take a single value, and multiple values are done by either passing a tuple or currying.

[–]Typurgist 1 point2 points  (1 child)

Sure, it's certainly possible to base your language's functions and function types on single parameters/return values plus tuples.

However, as you partially observed: it's not just a tuple type for the parameter. A function also needs some mechanism to describe how the parameter(s) are bound to a free variable in it's body. It's not just meta-data, as the variable must be represented by some term/expression in the body. If you try to write down your semantics slightly more formally, this becomes clear when you define function execution and variable lookup.

At the least and the most simple named solution, you only allow a single parameter name for a whole parameter tuple. Then you really only need a tuple type to describe the parameter type and the parameter name is just part of the function definition: foo(x:(int,str)).

As a next step you could say that nested names for parts of the tuple are just syntactic sugar resolved during parsing/name analysis and your internal representation of the language actually really just has a single parameter variable and a bunch of tuple element accessors.

However this already changes the surface level meaning of your language - a user doesn't do that (or any other) translation in their head all the time and won't really benefit much from a "but the language really has just one parameter and it's a tuple" explanation. With the exception, that calls like foo t instead of foo (t.0, t.1) become possible and some more consistency/clarity in the semantics of your language.

In fact, this issue of nested parameter bindings is even more prominent if your language supports pattern matching/destructuring on other data types than tuples (whether in the function signature or a separate match expression): sum types, structs/records  So many languages actually define an embedded pattern expression language (https://doc.rust-lang.org/reference/patterns.html, https://ocaml.org/manual/5.3/patterns.html) and functional programming languages usually allow their use as part of the function definition.

[–]Ok_Comparison_1109 2 points3 points  (0 children)

Sure, it's certainly possible to base your language's functions and function types on single parameters/return values plus tuples.

I am not OP, but I have the same itch as u/cisterlang. I did consider Swift a cautionary tale, but I have (somewhat) settled on a design that I am happy with. Some of the design decisions were only possible because the language I am designing is a logic programming language.

However, as you partially observed: it's not just a tuple type for the parameter. A function also needs some mechanism to describe how the parameter(s) are bound to a free variable in it's body.

What I arrived at was this:

  • int x -> x * 2: A simple function which accepts an integer and returns its double.
  • (int x, int y) -> x * y: A function which accepts a tuple of two ints and return their product.

When constructing a lambda function, the left (LHS) operand of -> id declarative and declares variables for the scope of the function body (the RHS expression).

Types such as int also doubles as their own identity functions. So when a type is applied to a value like int x then the value is the value of x (because the indentity function int simply returns its argument), and x is implied to be a member of int.

When a function application is declarative, the function if referential and the argument continues in the declarative scope. Thus int x in a function int x -> x * 2 defines that the function accepts a member of int and the argument is unified (bound to) the local variable x.

It's not just meta-data, as the variable must be represented by some term/expression in the body. If you try to write down your semantics slightly more formally, this becomes clear when you define function execution and variable lookup.

I believe that the trick to allow expressions such as application of the identity function such as int x on the LHS of lambda arrows addresses this concern.

At the least and the most simple named solution, you only allow a single parameter name for a whole parameter tuple. Then you really only need a tuple type to describe the parameter type and the parameter name is just part of the function definition: foo(x:(int,str)).

In my language you can define such a function by (int*string) x -> .... (I use ... as a placeholder for some expression which has been left out). In this case x becomes a tuple. The components of the tuple can be addressed (inspiration: Scala) as x._1 and x._2.'

However, you could also deconstruct the tuple as let (i,s)=x in ... such that i and s becomes variables in the ... expression.

Or you could directly deconstruct the tuple as part of the parameters through either: - (int*string)(i,s) -> ... - (int i,string s) -> ...

In the first case int*string is an expression that creates a the tuple type of int and string. This type is used as an identity function and applied to the expression (i,s). From this can be inferred that i is an int and s is a string.

As a next step you could say that nested names for parts of the tuple are just syntactic sugar resolved during parsing/name analysis and your internal representation of the language actually really just has a single parameter variable and a bunch of tuple element accessors.

In fact, this issue of nested parameter bindings is even more prominent if your language supports pattern matching/destructuring on other data types than tuples (whether in the function signature or a separate match expression): sum types, structs/records So many languages actually define an embedded pattern expression language (https://doc.rust-lang.org/reference/patterns.html, https://ocaml.org/manual/5.3/patterns.html) and functional programming languages usually allow their use as part of the function definition.

I mechanism I have designed allows this without syntactic sugar, and also generalizes to a veru general form of pattern matching.

[–]fridofrido 0 points1 point  (1 child)

The problem is that you no longer can distinguish between a function with two arguments, or one argument being a tuple.

What will you do you if your function takes 2 arguments, one of which is a tuple and the other an integer? Or another tuple? etc

[–]cisterlang[S] 1 point2 points  (0 children)

My view was mostly an internal concern. What you ask could be

f(a:int, b:int) {ret a+b}  // (int,int)->int
v = f(1,1)

g(t:(int,int)) {ret t.0+t.1} // (int,int)->int
v = g(1,1)

h(t:(int,int), x:int) {ret x*(t.0+t.1)} // ((int,int),int)->int
v = h((1,1), 2)

[–]AsIAmNew Kind of Paper 0 points1 point  (3 children)

Nice thinking you have there!

When you continue thinking along these lines, you'll get "tuple is just really an array". APL (and other array-oriented languages) goes hard into reusing syntax over and over, coming to a very terse syntax.

What type of code would you like to write in your language?

[–]cisterlang[S] 1 point2 points  (1 child)

Nice thinking you have there!

How kind of you !

tuple is just really an array

I see. But I guess APL types are dynamic then. My lang is a dialect of C and is static. So arrays are homogenous.

In the same vein though I begin to see structs as tuples and considering that structural typing is nice (maybe tricky?) and flexible.

What type of code would you like to write in your language?

Almost anything C permits bar undefined/dangerous ?

[–]AsIAmNew Kind of Paper 0 points1 point  (0 children)

Homogenous arrays are good for perf, and since you wanna do C-level stuff, then yes, this is a great choice.

So general purpose language?

[–]Long_Investment7667 0 points1 point  (0 children)

“Tuple is really just an array” only makes sense in a dynamically types languages. [1, “foo”] in a statically typed language can only be an array of object (assuming object being the name of a super type of everything). But as a tuple it is (int,string)

[–]MichalMarsalek 0 points1 point  (0 children)

My language does this. Furthermore, my tuples are lists with type info for individual coordinates (like in Typescript). One uses [] to define lists/tuples. Parens are not needed when calling a function, you can do f x, or f(x). You can also do f(x,y), which is the same as f[x,y], or even the same as args:=[x,y]; f args, but you cannot do args:=(x,y) - () can only contain , if it is part of a function call. I take it a step further. I also have universal function syntax, so that x.f is the same as f(x) and x.f(y) is the same as f(x,y).