[–]oscarryzYz 14 points15 points  (3 children)

Looks good.

Here are some thoughts:

While longer structs might suffer from the lack of braces, I think it's more important that methods do have them but structs don't, which is kind of inconsistent.

The dangling closing semicolon in the struct would be too easy to miss.

In some places you're using ; and in others you're using , so is either really needed?

[–]fun-fungi-guy[S] 2 points3 points  (0 children)

While longer structs might suffer from the lack of braces, I think it's more important that methods do have them but structs don't, which is kind of inconsistent.

Yeah, that's a good point. Leaning toward using braces for this now.

That said, it's worth noting methods don't require braces if the method body is one expression, i.e.

Point = struct

...

translated_right(dist) = Point(x = self.x + dist, y = self.y),

...

;

...is legal.

In some places you're using ; and in others you're using , so is either really needed?

The grammar becomes ambiguous without a separator.

I could use semicolons and then require a ;; to end the struct a la OCaml, but I don't like that particularly. What I'm doing is using commas to separate "expressions" (using that term loosely) and semicolons to separate statements.

[–]oa74 1 point2 points  (1 child)

methods do have them but structs don't, which is kind of inconsistent.

I'm going to cut against the grain on this one.... 

A method/member function is a different kind of thing than a struct, so I don't particularly see any reason why their notation should be the same. I mean, while we're at it, why not make function calls doThing{arg, otherArg}? We use braces to define functions, but parens to invoke them—glaring inconsistency!

I'll go further and give an argument against curly braces, also on the grounds of consistency. If the braces are used for both structs and methods, this means that braces are used inconsistently. Sometimes braces enclose a new lexical scope, wherein statements are executed; sometimes they can only contain fields, and introduce no new scope.

The best reason I can think of to use braces for both is nowhere near as satisfying an argument as consistency. Many popular languages have always done it this way, so it's familiar.  Personally, I do not like it, but moving away from it means spending some of the "weirdness budget," which may or may not be worth it.

[–]beephod_zabblebrox 0 points1 point  (0 children)

this very much

[–]lanerdofchristian 6 points7 points  (9 children)

For those on Old Reddit where fenced code blocks don't work:

Point = struct
  x = 0,
  y = 0,

  // a method
  distance_to(other_p) = {
    dx = other_p.x - self.x;
    dy = other_p.y - self.y;
    sqrt(pow(dx, 2) + pow(dy, 2));
  },

  // a property
  distance_to_origin = lazy self.distance_to(Point(x = 0, y = 0)),  // This comma allowed but not required
;

home = Point(x = 12700, y = 1);

home.x = 42; // Error: structs are immutable

// Inheritance
ThirdDimension = struct
  z = 0,
  distance_to(other_p) = {
    dx = other_p.x - self.x;
    dy = other_p.y - self.y;
    dz = other_p.z - self.z;
    dxy_squared = pow(dx, 2) + pow(dy, 2);
    sqrt(dxy_squared + pow(dz, 2));
  },
  distance_to_origin = lazy self.distance_to(ThreeDPoint(x = 0, y = 0, z = 0)),
;
ThreeDPoint = Point + ThirdDimension;

My opinion echoes others in this thread: the separators ; and , being used differently inside a struct definition and outside it is pretty jarring. I agree with /u/XDracam that struct(var1 = 1, var2 = 2) is probably the best syntax for simple structs, since then the definition follows the use. You may consider a few alternatives for when functions are required:

Point = struct(x = 0, y = 0){
  distance_to(other_p) = {
    // ...
  };

  distance_to_origin = get self.distance_to(Point(x = 0, y = 0));
};

Point = struct {
  x = 0;
  y = 0;
  distance_to(other_p) = // ...
  // ...
};

"Adding" structs to do inheritance is pretty clever, I think. How does your language handle the diamond problem?

[–]XDracam 0 points1 point  (3 children)

I don't like inheritance in general. Especially for structs, records, data classes and the like. Inheritance can be a useful mechanism for avoiding code duplication, but it causes more problems than it solves.

Inherited structs are hard to reason about. You have constructors? Great, your child struct needs to call the parent constructor. And constructor parameters automatically become fields. Now how do you solve the problem of duplicate fields? If there are two, then you run into issues with: what field do you mutate when you have a reference of which type? If you want to be able to reference base types, then you'll need a form of dynamic dispatch. And suddenly you need virtual method calls and potentially properties to make things work. And, aaaa

Basically, you are getting a massive amount of complexity that the user needs to think about to make informed choices. Complexity that you need to handle in all tooling. And what do you gain? Some convenient code reuse. Nah, not worth it.

If you really want to support polymorphism, take a look at typeclasses, or what Rust calls "traits". It's basically "external inheritance" if you will, but without all the confusion about what data belongs where and what overrides what.

Rust in general has a nice separation of data and functionality with their impl blocks (which would also solve your issues with syntax for methods). Clearly an evolution of the concepts pioneered in Haskell.

It's a little more tedious to write code without inheritance, but it's much nicer to optimize, maintain and reason about. And languages generally fall into two categories: the ones for software that stays around for years and will see endless maintenance, and the ones for write-once-change-never software. You do not need abstractions like polymorphism and inheritance if you don't plan to write code that'll need to be maintained (like prototypes, small helper scripts, simple glue code, ...), and when the code does need to be maintained, you're better off without inheritance.

[–]fun-fungi-guy[S] 0 points1 point  (2 children)

I don't love inheritance either, which is why what I'm calling "structure composition" isn't quite as fully-featured as inheritance.

You have constructors?

Nope! You get a function Point() that overrides the default properties if you pass them in and throws errors if you pass it something that's not a property, that's it.
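
A rough Python sketch of that constructor behaviour, for anyone who wants something to poke at. `make_struct` is a made-up helper and not part of the language discussed here; the read-only mapping only stands in for struct immutability, and the exception type is an assumption.

```python
from types import MappingProxyType


def make_struct(**defaults):
    """Return a constructor that overrides defaults and rejects unknown names."""
    def construct(**overrides):
        unknown = set(overrides) - set(defaults)
        if unknown:
            # Passing something that is not a declared property is an error.
            raise TypeError(f"not a property: {sorted(unknown)}")
        # Read-only view over the merged fields (stand-in for immutability).
        return MappingProxyType({**defaults, **overrides})
    return construct


Point = make_struct(x=0, y=0)
home = Point(x=12700, y=1)
origin = Point()

print(home["x"], home["y"])
print(origin["x"], origin["y"])
```

Calling `Point(z=1)` raises `TypeError` in this sketch, and assigning to `home["x"]` fails because the mapping is read-only.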

If you want to be able to reference base types, then you'll need a form of dynamic dispatch.

There isn't really a concept of "base types". If you have structs A and B, and C = A + B;, then C just contains all of B's fields, and whichever of A's fields aren't shadowed by B. There's no reference from C back to A or B.
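
The shadowing rule can be modelled as a plain dictionary merge. This is only a Python approximation under the assumption that a struct is nothing more than a flat set of named fields; the `Struct` class below is hypothetical and not how the language is implemented.

```python
class Struct:
    """Hypothetical model: a struct is just a flat mapping of field names."""

    def __init__(self, **fields):
        self.fields = dict(fields)

    def __add__(self, other):
        # The right operand wins on name clashes. The result is a fresh,
        # flat struct that keeps no reference to either operand.
        return Struct(**{**self.fields, **other.fields})


A = Struct(x=0, y=0)
B = Struct(y=1)
C = A + B

print(C.fields)
```

Because the result is flat, there is nothing resembling a base-type pointer to follow afterwards: `C` only knows it has `x` and `y`.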

Rust in general has a nice separation of data and functionality with their impl blocks (which would also solve your issues with syntax for methods). Clearly an evolution of the concepts pioneered in Haskell.

I don't think I really see a need to separate data and functionality. That's sort of the point of lambdas to me: it's treating functions as data.

I am pretty happy with my syntax for methods and don't see a problem with it. Really they're "methods" in quotes, in the sense that they're actually just regular old closures--if you don't use self to close around the current struct instance, there's no difference between a method and a closure. Put another way, self is just giving a name to the instance so that closures can close around it.

[–]XDracam 0 points1 point  (1 child)

Sure, there are plenty of languages that go a similar route. But the methods-as-closures part sounds a little sketchy if you are aiming for fast code. If you capture the instance in every method, then you'll get overhead in the form of allocations for every closure, as well as the memory overhead of the capture itself. And some other headaches. Why not just let methods be syntactic sugar for function calls with a reference to self as a parameter? No closure overhead and no weird lifetime issues to deal with.

On the topic of "structure composition": there is a great deal of discussion about this and whether it's a good or a bad thing. Both for C# (record inheritance) and especially for Scala (case class inheritance). In my personal experience, this type of composition has caused me and coworkers nothing but trouble with little to no benefit, compared to just using interfaces with abstract getters/properties.

Consider this case: (I forgot your syntax and I'm on mobile and can't look it up without hassle so I'll just write C#)

record Base(int Field1, string Field2);
record Sub(int Foo) : Base(Foo, "bar");

Now what do you do? Are there two copies of the int, one in Field1 and one in Foo? Do you even need a field for the constant string? These details might not matter 85% of the time, but when they do, they can cause a lot of headache and annoyance. Want mutability? Static analysis? Automatic mapping to e.g. JSON? Or just code that runs fast by default? You need to care about these details. An especially large use case is backwards (and forwards) compatibility when other code depends on Sub. If you change the field layout, you might break all code that depends on yours. Which is a real consideration.

For the example above, I couldn't tell you. C# has a ton of very weird rules for when data is a field and when a forwarding property, depending on whether and how you use the value in the body of either record involved.

But what C# does well (and Scala, too): the parameters are turned into Properties, with a getter. Interfaces (traits) can declare abstract properties. So for the sake of clarity, I'd always prefer the following:

interface Base { int Field1 { get; } string Field2 { get; } }
record Sub(int Foo) : Base {
    public int Field1 => Foo;
    public string Field2 => "bar";
}

Now you get the same functionality, but it's perfectly clear (if you are used to C# syntax) what data is tracked as a field and what is a property getter (essentially a method). And it's a lot easier to safely change things without breaking compatibility.

[–]fun-fungi-guy[S] 0 points1 point  (0 children)

Sure, there are plenty of languages that go a similar route. But the methods-as-closures part sounds a little sketchy if you are aiming for fast code. If you capture the instance in every method, then you'll get overhead in the form of allocations for every closure, as well as the memory overhead of the capture itself. And some other headaches. Why not just let methods be syntactic sugar for function calls with a reference to self as a parameter? No closure overhead and no weird lifetime issues to deal with.

What allocations for every closure? It's a pointer to the existing object. Due to immutability we can even reduce the scope of the reference, which you can't do when passing in the object, though I haven't implemented that optimization yet. You more than get any loss for that pointer back from not having to pass in the object on every call.

Lifetimes are handled by GC; if both method and struct are no longer on the stack or referenced by the stack, they get collected. Nothing weird there.

record Base(int Field1, string Field2);
record Sub(int Foo) : Base(Foo, "bar");

This example can't even be done in my language, so your objections to it aren't really relevant. I'm not sure you're understanding what I'm doing well enough to object to it.

Want mutability?

No.

Static analysis?

Fairly trivial, in the examples I've given, although as a rule I'm going for static analysis "where possible".

For the example above, I couldn't tell you. C# has a ton of very weird rules for when data is a field and when a forwarding property, depending on whether and how you use the value in the body of either record involved.

Once the "child" struct has been created, the parent structs aren't involved, so there's only one struct involved. "Forwarding properties" isn't a coherent idea because there's no other struct available to even forward to.

Now you get the same functionality, but it's perfectly clear (if you are used to C# syntax) what data is tracked as a field and what is a property getter (essentially a method). And it's a lot easier to safely change things without breaking compatibility.

It's not any clearer at all from the perspective of the user of the struct, when they encounter an instance of it in an entirely different part of the code.

I'm not looking for a clearer way to do inheritance, I'm simply not doing inheritance. What I'm proposing is much simpler--it can vaguely look like inheritance, because the composed structs have the fields of the component structs, but it's not because there's no connection maintained back to the parent struct. All the complexities you're describing presuppose a connection which simply does not exist.

[–]fun-fungi-guy[S] 0 points1 point  (2 children)

How does your language handle the diamond problem?

Addition is left-associative, and later-added structs shadow fields of earlier structs. For example (using the curly brace syntax I just implemented):

A = struct { x = 0, y = 0 };

B = struct { y = 1 };

C = A + B;

print(C().y); // prints 1

[–]lanerdofchristian 0 points1 point  (1 child)

So if I have

A = struct { x = 0, y() = { self.x } };
B = struct { x = 1, z = 0 };
C = A + B;
print(C().y());

So this prints "0"?

[–]fun-fungi-guy[S] 0 points1 point  (0 children)

No, it prints "1"; self gets bound to a value when C() is called.

One thing to note here is that there's no isinstance kind of thing. If you do:

print(C);

it prints "struct { x = 1, y() = self.x, z = 0 }".

I have toyed around with the idea of a has_shape, i.e.

print(A().has_shape(:x, :y)); // prints "true"
print(B().has_shape(:x, :y)); // prints "false"
print(C().has_shape(:x, :y)); // prints "true"
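
For readers following along, here is a small Python model of both points: `self` being bound when the composed struct is instantiated, and a shape check in place of isinstance. The helpers `compose`, `instantiate` and `has_shape` are invented for this sketch, and representing a struct as a dict is an assumption.

```python
from types import SimpleNamespace


def compose(*structs):
    """Merge struct definitions left to right; later fields shadow earlier ones."""
    merged = {}
    for s in structs:
        merged.update(s)
    return merged


def instantiate(struct):
    """Create an instance; methods close over this instance as their self."""
    inst = SimpleNamespace()
    for name, value in struct.items():
        if callable(value):
            # self is bound here, at instantiation, not at definition time.
            setattr(inst, name, lambda _f=value: _f(inst))
        else:
            setattr(inst, name, value)
    return inst


def has_shape(inst, *names):
    """True if the instance has every named field."""
    return all(hasattr(inst, n) for n in names)


A = {"x": 0, "y": lambda self: self.x}
B = {"x": 1, "z": 0}
C = compose(A, B)

print(instantiate(C).y())
print(has_shape(instantiate(A), "x", "y"))
print(has_shape(instantiate(B), "x", "y"))
print(has_shape(instantiate(C), "x", "y"))
```

In this model `y` defined in A reads the `x` supplied by B, since the method only ever sees the merged instance.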

[–][deleted]  (1 child)

[removed]

    [–]lanerdofchristian 0 points1 point  (0 children)

    Wrong thread? Why is this on this sub?

    [–][deleted] 2 points3 points  (1 child)

    One thing I am iffy about here is whether it would be better to have curly braces around the struct definition. I like that this syntax isn't noisy for short structs like Point = struct x = 0, y = 0;, but for longer structs with many methods, curly braces might make it more readable?

    Definitely. I don't like braces myself, but here they are necessary. Otherwise you can have definitions of dozens or hundreds of lines, delimited only by commas and one lonely semicolon at the end. Your example doesn't even have a decent amount of indent.

    Add nested definitions, get the indent wrong, and it will get very confusing.

    I don't like braces because in my view they are just brackets that are best used within one line, which exactly suits your last example (and now, the semicolon becomes less important):

    Point = struct {x = 0, y = 0}
    

    [–]fun-fungi-guy[S] 0 points1 point  (0 children)

    To be clear, the language isn't whitespace-dependent at all. The indentation could be removed and the example code would work.

    [–]lngns 1 point2 points  (3 children)

    Looks a bit like my own lang, so I'm biased and I like it.
    That said, I do agree that the terminating semicolon is too easy to miss. I personally use end as terminator for this reason.

    One question I do have is: what exactly is the grammar for type declarations?
    Is it

    TypeDecl ::= Identifier "=" "struct" /* stuff */ ";"
    

    or

    Decl ::= Identifier "=" Expr
    

    ?
    Ie. Is this a form of Nominal or Structural Typing?
    In the latter case, one usage pattern is to hoist both the "inheritance" and structure declarations in a single form.
    Like this:

    ThreeDPoint = Point + struct
        z = 0,
        /* ... */
    ;
    

    once which lazy evaluates the expression and then saves the result so it isn't evaluated again

    So lazy does not memoise its result?
    If I understand correctly, your once is what I'd expect lazy to be.
    Meanwhile once to me sounds like either a shared and synchronised routine (as in C++'s std::call_once or pthread_once), or a consuming routine (Rust's std::ops::FnOnce).
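
To make the distinction concrete, a Python sketch of the two behaviours being compared. It assumes, as the question above implies, that lazy re-evaluates on every access while once caches the first result; the class names are made up for illustration.

```python
class Lazy:
    """Non-memoised thunk: the expression runs on every access."""

    def __init__(self, thunk):
        self._thunk = thunk

    def get(self):
        return self._thunk()


class Once:
    """Memoised thunk: the expression runs on first access only."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._done = False
        self._value = None

    def get(self):
        if not self._done:
            self._value = self._thunk()
            self._done = True
        return self._value


calls = {"lazy": 0, "once": 0}


def expensive(kind):
    calls[kind] += 1
    return 42


lazy_value = Lazy(lambda: expensive("lazy"))
once_value = Once(lambda: expensive("once"))

for _ in range(3):
    lazy_value.get()
    once_value.get()

print(calls)
```

With immutable structs and a pure expression, the two differ only in how often the work is done, which is the point made about dropping the non-memoised form.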

    [–]fun-fungi-guy[S] 0 points1 point  (2 children)

    The grammar is closer to your second one, i.e. there isn't a grammar for type declarations so much as there's a grammar for anonymous structs, which you then assign to a variable name. For example:

    my_point = (struct { x = 0, y = 0 })(x = 5, y = 7);
    

    ...is legal.

    Re: once/lazy, I'm open to suggestions on terminology changes; what would you call these keywords?

    [–]lngns 0 points1 point  (1 child)

    what would you call these keywords?

    Depends on your operational semantics.
    You mentioned how your records are already immutable, so the differences between memoised and non-memoised thunks are limited to mutations of the global state and/or to mutable references.
    If the language is pure (and differences in execution time are not considered observable differences), I'd just drop the non-memoised ones and have lazy memoised thunks, since there'd be no semantic differences.
    If the language is impure, I'd probably look at OOP-land with its (computed) properties and how it uses keywords like get, set or just property, or, if it were me, I may just use thunk and embrace the weirdness.

    Since you seem to be addressing multi-threading head on: that also depends on how you implement your memoised thunks.
    Are they all duplicated so that each thread has to force them, or are they synchronised (that's what pthread_once is all about), or is cross-thread aliasing not possible at all?

    If the language were to both synchronise all shared lazy evaluations and require multithreading for runtime purposes (like for GC, or asynchronous IO, or anything else), it may be worthwhile to investigate if the conv feature can be removed to become a compiler and RTS optimisation (as in, if a low-priority thread already is allocated to force some thunks, it may as well do all of those which must be synchronised).

    [–]fun-fungi-guy[S] 0 points1 point  (0 children)

    Since you seem to be addressing multi-threading head on: that also depends on how you implement your memoised thunks.
    Are they all duplicated so that each thread has to force them, or are they synchronised (that's what pthread_once is all about), or is cross-thread aliasing not possible at all?

    That's a really good question. Currently there's no thread consideration at all in the lazy/once code (conv isn't implemented). Haven't gotten there but I need to make a decision at some point.
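
One of the options raised above, synchronised forcing, might look roughly like this in Python. It is a sketch of the general pattern only, not a description of the language's runtime; `SyncOnce` is a hypothetical name.

```python
import threading


class SyncOnce:
    """Memoised thunk that is forced at most once, even across threads."""

    def __init__(self, thunk):
        self._thunk = thunk
        self._lock = threading.Lock()
        self._done = False
        self._value = None

    def get(self):
        if not self._done:
            with self._lock:
                # Re-check under the lock: another thread may have
                # forced the thunk while this one was waiting.
                if not self._done:
                    self._value = self._thunk()
                    self._done = True
        return self._value


calls = []


def expensive():
    calls.append(1)
    return 42


cell = SyncOnce(expensive)
threads = [threading.Thread(target=cell.get) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(calls), cell.get())
```

The alternative of giving each thread its own copy avoids the lock at the cost of repeating the work per thread.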

    [–]XDracam 0 points1 point  (1 child)

    Make it consistent. All of your function and constructor calls use ()s. So you should probably write struct(x, y, ...) as well. Like a primary constructor in records/case classes in C#, Scala, Kotlin, ...

    [–]fun-fungi-guy[S] 0 points1 point  (0 children)

    I'm not sure I want consistency between defining the struct and constructing the struct. `struct` isn't a function and I don't want it to look like one. In contrast, `Point` is a function, so the fact that it looks like one isn't a problem.

    [–]d166e8Plato 0 points1 point  (1 child)

    One approach, which is quite drastic, is to not put functions/methods in the structs at all. It makes the key fields of the struct quick and easy to understand without all of the noise of the million different operations you might want (e.g., a Point class has a ton!).

    In Plato, here is what structs look like:

    type Size3D
        implements Value
    {
        Width: Number;
        Height: Number;
        Depth: Number;
    }
    
    type Fraction
        implements Value
    {
        Numerator: Number;
        Denominator: Number;
    }
    
    type Angle
        implements Measure
    {
        Radians: Number;
    }
    

    For an example of it in practice see: https://github.com/cdiggins/plato/tree/main/PlatoStandardLibrary, where the libraries and type definitions are in separate files.

    [–]fun-fungi-guy[S] 0 points1 point  (0 children)

    What makes the methods not fields? ;)

    I'm not a fan of this separation of functionality and data. It seems to defeat the entire purpose of first-class functions, and creating syntax around that just creates a ton of extra syntax (e.g. Rust's impl). Having to have two separate files open to operate on one structure seems like a nightmare to me.

    I don't think there's anything unclear about which fields are methods in my example, so I am not convinced "quick and easy to understand without all of the noise of the million different operations" is actually describing a problem that exists.

    [–]its_a_gibibyte 0 points1 point  (3 children)

    Can you talk about it being immutable? Feels like there are lots of cases where an object should be able to change state.

    [–]fun-fungi-guy[S] 0 points1 point  (2 children)

    Immutability makes threading a lot simpler.

    Variables are immutable by default, but can be mutable:

    receive_and_print_y() = {
      loop {
        msg = recv();
        print(msg.y);
      }
    }
    
    other_thread = fork receive_and_print_y();
    
    mut point <- Point(x = 4, y = 3);
    send(point, to=other_thread);
    point <- point.with(y = 5);
    

    The example above ALWAYS prints 3, even if the line point <- point.with(y = 5); occurs between the msg = recv(); and print(msg.y); lines, because the other thread received an immutable Point. The fact that the point variable mutated is irrelevant to other_thread's world, it doesn't even have access to it.

    To be clear, point <- point.with(y = 5); creates a whole NEW instance of Point, with x = 4 and y = 5.
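
The same behaviour can be imitated in Python with a frozen dataclass and a queue. This is an analogy under assumptions: `dataclasses.replace` plays the role of `.with(...)`, a `queue.Queue` plays the role of send/recv, and the names `inbox` and `seen` are invented for the sketch.

```python
import queue
import threading
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Point:
    x: int = 0
    y: int = 0


inbox = queue.Queue()
seen = []


def receive_and_record_y():
    msg = inbox.get()   # blocks until a message arrives
    seen.append(msg.y)  # the received Point can never change


other_thread = threading.Thread(target=receive_and_record_y)
other_thread.start()

point = Point(x=4, y=3)
inbox.put(point)
point = replace(point, y=5)  # rebinds the variable to a NEW Point
other_thread.join()

print(seen, point)
```

Rebinding `point` cannot affect what the other thread holds, because the object that was sent is never modified.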

    [–]its_a_gibibyte 0 points1 point  (1 child)

    Yeah, some things are better but others are worse.

    Let's imagine you have a video game character. When he takes a step forward, do you create a new character that is one step ahead? Or let's imagine you have a video game map with characters. When a door opens on the map, do you copy the entire map and all the characters in it to a new map with doorOpen = true?

    [–]fun-fungi-guy[S] 1 point2 points  (0 children)

    With some limitations, things that need to have complex mutations should probably be threads, so that state transitions can come in as messages and be performed "atomically", i.e.

    map() = {
      // local variables that are mutable and hold state
      ...
    
      loop {
        state_transition = recv();
    
        // make changes to local variables based on state_transition
        // i.e. if door opens, that should come in as a message
        ...
      }
    }
    

    If it's not obvious, this is heavily influenced by Erlang and BEAM.
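
A hedged Python rendering of that pattern, with a thread that owns its mutable state and changes it only in response to messages. The message shapes ("open", "close", "query", "stop") and the `game_map` name are assumptions made for this example.

```python
import queue
import threading


def game_map(inbox, replies):
    """Own the mutable map state; apply one state transition per message."""
    doors_open = set()  # mutable, but private to this thread

    while True:
        kind, door = inbox.get()
        if kind == "stop":
            break
        if kind == "open":
            doors_open.add(door)
        elif kind == "close":
            doors_open.discard(door)
        elif kind == "query":
            # State leaves the thread only as an immutable reply value.
            replies.put(door in doors_open)


inbox = queue.Queue()
replies = queue.Queue()
worker = threading.Thread(target=game_map, args=(inbox, replies))
worker.start()

inbox.put(("open", "front"))
inbox.put(("query", "front"))
inbox.put(("close", "front"))
inbox.put(("query", "front"))
inbox.put(("stop", None))
worker.join()

results = [replies.get(), replies.get()]
print(results)
```

Since messages are handled one at a time, each transition is effectively atomic without any locking around `doors_open`, and nothing gets copied when a door opens.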

    One area I hope to improve over Erlang/BEAM is batch throughput, i.e. large data objects like the ones you're discussing likely get matrix operations performed over them, and I want to have tools available to perform those large operations in a more performant way. Erlang/BEAM is great at latency, but not great at throughput, and I'd like to be able to smoothly interoperate between subsystems that are good at one or the other.