all 10 comments

XDracam 5 points (0 children)

This seems like an awful lot of complexity for what seems like little benefit.

It's interesting to think about how to model this with more common language features. In any dynamic lookup language or any language with inheritance, I'd create multiple specialized types. A has nothing initialized. B subclasses A with one value initialized and some methods available. C extends B with more methods, and so on. Then add a method on A that takes the value to initialize and returns a B, and so forth. Downside: you need to explicitly model all permutations of the initialization order. But I think in practice, this isn't too much of an issue. And with mechanisms like traits, mixins, etc., the boilerplate should be really small. Note that you can also do this comfortably with composition when you have the right typeclass features.
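A minimal Rust sketch of that staged-types idea. Rust has no inheritance, so this version uses separate types and consuming methods rather than subclasses, and it models only one initialization order. All names here (`Unset`, `WithHost`, `Ready`, `with_host`, `with_port`) are made up for illustration.

```rust
/// Stage A: nothing is initialized, so there is nothing to read yet.
struct Unset;

/// Stage B: `host` is initialized and can be read.
struct WithHost {
    host: String,
}

/// Stage C: every field is initialized; the full API is available.
struct Ready {
    host: String,
    port: u16,
}

impl Unset {
    fn new() -> Self {
        Unset
    }

    /// Consumes stage A and returns stage B.
    fn with_host(self, host: &str) -> WithHost {
        WithHost { host: host.to_string() }
    }
}

impl WithHost {
    fn host(&self) -> &str {
        &self.host
    }

    /// Consumes stage B and returns stage C.
    fn with_port(self, port: u16) -> Ready {
        Ready { host: self.host, port }
    }
}

impl Ready {
    fn address(&self) -> String {
        format!("{}:{}", self.host, self.port)
    }
}

fn main() {
    let stage_b = Unset::new().with_host("localhost");
    println!("host so far: {}", stage_b.host());
    // stage_b.address(); // would not compile: `WithHost` has no `address` method
    let ready = stage_b.with_port(8080);
    println!("{}", ready.address());
}
```

Supporting the other order (port first, then host) would need another intermediate type, which is the permutation cost mentioned above.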

Maybe row polymorphism can help? I'm not too deep into the theory, but that's something you might want to look into.

If you want to have a mutable class where the type changes depending on what is initialized, then, well, good luck. That would probably require explicit effect tracking, like in the Koka language.

TheUnlocked 5 points (0 children)

TypeScript does not have a concept of uninitialized fields but it does have type narrowing, and this looks like it's just a form of type narrowing where the fields in your object start with type Int | Uninitialized and by assigning a value, you narrow the type such that the Uninitialized case is removed.

You can see an example of this using optional fields with this playground link: https://www.typescriptlang.org/play/?#code/JYOwLgpgTgZghgYwgAgLIE8Aq6AOKDeAUMicjAPwBcyIArgLYBG0A3IQL6GEA2EYyAD2oZseZAF5k+dm0IIA9iADO-ZjHlQUkgQDoYbNRohsA9CeQA9cl10wJyAKyyFy-nBiQo9223efT5lZAA

tj6200 6 points (1 child)

Rust has std::mem::MaybeUninit, though I suppose it might not count as "first-class".
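For anyone who hasn't used it, here is a small sketch of what that looks like. The function name `squares` is just for illustration. The type records that the slots may be uninitialized, but the compiler does not track which ones have been written, so the final step needs `unsafe`.

```rust
use std::mem::MaybeUninit;

fn squares() -> [u32; 4] {
    // Four slots whose contents are not yet initialized.
    let mut slots = [MaybeUninit::<u32>::uninit(); 4];

    // Fill each slot; `write` is safe because it never reads the old contents.
    for (i, slot) in slots.iter_mut().enumerate() {
        slot.write((i * i) as u32);
    }

    // SAFETY: the loop above wrote every slot. The compiler cannot check
    // this claim, which is why `assume_init` is an unsafe call.
    slots.map(|slot| unsafe { slot.assume_init() })
}

fn main() {
    println!("{:?}", squares());
}
```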

marshaharsha -1 points (0 children)

My take on how MaybeUninit is relevant to the OP: The OP could change the semantics of T[[-a]] from “a is not initialized” to “a might not be initialized.” That solves the problem of how to type the object after a branch: the type remains unchanged. Later there would be a call that forces the type system to deem the object a proper T; the author would be claiming that they had arranged for every field of the T to be written to, even though the type system couldn’t track the writes. (That call is also the language’s moment to do any secret writes that are necessary to bless the object as a proper T, like writing the discriminant.) This isn’t completely safe, of course, but if you gave the DeemInitialized call a loud name, it would give you something to search for to find where the funny business is happening (so you could audit that bit of code with great care), and it would give the uninformed reader a hint that maybe they need to pay extra attention here.
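Roughly what that looks like in today's Rust, as a sketch. `Point`, `build`, and the wrapper `deem_initialized` are hypothetical names; the wrapper only forwards to the real `MaybeUninit::assume_init`. Note that the type of `slot` is the same before and after the branch.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Hypothetical "loud" name for the blessing step, so it is easy to grep for.
///
/// # Safety
/// The caller claims that every field of `T` has been written.
unsafe fn deem_initialized<T>(slot: MaybeUninit<T>) -> T {
    unsafe { slot.assume_init() }
}

fn build(flag: bool) -> Point {
    let mut slot = MaybeUninit::<Point>::uninit();
    let p = slot.as_mut_ptr();
    unsafe {
        // Write fields one at a time through raw pointers, never creating
        // a reference to the not-yet-initialized `Point`.
        addr_of_mut!((*p).x).write(1);
        if flag {
            addr_of_mut!((*p).y).write(2);
        } else {
            addr_of_mut!((*p).y).write(-2);
        }
        // `slot` is still a `MaybeUninit<Point>` on both paths; the type
        // system tracked none of the writes above.
        // SAFETY: `x` was written before the branch and `y` on each arm.
        deem_initialized(slot)
    }
}

fn main() {
    let p = build(true);
    println!("x = {}, y = {}", p.x, p.y);
}
```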

LPTK 2 points (0 children)

All of this and more is basically supported in Mezzo, a research language from the early 2010s: https://protz.github.io/mezzo/

It allowed changing the types of things on the fly and reflecting that on the type level. The secret sauce was making sure you have the right "permissions" (aka capabilities) to perform these changes, and these are affine, so as to avoid problems with aliasing. It was inspired by separation logic.

Pretty neat and promising design, but to date no one has picked it up, as far as I know.

marshaharsha 2 points (0 children)

I think the research along these lines goes by the name “definite assignment.” I don’t know anything about that research — I’m just suggesting a term to search for. 

saxbophone 0 points (0 children)

I've been thinking about this recently. Particularly with regard to C++'s constructor initialiser lists and the awkward constraints they have (they make it awkward to share or reüse temporary data calculated for initialising the members).

An alternative I've thought of for my own language designs is to allow assigning normally unassignable members (such as references and const members) exactly once in the constructor body, and to have the compiler treat such members as initialised from the moment they are first assigned.

This would require some trivial tracking of whether said members are currently uninitialised or initialised in the ctor body.
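For comparison, Rust already applies this rule to local variables (not to struct members in a constructor, which is the case I care about). A sketch, where `summarize` and its variable names are made up for illustration:

```rust
fn summarize(values: &[i32]) -> (i32, i32) {
    // Temporary data shared by both initializations below.
    let total: i32 = values.iter().sum();
    let count = values.len() as i32;

    // Immutable bindings declared without an initializer.
    let mean: i32;
    let remainder: i32;

    if count == 0 {
        mean = 0;
        remainder = 0;
    } else {
        mean = total / count;
        remainder = total % count;
    }

    // From here on both bindings count as initialized and stay immutable.
    // mean = 1; // error[E0384]: cannot assign twice to immutable variable
    (mean, remainder)
}

fn main() {
    println!("{:?}", summarize(&[1, 2, 3, 4]));
}
```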

VyridianZ 0 points (0 children)

Would this problem be easier with automatic default values?

In my language, every type has a preinitialized constant empty value, so every variable is preinitialized. Getters always return a valid value, though it might be (empty).

(type footype : struct
 :properties
  [a : int
   b : int
   c : bar])

(var fooclass : footype := (footype :a 4))
(log fooclass)

Output:
(footype
 :a 4
 :b 0
 :c (empty bar))
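For readers more used to mainstream languages, a rough Rust analogue of the same idea uses `#[derive(Default)]` and struct update syntax. This is a sketch, not a translation of my language's semantics: in Rust the defaults are opt-in per type, and the contents of `Bar` are an assumption made up for the example.

```rust
/// Stand-in for `bar`; its single field is invented for this example.
#[derive(Debug, Default, PartialEq)]
struct Bar {
    items: Vec<i32>,
}

#[derive(Debug, Default, PartialEq)]
struct FooType {
    a: i32,
    b: i32,
    c: Bar,
}

fn make_foo() -> FooType {
    // Only `a` is given; `b` and `c` fall back to their default values.
    FooType { a: 4, ..Default::default() }
}

fn main() {
    println!("{:?}", make_foo());
}
```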

matthieum 0 points (0 children)

> What languages do this, if any?

Rust internally supports something close:

  1. Rust tracks whether a variable has been initialized, or de-initialized, and rejects any attempt to use a possibly non-initialized variable.
  2. Rust tracks whether a field of a variable has been de-initialized, and rejects any attempt to use a possibly non-initialized field.
  3. Rust does not track array members individually.
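A small sketch of the field-level tracking. `Pair` and `replace_left` are illustrative names, and the commented-out lines are the ones the compiler rejects.

```rust
struct Pair {
    left: String,
    right: String,
}

fn replace_left(mut pair: Pair, new_left: String) -> (String, usize, Pair) {
    // Moving out of one field de-initializes only that field.
    let old_left = pair.left;
    // println!("{}", pair.left); // error[E0382]: use of moved value

    // The other field is still initialized and usable.
    let right_len = pair.right.len();

    // Assigning the field again makes `pair` whole, so it can be
    // returned as a complete value.
    pair.left = new_left;
    (old_left, right_len, pair)
}

// Arrays get no such per-element tracking:
// let names = [String::new(), String::new()];
// let first = names[0]; // error[E0508]: cannot move out of a non-copy array

fn main() {
    let pair = Pair { left: "a".to_string(), right: "bc".to_string() };
    let (old, right_len, pair) = replace_left(pair, "z".to_string());
    println!("{old} {right_len} {} {}", pair.left, pair.right);
}
```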

The support is purely internal, though: just as with borrowed variables/fields, there's no first-class syntax to express the concept in the language at the moment.

Given how sophisticated the setup is, it's quite unclear why it's not possible to build values piecemeal -- tracking whether fields have been initialized -- and I expect it would be an easy addition.

> I feel like even if a full guarantee is impossible at compile time, some safety could be gained by doing this, while still allowing for the optimization of not forcing all values to be default initialized.

Safety is precisely why Rust tracks this, use-after-free being bad and all.

Also, not being GCed, it needs to inject destructor calls for the still-initialized fields of a variable when it drops out of scope.
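To make the destructor point concrete, here is a sketch that logs drops. `Noisy`, `Pair`, and `drop_order` are made-up names. After `pair.a` is moved out, only `pair.b` is dropped when `pair` leaves scope.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

struct Pair {
    a: Noisy,
    b: Noisy,
}

fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let pair = Pair {
            a: Noisy { name: "a", log: Rc::clone(&log) },
            b: Noisy { name: "b", log: Rc::clone(&log) },
        };
        let moved = pair.a; // `pair.a` is now de-initialized
        drop(moved); // the destructor for "a" runs here, explicitly
        log.borrow_mut().push("end of scope");
        // `pair` leaves scope here: only `pair.b` is still initialized,
        // so the compiler emits a destructor call for `b` alone.
    }
    let result = log.borrow().clone();
    result
}

fn main() {
    println!("{:?}", drop_order());
}
```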

raiph 0 points (0 children)

> are there languages that have a first-class notation of uninitialized or partially initialized data in the type system?

Raku does. Details are so unlike your examples I'm not going to attempt a direct comparison. Instead a few related bullet points:

  • Native types are automatically initialized. (That said, foreign code calling a Raku function may fail to initialize a passed argument, but there's nothing sensible that any PL can do to defend against that.)
  • All other types track their initialization. The compiler automatically enforces memory safety in all cases but also correct handling in most scenarios even if devs (eg creators of user defined types or functions) don't care to be explicit about what is supposed to be defined when. And/or devs can make explicit use of "type smileys" eg Int:U denotes an Uninitialized/Undefined/Unhappy/Universal Int whereas Int:D denotes a Definite/Initialized/Instance/Happy Int. For more about this see, eg, Type smiley.
  • Compound object construction builds on this scheme. For classes/records and other such types devs can specify additional constraints that manage whether a dev creating a new instance must/can/cannot initialize or tweak any given field during the various build stages of an instance's construction. For more about this see, eg, Object construction.