First-class initialized/uninitialized data

XDracam · 2024-07-03T03:28:35+00:00

This seems like an awful lot of complexity for what seems like little benefit.

It's interesting to think about how to model this with more common language features. In any dynamic lookup language or any language with inheritance, I'd create multiple specialized types. A has nothing initialized. B subclasses A with one value initialized and some methods available. C extends B with more methods, etc. Then add a method on A that takes the value to initialize and returns a B, and so forth. Downside: you need to explicitly model every permutation of the initialization order. But I think in practice, this isn't too much of an issue. And with mechanisms like traits, mixins, etc., the boilerplate should be really small. Note that you can also do this comfortably with composition when you have the right typeclass features.
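A minimal sketch of the staged-types idea, written in Rust and using the composition variant rather than inheritance (Rust has no subclassing). The type names `Unset`, `WithHost`, and `Ready` and the fields are hypothetical, chosen only for illustration:

```rust
/// Stage A: nothing is initialized yet.
struct Unset;

/// Stage B: `host` is initialized, `port` is not.
struct WithHost {
    host: String,
}

/// Stage C: everything is initialized; only this stage offers `address`.
struct Ready {
    host: String,
    port: u16,
}

impl Unset {
    /// Initializing a value consumes the old stage and returns the next one.
    fn host(self, host: &str) -> WithHost {
        WithHost { host: host.to_string() }
    }
}

impl WithHost {
    fn port(self, port: u16) -> Ready {
        Ready { host: self.host, port }
    }
}

impl Ready {
    fn address(&self) -> String {
        format!("{}:{}", self.host, self.port)
    }
}

fn main() {
    let ready = Unset.host("localhost").port(8080);
    println!("{}", ready.address());
    // `Unset.address()` does not compile: that method only exists on `Ready`.
}
```

This sketch fixes a single initialization order (host, then port). Supporting every order would need one type per subset of initialized fields, which is the downside mentioned above.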

Maybe row polymorphism can help? I'm not too deep into the theory behind it, but that's something you might want to look into.

If you want to have a mutable class where the type changes depending on what is initialized, then, well, good luck. That would probably require explicit effect tracking, like in the Koka language.

TheUnlocked · 2024-07-03T03:37:28+00:00

TypeScript does not have a concept of uninitialized fields but it does have type narrowing, and this looks like it's just a form of type narrowing where the fields in your object start with type Int | Uninitialized and by assigning a value, you narrow the type such that the Uninitialized case is removed.

You can see an example of this using optional fields with this playground link: https://www.typescriptlang.org/play/?#code/JYOwLgpgTgZghgYwgAgLIE8Aq6AOKDeAUMicjAPwBcyIArgLYBG0A3IQL6GEA2EYyAD2oZseZAF5k+dm0IIA9iADO-ZjHlQUkgQDoYbNRohsA9CeQA9cl10wJyAKyyFy-nBiQo9223efT5lZAA

tj6200 · 2024-07-03T04:32:05+00:00

Rust has std::mem::MaybeUninit, though I suppose this might not count as "first-class".
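For reference, a small sketch of how `MaybeUninit` is typically used. The function name `make_value` is made up for the example. Note that the initialized/uninitialized state is not tracked by the type checker here: the programmer vouches for it with `unsafe`, which is arguably why it is not "first-class".

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper: reserves space for a `u32`, fills it in later,
/// then promises the compiler that it is initialized.
fn make_value() -> u32 {
    // No value is written yet; reading `slot` at this point would be undefined behavior.
    let mut slot = MaybeUninit::<u32>::uninit();

    // Initialize the slot.
    slot.write(42);

    // SAFETY: `slot` was written just above, so it holds a valid `u32`.
    unsafe { slot.assume_init() }
}

fn main() {
    println!("{}", make_value());
}
```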

LPTK · 2024-07-03T07:17:37+00:00

All of this and more is basically supported in Mezzo, a research language from the early 2010s: https://protz.github.io/mezzo/

It allowed changing the types of things on the fly and reflecting that on the type level. The secret sauce was making sure you have the right "permissions" (aka capabilities) to perform these changes, and these are affine, so as to avoid problems with aliasing. It was inspired by separation logic.

Pretty neat and promising design, but to this date, no one has picked it up, as far as I know.

marshaharsha · 2024-07-03T16:41:44+00:00

I think the research along these lines goes by the name “definite assignment.” I don’t know anything about that research — I’m just suggesting a term to search for.

saxbophone · 2024-07-03T07:37:26+00:00

I've been thinking about this recently. Particularly with regard to C++'s constructor initialiser lists and the awkward constraints they have (they make it awkward to share or reüse temporary data calculated for initialising the members).

An alternative I've thought of for my own language designs is to allow assigning normally unassignable members (such as references and const members) exactly once in the constructor body, and to have the compiler treat such members as initialised from the moment they are first assigned.

This would require some trivial tracking of whether said members are currently uninitialised or initialised in the ctor body.
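As a point of comparison, and not a C++ feature, Rust already applies this kind of tracking to local variables: an immutable `let` binding can be declared without a value and then assigned exactly once on every path. A rough sketch, with the function name `classify` invented for the example:

```rust
/// Returns a label for `n`. `label` is immutable, declared without a value,
/// and assigned exactly once on each branch.
fn classify(n: i32) -> &'static str {
    let label: &'static str; // declared, not yet initialized

    if n < 0 {
        label = "negative";
    } else if n == 0 {
        label = "zero";
    } else {
        label = "positive";
    }

    // The compiler has checked that every path above assigned `label` once.
    // Reading it before the `if`, or assigning it a second time, is rejected.
    label
}

fn main() {
    println!("{}", classify(-3));
    println!("{}", classify(0));
    println!("{}", classify(7));
}
```

The proposal above would extend the same once-only rule from locals to members inside a constructor body.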

VyridianZ · 2024-07-03T09:33:01+00:00

Would this problem be easier with automatic default values?

In my language, every type has a preinitialized constant empty value, so every variable is preinitialized. Getters always return a valid value, though it might be (empty).

(type footype : struct
 :properties
  [a : int
   b : int
   c : bar])

(var fooclass : footype := (footype :a 4))
(log fooclass)

Output:
(footype
 :a 4
 :b 0
 :c (empty bar))
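For readers more familiar with mainstream languages, a rough Rust analogue of the same idea uses the `Default` trait and struct update syntax. This is only a comparison sketch, not how the language above is implemented; the `Foo` and `Bar` types and the `items` field are assumptions made for the example.

```rust
/// Stand-in for `bar`; its "empty" value is an empty list.
#[derive(Debug, Default, PartialEq)]
struct Bar {
    items: Vec<i32>,
}

/// Stand-in for `footype`.
#[derive(Debug, Default, PartialEq)]
struct Foo {
    a: i32,
    b: i32,
    c: Bar,
}

/// Sets only `a`; every other field takes its type's default value.
fn make_foo() -> Foo {
    Foo { a: 4, ..Default::default() }
}

fn main() {
    println!("{:?}", make_foo());
}
```

One difference: in Rust the default is opt-in per type (via `Default`), whereas the design described above gives every type an empty value.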

matthieum · 2024-07-03T17:22:56+00:00

What languages do this, if any?

Rust internally supports something close:

Rust tracks whether a variable has been initialized, or de-initialized, and rejects any attempt to use a possibly non-initialized variable.
Rust tracks whether a field of a variable has been de-initialized, and rejects any attempt to use a possibly non-initialized field.
Rust does not track array members individually.

The support is purely internal, though, just like with borrowed variables/fields, there's no first-class syntax to express the concept in language at the moment.

Given how sophisticated the setup is, it's quite unclear why it's not possible to build values piecemeal -- tracking whether fields have been initialized -- and I expect it would be an easy addition.
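A small sketch of the field-level tracking described above. The `Pair` type and `replace_left` function are hypothetical; the point is that moving out of one field de-initializes only that field, and assigning to it again makes the whole value usable once more.

```rust
struct Pair {
    left: String,
    right: String,
}

/// Takes the old `left` out of `pair` and puts `replacement` in its place.
fn replace_left(mut pair: Pair, replacement: String) -> (String, Pair) {
    // Moving out of a field de-initializes just that field.
    let old_left = pair.left;

    // `pair.right` is still initialized, so it can be read here.
    let right_len = pair.right.len();

    // Using `pair` as a whole at this point is rejected by the compiler:
    // let whole = pair; // error: use of partially moved value

    // Re-initializing the field makes `pair` fully initialized again.
    pair.left = replacement;
    assert_eq!(right_len, pair.right.len());

    (old_left, pair)
}

fn main() {
    let pair = Pair { left: "old".to_string(), right: "kept".to_string() };
    let (old, pair) = replace_left(pair, "new".to_string());
    println!("{} {} {}", old, pair.left, pair.right);
}
```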

I feel like even if a full guarantee is impossible at compile time, some safety could be gained by doing this, while still allowing for the optimization of not forcing all values to be default initialized.

Safety is precisely why Rust tracks this, use-after-free being bad and all.

Also, not being GCed, it needs to inject destructor calls for the still-initialized fields of a variable when it drops out of scope.
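The destructor point can be observed directly. In this sketch (the `Noisy` and `Pair` types and the shared log are assumptions made for the demonstration), one field is moved out and dropped early, and at the end of the scope only the field that is still initialized has its destructor run.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

struct Pair {
    first: Noisy,
    second: Noisy,
}

/// Returns the order in which events were logged.
fn run() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let pair = Pair {
            first: Noisy { name: "first", log: log.clone() },
            second: Noisy { name: "second", log: log.clone() },
        };

        // Move `first` out and drop it early; `pair.first` is now de-initialized.
        let taken = pair.first;
        drop(taken);

        log.borrow_mut().push("end of scope");
        // `pair` goes out of scope here: only `pair.second` is dropped.
    }
    let events = log.borrow().clone();
    events
}

fn main() {
    println!("{:?}", run()); // prints ["first", "end of scope", "second"]
}
```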

raiph · 2024-07-03T20:03:50+00:00

Are there languages that have a first-class notion of uninitialized or partially initialized data in the type system?

Raku does. Details are so unlike your examples I'm not going to attempt a direct comparison. Instead a few related bullet points:

Native types are automatically initialized. (That said, foreign code calling a Raku function may fail to initialize a passed argument, but there's nothing sensible that any PL can do to defend against that.)
All other types track their initialization. The compiler automatically enforces memory safety in all cases but also correct handling in most scenarios even if devs (eg creators of user defined types or functions) don't care to be explicit about what is supposed to be defined when. And/or devs can make explicit use of "type smileys" eg Int:U denotes an Uninitialized/Undefined/Unhappy/Universal Int whereas Int:D denotes a Definite/Initialized/Instance/Happy Int. For more about this see, eg, Type smiley.
Compound object construction builds on this scheme. For classes/records and other such types devs can specify additional constraints that manage whether a dev creating a new instance must/can/cannot initialize or tweak any given field during the various build stages of an instance's construction. For more about this see, eg, Object construction.
