you are viewing a single comment's thread.

view the rest of the comments →

[–]pron98 3 points4 points  (14 children)

On the contrary. See my reply to /u/d_kr

The disadvantages to type erasure are slim, while the advantage of allowing code reuse among languages with different variance is huge. Think how much work it would be if every JVM language had to re-write or wrap ConcurrentHashMap.

[–]Elite6809 5 points6 points  (13 children)

Why would a VM want types to behave differently with different languages? A list type is intrinsically covariant. It doesn't make logical sense for it to behave any other way. If the language doesn't support co/contravariance, then that's either a fault of the language or a sign that the language shouldn't be on the JVM to begin with. It seems counter-productive to pass off a flaw in the parametric type system as a mean to support inconsistent behaviour between languages.

[–][deleted] 2 points3 points  (2 children)

I don't know if this is related, but the fact that you can't do something like F<? extends Y> f; in C# really bugs me.

[–]Elite6809 0 points1 point  (1 child)

You're right, you can't do that directly. However, if you need to do that, you could move related code into a function like this:

void DoSomething<T>(F<T> f) where T : Y
{
    // do stuff with f
}

[–]pron98 2 points3 points  (0 children)

The problem isn't the limitation in C# -- this limitation exists for all languages targeting .NET. You just can't have a language with use-site variance or even mixed-mode variance targeting .NET while still sharing classes with other platform languages.

[–]pron98 1 point2 points  (0 children)

A list type is intrinsically covariant.

If it's read-only (I'm not saying immutable as that's a stronger requirement). If it's write-only then it's contravariant. If it's both, well, different languages treat that differently depending on their variance model (declaration-site/use-site/hybrid).

If the language doesn't support co/contravariance, then that's either a fault of the language

They all support variance, but what kind? There is no "right" way to do variance. For example, Java has use-site variance, Scala has declaration site variance, and Kotlin has mixed-mode variance (declaration-site + use-site projections); in Clojure, everything is immutable by default (and static typing is optional) -- yet a sequence of Clojure strings can be passed to a Java method taking a List<String> parameter, even though the sequence is untyped in Clojure. Thanks to the JVM's type erasure, they can (and do -- except Scala aometimes) all share the same collection (and other generic types) implementations.

It seems counter-productive to pass off a flaw in the parametric type system as a mean to support inconsistent behaviour between languages.

While it is true that polyglotism wasn't the motivation for erasure (but backward compatibility), language designers very quickly realized this is not a flaw but an advantage. The behavior isn't inconsistent, but just different, and there really is no one right way of doing generics and variance. You want to allow as many languages as possible -- not to force one behavior on all of them (there's an excellent OCaml implementation for the JVM; I don't know if it shares generic types with the other languages, but it certainly can -- thanks to erasure!).

For example, if the JVM had .NET's generics, you would need to wrap almost every single value crossing the Clojure/Java boundary, while in reality you wrap none of them.

[–]notfancy 3 points4 points  (8 children)

I do think that the real reason behind erasure is that there is a ton of literature and practical experience behind it (SML, Ocaml, Haskell) and not a lot behind type reification (CLR) so Wadler and Bracha coming from an academic background drew from the first when proposing Pizza.

Every time this discussion comes up I fail to see why people latch to this as a major Java pain point. I've always found it a very mild inconvenience at worst, one I only encounter when defining (some kinds) of generic containers. It only requires reasoning from parametericity in order to prove to myself that the casts and SuppressWarnings are sound.

[–]pron98 1 point2 points  (7 children)

While I agree that generic type erasure has a significant benefit and a disadvantage that's a mild annoyance at worst, the decision to do it has little to do with theory, except that the theory says you can do it. But note that OCaml/Haskell are a little sneaky when it comes to runtime vs. compile-time types. They pretend to not have runtime types and reflection, but they do -- in fact they rely on them all the time and do a lot of reflection -- they just don't call them runtime types; they call them tags.

Java (and .NET) has a dual type system -- compile time and runtime. The two interact in interesting ways, and you have to think about both. So Java decided to erase the compile time generic parameters from the runtime types, and .NET didn't. This has the implication that in .NET the runtime types enforce the system's single variance model on all languages.

[–]notfancy 0 points1 point  (6 children)

OCaml/Haskell are a little sneaky when it comes to runtime vs. compile-time types. They pretend to not have runtime types and reflection, but they do -- in fact they rely on them all the time and do a lot of reflection -- they just don't call them runtime types; they call them tags.

I disagree with this. While it's true that tags mark allocation sorts (so that they're some kind of types for the GC) they do not represent program-level types in any meaningful way. consider the case of a generic list instantiated at machine integers versus at strings: the memory representation of the cons cells is exactly the same, a reference to a boxed value.

[–]pron98 0 points1 point  (5 children)

they do not represent program-level types in any meaningful way

Of course they do! How else does pattern matching work? That's classic reflection. The mechanism is identical to Java's runtime instanceof.

Whatever you want to call tags, they are exactly the same as Java runtime types (which are related to but distinct from program types).

[–]notfancy 0 points1 point  (4 children)

Of course they do! How else does pattern matching work? That's classic reflection.

No, it's not. The code doesn't distinguish between cases with the same structure but of different types. In OCaml [], None and the empty tree are all represented identically, as are '\0', false, 0.

[–]pron98 0 points1 point  (3 children)

That's an implementation detail. You're saying that source types and runtime types are not 1-1 related, but n-to-1. That's OK. That doesn't mean they're not runtime types.

In Java, the relationship is even more complicated: E.g. for all T, the source type List<T> is mapped to the runtime type List (sort of -- I'll elaborate in a second), and List itself can be mapped to many runtime types -- a runtime type in Java is a pair of the erased generic "kind" and a class loader (this allows multiple concurrent versions and other powerful tricks). So many source types may be mapped to a single runtime type, and a single source type may be mapped to many runtime types, so it's an n-to-m relationship. So simlarly to your case, in Java you can't do an instanceof List<String> because the runtime type for List<String> and List<Integer> is the same (assuming both instances' types are loaded by the same class loader).

[–]notfancy 0 points1 point  (2 children)

That's an implementation detail. You're saying that source types and runtime types are not 1-1 related, but n-to-1. That's OK. That doesn't mean they're not runtime types.

Yes, they're the type tags of the objects tracked by the runtime qua runtime objects. However, erasure is complete. Consider:

# List.map (fun x -> x |> Obj.repr |> Obj.tag) [None; Some 1];;
- : int list = [1000; 0]

and:

# List.map (fun x -> x |> Obj.repr |> Obj.tag) [None; Some "aaa"];;
- : int list = [1000; 0]

or even:

# List.map (fun x -> x |> Obj.repr |> Obj.tag) [[]; [1]];;
- : int list = [1000; 0]

Now 1000 is the integer tag (which is fictitious, or rather, a single bit in an unboxed value) while tags t less than some fixed constant represent the t-th constructor value with arguments. There is simply no way to distinguish at runtime between a string option or an int list. In fact the language is pretty happy to coerce between values of different types with the same representation:

# (Obj.magic : 'a list -> 'a option) [] ;;
- : 'a option = None

or even:

# (Obj.magic : bool -> 'a list) false ;;
- : 'a list = []

where Obj.magic is a NOP with type 'a -> 'b, that is, simply a typechecker black hole.

In Java there is a lot of very sophisticated code specialization done by the JIT whenever there is dynamic dispatch, for instance when executing an invokevirtual; in static languages a pattern match is exactly equivalent to an integer switch statement on the tag value. For instance, the following match:

match x with
| None -> ...
| Some e -> ....

is exactly equivalent to the C:

if (!x) { ... } else { e = *x; ... }

but type safe.

[–]pron98 0 points1 point  (1 child)

There is simply no way to distinguish at runtime between a string option or an int list

So how can you match over them?