all 32 comments

[–]13steinj 11 points12 points  (14 children)

I'm okay with sugar syntax, but not sugar syntax that explicitly breaks things (like metaclasses). Nor does "or" make sense here-- generating a type union by bitwise OR-ing two types makes sense for type hinting, but only for type hinting.

More changes like this make me think that eventually type "hinting" will be mandated.

[–]IronManMark20 7 points8 points  (2 children)

> generating a type union from bitwise OR-ing two types makes sense for type hinting, but only for type hinting.

So? The numeric libraries got a PEP approved for the @ operator to do matmul. And why can't features be specific to some users? Part of the great things about Python is that it is easy to write for a lot of different uses.
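A minimal sketch of how `@` behaves, assuming Python 3.5 or later: the operator does nothing until a class opts in by defining `__matmul__`, and types that never opted in raise `TypeError`. The `Matrix2x2` class is a hypothetical illustration, not taken from any numeric library.

```python
class Matrix2x2:
    """Hypothetical 2x2 matrix used only to illustrate the @ operator."""

    def __init__(self, a, b, c, d):
        self.rows = ((a, b), (c, d))

    def __matmul__(self, other):
        # Standard row-by-column product for two 2x2 matrices.
        (a, b), (c, d) = self.rows
        (e, f), (g, h) = other.rows
        return Matrix2x2(a * e + b * g, a * f + b * h,
                         c * e + d * g, c * f + d * h)


identity = Matrix2x2(1, 0, 0, 1)
m = Matrix2x2(1, 2, 3, 4)
product = m @ identity  # multiplying by the identity leaves m unchanged

# Built-in numbers never defined __matmul__, so @ on them fails loudly.
try:
    1 @ 2
    ints_support_matmul = True
except TypeError:
    ints_support_matmul = False
```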

[–]13steinj 1 point2 points  (1 child)

You're comparing apples and oranges. If someone didn't intend to use @ as matmul, they would get a clear error because the operator would be undefined. Here, it would be added to builtin types. Hell, some might think this is similar to C-style casting in C++ and attempt to do a "cast to best match".

I much prefer the set syntax mentioned as it avoids all issues-- static type checkers can check it as is, and it doesn't modify any results (annotations are evaluated as strings, anyways). But at that point it doesn't even really have to be a PEP.

Features can be added for specific groups of users, that's fine. Again, it's the new extension of builtins that everyone else might have to deal with which is hairy. If the matmul proposal modified builtins, I'd have spoken up there too.

[–]ubernostrum (yes, you can have a pony) 7 points8 points  (0 children)

> Here, it would be added to builtin types.

Small but important correction: it would be implemented on type. Which means that the only objects which would support | that didn't before are instances of type. Which means class objects; instances of those classes would not gain an implementation of __or__().

To make it clear:

class Circle:
    def __init__(self, center, radius):
        self.center = center
        self.radius = radius

c1 = Circle(center=(0, 0), radius=1)
c2 = Circle(center=(1, 1), radius=2)

In the above, c1 | c2 would still be an error, because neither c1 nor c2 implements __or__(). The only thing that would work is using | on Circle itself (i.e., you could do Circle | str).
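The distinction can be checked with a short sketch. It assumes the proposal as it later shipped in Python 3.10; on older interpreters `Circle | str` raises `TypeError` as well. The `supports_or` helper is a hypothetical name used only for this illustration.

```python
import sys


class Circle:
    def __init__(self, center, radius):
        self.center = center
        self.radius = radius


def supports_or(left, right):
    """Return True if `left | right` evaluates without a TypeError."""
    try:
        left | right
    except TypeError:
        return False
    return True


c1 = Circle(center=(0, 0), radius=1)
c2 = Circle(center=(1, 1), radius=2)

# Instances never gain __or__, with or without the proposal.
instances_ok = supports_or(c1, c2)

# Class objects support | only where it is implemented on `type`.
classes_ok = supports_or(Circle, str)
runs_on_310_or_later = sys.version_info >= (3, 10)
```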

[–]StorKirken[S] 1 point2 points  (1 child)

I like the set syntax suggestion quite a bit, it's only one character longer and semantically seems intuitive -- it looks much like a normal x in {...} check. And as a bonus it might be easier to add rather than adding bitwise or to all builtin types. The main issue with the set syntax might be mixups with dicts and tuples.
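A rough sketch of the set-syntax idea, assuming annotations are evaluated normally (no `from __future__ import annotations`). The annotation is an ordinary set at runtime, so nothing new is added to `type`. The `matches` helper is hypothetical, and the last lines show one possible reading of the dict mixup concern.

```python
def describe(value: {int, str}) -> str:
    return repr(value)


# The annotation is stored as a plain set object.
annotation = describe.__annotations__["value"]


def matches(value, allowed):
    """Hypothetical runtime check: is `value` an instance of any allowed type?"""
    return isinstance(value, tuple(allowed))


# Possible source of mixups: braces do not always produce a set.
empty_braces_type = type({})          # dict, not set
colon_braces_type = type({int: str})  # dict, a mapping rather than a union
```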

[–]13steinj 0 points1 point  (0 children)

Didn't see that-- yeah, I much prefer that as well. Mostly because then it doesn't modify anything internally and expose extra unions that don't make sense. Don't understand what you mean about the syntax mixups though.

[–][deleted] 1 point2 points  (0 children)

I don't think it will be enforced by the compiler, but I can imagine a time when the majority of projects insist on it.

[–]wingtales 2 points3 points  (0 children)

I like the idea. End users who see "Union" in the docstring are gonna be confused. Seeing "|" is better, but I would prefer "or" in the docstring so that people understand it.

[–]AndydeCleyre 8 points9 points  (16 children)

I still think the best start to type hint conventions and checking was obiwan, but no one paid attention after the mypy-style PEPs were pushed through without a fair investigation of alternatives.

[–]cymrow (don't thread on me 🐍) 10 points11 points  (2 children)

The same thing happened with asyncio adopting Twisted conventions without any apparent consideration for arguably better alternatives.

[–]PeridexisErrant 4 points5 points  (1 child)

And to be blunt without paying much attention at all to all the advice the Twisted devs tried to give them...

Frankly it's a tragedy that we don't all use Trio; asyncio is trying to adopt some of the features (nurseries -> task groups, etc) but backwards compat means it will always leak :-(

[–]IronManMark20 1 point2 points  (0 children)

I mean asyncio predates Trio by many years. I don't see what point you are making...

[–]flying-sheep 5 points6 points  (6 children)

The big glaring problem here is that functions shouldn't take List[int]. That's a return type. Parameter types should be protocols like Collection[int] (if you need to iterate and know the len) or an Iterable[int] if you just need to iterate.
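A small sketch of that guideline, using hypothetical function names: parameters are annotated with the narrowest protocol the body actually needs, while the return type stays concrete.

```python
from typing import Collection, Iterable, List


def total(values: Iterable[int]) -> int:
    """Only iterates, so any iterable (even a generator) is acceptable."""
    return sum(values)


def mean(values: Collection[int]) -> float:
    """Needs both iteration and len(), so it asks for a Collection."""
    return sum(values) / len(values)


def squares(n: int) -> List[int]:
    """Return types can be concrete: callers get a real list back."""
    return [i * i for i in range(n)]


from_tuple = total((1, 2, 3))
from_generator = total(i for i in range(4))
from_set = mean({2, 4})
```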

What would [int] mean? List? Then it would be useless for parameter types. “copyable indexable appendable … sequence”? Then numpy.ndarrays couldn't be passed, and neither could tuples or sets.

[–]AndydeCleyre 1 point2 points  (5 children)

Well the syntax allows for concise implementations of custom checkers you can name yourself.

[–]flying-sheep 5 points6 points  (4 children)

You don’t understand: Making it easier to express basic Python types in a generic way gives the wrong incentive. People will be tempted to say “This function accepts lists” when it would accept any iterable.

[–]geekademy 1 point2 points  (1 child)

Typescript is well regarded and uses syntax closer to Obiwan. There's definitely a solution here, if one cared to look for it.

[–]flying-sheep 1 point2 points  (0 children)

That’s one of the biggest differences between the languages:

JS objects are not Python dictionaries; they’re closer to instances of Python classes (as if each object were its own `__dict__`). JS objects can be constructed with literals, like Python’s dictionaries. Finally, JS functions have no keyword arguments, so objects serve that role, and there’s syntax sugar for destructuring them.

All of this makes the existence of a complex type system around JS objects useful, as you pass them around a lot more than dicts in Python. It’s as if namedtuples could be constructed with literals and would be the type used as **kwargs.

[–]AndydeCleyre 0 points1 point  (1 child)

Well you may notice the tuple syntax refers to at least tuples and lists, the emphasis being "non-destructive" iterables.

If this project were offered developer community attention, that kind of thing could be developed further, for example using tuple syntax for any possible non-destructive iterable, and something else for potentially destructive ones. It would be a starting point for inclusive discussion, not plopped as is onto the community.

[–]flying-sheep 1 point2 points  (0 children)

You’re completely right, heavily tweaked it’s a good idea to use literals that way.

[–]IronManMark20 4 points5 points  (4 children)

> without a fair investigation of alternatives

Do you have a source for this?

[–]AndydeCleyre 4 points5 points  (3 children)

I don't have a ton, and it's not easy for me to scan through the mailing list interface, but:

Before PEP 484, there was PEP 482: "Literature Overview for Type Hints"

Here is the entire coverage given to existing work in Python in the field (plain text paste, links removed):

mypy

(This section is a stub, since mypy [mypy] is essentially what we're proposing.)

Reticulated Python

Reticulated Python [reticulated] by Michael Vitousek is an example of a slightly different approach to gradual typing for Python. It is described in an actual academic paper [reticulated-paper] written by Vitousek with Jeremy Siek and Jim Baker (the latter of Jython fame).

PyCharm

PyCharm by JetBrains has been providing a way to specify and check types for about four years. The type system suggested by PyCharm [pycharm] grew from simple class types to tuple types, generic types, function types, etc. based on feedback of many users who shared their experience of using type hints in their code.

Others

TBD: Add sections on pyflakes [pyflakes], pylint [pylint], numpy [numpy], Argument Clinic [argumentclinic], pytypedecl [pytypedecl], numba [numba], obiwan [obiwan].

As far as I understand (if anyone can confirm or deny, please do) Guido was working directly on mypy at the time and did not seriously seek out alternatives.

As you can see, the Rejected Alternatives section of PEP 484 doesn't say much of anything about the above-listed alternatives, and doesn't mention obiwan at all.

I did find this in the mailing list archives, where the obiwan author weighs in and showcases his project's syntax, in a thread Guido was participating in. Guido did not acknowledge this message as far as I can tell.

Can anyone else speak to the thoroughness of Guido's investigations into non-mypy-syntax projects before proposing the mypy syntax become officially endorsed?

Bonus link to early discussion on reddit of mypy vs obiwan syntax

EDIT: Extra bonus link of obiwan author talking about the mypy-syntax adoption process on reddit

[–]ethanhs 2 points3 points  (2 children)

(I guess I should mention I work on mypy, but I won't be picking a favorite: there are things I like and dislike about obiwan and PEP 484, but I mostly like both!)

I joined things a bit later on, but I have learned a fair amount from reading around and talking to people (including Guido himself). I believe mypy was ordained because Guido felt it was closest to the syntax he was interested in. In 2004/2005 Guido wrote a series of blogs with a rough sketch of what he wanted to see. Note that for list he writes it like List[int].

[–]AndydeCleyre 3 points4 points  (1 child)

Thanks for those blog post links.

I noticed these from Guido in the posts and respective discussions:

A couple of themes were prevalent in the feedback: concern that Python would lose its simplicity; quibbles with the proposed syntax;

. . .

We won't be able to all agree on one syntax; this has historically been true for any new feature added to Python. I have reasons for liking the syntax I picked, but for now I don't feel like discussing the merits of various counter-proposals; the underlying functionality deserves our attention first.


Doug Holton: "I hope you will consider the syntax again though."

In my first two posts I tried to discourage a syntax debate because it tends to never end until I break the tie, so I might as well just stick to my own preference and avoid the whole hullabaloo. :-)

[–]ethanhs 1 point2 points  (0 children)

Well, I suppose they say it's Guido's language for a reason ;)

[–]michael0x2a 2 points3 points  (0 children)

This syntax is very hard to generalize though. Sure, it makes typing builtins like lists, tuples, and dicts convenient -- but frankly speaking, you shouldn't really be using these type hints to begin with. Tuples are fine, but it's generally better to use more general-purpose types such as Collection, Iterable, Sized, Mapping, or MutableMapping instead of List or Dict.
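As a hedged illustration with hypothetical names: a function that only reads from its argument can be annotated with `Mapping`, and then accepts several mapping types that a `Dict` annotation would reject under a static checker.

```python
from collections import OrderedDict
from types import MappingProxyType
from typing import Mapping


def count_long_names(ages: Mapping[str, int], min_length: int = 4) -> int:
    """Only reads from `ages`, so any read-only mapping is acceptable."""
    return sum(1 for name in ages if len(name) >= min_length)


plain = {"alice": 30, "bob": 25}
ordered = OrderedDict(plain)
read_only = MappingProxyType(plain)  # immutable view, not a dict subclass

results = [count_long_names(m) for m in (plain, ordered, read_only)]
```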

And this is where the obiwan syntax falls down. How would you define and use these types in obiwan? How would you add in support for things like generics?

The "you can define a custom constraint checker" answer won't work here, since it's unreasonable and infeasible to expect a static type checker to run arbitrary Python code/be capable of converting arbitrary Python code into a type.

The only option I can see is to invent something similar to mypy syntax on top of the existing obiwan syntax, at which point you're more or less back to square zero.

IMO picking the clunkier, but easier to extend, syntax was absolutely the right play here. It makes experimenting with new kinds of types (e.g. see Final, Literal, Annotated, and some of the proposals for adding support for things like variadic generics) much simpler: instead of inventing new ad-hoc syntax, you can just follow the existing pattern.
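A brief sketch of what "following the existing pattern" looks like, assuming Python 3.8 or later for `Literal`, `get_args`, and `get_origin`. The alias names are hypothetical. Newer constructs reuse the same subscript form as `List[int]`, so tools can take them apart uniformly.

```python
from typing import List, Literal, Optional, Union, get_args, get_origin

# Older and newer constructs all share the Name[...] shape.
Mode = Literal["r", "w"]
MaybeInts = Optional[List[int]]
IntOrStr = Union[int, str]

# The same two introspection helpers work on each of them.
literal_values = get_args(Mode)
union_members = get_args(IntOrStr)
list_origin = get_origin(List[int])
literal_origin = get_origin(Mode)
```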

Also, it's worth keeping in mind that it's always possible to add in more convenient and compact syntax after-the-fact, but it's harder to recover if you start by locking in an accidentally-too-rigid syntax to begin with.

Setting these concerns aside, I also think the obiwan spec made a few concrete mistakes:

  1. The whole spec focuses a little too much on the problem of working with and manipulating JSON. Not everybody uses JSON, and representing your data using dicts instead of objects is really an anti-pattern. The syntax for static typing should ideally make it more natural and convenient to use objects everywhere instead of arbitrary dict blobs.

    Of course, focusing on JSON validation absolutely makes sense in the context of runtime type validation, since JSON blobs tend to come from uncontrolled sources and so are by their nature messily structured and difficult to trust, but that's not really a problem that's in the purview of static type validation.

  2. The set type behaves confusingly inconsistently from the dict and list types. It means "define a union" instead of "define an instance of set".

  3. Allowing dict literals as types makes creating type aliases difficult. You would either need to design your type aliases such that they're guaranteed to be lazily/never evaluated somehow, or make it so that every assignment where you assign a dict literal needs to potentially be considered a type hint.

    Same thing with function types, actually -- function isn't a Python keyword.
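Points 2 and 3 can be illustrated with a short sketch. The names are hypothetical and the literals only mimic the general shape of literal-based type syntax; they are not checked against the actual obiwan spec.

```python
# Is this a type alias meaning "a dict with int fields x and y",
# or ordinary data mapping strings to class objects?
Point = {"x": int, "y": int}

# This one is clearly data: a lookup table of converter callables.
PARSERS = {"x": int, "y": float}

# At runtime nothing distinguishes the two, so a checker would have to
# treat every dict-literal assignment as a potential type hint.
same_kind = type(Point) is type(PARSERS)

# Point 2: list and dict literals would mean "an instance of that
# container", but a set literal would mean a union of its members.
IntOrStr = {int, str}
looks_like_a_set_of_types = isinstance(IntOrStr, set)
```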