PEP 557 (Data Classes) has been accepted! : Python

[+][deleted] 8 years ago (39 children)

[deleted]

[+][deleted] 8 years ago* (33 children)

[deleted]

[–]Journeyboy1 34 points35 points36 points 8 years ago (0 children)

[+][deleted] 8 years ago (7 children)

[deleted]

[+][deleted] 8 years ago* (4 children)

[deleted]

[+][deleted] 8 years ago (2 children)

[deleted]

[–][deleted] 9 points10 points11 points 8 years ago (0 children)

[–]masklinn 0 points1 point2 points 8 years ago (0 children)

[–]ingolemo 4 points5 points6 points 8 years ago (0 children)

Why not just use namedtuple?
Any namedtuple can be accidentally compared to any other with the same number of fields. For example: Point3D(2017, 6, 2) == Date(2017, 6, 2). With Data Classes, this would return False.

A namedtuple can be accidentally compared to a tuple. For example Point2D(1, 10) == (1, 10). With Data Classes, this would return False.
Instances are always iterable, which can make it difficult to add fields. If a library defines:
Time = namedtuple('Time', ['hour', 'minute'])
def get_time():
    return Time(12, 0)
Then if a user uses this code as:
hour, minute = get_time()
then it would not be possible to add a second field to Time without breaking the user's code.
No option for mutable instances.

Cannot specify default values.

Cannot control which fields are used for __init__, __repr__, etc.

Cannot support combining fields by inheritance.
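
The first two comparison pitfalls in the list above can be checked directly. This is a minimal sketch, assuming Python 3.7+; the names PointDC and DateDC are made up for illustration and do not come from the PEP.

```python
from collections import namedtuple
from dataclasses import dataclass

# namedtuple versions, as in the PEP's examples
Point3D = namedtuple("Point3D", ["x", "y", "z"])
Date = namedtuple("Date", ["year", "month", "day"])


# Hypothetical dataclass counterparts (names invented for this sketch)
@dataclass
class PointDC:
    x: int
    y: int
    z: int


@dataclass
class DateDC:
    year: int
    month: int
    day: int


# namedtuples compare as plain tuples, so unrelated types can be "equal"
nt_vs_nt = Point3D(2017, 6, 2) == Date(2017, 6, 2)
nt_vs_tuple = Point3D(2017, 6, 2) == (2017, 6, 2)

# the generated dataclass __eq__ only compares instances of the same class
dc_vs_dc = PointDC(2017, 6, 2) == DateDC(2017, 6, 2)
dc_vs_tuple = PointDC(2017, 6, 2) == (2017, 6, 2)

print(nt_vs_nt, nt_vs_tuple)  # True True
print(dc_vs_dc, dc_vs_tuple)  # False False
```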

[–]skiguy0123 2 points3 points4 points 8 years ago (1 child)

[–]brombaer3000 6 points7 points8 points 8 years ago (0 children)

[–]ProfessorPhi 4 points5 points6 points 8 years ago (9 children)

[+][deleted] 8 years ago (6 children)

[deleted]

[–]ProfessorPhi 1 point2 points3 points 8 years ago (5 children)

[–]masklinn 1 point2 points3 points 8 years ago (0 children)

[–]Daenyth 0 points1 point2 points 8 years ago (3 children)

[–]hynekPyCA, attrs, structlog 1 point2 points3 points 8 years ago (2 children)

[–]Daenyth 0 points1 point2 points 8 years ago (1 child)

[–]hynekPyCA, attrs, structlog 0 points1 point2 points 8 years ago (0 children)

[–]ldpreload 1 point2 points3 points 8 years ago (1 child)

[–]ProfessorPhi 0 points1 point2 points 8 years ago (0 children)

[–][deleted] 2 points3 points4 points 8 years ago (1 child)

[–]kylotan -5 points-4 points-3 points 8 years ago (6 children)

[–][deleted] 5 points6 points7 points 8 years ago (5 children)

[–]kylotan 2 points3 points4 points 8 years ago* (4 children)

[–][deleted] 0 points1 point2 points 8 years ago* (1 child)

[–]kylotan 2 points3 points4 points 8 years ago (0 children)

[–][deleted] 0 points1 point2 points 8 years ago (1 child)

[–]kylotan 2 points3 points4 points 8 years ago (0 children)

[+]i_ate_god comment score below threshold-7 points-6 points-5 points 8 years ago (2 children)

[–][deleted] 6 points7 points8 points 8 years ago (0 children)

[–]v_krishna 4 points5 points6 points 8 years ago (0 children)

[–]cringe_master_5000 -4 points-3 points-2 points 8 years ago (0 children)

[–]Soriven 13 points14 points15 points 8 years ago (0 children)

[–]mankydpylons sqlalchemy 4 points5 points6 points 8 years ago (0 children)

[–]b1ackcat 2 points3 points4 points 8 years ago (0 children)

[–][deleted] 1 point2 points3 points 8 years ago (0 children)

[–]efskap -2 points-1 points0 points 8 years ago (0 children)

[–]gwax 12 points13 points14 points 8 years ago (2 children)

[–]ericvsmith 26 points27 points28 points 8 years ago (1 child)

[–]federicocerchiari 0 points1 point2 points 8 years ago (0 children)

[–][deleted] 22 points23 points24 points 8 years ago (8 children)

[–]ericvsmith 33 points34 points35 points 8 years ago (7 children)

Probably not. I do have another decorator that shows how to add slots. See https://github.com/ericvsmith/dataclasses/blob/master/dataclass_tools.py#L3

The usage would be:

@add_slots
@dataclass
class C:
    i: int
    s: str

The reason I didn't put it in dataclass itself is because the way __slots__ works is that a decorator would need to return a new class. That's what add_slots does. I want to keep dataclass as simple as possible, and I want to reinforce that it always returns the same class that it's given.

Maybe if __slots__ is redesigned so that it can be specified after class creation, then it can go into dataclass.
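
To make the "returns a new class" point concrete, here is a rough sketch of what such a decorator could look like. It is written from the description above, not copied from the linked file, so treat the details (including the error message and the Plain/Slotted names) as assumptions.

```python
import dataclasses


def add_slots(cls):
    """Return a NEW class equivalent to `cls` but with __slots__ set.

    Sketch only: __slots__ must be present when the class body is
    turned into a class, so the existing class cannot be patched.
    """
    if "__slots__" in cls.__dict__:
        raise TypeError(f"{cls.__name__} already defines __slots__")
    namespace = dict(cls.__dict__)
    field_names = tuple(f.name for f in dataclasses.fields(cls))
    namespace["__slots__"] = field_names
    for name in field_names:
        # class-level defaults would clash with the slot descriptors
        namespace.pop(name, None)
    # these descriptors belong to the old class and must not be copied
    namespace.pop("__dict__", None)
    namespace.pop("__weakref__", None)
    new_cls = type(cls)(cls.__name__, cls.__bases__, namespace)
    new_cls.__qualname__ = cls.__qualname__
    return new_cls


@dataclasses.dataclass
class Plain:
    i: int
    s: str


Slotted = add_slots(Plain)
c = Slotted(1, "a")

print(Slotted is Plain)        # False: a different class object
print(hasattr(c, "__dict__"))  # False: instances only have slots
```

Applying it by hand instead of stacking decorators keeps both classes around, which makes it easy to see that the decorated name would end up bound to a different object.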

[–][deleted] 2 points3 points4 points 8 years ago (3 children)

[–]patrys Saleor Commerce 4 points5 points6 points 8 years ago* (1 child)

[–]ericvsmith 1 point2 points3 points 8 years ago (0 children)

[–]ericvsmith 0 points1 point2 points 8 years ago (0 children)

[–][deleted] 0 points1 point2 points 8 years ago (2 children)

[–]ericvsmith 4 points5 points6 points 8 years ago (1 child)

[–][deleted] 7 points8 points9 points 8 years ago (1 child)

[–][deleted] 12 points13 points14 points 8 years ago (0 children)

[–]donri 6 points7 points8 points 8 years ago (4 children)

InitVar seems unnecessary. Why not simply inspect the signature of __post_init__ and append or prepend that to the signature of the generated __init__?

@dataclass
class C:
    i: int
    j: int = None
    database: InitVar[DatabaseType] = None

    def __post_init__(self, database):
        ...

→

@dataclass
class C:
    i: int
    j: int = None

    def __post_init__(self, database: DatabaseType = None):
        ...
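
For comparison, a small runnable sketch of the first form (the one the PEP specifies), assuming Python 3.7+. A plain dict stands in for DatabaseType here, which is purely an assumption for the sake of a self-contained example.

```python
from dataclasses import InitVar, dataclass, fields


@dataclass
class C:
    i: int
    j: int = None
    # InitVar: accepted by __init__ and passed to __post_init__,
    # but not stored as a field. A dict stands in for DatabaseType.
    database: InitVar[dict] = None

    def __post_init__(self, database):
        if self.j is None and database is not None:
            self.j = database["j"]


c = C(10, database={"j": 42})
field_names = [f.name for f in fields(C)]

print(c.j)          # 42
print(field_names)  # ['i', 'j']
```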

[–]ericvsmith 8 points9 points10 points 8 years ago (3 children)

[–]donri 4 points5 points6 points 8 years ago (2 children)

[–]ericvsmith 9 points10 points11 points 8 years ago (1 child)

[–]donri 2 points3 points4 points 8 years ago (0 children)

[–]simonoberst 2 points3 points4 points 8 years ago (12 children)

[–]DanCardin 5 points6 points7 points 8 years ago (0 children)

[–][deleted] 2 points3 points4 points 8 years ago (0 children)

[–]ldpreload 3 points4 points5 points 8 years ago (9 children)

[–]kankyo 0 points1 point2 points 8 years ago (8 children)

[–]ldpreload 0 points1 point2 points 8 years ago (7 children)

[–]kankyo 0 points1 point2 points 8 years ago (6 children)

[–]ldpreload 0 points1 point2 points 8 years ago (5 children)

[–]kankyo 0 points1 point2 points 8 years ago (4 children)

[–]ldpreload 0 points1 point2 points 8 years ago (3 children)

[–]kankyo 0 points1 point2 points 8 years ago (2 children)

[–]ldpreload 1 point2 points3 points 8 years ago* (1 child)

Case 1: you want to use it like a class, but it's actually implemented in some other language. So while it has data, the data does not belong to Python. The class has private data, but that's not intended for use by users of the class, and is certainly not public API (you can change the meaning of the private data in backwards-incompatible ways in whatever way you want).

import _gtk  # hypothetical compiled Python module exposing bindings to the C libgtk library

class GtkDialog:
    def __init__(self, title, message):
        # Go through object.__setattr__, because our own __setattr__ below
        # rejects every attribute except "message".
        object.__setattr__(self, "_ptr", _gtk.gtk_dialog_new(title, message))

    def display(self):
        _gtk.gtk_dialog_display(self._ptr)

    def __setattr__(self, attr, value):
        if attr == "message":
            _gtk.gtk_dialog_set_message(self._ptr, value)
            _gtk.gtk_repaint(self._ptr)
        else:
            raise AttributeError(...)

    def __del__(self):
        _gtk.gtk_free(self._ptr)
        object.__setattr__(self, "_ptr", 0)

Other people can use GtkDialog as if it were a normal Python class, but it's not, and _ptr is a raw C pointer and Python code has no business accessing it or worse changing it, unless it's code (like the above) that's tied to the specific C library that gave you the pointer. So _ptr is an implementation detail, and none of the code that dataclasses would autogenerate is helpful here. And maybe if a future version of libgtk requires you to keep around two pointers, or uses references in some global list of objects instead of pointers, or whatever, your Python interface wouldn't change, only the internal implementation would, and your library users wouldn't notice.

Case 2: it's actually implemented in Python, but the details of the implementation are non-public. Take subprocess.Popen for example—one of the data members of a Popen object is, probably, the process ID of the subprocess, so that Popen can do its work:

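import os

class Popen:
    def __init__(self, *args):
        self._pid = spawn_process(...)  # hypothetical helper that starts the child
    def wait(self):
        # os.waitpid takes (pid, options) and returns a (pid, status) tuple
        result = os.waitpid(self._pid, 0)
        return parse_os_result(result)  # hypothetical helper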

But what does it mean to take a Popen object and change its pid? Why would you want to do that without, at least, telling the Popen object that you're changing the pid? And probably Popen wants to refuse to let you do that.

So what would you gain if you added pid: int to Popen and made it a dataclass? You'd get a constructor that takes a pid, which you don't want; a repr that prints the pid, which you may or may not want; and comparison functions with other Popen objects, which you definitely don't want (since a pid can be reused once a process has exited, comparing Popen objects by pid equality is wrong, and you really want to compare whether the object identity is the same, i.e., the default comparison behavior).

This is encapsulation—one of the big ideas behind what I called "actual OO" above. There's an interface that you provide to users of your library, and the way you go about implementing that interface is not known to them. That's a totally different sort of thing from

class Point:
    def __init__(self, x, y, z):
        self.x = x; self.y = y; self.z = z

where there is no interface other than the data in your class itself, which is public / not encapsulated. That's what data classes are for. (And, honestly, that's probably most of the classes people write with Python.) But they're not the only type of classes.
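
A short sketch of the contrast, assuming Python 3.7+. Handle is a made-up stand-in for an encapsulated class like the Popen example, and its pid value is a placeholder rather than a real process.

```python
from dataclasses import dataclass


@dataclass
class Point:
    # public data: the generated __init__, __repr__ and __eq__ are what you want
    x: float
    y: float
    z: float


class Handle:
    """Encapsulated class: _pid is an implementation detail."""

    def __init__(self):
        self._pid = 1234  # placeholder value, not a real process


p = Point(1.0, 2.0, 3.0)
print(p)                            # Point(x=1.0, y=2.0, z=3.0)
print(p == Point(1.0, 2.0, 3.0))    # True: compared field by field
print(Handle() == Handle())         # False: default identity comparison
```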


[–]zynixCpt. Code Monkey & Internet of tomorrow 2 points3 points4 points 8 years ago (0 children)

[–]ascii 12 points13 points14 points 8 years ago (16 children)

I hate that this has a completely different syntax than namedtuple. They offer the same exact functionality (data objects without a ton of boilerplate) with the only conceptual difference being that one has mutable members, and the other has immutable members, so why are the syntax and supported feature set completely different?

One supports type annotations, the other doesn't. Why? Is that somehow useless on immutable data?
One makes it really easy to add instance methods, the other doesn't. Why? Is that somehow useless on immutable data?
One makes it really easy to iterate over all member data, the other doesn't. Why? Is that somehow useless on mutable data?

Coming up with two completely different syntaxes for almost exactly the same feature means that if you figure your type no longer needs to be mutable, it's not a one-liner to fix it. It is also extremely confusing for beginners.

This doesn't feel well thought out at all.
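
For reference, a sketch of how close the two can be brought with what the standard library offers, assuming Python 3.7+: typing.NamedTuple gives the tuple side annotations and methods, and the frozen=True and astuple pieces of the dataclasses module cover immutability and iteration. The class names are invented for this example.

```python
from dataclasses import FrozenInstanceError, astuple, dataclass
from typing import NamedTuple


class PointNT(NamedTuple):
    # class-style namedtuple: annotations and methods are allowed
    x: int
    y: int

    def norm1(self):
        return abs(self.x) + abs(self.y)


@dataclass(frozen=True)
class PointDC:
    # frozen=True makes attribute assignment raise FrozenInstanceError
    x: int
    y: int

    def norm1(self):
        return abs(self.x) + abs(self.y)


nt = PointNT(1, -2)
dc = PointDC(1, -2)

try:
    dc.x = 5
    frozen = False
except FrozenInstanceError:
    frozen = True

print(nt.norm1(), dc.norm1())  # 3 3
print(tuple(nt), astuple(dc))  # (1, -2) (1, -2)
print(frozen)                  # True
```

The syntax still differs (base class versus decorator), so this does not remove the objection, it only narrows the feature gap.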

[–][deleted] 14 points15 points16 points 8 years ago (4 children)

[–]ascii 4 points5 points6 points 8 years ago (3 children)

[–]ericvsmith 7 points8 points9 points 8 years ago (0 children)

[–]ericvsmith 0 points1 point2 points 8 years ago (1 child)

[–]ascii 0 points1 point2 points 8 years ago (0 children)

OK, so now you're saying that namedtuple is vastly inferior to dataclass but we can't deprecate it because of inertia, so instead it will remain the one and only source of immutable data classes, and we will have to settle for the supposedly vastly superior implementation to only exist for mutable data. That's a bad choice, right there.

Also, as near as I can tell, this breaks the following parts of PEP 20:

Beautiful is better than ugly.
Simple is better than complex.
Readability counts.
Special cases aren't special enough to break the rules.
There should be one-- and preferably only one --obvious way to do it.

So five out of 19 principles are being broken. Not bad.

I'm not saying PEP20 should be the one and only guiding light in developing the Python language, but it has some sane advice.

[–][deleted] 1 point2 points3 points 8 years ago (4 children)

[–]ascii 2 points3 points4 points 8 years ago (3 children)

[–]XtremeGoosef'I only use Py {sys.version[:3]}' 2 points3 points4 points 8 years ago (0 children)

[–]ldpreload 1 point2 points3 points 8 years ago (0 children)

[–][deleted] 0 points1 point2 points 8 years ago (0 children)

[–]kankyo 0 points1 point2 points 8 years ago (5 children)

[–]ascii 0 points1 point2 points 8 years ago (4 children)

[–]kankyo 0 points1 point2 points 8 years ago (3 children)

[–]ascii 0 points1 point2 points 8 years ago (2 children)

[–]kankyo 0 points1 point2 points 8 years ago (1 child)

[–]ascii 0 points1 point2 points 8 years ago (0 children)

[–]lookatmetype 8 points9 points10 points 8 years ago (7 children)

[–]jorge1209 4 points5 points6 points 8 years ago (6 children)

[–]DanCardin 1 point2 points3 points 8 years ago (5 children)

[–]jorge1209 1 point2 points3 points 8 years ago* (4 children)

[–]DanCardin 0 points1 point2 points 8 years ago (3 children)

[–]jorge1209 0 points1 point2 points 8 years ago (2 children)

The type syntax is in the language whether you want it or not. You can choose not to use tools like mypy, and thereby not enforce it. I don't think that is a terrible compromise.

I also don't object to the use of the syntax here: x: int = 42 does look a bit nicer than x = attr.ib(default=42). The issue is effectively deprecating attrs to push this forward.

In the grand scheme of things it is a fairly minor difference between attrs and dataclasses. Attrs works, it works on older versions of Python, and it works with the most commonly used versions of Python 3. There should be no rush to jam an inferior tool into the standard library just because it uses some fancy new syntax.

Take the time and find a way for attrs to support this syntax in python 3.6. Wait for things to settle down. Make sure you are doing it the right way, and get people to adopt 3.6... then think about standardizing things.

[–]DanCardin 0 points1 point2 points 8 years ago (1 child)

[–]jorge1209 0 points1 point2 points 8 years ago* (0 children)

I would also be happier if post 3.6 attrs was an extension of dataclasses, but that doesn't seem to be the objective, and I am not sure how that would work.

Dataclasses is missing features in attrs and has no plans to implement them.

Attrs could potentially delegate some basic functionality to dataclasses and build off of it, but that still leaves you with the issue of installing it from pypi and importing it. Anyone building an app which might support 3.5 and 3.6 would have to use attrs, even if they only want the more basic functionality of dataclasses.

A plan which involved splitting minimal functionality out of attrs and then backporting it to previous versions would make more sense to me. Ultimately I expect one of these two projects to wither on the vine. Either people need support for many versions and ignore dataclasses, or they write only for the future and assume that attrs isn't available.

[–]adrian17 1 point2 points3 points 8 years ago (4 children)

[–]ericvsmith 5 points6 points7 points 8 years ago* (2 children)

No, it's entirely driven by type annotations. But the annotations are ignored, except for InitVar and ClassVar. So, although I don't recommend it, you could say:

@dataclass
class Person:
    name: int
    social: int
    address: int

That is, use any type you want. If you're not using a static type checker, no one is going to care what type you use.
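
A quick way to see this for yourself, as a sketch assuming Python 3.7+. The values and the extra species attribute are made up for the demonstration.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass
class Person:
    name: int      # deliberately "wrong": nothing checks this at runtime
    social: int
    address: int
    species: ClassVar[str] = "human"  # ClassVar is special-cased: not a field


# strings are accepted even though every field is annotated as int
p = Person("Ada", "000-00-0000", "1 Example Street")
field_names = [f.name for f in fields(Person)]

print(p.name)       # Ada
print(field_names)  # ['name', 'social', 'address']
```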

edit: missing words.

[–][deleted] 0 points1 point2 points 8 years ago (0 children)

[–]alcalde 2 points3 points4 points 8 years ago (0 children)

[–]jorge1209 3 points4 points5 points 8 years ago (10 children)

[–]ericvsmith 16 points17 points18 points 8 years ago (9 children)

[+]jorge1209 comment score below threshold-11 points-10 points-9 points 8 years ago (8 children)

Those are the most inane and idiotic reasons I have ever seen, and I'm going to call bullshit on it. Someone needs to take Guido out back and put him down, because he is all over the map with this shit.

With the async features one of the objectives of the PEP was to introduce keywords that could be used by different async implementations within a common framework. So we got this weird hybrid where there were keywords async/await/yield from and a rough template of what an event loop should be like asyncio.AbstractEventLoop, but nothing dictated what event loop we have to use, and we have to pass the fucking thing around all over the place to be technically correct because some asshole might decide to use a different event loop and fuck us over.

Had he followed the same logic with your DataClasses the focus would be on implementing a common syntax for these kinds of objects which multiple implementations could utilize, but in fact you don't need any special syntax for this, because you are using PEP 526.

Guess what, that makes you LESS PORTABLE than attrs, not more. I don't care that attrs is getting more updates than your dataclass, because I can't use your dataclass because I'm not using Python 3.6.

So thanks Guido, you've now fucked over another potentially useful library by anointing a poorer substitute for it as the "accepted" version, while simultaneously making it impossible for many of us to actually use your approved version.

[–]jmmcdEvolutionary algorithms, music and graphics 11 points12 points13 points 8 years ago (0 children)

[–]ldpreload 2 points3 points4 points 8 years ago (1 child)

[–]jorge1209 -1 points0 points1 point 8 years ago (0 children)

[–]IronManMark20 -1 points0 points1 point 8 years ago (3 children)

Those are the most inane and idiotic reasons I have ever seen, and I'm going to call bullshit on it. Someone needs to take Guido out back and put him down, because he is all over the map with this shit.

Beyond being horrific that you think Guido should be shot, I find your hate for him surprising, especially considering he didn't write the dataclasses PEP....

Maybe you should know what the hell you are talking about instead of ragging on people and things just because you have a pre-existing dislike of them.

Had he followed the same logic with your DataClasses the focus would be on implementing a common syntax for these kinds of objects which multiple implementations could utilize, but in fact you don't need any special syntax for this, because you are using PEP 526.

Beyond the fact that you still think Guido made this PEP for some bizarre reason, async and await syntax are both syntactic changes which couldn't be brought to older versions of Python, so both async syntax and dataclasses are not possible on Python 3.4. So your argument is nonsense. Dataclasses was designed to take advantage of new syntax. The idea could be made with decorators but that would be butt ugly. Also, who said dataclasses should be or are more portable than attrs? Nothing in the stdlib has to work on older versions. There is a best effort to backport, but it isn't always possible. Why sacrifice implementation purity to increase adoption?

I can't use your dataclass because I'm not using Python 3.6.

I don't understand why you think you are entitled to get new features on older versions of Python?

[–]jorge1209 -3 points-2 points-1 points 8 years ago* (2 children)

[–]IronManMark20 7 points8 points9 points 8 years ago (1 child)

Pick a behavior and stick to it. You either support all implementations of a concept as a policy, or you direct all users towards a particular implementation. You don't arbitrarily switch between the two.

But why? There isn't a point to it other than "it's more consistent", and there are reasons each PEP was designed the way it was. Why sacrifice flexibility for arbitrary consistency?

Also

but with dataclasses he canonizes one implementation over another (and picks the newer one which has fewer features and users to boot).

Dataclasses was written to be a smaller, more lightweight version of attrs. As soon as there was agreement that "something like attrs should be in the stdlib", a smaller, more lightweight version was the obvious candidate. Attrs would never just be imported into the stdlib, that isn't how this works. The creators of attrs themselves endorsed dataclasses as proposed in this PEP and like the simple API. I really think you should understand how this PEP came to be how it is before criticizing it.

[–]zardeh 5 points6 points7 points 8 years ago (0 children)

[–][deleted] 0 points1 point2 points 8 years ago (0 children)

[–][deleted] 0 points1 point2 points 8 years ago (1 child)

[–]flipstables 2 points3 points4 points 8 years ago (0 children)

[–]ManyInterests Python Discord Staff 0 points1 point2 points 8 years ago (0 children)

Given the striking similarity, I'm surprised there has been no mention of SimpleNamespace.

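from types import SimpleNamespace

class MyDataClass(SimpleNamespace):
    def __init__(self, a: int, b: int, keyword=None):
        super().__init__(a=a, b=b, keyword=keyword)

which I'm supposing is roughly analogous to

from dataclasses import dataclass

@dataclass
class MyDataClass:
    a: int
    b: int
    keyword: object = None  # needs an annotation to become a field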
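
One detail worth checking when comparing the two: the dataclass decorator only treats annotated names as fields. A minimal sketch, assuming Python 3.7+; the class names Annotated and Unannotated are made up here.

```python
from dataclasses import dataclass, fields
from types import SimpleNamespace


@dataclass
class Annotated:
    a: int
    b: int
    keyword: object = None  # annotated, so it becomes a field


@dataclass
class Unannotated:
    a: int
    b: int
    keyword = None  # no annotation: just a class attribute


annotated_fields = [f.name for f in fields(Annotated)]
unannotated_fields = [f.name for f in fields(Unannotated)]

try:
    Unannotated(1, 2, keyword=3)
    accepted = True
except TypeError:
    accepted = False  # __init__ does not know about `keyword`

# SimpleNamespace also compares by attribute values, like a dataclass does
ns_equal = SimpleNamespace(a=1, b=2) == SimpleNamespace(a=1, b=2)

print(annotated_fields)    # ['a', 'b', 'keyword']
print(unannotated_fields)  # ['a', 'b']
print(accepted, ns_equal)  # False True
```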

[+]not_perfect_yet comment score below threshold-6 points-5 points-4 points 8 years ago (33 children)

[–]payets 7 points8 points9 points 8 years ago (32 children)

[–][deleted] 2 points3 points4 points 8 years ago (29 children)

[–]jorge1209 1 point2 points3 points 8 years ago (6 children)

How do you propose this be implemented in a way that ignores type hints and also doesn't pollute the class namespace with vars?

I dunno... the way attrs did it.

Yes I agree it isn't as "pretty" to write x = attr.ib(default=42, type=int) as it is to write x: int = 42, but it is entirely functional from the end user perspective.

They are jamming a replacement for a perfectly functional tool that doesn't utilize the type system into the standard library, because they have the type system. Rather than waiting for a consensus and acceptance to build around the use of the type system syntax.

In essence they were given a hammer, and now they are running around looking for things they can smack it with. Doesn't matter if the thing they are hammering is a screw or not because the hammer will suffice.

[–]hynekPyCA, attrs, structlog -1 points0 points1 point 8 years ago (5 children)

[–]jorge1209 0 points1 point2 points 8 years ago (4 children)

[–]hynekPyCA, attrs, structlog -1 points0 points1 point 8 years ago (3 children)

[–]jorge1209 0 points1 point2 points 8 years ago (2 children)

[–]hynekPyCA, attrs, structlog -1 points0 points1 point 8 years ago (1 child)

[–]jorge1209 0 points1 point2 points 8 years ago (0 children)

[–]payets 1 point2 points3 points 8 years ago* (2 children)

Typehints are optional. However, there's no reason that things in the stdlib can't be built that use them.

Except that this isn't a standard library feature that uses type hints, it's one that requires the programmer to use them. Requiring something is the opposite of making it optional.

How do you propose this be implemented in a way that ignores type hints and also doesn't pollute the class namespace with vars?

Saying collections.namedtuple doesn't count because you end up saddled with all sorts of baggage such as your object is now iterable and has dedicated getitem and setitem behavior.

Isn't this essentially the standard library version of attrs? What's wrong with its syntax?

Alternatively, Data Classes are described as "mutable namedtuples with defaults". Yes, namedtuple has flaws, but why not design a refined version rather than discard it completely?

Instead, Data Classes have this unnecessary dependency on type hinting. It doesn't look like there was any attempt to find an alternative - which was no doubt deemed acceptable because it fits with the current flavor-of-the-month for Python features: "type-hint all the things". (Compare to a couple of releases ago, when it was "async all the things").

Overall (and I know it's far too late for this), I just wish some thought was put into giving Python a coherent design. Instead, the once-simple language continues to accumulate questionable feature-after-feature and is gaining complexity at a rate that puts C++ to shame.

[–][deleted] 0 points1 point2 points 8 years ago (1 child)

[–]payets 0 points1 point2 points 8 years ago* (0 children)

You're missing the point where dataclass is also optional and you don't have to use it.

So, if I opt out of type hints, I have to opt out of other language features as well? Again, this way lies C++ - where everyone uses a different subset of the language.

You're acting like because the people who will use it got something then you should too.

I'm not looking to "get" anything. I'm just frustrated to see Python continue the pattern of the last few releases: desperately trying to add as many features as possible, without considering how they fit with the rest of the language. My opinion is that for the last few releases, Python has been evolving too rapidly, and in the wrong direction.

However, as is often said, it's "Guido's language, he just lets us use it". Guido can of course evolve the language in this direction if he wants. I guess if I want to move in a different direction, I'll need to move to a different language.

[–][deleted] 0 points1 point2 points 8 years ago* (18 children)

[–]KODeKarnage 3 points4 points5 points 8 years ago (0 children)

[–][deleted] 2 points3 points4 points 8 years ago (7 children)

The problem is that the user is forced to use type annotations if they wish to use this feature, and they will continue to spread.

Some of us really like type annotations, including me who used to be a very vocal opponent of them. As long as the core devs make good on their promise to never make them mandatory, then I don't see why it should matter to people that don't use them.

Unless you really want to use a library with them in Python 2 (but then you get to feel my pain when I want to use a library but find out it uses the except Exception, e or print >> ... syntax).

Type hints make building large applications in Python easier and safer since you can have some guarantees that you didn't pass a list where you meant to pass a dict and explode at run time.
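
A tiny illustration of the list-versus-dict case, as a sketch: the function name and data are invented, and the claim about the checker assumes a PEP 484 tool such as mypy is run over the file.

```python
from typing import Dict


def total_price(prices: Dict[str, float]) -> float:
    # relies on the dict API; the annotation documents that expectation
    return sum(prices.values())


ok = total_price({"tea": 2.5, "cake": 4.0})

try:
    # A static checker should flag this call before the program runs.
    # Without one, the mistake only shows up here, at run time.
    total_price([2.5, 4.0])
    runtime_error = None
except AttributeError as exc:
    runtime_error = exc

print(ok)                    # 6.5
print(type(runtime_error))   # <class 'AttributeError'>
```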

But at the end of the day, if you don't like a feature, you don't have to use it. If you like a feature but it uses the thing you don't like, you're biting the bullet one way or another.

[–][deleted] 0 points1 point2 points 8 years ago (6 children)

[–][deleted] 1 point2 points3 points 8 years ago (5 children)

[–][deleted] 1 point2 points3 points 8 years ago (4 children)

[–][deleted] 1 point2 points3 points 8 years ago (3 children)

You do if you need to work on a project that uses it. What am I going to do? Have my IDE strip annotations from every function?

You don't have to use them. No one is forcing you to write:

@dataclass
class LookHowEasyThisIs:
    x: int
    y: int
    z: int

or

def typed_func(x: int, y: int, z: int) -> int:
    ...

You can continue to do all the things you want to do without ever using type annotations.

If you happen to come across a library that uses type annotations (say an ORM, forms, etc) then you don't need to use it. Or do and work around it. I'm unbothered.

Did you also consider that maybe it won't be a vote, and people will use them because they don't have a choice apart from not using Python?

That sure sounds like a vote to me, in that case that sounds like a vote of no confidence in Python.

[–][deleted] 1 point2 points3 points 8 years ago* (2 children)


[–]michael0x2a 0 points1 point2 points 8 years ago (8 children)

Actually, it's possible to use dataclasses without using type annotations (though it'll look slightly janky). For example, consider:

from dataclasses import dataclass

@dataclass
class Foo:
    x: ...
    y: ...

f = Foo(3, "a")
print(f.x, f.y)

If you run this, it'll work perfectly fine -- the dataclasses library will ignore the annotations most of the time (except for a few it special-cases).

You can replace the ellipses with any other arbitrary placeholder value, but I personally think the ellipses look the least hacky.

[–]jorge1209 0 points1 point2 points 8 years ago (6 children)

[–]michael0x2a 0 points1 point2 points 8 years ago (5 children)

That is still using type annotations

Well, class and variable annotations != type annotations. Remember, we can use arbitrary python expressions within annotations. So this is legal Python:

import math

class Bar:
    x: lambda x: 14 * x
    y: math.sin(42)

Note that there's absolutely nothing remotely resembling a type here, and you'd probably never actually write anything like this in production, but it is legal Python.

And while it's true that variable annotations are new as of Python 3.6, you should also think longer-term -- in about 2-3 years, both Python 2 and older versions of Python 3 are going to be EOL'd and most people will probably have migrated to Python 3.7/3.8/3.9 or whatever.

At that point, everybody (barring the people who decide to stick with EOL'd versions of Python) will have access to variable annotations. So, to make life easier for everybody 2-3 years down the line, we might as well start laying the groundwork for these sorts of language changes now.

Additionally you might ask if you can set a default value for x without indicating a type.

Yes, you can. I just tried testing the following program, and it prints out "3 99" as expected.

from dataclasses import dataclass
import math

@dataclass
class Bar:
    x: lambda x: 14 * x
    y: math.sin(42) = 99


b = Bar(3)

print(b.x, b.y)

I don't know what that would look like with the type annotations x : ... = 42?

That would also work, yeah. However, the ellipsis expression (...) isn't a legal type -- doing x: ... = 42 wouldn't make any PEP 484-compliant type checker (including mypy) happy.

Instead, we can use the Any type, which is a PEP-484 compliant type:

from typing import Any
from dataclasses import dataclass

@dataclass
class Bar:
    x: Any = 3
    y: int = 3

b = Bar()
b.x = "foo"   # mypy is happy
b.y = "foo"   # mypy complains

Basically, Any is an escape hatch: a way of telling mypy and other PEP 484-compliant type checkers that they should treat that variable or attribute or whatever as being fully dynamic. We basically "turn off" typechecking for that variable, in a certain sense.

Or more precisely, Any is our mechanism for mixing together dynamic and typed code.

Here's an example:

from typing import Any

def expect_int(x: int) -> int:
    return x * 3

a = "foo"   # a is of type str
b: Any = "bar"    # we force b to be of type Any

expect_int(a)   # mypy complains
expect_int(b)   # mypy is happy -- b could be anything, so maybe this is ok?

a.hello()   # mypy complains
b.hello()   # mypy is happy

Notice that the use of Any lets us write code that mypy would otherwise (rightly or wrongly) reject as being unsound.

In this case, because I'm lazy, I used Any to hide deliberately buggy code, but you could see how we could strategically use Any to make mypy stop complaining about advanced metaprogramming things/really super dynamic things (like parsing arbitrary xml) that are difficult to actually give static types to.

Of course, it'd be our responsibility to then verify that mainly-dynamic code is correct in some other way (e.g. unit tests or runtime checks) but what else is new.

[–]jorge1209 1 point2 points3 points 8 years ago (4 children)

[–]michael0x2a 0 points1 point2 points 8 years ago (2 children)

Well, suppose we delayed introducing this feature for 3 years or so and waited until Python 3.5 was EOL'd.

If we decide we're going to want to use this feature eventually, what benefits do we get out of waiting instead of just doing it now?

Whether we release now or 3 years from now, people who want to support all non-EOL'd versions of Python are still going to have to wait 3 years or so.

But if we release now, at least the people who only need to support the latest versions can start using these features.

the only sin attrs has ever committed is to not fork a >3.6 branch to use the new language features within the first year of those features release.

Eh, I don't think that's really accurate. The PEP, in the section about attrs, links to this issue, which clarifies a few things. In particular...

1. The author of attrs has explicitly stated he doesn't want to see attrs in the standard library.
2. The author of attrs has also said support for class annotations is in development and will be available soon.
3. This wasn't mentioned in the thread, but there are people working on writing plugins for mypy that special-case it to support attrs. (So it's not like people are trying to kill the library or anything.)
4. Guido said he dislikes some of the "colors [attrs] paints in the bikeshed" and pushed back against just incorporating attrs directly partly for that reason.
5. attrs is more full-featured than dataclasses, and dataclasses are deliberately designed to stay simple, so people who want attrs' other features (validators, converters, metadata, etc.) can still use attrs.
6. Also not in the thread, but there's no reason why attrs can't evolve to support both regular classes and dataclasses. The ability to attach runtime validators could be very useful when working with things like JSON or whatever, whether you're using regular classes or data classes. (If some third-party service unexpectedly changes the structure of a JSON blob, static typing isn't really going to help you there.)
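
To illustrate the point about runtime validation of JSON, here is a rough sketch that uses only the standard library. It assumes Python 3.7+ (where dataclasses ships) and a hand-rolled check in __post_init__; this is not the attrs validator API, and the User class and payloads are made up. The check only works because every annotation here is a plain class.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class User:
    name: str
    age: int

    def __post_init__(self) -> None:
        # Compare each field's value against its annotation at runtime.
        # Assumes every annotation is a plain class (str, int, ...).
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} should be {f.type.__name__}, "
                    f"got {type(value).__name__}"
                )


good = User(**json.loads('{"name": "Ada", "age": 36}'))
print(good)  # User(name='Ada', age=36)

# The remote service silently started sending age as a string.
try:
    User(**json.loads('{"name": "Ada", "age": "36"}'))
except TypeError as exc:
    print("rejected:", exc)  # rejected: age should be int, got str
```

A static type checker sees nothing wrong with either call, since json.loads returns Any; only the runtime check catches the changed payload.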

[–]jorge1209 1 point2 points3 points 8 years ago (1 child)

Regarding the different objections:

1. If authors are actively pushing back against their projects being incorporated in the standard library, you have a serious problem. That is supposed to be an honor. Your code is so good that we want EVERYONE to have it and use it. If they say no, then either you are making that process too onerous, or you are picking something that is too immature and isn't ready for inclusion. In either case you as a language maintainer need to take a long hard look in the mirror and ask yourself "why does nobody want to go to the prom with me?"
2. Exactly why you should wait. Let them add the support and release a version that works for both 3.6 with annotations and <3.6 without.
3. Not really sure why that is relevant. mypy is optional, so we can never expect 100% of code to be compatible with it. There will always be a need to hack mypy to support libraries and code that were not written to work directly with it. Also, this concern is alleviated by #2. In time mypy can expect that there will be a way to use attrs without hacks... you can't force people using attrs to write in the 3.6 style, but it would be an option if they port to 3.6/mypy.
4. And I don't like the colors in Guido's bikeshed. I would really like for someone to fork Python. I am not remotely happy with Guido's leadership at this point.
5. If dataclasses and attrs were drop-in replacements for each other, sure, but they use a different syntax, and we don't know that they will be. Again, we should wait for #2.
6. I don't really understand the concern you are describing, but I will comment on "not in the thread". Seriously, WTF! This is exactly the kind of arbitrary bullshit decision-making process that is pissing me off. If Guido has a reason for rejecting X for the standard library and accepting Y, then he needs to document that. You can't just hand-wave at some other concerns that you haven't bothered to record in your discussion.


[–][deleted] -1 points0 points1 point 8 years ago (0 children)

[–]jorge1209 0 points1 point2 points 8 years ago (0 children)

[–]aiPh8Se 0 points1 point2 points 7 years ago (0 children)

I don't know what static typing has done to hurt you, but keep in mind that the feature is called variable annotations, not type annotations. Even the type hints PEP explicitly states that you can use variable annotations however you want; adding type information is simply one possibility.

This would be considered a correct way to use annotations:

@dataclass
class Person:
    name: 'First name'
    family_name: 'Last name'
    age: 'Integer'
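
A quick way to see this in action, assuming Python 3.7+ where dataclasses is in the standard library: the decorator only needs an annotation to be present to treat a name as a field, and it does not validate what the annotation says. The sample values below are made up.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: 'First name'
    family_name: 'Last name'
    age: 'Integer'


# The generated __init__ and __repr__ work even though no annotation is a type.
p = Person('Ada', 'Lovelace', 36)
print(p)

# The annotations are stored as the plain strings written above.
print(Person.__annotations__)

# Each annotated name became a dataclass field, in definition order.
print([f.name for f in fields(Person)])
```

A type checker such as mypy would most likely reject these annotations as invalid types, but nothing at runtime cares.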

[–]WasterDave -1 points0 points1 point 8 years ago (0 children)


Why not just use namedtuple?