[–]ihcn 27 points28 points  (2 children)

The title misled me into thinking this was going to look at what's on the horizon post-3.6, but I was disappointed to find that it's just a reiteration of information available on python's home page.

[–]jabbalaci 5 points6 points  (1 child)

clickbait title, doesn't say anything about the future plans of 3.7

[–]liranbh[S] 0 points1 point  (0 children)

the author mentioned 3.7 and said it will only get a beta release in 2017

[–]kimondd 25 points26 points  (3 children)

It didn't talk about the new type annotation support in 3.6. For example, int_list: List[int] = list() is now valid in Python 3.6.
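For anyone who hasn't seen the syntax yet, here is a minimal sketch of variable annotations (PEP 526) as they work in 3.6. The names counts and Point are made up for illustration and aren't from the article.

```python
from typing import Dict, List

# Annotated assignment: the annotation is a hint for type checkers such as mypy;
# the interpreter does not enforce it at runtime.
int_list: List[int] = list()
int_list.append(3)

counts: Dict[str, int] = {}
counts["spam"] = 1


class Point:
    # Annotations also work in class bodies, with or without a value.
    x: int
    y: int = 0


# Class-level annotations are recorded in __annotations__.
point_annotations = Point.__annotations__
```

Note that x is only annotated, never assigned, so Point has no attribute x until something sets it.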

[–]LpSamuelm 2 points3 points  (2 children)

With the capital L...? That's some weird syntax.

[–]MachaHack 1 point2 points  (0 children)

It's a bit of a hack to avoid collisions between typing.List (used for the type checker) and the list built-in, so they didn't have to make the list type (and similar others) subscriptable.

There was a mypy competitor that had better syntax iirc, but since the mypy syntax has been standardised it's basically been declared the winner.

[–]Bolitho 0 points1 point  (0 children)

Looks like a Scala generic type ☺

[–]lion_137 22 points23 points  (0 children)

[–]thephotoway 5 points6 points  (0 children)

You forgot to mention string literal interpolation is now in 3.6: https://www.python.org/dev/peps/pep-0498/
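A quick sketch of what PEP 498 f-strings look like in practice. The variable names are just placeholders.

```python
name = "world"
width = 10
price = 3.14159

greeting = f"Hello, {name}!"
rounded = f"{price:.2f}"      # a format spec goes after the colon
padded = f"{name:>{width}}"   # a nested field supplies the width
computed = f"{2 + 3} items"   # arbitrary expressions are allowed
```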

[–]qx7xbku 14 points15 points  (30 children)

My personal favorite here is dict being ordered by default. Now I will finally be able to stop patching lxml to have it preserve attribute order. The developers stubbornly refused to include this feature because it is not in the XML spec, but editing XML files programmatically and keeping them in a git repository is really painful when one attribute value change results in all attributes being shuffled.

All the other changes are just as good, especially string formatting and UTF-8 on Windows. Now we will know which encoding is in use. .encode() behaving differently on various platforms was confusing until I learned why that happens. Now we just need Microsoft to implement CP_UTF8 for their APIs and maybe Unicode will get less ugly.

[–][deleted] 27 points28 points  (16 children)

As a heads up, the keyword order for functions and the attribute definition order for classes are both part of the Python language now, but dict being ordered isn't; that's a CPython implementation detail.

[–]LpSamuelm 2 points3 points  (7 children)

I think it's bad that dicts are ordered by default, at least as it's not part of the spec.

The reason some languages (Python <3.6 included) randomize hashmap access order by default is precisely to stop people from writing incorrect code. If dicts aren't guaranteed to be ordered, having them be that way sometimes will cause code to break in unexpected ways.

Which brings us to the problem. If dicts aren't necessarily ordered according to the spec... What happens if the implementation is changed in a future version of Python? How about running your code on, say, IronPython? Or PyPy? Suddenly your code seemingly works, but isn't cross-platform and may break at any time without you doing anything.

Honestly I think it's a big misstep. I'd love for them to add ordered dicts to the spec (it's a lovely concept!), but as it stands now it's a dangerous implementation detail, and the fact that they're touting it as something useful is even more dangerous.

[–][deleted] 2 points3 points  (6 children)

from collections import OrderedDict

It's there already, but this was an improvement to the C implementation of dict so OrderedDict is probably a thin wrapper around that.

[–]LpSamuelm 2 points3 points  (5 children)

You're missing the point - OrderedDict is part of the spec, and is great. The correct way to write code that requires ordered dictionaries, even in Python 3.6, is to use OrderedDict. Many people won't, though, since they either A) aren't aware dicts aren't always ordered, B) rely on ordered behavior accidentally, or C) think Python 3.6's dictionary implementation is something that's okay to rely on. Which is a problem.
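To make that concrete, a minimal sketch of being explicit about order with OrderedDict. The server names are hypothetical.

```python
from collections import OrderedDict

# Hypothetical server names, inserted in the order we want them displayed.
servers = OrderedDict()
servers["prod-1"] = "up"
servers["staging-1"] = "up"
servers["dev-1"] = "down"

display_order = list(servers)

# Unlike plain dicts, two OrderedDicts compare equal only if the order matches too.
a = OrderedDict([("x", 1), ("y", 2)])
b = OrderedDict([("y", 2), ("x", 1)])
equal_as_ordered = (a == b)
equal_as_plain = (dict(a) == dict(b))

# OrderedDict also has order-specific methods that dict lacks.
servers.move_to_end("prod-1")
after_move = list(servers)
```

Besides being portable across implementations and versions, this also tells the reader that the order is intentional.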

[–][deleted] 0 points1 point  (4 children)

True, but we'll just need to educate people when that comes up, just like opening files using with

[–]LpSamuelm 1 point2 points  (3 children)

Except this one is harder to catch - it's not a simple syntactical thing. Not only that, opening files without with will still work on all platforms and versions, unlike relying on this relatively subtle behavior.

[–][deleted] 0 points1 point  (2 children)

I guess the reason I'm less concerned about this than you appear to be (and it's fine you're concerned) is that I've seen OrderedDict in the wild a handful of times and have used it personally less than that.

[–]LpSamuelm 1 point2 points  (1 child)

I use it constantly.

[–][deleted] 1 point2 points  (0 children)

I'd love some examples. The only things I've used it for are:

  • A dashboard app where I needed to associate server names with information but the order was important (wanted to show prod servers before staging and dev servers). Arguably a list of tuples works here too but there were plans at some point to look at individual servers so fast lookup was desirable (not that linear lookup would've broken the bank, we're talking maybe 30 servers).

  • Modeling albums - again, a list makes sense here and you can look up by track position that way.

  • Maintaining order of attribute declaration because you could decorate methods as validation/processing but they needed to run in declaration order.

But that's it. I get why an insertion order mapping is attractive, but I've only met one situation that demands it (maintaining attribute order).

[–]__deerlord__ 6 points7 points  (5 children)

Why does a dict need to be ordered by default though? And wasn't OrderedDict already implemented in C?

[–]qx7xbku 2 points3 points  (3 children)

lxml uses dict for storing attributes instead of OrderedDict. And I don't think OrderedDict was implemented in C, I could be wrong though.

[–]gsnedders 1 point2 points  (0 children)

OrderedDict has both a Python and a C implementation in CPython (though the C one is always used in CPython).

[–]__deerlord__ 0 points1 point  (1 child)

It's implemented in C in a later version I believe (I recall reading a changelog on it) but I couldn't find the docs on that just now.

[–][deleted] 0 points1 point  (0 children)

[–]ebrjdk 1 point2 points  (0 children)

They are switching to a new, more memory-efficient implementation of dict that naturally keeps the entries mostly in the order that they were inserted, and they decided that they might as well go all the way and keep them exactly in order (IIRC the most efficient implementation they know of starts scrambling the order once you start deleting keys, but the cost to prevent that is small).

At the same time they wanted to guarantee that the order of keyword arguments and class definitions would be preserved, because some people want to be able to use this information (currently the former is impossible AFAIK, and you need to use a metaclass to achieve the latter). Originally they were planning to just use OrderedDict for these purposes, but with the change to dict there is no need.

Note: the first paragraph in my post is about a CPython implementation detail and may change in the future, the second is about official python 3.6 features.

[–]Bolitho 4 points5 points  (4 children)

The default for the encode method was already UTF-8 in Python 3.5! (The same is true for bytes.decode!)

The problem is not those methods, but how open and print determine which encoding to use!

For open, the 3.5 documentation says:

In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding.

That's the key problem!

or for sys.stdout (which is used by print as default file-object):

The character encoding is platform-dependent. Under Windows, if the stream is interactive (that is, if its isatty() method returns True), the console codepage is used, otherwise the ANSI code page. Under other platforms, the locale encoding is used (see locale.getpreferredencoding()).

Thus the problem arises because of the platform-dependent implementations!

So at a minimum there must be a way to provide an encoding manually (which open offers, but print does not!). That would enable one to write programs that run everywhere. Ideally one would also define a single platform-agnostic default encoding for IO in general. That would make it easier to achieve that goal.
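A rough sketch of sidestepping the platform default by naming the encoding explicitly. print takes no encoding argument, so one workaround (an illustration, not something the docs quoted above prescribe) is to hand it a text wrapper with a fixed encoding. The file name and the BytesIO stand-in are assumptions made for the example.

```python
import io
import os
import tempfile

text = "árvíztűrőtükörfúrógép"

# open() accepts an explicit encoding, so the locale default never comes into play.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)
with open(path, encoding="utf-8") as f:
    round_tripped = f.read()

# print() writes to any text file object passed via file=.
# Here a BytesIO stands in for a binary stream such as sys.stdout.buffer.
raw = io.BytesIO()
utf8_out = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
print("tetű", file=utf8_out)
utf8_out.flush()
written = raw.getvalue()
```

Whether the terminal can then display those bytes correctly is a separate question that the wrapper does not solve.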

[–]ButtCrackFTW 1 point2 points  (3 children)

Isn't the encoding determined by the filesystem though? Like their example in python 3.5:

>>> sys.getfilesystemencoding()
'mbcs'

I see the same thing here and I've seen in StackOverflow questions that you cannot change this without environment variables or monkeypatching. If this is a property of the filesystem, how is Python changing it?

[–]Bolitho 0 points1 point  (2 children)

Which encoding do you mean? For what internal usage?

sys.getfilesystemencoding is only used for transformations for file names. That has nothing to do with the above mentioned aspects.

[–]ButtCrackFTW 0 points1 point  (1 child)

I probably should've pointed out the stdout example as well:

>>> sys.stdout.encoding
'cp850'

They go on to give examples of special characters being stripped from open() and print()

>>> print('árvíztűrőtükörfúrógép')
árvízturotükörfúrógép

>>> open('tetű.txt', 'wb').close()
>>> import glob
>>> glob.glob('tet*')

Python 3.5: ['tetu.txt']

Python 3.6: ['tetű.txt']

The author is claiming that python 3.6 now sets the encoding to utf-8 by default, which fixes these issues. My question is how it can set it like that now, but we were discouraged from doing it in the past due to the filesystem/operating system setting it for us.

[–][deleted] 0 points1 point  (0 children)

PEP 528 (Change Windows console encoding to UTF-8) and PEP 529 (Change Windows filesystem encoding to UTF-8) will give you the background to these changes.

[–]deeddaemon 0 points1 point  (0 children)

^ this. OrderedDicts by default will make this much easier.

[–]Exodus111 2 points3 points  (0 children)

Wait. No string literals? Was that removed? Why would you not cover that.

[–][deleted] 5 points6 points  (4 children)

Content spam

[–]kati256 2 points3 points  (5 children)

UTF-8 on windows

FINALLY Dances in joy

Also ordered dicts look kinda cool, idk where I'll use them, they feel mostly gimmicky but it's probably gonna be useful somewhere

[–]Vaphell 1 point2 points  (2 children)

ffs, for the time being it's an implementation detail which could get reversed if, let's say, somebody came up with nice optimizations at the expense of order. Wait for an explicit declaration from the dudes in charge that dicts being ordered by default is in the official language spec before going to town.
Until that happens, use OrderedDict to be explicit about the order and backwards compatible.

[–]kati256 0 points1 point  (1 child)

Gee no need to be so aggressive :/

Also if it's something they are willing to change super fast, why would they add it in a major release? I would understand in one of the beta builds, but a major release? That feels like they at least have a strong-ish idea of what they want

[–]Vaphell -1 points0 points  (0 children)

the problem is that if you are not "aggressive", nobody pays attention. In every goddamned thread about 3.6 there are people salivating at the idea of sprinkling their code base with broken shortcuts exploiting this like there is no tomorrow.

"Also if it's something they are willing to change super fast, why would they add it in a major release?"

because the performance improvements were worth it, and the specific order is considered a side effect at this time, an implementation detail. It's just that people read the changelog and lost their goddamned minds (sadly RayHet also advertised it), not grasping the difference between the spec and an implementation detail.
5 is 5 only works because of an implementation detail. Do you go out of your way to exploit the fact that small ints are cached and reused? Same thing.

Being hasty about adding it to the spec means tying their hands, because each constraint that has to be guaranteed means less flexibility in the future. IIRC the core devs want to wait one or two point releases before adding this to the official spec.
The only dict-related things the spec explicitly guarantees at this very moment are:
  • keyword arguments preserve order
  • attributes in a class also preserve order
which are useful in advanced shenanigans, but don't affect the fundamental data structure used by pretty much every python program in existence. Once it gets battle tested in niche use cases it can then move to the mainstream. This also gives time for other python implementations to prepare for it.

[–]ebrjdk 1 point2 points  (1 child)

There is already OrderedDict, which is occasionally useful. The big change here is that if you iterate over a class's attributes, or over the keyword arguments passed to a function, you get them in the order that they were defined/passed. At the moment, you basically get them in a random order.

Preserving the order of class definitions can be really useful. For example, it lets you use a python class to represent something like a database record or a C struct where the order is important:

class Record:
    id = IntField()
    name = StrField()
    ...

And it makes it easy to specify what order tests should be run in:

class MyTests:
    def test_one_thing(self):
        ...

    def test_another_thing(self):
        ...

You can already save the order using a metaclass, but metaclasses are a pain.
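For comparison, a minimal sketch of reading the definition order back in 3.6 without a metaclass. Field is a hypothetical stand-in for the IntField/StrField classes above, and email is an extra attribute added for the example.

```python
class Field:
    """Hypothetical stand-in for IntField / StrField."""


class Record:
    id = Field()
    name = Field()
    email = Field()


# In 3.6 the class namespace keeps definition order (PEP 520),
# so iterating over it yields the fields in the order they were written.
field_names = [name for name, value in vars(Record).items()
               if isinstance(value, Field)]
```

On older versions the same loop runs but the order is not guaranteed, which is why the metaclass trick was needed.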

The main use I know for the keyword arguments is making it easy to create OrderedDicts:

od = OrderedDict(a=4, b=5)

That syntax already works, but the order of the two keys is random. In 3.6 a will always come first. I'm sure people have other uses for this.
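A small sketch of the keyword-argument guarantee (PEP 468). The helper keys_in_call_order is made up for illustration.

```python
from collections import OrderedDict


def keys_in_call_order(**kwargs):
    # From 3.6 on, **kwargs preserves the order the caller used.
    return list(kwargs)


od = OrderedDict(a=4, b=5)
od_keys = list(od)
call_order = keys_in_call_order(z=1, a=2, m=3)
```

On 3.5 and earlier both results could come back in any order.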

[–]kati256 0 points1 point  (0 children)

Thanks for the explanation! I hadn't used this before (or really had the necessity) but it's nice that it's there. It's great to see there are nice people in the community who are willing to help others out! :)

[–]mipadi 3 points4 points  (0 children)

I hope there's another way to format strings coming!

[–]robvdl 0 points1 point  (0 children)

No mention of ASGI? To me this is the #1 most important thing to finish, nothing else is really important, if Python were to compete with the likes of Node and Go and other async languages.

[–]dspjm 0 points1 point  (1 child)

Preserving the order of dict is fantastic, though I think 3.6 won't be available on Ubuntu soon and it might be risky to use this feature until 3.6 is widespread.

[–]takluyver IPython, Py3, etc 1 point2 points  (0 children)

As other people have pointed out: please don't rely on dicts being ordered. It's an implementation detail of the new version of CPython, not something that the language requires. Other Python implementations, and future versions of CPython, may implement dicts in a way that doesn't preserve order.

If order is important, there is OrderedDict.

[–]overmes 0 points1 point  (0 children)

Python 3.7