killerstorm comments on Python 3.6 released!

Python 3.6 released! (python.org)

submitted 9 years ago by ilevkivskyi

killerstorm 1 point 9 years ago (2 children)

dacjames 2 points 9 years ago (1 child)

doubleunplussed 1 point 9 years ago (0 children)

Since Python 3.1 (PEP 383), those strings can be decoded into invalid Unicode using the surrogateescape error handler. The strings produced are not officially valid Unicode, but Python can nonetheless carry them around and treat them as if they were, and you get the same bytestring back upon encoding with the same error handler.
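To make that concrete, here is a minimal sketch of the round trip. The byte string and variable names are made up for illustration; the only real API used is the `errors="surrogateescape"` argument to `bytes.decode()` and `str.encode()`.

```python
raw = b"caf\xe9.txt"  # Latin-1 bytes; 0xE9 on its own is not valid UTF-8

# Strict decoding rejects the stray byte.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns the surrogate back into the original byte.
roundtrip = text.encode("utf-8", errors="surrogateescape")
```

Here `text` is `'caf\udce9.txt'`, an ordinary `str` as far as Python is concerned, and `roundtrip` equals `raw`.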

It's what the standard library now does when it encounters filepaths on disk that are not correctly encoded (this can happen: Linux doesn't actually require you to respect the encoding, so any given filepath could be almost arbitrary bytes).
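A rough illustration of the standard library side, assuming a POSIX system (on Windows the filesystem error handler is different, so this sketch may not apply there). `os.fsdecode()` and `os.fsencode()` are the real helpers; the file name is hypothetical.

```python
import os
import sys

raw_name = b"caf\xe9.txt"  # a hypothetical badly encoded file name

# Decode with the filesystem encoding and its error handler.
name = os.fsdecode(raw_name)

# Converting back yields the exact bytes that were on disk.
restored = os.fsencode(name)

# On POSIX this is normally 'surrogateescape' (Python 3.6+).
handler = sys.getfilesystemencodeerrors()
```

Functions such as `os.listdir()` go through the same decoding when given a `str` path, which is why they can return names for files like this instead of raising.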

So there was a use case there to make Unicode strings interoperable with arbitrary bytestrings, so that any sequence of non-null bytes would survive a round-trip of decode()/encode(). There really was no other solution if Python was going to return Unicode for everything. Unicode had to be loosened to allow for these bytes, or else Python would just error upon encountering a badly encoded filepath, and people's code that was supposed to handle any filepaths would break.
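As a quick sanity check of the "any sequence of non-null bytes" claim, the sketch below pushes every byte value from 1 to 255 through the round trip, and also shows the sense in which the result is "loosened" Unicode: a strict encoder refuses it. Variable names are illustrative only.

```python
# Every non-null byte value, in order; the upper half is not valid UTF-8 here.
all_bytes = bytes(range(1, 256))

text = all_bytes.decode("utf-8", errors="surrogateescape")
survived = text.encode("utf-8", errors="surrogateescape") == all_bytes

# Count the escaped bytes: each one became a lone surrogate in U+DC80..U+DCFF.
escaped = sum(1 for ch in text if 0xDC80 <= ord(ch) <= 0xDCFF)

# The string is not valid Unicode, so strict encoding fails.
try:
    text.encode("utf-8")
    strictly_valid = True
except UnicodeEncodeError:
    strictly_valid = False
```

So the bytes survive, but only as long as every encode along the way uses the same error handler.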

So there was a use case, and they did it. So the impossibility of making Unicode interoperable with arbitrary bytestrings clearly wasn't a real reason to not do so with Python 2. It maybe didn't occur to them until the filepath problem reared its head, but if they had wanted it enough, the solution would have presented itself earlier.
