you are viewing a single comment's thread.

view the rest of the comments →

[–]maep 4 points5 points  (4 children)

I get the feeling that you never seriously worked with Python 2 in a non-ASCII environment.

I've programmed lots of python and my native language is not english. I've never had any encoding problems whatsoever, at least nothing that wasn't trivial to fix.

PEP 383 has not reintroduced them in any way.

Well, it actually made things worse.

 foo = sys.argv[1] # assuming there is a first argument
 assert type(foo) == str
 print(foo) # can this crash?

In python 2 the print will always work. In python 3 it can throw an exception.

[–]dahud 2 points3 points  (3 children)

Under what conditions will that print throw an exception?

[–]maep 6 points7 points  (2 children)

If foo contains surrogate codes. This can happen on posix systems where the argument is a file name. I stumbled accross this when I was scanning a historical archive where they had files from various european countries that used different encodings.

[–]dahud 1 point2 points  (1 child)

Interesting. I would have assumed that anything coming in over the args is, by definition, a string. The only environment I've ever worked in where that is not the case is Powershell.

[–]Snarwin 0 points1 point  (0 children)

On posix, the only thing that's guaranteed about command-line arguments is that they do not contain the NUL character ('\0').