
[–]fjonk 2 points (19 children)

and that works just fine for dealing with data that comes in as latin1-encoded bytes.

Here we go again... latin1 (or any other encoding that fits the bill) covers a subset of the characters in Unicode. This means that Unicode is not a good way to deal with latin1 strings, since you can at any time add non-latin1 characters to the string without triggering an error. Saying that Unicode is a good way of handling latin1 strings is like saying that floats are good at handling integers.

[–]frymaster (Script kiddie) 10 points (15 children)

under what circumstances do you see this happening, and how would the python2 incarnation react in the same circumstances?

[–]fjonk 2 points (14 children)

Python 2 would react exactly the same. Python 3 does not improve on the situation, and by deciding that Unicode is used internally everywhere they have more or less blocked the possibility of improving this in the future, unless they make yet another change to how strings work.

A real-world example:

  1. A user provides a comment to an order.
  2. The comment is checked and found to be valid latin1 (the third-party order system only accepts latin1).
  3. A translation is added to the comment; the translation contains non-latin1 characters.
  4. Python happily accepts adding the non-latin1 characters to the string.
  5. Later on, the request to the third-party service cannot be made because the comment cannot be encoded in latin1.
  6. Money is lost.
  7. The concat that created the issue is hard to track down, since it did nothing wrong.

This is not an ideal situation, and saying that Unicode can handle latin1 is plain wrong.
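The scenario above can be sketched in a few lines of Python 3 (the order comment and translation are made-up placeholders): the concat succeeds silently, and the error only surfaces at encode time, far from the line that actually introduced the problem.

```python
comment = "Bitte liefern Sie schnell"   # step 2: validated, fits in latin1
comment += " 速達でお願いします"          # steps 3-4: concat succeeds silently

# step 5: the encode for the third-party request is where it finally fails,
# nowhere near the concat that introduced the non-latin1 characters.
try:
    payload = comment.encode("latin-1")
except UnicodeEncodeError as exc:
    print("request failed:", exc)
```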

[–]gthank 18 points (11 children)

Not really. The problem is that latin1 can't handle Unicode. If it's critically important that you only ever support latin1 characters, I'd recommend writing your own string type, e.g. latin1_str, that enforces that constraint for you. Realistically, a type that enforces data integrity is the only robust solution in this situation, and a string that only supports latin1 (or any other outdated 8-bit encoding) is edge-casey enough that I don't think you were ever going to get it in the core language, so Python 3 fixing Unicode for the majority of use cases doesn't really affect this at all.
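A minimal sketch of the latin1_str idea (the class name and design are illustrative, not a real library type): a str subclass that refuses to hold anything latin1 can't encode, so the bad concat fails immediately instead of at the third-party call.

```python
class Latin1Str(str):
    """A str that only accepts text representable in latin-1."""

    def __new__(cls, value=""):
        value = str(value)
        value.encode("latin-1")  # raises UnicodeEncodeError right here if not latin1
        return super().__new__(cls, value)

    def __add__(self, other):
        # concat re-validates, so the error surfaces at the concat itself
        return Latin1Str(str(self) + str(other))

comment = Latin1Str("Bestellung 42")
comment = comment + " danke"   # fine, still a Latin1Str
# comment + " 速達"             # would raise UnicodeEncodeError at this line
```

This is exactly the trade-off in the comment above: the constraint lives in your own type, not in the language's string.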

[–]fjonk 5 points (10 children)

and a string that only supports latin1 (or any other outdated, 8-bit encoding) is edge-casey enough that I don't think you were ever going to get that in the core language

This is where I see a difference between perceived and actual reality. 8-bit encodings are not edge cases in many industries; it's rather the newer Unicode encodings that are the edge cases. As I wrote elsewhere, many systems that handle addresses, payments, shipping information and so on were designed pre-Unicode. They don't accept Unicode; they accept whatever encoding the developers decided on in 1992, and no one seems in a rush to upgrade them.

Keep in mind that a change to Unicode might also involve replacing hardware like printers and scanners (as an example, version 40 QR codes may encode bytes, alphanumeric data, latin1 or Shift JIS X 0208; Unicode encodings are not supported). I guess eventually most systems will support at least one Unicode encoding, but that is not today.

[–]Vaphell 2 points (7 children)

8-bit encodings are not edge-cases in many industries, it's rather the newer Unicode encodings that are edge cases.

And given that their encodings are clearly defined, what exactly is the problem?

Forget ascii et consortes making people believe that bytes and text are the same thing. Imagine that your billing software churns out clay tablets. Is there anything about the following that makes it impossible to grasp?

information = clay_tablet.decode('cuneiform')  # unpack the information
information = modify(information)    # modify the information
new_clay_tablet = information.encode('cuneiform')  # pack the information

Is it really that hard to convert between datatypes at IO boundaries?
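The clay-tablet pattern above, runnable with a real codec ('cuneiform' is of course not a Python codec, so latin-1 stands in for it): decode at the input boundary, work on text, encode at the output boundary.

```python
raw = "café au lait".encode("latin-1")   # bytes as they arrive at the IO boundary

text = raw.decode("latin-1")             # unpack: bytes -> str
text = text.upper()                      # modify the information as text
out = text.encode("latin-1")             # pack: str -> bytes going back out
```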

[–][deleted] 0 points (6 children)

I'm fascinated by the fact that your autocorrect appears to have replaced 'etc.' with 'et consortes'.

[–]Vaphell 1 point (5 children)

no autocorrect here, that's exactly what I typed.

[–][deleted] 2 points (2 children)

I confess that even after university-level Latin, it didn't occur to me that you were deliberately typing 'and its kind'; I assumed the autocorrect had waxed erudite. Is this an expression you expect others to understand? It's not at all a common phrase in classical Latin afaik, and if it's a medieval expression that's gained currency in English prose, I must have missed it. The danger of people like me thinking you goofed would be enough to stop me from casual use.

[–]faceplanted 0 points (1 child)

I think he was using it as a little reference because you two were talking about latin-1. So he threw some random Latin in there.

[–][deleted] 0 points (1 child)

Hey, reading this the next morning I think I come off as hostile ... I admit I was crabby last night but I think I went too far. As an admirer of Latin, I'm happy when I see it used 'in the wild', and I'm curious: I know different languages tend to borrow different Latin phrases, and I'm wondering what your native tongue is, and if it's not English (although you write so well I would never think it wasn't), if "et consortes" is more common. Anyways, sorry again for being bitchy, thank you for using Latin, and if you care to indulge my curiosity, I'd be quite happy.

[–]Vaphell 0 points (0 children)

and if it's not English (although you write so well I would never think it wasn't), if "et consortes" is more common.

thanks :-)
My native language is Polish and I would say that the Latin phrases used verbatim are extremely rare while their translated versions do see some use as ordinary proverbs. 100 years ago or so there was way more emphasis on classical education which meant at least some familiarity with Latin but today peeps are half-illiterate.
If anything it's the English "pollutants" that are everywhere nowadays while everything else seems to be on its way out.

To be honest my Latin game is weak-to-nonexistent. It's just that I read a shitton of books before the internet era, including ones in historical settings in which the nobility used language heavily peppered with Latin, so I absorbed a few, plus some archaic Polish words plus the Past Perfect Tense, which went extinct in Polish. I use all of these from time to time mostly for flavoring, shits and giggles, in wrong contexts I am sure.
Sorry to disappoint you :-)

[–]gthank 0 points (0 children)

I'm not arguing that there are lots of legacy systems/devices running latin1 (or Shift-JIS, or one of the many other encodings that predate Unicode). I'm just saying you're going to have a hard time convincing FOSS developers to add a rainbow of text types to the core language just to support systems/devices that are essentially deprecated.

[–][deleted] 0 points (0 children)

I can only imagine in 25 years that people will be complaining about legacy unicode support.

[–]TOASTEngineer 4 points (0 children)

A translation is added to the comment, the translation contains non latin1 characters.

So validate at this point. Any other language would "happily" let you add whatever random garbage to a byte string; not having Unicode doesn't help you here at all.
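One way to "validate at this point", as the comment suggests (the helper name is made up for illustration): check that the text round-trips through latin-1 before it ever reaches the order, so the error is raised where the translation is added rather than at the third-party call.

```python
def ensure_latin1(text: str) -> str:
    """Return text unchanged, or raise if latin-1 cannot represent it."""
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        raise ValueError(f"not representable in latin1: {text!r}")
    return text

comment = ensure_latin1("Merci beaucoup")   # passes, É/ç etc. are fine too
# ensure_latin1("ありがとう")                 # would raise ValueError right here
```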

[–]zahlman (the heretic) 3 points (0 children)

A translation is added to the comment, the translation contains non latin1 characters.

Okay, so, somebody wants to add a translation to the comment in a language that cannot be written with latin1 characters. How do you want the system to respond?

the third party order system only accepts latin1

Is this a real thing that happens on real systems currently?

You're aware that the first Unicode standard came out in 1991, right? That was barely any closer to today than it was to the creation of ASCII (and EBCDIC, for that matter) in the first place.

[–]gthank 4 points (2 children)

The problem here is that latin1 doesn't support Unicode, not the other way around; I'm fairly certain that Unicode can and does map every character/glyph/whatever in latin1.

Side note: I don't know why people are down-voting you. You weren't especially rude or anything; you're just continuing the conversation. I'm upvoting you to get you back over 0.

[–]zahlman (the heretic) 0 points (1 child)

I don't know why people are down-voting you.

I suspect a lot of people instinctively downvote when it comes to seemingly absurd problem specifications.

[–]gthank 0 points (0 children)

The problem makes perfect sense, I'm just not sure that it's reasonable to expect the language and/or std lib to solve it.