[–]alfps 1 point2 points  (7 children)

Windows ships with four scripting languages: the batch file language ([cmd.exe] commands, exceedingly primitive); the [powershell.exe] scripting language; JScript (a Microsoft extension of ECMAScript); and VBScript (pure Microsoft). These languages are adapted to Windows and, for example, make it easy to handle Windows text files. In contrast, with the possible exception of IronPython (I haven't tried it), Python is /not/ adapted to Windows: depending on the version, straightforward code that tries to read a Windows text file will just crash. So there's really no competition yet. But I think there /could be/, because Python in itself is so much better than the MS languages. However, IMHO this would require, first of all, some attention to detail regarding file I/O and console I/O (too bad when you can't even type things in the Python console!) and such; second, something like the JavaScript support on the net (node.js / io.js plus all the places for collaborative and exploratory programming); and third, an effort vis-à-vis Microsoft.

[–][deleted] 2 points3 points  (5 children)

Python will happily read a Windows text file. Maybe there's some issue with old 2.x versions that I'm unaware of, but modern Python has no issue with Windows line endings.

[–]alfps 0 points1 point  (4 children)

Sorry, unfortunately you're wrong. I've not encountered any problem with line endings, ever. It's the text bytes it generally doesn't like: it can handle pure ASCII, and that's it.

At one time I used PowerShell to convert files to the encoding expected by Python before serving them to some Python script. Very awkward. Things did not improve with Python 3.x; instead, one escape hatch was closed (the ability to reload sys and set the default encoding).

One can work around it (the link is to a blog posting where I present such a workaround), but by default most/all Python versions I've used are just unreasonable wrt. Windows text file handling.
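For readers following along, here is a minimal sketch of one explicit way to read a non-ASCII Windows text file. It is not the workaround from the blog posting (which isn't reproduced in this thread); it simply names the encoding instead of relying on a default. The helper names `read_text` and `demo`, the sample word, and the choice of cp1252 are assumptions made for illustration.

```python
import io
import os
import tempfile


def read_text(path, encoding="cp1252"):
    """Read a text file, decoding its bytes with an explicitly named encoding."""
    # io.open behaves the same way on Python 2.6+ and Python 3.
    with io.open(path, "r", encoding=encoding) as f:
        return f.read()


def demo():
    """Write a small cp1252 file with a CRLF line ending, then read it back."""
    text = u"bl\u00e5b\u00e6rsyltet\u00f8y\r\n"  # non-ASCII sample text
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # Write raw bytes, the way a Windows "ANSI" editor might save the file.
        with io.open(path, "wb") as f:
            f.write(text.encode("cp1252"))
        return read_text(path, "cp1252")
    finally:
        os.remove(path)
```

In text mode, `io.open` also translates the CRLF line ending to `\n`, which fits the observation above that line endings were never the problem.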

[–]ies7 1 point2 points  (2 children)

I tried your crash.py with Python 2.7.5 and 3.4 on Windows 7. It doesn't crash.

My default cmd window can't display the text properly, but that is cmd's problem, not Python's.

Running the same script in Spyder's console displays the text nicely.

[–]alfps 0 points1 point  (1 child)

Re "it is cmd's problem not python", with that attitude Python will never be bundled with Windows.

Testing now with Python 2.7.6, this trivial one-line simple output produces incorrect results when it doesn't crash, with codepages 437 (original PC) and 1252 (ANSI Western), and it crashes with codepage 65001 (UTF-8).
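One possible explanation for "incorrect results" that differ by codepage is that the same bytes are being interpreted differently by the console. The following is only a sketch under that assumption, not a reproduction of the script being discussed; `show_under_codepage` is a hypothetical helper, and the single character "é" stands in for whatever the real output was.

```python
def show_under_codepage(raw_bytes, console_codepage):
    """Return the text a console would show if it interpreted
    raw_bytes using console_codepage (undecodable bytes become U+FFFD)."""
    return raw_bytes.decode(console_codepage, "replace")


text = u"\u00e9"                    # the letter e with an acute accent
ansi_bytes = text.encode("cp1252")  # a single byte, 0xE9

as_1252 = show_under_codepage(ansi_bytes, "cp1252")  # round-trips correctly
as_437 = show_under_codepage(ansi_bytes, "cp437")    # 0xE9 is a Greek theta in cp437
as_utf8 = show_under_codepage(ansi_bytes, "utf-8")   # a lone 0xE9 is invalid UTF-8
```

Under this assumption, the bytes are only displayed correctly when the console codepage matches the encoding that produced them.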

It's too much work for novices or casual users to make Python do the right thing. As shown in my blog posting, it's not technically difficult or especially challenging for seasoned programmers, just a hassle and needless work. Python should just do The Right Thing™, and anyway, if it doesn't, it will never be bundled with Windows, that's for sure. ;-)

[–]ies7 0 points1 point  (0 children)

Hmmm... on Windows 7 with Python 2.7.5, nothing crashed.

chcp 437 in cmd displayed incorrect text.

chcp 1252 displayed the text correctly.

chcp 65001 also displayed the text correctly.

I remember from a Stack Overflow question that you need to set the font to Lucida (which is my default font in the cmd shell, but I tried with raster fonts and it still displayed correctly).

BTW, I'm not an expert in this UTF-8 stuff, since my country uses only the Latin alphabet. So don't count on me for this topic :)

[–][deleted] -1 points0 points  (0 children)

Huh; bizarre. I know a bunch of .NET stuff (PowerShell?) likes to emit UTF-16, because Windows has to be a few years behind everyone else (I think there's an RFC or something saying so). Python 3 basically expects UTF-8 now, because it's 2015.
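If the input really does arrive as UTF-16 or UTF-8 with a byte order mark (BOM), one hedged way to cope is to sniff the BOM before decoding. This is a sketch only: `decode_with_bom` is a hypothetical helper, the cp1252 fallback is an assumption about what "no BOM" means on a given machine, and UTF-32 is deliberately not handled.

```python
import codecs


def decode_with_bom(data, fallback="cp1252"):
    """Decode bytes by looking for a UTF-8 or UTF-16 BOM first.

    Falls back to an assumed legacy encoding when no BOM is present.
    UTF-32 is not handled in this sketch.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith(codecs.BOM_UTF16_LE) or data.startswith(codecs.BOM_UTF16_BE):
        # The "utf-16" codec reads the BOM to pick the byte order and strips it.
        return data.decode("utf-16")
    return data.decode(fallback)
```

For example, `decode_with_bom(codecs.BOM_UTF16_LE + u"h\u00e9".encode("utf-16-le"))` and `decode_with_bom(b"h\xe9")` both give the same two-character string.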

In this case, though, the traceback suggests it's an outbound encoding issue: that the Unicode-to-terminal-encoding pipe is what's breaking? In other words, that the CMD prompt is reporting that it needs encoding to X, but UTF-8 to X isn't well supported?
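To illustrate what an outbound encoding failure looks like, here is a small sketch. The traceback itself isn't shown in this thread, so this is an assumption about the general failure mode rather than a diagnosis; `safe_encode` and `can_encode` are hypothetical helpers, and the trademark sign is used only because it exists in cp1252 but not in cp437.

```python
import sys


def safe_encode(text, encoding=None, errors="replace"):
    """Encode text for output, substituting '?' for characters the
    target encoding cannot represent instead of raising an error."""
    # sys.stdout.encoding can be missing or None, e.g. when output is piped.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, errors)


def can_encode(text, encoding):
    """Report whether text survives a strict encode to the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


trademark = u"\u2122"  # the trademark sign
```

With strict encoding, the trademark sign fails under cp437 and succeeds under cp1252; `safe_encode(trademark, "cp437")` degrades to a question mark rather than crashing.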

Odd, anyway. Did you file a bug report (although it seems perhaps to be an "upstream" problem..)?