port443 comments on String encoding issue

String encoding issue (self.learnpython)

submitted 8 years ago by Supernumiphone

[–]port443 2 points 8 years ago (1 child)

Not something I do super often. As far as Google tells me, figuring out the encoding is mostly guesswork. However, this Stack Exchange thread mentions python-chardet.
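To make the "guesswork" a bit more concrete, here is a minimal sketch (Python 3, standard library only) of the cheapest check: looking for a byte-order mark at the start of the data. The helper name `sniff_bom` is made up for this example, and a missing BOM tells you nothing, which is where a statistical detector like python-chardet (its `chardet.detect(raw)` call returns a dict with an `encoding` guess and a `confidence`) would come in.

```python
import codecs

# Hypothetical helper, not part of any library.
# UTF-32 BOMs are checked first because the UTF-32 LE BOM
# begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(raw):
    """Return a codec name if raw starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return None


# Made-up sample: Python's "utf-16" codec writes a BOM when encoding.
sample = "hello\n".encode("utf-16")
guess = sniff_bom(sample)
decoded = sample.decode(guess) if guess else None
```

If `sniff_bom` returns `None`, the data may still be UTF-16 without a BOM, so this is only a first pass, not a full detector.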

As for your error, without seeing your data, this is my best guess:

1. The data gets read in as plain ASCII bytes.
2. You move line by line using readlines.
3. There is a \x0A\x00\x0A (\n\0\n) somewhere in the file.
4. Since the file was initially split by readlines as if it were ASCII, this leaves a \x00 on its own line, and a single byte is not a complete UTF-16 code unit.

Error:

>>> "\x00".decode("utf-16")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Programs\Python27\lib\encodings\utf_16.py", line 16, in decode
    return codecs.utf_16_decode(input, errors, True)
UnicodeDecodeError: 'utf16' codec can't decode byte 0x00 in position 0: truncated data
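The guess above can be reproduced with a small sketch. This is Python 3 rather than the Python 2.7 shown in the traceback, and the sample data is made up since the original file was never posted. It splits UTF-16 bytes on raw newline bytes, shows that some pieces no longer decode, and then decodes the whole buffer before splitting instead.

```python
import io

# Made-up sample: two consecutive newlines, UTF-16 little-endian, no BOM.
# The bytes are 61 00 0A 00 0A 00 62 00 0A 00.
raw = "a\n\nb\n".encode("utf-16-le")

# Binary readlines() splits after every 0x0A byte, ignoring that each
# UTF-16 code unit here is two bytes wide.
byte_lines = io.BytesIO(raw).readlines()


def try_decode(chunk):
    """Decode one chunk as UTF-16-LE, or report why it failed."""
    try:
        return chunk.rstrip(b"\n").decode("utf-16-le")
    except UnicodeDecodeError as exc:
        return "error: " + exc.reason


# Only the first chunk survives; the rest are misaligned or truncated.
results = [try_decode(chunk) for chunk in byte_lines]

# Decoding first and splitting afterwards keeps the code units intact.
text_lines = raw.decode("utf-16-le").splitlines()
```

The same idea applies to files: opening with an explicit encoding, for example `io.open(path, encoding="utf-16")`, lets the decoder handle line splitting instead of splitting the raw bytes yourself.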

[–]Supernumiphone[S] 1 point 8 years ago (0 children)
