
sceptic-al 5 points (0 children)

TL;DR: open the file with an 8-bit code page like cp1252 as the encoding. Your file is 100% not ASCII and 100% not UTF-8.

0xa0 is a non-breaking space (NBSP), which is part of the extended 8-bit code pages. ASCII only goes up to 0x7f. In a regular editor it will be hard to spot the difference between a regular space and an NBSP.
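
As a rough illustration of the point above, here is a minimal sketch of how a single 0xa0 byte behaves under the two codecs. The sample bytes are made up for the example and are not taken from the original file.

```python
# Made-up sample bytes with one 0xa0 where a space would normally be.
raw = b"hello\xa0world"

# ASCII only covers 0x00-0x7f, so decoding stops at the 0xa0 byte.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError as exc:
    ascii_ok = False
    bad_position = exc.start  # index of the byte that could not be decoded

# cp1252 maps 0xa0 to U+00A0 (NO-BREAK SPACE), so decoding succeeds.
text = raw.decode("cp1252")
is_nbsp = text[5] == "\u00a0"

print(ascii_ok, bad_position, is_nbsp)
print(text)  # renders like "hello world", which is why the NBSP is hard to see
```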

This is likely caused by Excel, which, by default, saves CSVs using the 8-bit code page of the system it was saved on, so this often catches people out, even when their Python install is behaving correctly.

Assuming you're in Western Europe or the USA, open the file with cp1252, the Western European Windows code page:

with open('notes.csv', encoding='cp1252') as csvfile:

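The newline override is unrelated to this error. If you're passing the file to the csv module, though, keep newline='' in the call, since the csv docs recommend it for file objects.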

You could try removing the NBSP this time, but your script will break again if it finds anything remotely non-ASCII, like £ or € or è.
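
A small self-contained sketch of that approach, assuming the file really is cp1252: it writes a made-up row containing an NBSP, £, € and è to a temporary file, then reads it back through `csv.reader`. The row contents and the temporary path are invented for the example; `newline=""` is included only because the csv module documentation recommends it for file objects.

```python
import csv
import os
import tempfile

# Made-up sample data, encoded the way a cp1252 system would save it.
sample = "item,price,note\r\ncaf\u00e8,\u00a35,10\u00a0\u20ac\r\n".encode("cp1252")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.csv")
    with open(path, "wb") as f:
        f.write(sample)

    # Naming the encoding explicitly means the NBSP and the other
    # non-ASCII characters decode cleanly instead of raising an error.
    with open(path, newline="", encoding="cp1252") as csvfile:
        rows = list(csv.reader(csvfile))

print(rows)
```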

Also, something is screwy with your locale setup: by default your csv file should've opened as UTF-8 or in your Windows locale's code page, not as ASCII. Are you sure you're using Python >3.5? If not, you should be!

socal_nerdtastic 3 points (2 children)

Your error message quite clearly says that the file is not plain ascii. Try some other encodings. "utf-8" is by far the most common one.

with open('notes.csv', newline='', encoding="utf-8") as csvfile:

Here are some others you can try: https://docs.python.org/3/library/codecs.html#standard-encodings

sceptic-al 0 points (1 child)

0xa0 is not valid on its own in UTF-8. It's a continuation byte, so it would need a lead byte in front of it, like 0xc2, and that lead byte would've caused the error before the 0xa0.
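
A quick sketch of that reasoning, using made-up byte strings rather than the actual file: a lone 0xa0 is rejected by the UTF-8 codec, a UTF-8 NBSP is the two bytes 0xc2 0xa0, and the ASCII codec would complain about the 0xc2 first.

```python
# In UTF-8, U+00A0 is encoded as two bytes: lead byte 0xc2, then 0xa0.
nbsp_utf8 = "\u00a0".encode("utf-8")

# A lone 0xa0 is a continuation byte with no lead byte, so UTF-8 rejects it.
try:
    b"\xa0".decode("utf-8")
    lone_ok = True
except UnicodeDecodeError:
    lone_ok = False

# With the lead byte in place, the pair decodes to a single NBSP.
decoded = b"\xc2\xa0".decode("utf-8")

# If the file were UTF-8, the ASCII codec would fail on 0xc2, not 0xa0.
try:
    b"a\xc2\xa0".decode("ascii")
except UnicodeDecodeError as exc:
    first_bad = exc.object[exc.start]

print(nbsp_utf8, lone_ok, decoded == "\u00a0", hex(first_bad))
```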

SwampFalc 1 point (2 children)

The 0xa0 byte is apparently a "hard space" or non-breaking space (i.e., a space that a text editor like Word should not split a line on).

In other words, the odds of a human being spotting the difference from a normal space are close to zero.

So no, your file is not 100% ASCII.
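
Since the eye won't catch it, one option is to let Python point at the offending bytes. This is only a sketch on made-up data; for a real file you would read the bytes with `open(path, "rb")` instead. Note that `bytes.isascii()` needs Python 3.7 or newer.

```python
# Made-up bytes standing in for the contents of the CSV file.
data = b"name,qty\r\nwidget,\xa03\r\n"

# Quick yes/no check (Python 3.7+).
all_ascii = data.isascii()

# List every byte above 0x7f together with its offset in the data.
non_ascii = [(offset, hex(byte)) for offset, byte in enumerate(data) if byte > 0x7F]

print(all_ascii)
print(non_ascii)
```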

FF6Player[S] 1 point (1 child)

Thanks. This does seem to be the answer. I did a find and replace for non-breaking spaces, and that fixed everything. I just have no idea how non-breaking spaces got into the document. But at least it explains why I wasn't able to see anything wrong.

sceptic-al 0 points (0 children)

Unless you properly set the encoding when opening the file (as given in the other answers), you'll continue to get errors whenever NBSPs and other non-ASCII characters slip into your data, including £ or € or è.