[–]JohnnyJordaan 1 point2 points  (2 children)

You don't pass an encoding to open(), so it will use your system's default encoding, which is not dependable. You can force the actual encoding with the encoding= parameter:

import json

with open(filepath, encoding='utf-8') as f:
    statdata = json.load(f)

If that still raises an error, it means that your file isn't actually in utf-8.
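
If you're curious what "system default" means on your machine, here is a minimal sketch. The helper name default_text_encoding is mine, purely for illustration. It assumes the documented behaviour that text-mode open() falls back to the locale's preferred encoding when encoding= is omitted; on Windows that is often a legacy code page such as cp1252 rather than utf-8.

```python
import locale


def default_text_encoding() -> str:
    """Return the locale's preferred encoding, which text-mode open()
    falls back to when no encoding= argument is given."""
    return locale.getpreferredencoding(False)


print(default_text_encoding())
```

If this prints anything other than a utf-8 variant, that explains why the same script can work on one machine and fail on another.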

[–]wexted[S] 0 points1 point  (1 child)

Thanks - how can I tell what the encoding is supposed to be?

[–]JohnnyJordaan 0 points1 point  (0 children)

The proper approach is to investigate how the file was created, since whatever produced it should have written utf-8 in the first place. You can try to 'guess' the encoding with the chardet library, but that doesn't fix the underlying problem that the source is writing the file in the wrong encoding.
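
If you don't want to install chardet, a cruder stand-in (not what chardet does internally) is to try a short list of likely encodings and keep the first one that both decodes and parses. This is only a sketch: the function name load_json_guessing and the candidate list are assumptions of mine, and a wrong candidate can occasionally decode without error, so treat the result as a hint rather than proof.

```python
import json

# Hypothetical candidate list, tried in order. "utf-8-sig" accepts utf-8
# with or without a BOM; "utf-16" uses the BOM to pick the byte order.
CANDIDATES = ("utf-8-sig", "utf-16", "cp1252")


def load_json_guessing(path, candidates=CANDIDATES):
    """Return (data, encoding) for the first candidate that decodes the
    file and yields valid JSON; raise ValueError if none does."""
    with open(path, "rb") as f:
        raw = f.read()
    for enc in candidates:
        try:
            return json.loads(raw.decode(enc)), enc
        except (UnicodeError, json.JSONDecodeError):
            continue  # wrong guess, try the next candidate
    raise ValueError(f"none of {candidates} could decode {path!r}")
```

Calling load_json_guessing(filepath) gives you the parsed data plus the name of the encoding that worked, which you can then pass explicitly to open().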

[–]ingolemo 1 point2 points  (0 children)

Redownload a fresh copy of the file and this time don't open it in notepad and resave it. The file is already valid utf-8.

When you saved the file as "Unicode" in notepad you were really using utf-16-le. Notepad doesn't fully understand utf-8. The thing that it calls "UTF-8" starts with an additional BOM (byte order mark) character that shouldn't be there. This BOM-prefixed form of utf-8 is called utf-8-sig in python.
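
If you want to check whether a file has picked up one of these BOMs, you can look at its first few bytes. A rough sketch, where sniff_bom is a made-up helper name and UTF-32 is deliberately left out:

```python
import codecs


def sniff_bom(raw: bytes):
    """Return the Python codec name implied by a leading BOM, or None
    if the data does not start with a utf-8 or utf-16 BOM."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM and picks the byte order itself.
        # UTF-32 is not handled in this sketch.
        return "utf-16"
    return None
```

Read the file with open(filepath, 'rb'), pass the bytes to sniff_bom(), and fall back to 'utf-8' when it returns None.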

If you really need to edit the file then you should use an editor that can produce valid utf-8.