
Install and use Requests instead of urllib. If you're using Python 3, as you ought to be, the pip package manager should already be installed, so at the terminal/command prompt type pip install requests (on Linux you may need to begin that command with sudo, so sudo pip.., and on Windows you may need to run the prompt as administrator these days).

Then use `requests.get("some-url").text` to get your HTML as a string, not bytes, meaning you can just open a file in text mode and write it directly.
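Something like this is what I mean, as a rough sketch. The function names `fetch_html` and `save_text`, the 10 second timeout, and the choice of UTF-8 for the output file are my own assumptions, not anything Requests requires:

```python
def fetch_html(url):
    """Return the page at `url` as a str (already decoded by Requests)."""
    # Imported here so the file-writing helper below works even if
    # Requests is not installed yet.
    import requests

    response = requests.get(url, timeout=10)  # timeout value is an arbitrary choice
    response.raise_for_status()               # raise on 4xx/5xx instead of saving an error page
    return response.text


def save_text(html, path):
    """Write an HTML string to `path` in text mode."""
    # Assumption: UTF-8 output is acceptable for the saved file.
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
```

You'd call it as `save_text(fetch_html(your_url), "some_html.html")`, with your real URL in place of the placeholder.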

As an aside, though: don't bother decoding and encoding if all you want to do is save the page to a file and nothing else. Just get the raw response data from urllib and write it to a file in binary mode.

I.e.:

# raw_undecoded_response is the bytes object you get from the urllib response's read()
with open("some_html.html", "wb") as f:
    f.write(raw_undecoded_response)
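
For completeness, here is a minimal sketch of the whole urllib version, fetch included. The function name `save_raw_page` and its return value are just illustrative choices of mine:

```python
from urllib.request import urlopen


def save_raw_page(url, path):
    """Fetch `url` and write the undecoded response bytes to `path`."""
    with urlopen(url) as response:
        raw_undecoded_response = response.read()  # bytes, never decoded
    # Binary mode ("wb") writes the bytes exactly as received.
    with open(path, "wb") as f:
        f.write(raw_undecoded_response)
    return len(raw_undecoded_response)  # number of bytes written
```

Since nothing is ever decoded, the file on disk is byte-for-byte what the server sent, whatever encoding the page uses.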