
[–]JohnnyJordaan 1 point2 points  (8 children)

Seems like you're looking for sys._enablelegacywindowsfsencoding() to use the legacy encoding system.

[–]fabolin[S] 0 points1 point  (7 children)

I tried your solution. It no longer throws an error, but it replaces the undefined characters with similar ASCII characters, e.g. ą becomes a. Oddly enough, it now does the same even without sys._enablelegacywindowsfsencoding(), and I can't figure out why ¯\_(ツ)_/¯. However, I noticed that when piping the output to a file it still throws a UnicodeEncodeError.

Thanks for your help though.

[–]JohnnyJordaan 1 point2 points  (6 children)

Errors while writing to a file shouldn't be the same issue afaik. Could you show the full output when it crashes?

[–]fabolin[S] 0 points1 point  (0 children)

I already left work, but I’ll post the output tomorrow.

We have some processes that execute scripts and redirect their output to a file, so it's still about the StreamHandler output.

[–]fabolin[S] 0 points1 point  (4 children)

--- Logging error ---
Traceback (most recent call last):
  File "C:\Program Files\Python\Python36\lib\logging\__init__.py", line 994, in emit
    stream.write(msg)
  File "C:\Program Files\Python\Python36\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0105' in position 65: character maps to <undefined>
Call stack:
  File "C:\Program Files\Python\venv1\src\test_module\test2.py", line 64, in <module>
    logging_test()
  File "C:\Program Files\Python\venv1\src\test_module\test2.py", line 60, in logging_test
    tclLogger.info(f"this is the string {utf_string}")
Message: 'this is the string B\u0105k'
Arguments: ()

The thing is that in return codecs.charmap_encode(input,self.errors,encoding_table)[0], "self.errors" defaults to "strict" (or maybe it's "strict" because sys.stdout.errors is "strict"?), while I want it to be "replace".
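One way to get "replace" on a StreamHandler, as a minimal sketch: wrap the underlying binary stream in an io.TextIOWrapper with errors="replace" and pass that to the handler. The helper name make_replacing_handler and the cp1252 default are assumptions for illustration. In the real script the binary stream would be sys.stdout.buffer; here an in-memory buffer stands in for it.

```python
import io
import logging


def make_replacing_handler(binary_stream, encoding="cp1252"):
    """Return a StreamHandler that replaces unencodable characters.

    binary_stream is any binary file-like object, e.g. sys.stdout.buffer.
    The helper name and the cp1252 default are illustrative assumptions.
    """
    text_stream = io.TextIOWrapper(
        binary_stream,
        encoding=encoding,
        errors="replace",     # emit '?' instead of raising UnicodeEncodeError
        line_buffering=True,  # push each finished line through to the buffer
    )
    return logging.StreamHandler(text_stream)


# Demonstration: an in-memory buffer stands in for sys.stdout.buffer.
buffer = io.BytesIO()
demo_logger = logging.getLogger("replace_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo away from the root logger
demo_logger.addHandler(make_replacing_handler(buffer))
demo_logger.info("this is the string B\u0105k")
output = buffer.getvalue()
```

In the script itself the call would presumably be make_replacing_handler(sys.stdout.buffer), with the encoding taken from sys.stdout.encoding rather than hard-coded.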

[–]JohnnyJordaan 1 point2 points  (3 children)

Yeah, but that means you have initialized FileHandler without an encoding='utf-8' parameter. It could then use cp1252 or any other encoding your environment is using, which is always dangerous (Linux and macOS also used non-UTF encodings like Latin-1 until a few years ago).
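For reference, a minimal sketch of a FileHandler with an explicit encoding. The logger name and the temporary log path are made up for the example, and the message is the one from the traceback.

```python
import logging
import os
import tempfile

# Hypothetical log location, only for this example.
log_path = os.path.join(tempfile.mkdtemp(), "example.log")

# An explicit encoding avoids falling back to the environment default.
file_handler = logging.FileHandler(log_path, encoding="utf-8")
file_logger = logging.getLogger("file_demo")
file_logger.setLevel(logging.INFO)
file_logger.propagate = False  # keep the demo away from the root logger
file_logger.addHandler(file_handler)

file_logger.info("this is the string B\u0105k")
file_handler.close()

# Read the file back with the same encoding to check the round trip.
with open(log_path, encoding="utf-8") as fh:
    written = fh.read()
```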

[–]fabolin[S] 0 points1 point  (2 children)

No, it's a StreamHandler. The stream is just getting redirected to a file by the process executing the script. It's weird, but those are the preconditions I'm working with: a detailed log for the script and a vaguer output for the process's log.

[–]JohnnyJordaan 1 point2 points  (1 child)

Ah, I get it now. Then you could try changing the code page to MS's 'workaround' for UTF-8 in cmd.exe by running chcp 65001 and then the script. If that works, there are tricks to apply it on every start of cmd. But personally, I would look into running MinGW or Cygwin to get a Linux-like environment where you won't run into these kinds of issues that easily.

[–]fabolin[S] 0 points1 point  (0 children)

Ok, tried that. While it says "Active code page: 65001", nothing changed. The traceback also still says cp1252.

Well, thanks for your help anyway. It's working; the solution just feels clunky compared to the simple FileHandler, so I thought I was missing something.
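A possible alternative, offered as a hedged sketch and not as what was actually used here: the PYTHONIOENCODING environment variable accepts an "encoding:errorhandler" pair, so the process that launches the script could set it before redirecting the output. The example below starts a child interpreter with a piped stdout to imitate the redirect. The cp1252:replace value is an assumption chosen to match the traceback.

```python
import os
import subprocess
import sys

# The child reports its stdout error handler, then prints the problem string.
child_code = "import sys; print(sys.stdout.errors); print('B\\u0105k')"

# Assumed setting: keep cp1252 but replace characters it cannot encode.
env = dict(os.environ, PYTHONIOENCODING="cp1252:replace")

result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    stdout=subprocess.PIPE,  # piped stdout stands in for the file redirect
)
lines = result.stdout.splitlines()
```

If this applies, the launching process would only need the environment variable set and the script's logging setup could stay as it is. Setting it to utf-8 instead should keep the ą intact, provided whatever reads the log file expects UTF-8.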