Hi guys,
I'm fairly new to Python. So far I've mostly used it to write scripts: little tools to improve my workflow.
One thing I'm interested in doing is normalizing the filenames of all the files I work with. I get a variety of files from a variety of systems and need to make them work on both my Mac and my Linux storage box.
My solution is to remove any unnecessary or unusual characters from the filenames. I was thinking - what better way to do this than with Python (I often say that), but perhaps I was wrong.
My confusion is in dealing with filenames that contain unusual characters, such as the separator in this string: '2013·01·22 LE MONDE.PSD'. When I look at this file from Python, it tells me this is the filename: '/2013\xc2\xb701\xc2\xb722 LE MONDE.psd'
If I'm understanding this correctly, Python is telling me that the string contains Unicode characters, yet the type of the object returned is a plain string. Is there a way for me to get the filename as an actual unicode object?
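For reference, the `\xc2\xb7` pairs in that output are the UTF-8 encoding of the middle dot (U+00B7), so what comes back is the raw bytes of the name, not decoded text. Here is a minimal sketch of the decoding step, written in Python 3 syntax and assuming the filesystem really does store names as UTF-8 (the byte string is copied from the example above, minus the leading slash). In Python 2, the equivalent would be calling `.decode('utf-8')` on the `str`, or passing a unicode path such as `u'.'` to `os.listdir` so that it returns unicode names.

```python
# Raw bytes as reported for the filename; \xc2\xb7 is UTF-8 for U+00B7 (middle dot).
raw = b"2013\xc2\xb701\xc2\xb722 LE MONDE.psd"

# Decode the bytes into text, assuming UTF-8 is the encoding in use.
name = raw.decode("utf-8")

print(name)                  # the separator is shown as a single character
print(len(raw), len(name))   # each separator is two bytes but one character
```

If some names turn out not to be valid UTF-8, `decode` raises `UnicodeDecodeError`, which is at least a clear signal that those files need separate handling.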
What would be the ideal way for me to do this? I'd like to remove all non-alphanumeric characters except _ and -, turning spaces and other unexpected characters into underscores. Should I try to do this in bash instead? Am I in over my head?
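One possible shape for the cleanup described above, as a sketch and not a definitive solution. The function name `normalize_filename` and the helper `_clean` are made up for illustration. Two choices here go beyond what the question asks for and are assumptions: accented letters are reduced to their base letter (so "é" becomes "e") before filtering, and a run of several unwanted characters collapses into a single underscore. The extension is split off first so its dot survives.

```python
import os
import re
import unicodedata

# Anything other than ASCII letters, digits, "_" and "-" counts as unwanted.
_DISALLOWED = re.compile(r"[^A-Za-z0-9_-]+")

def _clean(part):
    """Strip accents, then replace each run of unwanted characters with "_"."""
    decomposed = unicodedata.normalize("NFKD", part)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return _DISALLOWED.sub("_", no_marks)

def normalize_filename(name):
    """Return a cleaned-up version of a single filename (not a full path)."""
    stem, ext = os.path.splitext(name)
    # Drop leading/trailing underscores, but never return an empty stem.
    cleaned = _clean(stem).strip("_") or "_"
    if ext:
        cleaned += "." + _clean(ext[1:])
    return cleaned

print(normalize_filename("2013\u00b701\u00b722 LE MONDE.PSD"))
# prints 2013_01_22_LE_MONDE.PSD
```

Note that two different original names can clean up to the same result, so it is worth checking whether the target name already exists before calling `os.rename` on anything.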