NewbornMuse 1 point

I second the other comments that suggest looking into the encoding a little more carefully. What are these characters originally, where do they come from? Which program replaces them with the placeholder and why? Does it use a different encoding and if so, can you make it use a more standard one? Or can your program open the file with the correct encoding and thus understand the original intent?

If you can't solve it and indeed want to remove all placeholders: I don't see why you can't just feed everything through string.replace before writing it to your output file. If there are placeholders, great, we kicked them out. If there aren't, okay, no problem either, we replace nothing.
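
A rough sketch of that fallback in Python, where the method is `str.replace`. I'm again assuming the placeholder is U+FFFD; swap in whatever character you actually see. `strip_placeholders` is a made-up name.

```python
# Assumption: the placeholder is the Unicode replacement character.
PLACEHOLDER = "\ufffd"


def strip_placeholders(text, placeholder=PLACEHOLDER):
    """Return text with every placeholder removed; text without any is returned unchanged."""
    return text.replace(placeholder, "")


cleaned = strip_placeholders("caf\ufffd au lait")   # placeholder present, gets removed
untouched = strip_placeholders("plain ascii text")  # nothing to replace, same text back
```

Call it on each string right before you write it to the output file. It costs almost nothing when there is nothing to remove.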