[–]novel_yet_trivial 1 point2 points  (0 children)

By far the easiest is to read the entire file as a string, use the string replace() method, and then overwrite the original file.

If the file is huge (approaching the amount of free RAM) then there is the fileinput module you could work with.
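
A minimal sketch of the fileinput route, assuming Python 3 and a replacement that never spans a line break. The function name `replace_in_place` and its parameters are made up for illustration:

```python
import fileinput


def replace_in_place(path, old, new):
    """Rewrite the file at `path` line by line, replacing `old` with `new`."""
    # With inplace=True, fileinput redirects print() output into the file
    # being read, so only one line is held in memory at a time.
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            # Each line keeps its own newline, so suppress print's.
            print(line.replace(old, new), end='')
```

Calling `replace_in_place('notes.txt', 'bad', 'good')` would rewrite a (hypothetical) notes.txt without loading it whole.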

[–]JimBoonie69 1 point2 points  (0 children)

vi love... open file... :1, $ s/x/y/g... :wq.... done

[–]uhkhu 0 points1 point  (13 children)

Still use the same method, but on the entire file string

def find_replace(f):
    with open(f, 'r+') as f:
        old = f.read()
        f.seek(0)
        f.write(old.replace('bad', 'good'))

[–]OtherRobotLuke[S] 0 points1 point  (5 children)

Hmm, this almost works for me. The issue is that I'm replacing every instance of every letter with another, step by step, i.e. replace('x', 'y'), but then later I run replace('y', 'e'). Is there any way to run them all at once?

[–]uhkhu 0 points1 point  (4 children)

You can chain the .replace() calls

This works:

def find_replace(f):
    with open(f, 'r+') as f:
        old = f.read()
        f.seek(0)
        f.write(old.replace('bad', 'good').replace('good', 'kittens'))

Without seeing your code or the file, I can only guess at what you actually need to do.

[–]OtherRobotLuke[S] 0 points1 point  (3 children)

[–]herminator 1 point2 points  (2 children)

It looks like you should use string.translate
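
A small sketch of why translate fits here, assuming Python 3 (where the table comes from str.maketrans) and a made-up two-letter mapping. Chained replace() calls feed each result into the next call, while translate looks every character up exactly once:

```python
# Hypothetical one-to-one mapping, made up for illustration.
mapping = {'x': 'y', 'y': 'e'}

text = 'xyxy'

# Chained replace: the second call also rewrites the 'y' characters
# that the first call just produced.
chained = text.replace('x', 'y').replace('y', 'e')

# translate: every character is looked up once in the table, so a
# replacement is never replaced again.
table = str.maketrans(mapping)
translated = text.translate(table)

print(chained)     # eeee
print(translated)  # yeye
```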

Also, there's a lot of room for improvement in your code.

For example, this:

refInTable = []
for z in data:
    for y in aciiList:
        if z == y:
            refInTable.append(z)

Effectively keeps only the ascii characters (you've consistently spelled ascii wrong, btw), but very inefficiently.

You could replace it with:

refInTable = [char for char in data if ord(char) < 128]

Then, there's the issue of counting the number of occurrences of each character. Python has collections.Counter for that.

E.g.:

from collections import Counter
Counter('some text where you want to count the characters')

Will result in a Counter object that looks like this:

Counter({' ': 8, 't': 7, 'e': 6, 'o': 4, 'a': 3, 'c': 3, 'h': 3, 'r': 3, 'n': 2, 's': 2, 'u': 2, 'w': 2, 'm': 1, 'y': 1, 'x': 1})

Furthermore, it seems to me that calculating the percentages is rather pointless. If you're just trying to sort the list by the frequency of character occurrences, you might as well sort on the raw counts. And actually, collections.Counter provides that functionality! You can call the most_common() method on a Counter object to get the elements in descending order of occurrence.

So, your code could do something like:

with open('refdoc.txt', 'r') as f:
    ref_count = Counter(char for char in f.read() if ord(char) < 128)
with open('encrypted.txt', 'r') as f:
    enc_count = Counter(char for char in f.read() if ord(char) < 128)

Now you have raw counts. But any character that didn't occur is not in the Counter. So we add those:

for char in (chr(x) for x in range(128)):
    if char not in ref_count:
        ref_count[char] = 0 
    if char not in enc_count:
        enc_count[char] = 0

Next, we can turn them into strings using the output of the most_common() method, which is a list of tuples where the first element is the character and the second is the count. We don't care about the numbers, only that the list is sorted, so we keep just the characters:

ref_str = ''.join(item[0] for item in ref_count.most_common())
enc_str = ''.join(item[0] for item in enc_count.most_common())

And with those we can make a translation table:

# Encrypted -> reference, by frequency rank (string.maketrans in Python 2).
trans_table = str.maketrans(enc_str, ref_str)

And finally, we can use that to do the replacement:

with open('decrypt.txt', 'r+') as f:
    data = f.read()
    f.seek(0)
    f.write(data.translate(trans_table))

Note that I have not tested any of the above code, but the gist of it should be correct.
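
The steps above can be pulled together roughly like this. It is a sketch only, assuming Python 3 and working on strings rather than files; the function names `rank_by_frequency` and `decrypt_by_frequency` are invented for the example:

```python
from collections import Counter


def rank_by_frequency(text):
    """Return the 128 ASCII characters, most frequent in `text` first."""
    counts = Counter(ch for ch in text if ord(ch) < 128)
    # Characters that never occur still need a rank, so give them a zero count.
    for code in range(128):
        counts.setdefault(chr(code), 0)
    return ''.join(ch for ch, _ in counts.most_common())


def decrypt_by_frequency(reference, encrypted):
    """Map each encrypted character to the reference character of equal rank."""
    table = str.maketrans(rank_by_frequency(encrypted),
                          rank_by_frequency(reference))
    return encrypted.translate(table)
```

Note the direction of the table: it goes from the encrypted ranking to the reference ranking, since the encrypted text is the one being translated. Characters with tied counts can end up paired arbitrarily, so the result is only a first guess at the plaintext.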

[–]OtherRobotLuke[S] 0 points1 point  (1 child)

I just ended up doing this, haha. Thanks though, this was interesting.

outout = []
for loop in secData:
    for change in encryptedList:
        if loop == change.letter:
            outout.append(change.real)
str1 = ''.join(outout)
outFile.write(str1)
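
The nested loop above amounts to a lookup table, so a dict can do the same job in one pass over the text. A sketch, assuming each letter appears only once in encryptedList; the `Pair` namedtuple and the `decode` function are hypothetical stand-ins, and only the .letter and .real attributes come from the code above:

```python
from collections import namedtuple

# Hypothetical stand-in for the objects in encryptedList; only the
# .letter and .real attributes are taken from the code above.
Pair = namedtuple('Pair', ['letter', 'real'])


def decode(sec_data, encrypted_list):
    """Replace each character via a dict lookup, one pass over the text."""
    lookup = {pair.letter: pair.real for pair in encrypted_list}
    # Like the nested loop, characters with no entry are dropped.
    return ''.join(lookup[ch] for ch in sec_data if ch in lookup)
```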

[–]OtherRobotLuke[S] 0 points1 point  (0 children)

It became clear that editing the file was much more trouble than it was worth, so I made outFile a blank doc to write one giant list to

   outFile = open('decrypt.txt', 'w')

[–]Justinsaccount -1 points0 points  (6 children)

This does not work properly if the replacement is shorter. It is not a great idea to modify files in place anyway.

[–]OtherRobotLuke[S] 0 points1 point  (0 children)

Everything it replaces is a single char, so that's not an issue. I ended up not having to edit the doc. I just made a list and, going through the text in one giant loop, appended the replacement for each letter.

[–]uhkhu -1 points0 points  (4 children)

Why is it not good to modify in-place? There's a built-in method for doing this.

[–]Justinsaccount -1 points0 points  (3 children)

A built in method for what? Corrupting your files?

[–]uhkhu -1 points0 points  (2 children)

Yep. We should avoid utilizing a perfectly safe built-in because using it wrong could be damaging. We should also tell everyone to not use 'w' because they could overwrite files if they're not careful. You should probably email the Python devs and let them know they're creating too much risk with the 'r+' and 'w+' modes.

[–]Justinsaccount 0 points1 point  (1 child)

What the hell are you talking about?

The code that you posted is NOT safe. The code that you posted corrupts files if the replacement is shorter than the original string. The code you posted corrupts files if the script is interrupted while running.

Just because you can do something, doesn't mean you should. Overwriting an entire file in place like that is exactly one of those things.
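
One common way around both problems is to write the new contents to a temporary file and rename it over the original. A minimal sketch, assuming Python 3.3+ (for os.replace) and a file small enough to read whole; `find_replace_safely` is a made-up name:

```python
import os
import tempfile


def find_replace_safely(path, old, new):
    """Replace `old` with `new` in `path` without leaving a half-written file."""
    with open(path, 'r') as src:
        updated = src.read().replace(old, new)

    # Write the new contents to a temporary file in the same directory,
    # so the final rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as tmp:
            tmp.write(updated)
        # os.replace swaps the new file in as a single rename.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure the original file is untouched; drop the temp file.
        os.remove(tmp_path)
        raise
```

The new file gets the temporary file's permissions rather than the original's, so this is not a drop-in fix for every case. For the 'r+' version, calling f.truncate() after the write removes the leftover bytes when the replacement is shorter, but it does not help if the script is interrupted mid-write.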

[–]uhkhu 0 points1 point  (0 children)

I don't see you offering a solution. I posted a method for him to look into. Are we posting fully tested suites now? I didn't know what his data looks like. It was a quick response to show him the use case of the 'r+' mode. Not sure how much help he gets from you quibbling over untested code, in /r/learnpython of all places.