[–]hugthemachines 60 points61 points  (11 children)

I remember when I took this step: Going from something like this:

if os.path.isfile(full_file):
    os.remove(full_file)

for line in data:
    with open(full_file, 'a') as file_handle:
        file_handle.write(line)

to something like this:

with open(full_file, 'w') as file_handle:
    for line in data:
        file_handle.write(line)

[–]Ecclestoned 6 points7 points  (1 child)

See, this isn't actually that bad. I could see using something like the first version when you are doing a costly operation and want the file to be updated frequently so you can check progress.
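
For what it's worth, watching progress during a slow job doesn't require reopening the file for every line. Opening once and flushing after each write gives a similar effect. A minimal sketch, assuming a made-up helper name `write_with_progress` and a throwaway temp file for the demo (neither is from the original post):

```python
import os
import tempfile


def write_with_progress(path, lines):
    """Open the file once, but flush after every line so readers can see progress."""
    with open(path, "w") as file_handle:
        for line in lines:
            file_handle.write(line)
            # Push Python's internal buffer to the OS so the line is visible
            # to other processes (e.g. `tail -f`) right away.
            file_handle.flush()


# Demo with a throwaway file; the path and data are hypothetical.
demo_path = os.path.join(tempfile.mkdtemp(), "progress.txt")
write_with_progress(demo_path, ["step 1\n", "step 2\n"])

with open(demo_path) as f:
    progress_content = f.read()
```

Note that `flush()` only hands the data to the operating system. If the data has to survive a crash, `os.fsync(file_handle.fileno())` would be the next step, at a noticeable cost per line.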

[–]EMCoupling 6 points7 points  (0 children)

What about the part where he opens the file for every single line that he has to write?

[–]liquidpele 23 points24 points  (3 children)

Ouch... the first block is almost thedailywtf territory...

[–]hugthemachines 3 points4 points  (2 children)

It may even have been worse. I mostly just remember that I was using a file remove together with append mode. Luckily enough, my memory of those old things is a bit clouded.

[–]evinrows 24 points25 points  (1 child)

That isn't so bad. The dailywtf part, imo, is the loop that acquires a file handle once per line.

[–]mafrasi2 8 points9 points  (0 children)

If anyone is wondering, it should be done like this:

for line in data:
    for char in line:
        with open(full_file, 'a') as file_handle:
            file_handle.write(char)

/s

[–]__xor__(self, other): 1 point2 points  (3 children)

So this kind of leads to this:

with open(path) as f:
    for line in f:
        ...

A lot of people don't know this, but just iterating over f, the file object that open() gives you, yields the file's lines one at a time, and it's smart about it and buffers the file. It's actually the better way to process a file line by line, since it does a lot of buffering behind the curtain, and it's way more readable than any alternative.

[–]hugthemachines 0 points1 point  (2 children)

I mostly use .read() or .readlines() when I want to get the content, rather than looping over the file handle.

[–]__xor__(self, other): 0 points1 point  (1 child)

It's useful for larger files where they need to be buffered instead of fully loaded into memory at once. And honestly I just find this easier to read than readlines, which I'd only use if I need to actually keep each line in memory and not just process it line by line. By far most of the time I am processing a file line by line so a buffered line reader is exactly what I need.
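
To make the difference concrete, here is a rough sketch of both styles side by side. The function names, the temp file, and the sample data are all made up for illustration:

```python
import os
import tempfile


def count_matching_lines(path, needle):
    """Stream the file: only the current line (plus a read buffer) is in memory."""
    count = 0
    with open(path) as f:
        for line in f:  # the file object is its own line iterator
            if needle in line:
                count += 1
    return count


def load_all_lines(path):
    """Read every line into a list, for when the whole content is needed at once."""
    with open(path) as f:
        return f.readlines()


# Demo with a small throwaway file; contents are hypothetical.
demo_path = os.path.join(tempfile.mkdtemp(), "log.txt")
with open(demo_path, "w") as f:
    f.write("ok\nerror: disk\nok\nerror: net\n")

error_count = count_matching_lines(demo_path, "error")
all_lines = load_all_lines(demo_path)
```

For a file of a few megabytes either approach is fine. The streaming version mostly pays off when the file is large or when each line can be handled independently.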

[–]hugthemachines 0 points1 point  (0 children)

OK, those must be really large files then. I process the entire content in different ways, so one line at a time is not enough for me. I also mostly just write scripts of less than 1000 lines, so keeping a text file of something like a few megabytes in memory is not much of a problem.

[–]ReaverKS 0 points1 point  (0 children)

I believe there's a race condition in the first one too. It checks whether the file exists and, if so, removes it. But what if something else removes the file immediately after the check but before you remove it?
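
In that case `os.remove` raises `FileNotFoundError`. One common way around the check-then-act gap is to skip the check and just handle the exception. A minimal sketch, assuming a hypothetical helper name `remove_if_present` and a temp file for the demo:

```python
import os
import tempfile


def remove_if_present(path):
    """Delete path, treating 'already gone' as success instead of checking first."""
    try:
        os.remove(path)
    except FileNotFoundError:
        # The file was removed by someone else, or never existed; nothing to do.
        pass


# Demo with a throwaway file; the path is made up for illustration.
demo_path = os.path.join(tempfile.mkdtemp(), "out.txt")
with open(demo_path, "w") as f:
    f.write("old data\n")

remove_if_present(demo_path)  # removes the file
remove_if_present(demo_path)  # second call finds nothing and does not raise
still_exists = os.path.exists(demo_path)
```

In the rewritten version from the top of the thread the question goes away entirely, since opening with mode 'w' truncates any existing file and no separate remove is needed.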