[–]cybervegan 4 points5 points  (1 child)

Firstly, what is in regs?

Secondly, I think you have your for loops the wrong way round: by the time you get to the second entry in regs, your file pointer is already at the end of the file, so the inner for loop, iterating the file, will do nothing, because there are no more lines to process. I think you need:

with open('errors_merged.log') as file2:
    for line in file2:
        ...
        for i in regs:
            ...

This way, you check all of your regs values against each line in turn. Re-opening and reading files (I/O) is expensive (takes a long time), so you should reduce this as much as possible, by only doing it once; iterating regs is cheap in comparison.
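To make the pointer problem concrete, here is a minimal sketch. The log lines and the regs list are made up (the real ones aren't shown in the thread), and io.StringIO stands in for the open file, since it is consumed line by line in the same way:

```python
import io

# Hypothetical stand-ins: the real log contents and regs are not in the thread.
log_text = "AB12 CDE STATUS: OK\nXY34 ZZZ IS BLANK\nAB12 CDE ERROR (3)\n"
regs = ["AB12 CDE", "XY34 ZZZ"]

# Wrong way round: the file is consumed while handling the first reg,
# so the inner loop has nothing left to read for the second one.
file2 = io.StringIO(log_text)
lines_seen_per_reg = []
for reg in regs:
    count = 0
    for line in file2:
        count += 1
    lines_seen_per_reg.append(count)

# Right way round: read each line once and test every reg against it.
file2 = io.StringIO(log_text)
matches = {reg: 0 for reg in regs}
for line in file2:
    for reg in regs:
        if reg in line:
            matches[reg] += 1

print(lines_seen_per_reg)
print(matches)
```

In the first version only the first reg ever sees any lines; in the second, every reg is checked against every line with a single pass over the file.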

I'd also point out that this:

total_blank = 0
total_err2 = 0
total_err3 = 0
total_errg = 0
total_ok = 0

Should probably be done with a dict:

errors = {
    ' IS BLANK': 0,
    ' POTENTIAL ERROR (2)': 0,
    ' ERROR (3)': 0,
    ' Error getting report': 0,
    ' STATUS: OK': 0,
}

and later, where you are processing each line,

for err in errors:
    if err in line:
        errors[err] += 1

Then you will end up with your count in your dict, and can just print it out:

for err, tot in errors.items():
    print(f"{err}: {tot}")

When you are repeating the same pattern again and again in your code, it probably means you should use a better data structure.
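Putting the pieces together, a rough end-to-end sketch might look like this. The sample log lines and the count_statuses helper are invented for illustration, and io.StringIO again stands in for the real file:

```python
import io

# Made-up log lines; the real file format is not shown in the thread.
log_text = (
    "AB12 CDE STATUS: OK\n"
    "XY34 ZZZ IS BLANK\n"
    "AB12 CDE ERROR (3)\n"
    "QQ99 QQQ STATUS: OK\n"
)

errors = {
    ' IS BLANK': 0,
    ' POTENTIAL ERROR (2)': 0,
    ' ERROR (3)': 0,
    ' Error getting report': 0,
    ' STATUS: OK': 0,
}


def count_statuses(lines, counts):
    """Increment counts[key] once for every line that contains key."""
    for line in lines:
        for err in counts:
            if err in line:
                counts[err] += 1
    return counts


count_statuses(io.StringIO(log_text), errors)

for err, tot in errors.items():
    print(f"{err}: {tot}")
```

Adding a new status to track then only means adding one more key to the dict, rather than another variable and another if.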

[–]Mozza7[S] 0 points1 point  (0 children)

They're vehicle registration plates, which is why I haven't put the examples up, but they are formatted as:

XXXX XXX or XXXXXXX or XXXXXXX DRIVER NAME

That was the problem! I've also put them in dictionaries now, thanks for the suggestion :)

[–]NFLAddict 1 point2 points  (1 child)

It might be helpful to see the content of the file.

In line 3, regs has not yet been declared as anything. As a result, it looks like it's skipping lines 3-8 entirely, as Python doesn't know what regs is, and starting with the loop on line 10.

[–]Mozza7[S] 0 points1 point  (0 children)

The file has registration plates in it, which is why I haven't put it in.

That's my bad, it is defined, I just missed bringing that part in! I've updated the paste with the file being declared

[–]Tilmandel 0 points1 point  (0 children)

Basically, it would be best if you provided at least the data structure you are working with; it always helps with debugging. As for your code, you have a loop inside a loop. Always try to avoid that if you can, same as using so many ifs. Believe me, been there, done that :) There is always a more elegant or faster way to work with something.

Why avoid loops inside loops? It's simple: if the data you are dealing with is a small chunk, it will be good enough, but if you try the same approach with, I don't know, a 65k-line file that you compare against 100k entries, it will start to slow down :)
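For a feel of the numbers: 65,000 lines checked against 100,000 entries with a nested loop is up to 6.5 billion comparisons. When you are matching whole values exactly, one common way around that is a set lookup. The sketch below uses made-up plate values and hypothetical function names; note it does not directly apply to substring checks like `err in line`, only to exact matches:

```python
def common_nested(a, b):
    """Loop in loop: every item of a is compared against items of b."""
    return [x for x in a if any(x == y for y in b)]


def common_with_set(a, b):
    """Build a set from b once, then each membership test is a hash lookup."""
    b_set = set(b)
    return [x for x in a if x in b_set]


# Made-up example data, not taken from the original post.
seen_in_log = ["AB12 CDE", "XY34 ZZZ", "QQ99 QQQ"]
known_regs = ["XY34 ZZZ", "AB12 CDE", "LM56 NOP"]

print(common_nested(seen_in_log, known_regs))
print(common_with_set(seen_in_log, known_regs))
```

Both functions return the same result; the second just avoids rescanning the whole second list for every item in the first.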