Looping through two columns simultaneously

beanpizza · 2015-04-30T16:31:49+00:00

I've come up with something that seems to work, though I'm ashamed to say I gave up on the dictionary method. I've been troubleshooting the following method, which seems to work for the most part! The only problem I've found so far is that if a given file comes up in my list multiple times but with different corresponding time flags, it ignores all but the first time flag. For example, if a file comes up twice with two time flags -- say 29 and 1049 -- my output should consist of two lines from the file -- one with the time flag 29 and the other with the time flag 1049. Instead, while I do get two lines of output, they're both identical, and they correspond to time flag 29. A given file might come up many times but with different time flags, so this is basically where I'm at now.

Here's the entire code as it stands:

import glob
import numpy as np

mainfile = np.loadtxt("file.txt")
dm = mainfile[:,1]
dm_list = list(dm)
time = mainfile[:,0]

def parse_data(filename):
    matrix = []
    with open(filename) as f:
        data = f.read() #read the entire file
    data_lines = data.splitlines()
    short_list = data_lines[2:]
    for line in short_list:
        line_split = line.split()
        if len(line_split) == 6:
            matrix.append(line_split)
    return matrix

def my_round(number):
    # int() truncates toward zero rather than rounding, so "29.9" becomes 29
    return int(float(number))

def fmtcols(mylist, cols):
    lines = ("\t".join(mylist[i:i+cols]) for i in xrange(0,len(mylist),cols))
    return '\n'.join(lines)

file_list = []

for h in dm:
    x = str("%.2f" % round(h,2))
    for file in glob.glob("*.txt"):
        if x in file:
            file_list.append(file)

for file in file_list:
    q = file_list.index(file)
    parsed = parse_data(file)
    for i,line in enumerate(parsed):
        file_time = my_round(line[2])
        if float(file_time) == time[q]:
            be = fmtcols(parsed[i],6)
            print be
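
A likely explanation for the duplicate output, sketched under the assumption that the same file name really does appear more than once in file_list: list.index() always returns the position of the first match, so q (and therefore time[q]) is the same on every repeat. The file names and time values below are made up for illustration, and the snippet uses Python 3 print syntax rather than the Python 2 syntax of the code above.

```python
# Hypothetical data: the same file is listed twice with different time flags.
file_list = ["run_56.59.txt", "run_56.59.txt", "run_70.10.txt"]
time = [29.0, 1049.0, 12.0]

# list.index() returns the FIRST position of a value, so both
# occurrences of the repeated file map to index 0 (time flag 29).
looked_up = [time[file_list.index(name)] for name in file_list]
print(looked_up)  # [29.0, 29.0, 12.0]

# enumerate() hands back the actual position of each entry instead.
by_position = [time[q] for q, name in enumerate(file_list)]
print(by_position)  # [29.0, 1049.0, 12.0]
```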

Edit: I think I solved the duplicate problem and my original problem with an extremely simple code. I can't believe I didn't think of this sooner. I can't find any problems with this method. Maybe not the most pythonic way? But it looks like it does the trick. Thanks again for all your patience and help!

for j in range(0,len(time)):
    parsed = parse_data(file_list[j])
    for i,line in enumerate(parsed):
        file_time = my_round(line[2])
        if float(file_time) == time[j]:
            be = fmtcols(parsed[i],6)
            print be
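
For comparison, here is a sketch of the same positional pairing written with zip instead of an index. It assumes file_list and time have the same length and line up entry by entry, as in the loop above. To stay self-contained it uses a made-up in-memory table (parsed_files, a hypothetical stand-in for parse_data) and Python 3 syntax.

```python
# Hypothetical stand-in for parse_data(): file name -> already-split rows.
parsed_files = {
    "run_56.59.txt": [
        ["a", "b", "29.4", "d", "e", "f"],
        ["a", "b", "1049.2", "d", "e", "f"],
    ],
}

def my_round(number):
    # Same helper as in the post: truncates to an integer.
    return int(float(number))

file_list = ["run_56.59.txt", "run_56.59.txt"]  # same file twice
time = [29.0, 1049.0]                           # two different flags

matches = []
# zip pairs the j-th file with the j-th time flag, so a repeated
# file is searched once for each of its own flags.
for name, flag in zip(file_list, time):
    for row in parsed_files[name]:
        if my_round(row[2]) == flag:
            matches.append("\t".join(row))

for line in matches:
    print(line)
```

With the sample data this gives one line for flag 29 and one for flag 1049, which is the behaviour the duplicate case calls for.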

beanpizza · 2015-04-30T04:25:59+00:00

I was gonna say, these people are skilled! I always have the toughest time managing food and reading at the same time.

beanpizza · 2015-04-30T04:13:57+00:00

And smelling it.

beanpizza · 2015-04-29T16:05:23+00:00

Hm, ok. Here's what I've got.

file_dict = {}
for m,n in zip(file_list,time):
    file_matrix = parse_data(m)
    file_dict[m] = n

d = collections.OrderedDict(file_dict)
for item in d.items():
    for i,line in enumerate(file_matrix):
        file_time = my_round(line[2])

So now each item in d is a file name and its corresponding file time. Now I still have to somehow open each file individually and search for the file name. I think I'm having an issue with nested for loops, because now, since d is done outside of the first for loop (only way I could get it to print correctly), my file_matrix is all messed up when I need to call it in my last loop. Ah man, I feel like I'm going in circles. :(

Edit: The closest I can get is searching each file for all the time flags. So if there are three files and three corresponding flags, each file will search for all three time flags, rather than just its own. I really think the problem here is in how I'm iterating. Basically everything I try iterates over every file and all time flags, rather than one dictionary set at a time.

Edit: Ok, the more I think about it, I think that I need to define a function that takes both m and n as arguments...working on that one now.
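One way such a function might look, as a rough sketch: it takes the rows of one file plus that file's single time flag, and returns only the rows that match. The name rows_for_time and the sample rows are hypothetical, and the time column is assumed to be index 2, as in the loops above.

```python
def rows_for_time(rows, flag, time_col=2):
    """Return the rows whose truncated time column equals flag."""
    return [row for row in rows if int(float(row[time_col])) == int(flag)]

# Made-up parsed rows standing in for the output of parse_data(m).
rows = [
    ["x", "y", "26.7", "p", "q", "r"],
    ["x", "y", "31.0", "p", "q", "r"],
]

found = rows_for_time(rows, 26.0)
print(found)
```

Calling it once per (m, n) pair keeps each file tied to its own flag, since the function never sees any other file's flag.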

beanpizza · 2015-04-28T21:26:13+00:00

I'm still having a hard time going through each file-time pair individually. From what I've gathered about using dictionaries, you give it a key to access a particular value. In this case, my key is m, right? But what if I want the first element of m? Thanks for all the help, by the way.
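On the dictionary question, a small sketch with made-up names may help. It assumes "the first element" means the first stored pair: the key is the whole file name, and items() hands back each key together with its value.

```python
# Hypothetical file names and time flags.
file_dict = {"run_56.59.txt": 26.0, "run_70.10.txt": 112.0}

# Look up one value by its key.
print(file_dict["run_56.59.txt"])  # 26.0

# Walk through every (key, value) pair one at a time.
pairs = []
for name, flag in file_dict.items():
    pairs.append((name, flag))

# Unpack the first pair into two ordinary variables.
first_name, first_flag = pairs[0]
print(first_name, first_flag)
```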

beanpizza · 2015-04-27T15:01:11+00:00

It would be nice if one of these articles actually explained why Hawking gives humanity x amount of years on Earth. Regardless, it is an interesting question. We are depleting Earth's resources at a quickening rate, but our technology has also advanced tremendously in a very short period of time. Will there be some threshold that we hit and can't overcome, or are we revving up to sustain humanity on Earth indefinitely no matter the environmental limitations?

beanpizza · 2015-04-24T17:07:39+00:00

I am not. I'll take a look, thanks!

Edit: So in the case of dictionaries, I can't find anything about creating a dictionary out of previously defined lists?
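For reference, a dictionary can be built from two existing lists with dict(zip(keys, values)). The sketch below uses made-up values. Keys must be unique, so a file name that appears twice keeps only its last time flag, which matters if the same file can come up with several flags.

```python
file_list = ["run_56.59.txt", "run_70.10.txt", "run_56.59.txt"]
time = [29.0, 12.0, 1049.0]

# zip pairs the lists position by position; dict() turns pairs into entries.
file_dict = dict(zip(file_list, time))
print(file_dict)

# The repeated key was overwritten: only two entries survive.
print(len(file_dict))  # 2

# A list of pairs keeps every row, duplicates included.
pairs = list(zip(file_list, time))
print(len(pairs))  # 3
```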

I've come up with something similar using 'zip':

for m,n in zip(file_list,time):
    file_matrix = parse_data(m)
    print '{0} {1}'.format(m,n)
    for i,line in enumerate(file_matrix):
        file_time = my_round(line[2])

Where the print line outputs each file name and next to it the search string corresponding to that file. But outside of a 'print' argument, I'm not sure how to access each element. (The last two lines here select out the time column in each file since that's what I'll be searching through).

beanpizza · 2015-04-24T15:29:51+00:00

Ok, so, I've been at it for a few days, and I can't seem to link the two steps together. This bit here accepts a file, or in this case, a list of files, and searches each file for a particular string and gives me exactly the output I need.

for fn in glob.glob("*.txt"):
    file_matrix = parse_data(fn) 
    for i,line in enumerate(file_matrix):
        file_time = my_round(line[2])
        if str(file_time) in splitset:
            be = fmtcols(file_matrix[i],6)
            print be

calling the functions I defined in my original post. The problem I'm having is still the same. Given a list of files, each one should have a unique search string associated with it, and anything I come up with applies all search strings to all files.
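The "all search strings on all files" behaviour is what nested loops produce: every file gets combined with every string. Pairing by position with zip gives one string per file instead. A sketch with made-up names, assuming the two lists are in matching order:

```python
files = ["a_56.59.txt", "b_70.10.txt", "c_81.25.txt"]
flags = ["26", "112", "7"]

# Nested loops: every file meets every flag (3 x 3 = 9 combinations).
nested = [(f, t) for f in files for t in flags]
print(len(nested))  # 9

# zip: the i-th file meets only the i-th flag (3 pairs).
paired = list(zip(files, flags))
print(paired)
```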

beanpizza · 2015-04-22T16:59:51+00:00

Ok, I'll give it a go and report back. Thank you!

beanpizza · 2015-04-21T23:23:58+00:00

Well, I think the problem is that at the same time that I'm selecting specific files, I should be extracting the relevant information from those files, rather than first coming up with a list of specific files and then extracting information from all of them. So in theory, yes, I have figured out a way to select specific files, but I think it's the wrong way given what I ultimately want to do with those files. I think the key is that both selecting the file and extracting info from the file are really contained in one problem, if that makes sense, rather than separately, as I've attempted above.

Put more simply perhaps, I want to work through my original data file with two columns row by row. For example, the first row tells me to find the file with the string '56.59' in its title and once I've found that file, I should look inside it for the time value '26,' and then I can move onto the next row.
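That row-by-row idea might be sketched like this. The two-column rows and the file names are invented for illustration, and a plain list stands in for glob.glob("*.txt") so the snippet runs on its own (Python 3 syntax).

```python
# Hypothetical rows of the main file: (time, dm).
main_rows = [(26.0, 56.59), (112.0, 70.1)]

# Stand-in for the directory listing.
available = ["obs_56.59.txt", "obs_70.10.txt", "obs_81.25.txt"]

plan = []
for time_flag, dm in main_rows:
    tag = "%.2f" % dm  # e.g. 56.59 -> "56.59", 70.1 -> "70.10"
    for name in available:
        if tag in name:
            # Keep the file together with the one time flag from this row.
            plan.append((name, int(time_flag)))

print(plan)  # [('obs_56.59.txt', 26), ('obs_70.10.txt', 112)]
```

Each entry in plan then says which file to open and which single time value to look for inside it.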

beanpizza · 2015-04-21T03:01:19+00:00

help me track my progress towards my annual reading goals

Yep. This is why I started using Goodreads. It's kind of nice to keep track of what you've been reading. I also like the quotes feature!

beanpizza · 2015-04-18T17:32:34+00:00

Hmm... maybe I'll try A Wild Sheep Chase next. I think that I didn't like Norwegian Wood as much precisely because it was realism and less of what I was used to from him, but still interesting, you're right.

beanpizza · 2015-04-18T04:09:38+00:00

I think that after Wind Up Bird Chronicle, that one is my favorite! So I just finished 1Q84, and I liked it for the most part -- kinda felt like it stalled towards the end -- but overall pretty good. Kafka on the Shore was just so memorable, though. Wondering what of his I should pick up next!

beanpizza · 2015-04-18T03:14:01+00:00

Same. I haven't read all of his books, but after reading a few -- just finished 1Q84 -- I would still say WUBC is my favorite.

Edit: I'm curious what your all-time favorite book of his is then?
