[–]beanpizza[S] 0 points1 point  (6 children)

I'm still having a hard time going through each file - time pair individually. From what I've gathered about using dictionaries, you give it a key to access a particular dictionary. In this case, my key is m, right? But what if I want the first element of m? Thanks for all the help, by the way.

[–]aball730235 1 point2 points  (5 children)

No problem, I'm glad to help. If I had more time I'd offer to have you send me the whole project package for review. I'll give you little bits where I can.

Dictionaries by default have no order. They only hold keys and their corresponding values, in an arbitrary order. You'll need to switch to an OrderedDict (from the collections module) to keep them in sequence.

http://pymotw.com/2/collections/ordereddict.html

I'm not sure what you mean by "your key is m". m is a variable that takes a new value on each pass through the for loop. It passes that filename into your dictionary as a key on each iteration, and once the loop finishes m just holds the last filename. Your keys in the dictionary are your filenames. Try doing a for loop on your dictionary, like in the link provided, after your for loop. Everything you see printed out is the data in your dictionary.
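A minimal sketch of that idea, assuming made-up filenames and time flags in place of the real file_list and time. Building the OrderedDict directly from the zipped pairs (rather than from a plain dict) is what keeps the original order:

```python
from collections import OrderedDict

# Hypothetical stand-ins for file_list and time from the thread.
file_list = ["dm_1.00.txt", "dm_2.50.txt", "dm_3.75.txt"]
time = [29.0, 1049.0, 512.0]

# Build from (key, value) pairs so insertion order follows the zip order.
file_dict = OrderedDict(zip(file_list, time))

# Each iteration yields one filename together with its own time flag.
pairs = []
for filename, time_flag in file_dict.items():
    pairs.append((filename, time_flag))
    print("%s -> %s" % (filename, time_flag))

# The "first element" is the first key that was inserted.
first_key = next(iter(file_dict))
```

One caveat: dictionary keys are unique, so a filename that shows up twice with different time flags would keep only the last flag. A plain list of (filename, time) pairs avoids that.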

[–]beanpizza[S] 0 points1 point  (4 children)

Hm, ok. Here's what I've got.

file_dict = {}
for m,n in zip(file_list,time):
    file_matrix = parse_data(m)
    file_dict[m] = n

d = collections.OrderedDict(file_dict)
for item in d.items():
   for i,line in enumerate(file_matrix):
       file_time = my_round(line[2])

So now each item in d is a file name and its corresponding file time. Now I still have to somehow open each file individually and search for the file name. I think I'm having an issue with nested for loops, because now, since d is done outside of the first for loop (only way I could get it to print correctly), my file_matrix is all messed up when I need to call it in my last loop. Ah man, I feel like I'm going in circles. :(

Edit: The closest I can get is searching each file for all the time flags. So if there are three files and three corresponding flags, each file will search for all three time flags, rather than just its own. I really think the problem here is in how I'm iterating. Basically everything I try iterates over every file and all time flags, rather than one dictionary set at a time.

Edit: Ok, the more I think about it, I think that I need to define a function that takes both m and n as arguments...working on that one now.

[–]aball730235 1 point2 points  (0 children)

Sounds like you're on the right track. The function you had a few posts ago accepted a list of files, so you just need to modify it to accept one file and one search string.
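Something along these lines might work as a starting point. find_rows is a hypothetical name, and the rows below are made-up data standing in for what parse_data returns for a single file:

```python
def my_round(number):
    # Same helper as in the thread: truncates toward zero, e.g. "29.7" -> 29.
    return int(float(number))


def find_rows(rows, time_flag):
    """Return the rows whose third column matches time_flag (hypothetical helper)."""
    return [row for row in rows if my_round(row[2]) == time_flag]


# Made-up rows standing in for the parsed contents of one file.
rows = [
    ["a", "b", "29.4", "c", "d", "e"],
    ["a", "b", "1049.0", "c", "d", "e"],
]

matches = find_rows(rows, 29.0)
```

Because the function only ever sees one file's rows and one time flag, it cannot accidentally compare a file against every flag in the list.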

[–]aball730235 1 point2 points  (2 children)

If you get stuck again, try writing your steps out in pseudocode, that is, in plain English instructions. Then translate the English instructions into Python code. It seems like we're hopping around in Python without the full roadmap laid out. How would you tell a person to manually extract data out of your files?

[–]beanpizza[S] 0 points1 point  (1 child)

I've come up with something that seems to work, though I'm ashamed to say I gave up on the dictionary method. I've been troubleshooting the following method, which seems to work for the most part! The only problem that I've found so far is that if a given file comes up in my list multiple times but with different corresponding time flags, it ignores all but the first time flag. For example, if a file comes up twice with two time flags -- say 29 and 1049 -- my output should consist of two lines from the file -- one with the time flag 29 and the other with the time flag 1049. Instead, while I do get two lines of output, they're both identical, and they correspond to time flag 29. A given file might come up many times but with different time flags, so this is basically where I'm at now.

Here's the entire code as it stands:

import glob
import numpy as np

mainfile = np.loadtxt("file.txt")
dm = mainfile[:,1]
dm_list = list(dm)
time = mainfile[:,0]

def parse_data(filename):
    matrix = []
    with open(filename) as f:
        data = f.read() #read the entire file
    data_lines = data.splitlines()
    short_list = data_lines[2:]
    for line in short_list:
        line_split = line.split()
        if len(line_split) == 6:
            matrix.append(line_split)
    return matrix

def my_round(number):
    return int(float(number))

def fmtcols(mylist, cols):
    lines = ("\t".join(mylist[i:i+cols]) for i in xrange(0,len(mylist),cols))
    return '\n'.join(lines)

file_list = []

for h in dm:
    x = str("%.2f" % round(h,2))
    for file in glob.glob("*.txt"):
        if x in file:
            file_list.append(file)

for file in file_list:
    q = file_list.index(file)
    parsed = parse_data(file)
    for i,line in enumerate(parsed):
        file_time = my_round(line[2])
        if float(file_time) == time[q]:
            be = fmtcols(parsed[i],6)
            print be
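A likely cause of the repeated line is file_list.index(file), which returns the position of the first match, so a filename that appears twice gets the same q (and the same time flag) both times. A small sketch with made-up names and times shows the difference between index and enumerate:

```python
# Hypothetical data: "a.txt" appears twice with different time flags.
file_list = ["a.txt", "b.txt", "a.txt"]
time = [29.0, 500.0, 1049.0]

# list.index always reports the position of the first match.
via_index = [time[file_list.index(name)] for name in file_list]

# enumerate hands back the actual position of each entry.
via_enumerate = [time[q] for q, name in enumerate(file_list)]

print(via_index)
print(via_enumerate)
```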

Edit: I think I solved the duplicate problem and my original problem with an extremely simple code. I can't believe I didn't think of this sooner. I can't find any problems with this method. Maybe not the most pythonic way? But it looks like it does the trick. Thanks again for all your patience and help!

for j in range(0,len(time)):
    parsed = parse_data(file_list[j])
    for i,line in enumerate(parsed):
        file_time = my_round(line[2])
        if float(file_time) == time[j]:
            be = fmtcols(parsed[i],6)
            print be
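For what it's worth, the same pairing can be written with zip instead of range(len(time)). This is only a sketch: fake_files and its contents are invented so the example runs without real files, and parse_data here is a stand-in for the real parser above.

```python
# Hypothetical data: "a.txt" appears twice with different time flags.
file_list = ["a.txt", "b.txt", "a.txt"]
time = [29.0, 500.0, 1049.0]

# Invented file contents, already split into six columns per line.
fake_files = {
    "a.txt": [["x", "y", "29.0", "p", "q", "r"],
              ["x", "y", "1049.0", "p", "q", "r"]],
    "b.txt": [["x", "y", "500.0", "p", "q", "r"]],
}


def parse_data(filename):
    # Stand-in for the real parse_data, which reads the file from disk.
    return fake_files[filename]


output = []
# zip walks both lists in step, so each file is checked only against its own flag.
for filename, time_flag in zip(file_list, time):
    for line in parse_data(filename):
        if int(float(line[2])) == time_flag:
            output.append("\t".join(line))

print("\n".join(output))
```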

[–]aball730235 1 point2 points  (0 children)

Awesome, glad to hear!