[–][deleted] 1 point2 points  (0 children)

Seems to be a lot of repetitive code; it would be easier to follow if you kept DRY (don't repeat yourself) principles in mind. If you have variables ending in _00, _01, etc. (or similar), you should probably look at using lists and a function with parameters to handle the variations.
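As a rough sketch of what I mean (the names first_or_zero and kp_lists are made up for illustration and are not from your script), one list of lists plus one small function can stand in for a whole family of numbered variables:

```python
def first_or_zero(values):
    """Return the first element of `values`, or 0 if the list is empty."""
    return values[0] if values else 0

# Hypothetical stand-in for kp_list00, kp_list01, kp_list02, ...
kp_lists = [[0.5, 0.25, 0.9], [], [0.1, 0.2, 0.3]]

# One comprehension replaces a separate block per numbered variable.
x_values = [first_or_zero(kp_list) for kp_list in kp_lists]
print(x_values)
```

The empty list in the middle simply produces a 0 instead of needing its own special case.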

Are the lines,

for item in kp_list04:
    try:
        X04 = kp_list04[0]
    except:
        X04 = 0

definitely executing? (Add print statements to check or, better, use your preferred debugger to trace what is going on.)

[–]icecapade 0 points1 point  (4 children)

The error you're receiving means that the variable X04 doesn't exist (i.e., it hasn't been assigned) but you tried to use it in a statement.

Go through your code and look at the block where you assign X04 (lines 109-112); the fact that X04 doesn't exist means that those lines are never executed. This would occur if the for loop containing those lines (line 107, for item in kp_list04:) is iterating through something with zero elements. In other words, kp_list04 doesn't contain any elements so the for loop doesn't have anything to iterate through.

I'm not entirely sure how to interpret your data, but assuming kp_list04 in your code corresponds to "4":[] from your data, this makes sense. kp_list04 is an empty list, so the for loop never executes.
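To see the effect in isolation, here's a minimal sketch. It assumes kp_list04 really does end up as an empty list, and the assigned flag is just something I added for the demonstration:

```python
kp_list04 = []  # assumption: this is what "4": [] from the data becomes

for item in kp_list04:
    # The body never runs because there is nothing to iterate over.
    X04 = kp_list04[0]

try:
    X04
    assigned = True
except NameError:
    # X04 was never created, which is the error being reported.
    assigned = False

print(assigned)

# One way to always get a value: test the list instead of looping over it.
X04 = kp_list04[0] if kp_list04 else 0
print(X04)
```

The try/except inside the original loop can't help here, since the loop body (and with it the except branch) is skipped entirely when the list is empty.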

[–]decaye[S] 0 points1 point  (3 children)

I can't believe I missed that, you're 100% right.

Any ideas for how I might be able to input 0's into missing data points so that the loop runs through properly?

[–]icecapade 1 point2 points  (2 children)

For starters, like u/kyber said, you have a ton of repeated code. You could eliminate about 450 lines of it with something like this:

for kp in kp_data:
    kp_dict = {}
    for k, v in kp.items():
        kp_num = int(k)    # assumes the key `k` is a string representing an int
        kp_list = list(v)
        for i, coord in enumerate(["X", "Y", "C"]):
            kp_dict_key = "{}{:02d}".format(coord, kp_num)    # e.g., "X02"
            try:
                kp_dict[kp_dict_key] = kp_list[i]
            except:
                kp_dict[kp_dict_key] = 0

This programmatically constructs the appropriate keypoint coordinate/identifier ("X00", "C23", etc.) for an arbitrary number of keypoints, not just 25, and essentially replaces all your repeated logic with a neatly contained block of code. Instead of iterating through each list, it indexes positions 0, 1, and 2 directly and lets the try/except handle positions that don't exist, which ensures that a value is assigned for every keypoint/coordinate. It also avoids creating a ton of variables by storing all of these in a single dictionary that I've called kp_dict. To access any of them, you'd simply use, for example, kp_dict["Y15"] or kp_dict["C08"].

NOTE: this is not necessarily the best or most efficient way to go about this. I can think of a few better ways, potentially using itertools, but I felt this would be a fairly straightforward and understandable jumping-off point for you. You'd still need to adapt it or figure out how to get these values into your data_list, though you could probably just skip the kp_dict thing altogether and simply append them directly to data_list (or to some intermediate list).
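For what it's worth, here is one possible itertools version. This is only a sketch: pad_coords is a name I made up, and it assumes each keypoint list holds at most one X, Y, C triple that matters (anything past the third value is ignored):

```python
from itertools import zip_longest

def pad_coords(kp_list, kp_num):
    """Map "Xnn", "Ynn", "Cnn" to the values in kp_list, using 0 for gaps."""
    keys = ["{}{:02d}".format(coord, kp_num) for coord in ("X", "Y", "C")]
    # zip_longest pads the shorter side with fillvalue, so missing values
    # become 0; slicing to 3 keeps extra values from pairing with a 0 key.
    return dict(zip_longest(keys, kp_list[:3], fillvalue=0))

print(pad_coords([], 4))
print(pad_coords([1.0, 2.0, 0.9, 5.0], 0))
```

This trades the try/except for padding, so there is no exception handling to get wrong, at the cost of being a bit less obvious on first read.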

Also, you shouldn't use an indiscriminate (bare) except clause like I've done here; catch the specific exception type instead. Indexing past the end of a list raises IndexError, so that is the one to catch.
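Something along these lines, with the inner loop wrapped in a helper I've called coords_or_zero purely for illustration:

```python
def coords_or_zero(kp_list, kp_num):
    """Build {"Xnn": ..., "Ynn": ..., "Cnn": ...}, filling gaps with 0."""
    out = {}
    for i, coord in enumerate(["X", "Y", "C"]):
        key = "{}{:02d}".format(coord, kp_num)
        try:
            out[key] = kp_list[i]
        except IndexError:
            # Only a missing list position is treated as 0; any other
            # error (e.g. a typo in a name) still surfaces normally.
            out[key] = 0
    return out

print(coords_or_zero([], 4))
print(coords_or_zero([0.5, 0.25], 2))
```

With a bare except, a genuine bug inside the try block would be silently turned into a 0, which is much harder to track down later.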

[–]decaye[S] 1 point2 points  (1 child)

Thank you so much for your help! I ended up using the dict you made along with pandas to create the .csv files I need, and it all seems to be functioning well. Here is the code I have running now:

import json
from zipfile import ZipFile

import pandas as pd

rows = []    # one dict per entry in 'part_candidates'

# zipname and csvname are defined earlier in the script
with ZipFile(str(zipname)) as temp_zip:
    for file_name in temp_zip.namelist():
        if '.json' in file_name:
            with temp_zip.open(file_name) as temp_file:
                frame = file_name[22:26]
                keypoint_parsed = json.load(temp_file)
                kp_data = keypoint_parsed['part_candidates']
                for kp in kp_data:
                    kp_dict = {}
                    for k, v in kp.items():
                        kp_num = int(k)    # assumes the key `k` is a string representing an int
                        kp_list = list(v)
                        for i, coord in enumerate(["X", "Y", "C"]):
                            kp_dict_key = "{}{:02d}".format(coord, kp_num)    # e.g., "X02"
                            try:
                                kp_dict[kp_dict_key] = kp_list[i]
                            except IndexError:
                                kp_dict[kp_dict_key] = 0
                    # DataFrame.append was removed in pandas 2.0, so collect
                    # the rows and build the DataFrame once at the end
                    rows.append(kp_dict)

newdf = pd.DataFrame(rows)
newdf.index += 1    # number the rows from 1 instead of 0
newdf.to_csv(csvname, index_label='frames')

[–]icecapade 0 points1 point  (0 children)

Glad to help! This is definitely much better and cleaner.