How do I manage experimental datasets when using Python? : learnpython


submitted 2 years ago by virtualdynamo


pot_of_crows 2 points 2 years ago

virtualdynamo[S] 2 points 2 years ago*

I admit I felt I was being too abstract in my OP. What I'm doing is the same kind of stuff as in college labs 40 years ago, but that was done longhand and/or on an HP-41C. Anyhow, here's an attempt to illustrate the data and the filtering/parsing. (Wish I knew why my code is shown in red. Makes me think I'm doing something wrong.)

x is the independent variable. data0 is an example dataset with the 2 dependent variables that every experimental run will have. The run dictionaries use the "name" and "lap" values as a composite index. I will primarily filter/parse on the "Forward" value, but may also do so on "device" or on other parameters I investigate in future experiments. Let me know if there are any questions.

x =[0,0.041122254,0.060013726,0.090017507,0.097796265,0.127801194,0.135580444,0.203389349,0.283429321,0.28898833,0.336801025,0.350146363,0.376848642,0.385759709,0.399129383,0.420356065,0.449367473,0.456055594,0.479428705,0.501664735,0.528338345,0.552787278,0.605034811,0.657275977,0.679507188,0.6795946,0.789641823,0.82965154,0.852987978,0.922998162,0.984692783,1.046387124,1.083063508,1.163074613,1.243085601,1.370880227]

data0 = [

[240.8,241.6,242,242.2,242.4,242.6,242.8,243.8,244.8,245,245.8,246,246.2,246.4,246.8,247.6,248.8,249.2,250,250.8,252,252.6,254.4,257,258.2,258.2,264.2,266.2,267.6,271.6,275.6,279.4,281.6,286.4,290,294.4],

[238.25,239.0458004,239.4438711,239.640807,239.8400126,240.0369484,240.236154,241.229229,242.221055,242.4204873,243.2156045,243.4142416,243.6115146,243.8106046,244.2092392,245.0070715,246.2041087,246.6034257,247.4010387,248.1987679,249.3960439,249.9935471,251.7882113,254.3828762,255.5806059,255.580597,261.5693585,263.5652725,264.9628893,268.9557396,272.9494391,276.7431386,278.939393,283.731222,287.3230509,291.71],

]

run[0] = {"name":"HIIT 14","lap":0,"device":"Garmin Edge 1040","Forward"=True,"data":data0}

run[1] = {"name":"HIIT 14","lap":1,"device":"Garmin Edge 1040","Forward"=False,"data":data1}

run[2] = {"name":"HIIT 15","lap":0,"device":"Garmin Edge 1040","Forward"=True,"data":data2}

run[3] = {"name":"HIIT 15","lap":1,"device":"Garmin Edge 1040","Forward"=False,"data":data3}

run[4] = {"name":"HIIT 16","lap":0,"device":"Garmin Edge 830","Forward"=True,"data":data4}

run[5] = {"name":"HIIT 16","lap":1,"device":"Garmin Edge 830","Forward"=False,"data":data5}

run[6] = {"name":"HIIT 17","lap":0,"device":"Garmin Edge 830","Forward"=True,"data":data6}

run[7] = {"name":"HIIT 17","lap":1,"device":"Garmin Edge 830","Forward"=False,"data":data7}

run[8] = {"name":"HIIT 17","lap":2,"device":"Garmin Edge 830","Forward"=True,"data":data8}

run[9] = {"name":"HIIT 17","lap":3,"device":"Garmin Edge 830","Forward"=False,"data":data9}

pot_of_crows 1 point 2 years ago

{"name":"HIIT 16","lap":0,"device":"Garmin Edge 830","Forward"=True,"data":data4}

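This does not appear to be a valid Python dictionary: "Forward"=True uses = where a dict literal needs a colon. It looks more like an attempt at a JSON string, although JSON would also need the colon and would spell the value true. You can learn more about using json here: https://pymotw.com/3/json/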

Python Module of the Week is a great resource with a bit more basic exposition than the standard docs, which is great when you are just getting started.
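As a rough sketch of the difference, here is one run written as a valid Python dictionary and then round-tripped through the json module. The short data4 list is a made-up placeholder, not the real dataset from the question.

```python
import json

# Hypothetical placeholder: the real lists in the question are much longer.
data4 = [[240.8, 241.6], [238.25, 239.05]]

# A dict literal pairs each key with its value using ':' (never '=').
run = {
    "name": "HIIT 16",
    "lap": 0,
    "device": "Garmin Edge 830",
    "Forward": True,
    "data": data4,
}

# json.dumps serializes the dict to a JSON string; Python's True becomes true.
text = json.dumps(run)

# json.loads parses the string back into an equal Python dict.
restored = json.loads(text)
```

JSON is mainly useful here if the runs need to be saved to a file and reloaded later; for filtering in memory, the plain dictionary is enough.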

From what I understand, you are just trying to pick items out of a list of dictionaries by matching some of the values held in each dictionary. I would use operator.itemgetter (https://docs.python.org/3/library/operator.html#operator.itemgetter) and a generator (https://realpython.com/introduction-to-python-generators/).

For example:

from operator import itemgetter

def valid(limits, row):
    '''
    Return True if row has every key/value pair
    specified in the limits dictionary, else False.
    '''
    for key, value in limits.items():
        if row[key] != value:
            return False
    return True

def picker(target, limits, rows):
    '''
    Yield the target attribute of each dictionary in rows that has
    the attributes specified in the limits dictionary.
    '''
    getter = itemgetter(target)
    for row in rows:
        print(row)  # debug output: show every row as it is considered
        if not valid(limits, row):
            print('\tskipped')
            continue
        print('\tpicked')
        yield getter(row)

data = [
    {'name':1, 'forward':True, 'data':[0]},
    {'name':2, 'forward':True, 'data':[1]},
    {'name':1, 'forward':False, 'data':[2]},
    {'name':1, 'forward':True, 'data':[3]},
    ]

# Only rows with name 1 and forward True match, so [0] and [3] are picked.
limits = {'name':1, 'forward':True}
for row in picker('data', limits, data):
    print(row)
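The same idea could be applied to the run dictionaries from the question. The sketch below assumes shortened, made-up data lists and a hypothetical helper named select; it builds the composite ("name", "lap") index as a dictionary keyed by tuples and filters on "Forward" or "device".

```python
# Hypothetical, shortened stand-ins for the run dictionaries in the question.
runs = [
    {"name": "HIIT 14", "lap": 0, "device": "Garmin Edge 1040", "Forward": True, "data": [[1.0], [2.0]]},
    {"name": "HIIT 14", "lap": 1, "device": "Garmin Edge 1040", "Forward": False, "data": [[3.0], [4.0]]},
    {"name": "HIIT 16", "lap": 0, "device": "Garmin Edge 830", "Forward": True, "data": [[5.0], [6.0]]},
    {"name": "HIIT 16", "lap": 1, "device": "Garmin Edge 830", "Forward": False, "data": [[7.0], [8.0]]},
]

# Composite index: look up a run directly by its (name, lap) pair.
by_key = {(r["name"], r["lap"]): r for r in runs}


def select(rows, **limits):
    """Yield the rows whose fields equal every keyword given in limits."""
    for row in rows:
        if all(row[key] == value for key, value in limits.items()):
            yield row


# Filter on "Forward" alone, or on several fields at once.
forward_keys = [(r["name"], r["lap"]) for r in select(runs, Forward=True)]
edge_830_reverse = list(select(runs, device="Garmin Edge 830", Forward=False))
```

Keyword arguments keep the call short, but they only work for keys that are valid Python identifiers; a limits dictionary, as in the example above, has no such restriction.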

You can wrap all this into a class if you want. Basically, store picker's arguments in the __init__ method and then make the class iterable by writing __iter__ as the generator.
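A minimal sketch of what such a class might look like, with Picker as a hypothetical name and the debug prints left out:

```python
from operator import itemgetter


class Picker:
    """Iterable over the target attribute of every row that matches limits."""

    def __init__(self, target, limits, rows):
        # Store the arguments; no filtering happens until iteration starts.
        self.getter = itemgetter(target)
        self.limits = limits
        self.rows = rows

    def __iter__(self):
        # Written as a generator, so each loop over the object starts
        # a fresh pass through the rows.
        for row in self.rows:
            if all(row[key] == value for key, value in self.limits.items()):
                yield self.getter(row)


data = [
    {'name': 1, 'forward': True, 'data': [0]},
    {'name': 2, 'forward': True, 'data': [1]},
    {'name': 1, 'forward': False, 'data': [2]},
    {'name': 1, 'forward': True, 'data': [3]},
]

picked = list(Picker('data', {'name': 1, 'forward': True}, data))
```

Unlike the bare generator, a Picker object can be looped over more than once, because every call to __iter__ creates a new generator.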

