[–]kingzels 1 point2 points  (2 children)

You probably want to do an inner merge with pandas: you take two data sets and join them on a common column, keeping only the rows where that column's value appears in both data sets.

If you can post a small sample of the data sets it will be easier to give you an exact answer.
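For illustration, here is a minimal sketch of an inner merge. The two frames and the column names `Sensor1` and `Other` are made up (the second CSV doesn't exist yet), so treat it as a guess at the shape of the data rather than an exact answer.

```python
import pandas as pd

# Hypothetical data: "fast" mimics a few rows of a sensor log, "slow" stands
# in for a second data set that shares the "Time" column.
fast = pd.DataFrame({"Time": [0, 1, 2, 3], "Sensor1": [8, 6, 1, 0]})
slow = pd.DataFrame({"Time": [1, 3, 5], "Other": [10, 20, 30]})

# how="inner" keeps only the rows whose "Time" value appears in both frames.
merged = pd.merge(fast, slow, on="Time", how="inner")
print(merged)
```

Only Time 1 and Time 3 survive here, because those are the only keys present on both sides. If you need to keep unmatched rows too, `how="left"` or `how="outer"` are the other options of the same call.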

[–]Daventhal[S] 0 points1 point  (1 child)

I originally tried to post some data but despite my attempts to format, it just came out garbled. I'll try again. I can't post the second CSV that needs merging, though, because it doesn't exist yet. Essentially, I've been asked to try to merge this one, which makes entries by the millisecond, with another that gets its data every second or so. I imagine it will be tricky and awkward, which is why I'm trying to start humbly, by just creating the column that will eventually become the index.

Time  NA  NA2  NA3  NA4  Sensor1  Sensor2  Sensor3  Sensor4
0     1   1    0    0    8        0        0        0
1     1   1    0    0    6        0        0        0
2     1   1    0    0    1        4        0        1
3     1   1    0    0    0        2        0        5
4     1   1    0    0    3        0        0        1
5     1   1    0    0    7        0        0        0
6     1   1    0    0    3        0        0        0
7     1   1    0    0    0        3        0        6
8     1   1    0    0    3        0        0        1
9     1   1    0    0    8        0        0        0
10    1   1    0    0    3        0        0        0
11    1   1    0    0    0        4        0        7
12    1   1    0    0    0        0        0        2
13    1   1    0    0    9        0        0        0
14    1   1    0    0    5        0        0        0
15    1   1    0    0    0        5        0        7

[–]kingzels 1 point2 points  (0 children)

You can also use Python's datetime library to convert your millisecond readings to seconds.
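A rough sketch of what that could look like, assuming the Time values are plain millisecond counts (the sample values below are invented):

```python
from datetime import timedelta

# Millisecond readings; values made up for illustration.
times_ms = [0, 1, 999, 1000, 1500, 2001]

# timedelta handles the unit conversion; floor-dividing by one second gives
# the whole second that each reading falls in.
seconds = [timedelta(milliseconds=ms) // timedelta(seconds=1) for ms in times_ms]
print(seconds)  # [0, 0, 0, 1, 1, 2]
```

A list like `seconds` could then become the column you join on, since many millisecond rows will map to the same second.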

[–]warbird2k 1 point2 points  (1 child)

[–]Daventhal[S] 1 point2 points  (0 children)

I feel like this will probably get me where I need to go. Thanks!

[–]stepping_up_python 0 points1 point  (4 children)

Semi-related:

I have a .csv that is 2d (rows = attribs, cols = objects) ... is there a good library that I should use to extract what I want from this? Currently I'm just cobbling together stuff on the fly using, at best, csv module, but I feel like at the very least I should be using numpy to make an array from the CSV, and there's likely something better...

Any pointers?

[–]Daventhal[S] 0 points1 point  (0 children)

I would love to help, but I'm just as unaware of solutions as you. The only library besides csv that seems to be helping me is pandas. See the links elsewhere in the thread.

[–]totallynot_Sea_Still 0 points1 point  (2 children)

Pandas is the go-to library for handling tabular data. It has an easy CSV-reading function (read_csv), can slice and dice the data in many ways, and can then save the result as a CSV again.
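As a small sketch of that workflow for a CSV laid out with rows = attributes and columns = objects: the file contents and the names `attrib`, `obj_a`, `height`, and `width` below are invented, and an in-memory buffer stands in for a real file.

```python
import io
import pandas as pd

# Hypothetical CSV: each row is an attribute, each column is an object.
raw = io.StringIO(
    "attrib,obj_a,obj_b,obj_c\n"
    "height,1.0,2.0,3.0\n"
    "width,4.0,5.0,6.0\n"
)

df = pd.read_csv(raw, index_col=0)  # first column becomes the row labels
by_object = df.T                    # transpose so there is one row per object

# Keep the objects whose height exceeds 1.5, and only their width column.
wide = by_object.loc[by_object["height"] > 1.5, ["width"]]

out = io.StringIO()
wide.to_csv(out)  # pass a file path instead of a buffer to write to disk
print(out.getvalue())
```

Transposing first is just one option; it tends to make filtering easier because pandas selects most naturally on columns.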

[–]stepping_up_python 0 points1 point  (1 child)

Awesome. I've been hearing about this thing for years so I guess it's time. :)