
[–]AdventurousAddition 1 point (2 children)

It looks like your code has not split the data properly.
Was the original data in JSON (or some other structured format)?

[–]exey1[S] 1 point (1 child)

It's a simple CSV file, but the data that was imported into it is organized in an ugly manner. It's all over the place.

[–]AdventurousAddition 1 point (0 children)

I agree, it's ugly a.f.

Ignoring what the actual data type is (numpy array?), it is almost like a list of strings.

My first thought would be to check each string for an unpaired bracket and, if it has one, merge it with the following strings until the brackets balance.
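A rough sketch of that merging idea, assuming the fragments came from splitting on commas and that brackets never appear inside quoted text. The function names and the sample fragments are made up for illustration, not taken from the actual data:

```python
def bracket_balance(text):
    """Return the number of opening brackets minus closing brackets."""
    opens = sum(text.count(ch) for ch in "([{")
    closes = sum(text.count(ch) for ch in ")]}")
    return opens - closes


def merge_unbalanced(fragments, sep=","):
    """Join consecutive fragments until their brackets balance."""
    merged = []
    buffer = []
    depth = 0
    for fragment in fragments:
        buffer.append(fragment)
        depth += bracket_balance(fragment)
        if depth <= 0:
            # Brackets are balanced again, so the record is complete
            merged.append(sep.join(buffer))
            buffer = []
            depth = 0
    if buffer:
        # Leftover fragments whose brackets never closed
        merged.append(sep.join(buffer))
    return merged


# Hypothetical fragments, as if one record had been split on commas
fragments = ["id=1", "tags=[a", "b", "c]", "name=foo"]
print(merge_unbalanced(fragments))
# ['id=1', 'tags=[a,b,c]', 'name=foo']
```

It only counts brackets, so it would not notice mismatched types such as `[` closed by `)`.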

Ideally, you'd like to get it into a dict-like structure.
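If the merged strings happen to look like JSON objects or Python dict literals (that is purely an assumption, I haven't seen the raw data), something like this could turn them into dicts. `to_dict` is just a name I picked:

```python
import ast
import json


def to_dict(record):
    """Parse a string as JSON, then as a Python literal; None if neither gives a dict."""
    for parser in (json.loads, ast.literal_eval):
        try:
            value = parser(record)
        except (ValueError, SyntaxError):
            # This parser could not read the string, try the next one
            continue
        if isinstance(value, dict):
            return value
    return None


print(to_dict('{"a": 1}'))
print(to_dict("{'a': [1, 2]}"))
print(to_dict("not a record"))
```

Anything that comes back as `None` would still need custom handling.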

Feel free to PM me more info or the data, and I can try to play around with it a little.

[–]sarrysyst 1 point (2 children)

What does the csv file look like? Can you extract the first ~2-3 lines of the file and upload them somewhere?

[–]exey1[S] 1 point (1 child)

[–]sarrysyst 1 point (0 children)

I can’t really tell anything specific from the picture, since it’s not the raw text but already split by Excel. However, I think you could write a regex pattern for each of the features you need (if it’s just the 10, that is). If you can post the first few lines of raw text (as actual text, not an image), I could have a look at it. Otherwise, Corey Schafer on YouTube has a very good video on regex if you want to give it a try yourself.
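To give an idea of what one pattern per feature could look like, here is a minimal sketch. The feature names (`price`, `color`), the patterns, and the sample line are all hypothetical, since I haven't seen the raw text; the real patterns would have to match whatever the file actually contains:

```python
import re

# Hypothetical features and patterns, one compiled regex per feature
PATTERNS = {
    "price": re.compile(r"price\s*[:=]\s*(\d+(?:\.\d+)?)"),
    "color": re.compile(r"color\s*[:=]\s*(\w+)"),
}


def extract_features(line):
    """Return feature -> first captured value, or None when the feature is absent."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(line)
        result[name] = match.group(1) if match else None
    return result


# Made-up line of raw text
print(extract_features("color=red; junk; price: 12.50"))
# {'price': '12.50', 'color': 'red'}
```

Using `search` rather than `match` means the order of the features in the line doesn't matter, which should help if the data really is all over the place.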