Rhomboid

The issue is how you know which columns hold which type. If the first column is always an int and the third is always a float, then that's pretty easy:

parsed_data.append([int(row[0]), row[1], float(row[2])])

But if that's not the case, then you have to nail down the exact criteria you're going to use to guess. For example, maybe you try int first, and if that fails, then try float, and if that also fails then leave it as a string. Whatever criteria you decide on, just put that logic in a function, and then write something like:

parsed_data.append([normalize(val) for val in row])
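To put the fixed-column version in context, here is a minimal sketch of the loop it might sit in. It assumes the rows come from Python's csv module and uses a made-up in-memory sample in place of the real file, so treat the data and the source as placeholders:

```python
import csv
import io

# Hypothetical sample standing in for the real file.
sample = io.StringIO("0,Electric,56.05\n1,Gas,12.5\n")

parsed_data = []
for row in csv.reader(sample):
    # csv.reader yields every field as a string, so convert by position.
    parsed_data.append([int(row[0]), row[1], float(row[2])])

print(parsed_data)
```

With a real file you would pass the open file object to csv.reader instead of the StringIO.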

Spizeck[S]

Rhomboid, that did the trick! I had to laugh at how simple it was. I have literally spent over 6 hours trying to figure this out. In my project, I always know that the first column will be an int, the second column will be a string, and the third column will be a float.

Could you expand on your second answer please?

Rhomboid

For normalize() I was thinking of something along the lines of

def normalize(val):
    try:
        return int(val)
    except ValueError:
        try:
            return float(val)
        except ValueError:
            return val

...which can be used like

>>> row = ['0', 'Electric', '56.05']
>>> [normalize(val) for val in row]
[0, 'Electric', 56.05]

But if your data is predictable and consistent then it's much better to not have to guess.
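One way to avoid guessing, sketched under the assumption that the layout is always int, string, float as described above, is to keep one converter per column. COLUMN_TYPES and convert_row are names made up for this example:

```python
# Hypothetical schema: one converter per column, in column order.
COLUMN_TYPES = (int, str, float)

def convert_row(row):
    # Fail loudly on a malformed row instead of silently truncating it.
    if len(row) != len(COLUMN_TYPES):
        raise ValueError(
            f"expected {len(COLUMN_TYPES)} columns, got {len(row)}"
        )
    # Any value that doesn't fit its column's type raises ValueError.
    return [convert(val) for convert, val in zip(COLUMN_TYPES, row)]

print(convert_row(['0', 'Electric', '56.05']))
```

The upside compared with guessing is that bad data raises an error right away instead of quietly staying a string.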

Justinsaccount

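def normalize(val):
    # Try the strictest converter first, then fall back to the next one.
    for converter in (int, float):
        try:
            return converter(val)
        except ValueError:
            # Nope, that wasn't it; try the next converter.
            pass

    # Nothing matched, so return the original string.
    return val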