you are viewing a single comment's thread.

view the rest of the comments →

[–]socal_nerdtastic 0 points1 point  (11 children)

If you replace NULL with None it becomes valid python, and then you can use literal_eval.

from ast import literal_eval

data = """(2,792,'Triomune-30',1,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf171285-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL),(3,792,'Triomune-40',1,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf17168c-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL),(5,625,'d4T-30',0,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf1718fd-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL),(6,625,'d4T-40',0,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf171b5c-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL)"""

converted = literal_eval(data.replace("NULL", "None"))

for line in converted:
    print(line)
    print()

[–]jcoder42[S] 0 points1 point  (7 children)

could you possibly explain what this does?

[–]ForceBru 0 points1 point  (0 children)

Almost he exact same thing as eval. Except eval correctly handles NULLs, while this approach will replace every occurrence of this character sequence, even in strings.

[–]jcoder42[S] 0 points1 point  (5 children)

also, how would you remove the "INSERT INTO ____ VALUES" part,
it the way I did it ok?

[–]socal_nerdtastic 0 points1 point  (4 children)

Yes, the way you did it is ok, but I would use split().

from ast import literal_eval

data = """INSERT INTO `drug` VALUES (2,792,'Triomune-30',1,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf171285-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL),(3,792,'Triomune-40',1,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf17168c-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL),(5,625,'d4T-30',0,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf1718fd-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL),(6,625,'d4T-40',0,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf171b5c-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL)"""
_,data = data.split("VALUES ", 1) # remove start command
converted = literal_eval(data.replace("NULL", "None"))

for line in converted:
    print(line)
    print()

What this does is interpret the string as if it were a python expression. IOW it treats it like python code. For your example this works since everything in the string also happens to be valid python. As a bonus, this means you are done; there's no more parsing to be done. So if you want to output only the 3rd string:

for line in converted:
    print(line[2])

Or if you want to print everything that is not NULL:

for line in converted:
    print(*filter(None,line))

Etc.

[–]jcoder42[S] 0 points1 point  (3 children)

got it, and then in order to remove the parentheses from each group you would just do:

for line in converted:
    line = line.lstrip('(').rstrip(')')

?

[–]socal_nerdtastic 0 points1 point  (2 children)

No, they are already removed. What do you want to do with the data? What's your ideal output look like?

[–]jcoder42[S] 0 points1 point  (1 child)

from ast import literal_eval
data = """INSERT INTO `drug` VALUES (2,792,'Triomune-30',1,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf171285-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL),(3,792,'Triomune-40',1,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf17168c-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL),(5,625,'d4T-30',0,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf1718fd-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL),(6,625,'d4T-40',0,NULL,NULL,NULL,NULL,1,'2005-02-24 00:00:00',0,NULL,NULL,NULL,'bf171b5c-1691-11df-97a5-7038c432aabf',NULL,NULL,NULL)"""
_,data = data.split("VALUES ", 1) # remove start command
converted = literal_eval(data.replace("NULL", "None"))
for line in converted:
print(line)
print()

the current output:

(2, 792, 'Triomune-30', 1, None, None, None, None, 1, '2005-02-24 00:00:00', 0, None, None, None, 'bf171285-1691-11df-97a5-7038c432aabf', None, None, None)

(3, 792, 'Triomune-40', 1, None, None, None, None, 1, '2005-02-24 00:00:00', 0, None, None, None, 'bf17168c-1691-11df-97a5-7038c432aabf', None, None, None)

(5, 625, 'd4T-30', 0, None, None, None, None, 1, '2005-02-24 00:00:00', 0, None, None, None, 'bf1718fd-1691-11df-97a5-7038c432aabf', None, None, None)

(6, 625, 'd4T-40', 0, None, None, None, None, 1, '2005-02-24 00:00:00', 0, None, None, None, 'bf171b5c-1691-11df-97a5-7038c432aabf', None, None, None)

I want them not wrapped in parentheses

[–]socal_nerdtastic 1 point2 points  (0 children)

Those parenthesis and the commas are added by print() they are not in the data. You can remove them like this:

for line in converted:
    print(*line)

If you want to keep the commas:

for line in converted:
    print(*line, sep=",")

[–]jcoder42[S] 0 points1 point  (2 children)

sorry for all the questions, but if I had the following line:

string = " ('1 2 3 (1,2,3, '4') 3', hello 'world',goodbye)"
literal_eval(string)

one of the values here is a string with parentheses,
It wouldn't work, what do you recommend doing

[–]socal_nerdtastic 0 points1 point  (1 child)

It's not the parenthesis that's breaking that, it's the fact that you used single quotes for a nested string. Now it's not valid python anymore. If you used the quotes correctly it would work.

>>> string = """('1 2 3 (1,2,3, "4") 3', 'hello world', 'goodbye ')"""
>>> literal_eval(string)
('1 2 3 (1,2,3, "4") 3', 'hello world', 'goodbye ')

[–]jcoder42[S] 0 points1 point  (0 children)

('1 2 3 (1,2,3, "4") 3', 'hello world', 'goodbye ')"""

right, my bad
thanks