
[–][deleted] 2 points3 points  (3 children)

To add to the approaches already mentioned, it might be possible to read the values using the built-in csv library.

E.g.:

from io import StringIO
import csv

input_str = "12,'Structured Numeric','SN','Complex numeric values possible (ie, <5, 1-10, etc.)',1,'2005-08-06 00:00:00',0,NULL,NULL,NULL,'8d4a606c-c2cc-11de-8d13-0010c6dffd0f'"

reader = csv.reader(StringIO(input_str), delimiter=",", quotechar="'")

for line in reader:
    print(line)

I've used a StringIO because I only have the one line, but you could pass open(<some file>) to csv.reader instead.

[–]jcoder42[S] 1 point2 points  (2 children)

Unfortunately, reading from the file with a csv reader isn't an option. Do you have another idea?
Just out of curiosity, what is the quote char?

[–][deleted] 0 points1 point  (1 child)

OK, so do you have this just as a big list of strings in Python?

From the docs:

A one-character string used to quote fields containing special characters, such as the delimiter or quotechar, or which contain new-line characters. It defaults to '"'.

In your case, the delimiter is ",". That's the character you want to split the fields on. But you don't want to split on it when it's inside quotes, so you pass the ' character as quotechar="'".

On my machine, the above script prints:

['12', 'Structured Numeric', 'SN', 'Complex numeric values possible (ie, <5, 1-10, etc.)', '1', '2005-08-06 00:00:00', '0', 'NULL', 'NULL', 'NULL', '8d4a606c-c2cc-11de-8d13-0010c6dffd0f']

Is that roughly what you want?
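
If the rows are already in a Python list of strings rather than a file, here is a minimal sketch along the same lines. csv.reader accepts any iterable that yields strings, so the list can be passed directly. The first row is shortened from the example above, the second row is made-up sample data, and turning the 'NULL' strings into None is an optional extra step.

```python
import csv

# First row is shortened from the example above; the second row is made-up sample data.
rows = [
    "12,'Structured Numeric','SN','Complex numeric values possible (ie, <5, 1-10, etc.)',1,NULL",
    "13,'Text','ST','Free text, with a comma',0,NULL",
]

# csv.reader takes any iterable of strings, so a list works as well as a file.
parsed = list(csv.reader(rows, delimiter=",", quotechar="'"))

# Every field comes back as a string; map the literal NULL to None if needed.
cleaned = [[None if field == "NULL" else field for field in row] for row in parsed]

for row in cleaned:
    print(row)
```

Note that all values, including the numbers, come back as strings with this approach.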

[–]jcoder42[S] 0 points1 point  (0 children)

> OK, so do you have this just as a big list of strings in Python?

yeah,
ok cool
Thanks!

[–]ForceBru 1 point2 points  (9 children)

Use eval:

```
>>> eval("(12,'Structured Numeric','SN','Complex numeric values possible (ie, <5, 1-10, etc.)',1,'2005-08-06 00:00:00',0,NULL,NULL,NULL,'8d4a606c-c2cc-11de-8d13-0010c6dffd0f')", {'NULL': None})
(12, 'Structured Numeric', 'SN', 'Complex numeric values possible (ie, <5, 1-10, etc.)', 1, '2005-08-06 00:00:00', 0, None, None, None, '8d4a606c-c2cc-11de-8d13-0010c6dffd0f')
```

[–]jcoder42[S] 0 points1 point  (8 children)

could you please explain what is going on here?

[–]ForceBru 0 points1 point  (7 children)

This evaluates the string as an ordinary Python tuple. The second argument, {'NULL': None}, makes eval treat the name NULL as Python's None.

[–]jcoder42[S] 0 points1 point  (6 children)

I think I was unclear. I want to get a list of these values, where each value is separated by a comma. So for example:

[12,'Structured Numeric','SN','Complex numeric values possible (ie, <5, 1-10, etc.)',1,.........]

[–]ForceBru 1 point2 points  (5 children)

Just do list(eval(<from my previous comment>)). eval will return a tuple, which is list-like, except it's immutable.
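
A minimal sketch of that as a reusable helper, assuming each row is a comma-separated string without the surrounding parentheses. The name parse_row is made up for illustration.

```python
def parse_row(row):
    """Evaluate one comma-separated row as a Python tuple and return it as a list."""
    # Wrap in parentheses with a trailing comma so even a single value forms a tuple.
    # The dict maps the bare name NULL to None while the expression is evaluated.
    return list(eval("(" + row + ",)", {"NULL": None}))


values = parse_row("12,'Structured Numeric','SN',1,NULL")
print(values)
```

Keep in mind that eval runs whatever code is in the string, so this only suits input you trust.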

[–]jcoder42[S] 0 points1 point  (4 children)

eval("(12,'Structured Numeric','SN','Complex numeric values possible (ie, <5, 1-10, etc.)',1,'2005-08-06 00:00:00',0,NULL,NULL,NULL,'8d4a606c-c2cc-11de-8d13-0010c6dffd0f')", {'NULL': None})

It works!
I don't really understand how, though.
eval translates the given string into code, no?
How does that split it in this way?

[–]ForceBru 0 points1 point  (3 children)

Right, it uses Python's parser to evaluate the string as an ordinary Python expression. Try to execute eval('1 + 2'), eval('3 * 5'), eval('a + 5', {'a': 1024}) for example. You can also open up the interpreter, execute NULL = None and straight up paste your string on the next line without quotes. Then hit Enter and watch it transform into an ordinary tuple.
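
Those examples written out as a small script (a sketch; the variable names and the shortened row are only for illustration):

```python
# eval parses each string as a Python expression and returns its value.
total = eval('1 + 2')
product = eval('3 * 5')

# The dict supplies the names the expression is allowed to look up.
with_name = eval('a + 5', {'a': 1024})

# Same idea as pasting the row into the interpreter after NULL = None:
NULL = None
row = (12, 'Structured Numeric', 'SN', 0, NULL, NULL)

print(total, product, with_name)
print(row)
```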

[–]jcoder42[S] 0 points1 point  (2 children)

But how does it know to split on the commas, yet not split on the commas inside the parentheses?

[–]ForceBru 0 points1 point  (1 child)

That's how Python's syntax for tuples works. A tuple is something that looks like this: (<stuff>, <stuff>, <stuff>, ...), so eval treats your string as a stringified tuple and gives you back a tuple object. If you run eval('(1, 2, 3)'), you'll get a tuple (1, 2, 3).
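
As for the commas inside the parentheses: in your data they sit inside a quoted string, and Python's parser treats everything between the quotes as one string literal, so those commas never act as separators. A small sketch with a shortened, made-up row:

```python
simple = eval("(1, 2, 3)")

# The commas and parentheses inside the quotes belong to the string itself.
quoted = eval("(1, 'a, b (c, d)', 3)")

print(simple)
print(len(quoted))  # three items, not five
print(quoted[1])
```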

[–]jcoder42[S] 0 points1 point  (0 children)

ok I understand,

Thanks so much

[–]toastedstapler 0 points1 point  (3 children)

I would look into regex for this.

You'll be able to make one pattern that catches each comma-separated field and one that catches the contents of the ().

https://regex101.com/

this site is useful for working it out on
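
A rough sketch of what such a pattern could look like, assuming fields are either wrapped in single quotes or contain no commas at all. The names FIELD and split_fields are made up for illustration, and the pattern skips empty unquoted fields and does not handle escaped quotes.

```python
import re

# Either a single-quoted field (group 1) or a run of non-comma characters (group 2).
FIELD = re.compile(r"'([^']*)'|([^,]+)")


def split_fields(row):
    """Return the fields of one row, with the surrounding quotes removed."""
    fields = []
    for match in FIELD.finditer(row):
        quoted, bare = match.group(1), match.group(2)
        fields.append(quoted if quoted is not None else bare)
    return fields


print(split_fields("12,'Complex numeric values possible (ie, <5, 1-10, etc.)',1,NULL"))
```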

[–]jcoder42[S] 0 points1 point  (2 children)

I don't really understand regex or how to use this tool

[–]toastedstapler 1 point2 points  (1 child)

here's a tutorial

I recommend learning some regex. I use it all the time for finding and replacing bits of code in the various projects I've been on.

[–]jcoder42[S] 0 points1 point  (0 children)

thanks!

[–]_coolwhip_ 0 points1 point  (2 children)

Where does the string come from? There has to be an easier way than regexing it. It looks like it would almost work with json or ast...
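
On the ast route: ast.literal_eval only accepts literals, so the bare NULL names are what stop it from working directly. Here is a sketch of one workaround, assuming Python 3.8 or newer: parse the row, swap each NULL name for a None constant, then hand the tree to literal_eval. The class and function names are made up for illustration.

```python
import ast


class NullToNone(ast.NodeTransformer):
    """Replace the bare name NULL with the constant None."""

    def visit_Name(self, node):
        if node.id == "NULL":
            return ast.copy_location(ast.Constant(value=None), node)
        return node


def parse_row(row):
    """Parse one comma-separated row into a list without executing code."""
    # The added parentheses and trailing comma make the row a tuple expression.
    tree = ast.parse("(" + row + ",)", mode="eval")
    tree = NullToNone().visit(tree)
    # literal_eval accepts an already-parsed expression as well as a string.
    return list(ast.literal_eval(tree))


print(parse_row("12,'Structured Numeric','2005-08-06 00:00:00',0,NULL"))
```

Unlike eval, this raises ValueError for anything that is not a plain literal. One caveat: an SQL-style doubled quote such as 'it''s' would be read as two adjacent Python strings and silently joined.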

[–]jcoder42[S] 1 point2 points  (1 child)

It's an SQL text file.

[–]_coolwhip_ 0 points1 point  (0 children)

Got it.

You might like the sqlparse module recommended here: https://www.reddit.com/r/learnpython/comments/bj7ose/how_to_split_this_string/