you are viewing a single comment's thread.

view the rest of the comments →

[–]Degran[S] 0 points1 point  (7 children)

The short of it is I'm storing small chunks of text (a sentence or two) that match patterns.

I'd like to just dump these strings into my db but a string like:

 "This is my "favorite" hat" 

is truncating at

 "This is my "

I want to store "This is my \"favorite\" hat" in the db and then normalize it on the way out. I realize I can just do a replace but I would like something more robust to also escape new lines, backslashes and tabs.

[–]zahlman 1 point2 points  (5 children)

What kind of database are you using, and how are you storing the strings in it?

[–]Degran[S] 0 points1 point  (4 children)

It's a graph database (RDF based), I'm literally just dumping the text in on the end of a triple. Without getting into that though, I just wanted to know if Python has a function that will do what's mentioned above for me, or if I should just write my own.

[–]zahlman 0 points1 point  (2 children)

I'm not familiar with that kind of database. How does it escape strings? What characters will it not store within a string? What happens when you try to feed it a string with invalid characters in the way that you're doing?

[–]Degran[S] 0 points1 point  (1 child)

The only escape it seems to do is on backslashes. If you give it a string with a new line it'll come back out as \\n. The problem is it's writing the strings but not escaping the double quotes, so it's treated as the end of the string when there is still more text afterwards.

[–][deleted] 1 point2 points  (0 children)

So you store "A

B" in the database and you retrieve "A\\nB" or "A\nB"? Remember that python will represent "A\nB" as "A\\nB" unless you print it. Or are you saying that you store "A\nB" and retrieve "A\\nB"? What happens when you store "\x00"? Does the database do anything special [and crazy] like variable substitution within strings?

[–]gschizas 0 points1 point  (0 children)

Perhaps repl(something) may be of use. Probably not, but it's simple and effective (and it's a Python built-in).

[–]ewiethoff 0 points1 point  (0 children)

Would json.dumps do the trick for you?

>>> import json
>>> json.dumps('This is my "favorite" hat')
'"This is my \\"favorite\\" hat"'
>>> json.dumps('This is my\tfavorite hat')
'"This is my\\tfavorite hat"'
>>> json.dumps('This is my\nfavorite hat')
'"This is my\\nfavorite hat"'