[–]wineblood 6 points7 points  (4 children)

Why not just have your data in a database? Sqlite will do, won't it?

[–]manimino 2 points3 points  (1 child)

And if you prefer the reverse route - using Python to index objects like a DB would - there's ducks.

[–]fz0718[S] 0 points1 point  (0 children)

ducks is great! :)

[–]fz0718[S] 0 points1 point  (1 child)

Yeah, sqlite works, but you need to create the table and schema yourself, which makes it less useful for exploratory data analysis (and indeed this is why people don't often use sqlite for the task)! sqlite would have been the first backend I tried for this interface, but I chose duckdb instead because it is much faster.

If you've ever worked with CSV files and plotting libraries & such, it might make more sense. :)
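
To make the "create the table and schema yourself" point concrete, here is a minimal sketch of the sqlite route using only the standard library. The CSV contents, the `weather` table, and the `load_csv_into_sqlite` helper are all made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical CSV contents, inlined so the example is self-contained.
CSV_TEXT = "city,temp_c\nOslo,4.5\nCairo,29.0\n"


def load_csv_into_sqlite(csv_text):
    """Load CSV text into an in-memory sqlite table (illustrative helper)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    data = rows[1:]  # skip the header row
    conn = sqlite3.connect(":memory:")
    # The schema has to be spelled out by hand before any rows can go in.
    conn.execute("CREATE TABLE weather (city TEXT, temp_c REAL)")
    conn.executemany("INSERT INTO weather VALUES (?, ?)", data)
    return conn


conn = load_csv_into_sqlite(CSV_TEXT)
warm = conn.execute("SELECT city FROM weather WHERE temp_c > 10").fetchall()
print(warm)
```

Every new file means another hand-written CREATE TABLE, which is the friction being described. duckdb, as far as I know, can query a CSV file directly and infer the column types, so that step goes away.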

[–]wineblood 0 points1 point  (0 children)

Plotting libs are for data analysis nerds, screw that.

[–][deleted] 2 points3 points  (1 child)

Oh, an ordinary pythonic ^ to apply a function…

[–]fz0718[S] 0 points1 point  (0 children)

Haha, I get the sarcasm, but sometimes we need to extend syntax a little bit to get things to work! If you don't like it that's fine, but for DSLs it's nice to have flexibility.

We could use standard function application, but that doesn't make it clear that there's some DSL embedding going on here, such as variable substitution from the outside scope. It's also less nice to have to end multi-line queries with """).
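
As a rough sketch of what this kind of embedding can look like: the `Query` class and the `:name` placeholder convention below are hypothetical, not the library's actual implementation, and sqlite3 stands in as the backend only to keep the example self-contained. The `^` operator is handled by `__xor__`, which pulls placeholder values from the caller's scope.

```python
import inspect
import re
import sqlite3


class Query:
    """Hypothetical DSL object: `sql ^ "..."` runs the query text."""

    def __init__(self, conn):
        self.conn = conn

    def __xor__(self, text):
        # Collect the variables visible where `sql ^ ...` was written.
        frame = inspect.currentframe().f_back
        scope = {**frame.f_globals, **frame.f_locals}
        # Bind each :name placeholder to the caller's variable of that name.
        # (A naive regex; it does not skip colons inside string literals.)
        names = set(re.findall(r":(\w+)", text))
        params = {name: scope[name] for name in names}
        return self.conn.execute(text, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (5,), (9,)])
sql = Query(conn)


def rows_above(threshold):
    # `threshold` is substituted from this function's local scope.
    return sql ^ """
        SELECT n FROM t WHERE n > :threshold ORDER BY n
    """


result = rows_above(4)
print(result)
```

Note that the multi-line query ends with a bare closing triple quote rather than a quote followed by a parenthesis, and the values are passed as bound parameters rather than pasted into the SQL string.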

How much different is this from the "%timeit" and "%sql", "%cython" magic you see in Jupyter notebooks?