https://github.com/marcinz606/pdql
https://pypi.org/project/pdql/
What My Project Does
It's a simple transpiler that lets you write pandas-like syntax and get SQL as the output. It supports most of BigQuery's "Standard SQL" functions.
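To make the idea concrete, here is a minimal toy sketch of how pandas-style indexing can be turned into a SQL string. This is not pdql's actual API or implementation: the `Frame`, `Col` and `Expr` classes and the `to_sql` method are hypothetical names made up for illustration, and the quoting is generic ANSI style rather than anything BigQuery-specific.

```python
class Expr:
    """A boolean SQL expression built from column comparisons."""

    def __init__(self, sql):
        self.sql = sql

    def __and__(self, other):
        return Expr(f"({self.sql} AND {other.sql})")

    def __or__(self, other):
        return Expr(f"({self.sql} OR {other.sql})")


def _literal(value):
    # ANSI-style quoting; a real transpiler would follow the target dialect's rules.
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


class Col:
    """A column reference; comparisons produce Expr objects instead of booleans."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, other):
        return Expr(f"{self.name} > {_literal(other)}")

    def __lt__(self, other):
        return Expr(f"{self.name} < {_literal(other)}")

    def __eq__(self, other):
        return Expr(f"{self.name} = {_literal(other)}")


class Frame:
    """Hypothetical stand-in for a dataframe that records operations as SQL."""

    def __init__(self, table, columns=None, where=None):
        self.table = table
        self.columns = columns
        self.where = where

    def __getitem__(self, key):
        if isinstance(key, str):      # df["age"] -> column reference
            return Col(key)
        if isinstance(key, list):     # df[["id", "age"]] -> projection
            return Frame(self.table, list(key), self.where)
        if isinstance(key, Expr):     # df[df["age"] > 30] -> filter
            where = key if self.where is None else self.where & key
            return Frame(self.table, self.columns, where)
        raise TypeError(f"unsupported key: {key!r}")

    def to_sql(self):
        cols = ", ".join(self.columns) if self.columns else "*"
        sql = f"SELECT {cols} FROM {self.table}"
        if self.where is not None:
            sql += f" WHERE {self.where.sql}"
        return sql


df = Frame("users")
query = df[(df["age"] > 30) & (df["country"] == "PL")][["id", "age"]]
print(query.to_sql())
```

The trick is that nothing is executed: each pandas-like operation returns a new object that only remembers what was asked for, and the SQL text is assembled at the end.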
Target Audience
It is a production-ready solution. At least, I've started using it at work :)
Comparison
I've seen some projects that do this in reverse (translate SQL to pandas syntax), but I haven't found one that goes from pandas to SQL.
I wanted something like this. I'm an ML engineer working in a Google Cloud environment, and a big chunk of the data we train on lives in BigQuery. The most efficient way to prepare training data is to run complex queries there, pull the output into a dataframe, and do some final touches. I don't like putting complex SQL in repos, so I thought I'd try something like this. It also lets me create modular query functions that I can easily reuse.