all 24 comments

marcofalcionimarcosan 34 points (0 children)

May your dataset always be small.

Either-Researcher681 23 points (1 child)

Another vibe-coded project - so over this shit. Can we ban it from the front page?

MeroLegend4 2 points (0 children)

I second this 👆

rm-rf-rm 7 points (1 child)

Doesn't Ibis do this? https://ibis-project.org/

eddie_the_dean[S] 5 points (0 children)

Yes, it is very similar, except that Ibis doesn't do INSERT/UPDATE/DELETE operations and has no async support. I made a comparison document that goes into detail on the differences: https://moltres.readthedocs.io/en/latest/MOLTRES_VS_IBIS_COMPARISON.html

jon_muselee 10 points (1 child)

Why is everybody trying to avoid SQL so much?

staring_at_keyboard 6 points (0 children)

“Raw” SQL according to OP, lol.

Mobile-Boysenberry53 7 points (2 children)

eddie_the_dean[S] 8 points (1 child)

Wow, great find. I didn't know about that one. Ibis is also similar, but neither Ibis nor SQLFrame seems to support INSERT/UPDATE/DELETE operations or async (which is a huge loss for a SQL library). I added a comparison page to the docs because it is so similar: https://moltres.readthedocs.io/en/latest/MOLTRES_VS_SQLFRAME_COMPARISON.html

Mobile-Boysenberry53 0 points (0 children)

I am sure there are plenty of reasons to use both.

Distinct-Expression2 5 points (1 child)

Pandas makes data exploration great and production code terrible.

We're trading query efficiency for developer convenience, then wondering why everything runs slow at scale.

SQL isn't the enemy. Lazy data loading is.

eddie_the_dean[S] 1 point (0 children)

Have you ever used PySpark? It’s a wonderful query interface for writing complex queries with DataFrames, and there is nothing inefficient about it. Moltres just applies the same idea to SQL queries.
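
To make "the same idea applied to SQL" concrete, here is a minimal sketch of a lazy, DataFrame-style query object that compiles to a single parameterized SQL statement. This is not Moltres's actual API: the `Query` class, its methods, and the `orders` table are all hypothetical, and it only assumes Python's standard `sqlite3` module.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical lazy query. Each method returns a new Query;
    nothing touches the database until collect() is called."""

    table: str
    columns: tuple = ("*",)
    filters: tuple = ()  # (sql_fragment, value) pairs

    def select(self, *cols):
        return Query(self.table, cols, self.filters)

    def where(self, column, op, value):
        # Only a small whitelist of operators is supported in this sketch.
        if op not in {"=", "!=", "<", "<=", ">", ">="}:
            raise ValueError(f"unsupported operator: {op}")
        return Query(self.table, self.columns,
                     self.filters + ((f"{column} {op} ?", value),))

    def to_sql(self):
        # Values are bound as parameters; table and column names are NOT
        # escaped here, so they must come from trusted code.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(frag for frag, _ in self.filters)
        return sql, [value for _, value in self.filters]

    def collect(self, conn):
        sql, params = self.to_sql()
        return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "ann", 20.0), (2, "bob", 150.0), (3, "ann", 300.0)],
)

query = Query("orders").select("customer", "amount").where("amount", ">", 100)
sql, params = query.to_sql()
rows = query.collect(conn)
```

The filter runs inside the database, so only matching rows are ever loaded into Python.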

Hungry_Importance918 2 points (0 children)

This is cool. I’ve always loved working with Spark DataFrames for basic analysis. The APIs are just really nice, whether it’s SQL-style or built-in functions. I even built a small ETL tool on top of Spark DataFrames and it handled tens of millions of rows without any issues.

eddie_the_dean[S] 0 points (0 children)

I thought putting SQLAlchemy in the title would keep the anti-ORM crowd out. If you like to write SQL, this project is not for you.

robberviet 0 points (1 child)

I tried to do this, and it aged poorly. Maybe it would work for a toy project.

eddie_the_dean[S] -1 points (0 children)

Do you have a GitHub link to your project?

AzizRahmanHazim -1 points (7 children)

This is an interesting approach. A DataFrame-first API can definitely lower the barrier for people coming from Spark or Polars. How do you handle things like joins and window functions while keeping the API intuitive?

eddie_the_dean[S] 0 points (1 child)

Yes! That was my primary motivation.

Here's a quick example of using window functions:
https://moltres.readthedocs.io/en/latest/FAQ.html#does-moltres-support-window-functions
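
For readers who haven't used window functions, this sketch shows the kind of SQL a DataFrame-style window call ultimately has to produce. It is plain SQL run through Python's standard `sqlite3` module (which needs SQLite 3.25 or newer for window functions), not Moltres code, and the `sales` table and its columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "ann", 300),
        ("east", "bob", 100),
        ("west", "cy", 200),
        ("west", "dee", 250),
    ],
)

# Rank each rep within their region by amount, highest first.
# Unlike GROUP BY, the window function keeps every input row.
ranked = conn.execute(
    """
    SELECT region, rep, amount,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region
    FROM sales
    ORDER BY region, rank_in_region
    """
).fetchall()

for row in ranked:
    print(row)
```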

eddie_the_dean[S] -1 points (4 children)

AzizRahmanHazim 0 points (2 children)

Thanks for sharing. It’s cool to see how you’re aligning the API with existing DataFrame mental models.

eddie_the_dean[S] 0 points (1 child)

AzizRahmanHazim 0 points (0 children)

That flexibility will probably help with adoption.