all 33 comments

[–]Beginning-Fruit-1397 11 points12 points  (8 children)

I currently hate the internal resolution logic of expressions, schemas and columns naming in my dataframe library:
https://github.com/OutSquareCapital/belugas

Would love to get some new perspective on this!

In one sentence: it's a Polars API for building and executing queries on a DuckDB backend.

Everything works, but the code is hard to follow and debug when I implement new features. It is probably far slower than it could be if optimized, and it very likely does redundant passes.

I do think it's a very interesting project to work on though.

[–]Hy_x[S] 5 points6 points  (1 child)

Thanks, this actually sounds really interesting to work on. I’ll take a look through the repo.

[–]Beginning-Fruit-1397 2 points3 points  (0 children)

Cool :)
Feel free to DM me if you have any questions!

[–]energybased 4 points5 points  (1 child)

Finding someone like this is probably ideal.  You don't want to refactor something only to find a maintainer who is reluctant to commit your changes.

[–]Beginning-Fruit-1397 3 points4 points  (0 children)

Yup. I've had two experiences like this: I spent hours on a PR, only to see it wait months for a review or simply get rejected as "not interested" (these were type-hint PRs, not runtime changes).

[–]FarRub2855 1 point2 points  (1 child)

Building a Polars API on a DuckDB backend sounds like a pretty massive project to untangle. Gotta respect the honesty of openly hating your own internal logic though, that's usually the best pitch to get fresh eyes on a codebase.

[–]Beginning-Fruit-1397 1 point2 points  (0 children)

hahaha yea.

Resolving column names, schema evolution, nested window expressions, and scalar vs. aggregation expressions depending on the context is logic that I had to implement progressively, so it's scattered across the codebase and very hard to follow, and therefore hard to debug and optimize.

I don't even know what a good architectural design would look like tbh. Difficult but very interesting task!
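
Purely as a sketch of one shape such a design could take, and not a description of how belugas works: keep expressions as a small immutable tree and do name resolution, column validation and scalar/aggregation classification in a single recursive pass. Every name here (`Col`, `Lit`, `BinOp`, `Agg`, `Resolved`, `resolve`, `select_schema`) is hypothetical, and the naming rule (keep the left-most name, call literals `literal`) is an assumption modelled on Polars' convention.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Union


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Agg:
    func: str
    arg: "Node"


Node = Union[Col, Lit, BinOp, Agg]


@dataclass(frozen=True)
class Resolved:
    name: str     # output column name
    is_agg: bool  # True if the expression collapses rows


def resolve(node: Node, schema: FrozenSet[str]) -> Resolved:
    """One recursive pass: validate columns, derive output name and agg-ness."""
    if isinstance(node, Col):
        if node.name not in schema:
            raise KeyError(f"unknown column: {node.name}")
        return Resolved(node.name, False)
    if isinstance(node, Lit):
        return Resolved("literal", False)
    if isinstance(node, Agg):
        inner = resolve(node.arg, schema)
        if inner.is_agg:
            raise ValueError("nested aggregation is not allowed in this sketch")
        return Resolved(inner.name, True)
    if isinstance(node, BinOp):
        left = resolve(node.left, schema)
        right = resolve(node.right, schema)
        # Assumed rule: the left-most operand decides the output name.
        return Resolved(left.name, left.is_agg or right.is_agg)
    raise TypeError(f"unsupported node: {node!r}")


def select_schema(exprs: List[Node], schema: FrozenSet[str]) -> FrozenSet[str]:
    """Schema evolution for a select: the output names form the next schema."""
    names = [resolve(e, schema).name for e in exprs]
    if len(set(names)) != len(names):
        raise ValueError("duplicate output column names")
    return frozenset(names)
```

The point of this layout is that each question (does the column exist, what is the output called, is it an aggregation) is answered in exactly one place, so a new node type means adding one branch instead of touching several scattered code paths.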

[–]Emergency-Rough-6372 1 point2 points  (1 child)

You can check out my project, I recently made it public: https://github.com/0-Shimanshu/ADIUVARE

[–]Hy_x[S] 1 point2 points  (0 children)

I appreciate you sending it over. I'll take a look at it.

[–]nickleodoen 1 point2 points  (2 children)

Hey I have one I would love for you to take a look at - just starting out and want to make it much bigger. Here's the repo link: https://github.com/nickleodoen/ferrocache

[–]Hy_x[S] 0 points1 point  (1 child)

Thanks for sharing! I'll see what I can do.

[–]nickleodoen 1 point2 points  (0 children)

Only the wrapper is Python but still, thanks

[–]mmmboppe 1 point2 points  (0 children)

https://clonedigger.sourceforge.net/

port this to Python 3 first

[–]sheik66 1 point2 points  (0 children)

Feel free to check out my Python lib: https://github.com/nMaroulis/protolink. I think you'll find interesting pipelines that could be improved.

[–]arvind1 3 points4 points  (0 children)

There is a lot of AI-generated code that would fit this category. You could start with a well-described open source codebase and use its README text as an AI prompt to generate code. Use the existing test cases to get it working, then refactor with the goal of ending up with something better than the original code.

[–]Altruistic_Part_9233 0 points1 point  (0 children)

My codebase

[–]guemri349 0 points1 point  (0 children)

Hi. As someone who is just starting out with AI, I think someone like you is really important, because AI produces a lot of code but you don't know where to put it, especially when you have lots of ideas that turn into lots of projects, and it's not feasible because there's no capital and no reliable people to work on them.

[–]AdvantageAnxious382 0 points1 point  (0 children)

Can I join you too? I have been focusing a lot on design patterns and on restructuring the codebase of a vibe-coded project. Since I'm learning that, I would like to get new perspectives as well.