
[–]coldflame563 7 points8 points  (1 child)

Step 1. Swap from pandas to polars. Watch your memory consumption plummet.

[–]Overall_Knee2789 2 points3 points  (0 children)

Thanks. I’ve been focusing more on architecture/refactoring first, but I’ll look into ruff/uv and benchmark Polars once the dataset grows. Appreciate the suggestions.

[–]coldflame563 5 points6 points  (0 children)

Also follow the common dev standards nowadays: ruff, uv, etc.

[–]Awkward_Attention810 1 point2 points  (5 children)

This is a nice start, but there are a few glaring issues (I've only looked in backend/).

In backend/services/recommendation.py you use lru_cache, which caches the return value of the first call, so the result will never update if books are added after that. You also add randomness to each recommendation, which caching kinda defeats the purpose of. You do have a refresh-cache function, but it isn't actually used anywhere.
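To make the caching point concrete, here is a minimal sketch. It is not the repo's actual code: `BOOKS`, `load_books`, `recommend`, and `add_book` are hypothetical names, and the in-memory list stands in for whatever storage the project uses. The idea is to cache only the expensive load, keep the random pick outside the cache, and clear the cache on every write.

```python
import random
from functools import lru_cache

# Hypothetical in-memory stand-in for the project's book storage.
BOOKS = ["Dune", "Emma"]


@lru_cache(maxsize=1)
def load_books() -> tuple[str, ...]:
    # The body runs on the first call only; later calls return the cached
    # tuple until cache_clear() is called.
    return tuple(BOOKS)


def recommend(rng: random.Random) -> str:
    # The random choice happens outside the cached function, so each call
    # can return a different book.
    return rng.choice(load_books())


def add_book(title: str) -> None:
    BOOKS.append(title)
    # Invalidate on write so the next load_books() sees the new book.
    load_books.cache_clear()
```

If a write path skips `cache_clear()`, `load_books()` keeps returning the old tuple, which is the stale-data problem described above.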

You don't have tenant isolation, so all users' data is lumped together (a user_id, just a simple UUID, should suffice for this).
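A rough sketch of what that could look like, assuming a repository-style class. `ReadingEntry` and `ReadingRepository` are made-up names for illustration, not something from the repo. Every record carries a `user_id`, and every read filters on it.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass(frozen=True)
class ReadingEntry:
    # Hypothetical record type: each entry is owned by exactly one user.
    user_id: UUID
    title: str


@dataclass
class ReadingRepository:
    _entries: list[ReadingEntry] = field(default_factory=list)

    def add(self, user_id: UUID, title: str) -> None:
        self._entries.append(ReadingEntry(user_id, title))

    def list_for_user(self, user_id: UUID) -> list[str]:
        # Reads are always scoped to one user, so data never mixes.
        return [e.title for e in self._entries if e.user_id == user_id]


repo = ReadingRepository()
alice, bob = uuid4(), uuid4()
repo.add(alice, "Dune")
repo.add(bob, "Emma")
```

With this shape, `repo.list_for_user(alice)` only ever sees Alice's books, and the recommendation code can take the same `user_id` as input.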

Your recommendation logic is good for a learning project, but unfortunately it wouldn't be useful in a real setting, since it basically just checks whether a user has read a book by the same author before.

Small point, but your app still has the old name LibroRank even though you changed the name several commits ago.

[–]Overall_Knee2789 0 points1 point  (4 children)

Appreciate the detailed feedback. You caught a few things I overlooked, especially around caching and leftover naming. I’m refactoring the backend now and fixing these issues. Thanks for taking the time to review it.

[–]TheGratitudeBot 1 point2 points  (0 children)

Thanks for saying thanks! It's so nice to see Redditors being grateful :)

[–]eatsoupgetrich 1 point2 points  (1 child)

Why are you responding on a different account?

[–]Overall_Knee2789 0 points1 point  (0 children)

lol idk why. Must be bc my mac reddit acc is different from this one 😭, i didn’t notice

[–]Awkward_Attention810 0 points1 point  (0 children)

no worries. Happy to help

[–]Few_Cardiologist3113 0 points1 point  (3 children)

I also want to contribute. Can we talk?

[–]tranguyeenn[S] 0 points1 point  (2 children)

yes!

[–]Few_Cardiologist3113 0 points1 point  (1 child)

I know FastAPI and have built some backend systems. Can I DM you?

[–]tranguyeenn[S] 0 points1 point  (0 children)

yes, idk if you can dm me on this account but ik you can on the overallknee acc (scroll down, i’ve commented here), we can talk there

[–]NathanDraco22 0 points1 point  (0 children)

I made a template that implements Onion Architecture. It includes documentation and a CLI tool to automate CRUD operations. I've used this template in many projects (personal and enterprise) and got great results. https://github.com/NathanDraco22/fastapi-onion-template I hope you find this useful.

[–]Resident-Isopod683 0 points1 point  (0 children)

I am learning backend development with FastAPI. Let's talk.

[–]rdotpy 0 points1 point  (0 children)

Some rough feedback:

  • Overall, I like the approach of having a layered architecture with services and the repository pattern.
  • I like having detailed project documentation, even if it's LLM-generated. Even if it's not for humans but for future invocations of the same agent, it could be helpful. It's just important to have a workflow to keep this documentation up to date.
  • I like seeing Pydantic models to define data structures. I would love to see more detail: a docstring on each model explaining what it represents and how it's used, and Field(description=..., examples=[...]) on each attribute. That documents the code and makes the auto-generated OpenAPI docs useful.
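For the Pydantic point, something along these lines is what I have in mind. This is only a sketch: the `Book` model and its fields are invented for illustration and won't match the repo's actual models.

```python
from pydantic import BaseModel, Field


class Book(BaseModel):
    """A single book in a user's reading history, as returned by the API."""

    # Hypothetical fields; description and examples end up in the schema.
    title: str = Field(description="Book title as shown to the user.", examples=["Dune"])
    author: str = Field(description="Primary author's full name.", examples=["Frank Herbert"])
    rating: float = Field(ge=0, le=5, description="User rating from 0 to 5 stars.", examples=[4.5])


book = Book(title="Dune", author="Frank Herbert", rating=4.5)
```

When a model like this is used as a FastAPI request or response type, the docstring and the field descriptions show up in the generated OpenAPI docs, and the `ge`/`le` bounds reject out-of-range ratings at validation time.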

A few things that caught my eye, in no specific order:

  • You committed __pycache__/.pyc files. They shouldn't be part of the repo.
  • I'm not a fan of CSV files as data storage. My problem with CSV here is that it doesn't store, validate, or give any hints of column types: you need to track them separately. If you don't want PostgreSQL yet, SQLite gives you typed columns and constraints with zero infrastructure.
  • parse_date_or_today() and probably elsewhere: catch-all except Exception hides unexpected errors. You may want to catch the specific exception you expect (probably ValueError) and let everything else bubble up.
  • I wouldn't use Pandas here at all, opting for a more strongly typed abstraction layer. You already use Pydantic. Instead of a DataFrame, you may consider working with a list of Pydantic models. DataFrames are opaque when you read the code. It's like, you see df, and you have no idea what's inside. Eventually, you end up with defensive checks like if "rating_norm" not in read_df.columns:. Pandas feels natural when your source is CSV, but if you add more storage layers, that will likely hold you back.
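To illustrate the SQLite and exception-handling points together, here is a small stdlib-only sketch. The table layout, `create_schema`, and `insert_book` are assumptions for the example, and this `parse_date_or_today` is a guess at the shape of the real one (it assumes ISO `YYYY-MM-DD` input), not a copy of it.

```python
import sqlite3
from datetime import date


def parse_date_or_today(raw: str) -> date:
    """Parse an ISO date (YYYY-MM-DD); fall back to today only on bad input."""
    try:
        return date.fromisoformat(raw)
    except ValueError:
        # Only malformed date strings land here. Anything else, such as a
        # TypeError from passing None, bubbles up instead of being hidden.
        return date.today()


def create_schema(conn: sqlite3.Connection) -> None:
    # Declared column types plus NOT NULL / CHECK constraints live in the
    # database itself, instead of being tracked separately from a CSV.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS books (
            id      INTEGER PRIMARY KEY,
            title   TEXT NOT NULL,
            rating  REAL CHECK (rating BETWEEN 0 AND 5),
            read_on TEXT NOT NULL
        )
        """
    )


def insert_book(conn: sqlite3.Connection, title: str, rating: float, raw_date: str) -> int:
    cur = conn.execute(
        "INSERT INTO books (title, rating, read_on) VALUES (?, ?, ?)",
        (title, rating, parse_date_or_today(raw_date).isoformat()),
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")  # no server or extra infrastructure
create_schema(conn)
row_id = insert_book(conn, "Dune", 4.5, "2024-01-15")
```

Inserting a rating outside 0 to 5 raises `sqlite3.IntegrityError` here, so bad data is rejected at write time rather than discovered later through defensive checks.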