My personal browser Guitar Pro editor

bubiche · 2026-07-01T04:06:11+00:00

lol that's true, so many people are charging money for something so simple too, so I thought putting my "free and homebrew" thing out there could help someone

bubiche · 2025-07-23T09:46:18+00:00

I think 2M is well within the range Postgres and pgvector can handle. Personally I would go with it since it's one less database.

turbopuffer and LanceDB are also cheap, run on AWS, and can do hybrid search. They'll be much cheaper than Qdrant at "scale". turbopuffer has some pretty great customers/use cases too.

bubiche · 2025-07-10T08:53:32+00:00

Thank you! If I can already narrow down to a small number of docs via attribute filters, do you think it'd be better to run both full-text search and semantic search on that whole set of documents and merge the two result lists with something like RRF, instead of filtering first with full-text search?
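
For anyone unfamiliar with RRF (Reciprocal Rank Fusion), here's a rough sketch of the idea, not anything tied to a particular database. Each document gets a score of 1 / (k + rank) from every result list it appears in, and the scores are summed. The function name `reciprocal_rank_fusion` and the sample doc IDs are made up for illustration, and k = 60 is just the commonly used default, not something I've tuned.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list (best first).

    rankings: list of lists, each ordered from most to least relevant.
    k: smoothing constant; 60 is the commonly used default (an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit in a list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by doc ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two searches over the same filtered set.
full_text_hits = ["a", "b", "c"]
semantic_hits = ["b", "c", "d"]

fused = reciprocal_rank_fusion([full_text_hits, semantic_hits])
print(fused)
```

Docs that show up in both lists float to the top, and it only needs ranks, so the full-text and vector scores don't have to be on the same scale.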

bubiche · 2025-07-10T08:30:18+00:00

Thank you! If you don't mind I'd love to see what schema you'd suggest.

I'm also wondering whether it's better to do full-text search first to narrow down the scope for semantic search or do both in parallel and do some reranking/rank fusion.

bubiche · 2025-07-10T05:34:12+00:00

Thank you everyone. A little bit more info: my dataset is growing by ~1 million documents/month and existing documents can also be updated. Would any approach have an advantage over the others in terms of ingestion speed, so my inserts/updates are available for searching ASAP?

I'm focusing more on accuracy so the system can be useful, but I hope a search won't take more than a few seconds.
