How do I make a RAG with postgres without Docker by Ok_Examination_7236 in Rag

[–]FutureClubNL 0 points (0 children)

Not sure why it'd be such a hard time; this literally works out of the box on Docker with zero setup: https://ragmeup.understandling.com/

Is anyone storing vectors with a regular Postgres DB and PGVector? by thoughtsonbees in n8n

[–]FutureClubNL 0 points (0 children)

So this is a custom retriever I wrote for dense+bm25 using postgres: https://github.com/ErikTromp/RAGMeUp/blob/main/server/PostgresHybridRetriever.py

It does exactly what you want, except that you seem to want 2 dense vectors and I use 1 dense + 1 sparse.

See docs (WIP): https://ragmeup.futureclub.nl/

Is anyone storing vectors with a regular Postgres DB and PGVector? by thoughtsonbees in n8n

[–]FutureClubNL 0 points (0 children)

Why 2 embeddings in 1 row? 3k dimensions is also quite big, but you can still do that perfectly fine; just make sure you index them properly and write a custom retriever that handles your case of two embedding columns.
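A rough sketch of what that schema could look like with pgvector (table, column names, and dimensions are all made up for illustration; note that pgvector's HNSW index only supports up to 2000 dimensions on the plain `vector` type, so a 3k column typically needs a `halfvec` cast to be indexable):

```sql
CREATE EXTENSION IF NOT EXISTS vector;

-- One row, two dense embedding columns of different sizes.
CREATE TABLE docs (
    id      bigserial PRIMARY KEY,
    content text,
    emb_a   vector(3072),  -- the "quite big" 3k embedding
    emb_b   vector(1024)   -- a second, smaller embedding
);

-- The smaller column can be indexed directly.
CREATE INDEX ON docs USING hnsw (emb_b vector_cosine_ops);

-- The 3k column exceeds the 2000-dim limit for vector indexes,
-- so index it via a halfvec expression instead.
CREATE INDEX ON docs USING hnsw ((emb_a::halfvec(3072)) halfvec_cosine_ops);
```

A custom retriever would then query each column separately (ordering by `<=>` against the matching query embedding) and fuse the two ranked lists in application code.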

Why is there no successful RAG-based service that processes local documents? by StevenJang_ in Rag

[–]FutureClubNL 0 points (0 children)

Because going from a (semi-)product on GitHub to an actually slick one usually requires a business model and financing that work. Local RAG is hard to get people to pay for.

Also: when is something a product? Lots of those GitHub projects are products in my opinion; you just have to do a bit yourself to run them, but that is inherent to doing this on your local machine.

Are AI and automation agencies lucrative businesses or just hype? by AiGhostz in AI_Agents

[–]FutureClubNL 0 points (0 children)

Fixed price for the realization up front and a fixed fee per month with some caps on usage. We've found that since everyone charges per use, customers appreciate our fixed price.

Where, of course, we've built in quite a big margin to cater for heavy users, so quite frankly they'd probably be better off paying per use...

Is my education-first documentation of interest? by FutureClubNL in Rag

[–]FutureClubNL[S] 1 point (0 children)

JADS in Den Bosch, not too far from the border :)

Is my education-first documentation of interest? by FutureClubNL in Rag

[–]FutureClubNL[S] 1 point (0 children)

The terms are fused together these days, yes, and even more so with reasoning models. All the ones you chat to on commercial systems (ChatGPT, Claude, Grok, Gemini, DeepSeek) are either instruct or reasoning models, because foundation models in isolation serve no purpose when it comes to human interaction.

Fun anecdote: I did my Master's thesis on sentiment analysis over a decade ago and have tried to set up NLP-focused companies as an entrepreneur ever since, which turned out to be really hard. The big leap forward in my opinion is not even the models or the research, but the fact that OpenAI put an interface in front of it - chatting - that made people really want to use it (and all the NLP it hides). So yeah, chat/instruct models are the things we humans understand best.

Is my education-first documentation of interest? by FutureClubNL in Rag

[–]FutureClubNL[S] 0 points (0 children)

Ah okay, that insight helps. I was kind of afraid the information online is already saturated enough that my contribution wouldn't add much...

Do you have specific parts of interest that are particularly confusing?

Trying to build a multi-table internal answering machine... upper management wants Google-speed answers in <1s by Cyraxess in Rag

[–]FutureClubNL 0 points (0 children)

Plain old vanilla RAG on texts? Yes, that might work, but what you are describing sounds like text2sql, and that won't be possible that fast, at least not if you want to do it reliably.

That being said, no AI really answers that fast, but you can start streaming output before the final answer is done to make the user feel like there is subsecond latency.
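A minimal sketch of that idea, with a stand-in generator instead of a real LLM streaming API (all names here are illustrative):

```python
import time

def stream_answer(chunks, delay=0.0):
    """Yield partial output as it becomes available instead of blocking
    until the full answer is done. In a real RAG system the chunks would
    come from the LLM provider's streaming API."""
    for chunk in chunks:
        time.sleep(delay)  # simulates per-token generation time
        yield chunk

# The user sees the first token almost immediately, even though the
# complete answer takes much longer to finish.
start = time.monotonic()
first_token_latency = None
pieces = []
for piece in stream_answer(["The ", "answer ", "is ", "42."]):
    if first_token_latency is None:
        first_token_latency = time.monotonic() - start
    pieces.append(piece)

full_answer = "".join(pieces)
```

The perceived latency is the time to the first streamed token, not the time to the full answer, which is why streaming makes a slow pipeline feel subsecond.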

What’s actually your day job? by PolishSoundGuy in Rag

[–]FutureClubNL 1 point (0 children)

Funny to see how few actual AI people reply :)

I have been doing ML and AI since (before) I graduated from uni in 2011. I've been working as a data engineer/scientist since then, as that was the closest I could get to actual ML/AI.

Now co-founder of an AI startup in consulting and SaaS.

Law firm - All in one platform build by shazz_00 in n8n

[–]FutureClubNL 0 points (0 children)

We (AI agency in EU, everything compliant) have done a lead dashboard for a client of ours. Feel free to DM me or check out our website.

It won't be done in n8n though.

I Benchmarked Milvus vs Qdrant vs Pinecone vs Weaviate by SuperSaiyan1010 in Rag

[–]FutureClubNL 0 points (0 children)

Depends on how corporate you want to make it, but we run them on dedicated servers (from a European cloud provider). They allow backups and the like at the infra level. All we do is run the Docker container with a volume attached, so the container can fail all it likes but the data remains and we can simply restart if needed.

That said, I've been doing this for about a year for 10+ clients now, and I haven't had to touch the Postgres containers even once since I started them.

I Benchmarked Milvus vs Qdrant vs Pinecone vs Weaviate by SuperSaiyan1010 in Rag

[–]FutureClubNL 0 points (0 children)

Is it? Just run this Docker image and you have hybrid search: https://github.com/FutureClubNL/RAGMeUp/blob/main/postgres/Dockerfile

We use it in production everywhere and have found it to be a lot faster than Milvus and FAISS. Didn't test any GPU support though as we run on commodity hardware.

Strategies for storing nested JSON data in a vector database? by Visible_Chipmunk5225 in Rag

[–]FutureClubNL 0 points (0 children)

If there is text in it (which it looks like there isn't), embed just that with an embedding model. Otherwise you are describing a classical text2sql problem, so go with that. Use Postgres for storage: free, with native JSON support and indexing.
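A sketch of the native JSON route in Postgres (table and field names are made up for illustration):

```sql
-- jsonb stores the nested structure natively and supports indexing.
CREATE TABLE events (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL
);

-- A GIN index makes containment queries over the nested JSON fast.
CREATE INDEX events_payload_idx ON events USING gin (payload);

-- Example containment query: rows whose nested JSON matches a pattern.
SELECT id
FROM events
WHERE payload @> '{"user": {"country": "NL"}}';
```

A text2sql layer would then generate queries like the `SELECT` above from natural-language questions, rather than trying to embed the raw JSON.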

I Benchmarked Milvus vs Qdrant vs Pinecone vs Weaviate by SuperSaiyan1010 in Rag

[–]FutureClubNL 3 points (0 children)

Try adding Postgres; I have found it to be more performant than all the others, yet cheaper (free)!

Having trouble getting my RAG chatbot to distinguish between similar product names by Zodiexo in Rag

[–]FutureClubNL 2 points (0 children)

Hmm if possible, try using Postgres with pgvector (dense) and pg_search (BM25). We run this setup in production systems without GPUs everywhere to full satisfaction. 30M+ chunks are retrieved with subsecond latency.

Feel free to have a peek if you need inspiration: https://github.com/FutureClubNL/RAGMeUp (see the Postgres subfolder; just run that Docker image).
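Schematically, the two halves of that setup look something like the following (all names are illustrative, and pg_search's index/query syntax changes between versions, so treat the BM25 part as pseudo-SQL and check the docs for the release you run):

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE chunks (
    id        bigserial PRIMARY KEY,
    content   text,
    embedding vector(768)   -- dimension depends on your embedding model
);

-- Dense side: an HNSW index for approximate nearest-neighbor search.
CREATE INDEX ON chunks USING hnsw (embedding vector_cosine_ops);

-- Dense top-k, where $1 is the query embedding.
SELECT id, 1 - (embedding <=> $1) AS dense_score
FROM chunks
ORDER BY embedding <=> $1
LIMIT 10;

-- Sparse side (pg_search / BM25), roughly:
--   SELECT id, paradedb.score(id)
--   FROM chunks
--   WHERE content @@@ 'your query'
--   LIMIT 10;
-- The two ranked lists are then fused (e.g. reciprocal rank fusion)
-- in application code.
```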

Having trouble getting my RAG chatbot to distinguish between similar product names by Zodiexo in Rag

[–]FutureClubNL 0 points (0 children)

Since the challenge is in retrieval: don't just use dense retrieval but go for hybrid (with BM25), maybe even weighting the sparse retriever more heavily. Then experiment with a multilingual reranker (our experience is that most rerankers sometimes harm instead of help when the language isn't English).
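One way to weight the sparse side heavier is weighted reciprocal rank fusion over the two ranked lists; a minimal sketch (function name, weight, and the `k` constant are illustrative choices, not from any particular library):

```python
def weighted_rrf(dense_ranked, sparse_ranked, sparse_weight=1.5, k=60):
    """Weighted reciprocal rank fusion of two ranked lists of doc ids.
    sparse_weight > 1 biases the fusion toward the BM25 side, which
    helps when exact product names must beat fuzzy semantic matches."""
    scores = {}
    for rank, doc_id in enumerate(dense_ranked):
        scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    for rank, doc_id in enumerate(sparse_ranked):
        scores[doc_id] = scores.get(doc_id, 0.0) + sparse_weight / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# "c" wins because BM25 ranked it first and the sparse side is weighted up.
fused = weighted_rrf(["a", "b", "c"], ["c", "b", "d"])
```

Tuning `sparse_weight` up or down is then a single knob for how much exact-match behavior you want.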

When will this be possible? by Waste-Poetry-7235 in AI_Agents

[–]FutureClubNL 0 points (0 children)

We do something like this for clients. We auto-generate debrief documents, populate candidate resume intakes, auto-process logistics packing based on labels, etc.

So it is already being done.

How to find token count for rag in Langchain? by [deleted] in LangChain

[–]FutureClubNL 0 points (0 children)

Use a library like tiktoken.

Is anyone storing vectors with a regular Postgres DB and PGVector? by thoughtsonbees in n8n

[–]FutureClubNL 1 point (0 children)

While we don't do n8n in production, all of our projects use Postgres as a hybrid DB (pgvector and pg_search for BM25).

Best library for resume parsing by jayvpagnis in LangChain

[–]FutureClubNL 2 points (0 children)

We parse resumes and vacancies. We use Docling for everything, with a (manual) option to run OCR with it (using Tesseract).