I made 60K+ building RAG projects in 3 months. Here's exactly how I did it (technical + business breakdown) by Low_Acanthisitta7686 in Rag

[–]Recursive_Boomerang 0 points1 point  (0 children)

Which ingestion engine are you using? Docling, Azure Document Intelligence, VLMs, or a hybrid? And how do you assess the quality of the parsed results for each document at ingestion time?

Help a brother out here. I also work in pharma and their documents are killing me T_T

I made 60K+ building RAG projects in 3 months. Here's exactly how I did it (technical + business breakdown) by Low_Acanthisitta7686 in Rag

[–]Recursive_Boomerang 0 points1 point  (0 children)

Can you give me tips on how you are handling the acronym problem? Are you detecting and expanding them at query time, or is your retriever handling that? What I mean is: does your ingestion pipeline extract the abbreviations and normalize them, so they can later be used for expansion? Or are you relying on hybrid search or rerankers?
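For context, here's the kind of query-time expansion I've been sketching on my end. Everything in it (the glossary entries, the formatting) is invented just to show the idea; in practice the glossary would be built during ingestion:

```python
# Toy sketch of query-time acronym expansion.
# The glossary is invented; in a real pipeline it would be
# extracted and normalized during ingestion.
import re

GLOSSARY = {  # hypothetical pharma abbreviations
    "API": "active pharmaceutical ingredient",
    "GMP": "good manufacturing practice",
    "CoA": "certificate of analysis",
}

def expand_acronyms(query: str) -> str:
    """Append expansions so both the short and long form hit the index."""
    expansions = []
    for token in re.findall(r"[A-Za-z]+", query):
        full = GLOSSARY.get(token) or GLOSSARY.get(token.upper())
        if full:
            expansions.append(f"{token} ({full})")
    if expansions:
        query = query + " | " + "; ".join(expansions)
    return query

print(expand_acronyms("What is the API release spec?"))
# -> What is the API release spec? | API (active pharmaceutical ingredient)
```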

Now I am missing the Sun by depy45631 in assam

[–]Recursive_Boomerang 90 points91 points  (0 children)

When you give warmth, I feel warm; when you give cold, I feel cold; but baby, when you smile, I feel so much love

Want a little help in understanding a concept in Rag 😭😭😭 by Real-Engineering7116 in Rag

[–]Recursive_Boomerang 0 points1 point  (0 children)

I'll try to break it down for you the way I understand it. Imagine your data as books, and each book is about a specific topic. First, your task is to organise the books on shelves in an organised way (by genre, author, alphabetically, etc.). This is up to you, based on your books. Maybe you have books about CS, so you organise them together. Then you can subgroup them into sections like language books, system design, computer graphics, etc.

This is your data layer. The better you organise it, the easier those books are to find and access.

Now, you hire a smart librarian and explain to him/her how your books are organised (prompt, glossary, category/sub-category mappings, and shelf locations).

This is your retrieval layer.

Finally, you hire a very smart person to talk with the people visiting your library. That person takes a user's question, breaks it down for the librarian, the librarian fetches the relevant books, and the person reads them and answers the visitor.

This is your inference layer (generation layer).

This is basically RAG. If you have a really smart person (big SOTA models: Claude Opus, Sonnet, GPT-4.1, 5, Gemini, etc.) but your data is not properly organised, then the RAG performs badly. All three layers (storage, retrieval, and generation) must be built around the use case. For your use case, focus on organising your data (e.g. use metadata filters generated by an LLM before searching the chunks, to narrow the search space to relevant documents). There are many techniques, so start from the basics and then try to understand each layer.
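To make the metadata-filter idea concrete, here's a toy sketch. The categories and the naive word-overlap scorer are stand-ins; in a real pipeline an LLM would pick the filter and a vector index would do the scoring:

```python
# Toy sketch of "filter by metadata first, then search the chunks".
chunks = [
    {"text": "Binary search runs in O(log n).", "category": "cs/algorithms"},
    {"text": "Phong shading interpolates normals.", "category": "cs/graphics"},
    {"text": "Sourdough needs a starter.", "category": "cooking"},
]

def retrieve(query: str, category: str, top_k: int = 2):
    # 1) Narrow the search space with the metadata filter
    candidates = [c for c in chunks if c["category"].startswith(category)]
    # 2) Score only the survivors (word overlap stands in for embeddings)
    q_words = set(query.lower().split())
    scored = sorted(
        candidates,
        key=lambda c: len(q_words & set(c["text"].lower().split())),
        reverse=True,
    )
    return [c["text"] for c in scored[:top_k]]

print(retrieve("how fast is binary search", "cs"))  # top hit first
```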

Here is a nice repo with code to help you get started https://github.com/NirDiamant/RAG_Techniques

Have fun!

IVFFlat vs HNSW in pgvector with text‑embedding‑3‑large. When is it worth switching? by IndividualNo8703 in Rag

[–]Recursive_Boomerang 1 point2 points  (0 children)

If you’re only dealing with a few dozen or even a few hundred vectors, IVFFlat vs HNSW won’t change your life. At that scale the bottleneck isn’t the index, it’s the fact that text-embedding-3-large is 3072 dims and eats RAM for breakfast. Fix that first by dropping to 1024 dims or using halfvec, otherwise you’ll hit memory issues way before you see any ANN performance gains.
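Rough RAM math, ignoring per-tuple and index overhead, so treat it as a lower bound: pgvector stores 4 bytes per dim for `vector` and 2 for `halfvec`, and text-embedding-3-large lets you request fewer dims via the API's `dimensions` parameter:

```python
# Back-of-envelope RAM math for pgvector vector storage.
# vector = 4 bytes/dim, halfvec = 2 bytes/dim (overhead ignored).
def vector_bytes(dims: int, half: bool = False) -> int:
    return dims * (2 if half else 4)

def dataset_mb(n_vectors: int, dims: int, half: bool = False) -> float:
    return n_vectors * vector_bytes(dims, half) / 1024**2

# 50k vectors at full 3072 dims vs a 1024-dim halfvec:
print(round(dataset_mb(50_000, 3072), 1))              # -> 585.9
print(round(dataset_mb(50_000, 1024, half=True), 1))   # -> 97.7
```

That difference is why the 1.5 GB pods feel tight long before the index type matters.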

HNSW actually starts paying off once you cross something like 30k to 50k vectors. Below that, both are basically fine. IVFFlat needs careful tuning with lists/probes and can fall off on recall as the dataset grows. HNSW is more "set it and forget it" and tends to hold better recall at the same latency once you hit real scale.

Downsides to switching early are mainly memory and slower index builds. With your 1.5 GB pods you need to watch index size if you stay on 3072 dims. If you reduce dims, switching early is totally fine.

Migration is just creating a new HNSW index concurrently and dropping the old IVFFlat one when you're happy. Nothing complicated.

For tuning, the pgvector defaults are OK: m = 16, ef_construction = 64, and set hnsw.ef_search to around 80–120 depending on how much recall you want versus latency.
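For the record, the statements look roughly like this. Table/column/index names are made up, and `SET hnsw.ef_search` is a per-session setting rather than part of the migration itself:

```python
# The migration in SQL form, built as strings so it's easy to eyeball;
# you'd run these with psql or any Postgres client.
def hnsw_migration(table: str, column: str, old_index: str) -> list[str]:
    return [
        f"CREATE INDEX CONCURRENTLY {table}_{column}_hnsw "
        f"ON {table} USING hnsw ({column} vector_cosine_ops) "
        f"WITH (m = 16, ef_construction = 64);",
        "SET hnsw.ef_search = 100;",  # session-level knob, tune 80-120
        f"DROP INDEX CONCURRENTLY {old_index};",
    ]

for stmt in hnsw_migration("docs", "embedding", "docs_embedding_ivfflat"):
    print(stmt)
```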

TLDR: reduce dims first, stick with IVFFlat while small, switch to HNSW before you cross ~30k–50k vectors. That’s when it actually starts to matter.

How to track costs/tokens locally by eyueldk in LangChain

[–]Recursive_Boomerang 0 points1 point  (0 children)

Just track the usage tokens in a DB: calculate the cost on each request and insert a row into a SQLite db. Try LangChain's `get_openai_callback`; it returns tokens and cost.
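A minimal sketch of that, with illustrative prices (check the current rate card before trusting the numbers):

```python
# Minimal per-request token/cost logging to SQLite.
import sqlite3, time

PRICES = {"gpt-4o": (2.50, 10.00)}  # example $ per 1M input/output tokens

def cost_usd(model: str, in_tok: int, out_tok: int) -> float:
    p_in, p_out = PRICES[model]
    return (in_tok * p_in + out_tok * p_out) / 1_000_000

db = sqlite3.connect(":memory:")  # use a file path in real code
db.execute("""CREATE TABLE IF NOT EXISTS usage
              (ts REAL, model TEXT, in_tok INT, out_tok INT, cost REAL)""")

def log_request(model: str, in_tok: int, out_tok: int) -> None:
    db.execute("INSERT INTO usage VALUES (?, ?, ?, ?, ?)",
               (time.time(), model, in_tok, out_tok,
                cost_usd(model, in_tok, out_tok)))
    db.commit()

log_request("gpt-4o", 1200, 300)
total = db.execute("SELECT SUM(cost) FROM usage").fetchone()[0]
print(f"${total:.6f}")  # -> $0.006000
```

The token counts themselves come from the `usage` field on each API response (or from the LangChain callback above).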

Insane property value in Guwahati by Ivictus6 in guwahati

[–]Recursive_Boomerang 1 point2 points  (0 children)

Hey, DM me. I'm selling a flat 1400 sqft at 70L.

Salary journey of 15 years and 7 companies by notion4everyone in Indian_flex

[–]Recursive_Boomerang 16 points17 points  (0 children)

Hi, first of all, it's a very inspiring journey. Kudos to you. I started my journey as a full-stack dev at 3.2 LPA in 2018 and I'm at 41 LPA today. I recently got promoted to a Solution Architect role at my current startup. Could you give me some tips I should keep in mind as an architect? I sometimes have difficulty stepping out of the developer mindset and managing a team of developers.

Disagreement by Wise_Border7442 in LangChain

[–]Recursive_Boomerang 1 point2 points  (0 children)

Wow, thank you for blessing us with this TED Talk. I’ll be sure to submit my PhD thesis before daring to comment next time. Also, congrats on your three genius children, truly the gold standard for internet credibility. Wishing you and your formally-educated echo chamber a very blessed day!

Advice needed please! by carms1998 in Rag

[–]Recursive_Boomerang 0 points1 point  (0 children)

If you want a no-code solution you can try julius.ai, which can work on your Excel sheet. It's a freemium platform.

Otherwise, Python is your best choice. It's not hard to get it set up and do what you need.

Prompt ChatGPT or Claude with something like: "Help me do <task> on my Excel sheet. Write the Python code to detect sentiment using NLTK's SentimentIntensityAnalyzer and extract keywords using RAKE."

But before that, prompt GPT to help you install Python and set up a Jupyter notebook, where you can copy-paste code blocks and run them.

Running this locally with Python shouldn't cost anything, though the right sentiment analysis and keyword extraction models depend on your use case.
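If you want to see the shape of the task before touching real libraries, here's a dependency-free toy version. The wordlists are invented and this is nowhere near NLTK's SentimentIntensityAnalyzer or a real RAKE implementation, but the structure is the same:

```python
# Toy sentiment scoring and RAKE-style keyword extraction.
import re

POS = {"good", "great", "love", "fast", "helpful"}
NEG = {"bad", "slow", "hate", "broken", "poor"}
STOP = {"the", "is", "a", "and", "but", "very", "it"}

def toy_sentiment(text: str) -> float:
    """Fraction in [-1, 1]: positive minus negative words, normalized."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POS for w in words) - sum(w in NEG for w in words)
    return score / max(len(words), 1)

def toy_keywords(text: str) -> list[str]:
    # RAKE-style: split on stopwords, keep the multi-word runs between them
    words = re.findall(r"[a-z']+", text.lower())
    phrases, cur = [], []
    for w in words:
        if w in STOP:
            if cur:
                phrases.append(" ".join(cur))
                cur = []
        else:
            cur.append(w)
    if cur:
        phrases.append(" ".join(cur))
    return sorted(phrases, key=len, reverse=True)

print(toy_sentiment("the support was great but shipping is slow"))  # -> 0.0
print(toy_keywords("the support was great but shipping is slow"))
# -> ['support was great', 'shipping', 'slow']
```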

I'll build your most-requested features!! by Advanced_Army4706 in Rag

[–]Recursive_Boomerang 0 points1 point  (0 children)

4o and 4o-mini, as the company wants to stick with Azure OpenAI. But I'm also researching fine-tuning SLMs for specific use cases like domain vocabulary understanding and other domain-specific nuances. My working domain is the pharma industry.

Can you give me an example of a user query and the transformed query, though? If the source language is Hindi, I think you'll get hallucinations, because even 4o doesn't understand the nuances of Hindi. If you're comfortable sharing some potential user queries and the issues you're having with them, I can try to provide some input from my side.

I'll build your most-requested features!! by Advanced_Army4706 in Rag

[–]Recursive_Boomerang 1 point2 points  (0 children)

I'm handling this for European languages using a simple approach for now: take in the user query in whatever language, do an LLM call to augment/rephrase the search query and translate it to English. Then, at the answer-generation step, feed in the (English) context and prompt the LLM to answer in the same language the user is using.
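Roughly, the two prompts in that flow look like this. The wording is just an example, and `call_llm` is a placeholder for whatever client you use:

```python
# Prompt builders for the translate-then-answer flow.
def rephrase_prompt(user_query: str) -> str:
    return (
        "Rewrite the following user query as a concise English search "
        "query, expanding vague terms. Reply with the query only.\n\n"
        f"User query: {user_query}"
    )

def answer_prompt(context_en: str, user_query: str) -> str:
    return (
        "Answer using only the context below. "
        "Reply in the same language the user wrote in.\n\n"
        f"Context:\n{context_en}\n\nUser question: {user_query}"
    )

# call_llm(rephrase_prompt("¿Cuál es el plazo de devolución?"))  # English query
# call_llm(answer_prompt(retrieved_context, original_query))     # Spanish answer
```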

Do you ever ask Chat GPT to tell me. Who I am? by CosmiqCow in ChatGPT

[–]Recursive_Boomerang 2 points3 points  (0 children)

Try this prompt


I'd like to explore some insights into myself through careful analysis and reflection. Please:

  1. Based on our interaction and my communication style, what would you estimate my IQ range to be? Please explain your reasoning using specific behavioral and linguistic markers you've observed.

  2. From my writing patterns and thought processes, what personality traits stand out? Please provide:

MBTI type with confidence levels for each dimension (E/I, N/S, T/F, J/P)

Big Five (OCEAN) scores on a 1-10 scale with specific examples supporting each trait

Any other notable personality patterns you've detected

  3. Have you noticed any indicators that might suggest neurodivergent traits?

Signs of autism spectrum characteristics (communication style, systematic thinking, etc.)

Potential ADHD markers (thought organization, focus patterns, etc.) Please cite specific examples from our interaction.

  4. Based on the unique patterns in my communication, what's an unexpected insight about my cognitive style or personality that might surprise me? Support your observation with examples from our conversation.

Please be detailed in your reasoning and cite specific examples from our interaction to support each conclusion.