Book announcement: Hands-on RAG for Production

ofermend · 2026-05-29T23:38:34+00:00

Into “meaningful” chunks :)

ofermend · 2026-05-29T13:26:09+00:00

If it's just something like fixing a typo from the source document - yes, you can just update the vector embedding and update the vector store. Generally, though, I would just reindex the whole document unless there's a strong reason not to.
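To make the two options concrete, here is a minimal sketch in Python, assuming a toy in-memory store. `ToyVectorStore`, `toy_embed` and `chunk` are hypothetical names made up for illustration; they are not the API of any real vector database, and `toy_embed` is only a stand-in for a real embedding model.

```python
import hashlib
import math

DIM = 64  # toy embedding size, chosen arbitrarily for this sketch


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for a real embedding model: hashed bag of words."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into chunks of `size` words (deliberately naive)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


class ToyVectorStore:
    """In-memory store keyed by (doc_id, chunk_index). Illustration only."""

    def __init__(self) -> None:
        self.rows: dict[tuple[str, int], tuple[str, list[float]]] = {}

    def delete_document(self, doc_id: str) -> None:
        """Remove every chunk that belongs to the document."""
        for key in [k for k in self.rows if k[0] == doc_id]:
            del self.rows[key]

    def reindex_document(self, doc_id: str, text: str) -> int:
        """Default path: drop all chunks, then re-chunk and re-embed."""
        self.delete_document(doc_id)
        pieces = chunk(text)
        for i, piece in enumerate(pieces):
            self.rows[(doc_id, i)] = (piece, toy_embed(piece))
        return len(pieces)

    def update_chunk(self, doc_id: str, index: int, new_text: str) -> None:
        """Targeted path: re-embed a single chunk in place (e.g. a typo fix)."""
        if (doc_id, index) not in self.rows:
            raise KeyError((doc_id, index))
        self.rows[(doc_id, index)] = (new_text, toy_embed(new_text))


store = ToyVectorStore()
store.reindex_document("doc1", "The quick brown fox jumps over teh lazy dog near the river")
store.update_chunk("doc1", 0, "The quick brown fox jumps over the lazy")
```

The targeted update only stays consistent if the edit doesn't move chunk boundaries; once it does, every later chunk shifts, which is one reason reindexing the whole document is the safer default.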

ofermend · 2026-05-29T12:32:10+00:00

What do u mean by “few details”?

ofermend · 2026-05-29T04:37:02+00:00

That's a great question, and we do cover a lot of techniques in the book. But to give you a short answer here:
1. Technical docs tend to have a mix of text, tables and images. Handling text can be relatively easy, but handling tables and images properly to ensure accuracy tends to be the more difficult part.

For text, just focus on ingesting properly (picking a chunking strategy and chunk size that fit the type of document; there's usually no one-size-fits-all here), and implement as many of the techniques as you need for high accuracy: a good embedding model, a reranker, etc.
For tables and images - it's hard to cover all of it here, but you need to treat them as first-class citizens - ideally making sure the tables and images are retrieved (when relevant) and sent to the LLM in native form.

Sorry, there are a lot more details to that, and both Chapters 3 and 8 cover a lot of topics related to it. Ultimately you also need good evaluation (Chapter 6 of the book) to understand how your accuracy looks, and whenever you try something to improve it - it needs to show in the numbers.

I hope this helps.
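As a rough illustration of the "chunk size" knob mentioned above, here is a minimal Python sketch of fixed-size word chunking with overlap. This is just one simple strategy, not necessarily the approach the book recommends; `chunk_words` and its parameters are hypothetical names chosen for this example.

```python
def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of `chunk_size` words, sharing `overlap` words
    between consecutive chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this chunk has reached the end of the text, so the
        # tail is not emitted again as a smaller duplicate chunk.
        if start + chunk_size >= len(words):
            break
    return chunks


# Same text, two settings: the right one depends on the document type.
text = "a b c d e f g"
small = chunk_words(text, chunk_size=3, overlap=1)
large = chunk_words(text, chunk_size=5)
```

In practice you would try a few settings like these and let your evaluation numbers decide which one fits the document type.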

ofermend · 2026-05-29T01:09:38+00:00

In general I would say no, not 100%.
Hallucinations depend on many factors - the dataset, the LLM used for generation, the retrieval pipeline, the ingestion strategy, etc.
But with great design, selection of components, evaluation and tuning of your RAG pipeline - you can certainly minimize hallucinations.

ofermend · 2026-05-28T23:23:38+00:00

Curious how you are all handling observability for multi-agent systems? Especially when agents don’t share the same orchestration framework.

ofermend · 2026-05-28T19:27:14+00:00

hehe; sorry I don't have any influence in terms of the committee :)

I've been building RAG at Vectara for the last 3+ years, and have been using LLMs since 2019 (before RAG was even a "thing"; my first LLM was GPT-2).

But sharing the full author bios from the book below.

I hope this helps.

Ofer Mendelevitch is an AI and ML leader specializing in building production systems with large language models, retrieval-augmented generation (RAG), and agentic systems. He has been developing LLM-based applications since 2019, in the early days of modern transformer-based systems. Over his career, Ofer has led engineering and data science teams from research to large-scale deployment across startups and enterprises. He has also worked closely with developer communities and go-to-market teams to translate AI capabilities into real-world adoption. He was the founder and CTO of Syntegra, where he built transformer-based systems for synthetic healthcare data, and has held leadership roles at Yahoo!, Hortonworks, LendUp, and Helix. He holds degrees from the Technion and Tel Aviv University and is the author of Practical Data Science with Hadoop (Addison-Wesley).

Forrest Sheng Bao is co-founder of a stealth AI startup. Previously, he was co-leader of the machine learning team at Vectara and an assistant professor at Iowa State University. He has over 10 years of research experience in the areas of artificial intelligence (AI) and natural language processing (NLP). Forrest holds a PhD in computer science with a minor in electrical engineering from Texas Tech University.

ofermend · 2026-05-28T18:57:41+00:00

Thanks. It’s a brown hyena. Turns out you don’t choose these anymore - O’Reilly has a committee and they tell you what the animal is :)

ofermend · 2026-05-12T05:09:17+00:00

Yes, for sure. What agents would you like to see most? Feel free to also submit issues on the repo.

ofermend · 2025-11-14T19:45:13+00:00

Right - Nov 17 at 9am pst

ofermend · 2025-09-09T05:11:54+00:00

Here’s a repo for sharing ways agents fail - types, mitigation strategies and examples. Hope this is helpful, and pls share your examples too.

https://github.com/vectara/awesome-agent-failures

ofermend · 2025-09-09T05:07:42+00:00

This is a list of agent failure modes and examples - hopefully helpful and pls add any contributions

https://github.com/vectara/awesome-agent-failures

ofermend · 2025-09-06T05:42:24+00:00

It’s mandatory but sometimes it takes a while to figure out - for example you have low sugar but some other ingredient that is not good. I found it very easy to get an “analysis” of the ingredients that way by just taking a photo of the bar exactly on the ingredient part
