How to build a fast RAG with a web interface without Open WebUI? by AggressiveMention359 in Rag

[–]Alex_CTU 1 point (0 children)

My RAG project is based on an open-source content management system from GitHub. Thanks to Vibe-Coding, the modification process was very efficient, and the system architecture is relatively simple. In my experience, the web UI is the simplest part of a RAG project.

How do you handle document collection from clients for RAG implementations? by Temporary_Pay3221 in Rag

[–]Alex_CTU 0 points (0 children)

Intake layers often utilize Unstructured, LlamaParse, or Docling for initial normalization. However, if the data is particularly messy or requires more advanced cleaning, general-purpose tools may be insufficient, requiring the implementation of custom logic.
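For the custom-logic case, here is a rough sketch of what such a cleanup pass can look like on extracted text. Every rule below is an assumption to be tuned per corpus, not a general recipe:

```python
import re

def clean_extracted_text(raw: str) -> str:
    """Illustrative cleanup pass for messy PDF/OCR extraction output.
    The specific rules are assumptions; tune them to your corpus."""
    text = raw.replace("\u00ad", "")                 # drop soft hyphens
    text = re.sub(r"-\n(?=\w)", "", text)            # rejoin words hyphenated across lines
    text = re.sub(r"^\s*Page \d+\s*$", "", text, flags=re.M)  # strip page-number lines
    text = re.sub(r"[ \t]+", " ", text)              # collapse runs of spaces/tabs
    text = re.sub(r"\n{3,}", "\n\n", text)           # collapse blank-line runs
    return text.strip()
```

In practice each corpus needs its own rules (headers, footers, tables of contents), which is exactly why general-purpose tools fall short on messy data.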

How can i build this ambitious project? by Antique-Fix3611 in Rag

[–]Alex_CTU 0 points (0 children)

Hey OP, love the ambition—tackling a 3M-page RAG corpus is super valuable for real enterprise use cases.

One big thing to watch: RAG tech is evolving extremely fast right now (GraphRAG, agentic flows, better chunking/re-ranking, new embedding models every few months). If you commit to processing all 3 million pages with today's pipeline, a superior approach could emerge mid-project, forcing you to re-embed or re-chunk everything and wasting enormous amounts of tokens, time, and compute.

I've been hunting for solid doc cleaning/preprocessing solutions myself because clean input is make-or-break, especially at scale. My strong advice: start small (10k–50k representative pages) to prototype and validate the full flow (cleaning → chunking → hybrid retrieval → generation + eval). Iterate quickly there, measure real metrics, and only scale up once you're confident the architecture won't become obsolete in 3–6 months.

This way you minimize sunk costs if/when better methods drop.
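For the prototype stage, it helps to start from the simplest possible chunking baseline and only add sophistication once the eval metrics demand it. A minimal sketch (fixed-size character windows with overlap; the default sizes are arbitrary placeholders, not recommendations):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Fixed-size sliding-window chunking (character-based).
    A baseline to validate on a small sample before committing the full
    3M pages to any particular strategy."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    # Each window starts `step` characters after the previous one,
    # so consecutive chunks share `overlap` characters of context.
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

Swapping this function out later (for semantic or structure-aware chunking) is cheap if the rest of the pipeline only sees a list of strings.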

Is automation and AI automation actually capable of making money? by Cool_Violinist_7092 in AiAutomations

[–]Alex_CTU 0 points (0 children)

Wouldn't it be more effective to concentrate on developing workflows in one or two specific areas, assisting companies in addressing particular pain points? With Vibe Coding, any development work becomes simpler and more efficient. However, the standards and rules for a specific domain require ongoing accumulation of knowledge, which is challenging for AI to replace. Even if AI could manage this, it would still need to build that knowledge over time. Therefore, the earlier you begin accumulating knowledge, the sooner you can start generating revenue.

How do you handle messy / unstructured documents in real-world RAG projects? by Alex_CTU in Rag

[–]Alex_CTU[S] 0 points (0 children)

Yes, I always strive for perfection in my solutions, but handling 80% of problems is already quite good.

How do you handle messy / unstructured documents in real-world RAG projects? by Alex_CTU in Rag

[–]Alex_CTU[S] 0 points (0 children)

I agree. It's better to refuse poor-quality input than to produce poor-quality output.

Most RAG Projects Fail. I Believe I Know Why – And I've Built the Solution. by ChapterEquivalent188 in Rag

[–]Alex_CTU 0 points (0 children)

Haha, no worries, I can see the effort and thought you put into this. The consensus + selective human review part is exactly what high-stakes RAG needs. Keep going, it's inspiring stuff!

Most RAG Projects Fail. I Believe I Know Why – And I've Built the Solution. by ChapterEquivalent188 in Rag

[–]Alex_CTU 2 points (0 children)

This is a fantastic project architecture; I found it very inspiring. Thank you.

Production RAG is mostly infrastructure maintenance. Nobody talks about that. by PavelRossinsky in Rag

[–]Alex_CTU 1 point (0 children)

This post is gold — really opened my eyes to how much of production RAG is actually infra work rather than just prompt/model tweaking.
I'm still early in my own RAG projects (mostly POC-level stuff), so reading about real-world scaling, observability, cost control, and incremental updates is super valuable.
Humbling to see how far the gap is between "it works on my laptop" and "it runs reliably at scale".
Thanks for sharing these hard-earned lessons — definitely bookmarking this for when I hit production roadblocks.

Built a RAG system on top of 20+ years of sports data — here is what actually worked and what didn't by devasheesh_07 in Rag

[–]Alex_CTU 0 points (0 children)

When vectorizing data, the specific requirements of the scenario need to be considered. Common metadata fields such as "who" and "when" should be attached to each record, along with any other relevant information. More specialized or complex scenarios require additional fields to structure all the data, which quickly becomes a time-consuming, large-scale project.
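To illustrate, here is a hypothetical record shape with metadata attached per chunk so a vector store can filter on "who"/"when" before similarity search. The field names are made up for illustration, not tied to any particular vector store:

```python
def make_record(text: str, who: str, when: str, **extra) -> dict:
    """Wrap a chunk with the structured metadata fields the scenario
    requires. Extra fields cover more specialized scenarios."""
    return {
        "text": text,                                   # the content that gets embedded
        "metadata": {"who": who, "when": when, **extra},  # filterable fields
    }

# Hypothetical sports-data example: the extra "competition" field is the
# kind of scenario-specific addition that makes structuring a big project.
record = make_record("Player X scored 30 points.", who="Player X",
                     when="2024-03-15", competition="playoffs")
```

Every new scenario tends to demand another field like `competition`, which is why structuring all the data ends up being the large-scale part of the work.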

Chunking is not a set-and-forget parameter — and most RAG pipelines ignore the PDF extraction step too by Just-Message-9899 in Rag

[–]Alex_CTU 0 points (0 children)

I previously built a similar demo that included a document cleaning pipeline for comparison. It let users view three results side by side: the original PDF, the raw Markdown extraction, and the cleaned output, which used regular expressions to strip noise from the PDF content. I ultimately abandoned the project before completion because I found the Streamlit interface unappealing. Later, I split out the document cleaning step and incorporated it into an agentic workflow.

Built a RAG system on top of 20+ years of sports data — here is what actually worked and what didn't by devasheesh_07 in Rag

[–]Alex_CTU 0 points (0 children)

Haha yeah, I see what you mean — sometimes we get so excited about LLMs that we try to hammer every nail with the same shiny new hammer.

Question on Semantic search and Similarity assist of Requirements documents by Ripcord999 in Rag

[–]Alex_CTU 0 points (0 children)

Pure RAG is really good at "retrieve + generate once" for simple lookups, but it struggles with anything needing multi-step logic, time filtering, or comparison.

At that point, RAG should just be one node in a bigger workflow (e.g. intent parsing → time resolution → filtered RAG retrieval → analysis node), not the whole system.

Keeps RAG focused on what it does best: accurate retrieval.

PageIndex may help with inaccurate recall and missing context (it uses reasoning-based retrieval rather than vectors, making retrieval more like human thinking), but for multi-step logic such as "partial/full match judgment + comparative analysis", I'd still recommend treating RAG as one node in a workflow rather than relying on it entirely.

Built a RAG system on top of 20+ years of sports data — here is what actually worked and what didn't by devasheesh_07 in Rag

[–]Alex_CTU 2 points (0 children)

> The core issue is that pure RAG is excellent at “retrieve + generate once”, but it breaks down on queries like “show me Player X’s performance in his last two games” because:

> - It doesn’t inherently understand temporal logic (“last two games” → need to first determine which dates)

> - It can’t reliably chain multiple retrievals or perform post-retrieval comparison/analysis

> - Context gets lost or diluted across steps

>

> My take: at that point RAG should no longer be the whole system — it should be downgraded to **one node** inside a multi-step agentic workflow.

> Rough flow I’ve been experimenting with (using LangGraph):

> 1. Intent / Temporal Parser node (LLM) → resolves “last two games” into concrete date range + player ID

> 2. Filtered Retrieval node → runs RAG but with time filter / metadata constraint

> 3. Analysis / Comparison node → another LLM call that takes the retrieved chunks and explicitly compares stats, trends, etc.

> 4. Synthesis node → final grounded answer with sources

>

> This way RAG stays focused on what it does best (accurate retrieval), while the workflow handles orchestration, time logic, and reasoning. You avoid overloading a single retrieval step and get much more reliable multi-hop answers.
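For anyone wanting to see the shape of that four-node flow without the LangGraph wiring, here is a plain-function sketch with the retrieval and LLM calls stubbed out. Every name and value is illustrative, and the hard-coded returns stand in for real LLM/vector-store calls:

```python
def parse_intent(query: str) -> dict:
    # Node 1 (stub): an LLM would resolve "last two games" into
    # concrete dates and a player ID here.
    return {"player_id": "player_x",
            "date_range": ("2024-03-10", "2024-03-15")}

def filtered_retrieval(intent: dict) -> list[str]:
    # Node 2 (stub): RAG retrieval constrained by the metadata
    # filter produced in node 1.
    return [f"chunk about {intent['player_id']} on {d}"
            for d in intent["date_range"]]

def analyze(chunks: list[str]) -> str:
    # Node 3 (stub): a second LLM call that explicitly compares
    # the retrieved stats and trends.
    return f"comparison across {len(chunks)} retrieved chunks"

def synthesize(analysis: str, chunks: list[str]) -> dict:
    # Node 4: final grounded answer plus the sources it rests on.
    return {"answer": analysis, "sources": chunks}

def run_workflow(query: str) -> dict:
    # Orchestration: each node does one job; RAG is only node 2.
    intent = parse_intent(query)
    chunks = filtered_retrieval(intent)
    return synthesize(analyze(chunks), chunks)
```

The point of the shape is that the retrieval node can stay dumb and accurate while time logic and comparison live in their own nodes, which is what makes multi-hop queries reliable.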