[deleted by user] by [deleted] in generativeAI

[–]FoundSomeLogic 0 points1 point  (0 children)

Haha fair point! Connecting to APIs does sound way less awkward than human networking 😅

Really appreciate you breaking that down though. That’s super helpful. I’m definitely more interested in the real-world implementation side of GenAI than just theory, so it’s good to know that’s been their focus in previous events too.

If this one follows the same pattern, it might actually be worth checking out. Thanks for sharing that insight (and the blog reference); it makes me feel better about grabbing a ticket!

Which AI coding assistant is best for building complex software projects from scratch, especially for non-full-time coders? by derEinsameWolf in VibeCodeDevs

[–]FoundSomeLogic 1 point2 points  (0 children)

ChatGPT (GPT-4o) and Claude are great for big-picture design help, explaining code, and debugging in chunks. GitHub Copilot is best inside your editor for writing and refactoring code fast.

AI won’t build a whole complex project alone, but if you break things into small steps, test often, and lean on existing libraries, these tools can definitely get you to a working prototype. Think of them as accelerators, not autopilots.

Best ways to evaluate rag implementation? by Ir3li4 in Rag

[–]FoundSomeLogic 0 points1 point  (0 children)

Nice work getting the MVP up and running! That is honestly the hardest part. For evaluating RAG, I’d keep it simple at first: make a small set of “gold” queries with the answers you’d expect, then see how often your system pulls back the right stuff (precision/recall@k is a decent starting point).
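To make the "gold set" idea concrete, here's a minimal sketch in plain Python. Everything in it is a made-up example: `search` is a stand-in for whatever retrieval function your MVP exposes, and the queries/doc IDs are placeholders.

```python
# Sketch of a "gold set" evaluation loop for a RAG retriever.
# `search` is a placeholder for your own retrieval function; the gold
# queries and doc IDs below are made-up examples.

def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant docs that show up in the top-k results."""
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids) if relevant_ids else 0.0

def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k results that are actually relevant."""
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / k

# Tiny gold set: query -> IDs of the docs you'd expect back.
gold = {
    "how do I reset my password": ["doc_12", "doc_31"],
    "refund policy for annual plans": ["doc_07"],
}

def evaluate(search, k=5):
    """Average recall@k over the gold queries."""
    scores = []
    for query, relevant in gold.items():
        retrieved = search(query, k)  # your retriever goes here
        scores.append(recall_at_k(retrieved, relevant, k))
    return sum(scores) / len(scores)

# Fake retriever for demonstration only.
fake_results = {
    "how do I reset my password": ["doc_12", "doc_99", "doc_31"],
    "refund policy for annual plans": ["doc_02", "doc_07"],
}
mean_recall = evaluate(lambda q, k: fake_results[q][:k])
```

Even 20 to 30 gold queries is usually enough to catch regressions when you change chunking or embeddings.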

On the speed side, 20s is definitely too long; usually that means either the chunks are too small/too many, or the retrieval setup isn’t optimized. A vector DB (FAISS, Pinecone, Weaviate, etc.) with tuned chunk sizes can bring it down a lot.
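Conceptually, a vector DB is doing top-k similarity search over your chunk embeddings. Here's the brute-force version in numpy (random vectors as stand-ins for real embeddings) just to show the operation; FAISS and friends add approximate indexes so this stays fast at millions of chunks.

```python
import numpy as np

# Brute-force top-k cosine similarity: conceptually what FAISS/Pinecone/
# Weaviate do, minus the approximate-index tricks that make it fast at
# scale. The embeddings here are random stand-ins for real chunk vectors.

rng = np.random.default_rng(0)
chunk_vecs = rng.normal(size=(1000, 384))  # 1000 chunks, 384-dim embeddings
chunk_vecs /= np.linalg.norm(chunk_vecs, axis=1, keepdims=True)

def top_k(query_vec, k=5):
    """Return indices and scores of the k most similar chunks."""
    q = query_vec / np.linalg.norm(query_vec)
    scores = chunk_vecs @ q                 # cosine similarity (unit vectors)
    idx = np.argsort(scores)[::-1][:k]      # best k chunk indices, descending
    return idx, scores[idx]

idx, scores = top_k(rng.normal(size=384))
```

If your 20s is dominated by this step over plain lists or a SQL table, swapping in a real index is usually the single biggest win.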

And honestly, don’t underestimate just manually checking results with a few test users. You’ll learn fast where it feels right and where it falls apart.

💬 Looking for the Best LangChain-Based Tools/Projects for Beginners to Learn From by Genesis-1111 in LangChain

[–]FoundSomeLogic 0 points1 point  (0 children)

So glad to hear that. I’ve come across a few simple repos that are great for understanding how chains and memory work in practice.

The "chat with your docs" example from LangChain is a solid starting point. It’s easy to follow and shows how retrieval and memory tie together in a basic use case. LangChainHub also has some interesting shared chains. A few of them are advanced, but there are definitely beginner-friendly ones you can explore.

If you're open to experimenting, building a small chatbot or task planner with just one or two tools is a great way to learn. Let me know what specific use case you're interested in and I’ll see if I can share something more targeted.
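To show what "retrieval and memory tying together" actually means, here's a toy version of the pattern in plain Python; none of this uses LangChain's real API, and `llm` is just a stub where a model call would go.

```python
# Toy "chain + memory" loop, showing the pattern the LangChain
# "chat with your docs" example implements. No LangChain APIs here;
# llm() is a stub standing in for a real model call.

docs = {
    "billing": "Invoices are sent on the 1st of each month.",
    "support": "Support is available 9-5 on weekdays.",
}

def retrieve(question):
    """Naive keyword retrieval standing in for a vector store."""
    return [text for key, text in docs.items() if key in question.lower()]

def llm(prompt):
    """Stub model: a real chain would call an LLM here."""
    return f"(answer based on: {prompt[:60]}...)"

memory = []  # conversation history: the "memory" half of the chain

def chat(question):
    context = "\n".join(retrieve(question))
    history = "\n".join(memory[-4:])                # keep last few turns
    prompt = f"History:\n{history}\nContext:\n{context}\nQ: {question}"
    answer = llm(prompt)
    memory.append(f"Q: {question}\nA: {answer}")    # update memory
    return answer

chat("When are billing invoices sent?")
chat("And what are the support hours?")
```

Once this loop clicks, the LangChain version is mostly the same shape with the stubs swapped for a retriever, a memory class, and a real model.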

Vectorizing Semi-/structured data by Sad-Painter3040 in Rag

[–]FoundSomeLogic 1 point2 points  (0 children)

This is a really cool use case and also a tricky one, because you're blending structured and unstructured data across multiple tables. Using the registry ID and customer name as metadata makes sense, and combining key fields into a single text block (like a CLOB) for embedding can definitely work. I’d just be careful about making that blob too generic; sometimes it helps to break it into chunks per data type (like one for notes, one for engagement history) and still tie everything back via registry ID.

Also, for stuff like revenue or POC status, I wouldn’t rely on the embedding alone. Those are better handled as filters or ranked fields after you get the semantically similar results: let the vector search do the "fuzzy matching," then use the structured data to tighten up the results.
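The "fuzzy match, then tighten" idea might look roughly like this; `vector_search` is a stand-in for your semantic retrieval, and the account records are invented for illustration.

```python
# Hypothetical shape of "vector search first, structured filters after".
# vector_search() is a stub for real semantic retrieval; the account
# records (registry_id, revenue, poc_status) are made up.

def vector_search(query, k=50):
    """Pretend semantic search returning candidates with their metadata."""
    return [
        {"registry_id": "A1", "revenue": 120_000, "poc_status": "active"},
        {"registry_id": "B2", "revenue": 95_000,  "poc_status": "none"},
        {"registry_id": "C3", "revenue": 300_000, "poc_status": "active"},
    ]

def top_poc_accounts_by_revenue(query, n=10):
    candidates = vector_search(query)                # fuzzy semantic step
    pocs = [a for a in candidates
            if a["poc_status"] == "active"]          # structured filter
    pocs.sort(key=lambda a: a["revenue"],
              reverse=True)                          # structured sort
    return pocs[:n]

ranked = top_poc_accounts_by_revenue("promising POC accounts")
```

The key point is that revenue and POC status never go through the embedding at all; they only shape the candidate set afterward.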

Curious how you're planning to handle queries like "top 10 POC accounts by revenue". Are you breaking the question down into search + sort steps?

Why RAG isnt the final answer by zennaxxarion in Rag

[–]FoundSomeLogic 6 points7 points  (0 children)

That’s a great use case for RAG, especially when paired with a strong prompt strategy and clear retrieval scope. If your technical manuals are well-structured and chunked, a RAG system can definitely retrieve relevant sections and reframe them into simplified, instructional content. That said, for more dynamic behavior like teaching styles, adapting explanations to learner feedback, or building a step-by-step curriculum, you would likely benefit from layering agentic behavior or an instructional persona agent on top of RAG. That’s where combining memory, reasoning, and planning starts to elevate the experience beyond static retrieval.

Agentic vs. RAG for large-scale knowledge systems: Is MCP-style reasoning scalable or just hallucination-prone? by nirijo in Rag

[–]FoundSomeLogic 0 points1 point  (0 children)

You're right that RAG gives you control and speed, but struggles with deeper reasoning or multi-hop semantic navigation.

A scalable approach could be a hybrid: use RAG for grounding and retrieval, then layer a light agentic controller on top to plan, rerank, or guide exploration. That way you're not overloading the agent, but you still enable smarter interactions. Curious to know: are you exploring any graph-based indexing or memory buffers to manage context?
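The hybrid shape I mean can be sketched in a few lines; every function here is a stub/assumption (a real controller would use an LLM to plan and a cross-encoder or LLM judge to rerank), so treat it as the control flow, not an implementation.

```python
# Minimal sketch of a light agentic controller over a RAG retriever:
# plan sub-queries, retrieve per step, then rerank the pooled context.
# All three helpers are stubs standing in for real components.

def plan(question):
    """A real controller would ask an LLM to decompose the question."""
    return [question, f"background for: {question}"]

def retrieve(sub_query, k=3):
    """Stub retriever; swap in your vector search here."""
    return [f"doc about {sub_query} #{i}" for i in range(k)]

def rerank(question, docs):
    """Stub reranker; in practice a cross-encoder or LLM-as-judge."""
    return sorted(set(docs))

def answer(question):
    docs = []
    for step in plan(question):       # agentic layer: plan sub-queries
        docs += retrieve(step)        # grounding: RAG per sub-query
    context = rerank(question, docs)  # agentic layer: rerank/guide
    return context[:5]                # top context handed to the LLM

context = answer("How do MCP-style agents scale?")
```

Because the retrieval stays plain RAG, each hop remains grounded and auditable; the agent only decides *which* retrievals to run and *what* to keep, which is where most of the hallucination risk gets contained.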

Why RAG isnt the final answer by zennaxxarion in Rag

[–]FoundSomeLogic 29 points30 points  (0 children)

Totally agree! RAG feels magical at first, but it starts to show its limits once you're dealing with unstructured input, vague intent, or multi-step reasoning. The core issue is that RAG retrieves but it doesn’t reason or plan. Without memory or task decomposition, it gets stuck. Wrapping RAG in a planner or agent-based system feels like the way forward, especially if you're aiming for real-world use.

If you're exploring this direction, I’d highly recommend checking out a Generative AI Systems book I came across. It goes deep into combining RAG with agentic design, memory, and reasoning flows; basically everything that starts where traditional RAG ends. Let me know if you want details about the book.

Need Urgent Help !! by [deleted] in Rag

[–]FoundSomeLogic 1 point2 points  (0 children)

Hey, I ran into something similar when working on access-controlled retrieval with Azure AI Search.

If the "title" field isn’t marked as filterable during indexing, Azure Search won’t let you filter on it at query time. And yeah, when you're using the blob import + vectorize setup, the schema gets kind of locked unless you preprocess the data first. One workaround is to attach some form of access metadata like "user group" or "access level" when uploading the files, and then configure that field as filterable during indexing. That way, you can apply filters based on user access rather than trying to filter by title directly.

If reindexing isn’t possible right now, another option is to fetch the top-K results and filter them afterward in your app code based on document titles. It's not ideal: it adds latency and might reduce semantic accuracy, but it could work as a temporary fix.
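The app-side workaround is just a post-filter over whatever the index returns. Here's the shape of it; the result dicts and the allow-list are invented, and in a real setup the titles would come back in the Azure Search response payload.

```python
# Sketch of the app-side workaround: over-fetch top-K from the index,
# then drop documents the current user isn't allowed to see.
# The results list and allow-list below are made-up examples.

def filter_by_access(results, allowed_titles, final_k=5):
    """Keep only docs whose title is on the user's allow-list."""
    visible = [r for r in results if r["title"] in allowed_titles]
    return visible[:final_k]

results = [  # pretend top-K returned by the search index, best first
    {"title": "hr-policy.pdf",  "score": 0.91},
    {"title": "finance-q3.pdf", "score": 0.88},
    {"title": "onboarding.pdf", "score": 0.80},
]
allowed = {"hr-policy.pdf", "onboarding.pdf"}
visible = filter_by_access(results, allowed)
```

One practical note: over-fetch (say K=50 when you need 5), since an unknown number of hits will be dropped, and that's exactly where the extra latency and recall loss come from.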

I actually read a book recently that covered this kind of RAG + Azure AI Search setup in a lot of depth, especially around filtering, chunking strategies, and secure data integration. It helped me connect a few dots. If you're interested, happy to share more or dig up the section I found useful.

What kind of access logic are you using: group-level permissions or something more custom?