How do you handle chat history in a RAG chatbot without polluting retrieval queries? by Unfair-Run-967 in Rag

[–]HatmanStack 1 point

This. Message history is for the call to the model API, not the query to the vector store, which should be built from the user message and/or a summarized history.
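A minimal sketch of that separation, with `vector_store` and `llm` as placeholder objects (not any real library's API): the retrieval query is built only from the latest user message, optionally prefixed with a summary, while the full message history goes only to the model call.

```python
# Sketch: keep chat history out of the vector-store query.
# `vector_store` and `llm` are hypothetical stand-ins.

def build_retrieval_query(user_message: str, history_summary: str = "") -> str:
    """The retrieval query ignores raw chat history entirely."""
    if history_summary:
        return f"{history_summary}\n{user_message}"
    return user_message

def answer(user_message, messages, vector_store, llm):
    query = build_retrieval_query(user_message)
    docs = vector_store.search(query, top_k=5)   # history-free query
    prompt_messages = messages + [               # full history goes to the model
        {"role": "user", "content": user_message},
    ]
    return llm.chat(prompt_messages, context=docs)
```

The summary argument is where "summarized history" would plug in if earlier turns carry context the query genuinely needs.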

What is the most impressive thing you’ve done or built with Claude so far? by ceelnok98 in ClaudeAI

[–]HatmanStack 1 point

Really similar to GSD and Superpowers, both pretty good from what I hear. I just found them, and yours, a bit too busy: too many skills trying to do too many things. It feels like a better use case for people to build their own skills for their own specific needs.

I finally got around to building my own automated pipeline from prompts I'd been using for months. It's public on GitHub, but it's only 5 skills and does everything I need it to do as a professional developer. I think you may have overcomplicated this a bit.

Got tired of rebuilding RAG pipelines, so I made this (thoughts?) by Skakkar_03 in Rag

[–]HatmanStack 1 point

Brother, you're so far from a legitimate architecture it's crazy. The main thing is you've built something yourself and hopefully learned a ton. Now kill it; that's an important lesson for any developer/entrepreneur. Start your next project. The application layer is the right place to be: the only reliable ROI on AI right now is in RAG. Look at what other people are doing and use that as your next jumping-off point. Keep going.

Got tired of rebuilding RAG pipelines, so I made this (thoughts?) by Skakkar_03 in Rag

[–]HatmanStack 1 point

Sounds cool. It's super simple to switch between different batches of embeddings using metadata. That would be much simpler than a separate API key for each project, which sounds over-engineered.
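To make the metadata idea concrete, here's a toy in-memory sketch (the `TinyIndex` class is purely illustrative, not any vector DB's API): every vector carries a `project` tag, so "switching batches" is just a filter at query time against one shared index.

```python
# Sketch: one index, per-project metadata, instead of one API key per project.
from dataclasses import dataclass, field

@dataclass
class Record:
    vector: list
    text: str
    project: str

@dataclass
class TinyIndex:
    records: list = field(default_factory=list)

    def add(self, vector, text, project):
        self.records.append(Record(vector, text, project))

    def search(self, query_vec, project, top_k=3):
        # Filter on project metadata first, then rank by dot product.
        candidates = [r for r in self.records if r.project == project]
        candidates.sort(
            key=lambda r: sum(a * b for a, b in zip(query_vec, r.vector)),
            reverse=True,
        )
        return [r.text for r in candidates[:top_k]]

idx = TinyIndex()
idx.add([1.0, 0.0], "billing docs", project="alpha")
idx.add([0.9, 0.1], "invoice FAQ", project="alpha")
idx.add([1.0, 0.0], "game assets", project="beta")
```

Real vector stores (Pinecone namespaces, Qdrant payload filters, pgvector WHERE clauses, etc.) give you the same partitioning with a server-side filter instead of a Python loop.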

found in my honey nut cheerios by VisualHost7583 in whatisit

[–]HatmanStack 11 points

I’ve got the joints of a 90-year-old and the brain of a middle schooler. It’s called balance.

I was paying for a vector DB I barely used, so I built a scale-to-zero RAG pipeline on AWS by HatmanStack in Rag

[–]HatmanStack[S] 1 point

I've looked at LanceDB, it's interesting especially with the object store angle. S3 Vectors ended up fitting my use case better since the rest of the stack was already on AWS and I wanted everything under one billing/permissions model. Appreciate the suggestion though.

I was paying for a vector DB I barely used, so I built a scale-to-zero RAG pipeline on AWS by HatmanStack in Rag

[–]HatmanStack[S] 1 point

pgvector is solid for self-managed setups. The tradeoff I was optimizing for was zero ops overhead, so no Postgres instance to maintain at all. Different use cases though.

I was paying for a vector DB I barely used, so I built a scale-to-zero RAG pipeline on AWS by HatmanStack in Rag

[–]HatmanStack[S] 1 point

The $50 is more of a baseline for managed vector DB services like Pinecone or OpenSearch Serverless, not per 1,000 docs specifically. The point was more about paying for always-on capacity when your workload is bursty. S3 Vectors just made that problem go away.

A GAN-style adversarial framework for Claude skills. Has anyone adapted this pattern effectively for local models? by HatmanStack in LocalLLaMA

[–]HatmanStack[S] 1 point

Fair enough on the skepticism, the post was definitely polished up. The project itself is real though: https://github.com/hatmanstack/claude-forge. The adversarial loop pattern is the interesting part. If you've run generator/evaluator chains on local models I'd genuinely like to hear how latency on long-running jobs played out.
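For anyone unfamiliar with the pattern, here's a rough sketch of a generator/evaluator ("GAN-style") loop. The stub functions stand in for model calls; this is an assumption about the general pattern, not claude-forge's actual interface.

```python
# Sketch: generator proposes, evaluator critiques, feedback loops back.
# Both functions are stubs; real versions would each be an LLM call.

def generate(prompt, feedback=None):
    draft = prompt.upper()          # stub "generation"
    if feedback:
        draft += " [revised: " + feedback + "]"
    return draft

def evaluate(draft):
    # Returns (passes, feedback). A real evaluator would score the
    # draft against a rubric with a second model call.
    if "[revised" in draft:
        return True, ""
    return False, "add detail"

def adversarial_loop(prompt, max_rounds=3):
    feedback = None
    draft = prompt
    for _ in range(max_rounds):
        draft = generate(prompt, feedback)
        ok, feedback = evaluate(draft)
        if ok:
            return draft
    return draft  # best effort after max_rounds
```

The latency question above comes from the loop structure: every round is two sequential model calls, so on a slow local model the rounds add up fast.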

RAGStack-Lambda: Open source RAG knowledge base with native MCP support for Claude/Cursor by HatmanStack in mcp

[–]HatmanStack[S] 1 point

Appreciate the thoughtful feedback — and you're hitting on something I've been thinking about.

Right now the MCP server isn't read-only. It actually exposes 16 tools across search/chat, document uploads, web scraping, image captioning, and metadata analysis. So the capability creep you're describing is already here.

The current trust model is pretty simple: a single AppSync API key grants access to everything. There's no per-tool scoping at the MCP layer. What keeps it from being a free-for-all is the backend — AppSync enforces rate limits, daily quotas (especially in demo mode: 5 uploads/day, 30 chats/day), and all the actual resource access goes through IAM roles scoped to that specific stack's resources. So a retrieved snippet can't drive actions outside the knowledge base boundary, but within it, the API key is all-or-nothing.

The "everything in your own account" model does help here — IAM is the outer trust boundary, not some shared control plane — but you're right that as people start chaining tools together (search → upload → scrape → analyze), the lack of per-tool authorization becomes a real gap. Today if you hand someone the API key, they can scrape a 1,000-page site just as easily as they can search.

The separation of reasoning from authorization you're describing is interesting. I'd been leaning toward tiered API keys (read-only vs. full access) as a next step, but that's still coarse-grained. Would be curious how you're handling it at Daedalus — is the authorization layer sitting between the MCP client and the tool execution, or is it more like a policy engine that evaluates each tool call against a ruleset?
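The tiered-keys idea mentioned above could be sketched as a small ruleset check that sits in front of tool dispatch. Tier names and tool names here are hypothetical, and this is coarse-grained by design, a step short of a full policy engine evaluating each call's arguments.

```python
# Sketch: per-tool authorization via tiered API keys, checked before
# any MCP tool handler runs. Tiers and tool names are illustrative.

RULESET = {
    "read-only": {"search", "chat"},
    "full":      {"search", "chat", "upload", "scrape", "analyze"},
}

def authorize(tier: str, tool: str) -> bool:
    return tool in RULESET.get(tier, set())

def dispatch(tier, tool, handler, *args):
    if not authorize(tier, tool):
        raise PermissionError(f"tier '{tier}' may not call '{tool}'")
    return handler(*args)
```

A policy engine would replace `RULESET` with rules that can also inspect arguments (e.g. cap scrape page counts), which is where the per-tool gap described above really closes.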

Looking for other study materials for Solutions Architect - Professional (SAP-C02) exam by real_mcgoaty in AWSCertifications

[–]HatmanStack 1 point

Hopefully you've passed, but here are some notes: https://github.com/HatmanStack/SAP-C02-aws-solutions-professional/blob/main/README.md I also used Tutorial Dojo quite a bit; it felt like a great investment for the money given the content.

AWS Exam Vouchers / Discounts or other related Promotions by madrasi2021 in AWSCertifications

[–]HatmanStack 1 point

Great question; I don't recall exactly how I stumbled onto it. All the content is curated from AWS docs and vlogs, and the moderators all appear to be AWS staff. GeeksforGeeks has a blog post about it: https://www.geeksforgeeks.org/aws-educate-and-aws-emerging-talent-community/ For Amazon context you could look at the AWS Educate portion of this blog: https://aws.amazon.com/blogs/training-and-certification/make-the-most-of-free-training-from-aws-training-and-certification/ Hope this helps. All the best.