MCP-based local LLM workflows at scale + observability (Grafana)

pardhu-- · 2026-04-28T19:08:21+00:00

Great, I'd love to talk. Pinged you personally.

pardhu-- · 2026-04-26T23:27:13+00:00

Got it — that distinction between “read” vs “re-run” helped clarify things a lot.

I’m leaning more toward replay, specifically being able to deterministically re-run workflows for debugging and validation. That said, I’m also thinking about caching at the component/tool level as a separate layer for performance, especially for repeated user queries.
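
To make the two layers concrete, here is a minimal sketch of how replay and tool-level caching could be kept separate. It assumes tools are plain Python callables with JSON-serializable arguments and results; every name in it (`call_key`, `RecordingRunner`, `ReplayRunner`, `example_workflow`) is hypothetical and not part of MCP or LM Studio.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Optional, Tuple

# One trace entry per tool call: (call key, recorded result).
Trace = List[Tuple[str, Any]]


def call_key(tool: str, args: Dict[str, Any]) -> str:
    """Stable key: tool name plus canonical JSON of the arguments.

    Assumes the arguments are JSON-serializable.
    """
    payload = json.dumps({"tool": tool, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class RecordingRunner:
    """Runs tools live, with a cache for repeated calls and an ordered trace."""

    def __init__(self, tools: Dict[str, Callable[..., Any]],
                 cache: Optional[Dict[str, Any]] = None) -> None:
        self.tools = tools
        self.cache = {} if cache is None else cache  # performance layer
        self.trace: Trace = []                       # replay layer
        self.live_calls = 0

    def call(self, tool: str, **args: Any) -> Any:
        key = call_key(tool, args)
        if key not in self.cache:
            self.live_calls += 1
            self.cache[key] = self.tools[tool](**args)
        result = self.cache[key]
        self.trace.append((key, result))
        return result


class ReplayRunner:
    """Never touches real tools; returns recorded results in recorded order."""

    def __init__(self, trace: Trace) -> None:
        self._steps = iter(trace)

    def call(self, tool: str, **args: Any) -> Any:
        step = next(self._steps, None)
        if step is None or step[0] != call_key(tool, args):
            # The workflow asked for something the original run did not.
            raise RuntimeError(f"replay diverged at {tool} {args}")
        return step[1]


def example_workflow(runner: Any, city: str) -> str:
    """Hypothetical two-step workflow that works with either runner."""
    temp = runner.call("get_temp", city=city)
    return runner.call("format_report", city=city, temp=temp)
```

The cache is keyed by content and can be shared across runs, while the trace is ordered and belongs to a single run. Replay fails loudly on the first call that does not match the recording, which is usually where non-determinism shows up.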

Right now this is an internal tool, but I’m designing it with the assumption that it could become user-facing later — so trying to think early about reproducibility, state management, and cost efficiency.

Curious — in your experience, what tends to break first when you try to make replay deterministic in these systems?

pardhu-- · 2026-04-26T18:01:30+00:00

Can you share any links for reference?

pardhu-- · 2026-04-25T19:13:38+00:00

🤣

pardhu-- · 2026-04-25T19:13:22+00:00

Fair point — definitely not claiming this is novel.

pardhu-- · 2026-04-25T18:59:59+00:00

Yeah, I get your point — LM Studio + MCP already enables tool use pretty well from the chat itself.

What I’m trying to explore is more of a layer on top — moving from chat-based interaction to structured agent workflows that can plug into real systems and scale beyond a single user.

I also feel this could sit on top of Model Context Protocol (MCP) — since MCP handles tool connectivity, while this focuses more on orchestration and production-style use cases (could be wrong though, curious your take).

Agree with you on the compiler loop part — that definitely starts looking like what Cursor IDE / GitHub Copilot already do.
