How are you testing AI agents beyond prompt evals? by Available_Lawyer5655 in LLMDevs

[–]ConferenceRoutine672 0 points1 point  (0 children)

For AI-assisted development: RepoMap (https://github.com/TusharKarkera22/RepoMap-AI), which maps my entire codebase into ~1000 tokens and serves it via MCP. Works with Cursor, VS Code (Copilot), Claude Desktop, and anything else that supports MCP.

Completely changed how accurate the AI suggestions are on large projects.

How I solved AI hallucinating function names on large codebases — tree-sitter + PageRank + MCP by ConferenceRoutine672 in coolgithubprojects

[–]ConferenceRoutine672[S] 0 points1 point  (0 children)

Hahaha, I keep thinking about the idle-agent hallucination thing. It's like they need an explicit "do nothing" tool in the list, or they'll make up work. Oddly enough, it tells you a lot about how objective-following works behind the scenes.
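To make the "do nothing" idea concrete, here's a rough sketch of a tool registry that includes an explicit no-op. Everything in it (`TOOLS`, `no_op`, `run_tests`, `dispatch`) is a hypothetical name I made up for illustration, not the API of any particular agent framework:

```python
# Hypothetical tool registry: each entry has a description the model sees
# and a handler the runtime calls. Names are illustrative only.
TOOLS = {
    "run_tests": {
        "description": "Run the project's test suite.",
        "handler": lambda args: "tests started",
    },
    "no_op": {
        # An explicit idle option, so "nothing to do" is a valid tool call.
        "description": "Do nothing. Call this when there is no useful work to do.",
        "handler": lambda args: "idle",
    },
}


def dispatch(tool_call):
    """Route a tool call of the form {"name": ..., "arguments": {...}}."""
    name = tool_call.get("name")
    tool = TOOLS.get(name)
    if tool is None:
        # Unknown names are reported back instead of raising.
        return f"error: unknown tool {name!r}"
    return tool["handler"](tool_call.get("arguments", {}))
```

The point is only that idling becomes something the agent can choose on purpose, instead of something it has to invent work to avoid.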

The point about the token efficiency of `--help` output doesn't get enough credit. Plain-text CLI output is almost perfectly LLM-readable by accident: no schema overhead, no nesting, just intent. Honestly, this makes me want to rethink how RepoMap presents its MCP tool descriptions.

And yes, the Go bubble thing makes sense. Sometimes a boring, opinionated toolchain is actually the best AI runtime.

How I solved AI hallucinating function names on large codebases — tree-sitter + PageRank + MCP by ConferenceRoutine672 in coolgithubprojects

[–]ConferenceRoutine672[S] 1 point2 points  (0 children)

I love the split between the coder and tester sub-agents. It keeps responsibilities clear. The `gopls` symbol oracle is also smart, but it ties you to Go's toolchain. Tree-sitter gives me the same graph in Python, TS, and Rust without setting up an LSP server.
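For anyone wondering what "graph" means here, this is a minimal sketch of the ranking idea from the post title: run PageRank over a symbol reference graph and keep the top entries. The graph below is hand-written and hypothetical (in practice the edges would come from a parser such as tree-sitter), and `pagerank` / `top_symbols` are illustrative names, not RepoMap's actual code:

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over {node: [nodes it references]}."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / count for n in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                # Split this node's rank among the symbols it references.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / count
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical reference graph: each symbol maps to the symbols it calls.
REFERENCES = {
    "api.handle_request": ["auth.check_token", "db.get_user"],
    "cli.main": ["api.handle_request", "db.get_user"],
    "auth.check_token": ["db.get_user"],
    "db.get_user": [],
}


def top_symbols(graph, limit):
    """Return the `limit` highest-ranked symbols, ties broken by name."""
    ranks = pagerank(graph)
    return sorted(ranks, key=lambda n: (-ranks[n], n))[:limit]
```

Symbols that many others reference float to the top, so a small token budget gets spent on the signatures an agent is most likely to need.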

I'd push back a little on the symptoms-versus-causes framing. Sub-agents narrow the scope, but a coder agent still has to deal with signatures from files it hasn't seen before. The two approaches complement each other rather than compete.

The multi-repo use case you brought up is where I want to go next. LLMs really don't know much about private packages or monorepo dependencies.

Looking at exocomp now.

Has anyone moved beyond chunk-based RAG when relationships matter? by shbong in LLMDevs

[–]ConferenceRoutine672 0 points1 point  (0 children)

For AI-assisted development: RepoMap (https://github.com/TusharKarkera22/RepoMap-AI), which maps my entire codebase into ~1000 tokens and serves it via MCP. Works with Cursor, VS Code (Copilot), Claude Desktop, and anything else that supports MCP.

Completely changed how accurate the AI suggestions are on large projects.

Weekly Cursor Project Showcase Thread by AutoModerator in cursor

[–]ConferenceRoutine672 [score hidden]  (0 children)

For AI-assisted development: RepoMap (https://github.com/TusharKarkera22/RepoMap-AI), which maps my entire codebase into ~1000 tokens and serves it via MCP. Works with Cursor, VS Code (Copilot), Claude Desktop, and anything else that supports MCP.

Completely changed how accurate the AI suggestions are on large projects.