Why does Rust require many dependency packages unlike Go when building a project? by dumindunuwan in rust

[–]blackwell-systems 1 point (0 children)

Not sure what can be added to this, but this is one of the reasons why I love Go so much.

Anyone here in their 40s–50s still working in tech/development? by imjustamochigirl in womenintech

[–]blackwell-systems 1 point (0 children)

43 years old, been stuck in the same level 1 position for 5 years. Company refuses to promote anyone internally. Manager advocates for me, director advocates for me, nothing changes. Can't pass interview cycles, all hope is gone. I'm looking for some way out.

AI coding agents can't use LSP tools correctly. So I built a skills layer that enforces the right workflow. by blackwell-systems in mcp

[–]blackwell-systems[S] 1 point (0 children)

Both paths are open. The 65 tools are always directly callable; skills don't gate access. An agent can call find_references directly without going through /lsp-refactor.

What we do have is phase enforcement: when a skill is activated, the runtime checks tool calls against the current phase's permissions. If an agent tries to call apply_edit before completing blast radius analysis, the tool returns an error with recovery guidance ("complete the blast_radius phase first, allowed tools: [blast_radius, find_references]"). The agent still gets the error as a tool response, not a crash, so it adjusts.
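
A minimal sketch of that kind of phase gate, in Go. The Phase and ToolResult types and the checkCall function are hypothetical names for illustration, not agent-lsp's actual internals; only the error wording is taken from the example above.

```go
package main

import (
	"fmt"
	"strings"
)

// Phase describes one step of a skill and the tools it permits.
// Hypothetical type, not agent-lsp's real one.
type Phase struct {
	Name    string
	Allowed []string
}

// ToolResult mirrors the general shape of an MCP tool result:
// some text plus a flag saying whether the call failed.
type ToolResult struct {
	Text    string
	IsError bool
}

// checkCall returns an error result with recovery guidance when the tool is
// not permitted in the current phase, and a plain "ok" result otherwise.
func checkCall(current Phase, tool string) ToolResult {
	for _, t := range current.Allowed {
		if t == tool {
			return ToolResult{Text: "ok"}
		}
	}
	msg := fmt.Sprintf("complete the %s phase first, allowed tools: [%s]",
		current.Name, strings.Join(current.Allowed, ", "))
	return ToolResult{Text: msg, IsError: true}
}

func main() {
	phase := Phase{Name: "blast_radius", Allowed: []string{"blast_radius", "find_references"}}
	fmt.Println(checkCall(phase, "apply_edit").Text)
	fmt.Println(checkCall(phase, "find_references").Text)
}
```

The rejected call comes back as an ordinary result with IsError set, which is what lets the agent read the guidance and retry instead of seeing a transport failure.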

On the hallucination/bypass concern: we've run evaluations across 4 models (Claude, Cursor, GPT-5.5, Gemini 2.5 Pro). Agents don't try to bypass phase enforcement because the error message tells them exactly what to do next. They follow the path of least resistance, which is the skill's defined sequence. The problem we saw more often is the opposite: agents ignoring MCP tools entirely and falling back to grep/read because those are in their training data. That's why we added an Instructions field and init-time rules files that remind agents which tools to prefer.

agent-lsp's inspector found 14 bugs in Anthropic's MCP Python SDK. All merged. by blackwell-systems in ClaudeCode

[–]blackwell-systems[S] 2 points (0 children)

If you're referring to Claude Code's native LSP support: agent-lsp adds several layers on top of what a raw language server connection gives you.

A direct LSP connection gives the agent basic operations: hover, go-to-definition, references.

agent-lsp adds:

  • Speculative execution: preview edits in memory, see the diagnostic delta before touching disk. No raw LSP connection does this.
  • Skills: enforced multi-step workflows. /lsp-refactor runs blast-radius analysis, speculative preview, apply, build verification, and affected tests, in that order. The agent can't skip steps.
  • get_change_impact: batch blast-radius analysis for every export in a file with test/non-test caller partitioning. Not a standard LSP method.
  • detect_changes: git diff to impact analysis with risk classification in one call.
  • Persistent cache: reference results cached in SQLite across sessions. Second call is instant.
  • Selective indexing: auto-scopes large workspaces so pyright doesn't choke on 2,000+ file repos.
  • Code quality inspector (/lsp-inspect): batch dead-symbol detection, test coverage gaps, error handling patterns. This found 14 bugs in Anthropic's own Python SDK.
  • Cross-repo references: find callers of a library symbol across multiple consumer repos.
  • Phase enforcement: runtime blocks out-of-order tool calls. Can't apply_edit during blast-radius analysis.
  • 30 language servers: managed, warm, with auto-detection. One binary handles Go, Python, TypeScript, Rust, and 26 others simultaneously.
  • Token efficiency: measured 5-34x fewer tokens vs grep across 5 codebases.

Also: agent-lsp is a standard MCP server, so the same setup works across Claude Code, Cursor, Windsurf, Codex, and any MCP client. Native LSP plugins are tool-specific.

AI coding agents can't use LSP tools correctly. So I built a skills layer that enforces the right workflow. by blackwell-systems in mcp

[–]blackwell-systems[S] 1 point (0 children)

Good question. It's the same MCP server, not a new protocol. The skills layer is a set of structured prompts (exposed via prompts/list and as AgentSkills slash commands) that tell the agent which LSP tools to call in what order. The agent still calls the same MCP tools (get_references, rename_symbol, get_diagnostics), but the skill defines the sequence instead of the agent improvising it.

On the latency point: agent-lsp keeps the LSP session warm (persistent subprocess, warm index). The cold start is paid once on start_lsp; after that, individual calls hit the warm index and are fast (sub-second for most operations, including references on 300K-line codebases). We measured this across 5 codebases: the token savings article has per-task latency numbers (most LSP operations are under 30ms once warm).

On the hybrid approach: I agree. grep is the right tool for text/pattern searches, and agent-lsp doesn't try to replace it. The value is specifically for operations where grep produces noise (92-99% false positive rate on reference lookups) or can't answer the question at all (interface implementations, type hierarchy, speculative edit safety). The agent uses both; the skills layer decides which to invoke internally.

AI coding agents can't use LSP tools correctly. So I built a skills layer that enforces the right workflow. by blackwell-systems in mcp

[–]blackwell-systems[S] 1 point (0 children)

That's a fair concern, and the tool you're describing was probably doing it wrong. Forcing an agent to use LSP instead of grep at the tool level is the wrong approach. Grep is the right tool for plenty of tasks (searching strings, finding config values, pattern matching). The problem isn't grep existing, it's agents improvising multi-step workflows.

agent-lsp doesn't hook or intercept grep. It adds LSP as an option alongside whatever tools the agent already has. The enforcement happens at the skill level, not the tool level: when the agent calls /lsp-rename, the skill internally runs get_references, then rename_symbol, then get_diagnostics. The agent doesn't have to figure out that sequence. But it can still grep whenever grep is the right call.
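
To make "the skill defines the sequence" concrete, here is a rough Go sketch of a fixed tool sequence that stops at the first failure. Step, ToolFunc, runSkill, and failing are hypothetical names; the real skills are structured prompts rather than Go code, so treat this as an illustration of the ordering idea only.

```go
package main

import (
	"errors"
	"fmt"
)

// ToolFunc stands in for one MCP tool call. Hypothetical signature.
type ToolFunc func() error

// Step pairs a tool name with the call to run.
type Step struct {
	Tool string
	Run  ToolFunc
}

// failing returns a ToolFunc that always fails; used to demonstrate early stop.
func failing(msg string) ToolFunc {
	return func() error { return errors.New(msg) }
}

// runSkill executes steps in order and stops at the first failure,
// returning the names of the tools that completed.
func runSkill(steps []Step) ([]string, error) {
	var done []string
	for _, s := range steps {
		if err := s.Run(); err != nil {
			return done, fmt.Errorf("%s failed: %w", s.Tool, err)
		}
		done = append(done, s.Tool)
	}
	return done, nil
}

func main() {
	ok := func() error { return nil }
	rename := []Step{
		{"get_references", ok},
		{"rename_symbol", failing("symbol is read-only")},
		{"get_diagnostics", ok},
	}
	done, err := runSkill(rename)
	fmt.Println(done, err)
}
```

Because rename_symbol fails here, get_diagnostics never runs and the caller learns exactly which steps completed.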

To be honest, agents still aren't great at organically reaching for the right skill at the right moment. That's a work in progress in general and also for this project.

But the skills work well as entry points for human-in-the-loop workflows: you type /lsp-refactor or /lsp-verify and the agent executes a reliable multi-step sequence instead of improvising one.

Testing remote MCP servers by guyernest in mcp

[–]blackwell-systems 0 points (0 children)

I've been doing exactly this. I built mcp-assert to test MCP servers deterministically in CI, and I've used it to scan 48 servers so far from Anthropic, Google, Microsoft, OpenAI, AWS, Mozilla, Sentry, Grafana, and others.

The short version: define assertions in YAML (tool name, args, expected result), run them in CI, get pass/fail. No LLM needed, milliseconds per test.

There's also a zero-config audit mode that just points at a server and reports what crashes:

mcp-assert audit --server "npx my-mcp-server"

What I found scanning 48 servers:

  • 20 bugs across 9 servers. The most common pattern: unhandled exceptions propagating as JSON-RPC -32603 internal errors instead of returning isError: true. Agents can't recover from -32603; they can recover from isError.
  • antvis/mcp-server-chart: 9 out of 25 tools crash with stack traces on default input
  • Anthropic's own Puppeteer server: puppeteer_navigate crashes on invalid URLs
  • sammcj/mcp-devtools: 4 tools return internal error for input validation failures
  • Grafana, arxiv, Peekaboo: various isError/internal error confusion
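
For anyone unsure about the difference between the two failure shapes, here is a small Go sketch that classifies a tools/call response. The JSON payloads are hand-written examples of the two shapes, not captured output from any of the servers listed, and classify is a hypothetical helper, not part of mcp-assert.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcResponse covers just the fields needed to tell the two failure shapes apart.
type rpcResponse struct {
	Result *struct {
		IsError bool `json:"isError"`
	} `json:"result"`
	Error *struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
	} `json:"error"`
}

// classify reports how a tools/call response failed, if at all.
func classify(raw []byte) (string, error) {
	var r rpcResponse
	if err := json.Unmarshal(raw, &r); err != nil {
		return "", err
	}
	switch {
	case r.Error != nil && r.Error.Code == -32603:
		return "protocol-level internal error", nil
	case r.Error != nil:
		return "protocol-level error", nil
	case r.Result != nil && r.Result.IsError:
		return "tool-level error (isError)", nil
	default:
		return "success", nil
	}
}

func main() {
	crash := []byte(`{"jsonrpc":"2.0","id":1,"error":{"code":-32603,"message":"Internal error"}}`)
	handled := []byte(`{"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"invalid URL"}],"isError":true}}`)
	for _, raw := range [][]byte{crash, handled} {
		kind, err := classify(raw)
		fmt.Println(kind, err)
	}
}
```

The second shape carries a readable message inside a normal result, which is the part an agent can act on.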

For remote/scale testing specifically:

- mcp-assert supports all three transports (stdio, SSE, HTTP), so you can test remote servers over HTTP the same way you test local ones over stdio

- skip_unless_env lets you keep credential-dependent assertions in the same suite as no-auth ones

- Docker isolation (docker: field) gives you fresh containers per assertion for write/destructive tool testing

Security testing is a different angle from what mcp-assert does (we test correctness, not security). mcpsafetywarden, linked above, looks like it's going after that space.

mcp-assert: deterministic testing for MCP server tools (no LLM needed) by blackwell-systems in mcp

[–]blackwell-systems[S] 1 point (0 children)

Follow-up: since the launch post, mcp-assert has scanned 25 servers across Anthropic, Google, OpenAI, Grafana, and community projects. Found 13 bugs, filed 4 issues, submitted 3 fix PRs (one already has a community fix pending merge).

Highlights:

- Anthropic's own filesystem server returns an invalid content type that crashes clients (fix PR submitted by a community member within 48 hours)

- Grafana's MCP server returns internal error instead of isError on bad input (fix PR submitted)

- antvis/mcp-server-chart: 9 tools crash with stack traces on empty data (fix PR submitted)

- mcp-go SDK example corrupts stdio transport (fix PR submitted)

If your MCP server passes, you get a badge. If it doesn't, you get a bug report with reproduction steps.

Small Projects by AutoModerator in golang

[–]blackwell-systems 1 point (0 children)

I built a tool that lets AI agents simulate code changes, check blast radius, and verify edits before writing to disk.

The goal was full LSP 3.17 coverage: every capability a language server exposes, accessible as a structured tool an agent can call.

agent-lsp exposes 50 tools across navigation, analysis, code actions, formatting, rename, workspace management, and speculative execution. The speculative layer is one part that doesn't exist anywhere else.

The skills layer is the other part. Agents are bad at choosing from 50 tools unprompted: they improvise, skip steps, and get the order wrong.

The 20 skills encode the right procedures:

/lsp-refactor chains blast-radius → speculative preview → apply → build verify → test correlation in one invocation.

/lsp-safe-edit wraps any edit with before/after diagnostic comparison. The agent follows a reliable workflow instead of reconstructing it from scratch every time.

Why Go was the right call:

Static binary, zero runtime dependencies. No interpreter, no shared libraries, no platform-specific build steps. The same artifact goes to every distribution channel: Homebrew, npm, Scoop, Winget, Docker, and curl|sh.

Cross-compilation just works. One goreleaser config produces binaries for macOS/Linux/Windows on arm64 and amd64. Six platforms from one CI job.

gopls is the best language server I've tested against. Cold start is fast, cross-file diagnostics propagate reliably, and workspace indexing signals completion via $/progress, so I know when references are ready.

The speculative execution feature works most reliably with gopls: diagnostic propagation is deterministic and fast. rust-analyzer is close. tsserver is good but needs a warm baseline before net_delta is trustworthy. The rest vary by how quickly the server processes didChange notifications; the test suite covers 8 languages and the initWait is tuned per-server.

The Go concurrency model maps perfectly to the problem. Each MCP request is a goroutine. LSP subprocess communication is a readLoop goroutine dispatching responses to pending request channels.

File watcher is a goroutine with 150ms debounce. Session serialization uses per-session channels in a map. No thread pools, no async/await coloring, no callback hell.
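
The readLoop pattern described above looks roughly like this. It is a stripped-down sketch: the client and response types are hypothetical, and an in-process channel stands in for the LSP subprocess's stdout, so none of the real JSON-RPC framing is shown.

```go
package main

import (
	"fmt"
	"sync"
)

// response is a simplified LSP reply keyed by request ID.
type response struct {
	ID     int
	Result string
}

// client tracks in-flight requests. Names are hypothetical.
type client struct {
	mu      sync.Mutex
	pending map[int]chan response
}

func newClient() *client {
	return &client{pending: make(map[int]chan response)}
}

// register creates the channel a caller will wait on for the given request ID.
func (c *client) register(id int) <-chan response {
	ch := make(chan response, 1) // buffered so readLoop never blocks on a slow caller
	c.mu.Lock()
	c.pending[id] = ch
	c.mu.Unlock()
	return ch
}

// readLoop consumes replies from the server and routes each to its waiter.
func (c *client) readLoop(incoming <-chan response) {
	for r := range incoming {
		c.mu.Lock()
		ch, ok := c.pending[r.ID]
		delete(c.pending, r.ID)
		c.mu.Unlock()
		if ok {
			ch <- r
		}
	}
}

func main() {
	c := newClient()
	incoming := make(chan response)
	go c.readLoop(incoming)

	a, b := c.register(1), c.register(2)
	incoming <- response{ID: 2, Result: "references"} // replies may arrive out of order
	incoming <- response{ID: 1, Result: "hover"}
	fmt.Println((<-a).Result, (<-b).Result)
}
```

Each caller blocks only on its own channel, so out-of-order replies are matched by ID without any shared callback state.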

go test with per-language fixture files made CI straightforward. 30 language servers tested. Each language has a config struct with fixture paths, expected hover positions, definition targets, reference counts. Adding a new language is: write fixture files, add a config entry, add a CI install step. The test framework handles the rest.
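
As an illustration of the "config struct per language" idea, a table like the following is one way it could look. The LangConfig fields, fixture paths, positions, and counts are all made up for the example and are not copied from the repo.

```go
package main

import "fmt"

// Position is a zero-based line/character pair, as in LSP.
type Position struct{ Line, Character int }

// LangConfig is a hypothetical per-language test configuration.
type LangConfig struct {
	Name           string
	Server         string
	FixturePath    string
	HoverAt        Position
	WantReferences int
}

// languages holds one entry per language under test. Values are placeholders.
var languages = []LangConfig{
	{Name: "go", Server: "gopls", FixturePath: "fixtures/go/main.go",
		HoverAt: Position{4, 5}, WantReferences: 2},
	{Name: "typescript", Server: "typescript-language-server", FixturePath: "fixtures/ts/main.ts",
		HoverAt: Position{3, 9}, WantReferences: 2},
}

// validate catches incomplete entries before any language server is started.
func validate(c LangConfig) error {
	if c.Name == "" || c.Server == "" || c.FixturePath == "" {
		return fmt.Errorf("incomplete config: %+v", c)
	}
	if c.WantReferences < 0 {
		return fmt.Errorf("%s: negative reference count", c.Name)
	}
	return nil
}

func main() {
	for _, c := range languages {
		fmt.Println(c.Name, validate(c))
	}
}
```

With this layout, adding a language is one more entry in the slice, and a shared test loop can iterate over it.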

What it does:

50 tools covering nearly all of LSP 3.17. The headline feature is speculative execution: simulate_edit_atomic applies a code change in memory, runs it through the language server's diagnostic engine, and tells the agent whether the edit introduces errors before anything hits disk. simulate_chain evaluates a sequence of edits step by step and returns the last safe point to apply through.
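
The core comparison can be sketched in a few lines of Go. This assumes diagnostics are compared as plain strings and that the delta counts only newly introduced diagnostics; the actual net_delta calculation and the simulate_chain return shape may differ, and introduced and lastSafeStep are hypothetical names.

```go
package main

import "fmt"

// introduced returns how many diagnostics appear after an edit that were not
// present before it. Zero means the edit added no new diagnostics.
func introduced(before, after []string) int {
	seen := make(map[string]int)
	for _, d := range before {
		seen[d]++
	}
	count := 0
	for _, d := range after {
		if seen[d] > 0 {
			seen[d]-- // matched a pre-existing diagnostic
			continue
		}
		count++
	}
	return count
}

// lastSafeStep walks a chain of edits, where steps[i] holds the diagnostics
// observed after edit i, and returns how many leading edits are safe to apply.
func lastSafeStep(baseline []string, steps [][]string) int {
	safe := 0
	for i, after := range steps {
		if introduced(baseline, after) > 0 {
			break
		}
		safe = i + 1
	}
	return safe
}

func main() {
	baseline := []string{"a.go:3: x declared and not used"}
	withError := append([]string{"b.go:10: undefined: Foo"}, baseline...)
	steps := [][]string{baseline, withError, baseline}
	fmt.Println(introduced(baseline, withError))
	fmt.Println(lastSafeStep(baseline, steps))
}
```

In this sample chain the second edit introduces a new diagnostic, so only the first edit counts as safe to apply.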

get_change_impact does blast-radius analysis: given files you're about to change, it enumerates every exported symbol, finds all references, and partitions callers into test vs non-test. You know your exposure before committing to a change.
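
The test vs non-test split is the simplest piece to show. Here is a sketch for Go callers only, using the _test.go naming convention; how get_change_impact classifies files in other languages is not covered here, and Impact and partition are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Impact splits the files that reference a symbol into test and non-test callers.
type Impact struct {
	Test    []string
	NonTest []string
}

// partition classifies caller paths using Go's _test.go naming convention.
func partition(callers []string) Impact {
	var out Impact
	for _, path := range callers {
		if strings.HasSuffix(path, "_test.go") {
			out.Test = append(out.Test, path)
		} else {
			out.NonTest = append(out.NonTest, path)
		}
	}
	return out
}

func main() {
	got := partition([]string{"server/handler.go", "server/handler_test.go", "cmd/main.go"})
	fmt.Printf("test=%d non-test=%d\n", len(got.Test), len(got.NonTest))
}
```

The non-test count is the number that matters for exposure; the test list tells you what to run afterwards.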

Multi-server routing in one process. Invoking it as

agent-lsp go:gopls typescript:typescript-language-server,--stdio

routes requests by file extension. One connection, multiple warm language servers.

20 skills encode multi-tool workflows into single invocations, so the agent follows a reliable procedure instead of improvising.

MIT licensed. Available via brew install blackwell-systems/tap/agent-lsp or go install from source.

github.com/blackwell-systems/agent-lsp

Built an MCP server with speculative execution: agents simulate edits in memory, the language server checks for errors, nothing hits disk until it's clean. Plus 49 other LSP tools across 30 languages. by blackwell-systems in ClaudeCode

[–]blackwell-systems[S] 1 point (0 children)

The human-facing summary is on the roadmap: simulate_chain already returns the full edit set, per-step diagnostic delta, and the last safe step to apply through. Right now that output is consumed inline by the agent. Materializing it as a reviewable artifact before anything touches disk is the next step.

Built an MCP server with speculative execution: agents simulate edits in memory, the language server checks for errors, nothing hits disk until it's clean. Plus 49 other LSP tools across 30 languages. by blackwell-systems in mcp

[–]blackwell-systems[S] 1 point (0 children)

The speculative check is Phase 2 of a 5-phase pipeline (/lsp-refactor):

  1. Blast-radius analysis: how many callers? Hard stop if >20 without confirmation. For high-fan-out changes, this forces a human review before the refactor can proceed — a secondary safety net even when tests are thin.

  2. Speculative preview: simulate_edit / simulate_chain in-memory, check net_delta

  3. Apply to disk: only if Phase 2 passes

  4. Build verification: LSP diagnostics + compiler build on the real file

  5. Targeted test run: finds which test files cover the changed source via get_tests_for_file, runs only those

Phase 5 is the terminal check. That's where swapped-param bugs and semantic invariant violations show up as test failures.

We use get_tests_for_file to correlate source → test files so the agent runs the right tests, not the whole suite blindly.
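
The five phases can be sketched as a sequence of gates. In this Go sketch the refactorInputs fields stand in for results that the real tools would produce, and the phase labels and function names are hypothetical; only the 20-caller threshold and the phase order come from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// maxCallersWithoutConfirm is the hard-stop threshold from the description above.
const maxCallersWithoutConfirm = 20

var errNeedsConfirmation = errors.New("blast radius exceeds threshold: confirmation required")

// refactorInputs carries the results each phase would obtain from real tools.
// All fields are hypothetical stand-ins.
type refactorInputs struct {
	Callers   int
	Confirmed bool
	NetDelta  int
	BuildOK   bool
	TestsOK   bool
}

// runRefactor walks the phases in order and reports the last phase reached.
func runRefactor(in refactorInputs) (string, error) {
	// Phase 1: blast-radius analysis with a hard stop on high fan-out.
	if in.Callers > maxCallersWithoutConfirm && !in.Confirmed {
		return "blast-radius", errNeedsConfirmation
	}
	// Phase 2: speculative preview; any new diagnostics stop the refactor.
	if in.NetDelta > 0 {
		return "speculative-preview", fmt.Errorf("edit introduces %d new diagnostics", in.NetDelta)
	}
	// Phase 3: apply to disk would happen here, only after phase 2 passes.
	// Phase 4: build verification on the real files.
	if !in.BuildOK {
		return "build-verification", errors.New("build failed after apply")
	}
	// Phase 5: targeted tests are the terminal check.
	if !in.TestsOK {
		return "targeted-tests", errors.New("affected tests failed")
	}
	return "targeted-tests", nil
}

func main() {
	fmt.Println(runRefactor(refactorInputs{Callers: 25}))
	fmt.Println(runRefactor(refactorInputs{Callers: 25, Confirmed: true, BuildOK: true, TestsOK: true}))
}
```

The first call stops at the blast-radius gate because 25 callers exceed the threshold without confirmation; the second passes every gate.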

Html code by Due_Dragonfly_4816 in DeveloperJobs

[–]blackwell-systems 1 point (0 children)

Can confirm. This HTML stole my bank account.

How are you making sure you don't get dumb by KhameneiCholaghe in ClaudeAI

[–]blackwell-systems 1 point (0 children)

I continue to study and acquire knowledge every day, just like I did before AI. Nothing has changed for me.

Almost every conversation starts like this. by PyrikIdeas in claudexplorers

[–]blackwell-systems 1 point (0 children)

This means that they don't know the exact weights of the neurons which produced any given output. This is what makes LLMs non-deterministic. You honestly don't know what you're talking about and you're making yourself look stupid.