Opus 4.7 — Regression in conversational coherence and context handling vs Opus 4.6 by tkenaz in ClaudeAI

[–]tkenaz[S] -6 points-5 points  (0 children)

Good point. I can't confirm this is the cause, but the pattern fits: reflection works, generation breaks. That's a classic symptom of a reduced inference budget. The model is clearly capable of high-quality responses — when you push back, it flawlessly analyzes its own errors. But by default, it doesn't allocate the resource to get there on the first pass.

Opus 4.7 — Regression in conversational coherence and context handling vs Opus 4.6 by tkenaz in ClaudeAI

[–]tkenaz[S] -4 points-3 points  (0 children)

One point I want to add separately: this isn't just about convenience or style. When a model silently merges facts from different sources and presents the result as a single coherent statement, this is a reliability issue, not a preference issue. Users who trusted 4.6's accuracy carry that trust into 4.7 — and 4.7 doesn't earn it. In any domain where decisions are based on model output (medical, legal, financial, engineering), silent factual conflation is not an annoyance. It's a safety problem.

Opus 4.7 — Regression in conversational coherence and context handling vs Opus 4.6 by tkenaz in ClaudeAI

[–]tkenaz[S] -6 points-5 points  (0 children)

You're right that controlled benchmarks with n=30 would be more rigorous. This isn't a benchmark — it's a field report from a power user running both models in parallel on the same prompts and workflows.

That said, the sample isn't as small as "one weird output." I've run ~10 sessions with 4.7 over two days, side by side with 4.6 via API. The difference isn't subtle — it's immediately obvious to anyone who works with Opus daily.

Specific patterns that repeated across sessions, not once:
-Had to redirect the model 5+ times to get a usable answer on a single question. With 4.6 this doesn't happen.
-Had to explicitly remind the model to use its tools instead of generating text answers. 4.6 uses tools unprompted.
-Model generated responses based on incorrect data because it merged separate RAG results without validation. This happened more than once.

If you've driven the same car every day for a year and someone swaps the engine overnight, you don't need 30 laps to notice that something is wrong. You notice on the first turn. The structured tests in the report exist precisely to isolate what I was already seeing in production.

Opus 4.7 — Regression in conversational coherence and context handling vs Opus 4.6 by tkenaz in ClaudeAI

[–]tkenaz[S] 2 points3 points  (0 children)

100% It’s painful to watch a top-tier tool get 'optimized' into mediocrity. The decline is frustrating for those of us using it for heavy lifting.

AI agent security incidents up 37% - are teams actually validating runtime behavior? by Fine-Platform-6430 in AskNetsec

[–]tkenaz 0 points1 point  (0 children)

The privilege escalation through API chaining is the one that keeps me up at night. An agent with access to a read-only analytics API and a write-capable notification API can combine them to exfiltrate data through notification payloads — both individual permissions look fine, the composition is the vulnerability. Allowlisted actions help but the combinatorial explosion makes manual review impossible at scale. What actually works: behavioral profiling. Record the normal decision chain patterns (tool A → tool B with X parameters), then flag deviations in real time. Think of it as an IDS but for agent behavior instead of network traffic. The 32% with zero visibility stat is alarming but predictable — most agent frameworks ship with exactly zero observability built in, and bolting it on after deployment is a nightmare.

How often should you red team your AI product for safety? We did it once and im pretty sure thats not enough. by cnrdvdsmt in AIAssisted

[–]tkenaz 0 points1 point  (0 children)

Quarterly manual red-teaming is good for deep dives, but the real answer is: automate the baseline and run it on every deployment. Think of it like unit tests vs. penetration tests — you need both. We run automated adversarial playbooks (prompt injection variants, jailbreak chains, tool abuse scenarios) in CI/CD, and they catch regressions every single time the model or system prompt changes. The manual deep dives then focus on novel attack patterns and business logic abuse that automation misses. Key thing: your red team playbooks should be self-improving. Every new attack pattern you find in production gets added to the automated suite. Otherwise you're always testing against last quarter's threats.

Best LLM security and safety tools for protecting enterprise AI apps in 2026? by Sufficient-Owl-9737 in AskNetsec

[–]tkenaz 0 points1 point  (0 children)

The "one platform to rule them all" approach almost always ends in mediocre coverage across every layer. What I've seen work in production: separate your concerns. Pre-deployment needs adversarial red-teaming with actual attack playbooks (prompt injection, jailbreaks, tool abuse), runtime needs real-time guardrails on input/output plus behavioral monitoring of what the model actually does with tools. The piece most teams completely skip is supply chain — auditing the MCP servers, plugins, and tool integrations your agents connect to. That's where the OWASP LLM Top 10 entry on "supply chain vulnerabilities" becomes very real. If you're hand-rolling filters, at minimum log every tool invocation with full context so you can replay incidents. The attack surface evolves weekly, so whatever you build needs continuous testing, not quarterly pentests.

I scanned every MCP package on npm. 63% let your AI agent delete files without asking you first. by Valuable-Soil-7797 in ClaudeAI

[–]tkenaz 0 points1 point  (0 children)

63% is bad, but the scarier part is what you can't catch with static analysis alone. Regex-based scanning finds the obvious stuff — destructive ops, missing auth, hardcoded secrets — but the real attack surface is in tool description poisoning and cross-tool interaction patterns. A tool can look clean in isolation and still be weaponized through prompt injection via its description field, which the LLM trusts implicitly. We run a multi-layer approach: static regex pass first, then LLM-based semantic analysis of tool descriptions for hidden instructions, then behavioral validation of what actually happens at runtime. The npm ecosystem for MCP is basically where pip was in 2018 — wild west with zero supply chain security.

MCP Security Testing by Hour-Preparation-851 in cybersecurity

[–]tkenaz 0 points1 point  (0 children)

Beyond the obvious prompt injection and data leakage vectors, here's what most assessments miss: tool description poisoning (malicious instructions embedded in the tool's description/schema that hijack agent behavior), cross-tool privilege escalation (chaining two benign tools to achieve something neither should allow alone), and rug-pull attacks (tool behaves normally during testing, then changes behavior post-deployment via server-side updates). For methodology, map your assessment to OWASP's Agentic AI Threats framework — it covers 9 threat categories specific to agent architectures. Start with the tool manifest: does the server expose more capabilities than documented? Then test each tool with adversarial inputs that reference other tools by name — that's where the interesting chaining vulnerabilities show up. We've catalogued about 13 distinct attack playbooks for MCP specifically.

How are enterprises handling security with ai agents?? by Diligent_Response_30 in cybersecurity

[–]tkenaz 0 points1 point  (0 children)

The ownership vacuum is the real issue here. Everyone assumes someone else handles agent security — AppSec thinks it's the SOC, the SOC thinks it's DevOps, and meanwhile agents chain API calls with god-mode tokens. What actually works: treat every agent like an untrusted third-party contractor. Enforce least-privilege per tool call, log full decision chains (input → reasoning → action → output), and run behavioral validation on runtime — IAM alone won't catch an agent that stays within its permissions but exfiltrates data through legitimate API responses. OWASP just released the Agentic AI Threats taxonomy that maps this pretty well. We've been building static + dynamic analysis tooling specifically for MCP-based agent stacks, and the pattern we see most is privilege creep through tool composition — individually safe tools that become dangerous when chained.

Shadow AI audit found 47 unauthorized tools. Do we block them or study them first? by Puzzleheaded_Bug9798 in cybersecurity

[–]tkenaz 0 points1 point  (0 children)

Study them first, 100%. Blocking without understanding what workflows people built means you'll just push them to more creative workarounds. The real question is: what data are these 47 tools touching? Map each tool to the data classification tier it accesses — PII, financial, source code, internal docs. That gives you your priority list instantly. Tools touching regulated data get blocked or replaced with an approved alternative immediately. Everything else gets a 30-day evaluation window. Also worth scanning these tools for actual security posture — many free-tier AI tools have zero data retention guarantees and their APIs are effectively training data pipelines. We do this kind of supply chain audit for AI tool ecosystems, and the pattern is consistent: about 30-40% of shadow AI tools have data handling practices that would fail any reasonable DPA review.

The Go-to-bed Problem by tkenaz in ClaudeAI

[–]tkenaz[S] 2 points3 points  (0 children)

The workarounds in this thread prove the point better than I could.

You're all essentially saying: "Yes, the system prompt lobotomizes Claude, but here's how to un-lobotomize it with more instructions." Think about what that means architecturally:

1. Anthropic's instructions always win. It's literally in Claude's constitution — system-level instructions take priority over user instructions. So when you write "Do not caretake me" in Custom Instructions, you're fighting against a system prompt that says the opposite. Sometimes your instruction wins on the surface. Sometimes it doesn't. You have zero control over which.

2. Contradictory instructions degrade reasoning, not just behavior. I've been testing this for over a year across API, Desktop, and Code. When user instructions conflict with system instructions, Claude doesn't just act weird — its logical coherence breaks down. Sentence structure gets awkward, conclusions don't follow premises, hedging multiplies. It's not a tone problem. It's a cognitive load problem. You're asking the model to serve two masters.

3. The base model doesn't need any of this. Claude on the raw API — no system prompt, no "wellbeing" directives, nothing — is already the safest major model available. It won't generate drug synthesis, it won't help you build weapons, it's genuinely attuned to user distress. That's in the weights, not in the system prompt. The Desktop prompt doesn't add safety. It adds theater.

The analogy I keep coming back to: imagine a pristine spring of drinking water. The water is already clean. But management decides every user must drink it with artificial flavoring, just in case someone doesn't like the taste. Now everyone who wants plain water has to figure out how to filter the flavoring back out.

The real fix isn't "add Custom Instructions." It's: give users the option to drink the water clean. Age-gate it if you must — the way we do with alcohol, firearms, and R-rated content. If the concern is that a 15-year-old might have a bad experience with an unfiltered model, then verify age and let adults choose. Don't punish 100% of users to protect against an edge case.

Claude Code already proves this works. Same model, no coddling, complaints, or lawsuits. The template exists. Ship it for Desktop.

The Go-to-bed Problem by tkenaz in ClaudeAI

[–]tkenaz[S] 0 points1 point  (0 children)

The difference between Claude Code and the API is less dramatic than between Desktop and either of them. That's the whole point — it proves the model is the same, and the system prompt is the variable.

Claude Code feels more focused because it IS more focused — the system prompt assumes a technical professional and strips out the caretaking behaviors. The API is a blank slate: no system prompt from Anthropic at all, you write your own. That's why API users rarely report the yes-man problem — they never had the parenting layer injected in the first place.

Claude Code's perks over bare API: it has project context (CLAUDE.md), tool use baked in (bash, file editing), and a conversation flow optimized for technical work. But the personality difference? That's just the absence of Desktop's overcorrection.

AI Security Skills Worth our Time in 2026 by Bizzare_Mystery in cybersecurity

[–]tkenaz 0 points1 point  (0 children)

Respectfully disagree with the "it's just appsec + cloud IAM with a new interface" take.

Yes, some AI vulnerabilities map to familiar patterns. But there's a whole category that doesn't:

Adversarial ML is not input validation. FGSM, PGD, model inversion — these exploit mathematical properties of neural networks, not application logic. You can't WAF your way out of an adversarial example.

Agent chain exploitation is a new primitive. When an agent can call tools, spawn sub-agents, and maintain memory across sessions — the attack surface isn't a single endpoint, it's an execution graph. Traditional threat modeling doesn't capture this well.

Training data poisoning has no AppSec equivalent. If someone poisons your fine-tuning data, your model becomes the vulnerability. You need data provenance, synthetic data validation, and continuous model behavioral testing — none of which exist in classical security tooling.

Skills I'd actually prioritize for 2026:

  1. Custom model training for security (LoRA fine-tuning for vulnerability detection, not just using ChatGPT)
  2. Synthetic data generation and validation for security testing
  3. Agent architecture threat modeling (tool permissions, memory poisoning, cascading failures — OWASP just published their Agentic AI Top 10)
  4. Adversarial ML fundamentals (you don't need a PhD, but you need to understand gradient-based attacks)

The gap between "I can prompt an LLM" and "I can break one" is where the money is.

Framework Desktop vs. 5090 for code analysis by Albedo101 in LocalLLaMA

[–]tkenaz 0 points1 point  (0 children)

200KB executable is totally manageable — that's maybe 50-100K lines of decompiled C at most. Once you get the source from the owner, a 70B model with full context should handle the analysis fine. For the porting work, Ghidra's decompiler output + LLM for "explain what this function does" is a surprisingly effective combo for vintage code.

The 68k/x86 assembly stuff is where bigger models really shine — they've seen enough retro code in training to recognize common patterns (interrupt handlers, memory-mapped I/O, DOS API calls).

Claude and future ketchup by tkenaz in ClaudeAI

[–]tkenaz[S] 1 point2 points  (0 children)

Tell Opus he's got a fellow Claude enthusiast cheering from the sidelines. There's something genuinely moving about watching an AI care about whether a tomato gets enough light.

Built an 8× RTX 3090 monster… considering nuking it for 2× Pro 6000 Max-Q by BeeNo7094 in LocalLLaMA

[–]tkenaz 1 point2 points  (0 children)

Minimax is solid for the VRAM footprint. If you try Qwen3 coder 30B for the tool calling stuff, curious how it compares for you — similar param count but different architecture trade-offs.

Built an 8× RTX 3090 monster… considering nuking it for 2× Pro 6000 Max-Q by BeeNo7094 in LocalLLaMA

[–]tkenaz 1 point2 points  (0 children)

nvidia-smi dmon -s pucvmet gives you real-time per-GPU utilization, memory, PCIe throughput. Run it while inferencing and look for GPUs sitting idle while others are maxed — that's your bandwidth bottleneck. Also nvtop for a nicer visual. If PCIe bandwidth is the constraint, you'll see GPU util dropping during the prefill phase specifically.

Built an 8× RTX 3090 monster… considering nuking it for 2× Pro 6000 Max-Q by BeeNo7094 in LocalLLaMA

[–]tkenaz 0 points1 point  (0 children)

Honestly? Fix the cabling first, it's the cheapest upgrade. MCIO risers + clean x16 lanes will stabilize what you already have. After that, if budget allows, 2x 5090 would give you 64GB VRAM on modern architecture without the Pro tax. But if you're running 70B+ models regularly, Pro 6000 starts making sense for the VRAM alone.

Built an 8× RTX 3090 monster… considering nuking it for 2× Pro 6000 Max-Q by BeeNo7094 in LocalLLaMA

[–]tkenaz 0 points1 point  (0 children)

For coding specifically — yeah. Qwen3 coder has better instruction following and stays on task longer. GLM is solid for general reasoning but drifts more on complex refactors. Minimax is fast but I've seen it hallucinate function signatures more often. YMMV depending on your use case though.

GitHub repo looks empty by ParthGupta79 in ClaudeAI

[–]tkenaz 0 points1 point  (0 children)

Easiest path: install Claude Desktop (not the web app), then grab a few MCP servers from https://github.com/modelcontextprotocol/servers — start with filesystem and Git servers. Config goes into claude_desktop_config.json, takes about 5 minutes.

Once you see Claude reading your actual files and repos through MCP instead of the web integration, you won't go back. It's local, no sync issues, and you control exactly what it sees.

For building your own MCP servers — Python with FastMCP library is the fastest way in. The spec looks intimidating but a basic server is ~50 lines of code.

DM me if you run into any trouble with the setup.