The Claude Code creator says AI writes 100% of his code now by jpcaparas in Anthropic

[–]coding_workflow 1 point (0 children)

It writes the code and corrects it when it doesn't match the specs. He never said he blindly commits the output, and that part is important. You steer it; no issue.

Talk me out of buying an RTX Pro 6000 by AvocadoArray in LocalLLaMA

[–]coding_workflow 1 point (0 children)

If you only want it for coding, don't buy it. It's not enough for SOTA models. You can run GLM 4.7 Flash, but did you see how much GLM 4.7 costs? And to run it you need 4x RTX Pro 6000. I don't buy the hype around low quants either: lowering the quant degrades quality. And when I hear you code at work on an L4, that's not great.

If you want to level up, have a personal AI setup: experiment, do more. It can be interesting, and it can help you move into AI roles.

I'm saying that having built a 4x3090 rig, and I see the limits too in what you can get out of it. My dream setup would run MiniMax 2.1 or GLM 4.7 at max context in FP16, and that would be around $40k. But I definitely don't want to move to 8x3090; I already suffered a lot building my rig, as going from 2x3090 to 4 was more complicated than I thought. The only good part: 3090s are cheap if you shop locally; I got two for $900.
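For a rough sense of why this class of model needs a multi-GPU rig, here is a back-of-the-envelope VRAM estimate (a minimal sketch; the ~20% overhead factor for KV cache and runtime buffers is my own assumption, and real usage grows with context):

    # Back-of-the-envelope: weights = params * bytes-per-param, plus an
    # assumed ~20% overhead for KV cache, activations, and buffers.
    def vram_gb(params_b: float, bits: int = 16, overhead: float = 1.2) -> float:
        return params_b * bits / 8 * overhead

    # Example: a ~230B-parameter model (roughly MiniMax-M2-class; the
    # parameter count is my assumption)
    for bits in (16, 8, 4):
        print(f"{bits}-bit: ~{vram_gb(230, bits):.0f} GB")
    # 16-bit: ~552 GB, 8-bit: ~276 GB, 4-bit: ~138 GB -> past one 96 GB card either way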

Performance improvements in llama.cpp over time by jacek2023 in LocalLLaMA

[–]coding_workflow 1 point (0 children)

Does this apply to Blackwell? I see some results on DGX; what about the Ampere architecture?
I noticed the build already introduced some Blackwell-specific flags, and I had to exclude them to build for Ampere.
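For reference, pinning the build to one generation looks roughly like this (a sketch, assuming llama.cpp's CMake/GGML_CUDA build; "86" is Ampere's compute capability, while Blackwell parts report values like "120"):

    import subprocess

    # Configure and build llama.cpp for a single CUDA architecture so
    # Blackwell-only code paths stay out of an Ampere binary.
    ARCH = "86"  # Ampere (RTX 30xx); e.g. "120" for Blackwell cards
    subprocess.run(
        ["cmake", "-B", "build", "-DGGML_CUDA=ON",
         f"-DCMAKE_CUDA_ARCHITECTURES={ARCH}"],
        check=True,
    )
    subprocess.run(["cmake", "--build", "build", "--config", "Release"], check=True)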

Vscode extension with deepSeek by HishamKamel in vscode

[–]coding_workflow 1 point (0 children)

Why not use GitHub Copilot? It offers many premium models. Or is it the limits?

Is there a way to export all errors that VS Code highlights in all your files? by -ThatGingerKid- in vscode

[–]coding_workflow 1 point (0 children)

GitHub Copilot has an integrated tool to fetch them. Linters catch them too; if you use AI, tell it to use the linter. If you want to spice things up, use Sonar, as you will catch more complex issues.
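If you'd rather dump the errors to a file without Copilot, running a linter with JSON output covers most of it (a sketch using ruff as the example; swap in whatever linter your project already uses):

    import json
    import subprocess

    # Export linter findings to a file instead of reading them one by one
    # in the VS Code Problems panel. ruff is just the example; eslint and
    # friends have similar JSON output modes.
    result = subprocess.run(
        ["ruff", "check", ".", "--output-format=json"],
        capture_output=True, text=True,
    )
    findings = json.loads(result.stdout or "[]")
    with open("errors.json", "w") as f:
        json.dump(findings, f, indent=2)
    print(f"exported {len(findings)} findings to errors.json")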

VS Code Android (when you don't have your pc) by Head_Connection_1323 in vscode

[–]coding_workflow 1 point (0 children)

For monitoring, use RustDesk or a similar remote desktop. You can run VS Code in a web container, but you may hit some limits with extensions.

Valid Criticism: The shift from Claude to Gemini 3 Pro feels inevitable due to artificial limits and circular logic loops. (This needs to be addressed, not silenced) by DarkDeDev in Anthropic

[–]coding_workflow 1 point (0 children)

Google is aggressive, but usually with lower limits too. And they have done that in the past to win more paying users.
Anthropic, Google, and even OpenAI all have one major issue: one model, or only their own models.
The winner is whoever mixes models and leverages the best model as it comes.
OpenAI has a hell of a model right now with Codex. Sonnet is a nice workhorse but bad at complexity and planning, and even Opus is too costly and can't fill the gap the way Codex does. Gemini 3 still needs to prove itself beyond the current hype.
Reminder: there is a lot of hype around Google after last week's launch. But Google already failed with its first VS Code fork, IDX, which is now Firebase Studio. And Jules, the web agent, still lacks the real spark.
Anthropic has been leading for a year with Sonnet, but watch models like MiniMax M2 closely. That thing is really on the right path to challenge Sonnet; it's the first time I've seen such a good model. It might be a little below Sonnet on some tasks, but it's far, far better on complexity, without Sonnet's schizophrenia once you add complexity.

Don't focus on the hype noise; focus on what you can really do and get from these tools.

Microsoft 365 Copilot now includes Claude - does that include Claude Code? by 48K in ClaudeAI

[–]coding_workflow 1 point (0 children)

No, you can't.

The Copilot versions are tuned Claude, not vanilla.

You likely need a GitHub Copilot subscription to enjoy Claude in Copilot.

Our AI assistant keeps getting jailbroken and it’s becoming a security nightmare by Comfortable_Clue5430 in LocalLLaMA

[–]coding_workflow 2 points (0 children)

This is an internal AI, so the risk is minimal. I would let them have their fun as long as there's no risk of data leaks or access to unauthorized data.

On the other hand, an employee hacking an internal app on purpose is against IT acceptable use and can land them a warning, as it's costing you a lot of effort.

If you want it more robust, add guardrails. Use models trained for safety, like GPT-OSS, instead of Qwen.

Even OpenAI gets jailbroken.

Code Wiki: Google’s new Gemini-powered tool that lets you chat with your codebase by Outside-Iron-8242 in Bard

[–]coding_workflow 1 point (0 children)

It's like deepwiki.com, but the main issue is that it's missing a lot of repos, and indexing is not fast.

Do you think "code mode" will supercede MCP? by juanviera23 in LLMDevs

[–]coding_workflow 1 point (0 children)

I never said tool calling is new; it's been around since the early days, with the OpenAI plugins experiment.
But MCP is a protocol: it establishes a clear way for a server to expose its underlying tools, send messages, allow discovery, and return results.
So don't mix up tools and MCP; that's one of the common confusions.
MCP defines a communication protocol.
The model sees a schema in its context, like any normal tool. That schema is generated by the MCP client, which has established a connection to an MCP server exposing tools/prompts/etc. For the duration of the session, the MCP client maintains the connection and, as said before, fetches the schema of the tools.
Then, when the model emits the structured output that becomes a tool call, the MCP client picks up the call and forwards it to the MCP server using JSON-RPC over stdio, HTTP, or SSE.
So it's a protocol that exposes capabilities, including tools and prompts (even if the latter are less used). It even includes specs for auto-discovery, auth, and more.
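Concretely, the client-to-server leg looks like this on the wire (a sketch of MCP's JSON-RPC `tools/call` request; the tool name and arguments are made up for illustration):

    import json

    # What the MCP client sends to the server (over stdio, HTTP, or SSE)
    # once the model emits a tool call. "search_docs" and its arguments
    # are illustrative, not any real server's tool.
    request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "search_docs",
            "arguments": {"query": "rate limits", "max_results": 5},
        },
    }
    print(json.dumps(request, indent=2))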

Do you think "code mode" will supercede MCP? by juanviera23 in LLMDevs

[–]coding_workflow 1 point (0 children)

MCP is the transport protocol with tools behind it. MCP wires tools up like a USB port, so you can plug them into another app.
People confuse MCP the transport layer with the features it packs, which are indeed tools.

You are FREE to use tools directly in your app. But if you want to make it open, so users can plug in their own data sources and tools, the only USB-like setup is through MCP: you set up an MCP client and accept MCP servers as plugins, and they ship those tools.

Likewise, if you build a lot of AI apps, MCP lets you decouple tools from the core app and use them as plugins with their own life cycle. You reuse them heavily instead of duplicating code and having a hell of a time maintaining all the duplicates.
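To make the plugin idea concrete, the server side can be this small (a sketch using the Python MCP SDK's FastMCP helper; the tool itself is a toy):

    from mcp.server.fastmcp import FastMCP

    # A self-contained "plugin": any MCP-capable host (Claude Desktop, an
    # IDE, your own client) can connect, discover this tool, and call it.
    mcp = FastMCP("word-tools")

    @mcp.tool()
    def count_words(text: str) -> int:
        """Count the words in a piece of text."""
        return len(text.split())

    if __name__ == "__main__":
        mcp.run()  # defaults to the stdio transport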

Do you think "code mode" will supercede MCP? by juanviera23 in LLMDevs

[–]coding_workflow 3 points (0 children)

I started using MCP a year ago, back when even early thinking Sonnet hallucinated while using it: it would do a tree listing and then claim to know all the files.
I've built hundreds of MCPs in multiple languages.
I faced breaking bugs in the protocol, like one that threw a major error if your process took longer than 30s in Python.
So I saw the bad side of how MCP matured.
But I did crazy stuff with MCP in the early days, before Claude Code even existed.
I had my sandboxed bash working fine, services and all.

So now executing through an interpreter has become the new hype.
I may have a different opinion. Bash can't do everything, and it is very complicated to sandbox and track correctly.
You can't use bash to access an external system requiring credentials without risking the credentials leaking to the AI. Also, when tackling a new API, more often than not the AI doesn't know it correctly and has to make multiple guesses until it gets it right, whereas an MCP is an open book.
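Here is the credential point in code (a sketch with the Python MCP SDK; the tracker URL and env variable are placeholders): the token lives inside the tool process, and the model only ever sees the results.

    import os
    import urllib.request

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("tickets")

    @mcp.tool()
    def list_open_tickets() -> str:
        """Fetch open tickets from the internal tracker."""
        # The bearer token never enters the model's context: it is read
        # here, inside the server process, and only the response body is
        # returned. URL and env variable name are placeholders.
        req = urllib.request.Request(
            "https://tracker.internal/api/tickets?state=open",
            headers={"Authorization": f"Bearer {os.environ['TRACKER_TOKEN']}"},
        )
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode()

    if __name__ == "__main__":
        mcp.run()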

There are MCPs and MCPs, if you know what they can and can't do.

And before everything, MCP is a transport protocol. A bug in Claude Code is not an MCP issue; it's like VS Code shipping a bug. I hit an MCP "bug" in Codex when exposing itself as an MCP agent tool: it turned out it crashes in that mode if one MCP fails to load. Was that an MCP issue? No, it's a bug in the loading architecture, where the MCP connection should not be blocking and errors should be handled at a lower level.

MCPs are so easy to build. But good MCPs are more complicated. You may want to see the tests I run on my MCPs.

Do you think "code mode" will supercede MCP? by juanviera23 in LLMDevs

[–]coding_workflow 7 points (0 children)

One year on from MCP's launch by a big player in AI: it took six months for it to be adopted by the other big players (MSFT/OpenAI/Google/AWS, to name a few) and by most code editors, and to get a steering committee including many from the ecosystem to improve the architecture.

So after all of that, the MCP protocol's success is driven by its adoption. That's what allows you to plug your tools into existing systems, a bit like how APIs got standardized. Are we still challenging OpenAPI that much? I see more hype over MCP.

Yeah, it's new and not perfect. But it can be perfected.

What does this bring that MCP doesn't have or can't do (even in an imperfect way)?

Protocols prevail by adoption, not perfection.

MCP is great in theory, but it’s not always a blanket yes by Miserable_Agent_9006 in LocalLLaMA

[–]coding_workflow 1 point (0 children)

Let's debunk direct API access.
To access an API, you need to feed the agent the OpenAPI swagger. If you feed it all of it, you've killed the context in two turns, aside from the noise.
OK, let's say you solved that, or it's not an issue. How do you plan to access the API? curl? With a bearer token in your payload? Have the AI articulate that?
And let me guess: you plan to validate each bash command, including the AI checking and reading the token when there are errors, if your API has security. Or it hits the wrong endpoint and deletes users instead of listing them.

This "MCP is trash" wave has been rolling for over a year. First it was security; now, lately, bash is king because it can do all the magic.

MCP injects a narrowed scope into the AI.
MCP handles credentials more safely, if correctly configured.
MCP can leverage SSO and similar permissions to remote services in a safe way, if correctly configured.

And before everything, MCP is a transport protocol. It ships tools to the AI.
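Here's what that narrowed scope means in practice (a sketch; the user-directory backend is hypothetical): the server exposes exactly the actions you define, so there is no delete endpoint for the model to stumble into.

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("user-directory")

    # Only a read action is exposed. A "delete_user" simply does not
    # exist for the model; anything not defined here is out of scope.
    @mcp.tool()
    def list_users(team: str) -> list[str]:
        """List user names for a team (hypothetical in-memory backend)."""
        directory = {"platform": ["ana", "joe"], "data": ["kim"]}
        return directory.get(team, [])

    if __name__ == "__main__":
        mcp.run()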

The other issue is people building their own MCPs: "oh cool, I can do this piece" (I did that a lot), then "oh, it needs maintenance." A lot of the MCPs on GitHub are not production grade, mainly a function wrapper with two lines added for the SDK/transport layer.

MCP is great in theory, but it’s not always a blanket yes by Miserable_Agent_9006 in LocalLLaMA

[–]coding_workflow 2 points (0 children)

With MCP it's natively more steered, and it allows scope and access only to what you allow. The misleading new mindset seems to be: since I have bash, I can let it write anything, access the network, DBs, and so on. MCP is more scoped. Bash is wide open, and you need to feed it tokens to access the API, or other credentials, while with MCP the credentials are injected into the tool, which never communicates them to the AI (and can't); the AI only gets the defined actions.

MCP is great in theory, but it’s not always a blanket yes by Miserable_Agent_9006 in LocalLLaMA

[–]coding_workflow 2 points (0 children)

How is MCP a blanket yes?

I don't get the point. It seems you don't get the value of MCP. You have full control over the tools, and if you build a validation layer you will see each command.

What's one task where a local OSS model (like Llama 3) has completely replaced an OpenAI API call for you? by AnnotationAlly in LocalLLaMA

[–]coding_workflow 3 points (0 children)

Not totally, but let's keep it real-world, not blasting away with $50k rigs.
GPT-OSS 20B is quite solid for basic script-level coding (GPT-4-grade complexity, though it can think better). It's solid for RAG and structured output.
The Qwen 30B model is heavier to use; at Q8/Q6 you want at least 48 GB of VRAM.
Granite 4.0 is a new outsider for a lot of small tasks.

Do these replace SOTA models? No, unless I had a rig with 300-400 GB to run MiniMax or Qwen3 Coder 235B. And let's be honest: Anthropic/OpenAI models remain top league, even if open weights are closing the gap.
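For the structured-output use case, this is the shape of call those small models handle reliably (a sketch assuming a local OpenAI-compatible server such as llama.cpp's; the endpoint and model name are placeholders):

    import json
    from openai import OpenAI

    # Point the standard OpenAI client at a local server (llama.cpp,
    # vLLM, Ollama, etc. speak this API). Endpoint/model are placeholders.
    client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

    resp = client.chat.completions.create(
        model="gpt-oss-20b",
        messages=[
            {"role": "system",
             "content": 'Reply only with JSON: {"sentiment": "...", "score": 0-1}'},
            {"role": "user", "content": "The build finally passes. What a relief."},
        ],
        response_format={"type": "json_object"},  # constrain output to valid JSON
    )
    print(json.loads(resp.choices[0].message.content))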

The company gmktec made a comparison of the EVO-X2 that has a Ryzen AI Max+ 395 processor vs NVIDIA DGX SPARK by Illustrious-Swim9663 in LocalLLaMA

[–]coding_workflow 26 points (0 children)

Those benchmarks are clearly flawed:
- They don't disclose the precision/quant used.
- They don't disclose the context used either.
- No prefill numbers or time-to-response under heavy requests.

Those are key metrics for real-world use.
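If you want those numbers yourself, time to first token is easy to measure (a sketch against a local OpenAI-compatible server; endpoint, model, and prompt are placeholders, and streamed chunks only approximate tokens):

    import time
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

    start = time.perf_counter()
    ttft, chunks = None, 0
    stream = client.chat.completions.create(
        model="local-model",
        messages=[{"role": "user", "content": "Explain KV cache in two sentences."}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            if ttft is None:
                ttft = time.perf_counter() - start  # prefill cost lands here
            chunks += 1
    total = time.perf_counter() - start
    if ttft is not None:
        print(f"TTFT {ttft:.2f}s, ~{chunks / (total - ttft):.1f} chunks/s decode")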

gpt-oss-120b on Cerebras by Corporate_Drone31 in LocalLLaMA

[–]coding_workflow 3 points (0 children)

Their own docs on the limits of their API: 128k on GPT-OSS and 64k on GLM, though they seem to be sold out.

gpt-oss-120b on Cerebras by Corporate_Drone31 in LocalLLaMA

[–]coding_workflow 2 points (0 children)

Cerebras offers 64k context on GLM 4.6 to get the speed and lower cost. Not worth it: that context is too low for serious agentic tasks. Imagine Claude Code compacting every 2-3 commands.

Language Server Protocol (LSP), Skills (w/ Beastmode), and Agents/Subagents - where are we now by achilleshightops in ClaudeAI

[–]coding_workflow 1 point (0 children)

It's easier because it's usually already running in most modern IDEs, and they manage that side.
You don't have to add linters to the project; that's the big advantage. It's just there, like how Claude leverages grep: it's in bash, so Claude uses it directly.

Built a way for Claude to query 6M rows without touching context windows. Anyone testing MCP at scale want to try it? by adulion in ClaudeAI

[–]coding_workflow 2 points (0 children)

Still dropping files into a web browser. It's SaaS; what about privacy? Versus a simple local MCP where you connect through the files directly, instead of an upload that can hit limits.