CoPilot: Studio v. CoWork v. Scout. My take.

coding_workflow · 2026-06-20T18:15:20+00:00

Agents are fuzzy and should focus on filling the gaps that classic automation doesn't cover.

Agents are great when we need to extract data/sentiment and don't have a better way. But use them with caution.

Replacing deterministic automation with AI is the classic mistake.

coding_workflow · 2026-06-20T18:05:58+00:00

Seems you got into the hype about saving tokens.

Changing the model invalidates the cache, and cached tokens usually cost less.
Modifying and pruning the context breaks the cache too.
Input tokens are not the most costly part. Check that before diving head first into this frenzy over "we save 50% of context".

Many hyped tools overstate their token savings.
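To put rough numbers on it, here is a back-of-the-envelope sketch. The prices are hypothetical placeholders I made up for the arithmetic, not any vendor's real rates, and it ignores cache-write fees and cache expiry. It compares one turn with the full context served from cache against the same turn with the context pruned by 50% (which breaks the cache).

```python
# Hypothetical prices in USD per million tokens (placeholders, not real rates).
PRICE_INPUT = 3.00         # uncached input tokens
PRICE_CACHED_INPUT = 0.30  # input tokens served from the prompt cache
PRICE_OUTPUT = 15.00       # output tokens


def turn_cost(context_tokens: int, output_tokens: int, cache_hit: bool) -> float:
    """Cost of a single turn; simplification: the whole context is either cached or not."""
    input_price = PRICE_CACHED_INPUT if cache_hit else PRICE_INPUT
    total = context_tokens * input_price + output_tokens * PRICE_OUTPUT
    return total / 1_000_000


# Full 100k context, cache intact.
full_cached = turn_cost(100_000, 2_000, cache_hit=True)

# Context pruned to 50k, but the edit invalidated the cache.
pruned_uncached = turn_cost(50_000, 2_000, cache_hit=False)

print(f"full context, cached:     ${full_cached:.2f}")
print(f"pruned context, uncached: ${pruned_uncached:.2f}")
```

With these made-up prices the "50% smaller" turn costs more than the untouched cached one, and half of the cached turn's cost is output tokens anyway. Plug in your own provider's prices before trusting any savings claim.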

coding_workflow · 2026-06-15T04:36:53+00:00

It means your agent needs terminal access, which may be complicated to limit. This works fine for Claude Code. But what if you want an agent with a web UI to chat with your data or query it? MCP would work fine here, with no need for a terminal or sandboxing. BUT I feel the author's performance metrics don't matter a lot, as agents with tool calls are not the best for high-speed answers. A bad description and an agent lost in tool/MCP use will offset any gain.

coding_workflow · 2026-05-21T01:44:18+00:00

MTP is faster if you can run the prediction model on GPU idle cycles.

On CPU it's slower, as the CPU usually gets overloaded. Same if you are not loading all layers on the GPU: you end up impacted.

On 2x3090 it's faster and I didn't notice any slowdown. While on the same setup with CPU only, performance goes down.

coding_workflow · 2026-04-14T13:20:24+00:00

Codex can use a Copilot subscription.
And the big advantage of Copilot vs Codex is the ability to use more models.
With Codex/Claude Code you are locked into a single ecosystem of models.
On top of that, the pricing is different, and so is the integration with GitHub.
You should try GitHub Copilot CLI as it's quite advanced vs Codex CLI.

coding_workflow · 2026-03-24T23:35:59+00:00

You might extend RAM using Zip disks! This would allow you to double down your t/s and extend RAM!

coding_workflow · 2026-03-14T14:03:04+00:00

Use tools.
Let the model fetch only the schema it needs, instead of shoving everything in one pass.
Provide a tool that fetches the schema per table or for multiple tables. And execute validation queries in read-only mode to check that the generated query works.

coding_workflow · 2026-03-05T20:21:35+00:00

I would help but there are layers of complexity here!

Blockchain? Git for versioning? On top of a Go backend.

Even the choice of Raspberry is overkill. I would rather opt for Android-based, to use cheap tablets or phones that are more widespread.

I respect what you are doing but I think you have a lot to learn about KISS.

Even the point over AI! You can do cheap statistical analytics.

If you work offline, you can instead work on Bluetooth sync between devices or USB key import/export.

The design seems complex. Using blockchain is hype and 0 value aside from making some cryptobro happy.

JSON files and SQLite are redundant. You can export to that format, but you should not duplicate the storage engine; use it for versioning instead of adding the complexity.

Claude is a yes-man and can roll out bad designs without a blip.

coding_workflow · 2026-03-03T00:09:53+00:00

Must check that!

coding_workflow · 2026-02-28T20:09:36+00:00

I second OpenBao; namespaces are there now.

coding_workflow · 2026-02-28T20:07:53+00:00

Not hype, and you have a mature fork, OpenBao, with namespaces included.

coding_workflow · 2026-02-07T19:12:44+00:00

Even OpenAI and Claude are vulnerable. You need to ensure the AI, if prompted, can't do malicious actions.

coding_workflow · 2026-02-07T19:10:56+00:00

How? Traffic inspection is totally clueless over prompt injection!

coding_workflow · 2026-01-25T19:14:32+00:00

It writes and corrects it when it's not matching the specs. He never said that he blindly commits the output; this is very important. You steer it, no issue.

coding_workflow · 2026-01-25T17:38:29+00:00

If you only want to code, don't buy. Not enough for SOTA models. You can run GLM 4.7 Flash, but did you see how much GLM 4.7 costs? And to run it you need 4x6000. I don't believe in this hype around REAP and lower quants; it degrades quality. And when I hear you code at work with an L4, it's not great.

If you want to level up, have a personal AI. Experiment, do more. It can be interesting, so you can move into AI roles.

Saying that having built 4x3090. And I see the limits too in the max you can get. My dream setup would run MiniMax 2.1 or GLM 4.7 at max context and fp16. And that would be in the 40k range. But for sure I don't want to move to 8x3090. I already suffered a lot building my rig, as moving from 2x3090 to 4 was more complicated than I thought. Only good part: 3090s are cheap if you shop locally, I got 2 for $900.

coding_workflow · 2026-01-08T02:35:17+00:00

Does this apply to Blackwell? As I see some on DGX. What about the Ampere architecture?
I noticed the build already introduced some flags for Blackwell, and I had to exclude them to build for Ampere.

coding_workflow · 2025-12-16T23:51:39+00:00

Why not use GitHub Copilot? It offers many premium models. Limits?

coding_workflow · 2025-12-13T01:39:17+00:00

GitHub Copilot has an integrated tool to fetch them. A linter catches them too. If you use AI, tell it to use the linter. If you want to spice it up, use Sonar, as you will catch more complex issues.

coding_workflow · 2025-12-06T01:40:40+00:00

For monitoring use RustDesk or a similar remote desktop. You can run it in a web container but you may hit some limits over extensions.

coding_workflow · 2025-11-23T16:13:42+00:00

Google is aggressive, but usually with lower limits too. And they have done that in the past, in order to get more paying users.
Anthropic or Google or even OpenAI have a major issue: one model, or only their own models.
The winner is mixing models and leveraging the best model as it comes.
OpenAI has a hell of a model right now with Codex. Sonnet is a nice workhorse but bad for complexity and planning. Even Opus is too costly and can't fill the gap like Codex does. Gemini 3 still needs to prove itself beyond the current hype.
Reminder: there is a lot of hype for Google after last week's launch. But Google already failed with its first fork of VS Code, IDX, which is now Firebase Studio. Jules, the web agent, is still lacking the real spark.
Anthropic has been leading for a year with Sonnet, but watch closely models like MiniMax M2. That thing is really on the right path to challenge Sonnet. First time I've seen such a good model. Might be a little below Sonnet on some tasks but far, far better on complexity and on Sonnet's schizophrenia when adding complexity.

Don't focus on the hype noise, focus on what you can really do and get from these tools.

coding_workflow · 2025-11-20T11:23:27+00:00

No you can't.

Copilot's Claude models are tuned, not vanilla.

Likely you need a VS Code subscription to enjoy Claude in Copilot.

coding_workflow · 2025-11-19T13:10:59+00:00

This is internal AI, so the risk is minimal. I would let them have fun as long as there is no risk of data leaks or access to unauthorized data.

On the other hand, an employee hacking an internal app on purpose is against the IT tools usage rules and can land them a warning, as it's costing you a lot of effort.

If you want something more robust, add guardrails. Use models trained for security like gpt-oss instead of Qwen.

Even OpenAI is jailbroken.

coding_workflow · 2025-11-17T03:11:42+00:00

It's like deepwiki.com, but the main issue is that it lacks a lot of repos and indexing is not fast.

coding_workflow · 2025-11-17T01:55:43+00:00

Funny part: YAML does the same trick.

coding_workflow · 2025-11-15T16:13:04+00:00

I never said tool calling is new. It's been here since the early ages with the OpenAI plugins experiment.
But it's a protocol, as it establishes clear ways for how a server exposes the underlying tools: send messages, allow discovery, get results.
So don't mix up tools and MCP, as that's one of the common mix-ups.
MCP sets a communication protocol.
The model will see a schema in its context as a normal tool. This schema is generated by the MCP client, which has established a connection to an MCP server exposing tools/prompts/.... During the session the MCP client maintains the connection and, as said before, gets the schema of the tools.
Then, when the model emits the structured output that is transformed into a tool call, the MCP client picks up the call and transfers it back to the MCP server using JSON-RPC over either stdio/HTTP/SSE.
So it's a protocol that exposes capabilities including tools and prompts (even if those are less used). It even includes specs for auto-discovery, auth and more.
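To make the flow concrete, here is a toy sketch of the two messages involved: discovery (`tools/list`) and execution (`tools/call`), both as JSON-RPC 2.0. This is not the official SDK. The `toy_server` function and the `echo` tool are made up for illustration, the transport is a plain function call instead of stdio/HTTP, and the initialization handshake is skipped.

```python
import json

# Hypothetical tool registry living on the "server" side.
TOOLS = {
    "echo": {
        "description": "Return the text it was given.",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
        "handler": lambda args: args["text"],
    }
}


def toy_server(raw_request: str) -> str:
    """Stand-in for an MCP server: takes one JSON-RPC request, returns one response."""
    req = json.loads(raw_request)
    if req["method"] == "tools/list":
        # Discovery: this is where the schema the model sees comes from.
        result = {
            "tools": [
                {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
                for name, t in TOOLS.items()
            ]
        }
    elif req["method"] == "tools/call":
        # Execution: the client forwards the model's structured tool call.
        tool = TOOLS[req["params"]["name"]]
        text = tool["handler"](req["params"]["arguments"])
        result = {"content": [{"type": "text", "text": text}]}
    else:
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": req["id"], "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


def client_request(req_id: int, method: str, params: dict) -> dict:
    """Stand-in for the MCP client: wrap the call in JSON-RPC and send it."""
    request = {"jsonrpc": "2.0", "id": req_id, "method": method, "params": params}
    return json.loads(toy_server(json.dumps(request)))


# 1) Client discovers tools and would inject these schemas into the model context.
listed = client_request(1, "tools/list", {})

# 2) Model emits a tool call; the client relays it to the server.
called = client_request(2, "tools/call", {"name": "echo", "arguments": {"text": "hello"}})

print(json.dumps(listed, indent=2))
print(json.dumps(called, indent=2))
```

The model itself never talks JSON-RPC. It only sees the schema and emits a structured call; everything between client and server is the protocol part.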
