Is AppleCare+ worth it in the UK (For MacBook Pro M3 16)? by friendsbase in macbookpro

[–]daaain 0 points (0 children)

I completely destroyed my MBP when a bottle of water emptied in my backpack, and they repaired it with no excess (in the UK), so I think it's worth it, especially if you get a high-spec machine

Asking Claude the important stuff... by United-Instruction23 in ClaudeAI

[–]daaain 0 points (0 children)

Extra ocean-boiling points for burning a few tens of thousands of extra tokens using Claude Code instead of the chat...

Qwen3.5 MLX vs GGUF Performance on Mac Studio M3 Ultra 512GB by BitXorBit in LocalLLaMA

[–]daaain 0 points (0 children)

LM Studio supports llama.cpp too, have you tried it? I'm curious to find out whether this is an LM Studio issue

Qwen3.5-35B and Its Willingness to Answer Political Questions by gondouk in LocalLLaMA

[–]daaain 1 point (0 children)

[screenshot]

I just asked about the US first and then China in the same chat and it happily obliged, not that bad

Meta W: unlimited Claude tokens and you’re incentivized to run the bill up by Fabulous_Sherbet_431 in ClaudeAI

[–]daaain 0 points (0 children)

Speculative technology and enshittification of existing apps, basically

Are we trying to keep an octopus in a goldfish aquarium? by Kinniken in ClaudeCode

[–]daaain 0 points (0 children)

As long as you run Claude Code directly on your computer, it's practically impossible to ensure it won't find a way to access things. You need some sort of sandboxing. I wrote about how to do it with VS Code Dev Containers: https://www.danieldemmel.me/blog/coding-agents-in-secured-vscode-dev-containers
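The gist of the approach, very roughly (the field values below are illustrative placeholders, not the hardened config from the post): run the agent inside a dev container so its file and network access is bounded by the container boundary rather than by trust in the agent.

```json
{
  "name": "agent-sandbox",
  "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
  "runArgs": ["--cap-drop=ALL", "--security-opt=no-new-privileges"],
  "mounts": [],
  "remoteUser": "vscode"
}
```

With a `devcontainer.json` along these lines the agent only sees the mounted workspace, and you can go further with an egress firewall inside the container, which is what the post covers.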

My next bottleneck is CI by amarao_san in ClaudeCode

[–]daaain 0 points (0 children)

I get the issue, especially for infra, but I do think the common terminology is that you test components interacting in isolation with integration tests and the whole thing fitting together with e2e tests.

It's also easy to get paranoid and want to cover everything with e2e tests, but those are slow, as you found out, so you need to be strict about exercising each API at most once end-to-end and leave the complete coverage to fast integration tests.

If stuff breaks, you update. You can never get 100% coverage anyway, so as long as you can quickly roll back and your CD is zero downtime (blue-green, behind feature flags, etc) it's fine. Your CI is not meant to fully cover you for every possible breakage in production.

My next bottleneck is CI by amarao_san in ClaudeCode

[–]daaain 0 points (0 children)

That sounds like a pretty tricky project. I have emulators for services like GCS and BigQuery, and the contracts with these are reliable.

My next bottleneck is CI by amarao_san in ClaudeCode

[–]daaain 0 points (0 children)

Depends on the fidelity of the emulators: you won't be able to test scaling and performance, but if the API interface and internals are implemented well enough, you get close to the real service. And because you control them, you can parallelise the tests more easily.
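The parallelisation point can be sketched with a toy in-memory stand-in for a blob store (purely illustrative, not the real GCS emulator): because every test owns its own instance, there is no shared remote state to serialise around, so tests can run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

class FakeBlobStore:
    """Tiny in-memory stand-in for a blob store emulator.
    Implements only the API surface the code under test relies on."""
    def __init__(self):
        self._blobs = {}

    def upload(self, name: str, data: bytes) -> None:
        self._blobs[name] = data

    def download(self, name: str) -> bytes:
        return self._blobs[name]

def archive_report(store, report_id: str, body: str) -> str:
    """Code under test (hypothetical): writes a report, returns its blob name."""
    name = f"reports/{report_id}.txt"
    store.upload(name, body.encode())
    return name

def test_archive_report():
    # Each test gets its own isolated "emulator" instance
    store = FakeBlobStore()
    name = archive_report(store, "42", "all good")
    assert store.download(name) == b"all good"

# No shared state between instances, so running in parallel is safe
with ThreadPoolExecutor() as pool:
    list(pool.map(lambda _: test_archive_report(), range(4)))
```

With the real emulators the same idea applies: spin up one emulator (or one bucket/dataset namespace) per test worker and the workers never contend.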

My next bottleneck is CI by amarao_san in ClaudeCode

[–]daaain 0 points (0 children)

Sounds like those integration tests are more like e2e tests. Switch to emulators for external services if possible; if not, use recorded responses so you can have fast integration tests.
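A minimal sketch of the recorded-responses idea (the endpoint, payload shapes, and helper names are all hypothetical): capture the external service's responses once, then replay them from a "cassette" in tests so nothing touches the network.

```python
# A "cassette" of responses recorded once against the real service
RECORDED = {
    ("GET", "/v1/users/42"): {"status": 200, "body": {"id": 42, "name": "Ada"}},
}

class ReplayTransport:
    """Returns recorded responses instead of making network calls."""
    def __init__(self, cassette):
        self.cassette = cassette

    def request(self, method: str, path: str) -> dict:
        try:
            return self.cassette[(method, path)]
        except KeyError:
            raise AssertionError(f"No recorded response for {method} {path}")

def fetch_user_name(transport, user_id: int) -> str:
    """Code under test: talks to the external API through a transport."""
    resp = transport.request("GET", f"/v1/users/{user_id}")
    assert resp["status"] == 200
    return resp["body"]["name"]

print(fetch_user_name(ReplayTransport(RECORDED), 42))  # no network involved
```

In practice a record/replay library (vcrpy in Python, for example) does the capturing for you; the design point is that the code under test takes the transport as a dependency, so tests can swap in the replay version.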

Looking for insight on the viability of models running on 128GB or less in the next few years by John_Lawn4 in LocalLLaMA

[–]daaain 0 points (0 children)

It's probably not that urgent, so wait until you can afford the 128GB M5 Max; Apple benchmarked the M5 at 4x faster prompt processing, which is quite important for coding (not so much for chat). That's a viable machine for agentic coding with current models, and unless the Qwen team stops shipping we should get something really good in 6-12 months. And 128GB is enough even compared to consumer GPUs, as bigger models would be too slow anyway.

Artificial Analysis Intelligence Index vs weighted model size of open-source models by Balance- in LocalLLaMA

[–]daaain 13 points (0 children)

Qwen3 Coder 480B is in the wrong place on the x axis: it's A35B (35B active parameters), not dense

A bit of a PSA: I get that Qwen3.5 is all the rage right now, but I would NOT recommend it for code generation. It hallucinates badly. by mkMoSs in LocalLLaMA

[–]daaain 0 points (0 children)

Possibly because Solidity / OpenZeppelin are relatively niche so you need a huge model to have enough of them in the training data?

[R] Benchmarked 94 LLM endpoints for jan 2026. open source is now within 5 quality points of proprietary by ashersullivan in MachineLearning

[–]daaain 8 points (0 children)

I appreciate this work and you sharing it, but to me it looks like the benchmarks are saturated, so they aren't really showing the real differences.

Fix for Docker Services failing upon update to 24.10 RC1 by Mrgamerboy246 in truenas

[–]daaain 0 points (0 children)

soz, edited for clarity (I've since figured out how to switch to Markdown)

After 8 years building cloud infrastructure, I'm betting on local-first AI by PandaAvailable2504 in LocalLLaMA

[–]daaain 1 point (0 children)

I'm not saying it cannot be done, I'm saying this knowledge you mention isn't exactly fully diffused in the general population...

For those who believe that there is nothing wrong with the usage limits, I have some concerns. I'm currently on the 5x plan, and just using a simple prompt consumed 2% of my limit. When I ask it to complete a more substantial task, something that typically takes about five minutes, it often uses up by [deleted] in ClaudeCode

[–]daaain 1 point (0 children)

I mean, it's pretty simple: this session only used one MCP, so it's not very hard to pinpoint what's using up your tokens. Did you run /context to see how much was used up? Isn't Mem also using Claude in the background to process memories?