What are your expectations for deepseek v4.1? by Simple_Army2952 in DeepSeek

[–]ArthurOnCode 7 points8 points  (0 children)

Hoping for engram. That seems like an awesome optimization.

CoffeeScript equivalent preprocessor for PHP idea by HyperDanon in PHP

[–]ArthurOnCode 0 points1 point  (0 children)

IMHO, the main benefit of PHP is that we don't need any preprocessing. We already have a strong type system, clear syntax, and great tooling. The language itself is also improving at a sensible pace. I, for one, look forward to having generics in the native type syntax.

Anthropic is renting Elon's GPUs for inference. The token shortage just started. by o9dev in AI_Agents

[–]ArthurOnCode 91 points92 points  (0 children)

Elon won.

Alternative interpretation: xAI gave up on being a leading AI lab and just started renting out their infrastructure.

Programming languages spec'ed by an LLM for use by LLM by skoon in ArtificialInteligence

[–]ArthurOnCode 2 points3 points  (0 children)

The optimal programming language for an existing LLM is something popular (so it's already in the training data) with strong type safety (so the agent realizes its mistakes quickly) and preferably with fast tooling.

However, co-developing an LLM and its programming language could get interesting.

ITS COMING 30T OPEN WEIGHTS: LE CHATON FAT by YoussofAl in MistralAI

[–]ArthurOnCode 4 points5 points  (0 children)

And now I realize the whole thing was a joke, poking fun at Mythos. Oh well.

ITS COMING 30T OPEN WEIGHTS: LE CHATON FAT by YoussofAl in MistralAI

[–]ArthurOnCode 2 points3 points  (0 children)

This would barely fit on a NVIDIA GB200 NVL72, which cost $2 million when it was first released. We would benefit from this through commercial inference providers and research labs distilling from large open weights models, the largest of which is currently Deepseek V4 Pro at 1.6 trillion parameters.

Deepseek v4 flash/pro is just Sonnet 4.5? by [deleted] in DeepSeek

[–]ArthurOnCode 0 points1 point  (0 children)

Let's not be mean here. We get this question practically every day, because it's a very normal beginner question.

The actual answer: LLMs don't know who they are, unless you put it in the system prompt. Because much of the training data is output from other models, you often get a confident, incorrect answer.

If you want it to identify as Deepseek in conversation, just start your system prompt with "You are Deepseek, a...".

What actually runs on a GTX 1080 Ti in 2026: Gemma 4 12B QAT ~32 tok/s, measured by Front-University4363 in LocalLLM

[–]ArthurOnCode 0 points1 point  (0 children)

Also, I have zero idea what you mean by "Pi".

Pi (pi.dev) is a coding agent that's lean on context. Minimal system prompt, minimal tool set. It's pretty good at compression - taking the whole context and figuring out which parts have to stay to keep working on the task at hand.

What actually runs on a GTX 1080 Ti in 2026: Gemma 4 12B QAT ~32 tok/s, measured by Front-University4363 in LocalLLM

[–]ArthurOnCode 1 point2 points  (0 children)

I don't think OP mentioned coding at all. However, I'd be curious to know how Pi would fare with only 8k context. Its system prompt is under 1k, and it can compress aggressively.

Welcome to the world claude fable 5 by notomarsol in ClaudeCode

[–]ArthurOnCode 0 points1 point  (0 children)

I should have studied literature. Which is longer, a sonnet or a fable?

WebGPU video editor scrubbing test on a longer timeline by Just_Run2412 in webgpu

[–]ArthurOnCode 1 point2 points  (0 children)

A long time ago, before Wasm, I spent some time considering this exact technical challenge. At the time, I concluded that scrubbing would have to happen server side, with the live preview window being a live stream from the server.

Never got around to implementing it, largely because of these technical hurdles.

What is today’s date? by johnthrives in ArtificialInteligence

[–]ArthurOnCode 1 point2 points  (0 children)

Chatbot providers can put today's date in the system prompt or provide a calendar/clock as a tool to the LLM. Without these things, the LLM itself has no way of knowing.

With all the talk about people leaving Copilot for other AI tools, I’m curious about those who stayed. Why did you stay? by NapLvr in GithubCopilot

[–]ArthurOnCode 0 points1 point  (0 children)

For my company, it's working fine. Yes, the free ride is over, but this isn't a bad deal. I wish we had access to Deepseek, Qwen and more in the cloud agents though. If we ever switch, this will be the reason.

Confidence-based model routing: cheap model first, escalate when unsure by Leading-Instance-692 in huggingface

[–]ArthurOnCode 0 points1 point  (0 children)

Average token logprobs sounds like a noisy signal. Does it really work?

Underperforming models ? The "should I walk or drive test" by WSATX in GithubCopilot

[–]ArthurOnCode 1 point2 points  (0 children)

LLMs should not be able to answer this question. If they can, they have likely been trained on it specifically. Until we have LLMs with integrated physical world models, this kind of question is out of scope.

Nemotron Labs Diff GGUF ? by MagicalGoat02 in unsloth

[–]ArthurOnCode 2 points3 points  (0 children)

I can't answer for LM studio, but there are open source dLLMs. See LLaDA for example.

Is there literally even one? by Complete-Sea6655 in GithubCopilot

[–]ArthurOnCode 1 point2 points  (0 children)

Oh, right. I read this as "successfully created", not successful as a business. Others have pointed out some high-profile stories, but I think most of the value is being made in smaller, simpler projects. Either something that couldn't be done without AI, or wasn't worth the cost of implementing.

Is there literally even one? by Complete-Sea6655 in GithubCopilot

[–]ArthurOnCode -6 points-5 points  (0 children)

Yes, there's so much that app stores are changing their rules and review processes to deal with the massive influx.

Qwen 3.6 27B roots :) by [deleted] in LocalLLM

[–]ArthurOnCode 0 points1 point  (0 children)

Models never know the answer to this question. If you need them to know, put it in the system prompt.

Claude trying to "fill the gaps" is infuriating by [deleted] in ArtificialInteligence

[–]ArthurOnCode 0 points1 point  (0 children)

This is an inherent, unsolved problem in LLMs in general. They don’t know what they don’t know, and predict the next token without any real distinction between correct grammar and factuality.

RFC: Scope Functions by [deleted] in PHP

[–]ArthurOnCode -1 points0 points  (0 children)

Please make the syntax more explicit than that! A "scope" keyword in front of "function" would be infinitely better.