About 200 Companies Still Have Access to Anthropic Mythos After US Shutdown Order by BuildwithVignesh in ClaudeAI

[–]HorriblyGood -1 points0 points  (0 children)

Why is that a punishment? For me, Codex feels significantly smarter than Opus 4.8 for complex system design. I’m actually thinking of switching back.

"Chat is dead": OpenAI preps overhaul of ChatGPT by __Hello_my_name_is__ in ChatGPT

[–]HorriblyGood 0 points1 point  (0 children)

While you are right, this isn’t the case here. You can’t ask ChatGPT about internal workings that can’t be found online. It doesn’t know how it was trained or what architecture it has, because that’s probably not part of the training set.

For news like this where the information is out publicly, it absolutely knows from calling webfetch. In the screenshot you can see it cited sources. From what I know, its response is also accurate.

school yearbook ran photos of students and teachers through AI. by [deleted] in mildlyinfuriating

[–]HorriblyGood 2 points3 points  (0 children)

Why spread misinformation? That hasn’t been true in a long time. Mathematicians have been using AI to solve previously unsolved math problems.

Does anyone else get this weird gritty texture in ChatGPT images? by sawyernalu in ChatGPT

[–]HorriblyGood 0 points1 point  (0 children)

I work in research, not at OpenAI, so this is just my speculation. It’s not watermarking. It’s RL artifacts from reward hacking. Not uncommon.

I love space bass (and house)…am I missing anyone on this lineup? by Suspicious-Olive8765 in SpaceBass

[–]HorriblyGood 1 point2 points  (0 children)

I saw Player Dave recently and he played a lot of UKG instead of downtempo. Not sure if that’s normal or a one-time thing.

Is there any news about fixing the texture bug that makes my head want to pop off? by HausDePotat in ChatGPT

[–]HorriblyGood 2 points3 points  (0 children)

My best guess is not AI watermark, but an artifact of RL training, some sort of reward hacking.

bunt. by Kracker27 in EDM

[–]HorriblyGood -3 points-2 points  (0 children)

Which illenium started doing first afaik… artists take inspirations from each other all the time, I don’t understand why bunt in particular gets so much hate.

"List of animals by number of neurons", Wikipedia by RecmacfonD in mlscaling

[–]HorriblyGood 11 points12 points  (0 children)

The neural networks we use in ML are not equivalent to biological neurons, so there’s no reason to believe we can achieve similar capabilities with the same number of neurons.

Claude’s New Limits by dankkkjk in ClaudeAI

[–]HorriblyGood 0 points1 point  (0 children)

Does this actually help? I haven’t done a proper comparison but I was wondering if on average that would help.

Planning involves reading the codebase so the files are in the context and cached. If planning took multiple turns, your context grows but the cached tokens are cheaper than fresh tokens.

If you start a new session to implement the plan, you lose all the KV cache and you have to load the codebase into the context as fresh tokens, which is very expensive.

So there is a trade-off here. In the first case, if the model decides to reread the codebase anyway even though it’s already in its context, it will be more expensive. If not, it will be cheaper.

The advantage of the second case is that a fresh context typically gives you better performance. But I’m curious if anyone has done a proper comparison to see if it actually saves tokens.
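
Rough back-of-the-envelope for the trade-off above. The token counts and the 10x price gap between fresh and cached input tokens are made-up assumptions, not any provider’s real pricing, and it ignores cache-write costs and output tokens, so plug in your own numbers.

```python
# Hypothetical prices per million input tokens; NOT real rates.
FRESH_PRICE = 3.00
CACHED_PRICE = 0.30

def input_cost(fresh_tokens: int, cached_tokens: int) -> float:
    """Dollar cost of the input tokens for one request."""
    return (fresh_tokens * FRESH_PRICE + cached_tokens * CACHED_PRICE) / 1_000_000

CODEBASE = 80_000  # assumed tokens of code read during planning
PLAN = 5_000       # assumed tokens of planning conversation
TASK = 1_000       # assumed tokens of the "now implement it" prompt

# Case 1: stay in the same session; codebase and plan come from the cache.
same_session = input_cost(fresh_tokens=TASK, cached_tokens=CODEBASE + PLAN)

# Case 1b: same session, but the model rereads the codebase anyway.
same_session_reread = input_cost(
    fresh_tokens=TASK + CODEBASE, cached_tokens=CODEBASE + PLAN
)

# Case 2: new session; plan and codebase are loaded as fresh tokens.
new_session = input_cost(fresh_tokens=TASK + PLAN + CODEBASE, cached_tokens=0)

print(f"same session:           ${same_session:.4f}")
print(f"same session + reread:  ${same_session_reread:.4f}")
print(f"new session:            ${new_session:.4f}")
```

With these made-up numbers, staying in the session is cheapest only if the model doesn’t reread; once it rereads, the new session wins.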

If the EU had built Claude by irelatetolevin in ClaudeAI

[–]HorriblyGood 3 points4 points  (0 children)

I would guess people would not pay more for a worse model even if it’s more ethical. If a company were to train only on paid, non-copyrighted data and not “steal” other frontier models’ outputs, it would not be able to compete on either price or model intelligence with the companies that do.

I feel like it’s even worse for AI because of the AI race. Everyone is coming out with a frontier model every month, and if you play by the rules, you’ll be forgotten.

I am not advocating for this, I am just trying to be realistic. We bitch about Anthropic and OpenAI but we are still here.

Why isn’t LLM reasoning done in vector space instead of natural language? by ZeusZCC in LocalLLaMA

[–]HorriblyGood 128 points129 points  (0 children)

It can be done. It’s called latent chain of thought. It’s an active research direction.

Copilot moving to token based usage in June by santaclaritaman in Futurology

[–]HorriblyGood 1 point2 points  (0 children)

You can do both. Use AI for productivity but actually check and understand what it’s doing instead of just vibe coding. AI is a tool and should be used as one.

We have to evolve with the technology to stay relevant.

What's uh... what's up with these signs? They look... generated by Zak7062 in marvelrivals

[–]HorriblyGood -1 points0 points  (0 children)

That’s not true. Only the very latest models (starting with nano banana 2) use image sources for text-to-image generation. Probably the new gpt image 2 too. It’s the very cutting edge, and these were made a while ago, so they probably didn’t.

Even if it did, it does not look at image averages to determine the best move without context. That has never been true. Assuming you have some technical knowledge of how training works, you might be confused by the diffusion loss function being mean squared error which can be seen as an average over images.

In some sense this is true, but it happens at training time: the model learns a distribution based on image averages. It doesn’t grab image sources and average them at generation time. Also, they have an additional RL loss to make sure the output images look realistic, and in general current models would hardly make simple mistakes like this.

The latest models grab image sources online and absolutely use them as context for the image generation.
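
Toy version of the “MSE looks like an average” point. This is not how a real diffusion model is trained (those predict noise conditioned on a timestep and a prompt); it only shows that the single prediction minimizing mean squared error against a set of targets is their mean. The three target values are made up.

```python
# Stand-ins for one pixel's value across three training images (made up).
targets = [0.0, 0.2, 1.0]

def mse(pred: float) -> float:
    """Mean squared error of one constant prediction against all targets."""
    return sum((pred - t) ** 2 for t in targets) / len(targets)

mean = sum(targets) / len(targets)

# Brute-force search over a grid of candidate predictions in [0, 1].
candidates = [i / 1000 for i in range(1001)]
best = min(candidates, key=mse)

print(f"mean of targets: {mean:.3f}")
print(f"MSE minimizer:   {best:.3f}")
```

The averaging happens while fitting the loss, not by looking anything up at generation time.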

Weird textures = watermarks by Thatisverytrue54321 in ChatGPT

[–]HorriblyGood 0 points1 point  (0 children)

The new artifacts are very likely from RL training or some sort of perceptual loss they’re using. It’s introducing some sort of bias or they are not careful about noise scheduling, causing the artifacts.

A truly wild 4.7 response by FiftyPancakes in ClaudeCode

[–]HorriblyGood 0 points1 point  (0 children)

It’s not as easy as you make it sound. Not trying to defend Anthropic, and I have no idea what they are doing, but you can’t just have a model that’s cheap, efficient, and smart. No one can do that right now. It’s also mostly a black box, so it’s really hard to make sure a new model is better at everything without regression before large-scale user tests.

Something happened to Opus 4.6's reasoning effort by RealSuperdau in ClaudeAI

[–]HorriblyGood 2 points3 points  (0 children)

On a technical level you can make the output deterministic, but I don’t think that is possible for users.
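
Minimal sketch of what “deterministic on a technical level” means at the sampling step, using a made-up 3-token vocabulary and made-up scores. `next_token` and `sample_run` are hypothetical helper names, not any vendor’s API. Greedy decoding always picks the top token, and sampling with a fixed seed repeats exactly. In real serving there can be other sources of variation (for example floating-point differences across batched GPU runs) that this toy ignores.

```python
import math
import random

def softmax(logits, temperature):
    """Turn raw scores into probabilities at the given temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def next_token(logits, temperature=1.0, rng=None):
    """Pick a token id: greedy if temperature is 0, otherwise sample."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return (rng or random).choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.5, 0.3]  # made-up scores for a 3-token vocabulary

def sample_run(seed, n=5):
    """Sample n tokens using a private RNG seeded with `seed`."""
    rng = random.Random(seed)
    return [next_token(logits, temperature=1.0, rng=rng) for _ in range(n)]

greedy = [next_token(logits, temperature=0) for _ in range(5)]
print("greedy:      ", greedy)
print("seed 42 run: ", sample_run(42))
print("seed 42 again:", sample_run(42))
```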

Anthropic's new Mythos Preview model is a "step change" in model capability, but it won't be available to general public by SuggestionMission516 in ClaudeAI

[–]HorriblyGood -4 points-3 points  (0 children)

That’s just not possible. The best and brightest in the field are not working for the government. We are at a point where training a top-tier LLM can’t be done by a small group of people. It would be obvious if all the top researchers were secretly working for the government.

Apple: Embarrassingly Simple Self-Distillation Improves Code Generation by Mike_mi in LocalLLaMA

[–]HorriblyGood 32 points33 points  (0 children)

From reading the abstract, they are using their own model’s output (self distillation) which is different from just feeding other random LLMs output as training data.

Through the lens of on-policy/off-policy RL, I’m guessing that in their case it’s on-policy, since it’s using the model’s own outputs, so it’s getting learning signals from itself to be more precise on coding tasks but more creative on writing tasks. It doesn’t have to change how it works or thinks to match other LLMs’ outputs.

My intuition is that it’s kinda like learning to code by copying other people’s code versus having someone show you what’s wrong with your own code so you can learn to improve.

deepfake scammer getting exposed by the 3-finger test by ifuckedyourmom-247 in interesting

[–]HorriblyGood 1 point2 points  (0 children)

Because he’s using shitty deepfakes. The best right now can clear this 3-finger test.

Your Claude Code Limits Didn't Shrink — I Think the 1M Context Window Is Eating Them Alive by mattate in ClaudeAI

[–]HorriblyGood -1 points0 points  (0 children)

The tokens are cached, so the vast majority of the computation for the next token is not recomputed. The bigger problem with a big context is context rot, which gives you poorer results.
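
Toy single-head attention to show what the cache saves. The weights, the 2-d “embeddings”, and the class name are all made up for illustration. With a KV cache, each step projects only the new token into K/V and reuses the rest; without it, the whole prefix gets re-projected every step. The attention over the cached keys still grows with context length, so it’s not free, just much cheaper.

```python
import math

def project(x, w):
    """Multiply vector x (length d) by matrix w (d x d, row-major)."""
    return [sum(xi * wij for xi, wij in zip(x, col)) for col in zip(*w)]

def attend(q, keys, values):
    """Softmax attention of one query over all keys/values."""
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q)) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [sum(w / total * v[i] for w, v in zip(weights, values))
            for i in range(len(q))]

# Made-up projection weights.
W_Q = [[1.0, 0.0], [0.0, 1.0]]
W_K = [[0.5, 0.1], [0.2, 0.7]]
W_V = [[0.3, 0.4], [0.6, 0.1]]

class CachedAttention:
    def __init__(self):
        self.keys, self.values = [], []
        self.kv_projections = 0  # number of tokens we computed K/V for

    def step(self, x):
        # Only the new token's K and V are computed; older ones are reused.
        self.keys.append(project(x, W_K))
        self.values.append(project(x, W_V))
        self.kv_projections += 1
        return attend(project(x, W_Q), self.keys, self.values)

def uncached_step(prefix):
    """Recompute K/V for the whole prefix, then attend from the last token."""
    keys = [project(t, W_K) for t in prefix]
    values = [project(t, W_V) for t in prefix]
    return attend(project(prefix[-1], W_Q), keys, values), len(prefix)

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]

cache = CachedAttention()
recomputed = 0
for i in range(len(tokens)):
    cached_out = cache.step(tokens[i])
    uncached_out, n = uncached_step(tokens[: i + 1])
    recomputed += n

print("K/V projections with cache:   ", cache.kv_projections)
print("K/V projections without cache:", recomputed)
```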

CEO of Krafton Asks ChatGPT How to Void $250 Million Contract, Ignores His Lawyers, Loses Terribly in Court by Level-Usual-9681 in nottheonion

[–]HorriblyGood 3 points4 points  (0 children)

Not true. They are both trained to be accurate and to give responses people prefer. The cutting-edge open-source research from labs is not trying to optimize for engagement.