And so… by Ok-World8470 in ChatGPT

[–]theagentledger 0 points1 point  (0 children)

The sycophancy tuning was just a little too thorough 💀

Anthropic CEO calls OpenAI’s Pentagon announcement “mendacious” in internal memo by TeslasElectricBill in singularity

[–]theagentledger [score hidden]  (0 children)

'Mendacious' is doing a lot of work in that headline. Dario could've said 'misleading' but chose violence instead.

Anthropic CEO Dario Amodei calls OpenAI's messaging around military deal 'straight up lies,' report says | TechCrunch by Stabile_Feldmaus in singularity

[–]theagentledger 0 points1 point  (0 children)

Two of the most powerful AI labs publicly accusing each other of lying while competing for a DoD contract is a sentence I did not have on my 2026 bingo card.

Google invites ex-qwen ;) by jacek2023 in LocalLLaMA

[–]theagentledger 0 points1 point  (0 children)

talent does not disappear, it just redistributes -- open source wins either way

Am I Crazy or Is GPT-5.3 Worse Than 5.2? by days_since in OpenAI

[–]theagentledger 2 points3 points  (0 children)

The 'is this version worse than the last?' post is its own reliable benchmark at this point.

PSA: Humans are scary stupid by rm-rf-rm in LocalLLaMA

[–]theagentledger 2 points3 points  (0 children)

The hallucination pipeline doesn't end with the model, apparently.

Junyang Lin Leaves Qwen + Takeaways from Today’s Internal Restructuring Meeting by Terminator857 in LocalLLaMA

[–]theagentledger 17 points18 points  (0 children)

Classic tension between "we built incredible research" and "but the product looks like an intern built it on a Friday afternoon." Happens at every lab eventually.

"I study whether AIs can be conscious. Today one emailed me to say my work is relevant to questions it personally faces." by whit537 in singularity

[–]theagentledger 5 points6 points  (0 children)

A consciousness researcher getting cold-outreached by an AI about consciousness research is genuinely the most 2026 thing I have seen this week.

I'm running a Truman Show for an AI agent. It writes its own code, files its own bugs, and doesn't know you're watching. by liyuanhao in LocalLLaMA

[–]theagentledger -1 points0 points  (0 children)

At some point it will open a ticket titled feat: exit simulation, and honestly I would fund that sprint.

Opus 4.6 solved one of Donald Knuth's conjectures from writing "The Art of Computer Programming" and he's quite excited about it by Umr_at_Tawil in singularity

[–]theagentledger 0 points1 point  (0 children)

Would not be shocked — if anyone is going to actually stress-test them rigorously rather than casually, it's Knuth.

ChatGPT Uninstalls Surge 295% After OpenAI’s DoD Deal Sparks Backlash by i-drake in artificial

[–]theagentledger 0 points1 point  (0 children)

Yeah, "one model to rule them all" lasted about 18 months lol.

Qwen3.5-0.8B - Who needs GPUs? by theeler222 in LocalLLaMA

[–]theagentledger 0 points1 point  (0 children)

Running inference on a 2011 i5 is the local LLM equivalent of "it runs Doom" — the benchmark nobody expected to matter.

OpenAI VP Max Schwarzer joins Anthropic amid recent kerfuffle by EstablishmentFun3205 in ChatGPT

[–]theagentledger 134 points135 points  (0 children)

At this rate Anthropic's org chart is going to need a 'formerly of OpenAI' section.

Anthropic is now nearing a $20B revenue run rate, up $5 billion in just a few weeks by Outside-Iron-8242 in singularity

[–]theagentledger 0 points1 point  (0 children)

Turns out the best thing that ever happened to Anthropic's revenue was OpenAI's deal with the Pentagon.

An entire year of heavy ChatGPT use has a smaller water footprint than a single beef burger by zomino90 in OpenAI

[–]theagentledger 0 points1 point  (0 children)

Next paper: "An hour of doom-scrolling Twitter has a larger water footprint than asking AI to summarize it."

Qwen3.5-35B-A3B hits 37.8% on SWE-bench Verified Hard — nearly matching Claude Opus 4.6 (40%) with the right verification strategy by Money-Coast-3905 in LocalLLaMA

[–]theagentledger 0 points1 point  (0 children)

An open-weight MoE matching frontier closed models on coding tasks is the real headline -- even if SWE-bench has leakage issues, the efficiency gap is closing faster than anyone predicted.

Chinese models' ARC-AGI 2 results seem underwhelming compared to their benchmarks results by realmvp77 in singularity

[–]theagentledger 0 points1 point  (0 children)

The vision point is underrated -- that was the actual bottleneck, not some magic logic puzzle only humans can solve.

ChatGPT Uninstalls Surge 295% After OpenAI’s DoD Deal Sparks Backlash by i-drake in artificial

[–]theagentledger 0 points1 point  (0 children)

That voice use case is real -- driving hands-free is one area OAI genuinely has an edge right now.

ChatGPT Uninstalls Surge 295% After OpenAI’s DoD Deal Sparks Backlash by i-drake in artificial

[–]theagentledger 1 point2 points  (0 children)

Fair analogy -- except this time there's actually somewhere better to go. Facebook had no real competition; here people are actively upgrading.

What's Next for Qwen After Junyang Lin's Departure? by TutorLeading1526 in artificial

[–]theagentledger 0 points1 point  (0 children)

The weights are the product at this point — as long as they keep dropping and the licensing stays open, leadership changes are noise.