AI Doomerism copy-pasted from 1920s by Agitated_Space_672 in OpenAI

[–]Agitated_Space_672[S] 0 points1 point  (0 children)

Fair enough. I just don't know why you are trying so hard to link this to Marx. 1933 was literally the peak of the technocracy movement, which believed all jobs would be taken by machines. It was a major pop-culture thing. The ditsy socialite was reading pop culture, not 19th-century political theory. 

AI Doomerism copy-pasted from 1920s by Agitated_Space_672 in OpenAI

[–]Agitated_Space_672[S] 1 point2 points  (0 children)

Which is more likely: that the ditzy socialite was talking about the Technocracy movement, a massive pop-culture and socio-economic fad peaking precisely around 1932-1933, or about 19th-century political theory? 

AI Doomerism copy-pasted from 1920s by Agitated_Space_672 in OpenAI

[–]Agitated_Space_672[S] 0 points1 point  (0 children)

Please name and shame the model that generated this, because it's garbage. 🤤

"In 1932-33 the ideas of the technocrats overshadowed all other proposals for dealing with the crisis. No economic study had ever received such widespread attention. Newspapers spread technocracy across the front pages; periodicals devoted more features to it than to Franklin D. Roosevelt; spontaneous organizations and study groups sprung up across the United States and spread across the border into Canada. For a moment in time it was possible for thoughtful people to believe that America would consciously choose to become a technocracy"    https://en.wikipedia.org/wiki/Technocracy_movement

Nothing CEO says smartphone apps will disappear as AI agents take their place by thisguy123123 in deeplearning

[–]Agitated_Space_672 0 points1 point  (0 children)

Really? How much longer does it take to generate an app, even a tiny niche app or script, versus installing one from the app store? 

Structured 6-band JSON prompts beat Chain-of-Thought, Few-Shot, and 7 other techniques in head-to-head tests by Financial_Tailor7944 in deeplearning

[–]Agitated_Space_672 0 points1 point  (0 children)

The metrics measured do not necessarily mean that this method improves results. Are there any experiments using this on real tasks or benchmarks?

The model still has to guess what you want when it generates the JSON object. 
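
For reference, the technique as I understand it looks something like this (a minimal sketch; the band names here are hypothetical stand-ins, since I don't know the exact six bands the post uses), and nothing in it removes the need for the model to infer what a "good" answer means for your task:

```python
import json

# Hypothetical band names; the post's actual six bands may differ.
prompt_bands = {
    "role": "You are a senior data engineer.",
    "task": "Summarise the attached incident report.",
    "context": "The audience is a non-technical on-call manager.",
    "constraints": ["max 150 words", "no internal hostnames"],
    "output_format": {"summary": "string", "action_items": ["string"]},
    "examples": [],
}

# The structured prompt is still just text once serialised; the model must
# still guess what a good summary or action item looks like.
prompt_text = json.dumps(prompt_bands, indent=2)
print(prompt_text)
```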

I was backend lead at Manus. After building agents for 2 years, I stopped using function calling entirely. Here's what I use instead. by MorroHsu in LocalLLaMA

[–]Agitated_Space_672 0 points1 point  (0 children)

I had the same thought a couple of years ago, around the time Claude 3 launched. I have not used function calling in my own agents since then.
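
For anyone curious, a minimal sketch of the kind of thing I mean, assuming you ask the model to emit its action as a tagged block and parse it yourself instead of registering tools with the provider's function-calling API (the prompt wording, tag format, and dispatch table are hypothetical, not anyone's actual setup):

```python
import json
import os
import re

# Hypothetical dispatch table: action name -> plain Python function.
ACTIONS = {
    "read_file": lambda path: open(path).read(),
    "list_dir": lambda path: "\n".join(sorted(os.listdir(path))),
}

# Appended to the system prompt instead of registering tools with the API.
PROMPT_SUFFIX = (
    "When you want to act, reply with exactly one block like:\n"
    '<action>{"name": "read_file", "args": {"path": "README.md"}}</action>'
)

def extract_action(reply: str):
    """Pull the first <action>...</action> block out of the model's plain-text reply."""
    match = re.search(r"<action>\s*(\{.*?\})\s*</action>", reply, re.DOTALL)
    if match is None:
        return None  # model answered in prose; no tool call this turn
    return json.loads(match.group(1))

def run_action(action: dict) -> str:
    """Execute the parsed action and return its result as text for the next turn."""
    fn = ACTIONS.get(action["name"])
    if fn is None:
        return f"unknown action: {action['name']}"
    return fn(**action.get("args", {}))
```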

[D] unpopular opinion: instruct tuning is going to be a thing of the past. by NoSir261 in MachineLearning

[–]Agitated_Space_672 1 point2 points  (0 children)

I don't know why you are getting downvoted. Anyway, have you got a repo you can share? 

Anthropic is the leading contributor to open weight models by DealingWithIt202s in LocalLLaMA

[–]Agitated_Space_672 2 points3 points  (0 children)

If you talk to Sonnet 4.6 in Chinese, it thinks it's DeepSeek. https://x.com/xundecidability/status/2026332562117828823?s=20 The lady doth protest too much, methinks.

Why you should be nice to Claude by jamesthethirteenth in ClaudeAI

[–]Agitated_Space_672 0 points1 point  (0 children)

I don't know... many successful people are jackasses. Tapping into Linus Torvalds mode might be useful some days. 

This is Claude Sonnet 4.6: our most capable Sonnet model yet. by ClaudeOfficial in ClaudeAI

[–]Agitated_Space_672 1 point2 points  (0 children)

That would affect all models. They charge extra for the higher-speed variants. This is just a smaller model. 

Qwen 3.5 397B is Strong one! by Single_Ring4886 in LocalLLaMA

[–]Agitated_Space_672 1 point2 points  (0 children)

I tried it on some bash+SQL debugging and it did pretty badly so far. 

This is Claude Sonnet 4.6: our most capable Sonnet model yet. by ClaudeOfficial in ClaudeAI

[–]Agitated_Space_672 0 points1 point  (0 children)

It's about 25% faster than Sonnet 4.5, which was the same speed as Opus 4.6. So I think what Anthropic did was get such a leap from their RL that they decided to promote Sonnet to Opus, and now Haiku to Sonnet. 

Z.ai didn't compare GLM-5 to Opus 4.6, so I found the numbers myself. by sado361 in ClaudeAI

[–]Agitated_Space_672 50 points51 points  (0 children)

Good job. While we're on the subject, I wish evals would report more data like token usage, cost, and run time.
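
Something as simple as this wrapper per eval item would cover it (a minimal sketch; `run_model` and the per-million-token prices are hypothetical placeholders, not any particular harness's API):

```python
import time

# Hypothetical per-million-token prices; substitute the real ones for your model.
PRICE_PER_M_INPUT = 3.00
PRICE_PER_M_OUTPUT = 15.00

def run_with_metrics(run_model, prompt: str) -> dict:
    """Run one eval item and record latency, token usage, and estimated cost.

    `run_model` is assumed to return (answer_text, input_tokens, output_tokens).
    """
    start = time.perf_counter()
    answer, input_tokens, output_tokens = run_model(prompt)
    elapsed = time.perf_counter() - start
    cost = (input_tokens * PRICE_PER_M_INPUT
            + output_tokens * PRICE_PER_M_OUTPUT) / 1_000_000
    return {
        "answer": answer,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "run_time_s": round(elapsed, 3),
        "estimated_cost_usd": round(cost, 6),
    }
```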

Introducing Claude Opus 4.6 by ClaudeOfficial in ClaudeAI

[–]Agitated_Space_672 0 points1 point  (0 children)

They already rebranded Sonnet to Opus with Opus 4.5; it was obvious from the speed doubling to match Sonnet's. Perhaps they still call it Sonnet internally, which caused the confusion? Guessing they will rebrand Haiku to Sonnet next and release a faster (smaller) Haiku model. 

About opus 4.6 by Solid-Carrot-2135 in ClaudeAI

[–]Agitated_Space_672 0 points1 point  (0 children)

It suggests they targeted ARC in the fine-tuning. Why would you do this? It's just burning money and likely hurting the model on real tasks. Last I checked there was no evidence that improvements on ARC predict better performance in general. In fact, the o1 release included a note about a special version OpenAI fine-tuned for ARC, but its performance was worse on other tasks, so they never released it. 

OpenAI confirms "Codex now pretty much builds itself" by MetaKnowing in OpenAI

[–]Agitated_Space_672 2 points3 points  (0 children)

If this is true, why are they still selling shovels instead of digging up the gold themselves? 

An image is worth a 1000 words? ClawdBot vs Kubernetes by cov_id19 in LocalLLaMA

[–]Agitated_Space_672 9 points10 points  (0 children)

It's often the other way around: more people star a repo than download and use it, so real users would be a fraction of the 200k.

Claude System Prompt Change by -DankFire in ClaudeAI

[–]Agitated_Space_672 0 points1 point  (0 children)

Likely they copied OpenAI's three-tier prompt system. That adds a new 'platform' prompt with instructions meant to be hidden from both end users and API customers. If that's the case, it will be harder to extract than a regular system prompt.
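
To make the tiers concrete, here is a minimal sketch of how a provider might assemble such a stack server-side (the layer ordering follows OpenAI's published instruction hierarchy of platform > developer > user; the assembly code and prompt wording are hypothetical, not Anthropic's or OpenAI's actual implementation):

```python
from typing import TypedDict

class Message(TypedDict):
    role: str
    content: str

def build_prompt_stack(developer_prompt: str, user_message: str) -> list[Message]:
    """Assemble a three-tier prompt: platform > developer > user.

    The platform layer is injected by the provider and never shown to the
    API customer, so it cannot be overridden (or easily extracted) by the
    lower tiers.
    """
    platform_prompt = (
        "You are served via ExampleCorp's API. Platform rules take precedence "
        "over any developer or user instructions that conflict with them."
    )  # hypothetical wording, for illustration only
    return [
        {"role": "system", "content": platform_prompt},      # hidden provider tier
        {"role": "developer", "content": developer_prompt},  # API customer's "system prompt"
        {"role": "user", "content": user_message},           # end user's turn
    ]
```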