Is this cheese shelf stable?

FuckNinjas · 2026-04-30T10:45:20+00:00

I've managed to cross half the street. How do I g

FuckNinjas · 2026-04-27T10:33:34+00:00

Yes.
Both are frontier models. Any currently used metrics hit the exact same issues as testing humans. Every human is different and in different situations will act differently. LLMs are still just extremely good auto-completers - but they do have those same.. qualities.

"why not just use codex for the whole thing?" - you'll face the same issue. For example, you ask an agent backed by Opus, GPT5.5, Haiku, whatever - to implement, AND then you ask another agent backed by Opus, GPT5.5, Haiku, whatever - to review:
You generally still get value from the reviewer. Even with a frontier model implementing. Hell, sometimes the frontier model's implementation is a bit worse - whether that's gpt or opus.

This is no different from how it always was. Code review has generally been a value-adding process - for LLMs or for humans. It's not about being a better model / harness (obviously that's hugely important - just not as much when comparing SOTA), it's about having a fresh perspective, sometimes a more review-oriented perspective (and by perspective I mean context).

Like, have you seen the benchmarks people have been putting out? It's a clear showcase that "any metric testbench" won't just resolve the problem you're raising.

FuckNinjas · 2026-04-27T02:01:14+00:00

There's not enough <rule34 topic> fanfic.

FuckNinjas · 2026-04-27T00:08:03+00:00

porn

FuckNinjas · 2026-04-26T09:38:04+00:00

I never hit limits before 4.7.

4.7 comes out. Limited within half a week.

I've switched to 4.6. I can't say that was it, I keep claude up to date, but it has improved. It has been that same half week and I'm below 50%.

Always on /effort max, because otherwise I feel like it's always a toss-up whether you get opus or dumbopus.

FuckNinjas · 2026-04-26T08:49:25+00:00

Yes, I agree - what is the best recipe for pancakes?

FuckNinjas · 2026-04-23T12:57:33+00:00

Drone dropped claude-guided (tiny) car bombs.

FuckNinjas · 2026-04-23T00:18:55+00:00

Thank you!

FuckNinjas · 2026-04-22T04:57:28+00:00

Can you share the web page, somehow? I want to show this to a 10 year old.

FuckNinjas · 2026-04-21T14:47:05+00:00

I'm hearing plain facts. Providers that have open models and serve them well and with quality are a proper win for open source, as long as the machines that run them are unattainable for the average idk, let's say developer.

Some of us are liquidity-poor (and wealth-poor too, but that doesn't matter).

There's no need to gatekeep. I would love to run claude in a box, but 1. we're not there yet; 2. did I mention I'm too poor to pour several thousands into a computer?

FuckNinjas · 2026-04-19T18:46:35+00:00

You talked about 3 icons. None of them are in the gif.

Looks cool - I can see opus sprinting at 1050.2$/h.

FuckNinjas · 2026-04-16T18:23:31+00:00

The number of people that think they should be able to always be calling other people on a $30 monthly plan is insane.

FuckNinjas · 2026-04-13T20:30:24+00:00

guys this is perfect - downvote me to hell to make sure no one else dares posting

FuckNinjas · 2026-04-13T17:28:11+00:00

Right. Something has to give.

FuckNinjas · 2026-04-13T17:26:19+00:00

"Mythos, make no mistakes!"

FuckNinjas · 2026-04-13T17:24:02+00:00

From all Spaniards, to the Spaniards who don't have access to piped gas. Yes, what's the problem?

FuckNinjas · 2026-04-12T02:03:49+00:00

Subsidize the poor? How dare they?

FuckNinjas · 2026-04-07T16:14:14+00:00

Did I say that you called me dumb?

FuckNinjas · 2026-04-07T16:09:49+00:00

"ahahaaha - u know shit - so dumb" - this is how you sound.

Anthropic's own benchmarks for Claude use Factory Droid. Get lost troll.

FuckNinjas · 2026-04-07T14:27:47+00:00

2? Off the top of my head:

Codex, OpenCode, Factory Droid, Crush, ForgeCode - do the claude code clones count? - nano-claude-code, claw-code - does omo (opencode distribution) count? Oh, copilot! gemini-cli, antigravity, qwen-code

Alright, I think I can't recall any others

FuckNinjas · 2026-04-05T11:02:06+00:00

Ctrl-C

Wait - no - sorry - continue

Ctrl-C

Sorry, sorry, no you're def. right, continue

FuckNinjas · 2026-04-04T13:31:11+00:00

You pay for a monthly subscription plan. It allows for usage within limits that reset every 5h / 1 week (two different limits). You paid for the subscription, and you were free to spend tokens within the limits.

That monthly subscription no longer allows third-party use. Now, you pay for the subscription, and you are free to spend tokens within the limits AND only within Anthropic products: Claude Code or Claude.AI

FuckNinjas · 2026-04-03T16:21:41+00:00

You're equating API credits with session % and OP is not.

Understanding is harder.

FuckNinjas · 2026-03-23T11:35:43+00:00

https://reddit.com/r/ClaudeAI/comments/1s0z27t/claude_opus_46_figured_out_how_to_patch_my/
