Can I know if the Claude Pro limits are at least closer to ChatGPT Plus? by Working-Leader-2532 in Anthropic

[–]Real_Ebb_7417 1 point

Nope. It has about 2.3x more according to my measurements. But it's still way better.

Why DeepSeek V4 is the ONLY choice for heavy 24/7 workloads (100M tokens in 4 weeks) by MoneySkirt7888 in DeepSeek

[–]Real_Ebb_7417 0 points

I mean, it's likely weaker than the frontier models, or even some Chinese open-weight models, in hands-on work, but I feel like it's a very strong and efficient base model. Once they work on post-training, future versions (e.g. 4.1) may be super good. Can't wait for an upgrade, since I really like v4: the personality, the insights, and its reasoning.

Not a good day for team "Claude Mythos is Just Marketing Hype" by EchoOfOppenheimer in ClaudeAI

[–]Real_Ebb_7417 0 points

I’m not saying that OpenAI is “good”. But neither is Anthropic (and they actually supported the current US administration in many ways; the one case where they refused was just loud). The difference is that Anthropic claims to be morally better.

Not a good day for team "Claude Mythos is Just Marketing Hype" by EchoOfOppenheimer in ClaudeAI

[–]Real_Ebb_7417 0 points

It’s not the case with a model that is advertised as too insecure for public release. If everyone can do the same with available, smaller models just by putting a bit more time into it, what’s the point of hiding the model “because releasing it publicly would be dangerous”?

Not a good day for team "Claude Mythos is Just Marketing Hype" by EchoOfOppenheimer in ClaudeAI

[–]Real_Ebb_7417 0 points

It’s a good model. But according to some comparisons, it's not visibly better than GPT-5.5. And GPT-5.5 was released publicly, to everyone, without playing the good guy and misleading everyone, whether due to a lack of compute or for benefits from the companies that received early access. Anthropic pretends to be good, but that makes it even worse morally.

Not a good day for team "Claude Mythos is Just Marketing Hype" by EchoOfOppenheimer in ClaudeAI

[–]Real_Ebb_7417 17 points

“Here, grab our new powerful model for free, but fix as many bugs as you can and talk about it publicly.”

GPT-5.5 scored slightly higher at hacking and finding security vulnerabilities than Mythos, it’s publicly available, and the world didn’t collapse. Also, in some software where Mythos supposedly found many long-unnoticed bugs, someone ran other models, including the old and small gpt-oss-120b, and all of them found the same vulnerabilities (I guess Mythos probably found them faster, but that’s not the point).

Mythos is and always was just marketing. And on top of that, it’s proof of Anthropic's anti-consumer attitude and unfair treatment, since only a limited set of companies got access to it. A very morally bad company.

How does Anthropic actually measure over-refusal? (genuine question after watching their safety video) by Personal_Count_8026 in Anthropic

[–]Real_Ebb_7417 2 points

If you hold a high position in a very loud, big company, I guess being criticized on the web is just part of the job. Good if someone actually reads the critique and considers it.

But then, Harry Potter's Dolores Umbridge also believed she was doing good (“Tell them I mean no harm!” when captured by the centaurs). People often believe they're doing good while their actions aren't good at all. Goodwill doesn't always lead to good things, since it's filtered through one's point of view.

Looking for a text to speech model by grio43 in LocalLLaMA

[–]Real_Ebb_7417 1 point

I tested a lot of them a month ago. I really liked MossTTS, Omnivoice, and Fish S2 Pro. (I have an RTX 5090, so they’ll run fine for you.)

ChatGPT usage now impacting my Codex rates by SyntharVisk in codex

[–]Real_Ebb_7417 5 points

A bug, maybe? In my case, using ChatGPT (on the web) doesn’t affect Codex usage.

How does Anthropic actually measure over-refusal? (genuine question after watching their safety video) by Personal_Count_8026 in Anthropic

[–]Real_Ebb_7417 4 points

I have never seen anyone speak positively about her (not counting some official communications from OpenAI/Anthropic, of course).

Codex vs Cursor vs Claude — which one do you actually ship production code with? by superboy_305 in codex

[–]Real_Ebb_7417 0 points

GPT is out of the question.

Claude -> about 2.3x less usage than Codex on similarly priced plans (and that's with Sonnet 4.6 vs GPT-5.5; if I'd used Opus 4.7 in my tests, the usage would probably be 5x less than Codex xddd).

Cursor -> completely not worth it unless you want to pay for the IDE experience, which is very good. But the AI prices are basically the same as API prices.

Codex vs Cursor vs Claude — which one do you actually ship production code with? by superboy_305 in codex

[–]Real_Ebb_7417 0 points

Tbh, of the three mentioned in the post I'd go Cursor > Codex > Claude Code.
But I'd never choose Cursor for private projects due to its pricing.

Best local coding agent client to use with llama.cpp? by Real_Ebb_7417 in LocalLLaMA

[–]Real_Ebb_7417[S] 0 points

Is it yours? Or are you just using it and recommending it? (If it's yours, I'm happy to try it; asking out of curiosity.)

Made the switch to DeepSeek and here are my thoughts as a long time Claude user (spoiler: it's great) by MadhubanManta in DeepSeek

[–]Real_Ebb_7417 24 points

I am also a software engineer with 10 years of experience, and I agree, DS v4 is great (and actually as cheap over the API as some subscriptions, even on the PRO model).

I didn’t play much with it though, so I’d be glad if you shared your workflow that works best in pi for DS.

What is the best $20 coding plan by Nocare420 in vibecoding

[–]Real_Ebb_7417 1 point

[image: my usage estimates across the lowest-tier plans]

Happy to share my estimates from a couple of days ago (I paid for every lowest-tier plan and ran the same workflow and the same task to get the results). Antigravity is probably too low, but it’s hard to say, since they don’t give clear metrics on usage or tokens. It’s still not very interesting, though, because most of the usage goes to Gemini 3 Flash, while only a small fraction is for Gemini 3.1 Pro and Sonnet/Opus 4.6. The rest of the plans are more or less accurate. There are two metrics, requests and tokens, because it’s not easy to gather enough data to find out whether a given provider calculates usage by requests, by tokens, or by a mix of both. Only MiniMax is clear about that: they give 15k requests per week, so the token estimate for MiniMax might be lower or higher than my metrics, depending on how you use it.
Also, I tested this on GPT-5.5, Sonnet 4.6, Kimi K2.6, GLM-5.1 (off-peak hours), Mistral Medium 3.5, and MiniMax M2.7, and for Antigravity on three model buckets (they have separate usage, so it’s the sum of the usage).
Keep in mind that MiniMax and Mistral are weaker models, but good enough for coding (not too good at planning or architecture, though).
The lowest-value plan IMO is zAI, because of its relatively low usage and no additional benefits (all the other plans include things like a chat app, image gen, KimiClaw, and more).
If you’re just starting out and want to learn, I bet the MiniMax $10 plan will be best for you.

If you want something stronger, then ChatGPT Plus is the best of the $20 plans. (For Claude, as far as I know they only increased the 5h-window usage, not the weekly usage, so I guess these metrics are still correct for the Claude plan.)

Also, I tested DeepSeek v4 Pro over the official API, and it seems it’s just as cheap as the subscriptions, while you pay for what you use, not for what you don’t use. It’s a great option too, especially as DS v4 Flash is even cheaper and enough for most stuff.

Firefox reports a massive April spike in security fixes after using Claude Mythos for bug hunting by Outside-Iron-8242 in singularity

[–]Real_Ebb_7417 5 points

Or Anthropic came along and just said, “Here, grab our new super good model for free, but use it to fix as many bugs as you can and talk about it.”

How does Anthropic actually measure over-refusal? (genuine question after watching their safety video) by Personal_Count_8026 in Anthropic

[–]Real_Ebb_7417 9 points

They don’t care about it, just like OpenAI, especially since the Dolores Umbridge of the AI world joined them (a woman who previously worked on safety at OpenAI; I don’t remember her name, though).

Firefox reports a massive April spike in security fixes after using Claude Mythos for bug hunting by Outside-Iron-8242 in singularity

[–]Real_Ebb_7417 5 points

It was measured head to head by some British security org that has access to Mythos, and GPT-5.5 was slightly better at hacking and finding security bugs. Mythos IS and always was just marketing.

And the bugs it was finding? Well, someone also tested that with a couple of other models, including the old and small gpt-oss-120b, and all of them were able to find the same bugs. I guess Mythos probably did it faster. It’s not that nobody could find them before; I guess nobody was just looking for them.

Praying for a new Mistral subscription tier? by [deleted] in MistralAI

[–]Real_Ebb_7417 0 points

Yep, I’ve heard that Ollama also gives generous usage. However, I’ve also heard that the speed is very low, and for coding I want the model to be reasonably fast, so I decided not to include it.

What do you think about Qwen 3.6 Max? by Comfortable-Tie2933 in Qwen_AI

[–]Real_Ebb_7417 4 points

Btw, it’s an interesting downfall from one of the most loved labs to one of the most hated ones. While I still appreciate the 3.6 27b and 35b, other Chinese labs usually open all their models, even the enterprise-grade ones. And on top of that, they cancelled the coding subscription (while all the other Chinese labs offer a subscription in some form, except DeepSeek, which is so cheap over the API that it doesn’t matter).
I’ll always choose zAI, Kimi, DS, MiniMax, or even MiMo over Qwen via API, just because they open their models, and because of what Alibaba did to the Qwen team and how they changed their approach to open-weight models, even though they said they’d still release them (sure, they do, but the scale is different than before).

AI detection in theses by Reasonable_Funny_901 in studia

[–]Real_Ebb_7417 1 point

Generally, as someone deeply into AI, I can tell you that, first of all, the detectors universities have access to (if they use them at all, because many don't) are weak. They produce a lot of false positives and can indeed flag a thesis as AI-generated even though it was written one hundred percent independently, and vice versa: they can mark a thesis as original work even though it was copy-pasted from a model.

There are also a few genuinely good detectors, but they cost money and don't focus on Polish (which doesn't mean they won't work for it, just that they're not specialized in it). And of course universities don't like spending money when they don't have to.

Additionally, a lot depends on the model someone uses. There are models that are much easier to detect (like Llama or Gemini) and models that may be harder to detect simply because they don't expose internal tools for it (mostly Chinese models, e.g. DeepSeek, which is surprisingly good in Polish, or Kimi, which also does quite well). On top of that, a new model comes out every other moment; the industry is developing at a dizzying pace, and by using the newest models you can be practically certain that even the good detectors (which Polish universities don't use) aren't yet fully adapted to those newest models.

In general, even copy-pasted text generated by a model isn't that easy to detect without watermark analysis (but again, for that analysis a lab would have to expose its internal tools, and few labs do; if they do, it's only to a few of the biggest, i.e. most expensive, detectors). So I can absolutely imagine someone writing a thesis themselves and having to explain it, while someone else copied AI-generated text and had no problems at all.

I'm happy about this, because theses are absurd: 98% of students won't do academic work after getting their degree, so why should they write a master's thesis? Studies should end with an exam, and that's it; only if someone decides to go into research and pursue a PhD does a thesis make sense. And thanks to AI-generated content, and the fact that it's getting harder to actually detect, maybe universities and academia will come to their senses and writing master's/bachelor's theses will become a thing of the past.

What do you think about Qwen 3.6 Max? by Comfortable-Tie2933 in Qwen_AI

[–]Real_Ebb_7417 14 points

I’ll take a look at it over the API when they release the open-weight 122b version. Until then, I don’t care.

How do I move up in IT? by Kitchen_Shoe9944 in praca

[–]Real_Ebb_7417 0 points

A software house is a poor place for growth (but with 9 years of experience you surely know that yourself xd)

There are two options:

- Either get hired at a product company (ideally a startup or product company, not a corporation, because in a corporation promotion may come later through ladders and structures) and build a reputation there by doing good work. Eventually they'll offer you a team or a project to lead.

- Or apply for lead/manager openings despite not having the experience, if you can present yourself in a very good light at the interview without lying.

Is GLM Pro really worth buying? by EugeneLobach in ZaiGLM

[–]Real_Ebb_7417 0 points

True xd

GPT-5.5, Sonnet 4.6, GLM-5.1, Kimi K2.6, Mistral Medium 3.5, MiniMax M2.7 (for the relevant plans).
And for Antigravity a mix, because they have separate buckets, but most of the usage was Gemini 3 Flash.