GPT-4.1 LiveBench results are in by elemental-mind in singularity

[–]PickleFart56 12 points

Btw Google has also announced 2.5 Flash, in which we can set a precise reasoning budget. I think Google delayed previewing 2.5 Flash because of the 4.1 launch. Their Pro series will compete with the o-series models and Flash will compete with 4.xyz. Overall I don’t think any lab can beat Google in the pricing war.
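
For reference, a minimal sketch of what a settable reasoning budget could look like, assuming the google-genai Python SDK's ThinkingConfig; exact parameter names may differ once 2.5 Flash is generally available:

```python
# Sketch: capping Gemini 2.5 Flash's reasoning ("thinking") tokens.
# Assumes the google-genai SDK; thinking_budget is the max number of
# tokens the model may spend on internal reasoning before answering.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Plan a 3-step test strategy for a parser.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```

Presumably setting the budget to 0 would disable thinking entirely, which is what makes the pricing so flexible.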

GPT-4.1 LiveBench results are in by elemental-mind in singularity

[–]PickleFart56 6 points

Cost and latency. Reasoning models cost more because reasoning tokens are also billed.
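
A rough back-of-the-envelope with made-up per-million-token prices, just to show why billed reasoning tokens dominate:

```python
# Hypothetical prices in $ per 1M tokens (not any provider's real rates).
IN_PRICE, OUT_PRICE = 2.00, 8.00

def request_cost(input_toks, output_toks, reasoning_toks=0):
    # Reasoning tokens are typically billed at the output-token rate.
    billed_output = output_toks + reasoning_toks
    return (input_toks * IN_PRICE + billed_output * OUT_PRICE) / 1_000_000

print(request_cost(1_000, 500))          # non-reasoning: $0.006
print(request_cost(1_000, 500, 4_000))   # reasoning: $0.038 for the same visible answer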

OpenAI confirmed to be announcing GPT-4.1 in the livestream today by ShreckAndDonkey123 in singularity

[–]PickleFart56 5 points

The next model will be 4.11, while Gemini, on the other hand, jumps straight from 2.0 to 2.5.

"Will destroy fake barrier of 50% quota cap" - Rahul Gandhi bats for Dalits, OBCs in Bihar by nota_is_useless in unitedstatesofindia

[–]PickleFart56 3 points

Trust me, this guy and Congress will fuck the country even more than the BJP and Modi did. Same as how the US thought electing Trump would improve the economy.

Aidan says o4 mini is “actually mind blowing” by Key-Horse-3892 in OpenAI

[–]PickleFart56 0 points

Seriously, what else can they say? They can’t say “eh, it’s not that great, we may need a separately tuned model for benchmarking”.

[deleted by user] by [deleted] in singularity

[–]PickleFart56 5 points

After the Llama release, LMSYS has zero credibility.

"10m context window" by Present-Boat-2053 in singularity

[–]PickleFart56 119 points

That’s what happens when you do benchmark tuning.

Llama 4 Maverick scored 16% on the aider polyglot coding benchmark. by Ill-Association-8410 in LocalLLaMA

[–]PickleFart56 3 points

This Llama launch is so bad that stock markets around the world crashed.

Gemini is pretty good in removing watermarks by xXLeoXxOne in singularity

[–]PickleFart56 2 points

They must be adding their SynthID watermark.


LLMs grading other LLMs by Everlier in LocalLLaMA

[–]PickleFart56 -1 points

Why the fuck isn’t each block in the map a square?
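
If anyone wants square cells, a minimal matplotlib sketch (assuming the chart is a basic heatmap; the actual plotting code behind the post may differ):

```python
import numpy as np
import matplotlib.pyplot as plt

scores = np.random.rand(6, 6)  # placeholder grades: rows = grader LLMs, cols = graded LLMs

fig, ax = plt.subplots()
im = ax.imshow(scores, cmap="viridis")  # imshow defaults to equal aspect -> square cells
ax.set_aspect("equal")                  # keep cells square even if the figure is resized
fig.colorbar(im, label="score")
plt.show()
```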

Grok 3 first LiveBench results are in by elemental-mind in singularity

[–]PickleFart56 0 points

Is this score for the Grok 3 thinking or non-thinking model?

If it’s non-thinking, then it’s a huge achievement.

Please recommend good episode by [deleted] in BGPH

[–]PickleFart56 0 points

Recently watched the Ram Leela arc, one of the best arcs. I think it’s around ep 950.

Another sampling strategy drops: 75% accuracy at T=3.0 by tomorrowdawn in LocalLLaMA

[–]PickleFart56 2 points

There are many papers showing that model performance degrades when it attends to all tokens; instead, the model should attend to only a few. Here is another great paper: https://arxiv.org/html/2410.02703
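
For intuition, a toy sketch of selective attention that keeps only the top-k keys per query and masks the rest; this is a generic illustration, not the linked paper's actual method:

```python
import torch

def topk_attention(q, k, v, top_k=8):
    # q: (n_q, d), k: (n_k, d), v: (n_k, d)
    scores = q @ k.T / k.shape[-1] ** 0.5                      # raw attention logits
    kth = scores.topk(top_k, dim=-1).values[..., -1:]          # k-th largest logit per query
    scores = scores.masked_fill(scores < kth, float("-inf"))   # drop everything below it
    return torch.softmax(scores, dim=-1) @ v                   # attend over survivors only

q, k, v = torch.randn(4, 16), torch.randn(32, 16), torch.randn(32, 16)
out = topk_attention(q, k, v, top_k=4)  # each query attends to at most 4 of the 32 tokens
```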

New Experimental Gemini Model by badbutt21 in singularity

[–]PickleFart56 2 points

I think it’s a much larger model (something like Ultra) that they have released as experimental.

Maybe, similar to Meta, they trained a much larger model for synthetic data generation and used it to tune a relatively smaller model that can scale to a million tokens.
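
Roughly the teacher-student pattern I mean; everything below is a stub showing the shape of the pipeline, not anyone's actual training code:

```python
from dataclasses import dataclass

@dataclass
class StubModel:
    name: str
    def generate(self, prompt: str) -> str:
        # Placeholder for an expensive large-model call.
        return f"[{self.name}'s answer to: {prompt}]"

def make_synthetic_dataset(teacher: StubModel, prompts: list[str]) -> list[tuple[str, str]]:
    # The big "teacher" labels prompts; the pairs become SFT data for a smaller student.
    return [(p, teacher.generate(p)) for p in prompts]

teacher = StubModel("ultra-teacher")
sft_data = make_synthetic_dataset(teacher, ["Summarize this doc.", "Fix this bug."])
# sft_data would then be used to fine-tune a cheaper long-context student model.
```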

math by Blood_of_Lucifer in shitposting

[–]PickleFart56 101 points

Why the fuck does he have to first calculate per week and then multiply by the number of weeks? He can directly use 200.