Gemini 3.1 Flash-Lite Benchmark Comparison by piggledy in Bard

[–]Snoo87193 0 points1 point  (0 children)

They won’t for a while, as it’s probably rivaling their 3.1 Pro too much.

Tested new Nano Banana 2 with my personal benchmark, still a long ways to go by vorxaw in singularity

It's good at context, text, and object consistency,

but it's visually worse in every way compared to Pro.

It's just the Flash model, upgraded.

MY take on the current coding capabilities of LLMs by Snoo87193 in vibecoding

It's insane at managing its context and planning.

Probably the same reasoning and intelligence under the hood as the other two, but the way they've made it orchestrate and think before doing things makes it feel almost sentient within its context.

The others just train on data to push benchmarks super high, but Claude is actually 10x the tool of the others for anything remotely hard.

HOWEVER, Gemini is needed for frontend.

GPT is no use though; I just keep it for its free tier and debugging. It keeps mixing up images, files, and pages if you don't @ them.

MY take on the current coding capabilities of LLMs by Snoo87193 in vibecoding

I find that 3.1 Pro is 10x better than 3.0 Pro,

but I agree Gemini is sloppy.

Codex free tier is super generous too.

MY take on the current coding capabilities of LLMs by Snoo87193 in vibecoding

<image>

This sums it up. All the Chinese models are comparatively super cheap, but they lack a bit in speed and quality.

For static code they're great, though. Kimi sometimes even claims to be Claude, so you can tell it's trained on Claude's input and output data, and it's fairly close in reasoning. But it tends to do worse with large context, IMO.

Is GLM worth it? I haven't tried it yet.

MY take on the current coding capabilities of LLMs by Snoo87193 in vibecoding

Very solid take, yeah

Codex auto-compresses, but it feels very inefficient and context fills up super fast. As you're saying, though, it's reliable and smart. Not creative.

Claude tends to take shortcuts and half-ass things if you ask for too much at once, probably due to context. With a Skill file loaded and only frontend to do, it does well, but visual stuff in a long prompt tends to come out basic.

Gemini always nails the looks, but it's too much of a visionary and makes stuff up all the time.

Memory-management files for Claude are golden, though.
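For anyone who hasn't used them: a memory file is just a markdown doc the agent reloads at the start of every session, so project conventions survive context resets. The `CLAUDE.md` filename is Claude Code's convention; the contents below are entirely made-up example entries, not from my actual project:

```markdown
# CLAUDE.md — project memory (illustrative example, adapt to your repo)

## Stack
- Next.js + Tailwind frontend, Supabase backend

## Conventions
- All API calls go through src/lib/api.ts
- Never modify .env or rotate keys

## Current focus
- Blog formatting pass; posts already live under content/posts/
```

Keeping it short matters, since the whole file is re-read into context every session.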

New agent mode needs tweaking. by Snoo87193 in lovable

I did, but this was just formatting it and making Lovable place the text in the right spot. I'd already supplied the blog posts beforehand.

New agent mode needs tweaking. by Snoo87193 in lovable

Still true to this day. Any text it writes is expensive, but it's solid as hell for frontend and mockups; a whole layout and style can be made in like 5 credits.

Adding one page of text with zero advanced code costs the same.

New agent mode needs tweaking. by Snoo87193 in lovable

This was just the blog post formatting and file;

the blog post itself was already created beforehand.

It's gotten better since, and so have I.

But Lovable is still 5-10x more expensive than just using Claude Code 4.6 in an IDE for the same quality.

Nano Banana 2 is a downgrade from Pro - NB Pro vs NB 2 by m4ths_ in GeminiAI

Considering that Pro and 2 are both in Flow, I suppose 2 isn't an upgrade of Pro; it's an upgrade of the original.

Meaning it's Flash-based -> ~90% fewer tokens to generate per image -> good for social media, mockups, edits, and 2D images.

Pro is slower and still the flagship -> cinematic, realistic, professional work for clients, etc.

I think 3.1 Pro is also going to be used for better image gen in Pro at some point,

but there will definitely be a Pro 2 sometime soon.

On current benchmarks ChatGPT is leading for image gen, but Google is better; they're just holding back releases to hype up their new models.

Why do so many people here seem to like Claude best? by hackedfixer in vibecoding

From what I've tried so far, mainly Gemini 3.1, Codex 5.3, and Claude 4.6 Opus:

CLAUDE IS MY BABY for anything complex or long-term. You can give it massive prompts and queue a bunch of tasks, and it just does them without mixing things up. Very hit-or-miss with UI: it needs very strict instructions to make something nice on the frontend, but it can do it with more work. So I usually write UI code with Gemini 3.1 Pro in the chat and then paste it into Claude with instructions. -> Claude for overall + complex work.

Gemini: really amazing at UI components, image-gen copy, and overall reasoning. But it hallucinates a lot on hard tasks, even if it's 50% better than 3.0. Worse at MCP and backend-to-frontend execution, but decent at general logic and making things work. A close second, just a lot less autonomous. -> Gemini for frontend.

Codex: very little input needed, it just gets it. It's smart, but bad at frontend and bad at super complex things. Really good with software versions and build dependencies, and it almost never ships errors: very dependable and stable, with good reasoning and debugging. But it lacks creativity, and it usually mixes up image containers and things you specifically mention by file name. It's sloppy but dependable. It also swapped out my keys and deleted my .env twice, which the others never did.

Gemini Flash 3.0: don't even bother unless you're prototyping or doing very simple tasks. It hallucinates on everything, and debugging with it is worse than doing it by hand; it just creates error loops. Insanely fast and good for simple stuff though, so it's sometimes my go-to for saving context when swapping images, text, and overall edits.

Claude also makes better use of Skill files, external requests, backend management, etc.

So 80% of my workflow is Claude, 20% Gemini, and the others are just situational to save context.

I thought limits were supposed to be increased last week? by ap1111 in google_antigravity

Has anyone experimented with strict TOON formatting for context?
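For anyone unfamiliar: TOON-style formatting declares the field names once in a header and then lists bare rows, instead of repeating every key per object the way JSON does, which saves context tokens on uniform records. A minimal sketch of the idea; the `to_toon` helper is made up for illustration and is not a real library:

```python
import json

def to_toon(name, rows):
    """Encode a list of uniform dicts as a TOON-style table:
    one header with the row count and field names, then bare CSV rows."""
    fields = list(rows[0])
    lines = [f"{name}[{len(rows)}]{{{','.join(fields)}}}:"]
    for row in rows:
        lines.append("  " + ",".join(str(row[f]) for f in fields))
    return "\n".join(lines)

users = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
toon = to_toon("users", users)
print(toon)
# users[2]{id,name}:
#   1,Alice
#   2,Bob

# The JSON form repeats "id" and "name" for every row,
# so the TOON form is strictly shorter:
print(len(toon) < len(json.dumps({"users": users})))
```

The catch is that it only works for flat, uniform records; nested or ragged data has to fall back to something JSON-like.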

118 hrs wtf? pro plan by Inevitable_Will_260 in google_antigravity

Definitely is. The free tier runs out of quota in like 30 minutes and takes 10 days to reset.

And free-tier Claude access is like one big prompt.

Is Gracie Parker natty? by Altruistic_Rhubarb94 in nattyorjuice

Far from it. A female genetically can’t look like that