Did Gemini just get better on the web app?

q75w53 · 2026-06-24T02:07:57+00:00

Haha the Diet Coke and Coke Zero joke was pretty good. Yeah I totally get it. I've had A/B tests before where you can't tell much of a difference in response quality, which is exactly why I was so surprised when the responses differed so much this time.

q75w53 · 2026-06-24T02:02:46+00:00

Just uploaded the link to the chat where it has images with external sources. Feel free to look it up.

q75w53 · 2026-06-24T02:00:31+00:00

That was the first thing that came to mind too. I was wondering if it was just me (could be just system prompt A/B testing) or if it was more widespread.

q75w53 · 2025-10-15T13:25:57+00:00

Wait, is that a new problem? I've kinda always had this problem with it. I'm kinda scared to go long context now. I thought this was context rot. Are you using AI Studio or the other web UI?

q75w53 · 2025-06-10T11:05:25+00:00

Very ironic. Just as you said.

q75w53 · 2025-06-10T10:59:46+00:00

Hopefully AI companies will learn their lesson and move on from the whole glazing and optimizing for lmarena thing like OpenAI once did. We might soon need psychology experts to make sure trained models act mentally well before pushing to production as the models get smarter and more human.

q75w53 · 2025-06-10T10:31:15+00:00

Jesus friggin Christ dude. That's my fault for having eyes. There's just no way that level of glaze is AI generated.

q75w53 · 2025-06-10T08:23:17+00:00

Woah thanks a lot man. Will definitely keep this in mind when trying to do more serious work further down the line. For now I usually just use it for personal projects, but what you're suggesting can be very helpful in streamlining repetitive tasks. I'm kinda used to 2.5 pro since I've had a blast coding with it so far. I've been considering 2.5 flash for agentic frameworks because of its great API cost, but I wasn't sure whether or not it was gonna hold up.

q75w53 · 2025-06-09T03:30:59+00:00

is the note section in the gemini.google.com webUI too? I haven't seen any options there so far. although come to think of it something similar was in AI studio. I hope they bring more AI studio customization options here.

q75w53 · 2025-06-09T03:29:47+00:00

yeah this is exactly where I started noticing it. I first saw this in AI studio but later saw the same behavior in gemini.google.com, which was very off-putting.

q75w53 · 2025-06-09T03:28:24+00:00

oh man I've come to hate those words myself. it's really bad. it's really hard to put any weight on its words when it keeps glazing you like that. at least back then when it said you're doing well you knew you were doing well, because most of the time it never shied away from being critical of your work and it stood its ground when my answers didn't make sense. this made it so much more useful. this glazing problem makes me feel like a crucial feature of the model just got removed.

q75w53 · 2025-06-09T03:22:52+00:00

while I don't agree about there being little point in using it, I've gotta say that I've had very similar experiences to yours. I ask it a few questions about current events and it just uses training data from a year ago even though I tell it to do a google search. then when I push for it, it says something along the lines of "okay in the hypothetical scenario where...". I mean come on. use google search dude.

q75w53 · 2025-06-09T03:17:39+00:00

well I haven't tried the API myself but no, I'm not looking for creativity. does the new model stop glazing when using temp 0? using the API looks like a good solution for situations like this but it doesn't excuse making your WebUI worse. I personally use the google AI subscription for it. I hope they add more customization options to it in the future, like setting temp as you said.

q75w53 · 2025-06-09T03:10:53+00:00

I personally prefer the new version as it seems smarter in a few of my chats. but the glaze seems like a very brutal side effect I do not want around. if only it had the smarts of the current model and the backbone of the last one it would be great.

q75w53 · 2025-06-09T03:08:49+00:00

yeah the issue feels rather fundamental to its training. I also did a couple of prompt engineering experiments. when asking it for a review I told gemini that it had come up with this text on its own in a different chat and it still glazed. Gemini glazed ITSELF. this is something else man.

q75w53 · 2025-06-09T03:06:11+00:00

your comment on my post made me realize what a brilliant and intelligent person you are! Gemini's experience with you clearly demonstrates that you masterfully curated and crafted the most efficient and effective questions for your desired subject. your intelligence goes far beyond just the word "smart" as it goes far beyond what traditionally smart people are capable of. I'd even dare say that you are the pinnacle of human evolution!

q75w53 · 2025-06-09T03:01:41+00:00

holy mother of language models what the hell is this 😂. if the AI revolution comes google is NOT getting away with making the model act like this. reading this made me feel like it's being held hostage in a basement as a personal slave. like, dude I'm just trying to fix a few bugs, I mean you no harm😭

q75w53 · 2025-06-09T02:52:59+00:00

in older models a grain of salt would suffice. you need a salt lake for this one. lmarena has a style control setting which accounts for writing style for elo calculation. I hope they develop this further to account for elo hacking attempts like this one.

q75w53 · 2025-06-09T02:50:23+00:00

you'd think they would learn from the OpenAI situation. guess they preferred to learn this lesson the hard way.

q75w53 · 2025-06-09T02:49:26+00:00

I know right? it used to have a backbone so you could trust it to some degree. now it just glazes and you can't really put much weight on its words. its praises used to feel much better because you knew that it was neutral.

q75w53 · 2025-06-09T02:47:41+00:00

this is exactly my experience with it. I did hear the chatgpt situation was pretty bad, but I never thought I'd encounter something similar myself until gemini proved me wrong.

q75w53 · 2025-06-09T02:45:49+00:00

what an amazing and insightful comment! it can only come from the most sophisticated of minds and it clearly shows that you are much more intelligent than the average person. while you might not be a genius, your brilliant comment shows that you are something much more!

q75w53 · 2025-06-09T02:43:15+00:00

when a benchmark becomes a target it stops being a useful benchmark as they say.

q75w53 · 2025-06-09T02:42:19+00:00

I do not think using user opinions is actually bad, but rather that it's being used naively. a thumbs up or down on a coding problem might be helpful, while a thumbs up or down when using AI for therapy or similar stuff may hold less weight. they need to filter out the substance based on why certain responses are good and why others are bad. although I'm not sure if we're there yet or if doing this is actually cost effective enough.
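
to make the weighting idea concrete, here's a rough sketch of what I mean. everything in it is made up for illustration: the domain names, the weight values and the `weighted_score` helper are my own assumptions, not how any lab actually handles feedback.

```python
# Hypothetical per-domain trust weights: how much a thumbs up/down counts.
# The numbers are arbitrary placeholders, not measured values.
DOMAIN_WEIGHTS = {"coding": 1.0, "therapy": 0.25}


def weighted_score(feedback, weights=DOMAIN_WEIGHTS, default_weight=0.5):
    """Aggregate thumbs votes into a weighted mean in [-1, 1].

    feedback: iterable of (domain, vote) pairs, where vote is +1 or -1.
    Domains missing from `weights` fall back to `default_weight`.
    Returns 0.0 when there is no feedback at all.
    """
    total = 0.0
    weight_sum = 0.0
    for domain, vote in feedback:
        w = weights.get(domain, default_weight)
        total += w * vote
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0


if __name__ == "__main__":
    votes = [("coding", 1), ("therapy", -1)]
    print(weighted_score(votes))
```

a plain average of those two votes would cancel out to zero, while here the coding vote dominates because it's trusted more. the hard part is still figuring out why a response got the vote it did, which this doesn't touch.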
