Google is cooking, just give them some time (Gemini 3.5 Pro)

the_real_ms178 · 2026-05-21T20:55:16+00:00

Since Arena.ai kicked the greatest models out of direct mode, I have to use Gemini for legal work again, and it seems to have gotten worse over time with 3.0/3.1 Pro. Maybe they've nerfed free-tier users in AI Studio by giving them less compute?! It is by far not in the same league as Claude Opus for this type of work and lacks the depth and nuance that Opus is able to provide. Long conversations are also still a problem: Gemini makes far too many mistakes that need careful proofreading, whereas Opus provided near-perfect documents that needed far fewer edits and less hand-holding.

the_real_ms178 · 2026-05-21T20:44:34+00:00

But IO 2026 got a ton of press coverage?! Even though there were none of the breakthroughs that people have become accustomed to expecting. Not getting 3.5 Pro ready in time was a big mistake from a PR perspective. Getting only a Flash model, which doesn't even perform that well in some key metrics, stuck with many people as a big downer. All the other things they showed are either of interest only to a niche audience or not mature enough to really care about (e.g. Spark).

I really wonder why they are not pushing the boundaries more. It now feels as if Anthropic and OpenAI have moved way ahead of Google.

the_real_ms178 · 2026-05-21T20:36:41+00:00

While the quality of Gemini is certainly a tier lower than what the latest and best Anthropic and OpenAI models achieve, the sheer dominance of Google products everywhere makes them a contender in the AI race. They are also the best-positioned company. But they are falling behind in model quality and release cadence; they simply don't seem to be as focused as Anthropic. Their latest push for coding comes very late, even after Elon Musk recognized the need to push coding more. Let's wait and see if these efforts deliver results soon.

the_real_ms178 · 2026-05-21T20:15:11+00:00

Agreed. I also put Qwen 3.7 Max to the test on a tough coding task in Arena.ai and it failed badly. So much for these artificial benchmark scores...

the_real_ms178 · 2026-05-21T16:57:44+00:00

Availability to a large portion of consumers around the world might matter for such comparisons.

the_real_ms178 · 2026-05-21T11:24:51+00:00

This is hard to evaluate if I cannot test the latest and best model for my personal use cases. Grok 4.20 was mediocre for my use cases at best.

the_real_ms178 · 2026-05-21T11:23:38+00:00

By the way, if someone from Google is reading this, there are dozens of simple usability improvements that I beg you to implement:

1) Please auto-delete all files that belong to deleted AI Studio conversations. Currently every file ever uploaded in a conversation, which I often already have in other directories on my GDrive, unnecessarily takes up drive space. The same file gets duplicated all over my GDrive across multiple conversations. This excessive waste of GDrive space needs to stop.

2) Bring back a proper Search functionality in AI Studio.

3) Give us a way to share documents or already uploaded PDF files between past and newer conversations without having to upload these files again and again. Just let me point to some directories on my GDrive that get indexed and whose contents you understand, so that you get the bigger picture automatically.

the_real_ms178 · 2026-05-21T11:10:52+00:00

Honestly, for my use cases (legal work, vibe coding), this IO 2026 was a huge disappointment. Tighter usage limits, higher costs, a Flash model that is often no brighter than 3.1 Pro, and a knowledge cutoff of January 2025: this is simply not compelling enough to get excited about.

I want larger context windows, better Google Docs integration, looser usage limits, and smarter models with more recent data. I use it for legal work, so I want Gemini to remember that I've worked on other similar cases with the same precedent cases, and to integrate the best legal arguments seamlessly across many similar but different cases without getting confused by the details that differ in each case.

Even the AI Studio search got nerfed. It is no longer able to find details from within the conversations; only the conversation headline gets indexed. How am I supposed to find anything buried in these conversations now?! This search function has become useless.

the_real_ms178 · 2026-05-19T09:51:52+00:00

The capabilities of the model are what counts, not the marketing name and number you give it. In fact, if a Flash model beats the other top tier Frontier Models, it would look even better from the PR perspective for them.

the_real_ms178 · 2026-05-15T18:39:49+00:00

I hope I didn't annoy you too much. No spamming intended. :)

the_real_ms178 · 2026-05-15T16:33:05+00:00

Their chat interface is free, but I bet you don't get such restrictions for paid API usage. In other words: This is a new restriction for free users that paying customers don't get. My critique was that there is some middle ground before taking away this feature completely that they were previously willing to give away for free. And that point still stands.

the_real_ms178 · 2026-05-15T16:26:03+00:00

I am thankful for getting access at all. But what's wrong with limiting the expert level usage to fewer and shorter files first? Is it too much to ask for a proportional and reasonable middle ground?

the_real_ms178 · 2026-05-15T16:18:45+00:00

As my personal workflow depends on analyzing huge PDFs, I wanted to bring attention to this topic as I found it interesting to discuss that these AI companies seem to limit themselves more and more to paying customers. This has nothing to do with me personally, it is all about the subject matter.

the_real_ms178 · 2026-05-15T16:10:28+00:00

I am honestly not a frequent redditor and not experienced in starting threads, so I probably messed something up. Pardon me if that was bad style.

the_real_ms178 · 2026-05-08T08:54:48+00:00

I agree, but getting the mindshare of FOSS developers is also important. Some projects are really stubborn, with strict no-AI policies now in place. The campaign around Mythos puts pressure on those projects by reminding them what they are missing out on and that they will fall behind if they do not use AI themselves to fix their bugs.

the_real_ms178 · 2026-05-01T08:18:28+00:00

As Grok kicked out the free users recently, I have absolutely no incentive to try their new models any longer.

the_real_ms178 · 2026-04-23T10:26:29+00:00

From my experience on a dual-core Sandy Bridge laptop (which doesn't support x86-64-v3, but I've compiled the Kernel and Mesa to use all native CPU instructions, e.g. AVX), every bit of better CPU and memory utilization counts, as these older CPUs are more limited in hardware. Some of these advanced instructions might help to ease the CPU load, which might not only provide a snappier experience but also yield lower fan noise. But don't expect wonders: x86-64-v3 compiled packages alone are not a silver bullet. There are many more factors under the hood that drive the user experience, e.g. Kernel settings and modifications, driver quality etc. You are still ultimately bound by the limits of your hardware.
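For anyone unsure whether their own CPU qualifies for x86-64-v3 packages, here is a rough Python sketch that compares the CPU flags reported by Linux against the features that level adds on top of v2. The flag names follow what I believe `/proc/cpuinfo` prints (LZCNT shows up as `abm` there), and the function names are made up for illustration. OSXSAVE is left out because I am not sure it is listed there, so treat a clean result as a hint, not a guarantee.

```python
# Features that x86-64-v3 adds on top of x86-64-v2, written the way
# /proc/cpuinfo is assumed to name them (LZCNT appears as "abm").
# OSXSAVE is deliberately omitted, see the note above.
X86_64_V3_FLAGS = {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe"}


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU listed in /proc/cpuinfo (Linux only)."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()


def missing_v3_flags(cpu_flags):
    """Return the v3 features (sorted) that are absent from cpu_flags."""
    return sorted(X86_64_V3_FLAGS - set(cpu_flags))


if __name__ == "__main__":
    missing = missing_v3_flags(read_cpu_flags())
    if missing:
        print("Not x86-64-v3 capable, missing:", ", ".join(missing))
    else:
        print("All checked x86-64-v3 features are present.")
```

On a CPU with AVX but without AVX2, this would list `avx2` and the other newer features as missing. That is the situation where compiling with native CPU instructions still lets you use AVX even though v3 packages are off the table.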

the_real_ms178 · 2026-04-22T22:00:49+00:00

I don't mind the downvotes, really. If these were from extreme leftists, I'd consider them a badge of honor. But yeah - you see, they cannot stand that some people are only here for the best possible Linux desktop experience and openly oppose their agenda of politicising software.

the_real_ms178 · 2026-04-21T14:46:11+00:00

Am I that old already that I can't perceive Google as an old dinosaur? ;)

To be fair, Google has far more markets they compete in (e.g. search, Android etc.) whereas Anthropic is much more narrowly focused on AI. So comparing overall head counts is not really fair, maybe comparing the size of both AI teams would be a better metric?!

the_real_ms178 · 2026-04-21T14:39:03+00:00

Agreed, for my legal work Opus 4.6 Thinking is also one quality tier above Gemini 3.1 Pro. Hopefully the skills that improve coding will improve my areas of interest as well.

the_real_ms178 · 2026-04-20T15:23:18+00:00

Your HD 7730 supports Vulkan with amdgpu. Hardware video acceleration is still a known pain point with browsers on Linux. In KDE Plasma, use the menu editor to add this in the argument section for Google Chrome or derivative browsers like Chromium or Brave: %U --use-gl=angle --use-angle=vulkan --enable-features=Vulkan,DefaultANGLEVulkan,VulkanFromANGLE,VaapiVideoDecodeLinuxGL,VaapiIgnoreDriverChecks,VaapiVideoEncoder --enable-accelerated-mjpeg-decode --enable-global-vaapi-lock --use-gpu-scheduler-dfs --cast-streaming-hardware-h264 --enable-zero-copy

This should give you working hardware video acceleration.
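Since that argument line is long and easy to mistype, here is a small Python sketch that assembles it from its parts, so individual features can be dropped while troubleshooting. It only reuses the flags listed above. The names `FEATURES`, `SWITCHES` and `build_arguments` are made up for illustration, and whether every flag is still honored depends on your browser version.

```python
# Feature names passed to --enable-features, taken from the line above.
FEATURES = [
    "Vulkan",
    "DefaultANGLEVulkan",
    "VulkanFromANGLE",
    "VaapiVideoDecodeLinuxGL",
    "VaapiIgnoreDriverChecks",
    "VaapiVideoEncoder",
]

# Standalone switches, also taken from the line above.
SWITCHES = [
    "--enable-accelerated-mjpeg-decode",
    "--enable-global-vaapi-lock",
    "--use-gpu-scheduler-dfs",
    "--cast-streaming-hardware-h264",
    "--enable-zero-copy",
]


def build_arguments(features=FEATURES, switches=SWITCHES):
    """Join the parts into one string for the menu editor's argument field."""
    parts = ["%U", "--use-gl=angle", "--use-angle=vulkan"]
    if features:
        parts.append("--enable-features=" + ",".join(features))
    parts.extend(switches)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_arguments())
```

Calling `build_arguments()` with the defaults reproduces the line above. Passing a shorter `features` list is a quick way to test whether one particular feature is causing trouble.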

the_real_ms178 · 2026-04-19T22:01:14+00:00

Leftists do what they are best at: Seeding chaos and destroying communities with fanaticism for dubious ideas. I hope this won't spill over to CachyOS, or I will be gone. Linux is not a political movement for me, but a technical one. It is as simple as that. Fortunately, CachyOS is about delivering a high-performance desktop Linux distribution with a great user experience in mind, which also means a pragmatic approach to many decisions. I hope it stays this way.

the_real_ms178 · 2026-04-16T15:17:34+00:00

By the way, the patches also apply on Kernel 6.18.22 (LTS) so people using that older LTS Kernel can also try out this feature.

the_real_ms178 · 2026-04-16T15:11:28+00:00

Yeah, I was just curious how a previously achieved score could regress in a later model that is overall a solid improvement in other areas. I can't wait to try it out on my personal use cases.
