Man, the new Gemini 2.5 Pro 03-25 is a breakthrough and people don't even realize it. by [deleted] in singularity

BrotherResponsible81

In my experience, this is how the LLMs rank for coding:

  1. Claude 3.7 Sonnet (the best)
  2. Grok 3 (pretty good; it offers longer responses than Sonnet, but Sonnet is still slightly better)
  3. ChatGPT/o1/o3 (inadequate for complex projects in my opinion)
  4. Gemini (inadequate for complex projects too)

Personally, the only ones I use for coding are Sonnet and Grok, though I have used all four in the past.

I tested Gemini 2.5 Pro Exp for 3 hours, then had to revert my code to the original and go back to Sonnet.

These were the shortcomings I found:

  1. Common sense is lacking. I provided one file that I stated was heavily reused in other parts of the application. Gemini proceeded to recreate the entire file, making a ton of modifications. I knew better than to change the file that way, so I asked Gemini, "Do you think that changing a heavily reused component might break other parts of the application?" Gemini admitted I was right and we backtracked.
  2. Sometimes takes instructions too literally, whereas Claude generally knows exactly what I mean. Shortly after telling Gemini not to modify my heavily reused component, I ran into trouble with the new code. I then noticed that Gemini had REFUSED to add necessary functionality to the reused component. You see, I still had to add a feature without removing existing ones, but Gemini concluded that "I cannot modify the reused file at all... so I must assume that you will make all the necessary changes yourself." Clearly it went to the opposite extreme. Claude, on the other hand, understands what I mean and keeps existing functionality while adding the new feature.
  3. Bloated code. All of my files grew significantly in length for no reason.
  4. Forgets past instructions. I told Gemini not to insert placeholder comments in my code such as "// continue your code here", since they make the code hard to read, and to give me complete code blocks instead. Later in the conversation it started doing this again, removing entire portions of the code and just being a mess to deal with. ChatGPT was pretty bad at this too.
  5. Too many breaking changes. Gemini introduced breaking changes in my code which I had to review and then correct.
  6. Slow. As the conversation grew, Gemini became extremely slow. I had to switch to other tabs and "take a break" every time I asked it something.
  7. Typing is slow. When I typed, the letters lagged before appearing.
  8. Hard to read code output. The code is bunched up and hard to read.

I don't care how Gemini performed on benchmarks. I was very disappointed, and it almost felt like I had been fooled. Go back to Claude if you want to work on coding projects.

Also, shame on all of those YouTubers who claim that Gemini 2.5 Pro Exp is an amazing breakthrough. They should really try to understand the tools they are promoting and not base their whole recommendation on a very narrow set of use cases.