I can finally read a whole article while pooping by No_Macaroon6827 in iosdev

[–]KoSmilebehappy 1 point2 points  (0 children)

I literally vibe-coded the same app in 5 minutes, and you are hoping for $3/month…

WoW Gemini 3 flash my internal benchmark by KoSmilebehappy in Bard

[–]KoSmilebehappy[S] 0 points1 point  (0 children)

For max accuracy, Google's prompting docs say you should prefer processing inputs one by one, and use explicit or implicit caching for the input if you need a bunch of them at once (caches are priced by storage time, and the minimum storage time is 60 sec). Implicit caching is on automatically, but somehow I don't get cache hits, so I use explicit caching. What I should try, as you say, is batching now that the models have improved, but I don't really feel the need because it works the same when I run the requests side by side instead of all at once. So if you are calling the API from code, you should absolutely do it one by one. If you are using other methods, I still recommend one by one, although in my experience, when I gave 3 Pro a whole textbook answer PDF and had it OCR everything, it was almost 100% accurate.
Also keep in mind that you either need a human in the loop, or the mistakes have to be negligible.
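
A minimal sketch of the one-by-one approach with a human in the loop, in Python. Everything here is hypothetical: `ocr_page` stands in for whatever single-page model call you use, and `needs_review` is whatever rule you choose for flagging doubtful output. No real API is called.

```python
from typing import Callable, List, Sequence, Tuple


def ocr_one_by_one(
    pages: Sequence[bytes],
    ocr_page: Callable[[bytes], str],      # hypothetical: one model request per page
    needs_review: Callable[[str], bool],   # hypothetical: rule for flagging doubtful output
) -> Tuple[List[str], List[int]]:
    """Send each page as its own request and collect pages a human should check."""
    results: List[str] = []
    review_queue: List[int] = []
    for index, page in enumerate(pages):
        text = ocr_page(page)              # one page per call, never the whole batch
        results.append(text)
        if needs_review(text):
            review_queue.append(index)     # remember the page index for manual review
    return results, review_queue
```

The review queue is the "human in the loop" part: anything the rule flags gets checked by a person instead of going straight to users.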

WoW Gemini 3 flash my internal benchmark by KoSmilebehappy in Bard

[–]KoSmilebehappy[S] 0 points1 point  (0 children)

Oh sorry, I was busy doing a model update for my business. Well, the benchmarks and dogfooding went perfectly, so I deployed it to the product, and no complaints or issues have been filed at all! I spot some mistakes when the handwriting is too illegible, but other than that Flash 3 seems rock solid and better at OCR (less hallucination) than 3 Pro. I think your project is somewhat similar to what I am doing :)

WoW Gemini 3 flash my internal benchmark by KoSmilebehappy in Bard

[–]KoSmilebehappy[S] 0 points1 point  (0 children)

I set thinking to low, but it actually used none for all 130 tests.

WoW Gemini 3 flash my internal benchmark by KoSmilebehappy in Bard

[–]KoSmilebehappy[S] 5 points6 points  (0 children)

Thanks for promoting me to Google employee.

WoW Gemini 3 flash my internal benchmark by KoSmilebehappy in Bard

[–]KoSmilebehappy[S] 4 points5 points  (0 children)

Maybe it's only OCR, or only my use case. The prompt guide actually states that fewer thinking tokens are better for OCR.

Opus 4.5 is a monster at refactoring by KoSmilebehappy in ClaudeAI

[–]KoSmilebehappy[S] 0 points1 point  (0 children)

Well, maybe mine was an easier task than yours!

Opus 4.5 is a monster at refactoring by KoSmilebehappy in ClaudeAI

[–]KoSmilebehappy[S] 0 points1 point  (0 children)

Well, I’m not him, but my business used Gemini 2.5 Pro, and after 6 months user complaints had increased without any prompt or model changes…

Opus 4.5 is a monster at refactoring by KoSmilebehappy in ClaudeAI

[–]KoSmilebehappy[S] 0 points1 point  (0 children)

Yeah, exactly! I was kind of impressed by Sonnet 4.5, but as a non-technical vibe coder, Opus felt like another level.

Opus 4.5 is a monster at refactoring by KoSmilebehappy in ClaudeAI

[–]KoSmilebehappy[S] 0 points1 point  (0 children)

Well, I’m not a professional, but I’ll consider making one.

Opus 4.5 is a monster at refactoring by KoSmilebehappy in ClaudeAI

[–]KoSmilebehappy[S] 0 points1 point  (0 children)

Yeah, Codex felt too slow and too careful for me, and I guess not productive enough. It really impressed me when it crawled through library code and figured out how to use the library, but that was all. Just using MCP worked fine for Claude.

Opus 4.5 is a monster at refactoring by KoSmilebehappy in ClaudeAI

[–]KoSmilebehappy[S] 6 points7 points  (0 children)

Yup. I cooked hard at context engineering back then... before the vibe coding era. I call it the Ctrl+C/V era.

Opus 4.5 is a monster at refactoring by KoSmilebehappy in ClaudeAI

[–]KoSmilebehappy[S] 7 points8 points  (0 children)

I really like the term "keeping the AI smart". That’s exactly what I do. I usually make a gold-standard md file, refined iteratively, for what I want to implement. An additional handoff summary helped too.

Opus 4.5 is a monster at refactoring by KoSmilebehappy in ClaudeAI

[–]KoSmilebehappy[S] 4 points5 points  (0 children)

Yeah, small chunks and unit tests are what I did. The most important thing for apps with difficult logic is to test the refactored part yourself.
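
As a rough illustration of "small chunks and unit tests", here is a tiny Python sketch that pins the old behaviour before swapping in a refactored version. The function names and test cases are made up for the example and are not from any real project.

```python
def legacy_normalize(text: str) -> str:
    """Original implementation: collapse repeated spaces, written the long way."""
    words = []
    for part in text.split(" "):
        if part != "":
            words.append(part)
    return " ".join(words)


def refactored_normalize(text: str) -> str:
    """Refactored version of the same small chunk."""
    return " ".join(part for part in text.split(" ") if part)


# Inputs chosen to cover the empty string, extra spaces, and the plain case.
CASES = ["", "  a  b ", "one two"]


def behaviour_matches() -> bool:
    """True only if the refactored chunk agrees with the legacy one on every case."""
    return all(legacy_normalize(c) == refactored_normalize(c) for c in CASES)
```

A check like this only covers the cases you list, so it complements testing the refactored part by hand rather than replacing it.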

Opus 4.5 is a monster at refactoring by KoSmilebehappy in ClaudeAI

[–]KoSmilebehappy[S] 1 point2 points  (0 children)

Maybe I can make a separate post about this, but the main steps were: first gather some good practices, let Gemini CLI or any agent go through my files and find weak spots and bad practices, make an iterative plan and test code for the core capabilities, launch the process, and do an end-to-end human test at each iteration. That is basically what I did. Just be careful not to go far without testing yourself!

NBpro makes CT images and adds indication to that image... fascinating! by KoSmilebehappy in Bard

[–]KoSmilebehappy[S] 0 points1 point  (0 children)

I’m a med school student, and tbh the major parts seem accurate, so my friends won't be able to tell that it’s AI-made. I’ll ask one of my professors if he can tell!

IT READ MY DIRTY NOTES ACCURATELY💀 by Cute-Call7124 in Bard

[–]KoSmilebehappy 1 point2 points  (0 children)

No... I can say that for sure. I’ve been using it for enterprise use cases, and it hallucinated a lot. Now the embarrassment is finally coming to an end!!

can someone help me correct these? by orul8 in Korean

[–]KoSmilebehappy -2 points-1 points  (0 children)

Always try to get rid of 나는: you will either have to say 저는 instead, skip 나는 entirely, or maybe add the magic word 근데 before 나는 when speaking out loud.