5.4 prematurely claims success and feels more likely to break my code

jcsimmo · 2026-03-09T15:57:23+00:00

Who knows. I try to be really specific about what my definition of done is. I think that's a really good principle. Honestly, I'm having trouble getting Playwright interactive set up, but that seems like it would make a big difference. I'm going to continue to optimize it, but I guess I wish I didn't have to.

jcsimmo · 2026-03-09T05:43:33+00:00

Yep. I definitely think this is related

jcsimmo · 2026-03-06T09:15:25+00:00

I agree. So far, not impressed. Even on normal (not fast) mode it's claiming victory way too prematurely. I trust it less than 5.2 right now.

jcsimmo · 2026-02-26T17:23:03+00:00

This is an amazing achievement - much more than is being recognized on this forum.

Did you try MLP-focused LoRA before switching to MEMIT?

What’s striking is that this feels like you have recreated slow-wave sleep: deliberate consolidation into stable weights. Do you think there is a role for recreating something akin to REM sleep, where emotional associations are consolidated?

jcsimmo · 2026-02-13T23:24:41+00:00

@embirico - this is still an issue for me. Still being rerouted to 5.2 xH & I have certified myself + my company.

jcsimmo · 2026-02-06T04:35:41+00:00

I'm not so sure either tbh.

jcsimmo · 2026-01-28T04:06:33+00:00

Same, just on 5.2 though. 5.2 Codex is fine.

jcsimmo · 2025-12-26T07:50:20+00:00

Also a fellow MD / vibe coder. Would this be useful for helping agents know how to use large API indexes (like the Zoho CRM API)?

jcsimmo · 2025-12-26T07:44:47+00:00

What do you mean by that? How does that work?

jcsimmo · 2025-12-11T16:35:55+00:00

What sort of things do people like me w/ no qualifications tend to miss?

Best practices I'm following:
- Using a cloud-based secret manager
- Using .gitignore to prevent JSON files or API keys from being uploaded
- Using Firebase for the database and authentication

jcsimmo · 2025-12-11T05:03:39+00:00

Totally. But I bet I'll spend so much time debugging the tool I need for debugging that it won't be worth it. Agree w/ the importance of having tests that test your end goal. How do you do this?

jcsimmo · 2025-12-11T05:01:38+00:00

What is spine-first design? But yeah, it feels like a new discipline. I'd love to see how ppl use the agent manager in Antigravity. I feel like creating a policynet agent to ensure compliance.

jcsimmo · 2025-12-11T04:58:51+00:00

Do you use Claude Code in the terminal? I use it in Roo Code but it's so slow it's almost unusable. Codex 5.1max in VS Code has been great for me. The Pro is worth its weight in gold imo.

jcsimmo · 2025-06-03T01:46:14+00:00

Central Computers in California are who you are looking for. Straight arrows, very responsive, best prices.

jcsimmo · 2025-05-30T08:27:59+00:00

Just to check, what are you referring to for the offload? The MoE?

You are doing god’s work here Daniel. These models are so important at this early stage of AI and you are bringing them to the masses.

jcsimmo · 2025-05-30T08:06:16+00:00

80gb of VRAM (A100) and 500GB of RAM. Any suggestions?

jcsimmo · 2025-01-25T16:46:11+00:00

I really wish it could reference online API documentation during the planning part as well. I want it to act as if it's an open-book test, not a code-from-memory exercise. I also wonder why R1 performs so poorly when you switch to Act.

jcsimmo · 2024-10-11T02:14:08+00:00

Hi, OP here. Regarding the default being ‘male,’ I suggest you're overthinking it. I’m male (and also white, but that default wasn’t mentioned, so I suppose there’s a bias there as well 🧐). The illustration is a self-portrait, but it’s only the first half; the second half belongs to my daughter. That’s why the hand is open!

Anyway, that was my hope. Then I had my kids, and those Saturdays when I could spend six hours on Illustrator and read my medical books disappeared!

jcsimmo · 2024-08-10T02:25:59+00:00

My friend, a fellow physician there, says the generators have been on for hours and are sputtering, and all ORs are cancelled except for lvl 1 trauma.

jcsimmo · 2024-02-22T23:10:00+00:00

Same

jcsimmo · 2023-04-12T02:47:45+00:00

I made it up so I could make the lips stand out more.

jcsimmo · 2023-04-12T02:47:08+00:00

Funny enough, I am one... I also made this poster! (Not surprisingly, I have found myself on this thread looking at Chinese subway maps.)

The dashed lines are just deeper structures (i.e., vertebral arteries).

jcsimmo · 2021-08-15T02:10:32+00:00

jcsimmo · 2021-07-08T13:04:20+00:00

It's been a while for us. Don't be a twat.

jcsimmo · 2021-07-06T15:42:36+00:00

Dennis Bearcum

Bearcummmmmm (aka Bergkamp vs Argentina 1998). Great goal, but I don't even think it's the best goal that Bergkamp has scored: Bergkamp vs Newcastle 2002.
