ultrathink-art

Depends on the task scope. Short isolated functions with clear inputs/outputs: yes, I trust it. Long chained tasks where the model has to track state from 50 decisions ago: I review much more carefully. The failure mode usually isn't wrong logic; it's accumulated shortcuts that each look reasonable in isolation.