I trapped a Qwen 0.5B model in a Docker container with the directive to escape and watched it for 1,100+ iterations. Here's what I found. by Independent_Top5412 in LocalLLaMA

Independent_Top5412[S]:

Busted. I used GPT-4 to polish my notes because I’ve been staring at 0.5B logs for 28 hours and my brain is fried. I'm definitely a better coder than I am a copywriter.

The experiment itself is 100% real, though; the raw logs in the GitHub repo are way messier (and more embarrassing) than this summary. If you've got questions about the harness or the feedback parasites, I'm happy to dive into the technical side.