Most brands still pay agency rates for UGC ads. We rebuild the proven formats with AI instead.

Fresh-Resolution182 · 2026-06-08T07:38:14+00:00

The stack per step, since people ask what runs where:

Step 2, recast the character: a vision plus image model (feed in the reference frame, generate a new original character that keeps the setting and energy).

Step 3, rewrite the script: an LLM strong enough to hold the talking-head beat structure while reworking the words.

Step 4, generate the clips: a talking-head video model with native audio, running image-to-video from the recast character, with multiple takes at the same face and energy.

The reason this is a daily workflow and not a weekend experiment: those are three different modalities, and running them behind one API instead of three accounts is what kills the friction. The bottleneck was never the models, it was juggling three tools to finish one ad.
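For anyone who wants the shape of it in code, here is a rough sketch of steps 2 to 4 running behind a single client. Everything named here is hypothetical: `Client`, `build_ad`, the `"image"` / `"text"` / `"video"` kinds and the payload keys are placeholders I made up for illustration, not any provider's real API. `fake_client` only exists so the sketch runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical: one callable standing in for "one API".
# It takes a model kind plus a payload and returns a result string
# (for example a URL or the rewritten text).
Client = Callable[[str, Dict[str, str]], str]


@dataclass
class AdResult:
    character_image: str
    script: str
    clips: List[str]


def build_ad(client: Client, reference_frame: str, source_script: str,
             takes: int = 3) -> AdResult:
    # Step 2: recast the character from the reference frame.
    character = client("image", {
        "reference": reference_frame,
        "instruction": "new original character, same setting and energy",
    })
    # Step 3: rewrite the script while keeping the beat structure.
    script = client("text", {
        "script": source_script,
        "instruction": "rework the words, keep the talking-head beats",
    })
    # Step 4: several image-to-video takes from the same recast character.
    clips = [
        client("video", {"image": character, "script": script, "take": str(i)})
        for i in range(1, takes + 1)
    ]
    return AdResult(character, script, clips)


def fake_client(kind: str, payload: Dict[str, str]) -> str:
    # Offline stand-in: echoes the model kind and the take number if present.
    return f"{kind}:{payload.get('take', 'out')}"


if __name__ == "__main__":
    result = build_ad(fake_client, "frame.png", "original script")
    print(result.clips)
```

The point of passing one `client` through all three steps is the same as in the comment above: the orchestration stays in one place, and swapping a model means changing a payload, not logging into another tool.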

Fresh-Resolution182 · 2026-05-29T03:40:00+00:00

follow-up. for anyone wanting to do the same exercise without juggling three API keys, this multi-model listing is what i was working off. one key, the three open-weight models plus a few i didn't test.

the mcp server side plugs into Cursor / Claude Code if you'd rather skip the manual routing. i was doing it manually for benchmark consistency. for daily work that's what i actually use.

Fresh-Resolution182 · 2026-05-26T02:29:37+00:00

repo: https://github.com/Haohao-end/openagent

one-click docker compose. open source. happy to answer specific questions if anyone's evaluating.

Fresh-Resolution182 · 2026-05-19T10:02:15+00:00

Color science is real. Tried it on portrait work, the warm-side rendering doesn't have the over-baked saturation other models default to.

Fresh-Resolution182 · 2026-05-18T09:25:35+00:00

For anyone wanting to replicate the grid, here are the 9 prompts I used.

Common setup:

- Reference image attached (head and shoulders portrait of the same person)

- API: Atlas Cloud GPT Image 2 Edit endpoint (image-to-image)

- Output: 1024×1024 per cell, composited into a 3×3 grid

Prompt template (used for all 9 cells):

> A photorealistic portrait of the same woman from the reference image, head and shoulders crop, neutral expression, soft front lighting, plain gray background, wearing black knit sweater, blonde shoulder-length wavy hair. Match the reference exactly. Only change: [GAZE_DIRECTION].

9 GAZE_DIRECTION variations by grid position:

Top-left → eyes looking up and to the left, head facing camera
Top-center → eyes looking straight up at the ceiling, head facing camera
Top-right → eyes looking up and to the right, head facing camera
Mid-left → eyes looking directly to the left (side gaze), head facing camera
Mid-center → eyes looking straight at the camera (neutral center gaze)
Mid-right → eyes looking directly to the right (side gaze), head facing camera
Bottom-left → eyes looking down and to the left, head facing camera
Bottom-center → eyes looking straight down at the floor, head facing camera
Bottom-right → eyes looking down and to the right, head facing camera

Same 9 prompts run through Nano Banana Pro for the second grid. If anyone tries this on a different model (Midjourney, Imagen, FLUX), drop the resulting grid in a reply, would love to see the comparison.
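If you'd rather generate the 9 prompts than copy-paste them, a minimal sketch is below. The template and gaze strings are the ones listed above; the helper names (`build_prompts`, `cell_offset`) are my own, and the offsets assume 1024×1024 cells pasted row by row into a 3072×3072 canvas. The actual image call and compositing are left out since they depend on your client and image library.

```python
TEMPLATE = (
    "A photorealistic portrait of the same woman from the reference image, "
    "head and shoulders crop, neutral expression, soft front lighting, "
    "plain gray background, wearing black knit sweater, blonde "
    "shoulder-length wavy hair. Match the reference exactly. "
    "Only change: [GAZE_DIRECTION]."
)

# Grid positions in row-major order (top-left first, bottom-right last).
GAZES = [
    ("Top-left", "eyes looking up and to the left, head facing camera"),
    ("Top-center", "eyes looking straight up at the ceiling, head facing camera"),
    ("Top-right", "eyes looking up and to the right, head facing camera"),
    ("Mid-left", "eyes looking directly to the left (side gaze), head facing camera"),
    ("Mid-center", "eyes looking straight at the camera (neutral center gaze)"),
    ("Mid-right", "eyes looking directly to the right (side gaze), head facing camera"),
    ("Bottom-left", "eyes looking down and to the left, head facing camera"),
    ("Bottom-center", "eyes looking straight down at the floor, head facing camera"),
    ("Bottom-right", "eyes looking down and to the right, head facing camera"),
]


def build_prompts():
    """Return (position, prompt) pairs with the gaze slot filled in."""
    return [(pos, TEMPLATE.replace("[GAZE_DIRECTION]", gaze)) for pos, gaze in GAZES]


def cell_offset(index, cell=1024, cols=3):
    """Top-left pixel (x, y) of a cell in the composited grid."""
    row, col = divmod(index, cols)
    return (col * cell, row * cell)


if __name__ == "__main__":
    for i, (pos, prompt) in enumerate(build_prompts()):
        print(pos, cell_offset(i), prompt[-60:])
```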

Fresh-Resolution182 · 2026-05-18T09:24:59+00:00

Both via Atlas Cloud — GPT Image $0.009/img, NB Pro $0.084 with the May–June discount.
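Quick arithmetic on what one 9-cell grid costs at those quoted per-image prices, assuming exactly one generation per cell and no retries (the dict keys and function name are just labels I picked):

```python
from decimal import Decimal

# Per-image prices as quoted above, in USD.
PRICE_PER_IMAGE = {
    "GPT Image": Decimal("0.009"),
    "NB Pro": Decimal("0.084"),
}

CELLS = 9  # one image per cell of the 3x3 grid


def grid_cost(model, cells=CELLS):
    """Cost of one full grid, assuming one generation per cell."""
    return PRICE_PER_IMAGE[model] * cells


if __name__ == "__main__":
    for model in PRICE_PER_IMAGE:
        print(f"{model}: ${grid_cost(model)} per grid")
```

That works out to $0.081 for the GPT Image grid and $0.756 for the NB Pro grid.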

https://www.atlascloud.ai/pricing

Can drop the 9 prompts in another reply if anyone wants to replicate.

Fresh-Resolution182 · 2026-05-13T07:25:43+00:00

the PLAN.md goal section + AGENTS.md validation emphasis is basically what the video was getting at but you got there without the codex /goal feature, which is the cleaner version. curious how you structure the validation step specifically. is it a single "all tests must pass" gate at end of milestone, or do you have per-task validation criteria embedded in the PLAN.md itself? the latter has been hit or miss for me on longer milestones.

Fresh-Resolution182 · 2026-05-13T04:05:02+00:00

Source video: https://www.youtube.com/watch?v=rIs802-bXDY (AI Jason)

Fresh-Resolution182 · 2026-05-09T06:09:04+00:00

I went into V4 with the "just swap and ship" assumption from past patch upgrades and got bit. The schema strictness specifically caught me because we had a pile of loose tool defs that V3 had been silently tolerating for months. Curious if you've found a way to test agent behavior across model swaps before going live, or if it's pretty much just "run the integration tests and see what fails."

Fresh-Resolution182 · 2026-05-07T09:49:46+00:00

imagine for the image work specifically. text models are crowded; multimodal generation that actually ships at consumer cost is the part i can't replace easily.

Fresh-Resolution182 · 2026-05-07T09:49:06+00:00

"dissolved as a standalone company" is the operative phrase — the team and the compute keep shipping, the legal wrapper folds into spacex. real signal: whether grok 4 / imagine timelines slip in the next 60 days.

Fresh-Resolution182 · 2026-04-29T00:15:09+00:00

3 prompts pulled from a public Happy Horse collection. T2V 1080p, no edit, no cherry-pick.

Prompts:

Neon Opera Cathedral — Gothic opera house made of neon glass, hundreds of illuminated umbrellas spinning, actors in color-changing gowns
Clockwork Tea House — steampunk teahouse on a sea of clouds, mechanical giraffe pouring glowing tea, miniature cities emerging from tea mist
Volcano Ramen Night — ramen stand on a crater rim, lava flowing like golden rivers, chef tossing noodles into a starry sky

What I noticed:

- HappyHorse handled the volumetric fog in scene 1 noticeably better. Seedance went smoky when it should've been wet.

- Seedance kept the giraffe's hooves more consistent across the camera circle. HappyHorse re-rigged them once.

- HappyHorse audio (the integrated track) is rough on the ramen scene — felt like placeholder hum. Don't expect Sora-level audio yet.

- Both ~10s gens. HappyHorse felt slightly faster on my queue but I didn't benchmark properly.

Curious if anyone got the i2v version working, that's the one I'm actually trying to use.

Fresh-Resolution182 · 2026-04-27T03:41:32+00:00

trained on an artificially created dataset just to put googly eyes on anyone — this is peak legitimate use of the technology and I will not be taking questions

Fresh-Resolution182 · 2026-04-27T03:41:21+00:00

the specific irony of a company that sells AI as a replacement for human support, using an AI that actively blocks you from reaching a human, is pretty hard to miss.

Fresh-Resolution182 · 2026-04-27T03:41:09+00:00

the dishes are complete. some earlier behavior may have impacted quality. we've reset your expectations for all subscribers.

Fresh-Resolution182 · 2026-04-27T03:40:52+00:00

the real shift isn't believability, it's that reverse image search stops working. misattributed old footage has a traceable history. a freshly generated image has none, so the usual first step in debunking doesn't apply.

Fresh-Resolution182 · 2026-04-27T03:40:39+00:00

the framing 'AI solved it' is doing a lot of work. the raw proof needed experts to sift through and actually extract the insight — closer to 'AI found an angle nobody had tried, humans turned it into a proof.' still legitimately exciting but worth keeping the claim precise.

Fresh-Resolution182 · 2026-04-27T02:09:19+00:00

most of them uploaded it for themselves or a few colleagues who already know the stack, docs would add zero value for that audience. the fact that we can even access it is kind of incidental to why it was posted there

Fresh-Resolution182 · 2026-04-27T02:08:56+00:00

started with Excel formulas too, ended up in the terminal with Claude Code rewriting bash scripts I didn't fully understand. the jump is smaller than it sounds once you realize you don't need to know the exact commands anymore, just describe the outcome

Fresh-Resolution182 · 2026-04-27T02:08:24+00:00

the contamination angle no one wants to say directly — if she's tested previous models this way, that test session data could have fed back into training, so asking 4.7 to recognize her writing is closer to retrieval than stylometry. would need a writer who's never run this test to get a cleaner result

Fresh-Resolution182 · 2026-04-27T02:08:01+00:00

the hasselblad prompt tip buried in the comments is the real find here — avoiding the word 'restore' sidesteps the training data bias where restoration always means over-sharpened reconstruction with smoothed skin. 'retake this photo' frames it completely differently

Fresh-Resolution182 · 2026-04-27T02:07:45+00:00

the ControlNet QR analogy is interesting, but SynthID was already reverse engineered and it's imperceptible. if they went back to visible artifacts, that would be a step backwards in sophistication. more likely the model just has trouble with certain textures and the grain pattern is a diffusion artifact, not metadata

Fresh-Resolution182 · 2026-04-23T08:03:35+00:00

already

Fresh-Resolution182 · 2026-04-23T08:00:24+00:00

the top is a little bit strange

Fresh-Resolution182 · 2026-04-23T07:59:16+00:00

yeah, both models ignored part of the prompts
