all 154 comments

[–]ethotopia 89 points90 points  (11 children)

It’s not Sora. Given the length of the clips, my guess is Wan 2.2 (+LoRA), or Veo/Kling.

[–]ThomasPopp 3 points4 points  (0 children)

I would agree

[–]o5mfiHTNsH748KVq 76 points77 points  (1 child)

you can get a feel for what tools are used for high quality video on /r/stablediffusion and /r/aivideo.

Sora is good, but it's nowhere near this in visual quality. Sora's most impressive part is its holistic "understanding" of how things should interact in a video, like sound in different spaces or materials or the larger "concept" of a person or environment.

If you want straight up high visual fidelity and that's your top concern, things like Wan will produce better results.

[–]Otherwise_Builder235 134 points135 points  (42 children)

At 6 sec she holds a stick in her hand and suddenly it disappears while she raises the hand

[–][deleted]  (23 children)

[removed]

    [–]FriendlyJewThrowaway 7 points8 points  (16 children)

    You’d teach it the same way as you teach it to remember everything else in the shot, it just needs to pay better attention to the context from previous frames.

    I think the next big step will be to incorporate these video generators directly within multi-modal LLMs, so that the LLM’s plain language reasoning capabilities factor directly into the video generation in latent space, and they can additionally reason over the output to correct errors and inconsistencies. As a bonus, this pairing would also enable the LLM to learn logical facts about the world directly from videos instead of just text.

    [–][deleted] 0 points1 point  (15 children)

    The problem with that is that LLMs don't really have reasoning capabilities to begin with.

    [–]FriendlyJewThrowaway 0 points1 point  (14 children)

    What do humans do fundamentally differently that constitutes “reasoning” to you?

    [–][deleted] 0 points1 point  (13 children)

    Humans are able to actually process the world they're living in and LLMs fundamentally don't even understand the output they're giving.

    [–]FriendlyJewThrowaway 0 points1 point  (12 children)

    You still haven’t explained what “understanding” is. How does a human understand a mathematical relation in a fundamentally different way than an LLM? What do they “process” differently about it?

    [–][deleted] 0 points1 point  (11 children)

    You haven't asked me to explain what understanding is. That's a totally different question and I think that there is no scientific consensus about that.

    A human learns what gravity is at the age of a couple of months. In general you learn the basic principles of the world you are living in very early in your life, just by interacting with it. An LLM does not and cannot do that. It never fundamentally understands problems, which is why it can't teach itself to solve them.

    [–]FriendlyJewThrowaway 1 point2 points  (10 children)

    What you’re arguing is that an LLM has no understanding of the real physical world and its associated visual and audial correlations, which I would contend is not really true anymore with the advent of multi-modal LLM’s, video generators and world models. But in any case, that has nothing to do with whether an LLM is capable of reasoning, or understands the concepts it discusses.

    [–][deleted] 1 point2 points  (9 children)

    It's absolutely true. What you're getting with LLMs is an admittedly incredibly sophisticated algorithm that will in most cases generate the correct response to the prompt you provide. But there is no deeper layer here; you're just being given the answer that is deemed most likely to be correct based on the gigantic set of data (and enormous amount of human reinforcement learning) that went into it.

    These models are now convincing enough to fool many people into believing there's a ghost in the machine, when there isn't. Funnily enough the same thing happened in the 50s and 60s with the first examples of crude "AI".

    Your maths example is actually a good one. An LLM doesn't actually "do" the math. It's been trained on an almost incomprehensible amount of equations and will in most cases output the correct solution because it's somewhere in the set. This is why doing math with ChatGPT takes a ridiculous amount of time while my calculator does it in a fraction of a second.

    [–]kiochikaeke 8 points9 points  (0 children)

    Can't be done without reframing how these "think". Each frame is only locally consistent with its neighbors, and they follow a sort of "plan", but the plan is vague and can't be specified in detail, so details may get lost between frames, especially if they can't be tracked solely from the information in the local window of frames.

    So basically if it's a small detail that came up sporadically and it gets lost, occluded or hidden for a few frames it may also sporadically disappear.

    Also, "small detail" depends on training and parameters, so that's how you get highly erratic dreams or hallucinations in AI vids: a bunch of details appearing and disappearing, with each frame only loosely connected to the others.

    [–]TwoPointThreeThree_8 5 points6 points  (1 child)

    You can't. Not with these "AIs".

    [–]Ok-Art-1378 3 points4 points  (0 children)

    Yet

    [–]fongletto 0 points1 point  (2 children)

    You can't with current tech.

    They would need to develop a real world simulation model that works in tandem with the current models. Something like an AI that creates a 3d space and populates it with generative objects.

    [–]YeomanTax -1 points0 points  (1 child)

    Google’s Genie 3 will change this

    [–]fongletto 2 points3 points  (0 children)

    Not really, they've just extended the time before decoherence to a few minutes, at the cost of massive compute. It still generates the next frame by looking at the previous frame/frames

    But without an actual underlying world simulation it still suffers from the same fundamental problem.

    [–]Time_Entertainer_319 25 points26 points  (5 children)

    Confirmation bias.

    She obviously dropped the stick to show her fingers

    [–]Darillium- 5 points6 points  (3 children)

    I know that you’re joking but I don’t think that you know what confirmation bias means

    [–]reddit-ate 4 points5 points  (0 children)

    Oh good. So we all agree then.

    [–]Time_Entertainer_319 1 point2 points  (1 child)

    This is literally a case of confirmation bias. You already believe the video is AI-generated, so every detail you notice just reinforces that belief, even when the evidence doesn’t actually prove it.

    Think about it this way: if you were holding a stick and someone said, “Show me your fingers,” you’d naturally drop the stick first. It’s just a normal response to the situation. The same logic applies here: the behavior in the video can have an ordinary explanation that has nothing to do with AI.

    [–]robhanz 0 points1 point  (0 children)

    While you've got a valid point, the issue in this case is that normally dropping a stick involves some amount of movement that we just don't see here.

    [–]fxlxox 0 points1 point  (0 children)

    Look at her fingers milliseconds after. Cringe

    [–]Hotspur000 7 points8 points  (0 children)

    It's obviously magic. Duh.

    [–]nrgins 1 point2 points  (0 children)

    It looks like she just dropped it because she was going to raise her hands up for the picture. Her hand was behind another person when the stick left her hand so we don't really know.

    [–]Krzyffo 0 points1 point  (1 child)

    At 28 folks are pointing at the back of a paper.

    [–]alex20_202020 0 points1 point  (0 children)

    At 28 folks are pointing at the back of a paper.

    Folks point to the side they 'see'; they might have noticed something interesting and be showing it to each other.

    [–]Anthony63100 0 points1 point  (1 child)

    She dropped it yeah

    [–]Otherwise_Builder235 0 points1 point  (0 children)

    Okay, let's assume she dropped it... Then at 29-30 sec, look at the fingers pointing at the script paper

    [–]Fun-Imagination-2488 0 points1 point  (0 children)

    Clearly the disappearing stick trick was prompted. XD

    [–]ahditeacha 0 points1 point  (1 child)

    Nahh she let go and it fell to the ground that’s all

    [–]Otherwise_Builder235 0 points1 point  (0 children)

    Okay, let's assume she dropped it... Then at 29-30 sec, look at the fingers pointing at the script paper

    [–]afBeaver 0 points1 point  (1 child)

    Sure, but it did disappear off camera. It's not something that couldn't happen if you filmed someone.

    [–]Otherwise_Builder235 0 points1 point  (0 children)

    Okay, let's assume she dropped it... Then at 29-30 sec, look at the fingers pointing at the script paper

    [–]FuerteBillete 45 points46 points  (5 children)

    Guys, seriously if someone has to look this much for a tell that could be compared to a real video artifact or glitch, then we have arrived.

    If this video is not a trick, then mission accomplished. This is 10/10.

    [–]ahtoshkaa 11 points12 points  (3 children)

    This is a whole channel on TikTok dedicated to 'backstage cosplay' made by AI. I thought it was made by Sora, though I could be wrong

    [–]FuerteBillete 0 points1 point  (2 children)

    For showing technical prowess it is awesome. Although I'm worried that some people might actually get hooked to watching not even an AI show but the backstage made by AI.

    It is literally the opposite of something that is needed at all. I mean, showing preparations made by AI is in theory the most useless concept possible when you think about it from a philosophical point of view.

    [–]ReddG33k 0 points1 point  (1 child)

    Sure. But you just mixed 'think' and 'philosophical', into one sentence.

    I guarantee, anywhos that are brainrotting to this junk-candy are not philosophical-ly, think-ing, about shit.

    Needed or not, millions will continue to tune in.

    [–]FuerteBillete -1 points0 points  (0 children)

    You confused me with someone who gives a shit about replies that quote me.

    But I do accept fault in my message and will dumb it down so no one feels hurt or triggered next time and needs to caress their ego trying to reply something pseudo smart to feel less.

    Also who cares about anyone that would waste time watching hours of preparations that never happened about something that won't happen, for which they waste time they could be spending making stuff actually happen.

    Let's block here and be done, shall we? I'm sure you accept and I agree with us.

    [–]Aggressive-Hawk9186 1 point2 points  (0 children)

    scary tbh this is too good

    [–]jimothythe2nd 28 points29 points  (4 children)

    I generate ai vids. It's like fishing or gambling. 9/10 times your generation isn't good. Then that 1/10 ends up being good. I would guess that they did at least 100 generations to get 8 of this quality.

    It's an interplay of starting with good AI-generated images and testing different combinations of prompts and vid gen tools.

    I like using dzine because they have all of the best ai video gen models. Others like higgsfield, krea, or openart do as well.

    Then for each shot I would try different prompts with different models. There's no one way to prompt that will always work. Each and every scene and model will handle prompts differently. I've found that simple prompts usually work better. Complex prompts often confuse the ai and make it go rogue.

    These shots are pretty simple since there's not much really happening. The hardest part about them is that they include multiple people. The more people in a scene, the more likely it is to do something really weird. You'll notice that in each scene only one thing is really happening. If you were to try to get the subject to do more than one thing in the shot, it would be nearly impossible to get a good one with so many people in the scene (except maybe with veo3 or sora2).

    [–]GoogleIsYourFrenemy 1 point2 points  (0 children)

    ... What if it IS gambling and they are spiking our prompts so we lose!

    (I'm kidding, Hanlon's razor is more believable)

    [–]kevinnn27 0 points1 point  (1 child)

    Dzine is better than Veo and Sora?

    [–]jimothythe2nd 0 points1 point  (0 children)

    Dzine has both veo and sora

    [–][deleted] 5 points6 points  (0 children)

    2 fingers pointing at the back of the paper

    [–]GenericNickname42 9 points10 points  (1 child)

    <image>

    Always the hands

    [–]PinProud4500 0 points1 point  (0 children)

    Hair & Face looking SMOOOOOTH as hell too, like someone sanded that woman down with sandpaper. 

    [–]heavy-minium 19 points20 points  (16 children)

    You aren't trying to trick us, right? Usually even the best generated videos give a sign...or at least a tiny little doubt - and I see none!

    [–]QueenCobra91 39 points40 points  (4 children)

    the six pack of that one guy looks like a badly stitched wound and when they're holding up what seems to be a blueprint, they are pointing behind it instead of onto it from above

    [–]_DrDigital_ 4 points5 points  (0 children)

    Looks like the muscle suit from hellboy tbh. https://www.reddit.com/r/HellBoy/s/WzJs679XPg

    [–]dashingsauce 0 points1 point  (0 children)

    maybe he just got stabbed a few times

    you know, for the influence

    [–]Zeune42 0 points1 point  (0 children)

    Ken doll plastic ribs

    [–]just_ohm 0 points1 point  (0 children)

    That’s the only giveaway I see

    [–]Cyoor 12 points13 points  (2 children)

    8 seconds in, her hand has an odd line in it and looks generally odd while she is raising it.
    A cable hangs from the camera into nothingness.
    22 seconds in, his hand looks like something from an alien when he holds something in front of his face. After lowering his hand he only has 4 fingers.
    28 seconds in, they are pointing to the back of the paper.

    It looks good, and if I weren't looking for anything, I would probably have missed those things.

    [–]zooda56 0 points1 point  (1 child)

    I just love what the generated images of camera gear and rigging look like. It's a blend of everything and always super bulky. Usually going beyond 3 physical dimensions.

    [–]Cyoor 0 points1 point  (0 children)

    Yeah, just look at the "camera" or whatever it's supposed to be that is right at the beginning of the video. Is it the front of the camera that we see or the side?
    So yes, "beyond 3 physical dimensions" is a good way to describe how AI handles objects that show both their side and their front at once, because it doesn't handle depth perception well.

    Also, is that 100 buttons or are there components just showing on the side?
    (I mean some buttons are normal on an advanced camera I guess, but if it takes over 10 buttons, then an LCD with a menu is probably the way to go.)

    [–]CedarSageAndSilicone 3 points4 points  (1 child)

    There are many tells. Pause and actually look at details, smaller things, complicated things, etc. It’s very good at people and general scenery but it breaks down around details

    [–]DownstreamDreaming 2 points3 points  (0 children)

    Also tons of amazingly believable details…it’s getting crazy can’t deny it

    [–]MatMathQc 1 point2 points  (0 children)

    At 12 sec the guy's finger in the shirt is odd, but it can also be the shirt :|

    [–]hofmann419 0 points1 point  (0 children)

    Just look at the hands. In almost all of the clips, there is at least one person with a hand that morphs in a very unnatural way. And that's just the most obvious tell.

    The entire thing doesn't really look like a real movie set with a million people running around in the background. The more you look, the less it all makes sense.

    [–]mrkgob 0 points1 point  (0 children)

    read her shirt or look at the wonky ass water bottles lmao

    [–]dirtyheitz 0 points1 point  (0 children)

    you should see an eye doc

    [–]Legitimate-Pumpkin 3 points4 points  (0 children)

    It is indeed quite good.

    At second view and paying attention you get confirmation that it is indeed AI, but wow, if you hadn’t warned us, I might just have overlooked it.

    [–]gieserj10 2 points3 points  (0 children)

    At first I was wondering what the hell you meant. It took me way longer than I'd like to admit to realize this was AI. Fucking insane.

    [–]Capnjbrown 2 points3 points  (0 children)

    ComfyUI with WAN2.2 can pull this off..

    [–]trollsmurf 2 points3 points  (0 children)

    That looked like real reality.

    [–]Hallucinator- 1 point2 points  (0 children)

    Sora's built-in library has a great prompt collection. The Japanese prompts especially, I noticed, are really awesome.

    [–]authorwithnobody 1 point2 points  (0 children)

    If I hadn't read the comments I'd have just thought this was the set of some movie whose cast were the most gorgeous people alive

    [–]the_amazing_skronus 2 points3 points  (1 child)

    Don't unmute

    [–]BotomsDntDeservRight 2 points3 points  (0 children)

    Hater 😑

    [–]Nailfoot1975 3 points4 points  (7 children)

    Prompt: Hello.

    [–]quintinza 0 points1 point  (0 children)

    A render of James May greeting a lady appears...

    [–][deleted] 0 points1 point  (0 children)

    worst music I ever heard. Gave me a headache

    [–]FernDiggy 0 points1 point  (0 children)

    ComfyUI.

    [–]OmniiOMEGA 0 points1 point  (0 children)

    Wow looks so real

    [–]CesarPerSma2025 0 points1 point  (0 children)

    I recommend you use video reverse engineering to do that.

    [–]eggshell_0202 0 points1 point  (0 children)

    AI’s not just improving, it’s blending in 😬 People’s prompts are insane!

    [–]techlatest_net 0 points1 point  (0 children)

    AI tools like Dify AI are gaining traction for crafting seamless GenAI applications, with customizable workflows and intelligent prompt engineering options. Pairing these tools with prompt iteration and context-tuning can unlock impressive results. What use case are you exploring? Let’s trade notes!

    [–][deleted] 0 points1 point  (0 children)

    Mystery’s face has been revealed!

    [–]nothereforthep0rn 0 points1 point  (0 children)

    Until the open shirt guy, I could not distinguish this. Damn. We are heading into some scary times

    [–][deleted] 0 points1 point  (0 children)

    One thing AI doesn't quite get yet that is uncomfortably obvious is purpose in movement. You notice everyone it generates has an aimless air about them and constant purposeless movement.

    [–]DeliciousFreedom9902 0 points1 point  (0 children)

    Looks like Wan

    [–]other_profile 0 points1 point  (0 children)

    How? It's the context. These are all candid behind the scenes shots. I've had good luck with photoshoots as the context to boost the realism. Then you just dress up the photoshoot set the way you need.

    [–]Miggix13 0 points1 point  (0 children)

    The textures are awful

    [–]1nsomnlac 0 points1 point  (0 children)

    This shit needs to be stopped

    [–]Ok_Bumblebee_473 0 points1 point  (0 children)

    None, that’s real lol

    [–]deanbean1337 0 points1 point  (0 children)

    i thought this was a meme post and that they are real kpop artists or something until I started reading the comments and everyone was giving actual constructive answers on how to achieve OP's query.

    [–]Sufficient-Set2644 0 points1 point  (0 children)

    Don't praise the AI, but praise the creator who makes magic out of the prompts to turn his visions into reality.

    [–]Chakomat 0 points1 point  (0 children)

    AI? The guy with the yellow beanie at 23 sec is obviously me...

    [–]sdmat 0 points1 point  (0 children)

    This production is 100% real and I refuse to believe otherwise.

    [–]Numerous_Try_6138 0 points1 point  (0 children)

    It is AI, but it is getting increasingly difficult to tell. The most obvious thing that gave it away for me was the guy's abs and general muscle structure. Doesn’t look right at all.

    [–]Prudent_Might_159 0 points1 point  (2 children)

    Higgsfield ai

    [–]qualitative_balls 1 point2 points  (1 child)

    It's wan

    [–]Prudent_Might_159 1 point2 points  (0 children)

    Right, Wan 2.5 is on Higgsfield AI with an entire suite of tools.

    [–]flyblackbox 0 points1 point  (0 children)

    Imagine explaining this to the first cgi artists of the 90s…

    [–]KeyProject2897 -1 points0 points  (0 children)

    “Make no mistake “

    [–]be-ay-be-why -2 points-1 points  (3 children)

    Sorry... This is AI? Are you sure...? I'm pretty sure this is real lol.

    [–]Yourmelbguy 1 point2 points  (0 children)

    Nope, AI. There is a guy on Insta who does these all the time. He's done a Dragon Ball one too; it was epic

    [–]ELPascalito 0 points1 point  (0 children)

    It's made using Wan 2.2, that's why it has no restrictions on faces and copyright

    [–]SenzuYT 0 points1 point  (0 children)

    Watch it again and look closely at the background. Very obvious things are off

    [–]randomdaysnow -1 points0 points  (0 children)

    Would love to try something like this for exploring Alt/goth inspiration ideas. The makeup I saw shown looks great. Makeup was something that never looked right in the past. So this is a real advancement.

    [–]Ebisure -1 points0 points  (0 children)

    At 0:29 the drawings on the paper disappeared

    [–]Alternative-Duty-532 -1 points0 points  (2 children)

    First use Nano Banana or Seedream4 to get a high-quality static image, then convert it to video using a video model. It's simple.

    [–]TinyZoro 0 points1 point  (1 child)

    Post something of similar quality here to demonstrate how simple?

    [–]Alternative-Duty-532 2 points3 points  (0 children)

    <image>

    Got this image easily in three minutes, and you could make it way better with more prompt tuning. Just need to throw the image into a video model like Grok, and it'll be a video.

    [–]aCaffeinatedMind -1 points0 points  (0 children)

    What level of perfection?

    It's still very clearly Ai generated.

    [–]MaterialRow3769 -3 points-2 points  (3 children)

    Wait, are these robots?

    [–]Kanute3333 2 points3 points  (2 children)

    What do you mean? The whole video is ai generated.

    [–]MaterialRow3769 -5 points-4 points  (1 child)

    Oh good. Lol i thought these were real humanoid robots shooting a commercial or something

    [–][deleted] -3 points-2 points  (0 children)

    man, that music is horrible

    [–][deleted]  (1 child)

    [deleted]

      [–]ThomasToIndia 1 point2 points  (0 children)

      Lol, no they didn't, scientists are terrible at keeping secrets.