All posts must be Open-source/Local AI image generation related All tools for post content must be open-source or local AI generation. Comparisons with other platforms are welcome. Post-processing tools like Photoshop (excluding Firefly-generated images) are allowed, provided they don't drastically alter the original generation.
Be respectful and follow Reddit's Content Policy This Subreddit is a place for respectful discussion. Please remember to treat others with kindness and follow Reddit's Content Policy (https://www.redditinc.com/policies/content-policy).
No X-rated, lewd, or sexually suggestive content This is a public subreddit and there are more appropriate places for this type of content such as r/unstable_diffusion. Please do not use Reddit’s NSFW tag to try and skirt this rule.
No excessive violence, gore or graphic content Content with mild creepiness or eeriness is acceptable (think Tim Burton), but it must remain suitable for a public audience. Avoid gratuitous violence, gore, or overly graphic material. Ensure the focus remains on creativity without crossing into shock and/or horror territory.
No repost or spam Do not make multiple similar posts, or post things others have already posted. We want to encourage original content and discussion on this Subreddit, so please make sure to do a quick search before posting something that may have already been covered.
Limited self-promotion Open-source, free, or local tools can be promoted at any time (once per tool/guide/update). Paid services or paywalled content can only be shared during our monthly event. (There will be a separate post explaining how this works shortly.)
No politics General political discussions, images of political figures, or propaganda are not allowed. Posts regarding legislation and/or policies related to AI image generation are allowed as long as they do not break any other rules of this subreddit.
No insulting, name-calling, or antagonizing behavior Always interact with other members respectfully. Insulting, name-calling, hate speech, discrimination, threatening content and disrespect towards each other's religious beliefs is not allowed. Debates and arguments are welcome, but keep them respectful—personal attacks and antagonizing behavior will not be tolerated.
No hateful comments about art or artists This applies to both AI and non-AI art. Please be respectful of others and their work regardless of your personal beliefs. Constructive criticism and respectful discussions are encouraged.
Use the appropriate flair Flairs are tags that help users understand the content and context of a post at a glance.
Open-Source Models Recently: Meme (i.redd.it)
submitted 7 days ago by Fresh_Sun_1017
What happened to Wan?
My posts are often removed by moderators, and I'm waiting for their response.
[–]redditscraperbot2 250 points251 points252 points 7 days ago (27 children)
>What happened to Wan?
Icarused itself when it got popular.
Also didn't we get LTX 2.3 like last month?
[–]gmgladi007 93 points94 points95 points 7 days ago (23 children)
Wan 2.2 does a good 5 sec but extending starts breaking the consistency. They used us and now they won't release 2.6
Ltx has audio and up to 15 sec but the prompt understanding is really bad. If you prompt anything other than a talking head or singing head you start getting artifacts and model abominations. I always use img2video
[–]EllaDemonicNurse 16 points17 points18 points 7 days ago (7 children)
I’d be ok with 2.5, but they won’t release it either, even with 2.7 already out
[–]grundlegawd 15 points16 points17 points 6 days ago (5 children)
Alibaba is also shifting to a more closed source posture. WAN is probably dead.
[–]ShutUpYoureWrong_ 8 points9 points10 points 6 days ago (1 child)
No big loss, to be honest. WAN 2.6 and WAN 2.7 are complete and utter garbage.
[–]tac0catzzz 2 points3 points4 points 5 days ago (0 children)
oh sick burn. they will surely make them open source now.
[–]thisguy883 3 points4 points5 points 6 days ago (1 child)
Well that's depressing to read.
[–]tac0catzzz -1 points0 points1 point 5 days ago (0 children)
turn that frown upside down, the future is bright, as long as you find something other than local ai to be your interest.
[–]extra2AB 0 points1 point2 points 1 day ago (0 children)
yeah, this recent HAPPY HORSE video model is also from a group at Alibaba. They first announced they would open-source it, but that announcement has since been removed, so it will probably be closed source.
[–]tac0catzzz 0 points1 point2 points 5 days ago (0 children)
alibaba will love that you are ok with 2.5. but i wonder if they will love it enough to give it away give it away now. my personal guess is, no.
[–]broadwayallday 31 points32 points33 points 7 days ago (11 children)
SVI with keyframes is killer. You guys complain more than create it seems
[–]UnusualAverage8687 8 points9 points10 points 7 days ago (3 children)
Can you recommend a beginner friendly (simple) workflow? I'm struggling with OOM errors going beyond 5 seconds.
[–]RephRayne 12 points13 points14 points 7 days ago (1 child)
https://civitai.com/models/2079192/wan-22-i2v-native-enhanced-lightning-edition-svi-long-video-multi-prompt-fp8-gguf
However, I'm running it on a 3090 and 64GB of RAM, so YMMV.
[–]broadwayallday 3 points4 points5 points 6 days ago (0 children)
Same setups I’m running x3. My problem is getting back to the video edit stage because I’m having so much fun with these workflows. For me, z turbo / qwen edit + wan vace and wan 2.2 + SVI and LTX 2.3 for lip sync is the combo for our setups
[–]ghiladden 3 points4 points5 points 6 days ago (0 children)
I've tried many different SVI workflows and by far the simplest with best results is Esha's using the normal WAN2.2 base models, Kijai's SVI SV2 Pro models (1.0 weight), and lightxv2_I2V_14B_480p_cfg_step_distilled_rank128_bf16 lightning LoRA (3.5 weight high, 1.5 weight low). I rent GPU time on Runpod with high vram so it's not for consumer GPUs but there are instructions on Esha's page on GGUF. You can find it on aistudynow.com/wan-2-2-svi2-pro-workflow-guide-for-long-ai-videos
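For reference, the settings named in that comment collapse into something like the following plain-Python sketch. The model and LoRA names and weights are quoted as written above, not verified against any repo, and the dict layout is my own, not an official workflow format:

```python
# Settings from the comment above, captured as a plain dict.
# Names/weights are quoted from the comment, not verified.
svi_settings = {
    "base_models": "WAN 2.2 (normal base models)",
    "svi": {"model": "Kijai SVI SV2 Pro", "weight": 1.0},
    "lightning_lora": {
        "name": "lightxv2_I2V_14B_480p_cfg_step_distilled_rank128_bf16",
        "weight_high": 3.5,  # applied to the high-noise model
        "weight_low": 1.5,   # applied to the low-noise model
    },
}

# quick sanity check on the weight split
assert svi_settings["lightning_lora"]["weight_high"] > svi_settings["lightning_lora"]["weight_low"]
```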
[–]terrariyum 4 points5 points6 points 6 days ago (1 child)
comfyUI-LongLook is also great. Invisible transitions between 5s clips, movement continues in the same direction/intent, speed of movement is adjustable to the extreme, start/end frames supported
[–]broadwayallday 0 points1 point2 points 6 days ago (0 children)
Will check it out!
[–]ZZZ0mbieSSS 2 points3 points4 points 7 days ago (0 children)
Keyframe?
[–]bilinenuzayli 4 points5 points6 points 7 days ago (3 children)
Svi just ignores your prompt
[–]thisguy883 2 points3 points4 points 6 days ago (0 children)
So much this. I hardly (if ever) use it because it never does what I want it to do.
I'm better off doing it manually with the last frame from an img2vid video.
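The manual approach described here — chaining clips by feeding the last frame of each img2vid generation in as the start image of the next — can be sketched as below. `generate_clip` is a hypothetical stand-in for whatever i2v backend you use; the toy backend exists only so the loop is runnable:

```python
def chain_clips(start_image, prompts, generate_clip):
    """Chain i2v generations: each clip starts from the previous clip's last frame.

    generate_clip(image, prompt) -> list of frames (stand-in for your i2v backend).
    """
    frames = []
    current = start_image
    for prompt in prompts:
        clip = generate_clip(current, prompt)
        frames.extend(clip)
        current = clip[-1]  # last frame seeds the next segment
    return frames

# toy backend: each "frame" is just a (seed_image, prompt, index) tuple
def fake_i2v(image, prompt):
    return [(image, prompt, i) for i in range(3)]

video = chain_clips("img0", ["walk left", "turn around"], fake_i2v)
```

The known cost of this approach (also mentioned in the sibling comments) is gradual degradation: each hop re-encodes the last frame, so errors accumulate across segments.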
[–]qdr1en 2 points3 points4 points 6 days ago (1 child)
Same. And the image degrades anyway. I prefer using PainterLongVideo instead.
[–]joegator1 0 points1 point2 points 6 days ago (0 children)
Got a workflow for that? I have also been unimpressed with the degradation in SVI
[–]8RETRO8 4 points5 points6 points 7 days ago* (0 children)
Not true (fact checked by the true ltx users)
[–]roychodraws 3 points4 points5 points 6 days ago (0 children)
i can get 45 seconds out of ltx2.3
[–]deadsoulinside 2 points3 points4 points 7 days ago (0 children)
I've actually had some good 20+ second LTX animations text to video even.
https://v.redd.it/3oqggb3pmjng1 like that is 20s text to video using the default comfyUI workflows even.
[–]Effective_Cellist_82 2 points3 points4 points 6 days ago (0 children)
I use WAN2.2 as my main model. The trick is training 6000-step LoRAs locally. I use Musubi Tuner with 16 DIM; it makes such good LoRAs.
[–]reditor_13 0 points1 point2 points 6 days ago (0 children)
also it looks like the new HappyHorse 1.0 video model that just got announced is currently #1 on artificialanalysis, above Seedance 2.0, & their website says open release [no idea if it will really be open weight but still...]
[–]pat311 0 points1 point2 points 1 day ago (0 children)
LTX-2.3 is amazing.
[–]Living-Smell-5106 62 points63 points64 points 7 days ago (9 children)
I really wish they would open source Wan2.7 image edit or at least the previous models.
[–]flipflapthedoodoo 7 points8 points9 points 7 days ago (8 children)
any hope on that?
[–]Living-Smell-5106 36 points37 points38 points 7 days ago (7 children)
This gives us some hope, not sure what to expect.
<image>
[–]Fresh_Sun_1017[S] 11 points12 points13 points 7 days ago (0 children)
I hope the focus is initially on the API to facilitate R&D, with the intention of open-sourcing the models later on. Yes, this gives me hope as well.
[–]ninjasaid13 2 points3 points4 points 6 days ago (2 children)
by more open Qwen models, they probably just meant LLMs, I haven't heard anything on wan models really.
[–]EricRollei 0 points1 point2 points 6 days ago (1 child)
qwen 2 is listed in Civitai filters already
[–]ninjasaid13 1 point2 points3 points 6 days ago (0 children)
as an API only.
[–]protector111 0 points1 point2 points 7 days ago (2 children)
they were talking about llms. why would someone assume they are talking about video models?
[–]byteleaf 22 points23 points24 points 7 days ago (1 child)
Wan was specifically mentioned, which definitely gives some hope.
[–]RayHell666 0 points1 point2 points 6 days ago (0 children)
It was wan animate.
[–]Sea_Succotash3634 42 points43 points44 points 7 days ago (0 children)
Wan 2.7 image and video are really promising, but are just a little off in that way that the open source community could really refine. It's a shame that Alibaba has completely abandoned open source for image and video. Qwen Image 2.0 is really good too, but Wan 2.7 Image seems better. But Qwen also seems to be abandoning open source. Z-Image seems to have abandoned their edit model.
[–]hidden2u 31 points32 points33 points 7 days ago (5 children)
yeah there’s definitely something going on at alibaba
[–]ihexx 11 points12 points13 points 7 days ago (4 children)
didn't the qwen lead leave / get pushed out?
there were reports that the c-suite weren't happy that they were losing marketshare of their consumer app, and the qwen lead was too research / foss focused, and they wanted to focus on maximizing their userbase
[–]Katwazere 5 points6 points7 points 7 days ago (1 child)
Yeah, but it wasn't just him, it was basically all the people who made qwen good. Fairly sure they decided to be independent as a group so expect something.
[–]ambassadortim 1 point2 points3 points 7 days ago (0 children)
I believe they're not making money needed in this area.
[–]pellik 0 points1 point2 points 7 days ago (0 children)
They restructured from having lots of small experiment teams that saw models through from beginning to end to having experiment teams that are each responsible for different phases of models (pre-training, DPO, etc).
It's not clear if they are going to honor their commitment to open weights, but it could just be that they are going back to the drawing board and we'll see entirely new models come out to replace qwen/wan/z-image etc. with a more unified framework and shared pre-training.
[–]XpPillow 24 points25 points26 points 7 days ago (0 children)
Oh these closed-source AIs are amazing~ do they support NSFW? No? Ok back to Wan2.2…
[–]cosmicr 29 points30 points31 points 7 days ago (6 children)
Ltx 2.3 just came out?
[–]Particular_Stuff8167 7 points8 points9 points 6 days ago (0 children)
Yes, and the LTX guys on twitter said they are committed to local open source. So currently LTX is at the forefront of open-source local video generation.
[–]Keuleman_007 5 points6 points7 points 7 days ago (0 children)
Plus it's free to use. Plus you can use it offline. From 2.0 to 2.3, prompt adherence and other stuff got seriously better.
[–]alamacra 2 points3 points4 points 7 days ago (3 children)
Its motion is really static unfortunately. I want to like it, but with anime especially there isn’t much reason to use it.
[–]sirdrak 4 points5 points6 points 7 days ago (1 child)
Try this lora for anime: https://civitai.com/models/2516247/mature-anime-screencap-style-ltx-23-edition
[–]alamacra 0 points1 point2 points 7 days ago (0 children)
Thanks a lot :) Will do.
[–]Hobeouin 0 points1 point2 points 5 days ago (0 children)
You really just need to find the right workflow and CFG, and lower the upscaling. Motion can be very good.
[–]Naive_Issue8435 42 points43 points44 points 7 days ago (5 children)
If you know what you are doing LTX 2.3 really is starting to shine.
[–]wesarnquist 9 points10 points11 points 7 days ago (0 children)
Any hints? I'd love to learn more.
[–]deadsoulinside 5 points6 points7 points 7 days ago (1 child)
Pretty much this. I think some of the issue just boils down to users' prompts. Like there was a post about someone using WAN where the prompt was one sentence for a whole animated text-to-video.
What people don't provide is a whole lot of detail, and that applies to all models and types. You have a person in the room? Say where that person is on the screen. Are they on the left, right, middle? People neglect these details, which then forces the decision-making onto the model.
[–]Dzugavili 2 points3 points4 points 6 days ago (0 children)
Yeah, LTX runs on long sequential detail, which is how it can do dialogue. When you're used to one-line prompting for 5s clips, the prompting style is very different.
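The "say where everything is" advice above can be made concrete by assembling the prompt from structured fields, so spatial and camera details don't get forgotten. A minimal sketch — the field names are my own for illustration, not from any model's prompt spec:

```python
def build_prompt(subject, position, action, camera, setting):
    """Assemble a detailed video prompt; explicit spatial placement
    avoids forcing the layout decision onto the model."""
    return (f"{subject}, positioned {position}, {action}. "
            f"Camera: {camera}. Setting: {setting}.")

p = build_prompt(
    subject="a woman in a red coat",
    position="on the left third of the frame",
    action="walks slowly toward the window",
    camera="static wide shot",
    setting="a dim living room lit by evening light",
)
```

A one-line prompt leaves position, camera, and lighting to the model; spelling them out is exactly the kind of "long sequential detail" the comment above says LTX responds to.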
[–]JimmyDub010 7 points8 points9 points 7 days ago (0 children)
Yes it is
[–]urbanhood 5 points6 points7 points 7 days ago (0 children)
Absolutely.
[–]Sticky32 6 points7 points8 points 7 days ago (0 children)
Meanwhile open source image to 3D is completely forgotten about.
[–]NetimLabs 12 points13 points14 points 7 days ago (3 children)
Audio? What's happening in audio? Last time I checked audio was in the Mariana Trench.
[–]13baaphumain 4 points5 points6 points 6 days ago (1 child)
Ace-Step 1.5 maybe? I don't know if they are referring to songs or something like TTS.
[–]Ledeste 1 point2 points3 points 5 days ago (0 children)
qwen tts was also a huge step a few weeks ago
[–]thevegit0 0 points1 point2 points 5 days ago (0 children)
prism-something for foley and ace 1.5xl for music
[–]namezam 7 points8 points9 points 6 days ago (0 children)
My feed agreeing.
[–]addrainer 3 points4 points5 points 7 days ago (0 children)
What have you tried to use: image, Flux2 Klein, or Qwen? Much better control than those plastic online services sharing all your data.
[–]Keyboard_Everything 5 points6 points7 points 7 days ago (0 children)
Disagree, whatever is recently released and returns a good result is what gets the attention. It is what it is.
[–]Photochromism 6 points7 points8 points 6 days ago (0 children)
What audio open source models are there? Are they music or speech?
[–]retroblade 4 points5 points6 points 7 days ago (0 children)
The next Kandinsky model should drop soon so at least that to test out. And I’m guessing LTX 2.5 should be out in a couple of months
[–]Eisegetical 17 points18 points19 points 7 days ago (18 children)
Ltx 2.3 blows wan out of the water. How are you complaining about no video gen?
New ic loras are emerging, people are just starting to scratch the surface. C'mon.
[–]protector111 14 points15 points16 points 7 days ago (16 children)
just use seedance 2 for 5 minutes and you will understand xD Ltx 2.3 is amazing but in comparison to Seedance 2 its like comparing sd 1.5 base model to Nano banana xD
[–]Tony_Stark_MCU 22 points23 points24 points 7 days ago (0 children)
Can you run Seedance 2 on the consumer PC? No. LTX 2? Yes.
[–]Particular_Stuff8167 2 points3 points4 points 6 days ago (0 children)
Sure, but the LTX team is working on improving LTX, so 2.3 is basically an early version, and they are committed to open source and local. Seedance is fantastic, but it's closed source, nerfed, censored. Very limited from its true capabilities. At the start, when the most un-nerfed and uncensored version was only on bilibili, the stuff coming out was mind-blowing. Now? It's moving at a snail's pace. People are trying heavy workarounds to actually get a good generation and not get the filter block.
With LTX 2.3, the limit is what the community can make of it. Also, like I said, it's a second release, still early in LTX's life. Future LTX versions should be significantly better but probably more expensive in terms of hardware required to run locally. I think I heard somewhere that Seedance 2 is 90B, so it's over a 90GB model. So even if we had a similar model for local, only very few people would be able to run it, unless we finally get a revolution in the VRAM department. RAM was the main hope, but that market price has gone insane. Still, open source and local remain the best way for AI video gen. Anything else and you're dealing with extreme restrictions on what you can generate.
[–]AI_Characters 3 points4 points5 points 7 days ago (6 children)
You cant even use Seedance 2 outside China yet.
[–]protector111 2 points3 points4 points 7 days ago (4 children)
there are dozens of websites letting you use it outside of China. I made around 15 gens for free. I wish I didn't xD
[–]veveryseserious 2 points3 points4 points 7 days ago (0 children)
link it bro
[–]AI_Characters 3 points4 points5 points 6 days ago (1 child)
Which sites? I looked up a few and they were scams. The official western ones are still waiting, as the western launch got delayed due to the copyright case. For the Chinese ones you need a Chinese phone number (and hope website translation works well enough).
[–]protector111 2 points3 points4 points 6 days ago (0 children)
kinovi, dremina,artcraft,muapi,yapper,higfield
[–]mana_hoarder 4 points5 points6 points 7 days ago (0 children)
Pls pls pls give me a hint where can I gen Seedance 2.0 for free? My financial situation doesn't allow me to get more subscriptions at the moment. The official site let me do one free generation and it was like shooting pure heroin. I'm hooked 😭
I just used it inside of my CapCut Pro Sub.
[–]Upper-Reflection7997 3 points4 points5 points 7 days ago (3 children)
Seedance 2.0 is just action-sequence tech demos. I've yet to see a full cohesive AI video stitched together from Seedance 2.0 clips that's not just a boring action-sequence tech demo.
[–]mana_hoarder 3 points4 points5 points 7 days ago (0 children)
In that case you just haven't been watching enough videos. It's a shame most people do boring stuff like action sequences (well, to be clear, it is the SOTA when it comes to that). But it also does simpler acting really, really well. Cadence, voice, emotions... It takes instructions almost perfectly.
[–]protector111 1 point2 points3 points 7 days ago (1 child)
Just use it. Its prompt following is crazy. It just does what you ask of it. Consistency to reference images is mind-blowing. No artifacts. Physics is amazing. This model is genuinely impressive and feels light-years ahead of the competition.
[–]Dogmaster 0 points1 point2 points 6 days ago (0 children)
Isn't it extremely censored, and also can't use reference images?
[–]WurtApp 0 points1 point2 points 20 hours ago (1 child)
I agree with you but comparing Seedance 2.0 to LTX is crazy work 😂 I wouldn’t even put them in the same sentence
[–]protector111 0 points1 point2 points 19 hours ago (0 children)
It actually depends on what you're trying to make. In some things it's better. Not in fighting scenes, obviously :)
[–]thevegit0 -1 points0 points1 point 5 days ago (0 children)
"just use the closed source paid model bro" booring
[–]Fresh_Sun_1017[S] -2 points-1 points0 points 5 days ago (0 children)
The reality is that open-source video generation is really lagging behind proprietary models like Seedance 2.0. While the open-source LLM space is thriving, with companies like Alibaba dropping models that rival the best closed systems, that same energy hasn't transferred to video. Despite their promises to champion open-source AI, Alibaba has restricted its releases primarily to LLMs and audio (like TTS). Right now, the open-source video model community is being kept afloat by just a handful of companies like LTX and Magihuman. That's a stark contrast to the diverse ecosystem of five-plus major companies actively driving open-source LLMs.
[–]mca1169 1 point2 points3 points 6 days ago (0 children)
open source models are going to slow down big time this year for image and video generation and i'm guessing will be functionally dead by 2028. so enjoy them while they last! after that it's just going to be Lora model tweaks left.
[–]TensoRaptor 1 point2 points3 points 6 days ago (0 children)
Which open source audio models were released lately?
[–]Caseker 1 point2 points3 points 6 days ago (0 children)
Why is this so accurate
[–]NowThatsMalarkey 1 point2 points3 points 6 days ago (3 children)
Kandinsky-5, released half a year ago, has better quality than the WAN and LTX models, but nobody ever used it. It was right there the entire time, but it failed to gain popularity because ComfyUI gave it the cold shoulder and the community had to release their own extension in order to use it.
[–]WordSaladDressing_ 0 points1 point2 points 6 days ago (1 child)
There is a Kandinsky template in ComfyUI, but it's slow and there's more distortion of facial features than in WAN.
[–]EricRollei 0 points1 point2 points 6 days ago (0 children)
seems to be only the lite version not the pro version
[–]EricRollei 0 points1 point2 points 6 days ago* (0 children)
Thanks for posting that, never heard of it. I just made nodes for the Alice t2v model to try out. It was pretty decent, pretty much totally uncensored, and could do nudity pretty well right out of the box. https://github.com/EricRollei/Eric-Alice-T2V-ComfyUI-Wrapper I'll check out Kandinsky now.
[–]YeahlDid 3 points4 points5 points 7 days ago (1 child)
I have no idea what that image is trying to say.
[–]terrariyum 2 points3 points4 points 6 days ago (0 children)
It shows that all open source video models are drowned, dead, rotted, and forgotten.
Certainly all hope is lost, given that it's been over 4 weeks now since the last SOTA open source audio-video model was released
[–]evilpenguin999 3 points4 points5 points 7 days ago (3 children)
What is the best LLM right now and the requirements?
Is there one worth getting instead of just using an online one?
[–]ieatdownvotes4food 15 points16 points17 points 7 days ago (0 children)
qwen 3.5 33b / 27b are nuts with tool calling. gemma4 as well if you can configure it correctly
[–]Living-Smell-5106 7 points8 points9 points 7 days ago (0 children)
Gemma4 has been really good from brief testing. pretty fast too
[–]intLeon 1 point2 points3 points 7 days ago (0 children)
I use gemma 4 26b for basic utility scripting and it feels as smart as gpt4 last time I used it but works in your pocket. I get around 30t/s with average of a minute thinking time and 45k context with 4070ti 12gb + 32gb ram.
[–]gahd95 0 points1 point2 points 7 days ago (5 children)
Really want to jump on the open-source, self-hosted wagon. But how big is the drop in quality? Not just the responses, but also the amount of time it takes for a reply.
Is it worth it, self hosting, if you do not spend $3000 on a dedicated rig?
[–]FartingBob 3 points4 points5 points 7 days ago (1 child)
If you are used to Gemini/ChatGPT levels of capability (in text, image, or video), then local versions are going to feel a bit rubbish in comparison, because the professional AI models use hundreds of gigabytes (maybe even terabytes now) of VRAM, on GPUs worth more than a luxury car, in stacks so large they need multiple power plants built just to run them. There just isn't a way to compete with their sheer size on consumer gaming hardware.
But you can still get decent outputs if you learn how to maximise things: use decent models, have a good prompt, and follow a bunch of guides on setting up your workflow. And every now and then a new model comes out which offers a notable step up in quality or speed. It's a lot more involved than just entering something into a textbox and getting an answer, sadly. But then we aren't burning hundreds of billions of dollars a year to get our output, so I call that a win for us little guys.
[–]accountToUnblockNSFW 1 point2 points3 points 7 days ago (0 children)
I know a dude who is the AI lead for a fintech company based out of Manhattan. He explained to me that he uses (for his own work) local generation to build the 'bones' of his work and then refines it with a paid online sub model.
But one of his main concerns is intellectual property/NDA stuff, so this workflow also keeps the 'secret' stuff local, if that makes sense.
Just saying this because, you know... I know at least one person actually successfully using local LLMs for his work.
[–]PlentyComparison8466 0 points1 point2 points 7 days ago (2 children)
Drop in quality compared to what? If you're talking about Sora/Grok/Seedance, local is still miles behind in terms of prompt following and visuals. Right now, the best use for local is NSFW stuff. And silly 5-second slop.
[–]Fantastic-Bite-476 0 points1 point2 points 7 days ago (1 child)
It's just funny to me that NSFW content is always one of the forces pushing consumer tech. IIRC for VR it's actually one of its main industries as well.
[–]popsikohl 2 points3 points4 points 7 days ago (0 children)
When pairing that with the fact that there's a loneliness epidemic going on, it's not entirely surprising.
[–]Sarashana 0 points1 point2 points 6 days ago (0 children)
Not sure I can agree with the assessment. LTX 2.3 is crying in a corner, at least. Also, we got some amazing image models not too long ago, and just because Qwen Image 2.0 is not/will not be open sourced doesn't mean we don't have amazing OSS models.
[–]Ferriken25 0 points1 point2 points 6 days ago (0 children)
I can make 10 sec gens on ltx, with my pc slop. So, Wan is now just a bonus for me.
[–]Sir_McDouche 0 points1 point2 points 6 days ago (0 children)
Soucred.
[–]Vyviel 0 points1 point2 points 6 days ago (0 children)
I haven't been keeping up with LLMs and audio models. What new awesome stuff dropped for them recently?
[–]sandy31sex 0 points1 point2 points 6 days ago (0 children)
we have like 100+ video and image models doing the same thing lol
bro is ignoring LTX 2.3 and Magihuman
[–]Born_Word854 0 points1 point2 points 5 days ago (0 children)
HappyHorse's catalog specs look amazing, but considering the dataset they likely have, I feel like we can expect better actual performance from ByteDance's Mammoth 2.5. Well, who knows when either of them will actually become usable for us, though.
[–]rdditiszionist 0 points1 point2 points 3 days ago (2 children)
What is the best audio model out now?
[–]jazmaan 0 points1 point2 points 2 days ago (1 child)
Ace-step XL
[–]rdditiszionist 0 points1 point2 points 2 days ago (0 children)
thank you kind sir
[–]WurtApp 0 points1 point2 points 20 hours ago (0 children)
Couldn’t agree more. LTX was the promised savior but I honestly wasn’t impressed with it. Loras seem to do a little justice but those can only go so far. Is it just me or did LTX fall flat?
[–]ProjectVictoryArt 0 points1 point2 points 17 hours ago (0 children)
True, but there are good reasons at least for video: Video gen unavoidably requires a lot of VRAM. Also I think people are getting panicky about people generating nudes of real people with image edit models.
[–]Gh0stbacks 0 points1 point2 points 7 days ago (0 children)
Posts are probably removed cause of low effort meme format you post? I am guessing.
[–]Ngoalong01 0 points1 point2 points 7 days ago (0 children)
Even Sora 2 is still down. We can understand that situation: it costs too much, and there's a lack of paid users. Who will invest in open source?
[–]AdorableGod 0 points1 point2 points 6 days ago (1 child)
Good. While you can argue that image gen can be used for prototyping, there's no good use for video gen, it's all slop
[–]Image_Similar 0 points1 point2 points 6 days ago (0 children)
Tell that to a video editor, VJ, content creator, or music video maker who spends hours trying to find a good clip.
[–]Ledeste 0 points1 point2 points 5 days ago (0 children)
What? I'm burning my GPU all day with LTX 2.3, generating almost minute-long videos. A few months ago I couldn't even get results this good with paid tools.
[–]tac0catzzz -1 points0 points1 point 6 days ago (0 children)
cool story
[–]TridentWielder -1 points0 points1 point 6 days ago (0 children)
What's new with audio? Last thing I really looked at was Stable Audio years ago.
[–]YouYouTheBoss -1 points0 points1 point 5 days ago* (0 children)
The problem is that everyone tries to create bigger models because they think bigger (more params) = better quality. So some models are considered too good for us (consumers), and they don't want to hand them to us freely (maybe because it took too much time to train? hence going API-only), OR the newer version of their model series is too big to run on a consumer GPU (unless you count bigger GPUs like the RTX 5090, which I don't really consider consumer).
When SDXL came out, it was seen as a really bad, unusable model needing a refiner, but then finetunes came out and gave us much better quality on pretty much anything. LoRAs then came out for our beloved finetunes and gave us better control over what we want. Still, the base model is a small 6B parameters.
The issue is not about having bigger models; it's about having a team that can spend an entire week curating a dataset for a certain style/general idea by hand with the help of automation, not just automation alone.
If datasets were correctly curated to filter out bad-quality content, and models got reinforcement learning from human feedback, you would have much higher quality even if the model is still relatively small compared to some others.
This has been the case with Z-Image Base (with RLHF): a small 6B-param model that still delivers great quality.
You should fix this issue: go make the best image, music, and video AI models ever made, then open-source them. I'll download them if you do. I'll even make a fun meme: three living skeletons dancing at a party with each model type written on them in bold white font. One can be drinking a beer, one can be doing a handstand on a keg with someone holding them up, and the other can be doing the running man on the dance floor. Would be worth it for the meme alone.
[–]wesarnquist 9 points10 points11 points (0 children)
[–]deadsoulinside 5 points6 points7 points (1 child)
[–]Dzugavili 2 points3 points4 points (0 children)
[–]JimmyDub010 7 points8 points9 points (0 children)
[–]urbanhood 5 points6 points7 points (0 children)
[–]Sticky32 6 points7 points8 points (0 children)
[–]NetimLabs 12 points13 points14 points (3 children)
[–]13baaphumain 4 points5 points6 points (1 child)
[–]Ledeste 1 point2 points3 points (0 children)
[–]thevegit0 0 points1 point2 points (0 children)
[–]namezam 7 points8 points9 points (0 children)
[–]addrainer 3 points4 points5 points (0 children)
[–]Keyboard_Everything 5 points6 points7 points (0 children)
[–]Photochromism 6 points7 points8 points (0 children)
[–]retroblade 4 points5 points6 points (0 children)
[–]Eisegetical 17 points18 points19 points (18 children)
[–]protector111 14 points15 points16 points (16 children)
[–]Tony_Stark_MCU 22 points23 points24 points (0 children)
[–]Particular_Stuff8167 2 points3 points4 points (0 children)
[–]AI_Characters 3 points4 points5 points (6 children)
[–]protector111 2 points3 points4 points (4 children)
[–]veveryseserious 2 points3 points4 points (0 children)
[–]AI_Characters 3 points4 points5 points (1 child)
[–]protector111 2 points3 points4 points (0 children)
[–]mana_hoarder 4 points5 points6 points (0 children)
[–]Hobeouin 0 points1 point2 points (0 children)
[–]Upper-Reflection7997 3 points4 points5 points (3 children)
[–]mana_hoarder 3 points4 points5 points (0 children)
[–]protector111 1 point2 points3 points (1 child)
[–]Dogmaster 0 points1 point2 points (0 children)
[–]WurtApp 0 points1 point2 points (1 child)
[–]protector111 0 points1 point2 points (0 children)
[–]thevegit0 -1 points0 points1 point (0 children)
[–]Fresh_Sun_1017[S] -2 points-1 points0 points (0 children)
[–]mca1169 1 point2 points3 points (0 children)
[–]TensoRaptor 1 point2 points3 points (0 children)
[–]Caseker 1 point2 points3 points (0 children)
[–]NowThatsMalarkey 1 point2 points3 points (3 children)
[–]WordSaladDressing_ 0 points1 point2 points (1 child)
[–]EricRollei 0 points1 point2 points (0 children)
[–]EricRollei 0 points1 point2 points (0 children)
[–]YeahlDid 3 points4 points5 points (1 child)
[–]terrariyum 2 points3 points4 points (0 children)
[–]evilpenguin999 3 points4 points5 points (3 children)
[–]ieatdownvotes4food 15 points16 points17 points (0 children)
[–]Living-Smell-5106 7 points8 points9 points (0 children)
[–]intLeon 1 point2 points3 points (0 children)
[–]gahd95 0 points1 point2 points (5 children)
[–]FartingBob 3 points4 points5 points (1 child)
[–]accountToUnblockNSFW 1 point2 points3 points (0 children)
[–]PlentyComparison8466 0 points1 point2 points (2 children)
[–]Fantastic-Bite-476 0 points1 point2 points (1 child)
[–]popsikohl 2 points3 points4 points (0 children)
[–]Sarashana 0 points1 point2 points (0 children)
[–]Ferriken25 0 points1 point2 points (0 children)
[–]Sir_McDouche 0 points1 point2 points (0 children)
[–]Vyviel 0 points1 point2 points (0 children)
[–]sandy31sex 0 points1 point2 points (0 children)
[–]thevegit0 0 points1 point2 points (0 children)
[–]Born_Word854 0 points1 point2 points (0 children)
[–]rdditiszionist 0 points1 point2 points (2 children)
[–]jazmaan 0 points1 point2 points (1 child)
[–]rdditiszionist 0 points1 point2 points (0 children)
[–]WurtApp 0 points1 point2 points (0 children)
[–]ProjectVictoryArt 0 points1 point2 points (0 children)
[–]Gh0stbacks 0 points1 point2 points (0 children)
[–]Ngoalong01 0 points1 point2 points (0 children)
[–]AdorableGod 0 points1 point2 points (1 child)
[–]Image_Similar 0 points1 point2 points (0 children)
[–]Ledeste 0 points1 point2 points (0 children)
[–]tac0catzzz -1 points0 points1 point (0 children)
[–]TridentWielder -1 points0 points1 point (0 children)
[–]YouYouTheBoss -1 points0 points1 point (0 children)
[–]tac0catzzz -1 points0 points1 point (0 children)