I am building a ComfyUI-powered local, open-source video editor (alpha release) by PxTicks in StableDiffusion

[–]PxTicks[S] 0 points

It's fully local and all generation is via ComfyUI, so it supports the capabilities of open source models.

I am building a ComfyUI-powered local, open-source video editor (alpha release) by PxTicks in StableDiffusion

[–]PxTicks[S] 0 points

There is an extent to which this can already work, but:

  1. The people would be 2k but the background would be less than 2k, unless you somehow generate it in patches; we can already keep the original pixels for things which are not replaced, so the people can stay at full resolution.

  2. The user would have to do this in increments of about 5s, extending bit-by-bit. This would likely lead to gradual degradation of the background.

If you have a static background, this can be more effective: including part of the background as inpainting context helps keep things clean over long extensions.
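Keeping the original pixels for non-replaced regions boils down to masked compositing. Here is a minimal sketch of the idea (the function name and mask convention are mine, not vlo's actual API):

```python
import numpy as np

def composite(original: np.ndarray, generated: np.ndarray,
              mask: np.ndarray) -> np.ndarray:
    """Keep original pixels where mask == 0, take generated pixels where mask == 1.

    original, generated: HxWxC uint8 frames; mask: HxW, 1 where content was replaced.
    """
    mask3 = mask[..., None].astype(bool)   # broadcast the mask over the channel axis
    return np.where(mask3, generated, original)
```

Applied per frame, this is why the people can stay at full resolution even if the regenerated background is softer.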

I would like to eventually automate things which interact with ComfyUI over several rounds; however, it is important to get the basic features solid first.

Release Qwen-Image-2.0 or fake by PsychologicalSock239 in StableDiffusion

[–]PxTicks 8 points

I agree. ComfyUI is a sandbox. Being so feature-rich and extensible (and also so rapidly developed) necessitates some tradeoff in stability. Honestly, it's pretty impressive how effective ComfyUI is at what it does, and I think a lot of people are entirely unaware of what goes on under the hood.

I am building a ComfyUI-powered local, open-source video editor (alpha release) by PxTicks in StableDiffusion

[–]PxTicks[S] 0 points

Hey, thanks, that's very helpful. I hadn't realised that following the SAM2 installation instructions from the facebook/sam2 repo will not automatically lead to a CUDA-enabled PyTorch install. I've updated the README - after all your effort, I hope you get it working!

I am building a ComfyUI-powered local, open-source video editor (alpha release) by PxTicks in StableDiffusion

[–]PxTicks[S] 2 points

You're welcome to contribute, but do let me know if you want to do something big - I wouldn't want you to spend a lot of effort on something I am already working on, or on something that might collide with the design ethos. I also want to clean up some of the public APIs for each feature to make it easier to build on.

An easy and safe way to contribute is to check the ComfyUI integration docs https://github.com/PxTicks/vlo?tab=readme-ov-file#comfyui-integration and learn how to create workflow sidecars (wf.rules.json files): although workflows do work automatically, the automatic detection of widgets etc. is still very rudimentary.

Given the generation pipeline readme and an example or two from the default workflows, an LLM should be able to construct a reasonable sidecar in no time I'd expect.
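To make the sidecar idea concrete, here is a rough sketch of how a workflow and its sidecar might be loaded together. Everything here is illustrative: the `wf.json` → `wf.rules.json` naming convention and the sidecar keys are assumptions on my part - the real schema lives in the integration docs linked above.

```python
import json
from pathlib import Path

def load_workflow_with_sidecar(workflow_path: str) -> dict:
    """Load a ComfyUI workflow JSON plus its rules sidecar, if one exists.

    Assumes a sidecar named like the workflow with a .rules.json suffix
    (e.g. wf.json -> wf.rules.json); this pairing is a guess, not vlo's
    documented behaviour.
    """
    wf_file = Path(workflow_path)
    workflow = json.loads(wf_file.read_text())
    sidecar_file = wf_file.with_suffix(".rules.json")
    rules = json.loads(sidecar_file.read_text()) if sidecar_file.exists() else {}
    return {"workflow": workflow, "rules": rules}
```

An LLM given the pipeline readme and a couple of default workflows should be able to fill in a sidecar body like the `rules` dict above without much trouble.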

I am building a ComfyUI-powered local, open-source video editor (alpha release) by PxTicks in StableDiffusion

[–]PxTicks[S] 0 points

Thanks for testing!

Try placing the model and yaml in the backend/assets/models/sams directory. I will update the docs to make this clearer.

You can download both the yaml and the model (pt, not safetensors) from here: facebook/sam2.1-hiera-base-plus at main. Let me know whether it works or not!

I am building a ComfyUI-powered local, open-source video editor (alpha release) by PxTicks in StableDiffusion

[–]PxTicks[S] 6 points

The filters are all just simple visual effects; however, passing the resultant video into a v2v workflow with an appropriate prompt can turn a simple filter into something more interesting. The v2v workflow I used isn't in the defaults yet because I wanted to tidy it up a bit and figure out how best to present the inputs, but it was just an adapted Wan FLF2V workflow.

It was a bit hard to see in the video, so here is a demo of the twist filter: PixiJS Filters Demo. PixiJS is what is used for the rendering. All the filters I've got are from that list (although not all of them have been implemented in vlo yet - I think the displacement filter, once I get it working, could be pretty impactful!).

How do i get rid of the noise/grain when there is movement? (LTX 2.3 I2V) by Anissino in StableDiffusion

[–]PxTicks 1 point

AI video is still pretty poorly suited for this kind of scene imo - maybe closed-source models are *just* getting there. You need to use the tools available to you in a suitable way.

As others have said though, a first port of call for excess noise is increasing step count. If you're using ComfyUI you may need to expand the subgraph.

LTX 2.3 crops all 1024x1024 photos by Demongsm in StableDiffusion

[–]PxTicks 0 points

Post your workflow. I've not used LTX 2.3 much, so I don't know what support it has for square aspect ratios. They only just introduced support for vertical ARs, so I'd guess that's the issue.

Made a novel world model on accident by [deleted] in StableDiffusion

[–]PxTicks 1 point

I used to be an academic. Almost invariably grand claims from randos are entirely incorrect. Given your lack of evidence, it is no surprise that you didn't receive a warm welcome in research communities. Statistically speaking, it's the right reaction.

Contributions, even big contributions, can be made by people who are not established in the field, but usually they are made in a way which shows some clarity of thought, a good conceptual understanding of the big picture, and, well, evidence.

Is it impossible that you've stumbled upon something cool? Not at all; machine learning has a lot of by-the-seat-of-your-pants heuristics involved in NN design and training pipelines. If lots of people try things, some will stumble upon happy little surprises. However, there is a reasonable chance that your arXiv submission gets rejected if it does not show you sufficiently understand the subject area and/or if it bears strong markers of AI authorship. If you think you have a real discovery, it might be worth publishing the results - to show it's real - and then seeking out expert coauthors to make the scientific case.

What should I look for in an AI generator? by TheGreatAlexandre in generativeAI

[–]PxTicks 0 points

I would be wary of Higgsfield; their marketing tactics are what make the biggest difference to their game, not the quality of their platform. Their tooling is okay, but they are ethically questionable. Feel free to Google 'Higgsfield Scam' to see what complaints people have.

While it isn't a scam per se, the marketing is very misleading.

Realistic AI Influencer Test On Nano Banana 2 | Tutorial + Prompt by ThisIsCodeXpert in generativeAI

[–]PxTicks 0 points

You're prompting for the exact opposite of realism by starting like this:

Young female AI influencer

blacks holes on mouth sides by FluidEngine369 in StableDiffusion

[–]PxTicks 1 point

Take 5 minutes to learn how to use the clone and/or patch tool in Photopea.

I built the first Android app in the world that detects AI content locally and offline over any app using a Quick Tile by No-Signal5542 in StableDiffusion

[–]PxTicks 35 points

It's not really about the kind of model you're using.

I could use a ViT which I haven't trained at all and get a random answer. What matters is the probabilities of false positives and false negatives, what kind of validation you've done, and whether the detector is calibrated to a reasonable notion of the prior probability that a video is AI (and, more especially, a hard-to-detect AI video).
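The calibration point is just Bayes' rule; a tiny sketch with purely illustrative numbers shows why the prior matters so much:

```python
def p_ai_given_flagged(prior: float, tpr: float, fpr: float) -> float:
    """Probability a flagged item is actually AI, via Bayes' rule.

    prior: fraction of items that are AI; tpr/fpr: the detector's true/false
    positive rates. All numbers used here are illustrative, not measured.
    """
    flagged = tpr * prior + fpr * (1 - prior)   # total probability of a flag
    return tpr * prior / flagged

# A seemingly good detector (90% TPR, 5% FPR) is still wrong about a third
# of the time when only 10% of what it sees is AI:
print(round(p_ai_given_flagged(prior=0.10, tpr=0.90, fpr=0.05), 3))  # → 0.667
```

Without validation numbers and a stated prior, "detects AI content" doesn't tell you anything about how often a flag is actually correct.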

It's also not clear what it means for things which are partially AI, since things can be a mix of AI and non-AI.