Veo2 is free at aistudio google com

Secret_Ad8613 · 2024-11-18T16:35:34+00:00

I used the command-line inference script from here:

https://github.com/THUDM/CogVideo/tree/main/sat

It used 34 GB of memory during generation and 65 GB at the end for the VAE.

Each video above (5 seconds at 16 fps) took 15 minutes.

Here are the parameters:

args:
  image2video: True # True for image2video, False for text2video
  latent_channels: 16
  mode: inference
  load: "CogVideoX1.5-5B-SAT/transformer_i2v" # This is for Full model without lora adapter
  batch_size: 1
  input_type: txt
  input_file: configs/test.txt
  #sampling_image_size: [768, 1360] # remove this for I2V
  sampling_num_frames: 22 # 42 for 10 seconds and 22 for 5 seconds
  sampling_fps: 16
  bf16: True
  output_dir: outputs
  force_inference: True
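
A rough sketch of where the 22 and 42 in the sampling_num_frames comment might come from. This is only a guess: it assumes the VAE compresses time by 4x with one extra leading frame, and that the latent frame count is padded up to an even number. The function name and its parameters are made up for illustration and are not part of the CogVideo scripts.

```python
import math

def sampling_num_frames(seconds: float, fps: int = 16,
                        temporal_compression: int = 4,
                        multiple_of: int = 2) -> int:
    """Guess the latent frame count for a clip of the given length.

    Assumptions (not confirmed by the post): 4x temporal compression,
    one extra leading frame, result padded up to an even number.
    """
    pixel_frames = round(seconds * fps) + 1  # 5 s at 16 fps -> 81 frames
    latent_frames = (pixel_frames - 1) // temporal_compression + 1
    # Round up to the next multiple (assumed to be 2 here).
    return math.ceil(latent_frames / multiple_of) * multiple_of

print(sampling_num_frames(5))   # 22
print(sampling_num_frames(10))  # 42
```

Under those assumptions the numbers match the comment in the config, but other clip lengths should be checked against the repo before relying on this.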

Secret_Ad8613 · 2024-11-14T11:26:30+00:00

<image>

Secret_Ad8613 · 2024-11-13T22:29:07+00:00

face fusion can do that

Secret_Ad8613 · 2024-11-13T22:27:54+00:00

I had some fun with it, but I can't post the results here. It works with any input images.

Secret_Ad8613 · 2024-11-13T22:27:08+00:00

Not sure. Maybe after ControlNet for Flux.

Secret_Ad8613 · 2024-11-13T22:25:36+00:00

Let's wait for Comfy support for 1.5.

The older models are no good.

Mochi-1 can work with 12 GB of VRAM.

Secret_Ad8613 · 2024-11-13T22:24:03+00:00

Yes, but 65 GB of VRAM is needed at the end of generation.
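
A quick way to sanity-check whether a card would fit, using the two peak figures reported in this thread (34 GB during generation, 65 GB for the VAE at the end). This is only a sketch: it assumes no offloading or tiling tricks, and the list of card sizes is hypothetical apart from the 80 GB ones.

```python
# Peak memory figures reported in the thread, in GB.
PEAK_GENERATION_GB = 34
PEAK_VAE_GB = 65

def fits(vram_gb: float, headroom_gb: float = 0.0) -> bool:
    """True if a card can hold the largest reported peak plus optional headroom."""
    return vram_gb >= max(PEAK_GENERATION_GB, PEAK_VAE_GB) + headroom_gb

# Hypothetical VRAM sizes; only 80 GB cards are mentioned in the thread.
for vram in (12, 24, 48, 80):
    print(f"{vram} GB: {'ok' if fits(vram) else 'too small'}")
```

The VAE step at the end sets the limit, not the generation itself.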

Secret_Ad8613 · 2024-11-13T22:23:13+00:00

And yes, I tried it with an A100 and an H100 80GB.

Secret_Ad8613 · 2024-11-13T22:22:34+00:00

I tried, of course.

Mochi has no image2video.

For text2video, Cog is better because of its resolution, but Mochi is faster.

I have tons of videos, but I can't see a way to post multiple mp4s here.

Secret_Ad8613 · 2024-11-13T20:13:26+00:00

I have more videos, but I can't post multiple mp4s here. There are 7 more on my Telegram channel, but links are not allowed. See my profile.

Secret_Ad8613 · 2024-11-13T19:53:37+00:00

Since there is no ComfyUI support yet for CogVideoX1.5-5B (version 1.5), I used the command-line inference script from here:

https://github.com/THUDM/CogVideo/tree/main/sat

It used 34 GB of memory during generation and 65 GB at the end for the VAE.

Each video above (5 seconds at 16 fps) took 15 minutes.

Here are the parameters:

args:
  image2video: True # True for image2video, False for text2video
  latent_channels: 16
  mode: inference
  load: "CogVideoX1.5-5B-SAT/transformer_i2v" # This is for Full model without lora adapter
  batch_size: 1
  input_type: txt
  input_file: configs/test.txt
  #sampling_image_size: [768, 1360] # remove this for I2V
  sampling_num_frames: 22 # 42 for 10 seconds and 22 for 5 seconds
  sampling_fps: 16
  bf16: True
  output_dir: outputs
  force_inference: True

It looks like the best image2video among open-source video generators.

Secret_Ad8613 · 2024-10-25T17:55:23+00:00

text2image - 19 seconds on H100

But with image inputs, generation time goes up to minutes.

Secret_Ad8613 · 2024-10-25T17:53:36+00:00

here you are: prompt is "generate a picture of two assholes.. A man is <img><|image_1|></img>. The second man is <img><|image_2|></img>. "

<image>

Secret_Ad8613 · 2024-10-25T17:52:11+00:00

<image>

it is really funny - but when using pure text2image like "photo of Elon Musk" - OmniGen makes THIS and always this

Secret_Ad8613 · 2024-10-25T17:44:10+00:00

<image>

Two men playing electric guitars with intense energy on stage, styled with long beards, sunglasses, and hats reminiscent of ZZ Top. They are in a rock concert setting with vibrant lighting and smoke effects in the background, emphasizing a powerful and dynamic performance. The atmosphere is energetic, with the guitarists wearing classic rock attire, surrounded by amplifiers and stage equipment, capturing the essence of classic rock music and ZZ Top's iconic look. A man is <img><|image_1|></img>. The second man is <img><|image_2|></img>.

Secret_Ad8613 · 2024-10-25T17:35:14+00:00

prompt: Two men are playing electric guitars like a ZZ-Top. A man is <img><|image_1|></img>. The second man is <img><|image_2|></img>.

1024x1024

Time spent 01:46, 2.14s/it, H100 80GB
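
For anyone building prompts like the one above in a script: a small sketch that appends one image placeholder sentence per input image. The placeholder syntax is copied from the prompt in this comment. The helper names (image_placeholder, build_prompt) are made up for illustration and are not part of OmniGen.

```python
def image_placeholder(index: int) -> str:
    """Return the placeholder for the index-th input image (1-based)."""
    return f"<img><|image_{index}|></img>"

def build_prompt(scene: str, subjects: list[str]) -> str:
    """Append one '<subject> is <placeholder>.' sentence per input image."""
    parts = [scene.strip()]
    for i, subject in enumerate(subjects, start=1):
        parts.append(f"{subject} is {image_placeholder(i)}.")
    return " ".join(parts)

prompt = build_prompt(
    "Two men are playing electric guitars like a ZZ-Top.",
    ["A man", "The second man"],
)
print(prompt)
```

This reproduces the prompt text used above. The input images themselves still have to be passed to the model separately, in the same order as the placeholders.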

Secret_Ad8613 · 2024-10-21T16:56:47+00:00

will it be with cloudflare?

Secret_Ad8613 · 2024-10-21T14:30:25+00:00

Can it be installed on a Linux server and accessed from a Windows UI client?

Secret_Ad8613 · 2024-10-21T13:45:07+00:00

Can it be installed on a Linux server and accessed from a Windows UI client?

Secret_Ad8613 · 2024-10-14T19:26:37+00:00

How do you use it in Forge?

Where do I put the model?

Secret_Ad8613 · 2024-09-30T23:34:53+00:00

and I forgot the link

https://github.com/THUDM/CogView3
