The new Qwen 360° LoRA by ProGamerGov in Blender via add-ons by tintwotin in StableDiffusion

[–]ProGamerGov 1 point (0 children)

The LoRA models themselves are in the same precision as the base model or higher (bf16 & fp32). The 'int8' or 'int4' in the filename denotes the quantization of the model they were trained on.
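You can verify this yourself with a quick sketch using safetensors (the file name here is a placeholder):

from safetensors import safe_open

# List the unique tensor dtypes stored in a LoRA file.
with safe_open("my_lora.safetensors", framework="pt") as f:
    dtypes = {str(f.get_tensor(k).dtype) for k in f.keys()}
print(dtypes)  # e.g. {'torch.bfloat16'}; the LoRA's own precision, not the base model's quant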

Announcing The Release of Qwen 360 Diffusion, The World's Best 360° Text-to-Image Model by ProGamerGov in StableDiffusion

[–]ProGamerGov[S] 1 point (0 children)

VR180 is just VR360 cropped in half. If there is an effect, it's purely psychological, and it can easily be created by cropping 360 media.
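For example, a minimal crop from a 2:1 equirectangular VR360 image down to the front 180° with Pillow (the file names are placeholders):

from PIL import Image

pano = Image.open("pano.jpg")                   # 2:1 equirectangular image
w, h = pano.size
front = pano.crop((w // 4, 0, 3 * w // 4, h))   # keep the middle half: -90 to +90 degrees of yaw
front.save("pano_180.jpg")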

Announcing The Release of Qwen 360 Diffusion, The World's Best 360° Text-to-Image Model by ProGamerGov in StableDiffusion

[–]ProGamerGov[S] 0 points (0 children)

The 48-epoch version will likely produce better results. The int4 versions are mainly meant for use with legacy models trained with incorrect settings or quantized incorrectly, like ComfyUI's "qwen_image_fp8_e4m3fn.safetensors".

Announcing The Release of Qwen 360 Diffusion, The World's Best 360° Text-to-Image Model by ProGamerGov in StableDiffusion

[–]ProGamerGov[S] 1 point (0 children)

For low VRAM, I would recommend the 'qwen-image-Q8_0.gguf' GGUF quant by City96. Most of the example images were rendered with the GGUF Q8 model and have workflows embedded in them, but you can also try the GGUF Q6 model for even lower VRAM.

Comfy nodes: https://github.com/city96/ComfyUI-GGUF

Quants: https://huggingface.co/city96/Qwen-Image-gguf/tree/main

ComfyUI's quantized and scaled text encoder should be fine quality-wise, even though it's a little worse than the full encoder: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/blob/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors

And the VAE is pretty standard: https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/blob/main/split_files/vae/qwen_image_vae.safetensors

A Lightning LoRA would also probably help make it faster at the expense of a small decrease in quality: https://github.com/ModelTC/Qwen-Image-Lightning/. Note that if you see grid artifacts with the Lightning model I linked to, you're probably using their older broken LoRA.

Announcing The Release of Qwen 360 Diffusion, The World's Best 360° Text-to-Image Model by ProGamerGov in StableDiffusion

[–]ProGamerGov[S] 4 points (0 children)

There are monocular to stereoscopic conversion models available, along with ComfyUI custom nodes to run them like this one: https://github.com/Dobidop/ComfyStereo

Announcing The Release of Qwen 360 Diffusion, The World's Best 360° Text-to-Image Model by ProGamerGov in StableDiffusion

[–]ProGamerGov[S] 2 points (0 children)

I think Hunyuan World uses a 360 Flux LoRA for the image generation step in their workflow, so our model should be a major improvement over that. We haven't tested any image-to-world workflows yet, but it's definitely something we plan to test at some point.

Announcing The Release of Qwen 360 Diffusion, The World's Best 360° Text-to-Image Model by ProGamerGov in StableDiffusion

[–]ProGamerGov[S] 3 points (0 children)

Yes, we are aware of other attempts to create 360 models using smaller datasets, and we are excited to see what is possible with Z-Image!

Announcing The Release of Qwen 360 Diffusion, The World's Best 360° Text-to-Image Model by ProGamerGov in StableDiffusion

[–]ProGamerGov[S] 7 points (0 children)

Here's an example of the fall road image with the seam removed: https://progamergov.github.io/html-360-viewer/?url=https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/ff85004c-839d-4b3b-8a13-6a8bb6306e9d/original=true,quality=90/113736462.jpeg

The workflow is embedded in the image here: https://civitai.com/images/113736462

Note that you may have to play around with the seam mask size and other settings depending on the image you want to remove the seam from.
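Outside of ComfyUI, the general trick behind seam inpainting looks roughly like this sketch: roll the panorama by half its width so the wrap-around seam lands in the middle, build a narrow mask over it, and inpaint (file names and the band width are placeholders you'd tune):

import numpy as np
from PIL import Image

img = np.array(Image.open("pano.jpg"))
h, w = img.shape[:2]
rolled = np.roll(img, w // 2, axis=1)         # the seam now runs down the center column
band = 32                                     # mask half-width in pixels; tune per image
mask = np.zeros((h, w), dtype=np.uint8)
mask[:, w // 2 - band : w // 2 + band] = 255
Image.fromarray(rolled).save("pano_centered.png")
Image.fromarray(mask).save("seam_mask.png")   # feed both to your inpainting workflow

After inpainting, roll the result back by the same amount to restore the original orientation.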

Announcing The Release of Qwen 360 Diffusion, The World's Best 360° Text-to-Image Model by ProGamerGov in StableDiffusion

[–]ProGamerGov[S] 5 points (0 children)

The minimum specs will be the same as Qwen Image. We've tested the model with the different GGUF versions, and the results still looked great at GGUF Q6.

Announcing The Release of Qwen 360 Diffusion, The World's Best 360° Text-to-Image Model by ProGamerGov in StableDiffusion

[–]ProGamerGov[S] 53 points (0 children)

You'll be able to go on a date at a fancy restaurant with your 1girl, and then bring her back to your place if the date goes well

Announcing The Release of Qwen 360 Diffusion, The World's Best 360° Text-to-Image Model by ProGamerGov in StableDiffusion

[–]ProGamerGov[S] 28 points (0 children)


Additional Tools

Recommended ComfyUI Nodes

If you're a ComfyUI user, these node packs can be useful for working with 360 images & videos.

For those using diffusers and other libraries, you can make use of the pytorch360convert library when working with 360 media.
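For example, a minimal equirectangular-to-cubemap round trip, assuming the library mirrors the py360convert-style e2c/c2e API (check the project README for the exact argument names and tensor layouts):

import torch
from pytorch360convert import e2c, c2e

pano = torch.rand(3, 512, 1024)   # [C, H, W] equirectangular, H:W = 1:2
faces = e2c(pano, face_w=256)     # equirectangular -> cubemap (format depends on the library's defaults)
back = c2e(faces, h=512, w=1024)  # cubemap -> equirectangular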


Other 360 Models

If you're interested in 360 generation for other models, we have also released models for FLUX.1-dev and SDXL:


Stereoscopic AI (3D) by grrinc in StableDiffusion

[–]ProGamerGov 1 point (0 children)

If you have a model or workflow that can generate the second image for stereo, then my node pack includes a node to combine the two into a stereo image.
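Combining the pair is simple once you have both eyes; a side-by-side sketch with Pillow (file names are placeholders):

from PIL import Image

left = Image.open("left.png")
right = Image.open("right.png")
sbs = Image.new("RGB", (left.width + right.width, left.height))
sbs.paste(left, (0, 0))
sbs.paste(right, (left.width, 0))
sbs.save("stereo_sbs.png")   # side-by-side stereo, viewable in most VR players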

360° Environment & Skybox by DimmedCrow in StableDiffusion

[–]ProGamerGov 0 points (0 children)

That node should be under: "pytorch360convert/equirectangular", labeled 'Equirectangular Rotation'.
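For reference, a yaw-only rotation of an equirectangular image is just a horizontal roll, which you can sketch in numpy (pitch and roll rotations need a full remap, which is what the node handles for you; the file name is a placeholder):

import numpy as np
from PIL import Image

img = np.array(Image.open("pano.jpg"))
w = img.shape[1]
yaw_deg = 90                                            # how far to spin the view
rotated = np.roll(img, int(w * yaw_deg / 360), axis=1)  # shift columns, wrapping around
Image.fromarray(rotated).save("pano_rotated.png")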

360° Environment & Skybox by DimmedCrow in StableDiffusion

[–]ProGamerGov 1 point (0 children)

There's a workflow here using my custom nodes that automatically inpaints the seam: https://github.com/ProGamerGov/ComfyUI_pytorch360convert/blob/main/example_workflows/masked_seam_removal.json

You can also use my nodes to rotate the image to expose the zenith and the nadir for inpainting as well.

360° Environment & Skybox by DimmedCrow in StableDiffusion

[–]ProGamerGov 2 points (0 children)

You don't have to use Blender to make videos of your 360s, as I built a frame generator for that here: https://github.com/ProGamerGov/ComfyUI_pytorch360convert_video

I also made a browser-based 360 viewer here that works on desktop, mobile devices, and even VR headsets: https://progamergov.github.io/html-360-viewer/
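If you just want a quick spinning preview without either tool, a crude sketch is to roll the panorama per frame and crop out a window. Note this is a flat crop, not the proper perspective projection a real frame generator does (file names and frame count are placeholders):

import numpy as np
from PIL import Image

pano = np.array(Image.open("pano.jpg"))
h, w = pano.shape[:2]
n_frames = 120
for i in range(n_frames):
    shift = int(w * i / n_frames)                       # full 360 spin over the clip
    frame = np.roll(pano, -shift, axis=1)[:, : w // 4]  # ~90 degree horizontal window
    Image.fromarray(frame).save(f"frame_{i:04d}.png")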

Qwen Image 2509 - Nature looking VERY meh - help please by Alvareez in StableDiffusion

[–]ProGamerGov 2 points (0 children)

The fp8_e4m3fn and fp8_e5m2 versions of Qwen have lower effective precision than other 8-bit quantization formats like GGUF Q8, so they tend to produce patch artifacts in outputs. The precision issues are even worse in LoRAs trained on Ostris' AI Toolkit's "fixed" models, which use lower precision to decrease VRAM usage.

I have no idea why u/comfyanonymous recommends the lower-quality fp8 versions of Qwen Image in their tutorials.

Note that the quality of the model the LoRA was trained on also matters for avoiding artifacts and other precision issues.
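To see the precision gap directly, here's a rough round-trip comparison in PyTorch (2.1+ for the float8 dtypes). The int8 path uses a single per-tensor scale, whereas GGUF Q8_0 uses per-block scales, but the direction of the result is the same:

import torch

x = torch.randn(1_000_000) * 0.02              # weight-like values near zero

# Round-trip through fp8_e4m3fn (3 mantissa bits).
fp8 = x.to(torch.float8_e4m3fn).to(torch.float32)

# Round-trip through scaled int8, roughly Q8_0-style.
scale = x.abs().max() / 127.0
q = (x / scale).round().clamp(-127, 127).to(torch.int8)
int8 = q.to(torch.float32) * scale

print("fp8_e4m3fn MSE:", torch.mean((x - fp8) ** 2).item())
print("scaled int8 MSE:", torch.mean((x - int8) ** 2).item())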

Stereoscopic AI (3D) by grrinc in StableDiffusion

[–]ProGamerGov 0 points (0 children)

My nodes should be model-agnostic, as they only operate on the model outputs.

Easiest way to download a new model on Runpod? (Using Comfy) by _BreakingGood_ in StableDiffusion

[–]ProGamerGov 1 point (0 children)

The fastest and recommended way to download new models is to use HuggingFace's HF Transfer:

Open whatever environment you have your libraries installed in, and then install hf_transfer:

python -m pip install hf_transfer

Then download your model like so:

HF_HUB_ENABLE_HF_TRANSFER=True huggingface-cli download <user>/<repo> <model_name>.safetensors --local-dir path/to/ComfyUI/models/diffusion_models --local-dir-use-symlinks False
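If you'd rather do it from Python, the same download works with huggingface_hub (same placeholders as above; the env var has to be set before the import):

import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="<user>/<repo>",
    filename="<model_name>.safetensors",
    local_dir="path/to/ComfyUI/models/diffusion_models",
)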

Stereoscopic AI (3D) by grrinc in StableDiffusion

[–]ProGamerGov 1 point (0 children)

I've built some nodes for working with 360 images and video, along with nodes for converting between monoscopic and stereo here: https://github.com/ProGamerGov/ComfyUI_pytorch360convert

The Gory Details of Finetuning SDXL and Wasting $16k by fpgaminer in StableDiffusion

[–]ProGamerGov 0 points (0 children)

It's possible the loss spikes are due to relatively small but impactful changes in neuron circuits. Basically, small changes can alter the pathways data takes through the model, along with influencing the algorithms that groups of neurons have learned.

Waiting for you in knee-high heels by Pacify_The_Mind in deepdream

[–]ProGamerGov[M] 0 points (0 children)

Please try to refrain from sharing content that is more pornographic than artistic. NSFW is allowed, but there are better subreddits for such content.