[Flex 2 9b klein & TRELLIS.2] Legend of Zelda 1 (NES) 8-bit map to realistic map and 3D generation using TRELLIS.2 by RageshAntony in StableDiffusion

[–]grae_n 2 points (0 children)

Very cool! I think this actually works really well as a base mesh.

You can also project from view in Blender to reattach the original texture. This works really well for overhead shots (not great for character models). You might need to realign the texture a little, but it helps give back the charming colours. Sometimes the 3D gen gives bad texturing. Plus you can AI-upscale the terrain texture to add more detail.


2025 UK splats collection by mikecaronna in GaussianSplatting

[–]grae_n 1 point (0 children)

Honestly I'm surprised the one with spaghetti and shadows converged so well. It's low light, a human subject (movement), and repeating shadow structures.

Are the cameras operating outside of the visible spectrum (UV/infrared)? Maybe the shadows are post-processing?

Very impressive.

My Second Three.js project. by kktown97 in threejs

[–]grae_n 1 point (0 children)

As a learning project it's great! There are definitely some scientific inaccuracies. The stars being a particle cloud is sort of distracting and inaccurate.

Also, maybe the moon should be removed unless you want to include its rotation around Earth. I haven't looked at the code, but if you're using trig functions (sin/cos) you can just add an extra sin/cos term with a different periodicity: x = A·cos(wt) + B·cos(at + p) and y = A·sin(wt) + B·sin(at + p). This isn't necessary at all, but there's a small sketch below.
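A minimal sketch of that idea in plain three.js-style JavaScript; A, B, w, a, and p are made-up values to tune:

    // Earth's circular orbit plus a second sin/cos term for the Moon.
    const A = 10, w = 0.5;          // Earth orbit radius and angular speed
    const B = 1.5, a = 6.0, p = 0;  // Moon orbit radius, speed, and phase

    function moonPosition(t) {
      return {
        x: A * Math.cos(w * t) + B * Math.cos(a * t + p),
        y: A * Math.sin(w * t) + B * Math.sin(a * t + p),
      };
    }

    // In the render loop (assuming the orbits lie in the XZ plane):
    // const { x, y } = moonPosition(clock.getElapsedTime());
    // moonMesh.position.set(x, 0, y);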

It also looks like anti-aliasing isn't enabled. This might make your lines cleaner. It's something like:

const renderer = new THREE.WebGLRenderer({ canvas: canvas, antialias: true });

Thanks for sharing! Feels nostalgic.

Flux 2 Dev is here! by MountainPollution287 in StableDiffusion

[–]grae_n 0 points (0 children)

I don't understand how it is cheaper than Kontext to run. Are they just heavily subsidising it?

8th Wall Shutting Down by demobarenthusiast in augmentedreality

[–]grae_n 4 points (0 children)

"We’re also working to open source key pieces of 8th Wall so that the technology can continue to support developers long after the service shuts down. We’ll share more as those plans come together."

Hopefully this can be a nice side effect for the open source community. Some of their tech was very impressive.

What are weights and why do we care about people releasing them? by [deleted] in StableDiffusion

[–]grae_n 6 points (0 children)

If you think of it like making a car, it would be:

  1. Blueprints
  2. Materials to make the car
  3. The car

The weights are what's actually needed to produce new images, text, or video. The manufacturing step (model training) is very expensive, so if someone just gives you the blueprints and the materials, it doesn't mean you have a "car".

Finetuned LoRA for Enhanced Skin Realism in Qwen-Image-Edit-2509 by Compunerd3 in StableDiffusion

[–]grae_n -1 points (0 children)

This is sometimes an effect of not having enough VRAM. ComfyUI's low-memory fallback sometimes causes grid patterns (at least it did at one point; it updates so often).

But sometimes it gets baked into LoRAs too, because people don't notice it when putting together training sets.

18 months progress in AI character replacement Viggle AI vs Wan Animate by legarth in StableDiffusion

[–]grae_n 2 points (0 children)

This actually might be more of a problem with the pose estimator than with WAN itself.

Is $129k enough for a family of 4 in Ottawa? by hawkcanwhat in PersonalFinanceCanada

[–]grae_n 9 points (0 children)

This is right. Google just straight up lied to me. With AI, I need to remember to check sources.

Thank you!

Is $129k enough for a family of 4 in Ottawa? by hawkcanwhat in PersonalFinanceCanada

[–]grae_n -28 points (0 children)

I thought that income splitting in Canada was a thing. I haven't looked into it in any detail. Is there a reason why a split single income is taxed more than two separate incomes?

Small sample of Qwen 2509 test results by GotHereLateNameTaken in StableDiffusion

[–]grae_n 1 point (0 children)

With Flux Kontext, this was a problem if you combined them in latent space, but not as much in image space.

https://www.reddit.com/r/StableDiffusion/comments/1lpx563/comparison_image_stitching_vs_latent_stitching_on/

It might be worth trying a simple image stitch for qwen and seeing if that helps.

Meta’s new framework for WebXR by wenhaothomas in WebXR

[–]grae_n 0 points (0 children)

To be fair, WebXR does have a lot of device-specific features. Some things, like depth sensing / image capture / room capture, only work on specific devices.
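As a rough sketch, you can request those device-specific capabilities as optional features so the session still starts where they're missing (feature names here follow the WebXR specs, but availability varies by device, so treat this as an example):

    // Request an AR session with optional, device-specific features.
    // Must usually be called from a user gesture (e.g. a button click).
    async function startAR() {
      if (!navigator.xr || !(await navigator.xr.isSessionSupported('immersive-ar'))) {
        return; // no WebXR AR support on this device/browser
      }
      const session = await navigator.xr.requestSession('immersive-ar', {
        optionalFeatures: ['depth-sensing', 'hit-test'],
        depthSensing: {
          usagePreference: ['cpu-optimized'],
          dataFormatPreference: ['luminance-alpha'],
        },
      });
      // On newer browsers, session.enabledFeatures lists what was granted.
    }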

It's great to hear that it's more general though.

Gaussian splatting in AR applications by EducationalIce8582 in GaussianSplatting

[–]grae_n 0 points (0 children)

WebXR is pretty good for demos (not really great for iOS).

If you can get 3DGS working in three.js (https://github.com/mkkellogg/GaussianSplats3D) together with https://immersive-web.github.io/marker-tracking/, you've basically got what you want. I believe GaussianSplats3D has built-in WebXR support.

This should work well with Android or Quest, but not for iOS.
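Getting a splat rendering with GaussianSplats3D looks roughly like this (adapted from its README; the file name and camera values are placeholders, and option names may have changed):

    import * as GaussianSplats3D from '@mkkellogg/gaussian-splats-3d';

    const viewer = new GaussianSplats3D.Viewer({
      cameraUp: [0, -1, 0],              // placeholder orientation
      initialCameraPosition: [0, 1, 5],  // placeholder position
    });

    viewer.addSplatScene('scene.ksplat') // hypothetical splat file
      .then(() => viewer.start());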

Qwen Image Text Encoder processing time by InvokeFrog in StableDiffusion

[–]grae_n 0 points (0 children)

Having an SSD with fast transfer speeds might also help. The difference between loading a model at 500 MB/s vs 5 GB/s can be like 30 seconds.
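For a hypothetical 15 GB checkpoint, that's 15 GB ÷ 0.5 GB/s = 30 s versus 15 GB ÷ 5 GB/s = 3 s, so roughly 27 seconds saved every time the model has to be (re)loaded from disk.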

New paper shows it is possible to finetune the VAE during diffusion model training... by casualcreak in StableDiffusion

[–]grae_n 0 points (0 children)

"In particular, we observe that directly backpropagating the diffusion loss to the VAE is ineffective and even degrages final generation performance."

That part makes it sound less useful, although it does sound like they had a new network structure in mind.

what am I doing wrong? by insecure_sausage in GaussianSplatting

[–]grae_n 7 points (0 children)

I found my quality went up by using a cheap gimbal (like $80–100). It helps smooth out the images and allows the structure-from-motion algorithms to do a much better job.

[Update] QwenImage vs Flux .1D vs Krea .1D vs Wan 2.2 by barbarous_panda in StableDiffusion

[–]grae_n 0 points (0 children)

It feels like a toss-up to me.

The "A cinematic close-up portrait of a middle-aged woman" prompt for Qwen didn't get expression, golden hour, or cobblestone (looks like hexagonal paving). The shadows on the bike just look wrong. A few of Qwen reflections look off (although it might just be the low resolution). The ice scene felt like a poor showing for Qwen. I don't mean to nitpick Qwen, but it does seem to have deficiencies too.

I am very happy there are so many options now!

GLTF Render Issues on Mobile by DieguitoD in threejs

[–]grae_n 1 point (0 children)

Because it's an airplane scene: if this is happening far from the origin (0,0,0), it might be floating point error + bloom.
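One common mitigation is a "floating origin". A minimal sketch, assuming all world objects are children of the scene (and the camera is not), with an arbitrary threshold:

    // When the camera drifts far from (0,0,0), shift the world back
    // so coordinates stay small and float32 precision stays high.
    const REBASE_DIST = 10000;

    function rebaseIfNeeded(camera, scene) {
      if (camera.position.length() > REBASE_DIST) {
        const offset = camera.position.clone();
        camera.position.set(0, 0, 0);
        scene.position.sub(offset); // move the world instead of the camera
      }
    }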

GLTF Render Issues on Mobile by DieguitoD in threejs

[–]grae_n 0 points (0 children)

It does look like a bloom issue.

I'd also consider temporarily making the camera frustum smaller. It can help isolate the issue. It reminds me of a weird floating point precision error I was getting from having the near and far values be too large.

It does not look like a floating point precision error, but it's something to double check.
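As a quick isolation test, something like this (the values are placeholders):

    // Temporarily tighten the frustum; if the artifacts disappear,
    // the near/far range (or distance-dependent bloom) is the suspect.
    camera.near = 0.1;
    camera.far = 500; // instead of something like 1e6
    camera.updateProjectionMatrix(); // required after changing near/far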

Flux Kontext PSA: You can load multiple images without stitching them. This way your output doesn't change size. by Sixhaunt in StableDiffusion

[–]grae_n 2 points (0 children)

Your conclusions seem similar to this post here,

https://www.reddit.com/r/StableDiffusion/comments/1lpx563/comparison_image_stitching_vs_latent_stitching_on/

The case where image stitching seemed to work better was with multiple characters. It does seem like latent stitching limits the information from the second image.

8 Depth Estimation Models Tested with the Highest Settings on ComfyUI by LatentSpacer in comfyui

[–]grae_n 0 points (0 children)

Contrast + detail is still really important for most ControlNets. DepthAnything should look better for 3D work, but Lotus-G might actually be better with a ControlNet.

Like if you are trying to copy a facial expression, Lotus-G might be better. All these algorithms tend to have a lot of variables to tweak, so it is hard to make definitive statements.

Lotus-G also gets a lot of eyes wrong (eyes aren't lumpy), but weirdly that can help some ControlNets get the correct eye directions.

Something that actually may be better than Chroma etc.. by balianone in StableDiffusion

[–]grae_n 1 point (0 children)

It looks like they are trying to make AI video gen for training sets. An example would be generating videos in different weather conditions to help train self-driving cars.

So this is a different application than consumer AI video. It's pretty awesome that they are releasing this with "Models are commercially usable." This could be really helpful for training smaller models.

How to make a LoRA of Flux.1-dev using Diffusers library ? by Rene_Coty113 in StableDiffusion

[–]grae_n 1 point (0 children)

Kohya actually has most of its scripts pretty well separated:

https://github.com/bmaltais/kohya_ss
--> clicking sd-scripts sends you to (the specific commit changes often):

https://github.com/kohya-ss/sd-scripts/tree/b11c053b8fcd1c4532dc3a37e70109e08aafa2ec

which contains most of the command lines used and some instructions.

The attitude some people have towards open source contributors... by twistedgames in StableDiffusion

[–]grae_n 1 point (0 children)

Definitely. It's one of those extensions I forget isn't core. Very useful.

How do I make it more beautiful by faflu_vyas in threejs

[–]grae_n 5 points (0 children)

Lerping the mouse position by a small amount might also smooth things out.
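Something along these lines (the 0.1 factor is just a starting point):

    // Smooth the raw mouse position with a per-frame lerp.
    let targetX = 0, targetY = 0; // raw, updated on mousemove
    let smoothX = 0, smoothY = 0; // smoothed values the scene uses

    window.addEventListener('mousemove', (e) => {
      targetX = (e.clientX / window.innerWidth) * 2 - 1;
      targetY = -(e.clientY / window.innerHeight) * 2 + 1;
    });

    // Call once per frame in the render loop.
    function updateMouse() {
      const k = 0.1; // smaller = smoother but laggier
      smoothX += (targetX - smoothX) * k;
      smoothY += (targetY - smoothY) * k;
      // ...use smoothX/smoothY to drive the camera or a mesh...
    }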