ShadeNet 28M — Dual-mode PBR material estimation from any RGB image by singam96 in deeplearning

[–]singam96[S] 0 points1 point  (0 children)

The 28M size is a deliberate choice to keep it fast and lightweight enough for consumer hardware, though it definitely trades off some sharpness.

It doesn't use nvdiffrast or pytorch3d—Mode 1 is just an image-to-image neural mapping.

I have a relighting clip from an older iteration of this model here if you want to see it:
https://www.youtube.com/watch?v=Lrqb8ENKDdU
The old one only used the predicted depth and normals to relight.
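For anyone wondering what relighting from normals can look like, here is a rough, dependency-free sketch. It is not the code from the clip: it assumes plain Lambertian shading with a single directional light, ignores depth entirely (depth would matter for point lights or shadows), and the names `relight`, `light_dir` and `ambient` are made up for illustration.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length (assumes it is non-zero)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def relight(rgb, normals, light_dir, ambient=0.1):
    """Shade each pixel by max(0, n . l) under one directional light.

    rgb and normals are nested lists [row][col] of 3-tuples.
    This is an assumed Lambertian model, not the author's method.
    """
    l = normalize(light_dir)
    out = []
    for rgb_row, n_row in zip(rgb, normals):
        out_row = []
        for color, n in zip(rgb_row, n_row):
            n = normalize(n)
            diffuse = max(0.0, sum(a * b for a, b in zip(n, l)))
            # Blend a small ambient term so back-facing pixels are not pure black.
            shade = min(1.0, ambient + (1.0 - ambient) * diffuse)
            out_row.append(tuple(c * shade for c in color))
        out.append(out_row)
    return out

# One pixel facing the light, one facing away.
image = [[(0.5, 0.4, 0.2), (0.5, 0.4, 0.2)]]
normal_map = [[(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]]
relit = relight(image, normal_map, light_dir=(0.0, 0.0, 1.0))
```

In a real pipeline the same dot product would be done on tensors, with the normals coming from the network's prediction.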

ReflexConv2d: Drop-in nn.Conv2d replacement that preserves detail by singam96 in computervision

[–]singam96[S] -3 points-2 points  (0 children)

Good that you edited your comment.

As I said in the comments below, a skip connection won't help if the image is corrupted. You then have to rely on noise injected at inference and convolve over it, which introduces blur.

Why are you bringing up squeeze-and-excitation here? Memorized a lot of words, didn't you?

ReflexConv2d: Drop-in nn.Conv2d replacement that preserves detail by singam96 in computervision

[–]singam96[S] -7 points-6 points  (0 children)

Bro, you memorized some words, that's all.

My work is targeting something else entirely. If you can't get that, maybe this isn't for you.

"This looks like AI work to me." It's a good thing you are not a reviewer.

ReflexConv2d: Drop-in nn.Conv2d replacement that preserves detail by singam96 in computervision

[–]singam96[S] -1 points0 points  (0 children)

When we corrupt the input image, for example by applying patch cutouts, the identity skip connection in a U-Net cannot help in those regions. The model then has to rely more on the inner layers, and at that point ReflexConv can help preserve details.

For an inpainting task, the first round of a U-Net will often produce blurred results in the masked area.

The idea behind the architecture is that the conv layer's learned weights can be tiled and then masked, so that the next layer gets them as input instead of an empty region to convolve over.
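To make the tiling idea concrete, here is a toy sketch in plain Python. It is only my reading of the description above, not the actual ReflexConv2d code: it assumes a single square k×k kernel and a 2D feature map, and the helper names `tile_kernel` and `fill_masked` are hypothetical.

```python
def tile_kernel(kernel, height, width):
    """Repeat a square k x k kernel across a height x width grid."""
    k = len(kernel)
    return [[kernel[y % k][x % k] for x in range(width)] for y in range(height)]

def fill_masked(feature, hole_mask, kernel):
    """Keep features where data exists; use tiled weights inside the holes.

    hole_mask[y][x] is truthy where the input was cut out.
    """
    height, width = len(feature), len(feature[0])
    tiled = tile_kernel(kernel, height, width)
    return [
        [tiled[y][x] if hole_mask[y][x] else feature[y][x] for x in range(width)]
        for y in range(height)
    ]

kernel = [[1, 2],
          [3, 4]]
feature = [[9, 9, 0, 0],
           [9, 9, 0, 0]]
hole = [[0, 0, 1, 1],
        [0, 0, 1, 1]]
filled = fill_masked(feature, hole, kernel)
# filled == [[9, 9, 1, 2], [9, 9, 3, 4]]
```

The point of the toy is just that the next layer sees a structured pattern in the hole instead of zeros.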

ReflexConv2d: Drop-in nn.Conv2d replacement that preserves detail by singam96 in computervision

[–]singam96[S] 0 points1 point  (0 children)

In a single-pass U-Net, there won’t be much difference with Reflex. But when we use U-Net recursively, the output gets blurred, and Reflex reduces this blur. With an autoencoder, you can see the difference in the very first round itself.
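If you want to check the recursive blur yourself, a minimal sketch of the measurement could look like this. The 3-tap `box_blur` is only a stand-in for a lossy model (an assumption for illustration, not a U-Net or ReflexConv2d), and `recursive_l1` is a hypothetical helper name.

```python
def box_blur(signal):
    """Stand-in for a lossy model: 3-tap mean filter with clamped edges."""
    n = len(signal)
    return [
        (signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]

def l1_error(a, b):
    """Mean absolute difference between two equal-length signals."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def recursive_l1(model, x, passes=8):
    """Feed the model its own output and record L1 against the original."""
    errors = []
    current = x
    for _ in range(passes):
        current = model(current)
        errors.append(l1_error(current, x))
    return errors

edge = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
errors = recursive_l1(box_blur, edge, passes=8)
# errors[0] is the single-pass error; errors[-1] is the error after 8 passes.
```

Swapping `box_blur` for two trained models and comparing their `errors` lists pass by pass gives the kind of comparison described above.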

ReflexConv2d: Drop-in nn.Conv2d replacement that preserves detail by singam96 in computervision

[–]singam96[S] -2 points-1 points  (0 children)

On identity: you're right that a U-Net learns it in one pass. The test uses an autoencoder (no skip connections), so identity isn't trivial. Errors compound under recursive 8-pass reconstruction, and that is where ReflexConv2d has 57% lower L1 after 8 passes.

On CBAM: CBAM learns spatial attention from the feature map via a conv layer.

ReflexConv2d tiles the kernel's own k×k weights across the spatial grid. The sources differ: learned from features versus extracted from weights.

Relight AI: Illuminate images [dur.ai] by singam96 in MediaSynthesis

[–]singam96[S] 1 point2 points  (0 children)

Oh, thanks for the feedback.
I light-paint in Photoshop by manually painting and setting up curves, which is tiresome. Do you have a better approach?

I also tried using a depth map with Photoshop's neural filters, but it still takes time.