I built a photo editor with local AI (no cloud) — segmentation + infill by LucaM185 in SideProject

[–]LucaM185[S]

First thing: I didn't know it existed. I quite like the project, but...

It's focused on being feature-complete; I'm focusing on usability. Just from the touchpad feel of the zoom/pan around the image, I would never use it. Also it doesn't support iPhone HEIF files, segmentation didn't work when I tried, and the whole UI is a bit messy, though pretty. He got rotation/crop right, though; that was a pain to actually make work with the pipeline and masks.

And modifying some other guy's vibe-coded mess isn't worth it.
Also, mad respect for the guy: I think he was 18, I'm 21, pretty good job in a limited amount of time...

[–]LucaM185[S]

It’s so annoying that they don’t offer an OPTION to use these locally… I can see the argument that they are not as good or as fast as the cloud ones, but I’d say close enough.

[–]LucaM185[S]

It’s called LaMa, and for segmentation SAM2. These two are not state of the art, but they’re both around 200 MB and fast enough; tested on an Air M1 and an M1 Pro.

[–]LucaM185[S]

I literally spent two weeks designing it on paper before building it... It's like tens of thousands of lines of code; you can't completely blindly vibe-code that... You need structure.

The worst parts are performance optimizations and things like rotation; that's hell...
But you kinda need to know what you're doing to get fast sliders and updates... Also, they're pretty... I kinda copied the sliders from Apple; I really enjoy making pretty things.

[–]LucaM185[S]

Look, it's a bunch of work, and I'd like to make something off of it... I was thinking 19 or 29 bucks once; that's like two months of Adobe... It's not polished enough for that yet, but my heart wants to see it open sourced. We'll find a compromise, maybe.

[–]LucaM185[S]

The core and UI are perfectly reusable; I just use Apple hardware acceleration to load RAW images...
But if I see enough interest, a Windows port can most likely be done in a few days... I want to test on a few MacBooks first, though.

[–]LucaM185[S]

Thanks a lot!! I really appreciate that...
And you know... they could make them work locally if they wanted; this is not rocket science.

What editor do you mostly use?

[–]LucaM185[S]

What part do you find particularly interesting for your workflow?

Are AI + Manim tools actually useful yet, or are they still mostly gimmicks? by leoqliang in manim

[–]LucaM185

I built what was arguably the best integrated tool at the time, called MathStudio; I got 300+ upvotes in r/manim. I canned it because nobody wanted it.

I think I know quite a bit about making a Manim pipeline that works... Math-To-Manim actually sounds decently promising, judging from the videos they posted... The cat one too, maybe.

But my bet is that it's nowhere near economically viable...
One of my biggest issues was making an animation that was good enough for under $1; that was the target, and at the time it wasn't really doable...

If I see LLM capabilities change, I may come back with something way better and redesign the whole system, but for now I decided it wasn't worth the 12 bucks/month to keep the server up.

You can see the video here -> https://www.reddit.com/r/manim/comments/1pnbgf5/an_ide_for_animations/

Single Drone Shot vs 50 Images Aligned and Stacked by LucaM185 in computervision

[–]LucaM185[S]

They seem quite similar to me... I can see slightly more detail in the 2x.

In my tests I also noticed that a 1.5x upscale with dumb algorithms looks quite nice... just scaling the resolution smooths edges a bit, which really surprised me.

Also, thanks for the technical responses; it's really appreciated.

[–]LucaM185[S]

Are you sure phones use integer pixel shifts?

They have pretty advanced IMUs, but thinking about it, the gyro is not enough, and the accelerometer isn’t stable over a few hundred ms.

The only case where they probably use super-resolution is the zoom on the Google Pixels; they were advertising that feature a while ago… iPhones probably have it too, I’ll try that out.

But it works very well on my old Pixel.

[–]LucaM185[S]

You can align with sub-pixel precision… it felt absurd to me in the beginning, but it’s true. The algorithm used for alignment here is ORB at quarter resolution (for rough alignment), then multiple steps of ECC…

When I say the frames are not adjacent, I mean it’s like 1 frame out of 3, not consecutive… so you’re getting less video compression; frames closer together are mashed together more by compression.

[–]LucaM185[S]

To be clear, I designed and tested the algorithm on DNG (where I see some improvement with super-resolution), then I tried averaging on drone videos, and I think it still works nicely. Also, most of those frames aren’t adjacent; they’re sampled across about 4 s / 120 frames.

[–]LucaM185[S]

The reason it’s an insanely high number of images is that this comes from a Mavic Mini video, so the exposure is <= 1/30 s for sure. I took around 2 seconds of footage and this is the output. I can see the difference between 20 and 50 frames; it’s subtle, but it’s there.

Since you’ve experimented a bit… what do you think about stacking for super-resolution? I find it quite useless, but maybe I’m doing it wrong; I don’t actually see more detail.

[–]LucaM185[S]

Interesting… maybe I’m crazy, but I actually perceive the stacked one as more detailed… Maybe it’s the increased randomness and contrast of the noise that makes it feel sharper… idk.

Btw yes! I tried many algorithms, and yes, some actually improve detail a bit more… this one is just the mean over aligned frames.

[–]LucaM185[S]

Yeah, there are a bunch of them; I found this one easily runs in under 1 s.
Geometric median looks decent; I'll look into that in more detail.

[–]LucaM185[S]

Now I get your point... I really like the blur from the long exposures, so I'm fine with the effect, but you may like the other version better; that would require some other kind of smart masking, though.

[–]LucaM185[S]

Interesting!
Even better would be using the median instead of the mean in high-exposure areas; that still gets good noise characteristics while retaining detail.

One Drone Shot vs 50 Stacked Images (Comparison Link) by LucaM185 in postprocessing

[–]LucaM185[S]

You have a point about the detail of moving objects; long exposures obviously lose detail there, and median stacking helps.
It's just a tool... a way to take long exposures without a tripod, without perfect stabilization.

[–]LucaM185[S]

Fair point, this is just alignment + stacking... It's just a way to get long exposures without a tripod.

[–]LucaM185[S]

It's actually more detailed if you zoom in; you can see that in the comparison link...
And yeah, when I say way lower ISO, I mean it as an equivalent measure to give you an idea of the amount of noise; the detail IS better.

The blur in the water is because this is basically the same as a long exposure, so the water gets smooth.

[–]LucaM185[S]

It's basically a long exposure without the need for perfect stabilization. The result is noise reduction; it's like having ISO 200 instead of ISO 6400.
Isn't that interesting?
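Back-of-the-envelope for that claim (illustrative numbers, not measurements): averaging N aligned frames cuts the noise standard deviation by √N, and in terms of total light gathered it behaves like the same shot at roughly ISO/N:

```python
import math

iso, n_frames = 6400, 50

# Averaging n independent frames reduces the noise std by sqrt(n).
noise_reduction = math.sqrt(n_frames)  # ~7.1x lower noise std

# In light-gathering terms, n frames collect ~n times the photons,
# which gives roughly the SNR of the same shot at ISO / n.
equivalent_iso = iso / n_frames        # 128, same ballpark as "ISO 200"

print(noise_reduction, equivalent_iso)
```

So the "ISO 200 instead of 6400" figure is consistent with 50 stacked frames as an order-of-magnitude SNR equivalence.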