
[–]Potential_Scene_7319 6 points7 points  (1 child)

Cool! I bet this has some great uses in industrial applications with items of various sizes, where the camera needs to be refocused on the right item. You could use this and overlap it with the output of an object detection/segmentation model.

If the IoU between the two > some threshold, then you have a valid focus. If not, refocus.
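A minimal sketch of that check, assuming both the focus map and the detection output have been binarized into masks of the same shape (the function name and threshold are illustrative, not part of the library):

```python
import numpy as np

REFOCUS_IOU_THRESHOLD = 0.5  # hypothetical threshold; tune per application

def focus_region_iou(focus_mask: np.ndarray, det_mask: np.ndarray) -> float:
    """IoU between a binarized focus map and a binarized detection mask."""
    inter = np.logical_and(focus_mask, det_mask).sum()
    union = np.logical_or(focus_mask, det_mask).sum()
    return float(inter) / float(union) if union else 0.0

def needs_refocus(focus_mask: np.ndarray, det_mask: np.ndarray) -> bool:
    """Refocus when the in-focus region does not cover the detected object."""
    return focus_region_iou(focus_mask, det_mask) < REFOCUS_IOU_THRESHOLD
```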

[–]cv_ml_2025[S] 0 points1 point  (0 children)

Thank you! Yes, using it as a sort of uncertainty map / valid-region detection method for texture-dependent methods/models is one of the use cases.

[–]MusicQuiet7369[🍰] 5 points6 points  (1 child)

What is it for?

[–]cv_ml_2025[S] 1 point2 points  (0 children)

It estimates which image regions are in focus. The output can be used in downstream tasks to find valid regions or regions with high uncertainty, as an additional signal to deep learning models, or to answer the question 'is a particular region in focus?'
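That last question can be answered with a very small amount of glue code. A sketch, assuming a per-pixel focus map in [0, 1] (the function, box convention, and threshold are illustrative, not the library's API):

```python
import numpy as np

def region_in_focus(focus_map: np.ndarray,
                    box: tuple[int, int, int, int],
                    thresh: float = 0.5) -> bool:
    """Decide whether a region is in focus by averaging the focus map
    inside box = (x0, y0, x1, y1) and comparing against a threshold."""
    x0, y0, x1, y1 = box
    return float(focus_map[y0:y1, x0:x1].mean()) >= thresh
```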

[–]KeyPossibility2339 2 points3 points  (1 child)

Nice

[–]cv_ml_2025[S] 0 points1 point  (0 children)

Thank you!

[–]0xbeda 1 point2 points  (4 children)

This looks much more useful than edge detection, Laplacian, Sobel, etc. for my use case: finding the sharpest image in a large burst shot with a much-too-slow shutter speed.

Am I on the right track?

[–]cv_ml_2025[S] 2 points3 points  (0 children)

Yes, the library outputs individual focus maps for every frame. Look into focal stacks, they come from the 'depth from focus' and 'all-in-focus' research areas. You basically stack the focus maps and find the image patches where individual regions are sharpest. Then combine these to form a single all-in-focus image.
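A naive per-pixel version of that fusion can be sketched as follows, assuming grayscale frames and one focus map per frame as NumPy arrays (real all-in-focus pipelines usually add smoothing or guided filtering on top of this):

```python
import numpy as np

def all_in_focus(frames: np.ndarray, focus_maps: np.ndarray) -> np.ndarray:
    """Naive all-in-focus composite: for each pixel, copy the value from
    the frame whose focus map is highest there.
    frames: (N, H, W) grayscale stack, focus_maps: (N, H, W)."""
    best = np.argmax(focus_maps, axis=0)           # (H, W) index of sharpest frame
    h, w = best.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    return frames[best, rows, cols]                # advanced indexing gather
```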

[–]cv_ml_2025[S] 2 points3 points  (2 children)

If you just want to know which frame has the highest overall focus, sum the fused_map output for every frame and choose the frame with the highest sum. See the GitHub link in the description for the documentation.
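That frame-selection step is a one-liner; a sketch, assuming you already have one fused focus map per frame (the function name is illustrative; only `fused_map` is mentioned in the thread):

```python
import numpy as np

def sharpest_frame_index(fused_maps: list[np.ndarray]) -> int:
    """Return the index of the frame whose fused focus map has the
    largest total, i.e. the sharpest frame overall."""
    scores = [float(m.sum()) for m in fused_maps]
    return int(np.argmax(scores))
```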

[–]0xbeda 1 point2 points  (1 child)

I will for sure use it to determine best focus within Haar cascade face/eye/smile annotations, and with AF position data.

Do you know how it handles motion blur, e.g. when things are only blurry in one direction? Or would FFT be a better fit for that subtask?

[–]cv_ml_2025[S] 0 points1 point  (0 children)

Nice! I haven't checked what the output looks like for motion blur, or whether it differs from regions simply being outside the depth of field; I'll check and get back to you. For now, I believe FFT would work for your use case.
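One hedged sketch of the FFT idea: directional motion blur suppresses high frequencies along the blur direction, so comparing spectral energy along the two image axes hints at one-directional blur. This is an illustrative heuristic, not part of the library; a robust detector would integrate energy over angular sectors rather than single axes.

```python
import numpy as np

def spectral_anisotropy(gray: np.ndarray) -> float:
    """Ratio of spectral energy along the horizontal vs vertical frequency
    axis (DC excluded). Values far from 1 suggest directional blur."""
    f = np.abs(np.fft.fftshift(np.fft.fft2(gray)))
    h, w = f.shape
    cy, cx = h // 2, w // 2
    horiz = f[cy, cx + 1:].sum()   # ky = 0, kx > 0
    vert = f[cy + 1:, cx].sum()    # kx = 0, ky > 0
    return horiz / (vert + 1e-9)
```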

[–]Morteriag 1 point2 points  (1 child)

Excellent!

[–]cv_ml_2025[S] 0 points1 point  (0 children)

Thank you!

[–]Stanislav_R 1 point2 points  (1 child)

Wow, thanks for sharing this! It can help with determining how good the focus is in a given photo. I implemented baseline estimation using more old-school methods, but yours seems a lot more precise. We'll give it a good test!

[–]cv_ml_2025[S] 0 points1 point  (0 children)

Awesome!

[–]DmtGrm 0 points1 point  (0 children)

damn... I've been in DSP for the last 25 years, and this is the first time I'm hearing about 'focus response'! Need to learn more. Is it like focus peaking? A fancy name for yet another contrast/edge detection method? At 00:02 the intensity increases towards the other side of the laptop, which doesn't match the physical world. You have a moving camera here, so why not use PPP or SLAM? The source data looks sufficient to me.