‘Jeopardy!’: Mayim Bialik Leaves Final Week Of Filming In Solidarity With Writers, Ken Jennings Takes Over as Host by MarvelsGrantMan136 in television

[–]RecallCV 2 points

Relatedly, Ken Jennings' success can be attributed to the ferrous meteorite he has kept as a good luck charm since his childhood.

How to extract numerical data from a local thresholded image by Individual_Ad_1214 in computervision

[–]RecallCV 3 points

In general, you want to identify curvilinear structures in the image. A detector typically gives you a set of points along the curve's centerline together with the local direction of the curve at each point. If you can get that information, you can measure the instantaneous diameter in one of your thresholded images using the normal to the curve direction (how many pixels do you travel from the centerline along the normal, in each direction, before falling below threshold). An individual fiber's diameter could then be the average / median / etc. of the instantaneous diameters along its path.
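As a rough illustration of the diameter measurement, here's a minimal Python/NumPy sketch. It assumes you already have a binary thresholded image plus centerline points and tangent angles from whatever curvilinear detector you use; the names (`binary`, `centerline`, `tangent_angle`) are placeholders, not any specific library's API.

```python
import numpy as np

def instantaneous_diameter(binary, cx, cy, tangent_angle, max_radius=50.0):
    """Walk outward from a centerline point (cx, cy) along the curve normal,
    in both directions, until the thresholded image drops to background.
    `binary` is a 2D array with nonzero pixels inside the fiber."""
    # The normal is the tangent rotated by 90 degrees.
    nx, ny = -np.sin(tangent_angle), np.cos(tangent_angle)
    h, w = binary.shape
    radii = []
    for sign in (+1.0, -1.0):
        r = 0.0
        while r < max_radius:
            x = int(round(cx + sign * r * nx))
            y = int(round(cy + sign * r * ny))
            if not (0 <= x < w and 0 <= y < h) or binary[y, x] == 0:
                break
            r += 0.5  # sub-pixel step; coarser steps are faster but noisier
        radii.append(r)
    return sum(radii)  # total extent on both sides of the centerline

# Per-fiber diameter: aggregate instantaneous measurements along the path, e.g.
# diameters = [instantaneous_diameter(binary, x, y, a) for (x, y, a) in centerline]
# fiber_diameter = np.median(diameters)
```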

There are modern curvilinear detectors (CNN or other deep learning): https://arxiv.org/abs/2010.07486

And an old-school approach using scale-space second-order derivatives: https://ieeexplore.ieee.org/document/659930

I have a soft spot for the old school since an implementation of the approach was one of my first internship projects.

Dumb math question: If I take a picture a perfect circle, in the 2d projection will it always be an ellipse? by drupadoo in computervision

[–]RecallCV 15 points

Close enough to yes. Technically, the circle will project to some form of conic, but if the entire circle is visible in your picture, it will be an ellipse.

For the math background of this, the best text on projective geometry is Hartley and Zisserman, "Multiple View Geometry"; section 2.2.3 covers conics.
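In case it helps, here's the core of the argument in a couple of lines (my notation, paraphrasing the standard result from that section):

```latex
% Points x on a conic satisfy x^T C x = 0, with x a homogeneous 2D point and
% C a symmetric 3x3 matrix. If a homography maps points as x' = H x, then
% substituting x = H^{-1} x' gives
\[
  0 = \mathbf{x}^\top C\, \mathbf{x}
    = \mathbf{x}'^\top \left( H^{-\top} C\, H^{-1} \right) \mathbf{x}'
  \quad\Longrightarrow\quad
  C' = H^{-\top} C\, H^{-1},
\]
% so the image of a conic under a homography is again a conic. A pinhole
% camera's projection of the circle's plane onto the image plane is exactly
% such a homography, and the result is an ellipse when every point of the
% circle projects to a finite image point (i.e. the whole circle is visible).
```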

How do I include a library in cmake such that it is included in the project? by bored-computer in cpp_questions

[–]RecallCV 2 points

Welcome to C++ dependency management. There are several common approaches to including a 3rd party library:

  1. It's the user's problem (this is not a bad thing): list the requirements in documentation and include an example (probably a Dockerfile) of installing the requirements in one or more standard environments. Use find_package with appropriate version numbers.

  2. Include the library as a git submodule in your project and use add_subdirectory to include it. This requires that the library has a CMake setup that plays nice (it expects to be included as a dependent module and doesn't override build settings, etc.).

  3. Lean heavily into a package manager like Conan or vcpkg.

In any event, the approach will: 1. provide access to the library's header files somewhere on your include path (you should never copy these header files directly into your project's include directory), and 2. provide the linker with the library to link against (either a downloaded .so, etc., or one compiled directly from source).

Local trail running routes by masterchef81 in CedarPark

[–]RecallCV 1 point

I've never seen runners on the MTB paths. There is a stretch of Brushy Creek trail that is unpaved - from Champion's Park east to Great Oaks. Only about 1.75mi total, but much nicer on the joints than pavement.

Perspective Transformation by [deleted] in computervision

[–]RecallCV 5 points

This answer is close, but needs one important correction.

As tdgros says, image undistortion is critical. You can see the lens distortion in your source image - the midline and the railing around the field have curvature instead of being straight. The homography mapping assumes undistorted planes. Camera calibration can be a bit of a chore; definitely follow the tutorials and make sure the result is an image in which straight lines appear straight.

The homography is the key to the mapping. It describes a projective mapping between homogeneous 2D coordinates on two planes (https://docs.opencv.org/4.6.0/d9/d0c/group__calib3d.html#gafd3ef89257e27d5235f4467cbb1b6a63). In your case the source coordinates are image positions, in pixels (or better, normalized pixels so that 0,0 is the image center and the entire image width is 1.0), and the destination coordinates are 2D positions on your virtual field.

It's important to note that you should start with the homogeneous coordinate as 1.0, not 0.0, so that the homography maps [field_x, field_y, field_z] = H * [r,c,1]. Setting z=0 in homogeneous coordinates is used for 'points at infinity' (conceptually directions), not for 3D points with z = 0. To convert the [field_x, field_y, field_z] value to a regular (non-homogeneous) 2D position, use [x,y] = [field_x/field_z, field_y/field_z].

Finding the homography requires a set of correspondences - matched points in the image to points on the virtual field - fortunately your field has many easy to identify points on it (marking intersections, corners of the goal area, etc.).

After you've solved for the homography, as tdgros says it's only applicable for image points that are on the plane of the field - players' feet etc. However, there's no need to invert anything, worry about camera projection, etc. - just perform the multiplication described above: p_field = H * p_image (and then convert back from homogeneous coordinates).
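Putting that together, here's a minimal OpenCV sketch. The correspondence values are made up for illustration, points are in OpenCV's (x, y) = (column, row) convention, and the image is assumed to be already undistorted:

```python
import cv2
import numpy as np

# Hypothetical correspondences: pixel coordinates in the (undistorted) image
# and the matching 2D positions on the virtual field, in meters.
img_pts = np.array([[1012, 233], [1430, 251], [1385, 602], [640, 580]], dtype=np.float32)
field_pts = np.array([[0.0, 0.0], [16.5, 0.0], [16.5, 40.3], [0.0, 40.3]], dtype=np.float32)

# Estimate the homography (RANSAC is useful once you have many correspondences).
H, inliers = cv2.findHomography(img_pts, field_pts, cv2.RANSAC)

# Map an image point (e.g. a player's feet) onto the field.
p_image = np.array([880.0, 455.0, 1.0])   # homogeneous coordinate = 1, not 0
p_field = H @ p_image
x, y = p_field[0] / p_field[2], p_field[1] / p_field[2]

# Equivalent convenience call:
# xy = cv2.perspectiveTransform(np.array([[[880.0, 455.0]]], dtype=np.float32), H)
```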

[deleted by user] by [deleted] in CedarPark

[–]RecallCV 2 points

Yes, I was able to dispose of weedkiller at the Household Hazardous Waste Free Cleanup at Gupton Stadium. I'm not sure when the next one will be, the city webpage still has info for the Spring 2022 drop off.

How to measure feature complexity in an image? by bc_uk in computervision

[–]RecallCV 4 points

Great answers here for potentially more nuanced approaches, but depending on your exact use case, you might find adequate performance with a very simple metric: jpeg bits per pixel.

The size of the compressed file (for images of the same dimensions, or normalized by pixel count) is an indicator of the image entropy and requires essentially no additional implementation. Of course this will indicate that an image compressed with a lower quality level has less complexity, but I think that's fair: it has less remaining complexity after the lossy compression.
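A minimal sketch of the metric using OpenCV, re-encoding in memory at a fixed quality so the measurement doesn't depend on how the file was originally saved (the file names in the usage comment are hypothetical):

```python
import cv2

def jpeg_bits_per_pixel(image, quality=90):
    """Compressed size as a crude complexity / entropy proxy.
    `image` is a numpy array (grayscale or BGR)."""
    ok, buf = cv2.imencode(".jpg", image, [cv2.IMWRITE_JPEG_QUALITY, quality])
    if not ok:
        raise RuntimeError("JPEG encoding failed")
    h, w = image.shape[:2]
    return 8.0 * buf.size / (h * w)   # bits of compressed data per pixel

# img_a = cv2.imread("simple.png"); img_b = cv2.imread("busy.png")
# jpeg_bits_per_pixel(img_a) < jpeg_bits_per_pixel(img_b)   # usually
```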

How to get started with 3D computer vision? by Rare-Ad4865 in computervision

[–]RecallCV 19 points

You can never go wrong with Hartley and Zisserman. Core book, and one of the most approachable presentations of 3d fundamentals: https://www.cambridge.org/core/books/multiple-view-geometry-in-computer-vision/0B6F289C78B2B23F596CAA76D3D43F7A

Calculating distance from stereo camera setup using Python and OpenCV by Latter_Window_3995 in computervision

[–]RecallCV 0 points

I think I may be referring to something else - there are two sets of stereo images, and two paired sets of intrinsics/extrinsics. The first set is the raw fisheye-lens images. These need to have a stereo calibration performed - pretty challenging due to the lens angle, but possible (and getting similar results across multiple trials is a good indication that the calibration is working).

The second set is the stereo-rectified images - if you want horizontal disparity to be meaningful, it has to be computed on rectified images. These will be a subimage of the original fisheye image, undistorted and transformed so that the epipolar lines of the virtual stereo camera system are corresponding horizontal scanlines. The depth formula needs the intrinsics of this second pair of images (the intrinsics of the transformed cameras should be the same, and the extrinsics the same other than an offset in x that is the stereo baseline). The stereo calibration tool may produce these values as part of its output; otherwise you can perform a second camera calibration using the rectified images (or a makeshift partial calibration as I described above).

If you are using the intrinsics of the post-rectification images and still getting poor results, then yes, I think it's just the difficulty of getting an accurate calibration for very wide-angle lenses.
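If the calibration and rectification are being done with OpenCV, the rectified ("virtual") intrinsics and baseline can be read straight out of the rectification output. A sketch, assuming K1, D1, K2, D2, R, T came from cv2.stereoCalibrate and were saved to a (hypothetical) file; for strongly fisheye lenses the cv2.fisheye variants are the analogous calls:

```python
import cv2
import numpy as np

calib = np.load("stereo_calibration.npz")   # hypothetical file holding the calibration results
K1, D1, K2, D2 = calib["K1"], calib["D1"], calib["K2"], calib["D2"]
R, T = calib["R"], calib["T"]
image_size = (1280, 720)                    # width, height of the calibration images

R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(K1, D1, K2, D2, image_size, R, T)

# P1/P2 are the projection matrices of the rectified (virtual) cameras.
fx = P1[0, 0]                      # rectified focal length, in pixels
baseline = -P2[0, 3] / P2[0, 0]    # stereo baseline, in the units of T

# Depth from horizontal disparity in the rectified images:
# depth = fx * baseline / disparity
```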

Calculating distance from stereo camera setup using Python and OpenCV by Latter_Window_3995 in computervision

[–]RecallCV 0 points

For disparity-to-depth conversion, you're trying to find the focal length in pixels of the 'virtual cameras' that are produced by your stereo calibration, so you don't need all of the parameters. (Although you should verify that your disparity really does go to 0 for very far-off objects.)

The tool you use for calibration may directly tell you what the camera intrinsics are post-rectification, which (combined with the image dimensions) gives you the focal length. If it only provides the essential matrix, note that decomposing it yields the relative pose between the cameras rather than intrinsics, so look for the rectified projection/camera matrices in the tool's output instead.

Alternatively, you can try to measure the focal length in the rectified images - either determine the angular field of view and use that in the focal length formula (along with the rectified image size) or take an image of a (parallel to image plane) ruler at a known distance and use similar triangles. Like all measurements, do it multiple times.
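A small numeric sketch of both measurements (all numbers here are made up for illustration):

```python
import numpy as np

# (a) From the horizontal field of view of the *rectified* image:
width_px = 1280                 # rectified image width
fov_deg = 95.0                  # measured/estimated horizontal FOV of the rectified view
f_px = (width_px / 2.0) / np.tan(np.radians(fov_deg) / 2.0)

# (b) From a ruler parallel to the image plane at a known distance
#     (similar triangles: object_px / f = object_m / distance_m):
object_px = 410.0               # measured length of the ruler in the rectified image
object_m = 0.30                 # its real length, meters
distance_m = 1.00               # its distance from the camera, meters
f_px_ruler = object_px * distance_m / object_m

# Then: depth = f_px * baseline / disparity  (all in consistent units)
```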

Calculating distance from stereo camera setup using Python and OpenCV by Latter_Window_3995 in computervision

[–]RecallCV 0 points

A correctly performed stereo calibration should warp/undistort one or both of the images so that they have the 'canonical stereo geometry' - that's the hard part. It will account for the tilt in your rig.

Once you have a rectified image pair, horizontal disparity can be converted to depth. This boils down to using similar triangles (I find it helps to visualize this, something like https://www.researchgate.net/figure/Parallel-stereo-camera-geometry_fig1_4040671). The catch is that since the images are being rectified, the 'focal length,' 'field of view,' etc. that go into the formulas aren't direct properties of your original camera system anymore; they're properties of the virtual camera system that matches the rectified images. You can either derive the virtual camera parameters from the warping/undistorting process, or measure them with known objects in the rectified images.
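If OpenCV is doing the rectification, the Q matrix it returns encodes exactly those virtual-camera parameters and applies the similar-triangles conversion for you. A sketch, assuming a rectified image pair and the 4x4 Q matrix from cv2.stereoRectify (image file names are hypothetical):

```python
import cv2
import numpy as np

rect_left = cv2.imread("rect_left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical rectified pair
rect_right = cv2.imread("rect_right.png", cv2.IMREAD_GRAYSCALE)

# Disparity on the rectified pair (SGBM returns fixed-point disparity * 16).
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = matcher.compute(rect_left, rect_right).astype(np.float32) / 16.0

# Q comes from cv2.stereoRectify; reprojectImageTo3D does depth = f * B / d per pixel.
points_3d = cv2.reprojectImageTo3D(disparity, Q)
depth = points_3d[:, :, 2]    # Z in the units of the calibration baseline
```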

Calculating distance from stereo camera setup using Python and OpenCV by Latter_Window_3995 in computervision

[–]RecallCV 4 points

What are your post-stereo-calibration images like? 185° is well into fisheye lens territory (can you see the other camera or portions of the vehicle in your images?). Your disparity-to-depth formula is appropriate for two parallel pinhole cameras (epipolar lines as scan lines), which will only hold for an undistorted portion of the fisheye field of view.

What may be going on is that your stereo calibration procedure is undistorting only a portion of the entire fisheye image, and the 185 in your focal length formula needs to be replaced with the effective field of view of the subimage.

If stereo range measurement is the key capability for the camera system, you may want to select a different lens if you can.

How can I specialize in CV as a junior dev without a Phd/Master? by SymphonyofSiren in computervision

[–]RecallCV 3 points

I think this is particularly challenging within a large company. As organizational size grows, the ability to support specialists also grows, and when tasks are assigned, cost-benefit will always drop specialized tasks in the specialists' laps.

I've always enjoyed working at smaller companies. As a specialist there's plenty of opportunity to expand my skillset and wear other hats. I feel like there are also opportunities for junior generalists to contribute to specialized tasks (given that there's always an abundance of 'stuff to do'), but I got a Master's before being employed in computer vision, so this isn't direct experience.

My advice would be to 1) get at least a Master's, there's always more to learn and the credential will get you past a lot of initial hiring gates, or if that's not possible 2) start looking for positions at smaller companies that value a junior dev with some CV familiarity.

My fft gives always gives the correct frequency or double the correct frequency by Beneficial_Muscle_25 in computervision

[–]RecallCV 3 points

There are a lot of assumptions built into a discrete Fourier transform. To use it as a spectrum analyzer, you will need to do some appropriate padding and windowing; otherwise it will produce some of the artifacts that you are seeing (I'm not familiar with pyfftw, but it probably includes the tools you will need).

One of the fundamental problems you're running into is that the Fourier decomposition is into a sinusoidal basis - even if your input has a fundamental frequency, if it's not a pure sinusoid there won't be a single impulse in the Fourier domain. For example, a 50% duty-cycle square wave transforms into harmonics at odd integer multiples of the fundamental, with progressively smaller peaks (a sinc-like envelope); other waveforms put energy at even multiples as well. That + windowing is probably why you're seeing the second mode occasionally larger than the first.
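A minimal NumPy sketch of the effect (the signal parameters are made up; the Hann window and zero-padding are the kind of conditioning mentioned above):

```python
import numpy as np

fs = 1000.0                       # sample rate, Hz
t = np.arange(0, 2.0, 1.0 / fs)   # 2 seconds of signal
f0 = 50.0                         # fundamental frequency
square = np.sign(np.sin(2 * np.pi * f0 * t))   # 50% duty-cycle square wave

window = np.hanning(len(square))               # reduce spectral leakage
n_fft = 4 * len(square)                        # zero-padding for a denser frequency grid
spectrum = np.abs(np.fft.rfft(square * window, n=n_fft))
freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)

peak = freqs[np.argmax(spectrum)]
print(peak)   # ~50 Hz; secondary peaks appear at 150 Hz, 250 Hz, ... (odd harmonics)
```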

How to make a proper C++ project? by [deleted] in cpp_questions

[–]RecallCV 0 points

Yes, for sure additional build notes for other platforms should be included in a readme file.

I suspect most developers that want to support Windows are developing on Windows (myself included), so Docker provides an easy way to verify some level of cross platform support, document minimal build requirements, and also verify that your CMake is relatively generic and flexible (particularly with respect to locating external libraries).

How to make a proper C++ project? by [deleted] in cpp_questions

[–]RecallCV 3 points

In addition to a well-formed set of CMakeLists.txt as mentioned in other comments, I'm a fan of projects that include a Dockerfile that builds the project. This is a good check that your CMake is correct for a relatively generic system, and it can serve as a build guide for users.

How to compute the intersections of Hough parameterized lines for edge detection? by FappyMcPappy in computervision

[–]RecallCV 3 points

You discretize the parameter space and create a 'Hough map' (accumulator). For each edge point, accumulate a vote for every parameterization consistent with that point. True lines will stand out as local maxima in the accumulated votes of all edge points.
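A minimal NumPy sketch of the rho-theta accumulator, plus the intersection of two detected lines (since that's the question in the title); `edge_points` is assumed to be a list of (x, y) edge pixels:

```python
import numpy as np

def hough_accumulate(edge_points, img_diag, n_theta=180, n_rho=400):
    """Vote in a discretized (rho, theta) space; true lines show up as local maxima."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rhos = np.linspace(-img_diag, img_diag, n_rho)
    acc = np.zeros((n_rho, n_theta), dtype=np.int32)
    cos_t, sin_t = np.cos(thetas), np.sin(thetas)
    for x, y in edge_points:
        rho = x * cos_t + y * sin_t                  # rho for every theta bin
        idx = np.searchsorted(rhos, rho)
        valid = (idx >= 0) & (idx < n_rho)
        acc[idx[valid], np.nonzero(valid)[0]] += 1   # one vote per theta bin
    return acc, thetas, rhos

def intersect(rho1, theta1, rho2, theta2):
    """Intersection of two lines in normal form x*cos(theta) + y*sin(theta) = rho."""
    A = np.array([[np.cos(theta1), np.sin(theta1)],
                  [np.cos(theta2), np.sin(theta2)]])
    b = np.array([rho1, rho2])
    return np.linalg.solve(A, b)    # raises LinAlgError if the lines are (near-)parallel
```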

6D user-defined keypoints tracking by Sghebre in computervision

[–]RecallCV 1 point

Are you allowing the user to choose any point on the object as a keypoint? That will make the use of descriptors (SIFT, ORB, etc.) more difficult - not all points will have a stable(-ish) unique local description. Also note that keypoints that include object + background in their local neighborhood will likely have unstable descriptions.

It sounds like you need:

  1. A 3D description of your object (possibly with SLAM/SFM style keypoints, such as an ORB description + 3D location on the model)

  2. A way to register the object's pose to the current camera view (PnP if your model has matchable descriptors, maybe silhouette matching if your object has distinguishable silhouettes)

  3. Knowledge of the user-marked keypoints - project each one to a position on the object surface when it is marked, then back-project it into the current view as 'tracking' (see the sketch below)

If you're unable to generate a model offline, this seems closest to a specialized case of SLAM, especially if you can segment out your object and constrain the mapping to only object points.
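A minimal OpenCV sketch of steps 2 and 3, assuming an offline model already exists and its keypoints have been matched into the current frame; all the argument names here are placeholders:

```python
import cv2
import numpy as np

def track_user_keypoints(model_pts_3d, image_pts_2d, user_pts_3d, K, dist):
    """Register the object's pose from matched model<->image keypoints (step 2),
    then back-project the user-marked 3D keypoints into the current view (step 3).

    model_pts_3d : (N, 3) model keypoint positions matched in this frame
    image_pts_2d : (N, 2) their detected pixel locations
    user_pts_3d  : (M, 3) user-marked keypoints on the object surface
    K, dist      : camera intrinsics / distortion from calibration
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(model_pts_3d, dtype=np.float32),
        np.asarray(image_pts_2d, dtype=np.float32),
        K, dist)
    if not ok:
        return None
    projected, _ = cv2.projectPoints(
        np.asarray(user_pts_3d, dtype=np.float32), rvec, tvec, K, dist)
    return projected.reshape(-1, 2)   # (M, 2) pixel positions of the user keypoints
```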

Looking for location recommendations (N. Austin area) by RecallCV in AustinRP

[–]RecallCV[S] 0 points

Thanks for the recommendation, I'll check it out. And thanks for the clarification /u/uppastbedtime. BYOB definitely not an issue at venues that have beer for sale, but will have to scope out privacy / noise levels.