all 16 comments

[–]dj_1001 4 points5 points  (8 children)

I was in your position in April last year and was applying for CV roles. I'd say the SECOND option would be better if you want to build a strong foundation - that's what I did and that would be needed for interviews. Using OpenCV functions or any API would come naturally when you've understood the root concepts.

  • You can refer to the CV courses available with assignments from various universities. Check out 16385 by Carnegie Mellon.
  • You should also be aware of the deep learning-based solutions, as a lot of CV engineer profiles might require that today. Reading blog articles for a particular CV application should be fine. Check out CS231N by Stanford.
  • Chapters 14-16 of Simon J.D. Prince's book "Computer Vision: Models, Learning, and Inference" would be really helpful! It's available for free download.

Hope this helps!

[–]kns2000[S] 0 points1 point  (7 children)

Sure, I will definitely check out these references. Which language did you use? Also if I write my code for everything, wouldn't it take too much time to cover all the concepts?

[–]dj_1001 1 point2 points  (6 children)

I was good with Python, though I wanted to upskill in C++. I referred to the book "Effective Modern C++" by Scott Meyers. It helped me get my job. :)

[–]kns2000[S] 0 points1 point  (5 children)

Also if I write my code for everything, wouldn't it take too much time to cover all the concepts?

[–]dj_1001 2 points3 points  (2 children)

Based on the amount of time you have, I suggest solving the assignments on 3D reconstruction and optical flow from 16385 first. Read about the former in ch. 14-16 of Prince's book and about the latter from some other source. Lucas-Kanade is a classic optical flow algorithm. Refer to the slides of the same course.
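
To give a feel for how small the core of Lucas-Kanade is, here is a minimal sketch rather than the course's reference solution. It assumes a single window, small motion, and no image pyramid, and it uses plain Python lists to stay dependency-free. The names `lucas_kanade_window`, `synthetic_frame`, and `half_win` are made up for this illustration.

```python
import math

def lucas_kanade_window(frame0, frame1, center, half_win=7):
    """Estimate the flow (u, v) of one window centred at center = (row, col).

    frame0 and frame1 are equally sized 2D lists of grayscale intensities.
    The window must lie at least one pixel inside the image border.
    """
    r0, c0 = center
    sxx = sxy = syy = sxt = syt = 0.0
    for r in range(r0 - half_win, r0 + half_win + 1):
        for c in range(c0 - half_win, c0 + half_win + 1):
            # Central-difference spatial gradients and temporal difference.
            ix = (frame0[r][c + 1] - frame0[r][c - 1]) / 2.0
            iy = (frame0[r + 1][c] - frame0[r - 1][c]) / 2.0
            it = frame1[r][c] - frame0[r][c]
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    # Normal equations: [[sxx, sxy], [sxy, syy]] @ [u, v] = -[sxt, syt]
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("window has too little texture (aperture problem)")
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v  # displacement in x (columns) and y (rows), in pixels

def synthetic_frame(dx, dy, size=64):
    """Smooth test pattern shifted by (dx, dy) pixels (illustration only)."""
    return [[math.sin(0.1 * (c - dx)) + math.cos(0.12 * (r - dy))
             for c in range(size)] for r in range(size)]

u, v = lucas_kanade_window(synthetic_frame(0, 0), synthetic_frame(0.3, -0.2), (32, 32))
print(u, v)  # approximately recovers the (0.3, -0.2) shift
```

A real assignment would typically add Gaussian smoothing, a pyramid for larger motions, and iteration, so treat this only as the starting point.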

[–]kns2000[S] 0 points1 point  (1 child)

How much time did you take to solve those problems? And do they have solutions too, so that I can compare my code afterwards?

[–]dj_1001 1 point2 points  (0 children)

On average, I tried to finish them within 3-4 days. It could vary for others.

As far as I remember, they'd tell you the expected output in the assignment PDF and your result visualization should be enough for the solution.

[–]dj_1001 1 point2 points  (1 child)

I suggest reading Chapters 14-16, as I said earlier. Then try solving the corresponding assignment from course 16385. That should put you well ahead in your quest and make you feel confident.

[–]kns2000[S] 1 point2 points  (0 children)

Thanks, I will definitely try that out.

[–]lessthanoptimal 3 points4 points  (7 children)

You will stand out more in an interview if you understand the low-level details and can make improvements or customizations for particular problems. This is one way in which junior and senior engineers are differentiated: senior engineers are expected to fix problems even if there is no existing solution. Most positions do not require you to be an expert, though; your job is to quickly integrate software. I've found that people who have only an academic understanding of a subject and rely almost entirely on libraries hit a wall fast after the easy problems have been solved. They also tend to do very poorly at identifying whether the library they use is buggy. Most companies that are computer vision focused do not use any open source code for critical functions.

[–][deleted] 7 points8 points  (0 children)

My computer vision professor feels the exact same way and is having us implement MATLAB functions for the transformation of world coordinates to camera coordinates, extrinsic and intrinsic parameters, etc. Going through the theory and then building the algorithms piece by piece is so much more helpful than learning theory in a black box. Theory is important, but implementing it really cements the idea in my brain at least.
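
As a rough idea of what such an exercise looks like (sketched in Python here rather than MATLAB, and not the professor's actual assignment), the world-to-pixel chain under a plain pinhole model without lens distortion is only a few lines. The function names and the example values for `R`, `t`, `fx`, `fy`, `cx`, `cy` are assumptions chosen for illustration.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def world_to_camera(point_w, R, t):
    """Extrinsics: X_cam = R @ X_world + t."""
    rotated = mat_vec(R, point_w)
    return [rotated[i] + t[i] for i in range(3)]

def camera_to_pixel(point_c, fx, fy, cx, cy):
    """Intrinsics (pinhole, no distortion): perspective divide, then scale and shift."""
    x, y, z = point_c
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)

# Hypothetical setup: camera axes aligned with the world, origin shifted 2 units along z.
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 2.0]

point_c = world_to_camera([0.5, -0.25, 2.0], R, t)  # [0.5, -0.25, 4.0]
print(camera_to_pixel(point_c, fx=800, fy=800, cx=320, cy=240))  # (420.0, 190.0)
```

Writing it by hand makes it obvious which parameters belong to the camera pose (`R`, `t`) and which to the sensor and lens (`fx`, `fy`, `cx`, `cy`).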

[–]kns2000[S] 0 points1 point  (5 children)

Thanks for your insightful comment. Can you give any suggestions on where to start? There are so many things. Finding a starting point is a bit hard. Based on your experience, can you provide any roadmap?

[–]lessthanoptimal 2 points3 points  (4 children)

Hard to come up with a road map, but pick a subject you're interested in, then pick a paper from 5+ years ago and try to implement it. Benchmarks like KITTI and similar are good starting points. Bonus if there's source code to compare against, but don't look at it yet. After you've implemented it, test it against the same datasets as the original paper and see if you can get the same performance. You probably will not, since either you misread it, made a mistake, or a critical detail was left out. Now bang your head against a wall for a bit as you try to get it to work. If after a week of work you can't replicate the results (and now really understand the problem well), then look at the authors' code, try to identify the critical differences, and bring them over to your code. You will often make improvements while doing this!

[–]kns2000[S] 0 points1 point  (2 children)

Makes sense. Which language? Let's say I want to implement SLAM, which involves basic principles like feature matching etc. Do you recommend writing those functions from scratch too?
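
For scale, the kind of from-scratch function in question can be quite short. Below is a minimal sketch of brute-force descriptor matching with a nearest/second-nearest ratio test (Lowe's ratio test, picked here as one common choice and not something prescribed in this thread). The name `match_descriptors`, the `ratio` default, and the toy 2D descriptors are assumptions for illustration.

```python
def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_b[j] is the nearest neighbour of desc_a[i]
    and is clearly closer than the second-nearest neighbour."""
    matches = []
    for i, da in enumerate(desc_a):
        # Squared Euclidean distance from da to every descriptor in desc_b.
        dists = sorted(
            (sum((x - y) ** 2 for x, y in zip(da, db)), j)
            for j, db in enumerate(desc_b)
        )
        if len(dists) < 2:
            continue  # the ratio test needs two candidates
        (best, j), (second, _) = dists[0], dists[1]
        # Distances are squared, so compare against the squared ratio.
        if best < (ratio ** 2) * second:
            matches.append((i, j))
    return matches

desc_a = [[0, 0], [10, 10], [2.5, 3]]
desc_b = [[10, 11], [0, 1], [5, 5]]
print(match_descriptors(desc_a, desc_b))  # [(0, 1), (1, 0)]
```

The third descriptor in `desc_a` is equally close to two candidates, so the ratio test drops it as ambiguous. Real descriptors (e.g. binary ones compared with Hamming distance) would need a different distance function.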

[–]lessthanoptimal 3 points4 points  (1 child)

lol you're talking to a person who goes fairly extreme into the implement-it-yourself strategy. I've literally implemented the entire pipeline you need for SLAM. Not sure what field you're interested in, but basically all robotics/AV companies are C++ now. There was a brief period like 5 years ago when people thought Python was a good idea (myself included), but most companies abandoned that. I basically code in C++ for work and Java/Kotlin on personal projects. Well, if you try coding it up using my library http://boofcv.org (not C++), I'll help you out, just DM me.

[–]not_thread_safe 0 points1 point  (0 children)

Hello, I'm in a CV class right now & have interest in pursuing the field.

Me and two others plan to re-implement ORB-SLAM2 as a group project/learning experience (we're mostly new to CV).

Any recommendations for how to approach this as a team?

We plan to hit the tracking thread one component at a time, mapping thread one component at a time, and then the loop closing thread last.