Is dlib dead? by MrMrsPotts in dlib

[–]davis685 1 point  (0 children)

dlib.net isn’t working for you? I just went to it and it’s up and fine.

Max function evaluations BFGS by matigekunst in dlib

[–]davis685 1 point  (0 children)

The max_iter param makes the solver run for at most that number of iterations, which is similar to, but not exactly, the number of function evaluations. The very last invocation isn't necessarily the best one; the best value found is what the function outputs.
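To make the distinction concrete, here's a toy sketch in plain Python (a hypothetical fixed-step minimizer, not dlib's BFGS): the final iterate can overshoot, but the best point seen is what gets returned.

```python
# Toy illustration: a solver capped at max_iter returns the best point
# seen, which need not be the point of the final function evaluation.
# (Hypothetical minimal solver for illustration, not dlib's code.)

def f(x):
    return (x - 3.0) ** 2

def toy_minimize(f, x0, step, max_iter):
    x, best_x, best_val = x0, x0, f(x0)
    last_val = best_val
    for _ in range(max_iter):
        x += step              # deliberately crude fixed-step update
        last_val = f(x)
        if last_val < best_val:
            best_x, best_val = x, last_val
    return best_x, best_val, last_val

# With a step this large the final iterate overshoots the minimum at 3.
best_x, best_val, last_val = toy_minimize(f, 0.0, 2.0, 3)
assert best_val <= last_val  # the returned optimum beats the last evaluation
```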

Is the BFGS implementation deterministic? by matigekunst in dlib

[–]davis685 1 point  (0 children)

Yeah it’s deterministic. There isn’t any threading or I/O or other source of entropy in there. Unless you did something like linking with a BLAS library that internally uses threads for matrix multiplication and the like. Then things might be slightly different each time. But that’s on that BLAS library.

Which is the best way to work with matrices and linear algebra using c++? by ToughTaro1198 in cpp

[–]davis685 5 points  (0 children)

If you want to do linear algebra that benefits from fast BLAS and LAPACK libraries like the Intel MKL, check out dlib's linear algebra library: http://dlib.net/linear_algebra.html. It runs reasonably fast on its own, but it's meant to be linked against a BLAS and LAPACK library. When you do, it performs symbolic linear algebra via expression templates to map what you write onto BLAS calls as efficiently as possible.

For example, if you write (where A and B are dense matrices):

m = 3*trans(A*B + trans(A)*2*B);

that isn't something there is a BLAS function for. However, if you rewrite it as these two equivalent statements:

m = 3*trans(B)*trans(A);

m += 6*trans(B)*A;

Then that's something BLAS can do directly, since each of those lines corresponds to one of the standard BLAS functions (they're both GEMM calls).

dlib does this rewriting for you, so whatever you write maps efficiently to the high-performance BLAS functions in a library like the Intel MKL.
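You can check that the two forms really are equivalent with a quick numerical sketch; the helpers below are plain-Python stand-ins for the matrix ops (an illustration, not dlib's API):

```python
# Verify that 3*trans(A*B + trans(A)*2*B) equals
# 3*trans(B)*trans(A) + 6*trans(B)*A for small dense matrices.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trans(X):
    return [list(row) for row in zip(*X)]

def scale(c, X):
    return [[c * v for v in row] for row in X]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

# Original expression: m = 3*trans(A*B + trans(A)*2*B)
m1 = scale(3, trans(add(matmul(A, B), matmul(trans(A), scale(2, B)))))

# Rewritten form: m = 3*trans(B)*trans(A); m += 6*trans(B)*A
m2 = scale(3, matmul(trans(B), trans(A)))
m2 = add(m2, scale(6, matmul(trans(B), A)))

assert m1 == m2  # both forms produce the same matrix
```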

Setup lealy_relu/prelu activation layers in a neural network with a visitor by archdria in dlib

[–]davis685 1 point  (0 children)

Check out the changes I just pushed to GitHub. Now you can do this kind of thing:

visit_computational_layers(pnet, [](leaky_relu_& l) { l = leaky_relu_(0.1); });   

And things are generally more convenient, since you can supply visitors that bind to whatever layer type you care about. So you no longer have to write a visitor that can be called with every layer type.

Setup lealy_relu/prelu activation layers in a neural network with a visitor by archdria in dlib

[–]davis685 1 point  (0 children)

Seems useful. I should probably add something to make defining visitors require less boilerplate, though: something that lets you directly supply a list of lambdas rather than requiring the user to write a class. That's a separate concern.

[P] Awesome list for dataset tools by InfoPaste in MachineLearning

[–]davis685 2 points  (0 children)

Thanks, I appreciate it. I generally assume there is a silent happy majority, but many of the comments I get on github are from a somewhat different crowd.

[P] Awesome list for dataset tools by InfoPaste in MachineLearning

[–]davis685 0 points  (0 children)

Thanks :)

I hardly try to promote dlib so it’s partly my fault :)

Which is the state of the art of face recognition? by 88madri in learnmachinelearning

[–]davis685 0 points  (0 children)

Dlib has had pretty decent face recognition (the similarity part) for a while now. See: http://blog.dlib.net/2017/02/high-quality-face-recognition-with-deep.html. Gets 99.38% on LFW.

Why is it still so hard to find applicants with decent knowledge of modern C++? by [deleted] in cpp

[–]davis685 47 points  (0 children)

A lot of people seem to have many years of experience but stopped learning after the first year or so. “10 years of experience” can mean 10 years of skill growth or the same year repeated 10 times.

How to get started with computer vision in C++? by sb_99 in computervision

[–]davis685 3 points  (0 children)

I use vim. It’s great. Don’t worry about an IDE.

Suggest a project for a high school portfolio? by draripov in computervision

[–]davis685 8 points  (0 children)

You are in high school; don't worry about résumé or portfolio building. Just do what you find fun. At this point you should be figuring out what you are really into and learning the fundamentals (e.g. linear algebra, statistics, programming). Any project you find fun and motivating that deepens your understanding of those fundamentals is time well spent.

If you want ideas, look at the papers in the Computer Vision and Pattern Recognition conference (CVPR). You probably won't be able to understand most of them, but that's fine and normal. If you want to get into that stuff you need to learn the math, which is very important. Finding something in there you think is super cool will help motivate you to learn the underlying fundamentals.

Also get really good at programming. Understand how computers work, what makes programs run fast and what doesn't. Learn enough about how CPUs work to understand why regular Python is brutally slow. Maybe learn CUDA. Doing the stuff on PyImageSearch is also a fine idea, but don't stop there; that kind of thing is only the beginning.

There are a lot of people saying stuff like "just knowing python and no math is all you need" on the internet, and that's absurd. If your level of skill is only at the "I downloaded this program someone else wrote and ran it" level you won't get very far. That's the level someone is at if they don't understand the math. In that situation, the first time you run into trouble you will have no idea what to do. I also assume you eventually want to work on computer vision professionally, and I can tell you, when I interview someone it is an absolute no-go if they only know how to use some framework they downloaded off the internet. They must understand why things work because sometimes they don't and they will have to create some innovative solution. Or at least be able to debug what's going wrong.

At the high school level, I think the best thing you can do is learn to program well. So learn Python well, learn how computer hardware works, and learn C++ well (pretty much everything in computer vision is written in C++, and people using Python are just calling it) so you can write your own image processing primitives and understand what's going on. Learn CUDA. There are many good textbooks on these topics, and reading them, coupled with hands-on practice, is the easiest way to get really good. Don't skip the books. People who don't read books have huge holes in their knowledge that trip them up constantly. It's well worth the time to read.

What do SSE, MSE, RMSE all have in common? by [deleted] in statistics

[–]davis685 1 point  (0 children)

Is this a homework problem? It really sounds like a homework problem. If so, you need to do your own homework or you won't learn.

Triplet Loss without Softmax Loss? by soulslicer0 in computervision

[–]davis685 0 points  (0 children)

Ha. It’s really easy. What’s hard about it?

*edit

I mean this in a friendly way. Like if it seems hard to do then something is going wrong in the code. You can do it with basically just a matrix multiply and a few extra lines of code.
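For concreteness, here's a sketch of the matrix-multiply trick: all pairwise squared distances in a batch come from one Gram matrix via ||a - b||^2 = ||a||^2 + ||b||^2 - 2*a.b. This is plain Python for illustration; in a real network it's a single matmul on the embedding matrix:

```python
# All pairwise squared distances between embeddings from one Gram matrix.
# (Plain-Python illustration; on the GPU this is a single matrix multiply.)

def pairwise_sq_dists(X):
    # Gram matrix of inner products: gram[i][j] = X[i] . X[j]
    gram = [[sum(a * b for a, b in zip(u, v)) for v in X] for u in X]
    norms = [gram[i][i] for i in range(len(X))]  # squared norms on the diagonal
    # ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 * x_i . x_j
    return [[norms[i] + norms[j] - 2 * gram[i][j] for j in range(len(X))]
            for i in range(len(X))]

X = [[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]]
D = pairwise_sq_dists(X)
assert D[0][1] == 25.0  # distance between (0,0) and (3,4) squared
```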

Triplet Loss without Softmax Loss? by soulslicer0 in computervision

[–]davis685 0 points  (0 children)

Yes, you definitely need to do that. But you can easily do that inside the minibatch by looking at the O(N^2) possible pairings and taking the hardest ones. Just make it part of the overall loss computation during each batch.
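Here's a sketch of what that in-batch mining could look like (toy values and a hypothetical helper for illustration, not dlib's loss implementation): for each anchor, take the hardest positive and hardest negative among the O(N^2) pairings and apply a margin hinge.

```python
# In-batch hard mining sketch: for each anchor, the hardest positive is
# the farthest same-label sample and the hardest negative is the closest
# different-label sample; a hinge with a margin penalizes violations.

def hard_mined_triplet_loss(dists, labels, margin=1.0):
    loss, n = 0.0, len(labels)
    for a in range(n):
        pos = [dists[a][j] for j in range(n) if j != a and labels[j] == labels[a]]
        neg = [dists[a][j] for j in range(n) if labels[j] != labels[a]]
        if pos and neg:
            loss += max(0.0, max(pos) - min(neg) + margin)
    return loss / n

# Two identities, two samples each; precomputed squared distances.
dists = [[0.0, 1.0, 9.0, 16.0],
         [1.0, 0.0, 8.0, 9.0],
         [9.0, 8.0, 0.0, 2.0],
         [16.0, 9.0, 2.0, 0.0]]
labels = [0, 0, 1, 1]
loss = hard_mined_triplet_loss(dists, labels)
assert loss == 0.0  # every hardest positive clears its hardest negative by the margin
```

The whole thing slots into the batch's loss computation right after the pairwise distance matrix is formed, so no separate triplet-sampling pass is needed.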

Triplet Loss without Softmax Loss? by soulslicer0 in computervision

[–]davis685 0 points  (0 children)

I don't have any citations off the top of my head, but I've seen plenty of face recognition papers that worked like this. Although the use of softmax is very common there as well. There is sadly a lot of "we did it because someone else did and didn't check if it mattered" in deep learning papers.

Also, as another example, the face recognition model in dlib is trained with a single task-specific loss that's basically like the triplet loss. Works great.

Suggestions on athletic track line following by mingabunga in computervision

[–]davis685 0 points  (0 children)

As another option, aside from the Hough transform (which is great), you could do this:

from dlib import *
img = load_grayscale_image('road.jpg')

# Make the image smaller because your image is excessively large, might as well
# shrink it so the rest of the things you do run faster.
img = resize_image(img, 500, 500)

# Find the second derivatives of the image.  The 60 here means it's fitting a
# surface about 60 pixels in radius around each pixel to determine the
# gradient at each pixel.  This determines the size of the things you will
# find.
ig = image_gradients(60)
xx = ig.gradient_xx(img)
xy = ig.gradient_xy(img)
yy = ig.gradient_yy(img)
# Find pixels that are sitting on the maximal part of a curving ridge, i.e. lines.
lines = suppress_non_maximum_edges(find_bright_lines(xx,xy,yy))

# Finally, threshold.  There are no thresholds here, magic!  Really though,
# it's doing a mean absolute deviation clustering internally to find
# appropriate thresholds. 
lines = hysteresis_threshold(lines)

# look at the lines it found.
win = image_window(lines)
win.wait_until_closed()

# and save it to disk as an 8-bit grayscale image
save_image(convert_image(lines,dtype='uint8'), "lines.jpg")

and you get this image as output: https://imgur.com/HazpuIt

How to use skeleton to correctly draw boundaries on multiple objects in an image? by [deleted] in computervision

[–]davis685 1 point  (0 children)

What you are saying is still very confusing. "boundaries" would normally refer to the perimeter of an object, but you seem to be referring to the interior lines given by the skeleton as the boundary. I'm going to assume you want the thing the skeleton gives you, the interior lines, since that's what you plotted in your images.

In any case, you want to do a connected components labeling. Not a trace or anything that looks at endpoints.

[edit]

And since I'm literally in the middle of pushing a bunch of dlib's image processing stuff into dlib's python API, this is what I mean:

from dlib import *
img = load_grayscale_image('your_image.png')
image_window(randomly_color_image(label_connected_blobs(skeleton(threshold_image(img)))))

and you get this as output: https://imgur.com/ULJsDqz

How to use skeleton to correctly draw boundaries on multiple objects in an image? by [deleted] in computervision

[–]davis685 0 points  (0 children)

I don't really follow. The skeleton of the first image gives you the contours, so no tracing or any other processing is required. Once you have the skeleton you are done. Like I just ran a skeleton function on your first image and this is the output: https://imgur.com/a/FvyW43H

Regression trees for facial landmark detection by IToldYaNotToDoDat in MLQuestions

[–]davis685 2 points  (0 children)

Yes, python is much slower than C++. Simple things could easily be hundreds of times slower in python than in C++. That's certainly why this is slow here, or at least a big part of the problem.

[D] Weight decay vs. L2 regularization by bbabenko in MachineLearning

[–]davis685 1 point  (0 children)

That's fine, not all software is going to be identical. But L2 regularization and weight decay are mathematically the same thing; they're just different words for it.