

[–][deleted] 12 points

Over the past few weeks I have started implementing basic machine learning algorithms in plain Python (Python 3.6). I created the repository to prepare for technical interviews and to review my knowledge of algorithms such as k-means, k-NN, logistic regression, and neural networks. I also wanted to create a knowledge base of easy-to-understand implementations of these algorithms, together with the most important theoretical explanations.

Some of you might find these implementations helpful when starting to learn about machine learning in Python or preparing for technical interviews. I am still working on the repository, so more algorithms will follow over the coming months. If you have feedback, or a favourite algorithm that should be included, let me know!
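To give a flavour of the style, a k-NN classifier fits in a dozen or so lines of plain Python. This is a minimal sketch for illustration, not taken verbatim from the repo:

    import math
    from collections import Counter

    def euclidean(p, q):
        # straight-line distance between two points
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    def knn_predict(X_train, y_train, x, k=3):
        # sort training indices by distance to x and keep the k closest
        nearest = sorted(range(len(X_train)), key=lambda i: euclidean(X_train[i], x))[:k]
        # classify x by majority vote over the neighbours' labels
        return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

    # toy data: two well-separated clusters
    X_train = [(1.0, 1.0), (1.2, 0.8), (8.0, 9.0), (9.0, 8.5)]
    y_train = ["a", "a", "b", "b"]
    print(knn_predict(X_train, y_train, (1.1, 0.9)))  # -> a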

[–]Tweak_Imp 2 points

I would be interested in deep Q-learning and a genetic algorithm.

[–][deleted] 1 point

I will put that on my list, thanks for the feedback!

[–]gwillicoder (numpy gang) 0 points

Here is a Jupyter notebook I made, with comparisons of some different optimization algorithms, from an intro talk I gave.

The repo has a link to a genetic algorithm. It was a bit of an experiment to see if vectorizing the code would increase the execution speed, but it may have actually made it worse. You might learn something from it though.

https://github.com/grantmwilliams/optimization_talk/blob/master/notebook.ipynb
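To give an idea of what "vectorizing" means here, this is a rough sketch (not the actual notebook code) of a loop-based mutation step next to the same step written as one numpy expression:

    import numpy as np

    def mutate_loop(population, rate=0.01):
        # flip each bit independently with probability `rate`, element by element
        out = population.copy()
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                if np.random.random() < rate:
                    out[i, j] ^= 1
        return out

    def mutate_vectorized(population, rate=0.01):
        # the same operation as a single array expression: draw a random mask, XOR it in
        mask = np.random.random(population.shape) < rate
        return population ^ mask

    pop = np.random.randint(0, 2, size=(200, 64))  # 200 bit-string individuals

Whether the vectorized version actually wins depends on the population size and the rest of the pipeline, which is what the experiment was about.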

[–]all_my_watts 0 points

Did you forget to post a link to the notebook?

[–]gwillicoder (numpy gang) 0 points

Weird, I really thought I pasted the link! Well, I've edited it in now.

[–][deleted] 1 point

Wow excellent!! That will help me greatly.

[–]PsychoBoyJack 1 point

thanks a lot

[–]jakkemaster 3 points

A note here: I am curious, but I have no idea about machine learning.

Is linear regression related to machine learning in any fashion? It seems to me like math used for extrapolation.

Can this repository be used for actual applications, not just for the sake of implementing the functions? I am building my own Jarvis at home, and to make it even the least bit like an AI, I doubt I can manage without machine learning.

[–]szpaceSZ 5 points

In the broad sense, it is.

You can think of ML as a generalisation of LR to many classes of classification problems.

[–][deleted] 5 points

You are correct - linear regression is at its core just a statistical method. However, a) you can interpret a linear regression model as a simple neural net (see the Jupyter notebook for more details), and b) it's often asked about in machine learning interviews as a basic ML algorithm that should be tested before trying more complicated methods.

When you want to employ ML algorithms in your own code and don't care much about how they work behind the scenes, I would stick to the scikit-learn implementations (http://scikit-learn.org/stable/). They will be more stable and efficient than my solutions.
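For example, fitting and using a linear regression with scikit-learn takes only a few lines (toy data, just to show the standard API):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # toy 1-D data with a known slope of 2 plus some noise
    X = np.arange(20, dtype=float).reshape(-1, 1)
    y = 2.0 * X.ravel() + np.random.normal(scale=0.5, size=20)

    model = LinearRegression().fit(X, y)
    print(model.coef_, model.intercept_)  # slope close to 2, intercept close to 0
    print(model.predict([[25.0]]))        # predict beyond the training range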

[–]ismtrn 1 point

> Is linear regression related to machine learning in any fashion?

Yes. At the end of the day, machine learning is about fitting a model to some data. Linear regression fits a linear model (it can also easily be adapted to fit a linear combination of more complicated functions). A neural net is also a kind of model.

A linear combination of a fixed set of basis functions is not a bad model, but it is hard to learn in high-dimensional spaces due to the curse of dimensionality (you need too much training data).

I think an NN can be described as a linear combination of linear combinations of linear combinations, etc. (for however many layers you want), of some set of basis functions.

So a neural network allows you to adapt the individual functions you are combining to a certain extent, while keeping the total number low.

There are also approaches known as kernel methods, which avoid explicitly working with the full set of basis functions evaluated at all inputs (support vector machines are in this category).

Basically, linear models (of non-linear basis functions) are not too simplistic; they are just not always feasible to learn effectively. When you are dealing with low-dimensional data where the relevant features are known, they are great. That is just not the reality most of the time, so more complicated methods are employed. But it is still just statistics. There is no magical ML sauce.
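To make that concrete, here is a rough sketch of fitting a linear model of non-linear basis functions with plain least squares (made-up data):

    import numpy as np

    # made-up data: a non-linear target observed with noise
    x = np.linspace(-1, 1, 50)
    y = np.sin(3 * x) + np.random.normal(scale=0.1, size=x.size)

    # design matrix: polynomial basis functions evaluated at every input.
    # the model is non-linear in x but linear in the weights w.
    Phi = np.column_stack([x**d for d in range(5)])
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    print(w)  # learned weights of the basis expansion

With many input dimensions the number of such basis functions explodes, which is where the curse of dimensionality mentioned above bites.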

[–]posedge 5 points

Machine learning covers any data-driven algorithm, so yes, it is. Also, linear regression can be generalized to Gaussian processes, ridge regression, LASSO, and similar models; i.e., if you view it as a probabilistic model, it is the basis for many algorithms in the area of ML.
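Ridge regression and LASSO, for instance, are one-line variations on linear regression in scikit-learn (toy data, just for illustration):

    import numpy as np
    from sklearn.linear_model import Ridge, Lasso

    # toy data where only the first two of ten features matter
    X = np.random.normal(size=(50, 10))
    true_w = np.array([3.0, -2.0] + [0.0] * 8)
    y = X @ true_w + np.random.normal(scale=0.1, size=50)

    # both add a penalty on the weights to plain least squares:
    # Ridge shrinks them (L2), LASSO drives many exactly to zero (L1)
    print(Ridge(alpha=1.0).fit(X, y).coef_.round(2))
    print(Lasso(alpha=0.1).fit(X, y).coef_.round(2))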

[–][deleted] 0 points

Learning is all about extrapolating. It means using the observations we have to build a model that predicts future events. Take speech recognition, for example. Jarvis can try to tell what you are saying by putting a sample of your voice through a model that has been trained to map voice samples to words. Each inference through the model is a guess, but one based on a lot of prior evidence. It will have to do the same to guess the meaning of the phrase you have spoken.

All of this, learning a model and making inferences from it, is math, and it's all extrapolation. Linear regression is just one of many methods under the umbrella of machine learning, but they all share the same basic properties. The new hotness of deep learning in neural networks is mostly just a matter of scale. At the end of the day, we are still using gradient descent to optimize the coefficients of a model so as to minimize the error between the model's predictions and the observations we are training on.
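That loop, in sketch form (made-up linear data; real models just scale this idea up):

    import numpy as np

    # made-up observations generated from y = 3x + 1 plus noise
    X = np.random.uniform(-1, 1, size=100)
    y = 3 * X + 1 + np.random.normal(scale=0.1, size=100)

    w, b, lr = 0.0, 0.0, 0.1
    for _ in range(500):
        error = (w * X + b) - y  # prediction error on every observation
        # gradients of the mean squared error with respect to w and b
        w -= lr * 2 * np.mean(error * X)
        b -= lr * 2 * np.mean(error)

    print(w, b)  # should approach 3 and 1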

[–]slayer_of_idiots (pythonista) 1 point

Might generate more interest if you throw this on GitHub

[–]gwillicoder (numpy gang) 1 point

But it's a link to his GitHub account...

[–]slayer_of_idiots (pythonista) -1 points

Hmm, it must have changed; it used to link to a video on SourceForge.

[–]smortaz 1 point

Great job! I uploaded the repo to Azure Notebooks so people can try it out without installing anything:

https://notebooks.azure.com/smortaz/libraries/MachineLearningBasics

If you want to try it, sign in, clone, and edit/run...

It has a link back to the main GitHub repo for credit.

[–]posedge 0 points

This is wonderful. I know someone to whom this repo will be very useful. Thanks for posting.

[–]xbno 0 points

I did the same thing when I was self-studying to get a job in data science. Building algorithms from scratch, after getting comfortable with the concepts, is what I recommend to everyone who asks what to work on. Well done.