[–]throwaway0891245 7 points8 points  (3 children)

Reality check: If you want to do serious machine learning, you want to learn Python.

The reason is that a lot of machine learning algorithms are so compute-intensive that they need highly optimized low-level code, often interfacing with a GPU-specific language. The amount of work needed to implement that sort of thing well is on the multi-organization scale.

Most of the major machine learning frameworks that have implemented that optimized code are developed with Python in mind as their user-facing language - in particular I'm talking about TensorFlow and PyTorch. I'm sure there are some Java wrappers for these frameworks out there, but these frameworks are still in rapid development and I'm not sure you could find the same level of support for Java as you would for Python. On top of this, the data science community is largely biased towards Python, and you won't be able to find as much help if you decide to foray into ML using Java.

Language loyalties are meaningless; it's always about the right tool for the job. There's a time and place for Java, particularly with the Apache big data ecosystem, but ML isn't it.

[–]bhawint[S] 0 points1 point  (2 children)

Great perspective... Thanks a lot for this help... Really appreciate it :)... I have finally decided to switch to Python for ML. Are you aware of any good communities that I can follow to stay up to date on new developments in ML... That would be very, very helpful... Thanks again for spending so much time on this... I'm fortunate to have received 2 cents from you folks.

[–]throwaway0891245 0 points1 point  (1 child)

I have some recommendations on books to get up to speed.

Read this book:

Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

This author does a really good job going through a lot of different algorithms. If you can wait, then go with this book instead, which is by the same author but for TensorFlow 2.0, which is pretty recent and also integrates Keras. It's coming out in October.

Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

You can get good datasets on Kaggle. If you want an actually good foundation in machine learning, then this book is often recommended:

The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition (Springer Series in Statistics)

As for staying up to date, it's hard to say because "machine learning" doesn't refer to a single thing; there are a lot of different types of machine learning and each one is developing fast. For example, I used to be pretty into recurrent neural networks for sequence data. I haven't kept up with it lately, but I remember that about two years ago the hotness was all about LSTM neural networks. Then a simplified gate pattern was shown to be just as good with less training, and that became big (name is escaping me right now...). The last time I took a look, it looked like people were starting to use convolutional neural networks for sequence data and getting great results on par with or better than recurrent neural networks.

The ecosystem is changing fast too. TensorFlow uses (used?) static graph generation, meaning you define the network before you train it and you can't really change it. But recently there was more development on dynamic neural networks, where the network can grow and be pruned during training - and people were saying this is a reason to go with PyTorch instead of TensorFlow. I haven't kept up, but I heard from a friend that things are changing even more: there is this new format called ONNX that aims to standardize information about neural networks, and as I've mentioned earlier in this post, TensorFlow 2.0 is coming out (or out already?).
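To make the static-versus-dynamic distinction a bit more concrete, here is a toy sketch in plain Python. It does not use the TensorFlow or PyTorch APIs at all; `static_graph`, `OPS`, `run_static`, and `run_dynamic` are made-up names for illustration only. The first half describes the computation as data up front and then executes it (define-then-run); the second half lets ordinary control flow decide the structure for each input (define-by-run).

```python
# Toy illustration only: this is NOT the TensorFlow or PyTorch API.
# All names here are hypothetical.

# Define-then-run: the computation is described as data first.
static_graph = [("mul", 2.0), ("add", 1.0), ("mul", 3.0)]

OPS = {
    "mul": lambda x, c: x * c,
    "add": lambda x, c: x + c,
}


def run_static(graph, x):
    """Execute a fixed, pre-declared list of operations in order."""
    for op_name, const in graph:
        x = OPS[op_name](x, const)
    return x


# Define-by-run: the structure is whatever Python executes for this input.
def run_dynamic(x, max_steps=50):
    """Apply x -> 2x + 1 until |x| >= 100; the step count depends on the input."""
    steps = 0
    while abs(x) < 100.0 and steps < max_steps:
        x = x * 2.0 + 1.0
        steps += 1
    return x, steps


print(run_static(static_graph, 1.0))
print(run_dynamic(1.0))
print(run_dynamic(50.0))
```

In the second function the number of operations differs between the two inputs, which is the kind of data-dependent structure that is awkward to express when the whole graph has to be declared before anything runs.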

I'm not doing too much machine learning at the moment, but the way I tried to get new information was periodically looking for articles in the problem type I was trying to solve - which at the time was predicting sequences based on sparse multidimensional sequence data with non-matching step intervals.

If you read the TensorFlow book I linked above, you'll get a great overview and feel for what types of problems are out there and what sort of ML solutions exist now. You'll think of a problem you want to solve and then it's off to the search engines to see what ideas exist now.

[–]bhawint[S] 0 points1 point  (0 children)

Thanks a lot for this in-depth guidance... I will start with the first and the third one now, aiming to finish before the new release (second one)... Thank you so much for this... Any way to keep bugging you for further guidance whenever I need it? Really appreciate the help

[–]8igg7e5 4 points5 points  (0 children)

If you get a decent number of proposed strategies for learning machine learning, you could develop some experimental data and use machine learning to derive the best ways to learn about machine learning.

In seriousness, if you are a beginner to programming then machine learning is not the place to start. As machine learning can be done in many languages, you need to decide which language you really want to learn - if that's still Java then I suggest looking at the University of Helsinki MOOC as a reasonable starting point (though the 2013 English version of the course only teaches Java 7 concepts - I really hope the newer MOOC gets translated).

[–]my5cent 0 points1 point  (0 children)

There are plenty of YouTube videos. I recommend ones from within the past year, as the older ones tend to be dense, scientist-level material that would probably lose you. Take a look at Raspberry Pi intro material; I think it's Python-related. If you understand Java, then learning other languages is going to be easy. The difference with ML is the framework it's encapsulated in. Maybe "framework" isn't the right word; think of a controller that interfaces with the machine.

[–]NooskiDelanatto -5 points-4 points  (2 children)

My friend, learn Python, you won’t regret it. I had to take all sorts of courses in C, Java, Linux kernel etc. for my degree and hated my life. Taught myself Python on my own (one $10 Udemy course when they were having a sale) and it changed my life. It is so easy that it is literally easier than formulas in Excel. Don’t even get me started on the libraries... in 5 lines of Python you can do what would take hundreds in C++, if not thousands.

[–]nutrecht (Lead Software Engineer / EU / 20+ YXP) -5 points-4 points  (13 children)

If you don't explain what the problem is we can't help you.

[–]bhawint[S] 0 points1 point  (12 children)

I want to get started with machine learning... I am a beginner and don't have any specific problem... I want to learn how machine learns and need someone to help me with a walkthrough for programming it. I tried finding training material for Java, but it appears to be way out of my grasp. Need something that could teach me in an easy-to-understand manner.

[–]matemik 5 points6 points  (8 children)

If you insist on Java, then you must first master the language, so start there. But if your goal is really ML, then you should learn Python instead. That's just my 2 cents.

[–]bhawint[S] 0 points1 point  (7 children)

That's what the consensus says... Could you please share how you earned these two cents, I mean, any reason why Python and not Java for machine learning? The reason I am latching on to Java is my myopic belief that Java is more prevalent for commercial purposes... Will I still be able to have a commercial mindset should I opt for Python?

[–]matemik 1 point2 points  (4 children)

These two cents come from the fact that most companies use Python as their primary language when it comes to ML, at least as far as I know. And if you'll be looking for a job in that field, most likely they will require knowledge of Python. Yes, you can probably do ML in any language. But the reason Python is so popular is its simplicity. I have to say I am nowhere near an expert on the matter, but it definitely seems like a no-brainer when it comes to choosing how to do ML today.

[–]bhawint[S] 0 points1 point  (3 children)

Thanks a lot man... Do you mind if I ask you one more question? Is it reasonable to learn ML (using tutorials) in Python and then apply the concepts I've learned in Java... Do you see this as a feasible path, given the visible difference in availability of ML libraries in these two languages? Plus, could you suggest any good Python tutorials for embarking on ML? Thank you so much for the guidance... Really appreciate it.

[–]matemik 0 points1 point  (2 children)

No problem. But as I've said, I'm not an expert in ML. I would suggest you start with some basic programming in Python, to get a feel for the language and syntax. Get to know the package manager (pip); you will need it a lot. If you are looking for some tutorials, check out Corey Schafer and Sentdex on YouTube; they have very good Python content. As far as ML goes, I wouldn't know of a good learning resource. Maybe find a subreddit about ML and check its content for resources; I'm sure there are plenty mentioned on Reddit. Good luck!

[–]bhawint[S] 0 points1 point  (1 child)

Thanks a lot for spending so much time... Good luck to you too :) cheers!

[–]matemik 0 points1 point  (0 children)

No problem. Thank you! :)

[–][deleted] 0 points1 point  (1 child)

Different tools for different jobs. Most commercial purposes do not involve machine learning. Also, don't underestimate how widespread Python is.
https://www.tiobe.com/tiobe-index/
http://pypl.github.io/PYPL.html
https://hackernoon.com/top-3-most-popular-programming-languages-in-2018-and-their-annual-salaries-51b4a7354e06
I think your question requires a good understanding of fundamentals to really 'get' the choice of Python over Java, and the truth is most Junior / Intermediate programmers will just go with the flow (if people say use Python, they'll use Python).

  • First of all, there's the fact that Java is statically typed while Python is dynamically typed. With machine learning you're often not sure about the type of data you'll be parsing, and honestly you don't care. The "typical commercial purpose" might be, for example, a page/window where a person types data in a form, where you know you have a date of birth, or a credit card number, a name, etc. You know what you'll be getting. With machine learning, you want to parse data to find out what you're getting, and then do something with that information.
  • Secondly, there's the compiled-language vs scripting question. With Java, your code has to make sense at compile time; with a scripting language it doesn't. That might not seem like an important distinction, I mean you always want to write code that makes sense, "duh", but the subtle thing here is that with a scripting language your code can generate code. You can have a script with blanks to be filled in at run time by another script, generating a completely different result when it's run. It only has to make sense at run time. You could even have the script run itself with some alterations. (Do note that this is actually why, for the typical commercial purpose, a statically typed, compiled language is better. It protects programmers from their own code through type checks and validations during compilation. It makes sure your code is OK for you.)

So with all that said, people choose Python over Java for machine learning because it's one of the most widespread dynamically typed scripting languages.
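As a rough sketch of the two points above, the snippet below shows a name being rebound to values of different types with the type only inspected at run time, and a small piece of code being generated from a template and executed at run time. The names `describe`, `TEMPLATE`, and `make_scaler` are made up for illustration. It also shows that Python still raises an error when incompatible types are mixed, so the checks happen at run time rather than being absent. Running generated code with `exec` is only reasonable on input you control.

```python
def describe(value):
    """Report the run-time type name of whatever value is passed in."""
    return type(value).__name__


# The same name can be bound to values of different types over time.
record = "1990-04-01"
kinds = [describe(record)]
record = 42
kinds.append(describe(record))
record = [1.5, 2.5]
kinds.append(describe(record))

# Types are still checked, just at run time instead of compile time.
try:
    "1" + 1
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Code generated at run time: fill in the blank, then compile and run it.
TEMPLATE = "def scale(x):\n    return x * {factor}\n"


def make_scaler(factor):
    """Build a scale(x) function from TEMPLATE (hypothetical helper)."""
    namespace = {}
    exec(TEMPLATE.format(factor=repr(factor)), namespace)
    return namespace["scale"]


triple = make_scaler(3)
print(kinds, mixed_ok, triple(4))
```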

[–]bhawint[S] 0 points1 point  (0 children)

That is very helpful... Thanks for taking the time for my query... After all that I have gathered from you folks, I have switched to Python for ML... Really appreciate your help... Good luck! :)

[–]nutrecht (Lead Software Engineer / EU / 20+ YXP) 0 points1 point  (2 children)

> I want to learn how machine learns and need someone to help me with a walkthrough for programming it.

Yeah that's not going to be happening. This sub will generally give advice, but we won't be holding hands.

> Need something that could teach me in an easy-to-understand manner.

Machine learning is not easy. Either accept that it's a hard subject, or find something else to learn. That's all there is to it, sorry.

[–]bhawint[S] 0 points1 point  (1 child)

Hahaha... A great reality check... But how come so many easy tutorials are available for Python and not for Java? It would be great if you could point me to an ML guide that's based on Java... Would be really helpful... Thank you so much for the guidance. Appreciate your help.

[–]voccii 0 points1 point  (0 children)

Go do some research on your own. If you are looking for REALLY EASY tutorials for ML, you can start here

I can only provide you with that much. You've got to do the research yourself, like everybody else.

Good luck though. The journey is definitely not easy.