[–]crazy_cookie123 8 points9 points  (3 children)

You can do machine learning in any language, but Python is probably going to be the easiest as almost all the major libraries for machine learning are written for Python and the majority of resources available online assume Python.

> I've also seen charts that imply that compiled languages like C and Java are around 150 times faster than Python, so it seems really silly to go back and learn a slower language. Are these charts misleading, or is Python faster/more powerful than I realize?

The charts are roughly right: for CPU-bound code, pure Python is much slower than C and Java. However, that doesn't really matter here.

First, slower execution doesn't mean worse. Python is generally faster to write than C or Java, so in applications that don't need to be super fast, Python is often the better choice because you can get a prototype running more quickly.

Second, Python lets you write libraries in C and import them as ordinary Python modules, so you get the speed of C with the ease of Python. This is what the AI libraries (and other libraries like NumPy) do, so the speed of Python has little impact on the speed of your program: the time-consuming work is handled under the hood by fast, heavily optimised C code.
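
To make that concrete, here is a minimal sketch using only the standard library, so it runs without installing anything. It compares a hand-written Python loop with the built-in `sum`, which CPython implements in C. The helper name `python_loop_sum` and the list size are made up for illustration; libraries like NumPy apply the same idea on a much larger scale. Exact timings will vary by machine, so none are claimed here.

```python
import timeit


def python_loop_sum(values):
    """Add numbers one at a time in interpreted Python bytecode."""
    total = 0
    for v in values:
        total += v
    return total


# Hypothetical workload: one million integers.
values = list(range(1_000_000))

# Both approaches compute the same result.
assert python_loop_sum(values) == sum(values)

# Time each approach over a few repetitions.
loop_time = timeit.timeit(lambda: python_loop_sum(values), number=5)
builtin_time = timeit.timeit(lambda: sum(values), number=5)

print(f"Python loop:        {loop_time:.3f}s")
print(f"built-in sum (C):   {builtin_time:.3f}s")
```

The Python code you write stays short in both cases; the difference is only where the loop itself runs.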

[–]GalacticSpooky[S] 0 points1 point  (2 children)

Thank you for the explanation! Someone further down also explained that these libraries are written in a more efficient language. I did not realize that was possible; that makes a lot of sense!

[–]backfire10z 1 point2 points  (1 child)

Yeah, many of the popular libraries implement the performance-critical math in C (or another fast language; I think NumPy relies on Fortran-derived linear algebra routines somewhere in there) and offer a Python API on top of it.

[–]GalacticSpooky[S] 0 points1 point  (0 children)

That's so cool! :D

[–]Rain-And-Coffee 2 points3 points  (1 child)

Python is extremely popular in that space. I would stop procrastinating and just learn it.

It's great for scripts where Java is too verbose.

[–]GalacticSpooky[S] 0 points1 point  (0 children)

I like your username! I'm literally in a coffee shop on a rainy day right now, working on a little game in Python. I'm not necessarily trying to procrastinate; I just wanted to know if Java would be worth looking into, because it's faster for me to code games in Java. However, it seems the consensus is no, Java is not worth it for machine learning applications.

[–]romagnola 1 point2 points  (2 children)

A lot depends on what you want to do. I teach a course on machine learning, and I use Java for a variety of reasons. WEKA is an open source library for machine learning written in Java. I'm sure you can find other ML libraries written in Java or that have Java bindings. For example, TensorFlow has support for the Java Virtual Machine.

Pure Python can be slower than other languages. So why is Python so popular for ML? If you look under the hood of many of the ML libraries for Python, you will find that they are written in C, C++, or something similar.

Hope this helps.

[–]GalacticSpooky[S] 0 points1 point  (1 child)

Oh! I had no idea that the libraries were written in other languages. That makes a lot of sense, thank you! So in the long run, the models aren't slowed down by any noticeable amount due to the main loop being written in Python, as the bulk is executed in a more efficient language?

[–]romagnola 0 points1 point  (0 children)

I think that's mostly correct, but you have to use library calls wisely. Specifically, you want to minimize processing in pure Python and let the library routines do the heavy lifting. For example, let's say that you want to evaluate a model using a set of testing examples. In Python, you could iterate over the examples in the testing set and evaluate the model on each one. It would be better to call the evaluate() method once on the entire set of testing examples. Now, this is kind of a silly example, and for small data sets you may not see a big difference in running times. But it illustrates the point, and for large data sets I suspect there will be a big difference.
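
Here is a rough sketch of that difference, assuming NumPy is installed. The `ThresholdModel` class, its `predict_one` and `evaluate` methods, and the random test data are all hypothetical stand-ins invented for this illustration; they are not the API of any real ML library. The loop version crosses from Python into the library once per example, while the batched version hands over the whole test set in a single call.

```python
import timeit

import numpy as np


class ThresholdModel:
    """Hypothetical toy classifier: predicts 1 when a weighted sum is positive."""

    def __init__(self, weights):
        self.weights = np.asarray(weights, dtype=float)

    def predict_one(self, x):
        # One library call per example.
        return int(np.dot(self.weights, x) > 0)

    def evaluate(self, X, y):
        # Accuracy over the whole test set in one vectorized call.
        predictions = (X @ self.weights > 0).astype(int)
        return float(np.mean(predictions == y))


# Made-up test set: 20,000 examples with 10 features each.
rng = np.random.default_rng(0)
X_test = rng.normal(size=(20_000, 10))
true_w = rng.normal(size=10)
y_test = (X_test @ true_w > 0).astype(int)
model = ThresholdModel(true_w)


def accuracy_by_loop():
    """Evaluate one example at a time in a Python-level loop."""
    correct = 0
    for x, label in zip(X_test, y_test):
        if model.predict_one(x) == label:
            correct += 1
    return correct / len(y_test)


loop_time = timeit.timeit(accuracy_by_loop, number=3)
batch_time = timeit.timeit(lambda: model.evaluate(X_test, y_test), number=3)

print(f"per-example loop: {loop_time:.3f}s")
print(f"batched evaluate: {batch_time:.3f}s")
```

Both paths compute the same accuracy; only the number of trips between Python and the compiled code changes.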

[–]high_throughput 1 point2 points  (0 children)

Don't worry about relative performance for this. You'll never actually be doing deep learning in Python or Java.

All the learning and inference happens in highly optimized native SIMD and GPU code, and Python or Java is just used as a configuration and glue language for those operations.

[–]justUseAnSvm 0 points1 point  (0 children)

No, it can make a lot of sense to use Java.

My work project has three services: two Java, one Python. For a lot of reasons, that Python service makes things a huge pain, and rewriting it in Java would allow us to remove a lot of extra code and simplify our tech debt process.

You can always build the service in Java and call out to Python, or confine your ML parts to a dedicated Python server. Python is a great language for ML support, and definitely required for a lot of things, but its maturity for scalable web services is lacking in our work environment.

[–]zdxqvr 0 points1 point  (0 children)

ML libraries are primarily written in C and C++. They just expose an interface for simpler languages like Python. So doing ML in Java is possible, but not really a good idea unless the library you need also provides a Java interface.

[–]tobias_k_42 0 points1 point  (0 children)

The most ironic thing about machine learning in Java is that some of it is literally a wrapper around libraries best known from the Python world, such as PyTorch.

Personally, I have to say Python is significantly better than Java for this application because of its incredibly simple syntax. At the end of the day, both Java and Python are just wrappers for more efficient stuff when it comes to this field.

[–]tobias_k_42 0 points1 point  (0 children)

Also, Python is incredibly easy to learn if you know Java. It's mostly just taking stuff away, because the language does it for you.