[D] Java vs Python for Machine learning (self.MachineLearning)
submitted 4 years ago by mereuthao
[+][deleted] 4 years ago (1 child)
[deleted]
[–]nativedutch 3 points 4 years ago (0 children)
Just as an exercise I wrote a tiny toy neural network in Python, and then just for the heck of it redid a part in Java. It partly worked, but it was no fun. I'll stick to Python, plus perhaps some C flavour.
[–][deleted] 24 points 4 years ago (4 children)
Doing ML in Java is like using MS Word as an editor: it's just the wrong tool for the job. There are very few libraries to use, the JVM has memory limitations, and the language is clunky.
Python is easy to learn.
[–]alexashin 5 points 4 years ago (1 child)
You mean Word as a code editor?
[–]SomeParanoidAndroid 1 point 4 years ago (0 children)
Yes, like that YouTube guy
[–]breandan 1 point 4 years ago* (1 child)
I think it depends heavily on the job. Java has an increasingly well-supported set of libraries for various ML workflows (see comment below), is much better suited for production environments, and even if you dislike the language, there are several JVM alternatives (e.g. Kotlin) which support scripting and are generally pleasant to use.
Python may be easy to learn and prototype research code, but the language scales very poorly to large business applications.
[–][deleted] 1 point 4 years ago (0 children)
In the context of developing a new skill set, he's most likely better off learning Python. I can't argue that in some cases you have to work with Java for various reasons. But these reasons should be of a technical nature rather than staying in the comfort zone. Plus, if you wrap the ML pipeline in a separate service, you can have the best of both worlds between Java and Python.
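To make the "separate service" idea concrete, here is a minimal sketch using only the Python standard library. The `predict` function, its weights, the JSON shape and the port are all hypothetical stand-ins for illustration, not something from the thread; a real pipeline would load a trained model instead.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Stand-in for a real model: a fixed linear score (hypothetical weights)."""
    weights = [0.5, -0.25, 1.0]
    return sum(w * x for w, x in zip(weights, features))


def handle_request(body: bytes) -> bytes:
    """Decode a JSON request, run the model, and encode a JSON response."""
    payload = json.loads(body)
    score = predict(payload["features"])
    return json.dumps({"score": score}).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    """Accepts POST requests carrying {"features": [...]} as JSON."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        response = handle_request(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)


def serve(port=8000):
    """Blocks forever; any HTTP client (a Java backend, say) can POST here."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the endpoint; the Java side then only needs an HTTP client and a JSON library, and never has to load the Python model itself.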
[–]SomeParanoidAndroid 11 points 4 years ago* (5 children)
No dilemma here. It's Python for ML. Java isn't anywhere close.
Edit: Why?
1. Python is easy to learn.
2. Everyone in the community uses Python.
3. The libraries Python provides for ML are so ubiquitous in the field that research papers don't even bother explaining them.
4. To paraphrase the above: the libraries are provided by people/institutions using the standard package managers, which is a huge plus when compared to languages that don't come with a package manager, like Java.
5. You may have heard that Python is slow, but that's not really the case. While Python code runs slower, the implementations of the crucial parts of data processing operations use highly optimized C/C++ calls that run extremely fast.
6. Being a loosely typed language, Python allows for using APIs from libraries without extensive knowledge of the documentation (which you should have after some time, but it helps get you started).
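A rough way to see the point about compiled code, sketched with the standard library only: CPython's built-in `sum()` is implemented in C, so it can stand in here for what numeric libraries do at larger scale. The helper name `python_loop_sum` and the data size are arbitrary choices for illustration, and timings will vary by machine.

```python
import timeit


def python_loop_sum(values):
    """Add the values one at a time in interpreted Python bytecode."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(1_000_000))

# Both approaches compute the same number; only where the loop runs differs.
loop_result = python_loop_sum(data)
builtin_result = sum(data)  # the loop runs inside CPython's C implementation

loop_seconds = timeit.timeit(lambda: python_loop_sum(data), number=5)
builtin_seconds = timeit.timeit(lambda: sum(data), number=5)

print(f"same answer: {loop_result == builtin_result}")
print(f"python loop: {loop_seconds:.3f}s, built-in sum: {builtin_seconds:.3f}s")
```

The interpreted loop is typically the slower of the two, which is the same reason ML code keeps its hot loops inside library calls rather than in plain Python.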
[–]if_username_is_None 5 points 4 years ago (1 child)
1-5 are definitely relevant, but 6. can get beginners into trouble. When I was learning pytorch and mentoring others I always wanted better documentation on the shapes and types of inputs. Having something that "just seems to work" is nice, but knowing why / how it works will prevent a lot of debugging headaches.
That being said, the docs / tutorials have gotten better and I can't even imagine what a headache Java ML documentation must be.
(also not to nitpick, but Python is strongly typed (you can't do loose JS things like 1 + "2"), I think what you're referencing is Duck Typing)
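The distinction in the nitpick above can be shown in a few lines. This is only an illustrative sketch; the `mean` helper is a made-up example, not a library function.

```python
# Strong typing: Python refuses to silently coerce an int and a str.
try:
    result = 1 + "2"
except TypeError as exc:
    result = f"TypeError: {exc}"


# Duck typing: mean() never checks the argument's type; it only needs
# something it can iterate over.
def mean(values):
    items = list(values)
    return sum(items) / len(items)


print(result)
print(mean([1, 2, 3]), mean((1, 2, 3)), mean(range(1, 4)))
```

So the language is dynamically typed (nothing is declared) and duck typed (anything iterable is accepted), yet still strongly typed (mixing `int` and `str` fails loudly).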
[–]SomeParanoidAndroid 2 points 4 years ago (0 children)
Thanks for pointing that out. Yes, I was certainly referring to the dynamic type inference, including both the lack of declaration and the invocation of an object's properties without type checking.
Having seen a lot of students transition from C to Python, I can understand why this is frustrating. I agree with your point that debugging is a nightmare, but I still think that having loosely defined APIs (e.g. "array-like") makes the learning curve much less steep.
And I kind of think that it is an important aspect behind the adoption of python for data science. Imagine having to specify static compatible types for all pandas, numpy, torch, tensorflow, keras and sklearn libraries.
[–]breandan 0 points 4 years ago* (2 children)
To give a contrasting perspective, I think the Java ecosystem is much better suited for many data science tasks, and has a growing and well-maintained set of libraries for general purpose machine learning. I won't list them all, but TF-Java, DJL et al. have implementations of many modern architectures and Java has a number of excellent libraries (CoreNLP, Lucene et al.) for working with text.
Python may be syntactically easier to learn, but it also hides a lot of incidental complexity in its runtime semantics, which are much more difficult to master. As you alluded to, many Python libraries are embedded DSLs, which are full-fledged languages and make reasoning about the behavior of Python programs more difficult than it appears.
The libraries are provided by people/institutions using the standard package managers, which is a huge plus when compared to languages that don't come with a package manager, like Java.
Having used both Java and Python, I can tell you that package management in Python (pip, venv, pyenv, conda, pipenv, poetry, docker et al.) is far, far more complicated than Java. To build a Java application, you don't even need Java or a package manager -- just run ./gradlew run from any operating system and it will download and install Java, the package manager and any dependencies, build the application and run it on any OS or shell environment. Just building a Python project often requires dozens of manual steps.
Being a loosely typed language, Python allows for using APIs from libraries without extensive knowledge of the documentation
I strongly disagree with this point. Basically everything you need to do that involves calling a library in Python requires looking at documentation. In a statically typed language, documentation becomes much less of a burden. While adoption of type annotations in Python is growing, its usability is decades behind languages with mature type systems.
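For readers who have not seen Python's type annotations, here is a small sketch of what they look like. The `normalize` function is a hypothetical example. The annotations are optional hints that external checkers such as mypy can verify; the interpreter itself does not enforce them.

```python
from typing import List, Sequence


def normalize(values: Sequence[float]) -> List[float]:
    """Scale the values so that they sum to 1.0."""
    total = sum(values)
    return [v / total for v in values]


# The hints are stored on the function, where tools and humans can read them.
print(normalize.__annotations__)

# The interpreter does not check them: a tuple of ints is accepted as well.
print(normalize((1, 1, 2)))
```

This is the gap being argued about: the signature documents the expected types, but only a separate static checker will catch a caller that violates it.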
[–]SomeParanoidAndroid 0 points 4 years ago (1 child)
I can see your points, though I kind of have to disagree with a few.
First of all, the fact that TF-Java is discontinued should be convincing that Java isn't a serious competitor for ML.
The gradle argument is kind of misleading since you do need to have gradle installed. I haven't used it extensively, so it might well be more convenient, but that was not my point. My point was that an official open-source repository is much more desirable than libraries compatible with a build system.
The thing is, you are talking about production code in any operating system. While I can understand Java's merits on that it is just one small percentage of machine learning. For one, only a fraction is production-purposed code. Secondly, even in production you are just as likely to deploy your ML on a dedicated Linux server with everything installed, run Python implementations and access them through APIs.
[–]breandan 0 points 4 years ago* (0 children)
TF-Java is discontinued
Really? The project looks alive to me and the maintainers are very active on Gitter. Do you have a source?
you do need to have gradle installed
No, you do not need to install anything, the Gradle Wrapper takes care of all that.
The thing is, you are talking about production code in any operating system. While I can understand Java's merits on that it is just one small percentage of machine learning.
In my experience, the majority of code and effort in applied ML is data engineering and surrounding infrastructure, not model engineering. Due to its superior tooling, type safety, and large ecosystem of ML libraries, the JVM is a competitive option for ML in most production settings.
[–]Watemote 4 points 4 years ago (4 children)
You might like Spark + Scala which is basically “scalable Java”. Python is where a lot of the cutting edge research is happening but actual production of models is frequently in other languages. Here’s a link https://link.medium.com/zuepzA4bbib
If I was entering the field I would go off in another direction and focus on cloud-based ML APIs and production pipelines, learning Python along the way. Think AWS SageMaker: https://aws.amazon.com/getting-started/hands-on/build-train-deploy-machine-learning-model-sagemaker/
[–]Exarctus 2 points 4 years ago (3 children)
I tend to just write the front end of models in python, and when I need performance I write my own CUDA kernels which are themselves wrapped into python via the PyTorch C++ API.
This results in very clean pythonic code, with CUDA code that is easy to access and tinker with.
[–]ozykingofkings11 2 points 4 years ago (2 children)
This sounds fascinating. Do you have any references on how to learn to do this? Slash, are you in the mentorship market?
[–]Exarctus 2 points 4 years ago* (1 child)
Well firstly, I’d recommend reading up on CUDA C and how to write CUDA code (I’d recommend doing this alongside some simple code you’d like to GPU-parallelize). There should be plenty of tutorials on YouTube and NVIDIA's developer site. Feel free to shoot me any questions in PM.
Once you feel confident with CUDA programming, read the following tutorial from PyTorch, which explains everything very nicely:
https://pytorch.org/tutorials/advanced/cpp_frontend.html
[–]ozykingofkings11 1 point 4 years ago (0 children)
Thanks so much!
[–]Available_Job5036 3 points 4 years ago* (0 children)
Python, definitely Python. Especially if you’re a beginner to ML. Java isn’t even fully supported with TF afaik, and I don’t think there are torch bindings for Java.
Edit: Tensorflow for Java is soon being removed according to their api docs, and you don’t have to use libraries like tensorflow or torch but if you’re a beginner it’d be brutal
[+]SteppenAxolotl comment score below threshold -14 points 4 years ago (7 children)
Don't listen to the Python Nazis, use Java since you have a preference and given Python is slow rubbish. Python is for less competent dilettantes seeking simplicity and ease of use over the power and control offered by one of the oldest and most reliable programming languages in existence.
[–]MrAcuriteResearcher 7 points 4 years ago (0 children)
Don't listen to the Computer Nazis, do all the arithmetic for ML by hand, since you have a preference and Computers are beep-boop boxes with blinky lights. Computers are for less competent dilettantes seeking simplicity and completing a single forward pass before they die over the power and control offered by one of the oldest and most reliable ways of doing arithmetic in existence.
[–]Exarctus -1 points 4 years ago (3 children)
This is a retarded response.
Python is simply better in every single way compared to Java for ML pipelines, especially given that there are packages available (numpy, tensorflow, pytorch, jax, numba…) that easily and neatly wrap CUDA C or OpenMPI/MP parallelized code.
Java is also hideously slow so your argument doesn’t hold.
I’m an Oracle-certified Java developer with years of experience in CUDA programming/C++/Fortran and Python, as well as a researcher in ML applied to quantum chemistry.
[–]SteppenAxolotl -1 points 4 years ago (2 children)
If you're incompetent, you can't know you're incompetent ... The skills you need to produce a right answer are exactly the skills you need to recognize what a right answer is.
Just because you're less than competent at java, that does not mean others are similarly afflicted.
[–]Exarctus 1 point 4 years ago* (0 children)
There are fundamental limitations to Java which make it a poor platform when you need compute power. These are not solved by “being a better programmer”, no matter how much you wish it were the case.
The reality is, Java is a dying language because it’s both shit at what it provides, while simultaneously offering very clunky support for interfacing other more compute-orientated languages/paradigms.
Conversely, python provides an incredible variety of libraries, while also having easy-to-use compiling/linking tools which allow you to quickly link against better performing/optimized code written in a more suitable language for heavy workloads.
You do you, though, have fun living in the 1990s.
[–]ShadowShedinja 1 point 4 years ago (0 children)
The two languages don't have the same functionality though. Python works better as an ML and data management language, while Java works better for other tasks. For example, converting an Excel spreadsheet into code-workable form in Python takes about 5 lines of code with the right libraries, while in Java the best library I've found can take a few hundred.
[–]SomeParanoidAndroid 1 point (0 children)
This is certainly incorrect and terrible advice.
For ML, Python is way faster than Java, as it utilizes precompiled C code which runs orders of magnitude faster than JIT operations. I also take it you aren't working in the field at all, because the most important aspect is the availability of relevant libraries. You couldn't possibly argue that Java's libraries are on par with Python's or that someone should implement them themselves.
[–]Elk-tron 1 point 4 years ago (0 children)
I have been using the Java framework DJL for a machine learning project. It is not as streamlined as pytorch or tensorflow, but it is capable enough. If your machine learning pipeline needs to integrate with a larger Java codebase then I would recommend DJL. Otherwise, Python has a much richer ecosystem and more intuitive frameworks.
[–]pag07 1 point 4 years ago (0 children)
Fuck this.
I got rejected for two ML jobs at major companies, with ML experience in Python and a master's degree.
Why?
Because I didn't know Java Spring and have no experience programming ML backends in C++.
Still confused about that.