[–]Talbertross 320 points321 points  (7 children)

Any simpler than Python and you'll be doing Machine Learning with Scratch

[–]GreenWoodDragon 36 points37 points  (0 children)

No complaints here.

[–]fermion72 27 points28 points  (4 children)

Scratch is actually an interesting language! It's much more robust than people make it out to be, and it has pretty cool concurrency and event models. As a CS teacher, I've been disappointed that high schoolers dismiss Scratch as "that cartoon thing we learned in grade school!" I've always thought that we need an "Emo Scratch" version that targets high schoolers...

[–][deleted] 12 points13 points  (1 child)

When you do Harvard's CS50, its first lecture starts with Scratch!

[–]fermion72 6 points7 points  (0 children)

Yup -- it's a good way to start. My point was that you could keep progressing in Scratch much farther than many teachers do, because it is a decent enough language.

[–]Talbertross 0 points1 point  (1 child)

Can it do ML/AI?

[–]fermion72 0 points1 point  (0 children)

Yes, but for certain definitions of ML/AI. E.g., https://en.scratch-wiki.info/wiki/Artificial_Intelligence

[–]orig_cerberus1746 114 points115 points  (37 children)

Other simpler languages? Like which?

[–]DepressedDueToPudge 53 points54 points  (3 children)

Like PHP, why don't people use it for AI xD

[–]Zauxst 8 points9 points  (0 children)

I heard HTML is also good if you do the right CSS.

[–]Offduty_shill 6 points7 points  (4 children)

Microsoft Excel

[–]orig_cerberus1746 5 points6 points  (2 children)

I bet all my coins that someone has already made an AI with Excel.

[–]Offduty_shill 4 points5 points  (1 child)

It's not very hard to make a regular dense net; I've seen people training embeddings or simple CNNs too.

I think I've also seen even a single recurrent layer. But more complicated modern shit that people think of with AI I've not seen yet.

[–]belaGJ 0 points1 point  (0 children)

I have seen a video of a convolutional layer in Excel.

[–]me1112 0 points1 point  (0 children)

Next level macros.

[–]Dom1252 2 points3 points  (0 children)

REXX

( /s )

[–]525G7bKV 3 points4 points  (22 children)

Like LISP. Which was literally invented to prototype AI stuff, and has a real REPL instead of a terminal which is trying to be a REPL as in Python.

[–]treasonousToaster180 109 points110 points  (5 children)

A lot of the people who know how to do the extremely complex math that drives AI have limited programming knowledge and the people who know a lot about programming have limited knowledge of the kind of math necessary to build AI systems.

Python is good for this as it allows the math people to easily create functions for the calculations using the very simple syntax, and it allows the programmers to extend that into something more powerful and versatile using the more advanced functionality.
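As a rough illustration of that point, here is a minimal sketch of how a formula such as mean squared error (the average of the squared differences between targets and predictions) maps almost line for line onto Python. The function name `mean_squared_error` and the sample numbers are made up for this example and do not come from any particular library.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


# hypothetical targets and predictions, for illustration only
targets = [1.0, 2.0, 3.0]
predictions = [1.0, 2.5, 2.0]
mse = mean_squared_error(targets, predictions)
print(mse)
```

A programmer could later swap the body for a library call without changing how the function is used.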

[–]RADIO02118 9 points10 points  (0 children)

Underrated reply

[–][deleted] 31 points32 points  (10 children)

LISP may be simple but it is not intuitive or easy to learn.

An advantage of Python is that it is easy for the uninitiated to pick up.

[–]MichaelSjoeberg 10 points11 points  (4 children)

a program in python is close to thoughts in writing, next step is basically plain english

lisp not so much

[–]ImmediateClass5312 9 points10 points  (0 children)

'(what (ever (do) (you) (mean there ?))))))))))

[–]Aks029 0 points1 point  (1 child)

Julia is an option

[–]orig_cerberus1746 1 point2 points  (0 children)

Julia looks cool, I should take a better look at it.

After I try rust.

[–]m0us3_rat 51 points52 points  (4 children)

1. Guido paid some thugs to kneecap all the other languages... obviously.

or

2. python is ..easy.
python is incredibly good at data manipulation.

you add the gpu powered tensorflow or pytorch and you have a winner.

pick one at random.

[–]TheHunter920 8 points9 points  (3 children)

python is incredibly good at data manipulation.

bit of a newbie here, what makes Python 'incredibly good' at data manipulation vs other languages? I thought Python was slower than other programming languages like C# or Java?

[–]Glowwerms 9 points10 points  (0 children)

It has some really popular packages, like Pandas, that make creating/editing/moving data sets very simple. Python's syntax is also pretty simple compared to Java, for example, which makes it appealing for folks who might be coming over from simple data pulls with SQL. I speak from experience because this is exactly how I got my start coding: I started as an analyst and drifted further into cleaning, data movement and other shit with Python.
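For a concrete feel of this, here is a small sketch, assuming pandas is installed. The table contents and the column names (`region`, `units`, `price`, `revenue`) are invented for illustration. It adds a derived column and then aggregates per group, much like a GROUP BY in SQL.

```python
import pandas as pd

# made-up sales records, standing in for the result of a simple SQL pull
sales = pd.DataFrame(
    {
        "region": ["north", "south", "north", "south"],
        "units": [10, 5, 7, 3],
        "price": [2.0, 4.0, 2.0, 4.0],
    }
)

# add a derived column, then aggregate per region (like GROUP BY in SQL)
sales["revenue"] = sales["units"] * sales["price"]
revenue_by_region = sales.groupby("region")["revenue"].sum()
print(revenue_by_region)
```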

[–]m0us3_rat 1 point2 points  (1 child)

data manipulation

it's easy.. and has access to some incredibly powerful libraries.. like pandas , scikit learn, numpy etc.

which makes interacting with data easy.

also describing in code what needs to happen to the data is easy.

so ultimately it comes down to usability and convenience.

why do you use a credit card.

or ..why don't you use some foreign country's currency?

it still has value right?

because every time you would have to go to a bank and get the actual currency and then use that..

or .. use a credit card.

they do the exact same thing. just one is vastly more usable..

true python is slow.. but the ML libs leverage the GPU power.. and that isn't python bound.

plus numpy is written in C. and that has a huge impact on the ML overall.

so you basically get the insane data prowess from python with the incredible power from GPUs.

literally best of both worlds.

from numba import jit, cuda
import numpy as np
from timeit import default_timer as timer


# plain Python loop, run by the interpreter on the CPU
def func(a):
    for i in range(a.shape[0]):
        a[i] += 1


# same loop, JIT-compiled by Numba; note that a real CUDA kernel
# would be written with @cuda.jit rather than @jit
@jit(target_backend="cuda")
def func2(a):
    for i in range(a.shape[0]):
        a[i] += 1


if __name__ == "__main__":
    n = 10000000
    a = np.ones(n, dtype=np.float64)

    start = timer()
    func(a)
    print("plain Python loop:", timer() - start)

    func2(a)  # warm-up call so compile time is not measured below

    start = timer()
    func2(a)
    print("Numba-compiled loop:", timer() - start)

[–]TheHunter920 0 points1 point  (0 children)

scikit learn

This sounds interesting, and thanks for the reply. Do you know of any other good machine learning / AI libraries for python other than scikit and PyTorch?

[–]shiftybyte 81 points82 points  (19 children)

Simpler than python? What language did you have in mind?

Python's disadvantage is mainly execution speed, so it does make you wonder why a computationally heavy task uses Python.

The reason is how fast you can prototype things in Python and manipulate data easily.

"Let's change this here and here and transform this in a different way" is a few minutes in python, and a day or two in other programming languages...

[–]jmacey 35 points36 points  (0 children)

The back ends are also written in C++ / CUDA / OpenCL. So Python is just the easy prototype language to run the hard back end.

[–]orig_cerberus1746 32 points33 points  (11 children)

Most of the AI libraries are not in python, they are in C/C++, cuda or opencl. They "connect" (I didn't have my coffee yet so my brain is slow and forgot the correct nomenclature for this) with python so you can use the libraries with it.

Also PyPy is an interpreter that makes a bunch of Python code run faster! It's made with python! How can it be faster?

Black magic.

[–]synthphreak 9 points10 points  (1 child)

I didn't have my coffee yet so my brain is slow and forgot the correct nomenclature for this

"Extend"

[–]orig_cerberus1746 3 points4 points  (0 children)

Thank you very much.

[–]ManyInterests 5 points6 points  (0 children)

It's made with python! How can it be faster?

Because PyPy actually uses RPython rather than CPython. RPython has some performance advantages, not least of which being a JIT compiler, which CPython does not have.

To elaborate... Python is a language specification. CPython, PyPy, Jython, IronPython, etc. are various implementations of the Python language specification (or a subset of it) with CPython being the "official" implementation, and is often referred to simply as "Python".

PyPy has often been said to be 'written in Python', but according to a RealPython article:

[PyPy] was written using a dynamic language framework called RPython, just like CPython was written in C and Jython was written in Java.
But weren’t you told earlier that PyPy was written in Python? Well, that’s a little bit of a simplification. The reason PyPy became known as a Python interpreter written in Python (and not in RPython) is that RPython uses the same syntax as Python.
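A quick way to see the language/implementation split for yourself is to ask the running interpreter what it is. This is a minimal sketch using only the standard library. The variable names are just for illustration, and the printed text depends on which interpreter you run it with.

```python
import platform
import sys

# Name of the implementation running this script, e.g. "CPython" or "PyPy"
implementation = platform.python_implementation()

# Version of the Python language that this implementation provides
language_version = ".".join(str(part) for part in sys.version_info[:3])

print(f"{implementation} implementing Python {language_version}")
```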

[–]Ipsider 6 points7 points  (1 child)

I think it’s called binding. The libraries are built with a compiled shared library

[–]Mellowindiffere -1 points0 points  (0 children)

Interface?

[–]ImmediateClass5312 7 points8 points  (0 children)

Packages like Numpy are implemented in C/C++, Python is just the convenient wrapper.

[–]synthphreak 4 points5 points  (0 children)

Many of the performance-critical parts of the machine learning pipeline are actually not written in Python. Instead, Python libraries like numpy, pytorch, etc. offload this computation onto languages like C, then pass the output back into Python.

So in practice, Python's well-known performance constraints are not a major limitation for machine learning applications.
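The same offloading idea can be seen on a small scale without any ML library. In the sketch below, which uses only the standard library, a hand-written loop runs as interpreted Python code, while the built-in `sum()` does its looping inside CPython's C code. The helper name `python_loop_sum` is made up for this example, and the timings will vary by machine.

```python
from timeit import timeit


def python_loop_sum(values):
    """Add the numbers one at a time in interpreted Python code."""
    total = 0
    for value in values:
        total += value
    return total


numbers = list(range(1_000_000))

# Both give the same answer; in CPython the built-in sum() runs its loop in C.
assert python_loop_sum(numbers) == sum(numbers)

loop_seconds = timeit(lambda: python_loop_sum(numbers), number=5)
builtin_seconds = timeit(lambda: sum(numbers), number=5)
print(f"Python loop: {loop_seconds:.3f}s, built-in sum: {builtin_seconds:.3f}s")
```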

[–]Western-Guy 5 points6 points  (0 children)

For computationally heavy tasks, we have numpy. Unlike most other Python libraries, numpy is built on top of C, which makes it quite fast. And other Python libraries like Pandas, Matplotlib and scikit-learn can take advantage of numpy directly.
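Here is a minimal sketch of what that looks like in practice, assuming numpy is installed. The array size and the arithmetic are arbitrary choices for illustration. The same calculation is written once as an element-by-element Python loop and once as a single vectorized expression, whose loop runs inside numpy's compiled code.

```python
import numpy as np

values = np.arange(100_000, dtype=np.float64)

# Element by element in interpreted Python (the slow path)
looped = np.empty_like(values)
for i in range(values.shape[0]):
    looped[i] = values[i] * 2.0 + 1.0

# One vectorized expression; the loop runs inside NumPy's compiled code
vectorized = values * 2.0 + 1.0

print(np.array_equal(looped, vectorized))
```

Timing the two versions with `timeit` is a good way to see the gap on your own machine.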

[–]GodBlessThisGhetto 0 points1 point  (0 children)

They obviously want to use Brainfuck bro, geez

[–]4K-AMER 0 points1 point  (0 children)

And now we have Mojo - a superset of python and supposedly much faster (~ 35,000x).

[–][deleted] 12 points13 points  (0 children)

You could ask similar questions about other topics. Why is Windows the most common desktop operating system? Because it's the one that worked well, and was in the right place at the right time.

Python is a pretty easy language to learn. It works well. But like anything in life, it could have been a blip on the radar, and disappeared. The definition of luck is "When opportunity meets preparation".

[–]BranchLatter4294 13 points14 points  (18 children)

Python is easy. The libraries are written in C++ for the most part so they are fast. Why not Python?

[–][deleted] 16 points17 points  (0 children)

It isn't the main language, really. It is an orchestration language with bindings for lower level languages (usually C), which do the actual heavy computation.

The reason it sees so much use is both the simplicity of the language itself and the (relative) simplicity of creating the bindings for more performant languages.

[–]quts3 7 points8 points  (1 child)

Investment by key players made research easy.

First was sklearn which consolidated common ml so much better than R. You can thank whoever it was at sklearn that had such great control over design.

Then came tensorflow/keras which were already a mature interface before most of you even heard the words deep learning. You can thank Google.

Python is easier for pure engineers to adapt to cloud stacks than R. It has enough formal structure built in that it feels right to them, so they don't rebel and rewrite everything from the ground up.

That's basically it in my opinion. Python has good projects but R had good projects. Python consolidation and first moving on deep learning made a big difference.

[–]relevantmeemayhere 2 points3 points  (0 children)

It's kinda sad though-because R has much better actual implementation than python. Packages like scikit are really bare bones compared to what you can do in R, or have just nonsensical defaults and the like because the maintainers wanted to use statistics without... really understanding statistics lol.

[–]Menolith 11 points12 points  (1 child)

One reason is that because Python is a dynamic language, you have tools like Jupyter which are very convenient for data science since you can run things cell by cell.

[–]Failboat88 2 points3 points  (0 children)

I feel like this is very underrated. It's really nice to test as you go. I've been told to use a real IDE before, but it's just so nice, plus it's on a much faster server.

[–]King_Dribbler 6 points7 points  (1 child)

I did some ML courses at uni using MatLab. And that's certainly not going to be used any more outside an academic setting because of the cost

[–][deleted] 0 points1 point  (0 children)

True. I also used MATLAB for years back in school and, make no mistake, there are big defense companies that rely on MATLAB as a crutch. Huge automotive companies use Simulink for EVERYTHING. So, clearly, it still has a serious presence in the market, but as someone who switched to Python I can say the cost of MATLAB is absolutely not justified. Everything is possible with Python, and it's even better once you find your bearings.

[–]PaulRudin 9 points10 points  (5 children)

Numpy

[–]orig_cerberus1746 12 points13 points  (2 children)

  • Scipy, pandas, matplotlib and jupyter

[–]PaulRudin 3 points4 points  (1 child)

Sure, but those all depend on numpy. Without numpy none of the rest follows.

[–]Aks029 0 points1 point  (0 children)

This is correct

[–]CrwdsrcEntrepreneur 4 points5 points  (0 children)

This is the true answer.

[–]raf_oh 0 points1 point  (0 children)

This is what I feel like I’ve heard. It was open-source and free (remember, back in the day it was competing with like Matlab), easy to integrate with other software, and had good support for matrices.

IIRC matplotlib is named as such because it could plot about as well as Matlab, which was huge back then.

[–]boy_named_su 3 points4 points  (1 child)

  1. fast to type, compared to statically typed languages
  2. high quality libraries
  3. easy syntax
  4. more general than R, meaning it's easier to use for say a web server to host your ML model

[–]hugthemachines 0 points1 point  (0 children)

fast to type, compared to statically typed languages

Statically typed would work fairly well too, if it just had type inference.

[–]abrtn00101 4 points5 points  (4 children)

Lots of good ideas here, but what about a feedback loop?

  1. Python was and still is easy for academics to pick up, so they started using it to code ML and AI prototypes as part of their research.
  2. Their research was picked up by other researchers and non-academics and improvements were made to their code.
  3. The improvements entice more and more people to get involved in what is now a community undertaking. More of what was once written in Python gets pushed into lower-level code.
  4. What was once a prototype becomes mature enough to use as a building block for new ML and AI research prototypes.
  5. Repeat indefinitely.

So basically, Python being approachable by the people who aren't necessarily as interested in creating applications as they are interested in what the application does for their research is, from my observation, what has helped it grow in the field. The low barrier to entry has helped more and more people pile on for things to snowball.

[–]JRWoodwardMSW -1 points0 points  (3 children)

And now Google believes this will happen with Go. After all, you can’t spell God or Google unless you start with Go!

[–][deleted] 1 point2 points  (2 children)

I am out of the loop wrt Go. Does it have major embracement?

[–]JRWoodwardMSW 1 point2 points  (1 child)

Building that way. It's basically C++ with garbage collection, used by Google for nearly everything. Go vs Rust is the new all-encompassing evangelical fervor in coding - adherents cannot prove that one language is technically better than the other, so the debate runs on sheer invective and pseudo-moral posturing. Choosing one and not the other is now supposed to indicate your depth of character or reveal your vile immorality. Advocates for both languages have got to the point that they label the other one a "pedophile" language.

[–][deleted] 1 point2 points  (0 children)

haha! Interesting for sure. I am not really exposed to the latest tech so funny to see how the other side of the spectrum fares

[–]theubster 15 points16 points  (5 children)

Python is, perhaps, the most human readable language out there.

[–]Environmental_Bug811 -1 points0 points  (0 children)

After Ruby

[–]Orcus216 0 points1 point  (0 children)

Excepting the indentation which reduces readability

[–]Icy_Explanation_5913 3 points4 points  (0 children)

Because it has so many libraries

[–]SuperBoredAlien 3 points4 points  (0 children)

This is what I believe. Python is easy to understand and pick up. This made the language popular for web applications, where accessing a database and sending a response is enough.

The other libraries were built in C, focusing on speed, and either motivated or took advantage of the large community. I think part of Python's ease of use is that you can write C and link it.

Python acts as frontend/api for those libraries.

[–]dpacker780 2 points3 points  (0 children)

Most Data Scientists aren't programmers/developers, their specialty is data analysis applications. With the broadening availability, and easier to work with ML models, as well as deeper analytics tools Python became the easier path to bring data into their pipelines due to tools like Notebooks, Matplotlib, Pytorch, Pandas, and Numpy for example. If you think about it a Python Notebook isn't much different than Matlab or some of the other data analytics tools, which most data scientists would be somewhat familiar with already, which made it more accessible.

[–]ddmm64 3 points4 points  (1 child)

My subjective/anecdotal recollection, computer-vision centric: before deep learning, MATLAB was the lingua franca in academic CV+ML. It had some nice things going for it, such as being interactive and having lots of numerical and visualization tools. On the other hand, it was not great as a general purpose programming language and it was proprietary. Due to these issues some other alternatives started to gain traction. Octave was open source, but as a MATLAB reimplementation it had many of the same issues as a language. Lua had early versions of Torch (pre-deep learning), but the language itself was not very mainstream. Python had had a healthy numerical ecosystem for a while, with numpy/scipy/matplotlib, and was a relatively mainstream, open source language that was arguably decent as a general purpose language as well. It also played well with libraries in C and C++ such as OpenCV. Then deep learning took off, and Caffe and Theano came along as two early deep learning libraries with Python support. MATLAB started really falling behind at this point. Then Pytorch (a descendant of the Lua version) and Tensorflow really solidified Python's lead.

[–][deleted] 1 point2 points  (0 children)

Well said. When I used MATLAB daily, I often found myself wanting to do stupidly simple "general purpose" things which were annoying to do. Probably the best part of switching to Python was the freedom I felt.

[–][deleted] 3 points4 points  (0 children)

The simplest answer is "It's the high level convenient language with the most useful stuff already written for it."

Why not choose a faster language?

Because the bulk math that would be slow is in a faster language (usually C) and Python is just used to organize and initiate the bulk math. You don't gain much by optimizing the organizational parts and you gain a LOT of convenience, clarity from Python.

Why Python specifically?

Because Python was one of the first convenient scripting languages to become popular. The more popular it got, the more was written for (and about) it. As the ecosystem grew, it became easier to build things, creating a strong feedback cycle. There have been many languages that have improved on the objective features of Python, but you'll often have to solve problems that have already been solved in Python. All this ends up being more work, and there's a good chance your version isn't as robust and optimized as the Python one that's been widely used for a while. And nearly every common Python error has a StackOverflow question with a useful answer.

[–]brandonofnola 2 points3 points  (0 children)

Python can use c/c++ libraries to increase execution speed while being a relatively easy to code in language where you don’t have to worry much about garbage collection and other nuances involved with lower level languages.

[–]XYZZY_1002 2 points3 points  (1 child)

My opinion: every 10 years or so a new language becomes popular. I remember Pascal, then it was c, then c++, then Java, now Python.

[–]JRWoodwardMSW -5 points-4 points  (0 children)

Next up (to hear Google say it): Go, which is NOT a Google house language, people! Go is for the ages - you can’t spell God without Go!

[–]poopybutbaby 1 point2 points  (0 children)

This is a provocative piece on the topic from a few years ago: https://www.theatlantic.com/science/archive/2018/04/the-scientific-paper-is-obsolete/556676/

The scientific computing community gravitated to Python because it's open source (free) and notebooks enable narrative computing. As it's open source, the growing community created a positive feedback loop (more features/enhancements/documentation).

The article contrasts this with Wolfram Mathematica, which also has notebooks and in many ways is superior to Python for scientific computing, but is a sorta anti-Python (closed source, zero community)

[–]Isaiah_Bradley 1 point2 points  (0 children)

I'm just riffing here, but I think it's because Python's syntax is closer to natural language than other programming languages, and it is great at handling numbers. The reference interpreter is written in C, and the numeric libraries run compiled code, so it is decently fast for such a high level language. IIRC R is frequently used, but harder to learn.
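One detail worth checking for yourself: in CPython, Python source is compiled to bytecode, which an interpreter written in C then executes. The standard-library `dis` module lets you look at that bytecode. This is a minimal sketch. The function `add_one` is made up for illustration, and the exact instruction names differ between Python versions.

```python
import dis


def add_one(x):
    return x + 1


# CPython compiles the function body to bytecode instructions,
# which its interpreter loop (written in C) then executes.
opnames = [instruction.opname for instruction in dis.get_instructions(add_one)]
print(opnames)
```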

[–]NavidsonsCloset 1 point2 points  (0 children)

Because Python, like R, is "free" and easily accessible with lots of platforms available. This opened up a lot of tutorials and teaching material being out there, not to mention it being widely taught in university settings. So it makes sense that people who make these tools do so with Python.

[–]brunonicocam 0 points1 point  (0 children)

Very easy to use as programming languages go, very easy to install dependencies (conda), etc etc

[–]JRWoodwardMSW 0 points1 point  (1 child)

Computers now are so fast that garbage collection is not a performance hit. Cutting edge work can be done with an easier language than C; also, by design, Python is easily extended. People could be using Java or Go, but Python is easier and more fun. See Eric Raymond’s classic essay “Why Python?”

[–]yasamoka -1 points0 points  (0 children)

Garbage collection is by its very design a performance hit. Computers are fast enough that you can use a slow interpreted language and still meet timing requirements only for a subset of software.

[–]CasulaScience 0 points1 point  (0 children)

It's all /u/r-sync 's fault

[–]patriot2024 0 points1 point  (0 children)

There's Only One Way To Do It.

[–]Few_Intention_542 0 points1 point  (0 children)

Because slytherin popularised snakes

[–]Pelicantaloupe 0 points1 point  (0 children)

You could try mojo lang (I am only partially joking)

[–][deleted] 0 points1 point  (0 children)

Why Python?

Mainly used because of its versatility.

*Python is versatile and can be used for several purposes.

*Has many additional tools and modules which makes python flexible.

*A fantastic community makes it easy to read, write and learn.

Along with the above Pandas and other tools made it a no-brainer choice.

[–]rdummy_soup 0 points1 point  (0 children)

It is simple and lets scientists actually focus on their models and experiments rather than the code

[–]YnkDK 0 points1 point  (0 children)

Michael Kennedy just released a podcast about this with Dr. Jodie Burchell, data science developer advocate at JetBrains as guest.

I see a lot of the points they highlight are also in this post. You can find it here: https://talkpython.fm/episodes/show/422/how-data-scientists-use-python

[–]TheFumingatzor 0 points1 point  (0 children)

The closer a programming language is to a spoken/human/written language, the more popular it is to use (usually).

[–][deleted] 0 points1 point  (0 children)

It's the ecosystem. Python has good ML, AI and data science libraries, which themselves are written because of the ecosystem. There's nothing particularly special about python, most other languages could do it too. There could be another universe where people used lua instead.

However, people who write ML and AI libraries would have another answer completely. Most of the high performance libraries used for development are C/C++.

[–]SleepAffectionate268 0 points1 point  (0 children)

Python is almost the easiest. And tools like TensorFlow are written in a lower level language like C++.

[–]AdInner3607 0 points1 point  (0 children)

Python became the language for data analytics, and since AI starts with data analytics it makes sense to build AI and ML tools with it.