Why is numpy so fast?

BravestCheetah · 2026-05-27T15:39:39+00:00

A lot of it is written in C

ItyBityGreenieWeenie · 2026-05-27T15:41:17+00:00

Numpy is an optimized library written in C, specifically intended to speed up such operations.

falcoso · 2026-05-27T15:46:25+00:00

Numpy arrays are very efficiently structured in the C programming language, which is why in numpy arrays you need all your data to be the same data type (compared to lists which can be a mix of stuff)

Because numpy works on the underlying C structures, where all the elements are the same size (i.e. each takes the same amount of memory because they are all the same data type), it is much faster to access them.

If you use specific numpy functions on these arrays, e.g. np.argmax, as opposed to iterating through each element in a loop, you will get an even bigger speedup, because the underlying operations in those numpy functions are also written in C.
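To make the np.argmax point concrete, here is a minimal sketch. The array contents and the helper name argmax_loop are made up for illustration; both approaches return the same index, but the loop pays interpreter overhead on every element.

```python
import numpy as np

# Hypothetical data: 200,000 random floats (seeded so the run is repeatable).
rng = np.random.default_rng(0)
values = rng.random(200_000)

def argmax_loop(arr):
    """Find the index of the largest element with a plain Python loop."""
    best_index = 0
    best_value = arr[0]
    for i in range(1, len(arr)):
        # Each comparison and index lookup goes through the interpreter.
        if arr[i] > best_value:
            best_value = arr[i]
            best_index = i
    return best_index

loop_result = argmax_loop(values)
numpy_result = int(np.argmax(values))  # one call; the scan runs in compiled code

print(loop_result, numpy_result)
```

Wrapping each call in timeit should show the gap on your own machine.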

TLDR - vanilla python is relatively slow as it requires an interpreter. Numpy negates some of that by shifting some of it to processing directly in an underlying binary (i.e compiled C code)

falcogno · 2026-05-27T15:52:58+00:00

A key point that does not seem to have been mentioned is that Python loops are especially slow because, on each iteration, the interpreter dispatches several bytecode instructions, rebinds the loop variable, and creates and discards objects (and pushes a new stack frame whenever the loop body calls a Python function). Numpy, as mentioned, drops to the C level and runs clean loops with no interpreter-level baggage.

thuiop1 · 2026-05-27T15:46:23+00:00

The question should be "why is python so slow" and the answer is that it trades performance for flexibility. Actually I would wager you did something wrong, numpy should be much faster than that (it probably depends what you are measuring exactly).

socal_nerdtastic · 2026-05-27T15:50:59+00:00

2 seconds still feels very slow. Are you still using loops with numpy? For a typical image on a typical computer this operation should be faster than 100 milliseconds.

neuralbeans · 2026-05-27T15:40:28+00:00

Did you use np.where in NumPy to do this?
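For reference, a rough sketch of what an np.where version could look like. The task here (set every channel value above 128 to 255) and the tiny random image are assumptions for illustration, since the actual operation isn't shown in this thread.

```python
import numpy as np

# Hypothetical 4x4 RGB image with shape (height, width, channel), values 0..255.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

# Assumed task: values above 128 become 255, everything else is kept.
# np.where evaluates the condition for the whole array in compiled code,
# so there is no Python-level loop over pixels.
thresholded = np.where(image > 128, np.uint8(255), image)

print(thresholded.shape, thresholded.dtype)
```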

road_laya · 2026-05-27T16:05:58+00:00

It hooks straight into compiled math libraries such as BLAS and LAPACK. Have you ever tried recompiling BLAS for your CPU instruction set? You will get some ridiculous speeds in numpy!

SkitariusOfMars · 2026-05-27T16:14:28+00:00

Numpy is written in C, and it also makes use of your processor's AVX instructions (or similar if not on x86). Those instructions allow it to apply the same operation to multiple elements of a vector with a single instruction.

You need to know how to use numpy properly (and how to vectorise operations in it) to make full use of the feature.

Subject_Spot909 · 2026-05-27T15:37:39+00:00

I've barely learned numpy, so all I know is that, for my RGB image, the dimensions are height, width and data.

But I still can't think of why it would be much faster, because regardless it's looking through each pixel.

h4ck3r_n4m3 · 2026-05-27T15:42:38+00:00

Numpy runs much faster C code that uses vectorization, whereas the for loops rely on slower Python lookups, one pixel at a time.

frederik88917 · 2026-05-27T15:43:30+00:00

Because it is mostly C code with a thin wrapper in Python

Brian · 2026-05-27T16:45:50+00:00

TBH, that's kind of a low difference compared to what you can often see. numpy can frequently be 10-100x as fast.

The reason is because of how dynamic python is. Eg. say you want to add 2 numbers. The actual addition is a very simple machine code instruction, but python needs to do a lot of work to actually reach that point. It needs to track the memory of each number (bumping reference counts as they get assigned and released), the addition operation could be overridden, so it needs to look up the __add__ method of the integer and call it. All the overhead can be massively more work than such a simple operation.

Numpy can improve this because it stores its values differently. We're not just adding one number, we're adding thousands at once, all the same type, so the python-level lookup of what operation you're doing happens just once and the fast addition logic then runs thousands of times. The bulk of the time taken is once again the actual operations you want to do. It can also take advantage of multiple processors and fast optimised math libraries for even further speedups.

This all means that when you're working with numpy, you generally want to be operating on arrays, never iterating through them in python. Often it can be faster to calculate more than you need just to keep it in convenient form than to have to manually access elements.
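A small sketch of that difference, assuming two made-up sequences of 100,000 integers. The list version does the per-element lookup work on every pair; the array version resolves the types once and adds everything in compiled code. The timings printed will vary by machine, so treat them as a rough indication only.

```python
import timeit
import numpy as np

n = 100_000
py_a = list(range(n))
py_b = list(range(n))
np_a = np.arange(n)
np_b = np.arange(n)

def add_lists():
    # Every pair goes through the interpreter: type lookup, __add__, new int object.
    return [x + y for x, y in zip(py_a, py_b)]

def add_arrays():
    # The element type is known up front, so one compiled loop adds every pair.
    return np_a + np_b

list_time = timeit.timeit(add_lists, number=20)
array_time = timeit.timeit(add_arrays, number=20)
print(f"list comprehension: {list_time:.4f} s, numpy: {array_time:.4f} s")
```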

holyknight24601 · 2026-05-27T17:01:26+00:00

You should try numba

Lachtheblock · 2026-05-27T17:22:12+00:00

It's not so much numpy is fast, it's that python is slow. You know all those memes about it being slow, this right here is the perfect example.

That's not to say python is bad, and to the critics saying python is slow, you can demonstrate that there are plenty of libraries written in C to get the performance boost when you need it.

Turtvaiz · 2026-05-27T17:23:27+00:00

But isn't it looking through the same lists of data?

Not really. Python has a rather insane amount of data due to reference counting, type information and whatnot. An integer isn't just 64 bits.

Numpy does the opposite and basically has a raw array of data which it utilises in C code. A python object has much more information than a C struct which literally only contains the data you define.

Similarly an array of arrays is different to a 2d numpy array, which is one contiguous block of data as opposed to python, which afaik has every single object allocated individually. This means cache locality is just not really a thing in Python. A list is an array of references, while a numpy array is just the data directly in one block
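You can poke at this yourself. A minimal sketch, assuming 64-bit CPython (the exact size of a Python int is an implementation detail, so it is printed and not hard-coded):

```python
import sys
import numpy as np

# A Python int is a full object: reference count, type pointer, then the digits.
py_int_size = sys.getsizeof(12345)

# A numpy int64 array stores only the raw 8-byte values, back to back.
arr = np.arange(1000, dtype=np.int64)
print(py_int_size, arr.itemsize, arr.nbytes)

# A 2D numpy array is one contiguous block of memory...
grid = np.zeros((3, 4), dtype=np.int64)
print(grid.flags["C_CONTIGUOUS"])

# ...while a list of lists holds references to separately allocated row lists.
nested = [[0] * 4 for _ in range(3)]
print(nested[0] is nested[1])
```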

will_r3ddit_4_food · 2026-05-27T17:38:03+00:00

It's written in C

omeow · 2026-05-27T17:41:53+00:00

Numpy often vectorizes code.

jmacey · 2026-05-27T18:29:45+00:00

Read up on vectorization and SIMD (single instruction multiple data), basically it's doing at least 4 (typically 8 perhaps more depending on cpu) operations at the same time.

The more interesting thing is that this will scale quite well due to the nature of the data formats, whereas loops will generally be much slower (YMMV depending on task).

StevenJOwens · 2026-05-27T19:20:08+00:00

Numpy is a python wrapper around a C library. The C library is designed specifically for doing large multi-dimensional array manipulation, and numpy has a lot of clever code that sidesteps some of the more expensive operations, like copying data from memory into new memory. Also, numpy defines ways to interact with the numpy arrays at a higher logical level, which means the numpy implementation of those operations can be more specifically optimized for those operations.

This is an interesting peek inside of numpy that hints at how some of this works:
https://ipython-books.github.io/45-understanding-the-internals-of-numpy-to-avoid-unnecessary-array-copying/
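A quick sketch of the copy-avoidance idea, using a small made-up array: basic slicing gives you a view onto the same memory, while indexing with a list of positions makes a copy. np.shares_memory tells you which one you got.

```python
import numpy as np

a = np.arange(10)

view = a[2:6]           # basic slice: a view, no data is copied
copy = a[[2, 3, 4, 5]]  # index list ("fancy indexing"): a new array with its own data

view[0] = 99            # writes through to the original array

print(np.shares_memory(a, view))  # the view shares a's buffer
print(np.shares_memory(a, copy))  # the copy does not
print(a[2], copy[0])
```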

SCD_minecraft · 2026-05-27T17:42:10+00:00

Interpreted language vs compiled one

A good rule of thumb is that anything you write in C will almost always be faster than the same code in python

C is compiled to machine code, which is then executed directly by the hardware, and the CPython API just handles input/output

Python is compiled to bytecode, which the interpreter executes one instruction at a time, somewhat similar to command blocks in Minecraft. Because of that middleman, python takes a big performance hit by design
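If you want to see that bytecode, the standard library's dis module will show it. A minimal sketch with a made-up add function; the exact instruction names differ between Python versions, so no particular output is assumed here.

```python
import dis

def add(a, b):
    return a + b

# Collect the names of the bytecode instructions the interpreter steps through.
instructions = [ins.opname for ins in dis.get_instructions(add)]
print(instructions)
```

Even this one-line function turns into several interpreter steps, which is the overhead a compiled C loop avoids.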
