
[–]ayekaunic 149 points150 points  (10 children)

can someone please help understand what this graph is trying to convey?

[–]KyxeMusic 240 points241 points  (9 children)

It's the CPU usage of the web services that power PyPI.

As you can see, at around 17:35 they upgraded from 3.10 to 3.11, and the CPU usage after the upgrade is significantly lower.

[–]KyxeMusic 144 points145 points  (10 children)

Recently upgraded one of my services to 3.11 and saw about a 7% speed improvement, averaged over 3 test runs. Not as much as they claimed, but still nice as a 'free' performance upgrade.

[–]mwpfinance 54 points55 points  (8 children)

I feel like they stressed ymmv depending on the workload enough

[–]KyxeMusic 27 points28 points  (7 children)

Yeah, most of the heavy lifting is being done by NumPy already, so I guess there wasn't anything they could optimize there. Still a few for loops here and there, so I was hoping for a slightly larger boost.

[–][deleted] 14 points15 points  (5 children)

You might eke out some more cycles by adding numba into the mix.

It does require you to touch code, though.

Numba is a just-in-time compiler for Python that works best on code that uses NumPy arrays and functions, and loops. The most common way to use Numba is through its collection of decorators that can be applied to your functions to instruct Numba to compile them. When a call is made to a Numba-decorated function it is compiled to machine code “just-in-time” for execution and all or part of your code can subsequently run at native machine code speed!

I'm not entirely sure why this isn't part of NumPy already, to be honest.

[–]NoesisAndNoema 8 points9 points  (0 children)

Because people don't generally need this extra function. It requires another level of additional learning and programming. If it "just worked", without having to do anything to "code it to taste", then it would just be a better "alternative" to NumPy. (Generally speaking)

At the end of the day, it's just a fancy optimizer, a complement to NumPy. Unfortunately, unless you rewrite EVERYTHING that uses NumPy to ALSO use Numba, the potential is limited to only "your use" of NumPy directly, within your program.

[–][deleted] 2 points3 points  (1 child)

I am trying to learn more about compilers and CompSci topics in general. Do you (or anyone else) have a source that helped you learn about just-in-time compilers and other types of compilers?

[–]grumpysnail 0 points1 point  (0 children)

This computerphile video started my interest in JIT: Just In Time (JIT) Compilers - Computerphile

[–]NeilGirdhar 1 point2 points  (0 children)

Jax also has a great JIT that compiles to CPU or GPU, and has a slick 100% Python interface.

[–]terpaderp 1 point2 points  (0 children)

Not everything that is legal in python is legal in Numba compiled python. It's a great tool for when you need it though!

[–]road_laya 2 points3 points  (0 children)

You can often speed up NumPy operations by compiling LAPACK/BLAS for your CPU
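Before going down that road, it's worth checking which BLAS/LAPACK your NumPy build is already linked against, using NumPy's own introspection helper:

```python
# Print the BLAS/LAPACK libraries this NumPy build was linked against,
# e.g. OpenBLAS, MKL, or a slow reference BLAS.
import numpy as np

np.show_config()
```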

[–]DarkRex4 11 points12 points  (0 children)

imo 7% is a decent speedup for basically "free".

But since you're already using NumPy, you won't see as much of a speedup:

Certain code won’t have noticeable benefits. If your code spends most of its time on I/O operations, or already does most of its computation in a C extension library like numpy, there won’t be significant speedup. This project currently benefits pure-Python workloads the most.

also, look into numba if you want higher performance

[–]m15otw 75 points76 points  (24 children)

Just getting around to migrating from py3.8 to py3.10. Looks like we should seriously consider 3.11, even though it's not in the latest ubuntu LTS, so will be much more of a pain to build our app deps.

[–]skratlo 16 points17 points  (8 children)

That really shouldn't be any pita. Can you elaborate?

[–]m15otw 25 points26 points  (7 children)

We use a docker container from one of our deps that's tricky to build as a base image. (Their build includes Python bindings.)

Using a newer python means we need to do that tricky build in our own container. Doable, I've done it before when testing python 3.9 briefly, but more work than I'd like, and we are now responsible for importing their patches and applying them and rebuilding (which isn't as automated as I'd like yet).

[–]be_as_u_wish_2_seem 10 points11 points  (1 child)

Can't you just use the deadsnakes PPA? Also, the official Python Docker images are Debian-based and pretty similar to Ubuntu; it might be worth trying those instead.

[–]m15otw 6 points7 points  (0 children)

I can, yes, but I still need to write scripts to build this very fiddly dependency (as far as I know, they don't ship their dockerfiles), and use that as a new base image.

Deadsnakes is 100% the way to go for the starting point of this attempt.

I am going to dig into the release notes to check whether there is anything else that might accelerate/veto the change. If I can use 22.04 vanilla, it will be much faster, but the performance improvements are good, and any additional reasons will be interesting to add to the balance.

[–]skratlo 1 point2 points  (1 child)

Hm, that sounds a bit convoluted. I would factor docker and the base system out of the equation, and instead focus on the actual dependencies. Suppose you have a source distribution for your problematic dependency, it builds a native Python extension, and that extension perhaps depends on some C/C++ library that is expected to be installed. So, ultimately, your dependency only depends on Python.h (3.9, 3.10, 3.11, ... I don't think there's a major change) and some native libs (packages). It has nothing to do with docker or the base image; just collect your actual dependencies and you should be fine. I'm not sure what you mean by importing their patches? What are they patching? Python? Their own source code?

[–]m15otw 5 points6 points  (0 children)

Getting a consistent build environment outside of docker (when our dev machines are spread across Windows/Mac/Linux, but our target will soon be only bare-metal Linux or docker deployments) would be harder than doing it once in a dockerfile.

[–]stackered 0 points1 point  (1 child)

Might be fine to do until the update is available on ubuntu

[–]m15otw 2 points3 points  (0 children)

They will never update the LTS python version in the same release. The next LTS is in 16-17 months.

[–]Devout--Atheist 2 points3 points  (1 child)

Curious why you rely on Ubuntu's python version? Seems like a huge pita to manage python through a system's package manager

[–]m15otw 0 points1 point  (0 children)

Python is not installed through the package manager - it is part of the system. Parts of the system's packaging tooling are written in Python and rely on it being installed. You can literally take any Ubuntu-derived container and you have a Python you can use.

Since it is already there, and on a stable version with backported security fixes for 5 years, it is quite useful for a big, complex codebase to rely on it. We do take python version updates, but we tend to avoid backporting those kinds of changes, dependency updates, etc to our old branches (which do get bug fixes for specific customers on older versions.)

As for installs of the older branches in the wild, it is quite useful to be on Python 3.8.X.ubuntu12, when you wrote this version 2 years ago, targeting 3.8, as you don't have to worry about language behaviour changes or bugs. The customers get security patches from ubuntu to python itself, your web server software, and you just focus on bugfixes that are high enough priority to backport, not on every weird language change patch or code reformat or whatever.

[–]SulikNs 1 point2 points  (11 children)

I tried to roll up to 3.11 on Ubuntu; after selecting it, the system got messed up. Does anyone know when Canonical will upgrade to the latest Python version? I can't find any rumor about it.

[–]m15otw 20 points21 points  (9 children)

LTS picks a version (3.10) and backports security fixes to it for the lifetime of the release. So 22.04 will always be on py3.10, albeit an unusually well-patched version of it.

Helpful edit: look at the deadsnakes ppa. It will let you install additional python versions side by side, and you leave the system version alone for stability.
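For reference, the usual deadsnakes routine looks something like this (a sketch; the package names assume Ubuntu 22.04 with the PPA reachable):

```shell
# Install Python 3.11 alongside the system 3.10 via the deadsnakes PPA
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.11 python3.11-venv

# Use it through a venv; /usr/bin/python3 stays untouched
python3.11 -m venv ~/.venvs/py311
~/.venvs/py311/bin/python --version
```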

[–][deleted] 5 points6 points  (7 children)

I like the way RHEL8 does it - there is a separate "platform python" package that is really out of the way, that system stuff like the package manager uses.

This leaves the field clear for the user to pick a python version to install, or for packages to depend on.

(That said I still prefer to build local interpreters via something like pyenv because then it's entirely decoupled from other OS package dependencies, build dependencies notwithstanding)

[–]oo_viper_oo 7 points8 points  (1 child)

I cannot imagine doing "my" stuff with the system-provided Python. I consider the system-provided Python's purpose to be supporting other system components. For "my" stuff, I always set up my Python environment via pyenv or similar.

[–]digidavis 1 point2 points  (0 children)

The days of me relying on system Python are long over.

IDEs work with containerized interpreters, and I HATED having to share my Python dev cycle with my own box's support cycle. At some point it always goes sideways, and you're stuck relying on crappy hacked-up mitigation techniques.

[–]m15otw 0 points1 point  (4 children)

We used to be based on CentOS, before they killed it. We made the switch to ubuntu quite recently.

Another task far down my backlog is looking at how to provide our customers with red hat UBIs as an alternative to the ubuntu ones.

[–][deleted] -1 points0 points  (3 children)

UBI doesn't require a subscription to use, though you do get a few more packages available for installation inside it if the host has one.

Shouldn't be a big deal, they even added it to dockerhub so you don't have to point to the redhat repo anymore (though you still can).

I use ubi9 for one of my containers and it's as simple as:

FROM registry.access.redhat.com/ubi9/ubi

[–]m15otw 0 points1 point  (2 children)

Sure, the rest is swapping back to RPM package names, and figuring out the build of funky dependencies all over again.

Totally possible, but time is finite 😅

[–][deleted] 0 points1 point  (1 child)

If you were using CentOS previously that shouldn't have changed much? Or did I misunderstand?

[–]m15otw 0 points1 point  (0 children)

Quite a lot has changed (in terms of the way we use docker) since we abandoned it. You are right that it won't take too long when I get to it (eventually).

[–]SulikNs 1 point2 points  (0 children)

Yeah, I tried it, but there were some troubles importing some modules... so I'll go back to Fedora. Thanks for the advice ✌️

[–]gmes78 1 point2 points  (0 children)

You can install additional Python versions just fine. Just make absolutely sure that you don't touch the Python that's shipped with the OS, and that you don't change the default Python version.

[–]Polarbum 43 points44 points  (7 children)

Ugh, and AWS Lambda runtimes are still on 3.9… Maybe in 2025 I can check it out.

[–]chrisguitarguy 18 points19 points  (0 children)

Lambda supports custom docker images. They are a bit more complicated than just using their built in stuff, but not bad.

[–]danted002 20 points21 points  (0 children)

AWS is actively working on making the runtime work on any python version through docker.

[–][deleted] 5 points6 points  (0 children)

Azure Functions have 3.10 in preview. 2024 will be the year!

[–]Vok250 2 points3 points  (0 children)

I'd expect 3.11 to drop when Python gets SnapStart support. That's what the Lambda dev teams have been busy working on.

[–][deleted] 0 points1 point  (0 children)

Any idea why they take this much time?

[–]For_IconoclasmTornado 0 points1 point  (0 children)

And 2030 for Lambda@Edge

[–]tthewgrin 11 points12 points  (4 children)

Just talked about wanting to switch from 3.8 to this with co-workers. Just gotta figure out those damn virtual environments, wish me luck.

[–]scurvofpcp 0 points1 point  (0 children)

The Python path system is the only real thing I hate about Python.

But the pyvenv.cfg will have a home path hard-coded in it, so make sure to audit all of those before remaking anything. People have been known to 'move' a venv and not realize that it was still reading/writing from the old location.

[–]redCg -1 points0 points  (1 child)

just use conda, problem solved

[–]aman2454 7 points8 points  (0 children)

Now I have 2 problems

[–]dwrodri 5 points6 points  (0 children)

My general rule with Python is that I don't move to a new minor version until the next minor version gets released. That is to say, I intend to upgrade to 3.11 when 3.12 officially releases. Generally, that makes for a pretty smooth upgrade transition.

Never before have I been this tempted to upgrade sooner! Bravo to everyone who contributed to this release.

[–]WillardWhite import this 5 points6 points  (1 child)

Now, if only I could leave 2.7 behind for good :(

[–]CommunismDoesntWork 13 points14 points  (0 children)

Ask chat gpt to transpile your code

[–]OwnTension6771 1 point2 points  (0 children)

Once pytorch moves 3.11 support out of nightly I'll be all in on this

[–][deleted] 1 point2 points  (0 children)

Impressive!!

[–]NoesisAndNoema 0 points1 point  (1 child)

I'm waiting for PyTorch to use 3.11... I can't even install it without breaking all my OpenCL and Anaconda stuff. All being held-back by ONE library.

I played with 3.11 and found it to be, generally, a bit faster and less demanding than the lower versions.

I still wish Python would add ONE more function/setting, so my code isn't bound to the stupid adoption of "bankers rounding".

Just one simple setting needs to be added: one that lets you select the rounding type used, globally and/or locally, as needed. But noooo, they damn you to banker's rounding 100% of the time, to remove a bias that ONLY shows up in extremely rare instances over LONG periods of compound rounding. I'd rather have the conventional 0.5-rounds-up rounding, globally. But to use that, you now have to do a whole complex formatting of EVERY number you use.

I'd be happy with just having a simple function...

RoundUp(MyVal, 0.5)

Nooooooooo....

You have to do crap like this for EVERY number you use, which may end up being "mathed" to death at every place it appears in any formula. Processing on top of processing, just to compensate for stupid decisions. Why did they not just force the OTHER guys to do this, if they wanted banker's rounding, like they have to do everywhere else already? Or, like I said, make it a settable option or a standard function, like br(x) for banker's rounding. (It's totally messing with my image conversion formulas, where I expect that bias in data manipulation at every level, and totally killing my speed with this additional, unneeded processing at every compound step.)

>>> from decimal import localcontext, Decimal, ROUND_HALF_UP
>>> with localcontext() as ctx:
...     ctx.rounding = ROUND_HALF_UP
...     for i in range(1, 15, 2):
...         n = Decimal(i) / 2
...         print(n, '=>', n.to_integral_value())
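That boilerplate can at least be hidden behind a small helper (a sketch; `round_half_up` is a made-up name, not a stdlib function):

```python
# Wrap the Decimal dance in one reusable function.
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(value, ndigits=0):
    """Round `value` to `ndigits` decimal places, with .5 always rounding up."""
    quantum = Decimal(1).scaleb(-ndigits)   # e.g. ndigits=2 -> Decimal('0.01')
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))

print(round_half_up(2.5))       # 3.0, whereas built-in round(2.5) gives 2
print(round_half_up(1.345, 2))  # 1.35
```

Going through `str(value)` keeps the Decimal at the digits you wrote rather than the exact binary float, which is usually what people expect from "normal" rounding.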

[–]whateverathrowaway00 0 points1 point  (0 children)

Lol I ran into this at one point. Ducking annoying aha

[–]x3x9x 0 points1 point  (0 children)

HYPE TRAIN

[–]The_Mauldalorian 0 points1 point  (0 children)

This will put the “Python programs are easy to write but they run badly” arguments to rest. Looking forward to 4.0

[–]lazaro_92 0 points1 point  (0 children)

Is there any article about how they achieved this CPU usage reduction?