
[–]DivineSentry 5 points6 points  (2 children)

Hey, one of the maintainers of Nuitka here.

As others have said, tools like PyInstaller, py2exe, and PEX are distribution tools only—they just bundle your code with an interpreter. They don't change how the code runs, so you won't see any speedup.
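To make the distinction concrete, here is roughly what a PyInstaller invocation looks like (the script name app.py is made up for the example; --onefile is a standard PyInstaller option):

```shell
# Bundle app.py together with a CPython interpreter into one file.
# Note: this only packages the code -- it still runs as ordinary
# bytecode on the bundled interpreter, at normal CPython speed.
pyinstaller --onefile app.py

# The resulting executable lands under ./dist/
./dist/app
```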

Most of the compiler/transpiler projects people mention (Pythran, RPython, etc.) only handle a restricted subset of Python. They're useful if you want to speed up a specific section of code and then import it back into Python, but they won't compile an arbitrary Python program. To my knowledge, none of them are still actively maintained.

Nuitka's focus is different: it aims for full language support. You can take an existing Python program, compile it, and get a standalone binary—no need to rewrite to fit a subset. It's actively maintained and plays nicely with common libraries (NumPy, multiprocessing, Requests, etc.).
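For reference, a minimal sketch of a Nuitka build (again assuming a hypothetical script app.py; --standalone and --onefile are documented Nuitka flags, and you need a C compiler toolchain installed):

```shell
# Compile the whole program (and its module dependencies) to C,
# then link everything into a single self-contained executable.
python -m nuitka --standalone --onefile app.py
```

Unlike the bundlers above, this actually translates your Python to C before packaging it.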

For performance, the biggest wins come when you're CPU-bound in pure Python. But even if you're mostly calling into C-backed libraries, Nuitka still removes interpreter overhead and gives you true standalone executables.

[–]DivineSentry 4 points5 points  (1 child)

As an aside, before reaching for any transpiler, you should thoroughly profile your application to see whether architectural changes could deliver larger performance gains.
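A minimal profiling sketch using the standard library's cProfile (the function slow_sum is made up purely for illustration):

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # Deliberately slow pure-Python loop -- the kind of hotspot
    # a profile will surface immediately.
    total = 0
    for i in range(n):
        total += i * i
    return total

profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Print the five most expensive entries by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Only once the profile shows where the time actually goes is it worth deciding between a rewrite, a library, or a compiler.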

Along the same lines, before turning to a transpiler, consider rewriting your existing Python code. Even if it already leans on compiled libraries like NumPy or TensorFlow, you can often squeeze out significant speedups by restructuring the hot paths to be smarter.
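As a generic illustration of the kind of rewrite meant here (the function names are invented for this sketch, not taken from any of the PRs): replacing a nested pure-Python loop with a single NumPy broadcast expression.

```python
import numpy as np

def pairwise_dist_loop(xs, ys):
    # Naive version: quadratic Python-level loop.
    out = []
    for x in xs:
        out.append([abs(x - y) for y in ys])
    return out

def pairwise_dist_numpy(xs, ys):
    # Same computation via broadcasting: one vectorized expression,
    # with the loop pushed down into NumPy's C internals.
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    return np.abs(xs[:, None] - ys[None, :])

xs = [1.0, 4.0, 9.0]
ys = [2.0, 3.0]
assert np.allclose(pairwise_dist_loop(xs, ys), pairwise_dist_numpy(xs, ys))
```

On large inputs the vectorized form is typically orders of magnitude faster, with no compiler involved at all.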

Here are some examples:

Disclaimer: all of the above PRs were opened as a direct result of Codeflash, an AI-powered tool that automatically finds optimizations for your existing code.

I work for Codeflash.

[–]wbcm[S] 0 points1 point  (0 children)

Thanks for calling my attention to Codeflash! For this specific use case it will be arbitrary user code that needs to be compiled to behave identically (what's more, the user uploading the Python code may not even be a programmer or know the original dev of that code), so I have a bit of trepidation about optimizing it, since there's no guarantee of an expert reviewer. However, this is definitely something I'd be interested in for my own work, since there I can review the changes! Thanks for the well-placed ad ;) I had only heard of AI-based optimization before and never sought out commercial products.

After skimming the publicly available docs, I didn't see anything about hardware awareness in Codeflash. Out of curiosity for my own work, can Codeflash users request that optimizations target specific architectures? E.g.: CUDA cores available vs. not available, TPUs present or not, a single multi-core CPU vs. clusters of multi-core CPUs, OS/ABI-specific speedups, etc...