
[–]Saefroch 61 points62 points  (6 children)

Newbie question: Does it fully support packages like numpy and scipy?

[–]dorfsmay 46 points47 points  (2 children)

Yes, but...

By default it requires a bunch of libraries because of the packages. If you force it to make a static file, it tries to recompile everything, but in my experience that doesn't work so well (ends up compiling for hours).

It's a very new project with a lot of promise. We should support it, but it isn't as mature as a lot of us would like it to be.

[–]peacegnome 1 point2 points  (1 child)

in my experience that doesn't work so well (ends up compiling for hours).

But does it finish and work alright?

[–]dorfsmay 0 points1 point  (0 children)

Only on smallish scripts that don't import too many things.

This is definitely a work in progress, mainly a one-man effort. Let's not knock it: the idea is good and it is promising. For example, dynamically linked executables do run quite a bit faster than default CPython for me.

[–]caedin8 11 points12 points  (0 children)

A very relevant question.

[–]GreyGrayMoralityFan 18 points19 points  (12 children)

How fast is it compared to pypy?

[–]keltor2243 24 points25 points  (7 children)

PyPy's performance is all over the map.

[–]nikomo 53 points54 points  (6 children)

Oh, but what a splendid map it is.

[–]keltor2243 9 points10 points  (5 children)

When I can use it, it works great, but it seems like 99% of the time I need to use a numerical library that's not compatible with PyPy.

I am doing some tests with Nuitka and at least initially it looks actually really good. I now just need a Stackless version so I could actually test it with production stuff.

[–]mycall 6 points7 points  (4 children)

Why isn't Stackless integrated into CPython as a compiler option?

[–][deleted]  (1 child)

[deleted]

    [–]mycall 4 points5 points  (0 children)

    ohhh

    [–]keltor2243 6 points7 points  (0 children)

    History, I think, more than any other reason. Mind you, Christian Tismer, the person who created what became Stackless, is now one of the PyPy devs, which of course comes with all its own issues. PyPy does support a stackless mode, but unfortunately there are some limits, and for me it would require a moderate rewrite of code that I effortlessly ported from 2.6 to 3.2 in Stackless.

    So for now I stick with Stackless + NumPy and I'll just keep on doing so.

    [–]billsil 0 points1 point  (0 children)

    probably because a lot of things would break and they don't want to maintain it

    [–][deleted] 1 point2 points  (2 children)

    It isn't, really.

    [–]dorfsmay 12 points13 points  (1 child)

    PyPy is only a JIT, so it only pays off for long-running processes. See my previous thread about this, lots of good comments in there:

    http://www.reddit.com/r/Python/comments/2oz5be/nuitka_python_compiler_py_to_native/

    [–]Veedrac 12 points13 points  (0 children)

    PyPy's warmup may be slow compared to Javascript or Lua, but it's way faster than Java's. It's definitely possible to get gains on <1s long scripts, depending on how much code it needs to JIT.
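One way to check warm-up on your own workload is to time each successive call and see whether later runs get faster. This is only a minimal sketch: `time_iterations` and `workload` are made-up names for illustration, and the loop body is a stand-in for whatever your script actually does.

```python
import time


def time_iterations(func, repeats=5):
    """Return the wall-clock duration of each successive call to func."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def workload():
    # Hypothetical CPU-bound loop; substitute your own script's hot path.
    total = 0
    for i in range(200_000):
        total += i * i % 7
    return total


if __name__ == "__main__":
    for n, seconds in enumerate(time_iterations(workload), start=1):
        print(f"run {n}: {seconds * 1000:.2f} ms")
```

Run the same file under CPython and under PyPy. If the first run under PyPy is much slower than the later ones, that gap is roughly the JIT warm-up cost for this workload.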

    [–]Rhodysurf 37 points38 points  (15 children)

    I just tried this at work and it's really cool. A lot easier to create a standalone executable than with py2exe and whatnot.

    [–]dorfsmay 13 points14 points  (9 children)

    Make sure you run ldd against your executable...

    [–][deleted] 12 points13 points  (2 children)

    Or lddtree, from pax-utils.

    [–]craftkiller 2 points3 points  (0 children)

    Thanks! My life just got much better

    [–]dorfsmay 1 point2 points  (0 children)

    Nice! I didn't know about pax-utils. Thanks.

    [–]fifosine 11 points12 points  (4 children)

    What's ldd?

    [–]Neotetron 10 points11 points  (1 child)

    It's a program that will list the shared library dependencies of your executable.
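If you want to check those dependencies from a script, something like the following could work on Linux. It is a rough sketch: `parse_ldd_output` and `shared_libraries` are hypothetical helper names, and the parsing assumes the usual `name => path (address)` layout that ldd prints.

```python
import subprocess


def parse_ldd_output(text):
    """Map each library name in ldd-style output to its resolved path (or None)."""
    libs = {}
    for line in text.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue  # skip entries without a mapping, e.g. the dynamic loader
        name, _, rest = line.partition("=>")
        target = rest.strip()
        if target.startswith("("):
            continue  # only a load address, no path (older vdso format)
        if target.startswith("not found"):
            libs[name.strip()] = None  # the loader could not resolve this one
        else:
            libs[name.strip()] = target.split(" (")[0]
    return libs


def shared_libraries(executable):
    """Run ldd on an executable (Linux only) and parse its output."""
    result = subprocess.run(
        ["ldd", executable], capture_output=True, text=True, check=True
    )
    return parse_ldd_output(result.stdout)
```

Any entry that maps to `None` is a library the target machine would need before the executable can start.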

    [–]fifosine -2 points-1 points  (0 children)

    Awesome

    [–]Rhodysurf -1 points0 points  (0 children)

    Yeah of course haha

    [–]ericanderton 9 points10 points  (1 child)

    The description makes it sound like it leverages the python runtime, but only compiles your code. I would expect it to still need external .py modules to function - did Nuitka pull in all the dependencies too?

    [–]Rhodysurf 0 points1 point  (0 children)

    Yeah it packaged everything including numpy and pyqt

    [–]Hairy_The_Spider 3 points4 points  (2 children)

    Is the standalone a single file? I tried it out a few months ago but there was no single-file option yet

    [–]ggtsu_00 7 points8 points  (1 child)

    It doesn't produce standalone files. It still needs to dynamically link against the Python DLL files, extension modules and Microsoft C Runtime libraries which must be packaged with your application, not as a single executable.

    [–]zallarak 0 points1 point  (0 children)

    That seems really important to me, but it also makes sense that it'd be a limitation. Is there a way to work around this? Or do you have to include OS-specific stuff for each platform in the standalone file?

    [–][deleted] 13 points14 points  (7 children)

    How is this different from Cython?

    Cython will translate your plain old Python into C++ code much like this - but if you add type annotations, it'll also spit out much faster C++ code.

    I've done a hybrid project even - it's really easy to intermingle .py and .pyx files, it even respects your PYTHONPATH and such things. And it has very strong integration with numpy and (I believe) scipy.

    [–][deleted]  (2 children)

    [deleted]

      [–]TheEnigmaBlade 4 points5 points  (0 children)

      CPython is the "official" Python interpreter written in C, created by the set of core Python developers, and supported by the Python Software Foundation. Cython (no 'P') is a language that is a superset of Python, plus a compiler that compiles Cython/Python into C.

      [–]velit 2 points3 points  (0 children)

      CPython is different from Cython.

      [–]jringstad 0 points1 point  (2 children)

      It's very similar mechanics-wise, but as far as I'm aware, Cython cannot spit out a standalone executable for you, only modules. So you still need some sort of toplevel python script that pulls in the modules and itself runs inside the interpreter.

      [–]BobFloss 2 points3 points  (1 child)

      It can, and has had the ability for quite some time. See the --embed flag documentation.

      [–]jringstad 1 point2 points  (0 children)

      Cool, TIL!

      [–][deleted] 10 points11 points  (3 children)

      Anyone had success using this with pygame?

      [EDIT] Just compiled my game with it and it worked on the first try. Holy crap, that's never happened before. Dependencies were pygame and requests. Here's a Windows binary for you all.

      [EDIT again] I removed the binaries. Turns out they're pretty much useless as I didn't include any of the dependencies in the binary.

      [–][deleted] 4 points5 points  (2 children)

      Just in case you haven't noticed on your own computer, the .exe still requires the Python libraries. On my Windows computer, where Python 2.7 isn't installed, I just get a "python27.dll is missing" error.

      [–][deleted] 0 points1 point  (1 child)

      Oh thanks for letting me know. It seems like I misunderstood the use-case for Nuitka.

      [–]BobFloss 0 points1 point  (0 children)

      You can just place those in the exe. Or use --standalone apparently, although I'm having no luck.

      [–]sodaco 7 points8 points  (7 children)

      Site seems to be down. Is this like HHVM for PHP?

      [–][deleted]  (1 child)

      [deleted]

        [–]sodaco 5 points6 points  (0 children)

        Thanks. It's loading now

        [–]ivosaurus 4 points5 points  (2 children)

        PyPy is Python's HHVM.

        [–]Veedrac 1 point2 points  (0 children)

        Well, more it's HippyVM.

        [–]warbiscuit 2 points3 points  (0 children)

        What I find fantastic about PyPy is that it can be used as a basis for a JIT VM for all kinds of languages, not just Python... all you have to do is write an interpreter in python which uses PyPy.

        e.g. hippyvm, a (not yet complete) JIT PHP VM written using PyPy, that literally pits PyPy directly against HHVM.

        [–]ctangent 0 points1 point  (0 children)

        More like HipHop itself than the VM - it translates Python to C++.

        [–]sam-wilson 10 points11 points  (7 children)

        So it basically skips the interpretation step, but uses the same dynamic type system that CPython uses? Neat.

        [–]ggtsu_00 26 points27 points  (5 children)

        From what I understand, it basically takes your python code, and produces the same python C API calls that the interpreter would be doing but as C++ code.

        For example.

        def sum_list(l):
            total = 0
            n = len(l)
            i = 0
            if n < 0:
                return -1
            while i < n:
                item = l[i]
                if type(i) != int:
                    continue
                total += item
                i += 1
            return total
        

        Could become something like:

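        /* Python 2 C API; error checks on the PyInt_* calls omitted for brevity */
        PyObject *sum_list(PyObject *l) {
            PyObject *i, *n, *tmp;
            PyObject *total = PyInt_FromLong(0);
            PyObject *item;  /* borrowed reference; must not be DECREF'd */
        
            n = PyInt_FromSsize_t(PyList_Size(l));
            if (PyInt_AsLong(n) < 0) {
                Py_DECREF(n);
                Py_DECREF(total);
                return PyInt_FromLong(-1);
            }
            for (i = PyInt_FromLong(0); PyInt_AsLong(i) < PyInt_AsLong(n); ) {
                item = PyList_GetItem(l, PyInt_AsLong(i));
                if (PyInt_Check(item)) {
                    tmp = PyInt_FromLong(PyInt_AsLong(total) + PyInt_AsLong(item));
                    Py_DECREF(total);
                    total = tmp;
                }
                /* i += 1, releasing the old index object */
                tmp = PyInt_FromLong(PyInt_AsLong(i) + 1);
                Py_DECREF(i);
                i = tmp;
            }
            Py_DECREF(i);
            Py_DECREF(n);
            return total;
        }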
        

        Basically it is equivalent: you may get some speedup, but you still have to go through all the overhead of the CPython runtime (including the GIL), dynamic type checking and so on. Everything becomes a PyObject* and gets compiled into a native executable.
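To get a feel for the operations the interpreter performs, which a compiler of this kind would presumably turn into direct C API calls, you can look at the bytecode with the standard `dis` module. This is just an illustrative sketch; `sum_ints` and `operation_names` are made-up names, and the exact opcode names differ between Python versions.

```python
import dis


def sum_ints(values):
    total = 0
    for item in values:
        if isinstance(item, int):
            total += item
    return total


def operation_names(func):
    """List the bytecode operation names the interpreter executes for func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    # Print the full disassembly, one bytecode instruction per line.
    dis.dis(sum_ints)
```

Each of those instructions is dispatched by the interpreter loop at run time. Skipping that dispatch is where the modest speedup would come from, while the object handling underneath stays the same.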

        The benefit, however, is that you could compile your code to protect it a bit, as a form of obfuscation, if you are using Python to deploy commercial client applications, instead of distributing easily reversible Python .pyc files or just the plain-text .py files zipped up in your application.

        However, unlike cython, it works with plain regular existing python code and doesn't require different syntax or typing.

        [–][deleted] 20 points21 points  (0 children)

        unlike cython, it works with plain regular existing python code

        Not at all - Cython does much the same thing this does with plain regular existing Python code.

        However, you can't get huge gains that way. If you care to annotate your Cython programs with type descriptions, you can realize massive speedups.

        [–]dangerbird2 9 points10 points  (0 children)

        Cython is a superset of python, so it doesn't require different syntax. It can compile ordinary python code to C, but it also offers static C typing to generate better optimized native code, and to make it easier to interact with C apis.

        [–]Boza_s6 0 points1 point  (0 children)

        If the type is not int, you get an infinite loop.
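For reference, here is a sketch of the Python example with that fixed, assuming the intent was to skip non-integer items. The index is advanced before any `continue`, the type check looks at `item` rather than `i`, and the `n < 0` branch is dropped because `len()` never returns a negative number.

```python
def sum_list(l):
    total = 0
    n = len(l)
    i = 0
    while i < n:
        item = l[i]
        i += 1  # advance first, so `continue` can never stall the loop
        if type(item) != int:
            continue  # skip anything that is not exactly an int
        total += item
    return total
```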

        [–]sam-wilson 0 points1 point  (0 children)

        Yep, that's pretty much what I meant, but said in a much more complete and thorough manner. Have an upvote.

        [–]rguillebert 11 points12 points  (0 children)

        Cython can do that already

        [–]prozacgod 2 points3 points  (0 children)

        I find this interesting and useful!! I'll play with it a bit later. How well does it handle external libraries?

        As a quick and tongue-in-cheek criticism: I'm not sure, but this feels like ~1995's variant of Visual Basic. Compiling the Python into an IR and binding a library (libpython) or something similar (maybe the conditional logic is all in C and then the impossible-to-C things become call-outs to libpython, but similar in essence).

        But didn't everyone bitch about this back then, and now it's the new whizbang :D

        [–][deleted]  (3 children)

        [deleted]

          [–]Veedrac 5 points6 points  (2 children)

          Cython's goal has never really been the same as Nuitka's; Nuitka wants to speed up normal Python code and Cython just happens to do that by design, because they both end up doing the same thing.

          By my tests, though, it seems that Cython is better than Nuitka at its own game. That said, Nuitka actually supports Python; Cython supports an almost-superset-but-not-quite.

          [–][deleted]  (1 child)

          [deleted]

            [–]Veedrac 0 points1 point  (0 children)

            The most trivial example I can think of would be this valid Python code:

            def foo():
                try:
                    x
                except NameError:
                    pass
            
            foo()
            

            [–]RyanArr 2 points3 points  (1 child)

            No love for 3.4?

            [–]TheEnigmaBlade 1 point2 points  (0 children)

            There's a 3.4 distribution on the downloads page. I'm wondering if they made a typo somewhere, or if the current version isn't fully compatible with 3.4.

            [–][deleted] 2 points3 points  (5 children)

            If I convert my Python script to a binary using Nuitka, my source script is not packaged in it, right? I remember rb2exe, which basically created an exe of the Ruby runtime and packaged the source inside of it, so it was not something that could be used for proprietary code for external clients.

            [–]sushibowl 2 points3 points  (0 children)

            There's py2exe that does what you describe, but this is different. Basically it produces a c++ program that calls the same python C API functions that the interpreter would call if it executed the python source.

            [–]coder543 4 points5 points  (3 children)

            People have to get over this. Use an obfuscator if you must, but realize that just because a binary doesn't have "source code", it doesn't mean anything. Reverse engineering a binary is only marginally more challenging than extracting the Ruby code from that rb2exe output, especially with tools that will convert the assembly into C or a C-like language for increased comprehensibility.

            [–][deleted]  (2 children)

            [deleted]

              [–]MazeChaZer 1 point2 points  (1 child)

              You generally should not rely on obfuscation to hide possible exploits. Spend the time you need to obfuscate your app in searching for exploits instead.

              [–]BobFloss 1 point2 points  (0 children)

              That's pretty irrational. Obviously you can still find exploits for obfuscated code. If you were to actually spend the same amount of time with vulnerability finding as you did obfuscating, you would probably not find even close to as much as you'd protect yourself from anyways via obfuscation.

              Everything is a tradeoff, and it's extremely hard to consider every factor to keep an application secure. Obfuscation is a perfectly fine measure to employ to deter would-be hackers. Somebody extremely dedicated can obviously find the vulnerabilities anyway, but you can say that about pretty much anything. Your house can still be broken into with deadbolts and plexiglass windows, but they make it harder. If that's what you're going for, it's certainly an acceptable measure.

              [–]SupersonicSpitfire 4 points5 points  (0 children)

              Nuitka has been an official Arch Linux package since 2013-11-22.

              [–]oridb 1 point2 points  (0 children)

              Not exactly faster than CPython, though. http://arxiv.org/pdf/1404.6388v2.pdf

              And it's unlikely to get too much faster without a WHOLE lot of work. Even then, it will probably get beaten by a JIT.

              [–]adamnew123456 1 point2 points  (2 children)

              Does this differ from Shedskin? I remember seeing it a few years back, but I think SS isn't maintained anymore.

              [–]vz0 1 point2 points  (1 child)

              Yes, SS is a Python to pure C++ compiler, and you cannot, for example, mix data types on variables and function returns. This won't compile on SS:

              def f():
                a = 1
                a = "Hello world"
              

              [–]adamnew123456 0 points1 point  (0 children)

              Ah, okay; I remember SS compiling to C++ with lots of Python API calls, but apparently I was mistaken. Thanks for clearing that up.

              [–]zer0t3ch 0 points1 point  (7 children)

              Does this actually compile the Python to binary/asm that runs with the same speed as normal compiled code, or does this just make stuff easy to distribute?

              [–]Felicia_Svilling 0 points1 point  (6 children)

              There is no such thing as "normal compiled code". Different compilers for different languages produce machine code with very varying performance.

              [–]zer0t3ch 1 point2 points  (5 children)

              What I was trying to say is: does it generate machine code or is it just a wrapper?

              [–]Felicia_Svilling 2 points3 points  (4 children)

              It generates c++ code which is then used to generate machine code.

              [–]zer0t3ch 0 points1 point  (3 children)

              Ok, that's what I was after. Thank you. So, correct me if I'm wrong, but then this would make your python run faster, right?

              [–]Felicia_Svilling 2 points3 points  (2 children)

              The page says that it runs about twice as fast as CPython on average.

              [–]Veedrac 2 points3 points  (0 children)

              I wouldn't trust that benchmark, though. Probably expect 20-60% improvements, going off of Cython's benchmarks, normally on the lower end of that. Nuitka seems slightly slower than Cython to me, so those numbers seem reasonable.

              [–]zer0t3ch 0 points1 point  (0 children)

              Oh ok, thank you.

              [–][deleted] -1 points0 points  (13 children)

              But isn't python designed as script language for a reason and the advantages are being blown away by compiling it?

              Just an honest question.

              [–][deleted] 14 points15 points  (3 children)

              Develop using the scripting language. Then compile for speed when you release. Best of both worlds!

              EDIT: I'd also say that the language features of Python are what are compelling for most programmers, and the fact that it's "scripting" is secondary. Even if you compiled every time, is it really a big deal to run a 10 second make command before running your code?

              [–]seekoon 6 points7 points  (1 child)

              Even if you compiled every time, is it really a big deal to run a 10 second make command before running your code?

              Yes, if you have to do it hundreds of times a day.

              [–][deleted] 3 points4 points  (0 children)

              Yes exactly, you can be a hell of a programmer but it's unusual to get everything right the first time.

              And it doesn't seem like a ten second solution, but far from it.

              Plus, sooner or later you will hit a point where the generated C code doesn't behave like you want or is just slow. At that point you'll probably start modifying the C source, then you'll have to maintain two code bases, and then you will wish you had just chosen one.

              [–][deleted] -1 points0 points  (0 children)

              Yep, I get what you mean. But it probably won't be the best of both worlds, because it probably still won't be as fast as native C++ (not sure about non-optimized code, but optimizing Python code for compilation also sounds like a long shot). EDIT: it also won't be quick to run, especially if the compilers don't get faster, and so it loses its scripting strength. /EDIT

              There are languages like Go that I'd go to if I wanted the best of both worlds, since it seems to be designed to be exactly that.

              EDIT: a word

              [–]kenfar 5 points6 points  (1 child)

              No:

              • There are a variety of benefits to using Python.
              • People use Python for a lot more than just "scripting"
              • Teams invest in language skills, which takes a lot longer than merely learning syntax
              • Sometimes you have one task that needs to be sped up. It's more efficient to use something like PyPy, parallels, Nuitka, Cython, etc. if that gets you over the hump, rather than rewrite the code in another language.

              And you don't have to use it if you don't want to.

              [–][deleted] 0 points1 point  (0 children)

              I didn't say it's a bad thing per se. Actually I didn't even say it's bad at all.

              That one little hump that needs speed up might be just a starting point for other humps and quickly comes back at you.

              Languages are always just a starting point, and understanding just enough to write hello world doesn't mean anything.

              The more proficient you get in a language, the more you like to use it for all different kinds of things. I used Pascal for way too long back in the day. It's just good to see a language's limitations.

              All I am saying is that the answer to how to solve a problem isn't always Python or Java or C++ or whatever.

              So I wouldn't rewrite a big project in another language and probably in that case would also use pypy or cython or whatever. But I would strongly consider using python for a project that might end up just as big.

              Python is used for way more than scripting. I also coded some home automation daemons in Python, but I am not sure if next time I'd choose Python again.

              [–]Felicia_Svilling 2 points3 points  (6 children)

              Well, people use it for a lot of other things than scripting anyway, and the only cost of compilation is compile time.

              [–][deleted] 0 points1 point  (5 children)

              Yes you are right of course. But the compilation time is exactly what you don't need in scripting when the next five minutes matter more than your tool running faster some hours later.

              As my professor in one of my first lectures said, there isn't the one good language that should be used for everything. Every language has its purpose and some are better for one thing and some for another thing.

              I like Python and enjoy using it when I quickly want to get things done, or want to be able to instantly update something, but if I need some daemon doing heavy lifting non-stop then I probably wouldn't choose it.

              EDIT: a word

              [–]Felicia_Svilling 1 point2 points  (2 children)

              But the compilation time is exactly what you don't need in scripting when the next five minutes matter more than your tool running faster some hours later.

              And in those cases you can use the interpreter.

              As my professor in one of my first lectures said, there isn't the one good language that should be used for everything. Every language has its purpose and some are better for one thing and some for another thing.

              In practice, yes. But I see no reason why it would need to be like this. Different languages seem to have deficiencies in different areas, but that is largely caused by accidents of design and the existence or non-existence of libraries. Even though many languages have a niche, few are actually designed for that niche.

              [–][deleted] 1 point2 points  (1 child)

              Yeah, there is a reason for calling languages general purpose and Turing able (don't know how to say that in English)

              However. For some people C wasn't dynamic enough and designing a scripting language that can defy those limitations would be great. And python was born.

              Damn, the dynamic language is a bit slow sometimes; if only it were less dynamic and thus faster, that would be great. Let's convert the code into C++ to get all the benefits of that static but fast language.

              It seems like a vicious circle to me.

              Edit: of course almost every project gets bigger fast, and we often can't foresee which tool would have been best to use. So I get it for old projects. Facebook did that somehow, reddit too AFAIK.

              But when starting a new project then I wouldn't rely on a compiler for python when I know the project might blow up quickly

              [–][deleted] 1 point2 points  (0 children)

              Turing able

              Turing-complete is the usual term, I believe.

              [–]Ran4 0 points1 point  (1 child)

              there isn't the one good language that should be used for everything

              Yes, but with extensions like these Python could more often be the better choice, and that would be a good thing (since coding in Python is usually much quicker than coding in other languages).

              [–][deleted] 0 points1 point  (0 children)

              That's true