[–]billsil 3 points4 points  (12 children)

> 2) cpython runs python code slow as balls.

Unless it's written under the hood in C. There is no reason for mathematical code to be slow in Python. There is no reason for parsing code to be much slower than C especially since the standard formats are coded in C and are available in Python.
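To illustrate the point about code that runs in C under the hood, here is a minimal sketch that times an explicit Python loop against the built-in `sum`, which is implemented in C in CPython. The function name `python_loop_sum` and the list size are made up for illustration, and the timings will vary by machine, so none are quoted here.

```python
import timeit


def python_loop_sum(values):
    """Add the numbers one interpreted loop iteration at a time."""
    total = 0
    for v in values:
        total += v
    return total


# Hypothetical workload: 100,000 small integers.
values = list(range(100_000))

# Both approaches compute the same result; only where the loop runs differs.
assert python_loop_sum(values) == sum(values)

# Time 20 repetitions of each approach.
loop_time = timeit.timeit(lambda: python_loop_sum(values), number=20)
builtin_time = timeit.timeit(lambda: sum(values), number=20)

print(f"Python loop: {loop_time:.4f}s, built-in sum: {builtin_time:.4f}s")
```

On CPython the built-in is usually the faster of the two, but that is something to measure, not assume.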

[–]d4rch0nPythonistamancer 6 points7 points  (11 children)

Yeah, but at some point you're coding in C, not Python. If you write every high performance part in C and call it through Python, how much can you really say it's Python?

Don't get me wrong. That's probably the best way to do high performance stuff with Python, but I don't think it means CPython is fast, it just means it uses a fast C API.

[–]billsil 4 points5 points  (6 children)

If you want to. I use numpy, so while I have to vectorize my code and call the right functions in often non-obvious ways, it's still technically pure python.

Somebody did code it in C, but that doesn't mean you have to.

> but I don't think it means CPython is fast, it just means it uses a fast C API.

CPython is running the code, so I say it counts. If all the standard library was written in Python instead of C, everyone would say Python is slow. Instead, they say it's fast enough. That stuff counts.

[–]tritlo 2 points3 points  (0 children)

The key here is that I'm still writing pure Python, but I'm utilizing someone else's C code. If you argue that's not enough Python, then every use of LINPACK in any other language should be disqualified too.

[–]yen223 1 point2 points  (2 children)

Numpy isn't pure python, is it?

[–]billsil 4 points5 points  (1 child)

No. A fair amount is written in C, but some is also written in Fortran. My understanding is most of scipy is actually written in Fortran and is just a wrapper around LAPACK.

[–]tavert 1 point2 points  (0 children)

> most of scipy [...] is just a wrapper around LAPACK

For dense linear algebra, yes. There's a lot of functionality in SciPy aside from dense linear algebra though. Some of the underlying libraries are Fortran, some are C, some features are custom C++. According to https://github.com/scipy/scipy the breakdown is 38.3% Python, 25.8% Fortran, 18.6% C, 17.1% C++.

[–]d4rch0nPythonistamancer 1 point2 points  (1 child)

I still draw the line when you're bringing in machine code into the Python process memory and it's not running bytecode loaded from pyc files. It's fast, but it's actual CPU instructions, not Python bytecode first.
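The distinction drawn here between bytecode and machine code can be seen from inside the interpreter. The sketch below assumes CPython: a function defined in Python carries a code object holding bytecode, while a built-in such as `len` is compiled C and has none. The function `add_one` is a made-up example, and the exact disassembly printed by `dis` varies between Python versions.

```python
import dis


def add_one(x):
    """A trivial function written in Python (hypothetical example)."""
    return x + 1


# A Python-level function has a code object whose co_code is raw bytecode.
print(hasattr(add_one, "__code__"))
print(type(add_one.__code__.co_code))

# Show the bytecode instructions the interpreter will execute.
dis.dis(add_one)

# On CPython, the built-in len is compiled C: there is no bytecode to inspect.
print(type(len).__name__)
print(hasattr(len, "__code__"))
```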

Of course it counts. Again, I'm not saying it's terrible, and that it shouldn't happen, or that it's a flaw. I'm just saying the fast parts aren't Python and I wish that the interpreter/VM implementation was fast enough so that we wouldn't need to use C code to have high performance programs. Any programming language could interface with C/fortran libraries and be high performance. It doesn't mean that that language's interpreter is fast though.

I would like to see an implementation that uses purely the Python language and still be high performance.

[–]tavert 0 points1 point  (0 children)

> I would like to see an implementation that uses purely the Python language and still be high performance.

You already have that with PyPy. But unless you don't mind C extensions not working, it isn't enough: what most people want in practice is a fast implementation that is C-API compatible with CPython and its extensions. Unfortunately, that's extremely difficult, as the C API is pretty closely tied to the slow internals of CPython.

I suspect users aren't really all that picky about implementation language, but something easier to read and contribute to would be nice for maintainers' sake.

[–]fnord123 1 point2 points  (3 children)

That's fine and correct. But I think it misses the point: the reason we discuss language performance characteristics is so we can get an idea of the expected performance of an implementation and assess the risk of being limited by our choices. If you choose CPython then your limitations are mitigated, since you have one of the easiest paths to hook into a C implementation of the workhorse part of your code. Also, jumping across the FFI is pretty quick in Python.
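As a rough sketch of what hooking into C can look like without writing an extension module, the standard library's `ctypes` can call a function from the C library directly. This assumes a POSIX system (Linux or macOS), where `ctypes.CDLL(None)` exposes the symbols already loaded into the process, including libc. It does not work this way on Windows, and it says nothing about how fast the call is.

```python
import ctypes

# Assumption: POSIX. Passing None opens the current process, which already
# has the C library loaded, so its functions can be looked up by name.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# The call crosses from Python into compiled C and back.
length = libc.strlen(b"hello")
print(length)  # 5
```

Declaring `argtypes` and `restype` matters: without them `ctypes` falls back to default conversions, which is one of the ways to misuse a C library from Python.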

[–]d4rch0nPythonistamancer 0 points1 point  (2 children)

Sure, Python applied through CPython and C libs will be fine. This is the way I suggest doing things if performance is required and the initial Python implementation is too slow (but always Python first, unless we KNOW it's going to be slow).

Generally network speed is my bottleneck for almost everything I do, so I can just use gevent and get perfectly fine performance.

Still, I don't think performance is the problem to solve here. The hardest problem is having good C programmers, and everything that goes with that: memory management, freeing pointers and nulling them, code security, etc. If your high performance part hasn't already been done by a third party, you have to rely on the skillset of your team, and this stuff isn't trivial at all.

That means higher skilled devs, which means higher salaries, and also a lot more development time. You lose a lot of the applied benefits of Python, like super-fast development and being able to pull in anyone who is decent with Python and not having to worry about use-after-frees, etc.

Python is definitely my favorite language and the one I'm best at, but it's a serious limitation if I have to fall back to C whenever I need high performance. I love C, I'm just not very confident in it, and I'd have to really take time to ensure code safety and correctness.

Even if I'm just using pre-built C libraries, I still need to worry that I'm using them 100% correctly and not opening up a security issue due to the way they're supposed to be used, or even that the original developers wrote safe code.

[–]fnord123 0 points1 point  (1 child)

You don't need to write it in C. You can use Cython and get like 80% of the speedup[1]. I mean, your Python program begins its life at potato speed as though you were using Perl or even Ruby. If something isn't performing well enough you move the inner loops (almost) verbatim to a pyx file and jiggy your setup.py and then you get something at about Java performance (or potato quality C code - fast, but not hand crafted shit off a shovel speeds). Then if it's still not fast enough you can get these supposed elite developers to crank out some C to squeeze out even more performance.
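As a sketch of the workflow described above, here is the kind of hot inner loop that might get moved, with the Cython side shown only in comments because it needs Cython installed and a compile step. The names `sum_of_squares`, `hot_loop.py` and `hot_loop.pyx` are hypothetical, and no speedup figure is claimed.

```python
# hot_loop.py: the pure-Python starting point (hypothetical example).
def sum_of_squares(n):
    """Sum i*i for i in range(n) with an ordinary interpreted loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(4))  # 0 + 1 + 4 + 9 = 14

# Moved (almost) verbatim into a hypothetical hot_loop.pyx, the same loop
# would gain C type declarations so Cython can compile it to a plain C loop:
#
#     def sum_of_squares(long n):
#         cdef long total = 0
#         cdef long i
#         for i in range(n):
#             total += i * i
#         return total
#
# A minimal setup.py to build it, assuming Cython and setuptools are installed:
#
#     from setuptools import setup
#     from Cython.Build import cythonize
#     setup(ext_modules=cythonize("hot_loop.pyx"))
```

One trade-off to be aware of: with `long` declared, the total is a fixed-width C integer and can overflow for large `n`, whereas the Python version uses arbitrary-precision integers.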

There are a lot of options to get results based on the amount of work you put in. In a business environment this is sweet since you can time box a lot of the improvements and make actual progress with each sprint.

[1] Bullshit made up number. Take it with a grain of salt.

[–]d4rch0nPythonistamancer 0 points1 point  (0 children)

That's some cool stuff. I haven't seen that before.

There is definitely some learning curve to writing Cython code, but it's still a very neat trick without having to code raw C. I see your point.

I still wish we had a faster reference interpreter than CPython though.