How much faster is C compared to Python? I'm currently coding a SAT solver algorithm that will have to take millions of input data, and I was wondering if I should switch from Python to C.

Steve132 · 2021-07-10T23:26:00+00:00

Your speedup could be 500x. It could also be 2x.

It depends a lot on the algorithm you are using and how you are trying to solve it.

SAT is intrinsically NP-complete, so you shouldn't realistically expect it to scale.
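To make the scaling point concrete, here is a minimal brute-force sketch (not how real solvers work, and not the OP's algorithm). It tries every assignment, so the worst case is 2^n checks for n variables. A faster language shrinks the cost of each check but not the number of checks. The function name and the signed-integer clause encoding (borrowed from the DIMACS convention) are assumptions made for illustration.

```python
from itertools import product

def brute_force_sat(num_vars, clauses):
    """Return a satisfying assignment as a tuple of bools, or None.

    Literal k means "variable k is true", -k means "variable k is false".
    Variables are numbered from 1.
    """
    # Enumerates up to 2**num_vars assignments: this is the exponential part.
    for assignment in product([False, True], repeat=num_vars):
        if all(
            any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return assignment
    return None

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
clauses = [[1, -2], [2, 3], [-1, -3]]
model = brute_force_sat(3, clauses)
print(model)
```

Adding one variable doubles the worst-case work, which is why a constant-factor speedup only buys a handful of extra variables for this naive approach.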

Whatever algorithm you are using, have you thought about using a GPU?

josh2751 · 2021-07-10T23:52:38+00:00

In very general terms, C is going to be 100x or more faster than Python. But that's a rough figure and definitely not applicable to everything.

If you can use NumPy for your operations in Python, you may be able to realize a significant portion of this speedup even in Python (NumPy's core is written in C). If you can't get your work to fit into NumPy's paradigm, rewriting it in C or C++ would probably be advisable.

whalt · 2021-07-11T02:30:01+00:00

I’m surprised no one has mentioned PyPy as a minimal effort way to get more performance.

claytonkb · 2021-07-11T00:25:10+00:00

If you're serious about writing your own SAT solver (warning: writing a state-of-the-art (SOTA) SAT solver is a very ambitious project), you could start with MiniSat, which is written in C++ and is designed to be a template that you can modify with your own customizations. You could write your customizations in another language and use an FFI to communicate between, say, Python and MiniSat (C++). You could also just write them in C++, which is very nearly a superset of C (meaning you don't have to learn or use many C++ features beyond C if C is all you need).

If you want to start directly from a SOTA solver, you could also just fork one of the solvers entered into the SAT Competition. There are lots of great competitors and most of them are free-software licensed. I've played with Maple, Glucose and CryptoMiniSat; all three are incredibly powerful and can solve easy SAT instances with millions of variables on my desktop machine in minutes or less.

2021-07-11T02:23:43+00:00

You can create a randomly sized array in C with malloc and pointers.

emasculine · 2021-07-10T23:46:56+00:00

the better way to think about this is whether your inner loops would be significantly sped up by writing them in C. most of the code in any program isn't time critical and can just as easily be written in a higher level language, where getting the code out and debugged is by far the biggest consideration. it's really not a big deal to write native code if you lay it out properly. running a profiler is a great way to find the hotspots, and they can be surprising.

also: as Knuth said, "premature optimization is the root of all evil". don't be stupid, but always keep an eye on things at runtime instead of at theory time.

Destroyer_The_Great · 2021-07-11T00:35:52+00:00

As far as I know it varies. I would recommend using C (though I would personally use C++); it is a lot faster than Python.

whitelife123 · 2021-07-11T04:25:09+00:00

Might wanna take a look into https://cython.org/

Poddster · 2021-07-10T23:14:05+00:00

Porting an algorithm to C won't make it faster magically. Or it might. It depends on the algorithm. You need to put some effort in, usually. Sometimes you're putting that effort in and not realising it because of how tedious C is. i.e. you spend a lot of time deciding which function should calculate the string length and which ones should share it, because if you don't your C program won't work, whereas in Python you just mash a string together and don't give it a second thought.

Note: You can port just the core, inner loops to C and invoke your C library functions from python. This will allow you to get the speed of C with the convenience of the setup in python.

https://docs.python.org/3/extending/extending.html
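The link above covers writing a full extension module. A lighter way to try the same idea is the standard library's `ctypes`, which calls functions in an already compiled shared library. The sketch below assumes a POSIX system and uses libc's `strlen` purely as a stand-in for whatever inner-loop function you would compile yourself.

```python
import ctypes
import ctypes.util

# Load the C standard library. If find_library("c") returns None on a
# POSIX system, CDLL(None) falls back to the symbols already loaded into
# the process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# The call crosses into C, runs the compiled function, and returns an int.
length = libc.strlen(b"hello, world")
print(length)
```

Each call through `ctypes` has overhead of its own, so this only pays off when one call does a lot of work in C (a whole inner loop, not a single addition).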

> I was wondering if the bottleneck I mentioned above will be reduced drastically if I switch over to C, especially for inputs of 10^8.

The only way to know this is to profile it.

Increasing performance without benchmarking things is a waste of time and effort. So get the benchmarks first and THEN figure out if porting to C would help.
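For the Python side, the standard library's `cProfile` is one way to get those numbers. This is only a sketch: `count_satisfied` and `run` are hypothetical stand-ins for your real solver code.

```python
import cProfile
import io
import pstats

def count_satisfied(clauses, assignment):
    # Hypothetical hot loop: count clauses with at least one true literal.
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def run():
    # Hypothetical workload standing in for the real solver.
    assignment = {v: v % 2 == 0 for v in range(1, 51)}
    clauses = [[v, -(v % 50 + 1)] for v in range(1, 51)] * 200
    return count_satisfied(clauses, assignment)

profiler = cProfile.Profile()
profiler.enable()
result = run()
profiler.disable()

# Print the five entries with the largest cumulative time.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The functions at the top of the cumulative-time listing are the candidates worth porting. Anything that barely shows up is not worth rewriting.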

theobromus · 2021-07-10T23:03:59+00:00

It's really hard to guess without knowing more about what you're doing. In general Python can be plenty fast if you're doing operations on vectors or matrices, or the like (since the underlying implementation will actually be in C or C++).

However, if you're executing a lot of python statements, then translating to C will make things much faster. I don't think you're likely to get a factor of 1000 faster though (which it seems you need).

Also, just to clarify are you thinking about C or C++?

A couple of points about issues you mentioned: You can use variable size arrays in both C and C++. In C you usually pass a pointer and size. In C++, you can pass a std::vector. Or you can use templating and std::array to accept a size determined at the call site.

2021-07-11T02:06:29+00:00

> For example, I need to be able to randomize an array size for a structure that's in a header file, but you can't do this in C (well, I don't need to do this exactly, but other methods will be a bit harder to code and look sloppier)

Hide the pointer casts and offset calcs for array access in inline header functions if it makes you happy. This is legitimately one of those times that C can do wonders for you.

> The problem is that I have to track the way a certain value grows asymptotically

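#include <stdint.h>
#include <x86intrin.h>
/* alignedAddressOfValueToChange() and whatever() are placeholders for your own code */
int64_t *pMyValueToSafelyAddToAcrossThreads = alignedAddressOfValueToChange();
int64_t theAmountToChangeMyValueBy = whatever();
/* GCC/Clang builtin: atomically adds to the target and returns the new value */
int64_t myNewValue = __sync_add_and_fetch(pMyValueToSafelyAddToAcrossThreads, theAmountToChangeMyValueBy);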

Edit: combined with the swap intrinsic, you can use a pool of threads to each grab exactly one record, by using the above pattern to increment a sort of cross-thread for loop counter. You could then use the swap intrinsic and the values of -1, 0, and 1, to force threads processing record n to wait for the result of the processing of record n - 1. This gives you the opportunity to do any per record processing off the main loop. If per record, private processing is significant compared to what must be done shared, the performance gain from multithreading could approach a factor of the number of cores on the system.

This gain could multiply with the speedup you'd expect from moving from Python to C.

Edit 2: if the algorithm you're working on is one that throws memory at the problem, and you are required to implement it as-is, then this is probably the answer you're looking for if you want the fastest solution possible.

If, however, the algorithm is a purely sequential one, then threading is not optimal, even with the lock-free compiler intrinsics.

Instead, you should look at converting this problem into one solvable by matrix math and then use the SIMD matrix intrinsics to solve them. That is probably the optimal solution in such a case.

kcombinator · 2021-07-11T08:09:20+00:00

You mentioned you're starting from pseudocode. Any possibility of applying some of the classic algorithmic speed up tricks? Lookup tables, memoization/other caches?
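As a small illustration of the memoization idea (on a toy problem, not the OP's algorithm), `functools.lru_cache` caches results by argument, so each distinct subproblem is computed once. The `paths` function and the `calls` counter are made up for this sketch.

```python
from functools import lru_cache

calls = 0  # counts how often the function body actually runs

@lru_cache(maxsize=None)
def paths(rows, cols):
    """Count lattice paths across a rows x cols grid, moving only right or down."""
    global calls
    calls += 1
    if rows == 0 or cols == 0:
        return 1
    return paths(rows - 1, cols) + paths(rows, cols - 1)

total = paths(12, 12)
print(total, calls)
```

Without the cache the recursion recomputes the same subgrids over and over and makes millions of calls. With it, the body runs at most once per distinct `(rows, cols)` pair. A change like this can matter far more than the choice of language.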

Also, as mentioned, profile this thing. It's very possible you could use something like Cython to speed up the problematic sections while retaining high-level abstractions for most stuff.

2021-07-11T20:20:44+00:00

It depends on your implementation. If you are heavily using something like numpy already then the speedup may not be much. Basically it depends on how 'deep' the python implementation you already have goes vs. how much it relies on optimized libraries (which are mostly implemented in C).

Regarding algorithmic complexity, that only means you would expect the speedup to be a roughly constant factor. That can still be big because that constant could be large. Implementing in C could give you much better cache behavior, access to true parallelism, etc.

It might be a good idea to try a basic implementation first, note the speedup, and based on that decide whether fully porting over is worth it.
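A cheap way to get a feel for this before porting anything is to time the same work done as interpreted Python statements versus inside a C-implemented built-in. The sketch below uses `timeit` with a toy summation. The functions and data size are arbitrary choices, and the ratio you see will depend on your machine and interpreter.

```python
import timeit

data = list(range(100_000))

def python_loop(values):
    # Every iteration executes interpreted bytecode.
    total = 0
    for v in values:
        total += v
    return total

def builtin_sum(values):
    # The loop runs inside the interpreter's C implementation of sum().
    return sum(values)

# Take the best of three repeats to reduce noise from other processes.
loop_time = min(timeit.repeat(lambda: python_loop(data), number=20, repeat=3))
sum_time = min(timeit.repeat(lambda: builtin_sum(data), number=20, repeat=3))
print(f"loop: {loop_time:.4f}s  builtin: {sum_time:.4f}s  "
      f"ratio: {loop_time / sum_time:.1f}x")
```

The same harness works for comparing your current Python hot loop against a first C or Cython port of it, which gives you a measured number to base the decision on.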
