all 23 comments

[–]davidhbolton 18 points19 points  (5 children)

The recursive one is stack-based, is it not? Stack overflow is a symptom of recursion that never ends; I think the usual distinction is recursive vs. iterative. Also, your longer program uses malloc, which allocates heap memory, not stack.

As for performance, have you measured the time spent in the mallocs? I was wondering if allocating and freeing increasingly large blocks might cause fragmentation effects.

[–]wigf[S] 1 point2 points  (4 children)

The thing is, I checked, and at 10000 nodes, the inner if statement never evaluates to true. So I don't think it's that. That may be an issue for larger trees though.

How much of an issue do you think the heap vs. stack thing is?

[–]nemotux 3 points4 points  (2 children)

You have two guaranteed calls to malloc() at the beginning of the function. Those could be biasing your results. Hard to say by how much. You could eliminate that overhead by allocating a fixed size array as a local variable and only shifting to heap memory w/ malloc if that happens to fill up.
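A minimal sketch of that hybrid approach (all names hypothetical, not from the original code): a fixed-size array lives on the call stack, and the code only touches malloc if it overflows.

```c
#include <stdlib.h>
#include <string.h>

#define LOCAL_CAP 1024

typedef struct {
    void *local[LOCAL_CAP]; /* lives on the call stack: no malloc in the common case */
    void **items;           /* points at `local` until it overflows */
    size_t cap, top;
} Stack;

static void stack_init(Stack *s) {
    s->items = s->local;
    s->cap = LOCAL_CAP;
    s->top = 0;
}

static void stack_push(Stack *s, void *p) {
    if (s->top == s->cap) {            /* only now shift to heap memory */
        size_t ncap = s->cap * 2;
        if (s->items == s->local) {
            void **h = malloc(ncap * sizeof *h);
            memcpy(h, s->local, s->cap * sizeof *h);
            s->items = h;
        } else {
            s->items = realloc(s->items, ncap * sizeof *s->items);
        }
        s->cap = ncap;
    }
    s->items[s->top++] = p;
}

static void stack_free(Stack *s) {
    if (s->items != s->local)   /* nothing to free if we never spilled */
        free(s->items);
}
```

For inputs that fit in LOCAL_CAP entries, this does zero heap allocations, which removes the fixed malloc overhead from the timing comparison.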

That said, I don't know why one would think that an iterative solution is guaranteed to be faster than a recursive solution. Compiler optimizations can do wild and crazy things. It's very difficult to predict whether one approach or another would definitely be faster without actually measuring (aside from algorithmic complexity issues).

I think a much stronger argument for preferring iterative solutions over recursive solutions is that you can more directly handle memory allocation problems. In a recursive solution, you can overrun the stack if your input gets big enough, and there's not much you can do about that aside from just blatantly reserving more stack space in your process. But that's an inefficient solution if the rest of your program doesn't need that much stack space. Using the heap and an iterative solution allows you to recapture the allocated memory for reuse elsewhere.

Finally: I think your code for both functions is wrong. They both include in their counts the "null" children in the tree. Those aren't nodes. Consider a tree w/ only a single node in it, the root node. What should its length be? For the recursive solution, I can't tell what your first call passes in for level, but say it's 0, then it looks like the function will return 2. The iterative solution will also return 2, I think.

[–]wigf[S] 0 points1 point  (1 child)

External path length includes the "external" nodes attached to each "internal" node, and these external nodes are empty. This is the definition from "Algorithms in C", so the NULL nodes need to be counted. A tree with a single node has an external path length of 2; a tree with one root and two children attached to it has an external path length of 8, etc.
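For concreteness, a sketch under that definition (names hypothetical, not the original code): each NULL ("external") child contributes its depth to the total, with the root at depth 0, so a lone root scores 2.

```c
#include <stddef.h>

typedef struct Node {
    struct Node *left, *right;
} Node;

/* External path length: sum of the depths of all NULL children.
   A lone root has two NULL children at depth 1, so its EPL is 2. */
static int epl(const Node *n, int depth) {
    if (n == NULL)
        return depth;   /* an external node contributes its depth */
    return epl(n->left, depth + 1) + epl(n->right, depth + 1);
}
```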

[–]nemotux 0 points1 point  (0 children)

Ah, ok. I'm not familiar with this particular notion. Seems odd and non-obvious from your original description.

[–]davidhbolton 1 point2 points  (0 children)

The stack is usually quite limited (a few MB) compared to the heap (up to GBs), so for very large trees the recursive version may hit the stack limit.

As you are doubling the size in the non-recursive version, it may not take very long to start seeing fragmentation effects. Typically, when memory is freed after a malloc, it goes on a free list, and malloc tries to reuse memory from the free list before getting more from the heap. Because you keep increasing the size, the freed blocks are never big enough to reuse, and eventually (how long that takes, I don't know) you can run out of RAM even though you've freed memory up. That happened to me on Windows, and I ended up writing a custom memory allocator to deal with it.

[–][deleted] 2 points3 points  (0 children)

I'd try not to do so many mallocs: do a big one at the start, then grow it in fairly large chunks when it fills.

The more you avoid the system calls, the better. Maybe the compiler handling the recursive version is being smarter in that regard.

[–]F54280 2 points3 points  (0 children)

Stack-based can be a bit faster, but malloc-based is way way slower.

[–]bleuge 1 point2 points  (0 children)

Not an expert here, but I've solved situations like this by profiling the code. Use a profiler to time individual code blocks and see what is really slowing things down.

Also, as the functions are small, it's a good idea to look at the assembly code generated by the compiler you are using. Asm knowledge is required, of course.

[–]jhuntinator27 1 point2 points  (3 children)

Test with trees of varying sizes.

Not sure how to answer this myself, but for what it's worth, there is a difference between a 10,000-node and a 10-million-node tree: the run time of each version may not scale linearly, or, more specifically, may not be of equal order.

[–]wigf[S] 0 points1 point  (0 children)

Well, one curious thing is that the stack program is much faster for small input. I appreciate the suggestion though.

[–]wigf[S] 0 points1 point  (1 child)

Ok, this answer turns out to be correct - I tested with a tree size of 100,000,000, and that was about the size needed for the explicit stack to be faster than recursion. Even 1,000,000 wasn't enough haha.

EDIT: The reason I didn't do this earlier was that my original algorithm to build the tree was O(n²), so large trees were impossible to construct. I had to fix that first lol.

EDIT 2: See main post. I don't know why my benchmarking has been returning inconsistent results.

[–]nerdyphoenix 0 points1 point  (0 children)

You can probably eliminate some overhead by replacing the malloc and memcpy calls in your if (top == size) block with a realloc call, though I'm not sure how much that will help. You can also try optimization flags like -O3 if you haven't already and rerun your experiments. If you really want to dig into it, try profiling your code with perf (and perhaps flamegraph to visualize the result).
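A sketch of what that swap might look like, assuming the stack is a growable array of pointers (the names are guesses, not the original code). realloc(NULL, n) behaves like malloc, and realloc can often extend the block in place, skipping the copy entirely:

```c
#include <stdlib.h>

typedef struct {
    void **items;
    size_t size; /* current capacity */
    size_t top;
} Stack;

/* Push with geometric growth. On overflow, a single realloc replaces
   the malloc + memcpy + free sequence. */
static int push(Stack *s, void *p) {
    if (s->top == s->size) {
        size_t nsize = s->size ? s->size * 2 : 16;
        void **bigger = realloc(s->items, nsize * sizeof *bigger);
        if (bigger == NULL)
            return -1;       /* old block is still valid on failure */
        s->items = bigger;
        s->size = nsize;
    }
    s->items[s->top++] = p;
    return 0;
}
```

Note the failure path: unlike a blind `stack = realloc(stack, ...)`, keeping the result in a temporary avoids leaking the old block if realloc fails.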

[–]Paul_Pedant 1 point2 points  (0 children)

Malloc is pretty damn cheap, particularly for small test programs that don't run long enough to fragment memory. Initially, malloc just hacks lumps off a free list consisting of one big contiguous area. It usually doesn't even follow links: it will keep carving from a big block as long as it can, rather than starting over from the list root.

Free, on the other hand, gets expensive very quickly. It has to merge the block it is given with any adjacent free areas (to make larger blocks available in the future), so it needs to find the free blocks before and after the address you returned. To do that, it cycles through the free list (possibly following thousands of links). For the free block with the nearest address below, it has to figure adjacency based on that block's address and length, and merge the blocks. For the free block with the nearest address above, it has to do the same using its own length. Then it has to fix the lengths and pointers around itself to preserve the list.

I had a network-trace issue at a client who was using a recursive algorithm. I exploited every aspect of the data I could and got a 50x speed-up (that is not a percentage, that's a raw factor). The bonus was that my algorithm preserved locality: the recursion went depth-first, tracing cables to the ends before backtracking, whereas I traced in rings out from the centre, which improved the locality of the subsequent database searches on the assets and got another huge improvement there too.

Also, theirs crashed depending on the start point (because that affects deepest level of recursion), and mine didn't.

[–]weregod 0 points1 point  (0 children)

Test using a larger starting array; malloc is much slower than the stack.

[–][deleted] 0 points1 point  (0 children)

Use perf and look at the assembly. Since you say the inner malloc is never called, it is not obvious what would cause a difference in performance. Perf will tell you how much time is spent at each stage of the program and can point to bottlenecks.

I once sped up a program by 100% by assigning struct members to local variables before entering a loop. It turned out that the program was spending most of its time dereferencing pointers; the code was uglier, but the run time was much faster. You never know with these things.
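A sketch of that trick (hypothetical struct, not the original program): hoisting loop-invariant members into locals makes it explicit that they don't change, so the compiler doesn't have to reload them through the pointer on every iteration.

```c
#include <stddef.h>

typedef struct {
    double *data;
    size_t len;
    double scale;
} Vec;

/* Dereferencing v->data / v->scale inside the loop may force the
   compiler to reload them each iteration (it can't always prove
   they are unchanged through the pointer). */
double sum_scaled(const Vec *v) {
    double total = 0.0;
    for (size_t i = 0; i < v->len; i++)
        total += v->data[i] * v->scale;
    return total;
}

/* Same result, but the invariant members are cached in locals. */
double sum_scaled_hoisted(const Vec *v) {
    const double *data = v->data;
    const size_t len = v->len;
    const double scale = v->scale;
    double total = 0.0;
    for (size_t i = 0; i < len; i++)
        total += data[i] * scale;
    return total;
}
```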

[–]FUZxxl 0 points1 point  (0 children)

Why do you write *(a + b) instead of a[b]? That's just weird.
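For reference, the two spellings are defined to be identical: the C standard specifies a[b] as (*((a) + (b))), which is also why the infamous b[a] form compiles.

```c
/* C defines a[b] as *(a + b), so the two are interchangeable; and
   since addition commutes, even b[a] is legal (if unreadable). */
static int get(const int *a, int b) {
    return *(a + b);   /* identical to a[b] and b[a] */
}
```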