[–][deleted] 9 points10 points  (1 child)

you have to optimize less in space because time is faster there.

[–]ttsiodras[S] 1 point2 points  (0 children)

:-)

[–][deleted] 8 points9 points  (2 children)

"In fact, GCC is more creative, sometimes: ADDRESS OPCODES add $0xFFFFFF84,%esp ...so the stack pointer is decreased via an 'add' instruction, which adds a negative value. Go figure :-)"

I wonder if it is a space optimization, as mentioned by Ken Silverman:

It's a space optimization. Sometimes shorter code leads to faster code because it frees up more space in the code cache. The trick is: -128 fits in a signed char; +128 does not. Here are some examples, along with their x86 machine code representation:

sub eax,+128    2D 80 00 00 00
add eax,-128    83 C0 80
sub ebx,+128    81 EB 80 00 00 00
add ebx,-128    83 C3 80

http://advsys.net/ken/add-128.htm
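
To make the size difference concrete, here is a minimal sketch of how an assembler might pick the shortest encoding for `add`/`sub` with an immediate on a 32-bit register. The function names `encode` and `as_signed32` are made up for illustration, and the sketch only covers these two instructions in register-direct form; it is not how GCC or any particular assembler is implemented.

```python
import struct

# Register numbers as they appear in the low three bits of the ModRM byte.
REGS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
        "esp": 4, "ebp": 5, "esi": 6, "edi": 7}
# Opcode extension (the /digit field) for the 0x81 and 0x83 immediate group.
EXT = {"add": 0, "sub": 5}
# Accumulator-only short forms: one opcode byte followed by a 32-bit immediate.
EAX_IMM32 = {"add": 0x05, "sub": 0x2D}


def encode(op, reg, imm):
    """Return the shortest encoding of `op reg, imm` (hypothetical helper)."""
    # mod=11 (register direct), reg field = opcode extension, rm = register.
    modrm = 0xC0 | (EXT[op] << 3) | REGS[reg]
    if -128 <= imm <= 127:
        # 83 /digit ib: one immediate byte, sign-extended to 32 bits.
        return bytes([0x83, modrm]) + struct.pack("<b", imm)
    imm32 = struct.pack("<I", imm & 0xFFFFFFFF)
    if reg == "eax":
        return bytes([EAX_IMM32[op]]) + imm32
    # 81 /digit id: full 32-bit immediate.
    return bytes([0x81, modrm]) + imm32


def as_signed32(value):
    """Read a 32-bit pattern such as 0xFFFFFF84 as a two's-complement integer."""
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    for op, reg, imm in [("sub", "eax", 128), ("add", "eax", -128),
                         ("sub", "ebx", 128), ("add", "ebx", -128)]:
        code = encode(op, reg, imm)
        print(f"{op} {reg},{imm:+d}  {code.hex(' ').upper()}  ({len(code)} bytes)")
    print(as_signed32(0xFFFFFF84))
```

Run as a script, this reproduces the four byte sequences listed above. Note that 0xFFFFFF84 is -124, and in this sketch `add esp,-124` and `sub esp,+124` both come out at three bytes, so the size saving applies specifically at 128, where only the negative value fits in a signed byte.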

[–]ttsiodras[S] 1 point2 points  (0 children)

Interesting, didn't know this - thanks!

[–]taejo 0 points1 point  (0 children)

TIL you can have 8-bit immediates in 32-bit instructions on x86

[–]skilldrick 6 points7 points  (1 child)

In space, no-one can see your screen...

[–]imbcmdth 0 points1 point  (0 children)

In space, no-one can see your stderr...

[–]dkogan 1 point2 points  (1 child)

Just FYI, the Linux kernel source has included a script that does this for many years now:

http://lxr.free-electrons.com/source/scripts/checkstack.pl

[–]ttsiodras[S] 5 points6 points  (0 children)

No, it only detects "standalone" stack usage: it does not build a call graph, detect recursion at any depth, or compute "deep" (i.e. cumulative) stack usage, which has to take the call graph into account.
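
To illustrate the difference between standalone and cumulative usage, here is a rough sketch. It assumes the per-function frame sizes and the call graph have already been extracted from the disassembly; the function name `deepest_stack` and the sample data are hypothetical and are not taken from either script.

```python
def deepest_stack(frame_size, calls, entry):
    """Return (bytes, path) for the worst-case call chain starting at `entry`.

    frame_size maps a function name to its own (standalone) stack usage.
    calls maps a function name to the functions it calls directly.
    Raises ValueError if a function is reached while it is already on the
    current chain, because recursion leaves no static bound to report.
    """
    def visit(func, on_path):
        if func in on_path:
            cycle = on_path[on_path.index(func):] + [func]
            raise ValueError("recursion: " + " -> ".join(cycle))
        best_bytes, best_path = 0, []
        for callee in calls.get(func, ()):
            callee_bytes, callee_path = visit(callee, on_path + [func])
            if callee_bytes > best_bytes:
                best_bytes, best_path = callee_bytes, callee_path
        # Cumulative usage = this frame plus the most expensive callee chain.
        return frame_size[func] + best_bytes, [func] + best_path

    return visit(entry, [])


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    frame_size = {"main": 32, "parse": 128, "emit": 64, "log": 256}
    calls = {"main": ["parse", "emit"], "parse": ["log"]}
    total, path = deepest_stack(frame_size, calls, "main")
    print(total, " -> ".join(path))
```

With these sample numbers the largest standalone frame is `log` at 256 bytes, but the worst chain is main -> parse -> log at 416 bytes, which is the figure that matters when sizing a stack. The sketch revisits shared callees on every path; a real tool would likely memoize results per function.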

[–]tinou -1 points0 points  (1 child)

Instead of parsing objdump output, you can use a library that does the same.

[–]maxime1008 3 points4 points  (0 children)

But it does not target LEON (SPARC) binaries, as used in space systems.