Fun with Assembly

I still remember an interview I had around February 2001, in which Ron Garnett talked about how his team wrote code:
"We write stuff in Assembler, because we're too lazy to write stuff in C."
Wait...what?  I thought the whole purpose of C was to be a portable assembly language, so you could control the bare metal without hand-writing assembly.  I did get an inkling that, if you were that good, assembly could be seductive in its ability to let you do whatever you want.

This came to mind again when Seth Moore posed a similar question on Facebook the other night:

Pop quiz: When you run this, what prints out?

#include <stdio.h>
int main (int argc, char** argv) {
int i = 5;
int j = 10;
while (--j) { printf("%d %d\n", i, j); } while (--i);
}


Basically, the above is a quiz to determine if you understand loops, expressions versus statements, and the pre-decrement operator (--).  With pre-decrement, the variable is decremented first, and the value of the expression is the new, decremented value.  Post-decrement has the same side effect (decrementing the variable), but the value of the expression is the PREVIOUS value.

As is my wont, I got the above wrong, but that's not the point.  To check my answer, I pasted it into a quick C program using vim:
#include <stdio.h>
int main (int argc, char** argv) {
int i = 5;
int j = 10;
while (--j) { printf("%d %d\n", i, j); } while (--i);
}
Compiling that program and using macOS's otool to dump the assembly gives you this:
(__TEXT,__text) section
_main:
0000000100000ef0 pushq %rbp
0000000100000ef1 movq %rsp, %rbp
0000000100000ef4 subq $0x20, %rsp
0000000100000ef8 movl $0x0, -0x4(%rbp)
0000000100000eff movl %edi, -0x8(%rbp)
0000000100000f02 movq %rsi, -0x10(%rbp)
0000000100000f06 movl $0x5, -0x14(%rbp)
0000000100000f0d movl $0xa, -0x18(%rbp)
0000000100000f14 movl -0x18(%rbp), %eax
0000000100000f17 addl $0xffffffff, %eax ## imm = 0xFFFFFFFF
0000000100000f1c movl %eax, -0x18(%rbp)
0000000100000f1f cmpl $0x0, %eax
0000000100000f24 je 0x100000f46
0000000100000f2a leaq 0x61(%rip), %rdi
0000000100000f31 movl -0x14(%rbp), %esi
0000000100000f34 movl -0x18(%rbp), %edx
0000000100000f37 movb $0x0, %al
0000000100000f39 callq 0x100000f70
0000000100000f3e movl %eax, -0x1c(%rbp)
0000000100000f41 jmp 0x100000f14
0000000100000f46 jmp 0x100000f4b
0000000100000f4b movl -0x14(%rbp), %eax
0000000100000f4e addl $0xffffffff, %eax ## imm = 0xFFFFFFFF
0000000100000f53 movl %eax, -0x14(%rbp)
0000000100000f56 cmpl $0x0, %eax
0000000100000f5b je 0x100000f66
0000000100000f61 jmp 0x100000f4b
0000000100000f66 movl -0x4(%rbp), %eax
0000000100000f69 addq $0x20, %rsp
0000000100000f6d popq %rbp
0000000100000f6e retq
Some things to note in the above:

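  • The compiler has done a faithful job of translating the program, exactly as written, to assembly:
    • The variables are initialized at 0x100000f06 (i = 5, stored at -0x14(%rbp)) and 0x100000f0d (j = 10, stored at -0x18(%rbp))
    • The first loop runs from 0x100000f14 through the jmp at 0x100000f41
    • The second loop (despite being a no-op) runs from 0x100000f4b through the jmp at 0x100000f61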

Things get slightly more interesting when you pass the -O (optimize) flag to the compiler:
a.out:
(__TEXT,__text) section
_main:
0000000100000ea0 pushq %rbp
0000000100000ea1 movq %rsp, %rbp
0000000100000ea4 pushq %rbx
0000000100000ea5 pushq %rax
0000000100000ea6 leaq 0xdd(%rip), %rbx
0000000100000ead movl $0x5, %esi
0000000100000eb2 movl $0x9, %edx
0000000100000eb7 xorl %eax, %eax
0000000100000eb9 movq %rbx, %rdi
0000000100000ebc callq 0x100000f6a
0000000100000ec1 movl $0x5, %esi
0000000100000ec6 movl $0x8, %edx
0000000100000ecb xorl %eax, %eax
0000000100000ecd movq %rbx, %rdi
0000000100000ed0 callq 0x100000f6a
0000000100000ed5 movl $0x5, %esi
0000000100000eda movl $0x7, %edx
0000000100000edf xorl %eax, %eax
0000000100000ee1 movq %rbx, %rdi
0000000100000ee4 callq 0x100000f6a
0000000100000ee9 movl $0x5, %esi
0000000100000eee movl $0x6, %edx
0000000100000ef3 xorl %eax, %eax
0000000100000ef5 movq %rbx, %rdi
0000000100000ef8 callq 0x100000f6a
0000000100000efd movl $0x5, %esi
0000000100000f02 movl $0x5, %edx
0000000100000f07 xorl %eax, %eax
0000000100000f09 movq %rbx, %rdi
0000000100000f0c callq 0x100000f6a
0000000100000f11 movl $0x5, %esi
0000000100000f16 movl $0x4, %edx
0000000100000f1b xorl %eax, %eax
0000000100000f1d movq %rbx, %rdi
0000000100000f20 callq 0x100000f6a
0000000100000f25 movl $0x5, %esi
0000000100000f2a movl $0x3, %edx
0000000100000f2f xorl %eax, %eax
0000000100000f31 movq %rbx, %rdi
0000000100000f34 callq 0x100000f6a
0000000100000f39 movl $0x5, %esi
0000000100000f3e movl $0x2, %edx
0000000100000f43 xorl %eax, %eax
0000000100000f45 movq %rbx, %rdi
0000000100000f48 callq 0x100000f6a
0000000100000f4d movl $0x5, %esi
0000000100000f52 movl $0x1, %edx
0000000100000f57 xorl %eax, %eax
0000000100000f59 movq %rbx, %rdi
0000000100000f5c callq 0x100000f6a
0000000100000f61 xorl %eax, %eax
0000000100000f63 addq $0x8, %rsp
0000000100000f67 popq %rbx
0000000100000f68 popq %rbp
0000000100000f69 retq

Some things to note:

  • This looks nothing like the C code.  There are no loops (or indeed, any jump or conditional branch instructions) at all.
  • The compiler determined the second loop to be a no-op, and compiled it away completely.
  • Our stack variables are gone.  The values of i and j are loaded straight into x86-64 registers (%esi and %edx) as constants.
  • The compiler has analyzed the loop and unrolled it into nine separate callq instructions, one per printf call.
Lastly:  The answer to the quiz is in the assembly if you look hard enough.  Before each callq, %esi holds i and %edx holds j:

5 9
5 8
5 7
5 6
5 5
5 4
5 3
5 2
5 1

Pretty cool.  I never get to look at assembly in my day job, so getting this close to the CPU is neat.
