Lab 3: Compiled C Code

     In this lab I investigated the transformation of my code from source C code to C compiler output to learn exactly what the compiler is doing when I call it. My initial program was a simple Hello World program:

#include <stdio.h>

int main(){
        printf("Hello World!");
}

     My first compilation was a plain gcc build with most of the optimization and fancy tricks stripped out. This was done with the -O0 and -fno-builtin flags: -O0 means do not use any optimizations when compiling, and -fno-builtin means do not use any built-in function optimizations. The full command was: gcc -g -O0 -fno-builtin -o hello1 hello.c

     The compiler then returns an Executable and Linkable Format file (ELF for short) named hello1, which I used to examine what it had done to my code. The first thing I noticed after using objdump to look at the assembly code was that my simple four-line program had ballooned out to 199 lines, which led me to wonder how much bigger my ELF file was than my C file. ls -lh revealed that my C file was 104 bytes while my ELF file was 72 kilobytes, so quite a jump in size. It makes sense though, as there is a lot of action happening under the hood of a line like #include <stdio.h> or printf("Hello World!");

     Returning to the code in the ELF file itself, there are a lot of system routines that I could dredge through, but for the sake of my understanding I piped my objdump output into less and searched for the code I had written by looking for main. It turned out that objdump lists my code under a label tagged <main>, which will make it easily searchable for the rest of this lab.

     With a basic understanding of what the compiler had done to my code, it was time to change up some of the compilation options and see what would happen. First was to change from a dynamically linked program to a statically linked one. A dynamically linked program stores only the names of its libraries, and the OS uses those names to link in the library modules at run time. This allows many different programs to share the same version of a library, and when that library is updated, the programs that use it do not need to be recompiled. Static linking copies the library modules into the program at link time, which reduces runtime overhead and also preserves the current state of the library, so that each run of the program will be the same until it is rebuilt.

     I compiled my program with gcc -g -O0 -fno-builtin -static -o hellostat hello.c. Knowing that statically linked programs have library modules copied in, my first thought was to look at the size difference: it had jumped from 104 bytes for my C program to 617 kilobytes for the ELF file (compared with 72 kilobytes for the dynamically linked build). Unsurprisingly, I also now had 82381 lines of objdump output. Since static linking only affects how the library is handled, there were no differences when I compared the code in main.

     Next I tried removing my -fno-builtin option to see what changes would occur: gcc -g -O0 -o hellobuilt hello.c. Checking the sizes, there was no change from my initial hello1 ELF file and the objdump output had the same number of lines, so I combed through main to see if changes happened there. My read-through of both mains did not pick up any differences, so I decided to objdump -d both hello1 and hellobuilt into text files and use the diff command to see if I had missed something. To my surprise, the only difference was the file name at the top of each dump, so it appears that removing the -fno-builtin option had no effect on this program.
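
     One plausible reason for the identical output, which I did not verify in this lab: as far as I understand GCC's printf built-in, it only swaps the call for puts or putchar when the format string is simple enough, for example when it contains no % and ends in a newline. "Hello World!" has no trailing newline, so there may simply be nothing to substitute. The sketch below is a rough model of that rule, not GCC's actual code, and is_puts_candidate, greet_plain and greet_newline are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: a simplified model of when a compiler could
 * replace printf(fmt) with puts(). Assumed rule: no '%' in the string,
 * and the string ends with a newline. Real compilers handle more cases
 * (for example single-character strings and "%s\n").
 */
int is_puts_candidate(const char *fmt) {
    size_t len = strlen(fmt);

    if (len < 2) {
        return 0;                    /* empty or one-character strings are separate cases */
    }
    if (strchr(fmt, '%') != NULL) {
        return 0;                    /* conversion specifiers present */
    }
    return fmt[len - 1] == '\n';     /* must end with a newline */
}

/* The call from this lab: no trailing newline, so not a candidate under this model. */
void greet_plain(void) {
    printf("Hello World!");
}

/* Variant with a newline: under this model, a candidate for puts("Hello World!"). */
void greet_newline(void) {
    printf("Hello World!\n");
}
```

     Compiling greet_newline with and without -fno-builtin and diffing the objdump output would be one way to test this guess.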

     Next I removed the -g option to remove the debugging information from my ELF file: gcc -O0 -fno-builtin -o hellodeb hello.c. The size dropped from 72K to 69K, which would account for the debugging information that is no longer being added. Comparing the main of the original to this one shows no visible difference, so what is being excluded is information stored outside my code.

     Next I added eleven sequential integer arguments (20 through 30) to my printf and compiled it.
#include <stdio.h>

int main(){
         printf("Hello World!",20,21,22,23,24,25,26,27,28,29,30);
}

     The change of note here is how these arguments are handled in main. The first 7 arguments are stored in registers, but the rest are pushed to the stack in reverse order so that they can be popped in the expected order. How many arguments fit in registers is set by the platform's calling convention.
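
     To see the receiving side of this, here is a small sketch of a variadic function that reads its integer arguments back with va_arg. sum_ints is a hypothetical helper and not part of my lab code. Whether a given value travels in a register or on the stack is decided by the calling convention, but va_arg hands the values back in call order either way.

```c
#include <stdarg.h>

/*
 * Hypothetical helper: adds up `count` int arguments passed after it.
 * The caller must pass exactly `count` ints.
 */
int sum_ints(int count, ...) {
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);   /* next argument, in call order */
    }
    va_end(args);

    return total;
}
```

     Calling sum_ints(11, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30) with the same values used above returns 275.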

     Next I moved the printf function call into its own function outside of main and called it from main. After compiling and running objdump, I looked in main to see what had happened. Instead of the long list of argument handling being in main, there is now a call to output, where the long list of argument handling occurs.
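
     I did not include the source for this version, so here is a sketch of what it might look like. The function name output is an assumption based on the label seen in the disassembly, and returning printf's result is an addition to make the function easy to check.

```c
#include <stdio.h>

/*
 * Assumed reconstruction: the printf call moved out of main.
 * The format string has no conversion specifiers, so the integers are
 * passed but never printed. Returns the number of characters written.
 */
int output(void) {
    return printf("Hello World!", 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30);
}

/*
 * main then only needs to call it:
 *
 *     int main(void) { output(); return 0; }
 */
```

     With this layout, main contains only the call, and the register and stack setup for the arguments appears inside output instead.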

     Lastly it was time to turn optimization back on, so I compiled with gcc -g -O3 -fno-builtin -o helloopt hello.c. The size of the file is the same as the original, so I looked at main to see if anything had changed. While the results are the same, main has been reduced by eliminating the final junk line after ret.








