Sunday, September 8, 2024

Wherein We Get Lost And Compare Object Dumps: C vs. Assembly

 

That's a rabbit hole, Alice. And those are books on shelves, all the way down.



Hi again!

I created a simple "Hello, World!" program in C, so that we could have a quick talk about function prologues and epilogues in Assembly, but we're in for a detour, as happens with all rabbit holes. 

And the truth is that it's just rabbit holes as we're going down (until we reach elephants, and then it's turtles all the way down, of course).

Here's the culprit:

Ok, nothing impressive, but it does its job.


After compiling this program through the usual steps, the program runs and prints "Hello, World!" to the standard output.

Next, I wanted to create an Assembly program that would print the exact same line, and although I can read some Assembly and am making progress on that front, I can't (yet) write my own Assembly programs. So I asked our LLM friend to do it for us. And so it did:



Pretty neat.

And we can turn this into a binary object file (machine code, but not yet a runnable program) with:
nasm -f elf32 print_hello_ASM.asm -o print_hello_ASM.o


And then link it into an actual program with:
ld -m elf_i386 print_hello_ASM.o -o print_hello_ASM

And voilà! We can run this program just like our C program...

But...

"Wait, wait, wait, wait!" you say.

"What's with the turning-the-code-into-binary-and-then-into-a-program-magic?

We don't need to compile stuff in Assembly, like we do with C?"

Well, those are great questions!

The thing is that we take compilation for granted. In fact, what we casually call "compiling" a C program is done in 4 steps, only one of which is compilation proper:

- Preprocessing

- Compilation

- Assembling

- Linking

Let's ask an LLM to give us a little more information on these steps, and let it assume we want it explained in a simple manner:


Confused? Remember that you can always ask it to explain again from a different angle, in simpler terms, through analogy, etc:


We can always check more trustworthy sources as well (documentation, forums, etc.), such as:
https://unstop.com/blog/compilation-in-c

(I told you, it's rabbit holes most of the way down)


I'm not going to give you an in-depth explanation of these concepts (that's your job, really). But let's just say, for the sake of simplicity, that when we compile our C code we are in fact going through all four steps, whereas turning our Assembly code into a program takes only the last two: Assembling (that was the nasm command) and Linking (that was the ld command). Also, FYI: modern compilers can combine or optimize some of these steps, so you won't always see every intermediate file.
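If you'd like to watch the four steps happen one at a time, here is a minimal sketch. The file names (greet.c, main.o) and the greeting() helper are made up for illustration and are not the code from this post; the gcc options in the comment (-E, -S, -c) are the standard ones for stopping after each stage.

```c
/* greet.c (hypothetical file name, used only for this sketch).
 *
 * Each stage can be run on its own with gcc:
 *   gcc -E greet.c -o greet.i     1. preprocessing: expands #include and #define
 *   gcc -S greet.i -o greet.s     2. compilation: C becomes Assembly
 *   gcc -c greet.s -o greet.o     3. assembling: Assembly becomes an object file
 *   gcc greet.o main.o -o greet   4. linking: object files plus the C library
 *                                    become a program (main.o is an assumed
 *                                    second object file that defines main)
 */

#define GREETING "Hello, World!"   /* replaced textually during preprocessing */

/* After stage 1 the macro name is gone and the body reads:
 *     return "Hello, World!";                                   */
const char *greeting(void)
{
    return GREETING;
}
```

Opening greet.i and greet.s in a text editor after the first two commands is a nice way to see what each stage actually hands to the next one. Our Assembly program simply entered this pipeline at stage 3.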


To showcase the difference between these two processes and the baggage that comes along with C, let's look at an objdump of both our C and our Assembly programs.

What's an objdump? Here:



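So... objdump is basically a tool that takes a binary file and disassembles it back into Assembly code (+ extra info). Running objdump -d on a file, for instance, prints the disassembly of its executable sections.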


Then let's jump into that Assembly objdump of ours, right? Here:


And, for comparison, here's a gif with the C objdump:


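Notice any difference? The C objdump file is a tad longer, even though both programs do the same thing. Most of the extra lines are typically startup and support code that the C toolchain links in around our main function.
And note that I haven't included all the possible information in these dumps (check out the man page for objdump, in particular the -s option, which displays the full contents of the sections).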


Notice, though, that there is something we haven't seen before in our little Assembly forays. In that ASM objdump, we see these "int   0x80" lines. What are these?
Seems important enough.

These are system call interrupts: on 32-bit Linux, "int 0x80" is the software interrupt a program uses to request a service from the operating system's kernel. Namely, we want to be able to print our Hello World message on screen and we also want to be able to exit our program - that's what those two syscalls are doing there (a write and an exit).

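When we're using C, this is done behind the scenes: a library function like printf or puts ends up making the write system call for us, and the C runtime asks the kernel to end the process once main returns - so it's not all that obvious to us.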
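To make that hidden step a little more visible from the C side, here is a small sketch that skips printf and calls write() directly. It assumes a Linux (or other POSIX) system, and write_hello is a hypothetical helper name chosen for this example, not something from the original program.

```c
#include <unistd.h>  /* write() */

/* Hypothetical helper for this sketch: sends the greeting to the given
 * file descriptor and returns the number of bytes written, or -1 on error.
 * Descriptor 1 is standard output. */
long write_hello(int fd)
{
    static const char msg[] = "Hello, World!\n";

    /* write() is the C library's thin wrapper around the kernel's write
     * system call. On 32-bit Linux the wrapper's job amounts to loading
     * the call number (4 for write) and the three arguments into
     * registers and then trapping into the kernel, which is what a
     * hand-written `int 0x80` does explicitly. */
    return (long)write(fd, msg, sizeof msg - 1);  /* -1 skips the trailing '\0' */
}
```

Calling write_hello(1) from main prints the line on standard output. Returning from main then lets the C runtime ask the kernel to end the process, which is the counterpart of the exit call in the Assembly version.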


More info from our friendly LLMs:



Ah, but I just recalled that we were meant to discuss function prologues and epilogues in Assembly.

I went to https://godbolt.org/ and placed my original C code in there, and immediately got an Assembly representation of that code as well. 

And lo and behold, it's even color-coded: each line of C shares a color with the instructions generated from it, which makes it easy to see what is the prologue and what is the epilogue.

Here:



But I'm leaving function prologues and epilogues for an upcoming blog post.



In the meantime, you can always check that yourself if you're curious. Or anything, really. See something you don't understand? Leave no stone unturned! Jump into that hole, satiate your curiosity and keep learning.


