Dreaming of Dragons: Wherein We Discover Some C Code: With A Little Help From Our Friends

For the past year, while studying networking and programming at ATEC, I kept ChatGPT constantly open, not to give me direct answers, but to engage in a kind of "learning dialogue," let's say. It was there to challenge my understanding of the topics I was learning and to quickly fill in the knowledge gaps that came up.

Was I skeptical of its knowledge? Of course. The same way I’m skeptical about any single source of information—take Wikipedia, for example. When it first came out, it was vilified by many for its crowdsourced approach to knowledge. But, hey! We still use it to this day. It’s a great tool, right?

Step 1: Generate and Compile the C Code

Continuing from our previous blog entry, let’s once again use ChatGPT to help us learn a bit more about Assembly and low-level code.

Today, we’re asking ChatGPT to give us a simple C snippet, which we promise not to read. We’ll copy and paste it into a document and compile that document into a binary, which we will then disassemble and try to understand.

Sounds fun? Let’s go!

See that? No peeking. Just copy that code, paste it, and save the file so we can compile it.

Our ask was simple: no recursion (no need to add an extra layer of complexity), only one function, etc. (you can read it for yourself).

Copy-paste that sucker into an empty file, and that's it!
You didn't peek. You have no idea what's in that file. The world still makes sense.

If you are a "dirty cheater", just ask a friend to send you something very simple. Hey, it's a great way to make friends. True friends know C.

Next, you'll want to compile that code without debugging symbols. We'll be building a 32-bit binary (that's the -m32 flag further down), so on a 64-bit system you might need to install multilib support first:

sudo apt-get install gcc-multilib g++-multilib

"Oh, but I'm using Red Hat/Arch (btw)/etc, how do I get that package installed?"

Well, just ask ChatGPT. That's what it's there for. Or Google it, or something.

Let us (finally) compile that code:

gcc -m32 -o my_file my_file.c

I'm going to compile a second version with debugging symbols, by adding -g (remember?). More on this later.

Step 2: Disassemble and Explore the Assembly Code

Now we jump into gdb, like we did last time, but with a small twist: we'll be checking out TUI - the Text User Interface, by typing:

(gdb) layout asm

I might be biased, but this looks totally cool.

I'm not going to go deeply into the function prologue and epilogue (I'll leave that for another blog entry). Right now we're just interested in the "meat" of the program. What is it actually doing?

Notice that, right before the call instruction that invokes a function named compute, we're loading the value 4 and pushing it onto the stack?

That value is being passed to the function as an argument.

That's useful to know!

And here it is, the compute function in all its glory:

Again, we'll skip all the setup and concentrate on the "actions".

Look at that add. We're taking eax, which you might have noticed is now holding the value 4, and adding it to itself, literally doubling that value. There's another addition further on: we add 5 to that value right before leaving our function and returning to main.

Let's cut this story short.

If you look at main, you'll see that, once compute returns, the program prints the result and exits.

Like I said: we'll get to the tasty bits another day, but I wanted you to have a view of what we can do with this stuff. How we can use ChatGPT to create simple challenges which we can then work on. Remember: don't know something? Ask it what it means. Ask it to explain from a different angle. Ask it to draw you a picture - literally.

Step 3: Create Your Own C Version of That Disassembled Code

Let's do it. We're not trying to be perfect here. Only to grasp the idea behind the assembly code and create a C program that could achieve a similar result. And here it is:

Is this perfect? Nah. Far from it. But it gets the gist of what that Assembly code is doing. And that's good enough for now.

Step 4: Now Even More TUI

Remember when I said that I was going to create an extra compiled version of the original code? One that kept the debugging symbols.

Let's open that file in gdb, and after we've entered TUI, we'll also write:

Would you look at that? Because we added debugging symbols to our compiled code, we can now use TUI to read both the disassembled code and the original C code. How cool is that?

Oh, right. Notice that the original code doesn't have result += result? Again, details.

For now, we're pretty satisfied with the result we got.

Next time, we'll be checking out another really cool tool, one that is online and doesn't require any installation or compilation. You just give it the code, and it returns the corresponding assembly output.

I hope this was informative and gave you some ideas on how to use an LLM in your learning process. It can help you achieve these small goals competently and quickly.

Happy disassembling!


Wednesday, September 4, 2024
