Monday, September 30, 2024

Wherein We Finish The Bandit Game: OTW, part 2

 


Like the title says, I’ve finished that game, and I still maintain that I shouldn't just come here and write my step-by-step flag-finding process. That wouldn't help any reader. Not really.

Besides, in a few seconds, I'm pretty sure you could google several walkthroughs if you wanted to.

What would be the point of that, though? It’s like solving a Rubik's cube not through experimentation and discovery, but by following some ‘recipe’ online.

I'm not going to claim it would be a complete waste of time, but it wouldn't be ideal—for you. Not for me. I'd be fine.

Don’t get me wrong: I have my own write-up of how I found the various OTW flags, in some detail. I might even add it to some half-buried folder in my GitHub account. Why? Because while I was writing, I was trying to find ways to explain my thought process to others—and that’s meaningful—to me. It helped get my thoughts straight.

Instead, I want to offer some ideas, concepts, and approaches to better help you progress and learn from this game.

It is not a hard game. But that’s like saying most mathematics isn't hard. It’s a question of level and having a good grasp of the basics...


Anyway, without further ado, I’ll share a few hints, tips, and ideas to guide you through OTW’s Bandit game. Remember, these are my tips, not some unspoken truth or divine revelation. It’s what I would’ve liked to have heard before starting this trip:



1.

This game deals a lot with Linux CLI commands and basics, a few networking fundamentals and protocols (netcat, nmap, SSH, etc.), and finally some version control (Git).

2.

Take your time. Don’t rush to the flag if you can. Ask yourself questions. What do you know? What would you like to know? How could you get that information you're after? Do you understand what the problem is and what's 'holding you back'?

3.

Use the man pages, use sites, and use LLMs. Ask questions as long as you're not just looking for direct answers. It’s fine to ask how to write a specific command or to search for that command and its arguments. But it’s not okay to just paste the challenge and ask ChatGPT or some forum users how to solve it.

4.

Sometimes the end goal is less ‘fun’ than other parts. You’ve been blocked from accessing X? Oh, you found a way around that. Great! Now you have the flag. A question you might ask is: how exactly did the ‘gamemaker’ block me from accessing X? Can I find out how they did it? Can I at least catch a glimpse of that process/command/block/thinggie?

5.

Is there another way to get to the solution? Often there isn’t, but sometimes there is. Have you seen how quickly a program runs when it's feeding useless output to /dev/null versus when it’s sending everything to stdout? You might be surprised.

6.

Ask where you are. What permissions do you have right now? What can you and can’t you do? Where do you want to go? Do you need to ‘talk’ to a service? Does that service need to stay up while you're working? Is it a one-shot thing? How do you ensure that the process or program gets interrupted so that you can inject some code into it? Some challenges require you to make a program run in a (somewhat) unintended way. Try stuff.

7.

At some point you might feel annoyed, frustrated, irritated, or tired. That’s okay. Really, it is. Don’t give up. Are there any hints from the gamemakers? 

'Go check SSH?'

Alright, then go check it. Stop what you're doing (trying to find the flag at all costs) and go read about SSH. I'm not even kidding; it’s really cool to know more about SSH. Check the man page. Think of all the things you could do with its arguments. Is there anything you’d like to try or test? Test it. Try it. Did it work? Why? Why not? Ask an LLM or another person (a forum, etc.) how to use that parameter, and so on. That leads nicely into the next tip.

8.

Step back from a challenge if you're not seeing the answer right now. I’m sure you're not doing this game to get job X. At least not tomorrow. Take your time. Think about other stuff, then come back to the ‘problem’ at hand. In another life, I was a theater director and acting teacher. A student once asked me, "How do I act?" I told her to orbit around her object of desire. I took a lighter and got it so close to my face that I couldn’t see it anymore. I told her that might be too much, but that she could still perhaps learn from it (she could touch it, at least. That's something). 

Stand back from it. Further back until you can't see it anymore. Now you have to imagine the lighter, perhaps think of what you'd like to check the next time you get closer. 

You get my drift. This stuff isn’t rocket science. Don’t just stare at a problem, all sweaty, until your eyes bleed. Take a break. Walk away, look again, look away, look closer, and spin it around.

9.

Remember, this isn’t just for fun. It’s also for learning (but that is fun, right?). Again, take time to learn these tools, and above all, take time to learn how they work—their protocols, how they communicate, what you need to set up for them to work at all, etc.

10.

Remember when I told you to look beyond the problem and try different things? I’m interested in Reverse Engineering. You can bet your breeches I used radare2, objdump, and other tools to inspect the binaries and programs created by the gamemakers. Not necessarily to find the flag, but to understand what makes those programs tick: why they do what they do.


Take your time, one moment at a time, and have fun. If it stops being fun, that’s okay too. The internet is filled with games and learning tools.
Go find another one that strikes your fancy.

Often it's not the game, it's your perspective on the game. 

As I explained in part one, I got hacked while navigating my way through lvl.20, and I can tell you that I learned more from that interaction and the hours of study that followed than perhaps in 10 or so levels of Bandit combined.

That's it. I'm going to play Leviathan.


The world’s your oyster—or something. 

Saturday, September 28, 2024

Wherein We Get Hacked And Learn In The Process: OTW, part 1




For the past couple of days, I've been playing around with OverTheWire (OTW), particularly the Bandit game. I was having my fun and going through each level, mostly reviewing stuff I know about Linux or networks, but also learning quite a bit.

Upon reaching lvl 20 I ran into a bit more trouble, since I was making a mess for myself while setting up a listening port and so on (I truly don't want to give spoilers or too much... see below).


Not only are these the rules, but I also agree with them. I really really think that if you are going to play these games, you should steer clear of reading spoilers, or walkthroughs. Also, if you want to write about these challenges, you can do it in an interesting fashion, that can help other players, giving them hints, or neat ideas instead of spoon-feeding them answers.

So, I won't tell you how I solved lvl 20. But I'll tell you what happened to me while I was on that level.

You know the saying 'a picture is worth a thousand words'? Well, how about a picture of words? Is the effect multiplicative? Here's the pic:


It's a bit messy, because we were using the command line in tmux as a means of communication within the terminal: which isn't its intended purpose, but still, I think you can follow the brief conversation.

Moments before, I was baffled when I noticed someone taking control of my tmux session. I closed the session, re-opened it and waited for a reply. And there it was, with a friendly 'hello :)'. I would have started with 'Wake up, Neo...' myself...


...but that might just be my age speaking.


So, conversation friendliness aside, let us figure out together what happened here.

The other user was not only kind, but right in the way they explained to me how I could find out about the 'intrusion' method:

man tmux

That's it, really. Just read the fine manual. And so I did, taking the time to learn a couple of new things instead of just skimming it again.

One of the things that I learned is that you can scan for existing sessions, by listing them:

tmux list-sessions

or

tmux ls

This will show you current sessions and their names.

You can then jump (or attach, really) to that session. Let's say there's a session named my_session open. You could then:

tmux attach-session -t my_session

And yes. It's that simple. You are now sharing the same session as the user that created and is using it.

Try creating a couple of sessions. Create a tmux session, then exit it (Ctrl+D works).

The default name for a tmux session is '0'. Just that.

So, you could also, arguably, try to connect to a session automatically, retrying that name until it shows up, just to catch someone unawares, with:

while :; do tmux attach-session -t 0; sleep 1; done

Note: Don't do this for long. Avoid spamming the server that's granting us free access for learning. I did it for a couple of seconds, just to test stuff out, and showcase it to you. Remember to stop that loop asap (Ctrl+C).

So what's happening in that image?
I have two sessions in level bandit1, and in one I'm running this loop, while in the other I simply typed 'tmux' twice, automatically entering a default tmux session, called '0'.
On both occasions, both sessions entered the same tmux session, which I'll showcase here, but with a specifically named session called my_session:


Upper left corner: my original tmux session (my_session)
Lower right corner: the session I used to jump into my_session

Whatever one user writes, the other one will see.

Test it out. You don't even have to enter OTW. You can try it on your own local machine, with two different terminals.

If you don't know, OTW makes you log in as a different user per level (usually). This means that everyone connected to that server, on that level, will have the same name. This is rather interesting, and it got me thinking. We're using the same user, but different terminals. Hold that thought. We'll get back to that in a second. Let's first find a way to harden our virtual tmux terminal so as not to get interrupted while we're playing OTW, shall we? You're free not to do it. After all, it can be a great way to make friends. Very old school too, if I might say so myself.

Ok, so how do you 'strengthen' your tmux session, making it much harder for the general user to see what your session is?

You need to create a tmux socket and create your session through that socket.
OTW protects its filesystem, and we, as users, are only allowed to create files within the /tmp folder. So we do that. We create a socket and connect through it:

tmux -S /tmp/my_wonderful_new_secretive_socket new -s my_session_name

Any other user now trying to list all active tmux sessions won't see yours, since a plain 'tmux ls' only shows the sessions on the default socket.

But couldn't they simply go through all files in /tmp?

They could, but OTW has that folder protected:

There's a sticky bit there (t), ensuring that users can only delete or rename files that they own. Also, 'r' is missing for others, meaning that they cannot list the contents of that folder.

This makes it a bit harder to simply scan the folder and start checking out stuff.

"But wait!" you say. "Couldn't we just set restrictive permissions on our own socket?"

We can, yeah. But remember that all users on this level have the same name as we do. So, they're basically the same user.


We are, in fact, using some mild security through obscurity, and defending our tmux socket (and therefore session) from our other selves (I'll give some info on security through obscurity in another blog post).

Very Pessoesque... or is it the other way around?

Anyway, there we go. Good job at protecting yourself from your other selves!

We can now run our tmux sessions in (relative) peace. We have erected a fence that will keep away the overwhelming majority of attacks.

Remember, though, when we were looking at the rules?

- don't annoy other players

Did they mean this tmux trick? How could we possibly annoy other players?

That got me thinking.

We're all the same user, basically, but we're connecting from different terminal devices, right? Tmux would be a virtual terminal, but we're initially connecting from some other terminal.


Ok, let's see what's my current terminal:


Aha, interesting. And, just for our own pleasure, what's our virtual terminal device when inside our now (highly-amazing and protected) tmux session?


Also interesting. We can also see what permissions are set on our terminals.

Try to do ls -hal /dev/pts/40

Interesting... you can see that we have writing permission if we are the user.
But, everyone in this level is bandit1, so we can just do

who


This is a list of all visible learners and their respective levels/users.

So, what's stopping us from quoting The Matrix? Nothing, other than being a nuisance. So, let's open a second session. And let's just flex our age and good taste in cinema onto our own selves in another virtual reality.
Let's also try to reply, with a caveat: we've done 'chmod 000' to our original pseudo-terminal device. Look at the result. It's interesting.


You might be wondering how we're able to continue using our terminal when we've removed all permissions, even for the owner... Good point!

 The key is that our current process already has the terminal open and maintains its file descriptors. This allows us to continue reading from and writing to the terminal as usual.

However, any new attempts to access the terminal device file will fail.
For example, if we try to send an echo message to our terminal using:

echo "Hi, guy!" > /dev/pts/40

...it won't work. 

But we can still do:

echo "Hi, guy!" normally in our current session.

Also interesting!

This showcases a bit of how Unix-like systems handle file descriptors and permissions. That's also food for thought and another day's snack.

I'll be writing more about my OTW playtime ('part 1' was a giveaway, wasn't it?). I'll share ideas, insights and stuff I find interesting, but not walkthroughs or spoilers.

Finally, I'd like to extend a thank you to the nameless 'hacker' with whom I had a pleasant, albeit short, conversation. That romp got me into a learning marathon which has been super fun.

Have fun, enter The Matrix, and behave!

Saturday, September 21, 2024

Wherein We Create An Assembly Program: Butting Heads With The OS



When I first started learning Assembly, I wanted to write simple code, but the apparent complexity of the code I’d seen up to that point threw me off—and that might happen to you as well.


Thing is, you've probably seen ASM code that is basically a decoding of some program - the assembly level code that is the result of a partial compilation or full compilation of another program. That means this code will have a lot of overhead from function prologues, epilogues, and groundwork needed to work smoothly with different libraries and instances.

But you don't need all that to write simple ASM code. In fact you only need this:


That's right:

- you need a data section and a text section (and some detail in between).

So, having found out that that was the case, I started creating simple programs. One of the first ones was this one, a simple "Hello, World!" program (ah, the good old tradition):

See? I told you. Not too complicated.

We have a .data section, a .text section and _start, which is a label that is also where the program execution begins.
 
Yeah, ok... besides that, there's probably a lot here you've never seen before. But remember that this is not a race, it's a marathon (whatever this is). We're going to search, ask and prod for everything we don't understand. And then we're going to get some practice under our belts with these new things.

But let's clear the waters and define some of these terms and words:

db - 'define byte'. This allocates storage for a string of bytes and initializes it with a given value.

0xa - this introduces a newline character (go check 'line feed' and also look up this ASCII table, *wink, wink, nudge, nudge*)

$ is the current address in the assembler, so $ - msg gives us the difference between that current address and the address where msg starts. We then let len be that difference. Smart, huh? That gives us the length of our msg string, no matter what it contains.

As for int 0x80, it's a software interrupt that asks the kernel to perform a syscall. We've mentioned that before. But most of today's blogpost will revolve around syscalls and registers, so we'll get right back to that after we stop babbling and create our first Assembly program.

Remember, we save our ASM script, go back to the command line and enter the following:

nasm -f elf32 -o hello_world.o hello_world.asm
ld -m elf_i386 -o hello_world hello_world.o
./hello_world

And voilà! Our first Assembly 'Hello, World!' program. Newline included, free of charge.

Let's look at our code proper, inside of _start. What are those lines doing? One by one:

mov edx, len -> the sys_write syscall, which will print "Hello, World!" to stdout, expects the length of the data to be in edx. So we move it into that register.
mov ecx, msg -> that same syscall expects the address of msg to be in ecx.
mov ebx, 1 -> 1 is the file descriptor of stdout.
mov eax, 4 -> 4 is the call number for sys_write.
Next we do our interrupt, after everything being loaded up and ready.

And finally we do:
mov eax, 1 -> 1 is the number for the sys_exit syscall.
We do our final interrupt, and the program exits.


Let's go back to our 0x80 interrupts
Why are we doing them within our Assembly code, and why aren't we doing them when we create and compile our C programs, for example?

Well, we do, or rather it happens quietly in the background. These syscalls are hidden within our compilation process, and in particular within the libraries we use. The libraries are packaged with more than just the functions we regularly use. They also contain these syscalls, which tell the OS to act in particular ways.

The function printf, for example, doesn't directly issue a system call. Instead, it goes through the C runtime, which handles formatting, buffering, and then calls a lower-level function like write() (from the C Standard Library) to send the output to stdout.
There is, in fact a system call to the kernel, but it's hidden behind several layers of abstraction.
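Next time you check the Assembly code of a C program, be on the lookout for stuff like call 0x1030 <printf>. This is an ordinary function call into the C Standard Library, not a syscall in itself. The actual syscall happens further down, inside the library.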

Now for something that happened to me a while ago, when I was creating a simple Assembly program to add two predefined numbers. Here’s part of the code—this version works:



Ok, like I said, this worked just fine, added '1' to '2' and returned '3'. Amazing.

Then I started thinking: 'That’s a lot of repetition. I’m constantly refreshing the register values after each interrupt. What if I just skip some of that?'

And so I did. This was one of such versions:


See? I just deleted register instructions in between interrupts.
Long story short: this version doesn't work.

And that was surprising to me at the time, and got me wondering as to why the program wasn't working. I could tell that my register values weren't being preserved between syscalls. But why?

After a little research I found out why: the kernel takes control of our program upon a syscall, and the registers are part of how the two sides talk to each other. On 32-bit Linux, for instance, eax comes back holding the syscall's return value, not the number we loaded into it. Hence, we need to re-set our registers to the values that we want them to have after each syscall.

In fact, when the program is running normally, it’s in 'user mode,' and after a syscall is made, control is given to the Operating System. At that point, we enter 'kernel mode' until the OS finishes and returns control to the user. Here's a bad sketch to view this in action:


Good to know, right? Or at least interesting enough!

So I decided to create a (cheaty) ASM program to detect the values of the registers before and after a syscall. I wanted to actually visualize this change in two moments in time, much like the much maligned print debugging.
Here it is:



If you try to compile this program normally, it won’t work. And yeah, I’m a dirty cheater for this, but explaining all the reasons why is beyond the scope of this post. But here's a hint: see that call printf? That looks a lot like a C thing, but not so much an ASM thing.

I leave it to you to prod as to how I've cheated and what these compilation steps (which make the program work) are actually doing:

nasm -f elf32 -o register_catcher.o register_catcher.asm 
gcc -m32 -o register_catcher register_catcher.o -nostartfiles -no-pie




See? You cannot expect the registers to remain the same after an interrupt.
The OS will use and change them, often in ways you don’t expect.

Care to remake this catcher in pure ASM? Perhaps you'd like to learn more about Assembly? Then check this link. How about a Syscall Table?

The purpose of this blog is not to teach Assembly step by step, but there's a ton of resources out there for you to explore.

I hope you have learned something new. I learned a lot while playing with Assembly and when writing these blog posts.

Enjoy!





Sunday, September 15, 2024

Wherein We Investigate A Function Prologue: Finally!

 

And here it is, finally: the majestic function prologue.


This is, in fact, part of a small C program and its Assembly version, as we can see here in Godbolt (again, use this - it's great).



Godbolt even distinguishes the different sections of our program (both in C and Assembly) in different colors, which is just a neat feature.


Assembly Prologue:

For starters, we'll represent the stack on the upper left with a crude drawing.
Items are added and placed on top of the pile of values, and memory addresses are represented in hexadecimal. These values decrease as we go up in the stack.



Notice that the memory locations are very much fictitious. As an added note, a memory location like 0x70 would be represented in 32 bits as:
0000 0000 0000 0000 0000 0000 0111 0000. So, if we were being very explicit in our representation, that memory address would be represented as 0x00000070.
Now that we have an initial state for our stack and for ECX which we'll use, let's carry on with the first instruction:





We are loading the effective address at ESP+4 onto ECX. Note that LEA doesn't actually access memory, it just performs address calculation.
In this context, ESP+4 is typically pointing to the program arguments.
Also, FYI, we could arguably substitute LEA ECX, [ESP+4] with:

MOV ECX, ESP
ADD ECX, 4


But this would require two steps instead of one. Either way, it's good to know what LEA is doing behind the scenes. As we can see, the stack remains the same, but ECX is now pointing to 0x68.
Let's carry on to the second instruction:



AND: You probably remember this AND fella from boolean logic. It does exactly what it did back then. We're combining the value stored in ESP with the value -16, bit by bit, and turning to 0 anything that isn't 1 in both numbers. Here's what I mean:

(ESP AND -16):
0000 0000 0000 0000 0000 0000 0110 0100
1111 1111 1111 1111 1111 1111 1111 0000 (AND)
---------------------------------------
0000 0000 0000 0000 0000 0000 0110 0000 (0x60)

What's with all those 1's, you ask? Good question. Go check out two's complement. It's really useful and it will be a great aid in the future.

This operation guarantees that the final 4 bits are zeros. Think of those bits like the remainder of a division by 16. If we have no remainder, then it means that we're working with a multiple of 16. And that's exactly why we're doing this. We're ensuring optimal performance by allowing the CPU to access memory more efficiently. Misaligned memory access can, in principle, lead to performance penalties or even crashes. 
This new position to which ESP is now pointing has the value 'e' stored in it, just for our sake.

Next instruction:



Now we're pushing whatever value is stored at ECX-4 (which is the previous ESP position) onto the top of the stack. So, we do just that. 'd' is sent upstairs and our ESP register follows accordingly. Why DWORD? That's a double word. In x86 terminology a word is 16 bits in size (a holdover from the 16-bit days, even on a 32-bit architecture), so our DWORD will be 32 bits, or 4 bytes in size. The PTR lets us know that the operand is a value of that size stored at a memory address, so we know that we're reading from memory rather than from a register.

Let's carry on:




That's a pretty simple one, we're pushing EBP onto the top of the stack, and so we did. We're saving the previous base pointer onto the stack and allowing the function to establish a new stack frame by setting EBP to the current value of ESP (and later restoring the previous stack frame when the function returns).

Fifth instruction:




And voila, we move ESP into EBP, setting up the new stack frame with EBP pointing to the current top of the stack. These two instructions (the push and the mov) allow for easy access to function parameters and local variables (relative to EBP).

Sixth instruction:




We're now pushing the value of ECX onto the stack, tracking the stack alignment and assisting in restoring the stack at the end of the function. ECX will later be used at the function epilogue to revert the stack to its original state. It's also a way to ensure that the stack is properly cleaned up when the function returns.

Final instruction of the prologue:



We're subtracting 20 from what we have at ESP, resulting in ESP = 0x40.
This creates a bubble of sorts, some room on the stack for local variables and temporary data used by the main() function.


I know... as drawings and sketches go, this one is a bit rough around the edges, but I hope it carries the message through.

Either way, you should be doing your own drawings! Start exploring this stuff until it starts making sense. A great way to learn these concepts is to keep being exposed to them, write them down, try programming them, rinse and repeat.

Have Fun!








Sunday, September 8, 2024

Wherein We Get Lost And Compare Object Dumps: C vs. Assembly

 

That's a rabbit hole, Alice. And those are books on shelves, all the way down.



Hi again!

I created a simple "Hello, World!" program in C, so that we could have a quick talk about function prologues and epilogues in Assembly, but we're in for a detour, as happens with all rabbit holes. 

And the truth is that it's just rabbit holes as we're going down (until we reach elephants, and then it's turtles all the way down, of course).

Here's the culprit:

Ok, nothing impressive, but it does its job.


After compiling this program through the usual steps, the program runs and prints "Hello, World!" to the standard output.

Next, I wanted to create an Assembly program that would print the exact same line, and although I can read some Assembly and am making progress on that front, I can't (yet) write my own Assembly programs. So I asked our LLM friend to do it for us. And so it did:



Pretty neat.

And we can turn this into a binary file with:
nasm -f elf32 print_hello_ASM.asm -o print_hello_ASM.o


And then turn it into an actual program with:
ld -m elf_i386 print_hello_ASM.o -o print_hello_ASM

And voila! We can run this program just like with our C program...

But...

 "wait, wait, wait, wait!"
You say.

"What's with the turning-the-code-into-binary-and-then-into-a-program-magic?

We don't need to compile stuff in Assembly, like we do with C?"

Well, those are great questions!

The thing is that we take compilation for granted. In fact, compilation is done in 4 steps:

- Preprocessing

- Compilation

- Assembling

- Linking

Let's ask an LLM to give us a little more information on these steps, and let it assume we want it explained in a simple manner:


Confused? Remember that you can always ask it to explain again from a different angle, in simpler terms, through analogy, etc:


We can always check more trustworthy sources, check documentation, forums, etc, like in:
https://unstop.com/blog/compilation-in-c

(I told you, it's rabbit holes most of the way down)


I'm not going to give you an in-depth explanation of these concepts (that's your job, really). But let's just say for the sake of simplicity, that when we compile our C code, we're in fact going through these four steps, and that when turning our Assembly code into a program, we just take the two last steps: Assembling and Linking (also, fyi: note that these steps can be combined or optimized in modern compilers).


To showcase the difference between these two processes and the baggage that comes along with C, let's look at an objdump of both our C and our Assembly programs.

What's an objdump? Here:



So... it's basically when we take a binary file and disassemble it back into Assembly code (+ extra info).


Then let's jump into that Assembly objdump of ours, right? Here:


And, for comparison, here's a gif with the C objdump:


Notice any difference? The C objdump file is a tad longer.
And note that I haven't included all the possible information in these dumps (check out the man page for objdump, in particular the -s argument).


Notice, though, that there is something we haven't seen before in our little Assembly forays. In that ASM objdump, we see these "int   0x80" lines. What are these?
Seems important enough.

These are system call interrupts, which are a way for our program to request services from the operating system's kernel. Namely, we want to be able to print our Hello World message on screen and we also want to be able to exit our program - that's what those two syscalls are doing there.

This is done behind the scenes through compilation when we're using C - so it's not all that obvious to us.


More info from our friendly LLMs:



Ah, but I just recalled that we were meant to discuss function prologues and epilogues in Assembly.

I went to https://godbolt.org/ and placed my original C code in there, and immediately got an Assembly representation of that code as well. 

And lo and behold, it's even color-coded, allowing us to see exactly what is the prologue and what is the epilogue.

Here:



But I'm leaving function prologues and epilogues for an upcoming blog post.



In the meanwhile, you can always check that yourself if you're curious. Or anything, really. See something you don't understand? Leave no stone unturned! Jump into that hole, satiate your curiosity and keep learning.



Wednesday, September 4, 2024

Wherein We Discover Some C Code: With A Little Help From Our Friends

 


 For the past year, while studying networking and programming in ATEC, I kept ChatGPT constantly open—not to give me direct answers, but to engage in a kind of "learning dialogue," let's say. It was there to challenge my understanding of the topics I was learning and to quickly fill in knowledge gaps that came along.

Was I skeptical of its knowledge? Of course. The same way I’m skeptical about any single source of information—take Wikipedia, for example. When it first came out, it was vilified by many for its crowdsourced approach to knowledge. But, hey! We still use it to this day. It’s a great tool, right?


Step 1: Generate and Compile the C Code

Continuing from our previous blog entry, let’s once again use ChatGPT to help us learn a bit more about Assembly and low-level code.

Today, we’re asking ChatGPT to give us a simple C snippet, which we promise not to read. We’ll copy and paste it into a document and compile that document into a binary, which we will then disassemble and try to understand.

Sounds fun? Let’s go!

See that? No peeking. Just copy that code, paste it, and save the script so we can compile it.

Our ask was simple: no recursion (no need to add an extra layer of complexity), only one function, etc (you can read it for yourself).

Copy-paste that sucker into an empty file, and that's it!
You didn't peek. You have no idea what's in that file. The world still makes sense.

If you are a "dirty cheater", just ask a friend to send you something very simple. Hey, it's a great way to make friends. True friends know C.

Next, you'll want to compile that code without debugging symbols. You might need to install the necessary multi-lib support:

sudo apt-get install gcc-multilib g++-multilib

"Oh, but I'm using Red Hat/Arch (btw)/etc, how do I get that package installed?"

Well, just ask ChatGPT. That's what it's there for. Or Google it, or something.

Let us (finally) compile that code:

gcc -m32 -o my_file my_file.c

I'm going to compile a second version with debugging symbols, by adding -g (remember?). More on this later.

Step 2: Disassemble and Explore the Assembly Code

Now we jump into gdb, like we did last time, but with a small twist: we'll be checking out TUI - the Text User Interface, by typing:

(gdb) layout asm


I might be biased, but this looks totally cool.


I'm not going to go deeply into the function prologue and epilogue (I'll leave that for another blog entry). Right now we're just interested in the "meat" of the program. What is it actually doing?

Notice that, right before the call line where we're calling a function named compute we're actually loading the value 4 and pushing that onto the stack?

That value is being loaded into the function as an argument.

That's useful to know!

And here it is, the compute function in all its glory:


Again, we'll skip all the setup and concentrate on the "actions".

Look at that add. We're taking eax, which you might have noticed is now holding the value 4, and adding it to itself, literally doubling that value. And we have another addition further on. We're adding 5 to that value, right before leaving our function and returning to main.

Let's cut this story short.

If you look at main, you'll see that our function will then print the result and end the program.

Like I said: we'll get to the tasty bits another day, but I wanted you to have a view of what we can do with this stuff. How we can use ChatGPT to create simple challenges which we can then work on. Remember: don't know something? Ask it what it means. Ask it to explain from a different angle. Ask it to draw you a picture - literally.


Step 3: Create your own C version of that disassembled code

Let's do it. We're not trying to be perfect here. Only to grasp the idea behind the assembly code and create a C program that could achieve a similar result. And here it is:


Is this perfect? Nah. Far from it. But it gets the gist of what that Assembly code is doing. And that's good enough for now.


Step 4: Now even more TUI

Remember when I said that I was going to create an extra compiled version of the original code? One that kept the debugging symbols.

Let's open that file in gdb, and after we've entered TUI, we'll also write:

Would you look at that? Because we added debugging symbols to our compiled code, we can now use TUI to read both the disassembled code and the original C code. How cool is that?

Oh, right. Notice that the original code doesn't have result+= result? Again, details.

For now, we're pretty satisfied with the result we got.

Next time, we'll be checking another really cool tool, one that is online and that doesn't require any installation or compilation. You just present the code, and it will return the corresponding assembly output.

I hope this was informative and gave you some ideas on how to use an LLM in your learning process. It can help you achieve these small goals competently and in an expeditious manner.


Happy disassembling!




