Monday, November 18, 2024

Wherein We Do Some Magic!: File Headers

 

All the world's a stage, and all the men and women merely players


Today, we'll be talking about File Headers, also known as Magic Numbers.


These are specific sequences of bytes at the beginning of a file that identify its type and format (e.g., PNG: 89 50 4E 47). They make life easier for programs and people alike: the file type can be identified quickly, without having to search inside the file for specific structures.
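To make the idea concrete, here is a minimal Python sketch of a signature lookup. The table is deliberately tiny, and the names SIGNATURES, identify and identify_file are mine, made up for illustration; real tools rely on far larger databases.

```python
# Hypothetical mini-database: a handful of well-known signatures only.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\x1f\x8b": "gzip data",
    b"\xff\xd8\xff": "JPEG image",
    b"\x7fELF": "ELF binary",
}


def identify(data: bytes) -> str:
    """Return a label if the data starts with a known signature."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first few bytes of a file and identify them."""
    with open(path, "rb") as fh:
        return identify(fh.read(16))


# The first bytes of any PNG file: the 8-byte signature, then the first chunk.
print(identify(bytes.fromhex("89504e470d0a1a0a0000000d")))  # PNG image
```

Note that only the first 16 bytes are ever read, which is the whole point: no need to parse the rest of the file.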


GZIP: 1F 8B 08


As you can see here, the GZIP signature is right at the beginning of the file. Throughout this blog post, I'll be using tools like xxd (which we've seen before) to actually check these headers.
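If you don't have xxd at hand, a rough equivalent can be sketched in a few lines of Python. This is only an approximation of xxd's layout (the hexdump helper is my own, not a standard function), and it builds its own gzip data in memory so it runs anywhere.

```python
import gzip


def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex columns and printable ASCII."""
    pad = width * 3
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}: {hex_part:<{pad}} {ascii_part}")
    return "\n".join(lines)


# Compress something in memory and look at the first 16 bytes.
blob = gzip.compress(b"hello, magic numbers")
print(hexdump(blob[:16]))
```

The dump starts with 1f 8b (the gzip magic) followed by 08, the compression method byte for DEFLATE.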



In Reverse Engineering, these file signatures allow for quick identification of files, help detect tampering with said files, and point to the appropriate tools or parsers to use.

Remember that file signatures can be modified to disguise file types or to bypass detection. Malware often uses such tactics, obfuscating payloads to evade analysis.

It's a good idea to practice identifying these numbers. If, for some reason, the expected signatures aren't detected, it might be a good idea to whip out a hex dump tool like xxd or use tools like file or binwalk to analyze headers. These commands and tools rely on databases of known file signatures to identify file types and structures quickly.

- The file command, in particular, relies on a magic database (commonly /usr/share/misc/magic), which contains predefined patterns for file headers.

- binwalk goes beyond headers to scan the entire binary for embedded file types or compressed data. It also uses signature databases but is more specialized for firmware analysis, detecting compressed archives, or images embedded in binaries.
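The difference between the two approaches can be sketched in Python: instead of checking only offset 0, a scanner looks for known signatures anywhere in the data. This is a toy illustration of the idea, not how binwalk is actually implemented; the EMBEDDED table and the scan function are hypothetical names.

```python
# Hypothetical signature table for the scan; real tools know many more.
EMBEDDED = {
    b"\x1f\x8b\x08": "gzip",
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
}


def scan(data: bytes) -> list:
    """Return sorted (offset, name) pairs for every signature hit."""
    hits = []
    for magic, name in EMBEDDED.items():
        start = 0
        while (pos := data.find(magic, start)) != -1:
            hits.append((pos, name))
            start = pos + 1
    return sorted(hits)


# A fake "firmware" blob: padding, a PNG signature, junk, a gzip signature.
firmware = bytes(10) + b"\x89PNG\r\n\x1a\n" + b"junk" + b"\x1f\x8b\x08"
print(scan(firmware))  # [(10, 'PNG'), (22, 'gzip')]
```

Short signatures produce false positives on large binaries, which is one reason real scanners validate the structure that follows each hit.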


JPEG: FF D8 FF E0


Speaking of JPEG files, I found an interesting challenge on a CTF: I was presented with a data file which was hard to interpret. This was its header:



If we know nothing about headers, then this is meaningless. 

But if we recognize the JPEG signature, then we can see that the header is there, but reversed in 4-byte chunks (due to endianness). So I wrote a Python script to process the whole file, reversing the byte order, 4 bytes at a time.
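The script itself isn't reproduced here, but the idea can be sketched as follows. This is a minimal reconstruction under my own assumptions, not the original: the function name swap_words is made up, and the sample bytes are simply a standard JPEG/JFIF header scrambled by hand.

```python
def swap_words(data: bytes, word: int = 4) -> bytes:
    """Reverse the byte order inside each `word`-sized chunk.

    A trailing chunk shorter than `word` is reversed as-is.
    """
    out = bytearray()
    for i in range(0, len(data), word):
        out += data[i:i + word][::-1]
    return bytes(out)


# FF D8 FF E0 00 10 4A 46 with every 4-byte group reversed.
scrambled = bytes.fromhex("e0ffd8ff464a1000")
restored = swap_words(scrambled)
print(restored.hex(" "))  # ff d8 ff e0 00 10 4a 46
```

For a whole file, the same function can be applied to the bytes read with open(path, "rb") and the result written back out in binary mode. The operation is its own inverse, so running it twice gives back the scrambled input.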



When that was done, the weird file was shown to be a well-behaved JPEG file (containing a flag). CTFs are fun!

ELF: 7F 45 4C 46


But there's more to headers than just the initial signature. For instance, in this ELF file, if we look beyond the initial bytes, we can see:
  • 02 -> 64-bit (0x01 for 32-bit)
  • 01 -> Little-endian (0x02 for big-endian)
  • 01 -> Current version

You can extract this information with tools like readelf (as shown above). For images, tools like exiftool are handy for extracting metadata embedded in files.
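The same three bytes can also be read by hand. Here is a small Python sketch that decodes them from the start of an ELF header; parse_e_ident is a name I made up, and the sample bytes are constructed in the script rather than taken from a real binary.

```python
def parse_e_ident(header: bytes) -> dict:
    """Decode the class, data encoding and version bytes of an ELF header."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    classes = {1: "32-bit", 2: "64-bit"}
    encodings = {1: "little-endian", 2: "big-endian"}
    return {
        "class": classes.get(header[4], "invalid"),
        "data": encodings.get(header[5], "invalid"),
        "version": header[6],
    }


# 7F 45 4C 46, then 02 01 01, then padding up to 16 bytes.
sample = bytes.fromhex("7f454c46020101") + bytes(9)
print(parse_e_ident(sample))
# {'class': '64-bit', 'data': 'little-endian', 'version': 1}
```

To try it on a real binary, pass in the first 16 bytes read from the file in binary mode.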

There are tables and references available for identifying these headers. Take some time and explore this stuff.
Whether you're debugging a binary, hunting for a flag, or analyzing malware, knowing these magic numbers can make all the difference.

Lift the curtain and have some fun!

PS: do tarballs work as expected? 

Saturday, November 16, 2024

Wherein We Crack Yet Another Program And Learn Something In the Process: part three (or something)

 



So, let's fast-forward through this first part. While it was revealing, it wasn’t all that great. Informative? Sure. Exciting? Nah.

So we can skip the fluff.


There I was, creating yet another C program to crack—asking an LLM (Large Language Model) to be rough with me. I told it to place whatever protections it found amusing, especially ones that might put a damper on my usual GDB shenanigans.

I whipped up a simple C program with some XOR gimmicks and handed it over to the LLM, telling it, “Go nuts. Protect this binary as if your life depends on it.” (I might be paraphrasing here).

The LLM's Attempt at a Challenge

Well, the LLM tried, but it failed pretty hard. Not because I’m some kind of binary-reversing wizard (I’m not), but because its defenses mostly relied on surface-level userspace tricks. These are the kinds of protections that look flashy but crumble under the weight of a determined debugger wielding carefully placed breakpoints.

Let’s cut to the chase: here’s a snippet of the original code it generated:


Breaking the "Protections"

Most of these defenses—fake functions, misleading execution flows, or basic obfuscation (not all seen here)—can be easily defeated with a debugger. When you examine the binary at runtime, these kinds of tricks are more like a speed bump than a roadblock.

GDB was enough by itself to detect the two main weaknesses—key+encrypted password:


And voilà, a quick peek into those memory locations reveals the key and the encrypted password. Nothing we haven’t seen before:


The logic here is straightforward. By reading the ASM, we can tell there’s an XOR operation happening, and the 4-byte key is being repeated (via a modulo 4 operation) to cover the encrypted password’s full length (10 characters).

Great! From here, undoing the operation is trivial. A simple Python script does the trick:
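The original script and values aren't reproduced here, so what follows is a minimal sketch with stand-ins. The key and password below are hypothetical, chosen only to match the shapes described above (a 4-byte key, a 10-character password); the function name xor_repeat is also mine.

```python
def xor_repeat(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the key, repeating the key via modulo."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# Hypothetical values, NOT the ones from the actual binary.
key = bytes.fromhex("deadbeef")   # 4 bytes, so i % 4 picks the key byte
secret = b"hunter2024"            # 10 characters

# What would be sitting in the binary's memory:
encrypted = xor_repeat(secret, key)

# XOR is its own inverse, so the same function decrypts.
print(xor_repeat(encrypted, key).decode())  # hunter2024
```

In practice, the key and encrypted bytes would be the ones read out of memory in the debugger, pasted in place of the stand-ins.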


And that’s it. We have the password, the binary is cracked, and we move on.

Lessons Learned

What’s the moral of this part? Don’t store your bloody password and key inside your binary. Ever. Seriously, it’s like leaving your house key under the mat and hoping no one checks.


This reminds me of that guy who stored his password inside his binary while working on a GitHub project with full version control. He was surprised to find others knew the pass, regardless. 



What's Next?

I could create more complex C programs where the password lives elsewhere (maybe a server, maybe environment variables), but honestly, that defeats the purpose of this kind of exercise. Plus, it opens up a whole other can of worms I don’t feel like opening just yet.

Instead, we’ll dive into Binary Security: NX, ASLR, RELRO, Stack Canaries, and how these mitigations shape the reverse-engineering landscape.

It’ll be fun (or your money back—promise).












Thursday, November 7, 2024

Wherein We Were On A 24/7 Regimen: Vacation & CTFs

 



While enjoying some holiday time (because who doesn't love mixing relaxation with buffer overflows?), I finally decided to tackle PicoCTF's challenges. You know that feeling when you find a perfect excuse to dive deep into binary exploitation? Yeah, that's the one.

So, PicoCTF's Capture the Flag challenges come in different flavors, each with its own special sauce:

  • Web Exploitation 
  • Cryptography 
  • Reverse Engineering 
  • Forensics 
  • General Skills 
  • Binary Exploitation




While on vacation, I completed just about all Easy PicoCTF (PicoGym) challenges in a few hours, which was a lot of fun, and then decided to tackle the Medium ones as soon as I arrived back home.

I then stumbled upon one particular challenge that had me grinning like a kid in a candy store: a Linux machine that allowed no alphabetic characters. None. Zero. Null.

Just numbers and symbols.

Being somewhat versed in C (because real friends...), I thought I had it all figured out. Characters are just numbers in disguise, right? I'd just convert a bunch of numbers into characters and then feed that to stdin and then I could ls, cat, grep... Wrong! The challenge designers were way ahead of me. Any attempt to convert numbers to ASCII characters? Boom. Server says 'no way, Jose. Go do something else'.

But wait, there's more...

The real fun began when I shared this challenge with some friends. A couple of them didn't know CTFs were a thing, and they thought that this particular challenge sounded crazy fun. So, at around lunchtime, we went at it.

Remember that more modern security challenges often include protections like ASLR, DEP, and other acronyms that usually drive newcomers to fits of despair. But this challenge? This was different. It wasn't about bypassing protections - it was about thinking differently about how we interact with Linux systems. How on earth do you run commands when you can't write letters? What commands can you write?

While many classic approaches would be blocked by the no-alphabet rule, there are ways around limitations. That's the beauty of Linux - there's usually another way.

So, we put our heads together and had a great time, bouncing ideas off each other.

Remember: I don't do walkthroughs, so you won't find one here. Don't worry.
There are plenty out there, but if you simply give in and read one of those, you'll rob yourself of the delight of actually figuring out how to beat a challenge like this. Trust me: there's a lot to be said for failing, trying again, failing again, having a small breakthrough and going back to the drawing board before trying once more. Much learning can happen in those moments.

Every failed attempt, every error message, every "permission denied" - they're all teaching moments. They get under your skin, become part of your hacker DNA.

Still with me? Good, because here's the thing about CTFs that many miss: they're not about the flags. They're about the journey, the learning, the moments when you and your friends look at each other and go "ohhhhh, that's how it works!" or "how about this? Let's try it!"

Some say CTFs are unrealistic. Maybe. But you know what? So is practicing armbars on a compliant partner, yet BJJ works. And believe me, you'll learn when you win, but you'll do double plus better when you are faced with a new adversary or technique that rocks your world and turns it upside down. Losing is Fun.

Remember: every great hacker started somewhere. Probably failing at a challenge just like this one. The difference? They kept going.

So, yeah. We found the solution and had our minds blown at what we could do within such a restricted environment.
