All the world's a stage, and all the men and women merely players
Today, we'll be talking about File Headers, also known as Magic Numbers.
These are specific byte sequences at the beginning of a file that identify its type and format (e.g., PNG: 89 50 4E 47). They make programming easier because the file type can be recognized from the first few bytes, without having to parse the rest of the file looking for format-specific structures.
GZIP: 1F 8B 08
As you can see here, the GZIP signature is right at the beginning of the file. Throughout this blog post, I'll be using tools like xxd (which we've seen before) to actually check these headers.
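To make the idea concrete, here is a minimal Python sketch of a signature check. The table only holds the three signatures quoted in this post, and the names `SIGNATURES`, `identify` and `identify_file` are made up for illustration; real tools know far more formats and look beyond the first few bytes.

```python
# Signatures quoted in this post; a real tool would know many more.
SIGNATURES = {
    "PNG": bytes.fromhex("89504E47"),
    "GZIP": bytes.fromhex("1F8B08"),
    "JPEG": bytes.fromhex("FFD8FFE0"),
}


def identify(data: bytes) -> str:
    """Return the name of the first signature that prefixes data, else 'unknown'."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first few bytes of a file and identify them."""
    with open(path, "rb") as f:
        return identify(f.read(8))
```

Calling `identify_file` on a gzip archive should return "GZIP", which is the same conclusion you would reach by reading the first bytes in an xxd dump.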
In reverse engineering, file signatures let you identify files quickly, spot tampering, and choose the appropriate tools or parsers.
Remember that file signatures can be modified to disguise file types or to bypass detection. Malware often uses such tactics, obfuscating payloads to evade analysis.
It's a good idea to practice identifying these numbers. If, for some reason, an expected signature isn't detected, it might be a good idea to whip out a hex dump tool like xxd, or to use tools like file or binwalk to analyze the header. file and binwalk rely on databases of known file signatures to identify file types and structures quickly.
- The file command, in particular, relies on a magic database (commonly /usr/share/misc/magic), which contains predefined patterns for file headers.
- binwalk goes beyond headers to scan the entire binary for embedded file types or compressed data. It also uses signature databases but is more specialized for firmware analysis, detecting compressed archives, or images embedded in binaries.
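The difference between the two approaches can be sketched in a few lines of Python. This is not how binwalk is implemented; it is a naive illustration of scanning a whole blob instead of only its first bytes, and the names `SIGNATURES` and `scan` are hypothetical. Short signatures will produce false positives in real data, which is one reason real tools validate each hit further.

```python
# Signatures quoted in this post; purely illustrative.
SIGNATURES = {
    "PNG": bytes.fromhex("89504E47"),
    "GZIP": bytes.fromhex("1F8B08"),
    "JPEG": bytes.fromhex("FFD8FFE0"),
}


def scan(data: bytes) -> list:
    """Return sorted (offset, name) pairs for every signature found anywhere in data."""
    hits = []
    for name, magic in SIGNATURES.items():
        pos = data.find(magic)
        while pos != -1:
            hits.append((pos, name))
            # Continue searching just past the start of the previous hit.
            pos = data.find(magic, pos + 1)
    return sorted(hits)
```

A header-only check would report such a blob as unknown if it starts with padding, while the scan still reports the embedded signatures and their offsets.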
JPEG: FF D8 FF E0
Speaking of JPEG files, I found an interesting challenge in a CTF: I was presented with a data file that was hard to interpret. This was its header:
If we know nothing about headers, then this is meaningless.
- 02 -> 64-bit (0x01 for 32-bit)
- 01 -> Little-endian (0x02 for big-endian)
- 01 -> Current version
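The three fields above can be decoded mechanically. Here is a small Python sketch, assuming the three bytes arrive in the order listed (word size, byte order, version); the names `WORD_SIZE`, `BYTE_ORDER` and `describe_fields` are hypothetical, and the lookup tables contain only the values mentioned in the list.

```python
# Lookup tables built from the values listed above.
WORD_SIZE = {0x01: "32-bit", 0x02: "64-bit"}
BYTE_ORDER = {0x01: "little-endian", 0x02: "big-endian"}


def describe_fields(fields: bytes) -> tuple:
    """Decode three header bytes: word size, byte order, version (assumed order)."""
    if len(fields) < 3:
        raise ValueError("expected at least 3 bytes")
    word_size = WORD_SIZE.get(fields[0], "unknown word size")
    byte_order = BYTE_ORDER.get(fields[1], "unknown byte order")
    version = "current" if fields[2] == 0x01 else "unknown version"
    return (word_size, byte_order, version)
```

Passing `bytes([0x02, 0x01, 0x01])` corresponds to the values in the list: 64-bit, little-endian, current version.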
PS: do tarballs work as expected?