Executable file loading and process
Introduce the loading process of ELF files under Linux, and explore the essence of executable file loading
- What is the virtual address space of a process
- Why does a process have its own independent virtual address space
- Several loading methods
- Distribution of process virtual address space
Process virtual address space
The 32-bit hardware platform determines the address of the virtual address space from 0 to 2^32 -1, that is 0x00000000 ~ 0xFFFFFFFF
, the virtual space size of 4GB; while the 63-bit hardware platform has 64-bit addressing capabilities, and its virtual address space has reached 2^64 bytes, that is 0x0000000000000000 ~ 0xFFFFFFFFFFFFFFFF
, total 17179869184GB
.
From a program point of view, the space occupied by pointers in C language can be used to calculate the size of the virtual address space. Generally, the number of bits in the size of C language pointers is the same as the number of bits in virtual space, such as pointers under 32-bit platforms. It is 32 bits and 4 bytes.
The following uses 32 as the address space as the main address space, and 64-bit as the extension.
By default, the Linux system allocates the virtual address space of the process as follows:
The space used by the operating system is not allowed to be accessed by the process, and the process cannot fully use the remaining 3GB of virtual space, part of which is reserved for other purposes.
PAE
Intel Pentium Pro CPU under Linux began to use 36-bit physical address in 1995, that is, it can access 64GB of physical memory. At this time, the operating system can only have 4GB of virtual address space and cannot read all of the 64GB of physical memory. PAE is designed to solve this problem.
PAE (Physical Address Extension) is an address extension method. After Inter modified the page mapping method, the new mapping method can access more physical memory. The operating system provides a window mapping method to map additional memory into the address space. The application chooses application and mapping as needed. For example, the 256MB virtual address space of 0x10000000~0x20000000 in the application is used as the window. The program applies for multiple physical spaces of 256MB from the physical space higher than 4GB, numbered A, B, C, and then the window is set as needed Map to different physical space blocks, map 0x10000000~0x20000000 to A when A is used, and map to the past when B and C are used, and so on. Under Windows, this memory operation method is AWE (Address Windows Extensions) . UNIX systems such as Linux use the mmp() system call to achieve this .
Loading method
Overwrite loading
Virtual storage was used widely before it was invented, and it has been almost eliminated.
The overwrite loading method gives the programmer the task of tapping the potential of the memory. When writing the program, the programmer must manually divide the program into several blocks, and then write small auxiliary codes to manage when these modules reside in memory and when they are replace. This auxiliary code is called the Overlay Manager , as shown in the figure below
There is no call dependency between modules A and B, so the two modules share the memory area, when
The memory is overwritten when A is used, the memory is overwritten when B is used, and the coverage manager is used as resident memory.
The multi-module is as follows, the programmer needs to manually organize the modules into a tree structure according to the call dependency between them .
Therefore, the coverage manager needs to ensure some highlights.
-
In the tree structure, the call path from any module to the root of the tree (main) is called the call path. When the module is called, the module on the call path must be in memory. For example, when the C module is executing, both B and main need to be in the memory to ensure that E can correctly return to the module B and main after the execution is completed.
-
Prohibit calling across trees
Any module is not allowed to call across the tree structure, for example, A cannot call B, E, F. But in many cases, two modules depend on the same module. For example, module E and module C need another module G. The most convenient way is to incorporate module G into the main module, so that G is called between E and C. On the path.
Page mapping
Page mapping is part of the virtual storage mechanism, which was born with the invention of virtual storage.
Page mapping divides the data and instructions in the memory and all disks into several pages according to the unit of **page (Page)**, and all subsequent loading and operation units are pages.
Assuming that all instructions and data of the program are 32KB in total, the program is divided into 8 pages and numbered P0~P7. But 16KB memory cannot load 32KB programs, and the loading process will be carried out according to the principle of dynamic loading at this time. If the program execution entry is at P0, the load manager finds that the P0 of the program is not in the memory, and allocates the memory F0 to P0, and purchases the memory of P0 into F0. Use P5 after running, then load P4 into F1, and so on, as shown below:
If the program continues to run and needs to access P4, the load manager must choose to abandon one of the 4 memory pages currently in use to load P4. There are many abandonment algorithms, such as:
- FIFO first in first out, then give up F0, P4 is loaded into F0
- LUR is the least used, then give up F2, P4 into F2
And other algorithms. The so-called load manager here is the modern operating system, and to be precise, the movie is the storage manager of the operating system.
From the operating system perspective, the executable file is loaded and executed in the process
Process establishment
The key feature of the process is that it has an independent virtual address space . When a program is executed, three things usually need to be done at the beginning:
-
Create an independent virtual address space
That is, to create the corresponding data structure required by the mapping function, and under the Linux of i386, creating a virtual address space is actually just assigning a page directory, and there is even no need to set the mapping relationship. That is, the mapping relationship between virtual space and physical memory is completed.
-
Read the executable file header and establish the mapping relationship between the virtual space and the executable file
To complete the mapping relationship between the virtual space and the executable file, this step is the most important step in the entire loading process, which is "loading" in the traditional sense.
As shown in the figure, consider the simplest example, the virtual address is shown in the figure, the file size is 0x000e1, and the alignment is 0x1000. Since the size of the .text segment is less than 0x1000, it needs to be aligned.
This mapping relationship is a data structure stored inside the operating system . Linux calls a segment in the process virtual space a virtual memory area (VMA.Virtual Memory Area) . Windows is called a virtual section (Virtual Section) . In the above example, a VMA of the .text segment will be set in the corresponding data structure of the process. Its address in the virtual space is
0x08048000~0x08049000
the .text offset of 0 in the corresponding ELF file, and the attribute is read-only. -
Set the instruction register of the CPU to the entry address of the executable file and start the operation
Page fault
After the above steps are executed, the mapping relationship between the executable file and the virtual memory of the process is only established through the header information of the executable file, and the instructions and data of the executable file are not loaded into the memory.
Assuming that in the above example, the program entry address is 0x08048000
, which happens to be the actual address of the .text section. When the CPU finds an empty page when it intends to execute, it will consider this to be a page fault (Page Fault) . The CPU transfers control to the operating system. The operating system will query the data structure mentioned above, find the VMA where the empty page is located, calculate the offset of the corresponding page in the executable file, and then allocate a physical page in the physical memory. Establish a mapping relationship between the virtual page in the process and the allocated physical page, and then return the control to the process. The process restarts from the position of the page error just now. As shown in the figure below, it is an executable file. The relationship between physical memory:
Process virtual memory space distribution
ELF file link view and execution view
In the actual scene, the number of ELF file segments is relatively large, but due to the need to perform page alignment and other operations, if the page is allocated in one segment, it will inevitably cause greater waste.
But from the perspective of the operating system to load executable files (the operating system does not need to know which section name is, what is the role, etc.), it is found that it does not care about the actual content of the executable file, but only cares about the problems related to loading , The most important issue is the authority of the segment.
The permission combination of the segment is basically the following three
- Readable and executable segment represented by code segment
- Readable and writable segment represented by data segment and BSS segment
- The permission represented by the read-only data segment is a read-only segment
Therefore, we find a scheme that divides the types of permissions and merges the segments with the same permissions for mapping . The merged data is called Segment . If the .text
segment and the .init
segment are combined as a Segment , then it can be loaded when loading. Think of them as a whole to load, so that you can significantly reduce the problem of internal fragmentation of the page, thereby saving space.
Compare with the following figure, the left side is loading by segment, and the side is loading by segment after merging.
Write an example program below:
SectionMapping.c
#include <stdlib.h>
int main(){
while(1){
sleep(1000);
}
return 0;
}
gcc -static SectionMapping.c -o SectionMapping.elf
Then usereadelf
readelf -S SectionMapping.elf
Get the following information
There are 33 section headers, starting at offset 0xc4d50:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .note.ABI-tag NOTE 0000000000400190 00000190
0000000000000020 0000000000000000 A 0 0 4
[ 2] .note.gnu.build-i NOTE 00000000004001b0 000001b0
0000000000000024 0000000000000000 A 0 0 4
readelf: Warning: [ 3]: Link field (0) should index a symtab section.
[ 3] .rela.plt RELA 00000000004001d8 000001d8
00000000000001f8 0000000000000018 AI 0 24 8
[ 4] .init PROGBITS 00000000004003d0 000003d0
0000000000000017 0000000000000000 AX 0 0 4
[ 5] .plt PROGBITS 00000000004003e8 000003e8
00000000000000a8 0000000000000000 AX 0 0 8
[ 6] .text PROGBITS 0000000000400490 00000490
00000000000874fc 0000000000000000 AX 0 0 16
[ 7] __libc_freeres_fn PROGBITS 0000000000487990 00087990
00000000000024e9 0000000000000000 AX 0 0 16
[ 8] __libc_thread_fre PROGBITS 0000000000489e80 00089e80
00000000000003d7 0000000000000000 AX 0 0 16
[ 9] .fini PROGBITS 000000000048a258 0008a258
0000000000000009 0000000000000000 AX 0 0 4
[10] .rodata PROGBITS 000000000048a280 0008a280
000000000001bd7c 0000000000000000 A 0 0 32
[11] __libc_subfreeres PROGBITS 00000000004a6000 000a6000
0000000000000048 0000000000000000 A 0 0 8
[12] __libc_IO_vtables PROGBITS 00000000004a6060 000a6060
00000000000006a8 0000000000000000 A 0 0 32
[13] __libc_atexit PROGBITS 00000000004a6708 000a6708
0000000000000008 0000000000000000 A 0 0 8
[14] .stapsdt.base PROGBITS 00000000004a6710 000a6710
0000000000000001 0000000000000000 A 0 0 1
[15] __libc_thread_sub PROGBITS 00000000004a6718 000a6718
0000000000000010 0000000000000000 A 0 0 8
[16] .eh_frame PROGBITS 00000000004a6728 000a6728
0000000000009c38 0000000000000000 A 0 0 8
[17] .gcc_except_table PROGBITS 00000000004b0360 000b0360
0000000000000085 0000000000000000 A 0 0 1
[18] .tdata PROGBITS 00000000006b0b40 000b0b40
0000000000000020 0000000000000000 WAT 0 0 8
[19] .tbss NOBITS 00000000006b0b60 000b0b60
0000000000000040 0000000000000000 WAT 0 0 8
[20] .init_array INIT_ARRAY 00000000006b0b60 000b0b60
0000000000000010 0000000000000008 WA 0 0 8
[21] .fini_array FINI_ARRAY 00000000006b0b70 000b0b70
0000000000000010 0000000000000008 WA 0 0 8
[22] .data.rel.ro PROGBITS 00000000006b0b80 000b0b80
0000000000000464 0000000000000000 WA 0 0 32
[23] .got PROGBITS 00000000006b0fe8 000b0fe8
0000000000000008 0000000000000008 WA 0 0 8
[24] .got.plt PROGBITS 00000000006b1000 000b1000
00000000000000c0 0000000000000008 WA 0 0 8
[25] .data PROGBITS 00000000006b10c0 000b10c0
0000000000001af0 0000000000000000 WA 0 0 32
[26] .bss NOBITS 00000000006b2bc0 000b2bb0
0000000000001718 0000000000000000 WA 0 0 32
[27] __libc_freeres_pt NOBITS 00000000006b42d8 000b2bb0
0000000000000028 0000000000000000 WA 0 0 8
[28] .comment PROGBITS 0000000000000000 000b2bb0
0000000000000025 0000000000000001 MS 0 0 1
[29] .note.stapsdt NOTE 0000000000000000 000b2bd8
0000000000001408 0000000000000000 0 0 4
[30] .symtab SYMTAB 0000000000000000 000b3fe0
000000000000a6c8 0000000000000018 31 693 8
[31] .strtab STRTAB 0000000000000000 000be6a8
0000000000006532 0000000000000000 0 0 1
[32] .shstrtab STRTAB 0000000000000000 000c4bda
0000000000000176 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
View the ELF Segment
information, just as the structure of the Section attribute is called the segment table, the structure of the segment is described as the program header (Program Header) , which describes how the ELF file should be mapped into the virtual space of the process by the operating system:
readelf -l SectionMapping.elf
The results are as follows:
Elf file type is EXEC (Executable file)
Entry point 0x400a00
There are 6 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000b03e5 0x00000000000b03e5 R E 0x200000
LOAD 0x00000000000b0b40 0x00000000006b0b40 0x00000000006b0b40
0x0000000000002070 0x00000000000037c0 RW 0x200000
NOTE 0x0000000000000190 0x0000000000400190 0x0000000000400190
0x0000000000000044 0x0000000000000044 R 0x4
TLS 0x00000000000b0b40 0x00000000006b0b40 0x00000000006b0b40
0x0000000000000020 0x0000000000000060 R 0x8
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
GNU_RELRO 0x00000000000b0b40 0x00000000006b0b40 0x00000000006b0b40
0x00000000000004c0 0x00000000000004c0 R 0x1
Section to Segment mapping:
Segment Sections...
00 .note.ABI-tag .note.gnu.build-id .rela.plt .init .plt .text __libc_freeres_fn __libc_thread_freeres_fn .fini .rodata __libc_subfreeres __libc_IO_vtables __libc_atexit .stapsdt.base __libc_thread_subfreeres .eh_frame .gcc_except_table
01 .tdata .init_array .fini_array .data.rel.ro .got .got.plt .data .bss __libc_freeres_ptrs
02 .note.ABI-tag .note.gnu.build-id
03 .tdata .tbss
04
05 .tdata .init_array .fini_array .data.rel.ro .got
From the perspective of loading, we only need to care about two "LOAD" type segments, and the others only play a supporting role in loading. As you can see here, the file is re-divided into three parts:
- Readable and executable LOAD segment
- Read and write LOAD segment
- Unmapped segment
Therefore, all Sections with the same attributes are classified into the same Segment and mapped to the same VMA. Therefore, Segment and Section divide the ELF file from different angles.
- Link view: Divide from the perspective of Section
- Execution view: Divided from the perspective of Segment
The following figure shows the mapping relationship between the executable file segment and the process virtual space:
ELF executable files and shared library files have a special data structure called Program Header Table (Program Header Table) to save segment information, because ELF files do not need to be loaded, so there is no program header table. The specific structure is as follows: And correspond to the readelf -l
read data one-to-one
typedef struct{
Elf32_Word p_type;
Elf32_Off p_offset;
Elf32_Addr p_vaddr;
Elf32_Addr p_paddr;
Elf32_Word p_filesz;
Elf32_Word p_memsz;
Elf32_Word p_flags;
Elf32_Word p_align;
}Elf32_Phdr;
The basic meaning is as follows
Among them, p_memsz >= p_filesz
if p_memsz <= p_filesz
it means that the size of the space allocated by the segment segment in the memory exceeds the actual size of the file, this part of the extra space is all filled with "0". In this way, when we construct the ELF executable file, we don't need to set up an additional BSS segment, and can expand the p_memsz of the data segment, and those extra parts are the BSS. The difference between the data segment and the BSS is that the data segment initializes the content from the file, while the BSS is all initialized to 0. Therefore, it can be seen that the previous BSS has actually been incorporated into the data type segment, but has not been displayed.
Heap and stack
View the virtual space distribution of the process as shown in the figure:
The meaning can be seen in my other article, "Analysis of the File Structure of/proc/{pid}/maps"
Among them, the major device number, the minor device number, and the file node number are all 0, indicating that it is not mapped to a file. This kind of VMA is called an anonymous virtual memory area (Anonymous Virtual Memory Area) . We are currently concerned that the two VMAs Heap and Stack exist in almost all processes. The malloc() function memory allocation is allocated from the heap. The heap is managed by the system library and vsyscall
is located in the kernel space. The specific role is to be studied in detail. Judging from the name, it is guessed that it should be VMA related to kernel communication.
A process can be divided into the following areas:
- Code VMA, permission is read-only, executable; there are image files
- Data VMA, permissions can be read and written; there are image files
- Heap VMA, readable, writable and non-executable; anonymous, scalable
- Stack VMA, readable, writable and non-executable; anonymous, expandable downward
The details are shown in the figure:
Linux kernel loading ELF process
-
The bash process calls the
fork()
system call to create a new process -
Call the
execve()
system call to execute the specified ELF file, the original bash process continues to return to wait for the end of the new process just started, and then wait for the user to enter a command.execve()
Is defined in theunsitd.h
file, the prototype is as followsint execve(const char * filename,char * const argv[ ],char * const envp[ ]);
The three parameters are the file name of the executed program, the execution parameters, and the environment variables
Ah~~~ The steps here are quite many and complicated, and I don t understand it without actual experiments, so let s just paste the image and watch it;
Then the final result is that the return address is changed to the entry address of the loaded ELF program:
- Static link: the address pointed to by e_entry in the header file of the ELF file
- Dynamic link: dynamic linker address
So far, the loading part of the executable file has been introduced