Decompressing ELF Directly to Memory on Linux/x86
Copyright (C) 2000-2006 John F. Reiser jreiser@BitWagon.com

References:
<elf.h>                            definitions for the ELF file format
/usr/src/linux/fs/binfmt_elf.c     what Linux execve() does with ELF
objdump --private-headers a.elf    dump the Elf32_Phdr
http://www.cygnus.com/pubs/gnupro/5_ut/b_Usingld/ldLinker_scripts.html
                                   how to construct unusual ELF using /bin/ld

There is exactly one immovable object: In all of the Linux kernel,
only the execve() system call sets the initial value of "the brk(0)",
the value that is manipulated by system call 45 (__NR_brk in
/usr/include/asm/unistd.h). For "direct to memory" decompression,
there will be no execve() except for the execve() of the decompressor
program itself. So, the decompressor program (which contains the
compressed version of the original executable) must have the same
brk() as the original executable. So, the second PT_LOAD
ELF "segment" of the compressed program is used only to set the brk(0).
See src/p_lx_elf.cpp, function PackLinuxI386elf::patchLoader().
All of the decompressor's code, and all of the compressed image
of the original executable, reside in the first PT_LOAD of the
decompressor program.

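A minimal sketch of why the second PT_LOAD matters, assuming a
simplified model of binfmt_elf.c in which the initial brk is the
page-aligned end of the highest PT_LOAD (address-space randomization
is ignored). The names initial_brk() and PAGE_SIZE_X86 are
hypothetical and do not come from the UPX sources.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_X86 0x1000u   /* assumption: 4 KiB pages on i386 */

/* Simplified model, not kernel code: the initial brk is taken to be
 * the page-aligned end of the highest PT_LOAD in the program headers. */
uint32_t initial_brk(const Elf32_Phdr *phdr, size_t n)
{
    uint32_t end = 0;
    assert(phdr != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (phdr[i].p_type != PT_LOAD)
            continue;                       /* only loadable segments count */
        uint32_t seg_end = phdr[i].p_vaddr + phdr[i].p_memsz;
        if (seg_end > end)
            end = seg_end;
    }
    /* round up to the next page boundary */
    return (end + PAGE_SIZE_X86 - 1) & ~(PAGE_SIZE_X86 - 1);
}
```

Under this model, a compressed program whose first PT_LOAD sits low in
memory still gets the original brk as long as its second PT_LOAD ends
where the original's highest PT_LOAD ended.
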
The decompressor program stub is just under 2K bytes when linked.
After linking, the decompressor code is converted to an initialized
array, and #included into the compilation of the compressor;
see src/stub/l_le_n2b.h. To make self-contained compressed
executables even smaller, the compressor also compresses all but the
startup and decompression subroutine of the decompressor itself,
saving a few hundred bytes. The startup code first decompresses the
rest of the decompressor, then jumps to it. A nonstandard linker
script src/stub/l_lx_elf86.lds places both the .text and .data
of the decompressor into the same PT_LOAD at 0x00401000. The
compressor includes the compressed bytes of the original executable
at the end of this first PT_LOAD.

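The layout described above can be sketched as follows. This is only an
illustration: stub_image[] is a hypothetical three-byte stand-in for
the generated array, and build_first_load() is an invented helper, not
a function from the compressor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the array generated from the linked stub;
 * the real bytes come from the build, not from this example. */
static const unsigned char stub_image[] = { 0x90, 0x90, 0xc3 };

/* Lay out the body of the first PT_LOAD: stub bytes first, then the
 * compressed payload.  Returns the number of bytes written, or 0 if
 * the output buffer of `cap` bytes is too small. */
size_t build_first_load(unsigned char *out, size_t cap,
                        const unsigned char *payload, size_t payload_len)
{
    size_t total = sizeof stub_image + payload_len;
    assert(out != NULL && (payload != NULL || payload_len == 0));
    if (total > cap)
        return 0;
    memcpy(out, stub_image, sizeof stub_image);
    if (payload_len > 0)
        memcpy(out + sizeof stub_image, payload, payload_len);
    return total;
}
```

Keeping the stub as a constant array means the compressor needs no
external stub file at run time; it simply copies the array in front of
the compressed data.
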
At runtime, the decompressed stub lives at 0x00400000. In order for the
decompressed stub to work properly at an address that is different
from its link-time address, the compiled code must contain no absolute
addresses. So, the data items in l_lx_elf.c must be only parameters
and automatic (on-stack) local variables; no global data, no static data,
and no string constants. Use "size l_le_n2b.o l_6e_n2b.o" to check
that both data and bss have length zero. Also, the '&' operator
may not be used to take the address of a function.

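One way to honor the "no string constants" rule is to build strings in
a caller-supplied (typically on-stack) buffer one character at a time.
The sketch below is illustrative only: make_self_exe_path() and the
chosen path are assumptions, and a compiler may still choose to emit
constant data at some optimization levels, so the "size" check remains
the final word.

```c
#include <assert.h>
#include <stddef.h>

/* Fill `buf` with "/proc/self/exe" one character at a time, so the
 * source contains no string literal that would need a data section.
 * Returns the string length (14), or 0 if `cap` is below 15 bytes. */
size_t make_self_exe_path(char *buf, size_t cap)
{
    size_t i = 0;
    assert(buf != NULL);
    if (cap < 15)
        return 0;
    buf[i++] = '/'; buf[i++] = 'p'; buf[i++] = 'r'; buf[i++] = 'o';
    buf[i++] = 'c'; buf[i++] = '/'; buf[i++] = 's'; buf[i++] = 'e';
    buf[i++] = 'l'; buf[i++] = 'f'; buf[i++] = '/'; buf[i++] = 'e';
    buf[i++] = 'x'; buf[i++] = 'e';
    buf[i] = '\0';
    return i;
}
```

The function takes only parameters and uses only automatic variables,
so it references no absolute address and runs unchanged wherever the
code is placed.
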
The address 0x00400000 was chosen to be out of the way of the usual
load address 0x08048000, and to minimize fragmentation in kernel
page tables; one page of page tables covers 4MB. The address
0x00401000 was chosen as 1 page up from a 64KB boundary, to
make the startup code and its constants smaller.

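The arithmetic behind these choices can be checked with a short
sketch, assuming classic two-level i386 paging (4 KiB pages, 1024
four-byte entries per page table, no PAE). The macro and function
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES   0x1000u                     /* 4 KiB page */
#define PTES_PER_PT  1024u                       /* 4-byte entries in one page */
#define PT_SPAN      (PAGE_BYTES * PTES_PER_PT)  /* bytes mapped by one page table */

static_assert(PT_SPAN == 0x400000u, "one page table maps 4 MiB");

/* Index of the page table (page-directory slot) that maps `addr`. */
uint32_t page_table_index(uint32_t addr)
{
    return addr / PT_SPAN;
}

/* Offset of `addr` above the nearest lower 64 KiB boundary. */
uint32_t offset_in_64k(uint32_t addr)
{
    return addr & 0xffffu;
}
```

With these definitions, 0x00400000 and 0x00401000 share page table
index 1, the usual load address 0x08048000 falls in index 32, and
0x00401000 lies exactly 0x1000 bytes (one page) above a 64 KiB boundary.
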
Decompression of the executable begins with the Elf32_Ehdr and
Elf32_Phdr; the Ehdr and Phdrs then control decompression of the
PT_LOAD segments.
Subroutine do_xmap() of src/stub/l_lx_elf.c performs the
"virtual execve()" using the compressed data as source, and stores
the decompressed bytes directly into the appropriate virtual addresses.

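As a rough illustration of the bookkeeping a "virtual execve()" needs
for each PT_LOAD, the sketch below computes the page-aligned region to
map and the protection bits to apply. It is not the code of do_xmap();
struct map_range, load_range(), load_prot() and PAGE_MASK_X86 are
hypothetical names, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <sys/mman.h>

#define PAGE_MASK_X86 0xfffu    /* assumption: 4 KiB pages */

/* Page-aligned region that must be mapped to hold one PT_LOAD. */
struct map_range { uint32_t addr; uint32_t len; };

struct map_range load_range(const Elf32_Phdr *ph)
{
    struct map_range r;
    /* bytes between the start of the page and p_vaddr */
    uint32_t frag = ph->p_vaddr & PAGE_MASK_X86;
    assert(ph->p_type == PT_LOAD);
    r.addr = ph->p_vaddr - frag;
    r.len  = (frag + ph->p_memsz + PAGE_MASK_X86) & ~PAGE_MASK_X86;
    return r;
}

/* Translate ELF segment flags (PF_*) into mmap/mprotect bits (PROT_*). */
int load_prot(const Elf32_Phdr *ph)
{
    int prot = PROT_NONE;
    if (ph->p_flags & PF_R) prot |= PROT_READ;
    if (ph->p_flags & PF_W) prot |= PROT_WRITE;
    if (ph->p_flags & PF_X) prot |= PROT_EXEC;
    return prot;
}
```

A loader built on these helpers would map each region writable,
decompress the segment's bytes to p_vaddr, and only then apply the
final protection from load_prot().
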
Before transferring control to the PT_INTERP "program interpreter",
minor tricks are required to set up the Elf32_auxv_t entries,
clear the free portion of the stack (to compensate for ld-linux.so.2
assuming that its automatic stack variables are initialized to zero),
and remove (all but 4 bytes of) the decompression program (and
compressed executable) from the address space.

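Setting up an Elf32_auxv_t entry amounts to walking the auxiliary
vector until AT_NULL and overwriting the value of the matching type.
The helper below is a minimal sketch under that assumption; auxv_set()
is a hypothetical name, and which entries need patching (for example
AT_PHDR or AT_ENTRY) is assumed here rather than taken from the stub
source.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Overwrite the value of the first auxv entry with the given type.
 * Returns 1 if an entry was patched, 0 if the type is absent.
 * The vector must be terminated by an AT_NULL entry. */
int auxv_set(Elf32_auxv_t *av, uint32_t type, uint32_t value)
{
    assert(av != NULL);
    for (; av->a_type != AT_NULL; av++) {
        if (av->a_type == type) {
            av->a_un.a_val = value;
            return 1;
        }
    }
    return 0;
}
```

Patching in place avoids growing the vector, which matters because the
auxv sits on the initial stack directly above the environment pointers.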