Debugging PowerPC ELF Binaries
From Devpit
Authors
This page was originally created by Ryan S. Arnold aka RandomTask.
ELF Sections
The .text
The .text section on powerpc32
The .glink section on powerpc32
The .glink is an implementation detail of the secure-PLT (Procedure Linkage Table) ABI.
- The .glink section is the executable part of the PLT and must reside in the code segment (LOAD ... r-x).
- The .plt section is the companion non-executable part of the PLT and must reside in the data segment (LOAD ... rw-).
The .glink section has two purposes:
- It serves as an executable trampoline for branching to a symbol whose absolute address is dynamically resolved and stored into a non-executable .plt entry.
- It detects whether a dynamically resolved symbol has yet to be resolved and, if so, invokes _dl_runtime_resolve to resolve it.
Therefore the .glink holds two kinds of code:
- PLT call code stubs: dynamically resolved symbol requests for functions in shared objects will branch to these call stubs. These stubs are generated by the linker, which knows the offset (from the .got) to the .plt entry that will eventually hold the absolute address of the dynamically resolved symbol. The call code stubs will branch to the address held in the .plt entry for the associated function.
- The PLT symbol resolver stub: .plt entries for functions which have not been dynamically resolved by the loader are set up by default to fall into this stub (via a series of nop instructions). This stub then calls the loader's _dl_runtime_resolve function, which will populate the unresolved .plt entry with the absolute address of the dynamically resolved symbol. Future calls to the PLT call code stubs will now branch to the resolved absolute symbol address held by the associated .plt entry for the function.
Description of process given an application that references a dynamically resolved external function, namely function2:
- Prior to executable generation via linking, function branches in a .S (GNU Assembler) file will look like the following:
bl function2@plt
- The linker knows when it generates the executable that function2 is undefined and that this should actually be a branch to the companion .glink stub for function2, not a direct branch to the .plt entry for that function.
- The linker generates the PLT call code stub and places it on the tail end of the .text section. Since the .glink section is unmarked, references to function2 will appear in objdump output as branches to a .glink stub labeled call___do_global_ctors_aux+offset. The linker has picked the closest previous symbol as a label.
10001960 <call___do_global_ctors_aux+0x30>
- The PLT call code stub for function2 at address 0x10001960 looks like the following:
10001960: 3d 60 10 01 lis r11,4097
10001964: 81 6b 1b 78 lwz r11,7032(r11)
10001968: 7d 69 03 a6 mtctr r11
1000196c: 4e 80 04 20 bctr
- The associated .plt entry at 0x10011b78 has been initialized by the linker as the following:
10011b78 <function2@plt>:
10011b78: 10 00 19 84
The PLT call code stub for function2 loads the contents of memory pointed to by 0x10011b78 (the .plt entry for function2) into gpr11, effectively 0x10001984. The referenced .plt entry for function2 will ultimately hold the absolute address of function2 after _dl_runtime_resolve has resolved the symbol.
By default, the linker sets the address in the .plt entry for function2 as the prologue address for the PLT runtime resolver. This address 0x10001984 is where the PLT call code stub for function2 branches to, e.g.
10001980: 60 00 00 00 nop
10001984: 60 00 00 00 nop
10001988: 60 00 00 00 nop
1000198c: 60 00 00 00 nop
10001990: 3d 80 10 01 lis r12,4097
10001994: 3d 6b f0 00 addis r11,r11,-4096
10001998: 80 0c 1b 6c lwz r0,7020(r12)
1000199c: 39 6b e6 80 addi r11,r11,-6528
100019a0: 7c 09 03 a6 mtctr r0
100019a4: 7c 0b 5a 14 add r0,r11,r11
100019a8: 81 8c 1b 70 lwz r12,7024(r12)
100019ac: 7d 60 5a 14 add r11,r0,r11
100019b0: 4e 80 04 20 bctr
Here's an example of an entire .glink section made up of PLT call code stubs and the PLT symbol resolver.
- The .glink section starts at 0x10001950, right after the .long 0x0 at 0x1000194c.
- The PLT call code stubs occupy 0x10001950 through 0x1000197c.
- The PLT symbol resolver stub, including the preceding nop fall-through code, occupies 0x10001980 through 0x100019b0.
10001930 <call___do_global_ctors_aux>:
10001930: 94 21 ff f0 stwu r1,-16(r1)
10001934: 7c 08 02 a6 mflr r0
10001938: 90 01 00 14 stw r0,20(r1)
1000193c: 80 01 00 14 lwz r0,20(r1)
10001940: 38 21 00 10 addi r1,r1,16
10001944: 7c 08 03 a6 mtlr r0
10001948: 4e 80 00 20 blr
1000194c: 00 00 00 00 .long 0x0
10001950: 3d 60 10 01 lis r11,4097
10001954: 81 6b 1b 74 lwz r11,7028(r11)
10001958: 7d 69 03 a6 mtctr r11
1000195c: 4e 80 04 20 bctr
10001960: 3d 60 10 01 lis r11,4097
10001964: 81 6b 1b 78 lwz r11,7032(r11)
10001968: 7d 69 03 a6 mtctr r11
1000196c: 4e 80 04 20 bctr
10001970: 3d 60 10 01 lis r11,4097
10001974: 81 6b 1b 7c lwz r11,7036(r11)
10001978: 7d 69 03 a6 mtctr r11
1000197c: 4e 80 04 20 bctr
10001980: 60 00 00 00 nop
10001984: 60 00 00 00 nop
10001988: 60 00 00 00 nop
1000198c: 60 00 00 00 nop
10001990: 3d 80 10 01 lis r12,4097
10001994: 3d 6b f0 00 addis r11,r11,-4096
10001998: 80 0c 1b 6c lwz r0,7020(r12)
1000199c: 39 6b e6 80 addi r11,r11,-6528
100019a0: 7c 09 03 a6 mtctr r0
100019a4: 7c 0b 5a 14 add r0,r11,r11
100019a8: 81 8c 1b 70 lwz r12,7024(r12)
100019ac: 7d 60 5a 14 add r11,r0,r11
100019b0: 4e 80 04 20 bctr
100019b4: 60 00 00 00 nop
100019b8: 60 00 00 00 nop
100019bc: 60 00 00 00 nop
100019c0: 60 00 00 00 nop
100019c4: 60 00 00 00 nop
100019c8: 60 00 00 00 nop
100019cc: 60 00 00 00 nop
The .rodata
The .rodata section on powerpc32
.section .rodata.cst16,"aM",@progbits,32
.LC1: /* 9223372036854775808.0DL */
.long 0x2207c000
.long 0x00000003
.long 0xa4cfa07a
.long 0x2c7f600a
.LC2: /* 18446744073709551616.0DL */
.long 0x2207c000
.long 0x0000000c
.long 0xa99e40ed
.long 0xc5ba58e0
- The section flags and entsize ("aM" and @progbits,32 respectively) identify this section as allocatable (a), read-only (no w), non-string (no S), and mergeable (M). The element size is 32 (because there are two 16 byte constants). The section name (the suffix after .rodata.) can be anything; it is used by the linker (ld -r) to merge like-named sections. Sections with different flags or entsize should have different section names.
The .rodata section on powerpc64
- The .rodata section exists on powerpc64 but a performance boost can be attained by storing static constant data in the .toc section instead.
- Space in the .toc section is limited so be discriminatory about the data being placed there. Something like an array of constant data should be held in the .rodata section instead.
.section ".toc","aw"
.LC1: /* 9223372036854775808.0DD */
.tc FT_2207c000_3_a4cfa07a_2c7f600a[TC],0x2207c00000000003,0xa4cfa07a2c7f600a
.LC2: /* 18446744073709551616.0DD */
.tc FT_2207c000_c_a99e40ed_c5ba58e0[TC],0x2207c0000000000c,0xa99e40edc5ba58e0
The .got, the .toc, and the .plt
Disclaimer: References to the .got are powerpc32 centric. On powerpc64 the symbol would be referenced directly from the .toc. Position Independent Code (PIC) and secure-plt usage are assumed.
The .got section on powerpc32
The powerpc32 .got is a Global Offset Table of absolute addresses to symbols. This offset table is required because the PIC (position independent code) standard says that code cannot contain absolute addresses. The .got is used for two things: it holds the absolute addresses of static or global variables, and it holds the absolute addresses of functions accessed via function pointers, as in the following example:
int (*ptrfunc)(int) = 0;
ptrfunc = &function1;
ret = ptrfunc(10);
This means that the absolute address of function1 is resolved by the dynamic linker-loader ld.so at program load-time. You pay the up-front cost of the dynamic symbol resolution at program load-time.
Getting the address of the .got on powerpc32
Let's examine the GNU Assembler sequence that you'll see in every 32-bit PowerPC function that branches to a dynamically resolved symbol. A dynamically resolved symbol's address is accessed via an offset into the .plt from the .got address. The address of the .got needs to be computed at least once for each routine that needs to branch to a dynamically resolved symbol since (per the ABI) it is not stored in a gpr across function calls.
Note:
- The program code is the .text section which is in the read-execute code segment.
- The .got is in the read-write data segment.
- The read-write data segment is at a higher address than the read-execute code segment. Therefore the offset computation subtracts the .text symbol address from the .got symbol address.
Example:
bcl 20,31,.LCF1
.LCF1:
mflr 30
addis 30,30,_GLOBAL_OFFSET_TABLE_-.LCF1@ha
addi 30,30,_GLOBAL_OFFSET_TABLE_-.LCF1@l
- At link time the linker will compute the fixed offset between the .LCF1 symbol and the _GLOBAL_OFFSET_TABLE_ symbol by subtracting the symbol address of .LCF1 from the _GLOBAL_OFFSET_TABLE_ address, i.e. _GLOBAL_OFFSET_TABLE_-.LCF1.
Here's the generated object-code after the linker has run:
1000159c: 42 9f 00 05 bcl- 20,4*cr7+so,100015a0 <main+0x1c>
100015a0: 7f c8 02 a6 mflr r30
100015a4: 3f de 00 01 addis r30,r30,1
100015a8: 3b de 05 c8 addi r30,r30,1480
- bcl is used to obtain the address of the next instruction. This is stored into the link register per the 'l' on the 'bc' mnemonic.
- mflr is used to load the value in the link register (The address 0x100015a0) into gpr30.
- The computed offset between .LCF1 and the _GLOBAL_OFFSET_TABLE_ was calculated by the linker to be 0x000105c8.
- The addis and addi combo will effectively add the computed offset 0x000105c8 to the .LCF1 symbol address 0x100015a0.
- addis is used to add the high half-word of the computed offset to the value in gpr30.
- addi is used to add the low half-word of the computed offset to the address in gpr30.
- The address of the .got in gpr30 is 0x10011B68.
The .toc section on powerpc64
- The .toc section only exists on powerpc64. It stands for Table of Contents. It holds both data and addresses. The powerpc64 ELF ABI defines general purpose register 2 to always hold a pointer to the .toc section.
The .plt section
- On both powerpc32 and powerpc64 the .plt section of an executable is the Procedure Linkage Table. It is used to store the absolute address of late-bound functions invoked by symbol name. For instance the following invocation of function1() would require late binding and would be invoked through a symbol offset in the .plt section:
ret = function1(10);
Addressability
Addressability on PowerPC32
The PowerPC 32-bit architecture cannot load an entire 32 bit address in one instruction. As a result you'll generally see two methods for getting an address into a register, lis/lwz or addis/addi. Both of these methods will load the high 16 bits of the address first and then the low 16 bits.
The lis/lwz instruction pair loads the contents pointed to by the address 0x10011b44 into gpr11.
10001920: 3d 60 10 01 lis r11,4097
10001924: 81 6b 1b 44 lwz r11,6980(r11)
- The lis instruction stands for load immediate shifted and it stores 0x10010000 in gpr11, i.e. 0x1001 shifted to the high bits of gpr11.
- The lwz instruction stands for load word and zero and it says take the address in gpr11 (0x10010000) and add to it the offset 0x1b44. Then load the contents at the resultant address (0x10011b44) into gpr11.
The addis/addi instruction pair adds 0x000105c8 to the address already in gpr30 using the following method:
100015a4: 3f de 00 01 addis r30,r30,1
100015a8: 3b de 05 c8 addi r30,r30,1480
- The addis instruction stands for add immediate shifted and it adds 0x0001 to the high-order 16 bits of the address already held in gpr30.
- The addi instruction stands for add immediate and it adds the sign-extended 16-bit value 0x05c8 to the address already held in gpr30.
Branching
- On PowerPC, unconditional branches are done in one of the following four ways:
- A direct branch to an address, e.g. b 10011b44 <symbol> (used for gotos)
- A branch to an address, setting up the link register, e.g. bl 10011b44 <symbol>.
- A branch to the address in the link register, e.g. blr.
- A branch to an address held in the count register, e.g. bctr (used for indirection or loops)
- You will see the compiler make use of all of the enumerated unconditional branches for its own internal use. Additionally each of these branches can be generated as the result of a particular symbol invocation by a user level program:
- A direct branch results from a simple goto, e.g. goto mylabel;.
- A branch to an address is a result of invoking a statically or dynamically resolved function, e.g. function2();.
- A branch to the address in the link register is the result of a function return, e.g. return somevariable;.
- A branch to an address in the count register is generally the result of invoking a function via a function pointer. The loader resolves these symbols at load time so you pay the resolution price up-front (i.e. _dl_runtime_resolve() is not invoked).
Branching on PowerPC32
On PowerPC32 the effect of calling a function via a function pointer is that the symbol address is resolved and loaded into the .got at application load time. It does not have a .plt reference unless it is invoked directly e.g. function2() and dynamically resolved.
User Code Branching Example
Given the following example files we'll demonstrate three different function invocation methods and the resultant bindings:
- Dynamically resolved, load-time bound shared-object function pointer invocation.
- Dynamically resolved, late bound (_dl_runtime_resolve) shared-object function invocation.
- Statically resolved, link-time bound local function invocation.
- func.h:
extern int function1(int);
extern int function2(int);
- func.c:
#include "func.h"
int function1(int val)
{
return ++val;
}
int function2(int val)
{
return --val;
}
- test.c
#include "func.h"
int function3(int val)
{
return ++val;
}
int main()
{
int (*ptrfunc)(int) = 0;
int ptrret;
int ret;
ptrfunc = &function1;
ptrret = ptrfunc(10); /* Function pointer invocation of function1(). */
ret = function2(ptrret); /* Dynamic symbol resolution. */
return function3(ret); /* Static symbol resolution local to test.o. */
}
Powerpc32 example of functions invoked through the .got and .plt sections
Create the .o file which holds function1().
/opt/biarch/20060123/bin/gcc -g -m32 -msecure-plt -fpic -c func.c -o func.o
Create the shared object and symlinks.
/opt/biarch/20060123/bin/gcc -shared -Wl,-export-dynamic,-soname,libfunc.so.1 -o libfunc.so.1.0.1 func.o
ln -s libfunc.so.1.0.1 libfunc.so.1
ln -s libfunc.so.1.0.1 libfunc.so
Intermediary powerpc32 assembler code
We can ask GCC to create an intermediary assembler file for investigation which will reveal the pre-linkage assembler for our test application.
/opt/biarch/20060123/bin/gcc -g -m32 -msecure-plt -fpic -L. -lfunc test.c -S
Investigation of the '.s' file at this stage will reveal the three function call methods highlighted earlier as well as some auxiliary information:
- The address of the .got is loaded into gpr30 in the necessary round-about manner (the bcl, mflr, addis, addi sequence around the .LCF1 label).
- The blr at the end of the main routine returns control to the calling function at the address held in the link register.
- The contents of the Global Offset Table entry for function1@got is the absolute address of the symbol function1. This absolute address was resolved by the dynamic link/loader (ld.so) at program load-time. This address is loaded into gpr 0 and eventually moved into register ctr. The function is finally branched to with the bctrl call (the sequence from lwz 0,function1@got(30) through bctrl). This is the method used to invoke a function via a function pointer.
- A bl to function2@plt is requested. Since the absolute address is not determined until run time, this .s file simply uses a symbol reference to the .plt section to indicate the late binding. Later investigation of the disassembled executable will reveal that since function2() exists in a shared-object file the bl to function2@plt is bound to an executable stub which loads the absolute function2 address from the .plt entry that was populated by the dynamic loader via the _dl_runtime_resolve() function.
- A bl to function3@plt is requested. Even though the .s file contains function3@plt, at link time the linker notices that function3 exists in the same C file as main and turns the reference into a bl directly to the function3 address in the executable.
main:
.LFB3:
.loc 1 8 0
stwu 1,-32(1)
.LCFI3:
mflr 0
.LCFI4:
stw 30,24(1)
.LCFI5:
stw 31,28(1)
.LCFI6:
stw 0,36(1)
.LCFI7:
mr 31,1
.LCFI8:
bcl 20,31,.LCF1
.LCF1:
mflr 30
addis 30,30,_GLOBAL_OFFSET_TABLE_-.LCF1@ha
addi 30,30,_GLOBAL_OFFSET_TABLE_-.LCF1@l
.loc 1 9 0
li 0,0
stw 0,16(31)
.loc 1 12 0
lwz 0,function1@got(30)
stw 0,16(31)
.loc 1 13 0
lwz 0,16(31)
mtctr 0
li 3,10
bctrl
mr 0,3
stw 0,12(31)
.loc 1 14 0
lwz 3,12(31)
bl function2@plt
mr 0,3
stw 0,8(31)
.loc 1 15 0
lwz 3,8(31)
bl function3@plt
mr 0,3
.loc 1 16 0
mr 3,0
lwz 11,0(1)
lwz 0,4(11)
mtlr 0
lwz 30,-8(11)
lwz 31,-4(11)
mr 1,11
blr
Build and link the executable
Build and link the executable to the shared object file.
/opt/biarch/20060123/bin/gcc -g -m32 -msecure-plt -fpic -L. -lfunc test.c -o test
Objdump the full ELF information into a disassembly file to examine the .plt, .got, and plt stubs.
/opt/biarch/20060123/bin/objdump -stDx test > test.dis
Specify the LD_LIBRARY_PATH environment variable so that the dynamic linker/loader (ld.so) can find libfunc.so.1 when you execute the application:
export LD_LIBRARY_PATH=$PWD
Examine the disassembled binary
We can determine what the linker has done during early binding by examining the disassembled binary.
If GCC can see the code for a function it will generally include it in the executable. If you were to directly #include "func.c" rather than "func.h" which contains the extern function prototype GCC would simply insert the function code into the executable. During the link stage the linker would determine that it could resolve this function1@plt reference directly to an absolute address, meaning it will not be loaded from a shared library and it would result in a bl directly to the function address.
10001580 <main>:
10001580: 94 21 ff e0 stwu r1,-32(r1)
10001584: 7c 08 02 a6 mflr r0
10001588: 93 c1 00 18 stw r30,24(r1)
1000158c: 93 e1 00 1c stw r31,28(r1)
10001590: 90 01 00 24 stw r0,36(r1)
10001594: 7c 3f 0b 78 mr r31,r1
10001598: 42 9f 00 05 bcl- 20,4*cr7+so,1000159c <main+0x1c>
1000159c: 7f c8 02 a6 mflr r30
100015a0: 3f de 00 01 addis r30,r30,1
100015a4: 3b de 05 cc addi r30,r30,1484
100015a8: 38 00 00 00 li r0,0
100015ac: 90 1f 00 10 stw r0,16(r31)
100015b0: 80 1e ff fc lwz r0,-4(r30)
100015b4: 90 1f 00 10 stw r0,16(r31)
100015b8: 80 1f 00 10 lwz r0,16(r31)
100015bc: 7c 09 03 a6 mtctr r0
100015c0: 38 60 00 0a li r3,10
100015c4: 4e 80 04 21 bctrl
100015c8: 7c 60 1b 78 mr r0,r3
100015cc: 90 1f 00 0c stw r0,12(r31)
100015d0: 80 7f 00 0c lwz r3,12(r31)
100015d4: 48 00 03 8d bl 10001960 <call___do_global_ctors_aux+0x30>
100015d8: 7c 60 1b 78 mr r0,r3
100015dc: 90 1f 00 08 stw r0,8(r31)
100015e0: 80 7f 00 08 lwz r3,8(r31)
100015e4: 4b ff ff 69 bl 1000154c <function3>
100015e8: 7c 60 1b 78 mr r0,r3
100015ec: 7c 03 03 78 mr r3,r0
100015f0: 81 61 00 00 lwz r11,0(r1)
100015f4: 80 0b 00 04 lwz r0,4(r11)
100015f8: 7c 08 03 a6 mtlr r0
100015fc: 83 cb ff f8 lwz r30,-8(r11)
10001600: 83 eb ff fc lwz r31,-4(r11)
10001604: 7d 61 5b 78 mr r1,r11
10001608: 4e 80 00 20 blr
- The bcl, mflr, addis, addi sequence at 0x10001598 through 0x100015a4 is the post-linkage assembly code used to fetch the address of the .got into gpr30 (as discussed above).
- The sequence from the lwz at 0x100015b0 through the bctrl at 0x100015c4 is the post-linkage assembly code that invokes function1 through the function pointer. The linker simply computed the offset of the function1 entry in the .got, and the code loads that entry into gpr0. The .got entry for function1 was populated at load-time with the absolute address of function1.
- The bl at 0x100015d4 shows how the linker used late binding to bl to a code segment representing function2@plt. The symbol label <call___do_global_ctors_aux+0x30> in the above asm is misleading; it is due to objdump assigning the nearest preceding symbol as a label for the address. This address is actually the .glink PLT call code stub for function2:
10001960: 3d 60 10 01 lis r11,4097
10001964: 81 6b 1b 78 lwz r11,7032(r11)
10001968: 7d 69 03 a6 mtctr r11
1000196c: 4e 80 04 20 bctr
This .glink PLT call code stub for function2 loads the contents of the function2 .plt entry (at address 0x10011b78) into the ctr and branches to it:
- This is the .plt section:
Contents of section .plt:
10011b74 10001980 10001984 10001988 ............
- The contents of the actual .plt entry for function2:
10011b78 <function2@plt>:
10011b78: 10 00 19 84 vslw v0,v0,v3
The .plt entry for function2 will eventually hold the absolute address of function2 after it is loaded by the dynamic linker.
After the linker has run, the .plt entry for function2 holds 0x10001984, which is an address in the nop fall-through code of the .glink PLT resolver:
10001980: 60 00 00 00 nop
10001984: 60 00 00 00 nop
10001988: 60 00 00 00 nop
1000198c: 60 00 00 00 nop
10001990: 3d 80 10 01 lis r12,4097
10001994: 3d 6b f0 00 addis r11,r11,-4096
10001998: 80 0c 1b 6c lwz r0,7020(r12)
1000199c: 39 6b e6 80 addi r11,r11,-6528
100019a0: 7c 09 03 a6 mtctr r0
100019a4: 7c 0b 5a 14 add r0,r11,r11
100019a8: 81 8c 1b 70 lwz r12,7024(r12)
100019ac: 7d 60 5a 14 add r11,r0,r11
100019b0: 4e 80 04 20 bctr
Before the dynamic link/loader has loaded the shared object which implements function2, the .plt entry for function2 contains the address of a fall-through nop in the PLT resolver, which will invoke _dl_runtime_resolve().
After _dl_runtime_resolve() has run the .plt entry for function2 will hold the absolute address for function2. All future .glink PLT call code stub invocations for function2 will load the absolute address for function2 into the count register and the bctr will invoke function2 proper. Therefore the expense of the dynamic resolution is paid the first time the function is used. All other invocations simply incur the cost of the .glink access of the .plt entry for function2.
- The bl at 0x100015e4 shows how the linker recognized that the function body of function3 exists in the executable and provided a direct bl to the function body rather than branching to a .glink stub and fetching the function address from the .plt. The asm for function3 is listed here for reference.
1000154c <function3>:
1000154c: 94 21 ff e0 stwu r1,-32(r1)
10001550: 93 e1 00 18 stw r31,24(r1)
10001554: 7c 3f 0b 78 mr r31,r1
10001558: 90 7f 00 08 stw r3,8(r31)
1000155c: 81 3f 00 08 lwz r9,8(r31)
10001560: 38 09 00 01 addi r0,r9,1
10001564: 90 1f 00 08 stw r0,8(r31)
10001568: 80 1f 00 08 lwz r0,8(r31)
1000156c: 7c 03 03 78 mr r3,r0
10001570: 81 61 00 00 lwz r11,0(r1)
10001574: 83 eb ff f8 lwz r31,-8(r11)
10001578: 7d 61 5b 78 mr r1,r11
1000157c: 4e 80 00 20 blr
- Note: Per the warning at the beginning of this section, if you directly include a .c file rather than a .h which contains a function prototype the GCC compiler can get to the function code and will insert it into the executable, negating the benefit of linking against a shared object.
- The blr at 0x10001608 shows how the application returns control to the calling function by branching to the address held in the link register.
How do I fix symbols showing up as check-localplt make check failures?
Reference this email exchange on the libc-alpha mailing list.
Say you're seeing the following:
--- /home/ryanarn/glibc/glibc-2.7/scripts/data/localplt-powerpc-linux-gnu.data
+++ /home/ryanarn/glibc/build/glibc32/check-localplt.out.new
@@ -4,4 +4,9 @@
 libc.so: malloc
 libc.so: memalign
 libc.so: realloc
+libm.so: cosl
+libm.so: finitel
+libm.so: logl
 libm.so: matherr
+libm.so: sinl
+libm.so: sqrtl
This means that some function is using the symbols cosl, finitel, logl, sinl, and sqrtl directly from within libm.so when they SHOULD be using the internal versions __cosl, __finitel, __logl, __sinl, and __sqrtl instead.
Using the external interface from within the library is bad for two reasons:
1. Someone could change the implementation out from underneath you and then libm may end up using an erroneous user created function where it is not appropriate.
2. The libm.so shared object ends up having to do plt branching when it could simply branch directly to the internal symbol.
To find out which function is using the symbols follow the procedure outlined below.
NOTE: The trick is that PLT slots and call-stubs are numbered and have to correspond.
Dump the relocations used by the library:
> readelf -r math/libm.so | grep R_PPC_JMP_SLOT
000b7000 00007815 R_PPC_JMP_SLOT 00000000 __assert_fail + 0
000b7004 00008f15 R_PPC_JMP_SLOT 00047ea0 cosl + 0
000b7008 00009315 R_PPC_JMP_SLOT 00000000 __errno_location + 0
000b700c 0000df15 R_PPC_JMP_SLOT 0004d570 sqrtl + 0
000b7010 0000f015 R_PPC_JMP_SLOT 00000000 fputs + 0
000b7014 0000f715 R_PPC_JMP_SLOT 00000000 strlen + 0
000b7018 00012415 R_PPC_JMP_SLOT 00000000 sprintf + 0
000b701c 00013115 R_PPC_JMP_SLOT 00010180 matherr + 0
000b7020 00015b15 R_PPC_JMP_SLOT 00000000 __cxa_finalize + 0
000b7024 00017e15 R_PPC_JMP_SLOT 00000000 strtold + 0
000b7028 00018515 R_PPC_JMP_SLOT 00000000 memset + 0
000b702c 00018615 R_PPC_JMP_SLOT 00000000 strtof + 0
000b7030 00018a15 R_PPC_JMP_SLOT 00000000 strtod + 0
000b7034 00018d15 R_PPC_JMP_SLOT 0004b0a0 sinl + 0
000b7038 00019815 R_PPC_JMP_SLOT 0004ccd0 logl + 0
000b703c 0001a615 R_PPC_JMP_SLOT 00000000 fwrite + 0
000b7040 0001ab15 R_PPC_JMP_SLOT 00054910 finitel + 0
000b7044 0001c915 R_PPC_JMP_SLOT 00000000 __gmon_start__ + 0
Next you need to find out where the call-stubs are in the libm.so .text section. There is no symbol defined to tell you where this starts. Generally the linker seems to put the first call-stub at the end of the <call___do_global_ctors_aux> text for 32-bit and before the <call_gmon_start> text for 64-bit:
0005eb00 <call___do_global_ctors_aux>:
5eb00: 94 21 ff f0 stwu r1,-16(r1)
5eb04: 7c 08 02 a6 mflr r0
5eb08: 90 01 00 14 stw r0,20(r1)
5eb0c: 80 01 00 14 lwz r0,20(r1)
5eb10: 38 21 00 10 addi r1,r1,16
5eb14: 7c 08 03 a6 mtlr r0
5eb18: 4e 80 00 20 blr
5eb1c: 00 00 00 00 .long 0x0
5eb20: 81 7e 00 0c lwz r11,12(r30)
5eb24: 7d 69 03 a6 mtctr r11
5eb28: 4e 80 04 20 bctr
5eb2c: 60 00 00 00 nop
5eb30: 81 7e 00 10 lwz r11,16(r30)
5eb34: 7d 69 03 a6 mtctr r11
5eb38: 4e 80 04 20 bctr
5eb3c: 60 00 00 00 nop
5eb40: 81 7e 00 14 lwz r11,20(r30)
5eb44: 7d 69 03 a6 mtctr r11
5eb48: 4e 80 04 20 bctr
5eb4c: 60 00 00 00 nop
The second call-stub (at 0x5eb30) corresponds with the second relocation slot, the one for cosl:
5eb30: 81 7e 00 10 lwz r11,16(r30)
5eb34: 7d 69 03 a6 mtctr r11
5eb38: 4e 80 04 20 bctr
5eb3c: 60 00 00 00 nop
So simply search the libm disassembly for the branch to the address of the call-stub which happens to be in the __ieee754_j0l function:
0003e370 <__ieee754_j0l>:
3e370: 94 21 ff 40 stwu r1,-192(r1)
3e374: 7c 08 02 a6 mflr r0
3e378: 42 9f 00 05 bcl- 20,4*cr7+so,3e37c <__ieee754_j0l+0xc>
...
3e8f8: 48 02 02 39 bl 5eb30 <call___do_global_ctors_aux+0x30>
This function is implemented in the following file:
glibc-2.7/sysdeps/ieee754/ldbl-128/e_j0l.c
In this file change all references to cosl to __cosl. Continue to search libm.so for all instances of 0x5eb30 and then move on to the next relocation that needs to be removed.
.so naming convention and the required symlinks
Code and Data segment layout in executable file and memory
Talk about brk() and sbrk() and how the loader lays out the segments.