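Notes on Symbols and Linking in C++
Preface
In C/C++, entities such as functions and variables are abstracted into symbols after compilation, and other programs can link against those symbols.
The concept is fairly abstract; in practice, just keep one rule in mind:
Building a library allows symbols to be declared without being defined; building an executable requires every symbol to be resolved, recursively through all of its dependencies.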
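Library files
A static library (static) carries the definitions of all the symbols it uses, and they are copied into the final binary, so the result is usually larger.
A shared library (shared) instead refers to external symbols, recursively through its own dependencies, which saves space.
The search order is usually:
LD_LIBRARY_PATH → RPATH → SYSTEM
Note that according to man ld.so this order holds when the path is stored as DT_RUNPATH; a legacy DT_RPATH entry without DT_RUNPATH is searched before LD_LIBRARY_PATH. SYSTEM means /etc/ld.so.cache followed by the default directories such as /lib and /usr/lib.
Use the command ldconfig -p | grep xxx.so to find the exact path of a system library.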
Binaries
The binary file format is ELF (Executable and Linkable Format).
It contains a special header that Unix-like systems recognize, and it can be modified with patchelf.
Lookup
The documentation shipped with the system is available via man ld.so.
The nm -D command lists the dynamic symbols of a library:
$ nm -D /lib/x86_64-linux-gnu/libe2p.so.2
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
00000000002086b0 B __bss_start
U __ctype_b_loc
w __cxa_finalize
U __errno_location
U __fprintf_chk
U __fxstat
w __gmon_start__
U __lxstat
U __printf_chk
U __sprintf_chk
U __stack_chk_fail
U __strcat_chk
00000000002086b0 D _edata
0000000000208928 B _end
Explanation (from man nm)
For each symbol, nm shows:
· The symbol value, in the radix selected by options (see below), or hexadecimal by default.
· The symbol type. At least the following types are used; others are, as well, depending on the object file format. If lowercase, the symbol is usually local; if uppercase, the
symbol is global (external). There are however a few lowercase symbols that are shown for special global symbols ("u", "v" and "w").
"A" The symbol's value is absolute, and will not be changed by further linking.
"B"
"b" The symbol is in the BSS data section. This section typically contains zero-initialized or uninitialized data, although the exact behavior is system dependent.
"C" The symbol is common. Common symbols are uninitialized data. When linking, multiple common symbols may appear with the same name. If the symbol is defined anywhere, the
common symbols are treated as undefined references.
"D"
"d" The symbol is in the initialized data section.
"G"
"g" The symbol is in an initialized data section for small objects. Some object file formats permit more efficient access to small data objects, such as a global int variable as
opposed to a large global array.
"i" For PE format files this indicates that the symbol is in a section specific to the implementation of DLLs. For ELF format files this indicates that the symbol is an indirect
function. This is a GNU extension to the standard set of ELF symbol types. It indicates a symbol which if referenced by a relocation does not evaluate to its address, but
instead must be invoked at runtime. The runtime execution will then return the value to be used in the relocation.
"I" The symbol is an indirect reference to another symbol.
"N" The symbol is a debugging symbol.
"p" The symbols is in a stack unwind section.
"R"
"r" The symbol is in a read only data section.
"S"
"s" The symbol is in an uninitialized or zero-initialized data section for small objects.
"T"
"t" The symbol is in the text (code) section.
"U" The symbol is undefined.
"u" The symbol is a unique global symbol. This is a GNU extension to the standard set of ELF symbol bindings. For such a symbol the dynamic linker will make sure that in the
entire process there is just one symbol with this name and type in use.
"V"
"v" The symbol is a weak object. When a weak defined symbol is linked with a normal defined symbol, the normal defined symbol is used with no error. When a weak undefined symbol
is linked and the symbol is not defined, the value of the weak symbol becomes zero with no error. On some systems, uppercase indicates that a default value has been specified.
"W"
"w" The symbol is a weak symbol that has not been specifically tagged as a weak object symbol. When a weak defined symbol is linked with a normal defined symbol, the normal
defined symbol is used with no error. When a weak undefined symbol is linked and the symbol is not defined, the value of the symbol is determined in a system-specific manner
without error. On some systems, uppercase indicates that a default value has been specified.
"-" The symbol is a stabs symbol in an a.out object file. In this case, the next values printed are the stabs other field, the stabs desc field, and the stab type. Stabs symbols
are used to hold debugging information.
"?" The symbol type is unknown, or object file format specific.
· The symbol name.
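Debugging
When a symbol cannot be found, or when you need to see exactly how libraries and symbols are resolved, turn on the loader's debug mode and observe, for example: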
$ export LD_DEBUG=libs
$ date
30606: find library=libc.so.6 [0]; searching
30606: search path=/opt/ros/lib:tls/haswell/x86_64:tls/haswell:tls/x86_64:tls:haswell/x86_64:haswell:x86_64: (LD_LIBRARY_PATH)
30606: trying file=tls/haswell/x86_64/libc.so.6
30606: trying file=tls/haswell/libc.so.6
30606: trying file=tls/x86_64/libc.so.6
30606: trying file=tls/libc.so.6
30606: trying file=haswell/x86_64/libc.so.6
30606: trying file=haswell/libc.so.6
30606: trying file=x86_64/libc.so.6
30606: trying file=libc.so.6
30606: search cache=/etc/ld.so.cache
30606: trying file=/lib/x86_64-linux-gnu/libc.so.6
30606:
30606:
30606: calling init: /lib/x86_64-linux-gnu/libc.so.6
30606:
30606:
30606: initialize program: date
30606:
30606:
30606: transferring control: date
30606:
Sat Aug 6 10:57:12 CST 2022
$ unset LD_DEBUG
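Demangling
For C++ libraries, symbol names can be hard to read because of name mangling; they have to be demangled to be understood.
See the Itanium C++ ABI for the exact encoding rules.
For example, _ZNK8KxVectorI16KxfArcFileRecordjEixEj
was originally KxVector<KxfArcFileRecord, unsigned int>::operator[](unsigned int) const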
In practice there are two ways to handle this:
- Pass the -C option to nm to show readable names directly
- Use an online tool: GCC and MSVC C++ Demangler
Firefighting
For example:
/home/ubuntu/debug/target/lib/libxxx.so.0.1.0: undefined reference to `FooBar'
collect2: error: ld returned 1 exit status
make[2]: *** [tools/CMakeFiles/PublicDebugger.dir/build.make:200: tools/PublicDebugger] Error 1
make[1]: *** [CMakeFiles/Makefile2:201: tools/CMakeFiles/PublicDebugger.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[100%] Linking CXX executable UnitTest
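In a project that pulls in a new library, it is common for the .so to build while the application fails with undefined reference. This strongly suggests that the executable is not linked against everything it needs. Recall the rule from the preface:
Building a library allows symbols to be declared without being defined; building an executable requires every symbol to be resolved, recursively through all of its dependencies.
The corresponding fixes are:
- Update target_link_libraries in the CMake configuration
- Modify CXX_FLAGS directly to pass -lxxx to GCC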
