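Inspecting a Program's Library Dependencies

Learning sparked by two Python scripts: inspecting a program's library dependencies

I came across two Python scripts online, written by Song Baohua (宋宝华), that can be used to inspect a program's library dependencies. The GitHub repository is: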

https://github.com/21cnbao/libdep/

A Linux program's dependencies on a library

symbol-dep.py:

https://cloud.tencent.com/developer/article/1518254

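How it works

The command nm -D --undefined-only lists the symbols (typically library functions) that a program needs to have resolved by dynamic linking, for example: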

➜  nm -D --undefined-only a.out
                 w __gmon_start__
                 U __libc_start_main
                 U puts

a.out is an ELF executable compiled from a trivial C program.

hello.c:

#include <stdio.h>

int main(int argc, char *argv[])
{
	printf("hello world!\n");

	return 0;
}

Compile it:

gcc hello.c

The command nm -D --defined-only lists the symbols that a shared library exports for others to use, for example:

➜  nm -D --defined-only /lib/x86_64-linux-gnu/libc-2.19.so | more 
0000000000046d30 T a64l
0000000000039f90 T abort
00000000003bfe00 B __abort_msg
000000000003c920 T abs
...

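./symbol-dep.py -s a -d b.so: taking the intersection of the symbols that a needs and the symbols that b.so provides tells us, without any source code at all, which functions of b.so the program a can call.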
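The intersection step can be sketched on its own, working on captured nm text instead of running nm. This is only an illustration, not the author's script: the function names are made up here, the `puts` entry in the exported sample is a hypothetical line added so the intersection is non-empty, and stripping an "@VERSION" suffix is an assumption about how some nm versions print versioned symbols.

```python
def parse_nm_symbols(nm_output):
    """Return the set of symbol names found in `nm -D` output text."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # The symbol name is the last column. Drop any "@VERSION" suffix
        # (assumption: some nm versions print versioned symbol names).
        symbols.add(fields[-1].split("@")[0])
    return symbols


def symbols_used_from(needed_output, exported_output):
    """Symbols the program needs that the library also exports, sorted."""
    needed = parse_nm_symbols(needed_output)
    exported = parse_nm_symbols(exported_output)
    return sorted(needed & exported)


# Undefined symbols of a.out, copied from the listing above.
NEEDED = """\
                 w __gmon_start__
                 U __libc_start_main
                 U puts
"""

# Exported symbols of libc; the last line is hypothetical (address invented).
EXPORTED = """\
0000000000046d30 T a64l
0000000000039f90 T abort
00000000003bfe00 B __abort_msg
000000000003c920 T abs
0000000000070000 T puts
"""

print(symbols_used_from(NEEDED, EXPORTED))
```

Using sets keeps each membership check cheap even for a library that exports thousands of symbols.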

Implementation

#!/usr/bin/python3

import sys, getopt, os

def main(argv):
	srcfile  = ''
	dstfile  = ''
	neededsymbols    = []
	exportedsymbols  = []

	try:
		opts, args = getopt.getopt(argv, "hs:d:", ["sfile=", "dfile="])
	except getopt.GetoptError:
		print ('symbol-dep.py -s <srcfile> -d <dstfile>')
		sys.exit(2)
		
	for opt, arg in opts:
		if opt == '-h':
			print ('symbol-dep.py -s <srcfile> -d <dstfile>')
			sys.exit()
		elif opt in ("-s", "--sfile"):
			srcfile = arg
		elif opt in ("-d", "--dfile"):
			dstfile = arg

	# get the symbols srcfile depends on
	src=os.popen("nm -D --undefined-only "+srcfile)
	srclist=src.read().splitlines()
	for sline in srclist:
		neededsymbols.append(sline.split()[-1])

	# get the symbols dstfile exports
	dst=os.popen("nm -D --defined-only "+dstfile)
	dstlist=dst.read().splitlines()
	for dline in dstlist:
		exportedsymbols.append(dline.split()[-1])
	
	# intersection of src and dest
	for symbol in neededsymbols:
		if symbol in exportedsymbols:
			print(symbol)

if __name__ == "__main__":
	main(sys.argv[1:])

Usage on an Ubuntu virtual machine:

./symbol-dep.py -s a.out -d /lib/x86_64-linux-gnu/libc-2.19.so

Drawing a dependency graph of Linux programs and libraries

libdep-pic.py:

https://cloud.tencent.com/developer/article/1518085

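How it works

The dot drawing tool

Install dot on Ubuntu (dot is part of Graphviz, which the xdot package pulls in as a dependency):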

sudo apt-get install xdot

Write the graph description in test.dot:

digraph graphname {
a -> b -> c;
b -> d;
}

Run this command to generate the image test.png:

dot -Tpng -o test.png test.dot

[Figure: ./dot/test.png, the graph rendered from test.dot; image not preserved]
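Generating this kind of DOT text from a program is mostly string formatting. The sketch below is an illustration with a made-up helper name, `edges_to_dot`; it quotes node names because library file names contain dots and dashes.

```python
def edges_to_dot(edges, name="graphname"):
    """Render (source, target) pairs as DOT digraph text."""
    lines = ["digraph %s {" % name]
    for src, dst in edges:
        # Quote node names so that file names such as libc.so.6 are accepted.
        lines.append('"%s" -> "%s";' % (src, dst))
    lines.append("}")
    return "\n".join(lines) + "\n"


# The same graph as test.dot, written as an edge list.
print(edges_to_dot([("a", "b"), ("b", "c"), ("b", "d")]), end="")
```

The returned text can be saved to a file and passed to dot -Tpng in the same way as test.dot.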

The ldd tool

ldd is an ordinary shell script. It lists the .so files that an ELF file depends on, as well as the .so files that those libraries depend on.

➜  which ldd 
/usr/bin/ldd
➜  file /usr/bin/ldd
/usr/bin/ldd: Bourne-Again shell script, ASCII text executable

Usage:

➜  ldd /usr/lib/firefox/firefox
	linux-vdso.so.1 =>  (0x00007fffef7e9000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f3b7f724000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3b7f520000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f3b7f21c000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3b7ef16000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3b7eb51000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f3b7fb5e000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f3b7e93b000)

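linux-vdso.so.1: vDSO stands for Virtual Dynamic Shared Object. It is a virtual shared library that the kernel maps into every process, so it does not exist as a file under /lib or /usr/lib.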
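The three line shapes in this output (name and path, name with an empty path for the vDSO, and a bare absolute path for the loader) can be told apart with a few string operations. The following is a rough sketch with a hypothetical function name, `parse_ldd_output`; it skips lines without an address in parentheses, such as "not found" entries.

```python
def parse_ldd_output(text):
    """Map each library name in ldd output to its path (None if no file)."""
    deps = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "(" not in line:
            continue  # skip blank lines and lines without a load address
        entry = line[: line.rindex("(")].strip()  # drop the "(0x...)" part
        if "=>" in entry:
            name, path = (part.strip() for part in entry.split("=>", 1))
        else:
            name, path = entry, entry  # the loader is listed by path only
        deps[name] = path or None  # empty path: no file on disk (the vDSO)
    return deps


# Three lines copied from the firefox listing above.
SAMPLE = """\
\tlinux-vdso.so.1 =>  (0x00007fffef7e9000)
\tlibm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3b7ef16000)
\t/lib64/ld-linux-x86-64.so.2 (0x00007f3b7fb5e000)
"""

for name, path in parse_ldd_output(SAMPLE).items():
    print(name, "->", path)
```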

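firefox depends on libm.so.6 and other libraries. Running ldd on libm.so.6 in turn reveals the next level of dependencies, so building the complete dependency graph is a recursive process.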

➜  ldd /lib/x86_64-linux-gnu/libm.so.6  
	linux-vdso.so.1 =>  (0x00007ffe37f60000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdffe5c1000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fdffec8c000)
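The recursion can be sketched independently of ldd by walking a table of direct dependencies. The table below is hypothetical and only loosely based on the listings above, and `collect_edges` is a made-up name. The point is the `seen` set: a library such as libc.so.6 has several users but should be expanded only once.

```python
# Hypothetical direct-dependency table, loosely based on the listings above.
DIRECT_DEPS = {
    "firefox": ["libm.so.6", "libc.so.6"],
    "libm.so.6": ["libc.so.6"],
    "libc.so.6": [],
}


def collect_edges(name, direct_deps, seen=None, edges=None):
    """Depth-first walk returning every (user, library) edge exactly once."""
    if seen is None:
        seen, edges = set(), []
    if name in seen:
        return edges  # already expanded; a library can have several users
    seen.add(name)
    for lib in direct_deps.get(name, []):
        edges.append((name, lib))
        collect_edges(lib, direct_deps, seen, edges)
    return edges


print(collect_edges("firefox", DIRECT_DEPS))
```

Without the `seen` check, a cycle between two libraries would recurse forever, and shared libraries would be analyzed once per user.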

Implementation

#!/usr/bin/python3

import sys, os, re

analyzedlist = []

# get the libs prog depends on and write the results into opened file f
def dep(f, prog):
   # one lib may be used by several users
   if prog in analyzedlist:
       return
   else:
       analyzedlist.append(prog) 

   pname = prog.split('/')[-1]
   needed=os.popen("ldd "+prog)
   neededso=re.findall(r'[>](.*?)[(]', needed.read())

   for so in neededso:
       if(len(so.strip()) > 0):
           f.write('"' + pname + '" -> "' + so.split('/')[-1] + '";\n')
           dep(f, so)

def main(argv):
   f = open('/tmp/libdep.dot','w',encoding='utf-8')
   f.write('digraph graphname {\n')
   dep(f, argv)
   f.write('}\n')
   f.close()
   os.popen("dot -Tpng -o ./libdep.png /tmp/libdep.dot")

if __name__ == "__main__":
   if len(sys.argv) == 2:
       main(sys.argv[1])
   else:
       print ("usage: libdep-pic.py [program]")

Usage on an Ubuntu virtual machine:

./libdep-pic.py /usr/lib/firefox/firefox

[Figure: ./dot/libdep.png, the dependency graph generated for firefox; image not preserved]

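The ldd tool: how it works

Overview:
1)
  ldd is not a binary executable but a shell script. It displays the shared-library dependencies of an executable by setting a series of environment variables: LD_TRACE_LOADED_OBJECTS, LD_WARN, LD_BIND_NOW, LD_LIBRARY_VERSION, LD_VERBOSE, and so on. When LD_TRACE_LOADED_OBJECTS is non-empty, any dynamically linked program that is started only prints its dependencies and is not actually executed. You can try this in a shell: run export LD_TRACE_LOADED_OBJECTS=1, then run any program, such as ls, and look at the output.
2)
  The dependency listing is really produced by ld-linux.so, the ELF dynamic loader. ld-linux.so runs before the executable itself and gets control first, so when the environment variables above are set it prints the executable's dependencies instead of starting the program. In fact ld-linux.so can be invoked directly, for example: /lib/ld-linux.so.2 --list program (which is equivalent to ldd program).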
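The same experiment can be run from Python by setting the variable only in the child process's environment, so nothing has to be unset afterwards. This is a minimal sketch assuming a glibc-based Linux system; `trace_env` and `list_loaded_objects` are names invented for the illustration.

```python
import os
import subprocess


def trace_env(base_env=None):
    """Copy an environment and switch the dynamic loader into trace mode."""
    env = dict(os.environ if base_env is None else base_env)
    env["LD_TRACE_LOADED_OBJECTS"] = "1"
    return env


def list_loaded_objects(command):
    """Run `command` (a list of arguments) in trace mode; return its output."""
    result = subprocess.run(
        command,
        env=trace_env(),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


# Example call (the path is only an example target):
#     print(list_loaded_objects(["/bin/ls"]), end="")
```

On a system where the loader honors LD_TRACE_LOADED_OBJECTS, the call in the comment should print a library list for ls rather than a directory listing.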

Source of the ldd script:

#! /bin/bash
# Copyright (C) 1996-2014 Free Software Foundation, Inc.
# This file is part of the GNU C Library.

# The GNU C Library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.

# The GNU C Library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
# Lesser General Public License for more details.

# You should have received a copy of the GNU Lesser General Public
# License along with the GNU C Library; if not, see
# <http://www.gnu.org/licenses/>.


# This is the `ldd' command, which lists what shared libraries are
# used by given dynamically-linked executables.  It works by invoking the
# run-time dynamic linker as a command and setting the environment
# variable LD_TRACE_LOADED_OBJECTS to a non-empty value.

# We should be able to find the translation right at the beginning.
TEXTDOMAIN=libc
TEXTDOMAINDIR=/usr/share/locale

RTLDLIST="/lib/ld-linux.so.2 /lib64/ld-linux-x86-64.so.2 /libx32/ld-linux-x32.so.2"
warn=
bind_now=
verbose=

while test $# -gt 0; do
  case "$1" in
  --vers | --versi | --versio | --version)
    echo 'ldd (Ubuntu EGLIBC 2.19-0ubuntu6.6) 2.19'
    printf $"Copyright (C) %s Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
" "2014"
    printf $"Written by %s and %s.
" "Roland McGrath" "Ulrich Drepper"
    exit 0
    ;;
  --h | --he | --hel | --help)
    echo $"Usage: ldd [OPTION]... FILE...
      --help              print this help and exit
      --version           print version information and exit
  -d, --data-relocs       process data relocations
  -r, --function-relocs   process data and function relocations
  -u, --unused            print unused direct dependencies
  -v, --verbose           print all information
"
    printf $"For bug reporting instructions, please see:\\n%s.\\n" \
      "<https://bugs.launchpad.net/ubuntu/+source/eglibc/+bugs>"
    exit 0
    ;;
  -d | --d | --da | --dat | --data | --data- | --data-r | --data-re | \
  --data-rel | --data-relo | --data-reloc | --data-relocs)
    warn=yes
    shift
    ;;
  -r | --f | --fu | --fun | --func | --funct | --functi | --functio | \
  --function | --function- | --function-r | --function-re | --function-rel | \
  --function-relo | --function-reloc | --function-relocs)
    warn=yes
    bind_now=yes
    shift
    ;;
  -v | --verb | --verbo | --verbos | --verbose)
    verbose=yes
    shift
    ;;
  -u | --u | --un | --unu | --unus | --unuse | --unused)
    unused=yes
    shift
    ;;
  --v | --ve | --ver)
    echo >&2 $"ldd: option \`$1' is ambiguous"
    exit 1
    ;;
  --)		# Stop option processing.
    shift; break
    ;;
  -*)
    echo >&2 'ldd:' $"unrecognized option" "\`$1'"
    echo >&2 $"Try \`ldd --help' for more information."
    exit 1
    ;;
  *)
    break
    ;;
  esac
done

nonelf ()
{
  # Maybe extra code for non-ELF binaries.
  return 1;
}

add_env="LD_TRACE_LOADED_OBJECTS=1 LD_WARN=$warn LD_BIND_NOW=$bind_now"
add_env="$add_env LD_LIBRARY_VERSION=\$verify_out"
add_env="$add_env LD_VERBOSE=$verbose"
if test "$unused" = yes; then
  add_env="$add_env LD_DEBUG=\"$LD_DEBUG${LD_DEBUG:+,}unused\""
fi

# The following command substitution is needed to make ldd work in SELinux
# environments where the RTLD might not have permission to write to the
# terminal.  The extra "x" character prevents the shell from trimming trailing
# newlines from command substitution results.  This function is defined as a
# subshell compound list (using "(...)") to prevent parameter assignments from
# affecting the calling shell execution environment.
try_trace() (
  output=$(eval $add_env '"$@"' 2>&1; rc=$?; printf 'x'; exit $rc)
  rc=$?
  printf '%s' "${output%x}"
  return $rc
)

case $# in
0)
  echo >&2 'ldd:' $"missing file arguments"
  echo >&2 $"Try \`ldd --help' for more information."
  exit 1
  ;;
1)
  single_file=t
  ;;
*)
  single_file=f
  ;;
esac

result=0
for file do
  # We don't list the file name when there is only one.
  test $single_file = t || echo "${file}:"
  case $file in
  */*) :
       ;;
  *) file=./$file
     ;;
  esac
  if test ! -e "$file"; then
    echo "ldd: ${file}:" $"No such file or directory" >&2
    result=1
  elif test ! -f "$file"; then
    echo "ldd: ${file}:" $"not regular file" >&2
    result=1
  elif test -r "$file"; then
    RTLD=
    ret=1
    for rtld in ${RTLDLIST}; do
      if test -x $rtld; then
	dummy=`$rtld 2>&1` 
	if test $? = 127; then
	  verify_out=`${rtld} --verify "$file"`
	  ret=$?
	  case $ret in
	  [02]) RTLD=${rtld}; break;;
	  esac
	fi
      fi
    done
    case $ret in
    0|2)
      try_trace "$RTLD" "$file" || result=1
      ;;
    1)
      # This can be a non-ELF binary or no binary at all.
      nonelf "$file" || {
	echo $"	not a dynamic executable"
	result=1
      }
      ;;
    *)
      echo 'ldd:' ${RTLD} $"exited with unknown exit code" "($ret)" >&2
      exit 1
      ;;
    esac
  else
    echo 'ldd:' $"error: you do not have read permission for" "\`$file'" >&2
    result=1
  fi
done

exit $result
# Local Variables:
#  mode:ksh
# End:

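The key parts of the script above are the try_trace() function and the add_env variable. The script first sets LD_TRACE_LOADED_OBJECTS=1 and sets the other environment variables according to the command-line options. It then searches RTLDLIST for an ld-linux.so loader that exists on the current system and accepts the file (checked with --verify), and passes that loader and the file to be inspected to try_trace(), which produces the output.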
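The loader-selection loop and the trace call can be restated in Python as a rough, simplified sketch. It is not a replacement for ldd: it omits the script's check that the bare loader exits with status 127, it does not pass LD_LIBRARY_VERSION or the option-dependent variables, and the function names are invented here. The `verify` and `is_executable` parameters exist only so the selection logic can be exercised without real loaders.

```python
import os
import subprocess

# Loader candidates, copied from RTLDLIST in the script above.
RTLDLIST = [
    "/lib/ld-linux.so.2",
    "/lib64/ld-linux-x86-64.so.2",
    "/libx32/ld-linux-x32.so.2",
]


def run_verify(rtld, path):
    """Exit status of `rtld --verify path`; the script accepts 0 or 2."""
    return subprocess.call(
        [rtld, "--verify", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


def pick_loader(path, candidates=RTLDLIST, verify=run_verify,
                is_executable=lambda p: os.access(p, os.X_OK)):
    """Return the first candidate loader that accepts `path`, else None."""
    for rtld in candidates:
        if is_executable(rtld) and verify(rtld, path) in (0, 2):
            return rtld
    return None


def trace(path):
    """Roughly what try_trace does: run the loader with tracing switched on."""
    rtld = pick_loader(path)
    if rtld is None:
        return None  # ldd reports "not a dynamic executable" in this case
    env = dict(os.environ, LD_TRACE_LOADED_OBJECTS="1")
    result = subprocess.run(
        [rtld, path],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout
```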

In simplified form, what ldd runs is equivalent to the following:

➜  export  LD_TRACE_LOADED_OBJECTS=1  
➜  /lib64/ld-linux-x86-64.so.2 ./a.out
	linux-vdso.so.1 =>  (0x00007ffe89dcb000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb168b4b000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fb168f10000)

Once the variable is set, even an ordinary ls command prints its own library dependencies:

➜  export  LD_TRACE_LOADED_OBJECTS=1  
➜  ls
	linux-vdso.so.1 =>  (0x00007ffc2b7bb000)
	libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f7675ee4000)
	libacl.so.1 => /lib/x86_64-linux-gnu/libacl.so.1 (0x00007f7675cdc000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7675917000)
	libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f76756d9000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f76754d5000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f7676107000)
	libattr.so.1 => /lib/x86_64-linux-gnu/libattr.so.1 (0x00007f76752d0000)
➜  unset  LD_TRACE_LOADED_OBJECTS

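After LD_TRACE_LOADED_OBJECTS is unset, commands behave normally again.

Note that in some circumstances (for example, when the program specifies an ELF interpreter other than ld-linux.so), some versions of ldd may try to obtain the dependency information by executing the program directly. This can lead to execution of whatever code is defined in the program's ELF interpreter, and perhaps of the program itself. (In glibc versions before 2.27, for example, the upstream ldd implementation did this, although most distributions shipped a modified version that did not.)

ldd should therefore not be run on untrusted executables. A safer alternative, which shows only the direct dependencies, is: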

➜  objdump -p ./a.out  | grep NEEDED
  NEEDED               libc.so.6
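Filtering the NEEDED entries can also be done in Python on captured objdump -p text. This is a small sketch with a hypothetical function name, `parse_needed`. In the sample, only the NEEDED line comes from the output above; the other two lines are hypothetical stand-ins for the rest of the dynamic section.

```python
def parse_needed(objdump_output):
    """Return the library names from the NEEDED lines of `objdump -p` text."""
    needed = []
    for line in objdump_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "NEEDED":
            needed.append(fields[1])
    return needed


# The NEEDED line is from the output above; the other lines are hypothetical.
SAMPLE = """\
Dynamic Section:
  NEEDED               libc.so.6
  INIT                 0x00000000004003e0
"""

print(parse_needed(SAMPLE))
```

Because objdump only reads the file and never runs it, this approach stays safe for binaries from unknown sources.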

The real workhorse behind the ldd script is the dynamic linker:

ld-linux.so.X

See the relevant man pages:

  1. man ldd(http://www.kernel.org/doc/man-pages/online/pages/man1/ldd.1.html)
  2. man ld.so(http://www.kernel.org/doc/man-pages/online/pages/man8/ld.so.8.html)
  3. man ldconfig(http://www.kernel.org/doc/man-pages/online/pages/man8/ldconfig.8.html)

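I generated PDF copies of these web pages.

Related articles for reference:

  1. Anatomy of Linux dynamic libraries (Linux 动态库剖析) (http://www.ibm.com/developerworks/cn/linux/l-dynamic-libraries/)
  2. Dissecting shared libraries (剖析共享程序库) (http://www.ibm.com/developerworks/cn/linux/l-shlibs.html)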

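From this point on we are in the code of the dynamic linker itself, and understanding it further will clearly take a lot more time and effort.

From material online I learned that ld.so is part of glibc. I downloaded the glibc-2.30.tar.gz source and found I could not follow it at all, so I am leaving it for now; my current skills are not yet up to reading this code.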
