tarfile模块
实现打包压缩、解压缩,同时支持gzip / bzip2 / xz格式
- tarfile模块允许创建、访问tar文件
- 同时支持gzip、bzip2格式
# 压缩
>>> import tarfile
>>> tar = tarfile.open('/tmp/mytest.tar.gz', 'w:gz') # 用gzip压缩
>>> tar.add('/etc/hosts') # 压缩单个文件
>>> tar.add('/etc/security') # 压缩目录
>>> tar.close() # 关闭
# 解压
>>> tar = tarfile.open('/tmp/mytest.tar.gz') # 自动识别压缩格式,以读方式打开
>>> tar.extractall(path='/var/tmp') # 指定解压目录,默认解到当前目录
>>> tar.close()
备份程序
- 需要支持完全和增量备份
- 周一执行完全备份
- 其他时间执行增量备份
- 备份文件需要打包为tar文件并使用gzip格式压缩
[root@room8pc16 day03]# mkdir /tmp/demo
[root@room8pc16 day03]# cp -r /etc/security/ /tmp/demo
[root@room8pc16 demo]# mkdir /tmp/demo/backup
- 将/tmp/demo/security备份到/tmp/demo/backup中
- 需要支持完全和增量备份
- 周一执行完全备份
- 其他时间执行增量备份
分析:
- 完全备份需要执行备份目录、计算每个文件的md5值
- 增量备份需要计算文件的md5值,把md5值与前一天的md5值比较,有变化的文件要备份;目录中新增的文件也要备份
- 备份的文件名,应该体现出:备份的是哪个目录,是增量还是完全,哪一天备份的
获取/tmp/demo/security目录下每个文件的绝对路径:
>>> import os
>>> os.walk('/tmp/demo/security')
<generator object walk at 0x7f6306b7edb0>
>>> files = list(os.walk('/tmp/demo/security'))
>>> import pprint
>>> pprint.pprint(files)
[('/tmp/demo/security',
['console.apps', 'console.perms.d', 'limits.d', 'namespace.d'],
['access.conf',
'chroot.conf',
'console.handlers',
'console.perms',
'group.conf',
'limits.conf',
'namespace.conf',
'namespace.init',
'opasswd',
'pam_env.conf',
'sepermit.conf',
'time.conf',
'pwquality.conf']),
('/tmp/demo/security/console.apps',
[],
['config-util', 'xserver', 'liveinst', 'setup']),
('/tmp/demo/security/console.perms.d', [], []),
('/tmp/demo/security/limits.d', [], ['20-nproc.conf']),
('/tmp/demo/security/namespace.d', [], [])]
>>> len(files)
5
>>> files[0]
('/tmp/demo/security', ['console.apps', 'console.perms.d', 'limits.d', 'namespace.d'], ['access.conf',
'chroot.conf', 'console.handlers', 'console.perms', 'group.conf', 'limits.conf', 'namespace.conf',
'namespace.init', 'opasswd', 'pam_env.conf', 'sepermit.conf', 'time.conf', 'pwquality.conf'])
>>> files[1]
('/tmp/demo/security/console.apps', [], ['config-util', 'xserver', 'liveinst', 'setup'])
>>> files[2]
('/tmp/demo/security/console.perms.d', [], [])
# 经过分析,发现ow.walk它的返回值由多个元组构成。元组有三项,第一项是路径字符串,第二项是该路径下的目录列表,第三项是该目录下的文件列表。
# 要获得每个文件的绝对路径,只要把路径字符串和文件拼接起来即可
>>> for path, folders, files in os.walk('/tmp/demo/security'):
... for file in files:
... os.path.join(path, file)
...
'/tmp/demo/security/access.conf'
'/tmp/demo/security/chroot.conf'
'/tmp/demo/security/console.handlers'
'/tmp/demo/security/console.perms'
'/tmp/demo/security/group.conf'
'/tmp/demo/security/limits.conf'
'/tmp/demo/security/namespace.conf'
'/tmp/demo/security/namespace.init'
'/tmp/demo/security/opasswd'
'/tmp/demo/security/pam_env.conf'
'/tmp/demo/security/sepermit.conf'
'/tmp/demo/security/time.conf'
'/tmp/demo/security/pwquality.conf'
'/tmp/demo/security/console.apps/config-util'
'/tmp/demo/security/console.apps/xserver'
'/tmp/demo/security/console.apps/liveinst'
'/tmp/demo/security/console.apps/setup'
'/tmp/demo/security/limits.d/20-nproc.conf'