fast copy

The normal way

To copy files recursively from src1/ and src2/ to dest/, run:

cp -Rv src1/ src2/ dest/

The piped way

Instead of cp, we can use the kernel's pipe support to stream the files block by block. tar converts the directories recursively into a single stream. At one end we create a single stream out of the source directories, and at the other end we extract this stream into the destination directory. The transfer between the two ends goes through a pipe.

tar -Sc src1/ src2/ | tar -C dest/ -xv

This is not necessarily faster, but it is much more flexible. Note that ext2, ext3 and most modern Linux file systems support 'sparse files', which store large files with long runs of zeroed content efficiently. Most of the time you need not care about this, but certain programs make extensive use of the feature (e.g. net-p2p/mldonkey). Be careful when copying sparse files with this method: the cp utility handles sparse files automatically, but tar does so only with the -S switch on the left (creating) side. If you forget it, sparse files in the source directory are written out in full in the destination directory and your disk usage can explode.
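As a rough illustration of the sparse-file caveat, the sketch below creates a sparse file in a scratch directory, copies it through the tar pipe with -S, and prints apparent size and disk usage for both copies. It assumes GNU tar and GNU coreutils (truncate, stat -c); the directory and file names are made up for the example.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar and GNU coreutils; all paths are scratch names.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/src1" "$work/dest"

# Create a 64 MiB file that consists entirely of a hole (no data blocks).
truncate -s 64M "$work/src1/sparse.img"

# Copy through the pipe; -S on the creating side records holes instead of zeros.
tar -C "$work" -Scf - src1 | tar -C "$work/dest" -xf -

# Apparent size in bytes must match on both sides.
src_size=$(stat -c %s "$work/src1/sparse.img")
dest_size=$(stat -c %s "$work/dest/src1/sparse.img")
echo "apparent size: source=$src_size dest=$dest_size"

# Disk usage in KiB; on a file system with sparse support both should stay small.
du -k "$work/src1/sparse.img" "$work/dest/src1/sparse.img"
```

Dropping the S from the left-hand tar and rerunning should show the destination file occupying the full 64 MiB on most file systems.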

Network with SCP

To transfer a single file (-C enables compression):

# Remote to local
scp -C remotebox:path/to/sourcefile .
# Local to remote
scp -C localfile remotebox:path/to/destination/

To transfer multiple files or directories, add -r:

# Remote to local
scp -rC remotebox:path/to/sourcedir .
# Local to remote
scp -Cr localdir remotebox:path/to/destination/

Network with SSH

The command differs depending on whether you copy from the local system to a remote one or vice versa:

# to the remote system
tar czv dir/files | ssh remote.box.com "tar xz -C /dir/"
# to the remote system with a cheaper cipher
# (blowfish was removed in OpenSSH 7.6; run "ssh -Q cipher" to list what your version supports)
tar czv ListOfFiles | ssh -c aes128-ctr remote.box.com tar xz -C /home/user/PathToCopy
# from remote to local
ssh remote.box.com tar cz -C BeginDirCopyFiles . | tar xz -C DirToCopy

Individual files can also be transferred with tar, but cat is suitable for this as well:

# local to remote
cat dir/file | ssh -C remote.box.com "cat > /dir/file"
# combined with gzip or bz2
cat dir/file | gzip | ssh -C remote.box.com "gunzip > /dir/file"
cat dir/file | bzip2 | ssh -C remote.box.com "bunzip2 > /dir/file"
# remote to local
ssh remote.box.com "cat /dir/file" > dir/file
# combined with gzip or bz2
ssh remote.box.com "cat /dir/file | gzip" | gunzip > dir/file
ssh remote.box.com "cat /dir/file | bzip2" | bunzip2 > dir/file

Network with netcat

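Transfers over netcat can have minuscule CPU needs, unlike transfers over ssh. However, the data is transmitted without encryption or authentication. The listen syntax below is for traditional netcat; the OpenBSD variant expects nc -l 2342 without -p.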

Destination box: nc -l -p 2342 | tar -C /target/dir -xzf -
Source box: tar -czf - /source/dir | nc Target_Box 2342

To reduce CPU use further, replace tar's z option with lzop, which compresses much faster but less effectively.

Destination box: nc -l -p 2342 | lzop -d | tar -C /target/dir -xf -
Source box: tar -cf - /source/dir | lzop | nc Target_Box 2342

Network with socat

Same idea as netcat.

Destination box: socat -u tcp4-listen:2342 - | tar x -C /target/dir
Source box: tar c /source/dir | socat -u - tcp4:Target_Box:2342

We can use a variety of compression methods in a general way:

Destination box: socat -u tcp4-listen:2342 - | ${UNZIP} | tar x -C /target/dir
Source box: tar c /source/dir | ${ZIP} | socat -u - tcp4:Target_Box:2342

We then define ZIP as one of the following:

   *  cat
   *  lzop
   *  gzip
   *  bzip2

and UNZIP as the matching entry:

   *  cat
   *  lzop -d
   *  gunzip
   *  bunzip2
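One way to keep ZIP and UNZIP in step is to set both from a single name. The sketch below is an assumption about how you might wire this up, not part of the original recipe: the function name set_codec is hypothetical, and a local round trip stands in for the two socat ends. Only cat and gzip are exercised, since lzop and bzip2 may not be installed.

```shell
#!/bin/sh
# Hypothetical helper: pick a matching ZIP/UNZIP pair by name.
set_codec() {
    case "$1" in
        none)  ZIP="cat";   UNZIP="cat" ;;
        lzop)  ZIP="lzop";  UNZIP="lzop -d" ;;
        gzip)  ZIP="gzip";  UNZIP="gunzip" ;;
        bzip2) ZIP="bzip2"; UNZIP="bunzip2" ;;
        *)     echo "unknown codec: $1" >&2; return 1 ;;
    esac
}

# Scratch tree standing in for /source/dir and /target/dir.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/source/dir" "$work/target"
echo "hello" > "$work/source/dir/file.txt"

for codec in none gzip; do
    set_codec "$codec"
    rm -rf "$work/target/dir"
    # ${ZIP} and ${UNZIP} stay unquoted on purpose so "lzop -d" splits into words.
    tar -C "$work/source" -cf - dir | ${ZIP} | ${UNZIP} | tar -C "$work/target" -xf -
    result=$(cat "$work/target/dir/file.txt")
    echo "$codec: $result"
done
```

On the real hosts, the same set_codec call would run on both boxes so that the two ends of the socat pipeline always agree.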

What's the difference?


Output size and elapsed time for each choice of ZIP on a kernel source tree:

time -p tar c /usr/src/linux-2.6.3 | ${ZIP} | cat > /dev/null
   *  cat: 182MB, 1.1sec
   *  lzop: 64MB, 4sec
   *  gzip --fast: 51MB, 9sec
   *  gzip: 41MB, 18sec
   *  gzip --best: 41MB, 49sec
   *  bzip2: 32MB, 134sec
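To reproduce this kind of comparison on your own data, a loop along these lines may help. It is a sketch under assumptions: the sample tree is generated on the fly (point SRC_PARENT and SRC_NAME at a real directory instead), the size is counted with wc -c rather than discarded, and only compressors found on the PATH are run. Timings use date +%s, so they are whole seconds.

```shell
#!/bin/sh
# Sketch: compare output size and elapsed time for several compressors.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in data; replace with a real tree such as a kernel source directory.
SRC_PARENT="$work"
SRC_NAME="sample"
mkdir "$SRC_PARENT/$SRC_NAME"
i=0
while [ "$i" -lt 200 ]; do
    echo "line $i of some fairly repetitive sample text" >> "$SRC_PARENT/$SRC_NAME/data.txt"
    i=$((i + 1))
done

results=""
for ZIP in cat lzop "gzip --fast" gzip "gzip --best" bzip2; do
    # Skip compressors that are not installed (test the first word only).
    command -v "${ZIP%% *}" >/dev/null 2>&1 || continue
    start=$(date +%s)
    # ${ZIP} is unquoted on purpose so "gzip --fast" splits into command and option.
    bytes=$(tar -C "$SRC_PARENT" -cf - "$SRC_NAME" | ${ZIP} | wc -c | tr -d ' ')
    end=$(date +%s)
    echo "${ZIP}: ${bytes} bytes, $((end - start)) sec"
    results="${results}[${ZIP}=${bytes}] "
done
```

On a tree this small every run finishes within a second; the differences in elapsed time only show up on something the size of a kernel tree.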

Note that these numbers matter only relative to your network and machine speed. If your processors or disks are really slow and your network is really fast, straight cat works best. If your processors and disks are really fast but your network is, say, dialup or a slow wireless connection, you are much better off with the smallest transfer (bzip2 or gzip). In reality you will be somewhere in the middle, but there is a reason kernel.org ships the kernel as .bz2.
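To make the trade-off concrete, the sketch below plugs the sizes and times listed above into a crude model: since compression and transfer run as a pipeline, total time is taken as roughly the larger of the compression time and size divided by bandwidth. The model, the function name estimate and the example bandwidths are assumptions for illustration; decompression and disk speed are ignored.

```shell
#!/bin/sh
# Rough estimate only: total = max(compress_time, size / bandwidth).
# Sizes (MB) and times (sec) are the measured figures from the list above.
estimate() {
    # $1 = bandwidth in MB/s; prints the method with the lowest estimated time.
    awk -v bw="$1" '
        BEGIN {
            n = split("cat:182:1.1 lzop:64:4 gzip-fast:51:9 gzip:41:18 gzip-best:41:49 bzip2:32:134", rows, " ")
            best = ""; best_t = -1
            for (i = 1; i <= n; i++) {
                split(rows[i], f, ":")
                transfer = f[2] / bw          # seconds on the wire
                t = f[3] + 0                  # seconds spent compressing
                total = (transfer > t) ? transfer : t
                if (best_t < 0 || total < best_t) { best = f[1]; best_t = total }
            }
            printf "%s %.1f\n", best, best_t
        }'
}

estimate 1000    # very fast link
estimate 12.5    # about 100 Mbit/s
estimate 1       # about 8 Mbit/s
estimate 0.005   # dialup-class link
```

With these figures the model picks cat on the fastest link, lzop at 12.5 MB/s, gzip at 1 MB/s and bzip2 on the dialup-class link, which matches the reasoning in the note.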
