pdcp(1) - Linux man page



Name

pdcp - copy files to groups of hosts in parallel
rpdcp - (reverse pdcp) copy files from a group of hosts in parallel

Synopsis

pdcp [options]... src [src2...] dest
rpdcp [options]... src [src2...] dir

Description

pdcp is a variant of the rcp(1) command. Unlike rcp(1), which copies files to a single remote host, pdcp can copy files to multiple remote hosts in parallel. However, pdcp does not recognize files in the format "rname@rhost:path", so all source files must be on the local host. Destination nodes must be listed on the pdcp command line using a suitable target nodelist option (see the OPTIONS section below). Each destination node listed must have pdcp installed for the copy to succeed.

When pdcp receives SIGINT (ctrl-C), it lists the status of current threads. A second SIGINT within one second terminates the program. Pending threads may be canceled by issuing ctrl-Z within one second of ctrl-C. Pending threads are those that have not yet been initiated, or are still in the process of connecting to the remote host.

Like pdsh(1), the functionality of pdcp may be supplemented by dynamically loadable modules. In pdcp, the modules may provide a new connect protocol (replacing the standard rsh(1) protocol), filtering options (e.g. excluding hosts that are down), and/or host selection options (e.g. -a selects all nodes from a local config file). By default, pdcp requires at least one "rcmd" module to be loaded (to provide the channel for remote copy).

Reverse Pdcp

rpdcp performs a reverse parallel copy. Rather than copying files to remote hosts, files are retrieved from remote hosts and stored locally. All directories or files retrieved will be stored with their remote hostname appended to the filename. The destination must be a directory when rpdcp is used.

In other respects, rpdcp is exactly like pdcp, and further statements regarding pdcp in this manual also apply to rpdcp.
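
To make the naming rule concrete, here is a minimal sketch that predicts where retrieved files would land. It assumes the host name is appended after a dot (for example hosts.foo1), which the text above does not spell out. The helper rpdcp_local_name, the hosts foo1 to foo3, and the directory /tmp/collected are hypothetical.

```shell
# Hypothetical helper: predict the local name of a file fetched by a command
# such as
#     rpdcp -w 'foo[1-3]' /etc/hosts /tmp/collected
# ASSUMPTION: the remote host name is appended after a dot.
rpdcp_local_name() {
    # $1 = local destination directory, $2 = remote path, $3 = remote host
    printf '%s/%s.%s\n' "${1%/}" "${2##*/}" "$3"
}

for host in foo1 foo2 foo3; do
    rpdcp_local_name /tmp/collected /etc/hosts "$host"
done
```

Because every copy carries its host name, files with the same remote path do not overwrite each other in the destination directory.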

Rcmd Modules

The method by which pdcp connects to remote hosts may be selected at runtime using the -R option (see OPTIONS below). This functionality is ultimately implemented via dynamically loadable modules, so the list of available options may differ from installation to installation. A list of currently available rcmd modules is printed when using any of the -h, -V, or -L options. The default rcmd module will also be displayed with the -h and -V options.

A list of rcmd modules currently distributed with pdcp follows.

rsh
Uses an internal, thread-safe implementation of BSD rcmd(3) to run commands using the standard rsh(1) protocol.
ssh
Uses a variant of popen(3) to run multiple copies of the ssh(1) command.
mrsh
This module uses the mrsh(1) protocol to execute jobs on remote hosts. The mrsh protocol uses credential-based authentication, forgoing the need to allocate reserved ports. In other respects, it acts just like rsh.
krb4
The krb4 module allows users to execute remote commands after authenticating with Kerberos. The remote rshd daemons must be kerberized.
xcpu
The xcpu module uses the xcpu service to execute remote commands.

Options

The list of available pdcp options is determined at runtime by supplementing the list of standard pdcp options with any options provided by loaded rcmd and misc modules. In some cases, options provided by modules may conflict with each other. In these cases, the modules are incompatible and the first module loaded wins.

Standard target nodelist options

-w  TARGETS,...
Target and/or filter the specified list of hosts. Do not use with any other node selection options (e.g. -a or -g, if they are available). No spaces are allowed in the comma-separated list. Arguments in the TARGETS list may include normal host names, a range of hosts in hostlist format (see HOSTLIST EXPRESSIONS), or a single '-' character to read the list of hosts on stdin.

If a host or hostlist is preceded by a '-' character, those hosts are explicitly excluded. If the argument is preceded by a single '^' character, it is taken to be the path to a file containing a list of hosts, one per line. If the item begins with a '/' character, it is taken as a regular expression on which to filter the list of hosts (a regex argument may optionally be followed by another '/', e.g. /node.*/). A regex or file name argument may also be preceded by a minus '-' to exclude instead of include those hosts.

A list of hosts may also be preceded by "user@" to specify a remote username other than the default, or "rcmd_type:" to specify an alternate rcmd connection type for these hosts. When used together, the rcmd type must be specified first, e.g. "ssh:user1@host0" would use ssh to connect to host0 as user "user1".

-x host,host,...
Exclude the specified hosts. May be specified in conjunction with other target node list options such as -a and -g (when available). Hostlists may also be specified to the -x option (see the HOSTLIST EXPRESSIONS section below). Arguments to -x may also be preceded by the filename ('^') and regex ('/') characters as described above, in which case the resulting hosts are excluded as if they had been given to -w and preceded with the minus '-' character.
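
The prefix rules for a single -w item can be summarized with a small shell sketch. This is an illustration of the rules as described above, not pdcp's actual parser; the function describe_target and all host and file names are hypothetical, and the sketch handles one item at a time rather than a full comma-separated list.

```shell
# Hypothetical helper: explain how one -w item would be interpreted.
# The body runs in a subshell, so its variables do not leak to the caller.
describe_target() (
    arg=$1
    mode=include
    case $arg in
        -)   printf '%s\n' 'read the host list from stdin'; exit 0 ;;
        -?*) mode=exclude; arg=${arg#-} ;;        # leading '-' means exclude
    esac
    case $arg in
        '^'*)                                     # '^' introduces a host file
            printf '%s\n' "$mode hosts listed in file ${arg#?}" ;;
        /*)                                       # '/' introduces a regex
            re=${arg#/}
            printf '%s\n' "$mode hosts matching regex ${re%/}" ;;
        *)
            rcmd=default user=default
            # The rcmd type comes first, then the optional user name.
            case $arg in *:*) rcmd=${arg%%:*}; arg=${arg#*:} ;; esac
            case $arg in *@*) user=${arg%%@*}; arg=${arg#*@} ;; esac
            printf '%s\n' "$mode $arg (rcmd: $rcmd, user: $user)" ;;
    esac
)

describe_target 'ssh:user1@host0'
describe_target '^/tmp/hosts.list'
describe_target '-/node.*/'
```

Passing an item to -x has the same effect as passing it to -w with a leading '-', so the exclude branch covers both spellings.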

Standard pdcp options

-h
Output usage menu and quit. A list of available rcmd modules will be printed at the end of the usage message.
-q
List option values and the target nodelist and exit without action.
-b
Disable ctrl-C status feature so that a single ctrl-C kills parallel copy. (Batch Mode)
-r
Copy directories recursively.
-p
Preserve modification time and modes.
-e PATH
Explicitly specify the path to the remote pdcp binary instead of using the locally executed path. Can also be set via the environment variable PDSH_REMOTE_PDCP_PATH.
-l user
This option may be used to copy files as another user, subject to authorization. For BSD rcmd, this means the invoking user and system must be listed in the user's .rhosts file (even for root).
-t seconds
Set the connect timeout. Default is 10 seconds.
-f number
Set the maximum number of simultaneous remote copies to number. The default is 32.
-R name
Set rcmd module to name. This option may also be set via the PDSH_RCMD_TYPE environment variable. A list of available rcmd modules may be obtained via either the -h or -L options.
-M name,...
When multiple misc modules provide the same options to pdsh, the first module initialized "wins" and subsequent modules are not loaded. The -M option allows a list of modules to be specified that will be force-initialized before all others, in effect ensuring that they load without conflict (unless they conflict with each other). This option may also be set via the PDSH_MISC_MODULES environment variable.
-L
List info on all loaded pdcp modules and quit.
-d
Include more complete thread status when SIGINT is received, and display connect and command time statistics on stderr when done.
-V
Output pdcp version information, along with a list of currently loaded modules, and exit.
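
As a rough illustration of how these options combine, the sketch below selects the ssh rcmd module through the environment and then uses -q to preview the settings without copying anything. The host range foo[01-05] and the values chosen for -f and -t are arbitrary examples, and the guard around pdcp exists only so the snippet is harmless on a machine without pdcp.

```shell
# Select the ssh rcmd module for every pdcp call made from this shell.
PDSH_RCMD_TYPE=ssh
export PDSH_RCMD_TYPE

if command -v pdcp >/dev/null 2>&1; then
    # -q lists option values and the target nodelist, then exits without
    # copying. -f 8 limits simultaneous copies; -t 5 sets the connect timeout.
    pdcp -q -f 8 -t 5 -w 'foo[01-05]' /etc/hosts /etc
else
    echo 'pdcp is not installed here; nothing was run' >&2
fi
```

Dropping -q from the command would perform the copy with the same settings.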

Hostlist Expressions

As noted in sections above, pdcp accepts ranges of hostnames in the general form: prefix[n-m,l-k,...], where n < m and l < k, etc., as an alternative to explicit lists of hosts. This form should not be confused with regular expression character classes (also denoted by "[]"). For example, foo[19] does not represent foo1 or foo9, but rather represents a degenerate range: foo19.

This range syntax is meant only as a convenience on clusters with a prefixNN naming convention and specification of ranges should not be considered necessary -- the list foo1,foo9 could be specified as such, or by the range foo[1,9].

Some examples of range usage follow:

Copy /etc/hosts to foo01,foo02,...,foo05
    pdcp -w foo[01-05] /etc/hosts /etc
Copy /etc/hosts to foo7,foo9,foo10
    pdcp -w foo[7,9-10] /etc/hosts /etc
Copy /etc/hosts to foo0,foo4,foo5
    pdcp -w foo[0-5] -x foo[1-3] /etc/hosts /etc

As a reminder to the reader, some shells will interpret brackets ('[' and ']') for pattern matching. Depending on your shell, it may be necessary to enclose ranged lists within quotes. For example, in tcsh, the first example above should be executed as:

    pdcp -w "foo[01-05]" /etc/hosts /etc
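
The range semantics can also be illustrated with a small POSIX shell function that expands one expression into host names. This is a simplified sketch, not pdcp's own hostlist code: the function expand_hostlist is hypothetical, it handles a single bracketed list per name, and it assumes zero padding follows the width of the lower bound, as in the foo[01-05] example.

```shell
# Hypothetical helper: expand one "prefix[...]suffix" hostlist expression.
# The body runs in a subshell, so IFS and 'set -f' changes stay local.
expand_hostlist() (
    expr=$1
    case $expr in
        *\[*\]*) ;;                               # has a bracketed range list
        *) printf '%s\n' "$expr"; exit 0 ;;       # plain name; leave subshell
    esac
    prefix=${expr%%\[*}                           # text before '['
    rest=${expr#*\[}
    body=${rest%%\]*}                             # text between '[' and ']'
    suffix=${rest#*\]}                            # text after ']'
    IFS=,
    set -f                                        # no globbing while splitting
    for part in $body; do
        case $part in
            *-*)
                lo=${part%-*}
                hi=${part#*-}
                width=${#lo}                      # keep padding: 01 -> width 2
                # Strip leading zeros so 08 is not read as an octal number.
                i=${lo#"${lo%%[!0]*}"};   i=${i:-0}
                end=${hi#"${hi%%[!0]*}"}; end=${end:-0}
                while [ "$i" -le "$end" ]; do
                    printf "%s%0${width}d%s\n" "$prefix" "$i" "$suffix"
                    i=$((i + 1))
                done ;;
            *)
                printf '%s\n' "$prefix$part$suffix" ;;
        esac
    done
)

expand_hostlist 'foo[01-05]'     # foo01 through foo05, one per line
expand_hostlist 'foo[7,9-10]'    # foo7, foo9, foo10
expand_hostlist 'foo[19]'        # foo19 only, not foo1 and foo9
```

The arguments are single-quoted for the reason given above: an unquoted bracket expression may be treated as a shell pattern.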

Origin

Pdsh/pdcp was originally a rewrite of IBM dsh(1) by Jim Garlick <garlick@llnl.gov> on LLNL's ASCI Blue-Pacific IBM SP system. It is now also used on Linux clusters at LLNL.

Limitations

When using ssh for remote execution, expect the stderr of ssh to be folded in with that of the remote command. When invoked by pdcp, it is not possible for ssh to prompt for confirmation if a host key changes, prompt for passwords if RSA keys are not configured properly, etc. Finally, the connect timeout is only adjustable with ssh when the underlying ssh implementation supports it, and pdsh has been built to use the correct option.

See Also

pdsh(1)