Guide to Linux Archive Utility Mastery

Original post, 2004-08-08 15:08:00

I happened to come across this article while looking for cpio information needed to install 10g on an HP-UX box.

In the end, I used the following commands to unpack the installation source:

cpio -idvmc < ship_rel10_hp64_db_Disk1.cpio

cpio -idvmc < ship_rel10_hp64_db_Disk2.cpio


-------------------------------------for your information--------------

Guide to Linux Archive Utility Mastery
by Sheryl Calish

An introduction to the effective use of the tar, cpio, and rpm facilities for archiving and restoring files

Whether you are a seasoned application developer, a veteran system administrator, or a nascent Linux newbie, the Linux archive utilities have powerful features that provide advantageous information and functionality to a Linux user. Even in the conceivable but inadvisable case that you don't back up your files, you may still encounter one or more of these facilities. For example, if you download an application like Oracle 10g or OpenOffice, you can uncomprehendingly follow the accompanying instructions to uncompress and install a package. While many, including yours truly, have followed this practice without major catastrophe, if you're reading this you probably prefer to have a greater understanding of the commands you enter.

As of this writing, the three archive facilities you are most likely to encounter on Linux are GNU tar, GNU cpio, and rpm (Red Hat Package Manager). "Tar" is an abbreviation of "tape archiver"; it was originally written for backup onto magnetic tape. Cpio derives its name from "copy input and output" and is similar to tar. First developed by Red Hat and released to the open source community, rpm is a specialized archive facility designed for packaging application software.

By way of introduction to these facilities, this article will focus on archiving files on a single-user system. This pretty much implies backing up your /home directory and perhaps some configuration files in the /etc directory, the directories that change regularly and are most difficult to replace if you run into problems. Although it is possible to run a system or data file backup with a facility like tar, neither of these procedures will be covered here, except to mention that, if you are using Oracle Cluster File System (OCFS) for backup, you will need to go to and download the latest tool to enable use of tar for backup of your Oracle database files. If you use a third-party tool for database backup, you may still need to do this because some third-party database backup programs use tar.

Working with tar

Archive utilities, such as tar and cpio, are known for their ability to preserve associated file information: directory structure, file contents, ownership and mode (permission) settings. (See my previous article, "Guide to Linux File Command Mastery," for an explanation of file access permissions.) This lets you store and recreate a file system exactly as it was when you archived it.

For user-controlled backups or single-user systems, tar seems to be the backup facility of choice. Its basic command syntax is:

tar mandatory_operation [options] nameoftarfile.tar file(s)_to_archive
A mandatory_operation is one of the eight "Function Letters" listed on the tar manpage. Exactly one of these operations must be specified first when you invoke tar. The most common of these operations are --create (-c), --list (-t), and --extract (-x).

Two commonly used "options" are --verbose (-v), which prints a list of files as tar processes them, and --file (-f), which specifies the name of the archive file. Although they are not mandatory, they are extremely important for eliminating confusion.

There are three acceptable formats for tar options and operations: the short, mnemonic, and old formats. The short format uses single letters, as follows:

$ tar -cvf Documents.tar Documents
The mnemonic format uses long names such as
$ tar --create --verbose --file Documents.tar Documents
to do the same thing. The old format is similar to the short one but does not use the preceding dash:
$ tar cvf Documents.tar Documents
Each of the above commands performs the same two tasks:
  • They create a tar file for the directory Documents, which contains two subdirectories
  • They print the name of each file as it is added to the archive, Documents.tar
One important and often confusing concept is that tar creates the archive at the path given with -f, which, for a bare file name, is the working directory in which tar is invoked, not the directory of the files to be archived. The -C, --directory=DIR option changes the directory from which the files to be archived are read.
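The point about the working directory can be tried in a scratch area. The following is a minimal sketch, assuming GNU tar; the directory layout under the mktemp directory and the file note.txt are hypothetical names used only for illustration.

```shell
# Hypothetical scratch layout, for illustration only.
work=$(mktemp -d)
mkdir -p "$work/src/Documents" "$work/out"
echo "sample" > "$work/src/Documents/note.txt"

cd "$work/out"
# -C switches to ../src before reading Documents; the archive itself
# is still written to the working directory ($work/out).
tar -cf Documents.tar -C ../src Documents

archive_location=$(ls "$work/out")
member_list=$(tar -tf Documents.tar)
echo "$archive_location"
echo "$member_list"
```

Because -C only affects where the files are read from, the member names stay relative (Documents/...), which keeps the archive easy to restore elsewhere.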

Formats can be intermixed, as with

$ tar cv --file Documents.tar Documents
for example. The order of options in a tar command matters, because -f takes whatever follows it as the archive name. The following command will therefore produce an archive named "v":
$ tar -cfv archive.tar Documents
Using the mnemonic format can alleviate some of this confusion.
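As a small sketch of the ordering rule, assuming GNU tar and a hypothetical Documents directory created in a scratch area, the two invocations below keep the archive name unambiguous: in the short form -f comes last in the cluster, and in the long form the name is attached directly to --file.

```shell
work=$(mktemp -d)
cd "$work"
mkdir Documents
echo "sample" > Documents/note.txt

# Short format: -f is the last letter, so the next word is the archive name.
tar -cvf archive.tar Documents > /dev/null

# Mnemonic format: the archive name is bound to --file explicitly.
tar --create --file=archive_long.tar Documents

short_list=$(tar -tf archive.tar)
long_list=$(tar -tf archive_long.tar)
echo "$short_list"
```

Both archives list the same members, and no stray file named "v" is left behind.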

There are no requirements for naming archives, but by convention tar files are named with a *.tar extension. Gzipped archives, discussed later, are usually named *.tar.gz or *.tgz.

Peeking Inside Your Archive

Once you have created a tar file, you can peek inside it using the -t, --list operation:
$ tar -tf Documents.tar
which will output a listing similar to the one produced when you run tar -cv. It is also a good idea to get a listing of a tar file you have downloaded before you extract it, to make sure that the files do not begin with a "/", indicating absolute pathnames.
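The check for absolute pathnames can be scripted. This is only a sketch, assuming GNU tar and grep; has_absolute_members is a hypothetical helper name, not part of tar.

```shell
# has_absolute_members ARCHIVE
# Hypothetical helper: succeeds if any member name starts with "/".
has_absolute_members() {
    tar -tf "$1" | grep -q '^/'
}

work=$(mktemp -d)
cd "$work"
mkdir Documents
echo "sample" > Documents/note.txt
tar -cf Documents.tar Documents

if has_absolute_members Documents.tar; then
    verdict="absolute paths found"
else
    verdict="relative paths only"
fi
echo "$verdict"
```

Running a check like this before extracting a downloaded archive shows whether its members would be written relative to the working directory.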

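You can look for individual files with tar -t. With current versions of GNU tar, a pattern must be enabled with --wildcards and quoted so that the shell does not expand it first.

$ tar -tf Documents.tar --wildcards 'Documents/samplesql/mk*.sql'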
This approach also works for directories.
$ tar -tf Documents.tar Documents/samplesql
Using the -v option with --list produces a long file listing of your tar components.
$ tar -tvf Documents.tar Documents/samplesql
Listing tar contents is useful in finding the exact name of a single file you want to extract. You can also see that tar automatically retains the modification date and other file information.

To find the differences between an existing tar file and the file system, invoke tar with the -d, --diff option.

$ tar -dvf Documents.tar
tar: Documents/samplesql/oe8_views.sql: Warning: Cannot stat: No such file or directory
Documents/samplesql/hr8_analz.sql: Mod time differs
Documents/samplesql/hr8_analz.sql: Size differs
tar: Documents/samplesql/getdate.sql: Warning: Cannot stat: No such file or directory
In this case, tar is reporting that we have one file in our archive, hr8_analz.sql, that differs from the file system version, and two files, oe8_views.sql and getdate.sql, in our tar archive but not in our file system.

The --diff operation ignores files in the directory that were created after the archive was made. However, updating your archive is pretty straightforward, as you will see in the next section.
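The behavior of --diff can be reproduced in a scratch directory. The sketch below assumes GNU tar with English (C locale) messages; the file names are borrowed from the listing above, but their contents here are made up.

```shell
work=$(mktemp -d)
cd "$work"
mkdir Documents
echo "v1" > Documents/hr8_analz.sql
tar -cf Documents.tar Documents

# A file created after the archive was made is not reported.
echo "new" > Documents/new.sql
new_file_status=0
LC_ALL=C tar -df Documents.tar || new_file_status=$?

# A changed file is reported, and tar exits with a nonzero status.
echo "a longer second version" > Documents/hr8_analz.sql
diff_status=0
diff_report=$(LC_ALL=C tar -df Documents.tar 2>&1) || diff_status=$?
echo "$diff_report"
```

The exit status makes the comparison usable in scripts, for example to decide whether an archive needs to be recreated.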

Adding Files to an Archive

You can add a new file to an existing archive with the -r, --append option.
$ tar -rvf Documents.tar Documents/samplesql
This causes the entire directory to be appended to the archive. A single file can be appended to an archive with:
$ tar -rvf Documents.tar Documents/samplesql/getdate.sql
Due to tar's origins as a tool for archiving to tape, updating your archive with --append or --update will simply add modified files to the existing archive without removing the old files. The -N, --newer option can be used to create incremental archives of new and modified files. Ultimately, recreating an archive on a regular basis may be the easiest way to maintain an up-to-date archive.
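To see why appending does not replace older copies, here is a short sketch, assuming GNU tar; the file contents "v1" and "v2" and the restore directory are illustrative only.

```shell
work=$(mktemp -d)
cd "$work"
mkdir Documents
echo "v1" > Documents/getdate.sql
tar -cf Documents.tar Documents

# Modify the file and append it: the old copy stays in the archive.
echo "v2" > Documents/getdate.sql
tar -rf Documents.tar Documents/getdate.sql

copies=$(tar -tf Documents.tar | grep -c 'getdate.sql$')
echo "$copies"

# Extraction replays members in order, so the last copy wins.
mkdir restore
tar -xf Documents.tar -C restore
restored=$(cat restore/Documents/getdate.sql)
echo "$restored"
```

The archive grows with every append, which is one reason recreating it periodically can be simpler.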

Backing Up Lots of Data

For a large amount of data, you can either send tar output to other media, such as tapes, CD-ROMs, or floppy disks, using the -M (--multi-volume) option or you can use one of the compression facilities available with tar. However, you cannot use both together; doing so will render your archive useless.

Traditional UNIX versions of tar do not support zipping, although archived files could, and still can, be piped to a compression facility. With GNU tar, compression can be specified with one of several options when tar is invoked. Tar supports three compression alternatives:

  • compress/uncompress (-Z, --compress/--uncompress)
  • gzip/gunzip (-z, --gzip/--gunzip)
  • bzip2/bunzip2 (-j, --bzip2/--bunzip2).
According to the GNU Project, compress is an older, proprietary compression utility found in commercial UNIX distributions and is available on Linux for the sake of compatibility. Gzip has been available in the GNU version of tar since early 1997; it can unzip files treated with compress and is considered a better choice because of the following:
  • It combines UNIX tar and compress commands
  • Its use does not subject you to possible patent violations
  • It is considered more efficient than compress/uncompress.
Bzip2/bunzip2 is an alternative utility that provides even more efficient, albeit slower, compression than gzip.
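A rough way to see the effect of compression is to archive the same directory with and without the gzip filter. This sketch assumes GNU tar and gzip are installed; the generated SQL text is artificial and far more repetitive than real data, so the ratio it shows is not representative.

```shell
work=$(mktemp -d)
cd "$work"
mkdir Documents
# Highly repetitive text compresses well; real data will vary.
yes "select sysdate from dual;" | head -n 5000 > Documents/getdate.sql

tar -cf Documents.tar Documents
tar -czf Documents.tar.gz Documents

plain_size=$(wc -c < Documents.tar)
gzip_size=$(wc -c < Documents.tar.gz)
echo "plain: $plain_size bytes, gzip: $gzip_size bytes"

# Listing the compressed archive also goes through the filter.
gz_list=$(tar -tzf Documents.tar.gz)
```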

Once an archive is created with a compression filter:

$ tar -czvf Documents.tar.gz Documents
you must run it through the same filter in all further references to that archive. To list the contents of the archive:
$ tar -tzvf Documents.tar.gz
To get a --diff on the archive:
$ tar -dzvf Documents.tar.gz
The same applies when you --extract an archive, as we will discuss next.

Extracting a tar Archive

You can extract whole directories or individual files by running tar with the --extract (-x) operation.

$ tar -xvf Samplesql.tar getdate.sql
$ tar -xvf Documents.tar Documents/samplesql
These extractions create files in the working directory. If you are working with a gzipped archive, remember to specify z when you run your extraction.
$ tar -xzvf articles.tar.gz

$ tar -xzvf articles.tar.gz --wildcards '*.doc'
One caveat: Verify which directory you are in when you execute an extract. You will need to change to the target directory (cd) or specify it with the -C option.
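The caveat about the target directory can be sketched as follows, assuming GNU tar and gzip; the articles and target directories and their files are hypothetical. The pattern includes the directory part so that it matches regardless of how wildcards treat "/".

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p articles target
echo "draft" > articles/one.doc
echo "notes" > articles/two.txt
tar -czf articles.tar.gz articles

# Extract only the .doc members into ./target without changing directory.
# The pattern is quoted so the shell passes it to tar unexpanded.
tar -xzf articles.tar.gz -C target --wildcards 'articles/*.doc'

extracted=$(cd target && find . -type f | sort)
echo "$extracted"
```

Using -C keeps the extraction independent of wherever the command happens to be run.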

Working with Cpio

Cpio is tar's predecessor in the UNIX world. Like tar, it archives files to hard disk, floppy, CD-ROM, or tape. It is more versatile than tar in the types of files it handles. The GNU version of cpio copies files into or out of a cpio or tar archive. It recognizes and handles special formats, such as HPUX binary, old ASCII, new ASCII, and a few others. It is also useful for moving an entire directory tree. To be compatible with older cpio programs, cpio by default stores its archive files in binary format.

The general format of the command is:
cpio -mode[other_options] [redirection_symbol] filename
It takes an explicit list of files from standard input, so it is typically used at the end of a pipe that begins with ls or find.

There are three basic modes in which you can use cpio:

  • The copy-out mode, used with the -o option, copies files to an archive
  • The copy-in mode, used with the -i option, extracts files from an archive
  • The copy-pass mode, used with the -p option, passes files from one directory tree to another.
Different cpio options are allowable depending on the mode you are running. A full listing of options allowed in each mode is found in the "Synopsis" section of the cpio manpage.

Copy-out Mode

Unlike tar, cpio needs explicit instructions: which files to archive (supplied on standard input), where to redirect the archive, whether the associated file information should be preserved, and so on. Copy-out mode can archive the contents of a directory with
$ ls | cpio -ov > samplesql.cpio
where ls writes the list of files to standard output, which cpio reads to build the archive. The -o, --create option directs cpio to archive the listed files and the -v, --verbose option provides a listing similar to that of tar.

You can use the find command to send files to cpio as well.

$ find . -depth -print | cpio -ov > Documents.cpio
To minimize issues with permissions on directories, use the -depth option of find. This option processes the directory contents before the directory itself, allowing the contents of a directory without owner write privileges to be restored before the directory's permissions are restored.
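The effect of -depth on ordering is easy to inspect before anything is archived. The sketch below assumes find, and uses cpio only if it is installed; the Documents tree and filelist.txt are hypothetical names.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p Documents/samplesql
echo "select 1 from dual;" > Documents/samplesql/getdate.sql

# -depth lists directory contents before the directory itself.
find Documents -depth -print > filelist.txt
first_entry=$(head -n 1 filelist.txt)
last_entry=$(tail -n 1 filelist.txt)
echo "first: $first_entry, last: $last_entry"

# Feed the same list to cpio in copy-out mode, if cpio is available.
cpio_listing=""
if command -v cpio >/dev/null 2>&1; then
    cpio -o < filelist.txt > Documents.cpio 2>/dev/null
    cpio_listing=$(cpio -t < Documents.cpio 2>/dev/null)
fi
```

Saving the list to a file first makes it possible to review exactly what cpio will be asked to archive.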

The -t, --list facility is also available with cpio and can even be run on a tar archive.
$ cpio -tv < Samplesql.tar

Copy-in Mode

Use copy-in mode to extract the contents of an archive. In this mode, cpio automatically recognizes which kind of archive it is reading. This means it can read archives created on machines with a different byte order.

$ cpio -idv < ../samplesql.cpio
Furthermore, cpio differs from tar in that it will not restore a file's original modification time unless you specify the -m, --preserve-modification-time option:
$ cpio -idvm < samplesql.cpio
As already mentioned, cpio can extract from a tar archive as well.
$ cpio -idv < Samplesql.tar
Although cpio doesn't work on zipped files, you can unzip your files before sending them to cpio via a pipe.
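Unzipping into a pipe might look like the following sketch. It assumes gzip and gunzip are installed and runs the cpio steps only if cpio is available; the samplesql and restore directories are hypothetical.

```shell
work=$(mktemp -d)
cd "$work"
mkdir samplesql restore
echo "select sysdate from dual;" > samplesql/getdate.sql

restored=""
if command -v cpio >/dev/null 2>&1; then
    # Build a gzipped cpio archive by piping copy-out mode through gzip.
    find samplesql -depth -print | cpio -o 2>/dev/null | gzip > samplesql.cpio.gz

    # Decompress to standard output and let cpio read the stream.
    (cd restore && gunzip -c ../samplesql.cpio.gz | cpio -idm 2>/dev/null)
    restored=$(cat restore/samplesql/getdate.sql)
fi
echo "$restored"
```

The compressed file is never unpacked to disk as an intermediate .cpio file; the pipe carries it straight into copy-in mode.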

Copy-pass Mode

This mode is a combination of the copy-out and copy-in modes. The major difference is that it bypasses an archive. It can move entire directories from one place to another and is used as follows:

$ find . -depth -print0 | cpio --null -pvda testdir
Note the use of the -print0 option with find instead of -print, which we used in copy-out mode. Because the --null option tells cpio to read a list of filenames terminated by a null character rather than a newline, find must produce the list with -print0. This approach allows cpio to handle filenames that contain a newline character. The -d, --make-directories option directs cpio to create the directory testdir.

Tar can accomplish the same thing. However, in this example the directory testdir must be created first.

$ tar -cvf - samplesql | (cd testdir; tar -xf -)
Here, tar sends the archive to standard output, denoted by the "-" in the first tar command. The second tar command gets the archive from standard input, denoted by the "-" in the second tar command. When the files are passed they are copied to testdir, preserving user, permission, and date information.
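The tar pipe can be checked with a small round trip. This is a sketch assuming GNU tar; the mode 600 is chosen only so that the result does not depend on the current umask, and "cd testdir &&" is used instead of "cd testdir;" so that the second tar does not run in the wrong place if the directory is missing.

```shell
work=$(mktemp -d)
cd "$work"
mkdir -p samplesql testdir
echo "select 1 from dual;" > samplesql/getdate.sql
chmod 600 samplesql/getdate.sql

# First tar writes the archive to standard output; second reads it from standard input.
tar -cf - samplesql | (cd testdir && tar -xf -)

# Compare the permission strings of the original and the copy.
src_mode=$(ls -l samplesql/getdate.sql | cut -c1-10)
dst_mode=$(ls -l testdir/samplesql/getdate.sql | cut -c1-10)
echo "$src_mode $dst_mode"
```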


Working with rpm

Both Red Hat and SuSE Linux distributions use rpm to install your Linux operating system and application software. It can be used to:
  • Install all application files with a single command
  • Manage your installed packages
  • Update a package
  • Uninstall a package with a single command
  • Build a software package from source code into source and binary form.
We'll discuss all but the last function here.

Querying rpm

Use the -q, --query option to obtain information on rpm packages. Combine -q with the -a option to get a list of all rpm packages installed on your system.

$ rpm -qa | more
or, for a sorted listing limited to package names that contain the letter k,
$ rpm -qa | sort | grep k | more
If you installed a package with cpio, tar, Oracle Universal Installer, or any other installation program, it will not be in this list. You can find the version of any rpm installed package with:
$ rpm -q glibc
Rpm -qi with a package name provides more information on a specific package:
$ rpm -qi orarun
This command produces a detailed description of the package. To find a list of all files installed with a package, use:
$ rpm -ql orarun
Before you go overboard spring-cleaning your files, however, it might be a good idea to find out which package uses a particular file.
$ rpm -qf /usr/sbin/rcoracle
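The ownership query can be wrapped so that it also behaves sensibly on systems without rpm. This is only a sketch; describe_owner is a hypothetical helper name, and /bin/sh is used as an example path that may or may not be owned by an rpm package on a given system.

```shell
# describe_owner PATH
# Hypothetical helper: print the owning package, or explain why none is shown.
describe_owner() {
    if ! command -v rpm >/dev/null 2>&1; then
        echo "rpm not installed"
        return 0
    fi
    if pkg=$(rpm -qf "$1" 2>/dev/null); then
        echo "$pkg"
    else
        echo "not owned by any rpm package"
    fi
}

owner=$(describe_owner /bin/sh)
echo "$owner"
```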

Installing, Updating, and Uninstalling with rpm

The -i option directs rpm to install a package.

$ rpm -ivh orarun-1.3-0.rpm
Combined with the -h option, the -i option causes rpm to display "#" symbols as the package is being installed. This lets you know the installation is not hung.


The -V, --verify option, run on its own against an installed package (for example, rpm -V orarun), checks for installation problems. If you try to install a newer version of a package that is already installed, you will get an error message. In this case, you should use the -U option to update a package.

Uninstall an rpm package with the -e option:
$ rpm -e orarun
Sometimes when you uninstall or install a package, you will receive an error message concerning one or more missing dependency packages. To view the list of dependencies for a particular package, use the command:
$ rpm -qR orarun
There are cases where the dependency is reciprocal, so that you appear to be caught in a Catch-22 situation. The solution lies with the use of the --nodeps option.
$ rpm -e --nodeps netscape
As you might guess, this command could cause problems if used indiscriminately. Sometimes, however, it is the only way out. The same holds true for the --force option, which will install a package, in spite of conflicts, by overwriting current files.

Obtaining Archival Quality

While there is some overlap in their functionality, the archive facilities discussed here do have special capabilities. They also have their champions and critics.

Both rpm and tar are commonly used for distributing software packages. In addition to its distribution functions, tar is frequently used as a backup utility. For many, however, cpio's simplicity makes it a favorite backup tool. There is also a facility called rpm2cpio, which converts an rpm package into a cpio package, enabling the extraction of one or two files from an rpm package. See "Maximum RPM: Taking the Red Hat Package Manager to the Limit" by Edward C. Bailey for more information.

It is possible to use a combination of tar, cpio, and rpm or other installer to install different software on your system. Most systems probably use an assortment of installation methods for various application programs. However, there is an increased danger of accidental overwriting with this approach. While cpio will complain rather than overwrite a file unless explicitly told to overwrite it, tar will happily overwrite without asking.

Which, then, is the best facility for archiving your files? As with most decisions in information technology, there is no simple answer. Rather, the best solution depends on your requirements, preferences, and policies.

Sheryl Calish is an Oracle developer specializing in Linux for Blue Heron Consulting. She is also Funding Chair for the Central Florida Oracle Users Group and Marketing Chair for the IOUG Linux SIG.
