Guide to Linux Archive Utility Mastery
I just happened to find this one while looking for cpio information to install 10g on an HP-UX box.
In the end, I used the following commands to extract the installation source:
cpio -idvmc < ship_rel10_hp64_db_Disk1.cpio
cpio -idvmc < ship_rel10_hp64_db_Disk2.cpio
-------------------------------------for your information--------------
Guide to Linux Archive Utility Mastery
by Sheryl Calish
An introduction to the effective use of the tar, cpio, and rpm facilities for archiving and restoring files
Whether you are a seasoned application developer, a veteran system administrator, or a nascent Linux newbie, the Linux archive utilities have powerful features that provide advantageous information and functionality to a Linux user. Even in the conceivable but inadvisable case that you don't back up your files, you may still encounter one or more of these facilities. For example, if you download an application like Oracle 10g or OpenOffice, you can uncomprehendingly follow the accompanying instructions to uncompress and install a package. While many, including yours truly, have followed this practice without major catastrophe, if you're reading this you probably prefer to have a greater understanding of the commands you enter.
As of this writing, the three archive facilities you are most likely to encounter on Linux are GNU tar, GNU cpio, and rpm (Red Hat Package Manager). "Tar" is an abbreviation of "tape archiver"; it was originally written for backup onto magnetic tape. Cpio derives its name from "copy input and output" and is similar to tar. First developed by Red Hat and released to the open source community, rpm is a specialized archive facility designed for packaging application software.
By way of introduction to these facilities, this article will focus on archiving files on a single-user system. This pretty much implies backing up your /home directory and perhaps some configuration files in the /etc directory, the directories that change regularly and are most difficult to replace if you run into problems. Although it is possible to run a system or data file backup with a facility like tar, neither of these procedures will be covered here, except to mention that, if you are using Oracle Cluster File System (OCFS) for backup, you will need to go to oss.oracle.com and download the latest tool to enable use of tar for backup of your Oracle database files. If you use a third-party tool for database backup, you may still need to do this because some third-party database backup programs use tar.
Working with tar
Archive utilities, such as tar and cpio, are known for their ability to preserve associated file information: directory structure, file contents, ownership and mode (permission) settings. (See my previous article, "Guide to Linux File Command Mastery," for an explanation of file access permissions.) This lets you store and recreate a file system exactly as it was when you archived it.
For user-controlled backups or single-user systems, tar seems to be the backup facility of choice. Its basic command syntax is:
tar mandatory_operation [options] nameoftarfile.tar file(s)_to_archive
A mandatory_operation is one of the eight "Function Letters" listed on the tar manpage. Exactly one of these operations must be specified first when you invoke tar. The most common of these operations are --create (-c), --list (-t), and --extract (-x).
Two commonly used "options" are --verbose (-v), which prints a list of files as tar processes them, and --file (-f), which specifies the name of the archive file. Although they are not mandatory, they are extremely important for eliminating confusion.
There are three acceptable formats for tar options and operations: the short, mnemonic, and old formats. The short format uses single letters, as follows:
$ tar -cvf Documents.tar Documents
Documents/
Documents/zz/
Documents/zz/new_file.out
Documents/samplesql/
Documents/samplesql/mksample8.sql
Documents/samplesql/oe8_cre.sql
Documents/samplesql/oe8_drop.sql
Documents/samplesql/hr8_cre.sql
The mnemonic format uses long names, such as
$ tar --create --verbose --file Documents.tar Documents
to do the same thing. The old format is similar to the short one but does not use the preceding dash:
$ tar cvf Documents.tar Documents
Each of the above commands performs the same two tasks:
- They create a tar file for the directory Documents, which contains two subdirectories
- They print the name of each file as it is added to the archive, Documents.tar
Formats can be intermixed, as with
$ tar cv --file Documents.tar Documents
for example. The order of options in a tar command matters, because -f takes whatever follows it as the name of the archive. The following command will produce an archive named "v" rather than archive.tar:
$ tar -cfv archive.tar Documents
Using the mnemonic format can alleviate some of this confusion.
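The effect of option order is easy to check in a scratch directory. The following is a minimal sketch, assuming GNU tar and a POSIX shell; the directory and file names are made up for illustration. It builds the same archive with the three option formats, compares their listings, and then runs the misordered -cfv form.

```shell
# Scratch area; every name below is illustrative.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p Documents/samplesql
echo "select sysdate from dual;" > Documents/samplesql/getdate.sql

# Short, mnemonic, and old formats should produce equivalent archives.
tar -cf short.tar Documents
tar --create --file long.tar Documents
tar cf old.tar Documents
tar -tf short.tar > short.list
tar -tf long.tar > long.list
tar -tf old.tar > old.list
cmp short.list long.list && cmp short.list old.list && echo "listings match"

# Misordered cluster: -f swallows "v" as the archive name, and archive.tar
# is then treated as a file to add. It does not exist, so tar complains.
tar -cfv archive.tar Documents 2>/dev/null || echo "tar reported an error"
ls v
```

After this runs, the scratch directory should contain a file named v and no archive.tar, which is the confusion the mnemonic format avoids.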
There are no requirements for naming archives, but by convention tar files are named with a *.tar extension. Gzipped archives, discussed later, are usually named *.tar.gz or *.tgz.
Peeking Inside Your Archive
Once you have created a tar file, you can peek inside it using the -t, --list option:
$ tar -tf Documents.tar
which will output a listing similar to the one produced when you run tar -cv. It is also a good idea to get a listing of a tar file you have downloaded before you extract it, to make sure that the file names do not begin with a "/", indicating absolute pathnames.
You can look for individual files with tar -t.
$ tar -tf Documents.tar --wildcards 'Documents/samplesql/mk*.sql'
Documents/samplesql/mksample8.sql
(Recent versions of GNU tar match patterns when listing only if --wildcards is given, and quoting the pattern keeps the shell from expanding it first.) This approach also works for directories.
$ tar -tf Documents.tar Documents/samplesql
Documents/samplesql/
Documents/samplesql/mksample8.sql
Documents/samplesql/oe8_cre.sql
Documents/samplesql/oe8_drop.sql
...
Using the -v option with --list produces a long file listing of your tar components.
$ tar -tvf Documents.tar Documents/samplesql
Listing tar contents is useful in finding the exact name of a single file you want to extract. You can also see that tar automatically retains the modification date and other file information.
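The listing checks described above can be combined into a short pre-extraction routine. This is a sketch under stated assumptions: GNU tar (for --wildcards), a POSIX shell, and invented file names under a temporary directory.

```shell
# Build a small sample archive; all names are illustrative.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p Documents/samplesql Documents/zz
echo "-- sample" > Documents/samplesql/mksample8.sql
echo "-- create" > Documents/samplesql/oe8_cre.sql
echo "scratch" > Documents/zz/new_file.out
tar -cf Documents.tar Documents

# Quoted pattern plus --wildcards: tar, not the shell, does the matching.
matches=$(tar -tf Documents.tar --wildcards 'Documents/samplesql/mk*.sql')
echo "$matches"

# Check for absolute member names before extracting.
if tar -tf Documents.tar | grep -q '^/'; then
  echo "warning: archive contains absolute pathnames"
else
  echo "all member names are relative"
fi
```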
To find the differences between an existing tar file and the file system, invoke tar with the -d, --diff option.
$ tar -dvf Documents.tar
Documents/samplesql/
Documents/samplesql/mksample8.sql
tar: Documents/samplesql/oe8_views.sql: Warning: Cannot stat: No such file or directory
Documents/samplesql/hr8_analz.sql: Mod time differs
Documents/samplesql/hr8_analz.sql: Size differs
Documents/samplesql/getdate.sql
tar: Documents/samplesql/getdate.sql: Warning: Cannot stat: No such file or directory
...
In this case, tar is reporting that one file in our archive, hr8_analz.sql, differs from the file system version, and that two files, oe8_views.sql and getdate.sql, are in our tar archive but not in our file system.
Tar will ignore files in the directory that have been created since the archive was last created. However, updating your archive is pretty straightforward, as you will see in the next section.
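The three cases (a changed file, a deleted file, and a file created after the archive) can be reproduced in a scratch directory. The sketch below assumes GNU tar, since --diff is not available in every tar implementation; the file names reuse the ones above purely for illustration.

```shell
# Build an archive, then change the file system underneath it.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p Documents/samplesql
echo "v1" > Documents/samplesql/hr8_analz.sql
echo "select sysdate from dual;" > Documents/samplesql/getdate.sql
tar -cf Documents.tar Documents

echo "version two, noticeably longer" > Documents/samplesql/hr8_analz.sql  # size changes
rm Documents/samplesql/getdate.sql                                         # now only in the archive
echo "new" > Documents/samplesql/new_file.sql                              # created after archiving

# tar exits non-zero when it finds differences, so guard the call.
tar -df Documents.tar > diff.out 2>&1 || echo "differences found"
cat diff.out
```

The report should mention hr8_analz.sql and getdate.sql but say nothing about new_file.sql, because --diff only walks the members already in the archive.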
Adding Files to an Archive
You can add a new file to an existing archive with the -r, --append option.
$ tar -rvf Documents.tar Documents/samplesql
This causes the entire directory to be appended to the archive. A single file can be appended to an archive with:
$ tar -rvf Documents.tar Documents/samplesql/getdate.sql
Documents/samplesql/getdate.sql
Due to tar's origins as a tool for archiving to tape, updating your archive with --append or --update will simply add modified files to the existing archive without removing the old files. The -N, --newer option can be used to create incremental archives of new and modified files. Ultimately, recreating an archive on a regular basis may be the easiest way to maintain an up-to-date archive.
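The "add without removing" behavior is visible in a listing. Here is a small sketch, assuming GNU tar and using invented names in a temporary directory, that appends a modified file and then counts how many copies the archive holds.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p Documents/samplesql
echo "v1" > Documents/samplesql/getdate.sql
tar -cf Documents.tar Documents

# Modify the file and append it; the original member stays in the archive.
echo "v2" > Documents/samplesql/getdate.sql
tar -rf Documents.tar Documents/samplesql/getdate.sql

copies=$(tar -tf Documents.tar | grep -c 'getdate.sql$')
echo "copies of getdate.sql in archive: $copies"

# On extraction the members are written in order, so the later copy wins.
mkdir restore
tar -xf Documents.tar -C restore
cat restore/Documents/samplesql/getdate.sql
```

Because both versions are stored, an archive that is appended to repeatedly keeps growing, which is one reason recreating it periodically can be simpler.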
Backing Up Lots of Data
For a large amount of data, you can either spread tar output across other media, such as tapes, CD-ROMs, or floppy disks, using the -M (--multi-volume) option, or you can use one of the compression facilities available with tar. However, you cannot use both together; doing so will render your archive useless.
Traditional UNIX versions of tar do not support zipping, although archived files could, and still can, be piped to a compression facility. With GNU tar, compression can be specified with one of several options when tar is invoked. Tar supports three compression alternatives:
- compress/uncompress (-Z, --compress/--uncompress)
- gzip/gunzip (-z, --gzip/--gunzip)
- bzip2/bunzip2 (-j, --bzip2/--bunzip2).
Of these, gzip is a common choice for several reasons:
- It combines UNIX tar and compress commands
- Its use does not subject you to possible patent violations
- It is considered more efficient than compress/uncompress.
Once an archive is created with a compression filter:
$ tar -czvf Documents.tar.gz Documents
you must run it through the filter in all further references to that archive. To list the contents of the archive:
$ tar -tzvf Documents.tar.gz
to get a --diff on the archive,
$ tar -dzvf Documents.tar.gz
or, to --extract an archive, as we will discuss next.
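To get a feel for what the gzip filter buys you, the following sketch archives the same directory with and without -z and compares the sizes. It assumes GNU tar with gzip installed; the generated SQL file is only filler chosen because repetitive text compresses well.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir Documents

# Repetitive text compresses well, which makes the size difference obvious.
i=0
while [ "$i" -lt 2000 ]; do
  echo "select * from employees where employee_id = $i;"
  i=$((i + 1))
done > Documents/big.sql

tar -cf Documents.tar Documents
tar -czf Documents.tar.gz Documents

plain_size=$(wc -c < Documents.tar)
gz_size=$(wc -c < Documents.tar.gz)
echo "plain: $plain_size bytes, gzipped: $gz_size bytes"

# Later operations on the compressed archive name the z filter as well.
tar -tzf Documents.tar.gz
```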
Extracting a tar Archive
You can extract whole directories or individual files by running tar with the --extract (-x) operation.
$ tar -xvf Samplesql.tar getdate.sql
or
$ tar -xvf Documents.tar Documents/samplesql
These extractions create files in the working directory. If you are working with a gzipped archive, remember to specify z when you run your extraction.
$ tar -xzvf articles.tar.gz
$ tar -xzvf articles.tar.gz --wildcards '*.doc'
(The pattern is quoted so that the shell does not expand it, and recent versions of GNU tar need --wildcards to match it against member names.) One caveat: Verify which directory you are in when you execute an extract. You will need to change to the target directory (cd) or specify it with the -C option.
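The -C option makes the target directory explicit, so the result does not depend on where you happen to be. The sketch below, with invented names and assuming GNU tar, extracts a single member into a separate restore directory.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p Documents/samplesql restore
echo "select sysdate from dual;" > Documents/samplesql/getdate.sql
echo "-- sample" > Documents/samplesql/mksample8.sql
tar -cf Documents.tar Documents

# Extract one member into restore/ instead of the working directory.
tar -xf Documents.tar -C restore Documents/samplesql/getdate.sql
find restore -type f
```

Only getdate.sql should appear under restore/, with its leading directories recreated.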
Working with Cpio
Cpio is tar's predecessor in the UNIX world. Like tar, it archives files to hard disk, floppy, CD-ROM, or tape. It is more versatile than tar in the types of files it handles. The GNU version of cpio copies files into or out of a cpio or tar archive. It recognizes and handles special formats, such as HPUX binary, old ASCII, new ASCII, and a few others. It is also useful for moving an entire directory tree. To be compatible with older cpio programs, cpio stores its archive files in binary format by default.
The general format of the command is:
cpio -mode[other_options] [redirection_symbol] filename
It takes an explicit list of files from standard input, so it is typically used at the end of a pipe that begins with ls or find.
There are three basic modes in which you can use cpio:
- The copy-out mode, used with the -o option, copies files to an archive
- The copy-in mode, used with the -i option, extracts files from an archive
- The copy-pass mode, used with the -p option, passes files from one directory tree to another.
Copy-out Mode
Unlike tar, cpio needs explicit instructions: which files to archive (read from standard input), where to redirect the archive, whether the associated file information should be preserved, and so on. Copy-out mode can archive the contents of a directory with
$ ls | cpio -ov > samplesql.cpio
where ls writes the file names to standard output for cpio to copy out to the archive. The -o, --create option directs cpio to build an archive from the names it reads, and the -v, --verbose option provides a listing similar to that of tar.
You can use the find command to send files to cpio as well.
$ find . -depth -print | cpio -ov > Documents.cpio
To minimize issues with permissions on directories, use the -depth option of find. This option processes the directory contents before the directory itself, allowing the contents of a directory without owner write privileges to be restored before the directory's permissions are restored.
The -t, --list facility is also available with cpio and can even be run on a tar archive.
$ cpio -tv < Samplesql.tar
Copy-in Mode
Use copy-in mode to extract the contents of an archive. In this mode, cpio automatically recognizes which kind of archive it is reading. This means it can read archives created on machines with a different byte order.
$ cpio -idv < ../samplesql.cpio
Furthermore, cpio differs from tar in that it will not restore a file's original modification time unless you specify the -m, --preserve-modification-time option:
$ cpio -idvm < samplesql.cpio
As already mentioned, cpio can extract from a tar archive as well.
$ cpio -idv < Samplesql.tar
Although cpio doesn't work on zipped files, you can unzip your files before sending them to cpio via a pipe.
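A round trip through a gzipped cpio archive might look like the sketch below. It assumes cpio and gzip are installed (the script skips itself otherwise) and uses made-up names in a temporary directory; the archive name samplesql.cpio.gz is only an example.

```shell
if command -v cpio >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  cd "$workdir"
  mkdir -p samplesql restore
  echo "select sysdate from dual;" > samplesql/getdate.sql

  # Copy-out: find supplies the file list, cpio writes the archive to
  # standard output, and gzip compresses the stream.
  find samplesql -depth -print | cpio -o | gzip > samplesql.cpio.gz

  # Copy-in: decompress to standard output and let cpio read from the pipe.
  # -i extracts, -d creates directories as needed, -m keeps modification times.
  (cd restore && gunzip -c ../samplesql.cpio.gz | cpio -idm)

  cmp samplesql/getdate.sql restore/samplesql/getdate.sql && echo "round trip OK"
else
  echo "cpio is not installed; skipping"
fi
```

cpio reports the number of blocks copied on standard error, so a line such as "1 block" in the output is expected.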
Copy-pass Mode
This mode is a combination of the copy-out and copy-in modes. The major difference is that it bypasses an archive. It can move entire directories from one place to another and is used as follows:
$ find . -depth -print0 | cpio --null -pvda testdir
Note the use of the -print0 option with find instead of -print, as we used in copy-out mode. With -print0, find terminates each file name with a null character rather than a newline, and the --null option tells cpio to read its file list in that form. This approach allows cpio to handle filenames that contain a newline character. The -d, --make-directories option directs cpio to create the directory testdir.
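The null-terminated hand-off can be tried on a file name that contains spaces. This sketch assumes GNU cpio and GNU find (for --null and -print0) and skips itself if cpio is missing; the source and destination directories are invented, and the destination is deliberately kept outside the source tree so that find does not descend into it.

```shell
if command -v cpio >/dev/null 2>&1; then
  workdir=$(mktemp -d)
  mkdir -p "$workdir/src/samplesql" "$workdir/testdir"
  printf 'select 1 from dual;\n' > "$workdir/src/samplesql/file with spaces.sql"
  cd "$workdir/src"

  # -p selects copy-pass mode, -d makes directories as needed, and
  # -m preserves modification times; --null pairs with find's -print0.
  find . -depth -print0 | cpio --null -pdm "$workdir/testdir"

  find "$workdir/testdir" -type f
else
  echo "cpio is not installed; skipping"
fi
```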
Tar can accomplish the same thing. However, in this example the directory testdir must be created first.
$ tar -cvf - samplesql | (cd testdir; tar -xf -)
Here, tar sends the archive to standard output, denoted by the "-" in the first tar command. The second tar command gets the archive from standard input, denoted by the "-" in the second tar command. When the files are passed they are copied to testdir, preserving user, permission, and date information.
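A slightly more defensive variant of the tar pipe is sketched below, with invented names in a temporary directory. It uses && instead of ; so that the second tar does not run in the wrong place if cd fails, and it adds -p to ask tar to keep the stored permissions; both are choices made for this example rather than part of the command above.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p samplesql testdir
echo "select sysdate from dual;" > samplesql/getdate.sql

# The first tar writes the archive to standard output ("-"); the second
# reads it from standard input and unpacks it inside testdir.
tar -cf - samplesql | (cd testdir && tar -xpf -)

ls -l testdir/samplesql
```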
Rpm
Both Red Hat and SuSE Linux distributions use rpm to install your Linux operating system and application software. It can be used to:
- Install all application files with a single command
- Manage your installed packages
- Update a package
- Uninstall a package with a single command
- Build a software package from source code into source and binary form.
Use the -q, --query option to obtain information on rpm packages. Combine -q with the -a option to get a list of all rpm packages installed on your system.
$ rpm -qa | more
or, for a more ordered result,
$ rpm -qa | sort | grep k | more
If you installed a package with cpio, tar, Oracle Universal Installer, or any other installation program, it will not be in this list. You can find the version of any rpm-installed package with:
$ rpm -q glibc
glibc-2.2.5-177
Rpm -qi with a package name provides more information on a specific package:
$ rpm -qi orarun
This command produces a detailed description of the package. To find a list of all files installed with a package, use:
$ rpm -ql orarun
/etc/init.d/oracle
/etc/profile.d/oracle.csh
...
Before you go overboard spring-cleaning your files, however, it might be a good idea to find out which package uses a particular file.
$ rpm -qf /usr/sbin/rcoracle
orarun-1.3-0
Installing, Updating, and Uninstalling with rpm
The -i option directs rpm to install a package.
$ rpm -ivh orarun-1.3-0.rpm
Combined with the -v and -h options, the -i option causes rpm to display "#" symbols as the package is being installed. This lets you know the installation is not hung.
Uninstall an rpm package with the -e option:
$ rpm -e orarun
Sometimes when you uninstall or install a package, you will receive an error message concerning one or more missing dependency packages. To view the list of dependencies for a particular package, use the command:
$ rpm -qR orarun
/bin/sh
There are cases where the dependency is reciprocal, so that you appear to be caught in a Catch-22 situation. The solution lies with the use of the --nodeps option.
$ rpm -e --nodeps netscape
As you might guess, this command could cause problems if used indiscriminately. Sometimes, however, it is the only way out. The same holds true for the --force option, which will install a package, in spite of conflicts, by overwriting current files.
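Before reaching for --nodeps, it can help to see which installed packages depend on the one you are about to remove. The helper below is a hypothetical sketch: the name what_needs is invented here, and it relies only on rpm's --whatrequires query. It reports and returns non-zero when rpm or the package is missing.

```shell
# Hypothetical helper: list installed packages that require the named package.
what_needs() {
  pkg=$1
  if ! command -v rpm >/dev/null 2>&1; then
    echo "rpm is not installed on this system"
    return 1
  fi
  if ! rpm -q "$pkg" >/dev/null 2>&1; then
    echo "$pkg is not installed"
    return 1
  fi
  # Prints the dependent packages, or a message if nothing requires it.
  rpm -q --whatrequires "$pkg"
}

what_needs orarun || true
```

If the list of dependents is non-empty, removing the package with --nodeps will leave those packages with an unsatisfied dependency.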
Obtaining Archival Quality
While there is some overlap in their functionality, the archive facilities discussed here do have special capabilities. They also have their champions and critics. Both rpm and tar are commonly used for distributing software packages. In addition to its distribution functions, tar is frequently used as a backup utility. For many, however, cpio's simplicity makes it a favorite backup tool. There is also a facility called rpm2cpio, which converts an rpm package into a cpio package, enabling the extraction of one or two files from an rpm package. See "Maximum RPM: Taking the Red Hat Package Manager to the Limit" by Edward C. Bailey for more information.
It is possible to use a combination of tar, cpio, and rpm or other installer to install different software on your system. Most systems probably use an assortment of installation methods for various application programs. However, there is an increased danger of accidental overwriting with this approach. While cpio will refuse to overwrite a newer existing file unless explicitly told to (with the -u, --unconditional option), tar will overwrite existing files without complaint.
Which, then, is the best facility for archiving your files? As with most decisions in information technology, there is no simple answer. Rather, the best solution depends on your requirements, preferences, and policies.
Sheryl Calish (firstname.lastname@example.org) is an Oracle developer specializing in Linux for Blue Heron Consulting. She is also Funding Chair for the Central Florida Oracle Users Group and Marketing Chair for the IOUG Linux SIG.