cpio How-To & Quick Start

Reposted from: http://cybertiggyr.com/gene/cpio-howto/

Gene Michael Stover's

cpio How-To & Quick Start

created Tuesday, 19 March 2002
updated Monday, 14 October 2002

Introduction

cpio is an archive program, sort of like tar. It is commonly available on Unix & Unix-like systems, including Gnu/Linux.

This article is a quick introduction to using cpio.

Extract Files from an Archive

To extract files from an archive, use the -i (copy in) command line option for cpio. That will tell cpio to read an archive from stdin & to extract the files from it.

So, assuming the archive is compressed, do this:

bzcat dir.cpio.bz2 |cpio -i

(If you're confused or concerned about my use of bzip2, you might want to read my short section about bzip2 or gzip?, then come back here & continue reading this article.)

Create an Archive

cpio creates archives differently than tar. Where tar automatically recurses into subdirectories, cpio reads from stdin a list of files & directories to archive; it does not automatically recurse into directories.

To create an archive, give cpio the -o (copy out) command line option. cpio will read a list of files & directories from stdin, create the archive, & write the archive to stdout.

A good way to generate the list of files is the find program.

To archive everything in a directory, compress it with bzip2, & write the results to a file, do this:

find dir -print |cpio -o |bzip2 >dir.cpio.bz2

That's the generic way to create an archive. On a Gnu/Linux system, you might get a lot of ugly warnings about i-node numbers being truncated. The archive will be fine, but it's never good to have unnecessary warnings in the output; the eye-sore might prevent you from seeing important error messages. To prevent all those warnings, type this:

find dir -print |cpio -o -Hnewc |bzip2 >dir.cpio.bz2

A potential problem is that "-Hnewc" is not portable to all implementations of cpio. So either you must know when it's okay to use it or you must avoid using it & suffer with the gratuitous warning messages.

So far, we've created archives of all files in a directory tree. In other words, we've reproduced the functionality of tar but at the cost of more key strokes. Not very impressive. Since cpio reads a list of files from stdin, we can do a lot more.

If you want to create a distribution archive of your source code, leaving out object files (*.o), backup files (*~), and CVS & RCS directories, just take advantage of the features of find that you already know & love.

find dir \
-name "*.o" -o \
-name "*~" -o \
-name CVS -prune -o \
-name RCS -prune -o \
-print \
|cpio -o -Hnewc |bzip2 >dir.cpio.bz2

(I've broken the example into multiple lines for readability. You'd either type the command on a single command line, or you'd break it into multiple lines, as I've done, by including the back-slashes (\) literally.)

Need to back up just the files that have changed since your last backup yesterday? Trivial!

find dir -ctime -1 -print \
|cpio -o -Hnewc |bzip2 >dir.cpio.bz2

By using find to generate the list of files, you can make cpio archive any combination of files you want. It's easy to use find from your own shell scripts, too, or you could even use your own programs to generate the list of file names. cpio achieves great flexibility by leaving the file-selection responsibilities to another program.
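As an illustration of a list that comes from somewhere other than find, the sketch below archives only the files named in a manifest. Everything in it is hypothetical: the MANIFEST file and its one-name-per-line format with # comments, the list_release_files function, and the project layout. It also assumes that -Hnewc is available.

```shell
work=$(mktemp -d)
mkdir -p "$work/proj/src" "$work/proj/docs"
echo 'int main(void) { return 0; }' >"$work/proj/src/main.c"
echo 'scratch' >"$work/proj/src/notes.tmp"
echo 'manual' >"$work/proj/docs/manual.txt"

# Hypothetical manifest: one path per line; lines starting with # are comments.
cat >"$work/proj/MANIFEST" <<'EOF'
# files that belong in the release
src/main.c
docs/manual.txt
EOF

# Any program that prints one file name per line can drive cpio.
list_release_files() {
    grep -v '^#' MANIFEST
}

(cd "$work/proj" && list_release_files | cpio -o -Hnewc 2>/dev/null) >"$work/release.cpio"

contents=$(cpio -i -t <"$work/release.cpio" 2>/dev/null)
printf '%s\n' "$contents"
rm -rf "$work"
```

File names that contain newlines would break a line-oriented list like this one; some cpio implementations offer a null-terminated input mode for that case, but that is implementation-specific.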

Advanced Features

Some (most? all?) cpio implementations are able to access file systems & tapes through a cpio server on another host. A benefit there is that you can use cpio to archive files from one host but write the archive file to, say, the tape drive on another host. I've found this useful in cases where I needed to back up large amounts of data to a tape drive, but the tape drive was on a server that didn't have enough disk space to hold a temporary copy of the entire archive, so I had to go directly to tape.

To use this feature, use the -O (that's a capital O) command line option in conjunction with the user@host:pathname method of specifying the destination file. See "man cpio" for details.

Similarly, you can use the -I command line option to extract files from tape archives mounted on servers.

As cool as it sounds, this feature has some draw-backs. System-specific command line options & device-file names are often necessary. For example, you might have to force special block sizes with -B or --block-size, or you might have to use system-specific device file names, such as /dev/st/n0a1bf00a or something similarly incomprehensible. Also, systems sometimes behave as though the communication between the client (your cpio process) & the server is treated as text, so non-text characters & end-of-lines get mangled. In other words, it sometimes just doesn't work.

In those cases, I've often made it work by using rsh and dd explicitly. In other words:

find . -print |cpio -o -Hnewc \
|rsh server dd bs=32k of=/dev/st0

(The values for block size (bs) & output file (of) are system-specific, of course, & might differ for you.)

Comparison with tar

I don't mean this article to persuade people to use cpio instead of tar. tar is fine; I mean mostly to help people learn to use cpio if they are faced with such an archive (probably because that's what I usually give to people unless they instruct me differently). Nevertheless, I can't help but do some comparisons.

The main advantage cpio has over tar is that it's easier to archive only some of the files in a directory. That benefit comes to us because cpio reads a list of files to archive instead of assuming it should recurse into directories & archive all files. Modern implementations of tar have similar features, but they are not as flexible as the file-selection features of find or of your own program. What's more, you have to learn the file-selection language of tar, whereas you already know the file-selection language of find, & that knowledge can be applied to any file-selection task that's appropriate for find. In other words, you must know find anyway, so why not re-use that knowledge with your archiver (cpio) instead of learning a less capable, less general archiver-specific system?

cpio archives are usually a little smaller than tar files; in the test below, the difference is under one percent.

bash-2.04$ for D in phil skeleton camano tigris; do
> (cd /space/gene-1/src; find $D -print |cpio -o -Hnewc |bzip2 -9) >$D.cpio.bz2
> (cd /space/gene-1/src; tar cf - $D |bzip2 -9) >$D.tar.bz2
> done
10903 blocks
1783 blocks
13996 blocks
504 blocks
bash-2.04$ ls -l
total 3332
-rw-rw---- 1 gene gene 861535 Mar 19 18:26 camano.cpio.bz2
-rw-rw---- 1 gene gene 866475 Mar 19 18:27 camano.tar.bz2
-rw-rw---- 1 gene gene 663206 Mar 19 18:26 phil.cpio.bz2
-rw-rw---- 1 gene gene 662529 Mar 19 18:26 phil.tar.bz2
-rw-rw---- 1 gene gene 110668 Mar 19 18:26 skeleton.cpio.bz2
-rw-rw---- 1 gene gene 111623 Mar 19 18:26 skeleton.tar.bz2
-rw-rw---- 1 gene gene 46374 Mar 19 18:27 tigris.cpio.bz2
-rw-rw---- 1 gene gene 46633 Mar 19 18:27 tigris.tar.bz2

You can see that the cpio archives are smaller in three of the four cases, but here's a table to show the relative sizes. The right-most column shows the size of the cpio archive relative to the tar archive. Values below 1 indicate that the cpio archive was smaller.


base name    cpio (bytes)    tar (bytes)    relative size
camano             861535         866475            0.994
phil               663206         662529            1.001
skeleton           110668         111623            0.991
tigris              46374          46633            0.994
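The relative sizes can be recomputed from the byte counts in the ls -l listing. A minimal sketch using awk follows; the variable names sizes & ratios are chosen here purely for illustration.

```shell
# Byte counts copied from the ls -l listing: base name, cpio size, tar size.
sizes='camano 861535 866475
phil 663206 662529
skeleton 110668 111623
tigris 46374 46633'

# relative size = cpio bytes / tar bytes, rounded to three decimal places.
ratios=$(printf '%s\n' "$sizes" | awk '{ printf "%s %.3f\n", $1, $2 / $3 }')
printf '%s\n' "$ratios"
```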

A disadvantage with cpio, compared to tar, is that you must type more characters to use it. Even the simplest case, recursively archiving a directory tree using the default archive format, requires more typing. Observe the differences between these two command lines:

find . -print |cpio -o >../archive.cpio

 

tar cf ../archive.tar .

History, Portability, & Tips

"cpio" stands for "copy in, copy out". The copy part comes from "cp", which is the Unix copy program.

cpio comes to us from AT&T from the early 1980s, if not earlier. It is not used often; tar has that honor, but I found cpio because I was forced to exchange files between two systems that had incompatible versions of tar. The systems' administrator was unwilling to update the tar implementations, so I had to find an alternative. cpio worked just fine, & since then, I have not found a Unix or Unix-like system that had a cpio that could not work with some other Unix's cpio. In other words, cpio archives appear to be very portable.

To achieve that portability when you create archives, always use -Hnewc or the default archive format (no -H option at all) unless specific experience shows that another -H value is required.

Ignore the pass-through (-p) function of cpio. Use find instead.

On MS-DOS (including Windows), where pipes are treated as text, always use the -O or -I command line option to specify the output archive or the input archive. That's a real bummer, but life sucks. (More specifically, MS-DOS (which includes Windows) is naïve.)

Many modern implementations of cpio (and tar) are able to read or write all manner of archive file formats. Don't use these; they are not portable. Use cpio for archives in the cpio format. Use tar for archives in the tar format.

Similarly, some modern implementations allow you to instruct cpio to run your compression program on the archive. Don't do this; it is not portable. Instead, run the compression program separately & explicitly. On Unix, use a pipe to connect cpio & the compressor.

bzip2 or gzip?

I prefer bzip2, so I've used it in my examples. gzip would work just as well. The two programs even share most of the important command line options. So you could substitute gzip wherever you see bzip2, & you could substitute gzcat or zcat wherever you see bzcat.

End.

Copyright © 2002 by Gene Michael Stover. Permission to copy, store, & view this document unmodified & in its entirety is granted. All other rights are reserved.

$Header: /home/gene/library/website/docsrc/cpio-howto/RCS/index.html,v 395.1 2008/04/20 17:25:55 gene Exp $
