Original source: http://cm.bell-labs.com/7thEdMan/vol2/sed
Posted: 2006-06-27; modified: 2006-07-04 20:57
the null character at the end of a line.
4) The characters `\n' match an imbedded newline character, but
not the newline at the end of the pattern space.
5) A period `.' matches any character except the terminal newline
of the pattern space.
6) A regular expression followed by an asterisk `*' matches any
number (including 0) of adjacent occurrences of the
regular expression it follows.
7) A string of characters in square brackets `[ ]' matches any
character in the string, and no others. If, however, the
first character of the string is circumflex `^', the regular
expression matches any character except the characters in
the string and the terminal newline of the pattern space.
8) A concatenation of regular expressions is a regular expression
which matches the concatenation of strings matched by
the components of the regular expression.
9) A regular expression between the sequences `\(' and `\)' is
identical in effect to the unadorned regular expression,
but has side-effects which are described under the s
command below and specification 10) immediately below.
10) The expression `\d' means the same string of characters
matched by an expression enclosed in `\(' and `\)' earlier
in the same pattern. Here d is a single digit; the string
specified is that beginning with the dth occurrence of `\('
counting from the left. For example, the expression
`^\(.*\)\1' matches a line beginning with two repeated
occurrences of the same string.
11) The null regular expression standing alone (e.g., `//') is
equivalent to the last regular expression compiled.
To use one of the special characters (^ $ . * [ ] \ /) as a
literal (to match an occurrence of itself in the input),
precede the special character by a backslash `\'.
For a context address to `match' the input requires that the
whole pattern within the address match some portion of the
pattern space.
2.3. Number of Addresses
The commands in the next section can have 0, 1, or 2
addresses. Under each command the maximum number of allowed
addresses is given. For a command to have more addresses
than the maximum allowed is considered an error.
If a command has no addresses, it is applied to every line
in the input.
If a command has one address, it is applied to all lines
which match that address.
If a command has two addresses, it is applied to the first
line which matches the first address, and to all subsequent
lines until (and including) the first subsequent line which
matches the second address. Then an attempt is made on
subsequent lines to again match the first address, and the
process is repeated.
Two addresses are separated by a comma.
Examples:
/an/            matches lines 1, 3, 4 in our sample text
/an.*an/        matches line 1
/^an/           matches no lines
/./             matches all lines
/\./            matches line 5
/r*an/          matches lines 1, 3, 4 (number = zero!)
/\(an\).*\1/    matches line 1
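The address examples above can be checked with a short shell sketch. It assumes a POSIX shell and a present-day sed (GNU or BSD) rather than the Seventh Edition binary; the variable `sample` and the helper `lines_matching` are names made up for this illustration, and `-n` is assumed to be the flag that turns on the nocopy option.

```shell
# The five-line sample text used by the examples in this document.
sample='In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.'

# lines_matching ADDRESS: list the numbers of the lines the address selects.
# (Hypothetical helper: "=" writes the line number, -n suppresses the normal copy.)
lines_matching() {
    printf '%s\n' "$sample" | sed -n "$1=" | paste -s -d ' ' -
}

for addr in '/an/' '/an.*an/' '/^an/' '/./' '/\./' '/r*an/' '/\(an\).*\1/'; do
    printf '%-16s -> %s\n' "$addr" "$(lines_matching "$addr")"
done

# Two addresses separated by a comma select a range: from the first line
# containing "dome" through the next line containing "man".
range_out=$(printf '%s\n' "$sample" | sed -n '/dome/,/man/p')
printf '%s\n' "$range_out"
```

The range command prints lines 2 through 4 of the sample text.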
3. FUNCTIONS
All functions are named by a single character. In the
following summary, the maximum number of allowable addresses
is given enclosed in parentheses, then the single character
function name, possible arguments enclosed in angles (< >),
an expanded English translation of the single-character
name, and finally a description of what each function does.
The angles around the arguments are not part of the
argument, and should not be typed in actual editing
commands.
3.1. Whole-line Oriented Functions
(2)d -- delete lines The d function deletes from the
file (does not write to the output) all those
lines matched by its address(es). It also
has the side effect that no further commands
are attempted on the corpse of a deleted line;
as soon as the d function is executed, a new line
is read from the input, and the list of
editing commands is re-started from the
beginning on the new line.
(2)n -- next line The n function reads the next line
from the input, replacing the current line.
The current line is written to the output if it
should be. The list of editing commands is
continued following the n command.
(1)a\
<text> -- append lines
The a function causes the argument <text> to be
written to the output after the line matched by
its address. The a command is inherently
multi-line; a must appear at the end of a line,
and <text> may contain any number of lines. To
preserve the one-command-to-a-line fiction, the
interior newlines must be hidden by a backslash
character (`\') immediately preceding the newline.
The <text> argument is terminated by the first
unhidden newline (the first one not immediately
preceded by backslash). Once an a function is
successfully executed, <text> will be written to
the output regardless of what later commands do to
the line which triggered it. The triggering line
may be deleted entirely; <text> will still be
written to the output. The <text> is not scanned
for address matches, and no editing commands are
attempted on it. It does not cause any change in
the line-number counter.
(1)i\
<text> -- insert lines
The i function behaves identically to the a
function, except that <text> is written to the
output before the matched line. All other
comments about the a function apply to the i
function as well.
(2)c\
<text> -- change lines
The c function deletes the lines selected by its
address(es), and replaces them with the lines in
<text>. Like a and i, c must be followed by a
newline hidden by a backslash, and interior newlines
in <text> must be hidden by backslashes.
The c command may have two addresses, and
therefore select a range of lines. If it does,
all the lines in the range are deleted, but only
one copy of <text> is written to the output, not
one copy per line deleted. As with a and i,
<text> is not scanned for address matches, and no
editing commands are attempted on it. It does not
change the line-number counter. After a line has
been deleted by a c function, no further commands
are attempted on the corpse. If text is appended
after a line by a or r functions, and the line is
subsequently changed, the text inserted by the c
function will be placed before the text of the a
or r functions. (The r function is described in
Section 3.4.)
Note: Within the text put in the output by these functions,
leading blanks and tabs will disappear, as always in sed
commands. To get leading blanks and tabs into the output,
precede the first desired blank or tab by a backslash; the
backslash will not appear in the output.
Example:
The list of editing commands:
n
a\
XXXX
d
applied to our standard input, produces:
In Xanadu did Kubla Khan
XXXX
Where Alph, the sacred river, ran
XXXX
Down to a sunless sea.
In this particular case, the same effect would be produced
by either of the two following command lists:
n                 n
i\                c\
XXXX              XXXX
d
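The three command lists above can be compared directly. The following is a sketch under assumptions: a POSIX shell, a present-day sed (GNU or BSD), and illustrative variable names (`sample`, `script_a`, and so on) that do not come from the original text.

```shell
sample='In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.'

# Append XXXX after every second line, then delete that line.
script_a='n
a\
XXXX
d'

# Insert XXXX before every second line, then delete that line.
script_i='n
i\
XXXX
d'

# Change every second line to XXXX (c deletes the line itself).
script_c='n
c\
XXXX'

out_a=$(printf '%s\n' "$sample" | sed -e "$script_a")
out_i=$(printf '%s\n' "$sample" | sed -e "$script_i")
out_c=$(printf '%s\n' "$sample" | sed -e "$script_c")

printf '%s\n' "$out_a"
```

On this input all three scripts write the same five lines: lines 1, 3, and 5 of the sample text, with XXXX standing in for lines 2 and 4.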
3.2. Substitute Function
One very important function changes parts of lines selected
by a context search within the line.
(2)s<pattern><replacement><flags> -- substitute The s
function replaces part of a line
(selected by <pattern>) with <replacement>. It
can best be read:
Substitute for <pattern>, <replacement>
The <pattern> argument contains a pattern, exactly
like the patterns in addresses (see 2.2 above).
The only difference between <pattern> and a
context address is that the context address must
be delimited by slash (`/') characters; <pattern>
may be delimited by any character other than space
or newline. By default, only the first string
matched by <pattern> is replaced, but see the g
flag below. The <replacement> argument begins
immediately after the second delimiting character
of <pattern>, and must be followed immediately by
another instance of the delimiting character.
(Thus there are exactly three instances of the
delimiting character.) The <replacement> is not a
pattern, and the characters which are special in
patterns do not have special meaning in
<replacement>. Instead, other characters are
special:
& is replaced by the string matched
by <pattern>
\d (where d is a single digit) is replaced by
the dth substring matched by
parts of <pattern> enclosed in `\('
and `\)'. If nested substrings occur
in <pattern>, the dth is determined by
counting opening delimiters (`\('). As
in patterns, special characters may be
made literal by preceding them with
backslash (`\').
The <flags> argument may contain the following
flags:
g -- substitute <replacement> for all
(non-overlapping) instances of
<pattern> in the line. After a
successful substitution, the scan for
the next instance of <pattern> begins
just after the end of the inserted
characters; characters put into the line
from <replacement> are not rescanned.
p -- print the line if a successful
replacement was done. The p flag
causes the line to be written to the
output if and only if a substitution was
actually made by the s function.
Notice that if several s
functions, each followed by a p
flag, successfully substitute in the
same input line, multiple copies of
the line will be written to the
output: one for each successful
substitution.
w <filename> -- write the line to a file if a
successful replacement was done. The w
flag causes lines which are actually
substituted by the s function to be
written to a file named by <filename>.
If <filename> exists before sed is run,
it is overwritten; if not, it is
created. A single space must separate
w and <filename>. The
possibilities of multiple, somewhat
different copies of one input line
being written are the same as for p. A
maximum of 10 different file names may
be mentioned after w flags and w
functions (see below), combined.
Examples:
The following command, applied to our standard input,
s/to/by/w changes
produces, on the standard output:
In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless by man
Down by a sunless sea.
and, on the file `changes':
Through caverns measureless by man
Down by a sunless sea.
If the nocopy option is in effect, the command:
s/[.,;?:]/*P&*/gp
produces:
A stately pleasure dome decree*P:*
Where Alph*P,* the sacred river*P,* ran
Down to a sunless sea*P.*
Finally, to illustrate the effect of the g flag, the command:
/X/s/an/AN/p
produces (assuming nocopy mode):
In XANadu did Kubla Khan
and the command:
/X/s/an/AN/gp
produces:
In XANadu did Kubla KhAN
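The substitutions above can be tried as follows. This is a sketch assuming a POSIX shell and a present-day sed; `-n` is assumed to be the flag for the nocopy option, and the variable names are illustrative only. The last command is an added illustration of `\1` and `\2` in the replacement and is not one of the original examples.

```shell
sample='In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.'

# "&" stands for the matched string; g repeats the substitution along the
# line; p prints only the lines where a substitution was made.
punct=$(printf '%s\n' "$sample" | sed -n 's/[.,;?:]/*P&*/gp')

# Without g only the first "an" on the addressed line is replaced.
first_only=$(printf '%s\n' "$sample" | sed -n '/X/s/an/AN/p')
every_one=$(printf '%s\n' "$sample" | sed -n '/X/s/an/AN/gp')

# \1 and \2 refer to the first and second \( \) groups of the pattern.
swapped=$(printf '%s\n' 'sunless sea' | sed 's/\(.*\) \(.*\)/\2 \1/')

printf '%s\n' "$punct" "$first_only" "$every_one" "$swapped"
```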
3.3. Input-output Functions
(2)p -- print The print function writes the addressed
lines to the standard output file. They are
written at the time the p function is
encountered, regardless of what succeeding editing
commands may do to the lines.
(2)w <filename> -- write on <filename> The write
function writes the addressed lines to the file
named by <filename>. If the file previously
existed, it is overwritten; if not, it is
created. The lines are written exactly as they
exist when the write function is encountered
for each line, regardless of what subsequent
editing commands may do to them. Exactly one
space must separate the w and <filename>. A
maximum of ten different files may be
mentioned in write functions and w flags
after s functions, combined.
(1)r <filename> -- read the contents of a file The read
function reads the contents of <filename>, and
appends them after the line matched by the
address. The file is read and appended
regardless of what subsequent editing commands
do to the line which matched its address. If
r and a functions are executed on the same line,
the text from the a functions and the r functions
is written to the output in the order that
the functions are executed. Exactly one
space must separate the r and <filename>. If a
file mentioned by an r function cannot be opened,
it is considered a null file, not an error, and no
diagnostic is given.
NOTE: Since there is a limit to the number of files that can
be opened simultaneously, care should be taken that no more
than ten files be mentioned in w functions or flags; that
number is reduced by one if any r functions are present.
(Only one read file is open at one time.)
Examples
Assume that the file `note1' has the following contents:
Note: Kubla Khan (more properly Kublai Khan;
1216-1294) was the grandson and most eminent
successor of Genghiz (Chingiz) Khan, and founder
of the Mongol dynasty in China.
Then the following command:
/Kubla/r note1
produces:
In Xanadu did Kubla Khan
Note: Kubla Khan (more properly Kublai Khan;
1216-1294) was the grandson and most eminent
successor of Genghiz (Chingiz) Khan, and founder
of the Mongol dynasty in China.
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.
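The read example can be reproduced with a scratch file. This sketch assumes a POSIX shell, a present-day sed, and the `mktemp -d` utility; the directory variable `workdir` and the name `no-such-file` are made up for the illustration.

```shell
sample='In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.'

# Create note1 in a temporary directory so nothing is left behind.
workdir=$(mktemp -d)
cat > "$workdir/note1" <<'EOF'
Note: Kubla Khan (more properly Kublai Khan;
1216-1294) was the grandson and most eminent
successor of Genghiz (Chingiz) Khan, and founder
of the Mongol dynasty in China.
EOF

# The contents of note1 are written after the line that matches /Kubla/.
with_note=$(printf '%s\n' "$sample" | sed "/Kubla/r $workdir/note1")

# A file that cannot be opened is treated as empty, with no diagnostic.
unchanged=$(printf '%s\n' "$sample" | sed "/Kubla/r $workdir/no-such-file")

rm -rf "$workdir"
printf '%s\n' "$with_note"
```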
3.4. Multiple Input-line Functions
Three functions, all spelled with capital letters, deal
specially with pattern spaces containing imbedded newlines;
they are intended principally to provide pattern matches
across lines in the input.
(2)N -- Next line The next input line is appended to
the current line in the pattern space; the two
input lines are separated by an imbedded
newline. Pattern matches may extend across the
imbedded newline(s).
(2)D -- Delete first part of the pattern space Delete
up to and including the first newline character
in the current pattern space. If the pattern
space becomes empty (the only newline was the
terminal newline), read another line from the
input. In any case, begin the list of editing
commands again from its beginning.
(2)P -- Print first part of the pattern space Print up
to and including the first newline in the
pattern space.
The P and D functions are equivalent to their lower-case
counterparts if there are no imbedded newlines in the pattern
space.
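A brief sketch of the capital-letter functions, assuming a POSIX shell and a present-day sed. The address `$!` (every line except the last) keeps N from being attempted when no input remains; the variable names are illustrative.

```shell
sample='In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.'

# N appends the next input line; the imbedded newline is then replaced
# by a space, so the lines are joined in pairs.
joined=$(printf '%s\n' "$sample" | sed -e '$!N' -e 's/\n/ /')

# N, P and D together keep a two-line window over the input: P prints the
# first line of the window when the pattern matches across the newline,
# and D drops that first line and restarts the commands.
spanning=$(printf '%s\n' "$sample" | sed -n -e '$!N' -e '/ran\nThrough/P' -e 'D')

printf '%s\n' "$joined" "$spanning"
```

The first command writes three lines (two joined pairs and the unpaired last line); the second writes only the line ending in "ran", since that is the one place where the pattern spans two input lines.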
3.5. Hold and Get Functions
Five functions save and retrieve part of the input for
possible later use.
(2)h -- hold pattern space The h function copies the
contents of the pattern space into a hold area
(destroying the previous contents of the hold area).
(2)H -- Hold pattern space The H function appends the
contents of the pattern space to the contents of the
hold area; the former and new contents are
separated by a newline.
(2)g -- get contents of hold area The g function copies the
contents of the hold area into the pattern space
(destroying the previous contents of the pattern
space).
(2)G -- Get contents of hold area The G function appends the
contents of the hold area to the contents of the
pattern space; the former and new contents are
separated by a newline.
(2)x -- exchange The exchange command interchanges the
contents of the pattern space and the hold area.
Example
The commands
1h
1s/ did.*//
1x
G
s/\n/ :/
applied to our standard example, produce:
In Xanadu did Kubla Khan :In Xanadu
A stately pleasure dome decree: :In Xanadu
Where Alph, the sacred river, ran :In Xanadu
Through caverns measureless to man :In Xanadu
Down to a sunless sea. :In Xanadu
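The hold-area example can be run as written. The sketch below assumes a POSIX shell and a present-day sed; `sample`, `script`, and `out` are illustrative names.

```shell
sample='In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.'

# Line 1: save a copy (h), cut it down to "In Xanadu" (s), then swap (x) so
# the short form sits in the hold area and the full line is back in the
# pattern space. Every line: append the hold area (G) and turn the
# separating newline into " :".
script='1h
1s/ did.*//
1x
G
s/\n/ :/'

out=$(printf '%s\n' "$sample" | sed -e "$script")
printf '%s\n' "$out"
```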
3.6. Flow-of-Control Functions
These functions do no editing on the input lines, but
control the application of functions to the lines selected
by the address part.
(2)! -- Don't The Don't command causes the next command
(written on the same line), to be applied to
all and only those input lines not selected by the
address part.
(2){ -- Grouping The grouping command `{' causes the
next set of commands to be applied (or not
applied) as a block to the input lines selected by
the addresses of the grouping command. The first
of the commands under control of the grouping may
appear on the same line as the `{' or on the next
line.
The group of commands is terminated by a matching `}'
standing on a line by itself.
Groups can be nested.
(0):<label> -- place a label The label function marks a place in
the list of editing commands which may be referred to by b
and t functions. The <label> may be any sequence of
eight or fewer characters; if two different colon
functions have identical labels, a compile time
diagnostic will be generated, and no execution attempted.
(2)b<label> -- branch to label The branch function causes the
sequence of editing commands being applied to the current
input line to be restarted immediately after the place
where a colon function with the same <label> was
encountered. If no colon function with the same label
can be found after all the editing commands have been
compiled, a compile time diagnostic is produced, and
no execution is attempted. A b function with no <label>
is taken to be a branch to the end of the list of editing
commands; whatever should be done with the current input
line is done, and another input line is read; the list
of editing commands is restarted from the beginning on the
new line.
(2)t<label> -- test substitutions The t function tests whether
any successful substitutions have been made on the current
input line; if so, it branches to <label>; if not, it
does nothing. The flag which indicates that a successful
substitution has been executed is reset by:
1) reading a new input line, or
2) executing a t function.
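The flow-of-control functions can be illustrated with a few small scripts. These are sketches assuming a POSIX shell and a present-day sed; none of them comes from the original text, and the label `a` and the variable names are arbitrary choices.

```shell
sample='In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.'

# "!" applies p to the lines NOT selected by /an/.
without_an=$(printf '%s\n' "$sample" | sed -n '/an/!p')

# "{ ... }" applies a block of commands to the selected lines.
grouped=$(printf '%s\n' "$sample" | sed -n '/an/{
s/an/AN/
p
}')

# b with no label branches to the end of the command list, so lines
# without "an" skip the substitution and are copied unchanged.
branched=$(printf '%s\n' "$sample" | sed -e '/an/!b' -e 's/an/AN/g')

# t loops back to the label for as long as the substitution succeeds:
# each pass inserts one more comma, working from the right.
commas=$(printf '%s\n' 1234567 |
    sed -e ':a' -e 's/\(.*[0-9]\)\([0-9][0-9][0-9]\)/\1,\2/' -e 'ta')

printf '%s\n' "$without_an" "$grouped" "$branched" "$commas"
```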
3.7. Miscellaneous Functions
(1)= -- equals The = function writes to the standard
output the line number of the line matched by
its address.
(1)q -- quit The q function causes the current line to
be written to the output (if it should be),
any appended or read text to be written, and
execution to be terminated.
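Two common uses of these functions, sketched under the assumption of a POSIX shell and a present-day sed (the variable names are illustrative): counting or locating lines with `=`, and stopping early with `q`.

```shell
sample='In Xanadu did Kubla Khan
A stately pleasure dome decree:
Where Alph, the sacred river, ran
Through caverns measureless to man
Down to a sunless sea.'

# "=" on the last line ($) gives the number of lines in the input.
line_count=$(printf '%s\n' "$sample" | sed -n '$=')

# "=" with a context address gives the number of the matching line.
alph_line=$(printf '%s\n' "$sample" | sed -n '/Alph/=')

# q writes the current line and stops, so only lines 1 and 2 are copied.
first_two=$(printf '%s\n' "$sample" | sed '2q')

printf '%s\n' "$line_count" "$alph_line" "$first_two"
```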