Text-processing commands: sed

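Testing:

    The test environment is Redhat 6.0.
    The sed command on Solaris is actually more capable than the one on Linux, but since I
    have no Solaris environment to test in, I only give usages here that I have tested on Linux.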

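Contents:

    ★ Command-line options in brief
    ★ First, assume we have a text file named sedtest.txt
    ★ Print a range of lines: p
    ★ Add a tab (^I) at the start of every line
    ★ Add --end at the end of every line
    ★ Show the line numbers of lines matching a pattern: [/pattern/]=
    ★ Append text after matching lines: [/pattern/]a\ or [address]a\
    ★ Delete matching lines: [/pattern/]d or [address1][,address2]d
    ★ Replace matching lines: [/pattern/]c\ or [address1][,address2]c\
    ★ Insert text before matching lines: [/pattern/]i\ or [address]i\
    ★ Replace matched strings (no longer whole lines): [addr1][,addr2]s/old/new/g
    ★ Pattern matching within a restricted range
    ★ Replace only the Nth match on each line
    ★ & stands for the matched text
    ★ Using sed to modify the PATH environment variable
    ★ Measuring and improving sed's efficiency
    ★ Write to an output file: [address1][,address2]w outputfile
    ★ Read from an input file: [address]r inputfile
    ★ Transliterate characters: [address1][,address2]y/old/new/
    ★ Using !
    ★ Using \cregexpc
    ★ The complexity of regular expressions in sed commands
    ★ Converting a man page to plain text (new)
    ★ The sed man page (produced with the method above)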

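★ Command-line options in brief

    sed
    -e script      the sed editing commands are given in script
    -f scriptfile  the sed editing commands are read from the file scriptfile
    -n             quiet mode: suppress the automatic printing of every line, so that
                   sed outputs only what a command such as p explicitly prints, for
                   example only the lines that were changed.

    Not clear yet? Never mind; put these messy details aside and follow along. Note that
    the walkthrough below does not explain regular expressions, so it may be hard going
    if you are not familiar with them.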

★ First, assume we have a text file named sedtest.txt

cat > sedtest.txt
Sed is a stream editor
----------------------
A stream editor is used to perform basic text transformations on an input stream
--------------------------------------------------------------------------------
While in some ways similar to an editor which permits scripted edits (such as ed),
----------------------------------------------------------------------------------
sed works by making only one pass over the input(s), and is consequently more
-----------------------------------------------------------------------------
efficient. But it is sed's ability to filter text in a pipeline which particularly
----------------------------------------------------------------------------------
distinguishes it from other types of editors.
---------------------------------------------
^D

★ Print a range of lines: p

sed -e "1,4p" -n sedtest.txt
sed -e "/from/p" -n sedtest.txt
sed -e "1,/from/p" -n sedtest.txt
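The three address forms above (a line range, a pattern, and a line-to-pattern range) can be tried on a smaller input. This is only a sketch: it assumes a Bourne-style shell, and the four-line sample file is made up for illustration rather than taken from sedtest.txt.

```shell
# Hypothetical four-line sample file; the name comes from mktemp, not from the article.
f=$(mktemp)
printf '%s\n' 'alpha' 'beta from here' 'gamma' 'delta' > "$f"

# -n turns off the automatic print, so only what p prints is shown.
first_two=$(sed -n -e '1,2p' "$f")         # lines 1-2
matching=$(sed -n -e '/from/p' "$f")       # lines containing "from"
up_to_from=$(sed -n -e '1,/from/p' "$f")   # line 1 through the first "from" line

# Without -n, every line is printed once and matching lines a second time.
doubled=$(sed -e '/from/p' "$f")

printf '%s\n---\n' "$first_two" "$matching" "$up_to_from" "$doubled"
rm -f "$f"
```

Forgetting -n is the usual reason a p command appears to print duplicate lines.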

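★ Add a tab (^I) at the start of every line

sed "s/^/^I/g" sedtest.txt

Note: to type ^I, press ctrl-v and then ctrl-i.

A lone ^ matches the start of a line.

★ Add --end at the end of every line

sed "s/$/--end/g" sedtest.txt

A lone $ matches the end of a line.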
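As a small sketch of the two anchors, the following adds a tab in front of each line and --end after it. It assumes a Bourne-style shell; the tab is produced with printf instead of being typed with ctrl-v ctrl-i, and the two-line input is invented for the example. Since ^ and $ each match only once per line, the g flag is not needed here.

```shell
tab=$(printf '\t')     # a literal tab character held in a shell variable
out=$(printf 'one\ntwo\n' | sed -e "s/^/${tab}/" -e 's/$/--end/')
printf '%s\n' "$out"
```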

★ Show the line numbers of lines matching a pattern: [/pattern/]=

sed -e '/is/=' sedtest.txt

1
Sed is a stream editor
----------------------
3
A stream editor is used to perform basic text transformations on an input stream
--------------------------------------------------------------------------------
While in some ways similar to an editor which permits scripted edits (such as ed),
----------------------------------------------------------------------------------
7
sed works by making only one pass over the input(s), and is consequently more
-----------------------------------------------------------------------------
9
efficient. But it is sed's ability to filter text in a pipeline which particularly
----------------------------------------------------------------------------------
11
distinguishes it from other types of editors.
---------------------------------------------

This scans sedtest.txt and prints the line number of every line that contains the string is. Note that line 11 also contains is (inside "distinguishes").
The output goes to stdout; the original sedtest.txt is left untouched unless you redirect the output yourself.
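If only the line numbers are wanted, = can be combined with -n so that the lines themselves are not echoed. A minimal sketch with made-up input, assuming a Bourne-style shell:

```shell
# "this" and "his" contain "is"; "that" does not.
nums=$(printf 'this\nthat\nhis\n' | sed -n -e '/is/=')
printf '%s\n' "$nums"    # prints 1 and 3, one per line
```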

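★ Append text after matching lines: [/pattern/]a\ or [address]a\

cat > sedadd.script
/from/a\
+++++++++
^D

sed -f sedadd.script sedtest.txt

Sed is a stream editor
----------------------
A stream editor is used to perform basic text transformations on an input stream
--------------------------------------------------------------------------------
While in some ways similar to an editor which permits scripted edits (such as ed),
----------------------------------------------------------------------------------
sed works by making only one pass over the input(s), and is consequently more
-----------------------------------------------------------------------------
efficient. But it is sed's ability to filter text in a pipeline which particularly
----------------------------------------------------------------------------------
distinguishes it from other types of editors.
+++++++++
---------------------------------------------

This finds the lines containing the string from and adds +++++++++ on the line after each of them.
The output goes to stdout; the original sedtest.txt is left untouched unless you redirect the output yourself.

Many people want to do this directly on the command line instead of creating an extra sedadd.script. Unfortunately, that requires the line-continuation character \:

[scz@ /home/scz/src]> sed -e "/from/a\\
> +++++++++" sedtest.txt

[scz@ /home/scz/src]> sed -e "a\\
> +++++++++" sedtest.txt

The command above adds a new line +++++++++ after every line.

[scz@ /home/scz/src]> sed -e "1 a\\
> +++++++++" sedtest.txt

Copying and pasting the two lines of such a command (without the prompts) onto a shell command line has the same effect, for example:

sed -e "1 a\\
+++++++++" sedtest.txt

[address]a\ accepts only a single address.

In my tests the a command did not work inside single quotes, only inside double quotes, whereas d and the other commands worked with both. Quoting is handled by the shell rather than by sed, so this behavior depends on the shell in use.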
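The quoting question can be checked directly. The sketch below assumes a POSIX-style shell such as sh or bash (other shells, csh in particular, quote differently) and uses made-up three-line input. Inside single quotes a single backslash reaches sed unchanged; inside double quotes the shell reduces \\ to one backslash first, so both forms hand sed the same script.

```shell
input=$(printf 'one\nfrom two\nthree\n')

# Single quotes: the backslash and the newline are passed through as typed.
single=$(printf '%s\n' "$input" | sed -e '/from/a\
+++++++++')

# Double quotes: the shell turns \\ into one backslash before sed sees it.
double=$(printf '%s\n' "$input" | sed -e "/from/a\\
+++++++++")

printf '%s\n' "$single"
```

Writing a\\ inside single quotes would pass two backslashes to sed, which is one plausible way to end up with a failing single-quoted command.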


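★ Delete matching lines: [/pattern/]d or [address1][,address2]d

sed -e '/---------------------------------------------/d' sedtest.txt

Sed is a stream editor
----------------------
A stream editor is used to perform basic text transformations on an input stream
While in some ways similar to an editor which permits scripted edits (such as ed),
sed works by making only one pass over the input(s), and is consequently more
efficient. But it is sed's ability to filter text in a pipeline which particularly
distinguishes it from other types of editors.

The pattern is 45 dashes long, so the 22-dash line 2 does not match and is kept.

sed -e '6,10d' sedtest.txt
Deletes lines 6 through 10, inclusive.

sed -e "2d" sedtest.txt
Deletes line 2.

sed "1,/^$/d" sedtest.txt
Deletes everything from the first line through the first blank line.
Note that this command can easily produce a surprising result: if sedtest.txt contains no blank line after the first line, sed deletes every line of the file.

sed "1,/from/d" sedtest.txt
Deletes everything from the first line through the first line containing the string from, including that line.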
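The pitfall with a range that ends in a pattern is easy to reproduce. In this sketch (Bourne-style shell, invented input), the same command keeps the body of one input and deletes all of another, because the closing pattern never matches and the range therefore runs to the end of the input.

```shell
# A blank line exists: lines 1-2 are deleted and the body survives.
with_blank=$(printf 'header\n\nbody\n' | sed -e '1,/^$/d')

# No blank line: the range never closes, so every line is deleted.
no_blank=$(printf 'header\nbody\n' | sed -e '1,/^$/d')

printf 'with blank: [%s]\nwithout blank: [%s]\n' "$with_blank" "$no_blank"
```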

★ Replace matching lines: [/pattern/]c\ or [address1][,address2]c\

sed -e "/is/c\\
**********" sedtest.txt

Finds every line containing the string is and replaces it with **********

**********
----------------------
**********
--------------------------------------------------------------------------------
While in some ways similar to an editor which permits scripted edits (such as ed),
----------------------------------------------------------------------------------
**********
-----------------------------------------------------------------------------
**********
----------------------------------------------------------------------------------
**********
---------------------------------------------

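sed -e "1,11c\\
**********" sedtest.txt
Replaces lines 1 through 11 with **********

★ Replace matched strings (no longer whole lines): [addr1][,addr2]s/old/new/g

sed "1,12s/from/****/g" sedtest.txt
Searches lines 1 through 12 for every occurrence of the string from and replaces each one with the string ****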
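A substitution restricted to a line range can be sketched as follows. The input is made up, and a Bourne-style shell is assumed. Only lines inside the range are touched, and the g flag replaces every occurrence on those lines.

```shell
# Lines 1-2 are in range; line 3 is left alone.
out=$(printf 'from a from\nfrom b\nfrom c\n' | sed -e '1,2s/from/****/g')
printf '%s\n' "$out"
```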

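★ Pattern matching within a restricted range

sed "/But/s/is/are/g" sedtest.txt
On the lines containing the string But, replaces is with are

sed "/is/s/t/T/" sedtest.txt
On the lines containing the string is, replaces the first t on each line with T

sed "/While/,/from/p" sedtest.txt -n
Prints everything between the lines matching these two patterns, inclusive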
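One thing worth seeing in a sketch is that is matches as a plain substring, not as a word. With the invented two-line input below (Bourne-style shell assumed), only the line containing But is edited, and the is inside "this" is replaced as well.

```shell
out=$(printf 'this is\nBut this is\n' | sed -e '/But/s/is/are/g')
printf '%s\n' "$out"    # second line becomes: But thare are
```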

★ Replace only the Nth match on each line

sed "s/is/are/5" sedtest.txt
Replaces the 5th occurrence of the string is on each line with are
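A numeric flag is easier to follow with a smaller count. This sketch uses 2 instead of 5 and made-up input, assuming a Bourne-style shell; a line with fewer occurrences than the number is left unchanged.

```shell
# Line 1 has three occurrences, line 2 has only one.
out=$(printf 'is is is\nis\n' | sed -e 's/is/are/2')
printf '%s\n' "$out"
```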

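★ & stands for the matched text

sed "s/^$/(&)/" sedtest.txt
Puts a pair of () on every empty line

sed "s/is/(&)/g" sedtest.txt
Wraps every occurrence of the string is in ()

sed "s/.*/(&)/" sedtest.txt
Wraps every line in a pair of ()

sed "/is/s/.*/(&)/" sedtest.txt
Wraps every line containing the string is in a pair of ()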
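Because & expands to whatever the pattern matched, it also works when the match differs from line to line. A small sketch with invented input (Bourne-style shell assumed): each run of digits is wrapped in parentheses, and an empty line, where & is the empty string, becomes ().

```shell
out=$(printf 'a1 b22\n\n' | sed -e 's/[0-9][0-9]*/(&)/g' -e 's/^$/(&)/')
printf '%s\n' "$out"
```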

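★ Using sed to modify the PATH environment variable

First look at the PATH environment variable
[scz@ /home/scz/src]> echo $PATH
/usr/bin:/usr/bin:/bin:/usr/local/bin:/sbin:/usr/sbin:/usr/X11R6/bin:.

Remove the trailing { :/usr/X11R6/bin:. }
[scz@ /home/scz/src]> echo $PATH | sed "s/^\(.*\):\/usr[/]X11R6\/bin:[.]$/\1/"
/usr/bin:/usr/bin:/bin:/usr/local/bin:/sbin:/usr/sbin

Remove the { :/bin: } in the middle
[scz@ /home/scz/src]> echo $PATH | sed "s/^\(.*\):\/bin:\(.*\)$/\1\2/"
/usr/bin:/usr/bin/usr/local/bin:/sbin:/usr/sbin:/usr/X11R6/bin:.
Note that both colons are removed, so /usr/bin and /usr/local/bin run together in the result; use \1:\2 as the replacement to keep one separator.

[/] makes / lose its special meaning
\/ likewise makes / lose its special meaning
\1 refers to the text matched by the first sub-expression
\2 refers to the text matched by the second sub-expression
\(.*\) marks a sub-expression

Remove the trailing :, then add a new path
PATH=`echo $PATH | sed 's/\(.*\):$/\1/'`:$HOME/src
Note the difference between the backquote ` and the single quote '.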
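The same back-reference technique can be tried without touching the real PATH. The sketch below works on a hypothetical variable p (the value is made up) in a Bourne-style shell, and contrasts a replacement of \1\2, which fuses two entries, with \1:\2, which keeps the separator.

```shell
# Hypothetical path list; the real PATH is not modified.
p='/usr/bin:/bin:/usr/local/bin:.'

# Drop the trailing ":." entry.
no_tail=$(printf '%s\n' "$p" | sed -e 's/^\(.*\):[.]$/\1/')

# Drop the middle ":/bin:" entry; \1\2 loses the separator, \1:\2 keeps it.
fused=$(printf '%s\n' "$p" | sed -e 's/^\(.*\):\/bin:\(.*\)$/\1\2/')
fixed=$(printf '%s\n' "$p" | sed -e 's/^\(.*\):\/bin:\(.*\)$/\1:\2/')

printf '%s\n' "$no_tail" "$fused" "$fixed"
```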

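★ Measuring and improving sed's efficiency

time sed -n "1,12p" webkeeper.db > /dev/null
time sed 12q webkeeper.db > /dev/null
The latter is visibly more efficient than the former: it quits after line 12 instead of reading the rest of the file.

[address]q quits sed when the given line is reached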
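The two forms print the same lines; they differ only in how much input sed reads. A minimal sketch with five made-up input lines (Bourne-style shell assumed); a timing difference only becomes measurable on a large file.

```shell
with_p=$(printf '%s\n' 1 2 3 4 5 | sed -n -e '1,2p')   # reads all five lines
with_q=$(printf '%s\n' 1 2 3 4 5 | sed -e '2q')        # stops after line 2
printf '%s\n' "$with_q"
```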

★ Write to an output file: [address1][,address2]w outputfile

sed "1,10w sed.out" sedtest.txt -n
Writes lines 1 through 10 of sedtest.txt to the file sed.out.

★ Read from an input file: [address]r inputfile

sed "1r sedappend.txt" sedtest.txt
Appends the contents of sedappend.txt after the first line of sedtest.txt
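Both commands can be exercised in a scratch directory. In this sketch the file names in.txt, add.txt and out.txt are hypothetical, the directory comes from mktemp -d, and a Bourne-style shell is assumed. The file name given to w or r runs to the end of the command line, so nothing may follow it in the same expression.

```shell
d=$(mktemp -d)
printf 'l1\nl2\nl3\n' > "$d/in.txt"
printf 'extra\n' > "$d/add.txt"

# w: copy lines 1-2 to out.txt; -n keeps them off stdout.
sed -n -e "1,2w $d/out.txt" "$d/in.txt"
written=$(cat "$d/out.txt")

# r: emit the contents of add.txt after line 1.
merged=$(sed -e "1r $d/add.txt" "$d/in.txt")

printf '%s\n---\n%s\n' "$written" "$merged"
rm -rf "$d"
```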

★ Transliterate characters: [address1][,address2]y/old/new/

sed "y/abcdef/ABCDEF/" sedtest.txt
Replaces each of the lowercase letters a, b, c, d, e, f in sedtest.txt with the corresponding uppercase letter.
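Unlike s, the y command maps single characters one to one and needs no g flag. A one-line sketch with made-up input, assuming a Bourne-style shell:

```shell
out=$(printf 'a bad cafe\n' | sed -e 'y/abcdef/ABCDEF/')
printf '%s\n' "$out"    # A BAD CAFE
```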

★ Using !

sed -e '3,7!d' sedtest.txt
Deletes every line outside lines 3 through 7

sed -e '1,/from/!d' sedtest.txt
Finds the first line containing the string from and deletes all the lines after it
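Negating an address turns "delete these lines" into "keep only these lines". A sketch of both forms on invented input (Bourne-style shell assumed; in shells with history expansion, such as csh or an interactive bash, the ! should stay inside single quotes):

```shell
kept=$(printf '%s\n' 1 2 3 4 5 | sed -e '2,4!d')          # keeps lines 2-4
upto=$(printf 'a\nfrom b\nc\n' | sed -e '1,/from/!d')     # keeps through the "from" line
printf '%s\n---\n%s\n' "$kept" "$upto"
```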

★ Using \cregexpc

sed -e "\:from:d" sedtest.txt
is equivalent to sed -e "/from/d" sedtest.txt
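An alternative delimiter pays off when the pattern itself contains slashes. In this sketch (made-up path list, Bourne-style shell assumed) the colon delimiter lets /usr/local be written without escaping any /.

```shell
out=$(printf '/usr/bin\n/usr/local/bin\n/bin\n' | sed -e '\:/usr/local:d')
printf '%s\n' "$out"
```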

★ The complexity of regular expressions in sed commands

cat > sedtest.txt
^\/[}]{.*}[\(]$\)
^D

How can this line be replaced with
\(]$\)\/[}]{.*}^[

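★ Converting a man page to plain text (new)

man sed | col -b > sed.txt
sed -e "s/^H//g" -e "/^$/d" -e "s/^^I/        /g" -e "s/^I/ /g" sed.txt > sedman.txt
This deletes all backspace characters and blank lines, replaces a tab at the start of a line with 8 spaces, and replaces every other tab with one space. (^H and ^I are typed as ctrl-v ctrl-h and ctrl-v ctrl-i.)

★ The sed man page (produced with the method above)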

NAME
       sed - a Stream EDitor
SYNOPSIS
       sed [-n] [-V] [--quiet] [--silent] [--version] [--help]
           [-e script] [--expression=script]
           [-f script-file] [--file=script-file]
           [script-if-no-other-script]
           [file...]
DESCRIPTION
       Sed  is a stream editor.  A stream editor is used to per-
       form basic text transformations on an input stream (a file
       or  input from a pipeline).  While in some ways similar to
       an editor which permits scripted edits (such as ed),  sed
       works  by  making  only one pass over the input(s), and is
       consequently more efficient.  But it is sed's  ability  to
       filter text in a pipeline which particularly distinguishes
       it from other types of editors.
OPTIONS
       Sed  may  be  invoked  with  the  following   command-line
       options:
       -V
       --version
              Print  out the version of sed that is being run and
              a copyright notice, then exit.
       -h
       --help Print a usage  message  briefly  summarizing  these
              command-line options and the bug-reporting address,
              then exit.
       -n
       --quiet
       --silent
              By default, sed will print out the pattern space at
              the  end of  each cycle through the script.  These
              options disable this automatic  printing,  and  sed
              will  only  produce  output when explicitly told to
              via the p command.
       -e script
       --expression=script
              Add the commands in script to the set  of  commands
              to be run while processing the input.
       -f script-file
       --file=script-file
              Add  the commands contained in the file script-file
              to the set of commands to be run while  processing
              the input.
       If  no  -e,-f,--expression, or --file options are given on
       the command-line, then the first  non-option  argument  on
       the command line is taken to be the script to be executed.
       If any command-line parameters remain after processing the
       above,  these  parameters  are interpreted as the names of
       input files to be processed.  A file name of -  refers  to
       the  standard  input stream.  The standard input will be
       processed if no file names are specified.
Command Synopsis
       This is just a brief synopsis of sed commands to serve  as
       a reminder to those who already know sed; other documenta-
       tion (such as the texinfo document) must be consulted  for
       fuller descriptions.
   Zero-address ``commands''
       : label
              Label for b and t commands.
       #comment
              The  comment extends until the next newline (or the
              end of a -e script fragment).
       }      The closing bracket of a { } block.
   Zero- or One- address commands
       =      Print the current line number.
       a \
       text   Append text, which has each embedded  newline  pre-
              ceded by a backslash.
       i \
       text   Insert  text,  which has each embedded newline pre-
              ceded by a backslash.
       q      Immediately quit the sed script without  processing
              any  more  input,  except that if auto-print is not
              disabled the current pattern space will be  printed.
       r filename
              Append text read from filename.
   Commands which accept address ranges
       {      Begin a block of commands (end with a }).
       b label
              Branch to label; if label is omitted, branch to end
              of script.
       t label
              If a s/// has done a successful substitution  since
              the  last  input line was read and since the last t
              command, then branch to label; if label is omitted,
              branch to end of script.
       c \
       text   Replace  the  selected  lines  with text, which has
              each embedded newline preceded by a backslash.
       d      Delete pattern space.  Start next cycle.
       D      Delete up to the first embedded newline in the pat-
              tern  space.   Start  next  cycle, but skip reading
              from the input if there is still data in the  pat-
              tern space.
       h H    Copy/append pattern space to hold space.
       g G    Copy/append hold space to pattern space.
       x      Exchange the  contents  of  the hold  and pattern
              spaces.
       l      List out the current line in a ``visually unambigu-
              ous'' form.
       n N    Read/append the next line of input into the pattern
              space.
       p      Print the current pattern space.
       P      Print up to the first embedded newline of the  cur-
              rent pattern space.
       s/regexp/replacement/
              Attempt  to match regexp against the pattern space.
              If successful, replace that  portion  matched  with
              replacement.   The replacement may contain the spe-
              cial character & to refer to that  portion  of  the
              pattern space  which  matched, and  the  special
              escapes \1 through \9 to refer to the corresponding
              matching sub-expressions in the regexp.
       w filename
              Write the current pattern space to filename.
       y/source/dest/
              Transliterate the characters in the  pattern  space
              which appear in source to the corresponding charac-
              ter in dest.
Addresses
       Sed commands can be given with no addresses, in which case
       the command will be executed for all input lines; with one
       address, in which case the command will only  be  executed
       for  input  lines  which  match that address; or with two
       addresses, in which case the command will be executed  for
       all  input  lines which match the inclusive range of lines
       starting from the first address and continuing to the sec-
       ond  address.   Three things to note about address ranges:
       the syntax is addr1,addr2 (i.e., the addresses  are  sepa-
       rated  by  a  comma);  the  line  which addr1 matched will
       always be accepted, even if addr2 selects an earlier line;
       and  if addr2  is a regexp, it will not be tested against
       the line that addr1 matched.
       After the address (or address-range), and before the  com-
       mand,  a !  may be inserted, which specifies that the com-
       mand shall only be executed if the  address  (or  address-
       range) does not match.
       The following address types are supported:
       number Match only the specified line number.
       first~step
              Match  every step'th line starting with line first.
              For example, ``sed -n 1~2p''  will  print  all  the
              odd-numbered  lines  in  the  input stream, and the
              address 2~5 will match every fifth  line,  starting
              with the second. (This is a GNU extension.)
       $      Match the last line.
       /regexp/
              Match lines matching the regular expression regexp.
       \cregexpc
              Match lines matching the regular expression regexp.
              The c may be any character.
Regular expressions
       POSIX.2 BREs  should  be  supported, but they aren't com-
       pletely yet.  The \n  sequence  in  a  regular  expression
       matches the  newline  character.  There are also some GNU
       extensions.  [XXX FIXME: more needs to be  said.   At  the
       very   least,   a  reference  to  another  document  which
       describes what is supported should be given.]
Miscellaneous notes
       This version of sed supports a \<newline> sequence in  all
       regular expressions, the replacement part of a substitute
       (s) command, and  in  the  source  and  dest  parts  of a
       transliterate  (y)  command.   The  \ is stripped, and the
       newline is kept.
SEE ALSO
       awk(1), ed(1), expr(1), emacs(1), perl(1),  tr(1),  vi(1),
       regex(5) [well, one ought to be written... XXX], sed.info,
       any of various books on sed, the sed FAQ
       (http://www.wollery.demon.co.uk/sedtut10.txt
       http://www.ptug.org/sed/sedfaq.htm.
BUGS
       E-mail bug reports to bug-gnu-utils@gnu.org.  Be sure to
       include the word ``sed'' somewhere in the ``Subject:''
       field.
