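Preface
This post records the basic syntax for processing strings with regular expressions in Perl.
9 Processing Text with Regular Expressions (1)
9.1 Substitutions with s///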
$_ = "He's out bowling with Barney tonight.";
# Modify the string with the s/// substitution operator
# Form: s/pattern to search for/replacement string/
s/Barney/Fred/;
print "$_\n"; # He's out bowling with Fred tonight.
s/with (\w+)/against $1's team/;
print "$_\n"; # He's out bowling against Fred's team tonight.
$_ = "one two three";
# Swap the first two words
s/(\w+) (\w+)/$2 $1/;
print "$_\n"; #two one three
# ^ anchors the match at the start of the string, so this inserts text at the beginning
s/^/huge,/;
print "$_\n"; #huge,two one three
# By default, s/// replaces only the first match
$_ = "home, sweet home";
s/home/cave/;
print "$_\n"; #cave, sweet home
# Add the g modifier to replace every match
$_ = "home, sweet home";
s/home/cave/g;
print "$_\n"; #cave, sweet cave
# Global substitution is often used to collapse runs of whitespace into a single space
$_ = "Input data \t may have";
s/\s+/ /g;
print "$_\n";#Input data may have
# Strip the leading and trailing whitespace from a string
$_ = " one two ";
s/^\s+//;
s/\s+$//;
# Equivalent single statement (no spaces around |, since they would be matched literally without /x): s/^\s+|\s+$//g;
print "$_\n";#one two
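The two trimming substitutions above can be folded into one reusable routine. The following is a minimal sketch; the helper name `trim` is hypothetical and is not part of the original notes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original notes): return a copy of the
# argument with leading and trailing whitespace removed.
sub trim {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;   # leading run OR trailing run, hence the g
    return $text;
}

print "[", trim("  one two \t"), "]\n";   # [one two]
```

The g modifier matters here: without it only the first alternative that matches (the leading whitespace) would be removed.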
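# Other punctuation characters can serve as the delimiter; bracket-type delimiters must be used as matching pairs
$_ = "He's out bowling with Barney tonight.";
# Forms:
# s#pattern#replacement#
# s{pattern}{replacement}
# s(pattern)(replacement)
# s<pattern><replacement>
# With # as the delimiter, the slashes in the pattern need no escaping.
# This $_ does not start with https://, so nothing is replaced here.
s#^https://#https:#;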
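# s/// first tries to match; it replaces only on a match and otherwise leaves the string unchanged.
# It returns true when a replacement was made, so it can be used as a condition.
$_ = "fred and Willin";
if(s/fred/Fred/){
print "Successfully replaced!\n";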
}
print "$_\n";#Fred and Willin
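With the g modifier, the value returned by s/// is the number of replacements made, which is sometimes more useful than a plain true/false. A small sketch, assuming a hypothetical helper named `cave_count`:

```perl
use strict;
use warnings;

# Hypothetical helper: replace every "home" with "cave" in a copy of the
# argument and report how many replacements were made.
sub cave_count {
    my ($text) = @_;
    my $count = ($text =~ s/home/cave/g);   # count, or "" when nothing matched
    return ($count || 0, $text);
}

my ($n, $result) = cave_count("home, sweet home");
print "$n replacement(s): $result\n";   # 2 replacement(s): cave, sweet cave
```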
# The i modifier makes the match case-insensitive
$_ = "freD and Willin";
s#fred#Fred#gi;
print "$_\n";#Fred and Willin
# To substitute in a variable other than $_, use the binding operator =~
$name = "one another one";
$name =~ s/one/two/g;
print "$name\n";#two another two
# \U converts everything that follows to uppercase
$_ = "I saw Bill with Frid";
s/(bill|frid)/\U$1/ig;
print "$_\n";#I saw BILL with FRID
# \L converts everything that follows to lowercase
$_ = "I saw Bill with Frid";
s/(bill|frid)/\L$1/ig;
print "$_\n";#I saw bill with frid
# \E ends the preceding case conversion
$_ = "this is One and Two";
s/(\w+) and (\w+)/\L$2\E and \U$1\E/;
print "$_\n";#this is two and ONE
# \u uppercases only the next character
$_ = "this is one and two";
s/(one|two)/\u$1/gi;
print "$_\n";#this is One and Two
# \l lowercases only the next character
$_ = "this is ONE and TWO";
s/(one|two)/\l$1/gi;
print "$_\n";#this is oNE and tWO
# \u\L together: first letter uppercase, the rest lowercase (applied to the result of the previous example)
s/(one|two)/\u\L$1/gi;
print "$_\n";#this is One and Two
# The case-conversion escapes above work the same way in any double-quoted string
$name = "bill";
print "Hello, \L\u$name\E";#Hello, Bill
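Combining \u\L with the g modifier gives a simple way to normalize the capitalization of every word in a string. The sketch below assumes a hypothetical helper named `title_case`; it treats each run of \w characters as a word, which is a simplification.

```perl
use strict;
use warnings;

# Hypothetical helper: uppercase the first letter of every word and
# lowercase the rest, using \u\L in the replacement.
sub title_case {
    my ($text) = @_;
    $text =~ s/(\w+)/\u\L$1/g;
    return $text;
}

print title_case("hELLO bARNEY rubble"), "\n";   # Hello Barney Rubble
```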
9.2 The split Operator
# split breaks a string into a list of fields according to a separator pattern
@values = split(/:/, "abc:def:g:h");
foreach(@values){
print "$_\n";
}
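# Splitting on whitespace. If the pattern is omitted, split uses ' ', which behaves like /\s+/ but skips leading whitespace; if the string is omitted, split works on $_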
$input = "This is a \t test. \t";
@values = split(/\s+/, $input);
foreach(@values){
print "$_\n";
#This
#is
#a
#test.
#(split drops trailing empty fields, so nothing is printed for the trailing whitespace)
}
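Two details of split are easy to get wrong: what happens to empty fields at the end of the string, and how /\s+/ differs from the default ' ' when the string starts with whitespace. The following sketch illustrates both; the sample strings and variable names are made up for the illustration.

```perl
use strict;
use warnings;

my $record = "a:b::c:::";

my @default = split /:/, $record;       # trailing empty fields are dropped
my @all     = split /:/, $record, -1;   # a negative limit keeps them

print scalar(@default), "\n";   # 4
print scalar(@all), "\n";       # 7

my $padded = "  one two";

my @by_regex = split /\s+/, $padded;   # leading empty field is kept
my @by_space = split ' ',   $padded;   # leading whitespace is skipped

print scalar(@by_regex), "\n";   # 3
print scalar(@by_space), "\n";   # 2
```

Empty fields in the middle of the string (the one between "b" and "c" above) are always kept.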
9.3 The join Function
# join is the opposite of split: it glues the elements of a list into one string
@values = (1,2,3,4,5,6,7,8);
$value = join(":", @values); # the glue ":" can be any string
print "$value\n";#1:2:3:4:5:6:7:8
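Because split and join are opposites, they are often used together to change the separator in a record. A minimal sketch, with a made-up input string:

```perl
use strict;
use warnings;

my $path   = "usr:local:bin";
my @parts  = split /:/, $path;     # ("usr", "local", "bin")
my $joined = join "/", @parts;     # glue is a plain string, not a pattern

print "$joined\n";   # usr/local/bin
```

Note that the first argument to join is an ordinary string, whereas the first argument to split is a pattern.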
9.4 m// in List Context
$_ = "Hello there, neighbor!";
my($v1, $v2, $v3) = /(\S+) (\S+), (\S+)/;
print "$v1, $v2, $v3\n";#Hello, there, neighbor!
my $text = "Fred dropped a 5 ton granite block on Mr.Slate";
my @words = ($text =~ /([a-z]+)/ig);
print "@words\n";#Fred dropped a ton granite block on Mr Slate
my $data = "aa aaa bb bbb cc ccc dd ddd ee eee";
my %d = ($data =~ /(\w+)\s+(\w+)/g);
while(my ($key, $value) = each %d){
print "$key => $value\n";
#hash order is not guaranteed; one possible output:
#bb => bbb
#aa => aaa
#dd => ddd
#ee => eee
#cc => ccc
}
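The same idiom is handy for turning a line of key/value pairs into a hash: with two capture groups and the g modifier, the match returns a flat list of alternating keys and values. The sketch below uses a made-up input string and sorts the keys so that the output order is predictable.

```perl
use strict;
use warnings;

my $config  = "host=localhost port=8080 user=fred";
my %setting = ($config =~ /(\w+)=(\w+)/g);   # (key, value, key, value, ...)

for my $key (sort keys %setting) {
    print "$key => $setting{$key}\n";
}
# host => localhost
# port => 8080
# user => fred
```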