Perl Beginner's Tutorial [Day 3]

Latest recommended article published 2024-09-19 07:04:58

Kyle-soft

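Conditional Statements

Perl of course also supports if/then/else statements. Here is an example:

if ($a)
{
	print "The string is not empty\n";
}
else
{
	print "The string is empty\n";
}

Remember that the empty string is considered false. The string "0" is also false, so if $a is "0" the code above also reports that the string is empty.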
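The truth rule for strings can be checked directly. The following is a minimal sketch; the sample values and the names @candidates and @results are illustrative assumptions, not part of the original tutorial.

```perl
use strict;
use warnings;

# The empty string and the string "0" are false.
# Other non-empty strings, including "0.0" and "00", are true.
my @candidates = ("", "0", "0.0", "00", " ", "a");
my @results;
for my $value (@candidates) {
    my $verdict = $value ? "true" : "false";
    push @results, $verdict;
    print "'$value' is $verdict\n";
}
```

If the goal is to test for emptiness alone, comparing length($a) with 0 avoids the "0" special case.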

A conditional can also include elsif branches:

if (!$a)			# The ! is the not operator
{
	print "The string is empty\n";
}
elsif (length($a) == 1)		# If above fails, try this
{
	print "The string has one character\n";
}
elsif (length($a) == 2)		# If that fails, try this
{
	print "The string has two characters\n";
}
else				# Now, everything has failed
{
	print "The string has lots of characters\n";
}

Note: the keyword really is spelled elsif, without the second "e".

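String Matching

One of Perl's most useful features is its powerful string handling. At the heart of this is the regular expression (RE), which is shared by many other UNIX tools.

Regular Expressions

A regular expression is written between slashes, and matching is done with the =~ operator. The following expression is true if the string the appears in the variable $sentence: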

$sentence =~ /the/

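REs are case sensitive, so if

$sentence = "The quick brown fox";

then the match above is false. The operator !~ tests for a non-match. In the example above

$sentence !~ /the/

is true, because the string the does not appear in $sentence.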
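A short sketch of both operators on the sentence used above. The variable names that hold the results are illustrative assumptions.

```perl
use strict;
use warnings;

my $sentence = "The quick brown fox";

# =~ is true when the pattern is found, !~ is true when it is not.
my $has_lower_the   = ($sentence =~ /the/)   ? 1 : 0;  # capital T, so no match
my $lacks_lower_the = ($sentence !~ /the/)   ? 1 : 0;
my $has_quick       = ($sentence =~ /quick/) ? 1 : 0;

print "has 'the': $has_lower_the\n";
print "lacks 'the': $lacks_lower_the\n";
print "has 'quick': $has_quick\n";
```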

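The Special Variable $_

The conditional

if ($sentence =~ /under/)
{
	print "We're talking about rugby\n";
}

prints its message if $sentence was set by either of the following assignments:

$sentence = "Up and under";
$sentence = "Best winkles in Sunderland";

It is often easier to assign the sentence to the special variable $_ instead. The match then needs no =~ operator at all, and the example above becomes:

if (/under/)
{
	print "We're talking about rugby\n";
}

$_ is the default variable for many Perl operations and is used very heavily.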
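One common way $_ gets set is by a loop that names no loop variable. The sketch below assumes a small list of sample sentences (the third one is made up for contrast) and collects the ones that match.

```perl
use strict;
use warnings;

my @sentences = ("Up and under", "Best winkles in Sunderland", "Over the top");
my @hits;

for (@sentences) {        # each element becomes $_ in turn
    if (/under/) {        # shorthand for: if ($_ =~ /under/)
        push @hits, $_;
        print "We're talking about rugby: $_\n";
    }
}
```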

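More on REs

REs have a large number of special characters. These are what make them powerful, and also what make them look complicated. It is best to build up an RE slowly; writing them well is something of an art.

Here are some special RE characters and their meanings: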

.	# Any single character except a newline
^	# The beginning of the line or string
$	# The end of the line or string
*	# Zero or more of the last character
+	# One or more of the last character
?	# Zero or one of the last character

Here are some example patterns. To use them, enclose them in /.../:

t.e	# t followed by anything followed by e
	# This will match the
	#                 tre
	#                 tle
	# but not te
	#         tale
^f	# f at the beginning of a line
^ftp	# ftp at the beginning of a line
e$	# e at the end of a line
tle$	# tle at the end of a line
und*	# un followed by zero or more d characters
	# This will match un
	#                 und
	#                 undd
	#                 unddd (etc)
.*	# Any string without a newline. This is because
	# the . matches anything except a newline and
	# the * means zero or more of these.
^$	# A line with nothing in it.
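The patterns in this list can be tried against the words in their comments. This is a minimal sketch; the hash and array names, and the extra test word "up", are assumptions made for illustration.

```perl
use strict;
use warnings;

# t.e needs exactly one character between t and e
my %t_dot_e;
for my $word (qw(the tre tle te tale)) {
    $t_dot_e{$word} = ($word =~ /t.e/) ? "match" : "no match";
    print "$word: $t_dot_e{$word}\n";
}

# und* needs "un" followed by zero or more "d" characters
my @und = grep { /und*/ } qw(un und undd up);

# ^$ only matches a line with nothing in it
my $empty_matches = (""  =~ /^$/) ? 1 : 0;
my $blank_matches = (" " =~ /^$/) ? 1 : 0;
```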

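There are further possibilities. Square brackets match any one of the characters inside them. Inside square brackets, "-" means "between" and a leading "^" means "not":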

[qjk]		# Either q or j or k
[^qjk]		# Neither q nor j nor k
[a-z]		# Anything from a to z inclusive
[^a-z]		# No lower case letters
[a-zA-Z]	# Any letter
[a-z]+		# Any non-zero sequence of lower case letters

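What has been covered so far is enough for most purposes; the rest is mainly for reference.

A vertical bar "|" means "or", and parentheses (...) group things together: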

jelly|cream	# Either jelly or cream
(eg|le)gs	# Either eggs or legs
(da)+		# Either da or dada or dadada or...
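Alternation and grouping can be checked the same way. In this sketch the ^ and $ anchors are added so that the whole word has to match; the word lists are assumptions for illustration.

```perl
use strict;
use warnings;

# (eg|le)gs accepts eggs and legs but not begs
my @egg_or_leg = grep { /^(eg|le)gs$/ } qw(eggs legs begs);

# (da)+ repeats the whole group, not just the last character
my @da_runs = grep { /^(da)+$/ } qw(da dada dad dadada);

my $dessert = ("ice cream" =~ /jelly|cream/) ? 1 : 0;

print "egg or leg: @egg_or_leg\n";
print "da runs: @da_runs\n";
```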

Here are some more special characters:

\n		# A newline
\t		# A tab
\w		# Any alphanumeric (word) character.
		# The same as [a-zA-Z0-9_]
\W		# Any non-word character.
		# The same as [^a-zA-Z0-9_]
\d		# Any digit. The same as [0-9]
\D		# Any non-digit. The same as [^0-9]
\s		# Any whitespace character: space,
		# tab, newline, etc
\S		# Any non-whitespace character
\b		# A word boundary, outside [] only
\B		# No word boundary

Characters such as $, |, [, ), \ and / have special meanings in REs. To match one of them literally, precede it with a backslash:

\|		# Vertical bar
\[		# An open square bracket
\)		# A closing parenthesis
\*		# An asterisk
\^		# A caret symbol
\/		# A slash
\\		# A backslash
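The effect of the backslash is easiest to see by comparing an escaped pattern with an unescaped one. A minimal sketch follows; the test strings and variable names are assumptions.

```perl
use strict;
use warnings;

my $bar       = ('a|b'     =~ /a\|b/)     ? 1 : 0;  # literal vertical bar
my $star      = ('2*3'     =~ /2\*3/)     ? 1 : 0;  # literal asterisk
my $no_star   = ('223'     =~ /2\*3/)     ? 1 : 0;  # no asterisk in the text
my $unescaped = ('223'     =~ /2*3/)      ? 1 : 0;  # here 2* means "zero or more 2s"
my $slash     = ('a/b'     =~ /a\/b/)     ? 1 : 0;  # literal slash
my $backslash = ('C:\temp' =~ /C:\\temp/) ? 1 : 0;  # literal backslash

print "$bar $star $no_star $unescaped $slash $backslash\n";
```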

Some Example REs

As mentioned earlier, it is best to build up REs slowly. Here are some examples. To use them, put them inside /.../.

[01]		# Either "0" or "1"
\/0		# A division by zero: "/0"
\/ 0		# A division by zero with a space: "/ 0"
\/\s0		# A division by zero with a whitespace:
		# "/ 0" where the space may be a tab etc.
\/ *0		# A division by zero with possibly some
		# spaces: "/0" or "/ 0" or "/  0" etc.
\/\s*0		# A division by zero with possibly some
		# whitespace.
\/\s*0\.0*	# As the previous one, but with decimal
		# point and maybe some 0s after it. Accepts
		# "/0." and "/0.0" and "/0.00" etc and
		# "/ 0." and "/  0.0" and "/   0.00" etc.
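The last pattern in the list can be exercised against a few inputs. This sketch stores the RE with qr// so it can be reused; the name $div_by_zero and the sample strings are assumptions.

```perl
use strict;
use warnings;

# Slash, optional whitespace, a zero, a decimal point, then any number of zeros
my $div_by_zero = qr/\/\s*0\.0*/;

my %verdict;
for my $text ("/0.", "/ 0.00", "/\t0.0", "/0", "/1.0") {
    $verdict{$text} = ($text =~ $div_by_zero) ? 1 : 0;
}

print "'/0.' matches: $verdict{'/0.'}\n";
print "'/0' matches: $verdict{'/0'}\n";
```

Note that "/0" is rejected because this particular pattern requires the decimal point.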

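Substitution and Translation

As well as finding matches, Perl can make substitutions based on them. This is done with the s function. If no =~ operator is used, the substitution is applied to the $_ variable.

To replace london with London in the string $sentence, use the expression:

$sentence =~ s/london/London/

With the $_ variable this becomes:

s/london/London/

The result of the expression is the number of substitutions made, so here it is either 0 or 1.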

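Options

The example above replaces only the first match. The g (global) option replaces every match:

s/london/London/g

The result is then 0 or the number of substitutions made.

To also replace lOndon, lonDON, LoNDoN and so on, we could write:

s/[Ll][Oo][Nn][Dd][Oo][Nn]/London/g

But there is a simpler way: the i option (ignore case):

s/london/London/gi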
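The difference between a plain substitution and one with the g and i options shows up in both the resulting string and the returned count. A minimal sketch, with an assumed sample string and variable names:

```perl
use strict;
use warnings;

my $text = "london, LONDON and lOnDoN";

# Without options: only the first exact-case match is replaced
my $first_only  = $text;
my $first_count = ($first_only =~ s/london/London/);

# With g and i: every match is replaced, whatever its case
my $all       = $text;
my $all_count = ($all =~ s/london/London/gi);

print "$first_count: $first_only\n";
print "$all_count: $all\n";
```

Working on copies keeps $text unchanged, since s modifies the string it is applied to.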

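Remembering Patterns

It is often useful to remember what was matched so that it can be reused. Whatever matches inside parentheses is remembered in the variables $1,...,$9. Within the same RE or substitution, these strings can be referred to as \1,...,\9:

$_ = "Lord Whopper of Fibbing";
s/([A-Z])/:\1:/g;
print "$_\n";

This replaces each upper-case letter with the same letter surrounded by colons, and prints :L:ord :W:hopper of :F:ibbing. The variables $1,...,$9 are read-only; you cannot assign to them yourself.

As another example, the test

if (/(\b.+\b) \1/)
{
	print "Found $1 repeated\n";
}

detects any repeated word. Each \b represents a word boundary and .+ matches any non-empty string, so \b.+\b matches anything between two word boundaries. The parentheses remember this text as \1 inside the RE and as $1 for the rest of the program.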
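The repeated-word test can be tried on a sentence with a doubled word. The sample sentence and the variable $repeated in this sketch are assumptions.

```perl
use strict;
use warnings;

my $repeated;
$_ = "Paris in the the spring";

if (/(\b.+\b) \1/) {
    $repeated = $1;      # copy $1 before another match overwrites it
    print "Found $repeated repeated\n";
}
```

Because nothing follows \1 in the pattern, a phrase such as "the theory" would also be reported; adding a final \b is one way to tighten it.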

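The following expression swaps the first and last characters of $_:

s/^(.)(.*)(.)$/\3\2\1/

^ and $ match the beginning and end of the line. \1 holds the first character, \2 holds everything between the first and last characters, and \3 holds the last character. The replacement then writes them back with \1 and \3 exchanged.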
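A small sketch of the swap applied to one word. It writes the replacement with $3$2$1 rather than \3\2\1, since Perl warns about backslash-digit references on the replacement side when warnings are enabled; the variable names are assumptions.

```perl
use strict;
use warnings;

my $word    = "perl";
my $swapped = $word;

# First character, middle part, last character, written back in reverse order
$swapped =~ s/^(.)(.*)(.)$/$3$2$1/;

print "$word -> $swapped\n";
```

A string of fewer than two characters is left unchanged, because the pattern needs a separate first and last character.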

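After a match, the special read-only variables $`, $& and $' hold the text before, during and after the match. So after

$_ = "Lord Whopper of Fibbing";
/pp/;

all of the following are true (remember that eq is the string-equality test).

$` eq "Lord Who";
$& eq "pp";
$' eq "er of Fibbing";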
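These three variables can be copied into ordinary variables right after the match. A minimal sketch; the names $before, $matched and $after are assumptions.

```perl
use strict;
use warnings;

my $text = "Lord Whopper of Fibbing";
my ($before, $matched, $after);

if ($text =~ /pp/) {
    # Copy straight away: the next successful match resets all three
    ($before, $matched, $after) = ($`, $&, $');
}

print "[$before][$matched][$after]\n";
```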

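Variables can be used inside a substitution, so

$search = "the";
s/$search/xxx/g;

replaces every occurrence of the with xxx. To replace there instead, you cannot write s/$searchre/xxx/, because Perl would read that as the variable $searchre. Put the variable name in curly braces:

$search = "the";
s/${search}re/xxx/;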
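Both forms can be compared on one sample string. The string and the result variables in this sketch are assumptions.

```perl
use strict;
use warnings;

my $search = "the";
my $text   = "the cat sat there";

# Plain interpolation: every "the" is replaced, including the one inside "there"
my $all = $text;
$all =~ s/$search/xxx/g;

# Braces mark where the variable name ends, so the pattern is "there"
my $only_there = $text;
$only_there =~ s/${search}re/xxx/;

print "$all\n$only_there\n";
```

Under use strict, writing $searchre by mistake is reported at compile time, because no such variable has been declared.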

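Translation

The tr function performs character-by-character translation. The following expression replaces each a with e, each b with d, and each c with f in the variable $sentence. The expression returns the number of characters translated.

$sentence =~ tr/abc/edf/

Most of the special RE codes do not apply in tr. For example, the statement below counts the asterisks in $sentence and stores the count in $count.

$count = ($sentence =~ tr/*/*/);

However, "-" still means "between". The following statement converts $_ to upper case:

tr/a-z/A-Z/;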
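The three uses of tr described here can be sketched together. The sample strings and variable names are assumptions.

```perl
use strict;
use warnings;

# Character-by-character mapping: a->e, b->d, c->f
my $mapped      = "cabbage";
my $changed     = ($mapped =~ tr/abc/edf/);

# Counting: replacing * with itself leaves the string alone but returns the count
my $sentence    = "a*b*c";
my $count       = ($sentence =~ tr/*/*/);

# Ranges still work in tr
my $upper       = "hello world";
$upper =~ tr/a-z/A-Z/;

print "$mapped ($changed changed), $count stars, $upper\n";
```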
