Learning Perl: 8.8. Precedence

8.8. Precedence

With all of these metacharacters in regular expressions, you may feel you can't keep track of the players without a scorecard. That's the precedence chart, which shows us which parts of the pattern stick together the most tightly. Unlike the precedence chart for operators, the regular expression precedence chart is simple, with only four levels. As a bonus, this section will review all of the metacharacters that Perl uses in patterns.

  1. At the top of the precedence chart are the parentheses, ( ), used for grouping and memory. Anything in parentheses will stick together more tightly than anything else.

  2. The second level is the quantifiers. These are the repeat operators: star (*), plus (+), and question mark (?), as well as the quantifiers made with curly braces, like {5,15}, {3,}, and {5}. These always stick to the item they're following.

  3. The third level of the precedence chart holds anchors and sequence. The anchors are the caret (^) start-of-string anchor, the dollar-sign ($) end-of-string anchor, the (\b) word-boundary anchor, and the (\B) nonword-boundary anchor. Sequence (putting one item after another) is actually an operator, even though it doesn't use a metacharacter. That means that letters in a word will stick together just as tightly as the anchors stick to the letters.

  4. The lowest level of precedence is the vertical bar (|) of alternation. Since this is at the bottom of the chart, it effectively cuts the pattern into pieces. It's at the bottom of the chart because we want the letters in the words in /fred|barney/ to stick together more tightly than the alternation. If alternation were higher priority than sequence, that pattern would mean to match fre, followed by a choice of d or b and by arney. So, alternation is at the bottom of the chart, and the letters within the names stick together.
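
The levels above can be checked with a few one-line matches. This is only a small sketch: the test strings and variable names are made up for illustration, and each result is stored as 1 or 0 so it is easy to print.

```perl
use strict;
use warnings;

# Quantifiers stick to the single item before them: in /fred+/ the plus
# applies only to the d; in /(fred)+/ it applies to the whole group.
my $tight   = "fredddd"  =~ /^fred+$/   ? 1 : 0;   # the d repeats
my $loose   = "fredfred" =~ /^fred+$/   ? 1 : 0;   # "fred" itself does not
my $grouped = "fredfred" =~ /^(fred)+$/ ? 1 : 0;   # the group repeats

# Alternation is lowest: fred|barney means "fred" or "barney".
my $as_written = "fredarney" =~ /^(fred|barney)$/ ? 1 : 0;

# What the pattern would mean if alternation outranked sequence:
# fre, then d or b, then arney.
my $if_higher = "fredarney" =~ /^fre(d|b)arney$/ ? 1 : 0;

print "tight=$tight loose=$loose grouped=$grouped ",
      "as_written=$as_written if_higher=$if_higher\n";
```

Only the last pattern accepts fredarney, which is the reading the chart rules out for /fred|barney/.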

Besides the precedence chart, there are the so-called atoms that make up the most basic pieces of the pattern. These are the individual characters, character classes, and backreferences.

8.8.1. Examples of Precedence

When you need to decipher a complex regular expression, you'll need to do as Perl does and use the precedence chart to see what's going on.

For example, /^fred|barney$/ is probably not what the programmer intended. That's because the vertical bar of alternation is low precedence; it cuts the pattern in two. That pattern matches either fred at the beginning of the string or barney at the end. It's much more likely that the programmer wanted /^(fred|barney)$/, which will match if the whole line has nothing but fred or nothing but barney.[*] And what will /(wilma|pebbles?)/ match? The question mark applies to the previous character,[†] so that will match wilma, pebbles, or pebble, perhaps as part of a larger string (since there are no anchors).

[*] And, perhaps, a newline at the end of the string, as we mentioned earlier in connection with the $ anchor.

[†] Because a quantifier sticks to the letter s more tightly than the s sticks to the other letters in pebbles.
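
To see the difference between the two anchored patterns, it can help to run both against the same list of strings. The strings below are hypothetical samples chosen to separate the two patterns; they are not from any real data.

```perl
use strict;
use warnings;

# Hypothetical sample lines.
my @lines = ("fred", "barney", "frederick", "big barney", "wilma");

# Without parentheses: "fred at the start" OR "barney at the end".
my @loose  = grep { /^fred|barney$/ } @lines;

# With parentheses: the whole line must be fred or barney.
my @strict = grep { /^(fred|barney)$/ } @lines;

# The question mark applies only to the s, so pebble also matches.
my @stone_age = grep { /(wilma|pebbles?)/ } ("wilma", "pebble", "pebbles", "pebb");

print "loose:     @loose\n";       # fred barney frederick big barney
print "strict:    @strict\n";      # fred barney
print "stone age: @stone_age\n";   # wilma pebble pebbles
```

The unparenthesized pattern accepts frederick and big barney, which is the behavior the programmer most likely did not want.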

The pattern /^(\w+)\s+(\w+)$/ matches lines that have a "word," some required whitespace, and another "word," with nothing else before or after. That might be used to match lines like fred flintstone, for example. The parentheses around the words aren't needed for grouping, so they may be intended to save those substrings into the regular expression memories.
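
As a quick illustration, here is that pattern applied to the sample line from the text. The variable names, and the second input with a trailing jr, are assumptions added for the example.

```perl
use strict;
use warnings;

my $line = "fred flintstone";   # sample line from the text

# In list context, a successful match returns the memories in order.
my ($first, $second) = $line =~ /^(\w+)\s+(\w+)$/;
print "first word: $first, second word: $second\n";

# The anchors allow nothing before or after the two words,
# so a third word makes the match fail.
my $extra = "fred flintstone jr" =~ /^(\w+)\s+(\w+)$/ ? "match" : "no match";
print "with a third word: $extra\n";
```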

When you're trying to understand a complex pattern, it may be helpful to add parentheses to clarify the precedence. That's okay, but remember that grouping parentheses are also automatically memory parentheses; you may need to change the numbering of other memories when you add the parentheses.[‡]

[‡] But look in the perlre manpage for information about nonmemory parentheses, which are used for grouping without memory.
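
The renumbering is easy to see by capturing the memories into a list. In this sketch the sample line and array names are made up; the (?: ) form is the nonmemory grouping that perlre documents.

```perl
use strict;
use warnings;

my $line = "fred flintstone";   # hypothetical sample line

# Original pattern: two memories, the second word is memory 2.
my @plain = $line =~ /^(\w+)\s+(\w+)$/;

# Parentheses added around \s+ for clarity create a new memory,
# so the second word moves from memory 2 to memory 3.
my @shifted = $line =~ /^(\w+)(\s+)(\w+)$/;

# (?: ) groups without saving, so the numbering stays as it was.
my @grouped = $line =~ /^(\w+)(?:\s+)(\w+)$/;

print "plain:   ", scalar(@plain),   " memories, last is '$plain[-1]'\n";
print "shifted: ", scalar(@shifted), " memories, last is '$shifted[-1]'\n";
print "grouped: ", scalar(@grouped), " memories, last is '$grouped[-1]'\n";
```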

8.8.2. And There's More

Though we've covered all of the regular expression features that most people are likely to need for everyday programming, there are more features. A few are covered in the Alpaca book. Also check the perlre, perlrequick, and perlretut manpages for more information about what patterns in Perl can do.[§]

[§] And check out YAPE::Regex::Explain in CPAN as a regular-expression-to-English translator.
