Regex in Ruby: The Very Basics

If you’re new to Ruby (or any programming language), you may have come across these strange bits of code when searching Stack Overflow for answers to life’s greatest questions, like how to count the number of sentences in a string:

string.strip.split(/\w[?!.]/)

You may be wondering — what is this code within a code?


It’s called Regex — short for Regular Expressions! It’s not unique to Ruby (many languages have some form of regex), but we’ll focus on Ruby’s version here.


Regex code is used for specifying a certain search pattern of characters to be matched in a string (like finding words that end with -ing, or places in a string where a space follows a punctuation mark). Once we use this search pattern, we can pull out those matches or manipulate them in some way.

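
To make the sentence-counting snippet from the introduction concrete, here is a small sketch of what it does to a sample string. The sample text and variable names are made up for illustration. Notice that split throws away whatever the pattern matched, so the last letter of each sentence disappears along with the punctuation. The scan line shows one possible alternative that keeps the sentences whole (scan is covered later in this article).

```ruby
string = "  Hello there. How are you? Fine!  "

# split cuts the string wherever the pattern matches and drops the match,
# which here is a word character plus the punctuation mark after it.
pieces = string.strip.split(/\w[?!.]/)
sentence_count = pieces.length

# One alternative (an assumption, not from the original snippet):
# collect runs of non-punctuation followed by one punctuation mark.
sentences = string.strip.scan(/[^?!.]+[?!.]/)

p pieces          # ["Hello ther", " How are yo", " Fin"]
p sentence_count  # 3
p sentences       # ["Hello there.", " How are you?", " Fine!"]
```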

An example of a real-world use for regex would be validating that an email address is entered correctly (it has a string of characters before an @ sign, and another string before a .com or some variation), but there are countless ways you may find it useful in your own code!
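
As a rough illustration of the email check described above, here is a minimal sketch. The pattern and the constant name EMAIL_SHAPE are assumptions made for this example. It only checks the general shape (something, an @ sign, something, a dot, something) and is not a complete email validator. The individual pieces of the pattern are explained in the sections that follow.

```ruby
# Hypothetical pattern: characters that are not @ or whitespace, an @ sign,
# more such characters, a literal dot, then a final run of such characters.
EMAIL_SHAPE = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/

# match? returns true or false without building a MatchData object.
valid       = "jane@example.com".match?(EMAIL_SHAPE)
missing_at  = "jane.example.com".match?(EMAIL_SHAPE)
missing_dot = "jane@example".match?(EMAIL_SHAPE)

puts valid        # true
puts missing_at   # false
puts missing_dot  # false
```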

Regex Decoded

First things first, regex code goes between two forward slashes to differentiate it from the rest of your code:


/ *fancy code stuff* /

Now, the interesting part — there are a few major types of regular expressions that can go inside these slashes:


1. Anchors

^     Start of a line
$     End of a line
\A    Start of a string
\z    End of a string
\Z    End of a string, or just before a final newline
\b    Any word boundary

Ruby has no separate start-of-word (\<) or end-of-word (\>) anchors; \b covers both.

Anchors tell the search where to start or stop. For example:


/\A *fancy code stuff* /

says “match ___ (whatever you put next in the regex) only at the very start of the string.”

*Notice the clever way \A (the first letter of the alphabet) denotes the start of a string, and \z (the last letter) denotes the end of a string*

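
To see how the line anchors differ from the string anchors, here is a small sketch using a made-up two-line string. It uses \w+ (one or more word characters, covered in the sections below) so that there is something to match after each anchor.

```ruby
text = "first line\nsecond line"

# ^ and $ match at the start and end of every line.
line_starts = text.scan(/^\w+/)
line_ends   = text.scan(/\w+$/)

# \A and \z match only at the start and end of the whole string.
string_start = text.scan(/\A\w+/)
string_end   = text.scan(/\w+\z/)

p line_starts   # ["first", "second"]
p line_ends     # ["line", "line"]
p string_start  # ["first"]
p string_end    # ["line"]
```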

2. Groups and ranges


Groups and ranges can include numbers or letters, and tell the search what characters you’re looking to match:


[abc]         Any single character a, b, or c
[^abc]        Any single character except a, b, or c
[a-x]         Any lowercase letter from a through x
[A-T]         Any uppercase letter from A through T
[a-zA-Z]      Any letter, lowercase or uppercase
[0-7]         Any digit from 0 through 7
(a|b)         Either a or b

So if we had:


/\b[a-m]/

…the search is saying: “go to each word boundary, and match when the next character is a lowercase letter from a through m”. Note that there are no spaces inside the slashes: in Ruby, a space in a regex is matched literally.
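
Here is a short sketch of ranges and groups in action. The sample strings and variable names are invented for illustration.

```ruby
sentence = "a quick brown fox jumps over the lazy dog"

# A word boundary followed by one lowercase letter from a through m.
first_letters = sentence.scan(/\b[a-m]/)

# A character set, and the same set negated with ^.
vowels     = "regex".scan(/[aeiou]/)
non_vowels = "regex".scan(/[^aeiou]/)

# Alternation: either cat or dog.
animals = "cat, dog, bird".scan(/cat|dog/)

p first_letters  # ["a", "b", "f", "j", "l", "d"]
p vowels         # ["e", "e"]
p non_vowels     # ["r", "g", "x"]
p animals        # ["cat", "dog"]
```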

3. Character Classes


Character classes are almost the same as groups and ranges, but they denote an entire class:


.        Any single character
\s       Any whitespace character
\w       Any word character
\W       Any non-word character
\d       Any digit
\D       Any non-digit

…so if you wanted to find any digit followed by a white space:

/\d\s/

Character classes can also be combined with literal text. For instance, suppose you wanted to match words that end with “ring”. Using /[ring]/ would not work: square brackets match a single character, so it would find any one r, i, n, or g. Instead, we put the letters directly inside the forward slashes, and add a plus sign (a quantifier meaning “one or more”) after \w:

/\w+ring/

…this example matches one or more word characters followed by ring (so bring and string would both match, but ring and ringing would not!).
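
The two patterns from this section can be tried out like this. The sample strings are made up. The last two lines are an added observation rather than something from the examples above: without a trailing \b, the pattern can also match the front part of a longer word.

```ruby
# Any digit followed by a whitespace character.
digit_then_space = "room 4 on floor 12 b".scan(/\d\s/)

# One or more word characters followed by the letters ring.
ring_words = "bring the string, ring the bell, keep ringing".scan(/\w+ring/)

# Without \b the pattern matches inside a longer word.
inside_word = "bringing".scan(/\w+ring/)
whole_word  = "bringing".scan(/\w+ring\b/)

p digit_then_space  # ["4 ", "2 "]
p ring_words        # ["bring", "string"]
p inside_word       # ["bring"]
p whole_word        # []
```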

4. Quantifiers

Quantifiers denote how many instances of something you want to match:


a?          Zero or one of a
d*          Zero or more of d
m+          One or more of m
!{3}        Exactly 3 of !
5{3,}       3 or more of 5
p{3,6}      Between 3 and 6 of p

Quantifiers work for any characters, so if you saw this:


/[!.]{3}/

…it’s searching for three characters in a row that are each an exclamation point or a period, like !!! or ... (a mix such as !.! would match too).
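
A few quantifiers side by side, as a sketch with invented sample strings:

```ruby
# Exactly three characters in a row, each a ! or a period.
triple_marks = "Wait... what?! No way!!!".scan(/[!.]{3}/)

# ? makes the u optional: zero or one of it.
spellings = "color colour".scan(/colou?r/)

# {3,} means three or more of the preceding character.
long_runs = "aa aaa aaaa".scan(/a{3,}/)

# {2,3} means two to three digits; \b keeps longer numbers out.
short_numbers = "1 22 333 4444".scan(/\b\d{2,3}\b/)

p triple_marks   # ["...", "!!!"]
p spellings      # ["color", "colour"]
p long_runs      # ["aaa", "aaaa"]
p short_numbers  # ["22", "333"]
```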

5. Pattern modifiers

Pattern modifiers are unique in that they go outside of the forward slashes, right after the closing slash, and apply to the entire expression:

/i    Case insensitive
/x    Ignore whitespace inside the pattern

…so if we wanted to find any string that starts with the letters k-s, either lower- or uppercase:

/\A[k-s]/i
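
Here is a quick sketch of both modifiers. The variable names and sample strings are assumptions for illustration. With /x, whitespace in the pattern is ignored and # starts a comment, which lets you spread a pattern over several lines.

```ruby
# /i: the first letter may be k through s in either case.
starts_k_to_s = /\A[k-s]/i

ruby_upper = "Ruby".match?(starts_k_to_s)
ruby_lower = "ruby".match?(starts_k_to_s)
java       = "Java".match?(starts_k_to_s)

# /x: the same kind of pattern, laid out with spacing and comments.
year_month = /
  \A
  \d{4}   # four digits for the year
  -       # a literal hyphen
  \d{2}   # two digits for the month
  \z
/x

well_formed = "2024-05".match?(year_month)
with_spaces = "2024 - 05".match?(year_month)

puts ruby_upper   # true
puts ruby_lower   # true
puts java         # false
puts well_formed  # true
puts with_spaces  # false
```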

Using Regex in Code

Now that you know how to interpret regex, you’re ready to actually use it in your code! These are a few basic methods that work well with regex. To use them, pass the regex in parentheses after the method name.

1. .scan

.scan will return an array of all the items in your string that match the given regex. So:


"Rain water washing down the drain".scan(/\w+ain/)

…would return the array

#=> ["Rain", "drain"]

2. .match


.match is nearly identical to .scan, but only returns the first instance that it matches:


"Rain water washing down the drain".match(/\w+ain/)

…would return

#=> #<MatchData "Rain">

*Note that .match returns a MatchData object, not just the string, or nil when nothing matches. Because nil is falsy, .match is often used in a condition to quickly test whether a string contains a specified pattern.
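
As a sketch of working with the returned object: the matched text and its position can be read from the MatchData, and parentheses in the pattern create numbered groups. The pet string reuses the format from the .grep example further down, and the variable names are made up. When only a yes-or-no answer is needed, String#match? returns a real boolean.

```ruby
result = "Rain water washing down the drain".match(/\w+ain/)
matched_text = result[0]        # the text of the first match
position     = result.begin(0)  # index where that match starts

# No match gives nil rather than an empty object.
no_match = "Sunny day".match(/\w+ain/)

# match? answers true or false directly.
has_rain = "Rain water".match?(/\w+ain/)

# Parentheses capture parts of the match as numbered groups.
pet = "Barkley the dog".match(/(\w+) the (\w+)/)
name    = pet[1]
species = pet[2]

p matched_text  # "Rain"
p position      # 0
p no_match      # nil
p has_rain      # true
p name          # "Barkley"
p species       # "dog"
```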

3. .grep


.grep is similar to .scan, but it is an Enumerable method: called on a collection such as an array, it returns the elements that match the regex.

If you had an array of strings that listed pets and their species, and wanted to return those whose names were exactly 4 letters:

pets = ["Barkley the dog", "Spot the dog", "Whiskers the cat", "Kiwi the bird"]
pets.grep(/^\w{4}\s/)

…would look at: the start of each string (^) , followed by any 4 word characters (\w{4}), then a white space (\s). The returned array would be:


#=> ["Spot the dog", "Kiwi the bird"]
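
The same array can be filtered in other ways too. As a sketch, this reruns the example above, then selects the dogs with an end-of-string anchor, and finally uses grep_v (the inverse of grep) to keep everything that does not match. The variable names are made up for illustration.

```ruby
pets = ["Barkley the dog", "Spot the dog", "Whiskers the cat", "Kiwi the bird"]

# Names of exactly four word characters, followed by whitespace.
four_letter_names = pets.grep(/^\w{4}\s/)

# Strings that end with the word dog.
dogs = pets.grep(/dog\z/)

# grep_v returns the elements that do NOT match.
not_dogs = pets.grep_v(/dog\z/)

p four_letter_names  # ["Spot the dog", "Kiwi the bird"]
p dogs               # ["Barkley the dog", "Spot the dog"]
p not_dogs           # ["Whiskers the cat", "Kiwi the bird"]
```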

There are a lot of very complex-looking regex snippets you’ll come across when searching for ways to simplify your code, but knowing the basics will make this foreign-looking (Ruby) language much more straightforward!

A great place to test out your own regex is https://regex101.com/ . Once you’re feeling more adventurous, you can head over to https://alf.nu/RegexGolf and try to pass the challenges in as small an expression as possible.


Additional resources:


https://catarak.github.io/blog/2014/10/13/ruby-regular-expressions/


https://ruby-doc.org/stdlib-2.6.1/libdoc/strscan/rdoc/StringScanner.html


Source: https://medium.com/@melindadiaz_75942/regex-in-ruby-the-very-basics-7343af05c60c
