A Worked Example with Regular Expressions

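This article works through a problem about a language invented by Petya and explores how regular expressions can be used for grammar checking. The problem describes the word rules of Petya's language and gives sample inputs and outputs. The author encourages readers to consider whether regular expressions are really necessary here and what pitfalls may arise, and provides a code implementation.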

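Hello, reader. The problem below is one I came across while learning Python and found rather instructive. It needs little in the way of algorithms (beginners can relax), only a rough familiarity with regular expressions. I hope that after reading this article you will understand regular expressions better and apply them more confidently.
Without further ado, here is the problem: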

A grammar lesson

Petya got interested in grammar on his third year in school. He invented his own language called Petya’s. Petya wanted to create a maximally simple language that would be enough to chat with friends, that’s why all the language’s grammar can be described with the following set of rules:

There are three parts of speech: the adjective, the noun, the verb. Each word in his language is an adjective, noun or verb.

There are two genders: masculine and feminine. Each word in his language has gender either masculine or feminine.

*Masculine adjectives end with -lios, and feminine adjectives end with -liala.*

*Masculine nouns end with -etr, and feminine nouns end with -etra.*

*Masculine verbs end with -initis, and feminine verbs end with -inites.*

Thus, each word in the Petya’s language has one of the six endings, given above. There are no other endings in Petya’s language.

It is accepted that the whole word consists of an ending. That is, words “lios”, “liala”, “etr” and so on belong to the Petya’s language.

There aren’t any punctuation marks, grammatical tenses, singular/plural forms or other language complications.

A sentence is either exactly one valid language word or exactly one statement.

A statement is any sequence of words of Petya’s language that satisfies both conditions:

Words in statement follow in the following order (from the left to the right): zero or more adjectives followed by exactly one noun followed by zero or more verbs.

All words in the statement should have the same gender.

After Petya’s friend Vasya wrote instant messenger (an instant messaging program) that supported the Petya’s language, Petya wanted to add spelling and grammar checking to the program. As Vasya was in the country and Petya didn’t feel like waiting, he asked you to help him with this problem. Your task is to define by a given sequence of words, whether it is true that the given text represents exactly one sentence in Petya’s language.

## Input

The first line contains one or more words consisting of lowercase Latin letters. The overall number of characters (including letters and spaces) does not exceed 10^5.

It is guaranteed that any two consecutive words are separated by exactly one space and the input data do not contain any other spaces. It is possible that given words do not belong to the Petya’s language.

## Output

If some word of the given text does not belong to the Petya’s language or if the text contains more than one sentence, print “NO” (without the quotes). Otherwise, print “YES” (without the quotes).

### input

petr

### output

YES

### input

etis atis animatis etis atis amatis

### output

NO

### input

nataliala kataliala vetra feinites

### output

YES
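(If you need the 12 concrete test cases, feel free to message me privately.)
After reading all that, did you feel the urge to relearn English? ^ v ^ (I hope not.)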

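Before looking at my code, I hope you will think about the following questions:
1. Does this problem really need regular expressions? If not, how else could I solve it?
2. What advantages do regular expressions offer for this problem?
3. If I were writing the solution, how would I use regular expressions?
4. Does the problem contain pitfalls? Where could things go wrong?
The code is as follows: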

```python
import re


def isPetyaLanguage():
    line = input().strip()
    word = r"[a-z]*(?:lios|liala|etr|etra|initis|inites)"
    if re.fullmatch(word, line) is not None:
        print("YES")
        return
    masculine = re.fullmatch(r"(?:[a-z]*lios )*[a-z]*etr(?: [a-z]*initis)*", line)
    feminine = re.fullmatch(r"(?:[a-z]*liala )*[a-z]*etra(?: [a-z]*inites)*", line)
    if masculine is None and feminine is None:
        print("NO")
    else:
        print("YES")


if __name__ == '__main__':
    isPetyaLanguage()
```

I don't know how you felt reading my source code. If you are an expert, you will surely see room for improvement (discussion is welcome in the comments); if you have some grounding in regular expressions, you may smile knowingly.
Now take a look at the annotated version below and see whether you had already thought of the issues I raise in it.

```python
import re


def isPetyaLanguage():
    # Keep the spaces in the line: joining the words would erase word
    # boundaries, and "etr a" would then look like the valid word "etra".
    line = input().strip()
    # Check whether the input is a single valid word. Pitfall: a lone word
    # is a sentence whatever its part of speech, so a single verb is fine.
    word = r"[a-z]*(?:lios|liala|etr|etra|initis|inites)"
    if re.fullmatch(word, line) is not None:
        print("YES")
        return
    # The word prefix is [a-z]*, not .*. Think about it: why would .* fail?
    # Answer: .* can match a space, so an invalid word sitting between two
    # valid ones would be absorbed into its neighbor's prefix and misjudged.
    masculine = re.fullmatch(r"(?:[a-z]*lios )*[a-z]*etr(?: [a-z]*initis)*", line)
    feminine = re.fullmatch(r"(?:[a-z]*liala )*[a-z]*etra(?: [a-z]*inites)*", line)
    # fullmatch returns None unless the pattern covers the whole line, so
    # compare against None. Calling .group() on None would raise
    # AttributeError: 'NoneType' object has no attribute 'group'.
    if masculine is None and feminine is None:
        print("NO")
    else:
        print("YES")


if __name__ == '__main__':
    isPetyaLanguage()
```
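As for the first question (can the problem be solved without regular expressions?), here is a minimal sketch of one possible approach using only `str.endswith`. The names `ENDINGS`, `classify`, and `is_sentence` are my own illustrative choices, not part of the original solution, and the sketch has not been run against the official tests.

```python
# Hypothetical helper names; a sketch of a regex-free check.
ENDINGS = {
    "lios": ("adjective", "m"),
    "liala": ("adjective", "f"),
    "etr": ("noun", "m"),
    "etra": ("noun", "f"),
    "initis": ("verb", "m"),
    "inites": ("verb", "f"),
}


def classify(word):
    """Return (part_of_speech, gender) for a valid word, or None."""
    for ending, tag in ENDINGS.items():
        if word.endswith(ending):
            return tag
    return None


def is_sentence(line):
    """Check whether the line is exactly one sentence of Petya's language."""
    tags = [classify(w) for w in line.split()]
    if not tags or None in tags:
        return False  # empty input or at least one invalid word
    if len(tags) == 1:
        return True  # a single valid word is always a sentence
    if len({gender for _, gender in tags}) != 1:
        return False  # mixed genders
    parts = [part for part, _ in tags]
    if parts.count("noun") != 1:
        return False  # a statement needs exactly one noun
    i = parts.index("noun")
    # Adjectives only before the noun, verbs only after it.
    return (all(p == "adjective" for p in parts[:i])
            and all(p == "verb" for p in parts[i + 1:]))


print("YES" if is_sentence("nataliala kataliala vetra feinites") else "NO")
```

No ending is a suffix of another, so the order in which `classify` tries them does not matter. Compared with the regular-expression version, this is longer but makes each rule of the problem statement visible as a separate check.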

The difficulty of this problem lies not in logical reasoning but in rigor and in careful reading of the English problem statement.
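To make the point about rigor concrete, the short sketch below shows two easy ways to get a wrong "YES". The test strings `"liosx"` and `"etr a"` are examples I made up for illustration; they are not taken from the official tests.

```python
import re

pattern = r"[a-z]*lios"

# re.match anchors only at the start, so trailing junk goes unnoticed.
prefix_only = re.match(pattern, "liosx") is not None

# re.fullmatch requires the pattern to cover the entire string.
whole_word = re.fullmatch(pattern, "liosx") is not None

# Joining words without spaces erases word boundaries: the invalid
# word "a" fuses with "etr" and the result looks like a valid noun.
joined = "".join("etr a".split())
joined_ok = re.fullmatch(r"[a-z]*etra", joined) is not None

print(prefix_only, whole_word, joined, joined_ok)  # True False etra True
```

Both cases should be answered "NO", which is why the whole line, spaces included, needs to be matched from start to end.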
I hope you got something out of this article (if you enjoyed it, a like, a favorite, and a follow would be appreciated).
peace
