How to Build a Long Regular Expression

cxygs5788

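When you use regular expressions to extract text, you usually start with a simple expression and extend it as you learn more about the target text. Eventually you end up with a long string of special symbols that is hard to read and nearly impossible to improve.

We need to create "smart" regular expressions. Instead of writing one single-line expression, we prepare a multi-line text of named patterns and generate the long expression from it. Here is a simple example.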


space                    [\s/-]+
word                     \w+
words                    (?:{word}{space})*?{word}
birthday                 (?<birthday>\d+\.\d+\.\d+)
title                    {word}\.
name                     {words}
person                   {title}{space}{name}{space}{birthday}

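This text has two columns separated by whitespace. The first column is the pattern name; the second is an easy-to-read regular expression that can refer to other patterns by name in braces. The resulting regular expression for the pattern "person" is (shown without the `(?#name)` comments that the class below inserts):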


\w+\.[\s/-]+(?:\w+[\s/-]+)*?\w+[\s/-]+(?<birthday>\d+\.\d+\.\d+)
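
To see what the expanded pattern captures, here is a minimal sketch that applies it to one line of text. The class name `PersonPatternDemo`, the method `ExtractBirthday`, and the sample input are assumptions made for illustration; the pattern is the expanded "person" expression with the dots in the date escaped.

```csharp
using System.Text.RegularExpressions;

public static class PersonPatternDemo
{
    // Expanded "person" pattern: title, separator, name words, separator, birthday.
    public const string Person =
        @"\w+\.[\s/-]+(?:\w+[\s/-]+)*?\w+[\s/-]+(?<birthday>\d+\.\d+\.\d+)";

    // Returns the text captured by the named group "birthday",
    // or null when the input does not match the pattern.
    public static string ExtractBirthday(string text)
    {
        Match m = Regex.Match(text, Person);
        return m.Success ? m.Groups["birthday"].Value : null;
    }
}
```

For the hypothetical input `Dr. John Smith 12.05.1970`, `ExtractBirthday` returns `12.05.1970`. The lazy `(?:\w+[\s/-]+)*?` group first tries to treat `John` as the whole name, fails to find a date after it, and then extends the name by one word.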

You can generate it with the following class:


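using System;
using System.Collections.Specialized;
using System.IO;
using System.Text.RegularExpressions;

public class Lexer
{
    // Maps each pattern name to its unexpanded template.
    private NameValueCollection col;

    // Matches a {name} reference but skips Unicode categories such as \p{L}.
    private static readonly Regex reference = new Regex(@"(?<!\\[pP])\{([a-zA-Z]\w*)\}");

    public Lexer()
    {
        col = new NameValueCollection();
    }

    // Parses the definition text: one "name  template" pair per line.
    public static Lexer Create(string resource)
    {
        Lexer lex = new Lexer();
        using (StringReader sr = new StringReader(resource))
        {
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                Match m = Regex.Match(line, @"^\s*(\w+)\s+(.*)$");
                if (m.Success)
                {
                    // The indexer overwrites; Add() would append to an existing name.
                    lex.col[m.Groups[1].Value] = m.Groups[2].Value.Trim();
                }
            }
        }
        return lex;
    }

    // Expands the named template, recursively replacing every {name} reference.
    // Circular references are not detected and would recurse without end.
    public string GetExpression(string name)
    {
        if (string.IsNullOrEmpty(name)) return string.Empty;
        string res = col[name];
        if (res == null) throw new ArgumentException("Template not found: " + name, "name");
        // A template containing alternation must be grouped before it is embedded.
        bool needGroup = res.IndexOf('|') >= 0;
        string result = reference.Replace(res, m => GetExpression(m.Groups[1].Value));
        if (needGroup)
        {
            result = "(?:" + result + ")";
        }
        // Prefix an inline comment so each part of the final expression is labeled.
        return "(?#" + name + ")" + result;
    }
}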
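
The `needGroup` check in `GetExpression` matters because alternation has the lowest precedence in a regular expression. The sketch below shows the difference with a hypothetical template `Mr|Mrs|Dr` followed by a literal dot; the class `GroupingDemo` and its methods are illustrative assumptions, not part of the original class.

```csharp
using System.Text.RegularExpressions;

public static class GroupingDemo
{
    // Hypothetical template containing alternation (not one of the definitions above).
    public const string Title = @"Mr|Mrs|Dr";

    // Embeds the template between ^ and \.$, optionally inside a non-capturing group.
    public static string Embed(bool group)
    {
        string body = group ? "(?:" + Title + ")" : Title;
        return "^" + body + @"\.$";
    }

    // Tests the text against the embedded pattern.
    public static bool Matches(string text, bool group)
    {
        return Regex.IsMatch(text, Embed(group));
    }
}
```

Without the group the pattern is `^Mr|Mrs|Dr\.$`, which reads as "`Mr` at the start, or `Mrs` anywhere, or `Dr.` at the end", so `Matches("Mr Smith", false)` is true. With the group the pattern is `^(?:Mr|Mrs|Dr)\.$`, and only a title followed by a dot, such as `Mrs.`, matches.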

Then we can create an instance of the class and get the regular expression:


Lexer lex = Lexer.Create(txtLexerText.Text);
string expr = lex.GetExpression("person");
Regex reg = new Regex(expr);

From: https://bytes.com/topic/net/insights/729580-how-build-long-regular-expression
