regex.h on Linux: key points and a usage example


View the manual page: man regex.h

Locate the header: find / -name regex.h 2>/dev/null


<regex.h>(P)               POSIX Programmer’s Manual              <regex.h>(P)



PROLOG
       This  manual  page is part of the POSIX Programmer’s Manual.  The Linux
       implementation of this interface may differ (consult the  corresponding
       Linux  manual page for details of Linux behavior), or the interface may
       not be implemented on Linux.

NAME
       regex.h - regular expression matching types

SYNOPSIS
       #include <regex.h>

DESCRIPTION
       The <regex.h> header shall define the structures and symbolic constants
       used  by the regcomp(), regexec(), regerror(), and regfree() functions.

       The structure type 【regex_t】 shall contain at least the following member:


              size_t    re_nsub    Number of parenthesized subexpressions.

       The type size_t shall be defined as described in <sys/types.h> .
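
A small sketch of how re_nsub can be read after a successful regcomp(). The helper name count_groups is made up for illustration and is not part of regex.h. A common use of this value is sizing the regmatch_t array as re_nsub + 1, so that it holds the whole match plus every group.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns the number of parenthesized groups in an
 * extended regular expression, or -1 if the pattern does not compile. */
long count_groups(const char *pattern)
{
    regex_t re;
    long n;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    n = (long)re.re_nsub;   /* filled in by a successful regcomp() */
    regfree(&re);
    return n;
}
```

For the pattern used in the sample program at the end of this post, "([a-z]+)[ \t]([a-z]+)", this helper should return 2.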

       The  type  regoff_t  shall be defined as a signed integer type that can
       hold the largest value that can be stored in either  a  type  off_t  or
       type  ssize_t. The structure type regmatch_t shall contain at least the
       following members:


              regoff_t    rm_so    Byte offset from start of string
                                   to start of substring.
              regoff_t    rm_eo    Byte offset from start of string of the
                                   first character after the end of substring.
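
The two offsets describe a half-open range [rm_so, rm_eo), so the match length is rm_eo - rm_so. A minimal sketch of extracting the matched text follows; first_match is a hypothetical helper written for this post, and it assumes the pattern is an extended regular expression.

```c
#include <regex.h>
#include <string.h>

/* Hypothetical helper: finds the first match of an ERE `pattern` in `text`
 * and copies the matched substring into `out` (truncating if needed).
 * Returns the match length in bytes, or -1 on no match or bad pattern. */
long first_match(const char *pattern, const char *text, char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[1];
    long len = -1;

    if (outsz == 0 || regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, text, 1, m, 0) == 0) {
        /* rm_eo points one byte past the end of the match */
        size_t n = (size_t)(m[0].rm_eo - m[0].rm_so);
        len = (long)n;
        if (n >= outsz)
            n = outsz - 1;
        memcpy(out, text + m[0].rm_so, n);
        out[n] = '\0';
    }
    regfree(&re);
    return len;
}
```

Calling first_match("[0-9]+", "abc 1234 xyz", buf, sizeof buf) should return 4 and leave "1234" in buf.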

       Values for the 【cflags 】parameter to the regcomp() function are  as  fol-
       lows:

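       REG_EXTENDED  (selects extended regular expression syntax)
              Use Extended Regular Expressions.

       REG_ICASE  (matching ignores letter case)
              Ignore case in match.

       REG_NOSUB  (match positions are not stored)
              Report only success or fail in regexec().

       REG_NEWLINE  (newlines are recognized and matching works line by line;
              without this flag the whole text is matched as one string)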
              Change the handling of <newline>.
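
The effect of these flags is easiest to see by compiling the same pattern with different cflags. The sketch below uses a hypothetical helper named matches; it always adds REG_NOSUB because it only needs a yes/no answer.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` matches `pattern` compiled with
 * `cflags`, 0 if it does not, and -1 if the pattern fails to compile.
 * REG_NOSUB is always added because only success or failure is needed. */
int matches(const char *pattern, const char *text, int cflags)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);  /* nmatch and pmatch are ignored */
    regfree(&re);
    return rc == 0;
}
```

With this helper, matches("hello", "HELLO", REG_EXTENDED) is 0 but becomes 1 once REG_ICASE is added. Likewise matches("^two$", "one\ntwo", REG_EXTENDED) is 0, and adding REG_NEWLINE makes it 1 because ^ and $ then also match at line boundaries inside the string.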


       Values  for  the 【eflags】 parameter to the regexec() function are as fol-
       lows:

       REG_NOTBOL  (the start of the string is not treated as the start of a line, so ^ does not match there)
              The circumflex character ( ’^’ ), when taken as a special  char-
              acter, does not match the beginning of string.

       REG_NOTEOL  (the end of the string is not treated as the end of a line, so $ does not match there)
              The dollar sign ( ’$’ ), when taken as a special character, does
              not match the end of string.
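
These flags matter mostly when regexec() is handed a piece of a longer buffer, for example the remainder of a string after a previous match, where the pointer passed in is not a real line start. A minimal sketch, with search_with_eflags as a hypothetical helper name:

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compiles `pattern` as an ERE and runs regexec() on
 * `text` with the given `eflags`. Returns 1 on match, 0 on no match, and
 * -1 if the pattern fails to compile. */
int search_with_eflags(const char *pattern, const char *text, int eflags)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, eflags);
    regfree(&re);
    return rc == 0;
}
```

Here search_with_eflags("^abc", "abc", 0) is 1, while passing REG_NOTBOL gives 0 because ^ may no longer match at the start of the string. A pattern without anchors, such as "abc", is unaffected by either flag.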


       The following constants shall be defined as 【error return values】:

       REG_NOMATCH  (no match found)
              regexec() failed to match.

       REG_BADPAT  (the regular expression is invalid)
              Invalid regular expression.

       REG_ECOLLATE  (invalid collating element)
              Invalid collating element referenced.

       REG_ECTYPE  (invalid character class)
              Invalid character class type referenced.

       REG_EESCAPE
              Trailing ’\’ in pattern.

       REG_ESUBREG  (the back-reference \digit is invalid)
              Number in \digit invalid or in error.

       REG_EBRACK  (unbalanced [])
              "[]" imbalance.

       REG_EPAREN  (unbalanced "\(\)" or "()")
              "\(\)" or "()" imbalance.

       REG_EBRACE  (unbalanced "\{\}")
              "\{\}" imbalance.

       REG_BADBR  (the content of "\{\}" is not a valid repetition count)
              Content of "\{\}" invalid: not a number, number too large,  more
              than two numbers, first larger than second.

       REG_ERANGE  (invalid endpoint in a range expression)
              Invalid endpoint in range expression.

       REG_ESPACE  (out of memory)
              Out of memory.

       REG_BADRPT  (a repetition operator has nothing valid to repeat)
              ’?’ , ’*’ , or ’+’ not preceded by valid regular expression.

       REG_ENOSYS  (reserved)
              Reserved.
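
These codes are returned by regcomp() and regexec(), and regerror() turns them into readable text. The sketch below shows one way to surface a compile error; compile_check is a hypothetical helper, and the exact message text (and sometimes the exact code chosen for a given bad pattern) depends on the implementation.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compiles `pattern` as an ERE. On failure it writes the
 * implementation's message for the error code into `msg` and returns the
 * code; on success it returns 0 and sets `msg` to an empty string. */
int compile_check(const char *pattern, char *msg, size_t msgsz)
{
    regex_t re;
    int rc = regcomp(&re, pattern, REG_EXTENDED);

    if (rc != 0) {
        /* message text is implementation-defined */
        regerror(rc, &re, msg, msgsz);
        return rc;
    }
    if (msgsz > 0)
        msg[0] = '\0';
    regfree(&re);   /* only a successfully compiled regex_t needs freeing */
    return 0;
}
```

An unterminated bracket expression such as "[a-z" should produce a nonzero code and a non-empty message; on glibc this is typically REG_EBRACK, though that is an observation about one implementation rather than a guarantee.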


       The following shall be declared as functions and may also be defined as
       macros. Function prototypes shall be provided.


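              int    regcomp(regex_t *restrict, const char *restrict, int);
                     Compiles a regular expression string into the internal
                     regex_t form.
                     Arguments: (compiled result, pattern string, 【cflags】)

              size_t regerror(int, const regex_t *restrict, char *restrict, size_t);
                     Retrieves the message for an error code.

              int    regexec(const regex_t *restrict, const char *restrict, size_t,
                         regmatch_t[restrict], int);
                     Matches a string against a compiled regex_t.
                     Arguments: (compiled regex, string to match, number of
                     entries in the result array, result array, 【eflags】)

              void   regfree(regex_t *);
                     Frees the memory held by a compiled regex_t.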

       The implementation may define  additional  macros  or  constants  using
       names beginning with REG_.

       The following sections are informative.

APPLICATION USAGE
       None.

RATIONALE
       None.

FUTURE DIRECTIONS
       None.

SEE ALSO
       <sys/types.h>  ,  the System Interfaces volume of IEEE Std 1003.1-2001,
       regcomp(), the Shell and Utilities volume of IEEE Std 1003.1-2001

COPYRIGHT
       Portions of this text are reprinted and reproduced in  electronic  form
       from IEEE Std 1003.1, 2003 Edition, Standard for Information Technology
       -- Portable Operating System Interface (POSIX),  The  Open  Group  Base
       Specifications  Issue  6,  Copyright  (C) 2001-2003 by the Institute of
       Electrical and Electronics Engineers, Inc and The Open  Group.  In  the
       event of any discrepancy between this version and the original IEEE and
       The Open Group Standard, the original IEEE and The Open Group  Standard
       is  the  referee document. The original Standard can be obtained online
       at http://www.opengroup.org/unix/online.html .



IEEE/The Open Group                  2003                         <regex.h>(P)


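The original code was written in C++: http://blog.chinaunix.net/uid-28323465-id-4083290.html

With a few small changes it compiles as C.

To check exactly which bytes a regular expression contains, save it to a file with vi and inspect the file with od:  od -tx1 -c  file.txt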

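// Build: gcc regex_xjy.c
// Run:   ./a.out
#include <sys/types.h>
#include <regex.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    const char *haa = "a very simple simple simple string";
    const char *regex = "([a-z]+)[ \t]([a-z]+)";
    regex_t comment;
    regmatch_t regmatch[100];
    size_t nmatch = sizeof(regmatch) / sizeof(regmatch[0]);
    size_t i;
    int cnt;
    int rc;
    char str[256];

    // Compile the pattern and report the reason if compilation fails.
    rc = regcomp(&comment, regex, REG_EXTENDED | REG_NEWLINE);
    if (rc != 0) {
        regerror(rc, &comment, str, sizeof(str));
        fprintf(stderr, "regcomp failed: %s\n", str);
        return 1;
    }

    // Match repeatedly, resuming right after the previous whole match.
    while (regexec(&comment, haa, nmatch, regmatch, 0) == 0) {
        // regmatch[0] is the whole match, the following entries are the groups.
        // Unused entries have rm_so == -1.
        for (i = 0; i < nmatch && regmatch[i].rm_so != -1; i++) {
            cnt = (int)(regmatch[i].rm_eo - regmatch[i].rm_so);
            if (cnt >= (int)sizeof(str))
                cnt = (int)sizeof(str) - 1;   // never write past the end of str
            printf("cnt=%d \t", cnt);
            memcpy(str, &haa[regmatch[i].rm_so], (size_t)cnt);
            str[cnt] = '\0';
            printf("%s\n", str);
        }
        printf("cyc:**************%d \n", (int)i);

        if (regmatch[0].rm_eo == regmatch[0].rm_so)
            break;                            // empty match: stop instead of looping forever
        haa += regmatch[0].rm_eo;
    }
    regfree(&comment);
    return 0;
}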

