Using libpcre in C++

Latest recommended article published 2024-05-21 20:12:49

逐梦如风

Categories: c++/c++ libraries, Learning cpp from scratch. Tags: c++, pcre, regex, regular expressions

Article link: https://blog.csdn.net/cabing2005/article/details/52806262

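I am still not comfortable with regex and prefer the PCRE regular-expression library, which is what I am used to.
Here is the code first.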

#include "pcre.h"  
#include <stdio.h>  
#include <string.h>  

#define OVECCOUNT 256

int main(int argc, char ** argv)  
{  
    char pText[1024] = "<div class=\"main-wrap J_TRegion\" data-modules=\"main\" style=\"overflow:visible;\" data-width=\"b950\"><div class=\"J_TModule\" data-widgetid=\"14777518346\"  id=\"shop14777518346\"  data-componentid=\"4011\"  data-spm='110.0.4011-14777518346'  microscope-data='4011-14777518346' data-title=\"搜索列表\"  >                        <div class=\"skin-box tb-module tshop-pbsm tshop-pbsm-tmall-srch-list\" id=\"TmshopSrchNav\"><!-- /search.htm    --><input id=\"J_ShopAsynSearchURL\" type=\"hidden\" value=\"/i/asynSearch.htm?mid=w-14777518346-0&wid=14777518346&path=/search.htm&&amp;search=y&amp;orderType=newOn_desc&amp;tsearch=y&amp;pageNo=2\" />";  

    //const char * pPattern = "(\\d+)\\w+";
    const char *pPattern="<input id=\"J_ShopAsynSearchURL\" type=\"hidden\" value=\"([^\"]+)\"";
    const char * pErrMsg = NULL;  
    pcre * pPcre = NULL;  
    int nOffset = -1;   

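    pPcre = pcre_compile(pPattern, 0, &pErrMsg, &nOffset, NULL);
    if(pPcre == NULL){
        // compilation failed: report the message and where in the pattern it occurred
        printf("pcre compile error at offset %d: %s\n", nOffset, pErrMsg);
        return 1;
    }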
    int ovector[OVECCOUNT];
    int matchFlag;
    matchFlag = pcre_exec(pPcre, NULL, pText, (int)strlen(pText), 0, 0, ovector, OVECCOUNT);
    if(matchFlag < 0){
        if(matchFlag == PCRE_ERROR_NOMATCH){
            printf("no match\n");
        }else {
            printf("match error %d\n", matchFlag);
        }
        pcre_free(pPcre);  // memory from pcre_compile is released with pcre_free, not free
        return 1;
    }
    printf(" match result :\n");
    if( matchFlag > 0){
        printf("Pattern_CM: \"%s\"\n", pPattern);
        printf("String : %s\n", pText);
        printf("matchFlag=%d\n",matchFlag);

    }

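    // ovector[2*i] and ovector[2*i+1] hold the start and end offsets of group i
    for(int i=0;i<matchFlag;i++){
        char *strStart = pText+ovector[2*i];
        int substrLen = ovector[2*i+1] - ovector[2*i];
        printf("$%2d: %.*s\n", i, substrLen, strStart);
    }
    pcre_free(pPcre);  // release the compiled pattern on the success path too
    return 0;
}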

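Function reference
1. pcre_compile
Prototype:
pcre *pcre_compile(const char *pattern, int options, const char **errptr, int *erroffset, const unsigned char *tableptr)
Purpose: compiles a regular expression into an internal representation. The compiled pattern can be reused across many subject strings, which speeds up matching. It does the same job as pcre_compile2, only without the errorcodeptr parameter.
Parameters:
pattern the regular expression
options 0, or a combination of option bits
errptr receives the error message
erroffset receives the offset in the pattern at which the error was found
tableptr pointer to a set of character tables (as built by pcre_maketables); may be NULL to use the built-in defaults.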
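2. pcre_compile2
Prototype:
pcre *pcre_compile2(const char *pattern, int options, int *errorcodeptr, const char **errptr, int *erroffset, const unsigned char *tableptr)
Purpose: compiles a regular expression into an internal representation. The compiled pattern can be reused across many subject strings, which speeds up matching. It does the same job as pcre_compile, with one extra parameter, errorcodeptr.
Parameters:
pattern the regular expression
options 0, or a combination of option bits
errorcodeptr receives a numeric error code
errptr receives the error message
erroffset receives the offset in the pattern at which the error was found
tableptr pointer to a set of character tables (as built by pcre_maketables); may be NULL to use the built-in defaults.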
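3. pcre_exec
Prototype:
int pcre_exec(const pcre *code, const pcre_extra *extra, const char *subject, int length, int startoffset, int options, int *ovector, int ovecsize)
Purpose: matches a subject string against a compiled pattern, using an algorithm similar to Perl's. On a match it returns a positive number, the count of captured substrings including the whole match, and writes their offsets into ovector. It returns 0 if ovector was too small to hold them all, and a negative error code such as PCRE_ERROR_NOMATCH otherwise.
Parameters:
code the compiled pattern
extra pointer to a pcre_extra structure; may be NULL
subject the string to match against
length length of the subject string, in bytes
startoffset byte offset in the subject at which matching starts
options option bits
ovector pointer to an int array that receives the result offsets
ovecsize number of elements in ovector; should be a multiple of 3.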