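To make C++ headers easier to use from MASM, I use regular expressions to extract constants, functions, callbacks, and so on from the header files.
The problem shows up when capturing these two snippets: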
#define NdrUnMarshConfStringHdr(p, s, l) ((s=_midl_unma4(p,unsigned long),\
(_midl_addp(p,4)), \
(l=_midl_unma4(p,unsigned long))
#define NdrMarshSCtxtHdl(pc,p,rd) (NdrSContextMarshall((NDR_SCONTEXT)pc,p, (NDR_RUNDOWN)rd)
The parentheses in the two snippets above are not balanced, so the match times out.
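The imbalance can be confirmed without any regex. Below is a minimal Python sketch (Python is used only for illustration; the helper names `join_continuations` and `paren_balance` are made up for this example) that merges backslash-continued lines and counts the unclosed `(` on each logical line:

```python
from typing import Optional

SAMPLE = r"""#define NdrUnMarshConfStringHdr(p, s, l) ((s=_midl_unma4(p,unsigned long),\
(_midl_addp(p,4)), \
(l=_midl_unma4(p,unsigned long))
#define NdrMarshSCtxtHdl(pc,p,rd) (NdrSContextMarshall((NDR_SCONTEXT)pc,p, (NDR_RUNDOWN)rd)"""


def join_continuations(text: str) -> str:
    """Merge physical lines that end in a backslash into one logical line."""
    logical, pending = [], ""
    for line in text.splitlines():
        stripped = line.rstrip()
        if stripped.endswith("\\"):
            # Drop the backslash and keep collecting.
            pending += stripped[:-1] + " "
        else:
            logical.append(pending + line)
            pending = ""
    if pending:
        logical.append(pending)
    return "\n".join(logical)


def paren_balance(text: str) -> Optional[int]:
    """Return the number of unclosed '(' or None if a ')' closes nothing."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return None
    return depth


for logical_line in join_continuations(SAMPLE).split("\n"):
    print(paren_balance(logical_line), logical_line[:40])
```

For the two macros as pasted above, this reports 2 and 1 unclosed parentheses respectively. A check like this could be used to skip such definitions before the regex ever sees them.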
The regular expression is generated by a program:
(\#define\s+)
(?'Name'([a-zA-Z_]\w+))
(?'Params'(\([^\(\)\#]*\)))?
\s+
((0(x|X)[0-9a-fA-F]+)|
(\d+)|
([a-zA-Z_]\w+)|
("(?>[^"]+|"")*")|
(\(([^\(\)\#]+|
(?'Paren'\()|(?'Close-Paren'\)))*?(?(Paren)(?!))\))|
(\+|\-|\*|/\\)|
(([a-zA-Z_]\w+)(\(([^\(\)\#]+|(?'Paren'\()|(?'Close-Paren'\)))*?(?(Paren)(?!))\))))?
After simplifying it to the following two cases, the timeout still occurs:
(\#define\s+)
(?'Name'([a-zA-Z_]\w+))
(?'Params'(\([^\(\)\#]*\)))?
\s+
(\(([^\(\)\#]+|(?'Paren'\()|(?'Close-Paren'\)))*?(?(Paren)(?!))\))
(\#define\s+)
(?'Name'([a-zA-Z_]\w+))
(.+?)
(\(([^\(\)\#]+|(?'Paren'\()|(?'Close-Paren'\)))*?(?(Paren)(?!))\))
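One possible way around the timeout is to keep a regex only for the `#define` head and let a hand-written scanner find the balanced body, since a single left-to-right pass cannot backtrack. The Python sketch below is an illustration under assumptions, not a drop-in replacement for the generated pattern: `HEAD`, `balanced_end`, and `parse_define` are hypothetical names, the name pattern uses `\w*` so that single-letter macro names are accepted, and `#` aborts the scan to mirror the `[^\(\)\#]` class above.

```python
import re
from typing import Optional, Tuple

# Head of a macro definition: name plus optional parameter list.
HEAD = re.compile(r"#define\s+(?P<Name>[A-Za-z_]\w*)(?P<Params>\([^()#]*\))?\s+")


def balanced_end(text: str, start: int) -> int:
    """Return the index just past the ')' matching the '(' at start, or -1."""
    depth = 0
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return i + 1
        elif ch == "#":
            # Same restriction as the original character class.
            return -1
    return -1  # ran out of input while still inside parentheses


def parse_define(line: str) -> Optional[Tuple[str, Optional[str], Optional[str]]]:
    """Return (name, params, body); body is None if it is not a balanced group."""
    m = HEAD.match(line)
    if m is None:
        return None
    body = None
    if m.end() < len(line) and line[m.end()] == "(":
        end = balanced_end(line, m.end())
        if end != -1:
            body = line[m.end():end]
    return m.group("Name"), m.group("Params"), body
```

With this approach an unbalanced definition such as `NdrMarshSCtxtHdl` comes back with a body of `None` after one pass over the line, instead of forcing the regex engine to try every way of splitting the text.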
Some pattern and input text pairs that lead to a timeout:
^[a-zA-Z0-9]+((.?|\-*)[a-zA-Z0-9]+)*$
asdf.host-name.asd-f.
^(([a-zA-Z\d!#$%&'*+-/=?^_`{|}~]+\x20*|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*"\x20*)*(?<angle><))?
((?!\.)(\.?[a-zA-Z\d!#$%&'*+-/=?^_`{|}~]+)+|"((?=[\x01-\x7f])[^"\\]|\\[\x01-\x7f])*")
@
(((?!-)[a-zA-Z\d\-]+(?<!-)\.)+[a-zA-Z]{2,}
|\[
(((?(?<!\[)\.)(25[0-5]|2[0-4]\d|[01]?\d?\d)){4}
|[a-zA-Z\d\-]*[a-zA-Z\d]:((?=[\x01-\x7f])[^\\\[\]]|\\[\x01-\x7f])+)
\])
(?(angle)>)$
www42af43ds.afsd.fds.ds
^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@(([0-9a-zA-Z])+([-\w]*[0-9a-zA-Z])*\.)+[a-zA-Z]{2,9})$
hello23423423423424n@aol.c
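The first and third patterns share one trait: a quantified group whose body can match the same characters in several ways (`(.?|\-*)` can match nothing, and `[0-9a-zA-Z]` is a subset of `[-.\w]`). When the overall match fails near the end of the input, a backtracking engine may try every way of splitting the text. The Python sketch below shows rewrites that remove this ambiguity. It rests on assumptions: the first pattern is taken to mean alphanumeric labels joined by one dot or a run of hyphens (the unescaped `.` in the original matches any character), Python's `re` stands in for the original engine, and `HOST`, `EMAIL`, `is_host`, and `is_email` are names invented here.

```python
import re

# Assumed intent of the first pattern. The separator is mandatory, so a label
# boundary can only be placed in one way.
HOST = re.compile(r"^[a-zA-Z0-9]+(?:(?:\.|-+)[a-zA-Z0-9]+)*$")

# Third pattern with each X(Y*X)* rewritten as X(?:Y*X)?. Because X is a
# subset of Y, both forms accept the same strings, but the rewritten form has
# no quantifier nested inside another over the same characters.
EMAIL = re.compile(
    r"^[0-9a-zA-Z](?:[-.\w]*[0-9a-zA-Z])?"
    r"@(?:[0-9a-zA-Z](?:[-\w]*[0-9a-zA-Z])?\.)+[a-zA-Z]{2,9}$"
)


def is_host(text: str) -> bool:
    """True if text looks like dot/hyphen separated alphanumeric labels."""
    return HOST.match(text) is not None


def is_email(text: str) -> bool:
    """True if text is accepted by the rewritten third pattern."""
    return EMAIL.match(text) is not None


print(is_host("asdf.host-name.asd-f."))        # trailing dot
print(is_email("hello23423423423424n@aol.c"))  # one-letter final label
```

Both sample inputs from the list above are rejected by these rewrites, and the rejection takes time proportional to the input length rather than growing exponentially.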