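I wrote such a parser in C++ as part of libtld. If you want to be really complete, here are the lex and yacc definitions (although I do not use those tools myself). My C++ code may help you write your own version in Python.
(lex part)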
[-A-Za-z0-9!#$%&'*+/=?^_`{|}~]+ atom_text_repeat (ALPHA+DIGIT+some other characters)
([\x09\x0A\x0D\x20-\x27\x2A-\x5B\x5D-\x7E]|\\[\x09\x20-\x7E])+ comment_text_repeat
([\x21-\x5A\x5E-\x7E])+ domain_text_repeat
([\x21\x23-\x5B\x5D-\x7E]|\\[\x09\x20-\x7E])+ quoted_text_repeat
\x22 DQUOTE
[\x20\x09]*\x0D\x0A[\x20\x09]+ FWS
. any other character
(lex definitions merged in more complex lex definitions)
[\x01-\x08\x0B\x0C\x0E-\x1F\x7F] NO_WS_CTL
[()<>[\]:;@\\,.] specials
[\x01-\x09\x0B\x0C\x0E-\x7F] text
\\[\x09\x20-\x7E] quoted_pair ('\\' text)
[A-Za-z] ALPHA
[0-9] DIGIT
[\x20\x09] WSP
\x20 SP
\x09 HTAB
\x0D\x0A CRLF
\x0D CR
\x0A LF
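As a rough illustration of how a few of the lex rules above could be carried over to Python, here is a minimal sketch built on the standard `re` module. The names `TOKEN_RE` and `tokenize` are hypothetical and are not part of libtld. The sketch only covers the rules that do not overlap with each other (FWS, DQUOTE, atom_text_repeat and the catch-all), and it relies on the order of the alternatives rather than on lex's longest-match behaviour, which is an assumption you may need to revisit.

```python
import re

# Python translations of a few lex rules (a sketch, not libtld's actual code).
ATOM_TEXT = r"[-A-Za-z0-9!#$%&'*+/=?^_`{|}~]+"
FWS = r"[\x20\x09]*\x0D\x0A[\x20\x09]+"
DQUOTE = r"\x22"

# Alternatives are tried left to right, so the catch-all rule comes last.
# re.DOTALL lets the catch-all also pick up a stray CR or LF.
TOKEN_RE = re.compile(
    "|".join(
        f"(?P<{name}>{pattern})"
        for name, pattern in [
            ("FWS", FWS),
            ("DQUOTE", DQUOTE),
            ("atom_text_repeat", ATOM_TEXT),
            ("other", r"."),
        ]
    ),
    re.DOTALL,
)


def tokenize(text):
    """Return (token_name, lexeme) pairs for the whole input."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


if __name__ == "__main__":
    for token in tokenize("john.doe@example.com"):
        print(token)
```

The remaining rules (comment_text_repeat, domain_text_repeat, quoted_text_repeat) match overlapping character sets, so a hand-written tokenizer would typically pick between them based on the parser's current context, for example only trying quoted_text_repeat after a DQUOTE.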
(yacc part)
address_list: address
| address ',' address_list
address: mailbox
| group
mailbox_list: mailbox
| mailbox ',' mailbox_list
mailbox: name_addr
| addr_spec
group: display_name ':' mailbox_list ';' CFWS
| display_name ':' CFWS ';' CFWS
name_addr: angle_addr
| display_name angle_addr
display_name: phrase
angle_addr: CFWS '<' addr_spec '>' CFWS
addr_spec: local_part '@' domain
local_part: dot_atom
| quoted_string
domain: dot_atom
| domain_literal
domain_literal: CFWS '[' FWS domain_text_repeat FWS ']' CFWS
phrase: word
| word phrase
word: atom
| quoted_string
atom: CFWS atom_text_repeat CFWS
dot_atom: CFWS dot_atom_text CFWS
dot_atom_text: atom_text_repeat
| atom_text_repeat '.' dot_atom_text
quoted_string: CFWS DQUOTE quoted_text_repeat DQUOTE CFWS
CFWS: /* empty */
| FWS comment
| CFWS comment FWS
comment: '(' comment_content ')'
comment_content: comment_text_repeat
| comment
| comment_content comment_content
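To show how the yacc rules might map onto a hand-written Python parser, here is a small recursive-descent sketch for `addr_spec`, `local_part`, `domain` and `dot_atom_text`. It is not libtld's implementation. The helper names (`ParseError`, `expect`, `match`, `parse_addr_spec`) are hypothetical, CFWS and FWS handling is left out entirely to keep the example short, and the domain text class is assumed to follow RFC 5322's dtext range (decimal 33-90 and 94-126).

```python
import re

ATOM_TEXT = re.compile(r"[-A-Za-z0-9!#$%&'*+/=?^_`{|}~]+")
QUOTED_TEXT = re.compile(r"(?:[\x21\x23-\x5B\x5D-\x7E]|\\[\x09\x20-\x7E])+")
# Assumption: dtext as in RFC 5322 (%d33-90 / %d94-126).
DOMAIN_TEXT = re.compile(r"[\x21-\x5A\x5E-\x7E]+")


class ParseError(ValueError):
    """Raised when the input does not match the grammar."""


def expect(text, pos, char):
    """Consume one literal character at pos or fail."""
    if text[pos:pos + 1] != char:
        raise ParseError(f"expected {char!r} at offset {pos}")
    return pos + 1


def match(regex, text, pos, what):
    """Consume a token matched by regex at pos or fail."""
    m = regex.match(text, pos)
    if m is None:
        raise ParseError(f"expected {what} at offset {pos}")
    return m.group(), m.end()


def dot_atom_text(text, pos):
    # dot_atom_text: atom_text_repeat | atom_text_repeat '.' dot_atom_text
    part, pos = match(ATOM_TEXT, text, pos, "atom text")
    parts = [part]
    while text[pos:pos + 1] == ".":
        part, pos = match(ATOM_TEXT, text, pos + 1, "atom text")
        parts.append(part)
    return ".".join(parts), pos


def local_part(text, pos):
    # local_part: dot_atom | quoted_string (CFWS omitted in this sketch)
    if text[pos:pos + 1] == '"':
        body, pos = match(QUOTED_TEXT, text, pos + 1, "quoted text")
        return body, expect(text, pos, '"')
    return dot_atom_text(text, pos)


def domain(text, pos):
    # domain: dot_atom | domain_literal (CFWS and FWS omitted in this sketch)
    if text[pos:pos + 1] == "[":
        body, pos = match(DOMAIN_TEXT, text, pos + 1, "domain text")
        return "[" + body + "]", expect(text, pos, "]")
    return dot_atom_text(text, pos)


def parse_addr_spec(text):
    # addr_spec: local_part '@' domain
    local, pos = local_part(text, 0)
    pos = expect(text, pos, "@")
    dom, pos = domain(text, pos)
    if pos != len(text):
        raise ParseError(f"unexpected trailing text at offset {pos}")
    return local, dom


if __name__ == "__main__":
    print(parse_addr_spec("john.doe@example.com"))
```

Each grammar rule becomes one function that takes the input and an offset and returns the parsed value plus the new offset. The right-recursive `dot_atom_text` rule is written as a loop, which accepts the same strings. Extending the sketch to `mailbox`, `group` and comments would follow the same pattern.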