Comparison of parser generators

 


From Wikipedia, the free encyclopedia


 

This is a list of notable lexer generators and parser generators for various language classes.

Contents

1 Regular languages

2 Deterministic context-free languages

3 Parsing expression grammars, deterministic boolean grammars

4 General context-free, conjunctive or boolean languages

5 Context-sensitive grammars

6 See also

7 References

8 Notes

9 External links

Regular languages

Regular languages are a category of languages (sometimes known as Chomsky Type 3) which can be matched by a state machine (more specifically, by a deterministic finite automaton or a nondeterministic finite automaton) constructed from a regular expression. In particular, a regular expression can match constructs like "A follows B", "Either A or B", and "A, followed by zero or more instances of B", but cannot match constructs which require consistency between non-adjacent elements, such as "some instances of A followed by the same number of instances of B", and cannot express recursive "nesting" ("every A is eventually followed by a matching B"). A classic example of a problem which a regular grammar cannot handle is the question of whether a given string contains correctly nested parentheses. (This is typically handled by a Chomsky Type 2 grammar, also known as a context-free grammar.)
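
The distinction can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for this illustration. Python's `re` module is a backtracking engine rather than a generated DFA, but the pattern used here is a plain regular expression. The first function matches "A, followed by zero or more instances of B"; the second checks parenthesis nesting with an unbounded counter, which is exactly the memory a finite automaton does not have.

```python
import re

# "A, followed by zero or more instances of B" is a regular construct.
A_THEN_BS = re.compile(r"ab*")


def matches_a_then_bs(text: str) -> bool:
    """Return True if text is one 'a' followed by zero or more 'b's."""
    return A_THEN_BS.fullmatch(text) is not None


def is_balanced(text: str) -> bool:
    """Return True if every '(' in text is closed by a later ')'.

    The depth counter can grow without bound, so this check is
    outside what a finite automaton can express.
    """
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its matching '('
                return False
    return depth == 0
```

For example, `matches_a_then_bs("abbb")` holds, while `is_balanced("(()")` does not.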

See also: List of lexer generators

| Name | Lexer algorithm | Output languages | Grammar, code | Development platform | License |
|---|---|---|---|---|---|
| Alex | DFA | Haskell | mixed | all | BSD |
| AnnoFlex | DFA | Java | mixed | Java Virtual Machine | BSD |
| AustenX | DFA | Java | separate | all | BSD |
| Booze-tools | DFA | state machine is runtime-generated or saved as JSON | mixed | Python | Public Domain |
| C# Flex | DFA | C# | mixed | .NET CLR | GNU GPL |
| C# Lex | DFA | C# | mixed | .NET CLR | ? |
| CookCC | DFA | Java | mixed | Java Virtual Machine | Apache License 2.0 |
| DFAlex | DFA | no code generation required | Java | Java | Apache License 2.0 |
| Dolphin | DFA | C++ | separate | all | Proprietary |
| flex | DFA table driven | C, C++ | mixed | all | BSD |
| gelex | DFA | Eiffel | mixed | Eiffel | MIT |
| golex | DFA | Go | mixed | Go | BSD-style |
| gplex | DFA | C# | mixed | .NET CLR | BSD-like |
| JFlex | DFA | Java | mixed | Java Virtual Machine | BSD |
| JLex | DFA | Java | mixed | Java Virtual Machine | BSD-like |
| lex | DFA | C | mixed | POSIX | Proprietary, CDDL |
| lexertl | DFA | C++ | | all | GNU LGPL |
| LRSTAR | DFA | C++ | separate | Windows | BSD |
| Quex | DFA direct code | C, C++ | mixed | all | GNU LGPL |
| Ragel | DFA | C, C++, Assembly, Objective C, D, Go, Ruby, Java, C#, OCaml, Crack, Rust, Julia | mixed | all | GNU GPL, MIT[1] |
| RE/flex | DFA direct code, DFA table driven, and NFA regex libraries | C++ | mixed | all | BSD |
| re2c | DFA direct code | C | mixed | all | Public domain |

Deterministic context-free languages

Context-free languages are a category of languages (sometimes known as Chomsky Type 2) which can be matched by a set of replacement rules, each of which maps a nonterminal element to a sequence of terminal elements and/or other nonterminal elements. Grammars of this type can match anything that can be matched by a regular grammar and can also handle recursive "nesting" ("every A is eventually followed by a matching B"), such as the question of whether a given string contains correctly nested parentheses. The rules of context-free grammars are purely local, however, and therefore cannot handle questions that require non-local analysis, such as "Does a declaration exist for every variable that is used in a function?". Answering such questions within the grammar itself would require a more powerful formalism, such as a Chomsky Type 1 grammar, also known as a context-sensitive grammar. In practice, parser generators for context-free grammars often let user-written code introduce limited amounts of context sensitivity. (For instance, upon encountering a variable declaration, user-written code could save the name and type of the variable into an external data structure, so that these could be checked against later variable references detected by the parser.)
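
As a rough illustration of that last point, the following Python sketch hand-writes a recursive-descent parser for a tiny, hypothetical language (the `let`/`use` statements, the `Parser` class, and the `check` helper are all assumptions made for this example, not the interface of any tool listed below). The grammar handles nested `{ ... }` blocks by recursion, while the declaration check lives in ordinary code that records names in a set, outside the grammar rules. For brevity the sketch uses one flat set of names and ignores block scoping.

```python
import re

KEYWORDS = {"let", "use"}


def tokenize(text):
    # Identifiers/keywords and the punctuation { } ; (other characters are skipped).
    return re.findall(r"[A-Za-z_]\w*|[{};]", text)


class Parser:
    """Grammar: block := stmt* ; stmt := '{' block '}' | 'let' NAME ';' | 'use' NAME ';'"""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.declared = set()   # external data structure filled by the actions
        self.undeclared = []    # names used with no earlier declaration

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, got {self.peek()!r}")
        self.pos += 1

    def name(self):
        token = self.peek()
        if token is None or token in KEYWORDS or token in "{};":
            raise SyntaxError(f"expected a name, got {token!r}")
        self.pos += 1
        return token

    def block(self):
        while self.peek() not in (None, "}"):
            self.stmt()

    def stmt(self):
        token = self.peek()
        if token == "{":
            self.expect("{")
            self.block()        # recursion handles arbitrarily deep nesting
            self.expect("}")
        elif token == "let":
            self.pos += 1
            self.declared.add(self.name())   # action: remember the declaration
            self.expect(";")
        elif token == "use":
            self.pos += 1
            used = self.name()
            if used not in self.declared:    # action: non-local check
                self.undeclared.append(used)
            self.expect(";")
        else:
            raise SyntaxError(f"unexpected token {token!r}")


def check(text):
    """Parse text and return the names used without a prior declaration."""
    parser = Parser(tokenize(text))
    parser.block()
    if parser.peek() is not None:
        raise SyntaxError("unmatched '}'")
    return parser.undeclared
```

Calling `check("let x; { use x; { use y; } }")` parses the nested blocks and reports `y` as the only undeclared name.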

The deterministic context-free languages are a proper subset of the context-free languages which can be efficiently parsed by deterministic pushdown automata.
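
A deterministic pushdown automaton can be pictured as a finite state machine plus a stack, making exactly one move per input symbol. The sketch below is a hand-coded Python illustration for the language "some instances of A followed by the same number of instances of B" (with at least one of each); the state names and the function name are assumptions for this example.

```python
def accepts_anbn(text: str) -> bool:
    """Deterministic pushdown recognizer for a^n b^n with n >= 1."""
    stack = []
    state = "push"                       # still reading the leading a's
    for ch in text:
        if state == "push" and ch == "a":
            stack.append("A")            # remember one pending 'a'
        elif ch == "b" and stack:
            stack.pop()                  # match this 'b' against a stored 'a'
            state = "pop"                # from now on only b's are allowed
        else:
            return False                 # no move is defined: reject
    return state == "pop" and not stack
```

Each character triggers at most one stack operation, so the input is processed in a single left-to-right pass with no backtracking.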

| Name | Parsing algorithm | Input grammar notation | Output languages | Grammar, code | Lexer | Development platform | IDE | License |
|---|---|---|---|---|---|---|---|---|
| ANTLR4 | ALL(*)[2] | EBNF | C#, Java, Python, JavaScript, C++, Swift, Go | mixed | generated | Java Virtual Machine | Yes | BSD |
| ANTLR3 | LL(*) | EBNF | ActionScript, Ada95, C, C++, C#, Java, JavaScript, Objective-C, Perl, Python, Ruby | mixed | generated | Java Virtual Machine | Yes | BSD |
| APG | Recursive descent, Backtracking | ABNF | C, C++, JavaScript, Java | separate | none | all | No | GNU GPL |
| AXE | Recursive descent | AXE/C++ | C++17, C++11 | mixed | none | any platform with standard C++17/C++11 compiler | No | Boost |
| Beaver | LALR(1) | EBNF | Java | mixed | external | Java Virtual Machine | No | BSD |
| Bison | LALR(1), LR(1), IELR(1), GLR | YACC | C, C++, Java | mixed | external | all | No | GNU GPL with Exception |
| Bison++[note 1] | LALR(1) | ? | C++ | mixed | external | POSIX | No | GNU GPL |
| Bisonc++ | LALR(1) | ? | C++ | mixed | external | POSIX | No | GNU GPL |
| Booze-tools | LALR(1) or LR(1) (canonical or minimal) | BNF with macros in place of EBNF | state machine can be runtime-generated or saved as JSON | mixed, separable | included | Python | No | Public domain |
| BtYacc | Backtracking Bottom-up | ? | C++ | mixed | external | all | No | Public domain |
| byacc | LALR(1) | YACC | C | mixed | external | all | No | Public domain |
| BYACC/J | LALR(1) | YACC | C, Java | mixed | external | all | No | Public domain |
| CL-Yacc | LALR(1) | Lisp | Common Lisp | mixed | external | all | No | MIT |
| Coco/R | LL(1) | EBNF | C, C++, C#, F#, Java, Ada, Object Pascal, Delphi, Modula-2, Oberon, Ruby, Swift, Unicon, Visual Basic .NET | mixed | generated | Java Virtual Machine, .NET Framework, Microsoft Windows, POSIX (depends on output language) | No | GNU GPL |
| CookCC | LALR(1) | Java annotations | Java | mixed | generated | Java Virtual Machine | No | Apache License 2.0 |
| CppCC | LL(k) | ? | C++ | mixed | generated | POSIX | No | GNU GPL |
| CSP | LR(1) | ? | C++ | separate | generated | POSIX | No | Apache License 2.0 |
| CUP | LALR(1) | ? | Java | mixed | external | Java Virtual Machine | No | BSD-like |
| Dragon | LR(1), LALR(1) | ? | C++, Java | separate | generated | all | No | GNU GPL |
| eli | LALR(1) | ? | C | mixed | generated | POSIX | No | GNU GPL, GNU LGPL |
| Epsilon Grammar Studio | Recursive descent, Backtracking | ABNF | C++ | separate | generated | Microsoft Windows | Yes | proprietary |
| Essence | LR(???) | ? | Scheme 48 | mixed | external | all | No | BSD |
| Eto.Parse | LL(k) | BNF, EBNF or C# | N/A (state machine is runtime generated) | separate | internal | .NET Framework | No | MIT |
| eyapp | LALR(1) | ? | Perl | mixed | external or generated | all | No | Perl |
| Frown | LALR(k) | ? | Haskell 98 | mixed | external | all | No | GNU GPL |
| geyacc | LALR(1) | ? | Eiffel | mixed | external | all | No | MIT |
| GOLD | LALR(1) | BNF | x86 assembly language, ANSI C, C#, D, Java, Pascal, Object Pascal, Python, Visual Basic 6, Visual Basic .NET, Visual C++ | separate | generated | Microsoft Windows | Yes | Modified Zlib |
| GPPG | LALR(1) | YACC | C# | separate | external | Microsoft Windows | Yes | BSD |
| Grammatica | LL(k) | BNF dialect | C#, Java | separate | generated | Java Virtual Machine | No | BSD |
| HiLexed | LL(*) | EBNF or Java | Java | separate | internal | Java Virtual Machine | No | GNU LGPL |
| Hime Parser Generator | LALR(1), GLR | BNF dialect | C#, Java, Rust | separate | generated | .NET Framework, Java Virtual Machine | No | GNU LGPL |
| Hyacc | LR(1), LALR(1), LR(0) | YACC | C | mixed | external | all | No | GNU GPL |
| Irony | LALR(1) | C# | N/A (state machine is runtime generated) | separate | internal | .NET Framework | Yes | MIT |
| iyacc | LALR(1) | YACC | Icon | mixed | external | all | No | GNU GPL |
| jacc | LALR(1) | ? | Java | mixed | external | Java Virtual Machine | No | BSD |
| JavaCC | LL(k) | EBNF | Java, C++, JavaScript (via GWT compiler)[3] | mixed | generated | Java Virtual Machine | Yes | BSD |
| jay | LALR(1) | YACC | C#, Java | mixed | none | Java Virtual Machine | No | BSD |
| JFLAP | LL(1), LALR(1) | ? | Java | ? | ? | Java Virtual Machine | Yes | ? |
| JetPAG | LL(k) | ? | C++ | mixed | generated | all | No | GNU GPL |
| JS/CC | LALR(1) | EBNF | JavaScript, JScript, ECMAScript | mixed | internal | all | Yes | BSD |
| KDevelop-PG-Qt | LL(1), Backtracking, Shunting yard | ? | C++ | mixed | generated or external | all, KDE | No | GNU LGPL |
| Kelbt | Backtracking LALR(1) | ? | C++ | mixed | generated | POSIX | No | GNU GPL |
| kmyacc | LALR(1) | ? | C, Java, Perl, JavaScript | mixed | external | all | No | GNU GPL |
| Lapg | LALR(1) | ? | C, C++, C#, Java, JavaScript | mixed | generated | Java Virtual Machine | No | GNU GPL |
| Lemon | LALR(1) | ? | C | mixed | external | all | No | Public domain |
| LEPL | Recursive descent | Python | Python (no generation, library) | separate | none | all | No | MPL/GNU LGPL |
| Lime | LALR(1) | ? | PHP | mixed | external | all | No | GNU GPL |
| LISA | LR(?), LL(?), LALR(?), SLR(?) | ? | Java | mixed | generated | Java Virtual Machine | Yes | Public domain |
| LLgen | LL(1) | ? | C | mixed | external | POSIX | No | BSD |
| LLnextgen | LL(1) | ? | C | mixed | external | all | No | GNU GPL |
| LLLPG | LL(k) + syntactic and semantic predicates | ANTLR-like | C# | mixed | generated (?) | .NET Framework, Mono | Visual Studio | GNU LGPL |
| LPG | Backtracking LALR(k) | ? | Java | mixed | generated | Java Virtual Machine | No | EPL |
| LRSTAR | LALR(1), LR(1), LR(*) | EBNF, Yacc-like | C++ | separate | generated | Windows | Visual Studio | BSD |
| Menhir | LR(1) | ? | OCaml | mixed | generated | all | No | QPL |
| ML-Yacc | LALR(1) | ? | ML | mixed | external | all | No | ? |
| Monkey | LR(1) | ? | Java | separate | generated | Java Virtual Machine | No | GNU GPL |
| Msta | LALR(k), LR(k) | YACC, EBNF | C, C++ | mixed | external or generated | POSIX, Cygwin | No | GNU GPL |
| MTP (More Than Parsing) | LL(1) | ? | Java | separate | generated | Java Virtual Machine | No | GNU GPL |
| MyParser | LL(*) | Markdown | C++11 | separate | internal | any platform with standard C++11 compiler | No | MIT License |
| NLT | GLR | C#/BNF-like | C# | mixed | mixed | .NET Framework | No | MIT |
| ocamlyacc | LALR(1) | ? | OCaml | mixed | external | all | No | QPL |
| olex | LL(1) | ? | C++ | mixed | generated | all | No | GNU GPL |
| parglare | Scannerless LALR(1)/SLR(1)/GLR | BNF-like, Python | N/A (state machine is runtime generated) | mixed | none | all | No | MIT |
| Parsec | LL, Backtracking | Haskell | Haskell | mixed | none | all | No | BSD |
| Parse::Yapp | LALR(1) | ? | Perl | mixed | external | all | No | GNU GPL |
| Parser Objects | LL(k) | ? | Java | mixed | ? | Java Virtual Machine | No | zlib |
| PCCTS | LL | ? | C, C++ | ? | ? | all | No | ? |
| PLY | LALR(1) | BNF | Python | mixed | generated | all | No | MIT License |
| PlyPlus | LALR(1) | EBNF | Python | separate | generated | all | No | MIT License |
| PRECC | LL(k) | ? | C | separate | generated | DOS, POSIX | No | GNU GPL |
| QLALR | LALR(1) | ? | C++ | mixed | external | all | No | GNU GPL |
| RPATK | Recursive descent, Backtracking | BNF | C (no generation, library) | separate | none | all | No | GNU GPL |
| SableCC | LALR(1) | ? | C, C++, C#, Java, OCaml, Python | separate | generated | all | No | GNU LGPL |
| SLK | LL(k) LR(k) LALR(k) | EBNF | C, C++, C#, Java, JavaScript | separate | external | all | No | SLK[4] |
| SP (Simple Parser) | Recursive descent | Python | Python | separate | generated | all | No | GNU LGPL |
| Spirit | Recursive descent | ? | C++ | mixed | internal | all | No | Boost |
| Sprache | LL, Backtracking | C# | interpreted | mixed | internal | .NET Framework | No | MIT |
| Styx | LALR(1) | ? | C, C++ | separate | generated | all | No | GNU LGPL |
| Sweet Parser | LALR(1) | ? | C++ | separate | generated | Microsoft Windows | No | zlib |
| Tap | LL(1) | ? | C++ | mixed | generated | all | No | GNU GPL |
| TextTransformer | LL(k) | ? | C++ | mixed | generated | Microsoft Windows | Yes | Proprietary |
| TinyPG | LL(1) | ? | C#, Visual Basic | ? | ? | Microsoft Windows | Yes | CPOL 1.0 |
| Toy Parser Generator | Recursive descent | ? | Python | mixed | generated | all | No | GNU LGPL |
| TP Yacc | LALR(1) | ? | Turbo Pascal | mixed | external | all | Yes | GNU GPL |
| UltraGram | LALR(1), LR(1), GLR | BNF | C++, Java, C#, Visual Basic .NET | separate | external | Microsoft Windows | Yes | Public domain |
| UniCC | LALR(1) | EBNF | C, C++, Python, JavaScript, JSON, XML | mixed | generated | POSIX | No | BSD |
| UrchinCC | LL(1) | ? | Java | ? | generated | Java Virtual Machine | No | ? |
| Whale | LR(?), some conjunctive stuff, see Whale Calf | ? | C++ | mixed | external | all | No | Proprietary |
| wisent | LALR(1) | ? | C++, Java | mixed | external | all | No | GNU GPL |
| Yacc AT&T/Sun | LALR(1) | YACC | C | mixed | external | POSIX | No | CPL & CDDL |
| Yacc++ | LR(1), LALR(1) | YACC | C++, C# | mixed | generated or external | all | No | Proprietary |
| Yapps | LL(1) | ? | Python | mixed | generated | all | No | MIT |
| yecc | LALR(1) | ? | Erlang | separate | generated | all | No | Erlang |
| Visual BNF | LR(1), LALR(1) | ? | C# | separate | generated | .NET Framework | Yes | Proprietary |
| YooParse | LR(1), LALR(1) | ? | C++ | mixed | external | all | No | MIT |
| Parse | LR(1) | BNF in C++ types | ? | ? | none | C++11 compliant compiler | No | MIT |
| GGLL | LL(1) | Graph | Java | mixed | generated | Windows | Yes | MIT |

Parsing expression grammars, deterministic boolean grammars

| Name | Parsing algorithm | Output languages | Grammar, code | Development platform | License |
|---|---|---|---|---|---|
| Arpeggio | PEG parser interpreter, Packrat | Python (no generation, interpreted) | mixed | all | MIT |
| AustenX | Packrat (modified) | Java | separate | all | BSD |
| Aurochs | Packrat | C, OCaml, Java | mixed | all | GNU GPL |
| BNFlite | Recursive descent | C++ | mixed | all | MIT |
| Canopy | Packrat | Java, JavaScript, Python, Ruby | separate | all | GNU GPL |
| CL-peg | Packrat | Common Lisp | mixed | all | MIT |
| Drat! | Packrat | D | mixed | all | GNU GPL |
| Frisby | Packrat | Haskell | mixed | all | BSD |
| grammar::peg | Packrat | Tcl | mixed | all | BSD |
| Grako | Packrat + Cut + Left Recursion | Python / C++ (beta) | separate | all | BSD |
| IronMeta | Packrat | C# | mixed | Microsoft Windows | BSD |
| Katahdin | Packrat (modified), mutating interpreter | C# | mixed | all | Public domain |
| Laja | 2-phase scannerless top-down backtracking + runtime support | Java | separate | all | GNU GPL |
| lars::Parser | Packrat (supporting left-recursion and grammar ambiguity) | C++ | identical | all | BSD |
| LPeg | Parsing Machine | Lua | mixed | all | MIT |
| lug | Parsing Machine | C++17 | mixed | all | MIT |
| Mouse | Recursive descent | Java | separate | Java Virtual Machine | Apache License 2.0 |
| Narwhal | Packrat | C | mixed | POSIX, Microsoft Windows | BSD |
| Nearley | Earley | JavaScript | mixed | all | MIT |
| Nemerle.Peg | Recursive descent + Pratt | Nemerle | separate | all | BSD |
| neotoma | Packrat | Erlang | separate | all | MIT |
| NPEG | Recursive descent | C# | mixed | all | MIT |
| OMeta | Packrat (modified, partial memoization) | JavaScript, Squeak, Python | mixed | all | MIT |
| PackCC | Packrat (modified) | C | mixed | all | MIT |
| Packrat | Packrat | Scheme | mixed | all | MIT |
| Pappy | Packrat | Haskell | mixed | all | BSD |
| parboiled | Recursive descent | Java, Scala | mixed | Java Virtual Machine | Apache License 2.0 |
| Lambda PEG | Recursive descent | Java | mixed | Java Virtual Machine | Apache License 2.0 |
| parsepp | Recursive descent | C++ | mixed | all | Public domain |
| Parsnip | Packrat | C++ | mixed | Microsoft Windows | GNU GPL |
| peg | Recursive descent | C | mixed | all | MIT |
| PEG.js | Packrat (partial memoization) | JavaScript | mixed | all | MIT |
| peg-parser | PEG parser interpreter | Dylan | separate | all | |
| Pegasus | Recursive descent / Packrat (selectively) | C# | mixed | Microsoft Windows | MIT |
| pegc | Recursive descent | C | mixed | all | Public domain |
| pest | Recursive descent | Rust | separate | all | MPL |
| PetitParser | Packrat | Smalltalk, Java, Dart | mixed | all | MIT |
| PEGTL | Recursive descent | C++11 | mixed | all | MIT |
| PGE | Hybrid recursive descent / operator precedence[5] | Parrot bytecode | mixed | Parrot virtual machine | Artistic 2.0 |
| PyPy rlib | Packrat | Python | mixed | all | MIT |
| pyPEG | PEG parser interpreter, Packrat | Python | mixed | all | GNU GPL |
| Rats! | Packrat | Java | mixed | Java Virtual Machine | GNU LGPL |
| Spirit2 | Recursive descent | C++ | mixed | all | Boost |
| textX | PEG parser interpreter, Packrat | Python (no generation, interpreted) | separate | all | MIT |
| Treetop | Recursive descent | Ruby | mixed | all | MIT |
| Yard | Recursive descent | C++ | mixed | all | MIT or Public domain |
| Waxeye | Parsing Machine | C, Java, JavaScript, Python, Racket, Ruby | separate | all | MIT |
| PHP PEG | ? (PEG Parser?) | PHP | mixed | all | BSD |

General context-free, conjunctive or boolean languages

| Name | Parsing algorithm | Input grammar notation | Output languages | Grammar, code | Lexer | Development platform | IDE | License |
|---|---|---|---|---|---|---|---|---|
| ACCENT | Earley | YACC variant | C | mixed | external | all | No | GNU GPL |
| APaGeD | GLR, LALR(1), LL(k) | ? | D | mixed | generated | all | No | Artistic |
| Bison | LALR(1), LR(1), IELR(1), GLR | YACC | C, C++, Java, XML | mixed (except XML) | external | all | No | GNU GPL |
| DMS Software Reengineering Toolkit | GLR | ? | Parlanse | mixed | generated | Microsoft Windows | No | Proprietary |
| DParser | Scannerless GLR | ? | C | mixed | scannerless | POSIX | No | BSD |
| Dypgen | runtime-extensible GLR | ? | OCaml | mixed | generated | all | No | CeCILL-B |
| E3 | Earley | ? | OCaml | mixed | external, or scannerless | all | No | ? |
| Elkhound | GLR | ? | C++, OCaml | mixed | external | all | No | BSD |
| eu.h8me.Parsing | GLR | ? | N/A (state machine is runtime generated) | separate | external | .NET Framework | No | BSD |
| GDK | LALR(1), GLR | ? | C, Lex, Haskell, HTML, Java, Object Pascal, Yacc | mixed | generated | POSIX | No | MIT |
| Happy | LALR, GLR | ? | Haskell | mixed | external | all | No | BSD |
| Hime Parser Generator | GLR | ? | C#, Java, Rust | separate | generated | .NET Framework, Java Virtual Machine | No | GNU LGPL |
| IronText Library | LALR(1), GLR | C# | C# | mixed | generated or external | .NET Framework | No | Apache License 2.0 |
| Jison | LALR(1), LR(0), SLR(1) | YACC | JavaScript, C#, PHP | mixed | generated | all | No | MIT |
| Syntax | LALR(1), LR(0), SLR(1) CLR(1) LL(1) | JSON/YACC | JavaScript, Python, PHP, Ruby, C#, Rust, Java | mixed | generated | all | No | MIT |
| Laja | Scannerless, two phase | Laja | Java | separate | scannerless | all | No | GNU GPL |
| ModelCC | Earley | Annotated class model | Java | generated | generated | all | No | BSD |
| parglare | Scannerless LR/GLR | BNF-like | Python interpreted, automata run-time generated | mixed | scannerless | all | No | MIT |
| P1 | Combinators | BNF-like | OCaml | mixed | external, or scannerless | all | No | ? |
| P3 | Earley/combinators | BNF-like | OCaml | mixed | external, or scannerless | all | No | ? |
| P4 | Earley/combinators, infinitary CFGs | BNF-like | OCaml | mixed | external, or scannerless | all | No | ? |
| Scannerless Boolean Parser | Scannerless GLR (Boolean grammars) | ? | Haskell, Java | separate | scannerless | Java Virtual Machine | No | BSD |
| SDF/SGLR | Scannerless GLR | SDF | C, Java | separate | scannerless | all | Yes | BSD |
| SmaCC | GLR(1), LALR(1), LR(1) | ? | Smalltalk | mixed | internal | all | Yes | MIT |
| SPARK | Earley | ? | Python | mixed | external | all | No | MIT |
| Tom | GLR | ? | C | generated | none | all | No | "No licensing or copyright restrictions" |
| UltraGram | LALR, LR, GLR | ? | C++, C#, Java, Visual Basic .NET | separate | generated | Microsoft Windows | Yes | Proprietary |
| Wormhole | Pruning, LR, GLR, Scannerless GLR | ? | C, Python | mixed | scannerless | Microsoft Windows | No | MIT |
| Whale Calf | General tabular, SLL(k), Linear normal form (Conjunctive grammars), LR, Binary normal form (Boolean grammars) | ? | C++ | separate | external | all | No | Proprietary |
| yaep | Earley | yacc like | C | mixed | external | all | No | LGPL |
| Zecc | Recursive Pattern Matching | Zecc/Zacc | Linkable Library | mixed | scannerless | macOS | Yes | Proprietary |

Context-sensitive grammars

| Name | Parsing algorithm | Input grammar notation | Boolean grammar capabilities | Development platform | License |
|---|---|---|---|---|---|
| LuZc[6][7] | delta chain | modular | Conjunctive, not complementary | POSIX | proprietary |
| bnf2xml | recursive descent (a text filter; output is XML) | simple BNF grammar (input matching), output is XML | ? | beta, and not a full-fledged EBNF parser | GNU GPLv2 |

See also

References

  1. http://www.colm.net/open-source/ragel/

  2. "Adaptive LL(*) Parsing: The Power of Dynamic Analysis" (PDF). Terence Parr. Retrieved 2016-04-03.

  3. "Building parsers for the web with JavaCC & GWT (Part one)". Chris Ainsley. Retrieved 2014-05-04.

  4. http://www.slkpg2.com/license.txt

  5. "Parrot: Grammar Engine". The Parrot Foundation. 2011. "PGE rules provide the full power of recursive descent parsing and operator precedence parsing."

  6. "LuZ: A context sensitive parser". 2016-10-17. Archived from the original on 2016-10-17. Retrieved 2018-10-17.

  7. "LuZc - A conjunctive context-sensitive parser". luzc.zohosites.com. Retrieved 2018-10-17.

Notes

  1. Bison 1.19 fork

External links

 

