Lexer and Parser Generators


  • Summary
    Published Papers
    Source Code
    Manuals
  • AFLEX & AYACC

    Summary Aflex and Ayacc are similar to the Unix tools Lex and Yacc, but they are written in Ada and generate Ada output.
    Published Papers Click here to find papers
    Source Code Click here to obtain source code
    Manuals Click here for documentation
  • ALE

    Summary The Attribute-Logic Engine integrates phrase structure parsing and constraint logic programming with typed feature structures as terms.
    Published Papers Click here to find papers
    Source Code Click here to obtain ALE
    Manuals Click here for manuals
  • ATS

    Summary Generates table-driven LL(1) parsers with full insert-only error recovery. It also supports full left-attribute semantic handling, which is much easier to work with than YACC's parser actions.
    Published Papers
    Source Code
    Manuals
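For readers unfamiliar with the term, a table-driven LL(1) parser can be sketched in a few lines. This is only an illustration of the general technique, not ATS output: the grammar, the table layout, and the name ll1_accepts are assumptions made up for this sketch, and the insert-only error recovery mentioned above is not modelled.

```python
# Hypothetical LL(1) grammar for balanced parentheses (illustration only):
#   S -> "(" S ")" S | empty
# The parse table maps (nonterminal, lookahead) to the production to expand.
TABLE = {
    ("S", "("): ["(", "S", ")", "S"],
    ("S", ")"): [],  # S -> empty when the lookahead is in FOLLOW(S)
    ("S", "$"): [],
}
NONTERMINALS = {"S"}


def ll1_accepts(text):
    """Return True if text is a balanced string of parentheses."""
    tokens = list(text) + ["$"]  # "$" marks the end of input
    stack = ["$", "S"]           # start symbol on top of the end marker
    pos = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            production = TABLE.get((top, lookahead))
            if production is None:
                return False     # no table entry: syntax error
            # Push the right-hand side so its first symbol ends up on top.
            stack.extend(reversed(production))
        elif top == lookahead:
            pos += 1             # terminal matches: consume the token
        else:
            return False         # terminal mismatch: syntax error
    return pos == len(tokens)
```

Calling ll1_accepts("(()())()") returns True, while ll1_accepts("(()") returns False. The parser itself is a fixed loop; only the table changes from grammar to grammar, which is why a generator can produce it mechanically.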
  • BYACC

    Summary Berkeley Yacc is a public domain LALR(1) parser generator. It has been made as compatible as possible with AT&T Yacc.
    Published Papers Click here to find papers
    Source Code Click here to obtain BYACC. Here is perl-byacc: a byacc with a -P option that causes Perl code to be generated.
    Manuals Click here for manuals
  • BISON

    Summary Bison is the GNU project's parser generator. It is very similar to YACC and largely compatible with it.
    Published Papers
    Source Code Click here to obtain BISON
    Manuals
  • BISON++

    Summary C++-ized version of BISON. E-mail author Alain Coetmeur.
    Published Papers
    Source Code Found in this ftp directory.
    Manuals Download some documentation.
  • COCO

    Summary Coco/R reads an attributed EBNF grammar of a language and translates it into a recursive descent parser and a scanner for that language. Using Coco/R, compilers can be generated that are as efficient as hand-coded and carefully optimized production quality compilers. Almost as important as efficiency is the simplicity and adequacy of the system. Programmers will not use a tool that does not fit their work, or that relies on an arcane notation or a mass of options and special cases. Coco/R puts simplicity and efficiency over power.
    Published Papers Click here to find papers
    Source Code Click here to obtain COCO
    Manuals Click here for manual
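As a rough picture of what "a recursive descent parser and a scanner" derived from an attributed EBNF grammar look like, here is a hand-written sketch. It is not Coco/R output and does not use Coco/R's notation; the grammar, the class Parser, and the functions tokenize and evaluate are all assumptions chosen for illustration.

```python
import re

# Hypothetical EBNF grammar (not taken from the Coco/R documentation):
#   Expr   = Term   { ("+" | "-") Term } .
#   Term   = Factor { ("*" | "/") Factor } .
#   Factor = number | "(" Expr ")" .
# The "attribute" computed for every rule is the integer value of the phrase.

TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Scanner: split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """One method per grammar rule; each returns the rule's value."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):  # EBNF repetition becomes a loop
            op = self.tokens[self.pos]
            self.pos += 1
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.tokens[self.pos]
            self.pos += 1
            rhs = self.factor()
            # Integer division keeps the sketch in whole numbers.
            value = value * rhs if op == "*" else value // rhs
        return value

    def factor(self):
        token = self.peek()
        if token == "(":
            self.pos += 1
            value = self.expr()
            self.expect(")")
            return value
        if token is not None and token.isdigit():
            self.pos += 1
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    """Parse a complete expression and return its value."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at token {parser.peek()!r}")
    return value
```

With these definitions, evaluate("(1 + 2) * 3") returns 9. Each grammar rule maps onto one method and each EBNF repetition onto one loop, which is what makes this style of parser easy to generate and to read.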
  • DEPOT4

    Summary A top-down parser generator that supports specifications in a style similar to syntax-directed translation schemes. The specification language is based on EBNF. Depot4 is intended for use by non-experts implementing domain-specific languages.
    Published Papers Scroll down this site to find papers
    Source Code DEPOT4/Oberon
    DEPOT4/Java
    Manuals Click here for documentation
  • EAG

    Summary The Extended Affix Grammar formalism, or EAG for short, is a formalism for describing both the context-free and the context-sensitive syntax of languages. EAG is a member of the family of two-level grammars and is very closely related to two-level van Wijngaarden grammars. The EAG compiler will generate a recognizer, a transducer, a translator, or a syntax-directed editor for a language described in the EAG formalism.
    Published Papers E-mail the author to find out about published papers
    Source Code Click here to obtain EAG
    Manuals Click here for manuals
  • FLEX

    Summary FLEX is a tool for generating scanners.
    Published Papers
    Source Code Click here to obtain FLEX
    Manuals Click here to obtain manual
  • FLEX++

    Summary C++-ized version of FLEX. E-mail author Alain Coetmeur.
    Published Papers
    Source Code Found in this ftp directory.
    Manuals Download some documentation.
  • GRAY

    Summary Gray is a parser generator written in Forth. It takes grammars in an extended BNF and produces executable Forth code for recursive descent parsers.
    Published Papers Here's one, but it's in German.
    Source Code Download the source code for version 4.
    Manuals Some documentation with the source code.
  • Ibpag2

    Summary Ibpag2 is a parser generator for Icon. It does most of what you would expect. The latest version can handle both SLR(1) and GLR (Tomita) grammars.
    Published Papers
    Source Code Ibpag2 is in comp.sources.misc. Here it is.
    Manuals I think documentation is included within the source code. Because comp.sources.misc is practically extinct, Ibpag2 is in an old archive, and not very user-friendly to access.
  • KwParsing

    Summary kwParser is a parser generator for Python. It transforms an abstract specification of a language grammar (for example the CORBA Interface Definition Language) together with "interpretation functions" that define the semantics of the language into a compiler, translator, or interpreter. In the case of CORBA IDL, a Python program using kwParser could generate stubs and support code (in Python or some other language) to talk to a CORBA interface.
    Published Papers
    Source Code Source code links found here .
    Manuals Documentation included with source code.
  • Lalr

    Summary Lalr is a parser generator that generates very fast and powerful parsers. Lalr is a part of the COCKTAIL compiler construction kit.
    Published Papers
    Source Code
    Manuals
  • LEX

    Summary Lex generates programs to be used in simple lexical analysis of text. Lex is the classical scanner generator that comes with Unix.
    Published Papers
    Source Code Available on all UNIX machines
    Manuals Click here for documentation
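The matching behaviour of a Lex-style scanner can be sketched as follows: at each input position every rule is tried, the longest match wins, and the earliest rule wins a tie. This is a hand-written illustration in Python rather than generated code, and it checks the rules one by one instead of building the single automaton a real generator would produce. The rule set and the name scan are assumptions made up for this sketch.

```python
import re

# Hypothetical rule set: (token name, regular expression), in priority order.
RULES = [
    ("IF", r"if"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"[0-9]+"),
    ("OP", r"[-+*/=]"),
    ("SKIP", r"[ \t\n]+"),
]
COMPILED = [(name, re.compile(pattern)) for name, pattern in RULES]


def scan(text):
    """Return a list of (token name, lexeme) pairs, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        best_name, best_end = None, pos
        for name, regex in COMPILED:
            match = regex.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so the earliest rule wins when two rules match equally far.
            if match and match.end() > best_end:
                best_name, best_end = name, match.end()
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "SKIP":
            tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens
```

For the input "if iffy = 42" this yields IF for "if" (a tie with IDENT, settled by rule order) and IDENT for "iffy" (the longer match), followed by OP and NUMBER tokens.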
  • LLGEN

    Summary LLgen is a tool for generating an efficient recursive descent parser from an ELL(1) grammar. The grammar may be ambiguous or more general than ELL(1): there are both static and dynamic facilities to resolve the ambiguities.
    Published Papers Scroll down here to find paper link and reference
    Source Code Download source code
    Manuals Documentation (PostScript) is here
  • LISA

    Summary LISA generates table-driven lexical analysers and LL(1) syntax analysers from regular expressions and BNF. LISA supports Kennedy-Warren and Katayama attribute evaluators.
    Published Papers Here is a paper
    Source Code
    Manuals
  • MKS Lex & Yacc

    Summary Lex-compatible scanner generator and Yacc-compatible parser generator for PCs.
    Published Papers
    Source Code Click here for information about buying MKS Lex & Yacc
    Manuals
  • MUSKOX

    Summary MUSKOX partitions a software system into higher-level design declarations of the laws of state transitions and lower-level C++ code that takes care of the details. The MUSKOX compiler converts the declarations of the laws of state transitions into C++ programs that control the application at runtime, thereby making the design a driving part of the running application. MUSKOX is currently used as the core software component in the High Availability UNIX clusters, where it provides availability and failover of hardware and software resources. MUSKOX is also suited to encoding object behavior in object-oriented frameworks and to building complex distributed software such as transaction processing systems, Web servers, and workflow engines. Last but not least, MUSKOX is an efficient C++ parser-generator tool.

    MUSKOX v4.0 annotates C++ & Java classes and interfaces with EBNF LR(k) grammars. It features grammar inheritance and redefinition of rules. It also supports multiple parsers, recording/replay of trace logs, and HTML pretty-printing. Default and user-defined lexers and error processing are available. In addition, the new version provides syntactic lookaheads, semantic predicates, and syntax trees.

    Published Papers
    Source Code Click here for details about obtaining the source code
    Manuals Click here for documentation links
  • NewYacc

    Summary NewYacc, an extension of the UNIX tool Yacc, is a tool that allows users to do source-to-source transformations on input languages. A specification for NewYacc contains all the elements of a Yacc specification with the addition of user-defined traversal specifications.
    Published Papers
    Source Code Here are some source code links.
    Manuals Spawn ps version of manual.
  • PRECCX

    Summary PRECC eXtended is an infinite-lookahead compiler compiler for context-dependent grammars. Specification scripts are written in an extended BNF, with inherited and synthesized attributes allowed.
    Published Papers A listing of papers
    Source Code Click here to obtain PRECCX
    Manuals Documentation is here
  • Ratatosk

    Summary Ratatosk started as an exercise to learn to use Gofer, but grew to be a usable system for generating parsers and scanners to use with Gofer programs. The parser generator generates SLR parsers from a context free syntax, and allows semantic "actions" to be used. The generated parsers will backtrack over shift/reduce and reduce/reduce conflicts, so non-SLR languages can be parsed, at the cost of some extra run-time. You will, however, get warnings about any conflicts found while generating a parser.
    Published Papers
    Source Code Download source code.
    Manuals Text manual included with source code.
  • Rex

    Summary Rex generates programs to be used in lexical analysis of text, for example the generation of scanners for compilers. Rex is a part of the COCKTAIL compiler construction kit.
    Published Papers
    Source Code
    Manuals
  • RDP

    Summary RDP compiles attributed LL(1) grammars decorated with C-language semantic actions into recursive descent compilers.
    Published Papers
    Source Code Here is the source code
    Manuals Download documentation
  • RE2C

    Summary re2c is a tool for generating C-based recognizers from regular expressions. The generated code is not tied to any particular input model.
    Published Papers Here is a paper (PostScript)
    Source Code Download source code
    Manuals Quick reference manual is included with source code in the re2c.0.5/bootstrap directory
  • S/SL

    Summary S/SL is a programming language for constructing compilers. It incorporates sequences, repetitions and selections of actions (statements); input, matching and output of tokens; output of error signals; subprograms (called rules); and invocation of semantic operations.
    Published Papers R.C. Holt, J.R. Cordy, and D.B. Wortman,
    "An Introduction to S/SL: Syntax/Semantic Language",
    ACM Transactions on Programming Languages and Systems (TOPLAS),
    Vol 4, No. 2, April 1982, Pages 149-178.
    Source Code Click here to obtain S/SL
    Manuals
  • SCANGEN, LLGEN, LALRGEN

    Summary ScanGen, LLGen, and LALRGen are, respectively, the scanner generator, LL(1) parser generator, and LALR(1) parser generator introduced in "Crafting a Compiler" by Fischer & LeBlanc.
    Consult the READTHIS file to discover which files each system (SCANGEN, LLGEN, and LALRGEN) requires.
    Published Papers
    Source Code Here is the directory containing the source code.
    Manuals Here is the ftp directory containing documentation
  • SML-Yacc

    Summary ML-Yacc is a parser generator for Standard ML modeled after the Yacc parser generator. It generates parsers for LALR languages, like Yacc, and has a similar syntax. The generated parsers use a different algorithm for recovering from syntax errors than parsers generated by Yacc.
    Published Papers
    Source Code Download the source code.
    Manuals Manual included with source code.
  • SML-Lex

    Summary ML-Lex is a variant of Lex for the ML programming language. ML-Lex has a syntax similar to Lex, and produces an ML program instead of a C program. ML-Lex produces a program that runs very efficiently. Typically the program will be as fast as, or even faster than, a hand-coded lexer implemented in Standard ML.
    Published Papers
    Source Code Download source code.
    Manuals Documentation is included with the source code.
  • tcllex

    Summary tcllex is a scanner generator. It produces an output file in Tcl that can be used to perform lexical analysis.
    Published Papers
    Source Code
    Manuals Very basic manual here
  • T-gen

    Summary T-gen is a general-purpose object-oriented tool for the automatic generation of string-to-object translators. It is written in Smalltalk and lives in the Smalltalk programming environment. T-gen supports the generation of both top-down (LL) and bottom-up (LR) parsers, which will automatically generate derivation trees, abstract syntax trees, or arbitrary Smalltalk objects. The simple specification syntax and graphical user interface are intended to enhance the learning, comprehension, and usefulness of T-gen.
    Published Papers
    Source Code Click here to find out how to get the source code.
    Manuals Spawn a PostScript manual.
  • TOM

    Summary An implementation of the Tomita parsing algorithm using LR(0) tables and dynamic programming. This is just a demo for the Tomita Parsing algorithm.
    Published Papers Tomita (ed.), _Generalized LR Parsing_,
    Kluwer, 1991, ISBN 0-7923-9201-9.
    "The Tomita Parsing Algorithm ...",
    comp.compilers, May 20, 1994.
    Source Code Download the source code.
    Manuals Some documentation included with source code.
  • TROOPER

    Summary TROOPER encapsulates the complexities of lexical analysis and parsing of Eiffel texts, utilising an extended parse library which evolved from the ISE EiffelParse library. It provides a complete working parser written entirely in Eiffel. Semantic actions can be added to the parsing constructs to develop different tools such as pretty-printers, compilers, etc.
    Published Papers Avotins, J., Mingins, C., Schmidt, H.,
    "Yes! An Object-Oriented Compiler Compiler (YOOCC)",
    Technology of Object-Oriented Languages and Systems,
    (TOOLS) USA 1995.
    Avotins, J., Maughan, G., Mingins, C.,
    "Language Processor Construction: The Case for YOOCC and TROOPER",
    Technology of Object-Oriented Languages and Systems,
    (TOOLS) USA 1995.
    Avotins, J., Maughan, G., Mingins, C.,
    "YOOCC and TROOPER: Just What the ISE Parse Library Needed",
    The Newsletter for the International Eiffel Community,
    Volume 4, Number 4, Summer/Fall, 1995.
    Source Code Click here to obtain TROOPER
    Manuals Click here for how to use TROOPER
  • VISUALPARSE++

    Summary Visual Parse++ provides a visual interface which lets any programmer learn and utilize lexing and parsing technology interactively.
    Published Papers
    Source Code Here is some information about how you can download Visual Parse++ and get a temporary key so you can try it out
    Manuals Click on the Request Info link on this page to receive documentation
  • YACC

    Summary AT&T Yacc is the classical parser generator that comes with Unix.
    Published Papers
    Source Code
    Manuals
  • YACC++

    Summary Not just a set of C++ wrappers around lex and yacc output, Yacc++ and the Language Objects Library is an O-O rewrite of lex and yacc. Features include grammar classes with inheritance, regular expressions efficiently integrated into LR parsing, and solutions to include files, substring keywords, nested comments, and more.
    Published Papers
    Source Code Email Authors
    Manuals Email Authors
  • YOOCC

    Summary
    Published Papers Avotins, J., Mingins, C., Schmidt, H.,
    "Yes! An Object-Oriented Compiler Compiler (YOOCC)",
    Technology of Object-Oriented Languages and Systems,
    (TOOLS) USA 1995.
    Avotins, J., Maughan, G., Mingins, C.,
    "Language Processor Construction: The Case for YOOCC and TROOPER",
    Technology of Object-Oriented Languages and Systems,
    (TOOLS) USA 1995.
    Avotins, J., Maughan, G., Mingins, C.,
    "YOOCC and TROOPER: Just What the ISE Parse Library Needed",
    The Newsletter for the International Eiffel Community,
    Volume 4, Number 4, Summer/Fall, 1995.
    Source Code Click here to obtain YOOCC
    Manuals Click here for how to use YOOCC