Description
The aim of this project is to parse discontinuous constituents in natural language using Data-Oriented Parsing (DOP), with a focus on global world domination. The grammar is extracted from a treebank of sentences annotated with (discontinuous) phrase-structure trees. Concretely, this project provides a statistical constituency parser with support for discontinuous constituents and Data-Oriented Parsing. Discontinuous constituents are supported through the grammar formalism Linear Context-Free Rewriting System (LCFRS), which is a generalization of Probabilistic Context-Free Grammar (PCFG). Data-Oriented Parsing allows re-use of arbitrary-sized fragments from previously seen sentences using Tree-Substitution Grammar (TSG).
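As a minimal illustration of what "discontinuous" means here, the sketch below represents each constituent by the set of token positions it covers and counts how many contiguous blocks that set forms. In LCFRS terms this count is the fan-out: a value of 1 is an ordinary context-free constituent, anything higher is discontinuous. The function names `blocks` and `fanout`, the example sentence, and the labels are assumptions made up for this illustration; none of them are part of the discodop API.

```python
def blocks(indices):
    """Group a set of token positions into maximal contiguous runs."""
    runs = []
    for i in sorted(indices):
        if runs and i == runs[-1][-1] + 1:
            runs[-1].append(i)   # extends the current run
        else:
            runs.append([i])     # gap found: start a new run
    return runs


def fanout(indices):
    """Number of contiguous blocks covered by a constituent."""
    return len(blocks(indices))


# Hypothetical example: the particle verb "Wake ... up" is interrupted
# by its object, so its yield is not a single contiguous span.
tokens = ["Wake", "your", "friend", "up"]
constituents = {
    "VERB+PARTICLE": {0, 3},   # illustrative label, discontinuous
    "NP": {1, 2},              # contiguous
    "S": {0, 1, 2, 3},         # contiguous
}

for label, span in constituents.items():
    words = [[tokens[i] for i in run] for run in blocks(span)]
    print(label, fanout(span), words)
```

A plain PCFG can only describe constituents whose fan-out is 1, which is why a formalism such as LCFRS is needed for the interrupted constituent in this example.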
Repository
http://github.com/andreasvc/disco-dop.git
Project Slug
discodop
Last Built
4 months, 2 weeks ago (passed)
Home Page
http://github.com/andreasvc/disco-dop
Short URLs
discodop.readthedocs.io
discodop.rtfd.io
Default Version
latest
'latest' Version
master