Building corpora for the development of a dependency parser for Spanish using Maltparser
- Herrera, Jesús
- Gervás Gómez-Navarro, Pablo
- Moriano, Pedro J.
- Muñoz, Alfonso
- Romero, Luis
ISSN: 1135-5948
Year of publication: 2007
Issue: 39
Pages: 181-186
Type: Article
More publications in: Procesamiento del lenguaje natural
Abstract
This paper details the process followed to create training and test corpora for a dependency parser generator (Maltparser). The starting point is the Cast3LB corpus, which contains constituency analyses of Spanish texts. These constituency analyses are automatically transformed into dependency analyses. The paper also describes how a set of syntactic function labels for the training corpus was obtained empirically and semi-automatically. The resulting dependency parser for Spanish achieves 91% precision when determining dependencies.
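The abstract does not say how the constituency analyses are turned into dependency analyses. As an illustration only, the sketch below uses a common generic technique, head-rule percolation: each phrase picks one child as its head, and the lexical heads of the remaining children become dependents of that head. The `Node` class, the `HEAD_RULES` table, the phrase labels, and the example sentence are all hypothetical; they are not taken from Cast3LB or from the authors' procedure.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """A constituency tree node (hypothetical structure for this sketch)."""
    label: str                          # phrase label or part-of-speech tag
    word: Optional[str] = None          # set only on leaf nodes
    children: List["Node"] = field(default_factory=list)


# Hypothetical head table: for each phrase label, the child labels to try,
# in priority order. A real table for Spanish would be far larger.
HEAD_RULES = {
    "S": ["VP", "NP"],
    "VP": ["V"],
    "NP": ["N"],
}


def find_head_child(node: Node) -> Node:
    """Pick the head child using HEAD_RULES, falling back to the leftmost child."""
    for wanted in HEAD_RULES.get(node.label, []):
        for child in node.children:
            if child.label == wanted:
                return child
    return node.children[0]


def to_dependencies(root: Node) -> List[Tuple[str, str]]:
    """Return (dependent, head) word pairs; the sentence head is attached to 'ROOT'.

    Words are identified by their surface form, so this sketch assumes
    that no word occurs twice in the sentence.
    """
    deps: List[Tuple[str, str]] = []

    def lexical_head(node: Node) -> str:
        if node.word is not None:       # leaf: the word is its own head
            return node.word
        head_child = find_head_child(node)
        head_word = lexical_head(head_child)
        # Every non-head child hangs from the head word of this phrase.
        for child in node.children:
            if child is not head_child:
                deps.append((lexical_head(child), head_word))
        return head_word

    deps.append((lexical_head(root), "ROOT"))
    return deps


# Toy constituency tree for "El gato come pescado".
tree = Node("S", children=[
    Node("NP", children=[Node("D", "El"), Node("N", "gato")]),
    Node("VP", children=[
        Node("V", "come"),
        Node("NP", children=[Node("N", "pescado")]),
    ]),
])

deps = to_dependencies(tree)
for dependent, head in deps:
    print(f"{dependent} <- {head}")
```

In this toy tree the verb heads the sentence, the nouns depend on the verb, and the determiner depends on its noun. The sketch leaves the arcs unlabelled; assigning syntactic function labels is a separate step.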