Combinatoria léxica, polisemia y polisemia regular en una base de conocimiento léxico conceptual: el caso de "Redes diccionario combinatorio del español contemporáneo" y functional grammar knowledge base

  1. Sherwood Droz, Maia
Supervised by:
  1. Ricardo Mairal Usón Supervisor

Defended at: Universidad Complutense de Madrid

Date of defense: 3 February 2014

Committee:
  1. Joaquín Garrido Medina Chair
  2. Isabel Negro Alousque Secretary
  3. Olga Fernández Soriano Member
  4. Guadalupe Aguado de Cea Member
  5. Jose Carlos Periñán Pascual Member

Type: Thesis

Abstract

This dissertation attempts to link the linguistic information contained in REDES Diccionario combinatorio del español contemporáneo (Bosque, 2004), or REDES, to the ontological framework of the Functional Grammar Knowledge Base, or FunGramKB. REDES is a dictionary that records the systematic restrictions that some 4,000 Spanish predicates impose on their selection of lexical arguments, while FunGramKB is a multilingual, multipurpose lexical-conceptual knowledge base (KB) designed for Natural Language Processing (NLP). This work thus belongs to the field known as electronic lexicography for the 21st century, or for the third millennium (Fuertes and Tarp, 2011).

This new lexicography comprises electronic lexicographical resources that are much more complex than the commonly used digitized versions of traditional dictionaries. It consists of lexical databases or lexical knowledge bases, of varying complexity and depth, built on electronic platforms. Electronic lexicography can also serve the typical dictionary queries posed by humans, but its best-known uses are NLP applications such as machine translation (MT), question answering systems, information extraction, and voice recognition programs. However, the main problem facing NLP continues to be Word Sense Disambiguation (WSD). Most words in a language are polysemous, and the success of an NLP application depends on its ability to assign the correct sense to each word in a given context. A great deal of current work in NLP therefore focuses on finding strategies to disambiguate words effectively in context.

In working with REDES and FunGramKB, it might seem at first glance that we are dealing with two distant fields (print lexicography and knowledge engineering, respectively), but this thesis assumes that linking these resources would yield significant benefits for both.
Furthermore, such a link would combine two sources of valuable linguistic and theoretical data. On the one hand, REDES contributes thousands of patterns of systematic predicate-argument word combinations, taken from real usage in Spanish-language corpora. These combinations are not presented as “collocations”, that is, combinations between a single predicate and a single argument (or vice versa), but rather as combinations between predicates and “lexical classes”: groups of arguments that share a common semantic basis. On the other hand, FunGramKB contributes a multilevel electronic platform designed for NLP, which includes a conceptual, a lexical, and a grammatical model, and is built around a hierarchical, taxonomic ontology of universal cognitive concepts.