Sistemas de recuperación de información adaptados al dominio biomédico [Information retrieval systems adapted to the biomedical domain]

  1. Marrero, Mónica
  2. Sánchez Cuadrado, Sonia
  3. Urbano, Julián
  4. Morato Lara, Jorge
  5. Moreiro González, José Antonio
Journal:
El profesional de la información

ISSN: 1386-6710 1699-2407

Year of publication: 2010

Issue title: Información biomédica

Volume: 19

Issue number: 3

Pages: 246-254

Type: Article

DOI: 10.3145/EPI.2010.MAY.04 (open access)


Abstract

Biomedical terminology has lexical features that have required the development of terminological resources and information retrieval systems with specific functions. The main characteristics are high rates of synonymy and homonymy, caused by phenomena such as the proliferation of polysemous acronyms and their interaction with common language. Information retrieval systems in the biomedical domain use techniques designed to handle these lexical peculiarities. This article reviews some of these techniques, such as the application of natural language processing (BioNLP), the incorporation of lexical-semantic resources, and the application of named entity recognition (BioNER). It also presents the evaluation methods adopted to check how well these techniques suit the retrieval of biomedical resources.

References

  • Ananiadou, Sophia (ed.); McNaught, John (ed.). Text mining for biology and biomedicine. Artech House, 2006, ISBN 978-1-58053-984-5.
  • Baeza-Yates, Ricardo. "Tendencias en minería de datos de la Web". El profesional de la información, 2009, v. 18, n. 1, pp. 5-10.
  • Bodenreider, Olivier. "Lexical, terminological and ontological resources for biological text mining". In: Ananiadou, Sophia (ed.); McNaught, John (ed.). Text mining for biology and biomedicine. Artech House, 2006, pp. 43-66, ISBN 978-1-58053-984-5. http://www.lhncbc.nlm.nih.gov/lhc/docs/published/2006/pub2006007.pdf
  • Clegg, Andrew B.; Shepherd, Adrian J. "Evaluating and integrating treebank parsers on a biomedical corpus". In: Workshop on software (43rd Annual meeting of the ACL), 2005. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.136.8412&rep=rep1&type=pdf
  • Cohen, Aaron; Hersh, William. "A survey of current work in biomedical text mining". Briefings in bioinformatics, 2005, v. 6, n. 1, pp. 57-71. http://bib.oxfordjournals.org/cgi/content/short/6/1/57
  • Collier, Nigel; Kawazoe, Ai; Jin, Lihua; Shigematsu, Mika; Dien, Dinh; Barrero, Roberto A.; Takeuchi, Koichi; Kawtrakul, Asanee. "A multilingual ontology for infectious disease surveillance: rationale, design and challenges". Language resources and evaluation, 2007, v. 40, n. 3-4, pp. 405-413. http://naist.cpe.ku.ac.th/downloads/publications/2007-n/Journal-Lecture-Notes/Multi-Onot-Disease.pdf
  • Cunningham, Hamish. "Information extraction, automatic". In: Brown, Keith (ed.). Encyclopedia of language and linguistics, v. 1-14, 2nd Edition, Elsevier Science Publishers, 2005, pp. 665-677. ISBN 0-08-044299-4. http://gate.ac.uk/sale/ell2/ie/main.pdf
  • Dingare, Shipra; Finkel, Jenny; Nissim, Malvina; Manning, Christopher; Grover, Claire. "A system for identifying named entities in biomedical text: how results from two evaluations reflect on both the system and the evaluations". In: BioLink meeting at ISMB, 2004.
  • Gaizauskas, Robert; Demetriou, George; Artymiuk, Pete J.; Willett, Peter. "Protein structures and information extraction from biological texts: the Pasta system". Bioinformatics, 2003, v. 19, n. 1, pp. 135-143. http://bioinformatics.oxfordjournals.org/cgi/content/abstract/19/1/135
  • Hersh, William. TREC genomics track protocol. Oregon Health & Science University, 2004. http://ir.ohsu.edu/genomics/2004protocol.html
  • Jacquemin, Christian. Spotting and discovering terms through natural language processing. Cambridge, MA: MIT Press, 2001, ISBN 0-262-10085-1.
  • Kawazoe, Ai; Jin, Lihua; Shigematsu, Mika; Bekki, Daisuke; Barrero, Roberto; Taniguchi, Kiyosu; Collier, Nigel. "The development of a schema for semantic annotation: gain brought by a formal ontological method". Applied ontology, 2009, v. 4, n. 1, pp. 5-20.
  • Leser, Ulf; Hakenberg, Jörg. "What makes a gene name? Named entity recognition in the biomedical literature". Briefings in bioinformatics, 2005, v. 6, n. 4, pp. 357-369.
  • Liu, Hongfang; Johnson, Stephen; Friedman, Carol. "Automatic resolution of ambiguous terms based on machine learning and conceptual relations in the UMLS". Journal of the American Medical Informatics Association, 2002, v. 9, n. 6, pp. 621-636.
  • McCray, Alexa T.; Browne, Allen C.; Bodenreider, Olivier. "The lexical properties of the Gene ontology (GO)". In: Proceedings of the AMIA symposium, 2002, pp. 504-508. http://www.lhncbc.nlm.nih.gov/lhc/docs/published/2002/pub2002030.pdf
  • Morgan, Alexander; Hirschman, Lynette; Colosimo, Marc; Yeh, Alexander; Colombe, Jeff. "Gene name identification and normalization using a model organism database". Journal of biomedical informatics, 2004, v. 37, n. 6, pp. 396-410.
  • Poibeau, Thierry; Kosseim, Leila. "Proper name extraction from non-journalistic texts". Language and computers, 2001, v. 37, pp. 144-157.
  • Rector, Alan; Stevens, Robert; Rogers, Jeremy. Simple bio upper ontology, 2006. http://www.cs.man.ac.uk/~rector/ontologies/simple-top-bio/
  • Rong, Xu; Morgan, Alex; Das, Amar K.; Garber, Alan. "Investigation of unsupervised pattern learning techniques for bootstrap construction of a medical treatment lexicon". In: BioNLP workshop, 2009, pp. 63-70. http://aclweb.org/anthology/W/W09/W09-1308.pdf
  • Rosse, Cornelius; Kumar, Anand; Mejino, Jose L. V.; Cook, Daniel L.; Detwiler, Landon T.; Smith, Barry. "A strategy for improving and integrating biomedical ontologies". In: Annual symposium of the AMIA, 2005, pp. 639-643. http://ontology.buffalo.edu/bio/OBR.pdf
  • Samwald, Matthias; Adlassnig, Klaus-Peter. "The bio-zen plus ontology". Applied ontology, 2008, v. 3, n. 4, pp. 213-217.
  • Schulze-Kremer, Steffen. "Adding semantics to genome databases: towards an ontology for molecular biology". In: 5th Int. conf. on intelligent systems for molecular biology, 1997, pp. 272-275.
  • Soldatova, Larisa N.; King, Ross D. "Are the current ontologies in biology good ontologies?". Nature biotechnology, 2005, v. 23, n. 9, pp. 1095-1098.
  • Spasic, Irena; Ananiadou, Sophia. "A flexible measure of contextual similarity for biomedical terms". In: Pacific symposium on biocomputing, 2005, pp. 197-208. http://helix-web.stanford.edu/psb05/spasic.pdf
  • Spasic, Irena; Ananiadou, Sophia; McNaught, John; Kumar, Anand. "Text mining and ontologies in biomedicine: making sense of raw text". Briefings in bioinformatics, 2005, v. 6, n. 3, pp. 239-251. http://bib.oxfordjournals.org/cgi/content/short/6/3/239
  • Stenzhorn, Holger; Schulz, Stefan; Beißwanger, Elena; Hahn, Udo; Van Den Hoek, László; Van Mulligen, Erik. "BioTop and ChemTop - Top-Domain ontologies for biology and chemistry". In: International Semantic Web Conference (Posters & Demos), 2008, pp. 1-2. http://www.imbi.uni-freiburg.de/ontology/biotop/publications/iswc08.pdf
  • Tsuruoka, Yoshimasa; Tsujii, Jun'ichi. "Improving the performance of dictionary-based approaches in protein name recognition". Journal of biomedical informatics, 2004, v. 37, n. 6, pp. 461-470.
  • Weeber, Marc; Klein, Henny; Aronson, Alan R.; Mork, James G.; De Jong-Van den Berg, Lolkje; Vos, Rein. "Text-based discovery in biomedicine: the architecture of the DAD-system". In: AMIA symposium, 2000, pp. 903-907. http://www.lhncbc.nlm.nih.gov/lhc/docs/published/2000/pub2000061.pdf
  • Zhou, GuoDong; Zhang, Jie; Su, Jian; Shen, Dan; Tan, ChewLim. "Recognizing names in biomedical texts: a machine learning approach". Bioinformatics, 2004, v. 20, n. 7, pp. 1178-1190. http://bioinformatics.oxfordjournals.org/cgi/content/short/20/7/1178