Sistemas de recuperación de información adaptados al dominio biomédico

  1. Marrero, Mónica
  2. Sánchez Cuadrado, Sonia
  3. Urbano, Julián
  4. Morato Lara, Jorge
  5. Moreiro González, José Antonio
Journal: El profesional de la información

ISSN: 1386-6710 1699-2407

Year of publication: 2010

Issue Title: Información biomédica

Volume: 19

Issue: 3

Pages: 246-254

Type: Article

DOI: 10.3145/EPI.2010.MAY.04 (open access)

Abstract

The terminology used in biomedicine has lexical characteristics that have required the elaboration of terminological resources and information retrieval systems with specific functionalities. The main characteristics are the high rates of synonymy and homonymy, due to phenomena such as the proliferation of polysemic acronyms and their interaction with common language. Information retrieval systems in the biomedical domain use techniques oriented to the treatment of these lexical peculiarities. In this paper we review some of these techniques, such as the application of Natural Language Processing (BioNLP), the incorporation of lexical-semantic resources, and the application of Named Entity Recognition (BioNER). Finally, we present the evaluation methods adopted to assess the suitability of these techniques for retrieving biomedical resources.

Bibliographic References

  • Ananiadou, Sophia (ed.); McNaught, John (ed.). Text mining for biology and biomedicine. Artech House, 2006, ISBN 978-1-58053-984-5.
  • Baeza-Yates, Ricardo. "Tendencias en minería de datos de la Web". El profesional de la información, 2009, v. 18, n. 1, pp. 5-10.
  • Bodenreider, Olivier. "Lexical, terminological and ontological resources for biological text mining". In: Ananiadou, Sophia (ed.); McNaught, John (ed.). Text mining for biology and biomedicine. Artech House, 2006, pp. 43-66, ISBN 978-1-58053-984-5. http://www.lhncbc.nlm.nih.gov/lhc/docs/published/2006/pub2006007.pdf
  • Clegg, Andrew B.; Shepherd, Adrian J. "Evaluating and integrating treebank parsers on a biomedical corpus". In: Workshop on software (43rd Annual meeting of the ACL), 2005. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.136.8412&rep=rep1&type=pdf
  • Cohen, Aaron; Hersh, William. "A survey of current work in biomedical text mining". Briefings in bioinformatics, 2005, v. 6, n. 1, pp. 57-71. http://bib.oxfordjournals.org/cgi/content/short/6/1/57
  • Collier, Nigel; Kawazoe, Ai; Jin, Lihua; Shigematsu, Mika; Dien, Dinh; Barrero, Roberto A.; Takeuchi, Koichi; Kawtrakul, Asanee. "A multilingual ontology for infectious disease surveillance: rationale, design and challenges". Language resources and evaluation, 2007, v. 40, n. 3-4, pp. 405-413. http://naist.cpe.ku.ac.th/downloads/publications/2007-n/Journal-Lecture-Notes/Multi-Onot-Disease.pdf
  • Cunningham, Hamish. "Information extraction, automatic". In: Brown, Keith (ed.). Encyclopedia of language and linguistics, v. 1-14, 2nd Edition, Elsevier Science Publishers, 2005, pp. 665-677. ISBN 0-08-044299-4. http://gate.ac.uk/sale/ell2/ie/main.pdf
  • Dingare, Shipra; Finkel, Jenny; Nissim, Malvina; Manning, Christopher; Grover, Claire. "A system for identifying named entities in biomedical text: how results from two evaluations reflect on both the system and the evaluations". In: BioLink meeting at ISMB, 2004.
  • Gaizauskas, Robert; Demetriou, George; Artymiuk, Pete J.; Willett, Peter. "Protein structures and information extraction from biological texts: the Pasta system". Bioinformatics, 2003, v. 19, n. 1, pp. 135-143. http://bioinformatics.oxfordjournals.org/cgi/content/abstract/19/1/135
  • Hersh, William. TREC genomics track protocol. Oregon Health & Science University, 2004. http://ir.ohsu.edu/genomics/2004protocol.html
  • Jacquemin, Christian. Spotting and discovering terms through natural language processing. Cambridge, MA: MIT Press, 2001, ISBN 0-262-10085-1.
  • Kawazoe, Ai; Jin, Lihua; Shigematsu, Mika; Bekki, Daisuke; Barrero, Roberto; Taniguchi, Kiyosu; Collier, Nigel. "The development of a schema for semantic annotation: gain brought by a formal ontological method". Applied ontology, 2009, v. 4, n. 1, pp. 5-20.
  • Leser, Ulf; Hakenberg, Jörg. "What makes a gene name? Named entity recognition in the biomedical literature". Briefings in bioinformatics, 2005, v. 6, n. 4, pp. 357-369.
  • Liu, Hongfang; Johnson, Stephen; Friedman, Carol. "Automatic resolution of ambiguous terms based on machine learning and conceptual relations in the UMLS". Journal of the American Medical Informatics Association, 2002, v. 9, n. 6, pp. 621-636.
  • McCray, Alexa T.; Browne, Allen C.; Bodenreider, Olivier. "The lexical properties of the Gene ontology (GO)". In: Proceedings of the AMIA symposium, 2002, pp. 504-508. http://www.lhncbc.nlm.nih.gov/lhc/docs/published/2002/pub2002030.pdf
  • Morgan, Alexander; Hirschman, Lynette; Colosimo, Marc; Yeh, Alexander; Colombe, Jeff. "Gene name identification and normalization using a model organism database". Journal of biomedical informatics, 2004, v. 37, n. 6, pp. 396-410.
  • Poibeau, Thierry; Kosseim, Leila. "Proper name extraction from non-journalistic texts". Language and computers, 2001, v. 37, pp. 144-157.
  • Rector, Alan; Stevens, Robert; Rogers, Jeremy. Simple bio upper ontology, 2006. http://www.cs.man.ac.uk/~rector/ontologies/simple-top-bio/
  • Rong, Xu; Morgan, Alex; Das, Amar K.; Garber, Alan. "Investigation of unsupervised pattern learning techniques for bootstrap construction of a medical treatment lexicon". In: BioNLP workshop, 2009, pp. 63-70. http://aclweb.org/anthology/W/W09/W09-1308.pdf
  • Rosse, Cornelius; Kumar, Anand; Mejino, Jose L. V.; Cook, Daniel L.; Detwiler, Landon T.; Smith, Barry. "A strategy for improving and integrating biomedical ontologies". In: Annual symposium of the AMIA, 2005, pp. 639-643. http://ontology.buffalo.edu/bio/OBR.pdf
  • Samwald, Matthias; Adlassnig, Klaus-Peter. "The bio-zen plus ontology". Applied ontology, 2008, v. 3, n. 4, pp. 213-217.
  • Schulze-Kremer, Steffen. "Adding semantics to genome databases: towards an ontology for molecular biology". In: 5th Int. conf. on intelligent systems for molecular biology, 1997, pp. 272-275.
  • Soldatova, Larisa N.; King, Ross D. "Are the current ontologies in biology good ontologies?". Nature biotechnology, 2005, v. 23, n. 9, pp. 1095-1098.
  • Spasic, Irena; Ananiadou, Sophia. "A flexible measure of contextual similarity for biomedical terms". In: Pacific symposium on biocomputing, 2005, pp. 197-208. http://helix-web.stanford.edu/psb05/spasic.pdf
  • Spasic, Irena; Ananiadou, Sophia; McNaught, John; Kumar, Anand. "Text mining and ontologies in biomedicine: making sense of raw text". Briefings in bioinformatics, 2005, v. 6, n. 3, pp. 239-251. http://bib.oxfordjournals.org/cgi/content/short/6/3/239
  • Stenzhorn, Holger; Schulz, Stefan; Beißwanger, Elena; Hahn, Udo; Van Den Hoek, László; Van Mulligen, Erik. "BioTop and ChemTop - Top-Domain ontologies for biology and chemistry". In: International Semantic Web Conference (Posters & Demos), 2008, pp. 1-2. http://www.imbi.uni-freiburg.de/ontology/biotop/publications/iswc08.pdf
  • Tsuruoka, Yoshimasa; Tsujii, Jun'ichi. "Improving the performance of dictionary-based approaches in protein name recognition". Journal of biomedical informatics, 2004, v. 37, n. 6, pp. 461-470.
  • Weeber, Marc; Klein, Henny; Aronson, Alan R.; Mork, James G.; De Jong-Van den Berg, Lolkje; Vos, Rein. "Text-based discovery in biomedicine: the architecture of the DAD-system". In: AMIA symposium, 2000, pp. 903-907. http://www.lhncbc.nlm.nih.gov/lhc/docs/published/2000/pub2000061.pdf
  • Zhou, GuoDong; Zhang, Jie; Su, Jian; Shen, Dan; Tan, ChewLim. "Recognizing names in biomedical texts: a machine learning approach". Bioinformatics, 2004, v. 20, n. 7, pp. 1178-1190. http://bioinformatics.oxfordjournals.org/cgi/content/short/20/7/1178