Diagnóstico y pronóstico en bases de datos clínicas con técnicas no supervisadas. Diagnosis and prognosis in clinical databases through unsupervised statistical techniques

  1. SÁNCHEZ RICO, MARINA LUCÍA
Supervised by:
  1. Nicolas Hoertel Supervisor
  2. Jesús María Alvarado Izquierdo Supervisor

University of defense: Universidad Complutense de Madrid

Date of defense: 30 March 2022

Committee:
  1. Marta Evelia Aparicio García Chair
  2. Miguel Ángel Castellanos López Secretary
  3. Hugo Peyre Member
  4. José Manuel Reales Avilés Member
  5. Francisco José Abad García Member

Type: Thesis

Abstract

In clinical settings, epidemiological research can have, and frequently does have, a direct impact on patients. Observational studies based on hospital data can be extremely valuable tools, especially in situations in which time is a key element. They can cover a broad range of patients and test very complex associations, whether the aim is to study pathologies, their prevalence, characteristics, and associated risk factors or conditions, or to examine associations between treatments or interventions and clinical outcomes. In recent years there has been substantial growth in high-quality observational studies in epidemiology, which is hypothesised to be due to two main factors. The first is the use of proper, strong designs that account for several potential sources of error arising from the lack of randomization in observational studies. The second is the proliferation and improvement of electronic health records (EHRs), which has allowed researchers to apply techniques from other fields of study to epidemiological settings. In this thesis we aimed to contribute to the study and implementation of machine learning techniques that make it possible to take advantage of EHRs and clinical databases in observational epidemiological studies. To that end, we used unsupervised machine learning techniques for pattern identification to explore comorbidity patterns in hospitalized patients. In study 1, we compared the performance of three dimensionality reduction techniques (i.e., Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP)) when applied in combination with cluster analysis to find hidden diagnostic patterns, finding a superior performance of UMAP...
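
The kind of comparison described for study 1 can be illustrated with a minimal sketch. It is not the thesis's actual pipeline: the abstract does not state which clustering algorithm, evaluation metric, or data encoding was used, so k-means, the silhouette score, and the synthetic binary diagnosis matrix below are all assumptions made for illustration. The function names `make_synthetic_diagnoses` and `embed_and_cluster` are hypothetical. UMAP is included only if the third-party `umap-learn` package happens to be installed.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score


def make_synthetic_diagnoses(n_per_group=60, n_codes=20, seed=0):
    """Simulate binary diagnosis indicators for three made-up comorbidity groups."""
    rng = np.random.default_rng(seed)
    blocks = []
    for g in range(3):
        # Each group is likely to carry its own block of six codes, unlikely the rest.
        probs = np.full(n_codes, 0.05)
        probs[g * 6:(g + 1) * 6] = 0.8
        blocks.append(rng.random((n_per_group, n_codes)) < probs)
    return np.vstack(blocks).astype(float)


def embed_and_cluster(X, n_clusters=3, seed=0):
    """Reduce X to 2-D with each technique, run k-means, return silhouette scores."""
    reducers = {
        "PCA": PCA(n_components=2, random_state=seed),
        "t-SNE": TSNE(n_components=2, perplexity=30, init="pca", random_state=seed),
    }
    try:
        # Assumption: umap-learn is an optional dependency; skipped if missing.
        import umap
        reducers["UMAP"] = umap.UMAP(n_components=2, random_state=seed)
    except ImportError:
        pass

    scores = {}
    for name, reducer in reducers.items():
        embedding = reducer.fit_transform(X)
        labels = KMeans(n_clusters=n_clusters, n_init=10,
                        random_state=seed).fit_predict(embedding)
        # Silhouette is computed in the embedded space (an illustrative choice).
        scores[name] = silhouette_score(embedding, labels)
    return scores


X = make_synthetic_diagnoses()
scores = embed_and_cluster(X)
for name, score in scores.items():
    print(f"{name}: silhouette = {score:.3f}")
```

Because the silhouette score here is computed inside each embedding, it mostly reflects how compact the clusters look after projection. A more careful comparison would also check cluster stability or agreement with external clinical labels.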