Diagnóstico y pronóstico en bases de datos clínicas con técnicas no supervisadas. Diagnosis and prognosis in clinical databases through unsupervised statistical techniques

  1. SÁNCHEZ RICO, MARINA LUCÍA
supervised by:
  1. Nicolas Hoertel Supervisor
  2. Jesús María Alvarado Izquierdo Supervisor

Defended at: Universidad Complutense de Madrid

Date of defense: 30 March 2022

Examination committee:
  1. Marta Evelia Aparicio García Chair
  2. Miguel Ángel Castellanos López Secretary
  3. Hugo Peyre Member
  4. José Manuel Reales Avilés Member
  5. Francisco José Abad García Member

Type: Doctoral thesis

Abstract

When working in clinical settings, epidemiological research can have, and frequently does have, a direct impact on patients. Observational studies based on hospital data can be extremely valuable tools, especially in situations in which time is a key element. They can cover a broad range of patients and test very complex associations, whether the aim is to study pathologies and their prevalence, characteristics, and associated risk factors or conditions, or to examine associations between treatments or interventions and clinical outcomes. In recent years there has been substantial growth in high-quality observational studies in epidemiology, which is hypothesised to be due to two main factors. The first is proper, strong study design that accounts for several potential sources of error arising from the lack of randomization in observational studies. The second is the proliferation and improvement of electronic health records (EHRs), which has allowed researchers to apply techniques from other fields of study in epidemiological settings. In this thesis we aimed to contribute to the study and implementation of machine learning techniques that make it possible to take advantage of EHRs and clinical databases in observational epidemiological studies. To that end, we incorporated unsupervised machine learning techniques for pattern identification to explore comorbidity patterns in hospitalized patients. In study 1, we compared the performance of three dimensionality reduction techniques (i.e., Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP)) when applied in combination with cluster analysis to find hidden diagnostic patterns, finding a superior performance of UMAP...