Non-invasive diagnosis of human diseases by combining breath analysis and neural network modeling

  1. CANCILLA BUENACHE, JUAN CARLOS
Supervised by:
  1. José Santiago Torrecilla Velasco Director

Defence university: Universidad Complutense de Madrid

Defence date: 09 September 2016

Committee:
  1. Mª Luz Mena Fernández Chair
  2. Oscar Palomares Gracia Secretary
  3. Gema Moreno Bueno Committee member
  4. Hans H. Riese Jordá Committee member
  5. Joaquín Arribas López Committee member
Department:
  1. Ingeniería Química y de Materiales

Type: Thesis

Abstract

It is well established that the moment at which a disease is detected or diagnosed directly affects its consequences for the patient, since early detection is generally linked to a more favorable outcome. This concept is the basis of the present research, whose main goal is the development of mathematical tools based on computational artificial intelligence to detect multiple diseases safely and non-invasively. To develop these tools, the research has focused on the breath analysis of patients with diverse diseases, using several analytical methodologies to extract the information contained in the samples, and multiple feature selection algorithms and neural networks for data analysis. Previous work has shown a correlation between the molecular composition of breath and the clinical status of a human being, demonstrating the existence of volatile biomarkers whose presence or amount can aid in disease detection. During this research, two main types of analytical approaches were employed to study the gaseous samples: cross-reactive sensor arrays (based on organically functionalized silicon nanowire field-effect transistors (SiNW FETs) or gold nanoparticles (GNPs)) and proton transfer reaction-mass spectrometry (PTR-MS). The cross-reactive sensors analyze the bulk of the breath samples, offering global, fingerprint-like information, whereas PTR-MS quantifies the volatile molecules present in the samples. All of the analytical equipment employed generates large amounts of data per sample, which calls for meticulous mathematical analysis to interpret the results adequately. In this work, two fundamental types of mathematical tools were used.
First, a set of five filter-based feature selection algorithms (χ² (chi-squared) score, Fisher’s discriminant ratio, Kruskal-Wallis test, Relief-F algorithm, and information gain test) was employed to reduce the number of independent variables in the large databases to those with the greatest discriminative power for the subsequent modeling task. Second, for the mathematical modeling, artificial neural networks (ANNs), algorithms categorized as computational artificial intelligence, were employed. These non-linear tools were used to find the relations between the independent and dependent variables of a system in order to perform estimations or classifications. The type of ANN used in this thesis is the one most commonly employed in research, the supervised multilayer perceptron (MLP), chosen for its proven ability to create reliable models for many different applications...
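
To illustrate what a filter-based feature selection step can look like, the following is a minimal sketch of ranking variables by Fisher's discriminant ratio, one of the five filters named in the abstract. It is not the thesis implementation. It assumes a two-class problem and the common definition (difference of class means squared, divided by the sum of class variances). The function names `fisher_ratio` and `rank_features` and the toy data are hypothetical and carry no measured values.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Two-class Fisher discriminant ratio for a single feature.

    Assumed definition: (mean_a - mean_b)^2 / (var_a + var_b).
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = [v for v, y in zip(values, labels) if y == classes[0]]
    b = [v for v, y in zip(values, labels) if y == classes[1]]
    denom = pvariance(a) + pvariance(b)
    if denom == 0:
        # Zero variance in both classes: perfectly separable or identical.
        return float("inf") if mean(a) != mean(b) else 0.0
    return (mean(a) - mean(b)) ** 2 / denom


def rank_features(samples, labels, keep):
    """Score every column and return the indices of the `keep` best ones."""
    n_features = len(samples[0])
    scores = [
        fisher_ratio([row[j] for row in samples], labels)
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order[:keep], scores


# Toy data (invented for illustration): rows are samples, columns are variables.
samples = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.5],
    [3.0, 5.5, 0.4],
    [3.1, 4.5, 0.8],
    [2.9, 5.0, 0.3],
]
labels = [0, 0, 0, 1, 1, 1]  # e.g. 0 = control, 1 = patient (hypothetical)

selected, scores = rank_features(samples, labels, keep=1)
print("selected feature indices:", selected)
```

In this toy set only the first column separates the two classes, so it receives the highest score and is the one retained. In a pipeline like the one the abstract describes, the retained variables would then be passed to a classifier such as an MLP.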