Non-invasive diagnosis of human diseases by combining breath analysis and neural network modeling
- CANCILLA BUENACHE, JUAN CARLOS
- José Santiago Torrecilla Velasco, Director
Defense university: Universidad Complutense de Madrid
Defense date: 9 September 2016
- Mª Luz Mena Fernández, Chair
- Oscar Palomares Gracia, Secretary
- Gema Moreno Bueno, Committee member
- Hans H. Riese Jordá, Committee member
- Joaquín Arribas López, Committee member
Type: Doctoral thesis
Abstract
It is currently known that there is a direct relation between the moment a disease is detected or diagnosed and its consequences for the patient, as early detection is generally linked to a more favorable outcome. This concept is the basis of the present research, whose main goal is the development of mathematical tools based on computational artificial intelligence to detect multiple diseases safely and non-invasively. To build these tools, this research focused on the breath analysis of patients with diverse diseases, using several analytical methodologies to extract the information contained in these samples, and multiple feature selection algorithms and neural networks for data analysis. It has previously been shown that the molecular composition of breath correlates with the clinical status of a human being, proving the existence of volatile biomarkers whose presence or abundance can aid in disease detection. During this research, two main types of analytical approaches were employed to study the gaseous samples: cross-reactive sensor arrays (based on organically functionalized silicon nanowire field-effect transistors (SiNW FETs) or gold nanoparticles (GNPs)) and proton transfer reaction-mass spectrometry (PTR-MS). The cross-reactive sensors analyze the bulk of the breath samples, offering global, fingerprint-like information, whereas PTR-MS quantifies the individual volatile molecules present in the samples. All of the analytical equipment employed generates large amounts of data per sample, necessitating a meticulous mathematical analysis to adequately interpret the results. In this work, two fundamental types of mathematical tools were utilized.
First, a set of five filter-based feature selection algorithms (the χ² (chi-squared) score, Fisher's discriminant ratio, the Kruskal-Wallis test, the Relief-F algorithm, and the information gain test) was employed to reduce the number of independent variables in the large databases to those with the greatest discriminative power for the subsequent modeling task. Second, for mathematical modeling, artificial neural networks (ANNs), algorithms categorized as computational artificial intelligence, were employed. These non-linear tools were used to find the relations between the independent variables of a system and the dependent ones in order to perform estimations or classifications. The type of ANN used in this thesis is the one most commonly employed in research, the supervised multilayer perceptron (MLP), due to its proven ability to create reliable models for many different applications...
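To illustrate the filter-based feature selection step described above, the sketch below implements one of the five named scores, Fisher's discriminant ratio, for a two-class problem and ranks synthetic features by it. The data, the feature count, and the injected class signal are all illustrative assumptions, not values from the thesis; the thesis applied such filters to breath-analysis databases with far more variables.

```python
import numpy as np

def fisher_ratio(X, y):
    """Fisher's discriminant ratio per feature for two classes:
    F_j = (mu0_j - mu1_j)^2 / (var0_j + var1_j).
    A higher score means the feature separates the classes better."""
    c0, c1 = X[y == 0], X[y == 1]
    numerator = (c0.mean(axis=0) - c1.mean(axis=0)) ** 2
    denominator = c0.var(axis=0) + c1.var(axis=0)
    return numerator / denominator

# Synthetic two-class data: 5 features, only feature 2 is informative.
rng = np.random.default_rng(0)
n = 200
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 5))   # pure noise features
X[:, 2] += 3.0 * y            # shift class 1 on feature 2 (the "biomarker")

scores = fisher_ratio(X, y)
ranked = np.argsort(scores)[::-1]  # features from most to least discriminative
selected = ranked[:2]              # keep the top-2 features for modeling
```

The selected feature subset would then feed the MLP training stage; the other filters named in the abstract (χ², Kruskal-Wallis, Relief-F, information gain) plug into the same rank-and-select pattern, differing only in how the per-feature score is computed.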