Classification techniques for time series and functional data
- Andrés M. Alonso Fernández Supervisor
- Juan José Romo Urroz Supervisor
Defense university: Universidad Carlos III de Madrid
Defense date: 12 July 2010
- Francisco Javier Prieto Fernández Chair
- Pedro Galeano Secretary
- Julio Rodríguez Puerta Committee member
- Nuno Crato Committee member
- José Ramón Berrendero Díaz Committee member
Type: Thesis
Abstract
This doctoral thesis develops classification techniques for dependent and functional data, proposing methods for classifying both time series and functional data. Although the work involves several types of data, functional data play the central role. A key feature of both methodologies is that neither original problem is tackled directly: the time series problem is recast as a functional data problem, while the functional data problem is solved with a multivariate technique. Note, however, that functional data play a different role in the two proposals: in the time series problem, functional estimators are constructed, whereas in the functional data problem the curves are the primary data.
To classify time series, their integrated periodograms are considered instead of the series themselves. A new series is then assigned to the group that minimizes the distance between its integrated periodogram and the group mean of integrated periodograms. Although the periodogram is defined only for stationary time series, the methodology can still be applied to nonstationary series by computing the periodograms locally. Finally, functional data depth is applied to make the classification robust.
The classification of functional data arises naturally in this framework. It also raises the question of which form of the data is most appropriate: the raw functions, their integrals or their derivatives. Without loss of generality, this second problem is reformulated in terms of functions and their derivatives of different orders, without integrals. In this thesis, a single methodology is proposed to handle both tasks at once.
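As an illustration of the time series step described above (assigning a series to the group whose mean integrated periodogram is closest), the following is a minimal sketch and not the thesis's implementation. The function names, the normalization of the cumulative periodogram to end at 1, the discretized L1 distance, and the simulated AR(1) groups are all assumptions made for the example; the local periodograms for nonstationary series and the depth-based robust version are not covered.

```python
import numpy as np

def integrated_periodogram(x):
    """Cumulative periodogram at the positive Fourier frequencies, scaled to end at 1.

    The scaling is an assumption of this sketch; a constant series would
    give a zero denominator and is not handled here.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Periodogram ordinates at frequencies 2*pi*k/n, k = 1..floor(n/2); frequency 0 is dropped.
    pgram = np.abs(np.fft.rfft(x))[1:] ** 2 / x.size
    cum = np.cumsum(pgram)
    return cum / cum[-1]

def classify_series(new_series, groups):
    """Return the label whose mean integrated periodogram is closest to that of new_series.

    groups maps a label to a list of series, all of the same length as new_series.
    The distance is a discretized L1 distance (an assumption of this sketch).
    """
    target = integrated_periodogram(new_series)
    best_label, best_dist = None, np.inf
    for label, series_list in groups.items():
        mean_curve = np.mean([integrated_periodogram(s) for s in series_list], axis=0)
        dist = np.mean(np.abs(target - mean_curve))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def simulate_ar1(phi, n, rng):
    """Hypothetical test data: x[t] = phi * x[t-1] + e[t] with standard normal noise."""
    e = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = e[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]
    return x

rng = np.random.default_rng(0)
groups = {
    "positive": [simulate_ar1(0.8, 256, rng) for _ in range(20)],
    "negative": [simulate_ar1(-0.8, 256, rng) for _ in range(20)],
}
predicted = classify_series(simulate_ar1(0.8, 256, rng), groups)
print(predicted)
```

The two simulated groups concentrate their spectral mass at opposite ends of the frequency range, so their integrated periodograms are far apart and the nearest-mean rule has an easy task; real data would typically be less clear-cut.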
Following the same criterion of classifying a curve by the distances from the function or its derivatives to group-representative functions (usually the mean) or their derivatives, our method combines those distances. The proposal works with a multivariate variable defined in terms of these distances. In addition, it yields an automatic way of ranking the original functions and their derivatives by discriminant power.
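The idea of turning distances to group means, for a function and its derivatives, into a multivariate variable can be sketched as follows. This is a hedged illustration under stated assumptions, not the method of the thesis: the abstract does not name the multivariate technique, so a two-group Fisher linear discriminant is used here as a stand-in, and the per-feature Fisher ratio is only one possible way to rank derivative orders. The helper names, the finite-difference derivatives, the Riemann-sum L2 distance, and the simulated curves are likewise hypothetical.

```python
import numpy as np

def derivative_stack(curves, t, max_order=2):
    """Return [curves, first derivative, ...] using finite differences along the grid t."""
    out = [np.asarray(curves, dtype=float)]
    for _ in range(max_order):
        out.append(np.gradient(out[-1], t, axis=1))
    return out

def l2_distance(f, g, t):
    """Riemann-sum approximation of the L2 distance on a uniform grid; f is (n, m), g is (m,)."""
    dt = t[1] - t[0]
    return np.sqrt(np.sum((f - g) ** 2, axis=1) * dt)

def distance_features(curves, t, means, max_order=2):
    """One column per derivative order: distance to group-0 mean minus distance to group-1 mean."""
    stack = derivative_stack(curves, t, max_order)
    cols = [l2_distance(d, m0, t) - l2_distance(d, m1, t) for d, (m0, m1) in zip(stack, means)]
    return np.column_stack(cols)

# Hypothetical data: group 1 carries a small high-frequency bump that is hard to see
# in the raw curves (random level and slope) but clear in the derivatives.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 101)

def sample(n, bump):
    a = rng.normal(1.0, 0.1, (n, 1))
    b = rng.normal(0.0, 0.2, (n, 1))
    c = rng.normal(0.0, 0.5, (n, 1))
    return a * np.sin(2 * np.pi * t) + b + c * t + bump * np.sin(10 * np.pi * t)

curves = np.vstack([sample(30, 0.0), sample(30, 0.2)])
labels = np.repeat([0, 1], 30)

# Group mean of the curves and of each derivative order.
stack = derivative_stack(curves, t)
means = [(d[labels == 0].mean(axis=0), d[labels == 1].mean(axis=0)) for d in stack]
Z = distance_features(curves, t, means)

# Fisher linear discriminant on the distance features (assumed stand-in technique).
m0, m1 = Z[labels == 0].mean(axis=0), Z[labels == 1].mean(axis=0)
S_w = np.cov(Z[labels == 0], rowvar=False) + np.cov(Z[labels == 1], rowvar=False)
w = np.linalg.solve(S_w, m1 - m0)
pred = ((Z - (m0 + m1) / 2) @ w > 0).astype(int)
accuracy = (pred == labels).mean()  # in-sample accuracy only

# Rank derivative orders by a univariate Fisher ratio (assumed measure of discriminant power).
fisher_ratio = (m1 - m0) ** 2 / (Z[labels == 0].var(axis=0, ddof=1) + Z[labels == 1].var(axis=0, ddof=1))
ranking = np.argsort(fisher_ratio)[::-1]
print(accuracy, ranking)
```

The accuracy computed here is in-sample, because each curve also contributes to its own group mean; a fair assessment would recompute the means and the discriminant on training curves only.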