Galaxy clustering: a point process

HURTADO GIL, LLUIS

Supervised by:

Vicent J. Martínez (Supervisor)
Pablo Arnalte Mur (Co-supervisor)

Defending university: Universitat de València

Defense date: 23 June 2016

Committee:

Diego Sáez Milán (Chair)
María Jesús Pons Borderia (Secretary)
Elmo Tempel (Rapporteur)

Type: Thesis

Teseo: 422413

Abstract

Galaxy clustering is the aggregation of galaxies in the universe driven by the force of gravity. Galaxies tend to form larger structures, such as clusters and filaments, that weave the Cosmic Web.
This Large Scale Structure of the Universe can be understood as the resulting distribution of galaxies, a process in which all galaxies are subject to common forces and share universal properties. The analysis of this distribution can be addressed with Point Process techniques, that is, the study of configurations of points within a frame. In this thesis we use this branch of statistics in three different approaches: summary statistics, data mining and modeling. The results show that Point Processes are an excellent tool to unveil the properties of the galaxy distribution as well as to model its patterns. Different data sources have been used as examples of galaxy Point Processes. These include modern galaxy surveys like the Sloan Digital Sky Survey (SDSS) and the ALHAMBRA survey, which allow us to study and discover new properties of the galaxy distribution and their consequences for galaxy behavior.
The Counts-in-Cells distribution of a Point Process is a simple yet powerful technique to describe a distribution. For this statistic we use data from the SDSS. We fit the observed distribution with four different probability density functions and compare their goodness of fit. Another summary statistic is the correlation function, which we use to describe the clustering of galaxies in the ALHAMBRA survey over a wide redshift range. With this statistic we are able to measure the clustering of spectrally segregated galaxies at small scales (< 0.2 Mpc/h) for the first time.
The data mining and modeling algorithms used in this thesis are tested on galaxy and dark matter simulations, such as the LasDamas and MultiDark simulations. Our first model for the galaxy distribution is the point interaction Gibbs model, a probabilistic model that describes the distribution of galaxies in terms of their pairwise distances. For close galaxy pairs the model increases the intensity of the process, creating aggregated patterns.
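As an illustration of the Counts-in-Cells statistic mentioned above, the following is a minimal sketch and not the procedure used in the thesis. It assumes a 2D point pattern in a square box divided into equal square cells; the function names `counts_in_cells` and `count_distribution`, the box size, the grid resolution and the synthetic uniform pattern are all hypothetical choices made for the example.

```python
import numpy as np

def counts_in_cells(points, box_size, n_cells):
    """Count points in each cell of an n_cells x n_cells grid (hypothetical helper)."""
    edges = np.linspace(0.0, box_size, n_cells + 1)
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=[edges, edges])
    return counts.astype(int).ravel()

def count_distribution(counts):
    """Empirical probability P(N) of finding exactly N points in a cell."""
    return np.bincount(counts) / counts.size

# Synthetic, unclustered pattern used only as a stand-in for survey data (assumption).
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 100.0, size=(5000, 2))

counts = counts_in_cells(points, box_size=100.0, n_cells=10)
p_n = count_distribution(counts)

# For an unclustered pattern the variance of the counts is close to their mean;
# a clustered pattern shows a variance well above the mean.
mean_count, var_count = counts.mean(), counts.var()
```

The empirical distribution `p_n` is the kind of object that could then be compared with candidate probability functions; which four functions the thesis fits is not stated in the abstract, so none is assumed here.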
Three different models have been used, depending on the cluster profile. The Geyer model defines a top-hat profile in which galaxies are aggregated at higher intensities than that of the homogeneous Poisson distribution found at larger scales. The Fiksel model has a continuous profile with an exponential slope, giving higher clustering amplitudes at small scales. Finally, the Power Law model also defines a continuous profile, with a pole at distance 0.
Mixture Models are a powerful tool both for modeling and for mining the galaxy distribution. Given a process with a well-defined structure, such as the galaxy distribution, with clusters, filaments and other kinds of galaxy aggregations, Mixture Models can properly describe its content. We need to define the number of structures and their morphology, and use this information to build a model that locates and fits each structure. The resulting model is a probability density function that reliably describes the content of the point process and separately describes each structure present, allowing efficient data mining.
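The mixture-model idea described above can be sketched with a generic expectation-maximization fit. This is only an illustration under stated assumptions: the components are taken to be isotropic Gaussians, which may differ from the structure profiles used in the thesis, and the names `farthest_point_init` and `fit_gaussian_mixture`, the number of components and the synthetic two-cluster data are hypothetical.

```python
import numpy as np

def farthest_point_init(points, k):
    """Deterministic initial centers: start at the first point, then repeatedly
    add the point farthest from the centers chosen so far (hypothetical helper)."""
    centers = [points[0]]
    for _ in range(k - 1):
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d2)])
    return np.array(centers)

def fit_gaussian_mixture(points, k, n_iter=50):
    """EM for a mixture of k isotropic Gaussians (assumed component shape)."""
    n, d = points.shape
    means = farthest_point_init(points, k)
    variances = np.full(k, points.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point, in log space.
        sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_prob = (np.log(weights)
                    - 0.5 * d * np.log(2.0 * np.pi * variances)
                    - 0.5 * sq_dist / variances)
        log_prob -= log_prob.max(axis=1, keepdims=True)
        resp = np.exp(log_prob)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update weights, means and variances from the responsibilities.
        nk = np.maximum(resp.sum(axis=0), 1e-12)
        weights = nk / n
        means = (resp.T @ points) / nk[:, None]
        sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        variances = np.maximum((resp * sq_dist).sum(axis=0) / (d * nk), 1e-12)
    return weights, means, variances, resp

# Two synthetic, well-separated clusters standing in for galaxy structures (assumption).
rng = np.random.default_rng(1)
cluster_a = rng.normal(loc=0.0, scale=0.5, size=(200, 2))
cluster_b = rng.normal(loc=10.0, scale=0.5, size=(200, 2))
points = np.vstack([cluster_a, cluster_b])

weights, means, variances, resp = fit_gaussian_mixture(points, k=2)
labels = resp.argmax(axis=1)  # structure assigned to each point
```

The fitted weights, means and variances define a probability density for the whole pattern, while `labels` assigns each point to one structure, which is the sense in which such a model supports both modeling and data mining. The number of components `k` has to be supplied, mirroring the need to define the number of structures beforehand.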