A First Look at Creating Mock Catalogs with Machine Learning Techniques

Xu, Xiaoying; Ho, Shirley; Trac, Hy; Schneider, Jeff; Poczos, Barnabas; Ntampaka, Michelle

doi:10.1184/R1/6475409.v1

journal contribution

Posted on 2013-03-01, authored by Xiaoying Xu, Shirley Ho, Hy Trac, Jeff Schneider, Barnabas Poczos, Michelle Ntampaka

We investigate machine learning (ML) techniques for predicting the number of galaxies (N_gal) that occupy a halo, given the halo's properties. These types of mappings are crucial for constructing the mock galaxy catalogs necessary for analyses of large-scale structure. The ML techniques proposed here distinguish themselves from traditional halo occupation distribution (HOD) modeling in that they do not assume a prescribed relationship between halo properties and N_gal. In addition, our ML approaches depend only on parent halo properties (like HOD methods), which is an advantage over subhalo-based approaches because identifying subhalos correctly is difficult. We test two algorithms: support vector machines (SVM) and k-nearest-neighbor (kNN) regression. We take galaxies and halos from the Millennium simulation and predict N_gal by training our algorithms on the following six halo properties: number of particles, M_200, σ_v, v_max, half-mass radius, and spin. For Millennium, our predicted N_gal values have a mean-squared error (MSE) of ~0.16 for both SVM and kNN. Our predictions match the overall distribution of halos reasonably well and the galaxy correlation function at large scales to ~5%-10%. In addition, we demonstrate a feature selection algorithm to isolate the halo parameters that are most predictive, a useful technique for understanding the mapping between halo properties and N_gal. Lastly, we investigate these ML-based approaches in making mock catalogs for different galaxy subpopulations (e.g., blue, red, high M_star, low M_star). Given its non-parametric nature as well as its powerful predictive and feature selection capabilities, ML offers an interesting alternative for creating mock catalogs.
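As a rough illustration of the kNN regression and MSE evaluation described in the abstract, the following minimal Python sketch predicts a target as the mean over the k nearest training points and scores it on held-out data. It is not the authors' code or pipeline. The data are synthetic stand-ins (two made-up features named "scaled_mass" and "scaled_spin" and an invented occupation rule), not Millennium halos, and the helper names (knn_predict, mse, make_toy_halos, holdout_mse) are hypothetical. Comparing the error for different feature subsets is only a crude stand-in for feature selection; the abstract does not specify which selection algorithm was used.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Predict a target as the mean of the k nearest training targets (Euclidean distance)."""
    neighbors = sorted((math.dist(row, x), y) for row, y in zip(train_X, train_y))[:k]
    return sum(y for _, y in neighbors) / len(neighbors)


def mse(y_true, y_pred):
    """Mean-squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def make_toy_halos(n, seed):
    """Generate synthetic (features, target) pairs. ASSUMPTION: toy data, not simulation output."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        scaled_mass = rng.uniform(0.0, 1.0)  # hypothetical stand-in for a scaled halo mass
        scaled_spin = rng.uniform(0.0, 1.0)  # hypothetical stand-in; unrelated to the target here
        X.append((scaled_mass, scaled_spin))
        y.append(float(int(5 * scaled_mass)))  # invented occupation rule: integer steps in mass
    return X, y


def holdout_mse(train_X, train_y, test_X, test_y, cols, k=3):
    """Held-out MSE of kNN regression restricted to the feature columns in `cols`."""
    tr = [tuple(row[c] for c in cols) for row in train_X]
    te = [tuple(row[c] for c in cols) for row in test_X]
    preds = [knn_predict(tr, train_y, x, k) for x in te]
    return mse(test_y, preds)


train_X, train_y = make_toy_halos(200, seed=1)
test_X, test_y = make_toy_halos(50, seed=2)

err_both = holdout_mse(train_X, train_y, test_X, test_y, cols=(0, 1))
err_mass = holdout_mse(train_X, train_y, test_X, test_y, cols=(0,))
err_spin = holdout_mse(train_X, train_y, test_X, test_y, cols=(1,))

print("MSE, both features:", err_both)
print("MSE, mass only:    ", err_mass)
print("MSE, spin only:    ", err_spin)
```

In this toy setup the target depends only on the mass-like feature, so the mass-only model should score a much lower MSE than the spin-only model. That is the kind of comparison a feature selection procedure automates across many candidate subsets.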

History

Publisher Statement

© 2013. The American Astronomical Society.

Date

2013-03-01

Keywords

Machine Learning

Licence

In Copyright
