Synergies between low- and intermediate-redshift galaxy populations revealed with unsupervised machine learning
Authors: Sebastian Turner, Małgorzata Siudek, Samir Salim, Ivan K. Baldry, Agnieszka Pollo, Steven N. Longmore, Katarzyna Małek, Chris A. Collins, Paulo J. Lisboa, Janusz Krywult, Thibaud Moutard, Daniela Vergani, Alexander Fritz
Abstract:
The colour bimodality of galaxies provides an empirical basis for theories of galaxy evolution. However, the balance of processes that begets this bimodality has not yet been constrained. A more detailed view of the galaxy population is needed, which we achieve in this paper by using unsupervised machine learning to combine multi-dimensional data at two different epochs. We aim to understand the cosmic evolution of galaxy subpopulations by uncovering substructures within the colour bimodality. We choose a clustering algorithm that models clusters using only the most discriminative data available, and apply it to two galaxy samples: one from the second edition of the GALEX-SDSS-WISE Legacy Catalogue (GSWLC-2; $z \sim 0.06$), and the other from the VIMOS Public Extragalactic Redshift Survey (VIPERS; $z \sim 0.65$). We cluster within a nine-dimensional feature space defined purely by rest-frame ultraviolet-through-near-infrared colours. Both samples are similarly partitioned into seven clusters, breaking down into four of mostly star-forming galaxies (including the vast majority of green valley galaxies) and three of mostly passive galaxies. The separation between these two families of clusters suggests differences in the evolution of their galaxies, and that these differences are strongly expressed in their colours alone. The samples are closely related, with star-forming/green-valley clusters at both epochs forming morphological sequences, capturing the gradual internally-driven growth of galaxy bulges. At high stellar masses, this growth is linked with quenching. However, it is only in our low-redshift sample that additional, environmental processes appear to be involved in the evolution of low-mass passive galaxies.
Submitted 6 June, 2021; v1 submitted 9 February, 2021; originally announced February 2021.
Reproducible $k$-means clustering in galaxy feature data from the GAMA survey
Authors: Sebastian Turner, Lee S. Kelvin, Ivan K. Baldry, Paulo J. Lisboa, Steven N. Longmore, Chris A. Collins, Benne W. Holwerda, Andrew M. Hopkins, Jochen Liske
Abstract:
A fundamental bimodality of galaxies in the local Universe is apparent in many of the features used to describe them. Multiple sub-populations exist within this framework, each representing galaxies following distinct evolutionary pathways. Accurately identifying and characterising these sub-populations requires that a large number of galaxy features be analysed simultaneously. Future galaxy surveys such as LSST and Euclid will yield data volumes for which traditional approaches to galaxy classification will become unfeasible. To address this, we apply a robust $k$-means unsupervised clustering method to feature data derived from a sample of 7338 local-Universe galaxies selected from the Galaxy And Mass Assembly (GAMA) survey. This allows us to partition our sample into $k$ clusters without the need for training on pre-labelled data, facilitating a full census of our high dimensionality feature space and guarding against stochastic effects. We find that the local galaxy population natively splits into $2$, $3$, $5$ and a maximum of $6$ sub-populations, with each corresponding to a distinct ongoing evolutionary mechanism. Notably, the impact of the local environment appears strongly linked with the evolution of low-mass ($M_{*} < 10^{10}$ M$_{\odot}$) galaxies, with more massive systems appearing to evolve more passively from the blue cloud onto the red sequence. With a typical run time of $\sim3$ minutes per value of $k$ for our galaxy sample, we show how $k$-means unsupervised clustering is an ideal tool for future analysis of large extragalactic datasets, being scalable, adaptable, and providing crucial insight into the fundamental properties of the local galaxy population.
Submitted 1 October, 2018; originally announced October 2018.
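The abstract above describes partitioning a sample into $k$ clusters while guarding against stochastic effects. The sketch below illustrates one common way to do that: run plain $k$-means from several random initialisations and keep the run with the lowest within-cluster sum of squares. It is not the authors' pipeline. The function names (`kmeans_once`, `kmeans_restarts`), the restart count, and the two-dimensional toy data are all assumptions made for illustration, and real galaxy features would need to be standardised first.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_once(points, k, rng, max_iter=100):
    """One k-means run (Lloyd's algorithm) from a random initialisation."""
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Within-cluster sum of squares for the final assignment.
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


def kmeans_restarts(points, k, n_init=20, seed=0):
    """Repeat k-means n_init times and keep the lowest-inertia solution."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    runs = [kmeans_once(points, k, rng) for _ in range(n_init)]
    return min(runs, key=lambda run: run[2])


# Toy data (hypothetical): two well-separated blobs in a 2-D feature space.
demo_rng = random.Random(1)
blob_a = [(demo_rng.gauss(0, 0.3), demo_rng.gauss(0, 0.3)) for _ in range(50)]
blob_b = [(demo_rng.gauss(5, 0.3), demo_rng.gauss(5, 0.3)) for _ in range(50)]
points = blob_a + blob_b

labels, centroids, inertia = kmeans_restarts(points, k=2)
```

Keeping the best of many restarts reduces sensitivity to the random initialisation, and fixing the seed makes the chosen partition repeatable from run to run.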