Abstract

With the advent of wide-field cosmological surveys, we are approaching samples of hundreds of thousands of galaxy clusters. While such large numbers will help reduce statistical uncertainties, the control of systematics in cluster masses is crucial. Here we examine the effects of an important source of systematic uncertainty in galaxy-based cluster mass estimation techniques: the presence of significant dynamical substructure. Dynamical substructure manifests as dynamically distinct subgroups in phase-space, indicating an ‘unrelaxed’ state. This issue affects around a quarter of clusters in a generally selected sample. We employ a set of mock clusters whose masses have been measured homogeneously with commonly used galaxy-based mass estimation techniques (kinematic, richness, caustic, radial methods). We use these to study how the relation between observationally estimated and true cluster mass depends on the presence of substructure, as identified by various popular diagnostics. We find that the scatter for an ensemble of clusters does not increase dramatically for clusters with dynamical substructure. However, we find a systematic bias for all methods, such that clusters with significant substructure have higher measured masses than their relaxed counterparts. This bias depends on cluster mass: the most massive clusters are largely unaffected by the presence of significant substructure, but masses are significantly overestimated for lower mass clusters, by ∼ 10 per cent at |$10^{14}$| and ≳ 20 per cent for ≲ |$10^{13.5}$|. The use of cluster samples with different levels of substructure can therefore bias certain cosmological parameters up to a level comparable to the typical uncertainties in current cosmological studies.

1 INTRODUCTION

Galaxy clusters are massive, rare objects which form from high peaks in the underlying density field and whose population characteristics are sensitive to the expansion history of the Universe and the growth rate of structure. Statistical studies of the galaxy cluster population are therefore powerful tools across various fields including cosmology (see Voit 2005; Allen, Evrard & Mantz 2011 for a review, Tinker et al. 2012), galaxy evolution (e.g. Dressler 1980; Balogh et al. 1999; Goto et al. 2003; Postman et al. 2005; Peng et al. 2012), and large-scale structure (e.g. Bahcall 1988; Einasto et al. 2001).

We are entering an exciting time for cluster cosmology with ongoing surveys such as The Dark Energy Survey (The Dark Energy Survey Collaboration 2005), the Kilo-Degree Survey (de Jong et al. 2015), WFIRST (Spergel et al. 2015), the South Pole Telescope Sunyaev Zel'dovich survey (de Haan et al. 2016), the Atacama Cosmology Telescope (Sehgal et al. 2011), the Hyper Suprime Cam survey (Aihara et al. 2017), and upcoming surveys such as Euclid (Amendola et al. 2013), eROSITA (Pillepich, Porciani & Reiprich 2012), and LSST (LSST Science Collaboration 2009).

With the production of these wide-field surveys across a variety of wavelengths, we are moving into an era where samples of |$10^{6}$| galaxy clusters will be available. These large samples reduce statistical uncertainties; however, systematic uncertainties often dominate the statistical uncertainties in cluster mass estimation (as highlighted in Benson et al. 2013; Hasselfield et al. 2013; Planck Collaboration XXIV 2016b), so the need to control these systematic uncertainties is even more crucial for cluster cosmology studies.

One such source of systematic uncertainty in cluster mass estimation techniques in particular is the presence of dynamically young clusters with significant dynamical substructure. Cluster dynamical substructure is characterized as the presence of dynamically distinct subgroups within galaxy clusters. In the cluster galaxy distribution substructure typically manifests itself in the form of asymmetrical velocity distributions and distinct subgroups in phase-space of clusters. The presence of significant substructure is an indication that a cluster is not in virial equilibrium or in a ‘relaxed’ state, either because of a recent cluster–cluster merger, or significant growth of the cluster via infalling groups.

There have been numerous studies since the 1980s probing the frequency of dynamical substructure in cluster samples (e.g. Geller & Beers 1982; Dressler & Shectman 1988; Rhee, van Haarlem & Katgert 1991; Bird 1994; Escalera et al. 1994; West, Jones & Forman 1995; Solanes, Salvador-Solé & González-Casado 1999; Burgett et al. 2004; Owers, Couch & Nulsen 2009; Aguerri & Sánchez-Janssen 2010; Cohn 2012; Einasto et al. 2012; Hou et al. 2012; Ziparo et al. 2012; Owers et al. 2017). Many of these works also explored whether measured global properties of clusters differ for clusters in their samples with significant substructure compared to more relaxed clusters. While some works have found that the measured global properties of clusters do differ in samples of clusters that have significant dynamical substructure (e.g. Geller & Beers 1982; Escalera et al. 1994; West et al. 1995; Girardi et al. 1997; Biviano et al. 2006; Lopes et al. 2006; Hou et al. 2012), other works do not find any obvious difference in cluster measures for complex clusters (e.g. Biviano et al. 1993; Fadda et al. 1997; Aguerri & Sánchez-Janssen 2010). The discordance in the conclusions is likely due to small galaxy cluster samples and differences in the methods employed to characterize dynamical substructure.

While these works focus on comparing measured global cluster properties for highly substructured and non-substructured clusters, in this study, we focus on deducing whether cluster mass estimation techniques themselves are affected by the presence of significant dynamical substructure, as opposed to differences in global parameters of these two cluster populations.

One approach to examining whether cluster mass estimation techniques themselves are affected by the presence of significant dynamical substructure is to compare galaxy-based reconstructed mass estimates with mass estimates computed using other mass proxies, e.g. X-ray, lensing, or SZ-based estimates. An example of this multiwavelength comparison is Lopes et al. (2006), where optical richness and X-ray luminosity relations for a sample of several hundred clusters are examined. The authors find that the exclusion of clusters with substructure does not improve the correlation between X-ray luminosity and richness, but does improve the relation between X-ray temperature and optical parameters. More recently, Sifón et al. (2013) hint that disturbed systems may bias the relation between SZE signal and velocity-dispersion-based cluster mass; however, they state the need for larger samples of clusters to confirm this.

The second approach to deducing whether cluster mass estimation techniques themselves are affected by the presence of significant dynamical substructure is to use mock data where the underlying halo mass is known, and global cluster properties including mass and relaxation state are measured in an observational manner. For example, Pinkney et al. (1996) use N-body simulations of galaxy cluster mergers and find that virial masses are overestimated by up to a factor of 2 for clusters undergoing mergers, a conclusion similar to that of Perea, del Olmo & Moles (1990).

The main assumption required in this approach is that the simulated galaxy clusters deemed highly substructured by observational substructure tests are indeed similar to clusters in the real Universe that would be deemed highly substructured by the same tests. This assumption is reasonable in the case where the properties of galaxies in the simulated clusters are taken directly from the underlying N-body dark matter simulation, where phase-space properties have primarily evolved over time due to the influence of gravity. To first order, these simulated phase-space properties are indeed comparable to galaxy phase-space properties in the Universe.

To understand the consequence of including dynamically disturbed galaxy clusters in cluster cosmology samples, we look to examine the following questions: does the presence of significant dynamical substructure impact commonly used galaxy-based mass estimation techniques? Would scaling relations between multiwavelength mass estimation techniques differ for highly substructured and non-substructured clusters? And finally, should dynamically young clusters be excluded from future cluster cosmology samples?

In this work, we explore these critical questions, presenting the first extensive, homogeneous study of the impact of dynamical substructure on galaxy-based cluster mass estimation techniques. We utilize part of the Galaxy Cluster Mass Reconstruction Project (GCMRP) data set, where 25 different galaxy-based mass estimation techniques were tested using two mock galaxy catalogues to deduce how well these methods characterized global cluster properties such as mass (Old et al. 2014, 2015), and how this mass depends on the accuracy of the selected members (Wojtak et al. in preparation).

The article is organized as follows: we describe the mock galaxy catalogue in Section 2, and the mass reconstruction methods applied to this catalogue in Section 3. In Section 4, we provide details of our analysis before presenting the results on the effects of significant dynamical substructure on cluster mass estimation in Section 5. We end with a discussion of our results and conclusions in Section 6. Throughout the article we adopt a Lambda Cold Dark Matter (ΛCDM) cosmology with Ω0 = 0.27, ΩΛ = 0.73, σ8 = 0.82 and a Hubble constant of |$H_{\rm 0} = 100\,h\,\rm {km\,s^{-1}}\,\rm {Mpc^{-1}}$| where h = 0.7, although none of the conclusions depend strongly on these parameters.

2 DATA

For this study, we only use data from the GCMRP where the dynamical properties of the galaxies are taken directly from the underlying N-body dark matter subhaloes themselves, where the galaxies have retained the ‘dynamical memory’ of the merging history of the clusters. This strategy ensures a more direct comparison with that of the real Universe, where we assume the phase-space properties of galaxies have primarily evolved over time due to the influence of gravity. We take an observational approach in this study, measuring the dynamical state of our mock clusters using observational dynamical substructure tests. We describe the underlying dark matter simulation, light cone generation and model used to populate the dark matter simulation outputs with galaxies in the following subsections.

2.1 Dark matter simulation

The underlying dark matter simulation we use is the Bolshoi dissipationless cosmological simulation, which follows the evolution of |$2048^{3}$| dark matter particles of mass |$1.35 \times 10^{8}\,h^{-1} \,{\rm M_{\rm {\odot }}}$| from z = 80 to z = 0 within a box of side length |$250\,h^{-1}$| Mpc with a force resolution of |$1\,h^{-1}$| kpc (Klypin, Trujillo-Gomez & Primack 2011). The simulation was run with the ART adaptive mesh refinement code following a flat ΛCDM cosmology with the following parameters: Ω0 = 0.27, ΩΛ = 0.73, σ8 = 0.82, n = 0.95, and h = 0.70. The halo catalogues are complete for haloes with circular velocity Vcirc > |$50\,{\rm km\,s^{-1}}$| (corresponding to |$M_{\rm 360\rho } \approx 1.5 \times 10^{10}\,h^{-1} \,{\rm M_{\rm {\odot }}}$|, ∼110 particles).

rockstar, a 6D FOF group-finder based on adaptive hierarchical refinement, is used to identify dark matter haloes, substructure, and tidal features (Behroozi, Wechsler & Wu 2013). rockstar identifies haloes and subhaloes using 6D information (3D in position and 3D in velocity); these are joined into hierarchical merger trees that describe in detail how structures grow as the universe evolves. As rockstar uses spatial and velocity information to identify dark matter structures, it does not suffer from (3D) projection effects that would potentially bias this study in instances where two group centres were spatially aligned in the same snapshot. rockstar calculates the underlying halo masses as spherical overdensity masses, using a density threshold 200 times the critical density. We highlight that these overdensities are calculated using all the particles for all the substructure contained in a halo. This halo finder has been shown to recover halo properties with high accuracy (for example, with errors in mass of ΔM200c/M200c ≪ 0.1) and produces results consistent with those of other halo finders (Knebe et al. 2011).

2.2 Light cone construction

For this study, we use light cones produced by the Theoretical Astrophysical Observatory (Bernyk et al. 2016), an online eResearch tool that provides access to semi-analytic galaxy formation models and N-body simulations. The light cone tool remaps the spatial and temporal positions of each galaxy in the simulation box on to a cone which subtends 60° by 60° on the sky, covering a redshift range of 0 < z < 0.15. We specify a minimum r-band luminosity for the galaxies of Mr = −19 + 5 log h for the catalogue.

2.3 Semi-analytic model

The model we use to form galaxies on the underlying dark matter data is the Semi-Analytic Galaxy Evolution (SAGE) galaxy formation model (Croton et al. 2016). As described in more detail in Old et al. (2015), this galaxy formation model is applied to the merger trees described in Section 2.1. In each tree and at each redshift, virialized dark matter haloes are assumed to attract pristine gas from the surrounding environment, from which galaxies form and evolve. The SAGE model is calibrated using various observations at z = 0, namely the stellar mass function and SDSS-band luminosity functions, baryonic Tully–Fisher relation, metallicity–stellar mass relation, and the black hole–bulge relation.

The model includes various galaxy formation physics from reionization of the inter-galactic medium at early times, the infall of this gas into haloes, radiative cooling of hot gas and the formation of cooling flows, star formation in the cold disc of galaxies and the resulting supernova feedback, black hole growth, and active galactic nuclei (AGN) feedback through the ‘quasar’ and ‘radio’ epochs of AGN evolution, metal enrichment of the inter-galactic and intra-cluster medium from star formation, and galaxy morphology shaped through secular processes, mergers, and merger-induced starbursts. Detailed comparisons of the model to observations at higher redshift can be found in Lu et al. (2014) and Croton et al. (2016), though we note that our light cone spans only lower redshifts, as described in Section 2.2.

Importantly, each group identified by the halo finder rockstar has a ‘central’ galaxy whose position and velocity are determined by averaging the positions and velocities of the subset of halo particles. Each group also has a number of ‘satellite’ galaxies (cluster members) that maintain the positions and velocities of the subhaloes that merged with the parent halo.

3 MASS RECONSTRUCTION METHODS

To determine the consequence of including dynamically disturbed galaxy clusters in cluster cosmology samples, we use a subset of the GCMRP data set, where 23 commonly-used galaxy-based mass estimation techniques (kinematic, richness, caustic, radial methods), were tested in a blind manner on clusters from two mock galaxy catalogues. For this study, we use only results of galaxy-based techniques which were tested on mock clusters from the semi-analytic model (SAM)-based data set described in Section 2.3, where the dynamical properties of the galaxies are taken directly from the underlying N-body dark matter subhaloes themselves (unlike the HOD2 model used in Old et al. 2015).

The three general steps that galaxy-based techniques typically follow are first to locate the cluster overdensity, then to choose which galaxies are members of the cluster, and finally to use the properties of this membership to reconstruct cluster mass. In this study, we focus on the second and third steps: deducing membership and mass, as opposed to cluster finding. We therefore provide the mass reconstruction methods with the galaxy cluster positions as input. We summarize the type of data the methods require as input in Table 1 and the basic properties of all methods in Tables A1 and A2; however, we refer the reader to Old et al. (2014, 2015) for more detail on the procedure of each cluster mass reconstruction technique. We note that the colour associated with each method in the figures and tables corresponds to the main galaxy population property used to perform mass estimation: richness (magenta), projected phase-space (black), radii (blue), velocity dispersion (red), or abundance matching (green).

Table 1.

Summary of the 23 cluster mass estimation methods. Listed is an acronym identifying the method, an indication of the main property used to undertake member galaxy selection, and an indication of the method used to convert this membership list to a mass estimate. The type of observational data required as input for each method is listed in the fourth column. Note that acronyms denoted with an asterisk indicate that the method did not use our initial object target list but rather matched these locations at the end of their analysis. Please see Tables A1 and A2 in the appendix for more details on each method.

| Method | Initial galaxy selection | Mass estimation | Type of data required | Reference |
| --- | --- | --- | --- | --- |
| PCN | Phase-space | Richness | Spectroscopy | Pearson et al. (2015) |
| PFN* | FOF | Richness | Spectroscopy | Pearson et al. (2015) |
| NUM | Phase-space | Richness | Spectroscopy | Mamon et al. (in preparation) |
| ESC | Phase-space | Phase-space | Spectroscopy | Gifford & Miller (2013) |
| MPO | Phase-space | Phase-space | Multiband photometry, spectroscopy | Mamon, Biviano & Boué (2013) |
| MP1 | Phase-space | Phase-space | Spectroscopy | Mamon et al. (2013) |
| RW | Phase-space | Phase-space | Spectroscopy | Wojtak et al. (2009) |
| TAR* | FOF | Phase-space | Spectroscopy | Tempel et al. (2014) |
| PCO | Phase-space | Radius | Spectroscopy | Pearson et al. (2015) |
| PFO* | FOF | Radius | Spectroscopy | Pearson et al. (2015) |
| PCR | Phase-space | Radius | Spectroscopy | Pearson et al. (2015) |
| PFR* | FOF | Radius | Spectroscopy | Pearson et al. (2015) |
| MVM* | FOF | Abundance matching | Spectroscopy | Muñoz-Cuartas & Müller (2012) |
| AS1 | Red sequence | Velocity dispersion | Spectroscopy | Saro et al. (2013) |
| AS2 | Red sequence | Velocity dispersion | Spectroscopy | Saro et al. (2013) |
| AvL | Phase-space | Velocity dispersion | Spectroscopy | von der Linden et al. (2007) |
| CLE | Phase-space | Velocity dispersion | Spectroscopy | Mamon et al. (2013) |
| CLN | Phase-space | Velocity dispersion | Spectroscopy | Mamon et al. (2013) |
| SG1 | Phase-space | Velocity dispersion | Spectroscopy | Sifón et al. (2013) |
| SG2 | Phase-space | Velocity dispersion | Spectroscopy | Sifón et al. (2013) |
| SG3 | Phase-space | Velocity dispersion | Spectroscopy | Lopes et al. (2009) |
| PCS | Phase-space | Velocity dispersion | Spectroscopy | Pearson et al. (2015) |
| PFS* | FOF | Velocity dispersion | Spectroscopy | Pearson et al. (2015) |

4 DYNAMICAL SUBSTRUCTURE ANALYSIS

The tools for detecting dynamical substructure, whether using only the cluster member velocity distribution (1D), the member positions (2D), or the combined velocity and positional information of the cluster (3D), have been extensively assessed for their robustness and reliability for both group-sized and cluster-sized systems (Pinkney et al. 1996; Hou et al. 2009). These comprehensive works indicate that while applying a variety of 1D, 2D and 3D dynamical substructure tests is useful, the more reliable tests are 3D tests that quantify the difference between the properties of local subgroups of galaxies within clusters and the global cluster properties, such as the Dressler–Shectman (DS, 1988) test and the Kappa test (Colless & Dunn 1996). In this study, we apply these tests to our semi-analytic mock simulation data (where we again note that the mock galaxy properties are taken from the underlying N-body simulation dark matter subhaloes). A cluster is deemed significantly dynamically substructured if either the DS test or the Kappa test detects substructure. We outline the procedure of these tests below.

While these tests are found to be the more reliable techniques applied in the literature (see extensive evaluations in Pinkney et al. 1996; Hou et al. 2009), there can be cases where clusters do indeed contain significant substructure undetected by these tests. For example, White, Cohn & Smit (2010) use N-body simulations to test the correlation between a given dynamical substructure detection technique and time since last major merger of a cluster. They find that this correlation is dependent on viewing angle, especially in cases where the substructure is not well separated along the line of sight. Furthermore, Hou et al. (2012) find that the DS test in particular can be reliably applied to groups only with Ngal > 20 and where a high confidence level of 95 per cent or higher is used. Indeed, Hou et al. (2012) deduced that for groups with 10 ≤ Ngal < 20, the DS test does not necessarily detect all substructures within a system, but the test can be used to determine a reliable lower limit on the amount of substructure.

4.1 The Dressler–Shectman test

The DS test aims to quantify the difference between local and global kinematics by selecting subgroups of cluster members and calculating the local velocity dispersion σlocal and local mean velocity |$\overline{\nu }_{\rm local}$|. These local properties are compared with the global cluster velocity dispersion σglobal and global mean velocity |$\overline{\nu }_{\rm global}$| by computing a deviation δi for the i-th cluster member:
\begin{equation} \delta _{\rm i}^{2}=\left(\frac{N_{\rm nn}+1}{\sigma _{\rm global}^{2}} \right)\, \left[(\overline{\nu }_{\rm local}-\overline{\nu }_{\rm global})^{2}+(\sigma _{\rm local}-\sigma _{\rm global})^{2}\right]\!. \end{equation}
(1)
We adopt a correction to the original DS test by replacing Nnn = 11 with |$N_{\rm nn} = \sqrt{ N_{\rm members}}$|, as suggested for groups and clusters with fewer members in order to enhance the sensitivity of the test to small-scale structures (Silverman 1986; Zabludoff & Mulchaey 1998). The deviations are then summed to give Δ, the DS statistic:
\begin{equation} \Delta = \displaystyle \sum \limits _{i} \delta _{i}. \end{equation}
(2)

Often referred to as the critical value for the cluster, the Δ-statistic is used to compute a probability to exceed (PTE) for the presence of substructure from 10 000 Monte Carlo realizations in which the member velocities are shuffled amongst the positions. The PTE is the fraction of shuffled realizations whose Δ is at least as large as the observed value. It is used to test the null hypothesis that the cluster has no substructure, hence a small PTE ≤ 0.05 indicates that the cluster has significant substructure.
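The procedure of this subsection can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in this work: the function names (`local_groups`, `ds_delta`, `ds_test`) are hypothetical, each local subgroup is taken to be a galaxy plus its Nnn nearest neighbours in projection, and the deviation uses the usual Dressler & Shectman (1988) normalization by the squared global velocity dispersion.

```python
import math
import random
import statistics


def local_groups(x, y, n_nn):
    """For each galaxy, indices of itself plus its n_nn nearest projected neighbours."""
    n = len(x)
    groups = []
    for i in range(n):
        by_distance = sorted(
            range(n), key=lambda j: (x[j] - x[i]) ** 2 + (y[j] - y[i]) ** 2
        )
        groups.append(by_distance[: n_nn + 1])
    return groups


def ds_delta(groups, v, n_nn):
    """Sum of the per-galaxy deviations delta_i (the DS statistic Delta)."""
    v_glob = statistics.fmean(v)
    s_glob = statistics.stdev(v)
    delta = 0.0
    for members in groups:
        local = [v[j] for j in members]
        dv = statistics.fmean(local) - v_glob
        ds = statistics.stdev(local) - s_glob
        delta += math.sqrt((n_nn + 1) / s_glob**2 * (dv**2 + ds**2))
    return delta


def ds_test(x, y, v, n_shuffle=10000, seed=0):
    """Return (Delta_observed, PTE) from velocity-shuffling Monte Carlo."""
    n_nn = max(1, int(round(math.sqrt(len(v)))))  # N_nn = sqrt(N_members)
    groups = local_groups(x, y, n_nn)  # positions are fixed, so compute once
    observed = ds_delta(groups, v, n_nn)
    rng = random.Random(seed)
    shuffled = list(v)
    n_exceed = 0
    for _ in range(n_shuffle):
        rng.shuffle(shuffled)  # break any position-velocity correlation
        if ds_delta(groups, shuffled, n_nn) >= observed:
            n_exceed += 1
    return observed, n_exceed / n_shuffle
```

Because shuffling leaves the positions untouched, the neighbour lists are computed once and reused for every realization; only the velocities assigned to each position change.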

4.2 The kappa test

In addition to the DS test, we employ another 3D dynamical substructure test, the κ-test (Colless & Dunn 1996), which quantifies the difference between local substructures and global cluster phase-space properties using the Kolmogorov–Smirnov (KS) test. Similar to the DS test, for each galaxy within the cluster, the |$N_{\rm nn} = \sqrt{ N_{\rm members}}$| nearest galaxies are selected and the velocity distribution of that local subgroup is compared to the parent distribution by measuring the maximum separation of the cumulative distribution functions, DObs. The negative log likelihood of producing a D-statistic greater than DObs is computed and summed over all n galaxies in the cluster:
\begin{equation} \kappa _{n} = \sum ^{n}_{i=1}-\log \left[{\rm P}_{\rm KS}({\rm D}_{\rm sim}>{\rm D}_{\rm Obs})\right]. \end{equation}
(3)
As for the DS test, the significance of the κn statistic is computed by performing 10 000 Monte Carlo realizations, shuffling the member velocities amongst the positions to produce a Probability to Exceed (PTE). The PTE, 0 ≤ PTE ≤ 1, is used to test the null hypothesis that the cluster has no substructure, hence a small PTE ≤ 0.05 indicates that the cluster has significant substructure. For clusters with Ngal ≥ 30, the DS test is noted to be one of the most sensitive tests for substructure detection (Pinkney et al. 1996), and it is reliable for clusters with Ngal ≥ 20, provided that a PTE threshold of 0.05 or 0.01 is adopted (Knebe & Müller 2000; Hou et al. 2012). The test can also be used to place a lower limit on the amount of substructure in group-sized systems with Ngal ≥ 10.
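A comparable sketch of the κ-test is given below. It is illustrative only and rests on several assumptions that are not taken from this work: the function names are hypothetical, the local subgroup includes the galaxy itself, and the KS probability is approximated with the standard asymptotic Kolmogorov series and an effective sample size, rather than whatever exact form the original implementation used. The logarithm is natural; a different base rescales κ but leaves the shuffle-based PTE unchanged.

```python
import math
import random


def local_groups(x, y, n_nn):
    """For each galaxy, indices of itself plus its n_nn nearest projected neighbours."""
    n = len(x)
    return [
        sorted(range(n), key=lambda j: (x[j] - x[i]) ** 2 + (y[j] - y[i]) ** 2)[: n_nn + 1]
        for i in range(n)
    ]


def ks_distance(a, b):
    """Maximum separation between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        value = min(a[i], b[j])
        while i < len(a) and a[i] <= value:
            i += 1
        while j < len(b) and b[j] <= value:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n1, n2):
    """Asymptotic P(D_sim > d) for a two-sample KS test (an approximation)."""
    n_eff = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:  # the series converges slowly here; the probability is ~1
        return 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2) for k in range(1, 101))
    return min(1.0, max(2.0 * series, 1e-300))  # floor avoids log(0)


def kappa_statistic(groups, v):
    """Sum over galaxies of -log P_KS(D_sim > D_obs)."""
    return sum(
        -math.log(ks_pvalue(ks_distance([v[j] for j in g], v), len(g), len(v)))
        for g in groups
    )


def kappa_test(x, y, v, n_shuffle=10000, seed=0):
    """Return (kappa_observed, PTE) from velocity-shuffling Monte Carlo."""
    n_nn = max(1, int(round(math.sqrt(len(v)))))
    groups = local_groups(x, y, n_nn)
    observed = kappa_statistic(groups, v)
    rng = random.Random(seed)
    shuffled = list(v)
    n_exceed = 0
    for _ in range(n_shuffle):
        rng.shuffle(shuffled)
        if kappa_statistic(groups, shuffled) >= observed:
            n_exceed += 1
    return observed, n_exceed / n_shuffle
```

Since the PTE is calibrated against shuffled realizations of the same cluster, modest inaccuracies in the approximate KS probability affect the observed and shuffled statistics alike.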

4.3 Mock cluster sample and analysis

In this study, we apply the commonly used DS and Kappa dynamical substructure tests, as described in the above sections, to semi-analytic clusters whose galaxy properties are taken from the underlying N-body simulation dark matter subhaloes. A cluster is deemed highly dynamically substructured if either of these tests detects substructure. As mentioned in Section 4, the dynamical substructure tests may not detect significant substructure in certain cases. This means that our sample of clusters that are deemed to be non-substructured may have some level of contamination from substructured clusters. We first select all clusters with Ngal ≥ 20 from the GCMRP cluster sample, leaving us with 943 clusters between 13.50 ≤ log (M200c, true/M⊙) ≤ 15.14 and with a median mass of log (M200c, true/M⊙) = 14.05.

The 943 clusters are separated into two samples according to whether either the DS test or Kappa test detected substructure or not. Of the 943 clusters, dynamical substructure was detected in 255 clusters. PTE values of both the Kappa and DS test for individual clusters can be found in Fig. B1 and the mass–richness relation of the substructured and non-substructured sample is shown as a red and black solid line respectively in Fig. D1 in the appendix.

The frequency of significant dynamical substructure in our cluster sample is ∼ 27 per cent. We note that the frequency of significant dynamical substructure varies significantly for observational cluster samples in the literature, with fractions of substructure detected in samples being as low as ∼ 15 per cent (e.g. Girardi et al. 1996), and as high as ∼ 80 per cent (e.g. Wing & Blanton 2013). This variation in the fraction of highly substructured clusters is attributed to factors such as differences in the algorithms used to detect substructure and the characteristics of the cluster samples themselves (for example, survey depth, number of galaxies for which there are spectroscopic redshifts available; Kolokotronis et al. 2001; Burgett et al. 2004; Ramella et al. 2007). In Fig. C1 in the appendix, we show the prevalence of highly substructured clusters as a function of log true mass, which we find increases for higher mass clusters. This trend is also identified in several observational studies which employ different dynamical substructure tests (e.g. de Carvalho et al. 2017; Roberts & Parker 2017).

When assessing differences in cluster mass reconstruction between two samples, it is important to control for cluster mass, especially as the performance of cluster mass estimation techniques is often mass dependent. We ensure that the median masses of the two samples are similar by binning the clusters in each sample into seven linearly spaced log true mass bins. In each mass bin, we then randomly select the same number of distinct clusters from both samples, equal to the smaller of the two bin counts. We do this iteratively (N = 200 iterations), resulting in N subsamples of substructured clusters and N subsamples of non-substructured clusters. These subsamples are controlled to have median mass values close to the median mass of the substructured cluster sample [log(M200c, true/M⊙) = 14.13]. As the sample of highly substructured clusters is smaller, each of the N subsamples of substructured clusters typically consists of the same clusters, whereas each of the N subsamples of non-substructured clusters often consists of different clusters within each mass bin.
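The mass-matching step can be illustrated with a short Python sketch. The function name `mass_matched_subsamples` and its arguments are hypothetical, and the bin edges are assumed here to span the combined range of the two samples, which the text does not specify.

```python
import random


def mass_matched_subsamples(logm_sub, logm_rel, n_bins=7, n_iter=200, seed=0):
    """Draw n_iter pairs of index lists with equal counts per log-mass bin.

    logm_sub, logm_rel: log true masses of the substructured and
    non-substructured clusters. Returns a list of (idx_sub, idx_rel) pairs.
    """
    rng = random.Random(seed)
    lo = min(min(logm_sub), min(logm_rel))
    hi = max(max(logm_sub), max(logm_rel))
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range

    def binned(values):
        bins = [[] for _ in range(n_bins)]
        for idx, m in enumerate(values):
            # The maximum value falls in the last bin rather than past it.
            bins[min(int((m - lo) / width), n_bins - 1)].append(idx)
        return bins

    bins_sub, bins_rel = binned(logm_sub), binned(logm_rel)
    pairs = []
    for _ in range(n_iter):
        pick_sub, pick_rel = [], []
        for b_sub, b_rel in zip(bins_sub, bins_rel):
            k = min(len(b_sub), len(b_rel))  # smaller of the two bin counts
            pick_sub += rng.sample(b_sub, k)  # sampling without replacement
            pick_rel += rng.sample(b_rel, k)
        pairs.append((pick_sub, pick_rel))
    return pairs
```

In a bin where the substructured sample is the smaller one, `rng.sample` returns all of its clusters on every iteration, which is why the substructured subsamples are nearly identical across iterations while the non-substructured ones vary.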

For each set of N subsamples of dynamically substructured and non-substructured clusters, we quantify differences between the two samples in terms of scaling relations between the true and recovered cluster masses. The first statistic we assess is the scatter in the recovered mass, |$\sigma _{M_{\rm Rec}}$|⁠, which delivers a measure of the scatter about the fit between true and recovered mass. The second parameter is the slope in the relation between recovered and true underlying mass, s, and the third parameter is the amplitude of the fit at the pivot mass, a. These statistics are computed by performing a likelihood-fitting analysis on these 400 subsamples, assuming a model where there is a linear relationship between the recovered and true log mass and residual offsets in the recovered mass are drawn from a normal distribution: log MRec = (a + log MPivot) + s(log MTrue − log MPivot) + e, where a, s, and e are the amplitude (or normalization), slope, and scatter, which includes measurement and model errors in addition to intrinsic scatter (induced by the different physical conditions of each cluster).

This analysis is similar to that in Old et al. (2015) and we refer the reader there for more detail. To summarize this approach, we compute a likelihood that is a sum of the probability of obtaining the data point assuming it is drawn from a ‘good’ distribution and the probability of obtaining the data point assuming it is drawn from a ‘bad’ outlier distribution, to try to ensure that the scatter value is not affected by a small number of extreme outliers (see Hogg, Bovy & Lang 2010, for more details). The components of this likelihood are weighted by the probability that any given point belongs to either of these distributions:
\begin{eqnarray} \mathcal {L} &\!=\!\!& \prod _{i=1,N} p_i \nonumber \\ p_i &\!\!=\!\!& \left[(1-P_{\rm b}) P(\log M_{{\rm Rec},i} | \log M_{{\rm True},i} ,\sigma _{\log M_{{\rm Rec}, i}},s,a) \right. \nonumber \\ && \qquad + \left. P_{\rm b} P(\log M_{{\rm Rec}, i}|\log M_{{\rm True},i},\sigma _{\rm outlier},s,a) \right]\! . \end{eqnarray}
(4)
Pb represents the posterior fraction of objects belonging to the ‘bad’ outlier distribution, |$\sigma _{M_{\rm Rec, i}}$| is the variance of the ‘good’ distribution and s and a are the slope and amplitude of the fit, respectively. We fix the variance of the ‘bad’ outlier distribution to a very large number, with a prior that the variance of the ‘good’ distribution must always be smaller than the variance of the ‘bad’ distribution. We adopt flat priors for the variance of the ‘good’ distribution, the slope, and the amplitude. The probability that a data point belongs to the ‘bad’ outlier distribution, Pb, must lie between zero and one. We note that we have performed the analysis with alternative priors (Jeffreys priors), and our results do not change significantly. We utilize Markov Chain Monte Carlo (MCMC) techniques to efficiently sample our parameter space and produce posterior probability distributions for the parameters described above. We use the parallel-tempered MCMC sampler emcee, which employs several ensembles of walkers at different temperatures to explore our parameter space (Foreman-Mackey et al. 2013).
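For concreteness, the mixture likelihood of equation (4), together with the flat priors described above, could be written as in the sketch below. It is a minimal illustration under stated assumptions rather than the implementation used here: the function names are hypothetical, the width parameters are treated as Gaussian standard deviations in dex, and the value adopted for the fixed outlier width (10 dex) is an arbitrary stand-in for the ‘very large number’.

```python
import math


def normal_pdf(x, mean, sigma):
    """Gaussian probability density with standard deviation sigma."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(params, log_m_true, log_m_rec,
                   log_m_pivot=14.13, sigma_outlier=10.0):
    """Log of the 'good' plus 'bad' mixture likelihood for a linear relation.

    params = (a, s, sigma, p_bad): amplitude at the pivot mass, slope,
    width of the 'good' distribution and outlier fraction P_b.
    sigma_outlier is a hypothetical value chosen for this example.
    """
    a, s, sigma, p_bad = params
    # Flat priors: 'good' width below the outlier width, P_b in [0, 1].
    if not (0.0 < sigma < sigma_outlier and 0.0 <= p_bad <= 1.0):
        return -math.inf
    total = 0.0
    for x, y in zip(log_m_true, log_m_rec):
        model = (a + log_m_pivot) + s * (x - log_m_pivot)
        p_i = ((1.0 - p_bad) * normal_pdf(y, model, sigma)
               + p_bad * normal_pdf(y, model, sigma_outlier))
        if p_i <= 0.0:  # numerical underflow
            return -math.inf
        total += math.log(p_i)
    return total
```

Both components are centred on the same linear relation, following equation (4); only their widths differ, so a few extreme points are absorbed by the broad component instead of inflating the fitted scatter.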

Employing walkers at different ‘temperatures’, at which the likelihood is modified, enables the walkers to explore different local maxima easily and prevents them from becoming stuck at local rather than global maxima in the case of a multimodal likelihood. In this analysis, we employ 50 walkers at 5 temperatures and perform 2200 iterations, including a ‘burn-in’ of 1000 iterations that are discarded. In total, 50 × 5 × 2200 = 550 000 points in parameter space are sampled for each method and input catalogue. Figures of the marginalized probability distributions of parameters for all methods are available upon request.
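The bookkeeping behind these numbers, and the way a temperature modifies the likelihood, can be sketched as follows. In parallel tempering, a chain at inverse temperature β samples the likelihood raised to the power β multiplied by the prior. The temperature ladder is not specified in the text, so the geometric ladder below is purely an assumption for illustration, as are all variable and function names.

```python
# Sampler configuration quoted in the text.
n_walkers, n_temps, n_steps, n_burn = 50, 5, 2200, 1000

# Points visited across all walkers and temperatures, before and after
# discarding the burn-in iterations.
total_samples = n_walkers * n_temps * n_steps
kept_samples = n_walkers * n_temps * (n_steps - n_burn)


def tempered_log_posterior(log_like, log_prior, beta):
    """Log posterior sampled by a chain at inverse temperature beta.

    beta = 1 recovers the untempered posterior; smaller beta flattens the
    likelihood so that walkers can move between separated maxima.
    """
    return beta * log_like + log_prior


# Hypothetical geometric ladder of inverse temperatures (an assumption).
betas = [2.0 ** (-0.5 * k) for k in range(n_temps)]

print(total_samples, kept_samples)
```

Only the β = 1 ensemble samples the posterior itself; the hotter ensembles serve to improve exploration.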

We perform the analysis described above for each of the N subsamples of highly substructured clusters and each of the N subsamples of non-substructured clusters, and then compute the median of the output parameters over all subsamples.

5 RESULTS

The goals of this study are to assess the extent to which galaxy-based cluster mass estimation techniques are sensitive to the presence of significant dynamical substructure and, ultimately, whether cluster cosmology studies utilizing galaxy-based mass estimation should look to exclude dynamically substructured clusters from their samples. We apply observational dynamical substructure tests to our sample of 943 mock clusters to separate our sample into highly substructured and non-substructured clusters. We then assess whether commonly used galaxy-based cluster mass estimation techniques perform differently on these two samples. In the following subsections, we discuss the impact of significant dynamical substructure on cluster mass estimation using three key statistics with which we assess how the cluster mass estimation techniques perform. These statistics are the scatter in the relation between recovered and true mass, the amplitude of that relation, and finally its mass dependence, i.e. the slope of the relation between recovered and true mass.

5.1 Impact of dynamical substructure on scatter

Fig. 1 depicts the median scatter in recovered mass produced by each cluster mass estimation technique for the highly substructured cluster sample versus the median scatter in recovered mass produced by each cluster mass estimation technique for the non-substructured cluster sample. The solid black line represents a 1:1 relation between these two parameters. The colour scheme reflects the approach implemented by each method to deliver a cluster mass from a chosen galaxy membership: magenta (richness), black (phase-space), blue (radial), green (abundance-matching), and red (velocity dispersion). We find that methods that produce lower scatter in recovered mass (situated in the lower left-hand corner of Fig. 1) show little difference in scatter between the highly substructured and non-substructured cluster samples. The x-axis error bars show the uncertainty in the scatter parameter for non-substructured clusters, which is calculated by taking the standard deviation of the median scatter parameter values from the set of 200 non-substructured cluster samples. The y-axis error bars show the uncertainty in the scatter parameter for substructured clusters. This uncertainty is calculated by adding in quadrature the standard deviation of the median scatter parameter values from the set of 200 substructured cluster samples and the uncertainty of the MCMC sampling of the scatter parameter (the former is very small, as the subsamples typically include the same clusters).

The median scatter in recovered mass produced by each cluster mass estimation technique for the sample of clusters with significant dynamical substructure versus the median scatter in recovered mass for the sample of clusters without significant dynamical substructure. The solid black line represents a 1:1 relation.
Figure 1.

While certain methods with higher scatter in recovered mass (for example, SG1 and PFS) may produce higher scatter for highly substructured clusters (of the order of ∼ 15 per cent), other methods that utilize similar galaxy-based properties (for example, AS1, AS2, and PCR) may produce lower scatter for highly substructured clusters (by up to ∼ 10 per cent). We do not see any consistent increase or decrease in scatter for substructured clusters as a function of mass estimation technique type (i.e. richness, phase-space, radial, abundance matching, velocity dispersion).

5.2 Impact of dynamical substructure on the amplitude

In addition to scatter, it is important to examine how the presence of significant dynamical substructure affects the amplitude in the relation between recovered and true underlying cluster mass. In this study, we measure the amplitude at the pivot mass which reflects the normalization of the relation between recovered and true log mass produced by each cluster mass estimation technique. Fig. 2 shows the median amplitude at the pivot mass of log M200c,true = 14.13 for the highly substructured cluster sample versus the median amplitude at the pivot mass produced by each cluster mass estimation technique for the non-substructured cluster sample. The x-axis error bars show the uncertainty in the amplitude parameter for non-substructured clusters, which is calculated by taking the standard deviation of the median amplitude parameter values from the set of 200 non-substructured cluster samples. The y-axis error bars show the uncertainty in the amplitude parameter for substructured clusters. This uncertainty is calculated by adding in quadrature the uncertainty from the standard deviation of the median amplitude parameter values from the set of 200 substructured cluster samples to the uncertainty of the MCMC sampling of the amplitude parameter (this former uncertainty is very small as the subsamples typically include the same clusters).

The median amplitude at the pivot mass for the sample of clusters with significant dynamical substructure versus the median amplitude at the pivot mass for the sample of clusters without significant dynamical substructure for each cluster mass estimation technique. The solid black line represents a 1:1 relation. If there were no difference in the amplitudes produced by each method at the pivot mass for the highly substructured and non-substructured samples, the methods’ median amplitude markers would lie on the 1:1 relation.
Figure 2.

If there were no difference in the biases produced by each method at the pivot mass for the highly substructured and non-substructured samples, the methods’ median amplitude markers would lie on the 1:1 relation. Instead, we see a systematic increase in the amplitude for all techniques for the highly substructured sample compared to the non-substructured sample.

For some methods that underestimate cluster mass in general, for example, velocity dispersion methods PFS, CLN, PCS, CLE, and phase-space method MP1, this systematic shift brings the amplitude value slightly closer to zero, and more comparable to the true underlying cluster mass.

For the methods that significantly overestimate cluster mass at the pivot mass, for example, radial-based methods PCR, PFR, PFO, and richness methods PCN and PFN, the amplitude values increase and are brought further away from the average true underlying cluster mass. The median difference for all methods in the amplitude at the pivot mass for the highly substructured cluster sample versus non-substructured cluster samples, Δa = aSubs. − aNo subs., is Δa = 0.040 dex ( ∼ 9.7 per cent). We note that this value reflects the average difference in amplitude for all techniques for samples that comprise only highly substructured clusters versus only non-substructured clusters.
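Offsets quoted in dex are differences in log10 mass, so the corresponding fractional change in mass is 10^Δa − 1. A minimal helper for this conversion is sketched below; the function name is hypothetical.

```python
def dex_to_percent(delta_dex):
    """Convert an offset in log10(mass) to a percentage change in mass."""
    return 100.0 * (10.0 ** delta_dex - 1.0)


# Amplitude offsets quoted in this section, in dex.
for delta_a in (0.040, 0.029):
    print(f"{delta_a:.3f} dex corresponds to {dex_to_percent(delta_a):.1f} per cent")
```

The conversion is asymmetric: a negative offset of the same size in dex corresponds to a slightly smaller percentage decrease.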

In the likely case that ‘relaxed’, non-substructured clusters are used to calibrate scaling relations with mass, and these scaling relations are then applied to a larger sample of clusters that includes both substructured and non-substructured clusters, this bias will likely be smaller. We repeat the MCMC likelihood analysis to compare the amplitude for non-substructured clusters with that for all 943 clusters (substructured and non-substructured) and find a median difference in amplitude of Δa = 0.029 dex ( ∼ 6.9 per cent) at the pivot mass of log M200c, true = 14.13. Note that the median masses of these two samples are kept within ∼0.009 dex of each other by subsampling, as for the analysis described in Section 4.3.

We note that the difference in amplitude increases to Δa = 0.067 dex ( ∼ 16.8 per cent) when we re-run the analysis with a more conservative DS and Kappa test PTE threshold of PTE ≤ 0.01. This increase in bias likely arises from the increased ‘purity’ of the substructured sample, due to the more pronounced substructure. In addition, we find that the magnitude of the measured bias increases to 0.06 dex ( ∼ 14.6 per cent) when we re-run the analysis with the mock cluster sample split such that a cluster is classed as highly substructured only if |$\it both$| the DS and Kappa tests classify it as such (with PTE ≤ 0.05), as opposed to if |$\it either$| the DS test or the Kappa test does so.

5.3 Impact of dynamical substructure on slope

We now examine the mass dependence in cluster mass reconstruction, to deduce whether methods under- or overestimate cluster mass differently for lower and higher mass clusters if they have significant dynamical substructure. Fig. 3 shows the difference in the slope of the relation between recovered and true log mass produced by each cluster mass estimation technique for the sample of non-substructured clusters to the sample of highly substructured clusters versus the slope for the non-substructured clusters. The solid black line represents no difference in slope produced by these methods for these two different samples. The dotted purple line represents the median difference in the slopes for the two samples for all methods (0.054 dex, ∼ 13 per cent). The x-axis error bars show the uncertainty in the slope parameter for non-substructured clusters, which is calculated by taking the standard deviation of the median slope parameter values from the set of 200 non-substructured cluster samples. The y-axis error bars show the uncertainty in the difference in slopes, which is calculated by adding in quadrature the uncertainty in the slope for non-substructured clusters and the uncertainty in the slope for substructured clusters.

The difference in the slope of the relation between recovered and true log mass produced by each cluster mass estimation technique for the sample of non-substructured clusters to the sample of highly substructured clusters versus the slope for the sample of non-substructured clusters. The solid black line represents no difference in slope produced by these methods for these two different samples. The dotted red line represents the median difference in the slopes for the two samples for all methods (0.054 dex, ∼ 13 per cent).
Figure 3.

The uncertainty in the slope parameter for substructured clusters is calculated by adding in quadrature the uncertainty from the standard deviation of the median slope parameter values from the set of 200 substructured cluster samples to the uncertainty of the MCMC sampling of the slope parameter (this former uncertainty is very small as the subsamples typically include the same clusters).
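The error-bar construction described for Figs 1 to 3 amounts to two sums in quadrature, which could be sketched as below. The function names are hypothetical, and the use of the sample (rather than population) standard deviation is an assumption, since the text does not specify which is used.

```python
import math
import statistics


def parameter_uncertainty(subsample_medians, mcmc_sigma=0.0):
    """Combine subsample-to-subsample spread with MCMC sampling uncertainty.

    subsample_medians: median parameter value from each of the N subsamples.
    mcmc_sigma: uncertainty from the MCMC sampling of that parameter
                (left at zero when only the subsample spread is wanted).
    """
    if len(subsample_medians) > 1:
        spread = statistics.stdev(subsample_medians)
    else:
        spread = 0.0
    return math.hypot(spread, mcmc_sigma)


def difference_uncertainty(sigma_nosub, sigma_sub):
    """Uncertainty on a difference of two independent parameter estimates."""
    return math.hypot(sigma_nosub, sigma_sub)
```

When the subsamples contain nearly the same clusters, as for the substructured sample, the spread term is close to zero and the MCMC term dominates.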

We see that the majority of methods produce a slightly flatter slope of the relation between recovered and true log mass for highly substructured clusters. This behaviour indicates that the masses of higher mass clusters are underestimated and the masses of lower mass clusters are overestimated compared to that of clusters around the pivot mass. Since we also find that cluster masses are systematically biased high at the pivot mass (Section 5.2), these two effects are likely to result in high mass clusters having relatively unbiased masses, while the masses of low mass clusters will likely be biased very high. This is indicated by the linear fit to the substructured clusters in Fig. E1 in the appendix, which shows the median difference in recovered and true cluster mass for all 23 mass estimation techniques. This flattening of the slope also demonstrates that the magnitude of the bias in recovered mass ( ∼ 10 per cent at the pivot mass) does depend on the underlying cluster mass. For example, if a method systematically overestimated cluster mass by ∼ 10 per cent for clusters with a true mass of ∼log M200c, true = 14.13, that method would likely overestimate the masses of clusters with log M200c, true < 14.13 to a greater extent.
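To make the interplay between the amplitude offset and the flatter slope explicit, the sketch below evaluates the relative bias of substructured versus non-substructured clusters as a linear function of log true mass. It is illustrative only: it combines the median amplitude and slope differences quoted in Sections 5.2 and 5.3 as if they applied to a single method, the sign of the slope difference is assumed from the statement that the substructured relation is flatter, and all names are hypothetical. Values derived this way need not coincide with the fit shown in Fig. E1.

```python
LOG_M_PIVOT = 14.13  # pivot mass, log M200c,true
DELTA_A = 0.040      # median amplitude offset at the pivot (dex)
DELTA_S = -0.054     # assumed sign: substructured minus non-substructured slope


def relative_bias_dex(log_m_true):
    """Offset in recovered log mass, substructured minus non-substructured."""
    return DELTA_A + DELTA_S * (log_m_true - LOG_M_PIVOT)


def dex_to_percent(delta_dex):
    """Convert an offset in log10(mass) to a percentage change in mass."""
    return 100.0 * (10.0 ** delta_dex - 1.0)


# Log mass at which the two linear relations cross (zero relative bias).
zero_bias_log_mass = LOG_M_PIVOT - DELTA_A / DELTA_S

for log_m in (13.5, 14.13, 14.5):
    bias = relative_bias_dex(log_m)
    print(f"log M = {log_m:5.2f}: {bias:+.3f} dex "
          f"({dex_to_percent(bias):+.1f} per cent)")
```

Under these assumptions the relative bias grows toward lower masses and shrinks toward zero at the high-mass end, which is the qualitative behaviour described above.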

Whilst we see a general trend to flatter slopes between the recovered and true cluster mass, methods that utilize the same galaxy population property for mass reconstruction, for example the velocity dispersion (red markers), are not all affected in the same manner. This further highlights the diversity in performance of methods which use the same galaxy property as a mass proxy.

6 DISCUSSION

The main objectives of this study are to deduce whether the inclusion of clusters with significant dynamical substructure will produce biases in cluster mass estimation and explore how these biases will impact both galaxy-based cluster cosmology studies and galaxy evolution studies that characterize galaxy environment by cluster mass. Reassuringly, for the majority of galaxy-based techniques with lower intrinsic scatter, we see little difference in the scatter in the recovered versus underlying mass for non-substructured and substructured clusters. However, as shown in Figs 2 and 3, the presence of significant dynamical substructure does indeed bias the amplitude and the slope in the relation between true underlying mass and estimated mass for all 23 cluster mass estimation techniques in this study.

The direction of this bias, i.e. the increase in estimated cluster mass compared to the true underlying mass for highly dynamically substructured clusters, is qualitatively in agreement with Perea et al. (1990), Pinkney et al. (1996), and Biviano et al. (2006), who find that, in the case of virial-based cluster mass specifically, masses are overestimated for N-body simulations of merging clusters. For a more direct comparison, we apply our analysis to the simulated data set of 62 cluster-sized haloes in three projections from Biviano et al. (2006). For clusters that are highly substructured in projected phase-space compared to unsubstructured clusters, we measure a bias between the recovered virial-based mass and the true mass of 0.12 dex ( ∼ 32 per cent) at a pivot mass of log M200c, true = 14.13, which is consistent with the bias we see for several methods. In addition, we perform both a two-sample KS test and a two-sample Anderson–Darling test on this data set, both of which reject the null hypothesis that the recovered virial-based masses of substructured and non-substructured clusters are drawn from the same underlying continuous distribution (with PTEs of 0.0029 and 0.0038, respectively). The overestimation of virial-based masses for substructured clusters is also indicated in Foëx, Böhringer & Chon (2017), who find that the ratio of hydrostatic mass to virial-based mass is correlated for substructured clusters in their sample of 10 X-ray luminous clusters. Interestingly, the authors find that excising galaxies which are part of substructures reduces the overestimation in mass.
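For readers unfamiliar with the two-sample KS test mentioned above, its statistic is the largest vertical distance between the empirical cumulative distribution functions of the two samples. A minimal sketch of that statistic is given below, with a hypothetical function name; it does not compute the PTE, for which standard implementations such as `scipy.stats.ks_2samp` (and `scipy.stats.anderson_ksamp` for the Anderson–Darling test) are available. The text does not state which software was used for the tests reported here.

```python
from bisect import bisect_right


def ks_two_sample_statistic(sample_a, sample_b):
    """Maximum distance between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to evaluate the difference at every point of the pooled sample.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

A statistic near zero means the two mass distributions are indistinguishable by this measure, while a value near one means they barely overlap.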

The analyses described above indicate a bias in virial-based cluster mass estimation. We highlight that the bias we find is prevalent in all 23 galaxy-based techniques, which encompass richness, projected phase-space, radial, and abundance matching-based techniques. For richness-based techniques, this bias could be partially explained by differences in the stacked mass–richness relation for the substructured and non-substructured samples. A linear fit to the stacked samples, for example, delivers an increase in log mass of 0.07 dex at fixed Ngal of 40. However, we also see substructure causing a consistent bias across the galaxy-based techniques that do not reconstruct mass from galaxy number counts.

The exact impact of this substructure-induced mass bias will be highly dependent on the underlying properties of individual cluster samples; however, we wish to qualitatively deduce the relevance of this bias. The most direct channel of propagating the bias into the estimates of cosmological parameters occurs when a cluster sample used for calibrating a mass scaling relation includes galaxy clusters with a different degree of substructure than the entire cluster sample used for cosmological inference. Considering the most extreme case, the calibration sample may consist of fully relaxed, non-substructured clusters. The primary effect of this observational strategy would be a shift of the observed mass function along the mass axis which in turn would cause a biased measurement of Ωm and σ8. A simple way to estimate the potential relative bias in the two cosmological parameters is to determine the two cosmological parameters for which the corresponding mass function matches the mass function computed for a fixed, fiducial cosmology, but shifted along the mass axis by a range of mass biases. In our calculation we adopt a Planck cosmology (Planck Collaboration XIII 2016a) with Ωm = 0.31 and σ8 = 0.83 as a reference model and a universal fitting formula for the mass function from Tinker et al. (2008). Fig. 4 shows the results for a range of mass biases. Interestingly, the error on Ωm and σ8 is on the same order as the error on the current leading constraints from CMB-based cosmology studies such as Planck Collaboration XIII (2016a) and is slightly lower than the error produced by weak lensing cluster cosmology studies such as Mantz et al. (2015) and SZ-based cluster cosmology studies (de Haan et al. 2016). We note that this is for the extreme case that the calibration sample is non-substructured, and the majority of the full sample of clusters are highly substructured. 
In the more realistic case that the contamination by highly substructured clusters in a given survey is comparable to the fraction we observe in our simulated sample, ∼ 27 per cent, the systematic error on Ωm and σ8 is likely of the order of ∼ 1 per cent.

The percentage difference in Ωm and σ8 found when fitting a ΛCDM mass function with Planck parameters when shifting the mass function in log M200c by a range of values between −0.1 and 0.1 dex.
Figure 4.

We highlight that this study is restricted to galaxy clusters at relatively low redshifts (z < 0.15). Current surveys such as DES, ACT, and SPT, and future surveys such as Euclid and LSST, are probing much higher redshifts (up to z ∼ 2). We expect that significant substructure will be more common at higher redshift. If this is indeed the case, cluster masses will be more positively biased (i.e. overestimated) at higher redshifts than at lower redshifts.

7 CONCLUSIONS

In this paper, we examine whether the masses of dynamically disturbed clusters can be measured to the same accuracy and precision as dynamically relaxed clusters with a variety of commonly used galaxy-based cluster mass estimation techniques. We aim to understand whether scaling relations between multiwavelength mass estimation techniques would differ for highly substructured and non-substructured clusters, and to that end, whether dynamically young clusters should be excluded from future galaxy-based cluster cosmology samples. The main results are as follows:

  • For the majority of galaxy-based techniques with lower intrinsic scatter, we see little difference in the scatter in the recovered versus underlying mass for non-substructured and substructured clusters.

  • We see a systematic increase in the measured amplitude at the median mass of the sample for all techniques for the highly substructured sample compared to the non-substructured sample. This means that for the same given underlying true cluster mass, all cluster mass measurement techniques will, on average, overestimate the mass of a cluster if it has significant dynamical substructure compared to a dynamically relaxed cluster. This systematic bias for all cluster mass estimation techniques is, on average, ∼ 10 per cent for clusters around log M200c = 14.13. It should be noted that for some methods which underestimate cluster mass in general, this systematic increase in amplitude brings measured cluster masses closer to the true underlying cluster mass, and vice versa.

  • We find that the bias in cluster mass for dynamically disturbed clusters is indeed mass dependent. Typically, the slope of the relation between recovered and true cluster mass is flatter for the sample of highly substructured clusters. A flatter slope indicates that the masses of higher mass clusters are underestimated and the masses of lower mass clusters are overestimated in comparison to the reconstructed masses of clusters at the median mass of the sample (∼log M200c = 14.13). The combination of a flatter slope and a positive bias in amplitude at the pivot mass indicates that the reconstructed masses of clusters at the high mass end are likely to be only minimally biased, whereas the reconstructed masses of clusters at the low mass end are biased even higher (for group-sized systems, this bias is ≳ 20 per cent for log M200c ≲ 13.5).

  • To improve the accuracy of cosmological parameters deduced from future galaxy-based cluster cosmology samples, or the characterization of environment for galaxy evolution studies, we recommend that the dynamical state of a cluster sample be classified to identify whether the masses of the dynamically substructured clusters will be systematically overestimated. When using cluster mass scaling relations to estimate the masses of another cluster sample, we advise that the underlying dynamical characteristics of the cluster sample used to calibrate the scaling relation be similar to those of the cluster sample to which the scaling relation is applied.

Acknowledgements

The authors would like to thank numerous people for useful discussions, including Matt Owers, Reneé Hložek, Irene Pintos-Castro, and Joanne Cohn. We would like to acknowledge funding from the Science and Technology Facilities Council (STFC). DC would like to thank the Australian Research Council for receipt of a QEII Research Fellowship. The authors would like to express special thanks to the Instituto de Fisica Teorica (IFT-UAM/CSIC in Madrid) for its hospitality and support, via the Centro de Excelencia Severo Ochoa Program under Grant No. SEV-2012-0249, during the three-week workshop ‘nIFTy Cosmology’ where this work developed. We further acknowledge the financial support of the 2014 University of Western Australia Research Collaboration Award for ‘Fast Approximate Synthetic Universes for the SKA’, the ARC Centre of Excellence for All Sky Astrophysics (CAASTRO) grant number CE110001020, and the two ARC Discovery Projects DP130100117 and DP140100198. We also recognize support from the Universidad Autonoma de Madrid (UAM) for the workshop infrastructure. RAS acknowledges support from the NSF grant AST-1055081. CS acknowledges support from the European Research Council under FP7 grant number 279396. AS is supported by the ERC-StG ‘ClustersXCosmo’, grant agreement 71676. ET is supported by the ETAg grant IUT40-2 and by the European Regional Development Fund (TK133).

REFERENCES

Aguerri
J. A. L.
,
Sánchez-Janssen
R.
,
2010
,
A&A
,
521
,
A28

Aihara
H.
et al. ,
2017
,
preprint (arXiv:1704.05858)

Allen
S. W.
,
Evrard
A. E.
,
Mantz
A. B.
,
2011
,
ARA&A
,
49
,
409

Amendola
L.
et al. ,
2013
,
Liv. Rev. Relativ.
,
16
,
6

Bahcall
N. A.
,
1988
,
ARA&A
,
26
,
631

Balogh
M. L.
,
Morris
S. L.
,
Yee
H. K. C.
,
Carlberg
R. G.
,
Ellingson
E.
,
1999
,
ApJ
,
527
,
54

Behroozi
P. S.
,
Wechsler
R. H.
,
Wu
H.-Y.
,
2013
,
ApJ
,
762
,
109

Benson
B. A.
et al. ,
2013
,
ApJ
,
763
,
147

Bernyk
M.
et al. ,
2016
,
ApJS
,
223
,
9

Bird
C. M.
,
1994
,
AJ
,
107
,
1637

Biviano
A.
,
Girardi
M.
,
Giuricin
G.
,
Mardirossian
F.
,
Mezzetti
M.
,
1993
,
ApJ
,
411
,
L13

Biviano
A.
,
Murante
G.
,
Borgani
S.
,
Diaferio
A.
,
Dolag
K.
,
Girardi
M.
,
2006
,
A&A
,
456
,
23

Burgett
W. S.
et al. ,
2004
,
MNRAS
,
352
,
605

Cohn
J. D.
,
2012
,
MNRAS
,
419
,
1017

Colless
M.
,
Dunn
A. M.
,
1996
,
ApJ
,
458
,
435

Croton
D. J.
et al. ,
2016
,
ApJS
,
222
,
22

de Carvalho
R. R.
,
Ribeiro
A. L. B.
,
Stalder
D. H.
,
Rosa
R. R.
,
Costa
A. P.
,
Moura
T. C.
,
2017
,
AJ
,
154
,
96

de Haan
T.
et al. ,
2016
,
ApJ
,
832
,
95

de Jong
J. T. A.
et al. ,
2015
,
A&A
,
582
,
A62

Dressler
A.
,
1980
,
ApJ
,
236
,
351

Dressler
A.
,
Shectman
S. A.
,
1988
,
AJ
,
95
,
985

Einasto
M.
,
Einasto
J.
,
Tago
E.
,
Müller
V.
,
Andernach
H.
,
2001
,
AJ
,
122
,
2222

Einasto
M.
et al. ,
2012
,
A&A
,
540
,
A123

Escalera
E.
,
Biviano
A.
,
Girardi
M.
,
Giuricin
G.
,
Mardirossian
F.
,
Mazure
A.
,
Mezzetti
M.
,
1994
,
ApJ
,
423
,
539

Fadda
D.
,
Girardi
M.
,
Giuricin
G.
,
Mardirossian
F.
,
Mezzetti
M.
,
Escalera
E.
,
1997
, in
Persic
M.
,
Salucci
P.
, eds,
ASP Conf. Ser. Vol. 117, Dark and Visible Matter in Galaxies and Cosmological Implications
.
Astron. Soc. Pac.
,
San Francisco
, p.
505

Foëx
G.
,
Böhringer
H.
,
Chon
G.
,
2017
,
A&A
,
606
,
A122

Foreman-Mackey
D.
,
Hogg
D. W.
,
Lang
D.
,
Goodman
J.
,
2013
,
PASP
,
125
,
306

Geller
M. J.
,
Beers
T. C.
,
1982
,
PASP
,
94
,
421

Gifford
D.
,
Miller
C. J.
,
2013
,
ApJ
,
768
,
L32

Girardi
M.
,
Escalera
E.
,
Fadda
D.
,
Giuricin
G.
,
Mardirossian
F.
,
Mezzetti
M.
,
1996
,

Girardi
M.
,
Escalera
E.
,
Fadda
D.
,
Giuricin
G.
,
Mardirossian
F.
,
Mezzetti
M.
,
1997
,
ApJ
,
482
,
41

Goto
T.
,
Yamauchi
C.
,
Fujita
Y.
,
Okamura
S.
,
Sekiguchi
M.
,
Smail
I.
,
Bernardi
M.
,
Gomez
P. L.
,
2003
,
MNRAS
,
346
,
601

Hasselfield
M.
et al. ,
2013
,
JCAP
,
7
,
008

Hogg
D. W.
,
Bovy
J.
,
Lang
D.
,
2010
,
preprint (arXiv:1008.4686)

Hou
A.
,
Parker
L. C.
,
Harris
W. E.
,
Wilman
D. J.
,
2009
,
ApJ
,
702
,
1199

Hou
A.
et al. ,
2012
,
MNRAS
,
421
,
3594

Klypin
A. A.
,
Trujillo-Gomez
S.
,
Primack
J.
,
2011
,
ApJ
,
740
,
102

Knebe
A.
,
Müller
V.
,
2000
,
A&A
,
354
,
761

Knebe
A.
et al. ,
2011
,
MNRAS
,
415
,
2293

Kolokotronis
V.
,
Basilakos
S.
,
Plionis
M.
,
Georgantopoulos
I.
,
2001
,
MNRAS
,
320
,
49

Lopes
P. A. A.
,
de Carvalho
R. R.
,
Capelato
H. V.
,
Gal
R. R.
,
Djorgovski
S. G.
,
Brunner
R. J.
,
Odewahn
S. C.
,
Mahabal
A. A.
,
2006
,
ApJ
,
648
,
209

Lopes
P. A. A.
,
de Carvalho
R. R.
,
Kohl-Moreira
J. L.
,
Jones
C.
,
2009
,
MNRAS
,
392
,
135

LSST Science Collaboration
,
2009
,
preprint (arXiv:0912.0201)

Lu
Y.
et al. ,
2014
,
ApJ
,
795
,
123

Mamon G. A., Biviano A., Boué G., 2013, MNRAS, 429, 3079

Mantz A. B. et al., 2015, MNRAS, 446, 2205

Muñoz-Cuartas J. C., Müller V., 2012, MNRAS, 423, 1583

Old L. et al., 2014, MNRAS, 441, 1513

Old L. et al., 2015, MNRAS, 449, 1897

Owers M. S., Couch W. J., Nulsen P. E. J., 2009, ApJ, 693, 901

Owers M. S. et al., 2017, MNRAS, 468, 1824

Pearson R. J., Ponman T. J., Norberg P., Robotham A. S. G., Farr W. M., 2015, MNRAS, 449, 3082

Peng Y.-j., Lilly S. J., Renzini A., Carollo M., 2012, ApJ, 757, 4

Perea J., del Olmo A., Moles M., 1990, A&A, 237, 319

Pillepich A., Porciani C., Reiprich T. H., 2012, MNRAS, 422, 44

Pinkney J., Roettiger K., Burns J. O., Bird C. M., 1996, ApJS, 104, 1

Planck Collaboration XIII, 2016a, A&A, 594, A13

Planck Collaboration XXIV, 2016b, A&A, 594, A24

Postman M. et al., 2005, ApJ, 623, 721

Ramella M. et al., 2007, A&A, 470, 39

Rhee G. F. R. N., van Haarlem M. P., Katgert P., 1991, A&A, 246, 301

Roberts I. D., Parker L. C., 2017, MNRAS, 467, 3268

Saro A., Mohr J. J., Bazin G., Dolag K., 2013, ApJ, 772, 47

Sehgal N. et al., 2011, ApJ, 732, 44

Sifón C. et al., 2013, ApJ, 772, 25

Silverman B. W., 1986, Density Estimation for Statistics and Data Analysis. Chapman & Hall, London/New York

Solanes J. M., Salvador-Solé E., González-Casado G., 1999, A&A, 343, 733

Spergel D. et al., 2015, preprint (arXiv:1503.03757)

Tempel E. et al., 2014, A&A, 566, A1

The Dark Energy Survey Collaboration, 2005

Tinker J., Kravtsov A. V., Klypin A., Abazajian K., Warren M., Yepes G., Gottlöber S., Holz D. E., 2008, ApJ, 688, 709

Tinker J. L. et al., 2012, ApJ, 745, 16

Voit G. M., 2005, Rev. Mod. Phys., 77, 207

von der Linden A., Best P. N., Kauffmann G., White S. D. M., 2007, MNRAS, 379, 867

West M. J., Jones C., Forman W., 1995, ApJ, 451, L5

White M., Cohn J. D., Smit R., 2010, MNRAS, 408, 1818

Wing J. D., Blanton E. L., 2013, ApJ, 767, 102

Wojtak R., Łokas E. L., Mamon G. A., Gottlöber S., 2009, MNRAS, 399, 812

Zabludoff A. I., Mulchaey J. S., 1998, ApJ, 498, L5

Ziparo F., Braglia F. G., Pierini D., Finoguenov A., Böhringer H., Bongiorno A., 2012, MNRAS, 420, 2480

APPENDIX A: PROPERTIES OF THE MASS RECONSTRUCTION METHODS

Table A1.

Illustration of the member galaxy selection process for all methods. The colour of the acronym for each method corresponds to the main galaxy population property used to perform mass estimation: richness (magenta), projected phase-space (black), radii (blue), velocity dispersion (red), or abundance matching (green). The second column details how each method selects an initial member galaxy sample, while the third column outlines how that sample is refined. Finally, the fourth column describes how each method treats interloping galaxies that are not associated with the clusters.

Member galaxy selection methodology

| Method | Initial galaxy selection | Refine membership | Treatment of interlopers |
| --- | --- | --- | --- |
| PCN | Within 5 Mpc, 1000 km s$^{-1}$ | Clipping of ±3σ, using galaxies within 1 Mpc | Use galaxies at 3–5 Mpc to find interloper population to remove |
| PFN | FOF | No | No |
| NUM | Within 3 Mpc, 4000 km s$^{-1}$ | (1) Estimate $R_{\rm 200c}$ from the relationship between $R_{\rm 200c}$ and richness deduced from CLE; (2) select galaxies within $R_{\rm 200c}$ and with $\lvert v\rvert<2.7\,\sigma_{\rm los}^{\rm NFW}(R)$ | Same as CLE |
| ESC | Within preliminary $R_{\rm 200c}$ estimate and ±3500 km s$^{-1}$ | Gapper technique | Removed by Gapper technique |
| MPO | Input from CLN | (1) Calculate $R_{\rm 200c}$, $R_{\rho}$, $R_{\rm red}$, $R_{\rm blue}$ by MAMPOSSt method; (2) select members within radius according to colour | No |
| MP1 | Input from CLN | Same as MPO except colour blind | No |
| RW | Within 3 Mpc, 4000 km s$^{-1}$ | Within $R_{\rm 200c}$, $\lvert 2\Phi(R)\rvert^{1/2}$, where $R_{\rm 200c}$ is obtained iteratively | No |
| TAR | FOF | No | No |
| PCO | Input from PCN | Input from PCN | Include interloper contamination in density fitting |
| PFO | Input from PFN | Input from PFN | No |
| PCR | Input from PCN | Input from PCN | Same as PCN |
| PFR | Input from PFN | Input from PFN | No |
| MVM | FOF (ellipsoidal search range, centre of most luminous galaxy) | Increasing mass limits, then FOF, loops until closure condition | No |
| AS1 | Within 1 Mpc, 4000 km s$^{-1}$, constrained by colour–magnitude relation | Clipping of ±3σ | Removed by clipping of ±3σ |
| AS2 | Within 1 Mpc, 4000 km s$^{-1}$, constrained by colour–magnitude relation | Clipping of ±3σ | Removed by clipping of ±3σ |
| AvL | Within 2.5 $\sigma_v$ and 0.8 $R_{200}$ | Obtain $R_{\rm 200c}$ and $\sigma_v$ by σ-clipping | Implicit with σ-clipping |
| CLE | Within 3 Mpc, 4000 km s$^{-1}$ | (1) Estimate $R_{\rm 200c}$ from the aperture velocity dispersion; (2) select galaxies within $R_{\rm 200c}$ and with $\lvert v\rvert<2.7\,\sigma_{\rm los}^{\rm NFW}(R)$; (3) iterate steps 1 and 2 until convergence | Obvious interlopers are removed by velocity gap technique, then further treated in iteration by σ-clipping |
| CLN | Input from NUM | Same as CLE | Same as CLE |
| SG1 | Within 4000 km s$^{-1}$ | (1) Measure $\sigma_{\rm gal}$, estimate $M_{\rm 200c}$ and $R_{\rm 200c}$; (2) select galaxies within $R_{\rm 200c}$; (3) iterate steps 1 and 2 until convergence | Shifting gapper with minimum bin size of 250 kpc and 15 galaxies; velocity limit 1000 km s$^{-1}$ from main body |
| SG2 | Within 4000 km s$^{-1}$ | (1) Measure $\sigma_{\rm gal}$, estimate $M_{\rm 200c}$ and $R_{\rm 200c}$; (2) select galaxies within $R_{\rm 200c}$; (3) iterate steps 1 and 2 until convergence | Shifting gapper with minimum bin size of 150 kpc and 10 galaxies; velocity limit 500 km s$^{-1}$ from main body |
| SG3 | Within 2.5 $h^{-1}$ Mpc and 4000 km s$^{-1}$. Velocity distribution symmetrized | Measure $\sigma_{\rm gal}$, correct for velocity errors, then estimate $M_{\rm 200c}$ and $R_{\rm 200c}$ and apply the surface pressure term correction | Shifting gapper with minimum bin size of 420 $h^{-1}$ kpc and 15 galaxies |
| PCS | Input from PCN | Input from PCN | Same as PCN |
| PFS | Input from PFN | Input from PFN | No |
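
Several methods in Table A1 refine membership by clipping at ±3σ. The Python sketch below shows one common form of iterative σ-clipping on line-of-sight velocities. It is an illustration only, not the implementation used by any of the listed methods. The function name `sigma_clip_members`, its parameters, and the toy velocities are all hypothetical.

```python
import numpy as np


def sigma_clip_members(v_los, n_sigma=3.0, max_iter=20):
    """Return a boolean mask of galaxies kept after iterative sigma clipping.

    v_los: line-of-sight velocities (km/s) of candidate members.
    This is an assumed, generic recipe, not any specific method's code.
    """
    v_los = np.asarray(v_los, dtype=float)
    keep = np.ones(v_los.size, dtype=bool)
    for _ in range(max_iter):
        if keep.sum() < 3:
            break  # too few galaxies left to estimate a dispersion
        mean = v_los[keep].mean()
        sigma = v_los[keep].std(ddof=1)
        # Only ever remove galaxies, so the loop must terminate.
        new_keep = keep & (np.abs(v_los - mean) <= n_sigma * sigma)
        if new_keep.sum() == keep.sum():
            break  # converged: nothing was clipped in this pass
        keep = new_keep
    return keep


# Toy data: 50 members spread over +/-1000 km/s plus two obvious interlopers.
velocities = np.concatenate(
    [np.linspace(-1000.0, 1000.0, 50), [6000.0, -7000.0]]
)
mask = sigma_clip_members(velocities)
print(f"kept {mask.sum()} of {mask.size} galaxies")
```

Because the mask can only shrink, the loop cannot oscillate. Real implementations often use more robust estimators of location and scale than the mean and standard deviation used here.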

Table A2.

Characteristics of the mass reconstruction process for the methods used in this comparison. The second to sixth columns illustrate whether a method calculates/utilizes the velocities, velocity dispersion, radial distance of galaxies from cluster centre, the richness, and the projected phase-space information of galaxies, respectively. If a method assumed a mass or number density profile it is indicated in columns seven and eight.

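Galaxy properties used to obtain group/cluster membership and estimate mass

| Method | Velocities | Velocity dispersion | Radial distance | Richness | Projected phase-space | Mass density profile | Number density profile |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PCN | Yes | No | No | Yes | No | No | No |
| PFN | Yes | No | No | Yes | No | No | No |
| NUM | No | No | No | Yes | Yes | No | No |
| ESC | Yes | Yes | Yes | No | No | Caustics | No |
| MPO | Yes | No | Yes | No | Yes | NFW | NFW |
| MP1 | Yes | No | Yes | No | Yes | NFW | NFW |
| RW | Yes | No | Yes | No | Yes | NFW | NFW |
| TAR | Yes | Yes | Yes | No | No | NFW | No |
| PCO | Yes | No | No | No | No | NFW | NFW |
| PFO | Yes | No | No | No | No | NFW | NFW |
| PCR | Yes | No | Yes | No | No | No | No |
| PFR | Yes | No | Yes | No | No | No | No |
| MVM | Yes | Yes | Yes | No | No | NFW | No |
| AS1 | Yes | Yes | No | No | No | No | No |
| AS2 | Yes | No | Yes | No | Yes | No | No |
| AvL | Yes | Yes | Yes | No | No | No | No |
| CLE | Yes | Yes | No | No | No | NFW | NFW |
| CLN | Yes | Yes | No | No | No | NFW | NFW |
| SG1 | Yes | Yes | Yes | No | No | No | No |
| SG2 | Yes | Yes | Yes | No | No | No | No |
| SG3 | Yes | Yes | Yes | No | No | No | No |
| PCS | Yes | Yes | No | No | No | No | No |
| PFS | Yes | Yes | No | No | No | No | No |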

APPENDIX B: DS AND KAPPA TEST PTE VALUES FOR ALL CLUSTERS

Figure B1.

The DS and Kappa test PTE values for the cluster sample. Black symbols indicate clusters that are not defined as highly substructured by either the DS test or the Kappa test (688 clusters, 73 per cent of the sample). Blue symbols indicate clusters that either the DS test or the Kappa test has defined as highly substructured (255 clusters, 27 per cent). Red symbols indicate clusters that have been defined as highly dynamically substructured by both the DS and Kappa tests (147 clusters, 15.5 per cent). We note that the DS test detects significant dynamical substructure in 215 clusters, 23 per cent of the sample. This is a higher detection rate than that of the Kappa test, which finds 187 clusters, 20 per cent of the sample, to be dynamically substructured.
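
The counts in this caption are tied together by inclusion-exclusion: clusters flagged by either test equal those flagged by the DS test plus those flagged by the Kappa test, minus those flagged by both. The short Python check below recomputes the derived counts from the quoted inputs. The variable names are ours and purely illustrative.

```python
# Counts quoted in the Figure B1 caption.
n_total = 943   # clusters in the sample
n_ds = 215      # flagged by the DS test
n_kappa = 187   # flagged by the Kappa test
n_both = 147    # flagged by both tests

n_either = n_ds + n_kappa - n_both   # inclusion-exclusion
n_neither = n_total - n_either

rows = [
    ("either test", n_either),
    ("neither test", n_neither),
    ("DS only", n_ds - n_both),
    ("Kappa only", n_kappa - n_both),
]
for label, n in rows:
    print(f"{label:>12}: {n:4d} clusters ({100.0 * n / n_total:.1f} per cent)")
```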

APPENDIX C: DYNAMICAL SUBSTRUCTURE TEST DETECTION AND CLUSTER MASS

Figure C1.

The fraction of highly substructured clusters as a function of log true mass, where clusters are deemed substructured if either the DS test or the Kappa test detects significant dynamical substructure. The clusters are binned into seven linearly spaced bins in log true mass. The error bars represent the standard deviation of a set of fractions calculated by randomly sampling the data with replacement (n = 500 iterations). The DS and Kappa tests detect a higher fraction of clusters with substructure at higher cluster mass (and hence richness). This trend of dynamically disturbed clusters having higher masses is also identified in several observational studies that use different dynamical substructure tests (e.g. de Carvalho et al. 2017; Roberts & Parker 2017).
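
The binning and bootstrap error bars described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for Figure C1. We assume the resampling is done within each mass bin, and the function name `binned_fraction_with_bootstrap`, its arguments, and the toy inputs are hypothetical.

```python
import numpy as np


def binned_fraction_with_bootstrap(log_mass, is_sub, n_bins=7, n_boot=500, seed=0):
    """Substructured fraction per log-mass bin, with bootstrap error bars.

    log_mass: log10 of true cluster mass; is_sub: True where a cluster is
    flagged as substructured. Resampling within each bin is an assumption.
    """
    log_mass = np.asarray(log_mass, dtype=float)
    is_sub = np.asarray(is_sub, dtype=bool)
    edges = np.linspace(log_mass.min(), log_mass.max(), n_bins + 1)
    # Digitize against interior edges so indices run from 0 to n_bins - 1.
    idx = np.digitize(log_mass, edges[1:-1])
    rng = np.random.default_rng(seed)
    frac = np.full(n_bins, np.nan)
    err = np.full(n_bins, np.nan)
    for b in range(n_bins):
        flags = is_sub[idx == b]
        if flags.size == 0:
            continue  # leave empty bins as NaN
        frac[b] = flags.mean()
        # n_boot resamples of the bin, drawn with replacement.
        resamples = rng.choice(flags, size=(n_boot, flags.size), replace=True)
        err[b] = resamples.mean(axis=1).std()
    return edges, frac, err


# Toy input: 70 clusters, flagged as substructured above log mass 14.
toy_log_mass = np.linspace(13.0, 15.0, 70)
toy_is_sub = toy_log_mass > 14.0
edges, frac, err = binned_fraction_with_bootstrap(toy_log_mass, toy_is_sub)
print(np.round(frac, 2), np.round(err, 3))
```

Resampling the whole catalogue and rebinning each time is an equally reasonable reading of the caption. It would give slightly different error bars in sparsely populated bins.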

APPENDIX D: THE RICHNESS–MASS RELATION OF THE SAM2 MOCK CLUSTER CATALOGUE

Figure D1.

The richness versus mass of the 943 groups/clusters of the input SAM2 catalogues. Clusters deemed highly dynamically substructured by the DS or Kappa test are denoted by red circles, and the non-substructured clusters are denoted by black circles. The red line reflects a linear fit (described in Section 4.3) to the richness–mass relation for the substructured clusters of $\log(N_{\rm gal}) = 0.75\,(\log(M_{\rm 200c})-14.126)+1.67$. The black line reflects a linear fit to the richness–mass relation for the non-substructured clusters of $\log(N_{\rm gal}) = 0.70\,(\log(M_{\rm 200c})-14.126)+1.61$. The intrinsic scatter of the richness versus mass relation of all 943 SAM2 clusters is 0.12 dex. We note that the linear fit parameters are also very similar to those deduced by performing a simple linear fit.
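
To make the two fitted relations concrete, the Python snippet below evaluates them at a few masses. The slopes, intercepts, and pivot are those quoted in the caption. We assume base-10 logarithms, and the names `FITS`, `PIVOT`, and `log_richness` are hypothetical helpers.

```python
# (slope, intercept) pairs quoted in the Figure D1 caption.
PIVOT = 14.126  # pivot in log(M200c), as quoted
FITS = {
    "substructured": (0.75, 1.67),
    "non_substructured": (0.70, 1.61),
}


def log_richness(log_m200c, sample):
    """Fitted log(N_gal) at a given log(M200c) for the chosen subsample."""
    slope, intercept = FITS[sample]
    return slope * (log_m200c - PIVOT) + intercept


for log_m in (13.5, 14.0, 14.5):
    n_sub = 10.0 ** log_richness(log_m, "substructured")
    n_non = 10.0 ** log_richness(log_m, "non_substructured")
    print(f"log M = {log_m:.1f}: N_gal ~ {n_sub:.1f} (sub), {n_non:.1f} (non-sub)")
```

At the pivot mass the two fits differ only through their intercepts, 1.67 versus 1.61 in log richness.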

APPENDIX E: THE MEDIAN DIFFERENCE IN RECOVERED MASS FOR ALL 23 TECHNIQUES

Figure E1.

The median difference between recovered and true log mass versus true log mass, $\delta_{\rm M200c} = \log(M_{\rm 200c,rec})-\log(M_{\rm 200c,true})$, for all 23 methods. Clusters deemed highly dynamically substructured by the DS or Kappa test are denoted by red circles, and the non-substructured clusters are denoted by black circles. The red line reflects a linear fit for the substructured clusters of $\delta_{\rm M200c} = -0.06\,(\log(M_{\rm 200c,rec})-14.126)+0.019$. The black line reflects a linear fit for the non-substructured clusters of $\delta_{\rm M200c} = -0.09\,(\log(M_{\rm 200c,rec})-14.126)+0.07$.