-
GipsyX/RTGx, A New Tool Set for Space Geodetic Operations and Research
Authors:
Willy Bertiger,
Yoaz Bar-Sever,
Angie Dorsey,
Bruce Haines,
Nate Harvey,
Dan Hemberger,
Michael Heflin,
Wenwen Lu,
Mark Miller,
Angelyn W. Moore,
Dave Murphy,
Paul Ries,
Larry Romans,
Aurore Sibois,
Ant Sibthorpe,
Bela Szilagyi,
Michele Vallisneri,
Pascal Willis
Abstract:
GipsyX/RTGx is the Jet Propulsion Laboratory's (JPL) next generation software package for positioning, navigation, timing, and Earth science using measurements from three geodetic techniques: Global Navigation Satellite Systems (GNSS), Satellite Laser Ranging (SLR), and Doppler Orbitography and Radiopositioning Integrated by Satellite (DORIS); with Very Long Baseline Interferometry (VLBI) under development. The software facilitates combined estimation of geodetic and geophysical parameters using a Kalman filter approach on real or simulated data in both post-processing and in real-time. The estimated parameters include station coordinates and velocities, satellite orbits and clocks, Earth orientation, ionospheric and tropospheric delays. The software is also capable of full realization of a dynamic terrestrial reference through analysis and combination of time series of ground station coordinates.
We present some key aspects of its new architecture, and describe some of its major applications, including real-time orbit determination and ephemeris predictions in the U.S. Air Force Next Generation GPS Operational Control Segment (OCX), as well as in JPL's Global Differential GPS (GDGPS) System, supporting User Range Error (URE) of < 5 cm RMS; precision post-processing GNSS orbit determination, including JPL's contributions to the International GNSS Service (IGS) with URE in the 2 cm RMS range; precise point positioning (PPP) with ambiguity resolution, both statically and kinematically, for geodetic applications with 2 mm horizontal and 6.5 mm vertical repeatability for static positioning; and operational orbit and clock determination for Low Earth Orbiting (LEO) satellites, such as NASA's Gravity Recovery and Climate Experiment (GRACE) mission with GRACE relative clock alignment at the 20 ps level.
Submitted 27 April, 2020;
originally announced April 2020.
-
No Delay: Latency-Driven, Application Performance-Aware, Cluster Scheduling
Authors:
Diana Andreea Popescu,
Andrew W. Moore
Abstract:
Given the network latency variability observed in data centers, applications' performance is also determined by their placement within the data centre. We present NoMora, a cluster scheduling architecture whose core is represented by a latency-driven, application performance-aware, cluster scheduling policy. The policy places the tasks of an application taking into account the expected performance based on the measured network latency between pairs of hosts in the data center. Furthermore, if a tenant's application experiences increased network latency, and thus lower application performance, their application may be migrated to a better placement. Preliminary results show that our policy improves the overall average application performance by up to 13.4% and by up to 42% if preemption is enabled, and improves the task placement latency by a factor of 1.79x and the median algorithm runtime by 1.16x compared to a random policy on the Google cluster workload. This demonstrates that application performance can be improved by exploiting the relationship between network latency and application performance, and the current network conditions in a data center, while preserving the demands of low-latency cluster scheduling.
Submitted 18 August, 2019; v1 submitted 17 March, 2019;
originally announced March 2019.
-
Seek and Push: Detecting Large Traffic Aggregates in the Dataplane
Authors:
Jan Kučera,
Diana Andreea Popescu,
Gianni Antichi,
Jan Kořenek,
Andrew W. Moore
Abstract:
High level goals such as bandwidth provisioning, accounting and network anomaly detection can be easily met if high-volume traffic clusters are detected in real time. This paper presents Elastic Trie, an alternative to approaches leveraging controller-dataplane architectures.
Our solution is a novel push-based network monitoring approach that allows detection, within the dataplane, of high-volume traffic clusters. Notifications from the switch to the controller can be sent only as required, avoiding the transmission or processing of unnecessary data. Furthermore, the dataplane can iteratively refine the responsible IP prefixes, allowing a controller to receive information at a flexible granularity. We report and discuss an evaluation of our P4-based prototype, showing that our solution can detect (with 95% precision) hierarchical heavy hitters and superspreaders using less than 8KB or 80KB of active memory respectively. Finally, Elastic Trie can identify changes in the network traffic patterns that are symptomatic of Denial-of-Service attack events.
Submitted 15 May, 2018;
originally announced May 2018.
-
Extending programs with debug-related features, with application to hardware development
Authors:
Nik Sultana,
Salvator Galea,
David Greaves,
Marcin Wojcik,
Noa Zilberman,
Richard Clegg,
Luo Mai,
Richard Mortier,
Peter Pietzuch,
Jon Crowcroft,
Andrew W Moore
Abstract:
The capacity and programmability of reconfigurable hardware such as FPGAs has improved steadily over the years, but they do not readily provide any mechanisms for monitoring or debugging running programs. Such mechanisms need to be written into the program itself. This is done using ad hoc methods and primitive tools when compared to CPU programming. This complicates the programming and debugging of reconfigurable hardware. We introduce Program-hosted Directability (PhD), the extension of programs to interpret direction commands at runtime to enable debugging, monitoring and profiling. Normally in hardware development such features are fixed at compile time. We present a language of directing commands, specify its semantics in terms of a simple controller that is embedded with programs, and implement a prototype for directing network programs running in hardware. We show that this approach affords significant flexibility with low impact on hardware utilisation and performance.
Submitted 28 May, 2017;
originally announced May 2017.
-
Prototyping RISC Based, Reconfigurable Networking Applications in Open Source
Authors:
Jong Hun Han,
Noa Zilberman,
Bjoern A. Zeeb,
Andreas Fiessler,
Andrew W. Moore
Abstract:
In the last decade we have witnessed a rapid growth in data center systems, requiring new and highly complex networking devices. The need to refresh networking infrastructure whenever new protocols or functions are introduced, and the increasing costs that this entails, are of a concern to all data center providers. New generations of Systems on Chip (SoC), integrating microprocessors and higher bandwidth interfaces, are an emerging solution to this problem. These devices permit entirely new systems and architectures that can obviate the replacement of existing networking devices while enabling seamless functionality change. In this work, we explore open source, RISC based, SoC architectures with high performance networking capabilities. The prototype architectures are implemented on the NetFPGA-SUME platform. Beyond details of the architecture, we also describe the hardware implementation and the porting of operating systems to the platform. The platform can be exploited for the development of practical networking appliances, and we provide use case examples.
Submitted 16 December, 2016;
originally announced December 2016.
-
The Norma cluster (ACO3627) -- III. The Distance and Peculiar Velocity via the Near-Infrared Ks-band Fundamental Plane
Authors:
T. Mutabazi,
S. L. Blyth,
P. A. Woudt,
J. R. Lucey,
T. H. Jarrett,
M. Bilicki,
A. C. Schroder,
S. A. W. Moore
Abstract:
While Norma (ACO3627) is the richest cluster in the Great Attractor (GA) region, its role in the local dynamics is poorly understood. The Norma cluster has a mean redshift (z_CMB) of 0.0165 and has been proposed as the "core" of the GA. We have used the Ks-band Fundamental Plane (FP) to measure Norma cluster's distance with respect to the Coma cluster. We report FP photometry parameters (effective radii and surface brightnesses), derived from ESO NTT SOFI images, and velocity dispersions, from AAT 2dF spectroscopy, for 31 early-type galaxies in the cluster. For the Coma cluster we use 2MASS images and SDSS velocity dispersion measurements for 121 early-type galaxies to generate the calibrating FP dataset. For the combined Norma-Coma sample we measure FP coefficients of a=1.465+/-0.059 and b=0.326+/-0.020. We find an rms scatter, in log(central velocity dispersion) of 0.08 dex which corresponds to a distance uncertainty of 28% per galaxy. The zero point offset between Norma's and Coma's FPs is 0.154+/-0.014 dex. Assuming that the Coma cluster is at rest with respect to the cosmic microwave background frame and z_CMB(Coma)=0.0240, we derive a distance to the Norma cluster of 5026+/-160 km/s, and the derived peculiar velocity is -72+/-170 km/s, i.e., consistent with zero. This is lower than previously reported positive peculiar velocities for clusters/groups/galaxies in the GA region and hence the Norma cluster may indeed represent the GA's "core".
Submitted 30 January, 2014; v1 submitted 29 January, 2014;
originally announced January 2014.
-
Challenges in the capture and dissemination of measurements from high-speed networks
Authors:
R. G. Clegg,
M. S. Withall,
A. W. Moore,
I. W. Phillips,
D. J. Parish,
M. Rio,
R. Landa,
H. Haddadi,
K. Kyriakopoulos,
J. Auge,
R. Clayton,
D. Salmon
Abstract:
The production of a large-scale monitoring system for a high-speed network leads to a number of challenges. These challenges are not purely technical but also socio-political and legal. The number of stakeholders in such a monitoring activity is large, including the network operators, the users, the equipment manufacturers and of course the monitoring researchers. The MASTS project (Measurement at All Scales in Time and Space) was created to instrument the high-speed JANET Lightpath network, and has been extended to incorporate other paths supported by JANET(UK).
Challenges the project has faced have included: simple access to the network; legal issues involved in the storage and dissemination of the captured information, which may be personal; the volume of data captured and the rate at which this data appears at store. To this end the MASTS system will have established four monitoring points each capturing packets on a high speed link. Traffic header data will be continuously collected, anonymised, indexed, stored and made available to the research community. A legal framework for the capture and storage of network measurement data has been developed which allows the anonymised IP traces to be used for research purposes.
Submitted 27 March, 2013;
originally announced March 2013.
-
Dual-Tree Fast Gauss Transforms
Authors:
Dongryeol Lee,
Alexander G. Gray,
Andrew W. Moore
Abstract:
Kernel density estimation (KDE) is a popular statistical technique for estimating the underlying density distribution with minimal assumptions. Although KDE can be shown to achieve asymptotic estimation optimality for any input distribution, cross-validating for an optimal parameter requires significant computation dominated by kernel summations. In this paper we present an improvement to the dual-tree algorithm, the first practical kernel summation algorithm for general dimension. Our extension is based on the series-expansion for the Gaussian kernel used by fast Gauss transform. First, we derive two additional pieces of analytical machinery for extending the original algorithm to utilize a hierarchical data structure, demonstrating the first truly hierarchical fast Gauss transform. Second, we show how to integrate the series-expansion approximation within the dual-tree approach to compute kernel summations with a user-controllable relative error bound. We evaluate our algorithm on real-world datasets in the context of optimal bandwidth selection in kernel density estimation. Our results demonstrate that our new algorithm is the only one that guarantees a hard relative error bound and offers fast performance across a wide range of bandwidths evaluated in cross validation procedures.
Submitted 14 February, 2011;
originally announced February 2011.
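To make the cost described in the abstract above concrete, the following is a minimal sketch of the naive computation that tree-based methods aim to accelerate: a one-dimensional Gaussian kernel sum and a leave-one-out likelihood score for bandwidth selection. It is not the dual-tree algorithm or the fast Gauss transform; the function names, the 1-D restriction, and the choice of leave-one-out log-likelihood as the cross-validation score are all assumptions made for illustration.

```python
import math


def gaussian_kde(query, data, bandwidth):
    """Naive Gaussian kernel density estimate at one 1-D query point (O(N))."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((query - x) / bandwidth) ** 2) for x in data
    )


def loo_log_likelihood(data, bandwidth):
    """Leave-one-out log-likelihood of the data; needs O(N^2) kernel evaluations."""
    n = len(data)
    norm = 1.0 / ((n - 1) * bandwidth * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i, xi in enumerate(data):
        # Kernel summation over every other point: the dominant cost.
        s = sum(
            math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2)
            for j, xj in enumerate(data)
            if j != i
        )
        if s == 0.0:
            # The sum underflowed: this bandwidth gives the point zero density.
            return float("-inf")
        total += math.log(norm * s)
    return total


def select_bandwidth(data, candidates):
    """Pick the candidate bandwidth with the highest leave-one-out score."""
    return max(candidates, key=lambda h: loo_log_likelihood(data, h))
```

Each candidate bandwidth requires a full pass of pairwise kernel evaluations, which is why the summation step dominates cross-validation on large datasets.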
-
Beyond Node Degree: Evaluating AS Topology Models
Authors:
Hamed Haddadi,
Damien Fay,
Almerima Jamakovic,
Olaf Maennel,
Andrew W. Moore,
Richard Mortier,
Miguel Rio,
Steve Uhlig
Abstract:
Many models have been proposed to generate Internet Autonomous System (AS) topologies, most of which make structural assumptions about the AS graph. In this paper we compare AS topology generation models with several observed AS topologies. In contrast to most previous works, we avoid making assumptions about which topological properties are important to characterize the AS topology. Our analysis shows that, although matching degree-based properties, the existing AS topology generation models fail to capture the complexity of the local interconnection structure between ASs. Furthermore, we use BGP data from multiple vantage points to show that additional measurement locations significantly affect local structure properties, such as clustering and node centrality. Degree-based properties, however, are not notably affected by additional measurement locations. These observations are particularly valid in the core. The shortcomings of AS topology generation models stem from an underestimation of the complexity of the connectivity in the core caused by inappropriate use of BGP data.
Submitted 13 July, 2008;
originally announced July 2008.
-
The Norma Cluster (ACO 3627): I. A Dynamical Analysis of the Most Massive Cluster in the Great Attractor
Authors:
P. A. Woudt,
R. C. Kraan-Korteweg,
J. Lucey,
A. P. Fairall,
S. A. W. Moore
Abstract:
A detailed dynamical analysis of the nearby rich Norma cluster (ACO 3627) is presented. From radial velocities of 296 cluster members, we find a mean velocity of 4871 +/- 54 km/s and a velocity dispersion of 925 km/s. The mean velocity of the E/S0 population (4979 +/- 85 km/s) is offset with respect to that of the S/Irr population (4812 +/- 70 km/s) by Δv = 164 km/s in the cluster rest frame. This offset increases towards the core of the cluster. The E/S0 population is free of any detectable substructure and appears relaxed. Its shape is clearly elongated with a position angle that is aligned along the dominant large-scale structures in this region, the so-called Norma wall. The central cD galaxy has a very large peculiar velocity of 561 km/s which is most probably related to an ongoing merger at the core of the cluster. The spiral/irregular galaxies reveal a large amount of substructure; two dynamically distinct subgroups within the overall spiral-population have been identified, located along the Norma wall elongation. The dynamical mass of the Norma cluster within its Abell radius is 1 - 1.1 x 10^15 h^-1_73 M_Sun. One of the cluster members, the spiral galaxy WKK 6176 which recently was observed to have a 70 kpc X-ray tail, reveals numerous striking low-brightness filaments pointing away from the cluster centre suggesting strong interaction with the intracluster medium.
Submitted 15 June, 2007;
originally announced June 2007.
-
The Effect of Large-Scale Structure on the SDSS Galaxy Three-Point Correlation Function
Authors:
R. C. Nichol,
R. K. Sheth,
Y. Suto,
A. J. Gray,
I. Kayo,
R. H. Wechsler,
F. Marin,
G. Kulkarni,
M. Blanton,
A. J. Connolly,
J. P. Gardner,
B. Jain,
C. J. Miller,
A. W. Moore,
A. Pope,
J. Pun,
D. Schneider,
J. Schneider,
A. Szalay,
I. Szapudi,
I. Zehavi,
N. A. Bahcall,
I. Csabai,
J. Brinkmann
Abstract:
We present measurements of the normalised redshift-space three-point correlation function (Q_z) of galaxies from the Sloan Digital Sky Survey (SDSS) main galaxy sample. We have applied our "npt" algorithm to both a volume-limited (36738 galaxies) and magnitude-limited sample (134741 galaxies) of SDSS galaxies, and find consistent results between the two samples, thus confirming the weak luminosity dependence of Q_z recently seen by other authors. We compare our results to other Q_z measurements in the literature and find them to be consistent within the full jack-knife error estimates. However, we find these errors are significantly increased by the presence of the "Sloan Great Wall" (at z ~ 0.08) within these two SDSS datasets, which changes the 3-point correlation function (3PCF) by 70% on large scales (s>=10h^-1 Mpc). If we exclude this supercluster, our observed Q_z is in better agreement with that obtained from the 2dFGRS by other authors, thus demonstrating the sensitivity of these higher-order correlation functions to large-scale structures in the Universe. This analysis highlights that the SDSS datasets used here are not "fair samples" of the Universe for the estimation of higher-order clustering statistics and larger volumes are required. We study the shape-dependence of Q_z(s,q,theta) as one expects this measurement to depend on scale if the large scale structure in the Universe has grown via gravitational instability from Gaussian initial conditions. On small scales (s <= 6h^-1 Mpc), we see some evidence for shape-dependence in Q_z, but at present our measurements are consistent with a constant within the errors (Q_z ~ 0.75 +/- 0.05). On scales >10h^-1 Mpc, we see considerable shape-dependence in Q_z.
Submitted 24 February, 2006;
originally announced February 2006.
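For readers unfamiliar with the notation in the abstract above, the normalised three-point correlation function is conventionally the ratio of the connected three-point function to the cyclic sum of products of two-point functions. The abstract does not spell this out, so the form below is the standard hierarchical normalisation rather than a quotation from the paper, and the symbols \zeta, \xi and the triangle side lengths s_{12}, s_{23}, s_{31} are assumed notation. The (s, q, theta) labelling used in the abstract is one way of parametrising the same triangle.

```latex
Q_z(s_{12}, s_{23}, s_{31}) =
  \frac{\zeta(s_{12}, s_{23}, s_{31})}
       {\xi(s_{12})\,\xi(s_{23}) + \xi(s_{23})\,\xi(s_{31}) + \xi(s_{31})\,\xi(s_{12})}
```

Under this normalisation a constant Q_z means the three-point function simply tracks products of two-point functions, which is why shape dependence in Q_z is the quantity of interest.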
-
Statistical Computations with AstroGrid and the Grid
Authors:
Robert C Nichol,
Garry Smith,
Christopher J Miller,
Chris Genovese,
Larry Wasserman,
Brent Bryan,
Alexander Gray,
Jeff Schneider,
Andrew W Moore
Abstract:
We outline our first steps towards marrying two new and emerging technologies: the Virtual Observatory (e.g., AstroGrid) and the computational grid. We discuss the construction of VOTechBroker, which is a modular software tool designed to abstract the tasks of submission and management of a large number of computational jobs to a distributed computer system. The broker will also interact with the AstroGrid workflow and MySpace environments. We present our planned usage of the VOTechBroker in computing a huge number of n-point correlation functions from the SDSS, as well as fitting over a million CMBfast models to the WMAP data.
Submitted 15 November, 2005;
originally announced November 2005.
-
NOAO Fundamental Plane Survey -- II. Age and Metallicity along the Red Sequence
Authors:
Jenica E. Nelan,
Russell J. Smith,
Michael J. Hudson,
Gary A. Wegner,
John R. Lucey,
Stephen A. W. Moore,
Stephen J. Quinney,
Nicholas B. Suntzeff
Abstract:
We present spectroscopic linestrength data for 4097 red-sequence galaxies in 93 low-redshift galaxy clusters, and use these to investigate variations in average stellar populations as a function of galaxy mass. Our analysis includes an improved treatment of nebular emission contamination, which affects ~10% of the sample galaxies. Using the stellar population models of D. Thomas and collaborators, we simultaneously fit twelve observed linestrength-sigma relations in terms of common underlying trends of age, [Z/H] (total metallicity) and a/Fe (alpha-element enhancement). We find that the observed linestrength-sigma relations can be explained only if higher-mass red-sequence galaxies are, on average, older, more metal rich, and more alpha-enhanced than lower-mass galaxies. Quantitatively, the scaling relations are age=sigma^(0.59+/-0.13), Z/H=sigma^(0.53+/-0.08) and a/Fe=sigma^(0.31+/-0.06), where the errors reflect the range obtained using different subsets of indices. We conclude that although the stars in giant red galaxies in clusters formed early, most of the galaxies at the faint end joined the red sequence only at recent epochs. This "down-sizing" trend is in good qualitative agreement with observations of the red sequence at higher redshifts, but is not predicted by semi-analytic models of galaxy formation.
Submitted 13 May, 2005;
originally announced May 2005.
-
Multi-Tree Methods for Statistics on Very Large Datasets in Astronomy
Authors:
Alexander G. Gray,
Andrew W. Moore,
Robert C. Nichol,
Andrew J. Connolly,
Christopher Genovese,
Larry Wasserman
Abstract:
Many fundamental statistical methods have become critical tools for scientific data analysis yet do not scale tractably to modern large datasets. This paper will describe very recent algorithms based on computational geometry which have dramatically reduced the computational complexity of 1) kernel density estimation (which also extends to nonparametric regression, classification, and clustering), and 2) the n-point correlation function for arbitrary n. These new multi-tree methods typically yield orders of magnitude in speedup over the previous state of the art for similar accuracy, making millions of data points tractable on desktop workstations for the first time.
Submitted 8 January, 2004;
originally announced January 2004.
-
Stellar populations in early-type Coma cluster galaxies - I. The data
Authors:
Stephen A. W. Moore,
John R. Lucey,
Harald Kuntschner,
Matthew Colless
Abstract:
We present a homogeneous and high signal-to-noise data set (mean S/N of ~60 per Å) of Lick/IDS stellar population line indices and central velocity dispersions for a sample of 132 bright (b_j < 18.0) galaxies within the central 1 deg (= 1.26 h^-1 Mpc) of the nearby rich Coma cluster (A1656). Our observations include 73 per cent (100 out of 137) of the total early-type galaxy population (b_j < 18.0). Observations were made with the WHT 4.2 metre and the AUTOFIB2/WYFFOS multi-object spectroscopy instrument (resolution of 2.2 Å FWHM) using 2.7'' diameter fibres (= 0.94 h^-1 kpc). The data in this paper have well characterised errors, calculated in a rigorous and statistical way. Data are compared to previous studies and are demonstrated to be of high quality and well calibrated on to the Lick/IDS system. Our data have median errors of ~0.1 Å for atomic line indices, ~0.008 mag for molecular line indices, and 0.015 dex for velocity dispersions. This work provides a well-defined, high-quality baseline at z~0 for studies of medium to high redshift clusters. Subsequent papers will use this data set to probe the stellar populations (which act as fossil records of galaxy formation and evolution) and the spectro-photometric relations of the bright early-type galaxies within the core of the Coma cluster.
Submitted 18 March, 2002;
originally announced March 2002.
-
Non-Parametric Inference in Astrophysics
Authors:
Larry Wasserman,
Christopher J. Miller,
Robert C. Nichol,
Chris Genovese,
Woncheol Jang,
Andrew J. Connolly,
Andrew W. Moore,
Jeff Schneider,
the PICA group
Abstract:
We discuss non-parametric density estimation and regression for astrophysics problems. In particular, we show how to compute non-parametric confidence intervals for the location and size of peaks of a function. We illustrate these ideas with recent data on the Cosmic Microwave Background. We also briefly discuss non-parametric Bayesian inference.
Submitted 3 December, 2001;
originally announced December 2001.
-
The Fundamental Properties of Early-type Galaxies in the Coma Cluster
Authors:
Stephen A. W. Moore,
J. R. Lucey,
H. Kuntschner,
R. L. Davies,
M. Colless
Abstract:
We report the results of a high quality spectral study of early-type galaxies within the Coma Cluster core. Stellar population analyses using Lick/IDS indices to break the age/metallicity degeneracy are presented, probing their formation history and properties. A clear metallicity trend and a dominant single age population are found.
Submitted 2 November, 2001;
originally announced November 2001.
-
Computational AstroStatistics: Fast and Efficient Tools for Analysing Huge Astronomical Data Sources
Authors:
R. C. Nichol,
S. Chong,
A. J. Connolly,
S. Davies,
C. Genovese,
A. M. Hopkins,
C. J. Miller,
A. W. Moore,
D. Pelleg,
G. T. Richards,
J. Schneider,
I. Szapudi,
L. Wasserman
Abstract:
I present here a review of past and present multi-disciplinary research of the Pittsburgh Computational AstroStatistics (PiCA) group. This group is dedicated to developing fast and efficient statistical algorithms for analysing huge astronomical data sources. I begin with a short review of multi-resolutional kd-trees, which are the building blocks for many of our algorithms, for example quick range queries and fast n-point correlation functions. I will present new results from the use of Mixture Models (Connolly et al. 2000) in density estimation of multi-color data from the Sloan Digital Sky Survey (SDSS), specifically the selection of quasars and the automated identification of X-ray sources. I will also present a brief overview of the False Discovery Rate (FDR) procedure (Miller et al. 2001a) and show how it has been used in the detection of "Baryon Wiggles" in the local galaxy power spectrum and source identification in radio data. Finally, I will look forward to new research on an automated Bayes Network anomaly detector and the possible use of the Locally Linear Embedding algorithm (LLE; Roweis & Saul 2000) for spectral classification of SDSS spectra.
Submitted 9 October, 2001;
originally announced October 2001.
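As a rough illustration of the False Discovery Rate procedure mentioned in the abstract above, the following sketch implements the standard Benjamini-Hochberg step-up rule. The abstract does not say which FDR variant is used, so this choice is an assumption, and the function name and the `alpha` parameter are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null is rejected.

    Assumes the standard Benjamini-Hochberg step-up rule: sort the
    p-values, find the largest rank k with p_(k) <= alpha * k / n,
    and reject the k smallest p-values.
    """
    n = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(n), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / n:
            cutoff_rank = rank  # keep the largest rank that passes

    rejected = [False] * n
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

Unlike a fixed per-test threshold, the cutoff adapts to the data: the more small p-values there are, the more lenient the threshold applied to each becomes.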
-
Fast Algorithms and Efficient Statistics: Density Estimation in Large Astronomical Datasets
Authors:
A. J. Connolly,
C. Genovese,
A. W. Moore,
R. C. Nichol,
J. Schneider,
L. Wasserman
Abstract:
In this paper, we outline the use of Mixture Models in density estimation of large astronomical databases. This method of density estimation has been known in Statistics for some time but has not been implemented because of the large computational cost. Herein, we detail an implementation of the Mixture Model density estimation based on multi-resolutional KD-trees which makes this statistical technique into a computationally tractable problem. We provide the theoretical and experimental background for using a mixture model of Gaussians based on the Expectation Maximization (EM) Algorithm. Applying these analyses to simulated data sets we show that the EM algorithm - using the AIC penalized likelihood to score the fit - out-performs the best kernel density estimate of the distribution while requiring no "fine-tuning" of the input algorithm parameters. We find that EM can accurately recover the underlying density distribution from point processes thus providing an efficient adaptive smoothing method for astronomical source catalogs. To demonstrate the general application of this statistic to astrophysical problems we consider two cases of density estimation: the clustering of galaxies in redshift space and the clustering of stars in color space. From these data we show that EM provides an adaptive smoothing of the distribution of galaxies in redshift space (describing accurately both the small and large-scale features within the data) and a means of identifying outliers in multi-dimensional color-color space (e.g. for the identification of high redshift QSOs). Automated tools such as those based on the EM algorithm will be needed in the analysis of the next generation of astronomical catalogs (2MASS, FIRST, PLANCK, SDSS) and ultimately in the development of the National Virtual Observatory.
Submitted 11 August, 2000;
originally announced August 2000.
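The idea in the abstract above, fitting a Gaussian mixture with EM and scoring the fit with the AIC, can be sketched in one dimension as follows. This is a minimal illustration only: it does not use the KD-tree acceleration the paper describes, and the function names, the quantile-based initialisation and the variance floor are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances, log_likelihood).
    Initialisation at evenly spaced quantiles is an assumption of this sketch.
    """
    n = len(data)
    ordered = sorted(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n

    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor avoids collapsing components

    log_lik = sum(
        math.log(max(sum(w * normal_pdf(x, m, v)
                         for w, m, v in zip(weights, means, variances)), 1e-300))
        for x in data
    )
    return weights, means, variances, log_lik


def aic(log_lik, k):
    """AIC for a 1-D mixture: k means + k variances + (k - 1) free weights."""
    return 2 * (3 * k - 1) - 2.0 * log_lik


def make_demo_data(seed=42, n_per_cluster=200):
    """Synthetic two-cluster sample used only for illustration."""
    rng = random.Random(seed)
    return ([rng.gauss(0.0, 1.0) for _ in range(n_per_cluster)]
            + [rng.gauss(10.0, 1.0) for _ in range(n_per_cluster)])
```

In use, one would fit the model for several values of `k` and keep the fit with the lowest AIC, so the number of components is chosen by the penalized likelihood rather than tuned by hand.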
-
Computational AstroStatistics: Fast Algorithms and Efficient Statistics for Density Estimation in Large Astronomical Datasets
Authors:
R. C. Nichol,
A. J. Connolly,
A. W. Moore,
J. Schneider,
C. Genovese,
L. Wasserman
Abstract:
We present initial results on the use of Mixture Models for density estimation in large astronomical databases. We provide herein both the theoretical and experimental background for using a mixture model of Gaussians based on the Expectation Maximization (EM) Algorithm. Applying these analyses to simulated data sets we show that the EM algorithm - using both the AIC and BIC penalized likelihoods to score the fit - can out-perform the best kernel density estimate of the distribution while requiring no "fine-tuning" of the input algorithm parameters. We find that EM can accurately recover the underlying density distribution from point processes thus providing an efficient adaptive smoothing method for astronomical source catalogs. To demonstrate the general application of this statistic to astrophysical problems we consider two cases of density estimation: the clustering of galaxies in redshift space and the clustering of stars in color space. From these data we show that EM provides an adaptive smoothing of the distribution of galaxies in redshift space (describing accurately both the small and large-scale features within the data) and a means of identifying outliers in multi-dimensional color-color space (e.g. for the identification of high redshift QSOs). Automated tools such as those based on the EM algorithm will be needed in the analysis of the next generation of astronomical catalogs (2MASS, FIRST, PLANCK, SDSS) and ultimately in the development of the National Virtual Observatory.
Submitted 26 July, 2000;
originally announced July 2000.
-
Reinforcement Learning: A Survey
Authors:
L. P. Kaelbling,
M. L. Littman,
A. W. Moore
Abstract:
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.
Submitted 30 April, 1996;
originally announced May 1996.
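Two of the issues named in the abstract above, trading off exploration and exploitation and learning from delayed reinforcement, can be illustrated with a small tabular Q-learning agent using an epsilon-greedy policy. The abstract does not name specific algorithms, so the choice of Q-learning, the five-state chain environment, and all parameter values are assumptions made for this sketch.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical chain of states 0..n_states-1.

    Actions: 0 = left, 1 = right. The only reward (1.0) is given on
    reaching the last state, so the reward is delayed for early states.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap keeps every episode finite
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore: random action
            else:
                best = max(q[state])       # exploit: greedy, random tie-break
                action = rng.choice([a for a in (0, 1) if q[state][a] == best])

            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The goal is terminal, so it has no bootstrapped future value.
            future = 0.0 if next_state == goal else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + future - q[state][action])

            state = next_state
            if state == goal:
                break
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q[:-1]]
```

With these settings the learned greedy policy moves right in every state, even though only the final transition is rewarded: the discount factor `gamma` propagates that reward back along the chain.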