Matching Catalogues by Probabilistic Pattern Classification
Authors:
D. J. Rohde,
M. R. Gallagher,
M. J. Drinkwater,
K. A. Pimbblet
Abstract:
We consider the statistical problem of catalogue matching from a machine learning perspective, with the goal of producing probabilistic outputs and using all available information. A framework is provided that unifies two existing approaches to producing probabilistic outputs in the literature, one based on combining distribution estimates and the other based on combining probabilistic classifiers. We apply both of these to the problem of matching the HIPASS radio catalogue, which has large positional uncertainties, to the much denser SuperCOSMOS catalogue, which has much smaller positional uncertainties. We demonstrate the utility of probabilistic outputs through a controllable trade-off between completeness and efficiency and by identifying objects that have a high probability of being rare. Finally, possible biasing effects in the output of these classifiers are highlighted and discussed.
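The trade-off mentioned in the abstract can be illustrated with a small sketch. It is not the authors' method: it assumes the common definitions of completeness (fraction of true matches that are accepted) and efficiency (fraction of accepted matches that are true), and the probabilities and labels below are hypothetical toy values, not data from the paper.

```python
def completeness_efficiency(probs, labels, threshold):
    """Accept candidates with match probability >= threshold.

    Assumed definitions (not taken from the paper):
      completeness = accepted true matches / all true matches
      efficiency   = accepted true matches / all accepted candidates
    """
    accepted = [label for p, label in zip(probs, labels) if p >= threshold]
    true_total = sum(labels)
    true_accepted = sum(accepted)
    completeness = true_accepted / true_total if true_total else 0.0
    # With nothing accepted there are no false matches; report 1.0 by convention.
    efficiency = true_accepted / len(accepted) if accepted else 1.0
    return completeness, efficiency


# Hypothetical classifier outputs and ground-truth labels (1 = true match).
probs = [0.95, 0.80, 0.60, 0.40, 0.20, 0.05]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.3, 0.5, 0.9):
    c, e = completeness_efficiency(probs, labels, threshold)
    print(f"threshold={threshold:.1f} completeness={c:.2f} efficiency={e:.2f}")
```

Raising the threshold in this toy example lowers completeness while raising efficiency, which is the kind of control a probabilistic output allows and a hard yes/no match does not.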
Submitted 9 May, 2006;
originally announced May 2006.
The HIPASS Catalogue: III - Optical Counterparts & Isolated Dark Galaxies
Authors:
Marianne T. Doyle,
M. J. Drinkwater,
D. J. Rohde,
K. A. Pimbblet,
M. Read,
M. J. Meyer,
M. A. Zwaan,
E. Ryan-Weber,
J. Stevens,
B. S. Koribalski,
R. L. Webster,
L. Staveley-Smith,
D. G. Barnes,
M. Howlett,
V. A. Kilborn,
M. Waugh,
M. J. Pierce,
R. Bhathal,
W. J. G. de Blok,
M. J. Disney,
R. D. Ekers,
K. C. Freeman,
D. A. Garcia,
B. K. Gibson,
J. Harnett
, et al. (16 additional authors not shown)
Abstract:
We present the largest catalogue to date of optical counterparts for HI radio-selected galaxies, Hopcat. Of the 4315 HI radio-detected sources from the HI Parkes All Sky Survey (Hipass) catalogue, we find optical counterparts for 3618 (84%) galaxies. Of these, 1798 (42%) have confirmed optical velocities and 848 (20%) are single matches without confirmed velocities. Some galaxy matches are members of galaxy groups. From these multiple galaxy matches, 714 (16%) have confirmed optical velocities and a further 258 (6%) galaxies are without confirmed velocities. For 481 (11%), multiple galaxies are present but no single optical counterpart can be chosen and 216 (5%) have no obvious optical galaxy present. Most of these 'blank fields' are in crowded fields along the Galactic plane or have high extinctions.
Isolated 'Dark galaxy' candidates are investigated using an extinction cut of ABj < 1 mag and the blank fields category. Of the 3692 galaxies with an ABj extinction < 1 mag, only 13 are also blank fields. Of these, 12 are eliminated, either through follow-up Parkes observations or because they are in crowded fields. The remaining one has a low surface brightness optical counterpart. Hence, no isolated optically dark galaxies have been found within the limits of the Hipass survey.
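As a quick consistency check on the counts quoted in this abstract, the match categories can be summed. The variable names are illustrative; the numbers are those stated above.

```python
# Counts quoted in the abstract (variable names are illustrative).
hipass_sources = 4315

single_with_velocity = 1798
single_without_velocity = 848
group_with_velocity = 714
group_without_velocity = 258
no_single_counterpart = 481
blank_fields = 216

# Sources with an identified optical counterpart.
with_counterpart = (single_with_velocity + single_without_velocity
                    + group_with_velocity + group_without_velocity)

# All categories together should account for every HIPASS source.
all_categories = with_counterpart + no_single_counterpart + blank_fields

print(with_counterpart, all_categories)
print(f"{100 * with_counterpart / hipass_sources:.1f}% with counterparts")
```

The four matched categories add up to the 3618 counterparts quoted, and all six categories together add up to the 4315 HIPASS sources.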
Submitted 30 May, 2005;
originally announced May 2005.
Applying Machine Learning to Catalogue Matching in Astrophysics
Authors:
D J Rohde,
M J Drinkwater,
M R Gallagher,
T Downs,
M T Doyle
Abstract:
We present the results of applying automated machine learning techniques to the problem of matching different object catalogues in astrophysics. In this study we take two partially matched catalogues, one of which has a large positional uncertainty. The two catalogues used here were taken from the HI Parkes All Sky Survey (HIPASS) and the SuperCOSMOS optical survey. Previous work had matched 44% (1887 objects) of HIPASS to the SuperCOSMOS catalogue.
A supervised learning algorithm was then applied to construct a model of the matched portion of our catalogue. Validation of the model shows that we achieved a good classification performance (99.12% correct).
Applying this model to the unmatched portion of the catalogue found 1209 new matches. This increases the number of matched objects from 1887 to 3096. The combination of these procedures yields a catalogue that is 72% matched.
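The quoted percentages can be reproduced with a short calculation. This abstract does not state the total number of HIPASS sources; the sketch below assumes the figure of 4315 given in the Hopcat abstract listed above.

```python
previously_matched = 1887
new_matches = 1209

# Assumption: total HIPASS source count taken from the Hopcat abstract,
# not from this abstract.
hipass_total = 4315

total_matched = previously_matched + new_matches
fraction_before = previously_matched / hipass_total
fraction_after = total_matched / hipass_total

print(total_matched)
print(f"before: {100 * fraction_before:.1f}%  after: {100 * fraction_after:.1f}%")
```

Under that assumption the fractions round to the 44% and 72% quoted in the abstract.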
Submitted 1 April, 2005;
originally announced April 2005.