-
Panopticon: a telescope for our times
Authors:
Will Saunders,
Timothy Chin,
Michael Goodwin
Abstract:
We present a design for a wide-field spectroscopic telescope. The only large powered mirror is spherical; the resulting spherical aberration is corrected for each target separately, giving exceptional image quality. The telescope is a transit design, but still allows all-sky coverage. Three simultaneous modes are proposed: (a) natural seeing multi-object spectroscopy with 12m aperture over 3 deg FoV with ~25,000 targets; (b) multi-object AO with 12m aperture over 3 deg FoV with ~100 AO-corrected Integral Field Units each with 4 arcsec FoV; (c) ground layer AO-corrected integral field spectroscopy with 15m aperture and 13 arcmin FoV. Such a telescope would be uniquely powerful for large-area follow-up of imaging surveys; in each mode, the AOmega and survey speed exceed all existing facilities combined. The expected cost of this design is relatively modest, much closer to $500M than $1000M.
Submitted 6 July, 2024;
originally announced July 2024.
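As a rough illustration of the AOmega (etendue) figure of merit mentioned in the abstract, the sketch below multiplies collecting area by field area for mode (a). It assumes an unobstructed circular aperture, a circular field, and that the quoted 3 deg is the field diameter; none of these assumptions is confirmed by the abstract, and the function name is hypothetical.

```python
import math


def etendue_m2_deg2(aperture_diameter_m: float, field_diameter_deg: float) -> float:
    """A*Omega for an idealised circular aperture and circular field of view.

    Assumption: no central obstruction or vignetting, and the field size
    is a diameter. Real instruments will deliver less than this.
    """
    collecting_area_m2 = math.pi * (aperture_diameter_m / 2.0) ** 2
    field_area_deg2 = math.pi * (field_diameter_deg / 2.0) ** 2
    return collecting_area_m2 * field_area_deg2


# Mode (a) figures quoted in the abstract: 12 m aperture, 3 deg field.
a_omega = etendue_m2_deg2(12.0, 3.0)
print(f"A-Omega is roughly {a_omega:.0f} m^2 deg^2")
```

Under these idealised assumptions the product is 81π², or about 800 m² deg²; the true value depends on details not given in the abstract.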
-
Monogamous subvarieties of the nilpotent cone
Authors:
Simon M. Goodwin,
Rachel Pengelly,
David I. Stewart,
Adam R. Thomas
Abstract:
Let $G$ be a reductive algebraic group over an algebraically closed field $k$ of prime characteristic not $2$, whose Lie algebra is denoted $\mathfrak{g}$. We call a subvariety $\mathfrak{X}$ of the nilpotent cone $N \subset \mathfrak{g}$ monogamous if for every $e\in \mathfrak{X}$, the $\mathfrak{sl}_2$-triples $(e,h,f)$ with $f\in \mathfrak{X}$ are conjugate under the centraliser $C_G(e)$. Building on work by the first two authors, we show there is a unique maximal closed $G$-stable monogamous subvariety $V \subset N$ and that it is an orbit closure, hence irreducible. We show that $V$ can also be characterised in terms of Serre's $G$-complete reducibility.
Submitted 28 June, 2024;
originally announced June 2024.
-
The Blue Multi Unit Spectroscopic Explorer (BlueMUSE) on the VLT: science drivers and overview of instrument design
Authors:
Johan Richard,
Rémi Giroud,
Florence Laurent,
Davor Krajnović,
Alexandre Jeanneau,
Roland Bacon,
Manuel Abreu,
Angela Adamo,
Ricardo Araujo,
Nicolas Bouché,
Jarle Brinchmann,
Zhemin Cai,
Norberto Castro,
Ariadna Calcines,
Diane Chapuis,
Adélaïde Claeyssens,
Luca Cortese,
Emanuele Daddi,
Christopher Davison,
Michael Goodwin,
Robert Harris,
Matthew Hayes,
Mathilde Jauzac,
Andreas Kelz,
Jean-Paul Kneib
, et al. (24 additional authors not shown)
Abstract:
BlueMUSE is a blue-optimised, medium spectral resolution, panoramic integral field spectrograph under development for the Very Large Telescope (VLT). With an optimised transmission down to 350 nm, spectral resolution of R$\sim$3500 on average across the wavelength range, and a large FoV (1 arcmin$^2$), BlueMUSE will open up a new range of galactic and extragalactic science cases facilitated by its specific capabilities. The BlueMUSE consortium includes 9 institutes located in 7 countries and is led by the Centre de Recherche Astrophysique de Lyon (CRAL). The BlueMUSE project development is currently in Phase A, with an expected first light at the VLT in 2031. We introduce here the Top Level Requirements (TLRs) derived from the main science cases, and then present an overview of the BlueMUSE system and its subsystems fulfilling these TLRs. We specifically emphasize the tradeoffs that are made and the key distinctions compared to the MUSE instrument, upon which the system architecture is built.
Submitted 19 June, 2024;
originally announced June 2024.
-
Maximum Manifold Capacity Representations in State Representation Learning
Authors:
Li Meng,
Morten Goodwin,
Anis Yazidi,
Paal Engelstad
Abstract:
The expanding research on manifold-based self-supervised learning (SSL) builds on the manifold hypothesis, which suggests that the inherent complexity of high-dimensional data can be unraveled through lower-dimensional manifold embeddings. Capitalizing on this, DeepInfomax with an unbalanced atlas (DIM-UA) has emerged as a powerful tool and yielded impressive results for state representations in reinforcement learning. Meanwhile, Maximum Manifold Capacity Representation (MMCR) presents a new frontier for SSL by optimizing class separability via manifold compression. However, MMCR demands extensive input views, resulting in significant computational costs and protracted pre-training durations. Bridging this gap, we present an innovative integration of MMCR into existing SSL methods, incorporating a discerning regularization strategy that enhances the lower bound of mutual information. We also propose a novel state representation learning method extending DIM-UA, embedding a nuclear norm loss to enforce manifold consistency robustly. On experimentation with the Atari Annotated RAM Interface, our method improves DIM-UA significantly with the same number of target encoding dimensions. The mean F1 score averaged over categories is 78% compared to 75% of DIM-UA. There are also compelling gains when implementing SimCLR and Barlow Twins. This supports our SSL innovation as a paradigm shift, enabling more nuanced high-dimensional data representations.
Submitted 22 May, 2024;
originally announced May 2024.
-
Building an Open-Source Community to Enhance Autonomic Nervous System Signal Analysis: DBDP-Autonomic
Authors:
Jessilyn Dunn,
Varun Mishra,
Md Mobashir Hasan Shandhi,
Hayoung Jeong,
Natasha Yamane,
Yuna Watanabe,
Bill Chen,
Matthew S. Goodwin
Abstract:
Smartphones and wearable sensors offer an unprecedented ability to collect peripheral psychophysiological signals across diverse timescales, settings, populations, and modalities. However, open-source software development has yet to keep pace with rapid advancements in hardware technology and availability, creating an analytical barrier that limits the scientific usefulness of acquired data. We propose a community-driven, open-source peripheral psychophysiological signal pre-processing and analysis software framework that could advance biobehavioral health by enabling more robust, transparent, and reproducible inferences involving autonomic nervous system data.
Submitted 29 March, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
The SAMI Galaxy Survey: galaxy spin is more strongly correlated with stellar population age than mass or environment
Authors:
S. M. Croom,
J. van de Sande,
S. P. Vaughan,
T. H. Rutherford,
C. P. Lagos,
S. Barsanti,
J. Bland-Hawthorn,
S. Brough,
J. J. Bryant,
M. Colless,
L. Cortese,
F. D'Eugenio,
A. Fraser-McKelvie,
M. Goodwin,
N. P. F. Lorente,
S. N. Richards,
A. Ristea,
S. M. Sweet,
S. K. Yi,
T. Zafar
Abstract:
We use the SAMI Galaxy Survey to examine the drivers of galaxy spin, $λ_{R_e}$, in a multi-dimensional parameter space including stellar mass, stellar population age (or specific star formation rate) and various environmental metrics (local density, halo mass, satellite vs. central). Using a partial correlation analysis we consistently find that age or specific star formation rate is the primary parameter correlating with spin. Light-weighted age and specific star formation rate are more strongly correlated with spin than mass-weighted age. In fact, across our sample, once the relation between light-weighted age and spin is accounted for, there is no significant residual correlation between spin and mass, or spin and environment. This result is strongly suggestive that present-day environment only indirectly influences spin, via the removal of gas and star formation quenching. That is, environment affects age, then age affects spin. Older galaxies then have lower spin, either due to stars being born dynamically hotter at high redshift, or due to secular heating. Our results appear to rule out environmentally dependent dynamical heating (e.g. galaxy-galaxy interactions) being important, at least within $1R_e$ where our kinematic measurements are made. The picture is more complex when we only consider high-mass galaxies ($M_*\gtrsim 10^{11}$M$_{\odot}$). While the age-spin relation is still strong for these high-mass galaxies, there is a residual environmental trend with central galaxies preferentially having lower spin, compared to satellites of the same age and mass. We argue that this trend is likely due to central galaxies being a preferred location for mergers.
Submitted 9 February, 2024;
originally announced February 2024.
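The idea behind a partial correlation analysis can be sketched with the textbook first-order formula, which measures the correlation between two variables after removing their shared dependence on a third. This is a generic Pearson-based illustration on toy data, not the survey's actual procedure (which may use a different estimator and more control variables); all variable names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_correlation(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: x and y are each driven by z plus independent (orthogonal) terms.
z = [1, 1, 1, 1, -1, -1, -1, -1]
e1 = [1, 1, -1, -1, 1, 1, -1, -1]
e2 = [1, -1, 1, -1, 1, -1, 1, -1]
x = [zi + a for zi, a in zip(z, e1)]
y = [zi + b for zi, b in zip(z, e2)]

print("raw correlation of x and y:", pearson(x, y))
print("partial correlation given z:", partial_correlation(x, y, z))
```

In this toy case x and y are clearly correlated, yet the partial correlation vanishes (up to floating-point rounding) once z is controlled for, which mirrors the kind of statement the abstract makes about spin, mass and environment once age is accounted for.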
-
Sound Source Separation Using Latent Variational Block-Wise Disentanglement
Authors:
Karim Helwani,
Masahito Togami,
Paris Smaragdis,
Michael M. Goodwin
Abstract:
While neural network approaches have made significant strides in resolving classical signal processing problems, it is often the case that hybrid approaches that draw insight from both signal processing and neural networks produce more complete solutions. In this paper, we present a hybrid classical digital signal processing/deep neural network (DSP/DNN) approach to source separation (SS) highlighting the theoretical link between variational autoencoders and classical approaches to SS. We propose a system that transforms the single channel under-determined SS task to an equivalent multichannel over-determined SS problem in a properly designed latent space. The separation task in the latent space is treated as finding a variational block-wise disentangled representation of the mixture. We show empirically that the design choices and the variational formulation of the task at hand, motivated by classical signal processing theory, lead to robustness to unseen out-of-distribution data and a reduced risk of overfitting. To address the resulting permutation issue, we explicitly incorporate a novel differentiable permutation loss function and augment the model with a memory mechanism to keep track of the statistics of the individual sources.
Submitted 8 February, 2024;
originally announced February 2024.
-
The SAMI galaxy survey: predicting kinematic morphology with logistic regression
Authors:
Sam P. Vaughan,
Jesse van de Sande,
A. Fraser-McKelvie,
Scott Croom,
Richard McDermid,
Benoit Liquet-Weiland,
Stefania Barsanti,
Luca Cortese,
Sarah Brough,
Sarah Sweet,
Julia J. Bryant,
Michael Goodwin,
Jon Lawrence
Abstract:
We use the SAMI galaxy survey to study the kinematic morphology-density relation: the observation that the fraction of slow rotator galaxies increases towards dense environments. We build a logistic regression model to quantitatively study the dependence of kinematic morphology (whether a galaxy is a fast rotator or slow rotator) on a wide range of parameters, without resorting to binning the data. Our model uses a combination of stellar mass, star-formation rate (SFR), $r$-band half-light radius and a binary variable based on whether the galaxy's observed ellipticity ($ε$) is less than 0.4. We show that, at fixed mass, size, SFR and $ε$, a galaxy's local environmental surface density ($\log_{10}(Σ_5/\mathrm{Mpc}^{-2})$) gives no further information about whether a galaxy is a slow rotator, i.e. the observed kinematic morphology-density relation can be entirely explained by the well-known correlations between environment and other quantities. We show how our model can be applied to different galaxy surveys to predict the fraction of slow rotators which would be observed and discuss its implications for the formation pathways of slow rotators.
Submitted 5 February, 2024;
originally announced February 2024.
-
The SAMI Galaxy Survey: Using Tidal Streams and Shells to Trace the Dynamical Evolution of Massive Galaxies
Authors:
Tomas H. Rutherford,
Jesse van de Sande,
Scott M. Croom,
Lucas M. Valenzuela,
Rhea-Silvia Remus,
Francesco D'Eugenio,
Sam P. Vaughan,
Henry R. M. Zovaro,
Sarah Casura,
Stefania Barsanti,
Joss Bland-Hawthorn,
Sarah Brough,
Julia J. Bryant,
Michael Goodwin,
Nuria Lorente,
Sree Oh,
Andrei Ristea
Abstract:
Slow rotator galaxies are distinct amongst galaxy populations, with simulations suggesting that a mix of minor and major mergers are responsible for their formation. A promising path to resolve outstanding questions on the type of merger responsible is to investigate deep imaging of massive galaxies for signs of potential merger remnants. We utilise deep imaging from the Subaru-Hyper Suprime Cam Wide data to search for tidal features in massive ($\log_{10}(M_*/M_{\odot}) > 10$) early-type galaxies (ETGs) in the SAMI Galaxy Survey. We perform a visual check for tidal features on images where the galaxy has been subtracted using a Multi-Gauss Expansion (MGE) model. We find that $31\pm 2$ percent of our sample show tidal features. When comparing galaxies with and without features, we find that the distributions in stellar mass, light-weighted mean stellar population age and H$α$ equivalent width are significantly different, whereas spin ($λ_{R_e}$), ellipticity and bulge to total ratio have similar distributions. When splitting our sample in age, we find that galaxies below the median age (10.8 Gyr) show a correlation between the presence of shells and lower $λ_{R_e}$, as expected from simulations. We also find that those younger galaxies classified as having "strong" shells have lower $λ_{R_e}$. However, simulations suggest that merger features become undetectable within $\sim 2-4$ Gyr post-merger. This implies that the relationship between tidal features and merger history disappears for galaxies with older stellar ages, i.e. those that are more likely to have merged long ago.
Submitted 5 February, 2024;
originally announced February 2024.
-
A Manifold Representation of the Key in Vision Transformers
Authors:
Li Meng,
Morten Goodwin,
Anis Yazidi,
Paal Engelstad
Abstract:
Vision Transformers implement multi-head self-attention via stacking multiple attention blocks. The query, key, and value are often intertwined and generated within those blocks via a single, shared linear transformation. This paper explores the concept of disentangling the key from the query and value, and adopting a manifold representation for the key. Our experiments reveal that decoupling and endowing the key with a manifold structure can enhance the model's performance. Specifically, ViT-B exhibits a 0.87% increase in top-1 accuracy, while Swin-T sees a boost of 0.52% in top-1 accuracy on the ImageNet-1K dataset, with eight charts in the manifold key. Our approach also yields positive results in object detection and instance segmentation tasks on the COCO dataset. We establish that these performance gains are not merely due to the simplicity of adding more parameters and computations. Future research may investigate strategies for cutting the budget of such representations and aim for further performance improvements based on our findings.
Submitted 7 June, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
Real-time Stereo Speech Enhancement with Spatial-Cue Preservation based on Dual-Path Structure
Authors:
Masahito Togami,
Jean-Marc Valin,
Karim Helwani,
Ritwik Giri,
Umut Isik,
Michael M. Goodwin
Abstract:
We introduce a real-time, multichannel speech enhancement algorithm which maintains the spatial cues of stereo recordings including two speech sources. Recognizing that each source has unique spatial information, our method utilizes a dual-path structure, ensuring the spatial cues remain unaffected during enhancement by applying source-specific common-band gain. This method also seamlessly integrates pretrained monaural speech enhancement, eliminating the need for retraining on stereo inputs. Source separation from stereo mixtures is achieved via spatial beamforming, with the steering vector for each source being adaptively updated using the post-enhancement output signal. This ensures accurate tracking of the spatial information. The final stereo output is derived by merging the spatial images of the enhanced sources, with its efficacy not heavily reliant on the separation performance of the beamforming. The algorithm runs in real-time on 10-ms frames with 40 ms of look-ahead. Evaluations reveal its effectiveness in enhancing speech and preserving spatial cues in both fully and sparsely overlapped mixtures.
Submitted 31 January, 2024;
originally announced February 2024.
-
MapAI: Precision in Building Segmentation
Authors:
Sander Riisøen Jyhne,
Morten Goodwin,
Per Arne Andersen,
Ivar Oveland,
Alexander Salveson Nossum,
Karianne Ormseth,
Mathilde Ørstavik,
Andrew C. Flatman
Abstract:
MapAI: Precision in Building Segmentation is a competition arranged with the Norwegian Artificial Intelligence Research Consortium (NORA) in collaboration with Centre for Artificial Intelligence Research at the University of Agder (CAIR), the Norwegian Mapping Authority, AI:Hub, Norkart, and the Danish Agency for Data Supply and Infrastructure. The competition will be held in the fall of 2022. It will be concluded at the Northern Lights Deep Learning conference focusing on the segmentation of buildings using aerial images and laser data. We propose two different tasks to segment buildings, where the first task can only utilize aerial images, while the second must use laser data (LiDAR) with or without aerial images. Furthermore, we use IoU and Boundary IoU to properly evaluate the precision of the models, with the latter being an IoU measure that evaluates the results' boundaries. We provide the participants with a training dataset and keep a test dataset for evaluation.
Submitted 9 January, 2024;
originally announced January 2024.
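For readers unfamiliar with the IoU metric named in the abstract, the following minimal sketch computes plain intersection-over-union for two binary masks stored as nested lists. It is an illustration only, not the competition's evaluation code; Boundary IoU, which restricts the comparison to pixels near the mask contours, is not shown. The function name and the convention of returning 1.0 for two empty masks are assumptions.

```python
def iou(pred, truth):
    """Intersection over union of two equally sized binary masks.

    Masks are lists of rows containing 0/1 (or False/True) values.
    Assumption: two completely empty masks count as a perfect match.
    """
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 1.0


# Toy 2x3 masks: the prediction is shifted one pixel to the right.
truth_mask = [[1, 1, 0],
              [1, 1, 0]]
pred_mask = [[0, 1, 1],
             [0, 1, 1]]

print("IoU:", iou(pred_mask, truth_mask))
```

Here the masks share 2 pixels out of 6 covered in total, so the score is one third.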
-
Multiple Toddler Tracking in Indoor Videos
Authors:
Somaieh Amraee,
Bishoy Galoaa,
Matthew Goodwin,
Elaheh Hatamimajoumerd,
Sarah Ostadabbas
Abstract:
Multiple toddler tracking (MTT) involves identifying and differentiating toddlers in video footage. While conventional multi-object tracking (MOT) algorithms are adept at tracking diverse objects, toddlers pose unique challenges due to their unpredictable movements, various poses, and similar appearance. Tracking toddlers in indoor environments introduces additional complexities such as occlusions and limited fields of view. In this paper, we address the challenges of MTT and propose MTTSort, a customized method built upon the DeepSort algorithm. MTTSort is designed to track multiple toddlers in indoor videos accurately. Our contributions include discussing the primary challenges in MTT, introducing a genetic algorithm to optimize hyperparameters, proposing an accurate tracking algorithm, and curating the MTTrack dataset using unbiased AI co-labeling techniques. We quantitatively compare MTTSort to state-of-the-art MOT methods on MTTrack, DanceTrack, and MOT15 datasets. In our evaluation, the proposed method outperformed other MOT methods, achieving 0.98, 0.68, and 0.98 in multiple object tracking accuracy (MOTA), higher order tracking accuracy (HOTA), and identification F1 score (IDF1) metrics, respectively.
Submitted 29 November, 2023;
originally announced November 2023.
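As background on the MOTA metric cited in the abstract, the sketch below applies its standard definition: one minus the ratio of all tracking errors (misses, false positives, identity switches) to the number of ground-truth objects, summed over frames. The counts used are made-up illustrative values, not results from the paper, and the function name is hypothetical.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, ground_truth_objects: int) -> float:
    """Multiple object tracking accuracy from error counts summed over all frames."""
    if ground_truth_objects <= 0:
        raise ValueError("ground_truth_objects must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / ground_truth_objects


# Hypothetical counts for a short clip (illustration only).
score = mota(false_negatives=6, false_positives=3,
             id_switches=1, ground_truth_objects=200)
print(f"MOTA = {score:.2f}")
```

Because the error count is not bounded by the number of ground-truth objects, MOTA can go negative for a very poor tracker; HOTA and IDF1 weigh association quality differently and are not covered by this sketch.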
-
On induced completely prime primitive ideals in enveloping algebras of classical Lie algebras
Authors:
Simon M. Goodwin,
Lewis Topley,
Matthew Westaway
Abstract:
A distinguished family of completely prime primitive ideals in the universal enveloping algebra of a reductive Lie algebra ${\mathfrak g}$ over ${\mathbb C}$ are those ideals constructed from one-dimensional representations of finite $W$-algebras. We refer to these ideals as Losev--Premet ideals. For ${\mathfrak g}$ simple of classical type, we prove that for a Losev--Premet ideal $I$ in $U({\mathfrak g})$, there exists a Losev--Premet ideal $I_0$ for a certain Levi subalgebra ${\mathfrak g}_0$ of ${\mathfrak g}$ such that the associated variety of $I_0$ is the closure of a rigid nilpotent orbit in ${\mathfrak g}_0$ and $I$ is obtained from $I_0$ by parabolic induction. This is deduced from the corresponding statement about one-dimensional representations of finite $W$-algebras.
Submitted 4 December, 2023; v1 submitted 17 November, 2023;
originally announced November 2023.
-
Harnessing Attention Mechanisms: Efficient Sequence Reduction using Attention-based Autoencoders
Authors:
Daniel Biermann,
Fabrizio Palumbo,
Morten Goodwin,
Ole-Christoffer Granmo
Abstract:
Many machine learning models use the manipulation of dimensions as a driving force to enable models to identify and learn important features in data. In the case of sequential data this manipulation usually happens on the token dimension level. Despite the fact that many tasks require a change in sequence length itself, the step of sequence length reduction usually happens out of necessity and in a single step. As far as we are aware, no model uses the sequence length reduction step as an additional opportunity to tune the model's performance. In fact, sequence length manipulation as a whole seems to be an overlooked direction. In this study we introduce a novel attention-based method that allows for the direct manipulation of sequence lengths. To explore the method's capabilities, we employ it in an autoencoder model. The autoencoder reduces the input sequence to a smaller sequence in latent space. It then aims to reproduce the original sequence from this reduced form. In this setting, we explore the method's reduction performance for different input and latent sequence lengths. We are able to show that the autoencoder retains all the significant information when reducing the original sequence to half its original size. When reducing down to as low as a quarter of its original size, the autoencoder is still able to reproduce the original sequence with an accuracy of around 90%.
Submitted 23 October, 2023;
originally announced October 2023.
-
Neural Harmonium: An Interpretable Deep Structure for Nonlinear Dynamic System Identification with Application to Audio Processing
Authors:
Karim Helwani,
Erfan Soltanmohammadi,
Michael M. Goodwin
Abstract:
Improving the interpretability of deep neural networks has recently gained increased attention, especially when the power of deep learning is leveraged to solve problems in physics. Interpretability helps us understand a model's ability to generalize and reveal its limitations. In this paper, we introduce a causal interpretable deep structure for modeling dynamic systems. Our proposed model makes use of the harmonic analysis by modeling the system in a time-frequency domain while maintaining high temporal and spectral resolution. Moreover, the model is built in an order recursive manner which allows for fast, robust, and exact second order optimization without the need for an explicit Hessian calculation. To circumvent the resulting high dimensionality of the building blocks of our system, a neural network is designed to identify the frequency interdependencies. The proposed model is illustrated and validated on nonlinear system identification problems as required for audio signal processing tasks. Crowd-sourced experimentation contrasting the performance of the proposed approach to other state-of-the-art solutions on an acoustic echo cancellation scenario confirms the effectiveness of our method for real-life applications.
Submitted 10 October, 2023;
originally announced October 2023.
-
NoLACE: Improving Low-Complexity Speech Codec Enhancement Through Adaptive Temporal Shaping
Authors:
Jan Büthe,
Ahmed Mustafa,
Jean-Marc Valin,
Karim Helwani,
Michael M. Goodwin
Abstract:
Speech codec enhancement methods are designed to remove distortions added by speech codecs. While classical methods are very low in complexity and add zero delay, their effectiveness is rather limited. Compared to that, DNN-based methods deliver higher quality but they are typically high in complexity and/or require delay. The recently proposed Linear Adaptive Coding Enhancer (LACE) addresses this problem by combining DNNs with classical long-term/short-term postfiltering resulting in a causal low-complexity model. A shortcoming of the LACE model is, however, that quality quickly saturates when the model size is scaled up. To mitigate this problem, we propose a novel adaptive temporal shaping module that adds high temporal resolution to the LACE model resulting in the Non-Linear Adaptive Coding Enhancer (NoLACE). We adapt NoLACE to enhance the Opus codec and show that NoLACE significantly outperforms both the Opus baseline and an enlarged LACE model at 6, 9 and 12 kb/s. We also show that LACE and NoLACE are well-behaved when used with an ASR system.
Submitted 12 January, 2024; v1 submitted 25 September, 2023;
originally announced September 2023.
-
Noise-Robust DSP-Assisted Neural Pitch Estimation with Very Low Complexity
Authors:
Krishna Subramani,
Jean-Marc Valin,
Jan Buethe,
Paris Smaragdis,
Mike Goodwin
Abstract:
Pitch estimation is an essential step of many speech processing algorithms, including speech coding, synthesis, and enhancement. Recently, pitch estimators based on deep neural networks (DNNs) have been outperforming well-established DSP-based techniques. Unfortunately, these new estimators can be impractical to deploy in real-time systems, both because of their relatively high complexity and because some require significant lookahead. We show that a hybrid estimator using a small DNN with traditional DSP-based features can match or exceed the performance of pure DNN-based models, with a complexity and algorithmic delay comparable to traditional DSP-based algorithms. We further demonstrate that this hybrid approach can provide benefits for a neural vocoding task.
Submitted 16 January, 2024; v1 submitted 25 September, 2023;
originally announced September 2023.
-
DeNISE: Deep Networks for Improved Segmentation Edges
Authors:
Sander Riisøen Jyhne,
Per-Arne Andersen,
Morten Goodwin
Abstract:
This paper presents Deep Networks for Improved Segmentation Edges (DeNISE), a novel data enhancement technique using edge detection and segmentation models to improve the boundary quality of segmentation masks. DeNISE utilizes the inherent differences in two sequential deep neural architectures to improve the accuracy of the predicted segmentation edge. DeNISE applies to all types of neural networks and is not trained end-to-end, allowing rapid experiments to discover which models complement each other. We test and apply DeNISE for building segmentation in aerial images. Aerial images are known for difficult conditions as they have a low resolution with optical noise, such as reflections, shadows, and visual obstructions. Overall the paper demonstrates the potential for DeNISE. Using the technique, we improve the baseline results with a building IoU of 78.9%.
Submitted 5 September, 2023;
originally announced September 2023.
-
CorrEmbed: Evaluating Pre-trained Model Image Similarity Efficacy with a Novel Metric
Authors:
Karl Audun Kagnes Borgersen,
Morten Goodwin,
Jivitesh Sharma,
Tobias Aasmoe,
Mari Leonhardsen,
Gro Herredsvela Rørvik
Abstract:
Detecting visually similar images is a particularly useful attribute to look to when calculating product recommendations. Embedding similarity, which utilizes pre-trained computer vision models to extract high-level image features, has demonstrated remarkable efficacy in identifying images with similar compositions. However, there is a lack of methods for evaluating the embeddings generated by these models, as conventional loss and performance metrics do not adequately capture their performance in image similarity search tasks.
In this paper, we evaluate the viability of the image embeddings from numerous pre-trained computer vision models using a novel approach named CorrEmbed. Our approach computes the correlation between distances in image embeddings and distances in human-generated tag vectors. We extensively evaluate numerous pre-trained Torchvision models using this metric, revealing an intuitive relationship of linear scaling between ImageNet1k accuracy scores and tag-correlation scores. Importantly, our method also identifies deviations from this pattern, providing insights into how different models capture high-level image features.
By offering a robust performance evaluation of these pre-trained models, CorrEmbed serves as a valuable tool for researchers and practitioners seeking to develop effective, data-driven approaches to similar item recommendations in fashion retail.
Submitted 30 August, 2023;
originally announced August 2023.
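The CorrEmbed abstract above describes correlating distances between image embeddings with distances between human-generated tag vectors. A minimal sketch of that idea follows. The choice of Euclidean distance and Pearson correlation, and the function names `pairwise_distances`, `pearson`, and `tag_correlation`, are assumptions made for illustration; the abstract does not state which distance or correlation measure the authors use.

```python
import math
from itertools import combinations


def pairwise_distances(vectors):
    """Euclidean distance for every unordered pair, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(vectors, 2)]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def tag_correlation(embeddings, tag_vectors):
    """Correlate embedding-space distances with tag-space distances.

    Assumption: embeddings[i] and tag_vectors[i] describe the same image.
    """
    return pearson(pairwise_distances(embeddings),
                   pairwise_distances(tag_vectors))
```

Under these assumptions, a score near 1 means that images far apart in embedding space also tend to have dissimilar tags.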
-
State Representation Learning Using an Unbalanced Atlas
Authors:
Li Meng,
Morten Goodwin,
Anis Yazidi,
Paal Engelstad
Abstract:
The manifold hypothesis posits that high-dimensional data often lies on a lower-dimensional manifold and that utilizing this manifold as the target space yields more efficient representations. While numerous traditional manifold-based techniques exist for dimensionality reduction, their application in self-supervised learning has witnessed slow progress. The recent MSimCLR method combines manifold encoding with SimCLR but requires extremely low target encoding dimensions to outperform SimCLR, limiting its applicability. This paper introduces a novel learning paradigm using an unbalanced atlas (UA), capable of surpassing state-of-the-art self-supervised learning approaches. We investigated and engineered the DeepInfomax with an unbalanced atlas (DIM-UA) method by adapting the Spatiotemporal DeepInfomax (ST-DIM) framework to align with our proposed UA paradigm. The efficacy of DIM-UA is demonstrated through training and evaluation on the Atari Annotated RAM Interface (AtariARI) benchmark, a modified version of the Atari 2600 framework that produces annotated image samples for representation learning. The UA paradigm improves existing algorithms significantly as the number of target encoding dimensions grows. For instance, the mean F1 score averaged over categories of DIM-UA is ~75% compared to ~70% of ST-DIM when using 16384 hidden units.
Submitted 24 June, 2024; v1 submitted 17 May, 2023;
originally announced May 2023.
-
Loss and Reward Weighing for increased learning in Distributed Reinforcement Learning
Authors:
Martin Holen,
Per-Arne Andersen,
Kristian Muri Knausgård,
Morten Goodwin
Abstract:
This paper introduces two learning schemes for distributed agents in Reinforcement Learning (RL) environments, namely Reward-Weighted (R-Weighted) and Loss-Weighted (L-Weighted) gradient merger. The R/L weighted methods replace standard practices for training multiple agents, such as summing or averaging the gradients. The core of our methods is to scale the gradient of each actor based on how high the reward (for R-Weighted) or the loss (for L-Weighted) is compared to the other actors. During training, each agent operates in a differently initialized version of the same environment, which gives different gradients from different actors. In essence, the R-Weights and L-Weights of each agent inform the other agents of its potential, which in turn indicates which environment should be prioritized for learning. This approach to distributed learning is possible because environments that yield higher rewards, or lower losses, have more critical information than environments that yield lower rewards or higher losses. We empirically demonstrate that the R-Weighted methods outperform the state of the art in multiple RL environments.
Submitted 25 April, 2023;
originally announced April 2023.
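The abstract above describes scaling each actor's gradient by how high its reward is relative to the other actors. One way this could look is sketched below. The normalization (each reward divided by the sum of rewards), the requirement that rewards be non-negative, and the function name `reward_weighted_merge` are all assumptions for illustration; the abstract does not give the exact weighting formula.

```python
def reward_weighted_merge(gradients, rewards):
    """Merge per-actor gradients, weighting each by its share of total reward.

    gradients: list of equal-length lists of floats, one per actor.
    rewards:   list of non-negative floats, one per actor (assumption).
    """
    total = sum(rewards)
    if total <= 0:
        raise ValueError("rewards must sum to a positive value")
    # Hypothetical weighting: actor i contributes in proportion to reward_i.
    weights = [r / total for r in rewards]
    dim = len(gradients[0])
    return [
        sum(w * g[i] for w, g in zip(weights, gradients))
        for i in range(dim)
    ]
```

With equal rewards this reduces to plain gradient averaging, which is the baseline practice the abstract says the method replaces.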
-
A Video-based End-to-end Pipeline for Non-nutritive Sucking Action Recognition and Segmentation in Young Infants
Authors:
Shaotong Zhu,
Michael Wan,
Elaheh Hatamimajoumerd,
Kashish Jain,
Samuel Zlota,
Cholpady Vikram Kamath,
Cassandra B. Rowan,
Emma C. Grace,
Matthew S. Goodwin,
Marie J. Hayes,
Rebecca A. Schwartz-Mette,
Emily Zimmerman,
Sarah Ostadabbas
Abstract:
We present an end-to-end computer vision pipeline to detect non-nutritive sucking (NNS) -- an infant sucking pattern with no nutrition delivered -- as a potential biomarker for developmental delays, using off-the-shelf baby monitor video footage. One barrier to clinical (or algorithmic) assessment of NNS stems from its sparsity, requiring experts to wade through hours of footage to find minutes of relevant activity. Our NNS activity segmentation algorithm solves this problem by identifying periods of NNS with high certainty -- up to 94.0\% average precision and 84.9\% average recall across 30 heterogeneous 60 s clips, drawn from our manually annotated NNS clinical in-crib dataset of 183 hours of overnight baby monitor footage from 19 infants. Our method is based on an underlying NNS action recognition algorithm, which uses spatiotemporal deep learning networks and infant-specific pose estimation, achieving 94.9\% accuracy in binary classification of 960 2.5 s balanced NNS vs. non-NNS clips. Tested on our second, independent, and public NNS in-the-wild dataset, NNS recognition classification reaches 92.3\% accuracy, and NNS segmentation achieves 90.8\% precision and 84.2\% recall.
Submitted 29 March, 2023;
originally announced March 2023.
-
A Contrastive Learning Scheme with Transformer Innate Patches
Authors:
Sander Riisøen Jyhne,
Per-Arne Andersen,
Morten Goodwin
Abstract:
This paper presents Contrastive Transformer, a contrastive learning scheme using the Transformer innate patches. Contrastive Transformer enables existing contrastive learning techniques, often used for image classification, to benefit dense downstream prediction tasks such as semantic segmentation. The scheme performs supervised patch-level contrastive learning, selecting the patches based on the ground truth mask, subsequently used for hard-negative and hard-positive sampling. The scheme applies to all vision-transformer architectures, is easy to implement, and introduces minimal additional memory footprint. Additionally, the scheme removes the need for huge batch sizes, as each patch is treated as an image.
We apply and test Contrastive Transformer for the case of aerial image segmentation, known for low-resolution data, large class imbalance, and similar semantic classes. We perform extensive experiments to show the efficacy of the Contrastive Transformer scheme on the ISPRS Potsdam aerial image segmentation dataset. Additionally, we show the generalizability of our scheme by applying it to multiple inherently different Transformer architectures. Ultimately, the results show a consistent increase in mean IoU across all classes.
Submitted 8 January, 2024; v1 submitted 26 March, 2023;
originally announced March 2023.
-
Unsupervised Representation Learning in Partially Observable Atari Games
Authors:
Li Meng,
Morten Goodwin,
Anis Yazidi,
Paal Engelstad
Abstract:
State representation learning aims to capture latent factors of an environment. Contrastive methods have performed better than generative models in previous state representation learning research. Although some researchers realize the connections between masked image modeling and contrastive representation learning, the effort is focused on using masks as an augmentation technique to represent the latent generative factors better. Partially observable environments in reinforcement learning have not yet been carefully studied using unsupervised state representation learning methods.
In this article, we create an unsupervised state representation learning scheme for partially observable states. We conducted our experiment on a previous Atari 2600 framework designed to evaluate representation learning models. A contrastive method called Spatiotemporal DeepInfomax (ST-DIM) has shown state-of-the-art performance on this benchmark but remains inferior to its supervised counterpart. Our approach improves ST-DIM when the environment is not fully observable and achieves higher F1 scores and accuracy scores than the supervised learning counterpart. The mean accuracy score averaged over categories of our approach is ~66%, compared to ~38% for supervised learning. The mean F1 score is ~64%, compared to ~33%.
Submitted 13 March, 2023;
originally announced March 2023.
-
A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement
Authors:
Zhepei Wang,
Ritwik Giri,
Devansh Shah,
Jean-Marc Valin,
Michael M. Goodwin,
Paris Smaragdis
Abstract:
In this study, we present an approach to train a single speech enhancement network that can perform both personalized and non-personalized speech enhancement. This is achieved by incorporating a frame-wise conditioning input that specifies the type of enhancement output. To improve the quality of the enhanced output and mitigate oversuppression, we experiment with re-weighting frames by the presence or absence of speech activity and applying augmentations to speaker embeddings. By training under a multi-task learning setting, we empirically show that the proposed unified model obtains promising results on both personalized and non-personalized speech enhancement benchmarks and reaches similar performance to models that are trained specifically for either task. The strong performance of the proposed method demonstrates that the unified model is a more economical alternative to keeping separate task-specific models during inference.
Submitted 22 February, 2023;
originally announced February 2023.
-
A contrastive learning approach for individual re-identification in a wild fish population
Authors:
Ørjan Langøy Olsen,
Tonje Knutsen Sørdalen,
Morten Goodwin,
Ketil Malde,
Kristian Muri Knausgård,
Kim Tallaksen Halvorsen
Abstract:
In both terrestrial and marine ecology, physical tagging is a frequently used method to study population dynamics and behavior. However, such tagging techniques are increasingly being replaced by individual re-identification using image analysis.
This paper introduces a contrastive learning-based model for identifying individuals. The model uses the first parts of the Inception v3 network, supported by a projection head, and we use contrastive learning to find similar or dissimilar image pairs from a collection of uniform photographs. We apply this technique for corkwing wrasse, Symphodus melops, an ecologically and commercially important fish species. Photos are taken during repeated catches of the same individuals from a wild population, where the intervals between individual sightings might range from a few days to several years.
Our model achieves a one-shot accuracy of 0.35, a 5-shot accuracy of 0.56, and a 100-shot accuracy of 0.88, on our dataset.
Submitted 2 January, 2023;
originally announced January 2023.
-
A Comparison Between Tsetlin Machines and Deep Neural Networks in the Context of Recommendation Systems
Authors:
Karl Audun Borgersen,
Morten Goodwin,
Jivitesh Sharma
Abstract:
Recommendation Systems (RSs) are ubiquitous in modern society and are one of the largest points of interaction between humans and AI. Modern RSs are often implemented using deep learning models, which are infamously difficult to interpret. This problem is particularly exacerbated in the context of recommendation scenarios, as it erodes the user's trust in the RS. In contrast, the newly introduced Tsetlin Machines (TMs) possess some valuable properties due to their inherent interpretability. TMs are still fairly young as a technology. As no RS has been developed for TMs before, it has become necessary to perform some preliminary research regarding the practicality of such a system. In this paper, we develop the first RS based on TMs to evaluate its practicality in this application domain. This paper compares the viability of TMs with other machine learning models prevalent in the field of RS. We train and investigate the performance of the TM compared with a vanilla feed-forward deep learning model. These comparisons are based on model performance, interpretability/explainability, and scalability. Further, we provide some benchmark performance comparisons to similar machine learning solutions relevant to RSs.
Submitted 20 December, 2022;
originally announced December 2022.
-
Framewise WaveGAN: High Speed Adversarial Vocoder in Time Domain with Very Low Computational Complexity
Authors:
Ahmed Mustafa,
Jean-Marc Valin,
Jan Büthe,
Paris Smaragdis,
Mike Goodwin
Abstract:
GAN vocoders are currently one of the state-of-the-art methods for building high-quality neural waveform generative models. However, most of their architectures require dozens of billions of floating-point operations per second (GFLOPS) to generate speech waveforms in a samplewise manner. This makes GAN vocoders still challenging to run on normal CPUs without accelerators or parallel computers. In this work, we propose a new architecture for GAN vocoders that mainly depends on recurrent and fully-connected networks to directly generate the time domain signal in a framewise manner. This results in a considerable reduction of the computational cost and enables very fast generation on both GPUs and low-complexity CPUs. Experimental results show that our Framewise WaveGAN vocoder achieves significantly higher quality than auto-regressive maximum-likelihood vocoders such as LPCNet at a very low complexity of 1.2 GFLOPS. This makes GAN vocoders more practical on edge and low-power devices.
Submitted 1 March, 2023; v1 submitted 8 December, 2022;
originally announced December 2022.
-
CostNet: An End-to-End Framework for Goal-Directed Reinforcement Learning
Authors:
Per-Arne Andersen,
Morten Goodwin,
Ole-Christoffer Granmo
Abstract:
Reinforcement Learning (RL) is a general framework concerned with an agent that seeks to maximize rewards in an environment. The learning typically happens through trial and error using explorative methods, such as epsilon-greedy. There are two approaches, model-based and model-free reinforcement learning, that show concrete results in several disciplines. Model-based RL learns a model of the environment for learning the policy while model-free approaches are fully explorative and exploitative without considering the underlying environment dynamics. Model-free RL works conceptually well in simulated environments, and empirical evidence suggests that trial and error lead to a near-optimal behavior with enough training. On the other hand, model-based RL aims to be sample efficient, and studies show that it requires far less training in the real environment for learning a good policy.
A significant challenge with RL is that it relies on a well-defined reward function to work well for complex environments, and such a reward function is challenging to define. Goal-Directed RL is an alternative method that learns an intrinsic reward function with emphasis on a few explored trajectories that reveal the path to the goal state.
This paper introduces a novel reinforcement learning algorithm for predicting the distance between two states in a Markov Decision Process. The learned distance function works as an intrinsic reward that fuels the agent's learning. Using the distance metric as a reward, we show that the algorithm performs comparably to model-free RL while having significantly better sample efficiency in several test environments.
Submitted 3 October, 2022;
originally announced October 2022.
-
CaiRL: A High-Performance Reinforcement Learning Environment Toolkit
Authors:
Per-Arne Andersen,
Morten Goodwin,
Ole-Christoffer Granmo
Abstract:
This paper addresses the dire need for a platform that efficiently provides a framework for running reinforcement learning (RL) experiments. We propose the CaiRL Environment Toolkit as an efficient, compatible, and more sustainable alternative for training learning agents and propose methods to develop more efficient environment simulations.
There is an increasing focus on developing sustainable artificial intelligence. However, little effort has been made to improve the efficiency of running environment simulations. The most popular development toolkit for reinforcement learning, OpenAI Gym, is built using Python, a powerful but slow programming language. We propose a toolkit written in C++ that offers the same level of flexibility but runs orders of magnitude faster, making up for Python's inefficiency. This would drastically cut climate emissions.
CaiRL also presents the first reinforcement learning toolkit with a built-in JVM and Flash support for running legacy flash games for reinforcement learning research. We demonstrate the effectiveness of CaiRL in the classic control benchmark, comparing the execution speed to OpenAI Gym. Furthermore, we illustrate that CaiRL can act as a drop-in replacement for OpenAI Gym to leverage significantly faster training speeds because of the reduced environment computation time.
Submitted 3 October, 2022;
originally announced October 2022.
-
Interpretable Option Discovery using Deep Q-Learning and Variational Autoencoders
Authors:
Per-Arne Andersen,
Ole-Christoffer Granmo,
Morten Goodwin
Abstract:
Deep Reinforcement Learning (RL) is unquestionably a robust framework to train autonomous agents in a wide variety of disciplines. However, traditional deep and shallow model-free RL algorithms suffer from low sample efficiency and inadequate generalization for sparse state spaces. The options framework with temporal abstractions is perhaps the most promising method to solve these problems, but it still has noticeable shortcomings. It only guarantees local convergence, and it is challenging to automate initiation and termination conditions, which in practice are commonly hand-crafted.
Our proposal, the Deep Variational Q-Network (DVQN), combines deep generative- and reinforcement learning. The algorithm finds good policies from a Gaussian distributed latent-space, which is especially useful for defining options. The DVQN algorithm uses MSE with KL-divergence as regularization, combined with traditional Q-Learning updates. The algorithm learns a latent-space that represents good policies with state clusters for options. We show that the DVQN algorithm is a promising approach for identifying initiation and termination conditions for option-based reinforcement learning. Experiments show that the DVQN algorithm, with automatic initiation and termination, has comparable performance to Rainbow and can maintain stability when trained for extended periods after convergence.
Submitted 3 October, 2022;
originally announced October 2022.
-
The SAMI Galaxy Survey: Using concentrated star-formation and stellar population ages to understand environmental quenching
Authors:
Di Wang,
Scott M. Croom,
Julia J. Bryant,
Sam P. Vaughan,
Adam L. Schaefer,
Francesco D'Eugenio,
Stefania Barsanti,
Sarah Brough,
Claudia del P. Lagos,
Anne M. Medling,
Sree Oh,
Jesse van de Sande,
Giulia Santucci,
Joss Bland-Hawthorn,
Michael Goodwin,
Brent Groves,
Jon Lawrence,
Matt S. Owers,
Samuel Richards
Abstract:
We study environmental quenching using the spatial distribution of current star-formation and stellar population ages with the full SAMI Galaxy Survey. By using a star-formation concentration index [C-index, defined as log10(r_{50,Halpha}/r_{50,cont})], we separate our sample into regular galaxies (C-index>-0.2) and galaxies with centrally concentrated star-formation (SF-concentrated; C-index<-0.2). Concentrated star-formation is a potential indicator of galaxies currently undergoing `outside-in' quenching. Our environments cover ungrouped galaxies, low-mass groups (M_200<10^12.5 M_sun), high-mass groups (M_200 in the range 10^{12.5-14} M_sun) and clusters (M_200>10^14 M_sun). We find the fraction of SF-concentrated galaxies increases as halo mass increases with 9\pm2 per cent, 8\pm3 per cent, 19\pm4 per cent and 29\pm4 per cent for ungrouped galaxies, low-mass groups, high-mass groups and clusters, respectively. We interpret these results as evidence for `outside-in' quenching in groups and clusters. To investigate the quenching time-scale in SF-concentrated galaxies, we calculate light-weighted age (Age_L) and mass-weighted age (Age_M) using full spectral fitting, as well as the Dn4000 and Hdelta_A indices. We assume that the average galaxy age radial profile before entering a group or cluster is similar to ungrouped regular galaxies. At large radius (1-2 R_e), SF-concentrated galaxies in high-mass groups have older ages than ungrouped regular galaxies with an age difference of 1.83\pm0.38 Gyr for Age_L and 1.34\pm0.56 Gyr for Age_M. This suggests that while `outside-in' quenching can be effective in groups, the process will not quickly quench the entire galaxy. In contrast, the ages at 1-2 R_e of cluster SF-concentrated galaxies and ungrouped regular galaxies are consistent (0.19\pm0.21 Gyr for Age_L, 0.40\pm0.61 Gyr for Age_M), suggesting the quenching process must be rapid.
Submitted 1 September, 2022;
originally announced September 2022.
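The star-formation concentration index defined in the abstract above, C-index = log10(r_{50,Halpha}/r_{50,cont}), and its -0.2 cut can be written out as a short sketch. The function names `c_index` and `classify` are hypothetical, and placing a value of exactly -0.2 in the "regular" class is an assumption, since the abstract only specifies strict inequalities on either side.

```python
import math


def c_index(r50_halpha, r50_cont):
    """log10 of the ratio of the H-alpha to continuum half-light radii.

    Both radii must be positive and in the same units.
    """
    return math.log10(r50_halpha / r50_cont)


def classify(c, threshold=-0.2):
    """Label a galaxy using the abstract's C-index cut.

    Assumption: a value exactly at the threshold is labelled "regular".
    """
    return "SF-concentrated" if c < threshold else "regular"
```

For example, a galaxy whose H-alpha half-light radius is half its continuum half-light radius has a C-index of about -0.30 and falls in the SF-concentrated class, while a ratio of 0.8 gives about -0.10 and is classed as regular.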
-
The SAMI Galaxy Survey: The relationship between galaxy rotation and the motion of neighbours
Authors:
Yifan Mai,
Sam P. Vaughan,
Scott M. Croom,
Jesse van de Sande,
Stefania Barsanti,
Joss Bland-Hawthorn,
Sarah Brough,
Julia J. Bryant,
Matthew Colless,
Michael Goodwin,
Brent Groves,
Iraklis S. Konstantopoulos,
Jon S. Lawrence,
Nuria P. F. Lorente,
Samuel N. Richards
Abstract:
Using data from the SAMI Galaxy Survey, we investigate the correlation between the projected stellar kinematic spin vector of 1397 SAMI galaxies and the line-of-sight motion of their neighbouring galaxies. We calculate the luminosity-weighted mean velocity difference between SAMI galaxies and their neighbours in the direction perpendicular to the SAMI galaxies' angular momentum axes. The luminosity-weighted mean velocity offset between SAMI galaxies and their neighbours, which indicates the signal of coherence between the rotation of the SAMI galaxies and the motion of neighbours, is 9.0 $\pm$ 5.4 km s$^{-1}$ (1.7 $σ$) for neighbours within 1 Mpc. In a large-scale analysis, we find that the average velocity offsets increase for neighbours out to 2 Mpc. However, the velocities are consistent with zero or negative for neighbours outside 3 Mpc. The negative signals for neighbours at distances around 10 Mpc are also significant at the $\sim 2$ $σ$ level, which indicates that the positive signals within 2 Mpc might come from the variance of large-scale structure. We also calculate average velocities of different subsamples, including galaxies in different regions of the sky and galaxies with different stellar masses, galaxy types, $λ_{Re}$ and inclinations. Although the low-mass, high-mass, early-type and low-spin galaxy subsamples show a 2 - 3 $σ$ signal of coherence for neighbours within 2 Mpc, the results for different inclination subsamples and the large-scale results suggest that the $\sim 2 σ$ signals might result from coincidental scatter or variance of large-scale structure. Overall, the modest evidence of coherence signals for neighbouring galaxies within 2 Mpc needs to be confirmed by larger samples of observations and simulation studies.
Submitted 8 July, 2022;
originally announced July 2022.
-
Deep Reinforcement Learning with Swin Transformers
Authors:
Li Meng,
Morten Goodwin,
Anis Yazidi,
Paal Engelstad
Abstract:
Transformers are neural network models that utilize multiple layers of self-attention heads and have exhibited enormous potential in natural language processing tasks. Meanwhile, there have been efforts to adapt transformers to visual tasks of machine learning, including Vision Transformers and Swin Transformers. Although some researchers use Vision Transformers for reinforcement learning tasks, their experiments remain at a small scale due to the high computational cost. This article presents the first online reinforcement learning scheme that is based on Swin Transformers: Swin DQN. In contrast to existing research, our novel approach demonstrates superior performance with experiments on 49 games in the Arcade Learning Environment. The results show that our approach achieves significantly higher maximal evaluation scores than the baseline method in 45 of all the 49 games (92%), and higher mean evaluation scores than the baseline method in 40 of all the 49 games (82%).
Submitted 24 June, 2024; v1 submitted 30 June, 2022;
originally announced June 2022.
-
Semi-supervised Time Domain Target Speaker Extraction with Attention
Authors:
Zhepei Wang,
Ritwik Giri,
Shrikant Venkataramani,
Umut Isik,
Jean-Marc Valin,
Paris Smaragdis,
Mike Goodwin,
Arvindh Krishnaswamy
Abstract:
In this work, we propose Exformer, a time-domain architecture for target speaker extraction. It consists of a pre-trained speaker embedder network and a separator network based on transformer encoder blocks. We study multiple methods to combine speaker information with the input mixture, and the resulting Exformer architecture obtains superior extraction performance compared to prior time-domain networks. Furthermore, we investigate a two-stage procedure to train the model using mixtures without reference signals upon a pre-trained supervised model. Experimental results show that the proposed semi-supervised learning procedure improves the performance of the supervised baselines.
Submitted 17 June, 2022;
originally announced June 2022.
-
The SAMI Galaxy Survey: The Link Between [$α$/Fe] and Kinematic Morphology
Authors:
Peter J. Watson,
Roger L. Davies,
Jesse van de Sande,
Sarah Brough,
Scott M. Croom,
Francesco D'Eugenio,
Karl Glazebrook,
Brent Groves,
Ángel R. López-Sánchez,
Nicholas Scott,
Sam P. Vaughan,
C. Jakob Walcher,
Joss Bland-Hawthorn,
Julia J. Bryant,
Michael Goodwin,
Jon S. Lawrence,
Nuria P. F. Lorente,
Matt S. Owers,
Samuel Richards
Abstract:
We explore a sample of 1492 galaxies with measurements of the mean stellar population properties and the spin parameter proxy, $λ_{R_{\rm{e}}}$, drawn from the SAMI Galaxy Survey. We fit a global $\left[α/\rm{Fe}\right]$-$σ$ relation, finding that $\left[α/\rm{Fe}\right]=(0.395\pm0.010)\rm{log}_{10}\left(σ\right)-(0.627\pm0.002)$. We observe an anti-correlation between the residuals $Δ\left[α/\rm{Fe}\right]$ and the inclination-corrected $λ_{\,R_{\rm{e}}}^{\rm{\,eo}}$, which can be expressed as $Δ\left[α/\rm{Fe}\right]=(-0.057\pm0.008)λ_{\,R_{\rm{e}}}^{\rm{\,eo}}+(0.020\pm0.003)$. The anti-correlation appears to be driven by star-forming galaxies, with a gradient of $Δ\left[α/\rm{Fe}\right]\sim(-0.121\pm0.015)λ_{\,R_{\rm{e}}}^{\rm{\,eo}}$, although a weak relationship persists for the subsample of galaxies for which star formation has been quenched. We take this to be confirmation that disk-dominated galaxies have an extended duration of star formation. At a reference velocity dispersion of 200 km s$^{-1}$, we estimate an increase in half-mass formation time from $\sim$0.5 Gyr to $\sim$1.2 Gyr from low- to high-$λ_{\,R_{\rm{e}}}^{\rm{\,eo}}$ galaxies. Slow rotators do not appear to fit these trends. Their residual $α$-enhancement is indistinguishable from other galaxies with $λ_{\,R_{\rm{e}}}^{\rm{\,eo}}\lessapprox0.4$, despite being both larger and more massive. This result shows that galaxies with $λ_{\,R_{\rm{e}}}^{\rm{\,eo}}\lessapprox0.4$ experience a similar range of star formation histories, despite their different physical structure and angular momentum.
Submitted 26 April, 2022;
originally announced April 2022.
-
Improved singing voice separation with chromagram-based pitch-aware remixing
Authors:
Siyuan Yuan,
Zhepei Wang,
Umut Isik,
Ritwik Giri,
Jean-Marc Valin,
Michael M. Goodwin,
Arvindh Krishnaswamy
Abstract:
Singing voice separation aims to separate music into vocals and accompaniment components. One of the major constraints for the task is the limited amount of training data with separated vocals. Data augmentation techniques such as random source mixing have been shown to make better use of existing data and mildly improve model performance. We propose a novel data augmentation technique, chromagram-based pitch-aware remixing, where music segments with high pitch alignment are mixed. By performing controlled experiments in both supervised and semi-supervised settings, we demonstrate that training models with pitch-aware remixing significantly improves the test signal-to-distortion ratio (SDR).
Submitted 28 March, 2022;
originally announced March 2022.
-
Socially Fair Mitigation of Misinformation on Social Networks via Constraint Stochastic Optimization
Authors:
Ahmed Abouzeid,
Ole-Christoffer Granmo,
Christian Webersik,
Morten Goodwin
Abstract:
Recent social networks' misinformation mitigation approaches tend to investigate how to reduce misinformation by considering a whole-network statistical scale. However, unbalanced misinformation exposures among individuals call for studying the fair allocation of mitigation resources. Moreover, the network has random dynamics which change over time. Therefore, we introduce a stochastic and non-stationary knapsack problem, and we apply its resolution to mitigate misinformation in social network campaigns. We further propose a generic misinformation mitigation algorithm that is robust to different social networks' misinformation statistics, allowing a promising impact in real-world scenarios. A novel loss function ensures fair mitigation among users. We achieve fairness by intelligently allocating a mitigation incentivization budget to the knapsack, and optimizing the loss function. To this end, a team of Learning Automata (LA) drives the budget allocation. Each LA is associated with a user and learns to minimize its exposure to misinformation by performing a non-stationary and stochastic walk over its state space. Our results show that our LA-based method is robust and outperforms similar misinformation mitigation methods in how fairly the mitigation influences the network users.
Submitted 23 March, 2022;
originally announced March 2022.
-
Improving the Diversity of Bootstrapped DQN by Replacing Priors With Noise
Authors:
Li Meng,
Morten Goodwin,
Anis Yazidi,
Paal Engelstad
Abstract:
Q-learning is one of the most well-known Reinforcement Learning algorithms. There have been tremendous efforts to develop this algorithm using neural networks. Bootstrapped Deep Q-Learning Network is amongst them. It utilizes multiple neural network heads to introduce diversity into Q-learning. Diversity can sometimes be viewed as the number of reasonable moves an agent can take at a given state, analogous to the definition of the exploration ratio in RL. Thus, the performance of Bootstrapped Deep Q-Learning Network is deeply connected with the level of diversity within the algorithm. In the original research, it was pointed out that a random prior could improve the performance of the model. In this article, we further explore the possibility of replacing priors with noise and sample the noise from a Gaussian distribution to introduce more diversity into this algorithm. We conduct our experiment on the Atari benchmark and compare our algorithm to both the original and other related algorithms. The results show that our modification of the Bootstrapped Deep Q-Learning algorithm achieves significantly higher evaluation scores across different types of Atari games. Thus, we conclude that replacing priors with noise can improve Bootstrapped Deep Q-Learning's performance by ensuring the integrity of diversities.
Submitted 24 June, 2024; v1 submitted 2 March, 2022;
originally announced March 2022.
-
Unlocking the potential of deep learning for marine ecology: overview, applications, and outlook
Authors:
Morten Goodwin,
Kim Tallaksen Halvorsen,
Lei Jiao,
Kristian Muri Knausgård,
Angela Helen Martin,
Marta Moyano,
Rebekah A. Oomen,
Jeppe Have Rasmussen,
Tonje Knutsen Sørdalen,
Susanna Huneide Thorbjørnsen
Abstract:
The deep learning revolution is touching all scientific disciplines and corners of our lives as a means of harnessing the power of big data. Marine ecology is no exception. These new methods provide analysis of data from sensors, cameras, and acoustic recorders, even in real time, in ways that are reproducible and rapid. Off-the-shelf algorithms can find, count, and classify species from digital images or video and detect cryptic patterns in noisy data. Using these opportunities requires collaboration across ecological and data science disciplines, which can be challenging to initiate. To facilitate these collaborations and promote the use of deep learning towards ecosystem-based management of the sea, this paper aims to bridge the gap between marine ecologists and computer scientists. We provide insight into popular deep learning approaches for ecological data analysis in plain language, focusing on the techniques of supervised learning with deep neural networks, and illustrate challenges and opportunities through established and emerging applications of deep learning to marine ecology. We use established and future-looking case studies on plankton, fishes, marine mammals, pollution, and nutrient cycling that involve object detection, classification, tracking, and segmentation of visualized data. We conclude with a broad outlook of the field's opportunities and challenges, including potential technological advances and issues with managing complex data sets.
Submitted 29 September, 2021;
originally announced September 2021.
-
On $\mathfrak{sl}_2$-triples for classical algebraic groups in positive characteristic
Authors:
Simon M. Goodwin,
Rachel Pengelly
Abstract:
Let $k$ be an algebraically closed field of characteristic $p > 2$, let $n \in \mathbb Z_{>0}$, and take $G$ to be one of the classical algebraic groups $\mathrm{GL}_n(k)$, $\mathrm{SL}_n(k)$, $\mathrm{Sp}_n(k)$, $\mathrm O_n(k)$ or $\mathrm{SO}_n(k)$, with $\mathfrak g = \operatorname{Lie} G$. We determine the maximal $G$-stable closed subvariety $\mathcal V$ of the nilpotent cone $\mathcal N$ of $\mathfrak g$ such that the $G$-orbits in $\mathcal V$ are in bijection with the $G$-orbits of $\mathfrak{sl}_2$-triples $(e,h,f)$ with $e,f \in \mathcal V$. This result determines to what extent the theorems of Jacobson--Morozov and Kostant on $\mathfrak{sl}_2$-triples hold for classical algebraic groups over an algebraically closed field of "small" odd characteristic.
Submitted 16 March, 2022; v1 submitted 9 August, 2021;
originally announced August 2021.
-
Detection of subtle cartilage and bone tissue degeneration in the equine joint using polarisation-sensitive optical coherence tomography
Authors:
Matthew Goodwin,
Marie Klufts,
Joshua Workman,
Ashvin Thambyah,
Frédérique Vanholsbeeck
Abstract:
Objective: To explore the ability of polarisation-sensitive optical coherence tomography (PS-OCT) to rapidly identify subtle signs of tissue degeneration in the equine joint.
Design: Polarisation-sensitive optical coherence tomography (PS-OCT) images were systematically acquired in four locations along the medial and lateral condyles of the third metacarpal bone in 5 equine specimens. Intensity and retardation PS-OCT images, and anomalies observed therein, were then compared and validated with high-resolution images of the tissue sections obtained using differential interference contrast (DIC) optical light microscopy.
Results: The PS-OCT system was capable of imaging the entire equine osteochondral unit, and allowed delineation of the three structurally differentiated zones of the joint, that is, the articular cartilage matrix, zone of calcified cartilage and underlying subchondral bone. Importantly, PS-OCT imaging was able to detect underlying matrix and bone changes not visible without dissection and/or microscopy.
Conclusion: PS-OCT has substantial potential to detect, non-invasively, sub-surface microstructural changes that are known to be associated with the early stages of joint tissue degeneration.
Submitted 1 August, 2021;
originally announced August 2021.
-
Self-transfer learning via patches: A prostate cancer triage approach based on bi-parametric MRI
Authors:
Alvaro Fernandez-Quilez,
Trygve Eftestøl,
Morten Goodwin,
Svein Reidar Kjosavik,
Ketil Oppedal
Abstract:
Prostate cancer (PCa) is the second most common cancer diagnosed among men worldwide. The current PCa diagnostic pathway comes at the cost of substantial overdiagnosis, leading to unnecessary treatment and further testing. Bi-parametric magnetic resonance imaging (bp-MRI) based on apparent diffusion coefficient maps (ADC) and T2-weighted (T2w) sequences has been proposed as a triage test to differentiate between clinically significant (cS) and non-clinically significant (ncS) prostate lesions. However, analysis of the sequences relies on expertise, requires specialized training, and suffers from inter-observer variability. Deep learning (DL) techniques hold promise in tasks such as classification and detection. Nevertheless, they rely on large amounts of annotated data, which are not common in the medical field. To alleviate such issues, existing works rely on transfer learning (TL) and ImageNet pre-training, which has been proven to be sub-optimal for the medical imaging domain. In this paper, we present a patch-based pre-training strategy to distinguish between cS and ncS lesions, which exploits the region of interest (ROI) of the patched source domain to efficiently train a classifier in the full-slice target domain, which does not require annotations, by making use of transfer learning (TL). We provide a comprehensive comparison between several CNN architectures and different settings, which are presented as a baseline. Moreover, we explore cross-domain TL, which exploits both MRI modalities and improves single-modality results. Finally, we show how our approaches outperform the standard approaches by a considerable margin.
Submitted 22 July, 2021;
originally announced July 2021.
-
The LEGA-C and SAMI Galaxy Surveys: Quiescent Stellar Populations and the Mass-Size Plane across 6 Gyr
Authors:
Tania M. Barone,
Francesco D'Eugenio,
Nicholas Scott,
Matthew Colless,
Sam P. Vaughan,
Arjen van der Wel,
Amelia Fraser-McKelvie,
Anna de Graaff,
Jesse van de Sande,
Po-Feng Wu,
Rachel Bezanson,
Sarah Brough,
Eric Bell,
Scott M. Croom,
Luca Cortese,
Simon Driver,
Anna R. Gallazzi,
Adam Muzzin,
David Sobral,
Joss Bland-Hawthorn,
Julia J. Bryant,
Michael Goodwin,
Jon S. Lawrence,
Nuria P. F. Lorente,
Matt S. Owers
Abstract:
We investigate the change in mean stellar population age and metallicity ([Z/H]) scaling relations for quiescent galaxies from intermediate redshift ($0.60\leq z\leq0.76$) using the LEGA-C Survey, to low redshift ($0.014\leq z\leq0.10$) using the SAMI Galaxy Survey. We find that, similarly to their low-redshift counterparts, the stellar metallicity of quiescent galaxies at $0.60\leq z\leq 0.76$ closely correlates with $M_*/R_\mathrm{e}$ (a proxy for the gravitational potential or escape velocity), in that galaxies with deeper potential wells are more metal-rich. This supports the hypothesis that the relation arises due to the gravitational potential regulating the retention of metals, by determining the escape velocity required by metal-rich stellar and supernova ejecta to escape the system and avoid being recycled into later stellar generations. On the other hand, we find no correlation between stellar age and $M_*/R_\mathrm{e}^2$ (stellar mass surface density $Σ$) in the LEGA-C sample, despite this being a strong relation at low redshift. We consider this change in the age--$Σ$ relation in the context of the redshift evolution of the star-forming and quiescent populations in the mass--size plane, and find our results can be explained as a consequence of galaxies forming more compactly at higher redshifts, and remaining compact throughout their evolution. Furthermore, galaxies appear to quench at a characteristic surface density that decreases with decreasing redshift. The $z\sim 0$ age--$Σ$ relation is therefore a result of building up the quiescent and star-forming populations with galaxies that formed at a range of redshifts and so a range of surface densities.
Submitted 11 March, 2022; v1 submitted 2 July, 2021;
originally announced July 2021.
-
Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples
Authors:
Li Meng,
Anis Yazidi,
Morten Goodwin,
Paal Engelstad
Abstract:
In this article, we propose a novel algorithm for deep reinforcement learning named Expert Q-learning. Expert Q-learning is inspired by Dueling Q-learning and aims at incorporating semi-supervised learning into reinforcement learning through splitting Q-values into state values and action advantages. We require that an offline expert assesses the value of a state in a coarse manner using three discrete values. An expert network is designed in addition to the Q-network, which updates each time following the regular offline minibatch update whenever the expert example buffer is not empty. Using the board game Othello, we compare our algorithm with the baseline Q-learning algorithm, which is a combination of Double Q-learning and Dueling Q-learning. Our results show that Expert Q-learning is indeed useful and more resistant to the overestimation bias. The baseline Q-learning algorithm exhibits unstable and suboptimal behavior in non-deterministic settings, whereas Expert Q-learning demonstrates more robust performance with higher scores, illustrating that our algorithm is indeed suitable to integrate state values from expert examples into Q-learning.
Submitted 25 June, 2024; v1 submitted 28 June, 2021;
originally announced June 2021.
-
The SAMI Galaxy Survey: Trends in [α/Fe] as a Function of Morphology and Environment
Authors:
Peter J. Watson,
Roger L. Davies,
Sarah Brough,
Scott M. Croom,
Francesco D'Eugenio,
Karl Glazebrook,
Brent Groves,
Ángel R. López-Sánchez,
Jesse van de Sande,
Nicholas Scott,
Sam P. Vaughan,
Jakob Walcher,
Joss Bland-Hawthorn,
Julia J. Bryant,
Michael Goodwin,
Jon S. Lawrence,
Nuria P. F. Lorente,
Matt S. Owers,
Samuel Richards
Abstract:
We present a new set of index-based measurements of [$α$/Fe] for a sample of 2093 galaxies in the SAMI Galaxy Survey. Following earlier work, we fit a global relation between [$α$/Fe] and the galaxy velocity dispersion $σ$ for red sequence galaxies, [$α$/Fe]=(0.378$\pm$0.009)log($σ$/100)+(0.155$\pm$0.003). We observe a correlation between the residuals and the local environmental surface density, whereas no such relation exists for blue cloud galaxies. In the full sample, we find that elliptical galaxies in high-density environments are $α$-enhanced by up to 0.057$\pm$0.014 dex at velocity dispersions $σ$<100 km/s, compared with those in low-density environments. This $α$-enhancement is morphology-dependent, with the offset decreasing along the Hubble sequence towards spirals, which have an offset of 0.019$\pm$0.014 dex. At low velocity dispersion and controlling for morphology, we estimate that star formation in high-density environments is truncated $\sim1$ Gyr earlier than in low-density environments. For elliptical galaxies only, we find support for a parabolic relationship between [$α$/Fe] and $σ$, with an environmental $α$-enhancement of at least 0.03 dex. This suggests strong contributions from both environment and mass-based quenching mechanisms. However, there is no evidence for this behaviour in later morphological types.
Submitted 24 August, 2021; v1 submitted 3 June, 2021;
originally announced June 2021.
-
The SAMI Galaxy Survey: The role of disc fading and progenitor bias in kinematic transitions
Authors:
S. M. Croom,
D. S. Taranu,
J. van de Sande,
C. D. P. Lagos,
K. E. Harborne,
J. Bland-Hawthorn,
S. Brough,
J. J. Bryant,
L. Cortese,
C. Foster,
M. Goodwin,
B. Groves,
A. Khalid,
J. Lawrence,
A. M. Medling,
S. N. Richards,
M. S. Owers,
N. Scott,
S. P. Vaughan
Abstract:
We use comparisons between the SAMI Galaxy Survey and equilibrium galaxy models to infer the importance of disc fading in the transition of spirals into lenticular (S0) galaxies. The local S0 population has both higher photometric concentration and lower stellar spin than spiral galaxies of comparable mass and we test whether this separation can be accounted for by passive aging alone. We construct a suite of dynamically self-consistent galaxy models, with a bulge, disc and halo using the GalactICS code. The dispersion-dominated bulge is given a uniformly old stellar population, while the disc is given a current star formation rate putting it on the main sequence, followed by sudden instantaneous quenching. We then generate mock observables (r-band images, stellar velocity and dispersion maps) as a function of time since quenching for a range of bulge/total (B/T) mass ratios. The disc fading leads to a decline in measured spin as the bulge contribution becomes more dominant, and also leads to increased concentration. However, the quantitative changes observed after 5 Gyr of disc fading cannot account for all of the observed difference. We see similar results if we instead subdivide our SAMI Galaxy Survey sample by star formation (relative to the main sequence). We use EAGLE simulations to also take into account progenitor bias, using size evolution to infer quenching time. The EAGLE simulations suggest that the progenitors of current passive galaxies typically have slightly higher spin than present day star-forming disc galaxies of the same mass. As a result, progenitor bias moves the data further from the disc fading model scenario, implying that intrinsic dynamical evolution must be important in the transition from star-forming discs to passive discs.
Submitted 21 May, 2021;
originally announced May 2021.
-
The SAMI Galaxy Survey: stellar population and structural trends across the Fundamental Plane
Authors:
Francesco D'Eugenio,
Matthew Colless,
Nicholas Scott,
Arjen van der Wel,
Roger L. Davies,
Jesse van de Sande,
Sarah M. Sweet,
Sree Oh,
Brent Groves,
Rob Sharp,
Matt S. Owers,
Joss Bland-Hawthorn,
Scott M. Croom,
Sarah Brough,
Julia J. Bryant,
Michael Goodwin,
Jon S. Lawrence,
Nuria P. F. Lorente,
Samuel N. Richards
Abstract:
We study the Fundamental Plane (FP) for a volume- and luminosity-limited sample of 560 early-type galaxies from the SAMI survey. Using r-band sizes and luminosities from new Multi-Gaussian Expansion (MGE) photometric measurements, and treating luminosity as the dependent variable, the FP has coefficients a=1.294$\pm$0.039, b= 0.912$\pm$0.025, and zero-point c= 7.067$\pm$0.078. We leverage the high signal-to-noise of SAMI integral field spectroscopy to determine how structural and stellar-population observables affect the scatter about the FP. The FP residuals correlate most strongly (8$σ$ significance) with luminosity-weighted simple-stellar-population (SSP) age. In contrast, the structural observables surface mass density, rotation-to-dispersion ratio, Sérsic index and projected shape all show little or no significant correlation. We connect the FP residuals to the empirical relation between age (or stellar mass-to-light ratio $Υ_\star$) and surface mass density, the best predictor of SSP age amongst parameters based on FP observables. We show that the FP residuals (anti-)correlate with the residuals of the relation between surface density and $Υ_\star$. This correlation implies that part of the FP scatter is due to the broad age and $Υ_\star$ distribution at any given surface mass density. Using virial mass and $Υ_\star$ we construct a simulated FP and compare it to the observed FP. We find that, while the empirical relations between observed stellar population relations and FP observables are responsible for most (75%) of the FP scatter, on their own they do not explain the observed tilt of the FP away from the virial plane.
Submitted 20 April, 2021;
originally announced April 2021.
-
Enhancing Interpretable Clauses Semantically using Pretrained Word Representation
Authors:
Rohan Kumar Yadav,
Lei Jiao,
Ole-Christoffer Granmo,
Morten Goodwin
Abstract:
Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on propositional logic, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, including sentiment analysis, text classification, and Word Sense Disambiguation. To obtain human-level interpretability, legacy TM employs Boolean input features such as bag-of-words (BOW). However, the BOW representation makes it difficult to use any pre-trained information, for instance, word2vec and GloVe word representations. This restriction has constrained the performance of TM compared to deep neural networks (DNNs) in NLP. To reduce the performance gap, in this paper, we propose a novel way of using pre-trained word representations for TM. The approach significantly enhances the performance and interpretability of TM. We achieve this by extracting semantically related words from pre-trained word representations as input features to the TM. Our experiments show that the accuracy of the proposed approach is significantly higher than the previous BOW-based TM, reaching the level of DNN-based models.
Submitted 10 September, 2021; v1 submitted 14 April, 2021;
originally announced April 2021.