-
Every Pixel Has its Moments: Ultra-High-Resolution Unpaired Image-to-Image Translation via Dense Normalization
Authors:
Ming-Yang Ho,
Che-Ming Wu,
Min-Sheng Wu,
Yufeng Jane Tseng
Abstract:
Recent advancements in ultra-high-resolution unpaired image-to-image translation have aimed to mitigate the constraints imposed by limited GPU memory through patch-wise inference. Nonetheless, existing methods often compromise between the reduction of noticeable tiling artifacts and the preservation of color and hue contrast, attributed to the reliance on global image- or patch-level statistics in the instance normalization layers. In this study, we introduce a Dense Normalization (DN) layer designed to estimate pixel-level statistical moments. This approach effectively diminishes tiling artifacts while concurrently preserving local color and hue contrasts. To address the computational demands of pixel-level estimation, we further propose an efficient interpolation algorithm. Moreover, we invent a parallelism strategy that enables the DN layer to operate in a single pass. Through extensive experiments, we demonstrate that our method surpasses all existing approaches in performance. Notably, our DN layer is hyperparameter-free and can be seamlessly integrated into most unpaired image-to-image translation frameworks without necessitating retraining. Overall, our work paves the way for future exploration in handling images of arbitrary resolutions within the realm of unpaired image-to-image translation. Code is available at: https://github.com/Kaminyou/Dense-Normalization.
Submitted 5 July, 2024;
originally announced July 2024.
-
Representational Alignment Supports Effective Machine Teaching
Authors:
Ilia Sucholutsky,
Katherine M. Collins,
Maya Malaviya,
Nori Jacoby,
Weiyang Liu,
Theodore R. Sumers,
Michalis Korakakis,
Umang Bhatt,
Mark Ho,
Joshua B. Tenenbaum,
Brad Love,
Zachary A. Pardos,
Adrian Weller,
Thomas L. Griffiths
Abstract:
A good teacher should not only be knowledgeable, but should also be able to communicate in a way that the student understands -- to share the student's representation of the world. In this work, we integrate insights from machine teaching and pragmatic communication with the burgeoning literature on representational alignment to characterize a utility curve defining a relationship between representational alignment and teacher capability for promoting student learning. To explore the characteristics of this utility curve, we design a supervised learning environment that disentangles representational alignment from teacher accuracy. We conduct extensive computational experiments with machines teaching machines, complemented by a series of experiments in which machines teach humans. Drawing on our findings that improved representational alignment with a student improves student learning outcomes (i.e., task accuracy), we design a classroom matching procedure that assigns students to teachers based on the utility curve. If we are to design effective machine teachers, it is not enough to build teachers that are accurate -- we want teachers that can align, representationally, to their students too.
Submitted 6 June, 2024;
originally announced June 2024.
-
Scott analysis, linear orders and almost periodic functions
Authors:
David Gonzalez,
Matthew Harrison-Trainor,
Meng-Che "Turbo" Ho
Abstract:
For any limit ordinal $λ$, we construct a linear order $L_λ$ whose Scott complexity is $Σ_{λ+1}$. This completes the classification of the possible Scott sentence complexities of linear orderings. Previously, there was only one known construction of any structure (of any signature) with Scott complexity $Σ_{λ+1}$, and our construction gives new examples, e.g., rigid structures, of this complexity.
Moreover, we can construct the linear orders $L_λ$ so that not only does $L_λ$ have Scott complexity $Σ_{λ+1}$, but there are continuum-many structures $M \equiv_λL_λ$ and all such structures also have Scott complexity $Σ_{λ+1}$. In contrast, we demonstrate that there is no structure (of any signature) with Scott complexity $Π_{λ+1}$ that is only $λ$-equivalent to structures with Scott complexity $Π_{λ+1}$.
Our construction is based on functions $f \colon \mathbb{Z}\to \mathbb{N}\cup \{\infty\}$ which are almost periodic but not periodic, such as those arising from shifts of the $p$-adic valuations.
Submitted 3 June, 2024;
originally announced June 2024.
-
Metallicity Dependence of Pressure-Regulated Feedback-Modulated Star Formation in the TIGRESS-NCR Simulation Suite
Authors:
Chang-Goo Kim,
Eve C. Ostriker,
Jeong-Gyu Kim,
Munan Gong,
Greg L. Bryan,
Drummond B. Fielding,
Sultan Hassan,
Matthew Ho,
Sarah M. R. Jeffreson,
Rachel S. Somerville,
Ulrich P. Steinwandel
Abstract:
We present a new simulation suite for the star-forming interstellar medium (ISM) in galactic disks using the TIGRESS-NCR framework. Distinctive aspects of our simulation suite are: (1) sophisticated and comprehensive numerical treatments of essential physical processes including magnetohydrodynamics, self-gravity, and galactic differential rotation, as well as photochemistry, cooling, and heating coupled with ray-tracing UV radiation transfer and resolved supernova feedback and (2) wide parameter coverage including metallicity over $Z'\equiv Z/Z_\odot\sim0.1-3$, gas surface density $Σ_{\rm gas}\sim5-150 M_{\odot}{\rm pc^{-2}}$, and stellar surface density $Σ_{\rm star}\sim 1-50 M_{\odot}{\rm pc^{-2}}$. The range of emergent star formation rate surface density is $Σ_{\rm SFR}\sim 10^{-4}-0.5 M_{\odot}{\rm kpc^{-2}yr^{-1}}$ and ISM total midplane pressure is $P_{\rm tot}/k_B=10^3-10^6{\rm cm^{-3}K}$, with $P_{\rm tot}$ equal to the ISM weight $W$. For given $Σ_{\rm gas}$ and $Σ_{\rm star}$, we find $Σ_{\rm SFR} \propto Z'^{0.3}$. We provide an interpretation based on the pressure-regulated feedback-modulated (PRFM) star formation theory. We characterize feedback modulation in terms of the yield $Υ$, defined as the ratio of each stress to $Σ_{\rm SFR}$. The thermal feedback yield varies sensitively with both weight and metallicity as $Υ_{\rm th}\propto W^{-0.46}Z'^{-0.53}$, while the combined turbulent and magnetic feedback yield shows weaker dependence $Υ_{\rm turb+mag}\propto W^{-0.22}Z'^{-0.18}$. The reduction in $Σ_{\rm SFR}$ at low metallicity is due mainly to enhanced thermal feedback yield, resulting from reduced attenuation of UV radiation. With the metallicity-dependent calibrations we provide, PRFM theory can be used for a new subgrid star formation prescription in cosmological simulations where the ISM is unresolved.
Submitted 6 June, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
Algorithmic aspects of left-orderings of solvable Baumslag--Solitar groups via its dynamical realization
Authors:
Meng-Che "Turbo" Ho,
Khanh Le,
Dino Rossegger
Abstract:
We answer a question of Calderoni and Clay by showing that the conjugation equivalence relation of left orderings of the Baumslag-Solitar groups $\mathrm{BS}(1,n)$ is hyperfinite for any $n$. Our proof relies on a classification of $\mathrm{BS}(1,n)$'s left-orderings via its one-dimensional dynamical realizations. We furthermore use the effectiveness of the dynamical realizations of $\mathrm{BS}(1,n)$ to study algorithmic properties of the left-orderings on $\mathrm{BS}(1,n)$.
Submitted 14 May, 2024;
originally announced May 2024.
-
Bye bye, local bias: the statistics of the halo field are poorly determined by the local mass density
Authors:
Deaglan J. Bartlett,
Matthew Ho,
Benjamin D. Wandelt
Abstract:
Bias models relating the dark matter field to the spatial distribution of halos are widely used in current cosmological analyses. Many models predict halos purely from the local Eulerian matter density, yet bias models in perturbation theory require the inclusion of other local properties. We assess the validity of assuming that only the local dark matter density can be used to predict the number density of halos in a model-independent way and in the non-perturbative regime. Utilising $N$-body simulations, we study the properties of the halo counts field after spatial voxels with near-equal dark matter density have been permuted. If local-in-matter-density biasing were valid, the statistical properties of the permuted and un-permuted fields would be indistinguishable since both represent equally fair draws of the stochastic biasing model. For voxels of side length $\sim4-30\,h^{-1}{\rm\,Mpc}$ and for halos less massive than $\sim10^{15}\,h^{-1}{\rm\,M_\odot}$, we find that the permuted halo field has a scale-dependent bias with greater than 25% more power on scales relevant for current surveys. These bias models remove small-scale power by not modelling correlations between neighbouring voxels, which substantially boosts large-scale power to conserve the field's total variance. This conclusion is robust to the choice of initial conditions and cosmology. Assuming local-in-matter-density halo biasing cannot, therefore, reproduce the distribution of halos across a large range of scales and halo masses, no matter how complex the model. One must either allow the biasing to be a function of other quantities and/or remove the assumption that neighbouring voxels are statistically independent.
Submitted 21 June, 2024; v1 submitted 1 May, 2024;
originally announced May 2024.
-
DISC: Latent Diffusion Models with Self-Distillation from Separated Conditions for Prostate Cancer Grading
Authors:
Man M. Ho,
Elham Ghelichkhan,
Yosep Chong,
Yufei Zhou,
Beatrice Knudsen,
Tolga Tasdizen
Abstract:
Latent Diffusion Models (LDMs) can generate high-fidelity images from noise, offering a promising approach for augmenting histopathology images for training cancer grading models. While previous works successfully generated high-fidelity histopathology images using LDMs, the generation of image tiles to improve prostate cancer grading has not yet been explored. Additionally, LDMs face challenges in accurately generating admixtures of multiple cancer grades in a tile when conditioned by a tile mask. In this study, we train specific LDMs to generate synthetic tiles that contain multiple Gleason Grades (GGs) by leveraging pixel-wise annotations in input tiles. We introduce a novel framework named Self-Distillation from Separated Conditions (DISC) that generates GG patterns guided by GG masks. Finally, we deploy a training framework for pixel-level and slide-level prostate cancer grading, where synthetic tiles are effectively utilized to improve the cancer grading performance of existing models. As a result, this work surpasses previous works in two domains: 1) our LDMs enhanced with DISC produce more accurate tiles in terms of GG patterns, and 2) our training scheme, incorporating synthetic data, significantly improves the generalization of the baseline model for prostate cancer grading, particularly in challenging cases of rare GG5, demonstrating the potential of generative models to enhance cancer grading when data is limited.
Submitted 19 April, 2024;
originally announced April 2024.
-
F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation
Authors:
Man M. Ho,
Shikha Dubey,
Yosep Chong,
Beatrice Knudsen,
Tolga Tasdizen
Abstract:
The Frozen Section (FS) technique is a rapid and efficient method, taking only 15-30 minutes to prepare slides for pathologists' evaluation during surgery, enabling immediate decisions on further surgical interventions. However, the FS process often introduces artifacts and distortions like folds and ice-crystal effects. In contrast, these artifacts and distortions are absent in the higher-quality formalin-fixed paraffin-embedded (FFPE) slides, which require 2-3 days to prepare. While Generative Adversarial Network (GAN)-based methods have been used to translate FS to FFPE images (F2F), they may leave morphological inaccuracies with remaining FS artifacts or introduce new artifacts, reducing the quality of these translations for clinical assessments. In this study, we benchmark recent generative models, focusing on GANs and Latent Diffusion Models (LDMs), to overcome these limitations. We introduce a novel approach that combines LDMs with Histopathology Pre-Trained Embeddings to enhance restoration of FS images. Our framework leverages LDMs conditioned by both text and pre-trained embeddings to learn meaningful features of FS and FFPE histopathology images. Through diffusion and denoising techniques, our approach not only preserves essential diagnostic attributes like color staining and tissue morphology but also proposes an embedding translation mechanism to better predict the targeted FFPE representation of input FS images. As a result, this work achieves a significant improvement in classification performance, with the Area Under the Curve rising from 81.99% to 94.64%, accompanied by an advantageous CaseFD. This work establishes a new benchmark for FS to FFPE image translation quality, promising enhanced reliability and accuracy in histopathology FS image analysis. Our work is available at https://minhmanho.github.io/f2f_ldm/.
Submitted 19 April, 2024;
originally announced April 2024.
-
TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning
Authors:
Quang Minh Dinh,
Minh Khoi Ho,
Anh Quan Dang,
Hung Phong Tran
Abstract:
Traffic video description and analysis have received much attention recently due to the growing demand for efficient and reliable urban surveillance systems. Most existing methods only focus on locating traffic event segments, which severely lack descriptive details related to the behaviour and context of all the subjects of interest in the events. In this paper, we present TrafficVLM, a novel multi-modal dense video captioning model for vehicle ego camera view. TrafficVLM models traffic video events at different levels of analysis, both spatially and temporally, and generates long fine-grained descriptions for the vehicle and pedestrian at different phases of the event. We also propose a conditional component for TrafficVLM to control the generation outputs and a multi-task fine-tuning paradigm to enhance TrafficVLM's learning capability. Experiments show that TrafficVLM performs well on both vehicle and overhead camera views. Our solution achieved outstanding results in Track 2 of the AI City Challenge 2024, ranking us third in the challenge standings. Our code is publicly available at https://github.com/quangminhdinh/TrafficVLM.
Submitted 14 April, 2024;
originally announced April 2024.
-
Broad Instantaneous Bandwidth Microwave Spectrum Analyzer with a Microfabricated Atomic Vapor Cell
Authors:
Yongqi Shi,
Thomas Ruster,
Melvyn Ho,
Sylvain Karlen,
Jacques Haesler,
Philipp Treutlein
Abstract:
We report on broad instantaneous bandwidth microwave spectrum analysis with hot $^{87}\mathrm{Rb}$ atoms in a microfabricated vapor cell in a large magnetic field gradient. The sensor is a MEMS atomic vapor cell filled with isotopically pure $^{87}\mathrm{Rb}$ and $\mathrm{N}_2$ buffer gas to localize the motion of the atoms. The microwave signals of interest are coupled through a coplanar waveguide to the cell, inducing spin flip transitions between optically pumped ground states of the atoms. A static magnetic field with large gradient maps the $\textit{frequency spectrum}$ of the input microwave signals to a position-dependent $\textit{spin-flip pattern}$ on absorption images of the cell recorded with a laser beam onto a camera. In our proof-of-principle experiment, we demonstrate a microwave spectrum analyzer that has $\approx$ 1 GHz instantaneous bandwidth centered at 13.165 GHz, 3 MHz frequency resolution, 2 kHz refresh rate, and a -23 dBm single-tone microwave power detection limit in 1 s measurement time. A theoretical model is constructed to simulate the image signals by considering the processes of optical pumping, microwave interaction, diffusion of $^{87}\mathrm{Rb}$ atoms, and laser absorption. We expect to reach more than 25 GHz instantaneous bandwidth in an optimized setup, limited by the applied magnetic field gradient. Our demonstration offers a practical alternative to conventional microwave spectrum analyzers based on electronic heterodyne detection.
Submitted 22 March, 2024;
originally announced March 2024.
-
Two results on complexities of decision problems of groups
Authors:
Uri Andrews,
Matthew Harrison-Trainor,
Meng-Che "Turbo" Ho
Abstract:
We answer two questions on the complexities of decision problems of groups, each related to a classical result. First, C. Miller characterized the complexity of the isomorphism problem for finitely presented groups in 1971. We do the same for the isomorphism problem for recursively presented groups. Second, the fact that every Turing degree appears as the degree of the word problem of a finitely presented group was shown independently by multiple people in the 1960s. We answer the analogous question for degrees of ceers instead of Turing degrees. We show that the set of ceers which are computably equivalent to a finitely presented group is $Σ^0_3$-complete, which is the maximal possible complexity.
Submitted 4 March, 2024;
originally announced March 2024.
-
Torsion-free abelian groups of finite rank and fields of finite transcendence degree
Authors:
Meng-Che "Turbo" Ho,
Julia Knight,
Russell Miller
Abstract:
Let $\operatorname{TFAb}_r$ be the class of torsion-free abelian groups of rank $r$, and let $\operatorname{FD}_r$ be the class of fields of characteristic $0$ and transcendence degree $r$. We compare these classes using various notions. Considering Scott complexity of the structures in the classes and the complexity of the isomorphism relations on the classes, the classes seem very similar. Hjorth and Thomas showed that the $\operatorname{TFAb}_r$ are strictly increasing under Borel reducibility. This is not so for the classes $\operatorname{FD}_r$. Thomas and Velickovic showed that for sufficiently large $r$, the classes $\operatorname{FD}_r$ are equivalent under Borel reducibility. We try to compare the groups with the fields, using Borel reducibility, and also using some effective variants. We give functorial Turing computable embeddings of $\operatorname{TFAb}_r$ in $\operatorname{FD}_r$, and of $\operatorname{FD}_r$ in $\operatorname{FD}_{r+1}$. We show that under computable countable reducibility, $\operatorname{TFAb}_1$ lies on top among the classes we are considering. In fact, under computable countable reducibility, isomorphism on $\operatorname{TFAb}_1$ lies on top among equivalence relations that are effective $Σ_3$, along with the Vitali equivalence relation on $2^ω$.
Submitted 18 March, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology
Authors:
Matthew Ho,
Deaglan J. Bartlett,
Nicolas Chartier,
Carolina Cuesta-Lazaro,
Simon Ding,
Axel Lapel,
Pablo Lemos,
Christopher C. Lovell,
T. Lucas Makinen,
Chirag Modi,
Viraj Pandya,
Shivam Pandey,
Lucia A. Perez,
Benjamin Wandelt,
Greg L. Bryan
Abstract:
This paper presents the Learning the Universe Implicit Likelihood Inference (LtU-ILI) pipeline, a codebase for rapid, user-friendly, and cutting-edge machine learning (ML) inference in astrophysics and cosmology. The pipeline includes software for implementing various neural architectures, training schemata, priors, and density estimators in a manner easily adaptable to any research workflow. It includes comprehensive validation metrics to assess posterior estimate coverage, enhancing the reliability of inferred results. Additionally, the pipeline is easily parallelizable and is designed for efficient exploration of modeling hyperparameters. To demonstrate its capabilities, we present real applications across a range of astrophysics and cosmology problems, such as: estimating galaxy cluster masses from X-ray photometry; inferring cosmology from matter power spectra and halo point clouds; characterizing progenitors in gravitational wave signals; capturing physical dust parameters from galaxy colors and luminosities; and establishing properties of semi-analytic models of galaxy formation. We also include exhaustive benchmarking and comparisons of all implemented methods as well as discussions about the challenges and pitfalls of ML inference in astronomical sciences. All code and examples are made publicly available at https://github.com/maho3/ltu-ili.
Submitted 2 July, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Algorithmically finite, universal, and $*$-universal groups
Authors:
Uri Andrews,
Meng-Che "Turbo" Ho
Abstract:
The study of the word problems of groups dates back to Dehn in 1911, and has been a central topic of study in both group theory and computability theory. As most naturally occurring presentations of groups are recursive, their word problems can be thought of as a computably enumerable equivalence relation (ceer). In this paper, we study the word problem of groups in the framework of ceer degrees, introducing a new metric with which to study word problems. This metric is more refined than the classical context of Turing degrees.
Classically, every Turing degree is realized as the word problem of some c.e. group, but this is not true for ceer degrees. This motivates us to look at the classical constructions and show that there is a group whose word problem is not universal, but becomes universal after taking any nontrivial free product, which we call $*$-universal. This shows that existing constructions of the Higman embedding theorem do not preserve ceer degrees. We also study the index set of various classes of groups defined by their properties as a ceer: groups whose word problems are dark (equivalently, algorithmically finite as defined by Miasnikov and Osin), universal, and $*$-universal groups.
Submitted 2 February, 2024;
originally announced February 2024.
-
Self-supervised Complex Network for Machine Sound Anomaly Detection
Authors:
Miseul Kim,
Minh Tri Ho,
Hong-Goo Kang
Abstract:
In this paper, we propose an anomaly detection algorithm for machine sounds with a deep complex network trained by self-supervision. Using the fact that phase continuity information is crucial for detecting abnormalities in time-series signals, our proposed algorithm utilizes the complex spectrum as an input and performs complex number arithmetic throughout the entire process. Since the usefulness of phase information can vary depending on the type of machine sound, we also apply an attention mechanism to control the weights of the complex and magnitude spectrum bottleneck features depending on the machine type. We train our network to perform a self-supervised task that classifies the machine identifier (id) of normal input sounds among multiple classes. At test time, an input signal is detected as anomalous if the trained model is unable to correctly classify the id. In other words, we determine the presence of an anomaly when the output cross-entropy score of the multiclass identification task is lower than a pre-defined threshold. Experiments with the MIMII dataset show that the proposed algorithm has a much higher area under the curve (AUC) score than conventional magnitude spectrum-based algorithms.
Submitted 21 December, 2023;
originally announced December 2023.
-
Astrometric Microlensing by Primordial Black Holes with The Roman Space Telescope
Authors:
James Fardeen,
Peter McGill,
Scott E. Perkins,
William A. Dawson,
Natasha S. Abrams,
Jessica R. Lu,
Ming-Feng Ho,
Simeon Bird
Abstract:
Primordial Black Holes (PBHs) could explain some fraction of dark matter and shed light on many areas of early-universe physics. Despite over half a century of research interest, a PBH population has so far eluded detection. The most competitive constraints on the fraction of dark matter composed of PBHs ($f_{\rm DM}$) in the $(10^{-9}-10)M_{\odot}$ mass range come from photometric microlensing and bound $f_{\rm DM}\lesssim10^{-2}-10^{-1}$. With the advent of the Roman Space Telescope with its sub-milliarcsecond (mas) astrometric capabilities and its planned Galactic Bulge Time Domain Survey (GBTDS), detecting astrometric microlensing signatures will become routine. Compared with photometric microlensing, astrometric microlensing signals are sensitive to different lens mass-distance configurations and contain different information, making them a complementary lensing probe. At sub-mas astrometric precision, astrometric microlensing signals are typically detectable at larger lens-source separations than photometric signals, suggesting a microlensing detection channel of pure astrometric events. We use a Galactic simulation to predict the number of detectable microlensing events during the GBTDS via this pure astrometric microlensing channel. Assuming an absolute astrometric precision floor for bright stars of 0.1 mas for the GBTDS, we find that the number of detectable events peaks at $\approx 10^{3} f_{\rm DM}$ for a population of $ 1 M_{\odot}$ PBHs and tapers to $\approx 10f_{\rm DM}$ and $\approx 100f_{\rm DM}$ at $10^{-4}M_{\odot}$ and $10^{3}M_{\odot}$, respectively. Accounting for the distinguishability of PBHs from stellar lenses, we conclude the GBTDS will be sensitive to a PBH population at $f_{\rm DM}$ down to $\approx10^{-1}-10^{-3}$ for $(10^{-1}-10^{2})M_{\odot}$, likely yielding novel PBH constraints.
Submitted 8 March, 2024; v1 submitted 20 December, 2023;
originally announced December 2023.
-
Intra-Family Transformation of The Bi-Te Family via in-situ Chemical Interactions
Authors:
Zhihao He,
Tin Seng Manfred Ho,
Rolf Lortz,
Iam Keong Sou
Abstract:
The Bi-Te binary system, characterized by the homologous series (Bi2)m(Bi2Te3)n, has long attracted research interest for its layered structures and potential in advanced materials applications. Although Bi2Te3 has been extensively studied, exploration of other compounds has been constrained by synthesis challenges. This study reports the molecular beam epitaxy (MBE) growth of FeTe on Bi2Te3, demonstrating that varying growth conditions can turn the Bi2Te3 layer into different Bi-Te phases and form corresponding FeTe/Bi-Te heterostructures. Our combined analysis using reflection high-energy electron diffraction (RHEED), high-resolution X-ray diffraction (HRXRD), and high-resolution scanning transmission electron microscopy (HR-STEM) indicates that specific growth conditions used for the growth of the FeTe layer can facilitate the extraction of Te from Bi2Te3, leading to the formation of Bi4Te3 and Bi6Te3. Additionally, by lowering the FeTe growth temperature to 230 °C, Te extraction from the Bi2Te3 layer could be avoided, preserving the Bi2Te3 structure. Notably, all three FeTe/Bi-Te structures exhibit superconductivity, with the FeTe/Bi2Te3 heterostructure showing the highest superconductivity quality. These findings introduce a novel method for realizing Bi4Te3 and Bi6Te3 through Te extraction by growing FeTe on Bi2Te3, driven by the high reactivity between Fe and Te. This approach holds promise for synthesizing other members of the Bi-Te series, expanding the functional potential of these materials.
Submitted 7 June, 2024; v1 submitted 16 December, 2023;
originally announced December 2023.
-
CLASS-M: Adaptive stain separation-based contrastive learning with pseudo-labeling for histopathological image classification
Authors:
Bodong Zhang,
Hamid Manoochehri,
Man Minh Ho,
Fahimeh Fooladgar,
Yosep Chong,
Beatrice S. Knudsen,
Deepika Sirohi,
Tolga Tasdizen
Abstract:
Histopathological image classification is an important task in medical image analysis. Recent approaches generally rely on weakly supervised learning due to the ease of acquiring case-level labels from pathology reports. However, patch-level classification is preferable in applications where only a limited number of cases are available or when local prediction accuracy is critical. On the other hand, acquiring extensive datasets with localized labels for training is not feasible. In this paper, we propose a semi-supervised patch-level histopathological image classification model, named CLASS-M, that does not require extensively labeled datasets. CLASS-M is formed by two main parts: a contrastive learning module that uses separated Hematoxylin and Eosin images generated through an adaptive stain separation process, and a module with pseudo-labels using MixUp. We compare our model with other state-of-the-art models on two clear cell renal cell carcinoma datasets. We demonstrate that our CLASS-M model has the best performance on both datasets. Our code is available at github.com/BzhangURU/Paper_CLASS-M/tree/main
Submitted 4 January, 2024; v1 submitted 11 December, 2023;
originally announced December 2023.
-
Exploring the hierarchical structure of human plans via program generation
Authors:
Carlos G. Correa,
Sophia Sanborn,
Mark K. Ho,
Frederick Callaway,
Nathaniel D. Daw,
Thomas L. Griffiths
Abstract:
Human behavior is inherently hierarchical, resulting from the decomposition of a task into subtasks or an abstract action into concrete actions. However, behavior is typically measured as a sequence of actions, which makes it difficult to infer its hierarchical structure. In this paper, we explore how people form hierarchically-structured plans, using an experimental paradigm that makes hierarchical representations observable: participants create programs that produce sequences of actions in a language with explicit hierarchical structure. This task lets us test two well-established principles of human behavior: utility maximization (i.e. using fewer actions) and minimum description length (MDL; i.e. having a shorter program). We find that humans are sensitive to both metrics, but that both accounts fail to predict a qualitative feature of human-created programs, namely that people prefer programs with reuse over and above the predictions of MDL. We formalize this preference for reuse by extending the MDL account into a generative model over programs, modeling hierarchy choice as the induction of a grammar over actions. Our account can explain the preference for reuse and provides the best prediction of human behavior, going beyond simple accounts of compressibility to highlight a principle that guides hierarchical planning.
Submitted 30 November, 2023;
originally announced November 2023.
-
Energy and Time Complexity for Sorting Algorithms in Java
Authors:
Kristina Carter,
Su Mei Gwen Ho,
Mathias Marquar Arhipenko Larsen,
Martin Sundman,
Maja H. Kirkeby
Abstract:
The article investigates the relationship between time complexity and energy consumption in sorting algorithms, focusing on commonly-used algorithms implemented in Java: Bubble Sort, Counting Sort, Merge Sort, and Quick Sort. The significance of understanding this relationship is driven by the increasing energy demands of Information and Communication Technology systems and the potential for software optimization to contribute to energy efficiency. If we find a strong correlation between time complexity and energy usage, it would enhance the ability of software developers to create energy-efficient applications.
This quantitative study examines the execution of four selected sorting algorithms with inputs varying in size (25000 to 1 million) and order type (best, worst, and random cases) on a single kernel in a Java-enabled system. The input size is adjusted according to the type's maximum execution time, resulting in 136 combinations, totalling 12960 measurements. Wall time and CPU energy consumption are measured using Intel's RAPL. Statistical analyses are used to examine the correlations between time complexity, wall time, and energy consumption.
The study finds a strong correlation between time complexity and energy consumption for the sorting algorithms tested. More than 99% of the variance in energy consumption for Counting Sort, Merge Sort, and Quick Sort depends on their time complexities. More than 94% of the variance in energy consumption for Bubble Sort depends on its time complexity. The results affirm that time complexity can serve as a reliable predictor of energy consumption in sequential sorting algorithms. This discovery could guide software developers in choosing energy-efficient algorithms by considering time complexities.
Submitted 8 May, 2024; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Concept Alignment as a Prerequisite for Value Alignment
Authors:
Sunayana Rane,
Mark Ho,
Ilia Sucholutsky,
Thomas L. Griffiths
Abstract:
Value alignment is essential for building AI systems that can safely and reliably interact with people. However, what a person values -- and is even capable of valuing -- depends on the concepts that they are currently using to understand and evaluate what happens in the world. The dependence of values on concepts means that concept alignment is a prerequisite for value alignment -- agents need to align their representation of a situation with that of humans in order to successfully align their values. Here, we formally analyze the concept alignment problem in the inverse reinforcement learning setting, show how neglecting concept alignment can lead to systematic value mis-alignment, and describe an approach that helps minimize such failure modes by jointly reasoning about a person's concepts and values. Additionally, we report experimental results with human participants showing that humans reason about the concepts used by an agent when acting intentionally, in line with our joint reasoning model.
Submitted 30 October, 2023;
originally announced October 2023.
-
Morphology of Vaccine RD&D translation
Authors:
Martin Ho,
Henry CW Price,
Tim S Evans,
Eoin O'Sullivan
Abstract:
Translation as a concept coordinates participation in innovation but remains a qualitative construct. We provide multivariate accounting of linkages between market entries of vaccines, clinical trials, patents, publications, funders, and grants to quantify biomedical translation. We found that the most prevalent types of biomedical translation are those between basic and applied research (52 percent) followed by those between research and product development (36 percent). Although many biomedical stakeholders assume knowledge flows one way from upstream research to downstream application, knowledge feedbacks that mediate translation are prevalent. We also cluster biomedical funders based on the types of translations they fund. Large-scale funding agencies such as NIH are similarly involved in early-stage translation, whereas pharmaceuticals and mission-oriented agencies such as DARPA involve diverse translation types, and each leaves different translation footprints.
Submitted 27 October, 2023;
originally announced October 2023.
-
Constructing Impactful Machine Learning Research for Astronomy: Best Practices for Researchers and Reviewers
Authors:
D. Huppenkothen,
M. Ntampaka,
M. Ho,
M. Fouesneau,
B. Nord,
J. E. G. Peek,
M. Walmsley,
J. F. Wu,
C. Avestruz,
T. Buck,
M. Brescia,
D. P. Finkbeiner,
A. D. Goulding,
T. Kacprzak,
P. Melchior,
M. Pasquato,
N. Ramachandra,
Y. -S. Ting,
G. van de Ven,
S. Villar,
V. A. Villar,
E. Zinger
Abstract:
Machine learning has rapidly become a tool of choice for the astronomical community. It is being applied across a wide range of wavelengths and problems, from the classification of transients to neural network emulators of cosmological simulations, and is shifting paradigms about how we generate and report scientific results. At the same time, this class of method comes with its own set of best practices, challenges, and drawbacks, which, at present, are often reported on incompletely in the astrophysical literature. With this paper, we aim to provide a primer to the astronomical community, including authors, reviewers, and editors, on how to implement machine learning models and report their results in a way that ensures the accuracy of the results, reproducibility of the findings, and usefulness of the method.
Submitted 19 October, 2023;
originally announced October 2023.
-
Local index theory and the Riemann-Roch-Grothendieck theorem for complex flat vector bundles II
Authors:
Man-Ho Ho
Abstract:
In this paper, we prove the real part of the Riemann-Roch-Grothendieck theorem for complex flat vector bundles at the differential form level.
Submitted 21 May, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
-
Forecast Cosmological Constraints with the 1D Wavelet Scattering Transform and the Lyman-$α$ forest
Authors:
Hurum Tohfa,
Simeon Bird,
Ming-Feng Ho,
Mahdi Qezlou,
Martin Fernandez
Abstract:
We make forecasts for the constraining power of the 1D Wavelet Scattering Transform (WST) when used with a Lyman-$α$ forest cosmology survey. Using mock simulations and a Fisher matrix, we show that there is considerable cosmological information in the scattering transform coefficients not captured by the flux power spectrum. We estimate mock covariance matrices assuming uncorrelated Gaussian pixel noise for each quasar, at a level drawn from a simple lognormal model. The extra information comes from a smaller estimated covariance in the first-order wavelet power, and from second-order wavelet coefficients which probe non-Gaussian information in the forest. Forecast constraints on cosmological parameters from the WST are more than an order of magnitude tighter than for the power spectrum, shrinking a $4D$ parameter space by a factor of $10^6$. Should these improvements be realised with DESI, inflationary running would be constrained to test common inflationary models predicting $α_s = - 6\times 10^{-4}$ and neutrino mass constraints would be improved enough for a $5-σ$ detection of the minimal neutrino mass.
Submitted 3 June, 2024; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Disentangling the Black Hole Mass Spectrum with Photometric Microlensing Surveys
Authors:
Scott Ellis Perkins,
Peter McGill,
William Dawson,
Natasha S. Abrams,
Casey Y. Lam,
Ming-Feng Ho,
Jessica R. Lu,
Simeon Bird,
Kerianne Pruett,
Nathan Golovich,
George Chapline
Abstract:
From the formation mechanisms of stars and compact objects to nuclear physics, modern astronomy frequently leverages surveys to understand populations of objects to answer fundamental questions. The population of dark and isolated compact objects in the Galaxy contains critical information related to many of these topics, but is only practically accessible via gravitational microlensing. However, photometric microlensing observables are degenerate for different types of lenses, and one can seldom classify an event as involving either a compact object or stellar lens on its own. To address this difficulty, we apply a Bayesian framework that treats lens type probabilistically and jointly with a lens population model. This method allows lens population characteristics to be inferred despite intrinsic uncertainty in the lens-class of any single event. We investigate this method's effectiveness on a simulated ground-based photometric survey in the context of characterizing a hypothetical population of primordial black holes (PBHs) with an average mass of $30 M_{\odot}$. On simulated data, our method outperforms current black hole (BH) lens identification pipelines and characterizes different subpopulations of lenses while jointly constraining the PBH contribution to dark matter to ${\approx}25$\%. Key to robust inference, our method can marginalize over population model uncertainty. We find the lower mass cutoff for stellar origin BHs, a key observable in understanding the BH mass gap, particularly difficult to infer in our simulations. This work lays the foundation for cutting-edge PBH abundance constraints to be extracted from current photometric microlensing surveys.
Submitted 5 October, 2023;
originally announced October 2023.
-
Structurally guided task decomposition in spatial navigation tasks
Authors:
Ruiqi He,
Carlos G. Correa,
Thomas L. Griffiths,
Mark K. Ho
Abstract:
How are people able to plan so efficiently despite limited cognitive resources? We aimed to answer this question by extending an existing model of human task decomposition that can explain a wide range of simple planning problems by adding structure information to the task to facilitate planning in more complex tasks. The extended model was then applied to a more complex planning domain of spatial navigation. Our results suggest that our framework can correctly predict the navigation strategies of the majority of the participants in an online experiment.
Submitted 3 October, 2023;
originally announced October 2023.
-
Sensitivity Analysis of Simulation-Based Inference for Galaxy Clustering
Authors:
Chirag Modi,
Shivam Pandey,
Matthew Ho,
ChangHoon Hahn,
Bruno Régaldo-Saint Blancard,
Benjamin Wandelt
Abstract:
Simulation-based inference (SBI) is a promising approach to leverage high fidelity cosmological simulations and extract information from the non-Gaussian, non-linear scales that cannot be modeled analytically. However, scaling SBI to the next generation of cosmological surveys faces the computational challenge of requiring a large number of accurate simulations over a wide range of cosmologies, while simultaneously encompassing large cosmological volumes at high resolution. This challenge can potentially be mitigated by balancing the accuracy and computational cost for different components of the forward model while ensuring robust inference. To guide our steps in this, we perform a sensitivity analysis of SBI for galaxy clustering on various components of the cosmological simulations: gravity model, halo-finder and the galaxy-halo distribution models (halo-occupation distribution, HOD). We infer $σ_8$ and $Ω_m$ using galaxy power spectrum multipoles and the bispectrum monopole assuming a galaxy number density expected from the luminous red galaxies observed using the Dark Energy Spectroscopic Instrument (DESI). We find that SBI is insensitive to changing the gravity model between $N$-body simulations and particle mesh (PM) simulations. However, changing the halo-finder from friends-of-friends (FoF) to Rockstar can lead to a biased estimate of $σ_8$ based on the bispectrum. For galaxy models, training SBI on a more complex HOD leads to consistent inference for less complex HOD models, but SBI trained on simpler HOD models fails when applied to analyze data from a more complex HOD model. Based on our results, we discuss the outlook on cosmological simulations with a focus on applying SBI approaches to future galaxy surveys.
Submitted 26 September, 2023;
originally announced September 2023.
-
Towards A Unified Utilitarian Ethics Framework for Healthcare Artificial Intelligence
Authors:
Forhan Bin Emdad,
Shuyuan Mary Ho,
Benhur Ravuri,
Shezin Hussain
Abstract:
Artificial Intelligence (AI) aims to elevate healthcare to a pinnacle by aiding clinical decision support. Overcoming the challenges related to the design of ethical AI will enable clinicians, physicians, healthcare professionals, and other stakeholders to use and trust AI in healthcare settings. This study attempts to identify the major ethical principles influencing the utility performance of AI at different technological levels such as data access, algorithms, and systems through a thematic analysis. We observed that justice, privacy, bias, lack of regulations, risks, and interpretability are the most important principles to consider for ethical AI. This data-driven study has analyzed secondary survey data from the Pew Research Center (2020) of 36 AI experts to categorize the top ethical principles of AI design. To resolve the ethical issues identified by the meta-analysis and domain experts, we propose a new utilitarian ethics-based theoretical framework for designing ethical AI for the healthcare domain.
Submitted 25 September, 2023;
originally announced September 2023.
-
Cosmological Constraints from the eBOSS Lyman-$α$ Forest using the PRIYA Simulations
Authors:
M. A. Fernandez,
Simeon Bird,
Ming-Feng Ho
Abstract:
We present new cosmological parameter constraints from the eBOSS Lyman-$α$ forest survey. We use a new theoretical model and likelihood based on the PRIYA simulation suite. PRIYA is the first suite to resolve the Lyman-$α$ forest in a ($120$ Mpc/h)$^3$ volume, using a multi-fidelity emulation technique. We use PRIYA to predict Lyman-$α$ forest observables with $\lesssim 1\%$ interpolation error over an $11$ dimensional ($9$ simulated, $2$ in post-processing) parameter space. We identify an internal tension within the flux power spectrum data. Once the discrepant data is removed, we find the scalar spectral index at $k = 0.78$ h/Mpc to be $n_P = 0.97 - 0.995$ at $68\%$ confidence from the Lyman-$α$ forest flux power spectrum alone, in good agreement with Planck. The amplitude of matter fluctuations is $σ_8 = 0.733 \pm 0.026$ at $68\%$ confidence, in agreement with Dark Energy Survey weak lensing measurements and other small-scale structure probes and in tension with CMB measurements from Planck and ACT. The effective optical depth to Lyman-$α$ photons from our pipeline is in good agreement with earlier measurements. We add measurements of the mean temperature of the intergalactic gas from $z=3.8 - 2.2$ and use them to constrain the duration and heating rate of helium reionization, finding a preference for an early, hot, helium reionization event, as suggested by measurements from the helium Lyman-$α$ forest. Adding the mean IGM temperature data also increases the significance of the $σ_8$ tension. In the near future we will use our pipeline to infer cosmological parameters from the DESI Lyman-$α$ data.
Submitted 7 September, 2023;
originally announced September 2023.
-
Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation
Authors:
Andi Peng,
Aviv Netanyahu,
Mark Ho,
Tianmin Shu,
Andreea Bobu,
Julie Shah,
Pulkit Agrawal
Abstract:
Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments. Data augmentation can increase robustness by making the model invariant to task-irrelevant changes in the agent's observation. However, designers don't know which concepts are irrelevant a priori, especially when different end users have different preferences about how the task is performed. We propose an interactive framework to leverage feedback directly from the user to identify personalized task-irrelevant concepts. Our key idea is to generate counterfactual demonstrations that allow users to quickly identify possible task-relevant and irrelevant concepts. The knowledge of task-irrelevant concepts is then used to perform data augmentation and thus obtain a policy adapted to personalized user objectives. We present experiments validating our framework on discrete and continuous control tasks with real human users. Our method (1) enables users to better understand agent failure, (2) reduces the number of demonstrations required for fine-tuning, and (3) aligns the agent to individual user task preferences.
Submitted 13 July, 2023; v1 submitted 12 July, 2023;
originally announced July 2023.
-
Reliable computation by large-alphabet formulas in the presence of noise
Authors:
Andrew K. Tan,
Matthew Ho,
Isaac L. Chuang
Abstract:
We present two new positive results for reliable computation using formulas over physical alphabets of size $q > 2$. First, we show that for logical alphabets of size $\ell = q$ the threshold for denoising using gates subject to $q$-ary symmetric noise with error probability $\varepsilon$ is strictly larger than that for Boolean computation, and is possible as long as signals remain distinguishable, i.e. $ε< (q - 1) / q$, in the limit of large fan-in $k \rightarrow \infty$. We also determine the point at which generalized majority gates with bounded fan-in fail, and show in particular that reliable computation is possible for $ε< (q - 1) / (q (q + 1))$ in the case of $q$ prime and fan-in $k = 3$. Secondly, we provide an example where $\ell < q$, showing that reliable Boolean computation can be performed using $2$-input ternary logic gates subject to symmetric ternary noise of strength $\varepsilon < 1/6$ by using the additional alphabet element for error signaling.
Submitted 25 June, 2024; v1 submitted 22 June, 2023;
originally announced June 2023.
-
PRIYA: A New Suite of Lyman-alpha Forest Simulations for Cosmology
Authors:
Simeon Bird,
Martin Fernandez,
Ming-Feng Ho,
Mahdi Qezlou,
Reza Monadi,
Yueying Ni,
Nianyi Chen,
Rupert Croft,
Tiziana Di Matteo
Abstract:
We present the PRIYA suite of cosmological simulations, based on the code and hydrodynamic model of the ASTRID simulation, and designed for cosmological analyses of the Lyman-$α$ forest. Our simulation suite spans a $9$-dimensional parameter space, including $4$ cosmological parameters and $5$ astrophysical/thermal parameters. We have run $48$ low fidelity simulations with $1536^3$ particles in a $120$ Mpc/h box and $3$ high fidelity simulations with $3072^3$ particles in a $120$ Mpc/h box. All our simulations include a full physics model for galaxy formation, including supernova and AGN feedback, and thus also contain a realistic population of DLAs. We advance on earlier simulation suites by using larger particle loads, by incorporating new physical models for patchy hydrogen and helium reionization, and by self-consistently incorporating a model for AGN feedback. We show that patchy helium reionization imprints an excess in the 1D flux power spectrum on large scales, which may allow future measurements of helium reionization bubble sizes. Simulation parameters are chosen based on a Latin hypercube design and a Gaussian process is used to interpolate to arbitrary parameter combinations. We build a multi-fidelity emulator for the 1D flux power spectrum and the mean IGM temperature. We show that our final interpolation error is $< 1\%$ and that our simulations produce a flux power spectrum converged at the percent level for $z=5.4$ - $2.2$. Our simulation suite will be used to interpret Lyman-$α$ forest 1D flux power spectra from SDSS and future DESI data releases.
Submitted 2 January, 2024; v1 submitted 8 June, 2023;
originally announced June 2023.
-
MF-Box: Multi-fidelity and multi-scale emulation for the matter power spectrum
Authors:
Ming-Feng Ho,
Simeon Bird,
Martin A. Fernandez,
Christian R. Shelton
Abstract:
We introduce MF-Box, an extended version of MFEmulator, designed as a fast surrogate for power spectra, trained using N-body simulation suites from various box sizes and particle loads. To demonstrate MF-Box's effectiveness, we design simulation suites that include low-fidelity suites (L1 and L2) at $256 \,\mathrm{Mpc}/h$ and $100 \,\mathrm{Mpc}/h$, each with $128^3$ particles, and a high-fidelity suite (HF) with $512^3$ particles at $256 \,\mathrm{Mpc}/h$, representing a higher particle load compared to the low-fidelity suites. MF-Box acts as a probabilistic resolution correction function, learning most of the cosmological dependencies from L1 and L2 simulations and rectifying resolution differences with just 3 HF simulations using a Gaussian process. MF-Box successfully emulates power spectra from our HF testing set with a relative error of $< 3\%$ up to $k \simeq 7 \,h/\mathrm{Mpc}$ at $z \in [0, 3]$, while maintaining a cost similar to our previous multi-fidelity approach, which was accurate only up to $z = 1$. The addition of an extra low-fidelity node in a smaller box significantly improves emulation accuracy for MF-Box at $k > 2 \,h/\mathrm{Mpc}$, increasing it by a factor of $10$. We conduct an error analysis of MF-Box based on computational budget, providing guidance for optimizing budget allocation per fidelity node. Our proposed MF-Box enables future surveys to efficiently combine simulation suites of varying quality, effectively expanding the range of emulation capabilities while ensuring cost efficiency.
Submitted 4 October, 2023; v1 submitted 5 June, 2023;
originally announced June 2023.
-
Information-Ordered Bottlenecks for Adaptive Semantic Compression
Authors:
Matthew Ho,
Xiaosheng Zhao,
Benjamin Wandelt
Abstract:
We present the information-ordered bottleneck (IOB), a neural layer designed to adaptively compress data into latent variables ordered by likelihood maximization. Without retraining, IOB nodes can be truncated at any bottleneck width, capturing the most crucial information in the first latent variables. Unifying several previous approaches, we show that IOBs achieve near-optimal compression for a given encoding architecture and can assign ordering to latent signals in a manner that is semantically meaningful. IOBs demonstrate a remarkable ability to compress embeddings of image and text data, leveraging the performance of SOTA architectures such as CNNs, transformers, and diffusion models. Moreover, we introduce a novel theory for estimating global intrinsic dimensionality with IOBs and show that they recover SOTA dimensionality estimates for complex synthetic data. Furthermore, we showcase the utility of these models for exploratory analysis through applications on heterogeneous datasets, enabling computer-aided discovery of dataset complexity.
Submitted 18 May, 2023;
originally announced May 2023.
-
Bayesian Reinforcement Learning with Limited Cognitive Load
Authors:
Dilip Arumugam,
Mark K. Ho,
Noah D. Goodman,
Benjamin Van Roy
Abstract:
All biological and artificial agents must learn and make decisions given limits on their ability to process information. As such, a general theory of adaptive behavior should be able to account for the complex interactions between an agent's learning history, decisions, and capacity constraints. Recent work in computer science has begun to clarify the principles that shape these dynamics by bridging ideas from reinforcement learning, Bayesian decision-making, and rate-distortion theory. This body of work provides an account of capacity-limited Bayesian reinforcement learning, a unifying normative framework for modeling the effect of processing constraints on learning and action selection. Here, we provide an accessible review of recent algorithms and theoretical results in this setting, paying special attention to how these ideas can be applied to studying questions in the cognitive and behavioral sciences.
Submitted 4 May, 2023;
originally announced May 2023.
-
Machine Learning Uncovers the Universe's Hidden Gems: A Comprehensive Catalogue of CIV Absorption Lines in SDSS DR12
Authors:
Reza Monadi,
Ming-Feng Ho,
Kathy L. Cooksey,
Simeon Bird
Abstract:
We assemble the largest CIV absorption line catalogue to date, leveraging machine learning, specifically Gaussian processes, to remove the need for visual inspection for detecting CIV absorbers. The catalogue contains probabilities classifying the reliability of the absorption system within a quasar spectrum. Our training set was a sub-sample of DR7 spectra that had no detectable CIV absorption in a large visually inspected catalogue. We used Bayesian model selection to decide between our continuum model and our absorption-line models. Using a random hold-out sample of 1301 spectra from all of the 26,030 investigated spectra in the DR7 CIV catalogue, we validated our pipeline and obtained an 87% classification performance score. We found good purity and completeness values, both ~80%, when a probability of ~95% is used as the threshold. Our pipeline obtained similar CIV redshifts and rest equivalent widths to our training set. Applying our algorithm to 185,425 selected quasar spectra from SDSS DR12, we produce a catalogue of 113,775 CIV doublets with at least 95% confidence. Our catalogue provides maximum a posteriori values and credible intervals for CIV redshift, column density, and Doppler velocity dispersion. We detect CIV absorption systems with a redshift range of 1.37 $\!-\!$ 5.1, including 33 systems with a redshift larger than 5 and 549 absorber systems with a rest equivalent width greater than 2 Å at more than 95% confidence. Our catalogue can be used to investigate the physical properties of the circumgalactic and intergalactic media.
Submitted 23 September, 2023; v1 submitted 28 April, 2023;
originally announced May 2023.
-
Posterior Sampling of the Initial Conditions of the Universe from Non-linear Large Scale Structures using Score-Based Generative Models
Authors:
Ronan Legin,
Matthew Ho,
Pablo Lemos,
Laurence Perreault-Levasseur,
Shirley Ho,
Yashar Hezaveh,
Benjamin Wandelt
Abstract:
Reconstructing the initial conditions of the universe is a key problem in cosmology. Methods based on simulating the forward evolution of the universe have provided a way to infer initial conditions consistent with present-day observations. However, due to the high complexity of the inference problem, these methods either fail to sample a distribution of possible initial density fields or require significant approximations in the simulation model to be tractable, potentially leading to biased results. In this work, we propose the use of score-based generative models to sample realizations of the early universe given present-day observations. We infer the initial density field of full high-resolution dark matter N-body simulations from the present-day density field and verify the quality of produced samples compared to the ground truth based on summary statistics. The proposed method is capable of providing plausible realizations of the early universe density field from the initial conditions posterior distribution marginalized over cosmological parameters and can sample orders of magnitude faster than current state-of-the-art methods.
Submitted 7 April, 2023;
originally announced April 2023.
-
Benchmarks and Explanations for Deep Learning Estimates of X-ray Galaxy Cluster Masses
Authors:
Matthew Ho,
John Soltis,
Arya Farahi,
Daisuke Nagai,
August Evrard,
Michelle Ntampaka
Abstract:
We evaluate the effectiveness of deep learning (DL) models for reconstructing the masses of galaxy clusters using X-ray photometry data from next-generation surveys. We establish these constraints using a catalogue of realistic mock eROSITA X-ray observations which use hydrodynamical simulations to model realistic cluster morphology, background emission, telescope response, and AGN sources. Using bolometric X-ray photon maps as input, DL models achieve a predictive mass scatter of $σ_{\ln M_\mathrm{500c}} = 17.8\%$, a factor-of-two improvement on scalar observables such as richness $N_\mathrm{gal}$, 1D velocity dispersion $σ_\mathrm{v,1D}$, and photon count $N_\mathrm{phot}$ as well as a $32\%$ improvement upon idealised, volume-integrated measurements of the bolometric X-ray luminosity $L_X$. We then show that extending this model to handle multichannel X-ray photon maps, separated in low, medium, and high energy bands, further reduces the mass scatter to $16.2\%$. We also tested a multimodal DL model incorporating both dynamical and X-ray cluster probes and achieved marginal gains at a mass scatter of $15.9\%$. Finally, we conduct a quantitative interpretability study of our DL models and find that they greatly down-weight the importance of pixels in the centres of clusters and at the location of AGN sources, validating previous claims of DL modelling improvements and suggesting practical and theoretical benefits for using DL in X-ray mass inference.
Submitted 25 July, 2023; v1 submitted 28 February, 2023;
originally announced March 2023.
-
Order in Innovation
Authors:
Martin Ho,
Henry CW Price,
Tim S Evans,
Eoin O'Sullivan
Abstract:
Is calendar time the true clock of innovation? By combining complexity science with innovation economics and using vaccine datasets containing over three million citations and eight regulatory authorisations, we discover that calendar time and network order describe innovation progress with varying accuracy. First, we present a method to establish a mathematical link between technological evolution and complex networks. The result is a path of events that narrates innovation bottlenecks. Next, we quantify the position and proximity of documents to these innovation paths and find that research, by and large, proceeds from basic research, applied research, development, to commercialisation. By extension, we are able to causally quantify the participation of innovation funders. When it comes to vaccine innovation, diffusion-oriented entities are preoccupied with basic, later-stage research; biopharmaceuticals tend to participate in applied development activities and clinical trials at the later stage; while mission-oriented entities tend to initiate early-stage research. Future innovation programs and funding allocations would benefit from better understanding innovation orders.
Submitted 25 February, 2023;
originally announced February 2023.
-
Proximity-induced quasi-one-dimensional superconducting quantum anomalous Hall state: a promising scalable top-down approach towards localized Majorana modes
Authors:
Omargeldi Atanov,
Wai Ting Tai,
Ying-Ming Xie,
Yat Hei Ng,
Molly A. Hammond,
Tin Seng Manfred Ho,
Tsin Hei Koo,
Hui Li,
Sui Lun Ho,
Jian Lyu,
Sukong Chong,
Peng Zhang,
Lixuan Tai,
Jiannong Wang,
Kam Tuen Law,
Kang L. Wang,
Rolf Lortz
Abstract:
In this work, ~100 nm wide quantum anomalous Hall insulator (QAHI) nanoribbons are etched from a two-dimensional QAHI film. One part of the nanoribbon is covered with superconducting Nb, while the other part is connected to an Au lead via two-dimensional QAHI regions. Andreev reflection spectroscopy measurements were performed, and multiple in-gap conductance peaks were observed in three different devices. In the presence of an increasing magnetic field perpendicular to the QAHI film, the multiple in-gap peak structure evolves into a single zero-bias conductance peak (ZBCP). Theoretical simulations suggest that the measurements are consistent with the scenario that the increasing magnetic field drives the nanoribbons from a multi-channel occupied regime to a single channel occupied regime, and that the ZBCP may be induced by zero energy Majorana modes as previously predicted [24]. Although further experiments are needed to clarify the nature of the ZBCP, we provide initial evidence that quasi-1D QAHI nanoribbon/superconductor heterostructures are new and promising platforms for realizing zero-energy Majorana modes.
Submitted 13 February, 2023;
originally announced February 2023.
-
A novel test of gravity via black hole eikonal correspondence
Authors:
Che-Yu Chen,
Yu-Jui Chen,
Meng-Yuan Ho,
Yung-Hsuan Tseng
Abstract:
When adopted in black hole spacetimes, geometric-optics approximations imply a mapping between the quasinormal mode (QNM) spectrum of black holes in the eikonal limit and black hole images. In particular, the real part and the imaginary part of eikonal QNM frequencies are associated with the apparent size and the detailed structure of the ring images, respectively. This correspondence could be violated when going beyond general relativity. We propose a novel method to test the eikonal correspondence via the comparison of two sets of observables from a nonrotating black hole, one extracted from QNM spectra and the other from the lensed photon rings on the image plane. Specifically, the photon ring observables robustly capture the information of the black hole spacetime itself regardless of the surrounding emission models. Therefore, the proposed test of eikonal correspondence can be validated in quite broad scenarios.
Submitted 5 September, 2023; v1 submitted 20 December, 2022;
originally announced December 2022.
-
Testing Lepton Flavor Universality at Future $Z$ Factories
Authors:
Tin Seng Manfred Ho,
Xu-Hui Jiang,
Tsz Hong Kwok,
Lingfeng Li,
Tao Liu
Abstract:
As one of the hypothetical principles in the Standard Model (SM), lepton flavor universality (LFU) should be tested with a precision as high as possible such that the physics violating this principle can be fully examined. The run of $Z$ factory at a future $e^+e^-$ collider such as CEPC or FCC-$ee$ provides a great opportunity to perform this task because of the large statistics and high reconstruction efficiencies for $b$-hadrons at $Z$ pole. In this paper, we present a systematic study on the LFU test in the future $Z$ factories. The goal is three-fold. Firstly, we study the sensitivities of measuring the LFU-violating observables of $b\to c τν$, $i.e.$, $R_{J/ψ}$, $R_{D_s}$, $R_{D_s^\ast}$ and $R_{Λ_c}$, where $τ$ decays muonically. For this purpose, we develop the strategies for event reconstruction, based on the track information significantly. Secondly, we explore the sensitivity robustness against detector performance and its potential improvement with the message of event shape or beyond the $b$-hadron decays. A picture is drawn on the variation of analysis sensitivities with the detector tracking resolution and soft photon detectability, and the impact of Fox-Wolfram moments is studied on the measurement of relevant flavor events. Finally, we interpret the projected sensitivities in the SM effective field theory, by combining the LFU tests of $b\to c τν$ and the measurements of $b\to s τ^+τ^-$ and $b\to s \barν ν$. We show that the limits on the LFU-violating energy scale can be pushed up to $\sim \mathcal{O} (10)$~TeV for $\lesssim \mathcal O(1)$ Wilson coefficients at Tera-$Z$.
Submitted 5 December, 2022;
originally announced December 2022.
-
Interactive Image Manipulation with Complex Text Instructions
Authors:
Ryugo Morita,
Zhiqiang Zhang,
Man M. Ho,
Jinjia Zhou
Abstract:
Recently, text-guided image manipulation has received increasing attention in the research field of multimedia processing and computer vision due to its high flexibility and controllability. Its goal is to semantically manipulate parts of an input reference image according to the text descriptions. However, most of the existing works have the following problems: (1) text-irrelevant content cannot always be maintained and is instead randomly changed, (2) the performance of image manipulation still needs to be further improved, (3) only descriptive attributes can be manipulated. To solve these problems, we propose a novel image manipulation method that interactively edits an image using complex text instructions. It allows users to not only improve the accuracy of image manipulation but also achieve complex tasks such as enlarging, dwindling, or removing objects and replacing the background with the input image. To make these tasks possible, we apply three strategies. First, the given image is divided into text-relevant content and text-irrelevant content. Only the text-relevant content is manipulated and the text-irrelevant content can be maintained. Second, a super-resolution method is used to enlarge the manipulation region to further improve the operability and to help manipulate the object itself. Third, a user interface is introduced for editing the segmentation map interactively to re-modify the generated image according to the user's desires. Extensive experiments on the Caltech-UCSD Birds-200-2011 (CUB) dataset and Microsoft Common Objects in Context (MS COCO) datasets demonstrate our proposed method can enable interactive, flexible, and accurate image manipulation in real-time. Through qualitative and quantitative evaluations, we show that the proposed model outperforms other state-of-the-art methods.
Submitted 25 November, 2022;
originally announced November 2022.
-
Inverse clustering of Gibbs Partitions via independent fragmentation and dual dependent coagulation operators
Authors:
Man Wai Ho,
Lancelot F. James,
John W. Lau
Abstract:
Gibbs partitions of the integers generated by stable subordinators of index $α\in(0,1)$ form remarkable classes of random partitions where in principle much is known about their properties, including practically effortless obtainment of otherwise complex asymptotic results potentially relevant to applications in general combinatorial stochastic processes, random tree/graph growth models and Bayesian statistics. This class includes the well-known models based on the two-parameter Poisson-Dirichlet distribution which forms the bulk of explicit applications. This work continues efforts to provide interpretations for larger classes of Gibbs partitions by embedding important operations within this framework. Here we address the formidable problem of extending the dual, infinite-block, coagulation/fragmentation results of Jim Pitman (1999, Annals of Probability), where in terms of coagulation they are based on independent two-parameter Poisson-Dirichlet distributions, to all such Gibbs (stable Poisson-Kingman) models. Our results create nested families of Gibbs partitions, and corresponding mass partitions, over any $0<β<α<1.$ We primarily focus on the fragmentation operations, which remain independent in this setting, and corresponding remarkable calculations for Gibbs partitions derived from that operation. We also present definitive results for the dual coagulation operations, now based on our construction of dependent processes, and demonstrate its relatively simple application in terms of Mittag-Leffler and generalized gamma models. The latter demonstrates another approach to recover the duality results in Pitman (1999).
Submitted 21 November, 2022;
originally announced November 2022.
-
How Population III Supernovae Determined the Properties of the First Galaxies
Authors:
Ke-Jung Chen,
Ching-Yao Tang,
Daniel J. Whalen,
Meng-Yuan Ho,
Sung-Han Tsai,
Po-Sheng Ou,
Masaomi Ono
Abstract:
Massive Pop III stars can die as energetic supernovae that enrich the early universe with metals and determine the properties of the first galaxies. With masses of about $10^9$ Msun at $z \gtrsim 10$, these galaxies are believed to be the ancestors of the Milky Way. This paper investigates the impact of Pop III supernova remnants (SNRs) from both Salpeter-like and top-heavy initial mass functions (IMFs) on the formation of first galaxies with high-resolution radiation-hydrodynamical simulations with the ENZO code. Our findings indicate that SNRs from a top-heavy Pop III IMF produce more metals, leading to more efficient gas cooling and earlier Pop II star formation in the first galaxies. From a few hundred to a few thousand Pop II stars can form in the central regions of these galaxies. These stars have metallicities of $10^{-3}$ to $10^{-2}$ Zsun, greater than those of extremely metal-poor (EMP) stars. Their mass function follows a power-law distribution with $dN(M_*)/dM_* \propto M_*^α$, where $M_*$ is stellar mass and $α= 2.66 - 5.83$ and is steeper for a top-heavy IMF. We thus find that EMP stars were not typical of most primitive galaxies.
Submitted 12 February, 2024; v1 submitted 11 November, 2022;
originally announced November 2022.
-
Humans decompose tasks by trading off utility and computational cost
Authors:
Carlos G. Correa,
Mark K. Ho,
Frederick Callaway,
Nathaniel D. Daw,
Thomas L. Griffiths
Abstract:
Human behavior emerges from planning over elaborate decompositions of tasks into goals, subgoals, and low-level actions. How are these decompositions created and used? Here, we propose and evaluate a normative framework for task decomposition based on the simple idea that people decompose tasks to reduce the overall cost of planning while maintaining task performance. Analyzing 11,117 distinct graph-structured planning tasks, we find that our framework justifies several existing heuristics for task decomposition and makes predictions that can be distinguished from two alternative normative accounts. We report a behavioral study of task decomposition ($N=806$) that uses 30 randomly sampled graphs, a larger and more diverse set than that of any previous behavioral study on this topic. We find that human responses are more consistent with our framework for task decomposition than alternative normative accounts and are most consistent with a heuristic -- betweenness centrality -- that is justified by our approach. Taken together, our results provide new theoretical insight into the computational principles underlying the intelligent structuring of goal-directed behavior.
Submitted 7 November, 2022;
originally announced November 2022.
-
On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning
Authors:
Dilip Arumugam,
Mark K. Ho,
Noah D. Goodman,
Benjamin Van Roy
Abstract:
Throughout the cognitive-science literature, there is widespread agreement that decision-making agents operating in the real world do so under limited information-processing capabilities and without access to unbounded cognitive or computational resources. Prior work has drawn inspiration from this fact and leveraged an information-theoretic model of such behaviors or policies as communication channels operating under a bounded rate constraint. Meanwhile, a parallel line of work also capitalizes on the same principles from rate-distortion theory to formalize capacity-limited decision making through the notion of a learning target, which facilitates Bayesian regret bounds for provably-efficient learning algorithms. In this paper, we aim to elucidate this latter perspective by presenting a brief survey of these information-theoretic models of capacity-limited decision making in biological and artificial agents.
Submitted 30 October, 2022;
originally announced October 2022.
-
WikiWhy: Answering and Explaining Cause-and-Effect Questions
Authors:
Matthew Ho,
Aditya Sharma,
Justin Chang,
Michael Saxon,
Sharon Levy,
Yujie Lu,
William Yang Wang
Abstract:
As large language models (LLMs) grow larger and more sophisticated, assessing their "reasoning" capabilities in natural language grows more challenging. Recent question answering (QA) benchmarks that attempt to assess reasoning are often limited by a narrow scope of covered situations and subject matters. We introduce WikiWhy, a QA dataset built around a novel auxiliary task: explaining why an answer is true in natural language. WikiWhy contains over 9,000 "why" question-answer-rationale triples, grounded on Wikipedia facts across a diverse set of topics. Each rationale is a set of supporting statements connecting the question to the answer. WikiWhy serves as a benchmark for the reasoning capabilities of LLMs because it demands rigorous explicit rationales for each answer to demonstrate the acquisition of implicit commonsense knowledge, which is unlikely to be easily memorized. GPT-3 baselines achieve only 38.7% human-evaluated correctness in the end-to-end answer & explain condition, leaving significant room for future improvements.
Submitted 30 November, 2022; v1 submitted 21 October, 2022;
originally announced October 2022.
-
Free structures and limiting density
Authors:
Johanna N. Y. Franklin,
Meng-Che "Turbo" Ho,
Julia Knight
Abstract:
Gromov asked what a typical (finitely presented) group looks like, and he suggested a way to make the question precise in terms of limiting density. The typical finitely generated group is known to share some important properties with the non-abelian free groups. We ask Gromov's question more generally, for structures in an arbitrary algebraic variety (in the sense of universal algebra), with presentations of a specific form. We focus on elementary properties. We give examples illustrating different behaviors of the limiting density. Based on the examples, we identify sufficient conditions for the elementary first-order theory of the free structure to match that of the typical structure; i.e., a sentence is true in the free structure iff it has limiting density 1.
Submitted 8 September, 2022;
originally announced September 2022.