
Conditional Forecasts in Large Bayesian VARs with Multiple Equality and Inequality Constraints

Authors: Joshua C. C. Chan, Davide Pettenuzzo, Aubrey Poon, Dan Zhu

Abstract: Conditional forecasts, i.e. projections of a set of variables of interest on the future paths of some other variables, are used routinely by empirical macroeconomists in a number of applied settings. In spite of this, the existing algorithms used to generate conditional forecasts tend to be very computationally intensive, especially when working with large Vector Autoregressions or when multiple linear equality and inequality constraints are imposed at once. We introduce a novel precision-based sampler that is fast, scales well, and yields conditional forecasts from linear equality and inequality constraints. We show in a simulation study that the proposed method produces forecasts that are identical to those from the existing algorithms but in a fraction of the time. We then illustrate the performance of our method in a large Bayesian Vector Autoregression where we simultaneously impose a mix of linear equality and inequality constraints on the future trajectories of key US macroeconomic indicators over the 2020-2022 period.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.08414 [pdf, other]

Discovering Preference Optimization Algorithms with and for Large Language Models

Authors: Chris Lu, Samuel Holt, Claudio Fanconi, Alex J. Chan, Jakob Foerster, Mihaela van der Schaar, Robert Tjarko Lange

Abstract: Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs. Typically, preference optimization is approached as an offline supervised learning task using manually-crafted convex loss functions. While these methods are based on theoretical insights, they are inherently constrained by human creativity, so the large search space of possible loss functions remains underexplored. We address this by performing LLM-driven objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention. Specifically, we iteratively prompt an LLM to propose and implement new preference optimization loss functions based on previously-evaluated performance metrics. This process leads to the discovery of previously-unknown and performant preference optimization algorithms. The best performing of these we call Discovered Preference Optimization (DiscoPOP), a novel algorithm that adaptively blends logistic and exponential losses. Experiments demonstrate the state-of-the-art performance of DiscoPOP and its successful transfer to held-out tasks.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.06523 [pdf, other]

NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing

Authors: Ting-Hsuan Chen, Jiewen Chan, Hau-Shiang Shiu, Shih-Han Yen, Chang-Han Yeh, Yu-Lun Liu

Abstract: We propose a video editing framework, NaRCan, which integrates a hybrid deformation field and diffusion prior to generate high-quality natural canonical images to represent the input video. Our approach utilizes homography to model global motion and employs multi-layer perceptrons (MLPs) to capture local residual deformations, enhancing the model's ability to handle complex video dynamics. By introducing a diffusion prior from the early stages of training, our model ensures that the generated images retain a high-quality natural appearance, making the produced canonical images suitable for various downstream tasks in video editing, a capability not achieved by current canonical-based methods. Furthermore, we incorporate low-rank adaptation (LoRA) fine-tuning and introduce a noise and diffusion prior update scheduling technique that accelerates the training process by 14 times. Extensive experimental results show that our method outperforms existing approaches in various video editing tasks and produces coherent and high-quality edited video sequences. See our project page for video results at https://koi953215.github.io/NaRCan_page/.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Project page: https://koi953215.github.io/NaRCan_page/

arXiv:2406.04398 [pdf, other]

lenscat: a Public and Community-Contributed Catalog of Known Strong Gravitational Lenses

Authors: L. Vujeva, R. K. L. Lo, J. M. Ezquiaga, J. C. L. Chan

Abstract: We present lenscat, a public and community-contributed catalog of strong gravitational lenses found by electromagnetic surveys. The main objective of lenscat is to compile a simple, easy-to-access catalog that can be used in a variety of lensing studies, such as facilitating the search for the host galaxy of a candidate strongly lensed transient event. We also provide a Python package to interact with tools commonly used by the community. This allows end users both with and without lensing expertise to obtain a list of known strong lenses within a given search area, and also to rank them by their respective searched probabilities. Here, we exemplify this by crossmatching the gravitational wave joint sky localization region of an interesting pair of events, GW170104-GW170814. Other examples with short gamma-ray bursts are given. Thanks to the open and simple infrastructure of lenscat, members of the lensing community can directly add newly found lenses from their own studies to help create a long-lasting catalog that is as exhaustive and accessible as possible.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 7 pages, 2 figures

arXiv:2406.03109 [pdf, other]

CAPRI-FAIR: Integration of Multi-sided Fairness in Contextual POI Recommendation Framework

Authors: Francis Zac dela Cruz, Flora D. Salim, Yonchanok Khaokaew, Jeffrey Chan

Abstract: Point-of-interest (POI) recommendation, a form of context-aware recommendation, takes into account spatio-temporal constraints and contexts like distance, peak business hours, and previous user check-ins. Given the ability of these kinds of systems to influence not just the consumer's travel experience, but also the POI's business, it is important to consider fairness from multiple perspectives. Unfortunately, these systems tend to provide less accurate recommendations to inactive users, and less exposure to unpopular POIs. The goal of this paper is to develop a post-filter methodology that incorporates provider and consumer fairness factors into pre-existing recommendation models, to satisfy fairness metrics like item exposure, and performance metrics like precision and distance, making the system more sustainable to both consumers and providers. Experiments have shown that using a linear scoring model for provider fairness in re-scoring recommended items yields the best tradeoff between performance and long-tail exposure, in some cases without a significant decrease in precision. When attempting to address consumer fairness by recommending more popular POIs to inactive users, the result was an increase in precision for only some recommendation models and datasets. Finally, when considering the tradeoff between both parameters, the combinations that reached the Pareto front of consumer and provider fairness, unfortunately, achieved the lowest precision values. We find that the nature of this tradeoff depends heavily on the model and the dataset.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2406.00031 [pdf, other]

AMGPT: a Large Language Model for Contextual Querying in Additive Manufacturing

Authors: Achuth Chandrasekhar, Jonathan Chan, Francis Ogoke, Olabode Ajenifujah, Amir Barati Farimani

Abstract: Generalized large language models (LLMs) such as GPT-4 may not provide specific answers to queries formulated by materials science researchers. These models may produce a high-level outline but lack the capacity to return detailed instructions on manufacturing and material properties of novel alloys. Enhancing a smaller model with specialized domain knowledge may provide an advantage over large language models which cannot be retrained quickly enough to keep up with the rapid pace of research in metal additive manufacturing (AM). We introduce "AMGPT," a specialized LLM text generator designed for metal AM queries. The goal of AMGPT is to assist researchers and users in navigating the extensive corpus of literature in AM. Instead of training from scratch, we employ a pre-trained Llama2-7B model from Hugging Face in a Retrieval-Augmented Generation (RAG) setup, utilizing it to dynamically incorporate information from $\sim$50 AM papers and textbooks in PDF format. Mathpix is used to convert these PDF documents into TeX format, facilitating their integration into the RAG pipeline managed by LlamaIndex. Expert evaluations of this project highlight that specific embeddings from the RAG setup accelerate response times and maintain coherence in the generated text.

Submitted 24 May, 2024; originally announced June 2024.

Comments: 54 pages, 4 figures

arXiv:2405.19184 [pdf, other]

doi 10.1145/3638529.3654207

Promoting Two-sided Fairness in Dynamic Vehicle Routing Problem

Authors: Yufan Kang, Rongsheng Zhang, Wei Shao, Flora D. Salim, Jeffrey Chan

Abstract: The Dynamic Vehicle Routing Problem (DVRP) is an extension of the classic Vehicle Routing Problem (VRP), which is a fundamental problem in logistics and transportation. Typically, DVRPs involve two stakeholders: service providers that deliver services to customers and customers who raise requests from different locations. Many real-world applications can be formulated as DVRPs, such as ridesharing and non-compliance capture. Apart from original objectives like optimising total utility or efficiency, DVRP should also consider fairness for all parties. Unfairness can induce service providers and customers to give up on the systems, leading to negative financial and social impacts. However, most existing DVRP-related applications focus on improving fairness from a single side, and there have been few works considering two-sided fairness and utility optimisation concurrently. To this end, we propose a novel framework, a Two-sided Fairness-aware Genetic Algorithm (named 2FairGA), which expands the genetic algorithm from the original objective solely focusing on utility to multi-objectives that incorporate two-sided fairness. Subsequently, the impact of injecting two fairness definitions into the utility-focused model and the correlation between any pair of the three objectives are explored. Extensive experiments demonstrate the superiority of our proposed framework compared to the state-of-the-art.

Submitted 29 May, 2024; originally announced May 2024.

arXiv:2405.12216 [pdf, other]

doi 10.1093/mnras/stae1311

Radiative transfer of 21-cm line through ionised cavities in an expanding universe

Authors: Kinwah Wu, Qin Han, Jennifer Y. H. Chan

Abstract: The optical depth parameterisation is typically used to study the 21-cm signals associated with the properties of the neutral hydrogen (HI) gas and the ionisation morphology during the Epoch of Reionisation (EoR), without solving the radiative transfer equation. To assess the uncertainties resulting from this simplification, we conduct explicit radiative transfer calculations using the cosmological 21-cm radiative transfer (C21LRT) code and examine the imprints of ionisation structures on the 21-cm spectrum. We consider a globally averaged reionisation history and implement fully ionised cavities (HII bubbles) of diameters $d$ ranging from 0.01 Mpc to 10 Mpc at epochs within the emission and the absorption regimes of the 21-cm global signal. The single-ray C21LRT calculations show that the shape of the imprinted spectral features is primarily determined by $d$ and the 21-cm line profile, which is parametrised by the turbulent velocity of the HI gas. They reveal the spectral features tied to the transition from ionised to neutral regions that calculations based on the optical depth parametrisation were unable to capture. We also present analytical approximations of the calculated spectral features of the HII bubbles. The multiple-ray calculations show that the apparent shape of an HII bubble (of $d=5$ Mpc at $z=8$), because of the finite speed of light, differs depending on whether the bubble's ionisation front is stationary or expanding. Our study shows the necessity of properly accounting for the effects of line-continuum interaction, line broadening and cosmological expansion to correctly predict the EoR 21-cm signals.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 11 figures, 1 table

arXiv:2405.06786 [pdf, other]

SAM3D: Zero-Shot Semi-Automatic Segmentation in 3D Medical Images with the Segment Anything Model

Authors: Trevor J. Chan, Aarush Sahni, Jie Li, Alisha Luthra, Amy Fang, Alison Pouch, Chamith S. Rajapakse

Abstract: We introduce SAM3D, a new approach to semi-automatic zero-shot segmentation of 3D images building on the existing Segment Anything Model. We achieve fast and accurate segmentations in 3D images with a four-step strategy comprising: volume slicing along non-orthogonal axes, efficient prompting in 3D, slice-wise inference using the pretrained SAM, and recomposition and refinement in 3D. We evaluate SAM3D performance qualitatively on an array of imaging modalities and anatomical structures and quantify performance for specific organs in body CT and tumors in brain MRI. By enabling users to create 3D segmentations of unseen data quickly and with dramatically reduced manual input, these methods have the potential to aid surgical planning and education, diagnostic imaging, and scientific research.

Submitted 10 May, 2024; originally announced May 2024.

arXiv:2405.04863 [pdf, other]

Three-dimensional higher-order saddle points induced flat bands in Co-based kagome metals

Authors: Hengxin Tan, Yiyang Jiang, Gregory T. McCandless, Julia Y. Chan, Binghai Yan

Abstract: The saddle point (van Hove singularity) exhibits a divergent density of states in 2D systems, leading to fascinating phenomena like strong correlations and unconventional superconductivity, yet it is seldom observed in 3D systems. In this work, we have found two types of 3D higher-order saddle points (HOSPs) in emerging 3D kagome metals, YbCo$_6$Ge$_6$ and MgCo$_6$Ge$_6$. Both HOSPs exhibit a singularity in their density of states, which is significantly enhanced compared to the ordinary saddle point. The HOSP near the Fermi energy generates a flat band extending over a large area in the Brillouin zone, potentially amplifying the correlation effect and fostering electronic instabilities. The two types of HOSPs exhibit distinct robustness upon element substitution and lattice distortions in these kagome compounds. Our work paves the way for engineering exotic band structures, such as saddle points and flat bands, and exploring interesting phenomena in Co-based kagome materials.

Submitted 8 May, 2024; originally announced May 2024.

Comments: 5 main pages + 17 Supplementary pages

arXiv:2404.14407 [pdf, other]

A covariant formulation for cosmological radiative transfer of the 21-cm line

Authors: Jennifer Y. H. Chan, Qin Han, Kinwah Wu, Jason D. McEwen

Abstract: The 21-cm hyperfine line of neutral hydrogen is a useful tool to probe the conditions of the Universe during the Dark Ages, Cosmic Dawn, and the Epoch of Reionisation. In most of the current calculations, the 21-cm line signals at given frequencies are computed using an integrated line-of-sight line opacity, with the correction for cosmological expansion. These calculations have not fully captured the line and continuum interactions in the radiative transfer, in response to evolution of the radiation field and the variations of thermal and dynamic properties of the line-of-sight medium. We construct a covariant formulation for the radiative transfer of the 21-cm line and derive the cosmological 21-cm line radiative transfer (C21LRT) equation. The formulation properly accounts for local emission and absorption processes and the interaction between the line and continuum when the radiation propagates across the expanding Universe to the present observer. Our C21LRT calculations show that methods simply summing the line optical depth could lead to errors of $5\%$ in the 21-cm signals for redshift $z \sim 12-35$ and of $>10\%$ for redshift $z \lesssim 8$. Proper covariant radiative transfer is therefore necessary for producing correct theoretical templates for extracting information on the structural evolution of the Universe through the Epoch of Reionisation from the 21-cm tomographic data.

Submitted 22 April, 2024; originally announced April 2024.

Comments: 16 pages, 11 figures, 3 tables

arXiv:2404.12361 [pdf, other]

Learning the Domain Specific Inverse NUFFT for Accelerated Spiral MRI using Diffusion Models

Authors: Trevor J. Chan, Chamith S. Rajapakse

Abstract: Deep learning methods for accelerated MRI achieve state-of-the-art results but largely ignore additional speedups possible with noncartesian sampling trajectories. To address this gap, we created a generative diffusion model-based reconstruction algorithm for multi-coil highly undersampled spiral MRI. This model uses conditioning during training as well as frequency-based guidance to ensure consistency between images and measurements. Evaluated on retrospective data, we show high quality (structural similarity > 0.87) in reconstructed images with ultrafast scan times (0.02 seconds for a 2D image). We use this algorithm to identify a set of optimal variable-density spiral trajectories and show large improvements in image quality compared to conventional reconstruction using the non-uniform fast Fourier transform. By combining efficient spiral sampling trajectories, multicoil imaging, and deep learning reconstruction, these methods could enable the extremely high acceleration factors needed for real-time 3D imaging.

Submitted 10 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.06630 [pdf, other]

An Energy Stable High-Order Cut Cell Discontinuous Galerkin Method with State Redistribution for Wave Propagation

Authors: Christina G. Taylor, Lucas C. Wilcox, Jesse Chan

Abstract: Cut meshes are a type of mesh that is formed by allowing embedded boundaries to "cut" a simple underlying mesh resulting in a hybrid mesh of cut and standard elements. While cut meshes can allow complex boundaries to be represented well regardless of the mesh resolution, their arbitrarily shaped and sized cut elements can present issues such as the small cell problem, where small cut elements can result in a severely restricted CFL condition. State redistribution, a technique developed by Berger and Giuliani [1], can be used to address the small cell problem. In this work, we pair state redistribution with a high-order discontinuous Galerkin scheme that is $L_2$ energy stable for arbitrary quadrature. We prove that state redistribution can be added to a provably $L_2$ energy stable discontinuous Galerkin method on a cut mesh without damaging the scheme's $L_2$ stability. We numerically verify the high order accuracy and stability of our scheme on two-dimensional wave propagation problems.

Submitted 23 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

arXiv:2403.06219 [pdf, other]

Affine Semigroup Algebras And Their Fibered Sums

Authors: C-Y. Jean Chan, I-Chiau Huang, Jung-Chen Liu

Abstract: We study affine semigroup rings as algebras over subsemigroup rings. From this relative viewpoint with respect to a given subsemigroup ring, the fibered sum of two affine semigroup algebras is constructed. Such a construction is compared to the tensor product and to the classical gluings of affine semigroup rings as defined in Rosales (1997). While the fibered sum can always be achieved, gluings of affine semigroup rings do not always exist. Therefore, we further investigate when the fibered sum of affine semigroup algebras gives rise to a gluing. A criterion is recovered in terms of the defining semigroups under which the gluing may take place.

Submitted 10 March, 2024; originally announced March 2024.

MSC Class: 13F65; 13B10; 20M25; 20M50

arXiv:2403.03004 [pdf, other]

Ultralight vector dark matter search using data from the KAGRA O3GK run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, et al. (1778 additional authors not shown)

Abstract: Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we present the result of a search for $U(1)_{B-L}$ gauge boson DM using the KAGRA data from auxiliary length channels during the first joint observation run together with GEO600. By applying our search pipeline, which takes into account the stochastic nature of ultralight DM, upper bounds on the coupling strength between the $U(1)_{B-L}$ gauge boson and ordinary matter are obtained for a range of DM masses. While our constraints are less stringent than those derived from previous experiments, this study demonstrates the applicability of our method to the lower-mass vector DM search, which is made difficult in this measurement by the short observation time compared to the auto-correlation time scale of DM.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 20 pages, 5 figures

Report number: LIGO-P2300250

arXiv:2402.14952 [pdf]

Implementations of Cooperative Games Under Non-Cooperative Solution Concepts

Authors: Justin Chan

Abstract: Cooperative games can be distinguished as non-cooperative games in which players can freely sign binding agreements to form coalitions. These coalitions inherit a joint strategy set and seek to maximize collective payoffs. When the payoffs to each coalition under some non-cooperative solution concept coincide with their value in the cooperative game, the cooperative game is said to be implementable and the non-cooperative game its implementation. This paper proves that all strictly superadditive partition function form games are implementable under Nash equilibrium and rationalizability; that all weakly superadditive characteristic function form games are implementable under Nash equilibrium; and that all weakly superadditive partition function form games are implementable under trembling hand perfect equilibrium. Discussion then proceeds on the appropriate choice of non-cooperative solution concept for the implementation.

Submitted 12 April, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

arXiv:2402.14476 [pdf, ps, other]

Quantifying neural network uncertainty under volatility clustering

Authors: Steven Y. K. Wong, Jennifer S. K. Chan, Lamiae Azizi

Abstract: Time-series with time-varying variance pose a unique challenge to uncertainty quantification (UQ) methods. Time-varying variance, such as volatility clustering as seen in financial time-series, can lead to a large mismatch between predicted uncertainty and forecast error. Building on recent advances in neural network UQ literature, we extend and simplify Deep Evidential Regression and Deep Ensembles into a unified framework to deal with UQ under the presence of volatility clustering. We show that a Scale Mixture Distribution is a simpler alternative to the Normal-Inverse-Gamma prior that provides a favorable complexity-accuracy trade-off. To illustrate the performance of our proposed approach, we apply it to two sets of financial time-series exhibiting volatility clustering: cryptocurrencies and U.S. equities.

Submitted 22 February, 2024; originally announced February 2024.

Comments: 38 pages

arXiv:2402.08812 [pdf, other]

Intelligent Canvas: Enabling Design-Like Exploratory Visual Data Analysis with Generative AI through Rapid Prototyping, Iteration and Curation

Authors: Zijian Ding, Joel Chan

Abstract: Complex data analysis inherently seeks unexpected insights through exploratory visual analysis methods, transcending logical, step-by-step processing. However, existing interfaces such as notebooks and dashboards have limitations in exploration and comparison for visual data analysis. Addressing these limitations, we introduce a "design-like" intelligent canvas environment integrating generative AI into data analysis, offering rapid prototyping, iteration, and comparative visualization management. Our dual contributions include the integration of generative AI components into a canvas interface, and empirical findings from a user study (N=10) evaluating the effectiveness of the canvas interface.

Submitted 21 March, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.08351 [pdf, other]

Wireless Channel Prediction via Gaussian Mixture Models

Authors: Nurettin Turan, Benedikt Böck, Kai Jie Chan, Benedikt Fesl, Friedrich Burmeister, Michael Joham, Gerhard Fettweis, Wolfgang Utschick

Abstract: In this work, we utilize a Gaussian mixture model (GMM) to capture the underlying probability density function (PDF) of the channel trajectories of moving mobile terminals (MTs) within the coverage area of a base station (BS) in an offline phase. We propose to leverage the same GMM for channel prediction in the online phase. Our proposed approach does not require signal-to-noise ratio (SNR)-specific training and allows for parallelization. Numerical simulations for both synthetic and measured channel data demonstrate the effectiveness of our proposed GMM-based channel predictor compared to state-of-the-art channel prediction methods.

Submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.03104 [pdf, other]

High-dimensional Bayesian Optimization via Covariance Matrix Adaptation Strategy

Authors: Lam Ngo, Huong Ha, Jeffrey Chan, Vu Nguyen, Hongyu Zhang

Abstract: Bayesian Optimization (BO) is an effective method for finding the global optimum of expensive black-box functions. However, it is well known that applying BO to high-dimensional optimization problems is challenging. To address this issue, a promising solution is to use a local search strategy that partitions the search domain into local regions with high likelihood of containing the global optimum, and then use BO to optimize the objective function within these regions. In this paper, we propose a novel technique for defining the local regions using the Covariance Matrix Adaptation (CMA) strategy. Specifically, we use CMA to learn a search distribution that can estimate the probabilities of data points being the global optimum of the objective function. Based on this search distribution, we then define the local regions consisting of data points with high probabilities of being the global optimum. Our approach serves as a meta-algorithm as it can incorporate existing black-box BO optimizers, such as BO, TuRBO, and BAxUS, to find the global optimum of the objective function within our derived local regions. We evaluate our proposed method on various benchmark synthetic and real-world problems. The results demonstrate that our method outperforms existing state-of-the-art techniques.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 31 pages, 17 figures

Journal ref: Transactions on Machine Learning Research 2024

arXiv:2402.00782 [pdf, other]

Dense Reward for Free in Reinforcement Learning from Human Feedback

Authors: Alex J. Chan, Hao Sun, Samuel Holt, Mihaela van der Schaar

Abstract: Reinforcement Learning from Human Feedback (RLHF) has been credited as the key advance that has allowed Large Language Models (LLMs) to effectively follow instructions and produce useful assistance. Classically, this involves generating completions from the LLM in response to a query before using a separate reward model to assign a score to the full completion. As an auto-regressive process, the LLM has to take many "actions" (selecting individual tokens) and only receives a single, sparse reward at the end of an episode, a setup that is known to be difficult to optimise in traditional reinforcement learning. In this work we leverage the fact that the reward model contains more information than just its scalar output, in particular, it calculates an attention map over tokens as part of the transformer architecture. We use these attention weights to redistribute the reward along the whole completion, effectively densifying the signal and highlighting the most important tokens, all without incurring extra computational cost or requiring any additional modelling. We demonstrate that, theoretically, this approach is equivalent to potential-based reward shaping, ensuring that the optimal policy remains unchanged. Empirically, we show that it stabilises training, accelerates the rate of learning, and, in practical cases, may lead to better local optima.

Submitted 1 February, 2024; originally announced February 2024.

arXiv:2401.11022 [pdf, other]

Formulating or Fixating: Effects of Examples on Problem Solving Vary as a Function of Example Presentation Interface Design

Authors: Joel Chan, Zijian Ding, Eesh Kamrah, Mark Fuge

Abstract: Interactive systems that facilitate exposure to examples can augment problem solving performance. However, designers of such systems are often faced with many practical design decisions about how users will interact with examples, with little clear theoretical guidance. To understand how example interaction design choices affect whether/how people benefit from examples, we conducted an experiment where 182 participants worked on a controlled analog to an exploratory creativity task, with access to examples of varying diversity and presentation interfaces. Task performance was worse when examples were presented in a list, compared to contextualized in the exploration space or shown in a dropdown list. Example lists were associated with more fixation, whereas contextualized examples were associated with using examples to formulate a model of the problem space to guide exploration. We discuss implications of these results for a theoretical framework that maps design choices to fundamental psychological mechanisms of creative inspiration from examples.

Submitted 23 January, 2024; v1 submitted 19 January, 2024; originally announced January 2024.

arXiv:2401.07156 [pdf]

doi 10.1021/acsnano.3c09234

Writing and detecting topological charges in exfoliated Fe$_{5-x}$GeTe$_2$

Authors: Alex Moon, Yue Li, Conor McKeever, Brian W. Casas, Moises Bravo, Wenkai Zheng, Juan Macy, Amanda K. Petford-Long, Gregory T. McCandless, Julia Y. Chan, Charudatta Phatak, Elton J. G. Santos, Luis Balicas

Abstract: Fe$_{5-x}$GeTe$_2$ is a promising two-dimensional (2D) van der Waals (vdW) magnet for practical applications, given its magnetic properties. These include Curie temperatures above room temperature, and topological spin textures (TST or both merons and skyrmions), responsible for a pronounced anomalous Hall effect (AHE) and its topological counterpart (THE), which can be harvested for spintronics. Here, we show that both the AHE and THE can be amplified considerably by just adjusting the thickness of exfoliated Fe$_{5-x}$GeTe$_2$, with THE becoming observable even in zero magnetic field due to a field-induced unbalance in topological charges. Using a complementary suite of techniques, including electronic transport, Lorentz transmission electron microscopy, and micromagnetic simulations, we reveal the emergence of substantial coercive fields upon exfoliation, which are absent in the bulk, implying thickness-dependent magnetic interactions that affect the TST. We detected a "magic" thickness $t \sim 30$ nm where the formation of TST is maximized, inducing large magnitudes for the topological charge density ($6.45 \times 10^{20}$ cm$^{-2}$), and the concomitant anomalous ($ρ_{xy}^{\text{A,max}} \simeq 22.6$ $μΩ$ cm) and topological ($ρ_{xy}^{\text{u,T}} \simeq 15$ $μΩ$ cm) Hall resistivities at $T \sim 120$ K. These values for $ρ_{xy}^{\text{A,max}}$ and $ρ_{xy}^{\text{u,T}}$ are higher than those found in magnetic topological insulators and, so far, the largest reported for 2D magnets. The hitherto unobserved THE under zero magnetic field could provide a platform for the writing and electrical detection of TST aiming at energy-efficient devices based on vdW ferromagnets.

Submitted 13 January, 2024; originally announced January 2024.

Comments: 45 pages with supporting information file, 5 main figures. Accepted ACS Nano 2024

arXiv:2401.01753

A Generative AI Assistant to Accelerate Cloud Migration

Authors: Amal Vaidya, Mohan Krishna Vankayalapati, Jacky Chan, Senad Ibraimoski, Sean Moran

Abstract: We present a tool that leverages generative AI to accelerate the migration of on-premises applications to the cloud. The Cloud Migration LLM accepts input from the user specifying the parameters of their migration, and outputs a migration strategy with an architecture diagram. A user study suggests that the migration LLM can assist inexperienced users in finding the right cloud migration profile, while avoiding complexities of a manual approach.

Submitted 3 January, 2024; originally announced January 2024.

Comments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission

arXiv:2312.12681 [pdf, other]

Imitation of Life: A Search Engine for Biologically Inspired Design

Authors: Hen Emuna, Nadav Borenstein, Xin Qian, Hyeonsu Kang, Joel Chan, Aniket Kittur, Dafna Shahaf

Abstract: Biologically Inspired Design (BID), or Biomimicry, is a problem-solving methodology that applies analogies from nature to solve engineering challenges. For example, Speedo engineers designed swimsuits based on shark skin. Finding relevant biological solutions for real-world problems poses significant challenges, both due to the limited biological knowledge engineers and designers typically possess and to the limited BID resources. Existing BID datasets are hand-curated and small, and scaling them up requires costly human annotations. In this paper, we introduce BARcode (Biological Analogy Retriever), a search engine for automatically mining bio-inspirations from the web at scale. Using advances in natural language understanding and data programming, BARcode identifies potential inspirations for engineering challenges. Our experiments demonstrate that BARcode can retrieve inspirations that are valuable to engineers and designers tackling real-world problems, as well as recover famous historical BID examples. We release data and code; we view BARcode as a step towards addressing the challenges that have historically hindered the practical application of BID to engineering innovation.

Submitted 19 December, 2023; originally announced December 2023.

Comments: To be published in the AAAI 2024 Proceedings Main Track

arXiv:2312.09080 [pdf, other]

Pseudodifferential Models for Ultrasound Waves with Fractional Attenuation

Authors: Sebastian Acosta, Jesse Chan, Raven Johnson, Benjamin Palacios

Abstract: To strike a balance between modeling accuracy and computational efficiency for simulations of ultrasound waves in soft tissues, we derive a pseudodifferential factorization of the wave operator with fractional attenuation. This factorization allows us to approximately solve the Helmholtz equation via one-way (transmission) or two-way (transmission and reflection) sweeping schemes tailored to high-frequency wave fields. We provide explicitly the three highest order terms of the pseudodifferential expansion to incorporate the well-known square-root first order symbol for wave propagation, the zeroth order symbol for amplitude modulation due to changes in wave speed and damping, and the next symbol to model fractional attenuation. We also propose wide-angle Padé approximations for the pseudodifferential operators corresponding to these three highest order symbols. Our analysis provides insights regarding the role played by the frequency and the Padé approximations in the estimation of error bounds. We also provide a proof-of-concept numerical implementation of the proposed method and test the error estimates numerically.

Submitted 8 April, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

MSC Class: 35S05; 35S15; 35L05; 41A21; 41A28

arXiv:2312.08453 [pdf, other]

Integrating Particle Flavor into Deep Learning Models for Hadronization

Authors: Jay Chan, Xiangyang Ju, Adam Kania, Benjamin Nachman, Vishnu Sangli, Andrzej Siodmok

Abstract: Hadronization models used in event generators are physics-inspired functions with many tunable parameters. Since we do not understand hadronization from first principles, there have been multiple proposals to improve the accuracy of hadronization models by utilizing more flexible parameterizations based on neural networks. These recent proposals have focused on the kinematic properties of hadrons, but a full model must also include particle flavor. In this paper, we show how to build a deep learning-based hadronization model that includes both kinematic (continuous) and flavor (discrete) degrees of freedom. Our approach is based on Generative Adversarial Networks and we show the performance within the context of the cluster hadronization model within the Herwig event generator.

Submitted 13 December, 2023; originally announced December 2023.

Comments: 9 pages, 4 figures

arXiv:2312.02401 [pdf, other]

Harmonizing Global Voices: Culturally-Aware Models for Enhanced Content Moderation

Authors: Alex J. Chan, José Luis Redondo García, Fabrizio Silvestri, Colm O'Donnel, Konstantina Palla

Abstract: Content moderation at scale faces the challenge of considering local cultural distinctions when assessing content. While global policies aim to maintain decision-making consistency and prevent arbitrary rule enforcement, they often overlook regional variations in interpreting natural language as expressed in content. In this study, we are looking into how moderation systems can tackle this issue by adapting to local comprehension nuances. We train large language models on extensive datasets of media news and articles to create culturally attuned models. The latter aim to capture the nuances of communication across geographies with the goal of recognizing cultural and societal variations in what is considered offensive content. We further explore the capability of these models to generate explanations for instances of content violation, aiming to shed light on how policy guidelines are perceived when cultural and societal contexts change. We find that training on extensive media datasets successfully induced cultural awareness and resulted in improvements in handling content violations on a regional basis. Additionally, these advancements include the ability to provide explanations that align with the specific local norms and nuances as evidenced by the annotators' preference in our conducted study. This multifaceted success reinforces the critical role of an adaptable content moderation approach in keeping pace with the ever-evolving nature of the content it oversees.

Submitted 4 December, 2023; originally announced December 2023.

Comments: 12 pages, 8 Figures. Supplementary material

arXiv:2311.15542 [pdf]

Arbitrary Engineering of Spatial Caustics with 3D-printed Metasurfaces

Authors: Xiaoyan Zhou, Hongtao Wang, Shuxi Liu, Hao Wang, John You En Chan, Cheng-Feng Pan, Daomu Zhao, Joel K. W. Yang, Cheng-Wei Qiu

Abstract: Caustics occur in diverse physical systems, spanning the nano-scale in electron microscopy to astronomical-scale in gravitational lensing. As envelopes of rays, optical caustics result in sharp edges or extended networks. Caustics in structured light, characterized by complex-amplitude distributions, have innovated numerous applications including particle manipulation, high-resolution imaging techniques, and optical communication. However, these applications have encountered limitations due to a major challenge in engineering caustic fields with customizable propagation trajectories and in-plane intensity profiles. Here, we introduce the compensation phase via 3D-printed metasurfaces to shape caustic fields with curved trajectories in free space. The in-plane caustic patterns can be preserved or morphed from one structure to another during propagation. Large-scale fabrication of these metasurfaces is enabled by the fast-prototyping and cost-effective two-photon polymerization lithography. Our optical elements with the ultra-thin profile and sub-millimeter extension offer a compact solution to generating caustic structured light for beam shaping, high-resolution microscopy, and light-matter-interaction studies.

Submitted 27 November, 2023; originally announced November 2023.

arXiv:2311.15385 [pdf, other]

Engineering Anomalously Large Electron Transport in Topological Semimetals

Authors: Vincent M. Plisson, Xiaohan Yao, Yaxian Wang, George Varnavides, Alexey Suslov, David Graf, Eun Sang Choi, Hung-Yu Yang, Yiping Wang, Marisa Romanelli, Grant McNamara, Birender Singh, Gregory T. McCandless, Julia Y. Chan, Prineha Narang, Fazel Tafti, Kenneth S. Burch

Abstract: Anomalous transport of topological semimetals has generated significant interest for applications in optoelectronics, nanoscale devices, and interconnects. Understanding the origin of novel transport is crucial to engineering the desired material properties, yet their orders of magnitude higher transport than single-particle mobilities remain unexplained. This work demonstrates that the dramatic mobility enhancements result from phonons primarily returning momentum to electrons due to phonon-electron dominating over phonon-phonon scattering. Proving this idea, proposed by Peierls in 1932, requires tuning electron and phonon dispersions without changing symmetry, topology, or disorder. This is achieved by combining de Haas-van Alphen (dHvA), electron transport, Raman scattering, and first-principles calculations in the topological semimetals MX$_2$ (M=Nb, Ta and X=Ge, Si). Replacing Ge with Si brings the transport mobilities from an order of magnitude larger than single particle ones to nearly balanced. This occurs without changing the crystal structure or topology and with small differences in disorder or Fermi surface. Simultaneously, Raman scattering and first-principles calculations establish phonon-electron dominated scattering only in the MGe$_2$ compounds. Thus, this study proves that phonon-drag is crucial to the transport properties of topological semimetals and provides insight to further engineer these materials.

Submitted 26 November, 2023; originally announced November 2023.

Comments: 12 pages, 5 figures

arXiv:2311.14110 [pdf, other]

When is Off-Policy Evaluation Useful? A Data-Centric Perspective

Authors: Hao Sun, Alex J. Chan, Nabeel Seedat, Alihan Hüyük, Mihaela van der Schaar

Abstract: Evaluating the value of a hypothetical target policy with only a logged dataset is important but challenging. On the one hand, it brings opportunities for safe policy improvement under high-stakes scenarios like clinical guidelines. On the other hand, such opportunities raise a need for precise off-policy evaluation (OPE). While previous work on OPE focused on improving the algorithm in value estimation, in this work, we emphasize the importance of the offline dataset, hence putting forward a data-centric framework for evaluating OPE problems. We propose DataCOPE, a data-centric framework for evaluating OPE, that answers the questions of whether and to what extent we can evaluate a target policy given a dataset. DataCOPE (1) forecasts the overall performance of OPE algorithms without access to the environment, which is especially useful before real-world deployment where evaluating OPE is impossible; (2) identifies the sub-group in the dataset where OPE can be inaccurate; (3) permits evaluations of datasets or data-collection strategies for OPE problems. Our empirical analysis of DataCOPE in the logged contextual bandit settings using healthcare datasets confirms its ability to evaluate both machine-learning and human expert policies like clinical guidelines.

Submitted 23 November, 2023; originally announced November 2023.

Comments: Off-Policy Evaluation, Data-Centric AI, Data-Centric Reinforcement Learning, Reinforcement Learning

arXiv:2311.10100 [pdf, other]

LenSiam: Self-Supervised Learning on Strong Gravitational Lens Images

Authors: Po-Wen Chang, Kuan-Wei Huang, Joshua Fagin, James Hung-Hsu Chan, Joshua Yao-Yu Lin

Abstract: Self-supervised learning has been known for learning good representations from data without the need for annotated labels. We explore the simple siamese (SimSiam) architecture for representation learning on strong gravitational lens images. Commonly used image augmentations tend to change lens properties; for example, zoom-in would affect the Einstein radius. To create image pairs representing the same underlying lens model, we introduce a lens augmentation method to preserve lens properties by fixing the lens model while varying the source galaxies. Our research demonstrates this lens augmentation works well with SimSiam for learning the lens image representation without labels, so we name it LenSiam. We also show that a pre-trained LenSiam model can benefit downstream tasks. We open-source our code and datasets at https://github.com/kuanweih/LenSiam.

Submitted 8 November, 2023; originally announced November 2023.

Comments: 5 pages, 2 figures. Accepted by NeurIPS 2023 AI for Science Workshop

arXiv:2311.07426 [pdf, other]

Optimising Human-AI Collaboration by Learning Convincing Explanations

Authors: Alex J. Chan, Alihan Huyuk, Mihaela van der Schaar

Abstract: Machine learning models are being increasingly deployed to take, or assist in taking, complicated and high-impact decisions, from quasi-autonomous vehicles to clinical decision support systems. This poses challenges, particularly when models have hard-to-detect failure modes and are able to take actions without oversight. In order to handle this challenge, we propose a method for a collaborative system that remains safe by having a human ultimately making decisions, while giving the model the best opportunity to convince and debate them with interpretable explanations. However, the most helpful explanation varies among individuals and may be inconsistent across stated preferences. To this end we develop an algorithm, Ardent, to efficiently learn a ranking through interaction and best assist humans complete a task. By utilising a collaborative approach, we can ensure safety and improve performance while addressing transparency and accountability concerns. Ardent enables efficient and effective decision-making by adapting to individual preferences for explanations, which we validate through extensive simulations alongside a user study involving a challenging image classification task, demonstrating consistent improvement over competing systems.

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2311.06416 [pdf, other]

TESLA-X: An effective method to search for sub-threshold lensed gravitational waves with a targeted population model

Authors: Alvin K. Y. Li, Juno C. L. Chan, Heather Fong, Aidan H. Y. Chong, Alan J. Weinstein, Jose M. Ezquiaga

Abstract: Strong gravitational lensing can produce copies of gravitational-wave signals from the same source with the same waveform morphologies but different amplitudes and arrival times. Some of these strongly-lensed gravitational-wave signals can be demagnified and become sub-threshold. We present TESLA-X, an enhanced approach to the original GstLAL-based TargetEd Subthreshold Lensing seArch (TESLA) method, for improving the detection efficiency of these potential sub-threshold lensed signals. TESLA-X utilizes lensed injections to generate a targeted population model and a targeted template bank. We compare the performance of a full template bank search, TESLA, and TESLA-X methods via a simulation campaign, and demonstrate the performance of TESLA-X in recovering lensed injections, particularly targeting a mock event. Our results show that the TESLA-X method achieves a maximum of $\sim 10\%$ higher search sensitivity compared to the TESLA method within the sub-threshold regime, presenting a step towards detecting the first lensed gravitational wave. TESLA-X will be employed for the LIGO-Virgo-KAGRA's collaboration-wide analysis to search for lensing signatures in the fourth observing run.

Submitted 4 June, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

arXiv:2311.03866 [pdf, other]

SCONE-GAN: Semantic Contrastive learning-based Generative Adversarial Network for an end-to-end image translation

Authors: Iman Abbasnejad, Fabio Zambetta, Flora Salim, Timothy Wiley, Jeffrey Chan, Russell Gallagher, Ehsan Abbasnejad

Abstract: SCONE-GAN presents an end-to-end image translation, which is shown to be effective for learning to generate realistic and diverse scenery images. Most current image-to-image translation approaches are devised as two mappings: a translation from the source to target domain and another to represent its inverse. While successful in many applications, these approaches may suffer from generating trivial solutions with limited diversity. That is because these methods learn more frequent associations rather than the scene structures. To mitigate the problem, we propose SCONE-GAN that utilises graph convolutional networks to learn the object dependencies, maintain the image structure and preserve its semantics while transferring images into the target domain. For more realistic and diverse image generation we introduce a style reference image. We enforce the model to maximize the mutual information between the style image and output. The proposed method explicitly maximizes the mutual information between the related patches, thus encouraging the generator to produce more diverse images. We validate the proposed algorithm for image-to-image translation and stylizing outdoor images. Both qualitative and quantitative results demonstrate the effectiveness of our approach on four datasets.

Submitted 7 November, 2023; originally announced November 2023.

Comments: 9 pages, 5 figures

arXiv:2311.00320 [pdf, other]

doi 10.1145/3586183.3606779

Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables

Authors: Bandhav Veluri, Malek Itani, Justin Chan, Takuya Yoshioka, Shyamnath Gollakota

Abstract: Imagine being able to listen to the birds chirping in a park without hearing the chatter from other hikers, or being able to block out traffic noise on a busy street while still being able to hear emergency sirens and car honks. We introduce semantic hearing, a novel capability for hearable devices that enables them to, in real-time, focus on, or ignore, specific sounds from real-world environments, while also preserving the spatial cues. To achieve this, we make two technical contributions: 1) we present the first neural network that can achieve binaural target sound extraction in the presence of interfering sounds and background noise, and 2) we design a training methodology that allows our system to generalize to real-world use. Results show that our system can operate with 20 sound classes and that our transformer-based network has a runtime of 6.56 ms on a connected smartphone. In-the-wild evaluation with participants in previously unseen indoor and outdoor scenarios shows that our proof-of-concept system can extract the target sounds and generalize to preserve the spatial cues in its binaural output. Project page with code: https://semantichearing.cs.washington.edu

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2310.17894 [pdf, other]

Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey

Authors: Weixu Zhang, Yifei Wang, Yuanfeng Song, Victor Junqiu Wei, Yuxing Tian, Yiyan Qi, Jonathan H. Chan, Raymond Chi-Wing Wong, Haiqin Yang

Abstract: The emergence of natural language processing has revolutionized the way users interact with tabular data, enabling a shift from traditional query languages and manual plotting to more intuitive, language-based interfaces. The rise of large language models (LLMs) such as ChatGPT and its successors has further advanced this field, opening new avenues for natural language processing techniques. This survey presents a comprehensive overview of natural language interfaces for tabular data querying and visualization, which allow users to interact with data using natural language queries. We introduce the fundamental concepts and techniques underlying these interfaces with a particular emphasis on semantic parsing, the key technology facilitating the translation from natural language to SQL queries or data visualization commands. We then delve into the recent advancements in Text-to-SQL and Text-to-Vis problems from the perspectives of datasets, methodologies, metrics, and system designs. This includes a deep dive into the influence of LLMs, highlighting their strengths, limitations, and potential for future improvements. Through this survey, we aim to provide a roadmap for researchers and practitioners interested in developing and applying natural language interfaces for data interaction in the era of large language models.

Submitted 19 May, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

Comments: 20 pages, 4 figures, 5 tables. Accepted by IEEE TKDE

arXiv:2310.14438 [pdf, ps, other]

BVARs and Stochastic Volatility

Authors: Joshua Chan

Abstract: Bayesian vector autoregressions (BVARs) are the workhorse in macroeconomic forecasting. Research in the last decade has established the importance of allowing time-varying volatility to capture both secular and cyclical variations in macroeconomic uncertainty. This recognition, together with the growing availability of large datasets, has propelled a surge in recent research in building stochastic volatility models suitable for large BVARs. Some of these new models are also equipped with additional features that are especially desirable for large systems, such as order invariance -- i.e., estimates are not dependent on how the variables are ordered in the BVAR -- and robustness against COVID-19 outliers. Estimation of these large, flexible models is made possible by the recently developed equation-by-equation approach that drastically reduces the computational cost of estimating large systems. Despite these recent advances, there remains much ongoing work, such as the development of parsimonious approaches for time-varying coefficients and other types of nonlinearities in large BVARs.

Submitted 22 October, 2023; originally announced October 2023.

arXiv:2310.06808 [pdf, ps, other]

doi 10.1002/bimj.201700199

Odds are the sign is right

Authors: Brian Knaeble, Julian Chan

Abstract: This article introduces a new condition based on odds ratios for sensitivity analysis. The analysis involves the average effect of a treatment or exposure on a response or outcome with estimates adjusted for and conditional on a single, unmeasured, dichotomous covariate. Results of statistical simulations are displayed to show that the odds ratio condition is as reliable as other commonly used conditions for sensitivity analysis. Other conditions utilize quantities reflective of a mediating covariate. The odds ratio condition can be applied when the covariate is a confounding variable. As an example application we use the odds ratio condition to analyze and interpret a positive association observed between Zika virus infection and birth defects.

Submitted 10 October, 2023; originally announced October 2023.

Journal ref: Biometrical Journal, 60(6) 1164-1171 (2018)

arXiv:2310.06322 [pdf, other]

Predicting Three Types of Freezing of Gait Events Using Deep Learning Models

Authors: Wen Tao Mo, Jonathan H. Chan

Abstract: Freezing of gait is a Parkinson's Disease symptom that episodically inflicts a patient with the inability to step or turn while walking. While medical experts have discovered various triggers and alleviating actions for freezing of gait, the underlying causes and prediction models are still being explored today. Current freezing of gait prediction models that utilize machine learning achieve high sensitivity and specificity in freezing of gait predictions based on time-series data; however, these models lack specifications on the type of freezing of gait events. We develop various deep learning models using the transformer encoder architecture plus Bidirectional LSTM layers and different feature sets to predict the three different types of freezing of gait events. The best performing model achieves a score of 0.427 on testing data, which would rank top 5 in Kaggle's Freezing of Gait prediction competition, hosted by THE MICHAEL J. FOX FOUNDATION. However, we also recognize overfitting in training data that could be potentially improved through pseudo labelling on additional data and model architecture simplification.

Submitted 10 October, 2023; originally announced October 2023.

Comments: 5 pages

arXiv:2309.15840 [pdf, other]

How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions

Authors: Lorenzo Pacchiardi, Alex J. Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin Gal, Owain Evans, Jan Brauner

Abstract: Large language models (LLMs) can "lie", which we define as outputting false statements despite "knowing" the truth in a demonstrable sense. LLMs might "lie", for example, when instructed to output misinformation. Here, we develop a simple lie detector that requires neither access to the LLM's activations (black-box) nor ground-truth knowledge of the fact in question. The detector works by asking a predefined set of unrelated follow-up questions after a suspected lie, and feeding the LLM's yes/no answers into a logistic regression classifier. Despite its simplicity, this lie detector is highly accurate and surprisingly general. When trained on examples from a single setting -- prompting GPT-3.5 to lie about factual questions -- the detector generalises out-of-distribution to (1) other LLM architectures, (2) LLMs fine-tuned to lie, (3) sycophantic lies, and (4) lies emerging in real-life scenarios such as sales. These results indicate that LLMs have distinctive lie-related behavioural patterns, consistent across architectures and contexts, which could enable general-purpose lie detection.

Submitted 26 September, 2023; originally announced September 2023.

arXiv:2309.14474 [pdf]

Gastro-Intestinal Tract Segmentation Using an Explainable 3D Unet

Authors: Kai Li, Jonathan Chan

Abstract: In treating gastrointestinal cancer using radiotherapy, the role of the radiation oncologist is to administer high doses of radiation, through x-ray beams, toward the tumor while avoiding the stomach and intestines. With the advent of precise radiation treatment technology such as the MR-Linac, oncologists can visualize the daily positions of the tumors and intestines, which may vary day to day. Before delivering radiation, radio oncologists must manually outline the position of the gastrointestinal organs in order to determine position and direction of the x-ray beam. This is a time consuming and labor intensive process that may substantially prolong a patient's treatment. A deep learning (DL) method can automate and expedite the process. However, many deep neural networks approaches currently in use are black-boxes which lack interpretability which render them untrustworthy and impractical in a healthcare setting. To address this, an emergent field of AI known as Explainable AI (XAI) may be incorporated to improve the transparency and viability of a model. This paper proposes a deep learning pipeline that incorporates XAI to address the challenges of organ segmentation.

Submitted 25 September, 2023; originally announced September 2023.

Comments: 5 pages, 8 figures, 13th Joint Symposium on Computational Intelligence (JSCI13)

arXiv:2309.12490 [pdf, other]

Bayesian improved cross entropy method with categorical mixture models

Authors: Jianpeng Chan, Iason Papaioannou, Daniel Straub

Abstract: We employ the Bayesian improved cross entropy (BiCE) method for rare event estimation in static networks and choose the categorical mixture as the parametric family to capture the dependence among network components. At each iteration of the BiCE method, the mixture parameters are updated through the weighted maximum a posteriori (MAP) estimate, which mitigates the overfitting issue of the standard improved cross entropy (iCE) method through a novel balanced prior, and we propose a generalized version of the expectation-maximization (EM) algorithm to approximate this weighted MAP estimate. The resulting importance sampling distribution is proved to be unbiased. For choosing a proper number of components $K$ in the mixture, we compute the Bayesian information criterion (BIC) of each candidate $K$ as a by-product of the generalized EM algorithm. The performance of the proposed method is investigated through a simple illustration, a benchmark study, and a practical application. In all these numerical examples, the BiCE method results in an efficient and accurate estimator that significantly outperforms the standard iCE method and the BiCE method with the independent categorical distribution.

Submitted 21 September, 2023; originally announced September 2023.

arXiv:2309.12164 [pdf, other]

Stratified Type Theory

Authors: Jonathan Chan, Stephanie Weirich

Abstract: A hierarchy of type universes is a rudimentary ingredient in the type theories of many proof assistants to prevent the logical inconsistency resulting from combining dependent functions and the type-in-type rule. In this work, we argue that a universe hierarchy is not the only option for a type theory with a type universe. Taking inspiration from Leivant's Stratified System F, we introduce Stratified Type Theory (StraTT), where rather than stratifying universes by levels, we stratify typing judgements and restrict the domain of dependent functions to strictly lower levels. Even with type-in-type, this restriction suffices to enforce consistency. In StraTT, we consider a number of extensions beyond just stratified dependent functions. First, the subsystem subStraTT employs McBride's crude-but-effective stratification (also known as displacement) as a simple form of level polymorphism where global definitions with concrete levels can be displaced uniformly to any higher level. Second, to recover some expressivity lost due to the restriction on dependent function domains, the full StraTT includes a separate nondependent function type with a "floating" domain whose level matches that of the overall function type. Finally, we have implemented a prototype type checker for StraTT extended with datatypes and inference for level and displacement annotations, along with a small core library.
We have proven subStraTT to be consistent and StraTT to be type safe, but consistency of the full StraTT remains an open problem, largely due to the interaction between floating functions and cumulativity of judgements. Nevertheless, we believe StraTT to be consistent, and as evidence have verified the failure of some well-known type-theoretic paradoxes using our implementation.

Submitted 7 April, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

Comments: 26 pages, 4 figures

ACM Class: D.3.1; F.4.1

arXiv:2309.04211 [pdf, other]

Counterfactual Explanations via Locally-guided Sequential Algorithmic Recourse

Authors: Edward A. Small, Jeffrey N. Clark, Christopher J. McWilliams, Kacper Sokol, Jeffrey Chan, Flora D. Salim, Raul Santos-Rodriguez

Abstract: Counterfactuals operationalised through algorithmic recourse have become a powerful tool to make artificial intelligence systems explainable. Conceptually, given an individual classified as y -- the factual -- we seek actions such that their prediction becomes the desired class y' -- the counterfactual. This process offers algorithmic recourse that is (1) easy to customise and interpret, and (2) directly aligned with the goals of each individual. However, the properties of a "good" counterfactual are still largely debated; it remains an open challenge to effectively locate a counterfactual along with its corresponding recourse. Some strategies use gradient-driven methods, but these offer no guarantees on the feasibility of the recourse and are open to adversarial attacks on carefully created manifolds. This can lead to unfairness and lack of robustness. Other methods are data-driven, which mostly addresses the feasibility problem at the expense of privacy, security and secrecy as they require access to the entire training data set. Here, we introduce LocalFACE, a model-agnostic technique that composes feasible and actionable counterfactual explanations using locally-acquired information at each step of the algorithmic recourse. Our explainer preserves the privacy of users by only leveraging data that it specifically requires to construct actionable algorithmic recourse, and protects the model by offering transparency solely in the regions deemed necessary for the intervention.

Submitted 8 September, 2023; originally announced September 2023.

Comments: 7 pages, 5 figures, 3 appendix pages

arXiv:2309.01214 [pdf]

Immersive Technologies in Virtual Companions: A Systematic Literature Review

Authors: Ziaullah Momand, Jonathan H. Chan, Pornchai Mongkolnam

Abstract: The emergence of virtual companions is transforming the evolution of intelligent systems that effortlessly cater to the unique requirements of users. These advanced systems not only take into account the user present capabilities, preferences, and needs but also possess the capability to adapt dynamically to changes in the environment, as well as fluctuations in the users emotional state or behavior. A virtual companion is an intelligent software or application that offers support, assistance, and companionship across various aspects of users lives. Various enabling technologies are involved in building virtual companion, among these, Augmented Reality (AR), and Virtual Reality (VR) are emerging as transformative tools. While their potential for use in virtual companions or digital assistants is promising, their applications in these domains remain relatively unexplored. To address this gap, a systematic review was conducted to investigate the applications of VR, AR, and MR immersive technologies in the development of virtual companions. A comprehensive search across PubMed, Scopus, and Google Scholar yielded 28 relevant articles out of a pool of 644. The review revealed that immersive technologies, particularly VR and AR, play a significant role in creating digital assistants, offering a wide range of applications that brings various facilities in the individuals life in areas such as addressing social isolation, enhancing cognitive abilities and dementia care, facilitating education, and more.
Additionally, AR and MR hold potential for enhancing Quality of life (QoL) within the context of virtual companion technology. The findings of this review provide a valuable foundation for further research in this evolving field.

Submitted 3 September, 2023; originally announced September 2023.

arXiv:2308.14051 [pdf]

3D Printed Multilayer Structures for High Numerical Aperture Achromatic Metalenses

Authors: Cheng-Feng Pan, Hao Wang, Hongtao Wang, Parvathi Nair S, Qifeng Ruan, Simon Wredh, Yujie Ke, John You En Chan, Wang Zhang, Cheng-Wei Qiu, Joel K. W. Yang

Abstract: Flat optics consisting of nanostructures of high-refractive-index materials produce lenses with thin form factors that tend to operate only at specific wavelengths. Recent attempts to achieve achromatic lenses uncover a trade-off between the numerical aperture (NA) and bandwidth, which limits performance. Here we propose a new approach to design high NA, broadband and polarization-insensitive multilayer achromatic metalenses (MAM). We combine topology optimization and full wave simulations to inversely design MAMs and fabricate the structures in low-refractive-index materials by two-photon polymerization lithography. MAMs measuring 20 micrometer in diameter operating in the visible range of 400-800 nm with 0.5 NA and 0.7 NA were achieved with efficiencies of up to 42%. We demonstrate broadband imaging performance of the fabricated MAM under white light, and RGB narrowband illuminations. These results highlight the potential of the 3D printed multilayer structures for realizing broadband and multi-functional meta-devices with inverse design.

Submitted 27 August, 2023; originally announced August 2023.

Comments: 37 pages, 15 figures

arXiv:2308.13755 [pdf, other]

doi 10.1007/s10618-023-00963-3

i-Align: an interpretable knowledge graph alignment model

Authors: Bayu Distiawan Trisedya, Flora D Salim, Jeffrey Chan, Damiano Spina, Falk Scholer, Mark Sanderson

Abstract: Knowledge graphs (KGs) are becoming essential resources for many downstream applications. However, their incompleteness may limit their potential. Thus, continuous curation is needed to mitigate this problem. One of the strategies to address this problem is KG alignment, i.e., forming a more complete KG by merging two or more KGs. This paper proposes i-Align, an interpretable KG alignment model. Unlike the existing KG alignment models, i-Align provides an explanation for each alignment prediction while maintaining high alignment performance. Experts can use the explanation to check the correctness of the alignment prediction. Thus, the high quality of a KG can be maintained during the curation process (e.g., the merging process of two KGs). To this end, a novel Transformer-based Graph Encoder (Trans-GE) is proposed as a key component of i-Align for aggregating information from entities' neighbors (structures). Trans-GE uses Edge-gated Attention that combines the adjacency matrix and the self-attention matrix to learn a gating mechanism to control the information aggregation from the neighboring entities. It also uses historical embeddings, allowing Trans-GE to be trained over mini-batches, or smaller sub-graphs, to address the scalability issue when encoding a large KG. Another component of i-Align is a Transformer encoder for aggregating entities' attributes. This way, i-Align can generate explanations in the form of a set of the most influential attributes/neighbors based on attention weights.
Extensive experiments are conducted to show the power of i-Align. The experiments include several aspects, such as the model's effectiveness for aligning KGs, the quality of the generated explanations, and its practicality for aligning large KGs. The results show the effectiveness of i-Align in these aspects.

Submitted 25 August, 2023; originally announced August 2023.

Comments: Data Min Knowl Disc (2023)

arXiv:2308.03822 [pdf, other]

Search for Eccentric Black Hole Coalescences during the Third Observing Run of LIGO and Virgo

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, et al. (1750 additional authors not shown)

Abstract: Despite the growing number of confident binary black hole coalescences observed through gravitational waves so far, the astrophysical origin of these binaries remains uncertain. Orbital eccentricity is one of the clearest tracers of binary formation channels. Identifying binary eccentricity, however, remains challenging due to the limited availability of gravitational waveforms that include effects of eccentricity. Here, we present observational results for a waveform-independent search sensitive to eccentric black hole coalescences, covering the third observing run (O3) of the LIGO and Virgo detectors. We identified no new high-significance candidates beyond those that were already identified with searches focusing on quasi-circular binaries. We determine the sensitivity of our search to high-mass (total mass $M>70$ $M_\odot$) binaries covering eccentricities up to 0.3 at 15 Hz orbital frequency, and use this to compare model predictions to search results. Assuming all detections are indeed quasi-circular, for our fiducial population model, we place an upper limit for the merger rate density of high-mass binaries with eccentricities $0 < e \leq 0.3$ at $0.33$ Gpc$^{-3}$ yr$^{-1}$ at 90\% confidence level.

Submitted 7 August, 2023; originally announced August 2023.

Comments: 24 pages, 5 figures

Report number: LIGO-P2300080

arXiv:2307.12089 [pdf, other]

High order entropy stable schemes for the quasi-one-dimensional shallow water and compressible Euler equations

Authors: Jesse Chan, Khemraj Shukla, Xinhui Wu, Ruofeng Liu, Prani Nalluri

Abstract: High order schemes are known to be unstable in the presence of shock discontinuities or under-resolved solution features for nonlinear conservation laws. Entropy stable schemes address this instability by ensuring that physically relevant solutions satisfy a semi-discrete entropy inequality independently of discretization parameters. This work extends high order entropy stable schemes to the quasi-1D shallow water equations and the quasi-1D compressible Euler equations, which model one-dimensional flows through channels or nozzles with varying width. We introduce new non-symmetric entropy conservative finite volume fluxes for both sets of quasi-1D equations, as well as a generalization of the entropy conservation condition to non-symmetric fluxes. When combined with an entropy stable interface flux, the resulting schemes are high order accurate, conservative, and semi-discretely entropy stable. For the quasi-1D shallow water equations, the resulting schemes are also well-balanced.

Submitted 10 January, 2024; v1 submitted 22 July, 2023; originally announced July 2023.

Showing 1–50 of 443 results for author: Chan, J