-
ProDepth: Boosting Self-Supervised Multi-Frame Monocular Depth with Probabilistic Fusion
Authors:
Sungmin Woo,
Wonjoon Lee,
Woo Jin Kim,
Dogyoon Lee,
Sangyoun Lee
Abstract:
Self-supervised multi-frame monocular depth estimation relies on the geometric consistency between successive frames under the assumption of a static scene. However, the presence of moving objects in dynamic scenes introduces inevitable inconsistencies, causing misaligned multi-frame feature matching and misleading self-supervision during training. In this paper, we propose a novel framework called ProDepth, which effectively addresses the mismatch problem caused by dynamic objects using a probabilistic approach. We initially deduce the uncertainty associated with static scene assumption by adopting an auxiliary decoder. This decoder analyzes inconsistencies embedded in the cost volume, inferring the probability of areas being dynamic. We then directly rectify the erroneous cost volume for dynamic areas through a Probabilistic Cost Volume Modulation (PCVM) module. Specifically, we derive probability distributions of depth candidates from both single-frame and multi-frame cues, modulating the cost volume by adaptively fusing those distributions based on the inferred uncertainty. Additionally, we present a self-supervision loss reweighting strategy that not only masks out incorrect supervision with high uncertainty but also mitigates the risks in remaining possible dynamic areas in accordance with the probability. Our proposed method excels over state-of-the-art approaches in all metrics on both Cityscapes and KITTI datasets, and demonstrates superior generalization ability on the Waymo Open dataset.
Submitted 12 July, 2024;
originally announced July 2024.
-
Enhancing Training Efficiency Using Packing with Flash Attention
Authors:
Achintya Kundu,
Rhui Dih Lee,
Laura Wynter,
Raghu Kiran Ganti
Abstract:
Padding is often used when tuning LLMs: special tokens are added to shorter training examples to match the length of the longest sequence in each batch. While this ensures uniformity for batch processing, it introduces inefficiencies by including irrelevant padding tokens in the computation and wastes GPU resources. On the other hand, the Hugging Face SFT trainer offers the option to use packing to combine multiple training examples up to the maximum sequence length. This allows for maximal utilization of GPU resources. However, without proper masking of each packed training example, attention will not be computed correctly when using the SFT trainer. We enable and then analyse packing and Flash Attention with proper attention masking of each example and show the benefits of this training paradigm.
Submitted 12 July, 2024;
originally announced July 2024.
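The masking requirement described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `packed_causal_mask` and `cumulative_lengths` are hypothetical, and NumPy is used only to show which tokens may attend to which when several examples are packed into one sequence.

```python
import numpy as np

def packed_causal_mask(lengths):
    """Return a boolean (T, T) mask for examples packed end to end.

    mask[i, j] is True when token i may attend to token j, i.e. both
    tokens come from the same packed example and j is not after i.
    """
    # Label every token with the index of the example it came from.
    seq_ids = np.repeat(np.arange(len(lengths)), lengths)
    same_example = seq_ids[:, None] == seq_ids[None, :]
    total = seq_ids.size
    # Lower-triangular matrix encodes the usual causal constraint.
    causal = np.tril(np.ones((total, total), dtype=bool))
    return same_example & causal

def cumulative_lengths(lengths):
    """Boundaries [0, l1, l1 + l2, ...] of the packed examples."""
    return np.concatenate([[0], np.cumsum(lengths)]).astype(np.int32)

# Two examples of lengths 3 and 2 packed into one sequence of 5 tokens.
mask = packed_causal_mask([3, 2])
bounds = cumulative_lengths([3, 2])
```

Without the `same_example` term, the first token of the second example would attend to the whole first example, which is the incorrect behaviour the abstract warns about. Variable-length attention kernels commonly take the boundaries in the cumulative form returned by `cumulative_lengths` instead of a dense mask; whether that matches the setup in the paper is an assumption here.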
-
Accelerating Eigenvalue Computation for Nuclear Structure Calculations via Perturbative Corrections
Authors:
Dong Min Roh,
Esmond Ng,
Chao Yang,
Dean Lee,
Pieter Maris,
James P. Vary
Abstract:
We present a new method for computing the lowest few eigenvalues and the corresponding eigenvectors of a nuclear many-body Hamiltonian represented in a truncated configuration interaction subspace, i.e., the no-core shell model (NCSM). The method uses the hierarchical structure of the NCSM Hamiltonian to partition the Hamiltonian as the sum of two matrices. The first matrix corresponds to the Hamiltonian represented in a small configuration space, whereas the second is viewed as the perturbation to the first matrix. Eigenvalues and eigenvectors of the first matrix can be computed efficiently. Perturbative corrections to the eigenvectors of the first matrix can be obtained from the solutions of a sequence of linear systems of equations defined in the small configuration space. These correction vectors can be combined with the approximate eigenvectors of the first matrix to construct a subspace from which more accurate approximations of the desired eigenpairs can be obtained. We call this method a Subspace Projection with Perturbative Corrections (SPPC) method. We show by numerical examples that the SPPC method can be more efficient than conventional iterative methods for solving large-scale eigenvalue problems such as the Lanczos, block Lanczos and the locally optimal block preconditioned conjugate gradient (LOBPCG) method. The method can also be combined with other methods to avoid convergence stagnation.
Submitted 11 July, 2024;
originally announced July 2024.
-
Centrality dependence of Lévy-stable two-pion Bose-Einstein correlations in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions
Authors:
PHENIX Collaboration,
N. J. Abdulameer,
U. Acharya,
A. Adare,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
R. Akimoto,
H. Al-Ta'ani,
J. Alexander,
A. Angerami,
K. Aoki,
N. Apadula,
Y. Aramaki,
H. Asano,
E. C. Aschenauer,
E. T. Atomssa,
T. C. Awes,
B. Azmoun,
V. Babintsev,
M. Bai,
B. Bannier,
K. N. Barish,
B. Bassalleck,
S. Bathe
, et al. (377 additional authors not shown)
Abstract:
The PHENIX experiment measured the centrality dependence of two-pion Bose-Einstein correlation functions in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The data are well represented by Lévy-stable source distributions. The extracted source parameters are the correlation-strength parameter $λ$, the Lévy index of stability $α$, and the Lévy-scale parameter $R$ as a function of transverse mass $m_T$ and centrality. The $λ(m_T)$ parameter is constant at larger values of $m_T$, but decreases as $m_T$ decreases. The Lévy scale parameter $R(m_T)$ decreases with $m_T$ and exhibits proportionality to the length scale of the nuclear overlap region. The Lévy exponent $α(m_T)$ is independent of $m_T$ within uncertainties in each investigated centrality bin, but shows a clear centrality dependence. At all centralities, the Lévy exponent $α$ is significantly different from that of Gaussian ($α=2$) or Cauchy ($α=1$) source distributions. Comparisons to the predictions of Monte-Carlo simulations of resonance-decay chains show that in all but the most peripheral centrality class (50%-60%), the obtained results are inconsistent with the measurements, unless a significant reduction of the in-medium mass of the $η'$ meson is included. In each centrality class, the best value of the in-medium $η'$ mass is compared to the mass of the $η$ meson, as well as to several theoretical predictions that consider restoration of $U_A(1)$ symmetry in hot hadronic matter.
Submitted 11 July, 2024;
originally announced July 2024.
-
Sparse-DeRF: Deblurred Neural Radiance Fields from Sparse View
Authors:
Dogyoon Lee,
Donghyeong Kim,
Jungho Lee,
Minhyeok Lee,
Seunghoon Lee,
Sangyoun Lee
Abstract:
Recent studies construct deblurred neural radiance fields (DeRF) using dozens of blurry images, which is not practical when only a limited number of blurry images are available. This paper focuses on constructing DeRF from sparse views for more pragmatic real-world scenarios. As observed in our experiments, establishing DeRF from sparse views proves to be a more challenging problem due to the inherent complexity arising from the simultaneous optimization of blur kernels and NeRF from sparse views. Sparse-DeRF successfully regularizes the complicated joint optimization, presenting alleviated overfitting artifacts and enhanced quality on radiance fields. The regularization consists of three key components: surface smoothness, which helps the model accurately predict the scene structure utilizing unseen and additional hidden rays derived from the blur kernel based on statistical tendencies of the real world; modulated gradient scaling, which helps the model adjust the amount of the backpropagated gradient according to the arrangements of scene objects; and perceptual distillation, which improves the perceptual quality by overcoming the ill-posed multi-view inconsistency of image deblurring and distilling the pre-filtered information, compensating for the lack of clean information in blurry images. We demonstrate the effectiveness of Sparse-DeRF with extensive quantitative and qualitative experimental results by training DeRF from 2-view, 4-view, and 6-view blurry images.
Submitted 9 July, 2024;
originally announced July 2024.
-
A third-order finite difference weighted essentially non-oscillatory scheme with shallow neural network
Authors:
Kwanghyuk Park,
Xinjuan Chen,
Dongjin Lee,
Jiaxi Gu,
Jae-Hun Jung
Abstract:
In this paper, we introduce a finite difference weighted essentially non-oscillatory (WENO) scheme based on a neural network for hyperbolic conservation laws. We employ supervised learning and design two loss functions, one with the mean squared error and the other with the mean squared logarithmic error, where the WENO3-JS weights are computed as the labels. Each loss function consists of two components: the first compares the weights from the neural network with the WENO3-JS weights, while the second matches the output weights of the neural network to the linear weights. The first component forces the neural network to follow the WENO properties, implying that there is no need for a post-processing layer. Additionally, the second component leads to better performance around discontinuities. As the neural network structure, we choose a shallow neural network (SNN) for computational efficiency, with a Delta layer consisting of the normalized undivided differences. The resulting WENO3-SNN schemes show superior results in one-dimensional examples and improved behavior in two-dimensional examples, compared with the simulations from WENO3-JS and WENO3-Z.
Submitted 10 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
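The WENO3-JS weights that this abstract uses as training labels can be computed with the classical Jiang-Shu formulas. The sketch below shows that label computation only, for the reconstruction at the right cell interface; the function name `weno3_js_weights` and the default `eps` are illustrative choices, not taken from the paper.

```python
import numpy as np

def weno3_js_weights(u_left, u_center, u_right, eps=1e-6):
    """Nonlinear WENO3-JS weights for the interface value at x_{i+1/2}.

    Stencil 0 uses cells (i-1, i); stencil 1 uses cells (i, i+1).
    """
    # Linear (optimal) weights that give third-order accuracy.
    d = np.array([1.0 / 3.0, 2.0 / 3.0])
    # Smoothness indicators: squared undivided differences per stencil.
    beta = np.array([(u_center - u_left) ** 2, (u_right - u_center) ** 2])
    # Unnormalized weights penalize stencils with large variation.
    alpha = d / (eps + beta) ** 2
    return alpha / alpha.sum()

# Smooth (linear) data versus data with a jump between cells i and i+1.
w_smooth = weno3_js_weights(0.0, 1.0, 2.0)
w_jump = weno3_js_weights(0.0, 0.0, 1.0)
```

On smooth data both smoothness indicators agree and the weights fall back to the linear weights, which is the behaviour the second loss component in the abstract encourages; near a jump almost all weight moves to the stencil that avoids it.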
-
Regret Analysis of Multi-task Representation Learning for Linear-Quadratic Adaptive Control
Authors:
Bruce D. Lee,
Leonardo F. Toso,
Thomas T. Zhang,
James Anderson,
Nikolai Matni
Abstract:
Representation learning is a powerful tool that enables learning over large multitudes of agents or domains by enforcing that all agents operate on a shared set of learned features. However, many robotics or controls applications that would benefit from collaboration operate in settings with changing environments and goals, whereas most guarantees for representation learning are stated for static settings. Toward rigorously establishing the benefit of representation learning in dynamic settings, we analyze the regret of multi-task representation learning for linear-quadratic control. This setting introduces unique challenges. Firstly, we must account for and balance the $\textit{misspecification}$ introduced by an approximate representation. Secondly, we cannot rely on the parameter update schemes of single-task online LQR, for which least-squares often suffices, and must devise a novel scheme to ensure sufficient improvement. We demonstrate that for settings where exploration is "benign", the regret of any agent after $T$ timesteps scales as $\tilde{\mathcal O}(\sqrt{T/H})$, where $H$ is the number of agents. In settings with "difficult" exploration, the regret scales as $\tilde{\mathcal O}(\sqrt{d_u d_θ} \sqrt{T} + T^{3/4}/H^{1/5})$, where $d_x$ is the state-space dimension, $d_u$ is the input dimension, and $d_θ$ is the task-specific parameter count. In both cases, by comparing to the minimax single-task regret $\tilde{\mathcal O}(\sqrt{d_x d_u^2}\sqrt{T})$, we see a benefit of a large number of agents. Notably, in the difficult exploration case, by sharing a representation across tasks, the effective task-specific parameter count can often be small, $d_θ < d_x d_u$. Lastly, we provide numerical validation of the trends we predict.
Submitted 8 July, 2024;
originally announced July 2024.
-
Improved limit on neutrinoless double beta decay of $^{100}$Mo from AMoRE-I
Authors:
A. Agrawal,
V. V. Alenkov,
P. Aryal,
J. Beyer,
B. Bhandari,
R. S. Boiko,
K. Boonin,
O. Buzanov,
C. R. Byeon,
N. Chanthima,
M. K. Cheoun,
J. S. Choe,
Seonho Choi,
S. Choudhury,
J. S. Chung,
F. A. Danevich,
M. Djamal,
D. Drung,
C. Enss,
A. Fleischmann,
A. M. Gangapshev,
L. Gastaldo,
Y. M. Gavrilyuk,
A. M. Gezhaev,
O. Gileva
, et al. (83 additional authors not shown)
Abstract:
AMoRE searches for the signature of neutrinoless double beta decay of $^{100}$Mo with a 100 kg sample of enriched $^{100}$Mo. Scintillating molybdate crystals coupled with a metallic magnetic calorimeter operate at milli-Kelvin temperatures to measure the energy of electrons emitted in the decay. As a demonstration of the full-scale AMoRE, we conducted AMoRE-I, a pre-experiment with 18 molybdate crystals, at the Yangyang Underground Laboratory for over two years. The exposure was 8.02 kg$\cdot$year (or 3.89 kg$_{\mathrm{^{100}Mo}}\cdot$year) and the total background rate near the Q-value was 0.025 $\pm$ 0.002 counts/keV/kg/year. We observed no indication of $0νββ$ decay and report a new lower limit on the half-life of $^{100}$Mo $0νββ$ decay of $T^{0ν}_{1/2}>3.0\times10^{24}~\mathrm{years}$ at 90% confidence level. The effective Majorana mass limit range is $m_{ββ}<$(210--610) meV using nuclear matrix elements estimated in the framework of different models, including the recent shell model calculations.
Submitted 8 July, 2024;
originally announced July 2024.
-
Primordial perturbations in Type III hilltop inflation models
Authors:
Chia-Min Lin,
Harish Dhananjay Nalla,
Chen-Pin Yeh,
Da-Shin Lee
Abstract:
We analytically compute the power spectrum of primordial curvature perturbations in Type III hilltop inflation models under the slow-roll approximation. The model parameters are constrained using current Cosmic Microwave Background (CMB) data. The curvature perturbations that exit the horizon at small scales show sufficiently large amplitudes to produce primordial black holes (PBHs). We then consider the quantum one-loop corrections in these models from both the self-interaction of the inflaton and its interaction with the waterfall field. We show the loop corrections in both cases for 60 e-folds of inflation are negligible, ensuring the tree-level results are reliable within the chosen parameter regime.
Submitted 5 July, 2024;
originally announced July 2024.
-
Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Knowledge
Authors:
Young-Jun Lee,
Dokyong Lee,
Junyoung Youn,
Kyeongjin Oh,
Byungsoo Ko,
Jonghwan Hyeon,
Ho-Jin Choi
Abstract:
Humans share a wide variety of images related to their personal experiences within conversations via instant messaging tools. However, existing works focus on (1) image-sharing behavior in singular sessions, leading to limited long-term social interaction, and (2) a lack of personalized image-sharing behavior. In this work, we introduce Stark, a large-scale long-term multi-modal conversation dataset that covers a wide range of social personas in a multi-modality format, time intervals, and images. To construct Stark automatically, we propose a novel multi-modal contextualization framework, Mcu, that generates long-term multi-modal dialogue distilled from ChatGPT and our proposed Plan-and-Execute image aligner. Using our Stark, we train a multi-modal conversation model, Ultron 7B, which demonstrates impressive visual imagination ability. Furthermore, we demonstrate the effectiveness of our dataset in human evaluation. We make our source code and dataset publicly available.
Submitted 4 July, 2024;
originally announced July 2024.
-
CRiM-GS: Continuous Rigid Motion-Aware Gaussian Splatting from Motion Blur Images
Authors:
Junghe Lee,
Donghyeong Kim,
Dogyoon Lee,
Suhwan Cho,
Sangyoun Lee
Abstract:
Neural radiance fields (NeRFs) have received significant attention due to their high-quality novel view rendering ability, prompting research to address various real-world cases. One critical challenge is the camera motion blur caused by camera movement during exposure time, which prevents accurate 3D scene reconstruction. In this study, we propose continuous rigid motion-aware Gaussian splatting (CRiM-GS) to reconstruct accurate 3D scenes from blurry images with real-time rendering speed. Considering the actual camera motion blurring process, which consists of complex motion patterns, we predict the continuous movement of the camera based on neural ordinary differential equations (ODEs). Specifically, we leverage rigid body transformations to model the camera motion with proper regularization, preserving the shape and size of the object. Furthermore, we introduce a continuous deformable 3D transformation in the SE(3) field to adapt the rigid body transformation to real-world problems by ensuring a higher degree of freedom. By revisiting fundamental camera theory and employing advanced neural network training techniques, we achieve accurate modeling of continuous camera trajectories. We conduct extensive experiments, demonstrating state-of-the-art performance both quantitatively and qualitatively on benchmark datasets.
Submitted 4 July, 2024;
originally announced July 2024.
-
ConPR: Ongoing Construction Site Dataset for Place Recognition
Authors:
Dongjae Lee,
Minwoo Jung,
Ayoung Kim
Abstract:
Place recognition, an essential challenge in computer vision and robotics, involves identifying previously visited locations. Despite algorithmic progress, challenges related to appearance change persist, with existing datasets often focusing on seasonal and weather variations but overlooking terrain changes. Understanding terrain alterations becomes critical for effective place recognition, given the aging infrastructure and ongoing city repairs. For real-world applicability, the comprehensive evaluation of algorithms must consider spatial dynamics. To address existing limitations, we present a novel multi-session place recognition dataset acquired from an active construction site. Our dataset captures ongoing construction progress through multiple data collections, facilitating evaluation in dynamic environments. It includes camera images, LiDAR point cloud data, and IMU data, enabling visual and LiDAR-based place recognition techniques, and supporting sensor fusion. Additionally, we provide ground truth information for range-based place recognition evaluation. Our dataset aims to advance place recognition algorithms in challenging and dynamic settings. Our dataset is available at https://github.com/dongjae0107/ConPR.
Submitted 4 July, 2024;
originally announced July 2024.
-
Cactus: Towards Psychological Counseling Conversations using Cognitive Behavioral Theory
Authors:
Suyeon Lee,
Sunghwan Kim,
Minju Kim,
Dongjin Kang,
Dongil Yang,
Harim Kim,
Minseok Kang,
Dayi Jung,
Min Hee Kim,
Seungbeen Lee,
Kyoung-Mee Chung,
Youngjae Yu,
Dongha Lee,
Jinyoung Yeo
Abstract:
Recently, the demand for psychological counseling has significantly increased as more individuals express concerns about their mental health. This surge has accelerated efforts to improve the accessibility of counseling by using large language models (LLMs) as counselors. To ensure client privacy, training open-source LLMs faces a key challenge: the absence of realistic counseling datasets. To address this, we introduce Cactus, a multi-turn dialogue dataset that emulates real-life interactions using the goal-oriented and structured approach of Cognitive Behavioral Therapy (CBT). We create a diverse and realistic dataset by designing clients with varied, specific personas, and having counselors systematically apply CBT techniques in their interactions. To assess the quality of our data, we benchmark against established psychological criteria used to evaluate real counseling sessions, ensuring alignment with expert evaluations. Experimental results demonstrate that Camel, a model trained with Cactus, outperforms other models in counseling skills, highlighting its effectiveness and potential as a counseling agent. We make our data, model, and code publicly available.
Submitted 3 July, 2024;
originally announced July 2024.
-
Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity
Authors:
Qian Yu,
Yining Wang,
Baihe Huang,
Qi Lei,
Jason D. Lee
Abstract:
Optimization of convex functions under stochastic zeroth-order feedback has been a major and challenging question in online learning. In this work, we consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm has access only to noisy evaluations of the objective function at the points it queries. We provide the first tight characterization of the rate of the minimax simple regret by developing matching upper and lower bounds. We propose an algorithm that features a combination of a bootstrapping stage and a mirror-descent stage. Our main technical innovation consists of a sharp characterization of the spherical-sampling gradient estimator under higher-order smoothness conditions, which allows the algorithm to optimally balance the bias-variance tradeoff, and a new iterative method for the bootstrapping stage, which maintains the performance for unbounded Hessian.
Submitted 27 June, 2024;
originally announced June 2024.
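A spherical-sampling gradient estimator of the kind this abstract refers to can be sketched in a few lines. The version below is a common two-point form from the zeroth-order optimization literature; the paper's exact estimator and its choice of smoothing radius may differ, and the name `sphere_gradient_estimate` is hypothetical.

```python
import numpy as np

def sphere_gradient_estimate(f, x, delta, rng):
    """One-sample gradient estimate from two function evaluations.

    f     : callable returning a (possibly noisy) scalar value
    x     : query point, 1-D array of dimension d
    delta : smoothing radius; larger values reduce the effect of noise
            but increase the bias from higher-order terms
    rng   : numpy random Generator
    """
    d = x.size
    # Normalizing a Gaussian vector gives a uniform direction on the sphere.
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    # Symmetric finite difference along u, rescaled by the dimension.
    slope = (f(x + delta * u) - f(x - delta * u)) / (2.0 * delta)
    return d * slope * u

# Average many single-sample estimates for a simple quadratic objective.
rng = np.random.default_rng(0)
objective = lambda z: float(z @ z)
x0 = np.array([1.0, -1.0])
estimate = np.mean(
    [sphere_gradient_estimate(objective, x0, 0.1, rng) for _ in range(5000)],
    axis=0,
)
```

The `delta` parameter is where the bias-variance tradeoff mentioned in the abstract shows up: with noisy evaluations the noise term is divided by `delta`, while the finite-difference bias grows with it.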
-
CmWave and Sub-THz: Key Radio Enablers and Complementary Spectrum for 6G
Authors:
Mayur V. Katwe,
Aryan Kaushik,
Keshav Singh,
Marco Di Renzo,
Shu Sun,
Doohwan Lee,
Ana G. Armada,
Yonina C. Eldar,
Octavia A. Dobre,
Theodore S. Rappaport
Abstract:
Sixth-generation (6G) networks are poised to revolutionize communication by exploring alternative spectrum options, aiming to capitalize on strengths while mitigating limitations in current fifth-generation (5G) spectrum. This paper explores the potential opportunities and emerging trends for cmWave and sub-THz spectra as key radio enablers. This paper poses and answers three key questions regarding motivation of additional spectrum to explore the strategic implementation and benefits of cmWave and sub-THz spectra. Also, we show using case studies how these complementary spectrum bands will enable new applications in 6G, such as integrated sensing and communication (ISAC), re-configurable intelligent surfaces (RIS) and non-terrestrial networks (NTN). Numerical simulations reveal that the ISAC performance of cmWave and sub-THz spectra outperforms that of existing 5G spectrum, including sub-6 GHz and mmWave. Additionally, we illustrate the effective interplay between RIS and NTN to counteract the effects of high attenuation at sub-THz frequencies. Finally, ongoing standardization endeavors, challenges and promising directions are elucidated for these complementary spectrum bands.
Submitted 26 June, 2024;
originally announced June 2024.
-
B-TMS: Bayesian Traversable Terrain Modeling and Segmentation Across 3D LiDAR Scans and Maps for Enhanced Off-Road Navigation
Authors:
Minho Oh,
Gunhee Shin,
Seoyeon Jang,
Seungjae Lee,
Dongkyu Lee,
Wonho Song,
Byeongho Yu,
Hyungtae Lim,
Jaeyoung Lee,
Hyun Myung
Abstract:
Recognizing traversable terrain from 3D point cloud data is critical, as it directly impacts the performance of autonomous navigation in off-road environments. However, existing segmentation algorithms often struggle with challenges related to changes in data distribution, environmental specificity, and sensor variations. Moreover, when encountering sunken areas, their performance is frequently compromised, and they may even fail to recognize them. To address these challenges, we introduce B-TMS, a novel approach that performs map-wise terrain modeling and segmentation by utilizing Bayesian generalized kernel (BGK) within the graph structure known as the tri-grid field (TGF). Our experiments encompass various data distributions, ranging from single scans to partial maps, utilizing both public datasets representing urban scenes and off-road environments, and our own dataset acquired from extremely bumpy terrains. Our results demonstrate notable contributions, particularly in terms of robustness to data distribution variations, adaptability to diverse environmental conditions, and resilience against the challenges associated with parameter changes.
Submitted 26 June, 2024;
originally announced June 2024.
-
f-GAN: A frequency-domain-constrained generative adversarial network for PPG to ECG synthesis
Authors:
Nathan C. L. Kong,
Dae Lee,
Huyen Do,
Dae Hoon Park,
Cong Xu,
Hongda Mao,
Jonathan Chung
Abstract:
Electrocardiograms (ECGs) and photoplethysmograms (PPGs) are generally used to monitor an individual's cardiovascular health. In clinical settings, ECGs and fingertip PPGs are the main signals used for assessing cardiovascular health, but the equipment necessary for their collection precludes their use in daily monitoring. Although PPGs obtained from wrist-worn devices are susceptible to noise due to motion, they have been widely used to continuously monitor cardiovascular health because of their convenience. Therefore, we would like to combine the ease with which PPGs can be collected with the information that ECGs provide about cardiovascular health by developing models to synthesize ECG signals from paired PPG signals. We tackled this problem using generative adversarial networks (GANs) and found that models trained using the original GAN formulations can be successfully used to synthesize ECG signals from which heart rate can be extracted using standard signal processing pipelines. Incorporating a frequency-domain constraint into model training improved the stability of model performance and also the performance on heart rate estimation.
Submitted 15 May, 2024;
originally announced June 2024.
-
The Surface Signature and Rough Surfaces
Authors:
Darrick Lee
Abstract:
Parallel transport, or path development, provides a rich characterization of paths which preserves the underlying algebraic structure of concatenation. The path signature is universal among such maps: any (translation-invariant) parallel transport factors uniquely through the path signature. Furthermore, the path signature is a central object in the theory of rough paths, which provides an integration theory for highly irregular paths. A fundamental result is Lyons' extension theorem, which allows us to compute the signature of rough paths, and in turn provides a way to compute parallel transport of arbitrarily irregular paths. In this article, we consider the notion of surface holonomy, a generalization of parallel transport to the higher dimensional setting of surfaces parametrized by rectangular domains, which preserves the higher algebraic structures of horizontal and vertical concatenation. Building on work of Kapranov, we introduce the surface signature, which is universal among surface holonomy maps with respect to continuous 2-connections. Furthermore, we introduce the notion of a rough surface and prove a surface extension theorem, which allows us to compute the signature of rough surfaces. By exploiting the universal property of the surface signature, this provides a method to compute the surface holonomy of arbitrarily irregular surfaces.
Submitted 24 June, 2024;
originally announced June 2024.
-
PlagBench: Exploring the Duality of Large Language Models in Plagiarism Generation and Detection
Authors:
Jooyoung Lee,
Toshini Agrawal,
Adaku Uchendu,
Thai Le,
Jinghui Chen,
Dongwon Lee
Abstract:
Recent literature has highlighted potential risks to academic integrity associated with large language models (LLMs), as they can memorize parts of training instances and reproduce them in the generated texts without proper attribution. In addition, given their capabilities in generating high-quality texts, plagiarists can exploit LLMs to generate realistic paraphrases or summaries indistinguishable from original work. In response to possible malicious use of LLMs in plagiarism, we introduce PlagBench, a comprehensive dataset consisting of 46.5K synthetic plagiarism cases generated using three instruction-tuned LLMs across three writing domains. The quality of PlagBench is ensured through fine-grained automatic evaluation for each type of plagiarism, complemented by human annotation. We then leverage our proposed dataset to evaluate the plagiarism detection performance of five modern LLMs and three specialized plagiarism checkers. Our findings reveal that GPT-3.5 tends to generate paraphrases and summaries of higher quality compared to Llama2 and GPT-4. Despite LLMs' weak performance in summary plagiarism identification, they can surpass current commercial plagiarism detectors. Overall, our results highlight the potential of LLMs to serve as robust plagiarism detection tools.
Submitted 23 June, 2024;
originally announced June 2024.
-
Quantitative pointwise estimates of the cooling process for inelastic Boltzmann equation
Authors:
Gayoung An,
Jin Woo Jang,
Donghyun Lee
Abstract:
In this paper, we study the homogeneous inelastic Boltzmann equation for hard spheres. We first prove that the solution $f(t,v)$ is pointwisely bounded from above by $C_{f_0}\langle t \rangle^3$ and establish that the cooling time is infinite $T_c = +\infty$ under the condition $f_0 \in L^1_2 \cap L^{\infty}_{s} $ for $s > 2 $. Away from the zero velocity, we further prove that $f(t,v)\leq C_{f_0, ε} \langle t \rangle $ for $|v| \geq ε$ at any time $t > 0 $ and $ε>0$. This time-growing pointwise upper-bound is natural in the cooling process, as we expect the density near $v = 0 $ to grow rapidly. As a consequence, via these results, we obtain Maxwellian upper-bounds of solutions for each time. Our upper-bounds hold for any constant normal restitution $ 0 < α\leq 1 $ and are uniform in $ α$.
Submitted 23 June, 2024; v1 submitted 21 June, 2024;
originally announced June 2024.
-
Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization
Authors:
Deokjae Lee,
Hyun Oh Song,
Kyunghyun Cho
Abstract:
Active learning is increasingly adopted for expensive multi-objective combinatorial optimization problems, but it involves a challenging subset selection problem, optimizing the batch acquisition score that quantifies the goodness of a batch for evaluation. Due to the excessively large search space of the subset selection problem, prior methods optimize the batch acquisition on the latent space, which has discrepancies with the actual space, or optimize individual acquisition scores without considering the dependencies among candidates in a batch instead of directly optimizing the batch acquisition. To manage the vast search space, a simple and effective approach is the greedy method, which decomposes the problem into smaller subproblems, yet it has difficulty in parallelization since each subproblem depends on the outcome from the previous ones. To this end, we introduce a novel greedy-style subset selection algorithm that optimizes batch acquisition directly on the combinatorial space by sequential greedy sampling from the greedy policy, specifically trained to address all greedy subproblems concurrently. Notably, our experiments on the red fluorescent proteins design task show that our proposed method achieves the baseline performance in 1.69x fewer queries, demonstrating its efficiency.
Submitted 21 June, 2024;
originally announced June 2024.
-
Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics
Authors:
Seungbeen Lee,
Seungwon Lim,
Seungju Han,
Giyeong Oh,
Hyungjoo Chae,
Jiwan Chung,
Minju Kim,
Beong-woo Kwak,
Yeonsoo Lee,
Dongha Lee,
Jinyoung Yeo,
Youngjae Yu
Abstract:
The idea of personality in descriptive psychology, traditionally defined through observable behavior, has now been extended to Large Language Models (LLMs) to better understand their behavior. This raises a question: do LLMs exhibit distinct and consistent personality traits, similar to humans? Existing self-assessment personality tests, while applicable, lack the necessary validity and reliability for precise personality measurements. To address this, we introduce TRAIT, a new tool consisting of 8K multi-choice questions designed to assess the personality of LLMs with validity and reliability. TRAIT is built on the psychometrically validated human questionnaires, the Big Five Inventory (BFI) and Short Dark Triad (SD-3), enhanced with the ATOMIC10X knowledge graph for testing personality in a variety of real scenarios. TRAIT overcomes the reliability and validity issues that arise when measuring the personality of LLMs with self-assessment, showing the highest scores across three metrics: refusal rate, prompt sensitivity, and option order sensitivity. It reveals notable insights into the personality of LLMs: 1) LLMs exhibit distinct and consistent personality, which is highly influenced by their training data (i.e., data used for alignment tuning), and 2) current prompting techniques have limited effectiveness in eliciting certain traits, such as high psychopathy or low conscientiousness, suggesting the need for further research in this direction.
Submitted 20 June, 2024;
originally announced June 2024.
-
On the Burau representation of $B_3$ modulo $p$
Authors:
Donsung Lee
Abstract:
We present an algorithm that, given a prime $p$ as input, determines whether or not the Burau representation of the 3-strand braid group modulo $p$ is faithful. We also prove that the representation is indeed faithful when $p\le 13$. Additionally, we re-pose Salter's question on the Burau representation of $B_3$ over finite fields $\mathbb{F}_p$, and solve it for every $p$.
Submitted 20 June, 2024;
originally announced June 2024.
-
Protecting Privacy Through Approximating Optimal Parameters for Sequence Unlearning in Language Models
Authors:
Dohyun Lee,
Daniel Rim,
Minseok Choi,
Jaegul Choo
Abstract:
Although language models (LMs) demonstrate exceptional capabilities on various tasks, they are potentially vulnerable to extraction attacks, which represent a significant privacy risk. To mitigate the privacy concerns of LMs, machine unlearning has emerged as an important research area, which is utilized to induce the LM to selectively forget about some of its training data. While completely retraining the model will guarantee successful unlearning and privacy assurance, it is impractical for LMs, as it would be time-consuming and resource-intensive. Prior works efficiently unlearn the target token sequences, but upon subsequent iterations, the LM displays significant degradation in performance. In this work, we propose Privacy Protection via Optimal Parameters (POP), a novel unlearning method that effectively forgets the target token sequences from the pretrained LM by applying optimal gradient updates to the parameters. Inspired by the gradient derivation of complete retraining, we approximate the optimal training objective that successfully unlearns the target sequence while retaining the knowledge from the rest of the training data. Experimental results demonstrate that POP exhibits remarkable retention performance post-unlearning across 9 classification and 4 dialogue benchmarks, outperforming the state-of-the-art by a large margin. Furthermore, we introduce Remnant Memorization Accuracy that quantifies privacy risks based on token likelihood and validate its effectiveness through both qualitative and quantitative analyses.
Submitted 20 June, 2024;
originally announced June 2024.
-
Reinforcement Learning for Infinite-Horizon Average-Reward MDPs with Multinomial Logistic Function Approximation
Authors:
Jaehyun Park,
Dabeen Lee
Abstract:
We study model-based reinforcement learning with non-linear function approximation where the transition function of the underlying Markov decision process (MDP) is given by a multinomial logistic (MNL) model. In this paper, we develop two algorithms for the infinite-horizon average reward setting. Our first algorithm \texttt{UCRL2-MNL} applies to the class of communicating MDPs and achieves an $\tilde{\mathcal{O}}(dD\sqrt{T})$ regret, where $d$ is the dimension of feature mapping, $D$ is the diameter of the underlying MDP, and $T$ is the horizon. The second algorithm \texttt{OVIFH-MNL} is computationally more efficient and applies to the more general class of weakly communicating MDPs, for which we show a regret guarantee of $\tilde{\mathcal{O}}(d^{2/5} \mathrm{sp}(v^*)T^{4/5})$ where $\mathrm{sp}(v^*)$ is the span of the associated optimal bias function.
We also prove a lower bound of $Ω(d\sqrt{DT})$ for learning communicating MDPs with MNL transitions of diameter at most $D$. Furthermore, we show a regret lower bound of $Ω(dH^{3/2}\sqrt{K})$ for learning $H$-horizon episodic MDPs with MNL function approximation where $K$ is the number of episodes, which improves upon the best-known lower bound for the finite-horizon setting.
Submitted 19 June, 2024;
originally announced June 2024.
-
Ultrastable vacuum-gap Fabry-Pérot cavities operated in air
Authors:
Yifan Liu,
Naijun Jin,
Dahyeon Lee,
Charles McLemore,
Takuma Nakamura,
Megan Kelleher,
Haotian Cheng,
Susan Schima,
Nazanin Hoghooghi,
Scott Diddams,
Peter Rakich,
Franklyn Quinlan
Abstract:
We demonstrate a vacuum-gap ultrastable optical reference cavity that does not require a vacuum enclosure. Our simple method of optical contact bonding in a vacuum environment allows for cavity operation in air while maintaining vacuum between the cavity mirrors. Vacuum is maintained long term, with no observed degradation in cavity stability for over 1 year after bonding. For a 1550 nm laser stabilized to a 9.7 mL in-vacuum bonded cavity, the measured Allan deviation is $2.4\times 10^{-14}$ at 1 s and its phase noise is thermal-noise-limited from 0.1 Hz to 10 kHz, reaching about -105 dBc/Hz at 10 kHz offset frequency. This represents the highest stability of any oscillator operated without a vacuum enclosure. Furthermore, we demonstrate a 0.5 mL in-vacuum bonded cavity created using microfabricated mirrors and cavity dicing, with phase noise reaching -95 dBc/Hz at 10 kHz offset frequency. By relieving the need for high-vacuum enclosures, we greatly enhance the portability and utility of low noise, compact cavity-stabilized lasers, with applications ranging from environmental sensing to mobile optical clocks to ultralow noise microwave generation.
Submitted 18 June, 2024;
originally announced June 2024.
-
CollabStory: Multi-LLM Collaborative Story Generation and Authorship Analysis
Authors:
Saranya Venkatraman,
Nafis Irtiza Tripto,
Dongwon Lee
Abstract:
The rise of unifying frameworks that enable seamless interoperability of Large Language Models (LLMs) has made LLM-LLM collaboration for open-ended tasks a possibility. Despite this, there have not been efforts to explore such collaborative writing. We take the next step beyond human-LLM collaboration to explore this multi-LLM scenario by generating the first exclusively LLM-generated collaborative stories dataset called CollabStory. We focus on single-author ($N=1$) to multi-author (up to $N=5$) scenarios, where multiple LLMs co-author stories. We generate over 32k stories using open-source instruction-tuned LLMs. Further, we take inspiration from the PAN tasks that have set the standard for human-human multi-author writing tasks and analysis. We extend their authorship-related tasks for multi-LLM settings and present baselines for LLM-LLM collaboration. We find that current baselines are not able to handle this emerging scenario. Thus, CollabStory is a resource that could help propel an understanding as well as the development of techniques to discern the use of multiple LLMs. This is crucial to study in the context of writing tasks since LLM-LLM collaboration could potentially overwhelm ongoing challenges related to plagiarism detection, credit assignment, maintaining academic integrity in educational settings, and addressing copyright infringement concerns. We make our dataset and code available at https://github.com/saranya-venkatraman/multi_llm_story_writing.
Submitted 18 June, 2024;
originally announced June 2024.
-
SNAP: Unlearning Selective Knowledge in Large Language Models with Negative Instructions
Authors:
Minseok Choi,
Daniel Rim,
Dohyun Lee,
Jaegul Choo
Abstract:
Instruction-following large language models (LLMs), such as ChatGPT, have become increasingly popular with the general audience, many of whom are incorporating them into their daily routines. However, these LLMs inadvertently disclose personal or copyrighted information, which calls for a machine unlearning method to remove selective knowledge. Previous attempts sought to forget the link between the target information and its associated entities, but it rather led to generating undesirable responses about the target, compromising the end-user experience. In this work, we propose SNAP, an innovative framework designed to selectively unlearn information by 1) training an LLM with negative instructions to generate obliterated responses, 2) augmenting hard positives to retain the original LLM performance, and 3) applying the novel Wasserstein regularization to ensure adequate deviation from the initial weights of the LLM. We evaluate our framework on various NLP benchmarks and demonstrate that our approach retains the original LLM capabilities, while successfully unlearning the specified information.
Submitted 18 June, 2024;
originally announced June 2024.
-
Unveiling Implicit Table Knowledge with Question-Then-Pinpoint Reasoner for Insightful Table Summarization
Authors:
Kwangwook Seo,
Jinyoung Yeo,
Dongha Lee
Abstract:
Implicit knowledge hidden within the explicit table cells, such as data insights, is the key to generating a high-quality table summary. However, unveiling such implicit knowledge is a non-trivial task. Due to the complex nature of structured tables, it is challenging even for large language models (LLMs) to mine the implicit knowledge in an insightful and faithful manner. To address this challenge, we propose a novel table reasoning framework Question-then-Pinpoint. Our work focuses on building a plug-and-play table reasoner that can self-question the insightful knowledge and answer it by faithfully pinpointing evidence on the table to provide explainable guidance for the summarizer. To train a reliable reasoner, we collect table knowledge by guiding a teacher LLM to follow the coarse-to-fine reasoning paths and refine it through two quality enhancement strategies to selectively distill the high-quality knowledge to the reasoner. Extensive experiments on two table summarization datasets, including our newly proposed InsTaSumm, validate the general effectiveness of our framework.
Submitted 18 June, 2024;
originally announced June 2024.
-
Moiré flat bands and antiferroelectric domains in lattice relaxed twisted bilayer hexagonal boron nitride under perpendicular electric fields
Authors:
Fengping Li,
Dongkyu Lee,
Nicolas Leconte,
Srivani Javvaji,
Jeil Jung
Abstract:
Local interlayer charge polarization of twisted bilayer hexagonal boron nitride (t2BN) is calculated and parametrized as a function of twist angle and perpendicular electric fields through tight-binding calculations on lattice relaxed geometries. Lattice relaxations tend to increase the bandwidth of the nearly flat bands, where widths smaller than 1 meV are expected for angles less than 1.08 degrees for parallel BN/BN alignment, and for angles less than 1.5 degrees for the antiparallel BN/NB alignment. Local interlayer charge polarization maxima of 2.6 pC/m are expected at the AB and BA stacking sites of BN/BN aligned t2BN in the long moiré period limit for angles less than 1 degree, and evolve non-monotonically with a maximum of 3.5 pC/m at an angle of 1.6 degrees before reaching 2 pC/m at an angle of 6 degrees. The electrostatic potential maxima due to the t2BN are overall enhanced by 20 percent with respect to the rigid system assuming potential modulation depths of up to 300 mV near its surface. In BN/BN aligned bilayers the relative areas of the AB or BA local stacking regions can be expanded or reduced through a vertical electric field depending on its sign.
Submitted 17 June, 2024;
originally announced June 2024.
-
Financial Assets Dependency Prediction Utilizing Spatiotemporal Patterns
Authors:
Haoren Zhu,
Pengfei Zhao,
Wilfred Siu Hung NG,
Dik Lun Lee
Abstract:
Financial assets exhibit complex dependency structures, which are crucial for investors to create diversified portfolios to mitigate risk in volatile financial markets. To explore the financial asset dependencies dynamics, we propose a novel approach that models the dependencies of assets as an Asset Dependency Matrix (ADM) and treats the ADM sequences as image sequences. This allows us to leverage deep learning-based video prediction methods to capture the spatiotemporal dependencies among assets. However, unlike images where neighboring pixels exhibit explicit spatiotemporal dependencies due to the natural continuity of object movements, assets in ADM do not have a natural order. This poses challenges to organizing the relational assets to reveal better the spatiotemporal dependencies among neighboring assets for ADM forecasting. To tackle the challenges, we propose the Asset Dependency Neural Network (ADNN), which employs the Convolutional Long Short-Term Memory (ConvLSTM) network, a highly successful method for video prediction. ADNN can employ static and dynamic transformation functions to optimize the representations of the ADM. Through extensive experiments, we demonstrate that our proposed framework consistently outperforms the baselines in the ADM prediction and downstream application tasks. This research contributes to understanding and predicting asset dependencies, offering valuable insights for financial market participants.
Submitted 13 June, 2024;
originally announced June 2024.
-
Stein Variational Ergodic Search
Authors:
Darrick Lee,
Cameron Lerch,
Fabio Ramos,
Ian Abraham
Abstract:
Exploration requires that robots reason about numerous ways to cover a space in response to dynamically changing conditions. However, in continuous domains there are potentially infinitely many options for robots to explore which can prove computationally challenging. How then should a robot efficiently optimize and choose exploration strategies to adopt? In this work, we explore this question through the use of variational inference to efficiently solve for distributions of coverage trajectories. Our approach leverages ergodic search methods to optimize coverage trajectories in continuous time and space. In order to reason about distributions of trajectories, we formulate ergodic search as a probabilistic inference problem. We propose to leverage Stein variational methods to approximate a posterior distribution over ergodic trajectories through parallel computation. As a result, it becomes possible to efficiently optimize distributions of feasible coverage trajectories for which robots can adapt exploration. We demonstrate that the proposed Stein variational ergodic search approach facilitates efficient identification of multiple coverage strategies and show online adaptation in a model-predictive control formulation. Simulated and physical experiments demonstrate adaptability and diversity in exploration strategies online.
Submitted 17 June, 2024;
originally announced June 2024.
-
Performance Improvement of Language-Queried Audio Source Separation Based on Caption Augmentation From Large Language Models for DCASE Challenge 2024 Task 9
Authors:
Do Hyun Lee,
Yoonah Song,
Hong Kook Kim
Abstract:
We present a prompt-engineering-based text-augmentation approach applied to a language-queried audio source separation (LASS) task. To enhance the performance of LASS, the proposed approach utilizes large language models (LLMs) to generate multiple captions corresponding to each sentence of the training dataset. To this end, we first perform experiments to identify the most effective prompts for caption augmentation with a smaller number of captions. A LASS model trained with these augmented captions demonstrates improved performance on the DCASE 2024 Task 9 validation set compared to that trained without augmentation. This study highlights the effectiveness of LLM-based caption augmentation in advancing language-queried audio source separation.
Submitted 17 June, 2024;
originally announced June 2024.
-
From Intentions to Techniques: A Comprehensive Taxonomy and Challenges in Text Watermarking for Large Language Models
Authors:
Harsh Nishant Lalai,
Aashish Anantha Ramakrishnan,
Raj Sanjay Shah,
Dongwon Lee
Abstract:
With the rapid growth of Large Language Models (LLMs), safeguarding textual content against unauthorized use is crucial. Text watermarking offers a vital solution, protecting both LLM-generated and plain text sources. This paper presents a unified overview of different perspectives behind designing watermarking techniques, through a comprehensive survey of the research literature. Our work has two key advantages: (1) we analyze research based on the specific intentions behind different watermarking techniques, evaluation datasets used, and watermark addition and removal methods to construct a cohesive taxonomy; (2) we highlight the gaps and open challenges in text watermarking to promote research in protecting text authorship. This extensive coverage and detailed analysis set our work apart, offering valuable insights into the evolving landscape of text watermarking in language models.
Submitted 16 June, 2024;
originally announced June 2024.
-
THEANINE: Revisiting Memory Management in Long-term Conversations with Timeline-augmented Response Generation
Authors:
Seo Hyun Kim,
Kai Tzu-iunn Ong,
Taeyoon Kwon,
Namyoung Kim,
Keummin Ka,
SeongHyeon Bae,
Yohan Jo,
Seung-won Hwang,
Dongha Lee,
Jinyoung Yeo
Abstract:
Large language models (LLMs) are capable of processing lengthy dialogue histories during prolonged interaction with users without additional memory modules; however, their responses tend to overlook or incorrectly recall information from the past. In this paper, we revisit memory-augmented response generation in the era of LLMs. While prior work focuses on getting rid of outdated memories, we argue that such memories can provide contextual cues that help dialogue systems understand the development of past events and, therefore, benefit response generation. We present Theanine, a framework that augments LLMs' response generation with memory timelines -- series of memories that demonstrate the development and causality of relevant past events. Along with Theanine, we introduce TeaFarm, a counterfactual-driven question-answering pipeline addressing the limitation of G-Eval in long-term conversations. Supplementary videos of our methods and the TeaBag dataset for TeaFarm evaluation are available at https://theanine-693b0.web.app/.
Submitted 16 June, 2024;
originally announced June 2024.
-
Four microlensing giant planets detected through signals produced by minor-image perturbations
Authors:
Cheongho Han,
Ian A. Bond,
Chung-Uk Lee,
Andrew Gould,
Michael D. Albrow,
Sun-Ju Chung,
Kyu-Ha Hwang,
Youn Kil Jung,
Yoon-Hyun Ryu,
Yossi Shvartzvald,
In-Gu Shin,
Jennifer C. Yee,
Hongjing Yang,
Weicheng Zang,
Sang-Mok Cha,
Doeon Kim,
Dong-Jin Kim,
Seung-Lee Kim,
Dong-Joo Lee,
Yongseok Lee,
Byeong-Gon Park,
Richard W. Pogge,
Fumio Abe,
Ken Bando,
Richard Barry
, et al. (41 additional authors not shown)
Abstract:
We investigated the nature of the anomalies appearing in four microlensing events: KMT-2020-BLG-0757, KMT-2022-BLG-0732, KMT-2022-BLG-1787, and KMT-2022-BLG-1852. The light curves of these events commonly exhibit initial bumps followed by subsequent troughs that extend across a substantial portion of the light curves. We performed thorough modeling of the anomalies to elucidate their characteristics. Despite their prolonged durations, which differ from the usual brief anomalies observed in typical planetary events, our analysis revealed that each anomaly in these events originated from a planetary companion located within the Einstein ring of the primary star. It was found that the initial bump arose when the source star crossed one of the planetary caustics, while the subsequent trough feature occurred as the source traversed the region of minor image perturbations lying between the pair of planetary caustics. The estimated masses of the host and planet, their mass ratios, and the distance to the discovered planetary systems are $(M_{\rm host}/M_\odot, M_{\rm planet}/M_{\rm J}, q/10^{-3}, D_{\rm L}/{\rm kpc}) = (0.58^{+0.33}_{-0.30}, 10.71^{+6.17}_{-5.61}, 17.61\pm 2.25, 6.67^{+0.93}_{-1.30})$ for KMT-2020-BLG-0757, $(0.53^{+0.31}_{-0.31}, 1.12^{+0.65}_{-0.65}, 2.01 \pm 0.07, 6.66^{+1.19}_{-1.84})$ for KMT-2022-BLG-0732, $(0.42^{+0.32}_{-0.23}, 6.64^{+4.98}_{-3.64}, 15.07\pm 0.86, 7.55^{+0.89}_{-1.30})$ for KMT-2022-BLG-1787, and $(0.32^{+0.34}_{-0.19}, 4.98^{+5.42}_{-2.94}, 8.74\pm 0.49, 6.27^{+0.90}_{-1.15})$ for KMT-2022-BLG-1852. These parameters indicate that all the planets are giants with masses exceeding the mass of Jupiter in our solar system and the hosts are low-mass stars with masses substantially lower than that of the Sun.
Submitted 15 June, 2024;
originally announced June 2024.
-
Finite-Time Analysis of Simultaneous Double Q-learning
Authors:
Hyunjun Na,
Donghwan Lee
Abstract:
$Q$-learning is one of the most fundamental reinforcement learning (RL) algorithms. Despite its widespread success in various applications, it is prone to overestimation bias in the $Q$-learning update. To address this issue, double $Q$-learning employs two independent $Q$-estimators which are randomly selected and updated during the learning process. This paper proposes a modified double $Q$-learning, called simultaneous double $Q$-learning (SDQ), with its finite-time analysis. SDQ eliminates the need for random selection between the two $Q$-estimators, and this modification allows us to analyze double $Q$-learning through the lens of a novel switching system framework facilitating efficient finite-time analysis. Empirical studies demonstrate that SDQ converges faster than double $Q$-learning while retaining the ability to mitigate the maximization bias. Finally, we derive a finite-time expected error bound for SDQ.
Submitted 14 June, 2024;
originally announced June 2024.
-
Projected background and sensitivity of AMoRE-II
Authors:
A. Agrawal,
V. V. Alenkov,
P. Aryal,
J. Beyer,
B. Bhandari,
R. S. Boiko,
K. Boonin,
O. Buzanov,
C. R. Byeon,
N. Chanthima,
M. K. Cheoun,
J. S. Choe,
Seonho Choi,
S. Choudhury,
J. S. Chung,
F. A. Danevich,
M. Djamal,
D. Drung,
C. Enss,
A. Fleischmann,
A. M. Gangapshev,
L. Gastaldo,
Y. M. Gavrilyuk,
A. M. Gezhaev,
O. Gileva
, et al. (81 additional authors not shown)
Abstract:
AMoRE-II aims to search for neutrinoless double beta decay with an array of 423 Li$_2$$^{100}$MoO$_4$ crystals operating in the cryogenic system as the main phase of the Advanced Molybdenum-based Rare process Experiment (AMoRE). AMoRE has been planned to operate in three phases: AMoRE-pilot, AMoRE-I, and AMoRE-II. AMoRE-II is currently being installed at the Yemi Underground Laboratory, located approximately 1000 meters deep in Jeongseon, Korea. The goal of AMoRE-II is to reach up to $T^{0\nu\beta\beta}_{1/2} \sim 6 \times 10^{26}$ years, corresponding to an effective Majorana mass of 15 - 29 meV, covering all the inverted mass hierarchy regions. To achieve this, the background level of the experimental configurations and possible background sources of gamma and beta events should be well understood. We have intensively performed Monte Carlo simulations using the GEANT4 toolkit in all the experimental configurations with potential sources. We report the estimated background level that meets the $10^{-4}$ counts/(keV$\cdot$kg$\cdot$yr) requirement for AMoRE-II in the region of interest (ROI) and show the projected half-life sensitivity based on the simulation study.
Submitted 13 June, 2024;
originally announced June 2024.
-
VLind-Bench: Measuring Language Priors in Large Vision-Language Models
Authors:
Kang-il Lee,
Minbeom Kim,
Seunghyun Yoon,
Minsung Kim,
Dongryeol Lee,
Hyukhun Koh,
Kyomin Jung
Abstract:
Large Vision-Language Models (LVLMs) have demonstrated outstanding performance across various multimodal tasks. However, they suffer from a problem known as language prior, where responses are generated based solely on textual patterns while disregarding image information. Addressing the issue of language prior is crucial, as it can lead to undesirable biases or hallucinations when dealing with images that are out of training distribution. Despite its importance, current methods for accurately measuring language priors in LVLMs are poorly studied. Although existing benchmarks based on counterfactual or out-of-distribution images can partially be used to measure language priors, they fail to disentangle language priors from other confounding factors. To this end, we propose a new benchmark called VLind-Bench, which is the first benchmark specifically designed to measure the language priors, or blindness, of LVLMs. It not only includes tests on counterfactual images to assess language priors but also involves a series of tests to evaluate more basic capabilities such as commonsense knowledge, visual perception, and commonsense biases. For each instance in our benchmark, we ensure that all these basic tests are passed before evaluating the language priors, thereby minimizing the influence of other factors on the assessment. The evaluation and analysis of recent LVLMs in our benchmark reveal that almost all models exhibit a significant reliance on language priors, presenting a strong challenge in the field.
Submitted 10 July, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Authors:
Licong Lin,
Jingfeng Wu,
Sham M. Kakade,
Peter L. Bartlett,
Jason D. Lee
Abstract:
Empirically, large-scale deep learning models often satisfy a neural scaling law: the test error of the trained model improves polynomially as the model size and data size grow. However, conventional wisdom suggests the test error consists of approximation, bias, and variance errors, where the variance error increases with model size. This disagrees with the general form of neural scaling laws, which predict that increasing model size monotonically improves performance.
We study the theory of scaling laws in an infinite dimensional linear regression setup. Specifically, we consider a model with $M$ parameters as a linear function of sketched covariates. The model is trained by one-pass stochastic gradient descent (SGD) using $N$ data. Assuming the optimal parameter satisfies a Gaussian prior and the data covariance matrix has a power-law spectrum of degree $a>1$, we show that the reducible part of the test error is $Θ(M^{-(a-1)} + N^{-(a-1)/a})$. The variance error, which increases with $M$, is dominated by the other errors due to the implicit regularization of SGD, thus disappearing from the bound. Our theory is consistent with the empirical neural scaling laws and verified by numerical simulation.
Submitted 12 June, 2024;
originally announced June 2024.
-
Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV
Authors:
PHENIX Collaboration,
N. J. Abdulameer,
U. Acharya,
A. Adare,
S. Afanasiev,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
H. Al-Bataineh,
J. Alexander,
M. Alfred,
K. Aoki,
N. Apadula,
L. Aphecetche,
J. Asai,
H. Asano,
E. T. Atomssa,
R. Averbeck,
T. C. Awes,
B. Azmoun,
V. Babintsev,
M. Bai,
G. Baksay,
L. Baksay,
A. Baldisseri
, et al. (510 additional authors not shown)
Abstract:
High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is observed in the yield of high-momentum jet fragments opposite the trigger particle, which indicates jet suppression stemming from in-medium partonic energy loss, while enhancement is observed for low-momentum particles. The ratio and differences between the yield in Au$+$Au collisions and $p$$+$$p$ collisions, $I_{AA}$ and $Δ_{AA}$, as a function of the trigger-hadron azimuthal separation, $Δφ$, are measured for the first time at the Relativistic Heavy Ion Collider. These results better quantify how the yield of low-$p_T$ associated hadrons is enhanced at wide angle, which is crucial for studying energy loss as well as medium-response effects.
Submitted 12 June, 2024;
originally announced June 2024.
-
Functional voxel hierarchy and afferent capacity revealed mental state transition on dynamic correlation resting-state fMRI
Authors:
Dong Soo Lee,
Hyun Joo Kim,
Youngmin Huh,
Yeon Koo Kang,
Wonseok Whi,
Hyekyoung Lee,
Hyejin Kang
Abstract:
Voxel hierarchy on dynamic brain graphs is produced by k core percolation on functional dynamic amplitude correlation of resting-state fMRI. Directed graphs and their afferent/efferent capacities are produced by Markov modeling of the universal cover of undirected graphs simultaneously with the calculation of volume entropy. Positive and unsigned negative brain graphs were analyzed separately on sliding-window representation to underpin the visualization and quantitation of mental dynamic states with their transitions. Voxel hierarchy animation maps of positive graphs revealed abrupt changes in coreness k and kmaxcore, which we called mental state transitions. Afferent voxel capacities of the positive graphs also revealed transient modules composed of dominating voxels/independent components and their exchanges representing mental state transitions. Animation and quantification plots of voxel hierarchy and afferent capacity corroborated each other in underpinning mental state transitions and afferent module exchange on the positive directed functional connectivity graphs. We propose the use of spatiotemporal trajectories of voxels on positive dynamic graphs to construct hierarchical structures by k core percolation and quantified in- and out-flows of information of voxels by volume entropy/directed graphs to subserve diverse resting mental state transitions on resting-state fMRI graphs in normal human individuals.
Submitted 12 June, 2024;
originally announced June 2024.
-
Optimal Qubit Mapping Search for Encoding Classical Data into Matrix Product State Representation with Minimal Loss
Authors:
Hyeongjun Jeon,
Kyungmin Lee,
Dongkyu Lee,
Bongsang Kim,
Taehyun Kim
Abstract:
Matrix product state (MPS) offers a framework for encoding classical data into quantum states, enabling the efficient utilization of quantum resources for data representation and processing. This research paper investigates techniques to enhance the efficiency and accuracy of MPS representations specifically designed for encoding classical data. Based on the observation that MPS truncation error depends on the pattern of the classical data, we devised an algorithm that finds the optimal qubit mapping for given classical data, thereby improving the efficiency and fidelity of the MPS representation. Furthermore, we evaluate the impact of the optimized MPS in the context of quantum classifiers, demonstrating their enhanced performance compared to the conventional mapping. This improvement confirms the efficacy of the proposed techniques for encoding classical data into quantum states. MPS representation combined with optimal qubit mapping can pave a new way for more efficient and accurate quantum data representation and processing.
Submitted 12 June, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Bifurcations and multistability in empirical mutualistic networks
Authors:
Andrus Giraldo,
Deok-Sun Lee
Abstract:
Individual species may experience diverse outcomes, from prosperity to extinction, in an ecological community subject to external and internal variations. Despite the wealth of theoretical results derived from random matrix ensembles, a theoretical framework still remains to be developed to understand species-level dynamical heterogeneity within a given community, hampering the theoretical assessment and management of real-world ecosystems. Here, we consider empirical plant-pollinator mutualistic networks, additionally including all-to-all intragroup competition, where species abundance evolves under a Lotka-Volterra-type equation. Setting the strengths of competition and mutualism to be uniform, we investigate how individual species persist or go extinct as the interaction strengths vary. By employing bifurcation theory in tandem with numerical continuation, we elucidate transcritical bifurcations underlying species extinction and demonstrate that the Hopf bifurcation of unfeasible equilibria and degenerate transcritical bifurcations give rise to multistability, i.e., the coexistence of multiple attracting feasible equilibria. These bifurcations allow us to partition the parameter space into different regimes, each with distinct sets of extinct species, offering insights into how interspecific interactions generate one or multiple extinction scenarios within an ecological network.
Submitted 10 June, 2024;
originally announced June 2024.
-
Transformers Provably Learn Sparse Token Selection While Fully-Connected Nets Cannot
Authors:
Zixuan Wang,
Stanley Wei,
Daniel Hsu,
Jason D. Lee
Abstract:
The transformer architecture has prevailed in various deep learning settings due to its exceptional capabilities to select and compose structural information. Motivated by these capabilities, Sanford et al. proposed the sparse token selection task, in which transformers excel while fully-connected networks (FCNs) fail in the worst case. Building upon that, we strengthen the FCN lower bound to an average-case setting and establish an algorithmic separation of transformers over FCNs. Specifically, a one-layer transformer trained with gradient descent provably learns the sparse token selection task and, surprisingly, exhibits strong out-of-distribution length generalization. We provide empirical simulations to justify our theoretical findings.
Submitted 10 June, 2024;
originally announced June 2024.
-
SignBLEU: Automatic Evaluation of Multi-channel Sign Language Translation
Authors:
Jung-Ho Kim,
Mathew Huerta-Enochian,
Changyong Ko,
Du Hui Lee
Abstract:
Sign languages are multi-channel languages that communicate information through not just the hands (manual signals) but also facial expressions and upper body movements (non-manual signals). However, since automatic sign language translation is usually performed by generating a single sequence of glosses, researchers eschew non-manual and co-occurring manual signals in favor of a simplified list of manual glosses. This can lead to significant information loss and ambiguity. In this paper, we introduce a new task named multi-channel sign language translation (MCSLT) and present a novel metric, SignBLEU, designed to capture multiple signal channels. We validated SignBLEU on a system-level task using three sign language corpora with varied linguistic structures and transcription methodologies and examined its correlation with human judgment through two segment-level tasks. We found that SignBLEU consistently correlates better with human judgment than competing metrics. To facilitate further MCSLT research, we report benchmark scores for the three sign language corpora and release the source code for SignBLEU at https://github.com/eq4all-projects/SignBLEU.
Submitted 10 June, 2024;
originally announced June 2024.
-
Decoupled Marked Temporal Point Process using Neural Ordinary Differential Equations
Authors:
Yujee Song,
Donghyun Lee,
Rui Meng,
Won Hwa Kim
Abstract:
A Marked Temporal Point Process (MTPP) is a stochastic process whose realization is a set of event-time data. MTPP is often used to understand complex dynamics of asynchronous temporal events such as money transactions, social media, healthcare, etc. Recent studies have utilized deep neural networks to capture complex temporal dependencies of events and generate embeddings that aptly represent the observed events. While most previous studies focus on the inter-event dependencies and their representations, how individual events influence the overall dynamics over time has been under-explored. In this regard, we propose a Decoupled MTPP framework that disentangles characterization of a stochastic process into a set of evolving influences from different events. Our approach employs Neural Ordinary Differential Equations (Neural ODEs) to learn flexible continuous dynamics of these influences while simultaneously addressing multiple inference problems, such as density estimation and survival rate computation. We emphasize the significance of disentangling the influences by comparing our framework with state-of-the-art methods on real-life datasets, and provide analysis on the model behavior for potential applications.
Submitted 10 June, 2024;
originally announced June 2024.
-
DiffInject: Revisiting Debias via Synthetic Data Generation using Diffusion-based Style Injection
Authors:
Donggeun Ko,
Sangwoo Jo,
Dongjun Lee,
Namjun Park,
Jaekwang Kim
Abstract:
Dataset bias is a significant challenge in machine learning, where specific attributes, such as the texture or color of images, are unintentionally learned, resulting in detrimental performance. To address this, previous efforts have focused on debiasing models either by developing novel debiasing algorithms or by generating synthetic data to mitigate the prevalent dataset biases. However, generative approaches to date have largely relied on using bias-specific samples from the dataset, which are typically too scarce. In this work, we propose DiffInject, a straightforward yet powerful method to augment synthetic bias-conflict samples using a pretrained diffusion model. This approach significantly advances the use of diffusion models for debiasing purposes by manipulating the latent space. Our framework does not require any explicit knowledge of the bias types or labelling, making it a fully unsupervised setting for debiasing. Our methodology demonstrates substantial results in effectively reducing dataset bias.
Submitted 10 June, 2024;
originally announced June 2024.
-
A Deep Learning-Augmented Stand-off Radar Scheme for Rapidly Detecting Tree Defects
Authors:
Jiwei Qian,
Yee Hui Lee,
Kaixuan Cheng,
Qiqi Dai,
Mohamed Lokman Mohd Yusof,
Daryl Lee,
Abdulkadir C. Yucel
Abstract:
Tree defect detection is crucial for the structural health screening of trees. Existing nondestructive testing (NDT) techniques for tree defect detection require time-consuming and labor-intensive measurement campaigns. This discourages their application for the routine structural health screening of whole populations of managed urban trees. To address this issue, this study proposes a deep-learning-augmented stand-off radar scheme for contactless scanning of tree trunks and rapid detection of tree defects. In this scheme, the antenna is moved along a straight trajectory at a distance from the tree trunk to obtain the trunk's B-scan. The obtained raw B-scan is then processed by a signal-processing framework specifically developed for revealing the scattering signatures of defects in the B-scan, which achieves a 30 dB and 22 dB increase in the signal-to-clutter and noise ratio of the measurement data of tree trunk samples and living trees, respectively. Finally, the processed B-scan is input into a multilevel feature fusion neural network particularly designed for extracting the signature of the defect in the processed B-scan in real time. The developed scheme's applications to the detection of defects in real fresh-cut tree trunks show that the stand-off radar scheme can detect tree defects with 96% accuracy. This stand-off radar scheme is the first contactless NDT technique for tree defect detection that operates along a straight trajectory, and it can potentially be integrated into the routine tree inspection workflow that is part of urban tree management.
Submitted 8 June, 2024;
originally announced June 2024.
-
Decay Energy Spectrometry for Improved Nuclear Material Analysis at the IAEA NML
Authors:
G. B. Kim,
A. R. L. Kavner,
T. Parsons-Davis,
S. Friedrich,
O. B. Drury,
D. Lee,
X. Zhang,
N. Hines,
S. T. P. Boyd,
S. Weidenbenner,
K. Schreiber,
S. Martinson,
C. Smith,
D. McNeel,
S. Salazar,
K. Koehler,
M. Carpenter,
M. Croce,
D. Schmidt,
J. Ullom
Abstract:
Decay energy spectrometry (DES) is a novel radiometric technique for high-precision analysis of nuclear materials. DES employs the unique thermal detection physics of cryogenic microcalorimeters with ultra-high energy resolution and 100$\%$ detection efficiency to accomplish high-precision decay energy measurements. Low-activity nuclear samples of 1 Bq or less, and without chemical separation, are used to provide elemental and isotopic compositions in a single measurement. Isotopic ratio precisions of 1 ppm - 1,000 ppm (isotope dependent), which are close to those of mass spectrometry, have been demonstrated in 12-hour DES measurements of ~5 Bq samples of certified reference materials of uranium (U) and plutonium (Pu). DES has very different systematic biases and uncertainties, as well as different sensitivities to nuclides, compared to mass-spectrometry techniques. Therefore, the accuracy and confidence of nuclear material assays can be improved by combining this new technique with existing mass-spectrometry techniques. Commercial-level DES techniques and equipment are being developed for the implementation of DES at the Nuclear Material Laboratory (NML) of the International Atomic Energy Agency (IAEA) to provide complementary measurements to the existing technologies. The paper describes details of DES measurement methods, as well as DES precision and accuracy for U and Pu standard sources, to discuss its capability in the analysis of nuclear safeguards samples.
Submitted 11 July, 2024; v1 submitted 7 June, 2024;
originally announced June 2024.