-
A Perspective on Foundation Models for the Electric Power Grid
Authors:
Hendrik F. Hamann,
Thomas Brunschwiler,
Blazhe Gjorgiev,
Leonardo S. A. Martins,
Alban Puech,
Anna Varbella,
Jonas Weiss,
Juan Bernabe-Moreno,
Alexandre Blondin Massé,
Seong Choi,
Ian Foster,
Bri-Mathias Hodge,
Rishabh Jain,
Kibaek Kim,
Vincent Mai,
François Mirallès,
Martin De Montigny,
Octavio Ramos-Leaños,
Hussein Suprême,
Le Xie,
El-Nasser S. Youssef,
Arnaud Zinflou,
Alexander J. Belvi,
Ricardo J. Bessa,
Bishnu Prasad Bhattari
, et al. (2 additional authors not shown)
Abstract:
Foundation models (FMs) currently dominate news headlines. They employ advanced deep learning architectures to extract structural information autonomously from vast datasets through self-supervision. The resulting rich representations of complex systems and dynamics can be applied to many downstream applications. Therefore, FMs can find uses in electric power grids, challenged by the energy transition and climate change. In this paper, we call for the development of, and state why we believe in, the potential of FMs for electric grids. We highlight their strengths and weaknesses amidst the challenges of a changing grid. We argue that an FM learning from diverse grid data and topologies could unlock transformative capabilities, pioneering a new approach in leveraging AI to redefine how we manage complexity and uncertainty in the electric grid. Finally, we discuss a power grid FM concept, namely GridFM, based on graph neural networks and show how different downstream tasks benefit.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (414 additional authors not shown)
Abstract:
We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We determine these parameters for two ranges of $K^0_S π^0$ invariant mass: $m(K^0_S π^0)\in (0.8, 1.0)$ $GeV/c^2$, which is dominated by $B^0 \to K^{*0} (\to K^0_S π^0) γ$ decays, and a complementary region $m(K^0_S π^0)\in (0.6, 0.8)\cup(1.0, 1.8)$ $GeV/c^2$. Our results have improved precision as compared to previous measurements and are consistent with theory predictions.
Submitted 12 July, 2024;
originally announced July 2024.
-
CAMP: Continuous and Adaptive Learning Model in Pathology
Authors:
Anh Tien Nguyen,
Keunho Byeon,
Kyungeun Kim,
Boram Song,
Seoung Wan Chae,
Jin Tae Kwak
Abstract:
There exist numerous diagnostic tasks in pathology. Conventional computational pathology formulates and tackles them as independent and individual image classification problems, thereby resulting in computational inefficiency and high costs. To address the challenges, we propose a generic, unified, and universal framework, called a continuous and adaptive learning model in pathology (CAMP), for pathology image classification. CAMP is a generative, efficient, and adaptive classification model that can continuously adapt to any classification task by leveraging pathology-specific prior knowledge and learning task-specific knowledge with minimal computational cost and without forgetting the knowledge from the existing tasks. We evaluated CAMP on 22 datasets, including 1,171,526 patches and 11,811 pathology slides, across 17 classification tasks. CAMP achieves state-of-the-art classification performance on a wide range of datasets and tasks at both patch- and slide-levels and reduces computation time by up to 94% and storage memory by up to 85% in comparison to the conventional classification models. Our results demonstrate that CAMP can offer a fundamental transformation in pathology image classification, paving the way for the fully digitized and computerized pathology practice.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $\boldsymbol{B\rightarrowργ}$ decays using Belle and Belle II data
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (385 additional authors not shown)
Abstract:
We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle II data sets yields $114\pm 12$ $B^{+}\rightarrowρ^{+}γ$ and $99\pm 12$ $B^{0}\rightarrowρ^{0}γ$ decays. The measured branching fractions are $(13.1^{+2.0 +1.3}_{-1.9 -1.2})\times 10^{-7}$ and $(7.5\pm 1.3^{+1.0}_{-0.8})\times 10^{-7}$ for $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays, respectively, where the first uncertainty is statistical and the second is systematic. We also measure the isospin asymmetry $A_{\rm I}(B\rightarrowργ)=(10.9^{+11.2 +7.8}_{-11.7 -7.3})\%$ and the direct CP asymmetry $A_{CP}(B^{+}\rightarrowρ^{+}γ)=(-8.2\pm 15.2^{+1.6}_{-1.2})\%$.
Submitted 12 July, 2024;
originally announced July 2024.
-
Centrality dependence of Lévy-stable two-pion Bose-Einstein correlations in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions
Authors:
PHENIX Collaboration,
N. J. Abdulameer,
U. Acharya,
A. Adare,
C. Aidala,
N. N. Ajitanand,
Y. Akiba,
R. Akimoto,
H. Al-Ta'ani,
J. Alexander,
A. Angerami,
K. Aoki,
N. Apadula,
Y. Aramaki,
H. Asano,
E. C. Aschenauer,
E. T. Atomssa,
T. C. Awes,
B. Azmoun,
V. Babintsev,
M. Bai,
B. Bannier,
K. N. Barish,
B. Bassalleck,
S. Bathe
, et al. (377 additional authors not shown)
Abstract:
The PHENIX experiment measured the centrality dependence of two-pion Bose-Einstein correlation functions in $\sqrt{s_{_{NN}}}=200$~GeV Au$+$Au collisions at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The data are well represented by Lévy-stable source distributions. The extracted source parameters are the correlation-strength parameter $λ$, the Lévy index of stability $α$, and the Lévy-scale parameter $R$ as a function of transverse mass $m_T$ and centrality. The $λ(m_T)$ parameter is constant at larger values of $m_T$, but decreases as $m_T$ decreases. The Lévy scale parameter $R(m_T)$ decreases with $m_T$ and exhibits proportionality to the length scale of the nuclear overlap region. The Lévy exponent $α(m_T)$ is independent of $m_T$ within uncertainties in each investigated centrality bin, but shows a clear centrality dependence. At all centralities, the Lévy exponent $α$ is significantly different from that of Gaussian ($α=2$) or Cauchy ($α=1$) source distributions. Comparisons to the predictions of Monte-Carlo simulations of resonance-decay chains show that in all but the most peripheral centrality class (50%-60%), the obtained results are inconsistent with the measurements, unless a significant reduction of the in-medium mass of the $η'$ meson is included. In each centrality class, the best value of the in-medium $η'$ mass is compared to the mass of the $η$ meson, as well as to several theoretical predictions that consider restoration of $U_A(1)$ symmetry in hot hadronic matter.
Submitted 11 July, 2024;
originally announced July 2024.
-
DIOR-ViT: Differential Ordinal Learning Vision Transformer for Cancer Classification in Pathology Images
Authors:
Ju Cheon Lee,
Keunho Byeon,
Boram Song,
Kyungeun Kim,
Jin Tae Kwak
Abstract:
In computational pathology, cancer grading has been mainly studied as a categorical classification problem, which does not utilize the ordinal nature of cancer grades, namely that the higher the grade is, the worse the cancer is. To incorporate the ordering relationship among cancer grades, we introduce a differential ordinal learning problem in which we define and learn the degree of difference in the categorical class labels between pairs of samples by using their differences in the feature space. To this end, we propose a transformer-based neural network that simultaneously conducts both categorical classification and differential ordinal classification for cancer grading. We also propose a tailored loss function for differential ordinal learning. Evaluating the proposed method on three different types of cancer datasets, we demonstrate that the adoption of differential ordinal learning can improve the accuracy and reliability of cancer grading, outperforming conventional cancer grading approaches. The proposed approach should be applicable to other diseases and problems that involve ordinal relationships among class labels.
Submitted 10 July, 2024;
originally announced July 2024.
-
On the support of solutions to nonlinear stochastic heat equations
Authors:
Beom-Seok Han,
Kunwoo Kim,
Jaeyun Yi
Abstract:
We investigate the strict positivity and the compact support property of solutions to the one-dimensional nonlinear stochastic heat equation: $$\partial_t u(t,x) = \frac{1}{2}\partial^2_x u(t,x) + σ(u(t,x))\dot{W}(t,x), \quad (t,x)\in \mathbf{R}_+\times\mathbf{R},$$ with nonnegative and compactly supported initial data $u_0$, where $\dot{W}$ is the space-time white noise and $σ:\mathbf{R} \to \mathbf{R} $ is a continuous function with $σ(0)=0$. We prove that (i) if $v/ σ(v)$ is sufficiently large near $v=0$, then the solution $u(t,\cdot)$ is strictly positive for all $t>0$, and (ii) if $v/σ(v)$ is sufficiently small near $v= 0$, then the solution $u(t,\cdot)$ has compact support for all $t>0$. These findings extend previous results concerning the strict positivity and the compact support property, which were analyzed only for the case $σ(u)\approx u^γ$ for $γ>0$. Additionally, we establish the uniqueness of a solution and the weak comparison principle in case (i).
Submitted 9 July, 2024;
originally announced July 2024.
-
Improved limit on neutrinoless double beta decay of $^{100}$Mo from AMoRE-I
Authors:
A. Agrawal,
V. V. Alenkov,
P. Aryal,
J. Beyer,
B. Bhandari,
R. S. Boiko,
K. Boonin,
O. Buzanov,
C. R. Byeon,
N. Chanthima,
M. K. Cheoun,
J. S. Choe,
Seonho Choi,
S. Choudhury,
J. S. Chung,
F. A. Danevich,
M. Djamal,
D. Drung,
C. Enss,
A. Fleischmann,
A. M. Gangapshev,
L. Gastaldo,
Y. M. Gavrilyuk,
A. M. Gezhaev,
O. Gileva
, et al. (83 additional authors not shown)
Abstract:
AMoRE searches for the signature of neutrinoless double beta decay of $^{100}$Mo with a 100 kg sample of enriched $^{100}$Mo. Scintillating molybdate crystals coupled with a metallic magnetic calorimeter operate at milli-Kelvin temperatures to measure the energy of electrons emitted in the decay. As a demonstration of the full-scale AMoRE, we conducted AMoRE-I, a pre-experiment with 18 molybdate crystals, at the Yangyang Underground Laboratory for over two years. The exposure was 8.02 kg$\cdot$year (or 3.89 kg$_{\mathrm{^{100}Mo}}\cdot$year) and the total background rate near the Q-value was 0.025 $\pm$ 0.002 counts/keV/kg/year. We observed no indication of $0νββ$ decay and report a new lower limit of the half-life of $^{100}$Mo $0νββ$ decay as $ T^{0ν}_{1/2}>3.0\times10^{24}~\mathrm{years}$ at 90\% confidence level. The effective Majorana mass limit range is $m_{ββ}<$(210--610) meV using nuclear matrix elements estimated in the framework of different models, including the recent shell model calculations.
Submitted 8 July, 2024;
originally announced July 2024.
-
Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (349 additional authors not shown)
Abstract:
We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle~II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper limits at 90\% credibility level on the branching fractions of $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛπ^-$ are determined to be $4.7 \times 10^{-8}$ and $4.3 \times 10^{-8}$, respectively.
Submitted 6 July, 2024;
originally announced July 2024.
-
Domain Wall Networks as Skyrmion Crystals in Chiral Magnets
Authors:
Seungho Lee,
Toshiaki Fujimori,
Muneto Nitta,
Se Kwon Kim
Abstract:
We theoretically investigate the ground states of a chiral magnet with a square anisotropy and show that it supports domain wall networks as stable ground states. A domain wall junction in the domain wall network turns out to be a skyrmion with half topological charge and, therefore, the found domain wall network has a second topological nature, a skyrmion crystal. More specifically, we present a ground-state phase diagram of the chiral magnet with varying anisotropy parameters consisting of skyrmion lattices, chiral soliton lattices, and ferromagnetic states. In the presence of the square anisotropy, the skyrmion crystal forms a domain wall network. The size of domains in the domain wall network is shown to be tunable by an external magnetic field, offering a way to realize experimentally detectable domain wall networks.
Submitted 4 July, 2024;
originally announced July 2024.
-
Evidence of $h_{b}(\text{2P}) \to Υ(\text{1S})η$ decay and search for $h_{b}(\text{1P,2P}) \to Υ(\text{1S})π^0$ with the Belle detector
Authors:
Belle Collaboration,
E. Kovalenko,
I. Adachi,
H. Aihara,
D. M. Asner,
T. Aushev,
R. Ayad,
V. Babu,
Sw. Banerjee,
K. Belous,
J. Bennett,
M. Bessner,
T. Bilka,
D. Biswas,
A. Bobrov,
D. Bodrov,
A. Bondar,
A. Bozek,
M. Bračko,
P. Branchini,
T. E. Browder,
A. Budano,
M. Campajola,
M. -C. Chang,
B. G. Cheon
, et al. (142 additional authors not shown)
Abstract:
We report the first evidence for the $h_{b}(\text{2P}) \to Υ(\text{1S})η$ transition with a significance of $3.5$ standard deviations. The decay branching fraction is measured to be $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})η]=(7.1 ~^{+3.7} _{-3.2}\pm 0.8)\times10^{-3}$, which is noticeably smaller than expected. We also set upper limits on $π^0$ transitions of $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, and $\mathcal{B}[h_{b}(\text{1P})\to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, at the $90\%$ confidence level. These results are obtained with a $131.4$~fb$^{-1}$ data sample collected near the $Υ(\text{5S})$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider.
Submitted 4 July, 2024;
originally announced July 2024.
-
Absolute average and median treatment effects as causal estimands on metric spaces
Authors:
Ha-Young Shin,
Kyusoon Kim,
Kwonsang Lee,
Hee-Seok Oh
Abstract:
We define the notions of absolute average and median treatment effects as causal estimands on general metric spaces such as Riemannian manifolds, propose estimators using stratification, and prove several properties, including strong consistency. In the process, we also demonstrate the strong consistency of the weighted sample Fréchet means and geometric medians. Stratification allows these estimators to be utilized beyond the narrow constraints of a completely randomized experiment. After constructing confidence intervals using bootstrapping, we outline how to use the proposed estimates to test Fisher's sharp null hypothesis that the absolute average or median treatment effect is zero. Empirical evidence for the strong consistency of the estimators and the reasonable asymptotic coverage of the confidence intervals is provided through simulations in both randomized experiments and observational study settings. We also apply our methods to real data from an observational study to investigate the causal relationship between Alzheimer's disease and the shape of the corpus callosum, rejecting the aforementioned null hypotheses in cases where conventional Euclidean methods fail to do so. Our proposed methods are more generally applicable than past studies in dealing with general metric spaces.
Submitted 4 July, 2024;
originally announced July 2024.
-
Submillimeter and Mid-Infrared Variability of Young Stellar Objects in the M17SWex Intermediate-Mass Star-Forming Region
Authors:
Geumsook Park,
Doug Johnstone,
Carlos Contreras Pena,
Jeong-Eun Lee,
Sheng-Yuan Liu,
Gregory Herczeg,
Steve Mairs,
Zhiwei Chen,
Jennifer Hatchell,
Kee-Tae Kim,
Mi-Ryang Kim,
Keping Qiu,
Yao-Te Wang,
Xu Zhang,
The JCMT Transient Team
Abstract:
We present a comprehensive analysis of young stellar object (YSO) variability within the M17 Southwest Extension (M17 SWex), using 3.5 years of monitoring data from the JCMT Transient Survey at sub-millimeter (sub-mm) and 9 years from the NEOWISE mission at mid-infrared (mid-IR). Our study encompasses observations of 147 bright sub-mm peaks identified within our deep JCMT co-added map as well as 156 YSOs in NEOWISE W1 and 179 in W2 that were previously identified in Spitzer surveys. We find three robust sub-mm variables: two are candidate YSOs and one is a likely extragalactic source. At mid-IR wavelengths, our analysis reveals secular and stochastic variability in 47 YSOs, with the highest fraction of secular variability occurring at the earliest evolutionary stage. This is similar to what has previously been observed for low-mass YSO variability within the Gould Belt. However, we observe less overall variability in M17SWex at both the sub-mm and mid-IR. We suspect that this lower fraction is due to the greater distance to M17 SWex. Our findings showcase the utility of multi-wavelength observations to better capture the complex variability phenomena inherent to star formation processes and demonstrate the importance of years-long monitoring of a diverse selection of star-forming environments.
Submitted 3 July, 2024;
originally announced July 2024.
-
Fermi Surface Nesting Driving the RKKY Interaction in the Centrosymmetric Skyrmion Magnet Gd2PdSi3
Authors:
Yuyang Dong,
Yosuke Arai,
Kenta Kuroda,
Masayuki Ochi,
Natsumi Tanaka,
Yuxuan Wan,
Matthew D. Watson,
Timur K. Kim,
Cephise Cacho,
Makoto Hashimoto,
Donghui Lu,
Yuji Aoki,
Tatsuma D. Matsuda,
Takeshi Kondo
Abstract:
The magnetic skyrmions generated in a centrosymmetric crystal were recently first discovered in Gd2PdSi3. In light of this, we observe the electronic structure by angle-resolved photoemission spectroscopy (ARPES) and unveil its direct relationship with the magnetism in this compound. The Fermi surface and band dispersions are demonstrated to have a good agreement with the density functional theory (DFT) calculations carried out with careful consideration of the crystal superstructure. Most importantly, we find that the three-dimensional Fermi surface has extended nesting which matches well the q-vector of the magnetic order detected by recent scattering measurements. The consistency we find among ARPES, DFT, and the scattering measurements suggests the Ruderman-Kittel-Kasuya-Yosida (RKKY) interaction involving itinerant electrons to be the formation mechanism of skyrmions in Gd2PdSi3.
Submitted 3 July, 2024;
originally announced July 2024.
-
Rethinking Data Augmentation for Robust LiDAR Semantic Segmentation in Adverse Weather
Authors:
Junsung Park,
Kyungmin Kim,
Hyunjung Shim
Abstract:
Existing LiDAR semantic segmentation methods often struggle with performance declines in adverse weather conditions. Previous research has addressed this issue by simulating adverse weather or employing universal data augmentation during training. However, these methods lack a detailed analysis and understanding of how adverse weather negatively affects LiDAR semantic segmentation performance. Motivated by this issue, we identified key factors of adverse weather and conducted a toy experiment to pinpoint the main causes of performance degradation: (1) Geometric perturbation due to refraction caused by fog or droplets in the air and (2) Point drop due to energy absorption and occlusions. Based on these findings, we propose new strategic data augmentation techniques. First, we introduced a Selective Jittering (SJ) that jitters points in the random range of depth (or angle) to mimic geometric perturbation. Additionally, we developed a Learnable Point Drop (LPD) to learn vulnerable erase patterns with a Deep Q-Learning Network to approximate the point drop phenomenon from adverse weather conditions. Without precise weather simulation, these techniques strengthen the LiDAR semantic segmentation model by exposing it to vulnerable conditions identified by our data-centric analysis. Experimental results confirmed the suitability of the proposed data augmentation methods for enhancing robustness against adverse weather conditions. Our method attains a remarkable 39.5 mIoU on the SemanticKITTI-to-SemanticSTF benchmark, surpassing the previous state-of-the-art by over 5.4%p and tripling the improvement over the baseline achieved by previous methods.
Submitted 2 July, 2024;
originally announced July 2024.
-
Unconventional p-wave and finite-momentum superconductivity induced by altermagnetism through the formation of Bogoliubov Fermi surface
Authors:
SeungBeom Hong,
Moon Jip Park,
Kyoung-Min Kim
Abstract:
Altermagnets are an exotic class of magnetic materials wherein the Fermi surface exhibits a momentum-dependent spin-splitting while maintaining a net zero magnetization. Previous studies have shown that this distinctive spin-splitting can induce chiral p-wave superconductors or Fulde-Ferrell superconducting states carrying finite momentum. However, the underlying mechanisms of such unconventional superconductivities remain incompletely understood. Here, we propose that the formation of the Bogoliubov Fermi surface through the exchange field can play a significant role in such phenomena. Through a systematic self-consistent mean-field analysis on the extended attractive Hubbard model combined with the d-wave spin-splitting induced by the exchange field, as observed in RuO2, we demonstrate that the formation of the Bogoliubov Fermi surface suppresses conventional spin-singlet superconducting states with s-wave characteristics. In contrast, the chiral p-wave state maintains a fully gapped spectrum without the Fermi surface, thereby becoming the ground state in the strong field regime. In the intermediate regime, we find that the Fulde-Ferrell state becomes the predominant state through the optimization of available channels for Cooper pairing. Moreover, we illustrate how the prevalence of the chiral p-wave and Fulde-Ferrell states over the s-wave state changes under the variation of the field strength or chemical potential. Our findings provide valuable insights into potential pathways for realizing sought-after topological p-wave superconductivity and finite momentum pairing facilitated by altermagnetism.
Submitted 2 July, 2024;
originally announced July 2024.
-
Vortex confinement through an unquantized magnetic flux
Authors:
Geunyong Kim,
Jinyoung Yun,
Jinho Yang,
Ilkyu Yang,
Dirk Wulferding,
Roman Movshovich,
Gil Young Cho,
Ki-Seok Kim,
Garam Hahn,
Jeehoon Kim
Abstract:
Geometrically confined superconductors often experience a breakdown in the quantization of magnetic flux owing to the incomplete screening of the supercurrent against the field penetration. In this study, we report that the confinement of a magnetic field occurs regardless of the dimensionality of the system, extending even to 1D linear potential systems. By utilizing a vector-field magnetic force microscope, we successfully create a vortex-antivortex pair connected by a 1D unquantized magnetic flux in ultra-thin superconducting films. Through an investigation of the manipulation and thermal behavior of the vortex pair, we uncover a long-range interaction mediated by the unquantized magnetic flux. These findings suggest a universal phenomenon of unquantized magnetic flux formation, independent of the geometry of the system. Our results present an experimental route for probing the impact of confinement on superconducting properties and order parameters in unconventional superconductors characterized by extremely low dimensionality.
Submitted 1 July, 2024;
originally announced July 2024.
-
Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment
Authors:
The Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
J. K. Ahn,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (382 additional authors not shown)
Abstract:
A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, digamma, and dimuon events is (426.52 $\pm$ 0.03 $\pm$ 2.48)~fb$^{-1}$, (427.32 $\pm$ 0.03 $\pm$ 2.56)~fb$^{-1}$, and (424.84 $\pm$ 0.04 $\pm$ 3.88)~fb$^{-1}$, where the first uncertainties are statistical and the second are systematic. The resulting total integrated luminosity obtained from the combination of the three methods is (426.88 $\pm$ 1.93)~fb$^{-1}$.
Submitted 1 July, 2024;
originally announced July 2024.
-
Study of $χ_{bJ}(2P)\toωΥ(1S)$ at Belle
Authors:
Belle Collaboration,
Z. S. Stottler,
T. K. Pedlar,
B. G. Fulsom,
I. Adachi,
K. Adamczyk,
H. Aihara,
S. Al Said,
D. M. Asner,
H. Atmacan,
T. Aushev,
R. Ayad,
V. Babu,
Sw. Banerjee,
M. Bauer,
P. Behera,
K. Belous,
J. Bennett,
F. Bernlochner,
M. Bessner,
T. Bilka,
D. Biswas,
A. Bobrov,
D. Bodrov,
G. Bonvicini
, et al. (157 additional authors not shown)
Abstract:
We report a study of the hadronic transitions $χ_{bJ}(2P)\toωΥ(1S)$, with $ω\toπ^{+}π^{-}π^{0}$, using $28.2\times10^6~Υ(3S)$ mesons recorded by the Belle detector. We present the first evidence for the near--threshold transition $χ_{b0}(2P)\toωΥ(1S)$, the analog of the charm sector decay $χ_{c1}(3872)\toωJ/ψ$, with a branching fraction of $B\big(χ_{b0}(2P)\toωΥ(1S)\big) = \big(0.55\pm0.19\pm0.07\big)\%$. We also obtain branching fractions of $B\big(χ_{b1}(2P)\toωΥ(1S)\big) = \big(2.39{}^{+0.20}_{-0.19}\pm0.24\big)\%$ and $B\big(χ_{b2}(2P)\toωΥ(1S)\big) = \big(0.47{}^{+0.13}_{-0.12}\pm0.06\big)\%$, confirming the measurement of the $ω$ transitions of the $J=1,2~P$--wave states. The ratio for the $J=2$ to $J=1$ transitions is also measured and found to differ by 3.3 standard deviations from the expected value in the QCD multipole expansion.
Submitted 8 July, 2024; v1 submitted 30 June, 2024;
originally announced July 2024.
-
Filtration learning in exact multi-parameter persistent homology and classification of time-series data
Authors:
Keunsu Kim,
Jae-Hun Jung
Abstract:
To analyze the topological properties of the given discrete data, one needs to consider a continuous transform called filtration. Persistent homology serves as a tool to track changes of homology in the filtration. The outcome of the topological analysis of data varies depending on the choice of filtration, making the selection of filtration crucial. Filtration learning is an attempt to find an optimal filtration that minimizes the loss function. Exact Multi-parameter Persistent Homology (EMPH) has been recently proposed, particularly for topological time-series analysis, that utilizes the exact formula of rank invariant instead of calculating it. In this paper, we propose a framework for filtration learning of EMPH. We formulate an optimization problem and propose an algorithm for solving the problem. We then apply the proposed algorithm to several classification problems. Particularly, we derive the exact formula of the gradient of the loss function with respect to the filtration parameter, which makes it possible to directly update the filtration without using automatic differentiation, significantly enhancing the learning process.
Submitted 27 June, 2024;
originally announced June 2024.
-
Emergence of metachronal waves in a chain of symmetrically beating filaments
Authors:
Narina Jung,
Won Kyu Kim,
Changbong Hyeon
Abstract:
Recent experiments have shown that metachronal waves (MCWs) can emerge from a chain of symmetrically beating nematodes aligned at the edge of sessile droplets. Our study, employing a coupled elastohydrodynamic model of active filaments, elucidates that a misalignment caused by a tilt against the bounding wall disrupts the synchronization and generates a constant time lag between adjacent filaments, leading to MCWs. The MCWs, enhancing the fluid circulation, achieve their maximum thermodynamic efficiency over the same range of tilt angles observed in the nematode experiments.
Submitted 26 June, 2024;
originally announced June 2024.
-
Chandra detects low-luminosity AGN with $M_\mathrm{BH}=10^{4}-10^{6}~M_\mathrm{\odot}$ in nearby ($z<0.5$), dwarf and star-forming galaxies
Authors:
Mainak Singha,
Julissa Sarmiento,
Sangeeta Malhotra,
James E. Rhoads,
L. Y. Aaron Yung,
Junxian Wang,
Zhen-Ya Zheng,
Ruqiu Lin,
Keunho Kim,
Jialai Kang,
Santosh Harish
Abstract:
We searched the Chandra and XMM archives for observations of 900 green pea galaxies to find AGN signatures. Green peas are low-mass galaxies with prominent emission lines, similar in size and star formation rate to high-redshift dwarf galaxies. Of the 29 observations found, 9 show X-ray detections with $S/N>3$. The 2-10 keV X-ray luminosity for these 9 sources exceeds $10^{40}~\mathrm{erg~s}^{-1}$, with 2 sources exceeding $10^{41}~\mathrm{erg~s}^{-1}$, suggesting the presence of intermediate-mass black holes (IMBH) or low-luminosity AGN (LLAGN) with BH masses between $100-10^6M_\mathrm{\odot}$. All X-ray detected sources (plus 6 additional sources) show He~II$\lambda4686$ emission and a broad component of the H$α$ emission line, indicating winds. The line widths of the broad H$α$ and He II$\lambda4686$ emitting gas clouds are weakly correlated ($R^{2}=0.15$), suggesting He II$\lambda4686$ emission is inconsistent with winds from super-Eddington accretors. However, the ratio of X-ray luminosity to star formation rate shows an anti-correlation with metallicity in 5 out of 9 X-ray detected sources, implying ultraluminous X-ray sources are key contributors to the observed X-ray luminosity. This could be due to super-Eddington accretors or IMBH. The X-ray emission is much higher than that produced by Wolf-Rayet stars and supernovae-driven winds. Thus, the X-ray luminosity in these 9 sources can only be explained by black holes with masses over $100~M_\mathrm{\odot}$. Our findings suggest the presence of LLAGN in these galaxies, with broad H$α$ line widths implying BH masses of $10^4-10^6M_\mathrm{\odot}$. Given Green Peas' role as significant Lyman Continuum leakers, LLAGN in these galaxies could have contributed significantly to cosmic reionization.
Submitted 26 June, 2024;
originally announced June 2024.
-
Higher differentiability for the fractional $p$-Laplacian
Authors:
Lars Diening,
Kyeongbae Kim,
Ho-Sik Lee,
Simon Nowak
Abstract:
In this work, we study the higher differentiability of solutions to the inhomogeneous fractional $p$-Laplace equation under different regularity assumptions on the data. In the superquadratic case, we extend and sharpen several previous results, while in the subquadratic regime our results constitute completely novel developments even in the homogeneous case. In particular, in the local limit our results are consistent with well-known higher differentiability results for the standard inhomogeneous $p$-Laplace equation. All of our main results remain valid in the vectorial context of fractional $p$-Laplace systems.
Submitted 24 June, 2024;
originally announced June 2024.
-
Kinetic Inductance, Quantum Geometry, and Superconductivity in Magic-Angle Twisted Bilayer Graphene
Authors:
Miuko Tanaka,
Joel Î-j. Wang,
Thao H. Dinh,
Daniel Rodan-Legrain,
Sameia Zaman,
Max Hays,
Bharath Kannan,
Aziza Almanakly,
David K. Kim,
Bethany M. Niedzielski,
Kyle Serniak,
Mollie E. Schwartz,
Kenji Watanabe,
Takashi Taniguchi,
Jeffrey A. Grover,
Terry P. Orlando,
Simon Gustavsson,
Pablo Jarillo-Herrero,
William D. Oliver
Abstract:
The physics of superconductivity in magic-angle twisted bilayer graphene (MATBG) is a topic of keen interest in moiré systems research, and it may provide insight into the pairing mechanism of other strongly correlated materials such as high-$T_{\mathrm{c}}$ superconductors. Here, we use DC-transport and microwave circuit quantum electrodynamics (cQED) to measure directly the superfluid stiffness of superconducting MATBG via its kinetic inductance. We find the superfluid stiffness to be much larger than expected from conventional single-band Fermi liquid theory; rather, it aligns well with theory involving quantum geometric effects that are dominant at the magic angle. The temperature dependence of the superfluid stiffness exhibits a power-law behavior, which contraindicates an isotropic BCS model; instead, the extracted power-law exponents indicate an anisotropic superconducting gap, whether interpreted using the conventional anisotropic BCS model or a quantum geometric theory of flat-band superconductivity. Moreover, the quadratic dependence of the stiffness on both DC and microwave current is consistent with Ginzburg-Landau theory. Taken together, these findings strongly suggest a connection between quantum geometry, superfluid stiffness, and unconventional superconductivity in MATBG. Finally, the combined DC-microwave measurement platform used here is applicable to the investigation of other atomically thin superconductors.
Submitted 19 June, 2024;
originally announced June 2024.
-
Sound event detection based on auxiliary decoder and maximum probability aggregation for DCASE Challenge 2024 Task 4
Authors:
Sang Won Son,
Jongyeon Park,
Hong Kook Kim,
Sulaiman Vesal,
Jeong Eun Lim
Abstract:
In this report, we propose three novel methods for developing a sound event detection (SED) model for the DCASE 2024 Challenge Task 4. First, we propose an auxiliary decoder attached to the final convolutional block to improve feature extraction capabilities while reducing dependency on embeddings from pre-trained large models. The proposed auxiliary decoder operates independently from the main decoder, enhancing performance of the convolutional block during the initial training stages by assigning a different weight strategy between main and auxiliary decoder losses. Next, to address the time interval issue between the DESED and MAESTRO datasets, we propose maximum probability aggregation (MPA) during the training step. The proposed MPA method enables the model's output to be aligned with soft labels of 1 s in the MAESTRO dataset. Finally, we propose a multi-channel input feature that employs various versions of logmel and MFCC features to generate time-frequency patterns. The experimental results demonstrate the efficacy of these proposed methods in improving SED performance by achieving a balanced enhancement across different datasets and label types. Ultimately, this approach presents a significant step forward in developing more robust and flexible SED models.
Submitted 24 June, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
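The abstract above does not spell out how maximum probability aggregation is computed. A minimal sketch, assuming it means taking the per-class maximum of the frame-level probabilities inside each 1 s window so that predictions can be compared with 1 s soft labels, might look as follows. The function name, its parameters, and the example values are hypothetical and are not taken from the report.

```python
from typing import List


def max_probability_aggregation(
    frame_probs: List[List[float]],
    frames_per_segment: int,
) -> List[List[float]]:
    """Collapse frame-level class probabilities into segment-level ones.

    frame_probs[t][c] is the predicted probability of class c at frame t.
    Each output row is the per-class maximum over one window of
    `frames_per_segment` consecutive frames; a trailing partial window
    is aggregated as well. (Assumed behavior, not the authors' code.)
    """
    if frames_per_segment <= 0:
        raise ValueError("frames_per_segment must be positive")
    segments = []
    for start in range(0, len(frame_probs), frames_per_segment):
        window = frame_probs[start:start + frames_per_segment]
        # zip(*window) iterates class by class across the frames in the window.
        segments.append([max(class_probs) for class_probs in zip(*window)])
    return segments


# Hypothetical example: 4 frames, 2 classes, 2 frames per 1 s segment.
example = max_probability_aggregation(
    [[0.1, 0.9], [0.4, 0.2], [0.3, 0.3], [0.8, 0.1]],
    frames_per_segment=2,
)
```

In a training loop, the aggregated rows would be the quantities compared against the 1 s soft labels; the actual loss and frame rate used in the report are not stated in the abstract.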
-
SyncVSR: Data-Efficient Visual Speech Recognition with End-to-End Crossmodal Audio Token Synchronization
Authors:
Young Jin Ahn,
Jungwoo Park,
Sangha Park,
Jonghyun Choi,
Kee-Eung Kim
Abstract:
Visual Speech Recognition (VSR) stands at the intersection of computer vision and speech recognition, aiming to interpret spoken content from visual cues. A prominent challenge in VSR is the presence of homophenes, visually similar lip gestures that represent different phonemes. Prior approaches have sought to distinguish fine-grained visemes by aligning visual and auditory semantics, but often fell short of full synchronization. To address this, we present SyncVSR, an end-to-end learning framework that leverages quantized audio for frame-level crossmodal supervision. By integrating a projection layer that synchronizes visual representation with acoustic data, our encoder learns to generate discrete audio tokens from a video sequence in a non-autoregressive manner. SyncVSR shows versatility across tasks, languages, and modalities at the cost of a forward pass. Our empirical evaluations show that it not only achieves state-of-the-art results but also reduces data usage by up to ninefold.
Submitted 17 June, 2024;
originally announced June 2024.
-
Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
Authors:
Seungwoo Son,
Wonpyo Park,
Woohyun Han,
Kyuyeun Kim,
Jaeho Lee
Abstract:
Despite recent advances in LLM quantization, activation quantization remains challenging due to activation outliers. Conventional remedies, e.g., mixing precisions for different channels, introduce extra overhead and reduce the speedup. In this work, we develop a simple yet effective strategy to facilitate per-tensor activation quantization by preventing the generation of problematic tokens. Specifically, we propose a method to find a key-value cache, coined CushionCache, which mitigates outliers in subsequent tokens when inserted as a prefix. CushionCache works in two steps: First, we greedily search for a prompt token sequence that minimizes the maximum activation values in subsequent tokens. Then, we further tune the token cache to regularize the activations of subsequent tokens to be more quantization-friendly. The proposed method successfully addresses activation outliers of LLMs, providing a substantial performance boost for per-tensor activation quantization methods. We thoroughly evaluate our method over a wide range of models and benchmarks and find that it significantly surpasses the established baseline of per-tensor W8A8 quantization and can be seamlessly integrated with the recent activation quantization method.
Submitted 17 June, 2024;
originally announced June 2024.
-
ChatPCG: Large Language Model-Driven Reward Design for Procedural Content Generation
Authors:
In-Chang Baek,
Tae-Hwa Park,
Jin-Ha Noh,
Cheong-Mok Bae,
Kyung-Joong Kim
Abstract:
Driven by the rapid growth of machine learning, recent advances in game artificial intelligence (AI) have significantly impacted productivity across various gaming genres. Reward design plays a pivotal role in training game AI models, wherein researchers implement concepts of specific reward functions. However, despite the presence of AI, the reward design process predominantly remains in the domain of human experts, as it is heavily reliant on their creativity and engineering skills. Therefore, this paper proposes ChatPCG, a large language model (LLM)-driven reward design framework. It leverages human-level insights, coupled with game expertise, to generate rewards tailored to specific game features automatically. Moreover, ChatPCG is integrated with deep reinforcement learning, demonstrating its potential for multiplayer game content generation tasks. The results suggest that the proposed LLM exhibits the capability to comprehend game mechanics and content generation tasks, enabling tailored content generation for a specified game. This study not only highlights the potential for improving accessibility in content generation but also aims to streamline the game AI development process.
Submitted 7 June, 2024;
originally announced June 2024.
-
Anti-aliased metasurfaces beyond the Nyquist limit
Authors:
Seokwoo Kim,
Joohoon Kim,
Kyungtae Kim,
Minsu Jeong,
Junsuk Rho
Abstract:
Sampling is a pivotal element in the design of metasurfaces, enabling a broad spectrum of applications. Despite its flexibility, sampling can result in reduced efficiency and unintended diffractions, which are more pronounced at high numerical aperture or shorter wavelengths, e.g., in the ultraviolet spectrum. Prevailing metasurface research has often relied on the conventional Nyquist sampling theorem to assess sampling appropriateness; however, our findings reveal that the Nyquist criterion is insufficient for preventing diffractive distortion. Specifically, we find that the performance of a metasurface is significantly correlated with the geometric relationship between the spectrum morphology and sampling lattice. Based on lattice-based diffraction analysis, we demonstrate several anti-aliasing strategies from visible to ultraviolet regimes. These approaches significantly reduce aliasing phenomena occurring in high numerical aperture metasurfaces.
Submitted 17 June, 2024;
originally announced June 2024.
-
Performance Improvement of Language-Queried Audio Source Separation Based on Caption Augmentation From Large Language Models for DCASE Challenge 2024 Task 9
Authors:
Do Hyun Lee,
Yoonah Song,
Hong Kook Kim
Abstract:
We present a prompt-engineering-based text-augmentation approach applied to a language-queried audio source separation (LASS) task. To enhance the performance of LASS, the proposed approach utilizes large language models (LLMs) to generate multiple captions corresponding to each sentence of the training dataset. To this end, we first perform experiments to identify the most effective prompts for caption augmentation with a smaller number of captions. A LASS model trained with these augmented captions demonstrates improved performance on the DCASE 2024 Task 9 validation set compared to that trained without augmentation. This study highlights the effectiveness of LLM-based caption augmentation in advancing language-queried audio source separation.
Submitted 17 June, 2024;
originally announced June 2024.
-
Projected background and sensitivity of AMoRE-II
Authors:
A. Agrawal,
V. V. Alenkov,
P. Aryal,
J. Beyer,
B. Bhandari,
R. S. Boiko,
K. Boonin,
O. Buzanov,
C. R. Byeon,
N. Chanthima,
M. K. Cheoun,
J. S. Choe,
Seonho Choi,
S. Choudhury,
J. S. Chung,
F. A. Danevich,
M. Djamal,
D. Drung,
C. Enss,
A. Fleischmann,
A. M. Gangapshev,
L. Gastaldo,
Y. M. Gavrilyuk,
A. M. Gezhaev,
O. Gileva
, et al. (81 additional authors not shown)
Abstract:
AMoRE-II aims to search for neutrinoless double beta decay with an array of 423 Li$_2$$^{100}$MoO$_4$ crystals operating in the cryogenic system as the main phase of the Advanced Molybdenum-based Rare process Experiment (AMoRE). AMoRE has been planned to operate in three phases: AMoRE-pilot, AMoRE-I, and AMoRE-II. AMoRE-II is currently being installed at the Yemi Underground Laboratory, located approximately 1000 meters deep in Jeongseon, Korea. The goal of AMoRE-II is to reach up to $T^{0νββ}_{1/2}$ $\sim$ 6 $\times$ 10$^{26}$ years, corresponding to an effective Majorana mass of 15 - 29 meV, covering all the inverted mass hierarchy regions. To achieve this, the background level of the experimental configurations and possible background sources of gamma and beta events should be well understood. We have intensively performed Monte Carlo simulations using the GEANT4 toolkit in all the experimental configurations with potential sources. We report the estimated background level that meets the 10$^{-4}$ counts/(keV$\cdot$kg$\cdot$yr) requirement for AMoRE-II in the region of interest (ROI) and show the projected half-life sensitivity based on the simulation study.
Submitted 13 June, 2024;
originally announced June 2024.
-
DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding
Authors:
Suwon Shon,
Kwangyoun Kim,
Yi-Te Hsu,
Prashant Sridhar,
Shinji Watanabe,
Karen Livescu
Abstract:
The integration of pre-trained text-based large language models (LLMs) with speech input has enabled instruction-following capabilities for diverse speech tasks. This integration requires the use of a speech encoder, a speech adapter, and an LLM, trained on diverse tasks. We propose the use of discrete speech units (DSU), rather than continuous-valued speech encoder outputs, that are converted to the LLM token embedding space using the speech adapter. We generate DSU using a self-supervised speech encoder followed by k-means clustering. The proposed model shows robust performance on speech inputs from seen/unseen domains and instruction-following capability in spoken question answering. We also explore various types of DSU extracted from different layers of the self-supervised speech encoder, as well as Mel-frequency cepstral coefficients (MFCC). Our findings suggest that the ASR task and datasets are not crucial in instruction-tuning for spoken question answering tasks.
Submitted 13 June, 2024;
originally announced June 2024.
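The abstract above says discrete speech units come from a self-supervised speech encoder followed by k-means clustering. As a rough illustration of the quantization step only, the sketch below assigns each frame-level feature vector to its nearest centroid and uses the centroid index as the unit. It assumes the centroids have already been fitted by k-means; the function names, the toy 2-D features, and the centroids are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def to_discrete_units(
    frames: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
) -> List[int]:
    """Map each frame-level feature vector to the index of its nearest centroid.

    `centroids` is assumed to be the output of a k-means fit on encoder
    features; fitting itself is out of scope for this sketch.
    """
    if not centroids:
        raise ValueError("at least one centroid is required")
    return [
        min(
            range(len(centroids)),
            key=lambda k: squared_distance(frame, centroids[k]),
        )
        for frame in frames
    ]


# Hypothetical toy input: four 2-D "encoder features" and two centroids.
units = to_discrete_units(
    [[0.0, 0.1], [0.9, 1.2], [1.0, 1.0], [0.2, -0.1]],
    [[0.0, 0.0], [1.0, 1.0]],
)
```

The resulting integer sequence is what a speech adapter would then map into the LLM token embedding space; how the paper post-processes the unit sequence is not described in the abstract.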
-
Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning
Authors:
Jaehyun Nam,
Kyuyoung Kim,
Seunghyuk Oh,
Jihoon Tack,
Jaehyung Kim,
Jinwoo Shin
Abstract:
Learning effective representations from raw data is crucial for the success of deep learning methods. However, in the tabular domain, practitioners often prefer augmenting raw column features over using learned representations, as conventional tree-based algorithms frequently outperform competing approaches. As a result, feature engineering methods that automatically generate candidate features have been widely used. While these approaches are often effective, there remains ambiguity in defining the space over which to search for candidate features. Moreover, they often rely solely on validation scores to select good features, neglecting valuable feedback from past experiments that could inform the planning of future experiments. To address these shortcomings, we propose a new tabular learning framework based on large language models (LLMs), coined Optimizing Column feature generator with decision Tree reasoning (OCTree). Our key idea is to leverage LLMs' reasoning capabilities to find good feature generation rules without manually specifying the search space and provide language-based reasoning information highlighting past experiments as feedback for iterative rule improvements. Here, we use a decision tree for reasoning because it can be interpreted in natural language, effectively conveying knowledge of past experiments (i.e., the prediction models trained with the generated features) to the LLM. Our empirical results demonstrate that this simple framework consistently enhances the performance of various prediction models across diverse tabular benchmarks, outperforming competing automatic feature engineering methods.
Submitted 12 June, 2024;
originally announced June 2024.
-
Scaling behavior of the localization length for TE waves at critical incidence on short-range correlated stratified random media
Authors:
Seulong Kim,
Kihong Kim
Abstract:
We theoretically investigate the scaling behavior of the localization length for $s$-polarized electromagnetic waves incident at a critical angle on stratified random media with short-range correlated disorder. By employing the invariant embedding method, extended to waves in correlated random media, and utilizing the Shapiro-Loginov formula of differentiation, we accurately compute the localization length $ξ$ of $s$ waves incident obliquely on stratified random media that exhibit short-range correlated dichotomous randomness in the dielectric permittivity. The random component of the permittivity is characterized by the disorder strength parameter $σ^2$ and the disorder correlation length $l_c$. Away from the critical angle, $ξ$ depends on these parameters independently. However, precisely at the critical angle, we discover that for waves with wavenumber $k$, $kξ$ depends on the single parameter $kl_cσ^2$, satisfying a universal equation $kξ\approx 1.3717\left(kl_cσ^2\right)^{-1/3}$ across the entire range of parameter values. Additionally, we find that $ξ$ scales as $λ^{4/3}$ for the entire range of the wavelength $λ$, regardless of the values of $σ^2$ and $l_c$. We demonstrate that under sufficiently strong disorder, the scaling behavior of the localization length for all other incident angles converges to that for the critical incidence.
Submitted 12 June, 2024;
originally announced June 2024.
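The wavelength scaling quoted in the abstract above follows directly from the universal relation at critical incidence. A short worked step, using only the symbols defined in the abstract together with the standard identity $k = 2\pi/\lambda$, and assuming $l_c$ and $\sigma^2$ are held fixed as the wavelength varies:

```latex
\begin{align}
k\xi &\approx 1.3717\,\left(k\, l_c\, \sigma^2\right)^{-1/3} \\
\xi  &\approx 1.3717\; k^{-4/3} \left(l_c\, \sigma^2\right)^{-1/3} \\
     &= 1.3717\,(2\pi)^{-4/3}\,\lambda^{4/3} \left(l_c\, \sigma^2\right)^{-1/3}
\end{align}
```

Dividing the first line by $k$ gives the second, and substituting $k = 2\pi/\lambda$ gives the third, so $\xi \propto \lambda^{4/3}$ with a prefactor that depends on disorder only through the product $l_c\sigma^2$.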
-
Holstein polarons, Rashba-like spin splitting and Ising superconductivity in electron-doped MoSe2
Authors:
Sung Won Jung,
Saumya Mukherjee,
Matthew D. Watson,
Daniil V. Evtushinsky,
Cephise Cacho,
Edoardo Martino,
Helmut Berger,
Timur K. Kim
Abstract:
Interaction between electrons and phonons in solids is a key effect defining physical properties of materials such as electrical and thermal conductivity. In transition metal dichalcogenides (TMDCs), the electron-phonon coupling results in the creation of polarons, quasiparticles that manifest themselves as discrete features in the electronic spectral function. In this study, we report the formation of polarons at the alkali-dosed MoSe2 surface, where Rashba-like spin splitting of the conduction band states is caused by an inversion-symmetry breaking electric field. In addition, we observe the crossover from phonon-like to plasmon-like polaronic spectral features at the MoSe2 surface with increasing doping. Our findings support the concept of electron-phonon coupling mediated superconductivity in electron-doped layered TMDC materials, observed using ionic liquid gating technology. Furthermore, the discovered spin-splitting at the Fermi level could offer crucial experimental validation for theoretical models of Ising-type superconductivity in these materials.
Submitted 12 June, 2024;
originally announced June 2024.
-
Holographic reconstruction of black hole spacetime: machine learning and entanglement entropy
Authors:
Byoungjoon Ahn,
Hyun-Sik Jeong,
Keun-Young Kim,
Kwan Yun
Abstract:
We investigate the bulk reconstruction of AdS black hole spacetime emergent from quantum entanglement within a machine learning framework. Utilizing neural ordinary differential equations alongside Monte-Carlo integration, we develop a method tailored for continuous training functions to extract the general isotropic bulk metric from entanglement entropy data. To validate our approach, we first apply our machine learning algorithm to holographic entanglement entropy data derived from the Gubser-Rocha and superconductor models, which serve as representative models of strongly coupled matter in holography. Our algorithm successfully extracts the corresponding bulk metrics from these data. Additionally, we extend our methodology to many-body systems by employing entanglement entropy data from a fermionic tight-binding chain at half filling, exemplifying critical one-dimensional systems, and derive the associated bulk metric. We find that the metrics for a tight-binding chain and the Gubser-Rocha model are similar. We speculate that this similarity is due to the metallic property of these models.
Submitted 11 June, 2024;
originally announced June 2024.
-
Measurement of the branching fractions of $\bar{B}\to D^{(*)} K^- K^{(*)0}_{(S)}$ and $\bar{B}\to D^{(*)}D_s^{-}$ decays at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (382 additional authors not shown)
Abstract:
We present measurements of the branching fractions of eight $\overline B{}^0\to D^{(*)+} K^- K^{(*)0}_{(S)}$, $B^{-}\to D^{(*)0} K^- K^{(*)0}_{(S)}$ decay channels. The results are based on data from SuperKEKB electron-positron collisions at the $Υ(4S)$ resonance collected with the Belle II detector, corresponding to an integrated luminosity of $362~\text{fb}^{-1}$. The event yields are extracted from fits to the distributions of the difference between expected and observed $B$ meson energy, and are efficiency-corrected as a function of $m(K^-K^{(*)0}_{(S)})$ and $m(D^{(*)}K^{(*)0}_{(S)})$ in order to avoid dependence on the decay model. These results include the first observation of $\overline B{}^0\to D^+K^-K_S^0$, $B^-\to D^{*0}K^-K_S^0$, and $\overline B{}^0\to D^{*+}K^-K_S^0$ decays and a significant improvement in the precision of the other channels compared to previous measurements. The helicity-angle distributions and the invariant mass distributions of the $K^- K^{(*)0}_{(S)}$ systems are compatible with quasi-two-body decays via a resonant transition with spin-parity $J^P=1^-$ for the $K^-K_S^0$ systems and $J^P= 1^+$ for the $K^-K^{*0}$ systems. We also present measurements of the branching fractions of four $\overline B{}^0\to D^{(*)+} D_s^-$, $B^{-}\to D^{(*)0} D_s^- $ decay channels with a precision compatible to the current world averages.
Submitted 10 June, 2024;
originally announced June 2024.
-
Exclusion of the Cosmological Triangle in Reactor-Based Search for Axion-Like Particles
Authors:
Byung Ju Park,
Jae Jin Choi,
Eunju Jeon,
Jinyu Kim,
Kyungwon Kim,
Sung Hyun Kim,
Sun Kee Kim,
Yeongduk Kim,
Young Ju Ko,
Byoung-Cheol Koh,
Chang Hyon Ha,
Seo Hyun Lee,
In Soo Lee,
Hyunseok Lee,
Hyun Su Lee,
Jaison Lee,
Yoomin Oh,
Doojin Kim
Abstract:
We report new constraints on axion-like particles (ALPs) using data corresponding to a sodium iodide target exposure of 3063 kg$\cdot$days from the Neutrino Elastic scattering Observation with NaI (NEON) experiment. A 16.7 kg thallium-doped sodium iodide target was located 23.7 meters from a 2.8 GW thermal power nuclear reactor. We searched for ALPs produced by high-flux photons by comparing the energy spectra of data collected during reactor-on (1596 kg$\cdot$days exposure) and reactor-off (1467 kg$\cdot$days exposure) periods. No signal consistent with ALP interaction was identified, allowing us to set exclusion limits at the 95% confidence level. Our limits cover previously unexplored regions for both photon couplings (${g_{aγ}}$) and electron couplings (${g_{ae}}$) for axion masses around 1 MeV/c$^2$. Notably, the NEON data exclude, for the first time, the region within the "cosmological triangle" that laboratory-based searches for photon couplings had left unconstrained. The observed 95% confidence level limits reach as low as ${g_{aγ}}$ of 4.33$\times$ 10$^{-8}$ GeV$^{-1}$ and ${g_{ae}}$ of 1.10$\times$ 10$^{-9}$ for axion masses of 1.7 MeV/c$^2$ and 1.0 MeV/c$^2$, respectively.
Submitted 11 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Solution for SMART-101 Challenge of CVPR Multi-modal Algorithmic Reasoning Task 2024
Authors:
Jinwoo Ahn,
Junhyeok Park,
Min-Jun Kim,
Kang-Hyeon Kim,
So-Yeong Sohn,
Yun-Ji Lee,
Du-Seong Chang,
Yu-Jung Heo,
Eun-Sol Kim
Abstract:
In this paper, the solution of HYU MLLAB KT Team to the Multimodal Algorithmic Reasoning Task: SMART-101 CVPR 2024 Challenge is presented. Beyond conventional visual question-answering problems, the SMART-101 challenge aims to achieve human-level multimodal understanding by tackling complex visio-linguistic puzzles designed for children in the 6-8 age group. To solve this problem, we suggest two main ideas. First, to utilize the reasoning ability of a large-scale language model (LLM), the given visual cues (images) are grounded in the text modality. For this purpose, we generate highly detailed text captions that describe the context of the image and use these captions as input for the LLM. Second, due to the nature of puzzle images, which often contain various geometric visual patterns, we utilize an object detection algorithm to ensure these patterns are not overlooked in the captioning process. We employed the SAM algorithm, which can detect various-size objects, to capture the visual features of these geometric patterns and used this information as input for the LLM. Under the puzzle split configuration, we achieved an option selection accuracy Oacc of 29.5 on the test set and a weighted option selection accuracy (WOSA) of 27.1 on the challenge set.
Submitted 9 June, 2024;
originally announced June 2024.
-
Control of spin-wave polarity and velocity using a ferrimagnetic domain wall
Authors:
Ehsan Faridi,
Giovanni Vignale,
Se Kwon Kim
Abstract:
We present a theoretical study of the scattering of spin waves by a domain wall (DW) in a ferrimagnetic (FiM) spin chain in which two sublattices carry spins of unequal magnitudes. We find that a narrow, but atomically smooth FiM DW exhibits a different behavior in comparison with similarly smooth ferromagnetic and antiferromagnetic DWs due to the inequivalence of the two sublattices. Specifically, for sufficiently weak anisotropy, the smaller spin at the center of the DW is found to become precisely normal to the easy-axis, selecting an arbitrary direction in the $xy$-plane and thereby breaking the U(1) spin-rotational symmetry spontaneously. This particular form of a FiM DW does not occur in antiferromagnetic systems and is shown to lead to a strong dependence of spin wave scattering pattern on the state of polarization of the spin wave, which can be either right-handed or left-handed, suggesting the utilization of such a narrow DW as a spin-wave filter. Moreover, we find that in the case of an atomically sharp DW, where all the spins point either up or down due to strong easy-axis anisotropy and therefore the polarization of the spin wave is conserved upon transmission, the wave vector of the spin wave changes after passing through the DW leading to a change in the group velocity of the spin wave. This change of the wave vector indicates the acceleration or deceleration of the spin waves and thus a sharp FiM DW could serve as a spin wave accelerator or decelerator in spintronics devices, offering a functionality absent in a ferromagnetic and an antiferromagnetic counterpart. Our results indicate that FiM spin textures can interact with spin waves distinctly from ferromagnetic and antiferromagnetic counterparts, suggesting that they may offer spin-wave functionalities that are absent in more conventional magnets.
Submitted 9 June, 2024;
originally announced June 2024.
-
RE-RAG: Improving Open-Domain QA Performance and Interpretability with Relevance Estimator in Retrieval-Augmented Generation
Authors:
Kiseung Kim,
Jay-Yoon Lee
Abstract:
The Retrieval Augmented Generation (RAG) framework utilizes a combination of parametric knowledge and external knowledge to demonstrate state-of-the-art performance on open-domain question answering tasks. However, the RAG framework suffers from performance degradation when the query is accompanied by irrelevant contexts. In this work, we propose the RE-RAG framework, which introduces a relevance estimator (RE) that not only provides relative relevance between contexts as previous rerankers did, but also provides confidence, which can be used to classify whether a given context is useful for answering the given question. We propose a weakly supervised method for training the RE simply utilizing question-answer data without any labels for correct contexts. We show that RE trained with a small generator (sLM) can not only improve the sLM fine-tuned together with RE but also improve previously unreferenced large language models (LLMs). Furthermore, we investigate new decoding strategies that utilize the proposed confidence measured by RE, such as telling the user that the question is "unanswerable" given the retrieved contexts, or relying on the LLM's parametric knowledge rather than on unrelated contexts.
Submitted 16 June, 2024; v1 submitted 9 June, 2024;
originally announced June 2024.
-
REP: Resource-Efficient Prompting for On-device Continual Learning
Authors:
Sungho Jeon,
Xinyue Ma,
Kwang In Kim,
Myeongjae Jeon
Abstract:
On-device continual learning (CL) requires the co-optimization of model accuracy and resource efficiency to be practical. This is extremely challenging because it must preserve accuracy while learning new tasks with continuously drifting data and maintain both high energy and memory efficiency to be deployable on real-world devices. Typically, a CL method leverages one of two types of backbone networks: CNN or ViT. It is commonly believed that CNN-based CL excels in resource efficiency, whereas ViT-based CL is superior in model performance, making each option attractive only for a single aspect. In this paper, we revisit this comparison while embracing powerful pre-trained ViT models of various sizes, including ViT-Ti (5.8M parameters). Our detailed analysis reveals that many practical options exist today for making ViT-based methods more suitable for on-device CL, even when accuracy, energy, and memory are all considered. To further expand this impact, we introduce REP, which improves resource efficiency specifically targeting prompt-based rehearsal-free methods. Our key focus is on avoiding catastrophic trade-offs with accuracy while trimming computational and memory costs throughout the training process. We achieve this by exploiting swift prompt selection that enhances input data using a carefully provisioned model, and by developing two novel algorithms, adaptive token merging (AToM) and adaptive layer dropping (ALD), that optimize the prompt updating stage. In particular, AToM and ALD perform selective skipping across the data and model-layer dimensions without compromising task-specific features in vision transformer models. Extensive experiments on three image classification datasets validate REP's superior resource efficiency over current state-of-the-art methods.
Submitted 7 June, 2024;
originally announced June 2024.
-
Measurements of the branching fractions of $Ξ_{c}^{0}\toΞ^{0}π^{0}$, $Ξ_{c}^{0}\toΞ^{0}η$, and $Ξ_{c}^{0}\toΞ^{0}η^{\prime}$ and asymmetry parameter of $Ξ_{c}^{0}\toΞ^{0}π^{0}$
Authors:
Belle and Belle II Collaborations,
I. Adachi,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (360 additional authors not shown)
Abstract:
We present a study of $Ξ_{c}^{0}\toΞ^{0}π^{0}$, $Ξ_{c}^{0}\toΞ^{0}η$, and $Ξ_{c}^{0}\toΞ^{0}η^{\prime}$ decays using the Belle and Belle~II data samples, which have integrated luminosities of 980~$\mathrm{fb}^{-1}$ and 426~$\mathrm{fb}^{-1}$, respectively. We measure the following relative branching fractions $${\cal B}(Ξ_{c}^{0}\toΞ^{0}π^{0})/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.48 \pm 0.02 ({\rm stat}) \pm 0.03 ({\rm syst}) ,$$ $${\cal B}(Ξ_{c}^{0}\toΞ^{0}η)/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.11 \pm 0.01 ({\rm stat}) \pm 0.01 ({\rm syst}) ,$$ $${\cal B}(Ξ_{c}^{0}\toΞ^{0}η^{\prime})/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.08 \pm 0.02 ({\rm stat}) \pm 0.01 ({\rm syst}) $$ for the first time, where the uncertainties are statistical ($\rm stat$) and systematic ($\rm syst$). By multiplying by the branching fraction of the normalization mode, ${\mathcal B}(Ξ_{c}^{0}\toΞ^{-}π^{+})$, we obtain the following absolute branching fraction results $(6.9 \pm 0.3 ({\rm stat}) \pm 0.5 ({\rm syst}) \pm 1.3 ({\rm norm})) \times 10^{-3}$, $(1.6 \pm 0.2 ({\rm stat}) \pm 0.2 ({\rm syst}) \pm 0.3 ({\rm norm})) \times 10^{-3}$, and $(1.2 \pm 0.3 ({\rm stat}) \pm 0.1 ({\rm syst}) \pm 0.2 ({\rm norm})) \times 10^{-3}$, for $Ξ_{c}^{0}$ decays to $Ξ^{0}π^{0}$, $Ξ^{0}η$, and $Ξ^{0}η^{\prime}$ final states, respectively. The third errors are from the uncertainty on ${\mathcal B}(Ξ_{c}^{0}\toΞ^{-}π^{+})$. The asymmetry parameter for $Ξ_{c}^{0}\toΞ^{0}π^{0}$ is measured to be $α(Ξ_{c}^{0}\toΞ^{0}π^{0}) = -0.90\pm0.15({\rm stat})\pm0.23({\rm syst})$.
Submitted 7 June, 2024;
originally announced June 2024.
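The abstract above obtains absolute branching fractions by multiplying each relative branching fraction by that of the normalization mode. A minimal Python sketch of the reverse check follows: it divides the quoted absolute central values by the quoted ratios to see whether all three imply roughly the same normalization value. The dictionary keys and the helper name `implied_normalization` are illustrative only, and the inputs are the rounded numbers from the abstract, so agreement can only be approximate.

```python
# Central values quoted in the abstract: (relative BF, absolute BF).
# Keys are informal labels for the final states, not official notation.
modes = {
    "Xi0 pi0": (0.48, 6.9e-3),
    "Xi0 eta": (0.11, 1.6e-3),
    "Xi0 eta'": (0.08, 1.2e-3),
}


def implied_normalization(relative: float, absolute: float) -> float:
    """Back out B(norm) from absolute = relative * B(norm)."""
    return absolute / relative


# One implied normalization value per decay mode.
implied = {name: implied_normalization(r, a) for name, (r, a) in modes.items()}

for name, value in implied.items():
    print(f"{name}: implied normalization BF = {value:.4f}")
```

Because the abstract's values are rounded to one or two significant figures, small differences between the three implied values are expected and do not indicate an inconsistency.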
-
Approximation-Aware Bayesian Optimization
Authors:
Natalie Maus,
Kyurae Kim,
Geoff Pleiss,
David Eriksson,
John P. Cunningham,
Jacob R. Gardner
Abstract:
High-dimensional Bayesian optimization (BO) tasks such as molecular design often require 10,000 function evaluations before obtaining meaningful results. While methods like sparse variational Gaussian processes (SVGPs) reduce computational requirements in these settings, the underlying approximations result in suboptimal data acquisitions that slow the progress of optimization. In this paper we modify SVGPs to better align with the goals of BO: targeting informed data acquisition rather than global posterior fidelity. Using the framework of utility-calibrated variational inference, we unify GP approximation and data acquisition into a joint optimization problem, thereby ensuring optimal decisions under a limited computational budget. Our approach can be used with any decision-theoretic acquisition function and is compatible with trust region methods like TuRBO. We derive efficient joint objectives for the expected improvement and knowledge gradient acquisition functions in both the standard and batch BO settings. Our approach outperforms standard SVGPs on high-dimensional benchmark tasks in control and molecular design.
Submitted 6 June, 2024;
originally announced June 2024.
-
Exploring the interplay between mass-energy equivalence, interactions and entanglement in an optical lattice clock
Authors:
Anjun Chu,
Victor J. Martínez-Lahuerta,
Maya Miklos,
Kyungtae Kim,
Peter Zoller,
Klemens Hammerer,
Jun Ye,
Ana Maria Rey
Abstract:
We propose protocols that probe manifestations of the mass-energy equivalence in an optical lattice clock (OLC) interrogated with spin coherent and entangled quantum states. To tune and uniquely distinguish the mass-energy equivalence effects (gravitational redshift and second-order Doppler shift) in such a setting, we devise a dressing protocol using an additional nuclear spin state. We then analyze the interplay between photon-mediated interactions and gravitational redshift and show that such interplay can lead to entanglement generation and frequency synchronization. In the regime where all atomic spins synchronize, we show that the synchronization time depends on the initial entanglement of the state and can be used as a proxy for its metrological gain compared to a classical state. Our work opens new possibilities for exploring the effects of general relativity on quantum coherence and entanglement in OLC experiments.
Submitted 6 June, 2024;
originally announced June 2024.
-
Optimizing Multi-User Semantic Communication via Transfer Learning and Knowledge Distillation
Authors:
Loc X. Nguyen,
Kitae Kim,
Ye Lin Tun,
Sheikh Salman Hassan,
Yan Kyaw Tun,
Zhu Han,
Choong Seon Hong
Abstract:
Semantic communication, notable for ensuring quality of service by jointly optimizing source and channel coding, effectively extracts data semantics, reduces transmission length, and mitigates channel noise. However, most studies overlook multi-user scenarios and resource availability, limiting real-world application. This paper addresses this gap by focusing on downlink communication from a base station to multiple users with varying computing capacities. Users employ variants of Swin transformer models for source decoding and a simple architecture for channel decoding. We propose a novel training regimen, incorporating transfer learning and knowledge distillation to improve low-computing users' performance. Extensive simulations validate the proposed methods.
Submitted 6 June, 2024;
originally announced June 2024.
-
BIPED: Pedagogically Informed Tutoring System for ESL Education
Authors:
Soonwoo Kwon,
Sojung Kim,
Minju Park,
Seunghyun Lee,
Kyuseok Kim
Abstract:
Large Language Models (LLMs) have great potential to serve as readily available and cost-efficient Conversational Intelligent Tutoring Systems (CITS) for teaching L2 learners of English. Existing CITS, however, are designed to teach only simple concepts or lack the pedagogical depth necessary to address diverse learning strategies. To develop a more pedagogically informed CITS capable of teaching complex concepts, we construct a BIlingual PEDagogically-informed Tutoring Dataset (BIPED) of one-on-one, human-to-human English tutoring interactions. Through post-hoc analysis of the tutoring interactions, we derive a lexicon of dialogue acts (34 tutor acts and 9 student acts), which we use to further annotate the collected dataset. Based on a two-step framework of first predicting the appropriate tutor act and then generating the corresponding response, we implemented two CITS models using GPT-4 and SOLAR-KO, respectively. We experimentally demonstrate that the implemented models not only replicate the style of human teachers but also employ diverse and contextually appropriate pedagogical strategies.
Submitted 5 June, 2024;
originally announced June 2024.
-
High-Performance Ferroelectric Field-Effect Transistors with Ultra-High Current and Carrier Densities
Authors:
Seunguk Song,
Kwan-Ho Kim,
Rachael Keneipp,
Nicholas Trainor,
Chen Chen,
Jeffrey Zheng,
Joan M. Redwing,
Marija Drndić,
Roy H. Olsson III,
Deep Jariwala
Abstract:
Ferroelectric field-effect transistors (FeFETs) with two-dimensional (2D) semiconductor channels are promising low-power, embedded non-volatile memory (NVM) candidates for next-generation in-memory computing. However, the performance of FeFETs can be limited by a charge imbalance between the ferroelectric layer and the channel, and for low-dimensional semiconductors, also by a high contact resistance between the metal electrodes and the channel. Here, we report a significant enhancement in performance of contact-engineered FeFETs with a 2D MoS2 channel and a ferroelectric Al0.68Sc0.32N (AlScN) gate dielectric. Replacing Ti with In contact electrodes results in a fivefold increase in on-state current (~120 uA/um at 1 V) and on-to-off ratio (~2*10^7) in the FeFETs. In addition, the high carrier concentration in the MoS2 channel during the on-state (> 10^14 cm^-2) facilitates the observation of a metal-to-insulator phase transition in monolayer MoS2, permitting observation of high field-effect mobility (> 100 cm^2V^-1s^-1) at cryogenic temperatures. Our work and devices broaden the potential of FeFETs and provide a unique platform to implement high-carrier-density transport in a 2D channel.
Submitted 4 June, 2024;
originally announced June 2024.
-
Advancing Ultra-Reliable 6G: Transformer and Semantic Localization Empowered Robust Beamforming in Millimeter-Wave Communications
Authors:
Avi Deb Raha,
Kitae Kim,
Apurba Adhikary,
Mrityunjoy Gain,
Choong Seon Hong
Abstract:
Advancements in 6G wireless technology have elevated the importance of beamforming, especially for attaining ultra-high data rates via millimeter-wave (mmWave) frequency deployment. Although promising, mmWave bands require substantial beam training to achieve precise beamforming. While initial deep learning models that use RGB camera images demonstrated promise in reducing beam training overhead, their performance suffers due to sensitivity to lighting and environmental variations. Due to this sensitivity, Quality of Service (QoS) fluctuates, eventually affecting the stability and dependability of networks in dynamic environments. This emphasizes a critical need for more robust solutions. This paper proposes a robust beamforming technique to ensure consistent QoS under varying environmental conditions. An optimization problem has been formulated to maximize users' data rates. To solve the formulated NP-hard optimization problem, we decompose it into two subproblems: the semantic localization problem and the optimal beam selection problem. To solve the semantic localization problem, we propose a novel method that leverages the k-means clustering and YOLOv8 model. To solve the beam selection problem, we propose a novel lightweight hybrid architecture that utilizes various data sources and a weighted entropy-based mechanism to predict the optimal beams. Rapid and accurate beam predictions are needed to maintain QoS. A novel metric, Accuracy-Complexity Efficiency (ACE), has been proposed to quantify this. Six testing scenarios have been developed to evaluate the robustness of the proposed model. Finally, the simulation result demonstrates that the proposed model outperforms several state-of-the-art baselines regarding beam prediction accuracy, received power, and ACE in the developed test scenarios.
Submitted 21 June, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
Demystifying SGD with Doubly Stochastic Gradients
Authors:
Kyurae Kim,
Joohwan Ko,
Yi-An Ma,
Jacob R. Gardner
Abstract:
Optimization objectives in the form of a sum of intractable expectations are rising in importance (e.g., diffusion models, variational autoencoders, and many more), a setting also known as "finite sum with infinite data." For these problems, a popular strategy is to employ SGD with doubly stochastic gradients (doubly SGD): the expectations are estimated using the gradient estimator of each component, while the sum is estimated by subsampling over these estimators. Despite its popularity, little is known about the convergence properties of doubly SGD, except under strong assumptions such as bounded variance. In this work, we establish the convergence of doubly SGD with independent minibatching and random reshuffling under general conditions, which encompass dependent component gradient estimators. In particular, for dependent estimators, our analysis allows fine-grained analysis of the effect of correlations. As a result, under a per-iteration computational budget of $b \times m$, where $b$ is the minibatch size and $m$ is the number of Monte Carlo samples, our analysis suggests where one should invest most of the budget in general. Furthermore, we prove that random reshuffling (RR) improves the complexity dependence on the subsampling noise.
Submitted 2 June, 2024;
originally announced June 2024.
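The doubly stochastic gradient estimator described in the abstract above can be illustrated on a toy problem. The sketch below is not the paper's setup: it assumes a hypothetical objective $F(x) = \frac{1}{n}\sum_i \mathbb{E}_η\left[\tfrac{1}{2}(x - a_i - η)^2\right]$ with $η \sim \mathcal{N}(0, 1)$, whose minimizer is the mean of the $a_i$. The function names, step size, and step count are illustrative choices, and only independent minibatching is shown, not random reshuffling.

```python
import numpy as np


def doubly_stochastic_grad(x, a, b, m, rng):
    """Estimate dF/dx for F(x) = mean_i E_eta[0.5 * (x - a_i - eta)^2]."""
    # Subsample b of the n components without replacement (noise from the sum).
    idx = rng.choice(len(a), size=b, replace=False)
    # Draw m Monte Carlo samples per selected component (noise from the expectation).
    eta = rng.standard_normal(size=(b, m))
    # Per-sample gradient of 0.5 * (x - a_i - eta)^2 with respect to x.
    per_sample = x - a[idx][:, None] - eta
    # Average over the Monte Carlo samples, then over the minibatch.
    return per_sample.mean(axis=1).mean()


def run_doubly_sgd(a, b, m, steps=2000, lr=0.05, seed=0):
    """Plain SGD with a fresh, independent minibatch at every step."""
    rng = np.random.default_rng(seed)
    x = 0.0
    for _ in range(steps):
        x -= lr * doubly_stochastic_grad(x, a, b, m, rng)
    return x


a = np.linspace(-1.0, 3.0, 20)  # toy component parameters; their mean is 1.0
x_final = run_doubly_sgd(a, b=4, m=5)
print(f"final iterate: {x_final:.3f} (minimizer is {a.mean():.3f})")
```

Each step costs $b \times m$ gradient evaluations, so varying `b` and `m` at a fixed product is one simple way to explore the budget trade-off the abstract mentions.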