Tail-robust factor modelling of vector and tensor time series in high dimensions

Authors: Matteo Barigozzi, Haeran Cho, Hyeyoung Maeng

Abstract: We study the problem of factor modelling vector- and tensor-valued time series in the presence of heavy tails in the data, which produce anomalous observations with non-negligible probability. For this, we propose to combine a two-step procedure with data truncation, which is easy to implement and does not require iteratively searching for a numerical solution. Departing from the light-tail assumptions often adopted in the time series factor modelling literature, we derive the theoretical properties of the proposed estimators while only assuming the existence of the $(2 + 2\epsilon)$-th moment for some $\epsilon \in (0, 1)$, fully characterising the effect of heavy tails on the rates of estimation as well as the level of truncation. Numerical experiments on simulated datasets demonstrate the good performance of the proposed estimator, which is further supported by applications to two macroeconomic datasets.

Submitted 12 July, 2024; originally announced July 2024.
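The abstract above mentions data truncation as the device for handling heavy tails. A minimal sketch of one common form, element-wise clipping at a level `tau`, is given here for illustration only. The function name `truncate`, the element-wise form, and the value of `tau` are assumptions; the abstract does not specify the authors' exact operator or how the truncation level is chosen.

```python
def truncate(x, tau):
    """Clip every entry of x to the interval [-tau, tau].

    Hypothetical helper: element-wise truncation is assumed here and is
    not confirmed by the abstract as the authors' exact operator.
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    return [max(-tau, min(tau, v)) for v in x]


# One observation of a vector time series with a heavy-tailed outlier.
observation = [0.5, -3.0, 10.0]
print(truncate(observation, tau=2.0))
```

Entries already inside the interval pass through unchanged, so the truncation only affects the anomalous observations that heavy tails produce.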

arXiv:2407.09139 [pdf, other]

Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, et al. (414 additional authors not shown)

Abstract: We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We determine these parameters for two ranges of $K^0_S π^0$ invariant mass: $m(K^0_S π^0)\in (0.8, 1.0)$ GeV/$c^2$, which is dominated by $B^0 \to K^{*0} (\to K^0_S π^0) γ$ decays, and a complementary region $m(K^0_S π^0)\in (0.6, 0.8)\cup(1.0, 1.8)$ GeV/$c^2$. Our results have improved precision as compared to previous measurements and are consistent with theory predictions.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 10 pages, 4 figures

Report number: Belle II Preprint 2024-009, KEK Preprint 2024-1

arXiv:2407.08984 [pdf, ps, other]

Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $\boldsymbol{B\rightarrowργ}$ decays using Belle and Belle II data

Authors: Belle II Collaboration, I. Adachi, K. Adamczyk, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, et al. (385 additional authors not shown)

Abstract: We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle II data sets yields $114\pm 12$ $B^{+}\rightarrowρ^{+}γ$ and $99\pm 12$ $B^{0}\rightarrowρ^{0}γ$ decays. The measured branching fractions are $(13.1^{+2.0 +1.3}_{-1.9 -1.2})\times 10^{-7}$ and $(7.5\pm 1.3^{+1.0}_{-0.8})\times 10^{-7}$ for $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays, respectively, where the first uncertainty is statistical and the second is systematic. We also measure the isospin asymmetry $A_{\rm I}(B\rightarrowργ)=(10.9^{+11.2 +7.8}_{-11.7 -7.3})\%$ and the direct CP asymmetry $A_{CP}(B^{+}\rightarrowρ^{+}γ)=(-8.2\pm 15.2^{+1.6}_{-1.2})\%$.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 12 pages, 4 figures

Report number: Belle II Preprint 2023-019; KEK Preprint 2023-37

arXiv:2407.08769 [pdf, other]

AuNR-SMA: Automated Gold Nanorod Spectral Morphology Analysis Pipeline

Authors: Samuel P. Gleason, Jakob C. Dahl, Mahmoud Elzouka, Xingzhi Wang, Dana O. Byrne, Mumtaz Gababa, Hannah Cho, Ravi Prasher, Sean Lubner, Emory Chan, A. Paul Alivisatos

Abstract: The development of a colloidal synthesis procedure to produce nanomaterials of a specific size with high shape and size purity is often a time-consuming, iterative process. This is often due to the time-, resource-, and expertise-intensive characterization methods required for quantitative determination of nanomaterial size and shape. Absorption spectroscopy is often the easiest method of colloidal nanomaterial characterization; however, due to the lack of a reliable method to extract nanoparticle shapes from absorption spectroscopy, it is generally treated as a more qualitative measure for metal nanoparticles. This work demonstrates a gold nanorod (AuNR) spectral morphology analysis (SMA) tool, AuNR-SMA, which is a fast and accurate method to extract quantitative information about an AuNR sample's structural parameters from its absorption spectra. We apply AuNR-SMA in three distinct applications. First, we demonstrate its utility as an automated analysis tool in a high throughput AuNR synthesis procedure by generating quantitative size information from optical spectra. Second, we use the predictions generated by this model to train a machine learning model capable of predicting the resulting AuNR size distributions from the reaction conditions used to synthesize them. Third, we apply this model to spectra extracted from the literature where no size distributions are reported to impute unreported quantitative information of AuNR synthesis. This approach can potentially be extended to any other nanocrystal system where the absorption spectra are size dependent and accurate numerical simulation of the absorption spectra is possible. In addition, this pipeline could be integrated into automated synthesis apparatuses to provide interpretable data from simple measurements and help explore the synthesis science of nanoparticles in a rational manner or facilitate closed-loop workflows.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.05713 [pdf, other]

Short-term Object Interaction Anticipation with Disentangled Object Detection @ Ego4D Short Term Object Interaction Anticipation Challenge

Authors: Hyunjin Cho, Dong Un Kang, Se Young Chun

Abstract: Short-term object interaction anticipation is an important task in egocentric video analysis, including precise predictions of future interactions and their timings as well as the categories and positions of the involved active objects. To alleviate the complexity of this task, our proposed method, SOIA-DOD, effectively decomposes it into 1) detecting active objects and 2) classifying interactions and predicting their timing. Our method first detects all potential active objects in the last frame of egocentric video by fine-tuning a pre-trained YOLOv9. Then, we combine these potential active objects as queries with a transformer encoder, thereby identifying the most promising next active object and predicting its future interaction and time-to-contact. Experimental results demonstrate that our method outperforms state-of-the-art models on the challenge test set, achieving the best performance in predicting next active objects and their interactions. Finally, our proposed method ranked third in overall top-5 mAP when including time-to-contact predictions. The source code is available at https://github.com/KeenyJin/SOIA-DOD.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 4 pages

arXiv:2407.05117 [pdf, ps, other]

Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (349 additional authors not shown)

Abstract: We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper limits at 90% credibility level on the branching fractions of $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛπ^-$ are determined to be $4.7 \times 10^{-8}$ and $4.3 \times 10^{-8}$, respectively.

Submitted 6 July, 2024; originally announced July 2024.

Comments: 8 pages, 4 figures

Report number: Belle II Preprint 2024-020; KEK Preprint 2024-17

arXiv:2407.03890 [pdf, other]

Addressing Relative Pose Impact on UWB Localization: Dataset Introduction and Analysis

Authors: Jun Hyeok Choe, Inwook Shim

Abstract: UWB has recently gained new attention as an auxiliary sensor in the field of robot localization due to its compactness and ease of distance measurement. Consequently, UWB-related localization and dataset research has increased. Despite this broad interest, there is a lack of UWB datasets that thoroughly analyze the performance of UWB ranging measurements. To address this issue, our paper introduces a UWB dataset that examines UWB relative pose factors affecting ranging measurements. To the best of our knowledge, our dataset is the first to analyze these factors while rigorously providing precise ground-truth UWB poses. The dataset is accessible at https://github.com/cjhhalla/RCV_uwb_dataset.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 4 pages

arXiv:2407.03783 [pdf, other]

Evidence of $h_{b}(\text{2P}) \to Υ(\text{1S})η$ decay and search for $h_{b}(\text{1P,2P}) \to Υ(\text{1S})π^0$ with the Belle detector

Authors: Belle Collaboration, E. Kovalenko, I. Adachi, H. Aihara, D. M. Asner, T. Aushev, R. Ayad, V. Babu, Sw. Banerjee, K. Belous, J. Bennett, M. Bessner, T. Bilka, D. Biswas, A. Bobrov, D. Bodrov, A. Bondar, A. Bozek, M. Bračko, P. Branchini, T. E. Browder, A. Budano, M. Campajola, M. -C. Chang, B. G. Cheon, et al. (142 additional authors not shown)

Abstract: We report the first evidence for the $h_{b}(\text{2P}) \to Υ(\text{1S})η$ transition with a significance of $3.5$ standard deviations. The decay branching fraction is measured to be $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})η]=(7.1^{+3.7}_{-3.2}\pm 0.8)\times10^{-3}$, which is noticeably smaller than expected. We also set upper limits on $π^0$ transitions of $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, and $\mathcal{B}[h_{b}(\text{1P})\to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, at the $90\%$ confidence level. These results are obtained with a $131.4$ fb$^{-1}$ data sample collected near the $Υ(\text{5S})$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider.

Submitted 4 July, 2024; originally announced July 2024.

Comments: to be submitted to PRL

Report number: Belle Preprint 2024-03, KEK Preprint 2024-03

arXiv:2407.02557 [pdf, other]

Weak-Lensing Characterization of the Dark Matter in 29 Merging Clusters that Exhibit Radio Relics

Authors: Kyle Finner, M. James Jee, Hyejeon Cho, Kim Hyeonghan, Wonki Lee, Reinout J. van Weeren, David Wittman, Mijin Yoon

Abstract: We present a multiwavelength analysis of 29 merging galaxy clusters that exhibit radio relics. For each merging system, we perform a weak-lensing analysis on Subaru optical imaging. We generate high-resolution mass maps of the dark matter distributions, which are critical for discerning the merging constituents. Combining the weak-lensing detections with X-ray emission, radio emission, and galaxy redshifts, we discuss the formation of radio relics from the past collision. For each subcluster, we obtain mass estimates by fitting a multi-component NFW model with and without a concentration-mass relation. Comparing the two mass estimate techniques, we find that the concentration-mass relation underestimates (overestimates) the mass relative to fitting both parameters for high- (low-) mass subclusters. We compare the mass estimates of each subcluster to their velocity dispersion measurements and find that they preferentially lie below the expected velocity dispersion scaling relation, especially at the low-mass end ($\sim 10^{14}\ M_\odot$). We show that the majority of the clusters that exhibit radio relics are in major mergers with a mass ratio below 1:4. We investigate the position of the mass peak relative to the galaxy luminosity peak, number density peak, and BCG locations and find that the BCG tends to better trace the mass peak position. Finally, we update a golden sample of 8 galaxy clusters that have the simplest geometries and can provide the cleanest picture of the past merger, which we recommend for further investigation to constrain the nature of dark matter and the acceleration process that leads to radio relics.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 55 pages, 36 figures, submitted to ApJS

arXiv:2407.00988 [pdf, ps, other]

Pointwise estimates of the Bergman kernel with an exponential weight on the unit ball

Authors: Hong Rae Cho, Soohyun Park

Abstract: We consider the weighted Bergman space $A^2_ψ(\Bn)$ of all holomorphic functions on $\Bn$ square integrable with respect to a particular exponential weight measure $e^{-ψ} dV$ on $\Bn$, where \begin{align*} ψ(z):=\frac{1}{1-|z|^2}. \end{align*} We prove the following estimate for the Bergman kernel $K_ψ(z,w)$ of $A^2_ψ(\Bn)$: \begin{align*} |K_ψ(z,w)|^2\le C\frac{e^{ψ(z)+ψ(w)}}{{\rm Vol}(B_ψ(z,1)){\rm Vol}(B_ψ(w, 1))}e^{-\varepsilon d_ψ(z,w)}, \quad z, w\in\Bn, \end{align*} where $d_ψ$ is the Riemannian distance induced by the potential function $ψ$ and $B_ψ(z,1)$ is the $d_ψ$-ball of center $z$ and radius $1$. The result is motivated by Christ \cite{Chr}.

Submitted 1 July, 2024; originally announced July 2024.

Comments: arXiv admin note: text overlap with arXiv:2207.13937

arXiv:2407.00965 [pdf, other]

Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment

Authors: The Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, J. K. Ahn, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (382 additional authors not shown)

Abstract: A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, digamma, and dimuon events is (426.52 $\pm$ 0.03 $\pm$ 2.48) fb$^{-1}$, (427.32 $\pm$ 0.03 $\pm$ 2.56) fb$^{-1}$, and (424.84 $\pm$ 0.04 $\pm$ 3.88) fb$^{-1}$, where the first uncertainties are statistical and the second are systematic. The resulting total integrated luminosity obtained from the combination of the three methods is (426.88 $\pm$ 1.93) fb$^{-1}$.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 12 pages, 3 figures

Report number: Belle II Preprint 2024-019; KEK Preprint 2024-16
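The luminosity abstract above quotes each result with a statistical and a systematic uncertainty. As a rough illustration of how such a pair is commonly folded into one total uncertainty, the sketch below adds the two in quadrature. This assumes the two components are independent; the function name `total_uncertainty` is hypothetical. It does not reproduce the combined value of (426.88 $\pm$ 1.93) fb$^{-1}$, which depends on correlations between the three methods that the abstract does not give.

```python
import math


def total_uncertainty(stat, syst):
    """Add statistical and systematic uncertainties in quadrature.

    Assumes the two components are independent (an assumption made for
    this sketch, not a statement from the abstract).
    """
    return math.hypot(stat, syst)


# (value, statistical, systematic) in fb^-1, as quoted in the abstract.
measurements = {
    "Bhabha": (426.52, 0.03, 2.48),
    "digamma": (427.32, 0.03, 2.56),
    "dimuon": (424.84, 0.04, 3.88),
}

for name, (value, stat, syst) in measurements.items():
    total = total_uncertainty(stat, syst)
    print(f"{name}: {value:.2f} +/- {total:.2f} fb^-1")
```

Because the statistical terms are so much smaller than the systematic ones, each total is essentially equal to its systematic uncertainty.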

arXiv:2407.00879 [pdf, ps, other]

Study of $χ_{bJ}(2P)\toωΥ(1S)$ at Belle

Authors: Belle Collaboration, Z. S. Stottler, T. K. Pedlar, B. G. Fulsom, I. Adachi, K. Adamczyk, H. Aihara, S. Al Said, D. M. Asner, H. Atmacan, T. Aushev, R. Ayad, V. Babu, Sw. Banerjee, M. Bauer, P. Behera, K. Belous, J. Bennett, F. Bernlochner, M. Bessner, T. Bilka, D. Biswas, A. Bobrov, D. Bodrov, G. Bonvicini, et al. (157 additional authors not shown)

Abstract: We report a study of the hadronic transitions $χ_{bJ}(2P)\toωΥ(1S)$, with $ω\toπ^{+}π^{-}π^{0}$, using $28.2\times10^6~Υ(3S)$ mesons recorded by the Belle detector. We present the first evidence for the near-threshold transition $χ_{b0}(2P)\toωΥ(1S)$, the analog of the charm sector decay $χ_{c1}(3872)\toωJ/ψ$, with a branching fraction of $B\big(χ_{b0}(2P)\toωΥ(1S)\big) = \big(0.55\pm0.19\pm0.07\big)\%$. We also obtain branching fractions of $B\big(χ_{b1}(2P)\toωΥ(1S)\big) = \big(2.39{}^{+0.20}_{-0.19}\pm0.24\big)\%$ and $B\big(χ_{b2}(2P)\toωΥ(1S)\big) = \big(0.47{}^{+0.13}_{-0.12}\pm0.06\big)\%$, confirming the measurement of the $ω$ transitions of the $J=1,2~P$-wave states. The ratio for the $J=2$ to $J=1$ transitions is also measured and found to differ by 3.3 standard deviations from the expected value in the QCD multipole expansion.

Submitted 8 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

Comments: 6 pages, 2 figures

Report number: Belle Preprint: 2024-05; KEK Preprint: 2024-10

arXiv:2406.17310 [pdf, other]

High Fidelity Text-to-Speech Via Discrete Tokens Using Token Transducer and Group Masked Language Model

Authors: Joun Yeop Lee, Myeonghun Jeong, Minchan Kim, Ji-Hyun Lee, Hoon-Young Cho, Nam Soo Kim

Abstract: We propose a novel two-stage text-to-speech (TTS) framework with two types of discrete tokens, i.e., semantic and acoustic tokens, for high-fidelity speech synthesis. It features two core components: the Interpreting module, which processes text and a speech prompt into semantic tokens focusing on linguistic content and alignment, and the Speaking module, which captures the timbre of the target voice to generate acoustic tokens from semantic tokens, enriching speech reconstruction. The Interpreting stage employs a transducer for its robustness in aligning text to speech. In contrast, the Speaking stage utilizes a Conformer-based architecture integrated with a Grouped Masked Language Model (G-MLM) to boost computational efficiency. Our experiments verify that this innovative structure surpasses the conventional models in the zero-shot scenario in terms of speech quality and speaker similarity.

Submitted 25 June, 2024; originally announced June 2024.

Comments: Accepted by Interspeech2024

arXiv:2406.16535 [pdf, other]

Token-based Decision Criteria Are Suboptimal in In-context Learning

Authors: Hakaze Cho, Yoshihiro Sakai, Mariko Kato, Kenshiro Tanaka, Akira Ishii, Naoya Inoue

Abstract: In-Context Learning (ICL) typically utilizes classification criteria from probabilities of manually selected label tokens. However, we argue that such token-based classification criteria lead to suboptimal decision boundaries, despite delicate calibrations through translation and constrained rotation. To address this problem, we propose Hidden Calibration, which renounces token probabilities and uses the nearest centroid classifier on the LM's last hidden states. In detail, we use the nearest centroid classification on the hidden states, assigning the category of the nearest centroid previously observed from a few-shot calibration set to the test sample as the predicted label. Our experiments on 3 models and 10 classification datasets indicate that Hidden Calibration consistently outperforms current token-based calibrations by about 20%. Our further analysis demonstrates that Hidden Calibration finds better classification criteria with less inter-category overlap, and LMs provide linearly separable intra-category clusters with the help of demonstrations, which supports Hidden Calibration and gives new insights into the conventional ICL.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 21 pages, 14 figures, 8 tables
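The Hidden Calibration abstract above describes assigning a test sample the category of the nearest centroid computed from a few-shot calibration set. A minimal sketch of nearest-centroid classification follows. The toy 2-D vectors stand in for an LM's last hidden states, and the function names, the labels, and the choice of Euclidean distance are assumptions made for illustration rather than details taken from the paper.

```python
import math


def centroids(vectors, labels):
    """Average the calibration vectors of each label into one centroid."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(vec, cents):
    """Return the label whose centroid is closest to vec (Euclidean distance)."""
    return min(cents, key=lambda label: math.dist(vec, cents[label]))


# Toy calibration set: stand-ins for hidden states (assumed, not real data).
calib_vectors = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
calib_labels = ["negative", "negative", "positive", "positive"]

cents = centroids(calib_vectors, calib_labels)
print(predict([1.0, 1.0], cents))
print(predict([4.0, 3.0], cents))
```

In this sketch the decision rule depends only on distances in the vector space, not on the probabilities of any label tokens, which is the contrast the abstract draws.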

arXiv:2406.16275 [pdf, other]

Investigating the Influence of Prompt-Specific Shortcuts in AI Generated Text Detection

Authors: Choonghyun Park, Hyuhng Joon Kim, Junyeob Kim, Youna Kim, Taeuk Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-goo Lee, Kang Min Yoo

Abstract: AI Generated Text (AIGT) detectors are developed with texts from humans and LLMs of common tasks. Despite the diversity of plausible prompt choices, these datasets are generally constructed with a limited number of prompts. The lack of prompt variation can introduce prompt-specific shortcut features that exist in data collected with the chosen prompt, but do not generalize to others. In this paper, we analyze the impact of such shortcuts in AIGT detection. We propose Feedback-based Adversarial Instruction List Optimization (FAILOpt), an attack that searches for instructions deceptive to AIGT detectors exploiting prompt-specific shortcuts. FAILOpt effectively degrades the detection performance of the target detector, comparable to other attacks based on adversarial in-context examples. We also utilize our method to enhance the robustness of the detector by mitigating the shortcuts. Based on the findings, we further train the classifier with the dataset augmented by FAILOpt prompts. The augmented classifier exhibits improvements across generation models, tasks, and attacks. Our code will be available at https://github.com/zxcvvxcz/FAILOpt.

Submitted 23 June, 2024; originally announced June 2024.

Comments: 19 pages, 3 figures, 13 tables, under review

arXiv:2406.15965 [pdf, other]

Search for charmed baryons in the $Λ_c^+η$ system and measurement of the branching fractions of $Λ_c(2880)^+$ and $Λ_c(2940)^+$ decaying to $Λ_c^+η$ and $pD^0$ relative to $Σ_c(2455)π$

Authors: Belle Collaboration, S. X. Li, C. P. Shen, I. Adachi, J. K. Ahn, H. Aihara, D. M. Asner, H. Atmacan, T. Aushev, R. Ayad, Sw. Banerjee, K. Belous, J. Bennett, M. Bessner, T. Bilka, D. Biswas, D. Bodrov, A. Bozek, M. Bračko, P. Branchini, T. E. Browder, A. Budano, M. Campajola, M. -C. Chang, B. G. Cheon, et al. (102 additional authors not shown)

Abstract: We search for excited charmed baryons in the $Λ_c^+η$ system using a data sample corresponding to an integrated luminosity of 980 $\rm fb^{-1}$. The data were collected by the Belle detector at the KEKB $e^{+}e^{-}$ asymmetric-energy collider. No significant signals are found in the $Λ_c^+η$ mass spectrum, including the known $Λ_c(2880)^+$ and $Λ_c(2940)^+$. Clear $Λ_c(2880)^+$ and $Λ_c(2940)^+$ signals are observed in the $pD^0$ mass spectrum. We set upper limits at 90% credibility level on ratios of branching fractions of $Λ_c(2880)^+$ and $Λ_c(2940)^+$ decaying to $Λ_c^+η$ relative to $Σ_c(2455)π$ of $<0.13$ for the $Λ_c(2880)^+$ and $<1.11$ for the $Λ_c(2940)^+$. We measure ratios of branching fractions of $Λ_c(2880)^+$ and $Λ_c(2940)^+$ decaying to $pD^0$ relative to $Σ_c(2455)π$ of $0.75 \pm 0.03(\text{stat.}) \pm 0.07(\text{syst.})$ for the $Λ_c(2880)^+$ and $3.59 \pm 0.21(\text{stat.}) \pm 0.56(\text{syst.})$ for the $Λ_c(2940)^+$.

Submitted 22 June, 2024; originally announced June 2024.

Comments: 10 pages, 4 figures

Report number: Belle Preprint: 2024-06; KEK Preprint: 2024-15

arXiv:2406.15225 [pdf, other]

Deep UAV Path Planning with Assured Connectivity in Dense Urban Setting

Authors: Jiyong Oh, Syed M. Raza, Lusungu J. Mwasinga, Moonseong Kim, Hyunseung Choo

Abstract: Unmanned Aerial Vehicle (UAV) services with 5G connectivity is an emerging field with numerous applications. Operator-controlled UAV flights and manual static flight configurations are major limitations for the wide adoption and scalability of UAV services. Several services depend on excellent UAV connectivity with a cellular network, and maintaining it is challenging in predetermined flight paths. This paper addresses these limitations by proposing a Deep Reinforcement Learning (DRL) framework for UAV path planning with assured connectivity (DUPAC). During UAV flight, DUPAC determines the best route from a defined source to the destination in terms of distance and signal quality. The viability and performance of DUPAC are evaluated under simulated real-world urban scenarios using the Unity framework. The results confirm that DUPAC achieves an autonomous UAV flight path similar to the base method with only a 2% increment while maintaining an average 9% better connection quality throughout the flight.

Submitted 21 June, 2024; originally announced June 2024.

Comments: 5 pages, 4 figures, Published in the 2024 IEEE Network Operations and Management Symposium (NOMS 2024)

arXiv:2406.07923 [pdf, other]

doi 10.21437/Interspeech.2024

CTC-aligned Audio-Text Embedding for Streaming Open-vocabulary Keyword Spotting

Authors: Sichen Jin, Youngmoon Jung, Seungjin Lee, Jaeyoung Roh, Changwoo Han, Hoonyoung Cho

Abstract: This paper introduces a novel approach for streaming open-vocabulary keyword spotting (KWS) with text-based keyword enrollment. For every input frame, the proposed method finds the optimal alignment ending at the frame using connectionist temporal classification (CTC) and aggregates the frame-level acoustic embedding (AE) to obtain higher-level (i.e., character, word, or phrase) AE that aligns with the text embedding (TE) of the target keyword text. After that, we calculate the similarity of the aggregated AE and the TE. To the best of our knowledge, this is the first attempt to dynamically align the audio and the keyword text on-the-fly to attain the joint audio-text embedding for KWS. Despite operating in a streaming fashion, our approach achieves competitive performance on the LibriPhrase dataset compared to the non-streaming methods with a mere 155K model parameters and a decoding algorithm with time complexity O(U), where U is the length of the target keyword at inference time.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.06277 [pdf, other]

Measurement of the branching fractions of $\bar{B}\to D^{(*)} K^- K^{(*)0}_{(S)}$ and $\bar{B}\to D^{(*)}D_s^{-}$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, et al. (382 additional authors not shown)

Abstract: We present measurements of the branching fractions of eight $\overline B{}^0\to D^{(*)+} K^- K^{(*)0}_{(S)}$, $B^{-}\to D^{(*)0} K^- K^{(*)0}_{(S)}$ decay channels. The results are based on data from SuperKEKB electron-positron collisions at the $Υ(4S)$ resonance collected with the Belle II detector, corresponding to an integrated luminosity of $362~\text{fb}^{-1}$. The event yields are extracted from fits to the distributions of the difference between expected and observed $B$ meson energy, and are efficiency-corrected as a function of $m(K^-K^{(*)0}_{(S)})$ and $m(D^{(*)}K^{(*)0}_{(S)})$ in order to avoid dependence on the decay model. These results include the first observation of $\overline B{}^0\to D^+K^-K_S^0$, $B^-\to D^{*0}K^-K_S^0$, and $\overline B{}^0\to D^{*+}K^-K_S^0$ decays and a significant improvement in the precision of the other channels compared to previous measurements. The helicity-angle distributions and the invariant mass distributions of the $K^- K^{(*)0}_{(S)}$ systems are compatible with quasi-two-body decays via a resonant transition with spin-parity $J^P=1^-$ for the $K^-K_S^0$ systems and $J^P= 1^+$ for the $K^-K^{*0}$ systems. We also present measurements of the branching fractions of four $\overline B{}^0\to D^{(*)+} D_s^-$, $B^{-}\to D^{(*)0} D_s^- $ decay channels with a precision compatible to the current world averages.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Prepared for submission to JHEP. 34 pages, 14 figures

Report number: Belle II Preprint: 2024-014, KEK Preprint: 2024-8

arXiv:2406.06111 [pdf, other]

JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis

Authors: Hyunjae Cho, Junhyeok Lee, Wonbin Jung

Abstract: Non-autoregressive GAN-based neural vocoders are widely used due to their fast inference speed and high perceptual quality. However, they often suffer from audible artifacts such as tonal artifacts in their generated results. Therefore, we propose JenGAN, a new training strategy that involves stacking shifted low-pass filters to ensure the shift-equivariant property. This method helps prevent aliasing and reduce artifacts while preserving the model structure used during inference. In our experimental evaluation, JenGAN consistently enhances the performance of vocoder models, yielding significantly superior scores across the majority of evaluation metrics.

Submitted 10 June, 2024; originally announced June 2024.

Comments: Accepted to Interspeech 2024

arXiv:2406.05761 [pdf, other]

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Authors: Seungone Kim, Juyoung Suk, Ji Yong Cho, Shayne Longpre, Chaeeun Kim, Dongkeun Yoon, Guijin Son, Yejin Cho, Sheikh Shafayat, Jinheon Baek, Sue Hyun Park, Hyeonbin Hwang, Jinkyung Jo, Hyowon Cho, Haebin Shin, Seongyun Lee, Hanseok Oh, Noah Lee, Namgyu Ho, Se June Joo, Miyoung Ko, Yoonjoo Lee, Hyungjoo Chae, Jamin Shin, Joel Jang, et al. (7 additional authors not shown)

Abstract: As language models (LMs) become capable of handling a wide range of tasks, their evaluation is becoming as challenging as their development. Most generation benchmarks currently assess LMs using abstract evaluation criteria like helpfulness and harmlessness, which often lack the flexibility and granularity of human assessment. Additionally, these benchmarks tend to focus disproportionately on specific capabilities such as instruction following, leading to coverage bias. To overcome these limitations, we introduce the BiGGen Bench, a principled generation benchmark designed to thoroughly evaluate nine distinct capabilities of LMs across 77 diverse tasks. A key feature of the BiGGen Bench is its use of instance-specific evaluation criteria, closely mirroring the nuanced discernment of human evaluation. We apply this benchmark to assess 103 frontier LMs using five evaluator LMs. Our code, data, and evaluation results are all publicly available at https://github.com/prometheus-eval/prometheus-eval/tree/main/BiGGen-Bench.

Submitted 9 June, 2024; originally announced June 2024.

Comments: Work in Progress

arXiv:2406.05314 [pdf, other]

Relational Proxy Loss for Audio-Text based Keyword Spotting

Authors: Youngmoon Jung, Seungjin Lee, Joon-Young Yang, Jaeyoung Roh, Chang Woo Han, Hoon-Young Cho

Abstract: In recent years, there has been an increasing focus on user convenience, leading to increased interest in text-based keyword enrollment systems for keyword spotting (KWS). Since the system utilizes text input during the enrollment phase and audio input during actual usage, we call this task audio-text based KWS. To enable this task, both acoustic and text encoders are typically trained using deep metric learning loss functions, such as triplet- and proxy-based losses. This study aims to improve existing methods by leveraging the structural relations within acoustic embeddings and within text embeddings. Unlike previous studies that only compare acoustic and text embeddings on a point-to-point basis, our approach focuses on the relational structures within the embedding space by introducing the concept of Relational Proxy Loss (RPL). By incorporating RPL, we demonstrated improved performance on the Wall Street Journal (WSJ) corpus.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 5 pages, 2 figures, Accepted by Interspeech 2024

arXiv:2406.04642 [pdf, ps, other]

Measurements of the branching fractions of $Ξ_{c}^{0}\toΞ^{0}π^{0}$, $Ξ_{c}^{0}\toΞ^{0}η$, and $Ξ_{c}^{0}\toΞ^{0}η^{\prime}$ and asymmetry parameter of $Ξ_{c}^{0}\toΞ^{0}π^{0}$

Authors: Belle, Belle II Collaborations, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (360 additional authors not shown)

Abstract: We present a study of $Ξ_{c}^{0}\toΞ^{0}π^{0}$, $Ξ_{c}^{0}\toΞ^{0}η$, and $Ξ_{c}^{0}\toΞ^{0}η^{\prime}$ decays using the Belle and Belle II data samples, which have integrated luminosities of 980 $\mathrm{fb}^{-1}$ and 426 $\mathrm{fb}^{-1}$, respectively. We measure the following relative branching fractions $${\cal B}(Ξ_{c}^{0}\toΞ^{0}π^{0})/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.48 \pm 0.02 ({\rm stat}) \pm 0.03 ({\rm syst}) ,$$ $${\cal B}(Ξ_{c}^{0}\toΞ^{0}η)/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.11 \pm 0.01 ({\rm stat}) \pm 0.01 ({\rm syst}) ,$$ $${\cal B}(Ξ_{c}^{0}\toΞ^{0}η^{\prime})/{\cal B}(Ξ_{c}^{0}\toΞ^{-}π^{+}) = 0.08 \pm 0.02 ({\rm stat}) \pm 0.01 ({\rm syst}) $$ for the first time, where the uncertainties are statistical ($\rm stat$) and systematic ($\rm syst$). By multiplying by the branching fraction of the normalization mode, ${\mathcal B}(Ξ_{c}^{0}\toΞ^{-}π^{+})$, we obtain the following absolute branching fraction results $(6.9 \pm 0.3 ({\rm stat}) \pm 0.5 ({\rm syst}) \pm 1.3 ({\rm norm})) \times 10^{-3}$, $(1.6 \pm 0.2 ({\rm stat}) \pm 0.2 ({\rm syst}) \pm 0.3 ({\rm norm})) \times 10^{-3}$, and $(1.2 \pm 0.3 ({\rm stat}) \pm 0.1 ({\rm syst}) \pm 0.2 ({\rm norm})) \times 10^{-3}$, for $Ξ_{c}^{0}$ decays to $Ξ^{0}π^{0}$, $Ξ^{0}η$, and $Ξ^{0}η^{\prime}$ final states, respectively. The third errors are from the uncertainty on ${\mathcal B}(Ξ_{c}^{0}\toΞ^{-}π^{+})$. The asymmetry parameter for $Ξ_{c}^{0}\toΞ^{0}π^{0}$ is measured to be $α(Ξ_{c}^{0}\toΞ^{0}π^{0}) = -0.90\pm0.15({\rm stat})\pm0.23({\rm syst})$.

Submitted 7 June, 2024; originally announced June 2024.

Comments: 23 pages, 5 figures

Report number: Belle II Preprint 2024-015; KEK Preprint 2024-9

arXiv:2406.02811 [pdf, other]

Changes in boiling controlled by molar concentration-dependent diffusion of surfactants

Authors: Mario R. Mata, Matic Može, Armin Hadžić, Giseop Lee, Blake Naccarato, Isaac Berk, Iztok Golobič, H. Jeremy Cho

Abstract: Boiling is a prevalent phase-change process that plays a vital role in facilitating efficient heat transfer from a heating surface. While this heat transfer mechanism is generally effective, a rapid increase in surface temperature can lead to hydrodynamic instabilities, resulting in a boiling crisis. Previous studies have shown that surfactants often improve boiling performance and change the boiling crisis behavior. Conventional wisdom in this field holds that these changes in boiling behavior are tied to the critical micelle concentration (CMC) of the particular surfactant. However, our work reveals that these changes in boiling behavior are independent of the CMC for three nonionic surfactants across a wide range of molar concentrations. In addition, visual snapshots of the bubbling behavior indicate changes in bubble formation, such as bubble size and nucleation site density, influenced by the molar concentration-dependent diffusion timescale of surfactants. Hence, these findings offer compelling evidence that boiling behavior, encompassing both boiling performance and boiling crisis, is governed by the dynamic adsorption of surfactants rather than dictated by the CMC. This becomes evident when quantifying the heat transfer coefficient (HTC) and critical heat flux (CHF) using the logarithm of molar concentration, as predicted by theory. Building upon these findings, we propose insights for controlling when CHF modification occurs in specific scenarios involving any surfactants.
These insights hold significant potential for optimizing heat transfer processes and leveraging surfactants in energy-related applications to maximize boiling efficiency.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 21 pages, 7 figures, 2 appendices

arXiv:2406.02596 [pdf, other]

Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks

Authors: Hojoon Lee, Hyeonseo Cho, Hyunseung Kim, Donghu Kim, Dugki Min, Jaegul Choo, Clare Lyle

Abstract: This study investigates the loss of generalization ability in neural networks, revisiting warm-starting experiments from Ash & Adams. Our empirical analysis reveals that common methods designed to enhance plasticity by maintaining trainability provide limited benefits to generalization. While reinitializing the network can be effective, it also risks losing valuable prior knowledge. To this end, we introduce the Hare & Tortoise, inspired by the brain's complementary learning system. Hare & Tortoise consists of two components: the Hare network, which rapidly adapts to new information analogously to the hippocampus, and the Tortoise network, which gradually integrates knowledge akin to the neocortex. By periodically reinitializing the Hare network to the Tortoise's weights, our method preserves plasticity while retaining general knowledge. Hare & Tortoise can effectively maintain the network's ability to generalize, which improves advanced reinforcement learning algorithms on the Atari-100k benchmark. The code is available at https://github.com/dojeon-ai/hare-tortoise.

Submitted 1 June, 2024; originally announced June 2024.

Comments: accepted to ICML 2024

arXiv:2406.01468 [pdf, other]

Understanding Token Probability Encoding in Output Embeddings

Authors: Hakaze Cho, Yoshihiro Sakai, Kenshiro Tanaka, Mariko Kato, Naoya Inoue

Abstract: In this paper, we investigate the output token probability information in the output embedding of language models. We provide an approximate common log-linear encoding of output token probabilities within the output embedding vectors and demonstrate that it is accurate and sparse when the output space is large and output logits are concentrated. Based on such findings, we edit the encoding in output embedding to modify the output probability distribution accurately. Moreover, the sparsity we find in output probability encoding suggests that a large number of dimensions in the output embedding do not contribute to causal language modeling. Therefore, we attempt to delete the output-unrelated dimensions and find more than 30% of the dimensions can be deleted without significant movement in output distribution and degeneration on sequence generation. Additionally, in training dynamics, we use such encoding as a probe and find that the output embeddings capture token frequency information in early steps, even before an obvious convergence starts.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 15 pages, 17 figures, 3 tables

arXiv:2405.20671 [pdf, other]

Position Coupling: Leveraging Task Structure for Improved Length Generalization of Transformers

Authors: Hanseul Cho, Jaeyoung Cha, Pranjal Awasthi, Srinadh Bhojanapalli, Anupam Gupta, Chulhee Yun

Abstract: Even for simple arithmetic tasks like integer addition, it is challenging for Transformers to generalize to longer sequences than those encountered during training. To tackle this problem, we propose position coupling, a simple yet effective method that directly embeds the structure of the tasks into the positional encoding of a (decoder-only) Transformer. Taking a departure from the vanilla absolute position mechanism assigning unique position IDs to each of the tokens, we assign the same position IDs to two or more "relevant" tokens; for integer addition tasks, we regard digits of the same significance as in the same position. On the empirical side, we show that with the proposed position coupling, a small (1-layer) Transformer trained on 1 to 30-digit additions can generalize up to 200-digit additions (6.67x of the trained length). On the theoretical side, we prove that a 1-layer Transformer with coupled positions can solve the addition task involving exponentially many digits, whereas any 1-layer Transformer without positional information cannot entirely solve it. We also demonstrate that position coupling can be applied to other algorithmic tasks such as addition with multiple summands, Nx2 multiplication, copy/reverse, and a two-dimensional task.

Submitted 31 May, 2024; originally announced May 2024.

Comments: 73 pages, 20 figures, 90 tables

arXiv:2405.20168 [pdf, other]

Enhancing Battlefield Awareness: An Aerial RIS-assisted ISAC System with Deep Reinforcement Learning

Authors: Hyunsang Cho, Seonghoon Yoo, Bang Chul Jung, Joonhyuk Kang

Abstract: This paper considers a joint communication and sensing technique for enhancing situational awareness in practical battlefield scenarios. In particular, we propose an aerial reconfigurable intelligent surface (ARIS)-assisted integrated sensing and communication (ISAC) system consisting of a single access point (AP), an ARIS, multiple users, and a sensing target. With deep reinforcement learning (DRL), we jointly optimize the transmit beamforming of the AP, the RIS phase shifts, and the trajectory of the ARIS under signal-to-interference-noise ratio (SINR) constraints. Numerical results demonstrate that the proposed technique outperforms the conventional benchmark schemes by suppressing the self-interference and clutter echo signals or optimizing the RIS phase shifts.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.19734 [pdf, other]

Search for the decay $B^{0}\toγγ$ using Belle and Belle II data

Authors: Belle, Belle II Collaborations, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, S. Al Said, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, et al. (385 additional authors not shown)

Abstract: We report the result of a search for the rare decay $B^{0} \to γγ$ using a combined dataset of $753\times10^{6}$ $B\bar{B}$ pairs collected by the Belle experiment and $387\times10^{6}$ $B\bar{B}$ pairs collected by the Belle II experiment from decays of the $\rm Υ(4S)$ resonance produced in $e^{+}e^{-}$ collisions. A simultaneous fit to the Belle and Belle II data sets yields $11.0^{+6.5}_{-5.5}$ signal events, corresponding to a 2.5$σ$ significance. We determine the branching fraction $\mathcal{B}(B^{0} \to γγ) = (3.7^{+2.2}_{-1.8}(\rm stat)\pm0.5(\rm syst))\times10^{-8}$ and set a 90% credibility level upper limit of $\mathcal{B}(B^{0} \to γγ) < 6.4\times10^{-8}$.

Submitted 30 May, 2024; originally announced May 2024.

Report number: Belle II Preprint: 2024-017, KEK Preprint: 2024-13

arXiv:2405.18928 [pdf, other]

Measurement of the energy dependence of the $e^+e^- \to B\bar{B}$, $B\bar{B}{}^*$, and $B^*\bar{B}{}^*$ cross sections at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Ahmed, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, T. Aushev, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, M. Bauer, A. Baur, et al. (444 additional authors not shown)

Abstract: We report measurements of the $e^+e^- \to B\bar{B}$, $B\bar{B}{}^*$, and $B^*\bar{B}{}^*$ cross sections at four energies, 10653, 10701, 10746 and 10805 MeV, using data collected by the Belle II experiment. We reconstruct one $B$ meson in a large number of hadronic final states and use its momentum to identify the production process. In the first $2-5$ MeV above $B^*\bar{B}{}^*$ threshold, the $e^+e^- \to B^*\bar{B}{}^*$ cross section increases rapidly. This may indicate the presence of a pole close to the threshold.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 30 pages, 15 figures, submitted to JHEP

Report number: Belle II Preprint 2024-016, KEK Preprint 2024-12

arXiv:2405.15737 [pdf]

More Insight from Being More Focused: Analysis of Clustered Market Apps

Authors: Maleknaz Nayebi, Homayoon Farrahi, Ada Lee, Henry Cho, Guenther Ruhe

Abstract: The increasing attraction of mobile apps has inspired researchers to analyze apps from different perspectives. As with any software product, apps have different attributes such as size, content maturity, rating, category, or number of downloads. Current research studies mostly consider sampling across all apps. This often results in comparisons of apps being quite different in nature and category (games compared with weather and calendar apps), also being different in size and complexity. Similar to proprietary software and web-based services, more specific results can be expected from looking at more homogeneous samples as they can be received as a result of applying clustering. In this paper, we target homogeneous samples of apps to increase the degree of insight gained from analytics. As a proof-of-concept, we applied the clustering technique DBSCAN and subsequent correlation analysis between app attributes for a set of 940 open-source mobile apps from F-Droid. We showed that (i) clusters of apps with similar characteristics provided more insight compared to applying the same to the whole data and (ii) defining the similarity of apps based on the similarity of topics as created from the topic modeling technique Latent Dirichlet Allocation does not significantly improve clustering results.

Submitted 24 May, 2024; originally announced May 2024.

Comments: Authors pre-print

arXiv:2405.14625 [pdf, other]

Test of light-lepton universality in $τ$ decays with the Belle II experiment

Authors: Belle II Collaboration, I. Adachi, K. Adamczyk, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, J. Becker, et al. (406 additional authors not shown)

Abstract: We present a measurement of the ratio $R_μ= \mathcal{B}(τ^-\to μ^-\barν_μν_τ) / \mathcal{B}(τ^-\to e^-\barν_eν_τ)$ of branching fractions $\mathcal{B}$ of the $τ$ lepton decaying to muons or electrons using data collected with the Belle II detector at the SuperKEKB $e^+e^-$ collider. The sample has an integrated luminosity of 362 fb$^{-1}$ at a centre-of-mass energy of 10.58 GeV. Using an optimised event selection, a binned maximum likelihood fit is performed using the momentum spectra of the electron and muon candidates. The result, $R_μ= 0.9675 \pm 0.0007 \pm 0.0036$, where the first uncertainty is statistical and the second is systematic, is the most precise to date. It provides a stringent test of the light-lepton universality, translating to a ratio of the couplings of the muon and electron to the $W$ boson in $τ$ decays of $0.9974 \pm 0.0019$, in agreement with the standard model expectation of unity.

Submitted 23 May, 2024; originally announced May 2024.

Report number: Belle II Preprint 2024-002, KEK Preprint 2023-49

arXiv:2405.13887 [pdf, other]

Multi-Zone Modeling of Black Hole Accretion and Feedback in 3D GRMHD: Bridging Vast Spatial and Temporal Scales

Authors: Hyerin Cho, Ben S. Prather, Kung-Yi Su, Ramesh Narayan, Priyamvada Natarajan

Abstract: Simulating accretion and feedback from the horizon scale of supermassive black holes (SMBHs) out to galactic scales is challenging because of the vast range of scales involved. We describe and test a "multi-zone" technique which is designed to tackle this difficult problem in 3D general relativistic magnetohydrodynamic (GRMHD) simulations. We simulate accretion on a non-spinning SMBH ($a_*=0$) using initial conditions from a large scale galaxy simulation, and achieve steady state over 8 decades in radius. The density scales with radius as $ρ\propto r^{-1}$ inside the Bondi radius $R_B$, which is located at $R_B=2\times 10^5 \,r_g$ ($\approx 60\,{\rm pc}$ for M87) where $r_g$ is the gravitational radius of the SMBH; the plasma-$β\sim$ unity, indicating an extended magnetically arrested state; the mass accretion rate $\dot{M}$ is $\approx 1\%$ of the analytical Bondi accretion rate $\dot{M}_B$; and there is continuous energy feedback out to $\approx 100R_B$ (or beyond $>\,{\rm kpc}$) at a rate $\approx 0.02 \dot{M}c^2$. Surprisingly, any ordered rotation in the external medium does not survive as the magnetized gas flows to smaller radii, and the final steady solution is very similar to when the exterior has no rotation. Using the multi-zone method, we simulate GRMHD accretion for a wide range of Bondi radii, $R_{\rm B} \sim 10^2 - 10^7\,r_{\rm g}$, and find that $\dot{M}/\dot{M}_B\approx (R_B/6\, r_g)^{-0.5}$.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 19 pages, 9 figures, submitted to ApJ

arXiv:2405.13493 [pdf, other]

Euclid. III. The NISP Instrument

Authors: Euclid Collaboration, K. Jahnke, W. Gillard, M. Schirmer, A. Ealet, T. Maciaszek, E. Prieto, R. Barbier, C. Bonoli, L. Corcione, S. Dusini, F. Grupp, F. Hormuth, S. Ligori, L. Martin, G. Morgante, C. Padilla, R. Toledo-Moreo, M. Trifoglio, L. Valenziano, R. Bender, F. J. Castander, B. Garilli, P. B. Lilje, H. -W. Rix, et al. (412 additional authors not shown)

Abstract: The Near-Infrared Spectrometer and Photometer (NISP) on board the Euclid satellite provides multiband photometry and R>=450 slitless grism spectroscopy in the 950-2020nm wavelength range. In this reference article we illuminate the background of NISP's functional and calibration requirements, describe the instrument's integral components, and provide all its key properties. We also sketch the processes needed to understand how NISP operates and is calibrated, and its technical potentials and limitations. Links to articles providing more details and technical background are included. NISP's 16 HAWAII-2RG (H2RG) detectors with a plate scale of 0.3" pix^-1 deliver a field-of-view of 0.57deg^2. In photo mode, NISP reaches a limiting magnitude of ~24.5AB mag in three photometric exposures of about 100s exposure time, for point sources and with a signal-to-noise ratio (SNR) of 5. For spectroscopy, NISP's point-source sensitivity is a SNR = 3.5 detection of an emission line with flux ~2x10^-16erg/s/cm^2 integrated over two resolution elements of 13.4A, in 3x560s grism exposures at 1.6 mu (redshifted Ha). Our calibration includes on-ground and in-flight characterisation and monitoring of detector baseline, dark current, non-linearity, and sensitivity, to guarantee a relative photometric accuracy of better than 1.5%, and relative spectrophotometry to better than 0.7%. The wavelength calibration must be better than 5A.
NISP is the state-of-the-art instrument in the NIR for all science beyond small areas available from HST and JWST - and an enormous advance due to its combination of field size and high throughput of telescope and instrument. During Euclid's 6-year survey covering 14000 deg^2 of extragalactic sky, NISP will be the backbone for determining distances of more than a billion galaxies. Its NIR data will become a rich reference imaging and spectroscopy data set for the coming decades.

Submitted 22 May, 2024; originally announced May 2024.

Comments: Paper submitted as part of the A&A special issue 'Euclid on Sky', which contains Euclid key reference papers and first results from the Euclid Early Release Observations

arXiv:2405.13491 [pdf, other]

Euclid. I. Overview of the Euclid mission

Authors: Euclid Collaboration, Y. Mellier, Abdurro'uf, J. A. Acevedo Barroso, A. Achúcarro, J. Adamek, R. Adam, G. E. Addison, N. Aghanim, M. Aguena, V. Ajani, Y. Akrami, A. Al-Bahlawan, A. Alavi, I. S. Albuquerque, G. Alestas, G. Alguero, A. Allaoui, S. W. Allen, V. Allevato, A. V. Alonso-Tetilla, B. Altieri, A. Alvarez-Candal, A. Amara, L. Amendola, et al. (1086 additional authors not shown)

Abstract: The current standard model of cosmology successfully describes a variety of measurements, but the nature of its main ingredients, dark matter and dark energy, remains unknown. Euclid is a medium-class mission in the Cosmic Vision 2015-2025 programme of the European Space Agency (ESA) that will provide high-resolution optical imaging, as well as near-infrared imaging and spectroscopy, over about 14,000 deg^2 of extragalactic sky. In addition to accurate weak lensing and clustering measurements that probe structure formation over half of the age of the Universe, its primary probes for cosmology, these exquisite data will enable a wide range of science. This paper provides a high-level overview of the mission, summarising the survey characteristics, the various data-processing steps, and data products. We also highlight the main science objectives and expected performance.

Submitted 22 May, 2024; originally announced May 2024.

Comments: Paper submitted as part of the A&A special issue 'Euclid on Sky'

arXiv:2405.13455 [pdf, ps, other]

Carleson measures for weighted Bergman--Zygmund spaces

Authors: Hong Rae Cho, Hyungwoon Koo, Young Joo Lee, Atte Pennanen, Jouni Rättyä, Fanglei Wu

Abstract: For $0<p<\infty$, $Ψ:[0,\infty)\to(0,\infty)$ and a finite positive Borel measure $μ$ on the unit disc $\mathbb{D}$, the Lebesgue--Zygmund space $L^p_{μ,Ψ}$ consists of all measurable functions $f$ such that $\lVert f \rVert_{L_{μ, Ψ}^{p}}^p =\int_{\mathbb{D}}|f|^pΨ(|f|)\,dμ< \infty$. For an integrable radial function $ω$ on $\mathbb{D}$, the corresponding weighted Bergman--Zygmund space $A_{ω, Ψ}^{p}$ is the set of all analytic functions in $L_{μ, Ψ}^{p}$ with $dμ=ω\,dA$. The purpose of the paper is to characterize bounded (and compact) embeddings $A_{ω,Ψ}^{p}\subset L_{μ, Φ}^{q}$, when $0<p\le q<\infty$, the functions $Ψ$ and $Φ$ are essentially monotonic, and $Ψ,Φ,ω$ satisfy certain doubling properties. The tools developed on the way to the main results are applied to characterize bounded and compact integral operators acting from $A^p_{ω,Ψ}$ to $A^q_{ν,Φ}$, provided $ν$ admits the same doubling property as $ω$.

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.11790 [pdf, other]

Graviton physics: Quantum field theory of gravitons, graviton noise and gravitational decoherence -- a concise tutorial

Authors: Jen-Tsung Hsiang, Hing-Tong Cho, Bei-Lok Hu

Abstract: The detection of gravitational waves in 2015 ushered in a new era of gravitational wave astronomy capable of probing into the strong field dynamics of black holes and neutron stars. It has opened up an exciting new window for laboratory and space tests of Einstein's theory of classical general relativity. In recent years there have been two interesting proposals aimed at revealing the quantum nature of perturbative gravity: 1) theoretical predictions in how graviton noise from the early universe after the vacuum of the gravitational field was strongly squeezed by inflationary expansion; 2) experimental proposals using the quantum entanglement between two masses each in a superposition state. The first proposal invokes the stochastic properties of quantum fields, the second invokes a key concept of quantum information. An equally basic and interesting idea is to ask whether and how gravity might be responsible for a quantum system becoming classical in appearance, known as gravitational decoherence. Decoherence due to gravity is of special interest because gravity is universal. This is an important issue in macroscopic quantum phenomena. To fully appreciate these exciting developments requires a working knowledge of classical GR, QF theory and QI plus some familiarity with stochastic processes, namely, noise in quantum fields. Traditionally a new researcher may be conversant in one or two of these four subjects: GR, QFT, QI, SP, depending on his/her background.
This tutorial attempts to provide the necessary connections between them, helping an engaged reader from any one of these four subjects to leapfrog to the frontier of these interdisciplinary research topics. Here we shall treat the three topics listed in the title, save gravitational entanglement, because its nature and implications proclaimed in relation to quantum gravity still contain many controversial elements.

Submitted 20 May, 2024; originally announced May 2024.

Comments: 54 pages, 2 figures

arXiv:2405.09441 [pdf, other]

doi 10.3847/1538-4357/ad4966

Constraining the Low-Mass End of the Black Hole Mass Function and the Active Fraction of the Intermediate-mass Black Holes

Authors: Hojin Cho, Jong-Hak Woo

Abstract: We investigate the black hole mass function (BHMF) and the Eddington ratio distribution function (ERDF), focusing on the intermediate-mass black holes (IMBHs) with masses down to $M_{\bullet}\sim10^4 M_\odot$. Based on the active galactic nuclei (AGNs) with a detected broad H$α$ emission line, we construct a sample of 14,242 AGNs at redshift $z<0.35$, including 243 IMBHs with $M_{\bullet}<10^6 M_\odot$. By jointly modeling the BHMF and ERDF via the maximum posterior estimation, we find that the BHMF peaks at $\sim10^{6} M_\odot$ and exhibits a relatively constant value of $10^{-4}\,\mathrm{Mpc^{-3}\,dex^{-1}}$ at the low-mass end. By comparing the derived BHMF of type 1 AGNs with the galaxy mass function based on the updated black hole mass--host galaxy stellar mass relation, we derive the active fraction. We also determine the active fraction for all AGNs using the upper and lower limit of the type 1 fraction. The active fraction decreases from 15%--40% for massive galaxies ($M_\star>10^{10} M_\odot$) to lower than $\sim$2% for dwarf galaxies with $M_\star\sim10^8 M_\odot$. These results suggest that the black hole occupation fraction is expected to be $\sim$50% for low-mass galaxies ($M_\star\sim10^{8.5}$--$10^9 M_\odot$) if the duty cycle is similar for IMBHs and supermassive black holes.

Submitted 3 July, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

Comments: 18 pages, 10 figures

Journal ref: The Astrophysical Journal 969:93 (2024)

arXiv:2405.07414 [pdf, other]

Binning as a Pretext Task: Improving Self-Supervised Learning in Tabular Domains

Authors: Kyungeun Lee, Ye Seul Sim, Hye-Seung Cho, Moonjung Eo, Suhee Yoon, Sanghyu Yoon, Woohyung Lim

Abstract: The ability of deep networks to learn superior representations hinges on leveraging the proper inductive biases, considering the inherent properties of datasets. In tabular domains, it is critical to effectively handle heterogeneous features (both categorical and numerical) in a unified manner and to grasp irregular functions like piecewise constant functions. To address the challenges in the self-supervised learning framework, we propose a novel pretext task based on the classical binning method. The idea is straightforward: reconstructing the bin indices (either orders or classes) rather than the original values. This pretext task provides the encoder with an inductive bias to capture the irregular dependencies, mapping from continuous inputs to discretized bins, and mitigates the feature heterogeneity by setting all features to have category-type targets. Our empirical investigations ascertain several advantages of binning: capturing the irregular function, compatibility with encoder architecture and additional modifications, standardizing all features into equal sets, grouping similar values within a feature, and providing ordering information. Comprehensive evaluations across diverse tabular datasets corroborate that our method consistently improves tabular representation learning performance for a wide range of downstream tasks. The code is available at https://github.com/kyungeun-lee/tabularbinning.

Submitted 13 May, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

Comments: ICML 2024, 18 pages (including supplementary materials)

arXiv:2405.07386 [pdf, other]

Search for lepton-flavor-violating $τ^- \to μ^-μ^+μ^-$ decays at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Althubiti, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, J. Becker, et al. (407 additional authors not shown)

Abstract: We present the result of a search for the charged-lepton-flavor violating decay $τ^- \to μ^-μ^+μ^-$ using a 424 fb$^{-1}$ sample of data recorded by the Belle II experiment at the SuperKEKB $e^{-}e^{+}$ collider. The selection of $e^{-}e^{+}\toτ^+τ^-$ events is based on an inclusive reconstruction of the non-signal tau decay, and on a boosted decision tree to suppress background. We observe one signal candidate, which is compatible with the expectation from background processes. We set a $90\%$ confidence level upper limit of $1.9 \times 10^{-8}$ on the branching fraction of the $τ^- \to μ^-μ^+μ^-$ decay, which is the most stringent bound to date.

Submitted 12 May, 2024; originally announced May 2024.

Report number: Belle II Preprint 2024-012 KEK Preprint 2024-6

arXiv:2405.05459 [pdf, other]

Estimation and Inference for Change Points in Functional Regression Time Series

Authors: Shivam Kumar, Haotian Xu, Haeran Cho, Daren Wang

Abstract: In this paper, we study the estimation and inference of change points under a functional linear regression model with changes in the slope function. We present a novel Functional Regression Binary Segmentation (FRBS) algorithm which is computationally efficient as well as achieving consistency in multiple change point detection. This algorithm utilizes the predictive power of piece-wise constant functional linear regression models in the reproducing kernel Hilbert space framework. We further propose a refinement step that improves the localization rate of the initial estimator output by FRBS, and derive asymptotic distributions of the refined estimators for two different regimes determined by the magnitude of a change. To facilitate the construction of confidence intervals for underlying change points based on the limiting distribution, we propose a consistent block-type long-run variance estimator. Our theoretical justifications for the proposed approach accommodate temporal dependence and heavy-tailedness in both the functional covariates and the measurement errors. Empirical effectiveness of our methodology is demonstrated through extensive simulation studies and an application to the Standard and Poor's 500 index dataset.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.03685 [pdf, other]

Language-Image Models with 3D Understanding

Authors: Jang Hyun Cho, Boris Ivanovic, Yulong Cao, Edward Schmerling, Yue Wang, Xinshuo Weng, Boyi Li, Yurong You, Philipp Krähenbühl, Yan Wang, Marco Pavone

Abstract: Multi-modal large language models (MLLMs) have shown incredible capabilities in a variety of 2D vision and language tasks. We extend MLLMs' perceptual capabilities to ground and reason about images in 3-dimensional space. To that end, we first develop a large-scale pre-training dataset for 2D and 3D called LV3D by combining multiple existing 2D and 3D recognition datasets under a common task formulation: as multi-turn question-answering. Next, we introduce a new MLLM named Cube-LLM and pre-train it on LV3D. We show that pure data scaling makes a strong 3D perception capability without 3D specific architectural design or training objective. Cube-LLM exhibits intriguing properties similar to LLMs: (1) Cube-LLM can apply chain-of-thought prompting to improve 3D understanding from 2D context information. (2) Cube-LLM can follow complex and diverse instructions and adapt to versatile input and output formats. (3) Cube-LLM can be visually prompted such as 2D box or a set of candidate 3D boxes from specialists. Our experiments on outdoor benchmarks demonstrate that Cube-LLM significantly outperforms existing baselines by 21.3 points of AP-BEV on the Talk2Car dataset for 3D grounded reasoning and 17.7 points on the DriveLM dataset for complex reasoning about driving scenarios, respectively. Cube-LLM also shows competitive results in general MLLM benchmarks such as refCOCO for 2D grounding with (87.0) average score, as well as visual question answering benchmarks such as VQAv2, GQA, SQA, POPE, etc. for complex reasoning.
Our project is available at https://janghyuncho.github.io/Cube-LLM.

Submitted 6 May, 2024; originally announced May 2024.

Comments: Project page: https://janghyuncho.github.io/Cube-LLM

arXiv:2405.00986 [pdf, other]

doi 10.1145/3626772.3657928

Multi-intent-aware Session-based Recommendation

Authors: Minjin Choi, Hye-young Kim, Hyunsouk Cho, Jongwuk Lee

Abstract: Session-based recommendation (SBR) aims to predict the following item a user will interact with during an ongoing session. Most existing SBR models focus on designing sophisticated neural-based encoders to learn a session representation, capturing the relationship among session items. However, they tend to focus on the last item, neglecting diverse user intents that may exist within a session. This limitation leads to significant performance drops, especially for longer sessions. To address this issue, we propose a novel SBR model, called Multi-intent-aware Session-based Recommendation Model (MiaSRec). It adopts frequency embedding vectors indicating the item frequency in session to enhance the information about repeated items. MiaSRec represents various user intents by deriving multiple session representations centered on each item and dynamically selecting the important ones. Extensive experimental results show that MiaSRec outperforms existing state-of-the-art SBR models on six datasets, particularly those with longer average session length, achieving up to 6.27% and 24.56% gains for MRR@20 and Recall@20. Our code is available at https://github.com/jin530/MiaSRec.

Submitted 1 May, 2024; originally announced May 2024.

Comments: SIGIR 2024. 5 pages

arXiv:2404.17734 [pdf, other]

Manipulating a Continuous Instrumental Variable in an Observational Study of Premature Babies: Algorithm, Partial Identification Bounds, and Inference under Randomization and Biased Randomization Assumptions

Authors: Zhe Chen, Min Haeng Cho, Bo Zhang

Abstract: Regionalization of intensive care for premature babies refers to a triage system of mothers with high-risk pregnancies to hospitals of varied capabilities based on risks faced by infants. Due to the limited capacity of high-level hospitals, which are equipped with advanced expertise to provide critical care, understanding the effect of delivering premature babies at such hospitals on infant mortality for different subgroups of high-risk mothers could facilitate the design of an efficient perinatal regionalization system. Towards answering this question, Baiocchi et al. (2010) proposed to strengthen an excess-travel-time-based, continuous instrumental variable (IV) in an IV-based, matched-pair design by switching focus to a smaller cohort amenable to being paired with a larger separation in the IV dose. Three elements changed with the strengthened IV: the study cohort, compliance rate and latent complier subgroup. Here, we introduce a non-bipartite, template matching algorithm that embeds data into a target, pair-randomized encouragement trial which maintains fidelity to the original study cohort while strengthening the IV. We then study randomization-based and IV-dependent, biased-randomization-based inference of partial identification bounds for the sample average treatment effect (SATE) in an IV-based matched pair design, which deviates from the usual effect ratio estimand in that the SATE is agnostic to the IV and who is matched to whom, although a strengthened IV design could narrow the partial identification bounds.
Based on our proposed strengthened-IV design, we found that delivering at a high-level NICU reduced preterm babies' mortality rate compared to a low-level NICU for $81,766 \times 2 = 163,532$ mothers and their preterm babies and the effect appeared to be minimal among non-black, low-risk mothers.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.17598 [pdf, other]

Revealing and Utilizing In-group Favoritism for Graph-based Collaborative Filtering

Authors: Hoin Jung, Hyunsoo Cho, Myungje Choi, Joowon Lee, Jung Ho Park, Myungjoo Kang

Abstract: When it comes to a personalized item recommendation system, it is essential to extract users' preferences and purchasing patterns. Assuming that users in the real world form clusters and that there is common favoritism within each cluster, in this work we introduce the Co-Clustering Wrapper (CCW). We compute co-clusters of users and items with co-clustering algorithms and add CF subnetworks for each cluster to extract the in-group favoritism. Combining the features from the networks, we obtain rich and unified information about users. We experimented on real-world datasets considering two aspects: finding the number of groups divided according to in-group preference, and measuring the amount of performance improvement.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 7 pages, 6 figures

arXiv:2404.12817 [pdf, other]

Determination of the CKM angle $φ_{3}$ from a combination of Belle and Belle II results

Authors: Belle, Belle II Collaborations, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, S. Al Said, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, et al. (377 additional authors not shown)

Abstract: We report a determination of the CKM angle $φ_{3}$, also known as $γ$, from a combination of measurements using samples of up to 711~fb$^{-1}$ from the Belle experiment and up to 362~fb$^{-1}$ from the Belle II experiment. We combine results from analyses of $B^+\to DK^+, B^+\to Dπ^+$, and $B^+ \to D^{*}K^+$ decays, where $D$ is an admixture of $D^0$ and $\overline{D}{}^{0}$ mesons, in a likelihood fit to obtain $φ_{3} = (78.6^{+7.2}_{-7.3})^{\circ}$. We also briefly discuss the interpretation of this result.

Submitted 19 April, 2024; originally announced April 2024.

Comments: 31 pages, 4 figures

Report number: Belle II Preprint 2023-015, KEK Preprint 2023-31

arXiv:2404.11125 [pdf, other]

Interval-censored linear quantile regression

Authors: Taehwa Choi, Seohyeon Park, Hunyong Cho, Sangbum Choi

Abstract: Censored quantile regression has emerged as a prominent alternative to classical Cox's proportional hazards model or accelerated failure time model in both theoretical and applied statistics. While quantile regression has been extensively studied for right-censored survival data, methodologies for analyzing interval-censored data remain limited in the survival analysis literature. This paper introduces a novel local weighting approach for estimating linear censored quantile regression, specifically tailored to handle diverse forms of interval-censored survival data. The estimation equation and the corresponding convex objective function for the regression parameter can be constructed as a weighted average of quantile loss contributions at two interval endpoints. The weighting components are nonparametrically estimated using local kernel smoothing or ensemble machine learning techniques. To estimate the nonparametric distribution mass for interval-censored data, a modified EM algorithm for nonparametric maximum likelihood estimation is employed by introducing subject-specific latent Poisson variables. The proposed method's empirical performance is demonstrated through extensive simulation studies and real data analyses of two HIV/AIDS datasets.

Submitted 17 April, 2024; originally announced April 2024.

Comments: under revision

arXiv:2404.10874 [pdf, other]

doi 10.1103/PhysRevD.109.L111103

Measurement of the branching fraction of the decay $B^- \to D^0 ρ(770)^-$ at Belle II

Authors: Belle II Collaboration, I. Adachi, L. Aggarwal, H. Aihara, N. Akopov, A. Aloisio, N. Anh Ky, D. M. Asner, H. Atmacan, V. Aushev, M. Aversano, R. Ayad, V. Babu, H. Bae, S. Bahinipati, P. Bambade, Sw. Banerjee, S. Bansal, M. Barrett, J. Baudot, A. Baur, A. Beaubien, F. Becherer, J. Becker, J. V. Bennett, et al. (367 additional authors not shown)

Abstract: We measure the branching fraction of the decay $B^- \to D^0 ρ(770)^-$ using data collected with the Belle II detector. The data contain 387 million $B\overline{B}$ pairs produced in $e^+e^-$ collisions at the $Υ(4S)$ resonance. We reconstruct $8360\pm 180$ decays from an analysis of the distributions of the $B^-$ energy and the $ρ(770)^-$ helicity angle. We determine the branching fraction to be $(0.939 \pm 0.021\mathrm{(stat)} \pm 0.050\mathrm{(syst)})\%$, in agreement with previous results. Our measurement improves the relative precision of the world average by more than a factor of two.

Submitted 27 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Report number: Belle II Preprint 2024-011, KEK Preprint 2024-4

Journal ref: PRD 109, 111103 (2024)

arXiv:2404.10355 [pdf, other]

AERO: Adaptive Erase Operation for Improving Lifetime and Performance of Modern NAND Flash-Based SSDs

Authors: Sungjun Cho, Beomjun Kim, Hyunuk Cho, Gyeongseob Seo, Onur Mutlu, Myungsuk Kim, Jisung Park

Abstract: This work investigates a new erase scheme in NAND flash memory to improve the lifetime and performance of modern solid-state drives (SSDs). In NAND flash memory, an erase operation applies a high voltage (e.g., > 20 V) to flash cells for a long time (e.g., > 3.5 ms), which degrades cell endurance and potentially delays user I/O requests. While a large body of prior work has proposed various techniques to mitigate the negative impact of erase operations, no work has yet investigated how erase latency should be set to fully exploit the potential of NAND flash memory; most existing techniques use a fixed latency for every erase operation which is set to cover the worst-case operating conditions. To address this, we propose AERO (Adaptive ERase Operation), a new erase scheme that dynamically adjusts erase latency to be just long enough for reliably erasing target cells, depending on the cells' current erase characteristics. AERO accurately predicts such near-optimal erase latency based on the number of fail bits during an erase operation. To maximize its benefits, we further optimize AERO in two aspects. First, at the beginning of an erase operation, AERO attempts to erase the cells for a short time (e.g., 1 ms), which enables AERO to always obtain the number of fail bits necessary to accurately predict the near-optimal erase latency. Second, AERO aggressively yet safely reduces erase latency by leveraging a large reliability margin present in modern SSDs.
We demonstrate the feasibility and reliability of AERO using 160 real 3D NAND flash chips, showing that it enhances SSD lifetime over the conventional erase scheme by 43% without change to existing NAND flash chips. Our system-level evaluation using eleven real-world workloads shows that an AERO-enabled SSD reduces read tail latency by 34% on average over a state-of-the-art technique.

Submitted 16 April, 2024; originally announced April 2024.

Comments: Accepted for publication at Proceedings of the 29th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2024

arXiv:2404.09717 [pdf, other]

Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model

Authors: Hyunsoo Cho

Abstract: Many recent studies endeavor to improve open-source language models through imitation learning, re-training on synthetic instruction data from state-of-the-art proprietary models like ChatGPT and GPT-4. However, synthetic data inherently contains noise, giving rise to a substantial presence of low-quality data replete with erroneous responses and flawed reasoning. Although we intuitively grasp the potential harm of noisy data, we lack a quantitative understanding of its impact. To this end, this paper explores the correlation between the degree of noise and its impact on language models through instruction tuning. We first introduce the Falsity-Controllable (FACO) dataset, which comprises pairs of true answers with corresponding reasoning, as well as false pairs to manually control the falsity ratio of the dataset. Through extensive experiments, we report multiple intriguing findings on the correlation between the factuality of the dataset and instruction tuning. Specifically, we verified that the falsity of the instructions is highly relevant to various benchmark scores. Moreover, when LLMs are trained with false instructions, they learn to lie and generate fake, unfaithful answers, even though they know the correct answer for the user request. Additionally, we noted that once a language model is trained with a dataset contaminated by noise, its performance can be partially restored, but not to the full original level.

Submitted 15 April, 2024; originally announced April 2024.

Comments: Under review @ *ACL

Showing 1–50 of 1,074 results for author: Cho, H