-
Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation
Authors:
Zhilin Zhu,
Xiaopeng Hong,
Zhiheng Ma,
Weijun Zhuang,
Yaohui Ma,
Dai Yong,
Yaowei Wang
Abstract:
Continual Test-Time Adaptation (CTTA) involves adapting a pre-trained source model to continually changing unsupervised target domains. In this paper, we systematically analyze the challenges of this task: online environment, unsupervised nature, and the risks of error accumulation and catastrophic forgetting under continual domain shifts. To address these challenges, we reshape the online data buffering and organizing mechanism for CTTA. We propose an uncertainty-aware buffering approach to identify and aggregate significant samples with high certainty from the unsupervised, single-pass data stream. Based on this, we propose a graph-based class relation preservation constraint to overcome catastrophic forgetting. Furthermore, a pseudo-target replay objective is used to mitigate error accumulation. Extensive experiments demonstrate the superiority of our method in both segmentation and classification CTTA tasks. Code is available at https://github.com/z1358/OBAO.
Submitted 12 July, 2024;
originally announced July 2024.
-
Bonding states underpinning structural transitions in IrTe$_2$ observed with micro-ARPES
Authors:
C. W. Nicholson,
M. D. Watson,
A. Pulkkinen,
M. Rumo,
G. Kremer,
K. Y. Ma,
F. O. von Rohr,
C. Cacho,
C. Monney
Abstract:
Competing interactions in low-dimensional materials can produce nearly degenerate electronic and structural phases. We investigate the staircase of structural phase transitions in layered IrTe$_2$ for which a number of potential transition mechanisms have been postulated. The spatial coexistence of multiple phases on the micron scale has prevented a detailed analysis of the electronic structure. By exploiting micro-ARPES obtained with synchrotron radiation we extract the electronic structure of the multiple structural phases in IrTe$_2$ in order to address the mechanism underlying the phase transitions. We find direct evidence of lowered energy states that appear in the low-temperature phases, states previously predicted by \textit{ab initio} calculations and extended here. Our results validate a proposed scenario of bonding and anti-bonding states as the driver of the phase transitions.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (414 additional authors not shown)
Abstract:
We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We determine these parameters for two ranges of $K^0_S π^0$ invariant mass: $m(K^0_S π^0)\in (0.8, 1.0)$ $GeV/c^2$, which is dominated by $B^0 \to K^{*0} (\to K^0_S π^0) γ$ decays, and a complementary region $m(K^0_S π^0)\in (0.6, 0.8)\cup(1.0, 1.8)$ $GeV/c^2$. Our results have improved precision as compared to previous measurements and are consistent with theory predictions.
Submitted 12 July, 2024;
originally announced July 2024.
-
A Novel Quantum Realization of Jet Clustering in High-Energy Physics Experiments
Authors:
Yongfeng Zhu,
Weifeng Zhuang,
Chen Qian,
Yunheng Ma,
Dong E. Liu,
Manqi Ruan,
Chen Zhou
Abstract:
Exploring the application of quantum technologies to fundamental sciences holds the key to fostering innovation for both sides. In high-energy particle collisions, quarks and gluons are produced and immediately form collimated particle sprays known as jets. Accurate jet clustering is crucial as it retains the information of the originating quark or gluon and forms the basis for studying properties of the Higgs boson, which underlies the mechanism of mass generation for subatomic particles. For the first time, by mapping collision events into graphs--with particles as nodes and their angular separations as edges--we realize jet clustering using the Quantum Approximate Optimization Algorithm (QAOA), a hybrid quantum-classical algorithm for addressing classical combinatorial optimization problems with available quantum resources. Our results, derived from 30 qubits on a quantum computer simulator and 6 qubits on quantum computer hardware, demonstrate that jet clustering performance with QAOA is comparable with or even better than classical algorithms for a small-sized problem. This study highlights the feasibility of quantum computing to revolutionize jet clustering, bringing the practical application of quantum computing in high-energy physics experiments one step closer.
Submitted 12 July, 2024;
originally announced July 2024.
-
Coupling multi-space topologies in 2D ferromagnetic lattice
Authors:
Zhonglin He,
Wenhui Du,
Kaiying Dou,
Ying Dai,
Baibiao Huang,
Yandong Ma
Abstract:
Topology can manifest topological magnetism (e.g., skyrmion and bimeron) in real space and quantum anomalous Hall (QAH) state in momentum space, which have changed the modern conceptions of matter phase. While the topologies in different spaces are widely studied separately, their coexistence and coupling in a single phase is seldom explored. Here, we report a novel phenomenon that arises from the interaction of topological magnetism and band topology, the multi-space topology, in 2D ferromagnetic lattice. Based on continuum theory and tight-binding model, we reveal that the interconnection between skyrmion/bimeron and QAH state generates distinctive localized chiral bound states (CBSs). With moderating topological magnetism through magnetic field, the multi-space topologies accompanied with different CBSs can be reversed, facilitating the coupling of multi-space topologies. By performing first-principles and atomic spin model simulations, we further demonstrate such multi-space topologies and their coupling in monolayer Cr2NSb. These results represent an important step towards the development of multi-space topological phenomena in 2D lattice.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $\boldsymbol{B\rightarrowργ}$ decays using Belle and Belle II data
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (385 additional authors not shown)
Abstract:
We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle II data sets yields $114\pm 12$ $B^{+}\rightarrowρ^{+}γ$ and $99\pm 12$ $B^{0}\rightarrowρ^{0}γ$ decays. The measured branching fractions are $(13.1^{+2.0 +1.3}_{-1.9 -1.2})\times 10^{-7}$ and $(7.5\pm 1.3^{+1.0}_{-0.8})\times 10^{-7}$ for $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays, respectively, where the first uncertainty is statistical and the second is systematic. We also measure the isospin asymmetry $A_{\rm I}(B\rightarrowργ)=(10.9^{+11.2 +7.8}_{-11.7 -7.3})\%$ and the direct CP asymmetry $A_{CP}(B^{+}\rightarrowρ^{+}γ)=(-8.2\pm 15.2^{+1.6}_{-1.2})\%$.
Submitted 12 July, 2024;
originally announced July 2024.
-
One-Shot Pose-Driving Face Animation Platform
Authors:
He Feng,
Donglin Di,
Yongjia Ma,
Wei Chen,
Tonghua Su
Abstract:
The objective of face animation is to generate dynamic and expressive talking head videos from a single reference face, utilizing driving conditions derived from either video or audio inputs. Current approaches often require fine-tuning for specific identities and frequently fail to produce expressive videos due to the limited effectiveness of Wav2Pose modules. To facilitate the generation of one-shot and more consecutive talking head videos, we refine an existing Image2Video model by integrating a Face Locator and Motion Frame mechanism. We subsequently optimize the model using extensive human face video datasets, significantly enhancing its ability to produce high-quality and expressive talking head videos. Additionally, we develop a demo platform using the Gradio framework, which streamlines the process, enabling users to quickly create customized talking head videos.
Submitted 11 July, 2024;
originally announced July 2024.
-
Symmetry Awareness Encoded Deep Learning Framework for Brain Imaging Analysis
Authors:
Yang Ma,
Dongang Wang,
Peilin Liu,
Lynette Masters,
Michael Barnett,
Weidong Cai,
Chenyu Wang
Abstract:
The heterogeneity of neurological conditions, ranging from structural anomalies to functional impairments, presents a significant challenge in medical imaging analysis tasks. Moreover, the limited availability of well-annotated datasets constrains the development of robust analysis models. Against this backdrop, this study introduces a novel approach leveraging the inherent anatomical symmetrical features of the human brain to enhance the subsequent detection and segmentation analysis for brain diseases. A novel Symmetry-Aware Cross-Attention (SACA) module is proposed to encode symmetrical features of left and right hemispheres, and a proxy task to detect symmetrical features as the Symmetry-Aware Head (SAH) is proposed, which guides the pretraining of the whole network on a vast 3D brain imaging dataset comprising both healthy and diseased brain images across various MRI and CT. Through meticulous experimentation on downstream tasks, including both classification and segmentation for brain diseases, our model demonstrates superior performance over state-of-the-art methodologies, particularly highlighting the significance of symmetry-aware learning. Our findings advocate for the effectiveness of incorporating symmetry awareness into pretraining and set a new benchmark for medical imaging analysis, promising significant strides toward accurate and efficient diagnostic processes. Code is available at https://github.com/bitMyron/sa-swin.
Submitted 11 July, 2024;
originally announced July 2024.
-
A Bistatic ISAC Framework for LEO Satellite Systems: A Rate-Splitting Approach
Authors:
Juha Park,
Jaehyup Seong,
Jaehak Ryu,
Yijie Mao,
Wonjae Shin
Abstract:
Aiming to achieve ubiquitous global connectivity and target detection on the same platform with improved spectral/energy efficiency and reduced onboard hardware cost, low Earth orbit (LEO) satellite systems capable of simultaneously performing communications and radar have attracted significant attention. Designing such a joint system should address not only the challenges of integrating two functions but also the unique propagation characteristics of the satellites. To overcome severe echo signal path loss due to the high altitude of the satellite, we put forth a bistatic integrated sensing and communication (ISAC) framework with a radar receiver separated from the satellite. For robust and effective interference management, we employ rate-splitting multiple access (RSMA), which splits and encodes users' messages into private and common streams. We optimize the dual-functional precoders to maximize the minimum rate among all users while satisfying the Cramér-Rao bound (CRB) constraints. Given the challenge of acquiring instantaneous channel state information (iCSI) for LEO satellites, we exploit the geometrical and statistical characteristics of the satellite channel. To develop an efficient optimization algorithm, semidefinite relaxation (SDR), sequential rank-1 constraint relaxation (SROCR), and successive convex approximation (SCA) are utilized. Numerical results show that the proposed framework efficiently performs both communication and radar, demonstrating superior interference control capabilities. Furthermore, it is validated that the common stream plays three vital roles: i) beamforming towards the radar target, ii) interference management between communications and radar, and iii) interference management among communication users.
Submitted 11 July, 2024;
originally announced July 2024.
-
Constraining Holographic Dark Energy and Analyzing Cosmological Tensions
Authors:
Xin Tang,
Yin-Zhe Ma,
Wei-Ming Dai,
Hong-Jian He
Abstract:
We investigate cosmological constraints on the holographic dark energy (HDE) using the state-of-the-art cosmological datasets: Planck CMB angular power spectra and weak lensing power spectra, Atacama Cosmology Telescope (ACT) temperature power spectra, baryon acoustic oscillation (BAO) and redshift-space distortion (RSD) measurements from six-degree-field galaxy survey and Sloan Digital Sky Survey (DR12 & DR16) and the Cepheids-Supernovae measurement from SH0ES team (R22). We also examine the HDE model and $Λ$CDM with and without $N_{\rm eff}$ (effective number of relativistic species) being treated as a free parameter. We find that the HDE model can relieve the tensions of $H_0$ and $S_8$ to certain degrees. With ``Planck+ACT+BAO+RSD'' datasets, the constraints are $H_0 = 69.70 \pm 1.39\ \mathrm{km\ s^{-1} Mpc^{-1}}$ and $S_8 = 0.823 \pm 0.011$ in HDE model, which brings the Hubble tension down to $1.92σ$ confidence level (C.L.) and the $S_8$ tension to $1$-$2σ$ C.L. By adding the R22 data, their values are improved as $H_0 = 71.86 \pm 0.93 \,\mathrm{km\ s^{-1} Mpc^{-1}}$ and $S_8 = 0.813 \pm 0.010$, which further brings the Hubble tension down to $0.85σ$ C.L. and relieves the $S_{8}$ tension. We also quantify the goodness-of-fit of different models with Akaike information criterion (AIC) and Bayesian information criterion (BIC), and find that the HDE agrees with the observational data better than the $Λ$CDM and other extended models (treating $N_{\rm eff}$ as free for fitting).
Submitted 11 July, 2024;
originally announced July 2024.
-
Label-anticipated Event Disentanglement for Audio-Visual Video Parsing
Authors:
Jinxing Zhou,
Dan Guo,
Yuxin Mao,
Yiran Zhong,
Xiaojun Chang,
Meng Wang
Abstract:
The Audio-Visual Video Parsing (AVVP) task aims to detect and temporally locate events within audio and visual modalities. Multiple events can overlap in the timeline, making identification challenging. While traditional methods usually focus on improving the early audio-visual encoders to embed more effective features, the decoding phase, crucial for final event classification, often receives less attention. We aim to advance the decoding phase and improve its interpretability. Specifically, we introduce a new decoding paradigm, \underline{l}abel s\underline{e}m\underline{a}ntic-based \underline{p}rojection (LEAP), that employs label texts of event categories, each bearing distinct and explicit semantics, for parsing potentially overlapping events. LEAP works by iteratively projecting encoded latent features of audio/visual segments onto semantically independent label embeddings. This process, enriched by modeling cross-modal (audio/visual-label) interactions, gradually disentangles event semantics within video segments to refine relevant label embeddings, guaranteeing a more discriminative and interpretable decoding process. To facilitate the LEAP paradigm, we propose a semantic-aware optimization strategy, which includes a novel audio-visual semantic similarity loss function. This function leverages the Intersection over Union of audio and visual events (EIoU) as a novel metric to calibrate audio-visual similarities at the feature level, accommodating the varied event densities across modalities. Extensive experiments demonstrate the superiority of our method, achieving new state-of-the-art performance for AVVP and also enhancing the relevant audio-visual event localization task.
Submitted 10 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
Submitted 10 July, 2024;
originally announced July 2024.
-
Alternating Subspace Approximate Message Passing
Authors:
Xu Zhu,
Yufei Ma,
Xiaoguang Li,
Tiejun Li
Abstract:
Numerous renowned algorithms for tackling the compressed sensing problem employ an alternating strategy, which typically involves data matching in one module and denoising in another. Based on an in-depth analysis of the connection between the message passing and operator splitting, we present a novel approach, the Alternating Subspace Method (ASM), which intuitively combines the principles of the greedy methods (e.g., the orthogonal matching pursuit type methods) and the splitting methods (e.g., the approximate message passing type methods). Essentially, ASM modifies the splitting method by achieving fidelity in a subspace-restricted fashion. We reveal that such confining strategy still yields a consistent fixed point iteration and establish its local geometric convergence on the lasso problem. Numerical experiments on both the lasso and channel estimation problems demonstrate its high convergence rate and its capacity to incorporate different prior distributions. Further theoretical analysis also demonstrates the advantage of the motivated message-passing splitting by incorporating quasi-variance degree of freedom even for the classical lasso optimization problem. Overall, the proposed method is promising in efficiency, accuracy and flexibility, which has the potential to be competitive in different sparse recovery applications.
Submitted 10 July, 2024;
originally announced July 2024.
-
Optimal bias of utility function between two-layer network for the evolution of prosocial behavior in two-order game and higher-order game
Authors:
Yihe Ma,
Hui Zhang
Abstract:
Cooperation is an important research object in economics, sociology, and biology, and the evolution of cooperation in structured populations is an interesting research topic. We mainly focus on the evolution of cooperation with two-order and higher-order games in a two-layer network. We introduce a bias coefficient of the utility function and study the influence of the bias coefficient on the evolution of cooperation in a two-layer network. We first provide a theoretical analysis of the fixation probabilities of the two-order and higher-order games under weak selection in a two-layer network. Secondly, based on the expression of the fixation probability, we obtain the critical value of the two different games by comparing the fixation probability under the weak selection condition with that under the neutral selection condition. Finally, by comparing the critical values of single-layer and two-layer networks in the two-order and higher-order games, we conclude that, when the nonlinear factor satisfies certain conditions and the optimal bias coefficient tends towards 0, some two-layer networks promote the evolution of cooperative behavior more than some single-layer networks.
Submitted 12 July, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
-
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model
Authors:
Wenqi Zhang,
Zhenglin Cheng,
Yuanyu He,
Mengna Wang,
Yongliang Shen,
Zeqi Tan,
Guiyang Hou,
Mingqian He,
Yanna Ma,
Weiming Lu,
Yueting Zhuang
Abstract:
Although most current large multimodal models (LMMs) can already understand photos of natural scenes and portraits, their understanding of abstract images, e.g., charts, maps, or layouts, and visual reasoning capabilities remains quite rudimentary. They often struggle with simple daily tasks, such as reading time from a clock, understanding a flowchart, or planning a route using a road map. In light of this, we design a multi-modal self-instruct, utilizing large language models and their code capabilities to synthesize massive abstract images and visual reasoning instructions across daily scenarios. Our strategy effortlessly creates a multimodal benchmark with 11,193 instructions for eight visual scenarios: charts, tables, simulated maps, dashboards, flowcharts, relation graphs, floor plans, and visual puzzles. \textbf{This benchmark, constructed with simple lines and geometric elements, exposes the shortcomings of most advanced LMMs} like Claude-3.5-Sonnet and GPT-4o in abstract image understanding, spatial relations reasoning, and visual element induction. Besides, to verify the quality of our synthetic data, we fine-tune an LMM using 62,476 synthetic chart, table and road map instructions. The results demonstrate improved chart understanding and map navigation performance, and also demonstrate potential benefits for other visual reasoning tasks. Our code is available at: \url{https://github.com/zwq2018/Multi-modal-Self-instruct}.
Submitted 10 July, 2024; v1 submitted 9 July, 2024;
originally announced July 2024.
-
RS-BNN: A Deep Learning Framework for the Optimal Beamforming Design of Rate-Splitting Multiple Access
Authors:
Yiwen Wang,
Yijie Mao,
Sijie Ji
Abstract:
Rate splitting multiple access (RSMA) relies on beamforming design for attaining spectral efficiency and energy efficiency gains over traditional multiple access schemes. While conventional optimization approaches such as weighted minimum mean square error (WMMSE) achieve suboptimal solutions for RSMA beamforming optimization, they are computationally demanding. A novel approach based on fractional programming (FP) has unveiled the optimal beamforming structure (OBS) for RSMA. This method, combined with a hyperplane fixed point iteration (HFPI) approach, named FP-HFPI, provides suboptimal beamforming solutions with identical sum rate performance but much lower computational complexity compared to WMMSE. Inspired by such an approach, in this work, a novel deep unfolding framework based on FP-HFPI, named rate-splitting-beamforming neural network (RS-BNN), is proposed to unfold the FP-HFPI algorithm. Numerical results indicate that the proposed RS-BNN attains a level of performance closely matching that of WMMSE and FP-HFPI, while dramatically reducing the computational complexity.
Submitted 8 July, 2024;
originally announced July 2024.
-
Vector meson's spin alignments in high energy reactions
Authors:
Jin-Hui Chen,
Zuo-Tang Liang,
Yu-Gang Ma,
Xin-Li Sheng,
Qun Wang
Abstract:
The global spin alignment of vector mesons has been observed by the STAR collaboration at the Relativistic Heavy Ion Collider (RHIC) at Brookhaven National Laboratory (BNL). It provides a unique opportunity to probe the correlation between the polarized quark and antiquark in the strongly coupled quark-gluon plasma (sQGP) produced in relativistic heavy ion collisions, opening a new window to explore the properties of sQGP. In addition, spin alignments of vector mesons have also been observed in other high-energy particle collisions. The results seem to be strongly dependent on the hadronization mechanism, so comprehensive studies are needed. In this article, we present a brief review of theoretical and experimental advances in the study of vector meson's spin alignments in a variety of high-energy particle collisions, with emphasis on hadronization mechanisms.
Submitted 8 July, 2024;
originally announced July 2024.
-
ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation
Authors:
Ethan Chern,
Jiadi Su,
Yan Ma,
Pengfei Liu
Abstract:
Previous open-source large multimodal models (LMMs) have faced several limitations: (1) they often lack native integration, requiring adapters to align visual representations with pre-trained large language models (LLMs); (2) many are restricted to single-modal generation; (3) while some support multimodal generation, they rely on separate diffusion models for visual modeling and generation. To mitigate these limitations, we present Anole, an open, autoregressive, native large multimodal model for interleaved image-text generation. We build Anole from Meta AI's Chameleon, adopting an innovative fine-tuning strategy that is both data-efficient and parameter-efficient. Anole demonstrates high-quality, coherent multimodal generation capabilities. We have open-sourced our model, training framework, and instruction tuning data.
Submitted 8 July, 2024;
originally announced July 2024.
-
A Color Image Analysis Tool to Help Users Choose a Makeup Foundation Color
Authors:
Yafei Mao,
Christopher Merkle,
Jan P. Allebach
Abstract:
This paper presents an approach to predict the color of skin-with-foundation based on a no-makeup selfie image and a foundation shade image. Our approach first calibrates the image with the help of the color checker target, and then trains a supervised-learning model to predict the skin color. In the calibration stage, we propose to use three different transformation matrices to map the device-dependent RGB response to the reference CIE XYZ space. In so doing, color correction error can be minimized. We then compute the average value of the region of interest in the calibrated images, and feed them to the prediction model. We explored both the linear regression and support vector regression models. Cross-validation results show that both models can accurately make the prediction.
Submitted 7 July, 2024;
originally announced July 2024.
-
Multi-branch Collaborative Learning Network for 3D Visual Grounding
Authors:
Zhipeng Qian,
Yiwei Ma,
Zhekai Lin,
Jiayi Ji,
Xiawu Zheng,
Xiaoshuai Sun,
Rongrong Ji
Abstract:
3D referring expression comprehension (3DREC) and segmentation (3DRES) have overlapping objectives, indicating their potential for collaboration. However, existing collaborative approaches predominantly depend on the results of one task to make predictions for the other, limiting effective collaboration. We argue that employing separate branches for 3DREC and 3DRES tasks enhances the model's capacity to learn specific information for each task, enabling them to acquire complementary knowledge. Thus, we propose the MCLN framework, which includes independent branches for 3DREC and 3DRES tasks. This enables dedicated exploration of each task and effective coordination between the branches. Furthermore, to facilitate mutual reinforcement between these branches, we introduce a Relative Superpoint Aggregation (RSA) module and an Adaptive Soft Alignment (ASA) module. These modules significantly contribute to the precise alignment of prediction results from the two branches, directing the module to allocate increased attention to key positions. Comprehensive experimental evaluation demonstrates that our proposed method achieves state-of-the-art performance on both the 3DREC and 3DRES tasks, with an increase of 2.05% in Acc@0.5 for 3DREC and 3.96% in mIoU for 3DRES.
Submitted 10 July, 2024; v1 submitted 7 July, 2024;
originally announced July 2024.
-
Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model
Authors:
Danni Yang,
Ruohan Dong,
Jiayi Ji,
Yiwei Ma,
Haowei Wang,
Xiaoshuai Sun,
Rongrong Ji
Abstract:
Recently, diffusion models have increasingly demonstrated their capabilities in vision understanding. By leveraging prompt-based learning to construct sentences, these models have shown proficiency in classification and visual grounding tasks. However, existing approaches primarily showcase their ability to perform sentence-level localization, leaving the potential for leveraging contextual information for phrase-level understanding largely unexplored. In this paper, we utilize Panoptic Narrative Grounding (PNG) as a proxy task to investigate this capability further. PNG aims to segment object instances mentioned by multiple noun phrases within a given narrative text. Specifically, we introduce the DiffPNG framework, a straightforward yet effective approach that fully capitalizes on the diffusion's architecture for segmentation by decomposing the process into a sequence of localization, segmentation, and refinement steps. The framework initially identifies anchor points using cross-attention mechanisms and subsequently performs segmentation with self-attention to achieve zero-shot PNG. Moreover, we introduce a refinement module based on SAM to enhance the quality of the segmentation masks. Our extensive experiments on the PNG dataset demonstrate that DiffPNG achieves strong performance in the zero-shot PNG task setting, conclusively proving the diffusion model's capability for context-aware, phrase-level understanding. Source code is available at \url{https://github.com/nini0919/DiffPNG}.
Submitted 7 July, 2024;
originally announced July 2024.
-
Experimental investigation of direct non-Hermitian measurement and uncertainty relation towards high-dimensional quantum domain
Authors:
Yi-Tao Wang,
Zhao-An Wang,
Zhi-Peng Li,
Xiao-Dong Zeng,
Jia-Ming Ren,
Wei Liu,
Yuan-Ze Yang,
Nai-Jie Guo,
Lin-Ke Xie,
Jun-You Liu,
Yu-Hang Ma,
Jian-Shun Tang,
Chengjie Zhang,
Chuan-Feng Li,
Guang-Can Guo
Abstract:
Non-Hermitian dynamics in quantum systems have unveiled novel phenomena, yet the implementation of valid non-Hermitian quantum measurement remains a challenge, because a universal quantum projective mechanism on the complete but skewed non-Hermitian eigenstates is not explicit in experiment. This limitation hinders the direct acquisition of non-Hermitian observable statistics (e.g., non-Hermitian population dynamics) and also constrains investigations of non-Hermitian quantum measurement properties such as the uncertainty relation. Here, we address these challenges by presenting a non-Hermitian projective protocol and investigating the non-Hermitian uncertainty relation. We derive the uncertainty relation for pseudo-Hermitian (PH) observables that is generalized beyond the Hermitian ones. We then investigate the projective properties of general quantum states onto complete non-Hermitian eigenvectors, and present a quantum simulating method to apply the valid non-Hermitian projective measurement on a direct-sum dilated space. Subsequently, we experimentally construct a quantum simulator in the quantum optical circuit and realize the 3-dimensional non-Hermitian quantum measurement on the single-photon qutrit. Employing this platform, we explore the uncertainty relation experimentally with different PH metrics. Our non-Hermitian quantum measurement method is state-independent and directly outputs the non-Hermitian quantum projective statistics, paving the way for studies of extensive non-Hermitian observables in the quantum domain.
Submitted 7 July, 2024;
originally announced July 2024.
-
Wi-Fi Beyond Communications: Experimental Evaluation of Respiration Monitoring and Motion Detection Using COTS Devices
Authors:
Jiuyu Liu,
Yi Ma,
Rahim Tafazolli
Abstract:
Wi-Fi sensing has become an attractive option for non-invasive monitoring of human activities and vital signs. This paper explores the feasibility of using state-of-the-art commercial off-the-shelf (COTS) devices for Wi-Fi sensing applications, particularly respiration monitoring and motion detection. We utilize the Intel AX210 network interface card (NIC) to transmit Wi-Fi signals in both 2.4 GHz and 6 GHz frequency bands. Our experiments rely on channel frequency response (CFR) and received signal strength indicator (RSSI) data, which are processed using a moving average algorithm to extract human behavior patterns. The experimental results demonstrate the effectiveness of our approach in capturing and representing human respiration and motion patterns. Furthermore, we compare the performance of Wi-Fi sensing across different frequency bands, highlighting the advantages of using higher frequencies for improved sensitivity and clarity. Our findings showcase the practicality of using COTS devices for Wi-Fi sensing and lay the groundwork for the development of non-invasive, contactless sensing systems. These systems have potential applications in various fields, including healthcare, smart homes, and Metaverse.
Submitted 6 July, 2024;
originally announced July 2024.
-
Search for the baryon number and lepton number violating decays $τ^-\to Λπ^-$ and $τ^-\to \barΛπ^-$ at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (349 additional authors not shown)
Abstract:
We present a search for the baryon number $B$ and lepton number $L$ violating decays $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛ π^-$ produced from the $e^+e^-\to τ^+τ^-$ process, using a 364 fb$^{-1}$ data sample collected by the Belle~II experiment at the SuperKEKB collider. No evidence of signal is found in either decay mode, which have $|Δ(B-L)|$ equal to $2$ and $0$, respectively. Upper limits at 90\% credibility level on the branching fractions of $τ^- \rightarrow Λπ^-$ and $τ^- \rightarrow \barΛπ^-$ are determined to be $4.7 \times 10^{-8}$ and $4.3 \times 10^{-8}$, respectively.
Submitted 6 July, 2024;
originally announced July 2024.
-
Treatment effect estimation under covariate-adaptive randomization with heavy-tailed outcomes
Authors:
Hongzi Li,
Wei Ma,
Yingying Ma,
Hanzhong Liu
Abstract:
Randomized experiments are the gold standard for investigating causal relationships, with comparisons of potential outcomes under different treatment groups used to estimate treatment effects. However, outcomes with heavy-tailed distributions pose significant challenges to traditional statistical approaches. While recent studies have explored these issues under simple randomization, their application in more complex randomization designs, such as stratified randomization or covariate-adaptive randomization, has not been adequately addressed. To fill the gap, this paper examines the properties of the estimated influence function-based M-estimator under covariate-adaptive randomization with heavy-tailed outcomes, demonstrating its consistency and asymptotic normality. Yet, the existing variance estimator tends to overestimate the asymptotic variance, especially under more balanced designs, and lacks universal applicability across randomization methods. To remedy this, we introduce a novel stratified transformed difference-in-means estimator to enhance efficiency and propose a universally applicable variance estimator to facilitate valid inferences. Additionally, we establish the consistency of kernel-based density estimation in the context of covariate-adaptive randomization. Numerical results demonstrate the effectiveness of the proposed methods in finite samples.
Submitted 6 July, 2024;
originally announced July 2024.
-
OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding
Authors:
Tiancheng Zhao,
Qianqian Zhang,
Kyusong Lee,
Peng Liu,
Lu Zhang,
Chunxin Fang,
Jiajia Liao,
Kelei Jiang,
Yibo Ma,
Ruochen Xu
Abstract:
We introduce OmChat, a model designed to excel in handling long contexts and video understanding tasks. OmChat's new architecture standardizes how different visual inputs are processed, making it more efficient and adaptable. It uses a dynamic vision encoding process to effectively handle images of various resolutions, capturing fine details across a range of image qualities. OmChat utilizes an active progressive multimodal pretraining strategy, which gradually increases the model's capacity for long contexts and enhances its overall abilities. By selecting high-quality data during training, OmChat learns from the most relevant and informative data points. With support for a context length of up to 512K, OmChat demonstrates promising performance in tasks involving multiple images and videos, outperforming most open-source models in these benchmarks. Additionally, OmChat proposes a prompting strategy for unifying complex multimodal inputs including single image text, multi-image text and videos, and achieving competitive performance on single-image benchmarks. To further evaluate the model's capabilities, we proposed a benchmark dataset named Temporal Visual Needle in a Haystack. This dataset assesses OmChat's ability to comprehend temporal visual details within long videos. Our analysis highlights several key factors contributing to OmChat's success: support for any-aspect high image resolution, the active progressive pretraining strategy, and high-quality supervised fine-tuning datasets. This report provides a detailed overview of OmChat's capabilities and the strategies that enhance its performance in visual understanding.
Submitted 5 July, 2024;
originally announced July 2024.
-
Enabling On-Device LLMs Personalization with Smartphone Sensing
Authors:
Shiquan Zhang,
Ying Ma,
Le Fang,
Hong Jia,
Simon D'Alfonso,
Vassilis Kostakos
Abstract:
This demo presents a novel end-to-end framework that combines on-device large language models (LLMs) with smartphone sensing technologies to achieve context-aware and personalized services. The framework addresses critical limitations of current personalization solutions via cloud-based LLMs, such as privacy concerns, latency and cost, and limited personal sensor data. To achieve this, we innovatively proposed deploying LLMs on smartphones with multimodal sensor data and customized prompt engineering, ensuring privacy and enhancing personalization performance through context-aware sensing. A case study involving a university student demonstrated the proposed framework's capability to provide tailored recommendations. In addition, we show that the proposed framework achieves the best trade-off in privacy, performance, latency, cost, battery and energy consumption between on-device and cloud LLMs. Future work aims to integrate more diverse sensor data and conduct large-scale user studies to further refine the personalization. We envision the proposed framework could significantly improve user experiences in various domains such as healthcare, productivity, and entertainment by providing secure, context-aware, and efficient interactions directly on users' devices.
Submitted 5 July, 2024;
originally announced July 2024.
-
Evidence of $h_{b}(\text{2P}) \to Υ(\text{1S})η$ decay and search for $h_{b}(\text{1P,2P}) \to Υ(\text{1S})π^0$ with the Belle detector
Authors:
Belle Collaboration,
E. Kovalenko,
I. Adachi,
H. Aihara,
D. M. Asner,
T. Aushev,
R. Ayad,
V. Babu,
Sw. Banerjee,
K. Belous,
J. Bennett,
M. Bessner,
T. Bilka,
D. Biswas,
A. Bobrov,
D. Bodrov,
A. Bondar,
A. Bozek,
M. Bračko,
P. Branchini,
T. E. Browder,
A. Budano,
M. Campajola,
M. -C. Chang,
B. G. Cheon
, et al. (142 additional authors not shown)
Abstract:
We report the first evidence for the $h_{b}(\text{2P}) \to Υ(\text{1S})η$ transition with a significance of $3.5$ standard deviations. The decay branching fraction is measured to be $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})η]=(7.1 ~^{+3.7} _{-3.2}\pm 0.8)\times10^{-3}$, which is noticeably smaller than expected. We also set upper limits on $π^0$ transitions of $\mathcal{B}[h_{b}(\text{2P}) \to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, and $\mathcal{B}[h_{b}(\text{1P})\to Υ(\text{1S})π^0] < 1.8\times10^{-3}$, at the $90\%$ confidence level. These results are obtained with a $131.4$~fb$^{-1}$ data sample collected near the $Υ(\text{5S})$ resonance with the Belle detector at the KEKB asymmetric-energy $e^+e^-$ collider.
Submitted 4 July, 2024;
originally announced July 2024.
-
ALTER: Augmentation for Large-Table-Based Reasoning
Authors:
Han Zhang,
Yuheng Ma,
Hanfang Yang
Abstract:
While extensive research has explored the use of large language models (LLMs) for table-based reasoning, most approaches struggle with scalability when applied to large tables. To maintain the superior comprehension abilities of LLMs in these scenarios, we introduce ALTER (Augmentation for Large-Table-Based Reasoning), a framework designed to harness the latent augmentation potential in both free-form natural language (NL) questions, via the query augmentor, and semi-structured tabular data, through the table augmentor. By utilizing only a small subset of relevant data from the table and supplementing it with pre-augmented schema, semantic, and literal information, ALTER achieves outstanding performance on table-based reasoning benchmarks. We also provide a detailed analysis of large-table scenarios, comparing different methods and various partitioning principles. In these scenarios, our method outperforms all other approaches and exhibits robustness and efficiency against perturbations.
Submitted 3 July, 2024;
originally announced July 2024.
-
Semantic-Aware Power Allocation for Generative Semantic Communications with Foundation Models
Authors:
Chunmei Xu,
Mahdi Boloursaz Mashhadi,
Yi Ma,
Rahim Tafazolli
Abstract:
Recent advancements in diffusion models have made a significant breakthrough in generative modeling. The combination of the generative model and semantic communication (SemCom) enables high-fidelity semantic information exchange at ultra-low rates. A novel generative SemCom framework for image tasks is proposed, wherein pre-trained foundation models serve as semantic encoders and decoders for semantic feature extractions and image regenerations, respectively. The mathematical relationship between the transmission reliability and the perceptual quality of the regenerated image and the semantic values of semantic features are modeled, which are obtained by conducting numerical simulations on the Kodak dataset. We also investigate the semantic-aware power allocation problem, with the objective of minimizing the total power consumption while guaranteeing semantic performance. To solve this problem, two semantic-aware power allocation methods are proposed by constraint decoupling and bisection search, respectively. Numerical results show that the proposed semantic-aware methods demonstrate superior performance compared to the conventional one in terms of total power consumption.
Submitted 3 July, 2024;
originally announced July 2024.
-
Properties of the QCD Matter -- An Experimental Review of Selected Results from RHIC BES Program
Authors:
Jinhui Chen,
Xin Dong,
Xionghong He,
Huanzhong Huang,
Feng Liu,
Xiaofeng Luo,
Yu-Gang Ma,
Lijuan Ruan,
Ming Shao,
Shusu Shi,
Xu Sun,
Aihong Tang,
Zebo Tang,
Fuqiang Wang,
Hai Wang,
Yi Wang,
Zhigang Xiao,
Guannan Xie,
Nu Xu,
Qinghua Xu,
Zhangbu Xu,
Chi Yang,
Shuai Yang,
Wangmei Zha,
Yapeng Zhang
, et al. (3 additional authors not shown)
Abstract:
In the paper, we discuss the development of the multi-gap resistive plate chamber Time-of-Flight (TOF) technology and the production of the STAR TOF detector in China at the beginning of the 21st century. Then we review recent experimental results from the first beam energy scan program (BES-I) at the Relativistic Heavy Ion Collider (RHIC). Topics cover measurements of collectivity, chirality, criticality, global polarization, strangeness, heavy-flavor, di-lepton and light nuclei productions.
Submitted 3 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
Modeling the Nonlinear Power Spectrum in Low-redshift HI Intensity Mapping
Authors:
Zhixing Li,
Laura Wolz,
Hong Guo,
Steven Cunnington,
Yi Mao
Abstract:
We present a simulation-based framework to forecast the HI power spectrum on non-linear scales ($k\gtrsim 1\ {\rm Mpc^{-1}}$), as measured by interferometer arrays like MeerKAT in the low-redshift ($z\leq 1.0$) universe. Building on a galaxy-based HI mock catalog, we meticulously consider various factors, including the emission line profiles of HI discs and some observational settings, and explore their impacts on the HI power spectrum. While it is relatively insensitive to the profile shape of the HI emission line at these scales, we identify a strong correlation with the profile width, that is, the Full Width at Half Maximum (FWHM, also known as $W_{\rm 50}$ in observations) in this work. By modeling the width function of $W_{50}$ as a function of $v_{\rm max}$, we assign each HI source an emission line profile and find that the resulting HI power spectrum is comparatively close to results from particles in the IllustrisTNG hydrodynamical simulation. After implementing $k$-space cuts matching the MeerKAT data, our prediction replicates the trend of the measurements obtained by MeerKAT at $z\approx 0.44$, though with a significantly lower amplitude. Utilizing a Markov Chain Monte Carlo sampling method, we constrain the parameter $A_{W_{\rm 50}}$ in the $W_{\rm 50}$ models and $Ω_{\rm HI}$ with the MeerKAT measurements and find that a strong degeneracy exists between these two parameters.
Submitted 2 July, 2024;
originally announced July 2024.
-
TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Splatting Manipulation
Authors:
Chaofan Luo,
Donglin Di,
Yongjia Ma,
Zhou Xue,
Chen Wei,
Xun Yang,
Yebin Liu
Abstract:
Despite significant strides in the field of 3D scene editing, current methods encounter substantial challenges, particularly in preserving 3D consistency in the multi-view editing process. To tackle this challenge, we propose a progressive 3D editing strategy that ensures multi-view consistency via a Trajectory-Anchored Scheme (TAS) with a dual-branch editing mechanism. Specifically, TAS facilitates a tightly coupled iterative process between 2D view editing and 3D updating, preventing error accumulation arising from the text-to-image process. Additionally, we explore the relationship between optimization-based methods and reconstruction-based methods, offering a unified perspective for selecting superior design choices, supporting the rationale behind the designed TAS. We further present a tuning-free View-Consistent Attention Control (VCAC) module that leverages cross-view semantic and geometric reference from the source branch to yield aligned views from the target branch during the editing of 2D views. To validate the effectiveness of our method, we analyze 2D examples to demonstrate the improved consistency with the VCAC module. Further extensive quantitative and qualitative results in text-guided 3D scene editing indicate that our method achieves superior editing quality compared to state-of-the-art methods. We will make the complete codebase publicly available following the conclusion of the double-blind review process.
Submitted 2 July, 2024;
originally announced July 2024.
-
Brownian thermal birefringent noise due to non-diagonal anisotropic photoelastic effect in multilayer coated mirrors
Authors:
Yu-Pei Zhang,
Shi-Xiang Yang,
Wen-Hai Tan,
Cheng-Gang Shao,
Yiqiu Ma,
Shan-Qing Yang
Abstract:
Thermal noise in mirror coatings limits the accuracy of most of today's optical precision measurement experiments. Unlike the more commonly discussed thermal phase noise, the crystalline coating can generate thermal birefringent noise due to its anisotropic nature. In this study, we propose that the non-diagonal anisotropic photoelastic effect induced by the Brownian motion of mirror coating layers may contribute to this noise. Employing a standard model for the coating surface, we calculate the spectrum of the non-diagonal anisotropic Brownian photoelastic (NABP) noise to be $1.2 \times 10^{-11} p_{63} f^{-1/2}/\rm{Hz}^{1/2}$. Further experiments are warranted to validate the influence of this effect and reduce its uncertainty. Our findings highlight that for high-precision experiments involving optical resonant cavities targeting signals imprinted in optical polarizations, this noise could emerge as a limiting factor for experimental sensitivity.
Submitted 30 June, 2024;
originally announced July 2024.
-
MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations
Authors:
Yubo Ma,
Yuhang Zang,
Liangyu Chen,
Meiqi Chen,
Yizhu Jiao,
Xinze Li,
Xinyuan Lu,
Ziyu Liu,
Yan Ma,
Xiaoyi Dong,
Pan Zhang,
Liangming Pan,
Yu-Gang Jiang,
Jiaqi Wang,
Yixin Cao,
Aixin Sun
Abstract:
Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities on long-context DU remain an open problem. This work presents MMLongBench-Doc, a long-context, multi-modal benchmark comprising 1,062 expert-annotated questions. Distinct from previous datasets, it is constructed upon 130 lengthy PDF-formatted documents with an average of 49.4 pages and 20,971 textual tokens. Towards comprehensive evaluation, answers to these questions rely on pieces of evidence from (1) different sources (text, image, chart, table, and layout structure) and (2) various locations (i.e. page number). Moreover, 33.2% of the questions are cross-page questions requiring evidence across multiple pages. 22.8% of the questions are designed to be unanswerable for detecting potential hallucinations. Experiments on 14 LVLMs demonstrate that long-context DU greatly challenges current models. Notably, the best-performing model, GPT-4o, achieves an F1 score of only 42.7%, while the second-best, GPT-4V, scores 31.4%. Furthermore, 12 LVLMs (all except GPT-4o and GPT-4V) even present worse performance than their LLM counterparts which are fed with lossy-parsed OCR documents. These results validate the necessity of future research toward more capable long-context LVLMs. Project Page: https://mayubo2333.github.io/MMLongBench-Doc
Submitted 10 July, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.
-
Measurement of the integrated luminosity of data samples collected during 2019-2022 by the Belle II experiment
Authors:
The Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
J. K. Ahn,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Althubiti,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien
, et al. (382 additional authors not shown)
Abstract:
A series of data samples was collected with the Belle II detector at the SuperKEKB collider from March 2019 to June 2022. We determine the integrated luminosities of these data samples using three distinct methodologies involving Bhabha ($e^+e^- \to e^+e^-(nγ)$), digamma ($e^+e^- \to γγ(nγ)$), and dimuon ($e^+e^- \to μ^+ μ^- (nγ)$) events. The total integrated luminosity obtained with Bhabha, digamma, and dimuon events is (426.52 $\pm$ 0.03 $\pm$ 2.48)~fb$^{-1}$, (427.32 $\pm$ 0.03 $\pm$ 2.56)~fb$^{-1}$, and (424.84 $\pm$ 0.04 $\pm$ 3.88)~fb$^{-1}$, where the first uncertainties are statistical and the second are systematic. The resulting total integrated luminosity obtained from the combination of the three methods is (426.88 $\pm$ 1.93)~fb$^{-1}$.
Submitted 1 July, 2024;
originally announced July 2024.
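As a rough illustration of how three measurements like those quoted in this abstract can be merged, the sketch below forms a naive inverse-variance weighted average. It assumes (this is an assumption, not the collaboration's procedure) that the three results are uncorrelated and that statistical and systematic uncertainties add in quadrature. Because it ignores correlated systematic uncertainties, it is not expected to reproduce the quoted combined value of (426.88 $\pm$ 1.93) fb$^{-1}$. The function name `naive_combination` is hypothetical.

```python
import math

# (central value, statistical, systematic) in fb^-1, copied from the abstract
measurements = {
    "bhabha": (426.52, 0.03, 2.48),
    "digamma": (427.32, 0.03, 2.56),
    "dimuon": (424.84, 0.04, 3.88),
}


def naive_combination(results):
    """Inverse-variance weighted mean, treating all inputs as uncorrelated.

    The no-correlation treatment is an illustrative assumption only.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for value, stat, syst in results.values():
        # Add statistical and systematic uncertainties in quadrature.
        variance = stat**2 + syst**2
        total_weight += 1.0 / variance
        weighted_sum += value / variance
    mean = weighted_sum / total_weight
    sigma = math.sqrt(1.0 / total_weight)
    return mean, sigma


mean, sigma = naive_combination(measurements)
print(f"naive combination: {mean:.2f} +/- {sigma:.2f} fb^-1")
```

Under these simplifying assumptions the sketch gives roughly 426.5 $\pm$ 1.6 fb$^{-1}$. The difference from the published combination is the kind of shift one would expect when correlations between the methods are taken into account.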
-
Study of $χ_{bJ}(2P)\toωΥ(1S)$ at Belle
Authors:
Belle Collaboration,
Z. S. Stottler,
T. K. Pedlar,
B. G. Fulsom,
I. Adachi,
K. Adamczyk,
H. Aihara,
S. Al Said,
D. M. Asner,
H. Atmacan,
T. Aushev,
R. Ayad,
V. Babu,
Sw. Banerjee,
M. Bauer,
P. Behera,
K. Belous,
J. Bennett,
F. Bernlochner,
M. Bessner,
T. Bilka,
D. Biswas,
A. Bobrov,
D. Bodrov,
G. Bonvicini
, et al. (157 additional authors not shown)
Abstract:
We report a study of the hadronic transitions $χ_{bJ}(2P)\toωΥ(1S)$, with $ω\toπ^{+}π^{-}π^{0}$, using $28.2\times10^6~Υ(3S)$ mesons recorded by the Belle detector. We present the first evidence for the near-threshold transition $χ_{b0}(2P)\toωΥ(1S)$, the analog of the charm sector decay $χ_{c1}(3872)\toωJ/ψ$, with a branching fraction of $B\big(χ_{b0}(2P)\toωΥ(1S)\big) = \big(0.55\pm0.19\pm0.07\big)\%$. We also obtain branching fractions of $B\big(χ_{b1}(2P)\toωΥ(1S)\big) = \big(2.39{}^{+0.20}_{-0.19}\pm0.24\big)\%$ and $B\big(χ_{b2}(2P)\toωΥ(1S)\big) = \big(0.47{}^{+0.13}_{-0.12}\pm0.06\big)\%$, confirming the measurement of the $ω$ transitions of the $J=1,2$ $P$-wave states. The ratio for the $J=2$ to $J=1$ transitions is also measured and found to differ by 3.3 standard deviations from the expected value in the QCD multipole expansion.
Submitted 8 July, 2024; v1 submitted 30 June, 2024;
originally announced July 2024.
-
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation
Authors:
Mushui Liu,
Yuhang Ma,
Xinfeng Zhang,
Yang Zhen,
Zeng Zhao,
Zhipeng Hu,
Bai Liu,
Changjie Fan
Abstract:
Diffusion Models have exhibited substantial success in text-to-image generation. However, they often encounter challenges when dealing with complex and dense prompts that involve multiple objects, attribute binding, and long descriptions. This paper proposes a framework called LLM4GEN, which enhances the semantic understanding ability of text-to-image diffusion models by leveraging the semantic representation of Large Language Models (LLMs). Through a specially designed Cross-Adapter Module (CAM) that combines the original text features of text-to-image models with LLM features, LLM4GEN can be easily incorporated into various diffusion models as a plug-and-play component and enhances text-to-image generation. Additionally, to facilitate semantic understanding of complex and dense prompts, we develop a LAION-refined dataset, consisting of 1 million (M) text-image pairs with improved image descriptions. We also introduce DensePrompts, which contains 7,000 dense prompts to provide a comprehensive evaluation for the text-to-image generation task. With just 10% of the training data required by the recent ELLA, LLM4GEN significantly improves the semantic alignment of SD1.5 and SDXL, demonstrating increases of 7.69% and 9.60% in color on T2I-CompBench, respectively. The extensive experiments on DensePrompts also demonstrate that LLM4GEN surpasses existing state-of-the-art models in terms of sample quality, image-text alignment, and human evaluation. The project website is at: https://xiaobul.github.io/LLM4GEN/
Submitted 30 June, 2024;
originally announced July 2024.
-
Diff-BBO: Diffusion-Based Inverse Modeling for Black-Box Optimization
Authors:
Dongxia Wu,
Nikki Lijing Kuang,
Ruijia Niu,
Yi-An Ma,
Rose Yu
Abstract:
Black-box optimization (BBO) aims to optimize an objective function by iteratively querying a black-box oracle. This process demands sample-efficient optimization due to the high computational cost of function evaluations. While prior studies focus on forward approaches to learn surrogates for the unknown objective function, they struggle with high-dimensional inputs where valid inputs form a small subspace (e.g., valid protein sequences), which is common in real-world tasks. Recently, diffusion models have demonstrated impressive capability in learning the high-dimensional data manifold. They have shown promising performance in black-box optimization tasks but only in offline settings. In this work, we propose diffusion-based inverse modeling for black-box optimization (Diff-BBO), the first inverse approach leveraging diffusion models for online BBO problems. Diff-BBO distinguishes itself from forward approaches through the design of its acquisition function. Instead of proposing candidates in the design space, Diff-BBO employs a novel acquisition function, Uncertainty-aware Exploration (UaE), to propose objective function values, leveraging the uncertainty of a conditional diffusion model to generate samples in the design space. Theoretically, we prove that using UaE leads to optimal optimization outcomes. Empirically, we redesign experiments on the Design-Bench benchmark for online settings and show that Diff-BBO achieves state-of-the-art performance.
Submitted 30 June, 2024;
originally announced July 2024.
-
Multi-Satellite MIMO Systems for Direct User-Satellite Communications: A Survey
Authors:
Zohre Mashayekh Bakhsh,
Yasaman Omid,
Gaojie Chen,
Farbod Kayhan,
Yi Ma,
Rahim Tafazolli
Abstract:
Advancements in satellite technology have made direct-to-device connectivity a viable solution for ensuring global access. This method is designed to provide internet connectivity to remote, rural, or underserved areas where traditional cellular or broadband networks are lacking or insufficient. This paper is a survey providing an in-depth review of multi-satellite Multiple Input Multiple Output (MIMO) systems as a potential solution for addressing the link budget challenge in direct user-satellite communication. Special attention is given to works considering multi-satellite MIMO systems, both with and without satellite collaboration. In this context, collaboration refers to sharing data between satellites to improve the performance of the system. This survey begins by explaining several fundamental aspects of satellite communications (SatComs), which are vital prerequisites before investigating the multi-satellite MIMO systems. These aspects encompass satellite orbits, the structure of satellite systems, SatCom links, including the inter-satellite links (ISL) which facilitate satellite cooperation, satellite frequency bands, satellite antenna design, and satellite channel models, which should be known or estimated for effective data transmission to and from multiple satellites. Furthermore, this survey distinguishes itself by providing more comprehensive insights in comparison to other surveys. It specifically delves into the Orthogonal Time Frequency Space (OTFS) within the channel model section. It goes into detail about ISL noise and channel models, and it extends the ISL section by thoroughly investigating hybrid FSO/RF ISLs. Furthermore, analytical comparisons of simulation results from these works are presented to highlight the advantages of employing multi-satellite MIMO systems.
Submitted 28 June, 2024;
originally announced July 2024.
-
Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
S. Ahmed,
M. Albrecht,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
X. H. Bai,
Y. Bai,
O. Bakina,
R. Baldini Ferroli,
I. Balossino,
Y. Ban,
K. Begzsuren,
N. Berger,
M. Bertani,
D. Bettoni,
F. Bianchi,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (495 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.
Submitted 2 July, 2024; v1 submitted 28 June, 2024;
originally announced July 2024.
-
ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents
Authors:
Haiyang Shen,
Yue Li,
Desong Meng,
Dongqi Cai,
Sheng Qi,
Li Zhang,
Mengwei Xu,
Yun Ma
Abstract:
Recent advancements in integrating large language models (LLMs) with application programming interfaces (APIs) have gained significant interest in both academia and industry. These API-based agents, leveraging the strong autonomy and planning capabilities of LLMs, can efficiently solve problems requiring multi-step actions. However, their ability to handle multi-dimensional difficulty levels, diverse task types, and real-world demands through APIs remains unknown. In this paper, we introduce ShortcutsBench, a large-scale benchmark for the comprehensive evaluation of API-based agents in solving tasks with varying levels of difficulty, diverse task types, and real-world demands. ShortcutsBench includes a wealth of real APIs from Apple Inc.'s operating systems, refined user queries from shortcuts, human-annotated high-quality action sequences from shortcut developers, and accurate parameter filling values about primitive parameter types, enum parameter types, outputs from previous actions, and parameters that need to request necessary information from the system or user. Our extensive evaluation of agents built with 5 leading open-source (size >= 57B) and 4 closed-source LLMs (e.g. Gemini-1.5-Pro and GPT-3.5) reveals significant limitations in handling complex queries related to API selection, parameter filling, and requesting necessary information from systems and users. These findings highlight the challenges that API-based agents face in effectively fulfilling real and complex user queries. All datasets, code, and experimental results will be available at https://github.com/eachsheep/shortcutsbench.
Submitted 28 June, 2024;
originally announced July 2024.
-
Radiative Thermal Transistor
Authors:
Yuxuan Li,
Yongdi Dang,
Shen Zhang,
Xinran Li,
Yi Jin,
Philippe Ben-Abdallah,
Jianbin Xu,
Yungui Ma
Abstract:
Developing thermal analogues of the field-effect transistor could open the door to a low-power and even zero-power communication technology working with heat rather than electricity. These solid-state devices could also find many applications in the field of active thermal management in numerous technologies (microelectronics, building science, energy harvesting and conversion, ...). Recent theoretical work has suggested that a photonic transistor made with three terminals can in principle be used to switch, modulate, and even amplify heat flux through the exchange of thermal photons. Here, we report an experimental demonstration of the thermal transistor effect using a non-contact system composed of a temperature-controlled metal-insulator-based material interacting in the far-field regime with two blackbodies held at two different temperatures. We demonstrate that, with a tiny change in the temperature of the active layer, the heat flux received by the cold blackbody can be drastically modified. An amplification parameter of heat flux over 20 is reported.
Submitted 15 June, 2024;
originally announced July 2024.
-
Reionization Parameter Inference from 3D Minkowski Functionals of the 21 cm Signals
Authors:
Kangning Diao,
Zhaoting Chen,
Xuelei Chen,
Yi Mao
Abstract:
The Minkowski Functionals (MFs), a set of topological summary statistics, have emerged as a powerful tool for extracting non-Gaussian information. We investigate the prospect of constraining the reionization parameters using the MFs of the 21 cm brightness temperature field from the epoch of reionization (EoR). Realistic effects, including thermal noise, synthesized beam, and foreground avoidance, are applied to the mock observations from the radio interferometric array experiments such as the Hydrogen Epoch of Reionization Array (HERA) and the Square Kilometre Array (SKA). We demonstrate that the MFs of the 21 cm signal measured with SKA-Low can be used to distinguish different reionization models, whereas the MF measurement with a HERA-like array cannot be made accurately enough. We further forecast the accuracies with which the MF measurements can place constraints on reionization parameters, using the standard MCMC analysis for parameter inference based on forward modeling. We find that for SKA-Low observation, MFs provide unbiased estimations of the reionization parameters with accuracies comparable to the power spectrum (PS) analysis. Furthermore, joint constraints using both MFs and PS can improve the constraint accuracies by up to $30\%$ compared to those with the PS alone. Nevertheless, the constraint accuracies can be degraded if the EoR window is shrunk with strong foreground avoidance. Our analysis demonstrates the promise of MFs as a set of summary statistics that extract complementary information from the 21 cm EoR field to the two-point statistics, which suggests a strong motivation for incorporating the MFs into the data analysis of future 21 cm observations.
Submitted 28 June, 2024;
originally announced June 2024.
-
FAST survey of H I and OH absorption towards extragalactic radio sources
Authors:
Yogesh Chandola,
D. J. Saikia,
Yin-Zhe Ma,
Zheng Zheng,
Chao-Wei Tsai,
Di Li,
Denis Tramonte,
Hengxing Pan
Abstract:
Neutral atomic hydrogen and molecular gas in the host galaxies of radio active galactic nuclei (AGN) can be traced using H I 21-cm and OH-1667 MHz absorption lines to understand the fueling and feedback processes. We present the results of an H I and OH absorption survey with the Five-hundred-meter Aperture Spherical radio Telescope (FAST) towards 40 radio sources of low-intermediate radio luminosity ($\sim$10$^{23}$-10$^{26}$ W Hz$^{-1}$ at 1.4 GHz), red mid-infrared color (W2[4.6 $μ$m]$-$W3[12 $μ$m] $>$ 2.5 mag) and redshift up to 0.35. From 13 sources with good data at H I observing frequencies, we report the detection of H I absorption towards 8 sources, 5 of which are new detections including 4 in the redshift range 0.25 to 0.35. Our detection rates are consistent with our previous results with dependence on the star-formation history of the host galaxy reflected in the mid-infrared \textit{WISE} W2$-$W3 colors and the compactness of the radio source. We find no significant dependence of detection rates on radio luminosity or redshift. We also find that H I column densities are anti-correlated with the low-frequency spectral indices ($α_{\rm 150 MHz}^{\rm 1.4 GHz}$, $S_ν\propto ν^{-α}$). We do not have any detection from 23 sources with good data at OH observing frequencies. However, by stacking the spectra we estimate the 3$σ$ upper limit of OH column density to be 2.27$\times$10$^{14}$$T_{\rm ex}$/10 K $\times$1/$f_{\rm c}$ cm$^{-2}$. By stacking the OH spectra for 7 associated H I absorbers, we get a 3$σ$ upper limit of 3.47$\times$10$^{14}$ $T_{\rm ex}$/10 K $\times$1/$f_{\rm c}$ cm$^{-2}$ on OH column density and 1.78$\times$10$^{-7}$ on [OH]/[H I] ratio.
Submitted 28 June, 2024;
originally announced June 2024.
-
Enhancing Terrestrial Net Primary Productivity Estimation with EXP-CASA: A Novel Light Use Efficiency Model Approach
Authors:
Guanzhou Chen,
Kaiqi Zhang,
Xiaodong Zhang,
Hong Xie,
Haobo Yang,
Xiaoliang Tan,
Tong Wang,
Yule Ma,
Qing Wang,
Jinzhou Cao,
Weihong Cui
Abstract:
The Light Use Efficiency model, epitomized by the CASA model, is extensively applied in the quantitative estimation of vegetation Net Primary Productivity (NPP). However, the classic CASA model is marked by significant complexity: the estimation of environmental stress parameters, in particular, necessitates multi-source observation data, adding to the complexity and uncertainty of the model's operation. Additionally, the saturation effect of the Normalized Difference Vegetation Index (NDVI), a key variable in the CASA model, weakens the accuracy of CASA's NPP predictions in densely vegetated areas. To address these limitations, this study introduces the Exponential-CASA (EXP-CASA) model. The EXP-CASA model improves the CASA model by using novel functions for estimating the fraction of absorbed photosynthetically active radiation (FPAR) and environmental stress, drawing on long-term observational data from FLUXNET and MODIS surface reflectance data. In a comparative analysis of NPP estimation accuracy among four different NPP products, EXP-CASA ($R^2 = 0.68$, $\mathrm{RMSE} = 1.1~\mathrm{gC \cdot m^{-2} \cdot d^{-1}}$) outperforms others, followed by GLASS-NPP, and lastly MODIS-NPP and classic CASA. Additionally, this research assesses the EXP-CASA model's adaptability to various vegetation indices, evaluates the sensitivity and stability of its parameters over time, and compares its accuracy against other leading NPP estimation products. The findings reveal that the EXP-CASA model exhibits strong adaptability to diverse vegetation indices and stability of model parameters over time series. By introducing a novel estimation approach that optimizes model construction, the EXP-CASA model remarkably improves the accuracy of NPP estimations and paves the way for global-scale, consistent, and continuous assessment of vegetation NPP.
Submitted 28 June, 2024;
originally announced June 2024.
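For orientation, the classic CASA light use efficiency formulation that this abstract takes as its baseline can be sketched as follows. The sketch uses the commonly cited form, NPP = APAR × ε with APAR = SOL × FPAR × 0.5 and ε = T1 × T2 × W × ε_max. It is not the EXP-CASA model, whose new FPAR and stress functions are not given in the abstract. The function name `casa_npp`, the default `eps_max` value, and all numeric inputs are illustrative assumptions.

```python
def casa_npp(sol, fpar, t_stress1, t_stress2, w_stress, eps_max=0.389):
    """Classic CASA-style NPP estimate (gC m^-2 per time step).

    sol       -- total incoming solar radiation (MJ m^-2 per time step)
    fpar      -- fraction of absorbed photosynthetically active radiation (0..1)
    t_stress1 -- first temperature stress scalar (0..1)
    t_stress2 -- second temperature stress scalar (0..1)
    w_stress  -- water stress scalar (0..1)
    eps_max   -- maximum light use efficiency (gC MJ^-1); the default here is
                 an assumed, commonly cited global value
    """
    # Roughly half of incoming solar radiation is photosynthetically active.
    apar = sol * fpar * 0.5
    # Actual light use efficiency after environmental down-regulation.
    epsilon = t_stress1 * t_stress2 * w_stress * eps_max
    return apar * epsilon


# Hypothetical inputs for a single pixel and time step.
npp = casa_npp(sol=20.0, fpar=0.6, t_stress1=0.9, t_stress2=0.95, w_stress=0.8)
print(f"NPP = {npp:.3f} gC m^-2")
```

The three stress scalars are the terms that, according to the abstract, require multi-source observation data in the classic model, and `fpar` is the term affected by NDVI saturation.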
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
Submitted 27 June, 2024;
originally announced June 2024.
-
Accuracy on the wrong line: On the pitfalls of noisy data for out-of-distribution generalisation
Authors:
Amartya Sanyal,
Yaxi Hu,
Yaodong Yu,
Yian Ma,
Yixin Wang,
Bernhard Schölkopf
Abstract:
"Accuracy-on-the-line" is a widely observed phenomenon in machine learning, where a model's accuracy on in-distribution (ID) and out-of-distribution (OOD) data is positively correlated across different hyperparameters and data configurations. But when does this useful relationship break down? In this work, we explore its robustness. The key observation is that noisy data and the presence of nuisance features can be sufficient to shatter the Accuracy-on-the-line phenomenon. In these cases, ID and OOD accuracy can become negatively correlated, leading to "Accuracy-on-the-wrong-line". This phenomenon can also occur in the presence of spurious (shortcut) features, which tend to overshadow the more complex signal (core, non-spurious) features, resulting in a large nuisance feature space. Moreover, scaling to larger datasets does not mitigate this undesirable behavior and may even exacerbate it. We formally prove a lower bound on Out-of-distribution (OOD) error in a linear classification model, characterizing the conditions on the noise and nuisance features for a large OOD error. We finally demonstrate this phenomenon across both synthetic and real datasets with noisy data and nuisance features.
Submitted 27 June, 2024;
originally announced June 2024.
-
Fudan Multi-purpose Active TArget Time Projection Chamber (fMeta-TPC) for Photonnuclear Reaction Experiments
Authors:
Huang-Kai Wu,
Xi-Yang Wang,
Yu-Miao Wang,
You-Jing Wang,
De-Qing Fang,
Wan-Bing He,
Wei-Hu Ma,
Xi-Guang Cao,
Chang-Bo Fu,
Xian-Gai Deng,
Yu-Gang Ma
Abstract:
Active Target Time Projection Chambers (AT-TPCs) are state-of-the-art tools in the field of low-energy nuclear physics, particularly suitable for experiments using low-intensity radioactive ion beams or gamma rays. The Fudan Multi-purpose Active Target Time Projection Chamber (fMeta-TPC) with 2048 channels has been developed to study $α$-clustering nuclei. In this work, the focus is on the study of the photonuclear reaction with the Laser Compton Scattering (LCS) gamma source, especially for the decay of the highly excited $α$-cluster state. The design of fMeta-TPC is described, and a comprehensive evaluation of its offline performance is carried out using an ultraviolet (UV) laser and a $^{241}$Am $α$ source. The results show that the detector has an intrinsic angular resolution within 0.30$^{\circ}$ and an energy resolution of 6.85% for 3.0 MeV $α$ particles. The gain uniformity of the detector is about 10% (RMS/Mean), tested with a $^{55}$Fe X-ray source.
Submitted 14 June, 2024;
originally announced June 2024.