-
Region Attention Transformer for Medical Image Restoration
Authors:
Zhiwen Yang,
Haowei Chen,
Ziniu Qian,
Yang Zhou,
Hui Zhang,
Dan Zhao,
Bingzheng Wei,
Yan Xu
Abstract:
Transformer-based methods have demonstrated impressive results in medical image restoration, attributed to the multi-head self-attention (MSA) mechanism in the spatial dimension. However, the majority of existing Transformers conduct attention within fixed and coarsely partitioned regions (e.g., the entire image or fixed patches), resulting in interference from irrelevant regions and fragmentation of continuous image content. To overcome these challenges, we introduce a novel Region Attention Transformer (RAT) that utilizes a region-based multi-head self-attention mechanism (R-MSA). The R-MSA dynamically partitions the input image into non-overlapping semantic regions using the robust Segment Anything Model (SAM) and then performs self-attention within these regions. This region partitioning is more flexible and interpretable, ensuring that only pixels from similar semantic regions complement each other, thereby eliminating interference from irrelevant regions. Moreover, we introduce a focal region loss to guide our model to adaptively focus on recovering high-difficulty regions. Extensive experiments demonstrate the effectiveness of RAT in various medical image restoration tasks, including PET image synthesis, CT image denoising, and pathological image super-resolution. Code is available at https://github.com/Yaziwel/Region-Attention-Transformer-for-Medical-Image-Restoration.git.
Submitted 12 July, 2024;
originally announced July 2024.
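
The idea of attending only within a semantic region, as described in the abstract above, can be illustrated with a minimal sketch. This is not the authors' R-MSA: the region labels are supplied by hand rather than produced by SAM, there are no learned projections or multiple heads, and the function name `region_attention` is hypothetical.

```python
import math

def region_attention(features, region_ids):
    """Self-attention where each token attends only to tokens in its own region.

    features: list of feature vectors (lists of floats), one per pixel/token.
    region_ids: list of integer region labels, same length as features.
    Queries, keys and values are the raw features (no learned projections).
    """
    dim = len(features[0])
    scale = 1.0 / math.sqrt(dim)
    output = []
    for i, query in enumerate(features):
        # Indices of tokens sharing the query's region (always includes i).
        peers = [j for j, r in enumerate(region_ids) if r == region_ids[i]]
        scores = [scale * sum(q * k for q, k in zip(query, features[j]))
                  for j in peers]
        # Numerically stable softmax over the in-region scores only.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Weighted average of the in-region value vectors.
        mixed = [
            sum(w * features[j][d] for w, j in zip(weights, peers)) / total
            for d in range(dim)
        ]
        output.append(mixed)
    return output

# Four tokens: the first two share region 0, the others are alone.
tokens = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [9.0, 9.0]]
labels = [0, 0, 1, 2]
attended = region_attention(tokens, labels)
```

Tokens in regions 1 and 2 have no peers, so they pass through unchanged, while the two tokens in region 0 mix only with each other and never see the large values from the other regions.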
-
The JWST Weather Report from the Nearest Brown Dwarfs I: multi-period JWST NIRSpec + MIRI monitoring of the benchmark binary brown dwarf WISE 1049AB
Authors:
Beth A. Biller,
Johanna M. Vos,
Yifan Zhou,
Allison M. McCarthy,
Xianyu Tan,
Ian J. M. Crossfield,
Niall Whiteford,
Genaro Suarez,
Jacqueline Faherty,
Elena Manjavacas,
Xueqing Chen,
Pengyu Liu,
Ben J. Sutlieff,
Mary Anne Limbach,
Paul Molliere,
Trent J. Dupuy,
Natalia Oliveros-Gomez,
Philip S. Muirhead,
Thomas Henning,
Gregory Mace,
Nicolas Crouzet,
Theodora Karalidi,
Caroline V. Morley,
Pascal Tremblin,
Tiffany Kataria
Abstract:
We report results from 8 hours of JWST/MIRI LRS spectroscopic monitoring directly followed by 7 hours of JWST/NIRSpec prism spectroscopic monitoring of the benchmark binary brown dwarf WISE 1049AB, the closest, brightest brown dwarfs known. We find water, methane, and CO absorption features in both components, including the 3.3 $μ$m methane absorption feature and a tentative detection of small grain ($<$ 1$μ$m) silicate absorption at $>$8.5 $μ$m in WISE 1049A. Both components vary significantly ($>$1$\%$), with WISE 1049B displaying larger variations than WISE 1049A. Using K-means clustering, we find three main transition points in wavelength for both components of the binary: 1) change in behavior at $\sim$2.3 $μ$m coincident with a CO absorption bandhead, 2) change in behavior at 4.2 $μ$m, close to the CO fundamental band at $λ>$ 4.4 $μ$m, and 3) change in behavior at 8.3-8.5 $μ$m, potentially corresponding to silicate absorption. We interpret the lightcurves observed with both NIRSpec and MIRI as likely stemming from 1) a deep pressure level driving the double-peaked variability seen in WISE 1049B at wavelengths $<$2.3 $μ$m and $>$8.5 $μ$m, 2) an intermediate pressure level shaping the lightcurve morphology between 2.3 and 4.2 $μ$m, and 3) a higher-altitude pressure level producing single-peaked and plateaued lightcurve behavior between 4.2 and 8.5 $μ$m.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays at Belle II
Authors:
Belle II Collaboration,
I. Adachi,
L. Aggarwal,
H. Ahmed,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (414 additional authors not shown)
Abstract:
We report measurements of time-dependent $CP$ asymmetries in $B^0 \to K^0_S π^0 γ$ decays based on a data sample of $(388\pm6)\times10^6$ $B\bar{B}$ events collected at the $Υ(4S)$ resonance with the Belle II detector. The Belle II experiment operates at the SuperKEKB asymmetric-energy $e^+e^-$ collider. We measure decay-time distributions to determine $CP$-violating parameters $S$ and $C$. We determine these parameters for two ranges of $K^0_S π^0$ invariant mass: $m(K^0_S π^0)\in (0.8, 1.0)$ $GeV/c^2$, which is dominated by $B^0 \to K^{*0} (\to K^0_S π^0) γ$ decays, and a complementary region $m(K^0_S π^0)\in (0.6, 0.8)\cup(1.0, 1.8)$ $GeV/c^2$. Our results have improved precision as compared to previous measurements and are consistent with theory predictions.
Submitted 12 July, 2024;
originally announced July 2024.
-
AUITestAgent: Automatic Requirements Oriented GUI Function Testing
Authors:
Yongxiang Hu,
Xuan Wang,
Yingchuan Wang,
Yu Zhang,
Shiyu Guo,
Chaoyi Chen,
Xin Wang,
Yangfan Zhou
Abstract:
The Graphical User Interface (GUI) is how users interact with mobile apps. Testing engineers have to make sure it functions as intended, based on test requirements that are typically written in natural language. While widely adopted manual testing and script-based methods are effective, they demand substantial effort due to the vast number of GUI pages and rapid iterations in modern mobile apps. This paper introduces AUITestAgent, the first automatic, natural language-driven GUI testing tool for mobile apps, capable of fully automating the entire process of GUI interaction and function verification. Since test requirements typically contain interaction commands and verification oracles, AUITestAgent extracts GUI interactions from test requirements via dynamically organized agents. Then, AUITestAgent employs a multi-dimensional data extraction strategy to retrieve data relevant to the test requirements from the interaction trace and perform verification. Experiments on customized benchmarks demonstrate that AUITestAgent outperforms existing tools in the quality of generated GUI interactions and achieves a verification accuracy of 94%. Moreover, field deployment at Meituan has shown AUITestAgent's practical usability: it detected 4 new functional bugs during 10 regression tests in two months.
Submitted 12 July, 2024;
originally announced July 2024.
-
Measurement of branching fractions, CP asymmetry, and isospin asymmetry for $\boldsymbol{B\rightarrowργ}$ decays using Belle and Belle II data
Authors:
Belle II Collaboration,
I. Adachi,
K. Adamczyk,
L. Aggarwal,
H. Aihara,
N. Akopov,
A. Aloisio,
N. Anh Ky,
D. M. Asner,
H. Atmacan,
T. Aushev,
V. Aushev,
M. Aversano,
R. Ayad,
V. Babu,
H. Bae,
S. Bahinipati,
P. Bambade,
Sw. Banerjee,
S. Bansal,
M. Barrett,
J. Baudot,
A. Baur,
A. Beaubien,
F. Becherer
, et al. (385 additional authors not shown)
Abstract:
We present measurements of $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays using a combined data sample of $772 \times 10^6$ $B\overline{B}$ pairs collected by the Belle experiment and $387\times 10^6$ $B\overline{B}$ pairs collected by the Belle II experiment in $e^{+}e^{-}$ collisions at the $Υ(4S)$ resonance. After an optimized selection, a simultaneous fit to the Belle and Belle II data sets yields $114\pm 12$ $B^{+}\rightarrowρ^{+}γ$ and $99\pm 12$ $B^{0}\rightarrowρ^{0}γ$ decays. The measured branching fractions are $(13.1^{+2.0 +1.3}_{-1.9 -1.2})\times 10^{-7}$ and $(7.5\pm 1.3^{+1.0}_{-0.8})\times 10^{-7}$ for $B^{+}\rightarrowρ^{+}γ$ and $B^{0}\rightarrowρ^{0}γ$ decays, respectively, where the first uncertainty is statistical and the second is systematic. We also measure the isospin asymmetry $A_{\rm I}(B\rightarrowργ)=(10.9^{+11.2 +7.8}_{-11.7 -7.3})\%$ and the direct CP asymmetry $A_{CP}(B^{+}\rightarrowρ^{+}γ)=(-8.2\pm 15.2^{+1.6}_{-1.2})\%$.
Submitted 12 July, 2024;
originally announced July 2024.
-
DeCE: Deceptive Cross-Entropy Loss Designed for Defending Backdoor Attacks
Authors:
Guang Yang,
Yu Zhou,
Xiang Chen,
Xiangyu Zhang,
Terry Yue Zhuo,
David Lo,
Taolue Chen
Abstract:
Code Language Models (CLMs), particularly those leveraging deep learning, have achieved significant success in the code intelligence domain. However, the issue of security, particularly backdoor attacks, is often overlooked in this process. Previous research has focused on designing backdoor attacks for CLMs, but effective defenses have not been adequately addressed. In particular, existing defense methods from natural language processing, when directly applied to CLMs, are not effective enough and lack generality: they work well in some models and scenarios but fail in others, and thus fall short of consistently mitigating backdoor attacks. To bridge this gap, we first confirm the phenomenon of "early learning" as a general occurrence during the training of CLMs. In this phenomenon, a model initially focuses on the main features of the training data but may become more sensitive to backdoor triggers over time, leading to overfitting and susceptibility to backdoor attacks. We then show that overfitting to backdoor triggers results from the use of the cross-entropy loss function, where the unboundedness of cross-entropy leads the model to increasingly concentrate on the features of the poisoned data. Based on this insight, we propose a general and effective loss function, DeCE (Deceptive Cross-Entropy), which blends deceptive distributions and applies label smoothing to keep the gradient bounded. This prevents the model from overfitting to backdoor triggers and thereby enhances the security of CLMs against backdoor attacks. To verify the effectiveness of our defense method, we select code synthesis tasks as our experimental scenarios. Our experiments across various code synthesis datasets, models, and poisoning ratios demonstrate the applicability and effectiveness of DeCE in enhancing the security of CLMs.
Submitted 11 July, 2024;
originally announced July 2024.
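
The contrast between unbounded cross-entropy and a bounded, blended variant can be sketched as follows. The abstract above does not give the DeCE formula, so everything here is an assumption made for illustration: the blend weight `alpha`, the `smoothing` value, the function names, and the specific form (cross-entropy between a smoothed label and a mixture of the prediction with that label). It should not be read as the authors' definition.

```python
import math

def smoothed_one_hot(target, num_classes, smoothing):
    """Label smoothing: move `smoothing` mass from the target to other classes."""
    off = smoothing / (num_classes - 1)
    return [1.0 - smoothing if c == target else off for c in range(num_classes)]

def cross_entropy(probs, target):
    """Standard cross-entropy; grows without bound as probs[target] -> 0."""
    return -math.log(probs[target])

def blended_cross_entropy(probs, target, alpha=0.9, smoothing=0.1):
    """Cross-entropy of a smoothed label against a blended prediction.

    Assumed form (hypothetical): blend_c = alpha * p_c + (1 - alpha) * y_c.
    Because blend_c >= (1 - alpha) * y_c > 0, the loss stays finite even
    when the model assigns almost no probability to the target class.
    """
    label = smoothed_one_hot(target, len(probs), smoothing)
    blend = [alpha * p + (1.0 - alpha) * y for p, y in zip(probs, label)]
    return -sum(y * math.log(b) for y, b in zip(label, blend))

def loss_upper_bound(num_classes, target=0, alpha=0.9, smoothing=0.1):
    """Upper bound on the blended loss: -log(1 - alpha) plus the label entropy."""
    label = smoothed_one_hot(target, num_classes, smoothing)
    entropy = -sum(y * math.log(y) for y in label)
    return -math.log(1.0 - alpha) + entropy

# A prediction that is almost certainly wrong about the target class 0.
bad_prediction = [1e-12, 1.0 - 1e-12]
plain = cross_entropy(bad_prediction, target=0)
blended = blended_cross_entropy(bad_prediction, target=0)
bound = loss_upper_bound(num_classes=2)
```

Under these assumptions the plain loss keeps growing as the target probability shrinks, whereas the blended loss cannot exceed `bound`, which is the kind of boundedness the abstract attributes to DeCE.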
-
Constraints of the maximum mass of quark stars based on post-merger evolutions
Authors:
Yurui Zhou,
Chen Zhang,
Junjie Zhao,
Kenta Kiuchi,
Sho Fujibayashi,
Enping Zhou
Abstract:
We semi-analytically investigate the post-merger evolution of the binary quark star merger. The effective-one-body method is employed to estimate the energy and angular momentum dissipation due to gravitational waves in the inspiral phase. Three major mechanisms of energy and angular momentum dissipation are considered in the post-merger phase: mass outflows, neutrinos, and gravitational waves. The proportion of each mechanism could be determined by baryon number, energy and angular momentum conservation laws as well as the equilibrium model for rotating quark stars. Applying this analysis to the GW170817 event suggests two important conclusions: 1) a remnant quark star whose mass is smaller than the maximum mass of a uniformly rotating quark star can collapse before its rotational energy is dissipated via electromagnetic radiation (i.e., $\sim 100\,\mathrm{s}$) as the angular momentum left in the remnant quark star might not be large enough to sustain the additional self-gravity of the supramassive quark star due to the angular momentum dissipation of mass outflows, neutrinos and gravitational waves; 2) considering a general quark star equation of state model, a constraint on the maximum mass of cold and non-rotating quark stars is found as $M_{\mathrm{TOV}}\lesssim2.35^{+0.07}_{-0.17}\,M_{\odot}$, assuming a delayed collapse occurred before a large fraction of the total rotational energy ($\gtrsim 10^{53}\,$erg) of the merger remnant was deposited into the merger environment for the GW170817 event. These constraints could be improved with future merger events, once there is more evidence on the post-merger evolution channel or information on the amount of post-merger gravitational wave and neutrino emissions inferred from multi-messenger observations.
Submitted 11 July, 2024;
originally announced July 2024.
-
Natural language is not enough: Benchmarking multi-modal generative AI for Verilog generation
Authors:
Kaiyan Chang,
Zhirong Chen,
Yunhao Zhou,
Wenlong Zhu,
kun wang,
Haobo Xu,
Cangyuan Li,
Mengdi Wang,
Shengwen Liang,
Huawei Li,
Yinhe Han,
Ying Wang
Abstract:
Natural language interfaces have exhibited considerable potential in the automation of Verilog generation derived from high-level specifications through the utilization of large language models, garnering significant attention. Nevertheless, this paper elucidates that visual representations contribute essential contextual information critical to design intent for hardware architectures possessing spatial complexity, potentially surpassing the efficacy of natural-language-only inputs. Expanding upon this premise, our paper introduces an open-source benchmark for multi-modal generative models tailored for Verilog synthesis from visual-linguistic inputs, addressing both singular and complex modules. Additionally, we introduce an open-source visual and natural language Verilog query language framework to facilitate efficient and user-friendly multi-modal queries. To evaluate the performance of the proposed multi-modal hardware generative AI in Verilog generation tasks, we compare it with a popular method that relies solely on natural language. Our results demonstrate a significant accuracy improvement in the multi-modal generated Verilog compared to queries based solely on natural language. We hope to reveal a new approach to hardware design in the large-hardware-design-model era, thereby fostering a more diversified and productive approach to hardware design.
Submitted 11 July, 2024;
originally announced July 2024.
-
Einasto profile as the halo model solution coupled to the depletion radius
Authors:
Yifeng Zhou,
Jiaxin Han
Abstract:
We constrain the halo profiles outside the halo boundaries by solving for the matching profiles required by the halo model. In the halo model framework, the matter distribution in the universe can be decomposed into the spatial distribution of halos convolved with their internal structures. This leads to a set of linear equations in Fourier space which uniquely determines the optimal halo profiles for any given halo catalog. In this work, we construct three halo catalogs with different boundary definitions, and solve for the optimal profiles in each case using measurements of halo-matter and halo-halo power spectra. Our results show that for a given halo field, there is always a set of matching profiles to accurately reconstruct the input statistics of the matter field, even though it might be complex to model the profiles analytically. Comparing the solutions from different halo catalogs, we find their mass distributions inside the inner depletion radii are nearly identical, while they deviate from each other on larger scales, with a larger boundary resulting in a more extended profile. For the depletion radius based catalog, the numerical solution agrees well with the Einasto profile. Coupling the Einasto profile with the depletion catalog, the resulting halo model can simultaneously predict the halo-matter power spectra to $10\%$ and matter-matter power spectrum to $5\%$, improving over conventional models in both the interpretability and versatility.
Submitted 11 July, 2024;
originally announced July 2024.
-
Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On
Authors:
Liang Zeng,
Liangjun Zhong,
Liang Zhao,
Tianwen Wei,
Liu Yang,
Jujie He,
Cheng Cheng,
Rui Hu,
Yang Liu,
Shuicheng Yan,
Han Fang,
Yahui Zhou
Abstract:
In this paper, we investigate the underlying factors that potentially enhance the mathematical reasoning capabilities of large language models (LLMs). We argue that the data scaling law for math reasoning capabilities in modern LLMs is far from being saturated, highlighting how the model's quality improves with increases in data quantity. To support this claim, we introduce the Skywork-Math model series, supervised fine-tuned (SFT) on common 7B LLMs using our proposed 2.5M-instance Skywork-MathQA dataset. Skywork-Math 7B has achieved impressive accuracies of 51.2% on the competition-level MATH benchmark and 83.9% on the GSM8K benchmark using only SFT data, outperforming an early version of GPT-4 on MATH. The superior performance of Skywork-Math models is attributable to our novel two-stage data synthesis and model SFT pipelines, which include three different augmentation methods and a diverse seed problem set, ensuring both the quantity and quality of the Skywork-MathQA dataset across varying difficulty levels. Most importantly, we provide several practical takeaways to enhance math reasoning abilities in LLMs for both research and industry applications.
Submitted 11 July, 2024;
originally announced July 2024.
-
Search, Examine and Early-Termination: Fake News Detection with Annotation-Free Evidences
Authors:
Yuzhou Yang,
Yangming Zhou,
Qichao Ying,
Zhenxing Qian,
Xinpeng Zhang
Abstract:
Pioneering research recognizes evidence, in addition to patterns, as a crucial element in fake news detection. Existing evidence-aware methods either require laborious pre-processing procedures to ensure relevant and high-quality evidence data, or incorporate the entire spectrum of available evidence in all news cases, regardless of the quality and quantity of the retrieved data. In this paper, we propose an approach named SEE that retrieves useful information from web-searched annotation-free evidence with an early-termination mechanism. The proposed SEE consists of three main phases: Searching online materials using the news as a query and directly using their titles as evidence without any annotating or filtering procedure, sequentially Examining the news alongside each piece of evidence via attention mechanisms to produce new hidden states with retrieved information, and allowing Early-termination within the examining loop by assessing whether there is adequate confidence for producing a correct prediction. We have conducted extensive experiments on datasets with unprocessed evidence, i.e., Weibo21 and GossipCop, and with pre-processed evidence, namely Snopes and PolitiFact. The experimental results demonstrate that the proposed method outperforms state-of-the-art approaches.
Submitted 10 July, 2024;
originally announced July 2024.
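
The examine-then-terminate loop described in the abstract above can be sketched in a few lines. This is only a toy under stated assumptions: the scalar `state`, the additive update standing in for the attention-based examination, the sigmoid confidence, the threshold value, and the name `examine_with_early_stop` are all hypothetical and are not taken from the SEE paper.

```python
import math

def examine_with_early_stop(evidence_scores, threshold=0.9):
    """Examine evidence one item at a time and stop once confident.

    evidence_scores: hypothetical per-evidence support values, where
    positive means "supports fake" and negative means "supports real".
    Returns (label, confidence, number_of_items_examined).
    """
    state = 0.0        # scalar stand-in for a learned hidden state
    examined = 0
    confidence = 0.5   # no evidence seen yet: maximally uncertain
    for score in evidence_scores:
        state += score  # stand-in for the attention-based state update
        examined += 1
        prob_fake = 1.0 / (1.0 + math.exp(-state))
        confidence = max(prob_fake, 1.0 - prob_fake)
        if confidence >= threshold:
            break       # confident enough: skip the remaining evidence
    label = "fake" if state > 0 else "real"
    return label, confidence, examined

# The loop stops after the second item; the last two are never examined.
result = examine_with_early_stop([1.5, 1.0, -0.5, 2.0])
```

The design point the sketch tries to convey is that the amount of evidence consumed varies per news item, so clear-cut cases cost less than ambiguous ones.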
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
Submitted 10 July, 2024;
originally announced July 2024.
-
Trustworthy Contrast-enhanced Brain MRI Synthesis
Authors:
Jiyao Liu,
Yuxin Li,
Shangqi Gao,
Yuncheng Zhou,
Xin Gao,
Ningsheng Xu,
Xiao-Yong Zhang,
Xiahai Zhuang
Abstract:
Contrast-enhanced brain MRI (CE-MRI) is a valuable diagnostic technique but may pose health risks and incur high costs. To create safer alternatives, multi-modality medical image translation aims to synthesize CE-MRI images from other available modalities. Although existing methods can generate promising predictions, they still face two challenges: over-confidence and a lack of interpretability in their predictions. To address these challenges, this paper introduces TrustI2I, a novel trustworthy method that reformulates the multi-to-one medical image translation problem as a multimodal regression problem, aiming to build an uncertainty-aware and reliable system. Specifically, our method leverages deep evidential regression to estimate prediction uncertainties and employs an explicit intermediate and late fusion strategy based on the Mixture of Normal Inverse Gamma (MoNIG) distribution, enhancing both synthesis quality and interpretability. Additionally, we incorporate uncertainty calibration to improve the reliability of the uncertainty estimates. Validation on the BraTS2018 dataset demonstrates that our approach surpasses current methods, producing higher-quality images with reasonable uncertainty estimation.
Submitted 10 July, 2024;
originally announced July 2024.
-
Event-Aided Time-to-Collision Estimation for Autonomous Driving
Authors:
Jinghang Li,
Bangyan Liao,
Xiuyuan LU,
Peidong Liu,
Shaojie Shen,
Yi Zhou
Abstract:
Predicting a potential collision with leading vehicles is an essential functionality of any autonomous/assisted driving system. One bottleneck of existing vision-based solutions is that their updating rate is limited to the frame rate of standard cameras used. In this paper, we present a novel method that estimates the time to collision using a neuromorphic event-based camera, a biologically inspired visual sensor that can sense at exactly the same rate as scene dynamics. The core of the proposed algorithm consists of a two-step approach for efficient and accurate geometric model fitting on event data in a coarse-to-fine manner. The first step is a robust linear solver based on a novel geometric measurement that overcomes the partial observability of event-based normal flow. The second step further refines the resulting model via a spatio-temporal registration process formulated as a nonlinear optimization problem. Experiments on both synthetic and real data demonstrate the effectiveness of the proposed method, outperforming other alternative methods in terms of efficiency and accuracy.
Submitted 9 July, 2024;
originally announced July 2024.
-
Bimerons create bimerons: proliferation and aggregation induced by currents and magnetic fields
Authors:
Xichao Zhang,
Yan Zhou,
Xiuzhen Yu,
Masahito Mochizuki
Abstract:
The aggregation of topological spin textures at nano and micro scales has practical applications in spintronic technologies. Here, the authors report the in-plane current-induced proliferation and aggregation of bimerons in a bulk chiral magnet. It is found that the spin-transfer torques can induce the proliferation and aggregation of bimerons only in the presence of an appropriate out-of-plane magnetic field. It is also found that a relatively small damping and a relatively large non-adiabatic spin-transfer torque could lead to more pronounced bimeron proliferation and aggregation. Particularly, the current density should be larger than a certain threshold in order to trigger the proliferation; namely, the bimerons may only be driven into translational motion under weak current injection. Besides, the authors find that the aggregate bimerons could relax into a deformed honeycomb bimeron lattice with a few lattice structure defects after the current injection. The results are promising for the development of bio-inspired spintronic devices that use a large number of aggregate bimerons. The findings also provide a platform for studying aggregation-induced effects in spintronic systems, such as the aggregation-induced lattice phase transitions.
Submitted 9 July, 2024;
originally announced July 2024.
-
Transformation of a cellular skyrmion to polyomino-like structures
Authors:
Jing Xia,
Xichao Zhang,
Yan Zhou,
Xiaoxi Liu,
Guoping Zhao,
Masahito Mochizuki
Abstract:
Topological spin structures with transformable shapes may have potential implications on data storage and computation. Here, we demonstrate that a square cellular skyrmion on an artificial grid pinning pattern can be manipulated by programmed current pulses. We find that parallel short pulses could result in the elongation of the skyrmion mainly in the current direction, while parallel long pulses are able to induce the elongation in the direction perpendicular to the current due to the intrinsic skyrmion Hall effect. Consequently, a programmed sequence of parallel pulses could lead to the transformation of the skyrmion to I-, L-, and Z-shaped polyomino-like structures without affecting the topological charge. In addition, we find that orthogonal pulses could lead to the transformation to more complex polyomino-like structures, including the T-shaped and irregular ones. Particularly, when a small T-shaped structure is formed, the topological charge of the system is found to be non-integer due to incomplete compensation of local topological charge densities; however, the T-shaped structure is stable on the attractive pinning pattern. Our results offer an effective way to create polyomino-like spin structures toward functional applications.
Submitted 9 July, 2024;
originally announced July 2024.
-
Resolving Sentiment Discrepancy for Multimodal Sentiment Detection via Semantics Completion and Decomposition
Authors:
Daiqing Wu,
Dongbao Yang,
Huawen Shen,
Can Ma,
Yu Zhou
Abstract:
With the proliferation of social media posts in recent years, the need to detect sentiments in multimodal (image-text) content has grown rapidly. Since posts are user-generated, the image and text from the same post can express different or even contradictory sentiments, leading to potential sentiment discrepancy. However, existing works mainly adopt a single-branch fusion structure that primarily captures the consistent sentiment between image and text. Ignoring discrepant sentiment, or modeling it only implicitly, results in compromised unimodal encoding and limited performance. In this paper, we propose a semantics Completion and Decomposition (CoDe) network to resolve the above issue. In the semantics completion module, we complement image and text representations with the semantics of the OCR text embedded in the image, helping bridge the sentiment gap. In the semantics decomposition module, we decompose image and text representations with exclusive projection and contrastive learning, thereby explicitly capturing the discrepant sentiment between modalities. Finally, we fuse image and text representations by cross-attention and combine them with the learned discrepant sentiment for final classification. Extensive experiments conducted on four multimodal sentiment datasets demonstrate the superiority of CoDe against SOTA methods.
Submitted 9 July, 2024;
originally announced July 2024.
-
PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods
Authors:
Yiying Wang,
Xiaojing Li,
Binzhu Wang,
Yueyang Zhou,
Han Ji,
Hong Chen,
Jinshi Zhang,
Fei Yu,
Zewei Zhao,
Song Jin,
Renji Gong,
Wanqing Xu
Abstract:
In domain-specific applications, GPT-4, augmented with precise prompts or Retrieval-Augmented Generation (RAG), shows notable potential but faces the critical tri-lemma of performance, cost, and data privacy. High performance requires sophisticated processing techniques, yet managing multiple agents within a complex workflow often proves costly and challenging. To address this, we introduce the PEER (Plan, Execute, Express, Review) multi-agent framework. This systematizes domain-specific tasks by integrating precise question decomposition, advanced information retrieval, comprehensive summarization, and rigorous self-assessment. Given the concerns of cost and data privacy, enterprises are shifting from proprietary models like GPT-4 to custom models, striking a balance between cost, security, and performance. We developed industrial practices leveraging online data and user feedback for efficient model tuning. This study provides best practice guidelines for applying multi-agent systems in domain-specific problem-solving and implementing effective agent tuning strategies. Our empirical studies, particularly in the financial question-answering domain, demonstrate that our approach achieves 95.0% of GPT-4's performance, while effectively managing costs and ensuring data privacy.
Submitted 9 July, 2024; v1 submitted 9 July, 2024;
originally announced July 2024.
-
Reprogramming Distillation for Medical Foundation Models
Authors:
Yuhang Zhou,
Siyuan Du,
Haolin Li,
Jiangchao Yao,
Ya Zhang,
Yanfeng Wang
Abstract:
Medical foundation models pre-trained on large-scale datasets have demonstrated powerful and versatile capabilities for various tasks. However, due to the gap between pre-training tasks (or modalities) and downstream tasks (or modalities), as well as real-world computation and speed constraints, it might not be straightforward to apply medical foundation models in downstream scenarios. Previous methods, such as parameter-efficient fine-tuning (PEFT) methods and knowledge distillation (KD) methods, are unable to simultaneously address the task (or modality) inconsistency and achieve personalized lightweight deployment under diverse real-world demands. To address the above issues, we propose a novel framework called Reprogramming Distillation (RD). On one hand, RD reprograms the original feature space of the foundation model so that it is more relevant to downstream scenarios, aligning tasks and modalities. On the other hand, through a co-training mechanism and a shared classifier, connections are established between the reprogrammed knowledge and the knowledge of student models, ensuring that the reprogrammed feature space can be smoothly mimicked by student models of different structures. Further, to reduce the randomness under different training conditions, we design a Centered Kernel Alignment (CKA) distillation to promote robust knowledge transfer. Empirically, we show that on extensive datasets, RD consistently achieves superior performance compared with previous PEFT and KD methods.
Submitted 8 July, 2024;
originally announced July 2024.
-
Multimodal Chain-of-Thought Reasoning via ChatGPT to Protect Children from Age-Inappropriate Apps
Authors:
Chuanbo Hu,
Bin Liu,
Minglei Yin,
Yilu Zhou,
Xin Li
Abstract:
Mobile applications (Apps) could expose children to inappropriate themes such as sexual content, violence, and drug use. Maturity rating offers a quick and effective method for potential users, particularly guardians, to assess the maturity levels of apps. Determining accurate maturity ratings for mobile apps is essential to protect children's health in today's saturated digital marketplace. Existing approaches to maturity rating are either inaccurate (e.g., self-reported rating by developers) or costly (e.g., manual examination). In the literature, there are few text-mining-based approaches to maturity rating. However, each app typically involves multiple modalities, namely the app description as text and screenshots as images. In this paper, we present a framework for determining app maturity levels that utilizes multimodal large language models (MLLMs), specifically ChatGPT-4 Vision. Powered by Chain-of-Thought (CoT) reasoning, our framework systematically leverages ChatGPT-4 to process multimodal app data (i.e., textual descriptions and screenshots) and guide the MLLM through a step-by-step reasoning pathway from initial content analysis to final maturity rating determination. As a result, by explicitly incorporating CoT reasoning, our framework enables ChatGPT to better understand and apply maturity policies to facilitate maturity rating. Experimental results indicate that the proposed method outperforms all baseline models and other fusion strategies.
Submitted 8 July, 2024;
originally announced July 2024.
-
What's Wrong with Your Code Generated by Large Language Models? An Extensive Study
Authors:
Shihan Dou,
Haoxiang Jia,
Shenxi Wu,
Huiyuan Zheng,
Weikang Zhou,
Muling Wu,
Mingxu Chai,
Jessica Fan,
Caishuang Huang,
Yunbo Tao,
Yan Liu,
Enyu Zhou,
Ming Zhang,
Yuhao Zhou,
Yueming Wu,
Rui Zheng,
Ming Wen,
Rongxiang Weng,
Jingang Wang,
Xunliang Cai,
Tao Gui,
Xipeng Qiu,
Qi Zhang,
Xuanjing Huang
Abstract:
The increasing development of large language models (LLMs) in code generation has drawn significant attention among researchers. To enhance LLM-based code generation ability, current efforts are predominantly directed towards collecting high-quality datasets and leveraging diverse training technologies. However, there is a notable lack of comprehensive studies examining the limitations and boundaries of these existing methods. To bridge this gap, we conducted an extensive empirical study evaluating the performance of three leading closed-source LLMs and four popular open-source LLMs on three commonly used benchmarks. Our investigation, which evaluated the length, cyclomatic complexity and API number of the generated code, revealed that these LLMs face challenges in generating successful code for more complex problems, and tend to produce code that is shorter yet more complicated as compared to canonical solutions. Additionally, we developed a taxonomy of bugs for incorrect code that includes three categories and 12 sub-categories, and analyzed the root causes of common bug types. Furthermore, to better understand the performance of LLMs in real-world projects, we manually created a real-world benchmark comprising 140 code generation tasks. Our analysis highlights distinct differences in bug distributions between actual scenarios and existing benchmarks. Finally, we propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback. Experimental results demonstrate that our approach can significantly mitigate bugs and increase the passing rate by 29.2% after two iterations, indicating substantial potential for LLMs to handle more complex problems.
Submitted 8 July, 2024;
originally announced July 2024.
-
In vacuum metasurface for optical microtrap array
Authors:
Donghao Li,
Qiming Liao,
Beining Xu,
Yaoting Zhou,
Keyu Qin,
Zhongxiao Xu,
Heng Shen,
Lingling Huang
Abstract:
Optical tweezer arrays of laser-cooled and individually controlled particles have revolutionized atomic, molecular and optical physics, and they afford exquisite capabilities for applications in quantum simulation of many-body physics, quantum computation and quantum sensing. Underlying this development is the technical maturity of generating scalable optical beams, enabled by active components and a high-numerical-aperture objective. However, such a complex combination of bulk optics outside the vacuum chamber is very sensitive to any vibration and drift. Here we demonstrate the generation of a 3×3 static tweezer array with a single chip-scale multifunctional metasurface element in vacuum, replacing the meter-long free-space optics. Fluorescence counts on the camera validate the successful trapping of the atomic ensemble array. Further, we discuss the strategy to achieve low scattering and crosstalk, where a metasurface design featuring dual-wavelength independent control is included. Our results, together with other recent developments in integrated photonics for cold atoms, could pave the way for compact and portable quantum sensors and simulators in platforms of neutral atom arrays.
Submitted 8 July, 2024;
originally announced July 2024.
-
Regularity of the $p$-Gauss curvature flow with flat side
Authors:
Genggeng Huang,
Xu-Jia Wang,
Yang Zhou
Abstract:
We study the regularity of the $p$-Gauss curvature flow with flat side. In our previous paper (arXiv:2403.12292), we obtained the regularity of the interface, namely the boundary of the flat part. In this paper, we study the regularity of the convex hypersurface near the interface.
Submitted 8 July, 2024;
originally announced July 2024.
-
Deep Learning-based Anomaly Detection and Log Analysis for Computer Networks
Authors:
Shuzhan Wang,
Ruxue Jiang,
Zhaoqi Wang,
Yan Zhou
Abstract:
Computer network anomaly detection and log analysis, an important topic in the field of network security, is a key task for ensuring network security and system reliability. First, existing network anomaly detection and log analysis methods are often challenged by high-dimensional data and complex network topologies, resulting in unstable performance and high false-positive rates. In addition, traditional methods usually have difficulty handling time-series data, which is crucial for anomaly detection and log analysis. Therefore, we need a more efficient and accurate method to cope with these problems. To compensate for the shortcomings of current methods, we propose an innovative fusion model that integrates Isolation Forest, GAN (Generative Adversarial Network), and Transformer, each of which plays a unique role. Isolation Forest is used to quickly identify anomalous data points, GAN is used to generate synthetic data with the distribution characteristics of the real data to augment the training dataset, and the Transformer is used for modeling and context extraction on time-series data. The synergy of these three components makes our model more accurate and robust in anomaly detection and log analysis tasks. We validate the effectiveness of this fusion model in an extensive experimental evaluation. Experimental results show that our model significantly improves the accuracy of anomaly detection while reducing the false alarm rate, which helps to detect potential network problems in advance. The model also performs well in the log analysis task and is able to quickly identify anomalous behaviors, which helps to improve the stability of the system. The significance of this study is that it introduces advanced deep learning techniques to anomaly detection and log analysis.
Submitted 8 July, 2024;
originally announced July 2024.
-
Multiple scattering and diffusion of scalar coherent waves in a group of small spheroidal particles with random orientations
Authors:
Mingyuan Ren,
Yajing Qiao,
Ning Zhou,
Jianrui Gong,
Yang Zhou,
Yu Zhang
Abstract:
In this manuscript we study multiple scattering and diffusion of scalar wave in a group of monodisperse spheroidal particles with random orientations. We begin by fixing a spheroid in a prolate spheroidal coordinate system, and attain the expansion of the scalar Green's function in this space. The expansion is firstly based on spheroidal wave functions, and then we transform it into the expansion of spherical wave functions. Next, we average the Green's function over the orientations of the spheroid to get the averaged transition operator. Finally, we calculate the transport mean free path and anisotropy factor for the spheroidal particles group, based on the irreducible vertex in the Bethe-Salpeter equation. The approaches to get the average transition operator and the mean free paths in this manuscript will be of benefit to the research area of multiple scattering by non-spherical particles.
Submitted 7 July, 2024;
originally announced July 2024.
-
A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints
Authors:
Xi Chen,
Mo Liu,
Yining Wang,
Yuan Zhou
Abstract:
In this paper, we consider a multi-stage dynamic assortment optimization problem with multi-nomial choice modeling (MNL) under resource knapsack constraints. Given the current resource inventory levels, the retailer makes an assortment decision at each period, and the goal of the retailer is to maximize the total profit from purchases. With the exact optimal dynamic assortment solution being computationally intractable, a practical strategy is to adopt the re-solving technique that periodically re-optimizes deterministic linear programs (LP) arising from fluid approximation. However, the fractional structure of MNL makes the fluid approximation in assortment optimization highly non-linear, which brings new technical challenges. To address this challenge, we propose a new epoch-based re-solving algorithm that effectively transforms the denominator of the objective into the constraint. Theoretically, we prove that the regret (i.e., the gap between the resolving policy and the optimal objective of the fluid approximation) scales logarithmically with the length of time horizon and resource capacities.
Submitted 7 July, 2024;
originally announced July 2024.
-
Metagenomic analysis reveals shared and distinguishing features in horse and donkey gut microbiome and maternal resemblance of the microbiota in hybrid equids
Authors:
Yihang Zhou
Abstract:
Mammalian gut microbiomes are essential for host functions like digestion, immunity, and nutrient utilization. This study examines the gut microbiome of horses, donkeys, and their hybrids, mules and hinnies, to explore the role of microbiomes in hybrid vigor. We performed whole-genome sequencing on rectal microbiota from 18 equids, generating detailed microbiome assemblies. Our analysis revealed significant differences between horse and donkey microbiomes, with hybrids showing a pronounced maternal resemblance. Notably, Firmicutes were more abundant in the horse-maternal group, while Fibrobacteres were richer in the donkey-maternal group, indicating distinct digestive processes. Functional annotations indicated metabolic differences, such as protein synthesis in horses and energy metabolism in donkeys. Machine learning predictions of probiotic species highlighted potential health benefits for each maternal group. This study provides a high-resolution view of the equid gut microbiome, revealing significant taxonomic and metabolic differences influenced by maternal lineage, and offers insights into microbial contributions to hybrid vigor.
Submitted 6 July, 2024;
originally announced July 2024.
-
Metagenomic analysis revealed significant changes in cattle rectum microbiome and antimicrobial resistome under fescue toxicosis
Authors:
Yihang Zhou
Abstract:
Fescue toxicity causes reduced growth and reproductive issues in cattle grazing endophyte-infected tall fescue. To characterize the gut microbiota and its response to fescue toxicosis, we collected fecal samples before and after a 30-day toxic fescue seed supplementation from eight Angus Simmental pregnant cows and heifers. We sequenced the 16 metagenomes using the whole-genome shotgun approach and generated 157 Gbp of metagenomic sequences. Through de novo assembly and annotation, we obtained a 13.1 Gbp reference contig assembly and identified 22 million microbial genes for cattle rectum microbiota. We discovered a significant reduction of microbial diversity after toxic seed treatment (P<0.01), suggesting dysbiosis of the microbiome. Six bacterial families and 31 species were significantly increased in the fecal microbiota (P-adj<0.05), including members of the most abundant rumen core taxa. This global elevation of rumen microbes in the rectum microbiota suggests a potential impairment of rumen microbiota under fescue toxicosis. Among these, Ruminococcaceae bacterium P7, an important species accounting for ~2% of rumen microbiota, was the most impacted with a 16-fold increase from 0.17% to 2.8% in feces (P<0.01). We hypothesized that rumen Ruminococcaceae bacterium P7 re-adapted to the large intestine environment under toxic fescue stress, causing this dramatic increase in abundance. Functional enrichment analysis revealed that the overrepresented pathways shifted from energy metabolism to antimicrobial resistance and DNA replication. In conclusion, we discovered dramatic microbiota alterations in composition, abundance, and functional capacities under fescue toxicosis, and our results suggest Ruminococcaceae bacterium P7 as a potential biomarker for fescue toxicosis management.
Submitted 6 July, 2024;
originally announced July 2024.
-
Closing the Gaps: Optimality of Sample Average Approximation for Data-Driven Newsvendor Problems
Authors:
Jiameng Lyu,
Shilin Yuan,
Bingkun Zhou,
Yuan Zhou
Abstract:
We study the regret performance of Sample Average Approximation (SAA) for data-driven newsvendor problems with general convex inventory costs. In the literature, the optimality of SAA has not been fully established under both α-global strong convexity and (α,β)-local strong convexity (α-strongly convex within the β-neighborhood of the optimal quantity) conditions. This paper closes the gaps between regret upper and lower bounds for both conditions. Under the (α,β)-local strong convexity condition, we prove the optimal regret bound of Θ(log T/α + 1/(αβ)) for SAA. This upper bound result demonstrates that the regret performance of SAA is only influenced by α and not by β in the long run, enhancing our understanding about how local properties affect the long-term regret performance of decision-making strategies. Under the α-global strong convexity condition, we demonstrate that the worst-case regret of any data-driven method is lower bounded by Ω(log T/α), which is the first lower bound result that matches the existing upper bound with respect to both parameter α and time horizon T. Along the way, we propose to analyze the SAA regret via a new gradient approximation technique, as well as a new class of smooth inverted-hat-shaped hard problem instances that might be of independent interest for the lower bounds of broader data-driven problems.
Submitted 5 July, 2024;
originally announced July 2024.
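For readers unfamiliar with SAA, the following is a minimal sketch of the idea in the classical newsvendor setting with linear underage and overage costs. That setting is an assumption made here for illustration; the abstract above covers general convex inventory costs and does not give an implementation. The function names `saa_order_quantity` and `empirical_cost` and the sample values are hypothetical. Under linear costs, minimizing the average cost over observed demand samples reduces to picking an empirical quantile at the critical ratio.

```python
import math


def empirical_cost(q, demands, underage_cost, overage_cost):
    """Average newsvendor cost of ordering q against the observed demands."""
    total = sum(
        underage_cost * max(d - q, 0) + overage_cost * max(q - d, 0)
        for d in demands
    )
    return total / len(demands)


def saa_order_quantity(demands, underage_cost, overage_cost):
    """Return a minimizer of the empirical cost (linear-cost case only)."""
    if not demands:
        raise ValueError("need at least one demand sample")
    ordered = sorted(demands)
    # Critical ratio b / (b + h) for underage cost b and overage cost h.
    critical_ratio = underage_cost / (underage_cost + overage_cost)
    # Smallest sample whose empirical CDF reaches the critical ratio.
    index = max(math.ceil(len(ordered) * critical_ratio) - 1, 0)
    return ordered[index]


# Hypothetical demand history, used only to exercise the functions.
samples = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
q_hat = saa_order_quantity(samples, underage_cost=3.0, overage_cost=1.0)
print(q_hat, empirical_cost(q_hat, samples, 3.0, 1.0))
```

With an underage cost three times the overage cost, the critical ratio is 0.75, so the sketch orders at the 75th-percentile sample rather than at the mean demand.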
-
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
Authors:
Zhaorun Chen,
Yichao Du,
Zichen Wen,
Yiyang Zhou,
Chenhang Cui,
Zhenzhen Weng,
Haoqin Tu,
Chaoqi Wang,
Zhengwei Tong,
Qinglan Huang,
Canyu Chen,
Qinghao Ye,
Zhihong Zhu,
Yuqing Zhang,
Jiawei Zhou,
Zhuokai Zhao,
Rafael Rafailov,
Chelsea Finn,
Huaxiu Yao
Abstract:
While text-to-image models like DALLE-3 and Stable Diffusion are rapidly proliferating, they often encounter challenges such as hallucination, bias, and the production of unsafe, low-quality output. To effectively address these issues, it is crucial to align these models with desired behaviors based on feedback from a multimodal judge. Despite their significance, current multimodal judges frequently undergo inadequate evaluation of their capabilities and limitations, potentially leading to misalignment and unsafe fine-tuning outcomes. To address this issue, we introduce MJ-Bench, a novel benchmark which incorporates a comprehensive preference dataset to evaluate multimodal judges in providing feedback for image generation models across four key perspectives: alignment, safety, image quality, and bias. Specifically, we evaluate a large variety of multimodal judges including smaller-sized CLIP-based scoring models, open-source VLMs (e.g. LLaVA family), and closed-source VLMs (e.g. GPT-4o, Claude 3) on each decomposed subcategory of our preference dataset. Experiments reveal that closed-source VLMs generally provide better feedback, with GPT-4o outperforming other judges on average. Compared with open-source VLMs, smaller-sized scoring models can provide better feedback regarding text-image alignment and image quality, while VLMs provide more accurate feedback regarding safety and generation bias due to their stronger reasoning capabilities. Further studies in feedback scale reveal that VLM judges can generally provide more accurate and stable feedback in natural language (Likert-scale) than numerical scales. Notably, human evaluations on end-to-end fine-tuned models using separate feedback from these multimodal judges provide similar conclusions, further confirming the effectiveness of MJ-Bench. All data, code, and models are available at https://huggingface.co/MJ-Bench.
Submitted 5 July, 2024;
originally announced July 2024.
-
ASteISR: Adapting Single Image Super-resolution Pre-trained Model for Efficient Stereo Image Super-resolution
Authors:
Yuanbo Zhou,
Yuyang Xue,
Wei Deng,
Xinlin Zhang,
Qinquan Gao,
Tong Tong
Abstract:
Despite advances in the paradigm of pre-training then fine-tuning in low-level vision tasks, significant challenges persist, particularly regarding the increased size of pre-trained models, such as memory usage and training time. Another concern often encountered is the unsatisfying results yielded when directly applying pre-trained single-image models to the multi-image domain. In this paper, we propose an efficient method for transferring a pre-trained single-image super-resolution (SISR) transformer network to the domain of stereo image super-resolution (SteISR) through a parameter-efficient fine-tuning (PEFT) method. Specifically, we introduce the concept of stereo adapters and spatial adapters which are incorporated into the pre-trained SISR transformer network. Subsequently, the pre-trained SISR model is frozen, enabling us to fine-tune the adapters using stereo datasets alone. By adopting this training method, we enhance the ability of the SISR model to accurately infer stereo images by 0.79dB on the Flickr1024 dataset. This method allows us to train only 4.8% of the original model parameters, achieving state-of-the-art performance on four commonly used SteISR benchmarks. Compared to the more complicated full fine-tuning approach, our method reduces training time and memory consumption by 57% and 15%, respectively.
Submitted 3 July, 2024;
originally announced July 2024.
-
GMM-ResNext: Combining Generative and Discriminative Models for Speaker Verification
Authors:
Hui Yan,
Zhenchun Lei,
Changhong Liu,
Yong Zhou
Abstract:
With the development of deep learning, many different network architectures have been explored in speaker verification. However, most network architectures rely on a single deep learning architecture, and hybrid networks combining different architectures have been little studied in ASV tasks. In this paper, we propose the GMM-ResNext model for speaker verification. Conventional GMM does not consider the score distribution of each frame feature over all Gaussian components and ignores the relationship between neighboring speech frames. So, we extract the log Gaussian probability features based on the raw acoustic features and use a ResNext-based network as the backbone to extract the speaker embedding. GMM-ResNext combines generative and discriminative models to improve the generalization ability of deep learning models and allows one to more easily specify meaningful priors on model parameters. A two-path GMM-ResNext model based on two gender-related GMMs has also been proposed. Experimental results show that the proposed GMM-ResNext achieves relative improvements of 48.1% and 11.3% in EER compared with ResNet34 and ECAPA-TDNN on the VoxCeleb1-O test set.
Submitted 3 July, 2024;
originally announced July 2024.
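The log Gaussian probability features mentioned in the GMM-ResNext abstract above can be illustrated with a minimal sketch. It assumes diagonal covariances and ignores mixture weights and any normalization; the function name, the toy frame, and the GMM parameters are hypothetical and not taken from the paper.

```python
import math


def log_gaussian_probs(frame, means, variances):
    """Return one log-density per Gaussian component for a single frame.

    Assumption: each component has a diagonal covariance, given as a
    list of per-dimension variances. Mixture weights are not used.
    """
    feats = []
    for mu, var in zip(means, variances):
        log_p = 0.0
        for x_d, mu_d, var_d in zip(frame, mu, var):
            # Log of a univariate normal density, summed over dimensions.
            log_p += -0.5 * (
                math.log(2.0 * math.pi * var_d) + (x_d - mu_d) ** 2 / var_d
            )
        feats.append(log_p)
    return feats


# Hypothetical 2-dimensional frame scored against a 2-component GMM.
frame = [0.0, 0.0]
means = [[0.0, 0.0], [1.0, 1.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
lgp = log_gaussian_probs(frame, means, variances)
```

Each frame yields a vector with one entry per component, so a sequence of frames becomes a two-dimensional feature map that a convolutional backbone could consume.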
-
Federated Fine-Tuning for Pre-Trained Foundation Models Over Wireless Networks
Authors:
Zixin Wang,
Yong Zhou,
Yuanming Shi,
Khaled. B. Letaief
Abstract:
Pre-trained foundation models (FMs), with an extensive number of neurons, are key to advancing next-generation intelligence services, where personalizing these models requires a massive amount of task-specific data and computational resources. The prevalent solution involves centralized processing at the edge server, which, however, raises privacy concerns due to the transmission of raw data. Instead, federated fine-tuning (FedFT) is an emerging privacy-preserving fine-tuning (FT) paradigm for personalized pre-trained foundation models. In particular, by integrating low-rank adaptation (LoRA) with federated learning (FL), federated LoRA enables the collaborative FT of a global model with edge devices, achieving learning performance comparable to full FT while training fewer parameters over distributed data and preserving raw data privacy. However, the limited radio resources and computation capabilities of edge devices pose significant challenges for deploying federated LoRA over wireless networks. To this end, we propose a split federated LoRA framework, which deploys the computationally intensive encoder of a pre-trained model at the edge server, while keeping the embedding and task modules at the edge devices. Building on this split framework, the paper provides a rigorous analysis of the upper bound of the convergence gap for the wireless federated LoRA system. This analysis motivates the formulation of a long-term upper bound minimization problem, where we decompose the formulated long-term mixed-integer programming (MIP) problem into sequential sub-problems using the Lyapunov technique. We then develop an online algorithm for effective device scheduling and bandwidth allocation. Simulation results demonstrate the effectiveness of the proposed online algorithm in enhancing learning performance.
Submitted 3 July, 2024;
originally announced July 2024.
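To make the "training fewer parameters" claim for LoRA in the abstract above concrete, the following sketch counts trainable parameters for one weight matrix. LoRA adds a rank-r update formed by two factors to a frozen weight; the layer size and rank below are hypothetical and are not values from the paper.

```python
def lora_trainable_params(d_out, d_in, rank):
    """Parameters in the low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_params(d_out, d_in):
    """Parameters updated when the whole d_out x d_in matrix is fine-tuned."""
    return d_out * d_in


# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 768, 768, 8
lora = lora_trainable_params(d_out, d_in, rank)
full = full_params(d_out, d_in)
fraction = lora / full  # share of the matrix's parameters that LoRA trains
```

In a federated setting, a smaller trainable set also means less data to exchange per round, which is presumably why the approach is attractive over bandwidth-limited wireless links.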
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
Leptogenesis in Realistic Flipped SU(5)
Authors:
Stephen F. King,
George K. Leontaris,
Luca Marsili,
Ye-Ling Zhou
Abstract:
We study thermal leptogenesis in realistic supersymmetric flipped $SU(5)\times U(1)$ unification. As up-type quarks and neutrinos are arranged in the same multiplets, they exhibit strong correlations, and it is commonly believed that the masses of right-handed (RH) neutrinos are too hierarchical to fit the low-energy neutrino data. This pattern generally predicts a lightest RH neutrino too light to yield successful leptogenesis, with any lepton-antilepton asymmetry generated from heavier neutrinos being washed out unless special flavour structures are assumed. We propose a different scenario in which the lightest two RH neutrinos $N_1$ and $N_2$ have nearby masses of order $10^9$ GeV, with thermal leptogenesis arising non-resonantly from both $N_1$ and $N_2$. We show that this pattern is consistent with all data on fermion masses and mixing and predicts the lightest physical left-handed neutrino mass to be smaller than about $10^{-7}$ eV. The Dirac phase, which does not take the maximal CP-violating value, plays an important role in leptogenesis.
Submitted 2 July, 2024;
originally announced July 2024.
-
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models
Authors:
Daking Rai,
Yilun Zhou,
Shi Feng,
Abulhair Saparov,
Ziyu Yao
Abstract:
Mechanistic interpretability (MI) is an emerging sub-field of interpretability that seeks to understand a neural network model by reverse-engineering its internal computations. Recently, MI has garnered significant attention for interpreting transformer-based language models (LMs), resulting in many novel insights yet introducing new challenges. However, there has not been work that comprehensively reviews these insights and challenges, particularly as a guide for newcomers to this field. To fill this gap, we present a comprehensive survey outlining fundamental objects of study in MI, techniques that have been used for its investigation, approaches for evaluating MI results, and significant findings and applications stemming from the use of MI to understand LMs. In particular, we present a roadmap for beginners to navigate the field and leverage MI for their benefit. Finally, we also identify current gaps in the field and discuss potential future directions.
Submitted 2 July, 2024;
originally announced July 2024.
-
ECAT: A Entire space Continual and Adaptive Transfer Learning Framework for Cross-Domain Recommendation
Authors:
Chaoqun Hou,
Yuanhang Zhou,
Yi Cao,
Tong Liu
Abstract:
In industrial recommendation systems, there are several mini-apps designed to meet the diverse interests and needs of users. Their sample space is merely a small subset of the entire space, making it challenging to train an efficient model. In recent years, there have been many excellent studies on cross-domain recommendation aimed at mitigating the problem of data sparsity. However, few of them have simultaneously considered how well both sample transfer and continual representation transfer adapt to the target task. To overcome this issue, we propose an Entire space Continual and Adaptive Transfer learning framework called ECAT, which includes two core components. First, for sample transfer, we propose a two-stage method that realizes a coarse-to-fine process. Specifically, we perform an initial selection through a graph-guided method, followed by a fine-grained selection using a domain adaptation method. Second, we propose an adaptive knowledge distillation method for continually transferring the representations from a model that is well-trained on the entire space dataset. ECAT enables full utilization of the entire space samples and representations under the supervision of the target task, while avoiding negative transfer. Comprehensive experiments on real-world industrial datasets from Taobao show that ECAT advances state-of-the-art performance on offline metrics, and brings +13.6% CVR and +8.6% orders for Baiyibutie, a famous mini-app of Taobao.
Submitted 2 July, 2024;
originally announced July 2024.
-
INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness
Authors:
Hung Le,
Yingbo Zhou,
Caiming Xiong,
Silvio Savarese,
Doyen Sahoo
Abstract:
Large language models (LLMs) for code are typically trained to align with natural language instructions to closely follow their intentions and requirements. However, in many practical scenarios, it becomes increasingly challenging for these models to navigate the intricate boundary between helpfulness and safety, especially against highly complex yet potentially malicious instructions. In this work, we introduce INDICT: a new framework that empowers LLMs with Internal Dialogues of Critiques for both safety and helpfulness guidance. The internal dialogue is a dual cooperative system between a safety-driven critic and a helpfulness-driven critic. Each critic provides analysis of the given task and the corresponding generated response, equipped with external knowledge queried through relevant code snippets and tools such as web search and a code interpreter. We engage the dual critic system in both the code generation stage and the code execution stage, providing preemptive and post-hoc guidance, respectively, to LLMs. We evaluated INDICT on 8 diverse tasks across 8 programming languages from 5 benchmarks, using LLMs from 7B to 70B parameters. We observed that our approach can provide an advanced level of critiques for both safety and helpfulness analysis, significantly improving the quality of the output code ($+10\%$ absolute improvements in all models).
Submitted 23 June, 2024;
originally announced July 2024.
-
BeNeRF: Neural Radiance Fields from a Single Blurry Image and Event Stream
Authors:
Wenpu Li,
Pian Wan,
Peng Wang,
Jinghang Li,
Yi Zhou,
Peidong Liu
Abstract:
Neural implicit representation of visual scenes has attracted a lot of attention in recent computer vision and graphics research. Most prior methods focus on how to reconstruct a 3D scene representation from a set of images. In this work, we demonstrate the possibility of recovering a neural radiance field (NeRF) from a single blurry image and its corresponding event stream. We model the camera motion with a cubic B-Spline in SE(3) space. Both the blurry image and the brightness change within a time interval can then be synthesized from the 3D scene representation given the 6-DoF poses interpolated from the cubic B-Spline. Our method jointly learns the implicit neural scene representation and recovers the camera motion by minimizing the differences between the synthesized data and the real measurements, without pre-computed camera poses from COLMAP. We evaluate the proposed method on both synthetic and real datasets. The experimental results demonstrate that we are able to render view-consistent latent sharp images from the learned NeRF and bring a blurry image alive in high quality. Code and data are available at https://github.com/WU-CVGL/BeNeRF.
Submitted 3 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
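The cubic B-Spline pose interpolation described in the BeNeRF abstract above can be sketched for the translation part only. The paper interpolates full 6-DoF poses in SE(3), which also requires handling rotations on the Lie group; this simplified example blends 3D positions with the standard uniform cubic B-spline basis, and the control points are hypothetical.

```python
def cubic_bspline_weights(u):
    """Uniform cubic B-spline basis weights for a local parameter u in [0, 1]."""
    return (
        (1 - u) ** 3 / 6,
        (3 * u**3 - 6 * u**2 + 4) / 6,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6,
        u**3 / 6,
    )


def interpolate_position(control_points, u):
    """Blend four consecutive 3D control points into one position."""
    weights = cubic_bspline_weights(u)
    return tuple(
        sum(w * p[axis] for w, p in zip(weights, control_points))
        for axis in range(3)
    )


# Hypothetical camera positions spaced evenly along the x axis.
control_points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (2.0, 0.0, 0.0),
    (3.0, 0.0, 0.0),
]
position = interpolate_position(control_points, 0.5)
```

Because the four weights always sum to one and vary smoothly with `u`, the interpolated trajectory is continuous, which suits modeling camera motion during a single exposure.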
-
GMM-ResNet2: Ensemble of Group ResNet Networks for Synthetic Speech Detection
Authors:
Zhenchun Lei,
Hui Yan,
Changhong Liu,
Yong Zhou,
Minglei Ma
Abstract:
Deep learning models are widely used for speaker recognition and spoofing speech detection. We propose the GMM-ResNet2 for synthetic speech detection. Compared with the previous GMM-ResNet model, GMM-ResNet2 has four improvements. Firstly, GMMs of different orders have different capabilities to form smooth approximations to the feature distribution, so multiple GMMs are used to extract multi-scale Log Gaussian Probability features. Secondly, the grouping technique is used to improve the classification accuracy by exposing the group cardinality while reducing both the number of parameters and the training time. The final score is obtained by averaging the outputs of all group classifiers. Thirdly, the residual block is improved by including one activation function and one batch normalization layer. Finally, an ensemble-aware loss function is proposed to integrate the independent loss functions of all ensemble members. On the ASVspoof 2019 LA task, the GMM-ResNet2 achieves a minimum t-DCF of 0.0227 and an EER of 0.79\%. On the ASVspoof 2021 LA task, the GMM-ResNet2 achieves a minimum t-DCF of 0.2362 and an EER of 2.19\%, representing relative reductions of 31.4\% and 76.3\% compared with the LFCC-LCNN baseline.
Submitted 2 July, 2024;
originally announced July 2024.
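Two pieces of arithmetic in the GMM-ResNet2 abstract above, score averaging across group classifiers and relative reduction against a baseline, can be written out as a short sketch. The function names and all numbers below are hypothetical and are not results from the paper.

```python
def ensemble_score(group_scores):
    """Average the outputs of all group classifiers into one final score."""
    return sum(group_scores) / len(group_scores)


def relative_reduction(baseline, value):
    """Fraction by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline


# Hypothetical per-group scores for one utterance.
group_scores = [0.2, 0.4, 0.9]
final_score = ensemble_score(group_scores)

# Hypothetical error rates in percent: baseline 10.0, new system 2.5.
reduction = relative_reduction(10.0, 2.5)
```

A relative reduction is unitless, so the same helper applies to both EER and t-DCF.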
-
Large Language Models Are Involuntary Truth-Tellers: Exploiting Fallacy Failure for Jailbreak Attacks
Authors:
Yue Zhou,
Henry Peng Zou,
Barbara Di Eugenio,
Yang Zhang
Abstract:
We find that language models have difficulties generating fallacious and deceptive reasoning. When asked to generate deceptive outputs, language models tend to leak honest counterparts but believe them to be false. Exploiting this deficiency, we propose a jailbreak attack method that elicits malicious output from an aligned language model. Specifically, we query the model to generate a fallacious yet deceptively real procedure for the harmful behavior. Since a fallacious procedure is generally considered fake and thus harmless by LLMs, it helps bypass the safeguard mechanism. Yet the output is factually harmful, since the LLM cannot fabricate fallacious solutions and proposes truthful ones instead. We evaluate our approach on five safety-aligned large language models, comparing against four previous jailbreak methods, and show that our approach achieves competitive performance with more harmful outputs. We believe the findings could be extended beyond model safety, such as to self-verification and hallucination.
Submitted 30 June, 2024;
originally announced July 2024.
-
Unusual Pore Volume Dependence of Water Sorption in Monolithic Metal-Organic Framework
Authors:
Jiawang Li,
Guang Wang,
Hongzhao Fan,
Zhigang Li,
Chi Yan Tso,
Yanguang Zhou
Abstract:
Monolithic metal-organic frameworks (MOFs), which have a continuous structure composed of small primary MOF particles and amorphous networks, are demonstrated to possess larger pore volume and thus larger gas uptake capacity compared to their powder forms. Here, we systematically investigated the water vapor adsorption kinetics in a prototypical MOF, i.e., MOF-801. Our results show that the total pore volume (average pore diameter) of the monolithic MOF-801 is 0.831 cm$^3$/g (5.20 nm), which is much larger than that of powder MOF-801, i.e., 0.488 cm$^3$/g (1.95 nm). Unexpectedly, we find that the water uptake capacity of monolithic MOF-801 is much lower than that of powder MOF-801 when the relative humidity (RH) ranges from 10% to 90%. Our molecular dynamics simulations further demonstrate that the unexpected water uptake capacity of monolithic MOF-801 at RH of 10% to 90% is caused by the water film formed by capillary condensation in the mesopores of monolithic MOF-801. The water molecules can overcome the capillary force when the RH is higher than 90%, which leads to an increase in the water uptake capacity of monolithic MOF-801. Our findings reveal the underlying mechanisms for water adsorption kinetics in both powder and monolithic MOFs, which could motivate and benefit the design of new passive cooling or water harvesting systems based on MOFs.
Submitted 29 June, 2024;
originally announced July 2024.
-
Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
S. Ahmed,
M. Albrecht,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
X. H. Bai,
Y. Bai,
O. Bakina,
R. Baldini Ferroli,
I. Balossino,
Y. Ban,
K. Begzsuren,
N. Berger,
M. Bertani,
D. Bettoni,
F. Bianchi,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere, et al. (495 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.
Submitted 2 July, 2024; v1 submitted 28 June, 2024;
originally announced July 2024.
-
Instance Temperature Knowledge Distillation
Authors:
Zhengbo Zhang,
Yuxi Zhou,
Jia Gong,
Jun Liu,
Zhigang Tu
Abstract:
Knowledge distillation (KD) enhances the performance of a student network by allowing it to learn the knowledge transferred from a teacher network incrementally. Existing methods dynamically adjust the temperature to enable the student network to adapt to the varying learning difficulties at different learning stages of KD. KD is a continuous process, but when adjusting the temperature, these methods consider only the immediate benefits of the operation in the current learning phase and fail to take into account its future returns. To address this issue, we formulate the adjustment of temperature as a sequential decision-making task and propose a method based on reinforcement learning, termed RLKD. Importantly, we design a novel state representation to enable the agent to take more informed actions (i.e., instance temperature adjustments). To handle the problem of delayed rewards in our method due to the KD setting, we explore an instance reward calibration approach. In addition, we devise an efficient exploration strategy that enables the agent to learn a valuable instance temperature adjustment policy more efficiently. Our framework can serve as a plug-and-play technique that is easily inserted into various KD methods, and we validate its effectiveness on both image classification and object detection tasks. Our project is at https://www.zayx.me/ITKD.github.io/.
Submitted 7 July, 2024; v1 submitted 27 June, 2024;
originally announced July 2024.
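The role of the temperature in the knowledge distillation abstract above can be shown with a small sketch of a temperature-scaled softmax. In the paper the per-instance temperature is chosen by a reinforcement learning agent; here it is simply a function argument, and the logits are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by a positive temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical teacher logits for one training instance.
teacher_logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(teacher_logits, 1.0)
soft = softmax_with_temperature(teacher_logits, 4.0)
```

A higher temperature flattens the distribution, exposing more of the teacher's relative preferences among non-target classes, so choosing it per instance changes how hard each example is for the student.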
-
Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring
Authors:
Jiazheng Li,
Hainiu Xu,
Zhaoyue Sun,
Yuxiang Zhou,
David West,
Cesare Aloisi,
Yulan He
Abstract:
Generating rationales that justify scoring decisions has been a promising way to facilitate explainability in automated scoring systems. However, existing methods do not match the accuracy of classifier-based methods. Moreover, the generated rationales often contain hallucinated information. To address these issues, we propose a novel framework capable of generating more faithful rationales and, more importantly, matching performance with classifier-based black-box scoring systems. We first mimic the human assessment process by querying Large Language Models (LLMs) to generate a thought tree. We then summarise intermediate assessment decisions from each thought tree path to create synthetic rationale data and rationale preference data. Finally, we utilise the generated synthetic data to calibrate LLMs through a two-step training process: supervised fine-tuning and preference optimization. Extensive experimental results demonstrate that our framework achieves a 38% assessment performance improvement in the QWK score compared to prior work while producing higher-quality rationales, as recognised by human evaluators and LLMs. Our work sheds light on the effectiveness of performing preference optimization using synthetic preference data obtained from thought tree paths.
Submitted 28 June, 2024;
originally announced June 2024.
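The QWK (quadratic weighted kappa) score reported in the abstract above measures agreement between two sets of ordinal scores. The following is a minimal sketch of the standard definition, assuming integer labels from 0 to `num_labels - 1`; the function name and inputs are illustrative and the paper's evaluation code may differ.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_labels):
    """Quadratic weighted kappa for two equal-length lists of integer labels.

    Assumes num_labels >= 2 and that the labels are not all identical,
    otherwise the expected disagreement is zero and the ratio is undefined.
    """
    n = len(rater_a)
    observed = [[0.0] * num_labels for _ in range(num_labels)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0
    hist_a = [sum(row) for row in observed]
    hist_b = [
        sum(observed[i][j] for i in range(num_labels))
        for j in range(num_labels)
    ]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_labels):
        for j in range(num_labels):
            # Disagreements are penalized by squared label distance.
            weight = (i - j) ** 2 / (num_labels - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical human and model scores on a 0-2 scale.
perfect = quadratic_weighted_kappa([0, 1, 2, 1], [0, 1, 2, 1], 3)
opposite = quadratic_weighted_kappa([0, 2], [2, 0], 3)
```

Because the penalty grows with the squared distance between labels, a prediction that is off by two score points costs four times as much as one that is off by one.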
-
Virtual Context: Enhancing Jailbreak Attacks with Special Token Injection
Authors:
Yuqi Zhou,
Lin Lu,
Hanchi Sun,
Pan Zhou,
Lichao Sun
Abstract:
Jailbreak attacks on large language models (LLMs) involve inducing these models to generate harmful content that violates ethics or laws, posing a significant threat to LLM security. Current jailbreak attacks face two main challenges: low success rates due to defensive measures and high resource requirements for crafting specific prompts. This paper introduces Virtual Context, which leverages special tokens, previously overlooked in LLM security, to improve jailbreak attacks. Virtual Context addresses these challenges by significantly increasing the success rates of existing jailbreak methods and requiring minimal background knowledge about the target model, thus enhancing effectiveness in black-box settings without additional overhead. Comprehensive evaluations show that Virtual Context-assisted jailbreak attacks can improve the success rates of four widely used jailbreak methods by approximately 40% across various LLMs. Additionally, applying Virtual Context to original malicious behaviors still achieves a notable jailbreak effect. In summary, our research highlights the potential of special tokens in jailbreak attacks and recommends including this threat in red-teaming testing to comprehensively enhance LLM security.
Submitted 11 July, 2024; v1 submitted 28 June, 2024;
originally announced June 2024.
-
Self-Supervised Spatial-Temporal Normality Learning for Time Series Anomaly Detection
Authors:
Yutong Chen,
Hongzuo Xu,
Guansong Pang,
Hezhe Qiao,
Yuan Zhou,
Mingsheng Shang
Abstract:
Time Series Anomaly Detection (TSAD) finds widespread applications across various domains such as financial markets, industrial production, and healthcare. Its primary objective is to learn the normal patterns of time series data, thereby identifying deviations in test samples. Most existing TSAD methods focus on modeling data from the temporal dimension, while ignoring the semantic information in the spatial dimension. To address this issue, we introduce a novel approach, called Spatial-Temporal Normality learning (STEN). STEN is composed of a sequence Order prediction-based Temporal Normality learning (OTN) module that captures the temporal correlations within sequences, and a Distance prediction-based Spatial Normality learning (DSN) module that learns the relative spatial relations between sequences in a feature space. By synthesizing these two modules, STEN learns expressive spatial-temporal representations for the normal patterns hidden in the time series data. Extensive experiments on five popular TSAD benchmarks show that STEN substantially outperforms state-of-the-art competing methods. Our code is available at https://github.com/mala-lab/STEN.
Submitted 28 June, 2024;
originally announced June 2024.
-
Combating Missed Recalls in E-commerce Search: A CoT-Prompting Testing Approach
Authors:
Shengnan Wu,
Yongxiang Hu,
Yingchuan Wang,
Jiazhen Gu,
Jin Meng,
Liujie Fan,
Zhongshi Luan,
Xin Wang,
Yangfan Zhou
Abstract:
Search components in e-commerce apps, often complex AI-based systems, are prone to bugs that can lead to missed recalls - situations where items that should be listed in search results aren't. This can frustrate shop owners and harm the app's profitability. However, testing for missed recalls is challenging due to difficulties in generating user-aligned test cases and the absence of oracles. In this paper, we introduce mrDetector, the first automatic testing approach specifically for missed recalls. To tackle the test case generation challenge, we use findings on how users construct queries during searching to create a CoT prompt that generates user-aligned queries with an LLM. In addition, we learn from users who create multiple queries for one shop and compare search results, and provide a test oracle through a metamorphic relation. Extensive experiments using open access data demonstrate that mrDetector outperforms all baselines with the lowest false positive ratio. Experiments with real industrial data show that mrDetector discovers over one hundred missed recalls with only 17 false positives.
Submitted 27 June, 2024;
originally announced June 2024.
-
A Survey on Data Quality Dimensions and Tools for Machine Learning
Authors:
Yuhan Zhou,
Fengjiao Tu,
Kewei Sha,
Junhua Ding,
Haihua Chen
Abstract:
Machine learning (ML) technologies have become integral to practically all aspects of our society, and data quality (DQ) is critical for the performance, fairness, robustness, safety, and scalability of ML models. With the large and complex data in data-centric AI, traditional methods like exploratory data analysis (EDA) and cross-validation (CV) face challenges, highlighting the importance of mastering DQ tools. In this survey, we review 17 DQ evaluation and improvement tools from the last 5 years. By introducing the DQ dimensions, metrics, and main functions embedded in these tools, we compare their strengths and limitations and propose a roadmap for developing open-source DQ tools for ML. Based on the discussions of the challenges and emerging trends, we further highlight the potential applications of large language models (LLMs) and generative AI in DQ evaluation and improvement for ML. We believe this comprehensive survey can enhance understanding of DQ in ML and could drive progress in data-centric AI. A complete list of the literature investigated in this survey is available on GitHub at: https://github.com/haihua0913/awesome-dq4ml.
Submitted 27 June, 2024;
originally announced June 2024.
-
Geodesic Causal Inference
Authors:
Daisuke Kurisu,
Yidong Zhou,
Taisuke Otsu,
Hans-Georg Müller
Abstract:
Adjusting for confounding and imbalance when establishing statistical relationships is an increasingly important task, and causal inference methods have emerged as the most popular tool to achieve this. Causal inference has been developed mainly for scalar outcomes and recently for distributional outcomes. We introduce here a general framework for causal inference when outcomes reside in general geodesic metric spaces, where we draw on a novel geodesic calculus that facilitates scalar multiplication for geodesics and the characterization of treatment effects through the concept of the geodesic average treatment effect. Using ideas from Fréchet regression, we develop estimation methods of the geodesic average treatment effect and derive consistency and rates of convergence for the proposed estimators. We also study uncertainty quantification and inference for the treatment effect. Our methodology is illustrated by a simulation study and real data examples for compositional outcomes of U.S. statewise energy source data to study the effect of coal mining, network data of New York taxi trips, where the effect of the COVID-19 pandemic is of interest, and brain functional connectivity network data to study the effect of Alzheimer's disease.
Submitted 27 June, 2024;
originally announced June 2024.