
doi 10.1145/3637528.3671559

Deep Bag-of-Words Model: An Efficient and Interpretable Relevance Architecture for Chinese E-Commerce

Authors: Zhe Lin, Jiwei Tan, Dan Ou, Xi Chen, Shaowei Yao, Bo Zheng

Abstract: Text relevance or text matching of query and product is an essential technique for the e-commerce search system to ensure that the displayed products can match the intent of the query. Many studies focus on improving the performance of the relevance model in search system. Recently, pre-trained language models like BERT have achieved promising performance on the text relevance task. While these models perform well on the offline test dataset, there are still obstacles to deploy the pre-trained language model to the online system as their high latency. The two-tower model is extensively employed in industrial scenarios, owing to its ability to harmonize performance with computational efficiency. Regrettably, such models present an opaque "black box" nature, which prevents developers from making special optimizations. In this paper, we raise deep Bag-of-Words (DeepBoW) model, an efficient and interpretable relevance architecture for Chinese e-commerce. Our approach proposes to encode the query and the product into the sparse BoW representation, which is a set of word-weight pairs. The weight means the important or the relevant score between the corresponding word and the raw text. The relevance score is measured by the accumulation of the matched word between the sparse BoW representation of the query and the product. Compared to popular dense distributed representation that usually suffers from the drawback of black-box, the most advantage of the proposed representation model is highly explainable and interventionable, which is a superior advantage to the deployment and operation of online search engines. Moreover, the online efficiency of the proposed model is even better than the most efficient inner product form of dense representation ...

Submitted 12 July, 2024; originally announced July 2024.

Comments: KDD'24 accepted paper
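
Illustrative note on the DeepBoW abstract above: a minimal Python sketch of scoring by accumulating matched word weights. The abstract does not state how the weights of a matched word are combined, so summing the product of the query weight and the product weight is an assumption here. The names `matched_contributions` and `bow_relevance` and the example weights are hypothetical and are not taken from the paper.

```python
from typing import Dict

# Hypothetical alias: a sparse BoW representation maps each word to its weight.
SparseBoW = Dict[str, float]


def matched_contributions(query_bow: SparseBoW, product_bow: SparseBoW) -> Dict[str, float]:
    """Return the contribution of every word present in both representations.

    Assumption: a matched word contributes query_weight * product_weight.
    """
    return {
        word: weight * product_bow[word]
        for word, weight in query_bow.items()
        if word in product_bow
    }


def bow_relevance(query_bow: SparseBoW, product_bow: SparseBoW) -> float:
    """Accumulate the contributions of all matched words into one score."""
    return sum(matched_contributions(query_bow, product_bow).values())


# Made-up word-weight pairs for a query and a product title.
query = {"red": 0.5, "dress": 1.0}
product = {"dress": 0.5, "summer": 0.25, "red": 1.0}

contributions = matched_contributions(query, product)
score = bow_relevance(query, product)
```

Because each matched word's contribution is visible in `contributions`, a developer could inspect or override individual weights, which is the kind of intervention the abstract describes.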

arXiv:2407.09374 [pdf]

Grain boundaries control lithiation of solid solution substrates in lithium metal batteries

Authors: Leonardo Shoji Aota, Chanwon Jung, Siyuan Zhang, Ömer K. Büyükuslu, Poonam Yadav, Mahander Pratap Singh, Xinren Chen, Eric Woods, Christina Scheu, Se-Ho Kim, Dierk Raabe, Baptiste Gault

Abstract: The development of sustainable transportation and communication systems requires an increase in both energy density and capacity retention of Li-batteries. Using substrates forming a solid solution with body centered cubic Li enhances the cycle stability of anode-less batteries. However, it remains unclear how the substrate microstructure affects the lithiation behavior. Here, we deploy a correlative, near-atomic scale probing approach through combined ion- and electron-microscopy to examine the distribution of Li in Li-Ag diffusion couples as model system. We reveal that Li regions with over 93.8% at.% nucleate within Ag at random high angle grain boundaries, whereas grain interiors are not lithiated. We evidence the role of kinetics and mechanical constraint from the microstructure over equilibrium thermodynamics in dictating the lithiation process. The findings suggest that grain size and grain boundary character are critical to enhance the electrochemical performance of interlayers/electrodes, particularly for improving lithiation kinetics and hence reducing dendrite formation.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.09250 [pdf]

FedsLLM: Federated Split Learning for Large Language Models over Communication Networks

Authors: Kai Zhao, Zhaohui Yang, Chongwen Huang, Xiaoming Chen, Zhaoyang Zhang

Abstract: Addressing the challenges of deploying large language models in wireless communication networks, this paper combines low-rank adaptation technology (LoRA) with the splitfed learning framework to propose the federated split learning for large language models (FedsLLM) framework. The method introduced in this paper utilizes LoRA technology to reduce processing loads by dividing the network into client subnetworks and server subnetworks. It leverages a federated server to aggregate and update client models. As the training data are transmitted through a wireless network between clients and both main and federated servers, the training delay is determined by the learning accuracy and the allocation of communication bandwidth. This paper models the minimization of the training delay by integrating computation and communication optimization, simplifying the optimization problem into a convex problem to find the optimal solution. Additionally, it presents a lemma that describes the precise solutions to this problem. Simulation results demonstrate that the proposed optimization algorithm reduces delays by an average of 47.63% compared to unoptimized scenarios.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.09194 [pdf, other]

doi 10.1093/mnras/stae1602

The JWST Weather Report from the Nearest Brown Dwarfs I: multi-period JWST NIRSpec + MIRI monitoring of the benchmark binary brown dwarf WISE 1049AB

Authors: Beth A. Biller, Johanna M. Vos, Yifan Zhou, Allison M. McCarthy, Xianyu Tan, Ian J. M. Crossfield, Niall Whiteford, Genaro Suarez, Jacqueline Faherty, Elena Manjavacas, Xueqing Chen, Pengyu Liu, Ben J. Sutlieff, Mary Anne Limbach, Paul Molliere, Trent J. Dupuy, Natalia Oliveros-Gomez, Philip S. Muirhead, Thomas Henning, Gregory Mace, Nicolas Crouzet, Theodora Karalidi, Caroline V. Morley, Pascal Tremblin, Tiffany Kataria

Abstract: We report results from 8 hours of JWST/MIRI LRS spectroscopic monitoring directly followed by 7 hours of JWST/NIRSpec prism spectroscopic monitoring of the benchmark binary brown dwarf WISE 1049AB, the closest, brightest brown dwarfs known. We find water, methane, and CO absorption features in both components, including the 3.3 $μ$m methane absorption feature and a tentative detection of small grain ($<$ 1$μ$m) silicate absorption at $>$8.5 $μ$m in WISE 1049A. Both components vary significantly ($>$1$\%$), with WISE 1049B displaying larger variations than WISE 1049A. Using K-means clustering, we find three main transition points in wavelength for both components of the binary: 1) change in behavior at $\sim$2.3 $μ$m coincident with a CO absorption bandhead, 2) change in behavior at 4.2 $μ$m, close to the CO fundamental band at $λ>$ 4.4 $μ$m, and 3) change in behavior at 8.3-8.5 $μ$m, potentially corresponding to silicate absorption. We interpret the lightcurves observed with both NIRSpec and MIRI as likely stemming from 1) a deep pressure level driving the double-peaked variability seen in WISE 1049B at wavelengths $<$2.3 $μ$m and $>$8.5 $μ$m, 2) an intermediate pressure level shaping the lightcurve morphology between 2.3 and 4.2 $μ$m, and 3) a higher-altitude pressure level producing single-peaked and plateaued lightcurve behavior between 4.2 and 8.5 $μ$m.

Submitted 12 July, 2024; originally announced July 2024.

Comments: 28 pages, 27 figures, accepted to MNRAS

arXiv:2407.08971 [pdf, other]

Full-Stage Pseudo Label Quality Enhancement for Weakly-supervised Temporal Action Localization

Authors: Qianhan Feng, Wenshuo Li, Tong Lin, Xinghao Chen

Abstract: Weakly-supervised Temporal Action Localization (WSTAL) aims to localize actions in untrimmed videos using only video-level supervision. Latest WSTAL methods introduce pseudo label learning framework to bridge the gap between classification-based training and inferencing targets at localization, and achieve cutting-edge results. In these frameworks, a classification-based model is used to generate pseudo labels for a regression-based student model to learn from. However, the quality of pseudo labels in the framework, which is a key factor to the final result, is not carefully studied. In this paper, we propose a set of simple yet efficient pseudo label quality enhancement mechanisms to build our FuSTAL framework. FuSTAL enhances pseudo label quality at three stages: cross-video contrastive learning at proposal Generation-Stage, prior-based filtering at proposal Selection-Stage and EMA-based distillation at Training-Stage. These designs enhance pseudo label quality at different stages in the framework, and help produce more informative, less false and smoother action proposals. With the help of these comprehensive designs at all stages, FuSTAL achieves an average mAP of 50.8% on THUMOS'14, outperforming the previous best method by 1.2%, and becomes the first method to reach the milestone of 50%.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08956 [pdf, other]

DeCE: Deceptive Cross-Entropy Loss Designed for Defending Backdoor Attacks

Authors: Guang Yang, Yu Zhou, Xiang Chen, Xiangyu Zhang, Terry Yue Zhuo, David Lo, Taolue Chen

Abstract: Code Language Models (CLMs), particularly those leveraging deep learning, have achieved significant success in code intelligence domain. However, the issue of security, particularly backdoor attacks, is often overlooked in this process. The previous research has focused on designing backdoor attacks for CLMs, but effective defenses have not been adequately addressed. In particular, existing defense methods from natural language processing, when directly applied to CLMs, are not effective enough and lack generality, working well in some models and scenarios but failing in others, thus fall short in consistently mitigating backdoor attacks. To bridge this gap, we first confirm the phenomenon of "early learning" as a general occurrence during the training of CLMs. This phenomenon refers to that a model initially focuses on the main features of training data but may become more sensitive to backdoor triggers over time, leading to overfitting and susceptibility to backdoor attacks. We then analyze that overfitting to backdoor triggers results from the use of the cross-entropy loss function, where the unboundedness of cross-entropy leads the model to increasingly concentrate on the features of the poisoned data. Based on this insight, we propose a general and effective loss function DeCE (Deceptive Cross-Entropy) by blending deceptive distributions and applying label smoothing to limit the gradient to be bounded, which prevents the model from overfitting to backdoor triggers and then enhances the security of CLMs against backdoor attacks. To verify the effectiveness of our defense method, we select code synthesis tasks as our experimental scenarios. Our experiments across various code synthesis datasets, models, and poisoning ratios demonstrate the applicability and effectiveness of DeCE in enhancing the security of CLMs.

Submitted 11 July, 2024; originally announced July 2024.

Comments: Under Review; Waiting for updates

arXiv:2407.08477 [pdf, ps, other]

Optimal Carbon Emission Control With Allowances Purchasing

Authors: Xinfu Chen, Yuchao Dong, Wenlin Huang, Jin Liang

Abstract: In this paper, we consider a company can simultaneously reduce its emissions and buy carbon allowances at any time. We establish an optimal control model involving two stochastic processes with two control variables, which is a singular control problem. This model can then be converted into a Hamilton-Jacobi-Bellman (HJB) equation, which is a two-dimensional variational equality with gradient barrier, so that the free boundary is a surface. We prove the existence and uniqueness of the solution. Finally, some numerical results are shown.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08251 [pdf]

Determination of five-parameter grain boundary characteristics in nanocrystalline Ni-W by Scanning Precession Electron Diffraction Tomography

Authors: E. F. Rauch, Patrick Harrison, Saurabh Mohan Das, William Goncalves, Alessandra Da Silva, Xinren Chen, Nicola Viganò, Christian H. Liebscher, Wolfgang Ludwig, Xuyang Zhou

Abstract: Determining the full five-parameter grain boundary characteristics from experiments is essential for understanding grain boundaries impact on material properties, improving related models, and designing advanced alloys. However, achieving this is generally challenging, in particular at nanoscale, due to their 3D nature. In our study, we successfully determined the grain boundary characteristics of an annealed nickel-tungsten alloy (NiW) nanocrystalline needle-shaped specimen (tip) containing twins using Scanning Precession Electron Diffraction (SPED) Tomography. The presence of annealing twins in this face-centered cubic (fcc) material gives rise to common reflections in the SPED diffraction patterns, which challenges the reconstruction of orientation-specific virtual dark field (VDF) images required for tomographic reconstruction of the 3D grain shapes. To address this, an automated post-processing step identifies and deselects these shared reflections prior to the reconstruction of the VDF images. Combined with appropriate intensity normalization and projection alignment procedures, this approach enables high-fidelity 3D reconstruction of the individual grains contained in the needle-shaped sample volume. To probe the accuracy of the resulting boundary characteristics, the twin boundary surface normal directions were extracted from the 3D voxelated grain boundary map using a 3D Hough transform. For the sub-set of coherent Sigma 3 boundaries, the expected {111} grain boundary plane normals were obtained with an angular error of less than 3° for boundary sizes down to 400 nm$^2$. This work advances our ability to precisely characterize and understand the complex grain boundaries that govern material properties.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08206 [pdf]

System Report for CCL24-Eval Task 7: Multi-Error Modeling and Fluency-Targeted Pre-training for Chinese Essay Evaluation

Authors: Jingshen Zhang, Xiangyu Yang, Xinkai Su, Xinglu Chen, Tianyou Huang, Xinying Qiu

Abstract: This system report presents our approaches and results for the Chinese Essay Fluency Evaluation (CEFE) task at CCL-2024. For Track 1, we optimized predictions for challenging fine-grained error types using binary classification models and trained coarse-grained models on the Chinese Learner 4W corpus. In Track 2, we enhanced performance by constructing a pseudo-dataset with multiple error types per sentence. For Track 3, where we achieved first place, we generated fluency-rated pseudo-data via back-translation for pre-training and used an NSP-based strategy with Symmetric Cross Entropy loss to capture context and mitigate long dependencies. Our methods effectively address key challenges in Chinese Essay Fluency Evaluation.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08204 [pdf, other]

doi 10.1145/3637528.3671642

Chromosomal Structural Abnormality Diagnosis by Homologous Similarity

Authors: Juren Li, Fanzhe Fu, Ran Wei, Yifei Sun, Zeyu Lai, Ning Song, Xin Chen, Yang Yang

Abstract: Pathogenic chromosome abnormalities are very common among the general population. While numerical chromosome abnormalities can be quickly and precisely detected, structural chromosome abnormalities are far more complex and typically require considerable efforts by human experts for identification. This paper focuses on investigating the modeling of chromosome features and the identification of chromosomes with structural abnormalities. Most existing data-driven methods concentrate on a single chromosome and consider each chromosome independently, overlooking the crucial aspect of homologous chromosomes. In normal cases, homologous chromosomes share identical structures, with the exception that one of them is abnormal. Therefore, we propose an adaptive method to align homologous chromosomes and diagnose structural abnormalities through homologous similarity. Inspired by the process of human expert diagnosis, we incorporate information from multiple pairs of homologous chromosomes simultaneously, aiming to reduce noise disturbance and improve prediction performance. Extensive experiments on real-world datasets validate the effectiveness of our model compared to baselines.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08106 [pdf, other]

SGLC: Semantic Graph-Guided Coarse-Fine-Refine Full Loop Closing for LiDAR SLAM

Authors: Neng Wang, Xieyuanli Chen, Chenghao Shi, Zhiqiang Zheng, Hongshan Yu, Huimin Lu

Abstract: Loop closing is a crucial component in SLAM that helps eliminate accumulated errors through two main steps: loop detection and loop pose correction. The first step determines whether loop closing should be performed, while the second estimates the 6-DoF pose to correct odometry drift. Current methods mostly focus on developing robust descriptors for loop closure detection, often neglecting loop pose estimation. A few methods that do include pose estimation either suffer from low accuracy or incur high computational costs. To tackle this problem, we introduce SGLC, a real-time semantic graph-guided full loop closing method, with robust loop closure detection and 6-DoF pose estimation capabilities. SGLC takes into account the distinct characteristics of foreground and background points. For foreground instances, it builds a semantic graph that not only abstracts point cloud representation for fast descriptor generation and matching but also guides the subsequent loop verification and initial pose estimation. Background points, meanwhile, are exploited to provide more geometric features for scan-wise descriptor construction and stable planar information for further pose refinement. Loop pose estimation employs a coarse-fine-refine registration scheme that considers the alignment of both instance points and background points, offering high efficiency and accuracy. We evaluate the loop closing performance of SGLC through extensive experiments on the KITTI and KITTI-360 datasets, demonstrating its superiority over existing state-of-the-art methods. Additionally, we integrate SGLC into a SLAM system, eliminating accumulated errors and improving overall SLAM performance. The implementation of SGLC will be released at https://github.com/nubot-nudt/SGLC.

Submitted 10 July, 2024; originally announced July 2024.

Comments: 8 pages, 4 figures

arXiv:2407.08093 [pdf, other]

MemWarp: Discontinuity-Preserving Cardiac Registration with Memorized Anatomical Filters

Authors: Hang Zhang, Xiang Chen, Renjiu Hu, Dongdong Liu, Gaolei Li, Rongguang Wang

Abstract: Many existing learning-based deformable image registration methods impose constraints on deformation fields to ensure they are globally smooth and continuous. However, this assumption does not hold in cardiac image registration, where different anatomical regions exhibit asymmetric motions during respiration and movements due to sliding organs within the chest. Consequently, such global constraints fail to accommodate local discontinuities across organ boundaries, potentially resulting in erroneous and unrealistic displacement fields. In this paper, we address this issue with MemWarp, a learning framework that leverages a memory network to store prototypical information tailored to different anatomical regions. MemWarp is different from earlier approaches in two main aspects: firstly, by decoupling feature extraction from similarity matching in moving and fixed images, it facilitates more effective utilization of feature maps; secondly, despite its capability to preserve discontinuities, it eliminates the need for segmentation masks during model inference. In experiments on a publicly available cardiac dataset, our method achieves considerable improvements in registration accuracy and producing realistic deformations, outperforming state-of-the-art methods with a remarkable 7.1% Dice score improvement over the runner-up semi-supervised method. Source code will be available at https://github.com/tinymilky/Mem-Warp.

Submitted 10 July, 2024; originally announced July 2024.

Comments: 11 pages, 2 figures, 2 tables

arXiv:2407.08081 [pdf, other]

RoCap: A Robotic Data Collection Pipeline for the Pose Estimation of Appearance-Changing Objects

Authors: Jiahao Nick Li, Toby Chong, Zhongyi Zhou, Hironori Yoshida, Koji Yatani, Xiang 'Anthony' Chen, Takeo Igarashi

Abstract: Object pose estimation plays a vital role in mixed-reality interactions when users manipulate tangible objects as controllers. Traditional vision-based object pose estimation methods leverage 3D reconstruction to synthesize training data. However, these methods are designed for static objects with diffuse colors and do not work well for objects that change their appearance during manipulation, such as deformable objects like plush toys, transparent objects like chemical flasks, reflective objects like metal pitchers, and articulated objects like scissors. To address this limitation, we propose Rocap, a robotic pipeline that emulates human manipulation of target objects while generating data labeled with ground truth pose information. The user first gives the target object to a robotic arm, and the system captures many pictures of the object in various 6D configurations. The system trains a model by using captured images and their ground truth pose information automatically calculated from the joint angles of the robotic arm. We showcase pose estimation for appearance-changing objects by training simple deep-learning models using the collected data and comparing the results with a model trained with synthetic data based on 3D reconstruction via quantitative and qualitative evaluation. The findings underscore the promising capabilities of Rocap.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07726 [pdf, other]

PaliGemma: A versatile 3B VLM for transfer

Authors: Lucas Beyer, Andreas Steiner, André Susano Pinto, Alexander Kolesnikov, Xiao Wang, Daniel Salz, Maxim Neumann, Ibrahim Alabdulmohsin, Michael Tschannen, Emanuele Bugliarello, Thomas Unterthiner, Daniel Keysers, Skanda Koppula, Fangyu Liu, Adam Grycner, Alexey Gritsenko, Neil Houlsby, Manoj Kumar, Keran Rong, Julian Eisenschlos, Rishabh Kabra, Matthias Bauer, Matko Bošnjak, Xi Chen, Matthias Minderer, et al. (10 additional authors not shown)

Abstract: PaliGemma is an open Vision-Language Model (VLM) that is based on the SigLIP-So400m vision encoder and the Gemma-2B language model. It is trained to be a versatile and broadly knowledgeable base model that is effective to transfer. It achieves strong performance on a wide variety of open-world tasks. We evaluate PaliGemma on almost 40 diverse tasks including standard VLM benchmarks, but also more specialized tasks such as remote-sensing and segmentation.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07651 [pdf, other]

Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann, et al. (645 additional authors not shown)

Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07397 [pdf, other]

SimuSOE: A Simulated Snoring Dataset for Obstructive Sleep Apnea-Hypopnea Syndrome Evaluation during Wakefulness

Authors: Jie Lin, Xiuping Yang, Li Xiao, Xinhong Li, Weiyan Yi, Yuhong Yang, Weiping Tu, Xiong Chen

Abstract: Obstructive Sleep Apnea-Hypopnea Syndrome (OSAHS) is a prevalent chronic breathing disorder caused by upper airway obstruction. Previous studies advanced OSAHS evaluation through machine learning-based systems trained on sleep snoring or speech signal datasets. However, constructing datasets for training a precise and rapid OSAHS evaluation system poses a challenge, since 1) it is time-consuming to collect sleep snores and 2) the speech signal is limited in reflecting upper airway obstruction. In this paper, we propose a new snoring dataset for OSAHS evaluation, named SimuSOE, in which a novel and time-effective snoring collection method is introduced for tackling the above problems. In particular, we adopt simulated snoring which is a type of snore intentionally emitted by patients to replace natural snoring. Experimental results indicate that the simulated snoring signal during wakefulness can serve as an effective feature in OSAHS preliminary screening.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.06714 [pdf, other]

Improving the Transferability of Adversarial Examples by Feature Augmentation

Authors: Donghua Wang, Wen Yao, Tingsong Jiang, Xiaohu Zheng, Junqi Wu, Xiaoqian Chen

Abstract: Despite the success of input transformation-based attacks in boosting adversarial transferability, their performance is unsatisfying because they ignore the discrepancy across models. In this paper, we propose a simple but effective feature augmentation attack (FAUG) method, which improves adversarial transferability without introducing extra computation costs. Specifically, we inject random noise into the intermediate features of the model to enlarge the diversity of the attack gradient, thereby mitigating the risk of overfitting to the specific model and notably amplifying adversarial transferability. Moreover, our method can be combined with existing gradient attacks to augment their performance further. Extensive experiments conducted on the ImageNet dataset across CNN and transformer models corroborate the efficacy of our method, e.g., we achieve improvement of +26.22% and +5.57% on input transformation-based attacks and combination methods, respectively.

Submitted 9 July, 2024; originally announced July 2024.

Comments: 19 pages, 4 figures, 4 tables

arXiv:2407.06688 [pdf, other]

Universal Multi-view Black-box Attack against Object Detectors via Layout Optimization

Authors: Donghua Wang, Wen Yao, Tingsong Jiang, Chao Li, Xiaoqian Chen

Abstract: Object detectors have demonstrated vulnerability to adversarial examples crafted by small perturbations that can deceive the object detector. Existing adversarial attacks mainly focus on white-box attacks and are merely valid at a specific viewpoint, while the universal multi-view black-box attack is less explored, limiting their generalization in practice. In this paper, we propose a novel universal multi-view black-box attack against object detectors, which optimizes a universal adversarial UV texture constructed by multiple image stickers for a 3D object via the designed layout optimization algorithm. Specifically, we treat the placement of image stickers on the UV texture as a circle-based layout optimization problem, whose objective is to find the optimal circle layout filled with image stickers so that it can deceive the object detector under the multi-view scenario. To ensure reasonable placement of image stickers, two constraints are elaborately devised. To optimize the layout, we adopt the random search algorithm enhanced by the devised important-aware selection strategy to find the most appropriate image sticker for each circle from the image sticker pools. Extensive experiments conducted on four common object detectors suggested that the detection performance decreases by a large magnitude of 74.29% on average in multi-view scenarios. Additionally, a novel evaluation tool based on the photo-realistic simulator is designed to assess the texture-based attack fairly.

Submitted 9 July, 2024; originally announced July 2024.

Comments: 12 pages, 13 figures, 5 tables

arXiv:2407.06612 [pdf]

AI-based Automatic Segmentation of Prostate on Multi-modality Images: A Review

Authors: Rui Jin, Derun Li, Dehui Xiang, Lei Zhang, Hailing Zhou, Fei Shi, Weifang Zhu, Jing Cai, Tao Peng, Xinjian Chen

Abstract: Prostate cancer represents a major threat to health. Early detection is vital in reducing the mortality rate among prostate cancer patients. One approach involves using multi-modality (CT, MRI, US, etc.) computer-aided diagnosis (CAD) systems for the prostate region. However, prostate segmentation is challenging due to imperfections in the images and the prostate's complex tissue structure. The advent of precision medicine and a significant increase in clinical capacity have spurred the need for various data-driven tasks in the field of medical imaging. Recently, numerous machine learning and data mining tools have been integrated into various medical areas, including image segmentation. This article proposes a new classification method that differentiates supervision types, either in number or kind, during the training phase. Subsequently, we conducted a survey on artificial intelligence (AI)-based automatic prostate segmentation methods, examining the advantages and limitations of each. Additionally, we introduce variants of evaluation metrics for the verification and performance assessment of the segmentation method and summarize the current challenges. Finally, future research directions and development trends are discussed, reflecting the outcomes of our literature survey, suggesting high-precision detection and treatment of prostate cancer as a promising avenue.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.06573 [pdf, other]

LLM for Mobile: An Initial Roadmap

Authors: Daihang Chen, Yonghui Liu, Mingyi Zhou, Yanjie Zhao, Haoyu Wang, Shuai Wang, Xiao Chen, Tegawendé F. Bissyandé, Jacques Klein, Li Li

Abstract: When mobile meets LLMs, mobile app users deserve to have more intelligent usage experiences. For this to happen, we argue that there is a strong need to apply LLMs for the mobile ecosystem. We therefore provide a research roadmap for guiding our fellow researchers to achieve that as a whole. In this roadmap, we sum up six directions that we believe are urgently required for research to enable native intelligence in mobile devices. In each direction, we further summarize the current research progress and the gaps that still need to be filled by our fellow researchers.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.06520 [pdf, other]

Unveiling the Secrets of Vortex Neutron Decay

Authors: Wei Kou, Bing'ang Guo, Xurong Chen

Abstract: Investigation of decay and scattering processes of particles in a vortex state offers a novel and promising approach for probing particle structure. Our study reveals distinct properties of vortex neutron decay, which deviate from those of classical plane-wave neutron decay. We present the energy-angle distributions of the final-state electron and antineutrino in unpolarized vortex neutrons, as well as angle distributions integrated over their energies. Notably, we provide theoretical calculations of the decay behavior of neutrons with varying vortex cone angles and initial energies. We propose that identifying the vortex state of the initial neutron can be achieved by analyzing the angular and energy distributions of the final-state particles, introducing new degrees of freedom for studying weak interactions and neutron decay kinematics that have been previously overlooked in particle physics. This timely investigation takes advantage of recent advancements in vortex neutron preparation and analysis, opening up new avenues for exploring the fundamental properties of matter.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 10 pages, 7 figures

arXiv:2407.06503 [pdf, other]

Preference-Guided Reinforcement Learning for Efficient Exploration

Authors: Guojian Wang, Faguo Wu, Xiao Zhang, Tianyuan Chen, Xuyang Chen, Lin Zhao

Abstract: In this paper, we investigate preference-based reinforcement learning (PbRL) that allows reinforcement learning (RL) agents to learn from human feedback. This is particularly valuable when defining a fine-grained reward function is not feasible. However, this approach is inefficient and impractical for promoting deep exploration in hard-exploration tasks with long horizons and sparse rewards. To tackle this issue, we introduce LOPE: Learning Online with trajectory Preference guidancE, an end-to-end preference-guided RL framework that enhances exploration efficiency in hard-exploration tasks. Our intuition is that LOPE directly adjusts the focus of online exploration by considering human feedback as guidance, avoiding learning a separate reward model from preferences. Specifically, LOPE includes a two-step sequential policy optimization process consisting of trust-region-based policy improvement and preference guidance steps. We reformulate preference guidance as a novel trajectory-wise state marginal matching problem that minimizes the maximum mean discrepancy distance between the preferred trajectories and the learned policy. Furthermore, we provide a theoretical analysis to characterize the performance improvement bound and evaluate LOPE's effectiveness. When assessed in various challenging hard-exploration environments, LOPE outperforms several state-of-the-art methods regarding convergence rate and overall performance. The code used in this study is available at https://github.com/buaawgj/LOPE.
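The maximum mean discrepancy named in this abstract can be illustrated with a short sketch. This is a generic biased estimator of the squared MMD with a Gaussian kernel, not the LOPE implementation; the function names, the `bandwidth` parameter, and the toy sample arrays are assumptions made for illustration only.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    # Pairwise squared Euclidean distances via the expansion |a|^2 + |b|^2 - 2 a.b
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between sample sets x and y."""
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy

# Hypothetical "preferred" states versus states visited by a policy.
preferred = np.zeros((4, 2))
visited = np.ones((4, 2))
gap = mmd_squared(preferred, visited)
```

Identical sample sets give a value of zero, and the value grows as the two sets drift apart, which is why a quantity of this kind can serve as a matching objective.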

Submitted 8 July, 2024; originally announced July 2024.

Comments: 13 pages, 17 figures

arXiv:2407.06333 [pdf, ps, other]

A third-order finite difference weighted essentially non-oscillatory scheme with shallow neural network

Authors: Kwanghyuk Park, Xinjuan Chen, Dongjin Lee, Jiaxi Gu, Jae-Hun Jung

Abstract: In this paper, we introduce the finite difference weighted essentially non-oscillatory (WENO) scheme based on the neural network for hyperbolic conservation laws. We employ supervised learning and design two loss functions, one with the mean squared error and the other with the mean squared logarithmic error, where the WENO3-JS weights are computed as the labels. Each loss function consists of two components where the first component compares the difference between the weights from the neural network and WENO3-JS weights, while the second component matches the output weights of the neural network and the linear weights. The former component of the loss function forces the neural network to follow the WENO properties, implying that there is no need for the post-processing layer. Additionally, the latter leads to better performance around discontinuities. As a neural network structure, we choose the shallow neural network (SNN) for computational efficiency with the Delta layer consisting of the normalized undivided differences. These constructed WENO3-SNN schemes show superior results in one-dimensional examples and improved behavior in two-dimensional examples, compared with the simulations from WENO3-JS and WENO3-Z.
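For readers unfamiliar with the WENO3-JS weights used as labels in this abstract, the following is a minimal sketch of the classical third-order Jiang-Shu formulation for a left-biased reconstruction at the cell interface. It is not the paper's code; the function names and the `eps` default are assumptions, and the paper's exact conventions may differ.

```python
def weno3_js_weights(f_left, f_center, f_right, eps=1e-6):
    """Nonlinear WENO3-JS weights (w0, w1) for the interface right of f_center."""
    d0, d1 = 1.0 / 3.0, 2.0 / 3.0          # linear (ideal) weights
    beta0 = (f_center - f_left) ** 2        # smoothness of stencil {i-1, i}
    beta1 = (f_right - f_center) ** 2       # smoothness of stencil {i, i+1}
    alpha0 = d0 / (eps + beta0) ** 2
    alpha1 = d1 / (eps + beta1) ** 2
    total = alpha0 + alpha1
    return alpha0 / total, alpha1 / total

def weno3_js_reconstruct(f_left, f_center, f_right, eps=1e-6):
    """Interface value as the weighted sum of the two second-order candidates."""
    w0, w1 = weno3_js_weights(f_left, f_center, f_right, eps)
    p0 = -0.5 * f_left + 1.5 * f_center     # extrapolation from {i-1, i}
    p1 = 0.5 * f_center + 0.5 * f_right     # interpolation from {i, i+1}
    return w0 * p0 + w1 * p1

smooth_weights = weno3_js_weights(1.0, 2.0, 3.0)  # equal smoothness on both stencils
shock_weights = weno3_js_weights(0.0, 0.0, 1.0)   # jump inside the right stencil
```

When both stencils are equally smooth, the nonlinear weights reduce to the linear weights 1/3 and 2/3; next to a jump, nearly all the weight moves to the smooth stencil.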

Submitted 10 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.06227 [pdf, ps, other]

Communication and Control Co-Design in 6G: Sequential Decision-Making with LLMs

Authors: Xianfu Chen, Celimuge Wu, Yi Shen, Yusheng Ji, Tsutomu Yoshinaga, Qiang Ni, Charilaos C. Zarakovitis, Honggang Zhang

Abstract: This article investigates a control system within the context of sixth-generation wireless networks. The control performance optimization confronts the technical challenges that arise from the intricate interactions between communication and control sub-systems, asking for a co-design. Accounting for the system dynamics, we formulate the sequential co-design decision-making of communication and control over the discrete time horizon as a Markov decision process, for which a practical offline learning framework is proposed. Our proposed framework integrates large language models into the elements of reinforcement learning. We present a case study on the age of semantics-aware communication and control co-design to showcase the potential of our proposed learning framework. Furthermore, we discuss the open issues remaining to make our proposed offline learning framework feasible for real-world implementations, and highlight the research directions for future explorations.

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.06220 [pdf, ps, other]

Exact enumeration of RNA secondary structures by helices and loops

Authors: Ricky X. F. Chen, Christian M. Reidys, Michael S. Waterman

Abstract: Enumerative studies of RNA secondary structures were initiated four decades ago by Waterman and his coworkers. Since then, RNA secondary structures have been explored according to many different structural characteristics, for instance, helices, components and loops by Hofacker, Schuster and Stadler, orders by Nebel, saturated structures by Clote, the $5^{\prime}$-$3^{\prime}$ end distance by Clote, Ponty and Steyaert, and the rainbow spectrum by Li and Reidys. However, the majority of the contributions are asymptotic results, and it is harder to derive explicit formulas. In this paper, we obtain exact formulas counting RNA secondary structures with a given number of helices as well as a given joint size distribution of helices and loops, while some related asymptotic results due to Hofacker, Schuster and Stadler have been known for about twenty years. Our approach is combinatorial, analyzing a recent bijection between RNA secondary structures and plane trees discovered by the first author and proposing a variation of Chen's bijective approach of counting trees by forests of simple trees.

Submitted 4 July, 2024; originally announced July 2024.

Comments: may not be submitted for publication, but comments are welcome

MSC Class: 92B05; 05C05; 05A15

arXiv:2407.06192 [pdf, other]

Multi-Object Hallucination in Vision-Language Models

Authors: Xuweiyi Chen, Ziqiao Ma, Xuejun Zhang, Sihan Xu, Shengyi Qian, Jianing Yang, David F. Fouhey, Joyce Chai

Abstract: Large vision language models (LVLMs) often suffer from object hallucination, producing objects not present in the given images. While current benchmarks for object hallucination primarily concentrate on the presence of a single object class rather than individual entities, this work systematically investigates multi-object hallucination, examining how models misperceive (e.g., invent nonexistent objects or become distracted) when tasked with focusing on multiple objects simultaneously. We introduce Recognition-based Object Probing Evaluation (ROPE), an automated evaluation protocol that considers the distribution of object classes within a single image during testing and uses visual referring prompts to eliminate ambiguity. With comprehensive empirical studies and analysis of potential factors leading to multi-object hallucination, we found that (1) LVLMs suffer more hallucinations when focusing on multiple objects compared to a single object. (2) The tested object class distribution affects hallucination behaviors, indicating that LVLMs may follow shortcuts and spurious correlations. (3) Hallucinatory behaviors are influenced by data-specific factors, salience and frequency, and model intrinsic behaviors. We hope to enable LVLMs to recognize and reason about multiple objects that often occur in realistic visual scenes, provide insights, and quantify our progress towards mitigating the issues.

Submitted 8 July, 2024; originally announced July 2024.

Comments: Accepted to ALVR @ ACL 2024 | Project page: https://multi-object-hallucination.github.io/

arXiv:2407.05953 [pdf, ps, other]

Circuit Partitioning and Transmission Cost Optimization in Distributed Quantum Computing

Authors: Xinyu Chen, Zilu Chen, Xueyun Cheng, Zhijin Guan

Abstract: Given the limitations on the number of qubits in current NISQ devices, the implementation of large-scale quantum algorithms on such devices is challenging, prompting research into distributed quantum computing. This paper focuses on the issue of excessive communication complexity in distributed quantum computing oriented towards quantum circuits. To reduce the number of quantum state transmissions, i.e., the transmission cost, in distributed quantum circuits, a circuit partitioning method based on the QUBO model is proposed, coupled with the lookahead method for transmission cost optimization. Initially, the problem of distributed quantum circuit partitioning is transformed into a graph minimum cut problem. The QUBO model, which can be accelerated by quantum algorithms, is introduced to minimize the number of quantum gates between QPUs and the transmission cost. Subsequently, the dynamic lookahead strategy for the selection of transmission qubits is proposed to optimize the transmission cost in distributed quantum circuits. Finally, through numerical simulations, the impact of different circuit partitioning indicators on the transmission cost is explored, and the proposed method is evaluated on benchmark circuits. Experimental results demonstrate that the transmission cost optimized through the method proposed in this paper is significantly reduced compared with current methods for optimizing transmission cost, achieving noticeable improvements across different numbers of partitions.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05923 [pdf, other]

Maximum Entropy Method for Valence Quark Distributions in Exotic Hadrons: A Study of the $Z_c(3900)$ Case

Authors: Chengdong Han, Xiaopeng Wang, Wei Kou, Xurong Chen

Abstract: In this study we demonstrate the application of the Maximum Entropy Method (MEM) to determine the valence quark distribution of exotic hadrons. Our investigation yields three key findings. Firstly, we observe a significant shift towards smaller Bjorken scale $x$ in the peak position of the valence quark distribution for hadrons with an increasing number of valence quarks, consistent with previous results by Kawamura and Kumano. Secondly, assuming that the $Z_c(3900)$ initially consists of four valence quarks, we employ MEM to determine its initial valence quark distribution, estimating a radius of $r_c=1.276$ fm at an extremely low resolution scale $Q^2$. Furthermore, we identify a notable discrepancy between our computed charge form factor $G_c(q)$ at leading order and the outcomes of hadron molecular state calculations. We propose that this form factor can be extracted from the QCD counting rule cross-section, which is grounded in Generalized Distribution Amplitudes (GDA) linked to the multi-quark states.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 7 pages, 7 figures

arXiv:2407.05761 [pdf, other]

Interpretability of Uncertainty: Exploring Cortical Lesion Segmentation in Multiple Sclerosis

Authors: Nataliia Molchanova, Alessandro Cagol, Pedro M. Gordaliza, Mario Ocampo-Pineda, Po-Jui Lu, Matthias Weigel, Xinjie Chen, Adrien Depeursinge, Cristina Granziera, Henning Müller, Meritxell Bach Cuadra

Abstract: Uncertainty quantification (UQ) has become critical for evaluating the reliability of artificial intelligence systems, especially in medical image segmentation. This study addresses the interpretability of instance-wise uncertainty values in deep learning models for focal lesion segmentation in magnetic resonance imaging, specifically cortical lesion (CL) segmentation in multiple sclerosis. CL segmentation presents several challenges, including the complexity of manual segmentation, high variability in annotation, data scarcity, and class imbalance, all of which contribute to aleatoric and epistemic uncertainty. We explore how UQ can be used not only to assess prediction reliability but also to provide insights into model behavior, detect biases, and verify the accuracy of UQ methods. Our research demonstrates the potential of instance-wise uncertainty values to offer post hoc global model explanations, serving as a sanity check for the model. The implementation is available at https://github.com/NataliiaMolch/interpret-lesion-unc.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05611 [pdf, other]

GenFollower: Enhancing Car-Following Prediction with Large Language Models

Authors: Xianda Chen, Mingxing Peng, PakHin Tiu, Yuanfei Wu, Junjie Chen, Meixin Zhu, Xinhu Zheng

Abstract: Accurate modeling of car-following behaviors is essential for various applications in traffic management and autonomous driving systems. However, current approaches often suffer from limitations like high sensitivity to data quality and lack of interpretability. In this study, we propose GenFollower, a novel zero-shot prompting approach that leverages large language models (LLMs) to address these challenges. We reframe car-following behavior as a language modeling problem and integrate heterogeneous inputs into structured prompts for LLMs. This approach achieves improved prediction performance and interpretability compared to traditional baseline models. Experiments on the Waymo Open datasets demonstrate GenFollower's superior performance and ability to provide interpretable insights into factors influencing car-following behavior. This work contributes to advancing the understanding and prediction of car-following behaviors, paving the way for enhanced traffic management and autonomous driving systems.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05598 [pdf, ps, other]

Probing the nature of the anticharmed-strange pentaquark states: mass spectra, decays, and magnetic moments

Authors: Xuejie Liu, Yue Tan, Xiaoyun Chen, Dianyong Chen, Hongxia Huang, Jialun Ping

Abstract: Within the framework of the quark delocalization color screening model, a systematic investigation of the anticharmed-strange pentaquark system is performed using the resonance group method. The current estimations predict three bound states with masses estimated to be 2886 MeV, 3039 MeV, and 3153 MeV, respectively. Additionally, three resonance states are identified in various scattering phase shift processes. Among them, two resonance states $ΣD$ and $Σ^{\ast}D^{\ast}$ with quantum number $\frac{1}{2}(\frac{1}{2}^{-})$ are detected in channels $ND_{s}^{\ast}$ and $ND$, and $ΣD^{\ast}$ and $ΛD$, with masses and decay widths of ($M_{R}=3053\sim3055$ MeV, $T_{total}=13.0\sim13.4$ MeV) and ($M_{R}=3389\sim3390$ MeV, $T_{total}=10.4$ MeV), respectively. In the $ΛD^{\ast}$ and $ΣD^{\ast}$ channels, a resonance state with quantum number $\frac{1}{2}(\frac{3}{2}^{-})$ is discovered, with its mass and decay width being $3250\sim3252$ MeV and 4.4 MeV, respectively. These predicted pentaquark states have $\bar{c}snnn$ quark compositions, allowing them to be recognized as genuine pentaquark states. To validate these predictions, it is expected that upcoming experiments will further explore the predicted resonance and bound states in these possible decay channels.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05564 [pdf, ps, other]

A Re-solving Heuristic for Dynamic Assortment Optimization with Knapsack Constraints

Authors: Xi Chen, Mo Liu, Yining Wang, Yuan Zhou

Abstract: In this paper, we consider a multi-stage dynamic assortment optimization problem with multi-nomial choice modeling (MNL) under resource knapsack constraints. Given the current resource inventory levels, the retailer makes an assortment decision at each period, and the goal of the retailer is to maximize the total profit from purchases. With the exact optimal dynamic assortment solution being computationally intractable, a practical strategy is to adopt the re-solving technique that periodically re-optimizes deterministic linear programs (LP) arising from fluid approximation. However, the fractional structure of MNL makes the fluid approximation in assortment optimization highly non-linear, which brings new technical challenges. To address this challenge, we propose a new epoch-based re-solving algorithm that effectively transforms the denominator of the objective into the constraint. Theoretically, we prove that the regret (i.e., the gap between the resolving policy and the optimal objective of the fluid approximation) scales logarithmically with the length of time horizon and resource capacities.

Submitted 7 July, 2024; originally announced July 2024.

arXiv:2407.05546 [pdf, other]

AID-AppEAL: Automatic Image Dataset and Algorithm for Content Appeal Enhancement and Assessment Labeling

Authors: Sherry X. Chen, Yaron Vaxman, Elad Ben Baruch, David Asulin, Aviad Moreshet, Misha Sra, Pradeep Sen

Abstract: We propose Image Content Appeal Assessment (ICAA), a novel metric that quantifies the level of positive interest an image's content generates for viewers, such as the appeal of food in a photograph. This is fundamentally different from traditional Image-Aesthetics Assessment (IAA), which judges an image's artistic quality. While previous studies often confuse the concepts of "aesthetics" and "appeal," our work addresses this by being the first to study ICAA explicitly. To do this, we propose a novel system that automates dataset creation and implements algorithms to estimate and boost content appeal. We use our pipeline to generate two large-scale datasets (70K+ images each) in diverse domains (food and room interior design) to train our models, which revealed little correlation between content appeal and aesthetics. Our user study, with more than 76% of participants preferring the appeal-enhanced images, confirms that our appeal ratings accurately reflect user preferences, establishing ICAA as a unique evaluative criterion. Our code and datasets are available at https://github.com/SherryXTChen/AID-Appeal.

Submitted 7 July, 2024; originally announced July 2024.

Comments: European Conference on Computer Vision

arXiv:2407.05249 [pdf, ps, other]

RIS-assisted Coverage Enhancement in mmWave Integrated Sensing and Communication Networks

Authors: Xu Gan, Chongwen Huang, Zhaohui Yang, Xiaoming Chen, Faouzi Bader, Zhaoyang Zhang, Chau Yuen, Yong Liang Guan, Merouane Debbah

Abstract: Integrated sensing and communication (ISAC) has emerged as a promising technology to facilitate high-rate communications and super-resolution sensing, particularly operating in the millimeter wave (mmWave) band. However, the vulnerability of mmWave signals to blockages severely impairs ISAC capabilities and coverage. To tackle this, an efficient and low-cost solution is to deploy distributed reconfigurable intelligent surfaces (RISs) to construct virtual links between the base stations (BSs) and users in a controllable fashion. In this paper, we investigate the generalized RIS-assisted mmWave ISAC networks considering the blockage effect, and examine the beneficial impact of RISs on the coverage rate utilizing stochastic geometry. Specifically, taking into account the coupling effect of ISAC dual functions within the same network topology, we derive the conditional coverage probability of ISAC performance for two association cases, based on the proposed beam pattern model and user association policies. Then, the marginal coverage rate is calculated by combining these two cases through the distance-dependent thinning method. Simulation results verify the accuracy of derived theoretical formulations and provide valuable guidelines for the practical network deployment. Specifically, our results indicate the superiority of the RIS deployment with the density of 40 km${}^{-2}$ BSs, and that the joint coverage rate of ISAC performance exhibits potential growth from $67.1\%$ to $92.2\%$ with the deployment of RISs.

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.05168 [pdf, other]

Deception in Nash Equilibrium Seeking

Authors: Michael Tang, Umar Javed, Xudong Chen, Miroslav Krstic, Jorge I. Poveda

Abstract: In socio-technical multi-agent systems, deception exploits privileged information to induce false beliefs in "victims," keeping them oblivious and leading to outcomes detrimental to them or advantageous to the deceiver. We consider model-free Nash-equilibrium-seeking for non-cooperative games with asymmetric information and introduce model-free deceptive algorithms with stability guarantees. In the simplest algorithm, the deceiver includes in his action policy the victim's exploration signal, with an amplitude tuned by an integrator of the regulation error between the deceiver's actual and desired payoff. The integral feedback drives the deceiver's payoff to the payoff's reference value, while the victim is led to adopt a suboptimal action, at which the pseudogradient of the deceiver's payoff is zero. The deceiver's and victim's actions turn out to constitute a "deceptive" Nash equilibrium of a different game, whose structure is managed - in real time - by the deceiver. We examine quadratic, aggregative, and more general games and provide conditions for a successful deception, mutual and benevolent deception, and immunity to deception. Stability results are established using techniques based on averaging and singular perturbations.
Among the examples in the paper is a microeconomic duopoly in which the deceiver induces in the victim a belief that the buyers disfavor the deceiver more than they actually do, leading the victim to increase the price above the Nash price, and resulting in an increased profit for the deceiver and a decreased profit for the victim. A study of the deceiver's integral feedback for the desired profit reveals that, in duopolies with equal marginal costs, a deceiver that is greedy for very high profit can attain any such profit, and pursue this with arbitrarily high integral gain (impatiently), irrespective of the market preference for the victim.

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.04672 [pdf, ps, other]

Rapid Mixing via Coupling Independence for Spin Systems with Unbounded Degree

Authors: Xiaoyu Chen, Weiming Feng

Abstract: We develop a new framework to prove the mixing or relaxation time for the Glauber dynamics on spin systems with unbounded degree. It works for general spin systems including both $2$-spin and multi-spin systems. As applications for this approach: $\bullet$ We prove the optimal $O(n)$ relaxation time for the Glauber dynamics of random $q$-list-coloring on an $n$-vertices triangle-tree graph with maximum degree $Δ$ such that $q/Δ> α^\star$, where $α^\star \approx 1.763$ is the unique positive solution of the equation $α= \exp(1/α)$. This improves the $n^{1+o(1)}$ relaxation time for Glauber dynamics obtained by the previous work of Jain, Pham, and Vuong (2022). Besides, our framework can also give a near-linear time sampling algorithm under the same condition. $\bullet$ We prove the optimal $O(n)$ relaxation time and near-optimal $\widetilde{O}(n)$ mixing time for the Glauber dynamics on hardcore models with parameter $λ$ in $\textit{balanced}$ bipartite graphs such that $λ< λ_c(Δ_L)$ for the max degree $Δ_L$ in left part and the max degree $Δ_R$ of right part satisfies $Δ_R = O(Δ_L)$. This improves the previous result by Chen, Liu, and Yin (2023). At the heart of our proof is the notion of $\textit{coupling independence}$ which allows us to consider multiple vertices as a huge single vertex with exponentially large domain and do a "coarse-grained" local-to-global argument on spin systems. The technique works for general (multi) spin systems and helps us obtain some new comparison results for Glauber dynamics.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.04620 [pdf, other]

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Authors: Yu Sun, Xinhao Li, Karan Dalal, Jiarui Xu, Arjun Vikram, Genghan Zhang, Yann Dubois, Xinlei Chen, Xiaolong Wang, Sanmi Koyejo, Tatsunori Hashimoto, Carlos Guestrin

Abstract: Self-attention performs well in long context but has quadratic complexity. Existing RNN layers have linear complexity, but their performance in long context is limited by the expressive power of their hidden state. We propose a new class of sequence modeling layers with linear complexity and an expressive hidden state. The key idea is to make the hidden state a machine learning model itself, and the update rule a step of self-supervised learning. Since the hidden state is updated by training even on test sequences, our layers are called Test-Time Training (TTT) layers. We consider two instantiations: TTT-Linear and TTT-MLP, whose hidden state is a linear model and a two-layer MLP respectively. We evaluate our instantiations at the scale of 125M to 1.3B parameters, comparing with a strong Transformer and Mamba, a modern RNN. Both TTT-Linear and TTT-MLP match or exceed the baselines. Similar to Transformer, they can keep reducing perplexity by conditioning on more tokens, while Mamba cannot after 16k context. With preliminary systems optimization, TTT-Linear is already faster than Transformer at 8k context and matches Mamba in wall-clock time. TTT-MLP still faces challenges in memory I/O, but shows larger potential in long context, pointing to a promising direction for future research.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.04576 [pdf, other]

Optimal Mixing for Randomly Sampling Edge Colorings on Trees Down to the Max Degree

Authors: Charlie Carlson, Xiaoyu Chen, Weiming Feng, Eric Vigoda

Abstract: We address the convergence rate of Markov chains for randomly generating an edge coloring of a given tree. Our focus is on the Glauber dynamics which updates the color at a randomly chosen edge in each step. For a tree $T$ with $n$ vertices and maximum degree $Δ$, when the number of colors $q$ satisfies $q\geq Δ+2$ then we prove that the Glauber dynamics has an optimal relaxation time of $O(n)$, where the relaxation time is the inverse of the spectral gap. This is optimal in the range of $q$ in terms of $Δ$ as Dyer, Goldberg, and Jerrum (2006) showed that the relaxation time is $Ω(n^3)$ when $q=Δ+1$. For the case $q=Δ+1$, we show that an alternative Markov chain which updates a pair of neighboring edges has relaxation time $O(n)$. Moreover, for the $Δ$-regular complete tree we prove $O(n\log^2{n})$ mixing time bounds for the respective Markov chain. Our proofs establish approximate tensorization of variance via a novel inductive approach, where the base case is a tree of height $\ell=O(Δ^2\log^2 Δ)$, which we analyze using a canonical paths argument.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.04433 [pdf, other]

Radiative decays of $P$-wave bottom baryons from light-cone sum rules

Authors: X. Luo, H. M. Yang, H. X. Chen

Abstract: We carry out a comprehensive investigation on the radiative decays of $P$-wave bottom baryons using the light-cone sum rule method. We analyze their electromagnetic transitions into ground-state bottom baryons together with a photon. Together with their mass spectra and strong decays investigated in Refs. \cite{Yang:2020zrh,Tan:2023opd}, a rather complete QCD sum rule study has been done to understand the $P$-wave singly bottom baryons within the framework of heavy quark effective theory. As summarized in Tables~\ref{tab:candidate}/\ref{tab:candidate6f}, some $P$-wave bottom baryons have limited strong decay widths so that their radiative decay widths become non-negligible. We propose to study these excited bottom baryons and their radiative decays in the future Belle-II, BESIII, and LHCb experiments.

Submitted 5 July, 2024; originally announced July 2024.

Comments: 24 pages, 10 figures, 7 tables, suggestions and comments welcome

arXiv:2407.04215 [pdf, other]

T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models

Authors: Zhongqi Wang, Jie Zhang, Shiguang Shan, Xilin Chen

Abstract: While text-to-image diffusion models demonstrate impressive generation capabilities, they also exhibit vulnerability to backdoor attacks, which involve the manipulation of model outputs through malicious triggers. In this paper, for the first time, we propose a comprehensive defense method named T2IShield to detect, localize, and mitigate such attacks. Specifically, we find the "Assimilation Phenomenon" on the cross-attention maps caused by the backdoor trigger. Based on this key insight, we propose two effective backdoor detection methods: Frobenius Norm Threshold Truncation and Covariance Discriminant Analysis. Besides, we introduce a binary-search approach to localize the trigger within a backdoor sample and assess the efficacy of existing concept editing methods in mitigating backdoor attacks. Empirical evaluations on two advanced backdoor attack scenarios show the effectiveness of our proposed defense method. For backdoor sample detection, T2IShield achieves a detection F1 score of 88.9$\%$ with low computational cost. Furthermore, T2IShield achieves a localization F1 score of 86.4$\%$ and invalidates 99$\%$ poisoned samples. Codes are released at https://github.com/Robin-WZQ/T2IShield.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Accepted by ECCV2024

arXiv:2407.04197 [pdf]

Compact Ion Beam System for Fusion Demonstration

Authors: Allan Xi Chen, Nai-Wei Liu, Alexander Gunn, Zhe Su, Benjamin F. Sigal, Matthew Salazar, Nawar Abdalla, James Chen, Alfred Y. Wong, Qiong Wang

Abstract: We demonstrate a compact ion beam device capable of accelerating H$^+$ and D$^+$ ions up to 75keV energy, on to a solid target, with sufficient beam current to study fusion reactions. The ion beam system uses a microwave driven plasma source to generate ions that are accelerated to high energy with a DC acceleration structure. The plasma source is driven by pulsed microwaves from a solid-state RF amplifier, which is impedance matched to the plasma source chamber at the ISM band frequency (2.4-2.5GHz). The plasma chamber is held at high positive DC potential and is isolated from the impedance matching structure (at ground potential) by a dielectric-filled gap. To facilitate the use of high-energy-particle detectors near the target, the plasma chamber is biased to a high positive voltage, while the target remains grounded. A target loaded with deuterium is used to study D-D fusion and a B$_4$C or LaB$_6$ target is used to study p-$^{11}$B fusion. Detectors include solid-state charged particle detector and a scintillation fast neutron detector. The complete ion beam system can fit on a laboratory table and is a useful tool for teaching undergraduate and graduate students about the physics of fusion.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 18 pages, 13 figures

arXiv:2407.04060 [pdf, other]

2.4-THz Bandwidth Optical Coherent Receiver Based on a Photonic Crystal Microcomb

Authors: Callum Deakin, Jizhao Zang, Xi Chen, Di Che, Lauren Dallachiesa, Brian Stern, Nicolas K. Fontaine, Scott Papp

Abstract: We demonstrate a spectrally-sliced single-polarization optical coherent receiver with a record 2.4-THz bandwidth, using a 200-GHz tantalum pentoxide photonic crystal microring resonator as the local oscillator frequency comb.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 2024 European Conference on Optical Communication (ECOC)

arXiv:2407.03892 [pdf, other]

On the Effectiveness of Acoustic BPE in Decoder-Only TTS

Authors: Bohan Li, Feiyu Shen, Yiwei Guo, Shuai Wang, Xie Chen, Kai Yu

Abstract: Discretizing speech into tokens and generating them by a decoder-only model have been a promising direction for text-to-speech (TTS) and spoken language modeling (SLM). To shorten the sequence length of speech tokens, acoustic byte-pair encoding (BPE) has emerged in SLM that treats speech tokens from self-supervised semantic representations as characters to further compress the token sequence. But the gain in TTS has not been fully investigated, and the proper choice of acoustic BPE remains unclear. In this work, we conduct a comprehensive study on various settings of acoustic BPE to explore its effectiveness in decoder-only TTS models with semantic speech tokens. Experiments on LibriTTS verify that acoustic BPE uniformly increases the intelligibility and diversity of synthesized speech, while showing different features across BPE settings. Hence, acoustic BPE is a favorable tool for decoder-only TTS.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 5 pages, 3 tables, 1 figure. Accepted to Interspeech 2024

arXiv:2407.03741 [pdf, other]

A Unified Expression for Upper Bounds on the BLER of Spinal Codes over Fading Channels

Authors: Aimin Li, Xiaomeng Chen, Shaohua Wu, Gary C. F. Lee, Sumei Sun

Abstract: Performance evaluation of particular channel coding has been a significant topic in coding theory, often involving the use of bounding techniques. This paper focuses on the new family of capacity-achieving codes, Spinal codes, to provide a comprehensive analysis framework to tightly upper bound the block error rate (BLER) of Spinal codes in the finite block length (FBL) regime. First, we resort to a variant of the Gallager random coding bound to upper bound the BLER of Spinal codes over the fading channel. Then, this paper derives a new bound without resorting to the use of Gallager random coding bound, achieving provable tightness over the wide range of signal-to-noise ratios (SNR). The derived BLER upper bounds in this paper are generalized, facilitating the performance evaluations of Spinal codes over different types of fast fading channels. Over the Rayleigh, Nakagami-m, and Rician fading channels, this paper explicitly derived the BLER upper bounds on Spinal codes as case studies. Based on the bounds, we theoretically reveal that the tail transmission pattern (TTP) for ML-decoded Spinal codes remains optimal in terms of reliability performance. Simulations verify the tightness of the bounds and the insights obtained.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.03633 [pdf, ps, other]

Thermodynamics of heavy quarkonium in the spinning black hole background

Authors: Zhou-Run Zhu, Sheng Wang, Xun Chen, Jun-Xia Chen, Defu Hou

Abstract: In this paper, we examine the thermodynamics of heavy quarkonium in the spinning black hole background. Specifically, we investigate the effect of angular momentum on the interquark distance, free energy, binding energy, entropy, entropic force, and internal energy of heavy quarkonium from the thermodynamic relationship. Our findings indicate that the angular momentum reduces the maximum value of interquark distance, suggesting that it promotes the dissociation of quarkonium. Additionally, we observe that the angular momentum suppresses free energy. From the results of binding energy, the angular momentum favors the melting of meson into a free quark and antiquark. Moreover, the results show that angular momentum increases the entropy and entropic force, thus accelerates the dissociation of quarkonium. The angular momentum increases the internal energy at large interquark distance. Finally, we find that the angular momentum has a more pronounced effect on quarkonium when the axis of quark pair $Q\overline{Q}$ is transverse to the direction of angular momentum.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 16 pages, 6 figures

arXiv:2407.03605 [pdf, other]

Orthogonal Constrained Minimization with Tensor $\ell_{2,p}$ Regularization for HSI Denoising and Destriping

Authors: Xiaoxia Liu, Shijie Yu, Jian Lu, Xiaojun Chen

Abstract: Hyperspectral images (HSIs) are often contaminated by a mixture of noises such as Gaussian noise, dead lines, stripes, and so on. In this paper, we propose a novel approach for HSI denoising and destriping, called NLTL2p, which consists of an orthogonal constrained minimization model and an iterative algorithm with convergence guarantees. The model of the proposed NLTL2p approach is built based on a new sparsity-enhanced Nonlocal Low-rank Tensor regularization and a tensor $\ell_{2,p}$ norm with $p\in(0,1)$. The low-rank constraints for HSI denoising utilize the spatial nonlocal self-similarity and spectral correlation of HSIs and are formulated based on independent higher-order singular value decomposition with sparsity enhancement on its core tensor to prompt more low-rankness. The tensor $\ell_{2,p}$ norm for HSI destriping is extended from the matrix $\ell_{2,p}$ norm. A proximal block coordinate descent algorithm is proposed in the NLTL2p approach to solve the resulting nonconvex nonsmooth minimization with orthogonal constraints. We show any accumulation point of the sequence generated by the proposed algorithm converges to a first-order stationary point, which is defined using three equalities of substationarity, symmetry, and feasibility for orthogonal constraints. In the numerical experiments, we compare the proposed method with state-of-the-art methods including a deep learning based method, and test the methods on both simulated and real HSI datasets.
Our proposed NLTL2p method demonstrates outperformance in terms of metrics such as mean peak signal-to-noise ratio as well as visual quality.

Submitted 3 July, 2024; originally announced July 2024.

MSC Class: 68U10; 90C26; 15A18; 65F22

arXiv:2407.03591 [pdf, other]

Controlling quasi-parametric amplifications: From multiple PT-symmetry phase transitions to non-Hermitian sensing

Authors: Xiaoxiong Wu, Kai Bai, Penghong Yu, Zhaohui Dong, Yanyan He, Jingui Ma, Vladislav V. Yakovlev, Meng Xiao, Xianfeng Chen, Luqi Yuan

Abstract: Quasi-parametric amplification (QPA) is a nonlinear interaction in which the idler wave is depleted through some loss mechanism. QPA plays an important role in signal amplification in ultrafast photonics and quantum light generation. The QPA process has a number of features characterized by the non-Hermitian parity-time ($\mathcal{PT}$) symmetry. In this report, we explore new interaction regimes and uncover multiple $\mathcal{PT}$-symmetry phase transitions in such QPA process where transitions are particularly sensitive to external parameters. In particular, we demonstrate the feasibility of detection of $10^{-11}$ inhomogeneities of the doped absorber, which is order of magnitude more sensitive than similar measurements performed in a linear absorption regime. In doing so, we reveal a family of $\mathcal{PT}$-symmetry phase transitions appearing in the QPA process and provide a novel nonlinear optical sensing mechanism for precise optical measurements.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 17 pages, 6 figures

arXiv:2407.03575 [pdf, other]

DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification

Authors: Wenhui Zhu, Xiwen Chen, Peijie Qiu, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang

Abstract: Multiple instance learning (MIL) stands as a powerful approach in weakly supervised learning, regularly employed in histological whole slide image (WSI) classification for detecting tumorous lesions. However, existing mainstream MIL methods focus on modeling correlation between instances while overlooking the inherent diversity among instances. However, few MIL methods have aimed at diversity modeling, which empirically show inferior performance but with a high computational cost. To bridge this gap, we propose a novel MIL aggregation method based on diverse global representation (DGR-MIL), by modeling diversity among instances through a set of global vectors that serve as a summary of all instances. First, we turn the instance correlation into the similarity between instance embeddings and the predefined global vectors through a cross-attention mechanism. This stems from the fact that similar instance embeddings typically would result in a higher correlation with a certain global vector. Second, we propose two mechanisms to enforce the diversity among the global vectors to be more descriptive of the entire bag: (i) positive instance alignment and (ii) a novel, efficient, and theoretically guaranteed diversification learning paradigm. Specifically, the positive instance alignment module encourages the global vectors to align with the center of positive instances (e.g., instances containing tumors in WSI). To further diversify the global representations, we propose a novel diversification learning paradigm leveraging the determinantal point process.
The proposed model outperforms the state-of-the-art MIL aggregation models by a substantial margin on the CAMELYON-16 and the TCGA-lung cancer datasets. The code is available at https://github.com/ChongQingNoSubway/DGR-MIL.

Submitted 3 July, 2024; originally announced July 2024.

Comments: Accepted to ECCV 2024

arXiv:2407.03440 [pdf, other]

Advanced Framework for Animal Sound Classification With Features Optimization

Authors: Qiang Yang, Xiuying Chen, Changsheng Ma, Carlos M. Duarte, Xiangliang Zhang

Abstract: The automatic classification of animal sounds presents an enduring challenge in bioacoustics, owing to the diverse statistical properties of sound signals, variations in recording equipment, and prevalent low Signal-to-Noise Ratio (SNR) conditions. Deep learning models like Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) have excelled in human speech recognition but have not been effectively tailored to the intricate nature of animal sounds, which exhibit substantial diversity even within the same domain. We propose an automated classification framework applicable to general animal sound classification. Our approach first optimizes audio features from Mel-frequency cepstral coefficients (MFCC) including feature rearrangement and feature reduction. It then uses the optimized features for the deep learning model, i.e., an attention-based Bidirectional LSTM (Bi-LSTM), to extract deep semantic features for sound classification. We also contribute an animal sound benchmark dataset encompassing oceanic animals and birds. Extensive experimentation with real-world datasets demonstrates that our approach consistently outperforms baseline methods by over 25% in precision, recall, and accuracy, promising advancements in animal sound classification.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03007 [pdf, other]

What Affects the Stability of Tool Learning? An Empirical Study on the Robustness of Tool Learning Frameworks

Authors: Chengrui Huang, Zhengliang Shi, Yuntao Wen, Xiuying Chen, Peng Han, Shen Gao, Shuo Shang

Abstract: Tool learning methods have enhanced the ability of large language models (LLMs) to interact with real-world applications. Many existing works fine-tune LLMs or design prompts to enable LLMs to select appropriate tools and correctly invoke them to meet user requirements. However, it is observed in previous works that the performance of tool learning varies from tasks, datasets, training settings, and algorithms. Without understanding the impact of these factors, it can lead to inconsistent results, inefficient model deployment, and suboptimal tool utilization, ultimately hindering the practical integration and scalability of LLMs in real-world scenarios. Therefore, in this paper, we explore the impact of both internal and external factors on the performance of tool learning frameworks. Through extensive experiments on two benchmark datasets, we find several insightful conclusions for future work, including the observation that LLMs can benefit significantly from increased trial and exploration. We believe our empirical study provides a new perspective for future tool learning research.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 19 pages, 9 figures

Showing 1–50 of 9,605 results for author: Chen, X