
Fast Mixing in Sparse Random Ising Models

Authors: Kuikui Liu, Sidhanth Mohanty, Amit Rajaraman, David X. Wu

Abstract: Motivated by the community detection problem in Bayesian inference, as well as the recent explosion of interest in spin glasses from statistical physics, we study the classical Glauber dynamics for sampling from Ising models with sparse random interactions. It is now well-known that when the interaction matrix has spectral diameter less than $1$, Glauber dynamics mixes in $O(n\log n)$ steps. Unfortunately, such criteria fail dramatically for interactions supported on arguably the most well-studied sparse random graph: the Erdős--Rényi random graph $G(n,d/n)$, due to the presence of almost linearly many outlier eigenvalues of unbounded magnitude. We prove that for the \emph{Viana--Bray spin glass}, where the interactions are supported on $G(n,d/n)$ and randomly assigned $\pm\beta$, Glauber dynamics mixes in $n^{1+o(1)}$ time with high probability as long as $\beta\le O(1/\sqrt{d})$, independent of $n$. We further extend our results to random graphs drawn according to the $2$-community stochastic block model, as well as when the interactions are given by a "centered" version of the adjacency matrix. The latter setting is particularly relevant for the inference problem in community detection. Indeed, we build on this result to demonstrate that Glauber dynamics succeeds at recovering communities in the stochastic block model in an upcoming paper.
The primary technical ingredient in our proof is showing that with high probability, a sparse random graph can be decomposed into two parts --- a \emph{bulk} which behaves like a graph with bounded maximum degree and a well-behaved spectrum, and a \emph{near-forest} with favorable pseudorandom properties. We then use this decomposition to design a localization procedure that interpolates to simpler Ising models supported only on the near-forest, and then execute a pathwise analysis to establish a modified log-Sobolev inequality.

Submitted 10 May, 2024; originally announced May 2024.

Comments: 66 pages, 4 figures

arXiv:2405.05985 [pdf, other]

TrafficGPT: Towards Multi-Scale Traffic Analysis and Generation with Spatial-Temporal Agent Framework

Authors: Jinhui Ouyang, Yijie Zhu, Xiang Yuan, Di Wu

Abstract: The precise prediction of multi-scale traffic is a ubiquitous challenge in the urbanization process for car owners, road administrators, and governments. In the case of complex road networks, current and past traffic information from both upstream and downstream roads is crucial since various road networks have different semantic information about traffic. Rationalizing the utilization of semantic information can realize short-term, long-term, and unseen road traffic prediction. As the demands of multi-scale traffic analysis increase, on-demand interactions and visualizations are expected to be available for transportation participants. We have designed a multi-scale traffic generation system, namely TrafficGPT, using three AI agents to process multi-scale traffic data, conduct multi-scale traffic analysis, and present multi-scale visualization results. TrafficGPT consists of three essential AI agents: 1) a text-to-demand agent that is employed with Question & Answer AI to interact with users and extract prediction tasks through texts; 2) a traffic prediction agent that leverages multi-scale traffic data to generate temporal features and similarity, and fuse them with limited spatial features and similarity, to achieve accurate prediction of three tasks; and 3) a suggestion and visualization agent that uses the prediction results to generate suggestions and visualizations, providing users with a comprehensive understanding of traffic conditions.
Our TrafficGPT system focuses on addressing concerns about traffic prediction from transportation participants. We conducted extensive experiments on five real-world road datasets to demonstrate its superior predictive and interactive performance.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.05895 [pdf, ps, other]

Border rank bounds for $GL(V)$-invariant tensors arising from matrices of constant rank

Authors: Derek Wu

Abstract: We prove border rank bounds for a class of $GL(V)$-invariant tensors in $V^*\otimes U\otimes W$, where $U$ and $W$ are $GL(V)$-modules. These tensors correspond to spaces of matrices of constant rank. In particular we prove lower bounds for tensors in $\mathbb{C}^l\otimes\mathbb{C}^m\otimes\mathbb{C}^n$ that are not $1_A$-generic, where no nontrivial bounds were known, and also when $l,m\ll n$, where previously only bounds for unbalanced matrix multiplication tensors were known. We give the first explicit use of Young flattenings for tensors beyond Koszul to obtain border rank lower bounds, and determine the border rank of three tensors.

Submitted 9 May, 2024; originally announced May 2024.

MSC Class: 68Q17; 14L30; 15A69; 15A30

arXiv:2405.04853 [pdf, other]

Mack modes in supersonic boundary layer

Authors: Nader Masmoudi, Yuxi Wang, Di Wu, Zhifei Zhang

Abstract: Understanding the transition mechanism of boundary layer flows is of great significance in physics and engineering, especially due to the current development of supersonic and hypersonic aircraft. In this paper, we construct multiple unstable acoustic modes, the so-called Mack modes, which play a crucial role during the early stage of transition in the supersonic boundary layer. To this end, we develop an inner-outer gluing iteration to solve a singular system of mixed hyperbolic-elliptic type.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2405.02580 [pdf, other]

PropertyGPT: LLM-driven Formal Verification of Smart Contracts through Retrieval-Augmented Property Generation

Authors: Ye Liu, Yue Xue, Daoyuan Wu, Yuqiang Sun, Yi Li, Miaolei Shi, Yang Liu

Abstract: With recent advances in large language models (LLMs), this paper explores the potential of leveraging state-of-the-art LLMs, such as GPT-4, to transfer existing human-written properties (e.g., those from Certora auditing reports) and automatically generate customized properties for unknown code. To this end, we embed existing properties into a vector database and retrieve a reference property for LLM-based in-context learning to generate a new property for a given code. While this basic process is relatively straightforward, ensuring that the generated properties are (i) compilable, (ii) appropriate, and (iii) runtime-verifiable presents challenges. To address (i), we use the compilation and static analysis feedback as an external oracle to guide LLMs in iteratively revising the generated properties. For (ii), we consider multiple dimensions of similarity to rank the properties and employ a weighted algorithm to identify the top-K properties as the final result. For (iii), we design a dedicated prover to formally verify the correctness of the generated properties. We have implemented these strategies into a novel system called PropertyGPT, with 623 human-written properties collected from 23 Certora projects. Our experiments show that PropertyGPT can generate comprehensive and high-quality properties, achieving an 80% recall compared to the ground truth. It successfully detected 26 CVEs/attack incidents out of 37 tested and also uncovered 12 zero-day vulnerabilities, resulting in $8,256 bug bounty rewards.

Submitted 4 May, 2024; originally announced May 2024.

arXiv:2405.02540 [pdf, ps, other]

Several results on exact sequences in categories of modules over trusses

Authors: Yongduo Wang, Dengke Jia, Jian He, Dejun Wu

Abstract: Categorical aspects of the theory of modules over trusses have been studied in recent years. The snake lemma and the nine lemma in categories of modules over trusses are formulated in this paper.

Submitted 3 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2006.16624, arXiv:2311.01979 by other authors

MSC Class: 18G80; 18E10

arXiv:2405.01844 [pdf, other]

A Survey on Privacy-Preserving Caching at Network Edge: Classification, Solutions, and Challenges

Authors: Xianzhi Zhang, Yipeng Zhou, Di Wu, Shazia Riaz, Quan Z. Sheng, Miao Hu, Linchang Xiao

Abstract: Caching content at the network edge is a popular and effective technique widely deployed to alleviate the burden of network backhaul, shorten service delay and improve service quality. However, there has been some controversy over privacy violations in caching content at the network edge. On the one hand, the multi-access open edge network provides an ideal surface for external attackers to obtain private data from the edge cache by extracting sensitive information. On the other hand, privacy can be infringed by curious edge caching providers through caching trace analysis aimed at achieving better caching performance or higher profits. Therefore, an in-depth understanding of privacy issues in edge caching networks is vital and indispensable for creating a privacy-preserving caching service at the network edge. In this article, we are among the first to fill in this gap by examining privacy-preserving techniques for caching content at the network edge. Firstly, we provide an introduction to the background of Privacy-Preserving Edge Caching (PPEC). Next, we summarize the key privacy issues and present a taxonomy for caching at the network edge from the perspective of private data. Additionally, we conduct a retrospective review of the state-of-the-art countermeasures against privacy leakage from content caching at the network edge. Finally, we conclude the survey and envision challenges for future research.

Submitted 3 May, 2024; originally announced May 2024.

arXiv:2405.01275 [pdf, other]

Variable Selection in Ultra-high Dimensional Feature Space for the Cox Model with Interval-Censored Data

Authors: Daewoo Pak, Jianrui Zhang, Di Wu, Haolei Weng, Chenxi Li

Abstract: We develop a set of variable selection methods for the Cox model under interval censoring, in the ultra-high dimensional setting where the dimensionality can grow exponentially with the sample size. The methods select covariates via a penalized nonparametric maximum likelihood estimation with some popular penalty functions, including lasso, adaptive lasso, SCAD, and MCP. We prove that our penalized variable selection methods with folded concave penalties or adaptive lasso penalty enjoy the oracle property. Extensive numerical experiments show that the proposed methods have satisfactory empirical performance under various scenarios. The utility of the methods is illustrated through an application to a genome-wide association study of age to early childhood caries.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2405.00699 [pdf, other]

Direct Training Needs Regularisation: Anytime Optimal Inference Spiking Neural Network

Authors: Dengyu Wu, Yi Qi, Kaiwen Cai, Gaojie Jin, Xinping Yi, Xiaowei Huang

Abstract: The Spiking Neural Network (SNN) is acknowledged as the next generation of Artificial Neural Network (ANN) and holds great promise in effectively processing spatial-temporal information. However, the choice of timestep becomes crucial as it significantly impacts the accuracy of the neural network training. Specifically, a smaller timestep indicates better performance in efficient computing, resulting in reduced latency and operations. However, using a small timestep may lead to low accuracy due to insufficient information presentation with few spikes. This observation motivates us to develop an SNN that is more reliable for adaptive timestep by introducing a novel regularisation technique, namely Spatial-Temporal Regulariser (STR). Our approach regulates the ratio between the strength of spikes and membrane potential at each timestep. This effectively balances spatial and temporal performance during training, ultimately resulting in an Anytime Optimal Inference (AOI) SNN. Through extensive experiments on frame-based and event-based datasets, our method, in combination with cutoff based on softmax output, achieves state-of-the-art performance in terms of both latency and accuracy. Notably, with STR and cutoff, SNN achieves 2.14 to 2.89 times faster inference compared to the pre-configured timestep, with a near-zero accuracy drop of 0.50% to 0.64% over the event-based datasets. Code available: https://github.com/Dengyu-Wu/AOI-SNN-Regularisation

Submitted 15 April, 2024; originally announced May 2024.

arXiv:2404.18671 [pdf, ps, other]

doi 10.1088/1402-4896/ad3f86

Uncertainty relation and the constrained quadratic programming

Authors: Lin Zhang, Dade Wu, Ming-Jing Zhao, Hua Nan

Abstract: The uncertainty relation, a fundamental concept in quantum theory, plays a pivotal role in various quantum information processing tasks. In this study, we explore the additive uncertainty relation pertaining to two or more observables, in terms of their variance, by utilizing the generalized Gell-Mann representation in qudit systems. We find that the tight state-independent lower bound of the variance sum can be characterized as a quadratic programming problem with nonlinear constraints in optimization theory. As illustrative examples, we derive analytical solutions for these quadratic programming problems in lower-dimensional systems, which align with the state-independent lower bounds. Additionally, we introduce a numerical algorithm tailored for solving these quadratic programming instances, highlighting its efficiency and accuracy. The advantage of our approach lies in its potential ability to not only achieve the optimal value of the quadratic programming problem with nonlinear constraints but also precisely identify the extremal state where this optimal value is attained. This enables us to establish a tight state-independent lower bound for the sum of variances, and further identify the extremal state at which this lower bound is realized.

Submitted 29 April, 2024; originally announced April 2024.

Comments: 35 pages, LaTeX

Journal ref: Physica Scripta 99, 065103 (2024)

arXiv:2404.18644 [pdf, other]

Low-Overhead Defect-Adaptive Surface Code with Bandage-Like Super-Stabilizers

Authors: Zuolin Wei, Tan He, Yangsen Ye, Dachao Wu, Yiming Zhang, Youwei Zhao, Weiping Lin, He-Liang Huang, Xiaobo Zhu, Jian-Wei Pan

Abstract: To make practical quantum algorithms work, large-scale quantum processors protected by error-correcting codes are required to resist noise and ensure reliable computational outcomes. However, a major challenge arises from defects in processor fabrication, as well as occasional losses or cosmic rays during the computing process, all of which can lead to qubit malfunctions and disrupt error-correcting codes' normal operations. In this context, we introduce an automatic adapter to implement the surface code on defective lattices. Unlike previous approaches, this adapter leverages newly proposed bandage-like super-stabilizers to save more qubits when defects are clustered, thus enhancing the code distance and reducing super-stabilizer weight. For instance, in comparison with earlier methods, with a code size of 27 and a random defect rate of 2\%, the disabled qubits decrease by $1/3$, and the average preserved code distance increases by 63\%. This demonstrates a significant reduction in overhead when handling defects using our approach, and this advantage amplifies with increasing processor size and defect rates. Our work presents a low-overhead, automated solution to the challenge of adapting the surface code to defects, an essential step towards scaling up the construction of large-scale quantum computers for practical applications.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.18050 [pdf, other]

Deformability, inherent mechanical properties and chemical bonding of Al11Nd3 in Al-Nd target material

Authors: Xue-Qian Wang, Run-Xin Song, Xu Guan, Shuan Li, Shuchen Sun, Hongbo Yang, Daogao Wu, Ganfeng Tu, Song Li, Hai-Le Yan, Liang Zuo

Abstract: Microstructure uniformity of the Al-Nd target materials with Al11Nd3 significantly affects the performance of the fabricated film, which is widely used as wiring material in large-size thin-film transistor liquid crystal display (TFT-LCD) panels. Understanding the inherent mechanical properties and chemical bonds of Al11Nd3 is crucial for homogenizing the Al-Nd target. Here, by a combined experimental and ab-initio theoretical study, the microstructure and deformability of the Al-3wt%Nd alloy and the inherent mechanical properties and chemical bonds of Al11Nd3 are investigated comprehensively. The Al-3wt%Nd alloy is composed of the pre-eutectic α-Al matrix and the eutectic α-Al and highly stable α-Al11Nd3 phases. During the plastic deformation, the eutectic microstructure transforms from a cellular to a lamellar shape, while the morphology and dimension of α-Al11Nd3 are not changed significantly. By examining ideal tensile strength, elastic moduli, hardness and brittleness-ductility, the hardness-brittleness of α-Al11Nd3 is quantitatively evaluated, accounting for its difficulties of plastic deformation and fragmentation. Combining band structure, population analysis, topological analysis and crystal orbital Hamilton population, it is revealed that α-Al11Nd3 possesses two types of chemical bonds: the Nd-Al and Al-Al bonds. The former is a typical ionic bond with electron transfer from Nd to Al, while the latter, dominated by both 3s-3p and 3p-3p interactions, is a weak covalent bond.
The mixed chemical bond is responsible for the high hardness-brittleness of α-Al11Nd3. This work is expected to lay a foundation for Al-Nd alloy and catalyze the fabrication of high-quality Al-Nd target materials.

Submitted 27 April, 2024; originally announced April 2024.

Comments: 11 figures, 5 tables

arXiv:2404.17900 [pdf, other]

Unsupervised Anomaly Detection via Masked Diffusion Posterior Sampling

Authors: Di Wu, Shicai Fan, Xue Zhou, Li Yu, Yuzhong Deng, Jianxiao Zou, Baihong Lin

Abstract: Reconstruction-based methods have been commonly used for unsupervised anomaly detection, in which a normal image is reconstructed and compared with the given test image to detect and locate anomalies. Recently, diffusion models have shown promising applications for anomaly detection due to their powerful generative ability. However, these models lack strict mathematical support for normal image reconstruction and unexpectedly suffer from low reconstruction quality. To address these issues, this paper proposes a novel and highly-interpretable method named Masked Diffusion Posterior Sampling (MDPS). In MDPS, the problem of normal image reconstruction is mathematically modeled as multiple diffusion posterior sampling for normal images based on the devised masked noisy observation model and the diffusion-based normal image prior under a Bayesian framework. Using a metric designed from pixel-level and perceptual-level perspectives, MDPS can effectively compute the difference map between each normal posterior sample and the given test image. Anomaly scores are obtained by averaging all difference maps for multiple posterior samples. Exhaustive experiments on MVTec and BTAD datasets demonstrate that MDPS can achieve state-of-the-art performance in normal image reconstruction quality as well as anomaly detection and localization.

Submitted 27 April, 2024; originally announced April 2024.

Journal ref: International Joint Conference on Artificial Intelligence 2024

arXiv:2404.17833 [pdf, other]

Testing and Understanding Erroneous Planning in LLM Agents through Synthesized User Inputs

Authors: Zhenlan Ji, Daoyuan Wu, Pingchuan Ma, Zongjie Li, Shuai Wang

Abstract: Agents based on large language models (LLMs) have demonstrated effectiveness in solving a wide range of tasks by integrating LLMs with key modules such as planning, memory, and tool usage. Increasingly, customers are adopting LLM agents across a variety of commercial applications critical to reliability, including support for mental well-being, chemical synthesis, and software development. Nevertheless, our observations and daily use of LLM agents indicate that they are prone to making erroneous plans, especially when the tasks are complex and require long-term planning. In this paper, we propose PDoctor, a novel and automated approach to testing LLM agents and understanding their erroneous planning. As the first work in this direction, we formulate the detection of erroneous planning as a constraint satisfiability problem: an LLM agent's plan is considered erroneous if its execution violates the constraints derived from the user inputs. To this end, PDoctor first defines a domain-specific language (DSL) for user queries and synthesizes varying inputs with the assistance of the Z3 constraint solver. These synthesized inputs are natural language paragraphs that specify the requirements for completing a series of tasks. Then, PDoctor derives constraints from these requirements to form a testing oracle. We evaluate PDoctor with three mainstream agent frameworks and two powerful LLMs (GPT-3.5 and GPT-4).
The results show that PDoctor can effectively detect diverse errors in agent planning and provide insights and error characteristics that are valuable to both agent developers and users. We conclude by discussing potential alternative designs and directions to extend PDoctor.

Submitted 27 April, 2024; originally announced April 2024.

arXiv:2404.17607 [pdf, other]

Utilizing Large Language Models to Identify Reddit Users Considering Vaping Cessation for Digital Interventions

Authors: Sai Krishna Revanth Vuruma, Dezhi Wu, Saborny Sen Gupta, Lucas Aust, Valerie Lookingbill, Caleb Henry, Yang Ren, Erin Kasson, Li-Shiun Chen, Patricia Cavazos-Rehg, Dian Hu, Ming Huang

Abstract: The widespread adoption of social media platforms globally not only enhances users' connectivity and communication but also emerges as a vital channel for the dissemination of health-related information, thereby establishing social media data as an invaluable organic data resource for public health research. The surge in popularity of vaping or e-cigarette use in the United States and other countries has caused an outbreak of e-cigarette and vaping use-associated lung injury (EVALI), leading to hospitalizations and fatalities in 2019, highlighting the urgency to comprehend vaping behaviors and develop effective strategies for cessation. In this study, we extracted a sample dataset from one vaping sub-community on Reddit to analyze users' quit-vaping intentions. Leveraging large language models including both the latest GPT-4 and traditional BERT-based language models for sentence-level quit-vaping intention prediction tasks, this study compares the outcomes of these models against human annotations. Notably, when compared to human evaluators, the GPT-4 model demonstrates superior consistency in adhering to annotation guidelines and processes, showcasing advanced capabilities to detect nuanced user quit-vaping intentions that human evaluators might overlook. These preliminary findings emphasize the potential of GPT-4 in enhancing the accuracy and reliability of social media data analysis, especially in identifying subtle user intentions that may elude human detection.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.16407 [pdf, other]

U2++ MoE: Scaling 4.7x parameters with minimal impact on RTF

Authors: Xingchen Song, Di Wu, Binbin Zhang, Dinghao Zhou, Zhendong Peng, Bo Dang, Fuping Pan, Chao Yang

Abstract: Scale has opened new frontiers in natural language processing, but at a high cost. In response, by learning to only activate a subset of parameters in training and inference, Mixture-of-Experts (MoE) models have been proposed as an energy-efficient path to even larger and more capable language models, and this shift towards a new generation of foundation models is gaining momentum, particularly within the field of Automatic Speech Recognition (ASR). Recent works that incorporate MoE into ASR models have complex designs such as routing frames via a supplementary embedding network, improving multilingual ability for the experts, and utilizing dedicated auxiliary losses for either expert load balancing or specific language handling. We found that delicate designs are not necessary, while an embarrassingly simple substitution of MoE layers for all Feed-Forward Network (FFN) layers is competent for the ASR task. To be more specific, we benchmark our proposed model on a large-scale inner-source dataset (160k hours); the results show that we can scale our baseline Conformer (Dense-225M) to its MoE counterpart (MoE-1B) and achieve Dense-1B level Word Error Rate (WER) while maintaining a Dense-225M level Real Time Factor (RTF). Furthermore, by applying the Unified 2-pass framework with bidirectional attention decoders (U2++), we achieve the streaming and non-streaming decoding modes in a single MoE-based model, which we call U2++ MoE. We hope that our study can facilitate the research on scaling speech foundation models without sacrificing deployment efficiency.

Submitted 25 April, 2024; originally announced April 2024.

ACM Class: I.2.7

arXiv:2404.14061 [pdf, other]

FedTAD: Topology-aware Data-free Knowledge Distillation for Subgraph Federated Learning

Authors: Yinlin Zhu, Xunkai Li, Zhengyu Wu, Di Wu, Miao Hu, Rong-Hua Li

Abstract: Subgraph federated learning (subgraph-FL) is a new distributed paradigm that facilitates the collaborative training of graph neural networks (GNNs) by multi-client subgraphs. Unfortunately, a significant challenge of subgraph-FL arises from subgraph heterogeneity, which stems from node and topology variation, causing the impaired performance of the global GNN. Existing studies have not yet thoroughly investigated the impact mechanism of subgraph heterogeneity. To this end, we decouple node and topology variation, revealing that they correspond to differences in label distribution and structure homophily. Remarkably, these variations lead to significant differences in the class-wise knowledge reliability of multiple local GNNs, misguiding the model aggregation with varying degrees. Building on this insight, we propose topology-aware data-free knowledge distillation technology (FedTAD), enhancing reliable knowledge transfer from the local model to the global model. Extensive experiments on six public datasets consistently demonstrate the superiority of FedTAD over state-of-the-art baselines.

Submitted 25 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

Comments: Accepted by IJCAI 2024

arXiv:2404.13945 [pdf, other]

How do LLMs Support Deep Learning Testing? A Comprehensive Study Through the Lens of Image Mutation

Authors: Liwen Wang, Yuanyuan Yuan, Ao Sun, Zongjie Li, Pingchuan Ma, Daoyuan Wu, Shuai Wang

Abstract: Visual deep learning (VDL) systems have shown significant success in real-world applications like image recognition, object detection, and autonomous driving. To evaluate the reliability of VDL, a mainstream approach is software testing, which requires diverse and controllable mutations over image semantics. The rapid development of multi-modal large language models (MLLMs) has introduced revolutionary image mutation potentials through instruction-driven methods. Users can now freely describe desired mutations and let MLLMs generate the mutated images. However, the quality of MLLM-produced test inputs in VDL testing remains largely unexplored. We present the first study, aiming to assess MLLMs' adequacy from 1) the semantic validity of MLLM mutated images, 2) the alignment of MLLM mutated images with their text instructions (prompts), 3) the faithfulness of how different mutations preserve semantics that ought to remain unchanged, and 4) the effectiveness of detecting VDL faults. With large-scale human studies and quantitative evaluations, we identify MLLMs' promising potential in expanding the covered semantics of image mutations. Notably, while SoTA MLLMs (e.g., GPT-4V) fail to support or perform worse in editing existing semantics in images (as in traditional mutations like rotation), they generate high-quality test inputs using "semantic-additive" mutations (e.g., "dress a dog with clothes"), which bring extra semantics to images; these were infeasible for past approaches.
Hence, we view MLLM-based mutations as a vital complement to traditional mutations, and advocate future VDL testing tasks to combine MLLM-based methods and traditional image mutations for comprehensive and reliable testing.

Submitted 5 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.12256 [pdf, other]

doi 10.1109/TIV.2024.3389640

An Online Spatial-Temporal Graph Trajectory Planner for Autonomous Vehicles

Authors: Jilan Samiuddin, Benoit Boulet, Di Wu

Abstract: The autonomous driving industry is expected to grow by over 20 times in the coming decade, which motivates researchers to delve into it. The primary focus of their research is to ensure safety, comfort, and efficiency. An autonomous vehicle has several modules responsible for one or more of the aforementioned items. Among these modules, the trajectory planner plays a pivotal role in the safety of the vehicle and the comfort of its passengers. The module is also responsible for respecting kinematic constraints and any applicable road constraints. In this paper, a novel online spatial-temporal graph trajectory planner is introduced to generate safe and comfortable trajectories. First, a spatial-temporal graph is constructed using the autonomous vehicle, its surrounding vehicles, and virtual nodes along the road with respect to the vehicle itself. Next, the graph is forwarded into a sequential network to obtain the desired states. To support the planner, a simple behavioral layer is also presented that determines kinematic constraints for the planner. Furthermore, a novel potential function is also proposed to train the network. Finally, the proposed planner is tested on three different complex driving tasks, and the performance is compared with two frequently used methods. The results show that the proposed planner generates safe and feasible trajectories while achieving similar or longer distances in the forward direction and comparable ride comfort.

Submitted 18 April, 2024; originally announced April 2024.

Comments: This is the accepted version and published in the "Early Access" area of IEEE Xplore for the IEEE Transactions on Intelligent Vehicles on 16 April 2024. Article statistics: 11 pages, 9 figures, 2 tables

arXiv:2404.11201 [pdf, other]

Neuron Specialization: Leveraging intrinsic task modularity for multilingual machine translation

Authors: Shaomu Tan, Di Wu, Christof Monz

Abstract: Training a unified multilingual model promotes knowledge transfer but inevitably introduces negative interference. Language-specific modeling methods show promise in reducing interference. However, they often rely on heuristics to distribute capacity and struggle to foster cross-lingual transfer via isolated modules. In this paper, we explore intrinsic task modularity within multilingual networks and leverage these observations to circumvent interference under multilingual translation. We show that neurons in the feed-forward layers tend to be activated in a language-specific manner. Meanwhile, these specialized neurons exhibit structural overlaps that reflect language proximity, which progress across layers. Based on these findings, we propose Neuron Specialization, an approach that identifies specialized neurons to modularize feed-forward layers and then continuously updates them through sparse networks. Extensive experiments show that our approach achieves consistent performance gains over strong baselines with additional analyses demonstrating reduced interference and increased knowledge transfer.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2404.09668 [pdf, other]

Exploring field-evolution and dynamical-capture coalescing binary black holes in GWTC-3

Authors: Yin-Jie Li, Shao-Peng Tang, Shi-Jie Gao, Dao-Cheng Wu, Yuan-Zhu Wang

Abstract: The continuously expanding sample of gravitational-wave observations is revealing the formation and evolutionary mechanism of merging compact binaries. Two primary channels, namely, isolated field binary evolution and dynamical capture, are widely accepted as potential producers of merging binary black holes (BBHs), which are distinguishable with the spin-orientation distributions of the BBHs. We investigate the two formation channels in GWTC-3, with a dedicated semi-parametric population model, i.e., a mixture of two sub-populations with different spin-orientation distributions (one is nearly-aligned and the other is nearly-isotropic). It turns out that the two sub-populations have different mass and mass-ratio distributions. The nearly-aligned sub-population, which is consistent with the isolated field formation channels, has less preference for symmetric systems, and likely dominates the 10-solar-mass peak in the primary-mass function. The isotropic sub-population, by contrast, shows a stronger preference for symmetric systems, and mainly contributes to the 35-solar-mass peak in the primary-mass function, consistent with the dynamical channels. Moreover, our results show that the purely isotropic-spin and the single well-aligned (i.e., the width of $\cosθ$ distribution $σ_{\rm t}<0.5$) scenarios are ruled out (by a Bayes factor of $\ln\mathcal{B}=5.2$ and $\ln\mathcal{B}=9.8$).

Submitted 15 April, 2024; originally announced April 2024.

Comments: 26 pages, 16 figures

arXiv:2404.09483 [pdf, other]

Deep Learning for Cosmological Parameter Inference from Dark Matter Halo Density Field

Authors: Zhiwei Min, Xu Xiao, Jiacheng Ding, Liang Xiao, Jie Jiang, Donglin Wu, Qiufan Lin, Yin Li, Yang Wang, Shuai Liu, Zhixin Chen, Xiangru Li, Jinqu Zhang, Le Zhang, Xiao-Dong Li

Abstract: We propose a lightweight deep convolutional neural network (lCNN) to estimate cosmological parameters from simulated three-dimensional DM halo distributions and associated statistics. The training dataset comprises 2000 realizations of a cubic box with a side length of 1000 $h^{-1}{\rm Mpc}$, and interpolated over a cubic grid of $300^3$ voxels, with each simulation produced using $512^3$ DM particles and $512^3$ neutrinos. Under the flat $Λ$CDM model, simulations vary standard six cosmological parameters including $Ω_m$, $Ω_b$, $h$, $n_s$, $σ_8$, $w$, along with the neutrino mass sum, $M_ν$. We find that: 1) within the framework of lCNN, extracting large-scale structure information is more efficient from the halo density field compared to relying on the statistical quantities including the power spectrum, the two-point correlation function, and the coefficients from wavelet scattering transform; 2) combining the halo density field with its Fourier transformed counterpart enhances predictions, while augmenting the training dataset with measured statistics further improves performance; 3) achieving high accuracy in inferring $Ω_m$, $h$, $n_s$, and $σ_8$ by the neural network model, while being inefficient in predicting $Ω_b$, $M_ν$ and $w$; 4) compared to the simple random forest network trained with three statistical quantities, lCNN yields unbiased estimations with reduced statistical errors: approximately 33.3\% for $Ω_m$, 20.0\% for $h$, 8.3\% for $n_s$, and 40.0\% for $σ_8$.
Our study emphasizes this lCNN-based novel approach in extracting large-scale structure information and estimating cosmological parameters.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 10 pages, 9 figures

arXiv:2404.09185 [pdf, other]

Robust spin order and fragile charge order in Na0.5CoO2 as revealed by time-resolved terahertz spectroscopy

Authors: X. Y. Zhou, S. J. Zhang, D. Wu, H. Wang, B. H. Li, S. F. Wu, Q. M. Liu, T. C. Hu, R. S. Li, J. Y. Yuan, S. X. Xu, Q. Wu, L. Yue, T. Dong, N. L. Wang

Abstract: Near-infrared (NIR) pump-terahertz (THz) probe spectroscopy is used to investigate the charge and spin excitations in a strongly correlated electron compound Na0.5CoO2. This compound exhibits a coexistence of various charge and spin orders arising from intricate interactions among charge, spin, and orbital degrees of freedom. NIR pulses create significantly diverse effects on the charge and spin orders; while the charge order is easily melted, coherent magnon excitations are present in all fluences examined. Furthermore, a novel π phase shift of the coherent magnon oscillations is observed in the pump-induced change of the terahertz electric field between regions of increasing and decreasing field change. These results unequivocally illustrate that ultrashort laser pulses enable the disentanglement of different interactions within complex systems characterized by multiple orders, providing a fresh perspective on the interplay between itinerant and localized electrons within the Co 3d t2g multiplets.

Submitted 14 April, 2024; originally announced April 2024.

arXiv:2404.09182 [pdf, other]

doi 10.1103/PhysRevLett.132.206401

Coexistence of interacting charge density waves in a layered semiconductor

Authors: B. Q. Lv, Alfred Zong, Dong Wu, Zhengwei Nie, Yifan Su, Dongsung Choi, Batyr Ilyas, Bryan T. Fichera, Jiarui Li, Edoardo Baldini, Masataka Mogi, Y. -B. Huang, Hoi Chun Po, Sheng Meng, Yao Wang, N. L. Wang, Nuh Gedik

Abstract: Coexisting orders are key features of strongly correlated materials and underlie many intriguing phenomena from unconventional superconductivity to topological orders. Here, we report the coexistence of two interacting charge-density-wave (CDW) orders in EuTe4, a layered crystal that has drawn considerable attention owing to its anomalous thermal hysteresis and a semiconducting CDW state despite the absence of perfect FS nesting. By accessing unoccupied conduction bands with time- and angle-resolved photoemission measurements, we find that mono- and bi-layers of Te in the unit cell host different CDWs that are associated with distinct energy gaps. The two gaps display dichotomous evolutions following photoexcitation, where the larger bilayer CDW gap exhibits less renormalization and faster recovery. Surprisingly, the CDW in the Te monolayer displays an additional momentum-dependent gap renormalization that cannot be captured by density-functional theory calculations. This phenomenon is attributed to interlayer interactions between the two CDW orders, which account for the semiconducting nature of the equilibrium state. Our findings not only offer microscopic insights into the correlated ground state of EuTe4 but also provide a general non-equilibrium approach to understand coexisting, layer-dependent orders in a complex system.

Submitted 14 April, 2024; originally announced April 2024.

Comments: To appear in PRL

Journal ref: Physical Review Letters 132, 206401 (2024)

arXiv:2404.08875 [pdf]

Layer-by-layer connection for large area single crystal boron nitride multilayer films

Authors: Hui Shi, Mingyuan Wang, Hongying Chen, Adrien Rousseau, Junpeng Shu, Ming Tian, Ruowang Chen, Juliette Plo, Pierre Valvin, Bernard Gil, Jiajie Qi, Qinghe Wang, Kaihui Liu, Mingliang Zhang, Guillaume Cassabois, Di Wu, Neng Wan

Abstract: Boron nitride (BN) is today considered as one of the most promising materials for many novel applications including bright single photon emission, deep UV opto-electronics, small sized solid-state neutron detector, and high-performance two-dimensional materials, etc. Despite the recent successful fabrication of large-area BN single-crystals (typically <= 5 atomic layers), the scalable growth of thicker single-crystalline BN films still constitutes a great challenge. In this work, we demonstrate an approach to grow large-area multilayer single-crystal BN films by chemical vapor deposition on face-centered cubic Fe-Ni (111) single crystal alloy thin films with different stoichiometric phases. We show that the BN growth is greatly tunable and improved by increasing the Fe content in single-crystal Fe-Ni (111). The formation of pyramid-shaped multilayer BN domains with aligned orientation enables a continuous connection following a layer-by-layer, 'first-meet-first-connect', mosaic stitching mechanism. By means of selected area electron diffraction, micro-photoluminescence spectroscopy in the deep UV and high-resolution transmission electron microscopy, the layer-by-layer connection mechanism is unambiguously evidenced, and the stacking order has been verified to occur as unidirectional AB and ABC stackings, i.e., in the Bernal and rhombohedral BN phase.

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2404.03900 [pdf, other]

Nonparametric Modern Hopfield Models

Authors: Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu, Feng Ruan, Han Liu

Abstract: We present a nonparametric construction for deep learning compatible modern Hopfield models and utilize this framework to debut an efficient variant. Our key contribution stems from interpreting the memory storage and retrieval processes in modern Hopfield models as a nonparametric regression problem subject to a set of query-memory pairs. Crucially, our framework not only recovers the known results from the original dense modern Hopfield model but also fills the void in the literature regarding efficient modern Hopfield models, by introducing \textit{sparse-structured} modern Hopfield models with sub-quadratic complexity. We establish that this sparse model inherits the appealing theoretical properties of its dense analogue -- connection with transformer attention, fixed point convergence and exponential memory capacity -- even without knowing details of the Hopfield energy function. Additionally, we showcase the versatility of our framework by constructing a family of modern Hopfield models as extensions, including linear, random masked, top-$K$ and positive random feature modern Hopfield models. Empirically, we validate the efficacy of our framework in both synthetic and realistic settings.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 59 pages; Code available at https://github.com/MAGICS-LAB/NonparametricHopfield

arXiv:2404.03827 [pdf, other]

Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models

Authors: Dennis Wu, Jerry Yao-Chieh Hu, Teng-Yun Hsiao, Han Liu

Abstract: We propose a two-stage memory retrieval dynamics for modern Hopfield models, termed $\mathtt{U\text{-}Hop}$, with enhanced memory capacity. Our key contribution is a learnable feature map $Φ$ which transforms the Hopfield energy function into kernel space. This transformation ensures convergence between the local minima of energy and the fixed points of retrieval dynamics within the kernel space. Consequently, the kernel norm induced by $Φ$ serves as a novel similarity measure. It utilizes the stored memory patterns as learning data to enhance memory capacity across all modern Hopfield models. Specifically, we accomplish this by constructing a separation loss $\mathcal{L}_Φ$ that separates the local minima of kernelized energy by separating stored memory patterns in kernel space. Methodologically, $\mathtt{U\text{-}Hop}$ memory retrieval process consists of: (Stage I) minimizing separation loss for a more uniform memory (local minimum) distribution, followed by (Stage II) standard Hopfield energy minimization for memory retrieval. This results in a significant reduction of possible metastable states in the Hopfield energy function, thus enhancing memory capacity by preventing memory confusion. Empirically, with real-world datasets, we demonstrate that $\mathtt{U\text{-}Hop}$ outperforms all existing modern Hopfield models and state-of-the-art similarity measures, achieving substantial improvements in both associative memory retrieval and deep learning tasks.
Code is available at https://github.com/MAGICS-LAB/UHop ; future updates are on arXiv:2404.03827

Submitted 12 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: Accepted at ICML 2024; v2 updated to camera-ready version; Code available at https://github.com/MAGICS-LAB/UHop

arXiv:2404.03203 [pdf, other]

doi 10.1038/s41524-024-01266-x

Giant and controllable nonlinear magneto-optical effects in two-dimensional magnets

Authors: Dezhao Wu, Meng Ye, Haowei Chen, Yong Xu, Wenhui Duan

Abstract: The interplay of polarization and magnetism in materials with light can create rich nonlinear magneto-optical (NLMO) effects, and the recent discovery of two-dimensional (2D) van der Waals magnets provides remarkable control over NLMO effects due to their superb tunability. Here, based on first-principles calculations, we reported giant NLMO effects in CrI3-based 2D magnets, including a dramatic change of second-harmonics generation (SHG) polarization direction (90 degrees) and intensity (on/off switch) under magnetization reversal, and a 100% SHG circular dichroism effect. We further revealed that these effects could not only be used to design ultra-thin multifunctional optical devices, but also to detect subtle magnetic orderings. Remarkably, we analytically derived conditions to achieve giant NLMO effects and propose general strategies to realize them in 2D magnets. Our work not only uncovers a series of intriguing NLMO phenomena, but also paves the way for both fundamental research and device applications of ultra-thin NLMO materials.

Submitted 4 April, 2024; originally announced April 2024.

Comments: 10 pages, 5 figures, npj Computational Materials accepted

arXiv:2404.01687 [pdf, other]

Search for a sub-eV sterile neutrino using Daya Bay's full dataset

Authors: F. P. An, W. D. Bai, A. B. Balantekin, M. Bishai, S. Blyth, G. F. Cao, J. Cao, J. F. Chang, Y. Chang, H. S. Chen, H. Y. Chen, S. M. Chen, Y. Chen, Y. X. Chen, Z. Y. Chen, J. Cheng, Y. C. Cheng, Z. K. Cheng, J. J. Cherwinka, M. C. Chu, J. P. Cummings, O. Dalager, F. S. Deng, X. Y. Ding, Y. Y. Ding , et al. (176 additional authors not shown)

Abstract: This Letter presents results of a search for the mixing of a sub-eV sterile neutrino with three active neutrinos based on the full data sample of the Daya Bay Reactor Neutrino Experiment, collected during 3158 days of detector operation, which contains $5.55 \times 10^{6}$ reactor $\bar{ν}_e$ candidates identified as inverse beta-decay interactions followed by neutron-capture on gadolinium. The analysis benefits from a doubling of the statistics of our previous result and from improvements of several important systematic uncertainties. No significant oscillation due to mixing of a sub-eV sterile neutrino with active neutrinos was found. Exclusion limits are set by both Feldman-Cousins and CLs methods. Light sterile neutrino mixing with $\sin^2 2θ_{14} \gtrsim 0.01$ can be excluded at 95\% confidence level in the region of $0.01$ eV$^2 \lesssim |Δm^{2}_{41}| \lesssim 0.1 $ eV$^2$. This result represents the world-leading constraints in the region of $2 \times 10^{-4}$ eV$^2 \lesssim |Δm^{2}_{41}| \lesssim 0.2 $ eV$^2$.

Submitted 15 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 7 pages, 4 figures, 1 table

arXiv:2403.18005 [pdf, ps, other]

The Tale of Three Scales: the Planck, the Species, and the Black Hole Scales

Authors: Alek Bedroya, Cumrun Vafa, David H. Wu

Abstract: Quantum gravity (QG) has a natural cutoff given by the Planck scale $M_{\rm pl}$. However, it is known that the EFT of gravity can break down at a lower scale, the species scale $Λ_s\lesssim M_{\rm pl}$, if there are light species of particles. Here we point out that there is a third scale $Λ_{\rm BH}\lesssim Λ_s\lesssim M_{\rm pl}$, which marks the inverse length (or the temperature) of the smallest black hole where the EFT gives a correct description of its entropy and free energy. This latter scale is hard to detect from the viewpoint of EFT as it represents a phase transition to a state with lower free energy. We illustrate this using examples drawn from consistent QG landscape. In particular $Λ_{\rm BH}$ gets related to Gregory--Laflamme transition in the decompactification limits of quantum gravity and to the Horowitz--Polchinski solution in the light perturbative string limits. We propose the existence of $Λ_{\rm BH}$ marking the temperature at which neutral black holes undergo a phase transition, as a new Swampland condition for all consistent quantum theories of gravity. In the asymptotic regimes of field space $Λ_{\rm BH}$ is close to the mass scale of the lightest tower but deviates from it as we move inwards in the moduli space.

Submitted 4 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: 8 pages, 2 figures

arXiv:2403.17413 [pdf, other]

LM-Combiner: A Contextual Rewriting Model for Chinese Grammatical Error Correction

Authors: Yixuan Wang, Baoxin Wang, Yijun Liu, Dayong Wu, Wanxiang Che

Abstract: Over-correction is a critical problem in the Chinese grammatical error correction (CGEC) task. Recent work using model ensemble methods based on voting can effectively mitigate over-correction and improve the precision of the GEC system. However, these methods still require the output of several GEC systems and inevitably lead to reduced error recall. In this light, we propose the LM-Combiner, a rewriting model that can directly modify the over-correction of GEC system outputs without a model ensemble. Specifically, we train the model on an over-correction dataset constructed through the proposed K-fold cross inference method, which allows it to directly generate filtered sentences by combining the original and the over-corrected text. In the inference stage, we directly take the original sentences and the output results of other systems as input and then obtain the filtered sentences through LM-Combiner. Experiments on the FCGEC dataset show that our proposed method effectively alleviates the over-correction of the original system (+18.2 Precision) while ensuring the error recall remains unchanged. Besides, we find that LM-Combiner still has a good rewriting performance even with small parameters and few training data, and thus can cost-effectively mitigate the over-correction of black-box GEC systems (e.g., ChatGPT).

Submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted to COLING 2024

arXiv:2403.17312 [pdf, other]

ALISA: Accelerating Large Language Model Inference via Sparsity-Aware KV Caching

Authors: Youpeng Zhao, Di Wu, Jun Wang

Abstract: The Transformer architecture has significantly advanced natural language processing (NLP) and has been foundational in developing large language models (LLMs) such as LLaMA and OPT, which have come to dominate a broad range of NLP tasks. Despite their superior accuracy, LLMs present unique challenges in practical inference, concerning the compute and memory-intensive nature. Thanks to the autoregressive characteristic of LLM inference, KV caching for the attention layers in Transformers can effectively accelerate LLM inference by substituting quadratic-complexity computation with linear-complexity memory accesses. Yet, this approach requires increasing memory as demand grows for processing longer sequences. The overhead leads to reduced throughput due to I/O bottlenecks and even out-of-memory errors, particularly on resource-constrained systems like a single commodity GPU. In this paper, we propose ALISA, a novel algorithm-system co-design solution to address the challenges imposed by KV caching. On the algorithm level, ALISA prioritizes tokens that are most important in generating a new token via a Sparse Window Attention (SWA) algorithm. SWA introduces high sparsity in attention layers and reduces the memory footprint of KV caching at negligible accuracy loss. On the system level, ALISA employs three-phase token-level dynamical scheduling and optimizes the trade-off between caching and recomputation, thus maximizing the overall performance in resource-constrained systems.
In a single GPU-CPU system, we demonstrate that under varying workloads, ALISA improves the throughput of baseline systems such as FlexGen and vLLM by up to 3X and 1.9X, respectively.

Submitted 25 March, 2024; originally announced March 2024.

Comments: ISCA 2024

arXiv:2403.16683 [pdf, other]

Optimal Mass Transport of Nonlinear Systems under Input and Density Constraints

Authors: Dongjun Wu, Anders Rantzer

Abstract: We investigate the optimal mass transport problem for affine-nonlinear dynamical systems with input and density constraints. Three algorithms are proposed to tackle this problem, including two Uzawa-type methods and a splitting algorithm based on the Douglas-Rachford algorithm. Some preliminary simulation results are presented to demonstrate the effectiveness of our approaches.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.16073 [pdf, other]

Combining Fine-Tuning and LLM-based Agents for Intuitive Smart Contract Auditing with Justifications

Authors: Wei Ma, Daoyuan Wu, Yuqiang Sun, Tianwen Wang, Shangqing Liu, Jian Zhang, Yue Xue, Yang Liu

Abstract: Smart contracts are decentralized applications built atop blockchains like Ethereum. Recent research has shown that large language models (LLMs) have potential in auditing smart contracts, but the state-of-the-art indicates that even GPT-4 can achieve only 30% precision (when both decision and justification are correct). This is likely because off-the-shelf LLMs were primarily pre-trained on a general text/code corpus and not fine-tuned on the specific domain of Solidity smart contract auditing. In this paper, we propose TrustLLM, a general framework that combines fine-tuning and LLM-based agents for intuitive smart contract auditing with justifications. Specifically, TrustLLM is inspired by the observation that expert human auditors first perceive what could be wrong and then perform a detailed analysis of the code to identify the cause. As such, TrustLLM employs a two-stage fine-tuning approach: it first tunes a Detector model to make decisions and then tunes a Reasoner model to generate causes of vulnerabilities. However, fine-tuning alone faces challenges in accurately identifying the optimal cause of a vulnerability. Therefore, we introduce two LLM-based agents, the Ranker and Critic, to iteratively select and debate the most suitable cause of vulnerability based on the output of the fine-tuned Reasoner model. To evaluate TrustLLM, we collected a balanced dataset with 1,734 positive and 1,810 negative samples to fine-tune TrustLLM. We then compared it with traditional fine-tuned models (CodeBERT, GraphCodeBERT, CodeT5, and UnixCoder) as well as prompt learning-based LLMs (GPT4, GPT-3.5, and CodeLlama-13b/34b). On a dataset of 263 real smart contract vulnerabilities, TrustLLM achieves an F1 score of 91.21% and an accuracy of 91.11%. The causes generated by TrustLLM achieved a consistency of about 38% compared to the ground truth causes.

Submitted 24 March, 2024; originally announced March 2024.

arXiv:2403.15432 [pdf, other]

BRIEDGE: EEG-Adaptive Edge AI for Multi-Brain to Multi-Robot Interaction

Authors: Jinhui Ouyang, Mingzhu Wu, Xinglin Li, Hanhui Deng, Di Wu

Abstract: Recent advances in EEG-based BCI technologies have revealed the potential of brain-to-robot collaboration through the integration of sensing, computing, communication, and control. In this paper, we present BRIEDGE as an end-to-end system for multi-brain to multi-robot interaction through an EEG-adaptive neural network and an encoding-decoding communication framework, as illustrated in Fig.1. As depicted, the edge mobile server or edge portable server will collect EEG data from the users and utilize the EEG-adaptive neural network to identify the users' intentions. The encoding-decoding communication framework then encodes the EEG-based semantic information and decodes it into commands in the process of data transmission. To better extract the joint features of heterogeneous EEG data as well as enhance classification accuracy, BRIEDGE introduces an informer-based ProbSparse self-attention mechanism. Meanwhile, parallel and secure transmissions for multi-user multi-task scenarios under physical channels are addressed by dynamic autoencoder and autodecoder communications. From mobile computing and edge AI perspectives, model compression schemes composed of pruning, weight sharing, and quantization are also used to deploy lightweight EEG-adaptive models running on both transmitter and receiver sides. Based on the effectiveness of these components, a code map representing various commands enables multiple users to control multiple intelligent agents concurrently. Our experiments in comparison with state-of-the-art works show that BRIEDGE achieves the best classification accuracy of heterogeneous EEG data, and more stable performance under noisy environments.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.15264 [pdf, ps, other]

Control contraction metrics on Lie groups

Authors: Dongjun Wu, Bowen Yi, Ian R. Manchester

Abstract: In this paper, we extend the control contraction metrics (CCM) approach, which was originally proposed for the universal tracking control of nonlinear systems, to systems that evolve on Lie groups. Our idea is to view the manifold as a constrained set that is embedded in Euclidean space, and then propose sufficient conditions for the existence of a CCM and the associated controller design. Notably, we demonstrate that the search for CCM on Lie groups can be reformulated as convex conditions. The results extend the applicability of the CCM approach and provide a framework for analyzing the behavior of control systems with Lie group structures.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.15191 [pdf, other]

VORTEX: Real-Time Off-Chain Payments and Cross-Chain Swaps for Cryptocurrencies

Authors: Di Wu, Jian Liu, Zhengwei Hou, Wu Wen, Kui Ren

Abstract: In this paper, we present VERTEX, a TEE-based layer-2 solution that tackles two crucial challenges in the realm of cryptocurrencies: off-chain payments and cross-chain swaps. It offers three notable features: - Channel-free off-chain payments: it allows a payer to make direct payments to anyone without requiring any on-chain relationship or intermediary channels. - Real-time yet decentralized cross-chain swaps: it is the first known solution that enables real-time cross-chain swaps without relying on a central server. This novel feature is made possible through a ground-breaking fair exchange protocol. - TEE crash-tolerance: it offers two solutions to handle TEE crashes, one of which involves an innovative application of time-lock puzzles in this context. We evaluate ECHO on a network consisting of 1000 nodes, and the evaluation results show that ECHO can achieve 7000 TPS.

Submitted 5 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.13586 [pdf, other]

Non-Equilibrium and Self-Organization Evolution in Hot-Spot Ignition Processes

Authors: X. -Y. Fu, Z. -Y. Guo, Q. -H. Wang, R. -C. Wang, D. Wu, J. Zhang

Abstract: Due to disparate formation mechanisms, as for central hot-spot ignition and fast ignition, the initial temperatures of electrons and ions usually differ from each other in the hot spot. Considering the percipient dependence of fusion cross-section and energy losses on temperature, this difference manifests the inadequacy of the equilibrium theoretical model in accurately depicting the ignition condition and evolution of the hot-spot. In this work, we studied a non-equilibrium model and extended this model to both isobaric and isochoric scenarios, characterized by varying hot-spot densities, temperatures and expansion velocities. In both cases, a spontaneous self-organization evolution was observed, manifesting as the bifurcation of ion and electron temperatures. Notably, the ion temperature is particularly prominent during the ignition process. This inevitability can be traced to the preponderant deposition rates of alpha-particles into D-T ions and the decreasing rate of energy exchange between electrons and D-T ions at elevated temperatures. The inherent structure, characterized by higher ion temperature and lower electron temperature during ignition, directly contributes to the augmentation of D-T reactions and mitigates energy losses through electron conduction and bremsstrahlung, thereby naturally facilitating nuclear fusions.

Submitted 20 March, 2024; originally announced March 2024.

Comments: The junior undergraduate students, X.-Y. Fu, Z.-Y. Guo, Q.-H. Wang, and R.-C. Wang, all contributed equally to this work

arXiv:2403.11700 [pdf, other]

Virbo: Multimodal Multilingual Avatar Video Generation in Digital Marketing

Authors: Juan Zhang, Jiahao Chen, Cheng Wang, Zhiwang Yu, Tangquan Qi, Can Liu, Di Wu

Abstract: With the widespread popularity of internet celebrity marketing all over the world, short video production has gradually become a popular way of presenting product information. However, the traditional video production industry usually includes a series of procedures such as script writing, video filming in a professional studio, video clipping, special effects rendering, customized post-processing, and so forth. In addition, multilingual videos are not accessible to those who cannot speak multiple languages. These complicated procedures usually need a professional team to complete, which makes short video production costly in both time and money. This paper presents an intelligent system that supports the automatic generation of talking avatar videos, namely Virbo. With simply a user-specified script, Virbo can use a deep generative model to generate a target talking video. Meanwhile, the system also supports multimodal inputs to customize the video with a specified face, specified voice and special effects. The system also integrates a multilingual customization module that supports generating multilingual talking avatar videos in a batch with hundreds of delicate templates and creative special effects. Through a series of user studies and demo tests, we found that Virbo can generate talking avatar videos of a quality comparable to those from a professional team while reducing the entire production costs significantly. This intelligent system will effectively promote the video production industry and facilitate internet marketing regardless of language barriers and cost challenges.

Submitted 22 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.10059 [pdf, other]

Repoformer: Selective Retrieval for Repository-Level Code Completion

Authors: Di Wu, Wasi Uddin Ahmad, Dejiao Zhang, Murali Krishna Ramanathan, Xiaofei Ma

Abstract: Recent advances in retrieval-augmented generation (RAG) have initiated a new era in repository-level code completion. However, the invariable use of retrieval in existing methods exposes issues in both efficiency and robustness, with a large proportion of the retrieved contexts proving unhelpful or harmful to code language models (code LMs). In this paper, we propose a selective RAG framework to avoid retrieval when unnecessary. To power this framework, we design a self-supervised learning approach to enable a code LM to accurately self-evaluate whether retrieval can improve its output quality and robustly leverage the potentially noisy retrieved contexts. Using this LM as both the selective RAG policy and the generation model, our framework achieves state-of-the-art repository-level code completion performance on diverse benchmarks including RepoEval, CrossCodeEval, and CrossCodeLongEval, a new long-form code completion benchmark. Meanwhile, our analyses show that selectively retrieving brings as much as 70% inference speedup in the online serving setting without harming the performance. We further demonstrate that our framework is able to accommodate different generation models, retrievers, and programming languages. These advancements position our framework as an important step towards more accurate and efficient repository-level code completion.

Submitted 4 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

Comments: ICML 2024

arXiv:2403.09485 [pdf, other]

Dynamical pressure boundary condition for weakly-compressible smoothed particle hydrodynamics

Authors: Shuoguo Zhang, Yu Fan, Dong Wu, Chi Zhang, Xiangyu Hu

Abstract: This paper introduces a novel dynamical pressure boundary condition for weakly-compressible smoothed particle hydrodynamics (WCSPH). Unlike previous methods that rely on indirect approaches or ghost particles, our method integrates the dynamical boundary pressure directly into the SPH approximation of the pressure gradient on near-boundary particles. Additionally, we develop a meshfree bidirectional in-/outflow buffer by periodically relabelling buffer particles at each time step, a concept that has not been explored before. This simple yet effective buffer facilitates the simulation of both uni- and bidirectional flows, especially those with mixed in-/outflow boundary conditions. We validate the accuracy and convergence of our method through benchmark cases with available analytical solutions. Furthermore, we demonstrate its versatility in hemodynamic simulations by investigating generic carotid and aorta flows with the Windkessel model, paving the way for studying the cardiovascular system within a unified meshfree computational framework.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 40 pages and 15 figures

arXiv:2403.08651 [pdf, other]

HAIFIT: Human-Centered AI for Fashion Image Translation

Authors: Jianan Jiang, Xinglin Li, Weiren Yu, Di Wu

Abstract: In the realm of fashion design, sketches serve as the canvas for expressing an artist's distinctive drawing style and creative vision, capturing intricate details like stroke variations and texture nuances. The advent of sketch-to-image cross-modal translation technology has notably aided designers. However, existing methods often compromise these sketch details during image generation, resulting in images that deviate from the designer's intended concept. This limitation hampers the ability to offer designers a precise preview of the final output. To overcome this challenge, we introduce HAIFIT, a novel approach that transforms sketches into high-fidelity, lifelike clothing images by integrating multi-scale features and capturing extensive feature map dependencies from diverse perspectives. Through extensive qualitative and quantitative evaluations conducted on our self-collected dataset, our method demonstrates superior performance compared to existing methods in generating photorealistic clothing images. Our method excels in preserving the distinctive style and intricate details essential for fashion design applications.

Submitted 25 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: 8 pages,8 figures

arXiv:2403.08200 [pdf, ps, other]

Prototyping and Experimental Results for Environment-Aware Millimeter Wave Beam Alignment via Channel Knowledge Map

Authors: Zhuoyin Dai, Di Wu, Zhenjun Dong, Kun Li, Dingyang Ding, Sihan Wang, Yong Zeng

Abstract: Channel knowledge map (CKM), which aims to directly reflect the intrinsic channel properties of the local wireless environment, is a novel technique for achieving environment-aware communication. In this paper, to alleviate the large training overhead in millimeter wave (mmWave) beam alignment, an environment-aware and training-free beam alignment prototype is established based on a typical CKM, termed beam index map (BIM). To this end, a general CKM construction method is first presented, and an indoor BIM is constructed offline to learn the candidate transmit and receive beam index pairs for each grid in the experimental area. Furthermore, based on the location information of the receiver (or the dynamic obstacles) from the ultra-wide band (UWB) positioning system, the established BIM is used to achieve training-free beam alignment by directly providing the beam indexes for the transmitter and receiver. Three typical scenarios are considered in the experiment, including a quasi-static environment with a line-of-sight (LoS) link, a quasi-static environment without a LoS link, and a dynamic environment. Besides, the receiver orientation measured from the gyroscope is also used to help CKM predict more accurate beam indexes. The experiment results show that compared with the benchmark location-based beam alignment strategy, the CKM-based beam alignment strategy can achieve much higher received power, which is close to that achieved by exhaustive beam search, but with significantly reduced training overhead.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.07804

Type IV-like Solar Radio Burst Consisting of a Series of Spikes Observed by PSP

Authors: Bing Ma, Ling Chen, De-Jin Wu, Marc Pulupa, Stuart D. Bale

Abstract: Solar and interplanetary radio bursts can reflect the existence and motion of energetic electrons and are therefore a vital phenomenon in solar activities. The present study reports a solar radio burst (SRB) event observed by Parker Solar Probe (PSP) in its 8th orbital encounter phase; it lasted about 20 hours in a frequency range of 0.5-15 MHz and is called the type IV-like SRB. This type IV-like SRB consists of a series of numerous spikes with the center frequency drifting slowly from ~5 MHz to ~1 MHz, and each individual spike shows a much faster frequency drift and has a narrow frequency range of a few MHz and a short duration of a few minutes. Based on commonly adopted empirical models of the solar atmosphere, combined with the in-situ measurements by PSP, we propose that these small-scale spikes were generated by a group of solitary kinetic Alfvén waves (SKAWs) in a magnetic loop accompanying coronal mass ejection (CME) and moving outwards, in which the frequency drift of each individual spike is caused by the SKAW's propagation and the center-frequency drift may be attributed to the motion of the magnetic loop.

Submitted 16 June, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

Comments: There are some questions about models and the emission mechanisms to be discussed more carefully. We need to revise this manuscript

arXiv:2403.07699 [pdf, other]

Ion Kinetics and Neutron Generation Associated with Electromagnetic Turbulence in Laboratory-scale Counter-streaming Plasmas

Authors: P. Liu, D. Wu, T. X. Hu, D. W. Yuan, G. Zhao, Z. M. Sheng, X. T. He, J. Zhang

Abstract: Electromagnetic turbulence and ion kinetics in counter-streaming plasmas hold great significance in laboratory astrophysics, such as turbulence field amplification and particle energization. Here, we quantitatively demonstrate for the first time how electromagnetic turbulence affects ion kinetics under achievable laboratory conditions (millimeter-scale interpenetrating plasmas with initial velocity of $2000\ \mathrm{km/s}$, density of $4 \times 10^{19}\ \mathrm{cm}^{-3}$, and temperature of $100\ \mathrm{eV}$) utilizing a recently developed high-order implicit particle-in-cell code without scaling transformation. It is found that the electromagnetic turbulence is driven by ion two-stream and filamentation instabilities. For the magnetized scenarios where an applied magnetic field of tens of Tesla is perpendicular to plasma flows, the growth rates of instabilities increase with the strengthening of applied magnetic field, which therefore leads to a significant enhancement of turbulence fields. Under the competition between the stochastic acceleration due to electromagnetic turbulence and collisional thermalization, ion distribution function shows a distinct super-Gaussian shape, and the ion kinetics are manifested in neutron yields and spectra. Our results have well explained the recent unmagnetized experimental observations, and the findings of magnetized scenario can be verified by current astrophysical experiments.

Submitted 12 March, 2024; originally announced March 2024.

Comments: Accepted by Phys. Rev. Lett. on 12 Mar

arXiv:2403.06838 [pdf, other]

ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts

Authors: Lyuye Zhang, Kaixuan Li, Kairan Sun, Daoyuan Wu, Ye Liu, Haoye Tian, Yang Liu

Abstract: Smart contracts are susceptible to various security issues, among which access control (AC) vulnerabilities are particularly critical. While existing research has proposed multiple detection tools, the automatic and appropriate repair of AC vulnerabilities in smart contracts remains a challenge. Unlike commonly supported vulnerability types by existing repair tools, such as reentrancy, which are usually fixed by template-based approaches, the main obstacle of AC lies in identifying the appropriate roles or permissions amid a long list of non-AC-related source code to generate proper patch code, a task that demands human-level intelligence. Leveraging recent advancements in large language models (LLMs), we employ the state-of-the-art GPT-4 model and enhance it with a novel approach called ACFIX. The key insight is that we can mine common AC practices for major categories of code functionality and use them to guide LLMs in fixing code with similar functionality. To this end, ACFIX involves both offline and online phases. First, during the offline phase, ACFIX mines a taxonomy of common Role-based Access Control (RBAC) practices from 344,251 on-chain contracts, categorizing 49 role-permission pairs from the top 1,000 pairs mined. Second, during the online phase, ACFIX tracks AC-related elements across the contract and uses this context information along with a Chain-of-Thought pipeline to guide LLMs in identifying the most appropriate role-permission pair for the subject contract and subsequently generating a suitable patch. This patch will then undergo a validity and effectiveness check. To evaluate ACFIX, we built the first benchmark dataset of 118 real-world AC vulnerabilities, and our evaluation revealed that ACFIX successfully repaired 94.92% of them. This represents a significant improvement compared to the baseline GPT-4, which achieved only 52.54%.

Submitted 18 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: This is a technical report from Nanyang Technological University

arXiv:2403.06531 [pdf]

Ultrafast switching of sliding ferroelectricity and dynamical magnetic field in van der Waals bilayer induced by light

Authors: Jian Wang, Xu Li, Xingyue Ma, Lan Chen, Jun-Ming Liu, Chun-Gang Duan, Jorge Íñiguez-González, Di Wu, Yurong Yang

Abstract: Sliding ferroelectricity is a unique type of polarity recently observed in a properly stacked van der Waals bilayer. However, electric-field control of sliding ferroelectricity is hard and could induce large coercive electric fields and serious leakage currents which corrode the ferroelectricity and electronic properties, which are essential for modern two-dimensional electronics and optoelectronics. Here, we proposed laser-pulse deterministic control of sliding ferroelectricity in bilayer h-BN by first principles and molecular dynamics simulation with machine-learned force fields. The laser pulses excite shear modes which exhibit certain directional movements of lateral sliding between bilayers. The vibration of excited modes under laser pulses is predicted to overcome the energy barrier and achieve the switching of sliding ferroelectricity. Furthermore, it is found that three possible sliding transitions - between AB (BA) and BA (AB) stacking - can lead to the occurrence of dynamical magnetic fields along three different directions. Remarkably, the magnetic fields are generated by the simple linear motion of nonmagnetic species, without any need for more exotic (circular, spiral) pathways. Such predictions of deterministic control of sliding ferroelectricity and multi-states of dynamical magnetic field thus expand the potential applications of sliding ferroelectricity in memory and electronic devices.

Submitted 11 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.06069 [pdf, other]

Implicit Image-to-Image Schrodinger Bridge for CT Super-Resolution and Denoising

Authors: Yuang Wang, Siyeop Yoon, Pengfei Jin, Matthew Tivnan, Zhennong Chen, Rui Hu, Li Zhang, Zhiqiang Chen, Quanzheng Li, Dufan Wu

Abstract: Conditional diffusion models have gained recognition for their effectiveness in image restoration tasks, yet their iterative denoising process, starting from Gaussian noise, often leads to slow inference speeds. As a promising alternative, the Image-to-Image Schrödinger Bridge (I2SB) initializes the generative process from corrupted images and integrates training techniques from conditional diffusion models. In this study, we extended the I2SB method by introducing the Implicit Image-to-Image Schrodinger Bridge (I3SB), transitioning its generative process to a non-Markovian process by incorporating corrupted images in each generative step. This enhancement empowers I3SB to generate images with better texture restoration using a small number of generative steps. The proposed method was validated on CT super-resolution and denoising tasks and outperformed existing methods, including the conditional denoising diffusion probabilistic model (cDDPM) and I2SB, in both visual quality and quantitative metrics. These findings underscore the potential of I3SB in improving medical image restoration by providing fast and accurate generative modeling.

Submitted 9 March, 2024; originally announced March 2024.

arXiv:2403.05851

Interest-Aware Joint Caching, Computing, and Communication Optimization for Mobile VR Delivery in MEC Networks

Authors: Baojie Fu, Tong Tang, Dapeng Wu, Ruyan Wang

Abstract: In the upcoming B5G/6G era, virtual reality (VR) over wireless has become a typical application, which is an inevitable trend in the development of video. However, in immersive and interactive VR experiences, VR services typically exhibit high delay, while simultaneously posing challenges for the energy consumption of local devices. To address these issues, this paper aims to improve the performance of the VR service in the edge-terminal cooperative system. Specifically, we formulate a problem of joint caching, computing, and communication VR service policy, by optimizing the weighted sum of overall VR delivery delay and energy consumption of local devices. For the purpose of designing the optimal VR service policy, the optimization problem is decoupled into three independent subproblems to be solved separately. To enhance the caching efficiency within the network, a bidirectional encoder representations from transformers (BERT)-based user interest analysis method is first proposed to characterize the content requesting behavior accurately. On the basis of this, a service cost minimum-maximization problem is formulated with consideration of performance fairness among users. Thereafter, the joint caching and computing scheme is derived for each user with given allocation of communication resources while a bisection-based communication scheme is acquired with the given information on joint caching and computing policy.
With alternative optimization, an optimal policy for joint caching, computing and communication based on user interest can be finally obtained. Simulation results are presented to demonstrate the superiority of the proposed user interest-aware caching scheme and the effectiveness of the joint caching, computing and communication optimization policy with consideration of user fairness.

Submitted 9 March, 2024; originally announced March 2024.

arXiv:2403.05530

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k).
Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.

Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.
