-
The Transmission Value of Energy Storage and Fundamental Limitations
Authors:
Qian Zhang,
P. R. Kumar,
Le Xie
Abstract:
This study addresses the transmission value of energy storage in electric grids. The inherent connection between storage and transmission infrastructure is captured from a "cumulative energy" perspective, which enables the reformulating of the conventional optimization problem by employing line power flow as the decision variable. The study also establishes the theoretical limitations of both storage and transmission lines that can be replaced by each other, providing explicit closed-form expressions for the minimum capacity needed. As a key departure from conventional practice in which transmission lines are designed according to the peak power delivery needs, with sufficient storage capacity, the transmission line capacity can be designed based on the average power delivery needs. The models of this paper only rely on a few basic assumptions, paving the way for understanding future storage as a transmission asset market design. Numerical experiments based on 2-bus, modified RTS 24-bus, RTS-GMLC, and Texas synthetic power systems illustrate the results.
Submitted 12 July, 2024;
originally announced July 2024.
-
Power Optimization and Deep Learning for Channel Estimation of Active IRS-Aided IoT
Authors:
Yan Wang,
Wei Gao,
Qi Zhang,
Jiajia Liu,
Feng Shu
Abstract:
In this paper, channel estimation of an active intelligent reflecting surface (IRS) aided uplink Internet of Things (IoT) network is investigated. Firstly, the least square (LS) estimators for the direct channel and the cascaded channel are presented, respectively. The corresponding mean square errors (MSE) of channel estimators are derived. Subsequently, in order to evaluate the influence of adjusting the transmit power at the IoT devices or the reflected power at the active IRS on Sum-MSE performance, two situations are considered. In the first case, under the total power sum constraint of the IoT devices and active IRS, the closed-form expression of the optimal power allocation factor is derived. In the second case, when the transmit power at the IoT devices is fixed, there exists an optimal reflective power at active IRS. To further improve the estimation performance, the convolutional neural network (CNN)-based direct channel estimation (CDCE) algorithm and the CNN-based cascaded channel estimation (CCCE) algorithm are designed. Finally, simulation results demonstrate the existence of an optimal power allocation strategy that minimizes the Sum-MSE, and further validate the superiority of the proposed CDCE / CCCE algorithms over their respective traditional LS and minimum mean square error (MMSE) baselines.
Submitted 12 July, 2024;
originally announced July 2024.
-
Neural Poisson Solver: A Universal and Continuous Framework for Natural Signal Blending
Authors:
Delong Wu,
Hao Zhu,
Qi Zhang,
You Li,
Zhan Ma,
Xun Cao
Abstract:
Implicit Neural Representation (INR) has become a popular method for representing visual signals (e.g., 2D images and 3D scenes), demonstrating promising results in various downstream applications. Given its potential as a medium for visual signals, exploring the development of a neural blending method that utilizes INRs is a natural progression. Neural blending involves merging two INRs to create a new INR that encapsulates information from both original representations. A direct approach involves applying traditional image editing methods to the INR rendering process. However, this method often results in blending distortions, artifacts, and color shifts, primarily due to the discretization of the underlying pixel grid and the introduction of boundary conditions for solving variational problems. To tackle this issue, we introduce the Neural Poisson Solver, a plug-and-play and universally applicable framework across different signal dimensions for blending visual signals represented by INRs. Our Neural Poisson Solver offers a variational problem-solving approach based on the continuous Poisson equation, demonstrating exceptional performance across various domains. Specifically, we propose a gradient-guided neural solver to represent the solution process of the variational problem, refining the target signal to achieve natural blending results. We also develop a Poisson equation-based loss and optimization scheme to train our solver, ensuring it effectively blends the input INR scenes while preserving their inherent structure and semantic content. The lack of dependence on additional prior knowledge makes our method easily adaptable to various task categories, highlighting its versatility. Comprehensive experimental results validate the robustness of our approach across multiple dimensions and blending tasks.
Submitted 11 July, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
The Human Factor in AI Red Teaming: Perspectives from Social and Collaborative Computing
Authors:
Alice Qian Zhang,
Ryland Shaw,
Jacy Reese Anthis,
Ashlee Milton,
Emily Tseng,
Jina Suh,
Lama Ahmad,
Ram Shankar Siva Kumar,
Julian Posada,
Benjamin Shestakofsky,
Sarah T. Roberts,
Mary L. Gray
Abstract:
Rapid progress in general-purpose AI has sparked significant interest in "red teaming," a practice of adversarial testing originating in military and cybersecurity applications. AI red teaming raises many questions about the human factor, such as how red teamers are selected, biases and blindspots in how tests are conducted, and harmful content's psychological effects on red teamers. A growing body of HCI and CSCW literature examines related practices-including data labeling, content moderation, and algorithmic auditing. However, few, if any, have investigated red teaming itself. This workshop seeks to consider the conceptual and empirical challenges associated with this practice, often rendered opaque by non-disclosure agreements. Future studies may explore topics ranging from fairness to mental health and other areas of potential harm. We aim to facilitate a community of researchers and practitioners who can begin to meet these challenges with creativity, innovation, and thoughtful reflection.
Submitted 10 July, 2024;
originally announced July 2024.
-
S&D Messenger: Exchanging Semantic and Domain Knowledge for Generic Semi-Supervised Medical Image Segmentation
Authors:
Qixiang Zhang,
Haonan Wang,
Xiaomeng Li
Abstract:
Semi-supervised medical image segmentation (SSMIS) has emerged as a promising solution to tackle the challenges of time-consuming manual labeling in the medical field. However, in practical scenarios, there are often domain variations within the datasets, leading to derivative scenarios like semi-supervised medical domain generalization (Semi-MDG) and unsupervised medical domain adaptation (UMDA). In this paper, we aim to develop a generic framework that masters all three tasks. We notice a critical shared challenge across three scenarios: the explicit semantic knowledge for segmentation performance and rich domain knowledge for generalizability exclusively exist in the labeled set and unlabeled set respectively. Such discrepancy hinders existing methods from effectively comprehending both types of knowledge under semi-supervised settings. To tackle this challenge, we develop a Semantic & Domain Knowledge Messenger (S&D Messenger) which facilitates direct knowledge delivery between the labeled and unlabeled set, and thus allowing the model to comprehend both of them in each individual learning flow. Equipped with our S&D Messenger, a naive pseudo-labeling method can achieve huge improvement on six benchmark datasets for SSMIS (+7.5%), UMDA (+5.6%), and Semi-MDG tasks (+1.14%), compared with state-of-the-art methods designed for specific tasks.
Submitted 10 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15\sigma$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
Submitted 10 July, 2024;
originally announced July 2024.
-
In-Orbit Processing or Not? Sunlight-Aware Task Scheduling for Energy-Efficient Space Edge Computing Networks
Authors:
Weisen Liu,
Zeqi Lai,
Qian Wu,
Hewu Li,
Qi Zhang,
Zonglun Li,
Yuanjie Li,
Jun Liu
Abstract:
With the rapid evolution of space-borne capabilities, space edge computing (SEC) is becoming a new computation paradigm for future integrated space and terrestrial networks. Satellite edges adopt advanced on-board hardware, which not only enables new opportunities to perform complex intelligent tasks in orbit, but also involves new challenges due to the additional energy consumption in power-constrained space environment. In this paper, we present PHOENIX, an energy-efficient task scheduling framework for emerging SEC networks. PHOENIX exploits a key insight that in the SEC network, there always exist a number of sunlit edges which are illuminated during the entire orbital period and have sufficient energy supplement from the sun. PHOENIX accomplishes energy-efficient in-orbit computing by judiciously offloading space tasks to "sunlight-sufficient" edges or to the ground. Specifically, PHOENIX first formulates the SEC battery energy optimizing (SBEO) problem which aims at minimizing the average battery energy consumption while satisfying various task completion constraints. Then PHOENIX incorporates a sunlight-aware scheduling mechanism to solve the SBEO problem and schedule SEC tasks efficiently. Finally, we implement a PHOENIX prototype and build an SEC testbed. Extensive data-driven evaluations demonstrate that as compared to other state-of-the-art solutions, PHOENIX can effectively reduce up to 54.8% SEC battery energy consumption and prolong battery lifetime to 2.9$\times$ while still completing tasks on time.
Submitted 9 July, 2024;
originally announced July 2024.
-
Bayesian Federated Learning with Hamiltonian Monte Carlo: Algorithm and Theory
Authors:
Jiajun Liang,
Qian Zhang,
Wei Deng,
Qifan Song,
Guang Lin
Abstract:
This work introduces a novel and efficient Bayesian federated learning algorithm, namely, the Federated Averaging stochastic Hamiltonian Monte Carlo (FA-HMC), for parameter estimation and uncertainty quantification. We establish rigorous convergence guarantees of FA-HMC on non-iid distributed data sets, under the strong convexity and Hessian smoothness assumptions. Our analysis investigates the effects of parameter space dimension, noise on gradients and momentum, and the frequency of communication (between the central node and local nodes) on the convergence and communication costs of FA-HMC. Beyond that, we establish the tightness of our analysis by showing that the convergence rate cannot be improved even for continuous FA-HMC process. Moreover, extensive empirical studies demonstrate that FA-HMC outperforms the existing Federated Averaging-Langevin Monte Carlo (FA-LD) algorithm.
Submitted 9 July, 2024;
originally announced July 2024.
-
Digging into the Interior of Hot Cores with ALMA (DIHCA). IV. Fragmentation in High-mass Star-Forming Clumps
Authors:
Kosuke Ishihara,
Patricio Sanhueza,
Fumitaka Nakamura,
Masao Saito,
Huei-Ru V. Chen,
Shanghuo Li,
Fernando Olguin,
Kotomi Taniguchi,
Kaho Morii,
Xing Lu,
Qiuyi Luo,
Takeshi Sakai,
Qizhou Zhang
Abstract:
Fragmentation contributes to the formation and evolution of stars. Observationally, high-mass stars are known to form multiple-star systems, preferentially in cluster environments. Theoretically, Jeans instability has been suggested to determine characteristic fragmentation scales, and thermal or turbulent motion in the parental gas clump mainly contributes to the instability. To search for such a characteristic fragmentation scale, we have analyzed ALMA 1.33 mm continuum observations toward 30 high-mass star-forming clumps taken by the Digging into the Interior of Hot Cores with ALMA (DIHCA) survey. We have identified 573 cores using the dendrogram algorithm and measured the separation of cores by using the Minimum Spanning Tree (MST) technique. The core separation corrected by projection effects has a distribution peaked around 5800 au. In order to remove biases produced by different distances and sensitivities, we further smooth the images to a common physical scale and perform completeness tests. Our careful analysis finds a characteristic fragmentation scale of $\sim$7000 au, comparable to the thermal Jeans length of the clumps. We conclude that thermal Jeans fragmentation plays a dominant role in determining the clump fragmentation in high-mass star-forming regions, without the need of invoking turbulent Jeans fragmentation.
Submitted 9 July, 2024;
originally announced July 2024.
-
Heuristic Predictive Control for Multi-Robot Flocking in Congested Environments
Authors:
Guobin Zhu,
Qingrui Zhang,
Bo Zhu,
Tianjiang Hu
Abstract:
Multi-robot flocking possesses extraordinary advantages over a single-robot system in diverse domains, but it is challenging to ensure safe and optimal performance in congested environments. Hence, this paper is focused on the investigation of distributed optimal flocking control for multiple robots in crowded environments. A heuristic predictive control solution is proposed based on a Gibbs Random Field (GRF), in which bio-inspired potential functions are used to characterize robot-robot and robot-environment interactions. The optimal solution is obtained by maximizing a posteriori joint distribution of the GRF in a certain future time instant. A gradient-based heuristic solution is developed, which could significantly speed up the computation of the optimal control. Mathematical analysis is also conducted to show the validity of the heuristic solution. Multiple collision risk levels are designed to improve the collision avoidance performance of robots in dynamic environments. The proposed heuristic predictive control is evaluated comprehensively from multiple perspectives based on different metrics in a challenging simulation environment. The competence of the proposed algorithm is validated via the comparison with the non-heuristic predictive control and two existing popular flocking control methods. Real-life experiments are also performed using four quadrotor UAVs to further demonstrate the efficiency of the proposed design.
Submitted 9 July, 2024;
originally announced July 2024.
-
SKYCASTLE: Taming LEO Mobility to Facilitate Seamless and Low-latency Satellite Internet Services
Authors:
Jihao Li,
Hewu Li,
Zeqi Lai,
Qian Wu,
Weisen Liu,
Xiaomo Wang,
Yuanjie Li,
Jun Liu,
Qi Zhang
Abstract:
Emerging integrated space and terrestrial networks (ISTN) built upon low earth orbit (LEO) satellite constellations aim at providing planet-wide Internet services, not only for residential users, but also for mobile users (e.g., in airplane and cruise scenarios). Efficiently managing global mobility and keeping connections active for mobile users is critical for ISTN operators. However, our quantitative analysis identifies that existing mobility management (MM) schemes suffer from frequent connection interruptions and long latency in ISTN scenarios. The fundamental challenge stems from a unique characteristic of ISTNs: not only users are mobile, but also core network infrastructures (i.e., LEO satellites) are frequently changing their locations in the network. To facilitate seamless and low-latency satellite Internet services, this paper presents SKYCASTLE, a novel network-based global mobility management mechanism. SKYCASTLE incorporates two key techniques to address frequent connection interruptions in ISTNs. First, to reduce the interruption time, SKYCASTLE adopts distributed satellite anchors to track the location changes of mobile nodes, manage handovers and avoid routing convergence. Second, SKYCASTLE leverages an anchor manager to schedule MM functionalities at satellites to reduce deployment costs while guaranteeing low latency. Extensive evaluations combining real constellation information and mobile user trajectories show that: SKYCASTLE can improve up to 55.8% uninterrupted time and reduce 47.8% latency as compared to other existing MM solutions.
Submitted 9 July, 2024;
originally announced July 2024.
-
LuSNAR:A Lunar Segmentation, Navigation and Reconstruction Dataset based on Muti-sensor for Autonomous Exploration
Authors:
Jiayi Liu,
Qianyu Zhang,
Xue Wan,
Shengyang Zhang,
Yaolin Tian,
Haodong Han,
Yutao Zhao,
Baichuan Liu,
Zeyuan Zhao,
Xubo Luo
Abstract:
With the complexity of lunar exploration missions, the moon needs to have a higher level of autonomy. Environmental perception and navigation algorithms are the foundation for lunar rovers to achieve autonomous exploration. The development and verification of algorithms require highly reliable data support. Most of the existing lunar datasets are targeted at a single task, lacking diverse scenes and high-precision ground truth labels. To address this issue, we propose a multi-task, multi-scene, and multi-label lunar benchmark dataset LuSNAR. This dataset can be used for comprehensive evaluation of autonomous perception and navigation systems, including high-resolution stereo image pairs, panoramic semantic labels, dense depth maps, LiDAR point clouds, and the position of rover. In order to provide richer scene data, we built 9 lunar simulation scenes based on Unreal Engine. Each scene is divided according to topographic relief and the density of objects. To verify the usability of the dataset, we evaluated and analyzed the algorithms of semantic segmentation, 3D reconstruction, and autonomous navigation. The experiment results prove that the dataset proposed in this paper can be used for ground verification of tasks such as autonomous environment perception and navigation, and provides a lunar benchmark dataset for testing the accessibility of algorithm metrics. We make LuSNAR publicly available at: https://github.com/autumn999999/LuSNAR-dataset.
Submitted 8 July, 2024;
originally announced July 2024.
-
Soli-enabled Noncontact Heart Rate Detection for Sleep and Meditation Tracking
Authors:
Luzhou Xu,
Jaime Lien,
Haiguang Li,
Nicholas Gillian,
Rajeev Nongpiur,
Jihan Li,
Qian Zhang,
Jian Cui,
David Jorgensen,
Adam Bernstein,
Lauren Bedal,
Eiji Hayashi,
Jin Yamanaka,
Alex Lee,
Jian Wang,
D Shin,
Ivan Poupyrev,
Trausti Thormundsson,
Anupam Pathak,
Shwetak Patel
Abstract:
Heart rate (HR) is a crucial physiological signal that can be used to monitor health and fitness. Traditional methods for measuring HR require wearable devices, which can be inconvenient or uncomfortable, especially during sleep and meditation. Noncontact HR detection methods employing microwave radar can be a promising alternative. However, the existing approaches in the literature usually use high-gain antennas and require the sensor to face the user's chest or back, making them difficult to integrate into a portable device and unsuitable for sleep and meditation tracking applications. This study presents a novel approach for noncontact HR detection using a miniaturized Soli radar chip embedded in a portable device (Google Nest Hub). The chip has a $6.5 \mbox{ mm} \times 5 \mbox{ mm} \times 0.9 \mbox{ mm}$ dimension and can be easily integrated into various devices. The proposed approach utilizes advanced signal processing and machine learning techniques to extract HRs from radar signals. The approach is validated on a sleep dataset (62 users, 498 hours) and a meditation dataset (114 users, 1131 minutes). The approach achieves a mean absolute error (MAE) of $1.69$ bpm and a mean absolute percentage error (MAPE) of $2.67\%$ on the sleep dataset. On the meditation dataset, the approach achieves an MAE of $1.05$ bpm and a MAPE of $1.56\%$. The recall rates for the two datasets are $88.53\%$ and $98.16\%$, respectively. This study represents the first application of the noncontact HR detection technology to sleep and meditation tracking, offering a promising alternative to wearable devices for HR monitoring during sleep and meditation.
Submitted 8 July, 2024;
originally announced July 2024.
-
What's Wrong with Your Code Generated by Large Language Models? An Extensive Study
Authors:
Shihan Dou,
Haoxiang Jia,
Shenxi Wu,
Huiyuan Zheng,
Weikang Zhou,
Muling Wu,
Mingxu Chai,
Jessica Fan,
Caishuang Huang,
Yunbo Tao,
Yan Liu,
Enyu Zhou,
Ming Zhang,
Yuhao Zhou,
Yueming Wu,
Rui Zheng,
Ming Wen,
Rongxiang Weng,
Jingang Wang,
Xunliang Cai,
Tao Gui,
Xipeng Qiu,
Qi Zhang,
Xuanjing Huang
Abstract:
The increasing development of large language models (LLMs) in code generation has drawn significant attention among researchers. To enhance LLM-based code generation ability, current efforts are predominantly directed towards collecting high-quality datasets and leveraging diverse training technologies. However, there is a notable lack of comprehensive studies examining the limitations and boundaries of these existing methods. To bridge this gap, we conducted an extensive empirical study evaluating the performance of three leading closed-source LLMs and four popular open-source LLMs on three commonly used benchmarks. Our investigation, which evaluated the length, cyclomatic complexity and API number of the generated code, revealed that these LLMs face challenges in generating successful code for more complex problems, and tend to produce code that is shorter yet more complicated as compared to canonical solutions. Additionally, we developed a taxonomy of bugs for incorrect codes that includes three categories and 12 sub-categories, and analyze the root cause for common bug types. Furthermore, to better understand the performance of LLMs in real-world projects, we manually created a real-world benchmark comprising 140 code generation tasks. Our analysis highlights distinct differences in bug distributions between actual scenarios and existing benchmarks. Finally, we propose a novel training-free iterative method that introduces self-critique, enabling LLMs to critique and correct their generated code based on bug types and compiler feedback. Experimental results demonstrate that our approach can significantly mitigate bugs and increase the passing rate by 29.2% after two iterations, indicating substantial potential for LLMs to handle more complex problems.
Submitted 8 July, 2024;
originally announced July 2024.
-
An Earth Rover dataset recorded at the ICRA@40 party
Authors:
Qi Zhang,
Zhihao Lin,
Arnoud Visser
Abstract:
The ICRA conference is celebrating its $40^{th}$ anniversary in Rotterdam in September 2024, with the Happy Birthday ICRA Party at the iconic Holland America Line Cruise Terminal as a highlight. One month later the IROS conference will take place, which will include the Earth Rover Challenge. In this challenge, open-world autonomous navigation models are studied in truly open-world settings.
As part of the Earth Rover Challenge, several real-world navigation sets have been recorded in several cities worldwide, like Auckland, Australia and Wuhan, China. The only dataset recorded in the Netherlands is from the small village of Oudewater. The proposal is to record a dataset with the robot used in the Earth Rover Challenge in Rotterdam, in front of the Holland America Line Cruise Terminal, before the festivities of the Happy Birthday ICRA Party start.
Submitted 8 July, 2024;
originally announced July 2024.
-
Collaborative Analysis for Paired A/B Testing Experiments
Authors:
Qiong Zhang,
Lulu Kang,
Xinwei Deng
Abstract:
With the extensive use of digital devices, online experimental platforms are commonly used to conduct experiments to collect data for evaluating different variations of products, algorithms, and interface designs, a.k.a., A/B tests. In practice, multiple A/B testing experiments are often carried out based on a common user population on the same platform. The same user's responses to different experiments can be correlated to some extent due to the individual effect of the user. In this paper, we propose a novel framework that collaboratively analyzes the data from paired A/B tests, namely, a pair of A/B testing experiments conducted on the same set of experimental subjects. The proposed analysis approach for paired A/B tests can lead to more accurate estimates than the traditional separate analysis of each experiment. We obtain the asymptotic distribution of the proposed estimators and demonstrate that the proposed estimators are asymptotically the best linear unbiased estimators under certain assumptions. Moreover, the proposed analysis approach is computationally efficient, easy to implement, and robust to different types of responses. Both numerical simulations and numerical studies based on a real case are used to examine the performance of the proposed method.
Submitted 7 July, 2024;
originally announced July 2024.
-
Unfolding a Hopf bifurcation in a linear reaction-diffusion equation with strongly localized impurity existence of breathing pulses
Authors:
Ji Li,
Qing Yu,
Qian Zhang
Abstract:
This paper presents a general framework to derive the weakly nonlinear stability near a Hopf bifurcation in a special class of multi-scale reaction-diffusion equations. The main focus is on how the linearity and nonlinearity of the fast variables in the system influence the emergence of breathing pulses when the slow variables are linear and the bifurcation parameter is near the Hopf bifurcation point. By applying the matching principle to the fast and slow changing quantities and using the relevant theory of singular perturbation, we obtain explicit expressions for the stationary pulses. Then, normal form theory and center manifold theory are applied to give Hopf normal form expressions. Finally, one of these expressions is verified by numerical simulation.
Submitted 7 July, 2024;
originally announced July 2024.
-
FlowLearn: Evaluating Large Vision-Language Models on Flowchart Understanding
Authors:
Huitong Pan,
Qi Zhang,
Cornelia Caragea,
Eduard Dragut,
Longin Jan Latecki
Abstract:
Flowcharts are graphical tools for representing complex concepts in concise visual representations. This paper introduces the FlowLearn dataset, a resource tailored to enhance the understanding of flowcharts. FlowLearn contains complex scientific flowcharts and simulated flowcharts. The scientific subset contains 3,858 flowcharts sourced from scientific literature and the simulated subset contains 10,000 flowcharts created using a customizable script. The dataset is enriched with annotations for visual components, OCR, Mermaid code representation, and VQA question-answer pairs. Despite the proven capabilities of Large Vision-Language Models (LVLMs) in various visual understanding tasks, their effectiveness in decoding flowcharts - a crucial element of scientific communication - has yet to be thoroughly investigated. The FlowLearn test set is crafted to assess the performance of LVLMs in flowchart comprehension. Our study thoroughly evaluates state-of-the-art LVLMs, identifying existing limitations and establishing a foundation for future enhancements in this relatively underexplored domain. For instance, in tasks involving simulated flowcharts, GPT-4V achieved the highest accuracy (58%) in counting the number of nodes, while Claude recorded the highest accuracy (83%) in OCR tasks. Notably, no single model excels in all tasks within the FlowLearn framework, highlighting significant opportunities for further development.
Submitted 9 July, 2024; v1 submitted 6 July, 2024;
originally announced July 2024.
-
Cost and Power-Consumption Analysis for Power Profile Monitoring with Multiple Monitors per Link in Optical Networks
Authors:
Qiaolun Zhang,
Patricia Layec,
Alix May,
Annalisa Morea,
Aryanaz Attarpour,
Massimo Tornatore
Abstract:
Network monitoring is essential to collect comprehensive data on signal quality in optical networks. As deploying large amounts of monitoring equipment results in elevated cost and power consumption, novel low-cost monitoring methods are continuously being investigated. A new technique called Power Profile Monitoring (PPM) has recently gained traction thanks to its ability to monitor an entire lightpath using a single post-processing unit at the lightpath receiver. PPM does not require deploying an individual monitor for each span, as in the traditional monitoring technique using an Optical Time-Domain Reflectometer (OTDR). PPM and OTDR have different monitoring applications, which will be elaborated in our discussion; hence they can be considered either alternative or complementary techniques according to the targeted monitoring capabilities to be implemented in the network. In this work, we aim to quantify the cost and power consumption of PPM (using OTDR as a baseline reference), as this analysis can provide guidelines for the implementation and deployment of PPM. First, we discuss how PPM and OTDR monitors are deployed, and we formally state a new Optimized Monitoring Placement (OMP) problem for PPM. Solving the OMP problem identifies the minimum number of PPM monitors that guarantees that all links in the network are monitored by at least n PPM monitors (note that using n > 1 allows for increased monitoring accuracy). We prove the NP-hardness of the OMP problem and formulate it using an Integer Linear Programming (ILP) model. Finally, we also devise a heuristic algorithm for the OMP problem to scale to larger topologies. Our numerical results, obtained on realistic topologies, suggest that the cost (power) of one PPM module should be lower than 2.6 times and 10.2 times that of one OTDR for nation-wide and continental-wide topologies, respectively.
Submitted 6 July, 2024;
originally announced July 2024.
-
OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding
Authors:
Tiancheng Zhao,
Qianqian Zhang,
Kyusong Lee,
Peng Liu,
Lu Zhang,
Chunxin Fang,
Jiajia Liao,
Kelei Jiang,
Yibo Ma,
Ruochen Xu
Abstract:
We introduce OmChat, a model designed to excel in handling long contexts and video understanding tasks. OmChat's new architecture standardizes how different visual inputs are processed, making it more efficient and adaptable. It uses a dynamic vision encoding process to effectively handle images of various resolutions, capturing fine details across a range of image qualities. OmChat utilizes an active progressive multimodal pretraining strategy, which gradually increases the model's capacity for long contexts and enhances its overall abilities. By selecting high-quality data during training, OmChat learns from the most relevant and informative data points. With support for a context length of up to 512K, OmChat demonstrates promising performance in tasks involving multiple images and videos, outperforming most open-source models in these benchmarks. Additionally, OmChat proposes a prompting strategy for unifying complex multimodal inputs, including single-image text, multi-image text, and videos, and achieves competitive performance on single-image benchmarks. To further evaluate the model's capabilities, we propose a benchmark dataset named Temporal Visual Needle in a Haystack. This dataset assesses OmChat's ability to comprehend temporal visual details within long videos. Our analysis highlights several key factors contributing to OmChat's success: support for any-aspect high image resolution, the active progressive pretraining strategy, and high-quality supervised fine-tuning datasets. This report provides a detailed overview of OmChat's capabilities and the strategies that enhance its performance in visual understanding.
Submitted 5 July, 2024;
originally announced July 2024.
-
Poster: Flexible Scheduling of Network and Computing Resources for Distributed AI Tasks
Authors:
Ruikun Wang,
Jiawei Zhang,
Qiaolun Zhang,
Bojun Zhang,
Zhiqun Gu,
Aryanaz Attarpour,
Yuefeng Ji,
Massimo Tornatore
Abstract:
Many emerging Artificial Intelligence (AI) applications require on-demand provisioning of large-scale computing, which can only be enabled by leveraging distributed computing services interconnected through networking. To address such increasing demand for networking to serve AI tasks, we investigate new scheduling strategies to improve communication efficiency and test them on a programmable testbed. We also show relevant challenges and research directions.
Submitted 5 July, 2024;
originally announced July 2024.
-
PA-LOCO: Learning Perturbation-Adaptive Locomotion for Quadruped Robots
Authors:
Zhiyuan Xiao,
Xinyu Zhang,
Xiang Zhou,
Qingrui Zhang
Abstract:
Numerous locomotion controllers have been designed based on Reinforcement Learning (RL) to facilitate blind quadrupedal locomotion traversing challenging terrains. Nevertheless, locomotion control is still a challenging task for quadruped robots traversing diverse terrains amidst unforeseen disturbances. Recently, privileged learning has been employed to learn reliable and robust quadrupedal locomotion over various terrains based on a teacher-student architecture. However, its one-encoder structure is not adequate for addressing external force perturbations. The student policy would experience inevitable performance degradation due to the feature embedding discrepancy between the feature encoder of the teacher policy and that of the student policy. Hence, this paper presents a privileged learning framework with multiple feature encoders and a residual policy network for robust and reliable quadruped locomotion subject to various external perturbations. The multi-encoder structure can decouple latent features from different privileged information, ultimately leading to enhanced performance of the learned policy in terms of robustness, stability, and reliability. The efficiency of the proposed feature encoding module is analyzed in depth using extensive simulation data. The introduction of the residual policy network helps mitigate the performance degradation experienced by the student policy that attempts to clone the behaviors of a teacher policy. The proposed framework is evaluated on a Unitree GO1 robot, showcasing its performance enhancement over the state-of-the-art privileged learning algorithm through extensive experiments conducted on diverse terrains. Ablation studies are conducted to illustrate the efficiency of the residual policy network.
Submitted 4 July, 2024;
originally announced July 2024.
-
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs
Authors:
Keyu An,
Qian Chen,
Chong Deng,
Zhihao Du,
Changfeng Gao,
Zhifu Gao,
Yue Gu,
Ting He,
Hangrui Hu,
Kai Hu,
Shengpeng Ji,
Yabin Li,
Zerui Li,
Heng Lu,
Haoneng Luo,
Xiang Lv,
Bin Ma,
Ziyang Ma,
Chongjia Ni,
Changhe Song,
Jiaqi Shi,
Xian Shi,
Hao Wang,
Wen Wang,
Yuxuan Wang
, et al. (8 additional authors not shown)
Abstract:
This report introduces FunAudioLLM, a model family designed to enhance natural voice interactions between humans and large language models (LLMs). At its core are two innovative models: SenseVoice, which handles multilingual speech recognition, emotion recognition, and audio event detection; and CosyVoice, which facilitates natural speech generation with control over multiple languages, timbre, speaking style, and speaker identity. SenseVoice-Small delivers exceptionally low-latency ASR for 5 languages, and SenseVoice-Large supports high-precision ASR for over 50 languages, while CosyVoice excels in multi-lingual voice generation, zero-shot in-context learning, cross-lingual voice cloning, and instruction-following capabilities. The models related to SenseVoice and CosyVoice have been open-sourced on Modelscope and Huggingface, along with the corresponding training, inference, and fine-tuning codes released on GitHub. By integrating these models with LLMs, FunAudioLLM enables applications such as speech-to-speech translation, emotional voice chat, interactive podcasts, and expressive audiobook narration, thereby pushing the boundaries of voice interaction technology. Demos are available at https://fun-audio-llm.github.io, and the code can be accessed at https://github.com/FunAudioLLM.
Submitted 10 July, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
-
Occupancy as Set of Points
Authors:
Yiang Shi,
Tianheng Cheng,
Qian Zhang,
Wenyu Liu,
Xinggang Wang
Abstract:
In this paper, we explore a novel point representation for 3D occupancy prediction from multi-view images, which is named Occupancy as Set of Points. Existing camera-based methods tend to exploit dense volume-based representation to predict the occupancy of the whole scene, making it hard to focus on the special areas or areas out of the perception range. In comparison, we present the Points of Interest (PoIs) to represent the scene and propose OSP, a novel framework for point-based 3D occupancy prediction. Owing to the inherent flexibility of the point-based representation, OSP achieves strong performance compared with existing methods and excels in terms of training and inference adaptability. It extends beyond traditional perception boundaries and can be seamlessly integrated with volume-based methods to significantly enhance their effectiveness. Experiments on the Occ3D nuScenes occupancy benchmark show that OSP has strong performance and flexibility. Code and models are available at https://github.com/hustvl/osp.
Submitted 4 July, 2024;
originally announced July 2024.
-
Your Mega-Constellations Can Be Slim: A Cost-Effective Approach for Constructing Survivable and Performant LEO Satellite Networks
Authors:
Zeqi Lai,
Yibo Wang,
Hewu Li,
Qian Wu,
Qi Zhang,
Yunan Hou,
Jun Liu,
Yuanjie Li
Abstract:
In this paper, we investigate an important research problem facing the upcoming satellite Internet: from a network perspective, how many satellites exactly do we need to construct a survivable and performant LSN? To answer this question, we first formulate the survivable and performant LSN design (SPLD) problem, which aims to find the minimum number of satellites needed to construct an LSN that can provide a sufficient number of redundant paths, the required link capacity, and acceptable latency for traffic carried by the LSN. Second, to efficiently solve the tricky SPLD problem, we propose MEGAREDUCE, a requirement-driven constellation optimization mechanism, which can calculate feasible solutions for SPLD in polynomial time. Finally, we conduct extensive trace-driven simulations to verify MEGAREDUCE's cost-effectiveness in constructing survivable and performant LSNs on demand, and showcase how MEGAREDUCE can help optimize the incremental deployment and long-term maintenance of future satellite Internet.
Submitted 4 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
Submitted 3 July, 2024;
originally announced July 2024.
-
WizardMerge -- Save Us From Merging Without Any Clues
Authors:
Qingyu Zhang,
Junzhe Li,
Jiayi Lin,
Jie Ding,
Lanteng Lin,
Chenxiong Qian
Abstract:
Modern software development necessitates efficient version-oriented collaboration among developers. While Git is the most popular version control system, it generates unsatisfactory version merging results due to its text-based workflow, leading to potentially unexpected results in the merged version of the project. Although numerous merging tools have been proposed for improving merge results, developers still struggle to resolve conflicts and fix incorrectly modified code without clues. We present WizardMerge, an auxiliary tool that leverages merging results from Git to retrieve code block dependencies at the text and LLVM-IR levels and provides suggestions for developers to resolve errors introduced by textual merging. In our evaluation, we tested WizardMerge on 227 conflicts within five large-scale projects. The outcomes demonstrate that WizardMerge reduces the time cost of conflict merging by 23.85%. Beyond addressing conflicts, WizardMerge provides merging suggestions for over 70% of the code blocks potentially affected by the conflicts. Notably, WizardMerge exhibits the capability to identify conflict-unrelated code blocks that require manual intervention yet are harmfully applied by Git during the merging.
Submitted 3 July, 2024;
originally announced July 2024.
-
Manipulating liquid-liquid phase separation using patterned flow
Authors:
Yulin Li,
Tong Zhou,
Yanyu Li,
Qi Zhang,
Zhihong You
Abstract:
The precise control of liquid-liquid phase separation (LLPS) is the key to developing cutting-edge technologies that benefit diverse disciplines. Fluid flow was found to be capable of controlling the structure and effective temperature of LLPS, but the extent and precision of control were less than optimal. In this article, we propose that patterned flow can be employed as a generic tool to manipulate LLPS effectively. By combining theoretical modeling and numerical simulations, we demonstrate that flows with tailor-made structures can become functional, allowing us to control diverse aspects of LLPS. Typical examples include the capture and pinning of droplets, fine-tuning of droplet sizes, forced assembly of periodic droplet arrays, and the remodeling of the kinetics and structure of phase separation. These manipulations are grounded on the redistribution of chemical potential by the structured flow. Our results not only can lead to potential LLPS-based technologies, but also highlight the rich behavior of LLPS introduced by the patterned flow.
Submitted 2 July, 2024;
originally announced July 2024.
-
A multi-field decomposed model order reduction approach for thermo-mechanically coupled gradient-extended damage simulations
Authors:
Qinghua Zhang,
Stephan Ritzert,
Jian Zhang,
Jannick Kehls,
Stefanie Reese,
Tim Brepols
Abstract:
Numerical simulations are crucial for comprehending how engineering structures behave under extreme conditions, particularly when dealing with thermo-mechanically coupled issues compounded by damage-induced material softening. However, such simulations often entail substantial computational expenses. To mitigate this, the focus has shifted towards employing model order reduction (MOR) techniques, which hold promise for accelerating computations. Yet, applying MOR to highly nonlinear, multi-physical problems influenced by material softening remains a relatively new area of research, with numerous unanswered questions. Addressing this gap, this study proposes and investigates a novel multi-field decomposed MOR technique, rooted in a snapshot-based Proper Orthogonal Decomposition-Galerkin (POD-G) projection approach. Utilizing a recently developed thermo-mechanically coupled gradient-extended damage-plasticity model as a case study, this work demonstrates that splitting snapshot vectors into distinct physical fields (displacements, damage, temperature) and projecting them onto separate lower-dimensional subspaces can yield more precise and stable outcomes compared to conventional methods. Through a series of numerical benchmark tests, our multi-field decomposed MOR technique demonstrates its capacity to significantly reduce computational expenses in simulations involving severe damage, while maintaining a high level of accuracy.
Submitted 2 July, 2024;
originally announced July 2024.
-
Shared-Protected Backup Paths Assignment with Mode Group Division Multiplexing in Optical Networks
Authors:
Jiaheng Xiong,
Qiaolun Zhang,
Ruikun Wang,
Alberto Gatto,
Francesco Musumeci,
Massimo Tornatore
Abstract:
We evaluate the resource efficiency of Mode Group Division Multiplexing (MGDM) with shared path protection (SPP) in optical networks. On our case studies, SPP with MGDM obtains significant savings in terms of both additional backup spectrum occupation and MIMO-computing resources compared to other few-mode-transmission scenarios.
Submitted 2 July, 2024;
originally announced July 2024.
-
Empirical Tests of Optimization Assumptions in Deep Learning
Authors:
Hoang Tran,
Qinzi Zhang,
Ashok Cutkosky
Abstract:
There is a significant gap between our theoretical understanding of optimization algorithms used in deep learning and their practical performance. Theoretical development usually focuses on proving convergence guarantees under a variety of different assumptions, which are themselves often chosen based on a rough combination of intuitive match to practice and analytical convenience. The theory/practice gap may then arise because of the failure to prove a theorem under such assumptions, or because the assumptions do not reflect reality. In this paper, we carefully measure the degree to which these assumptions are capable of explaining modern optimization algorithms by developing new empirical metrics that closely track the key quantities that must be controlled in theoretical analysis. All of our tested assumptions (including typical modern assumptions based on bounds on the Hessian) fail to reliably capture optimization performance. This highlights a need for new empirical verification of analytical assumptions used in theoretical analysis.
Submitted 1 July, 2024;
originally announced July 2024.
-
SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving
Authors:
Qingwen Zhang,
Yi Yang,
Peizheng Li,
Olov Andersson,
Patric Jensfelt
Abstract:
Scene flow estimation predicts the 3D motion at each point in successive LiDAR scans. This detailed, point-level information can help autonomous vehicles accurately predict and understand dynamic changes in their surroundings. Current state-of-the-art methods require annotated data to train scene flow networks, and the expense of labeling inherently limits their scalability. Self-supervised approaches can overcome the above limitations, yet face two principal challenges that hinder optimal performance: point distribution imbalance and disregard for object-level motion constraints. In this paper, we propose SeFlow, a self-supervised method that integrates efficient dynamic classification into a learning-based scene flow pipeline. We demonstrate that classifying static and dynamic points helps design targeted objective functions for different motion patterns. We also emphasize the importance of internal cluster consistency and correct object point association to refine the scene flow estimation, in particular on object details. Our real-time capable method achieves state-of-the-art performance on the self-supervised scene flow task on Argoverse 2 and Waymo datasets. The code is open-sourced at https://github.com/KTH-RPL/SeFlow along with trained model weights.
Submitted 1 July, 2024;
originally announced July 2024.
-
ESALE: Enhancing Code-Summary Alignment Learning for Source Code Summarization
Authors:
Chunrong Fang,
Weisong Sun,
Yuchen Chen,
Xiao Chen,
Zhao Wei,
Quanjun Zhang,
Yudu You,
Bin Luo,
Yang Liu,
Zhenyu Chen
Abstract:
(Source) code summarization aims to automatically generate succinct natural language summaries for given code snippets. Such summaries play a significant role in helping developers understand and maintain code. Inspired by neural machine translation, deep learning-based code summarization techniques widely adopt an encoder-decoder framework, where the encoder transforms given code snippets into context vectors, and the decoder decodes context vectors into summaries. Recently, large-scale pre-trained models for source code have been equipped with encoders capable of producing general context vectors and have achieved substantial improvements on code summarization. However, although they are usually trained mainly on code-focused tasks and can capture general code features, they still fall short in capturing specific features that need to be summarized.
This paper proposes a novel approach to improve code summarization based on summary-focused tasks. Specifically, we exploit a multi-task learning paradigm to train the encoder on three summary-focused tasks to enhance its ability to learn code-summary alignment: unidirectional language modeling (ULM), masked language modeling (MLM), and action word prediction (AWP). Unlike pre-trained models that mainly predict masked tokens in code snippets, we design ULM and MLM to predict masked words in summaries. Intuitively, predicting words based on given code snippets would help learn the code-summary alignment. Additionally, we introduce the domain-specific task AWP to enhance the ability of the encoder to learn the alignment between action words and code snippets. Extensive experiments on four datasets demonstrate that our approach, called ESALE, significantly outperforms baselines in all three widely used metrics: BLEU, METEOR, and ROUGE-L.
Submitted 30 June, 2024;
originally announced July 2024.
-
Collaborative Performance Prediction for Large Language Models
Authors:
Qiyuan Zhang,
Fuyuan Lyu,
Xue Liu,
Chen Ma
Abstract:
Comprehensively understanding and accurately predicting the performance of large language models across diverse downstream tasks has emerged as a pivotal challenge in NLP research. Pioneering work on downstream scaling laws demonstrated intrinsic similarities within model families and utilized such similarities for performance prediction. However, these works tend to overlook the similarities between model families and only consider design factors listed in the original scaling law. To overcome these limitations, we introduce a novel framework, Collaborative Performance Prediction (CPP), which significantly enhances prediction accuracy by leveraging the historical performance of various models on downstream tasks and other design factors for both model and task. We also collect collaborative data from online platforms, containing both historical performance and additional design factors. With the support of the collaborative data, CPP not only surpasses traditional scaling laws in predicting the performance of scaled LLMs but also facilitates a detailed analysis of factor importance, an area previously overlooked.
Submitted 1 July, 2024;
originally announced July 2024.
-
Look Ahead or Look Around? A Theoretical Comparison Between Autoregressive and Masked Pretraining
Authors:
Qi Zhang,
Tianqi Du,
Haotian Huang,
Yifei Wang,
Yisen Wang
Abstract:
In recent years, the rise of generative self-supervised learning (SSL) paradigms has exhibited impressive performance across visual, language, and multi-modal domains. While the varied designs of generative SSL objectives lead to distinct properties in downstream tasks, a theoretical understanding of these differences remains largely unexplored. In this paper, we establish the first theoretical comparisons between two leading generative SSL paradigms: autoregressive SSL and masked SSL. Through establishing theoretical frameworks, we elucidate the strengths and limitations of autoregressive and masked SSL within the primary evaluation tasks of classification and content generation. Our findings demonstrate that in classification tasks, the flexibility of targeted tokens in masked SSL fosters more inter-sample connections compared to the fixed position of target tokens in autoregressive SSL, which yields superior clustering performance. In content generation tasks, the misalignment between the flexible lengths of test samples and the fixed length of unmasked texts in masked SSL (vs. flexible lengths of conditional texts in autoregressive SSL) hinders its generation performance. To leverage each other's strengths and mitigate weaknesses, we propose diversity-enhanced autoregressive and variable-length masked objectives, which substantially improve the classification performance of autoregressive SSL and the generation performance of masked SSL. Code is available at https://github.com/PKU-ML/LookAheadLookAround.
Submitted 30 June, 2024;
originally announced July 2024.
-
A Power-Consumption Analysis for Different IPoWDM Network Architectures with ZR/ZR+ and Long-Haul Muxponders
Authors:
Qiaolun Zhang,
Annalisa Morea,
Patricia Layec,
Memedhe Ibrahimi,
Francesco Musumeci,
Massimo Tornatore
Abstract:
Operators are constantly faced with the need to increase optical-network capacity to accommodate rapid traffic growth while minimizing the cost-per-bit and power-per-bit. The drastic reduction in the power consumption of IP routers and ZR/ZR+ pluggable transponders seen in recent years has renewed interest in "opaque" optical-network architectures, where no optical bypassing is allowed. In this work, we aim to quantify and compare the power consumption of four "IP over Wavelength Division Multiplexing" (IPoWDM) transport network architectures employing ZR/ZR+ modules vs. long-haul muxponders, considering different grooming, regeneration, and optical bypassing capabilities. We first propose a power consumption model for different IPoWDM node architectures with ZR/ZR+ modules and long-haul muxponders. Then, to obtain the power consumption of different architectures, we propose a compact auxiliary-graph-based network-design algorithm extensible to different network architectures. Moreover, we investigate how the continuous decrease in the power consumption of ZR/ZR+ and IP routers can impact the power consumption of different architectures through a sensitivity analysis. Illustrative numerical results on networks of different sizes show that, despite drastic reductions in power consumption at the IP layer, optical bypassing is still the most power-efficient solution, reducing consumption by up to 48%.
Submitted 30 June, 2024;
originally announced July 2024.
-
Deep Frequency Derivative Learning for Non-stationary Time Series Forecasting
Authors:
Wei Fan,
Kun Yi,
Hangting Ye,
Zhiyuan Ning,
Qi Zhang,
Ning An
Abstract:
Because most time series are non-stationary, forecasting models inevitably face the distribution shift issue. Existing solutions manipulate statistical measures (usually mean and std.) to adjust the time series distribution. However, these operations can be theoretically seen as a transformation towards the zero-frequency component of the spectrum, which cannot reveal the full distribution information and further leads to an information utilization bottleneck in normalization, thus hindering forecasting performance. To address this problem, we propose to utilize the whole frequency spectrum to transform time series, making full use of the data distribution from the frequency perspective. We present a deep frequency derivative learning framework, DERITS, for non-stationary time series forecasting. Specifically, DERITS is built upon a novel reversible transformation, namely the Frequency Derivative Transformation (FDT), which derives signals in the frequency domain to acquire more stationary frequency representations. Then, we propose the Order-adaptive Fourier Convolution Network to conduct adaptive frequency filtering and learning. Furthermore, we organize DERITS as a parallel-stacked architecture for multi-order derivation and fusion for forecasting. Finally, we conduct extensive experiments on several datasets, which show consistent superiority in both time series forecasting and shift alleviation.
Submitted 29 June, 2024;
originally announced July 2024.
-
Quantifying Spuriousness of Biased Datasets Using Partial Information Decomposition
Authors:
Barproda Halder,
Faisal Hamman,
Pasan Dissanayake,
Qiuyi Zhang,
Ilia Sucholutsky,
Sanghamitra Dutta
Abstract:
Spurious patterns refer to a mathematical association between two or more variables in a dataset that are not causally related. However, this notion of spuriousness, which is usually introduced due to sampling biases in the dataset, has classically lacked a formal definition. To address this gap, this work presents the first information-theoretic formalization of spuriousness in a dataset (given a split of spurious and core features) using a mathematical framework called Partial Information Decomposition (PID). Specifically, we disentangle the joint information content that the spurious and core features share about another target variable (e.g., the prediction label) into distinct components, namely unique, redundant, and synergistic information. We propose the use of unique information, with roots in Blackwell Sufficiency, as a novel metric to formally quantify dataset spuriousness and derive its desirable properties. We empirically demonstrate how higher unique information in the spurious features in a dataset could lead a model into choosing the spurious features over the core features for inference, often having low worst-group-accuracy. We also propose a novel autoencoder-based estimator for computing unique information that is able to handle high-dimensional image data. Finally, we also show how this unique information in the spurious feature is reduced across several dataset-based spurious-pattern-mitigation techniques such as data reweighting and varying levels of background mixing, demonstrating a novel tradeoff between unique information (spuriousness) and worst-group-accuracy.
Submitted 29 June, 2024;
originally announced July 2024.
-
GraphArena: Benchmarking Large Language Models on Graph Computational Problems
Authors:
Jianheng Tang,
Qifan Zhang,
Yuhan Li,
Jia Li
Abstract:
The "arms race" of Large Language Models (LLMs) demands novel, challenging, and diverse benchmarks to faithfully examine their progress. We introduce GraphArena, a benchmarking tool designed to evaluate LLMs on graph computational problems using million-scale real-world graphs from diverse scenarios such as knowledge graphs, social networks, and molecular structures. GraphArena offers a suite of 10 computational tasks, encompassing four polynomial-time (e.g., Shortest Distance) and six NP-complete challenges (e.g., Travelling Salesman Problem). It features a rigorous evaluation framework that classifies LLM outputs as correct, suboptimal (feasible but not optimal), or hallucinatory (properly formatted but infeasible). Evaluation of 10 leading LLMs, including GPT-4o and LLaMA3-70B-Instruct, reveals that even top-performing models struggle with larger, more complex graph problems and exhibit hallucination issues. Despite the application of strategies such as chain-of-thought prompting, these issues remain unresolved. GraphArena contributes a valuable supplement to the existing LLM benchmarks and is open-sourced at https://github.com/squareRoot3/GraphArena.
Submitted 29 June, 2024;
originally announced July 2024.
-
Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
S. Ahmed,
M. Albrecht,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
X. H. Bai,
Y. Bai,
O. Bakina,
R. Baldini Ferroli,
I. Balossino,
Y. Ban,
K. Begzsuren,
N. Berger,
M. Bertani,
D. Bettoni,
F. Bianchi,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (495 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780 GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.
Submitted 2 July, 2024; v1 submitted 28 June, 2024;
originally announced July 2024.
-
Module control of network analysis in psychopathology
Authors:
Chunyu Pan,
Quan Zhang,
Yue Zhu,
Shengzhou Kong,
Juan Liu,
Changsheng Zhang,
Fei Wang,
Xizhe Zhang
Abstract:
The network approach to characterizing psychopathology departs from traditional latent categorical and dimensional approaches. Causal interplay among symptoms contributes to a dynamic psychopathology system. Therefore, analyzing the symptom clusters is critical for understanding mental disorders. Furthermore, despite extensive research studying the topological features of symptom networks, the control relationships between symptoms remain largely unclear. Here, we present a novel systematizing concept, module control, to analyze the control principle of the symptom network at a module level. We introduce the Module Control Network (MCN) to identify key modules that regulate the network's behavior. By applying our approach to a multivariate psychological dataset, we discover that non-emotional modules, such as sleep-related and stress-related modules, are the primary controlling modules in the symptom network. Our findings indicate that module control can expose the central symptom clusters governing the psychopathology network, offering novel insights into the underlying mechanisms of mental disorders and an individualized approach to psychological interventions.
Submitted 30 May, 2024;
originally announced July 2024.
-
Harnessing XGBoost for Robust Biomarker Selection of Obsessive-Compulsive Disorder (OCD) from Adolescent Brain Cognitive Development (ABCD) data
Authors:
Xinyu Shen,
Qimin Zhang,
Huili Zheng,
Weiwei Qi
Abstract:
This study evaluates the performance of various supervised machine learning models in analyzing highly correlated neural signaling data from the Adolescent Brain Cognitive Development (ABCD) Study, with a focus on predicting obsessive-compulsive disorder scales. We simulated a dataset to mimic the correlation structures commonly found in imaging data and evaluated logistic regression, elastic networks, random forests, and XGBoost on their ability to handle multicollinearity and accurately identify predictive features. Our study aims to guide the selection of appropriate machine learning methods for processing neuroimaging data, highlighting models that best capture underlying signals in high feature correlations and prioritize clinically relevant features associated with Obsessive-Compulsive Disorder (OCD).
Submitted 14 May, 2024;
originally announced July 2024.
-
Simulating Financial Market via Large Language Model based Agents
Authors:
Shen Gao,
Yuntao Wen,
Minghang Zhu,
Jianing Wei,
Yuhan Cheng,
Qunzi Zhang,
Shuo Shang
Abstract:
Most economic theories typically assume that financial market participants are fully rational individuals and use mathematical models to simulate human behavior in financial markets. However, human behavior is often not entirely rational and is challenging to predict accurately with mathematical models. In this paper, we propose Agent-based Simulated Financial Market (ASFM), which first constructs a simulated stock market with a real order matching system. Then, we propose a large language model based agent as the stock trader, which contains the profile, observation, and tool-learning based action module. The trading agent can comprehensively understand current market dynamics and financial policy information, and make decisions that align with their trading strategy. In the experiments, we first verify that the reactions of our ASFM are consistent with the real stock market in two controllable scenarios. In addition, we also conduct experiments in two popular economics research directions, and we find that conclusions drawn in our ASFM align with the preliminary findings in economics research. Based on these observations, we believe our proposed ASFM provides a new paradigm for economic research.
Submitted 28 June, 2024;
originally announced June 2024.
-
Private Zeroth-Order Nonsmooth Nonconvex Optimization
Authors:
Qinzi Zhang,
Hoang Tran,
Ashok Cutkosky
Abstract:
We introduce a new zeroth-order algorithm for private stochastic optimization on nonconvex and nonsmooth objectives. Given a dataset of size $M$, our algorithm ensures $(α,αρ^2/2)$-Rényi differential privacy and finds a $(δ,ε)$-stationary point so long as $M=\tildeΩ\left(\frac{d}{δε^3} + \frac{d^{3/2}}{ρδε^2}\right)$. This matches the optimal complexity of its non-private zeroth-order analog. Notably, although the objective is not smooth, we have privacy "for free" whenever $ρ\ge \sqrt{d}ε$.
Submitted 27 June, 2024;
originally announced June 2024.
-
AutoRAG-HP: Automatic Online Hyper-Parameter Tuning for Retrieval-Augmented Generation
Authors:
Jia Fu,
Xiaoting Qin,
Fangkai Yang,
Lu Wang,
Jue Zhang,
Qingwei Lin,
Yubo Chen,
Dongmei Zhang,
Saravan Rajmohan,
Qi Zhang
Abstract:
Recent advancements in Large Language Models have transformed ML/AI development, necessitating a reevaluation of AutoML principles for Retrieval-Augmented Generation (RAG) systems. To address the challenges of hyper-parameter optimization and online adaptation in RAG, we propose the AutoRAG-HP framework, which formulates hyper-parameter tuning as an online multi-armed bandit (MAB) problem and introduces a novel two-level Hierarchical MAB (Hier-MAB) method for efficient exploration of large search spaces. We conduct extensive experiments on tuning hyper-parameters, such as top-k retrieved documents, prompt compression ratio, and embedding methods, using the ALCE-ASQA and Natural Questions datasets. Our evaluation of jointly optimizing all three hyper-parameters demonstrates that MAB-based online learning methods can achieve Recall@5 $\approx 0.8$ for scenarios with prominent gradients in search space, using only $\sim20\%$ of the LLM API calls required by the Grid Search approach. Additionally, the proposed Hier-MAB approach outperforms other baselines in more challenging optimization scenarios. The code will be made available at https://aka.ms/autorag.
Submitted 27 June, 2024;
originally announced June 2024.
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226 GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
Submitted 27 June, 2024;
originally announced June 2024.
-
UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models
Authors:
Siyuan Wu,
Yue Huang,
Chujie Gao,
Dongping Chen,
Qihui Zhang,
Yao Wan,
Tianyi Zhou,
Xiangliang Zhang,
Jianfeng Gao,
Chaowei Xiao,
Lichao Sun
Abstract:
Large Language Models (LLMs) such as GPT-4 and Llama3 have significantly impacted various fields by enabling high-quality synthetic data generation and reducing dependence on expensive human-generated datasets. Despite this, challenges remain in the areas of generalization, controllability, diversity, and truthfulness within the existing generative frameworks. To address these challenges, this paper presents UniGen, a comprehensive LLM-powered framework designed to produce diverse, accurate, and highly controllable datasets. UniGen is adaptable, supporting all types of text datasets and enhancing the generative process through innovative mechanisms. To augment data diversity, UniGen incorporates an attribute-guided generation module and a group checking feature. For accuracy, it employs a code-based mathematical assessment for label verification alongside a retrieval-augmented generation technique for factual validation. The framework also allows for user-specified constraints, enabling customization of the data generation process to suit particular requirements. Extensive experiments demonstrate the superior quality of data generated by UniGen, and each module within UniGen plays a critical role in this enhancement. Additionally, UniGen is applied in two practical scenarios: benchmarking LLMs and data augmentation. The results indicate that UniGen effectively supports dynamic and evolving benchmarking, and that data augmentation improves LLM capabilities in various domains, including agent-oriented abilities and reasoning skills.
Submitted 28 June, 2024; v1 submitted 27 June, 2024;
originally announced June 2024.
-
Sequential three-way group decision-making for double hierarchy hesitant fuzzy linguistic term set
Authors:
Nanfang Luo,
Qinghua Zhang,
Qin Xie,
Yutai Wang,
Longjun Yin,
Guoyin Wang
Abstract:
Group decision-making (GDM) characterized by complexity and uncertainty is an essential part of various life scenarios. Most existing research lacks tools to fuse information quickly and interpret decision results for partially formed decisions. This limitation is particularly noticeable when there is a need to improve the efficiency of GDM. To address this issue, a novel multi-level sequential three-way decision for group decision-making (S3W-GDM) method is constructed from the perspective of granular computing. This method simultaneously considers the vagueness, hesitation, and variation of GDM problems under the double hierarchy hesitant fuzzy linguistic term sets (DHHFLTS) environment. First, for fusing information efficiently, a novel multi-level expert information fusion method is proposed, and the concepts of expert decision table and the extraction/aggregation of decision-leveled information based on the multi-level granularity are defined. Second, the neighborhood theory, outranking relation and regret theory (RT) are utilized to redesign the calculations of conditional probability and relative loss function. Then, the granular structure of DHHFLTS based on the sequential three-way decision (S3WD) is defined to improve the decision-making efficiency, and the decision-making strategy and interpretation of each decision level are proposed. Furthermore, the algorithm of S3W-GDM is given. Finally, an illustrative example of diagnosis is presented, and comparative and sensitivity analyses with other methods are performed to verify the efficiency and rationality of the proposed method.
Submitted 27 June, 2024;
originally announced June 2024.
-
Electric-field control of the perpendicular magnetization switching in ferroelectric/ferrimagnet heterostructures
Authors:
Pengfei Liu,
Tao Xu,
Qi Liu,
Juncai Dong,
Ting Lin,
Qinhua Zhang,
Xiukai Lan,
Yu Sheng,
Chunyu Wang,
Jiajing Pei,
Hongxin Yang,
Lin Gu,
Kaiyou Wang
Abstract:
Electric field control of the magnetic state in ferrimagnets holds great promise for developing spintronic devices due to low power consumption. Here, we demonstrate a non-volatile reversal of perpendicular net magnetization in a ferrimagnet by manipulating the electric-field driven polarization within the Pb(Zr0.2Ti0.8)O3 (PZT)/CoGd heterostructure. Electron energy loss spectra and X-ray absorption spectra directly verify that the oxygen ion migration at the PZT/CoGd interface associated with reversing the polarization causes the enhanced/reduced oxidation in CoGd. Ab initio calculations further substantiate that the migrated oxygen ions can modulate the relative magnetization of Co/Gd sublattices, facilitating perpendicular net magnetization switching. Our findings offer an approach to effectively control ferrimagnetic net magnetization, holding significant implications for ferrimagnetic spintronic applications.
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914 GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.
Submitted 26 June, 2024;
originally announced June 2024.