Domain-adaptive Video Deblurring via Test-time Blurring

Authors: Jin-Ting He, Fu-Jen Tsai, Jia-Hao Wu, Yan-Tsung Peng, Chung-Chi Tsai, Chia-Wen Lin, Yen-Yu Lin

Abstract: Dynamic scene video deblurring aims to remove undesirable blurry artifacts captured during the exposure process. Although previous video deblurring methods have achieved impressive results, they suffer from significant performance drops due to the domain gap between training and testing videos, especially for those captured in real-world scenarios. To address this issue, we propose a domain adaptation scheme based on a blurring model to achieve test-time fine-tuning for deblurring models in unseen domains. Since blurred and sharp pairs are unavailable for fine-tuning during inference, our scheme can generate domain-adaptive training pairs to calibrate a deblurring model for the target domain. First, a Relative Sharpness Detection Module is proposed to identify relatively sharp regions from the blurry input images and regard them as pseudo-sharp images. Next, we utilize a blurring model to produce blurred images based on the pseudo-sharp images extracted during testing. To synthesize blurred images in compliance with the target data distribution, we propose a Domain-adaptive Blur Condition Generation Module to create domain-specific blur conditions for the blurring model. Finally, the generated pseudo-sharp and blurred pairs are used to fine-tune a deblurring model for better performance. Extensive experimental results demonstrate that our approach can significantly improve state-of-the-art video deblurring methods, providing performance gains of up to 7.54dB on various real-world video deblurring datasets. The source code is available at https://github.com/Jin-Ting-He/DADeblur.

Submitted 12 July, 2024; originally announced July 2024.

Comments: ECCV 2024

arXiv:2407.08664 [pdf, other]

MBD-NODE: Physics-informed data-driven modeling and simulation of constrained multibody systems

Authors: Jingquan Wang, Shu Wang, Huzaifa Mustafa Unjhawala, Jinlong Wu, Dan Negrut

Abstract: We describe a framework that can integrate prior physical information, e.g., the presence of kinematic constraints, to support data-driven simulation in multi-body dynamics. Unlike other approaches, e.g., Fully-connected Neural Network (FCNN) or Recurrent Neural Network (RNN)-based methods that are used to model the system states directly, the proposed approach embraces a Neural Ordinary Differential Equation (NODE) paradigm that models the derivatives of the system states. A central part of the proposed methodology is its capacity to learn the multibody system dynamics from prior physical knowledge and constraints combined with data inputs. This learning process is facilitated by a constrained optimization approach, which ensures that physical laws and system constraints are accounted for in the simulation process. The models, data, and code for this work are publicly available as open source at https://github.com/uwsbel/sbel-reproducibility/tree/master/2024/MNODE-code.
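
Illustrative note (not from the paper): the distinction between modeling states directly and modeling their derivatives can be sketched as follows. The function `oscillator_deriv` is a hypothetical stand-in for a learned derivative network, and explicit Euler is an assumed integrator chosen only for brevity; neither is taken from the MBD-NODE code.

```python
def integrate_euler(deriv, state, t0, t1, steps):
    """Advance `state` from t0 to t1 by repeatedly applying a derivative model."""
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        d = deriv(t, state)  # the model predicts d(state)/dt, not the next state
        state = [s + h * ds for s, ds in zip(state, d)]
        t += h
    return state


def oscillator_deriv(t, state):
    """Hypothetical stand-in for a trained network: unit harmonic oscillator."""
    position, velocity = state
    return [velocity, -position]


# Start at position 1, velocity 0, and integrate for one time unit.
final_state = integrate_euler(oscillator_deriv, [1.0, 0.0], 0.0, 1.0, 1000)
```

In a NODE-style setup the stand-in function would be replaced by a trained network, and the integrator would typically be a higher-order scheme.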

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08662 [pdf, other]

Uncertainty Estimation of Large Language Models in Medical Question Answering

Authors: Jiaxin Wu, Yizhou Yu, Hong-Yu Zhou

Abstract: Large Language Models (LLMs) show promise for natural language generation in healthcare, but risk hallucinating factually incorrect information. Deploying LLMs for medical question answering necessitates reliable uncertainty estimation (UE) methods to detect hallucinations. In this work, we benchmark popular UE methods with different model sizes on medical question-answering datasets. Our results show that current approaches generally perform poorly in this domain, highlighting the challenge of UE for medical applications. We also observe that larger models tend to yield better results, suggesting a correlation between model size and the reliability of UE. To address these challenges, we propose Two-phase Verification, a probability-free Uncertainty Estimation approach. First, an LLM generates a step-by-step explanation alongside its initial answer, followed by formulating verification questions to check the factual claims in the explanation. The model then answers these questions twice: first independently, and then referencing the explanation. Inconsistencies between the two sets of answers measure the uncertainty in the original response. We evaluate our approach on three biomedical question-answering datasets using Llama 2 Chat models and compare it against the benchmarked baseline methods. The results show that our Two-phase Verification method achieves the best overall accuracy and stability across various datasets and model sizes, and its performance scales as the model size increases.
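
Illustrative note (not from the paper): the final step, turning disagreement between the two answer sets into an uncertainty value, might look like the sketch below. The abstract does not say how inconsistency is scored, so the case-insensitive string comparison and the name `inconsistency_score` are assumptions made purely for illustration.

```python
def inconsistency_score(independent_answers, referenced_answers):
    """Fraction of verification questions whose two answers disagree.

    Assumption: answers are compared as normalized strings. The paper may
    well use a different comparison (for example, a model-based check).
    """
    if len(independent_answers) != len(referenced_answers):
        raise ValueError("both answer lists must cover the same questions")
    if not independent_answers:
        return 0.0
    mismatches = sum(
        a.strip().lower() != b.strip().lower()
        for a, b in zip(independent_answers, referenced_answers)
    )
    return mismatches / len(independent_answers)


# Three verification questions; the second pair of answers disagrees.
score = inconsistency_score(["yes", "no", "aspirin"], ["Yes", "yes", "aspirin"])
```

A higher score would indicate that the explanation changed the model's answers more often, which the abstract treats as a sign of uncertainty in the original response.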

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08639 [pdf, other]

$β$-DPO: Direct Preference Optimization with Dynamic $β$

Authors: Junkang Wu, Yuexiang Xie, Zhengyi Yang, Jiancan Wu, Jinyang Gao, Bolin Ding, Xiang Wang, Xiangnan He

Abstract: Direct Preference Optimization (DPO) has emerged as a compelling approach for training Large Language Models (LLMs) to adhere to human preferences. However, the performance of DPO is sensitive to the fine-tuning of its trade-off parameter $β$, as well as to the quality of the preference data. We analyze the impact of $β$ and data quality on DPO, uncovering that optimal $β$ values vary with the informativeness of pairwise data. Addressing the limitations of static $β$ values, we introduce a novel framework that dynamically calibrates $β$ at the batch level, informed by data quality considerations. Additionally, our method incorporates $β$-guided data filtering to safeguard against the influence of outliers. Through empirical evaluation, we demonstrate that our dynamic $β$ adjustment technique significantly improves DPO's performance across a range of models and datasets, offering a more robust and adaptable training paradigm for aligning LLMs with human feedback. The code is available at https://github.com/junkangwu/beta-DPO.
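
Illustrative note (not from the paper): for readers unfamiliar with where $β$ enters, the sketch below evaluates the standard DPO loss for a single preference pair, $-\log σ(β \cdot \text{margin})$, where the margin is the difference of policy-versus-reference log-ratios for the chosen and rejected responses. The function and argument names are hypothetical, and the paper's batch-level rule for calibrating $β$ is not reproduced here.

```python
import math


def dpo_loss(beta, policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected):
    """Standard DPO loss for one preference pair with a fixed beta."""
    margin = ((policy_logp_chosen - ref_logp_chosen)
              - (policy_logp_rejected - ref_logp_rejected))
    x = beta * margin
    # -log(sigmoid(x)) written as a numerically stable softplus(-x)
    return math.log1p(math.exp(-abs(x))) + max(-x, 0.0)


# Same pair, two beta values: a larger beta weights the margin more heavily.
loss_small_beta = dpo_loss(0.1, -1.0, -3.0, -2.0, -2.0)
loss_large_beta = dpo_loss(0.5, -1.0, -3.0, -2.0, -2.0)
```

Because the loss depends on the product of $β$ and the margin, a single static value treats every pair alike regardless of how informative it is, which is the limitation the abstract describes.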

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08300 [pdf, ps, other]

Regularity of viscosity solutions of the $σ_k$-Yamabe-type Problem for $k>n/2$

Authors: Jinyang Wu

Abstract: We study the regularity of Lipschitz viscosity solutions to the $σ_k$ Yamabe problem in the negative cone case. If either $k=n$ or the manifold is conformally flat and $k>n/2$, we prove that all Lipschitz viscosity solutions are smooth away from a closed set of measure zero. For the general $k>n/2$ case, under certain assumptions, we prove the existence of a Lipschitz viscosity solution that is smooth away from a closed set of measure zero.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 31 pages. Comments Welcome!

MSC Class: 35J60; 35B65; 35D40; 53C18

arXiv:2407.07999 [pdf, ps, other]

Fusion of Short-term and Long-term Attention for Video Mirror Detection

Authors: Mingchen Xu, Jing Wu, Yukun Lai, Ze Ji

Abstract: Techniques for detecting mirrors from static images have witnessed rapid growth in recent years. However, these methods detect mirrors from single input images. Detecting mirrors from video requires further consideration of temporal consistency between frames. We observe that humans can recognize mirror candidates, from just one or two frames, based on their appearance (e.g. shape, color). However, to ensure that the candidate is indeed a mirror (not a picture or a window), we often need to observe more frames for a global view. This observation motivates us to detect mirrors by fusing appearance features extracted from a short-term attention module and context information extracted from a long-term attention module. To evaluate the performance, we build a challenging benchmark dataset of 19,255 frames from 281 videos. Experimental results demonstrate that our method achieves state-of-the-art performance on the benchmark dataset.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07880 [pdf, other]

Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization

Authors: Junkang Wu, Yuexiang Xie, Zhengyi Yang, Jiancan Wu, Jiawei Chen, Jinyang Gao, Bolin Ding, Xiang Wang, Xiangnan He

Abstract: This study addresses the challenge of noise in training datasets for Direct Preference Optimization (DPO), a method for aligning Large Language Models (LLMs) with human preferences. We categorize noise into pointwise noise, which includes low-quality data points, and pairwise noise, which encompasses erroneous data pair associations that affect preference rankings. Utilizing Distributionally Robust Optimization (DRO), we enhance DPO's resilience to these types of noise. Our theoretical insights reveal that DPO inherently embeds DRO principles, conferring robustness to pointwise noise, with the regularization coefficient $β$ playing a critical role in its noise resistance. Extending this framework, we introduce Distributionally Robustifying DPO (Dr. DPO), which integrates pairwise robustness by optimizing against worst-case pairwise scenarios. The novel hyperparameter $β'$ in Dr. DPO allows for fine-tuned control over data pair reliability, providing a strategic balance between exploration and exploitation in noisy training environments. Empirical evaluations demonstrate that Dr. DPO substantially improves the quality of generated text and response accuracy in preference datasets, showcasing enhanced performance in both noisy and noise-free settings. The code is available at https://github.com/junkangwu/Dr_DPO.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07651 [pdf, other]

Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$

Authors: M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (645 additional authors not shown)

Abstract: The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946 GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6 GeV with a width of 50 MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75 GeV in both processes.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07577 [pdf, other]

IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model

Authors: Yatai Ji, Shilong Zhang, Jie Wu, Peize Sun, Weifeng Chen, Xuefeng Xiao, Sidi Yang, Yujiu Yang, Ping Luo

Abstract: The rapid advancement of Large Vision-Language models (LVLMs) has demonstrated a spectrum of emergent capabilities. Nevertheless, current models only focus on the visual content of a single scenario, while their ability to associate instances across different scenes has not yet been explored, which is essential for understanding complex visual content, such as movies with multiple characters and intricate plots. Towards movie understanding, a critical initial step for LVLMs is to unleash the potential of character identities memory and recognition across multiple visual scenarios. To achieve the goal, we propose visual instruction tuning with ID reference and develop an ID-Aware Large Vision-Language Model, IDA-VLM. Furthermore, our research introduces a novel benchmark MM-ID, to examine LVLMs on instance IDs memory and recognition across four dimensions: matching, location, question-answering, and captioning. Our findings highlight the limitations of existing LVLMs in recognizing and associating instance identities with ID reference. This paper paves the way for future artificial intelligence systems to possess multi-identity visual inputs, thereby facilitating the comprehension of complex visual narratives like movies.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07328 [pdf, other]

CATP: Context-Aware Trajectory Prediction with Competition Symbiosis

Authors: Jiang Wu, Dongyu Liu, Yuchen Lin, Yingcai Wu

Abstract: Contextual information is vital for accurate trajectory prediction. For instance, the intricate flying behavior of migratory birds hinges on their analysis of environmental cues such as wind direction and air pressure. However, the diverse and dynamic nature of contextual information renders it an arduous task for AI models to comprehend its impact on trajectories and consequently predict them accurately. To address this issue, we propose a "manager-worker" framework to unleash the full potential of contextual information and construct the CATP model, an implementation of the framework for Context-Aware Trajectory Prediction. The framework comprises a manager model, several worker models, and a tailored training mechanism inspired by competition symbiosis in nature. Taking CATP as an example, each worker needs to compete against others for training data and develop an advantage in predicting specific moving patterns. The manager learns the workers' performance in different contexts and selects the best one in the given context to predict trajectories, enabling CATP as a whole to operate in a symbiotic manner. We conducted two comparative experiments and an ablation study to quantitatively evaluate the proposed framework and CATP model. The results showed that CATP could outperform SOTA models, and the framework could be generalized to different context-aware tasks.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.07268 [pdf, other]

Dataset Quantization with Active Learning based Adaptive Sampling

Authors: Zhenghao Zhao, Yuzhang Shang, Junyi Wu, Yan Yan

Abstract: Deep learning has made remarkable progress recently, largely due to the availability of large, well-labeled datasets. However, training on such datasets elevates costs and computational demands. To address this, various techniques like coreset selection, dataset distillation, and dataset quantization have been explored in the literature. Unlike traditional techniques that depend on uniform sample distributions across different classes, our research demonstrates that maintaining performance is feasible even with uneven distributions. We find that for certain classes, the variation in sample quantity has a minimal impact on performance. Inspired by this observation, an intuitive idea is to reduce the number of samples for stable classes and increase the number of samples for sensitive classes to achieve a better performance with the same sampling ratio. Then the question arises: how can we adaptively select samples from a dataset to achieve optimal performance? In this paper, we propose a novel active learning based adaptive sampling strategy, Dataset Quantization with Active Learning based Adaptive Sampling (DQAS), to optimize the sample selection. In addition, we introduce a novel pipeline for dataset quantization, utilizing feature space from the final stage of dataset quantization to generate more precise dataset bins. Our comprehensive evaluations on multiple datasets show that our approach outperforms the state-of-the-art dataset compression methods.

Submitted 9 July, 2024; originally announced July 2024.

Comments: Accepted to ECCV 2024

arXiv:2407.06714 [pdf, other]

Improving the Transferability of Adversarial Examples by Feature Augmentation

Authors: Donghua Wang, Wen Yao, Tingsong Jiang, Xiaohu Zheng, Junqi Wu, Xiaoqian Chen

Abstract: Despite the success of input transformation-based attacks on boosting adversarial transferability, the performance is unsatisfying because these attacks ignore the discrepancy across models. In this paper, we propose a simple but effective feature augmentation attack (FAUG) method, which improves adversarial transferability without introducing extra computation costs. Specifically, we inject random noise into the intermediate features of the model to enlarge the diversity of the attack gradient, thereby mitigating the risk of overfitting to the specific model and notably amplifying adversarial transferability. Moreover, our method can be combined with existing gradient attacks to augment their performance further. Extensive experiments conducted on the ImageNet dataset across CNN and transformer models corroborate the efficacy of our method, e.g., we achieve improvement of +26.22% and +5.57% on input transformation-based attacks and combination methods, respectively.

Submitted 9 July, 2024; originally announced July 2024.

Comments: 19 pages, 4 figures, 4 tables

arXiv:2407.06472 [pdf]

Optimizing Electric Carsharing System Operations and Battery Management: Integrating V2G, B2G and Battery Swapping Strategies

Authors: Shuang Yang, Gonçalo Homem de Almeida Correia, Jianjun Wu, Huijun Sun

Abstract: Shared electric vehicles (SEVs) have emerged as a promising solution to contribute to sustainable urban mobility. However, ensuring the efficient operation and effective battery management of SEV systems remains a complex challenge. This challenge stems from factors such as slow plug-in charging, the potential role of SEVs in balancing grid load pressure, and the optimization of SEV operations to ensure their economic viability. To tackle these challenges, this paper introduces an integrated strategy for optimizing various aspects of SEV systems, encompassing strategies like Vehicle-to-Grid (V2G), Battery-to-Grid (B2G), and battery swapping. This approach is built on a space-time-energy network model that facilitates the optimization of battery charging and discharging scheduling, SEV operations like relocations and battery swapping, battery swapping station selection and the number of batteries. The objective of this approach is to maximize profits while addressing operational constraints and the complexities of energy management within SEV systems. Given the substantial complexity that arises with large-problem scales, the paper introduces a column generation-based heuristic algorithm. Extensive experimental validation is conducted, including sensitivity analysis on different charging speeds and fleet sizes. The results illuminate the impact of varying charging rates and fleet sizes on performance indicators. Notably, it is observed that battery swapping is particularly effective as an auxiliary charging method when the number of vehicles is limited. Conversely, in scenarios with a large fleet, the necessity for battery swapping diminishes. Moreover, results show the effectiveness of V2G and B2G technologies in grid load balancing.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.06136 [pdf, other]

Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning

Authors: Xiaojie Li, Yibo Yang, Jianlong Wu, Bernard Ghanem, Liqiang Nie, Min Zhang

Abstract: Few-shot class-incremental learning (FSCIL) confronts the challenge of integrating new classes into a model with minimal training samples while preserving the knowledge of previously learned classes. Traditional methods widely adopt static adaptation relying on a fixed parameter space to learn from data that arrive sequentially, prone to overfitting to the current session. Existing dynamic strategies require the expansion of the parameter space continually, leading to increased complexity. To address these challenges, we integrate the recently proposed selective state space model (SSM) into FSCIL. Concretely, we propose a dual selective SSM projector that dynamically adjusts the projection parameters based on the intermediate features for dynamic adaptation. The dual design enables the model to maintain the robust features of base classes, while adaptively learning distinctive feature shifts for novel classes. Additionally, we develop a class-sensitive selective scan mechanism to guide dynamic adaptation. It minimizes the disruption to base-class representations caused by training on novel data, and meanwhile, forces the selective scan to perform in distinct patterns between base and novel classes. Experiments on miniImageNet, CUB-200, and CIFAR-100 demonstrate that our framework outperforms the existing state-of-the-art methods. The code is available at https://github.com/xiaojieli0903/Mamba-FSCIL.

Submitted 8 July, 2024; originally announced July 2024.

Comments: Code: https://github.com/xiaojieli0903/Mamba-FSCIL

arXiv:2407.05743 [pdf, other]

$P_c(4457)$ interpreted as a $J^P=1/2^+$ state by $\bar{D}^0Λ^+_c(2595)$-$π^0 P_c(4312)$ interaction

Authors: Jin-Zi Wu, Jin-Yi Pang, Jia-Jun Wu

Abstract: $P_c(4457)$ was discovered over five years ago, but the parity of this particle remains undetermined. In this letter we propose a new interpretation for $P_c(4457)$, which is the state generated from the coupled-channel $\bar{D}^0Λ_c^{+}(2595)$ and $π^0 P_c(4312)$ since they can exchange an almost on-shell $Σ_c^+$. In this scenario, the parity of $P_c(4457)$ will be positive, which is different from the candidate of the bound state of $\bar{D}^*Σ_c$. The main decay channel of $P_c(4457)$ in this model is $P_c(4312)π$. We propose three processes $Λ_b^0 \to J/ψK_s p π^-$, $Λ_b^0 \to J/ψK^- p π^0$, and $Λ_b^0 \to J/ψp π^- π^+ K^-$ to verify $P_c(4457)\to P_c(4312)π$.

Submitted 8 July, 2024; originally announced July 2024.

Comments: 6 Pages, 2 figures

arXiv:2407.05319 [pdf, other]

Rethinking Targeted Adversarial Attacks For Neural Machine Translation

Authors: Junjie Wu, Lemao Liu, Wei Bi, Dit-Yan Yeung

Abstract: Targeted adversarial attacks are widely used to evaluate the robustness of neural machine translation systems. Unfortunately, this paper first identifies a critical issue in the existing settings of NMT targeted adversarial attacks, where their attacking results are largely overestimated. To this end, this paper presents a new setting for NMT targeted adversarial attacks that could lead to reliable attacking results. Under the new setting, it then proposes a Targeted Word Gradient adversarial Attack (TWGA) method to craft adversarial examples. Experimental results demonstrate that our proposed setting could provide faithful attacking results for targeted adversarial attacks on NMT systems, and the proposed TWGA method can effectively attack such victim NMT systems. In-depth analyses on a large-scale dataset further illustrate some valuable findings. Our code and data are available at https://github.com/wujunjie1998/TWGA.

Submitted 7 July, 2024; originally announced July 2024.

Comments: 5 pages, 2 figures, accepted by ICASSP 2024

arXiv:2407.05251 [pdf, ps, other]

Almost elementary groupoid models for $C^*$-algebras

Authors: Xin Ma, Jianchao Wu

Abstract: The notion of almost elementariness for a locally compact Hausdorff étale groupoid $\mathcal{G}$ with a compact unit space was introduced by the authors as a sufficient condition ensuring the reduced groupoid $C^*$-algebra $C^*_r(\mathcal{G})$ is (tracially) $\mathcal{Z}$-stable and thus classifiable under additional natural assumptions. In this paper, we explore the converse direction and show that many groupoids in the literature serving as models for classifiable $C^*$-algebras are almost elementary. In particular, for a large class $\mathcal{C}$ of Elliott invariants and a $C^*$-algebra $A$ with $\operatorname{Ell}(A)\in \mathcal{C}$, we show that $A$ is classifiable if and only if $A$ possesses a minimal, effective, amenable, second countable, almost elementary groupoid model, which leads to a groupoid-theoretic characterization of classifiability of $C^*$-algebras with certain Elliott invariants. Moreover, we build a connection between almost elementariness and pure infiniteness for groupoids and study obstructions to obtaining a transformation groupoid model for the Jiang-Su algebra $\mathcal{Z}$.

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.05108 [pdf, other]

The Role of Depth, Width, and Tree Size in Expressiveness of Deep Forest

Authors: Shen-Huan Lyu, Jin-Hui Wu, Qin-Cheng Zheng, Baoliu Ye

Abstract: Random forests are classical ensemble algorithms that construct multiple randomized decision trees and aggregate their predictions using naive averaging. Zhou and Feng (2019) further propose a deep forest algorithm with multi-layer forests, which outperforms random forests in various tasks. The performance of deep forests is related to three hyperparameters in practice: depth, width, and tree size, but little has been known about its theoretical explanation. This work provides the first upper and lower bounds on the approximation complexity of deep forests concerning the three hyperparameters. Our results confirm the distinctive role of depth, which can exponentially enhance the expressiveness of deep forests compared with width and tree size. Experiments confirm the theoretical findings.

Submitted 6 July, 2024; originally announced July 2024.

Journal ref: In: Proceedings of the 27th European Conference on Artificial Intelligence, 2024

arXiv:2407.04888 [pdf, other]

Unraveling Radiomics Complexity: Strategies for Optimal Simplicity in Predictive Modeling

Authors: Mahdi Ait Lhaj Loutfi, Teodora Boblea Podasca, Alex Zwanenburg, Taman Upadhaya, Jorge Barrios, David R. Raleigh, William C. Chen, Dante P. I. Capaldi, Hong Zheng, Olivier Gevaert, Jing Wu, Alvin C. Silva, Paul J. Zhang, Harrison X. Bai, Jan Seuntjens, Steffen Löck, Patrick O. Richard, Olivier Morin, Caroline Reinhold, Martin Lepage, Martin Vallières

Abstract: Background: The high dimensionality of radiomic feature sets, the variability in radiomic feature types and potentially high computational requirements all underscore the need for an effective method to identify the smallest set of predictive features for a given clinical problem. Purpose: Develop a methodology and tools to identify and explain the smallest set of predictive radiomic features. Materials and Methods: 89,714 radiomic features were extracted from five cancer datasets: low-grade glioma, meningioma, non-small cell lung cancer (NSCLC), and two renal cell carcinoma cohorts (n=2104). Features were categorized by computational complexity into morphological, intensity, texture, linear filters, and nonlinear filters. Models were trained and evaluated on each complexity level using the area under the curve (AUC). The most informative features were identified, and their importance was explained. The optimal complexity level and associated most informative features were identified using systematic statistical significance analyses and a false discovery avoidance procedure, respectively. Their predictive importance was explained using a novel tree-based method. Results: MEDimage, a new open-source tool, was developed to facilitate radiomic studies. Morphological features were optimal for MRI-based meningioma (AUC: 0.65) and low-grade glioma (AUC: 0.68). Intensity features were optimal for CECT-based renal cell carcinoma (AUC: 0.82) and CT-based NSCLC (AUC: 0.76). Texture features were optimal for MRI-based renal cell carcinoma (AUC: 0.72). Tuning the Hounsfield unit range improved results for CECT-based renal cell carcinoma (AUC: 0.86). Conclusion: Our proposed methodology and software can estimate the optimal radiomics complexity level for specific medical outcomes, potentially simplifying the use of radiomics in predictive modeling across various contexts.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.04643 [pdf]

Granular Ta-Te nanowire superconductivity violating the Pauli limit

Authors: Lingxiao Zhao, Yi Zhao, Cuiying Pei, Changhua Li, Qi Wang, Juefei Wu, Weizheng Cao, Lin Xiong, Haiyin Zhu, Tianping Ying, Yanpeng Qi

Abstract: Strategies to achieve higher upper-critical-field superconductors (μ0Hc2(0)) are of great interest for both fundamental science and practical applications. While reducing the thickness of two-dimensional (2D) materials to a few layers significantly enhances μ0Hc2(0) with accompanied potential unconventional pairing mechanisms, further dimensional reduction to 1D compounds rarely exceeds the expected Pauli limit. Here, we report the discovery of a 1D granular Ta-Te nanowire that becomes superconducting under high pressure, with a maximum critical temperature (Tc) of 5.1 K. Remarkably, the μ0Hc2(0) reaches 16 T, which is twice the Pauli limit, setting a record of μ0Hc2(0) in all the reported 1D superconductors. Our work demonstrates that the Ta-Te nanowire not only is a potential candidate for applications in high magnetic fields, but also provides an ideal platform for further investigations of the mechanisms between nanowires and large μ0Hc2(0).

Submitted 5 July, 2024; originally announced July 2024.

Comments: 12 pages, 4 figures

arXiv:2407.04055 [pdf, other]

Benchmark on Drug Target Interaction Modeling from a Structure Perspective

Authors: Xinnan Zhang, Jialin Wu, Junyi Xie, Tianlong Chen, Kaixiong Zhou

Abstract: The prediction modeling of drug-target interactions is crucial to drug discovery and design, which has seen rapid advancements owing to deep learning technologies. Recently developed methods, such as those based on graph neural networks (GNNs) and Transformers, demonstrate exceptional performance across various datasets by effectively extracting structural information. However, the benchmarking of these novel methods often varies significantly in terms of hyperparameter settings and datasets, which limits algorithmic progress. In view of these, we conduct a comprehensive survey and benchmark for drug-target interaction modeling from a structure perspective, via integrating tens of explicit (i.e., GNN-based) and implicit (i.e., Transformer-based) structure learning algorithms. To this end, we first unify the hyperparameter setting within each class of structure learning methods. Moreover, we conduct a macroscopical comparison between these two classes of encoding strategies as well as the different featurization techniques that inform molecules' chemical and physical properties. We then carry out the microscopical comparison between all the integrated models across the six datasets, via comprehensively benchmarking their effectiveness and efficiency. Remarkably, the summarized insights from the benchmark studies lead to the design of model combos. We demonstrate that our combos can achieve new state-of-the-art performance on various datasets associated with cost-effective memory and computation. Our code is available at https://github.com/justinwjl/GTB-DTI/tree/main.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Submitted to NIPS 2024 Dataset and Benchmark

arXiv:2407.03868 [pdf]

Observation of exceptional line semimetal in three-dimensional non-Hermitian phononic crystals

Authors: Yejian Hu, Jien Wu, Peidong Ye, Weiyin Deng, Jiuyang Lu, Xueqin Huang, Ziyu Wang, Manzhu Ke, Zhengyou Liu

Abstract: Non-Hermitian topological phases, which exhibit unique features such as the skin effect and exceptional points originating from nontrivial band topologies in the complex plane, have attracted enormous attention in condensed-matter physics and metamaterials. Here we report the realization of an exceptional line semimetal in a three-dimensional non-Hermitian phononic crystal. A pair of exceptional rings with opposite topologies are connected by the drumhead bulk states in the first Brillouin zone. The exceptional rings not only possess wave-function topology and thus result in the drumhead surface states, but also host spectral topology and thereby give rise to the hybrid-order geometry-dependent skin effect in three dimensions. Our experimental results evidence the complete non-Hermitian bulk-boundary correspondence of the three-dimensional exceptional line semimetal, and may pave the way for designing non-Hermitian acoustic devices.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 5 figures

arXiv:2407.03676 [pdf]

Out-of-Plane Polarization from Spin Reflection Induces Field-Free Spin-Orbit Torque Switching in Structures with Canted NiO Interfacial Moments

Authors: Zhe Zhang, Zhuoyi Li, Yuzhe Chen, Fangyuan Zhu, Yu Yan, Yao Li, Liang He, Jun Du, Rong Zhang, Jing Wu, Xianyang Lu, Yongbing Xu

Abstract: Realizing deterministic current-induced spin-orbit torque (SOT) magnetization switching, especially in systems exhibiting perpendicular magnetic anisotropy (PMA), typically requires the application of a collinear in-plane field, posing a challenging problem. In this study, we successfully achieve field-free SOT switching in the CoFeB/MgO system. In a Ta/CoFeB/MgO/NiO/Ta structure, spin reflection at the NiO interface, characterized by noncollinear spin structures with canted magnetization, generates a spin current with an out-of-plane spin polarization σz. We confirm the contribution of σz to the field-free SOT switching through measurements of the shift effect in the out-of-plane magnetization hysteresis loops under different currents. The incorporation of NiO as an antiferromagnetic insulator mitigates the current shunting effect and ensures excellent thermal stability of the device. The sample with 0.8 nm MgO and 2 nm NiO demonstrates an impressive optimal switching ratio approaching 100% without an in-plane field. This breakthrough in the CoFeB/MgO system promises significant applications in spintronics, advancing us closer to realizing innovative technologies.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.03671 [pdf]

Spatio-temporal cooperative control Method of Highway Ramp Merge Based on Vehicle-road Coordination

Authors: Xiaoxue Xu, Maokai Lai, Haitao Zhang, Xiang Dong, Tao Li, Jie Wu, Yuan Li, Ting Peng

Abstract: The merging area of highway ramps faces multiple challenges, including traffic congestion, collision risks, speed mismatches, driver behavior uncertainties, limited visibility, and bottleneck effects. However, autonomous vehicles engaging in deep coordination between vehicle and road in merging zones, by pre-planning and uploading travel trajectories, can significantly enhance the safety and efficiency of merging zones. In this paper, we mainly introduce a mainline priority cooperation method to achieve the time and space cooperative control of highway merging. Vehicle-mounted intelligent units share real-time vehicle status and driving intentions with Road Section Management Units, which pre-plan the spatiotemporal trajectories of vehicle travel. After receiving these trajectories, Vehicle Intelligent Units strictly adhere to them. Through this deep collaboration between vehicles and roads, conflicts in time and space during vehicle travel are eliminated in advance.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.03590 [pdf, other]

A Fast Dynamic Point Detection Method for LiDAR-Inertial Odometry in Driving Scenarios

Authors: Zikang Yuan, Xiaoxiang Wang, Jingying Wu, Junda Cheng, Xin Yang

Abstract: Existing 3D point-based dynamic point detection and removal methods have a significant time overhead, making them difficult to adapt to LiDAR-inertial odometry systems. This paper proposes a label consistency based dynamic point detection and removal method for handling moving vehicles and pedestrians in autonomous driving scenarios, and embeds the proposed dynamic point detection and removal method into a self-designed LiDAR-inertial odometry system. Experimental results on three public datasets demonstrate that our method can accomplish the dynamic point detection and removal with extremely low computational overhead (i.e., 1$\sim$9 ms) in LIO systems, while achieving preservation and rejection rates comparable to state-of-the-art methods and significantly enhancing the accuracy of pose estimation. We have released the source code of this work for the development of the community.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 8 pages, submitted to RA-L

arXiv:2407.03243 [pdf, other]

Visual Grounding with Attention-Driven Constraint Balancing

Authors: Weitai Kang, Luowei Zhou, Junyi Wu, Changchang Sun, Yan Yan

Abstract: Unlike Object Detection, Visual Grounding task necessitates the detection of an object described by complex free-form language. To simultaneously model such complex semantic and visual representations, recent state-of-the-art studies adopt transformer-based models to fuse features from both modalities, further introducing various modules that modulate visual features to align with the language expressions and eliminate the irrelevant redundant information. However, their loss function, still adopting common Object Detection losses, solely governs the bounding box regression output, failing to fully optimize for the above objectives. To tackle this problem, in this paper, we first analyze the attention mechanisms of transformer-based models. Building upon this, we further propose a novel framework named Attention-Driven Constraint Balancing (AttBalance) to optimize the behavior of visual features within language-relevant regions. Extensive experimental results show that our method brings impressive improvements. Specifically, we achieve constant improvements over five different models evaluated on four different benchmarks. Moreover, we attain a new state-of-the-art performance by integrating our method into QRNet.

Submitted 6 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03201 [pdf, other]

doi 10.1038/s44306-024-00035-2

Wideband Coherent Microwave Conversion via Magnon Nonlinearity in Hybrid Quantum System

Authors: Jiahao Wu, Jiacheng Liu, Zheyu Ren, Man Yin Leung, Wai Kuen Leung, Kin On Ho, Xiangrong Wang, Qiming Shao, Sen Yang

Abstract: Frequency conversion is a widely realized physical process in nonlinear systems of optics and electronics. As an emerging nonlinear platform, spintronic devices have the potential to achieve stronger frequency conversion. Here, we demonstrate a microwave frequency conversion method in a hybrid quantum system, integrating nitrogen-vacancy centers in diamond with magnetic thin film CoFeB. We achieve a conversion bandwidth ranging from 0.1 to 12 GHz, presenting an up to $\mathrm{25^{th}}$ order frequency conversion, and further display the application of this method for frequency detection and qubit coherent control. Distinct from traditional frequency conversion techniques based on nonlinear electric response, our approach employs nonlinear magnetic response in spintronic devices. The nonlinearity, originating from symmetry breaking such as domain walls in magnetic films, indicates that our method can be adapted to hybrid systems of other spintronic devices and spin qubits, expanding the application scope of spintronic devices and providing a promising on-chip platform for coupling quantum systems.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 11 pages, 5 figures

Journal ref: npj Spintronics volume 2, Article number: 30 (2024)

arXiv:2407.03130 [pdf, other]

Towards Efficient Pixel Labeling for Industrial Anomaly Detection and Localization

Authors: Hanxi Li, Jingqi Wu, Lin Yuanbo Wu, Hao Chen, Deyin Liu, Chunhua Shen

Abstract: In the realm of practical Anomaly Detection (AD) tasks, manual labeling of anomalous pixels proves to be a costly endeavor. Consequently, many AD methods are crafted as one-class classifiers, tailored for training sets completely devoid of anomalies, ensuring a more cost-effective approach. While some pioneering work has demonstrated heightened AD accuracy by incorporating real anomaly samples in training, this enhancement comes at the price of labor-intensive labeling processes. This paper strikes the balance between AD accuracy and labeling expenses by introducing ADClick, a novel Interactive Image Segmentation (IIS) algorithm. ADClick efficiently generates "ground-truth" anomaly masks for real defective images, leveraging innovative residual features and meticulously crafted language prompts. Notably, ADClick showcases a significantly elevated generalization capacity compared to existing state-of-the-art IIS approaches. Functioning as an anomaly labeling tool, ADClick generates high-quality anomaly labels (AP $= 94.1\%$ on MVTec AD) based on only $3$ to $5$ manual click annotations per training image. Furthermore, we extend the capabilities of ADClick into ADClick-Seg, an enhanced model designed for anomaly detection and localization. By fine-tuning the ADClick-Seg model using the weak labels inferred by ADClick, we establish the state-of-the-art performances in supervised AD tasks (AP $= 86.4\%$ on MVTec AD and AP $= 78.4\%$, PRO $= 98.6\%$ on KSDD2).

Submitted 4 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

Comments: 18 pages, 5 figures

arXiv:2407.03069 [pdf, other]

Frequency-selective terahertz wave amplification by a time-boundary-engineered Huygens metasurface

Authors: Fu Deng, Fengjie Zhu, Xiaoyue Zhou, Yi Chan, Jingbo Wu, Caihong Zhang, Biaobing Jin, Jensen Li, Kebin Fan, Jingdi Zhang

Abstract: Ultrafast manipulation of optical resonance can establish the time-boundary effect in time-variant media leading to a new degree of freedom for coherent control of electromagnetic waves. Here, we demonstrate that a free-standing all dielectric Huygens metasurface of degenerate electric and magnetic resonances can prompt the broadband near-unity transmission in its static state, whereas it enables wave amplification in the presence of time boundary. The time boundary is realized by femtosecond laser excitations that transiently inject free carriers into the constituent meta-atoms for dynamic removal of a pre-established two-fold degeneracy. We observe that the transmittance in the photo-excited Huygens metasurface can exceed unity transmittance, i.e., THz wave amplification, by a factor over 20% in intensity at frequencies tunable by varying the arrival of time boundary with respect to that of the seed terahertz pulse. By numerical simulations and analysis with time-dependent coupled mode theory, we show that the wave amplification results from the ultrafast Q-switching and shift in resonant frequencies. This work demonstrates a new approach to achieve tunable amplification in an optical microcavity by exploiting the concept of time-variant media and the unique electromagnetic properties of Huygens metasurface.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.03014 [pdf]

Dielectric Fano Nanoantennas for Enabling Sub-Nanosecond Lifetimes in NV-based Single Photon Emitters

Authors: Shu An, Dmitry Kalashnikov, Wenqiao Shi, Zackaria Mahfoud, Ah Bian Chew, Yan Liu, Jing Wu, Di Zhu, Weibo Gao, Cheng-Wei Qiu, Victor Leong, Zhaogang Dong

Abstract: Solid-state quantum emitters are essential sources of single photons, and enhancing their emission rates is of paramount importance for applications in quantum communications, computing, and metrology. One approach is to couple quantum emitters with resonant photonic nanostructures, where the emission rate is enhanced due to the Purcell effect. Dielectric nanoantennas are promising as they provide strong emission enhancement compared to plasmonic ones, which suffer from high Ohmic loss. Here, we designed and fabricated a dielectric Fano resonator based on a pair of silicon (Si) ellipses and a disk, which supports the mode hybridization between quasi-bound-states-in-the-continuum (quasi-BIC) and Mie resonance. We demonstrated the performance of the developed resonant system by interfacing it with single photon emitters (SPEs) based on nitrogen-vacancy (NV-) centers in nanodiamonds (NDs). We observed that the interfaced emitters have a Purcell enhancement factor of ~10, with sub-ns emission lifetime and a polarization contrast of 9. Our results indicate a promising method for developing efficient and compact single-photon sources for integrated quantum photonics applications.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 20 pages, 4 figures

arXiv:2407.02899 [pdf, other]

Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, et al. (639 additional authors not shown)

Abstract: A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the BESIII detector at the BEPCII storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02886 [pdf, other]

A Wolf in Sheep's Clothing: Practical Black-box Adversarial Attacks for Evading Learning-based Windows Malware Detection in the Wild

Authors: Xiang Ling, Zhiyu Wu, Bin Wang, Wei Deng, Jingzheng Wu, Shouling Ji, Tianyue Luo, Yanjun Wu

Abstract: Given the remarkable achievements of existing learning-based malware detection in both academia and industry, this paper presents MalGuise, a practical black-box adversarial attack framework that evaluates the security risks of existing learning-based Windows malware detection systems under the black-box setting. MalGuise first employs a novel semantics-preserving transformation of call-based redividing to concurrently manipulate both nodes and edges of malware's control-flow graph, making it less noticeable. By employing a Monte-Carlo-tree-search-based optimization, MalGuise then searches for an optimized sequence of call-based redividing transformations to apply to the input Windows malware for evasions. Finally, it reconstructs the adversarial malware file based on the optimized transformation sequence while adhering to Windows executable format constraints, thereby maintaining the same semantics as the original. MalGuise is systematically evaluated against three state-of-the-art learning-based Windows malware detection systems under the black-box setting. Evaluation results demonstrate that MalGuise achieves a remarkably high attack success rate, mostly exceeding 95%, with over 91% of the generated adversarial malware files maintaining the same semantics. Furthermore, MalGuise achieves up to a 74.97% attack success rate against five anti-virus products, highlighting potential tangible security concerns to real-world users.

Submitted 3 July, 2024; originally announced July 2024.

Comments: This paper has been accepted by 33rd USENIX Security Symposium 2024

arXiv:2407.02835 [pdf, other]

doi 10.1109/LSP.2023.3324581

A Pairwise DomMix Attentive Adversarial Network for Unsupervised Domain Adaptive Object Detection

Authors: Jie Shao, Jiacheng Wu, Wenzhong Shen, Cheng Yang

Abstract: Unsupervised Domain Adaptive Object Detection (DAOD) could adapt a model trained on a source domain to an unlabeled target domain for object detection. Existing unsupervised DAOD methods usually perform feature alignments from the target to the source. Unidirectional domain transfer would omit information about the target samples and result in suboptimal adaptation when there are large domain shifts. Therefore, we propose a pairwise attentive adversarial network with a Domain Mixup (DomMix) module to mitigate the aforementioned challenges. Specifically, a deep-level mixup is employed to construct an intermediate domain that allows features from both domains to share their differences. Then a pairwise attentive adversarial network is applied with attentive encoding on both image-level and instance-level features at different scales and optimizes domain alignment by adversarial learning. This allows the network to focus on regions with disparate contextual information and learn their similarities between different domains. Extensive experiments are conducted on several benchmark datasets, demonstrating the superiority of our proposed method.

Submitted 3 July, 2024; originally announced July 2024.

Comments: Published in IEEE Signal Processing Letters, 2023

arXiv:2407.01965 [pdf, other]

AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment

Authors: Yilong Lai, Jialong Wu, Congzhi Zhang, Haowen Sun, Deyu Zhou

Abstract: Conversational Query Reformulation (CQR) has significantly advanced in addressing the challenges of conversational search, particularly those stemming from the latent user intent and the need for historical context. Recent works aimed to boost the performance of CQR through alignment. However, they are designed for one specific retrieval system, which potentially results in poor generalization. To overcome this limitation, we present a novel framework AdaCQR. By aligning reformulation models with both term-based and semantic-based retrieval systems, AdaCQR enhances the generalizability of information-seeking queries across diverse retrieval environments through a dual-phase training strategy. We also developed two effective approaches for acquiring superior labels and diverse input candidates, boosting the efficiency and robustness of the framework. Experimental evaluations on the TopiOCQA and QReCC datasets demonstrate that AdaCQR significantly outperforms existing methods, offering both quantitative and qualitative improvements in conversational query reformulation.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01862 [pdf, other]

Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The 3rd BARN Challenge at ICRA 2024

Authors: Xuesu Xiao, Zifan Xu, Aniket Datar, Garrett Warnell, Peter Stone, Joshua Julian Damanik, Jaewon Jung, Chala Adane Deresa, Than Duc Huy, Chen Jinyu, Chen Yichen, Joshua Adrian Cahyono, Jingda Wu, Longfei Mo, Mingyang Lv, Bowen Lan, Qingyang Meng, Weizhi Tao, Li Cheng

Abstract: The 3rd BARN (Benchmark Autonomous Robot Navigation) Challenge took place at the 2024 IEEE International Conference on Robotics and Automation (ICRA 2024) in Yokohama, Japan and continued to evaluate the performance of state-of-the-art autonomous ground navigation systems in highly constrained environments. Similar to the trend in The 1st and 2nd BARN Challenge at ICRA 2022 and 2023 in Philadelphia (North America) and London (Europe), The 3rd BARN Challenge in Yokohama (Asia) became more regional, i.e., mostly Asian teams participated. The size of the competition has slightly shrunk (six simulation teams, four of which were invited to the physical competition). The competition results, compared to the last two years, suggest that the field has adopted new machine learning approaches while at the same time slightly converged to a few common practices. However, the regional nature of the physical participants suggests a challenge to promote wider participation all over the world and provide more resources to travel to the venue. In this article, we discuss the challenge, the approaches used by the three winning teams, and lessons learned to direct future research and competitions.

Submitted 1 July, 2024; originally announced July 2024.

Comments: arXiv admin note: text overlap with arXiv:2308.03205

arXiv:2407.01731 [pdf, other]

Uncertainty Quantification in Table Structure Recognition

Authors: Kehinde Ajayi, Leizhen Zhang, Yi He, Jian Wu

Abstract: Quantifying uncertainties for machine learning models is a critical step to reduce human verification effort by detecting predictions with low confidence. This paper proposes a method for uncertainty quantification (UQ) of table structure recognition (TSR). The proposed UQ method is built upon a mixture-of-expert approach termed Test-Time Augmentation (TTA). Our key idea is to enrich and diversify the table representations, to spotlight the cells with high recognition uncertainties. To evaluate the effectiveness, we proposed two heuristics to differentiate highly uncertain cells from normal cells, namely, masking and cell complexity quantification. Masking involves varying the pixel intensity to deem the detection uncertainty. Cell complexity quantification gauges the uncertainty of each cell by its topological relation with neighboring cells. The evaluation results based on standard benchmark datasets demonstrate that the proposed method is effective in quantifying uncertainty in TSR models. To the best of our knowledge, this study is the first of its kind to enable UQ in TSR tasks. Our code and data are available at: https://github.com/lamps-lab/UQTTA.git.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 7 Figures

arXiv:2407.01458 [pdf, other]

Contractual Reinforcement Learning: Pulling Arms with Invisible Hands

Authors: Jibang Wu, Siyu Chen, Mengdi Wang, Huazheng Wang, Haifeng Xu

Abstract: The agency problem emerges in today's large scale machine learning tasks, where the learners are unable to direct content creation or enforce data collection. In this work, we propose a theoretical framework for aligning economic interests of different stakeholders in the online learning problems through contract design. The problem, termed \emph{contractual reinforcement learning}, naturally arises from the classic model of Markov decision processes, where a learning principal seeks to optimally influence the agent's action policy for their common interests through a set of payment rules contingent on the realization of next state. For the planning problem, we design an efficient dynamic programming algorithm to determine the optimal contracts against the far-sighted agent. For the learning problem, we introduce a generic design of no-regret learning algorithms to untangle the challenges from robust design of contracts to the balance of exploration and exploitation, reducing the complexity analysis to the construction of efficient search algorithms. For several natural classes of problems, we design tailored search algorithms that provably achieve $\tilde{O}(\sqrt{T})$ regret. We also present an algorithm with $\tilde{O}(T^{2/3})$ for the general problem that improves the existing analysis in online contract design with mild technical assumptions.

Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.01418 [pdf, other]

RoboPack: Learning Tactile-Informed Dynamics Models for Dense Packing

Authors: Bo Ai, Stephen Tian, Haochen Shi, Yixuan Wang, Cheston Tan, Yunzhu Li, Jiajun Wu

Abstract: Tactile feedback is critical for understanding the dynamics of both rigid and deformable objects in many manipulation tasks, such as non-prehensile manipulation and dense packing. We introduce an approach that combines visual and tactile sensing for robotic manipulation by learning a neural, tactile-informed dynamics model. Our proposed framework, RoboPack, employs a recurrent graph neural network to estimate object states, including particles and object-level latent physics information, from historical visuo-tactile observations and to perform future state predictions. Our tactile-informed dynamics model, learned from real-world data, can solve downstream robotics tasks with model-predictive control. We demonstrate our approach on a real robot equipped with a compliant Soft-Bubble tactile sensor on non-prehensile manipulation and dense packing tasks, where the robot must infer the physics properties of objects from direct and indirect interactions. Trained on only an average of 30 minutes of real-world interaction data per task, our model can perform online adaptation and make touch-informed predictions. Through extensive evaluations in both long-horizon dynamics prediction and real-world manipulation, our method demonstrates superior effectiveness compared to previous learning-based and physics-based simulation systems.

Submitted 1 July, 2024; originally announced July 2024.

Comments: Robotics: Science and Systems (RSS), 2024. Project page: https://robo-pack.github.io/

ACM Class: I.2.9; I.2.6; I.2.10

arXiv:2407.00932 [pdf, other]

Orbital phases of $p$-band ultracold fermions in the frustrated triangular lattice

Authors: Jiaqi Wu, Hui Tan, Rui Cao, Jianmin Yuan, Yongqiang Li

Abstract: Orbital degrees of freedom play an important role in understanding the emergence of unconventional quantum phases. Ultracold atomic gases in optical lattices provide a wonderful platform to simulate orbital physics. In this work, we consider spinless fermionic atoms loaded into $p$-orbital bands of a two-dimensional frustrated triangular lattice. The system can be described by an extended Fermi-Hubbard model, which is numerically solved by using the orbital version of real-space dynamical mean-field theory. Low-temperature phase diagrams are obtained, which contain stripe-, ferro- and para-orbital ordered quantum phases, due to the interplay of anisotropic hoppings and geometrical frustration. In order to understand the underlying mechanics of competing orbital orders, we derive an effective orbital-exchange model, which yields an explanation consistent with our main numerical results.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 9 pages, 7 figures

arXiv:2407.00729 [pdf]

Discovering one molecule out of a million: inverse design of molecular hole transporting semiconductors tailored for perovskite solar cells

Authors: Jianchang Wu, Luca Torresi, ManMan Hu, Patrick Reiser, Jiyun Zhang, Juan S. Rocha-Ortiz, Luyao Wang, Zhiqiang Xie, Kaicheng Zhang, Byung-wook Park, Anastasia Barabash, Yicheng Zhao, Junsheng Luo, Yunuo Wang, Larry Lüer, Lin-Long Deng, Jens A. Hauch, Sang Il Seok, Pascal Friederich, Christoph J. Brabec

Abstract: The inverse design of tailored organic molecules for specific optoelectronic devices of high complexity holds an enormous potential, but has not yet been realized. The complexity and literally infinite diversity of conjugated molecular structures present both an unprecedented opportunity for technological breakthroughs and an unseen optimization challenge. Current models rely on big data which do not exist for specialized research films. However, a hybrid computational and high throughput experimental screening workflow allowed us to train predictive models with as little as 149 molecules. We demonstrate a unique closed-loop workflow combining high throughput synthesis and Bayesian optimization that discovers new hole transporting materials with tailored properties for solar cell applications. A series of high-performance molecules were identified from minimal suggestions, achieving up to 26.23% (certified 25.88%) power conversion efficiency in perovskite solar cells. Our work paves the way for rapid, informed discovery in vast molecular libraries, revolutionizing material selection for complex devices. We believe that our approach can be generalized to other emerging fields and indeed accelerate the development of optoelectronic semiconductor devices in general.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 21 pages, 5 figures

arXiv:2407.00631 [pdf, other]

TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets

Authors: Jintai Chen, Yaojun Hu, Yue Wang, Yingzhou Lu, Xu Cao, Miao Lin, Hongxia Xu, Jian Wu, Cao Xiao, Jimeng Sun, Lucas Glass, Kexin Huang, Marinka Zitnik, Tianfan Fu

Abstract: Clinical trials are pivotal for developing new medical treatments, yet they typically pose some risks such as patient mortality, adverse events, and enrollment failure that waste immense efforts spanning over a decade. Applying artificial intelligence (AI) to forecast or simulate key events in clinical trials holds great potential for providing insights to guide trial designs. However, complex data collection and question definition requiring medical expertise and a deep understanding of trial designs have hindered the involvement of AI thus far. This paper tackles these challenges by presenting a comprehensive suite of meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design, encompassing prediction of trial duration, patient dropout rate, serious adverse event, mortality rate, trial approval outcome, trial failure reason, drug dose finding, and design of eligibility criteria. Furthermore, we provide basic validation methods for each task to ensure the datasets' usability and reliability. We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design, ultimately advancing clinical trial research and accelerating medical solution development. The curated dataset, metrics, and basic models are publicly available at https://github.com/ML2Health/ML2ClinicalTrials/tree/main/AI4Trial.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2407.00136 [pdf, other]

Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere , et al. (495 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.

Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

arXiv:2406.20085 [pdf, other]

Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

Authors: Yicheng Chen, Xiangtai Li, Yining Li, Yanhong Zeng, Jianzong Wu, Xiangyu Zhao, Kai Chen

Abstract: Diffusion-based models have shown great potential in generating high-quality images with various layouts, which can benefit downstream perception tasks. However, a fully automatic layout generation driven only by language and a suitable metric for measuring multiple generated instances has not been well explored. In this work, we present Auto Cherry-Picker (ACP), a novel framework that generates high-quality multi-modal training examples to augment perception and multi-modal training. Starting with a simple list of natural language concepts, we prompt large language models (LLMs) to generate a detailed description and design reasonable layouts. Next, we use an off-the-shelf text-to-image model to generate multiple images. Then, the generated data are refined using a comprehensively designed metric to ensure quality. In particular, we present a new metric, Composite Layout and Image Score (CLIS), to evaluate the generated images fairly. Our synthetic high-quality examples boost performance in various scenarios by customizing the initial concept list, especially in addressing challenges associated with long-tailed distribution and imbalanced datasets. Experiment results on downstream tasks demonstrate that Auto Cherry-Picker can significantly improve the performance of existing models. In addition, we have thoroughly investigated the correlation between CLIS and performance gains in downstream tasks, and we find that a better CLIS score results in better performance. This finding shows the potential for evaluation metrics as the role for various visual perception and MLLM tasks. Code will be available.

Submitted 28 June, 2024; originally announced June 2024.

Comments: 19 pages, 7 figures

arXiv:2406.19939 [pdf, other]

Data-driven methods for flow and transport in porous media: a review

Authors: Guang Yang, Ran Xu, Yusong Tian, Songyuan Guo, Jingyi Wu, Xu Chu

Abstract: This review examined the current advancements in data-driven methods for analyzing flow and transport in porous media, which has various applications in energy, chemical engineering, environmental science, and beyond. Although there has been progress in recent years, the challenges of current experimental and high-fidelity numerical simulations, such as high computational costs and difficulties in accurately representing complex, heterogeneous structures, can still potentially be addressed by state-of-the-art data-driven methods. We analyzed the synergistic potential of these methods, addressed their limitations, and suggested how they can be effectively integrated to improve both the fidelity and efficiency of current research. A discussion on future research directions in this field was conducted, emphasizing the need for collaborative efforts that combine domain expertise in physics and advanced computational and data-driven methodologies.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.19190 [pdf, ps, other]

Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (643 additional authors not shown)

Abstract: Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 13 pages, 6 figures

arXiv:2406.18846 [pdf, other]

AFBench: A Large-scale Benchmark for Airfoil Design

Authors: Jian Liu, Jianyu Wu, Hairun Xie, Guoqing Zhang, Jing Wang, Wei Liu, Wanli Ouyang, Junjun Jiang, Xianming Liu, Shixiang Tang, Miao Zhang

Abstract: Data-driven generative models have emerged as promising approaches towards achieving efficient mechanical inverse design. However, due to prohibitively high cost in time and money, there is still a lack of open-source and large-scale benchmarks in this field. This is especially the case for airfoil inverse design, which requires generating and editing diverse geometric-qualified and aerodynamic-qualified airfoils following multimodal instructions, \emph{i.e.,} dragging points and physical parameters. This paper presents the open-source endeavors in airfoil inverse design, \emph{AFBench}, including a large-scale dataset with 200 thousand airfoils and high-quality aerodynamic and geometric labels, two novel and practical airfoil inverse design tasks, \emph{i.e.,} conditional generation on multimodal physical parameters, controllable editing, and comprehensive metrics to evaluate various existing airfoil inverse design methods. Our aim is to establish \emph{AFBench} as an ecosystem for training and evaluating airfoil inverse design methods, with a specific focus on data-driven controllable inverse design models by multimodal instructions capable of bridging the gap between ideas and execution, the academic research and industrial applications. We have provided baseline models, comprehensive experimental observations, and analysis to accelerate future research. Our baseline model is trained on an RTX 3090 GPU within 16 hours. The codebase, datasets and benchmarks will be available at \url{https://hitcslj.github.io/afbench/}.

Submitted 26 June, 2024; originally announced June 2024.

Comments: Submitted to NeurIPS 2024 Dataset & Benchmark Track

arXiv:2406.18611 [pdf, other]

doi 10.1016/j.jfluidstructs.2022.103793

Analysis of Full-scale Riser Responses in Field Conditions Based on Gaussian Mixture Model

Authors: Jie Wu, Sølve Eidnes, Jingzhe Jin, Halvor Lie, Decao Yin, Elizabeth Passano, Svein Sævik, Signe Riemer-Sorensen

Abstract: Offshore slender marine structures experience complex and combined load conditions from waves, current and vessel motions that may result in both wave frequency and vortex shedding response patterns. Field measurements often consist of records of environmental conditions and riser responses, typically with 30-minute intervals. These data can be represented in a high-dimensional parameter space. However, it is difficult to visualize and understand the structural responses, as they are affected by many of these parameters. It becomes easier to identify trends and key parameters if the measurements with the same characteristics can be grouped together. Cluster analysis is an unsupervised learning method, which groups the data based on their relative distance, density of the data space, intervals, or statistical distributions. In the present study, a Gaussian mixture model guided by domain knowledge has been applied to analyze field measurements. Using the 242 measurement events of the Helland-Hansen riser, it is demonstrated that riser responses can be grouped into 12 clusters by the identification of key environmental parameters. This results in an improved understanding of complex structure responses. Furthermore, the cluster results are valuable for evaluating the riser response prediction accuracy.

Submitted 25 June, 2024; originally announced June 2024.

Comments: Matches accepted version

Journal ref: Journal of Fluids and Structures, Volume 116, 2023, 103793

arXiv:2406.18532 [pdf, other]

Symbolic Learning Enables Self-Evolving Agents

Authors: Wangchunshu Zhou, Yixin Ou, Shengwei Ding, Long Li, Jialong Wu, Tiannan Wang, Jiamin Chen, Shuai Wang, Xiaohua Xu, Ningyu Zhang, Huajun Chen, Yuchen Eleanor Jiang

Abstract: The AI community has been exploring a pathway to artificial general intelligence (AGI) by developing "language agents", which are complex large language model (LLM) pipelines involving both prompting techniques and tool usage methods. While language agents have demonstrated impressive capabilities for many real-world tasks, a fundamental limitation of current language agents research is that they are model-centric, or engineering-centric. That is to say, the progress on prompts, tools, and pipelines of language agents requires substantial manual engineering efforts from human experts rather than automatically learning from data. We believe the transition from model-centric, or engineering-centric, to data-centric, i.e., the ability of language agents to autonomously learn and evolve in environments, is the key for them to possibly achieve AGI. In this work, we introduce agent symbolic learning, a systematic framework that enables language agents to optimize themselves on their own in a data-centric way using symbolic optimizers. Specifically, we consider agents as symbolic networks where learnable weights are defined by prompts, tools, and the way they are stacked together. Agent symbolic learning is designed to optimize the symbolic network within language agents by mimicking two fundamental algorithms in connectionist learning: back-propagation and gradient descent. Instead of dealing with numeric weights, agent symbolic learning works with natural language simulacrums of weights, loss, and gradients. We conduct proof-of-concept experiments on both standard benchmarks and complex real-world tasks and show that agent symbolic learning enables language agents to update themselves after being created and deployed in the wild, resulting in "self-evolving agents".

Submitted 26 June, 2024; originally announced June 2024.

Comments: Code available at https://github.com/aiwaves-cn/agents

arXiv:2406.18283 [pdf, other]

Cascaded multi-phonon stimulated Raman scattering near second-harmonic-generation in thin-film lithium niobate microdisk

Authors: Yuxuan He, Xiongshuo Yan, Jiangwei Wu, Xiangmin Liu, Yuping Chen, Xianfeng Chen

Abstract: High-quality microresonators can greatly enhance light-matter interactions and are excellent platforms for studying nonlinear optics. Wavelength conversion through nonlinear processes is the key to many applications of integrated optics. The stimulated Raman scattering process can extend the emission wavelength of a laser source to a wider range. Lithium niobate, as a Raman active crystalline material, has remarkable potential for wavelength conversion. Here, we demonstrate the generation of cascaded multi-phonon Raman signals near the second-harmonic-generation peak in an X-cut thin-film lithium niobate microdisk. Fine tuning of the specific cascaded Raman spectral lines has also been achieved by changing the pump wavelength. Raman lines can reach wavelengths up to about 80 nm away from the SHG signal. We realize the SFG process associated with Raman signals in the visible range as well. Our work extends the use of WGM microresonators as effective optical upconversion wavelength converters in nonlinear optical applications.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.18200 [pdf, other]

SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding

Authors: Zhenglin Wang, Jialong Wu, Yilong Lai, Congzhi Zhang, Deyu Zhou

Abstract: Large Language Models (LLMs) demonstrate remarkable emergent abilities across various tasks, yet fall short of complex reasoning and planning tasks. The tree-search-based reasoning methods address this by surpassing the capabilities of chain-of-thought prompting, encouraging exploration of intermediate steps. However, such methods introduce significant inference latency due to the systematic exploration and evaluation of multiple thought paths. This paper introduces SeeD, a novel and efficient inference framework to optimize runtime speed and GPU memory management concurrently. By employing a scheduled speculative execution, SeeD efficiently handles multiple iterations for the thought generation and the state evaluation, leveraging a rounds-scheduled strategy to manage draft model dispatching. Extensive experimental evaluations on three reasoning datasets demonstrate superior speedup performance of SeeD, providing a viable path for batched inference in training-free speculative decoding.

Submitted 26 June, 2024; originally announced June 2024.

Showing 1–50 of 5,644 results for author: Wu, J