Optimization of noncollinear magnetic ordering temperature in Y-type hexaferrite by machine learning

Authors: Yonghong Li, Jing Zhang, Linfeng Jiang, Long Zhang, Yugang Zhang, Xueliang Wu, Yisheng Chai, Xiaoyuan Zhou, Zizhen Zhou

Abstract: Searching for the optimal doping compositions of the Y-type hexaferrite Ba2Mg2Fe12O22 remains a long-standing challenge for enhancing the non-collinear magnetic transition temperature (TNC). Instead of the conventional trial-and-error approach, the composition-property descriptor is established via a data-driven machine learning method named SISSO (sure independence screening and sparsifying operator). Based on the chosen efficient and physically interpretable descriptor, a series of Y-type hexaferrite compositions are predicted to hold high TNC, among which BaSrMg0.28Co1.72Fe10Al2O22 is then experimentally validated. Test results indicate that, under appropriate external magnetic field conditions, the TNC of this composition reaches up to 568 K, and its magnetic transition temperature is also elevated to 735 K. This work offers a machine learning-based route to develop room temperature single phase multiferroics for device applications.

Submitted 9 July, 2024; originally announced July 2024.

Comments: accepted by Applied Physics Letters in 2024

arXiv:2407.02832 [pdf, other]

doi 10.1109/TGRS.2023.3337383

Style Alignment based Dynamic Observation Method for UAV-View Geo-localization

Authors: Jie Shao, LingHao Jiang

Abstract: The task of UAV-view geo-localization is to estimate the localization of a query satellite/drone image by matching it against a reference dataset consisting of drone/satellite images. Though tremendous strides have been made in feature alignment between satellite and drone views, vast differences in both inter and intra-class due to changes in viewpoint, altitude, and lighting remain a huge challenge. In this paper, a style alignment based dynamic observation method for UAV-view geo-localization is proposed to meet the above challenges from two perspectives: visual style transformation and surrounding noise control. Specifically, we introduce a style alignment strategy to transform the diverse visual style of drone-view images into a unified satellite images visual style. Then a dynamic observation module is designed to evaluate the spatial distribution of images by mimicking human observation habits. It is featured by the hierarchical attention block (HAB) with a dual-square-ring stream structure, to reduce surrounding noise and geographical deformation. In addition, we propose a deconstruction loss to push away features of different geo-tags and squeeze knowledge from unmatched images by correlation calculation. The experimental results demonstrate the state-of-the-art performance of our model on benchmarked datasets. In particular, when compared to the prior art on University-1652, our results surpass the best of them (FSRA), while only requiring 2x fewer parameters. Code will be released at https://github.com/Xcco1/SA_DOM

Submitted 3 July, 2024; originally announced July 2024.

Comments: published in IEEE Transactions on Geoscience and Remote Sensing, 2023

arXiv:2407.02234 [pdf, other]

How turbulence increases the bubble-particle collision rate

Authors: Linfeng Jiang, Dominik Krug

Abstract: We study the effect of turbulence on collisions between a finite-size bubble and small inertial particles based on interface-resolved simulations. Our results show that the interaction with the flow field around the bubble remains the dominant effect. Nonlinear dependencies in this process can enhance the turbulent collision rate by up to 100% compared to quiescent flow. Fluctuations in the bubble slip velocity during the interaction with the particle additionally increase the collision rate. We present a frozen-turbulence model that captures the relevant effects providing a physically consistent framework to model collisions of small inertial particles with finite-sized objects in turbulence.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01429 [pdf, other]

Generalized quantum repeater graph states

Authors: Bikun Li, Kenneth Goodenough, Filip Rozpędek, Liang Jiang

Abstract: All-photonic quantum repeaters are essential for establishing long-range quantum entanglement. Within repeater nodes, reliably performing entanglement swapping is a key component of scalable quantum communication. To tackle the challenge of probabilistic Bell state measurement in linear optics, which often leads to information loss, various approaches have been proposed to ensure the loss tolerance of distributing a single ebit. We have generalized previous work regarding repeater graph states with elaborate connectivity, enabling the efficient establishment of exploitable ebits at a finite rate with high probability. We demonstrate that our new scheme significantly outperforms the previous work with much flexibility and discuss the generation overhead of such resource states. These findings offer new insights into the scalability and reliability of loss-tolerant quantum networks.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2406.19465 [pdf, other]

Can Large Language Models Generate High-quality Patent Claims?

Authors: Lekang Jiang, Caiqi Zhang, Pascal A Scherz, Stephan Goetz

Abstract: Large language models (LLMs) have shown exceptional performance across various text generation tasks but remain under-explored in the patent domain, which offers highly structured and precise language. This paper constructs a dataset to investigate the performance of current LLMs in patent claim generation. Our results demonstrate that generating claims based on patent descriptions outperforms previous research relying on abstracts. Interestingly, current patent-specific LLMs perform much worse than state-of-the-art general LLMs, highlighting the necessity for future research on in-domain LLMs. We also find that LLMs can produce high-quality first independent claims, but their performances markedly decrease for subsequent dependent claims. Moreover, fine-tuning can enhance the completeness of inventions' features, conceptual clarity, and feature linkage. Among the tested LLMs, GPT-4 demonstrates the best performance in comprehensive human evaluations by patent experts, with better feature coverage, conceptual clarity, and technical coherence. Despite these capabilities, comprehensive revision and modification are still necessary to pass rigorous patent scrutiny and ensure legal robustness.

Submitted 27 June, 2024; originally announced June 2024.

Comments: 13 pages

arXiv:2406.18510 [pdf, other]

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

Authors: Liwei Jiang, Kavel Rao, Seungju Han, Allyson Ettinger, Faeze Brahman, Sachin Kumar, Niloofar Mireshghallah, Ximing Lu, Maarten Sap, Yejin Choi, Nouha Dziri

Abstract: We introduce WildTeaming, an automatic LLM safety red-teaming framework that mines in-the-wild user-chatbot interactions to discover 5.7K unique clusters of novel jailbreak tactics, and then composes multiple tactics for systematic exploration of novel jailbreaks. Compared to prior work that performed red-teaming via recruited human workers, gradient-based optimization, or iterative revision with LLMs, our work investigates jailbreaks from chatbot users who were not specifically instructed to break the system. WildTeaming reveals previously unidentified vulnerabilities of frontier LLMs, resulting in up to 4.6x more diverse and successful adversarial attacks compared to state-of-the-art jailbreak methods. While many datasets exist for jailbreak evaluation, very few open-source datasets exist for jailbreak training, as safety training data has been closed even when model weights are open. With WildTeaming we create WildJailbreak, a large-scale open-source synthetic safety dataset with 262K vanilla (direct request) and adversarial (complex jailbreak) prompt-response pairs. To mitigate exaggerated safety behaviors, WildJailbreak provides two contrastive types of queries: 1) harmful queries (vanilla & adversarial) and 2) benign queries that resemble harmful queries in form but contain no harm. As WildJailbreak considerably upgrades the quality and scale of existing safety resources, it uniquely enables us to examine the scaling effects of data and the interplay of data properties and model capabilities during safety training. Through extensive experiments, we identify the training properties that enable an ideal balance of safety behaviors: appropriate safeguarding without over-refusal, effective handling of vanilla and adversarial queries, and minimal, if any, decrease in general capabilities. All components of WildJailbreak contribute to achieving balanced safety behaviors of models.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.18495 [pdf, other]

WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Authors: Seungju Han, Kavel Rao, Allyson Ettinger, Liwei Jiang, Bill Yuchen Lin, Nathan Lambert, Yejin Choi, Nouha Dziri

Abstract: We introduce WildGuard -- an open, light-weight moderation tool for LLM safety that achieves three goals: (1) identifying malicious intent in user prompts, (2) detecting safety risks of model responses, and (3) determining model refusal rate. Together, WildGuard serves the increasing needs for automatic safety moderation and evaluation of LLM interactions, providing a one-stop tool with enhanced accuracy and broad coverage across 13 risk categories. While existing open moderation tools such as Llama-Guard2 score reasonably well in classifying straightforward model interactions, they lag far behind a prompted GPT-4, especially in identifying adversarial jailbreaks and in evaluating models' refusals, a key measure for evaluating safety behaviors in model responses. To address these challenges, we construct WildGuardMix, a large-scale and carefully balanced multi-task safety moderation dataset with 92K labeled examples that cover vanilla (direct) prompts and adversarial jailbreaks, paired with various refusal and compliance responses. WildGuardMix is a combination of WildGuardTrain, the training data of WildGuard, and WildGuardTest, a high-quality human-annotated moderation test set with 5K labeled items covering broad risk scenarios. Through extensive evaluations on WildGuardTest and ten existing public benchmarks, we show that WildGuard establishes state-of-the-art performance in open-source safety moderation across all the three tasks compared to ten strong existing open-source moderation models (e.g., up to 26.4% improvement on refusal detection). Importantly, WildGuard matches and sometimes exceeds GPT-4 performance (e.g., up to 3.9% improvement on prompt harmfulness identification). WildGuard serves as a highly effective safety moderator in an LLM interface, reducing the success rate of jailbreak attacks from 79.8% to 2.4%.

Submitted 9 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

Comments: First two authors contributed equally. Third and fourth authors contributed equally

arXiv:2406.17443 [pdf, other]

Using joint angles based on the international biomechanical standards for human action recognition and related tasks

Authors: Kevin Schlegel, Lei Jiang, Hao Ni

Abstract: Keypoint data has received a considerable amount of attention in machine learning for tasks like action detection and recognition. However, human experts in movement such as doctors, physiotherapists, sports scientists and coaches use a notion of joint angles standardised by the International Society of Biomechanics to precisely and efficiently communicate static body poses and movements. In this paper, we introduce the basic biomechanical notions and show how they can be used to convert common keypoint data into joint angles that uniquely describe the given pose and have various desirable mathematical properties, such as independence of both the camera viewpoint and the person performing the action. We experimentally demonstrate that the joint angle representation of keypoint data is suitable for machine learning applications and can in some cases bring an immediate performance gain. The use of joint angles as a human meaningful representation of kinematic data is in particular promising for applications where interpretability and dialog with human experts is important, such as many sports and medical applications. To facilitate further research in this direction, we will release a python package to convert keypoint data into joint angles as outlined in this paper.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.16328 [pdf, other]

Convolutional neural network based reduced order modeling for multiscale problems

Authors: Xuhan Zhang, Lijian Jiang

Abstract: In this paper, we combine convolutional neural networks (CNNs) with reduced order modeling (ROM) for efficient simulations of multiscale problems. These problems are modeled by partial differential equations with high-dimensional random inputs. The proposed method involves two separate CNNs: Basis CNNs and Coefficient CNNs (Coef CNNs), which correspond to two main parts of ROM. The method is called CNN-based ROM. The former one learns input-specific basis functions from the snapshots of fine-scale solutions. An activation function, inspired by Galerkin projection, is utilized at the output layer to reconstruct fine-scale solutions from the basis functions. Numerical results show that the basis functions learned by the Basis CNNs resemble data, which help to significantly reduce the number of the basis functions. Moreover, CNN-based ROM is less sensitive to data fluctuation caused by numerical errors than traditional ROM. Since the tests of Basis CNNs still need fine-scale stiffness matrix and load vector, it cannot be directly applied to nonlinear problems. The Coef CNNs can be applied to nonlinear problems and are designed to determine the coefficients for linear combination of basis functions. In addition, two applications of CNN-based ROM are presented, including predicting MsFEM basis functions within oversampling regions and building accurate surrogates for inverse problems.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 35 pages, 29 figures

arXiv:2406.13987 [pdf]

Image anomaly detection and prediction scheme based on SSA optimized ResNet50-BiGRU model

Authors: Qianhui Wan, Zecheng Zhang, Liheng Jiang, Zhaoqi Wang, Yan Zhou

Abstract: Image anomaly detection is a popular research direction, with many methods emerging in recent years due to rapid advancements in computing. The use of artificial intelligence for image anomaly detection has been widely studied. By analyzing images of athlete posture and movement, it is possible to predict injury status and suggest necessary adjustments. Most existing methods rely on convolutional networks to extract information from irrelevant pixel data, limiting model accuracy. This paper introduces a network combining Residual Network (ResNet) and Bidirectional Gated Recurrent Unit (BiGRU), which can predict potential injury types and provide early warnings by analyzing changes in muscle and bone poses from video images. To address the high complexity of this network, the Sparrow search algorithm was used for optimization. Experiments conducted on four datasets demonstrated that our model has the smallest error in image anomaly detection compared to other models, showing strong adaptability. This provides a new approach for anomaly detection and predictive analysis in images, contributing to the sustainable development of human health and performance.

Submitted 20 June, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.13203 [pdf]

Dynamical phase-field model of cavity electromagnonic systems

Authors: Shihao Zhuang, Yujie Zhu, Changchun Zhong, Liang Jiang, Xufeng Zhang, Jia-Mian Hu

Abstract: Cavity electromagnonic system, which simultaneously consists of cavities for photons, magnons (quanta of spin waves), and acoustic phonons, provides an exciting platform to achieve coherent energy transduction among different physical systems down to single quantum level. Here we report a dynamical phase-field model that allows simulating the coupled dynamics of the electromagnetic waves, magnetization, and strain in 3D multiphase systems. As examples of application, we computationally demonstrate the excitation of hybrid magnon-photon modes (magnon polaritons), Floquet-induced magnonic Autler-Townes splitting, dynamical energy exchange (Rabi oscillation) and relative phase control (Ramsey interference) between the two magnon polariton modes. The simulation results are consistent with analytical calculations based on Floquet Hamiltonian theory. Simulations are also performed to design a cavity electro-magno-mechanical system that enables the triple phonon-magnon-photon resonance, where the resonant excitation of a chiral, fundamental (n=1) transverse acoustic phonon mode by magnon polaritons is demonstrated. With the capability to predict coupling strength, dissipation rates, and temporal evolution of photon/magnon/phonon mode profiles using fundamental materials parameters as the inputs, the present dynamical phase-field model represents a valuable computational tool to guide the fabrication of the cavity electromagnonic system and the design of operating conditions for applications in quantum sensing, transduction, and communication.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.10744 [pdf, other]

Technique Report of CVPR 2024 PBDL Challenges

Authors: Ying Fu, Yu Li, Shaodi You, Boxin Shi, Linwei Chen, Yunhao Zou, Zichun Wang, Yichen Li, Yuze Han, Yingkai Zhang, Jianan Wang, Qinglin Liu, Wei Yu, Xiaoqian Lv, Jianing Li, Shengping Zhang, Xiangyang Ji, Yuanpei Chen, Yuhan Zhang, Weihang Peng, Liwen Zhang, Zhe Xu, Dingyong Gou, Cong Li, Senyan Xu, et al. (75 additional authors not shown)

Abstract: The intersection of physics-based vision and deep learning presents an exciting frontier for advancing computer vision technologies. By leveraging the principles of physics to inform and enhance deep learning models, we can develop more robust and accurate vision systems. Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images. In recent years, deep learning has shown promising improvements for various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems. This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held in CVPR 2024 workshop. The challenge consisted of eight tracks, focusing on Low-Light Enhancement and Detection as well as High Dynamic Range (HDR) Imaging. This report details the objectives, methodologies, and results of each track, highlighting the top-performing solutions and their innovative approaches.

Submitted 12 July, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

Comments: CVPR 2024 PBDL Challenges: https://pbdl-ws.github.io/pbdl2024/challenge/index.html

arXiv:2406.10148 [pdf, other]

A Primal-Dual-Assisted Penalty Approach to Bilevel Optimization with Coupled Constraints

Authors: Liuyuan Jiang, Quan Xiao, Victor M. Tenorio, Fernando Real-Rojas, Antonio Marques, Tianyi Chen

Abstract: Interest in bilevel optimization has grown in recent years, partially due to its applications to tackle challenging machine-learning problems. Several exciting recent works have been centered around developing efficient gradient-based algorithms that can solve bilevel optimization problems with provable guarantees. However, the existing literature mainly focuses on bilevel problems either without constraints, or featuring only simple constraints that do not couple variables across the upper and lower levels, excluding a range of complex applications. Our paper studies this challenging but less explored scenario and develops a (fully) first-order algorithm, which we term BLOCC, to tackle BiLevel Optimization problems with Coupled Constraints. We establish rigorous convergence theory for the proposed algorithm and demonstrate its effectiveness on two well-known real-world applications - hyperparameter selection in support vector machine (SVM) and infrastructure planning in transportation networks using the real data from the city of Seville.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.09991 [pdf, other]

On the Interacting/Active Lifetime of Supernova Fallback Disk around Isolated Neutron Stars

Authors: Kun Xu, Hao-Ran Yang, Long Jiang, Wen-Cong Chen, Xiang-Dong Li, Jifeng Liu

Abstract: The fallback disk model is widely accepted to explain long-period neutron stars (NSs) which can't be simulated by magnetic dipole radiation. However, no confirmed detection of a disk has been found for the newly discovered long-period pulsars GLEAM-X 162759.5-523504.3 and GPM J1839-10, or for the slowest known isolated NS, 1E 161348-5055. This might be because the disks have either been in a noninteracting/inactive state, where their emission is too weak to be detected, or have been disrupted. In this work, we conduct simulations to examine the lifetime of supernova fallback disks around isolated neutron stars. We assume that the disk's mass varies in a self-similar way and its interaction with the NS occurs only in the interacting/active state. Our results reveal that nearly all the interacting lifetimes for the disk are shorter than 0.1 Myr while the existence lifetimes are considerably longer.

Submitted 16 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

Comments: Accepted in ApJ, comments are welcome

arXiv:2406.09789 [pdf, ps, other]

Localized subspace iteration methods for elliptic multiscale problems

Authors: Xiaofei Guan, Lijian Jiang, Yajun Wang, Zihao Yang

Abstract: This paper proposes localized subspace iteration (LSI) methods to construct generalized finite element basis functions for elliptic problems with multiscale coefficients. The key components of the proposed method consist of the localization of the original differential operator and the subspace iteration of the corresponding local spectral problems, where the localization is conducted by enforcing the local homogeneous Dirichlet condition and the partition of the unity functions. From a novel perspective, some multiscale methods can be regarded as one iteration step under approximating the eigenspace of the corresponding local spectral problems. Vice versa, new multiscale methods can be designed through subspaces of spectral problem algorithms. Then, we propose the efficient localized standard subspace iteration (LSSI) method and the localized Krylov subspace iteration (LKSI) method based on the standard subspace and Krylov subspace, respectively. Convergence analysis is carried out for the proposed method. Various numerical examples demonstrate the effectiveness of our methods. In addition, the proposed methods show significant superiority in treating long-channel cases over other well-known multiscale methods.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 23 pages

MSC Class: 65N99; 65N30; 34E13

arXiv:2406.07961 [pdf, other]

Accurate Explanation Model for Image Classifiers using Class Association Embedding

Authors: Ruitao Xie, Jingbang Chen, Limai Jiang, Rui Xiao, Yi Pan, Yunpeng Cai

Abstract: Image classification is a primary task in data analysis where explainable models are crucially demanded in various applications. Although many methods have been proposed to obtain explainable knowledge from black-box classifiers, these approaches lack the efficiency of extracting global knowledge regarding the classification task, and are thus vulnerable to local traps and often lead to poor accuracy. In this study, we propose a generative explanation model that combines the advantages of global and local knowledge for explaining image classifiers. We develop a representation learning method called class association embedding (CAE), which encodes each sample into a pair of separated class-associated and individual codes. Recombining the individual code of a given sample with altered class-associated code leads to a synthetic real-looking sample with preserved individual characters but modified class-associated features and possibly flipped class assignments. A building-block coherency feature extraction algorithm is proposed that efficiently separates class-associated features from individual ones. The extracted feature space forms a low-dimensional manifold that visualizes the classification decision patterns. Explanation on each individual sample can be then achieved in a counter-factual generation manner which continuously modifies the sample in one direction, by shifting its class-associated code along a guided path, until its classification outcome is changed. We compare our method with state-of-the-art ones on explaining image classification tasks in the form of saliency maps, demonstrating that our method achieves higher accuracies. The code is available at https://github.com/xrt11/XAI-CODE.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 40th IEEE International Conference on Data Engineering

arXiv:2406.06701 [pdf, other]

The XMM-SERVS X-ray eXtended Galaxy Cluster (XVXGC) catalog

Authors: Weiwei Xu, Linhua Jiang, Ran Li, Bin Luo, W. Nielsen Brandt, Chaoli Zhang, Thomas Erben

Abstract: As a possible explanation for the well-known tension between cosmological parameter constraints obtained from the primary CMB and those drawn from galaxy cluster samples, we propose that the incompleteness of detected clusters is higher than estimated. We aim to search for galaxy groups and clusters with particularly extended surface brightness distributions by creating a new X-ray-selected catalog of extended galaxy clusters from the XMM-SERVS data, based on a dedicated source detection and characterization algorithm that is optimized for extended sources. Our state-of-the-art algorithm is composed of wavelet filtering, source detection, and characterization. We visually inspect the optical image and the spatial distribution of galaxies within the same redshift layer to confirm the existence of clusters, and estimate the cluster redshift with the spectroscopic and photometric redshifts of galaxies. The growth curve analysis is used to characterize the detections. We report a catalog of extended X-ray galaxy clusters detected from the XMM-SERVS data, named the XMM-SERVS X-ray eXtended Galaxy Cluster (XVXGC) catalog. It includes 141 cluster candidates. Specifically, there are 52 clusters previously identified as clusters with intra-cluster medium (ICM) emission (class 3), 37 previously known as optical or infrared clusters but detected as X-ray clusters for the first time (class 2), and 52 identified as clusters for the first time (class 1). Compared with the class 3 sample, the 'class1+2' sample is systematically fainter and exhibits a flatter surface brightness profile. The median flux in the [0.1-2.4] keV band for the 'class1+2' and class 3 samples is 2.336e-14 and 3.163e-14 erg/s/cm2, respectively. The median slopes of the surface brightness profile are 0.502 and 0.577 for the 'class1+2' and class 3 samples, respectively.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 16 pages, 11 figures, 5 tables, submitted to A&A. The entire sample is available at https://github.com/wwxu/xvxgc.github.io together with the paper publication

arXiv:2406.05673 [pdf, other]

Flow of Reasoning: Efficient Training of LLM Policy with Divergent Thinking

Authors: Fangxu Yu, Lai Jiang, Haoqiang Kang, Shibo Hao, Lianhui Qin

Abstract: Divergent thinking, the cognitive process of generating diverse solutions, is a hallmark of human creativity and problem-solving. For machines, sampling diverse solution trajectories in complex reasoning problems is crucial for robust outcomes, data augmentation, and enhanced model generalization. Large language models (LLMs) often struggle with generating high-quality, diverse reasoning. While supervised fine-tuning helps with quality, it requires extensive supervision data to capture the full diversity of solutions. Alternatively, reinforcement learning methods like PPO aim to find limited highest-reward solutions while neglecting the solution diversity, akin to convergent thinking. To address these limitations, we propose Flow of Reasoning (FoR) -- an efficient LLM training approach enabling diverse reasoning with minimal data. FoR formulates multi-step LLM reasoning as a Markovian flow from an initial state to terminal states. The formulation allows us to adapt principled GFlowNet approaches to train the LLM as a policy, which is able to sample multiple reasoning paths with probabilities proportional to the unnormalized reward. Empirical results show that, with limited training data (e.g., 15 examples), FoR can discover diverse high-quality solutions that excel greatly beyond current state-of-the-art methods across three tasks, including embodied reasoning (BlocksWorld), math puzzle solving (Game24), and logical reasoning (PrOntoQA). Code is available at https://github.com/Yu-Fangxu/FoR.

Submitted 24 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.05637 [pdf, ps, other]

A Generalized Version of Chung's Lemma and its Applications

Authors: Li Jiang, Xiao Li, Andre Milzarek, Junwen Qiu

Abstract: Chung's lemma is a classical tool for establishing asymptotic convergence rates of (stochastic) optimization methods under strong convexity-type assumptions and appropriate polynomial diminishing step sizes. In this work, we develop a generalized version of Chung's lemma, which provides a simple non-asymptotic convergence framework for a more general family of step size rules. We demonstrate broad applicability of the proposed generalized Chung's lemma by deriving tight non-asymptotic convergence rates for a large variety of stochastic methods. In particular, we obtain partially new non-asymptotic complexity results for stochastic optimization methods, such as stochastic gradient descent and random reshuffling, under a general $(θ,μ)$-Polyak-Lojasiewicz (PL) condition and for various step size strategies, including polynomial, constant, exponential, and cosine step size rules. Notably, as a by-product of our analysis, we observe that exponential step sizes can adapt to the objective function's geometry, achieving the optimal convergence rate without requiring exact knowledge of the underlying landscape. Our results demonstrate that the developed variant of Chung's lemma offers a versatile, systematic, and streamlined approach to establish non-asymptotic convergence rates under general step size rules.

Submitted 9 June, 2024; originally announced June 2024.

Comments: 43 pages, 5 figures

MSC Class: 90C15; 90C30; 90C26
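
The four step size families named in the abstract above can be illustrated with a minimal Python sketch. The parameterizations below (`a`, `p`, `final_ratio`) are commonly used textbook forms chosen here as assumptions; they are not necessarily the exact rules analyzed in the paper. The toy objective f(x) = x^2 / 2 with Gaussian gradient noise is likewise hypothetical and only serves to show the schedules in use.

```python
import math
import random

def constant(k, T, a=0.1):
    """Constant step size (assumed form): alpha_k = a."""
    return a

def polynomial(k, T, a=0.5, p=1.0):
    """Polynomially diminishing step size (assumed form): alpha_k = a / (k + 1)^p."""
    return a / (k + 1) ** p

def exponential(k, T, a=0.5, final_ratio=1e-3):
    """Exponential step size (assumed form): decays geometrically from a to a * final_ratio over T steps."""
    return a * final_ratio ** (k / T)

def cosine(k, T, a=0.5):
    """Cosine step size (assumed form): alpha_k = (a / 2) * (1 + cos(pi * k / T))."""
    return 0.5 * a * (1.0 + math.cos(math.pi * k / T))

def sgd(step_rule, T=1000, x0=5.0, noise=0.1, seed=0):
    """Run T steps of SGD on the toy objective f(x) = x^2 / 2 with noisy gradients."""
    rng = random.Random(seed)
    x = x0
    for k in range(T):
        grad = x + noise * rng.gauss(0.0, 1.0)  # exact gradient x plus Gaussian noise
        x -= step_rule(k, T) * grad
    return x

if __name__ == "__main__":
    for rule in (constant, polynomial, exponential, cosine):
        print(f"{rule.__name__:<12} final |x| = {abs(sgd(rule)):.4f}")
```

Each rule takes the iteration index `k` and the horizon `T`, so rules that depend on the total number of steps (exponential, cosine) share one signature with those that do not.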

arXiv:2406.04275 [pdf, other]

Interfacing Gottesman-Kitaev-Preskill Qubits to Quantum Memories

Authors: Prajit Dhara, Liang Jiang, Saikat Guha

Abstract: Gottesman-Kitaev-Preskill (GKP) states have been demonstrated to offer significant advantages when utilized for fault-tolerant all-optical continuous-variable quantum computing as well as for quantum communication links for entanglement distribution. However, interfacing these systems to long-lived solid-state quantum memories has remained an open problem. Here we propose an interface between quantum memories and GKP qubit states based on a cavity-mediated controlled displacement gate. We characterize the quality of memory-GKP entanglement as a function of cavity parameters, suggesting optimal regimes of operation for high-quality state transfer between the two kinds of qubit states. We further extend this protocol to demonstrate the creation of GKP cluster states by avoiding the requirement of ancillary optical quadrature-squeezed light. Utilizing post-selected entanglement swapping operations for GKP qubits, we demonstrate the utility of our protocol for high-rate entanglement generation between quantum memories. Extensions and derivatives of our proposal could enable a wide variety of applications by utilizing the operational trade-offs for qubits encoded in memory and in the GKP basis.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 17 pages; 8 figures; Comments are welcome!

arXiv:2406.04272 [pdf, other]

Entangling Quantum Memories at Channel Capacity

Authors: Prajit Dhara, Liang Jiang, Saikat Guha

Abstract: Entangling quantum memories, mediated by optical-frequency or microwave channels, at high rates and fidelities is key for linking qubits across short and long ranges. All well-known protocols encode up to one qubit per optical mode, hence entangling one pair of memory qubits per transmitted mode over the channel, with probability $η$, the channel's transmissivity. The rate is proportional to $η$ ideal Bell states (ebits) per mode. The quantum capacity is $C(η) = -\log_2(1-η)$ ebits per mode, which is $\approx 1.44η$ for high loss, i.e., $η\ll 1$, thereby making these schemes near rate-optimal. However, $C(η) \to \infty$ as $η\to 1$, making the known schemes highly rate-suboptimal for shorter ranges. We show that a cavity-assisted memory-photon interface can be used to entangle matter memories with Gottesman-Kitaev-Preskill (GKP) photonic qudits, which along with dual-homodyne entanglement swaps that retain analog information, enables entangling memories at capacity-approaching rates at low loss. We benefit from loss resilience of GKP qudits, and their ability to encode multiple qubits in one mode. Our memory-photon interface further supports the preparation of needed ancilla GKP qudits. We expect our result to spur research in low-loss high-cooperativity cavity-coupled qubits with high-efficiency optical coupling, and demonstrations of high-rate short-range quantum links.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 14 pages; 8 figures; Comments are welcome!
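
The capacity formula quoted in the abstract above, $C(η) = -\log_2(1-η)$, can be checked numerically with a short Python sketch. The function names are hypothetical, and the one-qubit-per-mode rate is taken to be exactly $η$ (proportionality constant 1), which is an assumption made only for illustration. The ratio $C(η)/η$ approaches $1/\ln 2 \approx 1.44$ for $η \ll 1$ and grows without bound as $η \to 1$, matching the abstract's two regimes.

```python
import math

def quantum_capacity(eta: float) -> float:
    """Capacity formula from the abstract: C(eta) = -log2(1 - eta), in ebits per mode."""
    if not 0.0 <= eta < 1.0:
        raise ValueError("eta must lie in [0, 1)")
    return -math.log2(1.0 - eta)

def one_qubit_per_mode_rate(eta: float) -> float:
    """Rate of a scheme entangling one pair per mode with probability eta.

    Assumption: the proportionality constant is set to 1 for illustration.
    """
    return eta

if __name__ == "__main__":
    for eta in (0.001, 0.01, 0.5, 0.9, 0.99):
        c = quantum_capacity(eta)
        ratio = c / one_qubit_per_mode_rate(eta)
        print(f"eta={eta:<6} C={c:.4f} ebits/mode  C/eta={ratio:.3f}")
```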

arXiv:2406.03271 [pdf, other]

Image Copy-Move Forgery Detection and Localization Scheme: How to Avoid Missed Detection and False Alarm

Authors: Li Jiang, Zhaowei Lu, Yuebing Gao, Yifan Wang

Abstract: Image copy-move is an operation that replaces one part of the image with another part of the same image, which can be used for illegal purposes due to the potential semantic changes. Recent studies have shown that keypoint-based algorithms achieve excellent and robust localization performance even when small or smooth tampered areas are involved. However, when the input image is low-resolution, most existing keypoint-based algorithms struggle to generate sufficient keypoints, resulting in more missed detections. In addition, existing algorithms are usually unable to distinguish between Similar but Genuine Objects (SGO) images and tampered images, resulting in more false alarms. This is mainly due to the lack of further verification of the local homography matrix in the forgery localization stage. To tackle these problems, this paper first proposes an excessive keypoint extraction strategy to overcome missed detection. Subsequently, a group matching algorithm is used to speed up the matching of excessive keypoints. Finally, a new iterative forgery localization algorithm is introduced to quickly form pixel-level localization results while ensuring a lower false alarm rate. Extensive experimental results show that our scheme outperforms state-of-the-art algorithms in overcoming missed detection and false alarm. Our code is available at https://github.com/LUZW1998/CMFDL.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2406.02856 [pdf, other]

Xmodel-LM Technical Report

Authors: Yichuan Wang, Yang Liu, Yu Yan, Qun Wang, Xucheng Huang, Ling Jiang

Abstract: We introduce Xmodel-LM, a compact and efficient 1.1B language model pre-trained on around 2 trillion tokens. Trained on our self-built dataset (Xdata), which balances Chinese and English corpora based on downstream task optimization, Xmodel-LM exhibits remarkable performance despite its smaller size. It notably surpasses existing open-source language models of similar scale. Our model checkpoints and code are publicly accessible on GitHub at https://github.com/XiaoduoAILab/XmodelLM.

Submitted 26 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.02746 [pdf, other]

RATT: A Thought Structure for Coherent and Correct LLM Reasoning

Authors: Jinghan Zhang, Xiting Wang, Weijieying Ren, Lu Jiang, Dongjie Wang, Kunpeng Liu

Abstract: Large Language Models (LLMs) gain substantial reasoning and decision-making capabilities from thought structures. However, existing methods such as Tree of Thought and Retrieval Augmented Thoughts often fall short in complex tasks due to insufficient local retrieval of factual knowledge and inadequate global selection of strategies. These limitations make it challenging for these methods to balance factual accuracy and comprehensive logical optimization effectively. To address these limitations, we introduce the Retrieval Augmented Thought Tree (RATT), a novel thought structure that considers both overall logical soundness and factual correctness at each step of the thinking process. Specifically, at every point of a thought branch, RATT performs planning and lookahead to explore and evaluate multiple potential reasoning steps, and integrates the fact-checking ability of Retrieval-Augmented Generation (RAG) with the LLM's ability to assess overall strategy. Through this combination of factual knowledge and strategic feasibility, RATT adjusts and integrates the thought tree structure to search for the most promising branches within the search space. This thought structure significantly enhances the model's coherence in logical inference and efficiency in decision-making, and thus raises the limit of the capacity of LLMs to generate reliable inferences and decisions based on thought structures. A broad range of experiments on different types of tasks shows that the RATT structure significantly outperforms existing methods in factual correctness and logical coherence.

Submitted 11 July, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

arXiv:2406.02669 [pdf, other]

A generalized cycle benchmarking algorithm for characterizing mid-circuit measurements

Authors: Zhihan Zhang, Senrui Chen, Yunchao Liu, Liang Jiang

Abstract: Mid-circuit measurement (MCM) is a crucial ingredient in the development of fault-tolerant quantum computation. While there has been rapid experimental progress in realizing MCM, a systematic method for characterizing noisy MCM is still under exploration. In this work we develop an algorithm to characterize noisy MCM, via a generalization of cycle benchmarking -- a standard approach for characterizing the Pauli noise channel of Clifford gates. The key idea is to use a joint Fourier transform on the classical and quantum registers and then estimate parameters in the Fourier space, analogous to Pauli fidelities used in cycle benchmarking. Furthermore, we develop a theory of the noise learnability of MCM, which determines what information can be learned about the noise model (in the presence of state preparation and measurement noise) and what cannot, and which shows that all learnable information can be learned using our algorithm. As an application, we show how to use the learned information to test the independence between measurement noise and state preparation noise in an MCM. Finally, we conduct numerical simulations to illustrate the practical applicability of the algorithm. Similar to cycle benchmarking, we expect the algorithm to provide a useful toolkit that is of experimental interest.

Submitted 4 June, 2024; originally announced June 2024.

Comments: 27 pages, 9 figures

arXiv:2406.01281 [pdf]

Extraction of Maternal and fetal ECG in a non-invasive way from abdominal ECG recordings using modified Progressive FastICA Peel-off

Authors: Yao Li, Xuanyu Luo, Haowen Zhao, Jiawen Cui, Yangfan She, Dongfang Li, Lai Jiang, Xu Zhang

Abstract: The abdominal electrocardiogram (AECG) gives a non-invasive way to monitor fetal well-being during pregnancy. Due to the overlap with maternal ECG (MECG) as well as potential noise from other sources, it is challenging to extract weak fetal ECG (FECG) using surface electrodes. Taking advantage of the precise source separation capability of the FastICA approach combined with its constrained version specific to FECG, with weak source extraction capability warranted by the peel-off strategy and FECG waveform reconstruction ability ensured by the singular value decomposition (SVD) method, a novel framework for FECG extraction from AECG recordings is presented in this paper. Specifically, a periodic constrained FastICA (pcFastICA) was developed to improve the precision of examining and correcting FECG source signals, based on the statistical characteristics of continuous and repetitive ECG emissions. Additionally, a successive judgement algorithm is designed to select the optimal maternal and fetal ECG. The performance of the proposed method was examined on public datasets, synthetic data and clinical data, with F1-scores for FECG extraction of 99.71% and 99.36% on the ADFECG and NIFECGA datasets, 98.77% on synthetic data with the highest noise level, and 98.09% on clinical data, all superior to the other compared methods. The results indicate that our proposed method has the potential to effectively separate weak FECG from multichannel AECG with high precision in high-noise conditions, which is of vital importance for ensuring the safety of both the fetus and the mother, as well as for the advancement of artificial-intelligence clinical monitoring.

Submitted 3 June, 2024; originally announced June 2024.

arXiv:2406.00593 [pdf]

Low threshold optical bistability based on MoS2 in asymmetric Fabry-Perot cavity structure in visible light band

Authors: Songqing Tang, Mengjiao Ren, Zhiheng Li, Zhiwei Zheng, Leyong Jiang

Abstract: This article theoretically proposes a multi-layer Fabry-Perot cavity structure based on nonlinear MoS2, whose cavity is composed of asymmetric photonic crystals. In this structure, we observed a low threshold optical bistability phenomenon on the order of a in the visible light band, which is caused by the large third-order nonlinear conductivity of the bilayer MoS2 and the Fabry-Perot cavity resonance. We find that when light is incident from the two different directions in an asymmetric Fabry-Perot cavity, the optical bistability does not behave in exactly the same way. In addition, we further find that the optical bistability behavior in this simple multi-layer structure is closely related to parameters such as the incident wavelength, the Fabry-Perot cavity length, and the refractive index of the photonic crystal dielectric. This work provides a new approach for the implementation of low threshold optical bistable devices in the visible light band, which is expected to be applied in nonlinear optical fields such as all-optical switches and all-optical logic devices.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 22 pages, 6 figures

arXiv:2406.00590 [pdf]

MoS2-based optical bistability in silver-Bragg reflector multilayer structure at visible light band

Authors: Songqing Tang, Mengjiao Ren, Zhiheng Li, Zhiwei Zheng, Leyong Jiang

Abstract: In this paper, we present a theoretical analysis of the optical bistability (OB) in a metallic silver-Bragg reflector structure with embedded bilayer MoS2 in the visible band. The nonlinear OB is achieved due to the nonlinear conductivity of the bilayer MoS2 and the excitation of the optical Tamm state at the interface between the silver and the Bragg reflector. It is found that the hysteresis behaviour and the threshold width of the OB can be effectively tuned by varying the incident light wavelength. In addition, the optical bistable behaviour of the structure can be adjusted by varying the position of the MoS2 inset in the defect layer, the incident angle and the structural parameters of the spacer layer. Although the current threshold cannot be commercialized, we believe that this solution will provide a meaningful path reference for low threshold bistability in the visible light band.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 23 pages, 6 figures

arXiv:2406.00420 [pdf]

Realization of type-II double-zero-index photonic crystals

Authors: Zebin Zhu, Dong Zhao, Ziyao Wang, Xucheng Yang, Liyong Jiang, Zhen Gao

Abstract: Some photonic crystals (PCs) with Dirac-like conical dispersions exhibit the property of double zero refractive index (that is, both epsilon and mu near zero (EMNZ)), wherein the electromagnetic waves have an infinite effective wavelength and do not experience any spatial phase change. The Dirac-like cones that support EMNZ were previously thought to be present only at the center of the Brillouin zone ($Γ$ point) with a zero wavevector (which we refer to as type-I EMNZ), which is constrained by the proportional relationship between phase refractive index and wavevector ($n=kc/ω$). Here, we demonstrate the existence of an anomalous type-II EMNZ in PCs, which is associated with the Dirac-like point at off-$Γ$ points. By introducing a wave modulation approach, we theoretically elucidate its physical mechanism, and resolve the paradox of type-II EMNZ with non-zero wavevectors. We then fabricate a type-II EMNZ PC operating at the X point, and experimentally demonstrate that both its effective permittivity and permeability are zero at the Dirac-like point. Type-II EMNZ PCs exhibit a range of intriguing phenomena, including angle-selective transmission, wavefront flattening, a 180$^{\circ}$ phase shift upon transmission, and waveguiding with natural zero radiation loss. The extraordinary properties of type-II EMNZ PCs may open new avenues for the development of angle-selective optical filters, directional light sources, phase-controlled optical switches, ultracompact photonic circuits, nanolasers, and on-chip nonlinear enhancement.

Submitted 1 June, 2024; originally announced June 2024.

Comments: 38 pages, 13 figures

arXiv:2405.20000 [pdf, other]

Combining physics-informed graph neural network and finite difference for solving forward and inverse spatiotemporal PDEs

Authors: Hao Zhang, Longxiang Jiang, Xinkun Chu, Yong Wen, Luxiong Li, Yonghao Xiao, Liyuan Wang

Abstract: The great success of Physics-Informed Neural Networks (PINN) in solving partial differential equations (PDEs) has significantly advanced our simulation and understanding of complex physical systems in science and engineering. However, many PINN-like methods are poorly scalable and are limited to in-sample scenarios. To address these challenges, this work proposes a novel discrete approach termed Physics-Informed Graph Neural Network (PIGNN) to solve forward and inverse nonlinear PDEs. In particular, our approach seamlessly integrates the strength of graph neural networks (GNN), physical equations and finite difference to approximate solutions of physical systems. Our approach is compared with the PINN baseline on three well-known nonlinear PDEs (heat, Burgers and FitzHugh-Nagumo). Through extensive numerical experiments, we demonstrate the excellent performance of the proposed method on irregular meshes, longer time steps, arbitrary spatial resolutions, and varying initial conditions (ICs) and boundary conditions (BCs). Numerical results also illustrate the superiority of our approach in terms of accuracy, time extrapolability, generalizability and scalability. The main advantage of our approach is that models trained in small domains with simple settings have excellent fitting capabilities and can be directly applied to more complex situations in large domains.

Submitted 14 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.18676 [pdf]

Exploring Automated Contouring Across Institutional Boundaries: A Deep Learning Approach with Mouse Micro-CT Datasets

Authors: Lu Jiang, Di Xu, Qifan Xu, Arion Chatziioannou, Keisuke S. Iwamoto, Susanta Hui, Ke Sheng

Abstract: Image-guided mouse irradiation is essential to understand interventions involving radiation prior to human studies. Our objective is to employ Swin UNEt Transformers (Swin UNETR) to segment native micro-CT and contrast-enhanced micro-CT scans and benchmark the results against 3D no-new-Net (nnU-Net). Swin UNETR reformulates mouse organ segmentation as a sequence-to-sequence prediction task, using a hierarchical Swin Transformer encoder to extract features at 5 resolution levels, and connects to a Fully Convolutional Neural Network (FCNN)-based decoder via skip connections. The models were trained and evaluated on open datasets, with data separation based on individual mice. Further evaluation on an external mouse dataset acquired on a different micro-CT with lower kVp and higher imaging noise was also performed to assess model robustness and generalizability. Results indicate that Swin UNETR consistently outperforms nnU-Net and AIMOS in terms of average dice similarity coefficient (DSC) and Hausdorff distance (HD95p), except for intestine contouring in two mice. This superior performance is especially evident in the external dataset, confirming the model's robustness to variations in imaging conditions, including noise and quality, thereby positioning Swin UNETR as a highly generalizable and efficient tool for automated contouring in pre-clinical workflows.

Submitted 28 May, 2024; originally announced May 2024.
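
The dice similarity coefficient (DSC) used as an evaluation metric in the abstract above has the standard definition DSC = 2|A ∩ B| / (|A| + |B|) for two segmentation masks A and B. A minimal Python sketch follows, assuming masks are given as collections of voxel indices; the function name and the convention of returning 1.0 for two empty masks are illustrative choices, not taken from the paper.

```python
def dice_similarity(pred, truth):
    """Dice similarity coefficient between two masks given as iterables of voxel indices."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * len(pred & truth) / (len(pred) + len(truth))

if __name__ == "__main__":
    # Hypothetical 2D voxel coordinates for a predicted and a reference contour.
    predicted = {(0, 0), (0, 1), (1, 1)}
    reference = {(0, 1), (1, 1), (1, 2)}
    print(f"DSC = {dice_similarity(predicted, reference):.3f}")
```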

arXiv:2405.15236 [pdf, other]

Detecting Errors in a Quantum Network with Pauli Checks

Authors: Alvin Gonzales, Daniel Dilley, Bikun Li, Liang Jiang, Zain H. Saleem

Abstract: We apply the quantum error detection scheme Pauli check sandwiching (PCS) to quantum networks by turning it into a distributed multiparty protocol. PCS is a distance 1 code and requires less resource overhead than standard quantum error correction and detection methods. We provide analytical equations for the final fidelity and postselection rate. We also introduce a recursive version of PCS for entanglement purification that only scales polynomially in the resources required as a function of the number of recursions. The recursive PCS scheme generates a family of distance 2 quantum codes. Our analytical results are benchmarked against BBPSSW in comparable scenarios. We also perform simulations with noisy gates for entanglement swapping and attain substantial fidelity improvements. Lastly, we discuss various setups and graph state properties of PCS.

Submitted 3 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

Comments: Comments are welcome!

arXiv:2405.14231 [pdf, other]

From Role-Play to Drama-Interaction: An LLM Solution

Authors: Weiqi Wu, Hongqiu Wu, Lai Jiang, Xingyuan Liu, Jiale Hong, Hai Zhao, Min Zhang

Abstract: Drama is a form of storytelling inspired by human creativity, proceeding with a predefined storyline, carrying emotions and thoughts. This paper introduces \emph{LLM-based interactive drama}, which endows traditional drama with an unprecedented immersion, where a person is allowed to walk into it and interact with the characters and scenes. We define this new artistic genre by 6 essential elements (plot, character, thought, diction, spectacle and interaction) and study the entire pipeline to forge a backbone \emph{drama LLM} to drive the playing process, which is challenged by limited drama resources, uncontrollable narrative development, and complicated instruction following. We propose \emph{Narrative Chain} to offer finer control over the narrative progression during interaction with players; \emph{Auto-Drama} to synthesize drama scripts given arbitrary stories; \emph{Sparse Instruction Tuning} to allow the model to follow sophisticated instructions. We manually craft 3 scripts, \emph{Detective Conan}, \emph{Harry Potter}, \emph{Romeo and Juliet}, and design a 5-dimension principle to evaluate the drama LLM comprehensively.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Accepted by ACL 2024 Findings

arXiv:2405.13762 [pdf, other]

A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

Authors: Gwanghyun Kim, Alonso Martinez, Yu-Chuan Su, Brendan Jou, José Lezama, Agrim Gupta, Lijun Yu, Lu Jiang, Aren Jansen, Jacob Walker, Krishna Somandepalli

Abstract: Training diffusion models for audiovisual sequences allows for a range of generation tasks by learning conditional distributions of various input-output combinations of the two modalities. Nevertheless, this strategy often requires training a separate model for each task, which is expensive. Here, we propose a novel training approach to effectively learn arbitrary conditional distributions in the audiovisual space. Our key contribution lies in how we parameterize the diffusion timestep in the forward diffusion process. Instead of the standard fixed diffusion timestep, we propose applying variable diffusion timesteps across the temporal dimension and across modalities of the inputs. This formulation offers flexibility to introduce variable noise levels for various portions of the input, hence the term mixture of noise levels. We propose a transformer-based audiovisual latent diffusion model and show that it can be trained in a task-agnostic fashion using our approach to enable a variety of audiovisual generation tasks at inference time. Experiments demonstrate the versatility of our method in tackling cross-modal and multimodal interpolation tasks in the audiovisual space. Notably, our proposed approach surpasses baselines in generating temporally and perceptually consistent samples conditioned on the input. Project page: avdit2024.github.io

Submitted 22 May, 2024; originally announced May 2024.

arXiv:2405.11647 [pdf, other]

Hummer: Towards Limited Competitive Preference Dataset

Authors: Li Jiang, Yusen Wu, Junwu Xiong, Jingqing Ruan, Yichuan Ding, Qingpei Guo, Zujie Wen, Jun Zhou, Xiaotie Deng

Abstract: Preference datasets are essential for incorporating human preferences into pre-trained language models, playing a key role in the success of Reinforcement Learning from Human Feedback. However, these datasets often demonstrate conflicting alignment objectives, leading to increased vulnerability to jailbreak attacks and challenges in adapting downstream tasks to prioritize specific alignment objectives without negatively impacting others. In this work, we introduce a novel statistical metric, Alignment Dimension Conflict, to quantify the degree of conflict within preference datasets. We then present \texttt{Hummer} and its fine-grained variant, \texttt{Hummer-F}, as innovative pairwise preference datasets with reduced-conflict alignment objectives. \texttt{Hummer} is built on UltraFeedback and is enhanced by AI feedback from GPT-4, making it the first preference dataset aimed at reducing the competition between alignment objectives. Furthermore, we develop reward models, HummerRM and HummerRM-F, which employ a hybrid sampling approach to balance diverse alignment objectives effectively. This sampling method positions HummerRM as an ideal model for domain-specific further fine-tuning and reducing vulnerabilities to attacks.

Submitted 20 May, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

Comments: 9 pages, 5 figures

arXiv:2405.11607 [pdf, other]

OFHE: An Electro-Optical Accelerator for Discretized TFHE

Authors: Mengxin Zheng, Cheng Chu, Qian Lou, Nathan Youngblood, Mo Li, Sajjad Moazeni, Lei Jiang

Abstract: This paper presents \textit{OFHE}, an electro-optical accelerator designed to process Discretized TFHE (DTFHE) operations, which encrypt multi-bit messages and support homomorphic multiplications, lookup table operations and full-domain functional bootstrappings. While DTFHE is more efficient and versatile than other fully homomorphic encryption schemes, it requires 32-, 64-, and 128-bit polynomial multiplications, which can be time-consuming. Existing TFHE accelerators are not easily upgradable to support DTFHE operations due to limited datapaths, a lack of datapath bit-width reconfigurability, and power inefficiencies when processing FFT and inverse FFT (IFFT) kernels. Compared to prior TFHE accelerators, OFHE addresses these challenges by improving the DTFHE operation latency by 8.7\%, the DTFHE operation throughput by $57\%$, and the DTFHE operation throughput per Watt by $94\%$.

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.11549 [pdf]

Experimental Study on Deuterium-Deuterium Thermonuclear Fusion with Interface Confinement

Authors: Darong Chen, Liang Jiang, Shuai Chen, Bao Wang, Dangguo Li, Peng Liang

Abstract: Nuclear fusion is recognized as the energy of the future, and huge effort and capital have been put into research on controlled nuclear fusion in the past decades. The most challenging aspect of controlled nuclear fusion is generating and maintaining an extremely high temperature. Here, a sonication system, combined with micro-scale fluid control techniques, was built to generate cavitation within a limited region. As bubbles are rapidly compressed, the high-temperature plasma generated in their interior leads to particle emissions, and a Cs2LiYCl6:Ce3+ (CLYC) scintillator was used to collect the emission events. The pulse shape discrimination methods applied to the captured signals revealed that only gamma ray events were observed in sonication with normal water, as expected, while an obvious separation of neutron and gamma ray events was surprisingly identified in sonication with deuterated water. This result suggested that neutrons were emitted from the sonicated deuterated water, i.e. deuterium-deuterium thermonuclear fusion was initiated. This study provides an alternative and feasible approach to achieve controllable nuclear fusion and is significant for future research on the application of fusion energy.

Submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.11464 [pdf, other]

Efficient Prompt Tuning by Multi-Space Projection and Prompt Fusion

Authors: Pengxiang Lan, Enneng Yang, Yuting Liu, Guibing Guo, Linying Jiang, Jianzhe Zhao, Xingwei Wang

Abstract: Prompt tuning is a promising method to fine-tune a pre-trained language model without retraining its large-scale parameters. Instead, it attaches a soft prompt to the input text, whereby downstream tasks can be well adapted by merely learning the embeddings of prompt tokens. Nevertheless, existing methods still suffer from two challenges: (i) it is hard to balance accuracy and efficiency. A longer (shorter) soft prompt generally leads to better (worse) accuracy but at the cost of more (less) training time. (ii) The performance may not be consistent when adapting to different downstream tasks. We attribute this to a single embedding space being responsible for the different requirements of downstream tasks. To address these issues, we propose an Efficient Prompt Tuning method (EPT) by multi-space projection and prompt fusion. Specifically, it decomposes a given soft prompt into a shorter prompt and two low-rank matrices, significantly reducing the training time. Accuracy is also enhanced by leveraging low-rank matrices and the short prompt as additional knowledge sources to enrich the semantics of the original short prompt. In addition, we project the soft prompt into multiple subspaces to improve the performance consistency, and then adaptively learn the combination weights of different spaces through a gating network. Experiments on 13 natural language processing downstream tasks show that our method significantly and consistently outperforms 11 comparison methods with the relative percentage of improvements up to 12.9%, and training time decreased by 14%.

Submitted 1 July, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

arXiv:2405.10825 [pdf, other]

Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities

Authors: Hao Zhou, Chengming Hu, Ye Yuan, Yufei Cui, Yili Jin, Can Chen, Haolun Wu, Dun Yuan, Li Jiang, Di Wu, Xue Liu, Charlie Zhang, Xianbin Wang, Jiangchuan Liu

Abstract: Large language models (LLMs) have received considerable attention recently due to their outstanding comprehension and reasoning capabilities, leading to great progress in many fields. The advancement of LLM techniques also offers promising opportunities to automate many tasks in the telecommunication (telecom) field. After pre-training and fine-tuning, LLMs can perform diverse downstream tasks based on human instructions, paving the way to artificial general intelligence (AGI)-enabled 6G. Given the great potential of LLM technologies, this work aims to provide a comprehensive overview of LLM-enabled telecom networks. In particular, we first present LLM fundamentals, including model architecture, pre-training, fine-tuning, inference and utilization, model evaluation, and telecom deployment. Then, we introduce LLM-enabled key techniques and telecom applications in terms of generation, classification, optimization, and prediction problems. Specifically, the LLM-enabled generation applications include telecom domain knowledge, code, and network configuration generation. After that, the LLM-based classification applications involve network security, text, image, and traffic classification problems. Moreover, multiple LLM-enabled optimization techniques are introduced, such as automated reward function design for reinforcement learning and verbal reinforcement learning. Furthermore, for LLM-aided prediction problems, we discuss time-series prediction models and multi-modality prediction problems for telecom. Finally, we highlight the challenges and identify the future directions of LLM-enabled telecom networks.

Submitted 17 May, 2024; originally announced May 2024.

arXiv:2405.09215 [pdf, other]

Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model

Authors: Wanting Xu, Yang Liu, Langping He, Xucheng Huang, Ling Jiang

Abstract: We introduce Xmodel-VLM, a cutting-edge multimodal vision language model. It is designed for efficient deployment on consumer GPU servers. Our work directly confronts a pivotal industry issue by grappling with the prohibitive service costs that hinder the broad adoption of large-scale multimodal systems. Through rigorous training, we have developed a 1B-scale language model from the ground up, employing the LLaVA paradigm for modal alignment. The result, which we call Xmodel-VLM, is a lightweight yet powerful multimodal vision language model. Extensive testing across numerous classic multimodal benchmarks has revealed that despite its smaller size and faster execution, Xmodel-VLM delivers performance comparable to that of larger models. Our model checkpoints and code are publicly available on GitHub at https://github.com/XiaoduoAILab/XmodelVLM.

Submitted 20 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

arXiv:2405.08977 [pdf, other]

Constraints on the variation of the fine-structure constant at 3<z<10 with JWST emission-line galaxies

Authors: Linhua Jiang, Shuqi Fu, Feige Wang, Sarah E. I. Bosman, Zheng Cai, Hyunsung D. Jun, Zhiwei Pan, Fengwu Sun, Jinyi Yang, Huanian Zhang

Abstract: We present constraints on the spacetime variation of the fine-structure constant $α$ at redshifts $3<z<10$ using JWST emission-line galaxies. The galaxy sample consists of 572 high-quality spectra with strong and narrow [O III] $λλ$4959,5007 doublet emission lines from 522 galaxies, including 267 spectra at $z>5$. The [O III] doublet lines are arguably the best emission lines to probe the variation in $α$. We divide our sample into 5 subsamples based on redshift and calculate the relative variation $Δα/α$ for the individual subsamples. The calculated $Δα/α$ values are consistent with zero within $1σ$ at all redshifts, suggesting no time variation in $α$ above a level of $(1-2) \times10^{-4}$ ($1σ$) in the past 13.2 billion years. When the whole sample is combined, the constraint is improved to be $Δα/α= (0.4\pm0.7) \times10^{-4}$. We further test the spatial variation in $α$ using four subsamples of galaxies in four different directions on the sky. The measured $Δα/α$ values are consistent with zero at a $1σ$ level of $\sim10^{-4}$. While the constraints in this work are not as stringent as those from lower-redshift quasar absorption lines in previous studies, this work uses an independent tracer and provides the first constraints on $Δα/α$ at the highest redshifts. Our analyses also indicate that the relative wavelength calibration of the JWST spectra is robust. With the growing number of emission-line galaxies from JWST, we expect to achieve stronger constraints in the future.
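
To make the measurement described above concrete, the sketch below applies the doublet-splitting estimator commonly used in the literature: the ratio R = (λ2 − λ1)/(λ2 + λ1) of a fine-structure doublet scales approximately as α², so Δα/α ≈ ½ (R_obs/R_rest − 1), and the redshift factor cancels in R. Whether the paper uses exactly this form is an assumption; the function name and the approximate rest wavelengths in the usage note are illustrative only.

```python
def delta_alpha_over_alpha(obs_short, obs_long, rest_short, rest_long):
    """Estimate the relative variation of alpha from one doublet.

    All four arguments are wavelengths in the same unit. The (1 + z)
    redshift factor cancels in each ratio, so no redshift is needed.
    """
    # Splitting relative to the mean wavelength, observed and laboratory.
    ratio_obs = (obs_long - obs_short) / (obs_long + obs_short)
    ratio_rest = (rest_long - rest_short) / (rest_long + rest_short)
    # The ratio scales as alpha**2, hence the factor of one half.
    return 0.5 * (ratio_obs / ratio_rest - 1.0)
```

For example, with approximate [O III] rest wavelengths of 4960.3 and 5008.2 Å (illustrative values, not taken from the paper), a doublet observed at exactly (1 + z) times those wavelengths yields an estimate of zero, which is the null result the abstract reports within its uncertainties.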

Submitted 14 May, 2024; originally announced May 2024.

Comments: 9 pages, 6 figures, submitted to ApJ

arXiv:2405.08403 [pdf, other]

TFWT: Tabular Feature Weighting with Transformer

Authors: Xinhao Zhang, Zaitian Wang, Lu Jiang, Wanfu Gao, Pengfei Wang, Kunpeng Liu

Abstract: In this paper, we propose a novel feature weighting method to address the limitation of existing feature processing methods for tabular data. Typically the existing methods assume equal importance across all samples and features in one dataset. These simplified processing methods overlook the unique contributions of each feature, and thus may miss important feature information. As a result, they lead to suboptimal performance in complex datasets with rich features. To address this problem, we introduce Tabular Feature Weighting with Transformer, a novel feature weighting approach for tabular data. Our method adopts Transformer to capture complex feature dependencies and contextually assign appropriate weights to discrete and continuous features. Besides, we employ a reinforcement learning strategy to further fine-tune the weighting process. Our extensive experimental results across various real-world datasets and diverse downstream tasks show the effectiveness of TFWT and highlight the potential for enhancing feature weighting in tabular data analysis.

Submitted 17 May, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted by IJCAI 2024

arXiv:2405.07530 [pdf, other]

Prompt-based Code Completion via Multi-Retrieval Augmented Generation

Authors: Hanzhuo Tan, Qi Luo, Ling Jiang, Zizheng Zhan, Jing Li, Haotian Zhang, Yuqun Zhang

Abstract: Automated code completion, aiming at generating subsequent tokens from unfinished code, has benefited significantly from recent progress in pre-trained Large Language Models (LLMs). However, these models often suffer from coherence issues and hallucinations when dealing with complex code logic or extrapolating beyond their training data. Existing Retrieval Augmented Generation (RAG) techniques partially address these issues by retrieving relevant code with a separate encoding model where the retrieved snippet serves as contextual reference for code completion. However, their retrieval scope is subject to a singular perspective defined by the encoding model, which largely overlooks the complexity and diversity inherent in code semantics. To address this limitation, we propose ProCC, a code completion framework leveraging prompt engineering and the contextual multi-armed bandits algorithm to flexibly incorporate and adapt to multiple perspectives of code. ProCC first employs a prompt-based multi-retriever system which crafts prompt templates to elicit LLM knowledge to understand code semantics with multiple retrieval perspectives. Then, it adopts the adaptive retrieval selection algorithm to incorporate code similarity into the decision-making process to determine the most suitable retrieval perspective for the LLM to complete the code. Experimental results demonstrate that ProCC outperforms the state-of-the-art code completion technique by 8.6% on our collected open-source benchmark suite and 10.1% on the private-domain benchmark suite collected from a billion-user e-commerce company in terms of Exact Match. ProCC also allows augmenting fine-tuned techniques in a plug-and-play manner, yielding 5.6% improvement over our studied fine-tuned model.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.07303 [pdf, other]

Search for solar axions by Primakoff effect with the full dataset of the CDEX-1B Experiment

Authors: L. T. Yang, S. K. Liu, Q. Yue, K. J. Kang, Y. J. Li, H. P. An, Greeshma C., J. P. Chang, Y. H. Chen, J. P. Cheng, W. H. Dai, Z. Deng, C. H. Fang, X. P. Geng, H. Gong, Q. J. Guo, T. Guo, X. Y. Guo, L. He, J. R. He, J. W. Hu, H. X. Huang, T. C. Huang, L. Jiang, S. Karmakar , et al. (61 additional authors not shown)

Abstract: We present the first limit on $g_{Aγ}$ coupling constant using the Bragg-Primakoff conversion based on an exposure of 1107.5 kg days of data from the CDEX-1B experiment at the China Jinping Underground Laboratory. The data are consistent with the null signal hypothesis, and no excess signals are observed. Limits of the coupling $g_{Aγ}<2.08\times10^{-9}$ GeV$^{-1}$ (95\% C.L.) are derived for axions with mass up to 100 eV/$c^2$. Within the hadronic model of KSVZ, our results exclude axion mass $>5.3~\rm{eV}/c^2$ at 95\% C.L.

Submitted 12 May, 2024; originally announced May 2024.

Comments: 7 pages, 5 figures

arXiv:2405.05671 [pdf, other]

Self-correcting GKP qubit and gates in a driven-dissipative circuit

Authors: Frederik Nathan, Liam O'Brien, Kyungjoo Noh, Matthew H. Matheny, Arne L. Grimsmo, Liang Jiang, Gil Refael

Abstract: We propose a circuit architecture for a dissipatively error-corrected GKP qubit. The device consists of a high-impedance LC circuit coupled to a Josephson junction and a resistor via a controllable switch. When the switch is activated via a particular family of stepwise protocols, the resistor absorbs all noise-induced entropy, resulting in dissipative error correction of both phase and amplitude errors. This leads to an exponential increase of qubit lifetime, reaching beyond 10ms in simulations with near-feasible parameters. We show that the lifetime remains exponentially long in the presence of extrinsic noise and device/control imperfections (e.g., due to parasitics and finite control bandwidth) under specific thresholds. In this regime, lifetime is likely only limited by phase slips and quasiparticle tunneling. We show that the qubit can be read out and initialized via measurement of the supercurrent in the Josephson junction. We finally show that the qubit supports native self-correcting single-qubit Clifford gates, where dissipative error-correction of control noise leads to exponential suppression of gate infidelity.

Submitted 9 May, 2024; originally announced May 2024.

Comments: 12 pages + 8 figures in the main text

arXiv:2405.04032 [pdf, other]

Locally Differentially Private In-Context Learning

Authors: Chunyan Zheng, Keke Sun, Wenhao Zhao, Haibo Zhou, Lixin Jiang, Shaoyang Song, Chunlai Zhou

Abstract: Large pretrained language models (LLMs) have shown surprising In-Context Learning (ICL) ability. An important application in deploying large language models is to augment LLMs with a private database for some specific task. The main problem with this promising commercial use is that LLMs have been shown to memorize their training data, and their prompt data are vulnerable to membership inference attacks (MIA) and prompt leaking attacks. In order to deal with this problem, we treat LLMs as untrusted in privacy and propose a locally differentially private framework of in-context learning (LDP-ICL) in the settings where labels are sensitive. Considering the mechanisms of in-context learning in Transformers by gradient descent, we provide an analysis of the trade-off between privacy and utility in such LDP-ICL for classification. Moreover, we apply LDP-ICL to the discrete distribution estimation problem. In the end, we perform several experiments to demonstrate our analysis results.

Submitted 8 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: This paper was published at LREC-Coling 2024

arXiv:2405.03781 [pdf, other]

doi 10.3847/1538-4357/ad488a

Large Scale Overdensity of Lyman Break Galaxies Around the z=6.3 Ultraluminous Quasar J0100+2802

Authors: Maria Pudoka, Feige Wang, Xiaohui Fan, Jinyi Yang, Jaclyn Champagne, Victoria Jones, Fuyan Bian, Zheng Cai, Linhua Jiang, Dezi Liu, Xue-Bing Wu

Abstract: We study the environment of the z=6.33 ultraluminous quasar SDSS J010013.02+280225.8 (J0100) to understand its association with large-scale structure. Theoretical models propose high-redshift quasars as markers of galaxy overdensities residing in the most massive dark matter halos (DMHs) in the early universe. J0100 is an ultraluminous quasar with the most massive black hole known at z>6, suggesting a high likelihood of residing in a massive DMH. We present wide-field ($\sim$522 square arcminute) imaging in the r-, i-, and z-bands from the Large Binocular Camera on the Large Binocular Telescope, with Y- and J-band imaging from the Wide-field Infrared Camera on the Canada-France-Hawaii Telescope, centered on J0100. Applying color selections, we identify 23 objects as i-dropout Lyman Break Galaxy (LBG) candidates in the J0100 field. We use the deep photometric catalog in the 1.27 square degree COSMOS field to calculate the density of LBGs in a blank field, and to estimate the selection completeness and purity. The observed surface density of LBG candidates in the J0100 field corresponds to a galaxy overdensity of $δ$=4 (at 8.4$σ$). This large-scale overdensity suggests that the $\sim$ 22 square arcminute overdensity found by Kashino et al. using JWST data extends out to much larger scales. We calculate the angular auto-correlation function of the candidates and find a positive correlation on $\lesssim$ 10 arcminute scales as well as evidence of asymmetries in their spatial distribution, further suggesting a direct detection of large-scale structure around the ultra-luminous quasar J0100.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 21 pages, 11 figures, 3 tables, to be published in The Astrophysical Journal (ApJ)

arXiv:2405.03280 [pdf, other]

Animate Your Thoughts: Decoupled Reconstruction of Dynamic Natural Vision from Slow Brain Activity

Authors: Yizhuo Lu, Changde Du, Chong Wang, Xuanliu Zhu, Liuyun Jiang, Huiguang He

Abstract: Reconstructing human dynamic vision from brain activity is a challenging task with great scientific significance. The difficulty stems from two primary issues: (1) vision-processing mechanisms in the brain are highly intricate and not fully revealed, making it challenging to directly learn a mapping between fMRI and video; (2) the temporal resolution of fMRI is significantly lower than that of natural videos. To overcome these issues, this paper proposes a two-stage model named Mind-Animator, which achieves state-of-the-art performance on three public datasets. Specifically, during the fMRI-to-feature stage, we decouple semantic, structural, and motion features from fMRI through fMRI-vision-language tri-modal contrastive learning and sparse causal attention. In the feature-to-video stage, these features are merged into videos by an inflated Stable Diffusion. We substantiate that the reconstructed video dynamics are indeed derived from fMRI, rather than hallucinations of the generative model, through permutation tests. Additionally, the visualization of voxel-wise and ROI-wise importance maps confirms the neurobiological interpretability of our model.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2405.02798 [pdf, other]

Structural Balance in Real-World Social Networks: Incorporating Direction and Transitivity in Measuring Partial Balance

Authors: Rezvaneh Rezapour, Ly Dinh, Lan Jiang, Jana Diesner

Abstract: Structural balance theory predicts that triads in networks gravitate towards stable configurations. The theory has been verified for undirected graphs. Since real-world networks are often directed, we introduce a novel method for considering both transitivity and sign consistency for evaluating partial balance in signed digraphs. We test our approach on graphs constructed by using different methods for identifying edge signs: natural language processing to infer signs from underlying text data, and self-reported survey data. Our results show that for various social contexts and edge sign detection methods, the partial balance of these digraphs is moderately high, ranging from 61% to 96%. Our approach not only enhances the theoretical framework of structural balance but also provides practical insights into the stability of social networks, enabling a deeper understanding of interpersonal and group dynamics across different communication platforms.

Submitted 4 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2006.02565

arXiv:2405.02155 [pdf, other]

Multi-method Integration with Confidence-based Weighting for Zero-shot Image Classification

Authors: Siqi Yin, Lifan Jiang

Abstract: This paper introduces a novel framework for zero-shot learning (ZSL), i.e., to recognize new categories that are unseen during training, by using a multi-model and multi-alignment integration method. Specifically, we propose three strategies to enhance the model's performance to handle ZSL: 1) Utilizing the extensive knowledge of ChatGPT and the powerful image generation capabilities of DALL-E to create reference images that can precisely describe unseen categories and classification boundaries, thereby alleviating the information bottleneck issue; 2) Integrating the results of text-image alignment and image-image alignment from CLIP, along with the image-image alignment results from DINO, to achieve more accurate predictions; 3) Introducing an adaptive weighting mechanism based on confidence levels to aggregate the outcomes from different prediction methods. Experimental results on multiple datasets, including CIFAR-10, CIFAR-100, and TinyImageNet, demonstrate that our model can significantly improve classification accuracy compared to single-model approaches, achieving AUROC scores above 96% across all test datasets, and notably surpassing 99% on the CIFAR-10 dataset.

Submitted 3 May, 2024; originally announced May 2024.

Showing 1–50 of 1,665 results for author: Jiang, L