Deformation-Recovery Diffusion Model (DRDM): Instance Deformation for Image Manipulation and Synthesis

Authors: Jian-Qing Zheng, Yuanhan Mo, Yang Sun, Jiahua Li, Fuping Wu, Ziyang Wang, Tonia Vincent, Bartłomiej W. Papież

Abstract: In medical imaging, the diffusion models have shown great potential in synthetic image generation tasks. However, these models often struggle with the interpretable connections between the generated and existing images and could create illusions. To address these challenges, our research proposes a novel diffusion-based generative model based on deformation diffusion and recovery. This model, named Deformation-Recovery Diffusion Model (DRDM), diverges from traditional score/intensity and latent feature-based approaches, emphasizing morphological changes through deformation fields rather than direct image synthesis. This is achieved by introducing a topological-preserving deformation field generation method, which randomly samples and integrates a set of multi-scale Deformation Vector Fields (DVF). DRDM is trained to learn to recover unreasonable deformation components, thereby restoring each randomly deformed image to a realistic distribution. These innovations facilitate the generation of diverse and anatomically plausible deformations, enhancing data augmentation and synthesis for further analysis in downstream tasks, such as few-shot learning and image registration. Experimental results in cardiac MRI and pulmonary CT show DRDM is capable of creating diverse, large (over 10% image size deformation scale), and high-quality (negative ratio of folding rate is lower than 1%) deformation fields. The further experimental results in downstream tasks, 2D image segmentation and 3D image registration, indicate significant improvements resulting from DRDM, showcasing the potential of our model to advance image manipulation and synthesis in medical imaging and beyond. Our implementation will be available at https://github.com/jianqingzheng/def_diff_rec.
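The folding rate quoted in the abstract above is commonly understood as the fraction of grid points where the Jacobian determinant of the deformation is non-positive. The following is a minimal sketch of that convention for a 2D displacement field, not the authors' implementation; the function name `folding_rate`, the forward-difference scheme, and the unit grid spacing are all assumptions made for illustration.

```python
def folding_rate(ux, uy):
    """Fraction of grid cells where det(J) <= 0 for phi(x) = x + u(x).

    ux, uy: 2D lists (rows = y, columns = x) holding the x- and
    y-components of a displacement field on a unit-spaced grid.
    Hypothetical helper: forward differences are an assumption here.
    """
    h, w = len(ux), len(ux[0])
    if h < 2 or w < 2:
        raise ValueError("need at least a 2x2 grid")
    folded = 0
    total = 0
    for i in range(h - 1):
        for j in range(w - 1):
            # Forward differences of the displacement components.
            dux_dx = ux[i][j + 1] - ux[i][j]
            dux_dy = ux[i + 1][j] - ux[i][j]
            duy_dx = uy[i][j + 1] - uy[i][j]
            duy_dy = uy[i + 1][j] - uy[i][j]
            # Jacobian of phi = identity + displacement gradient.
            det = (1 + dux_dx) * (1 + duy_dy) - dux_dy * duy_dx
            if det <= 0:
                folded += 1
            total += 1
    return folded / total


zero = [[0.0] * 4 for _ in range(4)]
# Displacement that mirrors the x-axis: phi_x = x - 2x = -x, so det(J) = -1.
mirror_x = [[-2.0 * j for j in range(4)] for _ in range(4)]

print(folding_rate(zero, zero))      # 0.0 (identity map, no folding)
print(folding_rate(mirror_x, zero))  # 1.0 (every cell is folded)
```

Under this convention, a field meeting the abstract's quality criterion would return a value below 0.01.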

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2407.03688 [pdf, other]

Adaptive sampling strategy for tolerance analysis of freeform optical surfaces based on critical ray aiming

Authors: Rundong Fan, Shili Wei, Zhuang Qian, Huiru Ji, Hao Tan, Yan Mo, Donglin Ma

Abstract: The tolerance analysis of freeform surfaces plays a crucial role in the development of advanced imaging systems. However, the intricate relationship between surface error and imaging quality poses significant challenges, necessitating dense sampling of featured rays during the computation process to ensure an accurate tolerance for different fields of view (FOVs). Here, we propose an adaptive sampling strategy called "Critical Ray Aiming" for surface tolerance analysis. By identifying the most sensitive ray to wave aberration at each surface point, our methodology facilitates flexible sampling of the FOVs and entrance pupil (EP), achieving computational efficiency without compromising accuracy in determining tolerable surface error. We demonstrate the effectiveness of our method through tolerance analysis of two different freeform imaging systems.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.00136 [pdf, other]

Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (495 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780 GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.

Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

arXiv:2406.16841 [pdf, ps, other]

Proposal for the generation of continuous-wave vacuum ultraviolet laser light for Th-229 isomer precision spectroscopy

Authors: Qi Xiao, Gleb Penyazkov, Ruihan Yu, Beichen Huang, Jiatong Li, Juanlang Shi, Yanmei Yu, Yuxiang Mo, Shiqian Ding

Abstract: We propose to generate continuous-wave vacuum ultraviolet (VUV) laser light at 148.4 nm using four-wave mixing in cadmium vapor for precision spectroscopy of the Th-229 isomer transition. Due to the large transition matrix elements of cadmium, the readily accessible wavelengths for the incident laser beams, and the high coherence of the four-wave mixing process, over 30 $μ$W of VUV power can be generated with a narrow linewidth. This development paves the way for coherently driving the Th-229 isomer transition and developing the nuclear optical clock.

Submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.15305 [pdf, other]

PID: Prompt-Independent Data Protection Against Latent Diffusion Models

Authors: Ang Li, Yichuan Mo, Mingjie Li, Yisen Wang

Abstract: The few-shot fine-tuning of Latent Diffusion Models (LDMs) has enabled them to grasp new concepts from a limited number of images. However, given the vast amount of personal images accessible online, this capability raises critical concerns about civil privacy. While several previous defense methods have been developed to prevent such misuse of LDMs, they typically assume that the textual prompts used by data protectors exactly match those employed by data exploiters. In this paper, we first empirically demonstrate that breaking this assumption, i.e., in cases where discrepancies exist between the textual conditions used by protectors and exploiters, could substantially reduce the effectiveness of these defenses. Furthermore, considering the visual encoder's independence from textual prompts, we delve into the visual encoder and thoroughly investigate how manipulating the visual encoder affects the few-shot fine-tuning process of LDMs. Drawing on these insights, we propose a simple yet effective method called Prompt-Independent Defense (PID) to safeguard privacy against LDMs. We show that PID can act as a strong privacy shield on its own while requiring significantly less computational power. We believe our studies, along with the comprehensive understanding and new defense method, provide a notable advance toward reliable data protection against LDMs.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 27 pages, ICML 2024 poster

arXiv:2406.07824 [pdf, other]

Efficient Arbitrated Quantum Digital Signature with Multi-Receiver Verification

Authors: Siyu Xiong, Bangying Tang, Hui Han, Jinquan Huang, Mingqiang Bai, Fangzhao Li, Wanrong Yu, Zhiwen Mo, Bo Liu

Abstract: Quantum digital signature is used to authenticate the identity of the signer with information theoretical security, while providing non-forgery and non-repudiation services. In traditional multi-receiver quantum digital signature schemes without an arbitrater, the transferability of one-to-one signature is always required to achieve unforgeability, with complicated implementation and heavy key consumption. In this article, we propose an arbitrated quantum digital signature scheme, in which the signature can be verified by multiple receivers simultaneously, and meanwhile, the transferability of the signature is still kept. Our scheme can be simplified performed to various quantum secure networks, due to the proposed efficient signature calculation procedure with low secure key consumption and low computation complexity, by employing one-time universal hashing algorithm and one-time pad encryption scheme. The evaluation results show that our scheme uses at least two orders of magnitude less key than existing signature schemes with transferability when signing files of the same length with the same number of receivers and security parameter settings.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.03127 [pdf, other]

Towards Real-world Scenario: Imbalanced New Intent Discovery

Authors: Shun Zhang, Chaoran Yan, Jian Yang, Jiaheng Liu, Ying Mo, Jiaqi Bai, Tongliang Li, Zhoujun Li

Abstract: New Intent Discovery (NID) aims at detecting known and previously undefined categories of user intent by utilizing limited labeled and massive unlabeled data. Most prior works often operate under the unrealistic assumption that the distribution of both familiar and new intent classes is uniform, overlooking the skewed and long-tailed distributions frequently encountered in real-world scenarios. To bridge the gap, our work introduces the imbalanced new intent discovery (i-NID) task, which seeks to identify familiar and novel intent categories within long-tailed distributions. A new benchmark (ImbaNID-Bench) comprised of three datasets is created to simulate the real-world long-tail distributions. ImbaNID-Bench ranges from broad cross-domain to specific single-domain intent categories, providing a thorough representation of practical use cases. Besides, a robust baseline model ImbaNID is proposed to achieve cluster-friendly intent representations. It includes three stages: model pre-training, generation of reliable pseudo-labels, and robust representation learning that strengthens the model performance to handle the intricacies of real-world data distributions. Our extensive experiments on previous benchmarks and the newly established benchmark demonstrate the superior performance of ImbaNID in addressing the i-NID task, highlighting its potential as a powerful baseline for uncovering and categorizing user intents in imbalanced and long-tailed distributions (https://github.com/Zkdc/i-NID).

Submitted 5 June, 2024; originally announced June 2024.

Comments: ACL 2024

arXiv:2405.17802 [pdf, other]

Multi-level Interaction Modeling for Protein Mutational Effect Prediction

Authors: Yuanle Mo, Xin Hong, Bowen Gao, Yinjun Jia, Yanyan Lan

Abstract: Protein-protein interactions are central mediators in many biological processes. Accurately predicting the effects of mutations on interactions is crucial for guiding the modulation of these interactions, thereby playing a significant role in therapeutic development and drug discovery. Mutations generally affect interactions hierarchically across three levels: mutated residues exhibit different sidechain conformations, which lead to changes in the backbone conformation, eventually affecting the binding affinity between proteins. However, existing methods typically focus only on sidechain-level interaction modeling, resulting in suboptimal predictions. In this work, we propose a self-supervised multi-level pre-training framework, ProMIM, to fully capture all three levels of interactions with well-designed pretraining objectives. Experiments show ProMIM outperforms all the baselines on the standard benchmark, especially on mutations where significant changes in backbone conformations may occur. In addition, leading results from zero-shot evaluations for SARS-CoV-2 mutational effect prediction and antibody optimization underscore the potential of ProMIM as a powerful next-generation tool for developing novel therapeutic approaches and new drugs.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.08599 [pdf, other]

The distributed biased min-consensus protocol revisited: pre-specified finite time control strategies and small-gain based analysis

Authors: Yuanqiu Mo, He Wang

Abstract: Unlike the classical distributed consensus protocols enabling the group of agents as a whole to reach an agreement regarding a certain quantity of interest in a distributed fashion, the distributed biased min-consensus protocol (DBMC) has been proven to generate advanced complexity pertaining to solving the shortest path problem. As such a protocol is commonly incorporated as the first step of a hierarchical architecture in real applications, e.g., robots path planning, management of dispersed computing services, an impedance limiting the application potential of DBMC lies in, the lack of results regarding to its convergence within a user-assigned time. In this paper, we first propose two control strategies ensuring the state error of DBMC decrease exactly to zero or a desired level manipulated by the user, respectively. To compensate the high feedback gains incurred by these two control strategies, this paper further investigates the nominal DBMC itself. By leveraging small gain based stability tools, this paper also proves the global exponential input-to-state stability of DBMC, outperforming its current stability results. Simulations have been provided to validate the efficacy of our theoretical result.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.06652 [pdf]

Large Language Model (LLM) AI text generation detection based on transformer deep learning algorithm

Authors: Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li

Abstract: In this paper, a tool for detecting LLM AI text generation is developed based on the Transformer model, aiming to improve the accuracy of AI text generation detection and provide reference for subsequent research. Firstly the text is Unicode normalised, converted to lowercase form, characters other than non-alphabetic characters and punctuation marks are removed by regular expressions, spaces are added around punctuation marks, first and last spaces are removed, consecutive ellipses are replaced with single spaces and the text is connected using the specified delimiter. Next remove non-alphabetic characters and extra whitespace characters, replace multiple consecutive whitespace characters with a single space and again convert to lowercase form. The deep learning model combines layers such as LSTM, Transformer and CNN for text classification or sequence labelling tasks. The training and validation sets show that the model loss decreases from 0.127 to 0.005 and accuracy increases from 94.96 to 99.8, indicating that the model has good detection and classification ability for AI generated text. The test set confusion matrix and accuracy show that the model has 99% prediction accuracy for AI-generated text, with a precision of 0.99, a recall of 1, and an f1 score of 0.99, achieving a very high classification accuracy. Looking forward, it has the prospect of wide application in the field of AI text detection.
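The precision, recall, and F1 figures quoted in the abstract above can be checked against one another, since F1 is by definition the harmonic mean of precision and recall. The snippet below is only an arithmetic sanity check on the reported numbers; the helper name `f1_score` is chosen here for illustration and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract: precision 0.99, recall 1.
reported = f1_score(0.99, 1.0)
print(round(reported, 2))  # 0.99, consistent with the stated F1 score
```

The unrounded value is 1.98 / 1.99, roughly 0.995, so the three reported metrics agree at two decimal places.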

Submitted 6 April, 2024; originally announced May 2024.

Comments: 6 pages

arXiv:2405.05288 [pdf, other]

Learning Social Graph for Inactive User Recommendation

Authors: Nian Liu, Shen Fan, Ting Bai, Peng Wang, Mingwei Sun, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Chuan Shi

Abstract: Social relations have been widely incorporated into recommender systems to alleviate data sparsity problem. However, raw social relations don't always benefit recommendation due to their inferior quality and insufficient quantity, especially for inactive users, whose interacted items are limited. In this paper, we propose a novel social recommendation method called LSIR (Learning Social Graph for Inactive User Recommendation) that learns an optimal social graph structure for social recommendation, especially for inactive users. LSIR recursively aggregates user and item embeddings to collaboratively encode item and user features. Then, graph structure learning (GSL) is employed to refine the raw user-user social graph, by removing noisy edges and adding new edges based on the enhanced embeddings. Meanwhile, mimic learning is implemented to guide active users in mimicking inactive users during model training, which improves the construction of new edges for inactive users. Extensive experiments on real-world datasets demonstrate that LSIR achieves significant improvements of up to 129.58% on NDCG in inactive user recommendation. Our code is available at https://github.com/liun-online/LSIR.

Submitted 22 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

Comments: This paper has been received by DASFAA 2024

arXiv:2404.19660 [pdf, other]

Decoder Decomposition for the Analysis of the Latent Space of Nonlinear Autoencoders With Wind-Tunnel Experimental Data

Authors: Yaxin Mo, Tullio Traverso, Luca Magri

Abstract: Turbulent flows are chaotic and multi-scale dynamical systems, which have large numbers of degrees of freedom. Turbulent flows, however, can be modelled with a smaller number of degrees of freedom when using the appropriate coordinate system, which is the goal of dimensionality reduction via nonlinear autoencoders. Autoencoders are expressive tools, but they are difficult to interpret. The goal of this paper is to propose a method to aid the interpretability of autoencoders. This is the decoder decomposition. First, we propose the decoder decomposition, which is a post-processing method to connect the latent variables to the coherent structures of flows. Second, we apply the decoder decomposition to analyse the latent space of synthetic data of a two-dimensional unsteady wake past a cylinder. We find that the dimension of latent space has a significant impact on the interpretability of autoencoders. We identify the physical and spurious latent variables. Third, we apply the decoder decomposition to the latent space of wind-tunnel experimental data of a three-dimensional turbulent wake past a bluff body. We show that the reconstruction error is a function of both the latent space dimension and the decoder size, which are correlated. Finally, we apply the decoder decomposition to rank and select latent variables based on the coherent structures that they represent. This is useful to filter unwanted or spurious latent variables, or to pinpoint specific coherent structures of interest. The ability to rank and select latent variables will help users design and interpret nonlinear autoencoders.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.19175 [pdf, other]

Game-MUG: Multimodal Oriented Game Situation Understanding and Commentary Generation Dataset

Authors: Zhihao Zhang, Feiqi Cao, Yingbin Mo, Yiran Zhang, Josiah Poon, Caren Han

Abstract: The dynamic nature of esports makes the situation relatively complicated for average viewers. Esports broadcasting involves game expert casters, but the caster-dependent game commentary is not enough to fully understand the game situation. It will be richer by including diverse multimodal esports information, including audiences' talks/emotions, game audio, and game match event information. This paper introduces GAME-MUG, a new multimodal game situation understanding and audience-engaged commentary generation dataset and its strong baseline. Our dataset is collected from 2020-2022 LOL game live streams from YouTube and Twitch, and includes multimodal esports game information, including text, audio, and time-series event logs, for detecting the game situation. In addition, we also propose a new audience conversation augmented commentary dataset by covering the game situation and audience conversation understanding, and introducing a robust joint multimodal dual learning model as a baseline. We examine the model's game situation/event understanding ability and commentary generation capability to show the effectiveness of the multimodal aspects coverage and the joint integration learning approach.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.16727 [pdf, other]

Learning-Based Efficient Approximation of Data-enabled Predictive Control

Authors: Yihan Zhou, Yiwen Lu, Zishuo Li, Jiaqi Yan, Yilin Mo

Abstract: Data-Enabled Predictive Control (DeePC) bypasses the need for system identification by directly leveraging raw data to formulate optimal control policies. However, the size of the optimization problem in DeePC grows linearly with respect to the data size, which prohibits its application due to high computational costs. In this paper, we propose an efficient approximation of DeePC, whose size is invariant with respect to the amount of data collected, via differentiable convex programming. Specifically, the optimization problem in DeePC is decomposed into two parts: a control objective and a scoring function that evaluates the likelihood of a guessed I/O sequence, the latter of which is approximated with a size-invariant learned optimization problem. The proposed method is validated through numerical simulations on a quadruple tank system, illustrating that the learned controller can reduce the computational time of DeePC by 5x while maintaining its control performance.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.13701 [pdf, other]

Semantic-Rearrangement-Based Multi-Level Alignment for Domain Generalized Segmentation

Authors: Guanlong Jiao, Chenyangguang Zhang, Haonan Yin, Yu Mo, Biqing Huang, Hui Pan, Yi Luo, Jingxian Liu

Abstract: Domain generalized semantic segmentation is an essential computer vision task, for which models only leverage source data to learn the capability of generalized semantic segmentation towards the unseen target domains. Previous works typically address this challenge by global style randomization or feature regularization. In this paper, we argue that given the observation that different local semantic regions perform different visual characteristics from the source domain to the target domain, methods focusing on global operations are hard to capture such regional discrepancies, thus failing to construct domain-invariant representations with the consistency from local to global level. Therefore, we propose the Semantic-Rearrangement-based Multi-Level Alignment (SRMA) to overcome this problem. SRMA first incorporates a Semantic Rearrangement Module (SRM), which conducts semantic region randomization to enhance the diversity of the source domain sufficiently. A Multi-Level Alignment module (MLA) is subsequently proposed with the help of such diversity to establish the global-regional-local consistent domain-invariant representations. By aligning features across randomized samples with domain-neutral knowledge at multiple levels, SRMA provides a more robust way to handle the source-target domain gap. Extensive experiments demonstrate the superiority of SRMA over the current state-of-the-art works on various benchmarks.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2403.19081 [pdf]

Surface variation analysis of freeform optical systems over surface frequency bands for prescribed wavefront errors

Authors: Rundong Fan, Shili Wei, Huiru Ji, Zhuang Qian, Hao Tan, Yan Mo, Donglin Ma

Abstract: The surface errors of freeform surfaces reflect the manufacturing complexities and significantly impact the feasibility of processing designed optical systems. With multiple degrees of freedom, freeform surfaces pose challenges in surface tolerance analysis in the field. Nevertheless, current research has neglected the influence of surface slopes on the directions of ray propagation. A sudden alteration in the surface slope will lead to a corresponding abrupt shift in the wavefront, even when the change in surface sag is minimal. Moreover, within the realm of freeform surface manufacturing, variation in surface slope across different frequency bands may give rise to unique surface variation. Within the context of this study, we propose a tolerance analysis method to analyze surface variation in freeform surfaces considering surface frequency band slopes based on real ray data. This approach utilizes real ray data to rapidly evaluate surface variation within a specified frequency band of surface slopes. Crucially, our proposed method yields the capability to obtain system surface variation with significant wavefront aberration, in contrast to previous methodologies. The feasibility and advantages of this framework are assessed by analyzing a single-mirror system with a single field and an off-axis two-mirror system. We expect to integrate the proposed methodology with freeform surface design and manufacturing, thereby expanding the scope of freeform optics.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.04704 [pdf, other]

Quantum Advantage in Reversing Unknown Unitary Evolutions

Authors: Yu-Ao Chen, Yin Mo, Yingjian Liu, Lei Zhang, Xin Wang

Abstract: We introduce the Quantum Unitary Reversal Algorithm (QURA), a deterministic and exact approach to universally reverse arbitrary unknown unitary transformations using $\mathcal{O}(d^2)$ calls of the unitary, where $d$ is the system dimension. Our construction resolves a fundamental problem of time-reversal simulations for closed quantum systems by affirming the feasibility of reversing any unitary evolution without knowing the exact process. The algorithm also provides the construction of a key oracle for unitary inversion in quantum algorithm frameworks such as quantum singular value transformation. Notably, our work demonstrates that compared with classical methods relying on process tomography, reversing an unknown unitary on a quantum computer holds a quadratic quantum advantage in computation complexity. QURA ensures an exact unitary inversion while the classical counterpart can never achieve exact inversion using a finite number of unitary calls.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 17 pages including appendix

arXiv:2403.03761 [pdf, other]

Parameterized quantum comb and simpler circuits for reversing unknown qubit-unitary operations

Authors: Yin Mo, Lei Zhang, Yu-Ao Chen, Yingjian Liu, Tengxiang Lin, Xin Wang

Abstract: Quantum comb is an essential tool for characterizing complex quantum protocols in quantum information processing. In this work, we introduce PQComb, a framework leveraging parameterized quantum circuits to explore the capabilities of quantum combs for general quantum process transformation tasks and beyond. By optimizing PQComb for time-reversal simulations of unknown unitary evolutions, we develop a simpler protocol for unknown qubit unitary inversion that reduces the ancilla qubit overhead from 6 to 3 compared to the existing method in [Yoshida, Soeda, Murao, PRL 131, 120602, 2023]. This demonstrates the utility of quantum comb structures and showcases PQComb's potential for solving complex quantum tasks. Our results pave the way for broader PQComb applications in quantum computing and quantum information, emphasizing its versatility for tackling diverse problems in quantum machine learning.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 12 pages including appendix

arXiv:2402.19263 [pdf, other]

Spinal Osteophyte Detection via Robust Patch Extraction on minimally annotated X-rays

Authors: Soumya Snigdha Kundu, Yuanhan Mo, Nicharee Srikijkasemwat, Bartłomiej W. Papiez

Abstract: The development and progression of arthritis is strongly associated with osteophytes, which are small and elusive bone growths. This paper presents one of the first efforts towards automated spinal osteophyte detection in spinal X-rays. A novel automated patch extraction process, called SegPatch, has been proposed based on deep learning-driven vertebrae segmentation and the enlargement of mask contours. A final patch classification accuracy of 84.5% is secured, surpassing a baseline tiling-based patch generation technique by 9.5%. This demonstrates that even with limited annotations, SegPatch can deliver superior performance for detection of tiny structures such as osteophytes. The proposed approach has potential to assist clinicians in expediting the process of manually identifying osteophytes in spinal X-ray.

Submitted 29 February, 2024; originally announced February 2024.

Comments: ISBI'24 Full Paper

arXiv:2402.11254 [pdf, other]

C-ICL: Contrastive In-context Learning for Information Extraction

Authors: Ying Mo, Jiahao Liu, Jian Yang, Qifan Wang, Shun Zhang, Jingang Wang, Zhoujun Li

Abstract: There has been increasing interest in exploring the capabilities of advanced large language models (LLMs) in the field of information extraction (IE), specifically focusing on tasks related to named entity recognition (NER) and relation extraction (RE). Although researchers are exploring the use of few-shot information extraction through in-context learning with LLMs, they tend to focus only on using correct or positive examples for demonstration, neglecting the potential value of incorporating incorrect or negative examples into the learning process. In this paper, we present c-ICL, a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations. This approach enhances the ability of LLMs to extract entities and relations by utilizing prompts that incorporate not only the positive samples but also the reasoning behind them. This method allows for the identification and correction of potential interface errors. Specifically, our proposed method taps into the inherent contextual information and valuable information in hard negative samples and the nearest positive neighbors to the test and then applies the in-context learning demonstrations based on LLMs. Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods, delivering substantial enhancements in performance across a broad spectrum of related tasks. These improvements are noteworthy, showcasing the versatility of our approach in miscellaneous scenarios.

Submitted 24 June, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

Comments: 15 pages

arXiv:2402.09505 [pdf, other]

3C 273 Host Galaxy with Hubble Space Telescope Coronagraphy

Authors: Bin B. Ren, Kevin Fogarty, John H. Debes, Eileen T. Meyer, Youbin Mo, Dimitri Mawet, Marshall D. Perrin, Patrick M. Ogle, Johannes Sahlmann

Abstract: The close-in regions of bright quasars' host galaxies have been difficult to image due to the overwhelming light from the quasars. With coronagraphic observations in visible light using the Space Telescope Imaging Spectrograph (STIS) on the Hubble Space Telescope, we removed 3C 273 quasar light using color-matching reference stars. The observations revealed the host galaxy from 60" to 0.2" with nearly full angular coverage. Isophote modeling revealed a new core jet, a core blob, and multiple smaller-scale blobs within 2.5". The blobs could potentially be satellite galaxies or infalling materials towards the central quasar. Using archival STIS data, we constrained the apparent motion of its large scale jets over a 22 yr timeline. By resolving the 3C 273 host galaxy with STIS, our study validates the coronagraph usage on extragalactic sources in obtaining new insights into the central ~kpc regions of quasar hosts.

Submitted 14 February, 2024; originally announced February 2024.

Comments: 13 pages, 11 figures, 2 tables, A&A Letters accepted

arXiv:2402.09111 [pdf, other]

On-shell Bootstrap for n-gluons and gravitons scattering in (A)dS, Unitarity and Soft limit

Authors: Jiajie Mei, Yuyu Mo

Abstract: We propose an algorithm to recursively bootstrap $n$-point gluon and graviton Mellin-Momentum amplitudes in (A)dS spacetime using only the three-point amplitude. We discover that gluon amplitudes are simply determined by factorization for $n\geq 5$. The same principle applies to $n$-point graviton amplitudes, but additional constraints such as flat space and soft limits are needed to fix contact terms. Furthermore, we establish a mapping from $n$-point Mellin-Momentum amplitudes to $n$-point cosmological correlators. We efficiently compute explicit examples up to five points. This leads to the first five-graviton amplitude in $AdS_{d+1}$.

Submitted 16 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

Comments: v2: Typos and minor errors fixed, references added

arXiv:2402.07197 [pdf, other]

GraphTranslator: Aligning Graph Model to Large Language Model for Open-ended Tasks

Authors: Mengmei Zhang, Mingwei Sun, Peng Wang, Shen Fan, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Cheng Yang, Chuan Shi

Abstract: Large language models (LLMs) like ChatGPT, which exhibit powerful zero-shot and instruction-following capabilities, have catalyzed a revolutionary transformation across diverse fields, especially for open-ended tasks. The idea is less explored in the graph domain: despite the availability of numerous powerful graph models (GMs), they are restricted to tasks in a pre-defined form. Although several methods applying LLMs to graphs have been proposed, they fail to simultaneously handle the pre-defined and open-ended tasks, with LLM as a node feature enhancer or as a standalone predictor. To break this dilemma, we propose to bridge the pretrained GM and LLM by a Translator, named GraphTranslator, aiming to leverage GM to handle the pre-defined tasks effectively and utilize the extended interface of LLMs to offer various open-ended tasks for GM. To train such a Translator, we propose a Producer capable of constructing the graph-text alignment data along node information, neighbor information and model information. By translating node representation into tokens, GraphTranslator empowers an LLM to make predictions based on language instructions, providing a unified perspective for both pre-defined and open-ended tasks. Extensive results demonstrate the effectiveness of our proposed GraphTranslator on zero-shot node classification. The graph question answering experiments reveal GraphTranslator's potential across a broad spectrum of open-ended tasks through language instructions. Our code is available at: https://github.com/alibaba/GraphTranslator.

Submitted 27 February, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

arXiv:2402.06255 [pdf, other]

Fight Back Against Jailbreaking via Prompt Adversarial Tuning

Authors: Yichuan Mo, Yuji Wang, Zeming Wei, Yisen Wang

Abstract: While Large Language Models (LLMs) have achieved tremendous success in various applications, they are also susceptible to jailbreak attacks. Several primary defense strategies have been proposed to protect LLMs from producing harmful information, mostly with a particular focus on harmful content filtering or heuristic defensive prompt designs. However, how to achieve intrinsic robustness through the prompts remains an open problem. In this paper, motivated by adversarial training paradigms for achieving reliable robustness, we propose an approach named Prompt Adversarial Tuning (PAT) that trains a prompt control attached to the user prompt as a guard prefix. To achieve our defense goal whilst maintaining natural performance, we optimize the control prompt with both adversarial and benign prompts. Comprehensive experiments show that our method is effective against both black-box and white-box attacks, reducing the success rate of advanced attacks to nearly 0 while maintaining the model's utility on the benign task. The proposed defense strategy incurs only negligible computational overhead, charting a new perspective for future explorations in LLM security. Our code is available at https://github.com/rain152/PAT.

Submitted 9 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

arXiv:2402.00086 [pdf, other]

Retrosynthesis prediction enhanced by in-silico reaction data augmentation

Authors: Xu Zhang, Yiming Mo, Wenguan Wang, Yi Yang

Abstract: Recent advances in machine learning (ML) have expedited retrosynthesis research by assisting chemists to design experiments more efficiently. However, all ML-based methods consume substantial amounts of paired training data (i.e., chemical reaction: product-reactant(s) pair), which is costly to obtain. Moreover, companies view reaction data as a valuable asset and restrict the accessibility to researchers. These issues prevent the creation of more powerful retrosynthesis models due to their data-driven nature. As a response, we exploit easy-to-access unpaired data (i.e., one component of product-reactant(s) pair) for generating in-silico paired data to facilitate model training. Specifically, we present RetroWISE, a self-boosting framework that employs a base model inferred from real paired data to perform in-silico reaction generation and augmentation using unpaired data, ultimately leading to a superior model. On three benchmark datasets, RetroWISE achieves the best overall performance against state-of-the-art models (e.g., +8.6% top-1 accuracy on the USPTO-50K test dataset). Moreover, it consistently improves the prediction accuracy of rare transformations. These results show that RetroWISE overcomes the training bottleneck by in-silico reactions, thereby paving the way toward more effective ML-based retrosynthesis models.

Submitted 31 January, 2024; originally announced February 2024.

arXiv:2401.12564 [pdf, other]

Graph Contrastive Invariant Learning from the Causal Perspective

Authors: Yanhu Mo, Xiao Wang, Shaohua Fan, Chuan Shi

Abstract: Graph contrastive learning (GCL), learning the node representation by contrasting two augmented graphs in a self-supervised way, has attracted considerable attention. GCL is usually believed to learn the invariant representation. However, does this understanding always hold in practice? In this paper, we first study GCL from the perspective of causality. By analyzing GCL with the structural causal model (SCM), we discover that traditional GCL may not learn the invariant representations well due to the non-causal information contained in the graph. How can we fix it and encourage the current GCL to learn better invariant representations? The SCM offers two requirements and motivates us to propose a novel GCL method. Particularly, we introduce the spectral graph augmentation to simulate the intervention upon non-causal factors. Then we design the invariance objective and independence objective to better capture the causal factors. Specifically, (i) the invariance objective encourages the encoder to capture the invariant information contained in causal variables, and (ii) the independence objective aims to reduce the influence of confounders on the causal variables. Experimental results demonstrate the effectiveness of our approach on node classification tasks.

Submitted 7 March, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.11425 [pdf, other]

Grayscale Image Colorization with GAN and CycleGAN in Different Image Domain

Authors: Chen Liang, Yunchen Sheng, Yichen Mo

Abstract: Automatic colorization of grayscale images has been a challenging task. Previous research has applied supervised methods to this problem [1]. In this paper, we reproduce a GAN-based coloring model and experiment with one of its variants. We also propose a CycleGAN-based model and evaluate these methods on various datasets. The results show that the proposed CycleGAN model performs well in human-face coloring and comic coloring, but lacks the ability to produce diverse colorizations.

Submitted 21 January, 2024; originally announced January 2024.

ACM Class: I.4.3

arXiv:2401.09699 [pdf, other]

Curriculum Recommendations Using Transformer Base Model with InfoNCE Loss And Language Switching Method

Authors: Xiaonan Xu, Bin Yuan, Yongyao Mo, Tianbo Song, Shulin Li

Abstract: The Curriculum Recommendations paradigm is dedicated to fostering learning equality within the ever-evolving realms of educational technology and curriculum development. In acknowledging the inherent obstacles posed by existing methodologies, such as content conflicts and disruptions from language translation, this paradigm aims to confront and overcome these challenges. Notably, it addresses content conflicts and disruptions introduced by language translation, hindrances that can impede the creation of an all-encompassing and personalized learning experience. The paradigm's objective is to cultivate an educational environment that not only embraces diversity but also customizes learning experiences to suit the distinct needs of each learner. To overcome these challenges, our approach builds upon notable contributions in curriculum development and personalized learning, introducing three key innovations. These include the integration of Transformer Base Model to enhance computational efficiency, the implementation of InfoNCE Loss for accurate content-topic matching, and the adoption of a language switching strategy to alleviate translation-related ambiguities. Together, these innovations aim to collectively tackle inherent challenges and contribute to forging a more equitable and effective learning journey for a diverse range of learners. Competitive cross-validation scores underscore the efficacy of sentence-transformers/LaBSE, achieving 0.66314, showcasing our methodology's effectiveness in diverse linguistic nuances for content alignment prediction.
Index Terms: Curriculum Recommendation, Transformer model with InfoNCE Loss, Language Switching.

Submitted 17 January, 2024; originally announced January 2024.

Comments: 4 pages, 2 figures, ICAICA2023

MSC Class: 68T50
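
The abstract above names InfoNCE loss for content-topic matching without giving details. A minimal sketch of the standard InfoNCE objective follows, assuming each content embedding is paired with the topic embedding at the same index and that the other topics in the batch serve as negatives. The function name `info_nce_loss`, the temperature value, and the toy vectors are illustrative assumptions, not the authors' implementation.

```python
import math


def info_nce_loss(content_vecs, topic_vecs, temperature=0.05):
    """Mean InfoNCE loss; content_vecs[i] is assumed to match topic_vecs[i]."""

    def normalize(vec):
        # Scale to unit length so the dot product equals cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    contents = [normalize(v) for v in content_vecs]
    topics = [normalize(v) for v in topic_vecs]

    total = 0.0
    for i, content in enumerate(contents):
        # Temperature-scaled similarity of this content to every topic.
        logits = [
            sum(a * b for a, b in zip(content, topic)) / temperature
            for topic in topics
        ]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(z - top) for z in logits))
        # Cross-entropy with the matching topic as the correct class.
        total += log_sum_exp - logits[i]
    return total / len(contents)


# Toy check: correctly paired embeddings should score a lower loss
# than the same embeddings with the pairing swapped.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this sketch a lower loss means each content vector is closer to its own topic than to the other topics in the batch; in practice the embeddings would come from a sentence encoder rather than hand-written vectors.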

arXiv:2401.07206 [pdf, other]

Probabilistic Reduced-Dimensional Vector Autoregressive Modeling with Oblique Projections

Authors: Yanfang Mo, S. Joe Qin

Abstract: In this paper, we propose a probabilistic reduced-dimensional vector autoregressive (PredVAR) model to extract low-dimensional dynamics from high-dimensional noisy data. The model utilizes an oblique projection to partition the measurement space into a subspace that accommodates the reduced-dimensional dynamics and a complementary static subspace. An optimal oblique decomposition is derived for the best predictability regarding prediction error covariance. Building on this, we develop an iterative PredVAR algorithm using maximum likelihood and the expectation-maximization (EM) framework. This algorithm alternately updates the estimates of the latent dynamics and optimal oblique projection, yielding dynamic latent variables with rank-ordered predictability and an explicit latent VAR model that is consistent with the outer projection model. The superior performance and efficiency of the proposed approach are demonstrated using data sets from a synthesized Lorenz system and an industrial process from Eastman Chemical.

Submitted 14 January, 2024; originally announced January 2024.

Comments: 16 pages, 5 figures

arXiv:2401.04672 [pdf, other]

Reversing Unknown Quantum Processes via Virtual Combs: for Channels with Limited Information

Authors: Chengkai Zhu, Yin Mo, Yu-Ao Chen, Xin Wang

Abstract: The inherent irreversibility of quantum dynamics for open systems poses a significant barrier to the inversion of unknown quantum processes. To tackle this challenge, we propose the framework of virtual combs that exploit the unknown process iteratively with additional classical post-processing to simulate the process inverse. Our research establishes a path to achieving the exact inverse of unknown channels with certain conditions, accompanied by a no-go theorem that underscores the intrinsic limitations imposed by quantum mechanics on such tasks. Notably, we demonstrate that an $n$-slot virtual comb can exactly reverse a depolarizing channel with one unknown noise parameter out of $n+1$ potential candidates, and a 1-slot virtual comb can exactly reverse an arbitrary pair of quantum channels. We further explore the approximate inverse of an unknown channel within a given channel set. For any unknown depolarizing channels within a specified noise region, we unveil a worst-case error decay of $\mathcal{O}(n^{-1})$ of reversing the channel via virtual combs. Moreover, we show that virtual combs with constant slots can be applied to universally reverse unitary operations and investigate the trade-off between the slot number and the sampling overhead.

Submitted 9 January, 2024; originally announced January 2024.

Comments: 24 pages, 2 figures

arXiv:2312.06220 [pdf, other]

Dance of Channel and Sequence: An Efficient Attention-Based Approach for Multivariate Time Series Forecasting

Authors: Haoxin Wang, Yipeng Mo, Nan Yin, Honghe Dai, Bixiong Li, Songhai Fan, Site Mo

Abstract: In recent developments, predictive models for multivariate time series analysis have exhibited commendable performance through the adoption of the prevalent principle of channel independence. Nevertheless, it is imperative to acknowledge the intricate interplay among channels, which fundamentally influences the outcomes of multivariate predictions. Consequently, the notion of channel independence, while offering utility to a certain extent, becomes increasingly impractical, leading to information degradation. In response to this pressing concern, we present CSformer, an innovative framework characterized by a meticulously engineered two-stage self-attention mechanism. This mechanism is purposefully designed to enable the segregated extraction of sequence-specific and channel-specific information, while sharing parameters to promote synergy and mutual reinforcement between sequences and channels. Simultaneously, we introduce sequence adapters and channel adapters, ensuring the model's ability to discern salient features across various dimensions. Rigorous experimentation, spanning multiple real-world datasets, underscores the robustness of our approach, consistently establishing its position at the forefront of predictive performance across all datasets. This augmentation substantially enhances the capacity for feature extraction inherent to multivariate time series data, facilitating a more comprehensive exploitation of the available information.

Submitted 11 December, 2023; originally announced December 2023.

arXiv:2312.05332 [pdf, other]

MPC-Inspired Reinforcement Learning for Verifiable Model-Free Control

Authors: Yiwen Lu, Zishuo Li, Yihan Zhou, Na Li, Yilin Mo

Abstract: In this paper, we introduce a new class of parameterized controllers, drawing inspiration from Model Predictive Control (MPC). The controller resembles a Quadratic Programming (QP) solver of a linear MPC problem, with the parameters of the controller being trained via Deep Reinforcement Learning (DRL) rather than derived from system models. This approach addresses the limitations of common controllers with Multi-Layer Perceptron (MLP) or other general neural network architecture used in DRL, in terms of verifiability and performance guarantees, and the learned controllers possess verifiable properties like persistent feasibility and asymptotic stability akin to MPC. On the other hand, numerical examples illustrate that the proposed controller empirically matches MPC and MLP controllers in terms of control performance and has superior robustness against modeling uncertainty and noise. Furthermore, the proposed controller is significantly more computationally efficient compared to MPC and requires fewer parameters to learn than MLP controllers. Real-world experiments on a vehicle drift maneuvering task demonstrate the potential of these controllers for robotics and other demanding control tasks.

Submitted 9 April, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

arXiv:2312.03481 [pdf, other]

Experimental demonstration of mice tumor control with a laser-accelerated high-energy electron radiotherapy prototype

Authors: Zhiyuan Guo, Shuang Liu, Bing Zhou, Junqi Liu, Haiyang Wang, Yang Wan, Yifei Pi, Xiaoyan Wang, Yingyi Mo, Bo Guo, Jianfei Hua, Wei Lu

Abstract: Radiotherapy using very-high-energy electron (VHEE) beams (50-300 MeV) has attracted considerable attention due to its advantageous dose deposition characteristics, enabling deep penetration and the potential for ultra-high dose rate treatment. One promising approach to compactly delivering these high energy electron beams in a cost-effective manner is laser wakefield acceleration (LWFA), which offers ultra-strong accelerating gradients. However, the transition from this concept to a functional machine intended for tumor treatment is still being investigated. Here we present the first self-developed prototype for LWFA-based VHEE radiotherapy, exhibiting high compactness (occupying less than 5 square meters) and high operational stability (validated over a period of one month). Subsequently, we employed this device to irradiate a tumor implanted in a mouse model. Following a dose delivery of $5.8\pm0.2$ Gy with precise tumor conformity, all irradiated mice exhibited pronounced control of tumor growth. For comparison, this tumor-control efficacy was similar to that achieved using commercial X-ray radiotherapy equipment operating at equivalent doses. These results demonstrate the potential of a compact laser-driven VHEE system for preclinical studies involving small animal models and its promising prospects for future clinical translation in cancer therapy.

Submitted 6 December, 2023; originally announced December 2023.

Comments: 20 pages, 4 figures

arXiv:2311.15494 [pdf, other]

Enhancement of non-Stabilizerness within Indefinite Causal Order

Authors: Yin Mo, Chengkai Zhu, Zhiping Liu, Mingrui Jing, Xin Wang

Abstract: In the field of quantum computation, the non-stabilizerness of a quantum circuit is crucial for understanding and quantifying quantum speed-up. In this work, we explore some intriguing phenomena regarding the non-stabilizerness of a circuit when a Quantum SWITCH structure is employed. This structure is a novel quantum construct that enables quantum states to pass through operations in a superposition of different orders and has shown superiority in numerous tasks over circuits with a definite causal order. Firstly, we discover that the completely stabilizer-preserving operations, which cannot generate magic states under standard conditions, can be transformed into a resourceful operation capable of generating magic states when processed by the Quantum SWITCH. Secondly, when considering the effects of noisy channels on operations, we observe that while the non-stabilizerness of each path may be annihilated, their superposition could still preserve the non-stabilizerness of the operation. These findings reveal unique properties brought by the Quantum SWITCH and open further avenues in future research on magic resources of general quantum architecture.

Submitted 26 November, 2023; originally announced November 2023.

Comments: 5+4 pages, 4 figures

arXiv:2310.19654 [pdf, other]

MCAD: Multi-teacher Cross-modal Alignment Distillation for efficient image-text retrieval

Authors: Youbo Lei, Feifei He, Chen Chen, Yingbin Mo, Si Jia Li, Defeng Xie, Haonan Lu

Abstract: Due to the success of large-scale visual-language pretraining (VLP) models and the widespread use of image-text retrieval in industry areas, it is now critically necessary to reduce the model size and streamline their mobile-device deployment. Single- and dual-stream model structures are commonly used in image-text retrieval with the goal of closing the semantic gap between textual and visual modalities. While single-stream models use deep feature fusion to achieve more accurate cross-model alignment, dual-stream models are better at offline indexing and fast inference. We propose a Multi-teacher Cross-modality Alignment Distillation (MCAD) technique to integrate the advantages of single- and dual-stream models. By incorporating the fused single-stream features into the image and text features of the dual-stream model, we formulate new modified teacher similarity distributions and features. Then, we conduct both distribution and feature distillation to boost the capability of the student dual-stream model, achieving high retrieval performance without increasing inference complexity. Extensive experiments demonstrate the remarkable performance and high efficiency of MCAD on image-text retrieval tasks. Furthermore, we implement a lightweight CLIP model on Snapdragon/Dimensity chips with only $\sim$100M running memory and $\sim$8.0ms search latency, achieving the mobile-device application of VLP models.

Submitted 1 April, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

Comments: Accepted by NAACL 2024 Findings

arXiv:2310.12147 [pdf, other]

InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions

Authors: Hanbo Zhang, Jie Xu, Yuchen Mo, Tao Kong

Abstract: Ambiguity is ubiquitous in human communication. Previous approaches in Human-Robot Interaction (HRI) have often relied on predefined interaction templates, leading to reduced performance in realistic and open-ended scenarios. To address these issues, we present a large-scale dataset, InViG, for interactive visual grounding under language ambiguity. Our dataset comprises over 520K images accompanied by open-ended goal-oriented disambiguation dialogues, encompassing millions of object instances and corresponding question-answer pairs. Leveraging the InViG dataset, we conduct extensive studies and propose a set of baseline solutions for end-to-end interactive visual disambiguation and grounding, achieving a 45.6% success rate during validation. To the best of our knowledge, the InViG dataset is the first large-scale dataset for resolving open-ended interactive visual grounding, presenting a practical yet highly challenging benchmark for ambiguity-aware HRI. Codes and datasets are available at: https://openivg.github.io.

Submitted 18 October, 2023; originally announced October 2023.

Comments: 8 pages, 9 figures, 3 tables, under review

arXiv:2310.11790 [pdf, other]

Finite Time Performance Analysis of MIMO Systems Identification

Authors: Shuai Sun, Jiayun Li, Yilin Mo

Abstract: This paper is concerned with the finite time identification performance of an n dimensional discrete-time Multiple-Input Multiple-Output (MIMO) Linear Time-Invariant system, with p inputs and m outputs. We prove that the widely-used Ho-Kalman algorithm and Multivariable Output Error State Space (MOESP) algorithm are ill-conditioned for MIMO system when n/m or n/p is large. Moreover, by analyzing the Cramer-Rao bound, we derive a fundamental limit for identifying the real and stable (or marginally stable) poles of MIMO system and prove that the sample complexity for any unbiased pole estimation algorithm to reach a certain level of accuracy explodes superpolynomially with respect to n/(pm). Numerical results are provided to illustrate the ill-conditionedness of Ho-Kalman algorithm and MOESP algorithm as well as the fundamental limit on identification.

Submitted 18 October, 2023; originally announced October 2023.

Comments: 9 pages, 4 figures

arXiv:2310.07229 [pdf, other]

ProFSA: Self-supervised Pocket Pretraining via Protein Fragment-Surroundings Alignment

Authors: Bowen Gao, Yinjun Jia, Yuanle Mo, Yuyan Ni, Weiying Ma, Zhiming Ma, Yanyan Lan

Abstract: Pocket representations play a vital role in various biomedical applications, such as druggability estimation, ligand affinity prediction, and de novo drug design. While existing geometric features and pretrained representations have demonstrated promising results, they usually treat pockets independent of ligands, neglecting the fundamental interactions between them. However, the limited pocket-ligand complex structures available in the PDB database (less than 100 thousand non-redundant pairs) hampers large-scale pretraining endeavors for interaction modeling. To address this constraint, we propose a novel pocket pretraining approach that leverages knowledge from high-resolution atomic protein structures, assisted by highly effective pretrained small molecule representations. By segmenting protein structures into drug-like fragments and their corresponding pockets, we obtain a reasonable simulation of ligand-receptor interactions, resulting in the generation of over 5 million complexes. Subsequently, the pocket encoder is trained in a contrastive manner to align with the representation of pseudo-ligand furnished by some pretrained small molecule encoders. Our method, named ProFSA, achieves state-of-the-art performance across various tasks, including pocket druggability prediction, pocket matching, and ligand binding affinity prediction. Notably, ProFSA surpasses other pretraining methods by a substantial margin. Moreover, our work opens up a new avenue for mitigating the scarcity of protein-ligand complex data through the utilization of high-quality and diverse protein structure databases.

Submitted 7 March, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

arXiv:2310.06387 [pdf, other]

Jailbreak and Guard Aligned Language Models with Only Few In-Context Demonstrations

Authors: Zeming Wei, Yifei Wang, Ang Li, Yichuan Mo, Yisen Wang

Abstract: Large Language Models (LLMs) have shown remarkable success in various tasks, yet their safety and the risk of generating harmful content remain pressing concerns. In this paper, we delve into the potential of In-Context Learning (ICL) to modulate the alignment of LLMs. Specifically, we propose the In-Context Attack (ICA) which employs harmful demonstrations to subvert LLMs, and the In-Context Defense (ICD) which bolsters model resilience through examples that demonstrate refusal to produce harmful responses. We offer theoretical insights to elucidate how a limited set of in-context demonstrations can pivotally influence the safety alignment of LLMs. Through extensive experiments, we demonstrate the efficacy of ICA and ICD in respectively elevating and mitigating the success rates of jailbreaking prompts. Our findings illuminate the profound influence of ICL on LLM behavior, opening new avenues for improving the safety of LLMs.

Submitted 25 May, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2310.03347 [pdf, other]

Razumikhin-type ISS Lyapunov function and small gain theorem for discrete time time-delay systems with application to a biased min-consensus protocol

Authors: Yuanqiu Mo, Wenwu Yu, Huazhou Hou, Soura Dasgupta

Abstract: This paper considers small gain theorems for the global asymptotic and exponential input-to-state stability for discrete time time-delay systems using Razumikhin-type Lyapunov function. Among other things, unlike the existing literature, it provides both necessary and sufficient conditions for exponential input-to-state stability in terms of the Razumikhin-type Lyapunov function and the small gain theorem. Previous necessary and sufficient conditions were with the more computationally onerous Krasovskii-type Lyapunov functions. The result finds application in the robust stability analysis of a graph-based distributed algorithm, namely, the biased min-consensus protocol, which can be used to compute the length of the shortest path from each node to its nearest source in a graph. We consider the biased min-consensus protocol under perturbations that are common in communication networks, including noise, delay and asynchronous communication. By converting such a perturbed protocol into a discrete time time-delay nonlinear system, we prove its exponential input-to-state stability under perturbations using our Razumikhin-type Lyapunov-based small gain theorem. Simulations are provided to verify the theoretical results.

Submitted 5 October, 2023; originally announced October 2023.

arXiv:2310.02000 [pdf, other]

doi 10.1007/978-3-031-16452-1_15

MUSCLE: Multi-task Self-supervised Continual Learning to Pre-train Deep Models for X-ray Images of Multiple Body Parts

Authors: Weibin Liao, Haoyi Xiong, Qingzhong Wang, Yan Mo, Xuhong Li, Yi Liu, Zeyu Chen, Siyu Huang, Dejing Dou

Abstract: While self-supervised learning (SSL) algorithms have been widely used to pre-train deep models, few efforts [11] have been made to improve representation learning of X-ray image analysis with SSL pre-trained models. In this work, we study a novel self-supervised pre-training pipeline, namely Multi-task Self-supervised Continual Learning (MUSCLE), for multiple medical imaging tasks, such as classification and segmentation, using X-ray images collected from multiple body parts, including heads, lungs, and bones. Specifically, MUSCLE aggregates X-rays collected from multiple body parts for MoCo-based representation learning, and adopts a well-designed continual learning (CL) procedure to further pre-train the backbone subject to various X-ray analysis tasks jointly. Certain strategies for image pre-processing, learning schedules, and regularization have been used to solve data heterogeneity, overfitting, and catastrophic forgetting problems for multi-task/dataset learning in MUSCLE. We evaluate MUSCLE using 9 real-world X-ray datasets with various tasks, including pneumonia classification, skeletal abnormality classification, lung segmentation, and tuberculosis (TB) detection. Comparisons against other pre-trained models [7] confirm the proof-of-concept that self-supervised multi-task/dataset continual pre-training could boost the performance of X-ray image analysis.

Submitted 3 October, 2023; originally announced October 2023.

Comments: accepted by Medical Image Computing and Computer Assisted Intervention (MICCAI) 2022

arXiv:2309.17194 [pdf, other]

Generalized Activation via Multivariate Projection

Authors: Jiayun Li, Yuxiao Cheng, Yiwen Lu, Zhuofan Xia, Yilin Mo, Gao Huang

Abstract: Activation functions are essential to introduce nonlinearity into neural networks, with the Rectified Linear Unit (ReLU) often favored for its simplicity and effectiveness. Motivated by the structural similarity between a shallow Feedforward Neural Network (FNN) and a single iteration of the Projected Gradient Descent (PGD) algorithm, a standard approach for solving constrained optimization problems, we consider ReLU as a projection from R onto the nonnegative half-line R+. Building on this interpretation, we extend ReLU by substituting it with a generalized projection operator onto a convex cone, such as the Second-Order Cone (SOC) projection, thereby naturally extending it to a Multivariate Projection Unit (MPU), an activation function with multiple inputs and multiple outputs. We further provide mathematical proof establishing that FNNs activated by SOC projections outperform those utilizing ReLU in terms of expressive power. Experimental evaluations on widely-adopted architectures further corroborate MPU's effectiveness against a broader range of existing activation functions.

Submitted 27 January, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

arXiv:2309.17136 [pdf, other]

Latent Dynamic Networked System Identification with High-Dimensional Networked Data

Authors: Jiaxin Yu, Yanfang Mo, S. Joe Qin

Abstract: Networked dynamic systems are ubiquitous in various domains, such as industrial processes, social networks, and biological systems. These systems produce high-dimensional data that reflect the complex interactions among the network nodes with rich sensor measurements. In this paper, we propose a novel algorithm for latent dynamic networked system identification that leverages the network structure and performs dimension reduction for each node via dynamic latent variables (DLVs). The algorithm assumes that the DLVs of each node have an auto-regressive model with exogenous input and interactions from other nodes. The DLVs of each node are extracted to capture the most predictable latent variables in the high dimensional data, while the residual factors are not predictable. The advantage of the proposed framework is demonstrated on an industrial process network for system identification and dynamic data analytics.

Submitted 29 September, 2023; originally announced September 2023.

arXiv:2309.15154 [pdf, other]

doi 10.1103/PhysRevB.109.L081401

Anomalous Linear and Quadratic Nodeless Surface Dirac Cones in Three-Dimensional Dirac Semimetals

Authors: Dongling Liu, Xiao-Jiao Wang, Yijie Mo, Zhongbo Yan

Abstract: Surface Dirac cones in three-dimensional topological insulators have generated tremendous and enduring interest for almost two decades owing to hosting a multitude of exotic properties. In this work, we unveil the existence of two types of anomalous surface Dirac cones in three-dimensional Dirac semimetals. These surface Dirac cones are located at the surfaces perpendicular to the rotation symmetry axis, and are found to display a number of features remarkably different from that in topological insulators. The most prominent one is the absence of singular Dirac node. In addition, the spin textures of these nodeless surface Dirac cones are found to exhibit a unique two-phase-angle dependence, leading to the presence of two different winding numbers in the orbital-resolved spin textures, which is rather different from the well-known spin-momentum locking in topological insulators. Despite the absence of Dirac node, we find that the two types of surface Dirac cones are also characterized by quantized $π$ Berry phases, even though one of them takes a quadratic dispersion. In the presence of time-reversal-symmetry-breaking fields, we find that the responses of the surface and bulk Dirac cones display an interesting bulk-surface correspondence. The uncovering of these nodeless surface Dirac cones broadens our understanding of the topological surface states and bulk-boundary correspondence in Dirac semimetals, and also lays down the basis for studying unconventional Dirac physics.

Submitted 17 February, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

Comments: 8+8 pages, 3 figures; published version

Journal ref: Phys. Rev. B 109, L081401 (2024)

arXiv:2309.12090 [pdf, other]

Multi-Task Cooperative Learning via Searching for Flat Minima

Authors: Fuping Wu, Le Zhang, Yang Sun, Yuanhan Mo, Thomas Nichols, Bartlomiej W. Papiez

Abstract: Multi-task learning (MTL) has shown great potential in medical image analysis, improving the generalizability of the learned features and the performance in individual tasks. However, most of the work on MTL focuses on either architecture design or gradient manipulation, while in both scenarios, features are learned in a competitive manner. In this work, we propose to formulate MTL as a multi/bi-level optimization problem, and therefore force features to learn from each task in a cooperative approach. Specifically, we update the sub-model for each task alternatively taking advantage of the learned sub-models of the other tasks. To alleviate the negative transfer problem during the optimization, we search for flat minima for the current objective function with regard to features from other tasks. To demonstrate the effectiveness of the proposed approach, we validate our method on three publicly available datasets. The proposed method shows the advantage of cooperative learning, and yields promising results when compared with the state-of-the-art MTL approaches. The code will be available online.

Submitted 21 September, 2023; originally announced September 2023.

Comments: This paper has been accepted by MedAGI workshop in MICCAI2023

arXiv:2309.07561 [pdf, other]

Adaptive Prompt Learning with Distilled Connective Knowledge for Implicit Discourse Relation Recognition

Authors: Bang Wang, Zhenglin Wang, Wei Xiang, Yijun Mo

Abstract: Implicit discourse relation recognition (IDRR) aims at recognizing the discourse relation between two text segments without an explicit connective. Recently, prompt learning has been applied to the IDRR task with great performance improvements over various neural network-based approaches. However, the discrete nature of the state-of-the-art prompting approach requires manual design of templates and answers, a big hurdle for its practical applications. In this paper, we propose a continuous version of prompt learning together with connective knowledge distillation, called AdaptPrompt, to reduce manual design efforts via continuous prompting while further improving performance via knowledge transfer. In particular, we design and train a few virtual tokens to form continuous templates and automatically select the most suitable one by gradient search in the embedding space. We also design an answer-relation mapping rule to generate a few virtual answers as the answer space. Furthermore, we notice the importance of annotated connectives in the training dataset and design a teacher-student architecture for knowledge transfer. Experiments on the up-to-date PDTB Corpus V3.0 validate our design objectives in terms of the better relation recognition performance over the state-of-the-art competitors.

Submitted 14 September, 2023; originally announced September 2023.

arXiv:2309.01161 [pdf, other]

Probabilistic Reduced-Dimensional Vector Autoregressive Modeling for Dynamics Prediction and Reconstruction with Oblique Projections

Authors: Yanfang Mo, Jiaxin Yu, S. Joe Qin

Abstract: In this paper, we propose a probabilistic reduced-dimensional vector autoregressive (PredVAR) model with oblique projections. This model partitions the measurement space into a dynamic subspace and a static subspace that do not need to be orthogonal. The partition allows us to apply an oblique projection to extract dynamic latent variables (DLVs) from high-dimensional data with maximized predictability. We develop an alternating iterative PredVAR algorithm that exploits the interaction between updating the latent VAR dynamics and estimating the oblique projection, using expectation maximization (EM) and a statistical constraint. In addition, the noise covariance matrices are estimated as a natural outcome of the EM method. A simulation case study of the nonlinear Lorenz oscillation system illustrates the advantages of the proposed approach over two alternatives.

Submitted 3 September, 2023; originally announced September 2023.

arXiv:2308.09073 [pdf, other]

mCL-NER: Cross-Lingual Named Entity Recognition via Multi-view Contrastive Learning

Authors: Ying Mo, Jian Yang, Jiahao Liu, Qifan Wang, Ruoyu Chen, Jingang Wang, Zhoujun Li

Abstract: Cross-lingual named entity recognition (CrossNER) faces challenges stemming from uneven performance due to the scarcity of multilingual corpora, especially for non-English data. While prior efforts mainly focus on data-driven transfer methods, a significant aspect that has not been fully explored is aligning both semantic and token-level representations across diverse languages. In this paper, we propose Multi-view Contrastive Learning for Cross-lingual Named Entity Recognition (mCL-NER). Specifically, we reframe the CrossNER task into a problem of recognizing relationships between pairs of tokens. This approach taps into the inherent contextual nuances of token-to-token connections within entities, allowing us to align representations across different languages. A multi-view contrastive learning framework is introduced to encompass semantic contrasts between source, codeswitched, and target sentences, as well as contrasts among token-to-token relations. By enforcing agreement within both semantic and relational spaces, we minimize the gap between source sentences and their counterparts of both codeswitched and target sentences. This alignment extends to the relationships between diverse tokens, enhancing the projection of entities across languages. We further augment CrossNER by combining self-training with labeled source data and unlabeled target data. Our experiments on the XTREME benchmark, spanning 40 languages, demonstrate the superiority of mCL-NER over prior data-driven and model-based approaches. It achieves a substantial increase of nearly +2.0 $F_1$ scores across a broad spectrum and establishes itself as the new state-of-the-art performer.

Submitted 21 February, 2024; v1 submitted 17 August, 2023; originally announced August 2023.

Comments: 9 pages, Accepted by AAAI 2024

arXiv:2308.05123 [pdf, other]

Towards Automatic Scoring of Spinal X-ray for Ankylosing Spondylitis

Authors: Yuanhan Mo, Yao Chen, Aimee Readie, Gregory Ligozio, Thibaud Coroller, Bartłomiej W. Papież

Abstract: Manually grading structural changes with the modified Stoke Ankylosing Spondylitis Spinal Score (mSASSS) on spinal X-ray imaging is costly and time-consuming due to bone shape complexity and image quality variations. In this study, we address this challenge by prototyping a 2-step auto-grading pipeline, called VertXGradeNet, to automatically predict mSASSS scores for the cervical and lumbar vertebral units (VUs) in X-ray spinal imaging. The VertXGradeNet utilizes VUs generated by our previously developed VU extraction pipeline (VertXNet) as input and predicts mSASSS based on those VUs. VertXGradeNet was evaluated on an in-house dataset of lateral cervical and lumbar X-ray images for axial spondylarthritis patients. Our results show that VertXGradeNet can predict the mSASSS score for each VU when the data is limited in quantity and imbalanced. Overall, it can achieve a balanced accuracy of 0.56 and 0.51 for 4 different mSASSS scores (i.e., a score of 0, 1, 2, 3) on two test datasets. The accuracy of the presented method shows the potential to streamline the spinal radiograph readings and therefore reduce the cost of future clinical trials.

Submitted 8 August, 2023; originally announced August 2023.

arXiv:2307.01851 [pdf, other]

Boundary Flat Bands with Topological Spin Textures Protected by Sub-chiral Symmetry

Authors: Yijie Mo, Xiao-Jiao Wang, Rui Yu, Zhongbo Yan

Abstract: Chiral symmetry plays an indispensable role in topological classifications as well as in the understanding of the origin of bulk or boundary flat bands. The conventional definition of chiral symmetry refers to the existence of a constant unitary matrix anticommuting with the Hamiltonian. As a constant unitary matrix has constant eigenvectors, boundary flat bands enforced by chiral symmetry, which share the same eigenvectors with the chiral symmetry operator, are known to carry fixed (pseudo)spin polarizations and be featureless in quantum geometry. In this work, we generalize the chiral symmetry and introduce a concept termed sub-chiral symmetry. Unlike the conventional chiral symmetry operator defined as constant, the sub-chiral symmetry operator depends on partial components of the momentum vector, as do its eigenvectors. We show that topological gapped or gapless systems without the chiral symmetry but with the sub-chiral symmetry can support boundary flat bands, which exhibit topological spin textures and quantized Berry phases. We expect that such intriguing boundary flat bands could give rise to a variety of exotic physics in the presence of interactions or disorders.

Submitted 4 July, 2023; originally announced July 2023.

Comments: 7+4 pages, 2 figures

Showing 1–50 of 497 results for author: Mo, Y