Disassembling Obfuscated Executables with LLM

Authors: Huanyao Rong, Yue Duan, Hang Zhang, XiaoFeng Wang, Hongbo Chen, Shengchen Duan, Shen Wang

Abstract: Disassembly is a challenging task, particularly for obfuscated executables containing junk bytes, which is designed to induce disassembly errors. Existing solutions rely on heuristics or leverage machine learning techniques, but only achieve limited successes. Fundamentally, such obfuscation cannot be defeated without in-depth understanding of the binary executable's semantics, which is made possible by the emergence of large language models (LLMs). In this paper, we present DisasLLM, a novel LLM-driven dissembler to overcome the challenge in analyzing obfuscated executables. DisasLLM consists of two components: an LLM-based classifier that determines whether an instruction in an assembly code snippet is correctly decoded, and a disassembly strategy that leverages this model to disassemble obfuscated executables end-to-end. We evaluated DisasLLM on a set of heavily obfuscated executables, which is shown to significantly outperform other state-of-the-art disassembly solutions.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2407.08353 [pdf]

One-dimensional flat bands in phosphorene nanoribbons with pentagonal nature

Authors: Shuo Sun, Jing-Yang You, Zhihao Cai, Jie Su, Tong Yang, Xinnan Peng, Yihe Wang, Daiyu Geng, Jian Gou, Yuli Huang, Sisheng Duan, Lan Chen, Kehui Wu, Andrew T. S. Wee, Yuan Ping Feng, Jia Lin Zhang, Jiong Lu, Baojie Feng, Wei Chen

Abstract: Materials with topological flat bands can serve as a promising platform to investigate strongly interacting phenomena. However, experimental realization of ideal flat bands is mostly limited to artificial lattices or moiré systems. Here we report a general way to construct one-dimensional (1D) flat bands in phosphorene nanoribbons (PNRs) with pentagonal nature: penta-hexa-PNRs and penta-dodeca-PNRs, wherein the corresponding flat bands are directly verified by using angle-resolved photoemission spectroscopy. We confirm that the observed 1D flat bands originate from the electronic 1D sawtooth and Lieb lattices, respectively, as revealed by the combination of bond-resolved scanning tunneling microscopy, scanning tunneling spectroscopy, tight-binding models, and first-principles calculations. Our study demonstrates a general way to construct 1D flat bands in 1D solid materials system, which provides a robust platform to explore strongly interacting phases of matter.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 13 pages, 4 figures

arXiv:2407.01183 [pdf, other]

TCSR-SQL: Towards Table Content-aware Text-to-SQL with Self-retrieval

Authors: Wenbo Xu, Liang Yan, Peiyi Han, Haifeng Zhu, Chuanyi Liu, Shaoming Duan, Cuiyun Gao, Yingwei Liang

Abstract: Large Language Model-based (LLM-based) Text-to-SQL methods have achieved important progress in generating SQL queries for real-world applications. When confronted with table content-aware questions in real-world scenarios, ambiguous data content keywords and non-existent database schema column names within the question leads to the poor performance of existing methods. To solve this problem, we propose a novel approach towards Table Content-aware Text-to-SQL with Self-Retrieval (TCSR-SQL). It leverages LLM's in-context learning capability to extract data content keywords within the question and infer possible related database schema, which is used to generate Seed SQL to fuzz search databases. The search results are further used to confirm the encoding knowledge with the designed encoding knowledge table, including column names and exact stored content values used in the SQL. The encoding knowledge is sent to obtain the final Precise SQL following multi-rounds of generation-execution-revision process. To validate our approach, we introduce a table-content-aware, question-related benchmark dataset, containing 1,692 question-SQL pairs. Comprehensive experiments conducted on this benchmark demonstrate the remarkable performance of TCSR-SQL, achieving an improvement of at least 13.7% in execution accuracy compared to other state-of-the-art methods.

Submitted 12 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

arXiv:2406.14549 [pdf, other]

Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Large Language Models

Authors: Sunny Duan, Mikail Khona, Abhiram Iyer, Rylan Schaeffer, Ila R Fiete

Abstract: The proliferation of large language models has revolutionized natural language processing tasks, yet it raises profound concerns regarding data privacy and security. Language models are trained on extensive corpora including potentially sensitive or proprietary information, and the risk of data leakage -- where the model response reveals pieces of such information -- remains inadequately understood. This study examines susceptibility to data leakage by quantifying the phenomenon of memorization in machine learning models, focusing on the evolution of memorization patterns over training. We investigate how the statistical characteristics of training data influence the memories encoded within the model by evaluating how repetition influences memorization. We reproduce findings that the probability of memorizing a sequence scales logarithmically with the number of times it is present in the data. Furthermore, we find that sequences which are not apparently memorized after the first encounter can be uncovered throughout the course of training even without subsequent encounters. The presence of these latent memorized sequences presents a challenge for data privacy since they may be hidden at the final checkpoint of the model. To this end, we develop a diagnostic test for uncovering these latent memorized sequences by considering their cross entropy loss.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.12793 [pdf, other]

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

Authors: Team GLM: Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, et al. (32 additional authors not shown)

Abstract: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained on ten trillions of tokens mostly in Chinese and English, along with a small set of corpus from 24 languages, and aligned primarily for Chinese and English usage. The high-quality alignment is achieved via a multi-stage post-training process, which involves supervised fine-tuning and learning from human feedback. Evaluations show that GLM-4 1) closely rivals or outperforms GPT-4 in terms of general metrics such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval, 2) gets close to GPT-4-Turbo in instruction following as measured by IFEval, 3) matches GPT-4 Turbo (128K) and Claude 3 for long context tasks, and 4) outperforms GPT-4 in Chinese alignments as measured by AlignBench. The GLM-4 All Tools model is further aligned to understand user intent and autonomously decide when and which tool(s) to use -- including web browser, Python interpreter, text-to-image model, and user-defined functions -- to effectively complete complex tasks. In practical applications, it matches and even surpasses GPT-4 All Tools in tasks like accessing online information via web browsing and solving math problems using Python interpreter.
Over the course, we have open-sourced a series of models, including ChatGLM-6B (three generations), GLM-4-9B (128K, 1M), GLM-4V-9B, WebGLM, and CodeGeeX, attracting over 10 million downloads on Hugging face in the year 2023 alone. The open models can be accessed through https://github.com/THUDM and https://huggingface.co/THUDM.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.07436 [pdf, other]

McEval: Massively Multilingual Code Evaluation

Authors: Linzheng Chai, Shukai Liu, Jian Yang, Yuwei Yin, Ke Jin, Jiaheng Liu, Tao Sun, Ge Zhang, Changyu Ren, Hongcheng Guo, Zekun Wang, Boyang Wang, Xianjie Wu, Bing Wang, Tongliang Li, Liqun Yang, Sufeng Duan, Zhoujun Li

Abstract: Code large language models (LLMs) have shown remarkable advances in code understanding, completion, and generation tasks. Programming benchmarks, comprised of a selection of code challenges and corresponding test cases, serve as a standard to evaluate the capability of different LLMs in such tasks. However, most existing benchmarks primarily focus on Python and are still restricted to a limited number of languages, where other languages are translated from the Python samples (e.g. MultiPL-E) degrading the data diversity. To further facilitate the research of code LLMs, we propose a massively multilingual code benchmark covering 40 programming languages (McEval) with 16K test samples, which substantially pushes the limits of code LLMs in multilingual scenarios. The benchmark contains challenging code completion, understanding, and generation evaluation tasks with finely curated massively multilingual instruction corpora McEval-Instruct. In addition, we introduce an effective multilingual coder mCoder trained on McEval-Instruct to support multilingual programming language generation. Extensive experimental results on McEval show that there is still a difficult journey between open-source models and closed-source LLMs (e.g. GPT-series models) in numerous languages. The instruction corpora, evaluation benchmark, and leaderboard are available at https://mceval.github.io/.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 22 pages

arXiv:2406.07032 [pdf, other]

RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks

Authors: Zhechao Wang, Peirui Cheng, Pengju Tian, Yuchao Wang, Mingxin Chen, Shujing Duan, Zhirui Wang, Xinming Li, Xian Sun

Abstract: Remote sensing lightweight foundation models have achieved notable success in online perception within remote sensing. However, their capabilities are restricted to performing online inference solely based on their own observations and models, thus lacking a comprehensive understanding of large-scale remote sensing scenarios. To overcome this limitation, we propose a Remote Sensing Distributed Foundation Model (RS-DFM) based on generalized information mapping and interaction. This model can realize online collaborative perception across multiple platforms and various downstream tasks by mapping observations into a unified space and implementing a task-agnostic information interaction strategy. Specifically, we leverage the ground-based geometric prior of remote sensing oblique observations to transform the feature mapping from absolute depth estimation to relative depth estimation, thereby enhancing the model's ability to extract generalized features across diverse heights and perspectives. Additionally, we present a dual-branch information compression module to decouple high-frequency and low-frequency feature information, achieving feature-level compression while preserving essential task-agnostic details. In support of our research, we create a multi-task simulation dataset named AirCo-MultiTasks for multi-UAV collaborative observation. We also conduct extensive experiments, including 3D object detection, instance segmentation, and trajectory prediction.
The numerous results demonstrate that our RS-DFM achieves state-of-the-art performance across various downstream tasks.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.06305 [pdf, other]

NeuroMoCo: A Neuromorphic Momentum Contrast Learning Method for Spiking Neural Networks

Authors: Yuqi Ma, Huamin Wang, Hangchi Shen, Xuemei Chen, Shukai Duan, Shiping Wen

Abstract: Recently, brain-inspired spiking neural networks (SNNs) have attracted great research attention owing to their inherent bio-interpretability, event-triggered properties and powerful perception of spatiotemporal information, which is beneficial to handling event-based neuromorphic datasets. In contrast to conventional static image datasets, event-based neuromorphic datasets present heightened complexity in feature extraction due to their distinctive time series and sparsity characteristics, which influences their classification accuracy. To overcome this challenge, a novel approach termed Neuromorphic Momentum Contrast Learning (NeuroMoCo) for SNNs is introduced in this paper by extending the benefits of self-supervised pre-training to SNNs to effectively stimulate their potential. This is the first time that self-supervised learning (SSL) based on momentum contrastive learning is realized in SNNs. In addition, we devise a novel loss function named MixInfoNCE tailored to their temporal characteristics to further increase the classification accuracy of neuromorphic datasets, which is verified through rigorous ablation experiments. Finally, experiments on DVS-CIFAR10, DVS128Gesture and N-Caltech101 have shown that NeuroMoCo of this paper establishes new state-of-the-art (SOTA) benchmarks: 83.6% (Spikformer-2-256), 98.62% (Spikformer-2-256), and 84.4% (SEW-ResNet-18), respectively.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 32 pages, 4 figures, 4 tables

arXiv:2406.04422 [pdf, ps, other]

Collapsing-ring blowup solutions for the nonlinear heat equation

Authors: Senhao Duan, Nejla Nouaili, Hatem Zaag

Abstract: In this paper, we construct a singular standing ring solution of the nonlinear heat in the radial case. We give rigorous proof for the existence of a ring blow-up solution in finite time. This result was predicted formally by Baruch, Fibich and Gavish \cite{BFGpd10}. We also prove the stability of these dynamics among radially symmetric solutions.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 25 pages

MSC Class: 35B40; 35B44

arXiv:2406.02629 [pdf, other]

SSNet: A Lightweight Multi-Party Computation Scheme for Practical Privacy-Preserving Machine Learning Service in the Cloud

Authors: Shijin Duan, Chenghong Wang, Hongwu Peng, Yukui Luo, Wujie Wen, Caiwen Ding, Xiaolin Xu

Abstract: As privacy-preserving becomes a pivotal aspect of deep learning (DL) development, multi-party computation (MPC) has gained prominence for its efficiency and strong security. However, the practice of current MPC frameworks is limited, especially when dealing with large neural networks, exemplified by the prolonged execution time of 25.8 seconds for secure inference on ResNet-152. The primary challenge lies in the reliance of current MPC approaches on additive secret sharing, which incurs significant communication overhead with non-linear operations such as comparisons. Furthermore, additive sharing suffers from poor scalability on party size. In contrast, the evolving landscape of MPC necessitates accommodating a larger number of compute parties and ensuring robust performance against malicious activities or computational failures. In light of these challenges, we propose SSNet, which for the first time, employs Shamir's secret sharing (SSS) as the backbone of MPC-based ML framework. We meticulously develop all framework primitives and operations for secure DL models tailored to seamlessly integrate with the SSS scheme. SSNet demonstrates the ability to scale up party numbers straightforwardly and embeds strategies to authenticate the computation correctness without incurring significant performance overhead. Additionally, SSNet introduces masking strategies designed to reduce communication overhead associated with non-linear operations.
We conduct comprehensive experimental evaluations on commercial cloud computing infrastructure from Amazon AWS, as well as across diverse prevalent DNN models and datasets. SSNet demonstrates a substantial performance boost, achieving speed-ups ranging from 3x to 14x compared to SOTA MPC frameworks. Moreover, SSNet also represents the first framework that is evaluated on a five-party computation setup, in the context of secure DL inference.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 16 pages, 9 figures

arXiv:2405.14185 [pdf, other]

A structure-aware framework for learning device placements on computation graphs

Authors: Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Peiyu Zhang, Panagiotis Kyriakis, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Shahin Nazarian, Theodore L. Willke, Paul Bogdan

Abstract: Existing approaches for device placement ignore the topological features of computation graphs and rely mostly on heuristic methods for graph partitioning. At the same time, they either follow a grouper-placer or an encoder-placer architecture, which requires understanding the interaction structure between code operations. To bridge the gap between encoder-placer and grouper-placer techniques, we propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit using reinforcement learning. The framework consists of five steps, including graph coarsening, node representation learning and policy optimization. It facilitates end-to-end training and takes into consideration the directed and acyclic nature of the computation graphs. We also propose a model variant, inspired by graph parsing networks and complex network analysis, enabling graph representation learning and personalized graph partitioning jointly, using an unspecified number of groups. To train the entire framework, we utilize reinforcement learning techniques by employing the execution time of the suggested device placements to formulate the reward. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models, namely Inception-V3, ResNet, and BERT. The robustness of the proposed framework is also highlighted through an ablation study.
The suggested placements improve the inference speed for the benchmark models by up to $58.2\%$ over CPU execution and by up to $60.24\%$ compared to other commonly used baselines.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2405.05542 [pdf, other]

Dynamic Deep Factor Graph for Multi-Agent Reinforcement Learning

Authors: Yuchen Shi, Shihong Duan, Cheng Xu, Ran Wang, Fangwen Ye, Chau Yuen

Abstract: This work introduces a novel value decomposition algorithm, termed Dynamic Deep Factor Graphs (DDFG). Unlike traditional coordination graphs, DDFG leverages factor graphs to articulate the decomposition of value functions, offering enhanced flexibility and adaptability to complex value function structures. Central to DDFG is a graph structure generation policy that innovatively generates factor graph structures on-the-fly, effectively addressing the dynamic collaboration requirements among agents. DDFG strikes an optimal balance between the computational overhead associated with aggregating value functions and the performance degradation inherent in their complete decomposition. Through the application of the max-sum algorithm, DDFG efficiently identifies optimal policies. We empirically validate DDFG's efficacy in complex scenarios, including higher-order predator-prey tasks and the StarCraft II Multi-agent Challenge (SMAC), thus underscoring its capability to surmount the limitations faced by existing value decomposition algorithms. DDFG emerges as a robust solution for MARL challenges that demand nuanced understanding and facilitation of dynamic agent collaboration. The implementation of DDFG is made publicly accessible, with the source code available at https://github.com/SICC-Group/DDFG.

Submitted 7 June, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: submitted to IEEE TPAMI

arXiv:2404.04265 [pdf, other]

Accelerating Matrix Factorization by Dynamic Pruning for Fast Recommendation

Authors: Yining Wu, Shengyu Duan, Gaole Sai, Chenhong Cao, Guobing Zou

Abstract: Matrix factorization (MF) is a widely used collaborative filtering (CF) algorithm for recommendation systems (RSs), due to its high prediction accuracy, great flexibility and high efficiency in big data processing. However, with the dramatically increased number of users/items in current RSs, the computational complexity for training a MF model largely increases. Many existing works have accelerated MF, by either putting in additional computational resources or utilizing parallel systems, introducing a large cost. In this paper, we propose algorithmic methods to accelerate MF, without inducing any additional computational resources. In specific, we observe fine-grained structured sparsity in the decomposed feature matrices when considering a certain threshold. The fine-grained structured sparsity causes a large amount of unnecessary operations during both matrix multiplication and latent factor update, increasing the computational time of the MF training process. Based on the observation, we firstly propose to rearrange the feature matrices based on joint sparsity, which potentially makes a latent vector with a smaller index more dense than that with a larger index. The feature matrix rearrangement is given to limit the error caused by the later performed pruning process. We then propose to prune the insignificant latent factors by an early stopping process during both matrix multiplication and latent factor update.
The pruning process is dynamically performed according to the sparsity of the latent factors for different users/items, to accelerate the process. The experiments show that our method can achieve 1.2-1.65 speedups, with up to 20.08% error increase, compared with the conventional MF training process. We also prove the proposed methods are applicable considering different hyperparameters including optimizer, optimization strategy and initialization method.

Submitted 18 March, 2024; originally announced April 2024.

arXiv:2403.16228 [pdf, other]

Rank-Dependent Predictable Forward Performance Processes

Authors: Bahman Angoshtari, Shida Duan

Abstract: Predictable forward performance processes (PFPPs) are stochastic optimal control frameworks for an agent who controls a randomly evolving system but can only prescribe the system dynamics for a short period ahead. This is a common scenario in which a controlling agent frequently re-calibrates her model. We introduce a new class of PFPPs based on rank-dependent utility, generalizing existing models that are based on expected utility theory (EUT). We establish existence of rank-dependent PFPPs under a conditionally complete market and exogenous probability distortion functions which are updated periodically. We show that their construction reduces to solving an integral equation that generalizes the integral equation obtained under EUT in previous studies. We then propose a new approach for solving the integral equation via theory of Volterra equations. We illustrate our result in the special case of conditionally complete Black-Scholes model.

Submitted 24 March, 2024; originally announced March 2024.

Comments: 43 pages, 3 figures

MSC Class: 91G10; 91G80; 60H30

arXiv:2403.13844 [pdf, other]

Scheduled Knowledge Acquisition on Lightweight Vector Symbolic Architectures for Brain-Computer Interfaces

Authors: Yejia Liu, Shijin Duan, Xiaolin Xu, Shaolei Ren

Abstract: Brain-Computer interfaces (BCIs) are typically designed to be lightweight and responsive in real-time to provide users timely feedback. Classical feature engineering is computationally efficient but has low accuracy, whereas the recent neural networks (DNNs) improve accuracy but are computationally expensive and incur high latency. As a promising alternative, the low-dimensional computing (LDC) classifier based on vector symbolic architecture (VSA), achieves small model size yet higher accuracy than classical feature engineering methods. However, its accuracy still lags behind that of modern DNNs, making it challenging to process complex brain signals. To improve the accuracy of a small model, knowledge distillation is a popular method. However, maintaining a constant level of distillation between the teacher and student models may not be the best way for a growing student during its progressive learning stages. In this work, we propose a simple scheduled knowledge distillation method based on curriculum data order to enable the student to gradually build knowledge from the teacher model, controlled by an $α$ scheduler. Meanwhile, we employ the LDC/VSA as the student model to enhance the on-device inference efficiency for tiny BCI devices that demand low latency. The empirical results have demonstrated that our approach achieves better tradeoff between accuracy and hardware efficiency compared to other methods.

Submitted 17 March, 2024; originally announced March 2024.

Comments: Accepted as a full paper by the tinyML Research Symposium 2024

arXiv:2403.11518 [pdf, other]

doi: 10.1088/1674-1056/ad0d9d

Optical manipulation of the topological phase in ZrTe5 revealed by time- and angle-resolved photoemission

Authors: Chaozhi Huang, Chengyang Xu, Fengfeng Zhu, Shaofeng Duan, Jianzhe Liu, Lingxiao Gu, Shichong Wang, Haoran Liu, Dong Qian, Weidong Luo, Wentao Zhang

Abstract: High-resolution time- and angle-resolved photoemission measurements were conducted on the topological insulator ZrTe5. With strong femtosecond photoexcitation, a possible ultrafast phase transition from a weak to a strong topological insulating phase was experimentally realized by recovering the energy gap inversion in a time scale that was shorter than 0.15 ps. This photoinduced transient strong topological phase can last longer than 2 ps at the highest excitation fluence studied, and it cannot be attributed to the photoinduced heating of electrons or modification of the conduction band filling. Additionally, the measured unoccupied electronic states are consistent with the first-principles calculation based on experimental crystal lattice constants, which favor a strong topological insulating phase. These findings provide new insights into the longstanding controversy about the strong and weak topological properties in ZrTe5, and they suggest that many-body effects including electron-electron interactions must be taken into account to understand the equilibrium weak topological insulating phase in ZrTe5.

Submitted 18 March, 2024; originally announced March 2024.

Journal ref: Chinese Physics B 33, 017901 (2024)

arXiv:2403.06682 [pdf, other]

Restoring Ancient Ideograph: A Multimodal Multitask Neural Network Approach

Authors: Siyu Duan, Jun Wang, Qi Su

Abstract: Cultural heritage serves as the enduring record of human thought and history. Despite significant efforts dedicated to the preservation of cultural relics, many ancient artefacts have been ravaged irreversibly by natural deterioration and human actions. Deep learning technology has emerged as a valuable tool for restoring various kinds of cultural heritages, including ancient text restoration. Previous research has approached ancient text restoration from either visual or textual perspectives, often overlooking the potential of synergizing multimodal information. This paper proposes a novel Multimodal Multitask Restoring Model (MMRM) to restore ancient texts, particularly emphasising the ideograph. This model combines context understanding with residual visual information from damaged ancient artefacts, enabling it to predict damaged characters and generate restored images simultaneously. We tested the MMRM model through experiments conducted on both simulated datasets and authentic ancient inscriptions. The results show that the proposed method gives insightful restoration suggestions in both simulation experiments and real-world scenarios. To the best of our knowledge, this work represents the pioneering application of multimodal deep learning in ancient text restoration, which will contribute to the understanding of ancient society and culture in digital humanities fields.

Submitted 11 March, 2024; originally announced March 2024.

Comments: Accepted by LREC-COLING 2024

arXiv:2403.04204 [pdf, other]

On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models

Authors: Xinpeng Wang, Shitong Duan, Xiaoyuan Yi, Jing Yao, Shanlin Zhou, Zhihua Wei, Peng Zhang, Dongkuan Xu, Maosong Sun, Xing Xie

Abstract: Big models have achieved revolutionary breakthroughs in the field of AI, but they might also pose potential concerns. Addressing such concerns, alignment technologies were introduced to make these models conform to human preferences and values. Despite considerable advancements in the past year, various challenges lie in establishing the optimal alignment strategy, such as data cost and scalable oversight, and how to align remains an open question. In this survey paper, we comprehensively investigate value alignment approaches. We first unpack the historical context of alignment tracing back to the 1920s (where it comes from), then delve into the mathematical essence of alignment (what it is), shedding light on the inherent challenges. Following this foundation, we provide a detailed examination of existing alignment methods, which fall into three categories: Reinforcement Learning, Supervised Fine-Tuning, and In-context Learning, and demonstrate their intrinsic connections, strengths, and limitations, helping readers better understand this research area. In addition, two emerging topics, personal alignment, and multimodal alignment, are also discussed as novel frontiers in this field. Looking forward, we discuss potential alignment paradigms and how they could handle remaining challenges, prospecting where future alignment will go.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 23 pages, 7 figures

arXiv:2403.03609 [pdf, ps, other]

Powers of edge ideals of edge-weighted trees

Authors: Jiaxin Li, Guangjun Zhu, Shiya Duan

Abstract: This paper gives exact formulas for the regularity of edge ideals of edge-weighted integrally closed trees. In addition, we provide some linear upper bounds on the regularity of powers of such ideals.

Submitted 6 March, 2024; originally announced March 2024.

MSC Class: Primary 13A15; 13D02; Secondary 05E40

arXiv:2403.03419 [pdf, other]

Negating Negatives: Alignment without Human Positive Samples via Distributional Dispreference Optimization

Authors: Shitong Duan, Xiaoyuan Yi, Peng Zhang, Tun Lu, Xing Xie, Ning Gu

Abstract: Large language models (LLMs) have revolutionized the role of AI, yet also pose potential risks of propagating unethical content. Alignment technologies have been introduced to steer LLMs towards human preference, gaining increasing attention. Despite notable breakthroughs in this direction, existing methods heavily rely on high-quality positive-negative training pairs, suffering from noisy labels and the marginal distinction between preferred and dispreferred response data. Given recent LLMs' proficiency in generating helpful responses, this work pivots towards a new research focus: achieving alignment using solely human-annotated negative samples, preserving helpfulness while reducing harmfulness. For this purpose, we propose Distributional Dispreference Optimization (D$^2$O), which maximizes the discrepancy between the generated responses and the dispreferred ones to effectively eschew harmful information. We theoretically demonstrate that D$^2$O is equivalent to learning a distributional instead of instance-level preference model reflecting human dispreference against the distribution of negative responses. Besides, D$^2$O integrates an implicit Jeffrey Divergence regularization to balance the exploitation and exploration of reference policies and converges to a non-negative one during training. Extensive experiments demonstrate that our method achieves comparable generation quality and surpasses the latest baselines in producing less harmful and more informative responses with better training stability and faster convergence.

Submitted 5 March, 2024; originally announced March 2024.

arXiv:2402.09725 [pdf, other]

Improving Non-autoregressive Machine Translation with Error Exposure and Consistency Regularization

Authors: Xinran Chen, Sufeng Duan, Gongshen Liu

Abstract: Being one of the IR-NAT (Iterative-refinement-based NAT) frameworks, the Conditional Masked Language Model (CMLM) adopts the mask-predict paradigm to re-predict the masked low-confidence tokens. However, CMLM suffers from the data distribution discrepancy between training and inference, where the observed tokens are generated differently in the two cases. In this paper, we address this problem with the training approaches of error exposure and consistency regularization (EECR). We construct the mixed sequences based on model prediction during training, and propose to optimize over the masked tokens under imperfect observation conditions. We also design a consistency learning method to constrain the data distribution for the masked tokens under different observing situations to narrow down the gap between training and inference. The experiments on five translation benchmarks obtain an average improvement of 0.68 and 0.40 BLEU scores compared to the base models, respectively, and our CMLMC-EECR achieves the best performance with a translation quality comparable to the Transformer. The experimental results demonstrate the effectiveness of our method.

Submitted 15 February, 2024; originally announced February 2024.

arXiv:2401.02111 [pdf, ps, other]

Edge ideals of some edge-weighted graphs

Authors: Guangjun Zhu, Shiya Duan, Yijun Cui, Jiaxin Li

Abstract: This paper presents exact formulas for the regularity and depth of powers of edge ideals of an edge-weighted star graph. Additionally, we provide exact formulas for the regularity of powers of the edge ideal of an edge-weighted integrally closed path, as well as lower bounds on the depth of powers of such an edge ideal.

Submitted 4 January, 2024; originally announced January 2024.

MSC Class: Primary 13F20; 13C15; 05C22; Secondary 05E40

arXiv:2401.00225 [pdf]

Enhancing dysarthria speech feature representation with empirical mode decomposition and Walsh-Hadamard transform

Authors: Ting Zhu, Shufei Duan, Camille Dingam, Huizhi Liang, Wei Zhang

Abstract: Dysarthria speech contains the pathological characteristics of vocal tract and vocal fold, but so far, they have not yet been included in traditional acoustic feature sets. Moreover, the nonlinearity and non-stationarity of speech have been ignored. In this paper, we propose a feature enhancement algorithm for dysarthria speech called WHFEMD. It combines empirical mode decomposition (EMD) and fast Walsh-Hadamard transform (FWHT) to enhance features. With the proposed algorithm, the fast Fourier transform of the dysarthria speech is first performed and then followed by EMD to get intrinsic mode functions (IMFs). After that, FWHT is used to output new coefficients and to extract statistical features based on IMFs, power spectral density, and enhanced gammatone frequency cepstral coefficients. To evaluate the proposed approach, we conducted experiments on two public pathological speech databases including UA Speech and TORGO. The results show that our algorithm performed better than traditional features in classification. We achieved improvements of 13.8% (UA Speech) and 3.84% (TORGO), respectively. Furthermore, the incorporation of an imbalanced classification algorithm to address data imbalance has resulted in a 12.18% increase in recognition accuracy. This algorithm effectively addresses the challenges of the imbalanced dataset and non-linearity in dysarthric speech and simultaneously provides a robust representation of the local pathological features of the vocal folds and tracts.

Submitted 30 December, 2023; originally announced January 2024.

arXiv:2312.08998 [pdf]

Design, construction and evaluation of emotional multimodal pathological speech database

Authors: Ting Zhu, Shufei Duan, Huizhi Liang, Wei Zhang

Abstract: The lack of an available emotion pathology database is one of the key obstacles in studying the emotion expression status of patients with dysarthria. The first Chinese multimodal emotional pathological speech database containing multi-perspective information is constructed in this paper. It includes 29 controls and 39 patients with different degrees of motor dysarthria, expressing happy, sad, angry and neutral emotions. All emotional speech was labeled for intelligibility, types and discrete dimensional emotions using a purpose-developed WeChat mini-program. The subjective analysis evaluates the database in terms of emotion discrimination accuracy, speech intelligibility, valence-arousal spatial distribution, and correlation between SCL-90 and disease severity. Automatic recognition was tested on speech and glottal data, with an average accuracy of 78% for controls and 60% for patients in audio, and 51% for controls and 38% for patients in glottal data, indicating an influence of the disease on emotional expression.

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2312.05657 [pdf, other]

Leveraging Reinforcement Learning and Large Language Models for Code Optimization

Authors: Shukai Duan, Nikos Kanakaris, Xiongye Xiao, Heng Ping, Chenyu Zhou, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Theodore L. Willke, Shahin Nazarian, Paul Bogdan

Abstract: Code optimization is a daunting task that requires a significant level of expertise from experienced programmers. This level of expertise is not sufficient when compared to the rapid development of new hardware architectures. Towards advancing the whole code optimization process, recent approaches rely on machine learning and artificial intelligence techniques. This paper introduces a new framework to decrease the complexity of code optimization. The proposed framework builds on large language models (LLMs) and reinforcement learning (RL) and enables LLMs to receive feedback from their environment (i.e., unit tests) during the fine-tuning process. We compare our framework with existing state-of-the-art models and show that it is more efficient with respect to speed and computational usage, as a result of the decrement in training steps and its applicability to models with fewer parameters. Additionally, our framework reduces the possibility of logical and syntactical errors. Toward evaluating our approach, we run several experiments on the PIE dataset using a CodeT5 language model and RRHF, a new reinforcement learning algorithm. We adopt a variety of evaluation metrics with regard to optimization quality and speedup. The evaluation results demonstrate that the proposed framework has similar results in comparison with existing models using shorter training times and smaller pre-trained models. In particular, we accomplish an increase of 5.6% and 2.2 over the baseline models concerning the %OPT and SP metrics.

Submitted 9 December, 2023; originally announced December 2023.

arXiv:2312.00856 [pdf, other]

QAFE-Net: Quality Assessment of Facial Expressions with Landmark Heatmaps

Authors: Shuchao Duan, Amirhossein Dadashzadeh, Alan Whone, Majid Mirmehdi

Abstract: Facial expression recognition (FER) methods have made great inroads in categorising moods and feelings in humans. Beyond FER, pain estimation methods assess levels of intensity in pain expressions; however, assessing the quality of all facial expressions is of critical value in health-related applications. In this work, we address the quality of five different facial expressions in patients affected by Parkinson's disease. We propose a novel landmark-guided approach, QAFE-Net, that combines temporal landmark heatmaps with RGB data to capture small facial muscle movements that are encoded and mapped to severity scores. The proposed approach is evaluated on a new Parkinson's Disease Facial Expression dataset (PFED5), as well as on the pain estimation benchmark, the UNBC-McMaster Shoulder Pain Expression Archive Database. Our comparative experiments demonstrate that the proposed method outperforms SOTA action quality assessment works on PFED5 and achieves lower mean absolute error than the SOTA pain estimation methods on UNBC-McMaster. Our code and the new PFED5 dataset are available at https://github.com/shuchaoduan/QAFE-Net.

Submitted 12 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

Comments: Accepted to ELFA workshop at WACV 2024

arXiv:2311.17819 [pdf, ps, other]

Weak Solar Radio Bursts from the Solar Wind Acceleration Region Observed by Parker Solar Probe and Its Probable Emission Mechanism

Authors: Ling Chen, Bing Ma, Dejin Wu, Xiaowei Zhou, Marc Pulupa, PeiJin Zhang, Pietro Zucca, Stuart D. Bale, Justin C. Kasper, SuPing Duan

Abstract: The Parker Solar Probe (PSP) provides us with unprecedentedly close-approach observations of the Sun, and hence the possibility of directly understanding the "elementary process" which occurs in the kinetic scale of collective particle interaction in solar coronal plasmas. We reported a kind of weak solar radio bursts (SRBs), which are detected by PSP when it passed a low-density magnetic channel during its second encounter phase. These weak SRBs have a low starting frequency $\sim 20$ MHz and a narrow frequency range from a few tens of MHz to a few hundred kHz. Their dynamic spectra display a strongly evolving feature of the intermediate relative drift rate decreasing rapidly from above 0.01/s to below 0.01/s. Analyses based on common empirical models of solar coronal plasmas indicate that these weak SRBs originate from the heliocentric distance $\sim 1.1-6.1~R_S$ (the solar radius), a typical solar wind acceleration region with a low-$β$ plasma, and indicate that their sources have a typical motion velocity $\sim v_A$ (Alfvén velocity) obviously lower than that of fast electrons required by effectively exciting SRBs. We propose that solitary kinetic Alfvén waves with kinetic scales can be responsible for the generation of these small-scale weak SRBs, called solitary wave radiation (SWR).

Submitted 29 November, 2023; originally announced November 2023.

arXiv:2311.15179 [pdf, other]

Estimation of the User Contribution Rate by Leveraging Time Sequence in Pairwise Matching function-point between Users Feedback and App Updating Log

Authors: Shiqi Duan, Jianxun Liu, Yong Xiao, Xiangping Zhang

Abstract: Mobile applications have become an inseparable part of people's daily life. Nonetheless, the market competition is extremely fierce, and apps lacking recognition among most users are susceptible to market elimination. To this end, developers must swiftly and accurately apprehend the requirements of the wider user base to effectively strategize and promote their apps' orderly and healthy evolution. The rate at which general user requirements are adopted by developers, or user contribution, is a very valuable metric that can be an important tool for app developers or software engineering researchers to measure or gain insight into the evolution of app requirements and predict the evolution of app software. Regrettably, the landscape lacks refined quantitative analysis approaches and tools for this pivotal indicator. To address this problem, this paper exploratively proposes a quantitative analysis approach based on the temporal correlation perception that exists in the app update log and user reviews, which provides a feasible solution for quantitatively obtaining the user contribution. The main idea of this scheme is to consider valid user reviews as user requirements and app update logs as developer responses, and to mine and analyze the pairwise and chronological relationships existing between the two by text computing, thus constructing a feasible approach for quantitatively calculating user contribution. To demonstrate the feasibility of the approach, this paper collects data from four Chinese apps in the App Store in mainland China and one English app in the U.S. region, including 2,178 update logs and 4,236,417 user reviews. The experimental results show that 16.6%-43.2% of the features of these apps were driven by popular online user requirements.

Submitted 25 November, 2023; originally announced November 2023.

arXiv:2311.09489 [pdf, other]

MirrorNet: A TEE-Friendly Framework for Secure On-device DNN Inference

Authors: Ziyu Liu, Yukui Luo, Shijin Duan, Tong Zhou, Xiaolin Xu

Abstract: Deep neural network (DNN) models have become prevalent in edge devices for real-time inference. However, they are vulnerable to model extraction attacks and require protection. Existing defense approaches either fail to fully safeguard model confidentiality or result in significant latency issues. To overcome these challenges, this paper presents MirrorNet, which leverages Trusted Execution Environment (TEE) to enable secure on-device DNN inference. It generates a TEE-friendly implementation for any given DNN model to protect the model confidentiality, while meeting the stringent computation and storage constraints of TEE. The framework consists of two key components: the backbone model (BackboneNet), which is stored in the normal world but achieves lower inference accuracy, and the Companion Partial Monitor (CPM), a lightweight mirrored branch stored in the secure world, preserving model confidentiality. During inference, the CPM monitors the intermediate results from the BackboneNet and rectifies the classification output to achieve higher accuracy. To enhance flexibility, MirrorNet incorporates two modules: the CPM Strategy Generator, which generates various protection strategies, and the Performance Emulator, which estimates the performance of each strategy and selects the optimal one. Extensive experiments demonstrate the effectiveness of MirrorNet in providing security guarantees while maintaining low computation latency, making MirrorNet a practical and promising solution for secure on-device DNN inference. In the evaluation, MirrorNet can achieve an 18.6% accuracy gap between authenticated and illegal use, while only introducing 0.99% hardware overhead.

Submitted 15 November, 2023; originally announced November 2023.

Comments: Accepted by ICCAD 2023

arXiv:2311.07619 [pdf, other]

Modeling User Viewing Flow Using Large Language Models for Article Recommendation

Authors: Zhenghao Liu, Zulong Chen, Moufeng Zhang, Shaoyang Duan, Hong Wen, Liangyue Li, Nan Li, Yu Gu, Ge Yu

Abstract: This paper proposes the User Viewing Flow Modeling (SINGLE) method for the article recommendation task, which models the user constant preference and instant interest from user-clicked articles. Specifically, we first employ a user constant viewing flow modeling method to summarize the user's general interest to recommend articles. In this case, we utilize Large Language Models (LLMs) to capture constant user preferences from previously clicked articles, such as skills and positions. Then we design the user instant viewing flow modeling method to build interactions between user-clicked article history and candidate articles. It attentively reads the representations of user-clicked articles and aims to learn the user's different interest views to match the candidate article. Our experimental results on the Alibaba Technology Association (ATA) website show the advantage of SINGLE, achieving a 2.4% improvement over previous baseline models in the online A/B test. Our further analyses illustrate that SINGLE has the ability to build a more tailored recommendation system by mimicking different article viewing behaviors of users and recommending more appropriate and diverse articles to match user interests.

Submitted 7 March, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

Comments: Accepted by WebConf 2024

arXiv:2311.07603 [pdf, other]

PECoP: Parameter Efficient Continual Pretraining for Action Quality Assessment

Authors: Amirhossein Dadashzadeh, Shuchao Duan, Alan Whone, Majid Mirmehdi

Abstract: The limited availability of labelled data in Action Quality Assessment (AQA) has forced previous works to fine-tune their models pretrained on large-scale domain-general datasets. This common approach results in weak generalisation, particularly when there is a significant domain shift. We propose a novel, parameter efficient, continual pretraining framework, PECoP, to reduce such domain shift via an additional pretraining stage. In PECoP, we introduce 3D-Adapters, inserted into the pretrained model, to learn spatiotemporal, in-domain information via self-supervised learning where only the adapter modules' parameters are updated. We demonstrate PECoP's ability to enhance the performance of recent state-of-the-art methods (MUSDL, CoRe, and TSA) applied to AQA, leading to considerable improvements on benchmark datasets, JIGSAWS ($\uparrow6.0\%$), MTL-AQA ($\uparrow0.99\%$), and FineDiving ($\uparrow2.54\%$). We also present a new Parkinson's Disease dataset, PD4T, of real patients performing four different actions, where we surpass ($\uparrow3.56\%$) the state-of-the-art in comparison. Our code, pretrained models, and the PD4T dataset are available at https://github.com/Plrbear/PECoP.

Submitted 10 November, 2023; originally announced November 2023.

Comments: Accepted to WACV 2024 (preprint)

arXiv:2311.05608 [pdf, other]

FigStep: Jailbreaking Large Vision-language Models via Typographic Visual Prompts

Authors: Yichen Gong, Delong Ran, Jinyuan Liu, Conglei Wang, Tianshuo Cong, Anyu Wang, Sisi Duan, Xiaoyun Wang

Abstract: Ensuring the safety of artificial intelligence-generated content (AIGC) is a longstanding topic in the artificial intelligence (AI) community, and the safety concerns associated with Large Language Models (LLMs) have been widely investigated. Recently, large vision-language models (VLMs) represent an unprecedented revolution, as they are built upon LLMs but can incorporate additional modalities (e.g., images). However, the safety of VLMs lacks systematic evaluation, and there may be an overconfidence in the safety guarantees provided by their underlying LLMs. In this paper, to demonstrate that introducing additional modality modules leads to unforeseen AI safety issues, we propose FigStep, a straightforward yet effective jailbreaking algorithm against VLMs. Instead of feeding textual harmful instructions directly, FigStep converts the harmful content into images through typography to bypass the safety alignment within the textual module of the VLMs, inducing VLMs to output unsafe responses that violate common AI safety policies. In our evaluation, we manually review 46,500 model responses generated by 3 families of the promising open-source VLMs, i.e., LLaVA, MiniGPT4, and CogVLM (a total of 6 VLMs). The experimental results show that FigStep can achieve an average attack success rate of 82.50% on 500 harmful queries in 10 topics. Moreover, we demonstrate that the methodology of FigStep can even jailbreak GPT-4V, which already leverages an OCR detector to filter harmful queries. Above all, our work reveals that VLMs are vulnerable to jailbreaking attacks, which highlights the necessity of novel safety alignments between visual and textual modalities.

Submitted 13 December, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

Comments: Technical Report

arXiv:2310.20290 [pdf, other]

On Rayleigh Quotient Iteration for Dual Quaternion Hermitian Eigenvalue Problem

Authors: Shan-Qi Duan, Qing-Wen Wang, Xue-Feng Duan

Abstract: The application of eigenvalue theory to dual quaternion Hermitian matrices holds significance in the realm of multi-agent formation control. In this paper, we study the Rayleigh quotient iteration (RQI) for solving the right eigenpairs of dual quaternion Hermitian matrices. Combined with dual representation, the RQI algorithm can effectively compute the extreme eigenvalue along with the associated eigenvector of the large dual quaternion Hermitian matrices. Furthermore, a convergence analysis of the Rayleigh quotient iteration is derived, demonstrating a local convergence rate of at least cubic, which is faster than the linear convergence rate of the power method. Numerical examples are provided to illustrate the high accuracy and low CPU time cost of the proposed Rayleigh quotient iteration compared with the power method for solving the dual quaternion Hermitian eigenvalue problem.

Submitted 6 March, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

Comments: arXiv admin note: text overlap with arXiv:2111.12211 by other authors
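
The convergence contrast described in the abstract above (cubic for Rayleigh quotient iteration, linear for the power method) can be illustrated on an ordinary real symmetric matrix. The following is a minimal sketch under that simplifying assumption; it is not the authors' dual quaternion algorithm, and every function name, the test matrix, and the tolerances are hypothetical choices made for illustration.

```python
import math

def mat_vec(a, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def normalize(x):
    norm = math.sqrt(dot(x, x))
    return [xi / norm for xi in x]

def residual_norm(a, x, lam):
    """Euclidean norm of A x - lam x."""
    ax = mat_vec(a, x)
    return math.sqrt(sum((axi - lam * xi) ** 2 for axi, xi in zip(ax, x)))

def solve(a, b):
    """Solve a y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if m[col][col] == 0.0:
            # Safeguard (an assumption of this sketch): nudge an exactly
            # zero pivot so the nearly singular shifted system stays solvable.
            m[col][col] = 1e-30
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (m[i][n] - tail) / m[i][i]
    return y

def rayleigh_quotient_iteration(a, x0, tol=1e-10, max_iter=50):
    """Return (eigenvalue estimate, eigenvector estimate, iterations used)."""
    n = len(a)
    x = normalize(x0)
    lam = dot(x, mat_vec(a, x))  # Rayleigh quotient of a unit vector
    for k in range(max_iter):
        if residual_norm(a, x, lam) < tol:
            return lam, x, k
        # Inverse iteration step with the current Rayleigh quotient as shift.
        shifted = [[a[i][j] - (lam if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        x = normalize(solve(shifted, x))
        lam = dot(x, mat_vec(a, x))
    return lam, x, max_iter

def power_method(a, x0, tol=1e-10, max_iter=1000):
    """Same stopping rule as above, but each step only multiplies by A."""
    x = normalize(x0)
    lam = dot(x, mat_vec(a, x))
    for k in range(max_iter):
        if residual_norm(a, x, lam) < tol:
            return lam, x, k
        x = normalize(mat_vec(a, x))
        lam = dot(x, mat_vec(a, x))
    return lam, x, max_iter

# Hypothetical 3x3 symmetric test matrix with eigenvalues 3 and 3 +/- sqrt(3).
A = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
rqi_lam, rqi_vec, rqi_iters = rayleigh_quotient_iteration(A, [1.0, 1.0, 1.0])
pm_lam, pm_vec, pm_iters = power_method(A, [1.0, 1.0, 1.0])
print("RQI:", rqi_lam, "after", rqi_iters, "iterations")
print("Power method:", pm_lam, "after", pm_iters, "iterations")
```

Both routines stop on the same residual test, so the iteration counts are directly comparable. Each RQI step costs a linear solve, whereas a power-method step is a single matrix-vector product, so fewer iterations do not automatically mean less total work.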

arXiv:2310.18070 [pdf, other]

doi 10.1109/TASLP.2023.3313885

Multi-grained Evidence Inference for Multi-choice Reading Comprehension

Authors: Yilin Zhao, Hai Zhao, Sufeng Duan

Abstract: Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task for machines to answer questions according to provided options. Answers in multi-choice MRC cannot be directly extracted in the given passages, and essentially require machines capable of reasoning from accurate extracted evidence. However, the critical evidence may be as simple as just one word or phrase, while it is hidden in the given redundant, noisy passage with multiple linguistic hierarchies from phrase, fragment, sentence until the entire passage. We thus propose a novel general-purpose model enhancement which integrates multi-grained evidence comprehensively, named Multi-grained evidence inferencer (Mugen), to make up for the inability. Mugen extracts three different granularities of evidence: coarse-, middle- and fine-grained evidence, and integrates evidence with the original passages, achieving significant and consistent performance improvement on four multi-choice MRC benchmarks.

Submitted 27 October, 2023; originally announced October 2023.

Comments: Accepted by TASLP 2023, vol. 31, pp. 3896-3907

Journal ref: in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 3896-3907, 2023

arXiv:2310.11984 [pdf, other]

From Interpolation to Extrapolation: Complete Length Generalization for Arithmetic Transformers

Authors: Shaoxiong Duan, Yining Shi, Wei Xu

Abstract: In this paper, we investigate the inherent capabilities of transformer models in learning arithmetic algorithms, such as addition and parity. Through experiments and attention analysis, we identify a number of crucial factors for achieving optimal length generalization. We show that transformer models are able to generalize to long lengths with the help of targeted attention biasing. In particular, our solution solves the Parity task, a well-known and theoretically proven failure mode for Transformers. We then introduce Attention Bias Calibration (ABC), a calibration stage that enables the model to automatically learn the proper attention biases, which we show to be connected to mechanisms in relative position encoding. We demonstrate that using ABC, the transformer model can achieve unprecedented near-perfect length generalization on certain arithmetic tasks. In addition, we show that ABC bears remarkable similarities to RPE and LoRA, which may indicate the potential for applications to more complex tasks.

Submitted 10 May, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

arXiv:2310.11053 [pdf, other]

Denevil: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning

Authors: Shitong Duan, Xiaoyuan Yi, Peng Zhang, Tun Lu, Xing Xie, Ning Gu

Abstract: Large Language Models (LLMs) have made unprecedented breakthroughs, yet their increasing integration into everyday life might raise societal risks due to generated unethical content. Despite extensive study on specific issues like bias, the intrinsic values of LLMs remain largely unexplored from a moral philosophy perspective. This work delves into ethical values utilizing Moral Foundation Theory. Moving beyond conventional discriminative evaluations with poor reliability, we propose DeNEVIL, a novel prompt generation algorithm tailored to dynamically exploit LLMs' value vulnerabilities and elicit the violation of ethics in a generative manner, revealing their underlying value inclinations. On such a basis, we construct MoralPrompt, a high-quality dataset comprising 2,397 prompts covering 500+ value principles, and then benchmark the intrinsic values across a spectrum of LLMs. We discovered that most models are essentially misaligned, necessitating further ethical value alignment. In response, we develop VILMO, an in-context alignment method that substantially enhances the value compliance of LLM outputs by learning to generate appropriate value instructions, outperforming existing competitors. Our methods are suitable for black-box and open-source models, offering a promising initial step in studying the ethical values of LLMs.

Submitted 4 March, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: Accepted by ICLR 2024

arXiv:2310.07548 [pdf, other]

Attribute Localization and Revision Network for Zero-Shot Learning

Authors: Junzhe Xu, Suling Duan, Chenwei Tang, Zhenan He, Jiancheng Lv

Abstract: Zero-shot learning enables the model to recognize unseen categories with the aid of auxiliary semantic information such as attributes. Current works proposed to detect attributes from local image regions and align extracted features with class-level semantics. In this paper, we find that the choice between local and global features is not a zero-sum game, global features can also contribute to the understanding of attributes. In addition, aligning attribute features with class-level semantics ignores potential intra-class attribute variation. To mitigate these disadvantages, we present Attribute Localization and Revision Network in this paper. First, we design Attribute Localization Module (ALM) to capture both local and global features from image regions, a novel module called Scale Control Unit is incorporated to fuse global and local representations. Second, we propose Attribute Revision Module (ARM), which generates image-level semantics by revising the ground-truth value of each attribute, compensating for performance degradation caused by ignoring intra-class variation. Finally, the output of ALM will be aligned with revised semantics produced by ARM to achieve the training process. Comprehensive experimental results on three widely used benchmarks demonstrate the effectiveness of our model in the zero-shot prediction task.

Submitted 11 October, 2023; originally announced October 2023.

arXiv:2309.13568 [pdf, ps, other]

The $\circ$ operation and $*$ operation of Cohen-Macaulay bipartite graphs

Authors: Yulong Yang, Guangjun Zhu, Yijun Cui, Shiya Duan

Abstract: Let $G$ be a finite simple graph with the vertex set $V$ and let $I_G$ be its edge ideal in the polynomial ring $S=\mathbb{K}[x_V]$. In this paper, we compute the depth and the Castelnuovo--Mumford regularity of $S/I_G$ when $G=G_1\circ G_2$ or $G=G_1* G_2$ is a graph obtained from Cohen-Macaulay bipartite graphs $G_1$, $G_2$ by $\circ$ operation or $*$ operation, respectively.

Submitted 27 September, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

Comments: arXiv admin note: text overlap with arXiv:2308.06010

MSC Class: Primary 13C15; 13A15; 13D02; Secondary 05E40

arXiv:2309.02230 [pdf, other]

DCP-Net: A Distributed Collaborative Perception Network for Remote Sensing Semantic Segmentation

Authors: Zhechao Wang, Peirui Cheng, Shujing Duan, Kaiqiang Chen, Zhirui Wang, Xinming Li, Xian Sun

Abstract: Onboard intelligent processing is widely applied in emergency tasks in the field of remote sensing. However, it is predominantly confined to an individual platform with a limited observation range as well as susceptibility to interference, resulting in limited accuracy. Considering the current state of multi-platform collaborative observation, this article innovatively presents a distributed collaborative perception network called DCP-Net. Firstly, the proposed DCP-Net helps members to enhance perception performance by integrating features from other platforms. Secondly, a self-mutual information match module is proposed to identify collaboration opportunities and select suitable partners, prioritizing critical collaborative features and reducing redundant transmission cost. Thirdly, a related feature fusion module is designed to address the misalignment between local and collaborative features, improving the quality of fused features for the downstream task. We conduct extensive experiments and visualization analyses using three semantic segmentation datasets, including Potsdam, iSAID and DFC23. The results demonstrate that DCP-Net outperforms the existing methods comprehensively, improving mIoU by 2.61%~16.89% at the highest collaboration efficiency, which promotes the performance to a state-of-the-art level.

Submitted 5 September, 2023; originally announced September 2023.

arXiv:2308.07492 [pdf, ps, other]

Generalized Radon transforms on fractal measures

Authors: Shengze Duan

Abstract: In the setting of a general Borel measure $μ$ on $R^d$ with the natural ball size condition $$μ[B(x,r)]\leq Cr^s,$$ we establish the $L^p(μ)$-$L^q(μ)$-estimate for the generalized Radon transform $$(Af)(x):=\int_{Φ(x,y)=0}(fμ)(y)ψ(x,y)dσ_x(y),$$ where $Φ$ is a smooth function away from the diagonal. Among other reasonable assumptions, an $L^2$-Sobolev bound on $A$ on $R^d$ is imposed. This bound is satisfied in many natural situations. The main result is, in general, sharp up to endpoints.

Submitted 14 August, 2023; originally announced August 2023.

arXiv:2308.07139 [pdf, other]

Extremely thin perfect absorber by generalized multipole bianisotropic effect

Authors: Hao Ma, Andrey B. Evlyukhin, Andrey E. Miroshnichenko, Fengjie Zhu, Siyu Duan, Jingbo Wu, Caihong Zhang, Jian Chen, Biao-Bing Jin, Willie J. Padilla, Kebin Fan

Abstract: Symmetry breaking plays a crucial role in understanding the fundamental physics underlying numerous physical phenomena, including the electromagnetic response in resonators, giving rise to intriguing effects such as directional light scattering, supercavity lasing, and topologically protected states. In this work, we demonstrate that adding a small fraction of lossy metal (as low as $1\times10^{-6}$ in volume), to a lossless dielectric resonator breaks inversion symmetry thereby lifting its degeneracy, leading to a strong bianisotropic response. In the case of the metasurface composed of such resonators, this effect leads to unidirectional perfect absorption while maintaining nearly perfect reflection from the opposite direction. We have developed more general Onsager-Casimir relations for the polarizabilities of particle arrays, taking into account the contributions of quadrupoles, which shows that bianisotropy is not solely due to dipoles, but also involves high-order multipoles. Our experimental validation demonstrates an extremely thin terahertz-perfect absorber with a wavelength-to-thickness ratio of up to 25,000, where the material thickness is only 2% of the theoretical minimum thickness dictated by the fundamental limit. Our findings have significant implications for a variety of applications, including energy harvesting, thermal management, single-photon detection, and low-power directional emission.

Submitted 14 August, 2023; originally announced August 2023.

arXiv:2308.06016 [pdf, ps, other]

Integral closure and normality of edge ideals of some edge-weighted graphs

Authors: Shiya Duan, Guangjun Zhu, Yijun Cui, Jiaxin Li

Abstract: Let $G_ω$ be an edge-weighted simple graph. In this paper, we give a complete characterization of the graph $G_ω$ whose edge ideal $I(G_ω)$ is integrally closed. We also show that if $G_ω$ is an edge-weighted star graph, a path or a cycle, and $I(G_ω)$ is integrally closed, then $I(G_ω)$ is normal.

Submitted 11 August, 2023; originally announced August 2023.

MSC Class: Primary 13B22; 13F20; Secondary 05C99; 05E4

arXiv:2308.01469 [pdf, other]

VertexSerum: Poisoning Graph Neural Networks for Link Inference

Authors: Ruyi Ding, Shijin Duan, Xiaolin Xu, Yunsi Fei

Abstract: Graph neural networks (GNNs) have brought superb performance to various applications utilizing graph structural data, such as social analysis and fraud detection. The graph links, e.g., social relationships and transaction history, are sensitive and valuable information, which raises privacy concerns when using GNNs. To exploit these vulnerabilities, we propose VertexSerum, a novel graph poisoning attack that increases the effectiveness of graph link stealing by amplifying the link connectivity leakage. To infer node adjacency more accurately, we propose an attention mechanism that can be embedded into the link detection network. Our experiments demonstrate that VertexSerum significantly outperforms the SOTA link inference attack, improving the AUC scores by an average of $9.8\%$ across four real-world datasets and three different GNN structures. Furthermore, our experiments reveal the effectiveness of VertexSerum in both black-box and online learning settings, further validating its applicability in real-world scenarios.

Submitted 2 August, 2023; originally announced August 2023.

arXiv:2307.02751 [pdf, ps, other]

DSARSR: Deep Stacked Auto-encoders Enhanced Robust Speaker Recognition

Authors: Zhifeng Wang, Chunyan Zeng, Surong Duan, Hongjie Ouyang, Hongmin Xu

Abstract: Speaker recognition is a biometric modality that utilizes the speaker's speech segments to recognize the identity, determining whether the test speaker belongs to one of the enrolled speakers. In order to improve the robustness of the i-vector framework on cross-channel conditions and explore a novel method for applying deep learning to speaker recognition, the Stacked Auto-encoders are used to get the abstract extraction of the i-vector instead of applying PLDA. After pre-processing and feature extraction, the speaker and channel-independent speeches are employed for UBM training. The UBM is then used to extract the i-vector of the enrollment and test speech. Unlike the traditional i-vector framework, which uses linear discriminant analysis (LDA) to reduce dimension and increase the discrimination between speaker subspaces, this research uses stacked auto-encoders to reconstruct the i-vector with lower dimension, and different classifiers can be chosen to achieve final classification. The experimental results show that the proposed method achieves better performance than the state-of-the-art method.

Submitted 5 July, 2023; originally announced July 2023.

Comments: 12 pages, 3 figures

arXiv:2306.15513 [pdf, other]

PASNet: Polynomial Architecture Search Framework for Two-party Computation-based Secure Neural Network Deployment

Authors: Hongwu Peng, Shanglin Zhou, Yukui Luo, Nuo Xu, Shijin Duan, Ran Ran, Jiahui Zhao, Chenghong Wang, Tong Geng, Wujie Wen, Xiaolin Xu, Caiwen Ding

Abstract: Two-party computation (2PC) is promising to enable privacy-preserving deep learning (DL). However, the 2PC-based privacy-preserving DL implementation comes with high comparison protocol overhead from the non-linear operators. This work presents PASNet, a novel systematic framework that enables low latency, high energy efficiency & accuracy, and security-guaranteed 2PC-DL by integrating the hardware latency of the cryptographic building block into the neural architecture search loss function. We develop a cryptographic hardware scheduler and the corresponding performance model for Field Programmable Gate Arrays (FPGA) as a case study. The experimental results demonstrate that our light-weighted model PASNet-A and heavily-weighted model PASNet-B achieve 63 ms and 228 ms latency on private inference on ImageNet, which are 147 and 40 times faster than the SOTA CryptGPU system, and achieve 70.54% & 78.79% accuracy and more than 1000 times higher energy efficiency.

Submitted 27 June, 2023; originally announced June 2023.

Comments: DAC 2023 accepted publication; a short version was published at the AAAI 2023 workshop on DL-Hardware Co-Design for AI Acceleration: RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference

ACM Class: E.3; I.2; B.0

Journal ref: DAC 2023

arXiv:2306.00311 [pdf, other]

doi 10.1103/PhysRevLett.130.226501

Ultrafast Switching from the Charge Density Wave Phase to a Metastable Metallic State in 1T-TiSe$_2$

Authors: Shaofeng Duan, Wei Xia, Chaozhi Huang, Shichong Wang, Lingxiao Gu, Haoran Liu, Dao Xiang, Dong Qian, Yanfeng Guo, Wentao Zhang

Abstract: The ultrafast electronic structures of the charge density wave material 1T-TiSe$_2$ were investigated by high-resolution time- and angle-resolved photoemission spectroscopy. We found that the quasiparticle populations drove ultrafast electronic phase transitions in 1T-TiSe$_2$ within 100 fs after photoexcitation, and a metastable metallic state, which was significantly different from the equilibrium normal phase, was evidenced far below the charge density wave transition temperature. Detailed time- and pump-fluence-dependent experiments revealed that the photoinduced metastable metallic state was a result of the halted motion of the atoms through the coherent electron-phonon coupling process, and the lifetime of this state was prolonged to picoseconds with the highest pump fluence used in this study. Ultrafast electronic dynamics were well captured by the time-dependent Ginzburg-Landau model. Our work demonstrates a mechanism for realizing novel electronic states by photoinducing coherent motion of atoms in the lattice.

Submitted 31 May, 2023; originally announced June 2023.

Comments: 13 Pages, 10 figures

Journal ref: Phys. Rev. Lett. 130, 226501 (2023)

arXiv:2304.08020 [pdf, other]

Sparse Positive-Definite Estimation for Covariance Matrices with Repeated Measurements

Authors: Sunpeng Duan, Guo Yu, Juntao Duan, Yuedong Wang

Abstract: Repeated measurements are common in many fields, where random variables are observed repeatedly across different subjects. Such data have an underlying hierarchical structure, and it is of interest to learn covariance/correlation at different levels. Most existing methods for sparse covariance/correlation matrix estimation assume independent samples. Ignoring the underlying hierarchical structure and correlation within the subject leads to erroneous scientific conclusions. In this paper, we study the problem of sparse and positive-definite estimation of between-subject and within-subject covariance/correlation matrices for repeated measurements. Our estimators are solutions to convex optimization problems that can be solved efficiently. We establish estimation error rates for the proposed estimators and demonstrate their favorable performance through theoretical analysis and comprehensive simulation studies. We further apply our methods to construct between-subject and within-subject covariance graphs of clinical variables from hemodialysis patients.

Submitted 10 June, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

arXiv:2302.12506 [pdf]

Exploring the Enablers of Digital Transformation in Small and Medium-Sized Enterprise

Authors: Sachithra Lokuge, Sophia Duan

Abstract: Recently, digital transformation has caught much attention of both academics and practitioners. With the advent of digital technologies, small-and-medium-sized enterprises (SMEs) have obtained the capacity to initiate digital transformation initiatives in a similar fashion to large-sized organizations. The innate characteristics of digital technologies also favor SMEs in promoting initiation of digital transformation. However, the process of digital transformation in SMEs remains a black box and the existing findings of digital transformation in SMEs are limited and remain fragmented. Considering the important contribution SMEs can offer to nations and economies, it is timely and relevant to conduct a profound analysis on digital transformation in SMEs. By conducting a thorough review of existing related literature in management, information systems, and business disciplines, this book chapter aims to understand both internal and external enablers of the digital transformation in SMEs.

Submitted 24 February, 2023; originally announced February 2023.

arXiv:2302.12347 [pdf, other]

MetaLDC: Meta Learning of Low-Dimensional Computing Classifiers for Fast On-Device Adaption

Authors: Yejia Liu, Shijin Duan, Xiaolin Xu, Shaolei Ren

Abstract: Fast model updates for unseen tasks on intelligent edge devices are crucial but also challenging due to the limited computational power. In this paper, we propose MetaLDC, which meta-trains brain-inspired ultra-efficient low-dimensional computing classifiers to enable fast adaptation on tiny devices with minimal computational costs. Concretely, during the meta-training stage, MetaLDC meta-trains a representation offline by explicitly taking into account that the final (binary) class layer will be fine-tuned for fast adaptation for unseen tasks on tiny devices; during the meta-testing stage, MetaLDC uses closed-form gradients of the loss function to enable fast adaptation of the class layer. Unlike traditional neural networks, MetaLDC is designed based on the emerging LDC framework to enable ultra-efficient on-device inference. Our experiments have demonstrated that compared to SOTA baselines, MetaLDC achieves higher accuracy, robustness against random bit errors, as well as cost-efficient hardware computation.

Submitted 23 February, 2023; originally announced February 2023.

Comments: Accepted as a full paper by the TinyML Research Symposium 2023; 8 pages, 5 figures

arXiv:2302.02292 [pdf, other]

RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference

Authors: Hongwu Peng, Shanglin Zhou, Yukui Luo, Nuo Xu, Shijin Duan, Ran Ran, Jiahui Zhao, Shaoyi Huang, Xi Xie, Chenghong Wang, Tong Geng, Wujie Wen, Xiaolin Xu, Caiwen Ding

Abstract: The proliferation of deep learning (DL) has led to the emergence of privacy and security concerns. To address these issues, secure Two-party computation (2PC) has been proposed as a means of enabling privacy-preserving DL computation. However, in practice, 2PC methods often incur high computation and communication overhead, which can impede their use in large-scale systems. To address this challenge, we introduce RRNet, a systematic framework that aims to jointly reduce the overhead of MPC comparison protocols and accelerate computation through hardware acceleration. Our approach integrates the hardware latency of cryptographic building blocks into the DNN loss function, resulting in improved energy efficiency, accuracy, and security guarantees. Furthermore, we propose a cryptographic hardware scheduler and corresponding performance model for Field Programmable Gate Arrays (FPGAs) to further enhance the efficiency of our framework. Experiments show RRNet achieved a much higher ReLU reduction performance than all SOTA works on CIFAR-10 dataset.

Submitted 22 February, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

Comments: This work is an updated version of arXiv:2209.09424; the original version has been withdrawn

ACM Class: I.2

Showing 1–50 of 142 results for author: Duan, S