-
Photonic quasicrystal of spin angular momentum
Authors:
Min Lin,
Xinxin Gou,
Zhenwei Xie,
Aiping Yang,
Luping Du,
Xiaocong Yuan
Abstract:
Quasicrystals, characterized by long-range order without translational symmetry, have catalyzed transformative advances in various fields, including optics in terms of field quasicrystals. Here, we present the first demonstration of photonic quasicrystals formed by spin angular momentum, unveiling novel spin-orbit coupling effects absent in traditional field quasicrystals. A de Bruijn tiling-like theoretical framework was built, elucidating the formation mechanism of spin quasicrystals for diverse symmetries. Moreover, the configurations of these spin textures can be manipulated through adjustments of the wavefronts, among which phason-like discontinuous dynamics is observed and quantitatively measured. Unlike optical quasicrystals shaped by electromagnetic fields, these spin-governed quasicrystals exhibit quasi-periodic properties of kinematic parameters, extending their potential applications to other physical systems. These findings hold promise for novel advancements in optical trapping, quasicrystal fabrication, and optical encryption systems.
Submitted 12 July, 2024;
originally announced July 2024.
-
Sharp non-uniqueness for the 2D hyper-dissipative Navier-Stokes equations
Authors:
Lili Du,
Xinliang Li
Abstract:
In this article, we study the non-uniqueness of weak solutions for the two-dimensional hyper-dissipative Navier-Stokes equations in the super-critical spaces $L_{t}^γW_{x}^{s,p}$ when $α\in[1,\frac{3}{2})$, and obtain the conclusion that the non-uniqueness of the weak solutions at the two endpoints is sharp in view of the generalized Ladyženskaja-Prodi-Serrin condition with the triplet $(s,γ,p)=(s,\infty, \frac{2}{2α-1+s})$ and $(s, \frac{2α}{2α-1+s}, \infty)$. As a good observation, we use the intermittency of the temporal concentrated function in an almost optimal way, and establish its relationship with the viscosity exponent $α$ as well as the regularity of the weak solutions. The research results extend the recent elegant works on 2D Navier-Stokes equations in [Cheskidov and Luo, Invent. Math., 229 (2022), pp. 987--1054; Cheskidov and Luo, Ann. PDE, 9:13 (2023)] to the hyper-dissipative case $α\in(1,\frac{3}{2})$, and are also applicable in Lebesgue and Besov spaces. It is proved that even in the case of high viscosity, the behavior of the solution remains unpredictable and stochastic due to the lack of integrability and regularity.
Submitted 9 July, 2024;
originally announced July 2024.
-
Realization of Conditional Operations through Transition Pathway Engineering
Authors:
Sheng Zhang,
Peng Duan,
Yun-Jie Wang,
Tian-Le Wang,
Peng Wang,
Ren-Ze Zhao,
Xiao-Yan Yang,
Ze-An Zhao,
Liang-Liang Guo,
Yong Chen,
Hai-Feng Zhang,
Lei Du,
Hao-Ran Tao,
Zhi-Fei Li,
Yuan Wu,
Zhi-Long Jia,
Wei-Cheng Kong,
Zhao-Yun Chen,
Yu-Chun Wu,
Guo-Ping Guo
Abstract:
In the NISQ era, achieving large-scale quantum computing demands compact circuits to mitigate decoherence and gate error accumulation. Quantum operations with diverse degrees of freedom hold promise for circuit compression, but conventional approaches encounter challenges in simultaneously adjusting multiple parameters. Here, we propose a transition composite gate (TCG) scheme grounded on state-selective transition path engineering, enabling more expressive conditional operations. We experimentally validate a controlled unitary (CU) gate as an example, with independent and continuous parameters. By adjusting the parameters of the $\rm X^{12}$ gate, we obtain the CU family with fidelities ranging from 95.2% to 99.0%, as measured by quantum process tomography (QPT). To demonstrate the capability of circuit compression, we use the TCG scheme to prepare 3-qubit Greenberger-Horne-Zeilinger (GHZ) and W states, with fidelities of 96.77% and 95.72%, respectively. TCG reduces the circuit depth by about 40% and 44%, respectively, compared with the use of CZ gates only. Moreover, we show that short-path TCG (SPTCG) can further reduce the state-preparation circuit time cost. The TCG scheme exhibits advantages in certain quantum circuits and shows significant potential for large-scale quantum algorithms.
Submitted 10 July, 2024; v1 submitted 9 July, 2024;
originally announced July 2024.
-
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Authors:
Xingrun Xing,
Boyan Gao,
Zheng Zhang,
David A. Clifton,
Shitao Xiao,
Li Du,
Guoqi Li,
Jiajun Zhang
Abstract:
The recent advancements in large language models (LLMs) with billions of parameters have significantly boosted their performance across various real-world applications. However, the inference processes for these models require substantial energy and computational resources, presenting considerable deployment challenges. In contrast, human brains, which contain approximately 86 billion biological neurons, exhibit significantly greater energy efficiency compared to LLMs with a similar number of parameters. Inspired by this, we redesign 7 to 70 billion parameter LLMs using bio-plausible spiking mechanisms, emulating the efficient behavior of the human brain. We propose SpikeLLM, the first spiking large language model at the scale of recent LLMs. Coupled with the proposed model, a novel spike-driven quantization framework named Optimal Brain Spiking is introduced to reduce the energy cost and accelerate inference speed via two essential approaches: first (second)-order differentiation-based salient channel detection, and per-channel salient outlier expansion with Generalized Integrate-and-Fire neurons. Our proposed spike-driven quantization can be plugged into mainstream quantization training methods. In the OmniQuant pipeline, SpikeLLM reduces WikiText2 perplexity by 25.51% and improves average accuracy on 6 zero-shot datasets by 3.08% on a LLAMA2-7B 4A4W model. In the GPTQ pipeline, SpikeLLM realizes a sparse ternary quantization, which achieves additive in all linear layers. Compared with PB-LLM with similar operations, SpikeLLM also performs significantly better. We will release our code on GitHub.
Submitted 5 July, 2024;
originally announced July 2024.
-
Probing the equilibration of the QCD matter created in heavy-ion collisions with dileptons
Authors:
Xiang-Yu Wu,
Lipei Du,
Charles Gale,
Sangyong Jeon
Abstract:
A systematic study of intermediate invariant mass dilepton production in Pb+Pb collisions at $\sqrt{s_{NN}} = 5.02$ TeV is performed, using next-to-leading-order (NLO) thermal QCD dilepton emission rates with a multistage dynamical approach which includes event-by-event IP-Glasma initial conditions, relativistic viscous fluid dynamics, and a hadronic afterburner. Considering dilepton yield and anisotropic flow, special attention is paid to the out-of-equilibrium aspects, both thermal and chemical, and to the contribution of the Drell-Yan process. The relative contribution of each of those different channels to dilepton observables is calculated and discussed.
Submitted 4 July, 2024;
originally announced July 2024.
-
Stable Heterogeneous Treatment Effect Estimation across Out-of-Distribution Populations
Authors:
Yuling Zhang,
Anpeng Wu,
Kun Kuang,
Liang Du,
Zixun Sun,
Zhi Wang
Abstract:
Heterogeneous treatment effect (HTE) estimation is vital for understanding the change of treatment effect across individuals or subgroups. Most existing HTE estimation methods focus on addressing selection bias induced by imbalanced distributions of confounders between treated and control units, but ignore distribution shifts across populations. Thereby, their applicability has been limited to the in-distribution (ID) population, which shares a similar distribution with the training dataset. In real-world applications, where population distributions are subject to continuous changes, there is an urgent need for stable HTE estimation across out-of-distribution (OOD) populations, which, however, remains an open problem. As pioneers in resolving this problem, we propose a novel Stable Balanced Representation Learning with Hierarchical-Attention Paradigm (SBRL-HAP) framework, which consists of 1) Balancing Regularizer for eliminating selection bias, 2) Independence Regularizer for addressing the distribution shift issue, 3) Hierarchical-Attention Paradigm for coordination between balance and independence. In this way, SBRL-HAP regresses counterfactual outcomes using ID data, while ensuring the resulting HTE estimation can be successfully generalized to out-of-distribution scenarios, thereby enhancing the model's applicability in real-world settings. Extensive experiments conducted on synthetic and real-world datasets demonstrate the effectiveness of our SBRL-HAP in achieving stable HTE estimation across OOD populations, with an average 10% reduction in the error metric PEHE and 11% decrease in the ATE bias, compared to the SOTA methods.
Submitted 3 July, 2024;
originally announced July 2024.
-
SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic
Authors:
Liulu He,
Yufei Zhao,
Rui Gao,
Yuan Du,
Li Du
Abstract:
Fast convolution algorithms, including Winograd and FFT, can efficiently accelerate convolution operations in deep models. However, these algorithms depend on high-precision arithmetic to maintain inference accuracy, which conflicts with model quantization. To resolve this conflict and further improve the efficiency of quantized convolution, we propose SFC, a new algebra transform for fast convolution that extends the Discrete Fourier Transform (DFT) with symbolic computing, in which only additions are required to perform the transformation at specific transform points, avoiding the calculation of irrational numbers and reducing the requirement for precision. Additionally, we enhance convolution efficiency by introducing correction terms to convert invalid circular convolution outputs of the Fourier method into effective ones. The numerical error analysis is presented for the first time in this type of work and proves that our algorithms can provide a 3.68x multiplication reduction for 3x3 convolution, while the Winograd algorithm only achieves a 2.25x reduction with similarly low numerical errors. Experiments carried out on benchmarks and FPGA show that our new algorithms can further improve the computation efficiency of quantized models while maintaining accuracy, surpassing both the quantization-alone method and existing works on fast convolution quantization.
Submitted 3 July, 2024;
originally announced July 2024.
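The SFC abstract above mentions "invalid circular convolution outputs of the Fourier method". As background only, here is a minimal NumPy sketch of why such outputs arise: multiplying DFTs yields a circular convolution, whose first few outputs are contaminated by wrap-around. This is not the SFC transform or its correction terms; the function name `circular_conv_fft` and the sample signal are illustrative assumptions.

```python
import numpy as np

def circular_conv_fft(x, k):
    """Circular convolution of x with k via the DFT (k is zero-padded to len(x))."""
    n = len(x)
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(k, n)).real

# Illustrative data (assumed, not from the paper).
x = np.arange(1.0, 9.0)          # signal of length 8
k = np.array([1.0, 2.0, 3.0])    # kernel of length 3

circ = circular_conv_fft(x, k)   # length-8 circular result
full = np.convolve(x, k)         # length-10 linear ("full") result
wrapped = len(k) - 1             # number of outputs affected by wrap-around

# Outputs from index `wrapped` onward agree with the linear convolution.
print(np.allclose(circ[wrapped:], full[wrapped:len(x)]))
# The first `wrapped` outputs equal the linear result plus the wrapped-around tail.
print(np.allclose(circ[:wrapped], full[:wrapped] + full[len(x):]))
```

In this plain DFT setting, only `len(x) - len(k) + 1` of the outputs can be used directly as linear-convolution results; the abstract states that SFC adds correction terms so the remaining outputs become usable as well.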
-
Constraining the Physical Parameters of Blazars Using the Seed Factor Approach
Authors:
Chang-Bin Deng,
Yong-You Shi,
Yu-Jie Song,
Rui Xue,
Lei-Ming Du,
Ze-Rui Wang,
Zhao-Hua Xie
Abstract:
The discovery that blazars dominate the extra-galactic γ-ray sky is a triumph of the Fermi era. However, the exact location of the γ-ray emission region remains under debate. Low-synchrotron-peaked blazars (LSPs) are thought to produce high-energy radiation through the external Compton process, so their emission regions are closely related to the external photon fields. We employed the seed factor approach proposed by Georganopoulos et al., which directly matches the observed seed factor of each LSP with the characteristic seed factors of external photon fields to locate the γ-ray emission region. A sample of 1138 LSPs with peak frequencies and peak luminosities was adopted to plot a histogram distribution of observed seed factors. We also collected some spectral energy distributions (SEDs) of historical flare states to investigate the variation of the γ-ray emission region. Those SEDs were fitted by both quadratic and cubic functions using the Markov-chain Monte Carlo method. Furthermore, we derived some physical parameters of blazars and compared them with the constraint of internal γγ-absorption. We find that the dusty torus dominates the soft photon fields of LSPs and that most γ-ray emission regions of LSPs are located at 1-10 pc. The soft photon fields could also transition from the dusty torus to the broad line region and the cosmic microwave background in different flare states. Our results suggest that the cubic function fits the SEDs better than the quadratic function.
Submitted 24 June, 2024;
originally announced June 2024.
-
Liouville results for semilinear integral equations with conical diffusion
Authors:
Isabeau Birindelli,
Lele Du,
Giulio Galise
Abstract:
Nonexistence results for positive supersolutions of the equation $$-Lu=u^p\quad\text{in $\mathbb R^N_+$}$$ are obtained, $-L$ being any symmetric and stable linear operator, positively homogeneous of degree $2s$, $s\in(0,1)$, whose spectral measure is absolutely continuous and positive only in a relative open set of the unit sphere of $\mathbb R^N$. The results are sharp: $u\equiv 0$ is the only nonnegative supersolution in the subcritical regime $1\leq p\leq\frac{N+s}{N-s}\,$, while nontrivial supersolutions exist, at least for some specific $-L$, as soon as $p>\frac{N+s}{N-s}$. The arguments used rely on a rescaled test function's method, suitably adapted to such nonlocal setting with weak diffusion; they are quite general and also employed to obtain Liouville type results in the whole space.
Submitted 24 June, 2024; v1 submitted 18 June, 2024;
originally announced June 2024.
-
LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models
Authors:
Xiaohao Yang,
He Zhao,
Dinh Phung,
Wray Buntine,
Lan Du
Abstract:
Topic modeling has been a widely used tool for unsupervised text analysis. However, comprehensive evaluations of a topic model remain challenging. Existing evaluation methods are either less comparable across different models (e.g., perplexity) or focus on only one specific aspect of a model (e.g., topic quality or document representation quality) at a time, which is insufficient to reflect the overall model performance. In this paper, we propose WALM (Words Agreement with Language Model), a new evaluation method for topic modeling that comprehensively considers the semantic quality of document representations and topics in a joint manner, leveraging the power of large language models (LLMs). With extensive experiments involving different types of topic models, WALM is shown to align with human judgment and can serve as a complementary evaluation method to the existing ones, bringing a new perspective to topic modeling. Our software package will be available at https://github.com/Xiaohao-Yang/Topic_Model_Evaluation, which can be integrated with many widely used topic models.
Submitted 13 June, 2024;
originally announced June 2024.
-
Enabling Large-Scale and High-Precision Fluid Simulations on Near-Term Quantum Computers
Authors:
Zhao-Yun Chen,
Teng-Yang Ma,
Chuang-Chao Ye,
Liang Xu,
Ming-Yang Tan,
Xi-Ning Zhuang,
Xiao-Fan Xu,
Yun-Jie Wang,
Tai-Ping Sun,
Yong Chen,
Lei Du,
Liang-Liang Guo,
Hai-Feng Zhang,
Hao-Ran Tao,
Tian-Le Wang,
Xiao-Yan Yang,
Ze-An Zhao,
Peng Wang,
Sheng Zhang,
Chi Zhang,
Ren-Ze Zhao,
Zhi-Long Jia,
Wei-Cheng Kong,
Meng-Han Dou,
Jun-Chao Wang
, et al. (7 additional authors not shown)
Abstract:
Quantum computational fluid dynamics (QCFD) offers a promising alternative to classical computational fluid dynamics (CFD) by leveraging quantum algorithms for higher efficiency. This paper introduces a comprehensive QCFD method, including an iterative method "Iterative-QLS" that suppresses error in quantum linear solver, and a subspace method to scale the solution to a larger size. We implement our method on a superconducting quantum computer, demonstrating successful simulations of steady Poiseuille flow and unsteady acoustic wave propagation. The Poiseuille flow simulation achieved a relative error of less than $0.2\%$, and the unsteady acoustic wave simulation solved a 5043-dimensional matrix. We emphasize the utilization of the quantum-classical hybrid approach in applications of near-term quantum computers. By adapting to quantum hardware constraints and offering scalable solutions for large-scale CFD problems, our method paves the way for practical applications of near-term quantum computers in computational science.
Submitted 19 June, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
MTS-Net: Dual-Enhanced Positional Multi-Head Self-Attention for 3D CT Diagnosis of May-Thurner Syndrome
Authors:
Yixin Huang,
Yiqi Jin,
Ke Tao,
Kaijian Xia,
Jianfeng Gu,
Lei Yu,
Lan Du,
Cunjian Chen
Abstract:
May-Thurner Syndrome (MTS), also known as iliac vein compression syndrome or Cockett's syndrome, is a condition potentially impacting over 20 percent of the population, leading to an increased risk of iliofemoral deep venous thrombosis. In this paper, we present a 3D-based deep learning approach called MTS-Net for diagnosing May-Thurner Syndrome using CT scans. To effectively capture the spatial-temporal relationship among CT scans and emulate the clinical process of diagnosing MTS, we propose a novel attention module called the dual-enhanced positional multi-head self-attention (DEP-MHSA). The proposed DEP-MHSA reconsiders the role of positional embedding and incorporates a dual-enhanced positional embedding in both attention weights and residual connections. Further, we establish a new dataset, termed MTS-CT, consisting of 747 subjects. Experimental results demonstrate that our proposed approach achieves state-of-the-art MTS diagnosis results, and our self-attention design facilitates the spatial-temporal modeling. We believe that our DEP-MHSA is more suitable to handle CT image sequence modeling and the proposed dataset enables future research on MTS diagnosis. We make our code and dataset publicly available at: https://github.com/Nutingnon/MTS_dep_mhsa.
Submitted 7 June, 2024;
originally announced June 2024.
-
Fast and Practical Strassen's Matrix Multiplication using FPGAs
Authors:
Afzal Ahmad,
Linfeng Du,
Wei Zhang
Abstract:
Matrix multiplication is a cornerstone operation in a wide array of scientific fields, including machine learning and computer graphics. The standard algorithm for matrix multiplication has a complexity of $\mathcal{O}(n^3)$ for $n\times n$ matrices. Strassen's algorithm improves this to $\mathcal{O}(n^{2.807})$, but its practicality is limited for small to medium matrix sizes due to the large number of additions it introduces. This paper presents a novel FPGA-based implementation of Strassen's algorithm that achieves superior speed over an optimized General Matrix Multiply (GeMM) implementation for matrices as small as $n=256$. Our design, tested extensively on two high-performance FPGA accelerators (Alveo U50 and U280) across various data types, matches or surpasses the performance of a highly optimized baseline across a range of matrix sizes.
Submitted 4 June, 2024;
originally announced June 2024.
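For readers unfamiliar with the algorithm named in the abstract above, the following is a minimal software sketch of the textbook Strassen recursion, which replaces the eight half-size block products of the standard method with seven at the cost of extra additions. It is not the paper's FPGA design; the function name `strassen` and the `cutoff` parameter are illustrative assumptions.

```python
import numpy as np

def strassen(a, b, cutoff=64):
    """Multiply two square matrices of equal size with Strassen's recursion.

    Falls back to the ordinary product when the size is at most `cutoff`
    (an assumed tuning parameter) or is odd and cannot be halved evenly.
    """
    n = a.shape[0]
    if n <= cutoff or n % 2:
        return a @ b
    h = n // 2
    a11, a12, a21, a22 = a[:h, :h], a[:h, h:], a[h:, :h], a[h:, h:]
    b11, b12, b21, b22 = b[:h, :h], b[:h, h:], b[h:, :h], b[h:, h:]

    # Seven recursive half-size products instead of eight.
    m1 = strassen(a11 + a22, b11 + b22, cutoff)
    m2 = strassen(a21 + a22, b11, cutoff)
    m3 = strassen(a11, b12 - b22, cutoff)
    m4 = strassen(a22, b21 - b11, cutoff)
    m5 = strassen(a11 + a12, b22, cutoff)
    m6 = strassen(a21 - a11, b11 + b12, cutoff)
    m7 = strassen(a12 - a22, b21 + b22, cutoff)

    # Recombine the products into the four quadrants of the result.
    c11 = m1 + m4 - m5 + m7
    c12 = m3 + m5
    c21 = m2 + m4
    c22 = m1 - m2 + m3 + m6
    return np.block([[c11, c12], [c21, c22]])

# Usage: integer inputs make the comparison with the ordinary product exact.
rng = np.random.default_rng(0)
a = rng.integers(-5, 5, size=(8, 8))
b = rng.integers(-5, 5, size=(8, 8))
print(np.array_equal(strassen(a, b, cutoff=1), a @ b))
```

The `cutoff` fallback reflects the trade-off the abstract describes: below some matrix size, the additional additions outweigh the saved multiplications, so the ordinary product is used instead.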
-
Navigating Conflicting Views: Harnessing Trust for Learning
Authors:
Jueqing Lu,
Lan Du,
Wray Buntine,
Myong Chol Jung,
Joanna Dipnall,
Belinda Gabbe
Abstract:
Resolving conflicts is essential to make the decisions of multi-view classification more reliable. Much research has been conducted on learning consistent informative representations among different views, assuming that all views are identically important and strictly aligned. However, real-world multi-view data may not always conform to these assumptions, as some views may express distinct information. To address this issue, we develop a computational trust-based discounting method to enhance the existing trustworthy framework in scenarios where conflicts between different views may arise. Its belief fusion process considers the trustworthiness of predictions made by individual views via an instance-wise probability-sensitive trust discounting mechanism. We evaluate our method on six real-world datasets, using Top-1 Accuracy, AUC-ROC for Uncertainty-Aware Prediction, Fleiss' Kappa, and a new metric called Multi-View Agreement with Ground Truth that takes into consideration the ground truth labels. The experimental results show that computational trust can effectively resolve conflicts, paving the way for more reliable multi-view classification models in real-world applications.
Submitted 2 June, 2024;
originally announced June 2024.
-
Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation
Authors:
Rongyu Zhang,
Aosong Cheng,
Yulin Luo,
Gaole Dai,
Huanrui Yang,
Jiaming Liu,
Ran Xu,
Li Du,
Yuan Du,
Yanbing Jiang,
Shanghang Zhang
Abstract:
Continual Test-Time Adaptation (CTTA), which aims to adapt the pre-trained model to ever-evolving target domains, emerges as an important task for vision models. As current vision models appear to be heavily biased towards texture, continuously adapting the model from one domain distribution to another can result in serious catastrophic forgetting. Drawing inspiration from the human visual system's adeptness at processing both shape and texture according to the famous Trichromatic Theory, we explore the integration of a Mixture-of-Activation-Sparsity-Experts (MoASE) as an adapter for the CTTA task. Given the distinct reaction of neurons with low/high activation to domain-specific/agnostic features, MoASE decomposes the neural activation into high-activation and low-activation components with a non-differentiable Spatial Differentiate Dropout (SDD). Based on the decomposition, we devise a multi-gate structure comprising a Domain-Aware Gate (DAG) that utilizes domain information to adaptively combine experts that process the post-SDD sparse activations of different strengths, and an Activation Sparsity Gate (ASG) that adaptively assigns the feature selection threshold of the SDD for different experts for more precise feature decomposition. Finally, we introduce a Homeostatic-Proximal (HP) loss to bypass the error accumulation problem when continuously adapting the model. Extensive experiments on four prominent benchmarks substantiate that our methodology achieves state-of-the-art performance in both classification and segmentation CTTA tasks. Our code is now available at https://github.com/RoyZry98/MoASE-Pytorch.
Submitted 26 May, 2024;
originally announced May 2024.
-
Fast Asymmetric Factorization for Large Scale Multiple Kernel Clustering
Authors:
Yan Chen,
Liang Du,
Lei Duan
Abstract:
Kernel methods are extensively employed for nonlinear data clustering, yet their effectiveness heavily relies on selecting suitable kernels and associated parameters, posing challenges in advance determination. In response, Multiple Kernel Clustering (MKC) has emerged as a solution, allowing the fusion of information from multiple base kernels for clustering. However, both early fusion and late fusion methods for large-scale MKC encounter challenges in memory and time constraints, necessitating simultaneous optimization of both aspects. To address this issue, we propose Efficient Multiple Kernel Concept Factorization (EMKCF), which constructs a new sparse kernel matrix inspired by local regression to achieve memory efficiency. EMKCF learns consensus and individual representations by extending orthogonal concept factorization to handle multiple kernels for time efficiency. Experimental results demonstrate the efficiency and effectiveness of EMKCF on benchmark datasets compared to state-of-the-art methods. The proposed method offers a straightforward, scalable, and effective solution for large-scale MKC tasks.
Submitted 26 May, 2024;
originally announced May 2024.
-
Enhancing Near OOD Detection in Prompt Learning: Maximum Gains, Minimal Costs
Authors:
Myong Chol Jung,
He Zhao,
Joanna Dipnall,
Belinda Gabbe,
Lan Du
Abstract:
Prompt learning has been shown to be an efficient and effective fine-tuning method for vision-language models like CLIP. While numerous studies have focused on the generalisation of these models in few-shot classification, their capability in near out-of-distribution (OOD) detection has been overlooked. A few recent works have highlighted the promising performance of prompt learning in far OOD detection. However, the more challenging task of few-shot near OOD detection has not yet been addressed. In this study, we investigate the near OOD detection capabilities of prompt learning models and observe that commonly used OOD scores have limited performance in near OOD detection. To enhance the performance, we propose a fast and simple post-hoc method that complements existing logit-based scores, improving near OOD detection AUROC by up to 11.67% with minimal computational cost. Our method can be easily applied to any prompt learning model without changing the architecture or re-training the models. Comprehensive empirical evaluations across 13 datasets and 8 models demonstrate the effectiveness and adaptability of our method.
Submitted 25 May, 2024;
originally announced May 2024.
-
Coherent feedback control for cavity optomechanical systems with a frequency-dependent mirror
Authors:
Lei Du,
Juliette Monsel,
Witlef Wieczorek,
Janine Splettstoesser
Abstract:
Ground-state cooling of mechanical resonators is a prerequisite for the observation of various quantum effects in optomechanical systems and thus has always been a crucial task in quantum optomechanics. In this paper, we study how to realize ground-state cooling of the mechanical mode in a Fano-mirror optomechanical setup, which allows for enhanced effective optomechanical interaction but typically works in the (deeply) unresolved-sideband regime. We reveal that for such a two-sided cavity geometry with very different decay rates at the two cavity mirrors when using an appropriate single-sided coherent feedback, it is possible to cool the mechanical mode down to its ground state within a broad range of parameters. This is possible even if the total optical loss is more than seven orders of magnitude larger than the mechanical frequency and the feedback efficiency is relatively low. Importantly, we show that a more standard two-sided feedback scheme is not appropriate to cooperate with a Fano-mirror system.
Submitted 22 May, 2024;
originally announced May 2024.
-
Medical Dialogue: A Survey of Categories, Methods, Evaluation and Challenges
Authors:
Xiaoming Shi,
Zeming Liu,
Li Du,
Yuxuan Wang,
Hongru Wang,
Yuhang Guo,
Tong Ruan,
Jie Xu,
Shaoting Zhang
Abstract:
This paper surveys and organizes research works on medical dialogue systems, which is an important yet challenging task. Although these systems have been surveyed in the medical community from an application perspective, a systematic review from a rigorous technical perspective has to date remained noticeably absent. As a result, an overview of the categories, methods, and evaluation of medical dialogue systems remains limited and underspecified, hindering the further improvement of this area. To fill this gap, we investigate an initial pool of 325 papers from well-known computer science and natural language processing conferences and journals, and provide an overview. Recently, large language models have shown strong model capacity on downstream tasks, which has also reshaped the foundation of medical dialogue systems. Despite the alluring practical application value, current medical dialogue systems still suffer from problems. To this end, this paper lists the grand challenges of medical dialogue systems, especially those of large language models.
Submitted 17 May, 2024;
originally announced May 2024.
-
Circular Photocurrents in Centrosymmetric Semiconductors with Hidden Spin Polarization
Authors:
Kexin Wang,
Butian Zhang,
Chengyu Yan,
Luojun Du,
Shun Wang
Abstract:
Centrosymmetric materials with site inversion asymmetries possess hidden spin polarization, which remains challenging to convert into spin currents because the global inversion symmetry is still conserved. This study demonstrates spin-polarized DC circular photocurrents (CPC) in centrosymmetric transition metal dichalcogenides (TMDCs) at normal incidence without applying electric bias. The global inversion symmetry is broken by using a spatially varying circularly polarized light beam, which can generate a spin gradient owing to the hidden spin polarization. The dependences of the CPC on electrode configuration, illumination position, and beam spot size indicate the emergence of a circulating electric current under spatially inhomogeneous light, which is associated with the deflection of spin-polarized current through the inverse spin Hall effect (ISHE). The CPC is subsequently utilized to probe the spin polarization and ISHE under different excitation wavelengths and temperatures. The results of this study demonstrate the feasibility of using centrosymmetric materials with hidden spin polarization and non-vanishing Berry curvature for spintronic device applications.
Submitted 20 May, 2024; v1 submitted 19 April, 2024;
originally announced April 2024.
-
NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results
Authors:
Xin Li,
Kun Yuan,
Yajing Pei,
Yiting Lu,
Ming Sun,
Chao Zhou,
Zhibo Chen,
Radu Timofte,
Wei Sun,
Haoning Wu,
Zicheng Zhang,
Jun Jia,
Zhichao Zhang,
Linhan Cao,
Qiubo Chen,
Xiongkuo Min,
Weisi Lin,
Guangtao Zhai,
Jianhui Sun,
Tianyi Wang,
Lei Li,
Han Kong,
Wenxuan Wang,
Bing Li,
Cheng Luo
, et al. (43 additional authors not shown)
Abstract:
This paper reviews the NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions were submitted and evaluated on the collected dataset KVQ from a popular short-form video platform, i.e., the Kuaishou/Kwai platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The purpose is to build new benchmarks and advance the development of S-UGC VQA. The competition had 200 participants, and 13 teams submitted valid solutions for the final testing phase. The proposed solutions achieved state-of-the-art performance for S-UGC VQA. The project can be found at https://github.com/lixinustc/KVQChallenge-CVPR-NTIRE2024.
Submitted 17 April, 2024;
originally announced April 2024.
-
Nonlinear chiral quantum optics with giant-emitter pairs
Authors:
Xin Wang,
Jia-Qi Li,
Zhihai Wang,
Anton Frisk Kockum,
Lei Du,
Tao Liu,
Franco Nori
Abstract:
We propose a setup which combines giant emitters (coupling to light at multiple points separated by wavelength distances) with nonlinear quantum optics and its correlated photons. In this setup, we reveal a mechanism for multiphoton chiral emission: the propagation phase of the center of mass of two strongly correlated photons (a doublon), and the phases encoded in the coupling points of two giant emitters, can yield completely destructive interference in one propagation direction while supporting emission in the other direction. The degree of chirality can be tuned by the phases of the couplings. We show that the proposed setup can provide directional quantum many-body resources, and can be configured as a building block for a chiral quantum network with ``correlated flying qubits'', enabling distinct applications beyond linear chiral setups. Our findings point toward a rich landscape of tailoring multiphoton propagation and correlation properties by exploiting interference effects of giant emitters coupling to nonlinear photonic baths.
Submitted 15 April, 2024;
originally announced April 2024.
-
Intuition-aware Mixture-of-Rank-1-Experts for Parameter Efficient Finetuning
Authors:
Yijiang Liu,
Rongyu Zhang,
Huanrui Yang,
Kurt Keutzer,
Yuan Du,
Li Du,
Shanghang Zhang
Abstract:
Large Language Models (LLMs) have demonstrated significant potential in performing multiple tasks in multimedia applications, ranging from content generation to interactive entertainment and artistic creation. However, the diversity of downstream tasks in multitask scenarios presents substantial adaptation challenges for LLMs. While traditional methods often succumb to knowledge confusion on their monolithic dense models, Mixture-of-Experts (MoE) has emerged as a promising solution with its sparse architecture for effective task decoupling. Inspired by the principles of human cognitive neuroscience, we design a novel framework \texttt{Intuition-MoR1E} that leverages the inherent semantic clustering of instances to mimic how the human brain deals with multiple tasks, offering implicit guidance to the router for optimized feature allocation. Moreover, we introduce a cutting-edge Rank-1 Experts formulation designed to manage a spectrum of intuitions, demonstrating enhanced parameter efficiency and effectiveness in multitask LLM finetuning. Extensive experiments demonstrate that Intuition-MoR1E achieves superior efficiency and a 2.15\% overall accuracy improvement across 14 public datasets against other state-of-the-art baselines.
Submitted 13 April, 2024;
originally announced April 2024.
-
Federated Distillation: A Survey
Authors:
Lin Li,
Jianping Gou,
Baosheng Yu,
Lan Du,
Zhang Yi,
Dacheng Tao
Abstract:
Federated Learning (FL) seeks to train a model collaboratively without sharing private training data from individual clients. Despite its promise, FL encounters challenges such as high communication costs for large-scale models and the necessity for uniform model architectures across all clients and the server. These challenges severely restrict the practical applications of FL. To address these limitations, the integration of knowledge distillation (KD) into FL has been proposed, forming what is known as Federated Distillation (FD). FD enables more flexible knowledge transfer between clients and the server, surpassing the mere sharing of model parameters. By eliminating the need for identical model architectures across clients and the server, FD mitigates the communication costs associated with training large-scale models. This paper aims to offer a comprehensive overview of FD, highlighting its latest advancements. It delves into the fundamental principles underlying the design of FD frameworks, delineates FD approaches for tackling various challenges, and provides insights into the diverse applications of FD across different scenarios.
Submitted 1 April, 2024;
originally announced April 2024.
-
Accel-NASBench: Sustainable Benchmarking for Accelerator-Aware NAS
Authors:
Afzal Ahmad,
Linfeng Du,
Zhiyao Xie,
Wei Zhang
Abstract:
One of the primary challenges impeding the progress of Neural Architecture Search (NAS) is its extensive reliance on exorbitant computational resources. NAS benchmarks aim to simulate runs of NAS experiments at zero cost, remediating the need for extensive compute. However, existing NAS benchmarks use synthetic datasets and model proxies that make simplified assumptions about the characteristics of these datasets and models, leading to unrealistic evaluations. We present a technique that allows searching for training proxies that reduce the cost of benchmark construction by significant margins, making it possible to construct realistic NAS benchmarks for large-scale datasets. Using this technique, we construct an open-source bi-objective NAS benchmark for the ImageNet2012 dataset combined with the on-device performance of accelerators, including GPUs, TPUs, and FPGAs. Through extensive experimentation with various NAS optimizers and hardware platforms, we show that the benchmark is accurate and allows searching for state-of-the-art hardware-aware models at zero cost.
Submitted 18 June, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Towards Generalizable and Faithful Logic Reasoning over Natural Language via Resolution Refutation
Authors:
Zhouhao Sun,
Xiao Ding,
Li Du,
Bibo Cai,
Jinglong Gao,
Ting Liu,
Qin Bing
Abstract:
Large language models (LLMs) have achieved significant performance in various natural language reasoning tasks. However, they still struggle with performing first-order logic reasoning over formal logical theories expressed in natural language. This is because previous LLM-based reasoning systems suffer from theoretical incompleteness. As a result, they can only address a limited set of simple reasoning problems, which significantly decreases their generalization ability. To address this issue, we propose a novel framework, named Generalizable and Faithful Reasoner (GFaiR), which introduces the paradigm of resolution refutation. Resolution refutation has the capability to solve all first-order logic reasoning problems by extending reasoning rules and employing the principle of proof by contradiction, so our system's completeness can be improved by introducing resolution refutation. Experimental results demonstrate that our system outperforms previous works by achieving state-of-the-art performance in complex scenarios while maintaining performance in simple scenarios. In addition, we observe that GFaiR is faithful to its reasoning process.
Submitted 3 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
A Refinement of a Theorem of Diaconis-Evans-Graham
Authors:
Lora R. Du,
Kathy Q. Ji
Abstract:
The note is dedicated to refining a theorem by Diaconis, Evans, and Graham concerning successions and fixed points of permutations. This refinement specifically addresses non-adjacent successions, predecessors, excedances, and drops of permutations.
Submitted 31 March, 2024;
originally announced April 2024.
-
Supervisory Prompt Training
Authors:
Jean Ghislain Billa,
Min Oh,
Liang Du
Abstract:
The performance of Large Language Models (LLMs) relies heavily on the quality of prompts, which are often manually engineered and task-specific, making them costly and non-scalable. We propose a novel approach, Supervisory Prompt Training (SPT). SPT automates the generation of highly effective prompts using a dual LLM system. In this system, one LLM, the generator, performs a task while the other, the corrector, provides feedback and generates improved prompts. In contrast to earlier techniques, both the generator and corrector collaboratively and continuously improve their prompts over time. We also introduce the concept of \textit{impact scores} to measure the sentence-level effectiveness of the prompts. Our method was tested on four benchmarks that measure the level of hallucination in LLMs. Notably, we were able to increase the accuracy of GPT-4 on GSM8K from 65.8\% to 94.1\% (an increase of 28.3 percentage points). SPT advances LLMs by refining prompts to enhance performance and reduce hallucinations, offering an efficient and scalable alternative to traditional model fine-tuning.
Submitted 26 March, 2024;
originally announced March 2024.
-
Singular profile of free boundary of incompressible inviscid fluid with external force
Authors:
Lili Du,
Yang Pu,
Jing Yang
Abstract:
This article is devoted to investigating the singular profile of the free boundary of a two-dimensional incompressible inviscid fluid with external force near the stagnation point. More precisely, given an external force with some polynomial-type decay close to the stagnation point, the possible singular profiles of the free boundary at the stagnation point are a corner wave, a flat singularity, and a cusp singularity. By excluding the cusp and flat singularities, we conclude that the only singular profile is the corner wave singularity, and the corner depends on the decay rate of the solution near the stagnation point. The analysis relies on a geometric method for a class of Bernoulli-type free boundary problems with a given degenerate gradient function on the free boundary. This work is motivated by the significant work [E. Vărvărucă and G. Weiss, Acta Math, 206, 363-403, (2011)] on the Stokes conjecture for the incompressible inviscid fluid acted on by gravity.
Submitted 20 June, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
Gaia23ckh: Symbiotic outburst of the assumed Mira variable V390 Sco
Authors:
Jaroslav Merc,
Peter Velez,
Stéphane Charbonnel,
Olivier Garde,
Pascal Le Dû,
Lionel Mulato,
Thomas Petit,
Jan Skowron
Abstract:
The poorly studied variable star V390 Sco, previously classified as a Mira pulsator, was detected in a brightening event by the ESA Gaia satellite in September 2023. This work presents an analysis of available archival multifrequency photometric data of this target, along with our spectroscopic observations. Our findings lead to the conclusion that V390 Sco is a new symbiotic star identified by Gaia, currently undergoing a classical symbiotic outburst. Additionally, we uncovered three prior outbursts of this system through archival photometry. The outbursts recur approximately every 2330 - 2400 days, and we hypothesize the periastron passage in an eccentric orbit may trigger them, similarly to the case of BX Mon, DD Mic, or MWC 560. A detailed investigation into the nature of the donor star suggested that V390 Sco is an S-type symbiotic star, likely hosting a less evolved, semiregularly pulsating giant donor, but not a Mira variable.
Submitted 21 March, 2024;
originally announced March 2024.
-
Almost Optically Dark Galaxies in DECaLS (I): Detection, Optical Properties and Possible Origins
Authors:
Lin Du,
Wei Du,
Cheng Cheng,
Ming Zhu,
Haiyang Yu,
Hong Wu
Abstract:
We report the discovery of eight optical counterparts of ALFALFA extragalactic objects from DECaLS, five of which are discovered for the first time. These objects were flagged as HI emission sources with no optical counterparts in SDSS before. Multi-band data reveal their unusual physical properties. They are faint and blue ($g-r=-0.35\sim0.55$), with quite low surface brightness ($μ_{\rm g,peak}=24.88\sim26.41\,{\rm mag}/{\rm arcsec}^2$), irregular morphologies, low stellar masses ($log_{10}(M_{*}/M_\odot)=5.27\sim7.15$), low star formation rates ($SFR=0.21\sim9.24\times10^{-3}\,{M_\odot}\,{\rm yr}^{-1}$), and remarkably high HI-to-stellar mass ratios ($log_{10}(M_{\rm HI}/M_{*}) = 1.72\sim3.22$, except AGC\,215415). They deviate from the scaling relations between HI and optical properties defined by the ALFALFA sample and the baryonic Tully-Fisher relation. They agree well with the main sequence of star-forming galaxies but exhibit low star-forming efficiency. Based on their physical properties and environments, we speculate that six of these objects may have originated from tidal processes, while the remaining two appear to have isolated origins. They may have had a relatively calm evolutionary history and only begun to form stars recently.
Submitted 18 March, 2024;
originally announced March 2024.
-
PMCV hypersurfaces in non-flat pseudo-Riemannian space forms
Authors:
Chao Yang,
Jiancheng Liu,
Li Du
Abstract:
In this paper, we prove that PMCV (i.e. Δ\vec{H} is proportional to \vec{H}) hypersurface M^n_r of a non-flat pseudo-Riemannian space form N^{n+1}_s(c) with at most two distinct principal curvatures is minimal or locally isoparametric, and compute the mean curvature for the isoparametric ones. As an application, we give full classification results of such non-minimal Lorentzian hypersurfaces of non-flat Lorentz space forms.
Submitted 12 March, 2024;
originally announced March 2024.
-
Cohomologies and deformations of differential algebra morphisms
Authors:
Lei Du,
Yanhong Bao
Abstract:
This paper studies the formal deformations of differential algebra morphisms. To this end, we develop a cohomology theory of differential algebra morphisms and interpret the lower-degree cohomology groups in terms of formal deformations. Then, we prove the Cohomology Comparison Theorem of differential algebra morphisms, i.e., the cohomology of a morphism of differential algebras is isomorphic to the cohomology of an auxiliary differential algebra. Finally, we give a minimal model for morphisms of differential algebras of weight 0.
Submitted 12 March, 2024;
originally announced March 2024.
-
Regularity of the free boundary for a semilinear vector-valued minimization problem
Authors:
L. L. Du,
Y. Zhou
Abstract:
In this paper, we consider the following semilinear vector-valued minimization problem
$$\min\left\{\int_{D}({|\nabla\mathbf{u}|}^2 + F(|\mathbf{u}|))dx: \ \ \mathbf{u}\in W^{1,2}(D; \mathbb{R}^m) \ \text{and} \ \mathbf{u}=\mathbf{g}\ \text{on} \ \partial D\right\},$$ where $\mathbf{u}: D\to \mathbb{R}^m$ ($ m\geq 1$) is a vector-valued function, $D\subset \mathbb{R}^n$ ($n\geq 2$) is a bounded Lipschitz domain, $\mathbf{g}\in W^{1,2}(D; \mathbb{R}^m)$ is a given vector-valued function and $F:[0, \infty)\rightarrow \mathbb{R}$ is a given function. This minimization problem corresponds to the following semilinear elliptic system
\begin{equation*}
Δ\mathbf{u}=\frac{1}{2}F'(|\mathbf{u}|)\cdot\frac{\mathbf{u}}{|\mathbf{u}|}χ_{\{|\mathbf{u}|>0\}},
\end{equation*}
where $χ_A$ denotes the characteristic function of the set $A$. The linear case $F'\equiv 2$ was studied in the previous elegant work by Andersson, Shahgholian, Uraltseva and Weiss [Adv. Math 280, 2015], in which an epiperimetric inequality played a crucial role in obtaining an energy decay estimate and the uniqueness of the blow-up limit. However, this epiperimetric inequality cannot be directly applied to our case, because the more general non-degenerate and non-homogeneous term $F$ means that Weiss' boundary adjusted energy does not have scaling properties. Motivated by the linear case, when $F$ satisfies some assumptions, we establish a new epiperimetric inequality that can deal with the term in Weiss' boundary adjusted energy which is not scaling invariant. As an application of this new epiperimetric inequality, we conclude that the free boundary $D\cap \partial\{|\mathbf{u}|>0\}$ is a locally $C^{1,β}$ surface near the regular points for some $β\in (0,1)$.
Submitted 4 March, 2024;
originally announced March 2024.
-
Regularity of the free boundaries for the two-phase axisymmetric inviscid fluid
Authors:
Lili Du,
Feng Ji
Abstract:
In the seminal paper (Alt, Caffarelli and Friedman, Trans. Amer. Math. Soc., 282, (1984).), the regularity of the free boundary of two-phase fluid in two dimensions via the so-called ACF energy functional was investigated. The $C^1$ regularity of the free boundaries was shown, and it was asserted that the two free boundaries coincide under some additional assumptions. Later on, the standard technique of the Harnack inequality could be applied to improve the regularity to $C^{1,η}$. A recent significant breakthrough in the regularity of two-phase fluid is due to De Philippis, Spolaor and Velichkov, who investigated the free boundary of the two-phase fluid with the two-phase functional (De Philippis, Spolaor and Velichkov, Invent. Math., 225, (2021).), and the $C^{1,η}$ regularity of the whole free boundaries was given in dimension two. Moreover, the free boundaries of the two-phase fluids do not coincide and the zero level set may possess positive Lebesgue measure. In this paper, we consider the free boundaries for the two-phase axisymmetric fluid and show that the free boundary is $C^{1,η}$ smooth. The Lebesgue measure of the zero level set may also be positive, and the main difference lies in the degenerate elliptic operator and the free boundary conditions. More precisely, we use partial boundary Harnack inequalities and establish a linearized problem, the regularity of whose solutions implies the flatness decay of the two-phase free boundaries. Then an iteration argument gives the smoothness of the free boundaries.
Submitted 24 May, 2024; v1 submitted 1 March, 2024;
originally announced March 2024.
-
Deciphering the Impact of Pretraining Data on Large Language Models through Machine Unlearning
Authors:
Yang Zhao,
Li Du,
Xiao Ding,
Kai Xiong,
Zhouhao Sun,
Jun Shi,
Ting Liu,
Bing Qin
Abstract:
Through pretraining on a corpus with various sources, Large Language Models (LLMs) have gained impressive performance. However, the impact of each component of the pretraining corpus remains opaque. As a result, the organization of the pretraining corpus is still empirical and may deviate from the optimal. To address this issue, we systematically analyze the impact of 48 datasets from 5 major categories of pretraining data of LLMs and measure their impacts on LLMs using benchmarks about nine major categories of model capabilities. Our analyses provide empirical results about the contribution of multiple corpora on the performances of LLMs, along with their joint impact patterns, including complementary, orthogonal, and correlational relationships. We also identify a set of ``high-impact data'' such as Books that is significantly related to a set of model capabilities. These findings provide insights into the organization of data to support more efficient pretraining of LLMs.
Submitted 26 March, 2024; v1 submitted 18 February, 2024;
originally announced February 2024.
-
The QCD phase diagram and Beam Energy Scan physics: a theory overview
Authors:
Lipei Du,
Agnieszka Sorensen,
Mikhail Stephanov
Abstract:
We review recent theoretical developments relevant to heavy-ion experiments carried out within the Beam Energy Scan program at the Relativistic Heavy Ion Collider. Our main focus is on the description of the dynamics of systems created in heavy-ion collisions and establishing the necessary connection between the experimental observables and the QCD phase diagram.
Submitted 6 May, 2024; v1 submitted 15 February, 2024;
originally announced February 2024.
-
EmoWear: Exploring Emotional Teasers for Voice Message Interaction on Smartwatches
Authors:
Pengcheng An,
Jiawen Zhu,
Zibo Zhang,
Yifei Yin,
Qingyuan Ma,
Che Yan,
Linghao Du,
Jian Zhao
Abstract:
Voice messages, by nature, prevent users from gauging the emotional tone without fully diving into the audio content. This hinders the shared emotional experience at the pre-retrieval stage. Research has scarcely explored "Emotional Teasers": pre-retrieval cues offering a glimpse into an awaiting message's emotional tone without disclosing its content. We introduce EmoWear, a smartwatch voice messaging system enabling users to apply 30 animation teasers on message bubbles to reflect emotions. EmoWear eases senders' choice by prioritizing emotions based on semantic and acoustic processing. EmoWear was evaluated in comparison with a mirroring system using color-coded message bubbles as emotional cues (N=24). Results showed EmoWear significantly enhanced emotional communication experience in both receiving and sending messages. The animated teasers were considered intuitive and valued for diverse expressions. Desirable interaction qualities and practical implications are distilled for future design. We thereby contribute both a novel system and empirical knowledge concerning emotional teasers for voice messaging.
Submitted 11 February, 2024;
originally announced February 2024.
-
An Examination on the Effectiveness of Divide-and-Conquer Prompting in Large Language Models
Authors:
Yizhou Zhang,
Lun Du,
Defu Cao,
Qiang Fu,
Yan Liu
Abstract:
Foundation models, such as Large Language Models (LLMs), have attracted a significant amount of interest due to their large number of applications. However, when handling tasks involving repetitive sub-tasks and/or deceptive content, such as arithmetic calculation and article-level fake news detection, simple instructional prompts suffer from inaccurate responses. Existing works show that more complicated prompting strategies, such as Chain-of-Thought and Least-to-Most, can unlock LLMs' powerful capacity in diverse areas. Recent research reveals that a simple divide-and-conquer prompting strategy, i.e., simply dividing the input sequence into multiple sub-inputs, can also substantially improve LLMs' performance in some specific tasks such as misinformation detection. In this paper, we aim to examine the utility of the divide-and-conquer (DaC) prompting strategy and to determine on which kinds of tasks this strategy is advantageous. Specifically, we provide a theoretical analysis of the DaC prompting strategy that helps us identify the specific tasks where DaC prompting can bring a performance boost with a theoretical guarantee. We then present two cases (large integer arithmetic and fact verification) where experimental results align with our theoretical analysis.
Submitted 2 July, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
SUB-PLAY: Adversarial Policies against Partially Observed Multi-Agent Reinforcement Learning Systems
Authors:
Oubo Ma,
Yuwen Pu,
Linkang Du,
Yang Dai,
Ruo Wang,
Xiaolei Liu,
Yingcai Wu,
Shouling Ji
Abstract:
Recent advancements in multi-agent reinforcement learning (MARL) have opened up vast application prospects, such as swarm control of drones, collaborative manipulation by robotic arms, and multi-target encirclement. However, potential security threats during MARL deployment need more attention and thorough investigation. Recent research reveals that attackers can rapidly exploit the victim's vulnerabilities, generating adversarial policies that result in the failure of specific tasks, for instance, reducing the winning rate of a superhuman-level Go AI to around 20%. Existing studies predominantly focus on two-player competitive environments, assuming attackers possess complete global state observation.
In this study, we unveil, for the first time, the capability of attackers to generate adversarial policies even when restricted to partial observations of the victims in multi-agent competitive environments. Specifically, we propose a novel black-box attack (SUB-PLAY) that incorporates the concept of constructing multiple subgames to mitigate the impact of partial observability and suggests sharing transitions among subpolicies to improve attackers' exploitative ability. Extensive evaluations demonstrate the effectiveness of SUB-PLAY under three typical partial observability limitations. Visualization results indicate that adversarial policies induce significantly different activations of the victims' policy networks. Furthermore, we evaluate three potential defenses aimed at exploring ways to mitigate security threats posed by adversarial policies, providing constructive recommendations for deploying MARL in competitive environments.
Submitted 26 June, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Proximity QA: Unleashing the Power of Multi-Modal Large Language Models for Spatial Proximity Analysis
Authors:
Jianing Li,
Xi Nan,
Ming Lu,
Li Du,
Shanghang Zhang
Abstract:
Multi-modal large language models (MLLMs) have demonstrated remarkable vision-language capabilities, primarily due to the exceptional in-context understanding and multi-task learning strengths of large language models (LLMs). The advent of visual instruction tuning has further enhanced MLLMs' performance in vision-language understanding. However, while existing MLLMs adeptly recognize what objects are in an image, they still face challenges in effectively discerning where these objects are, particularly along the distance (scene depth) axis. To overcome this limitation in MLLMs, we introduce Proximity Question Answering (Proximity QA), a novel framework designed to enable MLLMs to infer the proximity relationship between objects in images. The framework operates in two phases: the first phase focuses on guiding the models to understand the relative depth of objects, and the second phase further encourages the models to infer the proximity relationships between objects based on their depth perceptions. We also propose a VQA dataset called Proximity-110K, containing additional instructions that incorporate depth information and the proximity relationships of objects. We have conducted extensive experiments to validate Proximity QA's superior ability in depth perception and proximity analysis, outperforming other state-of-the-art MLLMs. Code and dataset will be released at https://github.com/NorthSummer/ProximityQA.git.
Submitted 31 January, 2024;
originally announced January 2024.
-
Photon-triggered jets as probes of multi-stage jet modification
Authors:
C. Sirimanna,
Y. Tachibana,
A. Angerami,
R. Arora,
S. A. Bass,
S. Cao,
Y. Chen,
L. Du,
R. Ehlers,
H. Elfner,
W. Fan,
R. J. Fries,
C. Gale,
Y. He,
U. Heinz,
B. V. Jacak,
P. M. Jacobs,
S. Jeon,
Y. Ji,
L. Kasper,
M. Kordell II,
A. Kumar,
R. Kunnawalkam-Elayavalli,
J. Latessa,
S. Lee
, et al. (28 additional authors not shown)
Abstract:
Prompt photons are created in the early stages of heavy ion collisions and traverse the QGP medium without any interaction. Therefore, photon-triggered jets can be used to study the jet quenching in the QGP medium. In this work, photon-triggered jets are studied through different jet and jet substructure observables for different collision systems and energies using the JETSCAPE framework. Since the multistage evolution used in the JETSCAPE framework is adequate to describe a wide range of experimental observables simultaneously using the same parameter tune, we use the same parameters tuned for jet and leading hadron studies. The same isolation criteria used in the experimental analysis are used to identify prompt photons for better comparison. For the first time, high-accuracy JETSCAPE results are compared with multi-energy LHC and RHIC measurements to better understand the deviations observed in prior studies. This study highlights the importance of multistage evolution for the simultaneous description of experimental observables through different collision systems and energies using a single parameter tune.
Submitted 30 January, 2024;
originally announced January 2024.
-
Sliding ferroelectric memories and synapses
Authors:
Xiuzhen Li,
Biao Qin,
Yaxian Wang,
Yue Xi,
Zhiheng Huang,
Mengze Zhao,
Yalin Peng,
Zitao Chen,
Zitian Pan,
Jundong Zhu,
Chenyang Cui,
Rong Yang,
Wei Yang,
Sheng Meng,
Dongxia Shi,
Xuedong Bai,
Can Liu,
Na Li,
Jianshi Tang,
Kaihui Liu,
Luojun Du,
Guangyu Zhang
Abstract:
Ferroelectric materials with switchable electric polarization hold great promise for a plethora of emergent applications, such as post-Moore's law nanoelectronics, beyond-Boltzmann transistors, non-volatile memories, and above-bandgap photovoltaic devices. Recent advances have uncovered an exotic sliding ferroelectric mechanism, which enables the design of atomically thin ferroelectrics from non-ferroelectric parent monolayers. Although notable progress has been witnessed in understanding its fundamental properties, functional devices based on sliding ferroelectrics, the key touchstone toward applications, remain elusive. Here, we demonstrate rewritable, non-volatile memory devices at room temperature utilizing a two-dimensional (2D) sliding ferroelectric semiconductor of rhombohedral-stacked bilayer molybdenum disulfide. The 2D sliding ferroelectric memories (SFeMs) show superior performance with a large memory window of >8 V, a high conductance ratio of above $10^6$, a long retention time of >10 years, and a programming endurance greater than $10^4$ cycles. Remarkably, flexible SFeMs are achieved with state-of-the-art performance competitive with their rigid counterparts and maintain their performance after more than $10^3$ bending cycles. Furthermore, synapse-specific Hebbian forms of plasticity and image recognition with a high accuracy of 97.81% are demonstrated based on flexible SFeMs. Our work demonstrates sliding ferroelectric memories and synaptic plasticity on both rigid and flexible substrates, highlighting the great potential of sliding ferroelectrics for emerging technological applications in brain-inspired in-memory computing, edge intelligence and energy-efficient wearable electronics.
Submitted 29 January, 2024;
originally announced January 2024.
-
Solutions to the First Order Difference Equations in the Multivariate Difference Field
Authors:
Lixin Du,
Yarong Wei
Abstract:
The bivariate difference field provides an algebraic framework for a sequence satisfying a recurrence of order two. Based on this, we focus on sequences satisfying a recurrence of higher order, and consider the multivariate difference field, in which the summation problem can be transformed into solving first-order difference equations. We then give a criterion for deciding whether such a difference equation has a rational solution and present an algorithm for computing one rational solution, if it exists. Moreover, we obtain the set of all rational solutions of such an equation.
Submitted 24 January, 2024;
originally announced January 2024.
-
Two-dimensional ferromagnetic semiconductor Cr2XP: First-principles calculations and Monte Carlo simulations
Authors:
Xiao-Ping Wei,
Lan-Lan Du,
Jiang-Liu Meng,
Xiaoma Tao
Abstract:
According to the Mermin-Wagner theorem, it is difficult for a two-dimensional material to have a Curie temperature above room temperature. By using the method of band engineering, we design a promising two-dimensional ferromagnetic semiconductor Cr2XP (X=P, As, Sb) with large magnetization, high Curie temperature and sizable band gap. The formation of the gap is discussed in terms of the hybridization, occupation and distribution of electronic states and charge transfer. Large magnetic moments of about 6.16-6.37 $\mu_B$ originate from the occupation of Cr-d electrons in the crystal field. Competition and cooperation between d-d (Cr-d~Cr-d) and d-p-d (Cr-d~X-p~Cr-d) exchange interactions lead to the emergence of a ferromagnetically ordered phase. Furthermore, Curie temperatures approaching 269 K, 332 K and 400 K for Cr2P2, Cr2AsP and Cr2SbP, respectively, are estimated by employing Monte Carlo simulation based on the Heisenberg model. The magnetic anisotropy energy of Cr2XP is determined by calculating the dependence of the total energy on the angle along different directions, and its origin is also discussed using second-order perturbation theory. In addition, Cr2XP possesses excellent thermodynamic, dynamical and mechanical stability, and can overcome its own gravity to keep its planar structure without the support of a substrate. These advantages offer valuable hints for the use of the two-dimensional ferromagnetic semiconductor Cr2XP in spintronic devices.
Submitted 24 January, 2024;
originally announced January 2024.
-
Ultra-broadband near-field Josephson microwave microscopy
Authors:
Ping Zhang,
Yang-Yang Lyu,
Jingjing Lv,
Zihan Wei,
Shixian Chen,
Chenguang Wang,
Hongmei Du,
Dingding Li,
Zixi Wang,
Shoucheng Hou,
Runfeng Su,
Hancong Sun,
Yuan Du,
Li Du,
Liming Gao,
Yong-Lei Wang,
Huabing Wang,
Peiheng Wu
Abstract:
Advanced microwave technologies constitute the foundation of a wide range of modern sciences, including quantum computing, microwave photonics, spintronics, etc. To facilitate the design of chip-based microwave devices, there is an increasing demand for state-of-the-art microscopic techniques capable of characterizing the near-field microwave distribution and performance. In this work, we integrate Josephson junctions onto a nano-sized quartz tip, forming a highly sensitive microwave mixer on-tip. This allows us to conduct spectroscopic imaging of near-field microwave distributions with high spatial resolution. Leveraging its microwave-sensitive characteristics, our Josephson microscope achieves a broad detecting bandwidth of up to 200 GHz with remarkable frequency and intensity sensitivities. Our work emphasizes the benefits of utilizing the Josephson microscope as a real-time, non-destructive technique to advance integrated microwave electronics.
Submitted 23 January, 2024;
originally announced January 2024.
-
Interplay of Landau quantization and interminivalley scatterings in a weakly coupled moiré superlattice
Authors:
Yalong Yuan,
Le Liu,
Jundong Zhu,
Jingwei Dong,
Yanbang Chu,
Fanfan Wu,
Luojun Du,
Kenji Watanabe,
Takashi Taniguchi,
Dongxia Shi,
Guangyu Zhang,
Wei Yang
Abstract:
Double-layer quantum systems are promising platforms for realizing novel quantum phases. Here, we report a study of quantum oscillations (QOs) in a weakly coupled double-layer system, composed of a large-angle twisted double bilayer graphene (TDBG). We observe two different QOs at low temperature: one with a periodicity in carrier density (n), i.e., Shubnikov-de Haas oscillation (SdHO) due to Landau quantization, and the other in displacement field (D), resulting in a grid pattern. We quantify the interlayer coupling strength by measuring the interlayer capacitance from the grid pattern with a capacitance model, revealing an electron-hole asymmetry. At high temperature, when SdHOs are thermally smeared, we observe resistance peaks when Landau levels (LLs) from two minivalleys in the moiré Brillouin zone are aligned, regardless of carrier density; eventually, this results in a twofold increase of the oscillation frequency in D, serving as smoking-gun evidence of magneto-intersubband oscillations (MISO) in a double-layer system. The temperature dependence of MISO suggests that electron-electron interaction between the two minivalleys plays a crucial role in the scattering, and the scattering times obtained from MISO thermal damping are found to be correlated with the interlayer coupling strength. Our study reveals an intriguing interplay among Landau quantization, moiré band structure, and scattering.
Submitted 22 January, 2024;
originally announced January 2024.
-
VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness
Authors:
Rongyu Zhang,
Zefan Cai,
Huanrui Yang,
Zidong Liu,
Denis Gudovskiy,
Tomoyuki Okuno,
Yohei Nakata,
Kurt Keutzer,
Baobao Chang,
Yuan Du,
Li Du,
Shanghang Zhang
Abstract:
Finetuning a pretrained vision model (PVM) is a common technique for learning downstream vision tasks. However, the conventional finetuning process with randomly sampled data points results in diminished training efficiency. To address this drawback, we propose a novel approach, Vision-language Collaborative Active Finetuning (VeCAF). With the emerging availability of labels and natural language annotations of images through web-scale crawling or controlled generation, VeCAF makes use of this information to perform parametric data selection for PVM finetuning. VeCAF incorporates the finetuning objective to select significant data points that effectively guide the PVM towards faster convergence to meet the performance goal. This process is assisted by the inherent semantic richness of the text embedding space, which we use to augment image features. Furthermore, the flexibility of text-domain augmentation allows VeCAF to handle out-of-distribution scenarios without external data. Extensive experiments show the leading performance and high computational efficiency of VeCAF, which is superior to baselines in both in-distribution and out-of-distribution image classification tasks. On ImageNet, VeCAF uses up to 3.3x fewer training batches to reach the target performance compared to full finetuning, and achieves an accuracy improvement of 2.7% over the state-of-the-art active finetuning method with the same number of batches.
Submitted 13 April, 2024; v1 submitted 15 January, 2024;
originally announced January 2024.
-
A tensor Alternating Anderson-Richardson method for solving multilinear systems with M-tensors
Authors:
Jing Niu,
Lei Du,
Tomohiro Sogabe,
Shao-Liang Zhang
Abstract:
It is well known that a multilinear system with a nonsingular M-tensor and a positive right-hand side has a unique positive solution. Tensor splitting methods generalizing the classical iterative methods for linear systems have been proposed for finding the unique positive solution. The Alternating Anderson-Richardson (AAR) method is an effective method to accelerate the classical iterative methods. In this study, we apply the idea of AAR to find the unique positive solution quickly. We first present a tensor Richardson method based on tensor regular splittings. We then apply Anderson acceleration to the tensor Richardson method and derive a tensor Anderson-Richardson method. Finally, we periodically employ the tensor Anderson-Richardson method within the tensor Richardson method and propose a tensor AAR method. Numerical experiments show that the proposed method is effective in accelerating tensor splitting methods.
Submitted 15 January, 2024;
originally announced January 2024.
-
TAROT: A Hierarchical Framework with Multitask Co-Pretraining on Semi-Structured Data towards Effective Person-Job Fit
Authors:
Yihan Cao,
Xu Chen,
Lun Du,
Hao Chen,
Qiang Fu,
Shi Han,
Yushu Du,
Yanbin Kang,
Guangming Lu,
Zi Li
Abstract:
Person-job fit is an essential part of online recruitment platforms in serving various downstream applications like Job Search and Candidate Recommendation. Recently, pretrained large language models have further enhanced the effectiveness by leveraging richer textual information in user profiles and job descriptions apart from user behavior features and job metadata. However, the general domain-oriented design struggles to capture the unique structural information within user profiles and job descriptions, leading to a loss of latent semantic correlations. We propose TAROT, a hierarchical multitask co-pretraining framework, to better utilize structural and semantic information for informative text embeddings. TAROT targets semi-structured text in profiles and jobs, and it is co-pretrained with multi-grained pretraining tasks to constrain the acquired semantic information at each level. Experiments on a real-world LinkedIn dataset show significant performance improvements, demonstrating its effectiveness in person-job fit tasks.
Submitted 17 January, 2024; v1 submitted 15 January, 2024;
originally announced January 2024.