UniPlane: Unified Plane Detection and Reconstruction from Posed Monocular Videos

Authors: Yuzhong Huang, Chen Liu, Ji Hou, Ke Huo, Shiyu Dong, Fred Morstatter

Abstract: We present UniPlane, a novel method that unifies plane detection and reconstruction from posed monocular videos. Unlike existing methods that detect planes from local observations and associate them across the video for the final reconstruction, UniPlane unifies both the detection and the reconstruction tasks in a single network, which allows us to directly optimize final reconstruction quality and fully leverage temporal information. Specifically, we build a Transformer-based deep neural network that jointly constructs a 3D feature volume for the environment and estimates a set of per-plane embeddings as queries. UniPlane directly reconstructs the 3D planes by taking dot products between voxel embeddings and the plane embeddings followed by binary thresholding. Extensive experiments on real-world datasets demonstrate that UniPlane outperforms state-of-the-art methods in both plane detection and reconstruction tasks, achieving +4.6 in F-score in geometry as well as consistent improvements in other geometry and segmentation metrics.
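The final reconstruction step described in the abstract (dot products between voxel and plane embeddings, then binary thresholding) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the toy 2-D embeddings, and the threshold of 0.0 are all assumptions made for illustration.

```python
# Minimal sketch of "dot product + binary thresholding" mask assignment.
# Hypothetical names and values; the paper's network, embedding size,
# and threshold are not specified in the abstract.

def assign_voxels_to_planes(voxel_embeddings, plane_embeddings, threshold=0.0):
    """Return one boolean mask per plane, with one entry per voxel."""
    masks = []
    for plane in plane_embeddings:
        mask = []
        for voxel in voxel_embeddings:
            # Similarity score between this voxel and this plane query.
            score = sum(v * p for v, p in zip(voxel, plane))
            # Binary thresholding turns the score into plane membership.
            mask.append(score > threshold)
        masks.append(mask)
    return masks


# Toy data: three voxels and two plane queries in a 2-D embedding space.
voxels = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
planes = [[1.0, -1.0], [-1.0, 1.0]]
masks = assign_voxels_to_planes(voxels, planes)
```

In this toy case the first voxel falls on the first plane, the second voxel on the second plane, and the third voxel (score exactly 0 for both queries) on neither.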

Submitted 3 July, 2024; originally announced July 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2206.07710 by other authors

arXiv:2407.02073 [pdf, other]

Contribution Evaluation of Heterogeneous Participants in Federated Learning via Prototypical Representations

Authors: Qi Guo, Minghao Yao, Zhen Tian, Saiyu Qi, Yong Qi, Yun Lin, Jin Song Dong

Abstract: Contribution evaluation in federated learning (FL) has become a pivotal research area due to its applicability across various domains, such as detecting low-quality datasets, enhancing model robustness, and designing incentive mechanisms. Existing contribution evaluation methods, which primarily rely on data volume, model similarity, and auxiliary test datasets, have shown success in diverse scenarios. However, their effectiveness often diminishes due to the heterogeneity of data distributions, presenting a significant challenge to their applicability. In response, this paper explores contribution evaluation in FL from an entirely new perspective of representation. In this work, we propose a new method for the contribution evaluation of heterogeneous participants in federated learning (FLCE), which introduces a novel indicator, class contribution momentum, to conduct refined contribution evaluation. Our core idea is the construction and application of the class contribution momentum indicator from individual, relative, and holistic perspectives, thereby achieving an effective and efficient contribution evaluation of heterogeneous participants without relying on an auxiliary test dataset. Extensive experimental results demonstrate the superiority of our method in terms of fidelity, effectiveness, efficiency, and heterogeneity across various scenarios.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01643 [pdf, other]

A Deep Generative Framework for Joint Households and Individuals Population Synthesis

Authors: Xiao Qian, Utkarsh Gangwal, Shangjia Dong, Rachel Davidson

Abstract: Household and individual-level sociodemographic data are essential for understanding human-infrastructure interaction and policymaking. However, the Public Use Microdata Sample (PUMS) offers only a sample at the state level, while census tract data only provides the marginal distributions of variables without correlations. Therefore, we need an accurate synthetic population dataset that maintains consistent variable correlations observed in microdata, preserves household-individual and individual-individual relationships, adheres to state-level statistics, and accurately represents the geographic distribution of the population. We propose a deep generative framework leveraging the variational autoencoder (VAE) to generate a synthetic population with the aforementioned features. The methodological contributions include (1) a new data structure for capturing household-individual and individual-individual relationships, (2) a transfer learning process with pre-training and fine-tuning steps to generate households and individuals whose aggregated distributions align with the census tract marginal distribution, and (3) decoupled binary cross-entropy (D-BCE) loss function enabling distribution shift and out-of-sample records generation. Model results for an application in Delaware, USA demonstrate the ability to ensure the realism of generated household-individual records and accurately describe population statistics at the census tract level compared to existing methods.
Furthermore, testing in North Carolina, USA yielded promising results, supporting the transferability of our method.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2406.19724 [pdf, ps, other]

Momentum and kinetic energy transport in supersonic particle-laden turbulent boundary layers

Authors: Ming Yu, Yibin Du, Qian Wang, Siwei Dong, Xianxu Yuan

Abstract: In the present study, we conduct direct numerical simulations of two-way force-coupled particle-laden compressible turbulent boundary layers at the free-stream Mach number of 2.0 for the purpose of examining the effects of particles on the transport of momentum and kinetic energy. By analyzing turbulent databases with various particle Stokes numbers and mass loadings, we observe that the presence of particles suppresses turbulent fluctuations and can even laminarize flow under high mass loading conditions. This is reflected by the wider and more coherent near-wall velocity streaks, reduced Reynolds stresses, and diminished contributions to skin friction and turbulent kinetic energy production. Additionally, the particle feedback force becomes more dominant in turbulent production near the wall and at small scales as mass loadings increase, which is found to be caused by the residual velocity fluctuations from particles swept down from the outer region. Furthermore, we identify that particle dissipation, resulting from the relative velocity between the fluid and particles, accounts for less than 1% of mean kinetic energy viscous dissipation and less than 10% of turbulent kinetic energy dissipation in the case with the highest mass loading. This suggests a modest impact on the internal energy variation of the fluid if two-way heat coupling is introduced. The elevated mean temperature is found in the near-wall region and is ascribed to the influence of the particle feedback force and reduced turbulent diffusion in high mass loading cases.

Submitted 28 June, 2024; originally announced June 2024.

Comments: 31 pages, 14 figures

arXiv:2406.18616 [pdf, other]

Towards Large Language Model Aided Program Refinement

Authors: Yufan Cai, Zhe Hou, Xiaokun Luan, David Miguel Sanan Baena, Yun Lin, Jun Sun, Jin Song Dong

Abstract: Program refinement involves correctness-preserving transformations from formal high-level specification statements into executable programs. Traditional verification tool support for program refinement is highly interactive and lacks automation. On the other hand, the emergence of large language models (LLMs) enables automatic code generation from informal natural language specifications. However, code generated by LLMs is often unreliable. Moreover, the opaque procedure from specification to code provided by an LLM is an uncontrolled black box. We propose LLM4PR, a tool that combines formal program refinement techniques with informal LLM-based methods to (1) transform the specification to preconditions and postconditions, (2) automatically build prompts based on refinement calculus, (3) interact with the LLM to generate code, and finally, (4) verify that the generated code satisfies the conditions of refinement calculus, thus guaranteeing the correctness of the code. We have implemented our tool using GPT4, Coq, and Coqhammer, and evaluated it on the HumanEval and EvalPlus datasets.

Submitted 26 June, 2024; originally announced June 2024.

ACM Class: K.6.3

arXiv:2406.14531 [pdf, ps, other]

Roman FFP Revolution: Two, Three, Many Plutos

Authors: Andrew Gould, Jennifer C. Yee, Subo Dong

Abstract: Roman microlensing stands at a crossroads between its originally charted path of cataloging a population of cool planets that has subsequently become well-measured down to super-Earths, and the path of free-floating planets (FFPs), which did not exist when Roman was chosen in 2010, but by now promises revolutionary insights into planet formation and evolution via their possible connection to a spectrum of objects spanning 18 decades in mass. Until now, it was not even realized that the 2 paths are in conflict: Roman strategy was optimized for bound-planet detections, and FFPs were considered only in the context of what could be learned about them given this strategy. We derive a simple equation that mathematically expresses this conflict and explains why the current approach severely depresses detection of 2 of the 5 decades of potential FFP masses, i.e., exactly the two decades, $M_{\rm Pluto}< M <2\,M_{\rm Mars}$, that would tie terrestrial planets to the proto-planetary material out of which they formed. FFPs can be either truly free floating or can be bound in "Wide", "Kuiper", and "Oort" orbits, whose separate identification will allow further insight into planet formation.
In the (low-mass) limit that the source radius is much bigger than the Einstein radius, $\theta_*\gg\theta_{\rm E}$, the number of significantly magnified points on the FFP light curve is $N=2\Gamma\theta_*\sqrt{1-z^2}/\mu \rightarrow 3.0$, when normalized to the adopted Roman cadence $\Gamma=4\,{\rm hr}^{-1}$, and to source radius $\theta_*=0.3\,\mu$as, lens-source proper motion $\mu=6\,$mas/yr, and source impact parameter $z=0.5$, which are all typical values. By contrast $N=6$ are needed for an FFP detection. Thus, unless $\Gamma$ is doubled, FFP detection will be driven into the (large-$\theta_*$, small-$\mu$) corner of parameter space, reducing the detections by a net factor of 2 and cutting off the lowest-mass FFPs.
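The normalization quoted in the abstract can be checked with a short calculation. The sketch below evaluates the stated formula for the number of magnified points with the quoted typical values. The function name and the unit conversion (a Julian year of 365.25 days) are assumptions made here, not taken from the paper.

```python
import math

# Hours in a Julian year, used to convert mas/yr into micro-arcsec/hr.
HOURS_PER_YEAR = 365.25 * 24


def magnified_points(cadence_per_hr, theta_star_uas, mu_mas_per_yr, z):
    """Evaluate N = 2 * Gamma * theta_* * sqrt(1 - z^2) / mu in consistent units."""
    # Proper motion converted from mas/yr to micro-arcsec per hour.
    mu_uas_per_hr = mu_mas_per_yr * 1000.0 / HOURS_PER_YEAR
    return 2.0 * cadence_per_hr * theta_star_uas * math.sqrt(1.0 - z**2) / mu_uas_per_hr


# Typical values quoted in the abstract: Gamma = 4/hr, theta_* = 0.3 uas,
# mu = 6 mas/yr, z = 0.5.
n_typical = magnified_points(4.0, 0.3, 6.0, 0.5)
# Same event with the cadence doubled to 8/hr.
n_doubled = magnified_points(8.0, 0.3, 6.0, 0.5)
print(f"N at 4/hr: {n_typical:.2f}, N at 8/hr: {n_doubled:.2f}")
```

With these inputs N comes out close to 3 at the adopted cadence and just above 6 when the cadence is doubled, consistent with the abstract's statement that $N=6$ is reached only if $\Gamma$ is doubled.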

Submitted 20 June, 2024; originally announced June 2024.

Comments: 46 pages, 4 figures

arXiv:2406.13252 [pdf, other]

Self-Supervised Diffusion Model for 3-D Seismic Data Reconstruction

Authors: Xinyang Wang, Qianyu Ge, Xintong Dong, Shiqi Dong, Tie Zhong

Abstract: Seismic data reconstruction is an effective tool for compensating nonuniform and incomplete seismic geometry. Compared with methods for 2D seismic data, 3D reconstruction methods can consider more spatial structure correlation in seismic data. In early studies, 3D reconstruction methods were mainly theory-driven and had some limitations due to their prior assumptions on the seismic data. To relieve these limitations, deep learning-based reconstruction methods have emerged and show potential in dealing with reconstruction problems. However, there are two main shortcomings in existing deep learning-based methods. On the one hand, most existing deep learning-based methods adopt the convolutional neural network, which has some difficulties in dealing with data with complex or time-varying distributions. Recently, the diffusion model has been reported to possess the capability to handle data with complex distributions by gradually complicating the distribution of data to optimize the network. On the other hand, existing methods need enough paired data to train the network, which are very hard to obtain, especially for the scarce 3D seismic data. Deep prior-based unsupervised and sampling-based self-supervised networks offer a viable solution to this problem. In this paper, we develop a self-supervised diffusion model (S2DM) for 3D seismic data reconstruction. The proposed model mainly contains a diffusion restoration model and a variational time-spatial module. Extensive synthetic and field experiments demonstrate the superiority of the proposed S2DM algorithm.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 43 pages, 13 figures

arXiv:2406.12605 [pdf, other]

Attack and Defense of Deep Learning Models in the Field of Web Attack Detection

Authors: Lijia Shi, Shihao Dong

Abstract: The challenge of WAD (web attack detection) is growing as hackers continuously refine their methods to evade traditional detection. Deep learning models excel in handling complex unknown attacks due to their strong generalization and adaptability. However, they are vulnerable to backdoor attacks, where contextually irrelevant fragments are inserted into requests, compromising model stability. While backdoor attacks are well studied in image recognition, they are largely unexplored in WAD. This paper introduces backdoor attacks in WAD, proposing five methods and corresponding defenses. Testing on textCNN, biLSTM, and tinybert models shows an attack success rate over 87%, reducible through fine-tuning. Future research should focus on backdoor defenses in WAD. All the code and data of this paper can be obtained at https://anonymous.4open.science/r/attackDefenceinDL-7E05

Submitted 18 June, 2024; originally announced June 2024.

Comments: 26 pages, 4 figures

arXiv:2406.10828 [pdf]

PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery

Authors: Libo Wang, Dongxu Li, Sijun Dong, Xiaoliang Meng, Xiaokang Zhang, Danfeng Hong

Abstract: Semantic segmentation, as a basic tool for intelligent interpretation of remote sensing images, plays a vital role in many Earth Observation (EO) applications. Nowadays, accurate semantic segmentation of remote sensing images remains a challenge due to the complex spatial-temporal scenes and multi-scale geo-objects. Driven by the wave of deep learning (DL), CNN- and Transformer-based semantic segmentation methods have been explored widely, and these two architectures both revealed the importance of multi-scale feature representation for strengthening semantic information of geo-objects. However, the actual multi-scale feature fusion often comes with the semantic redundancy issue due to homogeneous semantic contents in pyramid features. To handle this issue, we propose a novel Mamba-based segmentation network, namely PyramidMamba. Specifically, we design a plug-and-play decoder, which develops a dense spatial pyramid pooling (DSPP) to encode rich multi-scale semantic features and a pyramid fusion Mamba (PFM) to reduce semantic redundancy in multi-scale feature fusion. Comprehensive ablation experiments illustrate the effectiveness and superiority of the proposed method in enhancing multi-scale feature representation as well as the great potential for real-time semantic segmentation. Moreover, our PyramidMamba yields state-of-the-art performance on three publicly available datasets, i.e. the OpenEarthMap (70.8% mIoU), ISPRS Vaihingen (84.8% mIoU) and Potsdam (88.0% mIoU) datasets. The code will be available at https://github.com/WangLibo1995/GeoSeg.

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2406.10481 [pdf, other]

DCDILP: a distributed learning method for large-scale causal structure learning

Authors: Shuyu Dong, Michèle Sebag, Kento Uemura, Akito Fujii, Shuang Chang, Yusuke Koyanagi, Koji Maruhashi

Abstract: This paper presents a novel approach to causal discovery through a divide-and-conquer framework. By decomposing the problem into smaller subproblems defined on Markov blankets, the proposed DCDILP method first explores in parallel the local causal graphs of these subproblems. However, this local discovery phase encounters systematic challenges due to the presence of hidden confounders (variables within each Markov blanket may be influenced by external variables). Moreover, aggregating these local causal graphs into a consistent global graph defines a large combinatorial optimization problem. DCDILP addresses these challenges by: i) restricting the local subgraphs to causal links related only to the central variable of the Markov blanket; ii) formulating the reconciliation of local causal graphs as an integer linear programming problem. The merits of the approach, in terms of both causal discovery accuracy and scalability in the size of the problem, are showcased by experiments and comparisons with the state of the art.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.09664 [pdf, other]

Frequency-mix Knowledge Distillation for Fake Speech Detection

Authors: Cunhang Fan, Shunbo Dong, Jun Xue, Yujie Chen, Jiangyan Yi, Zhao Lv

Abstract: In telephony scenarios, the fake speech detection (FSD) task to combat speech spoofing attacks is challenging. Data augmentation (DA) methods are considered effective means to address the FSD task in telephony scenarios, typically divided into time domain and frequency domain stages. While each has its advantages, both can result in information loss. To tackle this issue, we propose a novel DA method, Frequency-mix (Freqmix), and introduce the Freqmix knowledge distillation (FKD) to enhance model information extraction and generalization abilities. Specifically, we use Freqmix-enhanced data as input for the teacher model, while the student model's input undergoes a time-domain DA method. We use a multi-level feature distillation approach to restore information and improve the model's generalization capabilities. Our approach achieves state-of-the-art results on the ASVspoof 2021 LA dataset, showing a 31% improvement over the baseline, and performs competitively on the ASVspoof 2021 DF dataset.

Submitted 13 June, 2024; originally announced June 2024.

Comments: Accepted by Interspeech 2024

arXiv:2406.07300 [pdf, other]

Shadows, rings and optical appearance of a magnetically charged regular black hole illuminated by various accretion disks

Authors: Soroush Zare, Luis M. Nieto, Xing-Hui Feng, Shi-Hai Dong, Hassan Hassanabadi

Abstract: The Event Horizon Telescope (EHT) imaging of the supermassive black holes at the centers of Messier 87 galaxy and the Milky Way galaxy marks a significant step in observing the photon rings and central brightness depression that define the optical appearance of black holes with an accretion disk scenario. Inspired by this, we take into account a static and spherically symmetric magnetically charged regular black hole (MCRBH) metric characterized by its mass and an additional parameter q, which arises from the coupling of Einstein gravity and nonlinear electrodynamics (NLED) in the weak field approximation. This parameterized model offers a robust foundation for testing the coupling of Einstein gravity and NLED in the weak-field approximation, using the EHT observational results. In this study, we investigate the geodesic motion of particles around the solution, followed by a discussion of its fundamental geometrical characteristics such as scalar invariants. Using null geodesics, we examine how the model parameter influences the behavior of the photon sphere radius and the associated shadow silhouette. We seek constraints on q by applying the EHT results for supermassive black holes M87* and Sgr A*. Furthermore, it is observed that the geodesics of time-like particles are susceptible to variations in q, which can have an impact on the traits of the innermost stable circular orbit and the marginally bounded orbit.
Our primary objective is to probe how the free parameter q affects various aspects of the accretion disk surrounding the MCRBH using the thin-disk approximation. Next, we discuss the physical characteristics of the thin accretion disk as well as the observed shadows and rings of the MCRBH, along with its luminosity, across various accretion models. Ultimately, variations in accretion models and the parameter q yield distinct shadow images and optical appearances of the MCRBH.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 31 pages, 2 tables, 16 figures

arXiv:2406.06350 [pdf, other]

Error Analysis and Numerical Algorithm for PDE Approximation with Hidden-Layer Concatenated Physics Informed Neural Networks

Authors: Yianxia Qian, Yongchao Zhang, Suchuan Dong

Abstract: We present the hidden-layer concatenated physics informed neural network (HLConcPINN) method, which combines hidden-layer concatenated feed-forward neural networks, a modified block time marching strategy, and a physics informed approach for approximating partial differential equations (PDEs). We analyze the convergence properties and establish the error bounds of this method for two types of PDEs: parabolic (exemplified by the heat and Burgers' equations) and hyperbolic (exemplified by the wave and nonlinear Klein-Gordon equations). We show that its approximation error of the solution can be effectively controlled by the training loss for dynamic simulations with long time horizons. The HLConcPINN method in principle allows an arbitrary number of hidden layers not smaller than two and any of the commonly-used smooth activation functions for the hidden layers beyond the first two, with theoretical guarantees. This generalizes several recent neural-network techniques, which have theoretical guarantees but are confined to two hidden layers in the network architecture and the $\tanh$ activation function. Our theoretical analyses subsequently inform the formulation of appropriate training loss functions for these PDEs, leading to physics informed neural network (PINN) type computational algorithms that differ from the standard PINN formulation. Ample numerical experiments are presented based on the proposed algorithm to validate the effectiveness of this method and confirm aspects of the theoretical analyses.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 40 pages, 10 tables, 18 figures

arXiv:2406.06086 [pdf, other]

RawBMamba: End-to-End Bidirectional State Space Model for Audio Deepfake Detection

Authors: Yujie Chen, Jiangyan Yi, Jun Xue, Chenglong Wang, Xiaohui Zhang, Shunbo Dong, Siding Zeng, Jianhua Tao, Lv Zhao, Cunhang Fan

Abstract: Fake artefacts for discriminating between bonafide and fake audio can exist in both short- and long-range segments. Therefore, combining local and global feature information can effectively discriminate between bonafide and fake audio. This paper proposes an end-to-end bidirectional state space model, named RawBMamba, to capture both short- and long-range discriminative information for audio deepfake detection. Specifically, we use sinc Layer and multiple convolutional layers to capture short-range features, and then design a bidirectional Mamba to address Mamba's unidirectional modelling problem and further capture long-range feature information. Moreover, we develop a bidirectional fusion module to integrate embeddings, enhancing audio context representation and combining short- and long-range information. The results show that our proposed RawBMamba achieves a 34.1% improvement over Rawformer on ASVspoof2021 LA dataset, and demonstrates competitive performance on other datasets.

Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

Comments: Accepted by Interspeech 2024

arXiv:2406.05823 [pdf]

Manipulating magnetism and transport properties of EuCd$_2$P$_2$ with a low carrier concentration

Authors: Xiyu Chen, Ziwen Wang, Zhiyu Zhou, Wuzhang Yang, Yi Liu, Jia-Yi Lu, Zhi Ren, Guang-Han Cao, Fazel Tafti, Shuai Dong, Zhi-Cheng Wang

Abstract: Materials that exhibit strongly coupled magnetic order and electronic properties are crucial for both fundamental research and technological applications. However, finding a material that not only shows remarkable magnetoresistive responses but also has an easily tunable ground state remains a challenge. Here, we report successful manipulation of the magnetic and transport properties of EuCd$_2$P$_2$, which is transformed from an A-type antiferromagnet ($T_\mathrm{N}$ = 11 K) exhibiting colossal magnetoresistance into a ferromagnet ($T_\mathrm{C}$ = 47 K) with metallic behavior. The dramatic alteration results from a low hole concentration of $10^{19}$ cm$^{-3}$ induced by changing the growth conditions. Electronic structure and total energy calculations confirm the tunability of magnetism with a small carrier concentration for EuCd$_2$P$_2$. It is feasible to switch between the magnetic states by using field-effect to control the carrier density, thereby changing the magneto-electronic response. The controllable magnetism and electrical transport of EuCd$_2$P$_2$ make it a potential candidate for spintronics.

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.05819 [pdf, other]

doi 10.1103/PhysRevB.109.L180410

Carrier-induced transition from antiferromagnetic insulator to ferromagnetic metal in the layered phosphide EuZn$_2$P$_2$

Authors: Xiyu Chen, Wuzhang Yang, Jia-Yi Lu, Zhiyu Zhou, Zhi Ren, Guang-Han Cao, Shuai Dong, Zhi-Cheng Wang

Abstract: EuZn$_2$P$_2$ was reported to be an insulating antiferromagnet with $T_\mathrm{N}$ of 23.5 K. In this study, single crystals of EuZn$_2$P$_2$ exhibiting metallic behavior and a ferromagnetic order of 72 K ($T_\mathrm{C}$) are successfully synthesized via a salt flux method. The presence of hole carriers induced by the Eu vacancies in the lattice is found to be crucial for the drastic changes in magnetism and electrical transport. The carriers mediate the interlayer ferromagnetic interaction, and the coupling strength is directly related to $T_\mathrm{C}$, as evidenced by the linear dependence of $T_\mathrm{C}$ and the fitted Curie-Weiss temperatures on the Eu-layer distances for ferromagnetic Eu$M_2X_2$ ($M$ = Zn, Cd; $X$ = P, As). The ferromagnetic EuZn$_2$P$_2$ shows conspicuous negative magnetoresistance (MR) near $T_\mathrm{C}$, owing to strong magnetic scattering. The MR behavior is consistent with the Majumdar-Littlewood model, indicating that the MR can be enhanced by decreasing the carrier density. Our findings suggest that Eu$M_2X_2$ has highly tunable magnetism and charge transport, making it a promising material family for potential applications in spintronics.

Submitted 9 June, 2024; originally announced June 2024.

Journal ref: Physical Review B 109, L180410 (2024)

arXiv:2406.05762 [pdf, ps, other]

Large data global existence for coupled massive-massless wave-type systems

Authors: Yuan Cai, Shijie Dong, Kuijie Li, Jingya Zhao

Abstract: We consider 3D Klein-Gordon-Zakharov (KGZ) and Dirac-Klein-Gordon (DKG) systems, where a common feature is that there exist both massless and massive fields in each system. We establish global existence and asymptotic behavior for both systems with a class of large data. More precisely, in the KGZ system, we allow the massless field to be large, while in the DKG system we allow the massive field to be large.

Submitted 10 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

Comments: All comments are welcome. 58 pages

arXiv:2406.03835 [pdf, other]

Monocular Localization with Semantics Map for Autonomous Vehicles

Authors: Jixiang Wan, Xudong Zhang, Shuzhou Dong, Yuwei Zhang, Yuchen Yang, Ruoxi Wu, Ye Jiang, Jijunnan Li, Jinquan Lin, Ming Yang

Abstract: Accurate and robust localization remains a significant challenge for autonomous vehicles. The cost of sensors and limitations in local computational efficiency make it difficult to scale to large commercial applications. Traditional vision-based approaches focus on texture features that are susceptible to changes in lighting, season, perspective, and appearance. Additionally, the large storage size of maps with descriptors and complex optimization processes hinder system performance. To balance efficiency and accuracy, we propose a novel lightweight visual semantic localization algorithm that employs stable semantic features instead of low-level texture features. First, semantic maps are constructed offline by detecting semantic objects, such as ground markers, lane lines, and poles, using cameras or LiDAR sensors. Then, online visual localization is performed through data association of semantic features and map objects. We evaluated our proposed localization framework in the publicly available KAIST Urban dataset and in scenarios recorded by ourselves. The experimental results demonstrate that our method is a reliable and practical localization solution in various autonomous driving localization tasks.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.19914 [pdf, other]

Towards RGB-NIR Cross-modality Image Registration and Beyond

Authors: Huadong Li, Shichao Dong, Jin Wang, Rong Fu, Minhao Jing, Jiajun Liang, Haoqiang Fan, Renhe Ji

Abstract: This paper focuses on the area of RGB(visible)-NIR(near-infrared) cross-modality image registration, which is crucial for many downstream vision tasks to fully leverage the complementary information present in visible and infrared images. In this field, researchers face two primary challenges - the absence of a correctly-annotated benchmark with viewpoint variations for evaluating RGB-NIR cross-modality registration methods and the problem of inconsistent local features caused by the appearance discrepancy between RGB-NIR cross-modality images. To address these challenges, we first present the RGB-NIR Image Registration (RGB-NIR-IRegis) benchmark, which, for the first time, enables fair and comprehensive evaluations for the task of RGB-NIR cross-modality image registration. Evaluations of previous methods highlight the significant challenges posed by our RGB-NIR-IRegis benchmark, especially on RGB-NIR image pairs with viewpoint variations. To analyze the causes of the unsatisfying performance, we then design several metrics to reveal the toxic impact of inconsistent local features between visible and infrared images on the model performance. This further motivates us to develop a baseline method named Semantic Guidance Transformer (SGFormer), which utilizes high-level semantic guidance to mitigate the negative impact of local inconsistent features. Despite the simplicity of our motivation, extensive experimental results show the effectiveness of our method.

Submitted 30 May, 2024; originally announced May 2024.

Comments: 18 pages, 7 figures

arXiv:2405.19777 [pdf]

doi 10.1088/0256-307X/41/6/067402

Magnetic nonreciprocity in a hybrid device of asymmetric artificial spin-ice-superconductors

Authors: Chong Li, Peiyuan Huang, Chen-Guang Wang, Haojie Li, Yang-Yang Lyu, Wen-Cheng Yue, Zixiong Yuan, Tianyu Li, Xuecou Tu, Tao Tao, Sining Dong, Liang He, Xiaoqing Jia, Guozhu Sun, Lin Kang, Huabing Wang, Peiheng Wu, Yong-Lei Wang

Abstract: Controlling the size and distribution of potential barriers within a medium of interacting particles can unveil unique collective behaviors and innovative functionalities. In this study, we introduce a unique superconducting hybrid device using a novel artificial spin ice structure composed of asymmetric nanomagnets. This structure forms a distinctive superconducting pinning potential that steers unconventional motion of superconducting vortices, thereby inducing a magnetic nonreciprocal effect, in contrast to the electric nonreciprocal effect commonly observed in superconducting diodes. Furthermore, the polarity of the magnetic nonreciprocity is in-situ reversible through the tunable magnetic patterns of artificial spin ice. Our findings demonstrate that artificial spin ice not only precisely modulates superconducting characteristics but also opens the door to novel functionalities, offering a groundbreaking paradigm for superconducting electronics.

Submitted 30 May, 2024; originally announced May 2024.

Journal ref: Chinese Physics Letters 41, 067402 (2024) Express Letter

arXiv:2405.19767 [pdf]

MAE-GAN: A Novel Strategy for Simultaneous Super-resolution Reconstruction and Denoising of Post-stack Seismic Profile

Authors: Wenshuo Yu, Shiqi Dong, Shaoping Lu, Xintong Dong

Abstract: Post-stack seismic profiles are images that reflect geological structures and provide a critical foundation for understanding the distribution of oil and gas resources. However, due to the limitations of seismic acquisition equipment and data collecting geometry, the post-stack profiles suffer from low resolution and strong noise issues, which severely affect subsequent seismic interpretation. To better enhance the spatial resolution and signal-to-noise ratio of post-seismic profiles, a multi-scale attention encoder-decoder network based on generative adversarial network (MAE-GAN) is proposed. This method improves the resolution of post-stack profiles, and effectively suppresses noise and recovers weak signals as well. A multi-scale residual module is proposed to extract geological features under different receptive fields. At the same time, an attention module is designed to further guide the network to focus on important feature information. Additionally, to better recover the global and local information of post-stack profiles, an adversarial network based on a Markov discriminator is proposed. Finally, by introducing an edge information preservation loss function, the conventional loss function of the Generative Adversarial Network is improved, which enables better recovery of the edge information of the original post-stack profiles.
Experimental results on simulated and field post-stack profiles demonstrate that the proposed MAE-GAN method outperforms two advanced convolutional neural network-based methods in noise suppression and weak signal recovery. Furthermore, the profiles reconstructed by the MAE-GAN method preserve more geological structures.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.17660 [pdf, other]

LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking

Authors: Shaohua Dong, Yunhe Feng, Qing Yang, Yuewei Lin, Heng Fan

Abstract: High-performance Transformer trackers have shown excellent results, yet they often bear a heavy computational load. Observing that a smaller input can immediately and conveniently reduce computations without changing the model, an easy solution is to adopt the low-resolution input for efficient Transformer tracking. Albeit faster, this significantly hurts tracking accuracy due to information loss in low-resolution tracking. In this paper, we aim to mitigate such information loss to boost the performance of the low-resolution Transformer tracking via dual knowledge distillation from a frozen high-resolution (but not a larger) Transformer tracker. The core lies in two simple yet effective distillation modules, comprising query-key-value knowledge distillation (QKV-KD) and discrimination knowledge distillation (Disc-KD), across resolutions. The former, from the global view, allows the low-resolution tracker to inherit the features and interactions from the high-resolution tracker, while the latter, from the target-aware view, enhances the target-background distinguishing capacity via imitating discriminative regions from its high-resolution counterpart. With the dual knowledge distillation, our Low-Resolution Transformer Tracker (LoReTrack) enjoys not only high efficiency owing to reduced computation but also enhanced accuracy by distilling knowledge from the high-resolution tracker.
In extensive experiments, LoReTrack with a 256x256 resolution consistently improves baseline with the same resolution, and shows competitive or even better results compared to 384x384 high-resolution Transformer tracker, while running 52% faster and saving 56% MACs. Moreover, LoReTrack is resolution-scalable. With a 128x128 resolution, it runs 25 fps on a CPU with 64.9%/46.4% SUC scores on LaSOT/LaSOText, surpassing all other CPU real-time trackers. Code will be released.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.17201 [pdf, other]

Diagnosing the Compositional Knowledge of Vision Language Models from a Game-Theoretic View

Authors: Jin Wang, Shichao Dong, Yapeng Zhu, Kelu Yao, Weidong Zhao, Chao Li, Ping Luo

Abstract: Compositional reasoning capabilities are usually considered as fundamental skills to characterize human perception. Recent studies show that current Vision Language Models (VLMs) surprisingly lack sufficient knowledge with respect to such capabilities. To this end, we propose to thoroughly diagnose the composition representations encoded by VLMs, systematically revealing the potential cause for this weakness. Specifically, we propose evaluation methods from a novel game-theoretic view to assess the vulnerability of VLMs on different aspects of compositional understanding, e.g., relations and attributes. Extensive experimental results demonstrate and validate several insights to understand the incapabilities of VLMs on compositional reasoning, which provide useful and reliable guidance for future studies. The deliverables will be updated at https://vlms-compositionality-gametheory.github.io/.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 21 pages, 8 figures

arXiv:2405.17099 [pdf]

Element-specific ultrafast lattice dynamics in monolayer WSe2

Authors: H. Jung, S. Dong, D. Zahn, T. Vasileiadis, H. Seiler, R. Schneider, S. Michaelis de Vasconcellos, V. C. A. Taylor, R. Bratschitsch, R. Ernstorfer, Y. W. Windsor

Abstract: We study monolayer WSe2 using ultrafast electron diffraction. We introduce an approach to quantitatively extract atomic-site-specific information, providing an element-specific view of incoherent atomic vibrations following femtosecond excitation. Via differences between W and Se vibrations, we identify stages in the nonthermal evolution of the lattice. Combined with a calculated phonon dispersion, this element specificity enables us to identify a long-lasting overpopulation of specific optical phonons, and to interpret the stages as energy transfer processes between specific phonon groups. These results demonstrate the appeal of resolving element-specific vibrational information in the ultrafast time domain.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 13 pages, 3 figures

arXiv:2405.15135 [pdf, other]

Exploring the Evolution of Hidden Activations with Live-Update Visualization

Authors: Xianglin Yang, Jin Song Dong

Abstract: Monitoring the training of neural networks is essential for identifying potential data anomalies, enabling timely interventions and conserving significant computational resources. Apart from the commonly used metrics such as losses and validation accuracies, the hidden representation could give more insight into the model progression. To this end, we introduce SentryCam, an automated, real-time visualization tool that reveals the progression of hidden representations during training. Our results show that this visualization offers a more comprehensive view of the learning dynamics compared to basic metrics such as loss and accuracy over various datasets. Furthermore, we show that SentryCam could facilitate detailed analysis such as task transfer and catastrophic forgetting to a continual learning setting. The code is available at https://github.com/xianglinyang/SentryCam.

Submitted 23 May, 2024; originally announced May 2024.

Comments: Preprint

arXiv:2405.14169 [pdf, other]

Towards Transferable Attacks Against Vision-LLMs in Autonomous Driving with Typography

Authors: Nhat Chung, Sensen Gao, Tuan-Anh Vu, Jie Zhang, Aishan Liu, Yun Lin, Jin Song Dong, Qing Guo

Abstract: Vision-Large-Language-Models (Vision-LLMs) are increasingly being integrated into autonomous driving (AD) systems due to their advanced visual-language reasoning capabilities, targeting the perception, prediction, planning, and control mechanisms. However, Vision-LLMs have demonstrated susceptibilities against various types of adversarial attacks, which would compromise their reliability and safety. To further explore the risk in AD systems and the transferability of practical threats, we propose to leverage typographic attacks against AD systems relying on the decision-making capabilities of Vision-LLMs. Different from the few existing works developing general datasets of typographic attacks, this paper focuses on realistic traffic scenarios where these attacks can be deployed, on their potential effects on the decision-making autonomy, and on the practical ways in which these attacks can be physically presented. To achieve the above goals, we first propose a dataset-agnostic framework for automatically generating false answers that can mislead Vision-LLMs' reasoning. Then, we present a linguistic augmentation scheme that facilitates attacks at image-level and region-level reasoning, and we extend it with attack patterns against multiple reasoning tasks simultaneously. Based on these, we conduct a study on how these attacks can be realized in physical traffic scenarios. Through our empirical study, we evaluate the effectiveness, transferability, and realizability of typographic attacks in traffic scenes.
Our findings demonstrate particular harmfulness of the typographic attacks against existing Vision-LLMs (e.g., LLaVA, Qwen-VL, VILA, and Imp), thereby raising community awareness of vulnerabilities when incorporating such models into AD systems. We will release our source code upon acceptance.

Submitted 23 May, 2024; originally announced May 2024.

Comments: 12 pages, 5 tables, 5 figures, work in progress

arXiv:2405.10169 [pdf, other]

Most likely configurations for fermion localization in a Braneworld-$f(Q,B_Q)$

Authors: A. R. P. Moreira, Shi-Hai Dong, M. E. Rodrigues

Abstract: This study delves deeply into braneworld scenarios within modified gravity models, investigating their impact on particle localization and the structure of branes. Through a comprehensive blend of numerical analyses and theoretical inquiries, we unravel a nuanced correlation between deviations from standard General Relativity (GR) and the emergence of split branes. By employing probabilistic measurements, we pinpoint stable configurations that align with brane division intervals, thus challenging prevailing assumptions regarding the gravitational framework of our universe. Furthermore, our investigation extends to the localization of fermions within the brane, exposing intricate dynamics shaped by scalar field characteristics and modifications to gravitational models. By harnessing quantum information measurements, notably Shannon entropy, we discern heightened probabilities of fermion localization within the brane as gravitational models diverge from standard paradigms. This underscores the limitations of General Relativity in comprehensively describing the complexities inherent in our universe. Lastly, our exploration of massive fermions unveils their potential to breach the confines of the brane, hinting at promising avenues for future experimental endeavors aimed at probing the nature of extra dimensions and gravitational interactions. This suggests exciting prospects for advancing our understanding of fundamental physics beyond conventional boundaries.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2405.09776 [pdf]

doi 10.1103/PhysRevB.109.184106

Magnetic structure and magnetoelectric coupling in antiferromagnet Co5(TeO3)4Cl2

Authors: B. Yu, L. Huang, J. S. Li, L. Lin, V. Ovidiu Garlea, Q. Zhang, T. Zou, J. C. Zhang, J. Peng, Y. S. Tang, G. Z. Zhou, J. H. Zhang, S. H. Zheng, M. F. Liu, Z. B. Yan, X. H. Zhou, S. Dong, J. G. Wan, J. -M. Liu

Abstract: The van der Waals (vdW) layered multiferroics, which host simultaneous ferroelectric and magnetic orders, have attracted attention not only for their potential to be utilized in nanoelectric devices and spintronics, but also because they offer alternative opportunities for emergent physical phenomena. To date, the vdW layered multiferroic materials are still very rare. In this work, we have investigated the magnetic structure and magnetoelectric effects in Co5(TeO3)4Cl2, a promising new multiferroic compound with antiferromagnetic (AFM) Neel point TN = 18 K. The neutron powder diffraction reveals the non-coplanar AFM state with preferred Neel vector along the c-axis, while a spin re-orientation occurring between 8 K and 15 K is identified, which results from the distinct temperature dependence of the non-equivalent Co sites moment in Co5(TeO3)4Cl2. What is more, it is found that Co5(TeO3)4Cl2 is one of the best vdW multiferroics studied so far in terms of the multiferroic performance. The measured linear ME coefficient exhibits the emergent oscillation dependence of the angle between magnetic field and electric field, and the maximal value is as big as 45 ps/m. It is suggested that Co5(TeO3)4Cl2 is an appreciated platform for exploring the emergent multiferroicity in vdW layered compounds.

Submitted 15 May, 2024; originally announced May 2024.

Comments: 31 pages, 9 figures

Journal ref: Phys. Rev. B 109, 184106(2024)

arXiv:2405.09180 [pdf]

doi 10.1038/s41467-024-48224-1

Integrated and DC-powered superconducting microcomb

Authors: Chen-Guang Wang, Wuyue Xu, Chong Li, Lili Shi, Junliang Jiang, Tingting Guo, Wen-Cheng Yue, Tianyu Li, Ping Zhang, Yang-Yang Lyu, Jiazheng Pan, Xiuhao Deng, Ying Dong, Xuecou Tu, Sining Dong, Chunhai Cao, Labao Zhang, Xiaoqing Jia, Guozhu Sun, Lin Kang, Jian Chen, Yong-Lei Wang, Huabing Wang, Peiheng Wu

Abstract: Frequency combs, specialized laser sources emitting multiple equidistant frequency lines, have revolutionized science and technology with unprecedented precision and versatility. Recently, integrated frequency combs are emerging as scalable solutions for on-chip photonics. Here, we demonstrate a fully integrated superconducting microcomb that is easy to manufacture, simple to operate, and consumes ultra-low power. Our turnkey apparatus comprises a basic nonlinear superconducting device, a Josephson junction, directly coupled to a superconducting microstrip resonator. We showcase coherent comb generation through self-started mode-locking. Therefore, comb emission is initiated solely by activating a DC bias source, with power consumption as low as tens of picowatts. The resulting comb spectrum resides in the microwave domain and spans multiple octaves. The linewidths of all comb lines can be narrowed down to 1 Hz through a unique coherent injection-locking technique. Our work represents a critical step towards fully integrated microwave photonics and offers the potential for integrated quantum processors.

Submitted 15 May, 2024; originally announced May 2024.

Journal ref: Nature Communications 15, 4009 (2024)

arXiv:2405.09170 [pdf]

doi 10.1088/1674-1056/ad2f21

Tunable superconducting resonators via on-chip control of local magnetic field

Authors: Chen-Guang Wang, Wen-Cheng Yue, Xuecou Tu, Tianyuan Chi, Tingting Guo, Yang-Yang Lyu, Sining Dong, Chunhai Cao, Labao Zhang, Xiaoqing Jia, Guozhu Sun, Lin Kang, Jian Chen, Yong-Lei Wang, Huabing Wang, Peiheng Wu

Abstract: Superconducting microwave resonators play a pivotal role in superconducting quantum circuits. The ability to fine-tune their resonant frequencies provides enhanced control and flexibility. Here, we introduce a frequency-tunable superconducting coplanar waveguide resonator. By applying electrical currents through specifically designed ground wires, we achieve the generation and control of a localized magnetic field on the central line of the resonator, enabling continuous tuning of its resonant frequency. We demonstrate a frequency tuning range of 54.85 MHz in a 6.21 GHz resonator. This integrated and tunable resonator holds great potential as a dynamically tunable filter and as a key component of communication buses and memory elements in superconducting quantum computing.

Submitted 15 May, 2024; originally announced May 2024.

Journal ref: Chin. Phys. B 33, 058402 (2024)

arXiv:2405.08941 [pdf, other]

Parameter optimization comparison in QAOA using Stochastic Hill Climbing with Random Re-starts and Local Search with entangled and non-entangled mixing operators

Authors: Brian García Sarmina, Guo-Hua Sun, Shi-Hai Dong

Abstract: This study investigates the efficacy of Stochastic Hill Climbing with Random Restarts (SHC-RR) compared to Local Search (LS) strategies within the Quantum Approximate Optimization Algorithm (QAOA) framework across various problem models. Employing uniform parameter settings, including the number of restarts and SHC steps, we analyze LS with two distinct perturbation operations: multiplication and summation. Our comparative analysis encompasses multiple versions of max-cut and random Ising model (RI) problems, utilizing QAOA models with depths ranging from $1L$ to $3L$. These models incorporate diverse mixing operator configurations, which integrate $RX$ and $RY$ gates, and explore the effects of an entanglement stage within the mixing operator. We also used Quantum Fisher Information (QFI) to compare the different QAOA models, demonstrating the importance of the placement of the entanglement stage in the overall performance of QAOA. Additionally, we observed that the QFI values of previous parameters are not affected as the depth of the quantum circuit increases. Our results consistently show that SHC-RR outperforms LS approaches, showcasing superior efficacy despite its ostensibly simpler optimization mechanism. Furthermore, we observe that the inclusion of entanglement stages within mixing operators significantly impacts model performance, either enhancing or diminishing results depending on the specific problem context.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 51 pages, 19 tables, 22 figures

MSC Class: 81P68; 68Q09; 68Q12

arXiv:2405.08588 [pdf, ps, other]

Sharing Quantum Steering via Standard Projective Measurements

Authors: Shufen Dong, Zinuo Cai, Chunfeng Wu, Changliang Ren

Abstract: We propose a scheme for the sharing of quantum steering among three observers, Alice, Bob, and Charlie using standard projective measurements. We show that in the unilateral sequential scenario, Alice can steer Bob's and Charlie's states and conversely, Bob and Charlie can steer Alice's state. Unlike the quantum steering sharing achieved through weak measurements, we use the standard projective measurements to enable quantum steering sharing. Quantum steering is demonstrated by the violations of the linear steering inequality among different observer combinations. We find that Alice can simultaneously steer both Bob's and Charlie's states, and Bob and Charlie can simultaneously steer Alice's state, regardless of whether they are in maximally entangled states or partially entangled states. The maximum double violation of the linear steering inequalities obtained from partially entangled states can be greater in some cases than that obtained from maximally entangled states when randomly combining the case of two projective measurements and the case of two identity measurements. Additionally, we verify hybrid quantum correlation sharing through the double violation of the Clauser-Horne-Shimony-Holt (CHSH) inequality and the linear steering inequality. Our results provide a new perspective for the study of quantum steering and may lead to applications in quantum random access code, randomness certification, and self-testing process.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.08423 [pdf, other]

NAFRSSR: a Lightweight Recursive Network for Efficient Stereo Image Super-Resolution

Authors: Yihong Chen, Zhen Fan, Shuai Dong, Zhiwei Chen, Wenjie Li, Minghui Qin, Min Zeng, Xubing Lu, Guofu Zhou, Xingsen Gao, Jun-Ming Liu

Abstract: Stereo image super-resolution (SR) refers to the reconstruction of a high-resolution (HR) image from a pair of low-resolution (LR) images as typically captured by a dual-camera device. To enhance the quality of SR images, most previous studies focused on increasing the number and size of feature maps and introducing complex and computationally intensive structures, resulting in models with high computational complexity. Here, we propose a simple yet efficient stereo image SR model called NAFRSSR, which is modified from the previous state-of-the-art model NAFSSR by introducing recursive connections and lightweighting the constituent modules. Our NAFRSSR model is composed of nonlinear activation free and group convolution-based blocks (NAFGCBlocks) and depth-separated stereo cross attention modules (DSSCAMs). The NAFGCBlock improves feature extraction and reduces the number of parameters by removing the simple channel attention mechanism from NAFBlock and using group convolution. The DSSCAM enhances feature fusion and reduces the number of parameters by replacing 1x1 pointwise convolution in SCAM with weight-shared 3x3 depthwise convolution. Besides, we propose to incorporate a trainable edge detection operator into NAFRSSR to further improve the model performance. Four variants of NAFRSSR with different sizes, namely, NAFRSSR-Mobile (NAFRSSR-M), NAFRSSR-Tiny (NAFRSSR-T), NAFRSSR-Super (NAFRSSR-S) and NAFRSSR-Base (NAFRSSR-B) are designed, and they all exhibit fewer parameters, higher PSNR/SSIM, and faster speed than the previous state-of-the-art models. In particular, to the best of our knowledge, NAFRSSR-M is the lightest (0.28M parameters) and fastest (50 ms inference time) model achieving an average PSNR/SSIM as high as 24.657 dB/0.7622 on the benchmark datasets. Codes and models will be released at https://github.com/JNUChenYiHong/NAFRSSR.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.00074 [pdf, other]

PAODING: A High-fidelity Data-free Pruning Toolkit for Debloating Pre-trained Neural Networks

Authors: Mark Huasong Meng, Hao Guan, Liuhuo Wan, Sin Gee Teo, Guangdong Bai, Jin Song Dong

Abstract: We present PAODING, a toolkit to debloat pretrained neural network models through the lens of data-free pruning. To preserve the model fidelity, PAODING adopts an iterative process, which dynamically measures the effect of deleting a neuron to identify candidates that have the least impact to the output layer. Our evaluation shows that PAODING can significantly reduce the model size, generalize on different datasets and models, and meanwhile preserve the model fidelity in terms of test accuracy and adversarial robustness. PAODING is publicly available on PyPI via https://pypi.org/project/paoding-dl.

Submitted 30 April, 2024; originally announced May 2024.

Comments: 3 pages

arXiv:2404.19621 [pdf, other]

Fibonacci and Lucas Sequences in Aperiodic Monotile Supertiles

Authors: Shiying Dong

Abstract: This paper first discusses the size and orientation of hat supertiles. Fibonacci and Lucas sequences, as well as a third integer sequence linearly related to the Lucas sequence are involved. The result is then generalized to any aperiodic tile in the hat family.

Submitted 30 April, 2024; originally announced April 2024.

Comments: 10 pages, 21 figures

MSC Class: 05B45 ACM Class: G.2.1

arXiv:2404.19377 [pdf]

doi 10.1038/s41565-024-01666-6

Toroidic phase transitions in a direct-kagome artificial spin ice

Authors: Wen-Cheng Yue, Zixiong Yuan, Peiyuan Huang, Yizhe Sun, Tan Gao, Yang-Yang Lyu, Xuecou Tu, Sining Dong, Liang He, Ying Dong, Xun Cao, Lin Kang, Huabing Wang, Peiheng Wu, Cristiano Nisoli, Yong-Lei Wang

Abstract: Ferrotoroidicity, the fourth form of primary ferroic order, breaks both space and time inversion symmetry. So far, direct observation of ferrotoroidicity in natural materials remains elusive, which impedes the exploration of ferrotoroidic phase transitions. Here, we overcome the limitations of natural materials using an artificial nanomagnet system that can be characterized at the constituent level and at different effective temperatures. We design a nanomagnet array to realize a direct-kagome spin ice. This artificial spin ice exhibits robust toroidal moments and a quasi-degenerate ground state with two distinct low-temperature toroidal phases: ferrotoroidicity and paratoroidicity. Using magnetic force microscopy and Monte Carlo simulation, we demonstrate a phase transition between ferrotoroidicity and paratoroidicity, along with a crossover to a non-toroidal paramagnetic phase. Our quasi-degenerate artificial spin ice in a direct-kagome structure provides a model system for the investigation of magnetic states and phase transitions that are inaccessible in natural materials.

Submitted 30 April, 2024; originally announced April 2024.

Journal ref: Nature Nanotechnology (2024)

arXiv:2404.17437 [pdf]

Transformer For Low-frequency Extrapolating of Seismic Data

Authors: Zheng Cong, Xintong Dong, Shaoping Lu, Shiqi Dong, Xunqian Tong

Abstract: Full waveform inversion (FWI) is used to reconstruct the physical properties of subsurface media, which plays an important role in seismic exploration. However, the precision of FWI is seriously affected by the absence or inaccuracy of low-frequency information. Therefore, reconstructing the low-frequency signals accurately is highly significant in seismic data processing. Low-frequency extrapolation of seismic records can be approached as a deep learning regression problem. Thus, to obtain low-frequency information from band-limited seismic records, a novel network structure called low-frequency extrapolation transformer (LFET) is proposed to construct the nonlinear mapping relationship between data missing low-frequency content and low-frequency data in a supervised learning approach, which is inspired by the transformer model widely used in natural language processing (NLP). We apply multi-head self-attention (MSA) modules to model the remote dependencies of seismic data. Based on this, we introduce a shifted window partitioning approach to reduce the computational cost. Because field data are not suitable for supervised learning, we generate synthetic seismic records using submodels selected from the benchmark Marmousi model as training data whose characteristics are similar to those of the field data. A single trace of synthetic band-limited seismic data in the time domain is used as the input data, and the parameters of LFET are updated based on the errors between the predicted trace and the corresponding label. The experimental results on the data generated by different models, different wavelets, and different kinds of field marine data demonstrate the feasibility and generalization of the proposed method. Furthermore, the proposed method achieves higher accuracy with lower computational expense than the traditional CNN method.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.16510 [pdf, other]

Interactive3D: Create What You Want by Interactive 3D Generation

Authors: Shaocong Dong, Lihe Ding, Zhanpeng Huang, Zibin Wang, Tianfan Xue, Dan Xu

Abstract: 3D object generation has undergone significant advancements, yielding high-quality results. However, current methods fall short of achieving precise user control, often yielding results that do not align with user expectations, thus limiting their applicability. User-envisioning 3D object generation faces significant challenges in realizing its concepts using current generative models due to limited interaction capabilities. Existing methods mainly offer two approaches: (i) interpreting textual instructions with constrained controllability, or (ii) reconstructing 3D objects from 2D images. Both of them limit customization to the confines of the 2D reference and potentially introduce undesirable artifacts during the 3D lifting process, restricting the scope for direct and versatile 3D modifications. In this work, we introduce Interactive3D, an innovative framework for interactive 3D generation that grants users precise control over the generative process through extensive 3D interaction capabilities. Interactive3D is constructed in two cascading stages, utilizing distinct 3D representations. The first stage employs Gaussian Splatting for direct user interaction, allowing modifications and guidance of the generative direction at any intermediate step through (i) Adding and Removing components, (ii) Deformable and Rigid Dragging, (iii) Geometric Transformations, and (iv) Semantic Editing. Subsequently, the Gaussian splats are transformed into InstantNGP. We introduce a novel (v) Interactive Hash Refinement module to further add details and extract the geometry in the second stage. Our experiments demonstrate that Interactive3D markedly improves the controllability and quality of 3D generation. Our project webpage is available at https://interactive-3d.github.io/.

Submitted 25 April, 2024; originally announced April 2024.

Comments: project page: https://interactive-3d.github.io/

arXiv:2404.13576 [pdf, other]

I2CANSAY: Inter-Class Analogical Augmentation and Intra-Class Significance Analysis for Non-Exemplar Online Task-Free Continual Learning

Authors: Songlin Dong, Yingjie Chen, Yuhang He, Yuhan Jin, Alex C. Kot, Yihong Gong

Abstract: Online task-free continual learning (OTFCL) is a more challenging variant of continual learning which emphasizes the gradual shift of task boundaries and learns in an online mode. Existing methods rely on a memory buffer composed of old samples to prevent forgetting. However, the use of memory buffers not only raises privacy concerns but also hinders the efficient learning of new samples. To address this problem, we propose a novel framework called I2CANSAY that gets rid of the dependence on memory buffers and efficiently learns the knowledge of new data from one-shot samples. Concretely, our framework comprises two main modules. Firstly, the Inter-Class Analogical Augmentation (ICAN) module generates diverse pseudo-features for old classes based on the inter-class analogy of feature distributions for different new classes, serving as a substitute for the memory buffer. Secondly, the Intra-Class Significance Analysis (ISAY) module analyzes the significance of attributes for each class via its distribution standard deviation, and generates the importance vector as a correction bias for the linear classifier, thereby enhancing the capability of learning from new samples. We run our experiments on four popular image classification datasets: CoRe50, CIFAR-10, CIFAR-100, and CUB-200. Our approach outperforms the prior state-of-the-art by a large margin.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2404.13336 [pdf]

Modeling Seismic Wave Propagation in TTI Media Using Residual Perfectly Matched Layer

Authors: Yuqin Luo, Xintong Dong, Shiqi Dong, Tie Zhong, Yu Zhang, Ying Wang, Ning Hu

Abstract: The perfectly matched layer (PML) is commonly used in wave propagation, radiation and diffraction problems in unbounded space domains. A new implementation scheme of PML is presented. The PML formulation is pre-defined, and the wave field absorption is achieved by calculating the residual between the PML equation and original equation through backward induction. Two forms of the Residual PML (RPML) are presented: RPML-1, which defines the residual as the difference between the original and PML equations, and RPML-2, which defines the residual as the difference between the original and PML wave fields. RPML-2 is the simplest and easiest to extend, as it does not alter the original equation and only has one time partial derivative term in the residual equation. Additionally, since the residual equation has no spatial partial derivative term, high-order spatial difference discretization is unnecessary, which results in higher accuracy and computational efficiency. Furthermore, simulating a wave field in TTI media requires a high absorption effect and stability of PML. The numerical simulation demonstrates that RPML-2 provides better absorption performance and stability compared to ADEPML and NPML. To meet the needs of wave field simulation for complex media, a multiaxial complex frequency shifted RPML-2 (MCFS-RPML-2) is introduced, which employs double damping profiles and complex frequency shift technology to achieve higher stability and absorption effects.

Submitted 22 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.10249 [pdf]

Picturing the Gap Between the Performance and US-DOE's Hydrogen Storage Target: A Data-Driven Model for MgH2 Dehydrogenation

Authors: Chaoqun Li, Weijie Yang, Hao Liu, Xinyuan Liu, Xiujing Xing, Zhengyang Gao, Shuai Dong, Hao Li

Abstract: Developing solid-state hydrogen storage materials is as pressing as ever, which requires a comprehensive understanding of the dehydrogenation chemistry of a solid-state hydride. Transition state search and kinetics calculations are essential to understanding and designing high-performance solid-state hydrogen storage materials by filling in the knowledge gap that current experimental techniques cannot measure. However, the ab initio analysis of these processes is computationally expensive and time-consuming. Searching for descriptors to accurately predict the energy barrier is urgently needed, to accelerate the prediction of hydrogen storage material properties and identify the opportunities and challenges in this field. Herein, we develop a data-driven model to describe and predict the dehydrogenation barriers of a typical solid-state hydrogen storage material, magnesium hydride (MgH2), based on the combination of the crystal orbital Hamilton population of the Mg-H bond and the distance between atomic hydrogen. By deriving the distance energy ratio, this model elucidates the key chemistry of the reaction kinetics. All the parameters in this model can be directly calculated with significantly less computational cost than conventional transition state search, so that the dehydrogenation performance of hydrogen storage materials can be predicted efficiently. Finally, we found that this model leads to excellent agreement with typical experimental measurements reported to date and provides clear design guidelines on how to propel the performance of MgH2 closer to the target set by the United States Department of Energy (US-DOE).

Submitted 29 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.08405 [pdf]

doi 10.1021/acs.nanolett.3c05008

Unconventional superconducting diode effects via antisymmetry and antisymmetry breaking

Authors: Chong Li, Yang-Yang Lyu, Wen-Cheng Yue, Peiyuan Huang, Haojie Li, Tianyu Li, Chen-Guang Wang, Zixiong Yuan, Ying Dong, Xiaoyu Ma, Xuecou Tu, Tao Tao, Sining Dong, Liang He, Xiaoqing Jia, Guozhu Sun, Lin Kang, Huabing Wang, Francois M. Peeters, Milorad V. Milošević, Peiheng Wu, Yong-Lei Wang

Abstract: Symmetry-breaking plays a pivotal role in unlocking intriguing properties and functionalities in material systems. For example, the breaking of spatial and temporal symmetries leads to a fascinating phenomenon of superconducting diode effect. However, generating and precisely controlling the superconducting diode effect poses significant challenges. Here, we take a novel route with deliberate manipulation of magnetic charge potentials to realize unconventional superconducting flux-quantum diode effects. We achieve this through suitably tailored nanoengineered arrays of nanobar magnets on top of a superconducting thin film. We demonstrate the vital roles of inversion antisymmetry and its breaking in evoking unconventional superconducting effects: a magnetically symmetric diode effect and an odd-parity magnetotransport effect. These effects are non-volatilely controllable through in-situ magnetization switching of the nanobar magnets. Our findings promote the use of antisymmetry (breaking) for initiating unconventional superconducting properties, paving the way for exciting prospects and innovative functionalities in superconducting electronics.

Submitted 12 April, 2024; originally announced April 2024.

Journal ref: Nano Letters 24, 4108-4116 (2024)

arXiv:2404.07127 [pdf, other]

Searching for short-period variables in M31: method and catalogs

Authors: Hongrui Gu, Haibo Yuan, Subo Dong, Chenfa Zheng, Shenzhe Cui, Yi Ren, Haozhu Fu, Yang Huang, Zhou Fan

Abstract: Utilizing high-cadence and continuous g- and r-band data over three nights acquired from the 3.6-meter Canada France Hawaii Telescope (CFHT) aimed to find short-duration microlensing events, we conduct a systematic search for variables, transients, and asteroids across a $\sim1^\circ$ field of view of the Andromeda Galaxy (M 31). We present a catalog of 5859 variable stars, yielding the most extensive compilation of short-period variable sources of M 31. We also detected 19 flares, predominantly associated with foreground M dwarfs in the Milky Way. In addition, we discovered 17 previously unknown asteroid candidates, and we subsequently reported them to the Minor Planet Center. Lastly, we report a microlensing event candidate C-ML-1 and present a preliminary analysis.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.06369 [pdf, other]

VISION2UI: A Real-World Dataset with Layout for Code Generation from UI Designs

Authors: Yi Gui, Zhen Li, Yao Wan, Yemin Shi, Hongyu Zhang, Yi Su, Shaoling Dong, Xing Zhou, Wenbin Jiang

Abstract: Automatically generating UI code from webpage design visions can significantly alleviate the burden on developers, enabling beginner developers or designers to directly generate Web pages from design diagrams. Currently, prior research has accomplished the objective of generating UI code from rudimentary design visions or sketches through designing deep neural networks. Inspired by the groundbreaking advancements achieved by Multimodal Large Language Models (MLLMs), the automatic generation of UI code from high-fidelity design images is now emerging as a viable possibility. Nevertheless, our investigation reveals that existing MLLMs are hampered by the scarcity of authentic, high-quality, and large-scale datasets, leading to unsatisfactory performance in automated UI code generation. To mitigate this gap, we present a novel dataset, termed VISION2UI, extracted from real-world scenarios, augmented with comprehensive layout information, tailored specifically for finetuning MLLMs in UI code generation. Specifically, this dataset is derived through a series of operations, encompassing collecting, cleaning, and filtering of the open-source Common Crawl dataset. In order to uphold its quality, a neural scorer trained on labeled samples is utilized to refine the data, retaining higher-quality instances. Ultimately, this process yields a dataset comprising 2,000 parallel samples (with many more coming soon) encompassing design visions and UI code. The dataset is available at https://huggingface.co/datasets/xcodemind/vision2ui.

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2404.06153 [pdf, other]

scRDiT: Generating single-cell RNA-seq data by diffusion transformers and accelerating sampling

Authors: Shengze Dong, Zhuorui Cui, Ding Liu, Jinzhi Lei

Abstract: Motivation: Single-cell RNA sequencing (scRNA-seq) is a groundbreaking technology extensively utilized in biological research, facilitating the examination of gene expression at the individual cell level within a given tissue sample. While numerous tools have been developed for scRNA-seq data analysis, the challenge persists in capturing the distinct features of such data and replicating virtual datasets that share analogous statistical properties. Results: Our study introduces a generative approach termed scRNA-seq Diffusion Transformer (scRDiT). This method generates virtual scRNA-seq data by leveraging a real dataset. The method is a neural network constructed based on Denoising Diffusion Probabilistic Models (DDPMs) and Diffusion Transformers (DiTs). This involves subjecting Gaussian noises to the real dataset through iterative noise-adding steps and ultimately restoring the noises to form scRNA-seq samples. This scheme allows us to learn data features from actual scRNA-seq samples during model training. Our experiments, conducted on two distinct scRNA-seq datasets, demonstrate superior performance. Additionally, the model sampling process is expedited by incorporating Denoising Diffusion Implicit Models (DDIM). scRDiT presents a unified methodology empowering users to train neural network models with their unique scRNA-seq datasets, enabling the generation of numerous high-quality scRNA-seq samples. Availability and implementation: https://github.com/DongShengze/scRDiT

Submitted 9 April, 2024; originally announced April 2024.

Comments: 11 pages, 4 figures

arXiv:2404.02427 [pdf]

doi 10.1063/5.0097518

In-situ tunable giant electrical anisotropy in a grating gated AlGaN/GaN two-dimensional electron gas

Authors: Ting-Ting Wang, Sining Dong, Chong Li, Wen-Cheng Yue, Yang-Yang Lyu, Chen-Guang Wang, Chang-Kun Zeng, Zixiong Yuan, Wei Zhu, Zhi-Li Xiao, Xiaoli Lu, Bin Liu, Hai Lu, Hua-Bing Wang, Peiheng Wu, Wai-Kwong Kwok, Yong-Lei Wang

Abstract: Materials with in-plane electrical anisotropy have great potential for designing artificial synaptic devices. However, natural materials with strong intrinsic in-plane electrical anisotropy are rare. We introduce a simple strategy to produce extremely large electrical anisotropy via grating gating of a semiconductor two-dimensional electron gas (2DEG) of AlGaN/GaN. We show that periodically modulated electric potential in the 2DEG induces in-plane electrical anisotropy, which is significantly enhanced in a magnetic field, leading to an ultra large electrical anisotropy. This is induced by a giant positive magnetoresistance and a giant negative magnetoresistance under two orthogonally oriented in-plane current flows, respectively. This giant electrical anisotropy is in-situ tunable by tailoring both the grating gate voltage and the magnetic field. Our semiconductor device with controllable giant electrical anisotropy will stimulate new device applications, such as multi-terminal memtransistors and bionic synapses.

Submitted 2 April, 2024; originally announced April 2024.

Journal ref: Appl. Phys. Lett. 121, 092101 (2022)

arXiv:2403.18201 [pdf, other]

Few-shot Online Anomaly Detection and Segmentation

Authors: Shenxing Wei, Xing Wei, Zhiheng Ma, Songlin Dong, Shaochen Zhang, Yihong Gong

Abstract: Detecting anomaly patterns from images is a crucial artificial intelligence technique in industrial applications. Recent research in this domain has emphasized the necessity of a large volume of training data, overlooking the practical scenario where, post-deployment of the model, unlabeled data containing both normal and abnormal samples can be utilized to enhance the model's performance. Consequently, this paper focuses on addressing the challenging yet practical few-shot online anomaly detection and segmentation (FOADS) task. Under the FOADS framework, models are trained on a few-shot normal dataset, followed by inspection and improvement of their capabilities by leveraging unlabeled streaming data containing both normal and abnormal samples simultaneously. To tackle this issue, we propose modeling the feature distribution of normal images using a Neural Gas network, which offers the flexibility to adapt the topology structure to identify outliers in the data flow. In order to achieve improved performance with limited training samples, we employ multi-scale feature embedding extracted from a CNN pre-trained on ImageNet to obtain a robust representation. Furthermore, we introduce an algorithm that can incrementally update parameters without the need to store previous samples. Comprehensive experimental results demonstrate that our method can achieve substantial performance under the FOADS setting, while ensuring that the time complexity remains within an acceptable range on MVTec AD and BTAD datasets.

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.08300 [pdf, other]

Spin relaxation in inhomogeneous magnetic fields with depolarizing boundaries

Authors: Yue Chang, Shuangai Wan, Shichao Dong, Jie Qin

Abstract: Field-inhomogeneity-induced relaxation of atomic spins confined in vapor cells with depolarizing walls is studied. In contrast to nuclear spins, such as noble-gas spins, which experience minimal polarization loss at cell walls, atomic spins in uncoated cells undergo randomization at the boundaries. This distinct boundary condition results in a varied dependence of the relaxation rate on the field gradient. By solving the Bloch-Torrey equation under fully depolarizing boundary conditions, we illustrate that the relaxation rate induced by field inhomogeneity is more pronounced for spins with a smaller original relaxation rate (in the absence of the inhomogeneous field). We establish an upper limit for the relaxation rate through calculations in the perturbation regime. Moreover, we connect it to the spin-exchange-relaxation-free magnetometers, demonstrating that its linewidth is most sensitive to inhomogeneous fields along the magnetometer's sensitive axis. Our theoretical result agrees with the experimental data for cells subjected to small pump power. However, deviations in larger input-power scenarios underscore the importance of considering pump field attenuation, which leads to uniformly distributed light shift that behaves as an inhomogeneous magnetic field.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.06670 [pdf, other]

CEAT: Continual Expansion and Absorption Transformer for Non-Exemplar Class-Incremental Learning

Authors: Xinyuan Gao, Songlin Dong, Yuhang He, Xing Wei, Yihong Gong

Abstract: In real-world applications, dynamic scenarios require the models to possess the capability to learn new tasks continuously without forgetting the old knowledge. Experience-Replay methods store a subset of the old images for joint training. In the scenario of more strict privacy protection, storing the old images becomes infeasible, which leads to a more severe plasticity-stability dilemma and classifier bias. To meet the above challenges, we propose a new architecture, named continual expansion and absorption transformer (CEAT). The model can learn the novel knowledge by extending the expanded-fusion layers in parallel with the frozen previous parameters. After the task ends, we losslessly absorb the extended parameters into the backbone to ensure that the number of parameters remains constant. To improve the learning ability of the model, we designed a novel prototype contrastive loss to reduce the overlap between old and new classes in the feature space. Besides, to address the classifier bias towards the new classes, we propose a novel approach to generate the pseudo-features to correct the classifier. We experiment with our methods on three standard Non-Exemplar Class-Incremental Learning (NECIL) benchmarks. Extensive experiments demonstrate that our model gets a significant improvement compared with the previous works and achieves 5.38%, 5.20%, and 4.92% improvement on CIFAR-100, TinyImageNet, and ImageNet-Subset.

Submitted 11 March, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.06393 [pdf, other]

A Functionally Connected Element Method for Solving Boundary Value Problems

Authors: Jielin Yang, Suchuan Dong

Abstract: We present the general forms of piece-wise functions on partitioned domains satisfying an intrinsic $C^0$ or $C^1$ continuity across the sub-domain boundaries. These general forms are constructed based on a strategy stemming from the theory of functional connections, and we refer to partitioned domains endowed with these general forms as functionally connected elements (FCE). We further present a method, incorporating functionally connected elements and a least squares collocation approach, for solving boundary and initial value problems. This method exhibits a spectral-like accuracy, with the free functions involved in the FCE form represented by polynomial bases or by non-polynomial bases of quasi-random sinusoidal functions. The FCE method offers a unique advantage over traditional element-based methods for boundary value problems involving relative boundary conditions. A number of linear and nonlinear numerical examples in one and two dimensions are presented to demonstrate the performance of the FCE method developed herein.

Submitted 10 March, 2024; originally announced March 2024.

Comments: 44 pages, 10 figures, 8 tables

Showing 1–50 of 911 results for author: Dong, S