
On Landau equation with harmonic potential: nonlinear stability of time-periodic Maxwell-Boltzmann distributions

Authors: Chuqi Cao, Ling-Bing He, Jie Ji

Abstract: We provide the first and rigorous confirmations of the hypotheses by Ludwig Boltzmann in his seminal paper \cite{Boltzmann} within the context of the Landau equation in the presence of a harmonic potential. We prove that (i) Each {\it entropy-invariant solution} can be identified as a {\it time-periodic Maxwell-Boltzmann distribution}. Moreover, these distributions can be characterized by thirteen conservation laws, which sheds light on the global dynamics. (ii) Each {\it time-periodic Maxwell-Boltzmann distribution} is nonlinearly stable, including neutral asymptotic stability and Lyapunov stability. Furthermore, the convergence rate is entirely reliant on the thirteen conservation laws and is optimal when compared to the linear scenario.

Submitted 10 July, 2024; originally announced July 2024.

Comments: 62 pages, 0 figures

MSC Class: 35Q20; 82B40

arXiv:2407.04942 [pdf, other]

FOSP: Fine-tuning Offline Safe Policy through World Models

Authors: Chenyang Cao, Yucheng Xin, Silang Wu, Longxiang He, Zichen Yan, Junbo Tan, Xueqian Wang

Abstract: Model-based Reinforcement Learning (RL) has shown its high training efficiency and capability of handling high-dimensional tasks. Regarding safety issues, safe model-based RL can achieve nearly zero-cost performance and effectively manage the trade-off between performance and safety. Nevertheless, prior works still pose safety challenges due to the online exploration in real-world deployment. To address this, some offline RL methods have emerged as solutions, which learn from a static dataset in a safe way by avoiding interactions with the environment. In this paper, we aim to further enhance safety during the deployment stage for vision-based robotic tasks by fine-tuning an offline-trained policy. We incorporate in-sample optimization, model-based policy expansion, and reachability guidance to construct a safe offline-to-online framework. Moreover, our method proves to improve the generalization of offline policy in unseen safety-constrained scenarios. Finally, the efficiency of our method is validated on simulation benchmarks with five vision-only tasks and a real robot by solving some deployment problems using limited data.

Submitted 5 July, 2024; originally announced July 2024.

Comments: 21 pages

arXiv:2407.04461 [pdf, other]

VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing

Authors: Shang Liu, Chaohui Yu, Chenjie Cao, Wen Qian, Fan Wang

Abstract: Recent research on texture synthesis for 3D shapes benefits a lot from dramatically developed 2D text-to-image diffusion models, including inpainting-based and optimization-based approaches. However, these methods ignore the modal gap between the 2D diffusion model and 3D objects, which primarily render 3D objects into 2D images and texture each image separately. In this paper, we revisit the texture synthesis and propose a Variance alignment based 3D-2D Collaborative Denoising framework, dubbed VCD-Texture, to address these issues. Formally, we first unify both 2D and 3D latent feature learning in diffusion self-attention modules with re-projected 3D attention receptive fields. Subsequently, the denoised multi-view 2D latent features are aggregated into 3D space and then rasterized back to formulate more consistent 2D predictions. However, the rasterization process suffers from an intractable variance bias, which is theoretically addressed by the proposed variance alignment, achieving high-fidelity texture synthesis. Moreover, we present an inpainting refinement to further improve the details with conflicting regions. Notably, there is not a publicly available benchmark to evaluate texture synthesis, which hinders its development. Thus we construct a new evaluation set built upon three open-source 3D datasets and propose to use four metrics to thoroughly validate the texturing performance. Comprehensive experiments demonstrate that VCD-Texture achieves superior performance against other counterparts.

Submitted 5 July, 2024; originally announced July 2024.

Comments: ECCV 2024

arXiv:2407.01960 [pdf, other]

Zero-shot Video Restoration and Enhancement Using Pre-Trained Image Diffusion Model

Authors: Cong Cao, Huanjing Yue, Xin Liu, Jingyu Yang

Abstract: Diffusion-based zero-shot image restoration and enhancement models have achieved great success in various image restoration and enhancement tasks without training. However, directly applying them to video restoration and enhancement results in severe temporal flickering artifacts. In this paper, we propose the first framework for zero-shot video restoration and enhancement based on a pre-trained image diffusion model. By replacing the self-attention layer with the proposed cross-previous-frame attention layer, the pre-trained image diffusion model can take advantage of the temporal correlation between neighboring frames. We further propose temporal consistency guidance, spatial-temporal noise sharing, and an early stopping sampling strategy for better temporally consistent sampling. Our method is a plug-and-play module that can be inserted into any diffusion-based zero-shot image restoration or enhancement methods to further improve their performance. Experimental results demonstrate the superiority of our proposed method in producing temporally consistent videos with better fidelity.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 19 pages

arXiv:2406.00806 [pdf, other]

Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection

Authors: Chentao Cao, Zhun Zhong, Zhanke Zhou, Yang Liu, Tongliang Liu, Bo Han

Abstract: Detecting out-of-distribution (OOD) samples is essential when deploying machine learning models in open-world scenarios. Zero-shot OOD detection, requiring no training on in-distribution (ID) data, has been possible with the advent of vision-language models like CLIP. Existing methods build a text-based classifier with only closed-set labels. However, this largely restricts the inherent capability of CLIP to recognize samples from large and open label space. In this paper, we propose to tackle this constraint by leveraging the expert knowledge and reasoning capability of large language models (LLM) to Envision potential Outlier Exposure, termed EOE, without access to any actual OOD data. Owing to better adaptation to open-world scenarios, EOE can be generalized to different tasks, including far, near, and fine-grained OOD detection. Technically, we design (1) LLM prompts based on visual similarity to generate potential outlier class labels specialized for OOD detection, as well as (2) a new score function based on potential outlier penalty to distinguish hard OOD samples effectively. Empirically, EOE achieves state-of-the-art performance across different OOD tasks and can be effectively scaled to the ImageNet-1K dataset. The code is publicly available at: https://github.com/tmlr-group/EOE.

Submitted 2 June, 2024; originally announced June 2024.

Comments: ICML 2024

arXiv:2405.17792 [pdf, other]

JUNO Sensitivity to Invisible Decay Modes of Neutrons

Authors: JUNO Collaboration, Angel Abusleme, Thomas Adam, Kai Adamowicz, Shakeel Ahmad, Rizwan Ahmed, Sebastiano Aiello, Fengpeng An, Qi An, Giuseppe Andronico, Nikolay Anfimov, Vito Antonelli, Tatiana Antoshkina, João Pedro Athayde Marcondes de André, Didier Auguste, Weidong Bai, Nikita Balashov, Wander Baldini, Andrea Barresi, Davide Basilico, Eric Baussan, Marco Bellato, Marco Beretta, Antonio Bergnoli, Daniel Bick , et al. (635 additional authors not shown)

Abstract: We explore the bound neutrons decay into invisible particles (e.g., $n\rightarrow 3 ν$ or $nn \rightarrow 2 ν$) in the JUNO liquid scintillator detector. The invisible decay includes two decay modes: $ n \rightarrow { inv} $ and $ nn \rightarrow { inv} $. The invisible decays of $s$-shell neutrons in $^{12}{\rm C}$ will leave a highly excited residual nucleus. Subsequently, some de-excitation modes of the excited residual nuclei can produce a time- and space-correlated triple coincidence signal in the JUNO detector. Based on a full Monte Carlo simulation informed with the latest available data, we estimate all backgrounds, including inverse beta decay events of the reactor antineutrino $\bar{ν}_e$, natural radioactivity, cosmogenic isotopes and neutral current interactions of atmospheric neutrinos. Pulse shape discrimination and multivariate analysis techniques are employed to further suppress backgrounds. With two years of exposure, JUNO is expected to give an order of magnitude improvement compared to the current best limits. After 10 years of data taking, the JUNO expected sensitivities at a 90% confidence level are $τ/B( n \rightarrow { inv} ) > 5.0 \times 10^{31} \, {\rm yr}$ and $τ/B( nn \rightarrow { inv} ) > 1.4 \times 10^{32} \, {\rm yr}$.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 28 pages, 7 figures, 4 tables

arXiv:2405.15438 [pdf, other]

Comparing remote sensing-based forest biomass mapping approaches using new forest inventory plots in contrasting forests in northeastern and southwestern China

Authors: Wenquan Dong, Edward T. A. Mitchard, Yuwei Chen, Man Chen, Congfeng Cao, Peilun Hu, Cong Xu, Steven Hancock

Abstract: Large-scale high spatial resolution aboveground biomass (AGB) maps play a crucial role in determining forest carbon stocks and how they are changing, which is instrumental in understanding the global carbon cycle, and implementing policy to mitigate climate change. The advent of the new space-borne LiDAR sensor, NASA's GEDI instrument, provides unparalleled possibilities for the accurate and unbiased estimation of forest AGB at high resolution, particularly in dense and tall forests, where Synthetic Aperture Radar (SAR) and passive optical data exhibit saturation. However, GEDI is a sampling instrument, collecting dispersed footprints, and its data must be combined with that from other continuous cover satellites to create high-resolution maps, using local machine learning methods. In this study, we developed local models to estimate forest AGB from GEDI L2A data, as the models used to create GEDI L4 AGB data incorporated minimal field data from China. We then applied LightGBM and random forest regression to generate wall-to-wall AGB maps at 25 m resolution, using extensive GEDI footprints as well as Sentinel-1 data, ALOS-2 PALSAR-2 and Sentinel-2 optical data. Through a 5-fold cross-validation, LightGBM demonstrated a slightly better performance than Random Forest across two contrasting regions. However, in both regions, the computation speed of LightGBM is substantially faster than that of the random forest model, requiring roughly one-third of the time to compute on the same hardware. Through the validation against field data, the 25 m resolution AGB maps generated using the local models developed in this study exhibited higher accuracy compared to the GEDI L4B AGB data. We found in both regions an increase in error as slope increased. The trained models were tested on nearby but different regions and exhibited good performance.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.09180 [pdf]

doi 10.1038/s41467-024-48224-1

Integrated and DC-powered superconducting microcomb

Authors: Chen-Guang Wang, Wuyue Xu, Chong Li, Lili Shi, Junliang Jiang, Tingting Guo, Wen-Cheng Yue, Tianyu Li, Ping Zhang, Yang-Yang Lyu, Jiazheng Pan, Xiuhao Deng, Ying Dong, Xuecou Tu, Sining Dong, Chunhai Cao, Labao Zhang, Xiaoqing Jia, Guozhu Sun, Lin Kang, Jian Chen, Yong-Lei Wang, Huabing Wang, Peiheng Wu

Abstract: Frequency combs, specialized laser sources emitting multiple equidistant frequency lines, have revolutionized science and technology with unprecedented precision and versatility. Recently, integrated frequency combs are emerging as scalable solutions for on-chip photonics. Here, we demonstrate a fully integrated superconducting microcomb that is easy to manufacture, simple to operate, and consumes ultra-low power. Our turnkey apparatus comprises a basic nonlinear superconducting device, a Josephson junction, directly coupled to a superconducting microstrip resonator. We showcase coherent comb generation through self-started mode-locking. Therefore, comb emission is initiated solely by activating a DC bias source, with power consumption as low as tens of picowatts. The resulting comb spectrum resides in the microwave domain and spans multiple octaves. The linewidths of all comb lines can be narrowed down to 1 Hz through a unique coherent injection-locking technique. Our work represents a critical step towards fully integrated microwave photonics and offers the potential for integrated quantum processors.

Submitted 15 May, 2024; originally announced May 2024.

Journal ref: Nature Communications 15, 4009 (2024)

arXiv:2405.09170 [pdf]

doi 10.1088/1674-1056/ad2f21

Tunable superconducting resonators via on-chip control of local magnetic field

Authors: Chen-Guang Wang, Wen-Cheng Yue, Xuecou Tu, Tianyuan Chi, Tingting Guo, Yang-Yang Lyu, Sining Dong, Chunhai Cao, Labao Zhang, Xiaoqing Jia, Guozhu Sun, Lin Kang, Jian Chen, Yong-Lei Wang, Huabing Wang, Peiheng Wu

Abstract: Superconducting microwave resonators play a pivotal role in superconducting quantum circuits. The ability to fine-tune their resonant frequencies provides enhanced control and flexibility. Here, we introduce a frequency-tunable superconducting coplanar waveguide resonator. By applying electrical currents through specifically designed ground wires, we achieve the generation and control of a localized magnetic field on the central line of the resonator, enabling continuous tuning of its resonant frequency. We demonstrate a frequency tuning range of 54.85 MHz in a 6.21 GHz resonator. This integrated and tunable resonator holds great potential as a dynamically tunable filter and as a key component of communication buses and memory elements in superconducting quantum computing.

Submitted 15 May, 2024; originally announced May 2024.

Journal ref: Chin. Phys. B 33, 058402 (2024)

arXiv:2405.08441 [pdf, other]

Unveiling quantum phase transitions from traps in variational quantum algorithms

Authors: Chenfeng Cao, Filippo Maria Gambetta, Ashley Montanaro, Raul A. Santos

Abstract: Understanding quantum phase transitions in physical systems is fundamental to characterize their behaviour at small temperatures. Achieving this requires both accessing good approximations to the ground state and identifying order parameters to distinguish different phases. Addressing these challenges, our work introduces a hybrid algorithm that combines quantum optimization with classical machine learning. This approach leverages the capability of near-term quantum computers to prepare locally trapped states through finite optimization. Specifically, we utilize LASSO for identifying conventional phase transitions and the Transformer model for topological transitions, applying these with a sliding window of Hamiltonian parameters to learn appropriate order parameters and estimate the critical points accurately. We verified the effectiveness of our method with numerical simulation and real-hardware experiments on Rigetti's Ankaa 9Q-1 quantum computer. Our protocol not only provides a robust framework for investigating quantum phase transitions using shallow quantum circuits but also significantly enhances efficiency and precision, opening new avenues in the integration of quantum computing and machine learning.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 19 pages, 9 figures

arXiv:2405.04716 [pdf, other]

Physics-based deep learning reveals rising heating demand heightens air pollution in Norwegian cities

Authors: Cong Cao, Ramit Debnath, R. Michael Alvarez

Abstract: Policymakers frequently analyze air quality and climate change in isolation, disregarding their interactions. This study explores the influence of specific climate factors on air quality by contrasting a regression model with K-Means Clustering, Hierarchical Clustering, and Random Forest techniques. We employ Physics-based Deep Learning (PBDL) and Long Short-Term Memory (LSTM) to examine the air pollution predictions. Our analysis utilizes ten years (2009-2018) of daily traffic, weather, and air pollution data from three major cities in Norway. Findings from feature selection reveal a correlation between rising heating degree days and heightened air pollution levels, suggesting increased heating activities in Norway are a contributing factor to worsening air quality. PBDL demonstrates superior accuracy in air pollution predictions compared to LSTM. This paper contributes to the growing literature on PBDL methods for more accurate air pollution predictions using environmental variables, aiding policymakers in formulating effective data-driven climate policies.

Submitted 7 May, 2024; originally announced May 2024.

Comments: 52 pages, 23 figures

ACM Class: K.4.1; J.2; I.2

arXiv:2404.18055 [pdf, other]

doi 10.1063/5.0190257

Enhanced torque efficiency in ferromagnetic multilayers by introducing naturally oxidized Cu

Authors: Kun Zheng, Cuimei Cao, Yingying Lu, Jing Meng, Junpeng Pan, Zhenjie Zhao, Yang Xu, Tian Shang, Qingfeng Zhan

Abstract: Spin-orbit torque (SOT) in the heavy elements with a large spin-orbit coupling (SOC) has been frequently used to manipulate the magnetic states in spintronic devices. Recent theoretical works have predicted that the surface oxidized light elements with a negligible SOC can yield a sizable orbital torque (OT), which plays an important role in switching the magnetization. Here, we report anomalous-Hall-resistance and harmonic-Hall-voltage measurements on perpendicularly magnetized Ta/Cu/[Ni/Co]$_5$/Cu-CuO$_x$ multilayers. Both torque efficiency and spin-Hall angle of these multilayers are largely enhanced by introducing a naturally oxidized Cu-CuO$_x$ layer, where the SOC is negligible. Such an enhancement is mainly due to the collaborative driven of the SOT from the Ta layer and the OT from the Cu/CuO$_x$ interface, and can be tuned by controlling the thickness of Cu-CuO$_x$ layer. Compared to the Cu-CuO$_x$-free multilayers, the maximum torque efficiency and spin-Hall angle were enhanced by a factor of ten, larger than most of the reported values in the other heterostructures.

Submitted 27 April, 2024; originally announced April 2024.

Comments: 10 pages, 5 figures, accepted by Appl. Phys. Lett

Journal ref: Appl. Phys. Lett. 124, 192408 (2024)

arXiv:2404.17560 [pdf, other]

Exploiting many-body localization for scalable variational quantum simulation

Authors: Chenfeng Cao, Yeqing Zhou, Swamit Tannu, Nic Shannon, Robert Joynt

Abstract: Variational quantum algorithms have emerged as a promising approach to achieving practical quantum advantages using near-term quantum devices. Despite their potential, the scalability of these algorithms poses a significant challenge. This is largely attributed to the "barren plateau" phenomenon, which persists even in the absence of noise. In this work, we explore the many-body localization (MBL)-thermalization phase transitions within a framework of Floquet-initialized variational quantum circuits and investigate how MBL could be used to avoid barren plateaus. The phase transitions are observed through calculations of the inverse participation ratio, the entanglement entropy, and a metric termed low-weight stabilizer Rényi entropy. By initializing the circuit in the MBL phase and employing an easily preparable initial state, we find it is possible to prevent the formation of a unitary 2-design, resulting in an output state with entanglement that follows an area- rather than a volume-law, and which circumvents barren plateaus throughout the optimization. Utilizing this methodology, we successfully determine the ground states of various model Hamiltonians across different phases and show that the resources required for the optimization are significantly reduced. We have further validated the MBL approach through experiments carried out on the 127-qubit $ibm\_brisbane$ quantum processor. These experiments confirm that the gradients needed to carry out variational calculations are restored in the MBL phase of a Heisenberg model subject to random unitary "kicks". These results provide new insights into the interplay between MBL and quantum computing, and suggest that the role of MBL states should be considered in the design of quantum algorithms.

Submitted 20 May, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

Comments: 18 pages, 10 figures

arXiv:2404.16362 [pdf, other]

Feature graph construction with static features for malware detection

Authors: Binghui Zou, Chunjie Cao, Longjuan Wang, Yinan Cheng, Jingzhang Sun

Abstract: Malware can greatly compromise the integrity and trustworthiness of information and is in a constant state of evolution. Existing feature fusion-based detection methods generally overlook the correlation between features. Moreover, mere concatenation of features reduces the model's characterization ability, leading to low detection accuracy. These methods are also susceptible to concept drift and significant degradation of the model. To address those challenges, we introduce a feature graph-based malware detection method, MFGraph, to characterize applications by learning feature-to-feature relationships to achieve improved detection accuracy while mitigating the impact of concept drift. In MFGraph, we construct a feature graph using static features extracted from binary PE files, then apply a deep graph convolutional network to learn the representation of the feature graph. Finally, we employ the representation vectors obtained from the output of a three-layer perceptron to differentiate between benign and malicious software. We evaluated our method on the EMBER dataset, and the experimental results demonstrate that it achieves an AUC score of 0.98756 on the malware detection task, outperforming other baseline models. Furthermore, the AUC score of MFGraph decreases by only 5.884% in one year, indicating that it is the least affected by concept drift.

Submitted 25 April, 2024; originally announced April 2024.

arXiv:2404.13926 [pdf]

Using Polyvinyl Alcohol as Polymeric Adhesive to Enhance the Water Stability of Soil and its Performance

Authors: Chunyan Cao, Lingyu Zhao, Gang Li

Abstract: Soil degradation threatens agricultural productivity and food supply, leading to hunger issues in some developing regions. To address this challenge, we developed a low-cost, highly efficient, and long-term stable soil improvement method. We chose polyvinyl alcohol (PVA), a commercially available polymer that is safe and non-degradable, to serve as a soil adhesive. We mixed PVA solution into the soil and applied a drying treatment to enhance the bonding between PVA and the soil, achieving highly water-stable soil. This PVA-stabilized soil exhibits low bulk density, high porosity, and high permeability, making it an ideal substrate for planting. In a germination test, the PVA-stabilized soil revealed a higher germination rate and growth rate compared to those of the non-treated soil. We believe this simple and efficient soil improvement method can restore degraded soil and contribute to sustainable agriculture.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.04701 [pdf, other]

Flat-Band Enhanced Antiferromagnetic Fluctuations and Unconventional Superconductivity in Pressurized CsCr$_3$Sb$_5$

Authors: Siqi Wu, Chenchao Xu, Xiaoqun Wang, Hai-Qing Lin, Chao Cao, Guang-Han Cao

Abstract: The interrelationship between flat bands and correlated phenomena such as unconventional superconductivity stands as an intriguing subject in condensed matter physics. Here, by first-principles calculations and random phase approximation analyses, we investigate the electronic structure, superconducting instability, as well as roles of the incipient flat bands in kagome superconductor CsCr$_3$Sb$_5$. Our calculations reveal strong antiferromagnetic spin fluctuations in CsCr$_3$Sb$_5$, which mediates two sets of spin-singlet superconducting orders with $s_{\pm}$- and ($d_{xy}$, $d_{x^2-y^2}$)-wave symmetries. Under the dominance of local Coulomb interactions, the unoccupied incipient flat bands are shown to be crucial for the momentum dependence of spin fluctuations and thus the superconductivity. Our further analyses unveil a sublattice-momentum-coupling-driven mechanism for this momentum-dependent enhancement of the fluctuations, which provides us a new perspective for future studies of geometrically frustrated systems.

Submitted 6 April, 2024; originally announced April 2024.

arXiv:2404.04265 [pdf, other]

Accelerating Matrix Factorization by Dynamic Pruning for Fast Recommendation

Authors: Yining Wu, Shengyu Duan, Gaole Sai, Chenhong Cao, Guobing Zou

Abstract: Matrix factorization (MF) is a widely used collaborative filtering (CF) algorithm for recommendation systems (RSs), due to its high prediction accuracy, great flexibility and high efficiency in big data processing. However, with the dramatically increased number of users/items in current RSs, the computational complexity for training a MF model largely increases. Many existing works have accelerated MF, by either putting in additional computational resources or utilizing parallel systems, introducing a large cost. In this paper, we propose algorithmic methods to accelerate MF, without inducing any additional computational resources. Specifically, we observe fine-grained structured sparsity in the decomposed feature matrices when considering a certain threshold. The fine-grained structured sparsity causes a large amount of unnecessary operations during both matrix multiplication and latent factor update, increasing the computational time of the MF training process. Based on the observation, we first propose to rearrange the feature matrices based on joint sparsity, which potentially makes a latent vector with a smaller index more dense than that with a larger index. The feature matrix rearrangement is given to limit the error caused by the later performed pruning process. We then propose to prune the insignificant latent factors by an early stopping process during both matrix multiplication and latent factor update. The pruning process is dynamically performed according to the sparsity of the latent factors for different users/items, to accelerate the process. The experiments show that our method can achieve 1.2-1.65 speedups, with up to 20.08% error increase, compared with the conventional MF training process. We also prove the proposed methods are applicable considering different hyperparameters including optimizer, optimization strategy and initialization method.

Submitted 18 March, 2024; originally announced April 2024.
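
The early-stopping idea in the abstract above can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation: it assumes the latent vectors have already been rearranged so that significant factors sit at small indices, and the names `effective_length` and `pruned_dot` and the threshold value are hypothetical.

```python
def effective_length(vector, threshold):
    """Count the leading factors to keep: all entries up to and including
    the last one whose magnitude reaches the threshold (assumed rule)."""
    for k in range(len(vector) - 1, -1, -1):
        if abs(vector[k]) >= threshold:
            return k + 1
    return 0


def pruned_dot(p_u, q_i, threshold):
    """Dot product that stops early once either vector has only
    insignificant factors left (hypothetical pruning rule)."""
    stop = min(effective_length(p_u, threshold),
               effective_length(q_i, threshold))
    return sum(p_u[k] * q_i[k] for k in range(stop))


# Toy user/item latent vectors, already ordered from dense to sparse.
user_factors = [0.9, 0.5, 0.01, 0.0]
item_factors = [0.8, 0.02, 0.3, 0.01]
approx_rating = pruned_dot(user_factors, item_factors, threshold=0.05)
exact_rating = sum(a * b for a, b in zip(user_factors, item_factors))
```

Here the pruned product uses two multiplications instead of four, and the small gap between `approx_rating` and `exact_rating` is the kind of error the abstract trades for speed.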

arXiv:2404.03736 [pdf, other]

SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer

Authors: Zijie Wu, Chaohui Yu, Yanqin Jiang, Chenjie Cao, Fan Wang, Xiang Bai

Abstract: Recent advances in 2D/3D generative models enable the generation of dynamic 3D objects from a single-view video. Existing approaches utilize score distillation sampling to form the dynamic scene as dynamic NeRF or dense 3D Gaussians. However, these methods struggle to strike a balance among reference view alignment, spatio-temporal consistency, and motion fidelity under single-view conditions due to the implicit nature of NeRF or the intricate dense Gaussian motion prediction. To address these issues, this paper proposes an efficient, sparse-controlled video-to-4D framework named SC4D that decouples motion and appearance to achieve superior video-to-4D generation. Moreover, we introduce Adaptive Gaussian (AG) initialization and Gaussian Alignment (GA) loss to mitigate the shape degeneration issue, ensuring the fidelity of the learned motion and shape. Comprehensive experimental results demonstrate that our method surpasses existing methods in both quality and efficiency. In addition, facilitated by the disentangled modeling of motion and appearance of SC4D, we devise a novel application that seamlessly transfers the learned motion onto a diverse array of 4D entities according to textual descriptions.

Submitted 4 April, 2024; originally announced April 2024.

Comments: Project Page: https://sc4d.github.io/

arXiv:2403.14953 [pdf, other]

Radial Wave in the Galactic Disk: New Clues to Discriminate Different Perturbations

Authors: Chengye Cao, Zhao-Yu Li, Ralph Schönrich, Teresa Antoja

Abstract: Decoding the key dynamical processes that shape the Galactic disk structure is crucial for reconstructing the Milky Way's evolution history. The second Gaia data release unveils a novel wave pattern in the $L_Z-\langle V_R\rangle$ space, but its formation mechanism remains elusive due to the intricate nature of involved perturbations and the challenges in disentangling their effects. Utilizing the latest Gaia DR3 data, we find that the $L_Z-\langle V_R\rangle$ wave systematically shifts towards lower $L_Z$ for dynamically hotter stars. The amplitude of this phase shift between stars of different dynamical hotness ($\Delta L_Z$) peaks around $\mathrm{2300\,km\,s^{-1}\,kpc}$. To differentiate the role of different perturbations, we perform three sets of test particle simulations, wherein a satellite galaxy, corotating transient spiral arms, and a bar plus the corotating transient spiral arms act as the sole perturber, respectively. Under the satellite impact, the phase shift amplitude decreases towards higher $L_Z$, which we interpret through a toy model of radial phase mixing. While the corotating transient spiral arms do not generate an azimuthally universal phase shift variation pattern, combining the bar and spirals generates a characteristic $\Delta L_Z$ peak at the 2:1 Outer Lindblad Resonance, qualitatively resembling the observed feature. Therefore, the $L_Z-\langle V_R\rangle$ wave is more likely of internal origin. Furthermore, linking the $\Delta L_Z$ peak to the 2:1 Outer Lindblad Resonance offers a novel approach to estimating the pattern speed of the Galactic bar, supporting the long/slow bar model.

Submitted 22 March, 2024; originally announced March 2024.

Comments: Submitted to ApJ, 13 pages, 8 figures, comments welcome

arXiv:2403.11713 [pdf, other]

Ac$_3$Ni$_2$O$_7$ and La$_2$$Ae$Ni$_2$O$_6$F ($Ae$ = Sr, Ba): Benchmark Materials for Bilayer Nickelate Superconductivity

Authors: Siqi Wu, Zihan Yang, Xin Ma, Jianhui Dai, Ming Shi, Hui-Qiu Yuan, Hai-Qing Lin, Chao Cao

Abstract: We theoretically propose Ac$_3$Ni$_2$O$_7$, La$_2$BaNi$_2$O$_6$F, and La$_2$SrNi$_2$O$_6$F compounds to be benchmark materials for bilayer nickelate superconductivity. The stable phases of Ac$_3$Ni$_2$O$_7$ and La$_2$BaNi$_2$O$_6$F are found to be $I4/mmm$ without the lattice distortion caused by octahedra rotation at ambient pressure, whereas the lattice distortion in La$_2$SrNi$_2$O$_6$F can be suppressed with a relatively small external pressure of 4 GPa. The magnetism, electronic structure and spin susceptibilities of Ac$_3$Ni$_2$O$_7$ are extremely close to those of La$_3$Ni$_2$O$_7$ at 30 GPa. The ground states of La$_2$BaNi$_2$O$_6$F and La$_2$SrNi$_2$O$_6$F are antiferromagnetically coupled checkerboard bilayers with a sizable magnetic moment on Ni. In addition, the inter-layer coupling $J_{\perp}$ between Ni-bilayers in La$_2$BaNi$_2$O$_6$F or La$_2$SrNi$_2$O$_6$F is only $\sim$ 1/10 of that in Ac$_3$Ni$_2$O$_7$ or La$_3$Ni$_2$O$_7$ at 30 GPa. We argue that these compounds may serve as superconducting candidates at ambient pressure and can be employed to test theoretical proposals for bilayer nickelate superconductivity.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.07056 [pdf, other]

Gravitational back-reaction is magical

Authors: ChunJun Cao, Gong Cheng, Alioscia Hamma, Lorenzo Leone, William Munizzi, Salvatore F. E. Oliviero

Abstract: We study the interplay between magic and entanglement in quantum many-body systems. We show that non-local magic, which is supported by the quantum correlations, is lower bounded by the non-flatness of the entanglement spectrum and upper bounded by the amount of entanglement in the system. We then argue that a smoothed version of non-local magic bounds the hardness of classical simulations for incompressible states. In conformal field theories, we conjecture that the non-local magic should scale linearly with entanglement entropy but sublinearly when an approximation of the state is allowed. We support the conjectures using both analytical arguments based on unitary distillation and numerical data from an Ising CFT. If the CFT has a holographic dual, then we prove that the non-local magic vanishes if and only if there is no gravitational back-reaction. Furthermore, we show that non-local magic is approximately equal to the rate of change of the minimal surface area in response to the change of cosmic brane tension in the bulk.

Submitted 16 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: 62 pages, 20 figures; title changed, Theorem 1 and 2 refined, references added

arXiv:2403.01826 [pdf, other]

A Novel Shortest Path Query Algorithm Based on Optimized Adaptive Topology Structure

Authors: Xiao Fang, Xuyang Song, Jiyuan Ma, Guanhua Liu, Shurong Pang, Wenbo Zhao, Cong Cao, Ling Fan

Abstract: Urban rail transit is a fundamental component of public transportation; however, common station-based path search algorithms often overlook the impact of transfer times on search results, leading to decreased accuracy. To solve this problem, this paper proposes a novel shortest path query algorithm based on adaptive topology optimization called the Adaptive Topology Extension Road Network Structure (ATEN). This algorithm categorizes transfer stations into different types and treats travel time and transfer time equivalently as weights for edges in the topological graph. The proposed algorithm introduces virtual stations to differentiate between pedestrian paths and train paths, eliminating the need for additional operations on transfer stations. The algorithm controls the extent of expansion in the urban rail transit topology, overcoming query errors caused by mishandling of transfer stations in the existing algorithm. Finally, a series of simulation experiments were conducted on Beijing's urban rail transit network to validate both the correctness and the efficiency of the proposed adaptive topology optimization algorithm. The results demonstrate significant advantages compared to existing similar algorithms.

Submitted 4 March, 2024; originally announced March 2024.
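
The general idea of treating transfer time as an edge weight, described in the abstract above, can be sketched with a plain Dijkstra search. This is not the ATEN algorithm itself: it assumes each transfer station is split into one virtual node per line, written as a `(station, line)` pair, and the station names, line names and minute values are invented for illustration.

```python
import heapq
from collections import defaultdict


def build_graph(ride_edges, transfer_edges):
    """Undirected weighted graph whose nodes are (station, line) pairs.
    Ride edges and walking-transfer edges are stored the same way, so
    transfer time is simply another edge weight."""
    graph = defaultdict(list)
    for u, v, minutes in list(ride_edges) + list(transfer_edges):
        graph[u].append((v, minutes))
        graph[v].append((u, minutes))
    return graph


def shortest_time(graph, source, target):
    """Standard Dijkstra; returns total minutes, or None if unreachable."""
    best = {source: 0}
    heap = [(0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, minutes in graph[node]:
            candidate = dist + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return None


# Hypothetical network: station B is a transfer between lines 1 and 2.
rides = [(("A", "1"), ("B", "1"), 3),
         (("B", "1"), ("C", "1"), 4),
         (("B", "2"), ("D", "2"), 2)]
transfers = [(("B", "1"), ("B", "2"), 5)]  # walking time inside station B
demo_graph = build_graph(rides, transfers)
minutes_a_to_d = shortest_time(demo_graph, ("A", "1"), ("D", "2"))
```

In this toy network the trip from A to D costs 3 minutes of riding, 5 minutes of walking inside B, and 2 more minutes of riding; a station-only graph that ignored the transfer edge would report 5 minutes instead.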

arXiv:2403.01734 [pdf, other]

Offline Goal-Conditioned Reinforcement Learning for Safety-Critical Tasks with Recovery Policy

Authors: Chenyang Cao, Zichen Yan, Renhao Lu, Junbo Tan, Xueqian Wang

Abstract: Offline goal-conditioned reinforcement learning (GCRL) aims at solving goal-reaching tasks with sparse rewards from an offline dataset. While prior work has demonstrated various approaches for agents to learn near-optimal policies, these methods encounter limitations when dealing with diverse constraints in complex environments, such as safety constraints. Some of these approaches prioritize goal attainment without considering safety, while others excessively focus on safety at the expense of training efficiency. In this paper, we study the problem of constrained offline GCRL and propose a new method called Recovery-based Supervised Learning (RbSL) to accomplish safety-critical tasks with various goals. To evaluate the method's performance, we build a benchmark based on the robot-fetching environment with a randomly positioned obstacle and use expert or random policies to generate an offline dataset. We compare RbSL with three offline GCRL algorithms and one offline safe RL algorithm. As a result, our method outperforms the existing state-of-the-art methods to a large extent. Furthermore, we validate the practicality and effectiveness of RbSL by deploying it on a real Panda manipulator. Code is available at https://github.com/Sunlighted/RbSL.git.

Submitted 4 March, 2024; originally announced March 2024.

Comments: Accepted by ICRA24

MSC Class: 68T40

arXiv:2403.00323 [pdf, other]

Softened Symbol Grounding for Neuro-symbolic Systems

Authors: Zenan Li, Yuan Yao, Taolue Chen, Jingwei Xu, Chun Cao, Xiaoxing Ma, Jian Lü

Abstract: Neuro-symbolic learning generally consists of two separated worlds, i.e., neural network training and symbolic constraint solving, whose success hinges on symbol grounding, a fundamental problem in AI. This paper presents a novel, softened symbol grounding process, bridging the gap between the two worlds, and resulting in an effective and efficient neuro-symbolic learning framework. Technically, the framework features (1) modeling of symbol solution states as a Boltzmann distribution, which avoids expensive state searching and facilitates mutually beneficial interactions between network training and symbolic reasoning; (2) a new MCMC technique leveraging projection and SMT solvers, which efficiently samples from disconnected symbol solution spaces; (3) an annealing mechanism that can escape from sub-optimal symbol groundings. Experiments with three representative neuro-symbolic learning tasks demonstrate that, owing to its superior symbol grounding capability, our framework successfully solves problems well beyond the frontier of the existing proposals.

Submitted 1 March, 2024; originally announced March 2024.

Comments: Published as a conference paper at ICLR 2023. Code is available at https://github.com/SoftWiser-group/Soften-NeSy-learning

arXiv:2402.12886 [pdf, other]

Real-time High-resolution View Synthesis of Complex Scenes with Explicit 3D Visibility Reasoning

Authors: Tiansong Zhou, Yebin Liu, Xuangeng Chu, Chengkun Cao, Changyin Zhou, Fei Yu, Yu Li

Abstract: Rendering photo-realistic novel-view images of complex scenes has been a long-standing challenge in computer graphics. In recent years, great research progress has been made on enhancing rendering quality and accelerating rendering speed in the realm of view synthesis. However, when rendering complex dynamic scenes with sparse views, the rendering quality remains limited due to occlusion problems. Besides, for rendering high-resolution images on dynamic scenes, the rendering speed is still far from real-time. In this work, we propose a generalizable view synthesis method that can render high-resolution novel-view images of complex static and dynamic scenes in real-time from sparse views. To address the occlusion problems arising from the sparsity of input views and the complexity of captured scenes, we introduce an explicit 3D visibility reasoning approach that can efficiently estimate the visibility of sampled 3D points to the input views. The proposed visibility reasoning approach is fully differentiable and can gracefully fit inside the volume rendering pipeline, allowing us to train our networks with only multi-view images as supervision while refining geometry and texture simultaneously. Besides, each module in our pipeline is carefully designed to bypass the time-consuming MLP querying process and enhance the rendering quality of high-resolution images, enabling us to render high-resolution novel-view images in real-time. Experimental results show that our method outperforms previous view synthesis methods in both rendering quality and speed, particularly when dealing with complex dynamic scenes with sparse views.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.11016 [pdf, other]

Holographic phenomenology via overlapping degrees of freedom

Authors: Oliver Friedrich, ChunJun Cao, Sean M. Carroll, Gong Cheng, Ashmeet Singh

Abstract: The holographic principle suggests that regions of space contain fewer physical degrees of freedom than would be implied by conventional quantum field theory. Meanwhile, in Hilbert spaces of large dimension $2^n$, it is possible to define $N \gg n$ Pauli algebras that are nearly anti-commuting (but not quite) and which can be thought of as "overlapping degrees of freedom". We propose to model the phenomenology of holographic theories by allowing field-theory modes to be overlapping, and derive potential observational consequences. In particular, we build a Fermionic quantum field whose effective degrees of freedom approximately obey area scaling and satisfy a cosmic Bekenstein bound, and compare predictions of that model to cosmic neutrino observations. Our implementation of holography implies a finite lifetime of plane waves, which depends on the overall UV cutoff of the theory. To allow for neutrino flux from blazar TXS 0506+056 to be observable, our model needs to have a cutoff $k_{\mathrm{UV}} \lesssim 500\, k_{\mathrm{LHC}}$. This is broadly consistent with current bounds on the energy spectrum of cosmic neutrinos from IceCube, but high energy neutrinos are a potential challenge for our model of holography. We motivate our construction via quantum mereology, i.e. using the idea that EFT degrees of freedom should emerge from an abstract theory of quantum gravity by finding quasi-classical Hilbert space decompositions. We also discuss how to extend the framework to Bosons. Finally, using results from random matrix theory we derive an analytical understanding of the energy spectrum of our theory. The numerical tools used in this work are publicly available within the GPUniverse package, https://github.com/OliverFHD/GPUniverse.

Submitted 5 March, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: 46 pages + appendix; code and data available at https://github.com/OliverFHD/GPUniverse

arXiv:2402.06248 [pdf, ps, other]

doi 10.1103/PhysRevB.109.054501

Distinct pressure evolution of superconductivity and charge-density-wave in kagome superconductor CsV$_3$Sb$_5$ thin flakes

Authors: Ge Ye, Mengwei Xie, Chufan Chen, Yanan Zhang, Dongting Zhang, Xin Ma, Xiangyu Zeng, Fanghang Yu, Yi Liu, Xiaozhi Wang, Guanghan Cao, Xiaofeng Xu, Xianhui Chen, Huiqiu Yuan, Chao Cao, Xin Lu

Abstract: It is intriguing to explore the coexistence and/or competition between charge-density-wave (CDW) and superconductivity (SC) in many correlated electron systems, such as cuprates, organic superconductors and dichalcogenides. Among them, the recently discovered $\mathbb{Z}_2$ topological kagome metals AV$_3$Sb$_5$ (A=K, Rb, Cs) serve as an ideal platform to study the intricate relation between them. Here, we report electrical resistance measurements on CsV$_3$Sb$_5$ thin flakes ($\approx$ 60 nm) under hydrostatic pressure up to 2.12 GPa to compare their pressure phase diagram of CDW and SC with that of the bulk form. Even though the CDW transition temperature (T$_{CDW}$) in CsV$_3$Sb$_5$ thin flakes is still monotonically suppressed under pressure and totally vanishes at P$_2$=1.83 GPa, similar to the bulk, the superconducting transition temperature (T$_c$) shows an initial decrease and a subsequent increase up to its maximum of $\sim$ 8.03 K at P$_2$, in sharp contrast with the M-shaped double domes in bulk CsV$_3$Sb$_5$. Our results suggest the important role of reduced dimensionality in the CDW state and its interplay with the SC, offering a new perspective to explore the exotic nature of CsV$_3$Sb$_5$.

Submitted 9 February, 2024; originally announced February 2024.

Comments: 7 pages, 5 figures

Journal ref: Phys. Rev. B 109, 054501 (2024)

arXiv:2402.02804 [pdf, ps, other]

doi 10.4208/cmaa.2024-0003

Time-velocity decay of solutions to the non-cutoff Boltzmann equation in the whole space

Authors: Chuqi Cao, Renjun Duan, Zongguang Li

Abstract: In this paper, we consider the perturbed solutions with polynomial tail in large velocities for the non-cutoff Boltzmann equation near global Maxwellians in the whole space. The global-in-time existence is proved in the weighted Sobolev spaces and the almost optimal time decay is obtained in Fourier transform based low-regularity spaces. The result shows a time-velocity decay structure of solutions that can be decomposed into two parts. One part allows the slow polynomial tail in large velocities, carries the initial data and enjoys the exponential or arbitrarily large polynomial time decay. The other part, with zero initial data, is dominated by the non-negative definite symmetric dissipation and has the exponential velocity decay but only the slow polynomial time decay.

Submitted 5 February, 2024; originally announced February 2024.

Journal ref: Commun. Math. Anal. Appl., 3 (2024), pp. 61-120

arXiv:2401.17634 [pdf, ps, other]

The global well-posedness and Newtonian limit for the relativistic Boltzmann equation in a periodic box

Authors: Chuqi Cao, Jing Ouyang, Yong Wang, Changguo Xiao

Abstract: In this paper, we study the Newtonian limit for the relativistic Boltzmann equation in a periodic box $\mathbb{T}^3$. We first establish the global-in-time mild solutions of the relativistic Boltzmann equation with uniform-in-$\mathfrak{c}$ estimates and time decay rate. Then we rigorously justify the global-in-time Newtonian limits from the relativistic Boltzmann solutions to the solution of the Newtonian Boltzmann equation in $L^1_pL^{\infty}_x$. Moreover, if the initial data of the Newtonian Boltzmann equation belong to $W^{1,\infty}(\mathbb{T}^3\times\mathbb{R}^3)$, based on a decomposition and $L^2-L^\infty$ argument, the global-in-time Newtonian limit is proved in $L^{\infty}_{x,p}$. The convergence rates of the Newtonian limit are obtained both in $L^1_pL^{\infty}_x$ and $L^{\infty}_{x,p}$.

Submitted 31 January, 2024; originally announced January 2024.

Comments: 56 pages, All comments are welcome

arXiv:2401.16861 [pdf, other]

Repositioning the Subject within Image

Authors: Yikai Wang, Chenjie Cao, Ke Fan, Qiaole Dong, Yifan Li, Xiangyang Xue, Yanwei Fu

Abstract: Current image manipulation primarily centers on static manipulation, such as replacing specific regions within an image or altering its overall style. In this paper, we introduce an innovative dynamic manipulation task, subject repositioning. This task involves relocating a user-specified subject to a desired position while preserving the image's fidelity. Our research reveals that the fundamental sub-tasks of subject repositioning, which include filling the void left by the repositioned subject, reconstructing obscured portions of the subject and blending the subject to be consistent with surrounding areas, can be effectively reformulated as a unified, prompt-guided inpainting task. Consequently, we can employ a single diffusion generative model to address these sub-tasks using various task prompts learned through our proposed task inversion technique. Additionally, we integrate pre-processing and post-processing techniques to further enhance the quality of subject repositioning. These elements together form our SEgment-gEnerate-and-bLEnd (SEELE) framework. To assess SEELE's effectiveness in subject repositioning, we assemble a real-world subject repositioning dataset called ReS. Results of SEELE on ReS demonstrate its efficacy.

Submitted 17 March, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

Comments: Project page: https://yikai-wang.github.io/seele/. Dataset: https://github.com/Yikai-Wang/ReS. Arxiv version uses small size images for fast preview. Full size PDF is available at project page

arXiv:2401.13531 [pdf, other]

QAGait: Revisit Gait Recognition from a Quality Perspective

Authors: Zengbin Wang, Saihui Hou, Man Zhang, Xu Liu, Chunshui Cao, Yongzhen Huang, Peipei Li, Shibiao Xu

Abstract: Gait recognition is a promising biometric method that aims to identify pedestrians from their unique walking patterns. Silhouette modality, renowned for its easy acquisition, simple structure, sparse representation, and convenient modeling, has been widely employed in controlled in-the-lab research. However, as gait recognition rapidly advances from in-the-lab to in-the-wild scenarios, various conditions raise significant challenges for silhouette modality, including 1) unidentifiable low-quality silhouettes (abnormal segmentation, severe occlusion, or even non-human shape), and 2) identifiable but challenging silhouettes (background noise, non-standard posture, slight occlusion). To address these challenges, we revisit the gait recognition pipeline and approach gait recognition from a quality perspective, namely QAGait. Specifically, we propose a series of cost-effective quality assessment strategies, including Maximal Connect Area and Template Match to eliminate background noise and unidentifiable silhouettes, and an Alignment strategy to handle non-standard postures. We also propose two quality-aware loss functions to integrate silhouette quality into optimization within the embedding space. Extensive experiments demonstrate our QAGait can guarantee both gait reliability and performance enhancement. Furthermore, our quality assessment strategies can seamlessly integrate with existing gait datasets, showcasing our superiority. Code is available at https://github.com/wzb-bupt/QAGait.

Submitted 24 January, 2024; originally announced January 2024.

Comments: Accepted by AAAI 2024

arXiv:2401.11673 [pdf, other]

MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo

Authors: Chenjie Cao, Xinlin Ren, Yanwei Fu

Abstract: Recent advancements in learning-based Multi-View Stereo (MVS) methods have prominently featured transformer-based models with attention mechanisms. However, existing approaches have not thoroughly investigated the profound influence of transformers on different MVS modules, resulting in limited depth estimation capabilities. In this paper, we introduce MVSFormer++, a method that prudently maximizes the inherent characteristics of attention to enhance various components of the MVS pipeline. Formally, our approach involves infusing cross-view information into the pre-trained DINOv2 model to facilitate MVS learning. Furthermore, we employ different attention mechanisms for the feature encoder and cost volume regularization, focusing on feature and spatial aggregations respectively. Additionally, we uncover that some design details would substantially impact the performance of transformer modules in MVS, including normalized 3D positional encoding, adaptive attention scaling, and the position of layer normalization. Comprehensive experiments on DTU, Tanks-and-Temples, BlendedMVS, and ETH3D validate the effectiveness of the proposed method. Notably, MVSFormer++ achieves state-of-the-art performance on the challenging DTU and Tanks-and-Temples benchmarks.

Submitted 21 January, 2024; originally announced January 2024.

Comments: Accepted to ICLR2024

Journal ref: ICLR (International Conference on Learning Representations) 2024

arXiv:2401.09838 [pdf, other]

doi 10.1145/3639478.3640022

CATMA: Conformance Analysis Tool For Microservice Applications

Authors: Clinton Cao, Simon Schneider, Nicolás E. Díaz Ferreyra, Sicco Verwer, Annibale Panichella, Riccardo Scandariato

Abstract: The microservice architecture allows developers to divide the core functionality of their software system into multiple smaller services. However, this architectural style also makes it harder for them to debug and assess whether the system's deployment conforms to its implementation. We present CATMA, an automated tool that detects non-conformances between the system's deployment and implementation. It automatically visualizes and generates potential interpretations for the detected discrepancies. Our evaluation of CATMA shows promising results in terms of performance and providing useful insights. CATMA is available at https://cyber-analytics.nl/catma.github.io/, and a demonstration video is available at https://youtu.be/WKP1hG-TDKc.

Submitted 23 January, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

Comments: 5 pages, 5 figures, ICSE '24 Demonstration Track

arXiv:2401.07540 [pdf, other]

Study Features via Exploring Distribution Structure

Authors: Chunxu Cao, Qiang Zhang

Abstract: In this paper, we present a novel framework for data redundancy measurement based on probabilistic modeling of datasets, and a new criterion for redundancy detection that is resilient to noise. We also develop new methods for data redundancy reduction using both deterministic and stochastic optimization techniques. Our framework is flexible and can handle different types of features, and our experiments on benchmark datasets demonstrate the effectiveness of our methods. We provide a new perspective on feature selection, and propose effective and robust approaches for both supervised and unsupervised learning problems.

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.07488 [pdf, ps, other]

Feature Selection via Maximizing Distances between Class Conditional Distributions

Authors: Chunxu Cao, Qiang Zhang

Abstract: For many data-intensive tasks, feature selection is an important preprocessing step. However, most existing methods do not directly and intuitively explore the intrinsic discriminative information of features. We propose a novel feature selection framework based on the distance between class conditional distributions, measured by integral probability metrics (IPMs). Our framework directly explores the discriminative information of features in the sense of distributions for supervised classification. We analyze the theoretical and practical aspects of IPMs for feature selection and construct criteria based on IPMs. We propose several variant feature selection methods of our framework based on the 1-Wasserstein distance and implement them on real datasets from different domains. Experimental results show that our framework can outperform state-of-the-art methods in terms of classification accuracy and robustness to perturbations.

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.07482 [pdf, ps, other]

A Contrast Based Feature Selection Algorithm for High-dimensional Data set in Machine Learning

Authors: Chunxu Cao, Qiang Zhang

Abstract: Feature selection is an important process in machine learning and knowledge discovery. By selecting the most informative features and eliminating irrelevant ones, the performance of learning algorithms can be improved and the extraction of meaningful patterns and insights from data can be facilitated. However, most existing feature selection methods, when applied to large datasets, encounter the bottleneck of high computation costs. To address this problem, we propose a novel filter feature selection method, ContrastFS, which selects discriminative features based on the discrepancies that features show between different classes. We introduce a dimensionless quantity as a surrogate representation to summarize the distributional individuality of certain classes; based on this quantity, we evaluate features and study the correlation among them. We validate the effectiveness and efficiency of our approach on several widely studied benchmark datasets. Results show that the new method performs favorably with negligible computation in comparison with other state-of-the-art feature selection methods.

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.06333 [pdf, other]

doi 10.1103/PhysRevB.108.195146

Direction dependent switching of carrier-type enabled by Fermi surface geometry

Authors: Shuaishuai Luo, Feng Du, Dajun Su, Yongjun Zhang, Jiawen Zhang, Jiacheng Xu, Yuxin Chen, Chao Cao, Michael Smidman, Frank Steglich, Huiqiu Yuan

Abstract: While charge carriers can typically be designated as either electron- or hole- type, depending on the sign of the Hall coefficient, some materials defy this straightforward classification. Here we find that LaRh$_6$Ge$_4$ goes beyond this dichotomy, where the Hall resistivity is electron-like for magnetic fields along the $c$-axis but hole-like in the basal plane. Together with first-principles calculations, we show that this direction-dependent switching of the carrier type arises within a single band, where the special geometry leads to charge carriers on the same Fermi surface orbiting as electrons along some directions, but holes along others. The relationship between the Fermi surface geometry and occurrence of a Hall sign reversal is further generalized by considering tight-binding model calculations, which show that this type of Fermi surface corresponds to a more robust means of realizing this phenomenon, suggesting an important route for tailoring direction dependent properties for advanced electronic device applications.

Submitted 11 January, 2024; originally announced January 2024.

Comments: 7 pages, 5 figures

Journal ref: Phys. Rev. B 108, 195146 (2023)

arXiv:2401.05334 [pdf, other]

URHand: Universal Relightable Hands

Authors: Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito

Abstract: Existing photorealistic relightable hand models require extensive identity-specific observations in different views, poses, and illuminations, and face challenges in generalizing to natural illuminations and novel identities. To bridge this gap, we present URHand, the first universal relightable hand model that generalizes across viewpoints, poses, illuminations, and identities. Our model allows few-shot personalization using images captured with a mobile phone, and is ready to be photorealistically rendered under novel illuminations. To simplify the personalization process while retaining photorealism, we build a powerful universal relightable prior based on neural relighting from multi-view images of hands captured in a light stage with hundreds of identities. The key challenge is scaling the cross-identity training while maintaining personalized fidelity and sharp details without compromising generalization under natural illuminations. To this end, we propose a spatially varying linear lighting model as the neural renderer that takes physics-inspired shading as input feature. By removing non-linear activations and bias, our specifically designed lighting model explicitly keeps the linearity of light transport. This enables single-stage training from light-stage data while generalizing to real-time rendering under arbitrary continuous illuminations across diverse identities. In addition, we introduce the joint learning of a physically based model and our neural relighting model, which further improves fidelity and generalization. Extensive experiments show that our approach achieves superior performance over existing methods in terms of both quality and generalizability. We also demonstrate quick personalization of URHand from a short phone scan of an unseen identity.

Submitted 10 January, 2024; originally announced January 2024.

Comments: Project Page https://frozenburning.github.io/projects/urhand/

arXiv:2312.15510 [pdf, ps, other]

The Vlasov-Maxwell-Boltzmann/Landau system with polynomial perturbation near Maxwellian

Authors: Chuqi Cao, Dingqun Deng, Xingyu Li

Abstract: In this paper, we study the Vlasov-Maxwell-Boltzmann system without angular cutoff and the Vlasov-Maxwell-Landau/Boltzmann system with polynomial perturbation $F=μ+f$ near global Maxwellian. In particular, we prove the global existence, uniqueness and large time behavior for solutions in a polynomial weighted space $H^N_{x,v}(\langle v\rangle^k)$. The method is based on Duhamel's principle with the crucial time-decay analysis on the particle distribution $f$ and the electromagnetic field $(E,B)$.

Submitted 24 December, 2023; originally announced December 2023.

arXiv:2312.15237 [pdf, other]

Towards Fine-Grained Explainability for Heterogeneous Graph Neural Network

Authors: Tong Li, Jiale Deng, Yanyan Shen, Luyu Qiu, Yongxiang Huang, Caleb Chen Cao

Abstract: Heterogeneous graph neural networks (HGNs) are prominent approaches to node classification tasks on heterogeneous graphs. Despite the superior performance, insights about the predictions made from HGNs are obscure to humans. Existing explainability techniques are mainly proposed for GNNs on homogeneous graphs. They focus on highlighting salient graph objects to the predictions whereas the problem of how these objects affect the predictions remains unsolved. Given heterogeneous graphs with complex structures and rich semantics, it is imperative that salient objects can be accompanied with their influence paths to the predictions, unveiling the reasoning process of HGNs. In this paper, we develop xPath, a new framework that provides fine-grained explanations for black-box HGNs specifying a cause node with its influence path to the target node. In xPath, we differentiate the influence of a node on the prediction w.r.t. every individual influence path, and measure the influence by perturbing graph structure via a novel graph rewiring algorithm. Furthermore, we introduce a greedy search algorithm to find the most influential fine-grained explanations efficiently. Empirical results on various HGNs and heterogeneous graphs show that xPath yields faithful explanations efficiently, outperforming the adaptations of advanced GNN explanation approaches.

Submitted 23 December, 2023; originally announced December 2023.

Comments: Accepted by AAAI2023

arXiv:2312.08679 [pdf, other]

A Local Appearance Model for Volumetric Capture of Diverse Hairstyle

Authors: Ziyan Wang, Giljoo Nam, Aljaz Bozic, Chen Cao, Jason Saragih, Michael Zollhoefer, Jessica Hodgins

Abstract: Hair plays a significant role in personal identity and appearance, making it an essential component of high-quality, photorealistic avatars. Existing approaches either focus on modeling the facial region only or rely on personalized models, limiting their generalizability and scalability. In this paper, we present a novel method for creating high-fidelity avatars with diverse hairstyles. Our method leverages the local similarity across different hairstyles and learns a universal hair appearance prior from multi-view captures of hundreds of people. This prior model takes 3D-aligned features as input and generates dense radiance fields conditioned on a sparse point cloud with color. As our model splits different hairstyles into local primitives and builds prior at that level, it is capable of handling various hair topologies. Through experiments, we demonstrate that our model captures a diverse range of hairstyles and generalizes well to challenging new hairstyles. Empirical results show that our method improves the state-of-the-art approaches in capturing and generating photorealistic, personalized avatars with complete hair.

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2312.08303 [pdf, other]

Efficient Toxic Content Detection by Bootstrapping and Distilling Large Language Models

Authors: Jiang Zhang, Qiong Wu, Yiming Xu, Cheng Cao, Zheng Du, Konstantinos Psounis

Abstract: Toxic content detection is crucial for online services to remove inappropriate content that violates community standards. To automate the detection process, prior works have proposed varieties of machine learning (ML) approaches to train Language Models (LMs) for toxic content detection. However, both their accuracy and transferability across datasets are limited. Recently, Large Language Models (LLMs) have shown promise in toxic content detection due to their superior zero-shot and few-shot in-context learning ability as well as broad transferability on ML tasks. However, efficiently designing prompts for LLMs remains challenging. Moreover, the high run-time cost of LLMs may hinder their deployments in production. To address these challenges, in this work, we propose BD-LLM, a novel and efficient approach to Bootstrapping and Distilling LLMs for toxic content detection. Specifically, we design a novel prompting method named Decision-Tree-of-Thought (DToT) to bootstrap LLMs' detection performance and extract high-quality rationales. DToT can automatically select more fine-grained context to re-prompt LLMs when their responses lack confidence. Additionally, we use the rationales extracted via DToT to fine-tune student LMs. Our experimental results on various datasets demonstrate that DToT can improve the accuracy of LLMs by up to 4.6%. Furthermore, student LMs fine-tuned with rationales extracted via DToT outperform baselines on all datasets with up to 16.9% accuracy improvement, while being more than 60x smaller than conventional LLMs. Finally, we observe that student LMs fine-tuned with rationales exhibit better cross-dataset transferability.

Submitted 13 December, 2023; originally announced December 2023.

arXiv:2312.07030 [pdf]

Stabilizing Soil Using Annealed Polyvinyl Alcohol as Long-lasting Binder

Authors: Chunyan Cao, Gang Li

Abstract: Agricultural production heavily exploits the soil, resulting in high erosion in cultivated land, which poses a threat to food security and environmental sustainability. To address this issue, we stabilize the soil using polyvinyl alcohol (PVA). PVA strongly adheres to the soil after mixing and annealing, enhancing the cohesive strength of the soil. The PVA-soil withstands the impact of water at 7 m/s, protecting it from rainfall-induced erosion. Furthermore, the water-retaining capability and drainage of PVA-soil can be adjusted based on its sizes. This customized PVA-soil provides optimal growing conditions for various plants in different climates. Our method contributes to improved soil management and conversion.

Submitted 28 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

arXiv:2312.05256 [pdf, other]

Holistic Evaluation of GPT-4V for Biomedical Imaging

Authors: Zhengliang Liu, Hanqi Jiang, Tianyang Zhong, Zihao Wu, Chong Ma, Yiwei Li, Xiaowei Yu, Yutong Zhang, Yi Pan, Peng Shu, Yanjun Lyu, Lu Zhang, Junjie Yao, Peixin Dong, Chao Cao, Zhenxiang Xiao, Jiaqi Wang, Huan Zhao, Shaochen Xu, Yaonai Wei, Jingyuan Chen, Haixing Dai, Peilong Wang, Hao He, Zewei Wang, et al. (25 additional authors not shown)

Abstract: In this paper, we present a large-scale evaluation probing GPT-4V's capabilities and limitations for biomedical image analysis. GPT-4V represents a breakthrough in artificial general intelligence (AGI) for computer vision, with applications in the biomedical domain. We assess GPT-4V's performance across 16 medical imaging categories, including radiology, oncology, ophthalmology, pathology, and more. Tasks include modality recognition, anatomy localization, disease diagnosis, report generation, and lesion detection. The extensive experiments provide insights into GPT-4V's strengths and weaknesses. Results show GPT-4V's proficiency in modality and anatomy recognition but difficulty with disease diagnosis and localization. GPT-4V excels at diagnostic report generation, indicating strong image captioning skills. While promising for biomedical imaging AI, GPT-4V requires further enhancement and validation before clinical deployment. We emphasize responsible development and testing for trustworthy integration of biomedical AGI. This rigorous evaluation of GPT-4V on diverse medical images advances understanding of multimodal large language models (LLMs) and guides future work toward impactful healthcare applications.

Submitted 10 November, 2023; originally announced December 2023.

arXiv:2312.04831 [pdf, other]

Towards Context-Stable and Visual-Consistent Image Inpainting

Authors: Yikai Wang, Chenjie Cao, Ke Fan, Xiangyang Xue, Yanwei Fu

Abstract: Recent progress in inpainting increasingly relies on generative models, leveraging their strong generation capabilities for addressing large irregular masks. However, this enhanced generation often introduces context-instability, leading to arbitrary object generation within masked regions. This paper proposes a balanced solution, emphasizing the importance of unmasked regions in guiding inpainting while preserving generation capacity. Our approach, Aligned Stable Inpainting with UnKnown Areas Prior (ASUKA), employs a Masked Auto-Encoder (MAE) to produce reconstruction-based prior. Aligned with the powerful Stable Diffusion inpainting model (SD), ASUKA significantly improves context stability. ASUKA further adopts an inpainting-specialized decoder, highly reducing the color inconsistency issue of SD and thus ensuring more visual-consistent inpainting. We validate effectiveness of inpainting algorithms on benchmark dataset Places 2 and a collection of several existing datasets, dubbed MISATO, across diverse domains and masking scenarios. Results on these benchmark datasets confirm ASUKA's efficacy in both context-stability and visual-consistency compared to SD and other inpainting algorithms.

Submitted 17 March, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

Comments: Project page: https://yikai-wang.github.io/asuka/ where full-size PDF with appendix is available. Dataset: https://github.com/Yikai-Wang/asuka-misato. Yikai Wang and Chenjie Cao contribute equally

arXiv:2312.01890

Optical anisotropy and nonlinearity in deep ultraviolet fluorooxoborates

Authors: Bing-Hua Lei, Chao Cao, David J. Singh

Abstract: Optical anisotropy and nonlinearity are two tantalizingly important and enticing properties of an optical crystal. Combining these two features will have a miraculous effect. The up conversion can extend solid state laser sources to the ultraviolet and deep ultraviolet (DUV) ranges through harmonic generation and for down conversion needed for quantum information technology, but only a few suitable materials are known as the medium because of the combination of properties that are required. These include suitable band gaps, moderate optical anisotropy for phase matching and strong nonlinear optical (NLO) response. Fluorooxoborates are a new ideal platform for this effect in DUV. Here we demonstrate that fluorooxoborate is the optimal framework for DUV NLO material and show the significance of the incorporation of fluorine in borates. The NLO performance of fluorooxoborates is strongly improved in terms of local crystal structure and distribution of electronic states. Importantly, the role of fluorine is to control the structure, while maintaining high band gaps but does not directly provide large contributions to birefringence and the second harmonic generation as the conventional assumptions. This is a consequence of the microscopic electron distribution and the energy position of the fluorine states well below the valence band maxima. Based on our understandings, we constructed two artificial structures and they all behave as anticipated.

Submitted 20 December, 2023; v1 submitted 4 December, 2023; originally announced December 2023.

Comments: This manuscript contain many errors

arXiv:2311.13225 [pdf, other]

NeutronOrch: Rethinking Sample-based GNN Training under CPU-GPU Heterogeneous Environments

Authors: Xin Ai, Qiange Wang, Chunyu Cao, Yanfeng Zhang, Chaoyi Chen, Hao Yuan, Yu Gu, Ge Yu

Abstract: Graph Neural Networks (GNNs) have demonstrated outstanding performance in various applications. Existing frameworks utilize CPU-GPU heterogeneous environments to train GNN models and integrate mini-batch and sampling techniques to overcome the GPU memory limitation. In CPU-GPU heterogeneous environments, we can divide sample-based GNN training into three steps: sample, gather, and train. Existing GNN systems use different task orchestrating methods to employ each step on CPU or GPU. After extensive experiments and analysis, we find that existing task orchestrating methods fail to fully utilize the heterogeneous resources, limited by inefficient CPU processing or GPU resource contention. In this paper, we propose NeutronOrch, a system for sample-based GNN training that incorporates a layer-based task orchestrating method and ensures balanced utilization of the CPU and GPU. NeutronOrch decouples the training process by layer and pushes down the training task of the bottom layer to the CPU. This significantly reduces the computational load and memory footprint of GPU training. To avoid inefficient CPU processing, NeutronOrch only offloads the training of frequently accessed vertices to the CPU and lets GPU reuse their embeddings with bounded staleness. Furthermore, NeutronOrch provides a fine-grained pipeline design for the layer-based task orchestrating method, fully overlapping different tasks on heterogeneous resources while strictly guaranteeing bounded staleness. The experimental results show that compared with the state-of-the-art GNN systems, NeutronOrch can achieve up to 11.51x performance speedup.

Submitted 11 December, 2023; v1 submitted 22 November, 2023; originally announced November 2023.

arXiv:2311.10288

Current manipulation of Giant tunneling altermagnetic resistance in collinear Antiferromagnetic RuO2/MgO/RuO2 sandwich structure

Authors: Shijie Xu, Yan Huang, Farzad Mahfouzi, Zhizhong Zhang, Houyi Cheng, Bingqian Dai, Jinwoong Kim, Wenlong Cai, Kewen Shi, Daoqian Zhu, Zongxia Guo, Caihua Cao, Kun Zhang, Albert Fert, Yue Zhang, Kang L. Wang, Nicholas Kioussis, Weisheng Zhao

Abstract: As an emerging non-volatile memory technology, magnetic random access memory (MRAM) has key features and advantages including non-volatility, high speed, endurance, low power consumption and radiation tolerance. Conventional MRAM utilizes magnetic tunnel junctions (MTJs), which consist of two ferromagnetic layers separated by an insulating tunnel barrier. The orientation of the magnetic layers represents the binary data (0 or 1), and electrical resistance changes depending on the relative orientation of these magnetic layers. Despite these advancements, the quest for a swifter, more stable magneto-resistive random-access memory paradigm persists. In this vein, we present a groundbreaking development: room-temperature antiferromagnetic tunnel junctions devoid of any net magnetic moment. Over 200% tunneling altermagnetic resistance (TAR) ratio was measured at RuO2 (110)/MgO/RuO2 (110)/W structure, which is achieved by changing the antiferromagnetic Neel vector of RuO2 with an ultralow current density 2 MA*cm-2.

Submitted 24 November, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: Modification required

arXiv:2311.09588 [pdf]

Orientation-dependent superconductivity and electronic structure of the rare-earth metal/KTaO3 interfaces

Authors: Guowei Yang, Weifan Zhu, Jiawen Zhang, Hao Zheng, Yi Wu, Huali Zhang, Ge Ye, Dajun Su, Yanan Zhang, Chao Cao, Xin Lu, Huiqiu Yuan, Yang Liu

Abstract: The recent discovery of orientation-dependent superconductivity in KTaO3-based interfaces has attracted considerable interest, while the underlying origin remains an open question. Here we report a different approach to tune the interfacial electron gas and superconductivity by forming interfaces between rare-earth (RE) metals (RE being La, Ce, Eu) and KTaO3 substrates with different orientations. We found that the interfacial superconductivity is strongest for the Eu/KTaO3 interfaces, becomes weaker in La/KTaO3 and is absent in Ce/KTaO3. Using in-situ photoemission, we observed distinct valence bands associated with RE metals, as well as a pronounced orientation dependence in the interfacial electronic structure, which can be linked to the orientation-dependent superconductivity. The photoemission spectra show similar double-peak structures for the (111) and (110) oriented interfaces, with an energy separation close to the LO4 phonon of KTaO3. Detailed analyses suggest that this double-peak structure could be attributed to electron-phonon coupling, which might be important for the interfacial superconductivity.

Submitted 16 November, 2023; originally announced November 2023.

Comments: 24 pages, 6 figures and 1 table

arXiv:2311.03074 [pdf, other]

A Two-Stage Generative Model with CycleGAN and Joint Diffusion for MRI-based Brain Tumor Detection

Authors: Wenxin Wang, Zhuo-Xu Cui, Guanxun Cheng, Chentao Cao, Xi Xu, Ziwei Liu, Haifeng Wang, Yulong Qi, Dong Liang, Yanjie Zhu

Abstract: Accurate detection and segmentation of brain tumors is critical for medical diagnosis. However, current supervised learning methods require extensively annotated images and the state-of-the-art generative models used in unsupervised methods often have limitations in covering the whole data distribution. In this paper, we propose a novel framework Two-Stage Generative Model (TSGM) that combines Cycle Generative Adversarial Network (CycleGAN) and Variance Exploding stochastic differential equation using joint probability (VE-JP) to improve brain tumor detection and segmentation. The CycleGAN is trained on unpaired data to generate abnormal images from healthy images as data prior. Then VE-JP is implemented to reconstruct healthy images using synthetic paired abnormal images as a guide, which alters only pathological regions but not healthy regions. Notably, our method directly learned the joint probability distribution for conditional generation. The residual between input and reconstructed images suggests the abnormalities and a thresholding method is subsequently applied to obtain segmentation results. Furthermore, the multimodal results are weighted with different weights to improve the segmentation accuracy further. We validated our method on three datasets, and compared it with other unsupervised methods for anomaly detection and segmentation. The DSC score of 0.8590 in BraTs2020 dataset, 0.6226 in ITCS dataset and 0.7403 in In-house dataset show that our method achieves better segmentation performance and has better generalization.

Submitted 6 November, 2023; originally announced November 2023.

Comments: 11 pages, 9 figures, 3 tables

Showing 1–50 of 522 results for author: Cao, C