Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere, et al. (495 additional authors not shown)

Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780 GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\to π^0h_c$ and $e^+e^-\to π^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.

Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

arXiv:2406.19421 [pdf, other]

The Belle II Detector Upgrades Framework Conceptual Design Report

Authors: H. Aihara, A. Aloisio, D. P. Auguste, M. Aversano, M. Babeluk, S. Bahinipati, Sw. Banerjee, M. Barbero, J. Baudot, A. Beaubien, F. Becherer, T. Bergauer, F. U. Bernlochner, V. Bertacchi, G. Bertolone, C. Bespin, M. Bessner, S. Bettarini, A. J. Bevan, B. Bhuyan, M. Bona, J. F. Bonis, J. Borah, F. Bosi, R. Boudagga, et al. (186 additional authors not shown)

Abstract: We describe the planned near-term and potential longer-term upgrades of the Belle II detector at the SuperKEKB electron-positron collider operating at the KEK laboratory in Tsukuba, Japan. These upgrades will allow increasingly sensitive searches for possible new physics beyond the Standard Model in flavor, tau, electroweak and dark sector physics that are both complementary to and competitive with the LHC and other experiments.

Submitted 4 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

Comments: Editor: F. Forti 170 pages

Report number: KEK-REPORT-2024-1, BELLE2-REPORT-2024-042

arXiv:2406.17795 [pdf, other]

RACon: Retrieval-Augmented Simulated Character Locomotion Control

Authors: Yuxuan Mu, Shihao Zou, Kangning Yin, Zheng Tian, Li Cheng, Weinan Zhang, Jun Wang

Abstract: In computer animation, driving a simulated character with lifelike motion is challenging. Current generative models, though able to generalize to diverse motions, often pose challenges to the responsiveness of end-user control. To address these issues, we introduce RACon: Retrieval-Augmented Simulated Character Locomotion Control. Our end-to-end hierarchical reinforcement learning method utilizes a retriever and a motion controller. The retriever searches motion experts from a user-specified database in a task-oriented fashion, which boosts the responsiveness to the user's control. The selected motion experts and the manipulation signal are then transferred to the controller to drive the simulated character. In addition, a retrieval-augmented discriminator is designed to stabilize the training process. Our method surpasses existing techniques in both quality and quantity in locomotion control, as demonstrated in our empirical study. Moreover, by switching extensive databases for retrieval, it can adapt to distinctive motion types at run time.

Submitted 11 June, 2024; originally announced June 2024.

Comments: Accepted in ICME2024 for oral presentation

arXiv:2406.17096 [pdf, other]

Model-Free Robust Reinforcement Learning with Sample Complexity Analysis

Authors: Yudan Wang, Shaofeng Zou, Yue Wang

Abstract: Distributionally Robust Reinforcement Learning (DR-RL) aims to derive a policy optimizing the worst-case performance within a predefined uncertainty set. Despite extensive research, previous DR-RL algorithms have predominantly favored model-based approaches, with limited availability of model-free methods offering convergence guarantees or sample complexities. This paper proposes a model-free DR-RL algorithm leveraging the Multi-level Monte Carlo (MLMC) technique to close such a gap. Our innovative approach integrates a threshold mechanism that ensures finite sample requirements for algorithmic implementation, a significant improvement over previous model-free algorithms. We develop algorithms for uncertainty sets defined by total variation, Chi-square divergence, and KL divergence, and provide finite sample analyses under all three cases. Remarkably, our algorithms represent the first model-free DR-RL approach featuring finite sample complexity for total variation and Chi-square divergence uncertainty sets, while also offering an improved sample complexity and broader applicability compared to existing model-free DR-RL algorithms for the KL divergence model. The complexities of our method establish the tightest results for all three uncertainty models in model-free DR-RL, underscoring the effectiveness and efficiency of our algorithm, and highlighting its potential for practical applications.

Submitted 24 June, 2024; originally announced June 2024.

Comments: UAI 2024

arXiv:2406.12855 [pdf, other]

Moving frame and spin field representations of submanifolds in flat space

Authors: Shou-Jyun Zou

Abstract: We introduce a spin field approach, that is compatible with the Cartan moving frame method, to describe the submanifold in a flat space. In fact, we consider a kind of spin field $ψ$, that satisfies a Killing spin field equation (analogous to a Killing spinor equation) written in terms of the Clifford algebra, and we use the spin field to locally rotate the orthonormal basis $\{\hat{e}_\mathtt{I}\}$. Then, the deformed orthonormal frame $\{\tilde{ψ}\hat{e}_\mathtt{I}ψ\}$ can be seen as the moving frame of a submanifold. We find some solutions to the Killing spin field equation and demonstrate an explicit example. Using the product of the spin fields, one can easily generate a new immersion submanifold, and this technique should be useful for studies in geometry and physics. Through the spin field, we find a linear relation between the connection and the extrinsic curvature of the submanifold. We propose a conjecture that any solution of the Killing spin field equation can be locally written as the product of the solutions we find.

Submitted 16 February, 2024; originally announced June 2024.

Comments: 20 pages, 1 figure

arXiv:2406.11203 [pdf]

Large reversible magnetocaloric effect of the EuAl3Si single crystal

Authors: Hai Zeng, Shuo Zou, Zhou Wang, Ziyu Li, Kangjian Luo, Yongkang Luo

Abstract: The magnetic properties, magnetocaloric effect and magnetoresistance of EuAl3Si single crystal have been investigated. A giant reversible magnetocaloric effect was observed around TC = 14.5 K. For the low magnetic field changes of 0-2 T, the maximum values of magnetic entropy change and refrigerant capacity are 13.4 J/kg K and 166 J/kg, respectively, with the corresponding adiabatic temperature of 7.2 K. These excellent magnetocaloric parameters suggest EuAl3Si as a promising candidate for magnetic refrigeration application around liquid hydrogen temperature.

Submitted 17 June, 2024; originally announced June 2024.

arXiv:2406.10534 [pdf, other]

A Finite Difference Informed Graph Network for Solving Steady-State Incompressible Flows on Block-Structured Grids

Authors: Yiye Zou, Tianyu Li, Shufan Zou, Jingyu Wang, Laiping Zhang, Xiaogang Deng

Abstract: Recently, advancements in deep learning have enabled physics-informed neural networks (PINNs) to solve partial differential equations (PDEs). Numerical differentiation (ND) using the finite difference (FD) method is efficient in physics-constrained designs, even in parameterized settings, often employing body-fitted block-structured grids for complex flow cases. However, convolution operators in CNNs for finite differences are typically limited to single-block grids. To address this, we use graphs and graph networks (GNs) to learn flow representations across multi-block structured grids. We propose a graph convolution-based finite difference method (GC-FDM) to train GNs in a physics-constrained manner, enabling differentiable finite difference operations on graph unstructured outputs. Our goal is to solve parametric steady incompressible Navier-Stokes equations for flows around a backward-facing step, a circular cylinder, and double cylinders, using multi-block structured grids. Comparing our method to a CFD solver under various boundary conditions, we demonstrate improved training efficiency and accuracy, achieving a minimum relative error of $10^{-3}$ in velocity field prediction and a 20\% reduction in training cost compared to PINNs.

Submitted 15 June, 2024; originally announced June 2024.

arXiv:2406.06697 [pdf, other]

A quasar-galaxy merger at $z\sim 6.2$: rapid host growth via accretion of two massive satellite galaxies

Authors: Roberto Decarli, Federica Loiacono, Emanuele Paolo Farina, Massimo Dotti, Alessandro Lupi, Romain A. Meyer, Marco Mignoli, Antonio Pensabene, Michael A. Strauss, Bram Venemans, Jinyi Yang, Fabian Walter, Julien Wolf, Eduardo Bañados, Laura Blecha, Sarah Bosman, Chris L. Carilli, Andrea Comastri, Thomas Connor, Tiago Costa, Anna-Christina Eilers, Xiaohui Fan, Roberto Gilli, Hyunsung D. Jun, Weizhe Liu, et al. (16 additional authors not shown)

Abstract: We present JWST/NIRSpec Integral Field Spectroscopy in the rest-frame optical bands of the system PJ308-21, a quasar at $z=6.2342$ caught as its host galaxy interacts with companion galaxies. We detect spatially extended emission of several emission lines (H$α$, H$β$, [OIII], [NII], [SII], HeII), which we use to study the properties of the ionized phase of the interstellar medium: the source and hardness of the photoionizing radiation field, metallicity, dust reddening, electron density and temperature, and star formation. We also marginally detect continuum starlight emission associated with the companion sources. We find that at least two independent satellite galaxies are part of the system. While the quasar host appears highly enriched and obscured, with AGN-like photoionization conditions, the western companion shows minimal dust extinction, low metallicity ($Z\sim0.4$ Z$_\odot$), and star-formation driven photoionization. The eastern companion shows higher extinction and metallicity ($Z\sim0.8$ Z$_\odot$) compared to the western companion, and it is at least partially photoionized by the nearby quasar. We do not find any indication of AGN in the companion sources. Our study shows that while the quasar host galaxy is already very massive ($M_{\rm dyn}>10^{11}$ M$_\odot$), it is still rapidly building up by accreting two relatively massive ($M_{\rm star}\sim 10^{10}$ M$_\odot$) companion sources. This dataset showcases the power of JWST in exposing the build-up of massive galaxies in the first Gyr of the Universe.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 15 pages, 16 figures. Accepted for publication in A&A

arXiv:2406.01762 [pdf, other]

Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation

Authors: Yudan Wang, Yue Wang, Yi Zhou, Shaofeng Zou

Abstract: Actor-critic (AC) is a powerful method for learning an optimal policy in reinforcement learning, where the critic uses algorithms, e.g., temporal difference (TD) learning with function approximation, to evaluate the current policy and the actor updates the policy along an approximate gradient direction using information from the critic. This paper provides the \textit{tightest} non-asymptotic convergence bounds for both the AC and natural AC (NAC) algorithms. Specifically, existing studies show that AC converges to an $ε+\varepsilon_{\text{critic}}$ neighborhood of stationary points with the best known sample complexity of $\mathcal{O}(ε^{-2})$ (up to a log factor), and NAC converges to an $ε+\varepsilon_{\text{critic}}+\sqrt{\varepsilon_{\text{actor}}}$ neighborhood of the global optimum with the best known sample complexity of $\mathcal{O}(ε^{-3})$, where $\varepsilon_{\text{critic}}$ is the approximation error of the critic and $\varepsilon_{\text{actor}}$ is the approximation error induced by the insufficient expressive power of the parameterized policy class. This paper analyzes the convergence of both AC and NAC algorithms with compatible function approximation. Our analysis eliminates the term $\varepsilon_{\text{critic}}$ from the error bounds while still achieving the best known sample complexities. Moreover, we focus on the challenging single-loop setting with a single Markovian sample trajectory. Our major technical novelty lies in analyzing the stochastic bias due to policy-dependent and time-varying compatible function approximation in the critic, and handling the non-ergodicity of the MDP due to the single Markovian sample trajectory. Numerical results are also provided in the appendix.

Submitted 3 June, 2024; originally announced June 2024.

Comments: ICML 2024

arXiv:2405.19440 [pdf, other]

On the Convergence of Multi-objective Optimization under Generalized Smoothness

Authors: Qi Zhang, Peiyao Xiao, Kaiyi Ji, Shaofeng Zou

Abstract: Multi-objective optimization (MOO) is receiving more attention in various fields such as multi-task learning. Recent works provide some effective algorithms with theoretical analysis but they are limited by the standard $L$-smooth or bounded-gradient assumptions, which are typically unsatisfactory for neural networks, such as recurrent neural networks (RNNs) and transformers. In this paper, we study a more general and realistic class of $\ell$-smooth loss functions, where $\ell$ is a general non-decreasing function of gradient norm. We develop two novel single-loop algorithms for $\ell$-smooth MOO problems, Generalized Smooth Multi-objective Gradient descent (GSMGrad) and its stochastic variant, Stochastic Generalized Smooth Multi-objective Gradient descent (SGSMGrad), which approximate the conflict-avoidant (CA) direction that maximizes the minimum improvement among objectives. We provide a comprehensive convergence analysis of both algorithms and show that they converge to an $ε$-accurate Pareto stationary point with a guaranteed $ε$-level average CA distance (i.e., the gap between the updating direction and the CA direction) over all iterations, where totally $\mathcal{O}(ε^{-2})$ and $\mathcal{O}(ε^{-4})$ samples are needed for deterministic and stochastic settings, respectively. Our algorithms can also guarantee a tighter $ε$-level CA distance in each iteration using more samples. Moreover, we propose a practical variant of GSMGrad named GSMGrad-FA using only constant-level time and space, while achieving the same performance guarantee as GSMGrad. Our experiments validate our theory and demonstrate the effectiveness of the proposed methods.

Submitted 1 July, 2024; v1 submitted 29 May, 2024; originally announced May 2024.
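
Illustration (not from the paper): for intuition about a direction that "maximizes the minimum improvement among objectives", the two-objective case has a well-known closed form, namely the minimum-norm point in the convex hull of the two gradients, as used in MGDA-style methods. The sketch below implements only that textbook special case; it is not GSMGrad or SGSMGrad, and the function name `min_norm_direction` is hypothetical.

```python
from typing import List, Sequence, Tuple


def min_norm_direction(g1: Sequence[float], g2: Sequence[float]) -> Tuple[float, List[float]]:
    """Return (w, d) with d = w*g1 + (1-w)*g2 of minimum Euclidean norm, w in [0, 1].

    Assumption: two objectives only, exact gradients, plain Python lists.
    """
    diff = [a - b for a, b in zip(g1, g2)]      # g1 - g2
    denom = sum(x * x for x in diff)            # ||g1 - g2||^2
    if denom == 0.0:
        # Identical gradients: every weight gives the same direction.
        return 0.5, list(g1)
    # ||g2 + w*(g1 - g2)||^2 is quadratic in w; its unconstrained minimizer is
    # w = -(g2 . (g1 - g2)) / ||g1 - g2||^2, then clipped to [0, 1].
    w = -sum(b * x for b, x in zip(g2, diff)) / denom
    w = min(1.0, max(0.0, w))
    d = [w * a + (1.0 - w) * b for a, b in zip(g1, g2)]
    return w, d


if __name__ == "__main__":
    w, d = min_norm_direction([1.0, 0.0], [0.0, 1.0])
    print(w, d)
```

A descent step along `-d` does not increase either objective to first order, and `d` is the zero vector exactly when the two gradients point in opposite directions with a convex combination that cancels, which is the Pareto stationarity condition for two objectives.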

arXiv:2405.16077 [pdf, ps, other]

Finite-Time Analysis for Conflict-Avoidant Multi-Task Reinforcement Learning

Authors: Yudan Wang, Peiyao Xiao, Hao Ban, Kaiyi Ji, Shaofeng Zou

Abstract: Multi-task reinforcement learning (MTRL) has shown great promise in many real-world applications. Existing MTRL algorithms often aim to learn a policy that optimizes individual objective functions simultaneously with a given prior preference (or weights) on different tasks. However, these methods often suffer from the issue of \textit{gradient conflict} such that the tasks with larger gradients dominate the update direction, resulting in a performance degeneration on other tasks. In this paper, we develop a novel dynamic weighting multi-task actor-critic algorithm (MTAC) under two options of sub-procedures named CA and FC in task weight updates. MTAC-CA aims to find a conflict-avoidant (CA) update direction that maximizes the minimum value improvement among tasks, and MTAC-FC targets a much faster convergence rate. We provide a comprehensive finite-time convergence analysis for both algorithms. We show that MTAC-CA can find an $ε+ε_{\text{app}}$-accurate Pareto stationary policy using $\mathcal{O}({ε^{-5}})$ samples, while ensuring a small $ε+\sqrt{ε_{\text{app}}}$-level CA distance (defined as the distance to the CA direction), where $ε_{\text{app}}$ is the function approximation error. The analysis also shows that MTAC-FC improves the sample complexity to $\mathcal{O}(ε^{-3})$, but with a constant-level CA distance. Our experiments on MT10 demonstrate the improved performance of our algorithms over existing MTRL methods with fixed preference.

Submitted 10 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

Comments: Initial submission at the 41$^{st}$ International Conference on Machine Learning

arXiv:2405.07558 [pdf, other]

Synchronization of High-Dimensional Linear Networks over Finite Fields

Authors: Siyu Zou, Ting Li, Jiandong Zhu

Abstract: This paper investigates the synchronization problems for general high-dimensional linear networks over finite fields. By using the technique of linear transformations and invariant subspaces for linear spaces over finite fields, several necessary and sufficient conditions for the synchronization of high-dimensional linear networks over finite fields are proposed. This paper not only generalizes the existing results from 1-dimensional to high-dimensional linear networks but also adopts a new approach. Finally, some numerical examples are given to illustrate the effectiveness of our theoretical results.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.04466 [pdf, other]

A fully differentiable GNN-based PDE Solver: With Applications to Poisson and Navier-Stokes Equations

Authors: Tianyu Li, Yiye Zou, Shufan Zou, Xinghua Chang, Laiping Zhang, Xiaogang Deng

Abstract: In this study, we present a novel computational framework that integrates the finite volume method with graph neural networks to address the challenges in Physics-Informed Neural Networks (PINNs). Our approach leverages the flexibility of graph neural networks to adapt to various types of two-dimensional unstructured grids, enhancing the model's applicability across different physical equations and boundary conditions. The core innovation lies in the development of an unsupervised training algorithm that utilizes GPU parallel computing to implement a fully differentiable finite volume method discretization process. This method includes differentiable integral and gradient reconstruction algorithms, enabling the model to directly solve partial differential equations (PDEs) during training without the need for pre-computed data. Our results demonstrate the model's superior mesh generalization and its capability to handle multiple boundary conditions simultaneously, significantly boosting its generalization capabilities. The proposed method not only shows potential for extensive applications in CFD but also establishes a new paradigm for integrating traditional numerical methods with deep learning technologies, offering a robust platform for solving complex physical problems.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.01327 [pdf, other]

Constrained Reinforcement Learning Under Model Mismatch

Authors: Zhongchang Sun, Sihong He, Fei Miao, Shaofeng Zou

Abstract: Existing studies on constrained reinforcement learning (RL) may obtain a well-performing policy in the training environment. However, when deployed in a real environment, it may easily violate constraints that were originally satisfied during training because there might be model mismatch between the training and real environments. To address the above challenge, we formulate the problem as constrained RL under model uncertainty, where the goal is to learn a good policy that optimizes the reward and at the same time satisfies the constraint under model mismatch. We develop a Robust Constrained Policy Optimization (RCPO) algorithm, which is the first algorithm that applies to large/continuous state space and has theoretical guarantees on worst-case reward improvement and constraint violation at each iteration during the training. We demonstrate the effectiveness of our algorithm on a set of RL tasks with constraints.

Submitted 3 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: ICML 2024

arXiv:2405.00998 [pdf, other]

Part-aware Shape Generation with Latent 3D Diffusion of Neural Voxel Fields

Authors: Yuhang Huang, SHilong Zou, Xinwang Liu, Kai Xu

Abstract: This paper presents a novel latent 3D diffusion model for the generation of neural voxel fields, aiming to achieve accurate part-aware structures. Compared to existing methods, there are two key designs to ensure high-quality and accurate part-aware generation. On one hand, we introduce a latent 3D diffusion process for neural voxel fields, enabling generation at significantly higher resolutions that can accurately capture rich textural and geometric details. On the other hand, a part-aware shape decoder is introduced to integrate the part codes into the neural voxel fields, guiding the accurate part decomposition and producing high-quality rendering results. Through extensive experimentation and comparisons with state-of-the-art methods, we evaluate our approach across four different classes of data. The results demonstrate the superior generative capabilities of our proposed method in part-aware shape generation, outperforming existing state-of-the-art methods.

Submitted 20 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.18482 [pdf, ps, other]

Increasing resolution and instability for linear inverse scattering problems

Authors: Pu-Zhao Kow, Mikko Salo, Sen Zou

Abstract: In this work we study the increasing resolution of linear inverse scattering problems at a large fixed frequency. We consider the problem of recovering the density of a Herglotz wave function, and the linearized inverse scattering problem for a potential. It is shown that the number of features that can be stably recovered (stable region) becomes larger as the frequency increases, whereas one has strong instability for the rest of the features (unstable region). To show this rigorously, we prove that the singular values of the forward operator stay roughly constant in the stable region and decay exponentially in the unstable region. The arguments are based on structural properties of the problems and they involve the Courant min-max principle for singular values, quantitative Agmon-Hörmander estimates, and a Schwartz kernel computation based on the coarea formula.

Submitted 29 April, 2024; originally announced April 2024.

Comments: 29 pages, 3 figures

MSC Class: 35P15; 35R25; 35R30; 35J05; 35J15

arXiv:2404.08045 [pdf, other]

JWST Discovery of $40+$ Microlensed Stars in a Magnified Galaxy, the "Dragon" behind Abell 370

Authors: Yoshinobu Fudamoto, Fengwu Sun, Jose M. Diego, Liang Dai, Masamune Oguri, Adi Zitrin, Erik Zackrisson, Mathilde Jauzac, David J. Lagattuta, Eiichi Egami, Edoardo Iani, Rogier A. Windhorst, Katsuya T. Abe, Franz Erik Bauer, Fuyan Bian, Rachana Bhatawdekar, Thomas J. Broadhurst, Zheng Cai, Chian-Chou Chen, Wenlei Chen, Seth H. Cohen, Christopher J. Conselice, Daniel Espada, Nicholas Foo, Brenda L. Frye, et al. (21 additional authors not shown)

Abstract: Strong gravitational magnification by massive galaxy clusters enables us to detect faint background sources, resolve their detailed internal structures, and in the most extreme cases identify and study individual stars in distant galaxies. Highly magnified individual stars allow for a wide range of applications, including studies of stellar populations in distant galaxies and constraining small-scale dark matter structures. However, these applications have been hampered by the small number of events observed, as typically one or a few stars are identified from each distant galaxy. Here, we report the discovery of 46 significant microlensed stars in a single strongly-lensed high-redshift galaxy behind the Abell 370 cluster at a redshift of 0.725 when the Universe was half of its current age (dubbed the "Dragon arc"), based on two observations separated by one year with the James Webb Space Telescope ({\it JWST}). These events are mostly found near the expected lensing critical curves, suggesting that these are magnified individual stars that appear as transients from intracluster stellar microlenses. Through multi-wavelength photometry and colors, we constrain stellar types and find that many of them are consistent with red giants/supergiants magnified by factors of thousands. This finding reveals an unprecedentedly high occurrence of microlensing events in the Dragon arc, and proves that {\it JWST}'s time-domain observations open up the possibility of conducting statistical studies of high-redshift stars and subgalactic scale perturbations in the lensing dark matter field.

Submitted 11 April, 2024; originally announced April 2024.

Comments: 15 pages, 4 figures, 1 table submitted to Nature Astronomy

arXiv:2404.07779 [pdf, other]

Improving Network Degree Correlation by Degree-preserving Rewiring

Authors: Shuo Zou, Bo Zhou, Qi Xuan

Abstract: Degree correlation is a crucial measure in networks, significantly impacting network topology and dynamical behavior. The degree sequence of a network is a significant characteristic, and altering network degree correlation through degree-preserving rewiring poses an interesting problem. In this paper, we define the problem of maximizing network degree correlation through a finite number of rewirings and use the assortativity coefficient to measure it. We analyze the changes in assortativity coefficient under degree-preserving rewiring and establish its relationship with the s-metric. Under our assumptions, we prove the problem to be monotonic and submodular, leading to the proposal of the GA method to enhance network degree correlation. By formulating an integer programming model, we demonstrate that the GA method can effectively approximate the optimal solution and validate its superiority over other baseline methods through experiments on three types of real-world networks. Additionally, we introduce three heuristic rewiring strategies, EDA, TA and PEA, and demonstrate their applicability to different types of networks. Furthermore, we extend our investigation to explore the impact of these rewiring strategies on several spectral robustness metrics based on the adjacency matrix. Finally, we examine the robustness of various centrality metrics in the network while enhancing network degree correlation using the GA method.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.01436 [pdf, ps, other]

Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance

Authors: Qi Zhang, Yi Zhou, Shaofeng Zou

Abstract: This paper provides the first tight convergence analyses for RMSProp and Adam in non-convex optimization under the most relaxed assumptions of coordinate-wise generalized smoothness and affine noise variance. We first analyze RMSProp, which is a special case of Adam with adaptive learning rates but without first-order momentum. Specifically, to solve the challenges due to dependence among adaptive update, unbounded gradient estimate and Lipschitz constant, we demonstrate that the first-order term in the descent lemma converges and its denominator is upper bounded by a function of gradient norm. Based on this result, we show that RMSProp with proper hyperparameters converges to an $ε$-stationary point with an iteration complexity of $\mathcal O(ε^{-4})$. We then generalize our analysis to Adam, where the additional challenge is due to a mismatch between the gradient and first-order momentum. We develop a new upper bound on the first-order term in the descent lemma, which is also a function of the gradient norm. We show that Adam with proper hyperparameters converges to an $ε$-stationary point with an iteration complexity of $\mathcal O(ε^{-4})$. Our complexity results for both RMSProp and Adam match with the complexity lower bound established in \cite{arjevani2023lower}.

Submitted 3 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

arXiv:2404.01200 [pdf, other]

Large-Scale Non-convex Stochastic Constrained Distributionally Robust Optimization

Authors: Qi Zhang, Yi Zhou, Ashley Prater-Bennette, Lixin Shen, Shaofeng Zou

Abstract: Distributionally robust optimization (DRO) is a powerful framework for training robust models against data distribution shifts. This paper focuses on constrained DRO, which has an explicit characterization of the robustness level. Existing studies on constrained DRO mostly focus on convex loss functions, and exclude the practical and challenging case of non-convex loss functions, e.g., neural networks. This paper develops a stochastic algorithm and its performance analysis for non-convex constrained DRO. The computational complexity of our stochastic algorithm at each iteration is independent of the overall dataset size, and thus is suitable for large-scale applications. We focus on the uncertainty set defined by the general Cressie-Read family of divergences, which includes $χ^2$-divergences as a special case. We prove that our algorithm finds an $ε$-stationary point with a computational complexity of $\mathcal O(ε^{-3k_*-5})$, where $k_*$ is the parameter of the Cressie-Read divergence. The numerical results indicate that our method outperforms existing methods. Our method also applies to the smoothed conditional value at risk (CVaR) DRO.

Submitted 1 April, 2024; originally announced April 2024.

Comments: We have corrected Theorem 1 in Sec 4 for AAAI 2024 version, where the order of $n_z$ changes from $ε^{-k_*}$ to $ε^{-2k_*-2}$

arXiv:2403.07257 [pdf, other]

The Dawn of AI-Native EDA: Opportunities and Challenges of Large Circuit Models

Authors: Lei Chen, Yiqi Chen, Zhufei Chu, Wenji Fang, Tsung-Yi Ho, Ru Huang, Yu Huang, Sadaf Khan, Min Li, Xingquan Li, Yu Li, Yun Liang, Jinwei Liu, Yi Liu, Yibo Lin, Guojie Luo, Zhengyuan Shi, Guangyu Sun, Dimitrios Tsaras, Runsheng Wang, Ziyi Wang, Xinming Wei, Zhiyao Xie, Qiang Xu, Chenhao Xue , et al. (14 additional authors not shown)

Abstract: Within the Electronic Design Automation (EDA) domain, AI-driven solutions have emerged as formidable tools, yet they typically augment rather than redefine existing methodologies. These solutions often repurpose deep learning models from other domains, such as vision, text, and graph analytics, applying them to circuit design without tailoring to the unique complexities of electronic circuits. Such an AI4EDA approach falls short of achieving a holistic design synthesis and understanding, overlooking the intricate interplay of electrical, logical, and physical facets of circuit data. This paper argues for a paradigm shift from AI4EDA towards AI-native EDA, integrating AI at the core of the design process. Pivotal to this vision is the development of a multimodal circuit representation learning technique, poised to provide a comprehensive understanding by harmonizing and extracting insights from varied data sources, such as functional specifications, RTL designs, circuit netlists, and physical layouts. We champion the creation of large circuit models (LCMs) that are inherently multimodal, crafted to decode and express the rich semantics and structures of circuit data, thus fostering more resilient, efficient, and inventive design methodologies. Embracing this AI-native philosophy, we foresee a trajectory that transcends the current innovation plateau in EDA, igniting a profound shift-left in electronic design methodology. The envisioned advancements herald not just an evolution of existing EDA tools but a revolution, giving rise to novel instruments of design tools that promise to radically enhance design productivity and inaugurate a new epoch where the optimization of circuit performance, power, and area (PPA) is achieved not incrementally, but through leaps that redefine the benchmarks of electronic systems' capabilities.

Submitted 1 May, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: The authors are ordered alphabetically. Contact: qxu@cse[dot]cuhk[dot]edu[dot]hk, gluo@pku[dot]edu[dot]cn, yuan.mingxuan@huawei[dot]com

arXiv:2403.06476 [pdf, other]

doi 10.1103/PhysRevB.109.245106

$^{27}$Al NMR study of the magnetic Weyl semimetal CeAlGe

Authors: Zhuo Wang, Xiaobo He, Fangjun Lu, Hai Zeng, Shuo Zou, Xiao-Xiao Zhang, Yongkang Luo

Abstract: Motivated by the recent observations of electronic correlation effect [M. Corasaniti \textit{et al}., Phys. Rev. B \textbf{104}, L121112 (2021)] and topology-stabilized magnetic fluctuations [N. Drucker \textit{et al}., Nat. Commun. \textbf{14}, 5182 (2023)] in the noncentrosymmetric magnetic Weyl semimetal candidate CeAlGe, we performed systematic studies on the local static and dynamic spin susceptibilities by $^{27}$Al nuclear magnetic resonance. Due to the large spin susceptibility from Ce-$4f$ electrons, the theoretically predicted responses from Weyl fermions are overwhelmed. A Knight-shift anomaly is observed below $T^*\sim50$ K, a signature of the onset of coherent Kondo coupling. In addition, an anomalous peak is found in $1/T_1T$ near 15 K, well above the magnetic ordering temperature $T_N \approx 5$ K, which probably is a consequence of topology-stabilized magnetic fluctuations. These results highlight the interplay among electronic correlation, magnetism and band topology in this family of Kondo Weyl semimetals.

Submitted 4 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: 6+3 pages, 5+3 figures

Journal ref: Phys. Rev. B 109, 245106 (2024)

arXiv:2403.00691 [pdf, other]

Tri-Modal Motion Retrieval by Learning a Joint Embedding Space

Authors: Kangning Yin, Shihao Zou, Yuxuan Ge, Zheng Tian

Abstract: Information retrieval is an ever-evolving and crucial research domain. The substantial demand for high-quality human motion data, especially in online acquisition, has led to a surge in human motion research. Prior works have mainly concentrated on dual-modality learning, such as text and motion tasks, but three-modality learning has rarely been explored. Intuitively, an extra modality can enrich a model's application scenarios, and more importantly, an adequate choice of the extra modality can also act as an intermediary and enhance the alignment between the other two disparate modalities. In this work, we introduce LAVIMO (LAnguage-VIdeo-MOtion alignment), a novel framework for three-modality learning integrating human-centric videos as an additional modality, thereby effectively bridging the gap between text and motion. Moreover, our approach leverages a specially designed attention mechanism to foster enhanced alignment and synergistic effects among text, video, and motion modalities. Empirically, our results on the HumanML3D and KIT-ML datasets show that LAVIMO achieves state-of-the-art performance in various motion-related cross-modal retrieval tasks, including text-to-motion, motion-to-text, video-to-motion and motion-to-video.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.17570 [pdf, other]

Sparse Variational Contaminated Noise Gaussian Process Regression with Applications in Geomagnetic Perturbations Forecasting

Authors: Daniel Iong, Matthew McAnear, Yuezhou Qu, Shasha Zou, Gabor Toth, Yang Chen

Abstract: Gaussian Processes (GP) have become popular machine-learning methods for kernel-based learning on datasets with complicated covariance structures. In this paper, we present a novel extension to the GP framework using a contaminated normal likelihood function to better account for heteroscedastic variance and outlier noise. We propose a scalable inference algorithm based on the Sparse Variational Gaussian Process (SVGP) method for fitting sparse Gaussian process regression models with contaminated normal noise on large datasets. We examine an application to geomagnetic ground perturbations, where the state-of-the-art prediction model is based on neural networks. We show that our approach yields shorter prediction intervals for similar coverage and accuracy when compared to an artificial dense neural network baseline.

Submitted 2 July, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

arXiv:2402.02700 [pdf, ps, other]

Sample Complexity Characterization for Linear Contextual MDPs

Authors: Junze Deng, Yuan Cheng, Shaofeng Zou, Yingbin Liang

Abstract: Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable. While CMDPs serve as an important framework to model many real-world applications with time-varying environments, they are largely unexplored from a theoretical perspective. In this paper, we study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights. For both models, we propose novel model-based algorithms and show that they enjoy a guaranteed $ε$-suboptimality gap with the desired polynomial sample complexity. In particular, instantiating our result for the first model to the tabular CMDP improves the existing result by removing the reachability assumption. Our result for the second model is the first known result for this type of function approximation model. Comparison between our results for the two models further indicates that having context-varying features leads to much better sample efficiency than having common representations for all contexts under linear CMDPs.

Submitted 4 February, 2024; originally announced February 2024.

Comments: Accepted to AISTATS 2024

arXiv:2402.00113 [pdf, other]

A SPectroscopic survey of biased halos In the Reionization Era (ASPIRE): Impact of Galaxies on the CGM Metal Enrichment at z > 6 Using the JWST and VLT

Authors: Siwei Zou, Zheng Cai, Feige Wang, Xiaohui Fan, Jaclyn B. Champagne, Joseph F. Hennawi, Jan-Torge Schindler, Emanuele P. Farina, Jinyi Yang, Kohei Inayoshi, Eduardo Banados, Sarah E. I. Bosman, Zihao Li, Xiaojing Lin, Yunjing Wu, Fengwu Sun, Zi-Yi Guo, Girish Kulkarni, Melanie Habouzit, Stephane Charlot, Jacopo Chevallard, Thomas Connor, Anna-Christina Eilers, Linhua Jiang, Xiangyu Jin , et al. (5 additional authors not shown)

Abstract: We characterize the multiphase circumgalactic medium and galaxy properties at z = 6.0-6.5 in four quasar fields from the James Webb Space Telescope A SPectroscopic survey of biased halos In the Reionization Era (ASPIRE) program. We use the Very Large Telescope/X-shooter spectra of quasar J0305-3150 to identify one new metal absorber at z = 6.2713 with multiple transitions (OI, MgI, FeII and CII). They are combined with the published absorbing systems in Davies et al. (2023a) at the same redshift range to form a sample of nine metal absorbers at z = 6.03 to 6.49. We identify eight galaxies within 1000 km s$^{-1}$ and 350 kpc around the absorbing gas from the ASPIRE spectroscopic data, with their redshifts secured by [OIII]($λλ$4959, 5007) doublets and H$β$ emission lines. Our spectral energy distribution fitting indicates that the absorbing galaxies have stellar mass ranging from 10$^{7.2}$ to 10$^{8.8}M_{\odot}$ and metallicity between 0.02 and 0.4 solar. Notably, the z = 6.2713 system in the J0305-3150 field resides in a galaxy overdensity region, which contains two (tentatively) merging galaxies within 350 kpc and seven galaxies within 1 Mpc. We measure the relative abundances of $α$ elements to iron ([$α$/Fe]) and find that the CGM gas in the most overdense region exhibits a lower [$α$/Fe] ratio. Our modeling of the galaxy's chemical abundance favors a top-heavy stellar initial mass function, and hints that we may be witnessing the contribution of the first generation Population III stars to the CGM at the end of the reionization epoch.

Submitted 31 January, 2024; originally announced February 2024.

Comments: 21 pages, 4 figures in the main text. Accepted for publication in ApJL

arXiv:2401.17561 [pdf, other]

doi 10.3847/1538-4357/ace7b6

Formation Mechanism of Laser-Driven Magnetized "Pillars of Creation"

Authors: Zhu Lei, Lifeng Wang, Jiwei Li, Shiyang Zou, Junfeng Wu, Zhonghai Zhao, Wei Sun, Wenqiang Yuan, Longxing Li, Zheng Yan, Jun Li, Wenhua Ye, Xiantu He, Bin Qiao

Abstract: The Pillars of Creation, one of the most recognized objects in the sky, are believed to be associated with the formation of young stars. However, the formation and maintenance mechanism of the pillars is still not fully understood due to the complexity of the nonlinear radiation magneto-hydrodynamics (RMHD). Here, assuming laboratory laser-driven conditions, we studied the self-consistent dynamics of pillar structures in magnetic fields by means of two-dimensional (2D) and three-dimensional (3D) RMHD simulations, and these results also support our proposed experimental scheme. We find that only when the magnetic pressure and ablation pressure are comparable can the magnetic field significantly alter the plasma hydrodynamics. For medium magnetized cases ($β_{initial} \approx 3.5$), the initial magnetic fields undergo compression and amplification. This amplification results in the magnetic pressure inside the pillar becoming large enough to support the sides of the pillar against radial collapse due to pressure from the surrounding hot plasma. This effect is particularly pronounced for the parallel component ($B_y$), which is consistent with observational results. In contrast, a strong perpendicular ($B_x, B_z$) magnetic field ($β_{initial} < 1$) almost retains its initial distribution and significantly suppresses the expansion of blow-off gas plasma, leading to the inability to form pillar-like structures. The 3D simulations suggest that the bending at the head of `Column I' in the Pillars of Creation may be due to the non-parallel magnetic fields. After similarity scaling transformation, our results can be applied to explain the formation and maintenance mechanism of the pillars, and can also provide useful information for future experimental designs.

Submitted 30 January, 2024; originally announced January 2024.

arXiv:2401.07709 [pdf, other]

Towards Efficient Diffusion-Based Image Editing with Instant Attention Masks

Authors: Siyu Zou, Jiji Tang, Yiyi Zhou, Jing He, Chaoyi Zhao, Rongsheng Zhang, Zhipeng Hu, Xiaoshuai Sun

Abstract: Diffusion-based Image Editing (DIE) is an emerging research hot-spot, which often applies a semantic mask to control the target area for diffusion-based editing. However, most existing solutions obtain these masks via manual operations or off-line processing, greatly reducing their efficiency. In this paper, we propose a novel and efficient image editing method for Text-to-Image (T2I) diffusion models, termed Instant Diffusion Editing (InstDiffEdit). In particular, InstDiffEdit aims to employ the cross-modal attention ability of existing diffusion models to achieve instant mask guidance during the diffusion steps. To reduce the noise of attention maps and achieve full automation, we equip InstDiffEdit with a training-free refinement scheme to adaptively aggregate the attention distributions for automatic yet accurate mask generation. Meanwhile, to supplement the existing evaluations of DIE, we propose a new benchmark called Editing-Mask to examine the mask accuracy and local editing ability of existing methods. To validate InstDiffEdit, we also conduct extensive experiments on ImageNet and Imagen, and compare it with a range of SOTA methods. The experimental results show that InstDiffEdit not only outperforms the SOTA methods in both image quality and editing results, but also has a much faster inference speed, i.e., +5 to +6 times.

Submitted 23 January, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

Comments: Accepted by AAAI2024

arXiv:2312.14410 [pdf, other]

A Multi-Stage Adaptive Feature Fusion Neural Network for Multimodal Gait Recognition

Authors: Shinan Zou, Jianbo Xiong, Chao Fan, Shiqi Yu, Jin Tang

Abstract: Gait recognition is a biometric technology that has received extensive attention. Most existing gait recognition algorithms are unimodal, and a few multimodal gait recognition algorithms perform multimodal fusion only once. None of these algorithms may fully exploit the complementary advantages of the multiple modalities. In this paper, by considering the temporal and spatial characteristics of gait data, we propose a multi-stage feature fusion strategy (MSFFS), which performs multimodal fusions at different stages in the feature extraction process. Also, we propose an adaptive feature fusion module (AFFM) that considers the semantic association between silhouettes and skeletons. The fusion process fuses different silhouette areas with their more related skeleton joints. Since visual appearance changes and time passage co-occur in a gait period, we propose a multiscale spatial-temporal feature extractor (MSSTFE) to learn the spatial-temporal linkage features thoroughly. Specifically, MSSTFE extracts and aggregates spatial-temporal linkages information at different spatial scales. Combining the strategy and modules mentioned above, we propose a multi-stage adaptive feature fusion (MSAFF) neural network, which shows state-of-the-art performance in many experiments on three datasets. Besides, MSAFF is equipped with feature dimensional pooling (FD Pooling), which can significantly reduce the dimension of the gait representations without hindering the accuracy. https://github.com/ShinanZou/MSAFF

Submitted 21 December, 2023; originally announced December 2023.

Comments: This paper has been accepted by IJCB2023

ACM Class: I.5

Journal ref: IJCB2023

arXiv:2312.14404 [pdf, other]

Cross-Covariate Gait Recognition: A Benchmark

Authors: Shinan Zou, Chao Fan, Jianbo Xiong, Chuanfu Shen, Shiqi Yu, Jin Tang

Abstract: Gait datasets are essential for gait research. However, this paper observes that present benchmarks, whether conventional constrained or emerging real-world datasets, fall short regarding covariate diversity. To bridge this gap, we undertake an arduous 20-month effort to collect a cross-covariate gait recognition (CCGR) dataset. The CCGR dataset has 970 subjects and about 1.6 million sequences; almost every subject has 33 views and 53 different covariates. Compared to existing datasets, CCGR has both population and individual-level diversity. In addition, the views and covariates are well labeled, enabling the analysis of the effects of different factors. CCGR provides multiple types of gait data, including RGB, parsing, silhouette, and pose, offering researchers a comprehensive resource for exploration. In order to delve deeper into addressing cross-covariate gait recognition, we propose parsing-based gait recognition (ParsingGait) by utilizing the newly proposed parsing data. We have conducted extensive experiments. Our main results show: 1) Cross-covariate emerges as a pivotal challenge for practical applications of gait recognition. 2) ParsingGait demonstrates remarkable potential for further advancement. 3) Alarmingly, existing SOTA methods achieve less than 43% accuracy on the CCGR, highlighting the urgency of exploring cross-covariate gait recognition. Link: https://github.com/ShinanZou/CCGR.

Submitted 4 March, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

Comments: AAAI2024

ACM Class: I.5

Journal ref: AAAI2024

arXiv:2312.02553 [pdf, other]

Design and test for the CEPC muon subdetector based on extruded scintillator and SiPM

Authors: Hongyu Zhang, Xiyang Wang, Weihu Ma, Shiming Zou, Deqing Fang, Wanbing He, Xiaolong Wang, Zhen Wang, Rui Yuan, Qibin Zheng

Abstract: A combination of scintillator, wavelength shifting (WLS) fiber, and silicon photomultiplier (SiPM) shows an excellent performance in the `$K_{L}$ and $μ$ detector (KLM)' of the Belle II experiment. In this study, we present the R&D efforts for a similar detection technology utilizing a new scintillator and SiPM. This technology can be applied to a muon detector for the proposed CEPC experiment. The R&D encompasses the investigation of the performance of a new 150 cm-long scintillator, the NDL SiPM with a sensitive surface of $\times$ 3 mm, or the Hamamatsu MPPC with a sensitive surface of 1.3 mm $\times$ 1.3 mm. Additionally, it includes the construction of a detector strip and the methods employed to achieve excellent light collection. Cosmic ray tests reveal efficient photon collection by the NDL SiPM or the MPPC, with efficiencies well above 90% using a threshold of 8 p.e. The time resolutions for hits at the far end of a scintillator strip are better than 1.7 ns. The observed performance lays the foundation for advancing the R&D, including prototype modules aimed at the reference Technical Design Report of the CEPC detector.

Submitted 21 May, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

Comments: 11 pages, 6 figures

Report number: No. 2022YFA1601903, No. 11925502, No. 11961141003, No. 12175041, No. XDB34030000

arXiv:2312.01186 [pdf, other]

Linker-Tuning: Optimizing Continuous Prompts for Heterodimeric Protein Prediction

Authors: Shuxian Zou, Hui Li, Shentong Mo, Xingyi Cheng, Eric Xing, Le Song

Abstract: Predicting the structure of interacting chains is crucial for understanding biological systems and developing new drugs. Large-scale pre-trained Protein Language Models (PLMs), such as ESM2, have shown impressive abilities in extracting biologically meaningful representations for protein structure prediction. In this paper, we show that ESMFold, which has been successful in computing accurate atomic structures for single-chain proteins, can be adapted to predict the heterodimer structures in a lightweight manner. We propose Linker-tuning, which learns a continuous prompt to connect the two chains in a dimer before running it as a single sequence in ESMFold. Experimental results show that our method successfully predicts 56.98% of interfaces on the i.i.d. heterodimer test set, with an absolute improvement of +12.79% over the ESMFold-Linker baseline. Furthermore, our model can generalize well to the out-of-distribution (OOD) test set HeteroTest2 and two antibody test sets Fab and Fv while being $9\times$ faster than AF-Multimer.

Submitted 2 December, 2023; originally announced December 2023.

arXiv:2312.00300 [pdf, ps, other]

A Large Sample of Extremely Metal-poor Galaxies at $z<1$ Identified from the DESI Early Data

Authors: Hu Zou, Jipeng Sui, Amélie Saintonge, Dirk Scholte, John Moustakas, Malgorzata Siudek, Arjun Dey, Stephanie Juneau, Weijian Guo, Rebecca Canning, J. Aguilar, S. Ahlen, D. Brooks, T. Claybaugh, K. Dawson, A. de la Macorra, P. Doel, J. E. Forero-Romero, S. Gontcho A Gontcho, K. Honscheid, M. Landriau, L. Le Guillou, M. Manera, A. Meisner, R. Miquel , et al. (10 additional authors not shown)

Abstract: Extremely metal-poor galaxies (XMPGs) at relatively low redshift are excellent laboratories for studying galaxy formation and evolution in the early universe. Much effort has been spent on identifying them from large-scale spectroscopic surveys or spectroscopic follow-up observations. Previous work has identified a few hundred XMPGs. In this work, we obtain a large sample of 223 XMPGs at $z<1$ from the early data of the Dark Energy Spectroscopic Instrument (DESI). The oxygen abundance is determined using the direct $T_{\rm e}$ method based on the detection of the [O III]$λ$4363 line. The sample includes 95 confirmed XMPGs based on the oxygen abundance uncertainty; the remaining 128 galaxies are regarded as XMPG candidates. These XMPGs are only 0.01% of the total DESI observed galaxies. Their coordinates and other properties are provided in the paper. The most extreme XMPG has an oxygen abundance of $\sim 1/34 Z_{\odot}$, stellar mass of about $1.5\times10^7 M_{\odot}$ and star formation rate of 0.22 $M_{\odot}$ yr$^{-1}$. The two most extreme XMPGs present distinct morphologies suggesting different formation mechanisms. The local environmental investigation shows that XMPGs preferentially reside in relatively low-density regions. Many of them fall below the stellar mass-metallicity relations (MZRs) of normal star-forming galaxies. From a comparison of the MZR with theoretical simulations, it appears that XMPGs are good analogs to high-redshift star-forming galaxies. The nature of these XMPG populations will be further investigated in detail with larger and more complete samples from the on-going DESI survey.

Submitted 30 November, 2023; originally announced December 2023.

Comments: accepted for publication in ApJ

arXiv:2311.16494 [pdf, other]

ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models

Authors: Xinyu Tian, Shu Zou, Zhaoyuan Yang, Jing Zhang

Abstract: Although soft prompt tuning is effective in efficiently adapting Vision-Language (V&L) models for downstream tasks, it shows limitations in dealing with distribution shifts. We address this issue with Attribute-Guided Prompt Tuning (ArGue), making three key contributions. 1) In contrast to the conventional approach of directly appending soft prompts preceding class names, we align the model with primitive visual attributes generated by Large Language Models (LLMs). We posit that a model's ability to express high confidence in these attributes signifies its capacity to discern the correct class rationales. 2) We introduce attribute sampling to eliminate disadvantageous attributes, thus only semantically meaningful attributes are preserved. 3) We propose negative prompting, explicitly enumerating class-agnostic attributes to activate spurious correlations and encourage the model to generate highly orthogonal probability distributions in relation to these negative features. In experiments, our method significantly outperforms current state-of-the-art prompt tuning methods on both novel class prediction and out-of-distribution generalization tasks.

Submitted 12 March, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

Comments: Accepted to CVPR2024

arXiv:2310.09327 [pdf, other]

MAGNIF: A Tentative Lensed Rotating Disk at $z=8.34$ detected by JWST NIRCam WFSS with Dynamical Forward Modeling

Authors: Zihao Li, Zheng Cai, Fengwu Sun, Johan Richard, Maxime Trebitsch, Jakob M. Helton, Jose M. Diego, Masamune Oguri, Nicholas Foo, Xiaojing Lin, Franz Bauer, Chian-Chou Chen, Christopher J. Conselice, Daniel Espada, Eiichi Egami, Xiaohui Fan, Brenda L. Frye, Yoshinobu Fudamoto, Pablo G. Perez-Gonzalez, Kevin Hainline, Tiger Yu-Yang Hsiao, Zhiyuan Ji, Xiangyu Jin, Anton M. Koekemoer, Vasily Kokorev , et al. (17 additional authors not shown)

Abstract: We report galaxy MACS0416-Y3 behind the lensing cluster MACSJ0416.1--2403 as a tentative rotating disk at $z=8.34$ detected through its [OIII]$\lambda5007$ emission in JWST NIRCam wide-field slitless spectroscopic observations. The discovery is based on our new grism dynamical modeling methodology for JWST NIRCam slitless spectroscopy, using the data from ``Median-band Astrophysics with the Grism of NIRCam in Frontier Fields'' (MAGNIF), a JWST Cycle-2 program. The [OIII]$\lambda5007$ emission line morphology in grism data shows velocity offsets compared to the F480M direct imaging, suggestive of rotation. Assuming a geometrically thin disk model, we constrain the rotation velocity of $v_{\rm rot}=58^{+53}_{-35}$ km s$^{-1}$ via forward modeling of the two-dimensional (2D) spectrum. We obtain the kinematic ratio of $v_{\rm rot}/σ_v=1.6^{+1.9}_{-0.9}$, where $σ_v$ is the velocity dispersion, in line with a quasi-stable thin disk. The resulting dynamical mass is estimated to be $\log(M_{\rm dyn}/M_{\odot})=8.4^{+0.5}_{-0.7}$. If the rotation is confirmed, our discovery suggests that rotating gaseous disks may have already existed within 600 million years after the Big Bang.

Submitted 13 October, 2023; originally announced October 2023.

Comments: 15 pages, 6 figures. Comments welcome

arXiv:2310.08924 [pdf, other]

Attacking The Assortativity Coefficient Under A Rewiring Strategy

Authors: Shuo Zou, Bo Zhou, Qi Xuan

Abstract: Degree correlation is an important characteristic of networks, which is usually quantified by the assortativity coefficient. However, concerns arise about changing the assortativity coefficient of a network when networks suffer from adversarial attacks. In this paper, we analyze the factors that affect the assortativity coefficient and study the optimization problem of maximizing or minimizing the assortativity coefficient ($r$) in rewired networks with $k$ pairs of edges. We propose a greedy algorithm and formulate the optimization problem using integer programming to obtain the optimal solution for this problem. Through experiments, we demonstrate the reasonableness and effectiveness of our proposed algorithm. For example, rewiring 10% of the edges in the ER network improves the assortativity coefficient by 60%.
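As a rough illustration of the quantity discussed in this abstract, the sketch below computes the degree assortativity coefficient $r$ as the Pearson correlation between the degrees found at the two ends of each edge, and then applies a single degree-preserving swap of one pair of edges. The function names and the toy graph are hypothetical choices made for this example; this is not the authors' greedy algorithm or their integer program, only the basic rewiring step those methods would select among.

```python
def degrees(edges):
    """Return a dict mapping each node to its degree."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def assortativity(edges):
    """Pearson correlation of end-point degrees, counting each edge in both
    orientations. Undefined (division by zero) for regular graphs."""
    deg = degrees(edges)
    xs, ys = [], []
    for u, v in edges:
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mean = sum(xs) / n  # xs and ys share the same mean by symmetry
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var


def rewire(edges, i, j):
    """Swap end points of edges i and j: (a,b),(c,d) -> (a,d),(c,b).
    Node degrees are unchanged. This sketch does not check for self-loops
    or duplicate edges, which a real attack would have to exclude."""
    a, b = edges[i]
    c, d = edges[j]
    new_edges = list(edges)
    new_edges[i] = (a, d)
    new_edges[j] = (c, b)
    return new_edges


# Hypothetical toy graph: two hubs (0 and 3) joined to each other,
# each with two leaves, plus one isolated leaf-leaf edge (6, 7).
graph = [(0, 1), (0, 2), (0, 3), (3, 4), (3, 5), (6, 7)]
attacked = rewire(graph, 2, 5)  # break hub-hub and leaf-leaf edges

print(assortativity(graph))
print(assortativity(attacked))
```

In this toy graph the single swap replaces the hub-hub and leaf-leaf edges with two hub-leaf edges, which moves $r$ from $-1/3$ to $-1$ while leaving every node degree unchanged.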

Submitted 13 October, 2023; originally announced October 2023.

arXiv:2310.08789 [pdf, other]

Quickest Change Detection in Autoregressive Models

Authors: Zhongchang Sun, Shaofeng Zou

Abstract: The problem of quickest change detection (QCD) in autoregressive (AR) models is investigated. A system is being monitored with sequentially observed samples. At some unknown time, a disturbance signal occurs and changes the distribution of the observations. The disturbance signal follows an AR model, which is dependent over time. Before the change, observations only consist of measurement noise, and are independent and identically distributed (i.i.d.). After the change, observations consist of the disturbance signal and the measurement noise and are dependent over time, essentially following a continuous-state hidden Markov model (HMM). The goal is to design a stopping time to detect the disturbance signal as quickly as possible subject to false alarm constraints. Existing approaches for general non-i.i.d. settings and discrete-state HMMs cannot be applied due to their high computational complexity and memory consumption, and they usually assume some asymptotic stability condition. In this paper, the asymptotic stability condition is first proved theoretically for the AR model by a novel design of forward variable and auxiliary Markov chain. A computationally efficient Ergodic CuSum algorithm that can be updated recursively is then constructed and is further shown to be asymptotically optimal. The data-driven setting where the disturbance signal parameters are unknown is further investigated, and an online and computationally efficient gradient ascent CuSum algorithm is designed. The algorithm is constructed by iteratively updating the estimate of the unknown parameters based on the maximum likelihood principle and the gradient ascent approach. The lower bound on its average running length to false alarm is also derived for practical false alarm control. Simulation results are provided to demonstrate the performance of the proposed algorithms.
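For readers unfamiliar with the recursive statistic that this abstract builds on, the following is a minimal sketch of the classical CuSum test for the much simpler i.i.d. case, assuming a known Gaussian mean shift from `mu0` to `mu1` with common standard deviation `sigma`. It is not the Ergodic CuSum or the gradient ascent CuSum of the paper, which handle time-dependent AR disturbances; the function name, parameters, and sample data are hypothetical.

```python
def cusum_stopping_time(samples, mu0, mu1, sigma, threshold):
    """Classical CuSum for an i.i.d. Gaussian mean shift (illustrative only).

    The statistic follows the recursion W_t = max(0, W_{t-1} + LLR(x_t)),
    where LLR is the log-likelihood ratio of N(mu1, sigma^2) against
    N(mu0, sigma^2). Returns the first 1-based time at which W_t reaches
    the threshold, or None if no alarm is raised.
    """
    w = 0.0
    for t, x in enumerate(samples, start=1):
        # Gaussian log-likelihood ratio log f1(x) / f0(x)
        llr = ((mu1 - mu0) / sigma ** 2) * (x - (mu0 + mu1) / 2.0)
        w = max(0.0, w + llr)
        if w >= threshold:
            return t
    return None


# Hypothetical noiseless data: the mean jumps from 0 to 1 after sample 5.
data = [0.0] * 5 + [1.0] * 10
alarm = cusum_stopping_time(data, mu0=0.0, mu1=1.0, sigma=1.0, threshold=2.0)
print(alarm)
```

A higher threshold lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off the stopping-time design in the abstract refers to.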

Submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.04456 [pdf, other]

Multimodal Prompt Transformer with Hybrid Contrastive Learning for Emotion Recognition in Conversation

Authors: Shihao Zou, Xianying Huang, Xudong Shen

Abstract: Emotion Recognition in Conversation (ERC) plays an important role in driving the development of human-machine interaction. Emotions can exist in multiple modalities, and multimodal ERC mainly faces two problems: (1) the noise problem in the cross-modal information fusion process, and (2) the problem of predicting emotion labels with few samples that are semantically similar but belong to different categories. To address these issues and fully utilize the features of each modality, we adopted the following strategies: first, deep emotion cues extraction was performed on modalities with strong representation ability, and feature filters were designed as multimodal prompt information for modalities with weak representation ability. Then, we designed a Multimodal Prompt Transformer (MPT) to perform cross-modal information fusion. MPT embeds multimodal fusion information into each attention layer of the Transformer, allowing prompt information to participate in encoding textual features and being fused with multi-level textual information to obtain better multimodal fusion features. Finally, we used the Hybrid Contrastive Learning (HCL) strategy to optimize the model's ability to handle labels with few samples. This strategy uses unsupervised contrastive learning to improve the representation ability of multimodal fusion and supervised contrastive learning to mine the information of labels with few samples. Experimental results show that our proposed model outperforms state-of-the-art models in ERC on two benchmark datasets.

Submitted 4 October, 2023; originally announced October 2023.

Comments: Accepted to ACM MM 2023

arXiv:2309.16757 [pdf]

doi 10.3847/2041-8213/acfee3

A SPectroscopic survey of biased halos In the Reionization Era (ASPIRE): JWST Discovers an Overdensity around a Metal Absorption-selected Galaxy at $z\sim5.5$

Authors: Yunjing Wu, Feige Wang, Zheng Cai, Xiaohui Fan, Kristian Finlator, Jinyi Yang, Joseph F. Hennawi, Fengwu Sun, Jaclyn B. Champagne, Xiaojing Lin, Zihao Li, Zuyi Chen, Eduardo Bañados, George D. Becker, Sarah E. I. Bosman, Gustavo Bruzual, Stephane Charlot, Hsiao-Wen Chen, Jacopo Chevallard, Anna-Christina Eilers, Emanuele Paolo Farina, Xiangyu Jin, Hyunsung D. Jun, Koki Kakiichi, Mingyu Li , et al. (5 additional authors not shown)

Abstract: The launch of ${\it JWST}$ opens a new window for studying the connection between metal-line absorbers and galaxies at the end of the Epoch of Reionization (EoR). Previous studies have detected absorber-galaxy pairs in limited quantities through ground-based observations. To enhance our understanding of the relationship between absorbers and their host galaxies at $z>5$, we utilized the NIRCam Wide Field Slitless Spectroscopy (WFSS) to search for absorber-associated galaxies by detecting their rest-frame optical emission lines (e.g., [OIII] + H$β$). We report the discovery of a MgII-associated galaxy at $z=5.428$ using data from the ${\it JWST}$ ASPIRE program. The MgII absorber is detected on the spectrum of quasar J0305--3150 with a rest-frame equivalent width of 0.74$\mathring{A}$. The associated galaxy has an [OIII] luminosity of $10^{42.5}\ {\rm erg\ s^{-1}}$ with an impact parameter of 24.9 proper kiloparsecs (pkpc). The joint ${\it HST}$-${\it JWST}$ spectral energy distribution (SED) implies a stellar mass and star-formation rate of ${\rm M_* \approx 10^{8.8}}$ ${\rm M_{\odot}}$, ${\rm SFR}\approx 10\ {\rm M_{\odot}\ yr^{-1}}$. Its [OIII] equivalent width and stellar mass are typical of [OIII] emitters at this redshift. Furthermore, connecting the outflow starting time to the SED-derived stellar age, the outflow velocity of this galaxy is $\sim300\ {\rm km\ s^{-1}}$, consistent with theoretical expectations. We identified six additional [OIII] emitters with impact parameters of up to $\sim300$ pkpc at similar redshifts ($|dv|<1000\ {\rm km\ s^{-1}}$). The observed number is consistent with that in cosmological simulations. This pilot study suggests that systematically investigating the absorber-galaxy connection within the ASPIRE program will provide insights into the metal-enrichment history in the early universe.

Submitted 8 November, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: Accepted for publication in ApJL. Main text 8 pages, 4 figures. For more information of the JWST ASPIRE program please check https://aspire-quasar.github.io/index.html

arXiv:2309.10050 [pdf, other]

doi 10.1063/5.0197425

Finite Volume Graph Network (FVGN): Predicting unsteady incompressible fluid dynamics with finite volume informed neural network

Authors: Tianyu Li, Shufan Zou, Xinghua Chang, Laiping Zhang, Xiaogang Deng

Abstract: The rapid development of deep learning has significant implications for the advancement of Computational Fluid Dynamics (CFD). Currently, most pixel-grid-based deep learning methods for flow field prediction exhibit significantly reduced accuracy in predicting boundary layer flows and poor adaptability to geometric shapes. Although Graph Neural Network (GNN) models for unstructured-grid-based unsteady flow prediction have better geometric adaptability, these models suffer from error accumulation in long-term predictions of unsteady flows. More importantly, fully data-driven models often require extensive training time, greatly limiting the rapid update and iteration speed of deep learning models when facing more complex unsteady flows. Therefore, this paper aims to balance the demands for training overhead and prediction accuracy by integrating physical constraints based on the finite volume method into the loss function of the graph neural network. Additionally, it incorporates a twice-message aggregation mechanism inspired by the extended stencil method to enhance the unsteady flow prediction accuracy and geometric shape generalization ability of the graph neural network model on unstructured grids. We focus particularly on the model's predictive accuracy within the boundary layer. Compared to fully data-driven methods, our model achieves better predictive accuracy and geometric shape generalization ability in a shorter training time.

Submitted 17 March, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

Journal ref: Physics of Fluids 1 April 2024; 36 (4): 043601

arXiv:2309.00473 [pdf, other]

A Retrofit Sensing Strategy for Soft Fluidic Robots

Authors: Shibo Zou, Sergio Picella, Jelle de Vries, Vera Kortman, Aimée Sakes, Johannes T. B. Overvelde

Abstract: Soft robots are intrinsically capable of adapting to different environments by changing their shape in response to interaction forces with the environment. However, sensing and feedback are still required for higher level decisions and autonomy. Most sensing technologies developed for soft robots involve the integration of separate sensing elements in soft actuators, which presents a considerable challenge for both the fabrication and robustness of soft robots due to the interface between hard and soft components and the complexity of the assembly. To circumvent this, here we present a versatile sensing strategy that can be retrofitted to existing soft fluidic devices without the need for design changes. We achieve this by measuring the fluidic input that is required to activate a soft actuator and relating this input to its deformed state during interaction with the environment. We demonstrate the versatility of our sensing strategy by tactile sensing of the size, shape, surface roughness and stiffness of objects. Moreover, we demonstrate our approach by retrofitting it to a range of existing pneumatic soft actuators and grippers powered by positive and negative pressure. Finally, we show the robustness of our fluidic sensing strategy in closed-loop control of a soft gripper for practical applications such as sorting, fruit picking and ripeness detection. Based on these results, we conclude that as long as the interaction of the actuator with the environment results in a shape change of the internal volume, soft fluidic actuators require no embedded sensors or design modifications to implement useful sensing. We believe that the relative simplicity, versatility, broad applicability and robustness of our sensing strategy will catalyze new functionalities in soft interactive devices and systems, thereby accelerating the use of soft robotics in real world applications.

Submitted 17 October, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

arXiv:2307.16212 [pdf, other]

Robust Multi-Agent Reinforcement Learning with State Uncertainty

Authors: Sihong He, Songyang Han, Sanbao Su, Shuo Han, Shaofeng Zou, Fei Miao

Abstract: In real-world multi-agent reinforcement learning (MARL) applications, agents may not have perfect state information (e.g., due to inaccurate measurement or malicious attacks), which challenges the robustness of agents' policies. Though robustness is becoming important in MARL deployment, little prior work has studied state uncertainties in MARL, either in problem formulation or in algorithm design. Motivated by this robustness issue and the lack of corresponding studies, we study the problem of MARL with state uncertainty in this work. We make a first attempt at the theoretical and empirical analysis of this challenging problem. We first model the problem as a Markov Game with state perturbation adversaries (MG-SPA) by introducing a set of state perturbation adversaries into a Markov Game. We then introduce robust equilibrium (RE) as the solution concept of an MG-SPA. We conduct a fundamental analysis regarding MG-SPA such as giving conditions under which such a robust equilibrium exists. Then we propose a robust multi-agent Q-learning (RMAQ) algorithm to find such an equilibrium, with convergence guarantees. To handle high-dimensional state-action space, we design a robust multi-agent actor-critic (RMAAC) algorithm based on an analytical expression of the policy gradient derived in the paper. Our experiments show that the proposed RMAQ algorithm converges to the optimal value function; our RMAAC algorithm outperforms several MARL and robust MARL methods in multiple multi-agent environments when state uncertainty is present. The source code is public on \url{https://github.com/sihongho/robust_marl_with_state_uncertainty}.

Submitted 30 July, 2023; originally announced July 2023.

Comments: 50 pages, Published in TMLR, Transactions on Machine Learning Research (06/2023)

arXiv:2306.08820 [pdf, other]

doi 10.1103/PhysRevB.108.165153

Ultrasonic investigation of the Kondo semimetal CeBi

Authors: Yupeng Pan, Xiaobo He, Shuo Zou, Hai Zeng, Yuqian Zhao, Ziyu Li, Yuesheng Li, Yongkang Luo

Abstract: We report the elastic properties of the Kondo semimetal CeBi by resonant ultrasound spectroscopy measurements at zero magnetic field. Clear elastic softening is found in bulk modulus $C_B$ below $\sim 60$ K. Such a softening in $C_B$, in addition to the anomalous temperature dependent Poisson's ratio, is hardly attributable to multipolar response for stable localized $4f$ orbital, but can be well described by a two-band model arising from the hybridization between conduction- and $4f$- electrons. These results probably are consequences of the valence fluctuations in this Kondo semimetal as originally suggested by a Fermi-surface expansion observed in a previous angle-resolved photoemission spectroscopy study [P. Li \textit{et al.}, Phys. Rev. B $\mathbf{100}$, 155110 (2019)].

Submitted 31 October, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

Comments: 7+4 pages, 5+2 figures, 2+1 tables

Journal ref: Phys. Rev. B 108, 165153 (2023)

arXiv:2306.06321 [pdf, other]

GTC Follow-up Observations of Very Metal-Poor Star Candidates from DESI

Authors: Carlos Allende Prieto, David S. Aguado, Jonay I. González Hernández, Rafael Rebolo, Joan Najita, Christopher J. Manser, Constance Rockosi, Zachary Slepian, Mar Mezcua, Monica Valluri, Rana Ezzeddine, Sergey E. Koposov, Andrew P. Cooper, Arjun Dey, Boris T. Gänsicke, Ting S. Li, Katia Cunha, Siwei Zou, Jessica Nicole Aguilar, Steven Ahlen, David Brooks, Todd Claybaugh, Shaun Cole, Sarah Eftekharzadeh, Kevin Fanning , et al. (26 additional authors not shown)

Abstract: The observations from the Dark Energy Spectroscopic Instrument (DESI) will significantly increase the numbers of known extremely metal-poor stars by a factor of ~ 10, improving the sample statistics to study the early chemical evolution of the Milky Way and the nature of the first stars. In this paper we report high signal-to-noise follow-up observations of 9 metal-poor stars identified during the DESI commissioning with the Optical System for Imaging and low-Intermediate-Resolution Integrated Spectroscopy (OSIRIS) instrument on the 10.4m Gran Telescopio Canarias (GTC). The analysis of the data using a well-vetted methodology confirms the quality of the DESI spectra and the performance of the pipelines developed for the data reduction and analysis of DESI data.

Submitted 27 October, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: 14 pages, 4 figures, accepted for publication in ApJ, data available from https://doi.org/10.5281/zenodo.8363303

arXiv:2306.06308 [pdf, other]

doi 10.5281/zenodo.7964161

The Early Data Release of the Dark Energy Spectroscopic Instrument

Authors: DESI Collaboration, A. G. Adame, J. Aguilar, S. Ahlen, S. Alam, G. Aldering, D. M. Alexander, R. Alfarsy, C. Allende Prieto, M. Alvarez, O. Alves, A. Anand, F. Andrade-Oliveira, E. Armengaud, J. Asorey, S. Avila, A. Aviles, S. Bailey, A. Balaguera-Antolínez, O. Ballester, C. Baltay, A. Bault, J. Bautista, J. Behera, S. F. Beltran , et al. (240 additional authors not shown)

Abstract: The Dark Energy Spectroscopic Instrument (DESI) completed its five-month Survey Validation in May 2021. Spectra of stellar and extragalactic targets from Survey Validation constitute the first major data sample from the DESI survey. This paper describes the public release of those spectra, the catalogs of derived properties, and the intermediate data products. In total, the public release includes good-quality spectral information from 466,447 objects targeted as part of the Milky Way Survey, 428,758 as part of the Bright Galaxy Survey, 227,318 as part of the Luminous Red Galaxy sample, 437,664 as part of the Emission Line Galaxy sample, and 76,079 as part of the Quasar sample. In addition, the release includes spectral information from 137,148 objects that expand the scope beyond the primary samples as part of a series of secondary programs. Here, we describe the spectral data, data quality, data products, Large-Scale Structure science catalogs, access to the data, and references that provide relevant background to using these spectra.

Submitted 15 June, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: 43 pages, 7 figures, 17 tables, submitted to AJ, DESI EDR references added

arXiv:2306.06307 [pdf, other]

doi 10.5281/zenodo.7858207

Validation of the Scientific Program for the Dark Energy Spectroscopic Instrument

Authors: DESI Collaboration, A. G. Adame, J. Aguilar, S. Ahlen, S. Alam, G. Aldering, D. M. Alexander, R. Alfarsy, C. Allende Prieto, M. Alvarez, O. Alves, A. Anand, F. Andrade-Oliveira, E. Armengaud, J. Asorey, S. Avila, A. Aviles, S. Bailey, A. Balaguera-Antolínez, O. Ballester, C. Baltay, A. Bault, J. Bautista, J. Behera, S. F. Beltran , et al. (239 additional authors not shown)

Abstract: The Dark Energy Spectroscopic Instrument (DESI) was designed to conduct a survey covering 14,000 deg$^2$ over five years to constrain the cosmic expansion history through precise measurements of Baryon Acoustic Oscillations (BAO). The scientific program for DESI was evaluated during a five-month Survey Validation (SV) campaign before beginning full operations. This program produced deep spectra of tens of thousands of objects from each of the stellar (MWS), bright galaxy (BGS), luminous red galaxy (LRG), emission line galaxy (ELG), and quasar target classes. These SV spectra were used to optimize redshift distributions, characterize exposure times, determine calibration procedures, and assess observational overheads for the five-year program. In this paper, we present the final target selection algorithms, redshift distributions, and projected cosmology constraints resulting from those studies. We also present a `One-Percent survey' conducted at the conclusion of Survey Validation covering 140 deg$^2$ using the final target selection algorithms with exposures of a depth typical of the main survey. The Survey Validation indicates that DESI will be able to complete the full 14,000 deg$^2$ program with spectroscopically-confirmed targets from the MWS, BGS, LRG, ELG, and quasar programs with total sample sizes of 7.2, 13.8, 7.46, 15.7, and 2.87 million, respectively. These samples will allow exploration of the Milky Way halo, clustering on all scales, and BAO measurements with a statistical precision of 0.28% over the redshift interval $z<1.1$, 0.39% over the redshift interval $1.1<z<1.9$, and 0.46% over the redshift interval $1.9<z<3.5$.

Submitted 12 January, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

Comments: 42 pages, 18 figures, accepted by AJ

arXiv:2305.20016

doi 10.3847/1538-3881/ace62c

Detecting and Characterizing Mg II absorption in DESI Survey Validation Quasar Spectra

Authors: Lucas Napolitano, Agnesh Pandey, Adam D. Myers, Ting-Wen Lan, Abhijeet Anand, Jessica Aguilar, Steven Ahlen, David M. Alexander, David Brooks, Rebecca Canning, Chiara Circosta, Axel De La Macorra, Peter Doel, Sarah Eftekharzadeh, Victoria A. Fawcett, Andreu Font-Ribera, Juan Garcia-Bellido, Satya Gontcho A Gontcho, L. Le Guillou, Julien Guy, Klaus Honscheid, Stephanie Juneau, T. Kisner, Martin Landriau, Aaron M. Meisner , et al. (11 additional authors not shown)

Abstract: We present findings of the detection of Magnesium II (Mg II, λ = 2796, 2803 Å) absorbers from the early data release of the Dark Energy Spectroscopic Instrument (DESI). DESI is projected to obtain spectroscopy of approximately 3 million quasars (QSOs), of which over 99% are anticipated to be at redshifts greater than z > 0.3, such that DESI would be able to observe an associated or intervening Mg II absorber illuminated by the background QSO. We have developed an autonomous supplementary spectral pipeline that detects these systems through an initial line-fitting process and then confirms the line properties using a Markov Chain Monte Carlo sampler. Based upon a visual inspection of the resulting systems, we estimate that this sample has a purity greater than 99%. We have also investigated the completeness of our sample in regard to both the signal-to-noise properties of the input spectra and the rest-frame equivalent width (W0) of the absorber systems. From a parent catalog containing 83,207 quasars, we detect a total of 23,921 Mg II absorption systems following a series of quality cuts. Extrapolating from this occurrence rate of 28.8% implies a catalog at the completion of the five-year DESI survey that will contain over eight hundred thousand Mg II absorbers. The cataloging of these systems will enable significant further research because they carry information regarding circumgalactic medium environments, the distribution of intervening galaxies, and the growth of metallicity across the redshift range 0.3 < z < 2.5.

Submitted 30 August, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

Comments: 17 pages, 8 figures

Journal ref: 2023AJ....166...99N

arXiv:2305.19571

Fractional weak adversarial networks for the stationary fractional advection dispersion equations

Authors: Dian Feng, Zhiwei Yang, Sen Zou

Abstract: In this article, we propose the fractional weak adversarial networks (f-WANs) for the stationary fractional advection dispersion equations (FADE) based on their weak formulas. This enables us to handle less regular solutions for the fractional equations. To handle the non-local property of the fractional derivatives, convolutional layers and special loss functions are introduced in this neural network. Numerical experiments for both smooth and less regular solutions show the validity of f-WANs.

Submitted 8 June, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

MSC Class: 35B65; 34A08

arXiv:2305.13289

Achieving the Minimax Optimal Sample Complexity of Offline Reinforcement Learning: A DRO-Based Approach

Authors: Yue Wang, Jinjun Xiong, Shaofeng Zou

Abstract: Offline reinforcement learning aims to learn from pre-collected datasets without active exploration. This problem faces significant challenges, including limited data availability and distributional shifts. Existing approaches adopt a pessimistic stance towards uncertainty by penalizing rewards of under-explored state-action pairs to estimate value functions conservatively. In this paper, we show that the distributionally robust optimization (DRO) based approach can also address these challenges and is minimax optimal. Specifically, we directly model the uncertainty in the transition kernel and construct an uncertainty set of statistically plausible transition kernels. We then find the policy that optimizes the worst-case performance over this uncertainty set. We first design a metric-based Hoeffding-style uncertainty set such that with high probability the true transition kernel is in this set. We prove that to achieve a sub-optimality gap of $ε$, the sample complexity is $\mathcal{O}(S^2C^{π^*}ε^{-2}(1-γ)^{-4})$, where $γ$ is the discount factor, $S$ is the number of states, and $C^{π^*}$ is the single-policy clipped concentrability coefficient which quantifies the distribution shift. To achieve the optimal sample complexity, we further propose a less conservative Bernstein-style uncertainty set, which, however, does not necessarily include the true transition kernel. We show that an improved sample complexity of $\mathcal{O}(SC^{π^*}ε^{-2}(1-γ)^{-3})$ can be obtained, which matches with the minimax lower bound for offline reinforcement learning, and thus is minimax optimal.

Submitted 3 December, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

arXiv:2305.10504

Model-Free Robust Average-Reward Reinforcement Learning

Authors: Yue Wang, Alvaro Velasquez, George Atia, Ashley Prater-Bennette, Shaofeng Zou

Abstract: Robust Markov decision processes (MDPs) address the challenge of model uncertainty by optimizing the worst-case performance over an uncertainty set of MDPs. In this paper, we focus on the robust average-reward MDPs under the model-free setting. We first theoretically characterize the structure of solutions to the robust average-reward Bellman equation, which is essential for our later convergence analysis. We then design two model-free algorithms, robust relative value iteration (RVI) TD and robust RVI Q-learning, and theoretically prove their convergence to the optimal solution. We provide several widely used uncertainty sets as examples, including those defined by the contamination model, total variation, Chi-squared divergence, Kullback-Leibler (KL) divergence and Wasserstein distance.

Submitted 17 May, 2023; originally announced May 2023.

Comments: ICML 2023

Showing 1–50 of 701 results for author: Zou, S