doi 10.1016/j.physletb.2024.138828

Spectroscopy of deeply bound orbitals in neutron-rich Ca isotopes

Authors: P. J. Li, J. Lee, P. Doornenbal, S. Chen, S. Wang, A. Obertelli, Y. Chazono, J. D. Holt, B. S. Hu, K. Ogata, Y. Utsuno, K. Yoshida, N. L. Achouri, H. Baba, F. Browne, D. Calvet, F. Château, N. Chiga, A. Corsi, M. L. Cortés, A. Delbart, J-M. Gheller, A. Giganon, A. Gillibert, C. Hilaire, et al. (63 additional authors not shown)

Abstract: The calcium isotopes are an ideal system to investigate the evolution of shell structure and magic numbers. Although the properties of surface nucleons in calcium have been well studied, probing the structure of deeply bound nucleons remains a challenge. Here, we report on the first measurement of unbound states in $^{53}$Ca and $^{55}$Ca, populated from $^{54,56}$Ca($p,pn$) reactions at a beam energy of around 216 MeV/nucleon at the RIKEN Radioactive Isotopes Beam Factory. The resonance properties, partial cross sections, and momentum distributions of these unbound states were analyzed. Orbital angular momentum $l$ assignments were extracted from momentum distributions based on calculations using the distorted wave impulse approximation (DWIA) reaction model. The resonances at excitation energies of 5516(41)\,keV in $^{53}$Ca and 6000(250)\,keV in $^{55}$Ca indicate a significant $l=3$ component, providing the first experimental evidence for the $ν0f_{7/2}$ single-particle strength of unbound hole states in the neutron-rich Ca isotopes. The observed excitation energies and cross sections point towards extremely localized and well separated strength distributions, with some fragmentation for the $ν0f_{7/2}$ orbital in $^{55}$Ca. These results are in good agreement with predictions from shell-model calculations using the effective GXPF1Bs interaction and \textit{ab initio} calculations and diverge markedly from the experimental distributions in the nickel isotones at $Z=28$.

Submitted 5 July, 2024; originally announced July 2024.

Comments: 13 pages, 7 figures

Journal ref: Phys. Lett. B, 855 (2024),138828

arXiv:2406.18817 [pdf, other]

Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis

Authors: Mingyang Zhao, Jingen Jiang, Lei Ma, Shiqing Xin, Gaofeng Meng, Dong-Ming Yan

Abstract: This paper presents a novel non-rigid point set registration method that is inspired by unsupervised clustering analysis. Unlike previous approaches that treat the source and target point sets as separate entities, we develop a holistic framework where they are formulated as clustering centroids and clustering members, respectively. We then adopt Tikhonov regularization with an $\ell_1$-induced Laplacian kernel instead of the commonly used Gaussian kernel to ensure smooth and more robust displacement fields. Our formulation delivers closed-form solutions, theoretical guarantees, independence from dimensions, and the ability to handle large deformations. Subsequently, we introduce a clustering-improved Nyström method to effectively reduce the computational complexity and storage of the Gram matrix to linear, while providing a rigorous bound for the low-rank approximation. Our method achieves highly accurate results across various scenarios and surpasses competitors by a significant margin, particularly on shapes with substantial deformations. Additionally, we demonstrate the versatility of our method in challenging tasks such as shape transfer and medical registration.

Submitted 26 June, 2024; originally announced June 2024.

Comments: [CVPR 2024 Highlight] Project and code at: https://github.com/zikai1/CVPR24_PointSetReg

arXiv:2406.11824 [pdf, other]

Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation

Authors: Alexander Raistrick, Lingjie Mei, Karhan Kayan, David Yan, Yiming Zuo, Beining Han, Hongyu Wen, Meenal Parakh, Stamatis Alexandropoulos, Lahav Lipson, Zeyu Ma, Jia Deng

Abstract: We introduce Infinigen Indoors, a Blender-based procedural generator of photorealistic indoor scenes. It builds upon the existing Infinigen system, which focuses on natural scenes, but expands its coverage to indoor scenes by introducing a diverse library of procedural indoor assets, including furniture, architecture elements, appliances, and other day-to-day objects. It also introduces a constraint-based arrangement system, which consists of a domain-specific language for expressing diverse constraints on scene composition, and a solver that generates scene compositions that maximally satisfy the constraints. We provide an export tool that allows the generated 3D objects and scenes to be directly used for training embodied agents in real-time simulators such as Omniverse and Unreal. Infinigen Indoors is open-sourced under the BSD license. Please visit https://infinigen.org for code and videos.

Submitted 17 June, 2024; originally announced June 2024.

Comments: Accepted to CVPR 2024

arXiv:2406.10693 [pdf, ps, other]

On $L^p$ extremals for Fourier extension estimate to fractional surface

Authors: Boning Di, Ning Liu, Dunyan Yan

Abstract: This article investigates the Fourier extension operator associated to the fractional surface $(ξ,|ξ|^α)$ with $α\geq 2$. We show that nearly all valid scale-invariant Fourier extension inequalities possess extremals. More precisely, if the Fourier extension operator is bounded for an endpoint $p_0$, then for all $1<p< p_0$, the corresponding $L^p$-extremal sequences are precompact up to symmetries. In particular, we have developed the relevant $L^p$-profile decomposition for these Fourier extension operators.

Submitted 15 June, 2024; originally announced June 2024.

Comments: 23 pages, 52 references

arXiv:2406.09802 [pdf, other]

doi 10.1088/1674-4527/ad47de

Simulating the Escaping Atmosphere of GJ 436 b with Two-fluid Magnetohydrodynamic Models

Authors: Lei Xing, Jianheng Guo, Chuyuan Yang, Dongdong Yan

Abstract: Observations of transmission spectra reveal that hot Jupiters and Neptunes are likely to possess escaping atmospheres driven by stellar radiation. Numerous models predict that magnetic fields may exert significant influences on the atmospheres of hot planets. Generally, the escaping atmospheres are not entirely ionized, and magnetic fields only directly affect the escape of ionized components within them. Considering the chemical reactions between ionized components and neutral atoms, as well as collision processes, magnetic fields indirectly impact the escape of neutral atoms, thereby influencing the detection signals of planetary atmospheres in transmission spectra. In order to simulate this process, we developed a magneto-hydrodynamic multi-fluid model based on the MHD code PLUTO. As an initial exploration, we investigated the impact of magnetic fields on the decoupling of H$^+$ and H in the escaping atmosphere of the hot Neptune GJ 436 b. Due to the strong resonant interactions between H and H$^+$, the coupling between them is tight even if the magnetic field is strong. Our work thus also suggests that merging H and H$^+$ into a single flow can be a reasonable assumption in MHD simulations of escaping atmospheres. However, our simulation results indicate that under the influence of magnetic fields, there are noticeable regional differences in the decoupling of H$^+$ and H. With the increase of magnetic field strength, the degree of decoupling also increases. For heavier particles such as O, the decoupling between O and H$^+$ is more pronounced. Our findings provide important insights for future studies on the decoupling processes of heavy atoms in the escaping atmospheres of hot Jupiters and hot Neptunes under the influence of magnetic fields.

Submitted 19 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.08698 [pdf, other]

Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: In this work we search for signals generated by ultra-heavy dark matter in the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma rays from dark matter annihilation or decay in 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter because they have low fluxes of astrophysical $γ$-ray background yet contain large amounts of dark matter. By analyzing more than 700 days of observational data at LHAASO, no significant dark matter signal from 1 TeV to 1 EeV is detected. Accordingly we derive the most stringent constraints on the ultra-heavy dark matter annihilation cross-section up to EeV. The constraints on the lifetime of dark matter in decay mode are also derived.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 17 pages, 12 figures, accepted by PRL

arXiv:2406.07921 [pdf, other]

A Two-Stage Online Algorithm for EV Charging Station Energy Management and Carbon Trading

Authors: Dongxiang Yan, Shihan Huang, Sen Li, Xiaoyi Fan, Yue Chen

Abstract: The increasing electric vehicle (EV) adoption challenges the energy management of charging stations (CSs) due to the large number of EVs and the underlying uncertainties. Moreover, the carbon footprint of CSs is growing significantly due to the rising charging power demand. This makes it important for CSs to properly manage their energy usage and ensure that their carbon footprint stays within their carbon emission quotas. This paper proposes a two-stage online algorithm for this purpose, considering the different time scales of energy management and carbon trading. In the first stage, the CS characterizes the real-time aggregate EV power flexibility, in terms of upper and lower bounds on the total charging power, by a Lyapunov optimization-based online algorithm. In the second stage, the CS co-optimizes energy management and carbon trading, with EV charging power chosen within the aggregate flexibility region provided by the first stage. A generalized battery model is proposed to capture the dynamic carbon footprint changes and carbon trading. A virtual carbon queue is designed to develop an online algorithm for the second stage, which can ensure that the carbon footprint of the CS stays within its carbon emission quota and that its total operation cost is nearly offline optimal. Case studies validate the effectiveness and advantages of the proposed algorithm.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 12 pages, 13 figures

arXiv:2406.07327 [pdf, other]

3D-Properties: Identifying Challenges in DPO and Charting a Path Forward

Authors: Yuzi Yan, Yibo Miao, Jialian Li, Yipin Zhang, Jian Xie, Zhijie Deng, Dong Yan

Abstract: Aligning large language models (LLMs) with human preference has recently gained tremendous attention, with the canonical yet costly RLHF-PPO and the simple and straightforward Direct Preference Optimization (DPO) as two examples. Despite its efficiency, DPO has rarely been used in state-of-the-art production-level LLMs, implying its potential pathologies. In this work, we revisit DPO with a comprehensive examination of its empirical efficacy and a systematic comparison with RLHF-PPO. We identify the \textbf{3D}-properties of DPO's learning outcomes: the \textbf{D}rastic drop in the likelihood of rejected responses, the \textbf{D}egradation into LLM unlearning, and the \textbf{D}ispersion effect on unseen responses through experiments with both a carefully designed toy model and practical LLMs on tasks including mathematical problem-solving and instruction following. These findings inherently connect to some observations made by related works and we additionally contribute a plausible theoretical explanation for them. Accordingly, we propose easy regularization methods to mitigate the issues caused by \textbf{3D}-properties, improving the training stability and final performance of DPO. Our contributions also include an investigation into how the distribution of the paired preference data impacts the effectiveness of DPO. We hope this work could offer research directions to narrow the gap between reward-free preference learning methods and reward-based ones.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.00959 [pdf, other]

Ta2Pd3Te5 topological thermometer

Authors: Yupeng Li, Anqi Wang, Senyang Pan, Dayu Yan, Guang Yang, Xingchen Guo, Yu Hong, Guangtong Liu, Fanming Qu, Zhijun Wang, Tian Qian, Jinglei Zhang, Youguo Shi, Li Lu, Jie Shen

Abstract: In recent decades, there has been a persistent pursuit of applications for surface/edge states in topological systems, driven by their dissipationless transport effects. However, there have been limited tangible breakthroughs in this field. This work demonstrates the remarkable properties of the topological insulator Ta2Pd3Te5 as a thermometer. This material exhibits a power-law correlation in temperature-dependent resistance at low temperatures, stemming from the Luttinger liquid behavior of its edge states, while exhibiting semiconductor behavior at high temperatures. The power-law behavior effectively addresses the issue of infinite resistance in semiconductor thermometers at ultra-low temperatures, thereby playing a crucial role in enabling efficient thermometry in refrigerators supporting millikelvin temperatures or below. By employing chemical doping, adjusting thickness, and controlling gate voltage, its power-law behavior and semiconductor behavior can be effectively modulated. This enables efficient thermometry spanning from millikelvin temperatures to room temperature, and allows for precise local temperature measurement. Furthermore, this thermometer exhibits excellent temperature sensitivity and resolution, and can be fine-tuned to show small magnetoresistance. In summary, the Ta2Pd3Te5 thermometer, also referred to as a topological thermometer, exhibits outstanding performance and significant potential for measuring a wider range of temperatures compared to conventional low-temperature thermometers.

Submitted 2 June, 2024; originally announced June 2024.

Comments: 15 pages, 9 figures

arXiv:2406.00347 [pdf, other]

E$^3$-Net: Efficient E(3)-Equivariant Normal Estimation Network

Authors: Hanxiao Wang, Mingyang Zhao, Weize Quan, Zhen Chen, Dong-ming Yan, Peter Wonka

Abstract: Point cloud normal estimation is a fundamental task in 3D geometry processing. While recent learning-based methods achieve notable advancements in normal prediction, they often overlook the critical aspect of equivariance. This results in inefficient learning of symmetric patterns. To address this issue, we propose E$^3$-Net to achieve equivariance for normal estimation. We introduce an efficient random frame method, which significantly reduces the training resources required for this task to just 1/8 of previous work and improves the accuracy. Further, we design a Gaussian-weighted loss function and a receptive-aware inference strategy that effectively utilizes the local properties of point clouds. Our method achieves superior results on both synthetic and real-world datasets, and outperforms current state-of-the-art techniques by a substantial margin. We improve RMSE by 4% on the PCPNet dataset, 2.67% on the SceneNN dataset, and 2.44% on the FamousShape dataset.

Submitted 1 June, 2024; originally announced June 2024.

arXiv:2405.16964 [pdf, other]

Exploring the LLM Journey from Cognition to Expression with Linear Representations

Authors: Yuzi Yan, Jialian Li, Yipin Zhang, Dong Yan

Abstract: This paper presents an in-depth examination of the evolution and interplay of cognitive and expressive capabilities in large language models (LLMs), with a specific focus on Baichuan-7B and Baichuan-33B, an advanced bilingual (Chinese and English) LLM series. We define and explore the model's cognitive and expressive capabilities through linear representations across three critical phases: Pretraining, Supervised Fine-Tuning (SFT), and Reinforcement Learning from Human Feedback (RLHF). Cognitive capability is defined as the quantity and quality of information conveyed by the neuron output vectors within the network, similar to the neural signal processing in human cognition. Expressive capability is defined as the model's capability to produce word-level output. Our findings unveil a sequential development pattern, where cognitive abilities are largely established during Pretraining, whereas expressive abilities predominantly advance during SFT and RLHF. Statistical analyses confirm a significant correlation between the two capabilities, suggesting that cognitive capacity may limit expressive potential. The paper also explores the theoretical underpinnings of these divergent developmental trajectories and their connection to the LLMs' architectural design. Moreover, we evaluate various optimization-independent strategies, such as few-shot learning and repeated sampling, which bridge the gap between cognitive and expressive capabilities. This research reveals the potential connection between the hidden space and the output space, contributing valuable insights into the interpretability and controllability of their training processes.

Submitted 27 May, 2024; originally announced May 2024.

Comments: Published in ICML 2024

arXiv:2405.12739 [pdf, other]

SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling

Authors: Xingzhou Lou, Junge Zhang, Jian Xie, Lifeng Liu, Dong Yan, Kaiqi Huang

Abstract: Human preference alignment is critical in building powerful and reliable large language models (LLMs). However, current methods either ignore the multi-dimensionality of human preferences (e.g. helpfulness and harmlessness) or struggle with the complexity of managing multiple reward models. To address these issues, we propose Sequential Preference Optimization (SPO), a method that sequentially fine-tunes LLMs to align with multiple dimensions of human preferences. SPO avoids explicit reward modeling, directly optimizing the models to align with nuanced human preferences. We theoretically derive the closed-form optimal SPO policy and loss function. Gradient analysis is conducted to show how SPO manages to fine-tune the LLMs while maintaining alignment on previously optimized dimensions. Empirical results on LLMs of different sizes and multiple evaluation datasets demonstrate that SPO successfully aligns LLMs across multiple dimensions of human preferences and significantly outperforms the baselines.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.11826 [pdf, other]

Data quality control system and long-term performance monitor of the LHAASO-KM2A

Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, et al. (263 additional authors not shown)

Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To ensure the reliability of the LHAASO-KM2A data, a three-level quality control system has been established. It is used to monitor the status of detector units, stability of reconstructed parameters and the performance of the array based on observations of the Crab Nebula and Moon shadow. This paper will introduce the control system and its application on the LHAASO-KM2A data collected from August 2021 to July 2023. During this period, the pointing and angular resolution of the array were stable. From the observations of the Moon shadow and Crab Nebula, the results achieved using the two methods are consistent with each other. According to the observation of the Crab Nebula at energies from 25 TeV to 100 TeV, the time averaged pointing errors are estimated to be $-0.003^{\circ} \pm 0.005^{\circ}$ and $0.001^{\circ} \pm 0.006^{\circ}$ in the R.A. and Dec directions, respectively.

Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

Comments: 15 pages, 9 figures

arXiv:2405.07691 [pdf, other]

Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

Abstract: The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma-ray source, 1LHAASO J1219+2915. In this paper, a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) is compatible with NGC 4278 within $\sim0.03$ degrees. Variation analysis shows an indication of variability at the level of a few months in the TeV band, which is consistent with low frequency observations. Based on these observations, we report the detection of TeV $γ$-ray emissions from this low-luminosity AGN NGC 4278. The observations by LHAASO-WCDA during the active period have a significance level of 8.8\,$σ$ with best-fit photon spectral index $\varGamma=2.56\pm0.14$ and a flux $f_{1-10\,\rm{TeV}}=(7.0\pm1.1_{\rm{sta}}\pm0.35_{\rm{syst}})\times10^{-13}\,\rm{photons\,cm^{-2}\,s^{-1}}$, or approximately $5\%$ of the Crab Nebula. The discovery of VHE emission from NGC 4278 indicates that the compact, weak radio jet can efficiently accelerate particles and emit TeV photons.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 5 figures

arXiv:2404.19198 [pdf, ps, other]

$CP$-violating observables of four-body $B_{(s)} \to (ππ)(K\bar{K})$ decays in perturbative QCD

Authors: Da-Cheng Yan, Yan Yan, Zhou Rui

Abstract: In this work, we investigate six helicity amplitudes of the four-body $B_{(s)} \to (ππ)(K\bar{K})$ decays in the perturbative QCD (PQCD) approach. The $ππ$ invariant mass spectrum is dominated by the vector resonance $ρ(770)$ together with scalar resonance $f_0(980)$, while the vector resonance $φ(1020)$ and scalar resonance $f_0(980)$ are expected to contribute in the $K\bar{K}$ invariant mass range. We extract the two-body branching ratios ${\cal B}(B_{(s)}\to ρφ)$ from the corresponding four-body decays $B_{(s)}\to ρφ\to (ππ)(K \bar K)$. The predicted ${\cal B}(B^0_{s}\to ρφ)$ agrees well with the current experimental data within errors. The longitudinal polarization fractions of the $B_{(s)}\to ρφ$ decays are found to be as large as $90\%$, basically consistent with the previous two-body predictions within uncertainties. In addition, the triple-product asymmetries (TPAs) of the considered decays are also presented for the first time. Since the $B_s^0\to ρ^0φ\to(π^+π^-)(K^+K^-)$ decay is induced by both tree and penguin operators, the values of the ${\cal A}^{\rm CP}_{\rm dir}$ and ${\cal A}^{1}_{\text{T-true}}$ are calculated to be $(21.8^{+2.7}_{-3.3})\%$ and $(-10.23^{+1.73}_{-1.56})\%$ respectively. For the pure penguin decays $B^0\to ρ^0φ\to(π^+π^-)(K^+K^-)$ and $B^+\to ρ^+φ\to(π^+π^0)(K^+K^-)$, by contrast, both the direct $CP$ asymmetries and "true" TPAs are naturally expected to be zero in the standard model (SM). The "fake" TPAs requiring no weak phase difference are usually nonzero for all considered decay channels. The sizable "fake" ${\cal A}^{1}_{\text{T-fake}}=(-20.92^{+6.26}_{-2.80})\%$ of the $B^0\to ρ^0φ\to(π^+π^-)(K^+K^-)$ decay is predicted in the PQCD approach, which provides valuable information on the final-state interactions. Our predictions can be tested by future experiments.

Submitted 7 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

Comments: 24 pages, 2 figures. arXiv admin note: text overlap with arXiv:2308.12543, arXiv:2204.01092, arXiv:2107.10684

arXiv:2404.17917 [pdf, other]

EvaNet: Elevation-Guided Flood Extent Mapping on Earth Imagery

Authors: Mirza Tanzim Sami, Da Yan, Saugat Adhikari, Lyuheng Yuan, Jiao Han, Zhe Jiang, Jalal Khalil, Yang Zhou

Abstract: Accurate and timely mapping of flood extent from high-resolution satellite imagery plays a crucial role in disaster management such as damage assessment and relief activities. However, current state-of-the-art solutions are based on U-Net, which cannot segment the flood pixels accurately due to the ambiguous pixels (e.g., tree canopies, clouds) that prevent a direct judgement from only the spectral features. Thanks to the digital elevation model (DEM) data readily available from sources such as United States Geological Survey (USGS), this work explores the use of an elevation map to improve flood extent mapping. We propose EvaNet, an elevation-guided segmentation model based on the encoder-decoder architecture with two novel techniques: (1) a loss function encoding the physical law of gravity that if a location is flooded (resp. dry), then its adjacent locations with a lower (resp. higher) elevation must also be flooded (resp. dry); (2) a new (de)convolution operation that integrates the elevation map by a location sensitive gating mechanism to regulate how much spectral features flow through adjacent layers. Extensive experiments show that EvaNet significantly outperforms the U-Net baselines, and works as a perfect drop-in replacement for U-Net in existing solutions to flood extent mapping.

Submitted 12 May, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

Comments: Accepted at the International Joint Conference on Artificial Intelligence (IJCAI, 2024)
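
The gravity rule quoted in the EvaNet abstract can be illustrated with a small consistency check. The sketch below is not the paper's loss function; it only counts adjacent cell pairs that break the stated rule on a toy grid. The function name `count_gravity_violations`, the 4-neighbour adjacency, and the array layout are assumptions made for illustration. Note that the "flooded implies lower neighbours flooded" and "dry implies higher neighbours dry" statements flag the same pairs, so one test per pair covers both.

```python
import numpy as np

def count_gravity_violations(flood, elev):
    """Count 4-neighbour pairs where a flooded cell borders a strictly lower dry cell.

    flood: 2-D array of 0/1 flood labels (hypothetical input layout).
    elev:  2-D array of elevations with the same shape.
    """
    flood = np.asarray(flood, dtype=bool)
    elev = np.asarray(elev, dtype=float)
    violations = 0
    # First tuple: horizontally adjacent pairs; second tuple: vertically adjacent pairs.
    for fa, fb, ea, eb in (
        (flood[:, :-1], flood[:, 1:], elev[:, :-1], elev[:, 1:]),
        (flood[:-1, :], flood[1:, :], elev[:-1, :], elev[1:, :]),
    ):
        violations += np.sum(fa & ~fb & (eb < ea))  # a flooded, lower b dry
        violations += np.sum(fb & ~fa & (ea < eb))  # b flooded, lower a dry
    return int(violations)

# Elevation falls from left to right; water on the high cell only is inconsistent.
print(count_gravity_violations([[1, 0, 0]], [[3.0, 2.0, 1.0]]))
```

A differentiable loss would presumably replace the hard labels with predicted probabilities; that design is left to the paper itself.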

arXiv:2404.12976 [pdf, other]

Insights from the Gaussian Processes Method for the FRB-associated X-ray Burst of SGR 1935+2154

Authors: Ruijing Tang, Dahai Yan, Haiyun Zhang, Qingchang Zhao, Lian Tao, Chengkui Li, Mingyu Ge, Xiaobo Li, Qianqing Yin, Ce Cai

Abstract: The Gaussian processes method is employed to analyze the light curves of bursts detected by Insight-HXMT, NICER, and GECAM from SGR 1935+2154 between 2020 and 2022. It is found that a stochastically driven damped simple harmonic oscillator (SHO) is necessary to capture the characteristics of the X-ray bursts. Variability timescales of the X-ray bursts, corresponding to the broken frequencies in the SHO power spectral densities (PSDs), are extracted. In particular, a high broken frequency of 35 Hz where the index of the SHO PSD changes from -4 to -2 is constrained by the HXMT-HE burst associated with FRB 200428. It is suggested that the corresponding timescale of 0.03 s could be the retarding timescale of the system driven by some energy release, and the production of the HE photon should be quasi-simultaneous with the response. The other special event is a NICER burst with a retarding timescale of 1/39 Hz (0.02 s). In the normal X-ray bursts, no retarding timescale is constrained; a long relax/equilibrium timescale (corresponding to a broken frequency of 1-10 Hz where the index of the SHO PSD changes from -4/-2 to 0) is obtained. The results indicate that the FRB-associated HXMT-HE X-ray burst could be produced immediately when the system is responding to the energy disturbance, far before the equilibrium state.

Submitted 19 June, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

Comments: 13 pages, 17 figures, 1 table

MSC Class: 85-02

arXiv:2404.12850 [pdf, other]

CaBaFL: Asynchronous Federated Learning via Hierarchical Cache and Feature Balance

Authors: Zeke Xia, Ming Hu, Dengke Yan, Xiaofei Xie, Tianlin Li, Anran Li, Junlong Zhou, Mingsong Chen

Abstract: Federated Learning (FL) as a promising distributed machine learning paradigm has been widely adopted in Artificial Intelligence of Things (AIoT) applications. However, the efficiency and inference capability of FL are seriously limited due to the presence of stragglers and data imbalance across massive AIoT devices, respectively. To address the above challenges, we present a novel asynchronous FL approach named CaBaFL, which includes a hierarchical Cache-based aggregation mechanism and a feature Balance-guided device selection strategy. CaBaFL maintains multiple intermediate models simultaneously for local training. The hierarchical cache-based aggregation mechanism enables each intermediate model to be trained on multiple devices to align the training time and mitigate the straggler issue. Specifically, each intermediate model is stored in a low-level cache for local training and, when it has been trained by sufficient local devices, it is stored in a high-level cache for aggregation. To address the problem of imbalanced data, the feature balance-guided device selection strategy in CaBaFL adopts the activation distribution as a metric, which enables each intermediate model to be trained across devices with totally balanced data distributions before aggregation. Experimental results show that compared with the state-of-the-art FL methods, CaBaFL achieves up to 9.26X training acceleration and 19.71% accuracy improvements.

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.12846 [pdf, other]

KoReA-SFL: Knowledge Replay-based Split Federated Learning Against Catastrophic Forgetting

Authors: Zeke Xia, Ming Hu, Dengke Yan, Ruixuan Liu, Anran Li, Xiaofei Xie, Mingsong Chen

Abstract: Although Split Federated Learning (SFL) is good at enabling knowledge sharing among resource-constrained clients, it suffers from the problem of low training accuracy due to the neglect of data heterogeneity and catastrophic forgetting. To address this issue, we propose a novel SFL approach named KoReA-SFL, which adopts a multi-model aggregation mechanism to alleviate gradient divergence caused by heterogeneous data and a knowledge replay strategy to deal with catastrophic forgetting. Specifically, in KoReA-SFL cloud servers (i.e., fed server and main server) maintain multiple branch model portions rather than a global portion for local training and an aggregated master-model portion for knowledge sharing among branch portions. To avoid catastrophic forgetting, the main server of KoReA-SFL selects multiple assistant devices for knowledge replay according to the training data distribution of each server-side branch-model portion. Experimental results obtained from non-IID and IID scenarios demonstrate that KoReA-SFL significantly outperforms conventional SFL methods (by up to 23.25% test accuracy improvement).

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.08409 [pdf, other]

doi 10.1093/mnrasl/slae032

Evidence of the gamma-ray counterpart from nova FM Cir with Fermi-LAT

Authors: H. H. Wang, H. D. Yan, L. C. -C. Lin, J. Takata, P. -H. T. Tam

Abstract: We report the analysis results of X-ray and gamma-ray data of the nova FM Cir taken by Swift and Fermi-LAT. The gamma-ray emission from FM Cir can be identified with a significance level of 3 sigma within 40 days after the nova eruption (2018 January 19) when we bin the light curve per day. The significance can further exceed the 4 sigma confidence level if we bin the light curve over longer intervals (i.e., 20 days). The gamma-ray counterpart could be identified with a Test Statistic (TS) above 4 until 180 days after the eruption. The duration of the gamma-ray detection was longer than those reported in the previous studies of the other novae detected in the GeV range. The significant X-ray emission was observed after the gamma-ray flux level fell below the sensitivity of Fermi-LAT. The hardness ratio of the X-ray emission decreased rapidly with time, and the spectra were dominated by blackbody radiation from the hot white dwarf. Except for the longer duration of the gamma-ray emission, the multi-wavelength properties of FM Cir closely resemble those of other novae detected in the GeV range.

Submitted 14 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

Comments: 6 pages, 7 figures, accepted for publication in Monthly Notices of the Royal Astronomical Society Letters

Journal ref: 2024 April 12

arXiv:2404.04801 [pdf, ps, other]

doi 10.1007/s41605-024-00467-8

LHAASO-KM2A detector simulation using Geant4

Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (254 additional authors not shown)

Abstract: KM2A is one of the main sub-arrays of LHAASO, working on gamma ray astronomy and cosmic ray physics at energies above 10 TeV. Detector simulation is an important foundation for estimating detector performance and data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with large altitude difference (30 m) and huge coverage (1.3 km^2). In this paper, the design of the KM2A simulation code G4KM2A based on Geant4 is introduced. The process of G4KM2A is optimized mainly in memory consumption to avoid memory overflow. Some simplifications are used to significantly speed up the execution of G4KM2A. The running time is reduced by at least 30 times compared to full detector simulation. The particle distributions and the core/angle resolution comparison between simulation and experimental data of the full KM2A array are also presented, which show good agreement.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.04545 [pdf, other]

TCAN: Text-oriented Cross Attention Network for Multimodal Sentiment Analysis

Authors: Ming Zhou, Weize Quan, Ziqi Zhou, Kai Wang, Tong Wang, Dong-Ming Yan

Abstract: Multimodal Sentiment Analysis (MSA) endeavors to understand human sentiment by leveraging language, visual, and acoustic modalities. Despite the remarkable performance exhibited by previous MSA approaches, the presence of inherent multimodal heterogeneities poses a challenge, with the contribution of different modalities varying considerably. Past research predominantly focused on improving representation learning techniques and feature fusion strategies. However, many of these efforts overlooked the variation in semantic richness among different modalities, treating each modality uniformly. This approach may lead to underestimating the significance of strong modalities while overemphasizing the importance of weak ones. Motivated by these insights, we introduce a Text-oriented Cross-Attention Network (TCAN), emphasizing the predominant role of the text modality in MSA. Specifically, for each multimodal sample, by taking unaligned sequences of the three modalities as inputs, we initially allocate the extracted unimodal features into a visual-text and an acoustic-text pair. Subsequently, we implement self-attention on the text modality and apply text-queried cross-attention to the visual and acoustic modalities. To mitigate the influence of noise signals and redundant features, we incorporate a gated control mechanism into the framework. Additionally, we introduce unimodal joint learning to gain a deeper understanding of homogeneous emotional tendencies across diverse modalities through backpropagation. Experimental results demonstrate that TCAN consistently outperforms state-of-the-art MSA methods on two datasets (CMU-MOSI and CMU-MOSEI).

Submitted 6 April, 2024; originally announced April 2024.

arXiv:2404.02065 [pdf, other]

doi 10.1109/TMM.2024.3374594

Multi-Level Label Correction by Distilling Proximate Patterns for Semi-supervised Semantic Segmentation

Authors: Hui Xiao, Yuting Hong, Li Dong, Diqun Yan, Jiayan Zhuang, Junjie Xiong, Dongtai Liang, Chengbin Peng

Abstract: Semi-supervised semantic segmentation relieves the reliance on large-scale labeled data by leveraging unlabeled data. Recent semi-supervised semantic segmentation approaches mainly resort to pseudo-labeling methods to exploit unlabeled data. However, unreliable pseudo-labeling can undermine the semi-supervision processes. In this paper, we propose an algorithm called Multi-Level Label Correction (MLLC), which aims to use graph neural networks to capture structural relationships in Semantic-Level Graphs (SLGs) and Class-Level Graphs (CLGs) to rectify erroneous pseudo-labels. Specifically, SLGs represent semantic affinities between pairs of pixel features, and CLGs describe classification consistencies between pairs of pixel labels. With the support of proximate pattern information from graphs, MLLC can rectify incorrectly predicted pseudo-labels and can facilitate discriminative feature representations. We design an end-to-end network to train and perform this effective label correction mechanism. Experiments demonstrate that MLLC can significantly improve supervised baselines and outperform state-of-the-art approaches in different scenarios on Cityscapes and PASCAL VOC 2012 datasets. Specifically, MLLC improves the supervised baseline by at least 5% and 2% with DeepLabV2 and DeepLabV3+ respectively under different partition protocols.

Submitted 9 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: 12 pages, 8 figures. IEEE Transactions on Multimedia, 2024

arXiv:2403.19067 [pdf, other]

Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach

Authors: Wei Dong, Xing Zhang, Bihui Chen, Dawei Yan, Zhijun Lin, Qingsen Yan, Peng Wang, Yang Yang

Abstract: Parameter-efficient fine-tuning for pre-trained Vision Transformers aims to adeptly tailor a model to downstream tasks by learning a minimal set of new adaptation parameters while preserving the frozen majority of pre-trained parameters. Striking a balance between retaining the generalizable representation capacity of the pre-trained model and acquiring task-specific features poses a key challenge. Currently, there is a lack of focus on guiding this delicate trade-off. In this study, we approach the problem from the perspective of Singular Value Decomposition (SVD) of pre-trained parameter matrices, providing insights into the tuning dynamics of existing methods. Building upon this understanding, we propose a Residual-based Low-Rank Rescaling (RLRR) fine-tuning strategy. This strategy not only enhances flexibility in parameter tuning but also ensures that new parameters do not deviate excessively from the pre-trained model through a residual design. Extensive experiments demonstrate that our method achieves competitive performance across various downstream image classification tasks, all while maintaining a comparable number of new parameters. We believe this work takes a step forward in offering a unified perspective for interpreting existing methods and serves as motivation for the development of new approaches that move closer to effectively considering the crucial trade-off mentioned above. Our code is available at https://github.com/zstarN70/RLRR.git.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.17325 [pdf, other]

A possibly solar metallicity atmosphere escaping from HAT-P-32b revealed by H$α$ and He absorption

Authors: Dongdong Yan, Jianheng Guo, Kwang-il Seon, Manuel López-Puertas, Stefan Czesla, Manuel Lampón

Abstract: This paper presents a hydrodynamic simulation that couples detailed non-local thermodynamic equilibrium (NLTE) calculations of the hydrogen and helium level populations to model the H$α$ and He 10830 transmission spectra of the hot Jupiter HAT-P-32b. A Monte Carlo simulation is applied to calculate the number of Ly$α$ resonance scatterings, which is the main process for populating H(2). In the examined parameter space, only the models with H/He $\geq$ 99.5/0.5, $(0.5 \sim 3.0)$ times the fiducial value of $F_{\rm XUV}$, $β_m = 0.16\sim 0.3$, can explain the H$α$ and He 10830 lines simultaneously. We find a mass-loss rate of $\sim (1.0\sim 3.1) \times 10^{13}$ g s$^{-1}$, consistent with previous studies. Moreover, we find that the stellar Ly$α$ flux should be as high as $4 \times 10^{5}$ erg cm$^{-2}$ s$^{-1}$, indicating high stellar activity during the observation epoch of the two absorption lines. Despite the fact that the metallicity in the lower atmosphere of HAT-P-32b may be super-solar, our simulations tentatively suggest it is close to solar in the upper atmosphere. The difference in metallicity between the lower and upper atmospheres is essential for future atmospheric characterisations.

Submitted 25 March, 2024; originally announced March 2024.

Comments: Accepted for publication in Astronomy & Astrophysics

arXiv:2403.10979 [pdf, other]

Long-term double synchronization in close-in gas giant planets

Authors: Shuaishuai Guo, Jianheng Guo, Jie Su, Dongdong Yan

Abstract: Hot Jupiters, orbiting their host stars at extremely close distances, undergo tidal evolution, with some being engulfed by their stars due to angular momentum exchanges induced by tidal forces. However, achieving double synchronization can prolong their survival. Using the MESA stellar evolution code, combined with the magnetic braking model of Matt et al. (2015), we calculate 25,000 models with different metallicity and study how to attain the conditions that trigger the long-term double synchronization. Our results indicate that massive planets orbiting stars with lower convective turnover time more easily achieve long-term double synchronization. The rotation angular velocity at the equilibrium point ($Ω_{\mathrm{sta}}$) is almost equal to the orbital angular velocity of the planet ($\mathrm{n}$) for the majority of the main sequence lifetime if a system has undergone a long-term double synchronization, regardless of its state at this moment. We further compared our results with known parameters of giant planetary systems and found that those systems with larger planetary masses and lower convective turnover time seem to be less sensitive to changes in the tidal quality factor $Q'_{_*}$. We suggest that for systems that fall on the state of $Ω_{\mathrm{sta}} \approx n$, regardless of their current state, the synchronization will persist for a long time if orbital synchronization occurs at any stage of their evolution. Our results can be applied to estimate whether a system has experienced long-term double synchronization in the past or may experience it in the future.

Submitted 16 March, 2024; originally announced March 2024.

arXiv:2403.10840 [pdf, other]

MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field

Authors: Dongyu Yan, Guanyu Huang, Fengyu Quan, Haoyao Chen

Abstract: Panoramic observation using fisheye cameras is significant in robot perception, reconstruction, and remote operation. However, panoramic images synthesized by traditional methods lack depth information and can only provide three degrees-of-freedom (3DoF) rotation rendering in virtual reality applications. To fully preserve and exploit the parallax information within the original fisheye cameras, we introduce MSI-NeRF, which combines deep learning omnidirectional depth estimation and novel view rendering. We first construct a multi-sphere image as a cost volume through feature extraction and warping of the input images. It is then processed by geometry and appearance decoders, respectively. Unlike methods that regress depth maps directly, we further build an implicit radiance field using spatial points and interpolated 3D feature vectors as input. In this way, we can simultaneously realize omnidirectional depth estimation and 6DoF view synthesis. Our method is trained in a semi-self-supervised manner. It does not require target view images and only uses depth data for supervision. Our network has the generalization ability to reconstruct unknown scenes efficiently using only four images. Experimental results show that our method outperforms existing methods in depth estimation and novel view synthesis tasks.

Submitted 16 March, 2024; originally announced March 2024.

Comments: 8 pages, 7 figures, Submitted to IEEE/RSJ International Conference on Intelligent Robots and Systems 2024

arXiv:2403.10010 [pdf, other]

doi 10.1103/PhysRevLett.132.131002

Measurements of All-Particle Energy Spectrum and Mean Logarithmic Mass of Cosmic Rays from 0.3 to 30 PeV with LHAASO-KM2A

Authors: The LHAASO Collaboration, Zhen Cao, F. Aharonian, Q. An, A. Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen , et al. (256 additional authors not shown)

Abstract: We present the measurements of all-particle energy spectrum and mean logarithmic mass of cosmic rays in the energy range of 0.3-30 PeV using data collected from LHAASO-KM2A between September 2021 and December 2022, which is based on a nearly composition-independent energy reconstruction method, achieving unprecedented accuracy. Our analysis reveals the position of the knee at $3.67 \pm 0.05 \pm 0.15$ PeV. Below the knee, the spectral index is found to be -$2.7413 \pm 0.0004 \pm 0.0050$, while above the knee, it is -$3.128 \pm 0.005 \pm 0.027$, with the sharpness of the transition measured with a statistical error of 2%. The mean logarithmic mass of cosmic rays is almost heavier than helium in the whole measured energy range. It decreases from 1.7 at 0.3 PeV to 1.3 at 3 PeV, representing a 24% decline following a power law with an index of -$0.1200 \pm 0.0003 \pm 0.0341$. This is equivalent to an increase in abundance of light components. Above the knee, the mean logarithmic mass exhibits a power law trend towards heavier components, which is the reverse of the behavior observed in the all-particle energy spectrum. Additionally, the knee position and the change in power-law index are approximately the same. These findings suggest that the knee observed in the all-particle spectrum corresponds to the knee of the light component, rather than the medium-heavy components.

Submitted 26 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

Comments: 8 pages, 3 figures

Journal ref: Physical Review Letters 132, 131002 (2024)
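
The numbers quoted for the mean logarithmic mass below the knee can be cross-checked with a few lines of arithmetic. The sketch below assumes a pure power law in energy anchored at the quoted value of 1.7 at 0.3 PeV, uses only the central value of the index (-0.12), and ignores the quoted uncertainties; the function name `power_law_value` is a hypothetical helper, not part of any LHAASO software.

```python
def power_law_value(value_at_ref, e_ref, e, index):
    """Evaluate value_at_ref * (e / e_ref) ** index."""
    return value_at_ref * (e / e_ref) ** index

# Quoted in the abstract: mean logarithmic mass 1.7 at 0.3 PeV, index -0.12.
ln_a_ref = 1.7
ln_a_3pev = power_law_value(ln_a_ref, e_ref=0.3, e=3.0, index=-0.12)
decline = 1.0 - ln_a_3pev / ln_a_ref

print(f"mean ln A at 3 PeV: {ln_a_3pev:.2f}")
print(f"fractional decline: {decline:.0%}")
```

Under these assumptions the extrapolated value is about 1.29 and the decline about 24%, consistent with the "1.3 at 3 PeV" and "24% decline" stated in the abstract.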

arXiv:2403.07637 [pdf]

Discovery of a Magnetic Topological Semimetal Eu$_3$In$_2$As$_4$ with a Single Pair of Weyl Points

Authors: Ke Jia, Jingyu Yao, Xiaobo He, Yupeng Li, Junze Deng, Ming Yang, Junfeng Wang, Zengwei Zhu, Cuixiang Wang, Dayu Yan, Hai L. Feng, Jie Shen, Yongkang Luo, Zhijun Wang, Youguo Shi

Abstract: Magnetic Weyl semimetal (MWS) is a unique topological state with open surface Fermi arc states and other exotic transport phenomena. However, most reported MWSs show multiple pairs of Weyl points and complicated Fermi surfaces, which increases the difficulty of the investigation into the intrinsic chiral transport property. In this work, we successfully synthesized a soft magnetic Weyl semimetal Eu$_3$In$_2$As$_4$ with a single pair of Weyl points under magnetic fields. The Shubnikov-de Haas (SdH) oscillation with a single frequency, as well as a linear Hall resistance with the same carrier density, is observed up to 50 Tesla, indicating a single pair of Weyl points around the Fermi level with a massless fermion ($m^* = 0.121 m_0$, $π$ Berry phase). Such a single pair of Weyl points is further confirmed by the density functional theory calculations. The magnetic ordering and band topology can be easily tuned by the external magnetic field. The field-induced MWS Eu$_3$In$_2$As$_4$ with a single pair of Weyl points is a good platform to detect chiral transport properties, including possible quantum anomalous Hall effect.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2402.18833 [pdf, ps, other]

doi 10.1103/PhysRevMaterials.7.094004

Layer-dependent Raman spectroscopy of ultrathin Ta$_2$Pd$_3$Te$_5$

Authors: Zhenyu Sun, Zhaopeng Guo, Dayu Yan, Peng Cheng, Lan Chen, Youguo Shi, Yuan Huang, Zhijun Wang, Kehui Wu, Baojie Feng

Abstract: Two-dimensional topological insulators (2DTIs) or quantum spin Hall insulators are attracting increasing attention due to their potential applications in next-generation spintronic devices. Despite their promising prospects, realizable 2DTIs are still limited. Recently, Ta$_2$Pd$_3$Te$_5$, a semiconducting van der Waals material, has shown spectroscopic evidence of quantum spin Hall states. However, controlled preparation of few- to monolayer samples, a crucial step in realizing quantum spin Hall devices, has not yet been achieved. In this work, we fabricated few- to monolayer Ta$_2$Pd$_3$Te$_5$ and performed systematic thickness- and temperature-dependent Raman spectroscopy measurements. Our results demonstrate that Raman spectra can provide valuable information to determine the thickness of Ta$_2$Pd$_3$Te$_5$ thin flakes. Moreover, our angle-resolved polarized Raman (ARPR) spectroscopy measurements show that the intensities of the Raman peaks are strongly anisotropic due to the quasi-one-dimensional atomic structure, providing a straightforward method to determine its crystalline orientation. Our findings may stimulate further efforts to realize quantum devices based on few- or monolayer Ta$_2$Pd$_3$Te$_5$.

Submitted 28 February, 2024; originally announced February 2024.

Journal ref: Phys. Rev. Materials 7, 094004 (2023)

arXiv:2402.18294 [pdf, other]

Whole-body Humanoid Robot Locomotion with Human Reference

Authors: Qiang Zhang, Peter Cui, David Yan, Jingkai Sun, Yiqun Duan, Arthur Zhang, Renjing Xu

Abstract: Recently, humanoid robots have made significant advances in their ability to perform challenging tasks due to the deployment of Reinforcement Learning (RL). However, the inherent complexity of humanoid robots, including the difficulty of designing complicated reward functions and training entire sophisticated systems, still poses a notable challenge. To conquer these challenges, after many iterations and in-depth investigations, we have meticulously developed a full-size humanoid robot, "Adam", whose innovative structural design greatly improves the efficiency and effectiveness of the imitation learning process. In addition, we have developed a novel imitation learning framework based on an adversarial motion prior, which applies not only to Adam but also to humanoid robots in general. Using the framework, Adam can exhibit unprecedented human-like characteristics in locomotion tasks. Our experimental results demonstrate that the proposed framework enables Adam to achieve human-comparable performance in complex locomotion tasks, marking the first time that human locomotion data has been used for imitation learning in a full-size humanoid robot.

Submitted 1 March, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

Comments: 7 pages, 7 figures

arXiv:2402.16051 [pdf, other]

Two-body hadronic weak decays of bottomed hadrons

Authors: Ying Zhang, Guangzhao He, Quanxing Ye, Da-Cheng Yan, Jun Hua, Qian Wang

Abstract: The structure of light diquarks plays a crucial role in the formation of exotic hadrons beyond the conventional quark model, especially in their line shapes of bottomed hadron decays. We study the two-body hadronic weak decays of bottomed baryons and bottomed mesons to probe the light diquark structure and pin down the quark-quark correlations in the diquark picture. We find that the light diquark does not favor a compact structure. For instance, the isoscalar diquark $[ud]$ in $Λ_{b}^{0}$ can be easily split and rearranged to form $Σ_{c}^{(*)}\bar{D}^{(*)}$ via the color-suppressed transition. This provides a hint that the hidden charm pentaquark states produced in $Λ^0_b$ decays could be the $Σ_{c}^{(*)}\bar{D}^{(*)}$ hadronic molecular candidates. This quantitative study resolves the apparent conflicts between the production mechanism and molecular nature of these $P_c$ states observed in experiment.

Submitted 25 February, 2024; originally announced February 2024.

Comments: Accepted by Chinese Physics Letters

arXiv:2402.13008 [pdf, other]

Efficient Enumeration of Large Maximal k-Plexes

Authors: Qihao Cheng, Da Yan, Tianhao Wu, Lyuheng Yuan, Ji Cheng, Zhongyi Huang, Yang Zhou

Abstract: Finding cohesive subgraphs in a large graph has many important applications, such as community detection and biological network analysis. A clique is often too strict a cohesive structure, since communities or biological modules rarely form as cliques for various reasons such as data noise. Therefore, the $k$-plex is introduced as a popular clique relaxation, which is a graph where every vertex is adjacent to all but at most $k$ vertices. In this paper, we propose a fast branch-and-bound algorithm as well as its task-based parallel version to enumerate all maximal $k$-plexes with at least $q$ vertices. Our algorithm adopts an effective search space partitioning approach that provides a lower time complexity, a new pivot vertex selection method that reduces candidate vertex size, an effective upper-bounding technique to prune useless branches, and three novel pruning techniques by vertex pairs. Our parallel algorithm uses a timeout mechanism to eliminate straggler tasks, and maximizes cache locality while ensuring load balancing. Extensive experiments show that compared with the state-of-the-art algorithms, our sequential and parallel algorithms enumerate large maximal $k$-plexes with up to $5 \times$ and $18.9 \times$ speedup, respectively. Ablation results also demonstrate that our pruning techniques bring up to $7 \times$ speedup compared with our basic algorithm.

Submitted 10 June, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

Comments: Accepted by EDBT2025. Camera-ready version
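
The $k$-plex definition quoted in the abstract above can be illustrated with a small membership check. This is only a sketch of the definition, not the paper's enumeration algorithm. It assumes the common convention that a vertex counts itself among the at most $k$ vertices it is not adjacent to, so each vertex needs at least $|S| - k$ neighbours inside the set $S$. The function name `is_k_plex` and the example graph are hypothetical.

```python
def is_k_plex(adj, vertices, k):
    """Return True if `vertices` induces a k-plex in the graph `adj`.

    `adj` maps each vertex to the set of its neighbours (undirected graph).
    """
    s = set(vertices)
    # Each vertex may miss at most k vertices of s, counting itself
    # (assumed convention), so it needs at least |s| - k neighbours in s.
    return all(len(adj[v] & s) >= len(s) - k for v in s)


# Hypothetical example graph: the 4-cycle 0-1-2-3-0.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_k_plex(cycle, {0, 1, 2, 3}, 1))  # False: a 1-plex is a clique
print(is_k_plex(cycle, {0, 1, 2, 3}, 2))  # True: each vertex misses itself and one other
```

Under this convention a clique is exactly a 1-plex, which is why the $k$-plex is described as a clique relaxation.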

arXiv:2402.10821 [pdf, other]

Training Class-Imbalanced Diffusion Model Via Overlap Optimization

Authors: Divin Yan, Lu Qi, Vincent Tao Hu, Ming-Hsuan Yang, Meng Tang

Abstract: Diffusion models have made significant advances recently in high-quality image synthesis and related tasks. However, diffusion models trained on real-world datasets, which often follow long-tailed distributions, yield inferior fidelity for tail classes. Deep generative models, including diffusion models, are biased towards classes with abundant training images. To address the observed appearance overlap between synthesized images of rare classes and tail classes, we propose a method based on contrastive learning to minimize the overlap between distributions of synthetic images for different classes. We show variants of our probabilistic contrastive learning method can be applied to any class conditional diffusion model. We show significant improvement in image synthesis using our loss for multiple datasets with long-tailed distribution. Extensive experimental results demonstrate that the proposed method can effectively handle imbalanced data for diffusion-based generation and classification models. Our code and datasets will be publicly available at https://github.com/yanliang3612/DiffROP.

Submitted 16 February, 2024; originally announced February 2024.

Comments: Technical Report

arXiv:2402.10184 [pdf, other]

Reward Generalization in RLHF: A Topological Perspective

Authors: Tianyi Qiu, Fanzhi Zeng, Jiaming Ji, Dong Yan, Kaile Wang, Jiayi Zhou, Yang Han, Josef Dai, Xuehai Pan, Yaodong Yang

Abstract: Existing alignment methods share a common topology of information flow, where reward information is collected from humans, modeled with preference learning, and used to tune language models. However, this shared topology has not been systematically characterized, nor have its alternatives been thoroughly explored, leaving the problems of low data efficiency and unreliable generalization unaddressed. As a solution, we introduce a theoretical framework for investigating reward generalization in reinforcement learning from human feedback (RLHF), focusing on the topology of information flow at both macro and micro levels. At the macro level, we portray the RLHF information flow as an autoencoding process over behavior distributions, formalizing the RLHF objective of distributional consistency between human preference and model behavior. At the micro level, we present induced Bayesian networks as a theory of reward generalization in RLHF, introducing fine-grained dataset topologies into generalization bounds. Combining analysis on both levels, we propose reward modeling from tree-structured preference information. It is shown to reduce reward uncertainty by up to $Θ(\log n/\log\log n)$ times compared to baselines, where $n$ is the dataset size. Validation on three NLP tasks shows that our tree-based reward model achieves an average win rate of 65% against baseline methods, thus improving reward generalization for free via topology design.

Submitted 16 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

arXiv:2402.06088 [pdf, other]

Animated Stickers: Bringing Stickers to Life with Video Diffusion

Authors: David Yan, Winnie Zhang, Luxin Zhang, Anmol Kalia, Dingkang Wang, Ankit Ramchandani, Miao Liu, Albert Pumarola, Edgar Schoenfeld, Elliot Blanchard, Krishna Narni, Yaqiao Luo, Lawrence Chen, Guan Pang, Ali Thabet, Peter Vajda, Amy Bearman, Licheng Yu

Abstract: We introduce animated stickers, a video diffusion model which generates an animation conditioned on a text prompt and static sticker image. Our model is built on top of the state-of-the-art Emu text-to-image model, with the addition of temporal layers to model motion. Due to the domain gap, i.e. differences in visual and motion style, a model which performed well on generating natural videos can no longer generate vivid videos when applied to stickers. To bridge this gap, we employ a two-stage finetuning pipeline: first with weakly in-domain data, followed by a human-in-the-loop (HITL) strategy which we term ensemble-of-teachers. It distills the best qualities of multiple teachers into a smaller student model. We show that this strategy allows us to specifically target improvements to motion quality while maintaining the style from the static image. With inference optimizations, our model is able to generate an eight-frame video with high-quality, interesting, and relevant motion in under one second.

Submitted 8 February, 2024; originally announced February 2024.

arXiv:2402.01441 [pdf, ps, other]

Learning the Market: Sentiment-Based Ensemble Trading Agents

Authors: Andrew Ye, James Xu, Yi Wang, Yifan Yu, Daniel Yan, Ryan Chen, Bosheng Dong, Vipin Chaudhary, Shuai Xu

Abstract: We propose the integration of sentiment analysis and deep-reinforcement learning ensemble algorithms for stock trading, and design a strategy capable of dynamically altering its employed agent given concurrent market sentiment. In particular, we create a simple-yet-effective method for extracting news sentiment and combine this with general improvements upon existing works, resulting in automated trading agents that effectively consider both qualitative market factors and quantitative stock data. We show that our approach results in a strategy that is profitable, robust, and risk-minimal -- outperforming the traditional ensemble strategy as well as single agent algorithms and market metrics. Our findings determine that the conventional practice of switching ensemble agents every fixed number of months is sub-optimal, and that a dynamic sentiment-based framework greatly unlocks additional performance within these agents. Furthermore, as we have designed our algorithm with simplicity and efficiency in mind, we hypothesize that the transition of our method from historical evaluation towards real-time trading with live data should be relatively simple.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2401.17792 [pdf, other]

Dimensional Analysis Theory and Molecular Dynamics Simulation of Polypropylene Melt Flow during Injection Molding Process

Authors: Jinrong Zhang, Dadong Yan, Li Peng, Xianbo Huang

Abstract: Flow marks are common surface defects that occur in injection-molded products. Their formation may be related to the flow process of the melt in the mold. Through dimensional analysis, we have discovered that the geometric shape of the flow field is controlled by specific dimensionless quantities. These quantities can be summarized as follows: geometric dimensionless quantities related to the shape of the mold, material dimensionless quantities related to the melt and mold materials, and physical dimensionless quantities related to the flow. When the geometric shape of the mold changes proportionally, with the melt and mold material fixed, and the initial temperature of the melt and mold fixed, the geometric shape of the flow field will be solely controlled by the Weissenberg number Wi. If Wi is kept constant, changing the injection speed, changing the relaxation time of the polypropylene melt, or scaling the mold will result in similar geometric shapes of the flow field. If the size of the mold is not changed, the geometric shape of the flow field will be the same. Since the dimensionless equation represents a similar system of all sizes, we verified the above conclusion through molecular dynamics simulations at a smaller scale. After further improvement of the micro simulation system, there is a possibility of visualizing the formation process of flow marks. This would greatly aid in the advancement of theory and the elimination of flow marks in production and experiments.
This work also illustrates that the methodology of dimensional analysis plus molecular dynamics simulation may be applied to a wider range of other systems, scaling down large systems and thus significantly reducing their computational effort.

Submitted 31 January, 2024; originally announced January 2024.

arXiv:2401.14995 [pdf, other]

A Thermal Kinetic Inductance Detectors Pixel Design for Cosmic Microwave Background Observations at 90/150 GHz bands

Authors: Ye Chai, Shibo Shu, Yongping Li, Jiamin Sun, Zhouhui Liu, Yu Xu, Daikang Yan, Zhengwei Li, Yang Liu, Yiwen Wang, Weijie Guo, Juexian Cao, Congzhan Liu

Abstract: The highly sensitive millimeter-wave telescope is an important tool for accurate measurement of Cosmic Microwave Background (CMB) radiation, and its core component is a detector array located in a cryogenic focal plane. The feasibility of utilizing thermal kinetic inductance detectors (TKIDs) for CMB observations has been demonstrated. We propose a pixel design of TKIDs for observing the CMB through the atmospheric windows in the 90/150 GHz bands. Assuming lossless dielectric, the coupling efficiency of a single pixel is around 90%. This pixel design will be utilized for future large-scale TKIDs array designs for CMB observations.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.03914 [pdf, other]

D3PRefiner: A Diffusion-based Denoise Method for 3D Human Pose Refinement

Authors: Danqi Yan, Qing Gao, Yuepeng Qian, Xinxing Chen, Chenglong Fu, Yuquan Leng

Abstract: Three-dimensional (3D) human pose estimation using a monocular camera has gained increasing attention due to its ease of implementation and the abundance of data available from daily life. However, owing to the inherent depth ambiguity in images, the accuracy of existing monocular camera-based 3D pose estimation methods remains unsatisfactory, and the estimated 3D poses usually include much noise. By observing the histogram of this noise, we find each dimension of the noise follows a certain distribution, which indicates the possibility for a neural network to learn the mapping between noisy poses and ground truth poses. In this work, in order to obtain more accurate 3D poses, a Diffusion-based 3D Pose Refiner (D3PRefiner) is proposed to refine the output of any existing 3D pose estimator. We first introduce a conditional multivariate Gaussian distribution to model the distribution of noisy 3D poses, using paired 2D poses and noisy 3D poses as conditions to achieve greater accuracy. Additionally, we leverage the architecture of current diffusion models to convert the distribution of noisy 3D poses into ground truth 3D poses. To evaluate the effectiveness of the proposed method, two state-of-the-art sequence-to-sequence 3D pose estimators are used as basic 3D pose estimation models, and the proposed method is evaluated on different types of 2D poses and different lengths of the input sequence.
Experimental results demonstrate the proposed architecture can significantly improve the performance of current sequence-to-sequence 3D pose estimators, with a reduction of at least 10.3% in the mean per joint position error (MPJPE) and at least 11.0% in the Procrustes MPJPE (P-MPJPE).

Submitted 8 January, 2024; originally announced January 2024.

arXiv:2401.03395 [pdf, other]

Deep Learning-based Image and Video Inpainting: A Survey

Authors: Weize Quan, Jiaxi Chen, Yanli Liu, Dong-Ming Yan, Peter Wonka

Abstract: Image and video inpainting is a classic problem in computer vision and computer graphics, aiming to fill in the plausible and realistic content in the missing areas of images and videos. With the advance of deep learning, this problem has achieved significant progress recently. The goal of this paper is to comprehensively review the deep learning-based methods for image and video inpainting. Specifically, we sort existing methods into different categories from the perspective of their high-level inpainting pipeline, present different deep learning architectures, including CNN, VAE, GAN, diffusion models, etc., and summarize techniques for module design. We review the training objectives and the common benchmark datasets. We present evaluation metrics for low-level pixel and high-level perceptional similarity, conduct a performance evaluation, and discuss the strengths and weaknesses of representative inpainting methods. We also discuss related real-world applications. Finally, we discuss open challenges and suggest potential future research directions.

Submitted 7 January, 2024; originally announced January 2024.

Comments: Accepted to IJCV

arXiv:2401.01456 [pdf, other]

ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text

Authors: Dingkun Yan, Liang Yuan, Erwin Wu, Yuma Nishioka, Issei Fujishiro, Suguru Saito

Abstract: Diffusion models have recently demonstrated their effectiveness in generating extremely high-quality images and are now utilized in a wide range of applications, including automatic sketch colorization. Although many methods have been developed for guided sketch colorization, there has been limited exploration of the potential conflicts between image prompts and sketch inputs, which can lead to severe deterioration in the results. Therefore, this paper exhaustively investigates reference-based sketch colorization models that aim to colorize sketch images using reference color images. We specifically investigate two critical aspects of reference-based diffusion models: the "distribution problem", which is a major shortcoming compared to text-based counterparts, and the capability in zero-shot sequential text-based manipulation. We introduce two variations of an image-guided latent diffusion model utilizing different image tokens from the pre-trained CLIP image encoder and propose corresponding manipulation methods to adjust their results sequentially using weighted text inputs. We conduct comprehensive evaluations of our models through qualitative and quantitative experiments as well as a user study.

Submitted 3 July, 2024; v1 submitted 2 January, 2024; originally announced January 2024.

arXiv:2312.14456 [pdf, other]

doi 10.1103/PhysRevX.14.011047

Spontaneous gap opening and potential excitonic states in an ideal Dirac semimetal Ta$_2$Pd$_3$Te$_5$

Authors: Peng Zhang, Yuyang Dong, Dayu Yan, Bei Jiang, Tao Yang, Jun Li, Zhaopeng Guo, Yong Huang, Bo Hao, Qing Li, Yupeng Li, Kifu Kurokawa, Rui Wang, Yuefeng Nie, Makoto Hashimoto, Donghui Lu, Wen-He Jiao, Jie Shen, Tian Qian, Zhijun Wang, Youguo Shi, Takeshi Kondo

Abstract: The opening of an energy gap in the electronic structure generally indicates the presence of interactions. In materials with low carrier density and short screening length, long-range Coulomb interaction favors the spontaneous formation of electron-hole pairs, so-called excitons, opening an excitonic gap at the Fermi level. Excitonic materials host unique phenomena associated with pair excitations. However, there is still no generally recognized single-crystal material with excitonic order, which is, therefore, awaited in condensed matter physics. Here, we show that excitonic states may exist in the quasi-one-dimensional material Ta$_2$Pd$_3$Te$_5$, which has an almost ideal Dirac-like band structure, with the Dirac point located exactly at the Fermi level. We find that an energy gap appears at 350 K, and it grows with decreasing temperature. The spontaneous gap opening is absent in a similar material Ta$_2$Ni$_3$Te$_5$. Intriguingly, the gap is destroyed by the potassium deposition on the crystal, likely due to extra-doped carriers. Furthermore, we observe a pair of in-gap flat bands, which is an analog of the impurity states in a superconducting gap. All these observations can be properly explained by an excitonic order, providing Ta$_2$Pd$_3$Te$_5$ as a new and promising candidate realizing excitonic states.

Submitted 15 March, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

Comments: 9 pages, 5 figures

Journal ref: Phys. Rev. X 14, 011047 (2024)

arXiv:2312.14455 [pdf, other]

doi 10.1103/PhysRevX.14.011046

Evidence for an Excitonic Insulator State in Ta$_2$Pd$_3$Te$_5$

Authors: Jierui Huang, Bei Jiang, Jingyu Yao, Dayu Yan, Xincheng Lei, Jiacheng Gao, Zhaopeng Guo, Feng Jin, Yupeng Li, Zhenyu Yuan, Congcong Chai, Haohao Sheng, Mojun Pan, Famin Chen, Junde Liu, Shunye Gao, Gexing Qu, Bo Liu, Zhicheng Jiang, Zhengtai Liu, Xiaoyan Ma, Shiming Zhou, Yaobo Huang, Chenxia Yun, Qingming Zhang , et al. (8 additional authors not shown)

Abstract: The excitonic insulator (EI) is an exotic ground state of narrow-gap semiconductors and semimetals arising from spontaneous condensation of electron-hole pairs bound by attractive Coulomb interaction. Despite research on EIs dating back half a century, their existence in real materials remains a subject of ongoing debate. In this study, through systematic experimental and theoretical investigations, we provide evidence for the existence of an EI ground state in a van der Waals compound Ta$_2$Pd$_3$Te$_5$. Density-functional-theory calculations suggest that it is a semimetal with a small band overlap, whereas various experiments exhibit an insulating ground state with a clear band gap. Upon incorporating electron-hole Coulomb interaction into our calculations, we obtain an EI phase where the electronic symmetry breaking opens a many-body gap. Angle-resolved photoemission spectroscopy measurements show that the band gap is closed with a significant change in the dispersions as the number of thermally excited charge carriers becomes sufficiently large in both equilibrium and nonequilibrium states. Structural measurements reveal a slight breaking of crystal symmetry with exceptionally small lattice distortion in the insulating state, which cannot account for the significant gap opening. Therefore, we attribute the insulating ground state with a gap opening in Ta$_2$Pd$_3$Te$_5$ to exciton condensation, where the coupling to the symmetry-breaking electronic state induces a subtle change in the crystal structure.

Submitted 14 March, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

Comments: 10 pages, 5 figures

Journal ref: Phys. Rev. X 14, 011046 (2024)

arXiv:2312.12851 [pdf, other]

A Configurable Ultra-Low Noise Current Source for Transition-Edge Sensor Characterization

Authors: N. Li, G. Liao, D. Yan, Y. Xu, Y. Zhang, Z. Liu, S. Yuan, Y. Zhang, H. Gao, Y. Li, Y. Gu, C. Liu, H. Li, Z. Li, X. Ren

Abstract: Transition-edge sensors (TESs) are sensitive devices for detecting photons from millimeter radiation to gamma rays. Their photon counting efficiency and collecting area benefit from large-array multiplexing schemes, and therefore the development of multiplexing readout systems has been an important topic in this field. Among the many multiplexing techniques, the time-division multiplexing (TDM) superconducting quantum interference device (SQUID) has been used most widely for TES readout. In this work, we design a Configurable Ultra-Low Noise Current Source (CLCS) for TES characterization and as a part of a whole TDM-TES bias control system. The CLCS is based on the feedback structure of ultra-low noise instrumentation amplifiers and a low-noise, high-resolution (20 bits) digital-to-analog converter (DAC). The CLCS has an ultra-high resolution of 10 nA in the 0 to 5 mA current output range, and can perform current-voltage (IV) sweep and bias-step tests to measure key TES parameters on board. The feedback structure of the CLCS also avoids the issue of impedance mismatch.

Submitted 2 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: 11 pages, 9 figures

arXiv:2312.09154 [pdf, other]

CMG-Net: Robust Normal Estimation for Point Clouds via Chamfer Normal Distance and Multi-scale Geometry

Authors: Yingrui Wu, Mingyang Zhao, Keqiang Li, Weize Quan, Tianqi Yu, Jianfeng Yang, Xiaohong Jia, Dong-Ming Yan

Abstract: This work presents an accurate and robust method for estimating normals from point clouds. In contrast to predecessor approaches that minimize the deviations between the annotated and the predicted normals directly, leading to direction inconsistency, we first propose a new metric termed Chamfer Normal Distance to address this issue. This not only mitigates the challenge but also facilitates network training and substantially enhances the network robustness against noise. Subsequently, we devise an innovative architecture that encompasses Multi-scale Local Feature Aggregation and Hierarchical Geometric Information Fusion. This design empowers the network to capture intricate geometric details more effectively and alleviate the ambiguity in scale selection. Extensive experiments demonstrate that our method achieves the state-of-the-art performance on both synthetic and real-world datasets, particularly in scenarios contaminated by noise. Our implementation is available at https://github.com/YingruiWoo/CMG-Net_Pytorch.

Submitted 14 December, 2023; originally announced December 2023.

Comments: Accepted by AAAI 2024

arXiv:2312.04883 [pdf, other]

Understanding Community Bias Amplification in Graph Representation Learning

Authors: Shengzhong Zhang, Wenjie Yang, Yimin Zhang, Hongwei Zhang, Divin Yan, Zengfeng Huang

Abstract: In this work, we discover a phenomenon of community bias amplification in graph representation learning, which refers to the exacerbation of performance bias between different classes by graph representation learning. We conduct an in-depth theoretical study of this phenomenon from a novel spectral perspective. Our analysis suggests that structural bias between communities results in varying local convergence speeds for node embeddings. This phenomenon leads to bias amplification in the classification results of downstream tasks. Based on the theoretical insights, we propose random graph coarsening, which is proved to be effective in dealing with the above issue. Finally, we propose a novel graph contrastive learning model called Random Graph Coarsening Contrastive Learning (RGCCL), which utilizes random coarsening as data augmentation and mitigates community bias by contrasting the coarsened graph with the original graph. Extensive experiments on various datasets demonstrate the advantage of our method when dealing with community bias amplification.

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2311.13841 [pdf, other]

Adversarial defense based on distribution transfer

Authors: Jiahao Chen, Diqun Yan, Li Dong

Abstract: The presence of adversarial examples poses a significant threat to deep learning models and their applications. Existing defense methods provide certain resilience against adversarial examples, but often suffer from decreased accuracy and generalization performance, making it challenging to achieve a trade-off between robustness and generalization. To address this, our paper interprets the adversarial example problem from the perspective of sample distribution and proposes a defense method based on distribution shift, leveraging the distribution transfer capability of a diffusion model for adversarial defense. The core idea is to exploit the discrepancy between normal and adversarial sample distributions to achieve adversarial defense using a pretrained diffusion model. Specifically, an adversarial sample undergoes a forward diffusion process, moving away from the source distribution, followed by a reverse process guided by the protected model (victim model) output to map it back to the normal distribution. Experimental evaluations on CIFAR10 and ImageNet30 datasets are conducted, comparing with adversarial training and input preprocessing methods. For infinite-norm attacks with 8/255 perturbation, accuracy rates of 78.1% and 83.5% are achieved, respectively. For 2-norm attacks with 128/255 perturbation, accuracy rates are 74.3% and 82.5%. Additional experiments considering perturbation amplitude, diffusion iterations, and adaptive attacks also validate the effectiveness of the proposed method.
Results demonstrate that even when the attacker has knowledge of the defense, the proposed distribution-based method effectively withstands adversarial examples. It fills the gaps of traditional approaches, restoring high-quality original samples and showcasing superior performance in model robustness and generalization.

Submitted 23 November, 2023; originally announced November 2023.

Comments: 27 pages

arXiv:2311.13163 [pdf, other]

Have Your Cake and Eat It Too: Toward Efficient and Accurate Split Federated Learning

Authors: Dengke Yan, Ming Hu, Zeke Xia, Yanxin Yang, Jun Xia, Xiaofei Xie, Mingsong Chen

Abstract: Due to its advantages in resource constraint scenarios, Split Federated Learning (SFL) is promising in AIoT systems. However, due to data heterogeneity and stragglers, SFL suffers from the challenges of low inference accuracy and low efficiency. To address these issues, this paper presents a novel SFL approach, named Sliding Split Federated Learning (S$^2$FL), which adopts an adaptive sliding model split strategy and a data balance-based training mechanism. By dynamically dispatching different model portions to AIoT devices according to their computing capability, S$^2$FL can alleviate the low training efficiency caused by stragglers. By combining features uploaded by devices with different data distributions to generate multiple larger batches with a uniform distribution for back-propagation, S$^2$FL can alleviate the performance degradation caused by data heterogeneity. Experimental results demonstrate that, compared to conventional SFL, S$^2$FL can achieve up to 16.5% inference accuracy improvement and 3.54X training acceleration.

Submitted 8 April, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

arXiv:2311.10794 [pdf, other]

Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression

Authors: Animesh Sinha, Bo Sun, Anmol Kalia, Arantxa Casanova, Elliot Blanchard, David Yan, Winnie Zhang, Tony Nelli, Jiahui Chen, Hardik Shah, Licheng Yu, Mitesh Kumar Singh, Ankit Ramchandani, Maziar Sanjabi, Sonal Gupta, Amy Bearman, Dhruv Mahajan

Abstract: We introduce Style Tailoring, a recipe to finetune Latent Diffusion Models (LDMs) in a distinct domain with high visual quality, prompt alignment and scene diversity. We choose sticker image generation as the target domain, as the images significantly differ from photorealistic samples typically generated by large-scale LDMs. We start with a competent text-to-image model, like Emu, and show that relying on prompt engineering with a photorealistic model to generate stickers leads to poor prompt alignment and scene diversity. To overcome these drawbacks, we first finetune Emu on millions of sticker-like images collected using weak supervision to elicit diversity. Next, we curate human-in-the-loop (HITL) Alignment and Style datasets from model generations, and finetune to improve prompt alignment and style alignment respectively. Sequential finetuning on these datasets poses a tradeoff between better style alignment and prompt alignment gains. To address this tradeoff, we propose a novel fine-tuning method called Style Tailoring, which jointly fits the content and style distribution and achieves best tradeoff. Evaluation results show our method improves visual quality by 14%, prompt alignment by 16.2% and scene diversity by 15.3%, compared to prompt engineering the base Emu model for stickers generation.

Submitted 16 November, 2023; originally announced November 2023.

Comments: 10 pages, 5 figures

Showing 1–50 of 424 results for author: Yan, D