doi 10.23977/tracam.2024.040107

A Study on Lampreys Population Based on Sex-Ratio-Related Growth-Balance Model

Authors: Zuhua Ji, Jiarui Chen, Zihang Wang

Abstract: Lampreys are one of the oldest species in the world, living longer than dinosaurs, which is related to the ability to change the sex ratio during their lifespan. In this paper, to understand how sex ratio and food quantity affect the population growth rate of lampreys, the researchers draw inspiration from the logistic model and establish a model called EcoSexChange (ESC), which results in a population initially increasing and then stabilizing, a reasonable outcome that may apply to other organisms with significant differences in consumption between sexes. Subsequently, this paper develops the Sex Ratio Adaptation Eco Impact (SRAEI) model based on the ESC model using the ABM algorithm to simulate how the population of lampreys, whose lives are divided into seven stages, grows and stabilizes. It then introduces a sudden disaster factor in the middle of the simulation, while also comparing lampreys that cannot adjust their sex ratio. The results of this paper are of great reference significance for people to analyze the population changes of lampreys in different living environments, and they are also easy to apply to other species with large differences between males and females.

Submitted 14 July, 2024; originally announced July 2024.

Journal ref: Transactions on Computational and Applied Mathematics. 2024 May 6;4(1):48-55
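
The abstract above says the ESC model draws on the logistic model and yields a population that first grows and then stabilizes. As a rough illustration of that qualitative behaviour only, the sketch below integrates the textbook logistic equation dN/dt = r N (1 - N/K) with forward Euler. The ESC and SRAEI equations are not given in the abstract, so the function name, the parameters r and K, and all numeric values here are hypothetical.

```python
def logistic_trajectory(n0, r, k, dt, steps):
    """Integrate dN/dt = r * N * (1 - N / K) with forward Euler.

    n0: initial population, r: intrinsic growth rate,
    k: carrying capacity, dt: time step, steps: number of steps.
    All values are illustrative assumptions, not taken from the paper.
    """
    values = [n0]
    n = n0
    for _ in range(steps):
        n = n + dt * r * n * (1.0 - n / k)
        values.append(n)
    return values


# Hypothetical parameters: start far below the carrying capacity.
trajectory = logistic_trajectory(n0=10.0, r=0.5, k=1000.0, dt=0.1, steps=400)
```

With these assumed values the population rises quickly at first and then levels off near the carrying capacity K, which is the "increase, then stabilize" shape the abstract describes.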

arXiv:2407.09486 [pdf, other]

ENOVA: Autoscaling towards Cost-effective and Stable Serverless LLM Serving

Authors: Tao Huang, Pengfei Chen, Kyoka Gong, Jocky Hawk, Zachary Bright, Wenxin Xie, Kecheng Huang, Zhi Ji

Abstract: With the increasing popularity of large language model (LLM) backend systems, it is common and necessary to deploy stable serverless serving of LLM on multi-GPU clusters with autoscaling. However, there exist challenges because the diversity and co-location of applications in multi-GPU clusters will lead to low service quality and GPU utilization. To address them, we build ENOVA, a deployment, monitoring and autoscaling service towards serverless LLM serving. ENOVA deconstructs the execution process of LLM service comprehensively, based on which ENOVA designs a configuration recommendation module for automatic deployment on any GPU clusters and a performance detection module for autoscaling. On top of them, ENOVA implements a deployment execution engine for multi-GPU cluster scheduling. The experiment results show that ENOVA significantly outperforms other state-of-the-art methods and is suitable for wide deployment in large online systems.

Submitted 17 May, 2024; originally announced July 2024.

arXiv:2407.08586 [pdf, other]

Centrality dependence of Lévy-stable two-pion Bose-Einstein correlations in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions

Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, C. Aidala, N. N. Ajitanand, Y. Akiba, R. Akimoto, H. Al-Ta'ani, J. Alexander, A. Angerami, K. Aoki, N. Apadula, Y. Aramaki, H. Asano, E. C. Aschenauer, E. T. Atomssa, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, B. Bannier, K. N. Barish, B. Bassalleck, S. Bathe, et al. (377 additional authors not shown)

Abstract: The PHENIX experiment measured the centrality dependence of two-pion Bose-Einstein correlation functions in $\sqrt{s_{_{NN}}}=200$ GeV Au$+$Au collisions at the Relativistic Heavy Ion Collider at Brookhaven National Laboratory. The data are well represented by Lévy-stable source distributions. The extracted source parameters are the correlation-strength parameter $λ$, the Lévy index of stability $α$, and the Lévy-scale parameter $R$ as a function of transverse mass $m_T$ and centrality. The $λ(m_T)$ parameter is constant at larger values of $m_T$, but decreases as $m_T$ decreases. The Lévy scale parameter $R(m_T)$ decreases with $m_T$ and exhibits proportionality to the length scale of the nuclear overlap region. The Lévy exponent $α(m_T)$ is independent of $m_T$ within uncertainties in each investigated centrality bin, but shows a clear centrality dependence. At all centralities, the Lévy exponent $α$ is significantly different from that of Gaussian ($α=2$) or Cauchy ($α=1$) source distributions. Comparisons to the predictions of Monte-Carlo simulations of resonance-decay chains show that in all but the most peripheral centrality class (50%-60%), the obtained results are inconsistent with the measurements, unless a significant reduction of the in-medium mass of the $η'$ meson is included. In each centrality class, the best value of the in-medium $η'$ mass is compared to the mass of the $η$ meson, as well as to several theoretical predictions that consider restoration of $U_A(1)$ symmetry in hot hadronic matter.

Submitted 11 July, 2024; originally announced July 2024.

Comments: 401 authors from 75 institutions, 20 pages, 15 figures, 2 tables. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html
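
To make the roles of $λ$, $α$, and $R$ in the abstract above concrete, the sketch below evaluates the idealized correlation function commonly associated with a symmetric Lévy-stable source, C(q) = 1 + λ exp(-(qR)^α). This is a simplified textbook form that ignores Coulomb and other final-state corrections; the abstract does not state the exact fit function used by the collaboration, so treat the function name and the dimensionless argument `q_times_r` as assumptions made for illustration.

```python
import math


def levy_correlation(q_times_r, lam, alpha):
    """Idealized two-particle correlation for a Levy-stable source.

    q_times_r: dimensionless product of relative momentum q and scale R
               (unit conversion is left to the caller; this is an assumption).
    lam:       correlation-strength parameter (lambda).
    alpha:     Levy index of stability; alpha = 2 is the Gaussian case,
               alpha = 1 the Cauchy case.
    """
    return 1.0 + lam * math.exp(-(abs(q_times_r) ** alpha))


# Compare the Gaussian and Cauchy special cases at the same q*R.
gaussian_value = levy_correlation(1.0, lam=1.0, alpha=2.0)
cauchy_value = levy_correlation(2.0, lam=1.0, alpha=1.0)
intercept = levy_correlation(0.0, lam=0.8, alpha=1.2)
```

In this form λ sets the intercept at q = 0, R sets the width in q, and α controls how sharply the correlation falls off, which is why a measured α that differs from both 1 and 2 rules out the Cauchy and Gaussian shapes.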

arXiv:2407.07999 [pdf, ps, other]

Fusion of Short-term and Long-term Attention for Video Mirror Detection

Authors: Mingchen Xu, Jing Wu, Yukun Lai, Ze Ji

Abstract: Techniques for detecting mirrors from static images have witnessed rapid growth in recent years. However, these methods detect mirrors from single input images. Detecting mirrors from video requires further consideration of temporal consistency between frames. We observe that humans can recognize mirror candidates, from just one or two frames, based on their appearance (e.g. shape, color). However, to ensure that the candidate is indeed a mirror (not a picture or a window), we often need to observe more frames for a global view. This observation motivates us to detect mirrors by fusing appearance features extracted from a short-term attention module and context information extracted from a long-term attention module. To evaluate the performance, we build a challenging benchmark dataset of 19,255 frames from 281 videos. Experimental results demonstrate that our method achieves state-of-the-art performance on the benchmark dataset.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.05993 [pdf, other]

Self-Prior Guided Mamba-UNet Networks for Medical Image Super-Resolution

Authors: Zexin Ji, Beiji Zou, Xiaoyan Kui, Pierre Vera, Su Ruan

Abstract: In this paper, we propose a self-prior guided Mamba-UNet network (SMamba-UNet) for medical image super-resolution. Existing methods are primarily based on convolutional neural networks (CNNs) or Transformers. CNN-based methods fail to capture long-range dependencies, while Transformer-based approaches face heavy calculation challenges due to their quadratic computational complexity. Recently, State Space Models (SSMs), especially Mamba, have emerged, capable of modeling long-range dependencies with linear computational complexity. Inspired by Mamba, our approach aims to learn the self-prior multi-scale contextual features under Mamba-UNet networks, which may help to super-resolve low-resolution medical images in an efficient way. Specifically, we obtain self-priors by perturbing the brightness inpainting of the input image during network training, which can learn detailed texture and brightness information that is beneficial for super-resolution. Furthermore, we combine Mamba with Unet network to mine global features at different levels. We also design an improved 2D-Selective-Scan (ISS2D) module to divide image features into different directional sequences to learn long-range dependencies in multiple directions, and adaptively fuse sequence information to enhance super-resolved feature representation. Both qualitative and quantitative experimental results demonstrate that our approach outperforms current state-of-the-art methods on two public medical datasets: the IXI and fastMRI.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.05969 [pdf, other]

Deform-Mamba Network for MRI Super-Resolution

Authors: Zexin Ji, Beiji Zou, Xiaoyan Kui, Pierre Vera, Su Ruan

Abstract: In this paper, we propose a new architecture, called Deform-Mamba, for MR image super-resolution. Unlike conventional CNN or Transformer-based super-resolution approaches which encounter challenges related to the local receptive field or heavy computational cost, our approach aims to effectively explore the local and global information of images. Specifically, we develop a Deform-Mamba encoder which is composed of two branches, modulated deform block and vision Mamba block. We also design a multi-view context module in the bottleneck layer to explore the multi-view contextual content. Thanks to the extracted features of the encoder, which include content-adaptive local and efficient global information, the vision Mamba decoder finally generates high-quality MR images. Moreover, we introduce a contrastive edge loss to promote the reconstruction of edge and contrast related content. Quantitative and qualitative experimental results indicate that our approach on IXI and fastMRI datasets achieves competitive performance.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.04693 [pdf, other]

ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models

Authors: Yuzhe Gu, Ziwei Ji, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai Chen

Abstract: Large language models (LLMs) exhibit hallucinations in long-form question-answering tasks across various domains and wide applications. Current hallucination detection and mitigation datasets are limited in domains and sizes, which struggle to scale due to prohibitive labor costs and insufficient reliability of existing hallucination annotators. To facilitate the scalable oversight of LLM hallucinations, this paper introduces an iterative self-training framework that simultaneously and progressively scales up the hallucination annotation dataset and improves the accuracy of the hallucination annotator. Based on the Expectation Maximization (EM) algorithm, in each iteration, the framework first applies a hallucination annotation pipeline to annotate a scaled dataset and then trains a more accurate hallucination annotator on the dataset. This new hallucination annotator is adopted in the hallucination annotation pipeline used for the next iteration. Extensive experimental results demonstrate that the finally obtained hallucination annotator with only 7B parameters surpasses the performance of GPT-4 and obtains new state-of-the-art hallucination detection results on HaluEval and HalluQA by zero-shot inference. Such an annotator can not only evaluate the hallucination levels of various LLMs on the large-scale dataset but also help to mitigate the hallucination of LLM generations, with the Natural Language Inference (NLI) metric increasing from 25% to 37% on HaluEval.

Submitted 5 July, 2024; originally announced July 2024.

Comments: 9 pages

arXiv:2407.03282 [pdf, other]

LLM Internal States Reveal Hallucination Risk Faced With a Query

Authors: Ziwei Ji, Delong Chen, Etsuko Ishii, Samuel Cahyawijaya, Yejin Bang, Bryan Wilie, Pascale Fung

Abstract: The hallucination problem of Large Language Models (LLMs) significantly limits their reliability and trustworthiness. Humans have a self-awareness process that allows us to recognize what we don't know when faced with queries. Inspired by this, our paper investigates whether LLMs can estimate their own hallucination risk before response generation. We analyze the internal mechanisms of LLMs broadly both in terms of training data sources and across 15 diverse Natural Language Generation (NLG) tasks, spanning over 700 datasets. Our empirical analysis reveals two key insights: (1) LLM internal states indicate whether they have seen the query in training data or not; and (2) LLM internal states show they are likely to hallucinate or not regarding the query. Our study explores particular neurons, activation layers, and tokens that play a crucial role in the LLM perception of uncertainty and hallucination risk. By a probing estimator, we leverage LLM self-assessment, achieving an average hallucination estimation accuracy of 84.32% at run time.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.02575 [pdf, other]

JADES: The star-formation and chemical enrichment history of a luminous galaxy at z~9.43 probed by ultra-deep JWST/NIRSpec spectroscopy

Authors: Mirko Curti, Joris Witstok, Peter Jakobsen, Chiaki Kobayashi, Emma Curtis-Lake, Kevin Hainline, Xihan Ji, Francesco D'Eugenio, Jacopo Chevallard, Roberto Maiolino, Jan Scholtz, Stefano Carniani, Santiago Arribas, William M. Baker, Rachana Bhatawdekar, Kristan Boyett, Andrew J. Bunker, Alex Cameron, Phillip A. Cargile, Stephane Charlot, Daniel J. Eisenstein, Zhiyuan Ji, Benjamin D. Johnson, Nimisha Kumari, Michael V. Maseda, et al. (8 additional authors not shown)

Abstract: We analyse ultra-deep JWST observations of the galaxy JADES-GS-z9-0 at z = 9.4327, and derive detailed stellar and interstellar medium (ISM) properties of this luminous (MUV=-20.43) high-redshift system. Complementary information from NIRCam imaging and NIRSpec (both low- and medium-resolution) spectroscopy reveal a compact system (Re ~110 pc) characterised by a steeply rising star formation history, which is reflected in the inferred young stellar age (t ~ 3 Myr, light-weighted), high star-formation rate surface density (ΣSFR ~ 72 M$_\odot$ yr$^{-1}$ kpc$^{-2}$), high ionisation parameter (log(U) ~ -1.5), low metallicity (12+log(O/H) ~ 7.5), and low carbon-over-oxygen abundance ([C/O] = -0.64). Leveraging the detection of N iii]1750 we derive nitrogen-over-oxygen abundance ([N/O] ~ 0) higher than the plateau followed by low-redshift galaxies of similar metallicity, possibly revealing the imprint from (very) massive stars on the ISM enrichment and favouring a top-heavy Initial Mass Function (IMF) scenario. Massive stars powering a hard radiation field are also required to explain the rest-frame UV line ratios, though the presence of the high-excitation [Ne v]λ3426 emission line possibly hints at additional ionization from an AGN. We also report the tentative detection of Lyα emission in the G140M spectrum, shifted by ~450 km/s redward of the systemic redshift. Combined with a modelling of the Lyα spectral break, we rule out the presence of very high column densities of neutral gas pertaining to local absorbers, as well as any extended surrounding ionised bubble, suggesting that JADES-GS-z9-0 has not yet significantly contributed to cosmic Reionization.

Submitted 2 July, 2024; originally announced July 2024.

Comments: Submitted to A&A. Comments are welcome

arXiv:2406.17968 [pdf, other]

Efficient Document Ranking with Learnable Late Interactions

Authors: Ziwei Ji, Himanshu Jain, Andreas Veit, Sashank J. Reddi, Sadeep Jayasumana, Ankit Singh Rawat, Aditya Krishna Menon, Felix Yu, Sanjiv Kumar

Abstract: Cross-Encoder (CE) and Dual-Encoder (DE) models are two fundamental approaches for query-document relevance in information retrieval. To predict relevance, CE models use joint query-document embeddings, while DE models maintain factorized query and document embeddings; usually, the former has higher quality while the latter benefits from lower latency. Recently, late-interaction models have been proposed to realize more favorable latency-quality tradeoffs, by using a DE structure followed by a lightweight scorer based on query and document token embeddings. However, these lightweight scorers are often hand-crafted, and there is no understanding of their approximation power; further, such scorers require access to individual document token embeddings, which imposes an increased latency and storage burden. In this paper, we propose novel learnable late-interaction models (LITE) that resolve these issues. Theoretically, we prove that LITE is a universal approximator of continuous scoring functions, even for relatively small embedding dimension. Empirically, LITE outperforms previous late-interaction models such as ColBERT on both in-domain and zero-shot re-ranking tasks. For instance, experiments on MS MARCO passage re-ranking show that LITE not only yields a model with better generalization, but also lowers latency and requires 0.25x storage compared to ColBERT.

Submitted 25 June, 2024; originally announced June 2024.
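
For readers unfamiliar with the hand-crafted late-interaction scorers that the abstract above contrasts LITE with, the sketch below shows the sum-of-max-similarity rule used by ColBERT: each query token takes its best dot-product match among the document tokens, and the matches are summed. This is not the learned LITE scorer, whose architecture is not described in the abstract; the function name and the toy two-dimensional embeddings are illustrative assumptions.

```python
def maxsim_score(query_embs, doc_embs):
    """ColBERT-style late-interaction score.

    query_embs, doc_embs: lists of token embedding vectors (lists of floats).
    Returns the sum, over query tokens, of the maximum dot product
    between that query token and any document token.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)


# Toy embeddings (hypothetical): 2 query tokens, 3 document tokens.
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]]
score = maxsim_score(query, doc)
```

Note that this scorer needs every document token embedding at scoring time, which is the storage and latency burden the abstract mentions.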

arXiv:2406.17608 [pdf, other]

Test-Time Generative Augmentation for Medical Image Segmentation

Authors: Xiao Ma, Yuhui Tao, Yuhan Zhang, Zexuan Ji, Yizhe Zhang, Qiang Chen

Abstract: In this paper, we propose a novel approach to enhance medical image segmentation during test time. Instead of employing hand-crafted transforms or functions on the input test image to create multiple views for test-time augmentation, we advocate for the utilization of an advanced domain-fine-tuned generative model (GM), e.g., stable diffusion (SD), for test-time augmentation. Given that the GM has been trained to comprehend and encapsulate comprehensive domain data knowledge, it is superior to segmentation models in terms of representing the data characteristics and distribution. Hence, by integrating the GM into test-time augmentation, we can effectively generate multiple views of a given test sample, aligning with the content and appearance characteristics of the sample and the related local data distribution. This approach renders the augmentation process more adaptable and resilient compared to conventional handcrafted transforms. Comprehensive experiments conducted across three medical image segmentation tasks (nine datasets) demonstrate the efficacy and versatility of the proposed TTGA in enhancing segmentation outcomes. Moreover, TTGA significantly improves pixel-wise error estimation, thereby facilitating the deployment of a more reliable segmentation system. Code will be released at: https://github.com/maxiao0234/TTGA.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 12 pages, 2 figures

arXiv:2406.11997 [pdf, other]

JADES: Physical properties of Ly$α$ and non-Ly$α$ emitters at z ~ 4.8-9.6

Authors: Nimisha Kumari, Renske Smit, Joris Witstok, Marco Sirianni, Roberto Maiolino, Andrew J. Bunker, Rachana Bhatawdekar, Kristan Boyett, Alex J. Cameron, Stefano Carniani, Stephane Charlot, Mirko Curti, Emma Curtis-Lake, Francesco D'Eugenio, Daniel J. Eisenstein, Kevin Hainline, Zhiyuan Ji, Gareth C. Jones, Brant Robertson, Aayush Saxena, Jan Scholtz, Charlotte Simmonds, Christina C. Williams, Christopher N. A. Willmer

Abstract: We investigate the physical properties of Lyman-alpha emitters (LAEs) and non-Lyman-alpha emitters (non-LAEs) at z$\sim$4.8--9.6 via a stacking analysis of 253 JWST/NIRSpec spectra of galaxies observed as part of the JWST Advanced Deep Extragalactic Survey (JADES). We identify a sample of 42 LAEs with the equivalent width of Ly$α$ $\gtrsim$20 Å and a sample of 211 non-LAEs, divide each sample further via the median redshift of the LAEs (z~6.3), and create composite spectra using the low and medium resolution spectra from NIRSpec. We estimate physical quantities such as dust extinction, UV continuum slope $β$, electron temperatures, ionization parameter, escape fraction of Ly$α$ and Lyman Continuum, and the photon production rate for each bin/stack. The existing dust-extinction laws do not appear to be valid at these epochs. The emission line ratio analyses show that active galactic nuclei might dominate all sub-samples, irrespective of Ly$α$ emission. LAEs show much higher [OIII]/[OII] and low [OII]/H$δ$ at z$\lesssim$6.3 compared to non-LAEs, but these line ratios are not sufficient to distinguish the two populations at z$>$6.3. However, the LAEs samples show large EW([OIII]4959, 5007) ($>$1000 Å) compared to the non-LAEs sample at all redshifts. CIV/Ly$α$ and CIV/CIII] for LAE population at z$\lesssim$6.3 is $\sim$ a factor of 5 larger than that for LAE population at z$>$6.3. The ionizing radiation for LAEs is hard, as revealed from several diagnostics, including CIV detection, high [OIII]/[OII] ($>$8), and large values of $ξ^{\star}_{ion}$.

Submitted 17 June, 2024; originally announced June 2024.

Comments: Submitted to ApJ, 20 pages, 13 figures, 2 tables

arXiv:2406.10179 [pdf, other]

Multivariate Predictors of LyC Escape II: Predicting LyC Escape Fractions for High-Redshift Galaxies

Authors: Anne E. Jaskot, Anneliese C. Silveyra, Anna Plantinga, Sophia R. Flury, Matthew Hayes, John Chisholm, Timothy Heckman, Laura Pentericci, Daniel Schaerer, Maxime Trebitsch, Anne Verhamme, Cody Carr, Henry C. Ferguson, Zhiyuan Ji, Mauro Giavalisco, Alaina Henry, Rui Marques-Chaves, Göran Östlin, Alberto Saldana-Lopez, Claudia Scarlata, Gábor Worseck, Xinfeng Xu

Abstract: JWST is uncovering the properties of ever increasing numbers of galaxies at z>6, during the epoch of reionization. Connecting these observed populations to the process of reionization requires understanding how efficiently they produce Lyman continuum (LyC) photons and what fraction (fesc) of these photons escape into the intergalactic medium. By applying the Cox proportional hazards model, a survival analysis technique, to the Low-redshift Lyman Continuum Survey (LzLCS), we develop new, empirical, multivariate predictions for fesc. The models developed from the LzLCS reproduce the observed fesc for z~3 samples, which suggests that LyC emitters may share similar properties at low and high redshift. Our best-performing models for the z~3 galaxies include information about dust attenuation, ionization, and/or morphology. We then apply these models to z$\gtrsim$6 galaxies. For large photometric samples, we find a median predicted fesc=0.047-0.14. For smaller spectroscopic samples, which may include stronger emission line galaxies, we find that $\geq$33% of the galaxies have fesc >0.2, and we identify several candidate extreme leakers with fesc $\geq$0.5. The current samples show no strong trend between predicted fesc and UV magnitude, but limited spectroscopic information makes this result uncertain. Multivariate predictions can give significantly different results from single variable predictions, and the predicted fesc for high-redshift galaxies can differ significantly depending on whether star formation rate surface density or radius is used as a measure of galaxy morphology. We provide all parameters necessary to predict fesc for additional samples of high-redshift galaxies using these models.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Accepted for publication in ApJ. 33 pages, 9 figures, 10 tables, plus appendix

arXiv:2406.10171 [pdf, other]

Multivariate Predictors of LyC Escape I: A Survival Analysis of the Low-redshift Lyman Continuum Survey

Authors: Anne E. Jaskot, Anneliese C. Silveyra, Anna Plantinga, Sophia R. Flury, Matthew Hayes, John Chisholm, Timothy Heckman, Laura Pentericci, Daniel Schaerer, Maxime Trebitsch, Anne Verhamme, Cody Carr, Henry C. Ferguson, Zhiyuan Ji, Mauro Giavalisco, Alaina Henry, Rui Marques-Chaves, Göran Östlin, Alberto Saldana-Lopez, Claudia Scarlata, Gábor Worseck, Xinfeng Xu

Abstract: To understand how galaxies reionized the universe, we must determine how the escape fraction of Lyman Continuum (LyC) photons (fesc) depends on galaxy properties. Using the z~0.3 Low-redshift Lyman Continuum Survey (LzLCS), we develop and analyze new multivariate predictors of fesc. These predictions use the Cox proportional hazards model, a survival analysis technique that incorporates both detections and upper limits. Our best model predicts the LzLCS fesc detections with a root-mean-square (RMS) scatter of 0.31 dex, better than single-variable correlations. According to ranking techniques, the most important predictors of fesc are the equivalent width (EW) of Lyman-series absorption lines and the UV dust attenuation, which track line-of-sight absorption due to HI and dust. The HI absorption EW is uniquely crucial for predicting fesc for the strongest LyC emitters, which show properties similar to weaker LyC emitters and whose high fesc may therefore result from favorable orientation. In the absence of HI information, star formation rate surface density ($Σ_{\rm SFR}$) and [O III]/[O II] ratio are the most predictive variables and highlight the connection between feedback and fesc. We generate a model suitable for z>6, which uses only the UV slope, $Σ_{\rm SFR}$, and [O III]/[O II]. We find that $Σ_{\rm SFR}$ is more important in predicting fesc at higher stellar masses, whereas [O III]/[O II] plays a greater role at lower masses. We also analyze predictions for other parameters, such as the ionizing-to-non-ionizing flux ratio and Lyα escape fraction. These multivariate models represent a promising tool for predicting fesc at high redshift.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Accepted for publication in ApJ. 34 pages + appendix, 12 figures
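
For reference, the Cox proportional hazards model named in the two abstracts above has the following general form. This is the standard textbook statement of the model, not the authors' specific fit: how fesc is mapped onto the survival-analysis "time" variable and which covariates enter $\mathbf{x}$ are described in the papers, not in the abstracts, so the symbols below are generic.

```latex
% Hazard for an object with covariate vector x:
% an unspecified baseline hazard scaled by a log-linear factor.
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right)

% Corresponding survival function, with baseline survival S_0(t):
S(t \mid \mathbf{x}) = S_0(t)^{\exp\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right)}
```

Because the coefficients $\boldsymbol{\beta}$ are estimated from a partial likelihood that accommodates censored observations, the model can use upper limits alongside detections, which is the property the abstract highlights.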

arXiv:2406.09178 [pdf, other]

AutomaChef: A Physics-informed Demonstration-guided Learning Framework for Granular Material Manipulation

Authors: Minglun Wei, Xintong Yang, Yu-Kun Lai, Seyed Amir Tafrishi, Ze Ji

Abstract: Due to the complex physical properties of granular materials, research on robot learning for manipulating such materials predominantly either disregards the consideration of their physical characteristics or uses surrogate models to approximate their physical properties. Learning to manipulate granular materials based on physical information obtained through precise modelling remains an unsolved problem. In this paper, we propose to address this challenge by constructing a differentiable physics simulator for granular materials based on the Taichi programming language and developing a learning framework accelerated by imperfect demonstrations that are generated via gradient-based optimisation on non-granular materials through our simulator. Experimental results show that our method trains three policies that, when chained, are capable of executing the task of transporting granular materials in both simulated and real-world scenarios, which existing popular deep reinforcement learning models fail to accomplish.

Submitted 13 June, 2024; originally announced June 2024.

Comments: 8 pages

arXiv:2406.08455 [pdf, other]

AToM-Bot: Embodied Fulfillment of Unspoken Human Needs with Affective Theory of Mind

Authors: Wei Ding, Fanhong Li, Ziteng Ji, Zhengrong Xue, Jia Liu

Abstract: We propose AToM-Bot, a novel task generation and execution framework for proactive robot-human interaction, which leverages the human mental and physical state inference capabilities of the Vision Language Model (VLM) prompted by the Affective Theory of Mind (AToM). Without requiring explicit commands by humans, AToM-Bot proactively generates and follows feasible tasks to improve general human well-being. When around humans, AToM-Bot first detects current human needs based on inferred human states and observations of the surrounding environment. It then generates tasks to fulfill these needs, taking into account its embodied constraints. We designed 16 daily life scenarios spanning 4 common scenes and presented the same visual stimulus to 59 human subjects and our robot. We used the similarity between human open-ended answers and robot output, and the human satisfaction scores, as metrics of robot performance. AToM-Bot received high human evaluations in need detection (6.42/7, 91.7%), embodied solution (6.15/7, 87.8%) and task execution (6.17/7, 88.1%). We show that AToM-Bot excels in generating and executing feasible plans to fulfill unspoken human needs. Videos and code are available at https://affective-tom-bot.github.io.

Submitted 15 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.08301 [pdf, other]

Jet modification via $π^0$-hadron correlations in Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$ GeV

Authors: PHENIX Collaboration, N. J. Abdulameer, U. Acharya, A. Adare, S. Afanasiev, C. Aidala, N. N. Ajitanand, Y. Akiba, H. Al-Bataineh, J. Alexander, M. Alfred, K. Aoki, N. Apadula, L. Aphecetche, J. Asai, H. Asano, E. T. Atomssa, R. Averbeck, T. C. Awes, B. Azmoun, V. Babintsev, M. Bai, G. Baksay, L. Baksay, A. Baldisseri, et al. (510 additional authors not shown)

Abstract: High-momentum two-particle correlations are a useful tool for studying jet-quenching effects in the quark-gluon plasma. Angular correlations between neutral-pion triggers and charged hadrons with transverse momenta in the range 4--12~GeV/$c$ and 0.5--7~GeV/$c$, respectively, have been measured by the PHENIX experiment in 2014 for Au$+$Au collisions at $\sqrt{s_{_{NN}}}=200$~GeV. Suppression is observed in the yield of high-momentum jet fragments opposite the trigger particle, which indicates jet suppression stemming from in-medium partonic energy loss, while enhancement is observed for low-momentum particles. The ratio and differences between the yield in Au$+$Au collisions and $p$$+$$p$ collisions, $I_{AA}$ and $Δ_{AA}$, as a function of the trigger-hadron azimuthal separation, $Δφ$, are measured for the first time at the Relativistic Heavy Ion Collider. These results better quantify how the yield of low-$p_T$ associated hadrons is enhanced at wide angle, which is crucial for studying energy loss as well as medium-response effects.

Submitted 12 June, 2024; originally announced June 2024.

Comments: 534 authors from 83 institutions, 12 pages, 7 figures. v1 is version submitted to Physical Review C. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html

arXiv:2406.05992 [pdf, other]

MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision Mamba

Authors: Zhongping Ji

Abstract: Recently, State Space Models (SSMs), with Mamba as a prime example, have shown great promise for long-range dependency modeling with linear complexity. Vision Mamba and subsequent architectures have since been presented in succession, and they perform well on visual tasks. The crucial step in applying Mamba to visual tasks is to arrange 2D visual features in a sequential manner. To effectively organize and construct visual features within the 2D image space through 1D selective scan, we propose a novel Multi-Head Scan (MHS) module. The embeddings extracted from the preceding layer are projected into multiple lower-dimensional subspaces. Subsequently, within each subspace, the selective scan is performed along distinct scan routes. The resulting sub-embeddings, obtained from the multi-head scan process, are then integrated and ultimately projected back into the high-dimensional space. Moreover, we incorporate a Scan Route Attention (SRA) mechanism to enhance the module's capability to discern complex structures. To validate the efficacy of our module, we exclusively substitute the 2D-Selective-Scan (SS2D) block in VM-UNet with our proposed module, and we train our models from scratch without using any pre-trained weights. The results indicate a significant improvement in performance while reducing the parameters of the original VM-UNet. The code for this study is publicly available at https://github.com/PixDeep/MHS-VM.

Submitted 9 June, 2024; originally announced June 2024.

Comments: 11 pages, 5 figures

arXiv:2406.05498 [pdf, other]

SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner

Authors: Xunguang Wang, Daoyuan Wu, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Shuai Wang, Yingjiu Li, Yang Liu, Ning Liu, Juergen Rahmel

Abstract: Jailbreaking is an emerging adversarial attack that bypasses the safety alignment deployed in off-the-shelf large language models (LLMs) and has evolved into four major categories: optimization-based attacks such as Greedy Coordinate Gradient (GCG), jailbreak template-based attacks such as "Do-Anything-Now", advanced indirect attacks like DrAttack, and multilingual jailbreaks. However, delivering a practical jailbreak defense is challenging because it needs to not only handle all the above jailbreak attacks but also incur negligible delay to user prompts, as well as be compatible with both open-source and closed-source LLMs. Inspired by how the traditional security concept of shadow stacks defends against memory overflow attacks, this paper introduces a generic LLM jailbreak defense framework called SelfDefend, which establishes a shadow LLM defense instance to concurrently protect the target LLM instance in the normal stack and collaborate with it for checkpoint-based access control. The effectiveness of SelfDefend builds upon our observation that existing LLMs (both target and defense LLMs) have the capability to identify harmful prompts or intentions in user queries, which we empirically validate using the commonly used GPT-3.5/4 models across all major jailbreak attacks. Our measurements show that SelfDefend enables GPT-3.5 to suppress the attack success rate (ASR) by 8.97-95.74% (average: 60%) and GPT-4 by 36.36-100% (average: 83%), while incurring negligible effects on normal queries.
To further improve the defense's robustness and minimize costs, we employ a data distillation approach to tune dedicated open-source defense models. These models outperform four SOTA defenses and match the performance of GPT-4-based SelfDefend, with significantly lower extra delays. We also empirically show that the tuned models are robust to targeted GCG and prompt injection attacks.

Submitted 8 June, 2024; originally announced June 2024.

Comments: This paper completes its earlier vision paper, available at arXiv:2402.15727

arXiv:2406.01059 [pdf, other]

VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model

Authors: Jinze Yang, Haoran Wang, Zining Zhu, Chenglong Liu, Meng Wymond Wu, Zeke Xie, Zhong Ji, Jungong Han, Mingming Sun

Abstract: In this paper, we focus on resolving the problem of image outpainting, which aims to extrapolate the surrounding parts given the center contents of an image. Although recent works have achieved promising performance, the lack of versatility and customization hinders their practical applications in broader scenarios. Therefore, this work presents a novel image outpainting framework that is capable of customizing the results according to the requirements of users. First of all, we take advantage of a Multimodal Large Language Model (MLLM) that automatically extracts and organizes the corresponding textual descriptions of the masked and unmasked parts of a given image. Accordingly, the obtained text prompts are introduced to endow our model with the capacity to customize the outpainting results. In addition, a special Cross-Attention module, namely Center-Total-Surrounding (CTS), is elaborately designed to further enhance the interaction between specific space regions of the image and corresponding parts of the text prompts. Note that unlike most existing methods, our approach is very resource-efficient since it is just slightly fine-tuned on the off-the-shelf stable diffusion (SD) model rather than being trained from scratch. Finally, the experimental results on three commonly used datasets, i.e. Scenery, Building, and WikiArt, demonstrate our model significantly surpasses the SoTA methods. Moreover, versatile outpainting results are listed to show its customization ability.

Submitted 3 June, 2024; originally announced June 2024.

Comments: 15 pages

arXiv:2406.00072 [pdf, other]

Methodology for Analyzing Proton Multiplicity Fluctuations with Azimuthal Partitions in Heavy-Ion Collisions

Authors: Dylan Neff, Zhongling Ji, Roli Esha, Gang Wang, Huan Huang

Abstract: A primary objective in high-energy heavy-ion collisions is to investigate the phase transition between confined and deconfined color matter. Complementary to the cumulants of conserved charges integrated over the full azimuth, we introduce a novel experimental approach to explore particle fluctuations in azimuthal partitions, which are potentially sensitive to the first-order phase transition in heavy-ion collisions. We evaluate proton multiplicity ($N_w$) fluctuations in azimuthal partitions of width $w$ to quantitatively estimate the clustering tendency among these protons. The $Δσ^2$ observable is defined as the normalized difference between the variance of the $N_w$ distribution and the binomial baseline. We demonstrate the feasibility and characteristics of this observable through simulations using the AMPT and MUSIC+FIST models. We also use a Gaussian correlation model to illustrate that the dependence of $Δσ^2$ on $w$ can be parameterized to accurately extract the strength and the range of the input interaction among protons.
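
Illustrative note (not part of the abstract): the observable described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation. It assumes every event has the same number of protons `n_protons`, that an uncorrelated proton falls in a partition of width `w` with probability `w / (2π)`, and that the normalization divides by `n_protons * (n_protons - 1)`; the abstract does not specify the normalization, so that factor and all function names here are hypothetical.

```python
import math
import random
from statistics import pvariance


def binomial_variance(n_protons, w):
    """Variance of the partition count if protons are uncorrelated and uniform in azimuth."""
    p = w / (2 * math.pi)  # probability that one proton lands in the partition
    return n_protons * p * (1 - p)


def delta_sigma2(counts, n_protons, w):
    """Normalized difference between the observed variance and the binomial baseline.

    The divisor n_protons * (n_protons - 1) is an assumed normalization.
    """
    observed = pvariance(counts)
    baseline = binomial_variance(n_protons, w)
    return (observed - baseline) / (n_protons * (n_protons - 1))


def sample_counts(n_events, n_protons, w, seed=0):
    """Toy events: uniform, independent azimuths; count protons with angle in [0, w)."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_events):
        angles = [rng.uniform(0.0, 2 * math.pi) for _ in range(n_protons)]
        counts.append(sum(1 for a in angles if a < w))
    return counts


if __name__ == "__main__":
    w = math.pi / 2
    counts = sample_counts(n_events=5000, n_protons=20, w=w)
    # For uncorrelated protons the value should scatter around zero.
    print(delta_sigma2(counts, n_protons=20, w=w))
```

Under these assumptions, a positive value would indicate a wider-than-binomial count distribution (clustering) and a negative value a narrower one (repulsion).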

Submitted 30 May, 2024; originally announced June 2024.

Comments: 10 pages, 15 figures

arXiv:2405.20315 [pdf, other]

ANAH: Analytical Annotation of Hallucinations in Large Language Models

Authors: Ziwei Ji, Yuzhe Gu, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai Chen

Abstract: Reducing the `$\textit{hallucination}$' problem of Large Language Models (LLMs) is crucial for their wide applications. A comprehensive and fine-grained measurement of the hallucination is the first key step for the governance of this issue but is under-explored in the community. Thus, we present $\textbf{ANAH}$, a bilingual dataset that offers $\textbf{AN}$alytical $\textbf{A}$nnotation of $\textbf{H}$allucinations in LLMs within Generative Question Answering. Each answer sentence in our dataset undergoes rigorous annotation, involving the retrieval of a reference fragment, the judgment of the hallucination type, and the correction of hallucinated content. ANAH consists of ~12k sentence-level annotations for ~4.3k LLM responses covering over 700 topics, constructed by a human-in-the-loop pipeline. Thanks to the fine granularity of the hallucination annotations, we can quantitatively confirm that the hallucinations of LLMs progressively accumulate in the answer and use ANAH to train and evaluate hallucination annotators. We conduct extensive experiments on studying generative and discriminative annotators and show that, although current open-source LLMs have difficulties in fine-grained hallucination annotation, the generative annotator trained with ANAH can surpass all open-source LLMs and GPT-3.5, obtain performance competitive with GPT-4, and exhibit better generalization ability on unseen questions.

Submitted 30 May, 2024; originally announced May 2024.

Comments: Accepted by ACL 2024

arXiv:2405.19732 [pdf, other]

Two Optimizers Are Better Than One: LLM Catalyst Empowers Gradient-Based Optimization for Prompt Tuning

Authors: Zixian Guo, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo, Wangmeng Zuo

Abstract: Learning a skill generally relies on both practical experience by a doer and insightful high-level guidance by an instructor. Will this strategy also work well for solving complex non-convex optimization problems? Here, a common gradient-based optimizer acts like a disciplined doer, making a locally optimal update at each step. Recent methods utilize large language models (LLMs) to optimize solutions for concrete problems by inferring from natural language instructions, akin to a high-level instructor. In this paper, we show that these two optimizers are complementary to each other, suggesting a collaborative optimization approach. The gradient-based optimizer and LLM-based optimizer are combined in an interleaved manner. We instruct LLMs using task descriptions and timely optimization trajectories recorded during gradient-based optimization. Inferred results from LLMs are used as restarting points for the next stage of gradient optimization. By leveraging both the locally rigorous gradient-based optimizer and the high-level deductive LLM-based optimizer, our combined optimization method consistently yields improvements over competitive baseline prompt tuning methods. Our results demonstrate the synergistic effect of conventional gradient-based optimization and the inference ability of LLMs. The code is released at https://github.com/guozix/LLM-catalyst.

Submitted 6 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

arXiv:2405.18485 [pdf, other]

A shining cosmic dawn: spectroscopic confirmation of two luminous galaxies at $z\sim14$

Authors: Stefano Carniani, Kevin Hainline, Francesco D'Eugenio, Daniel J. Eisenstein, Peter Jakobsen, Joris Witstok, Benjamin D. Johnson, Jacopo Chevallard, Roberto Maiolino, Jakob M. Helton, Chris Willott, Brant Robertson, Stacey Alberts, Santiago Arribas, William M. Baker, Rachana Bhatawdekar, Kristan Boyett, Andrew J. Bunker, Alex J. Cameron, Phillip A. Cargile, Stéphane Charlot, Mirko Curti, Emma Curtis-Lake, Eiichi Egami, Giovanna Giardino, et al. (18 additional authors not shown)

Abstract: The discovery by JWST of an abundance of luminous galaxies in the very early Universe suggests that galaxies developed rapidly, in apparent tension with many standard models. However, most of these galaxies lack spectroscopic confirmation, so their distances and properties are uncertain. We present JADES JWST/NIRSpec spectroscopic confirmation of two luminous galaxies at redshifts of $z=14.32^{+0.08}_{-0.20}$ and $z=13.90\pm0.17$. The spectra reveal ultraviolet continua with prominent Lyman-$α$ breaks but no detected emission lines. This discovery proves that luminous galaxies were already in place 300 million years after the Big Bang and are more common than what was expected before JWST. The more distant of the two galaxies is unexpectedly luminous (M$_{\rm uv}=-20.81\pm0.16$) and is spatially resolved with a radius of 260 parsecs. Considering also the steep ultraviolet slope of the second galaxy ($β=-2.71\pm0.19$), we conclude that both are dominated by stellar continuum emission, showing that the excess of luminous galaxies in the early Universe cannot be entirely explained by accretion onto black holes. Galaxy formation models will need to address the existence of such large and luminous galaxies so early in cosmic history.

Submitted 28 May, 2024; originally announced May 2024.

Comments: 26 pages, 15 figures

arXiv:2405.18462 [pdf, other]

JWST/MIRI photometric detection at $7.7\ μ\mathrm{m}$ of the stellar continuum and nebular emission in a galaxy at $z > 14$

Authors: Jakob M. Helton, George H. Rieke, Stacey Alberts, Zihao Wu, Daniel J. Eisenstein, Kevin N. Hainline, Stefano Carniani, Zhiyuan Ji, William M. Baker, Rachana Bhatawdekar, Andrew J. Bunker, Phillip A. Cargile, Stéphane Charlot, Jacopo Chevallard, Francesco D'Eugenio, Eiichi Egami, Benjamin D. Johnson, Gareth C. Jones, Jianwei Lyu, Roberto Maiolino, Pablo G. Pérez-González, Marcia J. Rieke, Brant Robertson, Aayush Saxena, Jan Scholtz, et al. (9 additional authors not shown)

Abstract: The James Webb Space Telescope (JWST) has spectroscopically confirmed numerous galaxies at $z > 10$. While weak rest-ultraviolet emission lines have only been seen in a handful of sources, the stronger rest-optical emission lines are highly diagnostic and accessible at mid-infrared wavelengths with the Mid-Infrared Instrument (MIRI) of JWST. We report the photometric detection of the most distant spectroscopically confirmed galaxy JADES-GS-z14-0 at $z = 14.32^{+0.08}_{-0.20}$ with MIRI at $7.7\ μ\mathrm{m}$. The most plausible solution for the stellar population properties is that this galaxy contains half a billion solar masses in stars with a strong burst of star formation in the most recent few million years. For this model, at least one-third of the flux at $7.7\ μ\mathrm{m}$ comes from the rest-optical emission lines $\mathrm{H}β$ and/or $\mathrm{[OIII]}λλ4959,5007$. The inferred properties of JADES-GS-z14-0 suggest rapid mass assembly and metal enrichment during the earliest phases of galaxy formation.

Submitted 28 May, 2024; originally announced May 2024.

Comments: Submitted; main text has 9 pages, 3 figures and 1 table; extended text has 13 pages, 5 figures, and 1 table

arXiv:2405.15972 [pdf, other]

SMILES Initial Data Release: Unveiling the Obscured Universe with MIRI Multi-band Imaging

Authors: Stacey Alberts, Jianwei Lyu, Irene Shivaei, George H. Rieke, Pablo G. Perez-Gonzalez, Nina Bonventura, Yongda Zhu, Jakob M. Helton, Zhiyuan Ji, Jane Morrison, Brant E. Robertson, Meredith A. Stone, Yang Sun, Christina C. Williams, Christopher N. A. Willmer

Abstract: The James Webb Space Telescope (JWST) is revolutionizing our view of the Universe through unprecedented sensitivity and resolution in the infrared, with some of the largest gains realized at its longest wavelengths. We present the Systematic Mid-infrared Instrument (MIRI) Legacy Extragalactic Survey (SMILES), an eight-band MIRI survey with Near-Infrared Spectrograph (NIRSpec) spectroscopic follow-up in the GOODS-S/HUDF region. SMILES takes full advantage of MIRI's continuous coverage from $5.6-25.5\,μ$m over a $\sim34$ arcmin$^2$ area to greatly expand our understanding of the obscured Universe up to cosmic noon and beyond. This work, together with a companion paper by Rieke et al., covers the SMILES science drivers and technical design, early results with SMILES, data reduction, photometric catalog creation, and the first data release. As part of the discussion on early results, we additionally present a high-level science demonstration on how MIRI's wavelength coverage and resolution will advance our understanding of cosmic dust using the full range of polycyclic aromatic hydrocarbon (PAH) emission features from $3.3-18\,μ$m. Using custom background subtraction, we produce robust reductions of the MIRI imaging that maximize the depths reached with our modest exposure times ($\sim0.6 - 2.2$ ks per filter). Included in our initial data release are (1) eight MIRI imaging mosaics reaching depths of $0.2-18\,μ$Jy ($5σ$) and (2) a $5-25.5\,μ$m photometric catalog with over 3,000 sources.
Building upon the rich legacy of extensive photometric and spectroscopic coverage of GOODS-S/HUDF from the X-ray to the radio, SMILES greatly expands our investigative power in understanding the obscured Universe.

Submitted 24 May, 2024; originally announced May 2024.

Comments: 23 pages, 19 figures, submitted to ApJ. Comments welcome! Data release will go live at https://archive.stsci.edu/hlsp/smiles in the next few weeks

arXiv:2405.13166 [pdf, other]

FairLENS: Assessing Fairness in Law Enforcement Speech Recognition

Authors: Yicheng Wang, Mark Cusick, Mohamed Laila, Kate Puech, Zhengping Ji, Xia Hu, Michael Wilson, Noah Spitzer-Williams, Bryan Wheeler, Yasser Ibrahim

Abstract: Automatic speech recognition (ASR) techniques have become powerful tools, enhancing efficiency in law enforcement scenarios. To ensure fairness for demographic groups in different acoustic environments, ASR engines must be tested across a variety of speakers in realistic settings. However, describing the fairness discrepancies between models with confidence remains a challenge. Meanwhile, most public ASR datasets are insufficient to perform a satisfying fairness evaluation. To address the limitations, we built FairLENS - a systematic fairness evaluation framework. We propose a novel and adaptable evaluation method to examine the fairness disparity between different models. We also collected a fairness evaluation dataset covering multiple scenarios and demographic dimensions. Leveraging this framework, we conducted fairness assessments on 1 open-source and 11 commercially available state-of-the-art ASR models. Our results reveal that certain models exhibit more biases than others, serving as a fairness guideline for users to make informed choices when selecting ASR models for a given real-world scenario. We further explored model biases towards specific demographic groups and observed that shifts in the acoustic domain can lead to the emergence of new biases.

Submitted 28 May, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.10908 [pdf, other]

UVCANDELS: The role of dust on the stellar mass-size relation of disk galaxies at 0.5 $\leq z \leq$ 3.0

Authors: Kalina V. Nedkova, Marc Rafelski, Harry I. Teplitz, Vihang Mehta, Laura DeGroot, Swara Ravindranath, Anahita Alavi, Alexander Beckett, Norman A. Grogin, Boris Häußler, Anton M. Koekemoer, Grecco A. Oyarzún, Laura Prichard, Mitchell Revalski, Gregory F. Snyder, Ben Sunnquist, Xin Wang, Rogier A. Windhorst, Nima Chartab, Christopher J. Conselice, Yicheng Guo, Nimish Hathi, Matthew J. Hayes, Zhiyuan Ji, Keunho J. Kim, et al. (8 additional authors not shown)

Abstract: We use the Ultraviolet Imaging of the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey fields (UVCANDELS) to measure half-light radii in the rest-frame far-UV for $\sim$16,000 disk-like galaxies over $0.5\leq z \leq 3$. We compare these results to rest-frame optical sizes that we measure in a self-consistent way and find that the stellar mass-size relation of disk galaxies is steeper in the rest-frame UV than in the optical across our entire redshift range. We show that this is mainly driven by massive galaxies ($\gtrsim10^{10}$M$_\odot$), which we find to also be among the most dusty. Our results are consistent with the literature and have commonly been interpreted as evidence of inside-out growth wherein galaxies form their central structures first. However, they could also suggest that the centers of massive galaxies are more heavily attenuated than their outskirts. We distinguish between these scenarios by modeling and selecting galaxies at $z=2$ from the VELA simulation suite in a way that is consistent with UVCANDELS. We show that the effects of dust alone can account for the size differences we measure at $z=2$. This indicates that, at different wavelengths, size differences and the different slopes of the stellar mass-size relation do not constitute evidence for inside-out growth.

Submitted 28 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

Comments: Accepted for publication in ApJ. 22 pages, 12 figures, and 4 tables

arXiv:2405.08555 [pdf, other]

Dual-Branch Network for Portrait Image Quality Assessment

Authors: Wei Sun, Weixia Zhang, Yanwei Jiang, Haoning Wu, Zicheng Zhang, Jun Jia, Yingjie Zhou, Zhongpeng Ji, Xiongkuo Min, Weisi Lin, Guangtao Zhai

Abstract: Portrait images typically consist of a salient person against diverse backgrounds. With the development of mobile devices and image processing techniques, users can conveniently capture portrait images anytime and anywhere. However, the quality of these portraits may suffer from the degradation caused by unfavorable environmental conditions, subpar photography techniques, and inferior capturing devices. In this paper, we introduce a dual-branch network for portrait image quality assessment (PIQA), which can effectively address how the salient person and the background of a portrait image influence its visual quality. Specifically, we utilize two backbone networks (i.e., Swin Transformer-B) to extract the quality-aware features from the entire portrait image and the facial image cropped from it. To enhance the quality-aware feature representation of the backbones, we pre-train them on the large-scale video quality assessment dataset LSVQ and the large-scale facial image quality assessment dataset GFIQA. Additionally, we leverage LIQE, an image scene classification and quality assessment model, to capture the quality-aware and scene-specific features as the auxiliary features. Finally, we concatenate these features and regress them into quality scores via a multi-layer perceptron (MLP). We employ the fidelity loss to train the model in a learning-to-rank manner to mitigate inconsistencies in quality scores in the portrait image quality assessment dataset PIQ.
Experimental results demonstrate that the proposed model achieves superior performance on the PIQ dataset, validating its effectiveness. The code is available at https://github.com/sunwei925/DN-PIQA.git.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.07551 [pdf, other]

MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning

Authors: Shuo Yin, Weihao You, Zhilong Ji, Guoqiang Zhong, Jinfeng Bai

Abstract: The tool-use Large Language Models (LLMs) that integrate with external Python interpreters have significantly enhanced mathematical reasoning capabilities for open-source LLMs, while tool-free methods chose another track: augmenting math reasoning data. However, a great method to integrate the above two research paths and combine their advantages remains to be explored. In this work, we first include new math questions via multi-perspective data augmenting methods and then synthesize code-nested solutions to them. The open LLMs (i.e., Llama-2) are finetuned on the augmented dataset to get the resulting models, MuMath-Code ($μ$-Math-Code). During the inference phase, our MuMath-Code generates code and interacts with the external Python interpreter to get the execution results. Therefore, MuMath-Code leverages the advantages of both the external tool and data augmentation. To fully leverage the advantages of our augmented data, we propose a two-stage training strategy: In Stage-1, we finetune Llama-2 on pure CoT data to get an intermediate model, which then is trained on the code-nested data in Stage-2 to get the resulting MuMath-Code. Our MuMath-Code-7B achieves 83.8 on GSM8K and 52.4 on MATH, while MuMath-Code-70B model achieves new state-of-the-art performance among open methods -- achieving 90.7% on GSM8K and 55.1% on MATH. Extensive experiments validate the combination of tool use and data augmentation, as well as our two-stage training strategy. We release the proposed dataset along with the associated code for public use.

Submitted 13 May, 2024; originally announced May 2024.

Comments: The state-of-the-art open-source tool-use LLMs for mathematical reasoning

arXiv:2405.05806 [pdf, other]

MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation

Authors: Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Hongzhi Zhang, Lei Zhang, Wangmeng Zuo

Abstract: Text-to-image (T2I) diffusion models have shown significant success in personalized text-to-image generation, which aims to generate novel images with human identities indicated by the reference images. Although promising identity fidelity has been achieved by several tuning-free methods, they usually suffer from overfitting issues. The learned identity tends to entangle with irrelevant information, resulting in unsatisfactory text controllability, especially on faces. In this work, we present MasterWeaver, a test-time tuning-free method designed to generate personalized images with both faithful identity fidelity and flexible editability. Specifically, MasterWeaver adopts an encoder to extract identity features and steers the image generation through additionally introduced cross attention. To improve editability while maintaining identity fidelity, we propose an editing direction loss for training, which aligns the editing directions of our MasterWeaver with those of the original T2I model. Additionally, a face-augmented dataset is constructed to facilitate disentangled identity learning, and further improve the editability. Extensive experiments demonstrate that our MasterWeaver can not only generate personalized images with faithful identity, but also exhibit superiority in text controllability. Our code will be publicly available at https://github.com/csyxwei/MasterWeaver.

Submitted 10 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: 34 pages

arXiv:2405.05772 [pdf, other]

JADES -- The small blue bump in GN-z11: insights into the nuclear region of a galaxy at z=10.6

Authors: Xihan Ji, Roberto Maiolino, Gary Ferland, Francesco D'Eugenio, Rachana Bhatawdekar, Stéphane Charlot, Jacopo Chevallard, Mirko Curti, Emma Curtis-Lake, Kevin Hainline, Zhiyuan Ji, Brant Robertson, Bruno Rodríguez Del Pino, Jan Scholtz, Sandro Tacchella, Christina C. Williams, Joris Witstok

Abstract: We report the detection of continuum excess in the rest-frame UV between 3000 Å and 3550 Å in the JWST/NIRSpec spectrum of GN-z11, a galaxy hosting an active galactic nucleus (AGN) at z = 10.603. The shape of the continuum excess resembles a Balmer continuum but has a break around 3546 Å in the rest frame, which is 100 Å bluewards to the Balmer limit at 3646 Å. A Balmer continuum model alone cannot fit the spectrum, implying a different origin for the continuum excess. The absence of the Balmer jump indicates an electron temperature of $\sim 3\times 10^4$ K, which is significantly higher than the temperature of $T_{e}({\rm O^{2+}}) \approx 1.3\times 10^{4}$ K inferred from [OIII]$λ4363$. The temperature difference must result from mixing of different ionized regions: the Balmer emission mainly arises from dense and hot clouds in the Broad Line Region, close to the accreting black hole, whereas the forbidden lines originate from less dense and colder gas in the host galaxy (although these ionized regions are kinematically similar in GN-z11 due to its small BH mass). We propose a potential explanation for the observed continuum excess to come from a complex of FeII emission, which shows a characteristic jump bluewards to the Balmer limit as previously seen in the spectra of many lower-redshift quasars. Through comparisons with Cloudy models, we show an Fe abundance or an overall metallicity above $\sim 1/3$ solar is likely needed. Besides the FeII emission, part of the small blue bump might also be associated with an OIII Bowen fluorescent line, a line often enhanced in dense AGN-ionized gas. Finally, the spectrum provides further evidence against Wolf-Rayet or massive stars dominating the nebular emission in GN-z11.

Submitted 20 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

Comments: 22 pages (including appendix), 18 figures, submitted to MNRAS

arXiv:2404.17833 [pdf, other]

Testing and Understanding Erroneous Planning in LLM Agents through Synthesized User Inputs

Authors: Zhenlan Ji, Daoyuan Wu, Pingchuan Ma, Zongjie Li, Shuai Wang

Abstract: Agents based on large language models (LLMs) have demonstrated effectiveness in solving a wide range of tasks by integrating LLMs with key modules such as planning, memory, and tool usage. Increasingly, customers are adopting LLM agents across a variety of commercial applications critical to reliability, including support for mental well-being, chemical synthesis, and software development. Nevertheless, our observations and daily use of LLM agents indicate that they are prone to making erroneous plans, especially when the tasks are complex and require long-term planning. In this paper, we propose PDoctor, a novel and automated approach to testing LLM agents and understanding their erroneous planning. As the first work in this direction, we formulate the detection of erroneous planning as a constraint satisfiability problem: an LLM agent's plan is considered erroneous if its execution violates the constraints derived from the user inputs. To this end, PDoctor first defines a domain-specific language (DSL) for user queries and synthesizes varying inputs with the assistance of the Z3 constraint solver. These synthesized inputs are natural language paragraphs that specify the requirements for completing a series of tasks. Then, PDoctor derives constraints from these requirements to form a testing oracle. We evaluate PDoctor with three mainstream agent frameworks and two powerful LLMs (GPT-3.5 and GPT-4). The results show that PDoctor can effectively detect diverse errors in agent planning and provide insights and error characteristics that are valuable to both agent developers and users. We conclude by discussing potential alternative designs and directions to extend PDoctor.

Submitted 27 April, 2024; originally announced April 2024.

arXiv:2404.16425 [pdf, other]

Soft X-ray prompt emission from a high-redshift gamma-ray burst EP240315a

Authors: Y. Liu, H. Sun, D. Xu, D. S. Svinkin, J. Delaunay, N. R. Tanvir, H. Gao, C. Zhang, Y. Chen, X. -F. Wu, B. Zhang, W. Yuan, J. An, G. Bruni, D. D. Frederiks, G. Ghirlanda, J. -W. Hu, A. Li, C. -K. Li, J. -D. Li, D. B. Malesani, L. Piro, G. Raman, R. Ricci, E. Troja , et al. (170 additional authors not shown)

Abstract: Long gamma-ray bursts (GRBs) are believed to originate from core collapse of massive stars. High-redshift GRBs can probe the star formation and reionization history of the early universe, but their detection remains rare. Here we report the detection of a GRB triggered in the 0.5--4 keV band by the Wide-field X-ray Telescope (WXT) on board the Einstein Probe (EP) mission, designated as EP240315a, whose bright peak was also detected by the Swift Burst Alert Telescope and Konus-Wind through off-line analyses. At a redshift of $z=4.859$, EP240315a showed a much longer and more complicated light curve in the soft X-ray band than in gamma-rays. Benefiting from a large field-of-view ($\sim$3600 deg$^2$) and a high sensitivity, EP-WXT captured the earlier engine activation and extended late engine activity through a continuous detection. With a peak X-ray flux at the faint end of previously known high-$z$ GRBs, the detection of EP240315a demonstrates the great potential for EP to study the early universe via GRBs.

Submitted 25 April, 2024; originally announced April 2024.

Comments: 41 pages, 8 figures, 7 tables

arXiv:2404.15802 [pdf, other]

Raformer: Redundancy-Aware Transformer for Video Wire Inpainting

Authors: Zhong Ji, Yimu Su, Yan Zhang, Jiacheng Hou, Yanwei Pang, Jungong Han

Abstract: Video Wire Inpainting (VWI) is a prominent application in video inpainting, aimed at flawlessly removing wires in films or TV series, offering significant time and labor savings compared to manual frame-by-frame removal. However, wire removal poses greater challenges due to the wires being longer and slimmer than objects typically targeted in general video inpainting tasks, and often intersecting with people and background objects irregularly, which adds complexity to the inpainting process. Recognizing the limitations posed by existing video wire datasets, which are characterized by their small size, poor quality, and limited variety of scenes, we introduce a new VWI dataset with a novel mask generation strategy, namely Wire Removal Video Dataset 2 (WRV2) and Pseudo Wire-Shaped (PWS) Masks. The WRV2 dataset comprises over 4,000 videos with an average length of 80 frames, designed to facilitate the development and efficacy of inpainting models. Building upon this, our research proposes the Redundancy-Aware Transformer (Raformer) method that addresses the unique challenges of wire removal in video inpainting. Unlike conventional approaches that indiscriminately process all frame patches, Raformer employs a novel strategy to selectively bypass redundant parts, such as static background segments devoid of valuable information for inpainting. At the core of Raformer is the Redundancy-Aware Attention (RAA) module, which isolates and accentuates essential content through a coarse-grained, window-based attention mechanism. This is complemented by a Soft Feature Alignment (SFA) module, which refines these features and achieves end-to-end feature alignment. Extensive experiments on both the traditional video inpainting datasets and our proposed WRV2 dataset demonstrate that Raformer outperforms other state-of-the-art methods.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.15688 [pdf, ps, other]

Observer-Based Realization of Control Systems

Authors: Daizhan Cheng, Changxi Li, Xiao Zhang, Zhengping Ji

Abstract: Lebesgue-type dynamic control systems and the dimension-keeping semi-tensor product (DK-STP) of matrices are introduced. Using bridge matrices, the DK-STP is used to construct approximated observer-based realizations (OR) of linear control systems, which are proposed as Lebesgue-type control systems. A necessary and sufficient condition for the OR-system to have exactly the same observer dynamics is obtained. When the exact OR-system does not exist, the extended OR-system, which contains observers of the original system as part of its state variables, is presented. Moreover, the (minimum) feedback (extended) OR-system is also constructed, and its relationship with Kalman's minimum realization is revealed. Finally, the technique developed for linear control systems has been extended to affine nonlinear control systems. The purpose of the OR-system is to provide a new technique to deal with large-scale complex systems.

Submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.10541 [pdf, other]

MPCOM: Robotic Data Gathering with Radio Mapping and Model Predictive Communication

Authors: Zhiyou Ji, Guoliang Li, Ruihua Han, Shuai Wang, Bing Bai, Wei Xu, Kejiang Ye, Chengzhong Xu

Abstract: Robotic data gathering (RDG) is an emerging paradigm that navigates a robot to harvest data from remote sensors. However, motion planning in this paradigm needs to maximize the RDG efficiency instead of the navigation efficiency, for which the existing motion planning methods become inefficient, as they plan robot trajectories merely according to motion factors. This paper proposes radio map guided model predictive communication (MPCOM), which navigates the robot with both grid and radio maps for shape-aware collision avoidance and communication-aware trajectory generation in a dynamic environment. The proposed MPCOM is able to trade off the time spent on reaching the goal, avoiding collision, and improving communication. MPCOM captures high-order signal propagation characteristics using radio maps and incorporates the map-guided communication regularizer into the motion planning block. Experiments in IRSIM and CARLA simulators show that the proposed MPCOM outperforms other benchmarks in both LOS and NLOS cases. Real-world testing based on car-like robots is also provided to demonstrate the effectiveness of MPCOM in indoor environments.

Submitted 16 April, 2024; originally announced April 2024.

Comments: submit to IROS

arXiv:2404.10264 [pdf, other]

Calibration of the Cryogenic Measurement System of a Resonant Haloscope Cavity

Authors: Dong He, Jie Fan, Xin Gao, Yu Gao, Nick Houston, Zhongqing Ji, Yirong Jin, Chuang Li, Jinmian Li, Tianjun Li, Shi-hang Liu, Jia-Shu Niu, Zhihui Peng, Liang Sun, Zheng Sun, Jia Wang, Puxian Wei, Lina Wu, Zhongchen Xiang, Qiaoli Yang, Chi Zhang, Wenxing Zhang, Xin Zhang, Dongning Zheng, Ruifeng Zheng , et al. (1 additional authors not shown)

Abstract: Possible light bosonic dark matter interactions with the Standard Model photon have been searched for with microwave resonant cavities. In this paper, we demonstrate the cryogenic readout system calibration of a 7.138 GHz copper cavity with a loaded quality factor $Q_l=10^4$, operated at 22 mK temperature based on a dilution refrigerator. Our readout system consists of High Electron Mobility Transistors as cryogenic amplifiers at 4 K, plus room-temperature amplifiers and a spectrum analyzer for signal power detection. We test the system with a superconducting two-level system as a single-photon source in the microwave frequency regime and report an overall 95.6 dB system gain and -71.4 dB attenuation in the cavity's input channel. The effective noise temperature of the measurement system is 7.5 K.

Submitted 15 April, 2024; originally announced April 2024.

Comments: 7 pages, 5 figures, version to appear in CPC

arXiv:2404.09125 [pdf]

Achieving High Yield of Perpendicular SOT-MTJ Manufactured on 300 mm Wafers

Authors: Wenlong Yang, Zhenghui Ji, Yang Gao, Kaiyuan Zhou, Qijun Guo, Dinggui Zeng, Shasha Wang, Ming Wang, Lijie Shen, Guilin Chen, Yihui Sun, Enlong Liu, Shikun He

Abstract: The large-scale fabrication of three-terminal magnetic tunnel junctions (MTJs) with high yield is becoming increasingly crucial, especially with the growing interest in spin-orbit torque (SOT) magnetic random access memory (MRAM) as the next generation of MRAM technology. To achieve high yield and consistent device performance in MTJs with perpendicular magnetic anisotropy, an integration flow has been developed that incorporates a special MTJ etching technique and other CMOS-compatible processes on a 300 mm wafer manufacturing platform. Systematic studies have been conducted on device performance and statistical uniformity, encompassing magnetic properties, electrical switching behavior, and reliability. Achievements include a switching current of 680 uA at 2 ns, a TMR as high as 119%, ultra-high endurance (over $10^{12}$ cycles), and excellent uniformity in the fabricated SOT-MTJ devices, with a yield of up to 99.6%. The proposed integration process, featuring high yield, is anticipated to streamline the mass production of SOT-MRAM.

Submitted 13 April, 2024; originally announced April 2024.

Comments: 8 pages, 5 figures

ACM Class: J.2.6

arXiv:2404.07900 [pdf, other]

High-Dimension Human Value Representation in Large Language Models

Authors: Samuel Cahyawijaya, Delong Chen, Yejin Bang, Leila Khalatbari, Bryan Wilie, Ziwei Ji, Etsuko Ishii, Pascale Fung

Abstract: The widespread application of Large Language Models (LLMs) across various tasks and fields has necessitated the alignment of these models with human values and preferences. Given various approaches of human value alignment, ranging from Reinforcement Learning with Human Feedback (RLHF) to constitutional learning, etc., there is an urgent need to understand the scope and nature of human values injected into these models before their release. There is also a need for model alignment without a costly large-scale human annotation effort. We propose UniVaR, a high-dimensional representation of human value distributions in LLMs, orthogonal to model architecture and training data. Trained from the value-relevant output of eight multilingual LLMs and tested on the output from four multilingual LLMs, namely LlaMA2, ChatGPT, JAIS and Yi, we show that UniVaR is a powerful tool to compare the distribution of human values embedded in different LLMs with different language sources. Through UniVaR, we explore how different LLMs prioritize various values in different languages and cultures, shedding light on the complex interplay between human values and language modeling.

Submitted 25 June, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.07157 [pdf, other]

Local probe of bulk and edge states in a fractional Chern insulator

Authors: Zhurun Ji, Heonjoon Park, Mark E. Barber, Chaowei Hu, Kenji Watanabe, Takashi Taniguchi, Jiun-Haw Chu, Xiaodong Xu, Zhi-xun Shen

Abstract: Fractional quantum Hall effect (FQHE) is a prime example of topological quantum many-body phenomena, arising from the interplay between strong electron correlation, topological order, and time reversal symmetry breaking. Recently, a lattice analog of FQHE at zero magnetic field has been observed, confirming the existence of a zero-field fractional Chern insulator (FCI). Despite this, the bulk-edge correspondence -- a hallmark of FCI featuring an insulating bulk with conductive edges -- has not been directly observed. In fact, this correspondence has not been visualized in any system for fractional states due to experimental challenges. Here we report the imaging of FCI edge states in twisted MoTe2 by employing a newly developed modality of microwave-impedance microscopy. By tuning the carrier density, we observe the system evolving between metallic and FCI states, the latter of which exhibits insulating bulk and conductive edges as expected from bulk-boundary correspondence. We also observe the evolution of edge states across the topological phase transition from an incompressible Chern insulator state to a metal and finally to a putative charge ordered insulating state as a function of interlayer electric field. The local measurement further reveals tantalizing prospects of neighboring domains with different fractional orders. These findings pave the way for research into topologically protected 1D interfaces between various anyonic states at zero magnetic field, such as topological entanglement entropy, Halperin-Laughlin interfaces, and the creation of non-abelian anyons.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2404.06531 [pdf, other]

JADES Data Release 3 -- NIRSpec/MSA spectroscopy for 4,000 galaxies in the GOODS fields

Authors: Francesco D'Eugenio, Alex J. Cameron, Jan Scholtz, Stefano Carniani, Chris J. Willott, Emma Curtis-Lake, Andrew J. Bunker, Eleonora Parlanti, Roberto Maiolino, Christopher N. A. Willmer, Peter Jakobsen, Brant E. Robertson, Benjamin D. Johnson, Sandro Tacchella, Phillip A. Cargile, Tim Rawle, Santiago Arribas, Jacopo Chevallard, Mirko Curti, Eiichi Egami, Daniel J. Eisenstein, Nimisha Kumari, Tobias J. Looser, Marcia J. Rieke, Bruno Rodríguez Del Pino , et al. (29 additional authors not shown)

Abstract: We present the third data release of JADES, the JWST Advanced Deep Extragalactic Survey, providing both imaging and spectroscopy in the two GOODS fields. Spectroscopy consists of medium-depth and deep NIRSpec/MSA spectra of 4,000 targets, covering the spectral range 0.6-5.3 $μ$m and observed with both the low-dispersion prism (R=30-300) and all three medium-resolution gratings (R=500-1,500). We describe the observations, data reduction, sample selection, and target allocation. We measured 2,375 redshifts (2,053 from multiple emission lines); our targets span the range from z=0.5 up to z=13, including 404 at z>5. The data release includes 2-d and 1-d fully reduced spectra, with slit-loss corrections and background subtraction optimized for point sources. We also provide redshifts and S/N>5 emission-line flux catalogs for the prism and grating spectra, and concise guidelines on how to use these data products. Alongside spectroscopy, we are also publishing fully calibrated NIRCam imaging, which enables studying the JADES sample with the combined power of imaging and spectroscopy. Together, these data provide the largest statistical sample to date to characterize the properties of galaxy populations in the first billion years after the Big Bang.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 41 pages, 26 figures, 10 tables. Submitted to ApJS

arXiv:2404.06351 [pdf, other]

HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention

Authors: Xiaolong Tang, Meina Kan, Shiguang Shan, Zhilong Ji, Jinfeng Bai, Xilin Chen

Abstract: Predicting the trajectories of road agents is essential for autonomous driving systems. The recent mainstream methods follow a static paradigm, which predicts the future trajectory by using a fixed duration of historical frames. These methods make the predictions independently even at adjacent time steps, which leads to potential instability and temporal inconsistency. As successive time steps have largely overlapping historical frames, their forecasting should have intrinsic correlation, such as overlapping predicted trajectories should be consistent, or be different but share the same motion goal depending on the road situation. Motivated by this, in this work, we introduce HPNet, a novel dynamic trajectory forecasting method. Aiming for stable and accurate trajectory forecasting, our method leverages not only historical frames including maps and agent states, but also historical predictions. Specifically, we newly design a Historical Prediction Attention module to automatically encode the dynamic relationship between successive predictions. Besides, it also extends the attention range beyond the currently visible window, benefiting from the use of historical predictions. The proposed Historical Prediction Attention together with the Agent Attention and Mode Attention is further formulated as the Triple Factorized Attention module, serving as the core design of HPNet. Experiments on the Argoverse and INTERACTION datasets show that HPNet achieves state-of-the-art performance, and generates accurate and stable future trajectories. Our code is available at https://github.com/XiaolongTang23/HPNet.

Submitted 11 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: CVPR2024

arXiv:2404.04568 [pdf, other]

The moduli space of a rational map is Carathéodory hyperbolic

Authors: Zhuchao Ji, Junyi Xie

Abstract: Let $f$ be a rational map of degree $d\geq 2$. The moduli space $\mathcal{M}_f$, introduced by McMullen and Sullivan, is a complex analytic space consisting of all quasiconformal conjugacy classes of $f$. For $f$ that is not flexible Lattès, we show that there is a normal affine variety $X_f$ of dimension $2d-2$ and a holomorphic injection $i:\mathcal{M}_f\to X_f$ such that $i(\mathcal{M}_f)$ is precompact in $X_f$. In particular $\mathcal{M}_f$ is Carathéodory hyperbolic (i.e. bounded holomorphic functions separate points in $\mathcal{M}_f$), provided that $f$ is not flexible Lattès. This solves a conjecture of McMullen. When $d\geq 4$, we give a concrete construction of $X_f$ as the normalization of the Zariski closure of the image of the reciprocal multiplier spectrum morphism.

Submitted 6 April, 2024; originally announced April 2024.

Comments: 10 pages

arXiv:2404.04325 [pdf, other]

Searching for Emission Lines at $z>11$: The Role of Damped Lyman-$α$ and Hints About the Escape of Ionizing Photons

Authors: Kevin N. Hainline, Francesco D'Eugenio, Peter Jakobsen, Jacopo Chevallard, Stefano Carniani, Joris Witstok, Zhiyuan Ji, Emma Curtis-Lake, Benjamin D. Johnson, Brant Robertson, Sandro Tacchella, Mirko Curti, Stephane Charlot, Jakob M. Helton, Santiago Arribas, Rachana Bhatawdekar, Andrew J. Bunker, Alex J. Cameron, Eiichi Egami, Daniel J. Eisenstein, Ryan Hausen, Nimisha Kumari, Roberto Maiolino, Pablo G. Perez-Gonzalez, Marcia Rieke , et al. (7 additional authors not shown)

Abstract: We describe new ultra-deep James Webb Space Telescope (JWST) NIRSpec PRISM and grating spectra for the galaxies JADES-GS-z11-0 ($z_{\mathrm{spec}} = 11.122^{+0.005}_{-0.003}$) and JADES-GS-z13-0 ($z_{\mathrm{spec}} = 13.20^{+0.03}_{-0.04}$), the most distant spectroscopically-confirmed galaxy discovered in the first year of JWST observations. The extraordinary depth of these observations (75 hours and 56 hours, respectively) provides a unique opportunity to explore the redshifts, stellar properties, UV magnitudes, and slopes for these two sources. For JADES-GS-z11-0, we find evidence for multiple emission lines, including [OII]3726,3729 and [NeIII]3869, resulting in a spectroscopic redshift we determine with 94% confidence. At this spectroscopic redshift, the Lyman-$α$ break in JADES-GS-z11-0 can be fit with a damped Lyman-$α$ absorber with $\log{(N_\mathrm{HI}/\mathrm{cm}^{-2})} = 22.42^{+0.093}_{-0.120}$. We present stringent upper limits on the emission line fluxes and line equivalent widths for JADES-GS-z13-0. These results demonstrate how neutral hydrogen fraction and Lyman-damping wings may impact the recovery of spectroscopic redshifts for sources like these, providing insight into the overprediction of the photometric redshifts seen for distant galaxies observed with JWST. In addition, we analyze updated NIRCam photometry to calculate the morphological properties of these resolved sources, and find a secondary source $0.3^{\prime\prime}$ south of JADES-GS-z11-0 at a similar photometric redshift, hinting at how galaxies grow through interactions in the early Universe.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 33 pages, 19 figures, submitted to AAS Journals

arXiv:2404.02194 [pdf, other]

Resolving the nature and putative nebular emission of GS9422: an obscured AGN without exotic stars

Authors: Sandro Tacchella, William McClymont, Jan Scholtz, Roberto Maiolino, Xihan Ji, Natalia C. Villanueva, Stéphane Charlot, Francesco D'Eugenio, Jakob M. Helton, Christina C. Williams, Joris Witstok, Rachana Bhatawdekar, Stefano Carniani, Jacopo Chevallard, Mirko Curti, Kevin Hainline, Zhiyuan Ji, Benjamin D. Johnson, Joel Leja, Yijia Li, Michael V. Maseda, Dávid Puskás, Marcia Rieke, Brant Robertson, Irene Shivaei, et al. (5 additional authors not shown)

Abstract: Understanding the sources that power nebular emission in high-redshift galaxies is fundamentally important not only for shedding light onto the drivers of reionisation, but to constrain stellar populations and the growth of black holes. Here we focus on an individual object, GS9422, a galaxy at $z_{\rm spec}=5.943$ with exquisite data from the JADES and JEMS surveys, including 14-band JWST/NIRCam photometry and deep NIRSpec prism and grating spectroscopy. We map the continuum emission and nebular emission lines across the galaxy on 0.2-kpc scales. GS9422 has been claimed to have nebular-dominated continuum and an extreme stellar population with top-heavy initial mass function. We find clear evidence for different morphologies in the emission lines, the rest-UV and rest-optical continuum emission, demonstrating that the full continuum cannot be dominated by nebular emission. While multiple models reproduce the spectrum reasonably well, our preferred model with a type-2 active galactic nucleus (AGN) and local damped Ly-$α$ (DLA) clouds can explain both the spectrum and the wavelength-dependent morphology. The AGN powers the off-planar nebular emission, giving rise to the Balmer jump and the emission lines, including Ly-$α$, which therefore does not suffer DLA absorption. A central, young stellar component dominates the rest-UV emission and -- together with the DLA clouds -- leads to a spectral turn-over. A disc-like, older stellar component explains the flattened morphology in the rest-optical continuum. We conclude that GS9422 is consistent with being a normal galaxy with an obscured, type-2 AGN -- a simple scenario, without the need for exotic stellar populations.

Submitted 2 April, 2024; originally announced April 2024.

Comments: 19 pages, 13 figures, submitted to MNRAS. Comments welcome

arXiv:2404.00908 [pdf, other]

Dark photon constraints from a 7.139 GHz cavity haloscope experiment

Authors: Dong He, Jie Fan, Xin Gao, Yu Gao, Nick Houston, Zhongqing Ji, Yirong Jin, Chuang Li, Jinmian Li, Tianjun Li, Shi-hang Liu, Jia-Shu Niu, Zhihui Peng, Liang Sun, Zheng Sun, Jia Wang, Puxian Wei, Lina Wu, Zhongchen Xiang, Qiaoli Yang, Chi Zhang, Wenxing Zhang, Xin Zhang, Dongning Zheng, Ruifeng Zheng, et al. (1 additional author not shown)

Abstract: The dark photon is a promising candidate for the dark matter which comprises most of the matter in our visible Universe. Via kinetic mixing with the Standard Model it can also be resonantly converted to photons in an electromagnetic cavity, offering novel experimental possibilities for the discovery and study of dark matter. We report the results of a pathfinder dark photon dark matter cavity search experiment performed at Hunan Normal University and the Institute of Physics, Chinese Academy of Sciences, representing the first stage of the APEX (Axion and dark Photon EXperiment) program. Finding no statistically significant excess, we place an upper limit on the kinetic mixing parameter $|χ|<3.7\times 10^{-13}$ around $m_A\simeq 29.5$ $μ$eV at 90% confidence level. This result exceeds other constraints on dark photon dark matter in this frequency range by roughly an order of magnitude.

Submitted 1 April, 2024; originally announced April 2024.

Comments: 5 pages, 4 figures

arXiv:2403.20079 [pdf, other]

SGD: Street View Synthesis with Gaussian Splatting and Diffusion Prior

Authors: Zhongrui Yu, Haoran Wang, Jinze Yang, Hanzhang Wang, Zeke Xie, Yunfeng Cai, Jiale Cao, Zhong Ji, Mingming Sun

Abstract: Novel View Synthesis (NVS) for street scenes plays a critical role in autonomous driving simulation. The current mainstream technique to achieve it is neural rendering, such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS). Although thrilling progress has been made, when handling street scenes, current methods struggle to maintain rendering quality at viewpoints that deviate significantly from the training viewpoints. This issue stems from the sparse training views captured by a fixed camera on a moving vehicle. To tackle this problem, we propose a novel approach that enhances the capacity of 3DGS by leveraging the prior from a Diffusion Model along with complementary multi-modal data. Specifically, we first fine-tune a Diffusion Model by adding images from adjacent frames as a condition, while exploiting depth data from LiDAR point clouds to supply additional spatial information. Then we apply the Diffusion Model to regularize the 3DGS at unseen views during training. Experimental results validate the effectiveness of our method compared with current state-of-the-art models, and demonstrate its advantage in rendering images from broader views.

Submitted 29 March, 2024; originally announced March 2024.

arXiv:2403.17369 [pdf, other]

CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning

Authors: Ziyang Gong, Fuhao Li, Yupeng Deng, Deblina Bhattacharjee, Xianzheng Ma, Xiangwei Zhu, Zhenming Ji

Abstract: Unsupervised Domain Adaptation (UDA) aims to adapt models from labeled source domains to unlabeled target domains. When adapting to adverse scenes, existing UDA methods fail to perform well due to the lack of instructions, leading their models to overlook discrepancies within all adverse scenes. To tackle this, we propose CoDA, which instructs models to distinguish, focus, and learn from these discrepancies at scene and image levels. Specifically, CoDA consists of a Chain-of-Domain (CoD) strategy and a Severity-Aware Visual Prompt Tuning (SAVPT) mechanism. CoD focuses on scene-level instructions to divide all adverse scenes into easy and hard scenes, guiding models to adapt from source to easy domains with easy scene images, and then to hard domains with hard scene images, thereby laying a solid foundation for whole adaptations. Building upon this foundation, we employ SAVPT to dive into more detailed image-level instructions to boost performance. SAVPT features a novel metric, Severity, that divides all adverse scene images into low-severity and high-severity images. Severity then directs visual prompts and adapters, instructing models to concentrate on unified severity features instead of scene-specific features, without adding complexity to the model architecture. CoDA achieves SOTA performance on widely-used benchmarks under all adverse scenes. Notably, CoDA outperforms existing methods by 4.6% and 10.3% mIoU on the Foggy Driving and Foggy Zurich benchmarks, respectively. Our code is available at https://github.com/Cuzyoung/CoDA

Submitted 15 July, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.12486 [pdf, other]

NTK-Guided Few-Shot Class Incremental Learning

Authors: Jingren Liu, Zhong Ji, Yanwei Pang, YunLong Yu

Abstract: While anti-amnesia FSCIL learners often excel in incremental sessions, they tend to prioritize mitigating knowledge attrition over harnessing the model's potential for knowledge acquisition. In this paper, we delve into the foundations of model generalization in FSCIL through the lens of the Neural Tangent Kernel (NTK). Our primary design focus revolves around ensuring optimal NTK convergence and NTK-related generalization error, serving as the theoretical bedrock for exceptional generalization. To attain globally optimal NTK convergence, we employ a meta-learning mechanism grounded in mathematical principles to guide the optimization process within an expanded network. Furthermore, to reduce the NTK-related generalization error, we commence from the foundational level, optimizing the relevant factors constituting its generalization loss. Specifically, we initiate self-supervised pre-training on the base session to shape the initial network weights. Then they are carefully refined through curricular alignment, followed by the application of dual NTK regularization tailored specifically for both convolutional and linear layers. Through the combined effects of these measures, our network acquires robust NTK properties, significantly enhancing its foundational generalization. On popular FSCIL benchmark datasets, our NTK-FSCIL surpasses contemporary state-of-the-art approaches, elevating end-session accuracy by 2.9% to 8.7%.

Submitted 19 March, 2024; originally announced March 2024.

Showing 1–50 of 587 results for author: Ji, Z