Audio coding with unified noise shaping and phase contrast control

Authors: Byeongho Jo, Seungkwon Beack, Taejin Lee

Abstract: Over the past decade, audio coding technology has seen standardization and the development of many frameworks incorporated with linear predictive coding (LPC). As LPC reduces information in the frequency domain, LP-based frequency-domain noise-shaping (FDNS) was previously proposed. To code transient signals effectively, FDNS with temporal noise shaping (TNS) has emerged. However, these mainly operated in the modified discrete cosine transform domain, which essentially accompanies time domain aliasing. In this paper, a unified noise-shaping (UNS) framework including FDNS and complex LPC-based TNS (CTNS) in the DFT domain is proposed to overcome the aliasing issues. Additionally, a modified polar quantizer with phase contrast control is proposed, which saves phase bits depending on the frequency envelope information. The core coding feasibility at low bit rates is verified through various objective metrics and subjective listening evaluations.

Submitted 17 April, 2023; originally announced April 2023.

Comments: Submitted and accepted in ICASSP (International Conference on Acoustics, Speech, and Signal Processing) 2023

arXiv:2304.08057 [pdf, other]

Regularization of the inverse Laplace transform by Mollification

Authors: Pierre Maréchal, Faouzi Triki, Walter C. Simo Tao Lee

Abstract: In this paper we study the inverse Laplace transform. We first derive a new global logarithmic stability estimate that shows that the inversion is severely ill-posed. Then we propose a regularization method to compute the inverse Laplace transform using the concept of mollification. Taking into account the exponential instability we derive a criterion for selection of the regularization parameter. We show that by taking the optimal value of this parameter we improve significantly the convergence of the method. Finally, making use of the holomorphic extension of the Laplace transform, we suggest a new PDEs based numerical method for the computation of the solution. The effectiveness of the proposed regularization method is demonstrated through several numerical examples.

Submitted 17 April, 2023; originally announced April 2023.

Comments: 22 pages, 10 figures

MSC Class: 35R30; 65N21 ACM Class: G.1.0; G.1.9

arXiv:2304.07614 [pdf, ps, other]

An eigenvalue problem for prescribed curvature equations

Authors: Taehun Lee

Abstract: We study an eigenvalue problem for prescribed $σ_k$-curvature equations of star-shaped, $k$-convex, closed hypersurfaces. We establish the existence of a unique eigenvalue and its associated hypersurface, which is also unique, provided that the given data is even. Moreover, we show that the hypersurface must be strictly convex. A crucial aspect of our proof involves deriving uniform estimates in $p$ for $L_p$-type prescribed curvature equations.

Submitted 22 September, 2023; v1 submitted 15 April, 2023; originally announced April 2023.

Comments: To appear in IMRN

MSC Class: 53C42 (Primary) 35J60 35P30 (Secondary)

arXiv:2304.05215 [pdf, other]

A Billion-scale Foundation Model for Remote Sensing Images

Authors: Keumgang Cha, Junghoon Seo, Taekyung Lee

Abstract: As the potential of foundation models in visual tasks has garnered significant attention, pretraining these models before downstream tasks has become a crucial step. The three key factors in pretraining foundation models are the pretraining method, the size of the pretraining dataset, and the number of model parameters. Recently, research in the remote sensing field has focused primarily on the pretraining method and the size of the dataset, with limited emphasis on the number of model parameters. This paper addresses this gap by examining the effect of increasing the number of model parameters on the performance of foundation models in downstream tasks such as rotated object detection and semantic segmentation. We pretrained foundation models with varying numbers of parameters, including 86M, 605.26M, 1.3B, and 2.4B, to determine whether performance in downstream tasks improved with an increase in parameters. To the best of our knowledge, this is the first billion-scale foundation model in the remote sensing field. Furthermore, we propose an effective method for scaling up and fine-tuning a vision transformer in the remote sensing field. To evaluate general performance in downstream tasks, we employed the DOTA v2.0 and DIOR-R benchmark datasets for rotated object detection, and the Potsdam and LoveDA datasets for semantic segmentation. Experimental results demonstrated that, across all benchmark datasets and downstream tasks, the performance of the foundation models and data efficiency improved as the number of parameters increased. Moreover, our models achieve the state-of-the-art performance on several datasets including DIOR-R, Potsdam, and LoveDA.

Submitted 14 May, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: This manuscript is the accepted version for IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (IEEE J-STARS)

arXiv:2304.03426 [pdf, ps, other]

Convex Minimization with Integer Minima in $\widetilde O(n^4)$ Time

Authors: Haotian Jiang, Yin Tat Lee, Zhao Song, Lichen Zhang

Abstract: Given a convex function $f$ on $\mathbb{R}^n$ with an integer minimizer, we show how to find an exact minimizer of $f$ using $O(n^2 \log n)$ calls to a separation oracle and $O(n^4 \log n)$ time. The previous best polynomial time algorithm for this problem given in [Jiang, SODA 2021, JACM 2022] achieves $O(n^2\log\log n/\log n)$ oracle complexity. However, the overall runtime of Jiang's algorithm is at least $\widetildeΩ(n^8)$, due to expensive sub-routines such as the Lenstra-Lenstra-Lovász (LLL) algorithm [Lenstra, Lenstra, Lovász, Math. Ann. 1982] and random walk based cutting plane method [Bertsimas, Vempala, JACM 2004]. Our significant speedup is obtained by a nontrivial combination of a faster version of the LLL algorithm due to [Neumaier, Stehlé, ISSAC 2016] that gives similar guarantees, the volumetric center cutting plane method (CPM) by [Vaidya, FOCS 1989] and its fast implementation given in [Jiang, Lee, Song, Wong, STOC 2020]. For the special case of submodular function minimization (SFM), our result implies a strongly polynomial time algorithm for this problem using $O(n^3 \log n)$ calls to an evaluation oracle and $O(n^4 \log n)$ additional arithmetic operations. Both the oracle complexity and the number of arithmetic operations of our more general algorithm are better than the previous best-known runtime algorithms for this specific problem given in [Lee, Sidford, Wong, FOCS 2015] and [Dadush, Végh, Zambelli, SODA 2018, MOR 2021].

Submitted 14 November, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

Comments: SODA 2024

arXiv:2304.01934 [pdf, other]

doi 10.1103/PhysRevB.108.205146

Effect of Off-Diagonal Elements in Wannier Hamiltonian on DFT+DMFT for low-symmetry material: Study of Li$_2$MnO$_3$

Authors: Alex Taekyung Lee, Hyowon Park, Anh T. Ngo

Abstract: We study the effect of the off-diagonal elements of the Wannier Hamiltonian on the electronic structure of low-symmetry material Li$_2$MnO$_3$ ($C2/m$), using dynamical mean field theory calculations with continuous-time Quantum Monte Carlo impurity solver. Presence of significant off-diagonal elements leads to a pronounced suppression of the energy gap. The off-diagonal elements are largest when the Wannier projection is used based on the global coordinate, and they remain substantial even with the projection using the local coordinate close to the direction of Mn-O bonds. We show that the energy gap is enhanced by the diagonalization of the Mn $d$ block in the full $p$-$d$ Hamiltonian, with applying unitary rotation matrix. Additionally, the inclusion of a small double counting energy is crucial for achieving the experimental gap by reducing $p$-$d$ hybridization. Furthermore, we establish the efficiency of a low-energy ($d$-only basis) model for studying the electronic structure of Li$_2$MnO$_3$, as the Wannier basis represents a hybridized state of Mn $d$ and O $p$ orbitals. These findings suggest an appropriate new approach for investigating low-symmetry materials using the DFT+DMFT method. To the best of our knowledge, no systematic study of the effect of off-diagonal terms has been conducted thus far. We also find that the antiferromagnetic ground state $Γ_{2u}$ is stable with $U \leq 2$ eV within density functional theory+$U$ calculations, which is much smaller than widely used $U$=5 eV.

Submitted 27 November, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

Comments: 13 pages, 10 figures

Journal ref: Physical Review B 108, 205146 (2023)

arXiv:2303.17953 [pdf, ps, other]

doi 10.1103/PhysRevB.107.115127

Superconducting topological Dirac semimetals: $P6/m$-Si$_6$ and $P6/m$-NaSi$_6$

Authors: Alex Takyung Lee, Kyungwha Park, In-Ho Lee

Abstract: We theoretically propose that hexagonal silicon-based crystals, $P6/m$-Si$_6$ and $P6/m$-NaSi$_6$, are topological Dirac semimetals with superconducting critical temperatures of 12 K and 13 K, respectively, at ambient pressure. Band inversion occurs with the Fu-Kane topological invariant $\mathbb{Z}_2=1$, even in the absence of spin-orbit coupling. The Dirac nodes protected by $C_6$ crystal rotational symmetry remain gapless with spin-orbit coupling. Using first-principles calculations, we find pressure-induced topological phase transitions for $P6/m$-Si$_6$ and $P6/m$-NaSi$_6$ with critical external pressures of 11.5 GPa and 14.9 GPa, respectively. Above the critical pressures, the Dirac bands are gapped with $\mathbb{Z}_2=0$, while the superconducting states and the crystal symmetries are retained. Our results may shed light into a search for silicon-based topological materials with superconductivity.

Submitted 31 March, 2023; originally announced March 2023.

Comments: 12 pages, 11 figures

arXiv:2303.16730 [pdf, other]

TTA-COPE: Test-Time Adaptation for Category-Level Object Pose Estimation

Authors: Taeyeop Lee, Jonathan Tremblay, Valts Blukis, Bowen Wen, Byeong-Uk Lee, Inkyu Shin, Stan Birchfield, In So Kweon, Kuk-Jin Yoon

Abstract: Test-time adaptation methods have been gaining attention recently as a practical solution for addressing source-to-target domain gaps by gradually updating the model without requiring labels on the target data. In this paper, we propose a method of test-time adaptation for category-level object pose estimation called TTA-COPE. We design a pose ensemble approach with a self-training loss using pose-aware confidence. Unlike previous unsupervised domain adaptation methods for category-level object pose estimation, our approach processes the test data in a sequential, online manner, and it does not require access to the source domain at runtime. Extensive experimental results demonstrate that the proposed pose ensemble and the self-training loss improve category-level object pose performance during test time under both semi-supervised and unsupervised settings. Project page: https://taeyeop.com/ttacope

Submitted 29 March, 2023; originally announced March 2023.

Comments: Accepted to CVPR 2023, Project page: https://taeyeop.com/ttacope

arXiv:2303.15060 [pdf, other]

TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering

Authors: Jaehoon Choi, Dongki Jung, Taejae Lee, Sangwook Kim, Youngdong Jung, Dinesh Manocha, Donghwan Lee

Abstract: We present a new pipeline for acquiring a textured mesh in the wild with a single smartphone which offers access to images, depth maps, and valid poses. Our method first introduces an RGBD-aided structure from motion, which can yield filtered depth maps and refines camera poses guided by corresponding depth. Then, we adopt the neural implicit surface reconstruction method, which allows for high-quality mesh and develops a new training process for applying a regularization provided by classical multi-view stereo methods. Moreover, we apply a differentiable rendering to fine-tune incomplete texture maps and generate textures which are perceptually closer to the original scene. Our pipeline can be applied to any common objects in the real world without the need for either in-the-lab environments or accurate mask images. We demonstrate results of captured objects with complex shapes and validate our method numerically against existing 3D reconstruction and texture mapping methods.

Submitted 27 March, 2023; originally announced March 2023.

Comments: Accepted to CVPR23. Project Page: https://jh-choi.github.io/TMO/

arXiv:2303.14323 [pdf]

Surface termination control of charge transfer and band alignment across a semiconductor-crystalline oxide heterojunction

Authors: M. Chrysler, J. Gabel, T. -L. Lee, Z. Zhu, T. C. Kaspar, P. V. Sushko, S. A. Chambers, J. H. Ngai

Abstract: Charge redistribution across heterojunctions has long been utilized to induce functional response in materials systems. Here we examine how the composition of the terminating surface affects charge transfer across a heterojunction consisting of Si and the crystalline complex oxide SrTiO3. Itinerant electrons in Si migrate across the interface toward the surface of SrTiO3 due to surface depletion. The electron transfer in turn creates an electric field across the interface that modifies the interfacial dipole associated with bonding between SrTiO3 and Si, leading to a change in the band alignment from type-II to type-III. By capping the SrTiO3 surface with ultra-thin layers of BaO, SrO or TiO2, charge transfer across the interface can be weakened or inhibited. Ab initio modeling implicates the adsorption of oxygen as driving surface depletion in SrTiO3. The electronic coupling between the surface and buried interface expands the functionality of semiconductor-crystalline oxide heterojunctions.

Submitted 24 March, 2023; originally announced March 2023.

Comments: 17 pages, 5 figures

arXiv:2303.13285 [pdf, other]

Fourier Diffusion Models: A Method to Control MTF and NPS in Score-Based Stochastic Image Generation

Authors: Matthew Tivnan, Jacopo Teneggi, Tzu-Cheng Lee, Ruoqiao Zhang, Kirsten Boedeker, Liang Cai, Grace J. Gang, Jeremias Sulam, J. Webster Stayman

Abstract: Score-based stochastic denoising models have recently been demonstrated as powerful machine learning tools for conditional and unconditional image generation. The existing methods are based on a forward stochastic process wherein the training images are scaled to zero over time and white noise is gradually added such that the final time step is approximately zero-mean identity-covariance Gaussian noise. A neural network is then trained to approximate the time-dependent score function, or the gradient of the logarithm of the probability density, for that time step. Using this score estimator, it is possible to run an approximation of the time-reversed stochastic process to sample new images from the training data distribution. These score-based generative models have been shown to out-perform generative adversarial neural networks using standard benchmarks and metrics. However, one issue with this approach is that it requires a large number of forward passes of the neural network. Additionally, the images at intermediate time steps are not useful, since the signal-to-noise ratio is low. In this work we present a new method called Fourier Diffusion Models which replaces the scalar operations of the forward process with shift-invariant convolutions and the additive white noise with additive stationary noise. This allows for control of MTF and NPS at intermediate time steps. Additionally, the forward process can be crafted to converge to the same MTF and NPS as the measured images. This way, we can model continuous probability flow from true images to measurements. In this way, the sample time can be used to control the tradeoffs between measurement uncertainty and generative uncertainty of posterior estimates. We compare Fourier diffusion models to existing scalar diffusion models and show that they achieve a higher level of performance and allow for a smaller number of time steps.

Submitted 23 March, 2023; originally announced March 2023.

arXiv:2303.12712 [pdf, other]

Sparks of Artificial General Intelligence: Early experiments with GPT-4

Authors: Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang

Abstract: Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an early version of GPT-4, when it was still in active development by OpenAI. We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models. We discuss the rising capabilities and implications of these models. We demonstrate that, beyond its mastery of language, GPT-4 can solve novel and difficult tasks that span mathematics, coding, vision, medicine, law, psychology and more, without needing any special prompting. Moreover, in all of these tasks, GPT-4's performance is strikingly close to human-level performance, and often vastly surpasses prior models such as ChatGPT. Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system. In our exploration of GPT-4, we put special emphasis on discovering its limitations, and we discuss the challenges ahead for advancing towards deeper and more comprehensive versions of AGI, including the possible need for pursuing a new paradigm that moves beyond next-word prediction. We conclude with reflections on societal influences of the recent technological leap and future research directions.

Submitted 13 April, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

arXiv:2303.12710 [pdf, other]

A Unified Arbitrary Style Transfer Framework via Adaptive Contrastive Learning

Authors: Yuxin Zhang, Fan Tang, Weiming Dong, Haibin Huang, Chongyang Ma, Tong-Yee Lee, Changsheng Xu

Abstract: We present Unified Contrastive Arbitrary Style Transfer (UCAST), a novel style representation learning and transfer framework, which can fit in most existing arbitrary image style transfer models, e.g., CNN-based, ViT-based, and flow-based methods. As the key component in image style transfer tasks, a suitable style representation is essential to achieve satisfactory results. Existing approaches based on deep neural network typically use second-order statistics to generate the output. However, these hand-crafted features computed from a single image cannot leverage style information sufficiently, which leads to artifacts such as local distortions and style inconsistency. To address these issues, we propose to learn style representation directly from a large amount of images based on contrastive learning, by taking the relationships between specific styles and the holistic style distribution into account. Specifically, we present an adaptive contrastive learning scheme for style transfer by introducing an input-dependent temperature. Our framework consists of three key components, i.e., a parallel contrastive learning scheme for style representation and style transfer, a domain enhancement module for effective learning of style distribution, and a generative network for style transfer. We carry out qualitative and quantitative evaluations to show that our approach produces superior results than those obtained via state-of-the-art methods.

Submitted 23 March, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2205.09542

arXiv:2303.11137 [pdf, other]

AnimeDiffusion: Anime Face Line Drawing Colorization via Diffusion Models

Authors: Yu Cao, Xiangqiao Meng, P. Y. Mok, Xueting Liu, Tong-Yee Lee, Ping Li

Abstract: It is a time-consuming and tedious work for manually colorizing anime line drawing images, which is an essential stage in cartoon animation creation pipeline. Reference-based line drawing colorization is a challenging task that relies on the precise cross-domain long-range dependency modelling between the line drawing and reference image. Existing learning methods still utilize generative adversarial networks (GANs) as one key module of their model architecture. In this paper, we propose a novel method called AnimeDiffusion using diffusion models that performs anime face line drawing colorization automatically. To the best of our knowledge, this is the first diffusion model tailored for anime content creation. In order to solve the huge training consumption problem of diffusion models, we design a hybrid training strategy, first pre-training a diffusion model with classifier-free guidance and then fine-tuning it with image reconstruction guidance. We find that with a few iterations of fine-tuning, the model shows wonderful colorization performance, as illustrated in Fig. 1. For training AnimeDiffusion, we conduct an anime face line drawing colorization benchmark dataset, which contains 31696 training data and 579 testing data. We hope this dataset can fill the gap of no available high resolution anime face dataset for colorization method evaluation. Through multiple quantitative metrics evaluated on our dataset and a user study, we demonstrate AnimeDiffusion outperforms state-of-the-art GANs-based models for anime face line drawing colorization. We also collaborate with professional artists to test and apply our AnimeDiffusion for their creation work. We release our code on https://github.com/xq-meng/AnimeDiffusion.

Submitted 20 March, 2023; originally announced March 2023.

arXiv:2303.10273 [pdf, other]

doi 10.1364/AO.478048

Broadband plasma spray anti-reflection coating technology for millimeter-wave astrophysics

Authors: Oliver Jeong, Richard Plambeck, Christopher Raum, Aritoki Suzuki, Adrian T. Lee

Abstract: We present a broadband plasma spray anti-reflection (AR) coating technology for millimeter-wave astrophysics experiments with large-format, cryogenic optics. By plasma spraying alumina- and silica-based powders, we have produced coatings of tunable index of refraction and thickness, low loss, and coefficient of thermal expansion matched to alumina substrates. We demonstrate two-layer AR coatings on alumina with reflection below 5% over 82% and 69% fractional bandwidths for 90/150 and 220/280 GHz passband designs, respectively, and band-averaged absorption loss reduced to ~1% at 100 K for both AR coatings. We describe the design, tolerance, fabrication process, and optical measurements of these AR coatings.

Submitted 21 March, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

Journal ref: Appl. Opt. 62, 1628-1634 (2023)

arXiv:2303.08774 [pdf, other]

GPT-4 Technical Report

Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko , et al. (256 additional authors not shown)

Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.

Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: 100 pages; updated authors list; fixed author names and added citation

arXiv:2303.08410 [pdf, other]

doi 10.1103/PhysRevD.108.043017

Constraints on axion-like polarization oscillations in the cosmic microwave background with POLARBEAR

Authors: The POLARBEAR Collaboration, Shunsuke Adachi, Tylor Adkins, Kam Arnold, Carlo Baccigalupi, Darcy Barron, Kolen Cheung, Yuji Chinone, Kevin T. Crowley, Josquin Errard, Giulio Fabbian, Chang Feng, Raphael Flauger, Takuro Fujino, Daniel Green, Masaya Hasegawa, Masashi Hazumi, Daisuke Kaneko, Nobuhiko Katayama, Brian Keating, Akito Kusaka, Adrian T. Lee, Yuto Minami, Haruki Nishino, Christian L. Reichardt , et al. (7 additional authors not shown)

Abstract: Very light pseudoscalar fields, often referred to as axions, are compelling dark matter candidates and can potentially be detected through their coupling to the electromagnetic field. Recently a novel detection technique using the cosmic microwave background (CMB) was proposed, which relies on the fact that the axion field oscillates at a frequency equal to its mass in appropriate units, leading to a time-dependent birefringence. For appropriate oscillation periods this allows the axion field at the telescope to be detected via the induced sinusoidal oscillation of the CMB linear polarization. We search for this effect in two years of POLARBEAR data. We do not detect a signal, and place a median $95 \%$ upper limit of $0.65 ^\circ$ on the sinusoid amplitude for oscillation frequencies between $0.02\,\text{days}^{-1}$ and $0.45\,\text{days}^{-1}$, which corresponds to axion masses between $9.6 \times 10^{-22} \, \text{eV}$ and $2.2\times 10^{-20} \,\text{eV}$. Under the assumptions that 1) the axion constitutes all the dark matter and 2) the axion field amplitude is a Rayleigh-distributed stochastic variable, this translates to a limit on the axion-photon coupling $g_{φγ} < 2.4 \times 10^{-11} \,\text{GeV}^{-1} \times ({m_φ}/{10^{-21} \, \text{eV}})$.

Submitted 1 September, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: 17 pages, 5 figures, 2 tables. Published in Physical Review D

Journal ref: Phys. Rev. D 108, 043017 (2023)

arXiv:2303.07872 [pdf, other]

Object-based SLAM utilizing unambiguous pose parameters considering general symmetry types

Authors: Taekbeom Lee, Youngseok Jang, H. Jin Kim

Abstract: Existence of symmetric objects, whose observation at different viewpoints can be identical, can deteriorate the performance of simultaneous localization and mapping (SLAM). This work proposes a system for robustly optimizing the pose of cameras and objects even in the presence of symmetric objects. We classify objects into three categories depending on their symmetry characteristics, which is efficient and effective in that it allows us to deal with general objects, and the objects in the same category can be associated with the same type of ambiguity. Then we extract only the unambiguous parameters corresponding to each category and use them in data association and joint optimization of the camera and object pose. The proposed approach provides significant robustness to the SLAM performance by removing the ambiguous parameters and utilizing as much useful geometric information as possible. Comparison with baseline algorithms confirms the superior performance of the proposed system in terms of object tracking and pose estimation, even in challenging scenarios where the baseline fails.

Submitted 12 March, 2023; originally announced March 2023.

Comments: This paper has been accepted to ICRA 2023

arXiv:2303.07053 [pdf, other]

Bandit-supported care planning for older people with complex health and care needs

Authors: Gi-Soo Kim, Young Suh Hong, Tae Hoon Lee, Myunghee Cho Paik, Hongsoo Kim

Abstract: Long-term care service for old people is in great demand in most of the aging societies. The number of nursing home residents is increasing while the number of care providers is limited. Due to the care worker shortage, care to vulnerable older residents cannot be fully tailored to the unique needs and preference of each individual. This may bring negative impacts on health outcomes and quality of life among institutionalized older people. To improve care quality through personalized care planning and delivery with limited care workforce, we propose a new care planning model assisted by artificial intelligence. We apply bandit algorithms which optimize the clinical decision for care planning by adapting to the sequential feedback from the past decisions. We evaluate the proposed model on empirical data acquired from the Systems for Person-centered Elder Care (SPEC) study, an ICT-enhanced care management program.

Submitted 13 March, 2023; originally announced March 2023.

arXiv:2303.06274 [pdf]

CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting

Authors: Simon Graham, Quoc Dang Vu, Mostafa Jahanifar, Martin Weigert, Uwe Schmidt, Wenhua Zhang, Jun Zhang, Sen Yang, Jinxi Xiang, Xiyue Wang, Josef Lorenz Rumberger, Elias Baumann, Peter Hirsch, Lihao Liu, Chenyang Hong, Angelica I. Aviles-Rivero, Ayushi Jain, Heeyoung Ahn, Yiyu Hong, Hussam Azzuni, Min Xu, Mohammad Yaqub, Marie-Claire Blache, Benoît Piégu, Bertrand Vernay, et al. (64 additional authors not shown)

Abstract: Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we set up a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of reproducible algorithms for cellular recognition with real-time result inspection on public leaderboards. We conducted an extensive post-challenge analysis based on the top-performing models using 1,658 whole-slide images of colon tissue. With around 700 million detected nuclei per model, associated features were used for dysplasia grading and survival analysis, where we demonstrated that the challenge's improvement over the previous state-of-the-art led to significant boosts in downstream performance. Our findings also suggest that eosinophils and neutrophils play an important role in the tumour microenvironment. We release challenge models and WSI-level results to foster the development of further methods for biomarker discovery.

Submitted 14 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

arXiv:2303.05370 [pdf, other]

Rethinking Self-Supervised Visual Representation Learning in Pre-training for 3D Human Pose and Shape Estimation

Authors: Hongsuk Choi, Hyeongjin Nam, Taeryung Lee, Gyeongsik Moon, Kyoung Mu Lee

Abstract: Recently, a few self-supervised representation learning (SSL) methods have outperformed the ImageNet classification pre-training for vision tasks such as object detection. However, its effects on 3D human body pose and shape estimation (3DHPSE) are open to question, whose target is fixed to a unique class, the human, and has an inherent task gap with SSL. We empirically study and analyze the effects of SSL and further compare it with other pre-training alternatives for 3DHPSE. The alternatives are 2D annotation-based pre-training and synthetic data pre-training, which share the motivation of SSL that aims to reduce the labeling cost. They have been widely utilized as a source of weak-supervision or fine-tuning, but have not been remarked as a pre-training source. SSL methods underperform the conventional ImageNet classification pre-training on multiple 3DHPSE benchmarks by 7.7% on average. In contrast, despite a much smaller amount of pre-training data, the 2D annotation-based pre-training improves accuracy on all benchmarks and shows faster convergence during fine-tuning. Our observations challenge the naive application of the current SSL pre-training to 3DHPSE and relight the value of other data types in the pre-training aspect.

Submitted 9 March, 2023; originally announced March 2023.

Comments: Accepted to ICLR 2023, 18 pages including the appendix

arXiv:2302.14749 [pdf, other]

doi 10.3847/2041-8213/acbf45

Simultaneous Millimeter-wave, Gamma-ray, and Optical Monitoring of the Blazar PKS 2326-502 During a Flaring State

Authors: J. C. Hood II, A. Simpson, A. McDaniel, A. Foster, P. A. R. Ade, M. Ajello, A. J. Anderson, J. E. Austermann, J. A. Beall, A. N. Bender, B. A. Benson, F. Bianchini, L. E. Bleem, J. E. Carlstrom, C. L. Chang, P. Chaubal, H. C. Chiang, T-L. Chou, R. Citron, C. Corbett Moran, T. M. Crawford, A. T. Crites, T. de Haan, M. A. Dobbs, W. Everett, et al. (44 additional authors not shown)

Abstract: Including millimeter-wave (mm-wave) data in multi-wavelength studies of the variability of active galactic nuclei (AGN) can provide insights into AGN physics that are not easily accessible at other wavelengths. We demonstrate in this work the potential of cosmic microwave background (CMB) telescopes to provide long-term, high-cadence mm-wave AGN monitoring over large fractions of sky. We report on a pilot study using data from the SPTpol instrument on the South Pole Telescope (SPT), which was designed to observe the CMB at arcminute and larger angular scales. Between 2013 and 2016, SPTpol was used primarily to observe a single 500 deg^2 field, covering the entire field several times per day with detectors sensitive to radiation in bands centered at 95 and 150 GHz. We use SPT 150 GHz observations to create AGN light curves, and we compare these mm-wave light curves to those at other wavelengths, in particular gamma-ray and optical. In this Letter, we focus on a single source, PKS 2326-502, which has extensive, day-timescale monitoring data in gamma-ray, optical, and now mm-wave between 2013 and 2016. We find PKS 2326-502 to be in a flaring state in the first two years of this monitoring, and we present a search for evidence of correlated variability between mm-wave, optical R band, and gamma-ray observations. This pilot study is paving the way for AGN monitoring with current and upcoming CMB experiments such as SPT-3G, Simons Observatory, and CMB-S4, including multi-wavelength studies with facilities such as VRO-LSST.

Submitted 28 February, 2023; originally announced February 2023.

Comments: 9 pages, 3 figures, accepted to Astrophysical Journal Letters

arXiv:2302.11797 [pdf, other]

Region-Aware Diffusion for Zero-shot Text-driven Image Editing

Authors: Nisha Huang, Fan Tang, Weiming Dong, Tong-Yee Lee, Changsheng Xu

Abstract: Image manipulation under the guidance of textual descriptions has recently received a broad range of attention. In this study, we focus on the regional editing of images with the guidance of given text prompts. Different from current mask-based image editing methods, we propose a novel region-aware diffusion model (RDM) for entity-level image editing, which could automatically locate the region of interest and replace it following given text prompts. To strike a balance between image fidelity and inference speed, we design the intensive diffusion pipeline by combining latent space diffusion and enhanced directional guidance. In addition, to preserve image content in non-edited regions, we introduce regional-aware entity editing to modify the region of interest and preserve the out-of-interest region. We validate the proposed RDM beyond the baseline methods through extensive qualitative and quantitative experiments. The results show that RDM outperforms the previous approaches in terms of visual quality, overall harmonization, non-editing region content preservation, and text-image semantic consistency. The codes are available at https://github.com/haha-lisa/RDM-Region-Aware-Diffusion-Model.

Submitted 23 February, 2023; originally announced February 2023.

arXiv:2302.10879 [pdf, other]

$k$NN-Adapter: Efficient Domain Adaptation for Black-Box Language Models

Authors: Yangsibo Huang, Daogao Liu, Zexuan Zhong, Weijia Shi, Yin Tat Lee

Abstract: Fine-tuning a language model on a new domain is standard practice for domain adaptation. However, it can be infeasible when it comes to modern large-scale language models such as GPT-3, which can only be accessed through APIs, making it difficult to access the internal parameters of the model. In this paper, we propose $k$NN-Adapter, a method to effectively adapt these black-box large language models (LLMs) to a new domain. The $k$NN-Adapter builds on top of the retrieval-augmented language model, and adaptively learns to interpolate the output of the language model with retrieval results from a datastore consisting of the target domain data. Our experiments on four different domains demonstrate that $k$NN-Adapter significantly improves perplexity, and works particularly well in settings with limited access to LLMs. Additionally, we show that $k$NN-Adapter is more effective than fine-tuning when the amount of training data is limited. We also release a dataset to encourage further study.

Submitted 21 February, 2023; originally announced February 2023.

arXiv:2302.10501 [pdf, other]

Few-Shot Point Cloud Semantic Segmentation via Contrastive Self-Supervision and Multi-Resolution Attention

Authors: Jiahui Wang, Haiyue Zhu, Haoren Guo, Abdullah Al Mamun, Cheng Xiang, Tong Heng Lee

Abstract: This paper presents an effective few-shot point cloud semantic segmentation approach for real-world applications. Existing few-shot segmentation methods on point cloud heavily rely on the fully-supervised pretrain with large annotated datasets, which causes the learned feature extraction bias to those pretrained classes. However, as the purpose of few-shot learning is to handle unknown/unseen classes, such class-specific feature extraction in pretrain is not ideal to generalize into new classes for few-shot learning. Moreover, point cloud datasets hardly have a large number of classes due to the annotation difficulty. To address these issues, we propose a contrastive self-supervision framework for few-shot learning pretrain, which aims to eliminate the feature extraction bias through class-agnostic contrastive supervision. Specifically, we implement a novel contrastive learning approach with a learnable augmentor for a 3D point cloud to achieve point-wise differentiation, so as to enhance the pretrain with managed overfitting through the self-supervision. Furthermore, we develop a multi-resolution attention module using both the nearest and farthest points to extract the local and global point information more effectively, and a center-concentrated multi-prototype is adopted to mitigate the intra-class sparsity. Comprehensive experiments are conducted to evaluate the proposed approach, which shows our approach achieves state-of-the-art performance. Moreover, a case study on practical CAM/CAD segmentation is presented to demonstrate the effectiveness of our approach for real-world applications.

Submitted 21 February, 2023; originally announced February 2023.

Comments: ICRA 2023

arXiv:2302.10444 [pdf, other]

Leveraging phone-level linguistic-acoustic similarity for utterance-level pronunciation scoring

Authors: Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

Abstract: Recent studies on pronunciation scoring have explored the effect of introducing phone embeddings as reference pronunciation, but mostly in an implicit manner, i.e., addition or concatenation of reference phone embedding and actual pronunciation of the target phone as the phone-level pronunciation quality representation. In this paper, we propose to use linguistic-acoustic similarity to explicitly measure the deviation of non-native production from its native reference for pronunciation assessment. Specifically, the deviation is first estimated by the cosine similarity between reference phone embedding and corresponding acoustic embedding. Next, a phone-level Goodness of pronunciation (GOP) pre-training stage is introduced to guide this similarity-based learning for better initialization of the aforementioned two embeddings. Finally, a transformer-based hierarchical pronunciation scorer is used to map a sequence of phone embeddings, acoustic embeddings along with their similarity measures to predict the final utterance-level score. Experimental results on the non-native databases suggest that the proposed system significantly outperforms the baselines, where the acoustic and phone embeddings are simply added or concatenated. A further examination shows that the phone embeddings learned in the proposed approach are able to capture linguistic-acoustic attributes of native pronunciation as reference.

Submitted 13 March, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

Comments: Accepted by ICASSP 2023

arXiv:2302.09928 [pdf, other]

An ASR-free Fluency Scoring Approach with Self-Supervised Learning

Authors: Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

Abstract: A typical fluency scoring system generally relies on an automatic speech recognition (ASR) system to obtain time stamps in input speech for either the subsequent calculation of fluency-related features or directly modeling speech fluency with an end-to-end approach. This paper describes a novel ASR-free approach for automatic fluency assessment using self-supervised learning (SSL). Specifically, wav2vec2.0 is used to extract frame-level speech features, followed by K-means clustering to assign a pseudo label (cluster index) to each frame. A BLSTM-based model is trained to predict an utterance-level fluency score from frame-level SSL features and the corresponding cluster indexes. Neither speech transcription nor time stamp information is required in the proposed system. It is ASR-free and can potentially avoid the effect of ASR errors in practice. Experimental results carried out on non-native English databases show that the proposed approach significantly improves the performance in the "open response" scenario as compared to previous methods and matches the recently reported performance in the "read aloud" scenario.

Submitted 13 March, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

Comments: Accepted by ICASSP 2023

arXiv:2302.09440 [pdf, other]

Online Continuous Hyperparameter Optimization for Generalized Linear Contextual Bandits

Authors: Yue Kang, Cho-Jui Hsieh, Thomas C. M. Lee

Abstract: In stochastic contextual bandits, an agent sequentially makes actions from a time-dependent action set based on past experience to minimize the cumulative regret. Like many other machine learning algorithms, the performance of bandits heavily depends on the values of hyperparameters, and theoretically derived parameter values may lead to unsatisfactory results in practice. Moreover, it is infeasible to use offline tuning methods like cross-validation to choose hyperparameters under the bandit environment, as the decisions should be made in real-time. To address this challenge, we propose the first online continuous hyperparameter tuning framework for contextual bandits to learn the optimal parameter configuration in practice within a search space on the fly. Specifically, we use a double-layer bandit framework named CDT (Continuous Dynamic Tuning) and formulate the hyperparameter optimization as a non-stationary continuum-armed bandit, where each arm represents a combination of hyperparameters, and the corresponding reward is the algorithmic result. For the top layer, we propose the Zooming TS algorithm that utilizes Thompson Sampling (TS) for exploration and a restart technique to get around the \textit{switching} environment. The proposed CDT framework can be easily utilized to tune contextual bandit algorithms without any pre-specified candidate set for multiple hyperparameters. We further show that it could achieve a sublinear regret in theory and performs consistently better than all existing methods on both synthetic and real datasets.

Submitted 8 April, 2024; v1 submitted 18 February, 2023; originally announced February 2023.

Comments: Published in Transactions on Machine Learning Research (TMLR)

arXiv:2302.08430 [pdf, ps, other]

Twisted GKZ hypergeometric functions and relative cohomology

Authors: Tsung-Ju Lee, Dingxin Zhang

Abstract: We investigate the GKZ $A$-hypergeometric $\mathscr{D}$-modules, introduced by Gel'fand, Kapranov, and Zelevinskii, arising from cyclic covers of toric varieties and find its Riemann--Hilbert partner. This extends our earlier results in arXiv:1902.01536.

Submitted 16 February, 2023; originally announced February 2023.

Comments: 23 pages. Comments are welcome!

arXiv:2302.06085 [pdf, ps, other]

Algorithmic Aspects of the Log-Laplace Transform and a Non-Euclidean Proximal Sampler

Authors: Sivakanth Gopi, Yin Tat Lee, Daogao Liu, Ruoqi Shen, Kevin Tian

Abstract: The development of efficient sampling algorithms catering to non-Euclidean geometries has been a challenging endeavor, as discretization techniques which succeed in the Euclidean setting do not readily carry over to more general settings. We develop a non-Euclidean analog of the recent proximal sampler of [LST21], which naturally induces regularization by an object known as the log-Laplace transform (LLT) of a density. We prove new mathematical properties (with an algorithmic flavor) of the LLT, such as strong convexity-smoothness duality and an isoperimetric inequality, which are used to prove a mixing time on our proximal sampler matching [LST21] under a warm start. As our main application, we show our warm-started sampler improves the value oracle complexity of differentially private convex optimization in $\ell_p$ and Schatten-$p$ norms for $p \in [1, 2]$ to match the Euclidean setting [GLL22], while retaining state-of-the-art excess risk bounds [GLLST23]. We find our investigation of the LLT to be a promising proof-of-concept of its utility as a tool for designing samplers, and outline directions for future exploration.

Submitted 22 February, 2023; v1 submitted 12 February, 2023; originally announced February 2023.

Comments: Comments welcome! v2 improves constant in duality result, adds citations

arXiv:2302.05228 [pdf, other]

doi 10.1051/0004-6361/202346155

Tensor-to-scalar ratio forecasts for extended LiteBIRD frequency configurations

Authors: U. Fuskeland, J. Aumont, R. Aurlien, C. Baccigalupi, A. J. Banday, H. K. Eriksen, J. Errard, R. T. Génova-Santos, T. Hasebe, J. Hubmayr, H. Imada, N. Krachmalnicoff, L. Lamagna, G. Pisano, D. Poletti, M. Remazeilles, K. L. Thompson, L. Vacher, I. K. Wehus, S. Azzoni, M. Ballardini, R. B. Barreiro, N. Bartolo, A. Basyrov, D. Beck, et al. (92 additional authors not shown)

Abstract: LiteBIRD is a planned JAXA-led CMB B-mode satellite experiment aiming for launch in the late 2020s, with a primary goal of detecting the imprint of primordial inflationary gravitational waves. Its current baseline focal-plane configuration includes 15 frequency bands between 40 and 402 GHz, fulfilling the mission requirements to detect the amplitude of gravitational waves with the total uncertainty on the tensor-to-scalar ratio, $δr$, down to $δr<0.001$. A key aspect of this performance is accurate astrophysical component separation, and the ability to remove polarized thermal dust emission is particularly important. In this paper we note that the CMB frequency spectrum falls off nearly exponentially above 300 GHz relative to the thermal dust SED, and a relatively minor high frequency extension can therefore result in even lower uncertainties and better model reconstructions. Specifically, we compare the baseline design with five extended configurations, while varying the underlying dust modeling, in each of which the HFT (High-Frequency Telescope) frequency range is shifted logarithmically towards higher frequencies, with an upper cutoff ranging between 400 and 600 GHz. In each case, we measure the tensor-to-scalar ratio $r$ uncertainty and bias using both parametric and minimum-variance component-separation algorithms. When the thermal dust sky model includes a spatially varying spectral index and temperature, we find that the statistical uncertainty on $r$ after foreground cleaning may be reduced by as much as 30--50 % by extending the upper limit of the frequency range from 400 to 600 GHz, with most of the improvement already gained at 500 GHz. We also note that a broader frequency range leads to better ability to discriminate between models through higher $χ^2$ sensitivity. (abridged)

Submitted 15 August, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

Comments: 18 pages, 13 figures. Published in A&A

Journal ref: A&A 676, A42 (2023)

arXiv:2301.10883 [pdf, ps, other]

Uniqueness of conical singularities for mean curvature flows

Authors: Tang-Kai Lee, Xinrui Zhao

Abstract: In this paper, we prove the uniqueness of asymptotically conical tangent flows in all codimensions. This is based on an early work of Chodosh-Schulze, who proved the uniqueness in the hypersurface case.

Submitted 2 February, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

Comments: 19 pages

arXiv:2301.05991 [pdf]

Conceptual Framework and Documentation Standards of Cystoscopic Media Content for Artificial Intelligence

Authors: Okyaz Eminaga, Timothy Jiyong Lee, Jessie Ge, Eugene Shkolyar, Mark Laurie, Jin Long, Lukas Graham Hockman, Joseph C. Liao

Abstract: Background: The clinical documentation of cystoscopy includes visual and textual materials. However, the secondary use of visual cystoscopic data for educational and research purposes remains limited due to inefficient data management in routine clinical practice. Methods: A conceptual framework was designed to document cystoscopy in a standardized manner with three major sections: data management, annotation management, and utilization management. A Swiss-cheese model was proposed for quality control and root cause analyses. We defined the infrastructure required to implement the framework with respect to FAIR (findable, accessible, interoperable, re-usable) principles. We applied two scenarios exemplifying data sharing for research and educational projects to ensure the compliance with FAIR principles. Results: The framework was successfully implemented while following FAIR principles. The cystoscopy atlas produced from the framework could be presented in an educational web portal; a total of 68 full-length qualitative videos and corresponding annotation data were sharable for artificial intelligence projects covering frame classification and segmentation problems at case, lesion and frame levels. Conclusion: Our study shows that the proposed framework facilitates the storage of the visual documentation in a standardized manner and enables FAIR data for education and artificial intelligence research.

Submitted 18 January, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

Comments: Under Review

arXiv:2301.02577 [pdf, other]

Capturing the dynamics of Ti diffusion across Ti$_x$W$_{1-x}$/Cu heterostructures using X-ray photoelectron spectroscopy

Authors: Curran Kalha, Pardeep K. Thakur, Tien-Lin Lee, Michael Reisinger, Johannes Zechner, Michael Nelhiebel, Anna Regoutz

Abstract: Interdiffusion phenomena between adjacent materials are highly prevalent in semiconductor device architectures and can present a major reliability challenge for the industry. To fully capture and better understand these phenomena, experimental approaches must go beyond static and post-mortem studies to include in-situ and in-operando setups. Here, soft and hard X-ray photoelectron spectroscopy (SXPS and HAXPES) is used to monitor diffusion in real-time across a proxy device. The device consists of a Si/SiO$_2$/Ti$_x$W$_{1-x}$(300 nm)/Cu(25 nm) thin film material stack, with the Ti$_x$W$_{1-x}$ film acting as a diffusion barrier between Si and Cu. The monitoring of diffusion is achieved through the continuous collection of spectra whilst in-situ annealing to 673 K. Ti within the TiW is found to be highly mobile during annealing, diffusing out of the barrier and accumulating at the Cu surface. Increasing the Ti concentration within the Ti$_x$W$_{1-x}$ film increases the quantity of accumulated Ti, and Ti is first detected at the Cu surface at temperatures as low as 550 K. Surprisingly, at low Ti concentrations ($x$ = 0.054), W is also mobile and diffuses alongside Ti. These results provide crucial evidence for the importance of diffusion barrier composition on their efficacy during device application, delivering insights into the mechanisms underlying their effectiveness and limitations.

Submitted 19 January, 2023; v1 submitted 6 January, 2023; originally announced January 2023.

arXiv:2301.01983 [pdf, other]

doi 10.1063/5.0140088

Characterization of a half-wave plate for cosmic microwave background circular polarization measurement with POLARBEAR

Authors: T. Fujino, S. Takakura, Y. Chinone, M. Hasegawa, M. Hazumi, N. Katayama, A. T. Lee, T. Matsumura, Y. Minami, H. Nishino

Abstract: A half-wave plate (HWP) is often used as a modulator to suppress systematic error in the measurements of cosmic microwave background (CMB) polarization. A HWP can also be used to measure circular polarization (CP) through its optical leakage from CP to linear polarization. The CP of the CMB is predicted from various sources, such as interactions in the Universe and extension of the standard model. Interaction with supernova remnants of population III stars is one of the brightest CP sources. Thus, the observation of the CP of CMB is a new tool for searching for population III stars. In this paper, we demonstrate the improved measurement of the leakage coefficient using the transmission measurement of an actual HWP in the laboratory. We measured the transmittance of linearly polarized light through the HWP used in POLARBEAR in the frequency range of 120 to 160 GHz. We evaluate properties of the HWP by fitting the data with a physical model using the Markov Chain Monte Carlo method. We then estimate the band-averaged CP leakage coefficient using the physical model. We find that the leakage coefficient strongly depends on the spectra of CP sources. We thus calculate the maximum fractional leakage coefficient from CP to linear polarization as $0.133 \pm 0.009$ in the Rayleigh--Jeans spectrum. The nonzero value shows that POLARBEAR has sensitivity to CP. Additionally, because we use the bandpass of detectors installed in the telescope to calculate the band-averaged values, we also consider systematic effects in the experiment.

Submitted 28 June, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

Comments: 27 pages, 7 figures

arXiv:2301.00457 [pdf, other]

ReSQueing Parallel and Private Stochastic Convex Optimization

Authors: Yair Carmon, Arun Jambulapati, Yujia Jin, Yin Tat Lee, Daogao Liu, Aaron Sidford, Kevin Tian

Abstract: We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball oracle acceleration [CJJJLST20, ACJJS21], we develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings. For a SCO objective constrained to the unit ball in $\mathbb{R}^d$, we obtain the following results (up to polylogarithmic factors). We give a parallel algorithm obtaining optimization error $ε_{\text{opt}}$ with $d^{1/3}ε_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}ε_{\text{opt}}^{-2/3} + ε_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator. For $ε_{\text{opt}} \in [d^{-1}, d^{-1/4}]$, our algorithm matches the state-of-the-art oracle depth of [BJLLS19] while maintaining the optimal total work of stochastic gradient descent. Given $n$ samples of Lipschitz loss functions, prior works [BFTT19, BFGT20, AFKT21, KLL21] established that if $n \gtrsim d ε_{\text{dp}}^{-2}$, $(ε_{\text{dp}}, δ)$-differential privacy is attained at no asymptotic cost to the SCO utility. However, these prior works all required a superlinear number of gradient queries. We close this gap for sufficiently large $n \gtrsim d^2 ε_{\text{dp}}^{-3}$, by using ReSQue to design an algorithm with near-linear gradient query complexity in this regime.

Submitted 27 October, 2023; v1 submitted 1 January, 2023; originally announced January 2023.

arXiv:2212.13451 [pdf]

Compositionally Complex Perovskite Oxides as a New Class of Li-Ion Solid Electrolytes

Authors: Shu-Ting Ko, Tom Lee, Ji Qi, Dawei Zhang, Wei-Tao Peng, Xin Wang, Wei-Che Tsai, Shikai Sun, Zhaokun Wang, William J. Bowman, Shyue Ping Ong, Xiaoqing Pan, Jian Luo

Abstract: Compositionally complex ceramics (CCCs), including high-entropy ceramics (HECs) as a subclass, offer new opportunities of materials discovery beyond the traditional methodology of searching new stoichiometric compounds. Herein, we establish new strategies of tailoring CCCs via a seamless combination of (1) non-equimolar compositional designs and (2) controlling microstructures and interfaces. Using oxide solid electrolytes for all-solid-state batteries as an exemplar, we validate these new strategies via discovering a new class of compositionally complex perovskite oxides (CCPOs) to show the possibility of improving ionic conductivities beyond the limit of conventional doping. As an example (amongst the 28 CCPOs examined), we demonstrate that the ionic conductivity can be improved by >60% in (Li0.375Sr0.4375)(Ta0.375Nb0.375Zr0.125Hf0.125)O3-δ, in comparison with the state-of-art (Li0.375Sr0.4375)(Ta0.75Zr0.25)O3-δ (LSTZ) baseline, via maintaining comparable electrochemical stability. Furthermore, the ionic conductivity can be improved by another >70% via grain boundary (GB) engineering, achieving >270% of the LSTZ baseline. This work suggests transformative new strategies for designing and tailoring HECs and CCCs, thereby opening a new window for discovering materials for energy storage and many other applications.

Submitted 27 December, 2022; originally announced December 2022.

arXiv:2212.09746 [pdf, other]

Evaluating Human-Language Model Interaction

Authors: Mina Lee, Megha Srivastava, Amelia Hardy, John Thickstun, Esin Durmus, Ashwin Paranjape, Ines Gerard-Ursin, Xiang Lisa Li, Faisal Ladhak, Frieda Rong, Rose E. Wang, Minae Kwon, Joon Sung Park, Hancheng Cao, Tony Lee, Rishi Bommasani, Michael Bernstein, Percy Liang

Abstract: Many real-world applications of language models (LMs), such as writing assistance and code autocomplete, involve human-LM interaction. However, most benchmarks are non-interactive in that a model produces output without human involvement. To evaluate human-LM interaction, we develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive systems and dimensions to consider when designing evaluation metrics. Compared to standard, non-interactive evaluation, HALIE captures (i) the interactive process, not only the final output; (ii) the first-person subjective experience, not just a third-party assessment; and (iii) notions of preference beyond quality (e.g., enjoyment and ownership). We then design five tasks to cover different forms of interaction: social dialogue, question answering, crossword puzzles, summarization, and metaphor generation. With four state-of-the-art LMs (three variants of OpenAI's GPT-3 and AI21 Labs' Jurassic-1), we find that better non-interactive performance does not always translate to better human-LM interaction. In particular, we highlight three cases where the results from non-interactive and interactive metrics diverge and underscore the importance of human-LM interaction for LM evaluation.

Submitted 5 January, 2024; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI)

arXiv:2212.08772 [pdf, other]

doi 10.1103/PhysRevB.108.085407

Spin and electronic excitations in $4f$ atomic chains on Au(111) substrates

Authors: David W. Facemyer, Naveen K. Dandu, Alex Taekyung Lee, Vijay R. Singh, Anh T. Ngo, Sergio E. Ulloa

Abstract: High spin systems, like those that incorporate rare-earth $4f$ elements (REEs), are increasingly relevant in many fields. Although research in such systems is sparse, the large Hilbert spaces they occupy are promising for many applications. In this work, we examine a one-dimensional linear array of europium (Eu) atoms on a Au(111) surface and study their electronic and magnetic excitations. Ab initio calculations using VASP with PBE+U are employed to study the structure. We find Eu atoms to have a net charge when on gold, consistent with a net magnetic moment of $\simeq 3.5 μ_B$. Examining various spin-projection configurations, we can evaluate first and second neighbor exchange energies in an isotropic Heisenberg model between spin-$\frac{7}{2}$ moments to obtain $J_1 \approx -1.2 \, \mathrm{K}$ and $J_2 \approx 0.2 \, \mathrm{K}$ for the relaxed-chain atomic separation of $a \approx 5$ Å. These parameters are used to obtain the full spin excitation spectrum of a physically realizable four-atom chain. The large $|J_1|/J_2$ ratio results in a highly degenerate ferromagnetic ground state that is split by a significant easy plane single ion anisotropy of $0.6$ K. Spin-flip excitations are calculated to extract differential conductance profiles as those obtained by scanning tunneling microscopy techniques. We uncover interesting behavior of local spin excitations, especially as we track their dispersion with applied magnetic fields.

Submitted 19 July, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

arXiv:2212.07515 [pdf, other]

doi 10.1103/PhysRevB.107.L121104

Accuracy of ghost-rotationally-invariant slave-boson and dynamical mean field theory as a function of the impurity-model bath size

Authors: Tsung-Han Lee, Nicola Lanatà, Gabriel Kotliar

Abstract: We compare the accuracy of the ghost-rotationally-invariant slave-boson (g-RISB) theory and dynamical mean-field theory (DMFT) on the single-band Hubbard model, as a function of the number of bath sites in the embedding impurity Hamiltonian. Our benchmark calculations confirm that the accuracy of g-RISB can be systematically improved by increasing the number of bath sites, similar to DMFT. With a few bath sites, we observe that g-RISB is systematically more accurate than DMFT for the ground-state observables. On the other hand, the relative accuracy of these methods is generally comparable for the quasiparticle weight and the spectral function. As expected, we observe that g-RISB satisfies the variational principle in infinite dimensions, as the total energy decreases monotonically towards the exact value as a function of the number of bath sites, suggesting that the g-RISB wavefunction may approach the exact ground state in infinite dimensions. Our results suggest that the g-RISB is a promising method for first principle simulations of strongly correlated matter, which can capture the behavior of both static and dynamical observables, at a relatively low computational cost.

Submitted 21 December, 2022; v1 submitted 14 December, 2022; originally announced December 2022.

Journal ref: Phys. Rev. B 107, L121104 (2023)

arXiv:2212.07469 [pdf, other]

Learning threshold neurons via the "edge of stability"

Authors: Kwangjun Ahn, Sébastien Bubeck, Sinho Chewi, Yin Tat Lee, Felipe Suarez, Yi Zhang

Abstract: Existing analyses of neural network training often operate under the unrealistic assumption of an extremely small learning rate. This lies in stark contrast to practical wisdom and empirical studies, such as the work of J. Cohen et al. (ICLR 2021), which exhibit startling new phenomena (the "edge of stability" or "unstable convergence") and potential benefits for generalization in the large learning rate regime. Despite a flurry of recent works on this topic, however, the latter effect is still poorly understood. In this paper, we take a step towards understanding genuinely non-convex training dynamics with large learning rates by performing a detailed analysis of gradient descent for simplified models of two-layer neural networks. For these models, we provably establish the edge of stability phenomenon and discover a sharp phase transition for the step size below which the neural network fails to learn "threshold-like" neurons (i.e., neurons with a non-zero first-layer bias). This elucidates one possible mechanism by which the edge of stability can in fact lead to better generalization, as threshold neurons are basic building blocks with useful inductive bias for many tasks.

Submitted 19 October, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

Comments: 31 pages, 13 figures, Published at NeurIPS 2023

arXiv:2212.05897 [pdf, other]

MultiAct: Long-Term 3D Human Motion Generation from Multiple Action Labels

Authors: Taeryung Lee, Gyeongsik Moon, Kyoung Mu Lee

Abstract: We tackle the problem of generating long-term 3D human motion from multiple action labels. Two main previous approaches, such as action- and motion-conditioned methods, have limitations to solve this problem. The action-conditioned methods generate a sequence of motion from a single action. Hence, it cannot generate long-term motions composed of multiple actions and transitions between actions. Meanwhile, the motion-conditioned methods generate future motions from initial motion. The generated future motions only depend on the past, so they are not controllable by the user's desired actions. We present MultiAct, the first framework to generate long-term 3D human motion from multiple action labels. MultiAct takes account of both action and motion conditions with a unified recurrent generation system. It repetitively takes the previous motion and action label; then, it generates a smooth transition and the motion of the given action. As a result, MultiAct produces realistic long-term motion controlled by the given sequence of multiple action labels. Codes are available here at https://github.com/TaeryungLee/MultiAct_RELEASE.

Submitted 17 February, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

Comments: AAAI 2023 (Oral presentation)

arXiv:2212.05685 [pdf, other]

doi 10.1103/PhysRevResearch.5.033028

Intertwined Orders and Electronic Structure in Superconducting Vortex Halos

Authors: Yi-Hsuan Liu, Wei-Lin Tu, Gia-Wei Chern, Ting-Kuo Lee

Abstract: We present a comprehensive study of vortex structures in $d$-wave superconductors from large-scale renormalized mean-field theory of the square-lattice $t$-$t'$-$J$ model, which has been shown to provide a quantitative modeling for high-$T_c$ cuprate superconductors. With an efficient implementation of the kernel polynomial method for solving electronic structures, self-consistent calculations involving up to $10^5$ variational parameters are performed to investigate the vortex solutions on lattices of up to $10^4$ sites. By taking into account the strong correlation of the model, our calculations shed new light on two puzzling results that have emerged from recent scanning tunneling microscopy (STM) experiments. The first concerns the issue of the zero-biased-conductance peak (ZBCP) at the vortex core for a uniform $d$-wave superconducting state. Despite its theoretical prediction, the ZBCP was not observed in most doping range of cuprates except in heavily over-doped samples at low magnetic field. The second issue is the nature of the checkerboard charge density waves (CDWs) with a period of about 8 unit cells in the vortex halo at optimal doping. Although it has been suggested that such bipartite structure arises from low-energy quasiparticle interference, another intriguing scenario posits that the checkerboard CDWs originate from an underlying bidirectional pair-density wave (PDW) ordering with the same period.
We present a coherent interpretation of these experimental results based on systematic studies of the doping and magnetic field effects on vortex solutions with and without a checkerboard structure. The mechanism of the emergent intertwined orders within the vortex halo is also discussed.

Submitted 11 December, 2022; originally announced December 2022.

Comments: 19 pages, 7 figures

Journal ref: Physical Review Research 5, 033028 (2023)

arXiv:2212.05642 [pdf, other]

doi 10.1103/PhysRevD.108.023510

A Measurement of the CMB Temperature Power Spectrum and Constraints on Cosmology from the SPT-3G 2018 TT/TE/EE Data Set

Authors: L. Balkenhol, D. Dutcher, A. Spurio Mancini, A. Doussot, K. Benabed, S. Galli, P. A. R. Ade, A. J. Anderson, B. Ansarinejad, M. Archipley, A. N. Bender, B. A. Benson, F. Bianchini, L. E. Bleem, F. R. Bouchet, L. Bryant, E. Camphuis, J. E. Carlstrom, T. W. Cecil, C. L. Chang, P. Chaubal, P. M. Chichura, T.-L. Chou, A. Coerver, T. M. Crawford, et al. (62 additional authors not shown)

Abstract: We present a sample-variance-limited measurement of the temperature power spectrum ($TT$) of the cosmic microwave background (CMB) using observations of a $\sim\! 1500 \,\mathrm{deg}^2$ field made by SPT-3G in 2018. We report multifrequency power spectrum measurements at 95, 150, and 220 GHz covering the angular multipole range $750 \leq \ell < 3000$. We combine this $TT$ measurement with the published polarization power spectrum measurements from the 2018 observing season and update their associated covariance matrix to complete the SPT-3G 2018 $TT/TE/EE$ data set. This is the first analysis to present cosmological constraints from SPT $TT$, $TE$, and $EE$ power spectrum measurements jointly. We blind the cosmological results and subject the data set to a series of consistency tests at the power spectrum and parameter level. We find excellent agreement between frequencies and spectrum types and our results are robust to the modeling of astrophysical foregrounds. We report results for $Λ$CDM and a series of extensions, drawing on the following parameters: the amplitude of the gravitational lensing effect on primary power spectra $A_\mathrm{L}$, the effective number of neutrino species $N_{\mathrm{eff}}$, the primordial helium abundance $Y_{\mathrm{P}}$, and the baryon clumping factor due to primordial magnetic fields $b$. We find that the SPT-3G 2018 $TT/TE/EE$ data are well fit by $Λ$CDM with a probability-to-exceed of $15\%$.
For $Λ$CDM, we constrain the expansion rate today to $H_0 = 68.3 \pm 1.5\,\mathrm{km\,s^{-1}\,Mpc^{-1}}$ and the combined structure growth parameter to $S_8 = 0.797 \pm 0.042$. The SPT-based results are effectively independent of Planck, and the cosmological parameter constraints from either data set are within $<1\,σ$ of each other. (abridged)

Submitted 27 July, 2023; v1 submitted 11 December, 2022; originally announced December 2022.

Comments: 35 Pages, 17 Figures, 11 Tables

arXiv:2212.03090 [pdf, other]

Label-free Knowledge Distillation with Contrastive Loss for Light-weight Speaker Recognition

Authors: Zhiyuan Peng, Xuanji He, Ke Ding, Tan Lee, Guanglu Wan

Abstract: Very deep models for speaker recognition (SR) have demonstrated remarkable performance improvement in recent research. However, it is impractical to deploy these models for on-device applications with constrained computational resources. On the other hand, light-weight models are highly desired in practice despite their sub-optimal performance. This research aims to improve light-weight SR models through large-scale label-free knowledge distillation (KD). Existing KD approaches for SR typically require speaker labels to learn task-specific knowledge, due to the inefficiency of conventional loss for distillation. To address the inefficiency problem and achieve label-free KD, we propose to employ the contrastive loss from self-supervised learning for distillation. Extensive experiments are conducted on a collection of public speech datasets from diverse sources. Results on light-weight SR models show that the proposed approach of label-free KD with contrastive loss consistently outperforms both conventional distillation methods and self-supervised learning methods by a significant margin.

Submitted 6 December, 2022; originally announced December 2022.

arXiv:2212.03039 [pdf, ps, other]

Covariance Regularization for Probabilistic Linear Discriminant Analysis

Authors: Zhiyuan Peng, Mingjie Shao, Xuanji He, Xu Li, Tan Lee, Ke Ding, Guanglu Wan

Abstract: Probabilistic linear discriminant analysis (PLDA) is commonly used in speaker verification systems to score the similarity of speaker embeddings. Recent studies improved the performance of PLDA in domain-matched conditions by diagonalizing its covariance. We suspect such a brutal pruning approach could eliminate its capacity in modeling dimension correlation of speaker embeddings, leading to inadequate performance with domain adaptation. This paper explores two alternative covariance regularization approaches, namely, interpolated PLDA and sparse PLDA, to tackle the problem. The interpolated PLDA incorporates the prior knowledge from cosine scoring to interpolate the covariance of PLDA. The sparse PLDA introduces a sparsity penalty to update the covariance. Experimental results demonstrate that both approaches outperform diagonal regularization noticeably with domain adaptation. In addition, in-domain data can be significantly reduced when training sparse PLDA for domain adaptation.

Submitted 6 December, 2022; originally announced December 2022.

arXiv:2212.01539 [pdf, other]

Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping

Authors: Jiyan He, Xuechen Li, Da Yu, Huishuai Zhang, Janardhan Kulkarni, Yin Tat Lee, Arturs Backurs, Nenghai Yu, Jiang Bian

Abstract: Differentially private deep learning has recently witnessed advances in computational efficiency and privacy-utility trade-off. We explore whether further improvements along the two axes are possible and provide affirmative answers leveraging two instantiations of \emph{group-wise clipping}. To reduce the compute time overhead of private learning, we show that \emph{per-layer clipping}, where the gradient of each neural network layer is clipped separately, allows clipping to be performed in conjunction with backpropagation in differentially private optimization. This results in private learning that is as memory-efficient and almost as fast per training update as non-private learning for many workflows of interest. While per-layer clipping with constant thresholds tends to underperform standard flat clipping, per-layer clipping with adaptive thresholds matches or outperforms flat clipping under given training epoch constraints, hence attaining similar or better task performance within less wall time. To explore the limits of scaling (pretrained) models in differentially private deep learning, we privately fine-tune the 175 billion-parameter GPT-3. We bypass scaling challenges associated with clipping gradients that are distributed across multiple devices with \emph{per-device clipping} that clips the gradient of each model piece separately on its host device. Privately fine-tuning GPT-3 with per-device clipping achieves a task performance at $ε=1$ better than what is attainable by non-privately fine-tuning the largest GPT-2 on a summarization task.

Submitted 3 December, 2022; originally announced December 2022.

Comments: 25 pages

arXiv:2211.15707 [pdf, other]

doi 10.3847/1538-4357/aca6e8

The mid-infrared molecular inventory towards Orion IRc2

Authors: Sarah Nickerson, Naseem Rangwala, Sean W. J. Colgan, Curtis DeWitt, Jose S. Monzon, Xinchuan Huang, Kinsuk Acharyya, Maria N. Drozdovskaya, Ryan C. Fortenberry, Eric Herbst, Timothy J. Lee

Abstract: We present the first high spectral resolution mid-infrared survey in the Orion BN/KL region, covering 7.2 to 28.3 micron. With SOFIA/EXES we target the enigmatic source Orion IRc2. While this is in the most prolifically studied massive star-forming region, longer wavelengths and molecular emission lines dominated previous spectral surveys. The mid-infrared observations in this work access different components and molecular species in unprecedented detail. We unambiguously identify two new kinematic components, both chemically rich with multiple molecular absorption lines. The "blue clump" has $v_{\mathrm{LSR}} = -7.1 \pm 0.7$ km/s and the "red clump" $1.4 \pm 0.5$ km/s. While the blue and red clumps have similar temperatures and line widths, molecular species in the blue clump have higher column densities. They are both likely linked to pure rotational H2 emission also covered by this survey. This work provides evidence for the scenario that the blue and red clumps are distinct components unrelated to the classic components in the Orion BN/KL region. Comparison to spectroscopic surveys towards other infrared targets in the region shows that the blue clump is clearly extended. We analyze, compare, and present in-depth findings on the physical conditions of C2H2, 13CCH2, CH4, CS, H2O, HCN, H13CN, HNC, NH3, and SO2 absorption lines and an H2 emission line associated with the blue and red clumps. We also provide limited analysis of H2O and SiO molecular emission lines towards Orion IRc2 and the atomic forbidden transitions [FeII], [SI], [SIII], and [NeII].

Submitted 28 November, 2022; originally announced November 2022.

Comments: Accepted to ApJ; 44 pages, 14 figures, 13 tables

arXiv:2211.11860 [pdf, other]

Upper and Lower Bounds on the Smoothed Complexity of the Simplex Method

Authors: Sophie Huiberts, Yin Tat Lee, Xinzhi Zhang

Abstract: The simplex method for linear programming is known to be highly efficient in practice, and understanding its performance from a theoretical perspective is an active research topic. The framework of smoothed analysis, first introduced by Spielman and Teng (JACM '04) for this purpose, defines the smoothed complexity of solving a linear program with $d$ variables and $n$ constraints as the expected running time when Gaussian noise of variance $σ^2$ is added to the LP data. We prove that the smoothed complexity of the simplex method is $O(σ^{-3/2} d^{13/4}\log^{7/4} n)$, improving the dependence on $1/σ$ compared to the previous bound of $O(σ^{-2} d^2\sqrt{\log n})$. We accomplish this through a new analysis of the \emph{shadow bound}, key to earlier analyses as well. Illustrating the power of our new method, we use our method to prove a nearly tight upper bound on the smoothed complexity of two-dimensional polygons. We also establish the first non-trivial lower bound on the smoothed complexity of the simplex method, proving that the \emph{shadow vertex simplex method} requires at least $Ω\Big(\min \big(σ^{-1/2} d^{-1/2}\log^{-1/4} d,2^d \big) \Big)$ pivot steps with high probability. A key part of our analysis is a new variation on the extended formulation for the regular $2^k$-gon. We end with a numerical experiment that suggests this analysis could be further improved.

Submitted 15 May, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

Comments: 43 pages, 5 figures. STOC 2023

arXiv:2211.09110 [pdf, other]

Holistic Evaluation of Language Models

Authors: Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, et al. (25 additional authors not shown)

Abstract: Language models (LMs) are becoming the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models. First, we taxonomize the vast space of potential scenarios (i.e. use cases) and metrics (i.e. desiderata) that are of interest for LMs. Then we select a broad subset based on coverage and feasibility, noting what's missing or underrepresented (e.g. question answering for neglected English dialects, metrics for trustworthiness). Second, we adopt a multi-metric approach: We measure 7 metrics (accuracy, calibration, robustness, fairness, bias, toxicity, and efficiency) for each of 16 core scenarios when possible (87.5% of the time). This ensures metrics beyond accuracy don't fall to the wayside, and that trade-offs are clearly exposed. We also perform 7 targeted evaluations, based on 26 targeted scenarios, to analyze specific aspects (e.g. reasoning, disinformation). Third, we conduct a large-scale evaluation of 30 prominent language models (spanning open, limited-access, and closed models) on all 42 scenarios, 21 of which were not previously used in mainstream LM evaluation. Prior to HELM, models on average were evaluated on just 17.9% of the core HELM scenarios, with some prominent models not sharing a single scenario in common. We improve this to 96.0%: now all 30 models have been densely benchmarked on the same core scenarios and metrics under standardized conditions.
Our evaluation surfaces 25 top-level findings. For full transparency, we release all raw model prompts and completions publicly for further analysis, as well as a general modular toolkit. We intend for HELM to be a living benchmark for the community, continuously updated with new scenarios, metrics, and models.

Submitted 1 October, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

Comments: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Project page: https://crfm.stanford.edu/helm/v1.0

Journal ref: Published in Transactions on Machine Learning Research (TMLR), 2023
