Showing 1–50 of 936 results for author: Chang, H

  1. arXiv:2407.03672  [pdf, other]

    cs.LG cs.AI

    A Survey of Data Synthesis Approaches

    Authors: Hsin-Yu Chang, Pei-Yu Chen, Tun-Hsiang Chou, Chang-Sheng Kao, Hsuan-Yun Yu, Yen-Ting Lin, Yun-Nung Chen

    Abstract: This paper provides a detailed survey of synthetic data techniques. We first discuss the expected goals of using synthetic data in data augmentation, which can be divided into four parts: 1) Improving Diversity, 2) Data Balancing, 3) Addressing Domain Shift, and 4) Resolving Edge Cases. Synthesizing data are closely related to the prevailing machine learning techniques at the time, therefore, we s…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2406.17442  [pdf, other]

    cs.CV

    Mamba24/8D: Enhancing Global Interaction in Point Clouds via State Space Model

    Authors: Zhuoyuan Li, Yubo Ai, Jiahao Lu, ChuXin Wang, Jiacheng Deng, Hanzhi Chang, Yanzhe Liang, Wenfei Yang, Shifeng Zhang, Tianzhu Zhang

    Abstract: Transformers have demonstrated impressive results for 3D point cloud semantic segmentation. However, the quadratic complexity of transformer makes computation cost high, limiting the number of points that can be processed simultaneously and impeding the modeling of long-range dependencies. Drawing inspiration from the great potential of recent state space models (SSM) for long sequence modeling, w…

    Submitted 25 June, 2024; originally announced June 2024.

  3. arXiv:2406.17289  [pdf, other]

    cs.IR cs.AI

    Hyperbolic Knowledge Transfer in Cross-Domain Recommendation System

    Authors: Xin Yang, Heng Chang, Zhijian Lai, Jinze Yang, Xingrun Li, Yu Lu, Shuaiqiang Wang, Dawei Yin, Erxue Min

    Abstract: Cross-Domain Recommendation (CDR) seeks to utilize knowledge from different domains to alleviate the problem of data sparsity in the target recommendation domain, and it has been gaining more attention in recent years. Although there have been notable advancements in this area, most current methods represent users and items in Euclidean space, which is not ideal for handling long-tail distributed…

    Submitted 4 July, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  4. arXiv:2406.16357  [pdf, other]

    cs.LG cs.AI cs.SI

    Towards Lightweight Graph Neural Network Search with Curriculum Graph Sparsification

    Authors: Beini Xie, Heng Chang, Ziwei Zhang, Zeyang Zhang, Simin Wu, Xin Wang, Yuan Meng, Wenwu Zhu

    Abstract: Graph Neural Architecture Search (GNAS) has achieved superior performance on various graph-structured tasks. However, existing GNAS studies overlook the applications of GNAS in resource-constraint scenarios. This paper proposes to design a joint graph data and architecture mechanism, which identifies important sub-architectures via the valuable graph data. To search for optimal lightweight Graph N…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024. The two first authors made equal contributions

  5. arXiv:2406.13350  [pdf, other]

    physics.ins-det physics.app-ph

    Microwave amplification chain calibration in an axion haloscope via cavity-emitted radiation

    Authors: Hsin Chang, Han-Wen Liu, Hien Thi Doan, Yung-Fu Chen

    Abstract: In an axion haloscope, the weak photon signal, theoretically converted from axions, is captured by a detection cavity. The amplification chain assists the signal receiver to read the signal from the cavity and requires accurate calibration. Typically, the readout line is calibrated using the Y-factor method, involving a switch that directs the signal from either the detection line or the calibrati…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 6 journal pages, 5 figures, and 1 table in the main text

  6. arXiv:2406.11813  [pdf, other]

    cs.CL

    How Do Large Language Models Acquire Factual Knowledge During Pretraining?

    Authors: Hoyeon Chang, Jinho Park, Seonghyeon Ye, Sohee Yang, Youngkyung Seo, Du-Seong Chang, Minjoon Seo

    Abstract: Despite the recent observation that large language models (LLMs) can store substantial factual knowledge, there is a limited understanding of the mechanisms of how they acquire factual knowledge through pretraining. This work addresses this gap by studying how LLMs acquire factual knowledge during pretraining. The findings reveal several important insights into the dynamics of factual knowledge ac…

    Submitted 17 June, 2024; originally announced June 2024.

    ACM Class: I.2.7

  7. arXiv:2406.11057  [pdf, other]

    eess.SY

    Design of Interacting Particle Systems for Fast and Efficient Reinforcement Learning

    Authors: Anant A Joshi, Heng-Sheng Chang, Amirhossein Taghvaei, Prashant G Mehta, Sean P. Meyn

    Abstract: This paper is concerned with the design of algorithms based on systems of interacting particles to represent, approximate, and learn the optimal control law for reinforcement learning (RL). The primary contribution of the present paper is to show that convergence rates can be accelerated dramatically through careful design of interactions between particles. Theory focuses on the linear quadratic s…

    Submitted 16 June, 2024; originally announced June 2024.

  8. arXiv:2406.10159  [pdf, other]

    quant-ph cond-mat.mes-hall

    Probing entanglement dynamics and topological transitions on noisy intermediate-scale quantum computers

    Authors: Huai-Chun Chang, Hsiu-Chuan Hsu, Yu-Cheng Lin

    Abstract: We simulate quench dynamics of the Su-Schrieffer-Heeger (SSH) chain on the IBM quantum computers, calculating the Rényi entanglement entropy, the twist order parameter and the Berry phase. The latter two quantities can be deduced from a slow-twist operator defined in the Lieb-Schultz-Mattis theorem. The Rényi entropy is obtained using a recently developed randomized measurement scheme. The twist o…

    Submitted 17 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: 9 pages, 11 figures

  9. arXiv:2406.09499  [pdf, other]

    hep-ph astro-ph.CO

    Axion Stars: Mass Functions and Constraints

    Authors: Jae Hyeok Chang, Patrick J. Fox, Huangyu Xiao

    Abstract: The QCD axion and axion-like particles, as leading dark matter candidates, can also have interesting implications for dark matter substructures if the Peccei-Quinn symmetry is broken after inflation. In such a scenario, axion perturbations on small scales will lead to the formation of axion miniclusters at matter-radiation equality, and subsequently the formation of axion stars. Such compact objec…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 20 pages + refs, 5 figures

    Report number: FERMILAB-PUB-24-0295-T

  10. arXiv:2406.07837  [pdf, other]

    cs.RO cs.AI

    Scaling Manipulation Learning with Visual Kinematic Chain Prediction

    Authors: Xinyu Zhang, Yuhan Liu, Haonan Chang, Abdeslam Boularias

    Abstract: Learning general-purpose models from diverse datasets has achieved great success in machine learning. In robotics, however, existing methods in multi-task learning are typically constrained to a single robot and workspace, while recent work such as RT-X requires a non-trivial action normalization procedure to manually bridge the gap between different action spaces in diverse environments. In this…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Submitted to CoRL 2024

  11. arXiv:2406.07735  [pdf, other]

    cs.CL cs.LG

    REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy

    Authors: Haw-Shiuan Chang, Nanyun Peng, Mohit Bansal, Anil Ramakrishna, Tagyoung Chung

    Abstract: Decoding methods for large language models (LLMs) usually struggle with the tradeoff between ensuring factuality and maintaining diversity. For example, a higher p threshold in the nucleus (top-p) sampling increases the diversity but decreases the factuality, and vice versa. In this paper, we propose REAL (Residual Entropy from Asymptotic Line) sampling, a decoding method that achieves improved fa…

    Submitted 11 June, 2024; originally announced June 2024.

  12. arXiv:2406.07549  [pdf, other]

    cs.RO

    A3VLM: Actionable Articulation-Aware Vision Language Model

    Authors: Siyuan Huang, Haonan Chang, Yuhan Liu, Yimeng Zhu, Hao Dong, Peng Gao, Abdeslam Boularias, Hongsheng Li

    Abstract: Vision Language Models (VLMs) have received significant attention in recent years in the robotics community. VLMs are shown to be able to perform complex visual reasoning and scene understanding tasks, which makes them regarded as a potential universal solution for general robotics problems such as manipulation and navigation. However, previous VLMs for robotics such as RT-1, RT-2, and ManipLLM ha…

    Submitted 13 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

  13. arXiv:2406.06650  [pdf, other]

    eess.IV cs.CV

    Predicting the risk of early-stage breast cancer recurrence using H&E-stained tissue images

    Authors: Geongyu Lee, Joonho Lee, Tae-Yeong Kwak, Sun Woo Kim, Youngmee Kwon, Chungyeul Kim, Hyeyoon Chang

    Abstract: Accurate prediction of the likelihood of recurrence is important in the selection of postoperative treatment for patients with early-stage breast cancer. In this study, we investigated whether deep learning algorithms can predict patients' risk of recurrence by analyzing the pathology images of their cancer histology. A total of 125 hematoxylin and eosin stained breast cancer whole slide images la…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages, 7 figures

  14. arXiv:2406.05944  [pdf, other]

    stat.ME math.ST

    Embedding Network Autoregression for time series analysis and causal peer effect inference

    Authors: Jae Ho Chang, Subhadeep Paul

    Abstract: We propose an Embedding Network Autoregressive Model (ENAR) for multivariate networked longitudinal data. We assume the network is generated from a latent variable model, and these unobserved variables are included in a structural peer effect model or a time series network autoregressive model as additive effects. This approach takes a unified view of two related problems, (1) modeling and predict…

    Submitted 9 June, 2024; originally announced June 2024.

  15. arXiv:2406.05392  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

    Authors: Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, Han Liu, Henry Peng Zou, Yiqiao Jin, Yijia Xiao, Yichen Wang, Shenghao Wu, Zongxing Xie, Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang

    Abstract: Large Language Models (LLMs) have achieved unparalleled success across diverse language modeling tasks in recent years. However, this progress has also intensified ethical concerns, impacting the deployment of LLMs in everyday contexts. This paper provides a comprehensive survey of ethical challenges associated with LLMs, from longstanding issues such as copyright infringement, systematic bias, an…

    Submitted 8 June, 2024; originally announced June 2024.

  16. arXiv:2406.00276  [pdf]

    cs.LG cs.AI cs.CE physics.data-an

    Non-destructive Degradation Pattern Decoupling for Ultra-early Battery Prototype Verification Using Physics-informed Machine Learning

    Authors: Shengyu Tao, Mengtian Zhang, Zixi Zhao, Haoyang Li, Ruifei Ma, Yunhong Che, Xin Sun, Lin Su, Xiangyu Chen, Zihao Zhou, Heng Chang, Tingwei Cao, Xiao Xiao, Yaojun Liu, Wenjun Yu, Zhongling Xu, Yang Li, Han Hao, Xuan Zhang, Xiaosong Hu, Guangmin ZHou

    Abstract: Manufacturing complexities and uncertainties have impeded the transition from material prototypes to commercial batteries, making prototype verification critical to quality assessment. A fundamental challenge involves deciphering intertwined chemical processes to characterize degradation patterns and their quantitative relationship with battery performance. Here we show that a physics-informed mac…

    Submitted 31 May, 2024; originally announced June 2024.

    ACM Class: J.2; G.3

  17. arXiv:2405.20596  [pdf, other]

    cs.CV cs.LG

    Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation

    Authors: Jiachen Liang, Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen

    Abstract: Traditional semi-supervised learning (SSL) assumes that the feature distributions of labeled and unlabeled data are consistent which rarely holds in realistic scenarios. In this paper, we propose a novel SSL setting, where unlabeled samples are drawn from a mixed distribution that deviates from the feature distribution of labeled samples. Under this setting, previous SSL methods tend to predict wr…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 10 pages; Accepted by NeurIPS 2023

  18. arXiv:2405.20502  [pdf, ps, other]

    eess.SY math.DS math.OC

    Reach-Avoid Control Synthesis for a Quadrotor UAV with Formal Safety Guarantees

    Authors: Mohamed Serry, Haocheng Chang, Jun Liu

    Abstract: Reach-avoid specifications are one of the most common tasks in autonomous aerial vehicle (UAV) applications. Despite the intensive research and development associated with control of aerial vehicles, generating feasible trajectories through complex environments and tracking them with formal safety guarantees remain challenging. In this paper, we propose a control framework for a quadrotor UAV that…

    Submitted 30 May, 2024; originally announced May 2024.

  19. arXiv:2405.20202  [pdf, other]

    cs.AI

    One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments

    Authors: Ke Yi, Yuhui Xu, Heng Chang, Chen Tang, Yuan Meng, Tong Zhang, Jia Li

    Abstract: Large Language Models (LLMs) have advanced rapidly but face significant memory demands. While quantization has shown promise for LLMs, current methods typically require lengthy training to alleviate the performance degradation from quantization loss. However, deploying LLMs across diverse scenarios with different resource constraints, e.g., servers and personal computers, requires repeated trainin…

    Submitted 30 May, 2024; originally announced May 2024.

  20. arXiv:2405.17913  [pdf, other]

    cs.CV cs.AI

    OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision

    Authors: Junjie Wang, Bin Chen, Bin Kang, Yulin Li, YiChi Chen, Weizhi Xian, Huifeng Chang

    Abstract: Open-Vocabulary Detection (OVD) aims to detect objects from novel categories beyond the base categories on which the detector is trained. However, existing open-vocabulary detectors trained on known category data tend to assign higher confidence to trained categories and confuse novel categories with background. To resolve this, we propose OV-DQUO, an \textbf{O}pen-\textbf{V}ocabulary DETR with \t…

    Submitted 28 May, 2024; originally announced May 2024.

  21. arXiv:2405.17385  [pdf, other]

    quant-ph cond-mat.mes-hall cond-mat.str-el

    Thermalization and Criticality on an Analog-Digital Quantum Simulator

    Authors: Trond I. Andersen, Nikita Astrakhantsev, Amir H. Karamlou, Julia Berndtsson, Johannes Motruk, Aaron Szasz, Jonathan A. Gross, Alexander Schuckert, Tom Westerhout, Yaxing Zhang, Ebrahim Forati, Dario Rossi, Bryce Kobrin, Agustin Di Paolo, Andrey R. Klots, Ilya Drozdov, Vladislav D. Kurilovich, Andre Petukhov, Lev B. Ioffe, Andreas Elben, Aniket Rath, Vittorio Vitale, Benoit Vermersch, Rajeev Acharya, Laleh Aghababaie Beni, et al. (202 additional authors not shown)

    Abstract: Understanding how interacting particles approach thermal equilibrium is a major challenge of quantum simulators. Unlocking the full potential of such systems toward this goal requires flexible initial state preparation, precise time evolution, and extensive probes for final state characterization. We present a quantum simulator comprising 69 superconducting qubits which supports both universal qua…

    Submitted 8 July, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  22. arXiv:2405.16273  [pdf, other]

    cs.CV

    M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation

    Authors: Mingshuang Luo, Ruibing Hou, Hong Chang, Zimo Liu, Yaowei Wang, Shiguang Shan

    Abstract: This paper presents M$^3$GPT, an advanced $\textbf{M}$ultimodal, $\textbf{M}$ultitask framework for $\textbf{M}$otion comprehension and generation. M$^3$GPT operates on three fundamental principles. The first focuses on creating a unified representation space for various motion-relevant modalities. We employ discrete vector quantization for multimodal control and generation signals, such as text,…

    Submitted 29 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

    Comments: 18 pages, 6 figures

  23. arXiv:2405.15304  [pdf, other]

    cs.LG cs.CV

    Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient

    Authors: Yongliang Wu, Shiji Zhou, Mingzhuo Yang, Lianzhe Wang, Wenbo Zhu, Heng Chang, Xiao Zhou, Xu Yang

    Abstract: Current text-to-image diffusion models have achieved groundbreaking results in image generation tasks. However, the unavoidable inclusion of sensitive information during pre-training introduces significant risks such as copyright infringement and privacy violations in the generated images. Machine Unlearning (MU) provides an effective way to the sensitive concepts captured by the model, has been sh…

    Submitted 24 May, 2024; originally announced May 2024.

  24. arXiv:2405.12656  [pdf, other]

    cs.CL cs.AI

    Retrieval-Augmented Language Model for Extreme Multi-Label Knowledge Graph Link Prediction

    Authors: Yu-Hsiang Lin, Huang-Ting Shieh, Chih-Yu Liu, Kuang-Ting Lee, Hsiao-Cheng Chang, Jing-Lun Yang, Yu-Sheng Lin

    Abstract: Extrapolation in Large language models (LLMs) for open-ended inquiry encounters two pivotal issues: (1) hallucination and (2) expensive training costs. These issues present challenges for LLMs in specialized domains and personalized data, requiring truthful responses and low fine-tuning costs. Existing works attempt to tackle the problem by augmenting the input of a smaller language model with inf…

    Submitted 21 May, 2024; originally announced May 2024.

  25. arXiv:2405.05248  [pdf, other]

    cs.CL cs.AI cs.MA

    LLMs with Personalities in Multi-issue Negotiation Games

    Authors: Sean Noh, Ho-Chun Herbert Chang

    Abstract: Powered by large language models (LLMs), AI agents have become capable of many human tasks. Using the most canonical definitions of the Big Five personality, we measure the ability of LLMs to negotiate within a game-theoretical framework, as well as methodological challenges to measuring notions of fairness and risk. Simulations (n=1,500) for both single-issue and multi-issue negotiation reveal in…

    Submitted 8 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  26. arXiv:2405.01610  [pdf, other]

    cs.CL cs.IR

    Automating the Analysis of Public Saliency and Attitudes towards Biodiversity from Digital Media

    Authors: Noah Giebink, Amrita Gupta, Diogo Verìssimo, Charlotte H. Chang, Tony Chang, Angela Brennan, Brett Dickson, Alex Bowmer, Jonathan Baillie

    Abstract: Measuring public attitudes toward wildlife provides crucial insights into our relationship with nature and helps monitor progress toward Global Biodiversity Framework targets. Yet, conducting such assessments at a global scale is challenging. Manually curating search terms for querying news and social media is tedious, costly, and can lead to biased results. Raw news and social media data returned…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: v0.1, 21 pages with 10 figures

  27. arXiv:2405.00079  [pdf]

    physics.soc-ph

    A global evidence map of human well-being and biodiversity co-benefits and trade-offs of natural climate solutions

    Authors: Charlotte H. Chang, James T. Erbaugh, Paola Fajardo, Luci Lu, István Molnár, Dávid Papp, Brian E. Robinson, Kemen Austin, Susan Cook-Patton, Timm Kroeger, Lindsey Smart, Miguel Castro, Samantha H. Cheng, Peter W. Ellis, Rob I. McDonald, Teevrat Garg, Erin E. Poor, Preston Welker, Andrew R. Tilman, Stephen A. Wood, Yuta J. Masuda

    Abstract: Natural climate solutions (NCS) are critical for mitigating climate change through ecosystem-based carbon removal and emissions reductions. NCS implementation can also generate biodiversity and human well-being co-benefits and trade-offs ("NCS co-impacts"), but the volume of evidence on NCS co-impacts has grown rapidly across disciplines, is poorly understood, and remains to be systematically coll…

    Submitted 30 April, 2024; originally announced May 2024.

    Comments: 28 pages, 5 figures

  28. arXiv:2404.18786  [pdf, ps, other]

    math.ST stat.ME

    Randomization-based confidence intervals for the local average treatment effect

    Authors: P. M. Aronow, Haoge Chang, Patrick Lopatto

    Abstract: We consider the problem of generating confidence intervals in randomized experiments with noncompliance. We show that a refinement of a randomization-based procedure proposed by Imbens and Rosenbaum (2005) has desirable properties. Namely, we show that using a studentized Anderson-Rubin-type statistic as a test statistic yields confidence intervals that are finite-sample exact under treatment effe…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: 40 pages

  29. arXiv:2404.18336  [pdf, ps, other]

    math.RT math.CT

    Mutation of $n$-cotorsion pairs in triangulated categories

    Authors: Huimin Chang, Panyue Zhou

    Abstract: In this article, we define the notion of $n$-cotorsion pairs in triangulated categories, which is a generalization of the classical cotorsion pairs. We prove that any mutation of an $n$-cotorsion pair is again an $n$-cotorsion pair. When $n=1$, this result generalizes the work of Zhou and Zhu for classical cotorsion pairs. As applications, we give a geometric characterization of $n$-cotorsion pair…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 13 pages

  30. arXiv:2404.17486  [pdf, other]

    cs.CV

    TextGaze: Gaze-Controllable Face Generation with Natural Language

    Authors: Hengfei Wang, Zhongqun Zhang, Yihua Cheng, Hyung Jin Chang

    Abstract: Generating face image with specific gaze information has attracted considerable attention. Existing approaches typically input gaze values directly for face generation, which is unnatural and requires annotated gaze datasets for training, thereby limiting its application. In this paper, we present a novel gaze-controllable face generation task. Our approach inputs textual descriptions that describ…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Under review

  31. arXiv:2404.09696  [pdf, other]

    cs.CL cs.AI cs.ET

    Are Large Language Models Reliable Argument Quality Annotators?

    Authors: Nailia Mirzakhmedova, Marcel Gohsen, Chia Hao Chang, Benno Stein

    Abstract: Evaluating the quality of arguments is a crucial aspect of any system leveraging argument mining. However, it is a challenge to obtain reliable and consistent annotations regarding argument quality, as this usually requires domain-specific expertise of the annotators. Even among experts, the assessment of argument quality is often inconsistent due to the inherent subjectivity of this task. In this…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 18 pages, 5 figures, 5 tables

  32. arXiv:2404.09507  [pdf, other]

    cs.CV

    Clothes-Changing Person Re-Identification with Feasibility-Aware Intermediary Matching

    Authors: Jiahe Zhao, Ruibing Hou, Hong Chang, Xinqian Gu, Bingpeng Ma, Shiguang Shan, Xilin Chen

    Abstract: Current clothes-changing person re-identification (re-id) approaches usually perform retrieval based on clothes-irrelevant features, while neglecting the potential of clothes-relevant features. However, we observe that relying solely on clothes-irrelevant features for clothes-changing re-id is limited, since they often lack adequate identity information and suffer from large intra-class variations…

    Submitted 15 April, 2024; originally announced April 2024.

  33. arXiv:2404.09385  [pdf, other]

    eess.AS cs.CL eess.SP

    A Large-Scale Evaluation of Speech Foundation Models

    Authors: Shu-wen Yang, Heng-Jui Chang, Zili Huang, Andy T. Liu, Cheng-I Lai, Haibin Wu, Jiatong Shi, Xuankai Chang, Hsiang-Sheng Tsai, Wen-Chin Huang, Tzu-hsun Feng, Po-Han Chi, Yist Y. Lin, Yung-Sung Chuang, Tzu-Hsien Huang, Wei-Cheng Tseng, Kushal Lakhotia, Shang-Wen Li, Abdelrahman Mohamed, Shinji Watanabe, Hung-yi Lee

    Abstract: The foundation model paradigm leverages a shared foundation model to achieve state-of-the-art (SOTA) performance for various tasks, requiring minimal downstream-specific modeling and data annotation. This approach has proven crucial in the field of Natural Language Processing (NLP). However, the speech processing community lacks a similar setup to explore the paradigm systematically. In this work,…

    Submitted 29 May, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: The extended journal version for SUPERB and SUPERB-SG. Published in IEEE/ACM TASLP. The Arxiv version is preferred

  34. arXiv:2404.06903  [pdf, other]

    cs.CV cs.AI

    DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting

    Authors: Shijie Zhou, Zhiwen Fan, Dejia Xu, Haoran Chang, Pradyumna Chari, Tejas Bharadwaj, Suya You, Zhangyang Wang, Achuta Kadambi

    Abstract: The increasing demand for virtual reality applications has highlighted the significance of crafting immersive 3D assets. We present a text-to-3D 360$^{\circ}$ scene generation pipeline that facilitates the creation of comprehensive 360$^{\circ}$ scenes for in-the-wild environments in a matter of minutes. Our approach utilizes the generative power of a 2D diffusion model and prompt self-refinement…

    Submitted 10 April, 2024; originally announced April 2024.

  35. arXiv:2404.05191  [pdf, ps, other]

    eess.SP

    Graph-based Untrained Neural Network Detector for OTFS Systems

    Authors: Hao Chang, Branka Vucetic, Wibowo Hardjawana

    Abstract: Inter-carrier interference (ICI) caused by mobile reflectors significantly degrades the conventional orthogonal frequency division multiplexing (OFDM) performance in high-mobility environments. The orthogonal time frequency space (OTFS) modulation system effectively represents ICI in the delay-Doppler domain, thus significantly outperforming OFDM. Existing iterative and neural network (NN) based O…

    Submitted 8 April, 2024; originally announced April 2024.

  36. arXiv:2404.04979  [pdf, other]

    econ.EM cs.LG

    CAVIAR: Categorical-Variable Embeddings for Accurate and Robust Inference

    Authors: Anirban Mukherjee, Hannah Hanwen Chang

    Abstract: Social science research often hinges on the relationship between categorical variables and outcomes. We introduce CAVIAR, a novel method for embedding categorical variables that assume values in a high-dimensional ambient space but are sampled from an underlying manifold. Our theoretical and numerical analyses outline challenges posed by such categorical variables in causal inference. Specifically…

    Submitted 11 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

  37. arXiv:2404.04436  [pdf, other]

    cs.AI

    AI Knowledge and Reasoning: Emulating Expert Creativity in Scientific Research

    Authors: Anirban Mukherjee, Hannah Hanwen Chang

    Abstract: We investigate whether modern AI can emulate expert creativity in complex scientific endeavors. We introduce novel methodology that utilizes original research articles published after the AI's training cutoff, ensuring no prior exposure, mitigating concerns of rote memorization and prior training. The AI are tasked with redacting findings, predicting outcomes from redacted research, and assessing…

    Submitted 5 April, 2024; originally announced April 2024.

  38. arXiv:2404.03867  [pdf, other]

    stat.CO math.PR stat.ML

    Dimension-free Relaxation Times of Informed MCMC Samplers on Discrete Spaces

    Authors: Hyunwoong Chang, Quan Zhou

    Abstract: Convergence analysis of Markov chain Monte Carlo methods in high-dimensional statistical applications is increasingly recognized. In this paper, we develop general mixing time bounds for Metropolis-Hastings algorithms on discrete spaces by building upon and refining some recent theoretical advancements in Bayesian model selection problems. We establish sufficient conditions for a class of informed…

    Submitted 4 April, 2024; originally announced April 2024.

    MSC Class: 60J10; 60J20; 82M31; 62F15

  39. arXiv:2404.00009  [pdf, other]

    physics.ed-ph

    Applying Cognitive Diagnostic Models to Mechanics Concept Inventories

    Authors: Vy Le, Jayson M. Nissen, Xiuxiu Tang, Yuxiao Zhang, Amirreza Mehrabi, Jason W. Morphew, Hua Hua Chang, Ben Van Dusen

    Abstract: In physics education research, instructors and researchers often use research-based assessments (RBAs) to assess students' skills and knowledge. In this paper, we support the development of a mechanics cognitive diagnostic to test and implement effective and equitable pedagogies for physics instruction. Adaptive assessments using cognitive diagnostic models provide significant advantages over fixe…

    Submitted 8 March, 2024; originally announced April 2024.

    Comments: 14 pages, 3 figures, 14 tables (including appendix). This paper is submitted to Physical Review Physics Education Research

  40. arXiv:2403.18727  [pdf, ps, other]

    math.RT

    Modular representations of the Yangian $Y_2$

    Authors: Hao Chang, Jinxin Hu, Lewis Topley

    Abstract: Let $Y_2$ be the Yangian associated to the general linear Lie algebra $\mathfrak{gl}_2$, defined over an algebraically closed field $\mathbbm{k}$ of characteristic $p > 0$. In this paper, we study the representation theory of the restricted Yangian $Y^{[p]}_2$. This leads to a description of the representations of $\mathfrak{gl}_{2n}$, whose $p$-character is nilpotent with Jordan type given by a t…

    Submitted 8 May, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: Minor corrections, 19 pages

  41. arXiv:2403.17754  [pdf, other]

    cs.CG

    Optimal Euclidean Tree Covers

    Authors: Hsien-Chih Chang, Jonathan Conroy, Hung Le, Lazar Milenkovic, Shay Solomon, Cuong Than

    Abstract: A $(1+\varepsilon)\textit{-stretch tree cover}$ of a metric space is a collection of trees, where every pair of points has a $(1+\varepsilon)$-stretch path in one of the trees. The celebrated $\textit{Dumbbell Theorem}$ [Arya et al. STOC'95] states that any set of $n$ points in $d$-dimensional Euclidean space admits a $(1+\varepsilon)$-stretch tree cover with…

    Submitted 26 March, 2024; originally announced March 2024.

  42. arXiv:2403.16428  [pdf, other]

    cs.CV

    Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

    Authors: Zicong Fan, Takehiko Ohkawa, Linlin Yang, Nie Lin, Zhishan Zhou, Shihao Zhou, Jiajun Liang, Zhong Gao, Xuanyang Zhang, Xue Zhang, Fei Li, Liu Zheng, Feng Lu, Karim Abou Zeid, Bastian Leibe, Jeongwan On, Seungryul Baek, Aditya Prakash, Saurabh Gupta, Kun He, Yoichi Sato, Otmar Hilliges, Hyung Jin Chang, Angela Yao

    Abstract: We interact with the world with our hands and see it through our own (egocentric) perspective. A holistic 3D understanding of such interactions from egocentric views is important for tasks in robotics, AR/VR, action recognition and motion generation. Accurately reconstructing such interactions in 3D is challenging due to heavy occlusion, viewpoint bias, camera distortion, and motion blur from the…

    Submitted 25 March, 2024; originally announced March 2024.

  43. arXiv:2403.15664  [pdf, other]

    cs.CV

    What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation

    Authors: Yihua Cheng, Yaning Zhu, Zongji Wang, Hongquan Hao, Yongwei Liu, Shiqing Cheng, Xi Wang, Hyung Jin Chang

    Abstract: Driver's eye gaze holds a wealth of cognitive and intentional cues crucial for intelligent vehicles. Despite its significance, research on in-vehicle gaze estimation remains limited due to the scarcity of comprehensive and well-annotated datasets in real driving scenarios. In this paper, we present three novel elements to advance in-vehicle gaze research. Firstly, we introduce IVGaze, a pioneering…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: CVPR24

  44. arXiv:2403.13551  [pdf, other]

    cs.CV cs.LG

    Ground-A-Score: Scaling Up the Score Distillation for Multi-Attribute Editing

    Authors: Hangeol Chang, Jinho Chang, Jong Chul Ye

    Abstract: Despite recent advancements in text-to-image diffusion models facilitating various image editing techniques, complex text prompts often lead to an oversight of some requests due to a bottleneck in processing text information. To tackle this challenge, we present Ground-A-Score, a simple yet powerful model-agnostic image editing method by incorporating grounding during score distillation. This appr…

    Submitted 20 March, 2024; originally announced March 2024.

  45. arXiv:2403.11163  [pdf, ps, other]

    stat.ME cs.LG math.ST stat.CO

    A Selective Review on Statistical Methods for Massive Data Computation: Distributed Computing, Subsampling, and Minibatch Techniques

    Authors: Xuetong Li, Yuan Gao, Hong Chang, Danyang Huang, Yingying Ma, Rui Pan, Haobo Qi, Feifei Wang, Shuyuan Wu, Ke Xu, Jing Zhou, Xuening Zhu, Yingqiu Zhu, Hansheng Wang

    Abstract: This paper presents a selective review of statistical computation methods for massive data analysis. A huge amount of statistical methods for massive data computation have been rapidly developed in the past decades. In this work, we focus on three categories of statistical computation methods: (1) distributed computing, (2) subsampling methods, and (3) minibatch gradient techniques. The first clas…

    Submitted 17 March, 2024; originally announced March 2024.

  46. arXiv:2403.10036  [pdf, other]

    cs.CV

    SparseFusion: Efficient Sparse Multi-Modal Fusion Framework for Long-Range 3D Perception

    Authors: Yiheng Li, Hongyang Li, Zehao Huang, Hong Chang, Naiyan Wang

    Abstract: Multi-modal 3D object detection has exhibited significant progress in recent years. However, most existing methods can hardly scale to long-range scenarios due to their reliance on dense 3D features, which substantially escalate computational demands and memory usage. In this paper, we introduce SparseFusion, a novel multi-modal fusion framework fully built upon sparse 3D features to facilitate ef…

    Submitted 15 March, 2024; originally announced March 2024.

  47. arXiv:2403.09404  [pdf, other]

    cs.AI

    Heuristic Reasoning in AI: Instrumental Use and Mimetic Absorption

    Authors: Anirban Mukherjee, Hannah Hanwen Chang

    Abstract: Deviating from conventional perspectives that frame artificial intelligence (AI) systems solely as logic emulators, we propose a novel program of heuristic reasoning. We distinguish between the 'instrumental' use of heuristics to match resources with objectives, and 'mimetic absorption,' whereby heuristics manifest randomly and universally. Through a series of innovative experiments, including var…

    Submitted 18 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  48. arXiv:2403.09289  [pdf, other]

    cs.AI

    Silico-centric Theory of Mind

    Authors: Anirban Mukherjee, Hannah Hanwen Chang

    Abstract: Theory of Mind (ToM) refers to the ability to attribute mental states, such as beliefs, desires, intentions, and knowledge, to oneself and others, and to understand that these mental states can differ from one's own and from reality. We investigate ToM in environments with multiple, distinct, independent AI agents, each possessing unique internal states, information, and objectives. Inspired by hu…

    Submitted 14 March, 2024; originally announced March 2024.

  49. arXiv:2403.06225  [pdf, other]

    cs.CV cs.AI

    MoST: Motion Style Transformer between Diverse Action Contents

    Authors: Boeun Kim, Jungho Kim, Hyung Jin Chang, Jin Young Choi

    Abstract: While existing motion style transfer methods are effective between two motions with identical content, their performance significantly diminishes when transferring style between motions with different contents. This challenge lies in the lack of clear separation between content and style of a motion. To tackle this challenge, we propose a novel motion style transformer that effectively disentangle…

    Submitted 20 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  50. arXiv:2403.03535  [pdf, other]

    cs.CV cs.LG

    Task Attribute Distance for Few-Shot Learning: Theoretical Analysis and Applications

    Authors: Minyang Hu, Hong Chang, Zong Guo, Bingpeng Ma, Shiguan Shan, Xilin Chen

    Abstract: Few-shot learning (FSL) aims to learn novel tasks with very few labeled samples by leveraging experience from \emph{related} training tasks. In this paper, we try to understand FSL by delving into two key questions: (1) How to quantify the relationship between \emph{training} and \emph{novel} tasks? (2) How does the relationship affect the \emph{adaptation difficulty} on novel tasks for different…

    Submitted 6 March, 2024; originally announced March 2024.