Skip to main content

Showing 1–50 of 933 results for author: Han, W

  1. arXiv:2407.01436  [pdf, other

    cs.CV cs.RO

    AdaOcc: Adaptive Forward View Transformation and Flow Modeling for 3D Occupancy and Flow Prediction

    Authors: Dubing Chen, Wencheng Han, Jin Fang, Jianbing Shen

    Abstract: In this technical report, we present our solution for the Vision-Centric 3D Occupancy and Flow Prediction track in the nuScenes Open-Occ Dataset Challenge at CVPR 2024. Our innovative approach involves a dual-stage framework that enhances 3D occupancy and flow predictions by incorporating adaptive forward view transformation and flow modeling. Initially, we independently train the occupancy model,… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 2nd Place in the 3D Occupancy and Flow Prediction Challenge (CVPR24)

  2. arXiv:2407.01378  [pdf, other

    cs.LG cs.NI

    Beyond Throughput and Compression Ratios: Towards High End-to-end Utility of Gradient Compression

    Authors: Wenchen Han, Shay Vargaftik, Michael Mitzenmacher, Brad Karp, Ran Ben Basat

    Abstract: Gradient aggregation has long been identified as a major bottleneck in today's large-scale distributed machine learning training systems. One promising solution to mitigate such bottlenecks is gradient compression, directly reducing communicated gradient data volume. However, in practice, many gradient compression schemes do not achieve acceleration of the training process while also preserving ac… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 9 pages, 3 figures

  3. arXiv:2407.00136  [pdf, other

    hep-ex

    Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, S. Ahmed, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, X. H. Bai, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, J. Bloms, A. Bortone, I. Boyko, R. A. Briere , et al. (495 additional authors not shown)

    Abstract: Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions… ▽ More

    Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

  4. arXiv:2406.19135  [pdf, other

    eess.AS cs.AI

    DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability

    Authors: Hyun Joon Park, Jin Sob Kim, Wooseok Shin, Sung Won Han

    Abstract: Expressive Text-to-Speech (TTS) using reference speech has been studied extensively to synthesize natural speech, but there are limitations to obtaining well-represented styles and improving model generalization ability. In this study, we present Diffusion-based EXpressive TTS (DEX-TTS), an acoustic model designed for reference-based speech synthesis with enhanced style representations. Based on a… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Preprint

  5. arXiv:2406.17255  [pdf, other

    cs.CL

    MPCODER: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning

    Authors: Zhenlong Dai, Chang Yao, WenKang Han, Ying Yuan, Zhipeng Gao, Jingyuan Chen

    Abstract: Large Language Models (LLMs) have demonstrated great potential for assisting developers in their daily development. However, most research focuses on generating correct code, how to use LLMs to generate personalized code has seldom been investigated. To bridge this gap, we proposed MPCoder (Multi-user Personalized Code Generator) to generate personalized code for multiple users. To better learn co… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL 2024, Main Conference

  6. arXiv:2406.16271  [pdf, other

    cs.CV

    Feature-prompting GBMSeg: One-Shot Reference Guided Training-Free Prompt Engineering for Glomerular Basement Membrane Segmentation

    Authors: Xueyu Liu, Guangze Shi, Rui Wang, Yexin Lai, Jianan Zhang, Lele Sun, Quan Yang, Yongfei Wu, MIng Li, Weixia Han, Wen Zheng

    Abstract: Assessment of the glomerular basement membrane (GBM) in transmission electron microscopy (TEM) is crucial for diagnosing chronic kidney disease (CKD). The lack of domain-independent automatic segmentation tools for the GBM necessitates an AI-based solution to automate the process. In this study, we introduce GBMSeg, a training-free framework designed to automatically segment the GBM in TEM images… ▽ More

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: Accepted for MICCAI2024

  7. arXiv:2406.13358  [pdf, other

    cs.CV eess.IV

    Multi-scale Restoration of Missing Data in Optical Time-series Images with Masked Spatial-Temporal Attention Network

    Authors: Zaiyan Zhang, Jining Yan, Yuanqi Liang, Jiaxin Feng, Haixu He, Wei Han

    Abstract: Due to factors such as thick cloud cover and sensor limitations, remote sensing images often suffer from significant missing data, resulting in incomplete time-series information. Existing methods for imputing missing values in remote sensing images do not fully exploit spatio-temporal auxiliary information, leading to limited accuracy in restoration. Therefore, this paper proposes a novel deep le… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  8. arXiv:2406.12647  [pdf, ps, other

    physics.soc-ph

    Evolution of cooperation with the diversity of cooperation tendencies

    Authors: Linya Huang, Wenchen Han

    Abstract: The complete cooperation and the complete defection are two typical strategies considered in evolutionary games in many previous works. However, in real life, strategies of individuals are full of variety rather than only two complete ones. In this work, the diversity of strategies is introduced into the weak prisoners' dilemma game, which is measured by the diversity of the cooperation tendency.… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 7 pages, 5 figures, 1 table

  9. arXiv:2406.12331  [pdf, other

    cs.CL cs.AI

    Retrieval Meets Reasoning: Dynamic In-Context Editing for Long-Text Understanding

    Authors: Weizhi Fei, Xueyan Niu, Guoqing Xie, Yanhua Zhang, Bo Bai, Lei Deng, Wei Han

    Abstract: Current Large Language Models (LLMs) face inherent limitations due to their pre-defined context lengths, which impede their capacity for multi-hop reasoning within extensive textual contexts. While existing techniques like Retrieval-Augmented Generation (RAG) have attempted to bridge this gap by sourcing external information, they fall short when direct answers are not readily available. We introd… ▽ More

    Submitted 18 June, 2024; originally announced June 2024.

  10. arXiv:2406.12016  [pdf, other

    cs.LG

    Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization

    Authors: Seungwoo Son, Wonpyo Park, Woohyun Han, Kyuyeun Kim, Jaeho Lee

    Abstract: Despite recent advances in LLM quantization, activation quantization remains to be challenging due to the activation outliers. Conventional remedies, e.g., mixing precisions for different channels, introduce extra overhead and reduce the speedup. In this work, we develop a simple yet effective strategy to facilitate per-tensor activation quantization by preventing the generation of problematic tok… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  11. arXiv:2406.11643  [pdf, other

    cs.CV

    AnyMaker: Zero-shot General Object Customization via Decoupled Dual-Level ID Injection

    Authors: Lingjie Kong, Kai Wu, Xiaobin Hu, Wenhui Han, Jinlong Peng, Chengming Xu, Donghao Luo, Jiangning Zhang, Chengjie Wang, Yanwei Fu

    Abstract: Text-to-image based object customization, aiming to generate images with the same identity (ID) as objects of interest in accordance with text prompts and reference images, has made significant progress. However, recent customizing research is dominated by specialized tasks, such as human customization or virtual try-on, leaving a gap in general object customization. To this end, we introduce AnyM… ▽ More

    Submitted 5 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  12. Ents: An Efficient Three-party Training Framework for Decision Trees by Communication Optimization

    Authors: Guopeng Lin, Weili Han, Wenqiang Ruan, Ruisheng Zhou, Lushan Song, Bingshuai Li, Yunfeng Shao

    Abstract: Multi-party training frameworks for decision trees based on secure multi-party computation enable multiple parties to train high-performance models on distributed private data with privacy preservation. The training process essentially involves frequent dataset splitting according to the splitting criterion (e.g. Gini impurity). However, existing multi-party training frameworks for decision trees… ▽ More

    Submitted 3 July, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: This paper is the full version of a paper to appear in ACM CCS 2024

  13. arXiv:2406.05314  [pdf, other

    eess.AS cs.AI eess.SP

    Relational Proxy Loss for Audio-Text based Keyword Spotting

    Authors: Youngmoon Jung, Seungjin Lee, Joon-Young Yang, Jaeyoung Roh, Chang Woo Han, Hoon-Young Cho

    Abstract: In recent years, there has been an increasing focus on user convenience, leading to increased interest in text-based keyword enrollment systems for keyword spotting (KWS). Since the system utilizes text input during the enrollment phase and audio input during actual usage, we call this task audio-text based KWS. To enable this task, both acoustic and text encoders are typically trained using deep… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 5 pages, 2 figures, Accepted by Interspeech 2024

  14. arXiv:2406.05039  [pdf, other

    cs.CV cs.CL

    Bootstrapping Referring Multi-Object Tracking

    Authors: Yani Zhang, Dongming Wu, Wencheng Han, Xingping Dong

    Abstract: Referring multi-object tracking (RMOT) aims at detecting and tracking multiple objects following human instruction represented by a natural language expression. Existing RMOT benchmarks are usually formulated through manual annotations, integrated with static regulations. This approach results in a dearth of notable diversity and a constrained scope of implementation. In this work, our key idea is… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

  15. arXiv:2406.03813  [pdf, other

    cs.RO

    Touch100k: A Large-Scale Touch-Language-Vision Dataset for Touch-Centric Multimodal Representation

    Authors: Ning Cheng, Changhao Guan, Jing Gao, Weihao Wang, You Li, Fandong Meng, Jie Zhou, Bin Fang, Jinan Xu, Wenjuan Han

    Abstract: Touch holds a pivotal position in enhancing the perceptual and interactive capabilities of both humans and robots. Despite its significance, current tactile research mainly focuses on visual and tactile modalities, overlooking the language domain. Inspired by this, we construct Touch100k, a paired touch-language-vision dataset at the scale of 100k, featuring tactile sensation descriptions in multi… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

  16. arXiv:2406.03728  [pdf, other

    cs.CV

    Evaluating Durability: Benchmark Insights into Multimodal Watermarking

    Authors: Jielin Qiu, William Han, Xuandong Zhao, Shangbang Long, Christos Faloutsos, Lei Li

    Abstract: With the development of large models, watermarks are increasingly employed to assert copyright, verify authenticity, or monitor content distribution. As applications become more multimodal, the utility of watermarking techniques becomes even more critical. The effectiveness and reliability of these watermarks largely depend on their robustness to various disturbances. However, the robustness of th… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  17. arXiv:2405.20610  [pdf, other

    cs.CV

    Revisiting and Maximizing Temporal Knowledge in Semi-supervised Semantic Segmentation

    Authors: Wooseok Shin, Hyun Joon Park, Jin Sob Kim, Sung Won Han

    Abstract: In semi-supervised semantic segmentation, the Mean Teacher- and co-training-based approaches are employed to mitigate confirmation bias and coupling problems. However, despite their high performance, these approaches frequently involve complex training pipelines and a substantial computational burden, limiting the scalability and compatibility of these methods. In this paper, we propose a PrevMatc… ▽ More

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 14 pages, 5 figures, submitted to IEEE TPAMI. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  18. arXiv:2405.19554  [pdf, ps, other

    math.NA

    Numerical analysis of a 1/2-equation model of turbulence

    Authors: Wei-Wei Han, Rui Fang, William Layton

    Abstract: The recent 1/2-equation model of turbulence is a simplification of the standard Kolmogorov-Prandtl 1-equation URANS model. Surprisingly, initial numerical tests indicated that the 1/2-equation model produces comparable velocity statistics at reduced cost. It is also a test problem and first step for developing numerical analysis to address a full 1-equation model. This report begins the numerical… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  19. arXiv:2405.19521  [pdf, other

    cs.LG stat.ML

    Crowdsourcing with Difficulty: A Bayesian Rating Model for Heterogeneous Items

    Authors: Seong Woo Han, Ozan Adıgüzel, Bob Carpenter

    Abstract: In applied statistics and machine learning, the "gold standards" used for training are often biased and almost always noisy. Dawid and Skene's justifiably popular crowdsourcing model adjusts for rater (coder, annotator) sensitivity and specificity, but fails to capture distributional properties of rating data gathered for training, which in turn biases training. In this study, we introduce a gener… ▽ More

    Submitted 29 May, 2024; originally announced May 2024.

  20. arXiv:2405.12809  [pdf, other

    hep-ex

    Precision measurement of the branching fraction of \boldmath $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere , et al. (604 additional authors not shown)

    Abstract: Using a sample of $448.1 \times 10^6$ $ψ(2S)$ events collected with the BESIII detector, we perform a study of the decay $J/ψ\rightarrow K^+K^-$ via $ψ(2S)\rightarrow π^+π^-J/ψ$. The branching fraction of $J/ψ\rightarrow K^+K^-$ is determined to be $\mathcal{B}_{K^+K^-}=(3.072\pm 0.023({\rm stat.})\pm 0.050({\rm syst.}))\times 10^{-4}$, which is consistent with previous measurements but with sig… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: to be submitted to PRD

  21. arXiv:2405.12779  [pdf

    cs.LG cs.AI

    Transformer in Touch: A Survey

    Authors: Jing Gao, Ning Cheng, Bin Fang, Wenjuan Han

    Abstract: The Transformer model, initially achieving significant success in the field of natural language processing, has recently shown great potential in the application of tactile perception. This review aims to comprehensively outline the application and development of Transformers in tactile technology. We first introduce the two fundamental concepts behind the success of the Transformer: the self-atte… ▽ More

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 27 pages, 2 tables, 5 figures, accepted by ICIC 2024

  22. arXiv:2405.11279  [pdf, other

    gr-qc

    The Effect of Higher Harmonics On Gravitational Wave Dark Sirens

    Authors: Jian-Dong Liu, Wen-Biao Han, Qianyun Yun, Shu-Cheng Yang

    Abstract: The gravitational wave (GW) signal from the merger of two black holes can serve as a standard sirens for cosmological inference. However, a degeneracy exists between the luminosity distance and the inclination angle between the binary system's orbital angular momentum and the observer's line of sight, limiting the precise measurement of the luminosity distance. In this study, we investigate how hi… ▽ More

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: 14 pages, 6 figures

  23. arXiv:2405.09796  [pdf

    physics.acc-ph

    Prototype Design of a Digital Low-level RF System for S-band Deflectors

    Authors: J. F. Zhu, H. L. Ding, H. K. Li, Y. Li, X. W. Dai, J. W. Han, W. Q. Zhang

    Abstract: S-band deflectors are generally operated on pulsed mode for beam diagnosis. We plan to deploy 5 S-band (2997 MHz) deflectors to accurately measure the longitudinal time distribution of ultra-short electron beam pulses in Shenzhen Superconducting Soft X-ray Free Electron Laser (S3FEL). A microwave system of one deflector consists of a low-level RF system (LLRF), a solid-state amplifier, waveguide c… ▽ More

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 3 pages, 5 figures, IPAC'24 - 15th International Particle Accelerator Conference

  24. arXiv:2405.09585  [pdf, other

    cs.LG cs.AI

    An Embarrassingly Simple Approach to Enhance Transformer Performance in Genomic Selection for Crop Breeding

    Authors: Renqi Chen, Wenwei Han, Haohao Zhang, Haoyang Su, Zhefan Wang, Xiaolei Liu, Hao Jiang, Wanli Ouyang, Nanqing Dong

    Abstract: Genomic selection (GS), as a critical crop breeding strategy, plays a key role in enhancing food production and addressing the global hunger crisis. The predominant approaches in GS currently revolve around employing statistical methods for prediction. However, statistical methods often come with two main limitations: strong statistical priors and linear assumptions. A recent trend is to capture t… ▽ More

    Submitted 24 June, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI2024. Code is available at https://github.com/RenqiChen/Genomic-Selection

  25. arXiv:2405.09552  [pdf, other

    eess.IV cs.AI cs.CV

    ODFormer: Semantic Fundus Image Segmentation Using Transformer for Optic Nerve Head Detection

    Authors: Jiayi Wang, Yi-An Mao, Xiaoyu Ma, Sicen Guo, Yuting Shao, Xiao Lv, Wenting Han, Mark Christopher, Linda M. Zangwill, Yanlong Bi, Rui Fan

    Abstract: Optic nerve head (ONH) detection has been a crucial area of study in ophthalmology for years. However, the significant discrepancy between fundus image datasets, each generated using a single type of fundus camera, poses challenges to the generalizability of ONH detection approaches developed based on semantic segmentation networks. Despite the numerous recent advancements in general-purpose seman… ▽ More

    Submitted 2 June, 2024; v1 submitted 15 April, 2024; originally announced May 2024.

  26. arXiv:2405.09066  [pdf, other

    hep-ex

    Search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, M. Albrecht, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, R. Baldini Ferroli, I. Balossino, Y. Ban, V. Batozskaya, D. Becker, K. Begzsuren, N. Berger, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, J. Bloms, A. Bortone, I. Boyko , et al. (559 additional authors not shown)

    Abstract: We present the first search for the leptonic decays $D^{*+}\to e^+ν_e$ and $D^{*+}\to μ^+ν_μ$ by analyzing a data sample of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.178 and 4.226 GeV, corresponding to an integrated luminosity of 6.32~fb$^{-1}$. No significant signal is observed. The upper limits on the branching fractions for… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 14 pages, 7 figures

  27. arXiv:2405.08707  [pdf, other

    cs.LG

    Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory

    Authors: Xueyan Niu, Bo Bai, Lei Deng, Wei Han

    Abstract: Increasing the size of a Transformer model does not always lead to enhanced performance. This phenomenon cannot be explained by the empirical scaling laws. Furthermore, improved generalization ability occurs as the model memorizes the training samples. We present a theoretical framework that sheds light on the memorization process and performance dynamics of transformer-based language models. We m… ▽ More

    Submitted 14 May, 2024; originally announced May 2024.

  28. arXiv:2405.06410  [pdf, other

    cs.CL

    Potential and Limitations of LLMs in Capturing Structured Semantics: A Case Study on SRL

    Authors: Ning Cheng, Zhaohui Yan, Ziming Wang, Zhijie Li, Jiaming Yu, Zilong Zheng, Kewei Tu, Jinan Xu, Wenjuan Han

    Abstract: Large Language Models (LLMs) play a crucial role in capturing structured semantics to enhance language understanding, improve interpretability, and reduce bias. Nevertheless, an ongoing controversy exists over the extent to which LLMs can grasp structured semantics. To assess this, we propose using Semantic Role Labeling (SRL) as a fundamental task to explore LLMs' ability to extract structured se… ▽ More

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted by ICIC 2024

  29. arXiv:2404.14684  [pdf, other

    gr-qc

    Tests of gravitational wave propagation with LIGO-Virgo catalog

    Authors: Xian-Liang Wang, Shu-Cheng Yang, Wen-Biao Han

    Abstract: In the framework of general relativity (GR), gravitational waves (GWs) are theorized to travel at the speed of light across all frequencies. However, Lorentz invariance (LI) violation and weak equivalence principle (WEP) violation may lead to frequency-dependent variations in the propagation speed of GWs, which can be examined by comparing the theoretical and observed discrepancies in the arrival… ▽ More

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 10 pages, 9 figures

  30. arXiv:2404.08522  [pdf, other

    cs.LG physics.ao-ph

    Fuxi-DA: A Generalized Deep Learning Data Assimilation Framework for Assimilating Satellite Observations

    Authors: Xiaoze Xu, Xiuyu Sun, Wei Han, Xiaohui Zhong, Lei Chen, Hao Li

    Abstract: Data assimilation (DA), as an indispensable component within contemporary Numerical Weather Prediction (NWP) systems, plays a crucial role in generating the analysis that significantly impacts forecast performance. Nevertheless, the development of an efficient DA system poses significant challenges, particularly in establishing intricate relationships between the background data and the vast amoun… ▽ More

    Submitted 12 April, 2024; originally announced April 2024.

  31. arXiv:2404.07436  [pdf, other

    hep-ex

    Measurement of $e^{+}e^{-}\to ωη^{\prime}$ cross sections at $\sqrt{s}=$ 2.000 to 3.080 GeV

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (599 additional authors not shown)

    Abstract: The Born cross sections for the process $e^{+}e^{-}\to ωη^{\prime}$ are measured at 22 center-of-mass energies from 2.000 to 3.080 GeV using data collected with the BESIII detector at the BEPCII collider. A resonant structure is observed with a statistical significance of 9.6$σ$. A Breit-Wigner fit determines its mass to be $M_R=(2153\pm30\pm31)~{\rm{MeV}}/c^{2}$ and its width to be… ▽ More

    Submitted 10 April, 2024; originally announced April 2024.

  32. arXiv:2404.05558  [pdf, other

    eess.IV cs.CV

    JDEC: JPEG Decoding via Enhanced Continuous Cosine Coefficients

    Authors: Woo Kyoung Han, Sunghoon Im, Jaedeok Kim, Kyong Hwan Jin

    Abstract: We propose a practical approach to JPEG image decoding, utilizing a local implicit neural representation with continuous cosine formulation. The JPEG algorithm significantly quantizes discrete cosine transform (DCT) spectra to achieve a high compression rate, inevitably resulting in quality degradation while encoding an image. We have designed a continuous cosine spectrum estimator to address the… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

  33. arXiv:2404.04640  [pdf, other

    hep-ex

    Search for di-photon decays of an axion-like particle in radiative J/ψdecays

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, O. Afedulidis, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, H. -R. Bao, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko , et al. (604 additional authors not shown)

    Abstract: We search for the di-photon decay of a light pseudoscalar axion-like particle, $a$, in radiative $J/ψ$ decays, using 10 billion $J/ψ$ events collected with the BESIII detector. We find no evidence of a signal and set upper limits at the $95\%$ confidence level on the product branching fraction $\mathcal{B}(J/ψ\to γa) \times \mathcal{B}(a \to γγ)$ and the axion-like particle photon coupling constan… ▽ More

    Submitted 3 July, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

    Comments: 9 pages, 5 figures, To be published in Phys. Rev. D (Letter)

    Report number: BESIII Analysis Memo - 671

  34. arXiv:2403.19267  [pdf, other

    cs.CL cs.AI

    MineLand: Simulating Large-Scale Multi-Agent Interactions with Limited Multimodal Senses and Physical Needs

    Authors: Xianhao Yu, Jiaqi Fu, Renjia Deng, Wenjuan Han

    Abstract: While Vision-Language Models (VLMs) hold promise for tasks requiring extensive collaboration, traditional multi-agent simulators have facilitated rich explorations of an interactive artificial society that reflects collective behavior. However, these existing simulators face significant limitations. Firstly, they struggle with handling large numbers of agents due to high resource demands. Secondly… ▽ More

    Submitted 23 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Project website: https://github.com/cocacola-lab/MineLand

  35. arXiv:2403.19091  [pdf, other

    hep-ex

    Observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$

    Authors: BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, R. Aliberti, A. Amoroso, M. R. An, Q. An, Y. Bai, O. Bakina, I. Balossino, Y. Ban, V. Batozskaya, K. Begzsuren, N. Berger, M. Berlowski, M. Bertani, D. Bettoni, F. Bianchi, E. Bianco, A. Bortone, I. Boyko, R. A. Briere, A. Brueggemann , et al. (600 additional authors not shown)

    Abstract: By analyzing $e^+e^-$ annihilation data corresponding to an integrated luminosity of 2.93 $\rm fb^{-1}$ collected at a center-of-mass energy of 3.773 GeV with the \text{BESIII} detector, the first observation of the semileptonic decays $D^0\rightarrow K_S^0π^-π^0 e^+ ν_e$ and $D^+\rightarrow K_S^0π^+π^- e^+ ν_e$ is reported. With a dominant hadronic contribution from $K_1(1270)$, the branching fra… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 19pages

  36. arXiv:2403.18320  [pdf, other

    math.OC

    Online Prediction for Streaming Tensor Time Series

    Authors: Zhenting Luan, Haoning Wang, Liping Zhang, Shansuo Liang, Wei Han

    Abstract: Real-time prediction plays a vital role in various control systems, such as traffic congestion control and wireless channel resource allocation. In these scenarios, the predictor usually needs to track the evolution of the latent statistical patterns in the modern high-dimensional streaming time series continuously and quickly, which presents new challenges for traditional prediction methods. This… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

  37. arXiv:2403.17598  [pdf

    eess.SY

    Ultrafast Adaptive Primary Frequency Tuning and Secondary Frequency Identification for S/S WPT system

    Authors: Chang Liu, Wei Han, Guangyu Yan, Bowang Zhang, Chunlin Li

    Abstract: Magnetic resonance wireless power transfer (WPT) technology is increasingly being adopted across diverse applications. However, its effectiveness can be significantly compromised by parameter shifts within the resonance network, owing to its high system quality factor. Such shifts are inherent and challenging to mitigate during the manufacturing process. In response, this article introduces a rapi… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 11 pages,16 figures,to be published in IEEE Transactions on Industrial Electronics

  38. arXiv:2403.12339  [pdf, other

    cs.CV

    Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition

    Authors: Jielin Qiu, William Han, Winfred Wang, Zhengyuan Yang, Linjie Li, Jianfeng Wang, Christos Faloutsos, Lei Li, Lijuan Wang

    Abstract: Open-domain real-world entity recognition is essential yet challenging, involving identifying various entities in diverse environments. The lack of a suitable evaluation dataset has been a major obstacle in this field due to the vast number of entities and the extensive human effort required for data curation. We introduce Entity6K, a comprehensive dataset for real-world entity recognition, featur… ▽ More

    Submitted 18 March, 2024; originally announced March 2024.

  39. arXiv:2403.09936  [pdf

    cond-mat.mtrl-sci cond-mat.mes-hall

    Anomalous Raman Response in 2D Magnetic FeTe under Uniaxial Strain: Tetragonal and Hexagonal Polymorphs

    Authors: Wuxiao Han, Tiansong Zhang, Pengcheng Zhao, Longfei Yang, Mo Cheng, Jianping Shi, Yabin Chen

    Abstract: Two-dimensional (2D) Fe-chalcogenides have emerged with rich structures, magnetisms and superconductivities, which sparked the growing research interests in the torturous transition mechanism and tunable properties for their potential applications in nanoelectronics. Uniaxial strain can produce a lattice distortion to study symmetry breaking induced exotic properties in 2D magnets. Herein, the ano… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 21 pages, 5 figures

  40. arXiv:2403.09813  [pdf, other

    cs.CV cs.RO

    Towards Comprehensive Multimodal Perception: Introducing the Touch-Language-Vision Dataset

    Authors: Ning Cheng, You Li, Jing Gao, Bin Fang, Jinan Xu, Wenjuan Han

    Abstract: Tactility provides crucial support and enhancement for the perception and interaction capabilities of both humans and robots. Nevertheless, the multimodal research related to touch primarily focuses on visual and tactile modalities, with limited exploration in the domain of language. Beyond vocabulary, sentence-level descriptions contain richer semantics. Based on this, we construct a touch-langua… ▽ More

    Submitted 17 June, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted by ICIC 2024

  41. arXiv:2403.09223  [pdf, other

    cs.LG eess.SP

    MCformer: Multivariate Time Series Forecasting with Mixed-Channels Transformer

    Authors: Wenyong Han, Tao Zhu Member, Liming Chen, Huansheng Ning, Yang Luo, Yaping Wan

    Abstract: The massive generation of time-series data by largescale Internet of Things (IoT) devices necessitates the exploration of more effective models for multivariate time-series forecasting. In previous models, there was a predominant use of the Channel Dependence (CD) strategy (where each channel represents a univariate sequence). Current state-of-the-art (SOTA) models primarily rely on the Channel In… ▽ More

    Submitted 14 March, 2024; originally announced March 2024.

  42. arXiv:2403.08801  [pdf, other

    cs.CV

    CoBra: Complementary Branch Fusing Class and Semantic Knowledge for Robust Weakly Supervised Semantic Segmentation

    Authors: Woojung Han, Seil Kang, Kyobin Choo, Seong Jae Hwang

    Abstract: Leveraging semantically precise pseudo masks derived from image-level class knowledge for segmentation, namely image-level Weakly Supervised Semantic Segmentation (WSSS), still remains challenging. While Class Activation Maps (CAMs) using CNNs have steadily been contributing to the success of WSSS, the resulting activation maps often narrowly focus on class-specific parts (e.g., only face of human… ▽ More

    Submitted 27 May, 2024; v1 submitted 5 February, 2024; originally announced March 2024.

  43. arXiv:2403.08302  [pdf, other

    cs.RO

    Online Multi-Contact Feedback Model Predictive Control for Interactive Robotic Tasks

    Authors: Seo Wook Han, Maged Iskandar, Jinoh Lee, Min Jun Kim

    Abstract: In this paper, we propose a model predictive control (MPC) that accomplishes interactive robotic tasks, in which multiple contacts may occur at unknown locations. To address such scenarios, we made an explicit contact feedback loop in the MPC framework. An algorithm called Multi-Contact Particle Filter with Exploration Particle (MCP-EP) is employed to establish real-time feedback of multi-contact… ▽ More

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: This paper has been accepted for publication at the IEEE International Conference on Robotics and Automation (ICRA), Yokohama, 2024

  44. arXiv:2403.06516  [pdf, other

    cs.CV

    Advancing Text-Driven Chest X-Ray Generation with Policy-Based Reinforcement Learning

    Authors: Woojung Han, Chanyoung Kim, Dayun Ju, Yumin Shim, Seong Jae Hwang

    Abstract: Recent advances in text-conditioned image generation diffusion models have begun paving the way for new opportunities in modern medical domain, in particular, generating Chest X-rays (CXRs) from diagnostic reports. Nonetheless, to further drive the diffusion models to generate CXRs that faithfully reflect the complexity and diversity of real data, it has become evident that a nontrivial learning a… ▽ More

    Submitted 11 March, 2024; originally announced March 2024.

  45. arXiv:2403.05574  [pdf, other

    cs.HC cs.AI cs.CL

    HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy

    Authors: Mengxi Xiao, Qianqian Xie, Ziyan Kuang, Zhicheng Liu, Kailai Yang, Min Peng, Weiguang Han, Jimin Huang

    Abstract: Large Language Models (LLMs) can play a vital role in psychotherapy by adeptly handling the crucial task of cognitive reframing and overcoming challenges such as shame, distrust, therapist skill variability, and resource scarcity. Previous LLMs in cognitive reframing mainly converted negative emotions to positive ones, but these approaches have limited efficacy, often not promoting clients' self-d… ▽ More

    Submitted 22 March, 2024; v1 submitted 26 February, 2024; originally announced March 2024.

    Comments: 17 pages, 4 figures

    ACM Class: J.4

  46. arXiv:2403.05530  [pdf, other

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  47. arXiv:2403.03004  [pdf, other

    astro-ph.CO gr-qc hep-ph

    Ultralight vector dark matter search using data from the KAGRA O3GK run

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi , et al. (1778 additional authors not shown)

    Abstract: Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we prese… ▽ More

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: 20 pages, 5 figures

    Report number: LIGO-P2300250

  48. arXiv:2403.02870  [pdf, other

    cs.AI cs.CR cs.LG

    Precise Extraction of Deep Learning Models via Side-Channel Attacks on Edge/Endpoint Devices

    Authors: Younghan Lee, Sohee Jun, Yungi Cho, Woorim Han, Hyungon Moon, Yunheung Paek

    Abstract: With growing popularity, deep learning (DL) models are becoming larger-scale, and only the companies with vast training datasets and immense computing power can manage their business serving such large models. Most of those DL models are proprietary to the companies who thus strive to keep their private models safe from the model extraction attack (MEA), whose aim is to steal the model by training… ▽ More

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted by 27th European Symposium on Research in Computer Security (ESORICS 2022)

  49. arXiv:2403.02846  [pdf, other

    cs.LG cs.AI cs.CR cs.DC

    FLGuard: Byzantine-Robust Federated Learning via Ensemble of Contrastive Models

    Authors: Younghan Lee, Yungi Cho, Woorim Han, Ho Bae, Yunheung Paek

    Abstract: Federated Learning (FL) thrives in training a global model with numerous clients by only sharing the parameters of their local models trained with their private training datasets. Therefore, without revealing the private dataset, the clients can obtain a deep learning (DL) model with high performance. However, recent research proposed poisoning attacks that cause a catastrophic loss in the accurac… ▽ More

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted by 28th European Symposium on Research in Computer Security (ESORICS 2023)

  50. arXiv:2403.01628  [pdf, ps, other

    cs.LG

    Recent Advances, Applications, and Open Challenges in Machine Learning for Health: Reflections from Research Roundtables at ML4H 2023 Symposium

    Authors: Hyewon Jeong, Sarah Jabbour, Yuzhe Yang, Rahul Thapta, Hussein Mozannar, William Jongwon Han, Nikita Mehandru, Michael Wornow, Vladislav Lialin, Xin Liu, Alejandro Lozano, Jiacheng Zhu, Rafal Dariusz Kocielnik, Keith Harrigian, Haoran Zhang, Edward Lee, Milos Vukadinovic, Aparna Balagopalan, Vincent Jeanselme, Katherine Matton, Ilker Demirel, Jason Fries, Parisa Rashidi, Brett Beaulieu-Jones, Xuhai Orson Xu , et al. (18 additional authors not shown)

    Abstract: The third ML4H symposium was held in person on December 10, 2023, in New Orleans, Louisiana, USA. The symposium included research roundtable sessions to foster discussions between participants and senior researchers on timely and relevant topics for the \ac{ML4H} community. Encouraged by the successful virtual roundtables in the previous year, we organized eleven in-person roundtables and four vir… ▽ More

    Submitted 5 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: ML4H 2023, Research Roundtables