Skip to main content

Showing 1–50 of 1,650 results for author: Cheng, Y

  1. arXiv:2407.08582  [pdf, other

    cs.CL

    On the Universal Truthfulness Hyperplane Inside LLMs

    Authors: Junteng Liu, Shiqi Chen, Yu Cheng, Junxian He

    Abstract: While large language models (LLMs) have demonstrated remarkable abilities across various fields, hallucination remains a significant challenge. Recent studies have explored hallucinations through the lens of internal representations, proposing mechanisms to decipher LLMs' adherence to facts. However, these approaches often fail to generalize to out-of-distribution data, leading to concerns about w… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  2. arXiv:2407.08564  [pdf, other

    cs.AI

    The Career Interests of Large Language Models

    Authors: Meng Hua, Yuan Cheng, Hengshu Zhu

    Abstract: Recent advancements in Large Language Models (LLMs) have significantly extended their capabilities, evolving from basic text generation to complex, human-like interactions. In light of the possibilities that LLMs could assume significant workplace responsibilities, it becomes imminently necessary to explore LLMs' capacities as professional assistants. This study focuses on the aspect of career int… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

  3. arXiv:2407.07403  [pdf, other

    cs.CV

    A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends

    Authors: Daizong Liu, Mingyu Yang, Xiaoye Qu, Pan Zhou, Yu Cheng, Wei Hu

    Abstract: With the significant development of large models in recent years, Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities across a wide range of multimodal understanding and reasoning tasks. Compared to traditional Large Language Models (LLMs), LVLMs present great potential and challenges due to its closer proximity to the multi-resource real-world applications and the compl… ▽ More

    Submitted 11 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

  4. arXiv:2407.06938  [pdf, other

    cs.CV

    RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models

    Authors: Bowen Zhang, Yiji Cheng, Chunyu Wang, Ting Zhang, Jiaolong Yang, Yansong Tang, Feng Zhao, Dong Chen, Baining Guo

    Abstract: We present RodinHD, which can generate high-fidelity 3D avatars from a portrait image. Existing methods fail to capture intricate details such as hairstyles which we tackle in this paper. We first identify an overlooked problem of catastrophic forgetting that arises when fitting triplanes sequentially on many avatars, caused by the MLP decoder sharing scheme. To overcome this issue, we raise a nov… ▽ More

    Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: ECCV 2024; project page: https://rodinhd.github.io/

  5. arXiv:2407.06529  [pdf

    cs.LG q-fin.ST

    Advanced Financial Fraud Detection Using GNN-CL Model

    Authors: Yu Cheng, Junjie Guo, Shiqing Long, You Wu, Mengfang Sun, Rong Zhang

    Abstract: The innovative GNN-CL model proposed in this paper marks a breakthrough in the field of financial fraud detection by synergistically combining the advantages of graph neural networks (gnn), convolutional neural networks (cnn) and long short-term memory (LSTM) networks. This convergence enables multifaceted analysis of complex transaction patterns, improving detection accuracy and resilience agains… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

  6. arXiv:2407.06199  [pdf

    cs.DB

    Data Governance and Data Management in Operations and Supply Chain: A Literature Review

    Authors: Xuejiao Li, Yang Cheng, Xiaoning Xia, Charles Møller

    Abstract: In the dynamic landscape of contemporary business, the wave in data and technological advancements has directed companies toward embracing data-driven decision-making processes. Despite the vast potential that data holds for strategic insights and operational efficiencies, substantial challenges arise in the form of data issues. Recognizing these obstacles, the imperative for effective data govern… ▽ More

    Submitted 21 June, 2024; originally announced July 2024.

  7. arXiv:2407.06087  [pdf, other

    cs.LG cs.CV

    Analytic Convolutional Layer: A Step to Analytic Neural Network

    Authors: Jingmao Cui, Donglai Tao, Linmi Tao, Ruiyang Liu, Yu Cheng

    Abstract: The prevailing approach to embedding prior knowledge within convolutional layers typically includes the design of steerable kernels or their modulation using designated kernel banks. In this study, we introduce the Analytic Convolutional Layer (ACL), an innovative model-driven convolutional layer, which is a mosaic of analytical convolution kernels (ACKs) and traditional convolution kernels. ACKs… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

  8. arXiv:2407.05365  [pdf, other

    cs.AI

    ElecBench: a Power Dispatch Evaluation Benchmark for Large Language Models

    Authors: Xiyuan Zhou, Huan Zhao, Yuheng Cheng, Yuji Cao, Gaoqi Liang, Guolong Liu, Junhua Zhao

    Abstract: In response to the urgent demand for grid stability and the complex challenges posed by renewable energy integration and electricity market dynamics, the power sector increasingly seeks innovative technological solutions. In this context, large language models (LLMs) have become a key technology to improve efficiency and promote intelligent progress in the power sector with their excellent natural… ▽ More

    Submitted 7 July, 2024; originally announced July 2024.

  9. arXiv:2407.05099  [pdf, other

    math.OC

    Equilibrium Strategies of Carbon Emission Reduction in Agricultural Product Supply Chain under Carbon Sink Trading

    Authors: Tingting Meng, Yukun Cheng, Xujin Pu, Rui Li

    Abstract: As global climate change and environmental issues escalate, carbon reduction has emerged as a paramount global concern. Agriculture accounts for approximately 30% of global greenhouse gas emissions, making carbon reduction in this sector crucial for attaining global emission targets. Carbon sink trading serves as a supplementary mechanism to achieve carbon peaking and neutrality, helping to lower… ▽ More

    Submitted 6 July, 2024; originally announced July 2024.

  10. arXiv:2407.04557  [pdf, other

    cond-mat.mtrl-sci cs.LG

    Structural Constraint Integration in Generative Model for Discovery of Quantum Material Candidates

    Authors: Ryotaro Okabe, Mouyang Cheng, Abhijatmedhi Chotrattanapituk, Nguyen Tuan Hung, Xiang Fu, Bowen Han, Yao Wang, Weiwei Xie, Robert J. Cava, Tommi S. Jaakkola, Yongqiang Cheng, Mingda Li

    Abstract: Billions of organic molecules are known, but only a tiny fraction of the functional inorganic materials have been discovered, a particularly relevant problem to the community searching for new quantum materials. Recent advancements in machine-learning-based generative models, particularly diffusion models, show great promise for generating new, stable materials. However, integrating geometric patt… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 512 pages total, 4 main figures + 218 supplementary figures

  11. arXiv:2407.03604  [pdf, other

    cs.CL cs.CV

    Lateralization LoRA: Interleaved Instruction Tuning with Modality-Specialized Adaptations

    Authors: Zhiyang Xu, Minqian Liu, Ying Shen, Joy Rimchala, Jiaxin Zhang, Qifan Wang, Yu Cheng, Lifu Huang

    Abstract: Recent advancements in Vision-Language Models (VLMs) have led to the development of Vision-Language Generalists (VLGs) capable of understanding and generating interleaved images and text. Despite these advances, VLGs still struggle to follow user instructions for interleaved text and image generation. To address this issue, we introduce LeafInstruct, the first open-sourced interleaved instruction… ▽ More

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 8 Pages, visual instruction tuning, parameter-efficient tuning

  12. arXiv:2407.03374  [pdf

    cs.AI cs.SE eess.SP eess.SY

    An Outline of Prognostics and Health Management Large Model: Concepts, Paradigms, and Challenges

    Authors: Laifa Tao, Shangyu Li, Haifei Liu, Qixuan Huang, Liang Ma, Guoao Ning, Yiling Chen, Yunlong Wu, Bin Li, Weiwei Zhang, Zhengduo Zhao, Wenchao Zhan, Wenyan Cao, Chao Wang, Hongmei Liu, Jian Ma, Mingliang Suo, Yujie Cheng, Yu Ding, Dengwei Song, Chen Lu

    Abstract: Prognosis and Health Management (PHM), critical for ensuring task completion by complex systems and preventing unexpected failures, is widely adopted in aerospace, manufacturing, maritime, rail, energy, etc. However, PHM's development is constrained by bottlenecks like generalization, interpretation and verification abilities. Presently, generative artificial intelligence (AI), represented by Larg… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  13. arXiv:2407.03331  [pdf, other

    cs.CV cs.AI cs.DC

    Anole: Adapting Diverse Compressed Models For Cross-Scene Prediction On Mobile Devices

    Authors: Yunzhe Li, Hongzi Zhu, Zhuohong Deng, Yunlong Cheng, Liang Zhang, Shan Chang, Minyi Guo

    Abstract: Emerging Artificial Intelligence of Things (AIoT) applications desire online prediction using deep neural network (DNN) models on mobile devices. However, due to the movement of devices, unfamiliar test samples constantly appear, significantly affecting the prediction accuracy of a pre-trained DNN. In addition, unstable network connection calls for local model inference. In this paper, we propose… ▽ More

    Submitted 9 May, 2024; originally announced July 2024.

  14. arXiv:2407.00772  [pdf, other

    cond-mat.str-el cond-mat.mtrl-sci physics.chem-ph

    Core-level signature of long-range charge-density-wave order and short-range excitonic correlations probed by attosecond broadband spectroscopy

    Authors: Alfred Zong, Sheng-Chih Lin, Shunsuke A. Sato, Emma Berger, Bailey R. Nebgen, Marcus Hui, B. Q. Lv, Yun Cheng, Wei Xia, Yanfeng Guo, Dao Xiang, Michael W. Zuerch

    Abstract: Strongly-correlated quantum materials are characterized by a multitude of time- and energy-scales that govern quasiparticle interactions and excitations, and a precise understanding of their ground state properties necessitate an access to experimental observables that are both sensitive to the establishment of long-range order and short-range correlations. Advances in attosecond core-level spectr… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

  15. arXiv:2407.00754  [pdf, ps, other

    q-bio.MN

    Gene Regulatory Network Inference with Covariance Dynamics

    Authors: Yue Wang, Peng Zheng, Yu-Chen Cheng, Zikun Wang, Aleksandr Aravkin

    Abstract: Determining gene regulatory network (GRN) structure is a central problem in biology, with a variety of inference methods available for different types of data. For a widely prevalent and challenging use case, namely single-cell gene expression data measured after intervention at multiple time points with unknown joint distributions, there is only one known specifically developed method, which does… ▽ More

    Submitted 17 June, 2024; originally announced July 2024.

  16. arXiv:2407.00421  [pdf

    physics.optics

    Multi-wavelength switchable single-frequency hyper Raman microlasers

    Authors: Chuntao Li, Ni Yao, Jintian Lin, Renhong Gao, Jianglin Guan, Guanghui Zhao, Minghui Li, Min Wang, Lingling Qiao, Ya Cheng

    Abstract: Multi-wavelength switchable single-frequency microlasers in a broad spectral range are highly desirable for integrated photonic applications due to their dynamic switching functionality, narrow linewidth, and high side-mode-suppression-ratio (SMSR). Here, a strategy based on highly efficient successive excitation of different stimulated multi-photon hyper-Raman scattering (SMPHRS) processes is pro… ▽ More

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 17 pages,5 figures, and 1 table

  17. arXiv:2407.00031  [pdf, other

    cs.DC cs.SE

    Supercharging Federated Learning with Flower and NVIDIA FLARE

    Authors: Holger R. Roth, Daniel J. Beutel, Yan Cheng, Javier Fernandez Marques, Heng Pan, Chester Chen, Zhihong Zhang, Yuhong Wen, Sean Yang, Isaac, Yang, Yuan-Ting Hsieh, Ziyue Xu, Daguang Xu, Nicholas D. Lane, Andrew Feng

    Abstract: Several open-source systems, such as Flower and NVIDIA FLARE, have been developed in recent years while focusing on different aspects of federated learning (FL). Flower is dedicated to implementing a cohesive approach to FL, analytics, and evaluation. Over time, Flower has cultivated extensive strategies and algorithms tailored for FL application development, fostering a vibrant FL community in re… ▽ More

    Submitted 21 May, 2024; originally announced July 2024.

  18. arXiv:2406.19966  [pdf, other

    cs.CL

    Simulating Financial Market via Large Language Model based Agents

    Authors: Shen Gao, Yuntao Wen, Minghang Zhu, Jianing Wei, Yuhan Cheng, Qunzi Zhang, Shuo Shang

    Abstract: Most economic theories typically assume that financial market participants are fully rational individuals and use mathematical models to simulate human behavior in financial markets. However, human behavior is often not entirely rational and is challenging to predict accurately with mathematical models. In this paper, we propose \textbf{A}gent-based \textbf{S}imulated \textbf{F}inancial \textbf{M}… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

  19. arXiv:2406.19774  [pdf, other

    cs.CL

    Direct Preference Knowledge Distillation for Large Language Models

    Authors: Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei

    Abstract: In the field of large language models (LLMs), Knowledge Distillation (KD) is a critical technique for transferring capabilities from teacher models to student models. However, existing KD methods face limitations and challenges in distillation of LLMs, including efficiency and insufficient measurement capabilities of traditional KL divergence. It is shown that LLMs can serve as an implicit reward… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

  20. arXiv:2406.19133  [pdf

    physics.ao-ph

    Multiphase buffering by ammonia sustains sulfate production in atmospheric aerosols

    Authors: Guangjie Zheng, Hang Su, Meinrat O. Andreae, Ulrich Pöschl, Yafang Cheng

    Abstract: Multiphase oxidation of sulfur dioxide (SO2) is an important source of sulfate in the atmosphere. There are, however, concerns that protons produced during SO2 oxidation may cause rapid acidification of aerosol water and thereby quickly shut down the fast reactions favored at high pH. Here, we show that the sustainability of sulfate production is controlled by the competing effects of multiphase b… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  21. arXiv:2406.19055  [pdf, other

    cs.CV

    SimpleFusion: A Simple Fusion Framework for Infrared and Visible Images

    Authors: Ming Chen, Yuxuan Cheng, Xinwei He, Xinyue Wang, Yan Aze, Jinhai Xiang

    Abstract: Integrating visible and infrared images into one high-quality image, also known as visible and infrared image fusion, is a challenging yet critical task for many downstream vision tasks. Most existing works utilize pretrained deep neural networks or design sophisticated frameworks with strong priors for this task, which may be unsuitable or lack flexibility. This paper presents SimpleFusion, a sim… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: code:https://github.com/hxwxss/SimpleFusion-A-Simple-Fusion-Framework-for-Infrared-and-Visible-Images

  22. arXiv:2406.16942  [pdf, other

    eess.IV cs.AI cs.CV

    Enhancing Diagnostic Reliability of Foundation Model with Uncertainty Estimation in OCT Images

    Authors: Yuanyuan Peng, Aidi Lin, Meng Wang, Tian Lin, Ke Zou, Yinglin Cheng, Tingkun Shi, Xulong Liao, Lixia Feng, Zhen Liang, Xinjian Chen, Huazhu Fu, Haoyu Chen

    Abstract: Inability to express the confidence level and detect unseen classes has limited the clinical implementation of artificial intelligence in the real-world. We developed a foundation model with uncertainty estimation (FMUE) to detect 11 retinal conditions on optical coherence tomography (OCT). In the internal test set, FMUE achieved a higher F1 score of 96.76% than two state-of-the-art algorithms, RE… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: All codes are available at https://github.com/yuanyuanpeng0129/FMUE

  23. arXiv:2406.16554  [pdf, other

    cs.CL

    LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training

    Authors: Tong Zhu, Xiaoye Qu, Daize Dong, Jiacheng Ruan, Jingqi Tong, Conghui He, Yu Cheng

    Abstract: Mixture-of-Experts (MoE) has gained increasing popularity as a promising framework for scaling up large language models (LLMs). However, training MoE from scratch in a large-scale setting still suffers from data-hungry and instability problems. Motivated by this limit, we investigate building MoE models from existing dense large language models. Specifically, based on the well-known LLaMA-2 7B mod… ▽ More

    Submitted 24 June, 2024; originally announced June 2024.

  24. arXiv:2406.15480  [pdf, other

    cs.CL cs.AI cs.LG

    On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion

    Authors: Chenghao Fan, Zhenyi Lu, Wei Wei, Jie Tian, Xiaoye Qu, Dangyang Chen, Yu Cheng

    Abstract: Efficient fine-tuning of large language models for task-specific applications is imperative, yet the vast number of parameters in these models makes their training increasingly challenging. Despite numerous proposals for effective methods, a substantial memory overhead remains for gradient computations during updates. \thm{Can we fine-tune a series of task-specific small models and transfer their… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: submit under review

  25. arXiv:2406.15479  [pdf, other

    cs.CL cs.AI cs.LG

    Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging

    Authors: Zhenyi Lu, Chenghao Fan, Wei Wei, Xiaoye Qu, Dangyang Chen, Yu Cheng

    Abstract: In the era of large language models, model merging is a promising way to combine multiple task-specific models into a single multitask model without extra training. However, two challenges remain: (a) interference between different models and (b) heterogeneous data during testing. Traditional model merging methods often show significant performance gaps compared to fine-tuned models due to these i… ▽ More

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: submit in review

  26. arXiv:2406.14885  [pdf, other

    cs.HC

    Ink and Algorithm: Exploring Temporal Dynamics in Human-AI Collaborative Writing

    Authors: Kaixun Yang, Yixin Cheng, Linxuan Zhao, Mladen Raković, Zachari Swiecki, Dragan Gašević, Guanliang Chen

    Abstract: The advent of Generative Artificial Intelligence (GAI) has revolutionized the field of writing, marking a shift towards human-AI collaborative writing in education. However, the dynamics of human-AI interaction in the collaborative writing process are not well understood, and thus it remains largely unknown how human learning can be effectively supported with such cutting-edge GAI technologies. In… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

  27. arXiv:2406.14663  [pdf, other

    astro-ph.GA astro-ph.SR

    MagMar III -- Resisting the Pressure, Is the Magnetic Field Overwhelmed in NGC6334I?

    Authors: Paulo C. Cortes, Josep M. Girart, Patricio Sanhueza, Junhao Liu, Sergio Martin, Ian W. Stephens, Henrik Beuther, Patrick M. Koch, M. Fernandez-Lopez, Alvaro Sanchez-Monge, Jia-Wei Wang, Kaho Morii, Shanghuo Li, Piyali Saha, Qizhou Zhang, David Rebolledo, Luis A. Zapata, Ji-hyun Kang, Wenyu Jiao, Jongsoo Kim, Yu Cheng, Jihye Hwang, Eun Jung Chung, Spandan Choudhury, A-Ran Lyo , et al. (1 additional authors not shown)

    Abstract: We report on ALMA observations of polarized dust emission at 1.2 mm from NGC6334I, a source known for its significant flux outbursts. Between five months, our data show no substantial change in total intensity and a modest 8\% variation in linear polarization, suggesting a phase of stability or the conclusion of the outburst. The magnetic field, inferred from this polarized emission, displays a pr… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Accepted for Publication at the Astrophysical Journal

  28. arXiv:2406.14192  [pdf, other

    cs.CL cs.AI

    Timo: Towards Better Temporal Reasoning for Language Models

    Authors: Zhaochen Su, Jun Zhang, Tong Zhu, Xiaoye Qu, Juntao Li, Min Zhang, Yu Cheng

    Abstract: Reasoning about time is essential for Large Language Models (LLMs) to understand the world. Previous works focus on solving specific tasks, primarily on time-sensitive question answering. While these methods have proven effective, they cannot generalize to a wider spectrum of temporal reasoning tasks. Therefore, we propose a crucial question: Can we build a universal framework to handle a variety… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Under review

  29. arXiv:2406.13998  [pdf, other

    math.CO

    Transversal Hamilton paths and cycles

    Authors: Yangyang Cheng, Wanting Sun, Guanghui Wang, Lan Wei

    Abstract: Given a collection $\mathcal{G} =\{G_1,G_2,\dots,G_m\}$ of graphs on the common vertex set $V$ of size $n$, an $m$-edge graph $H$ on the same vertex set $V$ is transversal in $\mathcal{G}$ if there exists a bijection $\varphi :E(H)\rightarrow [m]$ such that $e \in E(G_{\varphi(e)})$ for all $e\in E(H)$. Denote $δ(\mathcal{G}):=\operatorname*{min}\left\{δ(G_i): i\in [m]\right\}$. In this paper, we… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 33 pages, 10 figures

    MSC Class: 05C35

  30. arXiv:2406.13960  [pdf, other

    cs.CL cs.AI

    Evolving to be Your Soulmate: Personalized Dialogue Agents with Dynamically Adapted Personas

    Authors: Yi Cheng, Wenge Liu, Kaishuai Xu, Wenjun Hou, Yi Ouyang, Chak Tou Leong, Xian Wu, Yefeng Zheng

    Abstract: Previous research on persona-based dialogue agents typically preset the agent's persona before deployment, which remains static thereafter. In this paper, we take a step further and explore a new paradigm called Self-evolving Personalized Dialogue Agents (SPDA), where the agent continuously evolves during the conversation to better align with the user's anticipation by dynamically adapting its per… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Work in progress

  31. arXiv:2406.13934  [pdf, other

    cs.CL cs.AI

    Reasoning Like a Doctor: Improving Medical Dialogue Systems via Diagnostic Reasoning Process Alignment

    Authors: Kaishuai Xu, Yi Cheng, Wenjun Hou, Qiaoyu Tan, Wenjie Li

    Abstract: Medical dialogue systems have attracted significant attention for their potential to act as medical assistants. Enabling these medical systems to emulate clinicians' diagnostic reasoning process has been the long-standing research focus. Previous studies rudimentarily realized the simulation of clinicians' diagnostic process by fine-tuning language models on high-quality dialogue datasets. Nonethe… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted by ACL 2024 Findings

  32. arXiv:2406.13776  [pdf, other

    physics.comp-ph physics.flu-dyn

    Flow and clogging of capillary droplets

    Authors: Yuxuan Cheng, Benjamin F. Lonial, Shivnag Sista, David J. Meer, Anisa Hofert, Eric R. Weeks, Mark D. Shattuck, Corey S. O'Hern

    Abstract: Capillary droplets form due to surface tension when two immiscible fluids are mixed. We describe the motion of gravity-driven capillary droplets flowing through narrow constrictions and obstacle arrays in both simulations and experiments. Our new capillary deformable particle model recapitulates the shape and velocity of single oil droplets in water as they pass through narrow constrictions in mic… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  33. arXiv:2406.13578  [pdf, other

    cs.CL

    Enhancing Distractor Generation for Multiple-Choice Questions with Retrieval Augmented Pretraining and Knowledge Graph Integration

    Authors: Han-Cheng Yu, Yu-An Shih, Kin-Man Law, Kai-Yu Hsieh, Yu-Chen Cheng, Hsin-Chih Ho, Zih-An Lin, Wen-Chuan Hsu, Yao-Chung Fan

    Abstract: In this paper, we tackle the task of distractor generation (DG) for multiple-choice questions. Our study introduces two key designs. First, we propose \textit{retrieval augmented pretraining}, which involves refining the language model pretraining to align it more closely with the downstream task of DG. Second, we explore the integration of knowledge graphs to enhance the performance of DG. Throug… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Findings at ACL 2024

  34. arXiv:2406.13231  [pdf, other

    cs.DS

    Tight Lower Bounds for Directed Cut Sparsification and Distributed Min-Cut

    Authors: Yu Cheng, Max Li, Honghao Lin, Zi-Yi Tai, David P. Woodruff, Jason Zhang

    Abstract: In this paper, we consider two fundamental cut approximation problems on large graphs. We prove new lower bounds for both problems that are optimal up to logarithmic factors. The first problem is to approximate cuts in balanced directed graphs. In this problem, the goal is to build a data structure that $(1 \pm ε)$-approximates cut values in graphs with $n$ vertices. For arbitrary directed graph… ▽ More

    Submitted 19 June, 2024; originally announced June 2024.

  35. arXiv:2406.12217  [pdf

    physics.optics physics.app-ph

    An integrated electro-optically tunable multi-channel interference cavity laser

    Authors: Junxia Zhou, Yiran Zhu, Botao Fu, Jinming Chen, Huiting Song, Zhihao Zhang, Jianping Yu, Jian Liu, Min Wang, Jia Qi, Ya Cheng

    Abstract: We demonstrated a continuously tunable laser system by butt coupling a reflective semiconductor optical amplifier (RSOA) chip with a thin-film lithium niobate (TFLN) based multi-channel interference (MCI) cavity chip. This hybrid integrated lasers allows for fine-tuning of the laser wavelength from 1538 nm to 1560 nm with a resolution of 0.014 nm and a side-mode suppression ratio (SMSR) exceeding… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  36. arXiv:2406.11353  [pdf, other

    cs.LG cs.CL

    $\texttt{MoE-RBench}$: Towards Building Reliable Language Models with Sparse Mixture-of-Experts

    Authors: Guanjie Chen, Xinyu Zhao, Tianlong Chen, Yu Cheng

    Abstract: Mixture-of-Experts (MoE) has gained increasing popularity as a promising framework for scaling up large language models (LLMs). However, the reliability assessment of MoE lags behind its surging applications. Moreover, when transferred to new domains such as in fine-tuning MoE models sometimes underperform their dense counterparts. Motivated by the research gap and counter-intuitive phenomenon, we… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 9 pages, 8 figures, camera ready on ICML2024

  37. arXiv:2406.11256  [pdf, other

    cs.CL

    Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts

    Authors: Tong Zhu, Daize Dong, Xiaoye Qu, Jiacheng Ruan, Wenliang Chen, Yu Cheng

    Abstract: Mixture-of-Experts (MoE) models have shown remarkable capability in instruction tuning, especially when the number of tasks scales. However, previous methods simply merge all training tasks (e.g. creative writing, coding, and mathematics) and apply fixed sampling weights, without considering the importance of different tasks as the model training state changes. In this way, the most helpful data c… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

  38. arXiv:2406.10807  [pdf, other

    stat.ML cs.AI cs.LG stat.AP

    Bayesian Networks and Machine Learning for COVID-19 Severity Explanation and Demographic Symptom Classification

    Authors: Oluwaseun T. Ajayi, Yu Cheng

    Abstract: With the prevailing efforts to combat the coronavirus disease 2019 (COVID-19) pandemic, there are still uncertainties that are yet to be discovered about its spread, future impact, and resurgence. In this paper, we present a three-stage data-driven approach to distill the hidden information about COVID-19. The first stage employs a Bayesian network structure learning method to identify the causal… ▽ More

    Submitted 17 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

  39. arXiv:2406.10715  [pdf, other

    physics.optics quant-ph

    Chip-scale generation of 60-mode continuous-variable cluster states

    Authors: Ze Wang, Kangkang Li, Yue Wang, Xin Zhou, Yinke Cheng, Boxuan Jing, Fengxiao Sun, Jincheng Li, Zhilin Li, Qihuang Gong, Qiongyi He, Bei-Bei Li, Qi-Fan Yang

    Abstract: Increasing the number of entangled entities is crucial for achieving exponential computational speedups and secure quantum networks. Despite recent progress in generating large-scale entanglement through continuous-variable (CV) cluster states, translating these technologies to photonic chips has been hindered by decoherence, limiting the number of entangled entities to 8. Here, we demonstrate 60-… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

  40. arXiv:2406.10517  [pdf, other

    cs.IR cs.AI cs.LG

    ADSNet: Cross-Domain LTV Prediction with an Adaptive Siamese Network in Advertising

    Authors: Ruize Wang, Hui Xu, Ying Cheng, Qi He, Xing Zhou, Rui Feng, Wei Xu, Lei Huang, Jie Jiang

    Abstract: Advertising platforms have evolved in estimating Lifetime Value (LTV) to better align with advertisers' true performance metric. However, the sparsity of real-world LTV data presents a significant challenge to LTV predictive model(i.e., pLTV), severely limiting the their capabilities. Therefore, we propose to utilize external data, in addition to the internal data of advertising platform, to expan… ▽ More

    Submitted 15 June, 2024; originally announced June 2024.

    Comments: Accepted to KDD 2024

  41. arXiv:2406.09773  [pdf

    cs.CV cs.AI

    Research on Edge Detection of LiDAR Images Based on Artificial Intelligence Technology

    Authors: Haowei Yang, Liyang Wang, Jingyu Zhang, Yu Cheng, Ao Xiang

    Abstract: With the widespread application of Light Detection and Ranging (LiDAR) technology in fields such as autonomous driving, robot navigation, and terrain mapping, the importance of edge detection in LiDAR images has become increasingly prominent. Traditional edge detection methods often face challenges in accuracy and computational complexity when processing LiDAR images. To address these issues, this… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  42. arXiv:2406.09765  [pdf

    q-fin.RM cs.CL

    Application of Natural Language Processing in Financial Risk Detection

    Authors: Liyang Wang, Yu Cheng, Ao Xiang, Jingyu Zhang, Haowei Yang

    Abstract: This paper explores the application of Natural Language Processing (NLP) in financial risk detection. By constructing an NLP-based financial risk detection model, this study aims to identify and predict potential risks in financial documents and communications. First, the fundamental concepts of NLP and its theoretical foundation, including text mining methods, NLP model design principles, and mac… ▽ More

    Submitted 20 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

  43. arXiv:2406.09295  [pdf, other

    cs.CL cs.CV

    AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models

    Authors: Yuhang Wu, Wenmeng Yu, Yean Cheng, Yan Wang, Xiaohan Zhang, Jiazheng Xu, Ming Ding, Yuxiao Dong

    Abstract: Evaluating the alignment capabilities of large Vision-Language Models (VLMs) is essential for determining their effectiveness as helpful assistants. However, existing benchmarks primarily focus on basic abilities using nonverbal methods, such as yes-no and multiple-choice questions. In this paper, we address this gap by introducing AlignMMBench, a comprehensive alignment benchmark specifically des… ▽ More

    Submitted 13 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  44. arXiv:2406.09270  [pdf, other

    astro-ph.HE

    Discovery and Extensive Follow-Up of SN 2024ggi, a nearby type IIP supernova in NGC 3621

    Authors: Ting-Wan Chen, Sheng Yang, Shubham Srivastav, Takashi J. Moriya, Stephen J. Smartt, Sofia Rest, Armin Rest, Hsing Wen Lin, Hao-Yu Miao, Yu-Chi Cheng, Amar Aryan, Chia-Yu Cheng, Morgan Fraser, Li-Ching Huang, Meng-Han Lee, Cheng-Han Lai, Yu Hsuan Liu, Aiswarya Sankar. K, Ken W. Smith, Heloise F. Stevance, Ze-Ning Wang, Joseph P. Anderson, Charlotte R. Angus, Thomas de Boer, Kenneth Chambers , et al. (23 additional authors not shown)

    Abstract: We present the discovery and early observations of the nearby Type II supernova (SN) 2024ggi in NGC 3621 at 6.64 +/- 0.3 Mpc. The SN was caught 5.8 (+1.9 -2.9) hours after its explosion by the ATLAS survey. Early-phase, high-cadence, and multi-band photometric follow-up was performed by the Kinder (Kilonova Finder) project, collecting over 1000 photometric data points within a week. The combined o… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 11 pages, 5 figures in manuscript, 6 pages in appendix, submitted to ApJL

  45. arXiv:2406.09149  [pdf

    cond-mat.mtrl-sci physics.app-ph

    Structural changes in Ge1-xSnx and Si1-x-yGeySnx thin films on SOI substrates treated by pulse laser annealing

    Authors: Oliver Steuer, Daniel Schwarz, Michael Oehme, Florian Bärwolf, Yu Cheng, Fabian Ganss, René Hübner, René Heller, Shengqiang Zhou, Manfred Helm, Gianaurelio Cuniberti, Yordan M. Georgiev, Slawomir Prucnal

    Abstract: Ge1-xSnx and Si1-x-yGeySnx alloys are promising materials for future opto- and nanoelectronics applications. These alloys enable effective band-gap engineering, broad adjustability of their lattice parameter, exhibit much higher carrier mobility than pure Si, and are compatible with the CMOS technology. Unfortunately, the equilibrium solid solubility of Sn in Si1-xGex is less than 1% and the pseud… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  46. arXiv:2406.09072  [pdf, other

    cs.CL

    Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?

    Authors: Zhaochen Su, Juntao Li, Jun Zhang, Tong Zhu, Xiaoye Qu, Pan Zhou, Yan Bowen, Yu Cheng, Min zhang

    Abstract: Temporal reasoning is fundamental for large language models (LLMs) to comprehend the world. Current temporal reasoning datasets are limited to questions about single or isolated events, falling short in mirroring the realistic temporal characteristics involving concurrent nature and intricate temporal interconnections. In this paper, we introduce CoTempQA, a comprehensive co-temporal Question Answ… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted to the ACL 2024 main conference

  47. arXiv:2406.08744  [pdf

    physics.optics physics.app-ph

    Compact low-half-wave-voltage thin film lithium niobate electro-optic phase modulator fabricated by photolithography assisted chemo-mechanical etching

    Authors: Lang Gao, Youting Liang, Jinming Chen, Jianping Yu, Jia Qi, Lvbin Song, Jian Liu, Zhaoxiang Liu, Hongxin Qi, Ya Cheng

    Abstract: This paper presents a compact dual-arm thin film lithium niobate (TFLN) electro-optic phase modulator fabricated using the photolithography-assisted chemo-mechanical etching (PLACE) technique. The design of the device allows for complete utilization of the microwave electric field, doubling the modulation efficiency compared to single-arm modulators in theory. With a half-wave voltage of approxima… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

  48. arXiv:2406.08698  [pdf, other

    astro-ph.HE hep-ph

    Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen , et al. (255 additional authors not shown)

    Abstract: In this work we try to search for signals generated by ultra-heavy dark matter at the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma-ray by dark matter annihilation or decay from 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, accepted by PRL

  49. arXiv:2406.08155  [pdf, other

    cs.LG cs.AI cs.CL

    Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark

    Authors: Pingzhi Li, Xiaolong Jin, Yu Cheng, Tianlong Chen

    Abstract: Large Language Models~(LLMs) have become foundational in the realm of natural language processing, demonstrating performance improvements as model sizes increase. The Mixture-of-Experts~(MoE) approach offers a promising way to scale LLMs more efficiently by using fewer computational FLOPs through sparse activation. However, it suffers from significant memory overheads, necessitating model compress… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Our code for reproducing all our experiments is provided at https://github.com/UNITES-Lab/moe-quantization

  50. arXiv:2406.08035  [pdf, other

    cs.CV cs.AI

    LVBench: An Extreme Long Video Understanding Benchmark

    Authors: Weihan Wang, Zehai He, Wenyi Hong, Yean Cheng, Xiaohan Zhang, Ji Qi, Shiyu Huang, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang

    Abstract: Recent progress in multimodal large language models has markedly enhanced the understanding of short videos (typically under one minute), and several evaluation datasets have emerged accordingly. However, these advancements fall short of meeting the demands of real-world applications such as embodied intelligence for long-term decision-making, in-depth movie reviews and discussions, and live sport… ▽ More

    Submitted 12 June, 2024; originally announced June 2024.