Showing 1–50 of 7,407 results for author: Wang, C

  1. arXiv:2407.09417  [pdf, other]

    cs.CL cs.IR

    Mitigating Entity-Level Hallucination in Large Language Models

    Authors: Weihang Su, Yichen Tang, Qingyao Ai, Changyue Wang, Zhijing Wu, Yiqun Liu

    Abstract: The emergence of Large Language Models (LLMs) has revolutionized how users access information, shifting from traditional search engines to direct question-and-answer interactions with LLMs. However, the widespread adoption of LLMs has revealed a significant challenge known as hallucination, wherein LLMs generate coherent yet factually inaccurate responses. This hallucination phenomenon has led to…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2407.09329  [pdf, ps, other]

    math.FA

    Function spaces on formal manifolds

    Authors: Fulin Chen, Binyong Sun, Chuyun Wang

    Abstract: This is a paper in a series that studies smooth relative Lie algebra homologies and cohomologies based on the theory of formal manifolds and formal Lie groups. In a previous paper, we introduce the notion of formal manifolds and develop the foundational framework of formal manifolds. In this paper, we study various function spaces on formal manifolds, including generalizations of vector-valued gen…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: The preprint arXiv:2401.01535v1 of ours was split into three separate papers. This is the second paper about vector-valued generalized functions and vector-valued distributions

  3. arXiv:2407.09247  [pdf, other]

    cs.AI

    Constrained Intrinsic Motivation for Reinforcement Learning

    Authors: Xiang Zheng, Xingjun Ma, Chao Shen, Cong Wang

    Abstract: This paper investigates two fundamental problems that arise when utilizing Intrinsic Motivation (IM) for reinforcement learning in Reward-Free Pre-Training (RFPT) tasks and Exploration with Intrinsic Motivation (EIM) tasks: 1) how to design an effective intrinsic objective in RFPT tasks, and 2) how to reduce the bias introduced by the intrinsic objective in EIM tasks. Existing IM methods suffer fr…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: Accepted by IJCAI 2024

  4. arXiv:2407.09048  [pdf, other]

    cs.AI

    KUNPENG: An Embodied Large Model for Intelligent Maritime

    Authors: Naiyao Wang, Tongbang Jiang, Ye Wang, Shaoyang Qiu, Bo Zhang, Xinqiang Xie, Munan Li, Chunliu Wang, Yiyang Wang, Hongxiang Ren, Ruili Wang, Hongjun Shan, Hongbo Liu

    Abstract: Intelligent maritime, as an essential component of smart ocean construction, deeply integrates advanced artificial intelligence technology and data analysis methods, which covers multiple aspects such as smart vessels, route optimization, safe navigation, aiming to enhance the efficiency of ocean resource utilization and the intelligence of transportation networks. However, the complex and dynamic…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 9 pages, 3 figures

  5. arXiv:2407.08948  [pdf, other]

    eess.IV cs.CV

    Symmetry Awareness Encoded Deep Learning Framework for Brain Imaging Analysis

    Authors: Yang Ma, Dongang Wang, Peilin Liu, Lynette Masters, Michael Barnett, Weidong Cai, Chenyu Wang

    Abstract: The heterogeneity of neurological conditions, ranging from structural anomalies to functional impairments, presents a significant challenge in medical imaging analysis tasks. Moreover, the limited availability of well-annotated datasets constrains the development of robust analysis models. Against this backdrop, this study introduces a novel approach leveraging the inherent anatomical symmetrical…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: MICCAI 2024

    ACM Class: I.2.10; I.4.10

  6. arXiv:2407.08865  [pdf, other]

    cs.CV

    Single-Image Shadow Removal Using Deep Learning: A Comprehensive Survey

    Authors: Lanqing Guo, Chong Wang, Yufei Wang, Siyu Huang, Wenhan Yang, Alex C. Kot, Bihan Wen

    Abstract: Shadow removal aims at restoring the image content within shadow regions, pursuing a uniform distribution of illumination that is consistent between shadow and non-shadow regions. Compared to other image restoration tasks, there are two unique challenges in shadow removal: 1) The patterns of shadows are arbitrary, varied, and often have highly complex trace structures, making "trace-less" ima…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: url: https://github.com/GuoLanqing/Awesome-Shadow-Removal

  7. arXiv:2407.08855  [pdf, other]

    eess.IV cs.CV

    BraTS-PEDs: Results of the Multi-Consortium International Pediatric Brain Tumor Segmentation Challenge 2023

    Authors: Anahita Fathi Kazerooni, Nastaran Khalili, Xinyang Liu, Debanjan Haldar, Zhifan Jiang, Anna Zapaishchykova, Julija Pavaine, Lubdha M. Shah, Blaise V. Jones, Nakul Sheth, Sanjay P. Prabhu, Aaron S. McAllister, Wenxin Tu, Khanak K. Nandolia, Andres F. Rodriguez, Ibraheem Salman Shaikh, Mariana Sanchez Montano, Hollie Anne Lai, Maruf Adewole, Jake Albrecht, Udunna Anazodo, Hannah Anderson, Syed Muhammed Anwar, Alejandro Aristizabal, Sina Bagheri, et al. (54 additional authors not shown)

    Abstract: Pediatric central nervous system tumors are the leading cause of cancer-related deaths in children. The five-year survival rate for high-grade glioma in children is less than 20%. The development of new treatments is dependent upon multi-institutional collaborative clinical trials requiring reproducible and accurate centralized response assessment. We present the results of the BraTS-PEDs 2023 cha…

    Submitted 11 July, 2024; originally announced July 2024.

  8. arXiv:2407.08726  [pdf, other]

    cs.CV

    Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data

    Authors: Cherie Ho, Jiaye Zou, Omar Alama, Sai Mitheran Jagadesh Kumar, Benjamin Chiang, Taneesh Gupta, Chen Wang, Nikhil Keetha, Katia Sycara, Sebastian Scherer

    Abstract: Top-down Bird's Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks. While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to small regions captured by current autonomous vehicle-based datasets. In this context, we show that a more sca…

    Submitted 11 July, 2024; originally announced July 2024.

  9. arXiv:2407.08706  [pdf, other]

    cs.CV

    HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

    Authors: Runhui Huang, Xinpeng Ding, Chunwei Wang, Jianhua Han, Yulong Liu, Hengshuang Zhao, Hang Xu, Lu Hou, Wei Zhang, Xiaodan Liang

    Abstract: High-resolution inputs enable Large Vision-Language Models (LVLMs) to discern finer visual details, enhancing their comprehension capabilities. To reduce the training and computation costs caused by high-resolution input, one promising direction is to use sliding windows to slice the input into uniform patches, each matching the input size of the well-trained vision encoder. Although efficient, th…

    Submitted 11 July, 2024; originally announced July 2024.

  10. arXiv:2407.08224  [pdf, other]

    q-bio.QM cs.AI

    stEnTrans: Transformer-based deep learning for spatial transcriptomics enhancement

    Authors: Shuailin Xue, Fangfang Zhu, Changmiao Wang, Wenwen Min

    Abstract: The spatial location of cells within tissues and organs is crucial for the manifestation of their specific functions. Spatial transcriptomics technology enables comprehensive measurement of the gene expression patterns in tissues while retaining spatial information. However, current popular spatial transcriptomics techniques either have shallow sequencing depth or low resolution. We present stEnTra…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: ISBRA2024, Code: https://github.com/shuailinxue/stEnTrans

  11. arXiv:2407.08216  [pdf, other]

    eess.IV cs.AI cs.CV q-bio.QM

    Multimodal contrastive learning for spatial gene expression prediction using histology images

    Authors: Wenwen Min, Zhiceng Shi, Jun Zhang, Jun Wan, Changmiao Wang

    Abstract: In recent years, the advent of spatial transcriptomics (ST) technology has unlocked unprecedented opportunities for delving into the complexities of gene expression patterns within intricate biological systems. Despite its transformative potential, the prohibitive cost of ST technology remains a significant barrier to its widespread adoption in large-scale studies. An alternative, more cost-effect…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: BIB, Code: https://github.com/shizhiceng/mclSTExp

  12. arXiv:2407.08200  [pdf, other]

    cs.CV

    Deep Understanding of Soccer Match Videos

    Authors: Shikun Xu, Yandong Zhu, Gen Li, Changhu Wang

    Abstract: Soccer is one of the most popular sports worldwide, with live broadcasts frequently available for major matches. However, extracting detailed, frame-by-frame information on player actions from these videos remains a challenge. Utilizing state-of-the-art computer vision technologies, our system can detect key objects such as soccer balls, players and referees. It also tracks the movements of players…

    Submitted 11 July, 2024; originally announced July 2024.

  13. arXiv:2407.08199  [pdf, other]

    cs.CV

    SRPose: Two-view Relative Pose Estimation with Sparse Keypoints

    Authors: Rui Yin, Yulun Zhang, Zherong Pan, Jianjun Zhu, Cheng Wang, Biao Jia

    Abstract: Two-view pose estimation is essential for map-free visual relocalization and object pose tracking tasks. However, traditional matching methods suffer from time-consuming robust estimators, while deep learning-based pose regressors only cater to camera-to-world pose estimation, lacking generalizability to different image sizes and camera intrinsics. In this paper, we propose SRPose, a sparse keypoi…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 30 pages, 11 figures, to be published in ECCV 2024

  14. arXiv:2407.08187  [pdf, other]

    cs.CV

    ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation

    Authors: Ruijie Zhu, Chuxin Wang, Ziyang Song, Li Liu, Tianzhu Zhang, Yongdong Zhang

    Abstract: Estimating depth from a single image is a challenging visual task. Compared to relative depth estimation, metric depth estimation attracts more attention due to its practical physical significance and critical applications in real-life scenarios. However, existing metric depth estimation methods are typically trained on specific datasets with similar scenes, facing challenges in generalizing acros…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 14 pages, 11 figures, 13 tables

  15. arXiv:2407.07954  [pdf]

    physics.med-ph cond-mat.mtrl-sci cond-mat.soft

    3D E-textile for Exercise Physiology and Clinical Maternal Health Monitoring

    Authors: Junyi Zhao, Chansoo Kim, Weilun Li, Zichao Wen, Zhili Xiao, Yong Wang, Shantanu Chakrabartty, Chuan Wang

    Abstract: Electronic textiles (E-textiles) offer great wearing comfort and unobtrusiveness, thus holding potential for next-generation health monitoring wearables. However, the practical implementation is hampered by challenges associated with poor signal quality, substantial motion artifacts, durability for long-term usage, and non-ideal user experience. Here, we report a cost-effective E-textile system th…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 16 pages, 6 figures

  16. arXiv:2407.07697  [pdf]

    quant-ph

    Revealing spontaneous symmetry breaking in continuous time crystals

    Authors: Yuanjiang Tang, Chenyang Wang, Bei Liu, Jin Peng, Chao Liang, Yaohua Li, Xian Zhao, Cuicui Lu, Shuang Zhang, Yong-Chun Liu

    Abstract: Spontaneous symmetry breaking plays a pivotal role in physics ranging from the emergence of elementary particles to the phase transitions of matter. The spontaneous breaking of continuous time translation symmetry leads to a novel state of matter named continuous time crystal (CTC). It exhibits periodic oscillation without the need for periodic driving, and the relative phases for repetitively rea…

    Submitted 10 July, 2024; originally announced July 2024.

  17. arXiv:2407.07518  [pdf, other]

    cs.CV

    Multi-modal Crowd Counting via a Broker Modality

    Authors: Haoliang Meng, Xiaopeng Hong, Chenhao Wang, Miao Shang, Wangmeng Zuo

    Abstract: Multi-modal crowd counting involves estimating crowd density from both visual and thermal/depth images. This task is challenging due to the significant gap between these distinct modalities. In this paper, we propose a novel approach by introducing an auxiliary broker modality and on this basis frame the task as a triple-modal learning problem. We devise a fusion-based method to generate this brok…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: This is the preprint version of the paper and supplemental material to appear in ECCV 2024. Please cite the final published version. Code is available at https://github.com/HenryCilence/Broker-Modality-Crowd-Counting

  18. arXiv:2407.07099  [pdf, other]

    cs.CL cs.AI cs.GT cs.LG

    Nash CoT: Multi-Path Inference with Preference Equilibrium

    Authors: Ziqi Zhang, Cunxiang Wang, Xiong Xiao, Yue Zhang, Donglin Wang

    Abstract: Chain-of-thought (CoT) prompting has emerged as a powerful technique for enhancing the reasoning capabilities of Large Language Models (LLMs) on complex problems. Among CoT-related studies, self-consistency (Multi-path inference with answer filtering through voting) involves generating multiple reasoning paths using the CoT framework and then selecting the most frequently produced outputs standing…

    Submitted 18 June, 2024; originally announced July 2024.

  19. arXiv:2407.07020  [pdf, other]

    cs.AI cs.RO

    Less is More: Efficient Brain-Inspired Learning for Autonomous Driving Trajectory Prediction

    Authors: Haicheng Liao, Yongkang Li, Zhenning Li, Chengyue Wang, Chunlin Tian, Yuming Huang, Zilin Bian, Kaiqun Zhu, Guofa Li, Ziyuan Pu, Jia Hu, Zhiyong Cui, Chengzhong Xu

    Abstract: Accurately and safely predicting the trajectories of surrounding vehicles is essential for fully realizing autonomous driving (AD). This paper presents the Human-Like Trajectory Prediction model (HLTP++), which emulates human cognitive processes to improve trajectory prediction in AD. HLTP++ incorporates a novel teacher-student knowledge distillation framework. The "teacher" model equipped with an…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2402.19251

  20. arXiv:2407.06938  [pdf, other]

    cs.CV

    RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models

    Authors: Bowen Zhang, Yiji Cheng, Chunyu Wang, Ting Zhang, Jiaolong Yang, Yansong Tang, Feng Zhao, Dong Chen, Baining Guo

    Abstract: We present RodinHD, which can generate high-fidelity 3D avatars from a portrait image. Existing methods fail to capture intricate details such as hairstyles which we tackle in this paper. We first identify an overlooked problem of catastrophic forgetting that arises when fitting triplanes sequentially on many avatars, caused by the MLP decoder sharing scheme. To overcome this issue, we raise a nov…

    Submitted 10 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: ECCV 2024; project page: https://rodinhd.github.io/

  21. arXiv:2407.06698  [pdf, ps, other]

    cs.CV cs.LG

    PSPU: Enhanced Positive and Unlabeled Learning by Leveraging Pseudo Supervision

    Authors: Chengjie Wang, Chengming Xu, Zhenye Gan, Jianlong Hu, Wenbing Zhu, Lizhuang Ma

    Abstract: Positive and Unlabeled (PU) learning, a binary classification model trained with only positive and unlabeled data, generally suffers from overfitted risk estimation due to inconsistent data distributions. To address this, we introduce a pseudo-supervised PU learning framework (PSPU), in which we train the PU model first, use it to gather confident samples for the pseudo supervision, and then apply…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: accepted by ICME2024

  22. arXiv:2407.05909  [pdf, other]

    cs.CV

    Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection

    Authors: Chenxu Wang, Chunyan Xu, Ziqi Gu, Zhen Cui

    Abstract: While existing semi-supervised object detection (SSOD) methods perform well in general scenes, they encounter challenges in handling oriented objects in aerial images. We experimentally find three gaps between general and oriented object detection in semi-supervised learning: 1) Sampling inconsistency: the common center sampling is not suitable for oriented objects with larger aspect ratios when s…

    Submitted 8 July, 2024; originally announced July 2024.

  23. arXiv:2407.05764  [pdf, other]

    eess.IV

    Neuromorphic Imaging with Super-Resolution

    Authors: Pei Zhang, Shuo Zhu, Chutian Wang, Yaping Zhao, Edmund Y. Lam

    Abstract: Neuromorphic imaging is a bio-inspired technique that imitates the human retina to sense variations in a dynamic scene. It responds to pixel-level brightness changes by asynchronous streaming events and boasts microsecond temporal precision over a high dynamic range, yielding blur-free recordings under extreme illumination. Nevertheless, such a modality falls short in spatial resolution and leads…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 11 pages, 13 figures, and 3 tables

  24. arXiv:2407.05749  [pdf, other]

    eess.SP cs.HC cs.LG

    LDGCN: An Edge-End Lightweight Dual GCN Based on Single-Channel EEG for Driver Drowsiness Monitoring

    Authors: Jingwei Huang, Chuansheng Wang, Jiayan Huang, Haoyi Fan, Antoni Grau, Fuquan Zhang

    Abstract: Driver drowsiness electroencephalography (EEG) signal monitoring can timely alert drivers of their drowsiness status, thereby reducing the probability of traffic accidents. Graph convolutional networks (GCNs) have shown significant advancements in processing the non-stationary, time-varying, and non-Euclidean nature of EEG signals. However, the existing single-channel EEG adjacency graph construct…

    Submitted 8 July, 2024; originally announced July 2024.

  25. arXiv:2407.05540  [pdf, other]

    cs.CV

    GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation

    Authors: Chenxin Li, Xinyu Liu, Cheng Wang, Yifan Liu, Weihao Yu, Jing Shao, Yixuan Yuan

    Abstract: Recent advances in learning multi-modal representation have witnessed the success in biomedical domains. While established techniques enable handling multi-modal information, the challenges are posed when extended to various clinical modalities and practical modality-missing setting due to the inherent modality gaps. To tackle these, we propose an innovative Modality-prompted Heterogeneous Graph fo…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  26. arXiv:2407.05361  [pdf, other]

    eess.AS cs.CL

    Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation

    Authors: Haorui He, Zengqiang Shang, Chaoren Wang, Xuyuan Li, Yicheng Gu, Hua Hua, Liwei Liu, Chen Yang, Jiaqi Li, Peiyang Shi, Yuancheng Wang, Kai Chen, Pengyuan Zhang, Zhizheng Wu

    Abstract: Recently, speech generation models have made significant progress by using large-scale training data. However, the research community struggles to produce highly spontaneous and human-like speech due to the lack of large-scale, diverse, and spontaneous speech data. This paper presents Emilia, the first multilingual speech generation dataset from in-the-wild speech data, and Emilia-Pipe, th…

    Submitted 7 July, 2024; originally announced July 2024.

  27. arXiv:2407.05358  [pdf, other]

    cs.CV

    CPM: Class-conditional Prompting Machine for Audio-visual Segmentation

    Authors: Yuanhong Chen, Chong Wang, Yuyuan Liu, Hu Wang, Gustavo Carneiro

    Abstract: Audio-visual segmentation (AVS) is an emerging task that aims to accurately segment sounding objects based on audio-visual cues. The success of AVS learning systems depends on the effectiveness of cross-modal interaction. Such a requirement can be naturally fulfilled by leveraging transformer-based segmentation architecture due to its inherent ability to capture long-range dependencies and flexibi…

    Submitted 7 July, 2024; originally announced July 2024.

  28. arXiv:2407.05253  [pdf, other]

    math.NA

    A Third-order Implicit-Explicit Runge-Kutta Method for Landau-Lifshitz Equation with Arbitrary Damping Parameters

    Authors: Yan Gui, Rui Du, Cheng Wang

    Abstract: A third-order accurate implicit-explicit Runge-Kutta time marching numerical scheme is proposed and implemented for the Landau-Lifshitz-Gilbert equation, which models magnetization dynamics in ferromagnetic materials, with arbitrary damping parameters. This method has three remarkable advantages: (1) only a linear system with constant coefficients needs to be solved at each Runge-Kutta stage, whic…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by Numerical Mathematics: Theory, Methods and Applications and is prepared for publication

  29. arXiv:2407.04981  [pdf, other]

    cs.CL cs.LG

    TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs

    Authors: Cheng Wang, Xinyang Lu, See-Kiong Ng, Bryan Kian Hsiang Low

    Abstract: The rapid evolution of large language models (LLMs) represents a substantial leap forward in natural language understanding and generation. However, alongside these advancements come significant challenges related to the accountability and transparency of LLM responses. Reliable source attribution is essential to adhering to stringent legal and regulatory standards, including those set forth by th…

    Submitted 6 July, 2024; originally announced July 2024.

  30. arXiv:2407.04969  [pdf, other]

    cs.CL

    EVA-Score: Evaluation of Long-form Summarization on Informativeness through Extraction and Validation

    Authors: Yuchen Fan, Xin Zhong, Chengsi Wang, Gaoche Wu, Bowen Zhou

    Abstract: Summarization is a fundamental task in natural language processing (NLP) and since large language models (LLMs), such as GPT-4 and Claude, come out, increasing attention has been paid to long-form summarization whose input sequences are much longer, indicating more information contained. The current evaluation metrics either use similarity-based metrics like ROUGE and BERTScore which rely on sim…

    Submitted 6 July, 2024; originally announced July 2024.

    Comments: 16 pages, 3 figures, submitted to EMNLP

  31. arXiv:2407.04922  [pdf, other]

    cond-mat.mtrl-sci

    Revolutionizing Alloy Microstructure Segmentation through SAM and Domain Knowledge without Extra Training

    Authors: Xudong Ma, Yuqi Zhang, Chenchong Wang, Wei Xu

    Abstract: Fundamental models, trained on large-scale datasets and adapted to new data using innovative learning methods, have revolutionized various fields. In materials science, microstructure image segmentation plays a pivotal role in understanding alloy properties. However, conventional supervised modelling algorithms often necessitate extensive annotations and intricate optimization procedures. The segm…

    Submitted 5 July, 2024; originally announced July 2024.

  32. arXiv:2407.04842  [pdf, other]

    cs.CV cs.CL cs.LG

    MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

    Authors: Zhaorun Chen, Yichao Du, Zichen Wen, Yiyang Zhou, Chenhang Cui, Zhenzhen Weng, Haoqin Tu, Chaoqi Wang, Zhengwei Tong, Qinglan Huang, Canyu Chen, Qinghao Ye, Zhihong Zhu, Yuqing Zhang, Jiawei Zhou, Zhuokai Zhao, Rafael Rafailov, Chelsea Finn, Huaxiu Yao

    Abstract: While text-to-image models like DALLE-3 and Stable Diffusion are rapidly proliferating, they often encounter challenges such as hallucination, bias, and the production of unsafe, low-quality output. To effectively address these issues, it is crucial to align these models with desired behaviors based on feedback from a multimodal judge. Despite their significance, current multimodal judges frequent…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 42 pages, 13 figures, 33 tables

  33. arXiv:2407.04787  [pdf, other]

    cs.CL cs.AI cs.LG

    Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning

    Authors: Eric Pasewark, Kyle Montgomery, Kefei Duan, Dawn Song, Chenguang Wang

    Abstract: We present a new method for large language models to solve compositional tasks. Although they have shown strong performance on traditional language understanding tasks, large language models struggle to solve compositional tasks, where the solution depends on solving smaller instances of the same problem. We propose a natural approach to solve compositional tasks recursively. Our method, Re-Tuning…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted to ACL 2024

  34. arXiv:2407.04486  [pdf, other]

    q-bio.QM cs.AI

    Variational and Explanatory Neural Networks for Encoding Cancer Profiles and Predicting Drug Responses

    Authors: Tianshu Feng, Rohan Gnanaolivu, Abolfazl Safikhani, Yuanhang Liu, Jun Jiang, Nicholas Chia, Alexander Partin, Priyanka Vasanthakumari, Yitan Zhu, Chen Wang

    Abstract: Human cancers present a significant public health challenge and require the discovery of novel drugs through translational research. Transcriptomics profiling data that describes molecular activities in tumors and cancer cell lines are widely utilized for predicting anti-cancer drug responses. However, existing AI models face challenges due to noise in transcriptomics data and lack of biological i…

    Submitted 5 July, 2024; originally announced July 2024.

  35. arXiv:2407.04296  [pdf]

    physics.plasm-ph

    The study of propagation characteristics of millimeter-wave vortex in magnetized plasma by using FDTD Method

    Authors: Chenxu Wang, Hideki Kawaguchi, Hiroaki Nakamura, Shin Kubo

    Abstract: It is pointed out that millimeter-wave vortex may contribute an efficient plasma heating since it was found that the millimeter-wave vortex can propagate in magnetized plasma even in which the normal plane wave is in cut-off condition. Then, it was assumed that the vortex field was the Laguerre-Gaussian (L-G) mode which is free-space solution, but the generation and stable propagation of the L-G m…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 9 pages, 5 figures

  36. arXiv:2407.03900  [pdf, other]

    cs.CV

    Oracle Bone Inscriptions Multi-modal Dataset

    Authors: Bang Li, Donghao Luo, Yujie Liang, Jing Yang, Zengmao Ding, Xu Peng, Boyuan Jiang, Shengwei Han, Dan Sui, Peichao Qin, Pian Wu, Chaoyang Wang, Yun Qi, Taisong Jin, Chengjie Wang, Xiaoming Huang, Zhan Shu, Rongrong Ji, Yongge Liu, Yunsheng Wu

    Abstract: Oracle bone inscriptions (OBI) is the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography. However, the task of deciphering OBI, in the current climate of the scholarship, can prove extremely challenging. Out of the 4,500 oracle bone characters excavated, only a third have been successfully identified. Therefore, leveraging…

    Submitted 4 July, 2024; originally announced July 2024.

  37. arXiv:2407.03772  [pdf, other]

    eess.IV cs.CV q-bio.QM

    CS3: Cascade SAM for Sperm Segmentation

    Authors: Yi Shi, Xu-Peng Tian, Yun-Kai Wang, Tie-Yi Zhang, Bin Yao, Hui Wang, Yong Shao, Cen-Cen Wang, Rong Zeng, De-Chuan Zhan

    Abstract: Automated sperm morphology analysis plays a crucial role in the assessment of male fertility, yet its efficacy is often compromised by the challenges in accurately segmenting sperm images. Existing segmentation techniques, including the Segment Anything Model (SAM), are notably inadequate in addressing the complex issue of sperm overlap, a frequent occurrence in clinical samples. Our exploratory stu…

    Submitted 9 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Early accepted by MICCAI2024

  38. arXiv:2407.03548  [pdf, other]

    cs.CV

    HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation

    Authors: Tao Chen, Chenhui Wang, Zhihao Chen, Yiming Lei, Hongming Shan

    Abstract: Medical image segmentation has been significantly advanced with the rapid development of deep learning (DL) techniques. Existing DL-based segmentation models are typically discriminative; i.e., they aim to learn a mapping from the input image to segmentation masks. However, these discriminative methods neglect the underlying data distribution and intrinsic class characteristics, suffering from uns…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted by IEEE Transactions on Medical Imaging 2024

  39. arXiv:2407.03531  [pdf, other]

    cs.RO

    OrbitGrasp: $SE(3)$-Equivariant Grasp Learning

    Authors: Boce Hu, Xupeng Zhu, Dian Wang, Zihao Dong, Haojie Huang, Chenghao Wang, Robin Walters, Robert Platt

    Abstract: While grasp detection is an important part of any robotic manipulation pipeline, reliable and accurate grasp detection in $SE(3)$ remains a research challenge. Many robotics applications in unstructured environments such as the home or warehouse would benefit a lot from better grasp performance. This paper proposes a novel framework for detecting $SE(3)$ grasp poses based on point cloud input. Our…

    Submitted 3 July, 2024; originally announced July 2024.

  40. arXiv:2407.03449  [pdf, other]

    eess.SP

    A Tutorial on Fluid Antenna System for 6G Networks: Encompassing Communication Theory, Optimization Methods and Hardware Designs

    Authors: Wee Kiat New, Kai-Kit Wong, Hao Xu, Chao Wang, Farshad Rostami Ghadi, Jichen Zhang, Junhui Rao, Ross Murch, Pablo Ramírez-Espinosa, David Morales-Jimenez, Chan-Byoung Chae, Kin-Fai Tong

    Abstract: The advent of the sixth-generation (6G) networks presents another round of revolution for the mobile communication landscape, promising an immersive experience, robust reliability, minimal latency, extreme connectivity, ubiquitous coverage, and capabilities beyond communication, including intelligence and sensing. To achieve these ambitious goals, it is apparent that 6G networks need to incorporat…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 50 pages, 45 figures, 5 tables. Submitted for potential publication

  41. arXiv:2407.03374  [pdf]

    cs.AI cs.SE eess.SP eess.SY

    An Outline of Prognostics and Health Management Large Model: Concepts, Paradigms, and Challenges

    Authors: Laifa Tao, Shangyu Li, Haifei Liu, Qixuan Huang, Liang Ma, Guoao Ning, Yiling Chen, Yunlong Wu, Bin Li, Weiwei Zhang, Zhengduo Zhao, Wenchao Zhan, Wenyan Cao, Chao Wang, Hongmei Liu, Jian Ma, Mingliang Suo, Yujie Cheng, Yu Ding, Dengwei Song, Chen Lu

    Abstract: Prognosis and Health Management (PHM), critical for ensuring task completion by complex systems and preventing unexpected failures, is widely adopted in aerospace, manufacturing, maritime, rail, energy, etc. However, PHM's development is constrained by bottlenecks like generalization, interpretation and verification abilities. Presently, generative artificial intelligence (AI), represented by Larg…

    Submitted 1 July, 2024; originally announced July 2024.

  42. arXiv:2407.02783  [pdf, ps, other]

    cs.CL cs.AI

    52B to 1T: Lessons Learned via Tele-FLM Series

    Authors: Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang

    Abstract: Large Language Models (LLMs) represent a significant stride toward Artificial General Intelligence. As scaling laws underscore the potential of increasing model sizes, the academic community has intensified its investigations into LLMs with capacities exceeding 50 billion parameters. This technical report builds on our prior work with Tele-FLM (also known as FLM-2), a publicly available 52-billion…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: For the Tele-FLM-52B tech report, see also 2404.16645

  43. arXiv:2407.02482  [pdf, other]

    cs.CV

    Boosting Consistency in Story Visualization with Rich-Contextual Conditional Diffusion Models

    Authors: Fei Shen, Hu Ye, Sibo Liu, Jun Zhang, Cong Wang, Xiao Han, Wei Yang

    Abstract: Recent research showcases the considerable potential of conditional diffusion models for generating consistent stories. However, current methods, which predominantly generate stories in an autoregressive and excessively caption-dependent manner, often underrate the contextual consistency and relevance of frames during sequential generation. To address this, we propose a novel Rich-contextual Condi…

    Submitted 3 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  44. arXiv:2407.02384  [pdf]

    cond-mat.mtrl-sci nlin.CD

    Improved Long-Term Prediction of Chaos Using Reservoir Computing Based on Stochastic Spin-Orbit Torque Devices

    Authors: Cen Wang, Xinyao Lei, Kaiming Cai, Xiaofei Yang, Yue Zhang

    Abstract: Predicting chaotic systems is crucial for understanding complex behaviors, yet challenging due to their sensitivity to initial conditions and inherent unpredictability. Probabilistic Reservoir Computing (RC) is well-suited for long-term chaotic predictions by handling complex dynamic systems. Spin-Orbit Torque (SOT) devices in spintronics, with their nonlinear and probabilistic operations, can enh…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 14 pages, 3 figures

  45. arXiv:2407.02376  [pdf, other]

    astro-ph.HE

    A new subclass of gamma-ray burst originating from compact binary merger

    Authors: Chen-Wei Wang, Wen-Jun Tan, Shao-Lin Xiong, Shu-Xu Yi, Rahim Moradi, Bing Li, Zhen Zhang, Yu Wang, Yan-Zhi Meng, Jia-Cong Liu, Yue Wang, Sheng-Lun Xie, Wang-Chen Xue, Zheng-Hang Yu, Peng Zhang, Wen-Long Zhang, Yan-Qiu Zhang, Chao Zheng

    Abstract: Type I gamma-ray bursts (GRBs) are believed to originate from compact binary merger usually with duration less than 2 seconds for the main emission. However, recent observations of GRB 211211A and GRB 230307A indicate that some merger-origin GRBs could last much longer. Since they show strikingly similar properties (indicating a common mechanism) which are different from the classic "long"-short b…

    Submitted 2 July, 2024; originally announced July 2024.

  46. arXiv:2407.02095  [pdf, other]

    cs.SE

    TIGER: A Generating-Then-Ranking Framework for Practical Python Type Inference

    Authors: Chong Wang, Jian Zhang, Yiling Lou, Mingwei Liu, Weisong Sun, Yang Liu, Xin Peng

    Abstract: Python's dynamic typing system offers flexibility and expressiveness but can lead to type-related errors, prompting the need for automated type inference to enhance type hinting. While existing learning-based approaches show promising inference accuracy, they struggle with practical challenges in comprehensively handling various types, including complex generic types and (unseen) user-defined type…

    Submitted 2 July, 2024; originally announced July 2024.

  47. arXiv:2407.01905  [pdf, other]

    cs.CV

    Enhancing Multi-Class Anomaly Detection via Diffusion Refinement with Dual Conditioning

    Authors: Jiawei Zhan, Jinxiang Lai, Bin-Bin Gao, Jun Liu, Xiaochen Chen, Chengjie Wang

    Abstract: Anomaly detection, the technique of identifying abnormal samples using only normal samples, has attracted widespread interest in industry. Existing one-model-per-category methods often struggle with limited generalization capabilities due to their focus on a single category, and can fail when encountering variations in product. Recent feature reconstruction methods, as representatives in one-model…

    Submitted 1 July, 2024; originally announced July 2024.

  48. arXiv:2407.01342  [pdf]

    cond-mat.str-el

    Structural and Magnetic properties of Ge0.5Mn0.5Co2O4 using neutron diffraction

    Authors: Pooja Jain, Benny Schundelmier, Chin-Wei Wang, Poonam Yadav, Kaya Wei, N. P. Lalla, Shivani Sharma

    Abstract: The structural and magnetic properties of Ge0.5Mn0.5Co2O4 (GMCO) have been investigated in detail utilizing neutron powder diffraction (NPD), x-ray diffraction (XRD), DC magnetometry, and heat capacity analysis and compared with GeCo2O4. Despite both compounds exhibiting a cubic structure at room temperature, a substantial difference in low-temperature structural properties has been observed for…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 16 pages, 9 figures

  49. arXiv:2407.01029  [pdf, other]

    cs.CV

    EndoSparse: Real-Time Sparse View Synthesis of Endoscopic Scenes using Gaussian Splatting

    Authors: Chenxin Li, Brandon Y. Feng, Yifan Liu, Hengyu Liu, Cheng Wang, Weihao Yu, Yixuan Yuan

    Abstract: 3D reconstruction of biological tissues from a collection of endoscopic images is a key to unlock various important downstream surgical applications with 3D capabilities. Existing methods employ various advanced neural rendering techniques for photorealistic view synthesis, but they often struggle to recover accurate 3D representations when only sparse observations are available, which is usually…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI 2024

  50. arXiv:2407.00992  [pdf, other]

    physics.flu-dyn

    Turbulence modulation in liquid-liquid two-phase Taylor-Couette turbulence

    Authors: Jinghong Su, Cheng Wang, Yi-bao Zhang, Fan Xu, Junwu Wang, Chao Sun

    Abstract: We investigate the coupling effects of the two-phase interface, viscosity ratio, and density ratio of the dispersed phase to the continuous phase on the flow statistics in two-phase Taylor-Couette turbulence at a system Reynolds number of 6000 and a system Weber number of 10 using interface-resolved three-dimensional direct numerical simulations with the volume-of-fluid method. Our study focuses o…

    Submitted 1 July, 2024; originally announced July 2024.