Showing 1–50 of 86 results for author: Guan, T

  1. arXiv:2407.07764  [pdf, other]

    cs.CV

    PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer

    Authors: Tongkun Guan, Chengyu Lin, Wei Shen, Xiaokang Yang

    Abstract: Handwritten Mathematical Expression Recognition (HMER) has wide applications in human-machine interaction scenarios, such as digitized education and automated offices. Recently, sequence-based models with encoder-decoder architectures have been commonly adopted to address this task by directly predicting LaTeX sequences of expression images. However, these methods only implicitly learn the syntax…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  2. arXiv:2406.18054  [pdf, other]

    eess.IV cs.CV

    Leveraging Pre-trained Models for FF-to-FFPE Histopathological Image Translation

    Authors: Qilai Zhang, Jiawen Li, Peiran Liao, Jiali Hu, Tian Guan, Anjia Han, Yonghong He

    Abstract: The two primary types of Hematoxylin and Eosin (H&E) slides in histopathology are Formalin-Fixed Paraffin-Embedded (FFPE) and Fresh Frozen (FF). FFPE slides offer high quality histopathological images but require a labor-intensive acquisition process. In contrast, FF slides can be prepared quickly, but the image quality is relatively poor. Our task is to translate FF images into FFPE style, thereb…

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2406.10900  [pdf, other]

    cs.CV cs.CL

    AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models

    Authors: Xiyang Wu, Tianrui Guan, Dianqi Li, Shuaiyi Huang, Xiaoyu Liu, Xijun Wang, Ruiqi Xian, Abhinav Shrivastava, Furong Huang, Jordan Lee Boyd-Graber, Tianyi Zhou, Dinesh Manocha

    Abstract: Large vision-language models (LVLMs) hallucinate: certain context cues in an image may trigger the language module's overconfident and incorrect reasoning on abnormal or hypothetical objects. Though a few benchmarks have been developed to investigate LVLM hallucinations, they mainly rely on hand-crafted corner cases whose fail patterns may hardly generalize, and finetuning on them could undermine…

    Submitted 16 June, 2024; originally announced June 2024.

  4. arXiv:2406.00672  [pdf, other]

    cs.CV

    Task-oriented Embedding Counts: Heuristic Clustering-driven Feature Fine-tuning for Whole Slide Image Classification

    Authors: Xuenian Wang, Shanshan Shi, Renao Yan, Qiehe Sun, Lianghui Zhu, Tian Guan, Yonghong He

    Abstract: In the field of whole slide image (WSI) classification, multiple instance learning (MIL) serves as a promising approach, commonly decoupled into feature extraction and aggregation. In this paradigm, our observation reveals that discriminative embeddings are crucial for aggregation to the final prediction. Among all feature updating strategies, task-oriented ones can capture characteristics specifi…

    Submitted 2 June, 2024; originally announced June 2024.

  5. arXiv:2405.05363  [pdf, other]

    cs.CV cs.RO

    LOC-ZSON: Language-driven Object-Centric Zero-Shot Object Retrieval and Navigation

    Authors: Tianrui Guan, Yurou Yang, Harry Cheng, Muyuan Lin, Richard Kim, Rajasimman Madhivanan, Arnie Sen, Dinesh Manocha

    Abstract: In this paper, we present LOC-ZSON, a novel Language-driven Object-Centric image representation for object navigation task within complex scenes. We propose an object-centric image representation and corresponding losses for visual-language model (VLM) fine-tuning, which can handle complex object-level queries. In addition, we design a novel LLM-based augmentation and prompt templates for stabilit…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted to ICRA 2024

  6. arXiv:2404.12777  [pdf, other]

    cs.CV

    EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene Representation

    Authors: Wenkai Liu, Tao Guan, Bin Zhu, Lili Ju, Zikai Song, Dan Li, Yuesong Wang, Wei Yang

    Abstract: In the domain of 3D scene representation, 3D Gaussian Splatting (3DGS) has emerged as a pivotal technology. However, its application to large-scale, high-resolution scenes (exceeding 4k$\times$4k pixels) is hindered by the excessive computational requirements for managing a large number of Gaussians. Addressing this, we introduce 'EfficientGS', an advanced approach that optimizes 3DGS for high-res…

    Submitted 19 April, 2024; originally announced April 2024.

  7. arXiv:2404.03187  [pdf, other]

    cs.CV

    AGL-NET: Aerial-Ground Cross-Modal Global Localization with Varying Scales

    Authors: Tianrui Guan, Ruiqi Xian, Xijun Wang, Xiyang Wu, Mohamed Elnoor, Daeun Song, Dinesh Manocha

    Abstract: We present AGL-NET, a novel learning-based method for global localization using LiDAR point clouds and satellite maps. AGL-NET tackles two critical challenges: bridging the representation gap between image and points modalities for robust feature matching, and handling inherent scale discrepancies between global view and local view. To address these challenges, AGL-NET leverages a unified network…

    Submitted 4 April, 2024; originally announced April 2024.

  8. arXiv:2403.13235  [pdf, other]

    cs.RO

    AMCO: Adaptive Multimodal Coupling of Vision and Proprioception for Quadruped Robot Navigation in Outdoor Environments

    Authors: Mohamed Elnoor, Kasun Weerakoon, Adarsh Jagan Sathyamoorthy, Tianrui Guan, Vignesh Rajagopal, Dinesh Manocha

    Abstract: We present AMCO, a novel navigation method for quadruped robots that adaptively combines vision-based and proprioception-based perception capabilities. Our approach uses three cost maps: general knowledge map; traversability history map; and current proprioception map; which are derived from a robot's vision and proprioception data, and couples them to obtain a coupled traversability cost map for…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: 8 pages

  9. arXiv:2403.12414  [pdf, other]

    physics.ins-det hep-ex

    Development of low-radon ultra-pure water for the Jiangmen Underground Neutrino Observatory

    Authors: T. Y. Guan, Y. P. Zhang, B. Wang, C. Guo, J. C. Liu, Q. Tang, C. G. Yang, C. Li

    Abstract: The Jiangmen Underground Neutrino Observatory (JUNO) is a state-of-the-art liquid scintillator-based neutrino physics experiment under construction in South China. To reduce the background from external radioactivities, a water Cherenkov detector composed of 35 kton ultra-pure water and 2,400 20-inch photomultiplier tubes is developed. Even after specialized treatment, ultra-pure water still contai…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 20 pages, 13 figures

  10. arXiv:2403.11193  [pdf, other]

    cs.CV

    Neural Markov Random Field for Stereo Matching

    Authors: Tongfan Guan, Chen Wang, Yun-Hui Liu

    Abstract: Stereo matching is a core task for many computer vision and robotics applications. Despite their dominance in traditional stereo methods, the hand-crafted Markov Random Field (MRF) models lack sufficient modeling accuracy compared to end-to-end deep models. While deep learning representations have greatly improved the unary terms of the MRF models, the overall accuracy is still severely limited by…

    Submitted 21 March, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  11. arXiv:2403.10858  [pdf, other]

    cs.CV

    RetMIL: Retentive Multiple Instance Learning for Histopathological Whole Slide Image Classification

    Authors: Hongbo Chu, Qiehe Sun, Jiawen Li, Yuxuan Chen, Lizhong Zhang, Tian Guan, Anjia Han, Yonghong He

    Abstract: Histopathological whole slide image (WSI) analysis with deep learning has become a research focus in computational pathology. The current paradigm is mainly based on multiple instance learning (MIL), in which approaches with Transformer as the backbone are well discussed. These methods convert WSI tasks into sequence tasks by representing patches as tokens in the WSI sequence. However, the feature…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: under review

  12. arXiv:2403.09606  [pdf, ps, other]

    cs.CL cs.AI

    Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey

    Authors: Xiaoyu Liu, Paiheng Xu, Junda Wu, Jiaxin Yuan, Yifan Yang, Yuhang Zhou, Fuxiao Liu, Tianrui Guan, Haoliang Wang, Tong Yu, Julian McAuley, Wei Ai, Furong Huang

    Abstract: Causal inference has shown potential in enhancing the predictive accuracy, fairness, robustness, and explainability of Natural Language Processing (NLP) models by capturing causal relationships among variables. The emergence of generative Large Language Models (LLMs) has significantly impacted various NLP domains, particularly through their advanced reasoning capabilities. This survey focuses on e…

    Submitted 14 March, 2024; originally announced March 2024.

  13. arXiv:2403.07719  [pdf, other]

    cs.CV

    Dynamic Graph Representation with Knowledge-aware Attention for Histopathology Whole Slide Image Analysis

    Authors: Jiawen Li, Yuxuan Chen, Hongbo Chu, Qiehe Sun, Tian Guan, Anjia Han, Yonghong He

    Abstract: Histopathological whole slide images (WSIs) classification has become a foundation task in medical microscopic imaging processing. Prevailing approaches involve learning WSIs as instance-bag representations, emphasizing significant instances but struggling to capture the interactions between instances. Additionally, conventional graph representation methods utilize explicit spatial positions to co…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  14. arXiv:2402.13614  [pdf, other]

    physics.ins-det hep-ex

    Developing a $μ$Bq/m$^{3}$ level $^{226}$Ra concentration in water measurement system for the Jiangmen Underground Neutrino Observatory

    Authors: C. Li, B. Wang, Y. Liu, C. Guo, Y. P. Zhang, J. C. Liu, Q. Tang, T. Y. Guan, C. G. Yang

    Abstract: The Jiangmen Underground Neutrino Observatory (JUNO), a 20 kton multi-purpose low background Liquid Scintillator (LS) detector, was proposed primarily to determine the neutrino mass ordering. To suppress the radioactivity from the surrounding rocks and tag cosmic muons, the JUNO central detector is submerged in a Water Cherenkov Detector (WCD). In addition to being used in the WCD, ultrapure water…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 16 pages, 7 figures

  15. arXiv:2402.10340  [pdf, other]

    cs.RO cs.AI

    Highlighting the Safety Concerns of Deploying LLMs/VLMs in Robotics

    Authors: Xiyang Wu, Souradip Chakraborty, Ruiqi Xian, Jing Liang, Tianrui Guan, Fuxiao Liu, Brian M. Sadler, Dinesh Manocha, Amrit Singh Bedi

    Abstract: In this paper, we highlight the critical issues of robustness and safety associated with integrating large language models (LLMs) and vision-language models (VLMs) into robotics applications. Recent works focus on using LLMs and VLMs to improve the performance of robotics tasks, such as manipulation and navigation. Despite these improvements, analyzing the safety of such systems remains underexplo…

    Submitted 16 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  16. arXiv:2312.05490  [pdf, other]

    cs.CV

    Shapley Values-enabled Progressive Pseudo Bag Augmentation for Whole Slide Image Classification

    Authors: Renao Yan, Qiehe Sun, Cheng Jin, Yiqing Liu, Yonghong He, Tian Guan, Hao Chen

    Abstract: In computational pathology, whole slide image (WSI) classification presents a formidable challenge due to its gigapixel resolution and limited fine-grained annotations. Multiple instance learning (MIL) offers a weakly supervised solution, yet refining instance-level information from bag-level labels remains complex. While most of the conventional MIL methods use attention scores to estimate instan…

    Submitted 15 May, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: submitted to IEEE TRANSACTIONS ON MEDICAL IMAGING

  17. arXiv:2312.05286  [pdf, other]

    cs.CV

    Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors

    Authors: Tongkun Guan, Wei Shen, Xue Yang, Xuehui Wang, Xiaokang Yang

    Abstract: Existing scene text detection methods typically rely on extensive real data for training. Due to the lack of annotated real images, recent works have attempted to exploit large-scale labeled synthetic data (LSD) for pre-training text detectors. However, a synth-to-real domain gap emerges, further limiting the performance of text detectors. Differently, in this work, we propose FreeReal, a real-dom…

    Submitted 10 July, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: Accepted by ECCV2024

  18. arXiv:2311.11499  [pdf]

    physics.optics physics.app-ph

    Flexible generation of structured terahertz fields via programmable exchange-biased spintronic emitters

    Authors: Shunjia Wang, Wentao Qin, Tongyang Guan, Jingyu Liu, Qingnan Cai, Sheng Zhang, Lei Zhou, Yan Zhang, Yizheng Wu, Zhensheng Tao

    Abstract: Structured light, particularly in the terahertz frequency range, holds considerable potential for a diverse range of applications. However, the generation and control of structured terahertz radiation pose major challenges. In this work, we demonstrate a novel programmable spintronic emitter that can flexibly generate a variety of structured terahertz waves. This is achieved through the precise an…

    Submitted 19 November, 2023; originally announced November 2023.

  19. arXiv:2310.14566  [pdf, other]

    cs.CV cs.CL

    HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models

    Authors: Tianrui Guan, Fuxiao Liu, Xiyang Wu, Ruiqi Xian, Zongxia Li, Xiaoyu Liu, Xijun Wang, Lichang Chen, Furong Huang, Yaser Yacoob, Dinesh Manocha, Tianyi Zhou

    Abstract: We introduce HallusionBench, a comprehensive benchmark designed for the evaluation of image-context reasoning. This benchmark presents significant challenges to advanced large visual-language models (LVLMs), such as GPT-4V(Vision), Gemini Pro Vision, Claude 3, and LLaVA-1.5, by emphasizing nuanced understanding and interpretation of visual data. The benchmark comprises 346 images paired with 1129…

    Submitted 25 March, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted to CVPR 2024

  20. arXiv:2310.14524  [pdf, other]

    hep-ex physics.ins-det

    Study on the radon adsorption capability of low-background activated carbon

    Authors: Chi Li, Yongpeng Zhang, Lidan Lv, Jinchang Liu, Cong Guo, Changgen Yang, Tingyu Guan, Yu Liu, Yu Lei, Quan Tang

    Abstract: Radon is a significant background source in rare event detection experiments. Activated Carbon (AC) adsorption is widely used for effective radon removal. The selection of AC considers its adsorption capacity and radioactive background. In this study, using self-developed devices, we screened and identified a new kind of low-background AC from Qingdao Inaf Technology Company that has very high Rad…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: 21 pages, 7 figures

  21. arXiv:2310.07191  [pdf, other]

    cs.CG math.NA

    $pκ$-Curves: Interpolatory curves with curvature approximating a parabola

    Authors: Zhihao Wang, Juan Cao, Tuan Guan, Zhonggui Chen, Yongjie Jessica Zhang

    Abstract: This paper introduces a novel class of fair and interpolatory curves called $pκ$-curves. These curves are comprised of smoothly stitched Bézier curve segments, where the curvature distribution of each segment is made to closely resemble a parabola, resulting in an aesthetically pleasing shape. Moreover, each segment passes through an interpolated point at a parameter where the parabola has an extr…

    Submitted 11 October, 2023; originally announced October 2023.

  22. arXiv:2307.06344  [pdf, other]

    q-bio.QM cs.CV eess.IV

    The Whole Pathological Slide Classification via Weakly Supervised Learning

    Authors: Qiehe Sun, Jiawen Li, Jin Xu, Junru Cheng, Tian Guan, Yonghong He

    Abstract: Due to its superior efficiency in utilizing annotations and addressing gigapixel-sized images, multiple instance learning (MIL) has shown great promise as a framework for whole slide image (WSI) classification in digital pathology diagnosis. However, existing methods tend to focus on advanced aggregators with different structures, often overlooking the intrinsic features of H&E pathological slide…

    Submitted 12 July, 2023; originally announced July 2023.

  23. arXiv:2306.10003  [pdf, other]

    cs.CV

    C2F2NeUS: Cascade Cost Frustum Fusion for High Fidelity and Generalizable Neural Surface Reconstruction

    Authors: Luoyuan Xu, Tao Guan, Yuesong Wang, Wenkai Liu, Zhaojie Zeng, Junle Wang, Wei Yang

    Abstract: There is an emerging effort to combine the two popular 3D frameworks using Multi-View Stereo (MVS) and Neural Implicit Surfaces (NIS) with a specific focus on the few-shot / sparse view setting. In this paper, we introduce a novel integration scheme that combines the multi-view stereo with neural signed distance function representations, which potentially overcomes the limitations of both methods.…

    Submitted 14 August, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted by ICCV2023

  24. arXiv:2306.06236  [pdf, other]

    cs.MA cs.LG cs.RO

    iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning

    Authors: Xiyang Wu, Rohan Chandra, Tianrui Guan, Amrit Singh Bedi, Dinesh Manocha

    Abstract: Navigating safely and efficiently in dense and heterogeneous traffic scenarios is challenging for autonomous vehicles (AVs) due to their inability to infer the behaviors or intentions of nearby drivers. In this work, we introduce a distributed multi-agent reinforcement learning (MARL) algorithm that can predict trajectories and intents in dense and heterogeneous traffic scenarios. Our approach for…

    Submitted 21 August, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

  25. arXiv:2305.12437  [pdf, other]

    cs.CV

    PLAR: Prompt Learning for Action Recognition

    Authors: Xijun Wang, Ruiqi Xian, Tianrui Guan, Dinesh Manocha

    Abstract: We present a new general learning approach, Prompt Learning for Action Recognition (PLAR), which leverages the strengths of prompt learning to guide the learning process. Our approach is designed to predict the action label by helping the models focus on the descriptions or instructions associated with actions in the input videos. Our formulation uses various prompts, including learnable prompts,…

    Submitted 14 November, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

  26. Nonrelativistic and nonmagnetic control of terahertz charge currents via electrical anisotropy in RuO2 and IrO2

    Authors: Sheng Zhang, Yongwei Cui, Shunjia Wang, Haoran Chen, Yaxin Liu, Wentao Qin, Tongyang Guan, Chuanshan Tian, Zhe Yuan, Lei Zhou, Yizheng Wu, Zhensheng Tao

    Abstract: Precise and ultrafast control over photo-induced charge currents across nanoscale interfaces could lead to important applications in energy harvesting, ultrafast electronics, and coherent terahertz sources. Recent studies have shown that several relativistic mechanisms, including inverse spin-Hall effect, inverse Rashba-Edelstein effect and inverse spin-orbit-torque effect, can convert longitudina…

    Submitted 2 April, 2023; originally announced April 2023.

    Journal ref: Advanced Photonics (2023)

  27. arXiv:2303.17778  [pdf, other]

    cs.CV

    CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition

    Authors: Tianrui Guan, Aswath Muthuselvam, Montana Hoover, Xijun Wang, Jing Liang, Adarsh Jagan Sathyamoorthy, Damon Conover, Dinesh Manocha

    Abstract: We present CrossLoc3D, a novel 3D place recognition method that solves a large-scale point matching problem in a cross-source setting. Cross-source point cloud data corresponds to point sets captured by depth sensors with different accuracies or from different distances and perspectives. We address the challenges in terms of developing 3D place recognition methods that account for the representati…

    Submitted 29 September, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

  28. arXiv:2303.14502  [pdf, other]

    cs.RO

    VERN: Vegetation-aware Robot Navigation in Dense Unstructured Outdoor Environments

    Authors: Adarsh Jagan Sathyamoorthy, Kasun Weerakoon, Tianrui Guan, Mason Russell, Damon Conover, Jason Pusey, Dinesh Manocha

    Abstract: We propose a novel method for autonomous legged robot navigation in densely vegetated environments with a variety of pliable/traversable and non-pliable/untraversable vegetation. We present a novel few-shot learning classifier that can be trained on a few hundred RGB images to differentiate flora that can be navigated through, from the ones that must be circumvented. Using the vegetation classific…

    Submitted 25 March, 2023; originally announced March 2023.

    Comments: 8 Pages, 5 figures

  29. AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning

    Authors: Xijun Wang, Ruiqi Xian, Tianrui Guan, Celso M. de Melo, Stephen M. Nogar, Aniket Bera, Dinesh Manocha

    Abstract: We propose a novel approach for aerial video action recognition. Our method is designed for videos captured using UAVs and can run on edge or mobile devices. We present a learning-based approach that uses customized auto zoom to automatically identify the human target and scale it appropriately. This makes it easier to extract the key features and reduces the computational overhead. We also presen…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted for publication at ICRA 2023

  30. arXiv:2301.06982  [pdf, other]

    physics.ins-det hep-ex

    Research of radon diffusion behavior in liquid scintillator

    Authors: Z. F. Xu, C. Guo, J. C. Liu, Y. P. Zhang, P. Zhang, C. G. Yang, Q. Tang, Y. Liu, C. Li, T. Y. Guan

    Abstract: The background caused by radon and its daughters is an important background in the low background liquid scintillator (LS) detectors. The study of the diffusion behaviour of radon in the LS contributes to the analysis of the related background caused by radon. Methodologies and devices for measuring the diffusion coefficient and solubility of radon in materials are developed and described. The rad…

    Submitted 28 January, 2023; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: 9 pages, 7 figures

  31. arXiv:2301.00959  [pdf, ps, other]

    physics.ins-det hep-ex

    System upgrade for $μ$Bq/m$^3$ level $^{222}$Rn concentration measurement

    Authors: Y. Liu, Y. P. Zhang, J. C. Liu, C. Guo, C. G. Yang, P. Zhang, Q. Tang, Z. F. Xu, C. Li, T. Y. Guan, S. B. Wang

    Abstract: The Jiangmen Underground Neutrino Observatory (JUNO), a 20 kton multipurpose underground liquid scintillator detector, was proposed for the determination of the neutrino mass hierarchy as primary physics goal. The central detector will be submerged in a water Cherenkov detector to lower the background from the environment and cosmic muons. Radon is one of the primary background sources. Nitrogen w…

    Submitted 24 September, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

    Comments: 18 pages, 9 figures

  32. arXiv:2211.00288  [pdf, other]

    cs.CV

    Self-supervised Character-to-Character Distillation for Text Recognition

    Authors: Tongkun Guan, Wei Shen, Xue Yang, Qi Feng, Zekun Jiang, Xiaokang Yang

    Abstract: When handling complicated text images (e.g., irregular structures, low resolution, heavy occlusion, and uneven illumination), existing supervised text recognition methods are data-hungry. Although these methods employ large-scale synthetic text images to reduce the dependence on annotated real images, the domain gap still limits the recognition performance. Therefore, exploring the robust text fea…

    Submitted 18 August, 2023; v1 submitted 1 November, 2022; originally announced November 2022.

    Comments: Accepted by ICCV2023

  33. arXiv:2209.07725  [pdf, other]

    cs.RO cs.CV

    VINet: Visual and Inertial-based Terrain Classification and Adaptive Navigation over Unknown Terrain

    Authors: Tianrui Guan, Ruitao Song, Zhixian Ye, Liangjun Zhang

    Abstract: We present a visual and inertial-based terrain classification network (VINet) for robotic navigation over different traversable surfaces. We use a novel navigation-based labeling scheme for terrain classification and generalization on unknown surfaces. Our proposed perception method and adaptive scheduling control framework can make predictions according to terrain navigation properties and lead t…

    Submitted 1 March, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

  34. arXiv:2209.05722  [pdf, other]

    cs.RO

    GrASPE: Graph based Multimodal Fusion for Robot Navigation in Unstructured Outdoor Environments

    Authors: Kasun Weerakoon, Adarsh Jagan Sathyamoorthy, Jing Liang, Tianrui Guan, Utsav Patel, Dinesh Manocha

    Abstract: We present a novel trajectory traversability estimation and planning algorithm for robot navigation in complex outdoor environments. We incorporate multimodal sensory inputs from an RGB camera, 3D LiDAR, and the robot's odometry sensor to train a prediction model to estimate candidate trajectories' success probabilities based on partially reliable multi-modal sensor observations. We encode high-di…

    Submitted 16 May, 2023; v1 submitted 13 September, 2022; originally announced September 2022.

  35. arXiv:2207.13848  [pdf, other]

    cs.DC cs.LG cs.PF math.NA

    Predicting the Output Structure of Sparse Matrix Multiplication with Sampled Compression Ratio

    Authors: Zhaoyang Du, Yijin Guan, Tianchan Guan, Dimin Niu, Nianxiong Tan, Xiaopeng Yu, Hongzhong Zheng, Jianyi Meng, Xiaolang Yan, Yuan Xie

    Abstract: Sparse general matrix multiplication (SpGEMM) is a fundamental building block in numerous scientific applications. One critical task of SpGEMM is to compute or predict the structure of the output matrix (i.e., the number of nonzero elements per output row) for efficient memory allocation and load balance, which impact the overall performance of SpGEMM. Existing work either precisely calculates the…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: This paper has been submitted to the IEEE International Conference on Parallel and Distributed Systems (ICPADS). 8 pages, 2 figures, 3 tables

    ACM Class: F.2.1; G.3; D.1.3; G.1.3

  36. Intrinsic Spectrum Analysis of Laser Dynamics Based on Fractional Fourier Transform

    Authors: Ligang Huang, Tianyi Lan, Chaoze Zhang, Laiyang Dang, Tianyu Guan, Bowen Zheng, Shunli Liu, Lei Gao, Wei Huang, Guolu Yin, Tao Zhu

    Abstract: Intrinsic spectrum that results from the coupling of spontaneous emission in a laser cavity, can determine the energy concentration and coherence of lasers, which is crucial for the optical high-precision measurement. Up to now, it is hard to analyze the intrinsic spectrum in the high-speed laser dynamics process, especially under the condition of fast wavelength sweeping. In this work, a new meth…

    Submitted 16 June, 2022; originally announced June 2022.

  37. arXiv:2206.07244  [pdf, other]

    cs.DC

    OpSparse: a Highly Optimized Framework for Sparse General Matrix Multiplication on GPUs

    Authors: Zhaoyang Du, Yijin Guan, Tianchan Guan, Dimin Niu, Linyong Huang, Hongzhong Zheng, Yuan Xie

    Abstract: Sparse general matrix multiplication (SpGEMM) is an important and expensive computation primitive in many real-world applications. Due to SpGEMM's inherent irregularity and the vast diversity of its input matrices, developing high-performance SpGEMM implementation on modern processors such as GPUs is challenging. The state-of-the-art SpGEMM libraries (i.e., $nsparse$ and $spECK$) adopt several alg…

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: This paper has been submitted to the IEEE Access since May 7, 2022, and is currently under review by IEEE Access. 20 pages, 11 figures, 5 tables

    MSC Class: 68-02; 68W10; 65F50 ACM Class: D.1.3; G.1.3

  38. arXiv:2206.06611  [pdf, other]

    cs.DC cs.MS cs.PF

    Accelerating CPU-Based Sparse General Matrix Multiplication With Binary Row Merging

    Authors: Zhaoyang Du, Yijin Guan, Tianchan Guan, Dimin Niu, Hongzhong Zheng, Yuan Xie

    Abstract: Sparse general matrix multiplication (SpGEMM) is a fundamental building block for many real-world applications. Since SpGEMM is a well-known memory-bounded application with vast and irregular memory accesses, considering the memory access efficiency is of critical importance for SpGEMM's performance. Yet, the existing methods put less consideration into the memory subsystem and achieved suboptimal…

    Submitted 19 August, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: This work has been accepted by IEEE Access (DOI:10.1109/ACCESS.2022.3193937). There are 12 pages, 6 figures, 2 tables

    MSC Class: 68-02; 68W10; 65F50 ACM Class: D.1.3; G.1.3

  39. arXiv:2206.05840  [pdf, other]

    cs.LG cs.AI

    GAN based Data Augmentation to Resolve Class Imbalance

    Authors: Sairamvinay Vijayaraghavan, Terry Guan, Jason Song

    Abstract: The number of credit card fraud cases has been growing as technology grows and people can take advantage of it. Therefore, it is very important to implement a robust and effective method to detect such frauds. The machine learning algorithms are appropriate for these tasks since they try to maximize the accuracy of predictions and hence can be relied upon. However, there is an impending flaw where in ma…

    Submitted 12 June, 2022; originally announced June 2022.

  40. arXiv:2205.03517  [pdf, other]

    cs.RO

    AdaptiveON: Adaptive Outdoor Local Navigation Method For Stable and Reliable Actions

    Authors: Jing Liang, Kasun Weerakoon, Tianrui Guan, Nare Karapetyan, Dinesh Manocha

    Abstract: We present a novel outdoor navigation algorithm to generate stable and efficient actions to navigate a robot to reach a goal. We use a multi-stage training pipeline and show that our approach produces policies that result in stable and reliable robot navigation on complex terrains. Based on the Proximal Policy Optimization (PPO) algorithm, we developed a novel method to achieve multiple capabiliti…

    Submitted 6 December, 2022; v1 submitted 6 May, 2022; originally announced May 2022.

    Comments: 10 pages

  41. arXiv:2203.15459  [pdf, other]

    cs.SE

    Influence of Communication Among Shared Developers on the Productivity of Open Source Software Projects

    Authors: Sairamvinay Vijayaraghavan, Jinxiao Song, Terry Guan, Seongwoo Choi, Sutej Kulkarni

    Abstract: Many software developers rely on open source software for developing their applications and writing their source codes. Measuring an independent project's overall productivity is still an open problem for many technology companies. In this project, we address to bridge the gap of analyzing which are the most important features for prediction of a productivity based system. We have chosen to collec…

    Submitted 25 March, 2022; originally announced March 2022.

  42. arXiv:2203.10694  [pdf, other]

    cs.CV

    FAR: Fourier Aerial Video Recognition

    Authors: Divya Kothandaraman, Tianrui Guan, Xijun Wang, Sean Hu, Ming Lin, Dinesh Manocha

    Abstract: We present an algorithm, Fourier Activity Recognition (FAR), for UAV video activity recognition. Our formulation uses a novel Fourier object disentanglement method to innately separate out the human agent (which is typically small) from the background. Our disentanglement technique operates in the frequency domain to characterize the extent of temporal change of spatial pixels, and exploits convol…

    Submitted 18 July, 2022; v1 submitted 20 March, 2022; originally announced March 2022.

    Comments: ECCV 2022 Poster paper

  43. arXiv:2203.03382  [pdf, other]

    cs.CV

    Self-supervised Implicit Glyph Attention for Text Recognition

    Authors: Tongkun Guan, Chaochen Gu, Jingzheng Tu, Xue Yang, Qi Feng, Yudi Zhao, Xiaokang Yang, Wei Shen

    Abstract: The attention mechanism has become the de facto module in scene text recognition (STR) methods, due to its capability of extracting character-level representations. These methods can be summarized into implicit-attention-based and supervised-attention-based, depending on how the attention is computed, i.e., implicit attention and supervised attention are learned from sequence-level text anno…

    Submitted 15 May, 2023; v1 submitted 7 March, 2022; originally announced March 2022.

    Comments: CVPR2023

  44. arXiv:2202.12873  [pdf, other]

    cs.RO

    TerraPN: Unstructured Terrain Navigation using Online Self-Supervised Learning

    Authors: Adarsh Jagan Sathyamoorthy, Kasun Weerakoon, Tianrui Guan, Jing Liang, Dinesh Manocha

    Abstract: We present TerraPN, a novel method that learns the surface properties (traction, bumpiness, deformability, etc.) of complex outdoor terrains directly from robot-terrain interactions through self-supervised learning, and uses it for autonomous robot navigation. Our method uses RGB images of terrain surfaces and the robot's velocities as inputs, and the IMU vibrations and odometry errors experienced…

    Submitted 22 June, 2022; v1 submitted 25 February, 2022; originally announced February 2022.

    Comments: 10 pages, 6 figures

  45. arXiv:2202.09746  [pdf, ps, other]

    quant-ph physics.optics

    Ultrasensitive refractive index sensor with rotatory biased weak measurement

    Authors: Chongqi Zhou, Yang Xu, Xiaonan Zhang, Zhangyan Li, Tian Guan, Yonghong He, Yanhong Ji

    Abstract: A modified weak measurement scheme, rotatory biased weak measurement, is proposed to significantly improve the sensitivity and resolution of the refractive index sensor on a total reflection structure. This method introduces an additional phase in the post-selected procedure and generates an extinction point in the spectrum distribution. The biased post-selection makes smaller coupling strength av…

    Submitted 21 April, 2022; v1 submitted 20 February, 2022; originally announced February 2022.

    Comments: 8 pages, 6 figures

  46. arXiv:2202.07505  [pdf, ps, other]

    math.CV

    A note on $\partial$-bilipschitz mappings in quasiconvex metric spaces

    Authors: Tiantian Guan, Saminathan Ponnusamy, Qingshan Zhou

    Abstract: This paper focuses on properties of $\partial$-biLipschitz mappings which were recently introduced by Bulter. We establish several characterizations for the class of $\partial$-biLipschitz mappings between domains in quasiconvex metric spaces. As an application, we show that a locally quasisymmetric equivalence between uniform metric spaces is quasimöbius, quantitatively.

    Submitted 15 February, 2022; originally announced February 2022.

    Comments: 17 pages; To appear in Bulletin des sciences mathématiques

    MSC Class: Primary: 30C65; 30F45; 53C23; Secondary: 30C20

  47. Learning to be a Statistician: Learned Estimator for Number of Distinct Values

    Authors: Renzhi Wu, Bolin Ding, Xu Chu, Zhewei Wei, Xiening Dai, Tao Guan, Jingren Zhou

    Abstract: Estimating the number of distinct values (NDV) in a column is useful for many tasks in database systems, such as columnstore compression and data profiling. In this work, we focus on how to derive accurate NDV estimations from random (online/offline) samples. Such efficient estimation is critical for tasks where it is prohibitive to scan the data even once. Existing sample-based estimators typical…

    Submitted 6 February, 2022; originally announced February 2022.

    Comments: Published at International Conference on Very Large Data Bases (VLDB) 2022

    Journal ref: PVLDB, 15(2): 272 - 284, 2022

  48. Industrial Scene Text Detection with Refined Feature-attentive Network

    Authors: Tongkun Guan, Chaochen Gu, Changsheng Lu, Jingzheng Tu, Qi Feng, Kaijie Wu, Xinping Guan

    Abstract: Detecting the marking characters of industrial metal parts remains challenging due to low visual contrast, uneven illumination, corroded character structures, and cluttered background of metal part images. Affected by these factors, bounding boxes generated by most existing methods locate low-contrast text areas inaccurately. In this paper, we propose a refined feature-attentive network (RFN) to s…

    Submitted 29 March, 2022; v1 submitted 25 October, 2021; originally announced October 2021.

  49. arXiv:2109.12780  [pdf, other]

    math.CV

    Pommerenke's theorem on Gromov hyperbolic domains

    Authors: Qingshan Zhou, Antti Rasila, Tiantian Guan

    Abstract: We establish a version of a classical theorem of Pommerenke, which is a diameter version of the Gehring-Hayman inequality on Gromov hyperbolic domains of $\mathbb{R}^n$. Two applications are given. Firstly, we generalize Ostrowski's Faltensatz to quasihyperbolic geodesics of Gromov hyperbolic domains. Secondly, we prove that unbounded uniform domains can be characterized in terms of Gromov hyp…

    Submitted 26 September, 2021; originally announced September 2021.

    Comments: 28 pages, 4 figures

    MSC Class: Primary: 30C65; 30F45; Secondary: 30C20

  50. arXiv:2109.06250  [pdf, other]

    cs.RO

    TNS: Terrain Traversability Mapping and Navigation System for Autonomous Excavators

    Authors: Tianrui Guan, Zhenpeng He, Ruitao Song, Dinesh Manocha, Liangjun Zhang

    Abstract: We present a terrain traversability mapping and navigation system (TNS) for autonomous excavator applications in an unstructured environment. We use an efficient approach to extract terrain features from RGB images and 3D point clouds and incorporate them into a global map for planning and navigation. Our system can adapt to changing environments and update the terrain information in real-time. Mo…

    Submitted 5 May, 2022; v1 submitted 13 September, 2021; originally announced September 2021.