3D object detection for autonomous driving: A comprehensive survey
Autonomous driving, in recent years, has been receiving increasing attention for its potential
to relieve drivers' burdens and improve the safety of driving. In modern autonomous driving …
Delving into the devils of bird's-eye-view perception: A review, evaluation and recipe
Learning powerful representations in bird's-eye-view (BEV) for perception tasks is trending
and drawing extensive attention both from industry and academia. Conventional …
ByteTrack: Multi-object tracking by associating every detection box
Multi-object tracking (MOT) aims at estimating bounding boxes and identities of objects in
videos. Most methods obtain identities by associating detection boxes whose scores are …
BEVFusion: Multi-task multi-sensor fusion with unified bird's-eye view representation
Multi-sensor fusion is essential for an accurate and reliable autonomous driving system.
Recent approaches are based on point-level fusion: augmenting the LiDAR point cloud with …
Multimodal learning with transformers: A survey
Transformer is a promising neural network learner, and has achieved great success in
various machine learning tasks. Thanks to the recent prevalence of multimodal applications …
Fully convolutional one-stage 3D object detection on LiDAR range images
We present a simple yet effective fully convolutional one-stage 3D object detector for LiDAR
point clouds of autonomous driving scenes, termed FCOS-LiDAR. Unlike the dominant …
VoxelNeXt: Fully sparse VoxelNet for 3D object detection and tracking
3D object detectors usually rely on hand-crafted proxies, e.g., anchors or centers,
and translate well-studied 2D frameworks to 3D. Thus, sparse voxel features need to be …
BEVFusion: A simple and robust LiDAR-camera fusion framework
Fusing the camera and LiDAR information has become a de-facto standard for 3D object
detection tasks. Current methods rely on point clouds from the LiDAR sensor as queries to …
BEVFormer v2: Adapting modern image backbones to bird's-eye-view recognition via perspective supervision
We present a novel bird's-eye-view (BEV) detector with perspective supervision, which
converges faster and better suits modern image backbones. Existing state-of-the-art BEV …
A survey of visual transformers
Transformer, an attention-based encoder–decoder model, has already revolutionized the
field of natural language processing (NLP). Inspired by such significant achievements, some …