Challenges

Track 1
2D object detection

For 2D object detection, we provide a real-world training dataset of 10 million images, of which 5K are labeled, plus 5K/10K labeled validation/testing images for evaluation. The dataset was collected across diverse scenarios in Chinese cities and covers a wide variety of places, objects, weather conditions (e.g., highways, city streets, country roads, and rainy weather), and camera setups.

  • Evaluation: Leaderboard ranking for this track is by mean Average Precision (mAP) over all categories, that is, the mean of the APs for pedestrian, cyclist, car, truck, tram and tricycle (a sketch of this score follows this list). The IoU overlap threshold is set to 0.5 for pedestrian, cyclist and tricycle, and to 0.7 for car, truck and tram. Only camera images of SODA10M are allowed to be used.
  • Dataset: Please refer to SODA-2d for a detailed dataset introduction and downloads.
  • Submission: The challenge is now available on CodaLab.
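For concreteness, the ranking score can be sketched as follows. This is a minimal illustration, not the official evaluation code; `average_precision` is a hypothetical stand-in for a full matching-based AP routine.

```python
# Track 1 score: mean of six per-class APs, each evaluated at its own
# IoU threshold (0.5 for pedestrian/cyclist/tricycle, 0.7 for vehicles).
IOU_THRESHOLDS = {
    "pedestrian": 0.5, "cyclist": 0.5, "tricycle": 0.5,
    "car": 0.7, "truck": 0.7, "tram": 0.7,
}

def track1_map(predictions, ground_truths, average_precision):
    """Mean Average Precision over the six SODA10M categories."""
    aps = [
        average_precision(predictions[cls], ground_truths[cls], iou_thr=thr)
        for cls, thr in IOU_THRESHOLDS.items()
    ]
    return sum(aps) / len(aps)
```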
Technical reports:
  • Technical report for First Prize: Download
  • Technical report for Second Prize: Download
  • Technical report for Third Prize: Download

Track 2
3D object detection

For 3D object detection, we provide a large-scale dataset with 1 million point clouds and 7 million images. We annotated 5K, 3K and 8K scenes for the training, validation and testing sets, respectively, and leave the other scenes unlabeled. We provide 3D bounding boxes for car, cyclist, pedestrian, truck and bus.

  • Evaluation: Leaderboard ranking for this track is by mean Average Precision with Heading (mAPH) / L2 over "ALL_NS" (all object types except signs), that is, the mean of the APHs for car, cyclist, pedestrian, truck and bus (a sketch of the heading weighting follows this list). All sensors are allowed to be used.
  • Dataset: Please refer to ONCE for detailed dataset introduction and dataset downloads.
  • Submission: The challenge is now available on CodaLab.
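mAPH is commonly defined (as in the Waymo Open Dataset metric of the same name) by weighting each true positive's contribution to precision by its heading accuracy. Below is a minimal sketch of that weighting term and of the final ALL_NS average, assuming headings are given in radians; it is an illustration, not the official evaluation code.

```python
import math

def heading_weight(theta_pred, theta_gt):
    """APH heading term: 1.0 for a perfect heading, 0.0 when the
    predicted heading is off by pi (i.e., fully reversed)."""
    delta = abs(theta_pred - theta_gt) % (2.0 * math.pi)
    delta = min(delta, 2.0 * math.pi - delta)  # wrap into [0, pi]
    return 1.0 - delta / math.pi

def track2_maph(aph_per_class):
    """ALL_NS score: mean APH over the five evaluated classes."""
    classes = ("car", "cyclist", "pedestrian", "truck", "bus")
    return sum(aph_per_class[c] for c in classes) / len(classes)
```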
Technical reports:
  • Technical report for First Prize: Download
  • Technical report for Second Prize: Download
  • Technical report for Third Prize: Download

Track 3
Corner Case Detection

Deep learning has achieved prominent success in detecting common traffic participants (e.g., cars, pedestrians, and cyclists). Such detectors, however, are generally incapable of detecting novel objects that are not seen, or only rarely seen, during training. These objects are called (object-level) corner cases, which fall into two categories: 1) instances of a novel class (e.g., a runaway tire) and 2) novel instances of a common class (e.g., an overturned truck). Properly dealing with corner cases has become one of the essential keys to reliable autonomous-driving perception systems. The aim of this challenge is to discover novel methods for detecting corner cases among common traffic participants in the real world.

For this challenge, we allow [SODA10M](https://soda-2d.github.io/), [ONCE](https://once-for-auto-driving.github.io/), and ImageNet-1k for training/pretraining. The evaluation will be conducted on the corner case dataset, CODA2022, which contains 9768 camera images with 80180 annotated objects spanning 43 object categories.

The first 7 categories (pedestrian, cyclist, car, truck, tram, tricycle, bus) are common categories, while the rest are novel categories.

  • Evaluation: For this task, we define a custom metric that is the sum of the following four metrics (a sketch of the combination follows this list):
    • AP-common: mAP over objects of common categories;
    • AP-agnostic: mAP over objects of all categories in a class-agnostic manner;
    • AR-agnostic: mAR over objects of all categories in a class-agnostic manner;
    • AR-agnostic-corner: mAR over corner-case objects of all categories in a class-agnostic manner;
    where mAP and mAR stand for mean Average Precision and mean Average Recall as computed by the COCO API.
  • Dataset: Please refer to SODA10M and ONCE for detailed dataset introductions and downloads. CODA2022 can be downloaded on the submission page once the challenge begins.
  • Submission: Please follow the instructions of the challenge published at CodaLab Competitions.
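The combined score can be sketched with pycocotools as below. The ground-truth files (`gt_common.json`, `gt_all.json`, `gt_corner.json`) are hypothetical annotation files restricted to common categories, all categories, and corner-case objects respectively, and the official evaluation settings (IoU range, maximum detections) may differ.

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

def coco_stat(gt_file, det_file, agnostic, stat_idx):
    """One COCO-style metric; stats[0] = mAP, stats[8] = mAR@100."""
    gt = COCO(gt_file)
    ev = COCOeval(gt, gt.loadRes(det_file), iouType="bbox")
    ev.params.useCats = 0 if agnostic else 1  # 0 = class-agnostic matching
    ev.evaluate()
    ev.accumulate()
    ev.summarize()
    return ev.stats[stat_idx]

score = (
      coco_stat("gt_common.json", "dets.json", agnostic=False, stat_idx=0)  # AP-common
    + coco_stat("gt_all.json",    "dets.json", agnostic=True,  stat_idx=0)  # AP-agnostic
    + coco_stat("gt_all.json",    "dets.json", agnostic=True,  stat_idx=8)  # AR-agnostic
    + coco_stat("gt_corner.json", "dets.json", agnostic=True,  stat_idx=8)  # AR-agnostic-corner
)
```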
Technical reports:
  • Technical report for First Prize: Download
  • Technical report for Second Prize: Download
  • Technical report for Third Prize: Download

Track 4
Multiple object tracking and segmentation

This is a large-scale tracking challenge under the most diverse driving conditions. Understanding the temporal association and shape of objects within videos is one of the fundamental yet challenging tasks for autonomous driving. The BDD100K MOT and MOTS datasets provide diverse driving scenarios with high-quality instance segmentation masks under complicated occlusions and reappearing patterns, which makes them a great testbed for the reliability of tracking and segmentation algorithms in real scenes. The BDD100K dataset also includes 100K raw video sequences, which can be readily used for self-supervised learning. We hope the utilization of large-scale unlabeled video data in self-driving can further boost the performance of MOT & MOTS. This challenge comprises two tracks: (1) Main track - standard MOT and MOTS, and (2) Teaser track - self-supervised MOT and MOTS.

Technical reports:
  • Technical report for First Prize in the MOT and SSMOT tracks: Download
  • Technical report for Second Prize in the MOT track: Download
  • Technical report for Third Prize in the MOT track: Download
  • Technical report for First Prize in the MOTS and SSMOTS tracks: Download
  • Technical report for Second Prize in the MOTS track: Download
  • Technical report for Third Prize in the MOTS track: Download

Track 5
Unified model for multi-task learning

The perception system for autonomous driving is responsible for providing various kinds of information based on sensor data, such as the positions of other traffic participants and the states of traffic lights and signs. Currently, this information is provided by independent models, which requires substantial computational resources and neglects the latent connections between the tasks. It is therefore natural to combine these tasks in a single training process, i.e., Multi-Task Learning.

We provide a multi-task learning track as part of this year's challenge. In this track, we provide a real-world dataset with more than 3,000 frames. Each frame contains 1 point cloud and 7 images, along with annotations for 3D object detection, lane detection and road segmentation.

  • Evaluation: We use the common evaluation metric for each task, e.g., mAP for object detection and mIoU for road segmentation (a standard mIoU sketch follows this list).
  • Dataset: A new multi-task learning dataset called AutoScenes is released: AutoScenes / AutoScenes annotation.
  • Submission: The challenge is now available on CodaLab.
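For reference, the mIoU term is the standard confusion-matrix formulation used by most segmentation benchmarks. Below is a minimal sketch, assuming `pred` and `gt` are integer label maps of the same shape; the official evaluation code may differ in details such as ignore labels.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Per-class intersection-over-union from a confusion matrix,
    averaged over classes present in the prediction or ground truth."""
    idx = gt.reshape(-1).astype(np.int64) * num_classes \
        + pred.reshape(-1).astype(np.int64)
    cm = np.bincount(idx, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    valid = union > 0  # skip classes absent from both pred and gt
    return (intersection[valid] / union[valid]).mean()
```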
Technical reports:
  • Technical report for First Prize: Download

CHALLENGE PRIZES (57,000 USD / 396,339.93 CNY)

Challenge participants with the most successful and innovative entries will be invited to present at this workshop and will receive awards. There is a 10,000 USD cash prize pool for each of tracks 1, 2 and 3 and for each of the MOT and MOTS tasks in track 4: 5,000 USD (34,766.66 CNY) will be awarded to the top performer in each task, 3,000 USD (20,859.99 CNY) to the 2nd place, and 2,000 USD (13,906.66 CNY) to the 3rd place. For the SSMOT and SSMOTS tasks in track 4, a 1,000 USD (6,953.33 CNY) prize will be awarded to the 1st place. For track 5, 5,000 USD (34,766.66 CNY) will be awarded to the 1st-place winner. In total: 5 × 10,000 + 2 × 1,000 + 5,000 = 57,000 USD.


References

[1] Han J, Liang X, Xu H, et al. SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving[J]. arXiv preprint arXiv:2106.11118, 2021.

[2] Mao J, Niu M, Jiang C, et al. One Million Scenes for Autonomous Driving: ONCE Dataset[J]. arXiv preprint arXiv:2106.11037, 2021.