Object Tracking (Draft)
Overview
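Object tracking is a computer vision technique that identifies a specific object in a video or image sequence and follows how it moves from frame to frame. It is mainly used in computer vision, image processing, and machine learning, and is applied in many areas.
Object tracking is mainly used for the following purposes:
- Video surveillance and security: tracking specific objects (e.g., people, vehicles) in footage collected from CCTV cameras to detect perimeter intrusions or abnormal behavior.
- Autonomous vehicles: identifying and tracking objects in the vehicle's surroundings to support safe driving.
- Robot visual perception: letting a robot identify and track objects in its environment so that it can perform tasks or interact with them.
- Computer-vision-based interaction: identifying specific or moving objects in applications such as games, virtual reality, and augmented reality to enable interaction.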
The simplest approach to tracking is to run object detection on every frame. However, we can also exploit temporal information and build solutions on top of a detector. To do so, we need to associate the same object between consecutive frames. Many methods model the object dynamics so that they can predict its position in the next frame.
Types of tracking problem
- moving camera?
- single or multiple cameras?
- single or multiple objects?
- major objects or all objects?
- similar or distinct objects?
- occlusion?
- crossing?
- online or offline?
- initial object marking?
Tracking Classification
Single Object Tracking (SOT):
- tracking of a single object.
- The tracker may also report whether the object is currently present.
- It may also need to handle false positives. Example: the ball in robot soccer.
Multi Object Tracking (MOT):
- tracking of multiple objects (including objects of the same type).
Online Tracking vs. Offline Tracking
- Online Tracking: Estimate current state given current and past observations
- Offline Tracking: Estimate all states given all observations (batch mode)
- Since we consider self-driving, this lecture focuses on online tracking.
Elements of Tracking
- Detection: Where are candidate objects in each frame? (“tracking-by-detection”)
- Association: Which detection corresponds to which object?
- Filtering: What is the most likely object state, e.g., location and size? (Detections are noisy ⇒ exploit probabilistic observation/motion models)
Filtering
frequency domain approach
In general, for online tracking, the most popular filters are stochastic filters, which are based on the so-called Bayes filter.
Beyond filtering the signal, the Bayes filter also models the object dynamics, so it can predict the position at the next time step and mitigate delay.
Bayesian Filtering
Idea: integrate motion and observation.
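As a sketch of what "integrating motion and observation" means, the Bayes filter can be written as a two-step recursion. The notation is an assumption introduced here and does not come from the lecture: x_t is the object state at time t, z_t the observation, p(x_t | x_{t-1}) the motion model, p(z_t | x_t) the observation model, and η a normalizing constant. Control inputs are omitted for simplicity.

```latex
\begin{align}
\text{Prediction:}\quad & p(x_t \mid z_{1:t-1}) = \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid z_{1:t-1})\, dx_{t-1} \\
\text{Correction:}\quad & p(x_t \mid z_{1:t}) = \eta\, p(z_t \mid x_t)\, p(x_t \mid z_{1:t-1})
\end{align}
```

The prediction step pushes the previous belief through the motion model; the correction step reweights it by how well each state explains the new observation.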
Kalman Filter
Specialization of the Bayes filter.
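The Kalman filter is the Bayes filter for linear models with Gaussian noise. The following is a minimal sketch, not a reference implementation: it assumes a 1D constant-velocity motion model, position-only measurements, and made-up noise values. The class name `ConstantVelocityKalman` and its parameters are hypothetical.

```python
import numpy as np


class ConstantVelocityKalman:
    """1D constant-velocity Kalman filter; the state is [position, velocity]."""

    def __init__(self, dt=1.0, process_var=1e-2, meas_var=1.0):
        self.x = np.zeros(2)                        # state estimate
        self.P = np.eye(2) * 10.0                   # state covariance (uncertain start)
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # motion model
        self.H = np.array([[1.0, 0.0]])             # only the position is observed
        self.Q = np.eye(2) * process_var            # process noise (assumed value)
        self.R = np.array([[meas_var]])             # measurement noise (assumed value)

    def predict(self):
        """Propagate the state and its uncertainty one time step ahead."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x

    def update(self, z):
        """Correct the prediction with a position measurement z."""
        y = np.atleast_1d(z) - self.H @ self.x      # innovation
        S = self.H @ self.P @ self.H.T + self.R     # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)    # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x


kf = ConstantVelocityKalman()
for t in range(1, 21):
    kf.predict()
    kf.update(2.0 * t)  # noise-free measurements of an object moving 2 units per frame
```

After a few frames the velocity estimate approaches the true value even though only positions are measured, which is what lets the filter predict the next position.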
Association
In self-driving, we typically have to track multiple objects at the same time. How can we associate detections in a new frame with existing object tracks?
Algorithm
- Predict objects from previous frame and detect objects in current frame
- Associate detections to object tracks (initiate/delete tracks if necessary)
- Correct predictions with observations (e.g., Kalman Filter)
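The three steps above can be sketched as a small loop. This is a simplified illustration under several assumptions: objects are 2D points, association is greedy nearest-neighbor (a stand-in for a proper assignment method), and the "correction" simply adopts the detection instead of running a Kalman update. The names `Track`, `step`, `max_dist`, and `max_misses` are hypothetical.

```python
import math
from itertools import count


class Track:
    _ids = count(1)

    def __init__(self, position):
        self.id = next(Track._ids)
        self.position = position     # (x, y) centre of the object
        self.velocity = (0.0, 0.0)   # displacement per frame
        self.misses = 0              # consecutive frames without a detection

    def predict(self):
        return (self.position[0] + self.velocity[0],
                self.position[1] + self.velocity[1])

    def correct(self, detection):
        self.velocity = (detection[0] - self.position[0],
                         detection[1] - self.position[1])
        self.position = detection
        self.misses = 0


def step(tracks, detections, max_dist=5.0, max_misses=2):
    """One predict / associate / correct cycle; returns the surviving tracks."""
    unmatched = list(detections)
    for track in tracks:
        predicted = track.predict()                                  # 1. predict
        best = min(unmatched, key=lambda d: math.dist(d, predicted), default=None)
        if best is not None and math.dist(best, predicted) <= max_dist:
            unmatched.remove(best)                                   # 2. associate
            track.correct(best)                                      # 3. correct
        else:
            track.position = predicted  # no detection: coast on the motion model
            track.misses += 1
    survivors = [t for t in tracks if t.misses <= max_misses]        # delete stale tracks
    survivors += [Track(d) for d in unmatched]                       # initiate new tracks
    return survivors


tracks = []
frames = [[(0.0, 0.0), (20.0, 20.0)],
          [(1.0, 0.0), (20.0, 21.0)],
          [(2.0, 0.0)],                  # the second object is missed in this frame
          [(3.0, 0.0), (20.0, 23.0)]]
for detections in frames:
    tracks = step(tracks, detections)
```

In the example, the second object is not detected in the third frame; its track coasts on the predicted position and is re-associated in the fourth frame instead of being restarted with a new identity.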
When do observations in consecutive frames belong together?
- Predict bounding box (via motion model) and measure overlap
- Compare color histograms or normalized cross-correlation
- Estimate optical flow and measure agreement
- Compare relative location and size of bounding box
- Compare orientation of detected objects
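The first cue, overlap between a predicted and a detected bounding box, is commonly measured as intersection over union (IoU). A minimal sketch, assuming boxes are given as (x1, y1, x2, y2) corner coordinates; the function name `iou` and the example boxes are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 10, 10)   # box predicted by the motion model
detected = (5, 0, 15, 10)    # box returned by the detector
overlap = iou(predicted, detected)  # intersection 50, union 150
```

An IoU of 1 means the boxes coincide and 0 means they are disjoint, so 1 - IoU can serve as an association cost.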
Simple Online Realtime Tracking (SORT)
- Very popular approach for object tracking.
- Faster R-CNN as the object detector.
- MOT based on the Kalman filter.
- A filter for each object being tracked (tracklet).
- Association based on the Hungarian algorithm.
- Heuristics to create and remove objects being tracked.
- Separates detection and tracking.
- Requires training only the object detector!
- Very easy to adapt to other object detectors, because the tracking part doesn’t change.
- Potentially, there is loss of performance by not considering detection and tracking as a single problem.
[Bewley et al., 2017, Simple Online and Realtime Tracking]
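The association step can be sketched with SciPy's `linear_sum_assignment`, which solves the same assignment problem as the Hungarian algorithm. This is an illustration, not the SORT reference code: the cost values are made up, cost is assumed to be 1 - IoU between predicted track boxes and detections, and the helper `associate` and its `max_cost` gate are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def associate(cost, max_cost):
    """Match tracks (rows) to detections (columns) by minimising the total cost."""
    cost = np.asarray(cost, dtype=float)
    rows, cols = linear_sum_assignment(cost)
    # Reject assignments whose cost is too high (i.e., too little overlap).
    matches = [(int(r), int(c)) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]
    matched_rows = {r for r, _ in matches}
    matched_cols = {c for _, c in matches}
    unmatched_tracks = [r for r in range(cost.shape[0]) if r not in matched_rows]
    unmatched_dets = [c for c in range(cost.shape[1]) if c not in matched_cols]
    return matches, unmatched_tracks, unmatched_dets


# cost[i][j] = 1 - IoU between the predicted box of track i and detection j
cost = [[0.1, 0.9, 0.8],
        [0.7, 0.2, 0.95]]
matches, lost_tracks, new_dets = associate(cost, max_cost=0.7)
```

Unmatched detections are candidates for new tracks, and tracks that stay unmatched for several frames are candidates for removal; this is where the creation and removal heuristics come in.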
DeepSORT
[Wojke et al., 2017, Simple Online and Realtime Tracking with a Deep Association Metric]
Metric
We can adopt metrics from object detection, such as mAP, accuracy, and precision. However, tracking has its own challenges, so dedicated metrics are also used:
- HOTA (Higher Order Tracking Accuracy)
- MOTA (Multiple Object Tracking Accuracy)
- MOTP (Multiple Object Tracking Precision)
- IDF1 (Identification F1 Score)
- MT (mostly tracked): number of trajectories that were tracked for at least 80% of their lifetime.
- ML (mostly lost): number of trajectories that were tracked for less than 20% of their lifetime.
- FP: number of false positives.
- FN: number of false negatives.
- IDSW: number of identity switches.
- Frag: number of fragmentations (times a trajectory is incorrectly interrupted).
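Several of these counts combine into the summary scores. As a sketch, MOTA is commonly defined as 1 - (FN + FP + IDSW) / GT, where GT is the total number of ground-truth objects over all frames, and IDF1 as 2·IDTP / (2·IDTP + IDFP + IDFN). The function names and the example counts below are made up for illustration.

```python
def mota(fn, fp, idsw, gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over all frames."""
    if gt <= 0:
        raise ValueError("gt must be the positive number of ground-truth objects")
    return 1.0 - (fn + fp + idsw) / gt


def idf1(idtp, idfp, idfn):
    """IDF1 = 2*IDTP / (2*IDTP + IDFP + IDFN), computed on identity-level matches."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


# Hypothetical counts for one sequence
mota_score = mota(fn=150, fp=50, idsw=10, gt=1000)
idf1_score = idf1(idtp=800, idfp=100, idfn=200)
```

Note that MOTA can become negative when the number of errors exceeds the number of ground-truth objects.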
Reference
- For Bayes filter and Kalman filter: Probabilistic Robotics.
- SORT and DeepSORT papers:
  - Bewley et al., 2017, Simple Online and Realtime Tracking.
  - Wojke et al., 2017, Simple Online and Realtime Tracking with a Deep Association Metric.
- Metrics for MOT:
  - Milan et al., 2016, MOT16: A Benchmark for Multi-Object Tracking.
- Gioele Ciaparrone, Francisco Luque Sánchez, Siham Tabik, Luigi Troiano, Roberto Tagliaferri, Francisco Herrera. Deep learning in video multi-object tracking: A survey. Neurocomputing 381 (2020) 61–88.
- CV3DST - Object tracking
- CV3DST - Multi-object tracking
- Aula 8 - Object Tracking - CM203
- L11 - Object Tracking - Lecture: Self-Driving Cars
- Bastian Leibe (RWTH Aachen): Computer Vision 2
- Laura Leal-Taixé (TUM): Computer Vision 3: Detection, Segmentation, Tracking