Identifying multiple objects from their appearance in inaccurate detections

Kooij, Julian; Englebienne, Gwenn; Gavrila, D.

Journal article (2015)

Authors

Julian Kooij (Intelligent Autonomous Systems Group, Universiteit van Amsterdam)

Gwenn Englebienne (Universiteit van Amsterdam)

D. Gavrila (Universiteit van Amsterdam, Intelligent Autonomous Systems Group)

Affiliation

External organisation

Keywords

Segmentation, Unsupervised learning, Object recognition, Latent Dirichlet Allocation, Generative model, Video surveillance

To reference this document use:

http://resolver.tudelft.nl/uuid:034e4191-fca4-4946-b10d-e52dc0cc4f1e

Published Date

01-01-2015

Language

English

Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

We propose a novel method for keeping track of multiple objects in provided regions of interest, i.e. object detections, specifically in cases where a single object results in multiple co-occurring detections (e.g. when objects exhibit unusual size or pose) or a single detection spans multiple objects (e.g. during occlusion). Our method identifies a minimal set of objects to explain the observed features, which are extracted from the regions of interest in a set of frames. Focusing on appearance rather than temporal cues, we treat video as an unordered collection of frames, and "unmix" object appearances from inaccurate detections within a Latent Dirichlet Allocation (LDA) framework, for which we propose an efficient Variational Bayes inference method. After the objects have been localized and their appearances have been learned, we can use the posterior distributions to "back-project" the assigned object features to the image and obtain segmentation at pixel level. In experiments on challenging datasets, we show that our batch method outperforms state-of-the-art batch and on-line multi-view trackers in terms of number of identity switches and proportion of correctly identified objects. We make our software and new dataset publicly available for non-commercial, benchmarking purposes.
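The "unmixing" idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: this record does not state their features, model details, or inference updates, so the code below assumes plain LDA with textbook batch Variational Bayes updates. The mapping is also an assumption made for illustration: each detection plays the role of a document, each quantized appearance feature (for example a colour bin) plays the role of a word, and each object plays the role of a topic. The names `unmix`, `counts`, `num_objects`, and the seeding of each object from one evenly spaced detection are hypothetical.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (upward recurrence, then asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def expected_log(params):
    """E[log p_i] for p ~ Dirichlet(params)."""
    total = digamma(sum(params))
    return [digamma(p) - total for p in params]


def unmix(counts, num_objects, alpha=0.1, eta=0.1, iters=30):
    """Plain LDA with batch Variational Bayes (illustrative assumption).

    counts[d][w] is how often quantized feature w occurs in detection d.
    Returns per-detection object proportions and per-object feature
    pseudo-counts.
    """
    D, V, K = len(counts), len(counts[0]), num_objects
    # Hypothetical initialisation: seed each object from one detection.
    lam = [[eta + counts[k * D // K][w] for w in range(V)] for k in range(K)]
    gamma = [[alpha + sum(row) / K] * K for row in counts]
    for _ in range(iters):
        elog_beta = [expected_log(row) for row in lam]
        new_lam = [[eta] * V for _ in range(K)]
        for d in range(D):
            elog_theta = expected_log(gamma[d])
            new_gamma = [alpha] * K
            for w in range(V):
                if counts[d][w] == 0:
                    continue
                # Responsibility of each object for feature w in detection d.
                logits = [elog_theta[k] + elog_beta[k][w] for k in range(K)]
                top = max(logits)
                phi = [math.exp(v - top) for v in logits]
                norm = sum(phi)
                for k in range(K):
                    resp = counts[d][w] * phi[k] / norm
                    new_gamma[k] += resp
                    new_lam[k][w] += resp
            gamma[d] = new_gamma
        lam = new_lam
    theta = [[g / sum(row) for g in row] for row in gamma]
    return theta, lam


# Toy data (made up): features 0-2 belong to one object, 3-5 to another.
# Detections 2 and 5 span both objects, as during occlusion.
counts = [
    [5, 4, 6, 0, 0, 0],
    [6, 5, 4, 0, 0, 0],
    [3, 2, 3, 3, 2, 3],
    [0, 0, 0, 5, 6, 4],
    [0, 0, 0, 4, 5, 6],
    [2, 3, 3, 3, 3, 2],
]
theta, lam = unmix(counts, num_objects=2)
```

In this toy setting each row of `theta` gives the estimated share of a detection explained by each object, so a detection covering two objects ends up with a roughly even split. The per-feature responsibilities `phi` are the kind of quantity that could be mapped back to image locations for segmentation, although how the paper does this is not described in this record.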

No files available

Metadata only record. There are no files for this journal article.