Visual analysis of videos of crowded scenes
Dissertation   Open access

Louis Aloysious Kratz III
Doctor of Philosophy (Ph.D.), Drexel University
Apr 2012
DOI: https://doi.org/10.17918/etd-7825

Abstract

Computer Science
Automatic, vision-based analysis of crowds has implications in a number of fields, but faces unique challenges due to the large number of pedestrians within the scenes. The movement of each pedestrian contributes to the overall crowd motion (i.e., the collective motions of the scene's constituents) that varies spatially across the frame and temporally over the video. This thesis explores how to model the dynamically varying crowd motion, and how to leverage it to perform vision-based analysis on videos of crowded scenes. The crowd motion serves as a scene-centric constraint (i.e., representing the motion in the entire video), compared with conventional object-centric methods that build on individual constituents. By exploring what information the crowd motion can represent, we demonstrate the impact of leveraging our model on three problems facing video analysis of crowded scenes. First, we represent the crowd motion using a novel statistical model of local motion patterns (i.e., the motion in local space-time areas). By doing so, we may learn the spatially and temporally varying underlying structure of the crowd motion from an example video of crowd behavior. Second, we use our model to represent the typical crowd activity (i.e., the crowd's steady-state) and detect unusual events in local areas of the video. Specifically, we identify local motion patterns that statistically deviate from our learned model. Our space-time model enables detection and isolation of unusual events that are specific to the scene and the location within the video. Next, we use the crowd motion as an indicator of an individual's motion to perform tracking. Specifically, we predict the local motion patterns at different space-time locations of the video and use them as a prior to track individuals in a Bayesian framework. Leveraging the crowd motion provides an accurate prior that dynamically adapts to the space-time variations of the crowd.
Finally, we explore how to measure the degree to which individual pedestrians conform to the movement of the crowd. To achieve this, we use our crowd model to indicate the future locations of pedestrians, and compare the direction in which they would move with their instantaneous optical flow. By identifying deviations from the crowd, we identify global unusual events and augment our tracking method to model the individuality of each target. We compare with conventional object-centric methods and those that do not encode the space-time varying motion of the crowd. We demonstrate that our scene-centric approach (i.e., one that starts with the crowd motion) advances video analysis closer to the robustness and dependability needed for real-world video analysis of scenes containing a large number of pedestrians.
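
The comparison between the crowd-predicted direction and a pedestrian's instantaneous optical flow can be illustrated with a minimal sketch. The abstract does not state which measure the thesis uses; the cosine similarity below, the function names `conformity` and `deviates`, and the `threshold` parameter are all assumptions chosen for illustration only.

```python
import math


def conformity(predicted_dir, flow):
    """Cosine similarity between two 2-D vectors (an assumed measure).

    predicted_dir: (dx, dy) direction suggested by a crowd model (hypothetical input).
    flow:          (dx, dy) instantaneous optical flow at the pedestrian.
    Returns a value in [-1, 1]; 0.0 when either vector has zero length.
    """
    px, py = predicted_dir
    fx, fy = flow
    norm = math.hypot(px, py) * math.hypot(fx, fy)
    if norm == 0.0:
        return 0.0  # direction is undefined for a zero vector
    cos = (px * fx + py * fy) / norm
    return max(-1.0, min(1.0, cos))  # guard against floating-point drift


def deviates(predicted_dir, flow, threshold=0.0):
    """Flag a pedestrian whose flow disagrees with the crowd direction.

    The threshold is a hypothetical tuning parameter, not taken from the thesis.
    """
    return conformity(predicted_dir, flow) < threshold
```

Under these assumptions, a pedestrian moving with the crowd scores near 1, one moving against it scores near -1, and `deviates` marks the latter as a candidate deviation.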
