Abstract
Today, there are many opportunities to create human-centric, vision-based intelligent systems. This is a rich area because humans are complex, and the number of tasks such systems can address is almost limitless. Camera placement, moreover, has an enormous impact on the performance of human-centric vision systems. This dissertation therefore focuses on the problem of task-specific camera placement. We attempt to determine how to place cameras relative to the activities performed by human subjects, so that the resulting image input optimizes the system's ability to achieve its task (learning activities, recognizing motion, taking measurements, etc.).

We study this problem in three parts. We begin by examining the limits of a single viewpoint for measuring humans and recognizing activities in outdoor, real-world settings. We develop a system comprising a wide-angle, fixed field-of-view camera coupled with a computer-controlled pan/tilt/zoom camera to make detailed measurements of people for activity recognition applications.

We then develop methods for the task-specific camera placement of multi-camera systems. We consider the most fundamental and general task, maximizing observability, as well as the task of optimizing classification accuracy for a family of motion classifiers. Our goal is to optimize the aggregate observability of the tasks performed by the subjects in an area. We develop a general analytical formulation of the observation problem, in terms of the statistics of the motion in the scene and the total resolution of the observed actions, and use an optimization approach to find the camera parameters that maximize the observation criteria. We demonstrate the method for multi-camera systems in real-world monitoring applications, both indoor and outdoor.

For activity recognition applications, we develop a virtual-camera approach: we render novel views of the scene that match the training view, constructing the proper input for view-dependent motion classifiers from a combination of arbitrary views taken by several cameras. We evaluated the method on an existing view-dependent human motion classification system, over 162 different motion sequences, with encouraging results.
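To make the observability formulation concrete, the following is a minimal toy sketch, not the dissertation's actual model: aggregate observability is a weighted sum over the scene's motion statistics (here, a few hypothetical action locations with occurrence frequencies), where each action contributes a resolution term that falls off with distance and vanishes outside the camera's field of view. The camera parameter being optimized is reduced to a single pan angle, searched by brute force; `ACTIONS`, `hfov`, and the resolution model are all illustrative assumptions.

```python
import math

# Hypothetical motion statistics: each action occurs at (x, y), with weight w
# giving its relative frequency in the scene. These numbers are illustrative.
ACTIONS = [
    (4.0, 1.0, 0.5),
    (3.0, -2.0, 0.3),
    (6.0, 2.5, 0.2),
]

def observability(pan, hfov=math.radians(30)):
    """Aggregate observability of a camera at the origin with the given pan angle.

    Toy model: an action contributes weight / distance (resolution falls off
    with range) if its bearing lies within the half field of view, else zero.
    """
    total = 0.0
    for x, y, w in ACTIONS:
        bearing = math.atan2(y, x)
        if abs(bearing - pan) <= hfov:      # action inside the field of view?
            total += w / math.hypot(x, y)   # crude resolution term
    return total

def best_pan(step=math.radians(1)):
    """Grid search over pan angles for the placement maximizing observability."""
    pans = [i * step for i in range(-180, 181)]
    return max(pans, key=observability)

pan_star = best_pan()
```

In the dissertation's setting the search space covers multiple cameras and richer parameters (position, tilt, zoom), and the criterion reflects the actual statistics and resolution of observed actions, but the structure is the same: evaluate an aggregate observability criterion over candidate camera configurations and pick the optimum.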