|Technical Information and Slides|
Activity analysis in intelligent rooms (Poster)
The system tracks people in 3D and is aware of their identities and of the identity of the current speaker. From this information and the known layout of the environment, a rich set of activities can be defined and recognized in real time. 3D tracking is based on video from static cameras with highly overlapping fields of view. An active camera network serves two purposes: taking snapshots of people's faces for face recognition and capturing video of interesting events. People are identified by combining face and voice recognition results to achieve more robust performance. In addition to this real-time functionality, the system allows past events in the environment to be reviewed. Events are summarized graphically so that the user can easily grasp the spatio-temporal relationships between events and the people who participated in them. The same graphic also serves as a user interface for interactive review, such as replaying the video associated with a particular event.
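As an illustration of how 3D tracking from cameras with overlapping fields of view can work, the sketch below triangulates a single 3D point from 2D detections in calibrated views using linear (DLT) triangulation. The projection matrices and detections are synthetic examples; the original system's exact tracking method is not specified here.

```python
import numpy as np

def triangulate(points_2d, proj_mats):
    """Linear (DLT) triangulation of one 3D point from N calibrated views.

    points_2d: list of (u, v) pixel detections, one per camera.
    proj_mats: list of 3x4 projection matrices, assumed known from
    calibration (illustrative, not the original system's pipeline).
    """
    rows = []
    for (u, v), P in zip(points_2d, proj_mats):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.asarray(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Two synthetic cameras observing the point (1, 2, 10).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])               # camera at the origin
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])   # shifted along x
X_true = np.array([1.0, 2.0, 10.0, 1.0])
u1 = P1 @ X_true; u1 = u1[:2] / u1[2]
u2 = P2 @ X_true; u2 = u2[:2] / u2[2]
X_est = triangulate([u1, u2], [P1, P2])
print(X_est)  # ≈ [1, 2, 10]
```

With more than two cameras the same least-squares formulation simply gains extra rows, which is what makes highly overlapping fields of view attractive for robustness.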
By combining audio and video sensor data, more robust performance can be achieved than with either modality alone. To this end, we have pursued research in audio and multimodal signal analysis.
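One common way to combine modalities is late fusion of per-identity recognition scores. The sketch below uses a weighted sum with an assumed face weight of 0.6; both the weight and the score values are illustrative, not taken from the original system.

```python
def fuse_scores(face_scores, voice_scores, w_face=0.6):
    """Late fusion of face and voice recognition scores.

    face_scores, voice_scores: dicts mapping identity -> score in [0, 1].
    w_face: relative weight of the face modality (a tunable assumption).
    Returns the best identity and the full fused score table.
    """
    fused = {}
    for person in face_scores:
        fused[person] = (w_face * face_scores[person]
                         + (1.0 - w_face) * voice_scores[person])
    return max(fused, key=fused.get), fused

# Face recognition is unsure between two people; voice resolves the tie.
face = {"alice": 0.52, "bob": 0.48}
voice = {"alice": 0.30, "bob": 0.70}
best, fused = fuse_scores(face, voice)
print(best)  # bob
```

Even this simple scheme shows the benefit of fusion: a confident modality can override an ambiguous one, which is the source of the robustness mentioned above.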
Using multiple omnidirectional video streams, we have developed algorithms for generating virtual views from arbitrary viewpoints, as well as virtual walkthroughs. We have also developed techniques for seamlessly merging high-resolution rectilinear images with low-resolution omnidirectional images.
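A basic building block for working with omnidirectional video is unwarping the circular mirror image into a cylindrical panorama. The sketch below does this with a simple polar-to-Cartesian lookup; the mirror center and radii are assumed known from calibration, and this is an illustrative baseline rather than the actual view-generation algorithm.

```python
import numpy as np

def unwarp_omni(omni, cx, cy, r_min, r_max, out_w=360, out_h=90):
    """Unwarp a circular omnidirectional image into a cylindrical panorama.

    omni: HxW grayscale image as a numpy array.
    (cx, cy): mirror center in pixels; [r_min, r_max]: its radial extent.
    All parameters are assumed calibration values for illustration.
    """
    pano = np.zeros((out_h, out_w), dtype=omni.dtype)
    for y in range(out_h):
        # Panorama row -> radius on the omnidirectional image.
        r = r_min + (r_max - r_min) * y / (out_h - 1)
        for x in range(out_w):
            # Panorama column -> angle around the mirror center.
            theta = 2 * np.pi * x / out_w
            sx = int(round(cx + r * np.cos(theta)))
            sy = int(round(cy + r * np.sin(theta)))
            if 0 <= sy < omni.shape[0] and 0 <= sx < omni.shape[1]:
                pano[y, x] = omni[sy, sx]
    return pano

# Synthetic 200x200 omnidirectional frame.
omni = np.arange(200 * 200, dtype=np.float64).reshape(200, 200)
pano = unwarp_omni(omni, cx=100, cy=100, r_min=20, r_max=90)
```

A production version would use bilinear interpolation and a precomputed lookup table instead of nearest-neighbor sampling in a double loop, but the geometry is the same.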
Semantic event databases store abstracted past states (events) of the intelligent environment and its active sensor networks. They can be queried flexibly using a powerful language that characterizes complex activities as spatio-temporal compositions of simpler semantic events.
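To make the idea of spatio-temporal composition concrete, the sketch below models events as timestamped records and defines one hypothetical composition operator, "A followed by B within t seconds at the same place". The event schema and operator name are assumptions for illustration, not the system's actual query language.

```python
from dataclasses import dataclass

@dataclass
class Event:
    kind: str      # e.g. "enter", "speak"
    actor: str
    place: str
    t: float       # seconds since the start of the session

def followed_by(events, first, second, within, same_place=True):
    """Find pairs (e1, e2) where the same actor performs an event of kind
    `second` within `within` seconds after an event of kind `first`.
    An illustrative composition operator over the event database.
    """
    hits = []
    for e1 in events:
        if e1.kind != first:
            continue
        for e2 in events:
            if (e2.kind == second and e2.actor == e1.actor
                    and 0 < e2.t - e1.t <= within
                    and (not same_place or e2.place == e1.place)):
                hits.append((e1, e2))
    return hits

log = [
    Event("enter", "alice", "meeting_room", 10.0),
    Event("enter", "bob", "meeting_room", 12.0),
    Event("speak", "alice", "meeting_room", 25.0),
]
# "Who started speaking within 30 s of entering the room?"
matches = followed_by(log, "enter", "speak", within=30.0)
```

More complex activities would be expressed by nesting such operators (sequence, overlap, co-location), with the graphical event summary serving as the front end for browsing the results.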