Paper
8 July 1998 Noninvasive extraction of audiovisual cues for multimodal applications
Harouna Kabre
Abstract
We describe HOPS, a system for extracting audiovisual cues to model the environment of a computer end user. The objective of the study is to provide reliable audiovisual cues that 'augment' the set of computer input devices available to multimodal applications. The system accepts an audiovisual scene as input and produces different kinds of events that can increase the awareness and robustness of interactive systems. The described framework for the extraction of cues is ecological and homogeneous. On the audio path, a cross-power spectrum method is applied to extract different kinds of acoustic patterns, defined as acoustic segments. The acoustic signal from a microphone and the acoustic segments are first FFT-transformed and averaged, and then correlated in the spectral domain. The maximum of the inverse Fourier transform of this cross-power spectrum is the criterion for detecting acoustic events. On the video path, we define initial color models of desired cues such as the mouth and eyes, and then track them in the audiovisual scene recorded by a camera.
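The audio-path steps described in the abstract (transform, correlate in the spectral domain, take the peak of the inverse transform) can be illustrated with a minimal sketch. This is not the HOPS implementation: the function names `fft`, `cross_correlate`, and `detect`, the energy normalisation, the 0.5 threshold, and the synthetic test signal are all assumptions made for illustration. The averaging step is omitted because the abstract does not specify how it is performed. A small radix-2 FFT is included only to keep the example dependency-free; in practice a library FFT would normally be used.

```python
import cmath
import math
import random


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def cross_correlate(frame, segment):
    """Circular cross-correlation computed through the cross-power spectrum."""
    n = len(frame)
    padded = list(segment) + [0.0] * (n - len(segment))  # zero-pad the template
    spec_frame = fft(frame)
    spec_seg = fft(padded)
    # Cross-power spectrum: frame spectrum times conjugate of segment spectrum.
    cross_power = [a * b.conjugate() for a, b in zip(spec_frame, spec_seg)]
    # The inverse transform gives the correlation as a function of lag.
    return [v.real / n for v in fft(cross_power, inverse=True)]


def detect(frame, segment, threshold=0.5):
    """Return (lag of the correlation peak, normalised score, detected flag)."""
    corr = cross_correlate(frame, segment)
    lag = max(range(len(corr)), key=lambda i: corr[i])
    # Assumed normalisation: divide by the geometric mean of the two energies,
    # which bounds the score by 1 (Cauchy-Schwarz).
    energy = math.sqrt(sum(v * v for v in frame) * sum(v * v for v in segment))
    score = corr[lag] / energy if energy > 0 else 0.0
    return lag, score, score >= threshold


# Synthetic example: a 32-sample sine burst hidden in low-level noise at offset 100.
random.seed(0)
segment = [math.sin(2 * math.pi * 4 * i / 32) for i in range(32)]
frame = [random.gauss(0.0, 0.05) for _ in range(256)]
for i, v in enumerate(segment):
    frame[100 + i] += v

lag, score, detected = detect(frame, segment)
print(lag, round(score, 3), detected)
```

The peak position gives the offset at which the stored acoustic segment best matches the microphone signal, and the peak height serves as the detection criterion. A real system would need a threshold tuned on recorded data rather than the fixed value assumed here.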
© (1998) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Harouna Kabre "Noninvasive extraction of audiovisual cues for multimodal applications", Proc. SPIE 3389, Hybrid Image and Signal Processing VI, (8 July 1998); https://doi.org/10.1117/12.316534
KEYWORDS
Acoustics
Video
Visualization
Image filtering
Speech recognition
Cameras
Computing systems
