Investigation on effectiveness of mid-level feature representation for semantic boundary detection in news video

Regunathan Radhakrishan; Ziyou Xiong; Ajay Divakaran; Bhiksha Raj

doi:10.1117/12.514397

26 November 2003


Proceedings Volume 5242, Internet Multimedia Management Systems IV; (2003) https://doi.org/10.1117/12.514397
Event: ITCom 2003, 2003, Orlando, Florida, United States

Abstract

In our past work, we attempted to use a mid-level feature, namely the state population histogram obtained from the Hidden Markov Model (HMM) of a general sound class, for speaker change detection in order to extract semantic boundaries in broadcast news. In this paper, we compare the performance of that approach with another approach based on video shot detection and speaker change detection using the Bayesian Information Criterion (BIC). Our experiments show that the latter approach performs significantly better than the former. This motivated us to examine the mid-level feature more closely. We found that the component population histogram enabled discovery of broad phonetic categories such as vowels, nasals, and fricatives, regardless of the number of distinct speakers in the test utterance. For the histogram to be useful for speaker change detection, the individual components would instead have to model the phonetic sounds of each speaker separately. From our experiments, we conclude that state/component population histograms can be useful for further clustering or semantic class discovery only if the features are chosen carefully, so that the individual states represent the semantic categories of interest.
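
For readers unfamiliar with BIC-based speaker change detection, the following is a minimal sketch of the commonly used delta-BIC test, not the authors' implementation. It assumes each segment is modeled by a single full-covariance Gaussian over acoustic feature vectors; the abstract does not state the features, window sizes, or penalty weight used in the paper, so the function names (`delta_bic`, `detect_change`) and the parameters `lam` and `margin` are hypothetical choices made for illustration.

```python
import numpy as np

def logdet_cov(x):
    """Log-determinant of the maximum-likelihood covariance of x (rows = frames)."""
    d = x.shape[1]
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    cov = cov + 1e-6 * np.eye(d)  # small ridge (an assumption) for numerical stability
    _, logdet = np.linalg.slogdet(cov)
    return logdet

def delta_bic(x, i, lam=1.0):
    """Delta-BIC for splitting x at frame i; positive values favor a change point."""
    n, d = x.shape
    # Penalty for the extra mean and covariance parameters of the two-model hypothesis.
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    return (0.5 * n * logdet_cov(x)
            - 0.5 * i * logdet_cov(x[:i])
            - 0.5 * (n - i) * logdet_cov(x[i:])
            - lam * penalty)

def detect_change(x, margin=20, lam=1.0):
    """Return the frame index with the largest positive delta-BIC, or None."""
    n = x.shape[0]
    candidates = list(range(margin, n - margin))
    scores = [delta_bic(x, i, lam) for i in candidates]
    best = int(np.argmax(scores))
    return candidates[best] if scores[best] > 0 else None

if __name__ == "__main__":
    # Synthetic stand-in for acoustic features: two "speakers" with different means.
    rng = np.random.default_rng(0)
    first = rng.normal(0.0, 1.0, size=(200, 4))
    second = rng.normal(5.0, 1.0, size=(200, 4))
    print(detect_change(np.vstack([first, second])))
```

In practice the test is usually applied over a sliding or growing window of frames, and the penalty weight `lam` is tuned on held-out data; both choices here are illustrative only.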

Citation

Regunathan Radhakrishan, Ziyou Xiong, Ajay Divakaran, and Bhiksha Raj "Investigation on effectiveness of mid-level feature representation for semantic boundary detection in news video", Proc. SPIE 5242, Internet Multimedia Management Systems IV, (26 November 2003); https://doi.org/10.1117/12.514397
