Book Reviews

Multimedia Data Mining and Knowledge Discovery

J. Electron. Imaging. 17(4), 049901 (December 05, 2008). doi:10.1117/1.3040688
History: Published December 05, 2008

With the explosion of multimedia data, efficient indexing and retrieval of such data have become a challenging problem. Much research on it has been presented in the computer vision, machine learning, and pattern-recognition communities. This book tries to differentiate itself from other work by tackling the problem from a data-mining point of view.

As the preface indicates, the book tries to cover the broad area of multimedia data mining and knowledge discovery. It consists of 25 chapters in five parts. In each chapter, leading researchers from academia or industry report their research on selected topics, extending papers presented at the Multimedia Data Mining Workshops in 2003 and 2004.

In the introduction, the editors briefly introduce data mining techniques and their application in multimedia systems. Chapter 1 starts with a general discussion of multimedia data mining, its major applications, and future applications. It also provides a short summary of each chapter so that readers may jump directly to the content they are interested in. Chapter 2 follows with an excellent overview of multimedia data mining from a systems point of view. Of particular interest is the discussion of the differences between multimedia data mining and traditional data mining, e.g., unstructured data and fusion of multiple modalities.

Data representation and clustering are the focus of Part 2, which contains five chapters. Unlike many related papers, the works presented in Chaps. 3 to 6 are particularly interested in building a hierarchical structure of clusters, which is crucial for fast indexing in large-scale databases. Chapter 7 proposes to recognize motion data collected from a 3-D glove and motion cameras, which differs from the classic action recognition task with fixed cameras. Because the data are collected from multiple sensors over varied lengths of time, the authors propose to select the features using singular value decomposition and to recognize the motion using a support vector machine.
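To illustrate why singular value decomposition is a natural fit for recordings of varied length, the following is a minimal sketch and not the method of Chap. 7 itself. It assumes each recording is a T × d matrix (T time steps, d sensor channels) and takes the dominant right singular vector as a fixed-length descriptor; the function name and the toy data are hypothetical. The resulting d-dimensional vectors could then be passed to any classifier, such as a support vector machine, which is not shown here.

```python
import math


def dominant_right_singular_vector(rows, iterations=200):
    """Return the unit-length dominant right singular vector of a T x d matrix.

    `rows` is a list of T readings, each a list of d sensor values.
    Power iteration is run on the d x d Gram matrix A^T A, so the result
    always has length d, however many time steps the recording contains.
    Assumption: the uniform start vector is not orthogonal to the dominant
    direction (true for the non-negative toy data below).
    """
    d = len(rows[0])
    # Gram matrix G = A^T A; its top eigenvector is the top right singular vector.
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all-zero recording: keep the current vector
        v = [x / norm for x in w]
    # Singular vectors are defined up to sign; make the largest component positive.
    pivot = max(range(d), key=lambda i: abs(v[i]))
    if v[pivot] < 0:
        v = [-x for x in v]
    return v


if __name__ == "__main__":
    # Two hypothetical recordings of the same motion with different durations.
    short_recording = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
    long_recording = [[0.5 * k, 1.0 * k, 1.5 * k] for k in range(1, 8)]
    for recording in (short_recording, long_recording):
        feature = dominant_right_singular_vector(recording)
        print(len(recording), "time steps ->", len(feature), "features")
```

Both recordings map to a vector with one entry per sensor channel, so sequences of different durations become directly comparable.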

Part 3 contains three chapters related to indexing and retrieval of multimedia data. Chapter 8 first proposes a fuzzy representation of visual features, such as color, texture, and shape, with the desirable property that it describes not only the characteristics of each region but also the gradient transition between different regions. It also assigns weights to fuzzy feature sets, which reflect their membership in multiple regions. The rest of Chap. 8 proposes hierarchical indexing and relevance feedback to improve the effectiveness and efficiency of the system, both of which have been widely studied in the literature. Considering that the main contribution of the work is the fuzzy representation, this chapter could also be included in Part 1. Chapter 9 targets an important problem, namely, the lack of semantics in multimedia data, and proposes to build an ontology through data mining. Although it provides extensive discussion and system diagrams, the conclusion is not convincing without sufficient experiments or a system demonstration. Chapter 10 attempts to conduct scene classification on video data. It first recognizes each region of a key frame in the video and then feeds the confidence-rated predictions to a second-level classifier for final scene recognition of the video clip. Because similar techniques have been reported, its novelty seems to be limited.

Part 4 presents research results on multimedia data modeling and evaluation and consists of eight chapters. It starts with an interesting study on novelty detection in video clips in Chap. 11. After comparing the machine detection results with users’ interest regions captured by an eye tracker, it suggests that the machine may find more interest regions yet fail to detect the more important or novel ones. The reason may be the lack of rank information in training data, which is important but not available in most of the published research in related areas. Chapters 12 and 18 give a brief introduction to audio features and use them for soccer goal shot detection and music modeling. Chapter 13 proposes a novel technique based on multidimensional scaling (MDS) and Procrustes analysis to find a mapping between nodes in structured trees and, consequently, to match multimedia data represented in such form. Research on that problem may attract more attention as structured representation becomes standard for the semantic web. The rest of the part deals with classic problems such as segmentation, recognition, and relevance feedback.

In addition to introducing new techniques, the book showcases several multimedia data mining systems in Part 5, which demonstrate the potential applications and practical problems in this field. Chapter 19 designs a new virtual cooperative environment by integrating multimedia data. Chapter 20 tries to mine event patterns in video clips that are represented by multiple metadata streams associated with the audio and video. Chapter 21 combines multiple sensors, such as video cameras, audio recorders, and fingerprint readers, for person localization in offices. Chapter 22 estimates the attractiveness of banner images on websites from their visual properties. Chapter 23 analyzes users’ behavior during video search, while Chaps. 24 and 25 focus on iris recognition and medical image mining.

Although most chapters in this book include brief reviews of the literature, their main focus is on introducing novel techniques and applications. Readers are expected to have sufficient background in data mining, machine learning, and pattern recognition. Thus the book is suitable for scientists, engineers, and senior graduate students who work in related fields. Because the book is a compilation of published papers, the topics cannot be organized as neatly as in many other books. Additionally, some content might be a bit outdated since it is based on 4- or 5-year-old research. For example, most of the discussed methods do not use local features such as the scale-invariant feature transform (circa 2004), and Chap. 24 studies latent semantic indexing instead of its more powerful successors, probabilistic latent semantic indexing and latent Dirichlet allocation. Interested readers can complement the information contained in this book by referring to other papers in premier conferences and journals.

Jie Yu joined Kodak Research Labs in 2007 as a research scientist. He received his PhD in computer science at the University of Texas at San Antonio. His research interests include multimedia information retrieval, machine learning, computer vision, and pattern recognition. He has published over 20 journal articles, conference papers, and book chapters in these fields. He is the recipient of the Student Paper Contest Winner Award of the IEEE International Conference on Acoustics, Speech, and Signal Processing in 2006 and Best Poster Paper Award of the ACM International Conference on Image and Video Retrieval in 2008. He is a member of IEEE and ACM.



Valery A. Petrushin and Latifur Khan, "Multimedia Data Mining and Knowledge Discovery," J. Electron. Imaging. 17(4), 049901 (December 05, 2008).


