Over recent years, user-generated content has become ubiquitously available and an attractive entertainment source for
millions of end-users. Particularly for larger events, where many people use their devices to capture the action, a great
number of short video clips are made available through appropriate web services. The objective of this presentation is to
describe a way to combine these clips by analyzing them and automatically reconstructing the timeline in which the
individual video clips were captured. This will enable people to easily create a compelling multimedia experience by
leveraging multiple clips taken by different users from different angles, and across different time spans. The user will be
able to shift into the role of a movie director mastering a multi-camera recording of the event. To achieve this goal, the
audio portion of the video clips is analyzed, and waveform characteristics are computed with high temporal granularity
in order to facilitate precise time alignment and overlap computation of the user-generated clips. Special care must be
given not only to the robustness of the selected audio features against ambient noise and various distortions, but also to
the matching algorithm used to align the user-generated clips properly.