Semi-automated document image clustering and retrieval
24 March 2014
Markus Diem, Florian Kleber, Stefan Fiel, Robert Sablatnig
Proceedings Volume 9021, Document Recognition and Retrieval XXI; 90210M (2014) https://doi.org/10.1117/12.2043010
Event: IS&T/SPIE Electronic Imaging, 2014, San Francisco, California, United States
Abstract
This paper presents a semi-automated method for document image clustering and retrieval that creates links between different documents based on their content. Ideally, the initial bundling of shuffled document images can be reproduced, which makes it possible to explore large document databases. Structural and textural features that describe visual similarity are extracted and used by experts (e.g. registrars) to interactively cluster the documents with a manually defined feature subset (e.g. checked paper, handwritten). The methods presented allow for the analysis of heterogeneous documents that contain printed and handwritten text, and for hierarchical clustering with different feature subsets in different layers.
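The layered clustering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`checked_paper`, `handwritten`), the scores, the Euclidean distance, the single-linkage grouping, and the thresholds are all assumptions chosen for illustration. Each layer clusters the documents of its parent cluster using its own manually chosen feature subset.

```python
from math import dist


def cluster(docs, features, threshold):
    """Single-linkage grouping: documents whose distance on the chosen
    feature subset is below `threshold` are merged into one cluster."""
    vectors = [[d[f] for f in features] for d in docs]
    parent = list(range(len(docs)))

    def find(i):
        # Follow parent links to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if dist(vectors[i], vectors[j]) < threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, d in enumerate(docs):
        groups.setdefault(find(i), []).append(d)
    return list(groups.values())


def cluster_layers(docs, layers):
    """Apply one (features, threshold) pair per layer, recursing into each cluster."""
    if not layers:
        return [d["id"] for d in docs]
    (features, threshold), rest = layers[0], layers[1:]
    return [cluster_layers(group, rest) for group in cluster(docs, features, threshold)]


# Hypothetical per-document feature scores in [0, 1].
docs = [
    {"id": "a", "checked_paper": 1.0, "handwritten": 0.9},
    {"id": "b", "checked_paper": 0.9, "handwritten": 0.1},
    {"id": "c", "checked_paper": 0.0, "handwritten": 0.8},
    {"id": "d", "checked_paper": 0.1, "handwritten": 0.2},
]

# Layer 1 splits by paper type, layer 2 splits each result by writing type.
layers = [(["checked_paper"], 0.5), (["handwritten"], 0.5)]
tree = cluster_layers(docs, layers)
```

In an interactive setting, the `layers` list would be edited by the expert rather than fixed in code, so that a different feature subset can be tried at any level of the hierarchy.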
© (2014) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Markus Diem, Florian Kleber, Stefan Fiel, and Robert Sablatnig "Semi-automated document image clustering and retrieval", Proc. SPIE 9021, Document Recognition and Retrieval XXI, 90210M (24 March 2014); https://doi.org/10.1117/12.2043010
CITATIONS
Cited by 7 scholarly publications.
KEYWORDS
Image retrieval
Feature extraction
Visualization
Databases
Image segmentation
Distance measurement
Fermium