21 December 2000 Word extraction using irregular pyramid
PohKok Loo, Chew Lim Tan
Proceedings Volume 4307, Document Recognition and Retrieval VIII; (2000) https://doi.org/10.1117/12.410857
Event: Photonics West 2001 - Electronic Imaging, 2001, San Jose, CA, United States
Abstract
This paper proposes a new algorithm for extracting text from imaged documents, focusing on the extraction of word groups. An irregular pyramid structure is used as the basis of the algorithm. The algorithm is distinctive in that it includes strategic background information in the analysis, which most techniques discard. Both foreground (i.e., text area) regions and a portion of the background (i.e., white area) regions are examined. The algorithm is based on the concept of 'closeness': text within a group is closer together, in terms of spatial distance, than it is to other text areas. The results are encouraging: the algorithm correctly groups words of different sizes, fonts, arrangements, and orientations.
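The 'closeness' idea in the abstract can be illustrated with a small sketch. This is not the authors' irregular pyramid algorithm, which the abstract does not specify in detail. It is a simplified stand-in that assumes character components have already been reduced to axis-aligned bounding boxes, and it merges boxes whose spatial gap falls under a threshold. The names `box_gap`, `group_by_closeness`, and `max_gap` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def box_gap(a, b):
    """Euclidean gap between two boxes (x0, y0, x1, y1); 0 if they touch or overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx * dx + dy * dy) ** 0.5


def group_by_closeness(boxes, max_gap):
    """Group box indices so that boxes linked by gaps <= max_gap share a group.

    Assumption: a fixed threshold stands in for the paper's notion of
    closeness; the real method works on a pyramid and also uses background.
    """
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of boxes that are close enough to each other.
    for i, j in combinations(range(len(boxes)), 2):
        if box_gap(boxes[i], boxes[j]) <= max_gap:
            parent[find(i)] = find(j)

    # Collect indices by their group representative.
    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical input: three characters of one word, then two of another.
boxes = [(0, 0, 10, 10), (12, 0, 22, 10), (24, 0, 34, 10),
         (60, 0, 70, 10), (72, 0, 82, 10)]
words = group_by_closeness(boxes, max_gap=3)
```

Because the gap is computed in two dimensions, the same grouping rule applies to text in any arrangement or orientation, though a single fixed threshold would not adapt to mixed font sizes the way the abstract claims the pyramid approach does.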
© (2000) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
PohKok Loo and Chew Lim Tan "Word extraction using irregular pyramid", Proc. SPIE 4307, Document Recognition and Retrieval VIII, (21 December 2000); https://doi.org/10.1117/12.410857
KEYWORDS
Detection and tracking algorithms

Image resolution

Image processing

Feature extraction

Image analysis

Process control

Civil engineering
