1 April 1998 Methodologies for using UW databases for OCR and image-understanding systems
Ihsin T. Phillips
Proceedings Volume 3305, Document Recognition V; (1998) https://doi.org/10.1117/12.304624
Event: Photonics West '98 Electronic Imaging, 1998, San Jose, CA, United States
Abstract
This paper discusses methodologies for automatically selecting document pages and zones with the desired page/zone attributes from the UW databases. The selected pages can then be randomly partitioned into subsets for training and testing purposes. This paper also discusses three degradation methodologies that allow the developers of OCR and document recognition systems to create an unlimited number of 'real-life' degraded images - with geometric distortions, coffee stains and water marks. Since the degraded images are created from the images in the UW databases, the nearly perfect original groundtruth files in the UW databases can be reused. The process of creating the additional document images and the associated groundtruth and attribute files requires only a fraction of the original cost and time.
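The selection and partitioning steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the record layout, the attribute names (`language`, `columns`), the page ids, and the helper names `select_pages` and `partition` are all assumptions made for illustration, and the actual UW attribute files use their own format.

```python
import random


def select_pages(pages, **wanted):
    """Return the records whose attributes equal every requested value."""
    return [p for p in pages if all(p.get(k) == v for k, v in wanted.items())]


def partition(items, train_fraction=0.5, seed=0):
    """Shuffle a copy of items and split it into (train, test) subsets."""
    shuffled = list(items)
    # A seeded generator makes the random partition reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical attribute records, invented for this example.
pages = [
    {"id": "P001", "language": "English", "columns": 2},
    {"id": "P002", "language": "English", "columns": 1},
    {"id": "P003", "language": "English", "columns": 2},
    {"id": "P004", "language": "Japanese", "columns": 2},
    {"id": "P005", "language": "English", "columns": 2},
    {"id": "P006", "language": "English", "columns": 2},
]

# Keep only pages with the desired attributes, then split them.
selected = select_pages(pages, language="English", columns=2)
train, test = partition(selected, train_fraction=0.75, seed=42)
```

Because the split only reorders page identifiers, each page in either subset can still be paired with its existing groundtruth file, which is the reuse the abstract points to.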
© (1998) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ihsin T. Phillips "Methodologies for using UW databases for OCR and image-understanding systems", Proc. SPIE 3305, Document Recognition V, (1 April 1998); https://doi.org/10.1117/12.304624
CITATIONS
Cited by 14 scholarly publications.
KEYWORDS
Databases

Optical character recognition

Binary data

Image processing

Image understanding

Mathematics

Nanoimprint lithography
