Document boundary determination using structural and lexical analysis
19 January 2009
Kazem Taghva, Marc-Allen Cartright
Proceedings Volume 7247, Document Recognition and Retrieval XVI; 724704 (2009) https://doi.org/10.1117/12.805384
Event: IS&T/SPIE Electronic Imaging, 2009, San Jose, California, United States
Abstract
Document boundary determination is the task of identifying the individual documents in a stack of papers. In this paper, we report on a classification system that automates this task. The system employs features based on document structure and lexical content. We also report experimental results that support the effectiveness of the system.
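The abstract does not say which features or which classifier the system uses, so the following is only an illustrative sketch of the general idea: treat each page transition as a binary decision (boundary or continuation) based on one lexical feature and one structural feature. Every name here (`lexical_similarity`, `looks_like_first_page`, `is_boundary`, `split_documents`), the page-number cue, and the threshold value are assumptions made for illustration, and a fixed threshold rule stands in for whatever trained classifier the paper describes.

```python
import re


def tokens(page):
    """Return the set of lowercased word tokens in a page's text."""
    return set(re.findall(r"[a-z]+", page.lower()))


def lexical_similarity(prev_page, page):
    """Lexical feature: Jaccard overlap between the vocabularies of two pages."""
    a, b = tokens(prev_page), tokens(page)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def looks_like_first_page(page):
    """Structural feature (hypothetical): the last non-empty line is page number 1."""
    lines = [ln.strip() for ln in page.splitlines() if ln.strip()]
    if not lines:
        return False
    return re.fullmatch(r"(page\s+)?1", lines[-1].lower()) is not None


def is_boundary(prev_page, page, threshold=0.1):
    """Binary decision: does `page` start a new document?

    A simple rule replaces a trained classifier here; the threshold is arbitrary.
    """
    return looks_like_first_page(page) or lexical_similarity(prev_page, page) < threshold


def split_documents(pages, threshold=0.1):
    """Group a stack of page texts into documents, in order."""
    documents = []
    for page in pages:
        if not documents or is_boundary(documents[-1][-1], page, threshold):
            documents.append([page])  # start a new document
        else:
            documents[-1].append(page)  # continue the current document
    return documents
```

For example, `split_documents(stack)` on a list of OCR page texts returns a list of page groups, one per detected document. In a real system the two features would be among many fed to a learned model rather than combined by hand.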
© (2009) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Kazem Taghva and Marc-Allen Cartright "Document boundary determination using structural and lexical analysis", Proc. SPIE 7247, Document Recognition and Retrieval XVI, 724704 (19 January 2009); https://doi.org/10.1117/12.805384
KEYWORDS
Analytical research

Structural analysis

Classification systems

Error analysis

Binary data

Data modeling

Feature extraction
