Paper
13 January 2003 Automated labeling of bibliographic data extracted from biomedical online journals
Proceedings Volume 5010, Document Recognition and Retrieval X; (2003) https://doi.org/10.1117/12.476047
Event: Electronic Imaging 2003, Santa Clara, CA, United States
Abstract
A prototype system has been designed to automate the extraction of bibliographic data (e.g., article title, authors, abstract, and affiliation) from online biomedical journals to populate the National Library of Medicine’s MEDLINE database. This paper describes a key module in this system: the labeling module, which employs statistics and fuzzy rule-based algorithms to identify segmented zones in an article’s HTML pages as specific bibliographic data. Results from experiments conducted with 1,149 medical articles from forty-seven journal issues are presented.
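The abstract does not give the module's features, rules, or statistics, so the following is only a minimal sketch of what fuzzy rule-based zone labeling can look like in general, not the authors' method. The `Zone` fields, the `ramp` membership function, every threshold, and the three rules are hypothetical choices made for illustration; in a real system such thresholds would presumably be derived from statistics over labeled journal pages.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """A segmented zone of an HTML page (hypothetical feature set)."""
    text: str
    font_size: float  # rendered font size in points
    order: int        # position of the zone on the page, 0 = first


def ramp(x: float, lo: float, hi: float) -> float:
    """Fuzzy membership rising linearly from 0 at `lo` to 1 at `hi`."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def label_scores(zone: Zone) -> dict:
    """Score each candidate label with hand-picked (assumed) fuzzy rules."""
    words = len(zone.text.split())
    comma_rate = zone.text.count(",") / max(words, 1)

    # Fuzzy memberships; all breakpoints are illustrative assumptions.
    large_font = ramp(zone.font_size, 12, 18)
    near_top = 1.0 - ramp(zone.order, 0, 5)
    short = 1.0 - ramp(words, 15, 40)
    long_text = ramp(words, 60, 150)
    many_commas = ramp(comma_rate, 0.05, 0.3)

    # Fuzzy AND is taken as the minimum of the memberships.
    return {
        "title": min(large_font, near_top, short),
        "authors": min(many_commas, near_top, short),
        "abstract": long_text,
    }


def label_zone(zone: Zone, threshold: float = 0.5) -> str:
    """Return the best-scoring label, or "other" if no rule fires strongly."""
    scores = label_scores(zone)
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "other"
```

Taking the minimum as the fuzzy AND means a zone is labeled only when every condition of a rule holds to a reasonable degree, and the `threshold` fallback keeps weakly matching zones out of the bibliographic fields.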
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jongwoo Kim, Daniel X. Le, and George R. Thoma "Automated labeling of bibliographic data extracted from biomedical online journals", Proc. SPIE 5010, Document Recognition and Retrieval X, (13 January 2003); https://doi.org/10.1117/12.476047
KEYWORDS
Biomedical optics, Fuzzy logic, Databases, Mars, Medicine, Telecommunications, Data centers
