Text documents as social networks
21 February 2012
Helen Balinsky, Alexander Balinsky, Steven J. Simske
Proceedings Volume 8302, Imaging and Printing in a Web 2.0 World III; 830207 (2012) https://doi.org/10.1117/12.909110
Event: IS&T/SPIE Electronic Imaging, 2012, Burlingame, California, United States
Abstract
The extraction of keywords and features is a fundamental problem in text data mining. Document processing applications depend directly on the quality and speed with which salient terms and phrases are identified. Applications as disparate as automatic document classification, information visualization, filtering, and security policy enforcement all rely on the quality of automatically extracted keywords. Recently, a novel approach to rapid change detection in data streams and documents has been developed. It is based on ideas from image processing, in particular on the Helmholtz principle from the Gestalt theory of human perception. By modeling a document as a one-parameter family of graphs, with its sentences or paragraphs defining the vertex set and with edges defined by the Helmholtz principle, we demonstrated that for some range of the parameter the resulting graph becomes a small-world network. In this article we investigate the natural orientation of edges in such small-world networks. For two connected sentences, we can say which one comes first and which one comes second, according to their position in the document. This orientation makes such a graph look like a small WWW-type network, so PageRank-type algorithms can be used to produce an interesting ranking of the nodes in such a document.
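The construction described in the abstract (sentences as vertices, edges oriented by position in the document, then a PageRank-type ranking) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not spell out the Helmholtz-principle edge test, so the sketch substitutes a simple shared-word overlap as a stand-in. The direction chosen for each edge (from the earlier sentence to the later one), the function names, and the parameters `min_shared`, `damping`, and `iterations` are all assumptions made for illustration.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_oriented_graph(sentences, min_shared=1):
    """Return adjacency lists with an edge i -> j whenever i < j and the two
    sentences share at least `min_shared` words.

    ASSUMPTION: word overlap replaces the Helmholtz-principle edge test,
    which the abstract does not define. Words shorter than four letters
    are ignored as a crude stop-word filter. Pointing edges from the
    earlier to the later sentence is also an assumption.
    """
    bags = [set(re.findall(r"[a-z]{4,}", s.lower())) for s in sentences]
    edges = {i: [] for i in range(len(sentences))}
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if len(bags[i] & bags[j]) >= min_shared:
                edges[i].append(j)
    return edges


def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over adjacency lists.

    Rank held by vertices with no outgoing edges is redistributed
    uniformly, so the scores always sum to 1.
    """
    n = len(edges)
    if n == 0:
        return {}
    rank = {v: 1.0 / n for v in edges}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in edges if not edges[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in edges}
        for v, targets in edges.items():
            for w in targets:
                new_rank[w] += damping * rank[v] / len(targets)
        rank = new_rank
    return rank


if __name__ == "__main__":
    text = "Graphs model documents. Sentences become vertices. Graphs rank sentences."
    sentences = split_sentences(text)
    graph = build_oriented_graph(sentences)
    scores = pagerank(graph)
    # Print sentences from highest to lowest score.
    for idx in sorted(scores, key=scores.get, reverse=True):
        print(f"{scores[idx]:.3f}  {sentences[idx]}")
```

With this orientation, later sentences that reuse vocabulary from earlier ones accumulate rank. Reversing the edge direction would favor early sentences instead, and the abstract alone does not settle which choice the authors made.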
© (2012) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Helen Balinsky, Alexander Balinsky, and Steven J. Simske "Text documents as social networks", Proc. SPIE 8302, Imaging and Printing in a Web 2.0 World III, 830207 (21 February 2012); https://doi.org/10.1117/12.909110
CITATIONS
Cited by 1 scholarly publication.
KEYWORDS
Social networks
Neodymium
Data mining
Visualization
Image processing
Brain
Computer security