Incremental information mining

S. K. Gupta; P. Suresh; Vasudha Bhatanagar

doi:10.1117/12.460252

12 March 2002

Proceedings Volume 4730, Data Mining and Knowledge Discovery: Theory, Tools, and Technology IV; (2002) https://doi.org/10.1117/12.460252
Event: AeroSense 2002, Orlando, FL, United States

Abstract

Decision tree classification is a simple and important mining function. Decision tree algorithms are computationally intensive, yet they do not capture evolutionary trends in an incrementally growing data repository. In conventional mining approaches, if two or more datasets are merged into a single target dataset, the entire computation for constructing a classifier has to be carried out all over again. Previous work in this field has constructed individual decision tree classifiers and merged them, either by voted arbitration or by merging the corresponding decision rules.

We attempt a new approach: we pre-process the individual windows of the growing database into summaries that we call Knowledge Concentrates (KCs). The KCs are formed offline. Mining operations use the KCs instead of the entire past data, thereby reducing the space and time complexity of the mining process. The user dynamically selects the target dataset by identifying the windows of interest. The mining requirement is satisfied by merging the respective KCs and running the decision tree algorithm on the merged KC.

The proposed scheme operates in three phases. The first is the planning phase, in which the dataset domain information is gathered and the data mining goals are defined. The second phase makes a single scan of a window in the database and generates a summary of the window as a KC. In our work we use an efficient trie structure to store the KCs. The third phase merges the KCs of the desired windows and applies the classification algorithm to the aggregate of the KCs to give the final required classifier.

The salient issue addressed in this work is forming a condensed representation of the database from which the patterns that are input to a decision-making algorithm can be extracted to form the required decision tree. The entire scheme is independent of the decision tree algorithm, in the sense that the user has the flexibility to use any standard decision tree algorithm.
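The abstract does not give the internal layout of a KC, so the following is only a minimal sketch of the second and third phases under stated assumptions: each record is a tuple of categorical attribute values plus a class label, each trie path spells out one attribute-value combination, and the node at the end of a path stores per-class counts. The names `TrieNode`, `build_kc`, `merge_kc`, and `weighted_rows` are hypothetical and do not come from the paper.

```python
from collections import Counter


class TrieNode:
    """One trie node; class_counts is non-empty only where a record ends."""

    def __init__(self):
        self.children = {}             # attribute value -> child TrieNode
        self.class_counts = Counter()  # class label -> number of records


def build_kc(window):
    """Phase 2 (assumed form): summarize one window in a single scan.

    `window` is an iterable of (attribute_values, class_label) pairs.
    """
    root = TrieNode()
    for values, label in window:
        node = root
        for value in values:
            node = node.children.setdefault(value, TrieNode())
        node.class_counts[label] += 1
    return root


def merge_kc(a, b):
    """Phase 3 (assumed form): merge two KCs by summing their counts.

    Returns a new trie; neither input is modified.
    """
    merged = TrieNode()
    merged.class_counts = a.class_counts + b.class_counts
    for key in a.children.keys() | b.children.keys():
        merged.children[key] = merge_kc(
            a.children.get(key, TrieNode()),
            b.children.get(key, TrieNode()),
        )
    return merged


def weighted_rows(node, prefix=()):
    """Yield (attribute_values, class_label, count) for every stored path."""
    for label, count in node.class_counts.items():
        yield prefix, label, count
    for value, child in node.children.items():
        yield from weighted_rows(child, prefix + (value,))


# Two illustrative windows of a growing database (made-up records).
window_1 = [(("sunny", "high"), "no"),
            (("sunny", "high"), "no"),
            (("rain", "low"), "yes")]
window_2 = [(("sunny", "high"), "yes"),
            (("rain", "low"), "yes")]

merged = merge_kc(build_kc(window_1), build_kc(window_2))
rows = list(weighted_rows(merged))
```

Under these assumptions, `rows` holds one entry per distinct combination of attribute values and class label, with its count. Any decision tree learner that accepts per-row weights could then be run on `rows` rather than on the raw windows, which is one way to read the claim that the scheme is independent of the decision tree algorithm.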

Citation

S. K. Gupta, P. Suresh, and Vasudha Bhatanagar "Incremental information mining", Proc. SPIE 4730, Data Mining and Knowledge Discovery: Theory, Tools, and Technology IV, (12 March 2002); https://doi.org/10.1117/12.460252

PROCEEDINGS
12 PAGES

KEYWORDS

Mining

Databases

Data storage

Classification systems

Detection and tracking algorithms

Knowledge discovery

Data mining
