Paper
Dimensional reduction of web traffic data
Vladimir Nikulin
18 April 2006
Abstract
Dimensional reduction can compress data without loss of essential information; it can also smooth data and reduce random noise. The model presented in this paper was motivated by the structure of the msweb web-traffic dataset from the UCI archive. We propose to reduce the dimension (the number of web-areas, or vroots, in use) through an unsupervised learning process that maximizes a specially defined average log-likelihood divergence. Two web-areas are merged if they frequently appear together during the same sessions. Importantly, the roles of the web-areas in the merging process are not symmetrical: the web-area or cluster with the larger weight acts as an attractor and stimulates merging, while the smaller cluster tries to keep its independence. In both cases, the strength of attraction or resistance depends on the weights of the corresponding clusters. This strategy prevents the creation of one super-big cluster and helps to reduce the number of non-significant clusters. The proposed method is illustrated with two synthetic examples. The first is based on an ideal vlink matrix, which characterizes the weights of the vroots and the relations between them. The vlink matrix for the second example was generated using a specially designed web-traffic simulator.
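To make the merging idea concrete, here is a minimal Python sketch of the two-step procedure the abstract describes: building a vlink (co-occurrence) matrix from visit sessions, then greedily merging cluster pairs with weight-asymmetric attraction and resistance. The scoring function below is an illustrative stand-in for the paper's specially defined average log-likelihood divergence, which the abstract does not reproduce; the names build_vlink_matrix, greedy_merge, and n_target are likewise hypothetical.

```python
import numpy as np

def build_vlink_matrix(sessions, n_vroots):
    # Vlink (co-occurrence) matrix: the diagonal holds vroot weights
    # (visit counts); an off-diagonal entry counts the sessions in
    # which the two vroots appear together.
    V = np.zeros((n_vroots, n_vroots))
    for s in sessions:
        items = sorted(set(s))
        for k, i in enumerate(items):
            V[i, i] += 1.0
            for j in items[k + 1:]:
                V[i, j] += 1.0
                V[j, i] += 1.0
    return V

def greedy_merge(V, n_target):
    # Repeatedly merge the cluster pair with the highest score until
    # only n_target clusters remain.
    W = V.astype(float).copy()
    clusters = {k: [k] for k in range(W.shape[0])}
    while len(clusters) > n_target:
        best_pair, best_score = None, 0.0
        ids = sorted(clusters)
        for x, a in enumerate(ids):
            for b in ids[x + 1:]:
                wa, wb = W[a, a], W[b, b]
                if W[a, b] <= 0 or min(wa, wb) <= 0:
                    continue
                # Illustrative asymmetric score, NOT the paper's
                # divergence: the heavier cluster attracts (numerator
                # grows with the larger weight), the lighter one
                # resists (denominator grows with the smaller weight).
                score = W[a, b] * np.log1p(max(wa, wb)) / min(wa, wb)
                if score > best_score:
                    best_pair, best_score = (a, b), score
        if best_pair is None:  # no co-occurring pairs left to merge
            break
        a, b = best_pair
        # Contract cluster b into a: sum rows and columns; the link
        # mass W[a, b] folds into the new diagonal weight of a.
        W[a, :] += W[b, :]
        W[:, a] += W[:, b]
        W[b, :] = 0.0
        W[:, b] = 0.0
        clusters[a].extend(clusters.pop(b))
    return clusters
```

For example, with sessions = [{0, 1}, {0, 1, 2}, {2, 3}, {0, 1}], calling greedy_merge(build_vlink_matrix(sessions, 4), n_target=2) merges vroots 0 and 1 first, since they co-occur most often; the resulting heavy cluster then attracts vroot 2, while the lightly linked vroot 3 keeps its independence.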
© (2006) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Vladimir Nikulin "Dimensional reduction of web traffic data", Proc. SPIE 6241, Data Mining, Intrusion Detection, Information Assurance, and Data Networks Security 2006, 62410J (18 April 2006); https://doi.org/10.1117/12.664767
KEYWORDS
Data modeling, Internet, 3D image processing, Data compression, Detection and tracking algorithms, Distance measurement, Machine learning