Improved DBSCAN algorithm based on relative mass of the data field
18 March 2022
Proceedings Volume 12168, International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2021); 121682G (2022) https://doi.org/10.1117/12.2631161
Event: International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2021), 2021, Harbin, China
Abstract
The DBSCAN algorithm can discover clusters of arbitrary shape, but appropriate clustering parameters are difficult to choose in advance. This study introduces the data field into the number field space and proposes a relative mass (RM) calculation method for the data field; the RM algorithm selects the N points with the largest mass in the dataset as the initial points of clustering. The optimized influence factor sigma is then used to calculate the force range radius, which optimizes the field radius parameter and thereby yields appropriate clustering parameters. In addition, the improved algorithm is implemented for parallel computing on a distributed cluster to improve efficiency on large datasets. Finally, the effectiveness of the improved algorithm is verified on three publicly available datasets, and the efficiency of parallel computation is verified on three large datasets. The results show that (1) the improved DBSCAN algorithm effectively addresses the difficulty of selecting clustering parameters, and (2) the maximum speedup ratio of parallel computation reaches 2.12 when the size of the large dataset is increased from 30,000 to 150,000 and the number of nodes involved in the computation is increased from one to five, and the average operating efficiency of the improved algorithm is 32.45% higher than that of the original algorithm.
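The abstract does not give the RM formula or the radius formula, so the following is only a rough sketch of the general idea, not the authors' method. It assumes the Gaussian potential commonly used in the data-field literature, equal point masses as a stand-in for relative mass, and the rule-of-thumb influence radius 3*sigma/sqrt(2). The function names `potential`, `top_n_seeds`, and `influence_radius` are hypothetical.

```python
import numpy as np


def potential(points, sigma, masses=None):
    """Gaussian data-field potential at each point.

    phi_i = sum_j m_j * exp(-(d_ij / sigma)^2)
    Assumption: equal masses 1/n unless given; the paper's RM weighting is not
    described in the abstract and is NOT reproduced here.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    m = np.full(n, 1.0 / n) if masses is None else np.asarray(masses, dtype=float)
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = pts[:, None, :] - pts[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    return np.exp(-d2 / sigma ** 2) @ m


def top_n_seeds(points, sigma, n_seeds):
    """Indices of the n_seeds points with the highest potential (candidate start points)."""
    phi = potential(points, sigma)
    order = np.argsort(-phi, kind="stable")
    return order[:n_seeds]


def influence_radius(sigma):
    """Assumed rule of thumb: a Gaussian field is negligible beyond 3*sigma/sqrt(2)."""
    return 3.0 * sigma / np.sqrt(2.0)


if __name__ == "__main__":
    # Toy data: a tight group of five points plus one distant outlier.
    data = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [0.05, 0.05], [10, 10]])
    sigma = 0.5
    print("seed indices:", top_n_seeds(data, sigma, 3))
    print("candidate radius:", influence_radius(sigma))
```

Under these assumptions, the seed indices would serve as starting points and the radius as a candidate neighborhood radius (eps) for DBSCAN; the pairwise distance matrix is O(n^2) in memory, which is one reason a large dataset would call for the distributed implementation the abstract mentions.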
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Daoheng Zhu, Zhiqiang Li, Pengpeng Hu, Qianxin Su, and Run Liu "Improved DBSCAN algorithm based on relative mass of the data field", Proc. SPIE 12168, International Conference on Computer Graphics, Artificial Intelligence, and Data Processing (ICCAID 2021), 121682G (18 March 2022); https://doi.org/10.1117/12.2631161
KEYWORDS
Parallel computing

Algorithms

Data modeling

Optimization (mathematics)
