11 July 2016 Detecting nonsense for Chinese comments based on logistic regression
Ren Zhuolin, Chen Guang, Chen Shu
Proceedings Volume 10011, First International Workshop on Pattern Recognition; 100111J (2016) https://doi.org/10.1117/12.2242283
Event: First International Workshop on Pattern Recognition, 2016, Tokyo, Japan
Abstract
To understand cyber citizens’ opinions accurately from Chinese news comments, a clear definition of nonsense is presented, and a detection model based on logistic regression (LR) is proposed. The detection of nonsense can be treated as a binary classification problem. Besides traditional lexical features, we propose three kinds of features in terms of emotion, structure, and relevance. With these features, we train an LR model and demonstrate its effectiveness in understanding Chinese news comments. We find that each of the proposed features significantly improves the result. In our experiments, we achieve a prediction accuracy of 84.3%, which improves on the 77.3% baseline by 7 percentage points.
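As a rough illustration of the setup the abstract describes (nonsense detection as binary classification with an LR model over lexical, emotion, structure, and relevance features), the following is a minimal sketch in plain Python. It is not the authors' implementation. The feature definitions, the `EMOTION_CHARS` list, the news title, and the six toy comments are all hypothetical stand-ins, since the abstract does not specify them. Features are computed on characters rather than segmented words to keep the example dependency-free.

```python
import math

# Hypothetical emotion lexicon (single characters); the paper's lexicon is not given.
EMOTION_CHARS = {"好", "差", "赞", "怒"}

def extract_features(comment, title):
    """Map a comment to a small feature vector. All features are illustrative."""
    chars = [c for c in comment if not c.isspace()]
    n = max(len(chars), 1)
    uniq = set(chars)
    emotion = sum(c in EMOTION_CHARS for c in chars) / n    # share of emotion characters
    structure = len(uniq) / n                               # character diversity (low for "哈哈哈")
    relevance = len(uniq & set(title)) / max(len(uniq), 1)  # overlap with the news title
    length = min(len(chars), 50) / 50                       # crude lexical stand-in: scaled length
    return [1.0, emotion, structure, relevance, length]     # leading 1.0 acts as the bias term

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    """Estimated probability that feature vector x belongs to the nonsense class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))

def train(X, y, lr=0.5, epochs=2000):
    """Fit LR weights by full-batch gradient descent on the mean log loss."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

# Toy data, invented for this sketch: label 1 = nonsense, 0 = meaningful.
TITLE = "北京今日发布暴雨预警"
comments = [
    ("哈哈哈哈哈哈", 1),
    ("顶顶顶", 1),
    ("啊啊啊啊", 1),
    ("暴雨预警发布及时，北京市民注意安全", 0),
    ("希望今日北京暴雨不要影响交通", 0),
    ("预警很好，发布得早", 0),
]
X = [extract_features(text, TITLE) for text, _ in comments]
labels = [label for _, label in comments]
w = train(X, labels)

for (text, label), x in zip(comments, X):
    print(f"{text}\tlabel={label}\tp(nonsense)={predict_proba(w, x):.3f}")
```

On real data one would evaluate on a held-out set and compare against a lexical-only baseline, which is how a gain such as the reported 77.3% to 84.3% would be measured. The toy set here is far too small to say anything about accuracy.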
© (2016) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ren Zhuolin, Chen Guang, and Chen Shu "Detecting nonsense for Chinese comments based on logistic regression", Proc. SPIE 10011, First International Workshop on Pattern Recognition, 100111J (11 July 2016); https://doi.org/10.1117/12.2242283
KEYWORDS
Lawrencium
Mining
Binary data
Intelligence systems
Pattern recognition
Web 2.0 technologies
Feature extraction
