Paper
Chinese short news text classification based on BERT and sparse autoencoder
30 November 2022
Jiuzhou Lin
Proceedings Volume 12456, International Conference on Artificial Intelligence and Intelligent Information Processing (AIIIP 2022); 124562N (2022) https://doi.org/10.1117/12.2659618
Event: International Conference on Artificial Intelligence and Intelligent Information Processing (AIIIP 2022), 2022, Qingdao, China
Abstract
Short news text classification plays an important role in natural language processing now that mobile phones are widespread. In this paper we propose a Chinese short news text classification method based on BERT and a sparse autoencoder, aimed at the overfitting caused by pretrained BERT. We use BERT for text representation, reduce the dimension of the BERT output vectors with the sparse autoencoder, and then feed the reduced vectors to a Softmax classifier to predict the category of the input text. Experimental results show that our method mitigates the imbalance in performance across categories, raises overall classification performance by six percentage points, effectively alleviates the overfitting of BERT text representations, and achieves better Chinese short text classification performance than either a naïve autoencoder or no autoencoder.
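The pipeline described in the abstract (BERT vectors, a sparse autoencoder for dimension reduction, then a Softmax classifier) can be sketched roughly as follows. This is only an illustrative forward pass, not the author's implementation: the abstract does not give layer sizes, the sparsity penalty, or the training procedure, so the dimensions `D`, `H`, `C`, the KL-divergence penalty with target `rho`, the weight `beta`, and the random vectors standing in for BERT outputs are all assumptions made for this example.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def linear(W, b, x):
    """Compute W @ x + b, with W stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def kl_sparsity(codes, rho=0.05):
    """KL penalty between a target activation rho and each hidden unit's mean
    activation over the batch (assumed here; the abstract does not name the penalty)."""
    n = len(codes)
    penalty = 0.0
    for j in range(len(codes[0])):
        rho_hat = sum(c[j] for c in codes) / n
        rho_hat = min(max(rho_hat, 1e-8), 1.0 - 1e-8)  # keep the logs finite
        penalty += (rho * math.log(rho / rho_hat)
                    + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))
    return penalty

def init(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
D, H, C = 16, 4, 3  # hypothetical sizes: input dim, reduced dim, number of classes
beta = 3.0          # hypothetical weight of the sparsity term

# Random stand-ins for BERT sentence vectors (a real BERT output would be far wider).
batch = [[rng.gauss(0.0, 1.0) for _ in range(D)] for _ in range(5)]

W_enc, b_enc = init(H, D, rng), [0.0] * H  # encoder: D -> H
W_dec, b_dec = init(D, H, rng), [0.0] * D  # decoder: H -> D
W_cls, b_cls = init(C, H, rng), [0.0] * C  # Softmax classifier on the reduced code

codes = [[sigmoid(v) for v in linear(W_enc, b_enc, x)] for x in batch]
recons = [linear(W_dec, b_dec, h) for h in codes]
probs = [softmax(linear(W_cls, b_cls, h)) for h in codes]

# Autoencoder objective: mean squared reconstruction error plus the sparsity penalty.
recon_loss = sum((r - x) ** 2
                 for rec, vec in zip(recons, batch)
                 for r, x in zip(rec, vec)) / (len(batch) * D)
ae_loss = recon_loss + beta * kl_sparsity(codes)

predictions = [max(range(C), key=p.__getitem__) for p in probs]
```

With untrained weights the predictions are meaningless; the sketch only shows how the reduced code `codes` sits between the text representation and the classifier, and where a sparsity term would enter the autoencoder loss.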
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jiuzhou Lin "Chinese short news text classification based on BERT and sparse autoencoder", Proc. SPIE 12456, International Conference on Artificial Intelligence and Intelligent Information Processing (AIIIP 2022), 124562N (30 November 2022); https://doi.org/10.1117/12.2659618
KEYWORDS
Computer programming
Data modeling
Dimension reduction
Performance modeling
Transformers
Machine learning
Neural networks