12 July 2024 Research on feature extraction methods of academic papers
Bin Wei, Fucheng Wan, Rong Jing
Proceedings Volume 13185, International Conference on Communication, Information, and Digital Technologies (CIDT2024); 1318506 (2024) https://doi.org/10.1117/12.3033049
Event: International Conference on Communication, Information and Digital Technologies, 2024, Wuhan, China
Abstract
To address the limited expressiveness of single-feature representations of academic papers, this paper combines text features with graph-structure information from an academic graph, capturing the correlations and influence among papers more comprehensively and improving the accuracy and richness of the feature representation. The method first uses the pre-trained ERNIE large model to obtain an initial representation of the paper information, then applies DPCNN and Bi-LSTM to extract deeper textual features. It then constructs the academic graph and uses a metapath-guided random walk with restart, together with Skip-gram, to extract feature representations of the graph's nodes. The text features and graph node features are fused by an attention mechanism to obtain the final feature representation. The method is evaluated on a text classification task, with a softmax layer predicting the class. Experimental results show that the proposed model achieves an accuracy, recall and F1 score of 86.2%, 88.5% and 87.3%, respectively, outperforming methods that rely on a single semantic feature. The approach can help researchers find relevant information faster and more accurately, improve research efficiency, and support automatic paper classification, recommendation systems and text mining.
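The graph-side sampling step can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the node types `paper` and `author`, the function name `metapath_walk_with_restart` and the restart probability are all assumptions made for illustration, since the abstract does not specify them. The sketch samples one walk whose node types follow a metapath and which occasionally jumps back to the start node.

```python
import random


def metapath_walk_with_restart(graph, node_types, start, metapath,
                               walk_length, restart_prob=0.15, rng=None):
    """Sample one walk from `start` whose node types follow `metapath`.

    graph:      dict mapping each node to a list of neighbour nodes
    node_types: dict mapping each node to a type label
    metapath:   type labels with the same first and last entry,
                e.g. ["paper", "author", "paper"], repeated cyclically
    """
    rng = rng or random.Random()
    if metapath[0] != metapath[-1]:
        raise ValueError("metapath must start and end with the same type")
    if node_types[start] != metapath[0]:
        raise ValueError("start node type must match metapath[0]")

    cycle = len(metapath) - 1   # steps in one pass over the metapath
    walk = [start]
    current, pos = start, 0     # pos counts steps since the last (re)start
    while len(walk) < walk_length:
        # With probability restart_prob, jump back to the start node.
        if current != start and rng.random() < restart_prob:
            current, pos = start, 0
            walk.append(current)
            continue
        # Otherwise move to a random neighbour of the type the metapath expects.
        next_type = metapath[pos % cycle + 1]
        candidates = [n for n in graph.get(current, [])
                      if node_types[n] == next_type]
        if not candidates:      # dead end: stop this walk early
            break
        current = rng.choice(candidates)
        pos += 1
        walk.append(current)
    return walk


# Hypothetical toy academic graph: papers p1..p3 linked to authors a1, a2.
graph = {
    "p1": ["a1", "a2"],
    "p2": ["a1"],
    "p3": ["a2"],
    "a1": ["p1", "p2"],
    "a2": ["p1", "p3"],
}
node_types = {n: ("paper" if n.startswith("p") else "author") for n in graph}

walk = metapath_walk_with_restart(
    graph, node_types, start="p1",
    metapath=["paper", "author", "paper"],
    walk_length=10, restart_prob=0.2, rng=random.Random(0),
)
```

In a pipeline of the kind the abstract describes, many such walks would be collected per node and treated as sentences for a Skip-gram model to learn node embeddings. Recording the restart jump inside the walk is a simplification chosen here; the paper may handle restarts differently.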
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
KEYWORDS
Feature extraction
Semantics
Classification systems
Data modeling
Education and training
Transformers
Convolution