Speech emotion recognition based on data enhancement in time-frequency domain

Qianqian Li; Fuji Ren; Xiaoyan Shen; Xin Kang

doi:10.1117/12.2579205

12 October 2020

Proceedings Volume 11574, International Symposium on Artificial Intelligence and Robotics 2020; 115740R (2020) https://doi.org/10.1117/12.2579205
Event: International Symposium on Artificial Intelligence and Robotics (ISAIR), 2020, Kitakyushu, Japan

Abstract

Speech emotion recognition currently suffers from a shortage of voice samples, which leads to poor recognition rates and over-fitting. Motivated by this, we propose a speech emotion recognition method based on data enhancement. The Berlin Emotional Corpus is enhanced in two directions: the time domain and the frequency domain. Features are then extracted from the samples and used for training. We analyze the recognition rates of two classifiers: K-Nearest Neighbor and Support Vector Machine. Experiments show that recognition performance improves after data enhancement.
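As a rough illustration of what enhancing a sample in the time domain and in the frequency domain can look like, here is a minimal sketch. The abstract does not say which transformations the authors applied, so the specific operations (circular time shift, additive Gaussian noise, attenuation of high-frequency bins) and all function names and parameters are assumptions chosen for illustration, not the paper's method. A naive DFT is used to keep the example dependency-free; it is only practical for short signals.

```python
import cmath
import math
import random


def time_shift(signal, shift):
    """Time-domain variant (assumed): circularly shift samples by `shift`."""
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift] if shift else list(signal)


def add_noise(signal, std, seed=0):
    """Time-domain variant (assumed): add zero-mean Gaussian noise."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, std) for x in signal]


def dft(signal):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [
        (sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)) / n).real
        for t in range(n)
    ]


def attenuate_high_band(signal, keep_fraction=0.5, gain=0.5):
    """Frequency-domain variant (assumed): scale bins above a cutoff by `gain`."""
    n = len(signal)
    cutoff = int(keep_fraction * (n // 2))
    scaled = []
    for k, c in enumerate(dft(signal)):
        freq = min(k, n - k)  # treat mirrored bins alike so the output stays real
        scaled.append(c * gain if freq > cutoff else c)
    return idft(scaled)


def augment(signal):
    """Return the original signal plus three hypothetical augmented variants."""
    return [
        list(signal),
        time_shift(signal, len(signal) // 4),
        add_noise(signal, std=0.01),
        attenuate_high_band(signal),
    ]
```

Each variant would keep the emotion label of its source utterance, so the training set grows without new recordings. Feature extraction and the K-Nearest Neighbor and Support Vector Machine classifiers would then be applied to the enlarged set.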

Citation

Qianqian Li, Fuji Ren, Xiaoyan Shen, and Xin Kang "Speech emotion recognition based on data enhancement in time-frequency domain", Proc. SPIE 11574, International Symposium on Artificial Intelligence and Robotics 2020, 115740R (12 October 2020); https://doi.org/10.1117/12.2579205
