Computing Reviews
Curriculum learning for speech emotion recognition from crowdsourced labels
Lotfian R., Busso C. IEEE/ACM Transactions on Audio, Speech and Language Processing 27(4): 815-826, 2019. Type: Article
Date Reviewed: Aug 20 2021

In computer applications such as synergistic games, gratifying robots, and speech recognition systems, the ability to identify emotions is invaluable. But how should effective algorithms and systems be designed for discerning emotions from diverse speech? Lotfian and Busso present a curriculum learning approach for improving the training of deep neural networks (DNNs) in speech emotion recognition systems.

The authors present extensive yet concise reviews of research efforts and solutions related to understanding speech emotion. Without a doubt, new systems must be equipped with (1) models that capitalize on artificial intelligence (AI) techniques to work with the inadequate training datasets available for recognizing speech emotion, and (2) algorithms for identifying known uncertainties in emotional speech as perceived by humans and by computers with imperfect training.

Considering the challenges faced by evaluators of spoken sentences with abstruse emotional content, how should reliable metrics be created to reconcile the differences among evaluators and construct reliable classifications of emotional perceptions? The authors present statistical models for identifying, clustering, and categorizing emotional attributes. Specifically, a regression model is proposed for categorizing the dimensions of emotions for speech evaluation, a dichotomous model is used to identify the boundaries of emotional dimensions, and an algorithm is used to estimate the accuracy of speech emotional ratings by alternative evaluators.
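As a rough illustration of how disagreement among evaluators might feed a curriculum, the sketch below scores each utterance by the share of evaluators who chose its most common label, then orders utterances from highest to lowest agreement. This is an assumption for illustration only, not the authors' method: the agreement measure, the function names, and the sample labels are all hypothetical.

```python
from collections import Counter


def agreement(labels):
    """Fraction of evaluators who chose the most common label (hypothetical measure)."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def curriculum_order(annotations):
    """Return utterance IDs sorted from highest to lowest evaluator agreement."""
    return sorted(
        annotations,
        key=lambda utt: agreement(annotations[utt]),
        reverse=True,
    )


# Hypothetical crowdsourced labels: utterance ID -> one label per evaluator.
annotations = {
    "utt_a": ["happy", "happy", "happy", "happy"],
    "utt_b": ["sad", "angry", "sad", "neutral"],
    "utt_c": ["angry", "angry", "sad", "angry"],
}

# Utterances with the least disagreement come first in the training order.
order = curriculum_order(annotations)
print(order)
```

A training loop could then present the samples in this order, starting with those on which evaluators agree most.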

The paper presents numerous experiments performed with datasets from various sources. Compared to similar research, the results tend to indicate the reliability of the multifaceted approach outlined. The authors clearly recognize the impact of being able to accurately identify training samples. They offer new techniques for determining reliability among speech evaluators.

Reviewer:  Amos Olagunju Review #: CR147337
