Computing Reviews
Environmental sound recognition using short-time feature aggregation
Roma G., Herrera P., Nogueira W.  Journal of Intelligent Information Systems 51 (3): 457-475, 2018. Type: Article
Date Reviewed: Jan 31 2019

Automatic detection and classification of audio events and sound environments at human level (or better) would be a clear plus for artificial intelligence (AI)-based applications such as robotics and social signal processing. Typical machine learning approaches to such analysis problems extract descriptive features from the raw data before any semantic analysis; audio-specific feature proposals abound, from frame-based mel-frequency cepstral coefficients (MFCCs) to recurrence quantification analysis (RQA) data.

This paper provides experimental evidence that accuracy gains can be expected both from aggregating short-time features and from separating the event detection and classification tasks. First, a new framework for the automatic frequency-domain-based recognition of environmental sounds and a new single-channel noise reduction algorithm are introduced and used in four experiments. Experiment 1 focuses on RQA and suggests that the RQA+MFCC combination performs better than existing related approaches for scene classification on the D-CASE2013, “in-house,” and Rouen datasets. Experiment 2 reaches similar conclusions regarding aggregation for event classification. Experiment 3 addresses segmentation issues, where the goal is to detect events independently of their class. Finally, experiment 4 looks at joint detection and classification; here, aggregating some features (RQA) helped, as did noise reduction, while aggregating others (derivative statistics) helped much less.
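For readers unfamiliar with the idea, short-time feature aggregation can be pictured as collapsing a sequence of per-frame feature vectors (for example, MFCCs) into one fixed-length vector per clip or segment. The sketch below is only an illustration of that general idea, not the authors' framework: the function name `aggregate_frames`, the toy input, and the choice of statistics (mean and standard deviation of each coefficient and of its frame-to-frame differences) are assumptions made here, and RQA features are not computed.

```python
from statistics import fmean, pstdev


def aggregate_frames(frames):
    """Summarize per-frame feature vectors as one fixed-length vector.

    Hypothetical helper for illustration only. For each feature dimension
    it emits four numbers: mean, population standard deviation, and the
    mean and standard deviation of the first differences between
    consecutive frames (simple "derivative statistics").
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames for derivative statistics")
    summary = []
    # zip(*frames) regroups the data: one tuple of values per dimension.
    for values in zip(*frames):
        deltas = [b - a for a, b in zip(values, values[1:])]
        summary.extend(
            [fmean(values), pstdev(values), fmean(deltas), pstdev(deltas)]
        )
    return summary


# Toy input: three frames with two coefficients each (made-up numbers).
frames = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
clip_vector = aggregate_frames(frames)
```

A vector of this kind has the same length whatever the clip duration, which is what allows a standard classifier to be trained on clips or segments of varying length.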

Overall, this rather technical paper provides some experimental motivation for additional research focusing on independent segmentation, detection, and classification of environmental sounds, in particular using the promising approach of feature aggregation. It will be of interest to researchers and advanced graduate students well versed in audio semantic analysis techniques.

Reviewer:  P. Jouvelot Review #: CR146408 (1905-0184)
  Editor Recommended
Sound And Music Computing (H.5.5)
Feature Evaluation And Selection (I.5.2 ... )
Other reviews under "Sound And Music Computing":
Sound reinforcement engineering: fundamentals and practice
Ahnert W., Steffen F.,  CRC Press, Inc., Boca Raton, FL, 2017. 424 pp. Type: Book (978-1-138569-74-4)
Apr 23 2019
Multimodal mood classification of Hindi and Western songs
Patra B., Das D., Bandyopadhyay S.  Journal of Intelligent Information Systems 51(3): 579-596, 2018. Type: Article
Feb 4 2019
Computer music instruments: foundations, design and development
Lazzarini V.,  Springer International Publishing, New York, NY, 2017. 361 pp. Type: Book (978-3-319635-03-3)
Mar 23 2018

Reproduction in whole or in part without permission is prohibited.   Copyright © 2000-2019 ThinkLoud, Inc.