Computing Reviews, the leading online review service for computing literature.

Environmental sound recognition using short-time feature aggregation
Roma G., Herrera P., Nogueira W. Journal of Intelligent Information Systems 51(3): 457-475, 2018. Type: Article

Date Reviewed: Jan 31 2019

Enabling the automatic human-level (or better) detection and classification of audio events and sound environments would be a clear plus for artificial intelligence (AI)-based applications such as robotics and social signal processing. Typical machine learning approaches to such analysis problems extract descriptive features from the raw data before any semantic analysis; audio-specific feature proposals abound, from frame-based mel-frequency cepstral coefficients (MFCCs) to recurrence quantification analysis (RQA) data. This paper provides experimental evidence that accuracy gains can be expected both from aggregating short-time features and from separating the event detection and classification tasks. First, a new framework for the automatic frequency-domain-based recognition of environmental sounds and a new single-channel noise reduction algorithm are introduced; both are then used in four experiments. Experiment 1 focuses on RQA and suggests that the RQA+MFCC combination performs better than existing related approaches for scene classification on the D-CASE2013, “in-house,” and Rouen datasets. Experiment 2 reaches similar conclusions regarding aggregation for event classification. Experiment 3 addresses segmentation issues, where the goal is to detect events independently of their class. Finally, experiment 4 looks at joint detection and classification; here, aggregating some features (RQA) helped, as did noise reduction, whereas aggregating others (derivative statistics) helped little. Overall, this rather technical paper provides some experimental motivation for additional research on the independent segmentation, detection, and classification of environmental sounds, in particular using the promising approach of feature aggregation. It will be of interest to researchers and advanced graduate students well versed in audio semantic analysis techniques.
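For readers unfamiliar with the idea, short-time feature aggregation can be pictured as collapsing a sequence of per-frame feature vectors into one fixed-length summary vector. The sketch below is only an illustration under stated assumptions, not the authors' framework: the function name aggregate_frames, the choice of mean and standard deviation as summary statistics, and the toy input values are all hypothetical. The last two groups of statistics are computed on frame-to-frame differences, which is one plausible reading of the "derivative statistics" mentioned above.

```python
from statistics import fmean, pstdev


def aggregate_frames(frames):
    """Collapse equal-length per-frame feature vectors into one vector.

    Output layout: per-feature mean, per-feature standard deviation, then the
    mean and standard deviation of the frame-to-frame differences.
    Hypothetical helper for illustration; not taken from the reviewed paper.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    # Transpose so that each tuple holds one feature dimension over time.
    columns = list(zip(*frames))
    # First-order differences along time, one list per feature dimension.
    deltas = [[b - a for a, b in zip(col, col[1:])] for col in columns]
    return (
        [fmean(col) for col in columns]
        + [pstdev(col) for col in columns]
        + [fmean(d) for d in deltas]
        + [pstdev(d) for d in deltas]
    )


# Three toy frames with two features each (made-up numbers, not real MFCCs).
toy_frames = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
summary = aggregate_frames(toy_frames)
print(len(summary))
```

Whatever the number of frames, the summary has four values per feature dimension, so clips of different durations map to vectors of the same length that a standard classifier can consume.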

Reviewer: P. Jouvelot	Review #: CR146408 (1905-0184)

Sound And Music Computing (H.5.5)

Feature Evaluation And Selection (I.5.2 ...)

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®