Model-based classification of speech audio
- Date Issued: 2009
- Summary: This work explores the process of model-based classification of speech audio signals using low-level feature vectors. The process of extracting low-level features from audio signals is described along with a discussion of established techniques for training and testing mixture model-based classifiers and using these models in conjunction with feature selection algorithms to select optimal feature subsets. The results of a number of classification experiments using a publicly available speech database, the Berlin Database of Emotional Speech, are presented. This includes experiments in optimizing feature extraction parameters and comparing different feature selection results from over 700 candidate feature vectors for the tasks of classifying speaker gender, identity, and emotion. In the experiments, final classification accuracies of 99.5%, 98.0% and 79% were achieved for the gender, identity and emotion tasks respectively.
Title: | Model-based classification of speech audio. | |
---|---|---|
Name(s): | Thoman, Chris. College of Engineering and Computer Science, Department of Computer and Electrical Engineering and Computer Science | |
Type of Resource: | text | |
Genre: | Electronic Thesis Or Dissertation | |
Issuance: | monographic | |
Date Issued: | 2009 | |
Publisher: | Florida Atlantic University | |
Physical Form: | electronic | |
Extent: | xiv, 186 p. : ill. | |
Language(s): | English | |
Summary: | This work explores the process of model-based classification of speech audio signals using low-level feature vectors. The process of extracting low-level features from audio signals is described along with a discussion of established techniques for training and testing mixture model-based classifiers and using these models in conjunction with feature selection algorithms to select optimal feature subsets. The results of a number of classification experiments using a publicly available speech database, the Berlin Database of Emotional Speech, are presented. This includes experiments in optimizing feature extraction parameters and comparing different feature selection results from over 700 candidate feature vectors for the tasks of classifying speaker gender, identity, and emotion. In the experiments, final classification accuracies of 99.5%, 98.0% and 79% were achieved for the gender, identity and emotion tasks respectively. | |
Identifier: | 426148870 (oclc), 210518 (digitool), FADT210518 (IID), fau:3410 (fedora) | |
Note(s): | by Chris Thoman. Thesis (M.S.C.S.)--Florida Atlantic University, 2009. Includes bibliography. Electronic reproduction. Boca Raton, Fla., 2009. Mode of access: World Wide Web. | |
Subject(s): | Signal processing -- Digital techniques; Speech processing systems; Sound -- Recording and reproducing -- Digital techniques; Pattern recognition systems | |
Held by: | FBoU FAUER | |
Persistent Link to This Record: | http://purl.flvc.org/FAU/210518 | |
Use and Reproduction: | http://rightsstatements.org/vocab/InC/1.0/ | |
Host Institution: | FAU |