On the comparison of line spectral frequencies and mel-frequency cepstral coefficients using feedforward neural network for language identification

Of the many audio features available, this paper compares the two most popular, i.e. line spectral frequencies (LSF) and mel-frequency cepstral coefficients (MFCC). We trained a feedforward neural network with varying numbers of hidden layers and hidden nodes to identify five differ...

Bibliographic Details
Main Authors: Gunawan, Teddy Surya; Kartiwi, Mira
Format: Article
Language: English
Published: Indonesian Journal of Electrical Engineering and Computer Science (IAES) 2018
Subjects: TK7885 Computer engineering
Online Access: http://irep.iium.edu.my/61795/
http://irep.iium.edu.my/61795/1/10880-15119-1-PBGunawanLanguage.pdf
http://irep.iium.edu.my/61795/7/61795_On%20the%20Comparison%20of%20Line%20Spectral%20Frequencies%20_scopus.pdf
id iium-61795
recordtype eprints
spelling iium-61795 2018-02-13T03:05:26Z http://irep.iium.edu.my/61795/ Gunawan, Teddy Surya and Kartiwi, Mira (2018) On the comparison of line spectral frequencies and mel-frequency cepstral coefficients using feedforward neural network for language identification. Indonesian Journal of Electrical Engineering and Computer Science, 10 (1). pp. 168-175. ISSN 2502-4752 E-ISSN 2502-4760 http://iaescore.com/journals/index.php/IJEECS/ DOI 10.11591/ijeecs.v10.i1.pp168-175. Published 2018-04 by Indonesian Journal of Electrical Engineering and Computer Science (IAES). Article, peer reviewed. TK7885 Computer engineering.
repository_type Digital Repository
institution_category Local University
institution International Islamic University Malaysia
building IIUM Repository
collection Online Access
language English
topic TK7885 Computer engineering
description Of the many audio features available, this paper compares the two most popular, i.e. line spectral frequencies (LSF) and mel-frequency cepstral coefficients (MFCC). We trained a feedforward neural network with varying numbers of hidden layers and hidden nodes to identify five different languages, i.e. Arabic, Chinese, English, Korean, and Malay. LSF, MFCC, and a combination of both were extracted as the feature vectors. Systematic experiments were conducted to find the optimum parameters, i.e. sampling frequency, frame size, model order, and the structure of the neural network. The recognition rate per frame was converted to a recognition rate per audio file using majority voting. On average, the recognition rates for LSF, MFCC, and the combination of both features were 96%, 92%, and 96%, respectively. Therefore, LSF is the most suitable feature set for language identification using a feedforward neural network classifier.
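The majority-voting step mentioned in the description can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `majority_vote` and `file_recognition_rate`, the example labels, and the tie-breaking rule (the label seen first wins) are assumptions made here for illustration only.

```python
from collections import Counter


def majority_vote(frame_predictions):
    """Return the language label predicted for the most frames of one audio file.

    Assumption: ties go to the label that appears first in the frame sequence,
    which is how Counter.most_common orders equal counts.
    """
    if not frame_predictions:
        raise ValueError("need at least one frame prediction")
    label, _count = Counter(frame_predictions).most_common(1)[0]
    return label


def file_recognition_rate(files):
    """Fraction of audio files whose majority-voted label matches the true label.

    `files` is a list of (true_label, frame_predictions) pairs.
    """
    correct = sum(
        1 for true_label, preds in files if majority_vote(preds) == true_label
    )
    return correct / len(files)


if __name__ == "__main__":
    # Hypothetical per-frame classifier outputs for two audio files.
    files = [
        ("Malay", ["Malay", "English", "Malay", "Malay", "Korean"]),
        ("Arabic", ["Korean", "Korean", "Arabic"]),
    ]
    for true_label, preds in files:
        print(true_label, "->", majority_vote(preds))
    print("per-file recognition rate:", file_recognition_rate(files))
```

In the first hypothetical file only three of five frames are classified correctly, yet the file as a whole is assigned the correct language, which is why a per-file rate can differ from the per-frame rate.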
format Article
author Gunawan, Teddy Surya
Kartiwi, Mira
title On the comparison of line spectral frequencies and mel-frequency cepstral coefficients using feedforward neural network for language identification
publisher Indonesian Journal of Electrical Engineering and Computer Science (IAES)
publishDate 2018
url http://irep.iium.edu.my/61795/
http://irep.iium.edu.my/61795/1/10880-15119-1-PBGunawanLanguage.pdf
http://irep.iium.edu.my/61795/7/61795_On%20the%20Comparison%20of%20Line%20Spectral%20Frequencies%20_scopus.pdf
first_indexed 2023-09-18T21:27:38Z
last_indexed 2023-09-18T21:27:38Z
_version_ 1777412298284990464