A novel lip geometry approach for audio-visual speech recognition

By identifying lip movements and characterizing their associations with speech sounds, the performance of speech recognition systems can be improved, particularly when operating in noisy environments. Various methods have been studied by research groups around the world to incorporate lip movements in...

Bibliographic Details
Main Author: Mohd Zamri, Ibrahim
Format: Thesis
Language: English
Published: 2014
Subjects: TK Electrical engineering. Electronics Nuclear engineering
Online Access: http://umpir.ump.edu.my/id/eprint/12087/
http://umpir.ump.edu.my/id/eprint/12087/1/MOHD%20ZAMRI%20BIN%20IBRAHIM.PDF
id ump-12087
recordtype eprints
spelling ump-12087 2016-03-22T03:32:07Z http://umpir.ump.edu.my/id/eprint/12087/ A novel lip geometry approach for audio-visual speech recognition Mohd Zamri, Ibrahim TK Electrical engineering. Electronics Nuclear engineering 2014-10 Thesis NonPeerReviewed application/pdf en http://umpir.ump.edu.my/id/eprint/12087/1/MOHD%20ZAMRI%20BIN%20IBRAHIM.PDF Mohd Zamri, Ibrahim (2014) A novel lip geometry approach for audio-visual speech recognition. PhD thesis, Loughborough University. http://iportal.ump.edu.my/lib/item?id=chamo:87863&theme=UMP2
repository_type Digital Repository
institution_category Local University
institution Universiti Malaysia Pahang
building UMP Institutional Repository
collection Online Access
language English
topic TK Electrical engineering. Electronics Nuclear engineering
description By identifying lip movements and characterizing their associations with speech sounds, the performance of speech recognition systems can be improved, particularly when operating in noisy environments. Various methods have been studied by research groups around the world to incorporate lip movements into speech recognition in recent years; however, exactly how best to incorporate the additional visual information is still not known. This study aims to extend the knowledge of relationships between visual and speech information, specifically using lip geometry information due to its robustness to head rotation and the smaller number of features required to represent movement. A new method has been developed to extract lip geometry information, to perform classification and to integrate visual and speech modalities. This thesis makes several contributions. First, this work presents a new method to extract lip geometry features using the combination of a skin colour filter, a border following algorithm and a convex hull approach. The proposed method was found to improve lip shape extraction performance compared to existing approaches. Lip geometry features including height, width, ratio, area, perimeter and various combinations of these features were evaluated to determine which performs best when representing speech in the visual domain. Second, a novel template matching technique able to adapt to dynamic differences in the way words are uttered by speakers has been developed, which determines the best fit of an unseen feature signal to those stored in a database template. Third, following an evaluation of integration strategies, a novel method has been developed based on an alternative decision fusion strategy, in which the outcome from the visual and speech modality is chosen by measuring the quality of audio based on kurtosis and skewness analysis and driven by white noise confusion. Finally, the performance of the new methods introduced in this work is evaluated using the CUAVE and LUNA-V data corpora under a range of different signal to noise ratio conditions using the NOISEX-92 dataset.
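The lip geometry features named in the description (height, width, ratio, area, perimeter) can be illustrated with a short sketch. This is not the thesis's implementation. It assumes the skin colour filter and border following steps have already produced a list of lip boundary pixel coordinates. The function names `convex_hull` and `lip_geometry` are hypothetical, and defining "ratio" as height divided by width is an assumption, since the record does not define it.

```python
from math import hypot


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(seq):
        chain = []
        for p in seq:
            # drop points that would make a right turn or lie on a straight line
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    # the last point of each half is the first point of the other
    return lower[:-1] + upper[:-1]


def lip_geometry(boundary_points):
    """Geometry features of the convex hull around lip boundary pixels.

    Assumption: 'ratio' is height / width; the record does not define it.
    """
    hull = convex_hull(boundary_points)
    xs = [x for x, _ in hull]
    ys = [y for _, y in hull]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    # consecutive hull vertices, wrapping around to close the polygon
    edges = list(zip(hull, hull[1:] + hull[:1]))
    # shoelace formula for polygon area
    area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2
    perimeter = sum(hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in edges)
    return {
        "width": width,
        "height": height,
        "ratio": height / width if width else 0.0,
        "area": area,
        "perimeter": perimeter,
    }


# Toy boundary: a 4 x 2 rectangle plus one interior point the hull discards.
features = lip_geometry([(0, 0), (4, 0), (4, 2), (0, 2), (2, 1)])
```

Computing these values for every video frame would give one short feature vector per frame, which is the kind of compact per-frame representation the description contrasts with approaches that need more features.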
format Thesis
author Mohd Zamri, Ibrahim
title A novel lip geometry approach for audio-visual speech recognition
publishDate 2014
url http://umpir.ump.edu.my/id/eprint/12087/
http://umpir.ump.edu.my/id/eprint/12087/1/MOHD%20ZAMRI%20BIN%20IBRAHIM.PDF
first_indexed 2023-09-18T22:13:20Z
last_indexed 2023-09-18T22:13:20Z
_version_ 1777415173783420928