Phonetically rich and balanced text and speech corpora for Arabic language
This paper describes the preparation, recording, analysis, and evaluation of a new speech corpus for Modern Standard Arabic (MSA). The speech corpus contains a total of 415 sentences recorded by 40 native Arabic speakers (20 male and 20 female) from 11 different Arab countries representing three major regions (Levant, Gulf, and Africa)...
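Main Authors: | Abushariah, Mohammad Abd-Alrahman Mahmoud; Ainon, Raja Noor; Zainuddin, Roziati; Elshafei, Moustafa; Khalifa, Othman Omran |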
---|---|
Format: | Article |
Language: | English |
Published: | Springer, 2011 |
Subjects: | TK7885 Computer engineering |
Online Access: | http://irep.iium.edu.my/10572/ http://irep.iium.edu.my/10572/4/Phonetically_rich_Irep_ID10572.pdf |
id |
iium-10572 |
---|---|
recordtype |
eprints |
spelling |
iium-10572; 2013-07-02T08:39:26Z; http://irep.iium.edu.my/10572/; Phonetically rich and balanced text and speech corpora for Arabic language; Abushariah, Mohammad Abd-Alrahman Mahmoud; Ainon, Raja Noor; Zainuddin, Roziati; Elshafei, Moustafa; Khalifa, Othman Omran; TK7885 Computer engineering; Springer; 2011-11-05; Article; PeerReviewed; application/pdf; en; http://irep.iium.edu.my/10572/4/Phonetically_rich_Irep_ID10572.pdf; Abushariah, Mohammad Abd-Alrahman Mahmoud and Ainon, Raja Noor and Zainuddin, Roziati and Elshafei, Moustafa and Khalifa, Othman Omran (2011) Phonetically rich and balanced text and speech corpora for Arabic language. Language Resources and Evaluation. pp. 1-34. ISSN 1574-020X; http://dx.doi.org/10.1007/s10579-011-9166-8; 10.1007/s10579-011-9166-8 |
repository_type |
Digital Repository |
institution_category |
Local University |
institution |
International Islamic University Malaysia |
building |
IIUM Repository |
collection |
Online Access |
language |
English |
topic |
TK7885 Computer engineering |
spellingShingle |
TK7885 Computer engineering; Abushariah, Mohammad Abd-Alrahman Mahmoud; Ainon, Raja Noor; Zainuddin, Roziati; Elshafei, Moustafa; Khalifa, Othman Omran; Phonetically rich and balanced text and speech corpora for Arabic language |
description |
This paper describes the preparation, recording, analysis, and evaluation
of a new speech corpus for Modern Standard Arabic (MSA). The speech corpus
contains a total of 415 sentences recorded by 40 native Arabic speakers
(20 male and 20 female) from 11 different Arab countries representing three major regions
(Levant, Gulf, and Africa). Of these, 367 sentences are considered
phonetically rich and balanced and are used for training Arabic Automatic Speech
Recognition (ASR) systems. "Rich" means that the sentence set must contain
all phonemes of the Arabic language, whereas "balanced" means
that it must preserve the phonetic distribution of the Arabic language. The remaining 48
sentences are created for |
format |
Article |
author |
Abushariah, Mohammad Abd-Alrahman Mahmoud; Ainon, Raja Noor; Zainuddin, Roziati; Elshafei, Moustafa; Khalifa, Othman Omran |
author_facet |
Abushariah, Mohammad Abd-Alrahman Mahmoud; Ainon, Raja Noor; Zainuddin, Roziati; Elshafei, Moustafa; Khalifa, Othman Omran |
author_sort |
Abushariah, Mohammad Abd-Alrahman Mahmoud |
title |
Phonetically rich and balanced text and speech corpora for Arabic language |
title_sort |
phonetically rich and balanced text and speech corpora for arabic language |
publisher |
Springer |
publishDate |
2011 |
url |
http://irep.iium.edu.my/10572/ http://irep.iium.edu.my/10572/4/Phonetically_rich_Irep_ID10572.pdf |
first_indexed |
2023-09-18T20:19:59Z |
last_indexed |
2023-09-18T20:19:59Z |
_version_ |
1777408041649438720 |