Issues in evaluating the retrieval performance of multiscript translation of Al-Quran

Bibliographic Details
Main Authors: Othman, Roslina, Abdul Wahid, Fauziah
Format: Conference or Workshop Item
Language: English
Published: 2011
Subjects:
Online Access: http://irep.iium.edu.my/7481/
http://irep.iium.edu.my/7481/1/Issues_of_ret_perf.pdf
Description
Summary: The main aim of this paper is to present the issues in evaluating the retrieval performance of multi-script indexing of translated texts of al-Quran. Translations of al-Quran have played a major role among the public, both in the recitation of al-Quran in its original text and in understanding it through the translated words. Even in querying, non-Arabic speakers find the texts through the translated words in addition to topical search. In conventional Cross-Language Information Retrieval research, transliteration is needed when equivalent terminology is absent; in this research, however, the transliterated version is meant for readers who can read the older script in its own original translation. The Malay Roman script has its own version of the translation. The objectives are to examine the reported retrieval performance of these texts and to evaluate the retrieval performance of the translations available in two scripts of one language, Malay Rumi and Malay Jawi, built upon the Pimpinan ar-Rahman version, Indri and Jawi software. The measures are recall, precision and overlap. Recall describes the performance in retrieving all relevant items, while precision describes the performance in rejecting non-relevant items. Overlap captures the retrieval of items common to both sub-collections. Queries are constructed in both scripts from questions posed by newspaper readers, resulting in keywords with semantic content, while relevance judgments are made by a panel of experts based on the answers to the questions. Findings based on recall, precision and overlap revealed major issues concerning standardized texts, translation and transliteration, text alignment, query construction, and question-answering relevance versus topical relevance. Indri's performance is not a major issue, while the Jawi software requires minor improvement.
This paper contributes to the discussion of handling test collections that involve a parallel corpus in Cross-Language IR for the Muslim world.
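
The three measures named in the summary can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names, the verse identifiers, and the sample result sets are invented for illustration, and overlap is assumed here to be the share of retrieved items common to both sub-collections (intersection over union), which the summary does not define precisely.

```python
from typing import Set


def recall(retrieved: Set[str], relevant: Set[str]) -> float:
    """Fraction of the relevant items that were retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved: Set[str], relevant: Set[str]) -> float:
    """Fraction of the retrieved items that are relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def overlap(retrieved_a: Set[str], retrieved_b: Set[str]) -> float:
    """Assumed definition: items retrieved from both sub-collections,
    divided by items retrieved from either one."""
    union = retrieved_a | retrieved_b
    if not union:
        return 0.0
    return len(retrieved_a & retrieved_b) / len(union)


# Hypothetical data: verse identifiers judged relevant for one query,
# and the verses returned from the Rumi and Jawi sub-collections.
relevant = {"2:183", "2:184", "2:185", "2:187"}
rumi_hits = {"2:183", "2:184", "2:185", "3:97"}
jawi_hits = {"2:183", "2:185", "2:187", "4:43", "5:6"}

if __name__ == "__main__":
    print("Rumi recall:   ", recall(rumi_hits, relevant))
    print("Rumi precision:", precision(rumi_hits, relevant))
    print("Jawi recall:   ", recall(jawi_hits, relevant))
    print("Jawi precision:", precision(jawi_hits, relevant))
    print("Overlap:       ", overlap(rumi_hits, jawi_hits))
```

Because the two translations are parallel, the sketch assumes that a verse carries the same identifier in both sub-collections, so result sets from the two scripts can be compared directly.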