An Extractive Question Answering System for the Tamil Language

Abstract:

Question Answering is a core task in Natural Language Processing and has attracted considerable attention. With the development of multiple language models, question answering systems have been built and deployed to improve information retrieval. These systems, however, have largely been implemented only for English. Our objective was to create such a question answering system for the Tamil language. We chose XLM-RoBERTa, a language model that has been trained on a variety of datasets, and we employed a hand-annotated dataset for validation. We trained the model on two types of datasets: the first only in Tamil, and the second a mixture of Indian languages including Tamil. The results were satisfactory in both cases. Because training the model required a large amount of computational power, we used Google's Colab Pro Plus cloud GPUs. We will also publish our dataset on Hugging Face so that fellow researchers can use it for further analysis.
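As a rough illustration of what "extractive" means here: a model such as XLM-RoBERTa, once fine-tuned for question answering, typically scores every context token as a possible answer start and as a possible answer end, and the answer is the span with the highest combined score. The sketch below shows only that span-selection step. The function name `best_span`, the `max_answer_len` limit, the English example sentence, and the logit values are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
) -> Tuple[int, int, float]:
    """Return (start, end, score) of the highest-scoring valid span.

    A span is valid when start <= end and it covers at most
    max_answer_len tokens. The score is start_logit + end_logit.
    """
    if not start_logits or len(start_logits) != len(end_logits):
        raise ValueError("logit lists must be non-empty and of equal length")

    best = (0, 0, float("-inf"))
    for i, start_score in enumerate(start_logits):
        # Only consider end positions at or after the start position.
        last = min(i + max_answer_len, len(end_logits))
        for j in range(i, last):
            score = start_score + end_logits[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Toy context tokens and made-up logits (assumed values, not model output).
tokens = ["Chennai", "is", "the", "capital", "of", "Tamil", "Nadu"]
start_logits = [0.1, 0.0, 0.2, 0.3, 0.1, 2.5, 0.4]
end_logits = [0.2, 0.0, 0.1, 0.2, 0.1, 0.6, 3.0]

start, end, score = best_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer)
```

In a real system the logits would come from the fine-tuned model and the tokens from its subword tokenizer, so the selected span would additionally be mapped back to character offsets in the original Tamil passage.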

Pages: 312-319

Online since: February 2023

Copyright: © 2023 Trans Tech Publications Ltd. All Rights Reserved
