Sign Language Translation Using LSTM for Context Recognition

Abstract:

Among the more than seven billion people in the world, a significant number live with hearing or speech impairments, and sign language is their primary means of communication. It is one of the most reliable ways of communicating with people who have these special needs: a visual language that uses hand shapes and movements to express ideas and emotions, helping to bridge the gap for deaf individuals. However, a communication barrier often arises when they interact with those who rely on spoken language. Human interpreters are currently used to facilitate conversations between the two groups, but this solution can be both costly and inconvenient. The need for technology that interprets sign language for the deaf community and fosters idea-sharing among all people cannot be overemphasized, and much research has been carried out on technology-based sign language recognition for most major languages. In this project, deep learning techniques were applied to develop a system for recognizing hand gestures in American Sign Language (ASL). A dataset was created from both two-dimensional and three-dimensional images of ASL gestures, the MediaPipe framework was used to detect landmarks in these images, and Long Short-Term Memory (LSTM) networks were employed to improve gesture recognition by leveraging the temporal dependencies of hand movements. This work specifically focuses on recognizing context in sign language communication, enhancing the system's ability to understand not just individual gestures but also their meanings in different scenarios.
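To make the described pipeline concrete, the sketch below shows one plausible way to wire MediaPipe landmark extraction to an LSTM sequence classifier in TensorFlow/Keras. It is a minimal illustration rather than the exact implementation used in this project: the sequence length, layer sizes, number of classes, and the choice of the Holistic solution are all assumptions made for the example.

```python
# Minimal sketch (not the authors' exact pipeline): MediaPipe Holistic extracts
# per-frame hand landmarks, and a Keras LSTM classifies fixed-length landmark
# sequences into gesture classes. SEQ_LEN, NUM_CLASSES, and layer sizes are
# illustrative assumptions.
import numpy as np
import cv2
import mediapipe as mp
import tensorflow as tf

mp_holistic = mp.solutions.holistic

SEQ_LEN = 30          # assumed number of frames per gesture clip
NUM_CLASSES = 10      # assumed number of ASL gesture classes
FEATS = 2 * 21 * 3    # x, y, z for 21 landmarks on each of the two hands

def hand_keypoints(results):
    """Flatten left- and right-hand landmarks; zero-fill when a hand is not detected."""
    def flat(hand):
        if hand is None:
            return np.zeros(21 * 3)
        return np.array([[p.x, p.y, p.z] for p in hand.landmark]).flatten()
    return np.concatenate([flat(results.left_hand_landmarks),
                           flat(results.right_hand_landmarks)])

def extract_sequence(video_path):
    """Run MediaPipe Holistic over a clip and return a (SEQ_LEN, FEATS) array."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    with mp_holistic.Holistic(min_detection_confidence=0.5,
                              min_tracking_confidence=0.5) as holistic:
        while len(frames) < SEQ_LEN:
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV reads frames as BGR.
            results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            frames.append(hand_keypoints(results))
    cap.release()
    # Pad short clips with zero vectors so every sequence has the same length.
    while len(frames) < SEQ_LEN:
        frames.append(np.zeros(FEATS))
    return np.stack(frames)

# LSTM classifier over the landmark sequences (hyperparameters are assumptions).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, FEATS)),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Each clip is reduced to a fixed-length sequence of hand-landmark vectors, which is what allows the LSTM to exploit the temporal dependencies of hand movements described above; training data, labels, and any context-level modelling would be layered on top of this skeleton.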

