
Evolving Text Matching: A Systematic Review of Classical and Modern Approaches in the Neural Network

Abstract:

Text matching has evolved dramatically from simple string comparison algorithms to sophisticated natural language processing (NLP) techniques. This literature review examines matching methods from the last 20 years, with special emphasis on the transition from traditional frameworks to modern NLP methods. It explores the fundamental principles, strengths, and limitations of both families in order to identify opportunities for practical and theoretical integration and development. Our systematic review covers three main areas: (1) classical text matching algorithms, including Levenshtein distance, Boyer-Moore, and Knuth-Morris-Pratt; (2) modern NLP techniques, such as transformer-based models and contextual ontologies; and (3) emerging hybrid approaches that seek to integrate the two. An analysis of more than 40 papers from information retrieval, natural language processing, and algorithmic evolution reveals key patterns in the adoption of text-matching strategies and highlights promising directions for future research. The study identifies a significant gap between the computational efficiency of traditional methods and the logical comprehension capabilities of modern NLP methods. We examine various attempts to bridge this gap, discuss the challenges and opportunities in integrating classical and modern approaches, and consider how different approaches manage the trade-off between computational complexity, logical clarity, and application-specific requirements.
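To make the classical side of this comparison concrete, the following is a minimal Python sketch of two of the algorithms named in the abstract: Levenshtein distance (approximate matching) and Knuth-Morris-Pratt (exact substring search). It is an illustration only and is not taken from the paper. The function names `levenshtein` and `kmp_find` are assumptions chosen for this example, the edit costs are assumed to be 1 per operation, and Boyer-Moore is omitted for brevity.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def kmp_find(text: str, pattern: str) -> int:
    """Index of the first occurrence of pattern in text, or -1 if absent."""
    if not pattern:
        return 0
    # failure[i] is the length of the longest proper prefix of
    # pattern[:i + 1] that is also a suffix of it.
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    # Scan the text once, reusing the failure table on mismatches.
    k = 0
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(kmp_find("abracadabra", "cad"))
```

Both functions compare characters only, which is the source of their speed and also of the limitation the abstract describes: neither has any notion of meaning, so two paraphrases with no shared wording score as completely different.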

The 6th International Scientific Conference of Alkafeel University (ISCKU)

Info:

Periodical:

Engineering Headway (Volume 35)

Pages:

195-209

DOI:

https://doi.org/10.4028/p-6OHOwe

Online since:

February 2026

Authors:

Nagwa Elmobark*, Aymen Saad, Ahmed Ali Talib Al-Khazalli, Mohamed Badouch

Keywords:

Edit Distance, Natural Language Processing, Pattern Recognition, Semantic Analysis, String Matching Algorithms, Systematic Review, Text Matching, Transformer Models

Copyright:

© 2026 Trans Tech Publications Ltd. All Rights Reserved

* - Corresponding Author
