Two-Step Text Recognition and Summarization of Scanned Documents

Abstract:

With the explosion of unstructured textual data circulating in the digital space, there is a growing need for tools that perform automatic text summarization, so that people can gain insights from such data easily and extract the significant and essential information. Text summarization tools can improve the readability of documents and reduce the time spent searching for information. In this project, extractive summarization is performed on text recognized from scanned documents via Optical Character Recognition (OCR), using the TextRank algorithm, an unsupervised technique for extractive text summarization.
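The second step of the pipeline described above can be illustrated with a minimal sketch of TextRank-style extractive summarization. This is not the authors' implementation: it assumes the OCR step has already produced plain text, and the function names (`split_sentences`, `tokenize`, `similarity`, `textrank_summary`), the naive sentence splitter, and the parameter defaults are all hypothetical choices made for illustration. Sentence similarity uses the word-overlap measure normalised by log sentence lengths, and scores come from a weighted PageRank iteration.

```python
import math
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase alphanumeric tokens; stray OCR symbols are dropped."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def similarity(tokens_a, tokens_b):
    """Shared-word count normalised by the log lengths of both sentences."""
    if len(tokens_a) < 2 or len(tokens_b) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    overlap = len(set(tokens_a) & set(tokens_b))
    return overlap / (math.log(len(tokens_a)) + math.log(len(tokens_b)))


def textrank_summary(text, num_sentences=2, damping=0.85, iterations=50):
    """Return the top-ranked sentences, kept in their original order."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= num_sentences:
        return sentences

    # Build a symmetric weighted graph with one node per sentence.
    tokens = [tokenize(s) for s in sentences]
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weights[i][j] = weights[j][i] = similarity(tokens[i], tokens[j])
    out_sum = [sum(row) for row in weights]

    # Weighted PageRank, run for a fixed number of iterations.
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    # Stand-in for OCR output; a real pipeline would pass recognized text here.
    ocr_text = (
        "Text summarization reduces long documents. "
        "Extractive summarization selects sentences from documents. "
        "The cat sat quietly. "
        "TextRank ranks sentences for extractive summarization of documents."
    )
    for sentence in textrank_summary(ocr_text, num_sentences=2):
        print(sentence)
```

Because the method is extractive, the summary consists only of sentences copied verbatim from the recognized text, so OCR errors in a selected sentence carry through unchanged. A fixed iteration count is used here for simplicity; a convergence threshold would be a reasonable alternative.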

Info:

Pages: 355-361

Online since: February 2023

Copyright: © 2023 Trans Tech Publications Ltd. All Rights Reserved

