Abstractive text summarization of low-resourced languages using deep learning.
PeerJ Comput Sci. 2023; 9:e1176.

Abstract

Background

Humans must be able to cope with the huge amounts of information produced by the information technology revolution. As a result, automatic text summarization is being employed in a range of industries to help individuals identify the most important information. Two approaches to text summarization are mainly considered: extractive and abstractive. The extractive approach selects sentences or passages directly from the source document, while the abstractive approach generates a new summary based on mined keywords. For low-resourced languages such as Urdu, extractive summarization has been addressed with various models and algorithms. However, abstractive summarization in Urdu remains a challenging task. Because there are so many literary works in Urdu, producing abstractive summaries demands extensive research.

Methodology

This article proposes a deep learning model for the Urdu language using the Urdu 1 Million news dataset and compares its performance with two widely used machine learning methods: support vector machine (SVM) and logistic regression (LR). The results show that the proposed deep learning model performs better than the other two approaches. The summaries produced by extractive summarization are processed using the encoder-decoder paradigm to create an abstractive summary.

Results

With the help of Urdu language specialists, the system-generated summaries were validated, showing the proposed model's improvement and accuracy.

Author Affiliations

Department of Computer Science, National Textile University, Faisalabad, Pakistan.
Department of Computer Science, National Textile University, Faisalabad, Pakistan.
Department of Computer Science, National Textile University, Faisalabad, Pakistan.
Department of Computer Science, University of Agriculture Faisalabad, Faisalabad, Pakistan.
Computer Sciences Department, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Department of Computer Science, National Textile University, Faisalabad, Pakistan.

Pub Type(s)

Journal Article

Language

eng

PubMed ID

37346684

Citation

Shafiq, Nida, et al. "Abstractive Text Summarization of Low-resourced Languages Using Deep Learning." PeerJ. Computer Science, vol. 9, 2023, pp. e1176.
Shafiq N, Hamid I, Asif M, et al. Abstractive text summarization of low-resourced languages using deep learning. PeerJ Comput Sci. 2023;9:e1176.
Shafiq, N., Hamid, I., Asif, M., Nawaz, Q., Aljuaid, H., & Ali, H. (2023). Abstractive text summarization of low-resourced languages using deep learning. PeerJ. Computer Science, 9, e1176. https://doi.org/10.7717/peerj-cs.1176
Shafiq N, et al. Abstractive Text Summarization of Low-resourced Languages Using Deep Learning. PeerJ Comput Sci. 2023;9:e1176. PubMed PMID: 37346684.
TY  - JOUR
T1  - Abstractive text summarization of low-resourced languages using deep learning.
AU  - Shafiq, Nida
AU  - Hamid, Isma
AU  - Asif, Muhammad
AU  - Nawaz, Qamar
AU  - Aljuaid, Hanan
AU  - Ali, Hamid
Y1  - 2023/01/13/
PY  - 2022/09/11/received
PY  - 2022/11/09/accepted
PY  - 2023/6/22/medline
PY  - 2023/6/22/pubmed
PY  - 2023/6/22/entrez
KW  - Abstractive summarization
KW  - BERT2BERT
KW  - LSTM
KW  - Pars-BERT
KW  - Seq-to-Seq
KW  - Urdu
SP  - e1176
EP  - e1176
JF  - PeerJ. Computer science
JO  - PeerJ Comput Sci
VL  - 9
N2  - Background: Humans must be able to cope with the huge amounts of information produced by the information technology revolution. As a result, automatic text summarization is being employed in a range of industries to assist individuals in identifying the most important information. For text summarization, two approaches are mainly considered: text summarization by the extractive and abstractive methods. The extractive summarisation approach selects chunks of sentences like source documents, while the abstractive approach can generate a summary based on mined keywords. For low-resourced languages, e.g., Urdu, extractive summarization uses various models and algorithms. However, the study of abstractive summarization in Urdu is still a challenging task. Because there are so many literary works in Urdu, producing abstractive summaries demands extensive research. Methodology: This article proposed a deep learning model for the Urdu language by using the Urdu 1 Million news dataset and compared its performance with the two widely used methods based on machine learning, such as support vector machine (SVM) and logistic regression (LR). The results show that the suggested deep learning model performs better than the other two approaches. The summaries produced by extractive summaries are processed using the encoder-decoder paradigm to create an abstractive summary. Results: With the help of Urdu language specialists, the system-generated summaries were validated, showing the proposed model's improvement and accuracy.
SN  - 2376-5992
UR  - https://www.unboundmedicine.com/medline/citation/37346684/Abstractive_text_summarization_of_low_resourced_languages_using_deep_learning_
DB  - PRIME
DP  - Unbound Medicine
ER  -