AccScience Publishing / IJOSI / Volume 9 / Issue 6 / DOI: 10.6977/IJoSI.202512_9(6).0003
ARTICLE

Automatic text summarization framework for multi-text and multilingual documents using an ensemble of HIN-MELM-AE and improved DePori model

Sunil Upadhyay1*, Hemant Kumar Soni1
1 Department of Computer Science and Engineering, Amity School of Engineering and Technology, Amity University Madhya Pradesh, Gwalior, Madhya Pradesh, India
Submitted: 28 November 2024 | Revised: 10 November 2025 | Accepted: 2 December 2025 | Published: 29 December 2025
© 2025 by the Author(s). Licensee AccScience Publishing, USA. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/).
Abstract

Automatic text summarization (ATS) has gained increasing significance in recent years due to the rapid growth of textual data across digital platforms. The main objective of ATS is to generate a concise, informative summary from a lengthy document. Multi-document and multilingual summarization, however, remains largely underexplored. This study presents an improved ensemble learning-based ATS system with slang filtering, using the Hyperfan-IN multilayer extreme learning machine-based autoencoder (HIN-MELM-AE) and the improved Dehghani poor-and-rich optimization algorithm (DePori). The original text undergoes comprehensive preprocessing, after which slang is detected and removed using DePori. Subsequently, the clean text is processed through info-squared C-means clustering, latent Dirichlet allocation-based topic modeling, term frequency–inverse document frequency weighting, and frequent-term extraction. Next, part-of-speech (POS) tagging is performed using a sememe similarity-induced hidden Markov model, and key entities are extracted from the transformed and POS-tagged data. A distilled bidirectional encoder representations from transformers (DBERT) model converts these entities into vectors. The final summary is generated through a combination of HIN-MELM-AE, stacked autoencoder, variational autoencoder, and DBERT models, followed by cosine similarity calculation, voting-based fusion, re-ranking, and selection of the optimal sentences. Experimental results indicate that the proposed framework outperforms existing ATS methods 97.92% of the time.
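
The final stage described above (vectorize sentences, score them by cosine similarity, fuse several scorers by voting, then select the top sentences) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two scorers below (TF-IDF vectors and term-presence vectors) are simple stand-ins for the HIN-MELM-AE, autoencoder, and DBERT encoders, the Borda count is one assumed choice of voting scheme, and every function name is hypothetical.

```python
import math
from collections import Counter

PUNCT = ".,;:!?()\"'"

def tokenize(sentence):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (w.strip(PUNCT).lower() for w in sentence.split())
    return [t for t in tokens if t]

def tfidf_vectors(sentences):
    """One TF-IDF vector per sentence, treating each sentence as a document."""
    counts = [Counter(tokenize(s)) for s in sentences]
    n = len(counts)
    df = Counter(term for c in counts for term in c)  # document frequency
    vocab = sorted(df)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}  # smoothed IDF
    return [[c[t] * idf[t] for t in vocab] for c in counts]

def binary_vectors(sentences):
    """Stand-in second encoder (assumption): term presence/absence vectors."""
    token_sets = [set(tokenize(s)) for s in sentences]
    vocab = sorted(set().union(*token_sets))
    return [[1.0 if t in ts else 0.0 for t in vocab] for ts in token_sets]

def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def centroid_scores(vectors):
    """Score each sentence by its cosine similarity to the mean vector."""
    n = len(vectors)
    centroid = [sum(col) / n for col in zip(*vectors)]
    return [cosine(v, centroid) for v in vectors]

def borda_fusion(score_lists):
    """Voting-based fusion: each scorer gives n-1 points to its best
    sentence, n-2 to the next, and so on; points are summed per sentence."""
    n = len(score_lists[0])
    votes = [0] * n
    for scores in score_lists:
        ranking = sorted(range(n), key=lambda i: -scores[i])
        for rank, idx in enumerate(ranking):
            votes[idx] += n - 1 - rank
    return votes

def summarize(sentences, k):
    """Re-rank sentences by fused votes and return the top k in source order."""
    score_lists = [
        centroid_scores(tfidf_vectors(sentences)),
        centroid_scores(binary_vectors(sentences)),
    ]
    votes = borda_fusion(score_lists)
    top = sorted(range(len(sentences)), key=lambda i: (-votes[i], i))[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, k)` on a list of sentence strings returns the `k` sentences with the most fused votes, kept in their original order. Rank-based voting is used here because it avoids having to calibrate raw similarity scores across different encoders; a score-averaging fusion would be an equally plausible reading of the abstract.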

Keywords
Hyperfan-IN Multilayer Extreme Learning Machine Auto Encoder
Info-Squared Fuzzy C-Means Clustering
Latent Dirichlet Allocation
Parts of Speech
Sentence Bidirectional Encoder Representations from Transformers
Sememe Similarity-Induced Hidden Markov Model
Term Frequency–Inverse Document Frequency
Variational Auto Encoder
Funding
None.
Conflict of interest
The authors declare that they have no competing interests.
International Journal of Systematic Innovation, Electronic ISSN: 2077-8767 Print ISSN: 2077-7973, Published by AccScience Publishing