Data Clustering Approach for Automatic Text Summarization of Hindi Documents using Particle Swarm Optimization and Semantic Graph
Vipul Dalal1, Latesh Malik2
1Vipul Dalal, Research Scholar, Department of Computer Science & Engineering, G.H. Raisoni College of Engineering, Nagpur (Maharashtra) - 440016, India.
2Dr. Latesh Malik, Department of Computer Science & Engineering, Government College of Engineering, Nagpur (Maharashtra) - 440016, India.
Manuscript received on June 21, 2017. | Revised Manuscript received on June 28, 2017. | Manuscript published on July 05, 2017. | PP: 1-3 | Volume-7 Issue-3, July 2017. | Retrieval Number: C3009077317/2017©BEIESP
©The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Automatic text summarization is the process of conveying the important information in a given document using intelligent algorithms. Researchers have proposed many methods for summarizing English text, but automatic summarization of text in Indian languages has received very little attention so far. In this paper, we propose a data clustering approach for summarizing Hindi text using a semantic graph of the document and the Particle Swarm Optimization (PSO) algorithm. PSO is a bio-inspired optimization algorithm widely used to search for optimal solutions. Subject-object-verb (SOV) triples are extracted from the document and used to construct its semantic graph; the triples are then clustered into summary and non-summary groups. A classifier is trained using the PSO algorithm and is then used to obtain the document summary.
Keywords: Bio-inspired algorithms, text mining, text summarization, semantic graph, PSO, data clustering.
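Illustrative sketch: the abstract does not specify the features, the fitness function, or the PSO settings, so the following Python is only a minimal sketch of how a PSO-trained classifier could separate summary from non-summary triples, not the authors' implementation. Everything named here is an assumption made for illustration: the two-dimensional feature vectors in SAMPLES (for example, a node-degree score and a term-frequency score taken from the semantic graph), the linear classifier, the logistic-loss fitness, and the swarm parameters (inertia w, coefficients c1 and c2).

```python
import math
import random

# Hypothetical toy data: (feature vector of a triple, label).
# Label 1 = summary triple, 0 = non-summary triple. The features are
# placeholders, not the features used in the paper.
SAMPLES = [
    ((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1),
    ((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.3, 0.3), 0),
]


def score(w, x):
    """Linear score; the last entry of w is the bias term."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]


def predict(w, x):
    """Return 1 (summary) if the score is positive, else 0."""
    return 1 if score(w, x) > 0 else 0


def logistic_loss(w, samples):
    """Mean logistic loss, used here as the assumed PSO fitness."""
    total = 0.0
    for x, y in samples:
        margin = score(w, x) if y == 1 else -score(w, x)
        # Numerically stable form of log(1 + exp(-margin)).
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(samples)


def pso_minimize(fitness, dim, n_particles=20, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; returns (best position, best fitness)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    weights, loss = pso_minimize(lambda w: logistic_loss(w, SAMPLES), dim=3)
    summary = [x for x, _ in SAMPLES if predict(weights, x) == 1]
    print("loss:", loss)
    print("triples selected for the summary:", summary)
```

In this sketch each particle is a candidate weight vector, and the swarm's best position after the final iteration is used as the trained classifier. Triples with a positive score would be placed in the summary group.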