Data Clustering Approach for Automatic Text Summarization of Hindi Documents using Particle Swarm Optimization and Semantic Graph
Vipul Dalal1, Latesh Malik2
1Vipul Dalal, Research Scholar, Department of Computer Science & Engineering, G.H. Raisoni College of Engineering, Nagpur (Maharashtra) – 440016, India.
2Dr. Latesh Malik, Department of Computer Science & Engineering, Government College of Engineering, Nagpur (Maharashtra) – 440016, India.
Manuscript published on July 05, 2017.
Abstract: Automatic text summarization is the process of presenting the important information in a given document using intelligent algorithms. Researchers have proposed many methods for summarizing English text, but automatic summarization of text in Indian languages has received very little attention so far. In this paper, we propose a data clustering approach for summarizing Hindi text using a semantic graph of the document and the Particle Swarm Optimization (PSO) algorithm. PSO is one of the most powerful bio-inspired algorithms for obtaining optimal solutions. Subject-object-verb (SOV) triples are extracted from the document. These triples are used to construct a semantic graph of the document and are finally clustered into summary and non-summary groups. A classifier trained with the PSO algorithm is then used to obtain the document summary.
Keywords: Bio-inspired algorithms, text mining, text summarization, semantic graph, PSO, data clustering.
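As an illustration of how PSO can train a two-class (summary versus non-summary) classifier, the following is a minimal sketch only, not the implementation used in this paper. It assumes a linear classifier with a hinge-style cost, and the two numeric features per triple, the toy data, and the names pso, hinge_cost, predict and DATA are all hypothetical. The inertia weight and acceleration coefficients are common textbook values, chosen here for illustration.

```python
import random

def pso(cost, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize cost over a dim-dimensional space with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # best position seen by each particle
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]   # best position seen by the swarm
    history = [gbest_cost]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
        history.append(gbest_cost)
    return gbest, gbest_cost, history

# Hypothetical toy data: two made-up features per triple, label +1 (summary) or -1.
DATA = [((0.9, 0.8), 1), ((0.8, 0.6), 1), ((0.7, 0.9), 1),
        ((0.2, 0.1), -1), ((0.1, 0.3), -1), ((0.3, 0.2), -1)]

def hinge_cost(params):
    """Hinge loss of the linear classifier (w1, w2, b) on DATA."""
    w1, w2, b = params
    return sum(max(0.0, 1.0 - y * (w1 * x1 + w2 * x2 + b)) for (x1, x2), y in DATA)

def predict(params, x):
    """Return +1 (summary) or -1 (non-summary) for feature pair x."""
    w1, w2, b = params
    return 1 if w1 * x[0] + w2 * x[1] + b >= 0 else -1

weights, final_cost, history = pso(hinge_cost, dim=3)
labels = [predict(weights, x) for x, _ in DATA]
```

Each particle here encodes one candidate set of classifier parameters, so the swarm searches the parameter space directly without needing gradients of the cost.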