Similarity of Articles using Hierarchical Clustering
Geetanjali, Shashank Sahu
1. Geetanjali, M.Tech, Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, Ghaziabad (U.P.), India.
2. Shashank Sahu, Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, Ghaziabad (U.P.), India.
Manuscript received on May 12, 2016. | Revised Manuscript received on May 18, 2016. | Manuscript published on July 05, 2016. | PP: 50-53 | Volume-6 Issue-3, July 2016. | Retrieval Number: C2875076316
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: The objective of RSS is to deliver content that is current and contains the information most relevant to the user. Finding similarity among articles helps provide a better service, and ongoing research seeks to measure semantic similarity so that users read fewer articles of the same type. This paper focuses on identifying a suitable distance method for checking similarity between articles. Hierarchical clustering (HC) is one of the best methods for clustering articles that are similar on some parameter. Various methods are used in HC; which of them is best suited to finding semantic similarity is the focus of this paper. We collected articles from several news channel websites for one category (terrorist) and, by observing the articles, selected thirty keywords for implementing and comparing the proposed technique. Similarity checking was performed on sets of 18, 16, 12, 10, 9, and 6 articles. Of the distances calculated, the cityblock distance method gives the best result. The articles selected for this work were published over the last decade (2003-2013).
Keywords: Really Simple Syndication (RSS), Hierarchical Clustering.
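The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four keywords, the sample article texts, the use of raw keyword counts as features, and the single-linkage merge rule are all assumptions made for illustration (the abstract states only that thirty keywords and the cityblock distance were used).

```python
from itertools import combinations

# Hypothetical keyword list; the paper uses thirty keywords not given in the abstract.
KEYWORDS = ["attack", "bomb", "police", "security"]


def keyword_vector(text, keywords=KEYWORDS):
    """Count how often each keyword occurs in the article text."""
    words = text.lower().split()
    return [words.count(k) for k in keywords]


def cityblock(u, v):
    """City-block (Manhattan) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def agglomerate(vectors, n_clusters):
    """Agglomerative clustering with single linkage (the linkage is an assumption)."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        # Pick the two clusters whose closest members are nearest to each other.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: min(
                cityblock(vectors[i], vectors[j])
                for i in clusters[p[0]]
                for j in clusters[p[1]]
            ),
        )
        # a < b always holds, so deleting b does not shift index a.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Made-up article snippets, for illustration only.
articles = [
    "bomb attack reported police respond to attack",
    "police confirm bomb attack in city",
    "security budget raised security review announced",
    "new security measures after security audit",
]
vectors = [keyword_vector(a) for a in articles]
clusters = agglomerate(vectors, 2)
print(clusters)  # [[0, 1], [2, 3]]
```

Here the first two snippets share the attack-related keywords and the last two share only "security", so the two groups merge separately. Comparing distance methods, as the paper does, would amount to swapping `cityblock` for another distance function and inspecting how the resulting clusters change.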