An Approach for Extracting the Keyword using Frequency and Distance of the Word Calculations
Ashwini Madane1, Devendra Thakore2
1Ms. Ashwini Madane, B.E., M.Tech. (pursuing), Bharati Vidyapeeth University, College of Engineering, Pune – 43.
2Prof. Devendra Thakore, B.E., M.Tech., Ph.D. (pursuing), Bharati Vidyapeeth University, College of Engineering, Pune – 43.
Manuscript received on July 01, 2012. | Revised Manuscript received on July 04, 2012. | Manuscript published on July 05, 2012. | PP: 144-146 | Volume-2, Issue-3, July 2012. | Retrieval Number: C0724052312 /2012©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: A keyword is a significant word used in indexing or cataloguing. Keywords provide a concise, precise, high-level summary of a document, and in effect give its shortest possible summary. They therefore constitute an important feature for document retrieval, classification, topic search and other tasks, even when full-text search is available. A keyword is identified by assessing the relevance of a word, with or without a prior vocabulary of the document or web page. Extracting keywords manually is difficult and time-consuming, to the point that it is impractical even for the articles published in a single conference. There is therefore a need for an automated process that extracts keywords from documents. This paper concentrates on extracting keywords: it reviews linguistic, non-linguistic and various other approaches, and applies a simple statistical approach.
Keywords: Extraction methods, Keyword Frequency Count, Stemming, Tokenization.
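
To illustrate the terms listed above, the following is a minimal Python sketch of a frequency-count pipeline (tokenization, a stemming step, then counting). It is not the authors' implementation. The stop-word list, the function names `tokenize`, `crude_stem` and `top_keywords`, and the plural-stripping rule are all assumptions made for illustration, and the word-distance calculation named in the title is not shown because this section does not define it.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only (not from the paper).
STOP_WORDS = {"a", "an", "the", "of", "and", "or", "in", "is", "are",
              "to", "for", "as", "they", "it", "by", "on", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Strip a trailing plural 's'. A placeholder, not a real stemmer."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def top_keywords(text, k=3):
    """Return the k most frequent stemmed, non-stop-word tokens with counts."""
    stems = [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]
    return Counter(stems).most_common(k)


if __name__ == "__main__":
    sample = "Keywords summarize a document. A keyword helps document retrieval."
    print(top_keywords(sample, k=2))
```

In practice the placeholder `crude_stem` would be replaced by an established stemming algorithm, and the raw counts would be combined with whatever additional statistics the chosen method requires.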