Sequential Pattern Mining: Survey and Current Research Challenges
Chetna Chand1, Amit Thakkar2, Amit Ganatra3

1Chetna Chand, Department of Computer Engineering, Charotar Institute of Technology (Faculty of Technology and Engineering), Charotar University of Technology Changa, Anand, Gujarat, India.
2Amit Thakkar, Department of Information Technology, Charotar Institute of Technology (Faculty of Technology and Engineering), Charotar University of Technology Changa, Anand, Gujarat, India.
3Amit Ganatra, Department of Computer Engineering, Charotar Institute of Technology (Faculty of Technology and Engineering), Charotar University of Technology Changa, Anand, Gujarat, India.

Manuscript received on February 15, 2012. | Revised Manuscript received on February 20, 2012. | Manuscript published on March 05, 2012. | PP: 185-193 | Volume-2 Issue-1, March 2012. | Retrieval Number: A0412022112/2012©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Sequential pattern mining was first introduced by Rakesh Agrawal and Ramakrishnan Srikant in 1995, in the context of market analysis. It aimed to find frequent patterns in the sequences of products purchased by customers across time-ordered transactions. Its use was later extended to more complex applications such as telecommunication, network detection, and DNA research. Several algorithms have been proposed. The first was the Apriori algorithm, put forward by the founders themselves; more scalable algorithms for complex applications, such as GSP, SPADE, and PrefixSpan, followed. The area has advanced considerably in the short span since its introduction. This paper presents a systematic survey of sequential pattern mining algorithms, classifying them into two broad categories: first, algorithms designed to increase the efficiency of mining, and second, extensions of sequential pattern mining designed for particular applications. Finally, the algorithms are compared on the key features they support, and current research challenges in this field of data mining are discussed.

Keywords: Sequential Pattern, Sequence Database, Itemsets, Apriori.
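
The problem stated in the abstract, finding frequent patterns in time-ordered customer transactions, can be illustrated with a small sketch. The code below is not taken from the paper and is not any of the surveyed algorithms; it only shows how the support of one candidate sequential pattern could be counted over a sequence database. The names `contains` and `support` and the toy database are assumptions made for illustration.

```python
from typing import FrozenSet, List

# A customer sequence is a time-ordered list of itemsets (one per transaction).
Itemset = FrozenSet[str]
Sequence = List[Itemset]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each itemset of the pattern must be a subset of some transaction,
    and the matched transactions must appear in the same order.
    Matching each pattern itemset at the earliest possible transaction
    is sufficient for this containment test.
    """
    pos = 0
    for itemset in pattern:
        # Advance to the next transaction that includes this itemset.
        while pos < len(sequence) and not itemset <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern itemset must match a later transaction
    return True


def support(database: List[Sequence], pattern: Sequence) -> float:
    """Fraction of customer sequences in `database` that contain `pattern`."""
    if not database:
        return 0.0
    return sum(contains(seq, pattern) for seq in database) / len(database)


# Hypothetical toy database: four customers, transactions in time order.
db: List[Sequence] = [
    [frozenset({"bread"}), frozenset({"milk", "butter"}), frozenset({"jam"})],
    [frozenset({"bread", "milk"}), frozenset({"jam"})],
    [frozenset({"milk"}), frozenset({"bread"})],
    [frozenset({"bread"}), frozenset({"butter"}), frozenset({"milk", "jam"})],
]

bread_then_jam: Sequence = [frozenset({"bread"}), frozenset({"jam"})]
milk_with_butter: Sequence = [frozenset({"milk", "butter"})]

print(support(db, bread_then_jam))    # 0.75
print(support(db, milk_with_butter))  # 0.25
```

A pattern is called frequent when its support reaches a user-chosen minimum threshold. The sketch tests a single given candidate; how candidates are generated and pruned efficiently is where the surveyed algorithms differ.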