Honey Bee Colony Optimization for Multiresponse Mixed-Integer Problems
Partha De

Partha De, Department of Computer Science & Engineering, Birbhum Institute of Engineering & Technology, Suri, Birbhum, West Bengal, India.
Manuscript received on May 01, 2016. | Revised Manuscript received on May 02, 2016. | Manuscript published on May 05, 2016. | PP: 43-49 | Volume-6 Issue-2, May 2016. | Retrieval Number: B2843056216
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: This work applies machine-learning approaches to emotion tagging of Bengali documents and sentiment analysis of English documents. For the Bengali documents, all unigrams and bigrams are considered as candidate features for emotion tagging, and feature selection is performed using pointwise mutual information (PMI). To prepare the training data, every sentence in the documents is manually tagged with one of seven labels: Ekman's six basic universal emotions plus a catch-all category (Happy=1, Sad=2, Anger=3, Disgust=4, Fear=5, Surprise=6, Other emotion=7). The PMI of each feature is computed from its number of occurrences in each emotion category, and the unigrams and bigrams whose PMI exceeds a threshold are retained as features. A feature matrix for the sentences, together with their emotion labels, forms the training data. For both emotion tagging and sentiment analysis, we train several machine learning algorithms from WEKA, a collection of machine learning tools. Performance is evaluated with 10-fold cross-validation, and the final accuracy is the average over all 10 folds. The best average accuracy obtained for emotion tagging is 55.89%. For sentiment analysis, we experiment on benchmark datasets, again using mutual information for feature selection; the best average accuracy obtained on these datasets is 89%.
Keywords: Pointwise mutual information, multinomial naïve Bayes, WEKA, emotion tagging, sentiment analysis
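
The PMI-based feature selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact PMI formula, the counting scheme, or the threshold value, so the sketch assumes the standard definition PMI(f, c) = log2(P(f, c) / (P(f) P(c))) estimated from sentence-level counts, and the names `ngrams`, `select_features_by_pmi`, and `threshold` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def select_features_by_pmi(sentences, labels, threshold):
    """Keep unigrams and bigrams whose PMI with some label exceeds threshold.

    sentences: list of token lists; labels: one emotion label per sentence.
    Assumption: probabilities are estimated from sentence-level counts.
    """
    total = len(sentences)
    label_count = Counter(labels)   # sentences per emotion category
    feature_count = Counter()       # sentences containing the feature
    joint_count = Counter()         # sentences with both feature and label

    for tokens, label in zip(sentences, labels):
        # Count each feature at most once per sentence.
        for feature in set(ngrams(tokens, 1) + ngrams(tokens, 2)):
            feature_count[feature] += 1
            joint_count[(feature, label)] += 1

    selected = set()
    for (feature, label), joint in joint_count.items():
        # PMI = log2( P(f, c) / (P(f) * P(c)) ), with the 1/total factors folded in.
        pmi = math.log2(joint * total / (feature_count[feature] * label_count[label]))
        if pmi > threshold:
            selected.add(feature)
    return selected


# Toy usage with labels Happy=1 and Sad=2 (illustrative data, not from the paper).
toy_sentences = [["i", "am", "happy"], ["i", "am", "sad"],
                 ["so", "happy"], ["so", "sad"]]
toy_labels = [1, 2, 1, 2]
features = select_features_by_pmi(toy_sentences, toy_labels, threshold=0.5)
```

In this toy example, n-grams that occur equally often in both categories (such as "i" or "i am") have a PMI of 0 and are dropped, while category-specific n-grams (such as "happy" or "so sad") are kept. The retained features would then define the columns of the sentence feature matrix passed to the classifiers.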