A Brief Survey on Emotion Based Text to Speech Conversion System
Supriya Dhanaraj Dhumale1, Manjiri Vitthal Khopade2, Bhushan Dhimate3, Avadhoot Yogesh Dhere4

1Bhushan Hemant Dhimate, Department of Computer Science, Savitribai Phule Pune University, Pune (Maharashtra), India.
2Manjiri Vitthal Khopade, Department of Computer Science, Savitribai Phule Pune University, Pune (Maharashtra), India.
3Avadhoot Yogesh Dhere, Department of Computer Science, Savitribai Phule Pune University, Pune (Maharashtra), India.
4Supriya Dhanaraj Dhumale, Department of Computer Science, Savitribai Phule Pune University, Pune (Maharashtra), India.
Manuscript received on September 09, 2021. | Revised Manuscript received on September 12, 2021. | Manuscript published on September 30, 2021. | PP: 40-43 | Volume-11, Issue-1, September 2021. | Retrieval Number: 100.1/ijsce.A35290911121 | DOI: 10.35940/ijsce.A3529.0911121
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Text to speech conversion is one of the applications of machine learning. It is widely used in search engines, standalone applications, web applications, chatbots and Android applications. However, text to speech systems still need to be upgraded to make applications more interactive and user-friendly. A traditional text to speech application produces a monotonous voice that carries no emotion and sounds mechanical. The existing system therefore needs to be improved by embedding emotion in its output. Existing text to speech systems are not suited to storytelling applications, and they do not provide effective communication. Most text to speech systems are developed using algorithms such as Support Vector Machine (SVM) and Naïve Bayes. An emotion based text to speech system will help to improve on these existing systems. Machine learning and deep learning algorithms such as the Recurrent Neural Network (RNN) can be used to perform sentiment analysis and semantic analysis on the input text. We are going to use a neural network, which is more effective and helps to maintain the relation between the previous word and the next word. The emotion based text to speech system will be able to identify four emotions: ‘happy’, ‘sad’, ‘angry’ and ‘neutral’. It will be beneficial for educational purposes, such as listening to stories in storytelling applications for young children, and it will also be serviceable for visually impaired individuals.
Keywords: Emotion recognition, Text to Speech, GRU.
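
The abstract does not give implementation details, so the following is only an illustrative sketch of how a recurrent network with gated recurrent units (GRU, as named in the keywords) can carry information from previous words forward while scoring the four emotions. The class name TinyGRUClassifier, the layer sizes, the whitespace tokenizer and the random, untrained weights are all assumptions made for this example. Bias terms are omitted for brevity, and the code is not the authors' system.

```python
import math
import random

# The four classes named in the abstract.
EMOTIONS = ["happy", "sad", "angry", "neutral"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class TinyGRUClassifier:
    """Hypothetical, untrained GRU text classifier (illustration only)."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One random embedding vector per known word; unknown words map to zeros.
        self.embed = {w: [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
                      for w in vocab}
        self.unk = [0.0] * embed_dim
        # Input (W) and recurrent (U) weights for the update gate z,
        # the reset gate r and the candidate state n.
        self.W = {g: rand_matrix(rng, hidden_dim, embed_dim) for g in "zrn"}
        self.U = {g: rand_matrix(rng, hidden_dim, hidden_dim) for g in "zrn"}
        self.out = rand_matrix(rng, len(EMOTIONS), hidden_dim)

    def step(self, x, h):
        """One GRU step: combine the current word x with the previous state h."""
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(matvec(self.W["n"], x), matvec(self.U["n"], reset_h))]
        # The update gate decides how much of the previous state is kept.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def predict_proba(self, text):
        """Return a softmax distribution over EMOTIONS for the input text."""
        h = [0.0] * self.hidden_dim
        for word in text.lower().split():
            h = self.step(self.embed.get(word, self.unk), h)
        logits = matvec(self.out, h)
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]
        total = sum(exps)
        return {e: v / total for e, v in zip(EMOTIONS, exps)}


if __name__ == "__main__":
    model = TinyGRUClassifier(vocab=["i", "am", "so", "happy", "today"])
    probs = model.predict_proba("I am so happy today")
    # Weights are untrained, so the predicted label here is arbitrary.
    print(max(probs, key=probs.get), probs)
```

Because the hidden state h is passed from one word to the next, the same words in a different order give a different result, which is the sense in which such a network relates the previous word to the next one. A usable system would train these weights on emotion-labelled text and pass the predicted emotion on to the speech synthesis stage.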