TY - JOUR
T1 - Multi-Class Sentiment Analysis in Twitter
T2 - What if Classification is Not the Answer
AU - Bouazizi, Mondher
AU - Ohtsuki, Tomoaki
N1 - Funding Information: This work was supported by the Keio Leading-Edge Laboratory of Science and Technology-Japan under Grant KEIO-KLL-000081. Publisher Copyright: © 2018 IEEE.
PY - 2018
Y1 - 2018
AB - With the rapid growth of online social media content, and the impact these have made on people's behavior, many researchers have been interested in studying these media platforms. A major part of their work focused on sentiment analysis and opinion mining. These refer to the automatic identification of opinions of people toward specific topics by analyzing their posts and publications. Multi-class sentiment analysis, in particular, addresses the identification of the exact sentiment conveyed by the user rather than the overall sentiment polarity of his text message or post. That being the case, we introduce a task different from the conventional multi-class classification, which we run on a data set collected from Twitter. We refer to this task as 'quantification.' By the term 'quantification,' we mean the identification of all the existing sentiments within an online post (i.e., tweet) instead of attributing a single sentiment label to it. For this sake, we propose an approach that automatically attributes different scores to each sentiment in a tweet, and selects the sentiments with the highest scores which we judge as conveyed in the text. To reach this target, we added to our previously introduced tool SENTA the necessary components to run and perform such a task. Throughout this work, we present the added components; we study the feasibility of quantification, and propose an approach to perform it on a data set made of tweets for 11 different sentiment classes. The data set was manually labeled and the results of the automatic analysis were checked against the human annotation. Our experiments show the feasibility of this task and reach an F1 score equal to 45.9%.
KW - Twitter
KW - machine learning
KW - sentiment analysis
UR - http://www.scopus.com/inward/record.url?scp=85055030547&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85055030547&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2018.2876674
DO - 10.1109/ACCESS.2018.2876674
M3 - Article
AN - SCOPUS:85055030547
SN - 2169-3536
VL - 6
SP - 64486
EP - 64502
JO - IEEE Access
JF - IEEE Access
M1 - 8496747
ER -