A Novel Auto-Annotation Technique for Aspect Level Sentiment Analysis
Authors:Muhammad Aasim Qureshi  Muhammad Asif  Mohd Fadzil Hassan  Ghulam Mustafa  Muhammad Khurram Ehsan  Aasim Ali  Unaza Sajid
Affiliation:1. Department of Computer Sciences, Bahria University, Lahore Campus, 54000, Pakistan; 2. Computer and Information Science Department, University Teknologi, Petronas, 32610, Malaysia
Abstract:In machine learning, sentiment analysis is a technique for finding and analyzing the sentiments expressed in text. Sentiment analysis requires annotated data, which is generally annotated manually. Manual annotation is a time-consuming, costly, and laborious process. To overcome these resource constraints, this research proposes a fully automated annotation technique for aspect-level sentiment analysis. The dataset is created from the reviews of the ten most popular songs on YouTube. Reviews covering five aspects (voice, video, music, lyrics, and song) are extracted, and an N-gram-based technique is proposed to annotate them. The complete dataset consists of 369,436 reviews, which took 173.53 s to annotate using the proposed technique; manual annotation of the same dataset might have taken approximately 2.07 million seconds (575 h). To validate the proposed technique, one sub-dataset, Voice, is annotated both manually and with the proposed technique. Cohen's Kappa statistic is used to evaluate the degree of agreement between the two annotations. The high Kappa value (0.9571) shows a high level of agreement between the two. This validates that the annotation quality of the proposed technique is as good as that of manual annotation, at a far lower cost. This research also contributes by consolidating guidelines for the manual annotation process.
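As an illustration of the agreement measure named in the abstract, the following is a minimal sketch of Cohen's Kappa for two annotators, computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each annotator's label frequencies. The function name `cohens_kappa`, the label names, and the toy label lists are assumptions made for this example; they are not the paper's data or code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items that received the same label twice.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal frequencies, summed over all labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_e == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy labels invented for illustration (hypothetical, not from the paper).
manual = ["pos", "pos", "neg", "neg", "neu", "pos", "neg", "pos"]
auto = ["pos", "pos", "neg", "neg", "neu", "pos", "pos", "pos"]

kappa = cohens_kappa(manual, auto)
print(f"kappa = {kappa:.4f}")
```

Kappa ranges from -1 to 1, with 1 meaning perfect agreement and 0 meaning agreement no better than chance, so a value near 0.96 corresponds to near-complete agreement between the two annotations.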
Keywords:Machine learning  natural language processing  annotation  semi-annotated technique  reviews annotation  text annotation  corpus annotation