Text analytics for big data using rough–fuzzy soft computing techniques
Authors: Mohammed Al‐Maitah
Abstract: Text mining, or text analytics, is important for applications in areas such as market analysis and biomedicine because it enables efficient retrieval of information from large datasets. As the dimensionality of the data increases, however, the performance of the whole system degrades, because irrelevant text may be retrieved and introduce errors. This paper therefore applies big data and data mining techniques to analyse large volumes of information when mining texts, emails, blogs, online forums, news, and call centre documents. First, data are collected from various sources; because these data contain noise, normalization techniques are applied to remove it. Data mining techniques then eliminate irrelevant information and remaining noise, and the relevant features are selected using a rough set‐based particle swarm optimization algorithm. The selected features are grouped into clusters using a fuzzy set combined with particle swarm optimization, which improves the efficiency of the mining process. Finally, the efficiency of the system is evaluated on the University of California Irvine Machine Learning Repository knowledge process mining database, using the sum of the intra‐cluster distances, the mean squared error rate, and the accuracy.
Keywords: big data; clustering; feature selection; fuzzy set; rough set; text mining
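
The abstract names the sum of the intra-cluster distances as one of the evaluation measures. The following is a minimal sketch of how such a measure could be computed, not the paper's implementation. It assumes Euclidean distance and hard assignment of each point to its nearest centroid; the function name and the toy data are hypothetical.

```python
import math


def sum_intra_cluster_distances(points, centroids):
    """Sum, over all points, the Euclidean distance to the nearest centroid.

    Assumption: each point belongs to the cluster of its closest centroid.
    The source does not specify the distance metric or assignment rule.
    """
    total = 0.0
    for p in points:
        total += min(math.dist(p, c) for c in centroids)
    return total


# Toy data (hypothetical): two well-separated groups in 2-D feature space.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_centroids = [(0.0, 1.0), (10.0, 1.0)]
poor_centroids = [(5.0, 1.0)]

good_score = sum_intra_cluster_distances(points, good_centroids)
poor_score = sum_intra_cluster_distances(points, poor_centroids)
```

A lower value indicates tighter clusters, so an optimizer such as particle swarm optimization could use this quantity as the fitness to minimize when searching for centroid positions.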