The most computationally demanding aspect of Association Rule Mining is the identification and counting of support of the frequent sets of items that occur together sufficiently often to be the basis of potentially interesting rules. The task increases in difficulty with the scale of the data and also with its density. The greatest challenge is posed by data that is too large to be contained in primary memory, especially when high data density and/or low support thresholds give rise to very large numbers of candidates that must be counted. In this paper, we consider strategies for partitioning the data to deal effectively with such cases. We describe a partitioning approach which organises the data into tree structures that can be processed independently. We present experimental results that show the method scales well for increasing dimensions of data and performs significantly better than alternatives, especially when dealing with dense data and low support thresholds.
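To illustrate why partitioning helps with support counting, the sketch below uses the classic two-pass partition idea: find the itemsets that are frequent within each partition, then count only those candidates over the full data. It is a minimal sketch under stated assumptions, not the tree-based method the abstract describes; the function names, the brute-force local counting, and the `max_size` cap are all hypothetical choices made for brevity.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def local_frequent(transactions, min_frac, max_size):
    """Brute-force count of every itemset up to max_size in one partition.

    Returns the itemsets (as sorted tuples) whose count reaches
    min_frac * len(transactions). Illustrative only: real miners prune
    candidates instead of enumerating all combinations.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    threshold = min_frac * len(transactions)
    return {s for s, c in counts.items() if c >= threshold}


def partitioned_frequent_sets(transactions, min_frac, n_parts, max_size=3):
    """Two-pass partitioned mining (hypothetical helper).

    Pass 1: each partition is mined independently, so only one partition
    needs to be held in memory at a time.
    Pass 2: the union of locally frequent itemsets is counted globally.
    An itemset that is frequent overall must be frequent in at least one
    partition, so no frequent set is missed.
    """
    size = max(1, ceil(len(transactions) / n_parts))
    parts = [transactions[i:i + size] for i in range(0, len(transactions), size)]

    candidates = set()
    for part in parts:
        candidates |= local_frequent(part, min_frac, max_size)

    support = Counter()
    for t in transactions:
        items = set(t)
        for cand in candidates:
            if items.issuperset(cand):
                support[cand] += 1

    threshold = min_frac * len(transactions)
    return {c: n for c, n in support.items() if n >= threshold}


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"],
            ["b", "c"], ["a", "b", "c"], ["a"]]
    for itemset, n in sorted(partitioned_frequent_sets(data, 0.5, 2).items()):
        print(itemset, n)
```

The number of candidates in the second pass grows quickly for dense data or low support thresholds, which is the situation the abstract identifies as the hardest case.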
Shakil Ahmed received a first class BSc (Hons) degree from Dhaka University, Bangladesh, in 1990, and an MSc (first class), also from Dhaka University, in 1992. He received his PhD from The University of Liverpool, UK, in 2005. Since 2000 he has been a member of the Data Mining Group at the Department of Computer Science of the University of Liverpool, UK. His research interests include data mining, Association Rule Mining and pattern recognition.
Frans Coenen has been working in the field of Data Mining for many years and has written widely on the subject. He received his PhD from Liverpool Polytechnic in 1989, after which he took up a post as an RA within the Department of Computer Science at the University of Liverpool. In 1997, he took up a lecturing post within the same department. His current Data Mining research interests include Association Rule Mining, classification algorithms and text mining. He is on the programme committee for ICDM'05 and was the chair for the UK KDD symposium (UKKDD'05).
Paul Leng is professor of e-Learning at the University of Liverpool and director of the e-Learning Unit, which is responsible for overseeing the University's online degree programmes, leading to degrees of MSc in IT and MBA. Along with e-Learning, his main research interests are in Data Mining, especially in methods of discovering Association Rules. In collaboration with Frans Coenen, he has developed efficient new algorithms for finding frequent sets and is exploring applications in text mining and classification.
Analytical expressions are obtained for predicting the harmonic and intermodulation performance of R-LED series networks. These expressions are in terms of ordinary Bessel functions with arguments dependent on the modulation index.
Geologists interpret seismic data to understand subsurface properties and subsequently to locate underground hydrocarbon resources. Channels are among the most important geological features that interpreters analyze to locate petroleum reservoirs. However, manual channel picking is both time consuming and tedious. Moreover, like any other process dependent on human intervention, manual channel picking is error prone and inconsistent. To address these issues, automatic channel detection is both necessary and important for efficient and accurate seismic interpretation. Modern systems make use of real-time image processing techniques for different tasks. Automatic channel detection combines different mathematical methods in digital image processing to identify streaks within the images, called channels, that are important to oil companies. In this paper, we propose an innovative automatic channel detection algorithm based on machine learning techniques. The new algorithm can identify channels in seismic data/images fully automatically and tremendously increases the efficiency and accuracy of the interpretation process. The algorithm uses a deep neural network to train the classifier with both channel and non-channel patches. We provide a field data example to demonstrate the performance of the new algorithm. The training phase gave a maximum accuracy of 84.6% for the classifier, and it performed even better in the testing phase, giving a maximum accuracy of 90%.
Bivariate distributions are useful in the simultaneous modeling of two random variables. These distributions provide a way to model models. Bivariate families of distributions have not been widely explored, and in this article a new family of bivariate distributions is proposed. The new family extends the univariate transmuted family of distributions and will be helpful in modeling complex joint phenomena. Statistical properties of the new family of distributions are explored, including marginal and conditional distributions, conditional moments, product and ratio moments, and bivariate reliability and bivariate hazard rate functions. Maximum likelihood estimation (MLE) of the parameters of the family is also carried out. The proposed bivariate family of distributions is studied for the Weibull baseline distribution, giving rise to the bivariate transmuted Weibull (BTW) distribution. The new bivariate transmuted Weibull distribution is explored in detail. Statistical properties of the new BTW distribution are studied, including the marginal and conditional distributions and the product, ratio and conditional moments. The hazard rate function of the BTW distribution is obtained. Parameter estimation for the BTW distribution is also performed. Finally, a real data application of the BTW distribution is given. It is observed that the proposed BTW distribution is a suitable fit for the data used.
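For context, the univariate transmuted family that the abstract says is being extended is commonly written as below, shown here with a Weibull baseline. The symbols $G$, $\lambda$, $k$ and $\sigma$ are notation assumed for this sketch; the bivariate form proposed in the article is not given in the abstract and is not reproduced here.

```latex
% Univariate transmuted family with baseline CDF G and transmutation parameter \lambda
F(x) = (1+\lambda)\,G(x) - \lambda\,G(x)^{2}, \qquad |\lambda| \le 1 .

% Weibull baseline with shape k > 0 and scale \sigma > 0
G(x) = 1 - e^{-(x/\sigma)^{k}}, \qquad x \ge 0 .

% Substituting the baseline gives the univariate transmuted Weibull CDF
F(x) = \left(1 - e^{-(x/\sigma)^{k}}\right)\left(1 + \lambda\, e^{-(x/\sigma)^{k}}\right).
```

Setting $\lambda = 0$ recovers the baseline Weibull distribution.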
Safety and reliability are essential for modern sophisticated systems and technologies. Malfunction monitoring capabilities are therefore built into the system to detect incipient faults and to anticipate their impact on the future behavior of the system using fault diagnosis techniques. In particular, state-of-the-art applications rely on the quick and efficient treatment of malfunctions within the equipment/system, resulting in increased production and reduced downtime. This paper presents developments in Fault Detection and Diagnosis (FDD) methods and reviews research work in this area. The review covers both traditional model-based and relatively new signal processing-based FDD approaches, with special consideration paid to artificial intelligence-based FDD methods. Typical steps involved in the design and development of an automatic FDD system, including system knowledge representation, data acquisition and signal processing, fault classification, and maintenance-related decision actions, are systematically presented to outline the present status of FDD. Future research trends, challenges and prospective solutions are also highlighted.
The Journal of Supercomputing - Agile software development (ASD) and software product line (SPL) have shown significant benefits for software engineering processes and practices. Although both...
The Journal of Supercomputing - This research presents the detection and mitigation of distributed denial of service (DDoS) in software defined networks (SDN). The proposed method consists of three...
The Journal of Supercomputing - The Internet of Things is a rapidly evolving technology in which interconnected computing devices and sensors share data over the network to decipher different...
Neural Computing and Applications - In order to provide benchmark performance for Urdu text document classification, the contribution of this paper is manifold. First, it provides a publicly...
Neural Computing and Applications - Autonomous driving research is an emerging area in the machine learning domain. Most existing methods perform single-task learning, while multi-task learning...