A UML profile for the conceptual modelling of data-mining with time-series in data warehouses
Authors:Jose Zubcoff  Jesús Pardillo  Juan Trujillo
Affiliation:1. Lucentia Research Group, Department of Sea Sciences and Applied Biology, University of Alicante, 03690 Alicante, Spain;2. Lucentia Research Group, Department of Software and Computing Systems, University of Alicante, 03690 Alicante, Spain
Abstract:Time-series analysis is a powerful technique for discovering patterns and trends in temporal data. However, the lack of a conceptual model for this data-mining technique forces analysts to deal with unstructured data. These data are represented at a low level of abstraction and are expensive to manage. Most analysts face two main problems: (i) cleansing the huge amount of potentially analysable data and (ii) correctly defining the data-mining algorithms to be employed. Because analysts' interests also remain hidden in this scenario, it is difficult not only to prepare the data but also to discover which data are the most promising. Since their appearance, data warehouses have proved to be a powerful repository of historical data for data-mining purposes. Moreover, their foundational modelling paradigm, multidimensional modelling, is very similar to the problem domain. In this article, we propose an extension of the Unified Modeling Language (UML) for data-mining, defined by means of UML profiles. Specifically, the UML profile presented allows us to specify time-series analysis on top of the multidimensional models of data warehouses. Our extension provides analysts with an intuitive notation for time-series analysis that is independent of any specific data-mining tool or algorithm. To show its feasibility and ease of use, we apply it to the analysis of fish captures in Alicante. We believe that a coherent conceptual modelling framework for data-mining ensures a better and easier knowledge-discovery process on top of data warehouses.
Keywords:
This article is indexed in ScienceDirect and other databases.