
1.

The analysis and improvement of Apriori algorithm

HAN Feng ZHANG Shu-mao DU Ying-shuang 《Journal of Communication and Computer》2008,5(9):12-18

Mining association rules is an essential research topic in data mining. Association rules reflect the inner relationships within data and provide an effective means of finding potential links between data items; discovering them helps decision-makers reach correct and appropriate decisions. In this paper, starting from a study of data mining technology, we examine association rule mining in depth. On that basis, we analyze the classic method for mining association rules, the Apriori algorithm, point out its weaknesses, and put forward a new improved algorithm, the AprioriMend algorithm.
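For readers unfamiliar with the classic method this abstract analyzes, the following is a minimal sketch of textbook Apriori frequent-itemset mining. It is not the paper's AprioriMend algorithm, whose details the abstract does not give. The function name `apriori`, the parameter `min_support` (an absolute transaction count), and the sample baskets are assumptions made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Textbook Apriori; min_support is an absolute count (an assumption here).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count step: one scan of all transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical market-basket data, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

The count step rescans every transaction at each level, which is one of the commonly cited weaknesses of the classic algorithm.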

2.

A Self-Learning Algorithm for Web Topic Association Knowledge

YANG Pei ZHENG Qilun PENG Hong 《Computer Science》2003,30(10):49-51

The topology of topic-specific websites contains rich hidden information for data mining. A new topic-specific association rule mining algorithm is proposed to further research in this area. The key idea is to analyze the frequent hyperlink relations between pages of different topics. In a topic-specific area, if pages of one topic are frequently hyperlinked by pages of another topic, we consider the two topics relevant. Likewise, if pages of two different topics are frequently hyperlinked together by pages of a third topic, we consider the two topics relevant. Initial experiments show that the algorithm performs quite well in guiding a topic-specific crawling agent and that it can be applied to further discovery and mining on topic-specific websites.
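The two relevance rules stated in this abstract can be sketched as simple link counting. This is only an illustration under stated assumptions, not the authors' algorithm: the function `relevant_topic_pairs`, the `min_count` threshold, the page-to-topic mapping, and the sample links are all hypothetical, and "frequently" is reduced to a fixed count.

```python
from collections import Counter
from itertools import combinations


def relevant_topic_pairs(links, topic_of, min_count):
    """Return the set of topic pairs judged relevant by link frequency.

    links: iterable of (source_page, target_page) hyperlinks.
    topic_of: dict mapping each page to its topic label.
    min_count: assumed threshold for "frequently".
    """
    direct = Counter()     # rule 1: (source topic, target topic) link counts
    targets_by_page = {}   # topics each source page links to

    for src, dst in links:
        a, b = topic_of[src], topic_of[dst]
        if a != b:
            direct[(a, b)] += 1
        targets_by_page.setdefault(src, set()).add(b)

    # Rule 2: one page links to pages of two topics other than its own.
    co_linked = Counter()
    for src, topics in targets_by_page.items():
        others = sorted(topics - {topic_of[src]})
        for pair in combinations(others, 2):
            co_linked[frozenset(pair)] += 1

    pairs = {frozenset(p) for p, c in direct.items() if c >= min_count}
    pairs |= {p for p, c in co_linked.items() if c >= min_count}
    return pairs


# Hypothetical pages and links, for illustration only.
topic_of = {"p1": "ai", "p2": "ai", "q1": "ml", "q2": "ml",
            "r1": "db", "s1": "stats"}
links = [("p1", "q1"), ("p1", "q2"), ("p2", "q1"),
         ("r1", "q1"), ("r1", "s1")]
related = relevant_topic_pairs(links, topic_of, min_count=2)
```

With a threshold of 2, only the "ai" and "ml" topics are related here, because "ai" pages link to "ml" pages three times while every other relation occurs once.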

3.

A survey of uncertain data management

Lingli LI Hongzhi WANG Jianzhong LI Hong GAO 《Frontiers of Computer Science》2020,14(1):162-190

Uncertain data are data with uncertainty information, which exist widely in database applications. In recent years, uncertainty in data has brought challenges in almost all database management areas such as data modeling, query representation, query processing, and data mining. There is no doubt that uncertain data management has become a hot research topic in the field of data management. In this study, we explore problems in managing uncertain data, present state-of-the-art solutions, and provide future research directions in this area. The discussed uncertain data management techniques include data modeling, query processing, and data mining in uncertain data in the forms of relational, XML, graph, and stream.

4.

A survey on ensemble learning

Xibin DONG Zhiwen YU Wenming CAO Yifan SHI Qianli MA 《Frontiers of Computer Science》2020,14(2):241-258

Despite significant successes in knowledge discovery, traditional machine learning methods may fail to achieve satisfactory performance when dealing with complex data, such as imbalanced, high-dimensional, or noisy data. The reason is that it is difficult for these methods to capture the multiple characteristics and underlying structure of the data. In this context, how to effectively construct an efficient knowledge discovery and mining model has become an important topic in the data mining field. Ensemble learning, as one research hot spot, aims to integrate data fusion, data modeling, and data mining into a unified framework. Specifically, ensemble learning first extracts a set of features with a variety of transformations. Based on these learned features, multiple learning algorithms are used to produce weak predictive results. Finally, ensemble learning fuses the informative knowledge from these results to achieve knowledge discovery and better predictive performance via voting schemes in an adaptive way. In this paper, we review the research progress of the mainstream approaches to ensemble learning and classify them based on different characteristics. In addition, we present challenges and possible research directions for each mainstream approach, and we also introduce the combination of ensemble learning with other machine learning hot spots such as deep learning and reinforcement learning.
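The final fusion step described in this abstract can be illustrated with plain majority voting over several weak predictors. This is a minimal sketch, not the adaptive voting schemes the survey covers: the threshold classifiers, their cut-off values, and the function names are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse per-model label lists by unweighted majority vote.

    predictions: list of label lists, one per model, all the same length.
    """
    fused = []
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


def make_threshold_classifier(threshold):
    """Hypothetical weak learner: predicts 1 when the feature >= threshold."""
    return lambda x: 1 if x >= threshold else 0


def ensemble_predict(models, xs):
    """Run every model on every sample, then fuse the outputs by voting."""
    per_model = [[model(x) for x in xs] for model in models]
    return majority_vote(per_model)


# Three weak learners with different (assumed) cut-offs.
models = [make_threshold_classifier(t) for t in (0.3, 0.5, 0.7)]
labels = ensemble_predict(models, [0.2, 0.4, 0.6, 0.8])
```

Using an odd number of binary voters avoids ties; a weighted or adaptive scheme would replace the simple count with per-model weights.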

5.

Encoding of primary structures of biological macromolecules within a data mining perspective

Mondher Maddouri Mourad Elloumi 《Journal of Computer Science and Technology》2004,19(1):0-0

An encoding method has a direct effect on the quality and the representation of the knowledge discovered by data mining systems. Biological macromolecules are encoded as strings of characters, called primary structures. Because data mining systems usually use relational tables to encode data, these strings must be re-encoded and transformed into relational tables. In this paper, we carry out a comparative study of the existing static encoding methods, which are based on biologists' know-how, and our new dynamic encoding method, which is based on the construction of Discriminant and Minimal Substrings (DMS). Different classification methods are used in this study. The experimental results show that our dynamic encoding method is more efficient than the static ones for encoding biological macromolecules within a data mining perspective.
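The re-encoding of strings into a relational table can be sketched as a binary presence table, with one row per sequence and one column per substring. This is only an assumed illustration: the abstract does not describe how DMS are constructed, so the substrings below are simply given, and the function name and sample sequences are hypothetical.

```python
def encode_sequences(sequences, substrings):
    """Encode each sequence as a row of 0/1 substring-presence attributes.

    Column j of a row is 1 when substrings[j] occurs in that sequence.
    How the substrings are chosen is outside the scope of this sketch.
    """
    return [[1 if sub in seq else 0 for sub in substrings]
            for seq in sequences]


# Hypothetical primary structures and substrings, for illustration only.
sequences = ["ACGT", "GGCA", "TTAC"]
substrings = ["AC", "GG", "CA"]
table = encode_sequences(sequences, substrings)
```

Each row can then be fed to an ordinary classifier as a fixed-length attribute vector.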

6.

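Research on Key Problems of Spatial Data Mining (cited by: 1; self-citations: 0; citations by others: 1)

XIAO Yuqin JING Ning WU Qiuyun ZHONG Zhinong 《Computer Science》2003,30(9):49-53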

With the rapid development of remote sensing and mapping technology and the widespread application of spatial database systems, the spatial data collected and stored by humans keeps growing. These very large datasets far exceed human capabilities of comprehension and handling, so spatial data mining that provides valuable information is urgently needed. In this paper, methods of spatial data handling in spatial data mining are discussed from a database perspective, the key problems of spatial data mining and their solutions in current research are presented, and the relations between spatial data mining and geographical information systems are analyzed.

7.

Statistical Methods in Data Mining

WEI Ling QI Jianjun ZHANG Wenxiu 《Computer Science》2003,30(12):118-119

Data mining is the process of secondary analysis of large databases aimed at finding unsuspected relationships which are of interest or value to the database owners. We analyze the statistical methods used for classification in data mining, including preprocessing techniques, classification algorithms, and post-classification analysis. We also introduce Bayesian networks for data mining.

8.

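The Driving Effect of the Double Bases Cooperating Mechanism on the Mainstream Development of Knowledge Discovery

ZHOU Ying YANG Bingru 《Computer Science》2003,30(9):27-30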

In the form of a research report, this paper summarizes the emergence and definition of the double bases cooperating mechanism and describes its driving force and influence on many aspects of the mainstream of knowledge discovery, from structural models to algorithms and from structured data mining to complex-type data mining. Its influence also extends to the field of philosophy. More than five years have passed since the mechanism was proposed. Summarizing it makes one thing clear: its function is not simply to improve algorithms but to bring forward many new structural models and technical methods. To a large extent, it answers the urgent questions raised in the first paragraph of the paper. We may therefore say that the double bases cooperating mechanism is an important driving force for the mainstream of knowledge discovery.

9.

The Modelling of Temporal Data in the Relational Database Environment

Sun Yuan 《Journal of Computer Science and Technology》1995,10(2):163-174

This research takes the view that the modelling of temporal data is a fundamental step towards the solution of capturing semantics of time. The problems inherent in the modelling of time are not unique to database processing. The representation of temporal knowledge and temporal reasoning arises in a wide range of other disciplines. In this paper an account is given of a technique for modelling the semantics of temporal data and its associated normalization method. It discusses the techniques of processing temporal data by employing a Time Sequence (TS) data model. It shows a number of different strategies which are used to classify different data properties of temporal data, and it goes on to develop the model of temporal data and addresses issues of temporal data application design by introducing the concept of temporal data normalisation.

10.

The 9th International Conference on Machine Learning and Data Mining (in English)

《CAAI Transactions on Intelligent Systems》2012,(4):338

The aim of the conference is to bring together researchers from all over the world who deal with machine learning and data mining in order to discuss the recent status of the research and to direct further developments. Basic research papers as well as application papers are welcome.