期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Ontology-based similarity for product information retrieval

Suriati Akmal Li-Hsing Shih Rafael Batres 《Computers in Industry》2014

Product development of today is becoming increasingly knowledge intensive. Specifically, design teams face considerable challenges in making effective use of increasing amounts of information. In order to support product information retrieval and reuse, one approach is to use case-based reasoning (CBR) in which problems are solved “by using or adapting solutions to old problems.” In CBR, a case includes both a representation of the problem and a solution to that problem. Case-based reasoning uses similarity measures to identify cases which are more relevant to the problem to be solved. However, most non-numeric similarity measures are based on syntactic grounds, which often fail to produce good matches when confronted with the meaning associated to the words they compare. To overcome this limitation, ontologies can be used to produce similarity measures that are based on semantics. This paper presents an ontology-based approach that can determine the similarity between two classes using feature-based similarity measures that replace features with attributes. The proposed approach is evaluated against other existing similarities. Finally, the effectiveness of the proposed approach is illustrated with a case study on product–service–system design problems. 相似文献

2.

Empirical comparison of fast partitioning-based clustering algorithms for large data sets 总被引：3，自引：0，他引：3

Chih-Ping Wei Yen-Hsien Lee Che-Ming Hsu 《Expert systems with applications》2003,24(4):351-363

Several fast algorithms for clustering very large data sets have been proposed in the literature, including CLARA, CLARANS, GAC-R³, and GAC-RAR_w. CLARA is a combination of a sampling procedure and the classical PAM algorithm, while CLARANS adopts a serial randomized search strategy to find the optimal set of medoids. GAC-R³ and GAC-RAR_w exploit genetic search heuristics for solving clustering problems. In this research, we conducted an empirical comparison of these four clustering algorithms over a wide range of data characteristics described by data size, number of clusters, cluster distinctness, cluster asymmetry, and data randomness. According to the experimental results, CLARANS outperforms its counterparts both in clustering quality and execution time when the number of clusters increases, clusters are more closely related, more asymmetric clusters are present, or more random objects exist in the data set. With a specific number of clusters, CLARA can efficiently achieve satisfactory clustering quality when the data size is larger, whereas GAC-R³ and GAC-RAR_w can achieve satisfactory clustering quality and efficiency when the data size is small, the number of clusters is small, and clusters are more distinct and symmetric. 相似文献

3.

A rough set-based association rule approach implemented on exploring beverages product spectrum

Shu-Hsien Liao Yin-Ju Chen 《Applied Intelligence》2014,40(3):464-478

When items are classified according to whether they have more or less of a characteristic, the scale used is referred to as an ordinal scale. The main characteristic of the ordinal scale is that the categories have a logical or ordered relationship to each other. Thus, the ordinal scale data processing is very common in marketing, satisfaction and attitudinal research. This study proposes a new data mining method, using a rough set-based association rule, to analyze ordinal scale data, which has the ability to handle uncertainty in the data classification/sorting process. The induction of rough-set rules is presented as method of dealing with data uncertainty, while creating predictive if—then rules that generalize data values, for the beverage market in Taiwan. Empirical evaluation reveals that the proposed Rough Set Associational Rule (RSAR), combined with rough set theory, is superior to existing methods of data classification and can more effectively address the problems associated with ordinal scale data, for exploration of a beverage product spectrum. 相似文献

4.

Biclustering in data mining

Stanislav Busygin Oleg Prokopyev Panos M. Pardalos 《Computers & Operations Research》2008

Biclustering consists in simultaneous partitioning of the set of samples and the set of their attributes (features) into subsets (classes). Samples and features classified together are supposed to have a high relevance to each other. In this paper we review the most widely used and successful biclustering techniques and their related applications. This survey is written from a theoretical viewpoint emphasizing mathematical concepts that can be met in existing biclustering techniques. 相似文献

5.

基于数据挖掘的信号分离技术研究

张少刚《自动化与仪器仪表》2010,(4):21-22

信号的准确可靠接收是非常重要的,然而信号在产生和传输过程中,要受到各种干扰噪声的污染,需要进行信号的分离。总结了数据挖掘中聚类算法的研究现状,分析比较了它们的差异。提出了基于数据挖掘的信号处理算法,通过实例证明了该技术的实用性和有效性。相似文献

6.

A data mining approach to product assortment and shelf space allocation

Mu-Chen Chen Chia-Ping Lin 《Expert systems with applications》2007,32(4):976-986

In retailing, a variety of products compete to be displayed in the limited shelf space since it has a significant effect on demands. To affect customers’ purchasing decisions, retailers properly make decisions about which products to display (product assortment) and how much shelf space to allocate the stocked products (shelf space allocation). In the previous studies, researchers usually employed the space elasticity to optimize product assortment and space allocation models. The space elasticity is usually used to construct the relationship between shelf space and product demand. However, the large number of parameters requiring to estimate and the he non-linear nature of space elasticity can reduce the efficacy of the space elasticity based models. This paper utilizes a popular data mining approach, association rule mining, instead of space elasticity to resolve the product assortment and allocation problems in retailing. In this paper, the multi-level association rule mining is applied to explore the relationships between products as well as between product categories. Because association rules are obtained by directly analyzing the transaction database, they can generate more reliable information to shelf space management. 相似文献

7.

Implement web learning environment based on data mining 总被引：2，自引：0，他引：2

Qinglin Guo Ming Zhang 《Knowledge》2009,22(6):439-442

The need for providing learners with web-based learning content that match their accessibility needs and preferences, as well as providing ways to match learning content to user’s devices has been identified as an important issue in accessible educational environment. For a web-based open and dynamic learning environment, personalized support for learners becomes more important. In order to achieve optimal efficiency in a learning process, individual learner’s cognitive learning style should be taken into account. Due to different types of learners using these systems, it is necessary to provide them with an individualized learning support system. However, the design and development of web-based learning environments for people with special abilities has been addressed so far by the development of hypermedia and multimedia based on educational content. In this paper a framework of individual web-based learning system is presented by focusing on learner’s cognitive learning process, learning pattern and activities, as well as the technology support needed. Based on the learner-centered mode and cognitive learning theory, we demonstrate an online course design and development that supports the students with the learning flexibility and the adaptability. The proposed framework utilizes data mining algorithm for representing and extracting a dynamic learning process and learning pattern to support students’ deep learning, efficient tutoring and collaboration in web-based learning environment. And experiments do prove that it is feasible to use the method to develop an individual web-based learning system, which is valuable for further study in more depth. 相似文献

8.

An effective parallel approach for genetic-fuzzy data mining

《Expert systems with applications》2014,41(2):655-662

Data mining is most commonly used in attempts to induce association rules from transaction data. In the past, we used the fuzzy and GA concepts to discover both useful fuzzy association rules and suitable membership functions from quantitative values. The evaluation for fitness values was, however, quite time-consuming. Due to dramatic increases in available computing power and concomitant decreases in computing costs over the last decade, learning or mining by applying parallel processing techniques has become a feasible way to overcome the slow-learning problem. In this paper, we thus propose a parallel genetic-fuzzy mining algorithm based on the master–slave architecture to extract both association rules and membership functions from quantitative transactions. The master processor uses a single population as a simple genetic algorithm does, and distributes the tasks of fitness evaluation to slave processors. The evolutionary processes, such as crossover, mutation and production are performed by the master processor. It is very natural and efficient to run the proposed algorithm on the master–slave architecture. The time complexities for both sequential and parallel genetic-fuzzy mining algorithms have also been analyzed, with results showing the good effect of the proposed one. When the number of generations is large, the speed-up can be nearly linear. The experimental results also show this point. Applying the master–slave parallel architecture to speed up the genetic-fuzzy data mining algorithm is thus a feasible way to overcome the low-speed fitness evaluation problem of the original algorithm. 相似文献

9.

Automatic abstraction in reinforcement learning using data mining techniques

Ghorban Mohammad 《Robotics and Autonomous Systems》2009,57(11):1119-1128

In this paper, we used data mining techniques for the automatic discovering of useful temporal abstraction in reinforcement learning. This idea was motivated by the ability of data mining algorithms in automatic discovering of structures and patterns, when applied to large data sets. The state transitions and action trajectories of the learning agent are stored as the data sets for data mining techniques. The proposed state clustering algorithms partition the state space to different regions. Policies for reaching different parts of the space are separately learned and added to the model in a form of options (macro-actions). The main idea of the proposed action sequence mining is to search for patterns that occur frequently within an agent’s accumulated experience. The mined action sequences are also added to the model in a form of options. Our experiments with different data sets indicate a significant speedup of the Q-learning algorithm using the options discovered by the state clustering and action sequence mining algorithms. 相似文献

10.

A data mining approach to database compression

Chin-Feng Lee S. Wesley Changchien Wei-Tse Wang Jau-Ji Shen 《Information Systems Frontiers》2006,8(3):147-161

Data mining can dig out valuable information from databases to assist a business in approaching knowledge discovery and improving business intelligence. Database stores large structured data. The amount of data increases due to the advanced database technology and extensive use of information systems. Despite the price drop of storage devices, it is still important to develop efficient techniques for database compression. This paper develops a database compression method by eliminating redundant data, which often exist in transaction database. The proposed approach uses a data mining structure to extract association rules from a database. Redundant data will then be replaced by means of compression rules. A heuristic method is designed to resolve the conflicts of the compression rules. To prove its efficiency and effectiveness, the proposed approach is compared with two other database compression methods. Chin-Feng Lee is an associate professor with the Department of Information Management at Chaoyang University of Technology, Taiwan, R.O.C. She received her M.S. and Ph.D. degrees in 1994 and 1998, respectively, from the Department of Computer Science and Information Engineering at National Chung Cheng University. Her current research interests include database design, image processing and data mining techniques. S. Wesley Changchien is a professor with the Institute of Electronic Commerce at National Chung-Hsing University, Taiwan, R.O.C. He received a BS degree in Mechanical Engineering (1989) and completed his MS (1993) and Ph.D. (1996) degrees in Industrial Engineering at State University of New York at Buffalo, USA. His current research interests include electronic commerce, internet/database marketing, knowledge management, data mining, and decision support systems. Jau-Ji Shen received his Ph.D. degree in Information Engineering and Computer Science from National Taiwan University at Taipei, Taiwan in 1988. From 1988 to 1994, he was the leader of the software group in Institute of Aeronautic, Chung-Sung Institute of Science and Technology. He is currently an associate professor of information management department in the National Chung Hsing University at Taichung. His research areas focus on the digital multimedia, database and information security. His current research areas focus on data engineering, database techniques and information security. Wei-Tse Wang received the B.A. (2001) and M.B.A (2003) degrees in Information Management at Chaoyang University of Technology, Taiwan, R.O.C. His research interests include data mining, XML, and database compression. 相似文献

11.

基于数据挖掘技术的自适应入侵检测系统模型

李涛《计算机工程与设计》2010,31(6)

为了提高数据库系统的安全,将改进的数据预处理算法和改进的Apriori算法应用于数据库入侵检测系统,提出一个基于数据挖掘技术的自适应的数据库入侵检测系统模型.模型中,针对滥用检测规则生成的局限性,提出将改进算法的中间结果运用到滥用检测规则的生成中,不断完善滥用检测规则库,结合滥用检测和异常检测的特点,先进行滥用检测,再进行异常检测,降低漏检率和误警率.检测结果表明,不断更新规则库,能够提高系统的自适应性. 相似文献

12.

改进数据挖掘算法在入侵检测系统中的应用

赵艳君魏明军《计算机工程与应用》2013,49(18):69-72

针对已有检测机制存在的对于未知攻击行为无能为力、漏报率较高、检测效率低以及缺少规则库自动扩充机制等问题,结合数据挖掘技术的相关知识,设计了基于数据挖掘的改进网络入侵检测系统模型。模型中选取聚类分析K-means算法和关联规则挖掘Apriori算法,并对其进行改进。用改进的K-means算法实现正常行为类及数据分离模块,用改进Apriori算法实现规则库的自动扩充功能,并通过实验验证了两个算法的功能。相似文献

13.

Ontology-based data integration and decision support for product e-Design 总被引：1，自引：0，他引：1

Xiaomeng Janis 《Robotics and Computer》2009,25(6):863

Currently, computer-based support tools are widely used to facilitate the design process and have the potential to reduce design time, decrease product cost and enhance product quality. Although there are promising information systems to manage product lifecycle and product-related data, including product data management (PDM) and product lifecycle management (PLM), significant limitations still exist, where information required to make decisions may not be available, may be lacking consistency, and may not be expressed in a general way for sharing between systems. Moreover, there remains little support for decision making that considers multiple complex technical and economical criteria, relations, and objectives in product design. To address these problems, this paper presents a framework for an ontology-based data integration and decision support environment for e-Design. The framework can guide designers in the design process, can make recommendations, and can provide decision support for parameter adjustments. 相似文献

14.

A systematic data-mining-based methodology for product family design and product configuration

《Advanced Engineering Informatics》2021

Product family design and product configuration based on data mining technology is identified as an intelligent and automated means to improve the efficiency of product development. However, few of previous literatures have proposed systematic product family design method based on data mining technology. To make up for this deficiency, this research put forward a systematic data-mining-based method for product family design and product configuration. First, the customer requirement information and product engineering information in the historical order are formatted into structural data. Second, principal component analysis is performed on historical orders to extract the customers' differentiated needs. Third, association rule algorithm is introduced to mine the rules between differentiated needs and module instances in the historical orders, thus obtained the configuration knowledge between customer needs and product engineer. Forth, the mined rules are used to construct association rule-based classifier (CBA) that is employed to sort out the best product configuration schemes as popular product variants. Fifth, sequence alignment technique is employed to identify modules for popular product variants, so that the module instances are divided into optional, common and special module, respectively, thereby the product platform is generated based on common modules. Finally, according to new customer needs, the CBA classifier is used to recommend the best configuration schemes, and then popular product variants are configured based on the product platform. The feasibility of the proposed method is demonstrated by the product family design example of desktop computer hosts. 相似文献

15.

一种分层聚类模型及其在电信行业的应用

苏进张佑生《计算机工程》2005,31(22):110-112

提出一种分层聚类算法,该算法可识别任意形状、大小的类,在某电信企业的客户分析中取得了较好的结果。算法首先从不同的角度对电信客户进行聚类或分类,然后以这些类为基础,实行自底向上的层次聚类得到最终的聚类结果。算法执行效率高,适合大规模数据的聚类问题。相似文献

16.

关联规则挖掘Apriori算法的研究与改进

王铮周国光《网络安全技术与应用》2011,(4):61-62

本文采用一种基于布尔矩阵的频繁集挖掘算法。该算法直接通过支持矩阵行向量的按位与运算来找出频繁集,而不需要Apriori算法的连接和剪枝,通过不断压缩支持矩阵,不仅节约了存储空间,还提高了算法的效率。相似文献

17.

Toward data mining engineering: A software engineering approach

Oscar Marbán Javier Segovia Ernestina Menasalvas Covadonga Fernández-Baizán 《Information Systems》2009

The number, variety and complexity of projects involving data mining or knowledge discovery in databases activities have increased just lately at such a pace that aspects related to their development process need to be standardized for results to be integrated, reused and interchanged in the future. Data mining projects are quickly becoming engineering projects, and current standard processes, like CRISP-DM, need to be revisited to incorporate this engineering viewpoint. This is the central motivation of this paper that makes the point that experience gained about the software development process over almost 40 years could be reused and integrated to improve data mining processes. Consequently, this paper proposes to reuse ideas and concepts underlying the IEEE Std 1074 and ISO 12207 software engineering model processes to redefine and add to the CRISP-DM process and make it a data mining engineering standard. 相似文献

18.

On data mining,compression, and Kolmogorov complexity 总被引：1，自引：1，他引：0

Christos Faloutsos Vasileios Megalooikonomou 《Data mining and knowledge discovery》2007,15(1):3-20

Will we ever have a theory of data mining analogous to the relational algebra in databases? Why do we have so many clearly different clustering algorithms? Could data mining be automated? We show that the answer to all these questions is negative, because data mining is closely related to compression and Kolmogorov complexity; and the latter is undecidable. Therefore, data mining will always be an art, where our goal will be to find better models (patterns) that fit our datasets as best as possible. 相似文献

19.

Adjusting Fuzzy Similarity Functions for use with standard data mining tools

Avichai Meged Author VitaeRoy GelbardAuthor Vitae 《Journal of Systems and Software》2011,84(12):2374-2383

Data mining is crucial in many areas and there are ongoing efforts to improve its effectiveness in both the scientific and the business world. There is an obvious need to improve the outcomes of mining techniques such as clustering and other classifiers without abandoning the standard mining tools that are popular with researchers and practitioners alike. Currently, however, standard tools do not have the flexibility to control similarity relations between attribute values, a critical feature in improving mining-clustering results. The study presented here introduces the Similarity Adjustment Model (SAM) where adjusted Fuzzy Similarity Functions (FSF) control similarity relations between attribute values and hence ameliorate clustering results obtained with standard data mining tools such as SPSS and SAS. The SAM draws on principles of binary database representation models and employs FSF adjusted via an iterative learning process that yields improved segmentation regardless of the choice of mining-clustering algorithm. The SAM model is illustrated and evaluated on three common datasets with the standard SPSS package. The datasets were run with several clustering algorithms. Comparison of “Naïve” runs (which used original data) and “Fuzzy” runs (which used SAM) shows that the SAM improves segmentation in all cases. 相似文献

20.

Novel techniques and an efficient algorithm for closed pattern mining

《Expert systems with applications》2014,41(11):5105-5114

In this paper we show that frequent closed itemset mining and biclustering, the two most prominent application fields in pattern discovery, can be reduced to the same problem when dealing with binary (0–1) data. FCPMiner, a new powerful pattern mining method, is then introduced to mine such data efficiently. The uniqueness of the proposed method is its extendibility to non-binary data. The mining method is coupled with a novel visualization technique and a pattern aggregation method to detect the most meaningful, non-overlapping patterns. The proposed methods are rigorously tested on both synthetic and real data sets. 相似文献