Similar literature
 20 similar documents found
1.
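Traditional questionnaire surveys of how heavily product functions are used are limited by sample size and poorly targeted respondents. To overcome these shortcomings, a method for analyzing product function usage based on Web semantic mining is proposed. A related-word table was built using a manually corrected HowNet-based method; a product usage information system was then developed and a quantitative research model of product functions was constructed to analyze product function usage. The method was validated on a particular mobile phone, providing product function decision-making with...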

2.
康军  黄山  段宗涛  李宜修 《计算机应用》2021,41(8):2379-2385
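With the rapid development of global positioning and mobile communication technologies, massive volumes of spatio-temporal trajectory data have emerged. These data faithfully record the movement patterns and behavioral characteristics of moving objects in a spatio-temporal environment and contain rich information that is valuable for urban planning, traffic management, service recommendation, location prediction, and related fields. Applying spatio-temporal trajectory data in these fields usually requires sequential pattern mining of the data. Spatio-temporal trajectory...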

3.
In this paper, we present the results of a project that seeks to transform low-level features to a higher level of meaning. This project concerns a technique, latent semantic indexing (LSI), in conjunction with normalization and term weighting, which have been used for full-text retrieval for many years. In this environment, LSI determines clusters of co-occurring keywords, sometimes called concepts, so that a query which uses a particular keyword can then retrieve documents perhaps not containing this keyword, but containing other keywords from the same cluster. In this paper, we examine the use of this technique for content-based image retrieval, using two different approaches to image feature representation. We also study the integration of visual features and textual keywords, and the results show that it can significantly improve retrieval performance.
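
The keyword-cluster behaviour described in this abstract can be illustrated with a minimal sketch. It assumes a plain truncated SVD of a toy term-by-document matrix using NumPy; the matrix, the function names (`lsi_fit`, `lsi_query`, `cosine`) and the choice of k = 2 are illustrative assumptions, not the paper's image-feature setup, normalization, or term weighting.

```python
import numpy as np


def lsi_fit(term_doc, k):
    """Rank-k truncated SVD of a term-by-document matrix."""
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


def lsi_query(u_k, s_k, query_vec):
    """Fold a term-space query into the k-dimensional concept space."""
    return (query_vec @ u_k) / s_k


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


# Hypothetical toy corpus. Rows are terms, columns are documents:
# d0 = {car, engine}, d1 = {automobile, engine}, d2 = {flower, petal}
A = np.array([
    [1, 0, 0],  # car
    [0, 1, 0],  # automobile
    [1, 1, 0],  # engine
    [0, 0, 1],  # flower
    [0, 0, 1],  # petal
], dtype=float)

u_k, s_k, vt_k = lsi_fit(A, k=2)
doc_vecs = vt_k.T  # one row per document, in concept space

# Query containing only the term "car"
q = lsi_query(u_k, s_k, np.array([1.0, 0.0, 0.0, 0.0, 0.0]))
scores = [cosine(q, d) for d in doc_vecs]
```

In this toy case the query "car" scores d1 as highly as d0, although d1 never contains "car", because "car" and "automobile" both co-occur with "engine" and fall into the same concept.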

4.
Frequent pattern mining (FPM) is an important data mining paradigm to extract informative patterns like itemsets, sequences, trees, and graphs. However, no practical framework for integrating the FPM tasks has been attempted. In this paper, we describe the design and implementation of the Data Mining Template Library (DMTL) for FPM. DMTL utilizes a generic data mining approach, where all aspects of mining are controlled via a set of properties. It uses a novel pattern property hierarchy to define and mine different pattern types. This property hierarchy can be thought of as a systematic characterization of the pattern space, i.e., a meta-pattern specification that allows the analyst to specify new pattern types by extending this hierarchy. Furthermore, in DMTL all aspects of mining are controlled by a set of different mining properties. For example, the kind of mining approach to use, the kind of data types and formats to mine over, and the kind of back-end storage manager to use are all specified as a list of properties. This provides tremendous flexibility to customize the toolkit for various applications. Flexibility of the toolkit is exemplified by the ease with which support for a new pattern can be added. Experiments on synthetic and public datasets are conducted to demonstrate the scalability provided by the persistent back-end in the library. DMTL has been publicly released as open-source software, and has been downloaded by numerous researchers from all over the world.

5.
Improving the quality of image data through noise filtering has long attracted attention. To date, many studies have been devoted to filtering the noise inside an image, while few of them focus on filtering instance-level noise among normal images. In this paper, aiming to provide a noise filter for bag-of-features images, (1) we first propose to utilize the cosine interesting pattern to construct the noise filter; (2) we then prove that filtering noise only requires mining the shortest cosine interesting patterns, which dramatically simplifies the mining process; (3) we present an in-breadth pruning technique to further speed up the mining process. Experimental results on two real-life image datasets demonstrate the effectiveness and efficiency of our noise filtering method.

6.
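The PrefixSpan algorithm repeatedly scans projected databases to find locally frequent items and repeatedly constructs and mines large numbers of duplicate projected databases. To address these shortcomings, SPM-LIPT, a sequential pattern mining algorithm based on the position information of the last item of a sequence, is proposed. The next item of a sequential pattern is found by joining the 2-sequence position information table (LIPT), which grows the pattern while avoiding repeated scans of the projected database. Forward pruning is performed by checking the first-position information table of sequences with the same last item (SLIFPT), which eliminates the construction of many duplicate projections. Finally, experiments demonstrate the effectiveness of the algorithm.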
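
For context, the baseline that this abstract criticizes can be sketched as follows. This is a minimal PrefixSpan with pseudo-projection (offsets instead of copied databases) for sequences of single items; it is not SPM-LIPT, and the function name `prefixspan` and the toy data are illustrative assumptions. The counting loop inside `mine` is the repeated scan of the projected database that the abstract refers to.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern_tuple: support} for sequences of single items."""
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence_index, start_offset) pairs.
        # Scan every projected suffix to count locally frequent items.
        counts = defaultdict(int)
        for sid, start in projected:
            for item in set(sequences[sid][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Build the projection for the grown pattern: the suffix
            # after the first occurrence of item in each sequence.
            next_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                try:
                    pos = seq.index(item, start)
                except ValueError:
                    continue
                next_projected.append((sid, pos + 1))
            mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy database of four sequences
db = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b"]]
patterns = prefixspan(db, min_support=2)
```

Each recursive call rescans its projection once per pattern extension, which is the cost a position-table approach such as the one in the abstract aims to avoid.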

7.
A multi-step recognition process is developed for extracting compound forest cover information from manually produced scanned historical topographic maps of the 19th century. This information is a unique data source for GIS-based land cover change modeling. Based on salient features in the image, the steps to be carried out are character recognition, line detection and structural analysis of forest symbols. Semantic expansion implying the meanings of objects is applied for final forest cover extraction. The procedure achieved an accuracy of 94%, indicating a potential for automatic and robust extraction of forest cover from larger areas.

8.
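A data mining algorithm based on a hierarchical neural network model   Total citations: 1, self-citations: 0, citations by others: 1
This paper describes the data mining process used to build pattern recognition for strip steel flatness defects. To address the low recognition accuracy of ordinary neural networks, a new data mining method based on a hierarchical neural network is proposed. The method adopts a binary tree structure, narrowing the prediction range layer by layer and using multiple neural networks for recursive inference. Experimental results show that the hierarchical neural network model predicts considerably more accurately than an ordinary neural network model and can fully meet the needs of actual production.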

9.
When computationally feasible, mining huge databases produces tremendously large numbers of frequent patterns. In many cases, it is impractical to mine those datasets due to their sheer size: not only the extent of the existing patterns, but mainly the magnitude of the search space. Many approaches have suggested applying constraints to the patterns or searching for frequent patterns in parallel. So far, those approaches are still not genuinely effective for mining extremely large datasets. We propose a method that combines both strategies efficiently, i.e., mining in parallel for the set of patterns while pushing constraints. Using this approach we could mine significantly large datasets, with sizes never reported in the literature before. We are able to effectively discover frequent patterns in a billion-transaction database using a 32-processor cluster in less than an hour and a half. Recommended by: Ahmed Elmagarmid

10.
11.
Sequential pattern mining is an important data mining problem with broad applications. While current methods induce sequential patterns within a single attribute, the proposed method is able to detect them among different attributes. By incorporating the additional attributes, the sequential patterns found are richer and more informative to the user. This paper proposes a new method for inducing multi-dimensional sequential patterns with the use of the Hellinger entropy measure. A number of theorems are proposed to reduce the computational complexity of the sequential pattern systems. The proposed method is tested on some synthesized transaction databases. Dr. Chang-Hwan Lee has been a full professor at the Department of Information and Communications at DongGuk University, Seoul, Korea since 1996. He received his B.Sc. and M.Sc. in Computer Science and Statistics from Seoul National University in 1982 and 1988, respectively. He received his Ph.D. in Computer Science and Engineering from University of Connecticut in 1994. Prior to joining DongGuk University in Korea, he worked for AT&T Bell Laboratories, Middletown, USA (1994-1995). He was also a visiting professor at the University of Illinois at Urbana-Champaign (2000-2001). He is author or co-author of more than 50 refereed articles on topics such as machine learning, data mining, artificial intelligence, pattern recognition, and bioinformatics.

12.
The present paper reviews the techniques for automated extraction of information from signals. The techniques may be classified broadly into two categories: the conventional pattern recognition approach and the artificial intelligence (AI) based approach. The conventional approach comprises two methodologies: statistical and structural. The paper reviews salient issues in the application of conventional techniques for extraction of information. The systems that use the artificial intelligence approach are characterized with respect to three key properties. The basic differences between the approaches and the computational aspects are reviewed. Current trends in the use of the AI approach are indicated. Some key ideas in current literature are reviewed.

13.
Stemming is a basic operation in natural language processing (NLP) that removes derivational and inflectional affixes without performing a morphological analysis. This practice is essential to extract the root or stem. In NLP domains, the stemmer is used to improve information retrieval (IR), text classification (TC), text mining (TM) and related applications. In particular, Urdu stemmers utilize only unigram words from the input text, ignoring bigrams, trigrams, and n-gram words. To improve the process and efficiency of stemming, bigram and trigram words must be included. Despite this fact, few methods for Urdu stemming have been developed in past studies. Therefore, in this paper, we propose an improved Urdu stemmer that uses a hybrid approach divided into multi-step operations to deal with unigram, bigram, and trigram features as well. To evaluate the proposed Urdu stemming method, we have used two corpora: a word corpus and a text corpus. Moreover, two different evaluation metrics have been applied to measure the performance of the proposed algorithm. The proposed algorithm achieved an accuracy of 92.97% and a compression rate of 55%. These experimental results indicate that the proposed system can be used to increase the effectiveness and efficiency of the Urdu stemmer for better information retrieval and text mining applications.

14.
Successive stages can be distinguished in the development of the human visual system's ability to use and recognize signs. The stages involve perception of parts of objects, of whole objects, of several objects, and of their interrelations. The system of signs described in this paper was developed through experimental investigations of visual perception in adults, children, and mentally ill or brain-damaged persons.

15.
BTopicMiner: a domain-specific hot topic mining system for Chinese microblogs   Total citations: 1, self-citations: 0, citations by others: 1
李劲  张华  吴浩雄  向军 《计算机应用》2012,32(8):2346-2349
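With the rapid growth of microblogging, automatically extracting hot topics of interest to users from massive volumes of microblog messages has become a challenging research problem. To this end, a hot topic extraction algorithm for Chinese microblogs based on an extended topic model is studied and proposed. To address the data sparsity inherent in microblog messages, the algorithm first uses text clustering to merge content-related microblog messages into microblog documents. Based on the assumption that reply relationships between microblogs imply topical relatedness, the algorithm then extends the traditional latent Dirichlet allocation (LDA) topic model to model those reply relationships. Finally, mutual information (MI) is used to compute the topic vocabulary of each extracted topic for hot topic recommendation. To verify the effectiveness of the extended topic extraction model, a prototype system for domain-specific hot topic mining in Chinese microblogs, BTopicMiner, was implemented. Experimental results show that the extended topic model based on microblog reply relationships extracts hot topics from microblogs more accurately, and that the semantic similarity between the topic vocabulary computed automatically with the MI measure and manually selected hot words exceeds 75%.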
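
The abstract does not state which formulation of mutual information is used to rank topic vocabulary. As a hedged illustration only, the sketch below computes the standard mutual information between two binary events (a word occurring in a document, and the document belonging to a topic) from a 2x2 table of document counts. The function name `mutual_information` and the count arguments are assumptions made for this example.

```python
import math


def mutual_information(n11, n10, n01, n00):
    """Mutual information in bits from a 2x2 table of document counts.

    n11: word present, document in topic
    n10: word present, document not in topic
    n01: word absent,  document in topic
    n00: word absent,  document not in topic
    """
    n = n11 + n10 + n01 + n00
    cells = (
        # (joint count, word marginal, topic marginal)
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    )
    mi = 0.0
    for joint, word_total, topic_total in cells:
        if joint:  # empty cells contribute nothing
            mi += (joint / n) * math.log2(n * joint / (word_total * topic_total))
    return mi


# A word spread evenly across topics carries no information about the topic
independent = mutual_information(25, 25, 25, 25)
# A word that appears in exactly the topic's documents carries one full bit
dependent = mutual_information(50, 0, 0, 50)
```

Under this formulation, words would be ranked per topic by their MI score and the top-scoring ones kept as that topic's vocabulary.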

16.
Since its introduction, frequent-pattern mining has been the subject of numerous studies, including incremental updating. Many existing incremental mining algorithms are Apriori-based and are not easily adaptable to FP-tree-based frequent-pattern mining. In this paper, we propose a novel tree structure, called CanTree (canonical-order tree), that captures the content of the transaction database and orders tree nodes according to some canonical order. By exploiting its nice properties, the CanTree can be easily maintained when database transactions are inserted, deleted, and/or modified. For example, the CanTree does not require adjustment, merging, and/or splitting of tree nodes during maintenance. No rescan of the entire updated database or reconstruction of a new tree is needed for incremental updating. Experimental results show the effectiveness of our CanTree in the incremental mining of frequent patterns. Moreover, the applicability of CanTrees is not confined to incremental mining; CanTrees are also applicable to other frequent-pattern mining tasks including constrained mining and interactive mining. Carson K.-S. Leung received his B.Sc.(Honours), M.Sc., and Ph.D. degrees, all in computer science, from the University of British Columbia, Canada. Currently, he is an Assistant Professor at the University of Manitoba, Canada. His research interests include the areas of databases, data mining, and data warehousing. His work has been published in refereed journals and conferences such as ACM Transactions on Database Systems (TODS), IEEE International Conference on Data Engineering (ICDE), and IEEE International Conference on Data Mining (ICDM). Quamrul I. Khan received his B.Sc. degree in computer science from North South University, Bangladesh, in 2001. He then worked as a Test Engineer and a Software Engineer for a few years before he started his current M.Sc. degree program in computer science at the University of Manitoba under the academic supervision of Dr. C. K.-S. Leung. Zhan Li received her B.Eng. degree in computer engineering from Harbin Engineering University, China, in 2002. Currently, she is pursuing her M.Sc. degree in computer science at the University of Manitoba under the academic supervision of Dr. C. K.-S. Leung. Tariqul Hoque received his B.Sc. degree in computer science from North South University, Bangladesh, in 2001. Currently, he is pursuing his M.Sc. degree in computer science at the University of Manitoba under the academic supervision of Dr. C. K.-S. Leung.
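
The maintenance property described in this abstract can be sketched with a small prefix tree whose item order is fixed in advance. The sketch assumes lexicographic order as the canonical order (the abstract only says "some canonical order"); the class and method names are hypothetical, and the mining step itself is omitted.

```python
class CanTreeNode:
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


class CanTree:
    """Prefix tree over transactions sorted in a fixed canonical order."""

    def __init__(self):
        self.root = CanTreeNode(None)

    @staticmethod
    def _path(transaction):
        # Assumption: lexicographic order serves as the canonical order.
        return sorted(set(transaction))

    def insert(self, transaction):
        node = self.root
        for item in self._path(transaction):
            child = node.children.get(item)
            if child is None:
                child = CanTreeNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child

    def delete(self, transaction):
        # Assumes the transaction was inserted earlier.
        node = self.root
        for item in self._path(transaction):
            child = node.children[item]
            child.count -= 1
            if child.count == 0:
                # Counts never increase down a path, so the whole
                # subtree below is unused as well and can be dropped.
                del node.children[item]
                return
            node = child

    def support(self, item):
        """Sum the counts of every node labelled with item."""
        total, stack = 0, [self.root]
        while stack:
            node = stack.pop()
            if node.item == item:
                total += node.count
            stack.extend(node.children.values())
        return total


tree = CanTree()
for t in (["b", "a", "c"], ["a", "c"], ["a", "b"]):
    tree.insert(t)
support_c_before = tree.support("c")
tree.delete(["c", "a"])
support_c_after = tree.support("c")
```

Insertions and deletions only touch the counts along one path, so no node is reordered, merged, or split, which is consistent with the behaviour the abstract describes.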

17.
A computational algorithm is presented for the extraction of an optimal single linear feature from several Gaussian pattern classes. The algorithm minimizes the increase in the probability of misclassification in the transformed (feature) space. The general approach used in this procedure was developed in a recent paper by R. J. P. de Figueiredo (1). Numerical results on the application of this procedure to the remotely sensed data from the Purdue C1 flight line as well as Landsat data are presented. It was found that classification using the optimal single linear feature yielded a value for the probability of misclassification on the order of 30% less than that obtained by using the best single untransformed feature. The optimal single linear feature gave performance results comparable to those obtained by using the two features which maximized the average divergence. Also discussed are improvements in classification results using this method when the size of the training set is small. This work was supported by the Air Force Office of Scientific Research under Grant 75-2777 and by the National Aeronautics and Space Administration under contract NAS 9-12776.

18.
Motivated by a growing need for intelligent housing to accommodate ageing populations, we propose a novel application of intertransaction association rule (IAR) mining to detect anomalous behaviour in smart home occupants. An efficient mining algorithm that avoids the candidate generation bottleneck limiting the application of current IAR mining algorithms on smart home data sets is detailed. An original visual interface for the exploration of new and changing behaviours distilled from discovered patterns using a new process for finding emergent rules is presented. Finally, we discuss our observations on the emergent behaviours detected in the homes of two real world subjects.

19.
Change detection on spatial data is important in many applications, such as environmental monitoring. Given a set of snapshots of spatial objects at various temporal instants, a user may want to derive the changing regions between any two snapshots. Most of the existing methods have to use at least one of the original data sets to detect changing regions. However, in some important applications, due to data access constraints such as privacy concerns and limited online data availability, original data may not be available for change analysis. In this paper, we tackle the problem by proposing a simple yet effective model-based approach. In the model construction phase, data snapshots are summarized using the novel cluster-embedded decision trees as concise models. Once the models are built, the original data snapshots are no longer accessed. In the change detection phase, to mine changing regions between any two instants, we compare the two corresponding cluster-embedded decision trees. Our systematic experimental results on both real and synthetic data sets show that our approach can detect changes accurately and effectively. Irene Pekerskaya’s and Jian Pei’s research is supported partly by Natural Sciences and Engineering Research Council of Canada and National Science Foundation of the US, and a President’s Research Grant and an Endowed Research Fellowship Award at Simon Fraser University. Ke Wang’s research is supported partly by Natural Sciences and Engineering Research Council of Canada. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

20.