Found 20 similar documents (search time: 0 ms)
1.
The increasing availability of music data via the Internet creates demand for efficient search through music files. Users' interests include melody tracking, harmonic structure analysis, timbre identification, and so on. Using an illustrative example, we show why content-based search is needed for music data and what difficulties must be overcome to build an intelligent music information retrieval system.
2.
3.
Textual Data Mining to Support Science and Technology Management (cited by: 10; self-citations: 0; cited by others: 10)
Paul Losiewicz, Douglas W. Oard, Ronald N. Kostoff. Journal of Intelligent Information Systems, 2000, 15(2): 99-119
This paper surveys applications of data mining techniques to large text collections and illustrates how those techniques can be used to support the management of science and technology research. Specific issues that arise repeatedly in the conduct of research management are described, and a textual data mining architecture that extends a classic paradigm for knowledge discovery in databases is introduced. That architecture integrates information retrieval from text collections, information extraction to obtain data from individual texts, data warehousing for the extracted data, data mining to discover useful patterns in the data, and visualization of the resulting patterns. At the core of this architecture is a broad view of data mining (the process of discovering patterns in large collections of data), and that step is described in some detail. The final section of the paper illustrates how these ideas can be applied in practice, drawing upon examples from the recently completed first phase of the textual data mining program at the Office of Naval Research. The paper concludes by identifying some research directions that offer significant potential for improving the utility of textual data mining for research management applications.
4.
Using Rough Sets with Heuristics for Feature Selection (cited by: 32; self-citations: 0; cited by others: 32)
Practical machine learning algorithms are known to degrade in performance (prediction accuracy) when faced with many features (the term "attribute" is sometimes used instead of "feature") that are not necessary for rule discovery. To cope with this problem, many methods for selecting a subset of features have been proposed. Two typical ones are the filter approach, which selects a feature subset in a preprocessing step, and the wrapper approach, which selects an optimal feature subset from the space of possible subsets using the induction algorithm itself as part of the evaluation function. Although the filter approach is faster, it is somewhat blind because it does not consider the performance of induction. The wrapper approach, on the other hand, can obtain optimal feature subsets, but it is hard to use because of its time and space complexity. In this paper, we propose an algorithm that uses rough set theory with greedy heuristics for feature selection. Features are selected much as in the filter approach, but the evaluation criterion is related to the performance of induction. That is, we select the features that do not damage the performance of induction.
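The greedy, rough-set-style selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a common QuickReduct-style heuristic that repeatedly adds the feature which most enlarges the positive region (the rows whose feature values determine the decision unambiguously). The function names and the toy table are hypothetical.

```python
from collections import defaultdict

def positive_region_size(rows, features, decision):
    """Count rows whose values on `features` determine the decision uniquely."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[f] for f in features)].append(row[decision])
    return sum(len(labels) for labels in blocks.values() if len(set(labels)) == 1)

def greedy_select(rows, candidates, decision):
    """Add the feature that most enlarges the positive region until it
    matches the positive region of the full candidate set."""
    selected = []
    target = positive_region_size(rows, candidates, decision)
    current = positive_region_size(rows, selected, decision)
    while current < target:
        best = max((f for f in candidates if f not in selected),
                   key=lambda f: positive_region_size(rows, selected + [f], decision))
        selected.append(best)
        current = positive_region_size(rows, selected, decision)
    return selected

# Hypothetical toy decision table: feature "b" alone determines decision "d".
rows = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
print(greedy_select(rows, ["a", "b", "c"], "d"))
```

The stopping rule keeps the selected subset as discriminating as the full feature set on the given table, which is one way to read "features that do not damage the performance of induction".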
5.
6.
Granular Computing: a Rough Set Approach (cited by: 4; self-citations: 0; cited by others: 4)
7.
Tej Anand. Journal of Intelligent Information Systems, 1995, 4(1): 27-37
The Nielsen Opportunity Explorer™ product can be used by sales and trade marketing personnel within consumer packaged goods manufacturers to understand how their products are performing in the marketplace and to find opportunities to sell more product, more profitably, to retailers. Opportunity Explorer uses data collected at point-of-sale terminals and by A. C. Nielsen auditors. It uses a knowledge base of market research expertise to analyze large databases and generate interactive reports using knowledge discovery templates, converting a large space of data into concise, inter-linked information frames. Each information frame addresses specific business issues and leads the user to seek related information by means of dynamically created hyperlinks.
8.
9.
10.
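Functional Dependency Discovery for Distributed Big Data (cited by: 1; self-citations: 0; cited by others: 1)
In relational databases, functional dependency discovery is an important database analysis technique that is widely applied in knowledge discovery, database semantic analysis, data quality assessment, and database design. Existing functional dependency discovery algorithms mainly target centralized data and are usually suitable only for relatively small data sets. In the big data setting, discovering functional dependencies in a distributed environment is more challenging. This paper proposes an algorithm for functional dependency discovery over big data in a distributed environment. Its basic idea is to first discover functional dependencies in parallel at each node using local data and to prune the set of candidate functional dependencies based on those results. The candidates are then grouped according to features of the left-hand side (LHS) of the functional dependencies, and a distributed discovery algorithm is run in parallel for each group of candidates, finally yielding all functional dependencies. The number of candidate functional dependencies that can be checked under different groupings is analyzed, and both data transfer volume and load balancing are taken into account during execution. Experiments on real big data sets show that the proposed algorithm is markedly more efficient than existing methods.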
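A minimal single-process sketch of the two-phase idea in this abstract follows. It rests on one fact: a functional dependency that fails on any node's local data also fails globally, so local checks can prune candidates, while survivors still need a global check because two nodes may disagree with each other. The names `holds` and `discover`, the in-memory "partitions", and the toy rows are hypothetical, and merging all rows stands in for the distributed step, which the abstract does not detail.

```python
from collections import defaultdict

def holds(rows, lhs, rhs):
    """True if every LHS value combination maps to a single RHS value in `rows`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if seen.setdefault(key, row[rhs]) != row[rhs]:
            return False
    return True

def discover(partitions, candidates):
    # Phase 1: local pruning. A candidate violated on any node is dropped.
    survivors = [(lhs, rhs) for lhs, rhs in candidates
                 if all(holds(p, lhs, rhs) for p in partitions)]
    # Phase 2: group survivors by LHS, then verify each group globally.
    # Merging the partitions is a stand-in for the distributed check.
    by_lhs = defaultdict(list)
    for lhs, rhs in survivors:
        by_lhs[lhs].append(rhs)
    merged = [row for p in partitions for row in p]
    result = []
    for lhs, rhs_list in by_lhs.items():
        result.extend((lhs, rhs) for rhs in rhs_list if holds(merged, lhs, rhs))
    return result

# Hypothetical data spread over two "nodes".
partitions = [
    [{"zip": 10001, "city": "NYC", "state": "NY"},
     {"zip": 10002, "city": "NYC", "state": "NY"}],
    [{"zip": 10001, "city": "New York", "state": "NY"},
     {"zip": 94105, "city": "SF", "state": "CA"}],
]
candidates = [(("zip",), "city"), (("zip",), "state"), (("state",), "zip")]
print(discover(partitions, candidates))
```

In this toy run, `state -> zip` is pruned locally on the first node, `zip -> city` passes both local checks but fails once the nodes are compared, and only `zip -> state` survives.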
11.
12.
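常梦星. 电脑编程技巧与维护, 2010, (14): 92-94
Taking different application backgrounds and classification targets into account, this paper surveys key techniques for content-based audio classification in multimedia databases, such as feature extraction and classifier design. It analyzes the strengths and weaknesses of the various content-based audio classification methods, discusses open problems, and points out directions for future research.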
13.
Kohonen's Self-Organizing Map (SOM) is combined with the Redundant Hash Addressing (RHA) principle. The SOM encodes the input feature vector sequence into a sequence of best-matching unit (BMU) indices, and the RHA principle is then used to associate the BMU index sequence with the dictionary items. This provides a fast alternative to dynamic programming (DP) based methods for comparing and matching temporal sequences. Experiments include music retrieval and speech recognition. The separation of the classes can be improved by error-corrective learning. Comparisons to DP-based methods are presented.
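The encode-then-look-up pipeline in this abstract can be sketched roughly as follows. This is a simplification under stated assumptions, not the authors' method: the codebook is taken as given rather than trained as a SOM, and the RHA step is reduced to plain n-gram voting over BMU index sequences, ignoring the positional constraints of full RHA. All names and data are hypothetical.

```python
from collections import Counter, defaultdict

def bmu_sequence(vectors, codebook):
    """Map each feature vector to the index of its nearest codebook vector."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(range(len(codebook)), key=lambda i: dist2(vec, codebook[i]))
            for vec in vectors]

def ngrams(seq, n=2):
    """All contiguous n-grams of a sequence, as hashable tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def build_index(dictionary, n=2):
    """Hash every n-gram of every dictionary item to the items containing it."""
    index = defaultdict(set)
    for name, seq in dictionary.items():
        for gram in ngrams(seq, n):
            index[gram].add(name)
    return index

def best_match(query_seq, index, n=2):
    """Each query n-gram votes for the dictionary items that contain it."""
    votes = Counter()
    for gram in ngrams(query_seq, n):
        for name in index.get(gram, ()):
            votes[name] += 1
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical one-dimensional codebook and two stored BMU index sequences.
codebook = [(0.0,), (1.0,), (2.0,)]
dictionary = {"tune_a": [0, 1, 2, 1], "tune_b": [2, 2, 0, 0]}
index = build_index(dictionary)

query = bmu_sequence([(0.1,), (0.9,), (2.2,), (2.1,)], codebook)
print(query, best_match(query, index))
```

The lookup cost grows with the number of query n-grams rather than with the product of sequence lengths, which is the reason hashing can be faster than DP-based alignment.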
14.
15.
The need for content-based access to image and video information from media archives has captured the attention of researchers in recent years. Research efforts have led to the development of methods that provide access to image and video data. These methods, which have their roots in pattern recognition, are used to determine the similarity of visual information content extracted from low-level features. The features are then clustered to generate database indices. This paper presents a comprehensive survey of the use of these pattern recognition methods, which enable image and video retrieval by content.
16.
17.
18.
LEARNING IN RELATIONAL DATABASES: A ROUGH SET APPROACH (cited by: 49; self-citations: 0; cited by others: 49)
Knowledge discovery in databases, or data mining, is an important direction in the development of data and knowledge-based systems. Because of the huge amount of data stored in large numbers of existing databases, and because the amount of data generated in electronic form is growing rapidly, it is necessary to develop efficient methods to extract knowledge from databases. An attribute-oriented rough set approach has been developed for knowledge discovery in databases. The method integrates the machine-learning paradigm, especially learning-from-examples techniques, with rough set techniques. An attribute-oriented concept tree ascension technique is first applied in generalization, which substantially reduces the computational complexity of database learning processes. Then the cause-effect relationship among the attributes in the database is analyzed using rough set techniques, and the unimportant or irrelevant attributes are eliminated. Thus concise and strong rules with little or no redundant information can be learned efficiently. Our study shows that attribute-oriented induction combined with rough set theory provides an efficient and effective mechanism for knowledge discovery in database systems.
19.
20.
The Internet has solved the age-old problem of network connectivity, thus enabling potential access to, and data sharing among, large numbers of databases. However, enabling users to discover useful information requires an adequate metadata infrastructure that must scale with the diversity and dynamism of both users' interests and Internet-accessible databases. In this paper, we present a model that partitions the information space into distributed, highly specialized domain ontologies. We also introduce inter-ontology relationships to cater for user-based interests across ontologies defined over Internet databases, and we describe an architecture that implements these two fundamental constructs over Internet databases. The aim of the proposed model and architecture is to eventually facilitate data discovery and sharing for Internet databases.