Similar documents
 Found 20 similar documents (search time: 15 ms)
1.
With the growth of the Internet, databases are becoming larger and more varied, so finding significant information in a database in the shortest possible time is an important problem. In music databases, a repeating pattern is an important feature of a music object; it is commonly used to analyze the repeated parts of music data and to look for themes. Most repeating patterns are key melodies or passages that people find easy to recognize and remember. We can therefore use themes or repeating patterns to construct indices that speed up query execution for music retrieval. Non-trivial repeating patterns exclude those patterns that are entirely contained in other, longer patterns, which reduces redundancy among the repeating patterns and saves index space. Most existing algorithms for finding non-trivial repeating patterns in a music object are time consuming. In this research, we apply a true suffix tree approach to discover the non-trivial repeating patterns of a music object, which efficiently addresses the costs in processing time and memory space. In the general case, our proposed scheme can extract non-trivial repeating patterns in linear time.
Lin-huang Chang
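To make the notion of a non-trivial repeating pattern concrete, here is a minimal brute-force sketch. It is not the suffix tree method the abstract describes, and it runs in far worse than linear time. It assumes the common definition that a repeating pattern is trivial when a longer repeating pattern contains it with the same frequency; the abstract states only the containment part, so the frequency condition is an assumption. The function names and the note string are hypothetical.

```python
from collections import Counter


def repeating_patterns(notes, min_freq=2):
    """Count every substring of `notes`; keep those seen at least min_freq times."""
    counts = Counter(
        notes[i:j]
        for i in range(len(notes))
        for j in range(i + 1, len(notes) + 1)
    )
    return {p: f for p, f in counts.items() if f >= min_freq}


def non_trivial(patterns):
    """Drop a pattern when a longer pattern with the same frequency contains it.

    Assumption: this frequency-based test is one common reading of
    "contained in other longer patterns"; the abstract does not spell it out.
    """
    return {
        p: f
        for p, f in patterns.items()
        if not any(
            len(q) > len(p) and p in q and patterns[q] == f for q in patterns
        )
    }


if __name__ == "__main__":
    melody = "ABCDEFGHABCDEFGHIJABC"  # hypothetical note string
    print(non_trivial(repeating_patterns(melody)))
```

Every proper substring of "ABCDEFGH" also repeats, but only "ABC" occurs more often than the longer patterns that contain it, so the index needs just two entries instead of dozens.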

2.
Effective and efficient mining of music structure patterns from music query data is one of the most interesting issues in multimedia data mining. In this paper, we introduce a new kind of pattern, called the emerging melody structure (EMS), for knowledge discovery from music melody streams. EMSs are defined as music data items with melody strings whose support increases significantly from one sliding window to another in a stream of melody sequences. The discovered EMSs can be used to predict future trends for online music style recommendation, to personalize the priority of Web music download services, to help composers write new music, or to help service providers collect more similar music. We therefore propose an efficient data mining approach, called MEMSA (Mining Emerging Melody Structure Algorithm), to discover all EMSs from streaming music query data over sliding windows. In the MEMSA framework, a prefix tree-based data structure, called the EMS-tree (Emerging Melody Structure tree), is constructed to maintain temporal EMSs effectively. Experimental results show that MEMSA is an efficient algorithm for mining all EMSs from streaming melody sequences.
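The idea of support that "increases significantly from one sliding window to another" can be sketched as follows. This is a simplified illustration, not MEMSA or the EMS-tree: it recomputes support from scratch for two windows and flags melody strings whose support grew by at least a chosen factor. The names `support` and `emerging`, the thresholds, and the sample queries are all hypothetical.

```python
from collections import Counter


def support(window):
    """Fraction of queries in the window that contain each melody string."""
    counts = Counter(m for query in window for m in set(query))
    return {m: c / len(window) for m, c in counts.items()}


def emerging(prev_window, curr_window, min_growth=2.0, min_support=0.5):
    """Melody strings that are frequent now and grew by at least min_growth.

    Assumption: growth is measured as the ratio of current to previous
    support; strings unseen in the previous window count as emerging.
    """
    prev, curr = support(prev_window), support(curr_window)
    found = []
    for melody, s in curr.items():
        if s < min_support:
            continue
        p = prev.get(melody, 0.0)
        if p == 0.0 or s / p >= min_growth:
            found.append(melody)
    return sorted(found)


if __name__ == "__main__":
    prev = [["do-re", "mi-fa"], ["do-re"], ["sol-la"], ["do-re"]]
    curr = [["mi-fa", "sol-la"], ["mi-fa"], ["mi-fa", "do-re"], ["sol-la"]]
    print(emerging(prev, curr))
```

A streaming implementation would update counts incrementally as the window slides rather than rescanning both windows, which is the role the abstract assigns to the prefix tree.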

3.
The management of large collections of music data in multimedia databases has received much attention in the past few years. In most current work, researchers extract features such as melodies, rhythms, and chords from the music data and develop indices that help to retrieve the relevant music quickly. Several reports have pointed out that these music features can be transformed and represented as music feature strings or numeric values so that indices can be created for music retrieval. However, only a small number of existing approaches introduce multi-feature index structures for music queries, while most others develop single-feature indices. The existing multi-feature music index structures consume a lot of memory and lack scalability. In this paper, we propose a two-tier music index structure, which is an efficient and scalable approach to multi-feature music indexing. Our experimental results show that this new approach outperforms existing multi-feature index schemes.

4.
Recently, research on text mining has attracted a lot of attention from both industry and academia. Text mining is concerned with discovering unknown patterns or knowledge in a large text repository. The problem is not easy to tackle because of the semi-structured or even unstructured nature of the texts under consideration. Many approaches have been devised for mining various kinds of knowledge from texts. One important aspect of text mining is automatic text categorization, which assigns a text document to a predefined category if the document falls under the theme of that category. Traditionally, the categories are arranged hierarchically to achieve effective searching and indexing as well as easy comprehension for human beings. The category themes and their hierarchical structures have mostly been determined by human experts. In this work, we developed an approach that automatically generates category themes and reveals the hierarchical structure among them. We also used the generated structure to categorize text documents. The document collection was trained by a self-organizing map to form two feature maps. These maps were then analyzed to obtain the category themes and their structure. Although the test corpus contains documents written in Chinese, the proposed approach can be applied to documents written in any language whose documents can be transformed into a list of separated terms.

5.

Over the past decades, a large number of music pieces have been uploaded to the Internet every day through social networks that concentrate on music and video, such as Last.fm, Spotify and YouTube, and the amount of music data keeps growing. With so much online music data, users struggle every day to find the music pieces they are interested in. Music search and recommendation systems help users find their favorite content in a huge repository of music. However, social influence, which contains rich information about similar interests between users and about users' frequent correlated actions, has been largely ignored in previous music recommender systems. In this work, we explore the effects of social influence on developing effective music recommender systems and focus on the problem of social influence aware music recommendation, which aims at recommending a list of music tracks to a target user. To exploit social influence, we first construct a heterogeneous social network, propose a novel meta path-based similarity measure called WPC, and define the framework of the similarity measure in this network. As a further step, we use the topological potential approach to mine social influence in heterogeneous networks. Finally, to improve music recommendation by incorporating social influence, we present a factor graph model based on social influence. Our experimental results on a real-world dataset verify that the proposed approach substantially outperforms current state-of-the-art music recommendation methods.

6.
In this paper, we propose a unified approach to fast index-based music recognition. As an important area within the field of music information retrieval (MIR), the goal of music recognition is, given a database of musical pieces and a query document, to locate all occurrences of that document within the database, up to certain possible errors. In particular, the identification of the query with regard to the database becomes possible. The approach presented in this paper is based on a general algorithmic framework for searching complex patterns of objects in large databases. We describe how this approach may be applied to two important music recognition tasks: polyphonic (musical score-based) search in polyphonic score data and the identification of pulse-code modulation audio material from a given acoustic waveform. We give an overview of the various aspects of our technology, including fault-tolerant search methods. Several areas of application are suggested. We describe several prototype systems we have developed for those applications, including the notify! and the audentify! systems for score- and waveform-based music recognition, respectively.

7.
Graphs are a very expressive formalism for system modeling, especially when attributes are allowed. Our research is mainly focused on the use of graphs for system verification. Up to now, there have been two main approaches to modeling (typed) attributed graphs and specifying their transformation. Here we report preliminary results of our investigation of a third approach, in which we couple a graph to a data signature that consists of unary operations only. To do so, we transform arbitrary signatures into a structure comparable to what is called a graph structure signature in the literature, and arbitrary algebras into the corresponding algebra graph.

8.
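Information Integration and Best-Path Search Methods in a Chinese Word Segmentation System   Total citations: 11, self-citations: 1, citations by others: 10
In a complex Chinese word segmentation system, the effective integration of various kinds of information is the key to implementing the system. This paper introduces the information integration method used in the segmentation system SegTag and discusses two best-path search methods within the information integration structure. Finally, we give experimental results and conclusions.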

9.
The autocorrelation is often used in signal processing as a tool for finding repeating patterns in a signal. In image processing, there are various image analysis techniques that use the autocorrelation of an image in a broad range of applications from texture analysis to grain density estimation. This paper provides an extensive review of two recently introduced and related frameworks for image representation based on autocorrelation, namely Patch Autocorrelation Features (PAF) and Translation and Rotation Invariant Patch Autocorrelation Features (TRIPAF). The PAF approach stores a set of features obtained by comparing pairs of patches from an image. More precisely, each feature is the Euclidean distance between a particular pair of patches. The proposed approach is successfully evaluated in a series of handwritten digit recognition experiments on the popular MNIST data set. However, the PAF approach has limited applications, because it is not invariant to affine transformations. More recently, the PAF approach was extended to become invariant to image transformations, including (but not limited to) translation and rotation changes. In the TRIPAF framework, several features are extracted from each image patch. Based on these features, a vector of similarity values is computed between each pair of patches. Then, the similarity vectors are clustered together such that the spatial offset between the patches of each pair is roughly the same. Finally, the mean and the standard deviation of each similarity value are computed for each group of similarity vectors. These statistics are concatenated to obtain the TRIPAF feature vector. The TRIPAF vector essentially records information about the repeating patterns within an image at various spatial offsets. After presenting the two approaches, several optical character recognition and texture classification experiments are conducted to evaluate the two approaches. Results are reported on the MNIST (98.93%), the Brodatz (96.51%), and the UIUCTex (98.31%) data sets. Both PAF and TRIPAF are fast to compute and produce compact representations in practice, while reaching accuracy levels similar to other state-of-the-art methods.
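The PAF description above ("each feature is the Euclidean distance between a particular pair of patches") can be illustrated with a small pure-Python sketch. The patch size, the stride, the row-by-row scan order, and the function names are assumptions chosen for illustration; the abstract does not specify how patches are sampled, and TRIPAF's clustering by spatial offset is not shown.

```python
import math
from itertools import combinations


def patches(image, size, stride):
    """Yield flattened size x size patches of a 2-D list, scanning row by row."""
    rows, cols = len(image), len(image[0])
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            yield [image[r + i][c + j] for i in range(size) for j in range(size)]


def paf_features(image, size=2, stride=2):
    """One Euclidean distance per unordered pair of patches (hypothetical layout)."""
    ps = list(patches(image, size, stride))
    return [math.dist(a, b) for a, b in combinations(ps, 2)]


if __name__ == "__main__":
    image = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    print(paf_features(image))
```

With n patches the vector has n(n-1)/2 entries, and a distance of zero marks two patches that repeat each other exactly. Because the pairs are listed in a fixed positional order, shifting the image changes the vector, which is consistent with the abstract's remark that PAF is not invariant to such transformations.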

10.
A Novel Approach for Phase-Type Fitting with the EM Algorithm   Total citations: 2, self-citations: 0, citations by others: 2
The representation of general distributions or measured data by phase-type distributions is an important and nontrivial task in analytical modeling. Although a large number of different methods for fitting parameters of phase-type distributions to data traces exist, many approaches lack efficiency and numerical stability. In this paper, a novel approach is presented that fits a restricted class of phase-type distributions, namely, mixtures of Erlang distributions, to trace data. For the parameter fitting, an algorithm of the expectation maximization type is developed. This paper shows that these choices result in a very efficient and numerically stable approach which yields phase-type approximations for a wide range of data traces that are as good as or better than approximations computed with other less efficient and less stable fitting methods. To illustrate the effectiveness of the proposed fitting algorithm, we present comparative results for our approach and two other methods using six benchmark traces and two real traffic traces as well as quantitative results from queueing analysis.
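As a rough illustration of fitting a mixture of Erlang distributions with an algorithm of the expectation maximization type, the sketch below performs one textbook EM iteration. It assumes the integer shape of each branch is fixed in advance and only the mixing weights and rates are updated; that setup, the function names, and the sample trace are assumptions, and the paper's actual algorithm may differ in its details.

```python
import math


def erlang_pdf(x, shape, rate):
    """Density of an Erlang distribution with integer shape and positive rate."""
    return (
        rate ** shape * x ** (shape - 1) * math.exp(-rate * x)
        / math.factorial(shape - 1)
    )


def log_likelihood(data, shapes, weights, rates):
    """Log-likelihood of the trace under the Erlang mixture."""
    return sum(
        math.log(sum(w * erlang_pdf(x, r, lam)
                     for w, r, lam in zip(weights, shapes, rates)))
        for x in data
    )


def em_step(data, shapes, weights, rates):
    """One EM iteration with the shapes held fixed (an assumption of this sketch)."""
    # E-step: posterior probability that each sample came from each branch.
    resp = []
    for x in data:
        dens = [w * erlang_pdf(x, r, lam)
                for w, r, lam in zip(weights, shapes, rates)]
        total = sum(dens)
        resp.append([d / total for d in dens])
    # M-step: weighted maximum-likelihood updates of weights and rates.
    new_weights, new_rates = [], []
    for m, shape in enumerate(shapes):
        q = sum(row[m] for row in resp)
        qx = sum(row[m] * x for row, x in zip(resp, data))
        new_weights.append(q / len(data))
        new_rates.append(shape * q / qx)  # branch mean is shape / rate
    return new_weights, new_rates


if __name__ == "__main__":
    trace = [0.2, 0.3, 0.5, 0.4, 2.8, 3.1, 3.5, 2.9, 3.3]  # hypothetical data
    weights, rates = [0.5, 0.5], [1.0, 1.0]
    for _ in range(20):
        weights, rates = em_step(trace, [1, 5], weights, rates)
    print(weights, rates)
```

Both M-step updates are closed-form ratios of weighted sums, so no inner numerical optimization is needed, which is one plausible reason a restriction to Erlang mixtures can be fitted efficiently and stably.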

11.
Caching query results is one efficient approach to improving the performance of XML management systems. This entails the discovery of frequent XML queries issued by users. In this paper, we model user queries as a stream of XML query pattern trees and mine the frequent query patterns over the query stream. To facilitate the one-pass mining process, we devise a novel data structure called DTS to summarize the pattern trees seen so far. By grouping the incoming pattern trees into batches, we can dynamically mark the active portion of the current batch in DTS and limit the enumeration of candidate trees to only the currently active pattern trees. We also design another summary data structure called ECTree that provides for the incremental computation of the frequent tree patterns over the query stream. Based on the above two constructs, we present two mining algorithms called XQSMinerI and XQSMinerII. XQSMinerI is fast, but it tends to overestimate, while XQSMinerII adopts a filter-and-refine approach to minimize the amount of overestimation. Experimental results show that the proposed methods are both efficient and scalable and require only small memory footprints. Received: 17 October 2003, Accepted: 16 April 2004, Published online: 14 September 2004. Edited by: J. Gehrke and J. Hellerstein.

12.
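Graph data models are widely used in domains with complex, interrelated data. To address the functional shortcomings of existing music data models and query languages, this paper first proposes Gra-MM, a graph-based music data model that uses a graph data model to model complex music data and defines a graph logical data structure together with the associated graph algebra operations. It then presents Gra-MQL, a music data query language built on Gra-MM, and gives the BNF definition of the language. Gra-MQL handles the complex relationships among music data well and supports both music metadata retrieval and music content retrieval. It thereby meets users' query needs at different levels and overcomes the shortcomings of traditional graph query languages, which have limited expressive power for complex interrelated data and cannot be applied directly to music content retrieval. Finally, the implemented prototype music database system is introduced, and the prototype is tested, with experimental data demonstrating the feasibility of the model and the query language.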

13.
14.
Nowadays, huge volumes of data are organized or exported in tree-structured form. Querying capabilities are provided through tree-pattern queries. The need to query tree-structured data sources whose structure is not fully known, and the need to integrate multiple data sources with different tree structures, have recently motivated query languages that relax the complete specification of a tree pattern. In this paper, we consider a query language that allows the partial specification of a tree pattern. Queries in this language range from structureless keyword-based queries to completely specified tree patterns. To support the evaluation of partially specified queries, we use semantically rich constructs, called dimension graphs, which abstract structural information of the tree-structured data. We address the problem of query containment in the presence of dimension graphs and we provide necessary and sufficient conditions for query containment. As checking query containment can be expensive, we suggest two heuristic approaches for query containment in the presence of dimension graphs. Our approaches are based on extracting structural information from the dimension graph that can be added to the queries while preserving equivalence with respect to the dimension graph. We consider both cases: extracting and storing different types of structural information in advance, and extracting information on the fly (at query time). Both approaches are implemented, validated, and compared through experimental evaluation.

15.
On modern computers, the performance of programs is often limited by memory latency rather than by processor cycle time. To reduce the impact of memory latency, the restructuring compiler community has developed locality-enhancing program transformations such as loop permutation and tiling. These transformations work well for perfectly nested loops (loops in which all assignment statements are contained in the innermost loop), but their performance on codes such as matrix factorizations that contain imperfectly nested loops leaves much to be desired. In this paper, we propose an alternative approach called data-centric transformation. Instead of reasoning directly about the control structure of the program, a compiler using the data-centric approach chooses an order for the arrival of data elements in the cache, determines what computations should be performed when that data arrives, and generates the appropriate code. At runtime, program execution will automatically pull data into the cache in an order that corresponds approximately to the order chosen by the compiler; since statements that touch a data structure element are scheduled close together, locality is improved. The idea of data-centric transformation is very general, and in this paper, we discuss a particular transformation called data-shackling. We have implemented shackling in the SGI MIPSPro compiler which already has a sophisticated implementation of control-centric transformations for locality enhancement. We present experimental results on the SGI Octane comparing the performance of the two approaches, and show that for dense numerical linear algebra codes, data-shackling does better by factors of two to five.
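For readers unfamiliar with tiling, the sketch below shows the conventional control-centric transformation on a perfectly nested matrix multiplication, the kind of loop nest the abstract says such transformations handle well. It is not data-shackling and it is not compiler output. Python is used only to make the loop structure readable (the cache benefit itself shows up in compiled code), and the function name and tile size are hypothetical.

```python
def matmul_tiled(a, b, tile=2):
    """Multiply matrices a (n x m) and b (m x p) one tile x tile block at a time."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    # The three outer loops walk over blocks; the three inner loops stay inside
    # one block, so the same small pieces of a, b and c are reused before moving on.
    for ii in range(0, n, tile):
        for kk in range(0, m, tile):
            for jj in range(0, p, tile):
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, m)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + tile, p)):
                            c[i][j] += aik * b[k][j]
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
    print(matmul_tiled(a, b))
```

The `min(...)` bounds handle matrices whose dimensions are not multiples of the tile size. The result equals that of the untiled triple loop; only the order in which elements are touched changes.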

16.
The discovery of structures hidden in high-dimensional data space is of great significance for understanding and further processing of the data. Real-world datasets are often composed of multiple low-dimensional patterns, the interlacement of which may impede our ability to understand the distribution rule of the data. Few of the existing methods focus on the detection and extraction of the manifolds representing distinct patterns. Inspired by the nonlinear dimensionality reduction method Isomap, in this paper we present a novel approach called Multi-Manifold Partition to identify the interlacing low-dimensional patterns. The algorithm has three steps: first, a neighborhood graph is built to capture the intrinsic topological structure of the input data; then, the dimensional uniformity of neighboring nodes is analyzed to discover the segments of patterns; finally, the segments which are possibly from the same low-dimensional structure are combined to obtain a global representation of distribution rules. Experiments on synthetic data as well as real problems are reported. The results show that this new approach to exploratory data analysis is effective and may enhance our understanding of the data distribution.

17.
Condition monitoring systems are widely used to monitor the working condition of equipment, generating a vast amount and variety of monitoring data in the process. The main task of surveillance focuses on detecting anomalies in these routinely collected monitoring data, intended to help detect possible faults in the equipment. However, with the rapid increase in the volume of monitoring data, it is a nontrivial task to scan all the monitoring data to detect anomalies. In this paper, we propose an approach called latent correlation-based anomaly detection (LCAD) that efficiently and effectively detects potential anomalies from a large number of correlative isomerous monitoring data series. Instead of focusing on one or more isomorphic monitoring data series, LCAD identifies anomalies by modeling the latent correlation among multiple correlative isomerous monitoring data series, using a probabilistic distribution model called the latent correlation probabilistic model, which helps to detect anomalies according to their relations with the model. Experimental results on real-world data sets show that when dealing with a large number of correlative isomerous monitoring data series, LCAD yields better performances than existing anomaly detection approaches.

18.
Information and telecommunications technologies have profoundly altered the distribution channels available for a wide range of goods and services. In this paper we analyze a particular class of products, information goods, and develop a first framework for predicting which information goods are most likely to see their production, distribution, and consumption patterns altered by the net, which are likely to see shifts in power and profitability, and which are likely to remain unchanged for the foreseeable future. We focus on music and news as two very different information goods industries that follow very different trajectories. Our results suggest that the power structure in news distribution is unlikely to be transformed rapidly. In contrast, the power structure in music is transforming rapidly. Star acts no longer need their record labels to certify their music to their fans, and digital production and distribution have reduced or eliminated the value of other assets owned by the record companies. The framework we use to analyze these two industries can readily be applied to a range of others ... from the production of television soap opera series to the publication of academic journals in polymer chemistry.

19.
In this paper, we introduce a novel indexing scheme, the query context tree (QUC-tree), to facilitate efficient query-sensitive music search under different query contexts. Unlike previous approaches, the QUC-tree is a balanced multiway tree structure in which each level represents the data space at a different dimensionality. Before the tree structure is constructed, principal component analysis (PCA) is applied to analyze the data and to transform the raw composite features into a new feature space sorted by the importance of the acoustic features. The PCA-transformed data and the reduced dimensions in the upper levels alleviate the curse of dimensionality. To mimic human perception more accurately, an extension called the QUC+-tree is proposed, which further applies multivariate regression and an EM-based algorithm to estimate the weight of each individual feature. Comprehensive experiments on different datasets evaluate the proposed structures against state-of-the-art techniques. The experimental results demonstrate the superiority of our technique.

20.
Tool path planning for compound surfaces in spray forming processes   Total citations: 3, self-citations: 0, citations by others: 3
Spray forming is an emerging manufacturing process. The automated tool planning for this process is a nontrivial problem, especially for geometry-complicated parts consisting of multiple freeform surfaces. Existing tool planning approaches are not able to deal with this kind of compound surface. This paper proposes a tool-path planning approach which optimizes the tool motion performance and the thickness uniformity. There are two steps in this approach. The first step partitions the part surface into flat patches based on the topology and normal directions. The second step determines the tool movement patterns and the sweeping directions for each flat patch. Based on the above two steps, optimal tool paths can be calculated. Experimental tests are carried out on automotive body parts and the results validate the proposed approach. Note to Practitioners: This paper was motivated by the problem of automatically planning tool paths for spray forming using Programmable Powdered Preforming Process (P4) technology. However, the proposed approach can be applied to other surface manufacturing applications such as spray painting, spray cleaning, rapid tooling, etc. Existing tool planning approaches are not able to handle complicated, multi-patch surfaces. This paper proposes a methodology to partition complicated surfaces into easy-to-handle patches and generate tool paths with optimized thickness uniformity and tool motion performance. We tested the approach using simulation on sample automotive body parts and proved its feasibility. However, this approach requires that the parts to be sprayed belong to the sheet-metal type so that the part geometry can be analyzed on a plane. In our future research, we will run physical tests on actual parts and investigate the deposition effects on the thickness uniformity.
