Toward intelligent music information retrieval   Total citations: 1 (self-citations: 0, citations by others: 1)
Efficient and intelligent music information retrieval is a very important topic of the 21st century. With the ultimate goal of building personal music information retrieval systems, this paper studies the problem of intelligent music information retrieval. Huron points out that since the preeminent functions of music are social and psychological, the most useful characterization would be based on four types of information: genre, emotion, style, and similarity. This paper introduces Daubechies Wavelet Coefficient Histograms (DWCH) for music feature extraction in music information retrieval. The histograms are computed from the coefficients of the db8 Daubechies wavelet filter applied to 3 s of music. A comparative study of sound features and classification algorithms on a dataset compiled by Tzanetakis shows that combining DWCH with timbral features (MFCC and FFT), using multiclass extensions of the support vector machine, achieves approximately 80% accuracy, which is a significant improvement over the previously known result on this dataset. On another dataset the combination achieves 75% accuracy. The paper also studies the issue of detecting emotion in music. Ratings by two subjects on three bipolar adjective pairs are used. An accuracy of around 70% was achieved in predicting emotional labeling in these adjective pairs. The paper also studies the problem of identifying groups of artists based on their lyrics and sound using a semi-supervised classification algorithm. Identification of artist groups based on the Similar Artist lists at All Music Guide is attempted. The semi-supervised learning algorithm resulted in nontrivial increases in accuracy, to more than 70%. Finally, the paper conducts a proof-of-concept experiment on similarity search using the feature set.
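The idea of building features from histograms of wavelet coefficients can be illustrated with a minimal sketch. It is not the paper's implementation: it swaps the db8 filter for the much simpler Haar filter so that it runs without third-party libraries, and the function names, bin count, and value range are all assumptions chosen for illustration.

```python
import math

def haar_step(signal):
    """One level of the Haar transform: returns (approximation, detail).
    Haar stands in for db8 here; a trailing odd sample is dropped."""
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / math.sqrt(2) for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / math.sqrt(2) for i in pairs]
    return approx, detail

def wavelet_subbands(signal, levels):
    """Return [detail_1, ..., detail_levels, final_approximation]."""
    subbands = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        subbands.append(detail)
    subbands.append(current)
    return subbands

def coefficient_histogram(coeffs, bins, lo, hi):
    """Normalized histogram over [lo, hi]; out-of-range values are clamped
    into the first or last bin."""
    counts = [0] * bins
    for c in coeffs:
        idx = int((c - lo) / (hi - lo) * bins)
        counts[min(max(idx, 0), bins - 1)] += 1
    total = len(coeffs)
    return [n / total for n in counts] if total else counts

def histogram_features(signal, levels=2, bins=4, lo=-1.0, hi=1.0):
    """Concatenate one histogram per subband into a single feature vector."""
    features = []
    for band in wavelet_subbands(signal, levels):
        features.extend(coefficient_histogram(band, bins, lo, hi))
    return features
```

A real system would presumably use a wavelet library with a db8 filter and feed the resulting vectors, together with timbral features, to a classifier such as a multiclass SVM.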

Many solutions for the reuse and re-purposing of Music Information Retrieval (MIR) methods, and the tools implementing those methods, have been introduced over recent years. Proposals for achieving interoperability between systems have ranged from shared software libraries and interfaces, through common frameworks and portals, to standardised file formats and metadata. Here we assess these solutions for their suitability to be reused and combined as repurposable components within assemblies (or workflows) that can be used in novel and possibly more ambitious ways. Reuse and repeatability also have great implications for the process of MIR research: the encapsulation of any algorithm and its operation (including inputs, parameters, and outputs) is fundamental to the repeatability and reproducibility of an experiment. This is desirable both for the open and reliable evaluation of algorithms and for the advancement of MIR by building more effectively upon prior research. At present there is no clear best practice widely adopted by the field. Based upon our analysis of contemporary systems and their adoption, we reflect on whether this should be considered a failure. Are there limits to interoperability unique to MIR, and how might they be overcome? Beyond workflows, how much research context can, and should, be captured? We frame our assessment within the emerging notion of Research Objects for reproducible research in other domains, and describe how their adoption could serve as a route to reuse in MIR.

Personalization and context-awareness are highly important topics in research on Intelligent Information Systems. In the fields of Music Information Retrieval (MIR) and Music Recommendation in particular, user-centric algorithms should ideally provide music that perfectly fits each individual listener in each imaginable situation and for each of her information or entertainment needs. Even though preliminary steps towards such systems have recently been presented at the “International Society for Music Information Retrieval Conference” (ISMIR) and at similar venues, this vision is still far from becoming a reality. In this article, we investigate and discuss literature on the topic of user-centric music retrieval and reflect on why the breakthrough in this field has not been achieved yet. Given the authors' different areas of expertise, we shed light on why this topic is a particularly challenging one from both computer science and psychology points of view. Whereas the computer science aspect centers on the problems of user modeling, machine learning, and evaluation, the psychological discussion is mainly concerned with proper experimental design and interpretation of the results of an experiment. We further present our ideas on aspects crucial to consider when elaborating user-aware music retrieval systems.

This paper describes a music information retrieval system that uses humming as the key for retrieval. Humming is an easy way for a user to input a melody. However, there are several problems with humming that degrade the retrieval of information. One problem is the human factor. Sometimes, people do not sing accurately, especially if they are inexperienced or unaccompanied. Another problem arises from signal processing. Therefore, a music information retrieval method should be sufficiently robust to surmount various humming errors and signal processing problems. A retrieval system has to extract the pitch from the user's humming. However, pitch extraction is not perfect. It often captures half or double pitches, which are harmonic frequencies of the true pitch, even if the extraction algorithms take the continuity of the pitch into account. Considering these problems, we propose a system that takes multiple pitch candidates into account. In addition to the frequencies of the pitch candidates, the confidence measures obtained from their powers are taken into consideration as well. We also propose the use of an algorithm with three dimensions that is an extension of the conventional Dynamic Programming (DP) algorithm, so that multiple pitch candidates can be treated. Moreover, in the proposed algorithm, DP paths are changed dynamically to take deltaPitches and IOIratios (inter-onset-interval ratios) of input and reference notes into account in order to treat notes being split or unified. We carried out an evaluation experiment to compare the proposed system with a conventional system. When using three pitch candidates with confidence measures and IOI features, the top-ten retrieval accuracy was 94.1%. Thus, the proposed method gave a better retrieval performance than the conventional system.
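The benefit of keeping several pitch candidates per note can be sketched with a simplified alignment. This is not the three-dimensional DP described in the abstract: the candidate dimension is collapsed by taking the cheapest candidate at each step, absolute pitches in semitones are compared instead of deltaPitches, and IOI ratios are ignored. The cost function, `gap`, and `conf_weight` are illustrative assumptions.

```python
def match_cost(query, reference, gap=2.0, conf_weight=1.0):
    """Minimum alignment cost between a hummed query and a reference melody.

    query: list of notes, each a list of (pitch_in_semitones, confidence)
           candidates with confidence in [0, 1].
    reference: list of pitches in semitones.
    """
    n, m = len(query), len(reference)
    inf = float("inf")
    dp = [[inf] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best = inf
            if i > 0 and j > 0:
                # Cheapest candidate: pitch error plus a penalty for low confidence.
                local = min(
                    abs(pitch - reference[j - 1]) + conf_weight * (1.0 - conf)
                    for pitch, conf in query[i - 1]
                )
                best = min(best, dp[i - 1][j - 1] + local)
            if i > 0:
                best = min(best, dp[i - 1][j] + gap)  # extra hummed note
            if j > 0:
                best = min(best, dp[i][j - 1] + gap)  # skipped reference note
            dp[i][j] = best
    return dp[n][m]

# The second note's top candidate is an octave error (74 instead of 62);
# the lower-ranked candidate recovers the true pitch.
reference = [60, 62, 64]
multi = [[(60, 0.9), (72, 0.4)], [(74, 0.5), (62, 0.8)], [(64, 1.0)]]
top_only = [[note[0]] for note in multi]
```

With these inputs the multi-candidate query aligns along the diagonal at low cost, while the top-candidate-only query has to pay gap penalties around the octave error.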

The increasing amount of online music content has opened new opportunities for implementing new effective information access services (commonly known as music recommender systems) that support music navigation, discovery, sharing, and formation of user communities. In recent years a new research area of contextual (or situational) music recommendation and retrieval has emerged. The basic idea is to retrieve and suggest music depending on the user’s actual situation, for instance emotional state, or any other contextual conditions that might influence the user’s perception of music. Despite the high potential of this idea, the development of real-world applications that retrieve or recommend music depending on the user’s context is still in its early stages. This survey illustrates various tools and techniques that can be used for addressing the research challenges posed by context-aware music retrieval and recommendation. It covers a broad range of topics, starting from classical music information retrieval (MIR) and recommender system (RS) techniques, and then focusing on context-aware music applications as well as the newer trends of affective and social computing applied to the music domain.

Probabilistic latent semantic analysis (PLSA) is a method for computing term and document relationships from a document set. The probabilistic latent semantic index (PLSI) has been used to store PLSA information, but unfortunately the PLSI uses excessive storage space relative to a simple term frequency index, which causes lengthy query times. To overcome the storage and speed problems of PLSI, we introduce the probabilistic latent semantic thesaurus (PLST), an efficient and effective method of storing the PLSA information. We show that through methods such as document thresholding and term pruning, we are able to maintain the high precision results found using PLSA while using a very small fraction (0.15%) of the storage space of PLSI.

This paper presents a tunable content-based music retrieval (CBMR) system suitable for the retrieval of music audio clips. The audio clips are represented as extracted feature vectors. The CBMR system is expert-tunable by altering the feature space. The feature space is tuned according to the expert-specified similarity criteria expressed in terms of clusters of similar audio clips. The main goal of tuning the feature space is to improve retrieval performance, since some features may have more impact on perceived similarity than others. The tuning process utilizes our genetic algorithm. The R-tree index for efficient retrieval of audio clips is based on the clustering of feature vectors. For each cluster a minimal bounding rectangle (MBR) is formed, thus providing objects for indexing. Inserting new nodes into the R-tree is efficiently performed because of the chosen Quadratic Split algorithm. Our CBMR system implements the point query and the n-nearest neighbors query with O(log n) time complexity. Different objective functions based on cluster similarity and dissimilarity measures are used for the genetic algorithm. We have found that all of them have a similar impact on the retrieval performance in terms of precision and recall. The paper includes experimental results on retrieval performance, reporting significant improvement over the untuned feature space.
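How altering the feature space can change retrieval results is easiest to see with per-feature weights. The sketch below is only an illustration under stated assumptions: it uses a weighted Euclidean distance and a linear scan instead of the R-tree index, and the weights are set by hand rather than found by a genetic algorithm. All names are hypothetical.

```python
import heapq
import math

def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per feature."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def nearest_clips(query, clips, weights, n=3):
    """Return the n (name, vector) pairs closest to the query vector.
    Linear scan, O(N log n); an R-tree would avoid visiting every clip."""
    return heapq.nsmallest(
        n, clips, key=lambda item: weighted_distance(query, item[1], weights)
    )

clips = [("a", (0.0, 0.0)), ("b", (1.0, 5.0)), ("c", (3.0, 0.0))]
untuned = nearest_clips((0.0, 0.0), clips, weights=(1.0, 1.0), n=2)
tuned = nearest_clips((0.0, 0.0), clips, weights=(1.0, 0.01), n=2)
```

Down-weighting the second feature moves clip "b" ahead of clip "c", which is the kind of reordering that a tuned feature space is meant to produce.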

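A video shot segmentation algorithm for visual media retrieval*   Total citations: 1 (self-citations: 0, citations by others: 1)
To address problems in content-based visual media retrieval, this paper applies the principle of homology continuity (PHC) from the theory of multi-dimensional space biomimetic informatics (MDSBI). Videos, shots, and images are analyzed geometrically in high-dimensional space, and a subspace method is used to handle the heterogeneity of different visual media. The analysis concludes that the essential task in visual media retrieval is to study the distance between points and subspaces in high-dimensional space, and a shot segmentation algorithm suited to visual media retrieval is implemented.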

Most Music Information Retrieval (MIR) researchers will agree that understanding users’ needs and behaviors is critical for developing a good MIR system. The number of user studies in the MIR domain has been gradually increasing since the early 2000s, reflecting this growing appreciation of the need for empirical studies of users. However, despite the growing number of user studies and the wide recognition of their importance, it is unclear how great their impact has been in the field: on how systems are developed, how evaluation tasks are created, and how MIR system developers in particular understand critical concepts such as music similarity or music mood. In this paper, we present our analysis of the growth, publication and citation patterns, topics, and design of 198 user studies. This is followed by a discussion of a number of issues and challenges in conducting MIR user studies and distributing the research results. We conclude by making recommendations to increase the visibility and impact of user studies in the field.

To effectively utilize information stored in a digital image library, effective image indexing and retrieval techniques are essential. This paper proposes an image indexing and retrieval technique based on the compressed image data using vector quantization (VQ). By harnessing the characteristics of VQ, the proposed technique is able to capture the spatial relationships of pixels when indexing the image. Experimental results illustrate the robustness of the proposed technique and also show that its retrieval performance is higher than that of existing color-based techniques.
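One common way to index VQ-compressed images, shown here only as a hedged sketch, is to describe each image by the histogram of codeword indices assigned to its pixel blocks. Whether the paper uses exactly this descriptor is an assumption; the tiny two-entry codebook, the flattened 2x2 blocks, and the L1 histogram distance are illustrative choices.

```python
def nearest_codeword(block, codebook):
    """Index of the codeword with the smallest squared error to the block."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((x - y) ** 2 for x, y in zip(block, codebook[k])),
    )

def vq_histogram(blocks, codebook):
    """Normalized usage count of each codeword over an image's blocks."""
    counts = [0] * len(codebook)
    for block in blocks:
        counts[nearest_codeword(block, codebook)] += 1
    return [c / len(blocks) for c in counts]

def histogram_distance(h1, h2):
    """L1 distance between two histograms (0 means identical usage)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Each block is a flattened 2x2 patch of grey levels, so a codeword encodes
# a small spatial pattern rather than a single pixel value.
codebook = [(0, 0, 0, 0), (255, 255, 255, 255)]
image_blocks = [(10, 0, 5, 0), (250, 255, 240, 255), (0, 0, 0, 0), (3, 2, 1, 0)]
index_entry = vq_histogram(image_blocks, codebook)
```

Because every codeword represents a block of neighbouring pixels, such a histogram carries some local spatial structure that a plain colour histogram does not.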

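An effective information retrieval model*   Total citations: 1 (self-citations: 0, citations by others: 1)
An information retrieval model based on user query behavior and query expansion is proposed, and its design rationale, algorithms, and key implementation techniques are described. Experimental results show that the model effectively improves information retrieval performance and has high practical value and broad application prospects.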

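One effective way to achieve fast retrieval over large-scale audio databases is to build a suitable audio index, for which audio segmentation and annotation are the foundation. This paper adopts a two-step audio segmentation algorithm based on short-time energy and an improved metric distance, so that the resulting audio segments show large feature differences between segments and small feature variance within each segment. On the basis of this segmentation, the audio streams in the database are annotated: a BP neural network algorithm is used to label audio category and the Philips audio fingerprinting algorithm is used to label audio content, in preparation for building the audio index table. Experimental results show that the two-step segmentation algorithm segments arbitrary audio streams well, that the annotation algorithms label audio category and audio content effectively, and that the algorithms are also robust.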
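The short-time-energy step of such a segmentation can be sketched as follows. This covers only a first pass that proposes candidate boundaries where the frame energy crosses a threshold; the refinement based on an improved metric distance is not reproduced. The frame length, hop size, and threshold are assumptions for illustration.

```python
def short_time_energy(samples, frame_len, hop):
    """Sum of squared samples in each frame of length frame_len."""
    return [
        sum(s * s for s in samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]

def candidate_boundaries(energies, threshold):
    """Frame indices at which the energy crosses the threshold."""
    return [
        i for i in range(1, len(energies))
        if (energies[i - 1] < threshold) != (energies[i] < threshold)
    ]

# Silence, a burst of sound, then silence again.
samples = [0.0] * 8 + [1.0] * 8 + [0.0] * 8
energies = short_time_energy(samples, frame_len=4, hop=4)
boundaries = candidate_boundaries(energies, threshold=1.0)
```

In a two-step scheme, a second pass would then compare feature statistics on either side of each candidate and keep only boundaries where the segments differ enough.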

In this article, a brief review of texture segmentation is presented, before a novel automatic texture segmentation algorithm is developed. The algorithm is based on modified discrete wavelet frames and the mean shift algorithm. The proposed technique is tested on a range of textured images including composite texture images, synthetic texture images, real scene images as well as our main source of images, museum images of various kinds. As an extension to the automatic texture segmentation, a texture identifier is also introduced for integration into a retrieval system, providing an excellent approach to content-based image retrieval using texture features.

Motivated by the need for the automatic indexing and analysis of the huge number of documents in Ottoman divan poetry, and for discovering new knowledge to preserve this heritage and keep it alive, in this study we propose a novel method for segmenting and retrieving words in Ottoman divans. Documents in Ottoman are difficult to segment into words without prior knowledge of the word. In this study, we use the idea that divans have multiple copies (versions) by different writers in different writing styles, and that word segmentation in some of those versions may be relatively easier to achieve than in other versions. Segmentation of the harder versions (which is difficult, if not impossible, with traditional techniques) is performed using information carried from the simpler version. One version of a document is used as the source dataset and the other version of the same document is used as the target dataset. Words in the source dataset are automatically extracted and used as queries to be spotted in the target dataset for detecting word boundaries. We present the idea of cross-document word matching for a novel task of segmenting historical documents into words. We propose a matching scheme based on possible combinations of sequences of sub-words. We improve the performance of simple features by considering the words in a context. The method is applied to two versions of the Layla and Majnun divan by Fuzuli. The results show that the proposed word-matching-based segmentation method is promising in finding the word boundaries and in retrieving the words across documents.
