4.
Music genre classification based on visual representation has been successfully explored in recent years. Recently, there has been increasing interest in using convolutional neural networks (CNNs) for the task. However, most existing methods adopt mature CNN structures proposed for image recognition without any modification, which yields learned features that are not adequate for music genre classification. To address this issue, we fully exploit the low-level information from audio spectrograms and develop a novel CNN architecture in this paper. The proposed CNN architecture takes multi-scale time-frequency information into consideration, which passes more suitable semantic features to the decision-making layer to discriminate the genre of an unknown music clip. Experiments are conducted on benchmark datasets including GTZAN, Ballroom, and Extended Ballroom. The experimental results show that the proposed method achieves classification accuracies of 93.9%, 96.7%, and 97.2%, respectively, which, to the best of our knowledge, are the best results on these public datasets so far. Notably, the model trained with our proposed network is tiny, only 0.18M, so it can be deployed on mobile phones or other devices with limited computational resources. Code and model will be available at https://github.com/CaifengLiu/music-genre-classification.
6.
Automatic text classification based on the vector space model (VSM), artificial neural networks (ANN), K-nearest neighbor (KNN), Naive Bayes (NB), and support vector machines (SVM) has been applied to English-language documents and has gained popularity among text mining and information retrieval (IR) researchers. This paper proposes the application of VSM and ANN to the classification of Tamil-language documents. Tamil is a morphologically rich, classical Dravidian language. The development of the internet has led to an exponential increase in the number of electronic documents, not only in English but also in other regional languages. The automatic classification of Tamil documents has not been explored in detail so far. In this paper, a corpus is used to construct and test the VSM and ANN models. Methods of document representation and of assigning weights that reflect the importance of each term are discussed. In a traditional word-matching-based categorization system, the most popular document representation is VSM. This method needs a high-dimensional space to represent the documents. The ANN classifier requires a smaller number of features. The experimental results show that the ANN model achieves 93.33%, which is better than the 90.33% achieved by VSM on Tamil document classification.
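As an illustration of the VSM document representation and term weighting mentioned in the abstract above, the following is a minimal sketch. It assumes TF-IDF weighting and cosine similarity, which are common choices for VSM but are not stated in the abstract; the function names `tfidf_vectors` and `cosine` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse {term: weight} vector per tokenized document.

    Assumption: weight = (term frequency) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all-zero."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Toy, already-tokenized documents (hypothetical data).
docs = [
    ["cricket", "match", "score"],
    ["cricket", "team", "score"],
    ["election", "vote", "result"],
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # shares terms with document 0
print(cosine(vecs[0], vecs[2]))  # no shared terms
```

Each distinct term adds one dimension to the vector space, which is why the representation becomes high-dimensional on a real corpus.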
9.
Multimedia Tools and Applications - In the digital music industry, current major internet stores contain millions of tracks, which complicates search, retrieval and discovery of music...
10.
This paper describes an experimental method for automatic text genre recognition based on 45 statistical, lexical, syntactic, positional, and discursive parameters. The suggested method includes: (1) the development of software permitting heterogeneous parameters to be normalized and clustered using the k-means algorithm; (2) the verification of parameters; (3) the selection of the parameters that are the most significant for scientific, newspaper, and artistic texts using two-factor analysis algorithms. Adaptive summarization algorithms have been developed based on these parameters.
12.
Sketching is a natural and easy way for humans to express visual information in everyday life. Despite a number of approaches to understanding online sketch maps, the automatic understanding of offline, hand-drawn sketch maps still poses a problem. This paper presents a new approach to sketch map understanding. To our knowledge, this is the first comprehensive work dealing with this task in an offline way. This paper presents a system for automatic understanding of sketch maps and the underlying algorithms for all steps. The major parts are a region-growing segmentation for sketch map objects, a classification for isolated objects, and a context-aware classification. The context-aware classification uses probabilistic relaxation labeling to integrate dependencies between objects into the recognition. We show how these algorithms can deal with the major problems of sketch map understanding, such as vagueness in interpretation. Our experiments demonstrate the importance of context-aware classification for sketch map understanding. In addition, a new database of annotated sketch maps was developed and is made publicly available. This can be used for training and evaluation of sketch map understanding algorithms.
15.
Innovations in the fields of medicine and medical image processing are becoming increasingly important. Historically, RNA viruses produced in cell cultures have been identified using electron microscopy, in which virus identification is performed by eye. Such an approach is time-consuming and depends on manual controls. Moreover, detailed knowledge about RNA viruses is required. This study introduces the Entropy-Adaptive Network Based Fuzzy Inference System (Entropy-ANFIS method), which can be used to automatically detect RNA virus images. This system consists of four stages: pre-processing, feature extraction, classification, and testing the Entropy-ANFIS method with respect to the correct classification ratio. In the pre-processing stage, a center-edge changing method is used, in which the Euclidean distances are calculated from the center pixels to the edges of the imaged object. In this way, the distance vector is obtained. This calculation is repeated for each RNA virus image. In the feature extraction stage, norm entropy, logarithmic energy and threshold entropy values are calculated to form the feature vector. The obtained feature vector is independent of the rotation and scale of the RNA virus image. In the classification stage, the feature vector is given as input to the ANFIS classifier, ANN classifier and FCM cluster. Finally, the test stage is performed to evaluate the correct classification ratio of the Entropy-ANFIS algorithm for the RNA virus images. The correct classification ratio has been determined as 95.12% using the proposed Entropy-ANFIS method.
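The pre-processing and feature extraction stages described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the entropy definitions follow the conventions commonly used for wavelet entropy measures (norm entropy as the sum of |s|^p, log energy as the sum of log(s^2), threshold entropy as the count of samples above a threshold), and the function names, the exponent `p`, and the threshold value are all hypothetical. The sketch also omits whatever normalization the authors use to obtain rotation and scale independence.

```python
import math

def distance_vector(center, edge_points):
    """Euclidean distance from the object center to each edge pixel."""
    cx, cy = center
    return [math.hypot(x - cx, y - cy) for x, y in edge_points]

def norm_entropy(signal, p=1.5):
    """Assumed definition: sum of |s|^p over the signal, with p >= 1."""
    return sum(abs(s) ** p for s in signal)

def log_energy_entropy(signal):
    """Assumed definition: sum of log(s^2), treating log(0) as 0."""
    return sum(math.log(s * s) for s in signal if s != 0)

def threshold_entropy(signal, threshold):
    """Assumed definition: number of samples whose magnitude exceeds the threshold."""
    return sum(1 for s in signal if abs(s) > threshold)

def feature_vector(center, edge_points, p=1.5, threshold=1.0):
    """Three-element feature vector built from the center-to-edge distances."""
    d = distance_vector(center, edge_points)
    return [norm_entropy(d, p), log_energy_entropy(d), threshold_entropy(d, threshold)]

# Hypothetical edge pixels of a segmented object centered at the origin.
edges = [(3, 4), (0, 5), (-5, 0), (0, -5)]
print(feature_vector((0, 0), edges))
```

In a full pipeline, a vector like this would be computed per image and passed to the classifier.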
16.
The need to classify audio into categories such as speech or music is an important aspect of many multimedia document retrieval systems. In this paper, we investigate audio features that have not been previously used in music-speech classification, such as the mean and variance of the discrete wavelet transform, the variance of Mel-frequency cepstral coefficients, the root mean square of a lowpass signal, and the difference of the maximum and minimum zero-crossings. We then apply fuzzy C-means clustering to the problem of selecting a viable set of features that enables better classification accuracy. Three different classification frameworks have been studied: Multi-Layer Perceptron (MLP) Neural Networks, radial basis function (RBF) Neural Networks, and Hidden Markov Models (HMM), and the results of each framework have been reported and compared. Our extensive experimentation has identified a subset of features that contributes most to accurate classification, and has shown that MLP networks are the most suitable classification framework for the problem at hand.
17.
Multimedia Tools and Applications - Deep Neural Network (DNN) models have lately received considerable attention because the network structure can extract deep features to improve classification...
18.
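MFCC and LPCC parameters are the two most commonly used feature parameters in speaker recognition. This work studies the algorithmic principles of MFCC and LPCC parameter extraction and the method for extracting differential cepstral parameters. Using MFCC, LPCC, and their first- and second-order differences as feature parameters, speaker recognition is performed with the k-means algorithm and a three-layer BP neural network. Experimental results show that the method can effectively improve the recognition rate, and they also verify that MFCC parameters are more robust than LPCC parameters.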
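The first- and second-order difference cepstral parameters mentioned in the abstract above can be illustrated with a short sketch. It assumes the widely used regression formula for delta coefficients, with edge frames padded by repetition; the abstract does not say which formula the authors use, and the function name `delta` and the window size are hypothetical.

```python
def delta(frames, window=2):
    """First-order difference (delta) of per-frame cepstral coefficients.

    frames: list of frames, each a list of coefficients (e.g. MFCC or LPCC).
    Assumed formula:
        d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    Frames beyond either end are replaced by the first or last frame.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out

# Hypothetical single-coefficient frames that rise by 1 per frame.
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
first_order = delta(ramp)
second_order = delta(first_order)  # second-order difference reuses the same function
print(first_order)
print(second_order)
```

Concatenating the static coefficients with their first- and second-order differences per frame gives the kind of combined feature vector the abstract describes.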
19.
In this paper, a procedure for segmenting and classifying scanned legume leaves based only on the analysis of their veins is proposed (leaf shape, size, texture and color are discarded). Three legume species are studied, namely soybean, red bean and white bean. The leaf images are acquired using a standard scanner. The segmentation is performed using the unconstrained hit-or-miss transform and adaptive thresholding. Several morphological features are computed on the segmented venation and classified using four alternative classifiers, namely support vector machines (linear and Gaussian kernels), penalized discriminant analysis and random forests. The performance is compared to that obtained with cleared leaf images, which require a more expensive, time-consuming and delicate acquisition procedure. The results are encouraging, showing that the proposed approach is an effective and more economical alternative which outperforms manual expert recognition.
20.
Multimedia Tools and Applications - A computer aided diagnosis system supports doctors by providing quantitative diagnostic clues from medical data. In this paper, we propose a computer aided...