17 similar documents found; search took 109 ms
1.
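Audio feature extraction is the basis of audio classification, which in turn is key to content-based audio retrieval. After a comprehensive analysis of the features that distinguish speech from music, this paper proposes an audio feature extraction and classification method based on the wavelet transform and support vector machines (SVMs). The method classifies pure speech, music, speech with background music, and environmental sound, and the classification performance of the new feature set is evaluated on an SVM classifier. Experimental results show that the proposed audio features are effective and reasonable and that classification performance is good.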
2.
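Audio classification and segmentation based on support vector machines  Total citations: 8, self-citations: 0, citations by others: 8
Audio classification and segmentation are important means of extracting the structure and content semantics of audio, and form the basis of content-based audio and video retrieval and analysis. The support vector machine (SVM) is an effective statistical learning method. This paper proposes an SVM-based audio classification algorithm that divides audio into five classes: silence, noise, music, pure speech, and speech with background sound. Following classification, three smoothing rules are applied to smooth the classification results. The classification performance of the SVM classifier is analyzed, and the performance of the paper's new audio features on the SVM classifier is also evaluated. Experimental results show that the SVM-based audio classification algorithm classifies well and that the audio segmentation results after smoothing are fairly accurate.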
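The abstract mentions smoothing rules applied after classification but does not state them. As a minimal sketch only, the following shows one plausible rule of that kind: a single segment whose two neighbours agree with each other but not with it is relabelled. The function name `smooth_labels` and the rule itself are assumptions for illustration, not the paper's three rules.

```python
def smooth_labels(labels):
    """Relabel isolated segments (hypothetical rule, not taken from the paper).

    A segment is changed when the segments before and after it carry the same
    label and that label differs from its own. The pass runs left to right and
    works on the already-smoothed prefix.
    """
    smoothed = list(labels)
    for i in range(1, len(smoothed) - 1):
        prev_label, next_label = smoothed[i - 1], smoothed[i + 1]
        if prev_label == next_label and smoothed[i] != prev_label:
            smoothed[i] = prev_label
    return smoothed


# One label per fixed-length segment, e.g. the raw output of a classifier.
raw = ["speech", "speech", "music", "speech", "speech", "music", "music"]
print(smooth_labels(raw))
```

The lone "music" segment inside the speech run is absorbed, while the genuine change to music at the end is kept because it spans more than one segment.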
3.
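Feature analysis and extraction in automatic audio classification  Total citations: 8, self-citations: 1, citations by others: 8
Audio feature analysis and extraction are the basis of automatic audio classification. This paper divides audio into five classes (silence, noise, pure speech, speech with background sound, and music) and analyzes in depth, at both the frame level and the segment level, the features that distinguish the classes. Frame-level features include MFCC, frequency-domain energy, sub-band energy, zero-crossing rate, and spectral centroid. From these, basic segment-level features are computed, including the silence ratio and the mean sub-band energy ratio, and three audio "stream" features are proposed: the High-ZCR ratio, the Low-Frequency-Energy ratio, and spectrum flux. An automatic classifier based on the support vector machine is designed and implemented, and the classification performance of the resulting feature set is examined with it. Experiments show that the proposed features are effective and that classification performance is good.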
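To make the frame-level and segment-level distinction concrete, here is a small sketch of the zero-crossing rate per frame and a High-ZCR ratio per segment. The abstract does not define the High-ZCR ratio; the definition used here (share of frames whose ZCR exceeds 1.5 times the segment mean) and the names `zero_crossing_rate`, `high_zcr_ratio`, and `factor` are assumptions for illustration.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in the frame whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def high_zcr_ratio(samples, frame_len, factor=1.5):
    """Share of frames whose ZCR exceeds `factor` times the segment's mean ZCR.

    The threshold factor is an assumed value, not one given in the abstract.
    Frames are non-overlapping; a trailing partial frame is dropped.
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    if not frames:
        return 0.0
    rates = [zero_crossing_rate(f) for f in frames]
    mean_rate = sum(rates) / len(rates)
    return sum(1 for r in rates if r > factor * mean_rate) / len(rates)


# Three flat frames followed by one rapidly alternating frame.
segment = [1, 1, 1, 1] * 3 + [1, -1, 1, -1]
print(high_zcr_ratio(segment, frame_len=4))
```

Only the alternating frame stands out from the segment average, so one frame in four is counted as high-ZCR.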
4.
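This paper analyzes time-domain audio features and their extraction methods, studies the workflow and architecture of an SVM-based speech classification system and the design of the SVM speech classifier, and reports related experiments. The results show that the SVM-based audio classification system classifies audio effectively, with an average recognition accuracy above 90%.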
5.
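Audio classification is widely used in multimedia applications, mainly through time-domain and frequency-domain analysis. This paper proposes an audio classification method based on the adaptive pitch ratio (APR) algorithm and the support vector machine (SVM) algorithm: APR first separates speech from non-speech, and the non-speech audio is then classified by SVM. The APR algorithm distinguishes speech from non-speech by comparing the PR parameter with a threshold, and it is closely tied to the signal-to-noise ratio. Non-speech audio is divided into four groups (music, car, meeting, and rain), from which feature factors are extracted. Experimental results show that the classifier reaches an accuracy of 93.75% or higher and separates the audio types well.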
6.
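Audio classification based on one-class support vector machines  Total citations: 1, self-citations: 0, citations by others: 1
This paper studies an audio classification method based on one-class support vector machines, in which each class independently obtains its own decision function and a sample is assigned to the class whose decision function gives the largest value. Speech feature vectors are extracted with the wavelet packet transform and multiple feature vectors are fused, and audio is divided into five classes: pure speech, music, environmental sound, speech with background sound, and silence. Experimental results show that the method achieves good classification accuracy and outperforms Bayesian, hidden Markov model, and neural network classifiers.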
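The decision rule described here (one independent decision function per class, pick the largest) can be sketched as follows. To stay self-contained the sketch does not train one-class SVMs; `make_centroid_scorer` is a hypothetical stand-in that scores a sample by its negative distance to the class centroid. Only the argmax step in `classify` corresponds to the rule stated in the abstract.

```python
import math


def make_centroid_scorer(training_vectors):
    """Stand-in for a per-class decision function (NOT a one-class SVM).

    Returns a function whose value is higher the closer a sample lies to the
    centroid of the training vectors of that single class.
    """
    dim = len(training_vectors[0])
    centroid = [sum(v[d] for v in training_vectors) / len(training_vectors)
                for d in range(dim)]

    def score(x):
        return -math.dist(x, centroid)

    return score


def classify(x, scorers):
    """Assign x to the class whose decision function returns the largest value."""
    return max(scorers, key=lambda label: scorers[label](x))


# Each class is modelled from its own samples only (toy 2-D feature vectors).
scorers = {
    "speech": make_centroid_scorer([[0.0, 0.0], [0.0, 2.0]]),
    "music": make_centroid_scorer([[10.0, 10.0], [12.0, 10.0]]),
}
print(classify([1.0, 1.0], scorers))
```

Because every class is modelled separately, adding a new class only requires fitting one more decision function; the existing ones are left untouched.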
7.
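Multi-class audio example recognition based on fractional Brownian motion and AdaBoost  Total citations: 2, self-citations: 0, citations by others: 2
This paper proposes an audio feature extraction and recognition method based on fractional Brownian motion. The method uses a fractional Brownian motion model to compute the fractal dimension of an audio example, which serves as its fractal feature. Because the audio fractal features follow a Gaussian distribution, the AdaBoost algorithm is used for feature reduction. The audio is then classified on the reduced features with an Ada-weighted Gaussian classifier and with a support vector machine, and a multi-class model is built on top of the two-class classifiers. Experiments show that the reduced fractal features outperform other audio features in both music and speech classification.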
8.
9.
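An automatic news audio classification method based on selective ensemble SVM  Total citations: 1, self-citations: 0, citations by others: 1
As an important cue for video retrieval, audio detection and classification has attracted wide attention and become an active research direction. Building on a prior model and structure of news video, this paper proposes a classifier design method based on selective ensemble SVM (SEN-SVM), which divides news video into four types: silence, music, speech, and speech with background music. Simulation experiments on 8,514 s of real news audio show that the proposed algorithm reaches an average accuracy of 98.2%, far higher than both the plain SVM-based method and the traditional threshold-based method.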
10.
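To address the difficulty of feature extraction in mechanical fault diagnosis and the difficulty conventional SVM algorithms have with multi-class problems, this paper proposes WPA-SVM, a hybrid multi-class fault diagnosis model that combines wavelet packet analysis (WPA) with a binary-tree-based multi-level SVM classifier. Wavelet packet analysis extracts frequency-domain energy feature vectors from the mechanical signals. Training several binary-tree-based multi-level SVM classifiers, ordered by fault priority, identifies the support vectors among the samples, which in turn determine the hyperplanes. The test-set samples are then diagnosed according to the optimal separating hyperplanes. An experimental comparison of two feature extraction methods and three SVM recognition strategies shows that the method is effective.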
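As a rough illustration of wavelet packet energy features, the sketch below builds a full wavelet packet tree and returns the normalised energy of each leaf node. The abstract does not name the wavelet or the decomposition depth; the Haar wavelet, the function names, and the `levels` parameter are assumptions chosen to keep the example short.

```python
import math


def haar_split(signal):
    """One Haar analysis step: return (approximation, detail) at half length."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def wavelet_packet_energies(signal, levels):
    """Normalised energy of each of the 2**levels leaf nodes.

    Unlike the plain wavelet transform, every node (approximation and detail)
    is split again at each level. The signal length should be a multiple of
    2**levels. Leaves are returned in natural (split) order, not sorted by
    frequency.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_split(node)]
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    return [e / total for e in energies] if total else energies


# A constant signal puts all of its energy in the first (lowest) band.
print(wavelet_packet_energies([1.0] * 8, levels=2))
```

The resulting vector has a fixed length regardless of signal duration, which is what makes it convenient as classifier input.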
11.
Audio classification is an essential task in multimedia content analysis, which is a prerequisite to a variety of tasks such
as segmentation, indexing and retrieval. This paper describes our study on multi-class audio classification on broadcast news,
a popular multimedia repository with rich audio types. Motivated by the tonal regulations of music, we propose two pitch-density-based
features, namely average pitch-density (APD) and relative tonal power density (RTPD). We use an SVM binary tree (SVM-BT) to
hierarchically classify an audio clip into five classes: pure speech, music, environment sound, speech with music and speech
with environment sound. Since SVM is a binary classifier, we use the SVM-BT architecture to realize coarse-to-fine multi-class
classification with high accuracy and efficiency. Experiments show that the proposed one-dimensional APD and RTPD features
are able to achieve comparable accuracy with popular high-dimensional features in speech/music discrimination, and the SVM-BT
approach demonstrates superior performance in multi-class audio classification. With the help of the pitch-density-based features,
we can achieve a high average accuracy of 94.2% in the five-class audio classification task.
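The coarse-to-fine idea behind a binary tree of classifiers can be sketched in a few lines. The abstract does not give the order of the splits, so the tree below is hypothetical, and simple threshold rules on made-up feature names (`has_speech`, `background`, `tonal`) stand in for trained SVMs. Only the traversal logic is the point.

```python
class Node:
    """Internal node (has `decide`) or leaf (has `label`) of a binary decision tree."""

    def __init__(self, decide=None, yes=None, no=None, label=None):
        self.decide = decide
        self.yes = yes
        self.no = no
        self.label = label


def leaf(label):
    return Node(label=label)


def classify(node, features):
    """Walk from the root, letting each binary decider pick a branch, until a leaf."""
    while node.label is None:
        node = node.yes if node.decide(features) else node.no
    return node.label


# Hypothetical tree: split order and threshold rules are placeholders for SVMs.
tree = Node(
    decide=lambda f: f["has_speech"] > 0.5,
    yes=Node(
        decide=lambda f: f["background"] < 0.2,
        yes=leaf("pure speech"),
        no=Node(
            decide=lambda f: f["tonal"] > 0.5,
            yes=leaf("speech with music"),
            no=leaf("speech with environment sound"),
        ),
    ),
    no=Node(
        decide=lambda f: f["tonal"] > 0.5,
        yes=leaf("music"),
        no=leaf("environment sound"),
    ),
)

print(classify(tree, {"has_speech": 0.9, "background": 0.8, "tonal": 0.9}))
```

A tree with five leaves needs only four binary classifiers, and each sample passes through at most three of them here, compared with ten pairwise classifiers in a one-against-one scheme.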
12.
Content-based audio classification and segmentation by using support vector machines  Total citations: 9, self-citations: 0, citations by others: 9
Content-based audio classification and segmentation is a basis for further audio/video analysis. In this paper, we present
our work on audio segmentation and classification which employs support vector machines (SVMs). Five audio classes are considered
in this paper: silence, music, background sound, pure speech, and non-pure speech, which includes speech over music and speech
over noise. A sound stream is segmented by classifying each sub-segment into one of these five classes. We have evaluated
the performance of SVM on the classification of different audio type pairs with testing units of different lengths and compared the
performance of SVM, K-Nearest Neighbor (KNN), and Gaussian Mixture Model (GMM). We also evaluated the effectiveness of some
newly proposed features. Experiments on a database composed of about 4 hours of audio data show that the proposed classifier is
very efficient on audio classification and segmentation. They also show that the accuracy of the SVM-based method is much better
than that of the methods based on KNN and GMM.
13.
Guobin Ou. Pattern Recognition, 2007, 40(1): 4-18
Multi-class pattern classification has many applications including text document classification, speech recognition, object recognition, etc. Multi-class pattern classification using neural networks is not a trivial extension from two-class neural networks. This paper presents a comprehensive and competitive study in multi-class neural learning with focuses on issues including neural network architecture, encoding schemes, training methodology and training time complexity. Our study includes multi-class pattern classification using either a system of multiple neural networks or a single neural network, and modeling pattern classes using one-against-all, one-against-one, one-against-higher-order, and P-against-Q. We also discuss implementations of these approaches and analyze training time complexity associated with each approach. We evaluate six different neural network system architectures for multi-class pattern classification along the dimensions of imbalanced data, large number of pattern classes, large vs. small training data through experiments conducted on well-known benchmark data.
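Of the class-modeling schemes listed, one-against-one is easy to show in miniature: one binary decider per pair of classes, combined by majority vote. In the sketch below the deciders are nearest-prototype rules on a single number, used purely as stand-ins for trained binary networks; `make_pair_decider`, `prototypes`, and the class names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def make_pair_decider(a, b, prototypes):
    """Stand-in binary decider for the pair (a, b): pick the nearer prototype."""
    def decide(x):
        return a if abs(x - prototypes[a]) <= abs(x - prototypes[b]) else b
    return decide


def one_vs_one_predict(x, classes, pairwise):
    """Let every pairwise decider vote and return the class with the most votes."""
    votes = Counter(pairwise[(a, b)](x) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]


classes = ["a", "b", "c"]
prototypes = {"a": 0.0, "b": 5.0, "c": 10.0}  # toy 1-D class centres
pairwise = {(p, q): make_pair_decider(p, q, prototypes)
            for p, q in combinations(classes, 2)}

print(one_vs_one_predict(4.9, classes, pairwise))
```

With k classes this scheme needs k(k-1)/2 binary deciders, against k for one-against-all, which is one reason training time differs between the approaches.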
14.
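To address sample redundancy in ultrasound images, the high similarity between different standard views caused by disease, and inaccurate localization of regions of interest, this paper proposes a method for classifying standard views in transesophageal echocardiography (TEE) that combines bag-of-features (BOF) features, active learning, and an improved multi-class AdaBoost algorithm. First, the BOF method is used to describe the ultrasound images. Next, active learning selects the samples most valuable to the classifier as the training set. Finally, during AdaBoost's iterative training of weak classifiers, the sample update rule is adjusted according to how the interim strong classifier classifies the samples, which yields the improved multi-class AdaBoost algorithm and the classification of TEE standard views. Experiments on a TEE dataset and three UCI datasets show that, compared with AdaBoost.SAMME, multi-class support vector machine (SVM), BP neural network, and AdaBoost.M2, the proposed algorithm improves the G-mean, the overall classification accuracy, and the accuracy on most classes to varying degrees on every dataset, with the largest gains on the classes that are hardest to separate. The results indicate that classifier performance improves significantly on datasets containing similar samples across classes.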
15.
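Data produced in many real-world domains is often multi-class and imbalanced. In multi-class imbalanced classification, problems such as class overlap, noise, and multiple minority classes degrade classifier performance, and solving the multi-class imbalance problem effectively has become an important research topic in machine learning and data mining. Drawing on recent literature on multi-class imbalanced classification, this survey analyzes and summarizes methods from two perspectives, data preprocessing and algorithm-level classification, and examines all the algorithms in detail in terms of strengths, weaknesses, and datasets. For data preprocessing, it covers oversampling, undersampling, hybrid sampling, and feature selection, and compares the performance of algorithms evaluated on the same datasets. Algorithm-level methods are presented and analyzed along three lines: base-classifier optimization, ensemble learning, and multi-class decomposition techniques. Finally, future directions for research on multi-class imbalanced data classification are summarized.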
16.
This paper proposes a hierarchical time-efficient method for audio classification and also presents an automatic procedure
to select the best set of features for audio classification using the Kolmogorov-Smirnov test (KS-test). The main motivation for
our study is to propose a framework for a general-genre (e.g., action, comedy, drama, documentary, musical, etc.) movie video
abstraction scheme for embedded devices, based only on the audio component. Accordingly, simple audio features are extracted
to ensure the feasibility of real-time processing. Five audio classes are considered in this paper: pure speech, pure music
or songs, speech with background music, environmental noise and silence. Audio classification is processed in three stages:
(i) silence or environmental noise detection, (ii) speech and non-speech classification and (iii) pure music or songs and
speech with background music classification. The proposed system has been tested on various real-time audio sources extracted
from movies and TV programs. Our experiments in the context of real-time processing have shown that the algorithms produce very
satisfactory results.
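One way a KS-test can drive feature selection is to rank features by how far apart their distributions are in two classes. The sketch below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical CDFs) and sorts features by it. The abstract does not describe the exact selection procedure, so the ranking step, the function names, and the toy feature values are assumptions.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past every value <= x, then compare the CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def rank_features(class_a, class_b):
    """Order feature names from most to least separating between two classes."""
    return sorted(class_a,
                  key=lambda name: ks_statistic(class_a[name], class_b[name]),
                  reverse=True)


# Toy per-class feature values (hypothetical numbers).
speech = {"zcr": [1, 2, 3], "energy": [1, 2, 3]}
music = {"zcr": [4, 5, 6], "energy": [1, 2, 4]}
print(rank_features(speech, music))
```

The statistic makes no assumption about the shape of the distributions and costs only a sort per feature, which suits the abstract's emphasis on simple, real-time-friendly processing.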
17.
The audio channel conveys rich clues for content-based multimedia indexing. Interesting audio analysis includes, besides widely
known speech recognition and speaker identification problems, speech/music segmentation, speaker gender detection, special
effect recognition such as gun shots or car pursuit, and so on. All these problems can be considered as an audio classification
problem which needs to generate a label from low-level audio signal analysis. While most audio analysis techniques in the literature
are problem specific, we propose in this paper a general framework for audio classification. The proposed technique uses a
perceptually motivated model of the human perception of audio classes in the sense that it makes a judicious use of certain
psychophysical results and relies on a neural network for classification. In order to assess the effectiveness of the proposed
approach, large experiments on several audio classification problems have been carried out, including speech/music discrimination
in Radio/TV programs, gender recognition on a subset of the switchboard database, highlights detection in sports videos, and
musical genre recognition. The classification accuracies of the proposed technique are comparable to those obtained by problem
specific techniques while offering the basis of a general approach for audio classification.
Author contact: Liming Chen