1.
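News Video Scene Segmentation Based on Semantic Information Extraction  Total citations: 3, self-citations: 1, citations by others: 3
With the widespread use of digital video, video database systems have become a research focus in multimedia. Scene segmentation is an important yet difficult problem in building a video database. Starting from an analysis of the structural characteristics specific to news video scenes, this paper proposes a new news video scene segmentation method based on semantic information extraction. The method segments news video into scenes by extracting and analyzing semantic cues from the audio and video streams: shot changes, anchorperson shots, topic captions, and silent intervals. Experiments show that the method reaches a scene segmentation accuracy of 86.9% and handles news video scene segmentation well.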
2.
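Audio Classification and Segmentation Based on Support Vector Machines  Total citations: 8, self-citations: 0, citations by others: 8
Audio classification and segmentation are important means of extracting the structure and content semantics of audio, and form the basis of content-based audio and video retrieval and analysis. The support vector machine (SVM) is an effective statistical learning method. This paper proposes an SVM-based audio classification algorithm that divides audio into five classes: silence, noise, music, pure speech, and speech with background sound. After classification, three smoothing rules are applied to the results. The paper analyzes the performance of the SVM classifier and evaluates how the newly proposed audio features perform on it. Experimental results show that the SVM-based algorithm classifies well and that the audio segmentation obtained after smoothing is fairly accurate.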
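The abstract does not spell out its three smoothing rules. As a rough illustration of what one such rule might look like, the sketch below relabels any single segment whose two neighbours agree with each other but not with it. The rule itself, the function name `smooth_labels`, and the label strings are assumptions made for illustration and are not taken from the paper.

```python
def smooth_labels(labels):
    """Relabel isolated segments whose two neighbours share a different label.

    Hypothetical rule for illustration only; the paper's three smoothing
    rules are not given in the abstract.
    """
    smoothed = list(labels)
    for i in range(1, len(labels) - 1):
        # Left neighbour is read from the already-smoothed output so that a
        # correction can propagate; right neighbour is read from the input.
        prev_label = smoothed[i - 1]
        next_label = labels[i + 1]
        if prev_label == next_label and labels[i] != prev_label:
            smoothed[i] = prev_label
    return smoothed
```

Applied to per-segment labels such as `["speech", "speech", "music", "speech", "noise", "noise"]`, the lone `"music"` segment is absorbed into the surrounding speech, while the boundary between speech and noise is left alone.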
4.
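Content-Based Audio Retrieval: Concepts and Methods  Total citations: 38, self-citations: 1, citations by others: 37
Retrieval of visual media such as images and video has been studied extensively. Audio, however, is also a typical medium in multimedia and a common carrier of information. Conventional processing treats digital audio as an unstructured stream. Yet audio carries speech, contains rich auditory features, and has structure, so it both needs to be and can be accessed on the basis of this content. Drawing on current research, this paper surveys content-based audio retrieval methods, including retrieval oriented to speech, music, and audio analysis, as well as audio segmentation. It also analyzes and summarizes audio content and its ...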
7.
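As one of the media that make up multimedia, audio signals carry rich auditory semantics, yet current multimedia retrieval relies mainly on visual information and neglects audio. To address this gap, the paper presents a prototype audio semantic retrieval system that processes audio hierarchically. It first analyzes features such as short-time energy, zero-crossing rate, and fundamental-frequency energy ratio, and coarsely divides the audio stream into four classes: silence, harmonic music, dialogue, and environmental background sound. Because environmental background sound carries a great deal of semantics, it is subdivided further, and each class of environmental background sound is represented by a trained hidden Markov chain for semantic retrieval. Experimental data show that this approach to audio query processing works well.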
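Two of the frame-level features named in this abstract, short-time energy and zero-crossing rate, have standard textbook definitions. The sketch below computes them in plain Python. The framing helper, the function names, and the choice to normalise by frame length are assumptions, not details from the paper.

```python
def split_frames(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames.

    A trailing partial frame is dropped.
    """
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)
```

A frame with very low short-time energy is a natural candidate for the silence class, while a high zero-crossing rate is typical of noisy or unvoiced content. Any thresholds for such decisions would have to be tuned on real data.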
10.
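Feature Analysis and Extraction in Automatic Audio Classification  Total citations: 8, self-citations: 1, citations by others: 8
Audio feature analysis and extraction underlie automatic audio classification. This paper divides audio into five classes: silence, noise, pure speech, speech with background sound, and music. It analyzes the features that discriminate between classes at both the frame level and the segment level. Frame-level features include MFCC, frequency-domain energy, sub-band energy, zero-crossing rate, and spectral centroid. From these, basic segment-level features are computed, including silence ratio and mean sub-band energy ratio. Three audio "stream" features are proposed: High-ZCR ratio, Low-Frequency-Energy ratio, and spectrum flux. An automatic classifier based on a support vector machine is designed and implemented, and the classification performance of the feature set built from the above features is examined on it. Experiments show that the proposed features are effective and that classification performance is good.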
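The abstract names the High-ZCR ratio without defining it. One formulation found in the audio classification literature counts the fraction of frames in a segment whose zero-crossing rate exceeds a multiple of the segment's mean zero-crossing rate. The sketch below follows that reading. The factor of 1.5 and the function name are assumptions and may differ from what the paper actually uses.

```python
def high_zcr_ratio(frame_zcrs, factor=1.5):
    """Fraction of frames whose ZCR exceeds factor * mean ZCR of the segment.

    frame_zcrs: per-frame zero-crossing rates for one segment.
    factor: assumed threshold multiplier, not taken from the paper.
    """
    mean_zcr = sum(frame_zcrs) / len(frame_zcrs)
    above = sum(1 for z in frame_zcrs if z > factor * mean_zcr)
    return above / len(frame_zcrs)
```

Speech alternates between voiced and unvoiced sounds, so its per-frame ZCR tends to vary more than that of music. A segment-level ratio of this kind is meant to capture that variation.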
11.
Content-based audio classification and segmentation by using support vector machines  Total citations: 9, self-citations: 0, citations by others: 9
Content-based audio classification and segmentation is a basis for further audio/video analysis. In this paper, we present our work on audio segmentation and classification which employs support vector machines (SVMs). Five audio classes are considered in this paper: silence, music, background sound, pure speech, and non-pure speech, which includes speech over music and speech over noise. A sound stream is segmented by classifying each sub-segment into one of these five classes. We have evaluated the performance of SVM on classification of different audio type pairs with testing units of different lengths, and compared the performance of SVM, K-Nearest Neighbor (KNN), and Gaussian Mixture Model (GMM). We also evaluated the effectiveness of some newly proposed features. Experiments on a database composed of about 4 hours of audio data show that the proposed classifier is very efficient on audio classification and segmentation. They also show that the accuracy of the SVM-based method is much better than that of the methods based on KNN and GMM.
12.
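Feature Analysis in Automatic Speech/Music Classification  Total citations: 16, self-citations: 0, citations by others: 16
The paper analyzes the features that discriminate speech from music, including perceptual features such as pitch, brightness, and harmonicity, as well as MFCCs (Mel-Frequency Cepstral Coefficients). It proposes a left-right DHMM (Discrete Hidden Markov Model) classifier that uses maximum likelihood as the decision rule to classify speech, music, and mixtures of the two, and examines how the feature set performs in that classifier. Experimental results show that the proposed audio features are effective and reasonable, and that classification performance is good.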
13.
Indexing and Retrieval of Audio: A Survey  Total citations: 3, self-citations: 0, citations by others: 3
With more and more audio being captured and stored, there is a growing need for automatic audio indexing and retrieval techniques that can retrieve relevant audio pieces quickly on demand. This paper provides a comprehensive survey of audio indexing and retrieval techniques. We first describe the main audio characteristics and features and discuss techniques for classifying audio into speech and music based on these features. Indexing and retrieval of speech and music are then described separately. Finally, the significance of audio in multimedia indexing and retrieval is discussed.
14.
This paper proposes a hierarchical, time-efficient method for audio classification and also presents an automatic procedure to select the best set of features for audio classification using the Kolmogorov-Smirnov test (KS-test). The main motivation for our study is to propose a framework for a general-genre (e.g., action, comedy, drama, documentary, musical) movie video abstraction scheme for embedded devices, based only on the audio component. Accordingly, simple audio features are extracted to ensure the feasibility of real-time processing. Five audio classes are considered in this paper: pure speech, pure music or songs, speech with background music, environmental noise, and silence. Audio classification is processed in three stages: (i) silence or environmental noise detection, (ii) speech and non-speech classification, and (iii) pure music or songs and speech with background music classification. The proposed system has been tested on various real-time audio sources extracted from movies and TV programs. Our experiments in the context of real-time processing have shown that the algorithms produce very satisfactory results.
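The abstract does not say how the KS-test drives feature selection. One plausible reading is to rank each candidate feature by the two-sample Kolmogorov-Smirnov statistic between its values in two audio classes, since a larger gap between the empirical distributions suggests better separation. The sketch below implements that reading with the standard library only. The function names and the dictionary layout are assumptions.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The empirical CDFs only change at observed values, so checking the
    # pooled sample points is enough to find the maximum gap.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def rank_features(class_a, class_b):
    """Order feature names from most to least separating.

    class_a and class_b map a feature name to the list of that feature's
    values observed for one audio class.
    """
    scores = {name: ks_statistic(class_a[name], class_b[name])
              for name in class_a}
    return sorted(scores, key=scores.get, reverse=True)
```

A statistic of 1.0 means the two classes do not overlap at all on that feature, and 0.0 means their empirical distributions coincide.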
15.
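Audio feature extraction is the basis of audio classification, which in turn is key to content-based audio retrieval. Building on an analysis of the features that discriminate speech from music, this paper proposes a method for audio feature extraction and classification based on the wavelet transform and support vector machines. The method classifies pure speech, music, speech with background music, and environmental sound, and the new feature set is evaluated on an SVM classifier. Experimental results show that the proposed audio features are effective and reasonable, and that classification performance is good.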
16.
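Speech/music discrimination is an important step in audio processing and analysis tasks such as efficient audio coding, audio retrieval, and automatic speech recognition. This paper proposes a novel speech/music segmentation and classification method. It first detects audio change points from the difference in mean-square energy between adjacent frames, which yields the segmentation. It then extracts an eight-dimensional feature vector from each audio segment, including low-band energy variance ratio, cepstral energy modulation, and entropy modulation, and classifies the segments with an artificial neural network. Experimental results show that the algorithm and features achieve high segmentation and classification accuracy.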
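The segmentation step in this abstract, detecting change points from the mean-square energy difference between adjacent frames, can be sketched as follows. The fixed threshold and the function names are assumptions. The abstract does not say how the paper sets its decision criterion.

```python
def mean_square_energy(frame):
    """Mean of the squared sample values in one frame."""
    return sum(x * x for x in frame) / len(frame)


def change_points(frames, threshold):
    """Indices i where the energy of frame i differs from frame i-1 by more
    than threshold (an assumed, hand-picked value)."""
    energies = [mean_square_energy(f) for f in frames]
    return [i for i in range(1, len(energies))
            if abs(energies[i] - energies[i - 1]) > threshold]
```

For five frames whose energies are roughly 0.01, 0.01, 1.0, 1.0, and 0.01, a threshold of 0.5 marks frames 2 and 4 as boundaries, splitting the stream into a quiet segment, a loud segment, and another quiet one.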
17.
Audio classification is an essential task in multimedia content analysis, which is a prerequisite to a variety of tasks such as segmentation, indexing and retrieval. This paper describes our study on multi-class audio classification on broadcast news, a popular multimedia repository with rich audio types. Motivated by the tonal regulations of music, we propose two pitch-density-based features, namely average pitch-density (APD) and relative tonal power density (RTPD). We use an SVM binary tree (SVM-BT) to hierarchically classify an audio clip into five classes: pure speech, music, environment sound, speech with music and speech with environment sound. Since SVM is a binary classifier, we use the SVM-BT architecture to realize coarse-to-fine multi-class classification with high accuracy and efficiency. Experiments show that the proposed one-dimensional APD and RTPD features are able to achieve comparable accuracy with popular high-dimensional features in speech/music discrimination, and the SVM-BT approach demonstrates superior performance in multi-class audio classification. With the help of the pitch-density-based features, we can achieve a high average accuracy of 94.2% in the five-class audio classification task.
18.
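Audio Classification Based on One-Class Support Vector Machines  Total citations: 1, self-citations: 0, citations by others: 1
This paper studies an audio classification method based on one-class support vector machines, in which each class independently obtains its own decision function and a sample is assigned to the class whose decision function gives the maximum value. Feature vectors are extracted with the wavelet packet transform and multiple feature vectors are fused. Audio is divided into five classes: pure speech, music, environmental sound, speech with background sound, and silence. Experimental results show that the method has good classification accuracy and outperforms Bayesian, hidden Markov model, and neural network classifiers.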
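The decision rule in this abstract, one decision function per class with the sample assigned to whichever scores highest, can be sketched independently of how the functions are trained. In the sketch below, `make_centre_score` is a hypothetical stand-in for a trained one-class SVM decision function, used only so the example runs without any machine learning library.

```python
def classify(sample, decision_functions):
    """Return the class name whose decision function scores the sample highest.

    decision_functions maps a class name to a callable taking a sample.
    """
    return max(decision_functions,
               key=lambda name: decision_functions[name](sample))


def make_centre_score(centre):
    """Hypothetical stand-in for a trained one-class decision function:
    the score is higher the closer the sample lies to the class centre."""
    def score(sample):
        return -sum((s - c) ** 2 for s, c in zip(sample, centre))
    return score
```

With stand-in functions centred at `[0, 0]` for speech and `[5, 5]` for music, a sample at `[4, 4.5]` is assigned to music. Because each function is built separately, adding a class does not require retraining the others, which matches the independence the abstract describes.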
19.
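Multi-Class Audio Example Recognition Based on Fractional Brownian Motion and AdaBoost  Total citations: 2, self-citations: 0, citations by others: 2
This paper proposes an audio feature extraction and recognition method based on fractional Brownian motion. The method uses a fractional Brownian motion model to compute the fractal dimension of an audio example and takes it as the example's fractal feature. Because audio fractal features follow a Gaussian distribution, AdaBoost is used for feature reduction. An Ada-weighted Gaussian classifier and a support vector machine are then each applied to classify audio on the reduced features, and a multi-class model is built on top of the two-class classifiers. Experiments show that the reduced audio fractal features outperform other audio features in classifying both music and speech.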