
1.

孙晓彭晓琪胡敏任福继《电子与信息学报》2017,39(9):2048-2055

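This paper proposes a model based on a deep belief network (DBN) and multi-dimensional expanded features for sentiment classification of short Chinese microblog texts. To reduce the feature sparsity that hampers traditional text classification methods on short microblog posts, the social relationship network is introduced as an expanded feature: based on the social relationships between commenters and the blogger, relevant comments are extracted to expand the original post, and the expanded multi-dimensional features serve as the input to the DBN. The lower layers of the DBN are built by stacking multiple restricted Boltzmann machines (RBMs), which abstract the raw input and capture deep semantic features of the data. A classification RBM (ClassRBM) is stacked on top of the RBM layers to perform the final sentiment classification. Experimental results show that, with tuned model parameters and network structure, the resulting deep learning model outperforms shallow classifiers such as SVM and NB in sentiment classification. The experiments also show that the expanded multi-dimensional features improve short-text sentiment classification performance.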
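The layer-stacking idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses one step of contrastive divergence (CD-1, a common RBM training rule that the abstract does not specify), toy binary inputs, and no ClassRBM top layer. The names `RBM`, `cd1_step`, and `train_stack` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Tiny restricted Boltzmann machine (illustrative only)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.c))]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b))]

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 parameter update on a single example."""
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)   # reconstruction
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.c[j] += lr * (h0[j] - h1[j])


def train_stack(data, layer_sizes, epochs=20, seed=0):
    """Greedy layer-wise training: each RBM learns from the layer below it."""
    rng = random.Random(seed)
    rbms, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_step(v)
        rbms.append(rbm)
        # Hidden activation probabilities become the next layer's input.
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return rbms, inputs


# Toy binary vectors standing in for the expanded microblog features (made up).
toy_data = [[1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 0],
            [0, 0, 1, 1, 0, 1], [0, 1, 1, 1, 0, 1]]
stack, deep_features = train_stack(toy_data, layer_sizes=[4, 2])
```

In a full DBN classifier, `deep_features` would feed a supervised top layer (the ClassRBM in the abstract), which is omitted here.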

2.

Parkinson's disease detection method based on masked self-supervised speech feature extraction

季薇杨茗淇李云郑慧芬《电子与信息学报》2023,45(10):3502-3510

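Parkinson's disease is a common chronic neurological disorder, and dysarthria is one of its early symptoms. Speech-based support for diagnosis and treatment helps detect the disease earlier and monitor its progression. Traditional methods usually assess the disease by computing parameters of speech features such as jitter (frequency perturbation) and shimmer (amplitude perturbation). However, these features may not reflect all pathological phenomena, which limits the accuracy of detection and assessment. To better extract pathological information from the speech of Parkinson's patients and improve that accuracy, this paper proposes a Parkinson's disease detection method based on masked self-supervised speech feature extraction. First, Mel spectrogram features are extracted from the patients' raw speech, giving a global, time-ordered representation rich in pathological features. Then part of the Mel spectrogram features is masked, and a masked self-supervised model reconstructs the masked part, thereby learning a higher-level representation of the patients' speech features. To address the scarcity of Parkinson's speech data, the masked self-supervised model is first pre-trained on the public LibriSpeech dataset and then, following the idea of transfer learning, fine-tuned on Parkinson's speech data with weighted summation to improve its feature representation learning. Finally, random forest and support vector machine classifiers are each used to classify the extracted speech features and detect Parkinson's disease. The method is validated with 10-fold cross-validation on the public MaxLittle dataset and on a dataset collected by the research group. The results show that, compared with the traditional Mel spectrogram feature detection method and other classical self-supervised feature extraction methods, the proposed method clearly improves accuracy, sensitivity, and specificity.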
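The masking step described in this abstract can be sketched as follows. This is a minimal illustration, assuming the spectrogram is a list of frames, that masking zeroes out contiguous time spans, and that the reconstruction loss is an L1 error over the masked frames only. The abstract does not state these details, and `mask_time_spans`, `masked_l1_loss`, and their parameters are hypothetical.

```python
import random


def mask_time_spans(spec, n_spans=2, span_len=3, seed=0):
    """Zero out contiguous spans of frames; return the masked copy and masked indices."""
    rng = random.Random(seed)
    masked = [frame[:] for frame in spec]
    masked_idx = set()
    for _ in range(n_spans):
        start = rng.randrange(0, len(spec) - span_len + 1)
        for t in range(start, start + span_len):
            masked[t] = [0.0] * len(spec[t])
            masked_idx.add(t)
    return masked, sorted(masked_idx)


def masked_l1_loss(prediction, target, masked_idx):
    """Mean absolute error computed only over the masked frames."""
    total, count = 0.0, 0
    for t in masked_idx:
        for p, y in zip(prediction[t], target[t]):
            total += abs(p - y)
            count += 1
    return total / count


# Toy "Mel spectrogram": 10 frames x 4 Mel bins with arbitrary positive values.
spec = [[float(t + m + 1) for m in range(4)] for t in range(10)]
masked_spec, idx = mask_time_spans(spec)

# A trained model would predict the hidden frames. Here the masked input stands in
# for a poor prediction and the original spectrogram for a perfect one.
loss_bad = masked_l1_loss(masked_spec, spec, idx)
loss_perfect = masked_l1_loss(spec, spec, idx)
```

Restricting the loss to masked frames forces the model to infer hidden content from context, which is the usual motivation for this kind of pre-training.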

3.

A robust elastic net approach for feature learning

《Journal of Visual Communication and Image Representation》2014,25(2):313-321

Unsupervised feature learning has drawn increasing attention in recent years, especially in visual representation. Traditional feature learning approaches assume that the training data contain little noise and that the number of samples is large relative to their dimensionality. Unfortunately, these assumptions are violated in most visual representation scenarios, and many feature learning approaches then fail to extract the important features. To this end, we propose a Robust Elastic Net (REN) approach to handle these problems. Our contributions are twofold. First, a novel feature learning approach is proposed that extracts features with a weighted elastic net. A distribution-induced weight function weights the importance of different samples, thus reducing the effect of outliers. Moreover, the REN feature learning approach can handle High Dimension, Low Sample Size (HDLSS) issues. Second, a REN classifier is proposed for object recognition; it can be used with generic visual representations, including those from REN feature extraction, which again reduces the effect of outliers in the samples. We validate the proposed REN feature learning approach and classifier on face recognition and background reconstruction. The experimental results show the robustness of the proposed approach to both corrupted/occluded samples and HDLSS issues.

4.

Raw speech emotion recognition based on the Sinc-Transformer model

俞佳佳金赟马勇姜芳艽戴妍妍《信号处理》2021,37(10):1880-1888

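Because manually extracting acoustic features is tedious in traditional speech emotion recognition, this paper proposes a Sinc-Transformer (SincNet Transformer) model that performs speech emotion recognition directly on raw speech signals. The model combines the advantages of the SincNet layer and the Transformer encoder. SincNet filters capture important narrow-band emotional features from the raw speech waveform, which guides the whole network during feature extraction and completes the shallow feature extraction from the raw signal. Two Transformer encoder layers then process these features a second time to extract deep feature vectors containing global context information. On four-class emotion classification with the Interactive Emotional Dyadic Motion Capture (IEMOCAP) database, the proposed Sinc-Transformer model achieves an accuracy of 64.14% and an unweighted average recall of 65.28%. Comparison with baseline models shows that the proposed model effectively improves speech emotion recognition performance.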
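The two metrics reported in this abstract, accuracy and unweighted average recall (UAR), can differ noticeably on imbalanced data. The sketch below shows how each is computed. The labels and predictions are made-up toy values, not results from the paper, and the four class names are an assumption.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy, imbalanced four-class example (illustrative values only).
y_true = ["neutral"] * 6 + ["angry"] * 2 + ["happy"] + ["sad"]
y_pred = ["neutral"] * 6 + ["angry", "neutral"] + ["neutral"] + ["sad"]

acc = accuracy(y_true, y_pred)                    # 8 of 10 correct -> 0.8
uar = unweighted_average_recall(y_true, y_pred)   # (1.0 + 0.5 + 0.0 + 1.0) / 4 -> 0.625
```

Accuracy is dominated by the majority class, while UAR penalizes a classifier that ignores rare classes, which is why emotion recognition papers often report both.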

5.

Robust Perceptual Wavelet Packet Features for Recognition of Continuous Kannada Speech

Mahadevaswamy Ravi D. J. 《Wireless Personal Communications》2021,121(3):1781-1804

An ASR system is built for continuous Kannada speech recognition. The acoustic and language models are created with the Kaldi toolkit. The speech database is collected from native male and female Kannada speakers; 80% of the collected speech data is used to train the acoustic models and 20% is used to test the system. The performance of the system is reported in terms of Word Error Rate (WER). Feature extraction uses Wavelet Packet Decomposition together with a Mel filter bank. The proposed feature extraction performs slightly better than conventional features such as MFCC and PLP in terms of WRA and WER under uncontrolled conditions. For the speech corpus collected in the Kannada language, the proposed features show an improvement in Word Recognition Accuracy (WRA) of 1.79% over baseline features.
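WER, the metric used in this abstract, is the word-level edit distance between the reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a minimal illustration with made-up English sentences. It also assumes WRA = 1 - WER, which is one common definition and may not be the one the authors use.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i
    for j in range(len(hyp) + 1):
        dist[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # match or substitution
    return dist[-1][-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
wra = 1.0 - wer  # assumed definition of word recognition accuracy
```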

6.

Event recognition based on deep belief networks (total citations: 2; self-citations: 0; citations by others: 2)

张亚军刘宗田周文《电子学报》2017,45(6):1415

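Event recognition is an important foundation of information extraction. To overcome the shortcomings of existing event recognition methods, this paper proposes an event recognition model based on deep learning. First, candidate words are obtained with a word segmentation system and divided into five types. Six recognition features are then selected, and corresponding feature representation rules are defined to convert words into vector samples. Finally, a deep belief network extracts the deep semantic information of the words, and a back-propagation (BP) neural network recognizes the events. Experiments show that the model reaches a maximum F-score of 85.17%. The paper also proposes a hybrid-supervised deep belief network that combines unsupervised and supervised learning; this network improves recognition (F-score of 89.2%) while keeping training time under control (an increase of 27.50%).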

7.

An enhance excavation equipments classification algorithm based on acoustic spectrum dynamic feature

Jiuwen Cao Wuhao Huang Tuo Zhao Jianzhong Wang Ruirong Wang 《Multidimensional Systems and Signal Processing》2017,28(3):921-943

Underground pipeline network surveillance systems have attracted increasing attention recently because of severe breakages caused by external excavation equipment in mainland China. In this paper, we study an excavation equipment classification algorithm based on acoustic signal processing and machine learning. A cross-layer microphone array with four elements is designed to collect an acoustic database of representative excavation equipment on real construction sites. The generalized sidelobe canceller algorithm is employed for background noise reduction. An improved spectrum dynamic feature extraction algorithm is then used to construct the benchmark acoustic feature database of excavation equipment. To perform classification and background noise identification, a single-hidden-layer feedforward neural network is employed as the classifier. An improved algorithm based on the popular extreme learning machine (ELM) is proposed for classifier learning. The leave-one-out cross-validation strategy is adopted to optimize the regularization parameter in ELM. Comprehensive experiments are conducted to test the effectiveness of the proposed algorithm. Comparisons with state-of-the-art classifiers and with Mel-frequency cepstral coefficient acoustic features are also provided to demonstrate the superiority of our approach.

8.

Spectrogram-based speech emotion recognition method using an auditory attention model

张昕然查诚宋鹏陶华伟赵力《信号处理》2016,32(9):1117-1125

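In speech emotion recognition, the noise environment, speaking style, and speaker characteristics can cause feature mismatch across experimental databases. From a phonetic point of view, this problem arises mostly in cross-database emotion recognition experiments. The mismatch between the trained acoustic model and the test utterances can cause speech emotion recognition performance to drop sharply. The selective attention acoustic model studied in this paper effectively detects varying emotional features. In addition, time-frequency atoms are used to improve the model so that it can extract salient features across speech databases for emotion recognition. Experimental results show that extracting features from cross-database emotion samples with the proposed method and then applying a typical classifier improves recognition performance by 9 percentage points, which confirms that the method is more robust across different databases.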

9.

Automatic Chinese dialect identification based on DBF

韩军《电声技术》2017,41(4)

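In Chinese dialect identification, traditional acoustic features are parameterized representations of the spectral features of the speech signal and often contain redundant information such as speaker, channel, and background noise. To address this, a deep neural network (DNN) is introduced into feature extraction, and a phoneme-level deep bottleneck feature (DBF) is proposed in an attempt to suppress the influence of redundant information in dialect speech at the feature level. The experimental section discusses the position of the bottleneck layer and its number of nodes. The results show that deep bottleneck features achieve a higher recognition rate than traditional acoustic features.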

10.

Speech enhancement method based on a convolutional recurrent network and non-local modules

李辉景浩严康华徐良浩《电子科技》2022,35(3):8-15

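Existing deep neural network speech enhancement methods neglect the importance of learning the phase spectrum, which results in unsatisfactory enhanced speech quality. To address this, the paper proposes a speech enhancement method based on a convolutional recurrent network and non-local modules. An encoder-decoder network is designed that takes the time-domain representation of the speech signal as the encoder input for deep feature extraction, thereby making full use of both the magnitude and the phase information of the signal. Non-local modules are added to the convolutional layers of the encoder and decoder to extract key features of the speech sequence while suppressing useless ones, and a gated recurrent unit network is introduced to capture the temporal correlations within the speech sequence. Experiments on the ST-CMDS Chinese speech dataset show that, compared with the unprocessed noisy speech, the enhanced speech produced by the proposed method improves quality and intelligibility by 61% and 7.93% on average, respectively.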

11.

Fast modulation recognition method for frequency- and phase-coded signals based on DBN

许程成张剑云黄健航谌诗娃《现代雷达》2018,40(2):33-39

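To improve modulation recognition of radar/communication frequency- and phase-coded signals at low signal-to-noise ratios and to reduce the complexity of feature extraction, a modulation recognition method based on a deep belief network (DBN) and fast feature extraction is proposed. Building on the FFT accumulation method (FAM), an extraction algorithm is proposed that converts cyclic spectrum estimate images into effective, recognizable feature vectors, and a DBN training and recognition framework is designed for modulation recognition of coded signals. Simulation results show that, compared with traditional methods, the proposed method has lower feature extraction and preprocessing complexity; that the extracted features clearly distinguish several typical coded modulation modes; that the DBN training and recognition framework is feasible and effective for modulation recognition of both radar and communication coded signals; and that it achieves higher recognition accuracy for radio coded signals at low signal-to-noise ratios.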

12.

Detection of landmines and underground utilities from acoustic and GPR images with a cepstral approach

Umar S. Khan Waleed Al-Nuaimy Fathi E. Abd El-Samie 《Journal of Visual Communication and Image Representation》2010,21(7):731-740

This paper introduces a cepstral approach for the automatic detection of landmines and underground utilities from acoustic and ground penetrating radar (GPR) images. The approach treats the task as a pattern recognition problem. Cepstral features are extracted from a group of images, which are first transformed to 1-D signals by lexicographic ordering. Mel-frequency cepstral coefficients (MFCCs) and polynomial shape coefficients are extracted from these 1-D signals to form a database of features, which is used to train a neural network. Target detection is performed by extracting features from any new image with the same method used in the training phase. These features are tested with the neural network to decide whether a target exists or not. Different domains are tested and compared for efficient feature extraction from the lexicographically ordered 1-D signals. Experimental results show the success of the proposed cepstral approach for landmine detection from both acoustic and GPR images at low as well as high signal-to-noise ratios (SNRs). Results also show that the discrete cosine transform (DCT) is the most appropriate domain for feature extraction.
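The first two steps named in this abstract, lexicographic ordering and a transform to the DCT domain, can be sketched as follows. This is a minimal illustration that assumes row-by-row ordering and an unnormalized DCT-II; the abstract does not specify either choice, and the MFCC and polynomial-coefficient extraction that follows in the paper is omitted. The function names and the tiny image are hypothetical.

```python
import math


def lexicographic_order(image):
    """Flatten a 2-D image into a 1-D signal by concatenating its rows."""
    return [pixel for row in image for pixel in row]


def dct_ii(signal):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n = len(signal)
    return [sum(x * math.cos(math.pi / n * (i + 0.5) * k)
                for i, x in enumerate(signal))
            for k in range(n)]


# Tiny 2 x 3 "image" with made-up pixel values.
image = [[1, 2, 3],
         [4, 5, 6]]
signal = lexicographic_order(image)   # [1, 2, 3, 4, 5, 6]
coeffs = dct_ii(signal)               # coeffs[0] is the plain sum of the pixels
```

Features would then be computed from `coeffs` rather than from `signal` when working in the DCT domain.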

13.

Insect classification based on ELM theory

徐源浩齐焕芳《电子科技》2015,28(3):33-37

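Applying machine vision to insect classification replaces traditional identification by eye and improves efficiency. Automatic recognition involves two main steps: insect feature extraction and classifier design. Following this recognition pipeline, the paper proposes an ELM-based insect recognition method that uses hybrid features. In the feature extraction stage, the hybrid features comprise color features, shape features, spatial-domain texture features, and spectral texture features. In the classifier design stage, an extreme learning machine is used because it learns quickly and generalizes well. Experimental results show that the method reaches an insect recognition accuracy of 97% with a short classifier training time, outperforming traditional automatic recognition methods.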

14.

Endpoint detection of noisy speech based on cepstral features (total citations: 44; self-citations: 0; citations by others: 44)

胡光锐韦晓东《电子学报》2000,28(10):95-97

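One cause of recognition errors in speech recognition systems is inaccurate endpoint detection. At high signal-to-noise ratios, locating the endpoints of speech correctly is not difficult. However, most practical speech recognition systems must work at low signal-to-noise ratios, where conventional endpoint detection methods, such as energy-based ones, do not work effectively in noise. This paper uses cepstral features to detect speech endpoints and proposes two algorithms for endpoint detection of noisy speech. The first algorithm uses cepstral distance in place of short-time energy as the decision threshold. The second improves hidden Markov model (HMM) based speech detection so that it adapts to changes in the noise. Experimental results show that the method detects the endpoints of noisy speech with high accuracy.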
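The first algorithm in this abstract replaces short-time energy with a cepstral distance. The sketch below illustrates that idea under stated assumptions: it uses one common form of the cepstral distance in dB, estimates the noise cepstrum from the leading frames, and applies a fixed threshold. The abstract gives none of these details, so `detect_speech`, the threshold value, the frame length, and the synthetic signal are all hypothetical.

```python
import cmath
import math
import random


def real_cepstrum(frame, n_coeffs=12):
    """First n_coeffs + 1 real-cepstrum coefficients via a naive DFT (short frames only)."""
    n = len(frame)
    log_mag = []
    for k in range(n):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        log_mag.append(math.log(abs(x_k) + 1e-10))
    # The log-magnitude spectrum of a real frame is even, so its inverse DFT is a cosine sum.
    return [sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
            for q in range(n_coeffs + 1)]


def cepstral_distance(c, c_ref):
    """Cepstral distance in dB between two coefficient vectors (one common form)."""
    tail = sum((a - b) ** 2 for a, b in zip(c[1:], c_ref[1:]))
    return 4.3429 * math.sqrt((c[0] - c_ref[0]) ** 2 + 2.0 * tail)


def detect_speech(frames, n_noise_frames=5, threshold_db=8.0):
    """Flag frames whose distance from the leading noise-only frames exceeds a threshold."""
    noise_ceps = [real_cepstrum(f) for f in frames[:n_noise_frames]]
    c_ref = [sum(col) / len(col) for col in zip(*noise_ceps)]
    distances = [cepstral_distance(real_cepstrum(f), c_ref) for f in frames]
    return distances, [d > threshold_db for d in distances]


rng = random.Random(0)
N = 64  # frame length in samples (assumed)


def noise_frame():
    return [rng.uniform(-0.01, 0.01) for _ in range(N)]


# Six noise-only frames followed by one frame containing a tone as stand-in "speech".
frames = [noise_frame() for _ in range(6)]
tone = [math.sin(2 * math.pi * 5.5 * t / N) for t in range(N)]
frames.append([s + w for s, w in zip(tone, noise_frame())])

distances, flags = detect_speech(frames)
```

In practice the threshold and the noise estimate would be adapted to the recording conditions; the fixed values here only keep the example short.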

15.

Pedestrian recognition method based on deep hierarchical feature representation

孙锐张广海高隽《电子与信息学报》2016,38(6):1528-1535

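To address the feature representation problem in pedestrian recognition, this paper proposes a hierarchical feature representation method with a hybrid structure that combines the representational power of the bag-of-words structure with the learning adaptability of a deep hierarchical structure. First, local features are extracted with the gradient-based HOG local descriptor and then encoded by a deep hierarchical coding method composed of spatially aggregating restricted Boltzmann machines. For each coding layer, unsupervised restricted Boltzmann machine learning is performed with sparsity and selectivity regularization, and supervised fine-tuning is then applied to strengthen the visual feature representation for the classification task. Max pooling and the spatial pyramid method are used to obtain the high-level image feature representation. Finally, a linear support vector machine performs pedestrian recognition. The extracted deep hierarchical features naturally separate out occlusions and other parts unrelated to the target, which effectively improves the accuracy of subsequent recognition. Experimental results show that the proposed method achieves a high recognition rate.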

16.

Auditory models for speech recognition and a comparison with conventional methods

黄泰翼高雨青《电子学报》1993,21(10):1-6

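Building on the peripheral auditory system and part of the central auditory nervous system established in reference (1), this paper constructs a speech recognizer. An auditory model serves as the acoustic front-end processor for speech (that is, feature extraction), and a neural network with a tonotopic organization serves as the recognition classifier. Extensive experiments show that the feature parameters extracted by the auditory model not only represent the distinctive properties of speech well but also give a fairly robust representation of speech features in noisy environments. Speech recognition experiments show that, in the presence of noise, the recognition rate of the recognizer using auditory model parameters is markedly…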

17.

Recognition of the Communication Signals Using Particle Swarm Optimization and Support Vector Machine Based on the Multi-Resolution Wavelet Analysis

Ataollah Ebrahimzadeh Shrime Mahdi Yousefi 《Wireless Personal Communications》2012,63(4):847-860

Automatic recognition of communication signals plays an important role in various applications. This paper presents a novel intelligent system for recognizing digital communication signals. The system includes three main modules: a feature extraction module, a classifier module, and an optimization module. In the feature extraction module, multi-resolution wavelet analysis is proposed for extracting suitable features. In the classifier module, a multi-class support vector machine (SVM) is proposed as the classifier. In the optimization module, a particle swarm optimization algorithm is proposed to improve the generalization performance of the recognizer. This module optimizes the SVM classifier design by searching for the best values of the parameters that tune its discriminant function and, upstream, by looking for the best subset of features to feed the classifier. Simulation results show that the proposed hybrid intelligent system performs well even at very low signal-to-noise ratios (SNRs).
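The optimization module in this abstract uses particle swarm optimization (PSO) to search classifier parameters. The sketch below is a minimal, generic PSO, not the authors' system. It minimizes a stand-in quadratic function in place of a real SVM validation error, and the inertia and acceleration values, the search bounds, and the parameter names `log_c` and `log_gamma` are all assumptions.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO over a box-constrained search space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])   # pull toward own best
                             + c2 * r2 * (gbest[d] - pos[i][d]))     # pull toward swarm best
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def stand_in_error(params):
    """Hypothetical smooth surrogate for a validation error, minimal at (1.0, -2.0)."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = pso_minimize(stand_in_error,
                                       bounds=[(-3.0, 3.0), (-5.0, 1.0)])
```

To tune a real classifier, `stand_in_error` would be replaced by a function that trains the classifier with the given parameters and returns its cross-validated error.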

18.

Application of a similarity-constrained deep belief network to SAR image target recognition

丁军刘宏伟陈渤冯博王英华《电子与信息学报》2016,38(1):97-103

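Feature extraction is a key step in synthetic aperture radar (SAR) image target recognition. The speckle and non-smooth characteristics of SAR images make traditional feature extraction methods designed for optical images hard to apply. A deep belief network (DBN) can learn features automatically, but it is an unsupervised learning method, so the learned features are unrelated to the specific task. This paper proposes a model called the similarity-constrained restricted Boltzmann machine, which introduces supervised information during learning by constraining the similarity between feature vectors. In addition, multiple similarity-constrained restricted Boltzmann machines can be stacked into a new deep model, called the similarity-constrained deep belief network. Experimental results show that, in SAR image target recognition, the method achieves better recognition performance than principal component analysis (PCA) and the original DBN.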

19.

Comparison of Khasi Speech Representations with Different Spectral Features and Hidden Markov States

Bronson Syiem Sushanta Kabir Dutta Juwesh Binong Lairenlakpam Joyprakash Singh 《电子科技学刊:英文版》2021,19(2):155-162

In this paper, we present a comparison of Khasi speech representations with four different spectral features and a novel extension toward the development of Khasi speech corpora. The four features are linear predictive coding (LPC), linear prediction cepstrum coefficient (LPCC), perceptual linear prediction (PLP), and Mel frequency cepstral coefficient (MFCC). Ten hours of speech data were used for training and three hours for testing. For each spectral feature, different hidden Markov model (HMM) based recognizers were built with varying numbers of HMM states and different Gaussian mixture models (GMMs). The performance was evaluated using the word error rate (WER). The experimental results show that MFCC provides a better representation of Khasi speech than the other three spectral features.

20.

Uyghur speech emotion recognition using an atomic representation model

塔什甫拉提·尼扎木丁梁瑞宇谢跃赵力《信号处理》2020,36(1):9-17

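Existing representation-learning-based algorithms for speech emotion computing rely on a single restrictive condition, and their effectiveness has not been proven. To address this, a speech emotion recognition algorithm using an atomic representation model is proposed. It introduces a new condition, called the atomic classification condition, under which new test emotion samples are recognized correctly with good results. Existing representation-based classification algorithms rely mainly on a single sparse representation method, whereas the proposed algorithm can combine the sparse representation model with other representation models. The algorithm relaxes the conditions of applicability, so that the atomic representation model suits more classification tasks. A Uyghur emotional speech database was collected and built, and the basic acoustic features of Uyghur emotional speech were analyzed on it. Mapping the emotional feature space through the atomic representation represents that space effectively. Experimental results show that the proposed method outperforms traditional methods and reaches a recognition rate of 64.17% on the Uyghur emotional speech database.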