Nonnegative Matrix Factorization Based Deep Low-Dimensional Feature Extraction Approach for Speech Recognition
Citation: Qin Chuxiong, Zhang Lianhai. Nonnegative Matrix Factorization Based Deep Low-Dimensional Feature Extraction Approach for Speech Recognition[J]. Journal of Data Acquisition & Processing, 2017, 32(5): 921-930.
Authors: Qin Chuxiong, Zhang Lianhai
Affiliation: Institute of Information System Engineering, PLA Information Engineering University, Zhengzhou, 450001, China
Abstract: As a type of low-dimensional feature extracted from a deep neural network (DNN), the bottleneck feature (BNF) has achieved great success in continuous speech recognition. However, the presence of the bottleneck layer reduces the frame accuracy of the output layer when training a bottleneck DNN, which in turn degrades the performance of the bottleneck feature. To address this problem, a nonnegative matrix factorization based low-dimensional feature extraction approach that uses a DNN without a bottleneck layer is proposed in this paper. Specifically, semi-nonnegative matrix factorization and convex-nonnegative matrix factorization algorithms are applied to a hidden-layer weight matrix to obtain a basis matrix, which serves as the weight matrix of a new feature layer; the new features are then extracted by forward-propagating the input data through this layer without a bias vector. Experiments show that the feature exhibits a relatively stable pattern across different recognition tasks and network structures. On corpora with sufficient training data, the proposed features achieve almost the same recognition performance as conventional bottleneck features, while in low-resource conditions the tandem system based on the new features clearly outperforms both the DNN hybrid system and the bottleneck-feature tandem system.
Keywords: continuous speech recognition; deep neural network; semi-nonnegative matrix factorization; convex-nonnegative matrix factorization; low-dimensional features
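
The core step described in the abstract is the factorization of a trained hidden-layer weight matrix into a basis matrix that then acts as a low-dimensional feature layer. The sketch below is a minimal NumPy illustration of semi-nonnegative matrix factorization with the standard multiplicative updates (basis F unconstrained in sign, coefficients G nonnegative); the choice of which weight matrix is factorized, the target dimension k = 40, and the use of the basis as the feature-layer projection are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def semi_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Semi-NMF: approximate X (m x n) as F @ G.T, with F (m x k) unconstrained
    in sign and G (n x k) nonnegative, via multiplicative updates."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    G = np.abs(rng.standard_normal((n, k)))           # nonnegative coefficients
    pos = lambda A: (np.abs(A) + A) / 2.0             # elementwise positive part
    neg = lambda A: (np.abs(A) - A) / 2.0             # elementwise negative part
    for _ in range(n_iter):
        F = X @ G @ np.linalg.pinv(G.T @ G)           # least-squares basis update
        XtF, FtF = X.T @ F, F.T @ F
        G *= np.sqrt((pos(XtF) + G @ neg(FtF)) /      # multiplicative update keeps
                     (neg(XtF) + G @ pos(FtF) + eps)) # G nonnegative
    return F, G

# Illustrative use (shapes and the role of F are assumptions):
# W stands in for a trained hidden-layer weight matrix of shape (n_units, n_in).
W = np.random.randn(1024, 440)
F, G = semi_nmf(W.T, k=40)              # F: (440, 40) basis, G: (1024, 40)
feature_weights = F.T                   # (40, 440) low-dimensional projection
x = np.random.randn(440)                # one spliced acoustic input frame
low_dim_feature = feature_weights @ x   # forward pass with no bias, per the abstract
```

Convex-NMF differs only in constraining the basis to be a convex combination of the data columns; the same multiplicative-update pattern applies.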