Similar literature
 20 similar documents found
1.
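Combining acoustic models of different units in Mandarin continuous speech recognition   Total citations: 1 (self-citations: 0, citations by others: 1)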
张辉  杜利民 《电子与信息学报》2006,28(11):2045-2049
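This paper studies the combination of acoustic models trained on different acoustic units. In Mandarin continuous speech recognition, the popular units include context-dependent initial/final units and phoneme units. Experiments show that some Mandarin syllables are recognized more accurately with the initial/final model, and others with the phoneme model. The paper proposes a method for combining the two acoustic models that uses both models during recognition while avoiding the model that causes a low recognition rate. Experiments show that with this method the syllable error rate falls by 9.60% and 6.10% compared with the phoneme model and the initial/final model, respectively.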

2.
陆俊  张琼  杨俊安  王一  刘辉 《信号处理》2013,29(7):865-872
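The keyword spotting system based on the point process model is a novel system for spotting keywords in continuous speech. Although it needs few training samples and computes quickly, its detection performance depends heavily on the accuracy of the front-end phoneme detectors, and the Gaussian mixture models now widely used for phoneme detectors have limited representational and modeling power. To address this weakness, this paper proposes a point process model with an embedded deep belief network and applies it to keyword spotting. The model builds the phoneme detectors with a deep belief network, which has strong representational power, and so remedies the representational weakness of Gaussian mixture models. Experimental results show that the method achieves a higher detection rate than the original model and reduces computational complexity, making it better suited to applications that require real-time keyword detection.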

3.
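A keyword spotting method for continuous speech based on the point process model (PPM) is proposed. The method first uses temporal pattern (TRAP) features and a multilayer perceptron (MLP) to compute frame-level posterior probabilities for each phoneme. On this basis, speech is treated as a set of mutually independent events (phonemes), and a Poisson process is used to build a point process model of the events. Finally, keywords are detected by computing a likelihood ratio. Experimental results show that for speech sampled at 8 kHz, the average keyword recall and precision exceed 69.5% and 82%, respectively.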
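
The likelihood-ratio step of a Poisson point process detector can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a single phoneme detector, a piecewise-constant keyword rate per time bin, and a constant background rate. The names `poisson_loglik` and `log_likelihood_ratio` and all numeric values are hypothetical.

```python
import math

def poisson_loglik(counts, rates, bin_width):
    """Log-likelihood (up to terms independent of the rates) of per-bin
    event counts under a piecewise-constant Poisson process.
    The dropped terms cancel when two models are compared on the same counts."""
    return sum(n * math.log(r) - r * bin_width for n, r in zip(counts, rates))

def log_likelihood_ratio(counts, keyword_rates, background_rate, bin_width):
    """Keyword model versus a homogeneous background model (assumed form)."""
    background = [background_rate] * len(counts)
    return (poisson_loglik(counts, keyword_rates, bin_width)
            - poisson_loglik(counts, background, bin_width))

# Hypothetical example: the keyword model expects phoneme events in the middle bin.
keyword_rates = [0.1, 5.0, 0.1]   # events per second in each bin (assumed)
matching = log_likelihood_ratio([0, 2, 0], keyword_rates, 1.0, 0.1)
mismatched = log_likelihood_ratio([2, 0, 0], keyword_rates, 1.0, 0.1)
print(matching > 0, mismatched < 0)
```

A positive ratio favors the keyword hypothesis; a real system would sum such terms over all phoneme detectors and compare the total with a tuned threshold.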

4.
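Combining the phonetic features and semantic information of Uyghur, and drawing on a large telephone speech corpus, this work aims to build a continuous phoneme recognition platform for Uyghur and implements a Uyghur continuous phoneme recognition algorithm with HTK (Hidden Markov Model Toolkit). First, a fairly large telephone speech corpus was recorded and annotated according to specific technical specifications. With the phoneme chosen as the basic unit, an HMM (Hidden Markov Model) acoustic model was trained for each phoneme, and input speech was then recognized, giving recognition results for acoustic models with different numbers of Gaussian mixture components. The recognition rates of 32 phonemes were tabulated and analyzed, laying the groundwork for further improving the recognition rate.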

5.
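The asynchrony between speech and lip movement is a key problem in multimodal fusion speech recognition. This paper first introduces a multi-stream asynchronous dynamic Bayesian network (MS-ADBN) model, which describes the asynchrony of the audio and video streams at the word level; both streams use a word-phoneme hierarchy. The multi-stream multi-state asynchronous DBN (MM-ADBN) model extends the MS-ADBN model; both of its streams use a word-phoneme-state hierarchy. In essence, MS-ADBN is a whole-word model, whereas MM-ADBN is a phoneme model suited to large-vocabulary continuous speech recognition. Experimental results on a continuous audio-visual database show that, in clean speech, the recognition rate of MM-ADBN is 35.91% and 9.97% higher than that of the MS-ADBN model and the multi-stream HMM, respectively.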

6.
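Because existing weighted finite-state transducer (WFST) decoding networks carry no precise word-end markers, current lattice generation algorithms either lack precise word-end times or produce only state-level or phoneme-level lattices, which cannot be used for keyword search. This paper proposes a word lattice generation algorithm for speech recognition with a static WFST decoder. It first analyzes, in theory, the convertibility between the phoneme lattice and the word lattice in WFST decoding, then proposes a dynamic phoneme matching method over the lexicon that solves the alignment of word-end times in the WFST network, and finally generates the word lattice by token-passing traversal. To reduce computation, a pruning algorithm is introduced into token passing, so that converting the phoneme lattice to a word lattice takes less than 3% of the decoding time. The resulting word lattice can be used for language model rescoring and, because it contains precise word-end times, can also be applied directly in a keyword search system. Experimental results show that the algorithm is computationally efficient; compared with lattices from existing dynamic decoders, its lattices contain more decoding information and perform better both in rescoring for large-vocabulary continuous speech recognition and in keyword search.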

7.
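Automatic speech recognition based on speech event detection is an active research topic. To address the poor model adaptability caused by variation in speaking rate, a speaking-rate adaptive adjustment algorithm is proposed. Working sentence by sentence, the algorithm normalizes each sentence using continuously varying frame lengths and frame shifts so that the adjusted rate matches the corpus average, reducing the effect of rate on model training. In addition, the speaking rate of the test set is obtained by computing the angle between posterior probability vectors of phonological attributes, which lightens the system load compared with rate detection that relies on trained models. The rate adjustment algorithm is applied to the extraction of phonological attributes, the attribute features are transformed nonlinearly, and hidden Markov models are then used for modeling. Experiments show that after rate adjustment the average number of frames per phoneme is more constant and its dynamic range is smaller, raising the phoneme recognition rate by 1.3%.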
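
The angle between two posterior probability vectors, which the abstract uses as the basis for estimating speaking rate, can be computed as in the sketch below. The abstract does not say how angles are mapped to a rate, so this shows only the angle itself; the function name `angle_between` and the example vectors are hypothetical.

```python
import math

def angle_between(p, q):
    """Angle in radians between two posterior probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.acos(cosine)

# Hypothetical posteriors over three attributes for two adjacent frames.
frame_a = [0.8, 0.1, 0.1]
frame_b = [0.1, 0.8, 0.1]
print(angle_between(frame_a, frame_a), angle_between(frame_a, frame_b))
```

Intuitively, a small angle means the attribute posteriors barely changed between the two frames, and a large angle suggests a transition between speech events.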

8.
李伟  李媛媛 《电声技术》2011,35(7):42-44
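To address the recognition of English within Mandarin continuous speech recognition, a mixed Chinese-English modeling approach is used to build a mixed Chinese-English model. Based on an analysis of existing speech recognition systems and some prior knowledge from phonetics, an acoustic model that mixes main vowels with English phoneme sequences is proposed. The acoustic model is first trained with the maximum likelihood criterion and then discriminatively trained with the minimum phone error criterion to obtain the final acoustic model. Results on the test set show that...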

9.
谢锦辉 《通信学报》1994,15(2):83-87
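This paper briefly discusses how to choose the basic speech unit in an HMM-based continuous speech recognition system and introduces the context-independent phoneme HMM system built at LIMSI-CNRS in France under the European Polyglot project. It then describes in detail the continuous speech recognition experiments the author carried out after improving that system with left- or right-context-dependent phoneme HMMs. On the American DARPA-RM1 speech database, without syntactic information, a clear improvement of about 3-10 percentage points in the continuous-speech word recognition rate was obtained. The experiments were carried out at LIMSI-CNRS in France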

10.
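To perform audio-visual speech recognition while accurately segmenting phonemes in both the audio and video streams, this paper proposes a new multi-stream asynchronous triphone dynamic Bayesian network (MM-ADBN-TRI) model. It describes the asynchrony of the audio and video streams at the word level; both streams use a word-triphone-state-observation vector hierarchy, and the recognition unit is the triphone, which captures coarticulation in continuous speech. Experimental results show that the model performs well in audio-visual speech recognition, in phoneme segmentation of the audio and video streams, and in determining the asynchrony between the two streams.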

11.
High-purity organic-tantalum precursors for thin-film ALD TaN were synthesized and characterized. The vapor pressure and thermal stability of these precursors were studied. The vapor pressure analysis showed that TBTEMT has a higher vapor pressure than any other published liquid TaN precursor, including TBTDET, TAITMATA, and IPTDET. The thermal stability of the alkyl groups on the precursors was investigated using a 1H NMR technique. The results indicated that the tert-butylimino group is the most stable group on TBTDET and TBTEMT as compared to the dialkylamido groups. The thermal stability of the TaN precursors decreased in the following order: TBTDET > PDMAT > TBTEMT. In conclusion, precursor vapor pressure and thermal stability were tuned by making slight variations in the ligand sphere around the metal center.

12.
To diagnose laser-produced plasmas, a focusing curved-crystal spectrometer has been developed for measuring the X-ray lines radiated from a laser-produced plasma. The design is based on the fact that a ray emitted from a source located at one focus of an ellipse converges on the other focus after reflection from the elliptical surface. The focal length and the eccentricity of the ellipse are 1350 mm and 0.9586, respectively. The spectrometer can measure X-ray lines in the wavelength range 0.2-0.37 nm, and a LiF (200) crystal (2d = 0.4027 nm) is used as the dispersive element, covering Bragg angles from 30° to 67.5°. The spectrometer was tested on Shenguang-Ⅱ, which delivers a laser energy of 60-80 J/pulse at a laser wavelength of 0.35 μm. Spectra including the 1s2p ^1P_1 - 1s^2 ^1S_0 resonance line (w), the 1s2p ^3P_2 - 1s^2 ^1S_0 magnetic quadrupole line (x), the 1s2p ^3P_1 - 1s^2 ^1S_0 intercombination line (y), and the 1s2s ^3S_1 - 1s^2 ^1S_0 forbidden line (z) of helium-like Ti XXI, and the 1s2s2p ^2P_3/2 - 1s^2 2s ^2S_1/2 line (q) of lithium-like Ti XX, were recorded with an X-ray CCD camera. The experimental results show that the wavelength resolution (λ/Δλ) is above 1000 and that the elliptical crystal spectrometer is suitable for X-ray spectroscopy.
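
The quoted wavelength range can be cross-checked against the quoted Bragg angles with Bragg's law, nλ = 2d sin θ. The sketch below assumes first-order diffraction (n = 1), which the abstract does not state; the function name `bragg_wavelength_nm` is hypothetical.

```python
import math

TWO_D_NM = 0.4027  # 2d spacing of LiF (200), as quoted in the abstract

def bragg_wavelength_nm(theta_deg, two_d_nm=TWO_D_NM, order=1):
    """Wavelength satisfying Bragg's law: order * wavelength = 2d * sin(theta)."""
    return two_d_nm * math.sin(math.radians(theta_deg)) / order

short_nm = bragg_wavelength_nm(30.0)   # lower Bragg angle from the abstract
long_nm = bragg_wavelength_nm(67.5)    # upper Bragg angle from the abstract
print(f"{short_nm:.3f} nm to {long_nm:.3f} nm")
```

Under that assumption the two angles give roughly 0.20 nm and 0.37 nm, consistent with the stated 0.2-0.37 nm range.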

13.
This paper reviews our recent development of the large-scale pseudopotential method for calculating the electronic structure of semiconductor nanocrystals, such as quantum dots and wires, which often contain tens of thousands of atoms. The calculated size-dependent exciton energies and absorption spectra of quantum dots and wires are in good agreement with experiments. We show that the electronic structure of a nanocrystal can be tuned not only by its size, but also by its shape. Finally, we show that defect properties in quantum dots can be significantly different from those in bulk semiconductors.

14.
This paper addresses improving the utilization and efficiency of critical equipment in semiconductor wafer fabrication facilities. A semiconductor manufacturing FAB is one of the most complicated and cost-sensitive environments. A good dispatching tool makes a big difference to equipment utilization and to FAB output as a whole. The equipment studied in this paper is the in-line DUV scanner. Many factors affect the utilization and output of this equipment group. In a high-mixed-products (HMP) environment, two of the issues are reticle changes in this area and idle counts caused by load imbalance between machines. We introduce a rule-based RTD system that aims to decrease the number of recipe changes and idle counts among a group of scanners in an HMP FAB.

15.
The epitaxial growth of GaAsSb-based DHBTs with InAlAs emitters is investigated using a 4 × 100 mm multi-wafer production Riber 49 MBE reactor fully equipped with real-time in-situ sensors, including an absorption band-edge spectroscope and an optical flux monitor. State-of-the-art hole mobilities are obtained from 100 nm thick carbon-doped GaAsSb. An Sb composition variation of less than ±0.1 atomic percent across a 4 × 100 mm platen configuration has been achieved. The large-area InAlAs/GaAsSb/InP DHBT device demonstrates excellent DC characteristics, such as BVCEO > 6 V and a DC current gain of 45 at 1 kA/cm^2 for an emitter size of 50 μm × 50 μm. The devices have a 40 nm thick GaAsSb base with a p-doping of 4.5 × 10^19 cm^-3. Devices with an emitter size of 4 μm × 30 μm have a current gain variation of less than 2% across the fully processed 100 mm wafer. f_T and f_max are over 50 GHz, with a power efficiency of 50%, which is comparable to standard power GaAs HBT results. These results demonstrate the potential of GaAsSb/InP DHBTs for power amplifiers and the feasibility of multi-wafer MBE for mass production of GaAsSb-based HBTs.

16.
We calculate the Langevin noise sources of self-pulsation laser diodes, analyze the effects of active-region noise and saturable-absorption-region noise on the power fluctuation and the period fluctuation, and propose a novel method to suppress the noise effects. A visual SIMULINK model is established to simulate the system. The results indicate that the effects of noise in the absorption region can be ignored; that as the DC injection current increases, the noise-induced power jitter grows while the period jitter decreases; and that when an external sinusoidal current modulates the self-pulsation laser diode, the noise-induced power jitter and period jitter can be suppressed greatly. This work is valuable for clock recovery in all-optical networks.

17.
Large-scale synthesis of single-crystal CdSe nanoribbons is achieved by a modified thermal evaporation method, in which two-step thermal evaporation is used to control the evaporation of the CdSe sources. The synthesized CdSe nanoribbons are usually several micrometers in width, 50 nm in thickness, and tens to several hundred micrometers in length. Studies have shown that high-quality CdSe nanoribbons with regular shapes can be obtained by this method. Room-temperature photoluminescence measurements show lasing emission at 710 nm under optical pumping (266 nm) at power densities of 25-153 kW/cm^2. The full width at half maximum (FWHM) of the lasing mode is 0.67 nm.

18.
By expanding the aperture function into a finite sum of complex Gaussian functions, analytical expressions are derived for Hermite-cosh-Gaussian beams passing through annular-apertured, paraxial, symmetric optical systems described by an ABCD matrix; these expressions reduce to the case of a square aperture. In a similar way, the corresponding analytical expressions for cosh-Gaussian beams passing through annular-apertured ABCD systems are also given. The method takes less computation time than direct evaluation of the diffraction integral.

19.
Distributed polarization coupling in polarization-maintaining fibers (PMFs) can be detected by using a white-light Michelson interferometer. This technique usually requires that only one polarization mode be excited. In practical measurements, however, the injected polarization direction cannot be exactly aligned with one of the principal axes of the PMF, so the influence of the polarization extinction ratio should be considered. Based on polarization coupling theory, the influence of the incident polarization extinction ratio on the measurement result is evaluated and analyzed, and a method for distributed polarization coupling detection is developed for the case in which both orthogonal eigenmodes are excited.

20.
Call for Papers     
Communications-VLSI: Research and industry in telecommunications have been growing rapidly over the last 20 years and will keep growing at a high pace in the next decade. The research and development involved cover mobile communications, highway and last-mile broadband communication, domain-specific communications, and emerging D2D and M2M communications. Radio communication steps into its
