2.
《Computer Speech and Language》2000,14(4):283-332
This paper presents an attempt at using the syntactic structure of natural language to improve language models for speech recognition. The structured language model merges techniques from automatic parsing and language modeling through an original probabilistic parameterization of a shift-reduce parser. A maximum-likelihood re-estimation procedure belonging to the class of expectation-maximization algorithms is employed to train the model. Experiments on the Wall Street Journal and Switchboard corpora show improvements in both perplexity and word error rate (via word-lattice rescoring) over the standard 3-gram language model.
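The difference between the structured language model and a plain 3-gram can be sketched in a few lines. This is an illustrative simplification, not the paper's parameterization: the structured LM conditions each word on the two most recent exposed headwords of the partial parse rather than on the two preceding surface words. The example sentence and headwords are hypothetical.

```python
def trigram_context(words, i):
    """Standard 3-gram: condition on the two immediately preceding words."""
    return tuple(words[max(0, i - 2):i])

def headword_context(stack):
    """Structured LM (sketch): condition on the two most recent exposed
    headwords of the partial parse, which may lie far back in the sentence."""
    return tuple(stack[-2:])

# Toy run: after parsing "the contract ended with a loss", the exposed
# headwords for predicting the next word could be ("contract", "ended")
# rather than the surface bigram ("a", "loss").
words = ["the", "contract", "ended", "with", "a", "loss"]
stack = ["contract", "ended"]          # exposed headwords of partial parse
print(trigram_context(words, 6))   # ('a', 'loss')
print(headword_context(stack))     # ('contract', 'ended')
```

The payoff is that syntactically related but distant words enter the conditioning context, which a fixed-window 3-gram cannot capture.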
4.
《Computer Speech and Language》2007,21(3):492-518
This paper describes the use of a neural network language model for large-vocabulary continuous speech recognition. The underlying idea is to attack the data-sparseness problem by estimating language-model probabilities in a continuous space. Highly efficient learning algorithms are described that enable training on corpora of several hundred million words. It is also shown that the approach can be incorporated into a large-vocabulary continuous speech recognizer using a lattice-rescoring framework with very little additional processing time. The neural network language model was thoroughly evaluated in a state-of-the-art large-vocabulary continuous speech recognizer on several international benchmark tasks, in particular the NIST evaluations on broadcast news and conversational speech recognition. The new approach is compared to four-gram back-off language models trained with modified Kneser–Ney smoothing, which is often reported to be the best known smoothing method. The neural network language model is usually interpolated with the back-off language model; in that way, consistent word error rate reductions were achieved for all considered tasks and languages, ranging from 0.4% to almost 1% absolute.
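The interpolation step mentioned above reduces to a one-line linear combination of the two models' probabilities. A minimal sketch follows; the weight `lam` would be tuned on held-out data, and the probability values here are invented for illustration.

```python
def interpolate(p_nn, p_backoff, lam=0.5):
    """Linear interpolation of a neural-network LM probability with a
    back-off n-gram LM probability for the same word in the same context."""
    return lam * p_nn + (1.0 - lam) * p_backoff

# Hypothetical probabilities for one word in one context:
p = interpolate(p_nn=0.012, p_backoff=0.008, lam=0.5)
print(round(p, 4))  # 0.01
```

In practice the combined model inherits the neural model's generalization in frequent contexts while the back-off model covers rare events, which is why the interpolated model consistently beats either component alone.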
6.
Dia AbuZeina Wasfi Al-Khatib Moustafa Elshafei Husni Al-Muhtaseb 《International Journal of Speech Technology》2012,15(2):65-75
Pronunciation variation is a major obstacle to improving the performance of Arabic automatic continuous speech recognition systems. This phenomenon alters the pronunciation of words beyond the forms listed in the pronunciation dictionary, leading to a number of out-of-vocabulary word forms. This paper presents a direct data-driven approach to modeling within-word pronunciation variation, in which the pronunciation variants are distilled from the training speech corpus. The proposed method performs phoneme recognition, followed by a sequence alignment between the observed phonemes generated by the phoneme recognizer and the reference phonemes obtained from the pronunciation dictionary. The unique collected variants are then added to the dictionary as well as to the language model. We started with a baseline Arabic speech recognition system based on the Sphinx3 engine. The baseline system uses a 5.4-hour speech corpus of Modern Standard Arabic broadcast news, with a pronunciation dictionary of 14,234 canonical pronunciations, and achieves a word error rate of 13.39%. Our results show that while the expanded dictionary alone did not add appreciable improvements, the word error rate is significantly reduced, by 2.22%, when the variants are also represented in the language model.
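The dictionary-expansion step described above can be sketched as maintaining a list of pronunciations per word and appending each unique distilled variant. The word and phoneme strings below are illustrative placeholders, not entries from the paper's dictionary.

```python
# Canonical pronunciation dictionary: word -> list of phoneme strings.
lexicon = {"kitab": ["k i t a: b"]}

def add_variant(lexicon, word, variant):
    """Add a distilled pronunciation variant unless it is already listed,
    mirroring the 'unique collected variants' step in the abstract."""
    prons = lexicon.setdefault(word, [])
    if variant not in prons:
        prons.append(variant)

add_variant(lexicon, "kitab", "k t a: b")   # short vowel dropped in fast speech
add_variant(lexicon, "kitab", "k t a: b")   # duplicates are ignored
print(lexicon["kitab"])  # ['k i t a: b', 'k t a: b']
```

Representing the variants in the language model as well (the step the paper found essential) would additionally require treating each variant as a distinct token when re-estimating n-gram counts.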
9.
《IEEE transactions on audio, speech, and language processing》2009,17(5):863-873
12.
Research on a Sensitive-Word Retrieval System for Uyghur Broadcast News
The Uyghur broadcast-news sensitive-word retrieval system is based on HMMs and was designed and implemented on the MATLAB platform. The system has the following characteristics: 1. Because the number of Uyghur sensitive words is small, the system's speech corpus is small. 2. Because pronunciation in broadcast news is relatively standard, recognition avoids nonstandard speaker pronunciations, which benefits the performance of the speech recognition system. 3. Because morphemes are chosen as the recognition units, endpoint detection of the units is easy.
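An HMM-based spotter like the one described above ultimately decodes the most likely state sequence with the Viterbi algorithm. The sketch below uses a toy two-state model with invented probabilities; a real system would use per-morpheme acoustic models over real feature vectors, not discrete "low"/"high" observations.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for an observation sequence."""
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    path = {s: [s] for s in states}
    for t in range(1, len(obs)):
        V.append({})
        new_path = {}
        for s in states:
            # Best predecessor for state s at time t.
            prob, prev = max((V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                             for p in states)
            V[t][s] = prob
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]

# Toy model: is the signal silence or a keyword region?
states = ("silence", "keyword")
start_p = {"silence": 0.8, "keyword": 0.2}
trans_p = {"silence": {"silence": 0.7, "keyword": 0.3},
           "keyword": {"silence": 0.4, "keyword": 0.6}}
emit_p = {"silence": {"low": 0.9, "high": 0.1},
          "keyword": {"low": 0.2, "high": 0.8}}
print(viterbi(["low", "high", "high"], states, start_p, trans_p, emit_p))
# ['silence', 'keyword', 'keyword']
```

Choosing morphemes as recognition units, as the abstract notes, keeps each HMM short and makes the unit boundaries (endpoints) easier to locate in the decoded path.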
15.
Jen-Tzung Chien 《IEEE transactions on audio, speech, and language processing》2006,14(5):1719-1728
Statistical n-gram language modeling is popular for speech recognition and many other applications. The conventional n-gram cannot adequately model long-distance language dependencies. This paper presents a novel approach that mines long-distance word associations and incorporates these features into language models based on linear interpolation and maximum entropy (ME) principles. We highlight the discovery of associations among multiple distant words in the training corpus. A mining algorithm recursively merges the frequent word subsets and efficiently constructs the set of association patterns. By combining the features of association patterns into n-gram models, association pattern n-grams are estimated, with a special case, the trigger-pair n-gram, in which only associations between two distant words are considered. In experiments on Chinese language modeling, we find that incorporating association patterns significantly reduces the perplexities of n-gram models. Incorporation using ME outperforms incorporation using linear interpolation, the association pattern n-gram is superior to the trigger-pair n-gram, and perplexities are further reduced using more association steps. The proposed association pattern n-grams not only raise document classification accuracy but also improve speech recognition rates.
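The trigger-pair special case above can be sketched as interpolating the n-gram probability with a long-distance probability p(w | t) for a trigger word t seen earlier in the history. Everything below, including the trigger table and probability values, is invented for illustration; the paper's ME-based combination is more sophisticated than this linear form.

```python
# Hypothetical p(target | trigger) table for a few trigger pairs.
trigger_prob = {("stock", "market"): 0.08, ("doctor", "nurse"): 0.05}

def trigger_pair_prob(word, history, p_ngram, lam=0.7):
    """Interpolate an n-gram probability with the strongest trigger-pair
    probability found anywhere in the (long-distance) history."""
    p_trig = max((trigger_prob.get((t, word), 0.0) for t in history),
                 default=0.0)
    return lam * p_ngram + (1.0 - lam) * p_trig

# "stock" far back in the history boosts the probability of "market":
p = trigger_pair_prob("market", ["the", "stock", "rose"], p_ngram=0.02)
print(round(p, 3))  # 0.038
```

Association patterns generalize this by letting sets of several distant words, rather than a single trigger, contribute evidence, which is why the paper finds them stronger than trigger pairs.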
16.
In the era of artificial intelligence, technologies continue to evolve and application scenarios keep multiplying. Speech transcription, machine writing, AI news anchors, and intelligent short-video production combine multiple language technologies, such as speech recognition, speech synthesis, language generation, semantic understanding, and machine translation. The related products and applications have entered every stage of news production, including news gathering and editing, article writing, and broadcasting, pushing news production toward intelligent workflows.
19.
Duta N. Schwartz R. Makhoul J. 《IEEE transactions on audio, speech, and language processing》2006,14(5):1745-1753
This paper aims to quantify the main error types made by the 2004 BBN speech recognition system in the broadcast news (BN) and conversational telephone speech (CTS) DARPA EARS evaluations. We show that many of the remaining errors occur in clusters rather than in isolation, have specific causes, and differ to some extent between the BN and CTS domains. The correctly recognized words are also clustered and are highly correlated with regions where the system produces a single hypothesized choice per word. A statistical analysis of some well-known error causes (out-of-vocabulary words, word fragments, hesitations, and unlikely language constructs) was performed to assess their contribution to the overall word error rate (WER). We conclude with a discussion of the lower bound on the WER imposed by human annotator disagreement.
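The WER figure that the error analysis above decomposes is computed from a minimum edit-distance alignment: substitutions, insertions, and deletions against the reference transcript, divided by the number of reference words. A minimal sketch, with illustrative sentences:

```python
def wer(ref, hyp):
    """Word error rate = edit_distance(ref, hyp) / len(ref)."""
    m, n = len(ref), len(hyp)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                     # all deletions
    for j in range(n + 1):
        d[0][j] = j                     # all insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # match / substitution
    return d[m][n] / m

ref = "the news was read at nine".split()
hyp = "the news was red at nine pm".split()
print(round(wer(ref, hyp), 3))  # 0.333 (one substitution, one insertion)
```

Tracing back through the same table yields the alignment itself, which is how error occurrences can be localized and clustered in the way the paper analyzes.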