
1.

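刘瑞 康世胤 高光来 李劲东 飞龙 《中文信息学报》2022,36(7):86-97

Existing Tacotron-based Mongolian speech synthesis systems have two problems: (1) low synthesis efficiency and (2) low fidelity of the synthesized speech. To address them, this paper proposes MonTTS, a fully non-autoregressive, real-time, high-fidelity Mongolian speech synthesis model based on FastSpeech2. To improve the prosodic naturalness and fidelity of the Mongolian speech synthesized by MonTTS, three improvements are proposed according to the acoustic characteristics of Mongolian: (1) Mongolian phoneme sequences are used to represent Mongolian pronunciation information; (2) a phoneme-level acoustic adaptor is proposed to learn long-term prosodic variation; (3) two duration alignment methods are proposed, one based on Mongolian speech recognition and one based on autoregressive speech synthesis. The paper also builds MonSpeech, currently the largest Mongolian speech synthesis database. Experimental results show that MonTTS achieves a Mean Opinion Score (MOS) of 4.53 for prosodic naturalness, significantly outperforming both the state-of-the-art Tacotron-based Mongolian speech synthesis baseline and the baseline FastSpeech2 model. The real-time factor of MonTTS reaches 3.63×10^-3, which meets the requirements of real-time, high-fidelity synthesis. All training scripts and pretrained models used in the paper are open source (https://github.com/ttslr/MonTTS).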
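The abstract reports a real-time factor without defining it. A common convention, assumed here rather than taken from the paper, is synthesis time divided by the duration of the audio produced, so values well below 1 mean faster than real time. The function name and the timing numbers below are hypothetical and serve only to illustrate the arithmetic.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Return synthesis time divided by the duration of the generated audio.

    Assumed definition: RTF < 1 means the system synthesizes faster than
    the audio plays back.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


# Hypothetical timing: 0.0363 s of compute for 10 s of generated audio.
rtf = real_time_factor(0.0363, 10.0)
print(f"RTF = {rtf:.2e}")
```

Under this definition, a real-time factor of 3.63×10^-3 would correspond to roughly 3.6 ms of computation per second of speech.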

2.

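A low-resource speech synthesis method combining HMMs and neural networks

帕丽旦·木合塔尔 吾守尔·斯拉木 买买提阿依甫 《计算机仿真》2021,38(12):203-211

To improve the naturalness and stability of synthesized speech, this paper proposes a speech synthesis method that combines HMMs with deep neural networks, using Uyghur as the experimental language. End-to-end speech synthesis based on deep learning produces highly natural speech but suffers from slow generation and insufficient stability and controllability, whereas HMM-based methods are stable but yield less natural speech than end-to-end methods. The front end of the system therefore uses an HMM (hidden Markov model) to extract the inherent linguistic features of Uyghur, and the back-end synthesis module builds an autoregressive model within a deep neural network framework. Several neural network models were compared in the back end, and the results were evaluated. The experiments show that the HMM+BiLSTM speech synthesis method performs best.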

3.

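Robustness to synthetic speech in speaker recognition

陈联武 郭武 戴礼荣 《模式识别与人工智能》2011,24(6):743-747

With the development of speech synthesis based on hidden Markov models, impostors can easily use this technology to generate synthetic speech with the characteristics of a target speaker, which poses a serious threat to existing speaker recognition systems. To address this problem, the paper analyzes, from a statistical perspective, the differences between natural and synthetic speech in the real cepstrum, and proposes a speaker recognition system that is robust to synthetic speech. Preliminary experimental results show that, compared with a conventional speaker recognition system, with the equal error rate on natural speech not...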

4.

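Research on automatic annotation methods for speech synthesis corpora

刘豫军 夏聪 《网络安全技术与应用》2015,(2):65-66

Taking the current state of speech synthesis technology as its starting point, this paper briefly introduces the basic principles and implementation methods of the technology, and argues that realizing its broad prospects requires overcoming many current limitations and shifting the research focus to automatic prosody annotation. After briefly analyzing the significance of research on automatic annotation of speech synthesis corpora, the paper focuses on several automatic annotation methods, including hidden Markov model acoustic automatic prosody annotation, deep neural network acoustic modeling, and phonetic annotation under hidden stress states. It explains the main principles, strengths, and weaknesses of these methods, and recommends that practitioners use them in a complementary way.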

5.

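A survey of deep learning speech synthesis

张小峰 谢钧 罗健欣 杨涛 《计算机工程与应用》2021,57(9):50-59

Speech synthesis plays an important role in human-computer interaction, and advances in deep learning have driven its rapid development. Deep-learning-based speech synthesis surpasses traditional speech synthesis in both the quality and the speed of the synthesized speech. This survey reviews speech synthesis starting from deep-learning-based vocoders and acoustic models, discussing the working principles, advantages, and disadvantages of each kind of vocoder and acoustic model. On this basis it systematically reviews classic deep-learning-based speech synthesis systems, and concludes with an outlook on deep-learning-based speech synthesis.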

6.

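A survey of emotionally expressive visual speech synthesis

曹亮 赵晖 《计算机工程与科学》2015,37(4):813-818

This paper summarizes and analyzes key research results and methods in emotional visual speech synthesis from recent years. According to the underlying synthesis mechanism, it classifies and describes emotional visual speech synthesis techniques from two perspectives, image-based methods and model-based methods, and compares their respective advantages, disadvantages, and performance differences. It focuses on how, and to what extent, the visual speech synthesized in each work achieves realism and emotional expressiveness. Finally, it points out issues that deserve particular attention when synthesizing emotionally expressive visual speech, suggesting directions for further research.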

7.

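A survey of emotional speech synthesis

李虎孬 赵晖 《现代计算机》2014,(7):31-37

Emotional speech synthesis is an emerging direction in speech synthesis that draws on physiology, psychology, linguistics, information science, and other disciplines. It can be applied to text reading, information query and publishing, computer-assisted instruction, and other areas, and it integrates spoken-language analysis and emotion analysis of speech with computer technology, laying a foundation for people-oriented speech synthesis systems with personalized characteristics. Current work on emotional speech synthesis falls into two categories: rule-based synthesis and waveform-concatenation-based synthesis. Research on emotional speech synthesis consists of two parts, emotion analysis and speech synthesis. The main tasks of emotion analysis are collecting speech data for different emotions, extracting acoustic features, and analyzing the relationship between acoustic features and emotion. The main task of speech synthesis is building an emotion conversion model and using it to perform synthesis.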

8.

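Data-driven Chinese text-to-visual speech synthesis  Cited by: 7 (self-citations: 0, other citations: 7)

王志明 蔡莲红 艾海舟 《软件学报》2005,16(6):1054-1063

A computer text-to-visual speech synthesis (TTVS) system can improve the intelligibility of speech and make human-computer interfaces friendlier. This paper presents a Chinese text-to-visual speech synthesis system based on a data-driven (sample-based) approach, which generates new visual speech by concatenating short video segments. It gives an effective method for constructing visual confusion trees for Chinese initials and finals, and proposes a coarticulation model based on the visual confusion trees and a hardness factor. The model can be used for corpus selection in the analysis stage and for unit selection in the synthesis stage. Obvious differences between the two frames at a concatenation boundary are smoothed using image morphing. Combined with an existing text-to-speech (TTS) system, a Chinese text-to-visual speech synthesis system is implemented.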

9.

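Discretionary lane-change recognition based on a Gaussian mixture hidden Markov model

杨志强 朱家伟 穆蕾 安毅生 《计算机系统应用》2022,31(8):388-394

Driver assistance systems are considered an effective means of addressing traffic safety problems. Developing them depends on accurately recognizing vehicle behavior, for use in vehicle safety warning, path planning, intelligent navigation, and similar applications. Existing behavior recognition methods based on support vector machines, hidden Markov models, convolutional neural networks, and the like still struggle to balance computational cost against accuracy. This paper combines the hidden Markov model with the Gaussian mixture model to propose a Gaussian mixture hidden Markov model, and validates the method experimentally on the NGSIM dataset from the US Federal Highway Administration. The results show that the method recognizes discretionary lane-change behavior with high accuracy. The paper also tunes the experimental parameters of the Gaussian mixture hidden Markov model to obtain the best recognition performance, providing a reference for vehicle behavior recognition in future intelligent driving.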

10.

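Research on speech synthesis based on waveform concatenation  Cited by: 1 (self-citations: 0, other citations: 1)

苏珊珊 《福建电脑》2008,24(10):104-105

Building on a study of the latest speech synthesis techniques, this paper focuses on speech synthesis based on waveform concatenation and applies time-domain smoothing to the synthesized speech, reducing distortion and abrupt changes at the concatenation points. Finally, a speech synthesis application for airport announcements (机场等级播报) is implemented, with good synthesis results.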
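The abstract does not say which time-domain smoothing technique the author used. As one illustration of the general idea, the sketch below joins two waveform segments with a linear crossfade over a short overlap, so the amplitude does not jump at the concatenation point. The function name, the linear weighting, and the sample values are all assumptions made for this example.

```python
def crossfade_concat(a, b, overlap):
    """Concatenate two sample sequences with a linear crossfade.

    The last `overlap` samples of `a` are blended with the first `overlap`
    samples of `b`; the weight of `b` rises linearly across the overlap.
    This is an assumed smoothing scheme, not the method from the paper.
    """
    if overlap < 0 or overlap > min(len(a), len(b)):
        raise ValueError("overlap must be between 0 and the shorter segment length")
    head = list(a[:len(a) - overlap])   # part of `a` left untouched
    tail = list(b[overlap:])            # part of `b` left untouched
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)     # weight of segment `b`
        mixed.append((1 - w) * a[len(a) - overlap + i] + w * b[i])
    return head + mixed + tail


# Hypothetical segments: a constant 1.0 unit followed by a constant 0.0 unit.
joined = crossfade_concat([1.0] * 4, [0.0] * 4, overlap=3)
print(joined)
```

The output has `len(a) + len(b) - overlap` samples, and the step between the two segments is replaced by a gradual ramp.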

11.

Only connect: teaching, technology and telesis

S. Shaw 《Journal of Computer Assisted Learning》1993,9(2):93-99

This paper describes an approach to the design of interactive multimedia materials being developed in a European Community project. The developmental process is seen as a dialogue between technologists and teachers. This dialogue is often problematic because of the differences in training, experience and culture between them. Conditions needed for fruitful dialogue are described and the generic model for learning design used in the project is explained.

12.

European Community policy and the market  Cited by: 1 (self-citations: 0, other citations: 1)

C. Lloyd 《Journal of Computer Assisted Learning》1993,9(2):86-91

This paper starts with some reflections on the policy considerations and priorities which are shaping European Commission (EC) research programmes. Then it attempts to position the current projects which seek to capitalise on information and communications technologies for learning in relation to these priorities and the apparent realities of the marketplace. It concludes that while there are grounds to be optimistic about the contribution EC programmes can make to the efficiency and standard of education and training, they are still too technology driven.

13.

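An adaptive sub-fusion ensemble multi-classifier method

李敏 李华 程茂华 《计算机测量与控制》2019,27(4):120-123

Fusion ensemble methods are widely used in pattern recognition, but some base classifiers have unstable real-time performance, which degrades the performance of multi-classifier fusion. To address this problem, this paper proposes a new sub-fusion ensemble classifier system based on multiple classifiers. Working above the measurement-level fusion layer, the method dynamically selects among the base classifiers and takes the class with the most votes as the class that the fusion system assigns to the feature vector, forming a new adaptive sub-fusion ensemble classifier method. Experiments show that the method achieves clearly higher recognition accuracy than conventional classifiers and classifier fusion methods, and is more robust.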
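The voting step mentioned in the abstract, where the class with the most votes becomes the fused decision, can be sketched as a plain majority vote. This minimal example covers only that step. It does not implement the paper's dynamic selection of base classifiers, and the function name, labels, and tie-breaking rule (first label seen wins) are assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most base classifiers.

    `predictions` holds one predicted label per base classifier for a
    single feature vector. Ties go to the label that appears first,
    which is an arbitrary choice made for this sketch.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]


# Hypothetical outputs of five base classifiers for one feature vector.
fused = majority_vote(["cat", "dog", "cat", "bird", "cat"])
print(fused)
```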

14.

Avoiding semantic and temporal gaps in developing software intensive systems

Wayne O’Brien 《Journal of Systems and Software》2008,81(11):1997-2013

Development of software intensive systems (systems) in practice involves a series of self-contained phases for the lifecycle of a system. Semantic and temporal gaps, which occur among phases and among developer disciplines within and across phases, hinder the ongoing development of a system because of the interdependencies among phases and among disciplines. Such gaps are magnified among systems that are developed at different times by different development teams, which may limit reuse of artifacts of systems development and interoperability among the systems. This article discusses such gaps and a systems development process for avoiding them.

15.

Designing economic np control charts: A programmed simulation approach

D. Sculli K.M. Woo 《Computers in Industry》1985,6(3):185-194

This paper presents control charts models and the necessary simulation software for the location of economic values of the control parameters. The simulation program is written in FORTRAN, requires only 10K of main storage, and can run on most mini and micro computers. Two models are presented - one describes the process when it is operating at full capacity and the other when the process is operating under capacity. The models allow the product quality to deteriorate to a further level before an existing out-of-control state is detected, and they can also be used in situations where no prior knowledge exists of the out-of-control causes and the resulting proportion defectives.

16.

The development of robot art

Luigi Pagliarini Henrik Hautop Lund 《Artificial Life and Robotics》2009,13(2):401-405

Going through a few examples of robot artists who are recognized worldwide, we try to analyze the deepest meaning of what is called “robot art” and the related art field definition. We also try to highlight its well-marked borders, such as kinetic sculptures, kinetic art, cyber art, and cyberpunk. A brief excursion into the importance of the context, the message, and its semiotics is also provided, case by case, together with a few hints on the history of this discipline in the light of an artistic perspective. Therefore, the aim of this article is to try to summarize the main characteristics that might classify robot art as a unique and innovative discipline, and to track down some of the principles by which a robotic artifact can or cannot be considered an art piece in terms of social, cultural, and strictly artistic interest. This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008.

17.

Explanation and prediction: an architecture for default and abductive reasoning  Cited by: 4 (self-citations: 0, other citations: 4)

David Poole 《Computational Intelligence》1989,5(2):97-110

Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given.

18.

Three Process Perspectives: Organizations, Teams, and People

Watts S. Humphrey 《Annals of Software Engineering》2002,14(1-4):39-72

This paper provides the author's personal views and perspectives on software process improvement. Starting with his first work on technology assessment in IBM over 20 years ago, Watts Humphrey describes the process improvement work he has been directly involved in. This includes the development of the early process assessment methods, the original design of the CMM, and the introduction of the Personal Software Process (PSP)℠ and Team Software Process (TSP)℠. In addition to describing the original motivation for this work, the author also reviews many of the problems he and his associates encountered and why they solved them the way they did. He also comments on the outstanding issues and likely directions for future work. Finally, this work has built on the experiences and contributions of many people. Mr. Humphrey only describes work that he was personally involved in and he names many of the key contributors. However, so many people have been involved in this work that a full list of the important participants would be impractical.

19.

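SAR image denoising based on significance correction of the noise variance in the complex wavelet domain  Cited by: 4 (self-citations: 1, other citations: 3)

施汉琴 张大明 罗斌 《遥感技术与应用》2008,23(5):561-564

This paper proposes a speckle filtering method for synthetic aperture radar (SAR) images that combines statistical modeling in the complex wavelet domain with a significance correction of the noise variance estimate. The method first applies a logarithmic transform to convert the multiplicative noise model into an additive one, then applies the dual-tree complex wavelet transform (DCWT) to the transformed image and models the statistical distribution of the complex wavelet coefficients. Based on this prior distribution, Bayesian estimation is used to recover the original coefficients from the noisy ones, thereby filtering out the noise. Experimental results show that the method removes noise while preserving image detail, achieving good denoising performance.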
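The first step of the method, turning multiplicative speckle into additive noise with a logarithm, follows from log(x·n) = log(x) + log(n). The short sketch below checks this identity numerically for a single pixel. The variable names and values are hypothetical, and none of the later wavelet or Bayesian steps are reproduced.

```python
import math

# Hypothetical values for one pixel: true reflectivity and speckle factor.
clean = 2.0
speckle = 1.5

# Multiplicative noise model: the observed intensity is the product.
observed = clean * speckle

# After the log transform the speckle appears as an added term.
log_observed = math.log(observed)
log_sum = math.log(clean) + math.log(speckle)

print(math.isclose(log_observed, log_sum))
```

Because the noise is additive in the log domain, denoising techniques designed for additive noise can be applied there, and an exponential maps the result back to the intensity domain.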

20.

How do children do mathematics with LOGO?

R. NOSS 《Journal of Computer Assisted Learning》1987,3(1):2-12

This paper considers some results of a study designed to investigate the kinds of mathematical activity undertaken by children (aged between 8 and 11) as they learned to program in LOGO. A model of learning modes is proposed, which attempts to describe the ways in which children used and acquired understanding of the programming/mathematical concepts involved. The remainder of the paper is concerned with discussing the validity and limitations of the model, and its implications for further research and curriculum development.