Similar literature
1.
A number of virtual environments have been developed in recent years. Among them are some applications for blind people based on different types of audio, from simple sounds to 3-D audio. In this study, we pursued a different approach. We designed AudioMUD by using spoken text to describe the environment, navigation, and interaction. We also introduced some collaborative features into the interaction between blind users. The core of a multiuser MUD game is a networked textual virtual environment. We developed AudioMUD by adding some collaborative features to the basic idea of a MUD and placed a simulated virtual environment inside the human body. This paper presents the design and usability evaluation of AudioMUD. Blind learners were motivated when interacting with AudioMUD and helped to improve the interaction through audio and interface design elements.

2.
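Text information mining technology and its application in whole-life state evaluation of circuit breakers   Total citations: 1, self-citations: 0, citations by others: 1
Power grid companies have recorded a large volume of Chinese-language fault and defect texts, which contain rich information about equipment health. To date, however, there has been little research on text mining in the electric power domain. Taking whole-life state evaluation of circuit breakers as the application background, this paper explores text mining methods for Chinese power grid texts. First, based on the current state of research on circuit breaker state evaluation, the key problems in building a text mining and whole-life state evaluation model are identified. Then, a whole-life state evaluation model that incorporates text-mined information is constructed; the whole-life health state index of the circuit breaker is presented through text preprocessing and vectorization based on the hidden Markov model (HMM), text classification with a k-nearest-neighbor (KNN) algorithm using autonomous interval search, and a ratio-type state information fusion model. Finally, a case study is built from the actual defect texts of a power grid company. The case study shows that text mining achieves correlation learning among similar defects, and that the ratio-type information fusion model presents the history of health state evaluation more completely and faithfully.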

3.
In this paper, a Mandarin text-to-speech (TTS) technique is used to implement a voiced E-book on a PIC-based embedded platform. Converting the text of an E-book to the corresponding speech can help blind users and make reading more effortless and relaxed. A PIC32 Ethernet Starter Kit microcontroller (80 MHz, 32-bit, 128 kB SRAM, 512 kB Flash) and the Multimedia Expansion Board designed by Microchip Technology Inc. are adopted as the embedded platform. The Mandarin TTS system comprises four subsystems, namely text analysis, a recurrent neural network-based prosodic generator, a synthesis unit generator with 411 Chinese syllabic waveforms, and a pitch-synchronous overlap-add-based speech synthesizer, all implemented in the C programming language. Experimental results show that 1.66 MB of storage memory, less than 25.4 kB of runtime memory, and 21.3% CPU runtime are sufficient for real-time operation, providing natural and fluent speech as 16-bit PCM at an 8 kHz sampling rate. The PIC-based Mandarin TTS system is demonstrated to perform well. © 2016 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.

4.
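To address the currently high error rate of speech transcription text, this paper proposes a MacBERT-based detect-then-correct model for correcting text produced by speech transcription. In the detection stage, a MacBERT-BiLSTM-CRF model checks whether the text contains errors and where they are. In the correction stage, two dimensions are considered, confidence and phonetic similarity, and a "confidence vs. phonetic similarity" curve is drawn to decide whether a candidate character should be used for correction. The confidence of candidate characters is computed with the MacBERT language model, and a phonetic similarity measure based on pinyin codes is proposed. Experiments on the public speech dataset Thchs-30, using the Baidu speech recognition API, show that precision, recall, and F1 improve over existing methods in both the detection and correction stages; precision in the correction stage reaches 83.32%, improving the correctness of the transcribed text.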

5.
Litwin  L.R.  Jr 《Potentials, IEEE》1998,17(2):38-41
Speech is a very basic way for humans to convey information to one another. From a communications standpoint, information can be sent efficiently by sending it as just text. However, with a bandwidth of only 4 kHz, speech can convey information with the emotion of a human voice. People want to be able to hear someone's voice from anywhere in the world, as if the person were in the same room. As a result, new, efficient speech coding techniques have impacted areas such as cellular telephony, mobile radios, and voice mail systems.

6.
The paper describes an optical data transmission system that supports blind and partially sighted persons in indoor navigation, mainly in identifying offices in public buildings. Two-dimensional data matrix barcodes (ISO/IEC 16022) are sequentially displayed on a small LED matrix mounted on a wall in close proximity to all potential points of interest. The user must be equipped with an Android mobile device with pre-installed software, which locates and processes barcode sequences recorded by a camera. Sequences are decoded as plain-text English messages and read by additional text-to-speech software installed on the mobile device. To evaluate its performance, the system was implemented and tested at the Lodz University of Technology. Copyright © 2014 John Wiley & Sons, Ltd.

7.
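To overcome the slow convergence and poor separation accuracy of traditional swarm intelligence algorithms when solving the blind source separation (BSS) problem, a BSS method based on an improved elephant herding optimization (IEHO) algorithm is proposed. Following the independence principle, the method combines the kurtosis and mutual information of the separated signals to build the objective function. In the clan updating stage, the algorithm's scale factor is improved and a neighborhood search is added, increasing the diversity of the search; in the separating stage, a quantum particle swarm optimization strategy is introduced to improve global search ability. Simulation results show that, compared with the traditional elephant herding optimization and particle swarm optimization algorithms, IEHO finds better optima and successfully separates both image and speech signals, with higher separation accuracy and faster convergence.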
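A minimal sketch of the kurtosis part of such an objective, assuming the common normalized (excess) kurtosis definition. The abstract does not give the exact formula or how kurtosis is weighted against mutual information, so the function names and the plain sum of absolute values are illustrative assumptions, and the mutual information term is omitted.

```python
def excess_kurtosis(signal):
    """Normalized (excess) kurtosis: E[(x-mu)^4] / E[(x-mu)^2]^2 - 3."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        raise ValueError("a constant signal has undefined kurtosis")
    fourth_moment = sum(c ** 4 for c in centered) / n
    return fourth_moment / (variance * variance) - 3.0


def kurtosis_objective(separated_channels):
    """Hypothetical objective term: sum of |kurtosis| over the separated channels.

    A larger value means the outputs are further from Gaussian, which is the
    usual proxy for independence in kurtosis-based BSS.
    """
    return sum(abs(excess_kurtosis(ch)) for ch in separated_channels)
```

An optimizer such as the one described would search over candidate unmixing matrices and score each candidate's outputs with a function of this kind.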

8.
In summary, there are many specific problems frequently faced by the blind and visually impaired that are amenable to solutions through appropriate applications of electronics and microcomputer technology. This potential is widely recognized by commercial manufacturers, who have been particularly active in the area of computer-access devices. It is expected that future developments will include continued refinement of synthetic speech applications and braille computer-access technology, increased research into the problems and solutions of orientation and mobility, and increased application of electronics to the reduction of educational and employment barriers for the blind and visually impaired.

9.
汪琳瑛  何胜伟 《广东电力》2006,19(12):61-63
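Voice alarms play a key role in power dispatching automation systems; however, producing the speech by manual recording is not only cumbersome but also error-prone. When upgrading its power dispatching automation system, the Shaoguan Power Supply Bureau of Guangdong Power Grid Corporation used speech synthesis technology to develop a text-to-speech (TTS) system that implements the voice alarm function and removes the drawbacks of manually recorded speech files. The TTS system supports multiple languages, offers a variety of synthesized voices, performs well on short phrases, is expressive in tone, recognizes digits and numeric types accurately, and supports MP3 background music, batch processing of pre-recorded audio, remote monitoring through a graphical interface, and networked speech synthesis and resource services. In use, it requires no speech maintenance, gives accurate voice alarms, is easy to upgrade, and synthesizes speech quickly. In the power dispatching automation system, voice alarms are implemented in client-server mode over the Transmission Control Protocol (TCP): the client runs the TTS speech software and requires the support of a hardware gatekeeper device (把关器), and the supervisory control and data acquisition (SCADA) server acts as the server.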

10.
Reading text and understanding images by touch are an important alternative and additional source of information when sight is absent or lost. Tactile graphics and models such as edge maps, binary output, etc., are the solution for simple access to images for blind persons. This paper introduces an approach to model the human tactile system based on the responses produced by stimuli on microcapsule paper. This system is used to generate optimum halftone patterns on microcapsule paper for the effective generation of tactile graphics.

11.
黄颖  张茂青  何旭平 《江苏电器》2008,(3):19-21,49
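Against the background of implementing a PLC instruction set, this paper introduces the main content of IEC 61131-3, the international standard for programmable controller programming languages. The standard covers two textual languages, Instruction List (IL) and Structured Text (ST); two graphical languages, Ladder Diagram (LD) and Function Block Diagram (FBD); and Sequential Function Chart (SFC), which has both textual and graphical forms. The paper describes the application of these programming languages in PLCs and gives a method for interpreting PLC programming languages.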

12.
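Voice communication is an important way for people to obtain and exchange information. As society becomes more information-driven, the confidentiality of voice communication receives increasing attention. Traditional speech encryption methods all follow the classical Shannon-Nyquist sampling theorem, so sampling collects some redundant speech data and wastes sampling resources. To save sampling resources and simplify the speech encryption process, compressed sensing theory is used to encrypt speech. The method encrypts the speech and at the same time samples it efficiently. Simulation experiments show that speech encryption based on compressed sensing is practical and effective: it saves both sampling resources and storage space.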
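A rough sketch of the measurement step behind this idea, y = Φx with fewer measurements than samples. The abstract does not say how the key is defined; here it is assumed that an integer key seeds a Gaussian measurement matrix Φ, and the names `measurement_matrix` and `cs_measure` are hypothetical. Recovery, which needs the same Φ plus a sparse reconstruction algorithm, is not shown.

```python
import random


def measurement_matrix(key, m, n):
    """Build an m x n Gaussian matrix from an integer key (assumed to be the secret)."""
    rng = random.Random(key)  # same key -> same matrix on sender and receiver
    scale = m ** 0.5
    return [[rng.gauss(0.0, 1.0) / scale for _ in range(n)] for _ in range(m)]


def cs_measure(frame, key, m):
    """Compress and scramble one speech frame: y = Phi x, with m < len(frame)."""
    if m >= len(frame):
        raise ValueError("compressed sensing needs fewer measurements than samples")
    phi = measurement_matrix(key, m, len(frame))
    return [sum(p * x for p, x in zip(row, frame)) for row in phi]
```

In this sketch, a 16-sample frame measured with `m = 4` yields four values, so sampling and scrambling happen in the same step; a receiver without the key cannot rebuild Φ.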

13.
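To extract feature information from spectrograms more accurately, a speech enhancement method based on A-DResUnet is proposed. Building on the ResUnet model, A-DResUnet incorporates dilated convolution to improve its ability to capture speech context, and adds a convolutional block attention module (CBAM) to the encoder to increase attention to noise spectrogram features. Experimental results show that, compared with using the clean speech spectrogram as the model's output target, using the noise spectrogram as the target gives the model a stronger ability to separate unseen noise; compared with ResUnet, A-DResUnet loses less speech detail; and compared with DNN-based and GAN-based speech enhancement methods, PESQ improves on average by 22.81% and 33.11% and STOI by 9.62% and 15.33%, respectively, providing a more effective method for speech enhancement in complex environments.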

14.
This study addresses the problem of speech quality enhancement by adaptive and nonadaptive filtering algorithms. The well-known two-microphone forward blind source separation (TM-FBSS) structure has been widely studied in the literature, and several two-microphone algorithms combined with TM-FBSS have recently been proposed. In this study, we make two contributions. First, a new two-microphone Gauss-Seidel pseudo affine projection (TM-GSPAP) algorithm is combined with TM-FBSS. Second, we propose to use the new TM-GSPAP algorithm in speech enhancement. Furthermore, we show the efficiency of the proposed TM-GSPAP algorithm in speech enhancement when highly noisy observations are available. To validate the performance of our algorithm, we evaluate its adaptive filtering properties in terms of computational complexity and convergence speed, using the system mismatch criterion. A fair comparison with adaptive and nonadaptive noise reduction algorithms is also presented. The adaptive algorithms are the well-known two-microphone normalized least mean square algorithm and the recently published two-microphone pseudo affine projection algorithm. The nonadaptive algorithms are one-microphone spectral subtraction and the two-microphone Wiener filter algorithm. We evaluate the quality of the output speech signal of each algorithm by several objective and subjective criteria, such as the segmental signal-to-noise ratio, cepstral distance, perceptual evaluation of speech quality, and the mean opinion score. Finally, we validate the superior performance of the proposed algorithm with physically measured signals.

15.
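To address the detection of text on traffic signs by the external environment perception system of driverless vehicles, a two-stage method is proposed for detecting and recognizing the text on traffic signs in autonomous driving scenes, enabling fine-grained collection of driving information. First, a YOLO detector detects traffic signs while the improved DB detection network proposed in this paper detects text in the scene; the traffic sign detections and scene text detections are intersected to obtain the text regions to be recognized. Finally, a lightweight CRNN network recognizes the text in those regions. The CSCT-1600 and MTWI-2018 datasets are used for training and testing, respectively. Experimental results show that the traffic sign information localization algorithm achieves a precision of 94.95% at a recall of 92.98, and the traffic sign information recognition algorithm reaches a recognition speed of 25 frames at an F1 of 77.2%.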
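The intersection step can be sketched as follows, assuming both detectors return axis-aligned boxes as `(x1, y1, x2, y2)` tuples. The abstract does not specify the box format (DB-style detectors often output polygons), so the representation and function names here are illustrative assumptions.

```python
def intersect(box_a, box_b):
    """Return the overlap of two (x1, y1, x2, y2) boxes, or None if they do not overlap."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    if x1 >= x2 or y1 >= y2:
        return None
    return (x1, y1, x2, y2)


def text_regions_on_signs(sign_boxes, text_boxes):
    """Keep only the parts of detected text that fall inside a detected traffic sign."""
    regions = []
    for sign in sign_boxes:
        for text in text_boxes:
            overlap = intersect(sign, text)
            if overlap is not None:
                regions.append(overlap)
    return regions
```

Text detected elsewhere in the scene, such as on shop fronts, produces no overlap with any sign box and is dropped before recognition.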

16.
魏伟  苏津磷  李帆  仇娟  于秀丽 《电测与仪表》2023,60(12):176-181
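The continued growth and increasing intelligence of power grid systems have created enormous metering demand, and smart electricity meters, as the main metering devices, have been widely deployed. However, the information carried by smart meters differs greatly across brands, models, and batches, and non-intelligent manual information collection has seriously hindered meter equipment upgrades and collection security, limiting the quality and level of power asset management. This paper applies text recognition technology to the information collection process for smart meters and designs a two-stage system that detects and recognizes the text in meter images, achieving intelligent collection of meter information and improving the efficiency and security of smart meter information extraction. The two-stage system comprises a text detection module and a text recognition module: the text detection module locates text in meter images with an improved PSENet network, and the text recognition module recognizes the detected text boxes with a CRNN network. The algorithm is not constrained by input image quality or scene, is robust to varying font sizes and to over- or under-exposure, and achieves high recognition accuracy for the Chinese characters, English letters, and digits in meter images.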

17.
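Unlike previous spread spectrum communication systems, this paper uses sequence pairs in place of Gold codes in a spread spectrum communication system. As CDMA communication systems are applied ever more widely, the number of available address codes is limited and their code lengths are also limited, which is a considerable inconvenience in practical engineering. This paper first introduces the concept of sequence pairs and several important blind multiuser detection methods. Sequence pairs satisfying certain conditions are then applied to blind multiuser detection, and theoretical analysis and simulation demonstrate the feasibility of applying sequence pairs to spread spectrum communication systems. Sequence pairs can therefore make up for the limited number of existing spreading codes and serve as an effective supplement to them.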

18.
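To raise the performance upper bound of current masking-based and spectral-mapping-based speech enhancement methods and their generalization in complex environments, a collaborative single-channel speech enhancement method is proposed under a joint complex spectrum and complex mask learning framework. The method uses an encoder with a two-branch decoder; in the encoder and decoder, an interactive cooperative learning unit (ICU) is designed to supervise the interacting speech information flows and provide an effective latent feature space. The intermediate layer is a multi-scale fusion Transformer that, with few parameters, extracts detail at multiple scales along the spatial and channel dimensions and fuses the outputs, while modeling both sub-band and full-band speech information. Experiments on large and small datasets and in 115 noise environments show that, with only 0.57 M parameters, the method achieves better subjective and objective metrics than most advanced related methods, demonstrating good robustness and effectiveness.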

19.
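Research on emotion recognition in speech signals based on HMM   Total citations: 2, self-citations: 0, citations by others: 2
The emotional information contained in speech signals is an important kind of information and an indispensable part of how people perceive things. Building on speech recognition, this paper proposes research on applying hidden Markov models (HMMs) to emotion recognition in speech signals. It discusses the whole process of emotion recognition with continuous hidden Markov models, covering the classification of emotional speech, the acquisition of emotional speech data, emotional speech feature extraction, and emotional speech recognition, and obtains fairly satisfactory recognition results.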

20.
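Research on Chinese speech emotion recognition based on SVM   Total citations: 1, self-citations: 0, citations by others: 1
With the development of information technology, the demands on human-computer interaction keep rising, and affective information processing has become an important topic for improving human-computer interaction. This paper proposes a method for classifying emotion in Chinese speech, focusing on four basic human emotions: happiness, anger, fear, and sadness. Features such as energy, fundamental frequency, and speaking rate are extracted from Chinese speech signals and classified with a support vector machine, achieving an average recognition rate of 43.7%.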
