Similar Literature
18 similar documents found (search time: 140 ms)
1.
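Automatic conversion of Chinese characters to Braille is an important problem, both for improving the daily life and education of China's 17 million visually impaired people and for implementing the national information-accessibility program. Existing Chinese-to-Braille conversion methods are all multi-step: they first apply Braille word segmentation and concatenation to the Chinese text, then annotate the characters with tones, and finally combine the segmentation and tone information to synthesize the Braille text. This paper proposes an end-to-end Chinese-to-Braille conversion method based on the Transformer encoder-decoder model, trained on a parallel Chinese-Braille corpus. Using about 12 million characters of Chinese text from six months of the People's Daily (《人民日报》), the paper builds three parallel Chinese-Braille corpora, for National Common Braille (国家通用盲文), Current Braille (现行盲文), and Two-Cell Braille (双拼盲文). Experiments show that the proposed method converts Chinese characters to Braille in a single step, with BLEU scores of 80.25%, 79.08%, and 79.29% on National Common Braille, Current Braille, and Two-Cell Braille, respectively. Compared with existing Chinese-to-Braille conversion methods, the corpus this method requires is easier to build and the engineering complexity is lower.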

2.
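Chinese-to-Braille translation automatically translates Chinese source text into the corresponding Braille text. Current challenges include confusion among polyphonic characters, the inability to add out-of-vocabulary words, and output that violates the Braille word segmentation and concatenation rules. This paper builds a Chinese-to-Braille translation system based on the reverse maximum matching segmentation algorithm. Relying on its lexicon, the system distinguishes most polyphonic characters, produces segmentation that largely conforms to the Braille segmentation and concatenation rules, and can add out-of-vocabulary words to the lexicon on its own, which in turn improves the accuracy of Chinese word segmentation and of the translation. Experiments show that the system reduces sentence ambiguity caused by segmentation errors, reduces translation errors caused by polyphone confusion, and avoids the excessive number of Braille cells caused by scattered syllable structures; it is reasonably open and practical.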
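
The reverse maximum matching step named in this abstract can be illustrated with a minimal sketch. The lexicon, the `max_len` limit, and the function name `reverse_max_match` are assumptions made for illustration; the abstract does not give the system's actual dictionary or implementation details.

```python
def reverse_max_match(text, lexicon, max_len=4):
    """Segment `text` right to left, always taking the longest lexicon word."""
    words = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in lexicon:
                words.append(piece)
                end -= size
                break
    words.reverse()  # words were collected from the end of the sentence
    return words


# Toy lexicon (hypothetical); a real system would load a large dictionary.
LEXICON = {"研究", "研究生", "生命", "起源", "的"}

segments = reverse_max_match("研究生命的起源", LEXICON)
print(segments)
```

On this toy input, scanning from the right picks "生命" before "研究生" can be considered, which is the usual argument for reverse over forward matching in Chinese. Adding an out-of-vocabulary word in this sketch is simply `LEXICON.add(word)`.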

3.
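Research and Implementation of a Braille-to-Chinese Conversion System   Total citations: 1, self-citations: 0, citations by others: 1
Bao Ta (包塔), Zhu Xiaoyan (朱小燕). Computer Engineering (《计算机工程》), 2004, 30(20): 45-46, 100
This paper describes the research and application of natural language processing techniques for conversion in both directions between Chinese Current Braille and Chinese characters. Building on earlier research on a conversion model between Two-Cell Braille and Chinese characters, it uses a language model incorporating multiple knowledge sources to achieve high-accuracy conversion between Chinese characters and Current Braille, which is more ambiguous than Two-Cell Braille.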

4.
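Advancing Braille digitization in the information age bears directly on raising the cultural literacy and improving the living standards of China's blind population. This paper implements a Chinese-to-Braille conversion system based on the tone-marking rules of National Common Braille. The system can rapidly generate large volumes of digital resources that conform to the National Common Braille scheme, meeting visually impaired people's need for barrier-free access to information. It processes Chinese text according to the Common Braille rules and converts it into Braille that follows the tone-marking and abbreviation rules. Tests show that the system handles the tone-marking and abbreviation rules accurately and produces accurate digital Braille conforming to the National Common Braille scheme. The tone-omission coverage, final-abbreviation coverage, and increase in text length are all close to the theoretical values of the scheme. The system processes long corpus files quickly and runs efficiently, so it has practical value; it can be used to promote National Common Braille and to advance barrier-free Braille digitization in China.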

5.
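Chinese-to-Braille translation is the process of automatically translating Chinese text into the corresponding Braille data. In embedded environments, the translation is slow and struggles to meet real-time requirements in complex settings. To address this, a dedicated Chinese-to-Braille translation IP core is designed; it implements the reverse maximum matching segmentation algorithm and Chinese-to-Braille conversion to produce accurate Braille data. To validate the design, an SoC is built around a Cortex-M3 microprocessor with a serial port, an LCD driver, and the translation IP core, and functional verification and performance testing are carried out on an FPGA experimental platform. The tests show that the SoC translates Chinese to Braille accurately, at a speed of 5,079.37 B/s.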

6.
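A Chinese-to-Braille translation system automatically translates Chinese text into Braille characters, which is a great help to blind people in education and daily life. Braille concatenation is an important stage of Chinese-to-Braille translation: because Braille differs from written Chinese, certain words must be joined together after word segmentation. This paper studies how to use a formal, user-defined rule description language together with a statistical corpus of concatenation examples to design a Braille concatenation scheme that is efficient and easy to extend and maintain.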

7.
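To improve Chinese word segmentation in specialized domains, and to compensate for the difficulty of obtaining large annotated corpora in such domains, this paper proposes a domain-adaptive segmentation method based on deep learning and transfer learning. First, a deep-learning segmentation model with dictionary features, a bidirectional long short-term memory network with a conditional random field (BI-LSTM-CRF), is built and its parameters are trained on a general-domain segmentation corpus. Next, text from the construction engineering law domain is used as a small segmentation training corpus to fine-tune the parameters of the general-domain BI-LSTM-CRF model, and a domain dictionary is added to the model's dictionary features. Experiments show that transfer learning reduces the number of training iterations needed for the domain segmentation model. Compared with the general-domain BI-LSTM-CRF model, the proposed method improves the segmentation F1 score in the engineering law domain by 7.02%; compared with a BI-LSTM-CRF model that adds the domain dictionary only at prediction time, it improves F1 by 4.22%. The proposed model reduces the amount of domain training data that must be annotated and allows the segmentation model to be transferred across domains.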
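
For readers unfamiliar with how a segmentation F1 score such as the one reported above is usually computed, the following is a minimal sketch. It assumes the common convention of scoring a predicted word as correct only when its character span exactly matches a gold word; the abstract does not state the exact evaluation script, and the example sentences are invented.

```python
def to_spans(words):
    """Convert a word list into a set of (start, end) character spans."""
    spans, start = set(), 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def segmentation_f1(gold, pred):
    """F1 over exactly matching word spans (assumed scoring convention)."""
    gold_spans, pred_spans = to_spans(gold), to_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Invented example: two of the four predicted words match the gold spans.
gold = ["研究", "生命", "的", "起源"]
pred = ["研究生", "命", "的", "起源"]
score = segmentation_f1(gold, pred)
print(round(score, 2))
```

Here precision and recall are both 2/4, so the F1 score is 0.5.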

8.
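This paper proposes a deep-learning method for Braille cell recognition that uses a deep model, the stacked denoising autoencoder (SDAE), to handle automatic feature extraction and dimensionality reduction in Braille recognition. In building the deep model, a greedy layer-wise unsupervised learning algorithm initializes the network weights and backpropagation optimizes the network parameters. The SDAE automatically learns features from images of Braille cells, and a Softmax classifier performs the recognition. Experiments show that, compared with traditional methods, the proposed method effectively handles automatic feature learning and feature dimensionality reduction, is simpler to operate, and achieves satisfactory recognition results.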

9.
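Research and Implementation of a Chinese-Braille Machine Translation System   Total citations: 1, self-citations: 0, citations by others: 1
This paper studies the principles of Chinese-to-Braille translation and proposes a formal model of Braille together with a Chinese-Braille machine translation method: word segmentation uses a reverse full-segmentation algorithm scored by word frequency and graded word weights, and part-of-speech tagging and concatenation-block recognition combine rules with statistics. On this basis, a practical Chinese-Braille machine translation system is designed and developed.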

10.
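This paper describes a word-segmented and part-of-speech-tagged corpus of Old Chinese based on the text of the Huainanzi (《淮南子》), and how it was built. The corpus was constructed by automatic segmentation and part-of-speech tagging followed by manual correction; the automatic stage used domain adaptation to optimize the tagging model, which significantly improved performance on both segmentation and part-of-speech tagging. The paper analyzes the lexical characteristics of Old Chinese and, on that basis, describes a number of explicit lexical and morphological features that are used in the automatic segmentation and tagging; these were particularly helpful for the part-of-speech tagger. It summarizes and analyzes the errors made in automatic segmentation and tagging, and finally describes the distribution of words and parts of speech across the corpus. The proposed method was validated during the annotation of the Huainanzi and offers a reference for extending the work to other Classical Chinese resources. The resulting Huainanzi corpus is also a useful resource for future research on Classical Chinese.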

11.
Transforming Mandarin Braille to Chinese text is a significant but understudied machine translation task. CBHG is a building block used in the Tacotron text-to-speech model. Since Mandarin Braille is constructed from the pronunciation of Chinese characters, CBHG can be used to perform Braille–Chinese translation. Unfortunately, relying only on the convolution blocks in CBHG cannot effectively extract the features of Braille sequences. Two improvements to the CBHG model are proposed: CBHG-SE and CBHG-ECA. The two modules adaptively recalibrate channel-wise feature responses by explicitly modeling interdependencies between channels in CBHG, which also improves the quality of the representations produced by the network. Meanwhile, the network can learn to use global information to selectively emphasize informative features and suppress less useful ones. CBHG-ECA has stronger feature recalibration capabilities than CBHG-SE because of its more direct correspondence between channels and their weights. These two models achieve 92.23 BLEU and 91.48 BLEU on the Braille–Chinese dataset, outperforming CBHG and other neural machine translation models.
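
The channel-wise recalibration described in this abstract follows the general squeeze-and-excitation (SE) pattern: pool each channel to one value, pass the pooled vector through a small bottleneck, and scale each channel by the resulting gate. The sketch below shows that pattern in plain Python on a tiny channels-by-time feature map. The weights, shapes, and the function name `se_recalibrate` are illustrative assumptions, not the authors' CBHG-SE implementation.

```python
import math


def se_recalibrate(features, w_reduce, w_expand):
    """Apply SE-style gating to `features`, a list of channels over time steps."""
    # Squeeze: global average pooling over time, one value per channel.
    squeezed = [sum(channel) / len(channel) for channel in features]
    # Excitation, step 1: bottleneck layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    # Excitation, step 2: expand back to one sigmoid gate per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every value of a channel by that channel's gate.
    scaled = [[g * v for v in channel] for g, channel in zip(gates, features)]
    return scaled, gates


# Hypothetical 2-channel, 3-step feature map and hand-picked weights.
features = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
w_reduce = [[1.0, 1.0]]      # 2 channels -> 1 hidden unit
w_expand = [[1.0], [-1.0]]   # 1 hidden unit -> 2 channel gates

scaled, gates = se_recalibrate(features, w_reduce, w_expand)
print(gates)
```

With these hand-picked weights the first channel receives the larger gate and the second is suppressed. In a trained network both weight matrices would be learned. ECA, as generally described, replaces the bottleneck with a lightweight 1D convolution across neighbouring channels, which is not shown here.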

12.
The technology for converting Chinese to Braille is of great importance. When paired with a Braille display, it can better meet the educational and daily needs of the visually impaired community, especially children and students. Incorporating visual assistance mechanisms can further enhance the user experience and provide comprehensive support for individuals with visual impairments. In recent years, the use of end-to-end neural machine translation models for Chinese–Braille translation has gained traction. However, this task requires large, high-quality, domain-specific parallel data to train robust models, and the existing Chinese–Braille parallel data is insufficient to achieve satisfactory results. To address this challenge, this paper integrates pre-trained models into the Chinese–Braille translation task, which the authors describe as the first application of such technology in this context, and the approach differs from traditional pre-training methods. Previous pre-training methods in natural language processing mainly used raw text data, which we found to be of limited help for Chinese–Braille translation. We therefore propose three novel forms of pre-training dataset instead of relying solely on raw text. Using the Transformer model, our approach achieves a highest BLEU score of 94.53 on a 10k parallel corpus, presenting a new direction for Chinese–Braille translation research. Furthermore, we introduce a new form of data that enables Chinese–Braille translation using only an encoder. With the MacBERT model, this approach achieves a BLEU score of 98.87 on the test set and an inference speed 54 times faster than the Transformer model. These findings provide insights for future research on Chinese–Braille translation.

13.
People with visual disabilities face many difficulties and barriers when using computers and the Internet. They need IT developers to create adaptive technologies that facilitate their interaction with computers and the Internet. This paper presents the design and implementation of an Arabic Braille environment (ABE) and describes its functionality and unique features. The ABE is designed to facilitate the interaction of Arabic-speaking visually impaired people with computers, and to help sighted users communicate with the visually impaired. Copyright © 2003 John Wiley & Sons, Ltd.

14.
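In everyday work and study, a simple way of entering Braille into ordinary documents is often needed, particularly when sighted people learn Braille, when sighted people communicate with each other in Braille, or when Braille and ordinary text are typeset together. This paper describes the design of a Braille font and addresses the problems of porting the font between computers and of providing an input method for Braille characters.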

15.
In this paper, we study Braille word segmentation and the transformation of Mandarin Braille to Chinese characters. The former uses rule, sign, and knowledge bases for disambiguation and error correction, applying adjacency constraints and bi-directional maximal matching; its segmentation precision is better than 99%. The latter is divided into two stages: Braille to Chinese pinyin (a phonemic Romanization) and pinyin to characters. By incorporating a pinyin knowledge dictionary into the system, we have solved the problem of ambiguity in the translation from Braille to pinyin, and we have developed a statistical language model for the transformation of pinyin to characters. Using Viterbi search, we build a multi-level graph and find the sequence of Chinese characters with maximal likelihood. By using an N-Best algorithm to obtain the N most likely character sequences and examining the means of measurement, the rate of correct candidates within the top five improves by a further 3%. In a test on 40,000 Chinese characters, the overall translation precision from Braille codes to Chinese characters for common documents reaches 94.38%; if proper nouns are not considered, the improvement reaches 2%.
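
The pinyin-to-character stage described above, a Viterbi search over a multi-level graph scored by a statistical language model, can be sketched as follows. The bigram model, the candidate table, the probabilities, and the `floor` smoothing value are all invented for illustration; the abstract does not specify the order of the language model or its parameters.

```python
import math


def viterbi(syllables, candidates, start_logp, bigram_logp,
            floor=math.log(1e-6)):
    """Return the most likely character sequence for a pinyin sequence."""
    # best[c] = (log-probability, path) of the best path ending in character c.
    best = {
        c: (start_logp.get(c, floor), [c])
        for c in candidates[syllables[0]]
    }
    for syllable in syllables[1:]:
        next_best = {}
        for c in candidates[syllable]:
            # Pick the predecessor that maximizes the path score into c.
            score, path = max(
                ((s + bigram_logp.get((prev, c), floor), p)
                 for prev, (s, p) in best.items()),
                key=lambda item: item[0],
            )
            next_best[c] = (score, path + [c])
        best = next_best
    score, path = max(best.values(), key=lambda item: item[0])
    return "".join(path), score


# Hypothetical candidate table and probabilities for two syllables.
candidates = {"mang": ["忙", "盲"], "wen": ["文", "问"]}
start_logp = {"忙": math.log(0.6), "盲": math.log(0.4)}
bigram_logp = {
    ("盲", "文"): math.log(0.9),
    ("忙", "文"): math.log(0.05),
    ("忙", "问"): math.log(0.1),
    ("盲", "问"): math.log(0.01),
}

text, score = viterbi(["mang", "wen"], candidates, start_logp, bigram_logp)
print(text)
```

Although "忙" is the more likely first character on its own in this toy model, the bigram score makes "盲文" the best full path. An N-Best variant would keep several paths per node instead of one.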

16.
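Overlapping segmentation ambiguity is one of the main types of ambiguity in automatic Chinese word segmentation, and existing segmentation systems still do not handle it fully satisfactorily. There have been many studies of overlapping ambiguity based on general-purpose corpora, but none so far based on specialized-domain corpora. Using a medium-sized general Chinese word list, a general-purpose corpus of about 900 million characters, and two specialized-domain corpora covering 55 domains and totalling about 140 million characters, this paper carries out an exhaustive, multi-angle investigation of two things: the statistical properties, within the specialized-domain corpora, of high-frequency overlapping ambiguous strings extracted from the general corpus; and the domain-related statistical properties of overlapping ambiguous strings extracted from the specialized-domain corpora. The observations reported are a useful reference for designing automatic Chinese word segmentation algorithms aimed at specialized domains.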

17.
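Speech and Natural Language Processing Technologies in a Computer Software System for the Blind   Total citations: 3, self-citations: 0, citations by others: 3
This paper describes the speech and language processing technologies used in "北极光" (Aurora), a computer software system for blind users developed by the State Key Laboratory of Intelligent Technology and Systems. The system captures and analyzes the screen information that needs to be fed back to the user and reads it aloud through a speech synthesis platform, giving the user spoken prompts. Combined with natural language processing techniques such as automatic Chinese word segmentation and language modeling, the system can convert between Chinese characters and Braille. Feedback can be output through a refreshable Braille display, so that users can read the Braille cells by touch to obtain the information they need. Users can also type with a Braille input method, and the input is converted into Chinese text.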

18.
In this paper, a new type of Chinese Braille display (CBD) is developed through the combined use of computer-aided design (CAD) and an adaptive-network-based fuzzy inference system (ANFIS). After the bunt mechanism and magnetic mechanism were redesigned, the new CBD offers a stronger actuating force than the older type (raised from 15 to 30 gw) at a lower supply voltage (reduced from 6 to 4.5 V). The study not only focused on the design process of the new CBD, establishing a system model using CAD, but also optimized the physical design parameters by an inverse prediction technique using ANFIS. In addition, the study addressed fan noise and the thermal failure of Braille cells, and showed by experiment that the new CBD can still work safely even if the cooling system breaks down.
