1.
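Lip contour localization based on an improved level set method   Total citations: 1, self-citations: 1, citations by others: 0
Based on the geometric distribution of the lips, a multi-directional level set method (Multi-Level set) is proposed for locating the lip contour. The Multi-Level set method filters the lip image along multiple directions to obtain a new edge detection function that strengthens the gradient information of the lip contour, and then minimizes an energy function to drive the initial curve toward the lip contour, yielding precise localization of the speaker's lip contour. Experiments show that the Multi-Level set method improves lip contour localization accuracy by 7.32% over the level set method.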
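The abstract does not give the filters or the exact edge detection function. As a rough illustration of the idea, the sketch below applies four directional 3x3 difference kernels to a grayscale image and turns the strongest response m into the common level set edge indicator g = 1 / (1 + m^2), which is close to 0 on edges and 1 in flat regions. The kernel bank, the max-combination, and the names `KERNELS`, `filter_at`, and `edge_indicator` are all assumptions made for this example, not the paper's method.

```python
# Hypothetical sketch: the kernel bank and the max-combination are assumptions,
# not the filters used in the paper.

# 3x3 directional difference kernels for 0, 45, 90 and 135 degrees.
KERNELS = [
    [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]],
    [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]],
    [[-1, -2, -1], [0, 0, 0], [1, 2, 1]],
    [[-2, -1, 0], [-1, 0, 1], [0, 1, 2]],
]


def filter_at(img, r, c, kernel):
    """Correlate one 3x3 kernel with the neighbourhood centred on (r, c)."""
    return sum(
        kernel[i][j] * img[r - 1 + i][c - 1 + j]
        for i in range(3)
        for j in range(3)
    )


def edge_indicator(img):
    """Return g = 1 / (1 + m^2), where m is the strongest directional response.

    Border pixels are not filtered and keep g = 1.
    """
    rows, cols = len(img), len(img[0])
    g = [[1.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            m = max(abs(filter_at(img, r, c, k)) for k in KERNELS)
            g[r][c] = 1.0 / (1.0 + m * m)
    return g
```

In a level set evolution, a function of this kind typically slows the curve where g is small, so that it settles on strong edges such as the lip boundary.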
2.
While deep learning (DL) has made enormous strides in lipreading, it is still heavily underused for learning, understanding, and producing human language content. Current DL lipreading methods rely on single-channel processing and monolingual datasets, which limits their ability to adapt to cross-language applications. Here, we propose a novel lipreading-driven deep learning framework for creating cross-language learning patterns. To evaluate the algorithm's cross-language learning ability, we present CELR-200, a lipreading dataset for both Chinese and English containing 200 word classes and more than 80,000 samples. We also propose two Spatio-Temporal Reconstructed 3D (STR-3D) convolutional kernels that reconstruct the spatio-temporal relations of 3D convolution. Using the two STR-3D convolutional kernels, we present two new lipreading models, Serial-STRNet and Parallel-STRNet. These improvements reduce the number of 3D convolutional kernel parameters and improve performance, reaching 65.68% and 66.35% on CELR-200, respectively. They outperform other lipreading models, with an absolute improvement of 2.56% over the state-of-the-art model. Our results identify targets for future investigations and demonstrate that STR-3D convolutional kernels can provide critical insights into lipreading tasks.
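The abstract does not describe how the STR-3D kernels are constructed. As a generic illustration of why splitting a 3D kernel into a spatial part and a temporal part can lower the weight count, the sketch below compares a full 3D convolution with a spatial-then-temporal factorization. This is a standard decomposition used here only as an assumption; the function names and the intermediate channel count `c_mid` are hypothetical, and biases are ignored.

```python
def full_3d_params(c_in, c_out, kt, kh, kw):
    """Weights in one full 3D convolution with a kt x kh x kw kernel."""
    return c_in * c_out * kt * kh * kw


def factorized_params(c_in, c_mid, c_out, kt, kh, kw):
    """Weights in a 1 x kh x kw spatial conv followed by a kt x 1 x 1 temporal conv."""
    spatial = c_in * c_mid * kh * kw
    temporal = c_mid * c_out * kt
    return spatial + temporal


if __name__ == "__main__":
    print(full_3d_params(64, 64, 3, 3, 3))
    print(factorized_params(64, 64, 64, 3, 3, 3))
```

With 64 input, intermediate, and output channels and a 3x3x3 kernel, the full convolution has 64 * 64 * 27 weights while the factorized pair has 64 * 64 * (9 + 3).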
3.
Understanding facial expressions in image sequences is an easy task for humans. Some of us are capable of lipreading by interpreting the motion of the mouth. Automatic lipreading by a computer is a challenging task, with limited success so far. The inverse problem of synthesizing realistic-looking lip movements is also highly non-trivial. Today, the technology to automatically generate an image series that imitates natural postures is far from perfect. We introduce a new framework for facial image representation, analysis and synthesis, in which we focus just on the lower half of the face, specifically the mouth. It includes interpretation and classification of facial expressions and visual speech recognition, as well as a synthesis procedure for facial expressions that yields natural-looking mouth movements. Our image analysis and synthesis processes are based on a parametrization of the set of mouth configuration images. These images are represented as points on a two-dimensional flat manifold that enables us to efficiently define the pronunciation of each word and thereby analyze or synthesize the motion of the lips. We present some examples of automatic lip motion synthesis and lipreading, and propose a generalization of our solution to the problem of lipreading different subjects.
4.
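Visual language: a survey of lipreading   Total citations: 21, self-citations: 0, citations by others: 21
Yao Hongxun, Gao Wen, Wang Rui, Lang Xianbo. Acta Electronica Sinica (《电子学报》), 2001, 29(2): 239-246
This paper reviews the current state and level of development of lipreading research and describes in detail its content, methods, and significance. It aims to draw attention to this emerging research direction, to encourage active participation in lipreading research, and to advance progress on related problems.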
5.
Listeners hearing an ambiguous phoneme flexibly adjust their phonetic categories in accordance with information telling what the phoneme should be (i.e., recalibration). Here the authors compared recalibration induced by lipread versus lexical information. Listeners were exposed to an ambiguous phoneme halfway between /t/ and /p/ dubbed onto a face articulating /t/ or /p/ or embedded in a Dutch word ending in /t/ (e.g., groot [big]) or /p/ (knoop [button]). In a posttest, participants then categorized auditory tokens as /t/ or /p/. Lipread and lexical aftereffects were comparable in size (Experiment 1), dissipated about equally fast (Experiment 2), were enhanced by exposure to a contrast phoneme (Experiment 3), and were not affected by a 3-min silence interval (Experiment 4). Exposing participants to 1 instead of both phoneme categories did not make the phenomenon more robust (Experiment 5). Despite the difference in nature (bottom-up vs. top-down information), lipread and lexical information thus appear to serve a similar role in phonetic adjustments. (PsycINFO Database Record (c) 2010 APA, all rights reserved)
6.
The current textual and graphical interfaces to computing, including the Web, are a dream come true for the hearing impaired. However, improved technology for voice and audio interfaces threatens to end this dream. Requirements are identified for continued access to computing for the hearing impaired. Consideration is also given to improving access for the sight impaired.
7.
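Effectiveness of a visual single-channel lipreading system   Total citations: 1, self-citations: 0, citations by others: 1
In building a visual single-channel, large-vocabulary lipreading system, a normalized two-stage lipreading feature extraction method, U-LDCT-KL, is proposed: a second-stage KL (Karhunen-Loeve) transform is applied to the block-wise DCT (Discrete Cosine Transform) coefficients of the lip region to remove the overlap among local parameters. The method extracts the most effective low-level semantic features for lipreading and also selects features more sensibly according to how well they discriminate, so that a 42-dimensional second-stage visual feature reaches 77.8% accuracy in recognizing the lip-movement content of a specific speaker. The experiments also show that block-wise lip-region DCT features are the most effective for the visual single-channel lipreading system.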
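The first stage described above, block-wise DCT features of the lip region, can be sketched as follows. This is a minimal illustration only: the block size, the choice to keep the top-left low-frequency coefficients, and the names `dct_2d` and `block_dct_features` are assumptions, and the second-stage KL transform and the normalization are not shown.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    )
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def block_dct_features(img, block_size, keep):
    """Split img into non-overlapping square blocks and keep the
    top-left keep x keep (low-frequency) DCT coefficients of each block."""
    feats = []
    for r in range(0, len(img) - block_size + 1, block_size):
        for c in range(0, len(img[0]) - block_size + 1, block_size):
            block = [row[c:c + block_size] for row in img[r:r + block_size]]
            coeffs = dct_2d(block)
            feats.extend(coeffs[u][v] for u in range(keep) for v in range(keep))
    return feats
```

The concatenated per-block coefficients form the first-stage feature vector; a KL transform fitted on training data would then reduce it to a lower-dimensional second-stage feature.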
8.
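A lipreading feature extraction method applying LDA in the DCT domain   Total citations: 3, self-citations: 0, citations by others: 3
To address visual speech feature extraction, the most critical problem in lipreading, a new feature extraction method based on DCT and LDA is proposed. To extract the feature vectors that best discriminate between mouth shapes, the visual speech region is first transformed and reduced in dimension with the DCT, and the LDA algorithm then extracts from the DCT coefficients the feature vectors with the best mouth-shape classification performance. Experiments on speaker-dependent and speaker-independent lipreading databases, as well as in real-time lipreading recognition, show that the method clearly improves the lipreading recognition rate over both the traditional manual selection of DCT coefficients and PCA-based extraction.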
9.
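Quickly extracting the complete lip shape from video images is one of the first tasks of a computer lipreading system. This paper proposes a lip detection method that combines Red Exclusion with the Fisher transform. Faces are detected in the video images using a skin-color model and motion correlation; red is then excluded in RGB space and the (G, B) components are used as the Fisher transform vector to enhance the lip image in the lower third of the face. Because the gray values of the enhanced grayscale image follow a normal distribution, the skin-color and lip-color thresholds are determined adaptively, and the lips are segmented from the background. The method detects the complete lip shape at high speed and is insensitive to illumination, facial hair, and speaker identity.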
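The Fisher step can be illustrated with a two-class Fisher linear discriminant on (G, B) pixel vectors, where the red channel has already been dropped. This is a sketch under assumptions: the abstract does not say how the transform is trained, so the labelled lip and skin samples, and the names `fisher_direction` and `project`, are hypothetical.

```python
def mean2(points):
    """Mean of a list of 2D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter2(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(lip, skin):
    """Return w = Sw^-1 (m_lip - m_skin) for 2D (G, B) samples."""
    m1, m0 = mean2(lip), mean2(skin)
    a1, b1, d1 = scatter2(lip, m1)
    a0, b0, d0 = scatter2(skin, m0)
    a, b, d = a1 + a0, b1 + b0, d1 + d0  # within-class scatter [[a, b], [b, d]]
    det = a * d - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Inverse of [[a, b], [b, d]] is (1 / det) * [[d, -b], [-b, a]].
    return ((d * dx - b * dy) / det, (-b * dx + a * dy) / det)


def project(w, gb):
    """Scalar 'enhanced' value of one (G, B) pixel along direction w."""
    return w[0] * gb[0] + w[1] * gb[1]
```

Projecting every pixel of the lower face region with `project` yields a single-channel image in which lip and skin values are pushed apart, which is the kind of image on which a threshold can then be chosen.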
10.
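Adaptive lip region localization algorithm based on multiple color spaces   Total citations: 5, self-citations: 0, citations by others: 5
An adaptive lip region localization algorithm based on multiple color spaces is proposed. The algorithm combines color gradient information from the RGB color space with threshold segmentation of the hue and saturation components in HSV space, automatically locates an adaptive lip baseline from the position of the lips within the face, and finally uses the projection method to detect the rectangular region containing the mouth. Experimental results show that the algorithm is simple to implement and robust, frames the mouth region quickly and accurately, and lays a good foundation for subsequent lipreading feature extraction.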
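Two of the steps named in this abstract, hue/saturation thresholding in HSV space and the projection method for framing the mouth rectangle, can be sketched as below. The threshold values and the names `lip_mask` and `bounding_box_by_projection` are assumptions chosen for illustration; the RGB color gradient and the adaptive lip baseline are not shown.

```python
import colorsys


def lip_mask(rgb_img, hue_max=0.05, hue_min=0.9, sat_min=0.35):
    """Mark pixels whose hue is near red (hue wraps around 0) and whose
    saturation is high. Thresholds are illustrative, not from the paper."""
    mask = []
    for row in rgb_img:
        out = []
        for r, g, b in row:
            h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            is_lip = (h <= hue_max or h >= hue_min) and s >= sat_min
            out.append(1 if is_lip else 0)
        mask.append(out)
    return mask


def bounding_box_by_projection(mask, min_count=1):
    """Project the mask onto rows and columns and return
    (top, bottom, left, right) of the span with enough marked pixels."""
    row_proj = [sum(row) for row in mask]
    col_proj = [sum(col) for col in zip(*mask)]
    rows = [i for i, v in enumerate(row_proj) if v >= min_count]
    cols = [j for j, v in enumerate(col_proj) if v >= min_count]
    if not rows or not cols:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]
```

Raising `min_count` makes the box ignore isolated marked pixels, at the cost of trimming thin lip corners.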