20 similar documents found (search time: 15 ms)
International Journal on Document Analysis and Recognition (IJDAR) - Handwritten Text Recognition (HTR) in free-layout pages is a challenging image understanding task that can provide a relevant...

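To address the difficulty of liver segmentation in abdominal CT images, caused by low contrast between adjacent organs and large variation in liver shape across individuals, a fully convolutional neural network model for liver segmentation is proposed. A convolutional neural network first extracts deep, abstract image features; deconvolution operations then interpolate and reconstruct the extracted feature maps to produce the segmentation result. Because segmentation obtained by deconvolution alone tends to be coarse, high-level and low-level features are fused before deconvolution, and the number of deconvolution layers is increased while the deconvolution stride is reduced, yielding a more precise segmentation. Compared with traditional convolutional neural network segmentation methods, the model can make full use of the spatial information in CT images. Experimental data show that the model achieves high accuracy for liver segmentation in abdominal CT images.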
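The trade-off between one coarse deconvolution and several smaller-stride ones can be illustrated with the standard output-size formula for a transposed convolution. This is only a sketch: the abstract gives no kernel sizes, strides, or feature-map sizes, so every number below is a hypothetical choice, and the helper names are not from the paper.

```python
def deconv_output_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a transposed convolution (no dilation, no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel


def upsample_chain(size: int, layers: list[tuple[int, int, int]]) -> list[int]:
    """Apply (kernel, stride, padding) deconvolution layers in order and record each size."""
    sizes = [size]
    for kernel, stride, padding in layers:
        sizes.append(deconv_output_size(sizes[-1], kernel, stride, padding))
    return sizes


# Hypothetical case 1: a 16x16 feature map upsampled 32x by a single stride-32 layer.
single = upsample_chain(16, [(64, 32, 16)])
# Hypothetical case 2: the same 32x upsampling spread over five stride-2 layers.
gradual = upsample_chain(16, [(4, 2, 1)] * 5)

print(single)   # [16, 512]
print(gradual)  # [16, 32, 64, 128, 256, 512]
```

Both chains reach the same resolution, but the gradual one exposes intermediate resolutions (32, 64, 128, 256) at which lower-level feature maps of matching size could be fused.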

In this paper, we present a recognition system for on-line handwritten texts acquired from a whiteboard. The system is based on the combination of several individual classifiers of diverse nature. Recognizers based on different architectures (hidden Markov models and bidirectional long short-term memory networks) and on different sets of features (extracted from on-line and off-line data) are used in the combination. In order to increase the diversity of the underlying classifiers and fully exploit the current state-of-the-art in cursive handwriting recognition, commercial recognition systems have been included in the combined system, leading to a final word level accuracy of 86.16%. This value is significantly higher than the performance of the best individual classifier (81.26%).

This paper investigates rejection strategies for unconstrained offline handwritten text line recognition. The rejection strategies depend on various confidence measures that are based on alternative word sequences. The alternative word sequences are derived from specific integration of a statistical language model in the hidden Markov model based recognition system. Extensive experiments on the IAM database validate the proposed schemes and show that the novel confidence measures clearly outperform two baseline systems which use normalised likelihoods and local n-best lists, respectively.
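One simple way to turn alternative word sequences into a rejection rule is sketched below. It is an assumption-laden illustration, not the paper's confidence measure: it assumes the alternatives are already aligned word by word with the best hypothesis, scores each word by the fraction of alternatives that agree with it, and uses an arbitrary threshold.

```python
def word_confidences(best: list[str], alternatives: list[list[str]]) -> list[float]:
    """Fraction of alternative sequences that agree with the best one at each position."""
    if not alternatives:
        return [0.0] * len(best)
    return [
        sum(alt[i] == word for alt in alternatives) / len(alternatives)
        for i, word in enumerate(best)
    ]


def reject(best: list[str], alternatives: list[list[str]], threshold: float) -> list:
    """Keep words whose confidence reaches the threshold; mark the rest as None (rejected)."""
    confs = word_confidences(best, alternatives)
    return [word if conf >= threshold else None for word, conf in zip(best, confs)]


# Hypothetical recogniser output and four aligned alternatives.
best = ["the", "cat", "sat"]
alternatives = [
    ["the", "cat", "sat"],
    ["the", "cut", "sat"],
    ["the", "cot", "sat"],
    ["the", "cat", "set"],
]

print(word_confidences(best, alternatives))  # [1.0, 0.5, 0.75]
print(reject(best, alternatives, 0.6))       # ['the', None, 'sat']
```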

Li Hongzhu, Wang Weiqiang, Lv Ke. Multimedia Tools and Applications, 2019, 78(16): 22249-22268
Multimedia Tools and Applications - The convolutional recurrent neural network is one of the most popular text recognition methods. Recurrent structures can extract long-term dependencies, but they...

Multimedia Tools and Applications - This study proposes an approach for segmentation of skin lesions from dermoscopic images based on fully convolutional neural network and active contour model...

This paper presents a new high-accuracy technique for recognizing both typewritten and handwritten English and Arabic texts without thinning. After segmenting the text into lines (horizontal segmentation) and the lines into words, it separates each word into its letters. Separating a text line (row) into words and a word into letters is performed with the region growing technique (implicit segmentation) on the basis of three essential lines in a text row. This saves time, as there is no need to skeletonize or to physically isolate letters from the tested word, while the input data involves only the basic information: the scanned text. The baseline is detected, the word contour is defined, and the word is implicitly segmented into its letters according to a novel algorithm described in the paper. The extracted letter with its dots is used as one unit in the recognition system. It is resized into a 9 × 9 matrix by bilinear interpolation after a lowpass filter is applied to reduce aliasing. The elements are then scaled to the interval [0,1]. The resulting array is the input to the designed neural network. For typewritten texts, three Arabic fonts are used: Arial, Arabic Transparent and Simplified Arabic. The results showed an average recognition success rate of 93% for Arabic typewriting. This segmentation approach has also been applied to handwritten text, where words are classified with a relatively high recognition rate for both Arabic and English. The experiments were performed in MATLAB and have shown promising results that can be a good base for further analysis of the recognition of Arabic and other cursive-script text as well as English handwritten text. For English handwritten classification, a success rate of about 80% on average was achieved, while for Arabic handwritten text the algorithm succeeded in about 90% of cases. The recent results have shown increasing success for both Arabic and English texts.
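The letter normalization step (resize to a 9 × 9 matrix by bilinear interpolation, then scale to [0,1]) can be sketched in plain Python as follows. This is an illustration only, not the authors' MATLAB code: the lowpass filter is omitted, corner-aligned sampling and min-max scaling are assumptions, and the 20 × 20 ramp image is made-up test data.

```python
def bilinear_resize(img: list[list[float]], out_h: int, out_w: int) -> list[list[float]]:
    """Resize a 2-D grid with bilinear interpolation (corner-aligned sampling)."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for r in range(out_h):
        # Map the output row to a fractional input row.
        y = r * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        fy = y - y0
        row = []
        for c in range(out_w):
            x = c * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            fx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def scale_unit(img: list[list[float]]) -> list[list[float]]:
    """Min-max scale all elements to the interval [0, 1]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = hi - lo
    return [[(v - lo) / span if span else 0.0 for v in row] for row in img]


# Made-up 20x20 "letter" image: a diagonal intensity ramp from 0 to 190.
letter = [[(r + c) * 5 for c in range(20)] for r in range(20)]
grid = scale_unit(bilinear_resize(letter, 9, 9))
inputs = [v for row in grid for v in row]  # flattened 81-element network input
print(len(inputs), min(inputs), max(inputs))  # 81 0.0 1.0
```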

Recent work on extracting features of gaps in handwritten text allows a classification of these gaps into inter-word and intra-word classes using suitable classification techniques. In this paper, we first analyse the features of the gaps using mutual information. We then investigate the underlying data distribution by using visualisation methods. These suggest that a complicated structure exists, which makes the gaps difficult to separate into two distinct classes. We apply five different supervised classification algorithms from the machine learning field to both the original dataset and a dataset with the best features selected using mutual information. Moreover, we improve the classification result with the aid of a set of feature variables of strokes preceding and following each gap. The classifiers are compared by employing McNemar's test. We find that SVMs and MLPs outperform the other classifiers and that preprocessing to select features works well. The best classification result attained suggests that the technique we employ is particularly suitable for digital ink manipulation at the level of words.
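McNemar's test compares two classifiers on the same samples using only the cases where exactly one of them is correct. The sketch below computes the continuity-corrected statistic; the labels and predictions are invented for illustration and have nothing to do with the paper's data.

```python
def mcnemar_statistic(truth: list[int], pred_a: list[int], pred_b: list[int]) -> float:
    """Continuity-corrected McNemar statistic for two classifiers on the same samples."""
    # b: A correct and B wrong; c: A wrong and B correct.
    b = sum(a == t and p != t for t, a, p in zip(truth, pred_a, pred_b))
    c = sum(a != t and p == t for t, a, p in zip(truth, pred_a, pred_b))
    if b + c == 0:
        return 0.0  # the classifiers never disagree in correctness
    return (abs(b - c) - 1) ** 2 / (b + c)


# Invented example: 14 gaps, all truly inter-word (label 1).
truth = [1] * 14
pred_a = [1] * 12 + [0] * 2            # classifier A: wrong on the last 2
pred_b = [1] * 2 + [0] * 10 + [1] * 2  # classifier B: wrong on 10 in the middle

stat = mcnemar_statistic(truth, pred_a, pred_b)
# 3.841 is the chi-square critical value for 1 degree of freedom at the 0.05 level.
print(round(stat, 4), stat > 3.841)  # 4.0833 True
```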

Tong Zheng, Xu Philippe, Denœux Thierry. Applied Intelligence, 2021, 51(9): 6376-6399
Applied Intelligence - We propose a hybrid architecture composed of a fully convolutional network (FCN) and a Dempster-Shafer layer for image semantic segmentation. In the so-called evidential FCN...

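Objective: Handwritten text line extraction is an important basic step in document image processing. In unconstrained handwritten text images, text lines exhibit varying degrees of skew, curvature, crossing, and touching. Traditional geometric segmentation or clustering methods often cannot guarantee accurate segmentation of text line edges. To address these problems, a handwritten text line extraction method based on a joint text line regression-clustering framework is proposed. Method: First, a bank of anisotropic Gaussian filters performs multi-scale, multi-orientation analysis of the image; the smearing effect is used to detect ridge structures and extract the main body regions of text lines, which are then skeletonized to obtain the text line regression model. Next, a superpixel representation is built with connected components as the basic image units. To cluster the superpixels, a hierarchical random field model associating pixels, superpixels, and text lines is established, and energy function optimization is used to cluster the superpixels and label the text line to which each belongs. On this basis, all character blocks that touch across lines are detected, and a regression-line-based k-means clustering algorithm, guided by the regression model, clusters the pixels of the touching characters, segmenting them and labeling the text line to which each pixel belongs. Finally, text line label switches are used to control the display and targeted extraction of text line pixels, so geometric segmentation is no longer needed. Results: In text line extraction tests on the HIT-MW offline handwritten Chinese document dataset, the detection rate (DR) was 99.83% and the recognition accuracy (RA) was 99.92%. Conclusion: Experiments show that, compared with traditional methods such as piecewise projection analysis, minimum spanning tree clustering, and seam carving, the proposed joint regression-clustering framework improves the controllability and segmentation accuracy of text line edges. It extracts handwritten text lines efficiently while minimizing interference from adjacent text lines, with high accuracy and robustness.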
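The idea of letting regression lines guide the clustering of touching-character pixels can be illustrated with a single assignment step: each pixel is labeled with the nearest line. This is a strong simplification and not the paper's algorithm. It assumes each text line is a straight line y = a·x + b (the paper's regression model comes from a skeleton and need not be straight), uses vertical distance, and runs on invented coordinates.

```python
def assign_pixels(pixels: list[tuple[float, float]],
                  lines: list[tuple[float, float]]) -> list[int]:
    """Label each (x, y) pixel with the index of the nearest (slope, intercept) line."""
    labels = []
    for x, y in pixels:
        # Vertical distance from the pixel to every regression line.
        distances = [abs(y - (a * x + b)) for a, b in lines]
        labels.append(distances.index(min(distances)))
    return labels


# Two hypothetical text lines: a flat one at y = 10 and a slightly rising one.
lines = [(0.0, 10.0), (0.1, 30.0)]
# Pixels of a hypothetical character block that touches both lines.
pixels = [(0, 12), (50, 19), (50, 27), (100, 38)]

print(assign_pixels(pixels, lines))  # [0, 0, 1, 1]
```

In a full k-means loop the cluster centers would be re-estimated after each assignment; here the lines stay fixed, which mirrors the idea that the regression model guides the clustering.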

In this paper, we present a methodology for segmenting handwritten documents into their distinct entities, namely, text lines and words. Text line segmentation is achieved by applying the Hough transform to a subset of the connected components of the document image. A post-processing step includes the correction of possible false alarms, the detection of text lines that the Hough transform failed to create, and finally the efficient separation of vertically connected characters using a novel method based on skeletonization. Word segmentation is addressed as a two-class problem. The distances between adjacent overlapped components in a text line are calculated using a combination of two distance metrics, and each distance is categorized as either an inter-word or an intra-word distance in a Gaussian mixture modeling framework. The performance of the proposed methodology is assessed with a consistent and concrete evaluation methodology that uses suitable performance measures to compare the text line segmentation and word segmentation results against the corresponding ground truth annotation. The efficiency of the proposed methodology is demonstrated by experimentation conducted on two different datasets: (a) the test set of the ICDAR2007 handwriting segmentation competition and (b) a set of historical handwritten documents.

Multi-oriented text occurs frequently in ancient handwritten documents, where writers update a document by adding annotations in the margins. Because the margins are narrow, this gives rise to lines in different directions and orientations. Document recognition needs to find the lines wherever they are written, whatever their orientation. We therefore propose a new approach for extracting multi-oriented lines in scanned documents. Because of the multi-orientation of lines and their dispersion in the page, we use an image meshing that allows us to determine the lines progressively and locally. Once the meshing is established, the orientation is determined using the Wigner–Ville distribution on the projection histogram profile. This local orientation is then enlarged to limit the orientation in the neighborhood. Afterward, the text lines are extracted locally in each zone based on following the orientation lines and on the proximity of connected components. Finally, the connected components that overlap and touch in adjacent lines are separated. The morphology analysis of the terminal letters of Arabic words is considered here. The proposed approach was tested on 100 documents, reaching an accuracy of about 98.6%.

This paper investigates various ensemble methods for offline handwritten text line recognition. To obtain ensembles of recognisers, we implement bagging, random feature subspace, and language model variation methods. For the combination, the word sequences returned by the individual ensemble members are first aligned. Then a confidence-based voting strategy determines the final word sequence. A number of confidence measures based on normalised likelihoods and alternative candidates are evaluated. Experiments show that the proposed ensemble methods can improve the recognition accuracy over an optimised single reference recogniser.
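A confidence-based vote over aligned word sequences might look like the sketch below. It assumes the alignment step has already produced equal-length sequences and that each recogniser reports one confidence per word; the summed-confidence rule and all values are illustrative assumptions, not the paper's voting strategy.

```python
from collections import defaultdict


def vote(aligned: list[list[str]], confidences: list[list[float]]) -> list[str]:
    """At each position, pick the word with the highest summed confidence."""
    result = []
    for position in range(len(aligned[0])):
        scores = defaultdict(float)
        for words, confs in zip(aligned, confidences):
            scores[words[position]] += confs[position]
        result.append(max(scores, key=scores.get))
    return result


# Hypothetical output of three ensemble members after alignment.
aligned = [
    ["the", "cat", "sat"],
    ["the", "cut", "sat"],
    ["the", "cut", "set"],
]
confidences = [
    [0.9, 0.9, 0.6],
    [0.8, 0.3, 0.7],
    [0.7, 0.4, 0.5],
]

print(vote(aligned, confidences))  # ['the', 'cat', 'sat']
```

At the second position "cat" wins with a score of 0.9 against 0.7 for "cut", even though two of the three members chose "cut"; this is where confidence-based voting differs from a plain majority vote.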

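Objective: Because the tongue body is similar in color to the surrounding tissue and its contour is blurred, traditional segmentation methods struggle to segment it precisely. A tongue segmentation method based on a two-stage convolutional neural network is therefore proposed. Method: First, in the coarse segmentation stage, convolutional and fully connected layers are combined to build the network Rsnet; a region proposal strategy yields candidate boxes for the tongue, from which the tongue body is then determined, localizing the tongue and removing a large amount of interfering information. Then, in the fine segmentation stage, convolutional and deconvolutional layers are combined to build the network Fsnet, which classifies every pixel of the coarsely segmented tongue image to achieve fine segmentation. Finally, morphological algorithms post-process the finely segmented tongue image to further remove noise points and rough edge points. Results: A dataset of 2,764 tongue images was constructed, and five-fold cross-validation was carried out on it. The results show that the algorithm achieves good segmentation results at a fast processing speed. With precision, recall, and F-measure as evaluation criteria, the method improves the overall F-measure by 0.58, 0.34, and 0.12 over three commonly used traditional segmentation methods and is at least 6 times more efficient; compared with the MNC (multi-task network cascades) algorithm, which is likewise based on deep learning, it improves the F-measure by 0.17 and efficiency by a factor of 1.9. Conclusion: Applying deep learning to tongue segmentation helps achieve accurate, robust, and fast segmentation of tongue images. Localizing the tongue before segmentation helps further reduce mis-segmentation and missed segmentation. The experimental results show that the algorithm effectively improves the accuracy of tongue segmentation and can lay a solid foundation for subsequent automatic tongue image recognition and analysis.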

In this paper we present a multiple classifier system (MCS) for on-line handwriting recognition. The MCS combines several individual recognition systems based on hidden Markov models (HMMs) and bidirectional long short-term memory networks (BLSTM). Besides using two different recognition architectures (HMM and BLSTM), we use various feature sets based on on-line and off-line features to obtain diverse recognizers. Furthermore, we generate a number of different neural network recognizers by changing the initialization parameters. To combine the word sequences output by the recognizers, we incrementally align these sequences using the recognizer output voting error reduction framework (ROVER). For deriving the final decision, different voting strategies are applied. The best combination ensemble has a recognition rate of 84.13%, which is significantly higher than the 83.64% achieved if only one recognition architecture (HMM or BLSTM) is used for the combination, and even remarkably higher than the 81.26% achieved by the best individual classifier. To demonstrate the high performance of the classification system, the results are compared with two widely used commercial recognizers from Microsoft and Vision Objects.

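In computer vision, semantic segmentation is a key task for scene parsing and behavior recognition, and image semantic segmentation methods based on deep convolutional neural networks have made breakthrough progress. The task of semantic segmentation is to assign a class label to every pixel in an image; it is a form of pixel-level image understanding. Object detection only localizes the bounding boxes of objects, whereas semantic segmentation must segment out the objects in the image. This paper first analyzes and describes the difficulties and challenges in semantic segmentation, and introduces the datasets and objective evaluation metrics commonly used to evaluate semantic segmentation algorithms. It then summarizes the state of research, in China and abroad, on current mainstream image semantic segmentation methods based on deep convolutional neural networks. According to whether network training requires pixel-level annotated images, existing methods are divided into two categories, supervised and weakly supervised semantic segmentation, and the strengths and weaknesses of each are described and analyzed in detail. The paper compares a selection of supervised and weakly supervised semantic segmentation models on the PASCAL VOC (pattern analysis, statistical modelling and computational learning visual object classes) 2012 dataset, and reports the best-performing methods among the supervised and weakly supervised models along with their MIoU (mean intersection-over-union). Finally, it points out possible future research directions in image semantic segmentation.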
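For reference, MIoU averages the per-class intersection-over-union between the predicted and ground-truth label maps. A minimal sketch on flattened label lists follows; skipping classes absent from both maps is one common convention and is an assumption here, and the labels are invented.

```python
def mean_iou(truth: list[int], pred: list[int], num_classes: int) -> float:
    """Mean intersection-over-union over classes, for flat per-pixel label lists."""
    ious = []
    for k in range(num_classes):
        inter = sum(t == k and p == k for t, p in zip(truth, pred))
        union = sum(t == k or p == k for t, p in zip(truth, pred))
        if union:  # skip classes that appear in neither map
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Invented six-pixel example with three classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]

# Per-class IoU: class 0 -> 1/3, class 1 -> 2/3, class 2 -> 1/2.
print(round(mean_iou(truth, pred, 3), 4))  # 0.5
```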

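Objective: Kidney image segmentation is important for diagnosing kidney disease; clinically, measuring the volume and thickness of the renal cortex can help determine whether the kidney has a tumor, chronic arteriosclerotic nephropathy, acute rejection after kidney transplantation, and so on. Most existing kidney segmentation algorithms target a single modality and can only segment the kidney as a whole. This paper proposes an automatic renal cortex segmentation algorithm based on a fully convolutional network and GrowCut for multimodal kidney image segmentation. Method: First, the generalized Hough transform is used to detect the kidney and extract the region of interest, and the labeled data are expanded through data augmentation. Then a pretrained VGG-16 model is used for transfer learning to build a fully convolutional network suited to renal cortex segmentation; the training parameters are set and the network is trained on the augmented data. Finally, the fully convolutional network segments the image; the feature map of the last convolutional layer is extracted to obtain seed point labels, erroneous seed points are corrected using prior knowledge of kidney images, and the resulting label map serves as the initial seeds for GrowCut, achieving accurate renal cortex segmentation. Results: The experimental data consist of 30 sets of clinical CT and MRI images, of which one set of labeled CT images was used to train the network and test the segmentation accuracy of the algorithm. The algorithm's segmentation accuracy reached 91.06%±2.34% in IU (region intersection over union) and 91.79%±2.39% in DSC (Dice similarity coefficient). Compared with the fully convolutional network FCN-32s, the proposed network has fewer parameters and higher accuracy, and can segment the renal cortex. The GrowCut algorithm takes neighborhood information between pixels into account, and combining it with the fully convolutional network further improves segmentation accuracy by 3%. Conclusion: The method can accurately segment multimodal kidney images, including images of normal and abnormal kidneys, which indicates that it outperforms mainstream methods and can provide a reliable basis for clinical diagnosis.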
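The two reported metrics, IU and DSC, can be computed for a single binary mask pair as sketched below. The masks are invented six-pixel examples, and returning 1.0 when both masks are empty is a convention chosen here rather than anything stated in the abstract.

```python
def iou_and_dice(truth: list[int], pred: list[int]) -> tuple[float, float]:
    """Intersection over union and Dice coefficient for flat binary masks."""
    inter = sum(1 for t, p in zip(truth, pred) if t and p)
    truth_area = sum(1 for t in truth if t)
    pred_area = sum(1 for p in pred if p)
    union = truth_area + pred_area - inter
    iou = inter / union if union else 1.0
    dice = 2 * inter / (truth_area + pred_area) if truth_area + pred_area else 1.0
    return iou, dice


# Invented masks: 1 marks renal cortex pixels, 0 marks background.
truth = [1, 1, 1, 0, 0, 0]
pred = [0, 1, 1, 1, 0, 0]

iou, dice = iou_and_dice(truth, pred)
print(iou, round(dice, 4))  # 0.5 0.6667
```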

