This study presents a new method, the multi-plane segmentation approach, for segmenting and extracting textual objects from real-life complex document images. The approach first decomposes the document image into distinct object planes in order to separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. This decomposition consists of two stages: localized histogram multilevel thresholding, followed by multi-plane region matching and assembling. A text extraction procedure is then applied to the resultant planes to detect and extract textual objects with different characteristics in the respective planes. The approach processes document images regionally and adaptively according to their local features. Hence detailed characteristics of the extracted textual objects, particularly small characters with thin strokes and gradational illumination of characters, are well preserved. This also allows background objects with uneven, gradational, and sharp variations in contrast, illumination, and texture to be handled effectively. Experimental results on real-life complex document images demonstrate that the approach is effective in extracting textual objects with various illuminations, sizes, and font styles from various types of complex document images.
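The abstract does not spell out its thresholding rule, so the following is only a rough sketch of the general idea of localized histogram thresholding. It swaps in plain single-level Otsu thresholding, computed independently per image block, in place of the authors' multilevel method. The function names `otsu_threshold` and `local_thresholds` and the block layout are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Single-level Otsu threshold (a stand-in, not the paper's multilevel rule).

    Returns t such that pixels <= t form one class and pixels > t the other,
    maximizing the between-class variance of the gray-level histogram.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0        # number of pixels in the lower class
    sum0 = 0      # sum of gray levels in the lower class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def local_thresholds(image, block):
    """Compute one threshold per block x block tile (hypothetical layout).

    image is a list of equal-length rows of integer gray levels.
    Returns a dict mapping the tile's top-left (row, col) to its threshold.
    """
    out = {}
    for r in range(0, len(image), block):
        for c in range(0, len(image[0]), block):
            tile = [v for row in image[r:r + block] for v in row[c:c + block]]
            out[(r, c)] = otsu_threshold(tile)
    return out
```

Computing thresholds per tile lets dark text on a dark region and bright text on a bright region each get a cut-off suited to their own local histogram, which is the motivation the abstract gives for regional, adaptive processing.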

Appearance-based visual speech recognition using only video signals is presented. The proposed technique is based on directional motion history images (DMHIs), which are an extension of the popular optical-flow method for object tracking. Zernike moments of each DMHI are computed in order to perform the classification. The technique incorporates automatic temporal segmentation of isolated utterances, which is achieved using pair-wise pixel comparison. A support vector machine is used for classification, and the results are based on the leave-one-out paradigm. Experimental results show that the proposed technique achieves better performance in viseme recognition than others reported in the literature. The benefit of this visual speech recognition method is that it is suitable for real-time applications because of its quick motion tracking and fast classification. It has applications in command and control using lip-movement-to-text conversion, can be used in noisy environments, and can assist speech-impaired persons.

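Dynamic speech balloons are a recently emerged technique for conveying atmosphere in animation and comics, games, and social applications, yet research on them remains scarce. This work therefore proposes a mathematical model of periodic dynamic speech balloons and studies techniques for rendering and editing them. A dynamic speech balloon is represented as the superposition of a fixed baseline and a dynamic curve; each control point of the dynamic curve is expressed as a set of Fourier coefficients, and the periodic dynamic effect is produced by Fourier synthesis. Several rendering styles are proposed to achieve a variety of real-time presentation effects. Theme templates and pen-based editing tools are proposed to give users an intuitive way to edit and create periodic dynamic speech balloon models, which contain a large number of parameters. A prototype authoring system was implemented and used to construct a variety of dynamic speech balloon effects, verifying that the technique can achieve novel and diverse atmosphere-enhancing effects.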
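As a minimal sketch of the Fourier synthesis described in this abstract, the snippet below evaluates the periodic displacement of a single control point and adds it to a fixed baseline position. The coefficient layout (a constant term plus `(a_k, b_k)` pairs), the function names, and the default period are illustrative assumptions, not the authors' actual parameterization.

```python
import math


def control_point_offset(a0, harmonics, t, period=1.0):
    """Evaluate a truncated Fourier series at time t.

    a0        -- constant term (assumed layout)
    harmonics -- list of (a_k, b_k) pairs for k = 1, 2, ...
    """
    omega = 2.0 * math.pi / period
    value = a0
    for k, (a_k, b_k) in enumerate(harmonics, start=1):
        value += a_k * math.cos(k * omega * t) + b_k * math.sin(k * omega * t)
    return value


def animated_point(baseline_xy, coeffs_x, coeffs_y, t, period=1.0):
    """Fixed baseline position plus a periodic dynamic displacement.

    coeffs_x and coeffs_y are (a0, harmonics) tuples for each axis.
    """
    bx, by = baseline_xy
    dx = control_point_offset(coeffs_x[0], coeffs_x[1], t, period)
    dy = control_point_offset(coeffs_y[0], coeffs_y[1], t, period)
    return (bx + dx, by + dy)
```

Because every term has a frequency that is an integer multiple of the base frequency, the point returns to the same position after each period, which is what makes the balloon animation loop seamlessly.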

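Analysis of word segmentation and tagging errors in a Web news corpus   Total citations: 1 (self-citations: 1, citations by others: 0)
By analyzing statistics from the processing of texts in a Web emergency-event corpus, 11 types of errors are identified, and solutions are proposed for some of them. The results not only suggest improvements to word segmentation and tagging methods in the early stage of corpus processing, but also offer a useful reference for automatic proofreading of Chinese text.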

Dot-matrix text recognition is a difficult problem, especially when characters are broken into several disconnected components. We present a dot-matrix text recognition system which uses the fact that dot-matrix fonts are fixed-pitch, in order to overcome the difficulty of the segmentation process. After finding the most likely pitch of the text, a decision is made as to whether the text is written in a fixed-pitch or proportional font. Fixed-pitch text is segmented using a pitch-based segmentation process that can successfully segment both touching and broken characters. We report performance results for the pitch estimation, fixed-pitch decision and segmentation, and recognition processes. Received October 18, 1999 / Revised April 21, 2000
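The abstract does not say how the most likely pitch is found. One common way to estimate a repeating character width, shown here purely as an assumed stand-in, is to take the lag that maximizes the autocorrelation of the column projection profile of a text line. The function name `estimate_pitch` and its parameters are hypothetical.

```python
def estimate_pitch(profile, min_pitch, max_pitch):
    """Estimate character pitch from a column projection profile.

    profile   -- ink count per pixel column of one text line
    min_pitch -- smallest lag (in columns) to consider
    max_pitch -- largest lag (in columns) to consider

    Returns the lag with the highest mean-removed autocorrelation.
    This is an illustrative technique, not the paper's stated method.
    """
    n = len(profile)
    mean = sum(profile) / n
    centered = [v - mean for v in profile]
    best_lag, best_score = None, float("-inf")
    for lag in range(min_pitch, min(max_pitch, n - 1) + 1):
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        score /= (n - lag)  # average over the overlap so long lags are comparable
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Once a pitch is available, a fixed-pitch line can be cut at regular intervals even where characters touch or are broken into disconnected dots, which is the advantage the abstract attributes to pitch-based segmentation.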

《Pattern recognition letters》2001,22(3-4):395-405
In this paper, we propose a new knowledge-based method, illustrated in the context of segmentation, which labels internal brain structures viewed by magnetic resonance imaging (MRI). To improve the accuracy of the labeling, we introduce a fuzzy model of regions of interest (ROI), by analogy with the electrostatic potential distribution, to represent more appropriately the knowledge of distance, shape, and relationship of structures. The knowledge is mainly derived from the Talairach stereotaxic atlas. The labeling is achieved by regionwise labeling using genetic algorithms (GAs), followed by a voxelwise amendment using parallel region growing. The fuzzy model is used both to design the fitness function of the GAs and to guide the region growing. The performance of the proposed method has been quantitatively validated by six indices with respect to manually labeled images.

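Segmentation of ribs and vertebrae in liver CTA images   Total citations: 2 (self-citations: 0, citations by others: 2)
Segmenting the ribs and vertebrae is an important preprocessing step for accurately segmenting the liver from liver CTA (liver CT angiography) images. Thresholding is the usual choice, but it often leads to incomplete segmentation or over-segmentation. A method that combines morphological operations with thresholding is proposed for segmenting the ribs and vertebrae. Using knowledge of regional anatomy, four kinds of image boundary feature lines are constructed, and the ribs and vertebrae are then segmented by binary morphological processing. This accommodates the variation in rib and vertebra gray levels across images and also eliminates the influence of other organs with similar gray levels. Experimental results show that the method segments the ribs and vertebrae in liver CTA images efficiently and accurately.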

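Speech segmentation is an indispensable foundation for speech recognition and speech synthesis, and its quality has a large impact on downstream systems. Manual segmentation and labeling are accurate but time-consuming and laborious, and they require skilled domain experts; automatic speech segmentation has therefore become an active research topic in speech processing. This survey first reviews current progress in automatic speech segmentation and introduces the different ways of categorizing segmentation methods. It then describes alignment-based methods and boundary-detection-based methods, and details neural network speech segmentation methods that can be applied within both frameworks. Next, it introduces newer segmentation techniques based on bio-inspired signals and on game theory, presents the performance evaluation metrics widely used in the field, and compares and analyzes these metrics. Finally, it summarizes the work and proposes important directions for future research on speech segmentation.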

This paper presents a methodology for separating handwritten foreground pixels from background pixels in carbon-copied medical forms. Comparisons between prior and proposed techniques are illustrated. This study involves the analysis of the New York State (NYS) Department of Health (DoH) Pre-Hospital Care Report (PCR) [Western Regional Emergency Medical Services, Bureau of Emergency Medical Services, New York State (NYS) Department of Health (DoH), Prehospital Care Report v4.], which is a standard form used in New York by all Basic and Advanced Life Support pre-hospital health care professionals to document patient status in the emergency environment. The forms suffer from extreme carbon mesh noise, varying handwriting pressure, and smudging, all of which are further complicated by the writing environment. Extraction of handwriting from these medical forms is a vital step in automating emergency medical health surveillance systems.

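Objective: Current text detection methods based on convolutional neural networks (CNNs) have great difficulty localizing small-scale text in natural scenes. However, text objects in natural scene images are strongly correlated with other objects: text usually appears together with specific objects such as billboards and road signs. Based on this observation, this paper proposes a cascaded CNN method for natural scene text detection that takes object correlation into account. Method: First, a CNN detects text objects and the associated objects that contain text, yielding text candidate boxes and candidate boxes for the text-bearing objects. The candidate box regions of the text-bearing objects are then enlarged and cropped from the original image, and each cropped image is fed to a CNN to detect text candidate boxes more precisely. Finally, non-maximum suppression merges the text candidate boxes produced by the two steps to give the text detection result. Results: The method detects small-scale text effectively; on the ICDAR-2013 dataset, recall, precision, and F-measure are 0.817, 0.880, and 0.847, respectively. Conclusion: By taking into account the strong correlation between text objects and text-bearing objects in natural scenes, the method improves the recall of small-scale text detection in natural scene images.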
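The final merging step relies on non-maximum suppression, and the reported F-measure follows from the reported precision and recall. The sketch below shows a plain greedy NMS over axis-aligned boxes and the harmonic-mean F-measure. The box format `(x1, y1, x2, y2)`, the IoU threshold of 0.5, and the function names are assumptions for illustration; the abstract does not state the settings the authors used.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    Visits boxes in order of decreasing score and keeps a box only if it
    does not overlap an already kept box by more than iou_threshold.
    Returns the indices of the kept boxes.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

With the figures quoted in the abstract, `f_measure(0.880, 0.817)` evaluates to about 0.847, which agrees with the reported F-measure.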

Adaptive segmentation of noisy and textured images   Total citations: 2 (self-citations: 0, citations by others: 2)
An image segmentation algorithm is described which is based on the integration of signal model parameter estimates and maximum a posteriori labelling. The parameter estimation uses either a maximum likelihood method for a quadric signal model or a maximum pseudo-likelihood method for a Gauss-Markov signal model. The first case is applicable to standard grey-level image segmentation as well as segmentation of shaded 3D surfaces, while the second case is applicable to texture segmentation. A key aspect of the algorithm is a coarse-to-fine processing strategy which limits the search for the optimum labelling at any one resolution to the subset of labellings that are consistent with the optimum labelling at the previous, coarser resolution. Consistency is defined by a prior label model which specifies the conditional probability of a given label in terms of the labelling at the previous level of resolution. It is shown how such an approach leads to a simple relaxation procedure based on local pyramid node computations. An extension of the algorithm is also described which performs accurate inter-region boundary placement using a step-wise refinement procedure based on a simple adaptive filter. The problem of automatically determining the number of regions is also addressed. It is shown how a simple agglomerative clustering idea, again based on pyramid node computations, can effectively solve this problem.

Building a large vocabulary continuous speech recognition (LVCSR) system requires many hours of segmented and labelled speech data. Arabic, like many other low-resource languages, lacks such data, but automatic segmentation has proved to be a good alternative for making these resources available. In this paper, we suggest combining hidden Markov models (HMMs) and support vector machines (SVMs) to segment and label the speech waveform into phoneme units. The HMMs generate the sequence of phonemes and their boundaries; the SVM refines the boundaries and corrects the labels. The resulting segmented and labelled units may serve as a training set for speech recognition applications. The HMM/SVM segmentation algorithm is assessed using both the hit rate and the word error rate (WER); the resulting scores were compared to those obtained with manual segmentation and to those obtained with the well-known embedded learning algorithm. The results show that the speech recognizer built upon the HMM/SVM segmentation outperforms the one built upon the embedded learning segmentation by about 0.05% in terms of WER, even with a noisy background.
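The hit rate mentioned in this abstract is usually understood as the share of reference phoneme boundaries that an automatic segmenter places within a small time tolerance. The sketch below computes it under that reading. The 20 ms default tolerance, the one-to-one matching rule, and the name `hit_rate` are assumptions; the abstract does not give the authors' exact definition.

```python
def hit_rate(reference, predicted, tolerance=0.02):
    """Fraction of reference boundaries matched by a predicted boundary.

    reference -- manually labelled boundary times in seconds
    predicted -- automatically detected boundary times in seconds
    tolerance -- maximum allowed deviation in seconds (assumed 20 ms)

    Each predicted boundary may be matched to at most one reference
    boundary; the closest unused candidate within tolerance is chosen.
    """
    used = set()
    hits = 0
    for r in reference:
        best = None
        for j, p in enumerate(predicted):
            if j in used or abs(p - r) > tolerance:
                continue
            if best is None or abs(p - r) < abs(predicted[best] - r):
                best = j
        if best is not None:
            used.add(best)
            hits += 1
    return hits / len(reference) if reference else 0.0
```

For example, with reference boundaries `[0.10, 0.25, 0.40, 0.62]` and predictions `[0.11, 0.29, 0.405, 0.61]`, three of the four reference boundaries are matched within 20 ms, giving a hit rate of 0.75.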


Paper documents are ideal sources of useful information and have a profound impact on every aspect of human lives. These documents may be printed or handwritten and contain information as combinations of texts, figures, tables, charts, etc. This paper proposes a method to segment text lines from both flatbed scanned/camera-captured heavily warped printed and handwritten documents. This work uses the concept of semantic segmentation with the help of a multi-scale convolutional neural network. The results of line segmentation using the proposed method outperform a number of similar proposals already reported in the literature. The performance and efficacy of the proposed method have been corroborated by the test result on a variety of publicly available datasets, including ICDAR, Alireza, IUPR, cBAD, Tobacco-800, IAM, and our dataset.


Visual inspection based on closed-circuit television surveys is used widely in North America to assess the condition of underground pipes. Although the human eye is extremely effective at recognition and classification, it is not suitable for assessing pipe defects in thousands of miles of pipeline because of fatigue, subjectivity, and cost. In this paper, a simple, robust, and efficient image segmentation and classification algorithm for the automated analysis of scanned underground pipe images is presented. The experimental results demonstrate that the proposed algorithm can precisely segment and classify pipe cracks, holes, laterals, joints, and collapsed surfaces from underground pipe images.

The Pap smear test is a manual screening procedure that is used to detect precancerous changes in cervical cells based on color and shape properties of their nuclei and cytoplasm. Automating this procedure is still an open problem due to the complexities of cell structures. In this paper, we propose an unsupervised approach for the segmentation and classification of cervical cells. The segmentation process involves automatic thresholding to separate the cell regions from the background, a multi-scale hierarchical segmentation algorithm to partition these regions based on homogeneity and circularity, and a binary classifier to finalize the separation of nuclei from cytoplasm within the cell regions. Classification is posed as a grouping problem by ranking the cells based on their feature characteristics modeling abnormality degrees. The proposed procedure constructs a tree using hierarchical clustering, and then arranges the cells in a linear order by using an optimal leaf ordering algorithm that maximizes the similarity of adjacent leaves without any requirement for training examples or parameter adjustment. Performance evaluation using two data sets shows the effectiveness of the proposed approach in images having inconsistent staining, poor contrast, and overlapping cells.

Automated segmentation of touching or overlapping chromosomes in a metaphase image is a critical step for computer-aided chromosome analysis. Conventional chromosome imaging methods acquire single-band grayscale images, and this limitation makes the separation of touching or overlapping chromosomes challenging. In the multiplex fluorescence in situ hybridization (M-FISH) technique, each class of chromosomes can bind with a different combination of fluorophores. The M-FISH technique results in multispectral chromosome images, which have distinct spectral signatures. This paper presents a novel automated chromosome analysis method that combines pixel-level geometric and multispectral information with decision-level pairing information. Our chromosome segmentation method uses the geometric and spectral information to partition the chromosome cluster into three regions. Combining these regions into separated chromosomes using only spectral and geometric information leaves some ambiguity, so a graph-theoretical pairing method is introduced to resolve the remaining ambiguity. Experimental results demonstrate that the proposed joint segmentation and pairing method outperforms conventional grayscale and multispectral segmentation methods in separating touching and overlapping chromosomes.

The detection and segmentation of adherent eukaryotic cells from brightfield microscopy images represent challenging tasks in the image analysis field. This paper presents a free and open-source image analysis package which fully automates the tasks of cell detection, cell boundary segmentation, and nucleus segmentation in brightfield images. The package also performs image registration between brightfield and fluorescence images. The algorithms were evaluated on a variety of biological cell lines and compared against manual and fluorescence-based ground truths. When tested on HT1080 and HeLa cells, the cell detection step correctly identified over 80% of cells, the cell boundary segmentation step segmented over 75% of the cell body pixels, and the nucleus segmentation step correctly identified nuclei in over 75% of the cells. The algorithms for cell detection and nucleus segmentation are novel to the field, whilst the cell boundary segmentation algorithm is contrast-invariant, which makes it more robust on these low-contrast images. Together, this suite of algorithms permits brightfield microscopy image processing without the need for additional fluorescence images. Finally, our sephaCe application, which is available at http://www.sephace.com, provides a novel method for integrating these methods with any motorised microscope, thus facilitating the adoption of these techniques in biological research labs.
