期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Preprocessing of Low-Quality Handwritten Documents Using Markov Random Fields

Cao Huaigu Govindaraju Venu 《IEEE transactions on pattern analysis and machine intelligence》2009,31(7):1184-1194

This paper presents a statistical approach to the preprocessing of degraded handwritten forms including the steps of binarization and form line removal. The degraded image is modeled by a Markov Random Field (MRF) where the hidden-layer prior probability is learned from a training set of high-quality binarized images and the observation probability density is learned on-the-fly from the gray-level histogram of the input image. We have modified the MRF model to drop the preprinted ruling lines from the image. We use the patch-based topology of the MRF and Belief Propagation (BP) for efficiency in processing. To further improve the processing speed, we prune unlikely solutions from the search space while solving the MRF. Experimental results show higher accuracy on two data sets of degraded handwritten images than previously used methods. 相似文献

2.

Type-2 Fuzzy Markov Random Fields and Their Application to Handwritten Chinese Character Recognition 总被引：1，自引：0，他引：1

Jia Zeng Zhi-Qiang Liu 《Fuzzy Systems, IEEE Transactions on》2008,16(3):747-760

In this paper, we integrate type-2 (T2) fuzzy sets with Markov random fields (MRFs) referred to as T2 FMRFs, which may handle both fuzziness and randomness in the structural pattern representation. On the one hand, the T2 membership function (MF) has a 3-D structure in which the primary MF describes randomness and the secondary MF evaluates the fuzziness of the primary MF. On the other hand, MRFs can represent patterns statistical-structurally in terms of neighborhood system and clique potentials and, thus, have been widely applied to image analysis and computer vision. In the proposed T2 FMRFs, we define the same neighborhood system as that in classical MRFs. To describe uncertain structural information in patterns, we derive the fuzzy likelihood clique potentials from T2 fuzzy Gaussian mixture models. The fuzzy prior clique potentials are penalties for the mismatched structures based on prior knowledge. Because Chinese characters have hierarchical structures, we use T2 FMRFs to model character structures in the handwritten Chinese character recognition system. The overall recognition rate is 99.07%, which confirms the effectiveness of the proposed method. 相似文献

3.

Keyword spotting in poorly printed documents using pseudo 2-Dhidden Markov models

Shyh-Shiaw Kuo Agazzi O.E. 《IEEE transactions on pattern analysis and machine intelligence》1994,16(8):842-848

An algorithm for robust machine recognition of keywords embedded in a poorly printed document is presented. For each keyword, two statistical models, called pseudo 2-D hidden Markov models, are created for representing the actual keyword and all the other extraneous words, respectively. Dynamic programming is then used for matching an unknown input word with the two models and for making a maximum likelihood decision. Although the models are pseudo 2-D in the sense that they are not fully connected 2-D networks, they are shown to be general enough in characterizing printed words efficiently. These models facilitate a nice “elastic matching” property in both horizontal and vertical directions, which makes the recognizer not only independent of size and slant but also tolerant of highly deformed and noisy words. The system is evaluated on a synthetically created database that contains about 26000 words. Currently, the authors achieve a recognition accuracy of 99% when words in testing and training sets are of the same font size, and 96% when they are in different sizes. In the latter case, the conventional 1-D HMM achieves only a 70% accuracy rate 相似文献

4.

基于马尔可夫随机场的嘴唇特征提取方法*

岑杰赵杰煜《计算机应用研究》2007,24(7):300-302

讨论了基于马尔可夫随机场(MRF)模型的融合颜色和边缘信息的嘴唇特征提取方法.首先进行嘴唇区域检测,再结合嘴唇形状特点建立了基于MRF的嘴唇图像分割模型,构造相应的能量函数,并采用改进的最高置信度优先(HCF)算法求解能量函数的最优解,得到图像标记场,进而提取出嘴唇轮廓.结合人脸结构信息,提出了融合鼻孔角度信息的嘴唇特征点提取方法.实验结果表明,此算法具有良好的鲁棒性. 相似文献

5.

Fully Bayesian Field Slam Using Gaussian Markov Random Fields

下载免费PDF全文

Huan N. Do Mahdi Jadaliha Mehmet Temel Jongeun Choi 《Asian journal of control》2016,18(4):1175-1188

This paper presents a fully Bayesian way to solve the simultaneous localization and spatial prediction problem using a Gaussian Markov random field (GMRF) model. The objective is to simultaneously localize robotic sensors and predict a spatial field of interest using sequentially collected noisy observations by robotic sensors. The set of observations consists of the observed noisy positions of robotic sensing vehicles and noisy measurements of a spatial field. To be flexible, the spatial field of interest is modeled by a GMRF with uncertain hyperparameters. We derive an approximate Bayesian solution to the problem of computing the predictive inferences of the GMRF and the localization, taking into account observations, uncertain hyperparameters, measurement noise, kinematics of robotic sensors, and uncertain localization. The effectiveness of the proposed algorithm is illustrated by simulation results as well as by experiment results. The experiment results successfully show the flexibility and adaptability of our fully Bayesian approach in a data‐driven fashion. 相似文献

6.

基于模糊C均值与Markov随机场的图像分割 总被引：3，自引：1，他引：3

下载免费PDF全文

蔡涛徐国华徐筱龙《计算机工程》2007,33(20):34-36,3

针对传统模糊C-均值(FCM)图像分割算法没有考虑图像空间连续性的缺点,提出一种改进的空间约束FCM分割算法。该算法引入了Markov随机场理论中类别标记的伪似然度近似策略,将像素特征域相似性同空间域相邻性有机地结合起来,给出了新的像素样本聚类目标函数。实验证明,该算法能大大提高分割性能并改善分割的视觉效果。相似文献

7.

模糊马尔可夫随机场理论在阴影检测中的应用

下载免费PDF全文

柏柯嘉刘伟铭《中国图象图形学报》2010,15(3):409-416

阴影的检测是目标检测、目标跟踪、视频监控等领域的一个关键问题。提出了一种基于模糊马尔可夫随机场的阴影检测算法。该算法把阴影检测问题看做是一个求最优化的像素点分类问题。对于输入的视频,提取背景图像,找出阴影和前景目标物体区域。通过计算阴影概率分布,前景概率分布,隶属度函数,建立模糊马尔可夫随机场。应用贝叶斯准则,最大后验(MAP)估计和条件迭代模式(ICM)算法,寻找最优化的模糊马尔可夫随机场,并利用最大隶属度原则消除模糊性,得到阴影检测的结果。实验证明,文中算法具有较好的阴影检测率和目标检测率。相似文献

8.

Handwritten word-spotting using hidden Markov models and universal vocabularies

José A. Rodríguez-Serrano Author Vitae Florent Perronnin^{Author Vitae} 《Pattern recognition》2009,42(9):2106-2116

Handwritten word-spotting is traditionally viewed as an image matching task between one or multiple query word-images and a set of candidate word-images in a database. This is a typical instance of the query-by-example paradigm. In this article, we introduce a statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization. We explore the use of two types of HMMs for the word modeling part: continuous HMMs (C-HMMs) and semi-continuous HMMs (SC-HMMs), i.e. HMMs with a shared set of Gaussians. We show on a challenging multi-writer corpus that the proposed statistical framework is always superior to a traditional matching system which uses dynamic time warping (DTW) for word-image distance computation. A very important finding is that the SC-HMM is superior when labeled training data is scarce—as low as one sample per keyword—thanks to the prior information which can be incorporated in the shared set of Gaussians. 相似文献

9.

Dynamic graph cuts for efficient inference in Markov Random Fields 总被引：3，自引：0，他引：3

Kohli P Torr PH 《IEEE transactions on pattern analysis and machine intelligence》2007,29(12):2079-2088

Abstract-In this paper we present a fast new fully dynamic algorithm for the st-mincut/max-flow problem. We show how this algorithm can be used to efficiently compute MAP solutions for certain dynamically changing MRF models in computer vision such as image segmentation. Specifically, given the solution of the max-flow problem on a graph, the dynamic algorithm efficiently computes the maximum flow in a modified version of the graph. The time taken by it is roughly proportional to the total amount of change in the edge weights of the graph. Our experiments show that, when the number of changes in the graph is small, the dynamic algorithm is significantly faster than the best known static graph cut algorithm. We test the performance of our algorithm on one particular problem: the object-background segmentation problem for video. It should be noted that the application of our algorithm is not limited to the above problem, the algorithm is generic and can be used to yield similar improvements in many other cases that involve dynamic change. 相似文献

10.

基于高阶马尔科夫随机场的图像去噪声研究

温喆《计算机应用研究》2016,33(7)

在图像去噪声处理中,高阶马尔科夫随机场通过最小化能量函数达到最优的去噪声结果。为了提高能量函数的优化性能,本文在马尔科夫随机场子模性的基础上对原始问题和对偶问题进行了分析,提出了一种基于原始-对偶方法的子模块之和方法。首先,描述了马尔科夫随机场的线性规划及其对偶问题,并介绍了子模块之和流方法。接下来,通过对子模块之和流方法的原始问题和对偶问题进行分析,提出了同时满足派系松弛和一元松弛条件的近似解计算方法。实验表明,本文提出的方法与四种典型的图像去噪声方法相比具有更好的效果和更短的运行时间。相似文献

11.

The use of Markov Random Fields as models of texture

《Computer Graphics and Image Processing》1980,12(4):357-370

We propose Markov Random Fields (MRFs) as probabilistic models of digital image texture where a textured region is viewed as a finite sample of a two-dimensional random process describable by its statistical parameters. MRFs are multidimensional generalizations of Markov chains defined in terms of conditional probabilities associated with spatial neighborhoods. We present an algorithm that generates an MRF on a finite toroidal square lattice from an independent identically distributed (i.i.d.) array of random variables and a given set of independent real-valued statistical parameters. The parametric specification of a consistent collection of MRF conditional probabilities is a general result known as the MRF-Gibbs Random Field (GRF) equivalence. The MRF statistical parameters control the size and directionality of the clusters of adjacent similar pixels which are basic to texture discrimination and thus seem to constitute an efficient model of texture. In the last part of this paper we outline an MRF parameter estimation method and goodness of fit statistical tests applicable to MRF models for a given unknown digital image texture on a finite toroidal square lattice. The estimated parameters may be used as basic features in texture classification. Alternatively these parameters may be used in conjunction with the MRF generation algorithm as a powerful data compression scheme. 相似文献

12.

高阶马尔科夫随机场及其在场景理解中的应用 总被引：3，自引：0，他引：3

余淼胡占义《自动化学报》2015,41(7):1213-1234

与传统的一阶马尔科夫随机场(Markov random field, MRF)相比, 高阶马尔科夫随机场能够表达更加复杂的定性和统计性先验信息, 在模型的表达能力上具有更大的优势. 但高阶马尔科夫随机场对应的能量函数优化问题更为复杂. 同时其模型参数数目的爆炸式增长使得选择合适的模型参数也成为了一个非常困难的问题. 近年来, 学术界在高阶马尔科夫随机场的能量模型的建模、优化和参数学习三个方面进行了深入的探索, 取得了很多有意义的成果. 本文首先从这三个方面总结和介绍了目前在高阶马尔科夫随机场研究上取得的主要成果, 然后介绍了高阶马尔科夫随机场在图像理解和三维场景理解中的应用现状. 相似文献

13.

Preserving objects in Markov Random Fields region growing image segmentation

Amer Dawoud Anton Netchaev 《Pattern Analysis & Applications》2012,15(2):155-161

This paper proposes an algorithm that preserves objects in Markov Random Fields (MRF) region growing based image segmentation. This is achieved by modifying the MRF energy minimization process so that it would penalize merging regions that have real edges in the boundary between them. Experimental results show that the integration of edge information increases the precision of the segmentation by ensuring the conservation of the objects contours during the region-growing process. 相似文献

14.

Endocardial Boundary E timation and Tracking in Echocardiographic Images using Deformable Template and Markov Random Fields

Max Mignotte Jean Meunier Jean-Claude Tardif 《Pattern Analysis & Applications》2001,4(4):256-271

We present a new approach to shape-based segmentation and tracking of deformable anatomical structures in medical images, and validate this approach by detecting and tracking the endocardial contour in an echocardiographic image sequence. To this end, some global prior shape knowledge of the endocardial boundary is captured by a prototype template with a set of predefined global and local deformations to take into account its inherent natural variability over time. In this deformable model-based Bayesian segmentation, the data likelihood model relies on an accurate statistical modelling of the grey level distribution of each class present in the ultrasound image. The parameters of this distribution mixture are given by a preliminary iterative estimation step. This estimation scheme relies on a Markov Random Field prior model, and takes into account the imaging process as well as the distribution shape of each class present in the image. Then the detection and the tracking problem is stated in a Bayesian framework, where it ends up as a cost function minimisation problem for each image of the sequence. In our application, this energy optimisation problem is efficiently solved by a genetic algorithm combined with a steepest ascent procedure. This technique has been successfully applied on synthetic images, and on a real echocardiographic image sequence. 相似文献

15.

基于通用高斯马尔可夫随机场模型的图像超分辨率重建

黄华李俊齐春朱世华《计算机科学》2005,32(11):195-197

提出了一种基于通用高斯马尔可夫随机场（0孙皿疆）模型的图像超分辨率重建方法,给出了求解过程和实验结果,并进行了分析。相对Compound Markov随机场模型和Huber-Markov随机场模型,GGMRF模型不用判断边缘或者线过程,因此优化求解简单,大大减少了运算量。实验结果表明在低噪声情况下,该方法重建图像视觉效果较好。相似文献

16.

Supervised semantic relation mining from linguistically noisy text documents

Cristina Giannone Roberto Basili Paolo Naggar Alessandro Moschitti 《International Journal on Document Analysis and Recognition》2011,14(2):213-228

In this paper, we present models for mining text relations between named entities, which can deal with data highly affected by linguistic noise. Our models are made robust by: (a) the exploitation of state-of-the-art statistical algorithms such as support vector machines (SVMs) along with effective and versatile pattern mining methods, e.g. word sequence kernels; (b) the design of specific features capable of capturing long distance relationships; and (c) the use of domain prior knowledge in the form of ontological constraints, e.g. bounds on the type of relation arguments given by the semantic categories of the involved entities. This property allows for keeping small the training data required by SVMs and consequently lowering the system design costs. We empirically tested our hybrid model in the very complex domain of business intelligence, where the textual data are constituted by reports on investigations into criminal enterprises based on police interrogatory reports, electronic eavesdropping and wiretaps. The target relations are typically established between entities, as they are mentioned in these information sources. The experiments on mining such relations show that our approach with small training data is robust to non-conventional languages as dialects, jargon expressions or coded words typically contained in such text. 相似文献

17.

Word spotting in historical printed documents using shape and sequence comparisons

Khurram Khurshid Claudie Faure Nicole Vincent 《Pattern recognition》2012,45(7):2598-2609

Information spotting in scanned historical document images is a very challenging task. The joint use of the mechanical press and of human controlled inking introduced great variability in ink level within a book or even within a page. Consequently characters are often broken or merged together and thus become difficult to segment and recognize. The limitations of commercial OCR engines for information retrieval in historical document images have inspired alternative means of identification of given words in such documents. We present a word spotting method for scanned documents in order to find the word images that are similar to a query word, without assuming a correct segmentation of the words into characters. The connected components are first processed to transform a word pattern into a sequence of sub-patterns. Each sub-pattern is represented by a sequence of feature vectors. A modified Edit distance is proposed to perform a segmentation-driven string matching and to compute the Segmentation Driven Edit (SDE) distance between the words to be compared. The set of SDE operations is defined to obtain the word segmentations that are the most appropriate to evaluate their similarity. These operations are efficient to cope with broken and touching characters in words. The distortion of character shapes is handled by coupling the string matching process with local shape comparisons that are achieved by Dynamic Time Warping (DTW). The costs of the SDE operations are provided by the DTW distances. A sub-optimal version of the SDE string matching is also proposed to reduce the computation time, nevertheless it did not lead to a great decrease in performance. It is possible to enter a query by example or a textual query entered with the keyboard. Textual queries can be used to directly spot the word without the need to synthesize its image, as far as character prototype images are available. Results are presented for different documents and compared with other methods, showing the efficiency of our method. 相似文献

18.

Multioriented and curved text lines extraction from Indian documents

Pal U. Roy P.P. 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2004,34(4):1676-1684

There are printed artistic documents where text lines of a single page may not be parallel to each other. These text lines may have different orientations or the text lines may be curved shapes. For the optical character recognition (OCR) of these documents, we need to extract such lines properly. In this paper, we propose a novel scheme, mainly based on the concept of water reservoir analogy, to extract individual text lines from printed Indian documents containing multioriented and/or curve text lines. A reservoir is a metaphor to illustrate the cavity region of a character where water can be stored. In the proposed scheme, at first, connected components are labeled and identified either as isolated or touching. Next, each touching component is classified either straight type (S-type) or curve type (C-type), depending on the reservoir base-area and envelope points of the component. Based on the type (S-type or C-type) of a component two candidate points are computed from each touching component. Finally, candidate regions (neighborhoods of the candidate points) of the candidate points of each component are detected and after analyzing these candidate regions, components are grouped to get individual text lines. 相似文献

19.

A methodology to retrieve text documents from multiple databases 总被引：1，自引：0，他引：1

Yu C. King-Lup Liu Weiyi Meng Zonghuan Wu Rishe N. 《Knowledge and Data Engineering, IEEE Transactions on》2002,14(6):1347-1361

This paper presents a methodology for finding the n most similar documents across multiple text databases for any given query and for any positive integer n. This methodology consists of two steps. First, the contents of databases are indicated approximately by database representatives. Databases are ranked using their representatives with respect to the given query. We provide a necessary and sufficient condition to rank the databases optimally. In order to satisfy this condition, we provide three estimation methods. One estimation method is intended for short queries; the other two are for all queries. Second, we provide an algorithm, OptDocRetrv, to retrieve documents from the databases according to their rank and in a particular way. We show that if the databases containing the n most similar documents for a given query are ranked ahead of other databases, our methodology will guarantee the retrieval of the n most similar documents for the query. When the number of databases is large, we propose to organize database representatives into a hierarchy and employ a best-search algorithm to search the hierarchy. It is shown that the effectiveness of the best-search algorithm is the same as that of evaluating the user query against all database representatives. 相似文献

20.

Segmentation of text lines using multi-scale CNN from warped printed and handwritten document images

Dutta Arpita Garai Arpan Biswas Samit Das Amit Kumar 《International Journal on Document Analysis and Recognition》2021,24(4):299-313

Paper documents are ideal sources of useful information and have a profound impact on every aspect of human lives. These documents may be printed or handwritten and contain information as combinations of texts, figures, tables, charts, etc. This paper proposes a method to segment text lines from both flatbed scanned/camera-captured heavily warped printed and handwritten documents. This work uses the concept of semantic segmentation with the help of a multi-scale convolutional neural network. The results of line segmentation using the proposed method outperform a number of similar proposals already reported in the literature. The performance and efficacy of the proposed method have been corroborated by the test result on a variety of publicly available datasets, including ICDAR, Alireza, IUPR, cBAD, Tobacco-800, IAM, and our dataset.

相似文献