Similar Documents
20 similar documents retrieved (search time: 15 ms)
1.
In this paper, we propose a mechanism for systematically comparing the efficacy of unsupervised evaluation methods for parameter selection of binarization algorithms in optical character recognition (OCR). We also analyze these measures statistically and ascertain whether a measure is suitable for assessing a binarization method. The comparison process is streamlined in several steps. Given an unsupervised measure and a binarization algorithm, we: (i) find the best parameter combination for the algorithm in terms of the measure, (ii) feed the resulting best binarization of an image to an OCR engine, and (iii) evaluate the accuracy of the detected characters. We also propose a new unsupervised measure and a statistical test for comparing measures based on an intuitive triad of possible outcomes: better, worse, or comparable performance. The comparison method and statistical tests can easily be generalized to new measures, binarization algorithms, and even other accuracy-driven tasks in image processing. Finally, we perform an extensive comparison of several well-known measures, binarization algorithms, and OCR engines, and use it to show the strengths of the WV measure.
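A minimal sketch of the comparison loop described in steps (i)-(iii), under stated assumptions: a toy global-threshold binarizer and an Otsu-style placeholder measure stand in for the real algorithms and unsupervised measures, and pytesseract is used as the OCR engine. All names, the parameter grid, and the character-level accuracy metric are illustrative, not from the paper.

import numpy as np
import pytesseract
from PIL import Image
from difflib import SequenceMatcher

def binarize(img, threshold):
    """Toy global-threshold binarizer standing in for a real binarization algorithm."""
    return (img > threshold).astype(np.uint8) * 255

def score_binarization(img, binary):
    """Placeholder unsupervised measure: between-class variance (Otsu-style)."""
    fg, bg = img[binary == 0], img[binary > 0]
    if fg.size == 0 or bg.size == 0:
        return -np.inf
    w = fg.size / img.size
    return w * (1 - w) * (fg.mean() - bg.mean()) ** 2

def ocr_accuracy(binary, ground_truth):
    """Character-level similarity between the OCR output and the reference text."""
    text = pytesseract.image_to_string(Image.fromarray(binary))
    return SequenceMatcher(None, text, ground_truth).ratio()

def evaluate_measure(img, ground_truth, thresholds=range(50, 220, 10)):
    # (i) best parameter combination in terms of the unsupervised measure
    best_t = max(thresholds, key=lambda t: score_binarization(img, binarize(img, t)))
    # (ii)-(iii) run OCR on that binarization and report its accuracy
    return best_t, ocr_accuracy(binarize(img, best_t), ground_truth)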

2.
Pattern Analysis and Applications - The aim of this article is twofold. First, we propose an effective methodology for binarization of scene images. For our present study, we use the publicly...

3.
A text extraction method for color images is proposed. The method quantizes the brightness of the R, G, and B channels of a color image separately, which avoids the cluster-number selection problem of traditional color clustering methods and reduces image complexity. Since character strokes exhibit strong directional features and usually have stable colors, an oriented-gradient algorithm is used for coarse text localization, and a multi-class SVM classifier is then applied to precisely discriminate text regions. The new method restricts the types of candidate regions, which reduces the training difficulty of the SVM classifier and yields high accuracy and robustness.
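A minimal sketch of the first two stages, assuming a fixed number of brightness levels, a simple local-gradient-energy proxy for the oriented-gradient step, and an assumed window size and threshold; the SVM verification stage is omitted.

import numpy as np
from scipy import ndimage

def quantize_channels(img, levels=4):
    """Quantize the brightness of each R/G/B channel into `levels` bands (img: H x W x 3 uint8)."""
    step = 256 // levels
    return (img // step) * step

def coarse_text_mask(img, levels=4, win=15):
    """Return a boolean mask of candidate text regions."""
    quant = quantize_channels(img, levels)
    gray = quant.mean(axis=2)
    gx = ndimage.sobel(gray, axis=1)
    gy = ndimage.sobel(gray, axis=0)
    # character strokes produce dense, strongly oriented gradients; use local
    # gradient energy as a cheap proxy and threshold it
    energy = ndimage.uniform_filter(np.hypot(gx, gy), size=win)
    return energy > energy.mean() + energy.std()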

4.
5.
Due to the exponential growth of documents on the Internet and the emergent need to organize them, the automated categorization of documents into predefined labels has received ever-increasing attention in recent years. A wide range of supervised learning algorithms has been introduced to deal with text classification. Among all these classifiers, K-Nearest Neighbors (KNN) is widely used in the text categorization community because of its simplicity and efficiency. However, KNN still suffers from inductive biases or model misfits that result from its assumptions, such as the presumption that training data are evenly distributed among all categories. In this paper, we propose a new refinement strategy, which we call DragPushing, for the KNN classifier. Experiments on three benchmark evaluation collections show that DragPushing significantly improves the performance of the KNN classifier.
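The abstract does not spell out DragPushing's update rule. The sketch below illustrates one common reading of such a drag/push refinement, in which class prototypes are dragged toward training documents of their own class that they misclassify and pushed away from the wrongly predicted class; the step size, epoch count, and the use of centroids are assumptions, not the paper's exact algorithm.

import numpy as np

def refine_centroids(X, y, centroids, eta=0.1, epochs=5):
    """X: (n_docs, n_feats) tf-idf array, y: labels, centroids: dict label -> vector."""
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = max(centroids, key=lambda c: xi @ centroids[c])
            if pred != yi:
                centroids[yi] += eta * xi      # drag the correct class toward the document
                centroids[pred] -= eta * xi    # push the wrongly predicted class away from it
    return centroids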

6.
A visible watermarking algorithm for text images resistant to binarization attacks
Some methods that embed visible watermarks in text images fail completely under binarization attacks, so a visible watermarking algorithm for text images based on a uniform gray-level distribution is proposed. The algorithm controls the embedding strength by probabilistically selecting black pixels of the binary watermark image, and then maps the binary text image and the filtered watermark image to the same gray-level range to obtain a text image carrying a visible watermark. Simulation experiments show that the watermarked text images produced by the algorithm have a uniform gray-level distribution, can resist binarization attacks, and exhibit good robustness.
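A minimal sketch of the embedding step under stated assumptions: the keep-probability p models the embedding strength, and the target gray range and random gray assignment are illustrative choices for mapping text and watermark to the same gray-level distribution.

import numpy as np

def embed_visible_watermark(text_bin, mark_bin, p=0.3, lo=96, hi=160, seed=0):
    """text_bin, mark_bin: boolean arrays of the same shape, True = black pixel."""
    rng = np.random.default_rng(seed)
    # keep each black watermark pixel with probability p (embedding strength)
    kept = mark_bin & (rng.random(mark_bin.shape) < p)
    # map both layers into the same narrow gray range so thresholding cannot separate them
    out = np.full(text_bin.shape, 255, dtype=np.uint8)      # white background
    out[text_bin] = rng.integers(lo, hi, text_bin.sum())    # text strokes
    out[kept] = rng.integers(lo, hi, kept.sum())            # visible watermark marks
    return out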

7.
Application of graph spectral theory to text image binarization
常丹华, 苗丹, 何耘娴. 《计算机应用》 2010, 30(10): 2802-2804
Common threshold-based binarization methods cannot segment text images very effectively, whereas ideas from graph spectral theory can binarize text images clearly and effectively. To address the high computational and space complexity of traditional graph-spectral image segmentation algorithms, this paper proposes working on histogram gray levels instead of individual pixels, and on this basis approximates the parameters of the weight function, so that both the computational load and the complexity of the algorithm are reduced. Experimental results show that the method greatly reduces computational complexity, is faster than traditional graph-spectral segmentation methods, and produces better quality than common binarization methods.
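A simplified sketch of the gray-level (rather than pixel-level) spectral idea: build a graph over the 256 histogram gray levels, weight edges by gray-level similarity and histogram mass, take the Fiedler vector of the normalized Laplacian, and split the gray levels into two classes. The affinity kernel, sigma, and the zero-crossing split are assumptions, not the paper's exact weight function.

import numpy as np

def spectral_binarize(img, sigma=20.0):
    """img: 2-D uint8 grayscale image. Returns a boolean binarized image."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    levels = np.arange(256)
    # affinity between gray levels, scaled by how much histogram mass each level carries
    W = np.exp(-((levels[:, None] - levels[None, :]) ** 2) / (2 * sigma ** 2))
    W *= np.sqrt(np.outer(hist + 1e-9, hist + 1e-9))
    d = W.sum(axis=1)
    L = np.diag(d) - W
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    vals, vecs = np.linalg.eigh(D_inv_sqrt @ L @ D_inv_sqrt)
    fiedler = vecs[:, 1]          # eigenvector of the second-smallest eigenvalue
    label = fiedler > 0           # two-way split of the 256 gray levels
    return label[img]             # look up each pixel's class by its gray level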

8.
This paper presents an effective approach for unsupervised language model adaptation (LMA) using multiple models in offline recognition of unconstrained handwritten Chinese texts. The domain of the document to recognize is variable and usually unknown a priori, so we use a two-pass recognition strategy with a pre-defined multi-domain language model set. We propose three methods to dynamically generate an adaptive language model matching the text output by first-pass recognition: model selection, model combination, and model reconstruction. In model selection, we use the language model with minimum perplexity on the first-pass recognized text. For model combination, we learn the combination weights by minimizing the sum of squared errors with both L2-norm and L1-norm regularization. For model reconstruction, we use a group of orthogonal bases to reconstruct a language model, with the coefficients learned to match the document to recognize. Moreover, we reduce the storage size of the multiple language models using two compression methods: split vector quantization (SVQ) and principal component analysis (PCA). Comprehensive experiments on two public Chinese handwriting databases, CASIA-HWDB and HIT-MW, show that the proposed unsupervised LMA approach improves recognition performance impressively, particularly for ancient-domain documents, whose recognition accuracy improves by 7 percent. Meanwhile, the combination of the two compression methods largely reduces the storage size of the language models with little loss of recognition accuracy.
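A minimal sketch of the model-selection variant only: score the first-pass recognition result under each pre-defined domain language model and keep the one with minimum perplexity. The unigram log-probability dictionaries and the unknown-token penalty below are illustrative assumptions; the paper's language models are not specified here.

import math

def perplexity(tokens, logprob, unk=-12.0):
    """logprob: dict token -> natural-log probability; unk covers unseen tokens."""
    total = sum(logprob.get(t, unk) for t in tokens)
    return math.exp(-total / max(len(tokens), 1))

def select_language_model(first_pass_tokens, domain_lms):
    """domain_lms: dict domain-name -> logprob dict. Returns the best-matching domain."""
    return min(domain_lms, key=lambda d: perplexity(first_pass_tokens, domain_lms[d]))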

9.
10.
Document images often suffer from different types of degradation that make document image binarization a challenging task. This paper presents a document image binarization technique that accurately segments the text from badly degraded document images. The proposed technique is based on the observations that text documents usually have a background of uniform color and texture, and that the document text within them has a different intensity level from the surrounding background. Given a document image, the proposed technique first estimates a document background surface through an iterative polynomial smoothing procedure. Different types of document degradation are then compensated by using the estimated background surface. The text stroke edges are further detected from the compensated document image using the L1-norm image gradient. Finally, the document text is segmented by a local threshold that is estimated from the detected text stroke edges. The proposed technique was submitted to the recent Document Image Binarization Contest (DIBCO) held under the framework of ICDAR 2009 and achieved the top performance among 43 algorithms submitted by 35 international research groups.
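A simplified sketch of the pipeline under stated assumptions: a single least-squares polynomial surface fit stands in for the paper's iterative polynomial smoothing, and a global Otsu threshold on the compensated image replaces the edge-based local threshold; the polynomial degree is illustrative.

import numpy as np
from skimage.filters import threshold_otsu

def background_surface(gray, degree=3):
    """Fit a smooth polynomial background surface to a 2-D grayscale image."""
    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    x, y = xx.ravel() / w, yy.ravel() / h
    # design matrix of polynomial terms x^i * y^j with i + j <= degree
    terms = [x**i * y**j for i in range(degree + 1) for j in range(degree + 1 - i)]
    A = np.stack(terms, axis=1)
    coeffs, *_ = np.linalg.lstsq(A, gray.ravel().astype(float), rcond=None)
    return (A @ coeffs).reshape(h, w)

def binarize_degraded(gray):
    bg = background_surface(gray)
    # compensate uneven illumination/stains, then threshold the compensated image
    compensated = np.clip(gray.astype(float) - bg + bg.mean(), 0, 255)
    return compensated < threshold_otsu(compensated)   # True = text pixels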

11.
康晓东, 王昊, 郭军, 于文勇. 《计算机应用》 2015, 35(9): 2636-2639
Given the importance of color image classification and recognition, a color image recognition method combining image feature data with a deep belief network (DBN) is proposed. First, an image color data field consistent with human visual characteristics is constructed; second, multi-scale image features are described with a wavelet transform; finally, recognition is achieved by unsupervised training of a deep belief network. Experimental results show that, compared with AdaBoost and support vector machine (SVM) methods, the proposed method improves classification accuracy by about 3.7% and 2.8% respectively, effectively improving image recognition.

12.
Unsupervised text simplification has attracted much attention due to the scarcity of high-quality parallel text simplification corpora. Recently, an unsupervised statistical text simplification system based on phrase-based machine translation (UnsupPBMT) achieved good performance; it initializes its phrase tables with similar words obtained by word embedding modeling. Since word embedding modeling only considers the relevance between words, the phrase tables in UnsupPBMT contain many dissimilar words. In this paper, we propose an unsupervised statistical text simplification method that uses the pre-trained language model BERT for initialization. Specifically, we use BERT as a general linguistic knowledge base for predicting similar words. Experimental results show that our method outperforms state-of-the-art unsupervised text simplification methods on three benchmarks, and even outperforms some supervised baselines.
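A minimal sketch of using a masked language model as a "similar word" source for phrase-table initialization: mask the target word in its sentence and take BERT's top predictions with their scores. The model name, top_k, and the simple filtering below are illustrative assumptions based on the Hugging Face transformers fill-mask pipeline, not the paper's exact setup.

from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def similar_words(sentence, target, top_k=10):
    """Return (word, score) candidates BERT proposes in place of `target`."""
    masked = sentence.replace(target, fill_mask.tokenizer.mask_token, 1)
    candidates = fill_mask(masked, top_k=top_k)
    return [(c["token_str"].strip(), c["score"]) for c in candidates
            if c["token_str"].strip().lower() != target.lower()]

# e.g. similar_words("The committee will scrutinize the proposal.", "scrutinize")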

13.
This paper proposes a new multiresolution technique for color image representation and segmentation, particularly suited for noisy images. A decimated wavelet transform is initially applied to each color channel of the image, and a multiresolution representation is built up to a selected scale 2^J. Color gradient magnitudes are computed at the coarsest scale 2^J, and an adaptive threshold is used to remove spurious responses. An initial segmentation is then computed by applying the watershed transform to the thresholded magnitudes, and this initial segmentation is projected to finer resolutions using inverse wavelet transforms and contour refinements, until the full resolution 2^0 is achieved. Finally, a region merging technique is applied to combine adjacent regions with similar colors. Experimental results show that the proposed technique produces results comparable to other state-of-the-art algorithms for natural images, and performs better for noisy images.
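A simplified coarse-scale sketch under stated assumptions: the wavelet approximation of each channel at scale 2^level is computed with pywt, a per-channel Sobel magnitude serves as the color gradient, and a mean threshold plus marker-based watershed gives the initial segmentation. The projection back to full resolution and the region-merging stage are omitted; the wavelet, level, and threshold are illustrative.

import numpy as np
import pywt
from scipy import ndimage
from skimage.filters import sobel
from skimage.segmentation import watershed

def coarse_segmentation(img, level=2, wavelet="db2"):
    """img: H x W x 3 float array in [0, 1]. Returns coarse-scale watershed labels."""
    # coarse approximation of each color channel at scale 2^level
    coarse = np.stack([pywt.wavedec2(img[..., c], wavelet, level=level)[0]
                       for c in range(3)], axis=-1)
    grad = sum(sobel(coarse[..., c]) for c in range(3))   # color gradient magnitude
    grad[grad < grad.mean()] = 0                          # remove spurious responses
    markers, _ = ndimage.label(grad == 0)                 # seeds inside flat regions
    return watershed(grad, markers)                       # initial coarse segmentation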

14.
Some authors have recently devised adaptations of spectral grouping algorithms to integrate prior knowledge, posed as constrained eigenvalue problems. In this paper, we improve and adapt a recent statistical region merging approach to this task, posed as a non-parametric mixture model estimation problem. The approach is attractive both for its theoretical benefits and for its experimental results, as a slight bias brings dramatic improvements over unbiased approaches on challenging digital pictures.

15.
Legibility and subjective preference for color combinations in text
S. Pastoor. Human Factors 1990, 32(2): 157-171
This study examined legibility performance and subjective preference for text/background color combinations displayed on a video monitor. Luminance contrast was fixed at two pre-optimized levels, either with the text brighter than the background (10:1) or vice versa (1:6.5). In Experiment 1, 32 subjects rated about 800 color combinations. There was no evidence of differential effects of luminance polarity or hue, the only exception being that cool background colors (blue and bluish cyan) tended to be preferred for the light-on-dark polarity. Saturation had the most important influence on the ratings. Any desaturated color combination appears to be satisfactory for text presentation. In Experiment 2, a reduced set of 18 color combinations was investigated with a new sample of 18 subjects. Reading and search times as well as multidimensional ratings were evaluated. There was no evidence of an influence of luminance polarity or chromaticity on performance. The subjective ratings corresponded well with the results of Experiment 1.

16.
The bag-of-words approach to text document representation typically results in vectors on the order of 5000-20,000 components as the representation of documents. To make effective use of various statistical classifiers, it may be necessary to reduce the dimensionality of this representation. We point out deficiencies in the class discrimination of two such popular methods: Latent Semantic Indexing (LSI) and sequential feature selection according to some relevance criterion. As a remedy, we suggest feature transforms based on Linear Discriminant Analysis (LDA). Since LDA requires operating with large, dense matrices, we propose an efficient intermediate dimension reduction step using either a random transform or LSI. We report good classification results with the combined feature transform on a subset of the Reuters-21578 database. Drastic reduction of the feature vector dimensionality from 5000 to 12 actually improves the classification performance.
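A minimal sketch of the two-step transform using scikit-learn: reduce the raw bag-of-words vectors with a random projection (LSI via truncated SVD would work the same way), then apply LDA to obtain at most n_classes - 1 discriminative dimensions. The intermediate dimension and the final KNN classifier are illustrative choices, not the paper's exact settings.

from sklearn.pipeline import make_pipeline
from sklearn.random_projection import GaussianRandomProjection
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

def build_classifier(intermediate_dim=500):
    return make_pipeline(
        GaussianRandomProjection(n_components=intermediate_dim, random_state=0),
        LinearDiscriminantAnalysis(),       # outputs at most n_classes - 1 dimensions
        KNeighborsClassifier(n_neighbors=5),
    )

# clf = build_classifier(); clf.fit(X_train_bow, y_train); clf.predict(X_test_bow)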

17.
In this article we propose a data treatment strategy that generates new discriminative features, called compound features (or c-features), for text classification. These c-features are composed of terms that co-occur in documents, without any restriction on order or distance between the terms within a document. This strategy precedes the classification task, in order to enrich documents with discriminative c-features. The idea is that, when c-features are used in conjunction with single features, the ambiguity and noise inherent in the bag-of-words representation are reduced. We use c-features composed of two terms in order to keep their usage computationally feasible while improving classifier effectiveness. We test this approach with several classification algorithms and single-label multi-class text collections. Experimental results demonstrated gains in almost all evaluated scenarios, from the simplest algorithms such as kNN (a 13% gain in micro-averaged F1 on the 20 Newsgroups collection) to the most complex one, the state-of-the-art SVM (a 10% gain in macro-averaged F1 on the OHSUMED collection).
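A minimal sketch of generating two-term c-features: every unordered pair of distinct terms that co-occur in a document becomes an extra feature alongside the single terms. The vocabulary pruning and feature selection the paper relies on to keep this tractable are omitted, and the "+"-joined naming is an illustrative convention.

from itertools import combinations

def add_compound_features(doc_terms):
    """doc_terms: list of (already tokenized/filtered) terms of one document."""
    singles = set(doc_terms)
    pairs = {"+".join(sorted(p)) for p in combinations(singles, 2)}
    return singles | pairs

# add_compound_features(["text", "mining", "classification"])
# -> {"text", "mining", "classification", "classification+mining", ...}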

18.
In this paper, we address the document image binarization problem with a three-stage procedure. First, possible stains and general document background information are removed from the image through a background removal stage. The remaining misclassified background and character pixels are then separated using a local co-occurrence mapping, local contrast, and a two-state Gaussian mixture model. Finally, some isolated misclassified components are removed by a morphology operator. The proposed scheme offers robust and fast performance on both handwritten and printed documents, and compares favorably with other binarization methods.
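A minimal sketch of the middle stage under stated assumptions: a local-contrast feature alone (rather than the paper's co-occurrence-based features) is fed to a two-state Gaussian mixture model, and the higher-contrast component is kept as text; the window size is illustrative.

import numpy as np
from scipy import ndimage
from sklearn.mixture import GaussianMixture

def gmm_text_mask(gray, win=15):
    """gray: 2-D grayscale image (background already roughly removed). Returns a text mask."""
    local_min = ndimage.minimum_filter(gray, size=win).astype(float)
    local_max = ndimage.maximum_filter(gray, size=win).astype(float)
    contrast = (local_max - gray) / (local_max - local_min + 1e-6)
    feats = contrast.reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(feats)
    labels = gmm.predict(feats).reshape(gray.shape)
    text_state = np.argmax(gmm.means_.ravel())    # higher-contrast component = text strokes
    return labels == text_state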

19.
Wang Xinsheng, Pang Shanmin, Zhu Jihua, Wang Jiaxing, Wang Lin. Multimedia Tools and Applications 2020, 79(21-22): 14465-14489
Multimedia Tools and Applications - Deep features extracted from the convolutional layers of pre-trained CNNs have been widely used in the image retrieval task. These features, however, are in a...

20.
A heterogeneous information network (HIN) is a general representation for many different applications, such as social networks, scholar networks, and knowledge networks. A key development for HINs is PathSim, a meta-path-based measure of the pairwise similarity of two entities of the same type. When using PathSim in practice, we usually need to handcraft meta-paths, which are paths over entity types rather than over the entities themselves. However, finding useful meta-paths is not trivial for humans. In this paper, we present an unsupervised meta-path selection approach that automatically finds useful meta-paths over a HIN, and then develop a new similarity measure called KnowSim, which is an ensemble of the selected meta-paths. To address the high computational cost of enumerating all possible meta-paths, we propose using an approximate personalized PageRank algorithm to find useful subgraphs in which to allocate the meta-paths. We apply KnowSim to text clustering and classification problems to demonstrate that unsupervised meta-path selection can help improve clustering and classification results. We use Freebase, a well-known world knowledge base, to conduct semantic parsing and construct HINs for documents. Our experiments on the 20Newsgroups and RCV1 datasets show that KnowSim yields impressive, high-quality document clustering and classification performance. We also demonstrate that the approximate personalized PageRank algorithm can efficiently and effectively compute the meta-path-based similarity.
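A minimal sketch of PathSim for a symmetric meta-path P: with M = A_half @ A_half.T, where A_half counts path instances along the first half of the meta-path, PathSim(i, j) = 2 * M[i, j] / (M[i, i] + M[j, j]). The meta-path selection and the KnowSim ensemble weighting are not shown.

import numpy as np

def pathsim(A_half):
    """A_half: (n_entities, k) matrix counting path instances along half of a symmetric meta-path."""
    M = A_half @ A_half.T                      # full meta-path instance counts between entities
    diag = np.diag(M)
    denom = diag[:, None] + diag[None, :]
    with np.errstate(divide="ignore", invalid="ignore"):
        S = np.where(denom > 0, 2 * M / denom, 0.0)
    return S

# e.g. for the meta-path Paper-Author-Paper, A_half is the paper x author incidence matrix.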
