1.
Marte A. Ramírez-Ortegón, Edgar A. Duéñez-Guzmán 《Pattern recognition》2011,44(3):491-502
In this paper, we propose a mechanism for systematically comparing the efficacy of unsupervised evaluation methods for parameter selection of binarization algorithms in optical character recognition (OCR). We also analyze these measures statistically and ascertain whether a measure is suitable to assess a binarization method. The comparison process is streamlined into several steps. Given an unsupervised measure and a binarization algorithm, we: (i) find the best parameter combination for the algorithm in terms of the measure, (ii) feed the best binarization of an image to an OCR engine, and (iii) evaluate the accuracy of the detected characters. We also propose a new unsupervised measure and a statistical test that compares measures based on an intuitive triad of possible results: better, worse, or comparable performance. The comparison method and statistical tests are easily generalized to new measures, binarization algorithms, and even other accuracy-driven tasks in image processing. Finally, we perform an extensive comparison of several well-known measures, binarization algorithms, and OCRs, and use it to show the strengths of the WV measure.
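The three-step loop can be sketched as follows. The global-threshold binarizer, the balance-based measure, and the parameter grid are all illustrative stand-ins, not the paper's actual components; steps (ii) and (iii) are indicated only as comments.

```python
# Hedged sketch of the parameter-selection loop; all components are placeholders.

def binarize(image, threshold):
    # Placeholder binarization algorithm: global thresholding.
    return [[1 if px >= threshold else 0 for px in row] for row in image]

def unsupervised_measure(binary):
    # Placeholder unsupervised measure: prefer a balanced foreground/background ratio.
    flat = [px for row in binary for px in row]
    return -abs(sum(flat) / len(flat) - 0.5)

def select_parameter(image, thresholds):
    # Step (i): the parameter that maximizes the unsupervised measure.
    return max(thresholds, key=lambda t: unsupervised_measure(binarize(image, t)))

image = [[10, 200, 30], [220, 40, 250], [15, 180, 20]]
best_t = select_parameter(image, [50, 128, 210])
# Steps (ii) and (iii) would feed binarize(image, best_t) to an OCR engine
# and score the recognized characters against ground truth.
```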
2.
Ranjit Ghoshal, Anandarup Roy, Ayan Banerjee, Bibhas Chandra Dhara, Swapan K. Parui 《Pattern Analysis & Applications》2019,22(4):1361-1375
The aim of this article is twofold. First, we propose an effective methodology for binarization of scene images. For our present study, we use the publicly...
3.
A text extraction method for color images is proposed. The method performs brightness levelling separately on the R, G, and B channels of a color image, which sidesteps the cluster-count selection problem of traditional color-clustering methods and reduces image complexity. Since character strokes exhibit pronounced directionality and usually have stable color, an oriented-gradient algorithm is used for coarse text localization, and a multi-class SVM classifier then makes the final, precise decision on text regions. The new method restricts the types of candidate regions, which lowers the training difficulty of the SVM classifier and yields high accuracy and robustness.
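The per-channel brightness levelling can be illustrated with a toy sketch; the level count of 4 and the uniform step size are assumptions, not values from the paper.

```python
# Toy sketch of brightness levelling on one color channel: each 8-bit value is
# quantized into a fixed number of brightness levels, avoiding the cluster-count
# choice faced by color-clustering methods.

def quantize_channel(values, levels=4):
    step = 256 // levels
    return [min(v // step, levels - 1) for v in values]

leveled = quantize_channel([0, 64, 128, 255])
```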
4.
5.
《Expert systems with applications》2006,30(2):290-298
Due to the exponential growth of documents on the Internet and the emergent need to organize them, the automated categorization of documents into predefined labels has received ever-increasing attention in recent years. A wide range of supervised learning algorithms has been introduced to deal with text classification. Among all these classifiers, K-Nearest Neighbors (KNN) is widely used in the text categorization community because of its simplicity and efficiency. However, KNN still suffers from inductive biases or model misfits that result from its assumptions, such as the presumption that training data are evenly distributed among all categories. In this paper, we propose a new refinement strategy, which we call DragPushing, for the KNN classifier. Experiments on three benchmark evaluation collections show that DragPushing achieves a significant improvement in the performance of the KNN classifier.
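For context, a bare-bones KNN text classifier over term-frequency vectors might look like the following; this is only the baseline that DragPushing refines, and the refinement step itself is not reproduced here.

```python
from collections import Counter

# Minimal KNN over term-frequency dicts with cosine similarity.

def cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    def norm(v):
        return sum(x * x for x in v.values()) ** 0.5
    den = norm(a) * norm(b)
    return num / den if den else 0.0

def knn_predict(train, query, k=3):
    # train: list of (term-frequency dict, label) pairs.
    top = sorted(train, key=lambda ex: cosine(ex[0], query), reverse=True)[:k]
    return Counter(label for _, label in top).most_common(1)[0][0]

train = [({"ball": 2, "goal": 1}, "sport"),
         ({"ball": 1}, "sport"),
         ({"vote": 2}, "politics")]
label = knn_predict(train, {"ball": 1, "goal": 1})
```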
6.
A visible watermarking algorithm for text images resistant to binarization attacks
Some methods for embedding a visible watermark in a text image fail completely under a binarization attack. This paper therefore proposes a visible watermarking algorithm for text images based on a uniform gray-level distribution. The algorithm controls the embedding strength of the watermark by probabilistically screening the black pixels of the binary watermark image, and then maps the binary text image and the screened watermark image to the same gray-level range, yielding a text image that carries a visible watermark. Simulation experiments show that the watermarked text images produced by this algorithm have a uniformly distributed gray level, resist binarization attacks, and exhibit good robustness.
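The probabilistic screening step might be sketched as below; the `strength` parameter name and the keep-with-probability rule are assumptions about the mechanism, and the subsequent mapping to a common gray-level range is omitted.

```python
import random

# Sketch of probabilistic screening: each black pixel (1) of the binary
# watermark is kept with probability `strength`, which controls embedding strength.

def screen_watermark(mark, strength, rng=random.Random(0)):
    # mark: 2D list, 1 = black watermark pixel, 0 = background.
    return [[px if px and rng.random() < strength else 0 for px in row]
            for row in mark]

mark = [[1, 1], [0, 1]]
screened = screen_watermark(mark, 0.5)
```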
7.
8.
This paper presents an effective approach for unsupervised language model adaptation (LMA) using multiple models in offline recognition of unconstrained handwritten Chinese texts. The domain of the document to recognize is variable and usually unknown a priori, so we use a two-pass recognition strategy with a pre-defined multi-domain language model set. We propose three methods to dynamically generate an adaptive language model that matches the text output by first-pass recognition: model selection, model combination, and model reconstruction. In model selection, we use the language model with minimum perplexity on the first-pass recognized text. In model combination, we learn the combination weights by minimizing the sum of squared errors with both L2-norm and L1-norm regularization. In model reconstruction, we use a group of orthogonal bases to reconstruct a language model, with the coefficients learned to match the document to recognize. Moreover, we reduce the storage size of the multiple language models using two compression methods: split vector quantization (SVQ) and principal component analysis (PCA). Comprehensive experiments on two public Chinese handwriting databases, CASIA-HWDB and HIT-MW, show that the proposed unsupervised LMA approach improves recognition performance impressively, particularly for ancient-domain documents, whose recognition accuracy improves by 7 percent. Meanwhile, the combination of the two compression methods greatly reduces the storage size of the language models with little loss of recognition accuracy.
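The model-selection variant (minimum perplexity on the first-pass text) can be sketched as follows; the tiny unigram models and the floor probability are illustrative assumptions.

```python
import math

# Pick the domain language model with minimum perplexity on first-pass output.

def perplexity(model, tokens, floor=1e-6):
    # Perplexity = exp of the average negative log-probability per token.
    logp = sum(math.log(model.get(t, floor)) for t in tokens)
    return math.exp(-logp / len(tokens))

models = {
    "ancient": {"之": 0.2, "者": 0.2, "也": 0.2},
    "modern":  {"的": 0.3, "了": 0.2},
}
first_pass = ["之", "也", "者"]
selected = min(models, key=lambda name: perplexity(models[name], first_pass))
```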
9.
10.
Shijian Lu, Bolan Su, Chew Lim Tan 《International Journal on Document Analysis and Recognition》2010,13(4):303-314
Document images often suffer from different types of degradation that render document image binarization a challenging task. This paper presents a document image binarization technique that accurately segments the text from badly degraded document images. The proposed technique is based on the observations that text documents usually have a background of uniform color and texture, and that the document text has a different intensity level than the surrounding background. Given a document image, the proposed technique first estimates a document background surface through an iterative polynomial smoothing procedure. Different types of document degradation are then compensated for by using the estimated background surface. Text stroke edges are further detected from the compensated document image by using the L1-norm image gradient. Finally, the document text is segmented by a local threshold that is estimated from the detected text stroke edges. The proposed technique was submitted to the recent Document Image Binarization Contest (DIBCO) held under the framework of ICDAR 2009 and achieved the top performance among the 43 algorithms submitted by 35 international research groups.
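A rough sketch of this style of pipeline is shown below. A moving-average filter stands in for the iterative polynomial smoothing, and a fixed-offset comparison against the estimated background stands in for the edge-guided local threshold; both are simplifications, not the paper's method.

```python
import numpy as np

# Background-estimation binarization sketch: estimate a background surface,
# then mark pixels noticeably darker than it as text.

def estimate_background(img, k=3):
    # Crude background surface: local mean over a k x k window (edge-padded).
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def binarize(img, offset=20):
    bg = estimate_background(img)
    # Pixels darker than the background by more than `offset` become text (1).
    return (bg - img > offset).astype(np.uint8)

degraded = np.full((5, 5), 200.0)
degraded[2, 2] = 50  # a dark text stroke on a bright page
mask = binarize(degraded)
```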
11.
12.
Jipeng QIANG, Feng ZHANG, Yun LI, Yunhao YUAN, Yi ZHU, Xindong WU 《Frontiers of Computer Science》2023,17(1):171303
Unsupervised text simplification has attracted much attention due to the scarcity of high-quality parallel text simplification corpora. Recently, an unsupervised statistical text simplification system based on phrase-based machine translation (UnsupPBMT) achieved good performance; it initializes its phrase tables with similar words obtained by word-embedding modeling. Since word-embedding modeling only considers the relevance between words, the phrase tables in UnsupPBMT contain many dissimilar words. In this paper, we propose an unsupervised statistical text simplification that uses the pre-trained language model BERT for initialization. Specifically, we use BERT as a general linguistic knowledge base for predicting similar words. Experimental results show that our method outperforms the state-of-the-art unsupervised text simplification methods on three benchmarks, and even outperforms some supervised baselines.
13.
《Pattern recognition letters》2007,28(4):523-533
This paper proposes a new multiresolution technique for color image representation and segmentation, particularly suited for noisy images. A decimated wavelet transform is initially applied to each color channel of the image, and a multiresolution representation is built up to a selected scale 2^J. Color gradient magnitudes are computed at the coarsest scale 2^J, and an adaptive threshold is used to remove spurious responses. An initial segmentation is then computed by applying the watershed transform to thresholded magnitudes, and this initial segmentation is projected to finer resolutions using inverse wavelet transforms and contour refinements, until the full resolution 2^0 is achieved. Finally, a region merging technique is applied to combine adjacent regions with similar colors. Experimental results show that the proposed technique produces results comparable to other state-of-the-art algorithms for natural images, and performs better for noisy images.
14.
Some authors have recently devised adaptations of spectral grouping algorithms that integrate prior knowledge, posed as constrained eigenvalue problems. In this paper, we improve and adapt a recent statistical region merging approach to this task, as a non-parametric mixture model estimation problem. The approach is attractive both for its theoretical benefits and its experimental results, as a slight bias brings dramatic improvements over unbiased approaches on challenging digital pictures.
15.
S. Pastoor 《Human factors》1990,32(2):157-171
This study examined legibility performance and subjective preference for text/background color combinations displayed on a video monitor. Luminance contrast was fixed at two pre-optimized levels, either with the text brighter than the background (10:1) or vice versa (1:6.5). In Experiment 1, 32 subjects rated about 800 color combinations. No evidence suggested differential effects of luminance polarity or hue, the only exception being that cool background colors (blue and bluish cyan) tended to be preferred for the light-on-dark polarity. Saturation had the most important influence on the ratings: any desaturated color combination appears to be satisfactory for text presentation. In Experiment 2, a reduced set of 18 color combinations was investigated with a new sample of 18 subjects. Reading and search times as well as multidimensional ratings were evaluated. There was no evidence for an influence of luminance polarity or chromaticity on performance. The subjective ratings corresponded well with the results of Experiment 1.
16.
The bag-of-words approach to text document representation typically results in vectors of the order of 5000–20,000 components as the representation of documents. To make effective use of various statistical classifiers, it may be necessary to reduce the dimensionality of this representation. We point out deficiencies in the class discrimination of two popular such methods: Latent Semantic Indexing (LSI), and sequential feature selection according to some relevant criterion. As a remedy, we suggest feature transforms based on Linear Discriminant Analysis (LDA). Since LDA requires operating with both large and dense matrices, we propose an efficient intermediate dimension reduction step using either a random transform or LSI. We report good classification results with the combined feature transform on a subset of the Reuters-21578 database. Drastic reduction of the feature vector dimensionality from 5000 to 12 actually improves the classification performance.
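The intermediate reduction step can be sketched with a plain random projection; the dimensions below are arbitrary, the matrix is dense for brevity, and the LDA stage that would follow is omitted.

```python
import numpy as np

# Random projection from a high-dimensional bag-of-words space down to a size
# where LDA's dense scatter matrices become tractable.

rng = np.random.default_rng(0)
n_docs, n_terms, k = 100, 5000, 200

X = rng.random((n_docs, n_terms))                    # bag-of-words matrix
R = rng.standard_normal((n_terms, k)) / np.sqrt(k)   # random projection matrix
X_low = X @ R                                        # reduced representation
```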
17.
Fábio Figueiredo, Leonardo Rocha, Thierson Couto, Thiago Salles, Marcos André Gonçalves, Wagner Meira Jr. 《Information Systems》2011
In this article we propose a data treatment strategy that generates new discriminative features, called compound features (or c-features), for text classification. These c-features are composed of terms that co-occur in documents, without any restriction on the order of or distance between the terms within a document. This strategy precedes the classification task and enhances documents with discriminative c-features. The idea is that, when c-features are used in conjunction with single features, the ambiguity and noise inherent in the bag-of-words representation are reduced. We use c-features composed of two terms to keep their usage computationally feasible while improving classifier effectiveness. We test this approach with several classification algorithms and single-label multi-class text collections. Experimental results demonstrate gains in almost all evaluated scenarios, from the simplest algorithms such as kNN (a 13% gain in micro-averaged F1 on the 20 Newsgroups collection) to the most complex, the state-of-the-art SVM (a 10% gain in macro-averaged F1 on the OHSUMED collection).
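Two-term c-features can be illustrated in a few lines; the `a+b` naming convention for a pair feature is an assumption for the sketch.

```python
from itertools import combinations

# Every unordered pair of distinct terms co-occurring in a document becomes
# an extra feature alongside the single terms.

def c_features(doc_terms):
    terms = sorted(set(doc_terms))
    return terms + [f"{a}+{b}" for a, b in combinations(terms, 2)]

feats = c_features(["grain", "price", "grain", "export"])
```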
18.
In this paper, we address the document image binarization problem with a three-stage procedure. First, possible stains and general document background information are removed from the image through a background removal stage. The remaining misclassified background and character pixels are then separated using a Local Co-occurrence Mapping, local contrast, and a two-state Gaussian Mixture Model. Finally, isolated misclassified components are removed by a morphology operator. The proposed scheme offers robust and fast performance for both handwritten and printed documents, and compares favorably with other binarization methods.
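As a stand-in for the two-state separation, a simple two-class intensity split (iterated midpoint-of-means) is sketched below; the paper's actual Gaussian Mixture Model fit is more involved.

```python
# Two-class intensity split: alternate between assigning pixels to the darker
# or brighter class and moving the threshold to the midpoint of the class means.

def two_class_threshold(pixels, iters=20):
    lo, hi = min(pixels), max(pixels)
    t = (lo + hi) / 2
    for _ in range(iters):
        fg = [p for p in pixels if p <= t]
        bg = [p for p in pixels if p > t]
        if not fg or not bg:
            break
        t = (sum(fg) / len(fg) + sum(bg) / len(bg)) / 2
    return t

t = two_class_threshold([10, 12, 11, 200, 205, 198])
```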
19.
Xinsheng Wang, Shanmin Pang, Jihua Zhu, Jiaxing Wang, Lin Wang 《Multimedia Tools and Applications》2020,79(21-22):14465-14489
Deep features extracted from the convolutional layers of pre-trained CNNs have been widely used in the image retrieval task. These features, however, are in a...
20.
Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, Jiawei Han 《Data mining and knowledge discovery》2018,32(6):1735-1767
A heterogeneous information network (HIN) is a general representation for many different applications, such as social networks, scholar networks, and knowledge networks. A key development for HINs is PathSim, a meta-path-based measure of the pairwise similarity of two entities of the same type in the HIN. When using PathSim in practice, we usually need to handcraft meta-paths, which are paths over entity types instead of over the entities themselves. However, finding useful meta-paths is not trivial for humans. In this paper, we present an unsupervised meta-path selection approach that automatically finds useful meta-paths over an HIN, and then develop a new similarity measure called KnowSim, an ensemble of the selected meta-paths. To avoid the high computational cost of enumerating all possible meta-paths, we propose an approximate personalized PageRank algorithm that finds useful subgraphs in which to allocate the meta-paths. We apply KnowSim to text clustering and classification problems to demonstrate that unsupervised meta-path selection can help improve clustering and classification results. We use Freebase, a well-known world knowledge base, to conduct semantic parsing and construct HINs for documents. Our experiments on the 20Newsgroups and RCV1 datasets show that KnowSim yields impressively high-quality document clustering and classification performance. We also demonstrate that the approximate personalized PageRank algorithm can efficiently and effectively compute the meta-path-based similarity.
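Personalized PageRank itself can be sketched with plain power iteration on a toy graph; the approximation tricks and the meta-path machinery from the paper are not reproduced here.

```python
# Personalized PageRank by power iteration: teleport mass always returns to the
# seed node, so the stationary ranks concentrate around the seed's neighborhood.

def personalized_pagerank(graph, seed, alpha=0.85, iters=50):
    nodes = list(graph)
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iters):
        new = {n: (1 - alpha) * (1.0 if n == seed else 0.0) for n in nodes}
        for n in nodes:
            out = graph[n]
            for m in out:
                new[m] += alpha * rank[n] / len(out)
        rank = new
    return rank

graph = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
pr = personalized_pagerank(graph, "a")
```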