Similar Documents
20 similar documents found.
1.
Multimedia Tools and Applications - Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information,...

2.
Multimedia Tools and Applications - Text detection in video/images is challenging due to the presence of multiple blur caused by defocus and motion. In this paper, we present a new method for...

3.
Multimedia Tools and Applications - Video scene text contains valuable information for scene understanding, as scene text in video provides important semantic clues for human beings to sense the...

4.
Given the importance of textual information for video sequences and video indexing, this paper proposes a text localization algorithm based on mixed text features. The algorithm first applies edge detection and projection analysis to one frame sampled every 25 frames of the video sequence to extract candidate text blocks, then filters them with a support vector machine to remove non-text blocks, and finally exploits the correlation between adjacent frames to locate text blocks in the remaining frames. The algorithm improves detection speed while maintaining high detection accuracy.
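To make the edge-detection and projection step above concrete, here is a minimal Python sketch (not the paper's implementation): Sobel edge density is projected onto image rows, and consecutive high-density rows are grouped into candidate text bands. The function name candidate_text_bands and the threshold values are illustrative assumptions; the SVM filtering and inter-frame tracking stages are omitted.

```python
import cv2
import numpy as np

def candidate_text_bands(gray, density_thresh=0.15, min_height=8):
    """Return (top, bottom) row ranges of a grayscale frame that may contain text."""
    # Vertical-edge magnitude via Sobel, binarized with a fixed threshold.
    edges = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    edge_mask = (np.abs(edges) > 50).astype(np.uint8)

    # Horizontal projection profile: fraction of edge pixels in each row.
    profile = edge_mask.mean(axis=1)
    is_text_row = profile > density_thresh

    # Group consecutive text-like rows into candidate bands.
    bands, start = [], None
    for y, flag in enumerate(is_text_row):
        if flag and start is None:
            start = y
        elif not flag and start is not None:
            if y - start >= min_height:
                bands.append((start, y))
            start = None
    if start is not None and len(is_text_row) - start >= min_height:
        bands.append((start, len(is_text_row)))
    return bands
```

In a full pipeline, each band would then be cropped and passed to a text/non-text classifier such as the SVM described in the abstract.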

5.
Text contained in scene images provides the semantic context of the images. For that reason, robust extraction of text regions is essential for successful scene text understanding. However, separating text pixels from scene images remains a challenging issue because of uncontrolled lighting conditions and complex backgrounds. In this paper, we propose a two-stage conditional random field (TCRF) approach to robustly extract text regions from scene images. The proposed approach models the spatial and hierarchical structures of the scene text, and it finds text regions based on the scene text model. In the first stage, the system generates multiple character proposals for the given image by using multiple image segmentations and a local CRF model. In the second stage, the system selectively integrates the generated character proposals to determine proper character regions by using a holistic CRF model. Through the TCRF approach, we cast the scene text separation problem as a probabilistic labeling problem, which yields the optimal label configuration of pixels that maximizes the conditional probability of the given image. Experimental results indicate that our framework exhibits good performance on public databases.

6.
In this paper, we address two complex issues: 1) text frame classification and 2) multi-oriented text detection in video text frames. We first divide a video frame into 16 blocks and propose a combination of wavelet and median-moments with k-means clustering at the block level to identify probable text blocks. For each probable text block, the method applies the same combination of features with k-means clustering over a sliding window running through the blocks to identify potential text candidates. We introduce a new idea of symmetry on text candidates in each block based on the observation that pixel distribution in text exhibits a symmetric pattern. The method integrates all blocks containing text candidates in the frame, and then all text candidates are mapped onto a Sobel edge map of the original frame to obtain text representatives. To tackle the multi-orientation problem, we present a new method called Angle Projection Boundary Growing (APBG), an iterative algorithm based on a nearest-neighbor concept. APBG is then applied to the text representatives to fix the bounding boxes of multi-oriented text lines in the video frame. Directional information is used to eliminate false positives. Experimental results on a variety of datasets such as non-horizontal data, horizontal data, publicly available data (Hua's data) and ICDAR-03 competition data (camera images) show that the proposed method outperforms existing methods proposed for video as well as state-of-the-art methods for scene text.
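A hedged sketch of the block-level clustering idea follows: each block of the frame is described by a small feature vector, and 2-means clustering separates probable text blocks from background blocks. Plain gradient-energy statistics stand in for the wavelet and median-moment features used in the paper, and the function name probable_text_blocks is an illustrative assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def probable_text_blocks(frame_gray, grid=4):
    """Split a grayscale frame into grid x grid blocks and return the
    (row, col) indices of blocks assigned to the higher-energy cluster."""
    h, w = frame_gray.shape
    bh, bw = h // grid, w // grid
    feats, coords = [], []
    for i in range(grid):
        for j in range(grid):
            block = frame_gray[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw].astype(np.float32)
            gy, gx = np.gradient(block)
            energy = np.hypot(gx, gy)
            feats.append([energy.mean(), energy.std()])
            coords.append((i, j))
    feats = np.array(feats)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(feats)
    # Assume the cluster with the higher mean edge energy holds the text blocks.
    cluster_means = [feats[labels == c, 0].mean() for c in (0, 1)]
    text_cluster = int(np.argmax(cluster_means))
    return [coords[k] for k, lab in enumerate(labels) if lab == text_cluster]
```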

7.
8.
Objective: Current convolutional neural network (CNN) based text detection methods have great difficulty localizing small-scale text in natural scenes. However, text in natural scene images is strongly associated with other objects: scene text usually appears together with specific objects such as billboards and road signs. Based on this observation, this paper proposes a cascaded CNN scene text detection method that takes object association into account. Method: First, a CNN detects both text objects and the associated text-bearing objects, yielding text candidate boxes and candidate boxes for text-bearing objects. The text-bearing object candidate regions are then enlarged and cropped from the original image, and the cropped images are fed to the CNN again to detect text candidate boxes more precisely. Finally, non-maximum suppression fuses the text candidate boxes produced by the two steps to obtain the detection result. Results: The method detects small-scale text effectively, achieving a recall of 0.817, a precision of 0.880 and an F-measure of 0.847 on the ICDAR-2013 dataset. Conclusion: By exploiting the strong association between scene text and the objects that contain it, the method improves the recall of small-scale text detection in natural scene images.
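The final fusion step lends itself to a short sketch. The snippet below pools text boxes from the full-image pass and the cropped associated-object pass and merges them with standard non-maximum suppression; the IoU threshold and the helper names iou and nms are illustrative, not the authors' code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of the boxes kept."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < iou_thresh]
    return keep

# Usage: pool candidates from both detection passes, then suppress duplicates.
# boxes = full_image_boxes + cropped_region_boxes
# scores = full_image_scores + cropped_region_scores
# kept = nms(boxes, scores)
```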

9.
A memory leak in a managed language occurs when the program inadvertently maintains references to objects that it no longer needs. Memory leaks cause systematic heap growth that degrades performance and can result in program crashes after perhaps days or weeks of execution. Prior approaches for detecting memory leaks rely on heap differencing or detailed object statistics which store state proportional to the number of objects in the heap. These overheads preclude their use on the same processor for deployed long-running applications. This paper introduces Cork as a tool that accurately identifies heap growth caused by leaks. It is space efficient (adding less than 1% to the heap) and time efficient (adding 2.3% on average to total execution time). We implement this approach of examining and summarizing the class of live objects during garbage collection in a class points-from graph (CPFG). Each node in the CPFG represents a class and edges between nodes represent references between objects of the specific classes. Cork annotates nodes and edges with the corresponding volume of live objects. Cork identifies growing data structures across multiple collections and computes a class slice to identify leaks for the user. We experiment with two functions for identifying growth and show that Cork is accurate: it identifies systematic heap growth with no false positives in 4 of 15 benchmarks we tested. Cork's slice report enabled us to quickly identify and eliminate growing data structures in large and unfamiliar programs, something their developers had not previously done. Copyright © 2009 John Wiley & Sons, Ltd.
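As a toy illustration of the growth-detection idea (not Cork itself, which works inside the garbage collector on the class points-from graph), the Python sketch below records live-object volume per class at each collection and flags classes whose volume grows monotonically over a sliding window. The class name GrowthTracker and the window size are assumptions for illustration.

```python
from collections import defaultdict

class GrowthTracker:
    """Flag classes whose live-object volume grows monotonically across collections."""

    def __init__(self, window=3):
        self.history = defaultdict(list)   # class name -> volume observed at each GC
        self.window = window

    def record_collection(self, volumes_by_class):
        """volumes_by_class: dict mapping class name -> live bytes at this collection."""
        for cls, vol in volumes_by_class.items():
            self.history[cls].append(vol)

    def leak_suspects(self):
        suspects = []
        for cls, vols in self.history.items():
            recent = vols[-self.window:]
            if len(recent) == self.window and all(b > a for a, b in zip(recent, recent[1:])):
                suspects.append(cls)
        return suspects
```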

10.
International Journal on Document Analysis and Recognition (IJDAR) - Recently scene text detection has become a hot research topic. Arbitrary-shaped text detection is more challenging due to the...

11.
Text recognition in natural scene images is a challenging task that has recently been garnering increased research attention. In this paper, we propose a method for recognizing text by utilizing the layout consistency of a text string. We estimate the layout (four lines of a text string) using initial character extraction and recognition results. On the basis of the layout consistency across a word, we perform character extraction and recognition again using the four lines, which is more accurate than the first process. Our layout estimation method is different from previous methods in terms of exploiting character recognition results and its use of a class-conditional layout model. More accurate and robust estimation is achieved, and it can be used to refine character extraction and recognition. We call this two-way process, from extraction and recognition to layout and from layout back to extraction and recognition, "bidirectional" to discriminate it from previous feedback refinement approaches. Experimental results demonstrate that our bidirectional processes provide a boost to the performance of word recognition.
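A minimal sketch of the line-fitting part of the layout estimation, under simplifying assumptions: given initial character bounding boxes, a shared slope and two of the four text lines (top line and baseline) are fit by least squares. The real method also exploits recognition results and a class-conditional layout model, which are not modeled here; fit_text_lines is a hypothetical helper name.

```python
import numpy as np

def fit_text_lines(char_boxes):
    """char_boxes: list of (x1, y1, x2, y2) character boxes of one word.
    Returns (slope, top_intercept, baseline_intercept) for lines y = slope * x + intercept."""
    xs = np.array([(x1 + x2) / 2.0 for x1, _, x2, _ in char_boxes])
    tops = np.array([y1 for _, y1, _, _ in char_boxes])
    bottoms = np.array([y2 for _, _, _, y2 in char_boxes])
    # Fit one shared slope to the vertical centers, then place the two lines
    # at the average offsets of the box tops and bottoms from that slope.
    slope = np.polyfit(xs, (tops + bottoms) / 2.0, 1)[0]
    top_intercept = float(np.mean(tops - slope * xs))
    baseline_intercept = float(np.mean(bottoms - slope * xs))
    return float(slope), top_intercept, baseline_intercept
```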

12.
13.
Automatic text segmentation and text recognition for video indexing (total citations: 13; self-citations: 0; citations by others: 13)
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in them. It enables content-based browsing. We present our new methods for automatic segmentation of text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single bitmap. The output of the text segmentation step is then directly passed to a standard OCR software package in order to translate the segmented text into ASCII. Also, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments to demonstrate that the proposed text segmentation algorithms together with existing text recognition algorithms are suitable for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics in videos.
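The temporal integration of character bitmaps can be sketched briefly: aligned binary crops of one tracked character from successive frames are averaged and re-thresholded, so pixels that appear consistently survive while transient background clutter is suppressed. Alignment and tracking are assumed to have been done already; the function name and the vote_ratio parameter are illustrative.

```python
import numpy as np

def integrate_character_bitmaps(bitmaps, vote_ratio=0.5):
    """bitmaps: equally sized binary (0/1) arrays of one character from several frames.
    Returns a single integrated binary bitmap."""
    stack = np.stack([b.astype(np.float32) for b in bitmaps])
    mean = stack.mean(axis=0)
    # Keep pixels that are "on" in at least vote_ratio of the frames.
    return (mean >= vote_ratio).astype(np.uint8)
```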

14.
We consider some natural variations on the following classic pattern-matching problem: given an NFA M over the alphabet Σ and a pattern p over some alphabet Δ, does there exist a word x ∈ L(M) such that x matches p? We consider the restricted problem where M only accepts a finite language. We also consider the variation where only some factor of x is required to match the pattern p. We show that both of these problems are NP-complete. We also consider the same problems for context-free grammars; in this case the problems become PSPACE-complete.

15.
Multimedia Tools and Applications - Text detection in arbitrarily-oriented multi-lingual video is an emerging area of research because it plays a vital role for developing real-time indexing and...

16.
17.
Existing anchor-free text detection methods exploit only the geometric properties of text boxes, ignore their positional properties and lack an effective filtering mechanism. To address this, an anchor-free natural scene text detection method that mines the positional properties of text boxes is proposed. The method uses ResNet50 as the backbone of the convolutional neural network, fuses feature maps of several different scales and then predicts both the geometric and positional properties of text boxes; a two-stage filtering mechanism finally yields the detected text boxes. F-measures of 0.870 and 0.861 on the public ICDAR2013 and ICDAR2011 datasets demonstrate the effectiveness of the method.

18.
We consider a variant of Gold’s learning paradigm where a learner receives as input n different languages (in the form of one text where all input languages are interleaved). Our goal is to explore the situation when a more “coarse” classification of input languages is possible, whereas a more refined classification is not. More specifically, we answer the following question: under which conditions can a learner, being fed n different languages, produce m grammars covering all input languages, but cannot produce k grammars covering the input languages for any k > m. We also consider a variant of this task, where each of the output grammars may not cover more than r input languages. Our main results indicate that the major factor affecting classification capabilities is the difference n − m between the number n of input languages and the number m of output grammars. We also explore the relationship between classification capabilities for smaller and larger groups of input languages. For the variant of our model with an upper bound on the number of languages allowed to be represented by one output grammar, for classes consisting of disjoint languages, we found a complete picture of the relationship between classification capabilities for different parameters n (the number of input languages), m (the number of output grammars), and r (the bound on the number of languages represented by each output grammar). This picture includes a combinatorial characterization of classification capabilities for the parameters n, m, r of certain types.

19.
Artificial Intelligence Review - Nowadays a huge amount of information is available from both online and offline sources. For a single topic, hundreds of articles are available, containing...

20.
Pattern Analysis and Applications - The aim of this article is twofold. First, we propose an effective methodology for binarization of scene images. For our present study, we use the publicly...
