Similar Literature
20 similar documents found.
1.
Searching and indexing historical handwritten collections is a very challenging problem. We describe an approach called word spotting, which groups word images into clusters of similar words, using image matching to determine similarity. By annotating “interesting” clusters, an index that links words to the locations where they occur can be built automatically. Image similarities computed using a number of different techniques, including dynamic time warping, are compared. The word similarities are then used for clustering with both K-means and agglomerative clustering techniques. On a subset of the George Washington collection, we show that such a word spotting technique can outperform a Hidden Markov Model word-based recognition technique in terms of word error rate. An erratum to this article can be found at
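The image-matching step in entry 1 rests on dynamic time warping over per-word feature sequences. A minimal Python sketch, assuming a toy per-column ink-count profile as the feature (the paper's actual feature set is richer):

```python
import numpy as np

def column_profile(word_img):
    """Per-column ink count of a binary word image: a toy projection-profile feature."""
    return word_img.sum(axis=0).astype(float)

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Pairwise DTW distances computed this way can feed directly into K-means or agglomerative clustering as a similarity measure.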

4.
In this paper, we study the effect of taking the user into account in a query-by-example handwritten word spotting framework. Several off-the-shelf query-fusion and relevance-feedback strategies are tested in the handwritten word spotting context. The increase in precision when the user is included in the loop is assessed on two datasets of historical handwritten documents with two baseline word spotting approaches, both based on the bag-of-visual-words model. Finally, we present two alternative ways of presenting the results to the user that may be more attractive and better suited to the user's needs than the classic ranked list.
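Relevance feedback of the kind tested in entry 4 is often implemented as a Rocchio-style query update over bag-of-visual-words histograms. A sketch under that assumption (the weights `alpha`, `beta`, `gamma` are conventional defaults, not values from the paper):

```python
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio query update: move the query toward marked-relevant results
    and away from marked-nonrelevant ones."""
    q = alpha * np.asarray(query, dtype=float)
    if len(relevant):
        q += beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q -= gamma * np.mean(nonrelevant, axis=0)
    # Bag-of-visual-words histograms are non-negative, so clip.
    return np.clip(q, 0.0, None)
```

The updated query vector is then re-scored against the collection, and the loop repeats for as many feedback rounds as the user is willing to give.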

6.
In this paper, we propose a novel technique for word spotting in historical printed documents that combines synthetic data and user feedback. Our aim is to search for keywords typed by the user in a large collection of digitized printed historical documents. The proposed method consists of the following stages: (1) creation of synthetic word images; (2) word segmentation using dynamic parameters; (3) efficient feature extraction for each word image and (4) a retrieval procedure that is optimized by user feedback. Experimental results demonstrate the effectiveness of the proposed approach.

11.
Searching for words of interest in a speech sequence is referred to as keyword spotting (KWS). A myriad of techniques have been proposed over the years for effectively spotting keywords in adults' speech. However, not much work has been reported on KWS for children's speech. Speech data from adult and child speakers differ significantly due to physiological differences between the two groups of speakers. Consequently, the performance of a KWS system trained on adults' speech degrades severely when used by children due to this acoustic mismatch. In this paper, we present our efforts towards improving the performance of keyword spotting systems for children's speech under a limited-data scenario. To this end, we explore prosody modification in order to reduce the acoustic mismatch resulting from differences in pitch and speaking rate. The prosody modification technique explored in this paper is based on glottal closure instant (GCI) events, with GCI locations computed using zero-frequency filtering (ZFF). Further, we present two different ways of applying prosody modification effectively. In the first, prosody modification is applied to the children's test speech prior to decoding in order to improve recognition performance. Alternatively, we also apply prosody modification to the training data from adult speakers; the original and prosody-modified adults' speech are then augmented together before learning the statistical parameters of the KWS system. The experimental evaluations presented in this paper show that both approaches yield significantly improved performance on children's speech. Prosody-modification-based data augmentation also improves performance on adults' speech.
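The speaking-rate side of the prosody modification in entry 11 can be illustrated, very crudely, by resampling. This is a naive stand-in for illustration only: it also shifts pitch, unlike the GCI/ZFF-based technique the paper actually uses, which modifies pitch and duration independently:

```python
import numpy as np

def change_rate(signal, factor):
    """Naive speaking-rate change by linear-interpolation resampling.

    factor > 1 speeds the signal up; factor < 1 slows it down.
    NOTE: this also scales the pitch, which proper GCI-anchored
    prosody modification avoids.
    """
    n = int(len(signal) / factor)
    idx = np.linspace(0, len(signal) - 1, n)
    return np.interp(idx, np.arange(len(signal)), signal)
```

For augmentation, the original adult recordings and several rate- or pitch-modified copies would be pooled before training the KWS system.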

13.
Geoscientific data interpretation is a challenging task that requires the detection and synthesis of complex patterns within data. As a first step towards better understanding this interpretation process, our research focuses on quantitative monitoring of interpreters' brain responses associated with geoscientific target spotting. This paper presents a method that profiles brain responses using electroencephalography (EEG) to detect P300-like responses associated with target spotting in complex geoscientific data. In our experiment, eight interpreters with varying levels of expertise and experience were asked to detect features likely to be copper–gold-rich porphyry systems within magnetic geophysical data. The target features appear against a noisy background and often have incomplete shapes. Magnetic images with and without targets were shown to participants using the “oddball” paradigm. Event-related potentials were obtained by averaging the EEG epochs across multiple trials, and the results show a delayed P3 response to the targets, likely due to the complexity of the task. EEG epochs were classified, and the results show reliable single-trial classification of EEG responses with an average accuracy of 83%. These results demonstrate that P300-like responses can be used to quantify geoscientific target-spotting performance.
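The event-related-potential averaging described in entry 13 is a simple operation: time-locked epochs are averaged across trials so that stimulus-locked components (such as the P3) survive while uncorrelated noise cancels. A minimal sketch, with an optional baseline correction that is a common convention rather than the paper's stated preprocessing:

```python
import numpy as np

def event_related_potential(epochs, baseline_samples=0):
    """Average time-locked EEG epochs (trials x samples) into an ERP.

    Optionally subtracts the mean of the first `baseline_samples`
    pre-stimulus samples of each epoch (simple baseline correction).
    """
    epochs = np.asarray(epochs, dtype=float)
    if baseline_samples:
        epochs = epochs - epochs[:, :baseline_samples].mean(axis=1, keepdims=True)
    return epochs.mean(axis=0)
```

Single-trial classification, by contrast, operates on the individual epochs rather than the average, which is why it is the harder problem the 83% figure refers to.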

14.
Spectral-analysis-based shape descriptors have achieved good results in non-rigid 3D shape matching and have attracted wide attention from researchers. Spectral analysis is an intrinsic shape-analysis method based on the spectral decomposition of the Laplace–Beltrami operator on a manifold. Spectral shape descriptors and spectral distance distribution functions are the two main classes of spectral-analysis shape descriptors; they have different mathematical properties and physical interpretations. Based on these two classes of descriptors, this paper gives a detailed analysis of the methods and their application to shape matching. First, we present a framework for non-rigid 3D shape matching using spectral-analysis-based shape descriptors and introduce the basic ideas and computation of several commonly used spectral shape descriptors and spectral distance distribution functions. We then analyze and compare the strengths, weaknesses, and application scenarios of these descriptors, providing a reference for researchers choosing among them. Finally, experiments compare the robustness, time cost, and non-rigid matching performance of different spectral-analysis-based shape descriptors, with the aim of advancing their practical application.
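A representative spectral shape descriptor in this line of work is the heat kernel signature, HKS(x, t) = Σᵢ exp(−λᵢ t) φᵢ(x)², built from Laplace–Beltrami eigenpairs. A sketch using a graph Laplacian as a discrete stand-in for the Laplace–Beltrami operator (the survey's descriptors are computed on triangle meshes, not plain graphs):

```python
import numpy as np

def heat_kernel_signature(evals, evecs, times):
    """HKS(x, t) = sum_i exp(-lambda_i * t) * phi_i(x)^2.

    evals: (k,) Laplacian eigenvalues; evecs: (n_vertices, k) eigenvectors;
    times: (n_times,) diffusion scales. Returns (n_times, n_vertices).
    """
    evecs_sq = evecs ** 2                      # phi_i(x)^2, per vertex
    decay = np.exp(-np.outer(times, evals))    # exp(-lambda_i * t), per scale
    return decay @ evecs_sq.T
```

Because the signature depends only on the Laplacian spectrum, it is intrinsic: isometric (non-rigid) deformations leave it unchanged, which is exactly the property that makes spectral descriptors attractive for non-rigid matching.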

18.
Due to the large number of spelling variants found in historical texts, standard methods of Information Retrieval (IR) fail to produce satisfactory results on historical document collections. In order to improve recall for search engines, modern words used in queries have to be associated with corresponding historical variants found in the documents. In the literature, the use of (1) special matching procedures and (2) lexica for historical language have been suggested as two alternative ways to solve this problem. In the first part of the paper, we show how the construction of matching procedures and lexica may benefit from each other, leading the way to a combination of both approaches. A tool is presented where matching rules and a historical lexicon are built in an interleaved way based on corpus analysis. In the second part of the paper, we ask if matching procedures alone suffice to lift IR on historical texts to a satisfactory level. Since historical language changes over centuries, it is not simple to obtain an answer. We present experiments where the performance of matching procedures in text collections from four centuries is studied. After classifying missed vocabulary, we measure precision and recall of the matching procedure for each period. Results indicate that for earlier periods, matching procedures alone do not lead to satisfactory results. We then describe experiments where the gain for recall obtained from historical lexica of distinct sizes is estimated.
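Matching procedures of the kind studied in entry 18 typically combine character-level rewrite rules with an edit-distance threshold. A sketch of that pattern; the rule list here is illustrative ('ſ' long s → 's' is a genuine historical normalization, 'vv' → 'w' is a hypothetical example, and the paper's actual rules are learned from corpus analysis:

```python
# Illustrative rewrite rules, NOT the paper's learned rule set.
RULES = [("ſ", "s"), ("vv", "w")]

def normalize(word, rules=RULES):
    """Apply character-level rewrite rules to map a historical variant
    toward its modern form."""
    for old, new in rules:
        word = word.replace(old, new)
    return word

def levenshtein(a, b):
    """Standard edit distance, used to admit residual variation
    that the rules do not cover."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def matches(modern_query, historical_word, max_dist=1):
    return levenshtein(modern_query, normalize(historical_word)) <= max_dist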

20.
In this paper, we propose a novel solution for the problem of segmenting macro- and micro-expression frames (or retrieving the expression intervals) in video sequences, which is a prior step for many expression recognition algorithms. The proposed method exploits the non-rigid facial motion that occurs during facial expressions by capturing the optical strain corresponding to the elastic deformation of facial skin tissue. The method is capable of spotting both macro-expressions, which are typically associated with expressed emotions, and rapid micro-expressions, which are typically associated with semi-suppressed macro-expressions. We test our algorithm on several datasets, including a newly released hour-long video with two subjects recorded in a natural setting that includes spontaneous facial expressions. We also report results on a dataset that contains 75 feigned macro-expressions and 37 feigned micro-expressions. We achieve over a 75% true positive rate at a 1% false positive rate for macro-expressions, and a nearly 80% true positive rate for spotting micro-expressions at a 0.3% false positive rate.
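The optical strain in entry 20 is derived from the spatial derivatives of a dense optical-flow field (u, v). A sketch of one common strain-magnitude formulation, ε = √(ε²ₓₓ + ε²ᵧᵧ + 2ε²ₓᵧ), with the flow field itself assumed given (computing it is a separate step the paper builds on):

```python
import numpy as np

def optical_strain(u, v):
    """Strain magnitude of a dense optical-flow field (u, v).

    u, v: 2-D arrays of horizontal/vertical displacement per pixel.
    Uses the infinitesimal strain tensor: e_xx = du/dx, e_yy = dv/dy,
    e_xy = 0.5 * (du/dy + dv/dx).
    """
    ux, uy = np.gradient(u, axis=1), np.gradient(u, axis=0)
    vx, vy = np.gradient(v, axis=1), np.gradient(v, axis=0)
    exy = 0.5 * (uy + vx)
    return np.sqrt(ux**2 + vy**2 + 2.0 * exy**2)
```

Rigid head motion (a constant flow field) produces zero strain, while the elastic skin deformation of an expression produces localized strain peaks, which is what makes this quantity usable for spotting expression intervals.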


Copyright©北京勤云科技发展有限公司  京ICP备09084417号