Similar Documents
20 similar documents found.
1.
We propose an efficient real-time automatic license plate recognition (ALPR) framework, designed in particular to work on CCTV footage from cameras that are not dedicated to ALPR. License plate detection, tracking and recognition are at present reasonably well-tackled problems, with many successful commercial solutions available. However, existing ALPR algorithms assume that the input video is obtained via a dedicated, high-resolution, high-speed camera and/or supported by a controlled capture environment, with appropriate camera height, focus, exposure/shutter speed and lighting settings. Typical video forensic applications, in contrast, may require searching for a vehicle with a particular number plate in noisy CCTV footage obtained from non-dedicated, medium-to-low resolution cameras operating under poor illumination. ALPR in such video content faces severe challenges in the license plate localization, tracking and recognition stages. This paper proposes a novel approach for efficient localization of license plates in video sequences, together with a revised version of an existing technique for tracking and recognition. A special feature of the proposed approach is that it automatically adjusts for varying camera distances and diverse lighting conditions, a requirement for a video forensic tool that may operate on videos obtained by a diverse set of unspecified, distributed CCTV cameras.

2.
This paper presents a unified approach to analyzing and structuring the content of videotaped lectures for distance learning applications. By structuring lecture videos, we can support topic indexing and semantic querying of multimedia documents captured in traditional classrooms. Our goal is to automatically construct cross references between lecture videos and textual documents so as to facilitate synchronized browsing and presentation of multimedia information. The major issues involved are topical event detection, video text analysis, and the matching of slide shots with external documents. In topical event detection, a novel transition detector rapidly locates slide shot boundaries by computing the changes of text and background regions in the video. For each detected topical event, multiple keyframes are extracted for video text detection, super-resolution reconstruction, binarization and recognition. A new approach to reconstructing high-resolution textboxes, based on linear interpolation and multi-frame integration, is also proposed for effective binarization and recognition. The recognized characters are used to match video slide shots with external documents via our proposed title and content similarity measures.

3.
Gesture plays an important role in recognizing lecture activities in video content analysis. In this paper, we propose a real-time gesture detection algorithm that integrates cues from video, speech, and electronic slides. In contrast to conventional "complete gesture" recognition, we emphasize detection by prediction from "incomplete gestures". Specifically, intentional gestures are predicted by a modified hidden Markov model (HMM) that can recognize incomplete gestures before the whole gesture path is observed. The multimodal correspondence between speech and gesture is exploited to increase the accuracy and responsiveness of detection. In lecture presentation, the algorithm enables on-the-fly editing of lecture slides by simulating appropriate camera motion to highlight the intention and flow of lecturing. We develop a real-time application, a simulated smartboard, and demonstrate the feasibility of our prediction algorithm using hand gestures and a laser pen with a simple setup that involves no expensive hardware.

4.
During the last decade, many natural interaction methods between humans and computers have been introduced. They were developed as substitutes for keyboard and mouse devices, providing more convenient interfaces. Recently, vision-based gestural control methods for Human-Computer Interaction (HCI) have attracted attention because of their convenience and simplicity. Two key issues for such interfaces are robustness and real-time processing. This paper presents a hand gesture based virtual mouse interface and a Two-layer Bayesian Network (TBN) for robust, real-time hand gesture recognition. The TBN provides an efficient framework for inferring hand postures and gestures not only from information at the current time frame but also from preceding and following frames, so that it compensates for erroneous postures and their locations in cluttered background environments. Experiments demonstrate that the proposed model recognizes hand gestures with rates of 93.76% and 85.15% on simple and cluttered background video data, respectively, outperforming previous methods based on the Hidden Markov Model (HMM) and the Finite State Machine (FSM).
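The core idea of exploiting preceding and following frames to correct per-frame posture estimates can be illustrated with a generic forward-backward smoothing pass. This is only a minimal sketch of that idea, not the paper's TBN; the posture classes, transition matrix, and per-frame posteriors are all hypothetical inputs.

```python
import numpy as np

def smooth_posture_posteriors(frame_posteriors, transition):
    """Forward-backward smoothing over per-frame posture posteriors.

    frame_posteriors: (T, K) array of per-frame posture probabilities.
    transition: (K, K) posture transition matrix.
    Returns a (T, K) array combining past and future evidence, so an
    erroneous single-frame estimate is pulled toward its neighbors.
    """
    T, K = frame_posteriors.shape
    fwd = np.zeros((T, K))
    bwd = np.ones((T, K))
    fwd[0] = frame_posteriors[0] / frame_posteriors[0].sum()
    for t in range(1, T):                      # forward pass: past evidence
        fwd[t] = frame_posteriors[t] * (fwd[t - 1] @ transition)
        fwd[t] /= fwd[t].sum()
    for t in range(T - 2, -1, -1):             # backward pass: future evidence
        bwd[t] = transition @ (frame_posteriors[t + 1] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    smoothed = fwd * bwd
    return smoothed / smoothed.sum(axis=1, keepdims=True)
```

A frame whose raw posterior disagrees with both its neighbors ends up with a smoothed posterior closer to the neighboring estimates, which is the compensation effect the abstract describes.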

5.
To improve the accuracy and real-time performance of dynamic hand gesture recognition for human-computer interaction in realistic, complex scenes, this paper proposes a new feature, Temporal Locality Sensitive Histograms of Oriented Gradients (TLSHOG), which describes both the temporal evolution and the spatial posture of hand motion and enables fast, accurate dynamic gesture recognition. Two-dimensional image sequences of the hand, captured with an ordinary webcam, serve as training samples. Single-frame features are constructed to describe hand posture and combined with a Temporal Pyramid (TP) to describe the spatio-temporal characteristics of the gesture trajectory; a multi-class Support Vector Machine (SVM) is then trained to accurately classify the various gestures in the test samples. Experimental results show that the method achieves high accuracy and good real-time performance, and is robust to cluttered background interference and changes in illumination intensity.
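The temporal pyramid step described above can be sketched generically: per-frame descriptors are average-pooled over 1, 2, 4, ... temporal segments and the pooled vectors concatenated into one fixed-length vector for the SVM. This is an illustrative sketch of the general TP technique, not the paper's exact feature; the descriptor contents and pyramid depth are assumptions.

```python
import numpy as np

def temporal_pyramid(features, levels=3):
    """Pool a (T, D) sequence of per-frame descriptors into a fixed-length
    vector by averaging over 1, 2, 4, ... temporal segments and
    concatenating the segment means (length D * (2**levels - 1))."""
    T, D = features.shape
    pooled = []
    for level in range(levels):
        n_segments = 2 ** level
        bounds = np.linspace(0, T, n_segments + 1).astype(int)
        for i in range(n_segments):
            # guard against empty segments when T < n_segments
            seg = features[bounds[i]:max(bounds[i] + 1, bounds[i + 1])]
            pooled.append(seg.mean(axis=0))
    return np.concatenate(pooled)
```

The fixed-length output is what makes variable-length gesture clips comparable inside a standard SVM.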

6.
This paper targets the problem of automatic semantic indexing of news videos by presenting a video annotation and retrieval system that performs automatic semantic annotation of news video archives and provides access to the archives via these annotations. The system relies on video texts as the information source and exploits several information extraction techniques on these texts to arrive at representative semantic information about the underlying videos. These techniques include named entity recognition, person entity extraction, coreference resolution, and semantic event extraction. Apart from the information extraction components, the system also encompasses modules for news story segmentation, text extraction, and video retrieval, along with a news video database, making it a full-fledged system for practical settings. The proposed system is generic, employing a wide range of techniques to automate the semantic video indexing process and to bridge the semantic gap between what can be automatically extracted from videos and what people perceive as video semantics. Based on the proposed system, a novel automatic semantic annotation and retrieval system is built for Turkish and evaluated on a broadcast news video collection, providing evidence for its feasibility and convenience for news videos, with satisfactory overall performance.

7.
Sign language is a method of communication for the deaf. Articulated gestures and postures of the hands and fingers are commonly used in sign language. This paper presents a system that recognizes Korean sign language (KSL) and translates it into normal Korean text. A pair of data-gloves serves as the sensing device for detecting the motions of hands and fingers. For efficient recognition of gestures and postures, a technique for efficient classification of motions is proposed, and a fuzzy min-max neural network is adopted for on-line pattern recognition.

8.
Locating content in existing video archives is a time- and bandwidth-consuming process, since users may have to download and manually watch large portions of superfluous video. In this paper, we present two novel prototypes built on an Internet-based video composition and streaming system with a keyword-based search interface that collects, converts, analyzes, indexes, and ranks video content. At the user's request, the system can automatically sequence out portions of single videos or aggregate content from multiple videos to produce a single, personalized video stream on-the-fly.

9.
Automatic text segmentation and text recognition for video indexing
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in videos, which enables content-based browsing. We present new methods for automatic segmentation of text in digital videos. The proposed algorithms exploit typical characteristics of text in videos to enable and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video, and the integration of the multiple bitmaps of a character over time into a single bitmap. The output of the text segmentation step is passed directly to a standard OCR software package to translate the segmented text into ASCII. A straightforward indexing and retrieval scheme is also introduced and used in the experiments to demonstrate that the proposed text segmentation algorithms, together with existing text recognition algorithms, are suitable for indexing and retrieving relevant video sequences from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher-level semantics in videos.
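The multi-frame integration idea, fusing the multiple bitmaps of a tracked character into a single bitmap before OCR, can be sketched as a simple temporal average followed by thresholding. This is only an illustration of the general principle (frame-varying background noise averages out while the static glyph persists), not the paper's algorithm; the fixed threshold and the dark-text-on-light-background assumption are mine.

```python
import numpy as np

def integrate_bitmaps(bitmaps, threshold=128):
    """Fuse aligned grayscale crops of the same character taken from
    successive frames. Averaging suppresses background clutter that varies
    from frame to frame; a global threshold then binarizes for OCR.

    Assumes dark text on a lighter background (1 = text pixel)."""
    stack = np.stack(bitmaps).astype(np.float64)  # (N, H, W)
    mean_img = stack.mean(axis=0)
    return (mean_img < threshold).astype(np.uint8)
```

In practice the crops must first be registered by the character tracker, and the threshold would be chosen adaptively rather than fixed.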

10.
Text extraction techniques for images and videos
Many images contain rich textual information, such as banners stored as images in web-page design and captions in video frames. Automatic detection, segmentation, extraction and recognition of such text is highly valuable for automatic understanding, indexing and retrieval of high-level image semantics, and has therefore attracted the interest of many researchers at home and abroad. To give readers a systematic view of the field and a reference for its researchers, this survey reviews the state of the art, based on a comprehensive reading of the domestic and international literature on text extraction from images and videos. It discusses the main techniques and their respective strengths and weaknesses from two aspects, text detection and extraction on the one hand and text recognition on the other, and, in light of the current open problems, points out directions for further research.

11.
Due to the prevalence of digital video camcorders, home videos have become an important part of life-logs of personal experiences. To enable efficient video parsing, a critical step is to automatically extract the objects, events and scene characteristics present in videos. This paper addresses the problem of extracting objects from home videos. Automatic detection of objects is a classical yet difficult vision problem, particularly for videos with complex scenes and unrestricted domains. Compared with edited and surveillance videos, home videos captured in uncontrolled environments usually exhibit notable characteristics such as shaking artifacts, irregular motion, and arbitrary settings. These characteristics have prohibited effective parsing of semantic video content using conventional vision analysis. In this paper, we propose a new approach to automatically locating multiple objects in home videos by taking into account both how and when to initialize objects. Previous approaches mostly consider how but not when, due to efficiency or real-time requirements. In home-video indexing, online processing is optional; by considering when, some difficult problems can be alleviated and, most importantly, the possibility of parsing semantic video objects opens up. In our approach, the how part is formulated as an object detection and association problem, while the when part is a saliency measurement that determines the best few locations at which to start multiple object initialization.

12.

This paper proposes a novel approach for recognizing faces in videos with a high recognition rate. First, a feature vector based on Normalized Local Binary Patterns is obtained for the face region. A set of training and testing videos is used in the face recognition procedure. Each frame of the query video is matched against the signatures of the faces in the database using Euclidean distance, and a ranked list is formed. Each ranked list is clustered and its reliability analyzed for re-ranking. The multiple re-ranked lists of the query video are fused to form a video signature. This video signature embeds diverse intra-personal variations such as poses and expressions, and facilitates matching two videos with large variations. To match two videos, their composite ranked lists are compared using a Kendall Tau distance measure. The methods are evaluated on the YouTube and ChokePoint videos and exhibit significant performance improvement over existing techniques.
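The Kendall Tau distance used to compare two composite ranked lists simply counts how many pairs of identities the two lists order differently. A minimal sketch of that standard measure (the identity labels are hypothetical):

```python
from itertools import combinations

def kendall_tau_distance(rank_a, rank_b):
    """Count pairwise disagreements between two ranked lists over the same
    set of identities; 0 means identical orderings, n*(n-1)/2 means one
    list is the exact reverse of the other."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    disagreements = 0
    for x, y in combinations(rank_a, 2):
        # a pair disagrees when the two lists order x and y oppositely
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0:
            disagreements += 1
    return disagreements
```

Two videos of the same person should rank the gallery similarly, giving a small distance even when individual frames vary in pose or expression.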

13.
A well-produced video always creates a strong impression on the viewer. However, due to the limitations of the camera, the ambient conditions or the skills of the videographer, the quality of captured videos sometimes falls short of one's expectations. On the other hand, a vast amount of superbly captured video is available on the web and in digital libraries. In this paper, we propose the novel approach of video analogies, which improves the quality of a video by borrowing features from a higher quality video. During the matching phase, we find the correspondence between the source and target videos by feature matching. We then use this correspondence to transfer desired traits of the source video into the target video, producing a new video that acquires the desired features of the source while retaining the merits of the target. The video analogies technique provides an intuitive mechanism for automatic editing of videos. We demonstrate its utility in three applications: colorizing videos, reducing video blur, and video rhythm adjustment. We describe each application in detail and provide experimental results establishing the efficacy of the proposed approach.

14.
Most of the studies establishing factors affecting digital text and multimedia comprehension have been conducted in controlled conditions. The present study sought to test and extend the modality and seductive details effects, and the role of verbal ability and working memory capacity, to a remote, self-paced, E-learning scenario. Two hundred and thirteen first-year undergraduates read or watched videos about scientific expository content in three formats: digital text (written expository texts, navigated in seven screens), presentation video (audio explanation, with written keywords), and presentation video with dynamic decorative images (audio explanation, written keywords, and dynamic decorative and irrelevant images). In a face-to-face session, they completed working memory and verbal ability tests. Comprehension performance was similar for the three conditions. For the multimedia videos with dynamic decorative irrelevant images, comprehension depended on working memory capacity. Verbal ability was relevant for both expository text and videos.

15.

This work introduces a novel approach to extracting meaningful content information from video through collaborative integration of image understanding and natural language processing. We developed a person browser system that associates faces with overlaid name texts in videos. The approach takes news videos as a knowledge source and automatically extracts faces and associated name texts as content information. The proposed framework consists of a text detection module, a face detection module, and a person indexing database module. The successful results of person extraction show that the proposed integrated use of image understanding and natural language processing techniques is headed in the right direction toward our goal of accessing the real content of multimedia information.


16.
Video in digital format is now commonplace and widespread both in professional use and in domestic consumer products, from camcorders to mobile phones. Video content is growing in volume, and while we can capture, compress, store, transmit and display video with great facility, editing videos and manipulating them based on their content is still a non-trivial activity. In this paper, we give a brief review of the state of the art in video analysis, indexing and retrieval, and we point to research directions that we think are promising and could make searching and browsing video archives based on video content as easy as searching and browsing (text) web pages. We conclude the paper with a list of grand challenges for researchers working in the area.

17.
A novel approach is proposed for the recognition of moving hand gestures based on representing hand motions as contour-based similarity images (CBSIs). A CBSI is constructed by calculating the similarity between hand contours in different frames. The input CBSI is then matched against the CBSIs in the database to recognize the hand gesture. The proposed continuous hand gesture recognition algorithm can simultaneously divide continuous gestures into disjoint gestures and recognize them. No restrictive assumptions are made about the motion of the hand between the disjoint gestures. The algorithm was tested on hand gestures from American Sign Language, achieving recognition rates of 91.3% for disjoint gestures and 90.4% for continuous gestures. The experimental results illustrate the efficiency of the algorithm on noisy videos.
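The general shape of a CBSI, a T x T image whose entry (i, j) measures how similar the hand contour in frame i is to the contour in frame j, can be sketched as follows. The paper does not specify its similarity function here, so this sketch assumes each frame's contour has already been reduced to a fixed-length descriptor vector and uses cosine similarity, which is purely illustrative.

```python
import numpy as np

def contour_similarity_image(descriptors):
    """Build a T x T similarity image from per-frame contour descriptors.

    descriptors: iterable of T fixed-length vectors, one per frame.
    Entry (i, j) is the cosine similarity between frames i and j, so the
    diagonal is 1 and repeated hand shapes show up as bright off-diagonal
    blocks.
    """
    X = np.asarray(descriptors, dtype=np.float64)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)  # avoid division by zero
    return Xn @ Xn.T
```

The block structure of such an image is what lets one representation both segment a continuous stream into disjoint gestures and match it against stored templates.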

18.
19.
Text in videos and images plays an important role in content-based video database retrieval, web video search, image segmentation and image inpainting. To improve the efficiency of text detection, this paper presents a video text detection method based on adaptive thresholds computed from multiple features. Building on Michael's algorithm, it computes adaptive local thresholds from three features of text edges: edge strength, edge density, and the ratio of horizontal to vertical edges. These thresholds effectively remove non-text regions, extract text edges, and detect and localize text, reducing the adverse effects of the single-feature threshold in Michael's algorithm. A merging mechanism is introduced in the text localization stage to reduce the occurrence of incomplete regions. Experimental results show high precision and recall, so the method can be applied to video search, image segmentation and image inpainting.
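The idea of an adaptive local threshold, judging each region against its own local statistics rather than one global value, can be sketched on a single edge-strength map. This is a simplified illustration only: the paper combines three features, while this sketch thresholds one feature per block, and the block size and the mean-plus-k-sigma rule are assumptions.

```python
import numpy as np

def adaptive_block_threshold(edge_strength, block=16, k=0.5):
    """Per-block adaptive thresholding of an edge-strength map.

    Each block is thresholded at its own mean + k * std, so busy text
    regions and flat background regions are judged locally instead of
    against a single global threshold. Returns a boolean candidate mask.
    """
    H, W = edge_strength.shape
    mask = np.zeros((H, W), dtype=bool)
    for y in range(0, H, block):
        for x in range(0, W, block):
            patch = edge_strength[y:y + block, x:x + block]
            t = patch.mean() + k * patch.std()
            mask[y:y + block, x:x + block] = patch > t
    return mask
```

A full detector would intersect masks from several features (strength, density, edge-direction ratio) and then merge adjacent candidate regions, as the localization stage above describes.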

20.