1.
Philippe Muller 《Computational Intelligence》2002,18(3):420-450
We present here a theory of motion from a topological point of view, in a symbolic perspective. Taking space–time histories of objects as primitive entities, we introduce temporal and topological relations on the thus defined space–time to characterize classes of spatial changes. The theory thus accounts for qualitative spatial information, dealing with underspecified, symbolic information when accurate data are not available or unnecessary. We show that these structures give a basis for commonsense spatio–temporal reasoning by presenting a number of significant deductions in the theory. This can serve as a formal basis for languages describing motion events in a qualitative way.
2.
Gwo Giun Lee Ming-Jiun Wang He-Yuan Lin Drew Wei-Chi Su Bo-Yun Lin 《Multimedia, IEEE Transactions on》2007,9(3):455-465
This paper presents a new spatio-temporal motion estimation algorithm and its VLSI architecture for video coding, based on an algorithm and architecture co-design methodology. The algorithm combines spatio-temporal motion vector prediction, a modified one-at-a-time search scheme, and multiple update paths derived from optimization theory. The hardware specification targets high-definition video coding. We applied the ME algorithm to the H.264 reference software. Our algorithm surpasses recently published research and achieves performance close to full search. The VLSI implementation demonstrates the low cost of our algorithm. The algorithm and architecture co-design concept is strongly emphasized in this paper, and we provide quantitative examples to show why such co-design is necessary.
3.
Lathoud G. Odobez J.-M. 《IEEE transactions on audio, speech, and language processing》2007,15(5):1696-1710
Distant microphones make it possible to process spontaneous multiparty speech with very few constraints on speakers, as opposed to close-talking microphones. Minimizing the constraints on speakers permits a large diversity of applications, including meeting summarization and browsing, surveillance, hearing aids, and more natural human-machine interaction. Such applications of distant microphones require determining where and when the speakers are talking. This is inherently a multisource problem, because of background noise sources, as well as the natural tendency of multiple speakers to talk over each other. Moreover, spontaneous speech utterances are highly discontinuous, which makes it difficult to track the multiple speakers with classical filtering approaches, such as Kalman filtering or particle filters. As an alternative, this paper proposes a probabilistic framework to determine the trajectories of multiple moving speakers in the short term only, i.e., only while they speak. Instantaneous location estimates that are close in space and time are grouped into “short-term clusters” in a principled manner. Each short-term cluster determines the precise start and end times of an utterance and a short-term spatial trajectory. Contrastive experiments clearly show the benefit of using short-term clustering, on real indoor recordings with seated speakers in meetings, as well as multiple moving speakers.
4.
Douglas S.C. Gupta M. Sawada H. Makino S. 《IEEE transactions on audio, speech, and language processing》2007,15(5):1511-1520
This paper derives two spatio-temporal extensions of the well-known FastICA algorithm of Hyvarinen and Oja that are applicable to the convolutive blind source separation task. Our time-domain algorithms combine multichannel spatio-temporal prewhitening via multistage least-squares linear prediction with novel adaptive procedures that impose paraunitary constraints on the multichannel separation filter. The techniques converge quickly to a separation solution without any step size selection or divergence difficulties, and unlike other methods, ours do not require special coefficient initialization procedures to obtain good separation performance. They also allow for the efficient reconstruction of individual signals as observed in the sensor measurements directly from the system parameters for single-input multiple-output blind source separation tasks. An analysis of one of the adaptive constraint procedures shows its fast convergence to a paraunitary filter bank solution. Numerical evaluations of the proposed algorithms and comparisons with several existing convolutive blind source separation techniques indicate the excellent relative performance of the proposed methods.
5.
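Image perceptual hashing is a relatively new technique. Feature extraction is the key step in hash generation, and traditional DCT-based perceptual hashing resists geometric attacks poorly. The proposed approach removes the effect of geometric deformation before the hash is generated: the image is first normalized so that it becomes geometrically invariant, DCT feature coefficients are then extracted, and the final hash is produced by quantization and encoding. The algorithm can resist arbitrary affine transformations.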
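The DCT, quantization, and encoding steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the geometric normalization step is omitted entirely, and the function names (`dct_2d`, `dct_hash`, `hamming`), the choice of a 4x4 low-frequency block, and the median threshold are all assumptions made for the example.

```python
import math

def dct_2d(block):
    """Naive orthonormal 2D DCT-II of a square matrix (list of lists)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    # cos_table[u][x] = cos((2x + 1) * u * pi / (2n))
    cos_table = [[math.cos((2 * x + 1) * u * math.pi / (2 * n))
                  for x in range(n)] for u in range(n)]
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += block[x][y] * cos_table[u][x] * cos_table[v][y]
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def dct_hash(image, keep=4):
    """Hash = sign of low-frequency DCT coefficients relative to their median.

    Assumption: `image` is a small square grayscale matrix that has already
    been resized (and, in the abstract's method, geometrically normalized).
    """
    coeffs = dct_2d(image)
    # Keep the top-left keep x keep block and drop the DC term at index 0.
    low = [coeffs[u][v] for u in range(keep) for v in range(keep)][1:]
    median = sorted(low)[len(low) // 2]
    return [1 if c > median else 0 for c in low]

def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))
```

Two images would then be compared by the Hamming distance between their hashes. Because the DCT is linear and the threshold is a median, uniformly scaling the brightness leaves this hash unchanged; robustness to affine transformations would have to come from the normalization step that this sketch leaves out.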
6.
7.
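A video copy detection scheme based on robust hashing is proposed. Feature points are classified and the stable points that persist in the spatio-temporal domain are selected; local features are then constructed by differential computation over neighboring points. The multi-dimensional feature data are Hilbert-encoded, and the significant bits are selected as the detection hash code. To locate suspicious content accurately in the target video, a hash matching scheme is proposed that uses sequence similarity as the matching criterion, which improves matching precision. Experimental results show that the scheme has good detection performance and is suitable for copy detection of video content.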
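The idea of locating suspicious content by sequence similarity can be illustrated with a sliding-window match over per-frame hash codes. The abstract does not define its similarity measure, so this sketch assumes one: the fraction of matching bits over the whole window. The names `sequence_similarity` and `locate_copy` and the 8-bit code width are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")

def sequence_similarity(query, window, n_bits):
    """Fraction of matching bits between two equal-length code sequences."""
    total = sum(hamming(q, w) for q, w in zip(query, window))
    return 1.0 - total / (n_bits * len(query))

def locate_copy(query, target, n_bits=8):
    """Slide the query over the target; return (best offset, similarity)."""
    best_pos, best_sim = -1, -1.0
    for start in range(len(target) - len(query) + 1):
        window = target[start:start + len(query)]
        sim = sequence_similarity(query, window, n_bits)
        if sim > best_sim:
            best_pos, best_sim = start, sim
    return best_pos, best_sim
```

For example, with `target = [0x0F, 0xA5, 0x3C, 0x5A, 0xF0, 0x11]` and `query = [0x3C, 0x5B]`, the best match is at offset 2, where only one of the 16 bits differs. In practice a threshold on the similarity would decide whether the match counts as a copy.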
8.
Hashing is a common solution for content-based multimedia retrieval that encodes high-dimensional feature vectors into short binary codes. Previous work mainly focuses on the image hashing problem. However, these methods cannot be used directly for video hashing, because videos contain not only spatial structure within each frame but also temporal correlation between successive frames. Several researchers proposed to handle this by encoding extracted key frames, but such frame-based methods are time-consuming in real applications. Other researchers proposed to characterize a video by averaging the spatial features of its frames, so that existing hashing methods can be adopted. Unfortunately, this sort of “video” feature does not take the correlation between frames into consideration and may lose the temporal information. Therefore, in this paper, we propose a novel unsupervised video hashing framework based on a deep neural network, which performs video hashing by incorporating the temporal structure as well as the conventional spatial structure. Specifically, the spatial features of videos are obtained with a convolutional neural network, and the temporal features are established via long short-term memory. After that, a time series pooling strategy is employed to obtain a single feature vector for each video. The resulting spatio-temporal feature can be applied to many existing unsupervised hashing methods. Experimental results on two real datasets indicate that, by employing the spatio-temporal features, our hashing method significantly improves the performance of existing methods that use only spatial features, and obtains higher mean average precision than state-of-the-art video hashing methods.
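The last two stages of this pipeline, pooling per-frame features into one vector and then hashing it, can be sketched as below. Everything here is an assumption for illustration: the abstract does not specify its pooling strategy, so plain mean pooling stands in for it, and sign-of-random-projection hashing stands in for "an existing unsupervised hashing method". The per-frame features (from a CNN and LSTM in the paper) are taken as given inputs.

```python
import random

def mean_pool(frame_features):
    """Average per-frame feature vectors over time into one video vector."""
    n_frames = len(frame_features)
    dim = len(frame_features[0])
    return [sum(f[d] for f in frame_features) / n_frames for d in range(dim)]

def random_projection_hash(vector, n_bits=16, seed=0):
    """Binary code from the signs of projections onto random hyperplanes.

    Assumption: this is a generic locality-sensitive hash, swapped in as a
    stand-in; it is not the hashing method evaluated in the paper.
    """
    rng = random.Random(seed)  # fixed seed so all videos share hyperplanes
    bits = []
    for _ in range(n_bits):
        plane = [rng.gauss(0.0, 1.0) for _ in vector]
        dot = sum(p * v for p, v in zip(plane, vector))
        bits.append(1 if dot >= 0 else 0)
    return bits
```

A video would be encoded as `random_projection_hash(mean_pool(features))`, and two videos compared by the Hamming distance between their codes. Note that mean pooling over raw spatial features is exactly the baseline the abstract criticizes; the sketch only makes sense when the pooled features already carry temporal information.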
9.
10.
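Design of a Video Encryption Transform Filter Based on DirectShow (cited by: 4; self-citations: 0; citations by others: 4)
A video encryption Transform Filter is developed using DirectShow. The paper first introduces the principle and scheme of video encryption and analyzes the DirectShow architecture, then studies how to implement a video encryption Transform Filter with DirectShow, and gives experimental results and conclusions. The experiments show that the filter can encrypt real-time video quickly and effectively.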
11.
12.
13.
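A new method for video retrieval of human motion poses is proposed. The overall algorithm has two stages: typical pose learning and pose retrieval. First, spatio-temporal motion feature points of the human poses in the sample library are extracted as low-level motion features, with each pose corresponding to one set of spatio-temporal feature points; the gradients of the pixels in the spatio-temporal 3D neighborhood of each feature point are computed, and a gradient histogram is built for each pose. Second, an unsupervised clustering method groups the pose samples, and several typical poses are extracted according to semantic requirements. Finally, the clustering results are modeled with an EM-based Gaussian mixture model, forming a classifier for typical pose retrieval and completing the learning stage of pose modeling. Video retrieval of motion poses then matches poses in the input test video according to the maximum-probability matching criterion, achieving semantics-based pose retrieval. Extensive experiments on the Weizmann and KTH standard test video datasets show that the proposed method retrieves human motion poses accurately and effectively.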
14.
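With the rapid development of the Internet and wireless communication, it has become possible to obtain video data over the network in real time. Because transmission networks and receiving terminals are diverse, video streams must adapt to many different transmission, decoding, and display requirements, which gave rise to scalable video coding. This paper improves on scalable video coding by applying region-of-interest detection and the wavelet transform: regions of interest are detected in the video stream, and the regions of interest in the enhancement layer are coded with the wavelet transform. Because the wavelet transform has good spatial orientation selectivity, which matches the characteristics of human vision well, the method achieves good results.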
15.
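With the rapid development of the Internet and wireless communication, it has become possible to obtain video data over the network in real time. Because transmission networks and receiving terminals are diverse, video streams must adapt to many different transmission, decoding, and display requirements, which gave rise to scalable video coding. This paper improves on scalable video coding by applying region-of-interest detection and the wavelet transform: regions of interest are detected in the video stream, and the regions of interest in the enhancement layer are coded with the wavelet transform. Because the wavelet transform has good spatial orientation selectivity, which matches the characteristics of human vision well, the method achieves good results.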
16.
17.
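In the evaluation of video watermarking, robustness and transparency are two very important performance criteria that also constrain each other. Existing video watermarking algorithms trade one off against the other, and to guarantee transparency they usually cannot also ensure robustness. To solve this problem, a removable digital video watermarking algorithm is proposed. When the watermark is embedded, the embedding strength is not limited, which ensures the robustness of the watermark information; when the watermarked video is played, it passes through a watermark removal step, after which it is visually close or identical to the original video, satisfying transparency. The user must also supply a valid key, which ensures that the watermark is removed correctly and improves the security of the algorithm. Test results show that the algorithm is robust to resolution scaling, video re-encoding, and bit-rate compression.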
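The embed-strongly-then-remove-with-a-key idea can be illustrated with a toy example on a single row of 8-bit pixel values. This is only a sketch under stated assumptions and not the algorithm from the abstract: the watermark is a keyed pseudo-random pattern of ±`strength`, and modular arithmetic (wrap-around at 256) is assumed so that removal is exactly invertible. Python's `random` module is used for readability and is not cryptographically secure.

```python
import random

def _pattern(key, length, strength):
    """Keyed pseudo-random pattern of +strength / -strength values."""
    rng = random.Random(key)
    return [rng.choice((-strength, strength)) for _ in range(length)]

def embed(pixels, key, strength=60):
    """Add the keyed pattern to 8-bit pixels, wrapping modulo 256.

    The strength is deliberately large: visibility is not a concern here
    because the watermark is meant to be removed before playback.
    """
    pattern = _pattern(key, len(pixels), strength)
    return [(p + w) % 256 for p, w in zip(pixels, pattern)]

def remove(marked, key, strength=60):
    """Subtract the same keyed pattern; exact only with the embedding key."""
    pattern = _pattern(key, len(marked), strength)
    return [(m - w) % 256 for m, w in zip(marked, pattern)]
```

With the embedding key, `remove(embed(frame, key), key)` returns the original values exactly; with a different key the regenerated pattern will generally not match, so the output stays distorted. The exact restoration only holds before lossy processing; how the published algorithm handles re-encoded or rescaled video is not described in the abstract.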
18.
19.