1.

Phonetic segmentation using multiple speech features

Iosif Mporas Todor Ganchev Nikos Fakotakis 《International Journal of Speech Technology》2008,11(2):73-85

In this paper we propose a method for improving the segmentation of speech waveforms into phonetic units. The method is based on the well-known Viterbi time-alignment algorithm and uses the phonetic boundary predictions obtained from multiple speech parameterization techniques. Specifically, for each boundary type we take the most appropriate phone transition position prediction as the initial point from which Viterbi time-alignment predicts the successor phonetic boundary. The method was evaluated on the TIMIT database using several well-known Fourier-based and wavelet-based speech parameterization algorithms. For a tolerance of 20 milliseconds, the experimental results indicated an improvement in absolute segmentation accuracy of approximately 0.70% over the baseline speech segmentation scheme.
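
As an illustration of the Viterbi time-alignment step named in this abstract, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical matrix `loglik[t][s]` holding the log-likelihood of frame `t` under phone `s`, a strictly left-to-right phone sequence, and no transition penalties; the function name `viterbi_align` is likewise an assumption.

```python
import math


def viterbi_align(loglik):
    """Return the first frame index of each phone in a left-to-right alignment.

    loglik[t][s] is an assumed log-likelihood of frame t under phone s.
    Every phone must occupy at least one frame.
    """
    n_frames, n_phones = len(loglik), len(loglik[0])
    if n_frames < n_phones:
        raise ValueError("need at least one frame per phone")
    neg_inf = -math.inf
    # best[t][s]: score of the best path that ends in phone s at frame t
    best = [[neg_inf] * n_phones for _ in range(n_frames)]
    back = [[0] * n_phones for _ in range(n_frames)]
    best[0][0] = loglik[0][0]  # the path must start in the first phone
    for t in range(1, n_frames):
        for s in range(n_phones):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else neg_inf
            if advance > stay:
                best[t][s] = advance + loglik[t][s]
                back[t][s] = s - 1
            else:
                best[t][s] = stay + loglik[t][s]
                back[t][s] = s
    # Backtrace from the last phone at the last frame.
    s = n_phones - 1
    path = [s]
    for t in range(n_frames - 1, 0, -1):
        s = back[t][s]
        path.append(s)
    path.reverse()
    # A phone boundary is the first frame assigned to each phone.
    return [path.index(p) for p in range(n_phones)]
```

In the method described above, boundary predictions from different parameterizations would supply the starting point for aligning the next boundary; this sketch shows only the basic constrained alignment.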

2.

Steganography algorithms recognition based on match image and deep features verification

Xiaoyu Xu Yifeng Sun Jiang Wu Yi Sun 《Multimedia Tools and Applications》2018,77(21):27955-27979

Steganography algorithm recognition is a subfield of steganalysis. Analysis shows that when a steganalysis detector trained on one cover source is applied to images from an unseen source, detection performance generally decreases. To tackle this problem, this paper proposes a steganalytic scheme for steganography algorithm recognition. For a given test image, a match image is obtained by applying Gaussian filtering to the test image to remove any possible stego signal. The match image is then embedded with each of the recognized steganography algorithms. A CNN model trained on a training set is used to extract deep features from the test image and the match images. Similarity between features is computed with an inner product or a weighted χ² measure, and the final decision is made according to the similarity between the test feature and each class of match feature. The proposed scheme can also detect steganography algorithms that are absent from the training set. Experiments show that, compared with using the CNN model directly, the proposed scheme considerably improves test accuracy on images from an unseen source.

3.

Validation of phonetic transcriptions in the context of automatic speech recognition

Christophe Van Bael Henk van den Heuvel Helmer Strik 《Language Resources and Evaluation》2007,41(2):129-146

4.

A divide-and-conquer and integration algorithm for pairwise PPI network alignment

Xiao Liu Jing Chen Zixiang Wang 《CAAI Transactions on Intelligent Systems》2022,17(5):960-968

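Biological network alignment is an important means of analyzing evolutionary relationships between organisms: it can reveal functions conserved across species and provides important information for transferring annotations between species. Like subgraph isomorphism, network alignment is an NP-hard problem. This paper proposes a new biological network alignment algorithm based on a divide-and-conquer and integration strategy. The networks are first partitioned into modules, and module similarity is computed from existing alignment information; a candidate result set is then obtained from sub-alignments of the nodes between modules, and the final alignment is obtained by hypergraph matching. Using the collective behavior of existing alignment information to estimate similarity between modules greatly improves the efficiency of module matching. A scoring function based on paths and nodes ensures the similarity of nodes within a module. Similarity between nodes of different networks is judged both from the nodes themselves and from the differences between nodes. Compared with existing algorithms, the proposed algorithm performs best on both biological and topological measures.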

5.

Dynamic–static unsupervised sequentiality, statistical subunits and lexicon for sign language recognition

Stavros Theodorakis Vassilis Pitsikalis Petros Maragos 《Image and vision computing》2014

We introduce a new computational phonetic modeling framework for sign language (SL) recognition. This is based on dynamic–static statistical subunits and provides sequentiality in an unsupervised manner, without prior linguistic information. Subunit “sequentiality” refers to the decomposition of signs into two types of parts, varying and non-varying, that are sequentially stacked across time. Our approach is inspired by the Movement–Hold SL linguistic model that refers to such sequences. First, we segment signs into intra-sign primitives, and classify each segment as dynamic or static, i.e., movements and non-movements. These segments are then clustered appropriately to construct a set of dynamic and static subunits. The dynamic/static discrimination allows us to employ different visual features for clustering the dynamic and the static segments. Sequences of the generated subunits are used as sign pronunciations in a data-driven lexicon. Based on this lexicon and the corresponding segmentation, each subunit is statistically represented and trained on multimodal sign data as a hidden Markov model. In the proposed approach, dynamic/static sequentiality is incorporated in an unsupervised manner. Further, handshape information is integrated in a parallel hidden Markov modeling scheme. The novel sign language modeling scheme is evaluated in recognition experiments on data from three corpora and two sign languages: Boston University American SL, which is used pre-segmented at the sign level, Greek SL Lemmas, and American SL Large Vocabulary Dictionary, including both signer-dependent and unseen-signer testing. Results show consistent improvements when compared with other approaches, demonstrating the importance of dynamic/static structure in sub-sign phonetic modeling.

6.

An attribute alignment method based on multiple similarity measures and set encoding

Jiahao Wu Bo Chen Xianpei Han Le Sun 《Journal of Chinese Information Processing》2021,35(4):35-43

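The goal of attribute alignment is to discover correspondences between attributes that denote the same concept in heterogeneous knowledge graphs; it is one of the key technologies for cross-graph knowledge fusion. Existing models usually align attributes with rule-based and word-embedding-based methods, but these methods still suffer from two problems: the similarity measures are not comprehensive, and attribute instance information is not fully exploited. To address these problems, this paper proposes an attribute alignment model based on multiple similarity measures, designing similarity measures from multiple perspectives to obtain...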

7.

Entity alignment for Chinese heterogeneous encyclopedia knowledge bases

Junfu Huang Tianrui Li Zhen Jia Yunge Jing Tao Zhang 《Journal of Computer Applications》2016,36(7):1881-1886

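To address the limited effectiveness of traditional entity alignment methods on Chinese heterogeneous online-encyclopedia entity alignment, an entity alignment method that combines entity attributes with contextual topic features is proposed. First, a Chinese heterogeneous encyclopedia knowledge base is built from Baidu Baike and Hudong Baike data, and a Resource Description Framework Schema (RDFS) vocabulary is constructed by statistical methods to normalize entity attributes. Second, entity context is extracted and segmented into Chinese words; the context is modeled with a topic model whose parameters are estimated by Gibbs sampling, yielding the topic-word probability matrix, from which the feature word set and the corresponding feature matrix are extracted. Third, the Longest Common Subsequence (LCS) algorithm is used to assess entity attribute similarity; when the similarity lies between a lower and an upper bound, the contextual topic features of the encyclopedia entities are additionally used for the decision. Finally, a heterogeneous Chinese encyclopedia entity alignment dataset is constructed following a standard method and used for simulation experiments. Compared with the classical attribute similarity algorithm, the attribute weighting algorithm, the context term-frequency feature model and the topic model algorithm, the proposed algorithm achieves precision, recall and F-score of 97.8%, 88.0% and 92.6% in the people domain and 98.6%, 73.0% and 83.9% in the film and television domain, a substantial improvement over the other methods. The results confirm that, when building a Chinese heterogeneous encyclopedia knowledge base, the proposed algorithm effectively improves Chinese encyclopedia entity alignment and can be applied to entity alignment tasks where contextual information is available.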
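
The LCS-based attribute comparison with a lower and an upper bound can be sketched as follows. This is an illustrative sketch only: the normalization (twice the LCS length divided by the total length), the threshold values and the function names are assumptions, since the abstract does not specify them.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for item_a in a:
        curr = [0]
        for j, item_b in enumerate(b, start=1):
            if item_a == item_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a, b):
    """Assumed normalization: 2 * LCS / (len(a) + len(b)), in [0, 1]."""
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def decide(a, b, lower=0.3, upper=0.8):
    """Three-way decision; the bounds are hypothetical example values."""
    sim = lcs_similarity(a, b)
    if sim >= upper:
        return "match"
    if sim <= lower:
        return "non-match"
    # Ambiguous zone: the abstract defers to contextual topic features here.
    return "check-context"
```

Only pairs in the ambiguous zone would need the more expensive topic-feature comparison, which is presumably the point of using two bounds.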

8.

A decoding algorithm for stochastic segment models incorporating phoneme string edit distance

Hao Chao 《Computer Engineering and Applications》2015,51(6):208-211

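The acoustically optimal path found during decoding carries important evidence about whether the current path is correct; on this basis, a decoding optimization method for stochastic segment model systems is proposed. A context-dependent phoneme string edit distance model is trained to measure accurately how similar the current path is to the acoustically optimal path, and the phoneme string edit distance is added to the total path score during N-best rescoring. Continuous speech recognition experiments on the "863-test" test set show a relative reduction of 8.1% in the Chinese character error rate. The results demonstrate the feasibility of applying phoneme string edit distance to stochastic segment models.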
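
To make the rescoring idea concrete, here is a minimal sketch that adds an edit-distance penalty to N-best scores. It substitutes plain unit-cost Levenshtein distance for the trained context-dependent edit distance model described in the abstract; the `weight` parameter, the data layout and the function names are assumptions.

```python
def edit_distance(ref, hyp):
    """Unit-cost Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1]


def rescore(nbest, acoustic_best, weight=1.0):
    """Pick the hypothesis with the best penalized score.

    nbest is a list of (phoneme_list, score) pairs with higher scores better;
    acoustic_best is the phoneme string of the acoustically optimal path.
    """
    return max(
        nbest,
        key=lambda item: item[1] - weight * edit_distance(acoustic_best, item[0]),
    )
```

A hypothesis that drifts far from the acoustically optimal phoneme string is penalized, so a slightly lower-scoring but acoustically closer hypothesis can win.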

9.

D. Fuentes R. Bardeli J.A. Ortega L. Gonzalez-Abril 《Expert systems with applications》2012,39(11):10278-10282

10.

A survey of spatial registration algorithms for underwater optical and acoustic images

Yinjing Guo Xinrui Ma Yuecheng Xu Fang Kong Wenhong Lyu 《Computer Engineering and Applications》2023,59(5):14-27

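Registration of underwater optical and acoustic images is a key technology for information fusion in underwater equipment. After briefly introducing the concept of underwater optical-acoustic image registration with examples, the paper analyzes current algorithms for reconstructing and restoring underwater optical and acoustic images, and reviews in detail the progress of region-based and feature-based registration algorithms for underwater multi-source images, focusing on the state of two higher-accuracy research directions: those based on the image domain and those based on shape feature similarity. Drawing on research hotspots in multi-source image registration in other fields, it then outlines development trends for underwater acoustic-optical image registration, in terms of improving registration accuracy by adding structural constraints to the imaging model and by introducing algorithms such as phase congruency and generative adversarial networks.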

11.

Scene Detection in Videos Using Shot Clustering and Sequence Alignment

《Multimedia, IEEE Transactions on》2009,11(1):89-100

Video indexing requires the efficient segmentation of video into scenes. The video is first segmented into shots and a set of key-frames is extracted for each shot. Typical scene detection algorithms incorporate time distance in a shot similarity metric. To avoid requiring prior knowledge of the scene duration, the method we propose clusters the shots into groups based only on their visual similarity and assigns each shot a label according to the group it belongs to. Then, a sequence alignment algorithm is applied to detect when the pattern of shot labels changes, providing the final scene segmentation result. In this way shot similarity is computed based only on visual features, while the ordering of shots is taken into account during sequence alignment. To cluster the shots into groups we propose an improved spectral clustering method that both estimates the number of clusters and employs the fast global k-means algorithm in the clustering stage after the eigenvector computation of the similarity matrix. The same spectral clustering method is applied to extract the key-frames of each shot, and numerical experiments indicate that the content of each shot is efficiently summarized using the method we propose herein. Experiments on TV series and movies also indicate that the proposed scene detection method accurately detects most of the scene boundaries while preserving a good tradeoff between recall and precision.

12.

Automated definition of phonetically homogeneous sections of words in a natural language based on multiparameter optimization

O. N. Korsun A. V. Poliev 《Journal of Computer and Systems Sciences International》2016,55(4):609-618

An approach to the automated splitting of words into phonetically homogeneous parts is proposed in which the boundaries of the parts are found by solving a multiparameter optimization problem. The approach is intended to ensure the maximum difference in phonetic material between adjacent parts and the maximum similarity within each part. The measure of similarity and difference is based on the correlation between the columns of the parametric portrait matrix of the word, which is generated by a time-spectral conversion of an audio recording of the word. To obtain a numerical solution, an algorithm is proposed that is a modification of a dynamic programming technique. Experimental results for several Russian words are presented as examples to confirm the validity of the assumptions made and the viability of the proposed algorithms.
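
The general idea of finding segment boundaries by dynamic programming can be sketched as below. This is not the authors' algorithm: it assumes a one-dimensional feature sequence, a known number of parts `k`, and within-part squared deviation as the homogeneity cost in place of the correlation-based measure described in the abstract.

```python
def segment(values, k):
    """Split values into k contiguous parts with minimal within-part scatter.

    Returns the interior boundary indices (start index of parts 2..k).
    """
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")
    # Prefix sums give the cost of any part in constant time.
    s1, s2 = [0.0], [0.0]
    for v in values:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        """Sum of squared deviations from the mean over values[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting values[:j] into m parts
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                candidate = best[m - 1][i] + cost(i, j)
                if candidate < best[m][j]:
                    best[m][j] = candidate
                    cut[m][j] = i
    # Walk back through the stored cut points to recover the boundaries.
    bounds = []
    j = n
    for m in range(k, 0, -1):
        j = cut[m][j]
        bounds.append(j)
    bounds.reverse()
    return bounds[1:]  # drop the leading 0
```

For a spectral representation, each element of `values` would be a column vector and `cost` would be replaced by a dissimilarity defined on those columns.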

13.

Fuzzy query based on Chinese Pinyin and its application in a library management system

Xiaoqian Jin Jing Yang 《Computer Applications and Software》2011,28(5)

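In the query function of a library management system, fuzzy query can make querying and management across the whole system simpler and more efficient. Pinyin-based fuzzy query of Chinese characters is implemented through region-position codes (区位码); the paper presents the basic idea, an example implementation of the algorithm and usage examples to show how Pinyin fuzzy query can be realized and to overcome the drawbacks of fuzzy query over Chinese characters.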

14.

A thinning method for International Phonetic Alphabet image characters

Xiaokun Sun Jifeng Huang 《Journal of Graphics》2018,39(2):214

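Thinning the characters in an image helps to emphasize their shape and to reduce redundant information, and has important applications in character recognition. After analyzing traditional thinning algorithms, and in response to the distortion and incomplete thinning they produce, a thinning method for International Phonetic Alphabet (IPA) image characters is proposed. The algorithm classifies and marks the edges of the character region, checks whether each marked point satisfies the removal condition, and then removes edge pixels step by step until the strokes of the IPA image characters are one pixel wide. Experiments on IPA image characters show that the algorithm thins them accurately and is simple and efficient.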

15.

A comparison of standard spell checking algorithms and a novel binary neural approach (total citations: 1, self-citations: 0, citations by others: 1)

Hodge V.J. Austin J. 《Knowledge and Data Engineering, IEEE Transactions on》2003,15(5):1073-1081

In this paper, we propose a simple, flexible, and efficient hybrid spell checking methodology based upon phonetic matching, supervised learning, and associative matching in the AURA neural system. We integrate Hamming Distance and n-gram algorithms that have high recall for typing errors and a phonetic spell-checking algorithm in a single novel architecture. Our approach is suitable for any spell checking application, though it is aimed at isolated word error correction, particularly spell checking user queries in a search engine. We use a novel scoring scheme to integrate the retrieved words from each spelling approach and calculate an overall score for each matched word. From the overall scores, we can rank the possible matches. We evaluate our approach against several benchmark spell-checking algorithms for recall accuracy. Our proposed hybrid methodology has the highest recall rate of the techniques evaluated. The method has a high recall rate and low computational cost.
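
The two string measures named in this abstract, Hamming distance and n-gram matching, can be illustrated with the short sketch below. It is a plain-Python illustration, not the AURA binary neural implementation; the Dice-coefficient scoring over character bigrams and all function names are assumptions, and the paper's own scoring scheme is not reproduced.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def ngrams(word, n=2):
    """Set of character n-grams of a word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def ngram_similarity(a, b, n=2):
    """Dice coefficient over n-gram sets (an assumed scoring choice)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0 if a == b else 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def rank_candidates(query, lexicon, n=2):
    """Order lexicon words from most to least similar to the query."""
    return sorted(lexicon, key=lambda w: ngram_similarity(query, w, n), reverse=True)
```

Hamming distance suits substitution errors in words of unchanged length, while n-gram overlap tolerates insertions and deletions, which is one reason for combining the two.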

16.

Robust factorization (total citations: 3, self-citations: 0, citations by others: 3)

Aanaes H. Fisker R. Astrom K. Carstensen J.M. 《IEEE transactions on pattern analysis and machine intelligence》2002,24(9):1215-1225

Factorization algorithms for recovering structure and motion from an image stream have many advantages, but they usually require a set of well-tracked features. Such a set is generally not available in practical applications. There is thus a need for making factorization algorithms deal effectively with errors in the tracked features. We propose a new and computationally efficient algorithm for applying an arbitrary error function in the factorization scheme. This algorithm enables the use of robust statistical techniques and arbitrary noise models for the individual features. These techniques and models enable the factorization scheme to deal effectively with mismatched features, missing features, and noise on the individual features. The proposed approach further includes a new method for Euclidean reconstruction that significantly improves convergence of the factorization algorithms. The proposed algorithm has been implemented as a modification of the Christy-Horaud factorization scheme, which yields a perspective reconstruction. Based on this implementation, a considerable increase in error tolerance is demonstrated on real and synthetic data. The proposed scheme can, however, be applied to most other factorization algorithms.

17.

Discriminative joint multi-manifold analysis for video-based face recognition

Qian Yu Yang Gao Jing Huo Yunkai Zhuang 《Journal of Software》2015,26(11):2897-2911

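Video-based face recognition is recast as an image set recognition problem, and two kinds of manifold are proposed to represent each image set: an inter-class manifold, which represents the mean-face information of each image set, and an intra-class manifold, which represents the information of all the original images in each image set. The inter-class manifold extracts global discriminative information from the differences between image sets and is used to select a few candidate image sets that are similar to the image set to be recognized. The intra-class manifold considers the relationships among the original images within an image set and is responsible for finding the most similar one among the candidates. Unlike existing nonlinear manifold methods, in which each image corresponds to one point on the manifold, a patch technique is used to learn the projection matrices of the two manifolds, with each patch corresponding to one point on the manifold. The learned features are more discriminative, which makes the manifold boundaries clearer and also resolves the angle bias and insufficient sampling problems of traditional nonlinear manifold methods. A distance measure between manifolds that matches the patch technique is also proposed. Finally, experiments on several widely studied datasets show that the new method achieves high recognition accuracy, is especially suited to video recognition in uncontrolled environments, and is not affected by the length of the video segment.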

18.

An improved model for computing ontology concept similarity

Jiamin Yao Sichun Yang 《Journal of Computer Applications》2013,33(6):1579-1586

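Ontology mapping is an effective way to resolve ontology heterogeneity in the Semantic Web, and its core is the computation of similarity between ontology concepts. Because existing concept similarity computations have low accuracy and precision, an improved concept similarity model is proposed. A formal context and a concept lattice are first built from the partial order among ontology features; the set of meet-irreducible elements between concepts is then obtained at the structural level, and concept similarity is computed by quantifying the semantic relationships among the elements of that set. Examples and analysis show that the improved model yields a clear gain in F-score.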

19.

Semi-supervised image clustering with multi-modal information

Jianqing Liang Yahong Han Qinghua Hu 《Multimedia Systems》2016,22(2):149-160

How to organize and retrieve images is now a great challenge in various domains. Image clustering is a key tool in some practical applications including image retrieval and understanding. Traditional image clustering algorithms consider a single set of features and use ad hoc distance functions, such as Euclidean distance, to measure the similarity between samples. However, multi-modal features can be extracted from images, and the dimension of multi-modal data is very high. In addition, usually only a few labeled images are available, which calls for semi-supervised learning. In this paper, we propose a framework of image clustering based on semi-supervised distance learning and multi-modal information. First we fuse multiple features and use a small number of labeled images for semi-supervised metric learning. Then we compute similarity with the Gaussian similarity function and the learned metric. Finally, we construct a semi-supervised Laplace matrix for spectral clustering and propose an effective clustering method. Extensive experiments on several image data sets show the competent performance of the proposed algorithm.

20.

Automatic phonetic transcription of large speech corpora

Christophe Van Bael Lou Boves Henk van den Heuvel Helmer Strik 《Computer Speech and Language》2007,21(4):652-668
