Found 20 similar documents (search time: 125 ms)
1.
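With the development of bioinformatics, motif discovery has become a method for extracting useful biological information from biological sequences. This paper introduces basic concepts related to motifs and discusses the EM (expectation maximization) algorithm, which is the foundation of the motif discovery algorithm MEME. Because MEME is built on EM, the paper then introduces MEME itself, discusses in detail basic issues such as its time complexity and performance, and describes its limitations and areas for improvement. Practice shows that MEME is a good motif discovery algorithm: it can identify one or more motifs in protein or DNA sequences and is highly flexible.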
2.
3.
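Managing biological data effectively and providing efficient query methods is an important research topic in biological information processing. BioSeg is a new data model for biological sequences, and query optimization is one of the key aspects of developing a biological database management system. After studying current indexing techniques for biological data, the authors design a new biological sequence index, BioIndex, tailored to the characteristics of the BioSeg data model and the needs of sequence similarity queries, together with a corresponding query algorithm. First, the MEME (Multiple EM for Motif Elicitation) algorithm is used to mine sequence patterns from the biological sequence set; these patterns serve as the index and form an index sequence library. Next, the index sequence most similar to the query sequence is located in the library, and its associated sequence set is taken as the candidate set. Finally, the candidate set is searched for the sequence most similar to the query sequence. Experiments on real biological sequence data sets show that the query algorithm using BioIndex improves the efficiency of sequence queries.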
4.
5.
6.
7.
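A Gaussian mixture model classification method based on an online split-and-merge EM algorithm* Total citations: 2, self-citations: 1, citations by others: 1
Traditional expectation-maximization (EM) training of Gaussian mixture models cannot begin until a sufficient number of samples is available. To address this problem, a new online incremental training algorithm for Gaussian mixture models is proposed. Building on the Split-and-Merge EM method of Ueda et al., the algorithm improves the computation of the split and merge criteria, which effectively avoids local extrema and reduces the occurrence of singular values. By introducing a time-series parameter, an incremental EM training method is proposed that carries out expectation maximization incrementally, so that the GMM parameters can be updated online, sample by sample. Experiments on synthetic data and on a real speech recognition application show that the algorithm achieves good computational efficiency and classification accuracy.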
8.
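To overcome the local convergence problem of differential evolution, a migration strategy is proposed that simulates the migration mechanism of nomadic peoples. Combining this strategy with differential evolution yields a new paradigm, the migration differential evolution algorithm, which uses ensemble techniques to exploit the strengths of various differential evolution algorithms and improve global search ability. Experiments on motif discovery in biological sequences verify the effectiveness of the algorithm.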
9.
10.
11.
Unsupervised Learning of Multiple Motifs in Biopolymers Using Expectation Maximization Total citations: 36, self-citations: 0, citations by others: 36
The MEME algorithm extends the expectation maximization (EM) algorithm for identifying motifs in unaligned biopolymer sequences. The aim of MEME is to discover new motifs in a set of biopolymer sequences where little or nothing is known in advance about any motifs that may be present. MEME innovations expand the range of problems which can be solved using EM and increase the chance of finding good solutions. First, subsequences which actually occur in the biopolymer sequences are used as starting points for the EM algorithm to increase the probability of finding globally optimal motifs. Second, the assumption that each sequence contains exactly one occurrence of the shared motif is removed. This allows multiple appearances of a motif to occur in any sequence and permits the algorithm to ignore sequences with no appearance of the shared motif, increasing its resistance to noisy data. Third, a method for probabilistically erasing shared motifs after they are found is incorporated so that several distinct motifs can be found in the same set of sequences, both when different motifs appear in different sequences and when a single sequence may contain multiple motifs. Experiments show that MEME can discover both the CRP and LexA binding sites from a set of sequences which contain one or both sites, and that MEME can discover both the –10 and –35 promoter regions in a set of E. coli sequences.
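As a rough illustration of the baseline that this abstract says MEME extends, the sketch below runs plain EM under the "exactly one occurrence per sequence" assumption, started from a subsequence that actually occurs in the data. It is not MEME itself: the function names, the uniform background model, the seed strength of 0.7, and the pseudocount are all assumptions made for this example, and the relaxed occurrence model and probabilistic erasing are not shown.

```python
ALPHABET = "ACGT"

def pwm_from_seed(seed, strength=0.7):
    """Initial position weight matrix built from a subsequence of the data.
    The seed letter gets probability `strength` (an assumed value)."""
    rest = (1.0 - strength) / (len(ALPHABET) - 1)
    return [{c: (strength if c == ch else rest) for c in ALPHABET} for ch in seed]

def em_one_per_sequence(seqs, seed, iters=30, pseudo=0.1):
    """Plain EM for a motif of width len(seed), assuming exactly one
    occurrence per sequence and a uniform background (both assumptions)."""
    w = len(seed)
    pwm = pwm_from_seed(seed)
    background = 1.0 / len(ALPHABET)
    for _ in range(iters):
        # Expected letter counts per motif column, started at a pseudocount.
        counts = [{c: pseudo for c in ALPHABET} for _ in range(w)]
        for s in seqs:
            # E-step: posterior probability of each start position,
            # proportional to the motif/background likelihood ratio.
            ratios = []
            for i in range(len(s) - w + 1):
                lr = 1.0
                for j in range(w):
                    lr *= pwm[j][s[i + j]] / background
                ratios.append(lr)
            total = sum(ratios)
            for i, lr in enumerate(ratios):
                z = lr / total
                for j in range(w):
                    counts[j][s[i + j]] += z
        # M-step: renormalise the expected counts into probabilities.
        pwm = [{c: col[c] / sum(col.values()) for c in ALPHABET} for col in counts]
    return pwm

def consensus(pwm):
    """Most probable letter in each motif column."""
    return "".join(max(col, key=col.get) for col in pwm)

if __name__ == "__main__":
    seqs = ["GCGCTATAATGCGC", "CCGGTATAATCCGG", "GGCCGTATAATGCC", "CGCGCGTATAATCG"]
    print(consensus(em_one_per_sequence(seqs, "TATAAT")))
```

In a fuller treatment one would try many candidate seeds taken from the sequences and keep the best-scoring result, which is the idea behind the first innovation listed in the abstract.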
12.
13.
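Comparison of the EM algorithm and the K-Means algorithm Total citations: 1, self-citations: 0, citations by others: 1
Clustering is one of the most widely used basic data mining methods. It divides data into clusters according to similarity and dissimilarity, so that items in the same cluster are as similar as possible and items in different clusters are as different as possible. Many clustering algorithms exist; this paper examines only two commonly used partitioning methods, the EM algorithm and the K-Means algorithm, with a detailed analysis of EM and an analysis of the experimental results. It concludes with a summary and discussion of the algorithms.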
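To make the contrast between the two methods concrete, here is a minimal one-dimensional sketch, not taken from the paper. K-Means assigns each point to exactly one cluster, whereas EM for a Gaussian mixture assigns fractional responsibilities and also estimates variances and mixing weights. The function names, the initial unit variances, and the variance floor are assumptions for this example.

```python
import math

def kmeans_1d(xs, centers, iters=20):
    """K-Means on 1-D data: hard assignment to the nearest centre."""
    centers = list(centers)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in xs:
            nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
            groups[nearest].append(x)
        # Keep the old centre if a cluster received no points.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def em_gmm_1d(xs, means, iters=50):
    """EM for a 1-D Gaussian mixture: soft (fractional) assignment."""
    means = list(means)
    k = len(means)
    variances = [1.0] * k      # assumed starting variances
    weights = [1.0 / k] * k    # assumed equal starting weights
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate parameters from the weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
            weights[j] = nj / len(xs)
    return means, variances, weights

if __name__ == "__main__":
    data = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
    print(kmeans_1d(data, [0.0, 6.0]))
    print(em_gmm_1d(data, [0.0, 6.0]))
```

On well-separated data like this the two methods find essentially the same centres; they differ more when clusters overlap, because the soft responsibilities let every point influence every component.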
14.
Ali Karci 《Expert systems with applications》2009,36(4):7952-7963
Objective: This paper presents an algorithm for the solution of the motif discovery problem (MDP). Methods and materials: The motif discovery problem can be considered in two cases: motifs with insertions/deletions, and motifs without insertions/deletions. Motifs in the first group can be found by stochastic and approximate methods. Motifs in the second group can be found by stochastic and approximate methods, but also by a deterministic method. We proved that motifs in the second group can be found with a deterministic algorithm, so finding them is a P-type problem. Results and conclusions: An algorithm was proposed in this paper for the motif discovery problem. The proposed algorithm finds all motifs that occur in the sequence at least twice, and it also finds motifs of various sizes. For this reason, the algorithm is regarded as an Automatic Exact Motif Discovery Algorithm. All motifs of different sizes can be found with this algorithm, as proven in this paper. It is shown that automatic exact motif discovery is a P-type problem. Application of the proposed algorithm shows that it is superior to MEME, MEME3, Motif Sampler, WEEDER, CONSENSUS, and AlignACE.
15.
Hiroyuki Okamura, Tadashi Dohi, Kishor S. Trivedi 《Performance Evaluation》2011,68(10):938-954
This paper proposes an improved computation method of maximum likelihood (ML) estimation for phase-type (PH) distributions with a number of phases. We focus on the EM (expectation-maximization) algorithm proposed by Asmussen et al. [27] and refine it in terms of time complexity. Two ideas behind our method are a uniformization-based procedure for computing a convolution integral of the matrix exponential and an improvement of the forward-backward algorithm using time intervals. Compared with the differential-equation-based EM algorithm discussed in Asmussen et al. [27], our approach succeeds in the reduction of computation time for the PH fitting with a moderate to large number of phases. In addition to the improvement of time complexity, this paper discusses how to estimate the canonical form by applying the EM algorithm. In numerical experiments, we examine computation times of the proposed and differential-equation-based EM algorithms. Furthermore, the proposed EM algorithm is also compared with the existing PH fitting methods in terms of computation time and fitting accuracy.
16.
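This paper studies how to evaluate the results of algorithms that predict motifs in protein families, with the aim of developing a new evaluation strategy based on an analysis and optimization of traditional algorithm prediction problems. The main approach is to compare the prediction results of the MEME and PKG algorithms, compute the sensitivity and specificity of the motifs within the same family, and compare their corresponding ROC curves in order to identify the true motifs and thereby obtain the best motif model for the protein family. Experimental results show that this evaluation strategy can be used to assess algorithms' motif predictions for protein families, and that the best motifs identified in this way can be used to search databases to predict other motifs in the family.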
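The sensitivity, specificity, and ROC quantities mentioned in this abstract can be sketched as follows. This uses the standard definitions only and is not the paper's evaluation procedure; the function names and the toy scores and labels are hypothetical.

```python
def sensitivity_specificity(predicted, actual):
    """Both arguments are parallel lists of booleans (True = real motif)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity

def roc_points(scores, labels):
    """One (false positive rate, true positive rate) point per distinct
    score threshold, from the strictest threshold to the loosest."""
    points = []
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        sens, spec = sensitivity_specificity(predicted, labels)
        points.append((1.0 - spec, sens))
    return points

if __name__ == "__main__":
    # Toy data: a score per candidate motif and whether it is a true motif.
    scores = [0.9, 0.8, 0.3, 0.1]
    labels = [True, False, True, False]
    print(roc_points(scores, labels))
```

Comparing two predictors then amounts to comparing their ROC point sets, where a curve closer to the top-left corner indicates a better trade-off.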
17.
Ching-Hung Lee, Fu-Kai Chang, Che-Ting Kuo, Hao-Hang Chang 《International journal of systems science》2013,44(2):231-247
This article introduces a novel hybrid evolutionary algorithm for recurrent fuzzy neural systems design in applications of nonlinear systems. The hybrid learning algorithm, IEMBP (improved electromagnetism-like (EM) algorithm with back-propagation (BP)), combines the advantages of the EM and BP algorithms and provides high-speed convergence, higher accuracy and less computational complexity (computation time in seconds). In addition, IEMBP needs only a small population to outperform the standard EM, which uses a larger population. For a recurrent neural fuzzy system, IEMBP simulates the 'attraction' and 'repulsion' of charged particles by treating each neural system parameter as a charged particle. The EM algorithm is modified so that competition selection is adopted and the random neighbourhood local search is replaced by BP without evaluations. Thus, the IEMBP algorithm combines the advantages of multi-point search, global optimisation and faster convergence. Finally, several illustrative examples for nonlinear systems are shown to demonstrate the performance and effectiveness of IEMBP.
18.
19.
20.
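Building on the EM algorithm, and given a fixed Bayesian network structure, this paper analyzes the Voting EM algorithm and uses it for online parameter learning in a Bayesian network for flood-control decision making. The learning results are compared with those of the EM algorithm. The results show that Voting EM not only supports online parameter learning but also achieves high learning accuracy.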