Similar literature
 17 similar articles found; search took 343 ms
1.
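Motifs play an important role in regulating gene expression at the transcriptional and post-transcriptional levels. Many motif-finding algorithms and corresponding software tools exist, but few studies or reports evaluate their performance side by side. This paper introduces the classification of the algorithms and three common motif-finding algorithms (Wordup, MM, and Gibbs sampling), and compares the performance of 13 motif-finding programs, including AlignACE, MEME, MotifSampler, and Weeder. From a study of biological significance and the performance comparison, it can be concluded that Weeder, being the only algorithm that takes the conserved core positions of a motif into account, performs comparatively well among the programs; most algorithms consider only simple and...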

2.
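Motif identification in biological sequences is a complex problem in bioinformatics today, and designing a single method that identifies all motifs is nearly impossible. To address this problem, statistical estimation is introduced into the immune genetic algorithm to improve the accuracy of motif identification. An immune replacement operator, designed according to the concentration and fitness probability of individuals, effectively maintains population diversity. Seeds are generated with the Gibbs Sampler algorithm, which speeds up the search of the immune genetic algorithm. The result is a motif identification algorithm for biological sequences based on the immune GA and the Gibbs Sampler; it exploits the strengths of both and strikes a good balance between computation speed and accuracy. Experiments show that the algorithm is effective.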

3.
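Research on an RBM training algorithm based on dynamic Gibbs sampling   Total citations: 2, self-citations: 0, citations by others: 2
Most current training algorithms for restricted Boltzmann machines (RBMs) are sampling algorithms built on multi-step Gibbs sampling. This paper addresses two problems that arise during multi-step Gibbs sampling: sampling divergence and slow training. First, the problems are described experimentally and given a concrete form. Then, the convergence of multi-step Gibbs sampling is analyzed theoretically from the viewpoint of Markov sampling, and it is proved that the poor convergence of multi-step Gibbs sampling in the early stage of RBM training is the main cause of both sampling divergence and slow training. Finally, a dynamic Gibbs sampling algorithm is proposed and comparative simulation experiments are presented. The results show that dynamic Gibbs sampling effectively overcomes sampling divergence and achieves higher training accuracy at the cost of a small increase in running time.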
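For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of a k-step block Gibbs chain in a binary RBM. It is not the paper's dynamic algorithm: the abstract does not say how the number of steps is varied, so `k` is simply a fixed argument here, and the names `gibbs_chain`, `W`, `b`, and `c` are illustrative assumptions.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sample_hidden(v, W, c, rng):
    # P(h_j = 1 | v) = sigmoid(c_j + sum_i v_i * W[i][j])
    hidden = []
    for j in range(len(c)):
        activation = c[j] + sum(v[i] * W[i][j] for i in range(len(v)))
        hidden.append(1 if rng.random() < sigmoid(activation) else 0)
    return hidden


def sample_visible(h, W, b, rng):
    # P(v_i = 1 | h) = sigmoid(b_i + sum_j h_j * W[i][j])
    visible = []
    for i in range(len(b)):
        activation = b[i] + sum(h[j] * W[i][j] for j in range(len(h)))
        visible.append(1 if rng.random() < sigmoid(activation) else 0)
    return visible


def gibbs_chain(v0, W, b, c, k, rng):
    # Alternate hidden and visible sampling k times, starting from v0.
    # W[i][j] couples visible unit i and hidden unit j (assumed layout).
    v = list(v0)
    for _ in range(k):
        h = sample_hidden(v, W, c, rng)
        v = sample_visible(h, W, b, rng)
    return v
```

In contrastive-divergence style training, the sample returned after `k` steps would supply the negative-phase statistics; a dynamic scheme would change `k` as training proceeds.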

4.
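Computer Engineering (《计算机工程》), 2017, (1): 13-19
To improve the computing and storage capacity of remote servers for mobile cloud data storage, an improved mobile cloud data storage algorithm is proposed. Using voting-based data allocation and a voting-based data processing framework, a resampling model is built for computing expected propagation time that accounts for node failure probability, and a voting dynamic network that integrates energy efficiency and fault tolerance is established. Estimation of distribution is used to optimize storage paths in the dynamic network model, and Gibbs sampling is applied to handle the high-dimensional coupling of sample data and the unsupervised training problem in distribution estimation. Experimental results show that, compared with the greedy algorithm, the random placement algorithm, and the estimation of distribution algorithm, the proposed algorithm achieves higher energy efficiency and storage reliability.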

5.
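Mining latent association rules among time series motifs, where a motif is a previously unknown pattern that recurs in a time series, can play an important role in predicting future trends. To address the information loss caused by extracting motifs from symbolized time series, a pruning-based motif extraction algorithm, PM_Motif, is proposed, which extracts motifs quickly and accurately while preserving the original information. To address the rule inconsistency caused by splitting a motif to discover its internal association rules, the work starts from association rules between motifs and presents a time series motif association rule mining algorithm based on the AR_TSM method, which fundamentally avoids the uncertainty introduced by motif splitting and guarantees rule consistency. Finally, an association rule evaluation parameter, RM, is introduced, and the predictive performance of the association rules is demonstrated on multiple datasets.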

6.
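Computational identification of transcription factor binding sites (TFBS, also called "motifs") is an attractive and challenging topic in bioinformatics. Gibbs sampling for motif identification is essentially a heuristic search method and is prone to getting stuck in local maxima that are not globally optimal. To address this, an improved Gibbs sampling strategy, YGMS (Yeast Gibbs Motif Sampler), is proposed to identify transcription factor binding sites in the regulatory regions of co-expressed genes in Saccharomyces cerevisiae. In tests on datasets of co-regulated yeast gene sequences, YGMS identified true motif sequences more effectively than several other Gibbs-sampling-based algorithms, improving performance to some extent.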
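To make the baseline concrete, here is a minimal sketch of a plain Gibbs site sampler for DNA motifs. It is not YGMS, whose improvements the abstract does not describe; it assumes exactly one site per sequence, a uniform background of 0.25 per base, and uppercase `ACGT` input. The function names are illustrative.

```python
import random

ALPHABET = "ACGT"


def build_profile(seqs, starts, width, skip, pseudo=1.0):
    # Column-wise base frequencies (with pseudocounts) from the current
    # sites of every sequence except the one at index `skip`.
    counts = [{a: pseudo for a in ALPHABET} for _ in range(width)]
    for idx, (seq, pos) in enumerate(zip(seqs, starts)):
        if idx == skip:
            continue
        for k in range(width):
            counts[k][seq[pos + k]] += 1
    profile = []
    for col in counts:
        total = sum(col.values())
        profile.append({a: col[a] / total for a in ALPHABET})
    return profile


def gibbs_motif_sampler(seqs, width, iterations, rng):
    # Start from random sites, then repeatedly resample one site at a time.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, seq in enumerate(seqs):
            profile = build_profile(seqs, starts, width, skip=i)
            weights = []
            for pos in range(len(seq) - width + 1):
                ratio = 1.0
                for k in range(width):
                    # Likelihood ratio against a uniform background (assumed).
                    ratio *= profile[k][seq[pos + k]] / 0.25
                weights.append(ratio)
            # Sample the new site in proportion to its likelihood ratio.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts
```

Because each site is sampled rather than chosen greedily, the chain can leave poor configurations, but as the abstract notes it may still settle near a local maximum; restarting from several random seeds is a common mitigation.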

7.
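To address the heavy computation and low efficiency of traditional methods that compute time windows by continuous tracking and sampling of satellite orbits, a new fast algorithm is proposed. To reduce the number of sampling points involved, the algorithm dynamically adjusts the sampling step by predicting the distance between the objects involved in the computation; to make the algorithm applicable to all kinds of time window computations, a generalized visibility concept is introduced for time window determination. Generalized visibility criteria and predicted-distance models are studied for four cases: visibility windows between a satellite and a ground point target, inter-satellite visibility windows, windows of satellite coverage of a ground target, and windows of satellite passes over a large ground area. Experimental results show that the algorithm matches the accuracy of the traditional algorithm exactly while improving efficiency by about 99.7%.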
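The idea of adapting the sampling step to a predicted distance can be sketched as follows. This is only an illustration under stated assumptions, not the paper's models: `margin(t)` is a hypothetical generalized-visibility function that is positive when the target is visible, and `max_rate` is an assumed upper bound on how fast `margin` can change per unit time.

```python
import math


def refine_crossing(margin, lo, hi, tol):
    # Bisection on the visibility state between two samples that disagree.
    lo_visible = margin(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (margin(mid) > 0) == lo_visible:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def find_windows(margin, t_start, t_end, max_rate, min_step=1e-3, tol=1e-6):
    # Take large steps while far from the visibility boundary and small
    # steps near it: step = |margin| / max_rate, floored at min_step.
    windows = []
    t = t_start
    m = margin(t)
    open_at = t if m > 0 else None
    while t < t_end:
        step = max(min_step, abs(m) / max_rate)
        t_next = min(t + step, t_end)
        m_next = margin(t_next)
        if (m > 0) != (m_next > 0):
            crossing = refine_crossing(margin, t, t_next, tol)
            if open_at is None:
                open_at = crossing          # window opens
            else:
                windows.append((open_at, crossing))  # window closes
                open_at = None
        t, m = t_next, m_next
    if open_at is not None:
        windows.append((open_at, t_end))
    return windows


if __name__ == "__main__":
    # Toy margin: visible whenever sin(t) exceeds 0.5.
    print(find_windows(lambda t: math.sin(t) - 0.5, 0.0, 2 * math.pi, 1.0))
```

With a valid `max_rate`, a step of `|margin| / max_rate` cannot jump across a boundary, so only the `min_step` floor can overshoot, and then by at most `min_step`.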

8.
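With the development of bioinformatics, motif identification has become a way to extract useful biological information from biological sequences. This paper introduces some concepts related to motifs and discusses the EM (expectation maximization) algorithm, which is the basis of the motif identification algorithm MEME. Because MEME is built on EM, the MEME algorithm is then introduced, and basic issues such as its time complexity and performance are discussed in detail, along with its limitations and areas for improvement. Practice shows that MEME is a good motif identification algorithm: it can identify single or multiple motifs in protein or DNA sequences and is highly flexible.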
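As a rough illustration of the EM basis mentioned here, the sketch below runs EM for the simplest motif model, in which every sequence contains exactly one occurrence of the motif. This is a simplification rather than MEME itself; the uniform background, the pseudocount, the seed weight of 0.7, and all function names are assumptions made for the example.

```python
ALPHABET = "ACGT"


def pwm_from_word(word, weight=0.7):
    # Seed the position weight matrix from a subsequence: the observed
    # letter gets `weight`, the other three letters share the remainder.
    rest = (1.0 - weight) / (len(ALPHABET) - 1)
    return [{a: (weight if a == ch else rest) for a in ALPHABET} for ch in word]


def e_step(seqs, pwm, background=0.25):
    # Posterior probability of each start position in each sequence.
    width = len(pwm)
    posteriors = []
    for seq in seqs:
        scores = []
        for pos in range(len(seq) - width + 1):
            ratio = 1.0
            for k in range(width):
                ratio *= pwm[k][seq[pos + k]] / background
            scores.append(ratio)
        total = sum(scores)
        posteriors.append([s / total for s in scores])
    return posteriors


def m_step(seqs, posteriors, width, pseudo=0.1):
    # Re-estimate the matrix from posterior-weighted letter counts.
    counts = [{a: pseudo for a in ALPHABET} for _ in range(width)]
    for seq, post in zip(seqs, posteriors):
        for pos, z in enumerate(post):
            for k in range(width):
                counts[k][seq[pos + k]] += z
    pwm = []
    for col in counts:
        total = sum(col.values())
        pwm.append({a: col[a] / total for a in ALPHABET})
    return pwm


def run_em(seqs, seed_word, iterations=20):
    pwm = pwm_from_word(seed_word)
    for _ in range(iterations):
        posteriors = e_step(seqs, pwm)
        pwm = m_step(seqs, posteriors, len(seed_word))
    return pwm, e_step(seqs, pwm)
```

Calling `run_em(seqs, word)` once for each candidate `word` taken from the input and keeping the best-scoring result is one way to reduce dependence on the starting point.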

9.
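To further improve the accuracy of vegetable recognition, a vegetable recognition algorithm based on Gibbs sampling and a residual convolutional neural network is proposed, named GiRAlexNet in this paper. An image probability model is built on the equivalence between Markov random fields and Gibbs random fields; Gibbs sampling is used to obtain the optimal set of sample points, and images are cropped at randomly chosen points. Experiments with the GoogleNet, ResNet, and AlexNet models show classification accuracy improvements of 9.22%, 3.34%, and 9.19%, respectively. Extensive experiments show that the GiRAlexNet algorithm reaches 98.14% accuracy in vegetable recognition.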

10.
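With the development of bioinformatics, motif identification has become a way to extract useful biological information from biological sequences. This paper introduces some concepts related to motifs and discusses the EM (expectation maximization) algorithm, which is the basis of the motif identification algorithm MEME. Because MEME is built on EM, the MEME algorithm is then introduced, and basic issues such as its time complexity and performance are discussed in detail, along with its limitations and areas for improvement. Practice shows that MEME is a good motif identification algorithm: it can identify single or multiple motifs in protein or DNA sequences and is highly flexible.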

11.
12.
The MEME algorithm extends the expectation maximization (EM) algorithm for identifying motifs in unaligned biopolymer sequences. The aim of MEME is to discover new motifs in a set of biopolymer sequences where little or nothing is known in advance about any motifs that may be present. MEME innovations expand the range of problems which can be solved using EM and increase the chance of finding good solutions. First, subsequences which actually occur in the biopolymer sequences are used as starting points for the EM algorithm to increase the probability of finding globally optimal motifs. Second, the assumption that each sequence contains exactly one occurrence of the shared motif is removed. This allows multiple appearances of a motif to occur in any sequence and permits the algorithm to ignore sequences with no appearance of the shared motif, increasing its resistance to noisy data. Third, a method for probabilistically erasing shared motifs after they are found is incorporated so that several distinct motifs can be found in the same set of sequences, both when different motifs appear in different sequences and when a single sequence may contain multiple motifs. Experiments show that MEME can discover both the CRP and LexA binding sites from a set of sequences which contain one or both sites, and that MEME can discover both the –10 and –35 promoter regions in a set of E. coli sequences.

13.
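Zhang Fei. Microcomputer Development (《微机发展》), 2011, (10): 171-175
This paper studies how to evaluate the results of algorithms that predict protein family motifs, with the aim of developing a new evaluation strategy based on an analysis and refinement of traditional algorithm prediction problems. The main approach compares the predictions of the MEME and PKG algorithms, computes the sensitivity and specificity of motifs within the same family, and compares their ROC curves to determine the true motifs and thereby obtain the best motif model for the protein family. Experimental results show that this evaluation strategy can be used to assess algorithm predictions of protein family motifs, and that the best motifs so determined can be used to search databases to predict other motifs in the protein family.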
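The two metrics named in this abstract follow directly from confusion counts. The sketch below uses the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), and traces ROC points by sweeping a score threshold; how the paper scores motifs is not stated, so the `scores` input is an assumption.

```python
def sensitivity_specificity(predicted, actual):
    # predicted and actual are equal-length sequences of booleans.
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(scores, actual):
    # One (false positive rate, true positive rate) point per distinct
    # threshold, from the strictest threshold to the loosest.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        sens, spec = sensitivity_specificity(predicted, actual)
        points.append((1.0 - spec, sens))
    return points
```

Curves that rise faster toward the top-left corner indicate a predictor that recovers more true motifs at a given false positive rate.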

14.
Gibbsian fields or Markov random fields are widely used in Bayesian image analysis, but learning Gibbs models is computationally expensive. The computational cost is especially pronounced for the recent minimax entropy (FRAME) models, which use large neighborhoods and hundreds of parameters. In this paper, we present a common framework for learning Gibbs models. We identify two key factors that determine the accuracy and speed of learning Gibbs models: the efficiency of likelihood functions and the variance in approximating partition functions using Monte Carlo integration. We propose three new algorithms. In particular, we are interested in a maximum satellite likelihood estimator, which makes use of a set of precomputed Gibbs models called "satellites" to approximate likelihood functions. This algorithm can approximately estimate the minimax entropy model for textures in seconds on an HP workstation. The performances of various learning algorithms are compared in our experiments.

15.
Objective: This paper presents an algorithm for solving the motif discovery problem (MDP). Methods and materials: The motif discovery problem can be considered in two cases: motifs with insertions/deletions and motifs without insertions/deletions. Motifs in the first group can be found by stochastic and approximate methods. Motifs in the second group can be found by stochastic and approximate methods, but also by a deterministic method. We prove that motifs in the second group can be found with a deterministic algorithm, so finding them is a P-type problem. Results and conclusions: An algorithm is proposed for the motif discovery problem. It finds all motifs that occur in the sequence at least twice, including motifs of various sizes. For this reason, the algorithm is referred to as the Automatic Exact Motif Discovery Algorithm. We prove that all motifs of different sizes can be found with this algorithm and show that automatic exact motif discovery is a P-type problem. In application, the proposed algorithm was shown to be superior to MEME, MEME3, Motif Sampler, WEEDER, CONSENSUS, and AlignACE.
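The claim that exact repeats (no insertions or deletions) can be found deterministically in polynomial time can be illustrated with a brute-force enumeration. The sketch below is not the authors' algorithm, which the abstract does not describe; `repeated_substrings` and its parameters are illustrative names.

```python
from collections import defaultdict


def repeated_substrings(sequence, min_len=2, max_len=None):
    # Map every substring that occurs at least twice to its start positions.
    n = len(sequence)
    if max_len is None:
        max_len = n - 1
    result = {}
    for width in range(min_len, max_len + 1):
        positions = defaultdict(list)
        for i in range(n - width + 1):
            positions[sequence[i:i + width]].append(i)
        repeated = {w: p for w, p in positions.items() if len(p) >= 2}
        if not repeated:
            # A repeated substring of length width + 1 would have a repeated
            # prefix of length width, so no longer repeats can exist.
            break
        result.update(repeated)
    return result
```

The loop examines at most n substrings for each of at most n widths, so the work is polynomial in the sequence length; a suffix array or suffix tree would do the same job more efficiently.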

16.
This paper deals with the wind speed prediction in wind farms, using spatial information from remote measurement stations. Owing to the temporal complexity of the problem, we employ local recurrent neural networks with internal dynamics, as advanced forecast models. To improve the prediction performance, the training task is accomplished using on-line learning algorithms based on the recursive prediction error (RPE) approach. A global RPE (GRPE) learning scheme is first developed where all adjustable weights are simultaneously updated. In the following, through weight grouping we devise a simplified method, the decoupled RPE (DRPE), with reduced computational demands. The partial derivatives required by the learning algorithms are derived using the adjoint model approach, adapted to the architecture of the networks being used. The efficiency of the proposed approach is tested on a real-world wind farm problem, where multi-step ahead wind speed estimates from 15 min to 3 h are sought. Extensive simulation results demonstrate that our models exhibit superior performance compared to other network types suggested in the literature. Furthermore, it is shown that the suggested learning algorithms outperform three gradient descent algorithms, in training of the recurrent forecast models.

17.
In recent years, energy efficiency has become an important topic, especially in the field of ultra-dense networks (UDNs). In this area, cell-association bias adjustment and small cell on/off are proposed to enhance the performance of energy efficiency in UDNs. This is done by changing the cell association relationship and turning off the extra small cells that have no users. However, the variety of cell association relationships and the switching on/off of the small cells may deteriorate some users’ data rates, leading to nonconformance to the users’ data rate requirement. Considering the discreteness and non-convexity of the energy efficiency optimization problem and the coupled relationship between cell association and scheduling during the optimization process, it is difficult to achieve an optimal cell-association bias. In this study, we optimize the network energy efficiency by adjusting the cell-association bias of small cells while satisfying the users’ data rate requirement. We propose an energy-efficient centralized Gibbs sampling based cell-association bias adjustment (CGSCA) algorithm. In CGSCA, global information such as channel state information, cell association information, and network load information needs to be collected. Then, considering the overhead of the messages that are exchanged and the implementation complexity of CGSCA to obtain the global information in UDNs, we propose an energy-efficient distributed Gibbs sampling based cell-association bias adjustment (DGSCA) algorithm with a lower message-exchange overhead and implementation complexity. Using DGSCA, we derive the updated formulas for calculating the number of users in a cell and the users’ SINR. We analyze the implementation complexities (e.g., computation complexity and communication complexity) of the two proposed algorithms and other existing algorithms. We perform simulations, and the results show that CGSCA and DGSCA have faster convergence speed, as well as a higher performance gain of the energy efficiency and throughput compared to other existing algorithms. In addition, we analyze the importance of the users’ data rate constraint in optimizing the energy efficiency, and we compare the energy efficiency performance of different algorithms with different numbers of small cells. Then, we present the number of sleeping small cells as the number of small cells increases.
