Similar literature
 Found 18 similar articles (search time: 125 ms)
1.
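A phylogenetic tree construction method based on a genetic algorithm   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper proposes a method for constructing phylogenetic trees with a genetic algorithm. First, the tree topology is encoded using postfix notation. Second, crossover and mutation operators are designed around the properties of phylogenetic trees, and the evaluation and selection strategies for individuals are defined, so that the genetic operations eventually find the optimal solution. Experimental results show that the algorithm produces phylogenetic trees whose topology agrees with that of the traditional UPGMA algorithm, and that, besides the tree with the best topology, it can also output several trees of similar quality.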
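The postfix encoding of a tree topology mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual encoding, so everything here is an assumption: trees are nested 2-tuples of taxon names, and the token `"+"` (a hypothetical marker) stands for an internal node that joins the two preceding subtrees.

```python
# Minimal sketch of a postfix (reverse Polish) encoding for binary tree topologies.
# Assumptions (not from the paper): leaves are strings, internal nodes are
# 2-tuples, and "+" is a hypothetical token marking an internal node.

JOIN = "+"

def to_postfix(tree):
    """Encode a nested-tuple binary tree as a flat list of postfix tokens."""
    if isinstance(tree, str):
        return [tree]
    left, right = tree
    return to_postfix(left) + to_postfix(right) + [JOIN]

def from_postfix(tokens):
    """Rebuild the nested-tuple tree from its postfix tokens using a stack."""
    stack = []
    for tok in tokens:
        if tok == JOIN:
            if len(stack) < 2:
                raise ValueError("join token without two subtrees")
            right = stack.pop()
            left = stack.pop()
            stack.append((left, right))
        else:
            stack.append(tok)
    if len(stack) != 1:
        raise ValueError("not a valid postfix encoding")
    return stack[0]

if __name__ == "__main__":
    topology = (("A", "B"), ("C", "D"))
    code = to_postfix(topology)
    print(code)
    print(from_postfix(code) == topology)
```

A flat encoding like this is convenient for genetic operators because a chromosome is a plain token list, although crossover and mutation must still be designed so that the result decodes to a valid tree.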

2.
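A repeat identification algorithm for DNA sequences based on an adaptive suffix tree   Cited by: 1 (self-citations: 0, citations by others: 1)
Most existing algorithms for identifying repeats in DNA sequences are alignment-based, which severely limits their speed and throughput. To address this, the paper starts from a definition of repeats that balances length against frequency and proposes RepSeeker, a fast repeat identification algorithm based on Ukkonen's suffix tree. The algorithm uses a minimum frequency threshold and extends the length of each repeat as far as possible. To further improve the efficiency of RepSeeker, Ukkonen's suffix tree construction algorithm is adapted: the node information RepSeeker needs is added during construction, and leaf nodes are distinguished from branching nodes, so that RepSeeker can obtain substring frequencies and positions by reading node information directly. This adaptation substantially improves the performance of RepSeeker at a modest space cost. The experiments use nine typical DNA sequences from NCBI as test data and compare the repeat identification algorithm before and after the suffix tree adaptation. The results show that RepSeeker shortens the running time without losing accuracy, and they agree with the theoretical analysis.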

3.
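Inspired by the travelling salesman problem (TSP), this paper proposes an ant colony algorithm for TSP-based phylogenetic tree construction (TSP-PTC). The algorithm represents the set of species as a weighted graph G and uses an ant colony algorithm to search the graph for an optimal path; the phylogenetic tree is then built from the optimal path and the distance matrix. The resulting phylogenetic tree is a weighted tree, so it represents not only the evolutionary relationships among species but also, roughly, the evolutionary time between them.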

4.
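To build a multicast tree, the proposed algorithm uses two ant colonies: one searches from the source node toward the destination nodes, and the other searches from the destination nodes toward the source node. While the ants search for paths, the pheromone update rule is modified according to the degree of influence of each QoS parameter, so that an optimal multicast tree satisfying multiple QoS constraints is built. The degree of influence of each QoS parameter is determined by orthogonal experimental design, with a suitable orthogonal array chosen according to the scale of the paths to be searched. Experiments show that the algorithm makes effective use of the QoS resources and reaches good solutions quickly.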

5.
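Uyghur is an agglutinative language, and when a generalized suffix tree is used to extract taxonomic relations between concepts, non-concept words appear in the tree. To address both issues, this paper proposes an improved generalized-suffix-tree-based algorithm for extracting taxonomic relations among compound-word concepts in Uyghur domain ontologies. The algorithm first builds a generalized suffix tree over the compound-word concepts of the Uyghur domain ontology and traverses it in preorder. It applies Uyghur stemming to the suffix words stored in the leaf nodes, deletes the leaf nodes that hold non-concept words, and merges the leaf nodes that denote the same concept after stemming, thereby pruning the generalized suffix tree. It then extracts the taxonomic relations among the compound-word concepts automatically. Experiments show that both precision and recall improve over the traditional generalized-suffix-tree-based extraction algorithm.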

6.
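Community mining in bipartite networks has attracted considerable attention in recent years. This paper proposes a community mining algorithm for bipartite networks based on a generalized suffix tree. First, the sequence of linked nodes for each node is extracted from the adjacency matrix of the bipartite network, and a generalized suffix tree is built from these sequences. Each node of the generalized suffix tree represents a complete bipartite clique (biclique) of the network, from which the bicliques are obtained and adjusted. An initial community partition is obtained by introducing a closeness measure for bicliques, and isolated nodes are then processed to yield the final partition. The proposed algorithm can detect overlapping communities as well as communities with one-to-many relations. Experiments on synthetic and real data sets show that the algorithm accurately identifies the number of communities in a bipartite network and achieves good partitions.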

7.
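To find repeated words and phrases when processing large bodies of text by computer, this paper presents two algorithms: a suffix-tree-based method and an inverted-index-based method. The first, the ST algorithm, uses a tree data structure in which each node represents one character and the root node is empty. The second applies an inverted index implemented with a hash table (HT). Both are run on the same sample, and the experimental results are compared in terms of time and space complexity. The conclusion is that the ST algorithm is better in time cost, whereas the inverted index method is better in space complexity.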
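The hash-table variant described in this abstract can be sketched as follows. The abstract does not specify the paper's implementation, so this is only an illustration under assumptions: the text is already tokenized, "phrases" are fixed-length n-grams, and the function name `repeated_ngrams` is hypothetical.

```python
# Minimal sketch: find repeated n-grams with a hash table that maps each
# n-gram to the list of positions where it starts (an inverted index).
# Assumptions (not from the paper): input is a token list, phrases have a
# fixed length n, and a phrase counts as repeated if it occurs at least twice.
from collections import defaultdict

def repeated_ngrams(tokens, n=2):
    """Return {ngram: [start positions]} for every n-gram occurring 2+ times."""
    positions = defaultdict(list)
    for i in range(len(tokens) - n + 1):
        positions[tuple(tokens[i:i + n])].append(i)
    return {gram: pos for gram, pos in positions.items() if len(pos) > 1}

if __name__ == "__main__":
    words = "a b c a b d a b".split()
    print(repeated_ngrams(words, n=2))
```

The table stores one position per n-gram occurrence, so its size grows linearly with the text for a fixed n. A suffix tree, in contrast, indexes repeats of every length at once.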

8.
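The evolutionary history of species is usually described as a rooted phylogenetic tree. When reticulate evolutionary events (such as hybridization, recombination, and horizontal gene transfer) occur, however, a tree is no longer an adequate description. Phylogenetic networks generalize phylogenetic trees: they also describe the evolutionary history of species, and they can represent reticulate events. Phylogenetic networks can also visualize conflicting data sets, such as species trees inferred from different genes. The study of phylogenetic networks is therefore an important area of bioinformatics. This paper introduces the concept, development, and current state of research on phylogenetic networks and summarizes the existing algorithms for constructing them.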

9.
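Zhang Yichao, Che Mei, Ma Jun. Computer Simulation (《计算机仿真》), 2007, 24(12): 97-100, 116
Efficiently finding the longest common substring (LCS) of two strings is key to many string algorithms. The paper first presents a dynamic programming algorithm and a generalized suffix tree algorithm for the LCS problem and analyzes both. The dynamic programming algorithm is easy to understand but has high time complexity; the generalized suffix tree algorithm has lower time complexity but is harder to implement, and the tree occupies more space. Finally, the paper proposes a new algorithm that uses the generalized suffix array of the two strings. It matches the time complexity of the generalized suffix tree approach while being simple to implement and using less space.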
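The dynamic programming baseline that this abstract describes as easy to understand but slow can be sketched as below. This is the textbook O(mn) formulation, not the paper's suffix-array algorithm, and the function name is an assumption.

```python
# Minimal sketch of the classic dynamic programming solution for the longest
# common substring. cur[j] holds the length of the longest common suffix of
# a[:i] and b[:j]; the best value seen anywhere is the answer.
# Time is O(len(a) * len(b)); space is O(len(b)) because only two rows are kept.

def longest_common_substring(a: str, b: str) -> str:
    best_len = 0
    best_end = 0  # exclusive end index of the best match inside a
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len = cur[j]
                    best_end = i
        prev = cur
    return a[best_end - best_len:best_end]

if __name__ == "__main__":
    print(longest_common_substring("banana", "ananas"))
```

When several common substrings share the maximum length, this sketch returns the one that ends earliest in the first string.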

10.
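Parallel construction algorithms for suffix trees   Cited by: 1 (self-citations: 0, citations by others: 1)
The suffix tree is a very important data structure with wide application in many areas of string processing. Constructing the suffix tree is the prerequisite for, and the key to, solving problems with it. Although many existing construction algorithms run in linear time and space, the time and space consumed are still enormous when the indexed string is very long, which severely limits the practical use of suffix trees. Parallelism is a good way to address this problem, and parallel suffix tree construction algorithms have been proposed accordingly. This paper surveys three parallel construction algorithms for suffix trees, compares and analyzes them systematically, summarizes the open problems, and points out directions for future research.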

11.
A suffix tree approach to anti-spam email filtering   Cited by: 1 (self-citations: 0, citations by others: 1)
We present an approach to email filtering based on the suffix tree data structure. A method for the scoring of emails using the suffix tree is developed and a number of scoring and score normalisation functions are tested. Our results show that the character level representation of emails and classes facilitated by the suffix tree can significantly improve classification accuracy when compared with the currently popular methods, such as naive Bayes. We believe the method can be extended to the classification of documents in other domains. Editor: Tom Fawcett

12.
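To address the construction of phylogenetic trees from morphological data with missing values, this paper proposes a construction method based on attribute reduction. First, an initial phylogenetic tree is built from prior knowledge and the data of the species that are relatively complete. Next, the principle of attribute reduction is used to obtain the decision points of the set of attribute decision groups, from which a prior decision model is built. Finally, the prior decision model determines where the species with a high proportion of missing data belong in the initial tree, and the phylogenetic tree is completed by grafting these species onto it. Experimental results show that when the proportion of missing data for a single species exceeds 10%, the average accuracy is about 10% higher than that of the maximum parsimony method.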

13.
Functional Trees   Cited by: 1 (self-citations: 0, citations by others: 1)
In the context of classification problems, algorithms that generate multivariate trees are able to explore multiple representation languages by using decision tests based on a combination of attributes. In the regression setting, model tree algorithms also explore multiple representation languages, but by using linear models at leaf nodes. In this work we study the effects of using combinations of attributes at decision nodes, leaf nodes, or both nodes and leaves in regression and classification tree learning. In order to study the use of functional nodes at different places and for different types of modeling, we introduce a simple unifying framework for multivariate tree learning. This framework combines a univariate decision tree with a linear function by means of constructive induction. Decision trees derived from the framework are able to use decision nodes with multivariate tests, and leaf nodes that make predictions using linear functions. Multivariate decision nodes are built when growing the tree, while functional leaves are built when pruning the tree. We experimentally evaluate a univariate tree, a multivariate tree using linear combinations at inner and leaf nodes, and two simplified versions restricting linear combinations to inner nodes and leaves. The experimental evaluation shows that all functional tree variants exhibit similar performance, with advantages in different datasets. In this study there is a marginal advantage of the full model. These results lead us to study the role of functional leaves and nodes. We use the bias-variance decomposition of the error, cluster analysis, and learning curves as tools for analysis. We observe that in the datasets under study and for classification and regression, the use of multivariate decision nodes has more impact on the bias component of the error, while the use of multivariate decision leaves has more impact on the variance component.

14.
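Clustering search results helps users quickly locate the information they are looking for. The method focuses on generating high-quality labels while clustering Chinese text. It retrieves the page titles and snippets returned by the search engine, segments the text with a word segmentation tool, and removes stop words. It then builds a single suffix tree, inserting the text into the tree nodes word by word, and scores the words at each node under constraints on term frequency, word length, part of speech, and position. Finally, it merges the base clusters and takes the highest-scoring node words as labels. Experimental results show that the method yields clusters of high purity and labels that are accurate and discriminative, which makes the results convenient for users.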

15.
Suffix trees are among the most important data structures in stringology, with a number of applications in flourishing areas like bioinformatics. Their main problem is space usage, which has triggered much research striving for compressed representations that are still functional. A smaller suffix tree representation could fit in a faster memory, outweighing by far the theoretical slowdown brought by the space reduction. We present a novel compressed suffix tree, which is the first to achieve at the same time sublogarithmic complexity for the operations and space usage that asymptotically goes to zero as the entropy of the text does. The main ideas in our development are compressing the longest common prefix information, totally getting rid of the suffix tree topology, and expressing all the suffix tree operations using range minimum queries and a novel primitive called next/previous smaller value in a sequence. Our solutions to those operations are of independent interest.
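The next/previous smaller value primitive named in this abstract can be made concrete with a small sketch. This is the plain stack-based linear scan over an uncompressed array, shown only to pin down what the primitive returns. It is not the paper's compressed, sublogarithmic-time structure, and the function names and sentinel values are assumptions.

```python
# Minimal sketch of the previous/next smaller value (PSV/NSV) primitive.
# For each index i, PSV is the closest index to the left holding a strictly
# smaller value, and NSV is the closest such index to the right.
# Assumed sentinels: -1 when there is no PSV, len(values) when there is no NSV.

def previous_smaller(values):
    result = [-1] * len(values)
    stack = []  # indices whose values are strictly increasing
    for i, v in enumerate(values):
        while stack and values[stack[-1]] >= v:
            stack.pop()
        if stack:
            result[i] = stack[-1]
        stack.append(i)
    return result

def next_smaller(values):
    n = len(values)
    result = [n] * n
    stack = []
    for i in range(n - 1, -1, -1):
        while stack and values[stack[-1]] >= values[i]:
            stack.pop()
        if stack:
            result[i] = stack[-1]
        stack.append(i)
    return result

if __name__ == "__main__":
    lcp = [3, 1, 4, 1, 5, 2]  # made-up values standing in for an LCP array
    print(previous_smaller(lcp))
    print(next_smaller(lcp))
```

Over a longest-common-prefix array, the PSV and NSV of a position delimit the range of suffixes that share a prefix of at least that length, which is how suffix tree nodes can be addressed as intervals without storing the tree topology.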

16.
In this paper we suggest a new way of representing planar two-dimensional shapes and a shape matching method which utilizes the new representation. Through merging of the neighboring boundary runs, a shape can be partitioned into a set of triangles. These triangles are inherently connected according to a binary tree structure. Here we use the binary tree with the triangles as its nodes to represent the shape. This representation is found to be insensitive to shape translation, rotation, scaling and skewing changes due to changes in the viewer's location (or the object's pose). Furthermore, the representation is multiresolution.

In shape matching we compare the two trees representing two given shapes node by node according to the breadth-first tree traversing sequence. The comparison starts at the top of the tree and moves downward, which means that we first compare the lower resolution approximations of the two shapes. If the two approximations are different, the comparison stops. Otherwise, it goes on and compares the finer details of the two shapes. Only when the two shapes are very similar are the two corresponding trees compared in full. Thus, the matching algorithm utilizes the multiresolution characteristic of the tree representation and appears to be very efficient.
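The coarse-to-fine comparison described here can be sketched as a breadth-first walk over two binary trees that stops at the first mismatch. The abstract does not say how triangles are compared, so the scalar `feature` per node, the tolerance `tol`, and the names `Node` and `trees_match` are all assumptions made for illustration.

```python
# Minimal sketch of breadth-first, early-terminating comparison of two binary
# trees. Nodes near the root stand for coarse approximations of the shape, so
# a mismatch there ends the comparison before any fine detail is examined.
# Assumptions (not from the paper): each node carries one scalar descriptor,
# and two nodes match when their descriptors differ by at most tol.
from collections import deque
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    feature: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def trees_match(a: Optional[Node], b: Optional[Node], tol: float = 0.1) -> bool:
    queue = deque([(a, b)])
    while queue:
        x, y = queue.popleft()
        if x is None and y is None:
            continue
        if x is None or y is None:
            return False  # structural difference
        if abs(x.feature - y.feature) > tol:
            return False  # differing approximation: stop early
        queue.append((x.left, y.left))
        queue.append((x.right, y.right))
    return True

if __name__ == "__main__":
    shape_a = Node(1.0, Node(0.5), Node(0.25))
    shape_b = Node(1.05, Node(0.5), Node(0.25))
    shape_c = Node(2.0, Node(0.5), Node(0.25))
    print(trees_match(shape_a, shape_b))
    print(trees_match(shape_a, shape_c))
```

Because the queue processes nodes level by level, dissimilar shapes are usually rejected after only a few comparisons near the root, and only near-identical shapes incur a full traversal.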


17.
18.
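Chen Xiaofei, Wang Runsheng. Chinese Journal of Computers (《计算机学报》), 2004, 27(11): 1540-1545
Skeleton-based object representation is an important research topic in computer vision. Although many skeleton extraction algorithms based on different principles have been proposed, there has been little research on using skeleton information to represent and recognize objects effectively. This paper decomposes the structural primitives of the skeleton top-down and organizes them into a hierarchical tree. By introducing the concept of scale, it obtains a multiscale tree representation of the object with few nodes and stable connectivity. Experiments show that the representation describes objects compactly and robustly and reduces the complexity of graph matching.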
