Similar literature
 20 similar documents found (search time: 31 ms)
1.
The author has conducted mathematical analysis of genetic decoding for a number of years. Overlapping genes, discovered in 1976, were one of the main objects of the study. These are cases where the same segment of DNA encodes two or more protein sequences. The numerous identified genetic overlaps made it possible to pose a number of mathematical problems, which have been solved successfully. A detailed exposition of these problems is given in the author's monograph Mathematical Analysis of Genetic Code (BINOM, Moscow, 2010). These problems provided considerable insight into the structure of the genetic code and its relationship with overlapping genes. As a result, a new problem was posed: computing the genetic code from the amino acid sequences encoded by overlapping genes. One approach to this problem is described in this paper.

2.
The large volume of data acquired for complete DNA sequences of large genomes (human and others) has raised the problem of studying them with biomathematical methods. This work presents theoretical foundations for solving certain problems and, primarily, for estimating large sets of genes. The major issue is the so-called natural blockage of genes, in which all five codon sequences alternative to the gene sequence contain multiple stop signals for protein synthesis. Theorem 1 establishes the potential of such blockage for the standard genetic code. Theorem 2 establishes the code potential that is used for a peculiar form of recording genetic information, the so-called overlapping genes. A relationship is established between integral characteristics of genetic codes that can be derived from these theorems.
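The notion of the five codon sequences alternative to a gene's own reading frame can be made concrete with a small sketch. It assumes the standard-code stop codons TAA, TAG and TGA; the function names and the frame labels (`+1`, `+2`, `-0`, `-1`, `-2`) are illustrative choices, not taken from the paper, and the sketch does not reproduce Theorem 1 or Theorem 2.

```python
# Stop codons of the standard genetic code, written in the DNA alphabet.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse-complement strand of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def codons(dna, offset):
    """Split dna into complete codons, starting at the given offset."""
    return [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]


def stop_counts(gene):
    """Count stop codons in the five frames alternative to the gene's frame.

    Frame labels are hypothetical: '+1'/'+2' are the shifted frames on the
    same strand, '-0'/'-1'/'-2' are the three frames on the reverse strand.
    """
    rc = reverse_complement(gene)
    frames = {
        "+1": codons(gene, 1),
        "+2": codons(gene, 2),
        "-0": codons(rc, 0),
        "-1": codons(rc, 1),
        "-2": codons(rc, 2),
    }
    return {name: sum(c in STOPS for c in cs) for name, cs in frames.items()}


def is_blocked(gene):
    """True if every alternative frame contains at least one stop codon."""
    return all(count > 0 for count in stop_counts(gene).values())
```

For a real gene the counts would be taken over the full coding sequence; a frame with no stop codons is a candidate for carrying a second, overlapping reading frame.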

3.
The emerging field of synthetic biology moves beyond conventional genetic manipulation to construct novel life forms that do not originate in nature. We explore the problem of designing the provably shortest genomic sequence to encode a given set of genes by exploiting alternate reading frames. We present an algorithm for designing the shortest DNA sequence that simultaneously encodes two given amino acid sequences. We show that the coding sequences of naturally occurring pairs of overlapping genes approach maximum compression. We also investigate the impact of alternate coding matrices on overlapping sequence design. Finally, we discuss an interesting application of overlapping gene design, namely the interleaving of an antibiotic resistance gene into a target gene inserted into a virus or plasmid for amplification.
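As a minimal illustration of how one DNA sequence can encode two amino acid sequences in alternate reading frames, the sketch below translates the same string at two offsets using the standard genetic code. It is not the shortest-sequence design algorithm of the paper; the names `translate` and `encodes_both` are hypothetical.

```python
# Standard genetic code in the usual compact form (bases ordered T, C, A, G).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate complete codons of dna starting at offset ('*' marks a stop)."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


def encodes_both(dna, protein_a, protein_b, shift):
    """Check that dna encodes protein_a in frame 0 and protein_b in frame `shift`."""
    return (translate(dna, 0).startswith(protein_a)
            and translate(dna, shift).startswith(protein_b))


# The 9-base string reads as Met-Ala-stop in frame 0 and Trp-His in frame 1.
print(translate("ATGGCATGA", 0), translate("ATGGCATGA", 1))
```

A design procedure would search over candidate DNA strings for the shortest one that passes a check like `encodes_both`; the search strategy itself is outside the scope of this sketch.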

4.
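This paper proposes a method that uses two-character coupling degree and t-test difference to resolve crossing ambiguities in Chinese word segmentation: first, all crossing ambiguities are found using a dictionary; then the linear combination of the two-character coupling degree and the t-test difference is used to decide whether each ambiguous position should be split. Experimental results show that combining two-character coupling degree with t-test difference outperforms combining mutual information with t-test difference; hence, using the linear combination of two-character coupling degree and t-test difference to eliminate crossing ambiguities is a simple and effective method.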

5.
L. D. G. D. Data & Knowledge Engineering, 2002, 40(3): 285-314
This paper tackles the problem of semi-automatically extracting hyponymy and overlapping properties between entities belonging to heterogeneous database schemes. The technique we propose consists of two phases: the first derives basic hyponymies and overlappings starting from a specific situation; the second receives the basic properties derived in the first phase and extracts further, more general hyponymies. In addition, the paper reports some experimental results obtained by applying the technique to the database schemes of Italian Central Government Offices. Finally, it shows some applications of the derived hyponymies and overlappings; in particular, it illustrates how they can influence scheme integration.

6.
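Drawing on recent DNA research showing that, in viruses, the same DNA base sequence can encode two or three different polypeptide chains, this paper analyzes and imitates the mechanisms of overlapping genes and overlapping codes to obtain a new coding framework based on overlapping genes, which improves problem-solving efficiency. A DNA genetic algorithm computing model with a frameshift decoding framework (SDNA-GA) is also obtained and applied to the optimal design of a class of generalized membership-type T-S fuzzy neural network controllers (GTS-FNNC), achieving online learning of the GTS-FNNC.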

7.
The problem of genome annotation (i.e., establishing the biological roles of proteins and the corresponding genes) is one of the major tasks of postgenomic bioinformatics. This paper reports further development of a previously proposed formalism for studying the local solvability of the genome annotation problem. Here, we introduce the concepts of elementary motifs, positional independence of motifs, heuristic evaluation of informativeness, and solvability on sets of elementary motifs. We show that introducing a linear order on a set of elementary motifs allows us to calculate the irreducible motif sets. The formalism was used in experiments to compute the sets of the most informative motifs for several protein functions.

8.
A morphological neural network is generally defined as a type of artificial neural network that performs an elementary operation of mathematical morphology at every node, possibly followed by the application of an activation function. The underlying framework of mathematical morphology can be found in lattice theory. With the advent of granular computing, lattice-based neurocomputing models such as morphological neural networks and fuzzy lattice neurocomputing models are becoming increasingly important, since many information granules such as fuzzy sets and their extensions, intervals, and rough sets are lattice ordered. In this paper, we present the lattice-theoretical background and the learning algorithms for morphological perceptrons with competitive learning, which arise by incorporating a winner-take-all output layer into the original morphological perceptron model. Several well-known classification problems that are available on the internet are used to compare our new model with a range of classifiers such as conventional multi-layer perceptrons, fuzzy lattice neurocomputing models, k-nearest neighbors, and decision trees.
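The idea of a node that performs an elementary morphological operation can be sketched as follows. This assumes one common formulation in which a node computes a maximum (dilation) or minimum (erosion) of sums `x_j + w_j` instead of a weighted sum, followed by a hard-limiter activation; the exact node definition and the competitive learning rule of the paper are not reproduced, and all names are illustrative.

```python
def dilation_node(x, w):
    """Max-plus node: the maximum over inputs of x_j + w_j (assumed form)."""
    return max(xj + wj for xj, wj in zip(x, w))


def erosion_node(x, w):
    """Min-plus node: the minimum over inputs of x_j + w_j (assumed form)."""
    return min(xj + wj for xj, wj in zip(x, w))


def hard_limiter(value):
    """Activation applied after the morphological operation."""
    return 1 if value >= 0 else 0


x = [1.0, 2.0, 3.0]
w = [0.0, -1.0, -5.0]
print(dilation_node(x, w), erosion_node(x, w), hard_limiter(erosion_node(x, w)))
```

Because max and min are lattice operations, such a node depends only on the ordering of its shifted inputs, which is what ties these models to lattice theory.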

9.
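Overlapping (intersection-type) segmentation ambiguity is one of the main ambiguity types in automatic Chinese word segmentation. Existing segmentation systems do not yet handle it fully satisfactorily. Many studies of overlapping ambiguity have been based on general-purpose corpora, but none on domain-specific corpora. Using a medium-sized general Chinese word list, a general-purpose corpus of about 900 million characters, and two domain-specific corpora covering 55 domains with a total of about 140 million characters, this paper exhaustively examines, from multiple angles, the statistical properties in domain-specific corpora of high-frequency overlapping ambiguity strings extracted from the general corpus, as well as the domain-related statistical properties of overlapping ambiguity strings extracted from the domain-specific corpora. The observations are of reference value for designing Chinese word segmentation algorithms for specialized domains.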

10.
A number of algorithms have been proposed to tackle the problem of learning "Gene Linkage" within the context of genetic optimisation, that is, the problem of learning which groups of co-adapted genes should be inherited together during the recombination process. These may be seen within a wider context as a search for appropriate relations which delineate the search space and "guide" heuristic optimisation or, alternatively, as part of a comprehensive body of work on Adaptive Evolutionary Algorithms. In this paper, we consider the learning of Gene Linkage as an emergent property of adaptive recombination operators. This is in contrast to the behaviour observed with fixed recombination strategies, in which there is no correspondence between the sets of genes which are inherited together between generations, other than that caused by distributional bias. A discrete mathematical model of Gene Linkage is introduced, and the common families of recombination operators, along with some well-known linkage-learning algorithms, are modelled within this framework. This model naturally leads to the specification of a recombination operator that explicitly operates on sets of linked genes. Variants of that algorithm are then used to examine one of the important concepts from the study of adaptivity in Evolutionary Algorithms, namely the level (population, individual, or component) at which learning takes place. This aspect of adaptation has received considerable attention when applied to mutation operators, but little attention in the context of adaptive recombination operators and linkage learning. It is shown that, even with the problem restricted to learning adjacent linkage, the population-based variants are not capable of correctly identifying building blocks. This is in contrast to component-level adaptation, which outperforms conventional operators whose bias is ideal for the problems considered.

11.
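A Hybrid-Model Strategy for Resolving Overlapping Ambiguity   Cited by: 1 (self-citations: 0, citations by others: 1)
To address overlapping ambiguity, a difficult problem in Chinese word segmentation, this paper proposes a disambiguation model that combines rules and statistics. First, from an annotated corpus, a rule base for overlapping-ambiguity disambiguation is acquired through error-driven learning, and an N-gram statistical language model is built with statistical tools. Next, forward/backward maximum matching and the disambiguation rule base are used to detect overlapping ambiguity strings. Finally, the rule base and a scoring function are used to resolve the ambiguities. This hybrid approach detects more overlapping ambiguity strings and combines the strengths of rule-based and statistical methods in handling overlapping ambiguity. Experiments show that the method improves the precision of overlapping-ambiguity handling and offers a new approach to the problem.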
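The detection step based on forward/backward maximum matching can be sketched as below. This is a minimal illustration that flags a string as ambiguous when the two matching directions disagree; the rule base, the N-gram model and the scoring function of the paper are not reproduced, and the tiny lexicon and the `max_len` parameter are assumptions made for the example.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation, always taking the longest lexicon word."""
    i, words = 0, []
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            # A single character is always accepted as a fallback.
            if n == 1 or text[i:i + n] in lexicon:
                words.append(text[i:i + n])
                i += n
                break
    return words


def backward_max_match(text, lexicon, max_len=4):
    """Greedy right-to-left segmentation, always taking the longest lexicon word."""
    j, words = len(text), []
    while j > 0:
        for n in range(min(max_len, j), 0, -1):
            if n == 1 or text[j - n:j] in lexicon:
                words.append(text[j - n:j])
                j -= n
                break
    return words[::-1]


def has_overlapping_ambiguity(text, lexicon):
    """Flag the text when the two matching directions disagree."""
    return forward_max_match(text, lexicon) != backward_max_match(text, lexicon)


lexicon = {"结合", "合成", "成分", "分子"}  # hypothetical toy lexicon
print(forward_max_match("结合成分子", lexicon))   # ['结合', '成分', '子']
print(backward_max_match("结合成分子", lexicon))  # ['结', '合成', '分子']
```

Disagreement between the two directions catches many but not all overlapping ambiguities, which is consistent with the abstract's use of a rule base alongside maximum matching.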

12.
Networks in various forms are used extensively in Computer Aided Instruction and learning systems. Their use extends from simple frame instruction sequences to semantic networks and transition diagrams. The Path Algebra approach is a powerful mathematical tool for analysing networks. Different algebras may be defined to solve different problems. They have been used successfully in analysing man-machine dialogues, and it is suggested that they may provide a useful analytical tool for CAI/CAL designers. Examples are given of algebras which are useful for analysing connectivity, step length, minimum paths, and simple and elementary paths, and for determining cut sets of arcs.
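The statement that different algebras solve different network problems can be illustrated with a single closure routine parameterised by two operations. The sketch assumes a Floyd-Warshall-style closure, which is valid for the two idempotent algebras shown (Boolean or/and for connectivity, min/plus for minimum paths without negative cycles); it is not taken from the paper, and the three-node graph is invented for the example.

```python
import operator


def closure(matrix, join, extend):
    """Combine paths through every intermediate node k.

    join   : merges two alternative paths (e.g. `or`, `min`)
    extend : concatenates two consecutive paths (e.g. `and`, `+`)
    """
    n = len(matrix)
    d = [row[:] for row in matrix]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = join(d[i][j], extend(d[i][k], d[k][j]))
    return d


INF = float("inf")

# Arcs 0->1 (length 2), 1->2 (length 3), 0->2 (length 10).
lengths = [[INF, 2, 10], [INF, INF, 3], [INF, INF, INF]]
arcs = [[False, True, False], [False, False, True], [False, False, False]]

shortest = closure(lengths, min, operator.add)        # minimum-path algebra
reachable = closure(arcs, operator.or_, operator.and_)  # connectivity algebra
print(shortest[0][2], reachable[0][2], reachable[2][0])
```

Swapping in another pair of operations (for example, max and min for bottleneck capacity) changes the question answered without changing the routine.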

13.
Fuzzy logic controllers (FLCs) are gaining in popularity across a broad array of disciplines because they allow a more human approach to control. Recently, the design of the fuzzy sets and the rule base has been automated by the use of genetic algorithms (GAs), which are powerful search techniques. Though the use of GAs can produce near-optimal FLCs, it raises problems such as messy overlapping of fuzzy sets and rules that do not agree with common sense. This paper describes an enhanced genetic algorithm which constrains the optimization of FLCs to produce well-formed fuzzy sets and rules that can be better understood by human beings. To achieve this, we devised several new genetic operators and used a parallel GA with three populations for optimizing FLCs with 3x3, 5x5, and 7x7 rule bases; we also used a novel method for creating migrants between the three populations of the parallel GA to increase the chances of optimization. We also present the results of applying our GA to designing FLCs for controlling three different plants and compare the performance of these FLCs with their unconstrained counterparts.

14.
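Gene sets selected from DNA microarray data by commonly used ranking methods often contain highly correlated genes, and evaluating genes individually cannot truly reflect the classification ability of the resulting feature set. In addition, the number of genes far exceeds the number of samples, which is another challenge for disease diagnosis. A feature extraction method for DNA microarray data is therefore proposed for tissue classification. The method clusters genes with K-means to obtain the center of each subclass of the microarray data, removes subclasses irrelevant to classification using a ranking method, then extracts features from the remaining subclasses with ICA and builds an SVM classifier to classify tissues. Experiments on real biological data show that, by extracting composite genes, the method can evaluate the classification ability of genes comprehensively, reduce the number of features, and improve classification accuracy.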

15.
The typical design process for the relational database model develops the conceptual schema and each of the external schemas separately and independently of each other. This paper proposes a new design methodology that constructs the conceptual schema in such a way that overlappings among external schemas are reflected. If the overlappings of external schemas do not produce transitivity at the conceptual level, then with our design method the relations in the external schemas can be realized as a join over independent components. Thus, a one-to-one function can be defined for the mapping from tuples in the external schemas to tuples in the conceptual schema. If transitivity is produced, we show that no such function is possible, and a new technique is introduced to handle this special case.

16.
Cancer classification through gene expression data analysis has produced remarkable results and has indicated that gene expression assays could significantly aid the development of efficient cancer diagnosis and classification platforms. However, cancer classification based on DNA array data remains a difficult problem. The main challenge is the overwhelming number of genes relative to the number of training samples, which implies that a large number of irrelevant genes must be dealt with. Another challenge is the noise inherent in the data set, which makes accurate classification more difficult when the sample size is small. We apply genetic algorithms (GAs) with an initial solution provided by t statistics, called t-GA, to select a group of relevant genes from cancer microarray data. A decision-tree-based cancer classifier is then built on these selected genes. The performance of this approach is evaluated by comparing it with other gene selection methods on publicly available gene expression data sets. Experimental results indicate that t-GA has the best performance among the gene selection methods compared. The Z-score figure also shows that some genes are consistently and preferentially chosen by t-GA in each data set.
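The use of t statistics to seed gene selection can be sketched as a simple ranking step. The abstract does not say which t statistic is used; the sketch assumes Welch's two-sample form, and the gene names, expression values and the `top` parameter are invented for illustration. The genetic algorithm and the decision-tree classifier are not reproduced.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Welch's two-sample t statistic (assumed variant; needs non-zero variance)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def rank_genes(expression, labels, top=2):
    """Return the `top` genes with the largest absolute t statistic.

    expression : dict mapping gene name -> list of values, one per sample
    labels     : list of 0/1 class labels, one per sample
    """
    scores = {}
    for gene, values in expression.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = abs(t_statistic(pos, neg))
    return sorted(scores, key=scores.get, reverse=True)[:top]


labels = [1, 1, 1, 0, 0, 0]
expression = {  # hypothetical toy data
    "geneA": [5, 6, 7, 1, 2, 3],
    "geneB": [3, 4, 5, 3, 4, 6],
    "geneC": [1, 2, 3, 4, 6, 8],
}
print(rank_genes(expression, labels))
```

In a t-GA-style setup, a ranking like this would only supply the initial candidate solution; the GA would then search over gene subsets starting from it.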

17.
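Existing biclustering algorithms lack the ability to discover biclusters with overlapping structure and so cannot effectively uncover such structures hidden in gene expression data; moreover, when adding or removing conditions, they ignore the effect of condition importance on the biclustering result. To address these problems, this paper proposes an improved biclustering algorithm based on the weighted mean squared residue. First, a fuzzy partition controlled by overlap rate and membership degree divides the gene set into initial biclusters. Then the weights of the conditions in each bicluster are updated iteratively while the objective function is minimized. Finally, the weighted mean squared residue is used to add qualifying genes and to delete genes with poor coherence from the optimized biclusters, yielding the final set of biclusters. Experiments show that the algorithm not only generates biclusters with different co-expression levels but also keeps the overlap rate within a reasonable range.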
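The quantity underlying this algorithm can be illustrated with the standard, unweighted mean squared residue of a bicluster, in which each entry is compared with its row mean, column mean and overall mean. The per-condition weights of the paper are not given in the abstract, so the sketch below deliberately omits them; the function name and data are illustrative.

```python
def mean_squared_residue(data, rows, cols):
    """Unweighted mean squared residue of the submatrix data[rows][cols].

    A value of 0 means the selected genes rise and fall together across the
    selected conditions (a perfectly additive pattern).
    """
    sub = [[data[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    total_mean = sum(row_means) / n_rows
    residue_sum = sum(
        (sub[i][j] - row_means[i] - col_means[j] + total_mean) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return residue_sum / (n_rows * n_cols)


coherent = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # rows differ by a constant shift
print(mean_squared_residue(coherent, [0, 1, 2], [0, 1, 2]))
print(mean_squared_residue([[1, 2], [2, 1]], [0, 1], [0, 1]))
```

Adding a gene that keeps this score low, or deleting one that raises it, is the usual node addition/deletion criterion; a weighted variant would scale each condition's contribution to the sum.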

18.
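Coarse segmentation and ambiguity resolution are the two basic stages of Chinese word segmentation. By introducing generalized lexical entries and induced word sets, a coarse segmentation method is proposed on the basis of the maximum matching algorithm: text is segmented on the principle of longest generalized word matching, and induced word sets are used to identify crossing ambiguities. The method segments unambiguous Chinese sentences quickly and accurately while detecting and marking 100% of the crossing ambiguities in ambiguous sentences, simplifying the subsequent disambiguation stage as far as possible. Tests on the People's Daily corpus of January 1998, containing 1.6 million Chinese characters, demonstrate the effectiveness of the algorithm in terms of speed, ambiguous-word accuracy, and coarse-segmentation recall.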

19.
This paper proposes a new hybrid multi-population genetic algorithm (HMPGA) for solving the multi-level capacitated lot sizing problem with backlogging. The method combines a multi-population metaheuristic with a fix-and-optimize heuristic and mathematical programming techniques. A total of four test sets from the MULTILSB (Multi-Item Lot-Sizing with Backlogging) library are solved, and the results are compared with those of two other recently published methods. The results show that HMPGA performed better on most of the test sets solved, especially when longer computing time was allowed.

20.
In this paper, we consider a method based on elementary matrix transformations, similar to the method used for systems of linear algebraic equations, for solving systems of fuzzy relation equations. The solution sets of a system of fuzzy relation equations before and after performing some elementary transformations are compared. We also give some necessary and sufficient conditions under which certain elementary transformations do not change the solution set.
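A small sketch may help fix ideas about what a system of fuzzy relation equations is. It assumes max-min composition (the abstract does not name the composition) and uses the well-known result that, when a max-min system is solvable, its greatest solution is obtained with the Gödel implication. Only the simplest transformation, interchanging two equations, is demonstrated; the paper's own transformations and conditions are not reproduced.

```python
def compose(A, x):
    """Max-min composition: b_i = max over j of min(a_ij, x_j)."""
    return [max(min(a, xj) for a, xj in zip(row, x)) for row in A]


def greatest_solution(A, b):
    """Greatest x with compose(A, x) == b, or None if the system is unsolvable."""
    def implies(a, t):
        # Goedel implication: 1 if a <= t, otherwise t.
        return 1.0 if a <= t else t

    x = [min(implies(A[i][j], b[i]) for i in range(len(A)))
         for j in range(len(A[0]))]
    return x if compose(A, x) == b else None


A = [[0.5, 0.8], [0.9, 0.3]]  # hypothetical coefficients
b = [0.5, 0.6]
x_hat = greatest_solution(A, b)

# Interchanging the two equations (rows of A together with entries of b)
# leaves the solution unchanged.
x_swapped = greatest_solution([A[1], A[0]], [b[1], b[0]])
print(x_hat, x_swapped)
```

Row interchange trivially preserves the solution set; transformations that combine rows are where conditions such as those studied in the paper become necessary.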
