Similar literature
 20 similar documents found (search time: 15 ms)
1.
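To address the degeneracy that occurs in current graphical representations of DNA sequences, a new graphical representation using two kinds of symbols is proposed. The encoding of the four bases is treated as the movement of the sequence's elements in the Cartesian plane, and two different marker symbols are used to resolve any overlap between elements. The resulting graph has no self-intersections, so a one-to-one correspondence is established between a DNA sequence and its graphical representation. Examples show that the method effectively reduces graph degeneracy in both undirected and directed graph representations. The encoding scheme of an artificial metabolic system is then introduced as an analysis tool for comparing DNA sequences: using the values of metabolic intermediates as parameters, the similarity between the DNA sequences of different species is studied. The case analysis shows that this parameter characterizes the degree of similarity between species well and offers a simple, practical way to compare DNA sequence features.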

2.
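Traditional graph regularization methods measure similarity in the sample space with Euclidean distance, which cannot accurately capture the neighborhood information of complex data sets and tends to degrade the model's generalization on data with complex shapes and on non-convex data sets. An improved graph regularization algorithm is proposed that uses isometric feature mapping (Isomap) to preserve the neighborhood information of the sample space and help the model perform manifold learning. A KL constraint is added to further smooth the external structure of the data representation, so that sparser, higher-level feature representations are captured. Experiments on data sets such as MNIST and YaleB show that, compared with several popular feature extraction algorithms, the algorithm extracts more meaningful and robust features. It has advantages in classification and clustering tasks and is more robust to interference.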

3.
Biological sequence comparison is one of the most important tasks in Bioinformatics. Owing to the fast growth of databases that contain biological information, sequence comparison represents an important challenge for high‐performance computing, especially when very long sequences are compared, i.e. the complete genome of several organisms. The Smith–Waterman (SW) algorithm is an exact method based on dynamic programming to quantify local similarity between sequences. The inherent large parallelism of the algorithm makes it ideal for architectures supporting multiple dimensions of parallelism (TLP, DLP and ILP). Concurrently, there is a paradigm shift towards chip multiprocessors in computer architecture, which offer a huge amount of potential performance that can only be exploited efficiently if applications are effectively mapped and parallelized. In this work, we analyze how large‐scale biology sequence comparison takes advantage of the current and future multicore architectures. Our starting point is the performance analysis of the current multicore IBM Cell B.E. processor; we analyze two different SW implementations on the Cell B.E. Then, using simulation tools, we study the performance scalability when a many‐core architecture is used for performing long DNA sequence comparison. We investigate the efficient memory organization that delivers the maximum bandwidth with the minimum cost. Our results show that a heterogeneous architecture can be an efficient alternative to execute challenging bioinformatic workloads. Copyright © 2011 John Wiley & Sons, Ltd.
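
For orientation, the dynamic-programming recurrence behind the Smith–Waterman algorithm mentioned in the abstract above can be sketched as follows. This is a minimal serial illustration, not either of the Cell B.E. implementations the paper analyzes; the function name, the linear gap penalty, and the scoring values (match 2, mismatch -1, gap -1) are assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of a and b.

    Assumed scoring scheme (not from the paper): linear gap penalty,
    fixed match/mismatch scores.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets an alignment restart anywhere, which makes it local.
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best
```

Each cell depends only on its left, upper, and upper-left neighbors, so all cells on one anti-diagonal can be computed independently; that is the parallelism the abstract refers to.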

4.
Mining frequent sequences in large databases has been an important research topic. The main challenge of mining frequent sequences is the high processing cost due to the large amount of data. In this paper, we propose a novel strategy to find all the frequent sequences without having to compute the support counts of non-frequent sequences. Previous works prune candidate sequences based on the frequent sequences with shorter lengths, while our strategy prunes candidate sequences according to the non-frequent sequences with the same lengths. As a result, our strategy can cooperate with the previous works to achieve better performance. We then identify three major strategies used in the previous works and combine them with our strategy into an efficient algorithm. The novelty of our algorithm lies in its ability to dynamically switch from a previous strategy to our new strategy during the mining process for better performance. Experimental results show that our algorithm outperforms the previous ones under various parameter settings. This paper is a major value-added version of the following paper: D. Y. Chiu, Y. H. Wu, A. L. P. Chen, “An Efficient Algorithm for Mining Frequent Sequences by a New Strategy without Support Counting,” Proceedings of IEEE Data Engineering Conference, pp. 375–386, 2004.

5.
DNA sequence similarity/dissimilarity analysis is a fundamental task in computational biology, used to analyze the similarity of different DNA sequences in order to learn their evolutionary relationships. In past decades, a large number of similarity analysis methods for DNA sequences have been proposed to meet ever-growing demand. To track the advances in DNA sequence similarity analysis and promote the development of the field, we present a survey. In this paper, we first introduce the background of DNA similarity analysis, including the data sets, similarity distances and output data. Then, we review recent algorithmic developments in DNA similarity analysis to present the state of the art in this field. Finally, we summarize the corresponding trends and challenges in this research field. The survey concludes that although various DNA similarity analysis methods have been proposed, there is still room for further improvement and there remain potential research directions in this field.

6.
The authors propose a simple version of the dot-plot scheme for the case when the distances between sequence elements may take more than two values. The method is applicable, in particular, to sequences of large-length windows, where the sets of distance values are continuous. The proposed technique is simple to implement and produces readable maps for further analysis. To illustrate its potential, the method has been applied to the comparison of genomic sequences. An asymmetry in the number of direct and reverse tracks for the Homo sapiens genome has been discovered.
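
As a rough illustration of a dot-plot over elements whose pairwise distance is continuous rather than simply match/mismatch, the sketch below marks a cell whenever the distance falls at or below a cutoff. The abstract above does not say how the authors reduce continuous distances to a readable map, so the thresholding rule, the function names, and the use of per-window GC fraction as the element value are all assumptions made for this example.

```python
def gc_windows(seq, size):
    """GC fraction of every sliding window of the given size.

    Hypothetical choice of a continuous-valued sequence element.
    """
    return [
        sum(c in "GC" for c in seq[i:i + size]) / size
        for i in range(len(seq) - size + 1)
    ]


def dot_plot(elems_a, elems_b, distance, threshold):
    """Return a 0/1 matrix with 1 where distance(x, y) <= threshold.

    Assumed rule: a simple cutoff on the continuous distance.
    """
    return [
        [1 if distance(x, y) <= threshold else 0 for y in elems_b]
        for x in elems_a
    ]
```

A call such as `dot_plot(gc_windows(s1, 50), gc_windows(s2, 50), lambda x, y: abs(x - y), 0.05)` would give a matrix in which diagonal runs of ones suggest similar regions; the window size and cutoff here are arbitrary.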

7.
8.
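Building on the previously proposed concepts of a high-dimensional numerical representation of symbol sequences and a high-dimensional Fourier transform, a new method for protein comparison, high-dimensional resonant recognition, is proposed. The amino acid sequences of two proteins are converted into vector sequences, and the discrete Fourier transform of each vector sequence is computed. From these, a cross-spectral function of the two protein sequences is defined, and its signal-to-noise ratio is examined to judge the similarity or difference of the two sequences. The computed results show that this is another effective method for protein comparison and an extension of Cosic's one-dimensional resonant recognition.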

9.
This paper describes the use of face patterns as a medium for presenting complex computer-processed medical diagnoses to people. The case of a nephrotic syndrome was selected to show the mechanics of the study. First, the design of the face pattern was constructed, and psychometric experiments analyzing the resulting facial expressions were conducted. Next, the results of analyzing facial expressions constructed from the original findings of conventional medical methods were used to loosely predict the patient's future condition, and the same results were compared with those of statistical procedures to test the efficiency of the method. Finally, a separate application of the method was performed to indicate a possible effect of corticosteroid on idiopathic nephrotic syndrome.

10.
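Ding Jianli, Li Yang, Wang Jialiang. 《计算机应用》 (Journal of Computer Applications), 2019, 39(12): 3476-3481
To address the insufficient use of semantic information and the limited summary accuracy of current abstractive text summarization methods, a text summarization method based on a dual encoder is proposed. First, the dual encoder supplies richer semantic information to the sequence-to-sequence (Seq2Seq) architecture, and the attention mechanism incorporating dual-channel semantics and the decoder with empirical distribution are optimized. Then, position embedding and word embedding are fused in the word embedding generation, and term frequency-inverse document frequency (TF-IDF), part of speech (POS), and key score (Soc) features are added to optimize the word embedding dimension. The proposed method optimizes both the traditional Seq2Seq model and the word feature representation, strengthening the model's semantic understanding while improving summary quality. Experimental results show that, under the Rouge evaluation system, the method scores 10 to 13 percentage points higher than the traditional recurrent neural network with self-attention (RNN+atten) and the multi-layer bidirectional recurrent neural network with self-attention (Bi-MulRNN+atten). Its summaries reflect more accurate semantic understanding and better generation quality, giving it better application prospects.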

11.
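Spatio-temporal anomalous flow detection can reveal abnormal traffic characteristics in urban traffic data. Unlike the methods used to detect a single anomalous flow in a time series, a k-nearest-neighbor flow sequence algorithm (kNNFS) is proposed to detect anomalous flow distributions in flow sequences. The algorithm first measures the individual flow observations at each location in each time interval. It then computes the observation frequencies of individual flows to build a flow distribution probability library for each time interval at each location. Finally, a threshold decides whether the distance, computed with KL divergence, between a new flow distribution and its k nearest neighbors is anomalous: if the distance is below the threshold, the distribution is added to the library; otherwise it is an anomalous flow distribution. Simulation analysis shows that kNNFS improves on the DPMM and SETMADA algorithms in both detection accuracy and running time.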
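
The final decision step described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: distributions are lists of probabilities over the same bins, the distance to the k nearest neighbors is taken as the mean of the k smallest KL divergences, and a small `eps` guards against zero bins. The abstract does not specify these details, and the function names are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete distributions over the same bins.

    eps is an assumed smoothing term to avoid division by zero.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def knn_distance(new_dist, library, k):
    """Mean KL divergence from new_dist to its k nearest library entries.

    Using the mean is an assumption made for this sketch.
    """
    if not library:
        raise ValueError("library must contain at least one distribution")
    nearest = sorted(kl_divergence(new_dist, d) for d in library)[:k]
    return sum(nearest) / len(nearest)


def check_and_update(new_dist, library, k, threshold):
    """Return True if new_dist is anomalous; otherwise store it and return False."""
    if knn_distance(new_dist, library, k) < threshold:
        library.append(new_dist)  # normal: update the probability library
        return False
    return True  # anomalous flow distribution; library unchanged
```

In this sketch one library would be kept per location and time interval, matching the per-location, per-interval library the abstract describes.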

12.
A new color image encryption algorithm based on DNA (Deoxyribonucleic acid) sequence addition operation is presented. Firstly, three DNA sequence matrices are obtained by encoding the original color image which can be converted into three matrices R, G and B. Secondly, we use the chaotic sequences generated by Chen's hyper-chaotic maps to scramble the locations of elements from three DNA sequence matrices, and then divide three DNA sequence matrices into some equal blocks respectively. Thirdly, we add these blocks by using DNA sequence addition operation and Chen's hyper-chaotic maps. At last, by decoding the DNA sequence matrices and recombining the three channels R, G and B, we get the encrypted color image. The simulation results and security analysis show that our algorithm not only has good encryption effect, but also has the ability of resisting exhaustive attack, statistical attack and differential attack.

13.
Probabilistic finite automata (PFAs) can exhibit stochastic behavior, and their reachability and controllability are viewed as the first necessary step toward supervisory control and stabilization. In this paper, the problems of reachability and controllability of PFAs are investigated under the framework of the semi‐tensor product (STP) of matrices. First, a matrix‐based modeling approach to PFAs is proposed, and the dynamics of PFAs are described as a discrete‐time bilinear expression. The notion of reachability with a probability of one is formally defined, and F‐reachability with a probability of one is introduced. With the algebraic expression, necessary and sufficient conditions for such reachability are provided systematically. Second, F‐set controllability with a probability of one of PFAs is developed by introducing F‐reachability with a probability of one, and the associated algebraic condition to verify such controllability is given. Finally, a simple example is presented to validate the proposed results.

14.
Stock trend prediction is regarded as one of the most challenging tasks of financial time series prediction. Conventional statistical modeling techniques are not adequate for stock trend forecasting because of the non-stationarity and non-linearity of the stock market. For this reason, many machine learning approaches are used to improve the prediction results. These approaches mainly focus on two aspects: the regression problem of the stock price and the prediction problem of the turning points of the stock price. In this paper, we concentrate on evaluating the current trend of the stock price and predicting the direction of its future change. A new approach named the status box method is proposed. Unlike turning point prediction, the status box method packages stock points into three categories of boxes that indicate different stock statuses. Machine learning techniques are then used to classify these boxes, so as to measure whether the state of each box coincides with the stock price trend and to forecast the stock price trend based on the states of the boxes. These results support decisions on buying or selling strategies. Compared with turning point prediction, which only considers the features of one day, each status box contains a certain number of points that represent the stock price trend over a certain period of time, so the status box reflects more information about the stock market. To solve the classification problem of the status box, a special feature construction approach is presented. Moreover, a new ensemble method that integrates the AdaBoost algorithm, the probabilistic support vector machine (PSVM), and the genetic algorithm (GA) is constructed to perform the status box classification.
To verify the applicability and superiority of the proposed methods, 20 shares chosen from the Shenzhen Stock Exchange (SZSE) and 16 shares from the National Association of Securities Dealers Automated Quotations (NASDAQ) are used to perform stock trend prediction. The results show that the status box method not only has better classification accuracy but also effectively solves the imbalance problem of stock turning point classification. In addition, the new ensemble classifier achieves preferable profitability in simulated stock investment and remarkably improves the classification performance compared with approaches that use only the PSVM or a back-propagation artificial neural network (BPN).

15.
Encoding and processing information in DNA-, RNA- and other biomolecule-based devices is an important requirement for DNA-based computing, with potentially important applications. To make DNA computing more reliable, much work has focused on designing good DNA sequences. However, this is a troublesome task, as the encoding problem is an NP problem. In this paper, a new methodology based on the IWO algorithm is developed to optimize encoding sequences. First, mathematical models of the constrained objective optimization design for encoding problems, based on thermodynamic criteria, are set up. Then, a modified IWO method is developed by defining the colonizing behavior of weeds, to overcome the limitation that the original IWO algorithm cannot be applied to discrete problems directly. The experimental results show that the proposed method is effective and convenient for the user to design and select effective DNA sequences in silico for controllable DNA computing.

16.
M.  G.   《Performance Evaluation》2007,64(9-12):1153-1168
The paper investigates the problem of minimal representation of Markov arrival processes of order n (MAP(n)). The minimal representation of MAPs is crucial for developing effective fitting methods. It seems that all existing MAP fitting methods are based on the , representation which is known to be redundant. We present the minimal number of parameters to define a MAP(n) and provide a numerical moments-matching method based on a minimal representation.

The discussion starts with a characterization of phase type (PH) distributions and then the analysis of MAPs follows a similar pattern. This characterization contains essential results on the identity of stationary behaviour of MAPs and on the number of parameters required to describe the stationary behaviour.

The proposed moments matching method is also applicable for PH distributions. In this case it is a unique method that fits a general PH distribution of order n based on 2n−1 parameters.


17.
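Yu Yan. 《计算机应用》 (Journal of Computer Applications), 2014, 34(3): 828-832
To improve the computational efficiency and accuracy of similarity measures between Gaussian mixture models (GMMs), a new similarity measure is proposed that symmetrizes the KL divergence (KLD) and combines it with the earth mover's distance (EMD). First, the KL divergences between the Gaussian components of the two GMMs being compared are computed and, after symmetrization, used to build the ground distance matrix. Then the EMD between the two GMMs is solved by linear programming and taken as the similarity measure between them. Experimental results show that, applied to color image retrieval, the measure improves retrieval time efficiency and accuracy over traditional methods.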
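
The first step of the abstract above, building a ground distance matrix from symmetrized KL divergences between Gaussian components, can be sketched as below. The sketch assumes univariate components given as hypothetical `(mean, std)` pairs and symmetrizes by summing the two directed divergences; the paper may use multivariate components and a different symmetrization. The linear-programming EMD step is not shown.

```python
import math


def kl_gauss(m1, s1, m2, s2):
    """Closed-form KL(N(m1, s1^2) || N(m2, s2^2)) for univariate Gaussians."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5


def sym_kl(g1, g2):
    """Symmetrized KL divergence: sum of both directions (assumed form)."""
    return kl_gauss(*g1, *g2) + kl_gauss(*g2, *g1)


def ground_distance_matrix(gmm_a, gmm_b):
    """Pairwise symmetrized KL between components of two GMMs.

    Each GMM is a list of (mean, std) pairs; mixture weights are left out
    because they would enter only in the EMD step, as the masses to move.
    """
    return [[sym_kl(a, b) for b in gmm_b] for a in gmm_a]
```

Because KL divergence between two Gaussians has a closed form, the matrix costs only one formula evaluation per component pair, which is consistent with the efficiency goal the abstract states.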

18.
In this paper, we propose an autonomous molecular walking machine using DNA. This molecular machine follows a track of DNA equipped with many single-strand DNA (ssDNA) stators arranged in a certain pattern. The molecular machine walks autonomously by using a restriction enzyme as its source of power. With the proposed machine, we can control the direction of movement and easily extend walking patterns to two or three dimensions. Combinations of multiple legs and ssDNA stators control the walking pattern. We designed and performed a series of feasibility studies with computer simulation and molecular biology experiments.

19.
A new method is proposed for proving some theorems on the convergence of sequences of random quantities n that assume values in a set {0,1,...,n} to discrete probability distributions. The method is based on the investigation of definite numerical characteristics (called lattice moments) of asymptotic behavior of distributions of n and is illustrated by the examples of investigating the asymptotic behavior of the probability distribution of the solution space dimension of a system of independent random homogeneous linear equations over a finite field and that of the number of connected components of a random (unequiprobable) hypergraph with independent hyperedges. Translated from Kibernetika i Sistemnyi Analiz, No. 6, pp. 44–65, November–December 2004. Part 1 was published in Cybernetics and Systems Analysis, No. 5, 2004. This revised version was published online in April 2005 with a corrected cover date.

20.
In this paper, the system-level synthesis problem (SLSP) is modeled as a multi-objective mode-identity resource-constrained project scheduling problem with makespan and resource investment criteria (MOMIRCPSP-MS-RI). Then, a hybrid Pareto-archived estimation of distribution algorithm (HPAEDA) is presented to solve the MOMIRCPSP-MS-RI. To be specific, the individual of the population is encoded as the activity-mode-priority-resource list (AMPRL), and a hybrid probability model is used to predict the most promising search area, and a Pareto archive is used to preserve the non-dominated solutions that have been explored, and another archive is used to preserve the solutions for updating the probability model. Moreover, specific sampling mechanism and updating mechanism for the probability model are both provided to track the most promising search area via the EDA-based evolutionary search. Finally, the modeling methodology and the HPAEDA are tested by an example of a video codec based on the H.261 image compression standard. Simulation results and comparisons demonstrate the effectiveness of the modeling methodology and the proposed algorithm.
