Similar Literature
 Found 20 similar articles (search time: 109 ms)
1.
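A new method is proposed for classifying protein sequences into families using a back-propagation (BP) neural network trained with the Levenberg-Marquardt (LM) algorithm. The method extracts features from protein sequences using dipeptide content, evaluates the relative importance of the features by an impact factor, and builds a three-layer artificial neural network with the improved LM optimization algorithm for BP networks. After training on three families from the PIR database, the network classified unknown protein sequences with accuracies of 98.9%, 98.1%, and 97.8%, respectively.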
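The dipeptide-content feature extraction mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `dipeptide_composition`, the 20-letter alphabet, and the choice to skip pairs containing non-standard residues are all assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)
# All 400 ordered residue pairs, in a fixed order so vectors are comparable.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Overlapping adjacent pairs are counted. Pairs containing a symbol
    outside AMINO_ACIDS (e.g. 'X') are skipped; that handling is an
    assumption, not something stated in the abstract.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or no valid pairs: return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Toy sequence, purely illustrative.
features = dipeptide_composition("MKVLAAGIVGLLLAQ")
```

A vector like `features` could then serve as the input layer of a classifier; the network architecture and LM training described in the abstract are not reproduced here.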

2.
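As genetic engineering produces large numbers of new sequences, protein sequence databases are growing rapidly. Functional grouping and family analysis of massive protein data have made protein sequence clustering an important research goal in structural and functional genomics, and applying data mining techniques to cluster biological data has become a hot topic in bioinformatics. The CLARA partitioning algorithm is widely used in other fields but has rarely been applied to clustering large volumes of protein sequences. This article applies CLARA to cluster protein sequences selected from a benchmark database and compares the results with those of several other protein clustering algorithms.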

3.
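Bioactive peptide search and construction of a simulated protein hydrolysis database   Total citations: 1, self-citations: 0, other citations: 1
Three database systems were built with the Access XP database software in Microsoft Office XP. The protein database contains 23,739 sequences of common food proteins such as wheat gluten, rice, and maize; the active peptide database contains 1,396 bioactive peptide sequences, including ACE inhibitory peptides, immunopeptides, and opioid peptides; and information on common proteases is also stored. Together with the purpose-written program "Bioactive Peptide Search and Enzymatic Hydrolysis Simulation System", the databases support batch searching of single or multiple active peptide sequences in proteins, reporting the active peptide content as a percentage of chain length, the position of each active peptide in the protein, and the types of the flanking amino acids. The system also predicts peptide activity by incomplete induction and simulates hydrolysis of a protein by a single enzyme or a combination of enzymes, marking the active peptides and their functions among the hydrolysis products.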
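The peptide search described in this abstract (position, flanking residues, and share of chain length) can be sketched in a few lines of Python. This is only an illustration, not the Access-based system itself: the function names are hypothetical, and interpreting the "percentage of chain length" as the fraction of residues covered by at least one hit is an assumption.

```python
def find_peptide(protein, peptide):
    """Locate every (possibly overlapping) occurrence of peptide in protein.

    Returns a list of dicts holding the 1-based start position and the
    residues immediately before and after the match ('' at a terminus).
    """
    if not peptide:
        return []
    hits = []
    start = protein.find(peptide)
    while start != -1:
        end = start + len(peptide)
        hits.append({
            "position": start + 1,
            "before": protein[start - 1] if start > 0 else "",
            "after": protein[end] if end < len(protein) else "",
        })
        # Advance by one residue so overlapping matches are also found.
        start = protein.find(peptide, start + 1)
    return hits


def chain_length_percentage(protein, peptides):
    """Percentage of residues covered by at least one peptide hit (assumed definition)."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        for hit in find_peptide(protein, peptide):
            begin = hit["position"] - 1
            for i in range(begin, begin + len(peptide)):
                covered[i] = True
    return 100.0 * sum(covered) / len(protein)


# Made-up toy protein; "VPP" is used only as an example query peptide.
toy_protein = "MVPPLKAVPPG"
hits = find_peptide(toy_protein, "VPP")
coverage = chain_length_percentage(toy_protein, ["VPP"])
```

Batch searching over a whole database would loop these calls over every protein and peptide record; the enzyme-cleavage simulation is not covered by this sketch.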

4.
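Research on the human liver proteome depends on support from bioinformatics and computer technology. Drawing on current proteome data research and practical experimental needs, this paper introduces bioinformatics and proteomics, discusses bioinformatics research on the human liver proteome from the perspectives of protein interactions, bioinformatics and protein function, and bioinformatics and protein databases, and examines an application system for a human liver protein expression profile database.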

5.
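Analyzing protein families effectively is a major challenge in bioinformatics, and clustering has become one of the main approaches to it. When similarity between protein sequences is defined by traditional sequence alignment, adjacency conservation among homologous segments is assumed, which conflicts with genetic recombination. To identify protein families better, a protein sequence family mining algorithm, ProFaM, is proposed. ProFaM first mines patterns that characterize protein sequences using a prefix-projection strategy, then constructs a similarity measure from the patterns and their weights, and finally clusters protein sequence families with a shared nearest neighbor method. This addresses shortcomings of earlier methods in protein pattern mining and similarity design. Experiments on the Pfam protein family database confirm that ProFaM performs well in protein family analysis.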

6.
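To extract empirical knowledge from protein structure databases and predict protein interaction sites, a method is proposed that uses protein sequence profiles as feature vectors and a support vector machine (SVM) for training and prediction. Starting from the primary sequence, the sequence profiles of residues that are neighbors in the sequence serve as the input feature vector, and an SVM-based predictor is built to predict protein interaction sites. The prediction accuracy reaches 70.47%, with a correlation coefficient CC = 0.1919. The experimental results indicate that combining protein sequence profiles with an SVM is an effective way to predict protein interaction sites.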
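One common way to turn "the profiles of neighboring residues" into a per-residue input vector is a sliding window, sketched below in Python. The abstract does not give the window size or the treatment of chain termini, so `half_window` and the zero padding are assumptions, and `window_features` is a hypothetical name.

```python
def window_features(profile, half_window=3):
    """Build one feature vector per residue from a sequence profile.

    profile is a list of rows, one per residue, each row a list of
    numbers (e.g. 20 profile scores). The vector for residue i is the
    concatenation of rows i - half_window .. i + half_window. Positions
    beyond either terminus are zero-padded (an assumption).
    """
    if not profile:
        return []
    width = len(profile[0])
    pad = [0.0] * width
    features = []
    for i in range(len(profile)):
        vec = []
        for j in range(i - half_window, i + half_window + 1):
            vec.extend(profile[j] if 0 <= j < len(profile) else pad)
        features.append(vec)
    return features


# Tiny made-up profile with 3 residues and 2 columns, for illustration only.
toy_profile = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
toy_vectors = window_features(toy_profile, half_window=1)
```

Each resulting vector has `(2 * half_window + 1) * width` entries and could be passed to any SVM implementation together with a per-residue interface label.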

7.
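A protein structural class prediction method based on subsequence distribution   Total citations: 2, self-citations: 4, other citations: 2
The predictive power of a protein structural class prediction method depends mainly on two aspects: how much useful structural class information the sequence description contains, and how well the discriminant function exploits that information. Subsequence distribution is a new sequence description for structural class prediction. The generalized squared distance is the discriminant function of the component-coupled method and captures the coupling effects among the components of the sequence description. This paper improves the discriminant function of the component-coupled method so that it can handle the case of a non-invertible covariance matrix, which the original method cannot, and thereby combines subsequence distribution with a discriminant function that includes coupling effects. On the training set of 359 proteins (domains) selected by Chou et al., the self-consistency and jackknife tests gave 100% and 96.7%, respectively. These results are 5.6 and 12.6 percentage points higher than those of the component-coupled method, and 3.3 and 6.2 percentage points higher than those of the autocorrelation-function-based method.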

8.
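Protein secondary structure prediction based on association rules and a genetic algorithm   Total citations: 3, self-citations: 1, other citations: 2
By building a mathematical model for protein secondary structure prediction, the article predicts protein secondary structure with an association rule technique that combines data mining with a genetic algorithm, and designs and implements a prototype system. Experiments show that the prediction model, which is based on the periodicity of amino acid hydrophobicity, is more accurate, effective, and feasible than other related secondary structure prediction methods.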

9.
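Tang Dongming, Zhu Qingxin, Yang Fan, Chen Ke. Journal of Software (软件学报), 2011, 22(8): 1827-1837
An effective protein sequence clustering method is proposed, based on the affinity propagation clustering algorithm and a post-processing step. When clustering protein sequences, post-processing is applied to the output of affinity propagation to improve the quality of the clusters. To measure the similarity between protein sequences, an improved alignment-free method is given. Comparative experiments on six protein sequence datasets show that the proposed method can analyze protein sequences effectively.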

10.
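Computational experiments indicate that the primary structure of a protein carries information about its quaternary structure. This paper uses a support vector machine (SVM) to distinguish homodimers from non-homodimers starting from the primary structure. The subsequence distribution of the raw protein sequence is used as the SVM input vector, so that the information in the sequence is fully taken into account. With a subsequence length of 3, the overall prediction accuracy in 10-fold cross-validation reaches 84.9%, an improvement of 15.0% over the previous decision tree method on the same dataset. The experiments indicate that residue order plays an important role in recognizing homo-oligomeric proteins and that the SVM is a powerful tool for predicting protein quaternary structure.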

11.
Automatic evaluation of protein sequence functional patterns   Total citations: 1, self-citations: 0, other citations: 1
A procedure that automatically provides an evaluation of the diagnostic ability of a protein sequence functional pattern is described. The procedure relies on the identification of the closest definable set in terms of a (protein sequence) database functional annotation to the set of database instances containing a given pattern. Assuming annotation correctness and completeness in the protein sequence database, the degree of statistical association between these sets provides an appropriate measure of the diagnostic ability of the pattern. An experimental implementation of the procedure, using the NBRF/PIR protein database, has been applied to a diverse collection of published sequence patterns. Results obtained reveal that frequently it is not possible to define (in NBRF/PIR database terminology) the set of database instances containing a given pattern, suggesting either lack of pattern diagnostic ability or protein database annotation incompleteness and/or inconsistencies.

12.
The CLEF 2005 Automatic Medical Image Annotation Task   Total citations: 2, self-citations: 0, other citations: 2
In this paper, the automatic annotation task of the 2005 CLEF cross-language image retrieval campaign (ImageCLEF) is described. This paper focuses on the database used, the task setup, and the plans for further medical image annotation tasks in the context of ImageCLEF. Furthermore, a short summary of the results of 2005 is given. The automatic annotation task was added to ImageCLEF in 2005 and provides the first international evaluation of state-of-the-art methods for completely automatic annotation of medical images based on visual properties. The aim of this task is to explore and promote the use of automatic annotation techniques to allow for extracting semantic information from little-annotated medical images. A database of 10,000 images was established and annotated by experienced physicians, resulting in 57 classes, each with at least 10 images. Detailed analysis is done regarding the (i) image representation, (ii) classification method, and (iii) learning method. Based on the strong participation in the 2005 campaign, future benchmarks are planned.

13.
Current multimedia databases contain a wealth of information in the form of audiovisual as well as text data. Even though efficient search algorithms have been developed for either medium, there still exists the need for abstract presentation and summarization of the results of database users' queries. Moreover, multimedia retrieval systems should be capable of providing the user with additional information related to the specific subject of the query, as well as suggest other topics which could be identified to attract the interest of users with a similar profile. In this paper, we present solutions to these issues, giving as an example an integrated architecture we have developed, along with notions that support efficient and secure Internet access to audiovisual/video databases. Segmentation of each video into shots is followed by shot classification into a number of predetermined categories. Generation of users' profiles according to the categories, enhanced by relevance feedback, permits an efficient presentation of retrieved video shots or characteristic frames in terms of the user interest in them. Moreover, this clustering scheme assists the notion of lateral links that enable the user to continue retrieval with data of similar nature or content to those already returned. Furthermore, user groups are formed and modeled by registering actual preferences and practices. This enables the system to predict information that is possibly relevant to the user's interest and present it along with the returned results. The concepts utilized in this system can be smoothly integrated in MPEG-7 compatible multimedia database systems.

14.
Digital video databases have become more pervasive, and finding video clips quickly in large databases has become a major challenge. Due to the nature of video, accessing the contents of video is difficult and time-consuming. With content-based video systems today, there exists a significant gap between the user's information and what the system can deliver. Therefore, enabling intelligent means of interpretation of visual content, semantic annotation, and retrieval are important topics of research. In this paper, we consider semantic interpretation of the contents as annotation tags for video clips, giving a retrieval-driven and application-oriented semantics extraction, annotation, and retrieval model for a video database management system. This system design employs an algorithm on object relations, and it can reveal the defined semantics with fast real-time computation.

15.
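Building databases of basic land-use maps is an important part of land and resources informatization and of the "Digital Land" initiative. In the traditional process, the many annotations on the maps are mainly recognized by eye and entered into the database by hand, which slows database construction and makes database quality hard to guarantee. A technical workflow and method are therefore proposed for automatic recognition of numeric annotations on scanned maps and automatic entry of attributes into the database, and practical software has been designed and developed. Application to building the detailed land-use survey database of Baotou City shows that the approach is sound: the accuracy of automatic annotation recognition is above 87% and the accuracy of automatic attribute entry exceeds 90%, giving the method considerable practical value.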

16.
This paper targets the problem of automatic semantic indexing of news videos by presenting a video annotation and retrieval system which is able to perform automatic semantic annotation of news video archives and provide access to the archives via these annotations. The presented system relies on the video texts as the information source and exploits several information extraction techniques on these texts to arrive at representative semantic information regarding the underlying videos. These techniques include named entity recognition, person entity extraction, coreference resolution, and semantic event extraction. Apart from the information extraction components, the proposed system also encompasses modules for news story segmentation, text extraction, and video retrieval along with a news video database to make it a full-fledged system to be employed in practical settings. The proposed system is a generic one employing a wide range of techniques to automate the semantic video indexing process and to bridge the semantic gap between what can be automatically extracted from videos and what people perceive as the video semantics. Based on the proposed system, a novel automatic semantic annotation and retrieval system is built for Turkish and evaluated on a broadcast news video collection, providing evidence for its feasibility and convenience for news videos with a satisfactory overall performance.

17.
18.
Segmentation, video data modeling, and annotation are indispensable operations necessary for creating and populating a video database. To support such video databases, annotation data can be collected as metadata for the database and subsequently used for indexing and query evaluation. In this paper we describe the design and development of a video annotation engine, called Vane, intended to solve this problem as a domain-independent video annotation application. Using the Vane tool, the annotation of raw video data is achieved through metadata collection. This process, which is performed semi-automatically, produces tailored SGML documents whose purpose is to describe information about the video content. These documents constitute the metadatabase component of the video database. The video data model which has been developed for the metadata is as open as possible for multiple domain-specific applications. The tool is currently in use to annotate a video archive comprised of educational and news video content.

19.
Computers & Education, 2008, 50(4): 1308-1320
Annotation can be a valuable exercise when trying to understand new information. The technique can be used to create a 'condensed' version of the original information for later review and to add additional information into the existing document. The growth in web-based learning materials and information sources has created a requirement for systems that allow annotations to be attached to these new sources and, potentially, shared with other learners. This paper discusses annotation in an educational context and introduces some of the web annotation systems currently available. It also provides an overview of the development of a new system, eLAWS, by the authors, based upon the Web Service architecture. Finally, the paper provides suggestions for the future development of e-Learning Annotation tools.

20.
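Protein function is crucial for understanding the mechanisms of cellular and organismal activity and for studying disease mechanisms. With sequence databases growing rapidly, traditional experimental and sequence alignment methods cannot support large-scale protein function annotation. The EGNet (evolutionary graph network) model is therefore proposed. It encodes protein sequences with the pretrained protein language model ESM2 and one-hot encoding, and derives residue co-evolution information, PI (paired interaction) and SPI (strong paired interaction), through sequence self-attention and physical computation. The two kinds of evolutionary information and the sequence encoding are then fed into a multi-layer cascaded graph convolutional network that learns node features from the sequence encoding, enabling end-to-end protein function prediction. Compared with earlier methods, EGNet performs better on EC (Enzyme Commission) class labels in the ENZYME database, reaching an F-score of 0.89 and an AUPR of 0.91. The results indicate that EGNet obtains good results using only a single sequence to predict protein function, and can thus provide fast and effective protein function annotation.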
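Of the two sequence encodings named in this abstract, the one-hot part is simple enough to sketch in plain Python. The snippet below is only an illustration under stated assumptions: the residue alphabet and its ordering, the all-zero row for unknown residues, and the name `one_hot_encode` are not taken from the EGNet paper, and the ESM2 embedding and graph network are not reproduced.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed alphabet and column order
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Return one 20-element 0/1 row per residue of the sequence.

    Residues outside AMINO_ACIDS (e.g. 'X') become an all-zero row;
    this convention is an assumption for the sketch.
    """
    rows = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        if residue in INDEX:
            row[INDEX[residue]] = 1
        rows.append(row)
    return rows


# Toy input, purely illustrative.
encoded = one_hot_encode("ACX")
```

In a graph-based model, each row would typically become part of the feature vector of the node representing that residue.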
