Similar literature
20 similar articles found
1.
PRONUC is a menu-driven software package that gives molecular biologists access to a variety of tools for the analysis of protein and nucleic acid sequences. Features include various algorithms for sequence comparison, secondary structure prediction, sequence manipulation (translation, complementation, etc.) and finding restriction enzyme cut sites. The sequences under study can be retrieved from several databases of published sequences, entered by the user through a sequence editor, or retrieved from a database constructed by the user. PRONUC comes with a comprehensive manual and on-line help that reflect several years of user feedback, and it is available for Digital VAX computer systems running the VMS or micro-VMS operating system.

2.
Intrinsically disordered regions in proteins are relatively frequent and important for our understanding of molecular recognition and assembly, and of protein structure and function. From an algorithmic standpoint, flagging large disordered regions is also important for ab initio protein structure prediction methods. Here we first extract a curated, non-redundant data set of protein disordered regions from the Protein Data Bank and compute relevant statistics on the length and location of these regions. We then develop an ab initio predictor of disordered regions called DISpro, which uses evolutionary information in the form of profiles, predicted secondary structure and relative solvent accessibility, and ensembles of 1D-recursive neural networks. DISpro is trained and cross-validated using the curated data set. The experimental results show that DISpro achieves an accuracy of 92.8% with a false positive rate of 5%. DISpro is a member of the SCRATCH suite of protein data mining tools available through

3.
We consider the problem of on-line prediction competitive with a benchmark class of continuous but highly irregular prediction rules. It is known that if the benchmark class is a reproducing kernel Hilbert space, there exists a prediction algorithm whose average loss over the first N examples does not exceed the average loss of any prediction rule in the class plus a “regret term” of $O(N^{-1/2})$. The elements of some natural benchmark classes, however, are so irregular that these classes are not Hilbert spaces. In this paper we develop Banach-space methods to construct a prediction algorithm with a regret term of $O(N^{-1/p})$, where $p \in [2,\infty)$ and $p-2$ reflects the degree to which the benchmark class fails to be a Hilbert space. Only the square loss function is considered.
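Written out, the guarantee described in this abstract takes roughly the following form. The notation is an assumption introduced here for illustration and does not appear in the abstract: $x_n$ are the inputs, $y_n$ the outcomes, $\gamma_n$ the algorithm's predictions, and $D$ any prediction rule in the benchmark class $\mathcal{F}$.

```latex
\frac{1}{N}\sum_{n=1}^{N} \left(y_n - \gamma_n\right)^2
\;\le\;
\frac{1}{N}\sum_{n=1}^{N} \left(y_n - D(x_n)\right)^2
\;+\; O\!\left(N^{-1/p}\right),
\qquad D \in \mathcal{F},\quad p \in [2,\infty).
```

Setting $p = 2$ recovers the Hilbert-space rate $O(N^{-1/2})$; the squared differences reflect the abstract's restriction to the square loss.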

4.
《Computers & chemistry》1998,21(4):279-294
The preference functions method is described for prediction of membrane-buried helices in membrane proteins. The preference of an amino acid residue in a sequence for the α-helix conformation is a non-linear function of the average hydrophobicity of its sequence neighbors. Kyte–Doolittle hydropathy values are used to extract preference functions from a training data set of integral membrane proteins of partially known secondary structure. Preference functions for β-sheet, turn and undefined conformation are also extracted by including β-class soluble proteins of known structure in the training data set. Conformational preferences are compared for each residue in the tested sequence, and the predicted secondary structure is the one with the highest preference. This procedure is incorporated in an algorithm that performs accurate prediction of transmembrane helical segments. Correct sequence location and secondary structure of transmembrane segments are predicted for 20 of 21 reference membrane polypeptides with known crystal structure that were not included in the training data set. Comparison with hydrophobicity plots revealed that our preference profiles are more accurate and exhibit higher resolution and less noise. Shorter unstable or movable membrane-buried α-helices are also predicted to exist in different membrane proteins with transport function. For instance, in the sequences of voltage-gated ion channels and glutamate receptors, N-terminal parts of known P-segments can be located as characteristic α-helix preference peaks. Our e-mail server, predict@drava.etfos.hr, returns a preference profile and secondary structure prediction for a suspected or known membrane protein when its sequence is submitted.
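The input quantity this abstract relies on, the average Kyte–Doolittle hydropathy of a residue's sequence neighbors, can be sketched as below. This is only an illustration of the windowed average: the preference functions themselves are not given in the abstract and are not reproduced here. The function name `average_hydropathy`, the `half_window` parameter and its default of 3 are assumptions chosen for the example.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def average_hydropathy(sequence, half_window=3):
    """Return the mean hydropathy around each residue.

    The window spans `half_window` residues on each side (hypothetical
    default) and includes the residue itself; it is truncated at the
    sequence ends rather than padded.
    """
    values = [KD[residue] for residue in sequence.upper()]
    profile = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        profile.append(sum(window) / len(window))
    return profile


if __name__ == "__main__":
    # A hydrophobic stretch flanked by charged residues.
    for residue, score in zip("KKLLIVVLLKK", average_hydropathy("KKLLIVVLLKK")):
        print(residue, round(score, 2))
```

A method like the one described would then map each averaged value through a conformation-specific preference function; that mapping would have to come from the paper's training data.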

5.
The formation of protein secondary structure, especially in the regions of β-sheets, involves long-range interactions between amino acids. We propose a novel recurrent neural network architecture called the segmented-memory recurrent neural network (SMRNN) and present experimental results showing that SMRNN outperforms conventional recurrent neural networks on long-term dependency problems. In order to capture long-term dependencies in protein sequences for secondary structure prediction, we develop a predictor based on the bidirectional segmented-memory recurrent neural network (BSMRNN), which is a noncausal generalization of SMRNN. In comparison with the existing predictor based on the bidirectional recurrent neural network (BRNN), the BSMRNN predictor improves prediction performance, especially the recognition accuracy of β-sheets.

6.
《Information Systems》2001,26(3):143-163
The Web is rapidly becoming the platform through which many companies deliver services to businesses and individual customers. The number and type of on-line services increase day by day, and this trend is likely to continue at an even faster pace in the immediate future. Examples of e-services currently available include bill payment, delivery of customized news, or archiving and sharing of digital documents. E-services are typically delivered individually. However, the e-service market creates the opportunity for providing value-added, integrated services, which are delivered by composing existing e-services. To support organizations in pursuing this business opportunity we have developed eFlow, a system that supports the specification, enactment, and management of composite e-services, modeled as processes that are enacted by a service process engine. Composite e-services have to cope with a highly dynamic business environment in terms of services and of service providers. In addition, increased competition forces companies to provide customized services to better satisfy the needs of every individual customer. Ideally, service processes should be able to adapt transparently to changes in the environment and to the needs of different customers with minimal or no user intervention. In addition, it should be possible to dynamically modify service process definitions in a simple and effective way to manage cases where user intervention is indeed required. In this paper we show how eFlow achieves these goals.

7.
Modeling and predicting the structure of proteins is one of the most important challenges of computational biology. Exact physical models are too complex to provide feasible prediction tools, and other ab initio methods use only local and probabilistic information to fold a given sequence. We show in this paper that all-α transmembrane protein secondary and super-secondary structures can be modeled with a multi-tape S-attributed grammar. An efficient structure prediction algorithm using both local and global constraints is designed and evaluated. Comparison with existing methods shows that both the prediction rates and the definition level are noticeably increased. Furthermore, this approach can be generalized to more complex proteins.

8.
Accurate protein secondary structure prediction plays an important role in direct tertiary structure modeling, and can also significantly improve sequence analysis and sequence-structure threading for structure and function determination. Hence improving the accuracy of secondary structure prediction is essential for future developments throughout the field of protein research. In this article, we propose a mixed-modal support vector machine (SVM) method for predicting protein secondary structure. Using the evolutionary information contained in the physicochemical properties of each amino acid and a position-specific scoring matrix generated by a PSI-BLAST multiple sequence alignment as input for a mixed-modal SVM, secondary structure can be predicted at significantly increased accuracy. Using a Knowledge Discovery Theory based on the Inner Cognitive Mechanism (KDTICM) method, we have proposed a compound pyramid model, which is composed of three layers of intelligent interface that integrate a mixed-modal SVM (MMS) module, a modified Knowledge Discovery in Databases (KDD1) process, a mixed-modal back propagation neural network (MMBP) module and so on. Testing against data sets of non-redundant protein sequences returned values for the Q3 accuracy measure that ranged from 84.0% to 85.6%, while values for the SOV99 segment overlap measure ranged from 79.8% to 80.6%. When compared using a blind test dataset from the CASP8 meeting against currently available secondary structure prediction methods, our new approach shows superior accuracy. Availability: http://www.kdd.ustb.edu.cn/protein_Web/.

9.
Protein secondary structure prediction is one of the most important problems in bioinformatics. After more than 30 years of study, there have been some breakthroughs. In particular, the introduction of ensemble and hybrid prediction models has improved prediction accuracy, but inferring tertiary structures from secondary ones remains a distant goal. As an extension of KDTICM theory [Bingru, Yang (2004). Knowledge discovery based on theory of inner cognition mechanism and application. Beijing: Electronic Industry Press], this paper proposes KAAPRO, a method for protein secondary structure prediction based on the Maradbcm algorithm, which is derived from the KDD1 model and combined with CBA. A gradually enhanced, multi-layer systematic prediction model, the compound pyramid model, is also proposed; the kernel of this model is KAAPRO. Domain knowledge is used throughout the model, and the physical–chemical attributes are chosen by causal cellular automata. In the experiment, the test proteins used by Muggleton et al. (Muggleton, S. H., King, R., Sternberg, M. (1992). Protein secondary structure prediction using logic-based machine learning. Protein Engineering, 5(7), 647–657) are predicted. The structures of amino acids whose structural traits are obscure are predicted well by KAAPRO. Hence, the result of this model is satisfying too.

10.
《Computers & Geosciences》2006,32(9):1403-1410
A web service model for geophysical data manipulation, analysis and modeling based on a generalized data processing system was implemented. The service is not limited to any specific data type or operation and allows the user to combine about 190 tools of the existing package; new codes can easily be included. It allows remote execution of complex processing flows completely designed and controlled by remote clients who are presented with mirror images of the server processing environment. Clients are also able to upload their processing flows to the server, thereby building a knowledge base of processing expertise shared by the community. Flows in this knowledge base are currently represented by a hierarchy of automatically generated interactive web forms. These flows can be accessed and the resulting data retrieved either by using a web browser or through API calls from within the clients’ applications. The server administrator is thus relieved of the need to develop any content-specific data access mechanisms. The underlying processing system is fully developed and includes a graphical user interface, parallel processing capabilities, on-line documentation, an on-line software distribution service and automatic code updates. Currently, the service is installed on the University of Saskatchewan seismology web server (http://seisweb.usask.ca/SIA/ps.php) and maintains a library of processing examples (http://seisweb.usask.ca/temp/examples) including a number of useful web tools (such as UTM coordinate transformations, calculation of travel times of seismic waves in a global Earth model and generation of color palettes). Important potential applications of this web service model for building intelligent data queries, processing and modeling of global seismological data are also discussed.

11.
Protein domains are the structural and functional units of proteins. The ability to parse protein chains into different domains is important for protein classification and for understanding protein structure, function, and evolution. Here we use machine learning algorithms, in the form of recursive neural networks, to develop a protein domain predictor called DOMpro. DOMpro predicts protein domains using a combination of evolutionary information in the form of profiles, predicted secondary structure, and predicted relative solvent accessibility. DOMpro is trained and tested on a curated dataset derived from the CATH database. DOMpro correctly predicts the number of domains for 69% of the combined dataset of single and multi-domain chains. DOMpro achieves a sensitivity of 76% and specificity of 85% with respect to the single-domain proteins and a sensitivity of 59% and specificity of 38% with respect to the two-domain proteins. DOMpro also achieved a sensitivity and specificity of 71% each in the Critical Assessment of Fully Automated Structure Prediction 4 (CAFASP-4) (Fischer et al., 1999; Saini and Fischer, 2005) and was ranked among the top ab initio domain predictors. The DOMpro server, software, and dataset are available at http://www.igb.uci.edu/servers/psss.html.

12.
Precise prediction of protein secondary structures from the associated amino acid sequence is of great importance in bioinformatics and yet a challenging task for machine learning algorithms. As a major step toward predicting the ultimate three-dimensional structures, the secondary structure assignment specifies the protein function. Taking a multilayer perceptron neural network, pruned for optimum size of hidden layers, as the reference network, advanced kinds of recurrent neural network (RNN) are devised in this article to enhance secondary structure prediction. To better model the strong correlations between secondary structure elements, types of modular reciprocal recurrent neural networks (MRR-NN) are examined. Additionally, to take into account the long-range interactions between amino acids in the formation of the secondary structure, bidirectional RNNs are investigated. A multilayer bidirectional recurrent neural network (MBR-NN) is finally applied to capture the predominant long-term dependencies. Eventually, a modular prediction system based on the interactive combination of the MRR-NN and MBR-NN boosts the percentage accuracy (Q3) up to 76.91% and augments the segment overlap (SOV) up to 68.13% when tested on the PSIPRED dataset. The coupling effects of the secondary structure types as well as the sequential information of amino acids along the protein chain can be captured well by the integration of the MRR-NN and the MBR-NN.
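For readers unfamiliar with the Q3 measure quoted in this abstract, the sketch below computes it in its usual sense: the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The function name `q3_accuracy` and the string encoding of the labels are assumptions made for the example; the segment overlap (SOV) score is more involved and is not shown.

```python
def q3_accuracy(predicted, observed):
    """Percentage of positions where the predicted three-state label
    (e.g. 'H', 'E', 'C') matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


if __name__ == "__main__":
    # Hypothetical labels for a 12-residue fragment.
    observed = "CCHHHHHHCEEC"
    predicted = "CCHHHHHCCEEE"
    print(f"Q3 = {q3_accuracy(predicted, observed):.2f}%")
```

Because Q3 counts residues independently, it does not penalize fragmented segments, which is why it is commonly reported together with SOV.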

13.
This paper addresses the problem of constructing reliable interval predictors directly from observed data. Unlike standard predictor models, interval predictors return a prediction interval rather than a single prediction value. We show that, in a stationary and independent observations framework, the reliability of the model (that is, the probability that the future system output falls in the predicted interval) is guaranteed a priori by an explicit and non-asymptotic formula, with no further assumptions on the structure of the unknown mechanism that generates the data. This fact stems from a key result derived in this paper, which relates, at a fundamental level, the reliability of the model to its complexity and to the amount of available information (number of observed data).

14.
Mobile applications and services relying on mobility prediction have recently attracted considerable interest. In this paper, we propose mobility prediction based on cellular traces as an infrastructure-level service of the telecom cloud. Mobility Prediction as a Service (MPaaS) embeds mobility mining and forecasting algorithms into a cloud-based user location tracking framework. With MPaaS, hosted third-party and value-added services can benefit from online mobility prediction. In particular, we take Mobility-aware Personalization and Predictive Resource Allocation as key features to show how MPaaS drives a new style of mobile cloud applications. Due to the randomness of human mobility patterns, mobility prediction remains a very challenging task in MPaaS research. Our preliminary study observed collective behavioral patterns (CBP) in the mobility of crowds and proposed a CBP-based mobility predictor. The MPaaS system is equipped with a hybrid predictor that fuses the CBP-based scheme and a Markov-based predictor to provide the telecom cloud with large-scale mobility prediction capacity.

15.
We study an M/G/1 queueing system with a server that can be switched on and off. The server can take a vacation time T after the system becomes empty. In this paper, we investigate a randomized policy to control the server: when the system is empty, the server is switched off with probability p and takes a vacation, or is left on with probability (1 − p) and continues to serve the arriving customers. For this system, we consider the operating cost and the holding cost, where the operating cost consists of the system running and switching costs (start-up and shut-down costs). We describe the structure and characteristics of this policy and solve a constrained problem to minimize the average operating cost per unit time under a constraint on the holding cost per unit time.

16.
The paper presents a method of interactive construction of global Hidden Markov Models (HMMs) based on local sequence patterns discovered in data. The method is based on finding interesting sequences whose frequency in the database differs from that predicted by the model. The patterns are then presented to the user, who updates the model using their intelligence and their understanding of the modelled domain. It is demonstrated that such an approach leads to more understandable models than automated approaches. Two variants of the problem are considered: mining patterns occurring only at the beginning of sequences and mining patterns occurring at any position; both are practically meaningful. For each variant, algorithms have been developed allowing for efficient discovery of all sequences with a given minimum interestingness. Applications to modelling webpage visitors' behavior and to modelling protein secondary structure are presented, validating the proposed approach.

17.
A new missing-data algorithm, ARFIL, gives good results in spectral estimation. The log likelihood of a multivariate Gaussian random variable can always be written as a sum of conditional log likelihoods. For a complete set of autoregressive AR(p) data, the best predictor in the likelihood requires only p previous observations. If observations are missing, the best AR predictor in the likelihood will in general include all previous observations. Using only those observations that fall within a finite time interval will approximate this likelihood. The resulting non-linear estimation algorithm requires no user-provided starting values. In various simulations, the spectral accuracy of robust maximum likelihood methods was much better than the accuracy of other spectral estimates for randomly missing data.
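The statement that a complete AR(p) series needs only p previous observations for its best predictor can be illustrated with a one-step prediction. This is a minimal sketch of that complete-data case only, not of ARFIL or its missing-data likelihood. It assumes the sign convention x_t = a_1 x_{t-1} + ... + a_p x_{t-p} + e_t with known coefficients; the name `ar_predict` is hypothetical.

```python
def ar_predict(history, coefficients):
    """One-step AR(p) prediction from the last p observations.

    Assumes x_t = a_1*x_{t-1} + ... + a_p*x_{t-p} + e_t, where
    `coefficients` is [a_1, ..., a_p] and `history` is ordered
    oldest first. Observations older than p steps are ignored.
    """
    p = len(coefficients)
    if len(history) < p:
        raise ValueError("need at least p previous observations")
    # Most recent observation first: x_{t-1}, x_{t-2}, ..., x_{t-p}.
    recent = history[len(history) - p:][::-1]
    return sum(a * x for a, x in zip(coefficients, recent))


if __name__ == "__main__":
    # Hypothetical AR(2) coefficients and a short complete series.
    series = [0.4, 1.1, 0.9, 1.6]
    print(ar_predict(series, [0.5, 0.25]))
```

When some of those p observations are missing, the conditional expectation generally depends on older samples as well, which is the situation the abstract's finite-interval approximation addresses.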

18.
Different cost models of allocation and rearrangement of a set {F1, F2, …, Fn} of files are investigated in the paper. It is assumed that the distribution p1(t) of the reference string ξ(t) depends on the user's activity. For large values of n a limiting process is used and a continuous rearrangement model is introduced, in which the integral formulas are simpler to handle than the summation formulas of the discrete case. Some open problems could be solved with the help of this treatment. The connection with an order-statistical treatment is found and used for a simple user activity model. The formulas can be used to approximate the average head movement, for different distributions, in optimal deterministic file allocation problems. Deterministic and stochastic strategies of allocation and rearrangement are studied and compared.

19.
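This paper introduces an instant messaging application with a client/server (C/S) architecture, designed for use within a campus local area network. Chat messages are forwarded by the server, so every user who is currently online can receive them; private chat between two users and other functions are also implemented. The software consists of a server program and a client program. The server supports viewing server information, changing the administrator password, and similar functions; the client supports login, registration, editing personal information, chat, and similar functions.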

20.
One of the main research problems in structural bioinformatics is the prediction of three-dimensional (3-D) structures of polypeptides or proteins. The rate at which amino acid sequences are identified is increasing much faster than the rate of 3-D protein structure determination by experimental methods, such as X-ray diffraction and NMR techniques. The determination of protein structures is both experimentally expensive and time consuming. Predicting the correct 3-D structure of a protein molecule is an intricate and arduous task. The protein structure prediction (PSP) problem is, in computational complexity theory, an NP-complete problem. In order to reduce computing time, current efforts have targeted hybridizations between ab initio and knowledge-based methods aiming at efficient prediction of the correct structure of polypeptides. In this article we present a hybrid method for the 3-D protein structure prediction problem. An artificial neural network knowledge-based method that predicts approximate 3-D protein structures is combined with an ab initio strategy. Molecular dynamics (MD) simulation is used to refine the approximate 3-D protein structures. In the refinement step, global interactions between each pair of atoms in the molecule (including non-bond interactions) are evaluated. The developed MD protocol enables us to correct deviations in the polypeptide torsion angles of the predicted structures and improve their stereo-chemical quality. The results show that the time to predict native-like 3-D structures is considerably reduced. We test our computational strategy with four mini proteins whose sizes vary from 19 to 34 amino acid residues. The structures obtained at the end of 32.0 nanoseconds (ns) of MD simulation were topologically comparable to their corresponding experimental structures.
