Similar Documents
A total of 20 similar documents were found (search time: 15 ms).
1.
The separation of pathological discontinuous adventitious sounds (DAS) from vesicular sounds (VS) is of great importance to the analysis of lung sounds, since DAS are related to certain pulmonary pathologies. An automated way of revealing the diagnostic character of DAS by isolating them from VS, based on their nonstationarity, is presented in this paper. The proposed algorithm combines multiresolution analysis with hard thresholding in order to compose a wavelet transform-based stationary-nonstationary filter (WTST-NST). Applying the WTST-NST filter to fine/coarse crackles and squawks, selected from three lung sound databases, the coherent structure of DAS is revealed and they are separated from VS. When compared to other separation tools, the WTST-NST filter performed more accurately, objectively, and with lower computational cost. Due to its simple implementation, it can easily be used in clinical medicine.
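The core idea (multiresolution wavelet decomposition followed by hard thresholding of the detail coefficients) can be illustrated with the minimal Python sketch below. It uses PyWavelets on a synthetic signal with an assumed robust threshold rule; it is not the authors' iterative WTST-NST implementation, and the signal, wavelet choice, and threshold are illustrative assumptions.

```python
import numpy as np
import pywt

# Synthetic lung-sound-like signal: a slowly varying "vesicular" component (VS)
# plus short transient "crackles" (DAS) -- purely illustrative data.
fs = 4000                                      # sampling rate (Hz), assumed
t = np.arange(0, 1.0, 1.0 / fs)
vs = 0.3 * np.sin(2 * np.pi * 150 * t) * np.hanning(t.size)
das = np.zeros_like(t)
for pos in (800, 1900, 3100):                  # crackle onsets (sample indices)
    das[pos:pos + 40] += np.hanning(40) * np.sin(2 * np.pi * 600 * t[:40])
x = vs + das

# Multiresolution analysis: discrete wavelet decomposition.
coeffs = pywt.wavedec(x, 'db8', level=5)

# Hard-threshold the detail coefficients: large coefficients are attributed
# to the nonstationary (DAS) part, the remainder to the stationary (VS) part.
das_coeffs = [np.zeros_like(coeffs[0])]        # drop approximation for DAS
vs_coeffs = [coeffs[0]]                        # keep approximation for VS
for d in coeffs[1:]:
    thr = 3.0 * np.median(np.abs(d)) / 0.6745  # robust noise-scale rule (assumed)
    kept = pywt.threshold(d, thr, mode='hard')
    das_coeffs.append(kept)
    vs_coeffs.append(d - kept)

das_est = pywt.waverec(das_coeffs, 'db8')[:x.size]
vs_est = pywt.waverec(vs_coeffs, 'db8')[:x.size]
print(f"energy split  DAS: {np.sum(das_est**2):.3f}  VS: {np.sum(vs_est**2):.3f}")
```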

2.
The paper presents a hybrid soft computing system for mining of complex construction databases. The proposed approach hybridizes soft computing techniques, such as fuzzy logic, artificial neural networks (ANNs), and messy genetic algorithms (mGAs), to form a novel computational method for mining of human understandable knowledge from historical databases. The hybridization combines the merits of explicit knowledge representation of fuzzy logic decision-making systems, learning abilities of ANNs, and global search of mGAs. A hybrid soft computing system (HSCS) is developed for mining complex databases in construction with three characteristics: scarcity, incompleteness, and uncertainty. Real-world construction data repositories are selected to test the capabilities of the proposed HSCS for data-mining under the above-mentioned complex conditions. The testing results show the promising potential of the proposed HSCS for mining of complex databases in construction.

3.
As the construction industry is adapting to new computer technologies in terms of hardware and software, computerized construction data are becoming increasingly available. The explosive growth of many business, government, and scientific databases has begun to far outpace our ability to interpret and digest the data. Such volumes of data clearly overwhelm the traditional methods of data analysis such as spreadsheets and ad-hoc queries. The traditional methods can create informative reports from data, but cannot analyze the contents of those reports. A significant need exists for a new generation of techniques and tools with the ability to automatically assist humans in analyzing the mountains of data for useful knowledge. Knowledge discovery in databases (KDD) and data mining (DM) are tools that allow identification of valid, useful, and previously unknown patterns so that the construction manager may analyze the large amount of construction project data. These technologies combine techniques from machine learning, artificial intelligence, pattern recognition, statistics, databases, and visualization to automatically extract concepts, interrelationships, and patterns of interest from large databases. This paper presents the necessary steps such as (1) identification of problems, (2) data preparation, (3) data mining, (4) data analysis, and (5) refinement process required for the implementation of KDD. In order to test the feasibility of the proposed approach, a prototype of the KDD system was developed and tested with a construction management database, RMS (Resident Management System), provided by the U. S. Corps of Engineers. In this paper, the KDD process was applied to identify the cause(s) of construction activity delays. However, its possible applications can be extended to identify cause(s) of cost overrun and quality control/assurance among other construction problems. Predictable patterns may be revealed in construction data that were previously thought to be chaotic.
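To make the data-mining step (step 3) concrete, here is a minimal Python sketch that induces an interpretable decision tree from a small, made-up table of construction activities. The column names, the toy data, and the use of pandas/scikit-learn are illustrative assumptions; they are not the RMS database or the prototype described above.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical, hand-made records of construction activities (not RMS data).
df = pd.DataFrame({
    "weather":      ["rain", "clear", "rain", "clear", "rain", "clear"],
    "supplier_lag": [5, 1, 7, 0, 6, 2],        # days of material delivery lag
    "crew_size":    [4, 8, 5, 9, 3, 7],
    "delayed":      [1, 0, 1, 0, 1, 0],        # 1 = activity finished late
})

# Data preparation (step 2): encode the categorical attribute.
X = pd.get_dummies(df[["weather", "supplier_lag", "crew_size"]])
y = df["delayed"]

# Data mining (step 3): induce an interpretable decision tree.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Data analysis (step 4): inspect the extracted rules.
print(export_text(tree, feature_names=list(X.columns)))
```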

4.
We have created databases and software applications for the analysis of DNA mutations in the human p53 gene, the human hprt gene and the rodent transgenic lacZ locus. The databases themselves are stand-alone dBase files and the software for analysis of the databases runs on IBM-compatible computers. The software created for these databases permits filtering, ordering, report generation and display of information in the database. In addition, a significant number of routines have been developed for the analysis of single base substitutions. One method of obtaining the databases and software is via the World Wide Web (WWW). Open home page http://sunsite.unc.edu/dnam/mainpage.html with a WWW browser. Alternatively, the databases and programs are available via public ftp from anonymous@sunsite.unc.edu. There is no password required to enter the system. The databases and software are found in subdirectory pub/academic/biology/dna-mutations. Two other programs are available at the WWW site, a program for comparison of mutational spectra and a program for entry of mutational data into a relational database.
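As a rough illustration of the kind of single base substitution analysis such routines perform, the following Python sketch tallies a mutational spectrum from a hypothetical list of substitutions. The records and the strand-collapsing convention are assumptions for illustration, not the distributed software's data format.

```python
from collections import Counter

# Hypothetical single base substitutions: (wild-type base, mutant base).
substitutions = [("G", "A"), ("C", "T"), ("G", "T"), ("A", "G"),
                 ("C", "A"), ("G", "A"), ("T", "C"), ("C", "T")]

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def base_pair_class(wt, mut):
    """Collapse strand-equivalent changes, e.g. G->A and C->T are both G:C->A:T."""
    if wt in ("A", "G"):                      # report with the purine on the left
        return f"{wt}:{COMPLEMENT[wt]}->{mut}:{COMPLEMENT[mut]}"
    return f"{COMPLEMENT[wt]}:{wt}->{COMPLEMENT[mut]}:{mut}"

spectrum = Counter(base_pair_class(wt, mut) for wt, mut in substitutions)
total = sum(spectrum.values())
for change, n in spectrum.most_common():
    print(f"{change:14s} {n:2d}  ({100 * n / total:.0f}%)")
```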

5.
6.
We have created databases and software applications for the analysis of DNA mutations at the human p53 gene, the human hprt gene and both the rodent transgenic lacI and lacZ loci. The databases themselves are stand-alone dBASE files and the software for analysis of the databases runs on IBM-compatible computers with Microsoft Windows. Each database has a separate software analysis program. The software created for these databases permits the filtering, ordering, report generation and display of information in the database. In addition, a significant number of routines have been developed for the analysis of single base substitutions. One method of obtaining the databases and software is via the World Wide Web. Open the following home page with a Web browser: http://sunsite.unc.edu/dnam/mainpage.html. Alternatively, the databases and programs are available via public FTP from: anonymous@sunsite.unc.edu. There is no password required to enter the system. The databases and software are found beneath the subdirectory: pub/academic/biology/dna-mutations. Two other programs are available at the site, a program for comparison of mutational spectra and a program for entry of mutational data into a relational database.

7.
The Georgia Department of Transportation's Multi-modal Transportation Planning Tool (MTPT) facilitates multimodal planning in rural areas. Using open databases that are available agencywide, the tool can aid in the analysis of transportation requirements of rural areas, identify potential implementation constraints early in the planning process, and develop a prioritized project list by mode for an analysis region. The MTPT addresses highways, transit, intercity bus, commuter and passenger rail, aviation, and bicycles. An integrated geographic information system plays an important role in the presentation of the results. This paper discusses the development of the MTPT and describes program functionality. The paper will be of particular interest to state transportation agencies interested in using statewide databases for multimodal planning purposes. The techniques described identify how data that are typically collected and maintained for an entire state (e.g., traffic volumes, posted speeds, designated bike routes, roadway functional classes, crash information, and county-based socioeconomic data) can be combined with field-verified default factors, widely accepted planning and analysis methods, and additional regionally calibrated planning algorithms to perform system-level planning at the city, county, multicounty, or state levels.

8.
Molecular biology data are distributed among multiple databases. Although containing related data, these databases are often isolated and are characterized by various degrees of heterogeneity: they usually represent different views (schemas) of the scientific domain and are implemented using different data management systems. Currently, several systems support managing data in heterogeneous molecular biology databases. Lack of clear criteria for characterizing such systems precludes comprehensive evaluations of these systems or determining their relationships in terms of shared goals and facilities. In this paper, we propose criteria that would facilitate characterizing, evaluating, and comparing heterogeneous molecular biology database systems.

9.
The Gene Expression Database (GXD) is a community resource that stores and integrates expression information for the laboratory mouse, with a particular emphasis on mouse development, and makes these data freely available in formats appropriate for comprehensive analysis. GXD is implemented as a relational database and integrated with the Mouse Genome Database (MGD) to enable global analysis of genotype, expression and phenotype information. Interconnections with sequence databases and with databases from other species further extend GXD's utility for the analysis of gene expression data. GXD is available through the Mouse Genome Informatics Web Site at http://www.informatics.jax.org/

10.
Literature analysis is important for the identification of the state of existing knowledge and prevailing research gaps. Effective literature analysis, however, is a lengthy process and requires a large effort to consider the information from different viewpoints and to identify areas of cross benefits. This paper presents an approach to summarizing the information related to the construction and infrastructure domains by using a category of visual tools referred to as mind maps. First, the capabilities of various knowledge mapping tools for graphically representing the hierarchical concepts (keywords) of a given domain of knowledge are discussed, and example mind maps are developed for the infrastructure asset management domain. Enhancements to mind maps are then proposed on the basis of an extensive literature analysis to visually show numerical scores of various publications associated with the concepts in the mind map. This facilitates the identification of the highly relevant and the most useful knowledge in the literature. Suggestions are then presented for the use of mind maps by publishers of large literature databases to facilitate better analysis and the visual access and retrieval of information from the literature.

11.
Genome informatics is not only a new area of computer science for genome projects but also a new approach to life science. As the genome projects proceed, genome informatics is becoming more important to bio-industry as well as life science. The major subjects are as follows: (1) database technologies for the integration of various kinds of biological data; (2) knowledge discovery from the integrated databases; (3) interpretation and analysis of DNA sequence data with the databases; and (4) computer technologies for simulating life systems with knowledge extracted from the databases, in order to check the validity of the knowledge. In this article, the history and trends in the development of computer technologies for these subjects are described, except for computer simulation.

12.
We compared the exon/intron organization of vertebrate genes belonging to different isochore classes, as predicted by their GC content at third codon position. Two main features have emerged from the analysis of sequences published in GenBank: (1) genes coding for long proteins (i.e., ≥500 aa) are almost two times more frequent in GC-poor than in GC-rich isochores; (2) intervening sequences (= sum of introns) are on average three times longer in GC-poor than in GC-rich isochores. These patterns are observed among human, mouse, rat, cow, and even chicken genes and are therefore likely to be common to all warm-blooded vertebrates. Analysis of Xenopus sequences suggests that the same patterns exist in cold-blooded vertebrates. It could be argued that such results do not reflect the reality because sequence databases are not representative of entire genomes. However, analysis of biases in GenBank revealed that the observed discrepancies between GC-rich and GC-poor isochores are not artifactual, and are probably largely underestimated. We investigated the distribution of microsatellites and interspersed repeats in introns of human and mouse genes from different isochores. This analysis confirmed previous studies showing that L1 repeats are almost absent from GC-rich isochores. Microsatellites and SINEs (Alu, B1, B2) are found at roughly equal frequencies in introns from all isochore classes. Globally, the presence of repeated sequences does not account for the increased intron length in GC-poor isochores. The relationships between gene structure and global genome organization and evolution are discussed.
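For readers unfamiliar with the measure, the short Python sketch below shows how GC content at third codon positions (GC3) can be computed for an in-frame coding sequence. The example sequence and the GC3 cut-off used to label isochore classes are assumptions for illustration, not values from the study.

```python
def gc3(coding_sequence: str) -> float:
    """GC content (%) at third codon positions of an in-frame coding sequence."""
    seq = coding_sequence.upper()
    third = seq[2::3]                                   # every third base, frame +1
    return 100.0 * sum(b in "GC" for b in third) / len(third)

# Hypothetical in-frame coding sequence (not from GenBank).
cds = "ATGGCCGGCTTAGATCAACGGTTTGCAGCGTAA"
value = gc3(cds)
# Illustrative (assumed) cut-off: call the gene GC-rich if GC3 exceeds 60%.
label = "GC-rich isochore (predicted)" if value > 60 else "GC-poor isochore (predicted)"
print(f"GC3 = {value:.1f}%  ->  {label}")
```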

13.
14.
Large-scale DNA sequencing is creating a sequence infrastructure of great benefit to protein biochemistry. Concurrent with the application of large-scale DNA sequencing to whole genome analysis, mass spectrometry has attained the capability to rapidly, and with remarkable sensitivity, determine weights and amino acid sequences of peptides. Computer algorithms have been developed to use the two different types of data generated by mass spectrometers to search sequence databases. When a protein is digested with a site-specific protease, the molecular weights of the resulting collection of peptides, the mass map or fingerprint, can be determined using mass spectrometry. The molecular weights of the set of peptides derived from the digestion of a protein can then be used to identify the protein. Several different approaches have been developed. Protein identification using peptide mass mapping is an effective technique when studying organisms with completed genomes. A second method is based on the use of data created by tandem mass spectrometers. Tandem mass spectra contain highly specific information in the fragmentation pattern as well as sequence information. This information has been used to search databases of translated protein sequences as well as nucleotide databases such as expressed sequence tag (EST) sequences. The ability to search nucleotide databases is an advantage when analyzing data obtained from organisms whose genomes are not yet completed, but a large amount of expressed gene sequence is available (e.g., human and mouse). Furthermore, a strength of using tandem mass spectra to search databases is the ability to identify proteins present in fairly complex mixtures.
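A minimal Python sketch of the first approach (peptide mass mapping) follows: theoretical tryptic peptide masses are computed for a candidate protein and compared against observed masses. The protein fragment, the "observed" masses, and the mass tolerance are made-up values for illustration; only the standard monoisotopic residue masses and the usual trypsin cleavage rule are taken as given.

```python
import re

# Monoisotopic residue masses (Da); peptide mass = sum(residues) + water.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_peptides(protein: str):
    """Cleave after K or R, except when followed by P (standard trypsin rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def peptide_mass(peptide: str) -> float:
    return sum(RESIDUE[aa] for aa in peptide) + WATER

# Hypothetical candidate protein fragment and observed peptide masses (made up).
protein = "MKWVTFISLLFLFSSAYSRGVFRRDAHK"
observed = [2036.08, 477.27]
tolerance = 0.5                                   # Da, assumed instrument accuracy

for pep in tryptic_peptides(protein):
    m = peptide_mass(pep)
    hit = any(abs(o - m) <= tolerance for o in observed)
    flag = "  <-- matches an observed mass" if hit else ""
    print(f"{pep:20s} {m:9.2f}{flag}")
```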

15.
16.
An 800 bp repeated DNA sequence of the wild ram (Ovis ammon) has been cloned. Blot hybridization, in situ hybridization, sequencing, and computer analysis were used for the sequence analysis. It was shown that the cloned DNA belongs to the 1.714 g/cm3 repeated satellite DNA family. Fourteen highly homologous sequences were revealed in the nucleotide sequence databases. An analysis of their alignment revealed the presence of two subfamilies (A and B). The average divergence of subfamily A sequences (including the wild ram repeated sequence) from the consensus is about 1%.

17.
OBJECTIVES: Our objectives were to identify and define a minimum set of variables for interventional cardiology that carried the most statistical weight for predicting adverse outcomes. Though "gaming" cannot be completely avoided, variables were to be as objective as possible and reproducible and had to be predictive of outcome in current databases. BACKGROUND: Outcomes of percutaneous coronary interventions depend on patient risk characteristics and disease severity and acuity. Comparing results of interventions has been difficult because definitions of similar variables differ in databases, and variables are not uniformly tracked. Identifying the best predictor variables and standardizing their definitions are a first step in developing a universal stratification instrument. METHODS: A list of empirically derived variables was first tested in eight cardiac databases (158,273 cases). Three end points (in-hospital death, in-hospital coronary artery bypass graft surgery, Q wave myocardial infarction) were chosen for analysis. Univariate and multivariate regression models were used to quantify the predictive value of the variable in each database. The variables were then defined by consensus by a panel of experts. RESULTS: In all databases patient demographics were similar, but disease severity varied greatly. The most powerful predictors of adverse outcome were measures of hemodynamic instability, disease severity, demographics and comorbid conditions in both univariate and multivariate analyses. CONCLUSIONS: Our analysis identified 29 variables that have the strongest statistical association with adverse outcomes after coronary interventions. These variables were also objectively defined. Incorporation of these variables into every cardiac dataset will provide uniform standards for data collected. Comparisons of outcomes among physicians, institutions and databases will therefore be more meaningful.
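As a hedged illustration of the multivariate modelling step, the Python sketch below fits a logistic regression to a tiny, made-up dataset and reports odds ratios for each candidate predictor. The variable names, data, and the use of scikit-learn are assumptions; this does not reproduce the registries or the models analyzed in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up records: [cardiogenic_shock, age_over_75, multivessel_disease], outcome.
X = np.array([
    [1, 1, 1], [0, 0, 0], [1, 0, 1], [0, 1, 0],
    [0, 0, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0],
])
y = np.array([1, 0, 1, 0, 0, 1, 0, 0])          # 1 = in-hospital adverse event

# Multivariate model: each coefficient reflects a variable's adjusted effect.
model = LogisticRegression().fit(X, y)
names = ["cardiogenic_shock", "age_over_75", "multivessel_disease"]
for name, coef in zip(names, model.coef_[0]):
    print(f"{name:20s} odds ratio = {np.exp(coef):.2f}")

# Risk estimate for a hypothetical patient profile.
print(f"predicted risk (shock + multivessel disease): "
      f"{model.predict_proba([[1, 0, 1]])[0, 1]:.2f}")
```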

18.
Dissected tissue pieces of the pituitary pars intermedia from the amphibian Xenopus laevis were directly subjected to matrix-assisted laser desorption/ionization (MALDI) mass analysis. The obtained MALDI peptide profile revealed both previously known and unexpected processing products of the proopiomelanocortin gene. Mass spectrometric peptide sequencing of a few of these neuropeptides was performed by employing MALDI combined with postsource decay (PSD) fragment ion mass analysis. The potential of MALDI-PSD for sequence analysis of peptides directly from unfractionated tissue samples was examined for the first time for the known desacetyl-alpha-MSH-NH2 and the presumed vasotocin neuropeptide. In addition, the sequence of an unknown peptide which was present in the pars intermedia tissue sample at mass 1392.7 u was determined. The MALDI-PSD mass spectrum of precursor ion 1392.7 u contained sufficient structural information to uniquely identify the sequence by searching protein sequence databases. The determined amino acid sequence corresponds to the vasotocin peptide with a C-terminal extension of Gly-Lys-Arg ("vasotocinyl-GKR"), indicating incomplete processing of the vasotocin precursor protein in the pituitary pars intermedia of X. laevis. Both vasotocin and vasotocinyl-GKR are nonlinear peptides containing a disulfide (S-S) bridge between two cysteine residues. Interpretation of the spectra of these two peptides reveals three different forms of characteristic fragment ions of the cysteine side chain: peptide-CH2-SH (regular mass of Cys-containing fragment ions), peptide-CH2-S-SH (regular mass + 32 u) and peptide=CH2 (regular mass - 34 u) due to cleavage on either side of the sulfur atoms.

19.
20.
The most widely used signal in clinical practice is the ECG. ECG conveys information regarding the electrical function of the heart, by altering the shape of its constituent waves, namely the P, QRS, and T waves. Thus, the required tasks of ECG processing are the reliable recognition of these waves, and the accurate measurement of clinically important parameters derived from the temporal distribution of the ECG constituent waves. In this paper, we shall review some current trends in ECG pattern recognition. In particular, we shall review non-linear transformations of the ECG, the use of principal component analysis (linear and non-linear), ways to map the transformed data into n-dimensional spaces, and the use of neural network (NN) based techniques for ECG pattern recognition and classification. The problems we shall deal with are QRS/PVC recognition and classification, the recognition of ischemic beats and episodes, and the detection of atrial fibrillation. Finally, a generalised approach to the classification problems in n-dimensional spaces will be presented using, among others, NN, radial basis function network (RBFN) and non-linear principal component analysis (NLPCA) techniques. The performance measures of sensitivity and specificity of these algorithms will also be presented, using training and testing data sets drawn from the MIT-BIH and the European ST-T databases.
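As a small illustration of the performance measures mentioned, the following Python sketch computes sensitivity and specificity from a hypothetical beat classifier's output. The labels are made up, and no particular network or database split from the paper is implied.

```python
import numpy as np

# Hypothetical beat labels: 1 = ischemic/abnormal beat, 0 = normal beat.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 0, 1, 1, 1, 0])   # classifier output (made up)

tp = np.sum((y_pred == 1) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

sensitivity = tp / (tp + fn)     # fraction of abnormal beats correctly detected
specificity = tn / (tn + fp)     # fraction of normal beats correctly rejected
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```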
