Similar Articles
1.
In microprocessors, instruction value prediction has received growing attention as a way to break the data-flow limit and extract higher instruction-level parallelism, and a variety of value predictor designs have been proposed. These predictors can achieve high performance, but there is still considerable room for research on cost-effective designs. This paper proposes a value predictor based on linear functions that achieves a good trade-off between performance and hardware cost. Simulation results on the SPEC CINT95 benchmark suite show that, compared with a complex hybrid value predictor combining stride-based and two-level components, the linear-function-based value predictor loses only a small amount of performance.
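The core idea of a stride-style linear-function value predictor can be sketched in a few lines. This is an illustrative toy, not the paper's design: each static instruction (keyed by its PC) tracks its last value and last observed stride, and the prediction is the linear extrapolation `last + stride`.

```python
# Toy sketch of a linear-function (stride) value predictor.
# Table entries and the warm-up behavior are illustrative assumptions.

class LinearValuePredictor:
    def __init__(self):
        self.table = {}  # pc -> (last_value, stride)

    def predict(self, pc):
        last, stride = self.table.get(pc, (0, 0))
        return last + stride  # linear extrapolation

    def update(self, pc, actual):
        last, _ = self.table.get(pc, (actual, 0))
        self.table[pc] = (actual, actual - last)

pred = LinearValuePredictor()
hits = 0
for value in [3, 5, 7, 9, 11]:  # arithmetic sequence, stride 2
    if pred.predict(0x40) == value:
        hits += 1
    pred.update(0x40, value)
print(hits)  # two warm-up misses, then every prediction hits
```

After the stride is learned from the first two outcomes, every later value in the sequence is predicted correctly.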

2.
Loan fraud is a critical factor in the insolvency of financial institutions, so companies make an effort to reduce losses from fraud by building models for proactive fraud prediction. However, two critical problems remain to be resolved in fraud detection: (1) the lack of cost sensitivity between type I and type II errors in most prediction models, and (2) the highly skewed class distribution of the datasets used for fraud detection, because fraud-related data are sparse. The objective of this paper is to examine whether classification cost is affected both by the cost-sensitive approach and by the skewed class distribution. To that end, we compare the classification cost incurred by a traditional cost-insensitive classification approach and by two cost-sensitive classification approaches, Cost-Sensitive Classifier (CSC) and MetaCost. Experiments were conducted with a credit loan dataset from a major financial institution in Korea, while varying the class distribution of the dataset and the number of input variables. The experiments showed that the lowest classification cost was incurred when the MetaCost approach was used and when non-fraud data and fraud data were balanced. In addition, the dataset that includes all delinquency variables was shown to be the most effective in reducing the classification cost.
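The decision rule underlying cost-sensitive approaches such as CSC can be sketched directly (the cost values below are illustrative, not the paper's): given estimated class probabilities and a cost matrix, predict the class with the minimum expected misclassification cost rather than the most probable class.

```python
# Minimal sketch of a cost-sensitive decision rule.
# COST[i][j] = cost of predicting class i when the true class is j.
# Missing a fraud (type II error) is assumed far costlier than a false alarm.
COST = [[0.0, 10.0],   # predict non-fraud: costly if truly fraud
        [1.0,  0.0]]   # predict fraud: small cost if truly non-fraud

def min_cost_class(probs):
    """probs[j] = estimated probability that the true class is j."""
    expected = [sum(COST[i][j] * probs[j] for j in range(len(probs)))
                for i in range(len(COST))]
    return min(range(len(expected)), key=expected.__getitem__)

# With only a 20% fraud probability, the asymmetric costs still make
# "fraud" (class 1) the cheaper prediction: 0.2*10 = 2.0 > 0.8*1 = 0.8.
print(min_cost_class([0.8, 0.2]))
```

A cost-insensitive classifier would predict class 0 here; the asymmetric cost matrix flips the decision, which is exactly the effect the paper measures.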

3.
Intrusion detection is a necessary step to identify unusual access or attacks and secure internal networks. In general, intrusion detection can be approached by machine learning techniques. In the literature, advanced techniques based on hybrid learning or ensemble methods have been considered, and related work has shown that they are superior to models using a single machine learning technique. This paper proposes a hybrid learning model based on triangle-area-based nearest neighbors (TANN) to detect attacks more effectively. In TANN, k-means clustering is first used to obtain cluster centers corresponding to the attack classes. Then, the area of the triangle formed by two cluster centers and one data point from the given dataset is calculated to form a new feature signature of that data point. Finally, a k-NN classifier is used to classify similar attacks based on the new feature representation of triangle areas. Using KDD-Cup '99 as the simulation dataset, the experimental results show that TANN can effectively detect intrusion attacks, providing higher accuracy and detection rates, and a lower false alarm rate, than three baseline models based on support vector machines, k-NN, and a hybrid centroid-based classification model combining k-means and k-NN.
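The TANN feature construction described above can be sketched concretely (assuming 2-D points and Euclidean geometry for illustration): for each pair of cluster centers, the area of the triangle they form with the data point becomes one component of the new feature vector that the k-NN classifier consumes.

```python
# Sketch of the triangle-area feature transform (2-D toy example).
from itertools import combinations

def triangle_area(p, a, b):
    # half the magnitude of the cross product of the two edge vectors
    return abs((a[0]-p[0])*(b[1]-p[1]) - (b[0]-p[0])*(a[1]-p[1])) / 2.0

def tann_features(point, centers):
    # one triangle area per pair of cluster centers
    return [triangle_area(point, a, b) for a, b in combinations(centers, 2)]

centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]  # e.g. k-means centers, one per class
print(tann_features((1.0, 1.0), centers))  # [2.0, 1.5, 2.5]
```

With k attack classes there are k(k-1)/2 center pairs, so each data point is re-represented as a k(k-1)/2-dimensional vector of areas before classification.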

4.
Mobile applications and services relying on mobility prediction have recently attracted considerable interest. In this paper, we propose mobility prediction based on cellular traces as an infrastructure-level service of the telecom cloud. Mobility Prediction as a Service (MPaaS) embeds mobility mining and forecasting algorithms into a cloud-based user location tracking framework. By empowering MPaaS, hosted third-party and value-added services can benefit from online mobility prediction. In particular, we take Mobility-aware Personalization and Predictive Resource Allocation as key features to elaborate how MPaaS drives a new style of mobile cloud applications. Due to the randomness of human mobility patterns, mobility prediction remains a very challenging task in MPaaS research. Our preliminary study observed collective behavioral patterns (CBP) in the mobility of crowds, and proposed a CBP-based mobility predictor. The MPaaS system employs a hybrid predictor, fusing the CBP-based scheme with a Markov-based predictor, to provide the telecom cloud with large-scale mobility prediction capacity.
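The Markov-based component of such a hybrid predictor is easy to sketch (the CBP scheme itself is beyond an abstract-level example): an order-1 Markov model predicts the next cell as the most frequent successor of the current cell in the trace history.

```python
# Toy order-1 Markov next-location predictor; cell names are illustrative.
from collections import Counter, defaultdict

class MarkovMobilityPredictor:
    def __init__(self):
        self.transitions = defaultdict(Counter)  # cell -> successor counts

    def train(self, trace):
        for here, nxt in zip(trace, trace[1:]):
            self.transitions[here][nxt] += 1

    def predict(self, current):
        succ = self.transitions.get(current)
        return succ.most_common(1)[0][0] if succ else None

mp = MarkovMobilityPredictor()
mp.train(["home", "office", "cafe", "office", "home", "office", "cafe"])
print(mp.predict("office"))  # "cafe" follows "office" most often in the trace
```

A production predictor would use higher-order context and fall back gracefully on unseen cells; this sketch only shows the counting core.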

5.
The aim of this paper is to propose a new hybrid data mining model based on a combination of various feature selection and ensemble learning classification algorithms, in order to support the decision making process. The model is built in several stages. In the first stage, the initial dataset is preprocessed and, apart from applying different preprocessing techniques, we paid particular attention to feature selection. Five different feature selection algorithms were applied, and their results, based on the ROC and accuracy measures of a logistic regression algorithm, were combined using different voting types. We also propose a new voting method, called if_any, that outperformed all other voting methods as well as the results of every single feature selection algorithm. In the next stage, four different classification algorithms, namely a generalized linear model, a support vector machine, naive Bayes and a decision tree, were applied to the dataset obtained in the feature selection process. These classifiers were combined into eight different ensemble models using a soft voting method. Using a real dataset, the experimental results show that the hybrid model based on features selected by the if_any voting method together with the ensemble GLM + DT model achieves the best performance and outperforms all other ensemble and single classifier models.
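The if_any vote is described only by name in the abstract; the sketch below assumes it keeps a feature whenever at least one selector retains it (i.e., the union of the selections), contrasted with an ordinary majority vote.

```python
# Assumed semantics of if_any voting over feature selection results:
# keep a feature if ANY selector picked it (union of the selections).
from collections import Counter

def vote_if_any(selections):
    return set().union(*selections)

def vote_majority(selections):
    counts = Counter(f for sel in selections for f in sel)
    return {f for f, c in counts.items() if c > len(selections) / 2}

# three hypothetical feature selection algorithms and their picks
selections = [{"age", "income"}, {"income", "debt"}, {"income"}]
print(sorted(vote_if_any(selections)))    # every feature any selector chose
print(sorted(vote_majority(selections)))  # only features most selectors chose
```

The union vote is the most permissive combination, which matches the reported behavior of retaining more useful features than any single selector alone.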

6.
There is wide agreement that one of the most significant impediments to the performance of current and future pipelined superscalar processors is the presence of conditional branches in the instruction stream. Speculative execution is one solution to the branch problem, but speculative work is discarded if a branch is mispredicted. To be effective, speculative execution requires a very accurate branch predictor; 95% accuracy is not good enough. This paper proposes branch classification, a methodology for building more accurate branch predictors. Branch classification allows an individual branch instruction to be associated with the branch predictor best suited to predict its direction. Using this approach, a hybrid branch predictor can be constructed such that each component predictor predicts those branches for which it is best suited. To demonstrate the usefulness of branch classification, an example classification scheme is given, and a new hybrid predictor based on this scheme is built, which achieves a higher prediction accuracy than any branch predictor previously reported in the literature.
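A toy hybrid predictor in the spirit of branch classification can be sketched as follows. The class assignment is an assumption for illustration: strongly biased branches are routed to an always-taken component, and the rest to a 2-bit saturating counter.

```python
# Toy branch-classification hybrid: per-branch routing to the component
# predictor suited to that branch's class (classes assumed for illustration).

class TwoBitCounter:
    def __init__(self):
        self.state = 2  # weakly taken
    def predict(self):
        return self.state >= 2
    def update(self, taken):
        self.state = min(3, self.state + 1) if taken else max(0, self.state - 1)

class HybridPredictor:
    def __init__(self, branch_class):
        self.branch_class = branch_class  # pc -> "biased" or "mixed"
        self.counters = {}

    def predict(self, pc):
        if self.branch_class.get(pc) == "biased":
            return True  # always-taken component for strongly biased branches
        return self.counters.setdefault(pc, TwoBitCounter()).predict()

    def update(self, pc, taken):
        if self.branch_class.get(pc) != "biased":
            self.counters[pc].update(taken)

hp = HybridPredictor({0x10: "biased", 0x20: "mixed"})
correct = 0
for taken in [True, True, False, True]:
    correct += hp.predict(0x20) == taken
    hp.update(0x20, taken)
print(correct)  # 3 of 4 outcomes predicted correctly by the counter
```

Real schemes classify branches by profiled taken rate and use far stronger components (e.g. two-level adaptive predictors); the routing idea is the same.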

7.
We propose a two-stage phone duration modelling scheme, which can be applied for the improvement of prosody modelling in speech synthesis systems. This scheme builds on a number of independent feature constructors (FCs) employed in the first stage, and a phone duration model (PDM) which operates on an extended feature vector in the second stage. The feature vector, which acts as input to the first stage, consists of numerical and non-numerical linguistic features extracted from text. The extended feature vector is obtained by appending the phone duration predictions estimated by the FCs to the initial feature vector. Experiments on the American-English KED TIMIT and on the Modern Greek WCL-1 databases validated the advantage of the proposed two-stage scheme, improving prediction accuracy over the best individual predictor, and over a two-stage scheme which just fuses the first-stage outputs. Specifically, when compared to the best individual predictor, a relative reduction in the mean absolute error and the root mean square error of 3.9% and 3.9% on the KED TIMIT, and of 4.8% and 4.6% on the WCL-1 database, respectively, is observed.
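The extended-feature-vector idea is a form of stacking and can be sketched with synthetic data. Everything below is a toy (the paper's FCs and PDM are far richer): two crude first-stage predictors emit duration estimates, those estimates are appended to the input features, and a linear second-stage model is fit on the extended vectors.

```python
# Toy two-stage stacking sketch with synthetic durations (ms).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                        # toy "linguistic" features
true_dur = X @ np.array([20.0, -5.0, 3.0]) + 80.0   # synthetic target durations

# two crude first-stage feature constructors (hypothetical)
fc1 = X[:, 0] * 18.0 + 82.0
fc2 = X @ np.array([19.0, -4.0, 0.0]) + 79.0

# second stage: append FC predictions, then fit a linear PDM by least squares
X_ext = np.column_stack([X, fc1, fc2])
X1 = np.column_stack([X_ext, np.ones(len(X))])      # add intercept column
coef, *_ = np.linalg.lstsq(X1, true_dur, rcond=None)
pred = X1 @ coef

mae = float(np.abs(pred - true_dur).mean())
print(mae)  # essentially zero on this noiseless synthetic data
```

On real data the second stage cannot fit exactly, but it can weight each FC by where it is reliable, which is why the two-stage scheme beats the best individual predictor.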

8.
The prediction of business failure is an important and challenging issue that has served as the impetus for many academic studies over the past three decades. This paper proposes a hybrid manifold learning model that combines the isometric feature mapping (ISOMAP) algorithm and support vector machines (SVM) to predict the failure of firms based on past financial performance data. The ISOMAP algorithm is first used to perform dimensionality reduction, serving as a preprocessor that improves the business failure prediction capability of the SVM. To create a benchmark, we further compare principal component analysis (PCA) combined with SVM against our proposed hybrid approach. Analytic results demonstrate that our hybrid approach not only has the best classification rate, but also produces the lowest incidence of type II errors, and is capable of achieving improved predictive accuracy and of providing guidance for decision makers to detect and prevent potential financial crises in the early stages.

9.
This paper presents a novel face recognition method by means of fusing color, local spatial and global frequency information. Specifically, the proposed method fuses the multiple features derived from a hybrid color space, the Gabor image representation, the local binary patterns (LBP), and the discrete cosine transform (DCT) of the input image. The novelty of this paper is threefold. First, a hybrid color space, the RCrQ color space, is constructed by combining the R component image of the RGB color space and the chromatic component images, Cr and Q, of the YCbCr and YIQ color spaces, respectively. The RCrQ hybrid color space, whose component images possess complementary characteristics, enhances the discriminating power for face recognition. Second, three effective image encoding methods are proposed for the component images in the RCrQ hybrid color space to extract features: (i) a patch-based Gabor image representation for the R component image, (ii) a multi-resolution LBP feature fusion scheme for the Cr component image, and (iii) a component-based DCT multiple face encoding for the Q component image. Finally, at the decision level, the similarity matrices generated using the three component images in the RCrQ hybrid color space are fused using a weighted sum rule. Experiments on the Face Recognition Grand Challenge (FRGC) version 2 Experiment 4 show that the proposed method improves face recognition performance significantly. In particular, the proposed method achieves the face verification rate (ROC III curve) of 92.43%, at the false accept rate of 0.1%, compared to the FRGC baseline performance of 11.86% face verification rate at the same false accept rate.
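The decision-level fusion step can be sketched directly (the weights below are illustrative, not the paper's): the similarity matrices computed from the R, Cr and Q component images are combined by a weighted sum before matching.

```python
# Weighted-sum fusion of per-channel similarity matrices (toy 2x2 gallery).
import numpy as np

def fuse_similarities(mats, weights):
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize the fusion weights
    return sum(wi * m for wi, m in zip(w, mats))

sim_r  = np.array([[1.0, 0.2], [0.2, 1.0]])   # similarity from R channel
sim_cr = np.array([[1.0, 0.6], [0.6, 1.0]])   # from Cr channel
sim_q  = np.array([[1.0, 0.1], [0.1, 1.0]])   # from Q channel
fused = fuse_similarities([sim_r, sim_cr, sim_q], [2, 1, 1])
print(fused[0, 1])  # 0.5*0.2 + 0.25*0.6 + 0.25*0.1 = 0.275
```

In practice the weights would be tuned on a validation set so that the more discriminative channels dominate the fused score.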

10.
11.
This paper presents a novel method for the diagnosis of heart disease. The proposed method is based on a hybrid of fuzzy weighted pre-processing and an artificial immune recognition system (AIRS). The artificial immune recognition system has shown effective performance on several problems, such as machine learning benchmark problems and medical classification problems like breast cancer, diabetes and liver disorders classification. The robustness of the proposed method is examined using classification accuracy, k-fold cross-validation and a confusion matrix. The obtained classification accuracy is 96.30%, which is very promising compared to previously reported classification techniques.

12.
This study addresses the need for operational models in view of rapidly advancing in situ sensor technology that puts lakes into online surveillance mode. A model ensemble for simulating plankton community dynamics in Lake Kinneret (Israel) from 1988 to 1999 has been induced from electronically-measurable predictor variables (EMPV), such as water temperature, pH, turbidity, electrical conductivity and dissolved oxygen, by the hybrid evolutionary algorithm HEA. It predicts, in a cascade, the total nitrogen to total phosphorus ratio TN/TP, the concentrations of chlorophyta, bacillariophyta, cyanophyta and dinophyta, as well as the densities of rotifera, cladocera and copepoda, solely from EMPV. The best coefficients of determination (r2) were achieved by the dinophyta model with 0.6, the rotifera model with 0.45 and the bacillariophyta model with 0.44. The worst coefficients of determination (r2) were produced by the cladocera model with 0.24 and the TN/TP model with 0.28. Despite the differences in the r2 values, and apart from the cladocera model, the remaining models matched reasonably well the seasonal and interannual plankton dynamics observed over 11 years in Lake Kinneret. The model ensemble developed by HEA also revealed ecological thresholds and relationships determining plankton community dynamics in Lake Kinneret solely based on in situ predictor variables.

13.
The uniaxial compressive strength (UCS) of rock is crucial for any type of project constructed in or on a rock mass. The test conducted to measure the UCS of rock is expensive, time-consuming and subject to sample restrictions. For this reason, the UCS of rock may be estimated using simple rock tests such as the point load index (Is(50)), Schmidt hammer (Rn) and p-wave velocity (Vp) tests. To estimate the UCS of granitic rock as a function of relevant rock properties such as Rn, Vp and Is(50), rock cores were collected from the face of the Pahang–Selangor fresh water tunnel in Malaysia. Afterwards, 124 samples were prepared and tested in accordance with the relevant standards, and a dataset was obtained. This dataset was then used to estimate the UCS of rock via three non-linear prediction tools, namely non-linear multiple regression (NLMR), an artificial neural network (ANN) and an adaptive neuro-fuzzy inference system (ANFIS). After running these models, and considering several performance indices, including the coefficient of determination (R2), variance account for and root mean squared error, together with a simple ranking procedure, the models were examined and the best prediction model was selected. An R2 of 0.951 on the testing dataset suggests the superiority of the ANFIS model, while these values are 0.651 and 0.886 for the NLMR and ANN techniques, respectively. The results indicate that the ANFIS model can predict the UCS of rocks with higher capacity than the others. However, while the developed model may be useful at a preliminary design stage, it should be used with caution and only for the specified rock types.

14.
K-nearest neighbor and Bayesian methods are effective machine learning methods, and expectation maximization is an effective Bayesian classifier. In this work a data elimination approach is proposed to improve data clustering. The proposed method is based on hybridizing the k-nearest neighbor and expectation maximization algorithms. The k-nearest neighbor algorithm acts as a preprocessor for the expectation maximization algorithm, removing the training data that make learning difficult. The suggested method is tested on the well-known machine learning data sets iris, wine, breast cancer, glass and yeast. Simulations are carried out in the MATLAB environment and performance results are reported.
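The data-elimination idea can be sketched as follows, with details assumed for illustration: a training point is discarded when the majority of its k nearest neighbors carry a different label, since such points blur the clusters that expectation maximization later fits.

```python
# Assumed k-NN editing rule: keep a point only if most of its k nearest
# neighbors agree with its label (toy 2-D data, Euclidean distance).
import math

def knn_filter(points, labels, k=3):
    keep = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), labels[j])
                       for j, q in enumerate(points) if j != i)
        votes = [lab for _, lab in dists[:k]]
        if votes.count(labels[i]) > k // 2:   # neighbor majority agrees
            keep.append(i)
    return keep

points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (0.5, 0.5)]
labels = ["a", "a", "a", "b", "b", "b", "b"]  # last point sits in the "a" cluster
print(knn_filter(points, labels))  # the stray point (index 6) is eliminated
```

The surviving points form cleaner clusters, which is the stated goal of running k-NN before the EM step.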

15.
In this paper, we consider the scheduling of a multi-item, single-stage production-inventory system in the presence of uncertainty regarding demand patterns, production times and switchover times. For a given specification of base-stock levels for individual items, and under an (S − 1, S) request-for-replenishment policy, a mathematical program to minimize long-run average system-wide costs is formulated. We derive approximations for the first two moments of demand over the lead time using residual service analysis of vacation queue models. Subsequently, we develop an approximate convex program for the original cost model and determine optimal production frequencies for the individual item types. Based on these relative frequencies, we determine a table size and devise an efficient heuristic to construct a tabular sequence in which individual items appear according to their respective absolute frequencies and are positioned such that the variance of their inter-visit times is minimized. A numerical study demonstrating the effectiveness of the proposed policy against cyclic policies is given.

16.
In this paper, we develop a prediction model based on a support vector machine (SVM) with a hybrid feature selection method to predict the trend of stock markets. The proposed hybrid feature selection method, named F-score and Supported Sequential Forward Search (F_SSFS), combines the advantages of filter methods and wrapper methods to select the optimal feature subset from the original feature set. To evaluate the prediction accuracy of this SVM-based model combined with F_SSFS, we compare its performance with that of a back-propagation neural network (BPNN), along with three commonly used feature selection methods, namely information gain, symmetrical uncertainty and correlation-based feature selection, via a paired t-test. A grid search with 5-fold cross-validation is used to find the best parameter values for the SVM kernel function. In this study, we show that the SVM outperforms the BPNN on the problem of stock trend prediction. In addition, our experimental results show that the proposed SVM-based model combined with F_SSFS achieves the highest accuracy and generalization performance in comparison with the other three feature selection methods. With these results, we claim that SVM combined with F_SSFS can serve as a promising addition to existing stock trend prediction methods.
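The filter half of F_SSFS, the F-score criterion, has a standard closed form for binary classes and is easy to sketch: features whose between-class separation is large relative to their within-class scatter score higher.

```python
# F-score filter criterion for one feature (binary-class form as commonly
# defined for feature selection; the toy data are illustrative).
import numpy as np

def f_score(x, y):
    """x: 1-D feature values, y: binary labels (0/1)."""
    pos, neg = x[y == 1], x[y == 0]
    num = (pos.mean() - x.mean()) ** 2 + (neg.mean() - x.mean()) ** 2
    den = pos.var(ddof=1) + neg.var(ddof=1)   # within-class scatter
    return num / den

y = np.array([0, 0, 0, 1, 1, 1])
informative = np.array([1.0, 1.1, 0.9, 5.0, 5.1, 4.9])  # classes well separated
noisy = np.array([1.0, 5.0, 3.0, 1.1, 4.9, 3.2])        # classes overlap
print(f_score(informative, y) > f_score(noisy, y))
```

F_SSFS would rank all features by this score, then apply a sequential forward wrapper search over the top-ranked candidates.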

17.
Breast cancer has become a leading cause of death in women around the world. An accurate and interpretable method is necessary for diagnosing breast cancer patients and planning effective treatment. Many ensemble methods, such as Random Forest, have been widely applied to breast cancer diagnosis and are capable of achieving high accuracy. However, they are black-box methods that are unable to explain the reasons behind a diagnosis. To surmount this limitation, a rule extraction method named improved Random Forest (RF)-based rule extraction (IRFRE) is developed to derive accurate and interpretable classification rules from a decision tree ensemble for breast cancer diagnosis. First, a number of decision tree models are constructed using Random Forest to generate an abundant pool of decision rules. A rule extraction approach is then devised to detach decision rules from the trained trees. Finally, an improved multi-objective evolutionary algorithm (MOEA) is employed to search for an optimal rule predictor whose constituent rule set is the best trade-off between accuracy and interpretability. The developed method is evaluated on three breast cancer data sets, i.e., the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, the Wisconsin Original Breast Cancer (WOBC) dataset, and the Surveillance, Epidemiology and End Results (SEER) breast cancer dataset. The experimental results demonstrate that the developed method can effectively explain the black-box methods and outperforms several popular single algorithms, ensemble learning methods, and rule extraction methods in terms of accuracy and interpretability. Moreover, the proposed method can be extended to other cancer diagnoses in practice, providing an option for a more interpretable, more accurate cancer diagnosis process.

18.
Facial action units (AUs) can be represented spatially, temporally, and in terms of their correlation. Previous research focuses on one or another of these aspects or addresses them disjointly. We propose a hybrid network architecture that jointly models spatial and temporal representations and their correlation. In particular, we use a Convolutional Neural Network (CNN) to learn spatial representations, and a Long Short-Term Memory (LSTM) network to model temporal dependencies among them. The outputs of the CNNs and LSTMs are aggregated into a fusion network to produce per-frame predictions of multiple AUs. The hybrid network was compared to previous state-of-the-art approaches on two large FACS-coded video databases, GFT and BP4D, with over 400,000 AU-coded frames of spontaneous facial behavior in varied social contexts. Relative to a standard multi-label CNN and feature-based state-of-the-art approaches, the hybrid system reduced person-specific biases and obtained increased accuracy for AU detection. To address class imbalance within and between batches during network training, we introduce multi-labeling sampling strategies that further increase accuracy when AUs are relatively sparse. Finally, we provide visualizations of the learned AU models, which, to the best of our knowledge, reveal for the first time how machines see AUs.

19.
As many structures of protein–DNA complexes have been determined in past years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One reason is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein–DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. The SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and in independent testing, the second SVM model, which used both DNA and protein sequence data, performed better than the first model, which used DNA sequence data only.
To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone.
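Since the abstract does not give the exact SVM input features, the sketch below shows one plausible sequence encoding built around nucleotide triplets: each overlapping triplet in the DNA sequence is counted over the 64 possible triplets, yielding a fixed-size numeric vector an SVM can consume.

```python
# Hypothetical triplet-count encoding of a DNA sequence (the paper's
# actual feature construction is not specified in the abstract).
from itertools import product

TRIPLETS = ["".join(t) for t in product("ACGT", repeat=3)]  # all 64 triplets
INDEX = {t: i for i, t in enumerate(TRIPLETS)}

def encode_triplets(seq):
    vec = [0] * len(TRIPLETS)
    for i in range(len(seq) - 2):
        vec[INDEX[seq[i:i + 3]]] += 1   # count each overlapping triplet
    return vec

v = encode_triplets("ACGTAC")  # triplets: ACG, CGT, GTA, TAC
print(sum(v), v[INDEX["ACG"]])
```

A per-nucleotide predictor would typically encode a window centered on the target nucleotide rather than the whole sequence; the counting core is the same.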

20.
Information Fusion, 2009, 10(3): 217-232
Protein secondary structure prediction is still a challenging problem today. Even though a number of prediction methods have been presented in the literature, the various prediction tools available on-line produce results whose quality is not always fully satisfactory. Therefore, a user has to know which predictor to use for a given protein. In this paper, we propose a server implementing a method to improve accuracy in protein secondary structure prediction. The method is based on integrating the prediction results computed by several available on-line prediction tools to obtain a combined prediction of higher quality. Given an input protein p whose secondary structure is to be predicted, and a group of proteins F whose secondary structures are known, the server works according to a two-phase approach: (i) it selects a set of predictors good at predicting the secondary structure of the proteins in F (and therefore, supposedly, that of p as well), and (ii) it integrates the prediction results delivered for p by the selected team of prediction tools. By exploiting our system, the user is thus relieved of the burden of selecting the most appropriate predictor for the given input protein while being, at the same time, assured that a prediction at least as good as the best available one will be delivered. The correctness of the resulting prediction is measured with reference to the EVA accuracy parameters used in several editions of CASP.
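The integration phase can be illustrated with a per-residue majority vote over the selected predictors' outputs (the server's actual combination rule may be more sophisticated than this sketch):

```python
# Per-residue majority vote over secondary structure strings
# (H = helix, E = strand, C = coil); predictor outputs are toy examples.
from collections import Counter

def combine_predictions(predictions):
    """predictions: equal-length structure strings, one per selected tool."""
    return "".join(Counter(col).most_common(1)[0][0]
                   for col in zip(*predictions))

preds = ["HHEEC",   # predictor 1
         "HHEEE",   # predictor 2
         "HHECC"]   # predictor 3
print(combine_predictions(preds))  # "HHEEC": each column's majority label
```

A consensus of this kind is never worse than a randomly chosen predictor and often better, which is the intuition behind integrating the selected team's outputs.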

