Similar literature
1.
In this study, we propose a set of new algorithms to improve classification of the 5-year survivability of breast cancer patients from a massive, imbalanced data set. The proposed classifiers combine the synthetic minority oversampling technique (SMOTE) with particle swarm optimization (PSO) and integrate well-known classifiers such as logistic regression, the C5 decision tree (C5) model, and 1-nearest neighbor search. To assess the effectiveness of this new set of classifiers, the g-mean and accuracy are used as performance indices; moreover, the proposed classifiers are compared with those in the previous literature. Experimental results show that the hybrid SMOTE + PSO + C5 algorithm is the best of all the algorithm combinations for classifying the 5-year survivability of breast cancer patients. We conclude that implementing SMOTE with appropriate search algorithms such as PSO and classifiers such as C5 can significantly improve the effectiveness of classification for massive imbalanced data sets.
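The abstract names the g-mean and accuracy as its performance indices without defining them. As a minimal sketch, assuming the usual imbalanced-learning definition of the g-mean (the geometric mean of sensitivity and specificity, which the abstract itself does not spell out), the two indices can be computed from confusion-matrix counts. The function names and the counts in the usage note are hypothetical.

```python
import math


def accuracy(tp, fn, tn, fp):
    """Fraction of all cases that are classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity.

    Assumed definition: sqrt(TPR * TNR). It stays low whenever
    either class is poorly recognised, unlike plain accuracy.
    """
    sensitivity = tp / (tp + fn)  # true positive rate (minority class)
    specificity = tn / (tn + fp)  # true negative rate (majority class)
    return math.sqrt(sensitivity * specificity)
```

With hypothetical counts tp=10, fn=90, tn=900, fp=0, accuracy is 0.91 while the g-mean is only about 0.32, which illustrates why accuracy alone can be misleading on imbalanced data.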

2.
A multilayer perceptron (MLP) trained with the back-propagation learning algorithm takes a large amount of computation time. The complexity of the network increases as the number of layers and the number of nodes per layer increase. Further, it is very difficult to decide a priori the number of nodes in a layer and the number of layers in the network required to solve a problem. In this paper an improved particle swarm optimization (IPSO) is used to train the functional link artificial neural network (FLANN) for classification, and we name it ISO-FLANN. In contrast to the MLP, FLANN has less architectural complexity, is easier to train, and may give more insight into the classification problem. Further, we rely on the global classification capabilities of IPSO to explore the entire weight space, which is plagued by a host of local optima. Using the functionally expanded features, FLANN overcomes the non-linear nature of problems. We believe that combining FLANN and IPSO (IPSO + FLANN = ISO-FLANN) and harnessing their best attributes can give rise to a robust classifier. An extensive simulation study is presented to show the effectiveness of the proposed classifier. Results are compared with MLP, support vector machine (SVM) with radial basis function (RBF) kernel, FLANN with gradient descent learning, and fuzzy swarm net (FSN).

3.
This study investigated the effects of upstream stations' flow records on the performance of artificial neural network (ANN) models for predicting daily watershed runoff. As a comparison, a multiple linear regression (MLR) analysis was also examined using various statistical indices. Five streamflow measuring stations on the Cahaba River, Alabama, were selected as case studies. Two different ANN models, a multilayer feed-forward neural network using the Levenberg–Marquardt learning algorithm (LMFF) and a radial basis function (RBF) network, were introduced in this paper. These models were then used to forecast one-day-ahead streamflows. Correlation analysis was applied to determine the architecture of each ANN model in terms of input variables. Several statistical criteria (RMSE, MAE and coefficient of correlation) were used to check the model accuracy against the observed data by means of the K-fold cross-validation method. Additionally, residual analysis was applied to the model results. The comparison revealed that using upstream records could significantly increase the accuracy of ANN and MLR models in predicting daily streamflows (by around 30%). The comparison of the prediction accuracy of both ANN models (LMFF and RBF) and the linear regression method indicated that the ANN approaches were more accurate than the MLR in predicting streamflow dynamics. The LMFF model improved the average root mean square error (RMSEave) and average mean absolute percentage error (MAPEave) of the multiple linear regression forecasts by about 18% and 21%, respectively. Although the RBF model performed better in predicting the highest range of flow rate (flood events, RMSEave/RBF = 26.8 m³/s vs. RMSEave/LMFF = 40.2 m³/s), in general the results suggested that the LMFF method was somewhat superior to the RBF method in predicting watershed runoff (RMSE/LMFF = 18.8 m³/s vs. RMSE/RBF = 19.2 m³/s). Finally, statistical differences between measured and predicted medians were evaluated using the Mann-Whitney test, and differences in variances were evaluated using Levene's test.
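The error criteria named in this abstract (RMSE, MAE, MAPE) have standard textbook definitions, so a small sketch may help in reading the reported figures. This is not the authors' code; the function names are hypothetical and the definitions are the conventional ones, which the abstract does not restate.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the units of the data (e.g. m^3/s)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error, in the units of the data."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def mape(observed, predicted):
    """Mean absolute percentage error; observed values must be non-zero."""
    n = len(observed)
    return 100.0 * sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / n
```

RMSE penalises large misses more heavily than MAE, which is consistent with the abstract reporting a separate RMSE for flood events, where large errors dominate.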

4.
This paper presents a novel adaptive cuckoo search (ACS) algorithm for optimization. The step size is made adaptive using knowledge of the fitness function value and the current position in the search space. The other important feature of the ACS algorithm is its speed: it is faster than the CS algorithm. Here, an attempt is made to make the cuckoo search (CS) algorithm parameter-free, without a Levy step. The proposed algorithm is validated using twenty-three standard benchmark test functions. The second part of the paper proposes an efficient face recognition algorithm using ACS, principal component analysis (PCA) and intrinsic discriminant analysis (IDA). The proposed algorithms are named PCA + IDA and ACS–IDA. Interestingly, PCA + IDA offers a perturbation-free algorithm for dimension reduction, while ACS + IDA is used to find the optimal feature vectors for classification of the face images based on the IDA. For the performance analysis, we use three standard face databases: YALE, ORL, and FERET. A comparison of the proposed method with state-of-the-art methods reveals the effectiveness of our algorithm.

5.
A new architecture for intelligent audio emotion recognition is proposed in this paper. It fully utilizes both prosodic and spectral features in its design. It has two main paths in parallel and can recognize 6 emotions. Path 1 is designed based on intensive analysis of different prosodic features. Significant prosodic features are identified to differentiate emotions. Path 2 is designed based on analysis of spectral features. Extraction of Mel-Frequency Cepstral Coefficient (MFCC) features is followed by Bi-directional Principal Component Analysis (BDPCA), Linear Discriminant Analysis (LDA) and Radial Basis Function (RBF) neural classification. This path has a structure of 3 parallel BDPCA + LDA + RBF sub-paths, each of which handles two emotions. Fusion modules are also proposed for weight assignment and decision making. The performance of the proposed architecture is evaluated on the eNTERFACE'05 and RML databases. Simulation results and comparisons reveal good performance of the proposed recognizer.

6.
3-D Networks-on-Chip (NoCs) have been proposed as a potent solution to both the interconnection and design complexity problems facing future System-on-Chip (SoC) designs. In this paper, two topology-aware multicast routing algorithms for 3-D NoCs are proposed: Multicasting XYZ (MXYZ) and Alternative XYZ (AL + XYZ). In essence, MXYZ is a simple dimension-order multicast routing algorithm that targets 3-D NoC systems built upon regular topologies. To support multicast routing in irregular regions, AL + XYZ can be applied, where an alternative output channel is sought to forward or replicate the packets whenever the output channel determined by MXYZ is not available. To evaluate the performance of MXYZ and AL + XYZ, extensive experiments have been conducted comparing MXYZ and AL + XYZ against a path-based multicast routing algorithm and an irregular-region-oriented multiple unicast routing algorithm, respectively. The experimental results confirm that the proposed MXYZ and AL + XYZ schemes, respectively, have lower latency and power consumption than the other two routing algorithms, making the two proposed algorithms more suitable for supporting multicasting in 3-D NoC systems. In addition, the hardware implementation cost of AL + XYZ is shown to be quite modest.

7.
Computer Networks, 2003, 41(1): 73-88
To provide real-time service or engineer constraint-based paths, networks require the underlying routing algorithm to be able to find low-cost paths that satisfy given quality-of-service constraints. However, the problem of constrained shortest (least-cost) path routing is known to be NP-hard, and some heuristics have been proposed to find a near-optimal solution. These heuristics either impose relationships among the link metrics to reduce the complexity of the problem, which may limit their general applicability, or are too costly in terms of execution time to be applicable to large networks. In this paper, we focus on solving the delay-constrained minimum-cost path problem and present a fast algorithm to find a near-optimal solution. This algorithm, called delay-cost-constrained routing (DCCR), is a variant of the k-shortest-path algorithm. DCCR uses a new adaptive path weight function, together with an additional constraint imposed on the path cost, to restrict the search space. Thus, DCCR can return a near-optimal solution in a very short time. Furthermore, we use a variant of the Lagrangian relaxation method proposed by Handler and Zang [Networks 10 (1980) 293] to further reduce the search space by using a tighter bound on path cost. This makes our algorithm more accurate and even faster. We call this improved algorithm search space reduction + DCCR (SSR + DCCR). Through extensive simulations, we confirm that SSR + DCCR performs very well compared to the optimal but very expensive solution.

8.
A neural network combined with a neural classifier is used for real-time forecasting of the hourly maximum ozone concentration in the centre of France, in an urban atmosphere. This neural model is based on the MultiLayer Perceptron (MLP) structure. The inputs of the statistical network are model output statistics of the weather predictions from the French National Weather Service. These predicted meteorological parameters are very easily available through an air quality network. The lead time used in this forecasting is (t + 24) h. Efforts are directed at a regularisation method based on a Bayesian Information Criterion-like measure and at the determination of a confidence interval for the forecast. We offer a statistical validation between various statistical models and a deterministic chemistry-transport model. In this experiment, with the final neural network, the ozone peaks are fairly well predicted (in terms of global fit), with an Agreement Index = 92%, the Mean Absolute Error = the Root Mean Square Error = 15 μg m⁻³ and the Mean Bias Error = 5 μg m⁻³, where the European threshold for hourly ozone is 180 μg m⁻³. To improve the performance of exceedance forecasting, instead of the previous model we use a neural classifier with a sigmoid function in the output layer. The output of the network lies in [0, 1] and can be interpreted as the probability of exceeding the threshold. This model is compared to a classical logistic regression. With this neural classifier, the Success Index of forecasting is 78%, whereas it ranges from 65% to 72% with the classical MLPs. During the validation phase, in the summer of 2003, six ozone peaks above the threshold were detected; there were actually seven. Finally, the model, called NEUROZONE, is now used in real time. New data will be introduced into the training data each year, at the end of September. The network will be re-trained and new regression parameters estimated. In this way, one of the main difficulties in the training phase, namely the low frequency of ozone peaks above the threshold in this region, will be solved.

9.
This paper presents the results of a comparative study whose objective is to identify the most effective and efficient way of applying a local search method embedded in a hybrid algorithm. The hybrid metaheuristic employed in this study is called "DE–HS–HJ" because it comprises two cooperative metaheuristic algorithms, i.e., differential evolution (DE) and harmony search (HS), and one local search (LS) method, i.e., Hooke and Jeeves (HJ) direct search. Eighteen different ways of using HJ local search were implemented, and all of them were evaluated on 19 problems in terms of six performance indices covering both accuracy and efficiency. Statistical analyses were conducted to determine the significance of the performance differences. The test results show that, overall, the best three LS application strategies are: applying local search to every generated solution with a specified probability and also to each newly updated solution (NUS + ESP); applying local search to every generated solution with a specified probability (ESP); and applying local search to every generated solution with a specified probability and also to the updated current global best solution (EUGbest + ESP). ESP is found to be the best local search application strategy in terms of success rate. Integrating it with NUS further improves the overall performance. EUGbest + ESP is the most efficient and is also able to achieve a high level of accuracy (fourth place in terms of success rate, with an average above 0.9).

10.
Based on a detailed check of the LDA + U and GGA + U corrected methods, we found that the transition energy levels depend almost linearly on the effective U parameter. GGA + U seems to be better than LDA + U, with effective U parameter of about 5.0 eV. However, though the results between LDA and GGA are very different before correction, the corrected transition energy levels spread less than 0.3 eV. These more or less consistent results indicate the necessity and validity of LDA + U and GGA + U correction.

11.
Protein thermostability information is closely linked to the commercial production of many biomaterials. Recent developments have shown that amino acid composition, special sequence patterns, hydrogen bonds, disulfide bonds, salt bridges and so on are of considerable importance to thermostability. In this study, we present a system that integrates these various factors to predict protein thermostability. The features of proteins in the PGTdb are analyzed. We consider both structure and sequence features, and correlation coefficients are incorporated into the feature selection algorithm. Machine learning algorithms are then used to develop identification systems, and the performance of the different algorithms is compared. In this research, two features, (E + F + M + R)/residue and charged/non-charged, are found to be critical to the thermostability of proteins. Although the sequence and structural models achieve a higher accuracy, sequence-only models provide sufficient accuracy for sequence-only thermostability prediction.

12.
Data partitioning and scheduling is one of the important issues in minimizing the processing time of parallel and distributed computing systems. We consider a single-level tree architecture and the affine communication model, for a general m-processor system with n rounds of load distribution. For this case, there exist an optimal activation order, an optimal number of processors m* (m* ≤ m), and an optimal number of rounds of load distribution n* (n* ≤ n), such that the processing time of the entire processing load is a minimum. This is a difficult optimization problem because, for a given activation order, we first have to identify the processors that participate in the computation in every round of load distribution and then obtain the load fractions assigned to them and the processing time. Hence, in this paper, we propose a real-coded genetic algorithm (RCGA) to find the optimal activation order, the optimal number of processors m* (m* ≤ m), and the optimal number of rounds of load distribution n* (n* ≤ n), such that the processing time of the entire processing load is a minimum. RCGA employs modified crossover and mutation operators such that the operators always produce a valid solution. We also propose different population initialization schemes to improve convergence. Finally, we present a comparative study with a simple real-coded genetic algorithm and particle swarm optimization to highlight the advantage of the proposed algorithm. The results clearly indicate the effectiveness of the proposed real-coded genetic algorithm.

13.
This article aims at finding efficient hyperspectral indices for the estimation of forest sun leaf chlorophyll content (CHL, µg cm⁻² leaf), sun leaf mass per area (LMA, g dry matter m⁻² leaf), canopy leaf area index (LAI, m² leaf m⁻² soil) and leaf canopy biomass (Bleaf, g dry matter m⁻² soil). These parameters are useful inputs for forest ecosystem simulations at landscape scale. The method is based on the determination of the best vegetation indices (index form and wavelengths) using the radiative transfer model PROSAIL (formed by the newly calibrated leaf reflectance model PROSPECT coupled with the multi-layer version of the canopy radiative transfer model SAIL). The results are tested on experimental measurements at both leaf and canopy scales. At the leaf scale, it is possible to estimate CHL with high precision using a two-wavelength vegetation index after a simulation-based calibration. At the leaf scale, the LMA is more difficult to estimate with indices. At the canopy scale, efficient indices were determined on a generic simulated database to estimate CHL, LMA, LAI and Bleaf in a general way. These indices were then applied to two Hyperion images (50 plots) of the Fontainebleau and Fougères forests and to portable spectroradiometer measurements. They showed good results, with an RMSE of 8.2 µg cm⁻² for CHL, 9.1 g m⁻² for LMA, 1.7 m² m⁻² for LAI and 50.6 g m⁻² for Bleaf. However, at the canopy scale, even though the wavelengths of the calibrated indices were accurately determined with the simulated database, the regressions between the indices and the biophysical characteristics still had to be calibrated on measurements. At the canopy scale, the best indices were: for leaf chlorophyll content, NDchl = (ρ925 − ρ710)/(ρ925 + ρ710); for leaf mass per area, NDLMA = (ρ2260 − ρ1490)/(ρ2260 + ρ1490); for leaf area index, DLAI = ρ1725 − ρ970; and for canopy leaf biomass, NDBleaf = (ρ2160 − ρ1540)/(ρ2160 + ρ1540).
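The canopy-scale indices listed at the end of this abstract are either a normalized difference of two reflectances, (ρa − ρb)/(ρa + ρb), or a simple difference, ρa − ρb, with the subscript read here as the wavelength in nm (an assumption based on common remote-sensing notation). A minimal sketch of how they could be evaluated follows; the function names and the dictionary layout are hypothetical, and the regression that maps an index value to CHL, LMA, LAI or Bleaf is not given in the abstract and is therefore not included.

```python
def normalized_difference(rho_a, rho_b):
    """Normalized difference index (rho_a - rho_b) / (rho_a + rho_b)."""
    return (rho_a - rho_b) / (rho_a + rho_b)


def canopy_indices(reflectance):
    """Compute the four canopy-scale indices named in the abstract.

    `reflectance` maps a wavelength in nm (assumed unit) to the
    reflectance measured at that wavelength.
    """
    r = reflectance
    return {
        "ND_chl": normalized_difference(r[925], r[710]),
        "ND_LMA": normalized_difference(r[2260], r[1490]),
        "D_LAI": r[1725] - r[970],
        "ND_Bleaf": normalized_difference(r[2160], r[1540]),
    }
```

Normalized differences lie between −1 and 1 for positive reflectances, which makes them less sensitive to overall brightness than the raw difference used for LAI.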

14.
In this paper the optimization of type-2 fuzzy inference systems using genetic algorithms (GAs) and particle swarm optimization (PSO) is presented. The optimized type-2 fuzzy inference systems are used to estimate the type-2 fuzzy weights of backpropagation neural networks. Simulation results and a comparative study among neural networks with type-2 fuzzy weights without optimization of the type-2 fuzzy inference systems, neural networks with optimized type-2 fuzzy weights using genetic algorithms, and neural networks with optimized type-2 fuzzy weights using particle swarm optimization are presented to illustrate the advantages of the bio-inspired methods. The comparative study is based on a benchmark case of prediction, which is the Mackey-Glass time series (for τ = 17) problem.

15.
A new version of the Euclidean algorithm is developed for computing the greatest common divisor of two Gaussian integers. It uses approximation to obtain a sequence of remainders of decreasing absolute values. The algorithm is compared with the new (1 + i)-ary algorithm of Weilert and found to be somewhat faster if properly implemented.
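For orientation, the classical Euclidean algorithm on Gaussian integers can be sketched as below. This is the textbook nearest-integer version, not the approximation-based variant the abstract proposes and not Weilert's (1 + i)-ary algorithm, neither of which is described in enough detail here to reproduce. Gaussian integers are represented as `(re, im)` pairs of Python integers; the function names are hypothetical.

```python
def _round_div(n, d):
    """Nearest integer to n / d for d > 0, using integer arithmetic only."""
    return (2 * n + d) // (2 * d)


def gaussian_gcd(a, b):
    """Greatest common divisor of two Gaussian integers given as (re, im).

    The result is unique only up to multiplication by a unit
    (1, -1, i, -i). Each step rounds the exact quotient a / b to the
    nearest Gaussian integer, so the remainder norm at least halves.
    """
    while b != (0, 0):
        (ar, ai), (br, bi) = a, b
        norm_b = br * br + bi * bi
        # a * conj(b); dividing this by norm_b gives the exact quotient a / b
        num_re = ar * br + ai * bi
        num_im = ai * br - ar * bi
        qr = _round_div(num_re, norm_b)
        qi = _round_div(num_im, norm_b)
        # remainder r = a - q * b
        rem = (ar - (qr * br - qi * bi), ai - (qr * bi + qi * br))
        a, b = b, rem
    return a
```

For example, `gaussian_gcd((11, 3), (1, 8))` yields a Gaussian integer of norm 5, an associate of 1 − 2i.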

16.
The gas-phase geometry optimizations of bare, mono- and dihydrated complexes of temozolomide isomers were carried out using density functional calculations at the M06-2X/6-31+G(d,p) level of theory. The structures and protonation energies of protonated species of temozolomide are reported. Chemical indices of all isomers and protonated species are also reported. Energies, thermodynamic quantities, rate constants and equilibrium constants of tautomeric and rotameric transformations of all isomers I1  TZM  HIa  HIb  I2  I3 in bare and hydrated systems were obtained.

17.
It is very important for financial institutions to develop credit rating systems that help them decide whether to grant credit to consumers before issuing loans. In the literature, statistical and machine learning techniques for credit rating have been extensively studied. Recent studies focusing on hybrid models that combine different machine learning techniques have shown promising results. However, there are various types of combination methods for developing hybrid models, and it is unknown which hybrid machine learning model performs best in credit rating. In this paper, four different types of hybrid models are compared, based on 'Classification + Classification', 'Classification + Clustering', 'Clustering + Classification', and 'Clustering + Clustering' techniques, respectively. A real-world dataset from a bank in Taiwan is used for the experiment. The experimental results show that the 'Classification + Classification' hybrid model based on the combination of logistic regression and neural networks provides the highest prediction accuracy and maximizes the profit.

18.
To solve the speaker-independent emotion recognition problem, a three-level speech emotion recognition model is proposed to classify six speech emotions (sadness, anger, surprise, fear, happiness and disgust) from coarse to fine. For each level, appropriate features are selected from 288 candidates using the Fisher rate, which is also regarded as an input parameter for the Support Vector Machine (SVM). To evaluate the proposed system, principal component analysis (PCA) for dimension reduction and an artificial neural network (ANN) for classification are adopted to design four comparative experiments: Fisher + SVM, PCA + SVM, Fisher + ANN, and PCA + ANN. The experimental results show that Fisher is better than PCA for dimension reduction, and SVM is more expansible than ANN for speaker-independent speech emotion recognition. The average recognition rates for the three levels are 86.5%, 68.5% and 50.2%, respectively.

19.
Computer Networks, 2007, 51(11): 3172-3196
A search-based heuristic is presented for the optimisation of communication networks where traffic forecasts are uncertain and the problem is NP-complete. While algorithms such as genetic algorithms (GA) and simulated annealing (SA) are often used for this class of problem, this work applies a combination of newer optimisation techniques, specifically fast local search (FLS) as an improved hill-climbing method and guided local search (GLS) to allow escape from local minima. The GLS + FLS combination is compared with optimised GA and SA approaches. It is found that, in terms of implementation, the parameterisation of the GLS + FLS technique is significantly simpler than that of a GA or SA. Also, the self-regularisation feature of the GLS + FLS approach provides a distinctive advantage over the other techniques, which require manual parameterisation. To compare numerical performance, the three techniques were tested over a number of network sets varying in size, number of switch circuit demands (network bandwidth demands) and levels of uncertainty in the switch circuit demands. The results show that GLS + FLS outperforms the GA and SA techniques in terms of both solution quality and optimisation speed, but even more importantly GLS + FLS has significantly reduced parameterisation time.

20.
The cost of testing activities is a major portion of the total cost of software. In testing, generating test data is very important because the efficiency of testing is highly dependent on the data used in this phase. In search-based software testing, soft computing algorithms explore test data in order to maximize a coverage metric, which can be treated as an optimization problem. In this paper, we employed several meta-heuristics (Artificial Bee Colony, Particle Swarm Optimization, Differential Evolution and Firefly Algorithms) and the Random Search algorithm to solve this optimization problem. First, the dependency of the algorithms on the values of the control parameters was analyzed, and suitable values for the control parameters were recommended. The algorithms were compared using various fitness functions (path-based, dissimilarity-based and approximation level + branch distance) because the fitness function affects the behaviour of the algorithms in the search space. Results showed that meta-heuristics can be used effectively for hard problems and when the search space is large. In addition, the approximation level + branch distance fitness function is generally a good fitness function that guides the algorithms accurately.
