Similar literature
1.
It is very important for financial institutions to develop credit rating systems that help them decide whether to grant credit to consumers before issuing loans. In the literature, statistical and machine learning techniques for credit rating have been extensively studied. Recent studies focusing on hybrid models that combine different machine learning techniques have shown promising results. However, there are various types of combination methods for developing hybrid models, and it is unknown which hybrid machine learning model performs best in credit rating. In this paper, four different types of hybrid models are compared, built with ‘Classification + Classification’, ‘Classification + Clustering’, ‘Clustering + Classification’, and ‘Clustering + Clustering’ techniques, respectively. A real-world dataset from a bank in Taiwan is used for the experiment. The experimental results show that the ‘Classification + Classification’ hybrid model based on the combination of logistic regression and neural networks provides the highest prediction accuracy and maximizes the profit.

2.
3-D Networks-on-Chip (NoCs) have been proposed as a potent solution to both the interconnection and design complexity problems facing future System-on-Chip (SoC) designs. In this paper, two topology-aware multicast routing algorithms for 3-D NoCs are proposed: Multicasting XYZ (MXYZ) and Alternative XYZ (AL + XYZ). In essence, MXYZ is a simple dimension-order multicast routing algorithm that targets 3-D NoC systems built upon regular topologies. To support multicast routing in irregular regions, AL + XYZ can be applied, where an alternative output channel is sought to forward/replicate the packets whenever the output channel determined by MXYZ is not available. To evaluate their performance, extensive experiments have been conducted comparing MXYZ and AL + XYZ against a path-based multicast routing algorithm and an irregular-region-oriented multiple unicast routing algorithm, respectively. The experimental results confirm that the proposed MXYZ and AL + XYZ schemes have lower latency and power consumption than their respective baselines, making them more suitable for supporting multicasting in 3-D NoC systems. In addition, the hardware implementation cost of AL + XYZ is shown to be quite modest.
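
The dimension-order idea that the abstract above builds on can be illustrated with a minimal sketch. It shows plain unicast XYZ routing in a regular 3-D mesh only; it is not the paper's MXYZ or AL + XYZ algorithm, and the function name `xyz_route` and the coordinate-tuple representation are assumptions made for illustration.

```python
def xyz_route(src, dst):
    """Return the hop-by-hop path from src to dst in a regular 3-D mesh.

    Hypothetical helper: nodes are (x, y, z) tuples, and the X offset is
    resolved first, then Y, then Z (dimension-order routing).
    """
    path = [tuple(src)]
    cur = list(src)
    for axis in range(3):  # 0 = X, 1 = Y, 2 = Z
        step = 1 if dst[axis] > cur[axis] else -1
        while cur[axis] != dst[axis]:
            cur[axis] += step          # move one hop along the current axis
            path.append(tuple(cur))
    return path


if __name__ == "__main__":
    for hop in xyz_route((0, 0, 0), (2, 1, 1)):
        print(hop)
```

Because every packet resolves the axes in the same fixed order, the route is fully determined by source and destination; a multicast scheme would additionally have to decide where packets are replicated, which this sketch does not attempt.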

3.
In this study, we propose a set of new algorithms to enhance the effectiveness of classifying the 5-year survivability of breast cancer patients from a massive, imbalanced data set. The proposed classifier algorithms combine the synthetic minority oversampling technique (SMOTE) and particle swarm optimization (PSO) with well-known classifiers such as logistic regression, the C5 decision tree (C5) model, and 1-nearest neighbor search. To assess the effectiveness of this new set of classifiers, the g-mean and accuracy are used as performance indices; moreover, the proposed classifiers are compared with those in the previous literature. Experimental results show that the hybrid algorithm SMOTE + PSO + C5 is the best of all algorithm combinations for classifying the 5-year survivability of breast cancer patients. We conclude that implementing SMOTE with appropriate search algorithms such as PSO and classifiers such as C5 can significantly improve the effectiveness of classification for massive imbalanced data sets.
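
As a rough illustration of why the g-mean is reported alongside accuracy for imbalanced data, the sketch below computes both from confusion-matrix counts. The g-mean is taken here in its usual form, the geometric mean of sensitivity and specificity; the function names and the example counts are hypothetical and are not taken from the paper.

```python
import math


def accuracy(tp, fn, tn, fp):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    # Hypothetical counts: 10 minority and 90 majority samples, with a
    # classifier that labels everything as the majority class.
    print(accuracy(0, 10, 90, 0), g_mean(0, 10, 90, 0))
```

A classifier that ignores the minority class can still score high accuracy while its g-mean collapses to zero, which is the behaviour that makes the g-mean a useful second index on imbalanced sets.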

4.
In classification, every feature of the data set is an important contributor to prediction accuracy and affects the model building cost. To extract the priority features for prediction, a suitable feature selector is needed. This paper proposes a novel memetic-based feature selection model named Shapely Value Embedded Genetic Algorithm (SVEGA). The relevance of each feature towards prediction is measured by combining genetic algorithms with the Shapley value measures retrieved from SVEGA. The obtained results are then evaluated using Support Vector Machine (SVM) with different kernel configurations on 11 + 11 benchmark datasets (both binary class and multi class). Finally, a comparative analysis is done between SVEGA-SVM and other existing feature selection models. The experimental results with the proposed setup provide robust outcomes, showing it to be an efficient approach for discovering knowledge via feature selection, with improved classification accuracy compared to conventional methods.

5.
Carboxylesterases are ubiquitous enzymes with important physiological, industrial and medical applications, such as the synthesis and hydrolysis of stereospecific compounds, including the metabolic processing of drugs and antimicrobial agents. Here, we have performed molecular dynamics simulations of carboxylesterase from the hyperthermophilic bacterium Geobacillus stearothermophilus (GsEst) for 10 ns each at five different temperatures, namely 300 K, 343 K, 373 K, 473 K and 500 K. Profiles of root mean square fluctuation (RMSF) identify thermostable and thermosensitive regions of GsEst. Unfolding of GsEst initiates at the thermosensitive α-helices and proceeds to the thermostable β-sheets. Five ion-pairs have been identified as critical for thermostability and are maintained stably throughout the higher temperature simulations. A detailed investigation of the active site residues of this enzyme suggests that the geometry of this site is well preserved up to 373 K. Furthermore, the hydrogen bonds between Asp188 and His218 of the active site are stably maintained at higher temperatures, imparting stability to this site. Radial distribution functions (RDFs) show a similar pattern of solvent ordering and water penetration around active site residues up to 373 K. Principal component analysis suggests that the motion of the entire protein as well as the active site is similar at 300 K, 343 K and 373 K. Our study may help to identify the factors responsible for the thermostability of GsEst, which may assist in designing enzymes with enhanced thermostability.

6.
Computer Networks, 2007, 51(11): 3172-3196
A search-based heuristic is presented for the optimisation of communication networks where traffic forecasts are uncertain and the problem is NP-complete. While algorithms such as genetic algorithms (GA) and simulated annealing (SA) are often used for this class of problem, this work applies a combination of newer optimisation techniques, specifically fast local search (FLS) as an improved hill climbing method and guided local search (GLS) to allow escape from local minima. The GLS + FLS combination is compared with optimised GA and SA approaches. In terms of implementation, the parameterisation of the GLS + FLS technique is found to be significantly simpler than that of a GA or SA. Also, the self-regularisation feature of the GLS + FLS approach provides a distinctive advantage over the other techniques, which require manual parameterisation. To compare numerical performance, the three techniques were tested over a number of network sets varying in size, number of switch circuit demands (network bandwidth demands) and levels of uncertainty in the switch circuit demands. The results show that GLS + FLS outperforms the GA and SA techniques in terms of both solution quality and optimisation speed and, even more importantly, has significantly reduced parameterisation time.

7.
In this work, we investigate the effect of a cation-π interaction on the cooperativity of X⋯N halogen bonds in PhX⋯NCX⋯NH₃ complexes, where Ph = phenyl and X = Cl, Br, I. Molecular geometries and interaction energies of the resulting complexes are studied at the MP2/aug-cc-pVDZ(-PP) computational level. The mechanism of the cooperativity between halogen bonds is analyzed using parameters derived from the noncovalent index, quantum theory of atoms in molecules and natural bond orbital methodologies. It is found that the divalent cations (Be²⁺, Mg²⁺) have a larger influence on the cooperativity of halogen bonds than the monovalent ones (Li⁺, Na⁺). The formation of a cation-π interaction strengthens the halogen bonds and hence increases their cooperativity.

8.
The structural features of helical transmembrane (TM) proteins, such as helical kinks, tilts, and rotational orientations, are important in modulating their function, and these structural features give rise to functional diversity in membrane proteins with similar topology. In particular, the helical kinks caused by breaking of the backbone hydrogen bonds lead to hinge bending flexibility in these helices. It is therefore important to understand the nature of the helical kinks and to be able to reproduce these kinks in structural models of membrane proteins. We have analyzed the position and extent of helical kinks in the transmembrane helices of all the crystal structures of membrane proteins taken from the MPtopo database, which comprise about 405 individual helices of length between 19 and 35 residues. 44% of the crystal structures of TM helices showed a significant helical kink, and 35% of these kinks are caused by prolines. Many of the non-proline helical kinks are caused by other residues such as Ser and Gly that are located at the center of helical kinks. The side chain of Ser makes a hydrogen bond with the main chain carbonyl of the i − 4th or i + 4th residue, thus making a kink. We have also studied how well molecular dynamics (MD) simulations on isolated helices can reproduce the position of the helical kinks in TM helices. Such a method is useful for structure prediction of membrane proteins. We performed MD simulations, starting from a canonical helix, for the 405 TM helices. The results of 1 ns of MD simulation show that we can reproduce about 79% of the proline kinks, but only 59% of the vestigial proline kinks and 18% of the non-proline helical kinks. We found that similar results can be obtained by choosing the lowest potential energy structure from the MD simulation. 4–14% more of the vestigial prolines were reproduced by replacing them with prolines before performing MD simulations, and changing the amino acid back to proline after the MD simulations. From these results we conclude that the position of the helical kinks is dependent on the TM sequence. However, the extent of helical kinking may depend on the packing of the rest of the protein and the lipid bilayer.

9.
This paper presents the results of a comparative study whose objective is to identify the most effective and efficient way of applying a local search method embedded in a hybrid algorithm. The hybrid metaheuristic employed in this study is called “DE–HS–HJ” because it comprises two cooperative metaheuristic algorithms, i.e., differential evolution (DE) and harmony search (HS), and one local search (LS) method, i.e., Hooke and Jeeves (HJ) direct search. Eighteen different ways of using HJ local search were implemented, and all of them were evaluated on 19 problems in terms of six performance indices covering both accuracy and efficiency. Statistical analyses were conducted to determine the significance of the performance differences. The test results show that overall the best three LS application strategies are: applying local search to every generated solution with a specified probability and also to each newly updated solution (NUS + ESP); applying local search to every generated solution with a specified probability (ESP); and applying local search to every generated solution with a specified probability and also to the updated current global best solution (EUGbest + ESP). ESP is found to be the best local search application strategy in terms of success rate. Integrating it with NUS further improves the overall performance. EUGbest + ESP is the most efficient and is also able to achieve a high level of accuracy (fourth place in terms of success rate, with an average above 0.9).

10.
Stock index forecasting is a hot issue in the financial arena. As the movements of stock indices are non-linear and subject to many internal and external factors, they pose a great challenge to researchers who try to predict them. In this paper, we select a radial basis function neural network (RBFNN) to train on the data and forecast the stock indices of the Shanghai Stock Exchange. We introduce the artificial fish swarm algorithm (AFSA) to optimize RBF. To increase forecasting efficiency, a K-means clustering algorithm is optimized by AFSA in the learning process of RBF. To verify the usefulness of our algorithm, we compared the forecasting results of RBF optimized by AFSA, genetic algorithms (GA) and particle swarm optimization (PSO), as well as the forecasting results of ARIMA, BP and support vector machine (SVM). Our experiment indicates that RBF optimized by AFSA is an easy-to-use algorithm with considerable accuracy. Of all the combinations we tried in this paper, BIAS6 + MA5 + ASY4 was the optimum group with the least errors.

11.
The cost of testing activities is a major portion of the total cost of software. In testing, generating test data is very important because the efficiency of testing is highly dependent on the data used in this phase. In search-based software testing, soft computing algorithms explore test data in order to maximize a coverage metric, which can be considered an optimization problem. In this paper, we employed several meta-heuristics (Artificial Bee Colony, Particle Swarm Optimization, Differential Evolution and Firefly Algorithms) and a Random Search algorithm to solve this optimization problem. First, the dependency of the algorithms on the values of the control parameters was analyzed and suitable values for the control parameters were recommended. The algorithms were compared using various fitness functions (path-based, dissimilarity-based and approximation level + branch distance) because the fitness function affects the behaviour of the algorithms in the search space. Results showed that meta-heuristics can be used effectively for hard problems and when the search space is large. In addition, the approximation level + branch distance based fitness function is generally a good fitness function that guides the algorithms accurately.
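
To make the "approximation level + branch distance" fitness concrete, the following is a minimal sketch of how such a function is commonly assembled in search-based testing (where the first term is often called the approach level). The normalisation d / (d + 1), the equality-predicate distance and all function names are assumptions chosen for illustration; the paper's exact formulation is not given in the abstract.

```python
def normalize(distance):
    """Map a non-negative branch distance into [0, 1) (assumed form d / (d + 1))."""
    return distance / (distance + 1.0)


def branch_distance_equals(a, b):
    """Distance to satisfying the predicate `a == b`: zero when it already holds."""
    return abs(a - b)


def fitness(approximation_level, distance):
    """Lower is better.

    approximation_level: number of control-dependent branches still separating
    the executed path from the target. The normalised branch distance breaks
    ties between inputs that diverge at the same branch.
    """
    return approximation_level + normalize(distance)


if __name__ == "__main__":
    # Hypothetical target guarded by `x == 42`, reached after one other branch.
    for x in (0, 40, 42):
        print(x, fitness(1, branch_distance_equals(x, 42)))
```

Keeping the normalised distance strictly below 1 means an input that gets one branch closer to the target always scores better than any input that diverges earlier, which is what gives the search a usable gradient.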

12.
To solve the speaker-independent emotion recognition problem, a three-level speech emotion recognition model is proposed to classify six speech emotions, including sadness, anger, surprise, fear, happiness and disgust, from coarse to fine. For each level, appropriate features are selected from 288 candidates by using the Fisher rate, which is also regarded as the input parameter for the Support Vector Machine (SVM). In order to evaluate the proposed system, principal component analysis (PCA) for dimension reduction and artificial neural network (ANN) for classification are adopted to design four comparative experiments: Fisher + SVM, PCA + SVM, Fisher + ANN and PCA + ANN. The experimental results show that Fisher is better than PCA for dimension reduction, and SVM is more expansible than ANN for speaker-independent speech emotion recognition. The average recognition rates for the three levels are 86.5%, 68.5% and 50.2%, respectively.

13.
Comparative molecular dynamics simulations of the psychrophilic type III antifreeze protein from the North-Atlantic ocean-pout Macrozoarces americanus and its corresponding mesophilic counterpart, the antifreeze-like domain of human sialic acid synthase, have been performed for 10 ns each at five different temperatures. Analyses of trajectories in terms of secondary structure content, solvent accessibility, intramolecular hydrogen bonds and protein–solvent interactions indicate distinct differences between these two proteins. The two proteins also follow dissimilar unfolding pathways. The overall flexibility, calculated from the trace of the diagonalized covariance matrix, is similar for both proteins near their growth temperatures. However, at higher temperatures the psychrophilic protein shows greater overall flexibility than its mesophilic counterpart. Principal component analysis also indicates that the essential subspaces explored by the simulations of the two proteins at different temperatures are non-overlapping and show significantly different directions of motion. However, there are significant overlaps within the trajectories and similar directions of motion for each protein, especially at 298 K, 310 K and 373 K. Overall, the psychrophilic protein exhibits greater conformational sampling of the phase space than its mesophilic counterpart. Our study may help in elucidating the molecular basis of thermostability of homologous proteins from two organisms living at different temperature conditions. Such an understanding is required for designing efficient proteins with characteristics for a particular application at desired working temperatures.

14.
The human liver is one of the major organs in the body, and liver disease can cause many problems in human life. Fast and accurate prediction of liver disease allows early and effective treatment. In this regard, various data mining techniques help in better prediction of this disease. Because of the importance of liver disease and the increasing number of people who suffer from it, we studied liver disease using two well-known methods from the data mining area. In this paper, novel decision tree based algorithms are used, which consider more factors in general and yield predictions with high accuracy compared to other studies in liver disease. In this application, 583 instances of the liver disease dataset from the UCI repository are considered. This dataset consists of 416 records of liver disease and 167 records of healthy liver. The dataset is analyzed by two algorithms, Boosted C5.0 and CHAID. Until now there has been no work in the literature that uses boosted C5.0 and CHAID for creating rules in liver disease. Our results show that in both algorithms, the DB, ALB, SGPT, TB and A/G factors have a significant impact on predicting liver disease; according to the rules generated by both algorithms, the important ranges are DB = [10.900–1.200], ALB = [4.00–4.300], SGPT = [34–37], TB = [0.600–1.200] (by boosted C5.0) and A/G = [1.180–1.390]. In addition, in the Boosted C5.0 algorithm, Alkphos, SGOT and Age have a significant impact on the prediction of liver disease. Comparing the performance of these algorithms, it becomes clear that the C5.0 algorithm with the Boosting technique has an accuracy of 93.75%, a better performance than the CHAID algorithm, whose accuracy is 65.00%. Another important achievement of this paper concerns the ability of both algorithms to produce rules in one class for liver disease. The results of our assessment show that the Boosted C5.0 and CHAID algorithms are capable of producing rules for liver disease. Our results also show that boosted C5.0 considers gender in liver disease, a factor which is missing in many other studies. Moreover, using the rules generated by the boosted C5.0 algorithm, we obtained the important result that females have lower susceptibility to liver disease than males. This factor is missing in other studies of liver disease. Therefore, our proposed computer-aided diagnostic methods, as an expert and intelligent system, have an impressive impact on liver disease detection. Based on the obtained results, we observed that our model performed better than existing methods in the literature.

15.
Purpose: To compare the diagnostic performances of artificial neural networks (ANNs) and multivariable logistic regression (LR) analyses for differentiating between malignant and benign lung nodules on computed tomography (CT) scans. Methods: This study evaluated 135 malignant nodules and 65 benign nodules. For each nodule, morphologic features (size, margins, contour, internal characteristics) on CT images and the patient’s age, sex and history of bloody sputum were recorded. Based on 200 bootstrap samples generated from the initial dataset, 200 pairs of ANN and LR models were built and tested. The area under the receiver operating characteristic (ROC) curve, Hosmer–Lemeshow statistic and overall accuracy rate were used for the performance comparison. Results: ANNs had a higher discriminative performance than LR models (area under the ROC curve: 0.955 ± 0.015 (mean ± standard error) and 0.929 ± 0.017, respectively, p < 0.05). The overall accuracy rate for ANNs (90.0 ± 2.0%) was greater than that for LR models (86.9 ± 1.6%, p < 0.05). The Hosmer–Lemeshow statistic for the ANNs was 8.76 ± 6.59 vs. 6.62 ± 4.03 (p > 0.05) for the LR models. Conclusions: When used to differentiate between malignant and benign lung nodules on CT scans based on both objective and subjective features, ANNs outperformed LR models in both discrimination and clinical usefulness, but did not outperform them in calibration.
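
The area under the ROC curve used above as the discrimination measure can be read as the probability that a randomly chosen malignant case receives a higher model score than a randomly chosen benign one. The sketch below computes it directly from that pairwise definition; the function name and the example scores are hypothetical, and the study's bootstrap model-building procedure is not reproduced here.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    scores_pos: model scores for positive (e.g. malignant) cases
    scores_neg: model scores for negative (e.g. benign) cases
    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(auc([0.9, 0.3], [0.5, 0.1]))
```

This quadratic-time form is chosen for readability; for large samples a rank-based computation gives the same value more efficiently.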

16.
In this paper, a new mathematical geometric model of spiral triangular wire strands with a construction of (3 + 9) and (3 + 9 + 15) wires is proposed, and an accurate 3D solid computational model of the two-layered triangular strand, which is used for a finite element analysis, is presented. The present geometric model fully considers the spatial configuration of individual wires in the strand. The three-dimensional curve geometry of the wire axes in the individual layers of the triangular strand consists of straight linear and helical segments. The derived mathematical representation of this curve is in the form of parametric equations with variable input parameters, which facilitate the determination of the centreline of an arbitrary circular wire of right and left hand lay triangular one- and two-layered strands. The derived geometric equations were used for the generation of accurate 3D geometric and computational strand models. The correctness of the derived parametric equations and the performance of the generated strand model are checked by visualizations. The 3D computational model was used for a finite element behaviour analysis of the two-layered triangular strand subjected to tension loadings. Illustrative examples are presented to highlight the benefits of the proposed geometric parametric equations and computational modelling procedures using the finite element method.

17.
The information extraction capability of two widely used signal processing tools, the Hilbert Transform (HT) and the Wavelet Transform (WT), is investigated to develop a multi-class fault diagnosis scheme for induction motors using radial vibration signals. The vibration signals are associated with unique predominant frequency components and instantaneous amplitudes depending on the motor condition. Using a sound systematic and analytical approach, these fault frequencies can be identified. However, some faults, either electrical or mechanical in nature, are associated with the same or similar vibration frequencies, leading to erroneous conclusions. A Genetic Algorithm (GA) is proposed and used successfully to find the most relevant fault frequencies in the radial (vertical) frame vibration signal, which can be used to diagnose induction motor faults very effectively even in the presence of noise. The information obtained by the Continuous Wavelet Transform (CWT) was found to be highly redundant compared to HT; thus, by selecting the most relevant features using GA, the fault classification accuracy improved considerably, especially for CWT. Almost the same fault frequencies were found using CWT + GA and HT + GA for the radial vibration signal.

18.
Based on a detailed check of the LDA + U and GGA + U corrected methods, we found that the transition energy levels depend almost linearly on the effective U parameter. GGA + U seems to be better than LDA + U, with an effective U parameter of about 5.0 eV. However, though the LDA and GGA results are very different before correction, the corrected transition energy levels spread by less than 0.3 eV. These more or less consistent results indicate the necessity and validity of the LDA + U and GGA + U corrections.

19.
Discriminating between potato tubers and clods is the first step in developing an automatic separation system on potato harvesters. In this study, an acoustic-based intelligent system was developed for high-speed discrimination between potato tubers and soil clods. About 500 kg of a mixture of potato tubers and clods was loaded on a belt conveyor and impacted against a steel plate at four different velocities. The resulting acoustic signals were recorded and processed, and potential features were extracted from the analysis of the sound signals in both the time and frequency domains. A multilayer perceptron neural network with a back propagation algorithm was used for pattern recognition. Altogether, 17 potential discriminating features were selected and fed as input vectors to the artificial neural network models. The optimal network was selected based on mean square error, correct detection rate and correlation coefficient. At a belt velocity of 1 m s⁻¹, the detection accuracy of the presented system was about 97.3% and 97.6% for potatoes and clods, respectively. Increasing the belt velocity reduced the detection accuracy and increased the number of misclassified samples. Using this system, a potato harvester is expected to operate at a capacity of 20 ton hr⁻¹ with an accuracy of about 97%.

20.
Advances in the field of artificial intelligence (AI) offer opportunities to use new algorithms and models that enable researchers to solve the most complex systems. As in other engineering fields, AI methods have been widely used in geotechnical engineering. Nevertheless, there appears to be an insufficient number of studies on the use of AI methods for the estimation of the California bearing ratio (CBR). There have been some attempts to develop prediction models for CBR, but most of these models were essentially statistical correlations, and many of these statistical correlation equations generally produce unsatisfactory CBR values. This paper is likely one of the very first studies to investigate the applicability of AI methods for the prediction of CBR. In this context, artificial neural network (ANN) and gene expression programming (GEP) were applied to predict the CBR of fine grained soils from the Southeast Anatolia Region of Turkey. Using CBR test data of fine grained soils, suitable models are successfully developed. The results show that both ANN and GEP are able to learn the relation between CBR and basic soil properties. Additionally, a sensitivity analysis is performed, and it is found that maximum dry unit weight (γd) is the most influential parameter on CBR among the parameters considered, which also include plasticity index (PI), optimum moisture content (wopt), sand content (S), clay + silt content (C + S), liquid limit (LL) and gravel content (G).
