Similar Articles
20 similar articles found (search time: 31 ms)
1.
2.
Resource allocation in auctions is a challenging problem for cloud computing: the allocation problem is NP-hard and cannot be solved exactly in polynomial time. Existing studies mainly use approximation schemes such as PTAS or heuristic algorithms to find a feasible solution, but these approaches suffer from either low computational efficiency or low allocation accuracy. In this paper, we model and analyze the multi-dimensional cloud resource allocation problem as a machine learning classification task and propose two resource allocation prediction algorithms based on linear and logistic regression. By learning from a small-scale training set, the prediction model ensures that the social welfare, allocation accuracy, and resource utilization of the produced feasible solution are very close to those of the optimal allocation. Experimental results show that the proposed scheme performs well for resource allocation in cloud computing.
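As an illustration of the classification view this abstract describes, the sketch below trains a tiny logistic-regression classifier with batch gradient descent on made-up resource-request data; the features, labels, and hyperparameters are invented for the example and do not reproduce the paper's actual model or data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent for binary logistic regression."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        gw = [0.0] * len(w)
        gb = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi            # gradient of the log loss
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict(w, b, x):
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Toy data: [requested CPU share, requested memory share];
# label 1 = the bid wins an allocation in the optimal solution.
X = [[0.2, 0.1], [0.9, 0.8], [0.3, 0.2], [0.8, 0.9], [0.1, 0.3], [0.7, 0.7]]
y = [1, 0, 1, 0, 1, 0]
w, b = train_logistic(X, y)
print(predict(w, b, [0.25, 0.15]))  # small request: predicted allocated
```

In the paper's setting the classifier is trained once on solved small instances and then replaces the expensive exact solver at prediction time.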

3.
The rapid growth of user interactions on social media sites yields useful insights in many areas. Facebook is currently the most popular social media site, with the highest number of active users, making it a valuable and readily accessible data source. People increasingly use Facebook to get instant updates on current affairs: news from several channels arrives in a single news feed, feedback can be given on posts with gesture-based reactions, and messages can be sent and shared among people, which explains its growing popularity as a news medium. Politics has always been a ubiquitous topic. Sri Lanka fought a war on terrorism for nearly three decades, followed by a government (2005–2015) led by the same political party, which was alleged to be autocratic; this decade of rule heavily influenced citizens' political convictions. Against this background, the "Good-Governance" coalition (2015–2019) defeated the ruling party at the presidential election held at the time, pledging to steer Sri Lanka towards a sustainable, stable, responsible, and moral society through constitutional amendments guaranteeing democracy to all ethnic groups and eradicating corruption, waste, and fraud. This study investigates whether there are significant trends in the Sri Lankan political context following this transformation, from the perspective of the general public, using Facebook user reactions to news posts as the data source. The analysis reveals an increasing trend of user reactions to politics from 2011 to 2018. Further, the government in power (2015–2019) shows a decreasing trend of user reactions relative to the preceding years (2011–2015) in the sight of its citizens, despite its pledge of better governance. On the contrary, the previous government shows an increasing trend even though it was ousted by the "Good-Governance" coalition over its allegedly unscrupulous rule.

4.
At present, the prevalence of diabetes is increasing because the human body cannot properly metabolize glucose. Accurate prediction of diabetes is therefore an important research area, and many researchers have proposed data mining and machine learning techniques for it. In prediction, feature selection is a key preprocessing step: using only the features relevant to the disease improves prediction accuracy. Selecting the right subset of the whole feature set is a complicated process, and much current work aims at building predictive models of high accuracy. In this work, a wrapper-based feature selection method, recursive feature elimination, is combined with ridge regression (L2) to form a hybrid L2-regularized feature selection algorithm that counters overfitting. Overfitting is a major problem in feature selection, where the model fits a small training set but not new data; ridge regression is used mainly to mitigate it. Features are selected with the proposed method, and a random forest classifier then classifies the data on the basis of the selected features. The work uses the Pima Indians Diabetes data set, and the evaluated results are compared with existing algorithms. The proposed algorithm predicts diabetes with an accuracy of 100% and an area under the curve of 97%, outperforming the existing algorithms.
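A minimal sketch of the idea behind combining recursive feature elimination with ridge regression: fit an L2-regularized linear model, then drop the feature with the smallest coefficient magnitude. The data, the λ value, and the single elimination step are illustrative assumptions, not the paper's pipeline (which also uses a random forest classifier afterwards).

```python
def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression w = (X^T X + lam*I)^-1 X^T y,
    solved by Gaussian elimination (no external libraries)."""
    d = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(d)]
    for c in range(d):                       # forward elimination w/ pivoting
        p = max(range(c, d), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, d):
            f = A[r][c] / A[c][c]
            for k in range(c, d):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    w = [0.0] * d
    for r in range(d - 1, -1, -1):           # back substitution
        w[r] = (b[r] - sum(A[r][k] * w[k] for k in range(r + 1, d))) / A[r][r]
    return w

def eliminate_weakest(X, y, lam=1.0):
    """One recursive-elimination step: drop the feature with the
    smallest |ridge coefficient| (the core idea of RFE)."""
    w = ridge_fit(X, y, lam)
    drop = min(range(len(w)), key=lambda j: abs(w[j]))
    return drop, [[v for j, v in enumerate(r) if j != drop] for r in X]

# Feature 0 drives y (y ~ 2*x0); feature 1 is noise and is dropped first.
X = [[1, 0.1], [2, -0.2], [3, 0.05], [4, 0.0], [5, -0.1]]
y = [2, 4, 6, 8, 10]
drop, X2 = eliminate_weakest(X, y)
print(drop)  # -> 1
```

Repeating the elimination step until a target feature count remains gives full RFE; the L2 penalty keeps the coefficients stable on small training sets.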

5.
Backscatter data from multibeam bathymetric sonar contains acoustic information about the seabed surface layer and can be used to classify surficial sediment types. In practice, however, obtaining sediment-type labels over a large area by physical sampling is prohibitively expensive, which limits the performance of conventional supervised classification algorithms. For the common practical situation of abundant unlabeled data and only a small amount of labeled data, this paper proposes a semi-supervised sediment classification algorithm based on autoencoder pre-training and pseudo-label self-training. The algorithm is validated on multibeam bathymetric sonar backscatter data collected in two experiments conducted in the same sea area in 2018 and 2019. The results show that, compared with supervised classification using only the labeled data, the proposed semi-supervised algorithm maintains classification accuracy while requiring fewer labeled samples; with autoencoder pre-training, it still achieves an accuracy above 75% even when labeled samples are extremely scarce.
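A toy sketch of pseudo-label self-training, the second ingredient of the proposed semi-supervised method: a 1-nearest-neighbour classifier repeatedly pseudo-labels the unlabeled point it is most confident about (here, the one closest to the labeled set) and absorbs it into the training data. The sediment classes, feature values, and confidence threshold are invented; the autoencoder pre-training stage is not shown.

```python
def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nn_label(x, X, y):
    """1-nearest-neighbour label under squared Euclidean distance."""
    return y[min(range(len(X)), key=lambda i: dist(x, X[i]))]

def self_train(Xl, yl, Xu, threshold=1.0):
    """Pseudo-label self-training: repeatedly label the unlabeled point
    closest to the current labeled set and absorb it as training data."""
    Xl, yl, Xu = list(Xl), list(yl), list(Xu)
    while Xu:
        # confidence proxy: squared distance to the nearest labeled point
        i = min(range(len(Xu)), key=lambda k: min(dist(Xu[k], x) for x in Xl))
        if min(dist(Xu[i], x) for x in Xl) > threshold:
            break                       # nothing confident left to absorb
        label = nn_label(Xu[i], Xl, yl)  # pseudo-label from current model
        Xl.append(Xu.pop(i))
        yl.append(label)
    return Xl, yl

# Two sediment clusters; only one labeled sample per class to start with.
Xl, yl = [[0.0, 0.0], [5.0, 5.0]], ["sand", "mud"]
Xu = [[0.3, 0.2], [0.5, 0.4], [4.8, 5.1], [5.2, 4.7]]
Xl, yl = self_train(Xl, yl, Xu)
```

The actual paper trains a neural classifier on autoencoder-pretrained features, but the absorb-the-most-confident loop is the same shape.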

6.
Online advertisements have a significant influence on the success or failure of a business. It is therefore important to measure an advertisement's impact before uploading it, which is normally done by calculating the Click-Through Rate (CTR). Unfortunately, measuring CTR directly is costly, since the clicks must first be gathered from users before the CTR can be computed; this is where CTR prediction comes in handy. Advertisement CTR prediction relies on logged user click information, and accurate prediction of CTR is a challenging and critical task for e-advertising platforms. CTR prediction uses machine learning techniques to estimate how often an online advertisement will be clicked by potential clients: the more clicks, the more successful the ad. In this study we develop a machine-learning-based click-through rate prediction model that generates accurate results with low computational power consumption. We used four classification techniques: K-Nearest Neighbors (KNN), Logistic Regression, Random Forest, and Extreme Gradient Boosting (XGBoost). The study was performed on the Click-Through Rate Prediction Competition Dataset, a chronologically ordered click log collected over 10 days. Experimental results reveal that XGBoost produced a ROC-AUC of 0.76 with a reduced number of features.
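Since the reported metric is ROC-AUC, here is a self-contained sketch of how it can be computed via the Mann-Whitney (pairwise ranking) formulation: the probability that a random positive is scored above a random negative. The labels and scores are made up for the example.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the rank (Mann-Whitney) formulation."""
    pairs = better = ties = 0
    for li, si in zip(labels, scores):
        if li != 1:
            continue
        for lj, sj in zip(labels, scores):
            if lj != 0:
                continue
            pairs += 1
            if si > sj:
                better += 1       # positive ranked above negative
            elif si == sj:
                ties += 1         # ties count half
    return (better + 0.5 * ties) / pairs

# labels: 1 = ad clicked; scores: predicted click probabilities
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.3, 0.8, 0.6, 0.4, 0.2]
print(round(roc_auc(labels, scores), 3))  # -> 0.889
```

This quadratic-time version is fine for illustration; production evaluation sorts by score once for an O(n log n) equivalent.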

7.
Although predictive machine learning for supply chain data analytics has recently been reported as a significant area of investigation, owing to the rising popularity of the AI paradigm in industry, there is a distinct lack of case studies that showcase its application from a practical point of view. In this paper, we discuss the application of data analytics to predicting first-tier supply chain disruptions using historical data available to an Original Equipment Manufacturer (OEM). Our methodology has three phases. First, an exploratory phase selects and engineers potential features that can act as useful predictors of disruptions. Second, a performance metric is developed in alignment with the specific goals of the case study to rate successful methods. Third, an experimental design is created to systematically analyse the success rate of different algorithms and algorithmic parameters on the selected feature space. Our results indicate that adding engineered features to the data, namely agility, outperforms the other experiments, leading to a final algorithm that can predict late orders with 80% accuracy. An additional contribution is the novel application of machine learning to predicting supply disruptions. Through the discussion and development of the case study, we hope to shed light on the development and application of data analytics techniques in the analysis of supply chain data. We conclude by highlighting the importance of domain knowledge for successfully engineering features.

8.
Electric kickboards have been popularized and promoted primarily because of their clean and efficient operation, and their penetration rate keeps increasing; they are gradually growing in popularity in tourist- and education-centric localities. With the coming wave of electric kickboards, deploying a customer rental service is essential, and owing to its free-floating nature, the shared electric kickboard is a common and practical means of transportation. Relocation plans for shared electric kickboards are required to increase the quality of service, so forecasting demand for their use in a specific region is crucial. Predicting demand accurately with little data is troublesome, because extensive data are necessary for training machine learning algorithms effectively; data generation is a method for expanding the amount of data available for training. In this work, we proposed a model that takes time-series customer demand data for electric kickboards as input, pre-processes it, and generates synthetic data matching the original data distribution using generative adversarial networks (GAN). Combining the synthetic data with the original data reduced the electric kickboard mobility demand prediction error. We proposed Tabular-GAN-Modified-WGAN-GP for generating synthetic data with better prediction results: we modified the Wasserstein GAN gradient penalty (WGAN-GP) with the RMSprop optimizer and employed Spectral Normalization (SN) to improve training stability and speed up convergence. Finally, we applied a regression-based blending ensemble technique to improve demand prediction performance. We used various evaluation criteria and visual representations to compare our proposed model's performance, and the synthetic data generated by the suggested GAN model are also evaluated. The TGAN-Modified-WGAN-GP model mitigates the overfitting and mode collapse problems and converges faster than previous GAN models for synthetic data creation. The presented model's performance is compared with existing ensemble and baseline models; the experimental findings imply that combining synthetic and actual data can significantly reduce prediction error, achieving a mean absolute percentage error (MAPE) of 4.476, and increase prediction accuracy.
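A minimal sketch of a regression-based blending ensemble of the kind the abstract mentions: learn least-squares weights for two base models' hold-out predictions, then blend, scoring with MAPE. The base predictions, hold-out targets, and the no-intercept formulation are assumptions for the example, not the paper's actual ensemble.

```python
def blend_weights(p1, p2, y):
    """Least-squares weights (w1, w2) minimising ||w1*p1 + w2*p2 - y||^2
    on a hold-out set: a simple regression-based blender, no intercept."""
    a11 = sum(a * a for a in p1)
    a22 = sum(b * b for b in p2)
    a12 = sum(a * b for a, b in zip(p1, p2))
    b1 = sum(a * t for a, t in zip(p1, y))
    b2 = sum(b * t for b, t in zip(p2, y))
    det = a11 * a22 - a12 * a12            # 2x2 normal equations
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det

def mape(y_true, y_pred):
    """Mean absolute percentage error, the paper's headline metric."""
    return 100.0 * sum(abs((t - p) / t)
                       for t, p in zip(y_true, y_pred)) / len(y_true)

# Hold-out predictions from two base demand models and the true demand.
p1 = [10.0, 20.0, 30.0]   # model 1 systematically overshoots
p2 = [8.0, 17.0, 25.0]    # model 2 undershoots
y  = [9.0, 18.0, 27.0]
w1, w2 = blend_weights(p1, p2, y)
blended = [w1 * a + w2 * b for a, b in zip(p1, p2)]
```

Here the blender discovers that rescaling model 1 alone already fits the hold-out targets; with noisier real data both weights would be nonzero.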

9.
Deep learning techniques, particularly convolutional neural networks (CNNs), have exhibited remarkable performance in solving vision-related problems, especially in unpredictable, dynamic, and challenging environments. In autonomous vehicles, imitation-learning-based steering angle prediction is viable because CNNs comprehend visual imagery well. Researchers worldwide are therefore focusing on the architectural design and hyperparameter optimization of CNNs to achieve the best results, and the literature has shown metaheuristic algorithms to be superior to manual tuning of CNNs. To the best of our knowledge, however, these techniques have not yet been applied to imitation-learning-based steering angle prediction. In this study, we therefore examine the bat algorithm and the particle swarm optimization algorithm for optimizing the CNN model and its hyperparameters for the steering angle prediction problem. To validate each set of hyperparameters and architectural parameters, we used the Udacity steering angle dataset and obtained the best results with the following hyperparameter set: optimizer, Adagrad; learning rate, 0.0052; and nonlinear activation function, exponential linear unit. We found that deeper models achieve better results but require more training epochs and time than shallower ones. The results show the superiority of optimizing CNNs with metaheuristic algorithms over the manual-tuning approach. In-field testing was also performed using the model trained with the optimal architecture found by our approach.
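A minimal particle swarm optimization (one of the two metaheuristics examined) over a box-bounded search space. The quadratic stand-in objective replaces the real validation loss of training a CNN per particle, and the swarm parameters are illustrative assumptions.

```python
import random

def pso(f, bounds, n=20, iters=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimisation minimising f over a box."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]                 # personal bests
    pbest_f = [f(p) for p in pos]
    g = pbest[min(range(n), key=lambda i: pbest_f[i])][:]   # global best
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (g[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d],
                                    bounds[d][0]), bounds[d][1])
            fi = f(pos[i])
            if fi < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], fi
        g = pbest[min(range(n), key=lambda i: pbest_f[i])][:]
    return g, f(g)

# Stand-in objective: pretend validation loss is minimised at a
# learning rate of 0.005 (instead of actually training a CNN per particle).
loss = lambda p: (p[0] - 0.005) ** 2
best, best_loss = pso(loss, [(0.0001, 0.1)])
```

In the paper's setting each particle encodes a hyperparameter set and `f` would be one full training-and-validation run, which is why swarm sizes stay small.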

10.
Data mining involves a number of steps, from data collection to visualization, to identify useful information in massive data sets. At the same time, recent advances in machine learning (ML) and deep learning (DL) models can be utilized for effective rainfall prediction. With this motivation, this article develops a novel comprehensive oppositional moth flame optimization with deep learning for rainfall prediction (COMFO-DLRP) technique. The proposed COMFO-DLRP model predicts rainfall and thereby helps determine environmental changes. First, data pre-processing and correlation matrix (CM) based feature selection are carried out. A deep belief network (DBN) model is then applied for effective prediction of rainfall data. The COMFO algorithm is derived by integrating comprehensive opposition-based learning (COBL) with the traditional MFO algorithm, and it is employed for optimal hyperparameter selection of the DBN model. To demonstrate the improvements of the COMFO-DLRP approach, a sequence of simulations was carried out and the outcomes were assessed under distinct measures. The simulation outcomes highlight the advantage of the COMFO-DLRP method over the other techniques.
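The "comprehensive oppositional" ingredient can be sketched independently of MFO: opposition-based learning evaluates each candidate together with its opposite point in the search box and keeps the better of the pair. The bounds, objective, and candidates below are invented for illustration.

```python
def opposite(x, lo, hi):
    """Opposition-based learning: the opposite candidate of x in [lo, hi]."""
    return [l + h - xi for xi, l, h in zip(x, lo, hi)]

def obl_filter(candidates, lo, hi, f):
    """Evaluate each candidate and its opposite; keep the better of the
    pair. This is the core idea COMFO adds on top of moth flame
    optimisation's population updates."""
    kept = []
    for x in candidates:
        ox = opposite(x, lo, hi)
        kept.append(x if f(x) <= f(ox) else ox)
    return kept

lo, hi = [0.0, 0.0], [10.0, 10.0]
f = lambda x: sum((xi - 8.0) ** 2 for xi in x)   # optimum near (8, 8)
pop = obl_filter([[1.0, 2.0], [9.0, 7.0]], lo, hi, f)
print(pop)  # -> [[9.0, 8.0], [9.0, 7.0]]
```

The first candidate sits far from the optimum, so its opposite is adopted; the second is already close and survives unchanged.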

11.
Prediction of the drug synergy score is an ill-posed problem that plays an important role in the medical field for inhibiting specific cancer agents. An efficient regression-based machine learning technique can minimise drug synergy prediction errors. In this study, therefore, a drug synergy prediction technique is designed using ensemble-based differential evolution (DE) to optimise a support vector machine (SVM), since tuning the attributes of the SVM kernel governs the prediction precision. The ensemble-based DE employs two trial-vector generation techniques and two control-attribute settings; one generation technique uses the current best solution and the other does not. The proposed and existing competitive machine learning techniques are applied to drug synergy data. Extensive analysis demonstrates that the proposed technique outperforms the others in terms of accuracy, root mean square error, and coefficient of correlation.
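A sketch of the core DE/rand/1/bin generation step that such an ensemble DE builds on (the ensemble's two generation strategies and control-setting pools are not reproduced here). The (C, gamma) stand-in objective replaces actually cross-validating an SVM per candidate.

```python
import random

def de_step(pop, fitness, f, F=0.8, CR=0.9, rng=random):
    """One DE/rand/1/bin generation: mutate, crossover, greedy select."""
    n, dim = len(pop), len(pop[0])
    new_pop, new_fit = [], []
    for i in range(n):
        a, b, c = rng.sample([j for j in range(n) if j != i], 3)
        mutant = [pop[a][d] + F * (pop[b][d] - pop[c][d]) for d in range(dim)]
        jr = rng.randrange(dim)           # guarantee one mutant gene
        trial = [mutant[d] if (rng.random() < CR or d == jr) else pop[i][d]
                 for d in range(dim)]
        ft = f(trial)
        if ft <= fitness[i]:              # greedy one-to-one selection
            new_pop.append(trial); new_fit.append(ft)
        else:
            new_pop.append(pop[i]); new_fit.append(fitness[i])
    return new_pop, new_fit

# Stand-in objective for SVM tuning: pretend cross-validation error is
# minimised at (C, gamma) = (1.0, 0.1).
f = lambda p: (p[0] - 1.0) ** 2 + (p[1] - 0.1) ** 2
rng = random.Random(0)
pop = [[rng.uniform(0, 5), rng.uniform(0, 1)] for _ in range(10)]
fit = [f(p) for p in pop]
for _ in range(100):
    pop, fit = de_step(pop, fit, f, rng=rng)
best = min(fit)
```

The ensemble variant in the paper runs several such strategies and parameter settings side by side and shares the best solutions between them.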

12.
Conventional methods for anomaly detection generally rely on clustering, proximity, or classification. With the massive growth in surveillance video, outliers and anomalies find ingenious ways to obscure themselves and make conventional techniques inefficient. This research explores graph neural networks (GNNs), which generalize deep learning frameworks to graph-structured data. Every node in the graph structure is labeled, and anomalies, represented by unlabeled nodes, are predicted by performing random walks on the node-based graph structures. Owing to their strong learning abilities, GNNs have gained popularity in domains such as natural language processing, social network analytics, and healthcare. Anomaly detection is a challenging task in computer vision, but the proposed GNN-based algorithm identifies anomalies efficiently. The graph-based deep learning networks are designed to predict unknown objects and outliers, which here take the form of malicious nodes, while the edges between nodes represent relationships among them. In an anomalous case, such as the bike rider in the Pedestrians data, the rider node carries a negative edge value and is identified as an anomaly. The encoding and decoding layers are crucial for determining how statistical measurements affect anomaly identification and for correcting the graph path towards the best possible outcome. Results show that the proposed framework is ahead of traditional approaches in detecting unusual activities, which shows great potential for automatic monitoring of surveillance video: with autonomous CCTV monitoring, crime, damage, or destruction by a group or crowd can be identified and alarms triggered for unusual activities in streets or public places. The suggested GNN model improves accuracy by 4% on the Pedestrian 2 dataset and 12% on the Pedestrian 1 dataset compared with several state-of-the-art techniques.
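A toy version of the random-walk idea on a node graph: nodes that uniform random walks rarely reach are anomaly candidates. The graph, walk counts, and scoring are invented for illustration and are far simpler than a trained GNN.

```python
import random

def random_walk(adj, start, length, rng=random):
    """Uniform random walk on an adjacency-list graph; visit counts
    can flag nodes that walks rarely reach (candidate anomalies)."""
    walk = [start]
    for _ in range(length):
        walk.append(rng.choice(adj[walk[-1]]))
    return walk

# A well-connected pedestrian cluster {0, 1, 2} and a weakly
# attached node 3 (e.g. the bike rider).
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = random.Random(0)
visits = {n: 0 for n in adj}
for _ in range(200):
    for n in random_walk(adj, 0, 10, rng):
        visits[n] += 1
least = min(visits, key=visits.get)   # least-visited node -> anomaly flag
```

In the paper the walk statistics feed node embeddings through encoding/decoding layers; here the raw visit count alone already isolates the weakly attached node.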

13.
Supervised machine learning techniques are well established in the study of spectroscopy data. However, the unsupervised learning technique of cluster analysis has not reached the same level of maturity in chemometric analysis. This paper surveys recent studies that apply cluster analysis to NIR and IR spectroscopy data. In addition, we summarize current practices in cluster analysis of spectroscopy data and contrast them with the cluster analysis literature from the machine learning and pattern recognition domains, covering data pre-processing, feature extraction, clustering distance metrics, clustering algorithms, and validation techniques. Special consideration is given to the specific characteristics of IR and NIR spectroscopy data, which typically have high dimensionality and relatively low sample size. The findings highlight a lack of quantitative analysis and evaluation in current cluster analysis practice for IR and NIR spectroscopy data. With this in mind, we propose an analysis workflow with techniques specifically suited to cluster analysis of IR and NIR spectroscopy data, along with a pragmatic application strategy.

14.
This paper focuses on identifying the relationships between a disease and its potential risk factors using Bayesian networks in an epidemiologic study, with emphasis on integrating medical domain knowledge and statistical data analysis. An integrated approach is developed to identify the risk factors associated with patients' occupational histories and is demonstrated on real-world data. The approach has several steps. First, raw data are preprocessed into a format acceptable to the learning algorithms of Bayesian networks; important considerations addressing the uniqueness of the data and the challenges of learning are discussed. Second, a Bayesian network is learned from the preprocessed data set by integrating medical domain knowledge with generic learning algorithms. Third, the relationships revealed by the Bayesian network are used for risk factor analysis, including identification of a group of people who share certain common characteristics and have a relatively high probability of developing the disease, and prediction of a person's risk of developing the disease given information on his or her occupational history. Copyright © 2007 John Wiley & Sons, Ltd.

15.
Diabetes is one of the fastest-growing human diseases worldwide and poses a significant threat to longevity. Early prediction of diabetes is crucial for taking precautionary steps to avoid or delay its onset. In this study, we proposed a Deep Dense Layer Neural Network (DDLNN) for diabetes prediction using a dataset with 768 instances and nine variables. We also applied a combination of classical machine learning (ML) algorithms and ensemble learning algorithms for effective prediction of the disease. The classical ML algorithms used were Support Vector Machine (SVM), Logistic Regression (LR), Decision Tree (DT), K-Nearest Neighbor (KNN), and Naïve Bayes (NB). We also constructed ensemble models, bagging (Random Forest) and boosting (AdaBoost and Extreme Gradient Boosting, XGBoost), to evaluate the performance of the prediction models. The proposed DDLNN model and the ensemble learning models were trained and tested with hyperparameter tuning and K-fold cross-validation to determine the best parameters for predicting the disease. The combined ML models used majority voting to select the best outcome among the models. The efficacy of the proposed and other models was evaluated for effective diabetes prediction. The investigation concluded that the proposed model, after hyperparameter tuning, outperformed the other learning models with an accuracy of 84.42%, a precision of 85.12%, a recall of 65.40%, and a specificity of 94.11%.
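The majority-voting combination step the abstract mentions can be sketched in a few lines; the three base models' predictions below are invented for the example.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model predictions by majority vote, sample by sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Rows: predictions from SVM, LR, and KNN for four patients (1 = diabetic).
svm = [1, 0, 1, 0]
lr  = [1, 1, 1, 0]
knn = [0, 0, 1, 1]
print(majority_vote([svm, lr, knn]))  # -> [1, 0, 1, 0]
```

With an odd number of binary voters there are no ties; for an even panel a tie-breaking rule (e.g. fall back to the best single model) would be needed.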

16.
A big hurdle for financial institutions is deciding which candidates should be given a line of credit, that is, identifying the right people without incurring credit risk. For such a crucial decision, past demographic and financial data of debtors are important for building an automated, machine-learning-based credit score prediction model. In addition, for robust and accurate machine learning models, important input predictors (debtor information) must be selected. The present computational work focuses on building a credit scoring prediction model using the publicly available German credit data. An improvement in credit scoring prediction is shown with the use of different feature selection techniques (Information Gain, Gain Ratio, and Chi-Square) and machine learning classifiers (Bayesian, Naïve Bayes, Random Forest, Decision Tree (C5.0), and Support Vector Machine (SVM)). Further, a comparative analysis is performed across the classifiers and across the feature selection techniques, with evaluation metrics including accuracy, F-measure, false positive rate, false negative rate, and training time. After the analysis, the best combination of classifier and feature selection technique is identified: Random Forest (RF) with Chi-Square (CS) is found to perform well with respect to accuracy, F-measure, and low false positive and false negative rates, although its training time is slightly higher; the result of C5.0 with Chi-Square is comparable to the best one. This study gives financial institutions an opportunity to build an automated model for better credit scoring.
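A self-contained sketch of information gain, one of the feature selection criteria compared in the study, on a toy credit dataset; the feature and class values are invented for the example.

```python
import math

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n)
                for c in {l: labels.count(l) for l in set(labels)}.values())

def info_gain(feature, labels):
    """Information gain of a categorical feature w.r.t. the class:
    H(class) minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    gain = entropy(labels)
    for v in set(feature):
        sub = [l for fv, l in zip(feature, labels) if fv == v]
        gain -= (len(sub) / n) * entropy(sub)
    return gain

# Toy credit data: 'housing' splits good/bad risk perfectly, 'phone' does not.
risk    = ["good", "good", "bad", "bad"]
housing = ["own", "own", "rent", "rent"]
phone   = ["yes", "no", "yes", "no"]
print(info_gain(housing, risk), info_gain(phone, risk))  # -> 1.0 0.0
```

Ranking features by this score and keeping the top ones is the filter step that precedes classifier training; Gain Ratio divides this score by the feature's own entropy to avoid favouring many-valued features.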

17.
Ueno, Maomi; Yamazaki, Takahiro. Behaviormetrika, 2008, 35(2): 137–158

This paper proposes a collaborative filtering method for massive datasets based on Bayesian networks. We first compare the prediction accuracy of four score-based Bayesian network learning algorithms (AIC, MDL, UPSM, and BDeu) and two conditional-independence-based (CI-based) learning algorithms (MWST and Polytree-MWST) on actual massive datasets. The results show that (1) for large networks, the score-based algorithms have lower prediction accuracy than the CI-based algorithms, and (2) when the score-based algorithms use a greedy search to learn a large network, algorithms that add many arcs tend to have lower prediction accuracy than those that add fewer arcs. Next, we propose a learning algorithm based on MWST for collaborative filtering of massive datasets. The proposed algorithm employs a traditional data mining technique, the Apriori algorithm, to quickly calculate from massive datasets the mutual information needed in MWST. We compare the original MWST algorithm and the proposed algorithm on actual data, and the comparison shows the effectiveness of the proposed algorithm.
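The quantity MWST needs for every variable pair is mutual information; a direct sketch of that computation from co-occurrence counts (the paper's Apriori-based counting speedup is not reproduced, and the rating values are invented).

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Mutual information I(X;Y) in bits, estimated from a list of
    (x, y) observations via empirical probabilities."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

# Perfectly correlated item ratings share 1 bit; independent ones share none.
print(mutual_information([(0, 0), (1, 1), (0, 0), (1, 1)]))  # -> 1.0
print(mutual_information([(0, 0), (0, 1), (1, 0), (1, 1)]))  # -> 0.0
```

MWST then builds a maximum-weight spanning tree over the variables using these pairwise scores as edge weights, which is why computing them quickly on massive data matters.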


18.
Weathering has several adverse effects on the physical, mechanical, and deformation characteristics of rock, yet determining the weathering degree of rocks presents some difficulties. Ideally, the weathering degree would be determined from simple test results with reliable prediction models. Considering this, the purpose of the present study is to construct simple, low-cost weathering degree prediction models with two soft computing techniques: artificial neural networks and fuzzy inference systems. In developing these models, the results were tested against data from specimens collected from the Harsit granitoid (NE Turkey) and data published in the literature. Model inputs are porosity, P-wave velocity, and uniaxial compressive strength; the model output is weathering degree. The models exhibited high prediction performance when checked against training and test data sets, showing that they can be used for indirect determination of weathering degree. The artificial neural network model requires numerical data as input, while the fuzzy inference system model can take both numerical data and expert opinion. In conclusion, the models have high potential for determining the weathering degree of a rock for various purposes.

19.
Immunization is a noteworthy and proven tool for eliminating life-threatening infectious diseases and reducing child mortality and morbidity. The Expanded Program on Immunization (EPI) is a nationwide program in Pakistan for implementing immunization activities; however, coverage is quite low despite the accessibility of free vaccination. This study proposes a defaulter prediction model for accurate identification of defaulters. Our proposed framework classifies defaulters at five stages: defaulter, partially high, partially medium, partially low, and unvaccinated, to reinforce targeted interventions by accurately predicting children at high risk of defaulting from the immunization schedule. Different machine learning algorithms were applied to the Pakistan Demographic and Health Survey (2017–18) dataset. A Multilayer Perceptron yielded 98.5% accuracy in correctly identifying children likely to default from the immunization series at the different risk stages. The proposed defaulter prediction framework is a step towards a data-driven approach and provides a set of machine learning techniques for predictive analytics, which can reinforce immunization programs by expediting targeted action to reduce dropouts. Specifically, accurate predictions support targeted messages sent to at-risk parents' and caretakers' consumer devices (e.g., smartphones) to maximize healthcare outcomes.

20.
Dementia is a disorder with high societal impact and severe consequences for its patients, who suffer a progressive cognitive decline that leads to increased morbidity, mortality, and disability. Since there is consensus that dementia is a multifactorial disorder that produces changes in the brain of the affected individual as early as 15 years before onset, prediction models aiming at early detection and risk identification should account for these characteristics. This study presents a novel method for ten-year prediction of dementia using multifactorial data comprising 75 variables. Two automated diagnostic systems are developed that use genetic algorithms for feature selection, with an artificial neural network and a deep neural network used for dementia classification. The proposed model based on the genetic algorithm and deep neural network achieved the best accuracy of 93.36%, sensitivity of 93.15%, specificity of 91.59%, and MCC of 0.4788, and performed better than 11 other machine learning techniques previously presented for dementia prediction. The best predictors identified were: age, past smoking habit, history of infarct, depression, hip fracture, single-leg standing test with the right leg, physical component summary score, and history of TIA/RIND. The identification of risk factors is imperative in dementia research as an effort to prevent or delay its onset.
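A toy genetic algorithm over bit-mask feature subsets, the selection mechanism this abstract describes. The fitness function below is a stand-in for the real classifier-based evaluation, and the GA operators and rates are illustrative assumptions.

```python
import random

def ga_feature_select(n_feat, fitness, pop_size=20, gens=40, seed=1):
    """Tiny GA over bit-mask feature subsets: elitism, truncation
    selection, one-point crossover, and bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(gens):
        scored = sorted(pop, key=fitness, reverse=True)
        pop = scored[:2]                       # elitism: keep the 2 best
        while len(pop) < pop_size:
            p1, p2 = rng.sample(scored[:10], 2)
            cut = rng.randrange(1, n_feat)     # one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < 0.2:             # bit-flip mutation
                j = rng.randrange(n_feat)
                child[j] ^= 1
            pop.append(child)
    return max(pop, key=fitness)

# Stand-in fitness: features 0 and 2 are informative, the rest only add
# cost (in the paper this would be classifier accuracy on the subset).
informative = {0, 2}
fitness = lambda mask: (sum(1 for i, b in enumerate(mask)
                            if b and i in informative)
                        - 0.1 * sum(mask))
best = ga_feature_select(6, fitness)
```

The small penalty per selected feature pushes the GA towards compact masks, mirroring how wrapper-style selection trades accuracy against subset size.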

