Similar literature
20 similar documents found (search time: 15 ms)
1.
In this paper, a hybrid approach that combines conditional restricted Boltzmann machines (CRBM) and echo state networks (ESN) for binary time series prediction is proposed. Both methods have demonstrated their ability to extract complex dynamic patterns from time-dependent data in several applications and benchmark studies. To the authors’ knowledge, this is the first time that the proposed combination of algorithms has been applied to reliability prediction. The proposed approach is verified on a case study predicting the occurrence of railway operation disruptions based on discrete-event data, which is represented by a binary time series. The case study concerns speed restrictions affecting railway operations, caused by failures of tilting systems of railway vehicles. The overall prediction accuracy of the algorithm is 99.93%; the prediction accuracy for occurrence of speed restrictions within the foresight period is 98% (which corresponds to the sensitivity of the algorithm). The prediction results of the case study are compared to those of an MLP trained with a Newton conjugate gradient algorithm. The proposed approach proves to be superior to the MLP.

2.
Previous studies on predicting the box-office performance of a movie using machine learning techniques have shown practical levels of predictive accuracy. These works are technically and methodologically oriented, focusing mainly on which algorithms are better at predicting movie performance. However, the accuracy of a prediction model can also be raised by taking other perspectives, such as introducing unexplored features that might be related to the outcomes. In this paper, we examine multiple approaches to improve the performance of the prediction model. First, we develop and add a new feature derived from the theory of transmedia storytelling. Such theory-driven feature selection not only increases the forecast accuracy, but also enhances the interpretability of a prediction model. Second, we use an ensemble approach, which has rarely been adopted in research on predicting box-office performance. As a result, the proposed model, the Cinema Ensemble Model (CEM), outperforms the prediction models from past studies that use machine learning algorithms. We suggest that CEM can be used extensively by industry experts as a powerful tool for improving the decision-making process.

3.
《Knowledge》2006,19(7):544-553
Bayesian networks (BNs) provide a means for representing, displaying, and making available in a usable form the knowledge of experts in a given field. In this paper, we look at the performance of an expert-constructed BN compared with other machine learning (ML) techniques for predicting the outcome (win, lose, or draw) of matches played by Tottenham Hotspur Football Club. The period under study was 1995–1997; the expert BN was constructed at the start of that period, based almost exclusively on subjective judgement. Our objective was to determine retrospectively the accuracy of the expert BN compared to some alternative ML models that were built using data from the two-year period. The additional ML techniques considered were: MC4, a decision tree learner; a Naive Bayesian learner; a Data Driven Bayesian learner (a BN whose structure and node probability tables are learnt entirely from data); and a K-nearest neighbour learner. The results show that the expert BN is generally superior to the other techniques for this domain in predictive accuracy. The results are even more impressive for BNs given that, in a number of key respects, the study assumptions place them at a disadvantage. For example, we have assumed that the BN prediction is ‘incorrect’ if a BN predicts more than one outcome as equally most likely (whereas, in fact, such a prediction would prove valuable to somebody who could place an ‘each way’ bet on the outcome). Although the expert BN has now long been irrelevant (since it contains variables relating to key players who have retired or left the club), the results here tend to confirm the excellent potential of BNs when they are built by a reliable domain expert. The ability to provide accurate predictions without requiring much learning data is an obvious bonus in any domain where data are scarce. Moreover, the BN was relatively simple for the expert to build and its structure could be used again in this and similar types of problems.
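The strict tie rule described in this abstract can be illustrated with a minimal sketch. The function name `strict_accuracy` and the probabilities below are hypothetical and are not taken from the study; the sketch only shows how a prediction with two equally likely top outcomes is counted as a miss.

```python
def strict_accuracy(predictions, outcomes):
    """Fraction of matches scored correct under the strict tie rule."""
    correct = 0
    for probs, actual in zip(predictions, outcomes):
        top = max(probs.values())
        # Collect every outcome that shares the highest probability.
        best = [o for o, p in probs.items() if p == top]
        # A tie for first place counts as incorrect, even if it
        # includes the actual result.
        if len(best) == 1 and best[0] == actual:
            correct += 1
    return correct / len(outcomes)


# Hypothetical predicted distributions for three matches.
predictions = [
    {"win": 0.5, "draw": 0.3, "lose": 0.2},  # unique top outcome, correct
    {"win": 0.4, "draw": 0.4, "lose": 0.2},  # tie, scored as incorrect
    {"win": 0.2, "draw": 0.3, "lose": 0.5},  # unique top outcome, wrong
]
outcomes = ["win", "draw", "win"]

print(strict_accuracy(predictions, outcomes))
```

Under a more lenient rule the second match could be counted as a hit, which is why the abstract describes this assumption as a disadvantage for the BN.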

4.
Distributed applications are popular for heavy workloads where the resources of a single machine are not sufficient. These distributed applications come with many parameters to tune so that cluster resources can be effectively utilized. However, any misconfiguration of the available parameters may result in suboptimal performance of one or more machines in the cluster. These events may go unnoticed or can result in crashes. This problem of misconfigured parameters has no straightforward solution due to the variety of parameters and the vastly different workloads being processed. In this article, we propose a methodology for machine learning-based detection of misconfigurations. We collect data mined from system resource utilization, Hadoop logs, and job-level metrics to train models using decision trees and support vector machines. The models are used to identify whether a set of configuration parameters could result in a crash or a slowdown for a specific workload. The approach explained in this article can be extended to other distributed big data applications, such as Spark, Hive, and Pig.

5.
Multimedia Tools and Applications - In this work, we propose to reduce the complexity of HEVC video encoding by predicting the split decisions of coding units. We use a sequence-dependent approach...

6.
Fake content is flourishing on the Internet, ranging from basic random word salads to web scraping. Most of this fake content is generated to feed fake web sites aimed at biasing search engine indexes: at the scale of a search engine, using automatically generated texts renders such sites harder to detect than using copies of existing pages. In this paper, we present three methods aimed at distinguishing natural texts from artificially generated ones: the first method uses basic lexicometric features, the second uses standard language models, and the third is based on a relative entropy measure which captures short-range dependencies between words. Our experiments show that lexicometric features and language models are effective at detecting most generated texts, but fail to detect texts that are generated with high-order Markov models. By comparison, our relative entropy scoring algorithm, especially when trained on a large corpus, allows us to detect these “hard” text generators with a high degree of accuracy.
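For readers unfamiliar with relative entropy, the following is a minimal sketch of the generic Kullback-Leibler divergence computed on unigram word frequencies. It is not the authors' scoring algorithm, which works on short-range dependencies between words; the helper names, the `floor` smoothing value, and the toy texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def word_distribution(text):
    """Relative frequency of each lowercased, whitespace-separated word."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def relative_entropy(p, q, floor=1e-9):
    """KL divergence D(p || q) in nats.

    Words that appear in p but not in q are given a small floor
    probability so that the logarithm stays finite (an assumption,
    not a principled smoothing scheme).
    """
    return sum(pw * math.log(pw / q.get(word, floor)) for word, pw in p.items())


# Toy example: a reference text and a suspiciously repetitive candidate.
reference = word_distribution("the cat sat on the mat")
candidate = word_distribution("the the the cat")
score = relative_entropy(candidate, reference)

print(round(score, 4))
```

A larger score means the candidate's word frequencies diverge more from the reference; a text compared with itself scores zero.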

7.
Yang Lei, Feng Li, Zhang Longqing, Tian Liwei. 《The Journal of supercomputing》2021,77(10):11853-11865
The Journal of Supercomputing - The enrollment rate of freshmen has always been a headache for colleges and universities. It is also very difficult to accurately predict the number of freshmen...

8.
It is very important for financial institutions to develop credit rating systems to help them decide whether to grant credit to consumers before issuing loans. In the literature, statistical and machine learning techniques for credit rating have been extensively studied. Recent studies focusing on hybrid models that combine different machine learning techniques have shown promising results. However, there are various ways of combining methods to develop hybrid models, and it is not known which hybrid machine learning model performs best in credit rating. In this paper, four different types of hybrid models are compared, built with ‘Classification + Classification’, ‘Classification + Clustering’, ‘Clustering + Classification’, and ‘Clustering + Clustering’ techniques, respectively. A real-world dataset from a bank in Taiwan is used for the experiment. The experimental results show that the ‘Classification + Classification’ hybrid model based on the combination of logistic regression and neural networks can provide the highest prediction accuracy and maximize the profit.

9.
This paper presents an application of a classification method to adaptively and dynamically modify the therapy and real-time displays of a virtual reality system in accordance with the specific state of each patient, using his/her physiological reactions. First, a theoretical background on several machine learning techniques for classification is presented. Then, nine machine learning techniques are compared in order to select the best candidate in terms of accuracy. Finally, first experimental results are presented to show that the therapy can be modulated as a function of the patient's state using machine learning classification techniques.

10.

Appendicitis is a common disease that occurs particularly often in childhood and adolescence. The accurate diagnosis of acute appendicitis is the most significant precaution to avoid severe, unnecessary surgery. In this paper, the author presents a machine learning (ML) technique to predict whether appendix illness is acute or subacute, especially in patients between 10 and 30 years of age, and whether it requires an operation or only medication for treatment. The dataset has been collected from public hospital-based citizens between 2016 and 2019. The predictive results of the models obtained with different ML techniques (Logistic Regression, Naïve Bayes, Generalized Linear, Decision Tree, Support Vector Machine, Gradient Boosted Tree, Random Forest) are compared. The dataset covers 625 specimens, and the medical records used in this paper include 371 males (60.22%) and 254 females (40.12%). According to the dataset, the records consist of 318 (50.88%) operated and 307 (49.12%) unoperated patients. It is observed that the random forest algorithm obtains the best result, with an accuracy of 83.75%, a precision of 84.11%, a sensitivity of 81.08%, and a specificity of 81.01%. Moreover, an estimation method based on ML techniques is improved and enhanced to detect individuals with acute appendicitis.
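The four figures reported for the random forest model (accuracy, precision, sensitivity, specificity) all derive from a binary confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and are not the study's data, and the function name `classification_metrics` is an illustrative choice.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp: true/false positives, tn/fn: true/false negatives.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all cases predicted correctly
        "precision": tp / (tp + fp),      # share of predicted positives that are real
        "sensitivity": tp / (tp + fn),    # share of real positives that are found
        "specificity": tn / (tn + fp),    # share of real negatives that are found
    }


# Hypothetical counts for 100 patients, chosen only for illustration.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Sensitivity and specificity are reported separately because a model can score well on one while missing many cases of the other class.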


11.
12.
Skin disease is among the most common human diseases. In computer vision applications, skin color is a powerful indicator of this disease. This system identifies skin cancer based on images of the skin. Initially, the skin image is filtered using a median filter and segmented using mean shift segmentation. The segmented images are fed as input to feature extraction. GLCM, moment invariant and GLRLM features are extracted in this research work. The extracted features are classified using classification techniques such as support vector machines (SVM), probabilistic neural networks, random forests (RF), and a combined SVM+RF classifier. The combined SVM+RF classifier provided better results than the other classifiers.

13.
Multimedia Tools and Applications - Breast cancer is one of the most common types of cancer among Jordanian women. Recently, healthcare organizations in Jordan have adopted electronic health...

14.
Bitcoin is the most widely accepted cryptocurrency in the world, which makes it attractive for investors and traders. However, the challenge in predicting the Bitcoin exchange rate is its high volatility, so predicting its behavior is of great importance for financial markets. Recent studies have therefore investigated which internal and/or external Bitcoin information is relevant to its prediction. The increased use of machine learning techniques to predict time series and the acceptance of cryptocurrencies as financial instruments motivated the present study to seek more accurate predictions of the Bitcoin exchange rate. In the first stage of the proposed methodology, different feature selection techniques were evaluated in order to obtain the most relevant attributes for the predictions. Next, the behavior of Artificial Neural Networks (ANN), Support Vector Machines (SVM) and Ensemble algorithms (based on Recurrent Neural Networks and the k-Means clustering method) was analyzed for price direction predictions. Likewise, the ANN and SVM were employed for regression of the maximum, minimum and closing prices of Bitcoin. Moreover, the regression results were also used as inputs to try to improve the price direction predictions. The results showed that the selected attributes and the best machine learning model achieved an improvement of more than 10% in accuracy for the price direction predictions, with respect to the state-of-the-art papers, using the same period of information. For the regressions of the maximum, minimum and closing Bitcoin prices, it was possible to obtain Mean Absolute Percentage Errors between 1% and 2%. These results demonstrate the efficacy of the proposed methodology when compared to other studies.
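As a reminder of what the reported regression metric measures, here is a minimal sketch of the Mean Absolute Percentage Error (MAPE). The price values are made up for illustration and have no connection to the study's data.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes no actual value is zero, since each error is divided
    by the corresponding actual value.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical closing prices and model predictions.
actual = [100.0, 200.0, 400.0]
predicted = [98.0, 204.0, 396.0]

print(f"MAPE: {mape(actual, predicted):.2f}%")
```

Because each error is scaled by the actual price, MAPE stays comparable across periods in which the price level differs widely, which suits a volatile series.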

15.
The analysis of logs related to social communities has recently received considerable attention for its importance in shedding light on social concerns by identifying different groups, which helps in resolving issues such as predicting terrorist groups. In the customer analysis domain, identifying calling communities can be used to determine a particular customer’s value according to the general behavior pattern of the community that the customer belongs to; this supports effective targeted marketing design, which is important for increasing profitability. In the telecommunication industry, machine learning techniques have been applied to Call Detail Record (CDR) data for predicting customer behavior, for example churn. In this paper, we identify calling communities and demonstrate how cluster analysis can be used to effectively identify communities using information derived from the CDR data. We use the information extracted from the cluster analysis to identify customer calling patterns. These calling patterns are then given to a classification algorithm to generate a classifier model for predicting the calling community of a customer. We apply different machine learning techniques to build classifier models and compare them in terms of classification accuracy and computational performance. The reported test results demonstrate the applicability and effectiveness of the proposed approach.

16.
Neural Computing and Applications - The financial time series is inherently nonlinear and hence cannot be efficiently predicted by using linear statistical methods such as regression. Hence,...

17.
It is well known that microarray printing, hybridization, and washing often create erroneous measurements, and these errors detrimentally impact machine microarray spot quality classification. Thus, it is crucial to identify and remove these errors if automation is to replace the still common practice of visually assessing spot quality, an extremely expensive and time-consuming procedure. A major problem in the microarray spot quality classification methods proposed in the literature is the correlation among the features extracted from the spots. In this paper, we propose using a random subspace ensemble of neural networks and a feature selection algorithm to improve the performance of our microarray spot quality classification method. Our best method obtains an error under the receiver operating characteristic curve (EAUR) of 0.3, outperforming the stand-alone support vector machine EAUR of 1.7. The consistency of our proposed approach makes it a viable alternative to the labour-intensive manual method of spot quality assessment.

18.
Computational Visual Media - Visual analytics for machine learning has recently evolved as one of the most exciting areas in the field of visualization. To better identify which research topics are...

19.
Given the importance of implicit communication in human interactions, it would be valuable to have this capability in robotic systems, so that a robot can detect the motivations and emotions of the person it is working with. Recognizing affective states from physiological cues is an effective way of implementing implicit human–robot interaction. Several machine learning techniques have been successfully employed in affect recognition to predict the affective state of an individual given a set of physiological features. However, a systematic comparison of the strengths and weaknesses of these methods has not yet been done. In this paper, we present a comparative study of four machine learning methods (K-Nearest Neighbor, Regression Tree (RT), Bayesian Network and Support Vector Machine (SVM)) as applied to the domain of affect recognition using physiological signals. The results showed that SVM gave the best classification accuracy, even though all the methods performed competitively. RT gave the next best classification accuracy and was the most space- and time-efficient.

20.

Mechanical excavators are widely used in mining, tunneling and civil engineering projects because they can bring productivity to a project quickly, accurately and safely. There are several types of mechanical excavators, such as roadheaders, tunnel boring machines and impact hammers. Among these, roadheaders have some advantages, such as selective mining, mobility, less over-excavation, minimal ground disturbance, elimination of blast vibration, reduced ventilation requirements and initial investment cost. A critical issue in successful roadheader application is the ability to evaluate and predict the machine performance, known as the instantaneous (net) cutting rate. Although there are several methods in the literature for predicting roadheader performance, only a few of them have been developed using artificial neural network techniques. For this purpose, this study gathers from the literature 333 data sets including uniaxial compressive strength and power on the cutting boom, 103 data sets including RQD, and 125 data sets including machine weight. This paper focuses on roadheader performance prediction using six different machine learning algorithms and combinations of machine learning algorithms via ensemble techniques. The algorithms are ZeroR, random forest (RF), Gaussian process, linear regression, logistic regression and multi-layer perceptron (MLP). MLP and RF give better results than the other algorithms, and the best solution was achieved with the bagging technique on RF and principal component analysis (PCA). The best success rate obtained in this study is 90.2% successful prediction, which is relatively better than contemporary research.

