Similar literature
1.
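To better predict stock price trends and address the low prediction accuracy that arises with many features and large data volumes, a new combined model based on the Pearson coefficient and random forest is proposed. The Pearson coefficient is used to test correlation and remove irrelevant features; an improved grid search tunes the decision tree parameters; and a random forest then performs regression on the remaining features to produce the final prediction. Experimental results show that the improved random forest substantially improves on both mean absolute error (MAE) and mean squared error (MSE). For the stock Jinshiyuan (今世缘), the improved random forest lowers MSE by 56% and MAE by 37.3% relative to the conventional random forest, and prediction for the other two stocks also improves. The combined model supports short-term regression forecasting of stock prices and reduces the influence of noise on the predictions. The study provides evidence for better stock price prediction and gives investors guidance on choosing among the factors that affect stock prices.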
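
The correlation-based feature removal described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 0.3 threshold, and the toy price data are all assumptions introduced here, and the grid search and random forest steps are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_features(features, target, threshold=0.3):
    """Keep the names of features whose |r| with the target reaches the threshold.

    The threshold is a hypothetical choice; the abstract does not state one.
    """
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= threshold]


# Hypothetical toy data: closing prices and two candidate features.
close = [10.0, 10.5, 11.0, 11.8, 12.1]
features = {
    "open": [9.8, 10.4, 10.9, 11.5, 12.0],   # tracks the closing price
    "noise": [1.0, -1.0, 1.0, -1.0, 1.0],    # unrelated oscillation
}
selected = select_features(features, close)
```

Only the features that survive this filter would then be passed to the tuned random forest regressor.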

2.
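Research on electricity load forecasting based on the random forest algorithm   Cited by: 3 (self-citations: 0, by others: 3)
To address the low accuracy of current electricity load forecasting, which fails to model the actual load distribution well and therefore cannot reasonably predict future load data, a random forest classification model, a random forest regression model, and a time series model combined with Weka were implemented and used to forecast the load data of a province. Extensive experiments and evaluation across the models show that all three can reasonably predict future electricity load. Moreover, under the same evaluation metric, the random forest algorithm combined with the Weka time series model predicts load at future time points comparatively well.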

3.
The incidence and prevalence of end-stage renal disease (ESRD) in Taiwan are the highest in the world. Hemodialysis (HD) therapy is therefore a major concern and an important challenge, given the shortage of donated organs for transplantation. Previous researchers have developed various forecasting models, based on statistical methods and artificial intelligence techniques, to address the real-world problems of HD therapy faced by ESRD patients and their doctors in healthcare services. Because the performance of these forecasting models depends heavily on the context and the data used, it would be valuable to develop methods better suited to applications in this field. This study presents an integrated procedure based on rough set classifiers that offers ESRD patients and their doctors an alternative method for predicting the urea reduction ratio, which is used to assess HD adequacy. The proposed procedure is illustrated in practice on a dataset from a medical center in Taiwan. The experimental results show that the proposed procedure achieves higher accuracy, with a low standard deviation, than the listed methods. The output of the rough set LEM2 algorithm is a comprehensible set of decision rules that can be applied in knowledge-based healthcare services. The analytical results provide useful information for both academics and practitioners.
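
For readers unfamiliar with the quantity being predicted, the urea reduction ratio is conventionally defined as the percentage drop in blood urea over a dialysis session. The sketch below only computes that standard definition; it does not reproduce the rough set procedure, and the function name and sample values are assumptions for illustration.

```python
def urea_reduction_ratio(pre_urea, post_urea):
    """Urea reduction ratio in percent: 100 * (pre - post) / pre.

    pre_urea and post_urea are blood urea measurements taken before and
    after the session, in the same unit.
    """
    if pre_urea <= 0:
        raise ValueError("pre-dialysis urea must be positive")
    return 100.0 * (pre_urea - post_urea) / pre_urea


# Hypothetical measurements, chosen only to make the arithmetic easy to follow.
urr = urea_reduction_ratio(80.0, 24.0)
```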

4.
An extended two-compartment model is proposed to describe the dynamics of myoglobin in rhabdomyolysis patients undergoing dialysis. Before clinical data were used to estimate the model's unknown parameters, structural identifiability analysis was performed to determine whether the parameters are unique given certain clinical observations. A Taylor series expansion method was implemented, which found that the model is structurally globally (uniquely) identifiable for both the on- and off-dialysis phases. The fitted model was then used predictively, showing that a Theralite high cut-off (HCO) or HCO 1100 dialyser gives a significant reduction in renal exposure to myoglobin compared with standard haemodialysis (HD).

5.
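This paper studies thunderstorms as a type of extreme weather and uses historical meteorological data to predict whether a thunderstorm will occur within the next three hours. To that end, it examines the sampling of extreme-weather meteorological data, data preprocessing, feature selection, and modeling, and proposes HY-FMV, a machine-learning-based model framework for thunderstorm prediction. The model uses a hybrid model for data preprocessing, selects and constructs features based on probability distributions and model evaluation, and uses the gradient boosting tree algorithm to classify extreme weather. Finally, using data from Fujian and Guangdong provinces from 2010 to 2015, thunderstorms are predicted with the proposed HY-FMV model and with random forest and other algorithms. The results show that HY-FMV reaches 78% on the F1 metric, improving thunderstorm prediction accuracy by 0.5%-0.6% compared with the other algorithms.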
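
The F1 metric reported in this abstract is the harmonic mean of precision and recall. A minimal sketch of the calculation is shown below; the confusion counts are hypothetical values picked so the result lands at 0.78, and they are not taken from the paper.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 = 2 * precision * recall / (precision + recall)."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0  # no positive predictions or no positive cases
    precision = true_positives / predicted_positive
    recall = true_positives / actual_positive
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a thunderstorm / no-thunderstorm classifier.
f1 = f1_score(true_positives=39, false_positives=11, false_negatives=11)
```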

6.
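Adverse nursing event reports in hospitals currently consist mostly of unstructured text that lacks a sound, explicit classification; manual analysis is difficult and subject to human factors, including omitted or concealed reports and deliberately downgraded event levels. To address these problems, a text classification model for Chinese adverse nursing event reports is proposed, based on a character-level convolutional neural network (CNN) and a support vector machine (SVM). The model vectorizes text by building a character-level vocabulary, uses the CNN to extract abstract features from the text, and uses the SVM classifier to classify the Chinese text. Comparison experiments against several classification models, including a conventional TF-IDF-based SVM and random forest, are used to verify the model's classification performance on Chinese adverse nursing event text.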

7.
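Yu Dongchang (余东昌), Zhao Wenfang (赵文芳), Nie Kai (聂凯), Zhang Ge (张舸). Journal of Computer Applications (《计算机应用》), 2021, 41(4): 1035-1041
To improve the accuracy of visibility forecasting, especially for low visibility, a visibility prediction model based on ensemble learning with random forest and LightGBM is proposed. First, starting from the meteorological forecast data of a numerical model system, combined with ground meteorological observations and PM2.5 concentration observations, the random forest algorithm is used to construct the feature vector. Second, for missing data spanning different lengths of time, three missing-value handling methods are designed to fill the gaps, producing a reasonably continuous sample set for training and testing. Finally, a LightGBM-based visibility prediction model is built and its parameters are optimized by grid search. The proposed model is compared with support vector machine (SVM), multiple linear regression (MLR), and artificial neural network (ANN) models. Experimental results show that the LightGBM-based model obtains high threat scores (TS) across visibility levels; for low visibility below 2 km, the average correlation coefficient between the predicted and observed visibility at the observation stations is 0.75, and the average mean squared error is 6.49. The LightGBM-based model can therefore effectively improve visibility prediction accuracy.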
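
The threat score (TS) used in this evaluation is commonly computed as hits divided by the sum of hits, misses, and false alarms. The sketch below applies that common definition to a "visibility below 2 km" event; the function name and the sample visibility values are assumptions for illustration, and the paper's exact scoring setup may differ.

```python
def threat_score(observed_events, forecast_events):
    """Threat score: hits / (hits + misses + false alarms).

    Both arguments are equal-length sequences of booleans marking whether
    the event occurred (observed) or was predicted (forecast).
    """
    pairs = list(zip(observed_events, forecast_events))
    hits = sum(1 for obs, fc in pairs if obs and fc)
    misses = sum(1 for obs, fc in pairs if obs and not fc)
    false_alarms = sum(1 for obs, fc in pairs if fc and not obs)
    denominator = hits + misses + false_alarms
    if denominator == 0:
        return float("nan")  # the event was never observed or forecast
    return hits / denominator


# Hypothetical visibility values in km; the event is "visibility < 2 km".
observed_km = [1.5, 3.0, 0.8, 5.0, 1.9]
forecast_km = [1.2, 1.8, 2.5, 6.0, 1.0]
ts = threat_score([v < 2.0 for v in observed_km],
                  [v < 2.0 for v in forecast_km])
```

Correct rejections (neither observed nor forecast) do not enter the score, which is why TS is often preferred for rare events such as low visibility.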

8.
Adsorption of plasma proteins onto the membrane surface during the hemodialysis session is a key feature of membranes used for chronic dialysis therapy. In this issue of Proteomics—Clinical Applications, Han et al. use proteomic technologies to describe how two membranes made from the same biomaterial (Polyamix) but with different flux characteristics (low flux and high flux, the former having smaller pore size) may differ in their adsorptive properties. A total of 497 differentially expressed proteins were identified in eluates obtained after in vivo hemodialysis: 320 proteins were more concentrated in the low-flux membrane (predominantly proteins with molecular weight 30–60 kDa) and 177 in the high-flux membrane (mostly proteins with molecular weight 10–15 kDa). Bioinformatics tools shed light on the involvement of adsorbed proteins in important biological pathways, such as the coagulation cascade and the complement system, again with some differences between the two membranes. The study indicates that the flux characteristics of a biomaterial used for hemodialysis membranes strongly influence its adsorptive properties, and that proteomics may provide information relevant to renal replacement therapy.

9.
10.
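As machine learning models are applied more widely, researchers have come to recognize the limitations of these methods. Most of these models are black boxes and therefore poorly interpretable. To address this problem, a rule-based interpretable model and a rule reduction method are proposed on the basis of ensemble learning models, comprising steps such as generating an optimized random forest model and discovering and reducing redundant rules. First, an evaluation method for random forest models is proposed, and the key parameters of the random forest are optimized following the idea of reinforcement learning, yielding a more interpretable random forest model. Second, redundancy is eliminated from the rule set extracted from the random forest, yielding a more compact rule set. Experiments on public datasets show that the generated rule set performs well in both prediction accuracy and interpretability.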

11.
The reduction of morbidity and mortality in patients undergoing hemodialysis or peritoneal dialysis is strongly related to efficient and selective clearance of uremic toxins. We used proteomics methods to analyze and further characterize the dialytic removal of still undefined middle and high molecular weight proteins as a basis for further improvement of dialysis assessment. Dialysates from 26 hemodialysis patients treated with different types of low-flux (F6HPS) and high-flux (FX80, APS650, FX60) filters, as well as peritoneal fluids from 10 continuous ambulatory peritoneal dialysis (CAPD) patients, were analyzed by SELDI-TOF and 2-DE. The protein patterns showed selective differences in the proteins cleared, depending on the dialysis method and the filter membrane used. While SELDI analyses of dialysates from the F6HPS revealed almost no protein clearance, high-flux filter and CAPD dialysates showed protein release across different molecular weight ranges. Furthermore, 2-DE and MS analysis identified 48 different proteins in the dialysate of high-flux filters and 21 in peritoneal dialysis fluids. In F6HPS dialysates, however, only a few proteins could be identified.

12.
The basic idea of estimation of distribution algorithms is to replace heuristic operators with machine learning models such as regression, clustering, or classification models. Model-based evolutionary algorithms (MBEAs) have recently been divided into three groups: estimation of distribution algorithms (EDAs), surrogate-assisted evolutionary algorithms, and inverse models that map from the objective space to the decision space. In this article, a new approach based on an inverse Gaussian process model and a random forest framework is proposed. The main idea is to apply random forest variable importance with random grouping to determine some of the best assignments of decision variables to objective functions, in order to form Gaussian process inverse models that map the good solutions discovered in the objective space to the decision space. These inverse models then generate offspring by sampling the objective space. The proposed algorithm was tested on benchmark test suites for evolutionary algorithms (modified Deb K, Thiele L, Laumanns M, Zitzler E (DTLZ) and Walking Fish Group (WFG)), and the results indicate that it is a competitive and promising approach.

13.
Patients who suffer from chronic renal failure (CRF) tend to suffer from an associated anemia as well, so it is essential to know the hemoglobin (Hb) levels of these patients. The aim of this paper is to predict the Hb value using a database of European hemodialysis patients provided by Fresenius Medical Care (FMC), in order to improve the treatment of such patients. Both analytical measurements and medication dosages of patients with CRF are used for the prediction. Two kinds of models were trained: global and local. For the local models, clustering techniques based on hierarchical approaches and adaptive resonance theory (ART) were used as a first step, and a separate predictor was then used for each resulting cluster. Several global models were applied to the dataset, including linear models, artificial neural networks (ANNs), support vector machines (SVMs), and regression trees. A relevance analysis was also carried out for each predictor model to find the features most relevant to the prediction.

14.
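Traffic incident prediction model based on Bayesian network theory   Cited by: 1 (self-citations: 0, by others: 1)
Taking into account the factors that affect traffic safety, a traffic incident prediction model based on Bayesian network theory is proposed. A learning algorithm based on Bayes' rule is built on the model, and the likelihood of each type of traffic incident is computed from the conditional probabilities between variables, which achieves the prediction. Finally, simulation experiments are run on the model and the results are analyzed, verifying the model's effectiveness.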
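
The step of computing incident likelihoods from conditional probabilities can be illustrated with a single application of Bayes' rule. This sketch is far simpler than a full Bayesian network, and the incident types, prior probabilities, and rain likelihoods are invented for illustration; none of them come from the paper.

```python
def posterior(priors, likelihoods):
    """Return P(event | evidence) for each event via Bayes' rule.

    priors[event] is P(event); likelihoods[event] is P(evidence | event).
    """
    joint = {event: priors[event] * likelihoods[event] for event in priors}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("evidence has zero probability under every event")
    return {event: value / evidence for event, value in joint.items()}


# Hypothetical numbers: how observing rain shifts belief about road state.
priors = {"accident": 0.02, "congestion": 0.10, "normal": 0.88}
rain_likelihood = {"accident": 0.60, "congestion": 0.50, "normal": 0.20}
result = posterior(priors, rain_likelihood)
most_likely = max(result, key=result.get)
```

In this toy case normal traffic remains the most likely state, but the probability of an accident rises above its prior once rain is observed.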

15.
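With the continuing development of information technology, using online data to predict near-term trends has become a research focus. This paper targets monthly tourist volume in Beijing and uses the search indices of related web keywords as independent variables to build three single prediction models: a BP neural network, support vector regression, and random forest. A combined model is then built on top of them to improve prediction accuracy. Experimental results show that the combined model built with GBDT reaches high accuracy, with an error of only 3.16%, and that the predictions can support decision-making by tourism management departments.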

16.
Diagnosing cardiovascular disease has been one of the biggest medical challenges in recent years. Coronary heart disease (CHD) is a disease of the heart and blood vessels. Predicting this kind of cardiac illness supports more precise decisions about cardiac disorders. Applying grid search optimization (GSO) to machine learning models is therefore a useful way to predict the disease as early as possible. The work tunes hyperparameters together with feature selection, using model search to minimize the false-negative rate. Three models with a cross-validation approach perform the task. Feature selection is based on statistical and correlation matrices for multivariate analysis. For the random search and grid search models, extensive comparison results are produced using recall, F1 score, and precision. The models are also evaluated using the kappa statistic, which illustrates the comparability of the three models. The study focuses on optimizing feature selection and tuning hyperparameters to improve model accuracy in predicting heart disease on the Framingham dataset using random forest classification. Tuning the hyperparameters with grid search thus decreases the error rate and achieves global optimization.

17.
In modeling multivariate failure time data, a class of survival models with random effects is applicable. It incorporates the random effect terms in the linear predictor and includes various random effect survival models as special cases, such as the random effect model assuming Cox's proportional hazards, with Weibull baseline hazards, and with a power family of transformations in the relative risk function. Residual maximum likelihood (REML) estimation of parameters is achieved by adopting the generalised linear mixed models (GLMM) approach. Accordingly, influence diagnostics are developed as sensitivity measures for the REML estimation of model parameters. A data set of recurrent infections in kidney patients on portable dialysis illustrates the usefulness of the influence diagnostics. A simulation study is carried out to examine the performance of the proposed influence diagnostics.

19.
This paper proposes the random subspace binary logit (RSBL) model (or random subspace binary logistic regression analysis), which combines the random subspace approach with the classical logit model to generate a group of diverse logit decision agents from various perspectives for predictive problems. These diverse logit models are then combined for a more accurate analysis. The proposed RSBL model takes advantage of both the logit (logistic regression) and random subspace approaches. The random subspace approach generates diverse sets of variables that represent the current problem as different masks. Logit decision agents are constructed from these masks in place of a single logit model. To verify its performance, we used the proposed RSBL model to forecast corporate failure in China. The results indicate that this model significantly improves on the predictive ability of classical statistical models such as multivariate discriminant analysis, the logit model, and the probit model. The proposed model should therefore make the logit model more suitable for predictive problems in academic and industrial use.

20.
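To predict near-term bus passenger flow scientifically and accurately, and given that such prediction is influenced by many factors and is nonlinear, random grey variables are used to describe the uncertainty of the prediction system. A random grey prediction model and a recurrent neural network model based on the ant colony algorithm are built, and on this basis a near-term bus passenger flow prediction method based on a random grey ant colony neural network is proposed. Finally, taking the city of Tongling as an example, the prediction accuracy and validity of the model are analyzed. The results show that the prediction accuracy of the ant-colony-based recurrent neural network model is not only higher than that of the other single prediction models but also clearly better than that of other conventional combined prediction models; it reflects the underlying trend well, can guide bus operators' near-term decisions, and effectively improves prediction accuracy.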
