Similar literature
 Found 20 similar articles; search took 31 ms
1.
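In recent years, traditional violent crime and adult crime in China have been declining, but new categories of criminal cases keep emerging. To improve crime prediction in practical police work and to combat all kinds of illegal and criminal incidents, this paper proposes a new crime prediction model for crime data. A density-based clustering method first partitions the crime data; dimensionality reduction then extracts the key attributes to generate feature data; the feature data are then weighted, optimized, and learned with machine learning methods to predict the category of a criminal case. Experimental results show that the method predicts better than traditional methods and offers a new approach to solving and preventing similar cases in practical police work.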

2.
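Software defect prediction is a hot research topic in software quality assurance, and the quality of a defect prediction model is closely related to its training data. Datasets used for defect prediction mainly suffer from two problems: feature selection and class imbalance. For feature selection, the process features commonly used in software development and newly proposed extended process features are adopted, and a clustering-analysis-based feature selection algorithm then selects among them. For class imbalance, an improved Borderline-SMOTE oversampling method is proposed, so that positive and negative samples in the training set are relatively balanced and the synthesized samples better match the features of real samples. Experiments use open-source datasets from projects such as bugzilla and jUnit. The results show that the feature selection algorithm reduces model training time by 57.94% while preserving the model's F-measure; the defect prediction model trained on samples processed with the improved Borderline-SMOTE improves on the model obtained with the original method by an average of 2.36, 1.8, 2.13, and 2.36 percentage points in Precision, Recall, F-measure, and AUC, respectively; the model with extended process features improves the F-measure by an average of 3.79% over the model without them; and compared with models obtained by methods from the literature, the proposed method improves the F-measure by an average of 15.79%. The experimental results demonstrate that the proposed method effectively improves the quality of defect prediction models.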
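
For readers unfamiliar with the oversampling step, the following is a minimal sketch of the standard Borderline-SMOTE idea: only minority samples near the class boundary are used to synthesize new samples by interpolation. It is not the improved variant the abstract proposes, whose details are not given here. The function names, the default `k`, and the seeding are illustrative assumptions.

```python
import math
import random


def k_nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def borderline_smote(X, y, minority=1, k=5, n_new=10, seed=0):
    """Return up to n_new synthetic minority samples built from 'danger' points.

    A minority sample is in 'danger' when at least half, but not all,
    of its k nearest neighbours belong to the majority class.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    danger = []
    for i in minority_idx:
        n_major = sum(1 for j in k_nearest(i, X, k) if y[j] != minority)
        if k / 2 <= n_major < k:
            danger.append(i)

    synthetic = []
    if not danger or len(minority_idx) < 2:
        return synthetic  # nothing to interpolate from
    for _ in range(n_new):
        i = rng.choice(danger)
        # Pick a partner among the nearest minority neighbours of sample i.
        partners = sorted((j for j in minority_idx if j != i),
                          key=lambda j: math.dist(X[i], X[j]))[:k]
        j = rng.choice(partners)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
    return synthetic
```

Samples whose neighbours are all majority-class are treated as noise and skipped, which is the design choice that distinguishes the borderline variant from plain SMOTE.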

3.
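Software defect prediction is a hot research topic in software quality assurance, and the quality of a defect prediction model is closely related to its training data. Datasets used for defect prediction mainly suffer from two problems: feature selection and class imbalance. For feature selection, the process features commonly used in software development and newly proposed extended process features are adopted, and a clustering-analysis-based feature selection algorithm then selects among them. For class imbalance, an improved Borderline-SMOTE oversampling method is proposed, so that positive and negative samples in the training set are relatively balanced and the synthesized samples better match the features of real samples. Experiments use open-source datasets from projects such as bugzilla and jUnit. The results show that the feature selection algorithm reduces model training time by 57.94% while preserving the model's F-measure; the defect prediction model trained on samples processed with the improved Borderline-SMOTE improves on the model obtained with the original method by an average of 2.36, 1.8, 2.13, and 2.36 percentage points in Precision, Recall, F-measure, and AUC, respectively; the model with extended process features improves the F-measure by an average of 3.79% over the model without them; and compared with models obtained by methods from the literature, the proposed method improves the F-measure by an average of 15.79%. The experimental results demonstrate that the proposed method effectively improves the quality of defect prediction models.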

4.

Crime forecasting is one of the most complex challenges in law enforcement today, especially when an analysis tends to evaluate inferable and expanded crime rates, although a few methodologies for subsequent equivalents have been embraced before. In this work, we use a strategy for a time series model and machine testing systems for crime estimation. The paper centers on determining the quantity of crimes. Considering various experimental analyses, this investigation additionally features results obtained from a neural system that could be a significant alternative to machine learning and ordinary stochastic techniques. We applied various techniques to forecast the number of possible crimes in the next 5 years. First, we used existing machine learning techniques to predict the number of crimes. Second, we proposed two approaches: a modified autoregressive integrated moving average model and a modified artificial neural network model. The prime objective of this work is to compare the applicability of a univariate time series model against that of a variate time series model for crime forecasting. More than two million datasets are trained and tested. After rigorous experiments and analysis, the paper concludes that using a variate time series model yields better forecasting results than the predicted values from existing techniques. These results show that the proposed method outperforms existing methods.


5.
Crime risk prediction is helpful for urban safety and citizens' quality of life. However, existing crime studies focused on coarse-grained prediction, and usually failed to capture the dynamics of urban crimes. The key challenge is data sparsity, since 1) not all crimes have been recorded, and 2) crimes usually occur with low frequency. In this paper, we propose an effective framework to predict fine-grained and dynamic crime risks in each road using heterogeneous urban data. First, to address the issue of unreported crimes, we propose a cross-aggregation soft-impute (CASI) method to deal with possible unreported crimes. Then, we use a novel crime risk measurement to capture the crime dynamics from the perspective of influence propagation, taking into consideration both time-varying and location-varying risk propagation. Based on the dynamically calculated crime risks, we design contextual features (i.e., POI distributions, taxi mobility, demographic features) from various urban data sources, and propose a zero-inflated negative binomial regression (ZINBR) model to predict future crime risks in roads. The experiments using real-world data from New York City show that our framework can accurately predict road crime risks, and outperforms other baseline methods.
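
To make the zero-inflated negative binomial idea concrete, here is a small sketch of its probability mass function. The abstract does not give the paper's parameterization, so this assumes the common mean-dispersion form: `pi` is the probability of a structural zero, `mu` the mean of the negative binomial part, and `r` its dispersion. In a regression setting `mu` and `pi` would typically be tied to the contextual features through log and logit links; that part is omitted here.

```python
import math


def zinb_pmf(y, pi, mu, r):
    """P(Y = y) for a zero-inflated negative binomial count (mu > 0, r > 0).

    pi: probability of a structural zero
    mu: mean of the negative binomial component
    r:  dispersion (size) parameter of the negative binomial component
    """
    p = r / (r + mu)
    # Negative binomial log-probability, computed with lgamma for stability.
    log_nb = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
              + r * math.log(p) + y * math.log(1 - p))
    nb = math.exp(log_nb)
    # Zeros come from either the structural-zero state or the count process.
    return pi + (1 - pi) * nb if y == 0 else (1 - pi) * nb
```

The extra mass at zero is what lets the model fit road segments where most observation periods record no crime at all, which matches the sparsity problem the abstract describes.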

6.
A conceptual contingency model matching the characteristics of knowledge acquisition (KA) methodologies to several decision types is proposed. KA methodologies are divided into three categories: knowledge engineer-driven, expert-driven, and machine-driven. To evaluate current KA methodologies, a framework is proposed by addressing the nature of knowledge and problem domains. Different methodologies in each category are described and evaluated for their ability to support various kinds of problem domains and the types of knowledge they are designed to elicit. A contingency model mapping these methodologies to Mintzberg's managerial decision categories is developed. Implications of the proposed model and future research directions are addressed.

7.
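With the development of the Internet, the network environment has become increasingly complex and the resulting network security problems keep emerging, so network security protection has become an important research topic. To address the imbalance of traffic data collected in real network environments and the inaccurate feature representations extracted by traditional machine learning methods, this paper proposes an intrusion detection method based on Borderline-SMOTE and dual Attention. First, the intrusion data are processed with Borderline-...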

8.
Shi Tuo (石拓), Zhang Qi (张齐), Shi Lei (石磊). 《智能系统学报》 (CAAI Transactions on Intelligent Systems), 2022, 17(6): 1104-1112
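To address imprecise feature fusion and insufficient adaptability to temporal dynamics in the spatio-temporal prediction of theft crimes, a prediction model is proposed that dynamically fuses multi-scale, multi-view features with self-attention. First, based on the location of theft incidents, the data are projected onto a map grid; a method is constructed that matches case data of different time-series lengths to adaptive-length data, and vector-mapped attributes such as weather, time of offence, and geographic location are combined to build a multi-dimensional fused input vector. Second, a self-attention mechanism generates a vector that dynamically fuses multi-view features. Finally, a CNN with multi-scale windows encodes this vector and feeds it to a classifier, which predicts the incident situation in each grid cell. Validated on a theft dataset from a certain city, the method reaches a maximum prediction accuracy of 0.899 across three geographic grid scales, significantly outperforming the other compared models.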
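
The first step, projecting incident locations onto a map grid, can be sketched as follows. This is only an illustration of the gridding idea under simple assumptions (a square grid in coordinate units, anchored at a chosen origin); the function names and parameters are hypothetical and the paper's actual grid construction is not described in the abstract.

```python
from collections import Counter


def to_cell(lon, lat, origin, cell_size):
    """Map a coordinate to the (column, row) of a square grid anchored at origin."""
    col = int((lon - origin[0]) // cell_size)
    row = int((lat - origin[1]) // cell_size)
    return col, row


def count_by_cell(incidents, origin, cell_size):
    """Count incidents per grid cell; incidents is an iterable of (lon, lat)."""
    return Counter(to_cell(lon, lat, origin, cell_size) for lon, lat in incidents)
```

Changing `cell_size` gives the different grid scales the abstract mentions; coarser cells pool more incidents per cell at the cost of spatial resolution.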

9.
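Predicting the prognosis grade of acute ischemic stroke (AIS) before diagnosis and treatment helps reveal the level of prognostic outcome and guide treatment strategy, and improving the predictive performance of methods is an important step toward precision medicine. Fused clinical and radiomics features are used for multi-class stroke prediction, and a fused-feature-based deep ensemble optimization model (IABC-DEL) is proposed. Its feature selection uses the Embedded method and the chi-square test; data imbalance is handled with the Borderline-SMOTE algorithm; the deep ensemble model is built with Stacking, with a deep neural network (DNN), a long short-term memory network (LSTM), and a gated recurrent unit (GRU) as base learners; and optimization uses an improved artificial bee colony algorithm (IABC). The results show that the deep ensemble optimization method outperforms classical methods and previous studies in prognosis prediction, with a Macro-F1 score of 87.88%, a Macro-AUC of 96.27%, and an ACC of 88.02%. The AIS prognosis model based on deep ensemble optimization learning can therefore guide clinical diagnosis, treatment, and rehabilitation, and offers a new modeling approach for predictive research.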

10.
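Point-value prediction methods for virtual machine faults do not fully use the historical periodic features and context information of virtual machines, and their prediction accuracy is low. To address this, a virtual machine fault prediction model based on a dynamic sliding window and multi-channel Bi-LSTM is proposed. The model first uses a dynamic sliding window to capture the contextual features of the fault occurrence process; it then builds a multi-channel Bi-LSTM to learn the correlation features between different metric classes simultaneously and predict faults in the virtual machine's next period; finally, it judges the prediction results using OCSVM and an interval deviation-degree method to determine the specific fault type. Experiments show that the model outperforms the baseline models in accuracy, recall, and F-score, verifying its effectiveness for virtual machine fault prediction.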

11.
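To address the limitations of traditional methods for ultra-short-term photovoltaic power prediction, this paper proposes a prediction model based on the LightGBM-TextCNN-XGBoost algorithm. First, the raw data are preprocessed and decomposed into modes using CEEMDAN. The decomposed data are then normalized and clustered into three weather types with the GWO-FCM clustering algorithm. Next, the data are split into training and test sets, and LightGBM and TextCNN are trained separately. Finally, following the Stacking idea, a LightGBM-TextCNN-XGBoost model is built for prediction and evaluated comprehensively using metrics such as R². The experimental results show that the model predicts very well. Such a model can accurately predict photovoltaic power output, helps photovoltaic plants reduce losses, and thereby supports the safe and stable operation of microgrids.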

12.
This research proposes a pattern/shape-similarity-based clustering approach for time series prediction. This article uses a single hidden Markov model (HMM) for clustering and combines it with soft computing techniques (fuzzy inference system/artificial neural network) for the prediction of time series. Instead of using a distance function as an index of similarity, here the shape/pattern of the sequence is used as the similarity index for clustering, which overcomes a few of the shortcomings associated with distance-based clustering approaches. Underlying hidden properties of time series are captured with the help of the HMM. The prediction method used here exploits the pattern identification prowess of the HMM for cluster selection and the generalization and nonlinear modeling capabilities of soft computing methods to predict the output of the system. To assess the validity of the proposed method in real-life scenarios, it is tested on four different time series. The first is the benchmark Mackey–Glass time series, which is tested for delay parameters τ = 17 and τ = 30. The remaining time series are the monthly sunspot data time series, the Laser data time series, and the Lorenz attractor time series. Simulation results show that the proposed method provides better prediction performance than the existing methods.

13.
A new approach to revealing regularities in nonstationary k-valued multidimensional time series is proposed. It allows one to discover regularities that are subject to gentle structural changes with time. A measure of similarity between regularities is proposed to describe such changes, and its application in the form of weight in the graph of regularities is discussed. The discovered regularities can be used to predict the subsequent elements in multidimensional time series, to analyze the phenomenon described by this series, and to model the phenomenon. This allows one to use the proposed algorithm in a wide variety of problems concerning prediction of time series and for examining and describing the processes that can be represented by multidimensional time series. Means for direct practical application of the proposed methods of the analysis and prediction of time series are described, and the use of these methods for short-term prediction of model series and a real-life multidimensional time series consisting of the stock prices of companies operating in similar fields is discussed.

14.
Li Hongliang (李洪亮), Zhang Nong (张弄), Sun Ting (孙婷), Li Xiang (李想). 《计算机应用》 (Journal of Computer Applications), 2022, 42(6): 1649-1655
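Analysis of job performance interference in distributed machine learning shows that the interference is caused by uneven allocation of GPU resources, such as memory overload and bandwidth contention. A mechanism for quickly predicting performance interference between jobs is therefore designed and implemented; it adaptively predicts the degree of job interference from given GPU parameters and job types. First, GPU parameters and interference rates of running distributed machine learning jobs are obtained experimentally, and the effect of each kind of parameter on performance interference is analyzed. Second, several prediction techniques are used to build GPU parameter-interference rate models, and their interference-rate errors are analyzed. Finally, an adaptive job interference rate prediction algorithm is built that, for a given device environment and job set, automatically selects the prediction model with the smallest error and predicts the job interference rate quickly and accurately. Experiments were designed with five commonly used neural network jobs on two kinds of GPU devices. The results show that the proposed adaptive interference prediction (AIP) mechanism completes prediction model selection and performance interference prediction quickly without any prior assumptions, taking less than 300 s with an interference-rate prediction error of 2%~13%, and can be applied to scenarios such as job scheduling and load balancing.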
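
The final step, automatically choosing the prediction model with the smallest error, can be illustrated with a short sketch. It assumes each candidate model is a callable that maps a tuple of GPU parameters to a predicted interference rate and that mean absolute error is the selection criterion; both are assumptions for illustration, since the abstract does not name the candidate models or the error measure.

```python
def mean_abs_error(model, samples):
    """Average |prediction - observed| over (features, observed) samples."""
    return sum(abs(model(x) - target) for x, target in samples) / len(samples)


def select_model(candidates, samples):
    """Return (name, error) of the candidate with the lowest validation error.

    candidates: dict mapping a model name to a callable (hypothetical models)
    samples:    list of (features, observed_interference_rate) pairs
    """
    errors = {name: mean_abs_error(model, samples)
              for name, model in candidates.items()}
    best = min(errors, key=errors.get)
    return best, errors[best]
```

Re-running the selection for each device environment and job set is what makes the choice adaptive rather than fixed in advance.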

15.
The ability to predict human mobility, i.e., transitions between a user's significant locations (the home, workplace, etc.) can be helpful in a wide range of applications, including targeted advertising, personalized mobile services, and transportation planning. Most studies on human mobility prediction have focused on the algorithmic perspective rather than on investigating human predictability. Human predictability has great significance, because it enables the creation of more robust mobility prediction models and the assignment of more accurate confidence scores to location predictions. In this study, we propose a novel method for detecting a user's stay points from millions of GPS samples. Then, after detecting these stay points, a long short-term memory (LSTM) neural network is used to predict future stay points. We explore the use of two types of stay point prediction models (a general model that is trained in advance and a personal model that is trained over time) and analyze the number of previous locations needed for accurate prediction. Our evaluation on two real-world datasets shows that by using our preprocessing approach, we can detect stay points from routine trajectories with higher accuracy than the methods commonly used in this domain, and that by utilizing various LSTM architectures instead of the traditional Markov models and advanced deep learning models, our method can predict human movement with high accuracy of more than 40% when using the Acc@1 measure and more than 59% when using the Acc@3 measure. We also demonstrate that the movement prediction accuracy varies for different user populations based on their trajectory characteristics and demographic attributes.
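
The Acc@1 and Acc@3 figures refer to top-k accuracy: a prediction counts as correct when the true next location appears among the model's k highest-ranked candidates. A minimal sketch, with hypothetical function and argument names:

```python
def acc_at_k(ranked_predictions, truths, k):
    """Fraction of cases whose true label is among the top-k ranked predictions.

    ranked_predictions: one list per case, ordered from most to least likely
    truths:             the true label for each case
    """
    if not truths:
        raise ValueError("no samples to evaluate")
    hits = sum(1 for ranked, truth in zip(ranked_predictions, truths)
               if truth in ranked[:k])
    return hits / len(truths)
```

Acc@3 can never be lower than Acc@1 on the same predictions, which is consistent with the two thresholds quoted in the abstract.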

16.
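Shoeprints are traces that offenders often leave at crime scenes, and they carry attribute information such as a person's sex and height. Shoeprint-based sex prediction plays an important role in quickly screening suspects. It is mainly done by criminal investigators judging from experience, which requires substantial domain knowledge, while the few automatic prediction methods rely on manually extracted features and empirical models and lose accuracy to measurement error. To address this, an end-to-end prediction method based on shoeprint images is proposed. A convolutional neural network extracts shoeprint image features, and a channel attention module reassigns the feature weights so that the model focuses on the parts of the shoeprint image that matter most for sex. The feature maps are then fed to a sex prediction module. In addition, the datasets SiSIS and SeSIS are built for single-shoeprint and multiple-shoeprint scenarios, respectively, and data augmentations for shoeprint orientation differences, incomplete shoeprints, and elastic deformation are designed according to how shoeprints may appear at crime scenes. Experimental results show that the method achieves prediction accuracies of 91.80% and 99.35% on the SiSIS and SeSIS datasets, respectively, outperforming existing shoeprint-based sex prediction methods.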

17.
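To alleviate the class imbalance problem in software defect prediction and to prevent overfitting from degrading the accuracy of defect prediction models, this paper proposes an oversampling method based on heterogeneous-class distance ranking (HDR) for software defect prediction. First, minority-class instances are divided into three types and noise instances are removed, reducing overfitting caused by noisy data; then instances are ranked by heterogeneous-class distance, and highly similar instances are paired to generate new instances, thereby increasing the diversity of the new instances,...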

18.
In this study, short-term prediction of aluminum foil thickness time-series data recorded during the cold-rolling process was investigated. Locally projective nonlinear noise reduction was applied in order to improve the predictability of the time series. Higher-order statistics methods (bispectrum and bicoherence) were used to detect the nonlinearity. The embedding vectors with appropriate embedding dimension and time delay were obtained via the false nearest neighbors and mutual information methods, respectively. The maximum prediction horizon was determined depending on the maximal Lyapunov exponent. For various prediction horizons, the embedding vector and corresponding thickness value pairs were used as the dataset to assess the prediction performance of various machine learning algorithms (i.e., multilayer perceptron neural network, support vector machines with Pearson VII function-based kernel, and radial basis function network). The n-step ahead prediction outputs of the machine learning algorithms were globally combined with simple voting in favor of the one having minimum absolute error. The accuracy of our proposed method was compared with a nonlinear autoregressive exogenous model for various thickness time-series data using the mean absolute percentage error measure.
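
The pairing of embedding vectors with future thickness values can be sketched as a time-delay embedding. The sketch below assumes the embedding dimension `m` and delay `tau` are already known; choosing them with false nearest neighbors and mutual information, as the abstract describes, is not implemented. It also includes the mean absolute percentage error used for the comparison. Function names are illustrative.

```python
def delay_embed(series, m, tau, horizon=1):
    """Build (embedding vector, target) pairs for horizon-step-ahead prediction.

    Each vector is (x[t-(m-1)*tau], ..., x[t-tau], x[t]) and its target
    is x[t+horizon].
    """
    pairs = []
    start = (m - 1) * tau  # first index with a full history available
    for t in range(start, len(series) - horizon):
        vector = tuple(series[t - (m - 1 - i) * tau] for i in range(m))
        pairs.append((vector, series[t + horizon]))
    return pairs


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

The resulting pairs are ordinary supervised-learning samples, so any regressor, including the three families named in the abstract, could be trained on them.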

19.
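To address the one-sidedness of coal mine data prediction from a single sensor, a multi-sensor coal mine data prediction model is proposed that combines information fusion with phase space reconstruction. Multiple kinds of underground sensors, including gas concentration, wind speed, and temperature sensors, are fused for prediction. Taking the time-series data of the various sensors as the object of study, information fusion is first applied to each kind of sensor data at the data level and then at the feature level; the correlation integral method then determines the two phase space reconstruction parameters, time delay τ and embedding dimension m, for the sensor data after the two fusion stages; finally, with multivariate phase space reconstruction, the sensor data are fused into a reconstructed phase space, and a weighted first-order local method based on K-Means clustering is used to build the multi-sensor prediction model. The data come from the Yangquan coal mine in Shanxi Province, where nearly 20 GB of data were collected; experiments use gas concentration, wind speed, and temperature sensor data. The results show that, for feature-level fusion, data fused over each 15-minute period effectively characterize that period. With the prediction model, the 15-minute period gives the smallest error compared with periods of 5, 10, and 20 minutes, ESS = 0.003, far below the current minimum error of 0.05. The fused prediction therefore performs well and can predict sensor data 15 minutes ahead fairly accurately, leaving enough time to support decisions in underground safety assessment.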

20.
Accurate prediction of construction cost in the initial phase of a construction project is critical to the success of the project. Accordingly, many researchers have proposed various methodologies for predicting the cost in the initial phase with the use of limited information. This study was aimed at improving the prediction performance of a cost prediction model based on the Case-Based Reasoning (CBR) technique, which has recently become widely used. Toward this end, an improved CBR model that uses the Multiple Regression Analysis (MRA) technique in the revision phase of the CBR technique was developed. To verify the prediction performance of the proposed model, a case study was performed on 41 business facilities and 99 multi-family housing projects. The results showed that the prediction performance of the revised CBR model for business facilities and multi-family housing improved by 17.23% and 4.39%, respectively, compared to that of the existing CBR model. The proposed MRA-based revised CBR model is expected to be useful in estimating the construction cost in the initial phase of a project.
