Similar Documents
A total of 20 similar documents were found.
1.
Abstract

This paper begins with a general theory of error in cross-validation testing of algorithms for supervised learning from examples. It is assumed that the examples are described by attribute-value pairs, where the values are symbolic. Cross-validation requires a set of training examples and a set of testing examples. The value of the attribute that is to be predicted is known to the learner in the training set, but unknown in the testing set. The theory demonstrates that cross-validation error has two components: error on the training set (inaccuracy) and sensitivity to noise (instability). This general theory is then applied to voting in instance-based learning. Given an example in the testing set, a typical instance-based learning algorithm predicts the designated attribute by voting among the k nearest neighbours (the k most similar examples) to the testing example in the training set. Voting is intended to increase the stability (resistance to noise) of instance-based learning, but a theoretical analysis shows that there are circumstances in which voting can be destabilising. The theory suggests ways to minimize cross-validation error by ensuring that voting is stable and does not adversely affect accuracy.
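As an illustration of the voting scheme analysed above, here is a minimal sketch of k-nearest-neighbour voting over symbolic attribute-value pairs, assuming a simple overlap (matching-values) similarity; the toy examples, attribute names and the choice k = 3 are hypothetical.

```python
from collections import Counter

def overlap_similarity(a, b):
    """Count matching symbolic attribute values between two examples."""
    return sum(1 for x, y in zip(a, b) if x == y)

def knn_vote(training_set, query, k=3):
    """Predict the designated attribute of `query` by voting among the
    k most similar training examples (illustrative sketch)."""
    neighbours = sorted(training_set,
                        key=lambda ex: overlap_similarity(ex["attrs"], query),
                        reverse=True)[:k]
    votes = Counter(ex["label"] for ex in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical symbolic examples: (colour, shape, size) -> class
train = [
    {"attrs": ("red", "round", "small"), "label": "A"},
    {"attrs": ("red", "square", "small"), "label": "A"},
    {"attrs": ("blue", "round", "large"), "label": "B"},
    {"attrs": ("blue", "square", "large"), "label": "B"},
]
print(knn_vote(train, ("red", "round", "large"), k=3))
```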

2.
3.
Research on the generalization performance of ensemble learning has achieved considerable success, but the error analysis of ensemble learning still requires further study. Since cross-validation plays an important role in model performance evaluation in statistical machine learning, blocked 3×2 cross-validation and k-fold cross-validation are applied to build, for each sample point, an ensemble of weighted predicted values, and an error analysis is carried out. Experiments on both simulated and real data show that the prediction error of the ensemble based on blocked 3×2 cross-validation is smaller than that of a single learner, and that the variance of the ensemble is smaller than the variance of a single learner. Compared with the ensemble method based on k-fold cross-validation, the generalization error based on blocked 3×2 cross-validation is smaller, indicating that the ensemble learning model based on blocked 3×2 cross-validation is more stable.
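A rough sketch of the ensembling idea described above: each sample receives out-of-fold predictions from three independent 2-fold splits (a simple stand-in for blocked 3×2 cross-validation), which are then averaged. The synthetic regression data, the decision-tree base learner and the equal weights are assumptions, not the paper's exact construction.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=200)   # synthetic regression data

pred_sum = np.zeros(len(y))
pred_cnt = np.zeros(len(y))
for r in range(3):                                   # three independent 2-fold splits
    kf = KFold(n_splits=2, shuffle=True, random_state=r)
    for train_idx, test_idx in kf.split(X):
        model = DecisionTreeRegressor(max_depth=4, random_state=r)
        model.fit(X[train_idx], y[train_idx])
        pred_sum[test_idx] += model.predict(X[test_idx])
        pred_cnt[test_idx] += 1

ensemble_pred = pred_sum / pred_cnt                  # equal-weight per-sample ensemble
print("out-of-fold MSE of the ensemble:", np.mean((ensemble_pred - y) ** 2))
```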

4.
New approaches to processing of dense and point images are presented. They are based on the theory of hypercomplex numbers and make use of simplified but reasonably adequate image models that incur no significant loss of information. The advantage of these approaches consists in increased efficiency of decisions made by machine vision systems and in considerable reduction of the time needed to arrive at these decisions. The basics of the theory of complex-valued (contour) and quaternion-valued signals are considered. We show how this theory is related to the theory of real-valued signals and identify the problems where hypercomplex signals have advantages over real-valued ones. Yakov A. Furman. Born 1939. Graduated from the Taganrog Radioengineering Institute in 1961. Received doctoral degree (Dr. Sci. (Eng.)) in 1989. Professor, Head of the Chair of Radioengineering Systems, Maryi State Technical University. Scientific interests: digital processing and synthesis of signals and images, pattern recognition. Author of more than 66 papers and two monographs. Corresponding member of the Russian Engineering Academy, member of the Editorial Board of Vestnik Verkhne-Volzhskogo otdeleniya Akademii tekhnologicheskikh nauk RF. In 2002 awarded the order of Druzhba Narodov and, in 1995, a medal of the order Za zaslugi pered Otechestvom II stepeni. Aleksandr V. Krevetskii. Born 1966. Graduated from Maryi State Technical University in 1990. Received candidate's degree (Cand. Sci. (Eng.)) in 1995. Docent, Head of the Chair of Informatics, Maryi State Technical University. Scientific interests: digital processing and synthesis of signals, image analysis, pattern recognition. Author of more than 25 papers and two monographs.

5.
ABSTRACT

Dempster–Shafer (D–S) evidence theory is a very efficient and widely used mathematical tool for fusing uncertain and imprecise information in decision making. The D–S combination rule is criticised by many researchers because it gives illogical and counterintuitive results, especially when the pieces of evidence provided by various experts are in a high degree of conflict. Various attempts have been made and several alternatives to this rule have been proposed. In this paper, a new alternative is proposed which considers the possibility of an error made by experts while providing evidence, calculates that error and incorporates it into the revised masses. The validity and efficiency of the proposed approach are demonstrated with numerous examples, and the results are compared with existing methods.

Highlights
  • An alternative method is proposed to handle conflicting evidence.

  • An error in judgement made while gathering evidence is considered and incorporated before combining evidence.

  • The method is simple and gives better, more reasonable results than previous methods when evidence conflicts.
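For context, a minimal sketch of Dempster's original combination rule, whose counterintuitive behaviour under highly conflicting evidence motivates the alternative above; the frame of discernment and the two expert mass functions are made up for illustration (Zadeh-style conflict).

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions given as {frozenset: mass} dicts
    using Dempster's rule (normalising out the conflict K)."""
    combined = {}
    conflict = 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    if conflict >= 1.0:
        raise ValueError("Total conflict: Dempster's rule is undefined")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

A, B, C = frozenset("A"), frozenset("B"), frozenset("C")
m1 = {A: 0.9, B: 0.1}             # expert 1
m2 = {B: 0.1, C: 0.9}             # expert 2, highly conflicting with expert 1
print(dempster_combine(m1, m2))   # B receives all the mass: the counterintuitive case
```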

6.
This study presents an approach to predicting the performance of sales agents at a call center dedicated exclusively to sales and telemarketing activities. The approach is based on a naive Bayesian classifier. The objective is to know which levels of the attributes are indicative of individuals who perform well. A sample of 1037 sales agents was taken between March and September of 2009, on campaigns related to insurance sales and pre-paid phone services, to build the naive Bayes network. It is shown that socio-demographic attributes are not suitable for predicting performance. Alternatively, operational records were used to predict the production of sales agents, achieving satisfactory results. In this case, classifier training and testing were done through stratified tenfold cross-validation. The classifier labelled instances correctly 80.60% of the time, with false-positive rates of 18.1% for the class 'no' (does not achieve the minimum) and 20.8% for the class 'yes' (achieves the minimum acceptable level or above). These results suggest that socio-demographic attributes have no predictive power on performance, while operational information about the activities of the sales agent can predict the agent's future performance.
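A minimal sketch of the evaluation protocol described above, i.e. a naive Bayes classifier scored with stratified tenfold cross-validation; the synthetic feature matrix and the Gaussian variant of naive Bayes are stand-ins for the call-centre operational records.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(42)
n = 1037                                    # sample size matching the study
X = rng.normal(size=(n, 6))                 # hypothetical operational attributes
y = (X[:, 0] + X[:, 1] + rng.normal(scale=1.0, size=n) > 0).astype(int)  # 1 = meets minimum

clf = GaussianNB()
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
print(f"mean accuracy over 10 folds: {scores.mean():.3f}")
```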

7.
A real-valued information system is a generalized form of a continuous-valued information system, whose attribute values are the actual data arising from practical problems. By defining a tolerance relation on real-valued information systems, this paper mainly discusses attribute reduction of real-valued information systems and real-valued decision tables under this relation, based on rough set theory. The definition of the discernibility function and a judgement theorem for reducts are given, a concrete method for computing reducts is obtained, and the results are applied to the analysis and processing of radio signal data.
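An illustrative sketch of the kind of tolerance relation referred to above: two objects are tolerant on an attribute subset when each attribute value differs by at most a threshold, and a subset is kept as a reduct candidate when it induces the same tolerance relation as the full attribute set. The threshold, the toy data and the brute-force search are assumptions; the paper's discernibility-function method is not reproduced.

```python
from itertools import combinations

def tolerance_relation(objects, attrs, eps=0.1):
    """Pairs (i, j) of objects that are tolerant on all attributes in `attrs`."""
    rel = set()
    for i, x in enumerate(objects):
        for j, y in enumerate(objects):
            if all(abs(x[a] - y[a]) <= eps for a in attrs):
                rel.add((i, j))
    return rel

def brute_force_reducts(objects, eps=0.1):
    """Minimal attribute subsets that preserve the full tolerance relation."""
    n_attrs = len(objects[0])
    full = tolerance_relation(objects, range(n_attrs), eps)
    reducts = []
    for size in range(1, n_attrs + 1):
        for subset in combinations(range(n_attrs), size):
            if tolerance_relation(objects, subset, eps) == full:
                if not any(set(r) <= set(subset) for r in reducts):
                    reducts.append(subset)
    return reducts

# Hypothetical real-valued information system: rows = objects, columns = attributes
data = [
    (0.10, 0.50, 0.90),
    (0.15, 0.52, 0.20),
    (0.80, 0.51, 0.88),
    (0.82, 0.95, 0.21),
]
print(brute_force_reducts(data, eps=0.1))   # e.g. [(0, 2)]: attributes 0 and 2 suffice
```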

8.
ABSTRACT

Several approaches exist to the development of a mathematical theory of culture. While those approaches often use similar vocabularies in different ways, the approaches are complementary, not conflicting. This note outlines relationships among the different approaches and some implications of those relationships. Thus viewed, the field is much closer to having a comprehensive theory than has so far been recognized.

9.
Defining a good distance (dissimilarity) measure between patterns is of crucial importance in many classification and clustering algorithms. While a lot of work has been performed on continuous attributes, nominal attributes are more difficult to handle. A popular approach is to use the value difference metric (VDM) to define a real-valued distance measure on nominal values. However, VDM treats the attributes separately and ignores any possible interactions among attributes. In this paper, we propose the use of adaptive dissimilarity matrices for measuring the dissimilarities between nominal values. These matrices are learned via optimizing an error function on the training samples. Experimental results show that this approach leads to better classification performance. Moreover, it also allows easier interpretation of (dis)similarity between different nominal values.
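For reference, a minimal sketch of the classical value difference metric (VDM) that the paper extends: the distance between two nominal values is the discrepancy between their class-conditional distributions. The toy attribute, labels and exponent q = 1 are illustrative assumptions.

```python
from collections import Counter, defaultdict

def vdm_tables(X_col, y):
    """Class-conditional probabilities P(class | value) for one nominal attribute."""
    counts = defaultdict(Counter)
    for v, c in zip(X_col, y):
        counts[v][c] += 1
    classes = sorted(set(y))
    return {v: {c: counts[v][c] / sum(counts[v].values()) for c in classes}
            for v in counts}

def vdm(value_a, value_b, tables, q=1):
    """Value difference metric between two nominal values of the same attribute."""
    pa, pb = tables[value_a], tables[value_b]
    return sum(abs(pa[c] - pb[c]) ** q for c in pa)

colour = ["red", "red", "blue", "blue", "green", "green"]
label  = ["+",   "+",   "-",    "+",    "-",     "-"]
t = vdm_tables(colour, label)
print(vdm("red", "blue", t), vdm("red", "green", t))   # red is closer to blue than to green
```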

10.
ABSTRACT

A method for predicting the dynamic spatio-temporal variations of the normalized difference vegetation index (NDVI) based on precipitation is proposed, combining nonlinear autoregressive with exogenous input (NARX) networks and artificial neural networks (ANNs). The proposed method is validated by applying it to predict the spatio-temporal NDVI for the Hulunbuir grassland located in Inner Mongolia, China. The results show good predictive ability for the spatio-temporal variations of NDVI, with a mean absolute percentage error of 11.59%, a mean absolute error of 7.11 × 10⁻² and a root mean square error of 8.06 × 10⁻². The approach presented in the paper can further be used as guidance to reduce the occurrence of overgrazing in arid and semi-arid grasslands.
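A small sketch of the three accuracy measures reported above (MAPE, MAE and RMSE), computed on hypothetical observed and predicted NDVI series; the arrays are placeholders, not the study's data.

```python
import numpy as np

def mape(obs, pred):
    """Mean absolute percentage error (%)."""
    return np.mean(np.abs((obs - pred) / obs)) * 100.0

def mae(obs, pred):
    """Mean absolute error."""
    return np.mean(np.abs(obs - pred))

def rmse(obs, pred):
    """Root mean square error."""
    return np.sqrt(np.mean((obs - pred) ** 2))

observed  = np.array([0.62, 0.58, 0.71, 0.66, 0.49])   # placeholder NDVI values
predicted = np.array([0.60, 0.61, 0.68, 0.70, 0.47])
print(f"MAPE={mape(observed, predicted):.2f}%  MAE={mae(observed, predicted):.4f}  "
      f"RMSE={rmse(observed, predicted):.4f}")
```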

11.
Abstract

Field spectroscopy involves the study of the interrelationships between the spectral characteristics of objects and their biophysical attributes in the field environment. It is a technique of fundamental importance in remote sensing, yet its full potential is rarely exploited. In this article the principles of the subject are explained and its historical development reviewed with reference to the instruments and methods adopted. Field spectroscopy has a role to play in at least three areas of remote sensing. Firstly, it acts as a bridge between laboratory measurements of spectral reflectance and the field situation and is thus useful in the calibration of airborne and satellite sensors. Secondly, it is useful in predicting the optimum spectral bands, viewing configuration and time to perform a particular remote sensing task. Thirdly, it provides a tool for the development, refinement and testing of models relating biophysical attributes to remotely sensed data.

12.
Liu Yan, Yang Dakun, Li Long, Yang Jie. Neural Processing Letters, 2019, 50(2): 1589–1609

In order to broaden the study of the most popular and general Takagi–Sugeno (TS) system, we propose and develop a complex-valued neuro-fuzzy inference system that realises the zero-order TS system in a complex-valued network architecture. In the complex domain, boundedness and analyticity cannot be achieved together. A splitting strategy is therefore adopted: the gradients of the real-valued error function are computed with respect to the real and the imaginary parts of the weight parameters independently. Specifically, the system has four layers: in the Gaussian layer, the L-dimensional complex-valued input features are mapped to a Q-dimensional real-valued space, and in the output layer, complex-valued weights are employed to project it back to the complex domain. Hence, split-complex-valued gradients of the real-valued error function are obtained, forming the split-complex-valued neuro-fuzzy (split-CVNF) learning algorithm based on gradient descent. Another contribution of this paper is that the deterministic convergence of the split-CVNF algorithm is analysed. It is proved that the error function is monotone during the training iteration process and that the sum of gradient norms tends to zero. By adding a moderate condition, the weight sequence itself is also proved to be convergent.
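A minimal numerical sketch of the split-gradient idea described above: a real-valued error of a complex-valued weight is differentiated separately with respect to its real and imaginary parts, and the two real gradients drive gradient descent. The single-weight toy model is an assumption, not the paper's four-layer neuro-fuzzy system.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=20) + 1j * rng.normal(size=20)     # complex-valued inputs
w_true = 0.7 - 0.3j
d = w_true * x                                         # complex-valued targets

w = 0.0 + 0.0j
lr = 0.05
for _ in range(200):
    e = d - w * x                                      # complex error
    # real-valued cost E = mean(|e|^2); split gradients w.r.t. Re(w) and Im(w):
    grad_re = -2 * np.mean(np.real(np.conj(e) * x))
    grad_im = -2 * np.mean(np.real(np.conj(e) * (1j * x)))
    w = complex(w.real - lr * grad_re, w.imag - lr * grad_im)

print(w)   # converges towards 0.7 - 0.3j
```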


13.
Ergonomics, 2012, 55(5): 714–730
Abstract

This study proposed a procedure for predicting the point in time with a high risk of a virtual crash, using a control-chart methodology applied to behavioural measures during a simulated driving task. Tracking error, back pressure, sitting pressure, and horizontal and vertical neck bending angles were measured during the simulated driving task. A point in time with a high risk of a virtual crash was detected in 9 out of 10 participants. The time interval between the successfully detected high-risk point in time and the point in time of the virtual crash ranged from 80 to 324 s. The proposed procedure for predicting the point in time with a high risk of a crash is promising for warning drivers that they are in a high-risk state.

Practitioner Summary: Many fatal crashes occur due to drowsy driving. We proposed a method to predict the point in time with a high risk of a virtual crash before such a crash occurs, using behavioural measures collected during a simulated driving task. The effectiveness of the method is also demonstrated.
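A minimal sketch of a control-chart style detector in the spirit of the procedure above: a baseline window of a behavioural measure (here, simulated tracking error) sets the centre line and 3-sigma limit, and the first later sample beyond the limit is flagged as the high-risk point in time. The signal, window length and 3-sigma rule are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
fs = 1.0                                       # one sample per second (assumed)
baseline = rng.normal(1.0, 0.1, 120)           # alert driving: stable tracking error
drowsy = 1.0 + np.linspace(0, 0.8, 300) + rng.normal(0, 0.1, 300)  # drifting error
tracking_error = np.concatenate([baseline, drowsy])

centre = baseline.mean()
sigma = baseline.std(ddof=1)
ucl = centre + 3 * sigma                       # upper control limit

out_of_control = np.where(tracking_error[len(baseline):] > ucl)[0]
if out_of_control.size:
    t_risk = (len(baseline) + out_of_control[0]) / fs
    print(f"high-risk point detected at t = {t_risk:.0f} s")
```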

14.
ABSTRACT

Today’s ever-increasing amount of data places new demands on cognitive ergonomics and requires new design ideas to ensure successful human–data interaction. Our aim was to identify the cognitive factors that must be considered when designing systems to improve decision-making based on large amounts of data. We constructed a task that simulates the typical cognitive demands people encounter in data analysis situations. We demonstrate some essential cognitive limitations using a behavioural experiment with 20 participants. The studied task presented the participants with critical and noncritical attributes that contained information on two groups of people. They had to select the response option (group) with the higher level of critical attributes. The results showed that accuracy of judgement decreased as the amount of information increased, and that judgement was affected by irrelevant information. Our results thus demonstrate critical cognitive limitations when people utilise data and suggest a cognitive bias in data-based decision-making. Therefore, when designing for cognition, we should consider the human cognitive limitations that are manifested in a data analysis context. Furthermore, we need general cognitive ergonomic guidelines for design that support the utilisation of data and improve data-based decision-making.

15.
This paper concerns learning binary-valued functions defined on R, and investigates how a particular type of ‘regularity’ of hypotheses can be used to obtain better generalization error bounds. We derive error bounds that depend on the sample width (a notion analogous to that of sample margin for real-valued functions). This motivates learning algorithms that seek to maximize sample width.

16.
Context: Recent years have seen an increasing interest in general theories of software engineering. As in other academic fields, these theories aim to explain and predict the key phenomena of the discipline. Objective: The present article proposes a general theory of software engineering that we have labeled the Tarpit theory, in reference to the 1982 epigram by Alan Perlis. Method: An integrative theory development approach was employed to develop the Tarpit theory from four underlying theoretical fields: (i) languages and automata, (ii) cognitive architecture, (iii) problem solving, and (iv) organization structure. Its applicability was explored in three test cases. Results: The theory demonstrates an explanatory and predictive potential for a diverse set of software engineering phenomena. It demonstrates a capability of explaining Brooks’s law, of making predictions about domain-specific languages, and of evaluating the pros and cons of the practice of continuous integration. Conclusion: The presented theory appears capable of explaining and predicting a wide range of software engineering phenomena. Further refinement and application of the theory remains as future work.

17.
Objective: Texture feature extraction has long been a hot and difficult topic in remote-sensing image analysis. Existing texture extraction methods focus mainly on single-band grey-scale remote-sensing images; how to extract texture features from multi-band colour remote-sensing images is a research frontier in multispectral remote sensing. Method: A fractal-dimension estimation method for colour remote-sensing images based on manifold learning is proposed. The method uses locally linear embedding to reduce the dimensionality of the 5-D Euclidean hypersurface formed by the colour attributes, and the dimension-reduced colour attributes are then used to estimate the fractal dimension. Results: Experiments on Landsat-7 and GeoEye-1 satellite data show that, compared with other fractal-dimension estimation methods such as the Peleg and Sarkar methods, the proposed method yields a smaller fitting error; the mean fitting errors E of the four comparison methods are respectively 26.2, 5, 26.3 and 5 times that of the proposed method. In addition, the proposed method provides fractal dimensions with better classification properties and greater robustness than the four comparison methods. Conclusion: For low- and medium-resolution true-colour and false-colour remote-sensing images, as well as high-resolution colour-composite remote-sensing images, the proposed method exploits the colour attribute information of different land-cover types to extract the corresponding texture information, effectively improving the ability of the fractal dimension to discriminate between land-cover types. This is of practical value for subsequent studies of the distribution of different land-cover types within a region and for regional planning and development based on those distributions.
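A minimal sketch of the locally linear embedding step in the method above: 5-D colour-attribute vectors (random placeholders here) are reduced to a lower-dimensional representation with scikit-learn's LLE; the subsequent fractal-dimension estimation is not reproduced.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

rng = np.random.default_rng(0)
# Placeholder for per-pixel 5-D colour attribute vectors of a colour remote-sensing image
colour_attrs = rng.random((500, 5))

lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2)
reduced = lle.fit_transform(colour_attrs)       # dimension-reduced colour attributes
print(reduced.shape)                            # (500, 2)
```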

18.
In this work, we investigate the real-world task of recognizing biological concepts in DNA sequences. Recognizing promoters in strings of nucleotides (each one of A, G, T or C) is performed using a novel approach based on feature selection (FS) and the Artificial Immune Recognition System with a fuzzy resource allocation mechanism (Fuzzy-AIRS), first proposed by us. The aim of this study is to improve the prediction accuracy for Escherichia coli promoter gene sequences using a system that combines FS and Fuzzy-AIRS. The E. coli promoter gene sequence dataset has 57 attributes and 106 samples, comprising 53 promoters and 53 non-promoters. The proposed system consists of two parts. First, we reduce the dimension of the dataset from 57 attributes to 4 attributes by means of the FS process. Second, the Fuzzy-AIRS classifier is run to predict the E. coli promoter gene sequences. The robustness of the proposed method is examined using prediction accuracy, sensitivity and specificity analysis, k-fold cross-validation and the confusion matrix. Whereas the Fuzzy-AIRS classifier alone obtained 50% prediction accuracy using 10-fold cross-validation, the proposed system obtained 90% prediction accuracy under the same conditions. These results indicate that the proposed system is successful in recognizing promoters in nucleotide strings.
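A minimal sketch of the two-stage pipeline described above (feature selection down to four attributes, then a classifier evaluated with 10-fold cross-validation, sensitivity and specificity); a k-nearest-neighbour classifier stands in for Fuzzy-AIRS, and the nominal sequence data are synthetic integer codes.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.metrics import confusion_matrix, accuracy_score

rng = np.random.default_rng(3)
X = rng.integers(0, 4, size=(106, 57))          # 106 sequences, 57 nominal positions (as codes)
y = np.array([1] * 53 + [0] * 53)               # 53 promoters, 53 non-promoters
X[y == 1, :4] += 1                              # inject a weak signal into 4 positions

pipe = make_pipeline(SelectKBest(chi2, k=4),                # reduce 57 attributes to 4
                     KNeighborsClassifier(n_neighbors=5))   # stand-in for Fuzzy-AIRS
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
y_pred = cross_val_predict(pipe, X, y, cv=cv)

tn, fp, fn, tp = confusion_matrix(y, y_pred).ravel()
print(f"accuracy={accuracy_score(y, y_pred):.2f}  "
      f"sensitivity={tp / (tp + fn):.2f}  specificity={tn / (tn + fp):.2f}")
```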

19.
Ergonomics, 2012, 55(12): 1572–1584
Abstract

A Physical Employment Standard (PES) was developed for the British Royal Air Force Regiment (RAF Regt). Twenty-nine RAF Regt personnel completed eight critical tasks wearing Combat Equipment Fighting Order (31.5 kg) while being monitored for physical and perceptual effort. A PES was developed using task simulations, measured on 61 incumbents. The resultant PES consists of: 1) a battlefield test involving task simulations: a single lift and a point-of-entry task (pass/fail); timed elements (react to effective enemy fire, and crawl) set at the 95th performance percentile; and a casualty evacuation (CASEVAC) casualty drag and a CASEVAC simulated stretcher carry completed without stopping; 2) a Multi Stage Fitness Test level of 9.10 to assess the aerobic fitness needed to complete a tactical advance to battle. The task-based PES should ensure RAF Regt personnel have a baseline level of fitness to perform and withstand the physical demands of critical tasks to at least a minimum acceptable standard.

Practitioner summary: A Physical Employment Standard (PES) was developed for the British RAF Regiment by measuring the physiological demands of critical tasks on a representative cohort of incumbent personnel. A task-based PES should ensure that only those candidates, irrespective of gender, race or disability, with the necessary physical attributes to succeed in training and beyond, are selected.

20.
ABSTRACT

The main contribution of this paper is a new definition of the expected value of belief functions in the Dempster–Shafer (D–S) theory of evidence. Our definition shares many of the properties of the expectation operator in probability theory. Also, for Bayesian belief functions, our definition provides the same expected value as the probabilistic expectation operator. A traditional method of computing the expected value of a real-valued function is first to transform a D–S belief function to a corresponding probability mass function, and then use the expectation operator for probability mass functions. Transforming a belief function to a probability function involves a loss of information. Our expectation operator works directly with D–S belief functions. Another definition uses Choquet integration, which assumes belief functions are credal sets, i.e. convex sets of probability mass functions. Credal set semantics are incompatible with Dempster's combination rule, the centerpiece of the D–S theory. In general, our definition provides different expected values than, for example, probabilistic expectation applied to the pignistic transform or the plausibility transform of a belief function. Using our definition of expectation, we provide new definitions of variance, covariance, correlation and other higher moments and describe their properties.
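For contrast, a minimal sketch of the 'traditional' route mentioned above: transform a belief function into a probability mass function with the pignistic transform and then take the ordinary probabilistic expectation. The paper's own operator, which works directly on the belief function, is not reproduced; the frame and masses below are illustrative.

```python
def pignistic_transform(m):
    """Spread the mass of each focal set uniformly over its elements."""
    betp = {}
    for focal, mass in m.items():
        for x in focal:
            betp[x] = betp.get(x, 0.0) + mass / len(focal)
    return betp

def expected_value(m, value):
    """Probabilistic expectation of `value` under the pignistic transform of m."""
    betp = pignistic_transform(m)
    return sum(p * value[x] for x, p in betp.items())

# Belief function on the frame {1, 2, 3} with one non-singleton focal set
m = {frozenset({1}): 0.5, frozenset({2, 3}): 0.3, frozenset({1, 2, 3}): 0.2}
value = {1: 1.0, 2: 2.0, 3: 3.0}      # real-valued function on the frame
print(expected_value(m, value))       # ~1.65
```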
