1.

A comprehensive empirical evaluation of missing value imputation in noisy software measurement data (Total citations: 1; self-citations: 0; citations by others: 1)

Jason Van Hulse 《Journal of Systems and Software》2008,81(5):691-708

The handling of missing values is a topic of growing interest in the software quality modeling domain. Data values may be absent from a dataset for numerous reasons, for example, the inability to measure certain attributes. As software engineering datasets are sometimes small in size, discarding observations (or program modules) with incomplete data is usually not desirable. Deleting data from a dataset can result in a significant loss of potentially valuable information. This is especially true when the missing data is located in an attribute that measures the quality of the program module, such as the number of faults observed in the program module during testing and after release. We present a comprehensive experimental analysis of five commonly used imputation techniques. This work also considers three different mechanisms governing the distribution of missing values in a dataset, and examines the impact of noise on the imputation process. To our knowledge, this is the first study to thoroughly evaluate the relationship between data quality and imputation. Further, our work is unique in that it employs a software engineering expert to oversee the evaluation of all of the procedures and to ensure that the results are not inadvertently influenced by poor quality data. Based on a comprehensive set of carefully controlled experiments, we conclude that Bayesian multiple imputation and regression imputation are the most effective techniques, while mean imputation performs extremely poorly. Although a preliminary evaluation has been conducted using Bayesian multiple imputation in the empirical software engineering domain, this is the first work to provide a thorough and detailed analysis of this technique. Our studies also demonstrate conclusively that the presence of noisy data has a dramatic impact on the effectiveness of imputation techniques.
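The abstract above contrasts mean imputation with regression imputation. A minimal sketch of the difference follows, and it is not the authors' experimental setup. The single-predictor least-squares fit, the function names, and the toy data (lines of code and fault counts per module) are all assumptions made for illustration.

```python
from statistics import mean

def mean_impute(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def regression_impute(x, y):
    """Replace None entries in y with predictions from a least-squares
    line fitted on the complete (x, y) pairs."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    mx = mean(p[0] for p in pairs)
    my = mean(p[1] for p in pairs)
    sxx = sum((xi - mx) ** 2 for xi, _ in pairs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in pairs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [intercept + slope * xi if yi is None else yi
            for xi, yi in zip(x, y)]

# Hypothetical toy data: module size (x) and observed faults (y).
loc = [100, 200, 300, 400, 500]
faults = [1.0, 2.0, 3.0, 4.0, None]

mean_filled = mean_impute(faults)            # ignores module size
reg_filled = regression_impute(loc, faults)  # uses the size/fault trend
```

On this toy data the mean fills the gap with the average of the observed fault counts regardless of module size, whereas the regression estimate follows the trend between size and faults.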

2.

New methods for imputation of missing genotype using linkage disequilibrium and haplotype information

Ho-Youl Jung Yun-Ju Park Jung-Sun Park InSong Koh 《Information Sciences》2007,177(3):804-814

In this paper, we propose new imputation methods for missing genotype data of single nucleotide polymorphisms (SNPs). The common objective of imputation methods is to minimize the loss of information caused by experimentally missing elements. In general, imputation of missing genotype data has used a major allele method, but this approach falls short of the objective of imputation, namely minimizing the loss of information. It generally produces high error rates in missing value estimation, since it does not take the characteristics and structure of the given genotype data into account. Our methods use linkage disequilibrium and haplotype information to impute the missing SNP genotypes. We report a comparative evaluation of our methods against the major allele imputation method at various randomized missing rates.
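For reference, the baseline that the abstract calls the major allele method can be sketched as below. This is a hedged illustration only: the two-letter genotype encoding, the choice to fill a missing genotype with the homozygote of the most frequent allele, and the function name are assumptions, and the linkage-disequilibrium and haplotype methods proposed in the paper are not reproduced here.

```python
from collections import Counter

def major_allele_impute(genotypes):
    """Fill missing genotypes (None) at one SNP with the homozygous
    genotype of the most frequent observed allele."""
    counts = Counter(
        allele
        for g in genotypes if g is not None
        for allele in g
    )
    major = counts.most_common(1)[0][0]
    return [major * 2 if g is None else g for g in genotypes]

# Hypothetical genotypes for one SNP across six samples.
snp = ["AA", "AG", None, "AA", "GG", None]
filled = major_allele_impute(snp)
```

Because every missing entry receives the same value regardless of neighbouring SNPs, this baseline ignores exactly the structure that the abstract says should be exploited.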

3.

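Optimized application of the AC800F system in blast furnace blower control

袁玉祥 沈长清 牟红菊 《自动化与仪器仪表》 (Automation & Instrumentation) 2009,(5):69-70

This paper presents control schemes for anti-surge control and for constant-air-volume and constant-air-pressure control in a blast furnace blower control system based on the AC800F system.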

4.

Clustering and outlier detection using isoperimetric number of trees

A. Daneshgar R. Javadi S.B. Shariat Razavi 《Pattern recognition》2013,46(12):3371-3382

5.

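Practice and application of molten iron rail scales at blast furnaces

袁静 《自动化与仪器仪表》 (Automation & Instrumentation) 2009,(6):51-52

This paper focuses on the application characteristics of two types of rail scale at blast furnaces, pitless and pit-mounted. It also describes how the special environment of the blast furnace casthouse affects rail scales, and sets out points to note during procurement, installation, and maintenance.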

6.

Capabilities of outlier detection schemes in large datasets, framework and methodologies

Jian Tang Zhixiang Chen Ada Waichee Fu David W. Cheung 《Knowledge and Information Systems》2007,11(1):45-84

Outlier detection is concerned with discovering exceptional behaviors of objects. Its theoretical principle and practical implementation lay a foundation for some important applications such as credit card fraud detection, discovering criminal behaviors in e-commerce, discovering computer intrusion, etc. In this paper, we first present a unified model for several existing outlier detection schemes, and propose a compatibility theory, which establishes a framework for describing the capabilities of various outlier formulation schemes in terms of matching users' intuitions. Under this framework, we show that the density-based scheme is more powerful than the distance-based scheme when a dataset contains patterns with diverse characteristics. The density-based scheme, however, is less effective when the patterns are of comparable densities with the outliers. We then introduce a connectivity-based scheme that improves the effectiveness of the density-based scheme when a pattern itself is of similar density to an outlier. We compare density-based and connectivity-based schemes in terms of their strengths and weaknesses, and demonstrate applications with different features where each of them is more effective than the other. Finally, connectivity-based and density-based schemes are comparatively evaluated on both real-life and synthetic datasets in terms of recall, precision, rank power and implementation-free metrics.

Jian Tang received an MS degree from the University of Iowa in 1983 and a PhD from the Pennsylvania State University in 1988, both from the Department of Computer Science. He joined the Department of Computer Science, Memorial University of Newfoundland, Canada, in 1988, where he is currently a professor. He has visited a number of research institutions to conduct research on a variety of topics relating to the theory and practice of database management and systems. His current research interests include data mining, e-commerce, XML and bioinformatics.

Zhixiang Chen is an associate professor in the Computer Science Department, University of Texas-Pan American. He received his PhD in computer science from Boston University in January 1996, and BS and MS degrees in software engineering from Huazhong University of Science and Technology. He also studied at the University of Illinois at Chicago. He taught at Southwest State University from Fall 1995 to September 1997, and at Huazhong University of Science and Technology from 1982 to 1990. His research interests include computational learning theory, algorithms and complexity, intelligent Web search, information retrieval, and data mining.

Ada Waichee Fu received her BSc degree in computer science from the Chinese University of Hong Kong in 1983, and both MSc and PhD degrees in computer science from Simon Fraser University, Canada, in 1986 and 1990, respectively. She worked at Bell Northern Research in Ottawa, Canada, from 1989 to 1993 on a wide-area distributed database project, and joined the Chinese University of Hong Kong in 1993. Her research interests are XML data, time series databases, data mining, content-based retrieval in multimedia databases, and parallel and distributed systems.

David Wai-lok Cheung received the MSc and PhD degrees in computer science from Simon Fraser University, Canada, in 1985 and 1989, respectively. He also received the BSc degree in mathematics from the Chinese University of Hong Kong. From 1989 to 1993, he was a member of Scientific Staff at Bell Northern Research, Canada. Since 1994, he has been a faculty member of the Department of Computer Science at the University of Hong Kong. He is also the Director of the Center for E-Commerce Infrastructure Development. His research interests include data mining, data warehouse, XML technology for e-commerce and bioinformatics. Dr. Cheung was the Program Committee Chairman of the Fifth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2001) and Program Co-Chair of the Ninth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2005). Dr. Cheung is a member of the ACM and the IEEE Computer Society.

7.

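Automatic image-based detection and recognition of vehicles illegally driving against traffic

杨昌勇 刘建伟 曹泉 《计算机工程与设计》 (Computer Engineering and Design) 2005,26(10):2825-2827

A method based on image processing is proposed for automatically detecting and recognizing vehicles that illegally drive against traffic. Image processing techniques, including median filtering, frame statistics, frame counting, and boundary detection, are used to extract and recognize the motion features of wrong-way vehicles in video images, achieving automatic detection of the offending vehicles. Experimental results show that the method is effective.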
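Median filtering, the first processing step named in the abstract above, can be illustrated with a minimal one-dimensional sketch. The window size, the handling of the edges (the window simply shrinks there), and the sample pixel row are assumptions for illustration. The paper's actual filter, presumably applied to two-dimensional video frames, is not described in the abstract.

```python
from statistics import median

def median_filter(signal, window=3):
    """Replace each sample with the median of its neighbourhood.
    Near the edges the window is truncated to the available samples."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(median(signal[lo:hi]))
    return out

# Hypothetical pixel row with one impulse-noise spike (255).
row = [10, 10, 255, 10, 10]
smoothed = median_filter(row)
```

The isolated spike is removed while the surrounding values are left unchanged, which is why median filtering is a common choice for suppressing impulse noise before motion features are extracted.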

8.

Subspace identification of continuous time models for process fault detection and isolation

Weihua Li Harigopal Raghavan Sirish Shah 《Journal of Process Control》2003,13(5):1093-421

This paper proposes a novel subspace approach towards identification of optimal residual models for process fault detection and isolation (PFDI) in a multivariate continuous-time system. We formulate the problem in terms of the state space model of the continuous-time system. The motivation for such a formulation is that the fault gain matrix, which links the process faults to the state variables of the system under consideration, is always available no matter how the faults vary with time. However, in the discrete-time state space model, the fault gain matrix is only available when the faults follow some known function of time within each sampling interval. To isolate faults, the fault gain matrix is essential. We develop subspace algorithms in the continuous-time domain to directly identify the residual models from sampled noisy data without separate identification of the system matrices. Furthermore, the proposed approach can also be extended towards the identification of the system matrices if they are needed. The newly proposed approach is applied to a simulated four-tank system, where a small leak from any tank is successfully detected and isolated. To make a comparison, we also apply the discrete time residual models to the tank system for detection and isolation of leaks. It is demonstrated that the continuous-time PFDI approach is practical and has better performance than the discrete-time PFDI approach.

9.

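A hydrogen sensor fault detection and identification method based on sparse decomposition residuals

韦宝泉 付智辉 邓芳明 吴翔 谭畅 《传感器与微系统》 (Transducer and Microsystem Technologies) 2017,36(8)

To address sensor fault detection and diagnosis, a hydrogen sensor fault detection and identification method based on sparse decomposition residuals is proposed. Drawing on sparse signal decomposition theory, an overcomplete dictionary D is learned from a collected dataset of normal sensor signals using the K-singular value decomposition (K-SVD) learning algorithm. Abnormal (faulty) signals are then decomposed over this dictionary, and sensor faults are detected and identified according to the magnitude and range of the sparse decomposition residual. Experimental results show that the fault detection rate and overall identification rate for the hydrogen sensor reach 98.75% and 97.25%, respectively, so the method handles hydrogen sensor fault detection and identification effectively.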
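The residual-based detection idea in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's method: the dictionary `D` is hand-written rather than learned with K-SVD, the decomposition is plain matching pursuit with a single atom, and the threshold and sample signals are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def sparse_residual(signal, dictionary, n_atoms=1):
    """Greedy matching pursuit: repeatedly subtract the projection onto
    the best-matching unit-norm atom, then return the residual norm."""
    residual = list(signal)
    for _ in range(n_atoms):
        best = max(
            dictionary,
            key=lambda a: abs(sum(r * x for r, x in zip(residual, a))),
        )
        coef = sum(r * x for r, x in zip(residual, best))
        residual = [r - coef * x for r, x in zip(residual, best)]
    return math.sqrt(sum(r * r for r in residual))

# Assumed dictionary standing in for one learned from normal signals.
D = [normalize([1, 1, 1, 1]), normalize([1, 2, 3, 4])]
THRESHOLD = 0.5  # hypothetical residual threshold

def is_faulty(signal):
    """Flag a signal whose sparse-decomposition residual is large."""
    return sparse_residual(signal, D) > THRESHOLD

normal = [2.0, 2.0, 2.0, 2.0]   # well represented by the dictionary
faulty = [2.0, 2.0, 9.0, 2.0]   # spike the dictionary cannot represent
```

A signal that resembles the normal training data is reconstructed almost exactly and leaves a small residual, while a faulty signal leaves a large one. The abstract indicates that both the magnitude and the range of the residual are then used to tell fault types apart, a step this sketch does not cover.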

10.

Identification, prediction and detection of the process fault in a cement rotary kiln by locally linear neuro-fuzzy technique (Total citations: 1; self-citations: 0; citations by others: 1)

Masoud Sadeghian Alireza Fatehi 《Journal of Process Control》2011,21(2):302-308

In this paper, we use a nonlinear system identification method to predict and detect process faults in a cement rotary kiln. After selecting proper inputs and output, an input-output model is identified for the plant. To identify the various operating points of the kiln, a locally linear neuro-fuzzy (LLNF) model is used. This model is trained by the LOLIMOT algorithm, which is an incremental tree-structure algorithm. Using this method, we obtain three distinct models for the normal and faulty situations in the kiln. One model is for the normal condition of the kiln, with a 15 min prediction horizon. The other two models are for the two faulty situations in the kiln, with a 7 min prediction horizon. Finally, we detect these faults in validation data. The data used in this study were collected from White Saveh Cement Company.