Similar articles
 10 similar articles found
1.
Finding centric local outliers in categorical/numerical spaces   Total citations: 2, self-citations: 0, citations by others: 2
Outlier detection techniques are widely used in many applications such as credit-card fraud detection, monitoring criminal activities in electronic commerce, etc. These applications attempt to identify outliers as noises, exceptions, or objects around the border. The existing density-based local outlier detection assigns the degree to which an object is an outlier in a numerical space. In this paper, we propose a novel mutual-reinforcement-based local outlier detection approach. Instead of detecting local outliers as noise, we attempt to identify local outliers in the center, where they are similar to some clusters of objects on one hand, and are unique on the other. Our technique can be used for bank investment to identify a unique body, similar to many good competitors, in which to invest. We attempt to detect local outliers in categorical, ordinal as well as numerical data. In categorical data, the challenge is that there are many similar but different ways to specify relationships among the data items. Our mutual-reinforcement-based approach is stable, with similar but different user-defined relationships. Our technique can reduce the burden for users to determine the relationships among data items, and find the explanations why the outliers are found. We conducted extensive experimental studies using real datasets. Jeffrey Xu Yu received his B.E., M.E. and Ph.D. in computer science, from the University of Tsukuba, Japan, in 1985, 1987 and 1990, respectively. Jeffrey Xu Yu was a research fellow in the Institute of Information Sciences and Electronics, University of Tsukuba (Apr. 1990–Mar. 1991), and held teaching positions in the Institute of Information Sciences and Electronics, University of Tsukuba (Apr. 1991–July 1992) and in the Department of Computer Science, Australian National University (July 1992–June 2000). Currently he is an Associate Professor in the Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong. 
His major research interests include data mining, data stream mining/processing, XML query processing and optimization, data warehouse, on-line analytical processing, and design and implementation of database management systems. Weining Qian is currently an assistant professor of computer science at Fudan University, Shanghai, China. He received his M.S. and Ph.D. degrees in computer science from Fudan University in 2001 and 2004, respectively. He was supported by a Microsoft Research Fellowship when he was doing the research presented in this paper, and he is supported by the Shanghai Rising Star Program. His research interests include data mining for very large databases, data stream query processing and mining, and peer-to-peer computing. Hongjun Lu received his B.Sc. from Tsinghua University, China, and his M.Sc. and Ph.D. from the Department of Computer Science, University of Wisconsin–Madison. He worked as an engineer in the Chinese Academy of Space Technology, as a principal research scientist in the Computer Science Center of Honeywell Inc., Minnesota, USA (1985–1987), and as a professor at the School of Computing of the National University of Singapore (1987–2000), and is a full professor at the Hong Kong University of Science and Technology. His research interests are in data/knowledge-base management systems with an emphasis on query processing and optimization, physical database design, and database performance. Hongjun Lu is currently a trustee of the VLDB Endowment, an associate editor of the IEEE Transactions on Knowledge and Data Engineering (TKDE), and a member of the review board of the Journal of Database Management. He served as a member of the ACM SIGMOD Advisory Board in 1998–2002. Aoying Zhou, born in 1965, is currently a professor of computer science at Fudan University, Shanghai, China. He received his Bachelor's and Master's degrees in Computer Science from Sichuan University in Chengdu, Sichuan, China, in 1985 and 1988, respectively, and a Ph.D. degree from Fudan University in 1993. He has served as a member or chair of the program committees for many international conferences such as VLDB, ER, DASFAA, and WAIM. His papers have been published in ACM SIGMOD, VLDB, ICDE and some international journals. His research interests include data mining and knowledge discovery, XML data management, web query and searching, data stream analysis and processing, and peer-to-peer computing.

2.
ARMiner: A Data Mining Tool Based on Association Rules   Total citations: 3, self-citations: 0, citations by others: 3
In this paper, ARMiner, a data mining tool based on association rules, is introduced. Beginning with the system architecture, the characteristics and functions are discussed in detail, including data transfer, concept hierarchy generalization, mining rules with negative items, and the re-development of the system. An example of the tool's application is also shown. Finally, some issues for future research are presented.
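
For readers unfamiliar with association rules, the two standard measures behind a rule such as "bread implies milk" can be sketched as follows. This is a generic illustration of support and confidence, not ARMiner's implementation; the function names and the sample transactions are assumptions made for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical market-basket data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(transactions, {"bread", "milk"}))
print(confidence(transactions, {"bread"}, {"milk"}))
```

A rule miner typically keeps only the rules whose support and confidence both exceed user-chosen thresholds.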

3.
The Music Table is an augmented reality system for composing music by manipulating objects on a tabletop as a physicalized representation of the music being heard. Educational theory, and the apparent success of related applications in various learning contexts, seem to support this idea. In our experiments with children, all were able to make a musical pattern and made many changes to their pattern over a short period of time. We propose its suitability as an educational tool, particularly in short and intense interactive learning situations such as children's museums. We discuss some future developments of the idea. Rodney Berry was born in 1963 in Australia. He came to ATR in 1999. He is a musician, composer and media artist, and gained a Master of Fine Arts from the UNSW College of Fine Arts in Sydney in 1999. He is currently completing a Ph.D. at UTS Creativity and Cognition Studios in Sydney while continuing to work at ATR. Mao Makino was born in Osaka and graduated from the School of Literature, Arts and Cultural Studies. She has been a Media Creator at ATR since 1999. Her 3D animations were featured in the MIDAS interactive dance system shown at the exhibition “Dream Technologies for the 21st Century” in Tokyo in 2000. Naoto Hikawa was born in Sendai. He graduated in “Visual concept planning” from Osaka University of the Arts. Since 1998, he has been a media creator with ATR. He was also involved in the production of ATR MIC lab's MIDAS dance system. He is also a VJ at nightclubs in Osaka. Dr. Masami Suzuki was born in Tokyo, Japan. He obtained a Master's degree from Keio University in 1980 and has since worked for the telecommunications company KDD (currently KDDI). His research has ranged from natural language processing to creative human interfaces. Currently, he is a chief researcher at ATR Media Information Science Laboratories. Dr. Naomi Inoue was born in Nara, Japan. He gained a Master's degree and a Ph.D. from Kyoto University in 1984 and 1998, respectively. His research interests are natural language processing, speech recognition and graphical user interfaces for mobile phones. Currently, he is a group leader at ATR Media Information Science Laboratories.

4.
Privacy-preserving SVM classification   Total citations: 2, self-citations: 2, citations by others: 0
Traditional Data Mining and Knowledge Discovery algorithms assume free access to data, either at a centralized location or in federated form. Increasingly, privacy and security concerns restrict this access, thus derailing data mining projects. What is required is distributed knowledge discovery that is sensitive to this problem. The key is to obtain valid results, while providing guarantees on the nondisclosure of data. Support vector machine classification is one of the most widely used classification methodologies in data mining and machine learning. It is based on solid theoretical foundations and has wide practical application. This paper proposes a privacy-preserving solution for support vector machine (SVM) classification, PP-SVM for short. Our solution constructs the global SVM classification model from data distributed at multiple parties, without disclosing the data of each party to others. Solutions are sketched out for data that is vertically, horizontally, or even arbitrarily partitioned. We quantify the security and efficiency of the proposed method, and highlight future challenges. Jaideep Vaidya received the Bachelor’s degree in Computer Engineering from the University of Mumbai. He received the Master’s and the Ph.D. degrees in Computer Science from Purdue University. He is an Assistant Professor in the Management Science and Information Systems Department at Rutgers University. His research interests include data mining and analysis, information security, and privacy. He has received best paper awards for papers in ICDE and SIDKDD. He is a Member of the IEEE Computer Society and the ACM. Hwanjo Yu received the Ph.D. degree in Computer Science in 2004 from the University of Illinois at Urbana-Champaign. He is an Assistant Professor in the Department of Computer Science at the University of Iowa. His research interests include data mining, machine learning, database, and information systems. 
He is an Associate Editor of Neurocomputing and served on the NSF Panel in 2006. He has served on the program committees of the Data Mining track of 2005 ACM SAC, 2005 and 2006 IEEE ICDM, 2006 ACM CIKM, and 2006 SIAM Data Mining. Xiaoqian Jiang received the B.S. degree in Computer Science from Shanghai Maritime University, Shanghai, in 2003. He received the M.C.S. degree in Computer Science from the University of Iowa, Iowa City, in 2005. Currently, he is pursuing a Ph.D. degree at the School of Computer Science, Carnegie Mellon University. His research interests are computer vision, machine learning, data mining, and privacy protection technologies.

5.
In this paper, we study the problem of efficiently computing k-medians over high-dimensional and high speed data streams. The focus of this paper is on minimizing CPU time to handle high speed data streams, on top of the requirements of high accuracy and small memory. Our work is motivated by the following observation: the existing algorithms have similar approximation behaviors in practice, even though they make noticeably different worst case theoretical guarantees. The underlying reason is that, in order to achieve a high approximation level with the smallest possible memory, they need rather complex techniques to maintain a sketch along the time dimension, using existing off-line clustering algorithms. Those clustering algorithms cannot guarantee the optimal clustering result over data segments in a data stream but accumulate errors over segments, which makes most algorithms behave the same in terms of approximation level in practice. We propose a new grid-based approach which divides the entire data set into cells (not along the time dimension). We can achieve a high approximation level based on a novel concept called (1 - ε)-dominant. We further extend the method to the data stream context by leveraging a density-based heuristic and frequent item mining techniques over data streams. We only need to apply an existing clustering algorithm once to compute k-medians, on demand, which reduces CPU time significantly. We conducted extensive experimental studies, and show that our approaches outperform other well-known approaches.
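
Two ingredients named in this abstract can be illustrated generically: the k-medians objective (the sum of distances from each point to its nearest median) and a uniform grid that partitions the data space into cells. The sketch below is not the authors' algorithm, and it does not attempt the (1 - ε)-dominant concept, which the abstract does not define. The function names and the `cell_width` parameter are assumptions.

```python
import math
from collections import Counter


def kmedians_cost(points, medians):
    """k-medians objective: total distance from each point to its nearest median."""
    return sum(min(math.dist(p, m) for m in medians) for p in points)


def grid_counts(points, cell_width):
    """Count how many points fall in each axis-aligned cell of side `cell_width`."""
    counts = Counter()
    for p in points:
        cell = tuple(math.floor(x / cell_width) for x in p)
        counts[cell] += 1
    return counts


# Hypothetical 2-D points, for illustration only.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)]
print(kmedians_cost(points, medians=[(0.0, 1.0), (10.0, 0.0)]))
print(grid_counts(points, cell_width=5.0))
```

Working with per-cell counts instead of raw points is what lets a grid-based method summarize a stream in bounded memory; how cells are then selected is specific to the paper.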

6.
Classification is an important technique in data mining. The decision trees built by most of the existing classification algorithms commonly feature over-branching, which leads to poor efficiency in the subsequent classification period. In this paper, we present a new value-oriented classification method, which aims at building accurate, properly sized decision trees while reducing over-branching as much as possible, based on the concepts of frequent-pattern-node and exceptive-child-node. The experiments show that, when relevance analysis is used as pre-processing, our classification method can, without loss of accuracy, eliminate over-branching in decision trees more effectively and efficiently than other algorithms do.

7.
The study of database technologies, or more generally, the technologies of data and information management, is an important and active research field. Recently, many exciting results have been reported. In this fast growing field, Chinese researchers play more and more active roles. Research papers from Chinese scholars, both in China and abroad, appear in prestigious academic forums. In this paper, we, nine young Chinese researchers working in the United States, present concise surveys and report our recent progress on the selected fields that we are working on. Although the paper covers only a small number of topics and the selection of the topics is far from balanced, we hope that such an effort will attract more and more researchers, especially those in China, to enter the frontiers of database research and promote collaborations. For the obvious reason, the authors are listed alphabetically, while the sections are arranged in the order of the author list.

8.
Balance control of a biped robot using camera image of reference object   Total citations: 1, self-citations: 0, citations by others: 1
This paper presents a new balance control scheme for a biped robot. Instead of using dynamic sensors to measure the pose of a biped robot, this paper uses only the visual information of a specific reference object in the workspace. The zero moment point (ZMP) of the biped robot can be calculated from the robot’s pose, which is measured from the reference object image acquired by a CCD camera on the robot’s head. For balance control of the biped robot, a servo controller uses the error between the reference ZMP and the current ZMP, estimated by a Kalman filter. The efficiency of the proposed algorithm has been demonstrated by experiments performed on both flat and uneven floors with unknown thin obstacles. Recommended by Editorial Board member Dong Hwan Kim under the direction of Editor Jae-Bok Song. This work was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD). This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement) (IITA-2008-C1090-0803-0006). Sangbum Park received the B.S. and M.S. degrees in Electronic Engineering from Soongsil University, Seoul, Korea, in 2004 and 2006, respectively. He has been with the School of Electronic Engineering, Soongsil University, since 2006, where he is currently pursuing a Ph.D. His current research interests include biped walking robots and robot vision. Youngjoon Han received the B.S., M.S. and Ph.D. degrees in Electronic Engineering from Soongsil University, Seoul, Korea, in 1996, 1998, and 2003, respectively. He is currently an Assistant Professor in the School of Electronic Engineering at Soongsil University. His research interests include robot vision systems and visual servo control. Hernsoo Hahn received the B.S. and M.S. degrees in Electronic Engineering at Soongsil University and Yonsei University, Korea, in 1982 and 1983, respectively. He received the Ph.D. degree in Computer Engineering from the University of Southern California in 1991, and became an Assistant Professor at the School of Electronic Engineering at Soongsil University in 1992. Currently, he is a Professor. His research interests include application of vision sensors to mobile robots and measurement systems.

9.
This paper proposes a novel method of analysing trajectories followed by people while they perform navigational tasks. The results indicate that modelling trajectories with Bézier curves provides a basis for the diagnosis of navigational patterns. The method offers five indicators: goodness of fit, average curvature, number of inflexion points, lengths of straight line segments, and area covered. Study results obtained in a virtual environment show that these indicators carry important information about user performance, specifically spatial knowledge acquisition. Corina Sas is a Lecturer in the field of human–computer interaction in the Computing Department at Lancaster University. She holds bachelor degrees in Computer Science and Psychology and an M.A. in Industrial Psychology from Romania. She received her Ph.D. degree in Computer Science from University College Dublin in 2004. Her research interests include user modelling, adaptive systems, data mining, spatial cognition, user studies and individual differences. She has published in various journals and international conferences in these areas. Nikita Schmidt is a Postdoctoral Research Fellow at University College Dublin (UCD). He received his Ph.D. degree from UCD in 2004 and M.Sc. from St-Petersburg State University, Russia in 1994. His research interests include pervasive, ubiquitous and location-aware computing, embedded systems, hardware-close software development and tree-structured data. His work experience is a mix of industry and academia.
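
As a rough illustration of one of the indicators listed above, the sketch below evaluates a planar cubic Bézier curve and estimates its average absolute curvature by sampling. The abstract does not say which degree of curve or which sampling scheme the authors use, so the cubic form, the function names, and the `samples` parameter are all assumptions.

```python
def bezier_point(ctrl, t):
    """Point on a cubic Bézier curve with control points ctrl = [P0, P1, P2, P3]."""
    u = 1.0 - t
    weights = (u ** 3, 3 * u ** 2 * t, 3 * u * t ** 2, t ** 3)
    return tuple(sum(w * p[i] for w, p in zip(weights, ctrl)) for i in range(2))


def bezier_curvature(ctrl, t):
    """Signed curvature (x'y'' - y'x'') / |B'|^3 of the cubic Bézier curve at t."""
    p0, p1, p2, p3 = ctrl
    u = 1.0 - t
    # First derivative B'(t)
    d1 = [3 * u * u * (p1[i] - p0[i]) + 6 * u * t * (p2[i] - p1[i])
          + 3 * t * t * (p3[i] - p2[i]) for i in range(2)]
    # Second derivative B''(t)
    d2 = [6 * u * (p2[i] - 2 * p1[i] + p0[i])
          + 6 * t * (p3[i] - 2 * p2[i] + p1[i]) for i in range(2)]
    speed_sq = d1[0] ** 2 + d1[1] ** 2
    if speed_sq == 0:
        return 0.0  # degenerate point: curvature is undefined, report zero
    return (d1[0] * d2[1] - d1[1] * d2[0]) / speed_sq ** 1.5


def average_abs_curvature(ctrl, samples=100):
    """Mean of |curvature| at evenly spaced parameter values in [0, 1]."""
    values = [abs(bezier_curvature(ctrl, k / samples)) for k in range(samples + 1)]
    return sum(values) / len(values)


# Hypothetical control points describing a U-shaped path.
path = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
print(bezier_point(path, 0.5))
print(average_abs_curvature(path))
```

Under this definition a straight path has zero average curvature, and a sign change in the signed curvature marks an inflexion point.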

10.
Many supervised machine learning tasks can be cast as multi-class classification problems. Support vector machines (SVMs) excel at binary classification problems, but the elegant theory behind large-margin hyperplanes cannot be easily extended to their multi-class counterparts. On the other hand, it was shown that the decision hyperplanes for binary classification obtained by SVMs are equivalent to the solutions obtained by Fisher's linear discriminant on the set of support vectors. Discriminant analysis approaches are well known to learn discriminative feature transformations in the statistical pattern recognition literature and can be easily extended to multi-class cases. The use of discriminant analysis, however, has not been fully explored in the data mining literature. In this paper, we explore the use of discriminant analysis for multi-class classification problems. We evaluate the performance of discriminant analysis on a large collection of benchmark datasets and investigate its usage in text categorization. Our experiments suggest that discriminant analysis provides a fast, efficient yet accurate alternative for general multi-class classification problems. Tao Li is currently an assistant professor in the School of Computer Science at Florida International University. He received his Ph.D. degree in Computer Science from the University of Rochester in 2004. His primary research interests are data mining, machine learning, bioinformatics, and music information retrieval. Shenghuo Zhu is currently a researcher at NEC Laboratories America, Inc. He received his B.E. from Zhejiang University in 1994, B.E. from Tsinghua University in 1997, and Ph.D. degree in Computer Science from the University of Rochester in 2003. His primary research interests include information retrieval, machine learning, and data mining. Mitsunori Ogihara received a Ph.D. in Information Sciences at Tokyo Institute of Technology in 1993. He is currently Professor and Chair of the Department of Computer Science at the University of Rochester. His primary research interests are data mining, computational complexity, and molecular computation.
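
Fisher's linear discriminant, mentioned in this abstract, can be sketched for the simplest case of two classes in two dimensions: the projection direction is w = S_w^{-1}(m1 - m0), where S_w is the within-class scatter matrix and m0, m1 are the class means. The paper addresses the multi-class setting; this two-class, pure-Python version with a midpoint threshold is a simplification chosen for the example, and all names in it are assumptions.

```python
def mean2(vectors):
    """Component-wise mean of a list of 2-D points."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(2)]


def scatter2(vectors, mu):
    """2x2 scatter matrix: sum of (v - mu)(v - mu)^T over the points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for v in vectors:
        d = [v[0] - mu[0], v[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class0, class1):
    """Return w = S_w^{-1} (m1 - m0) and the midpoint threshold on w.x."""
    m0, m1 = mean2(class0), mean2(class1)
    s0, s1 = scatter2(class0, m0), scatter2(class1, m1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Closed-form 2x2 inverse applied to the mean difference.
    w = [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
         (sw[0][0] * diff[1] - sw[1][0] * diff[0]) / det]
    threshold = sum(w[i] * (m0[i] + m1[i]) / 2 for i in range(2))
    return w, threshold


def classify(point, w, threshold):
    """Assign class 1 if the projection of `point` onto w exceeds the threshold."""
    return 1 if w[0] * point[0] + w[1] * point[1] > threshold else 0


# Hypothetical training data: two well-separated squares.
class0 = [(0, 0), (1, 0), (0, 1), (1, 1)]
class1 = [(4, 0), (5, 0), (4, 1), (5, 1)]
w, threshold = fisher_direction(class0, class1)
print(w, threshold, classify((4.8, 0.9), w, threshold))
```

The multi-class generalization replaces the mean difference with a between-class scatter matrix and solves a generalized eigenvalue problem, which in practice calls for a linear algebra library.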
