Similar Documents
 18 similar documents found (search time: 187 ms)
1.
One view of finding a personalized reduct of an information system is grounded on the idea that an attribute order can serve as a semantic representation of user requirements. The problem of finding personalized solutions can thus be transformed into computing the reduct under an attribute order. The second attribute theorem describes the relationship between the set of attribute orders and the set of reducts, and can be used to transform the problem of searching for solutions that meet user requirements into the problem of modifying a reduct based on a given attribute order. An algorithm follows directly from the second attribute theorem, with its computation carried out on the discernibility matrix; its time complexity is O(n^2 × m), where n is the number of objects and m the number of attributes of the information system. This paper presents another effective second attribute algorithm that facilitates the use of the second attribute theorem, with computation on the tree expression of an information system. The time complexity of the new algorithm is linear in n, and the algorithm is proved to be equivalent to the one on the discernibility matrix.
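A minimal sketch of the discernibility-matrix route that the O(n^2 × m) baseline refers to: build one discernibility entry per object pair with differing decisions, then pick attributes following the user-given order. The greedy selection and pruning below are an illustrative order-driven heuristic, not the second attribute algorithm itself, and the toy table and names are invented for the example.

```python
from itertools import combinations

def discernibility_pairs(objects, condition_attrs, decision_attr):
    """For every pair of objects with different decisions, collect the set of
    condition attributes whose values differ (one entry of the discernibility matrix)."""
    pairs = []
    for x, y in combinations(objects, 2):
        if x[decision_attr] == y[decision_attr]:
            continue
        diff = frozenset(a for a in condition_attrs if x[a] != y[a])
        if diff:
            pairs.append(diff)
    return pairs

def reduct_under_order(objects, attr_order, decision_attr):
    """Greedy, order-driven reduct heuristic: scan attributes in the user-given
    order, keep an attribute only if it discerns at least one pair that the
    attributes chosen so far do not, then drop attributes that became redundant."""
    pairs = discernibility_pairs(objects, attr_order, decision_attr)
    chosen, uncovered = [], list(pairs)
    for a in attr_order:
        if any(a in p for p in uncovered):
            chosen.append(a)
            uncovered = [p for p in uncovered if a not in p]
        if not uncovered:
            break
    # backward pruning: drop an attribute if the rest still cover every pair
    for a in list(chosen):
        rest = [b for b in chosen if b != a]
        if all(any(b in p for b in rest) for p in pairs):
            chosen.remove(a)
    return chosen

# toy information table: condition attributes a, b, c and decision d
table = [
    {"a": 1, "b": 0, "c": 0, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "no"},
    {"a": 0, "b": 0, "c": 1, "d": "yes"},
]
print(reduct_under_order(table, ["a", "b", "c"], "d"))   # -> ['b']
```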

2.
Work in inductive learning has mostly concentrated on classification. However, there are many applications in which it is desirable to order rather than to classify instances. For modelling ordering problems, we generalize the notion of information tables to ordered information tables by adding order relations on attribute values. We then propose a data analysis model, based on analyzing the dependency of attributes, to describe the properties of ordered information tables. The problem of mining ordering rules is formulated as finding associations between orderings of attribute values and the overall ordering of objects. An ordering rule may state that "if the value of an object x on an attribute a is ordered ahead of the value of another object y on the same attribute, then x is ordered ahead of y". For mining ordering rules, we first transform an ordered information table into a binary information table, and then apply any standard machine learning or data mining algorithm. As an illustration, we analyze in detail Maclean's university ranking for the year 2000.
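A hedged sketch of the transformation step described above: each row of the binary information table stands for an ordered pair of objects, each attribute column records whether the first object's value is ordered ahead of the second's, and the decision column comes from the overall ranking. The "smaller value is ordered ahead" convention and the toy data are assumptions for illustration.

```python
from itertools import permutations

def to_binary_table(objects, attrs, overall):
    """Transform an ordered information table into a binary one: each row stands
    for an ordered object pair (x, y); column a is 1 iff x's value on a is
    ordered ahead of y's (here: strictly smaller), and the decision column is
    1 iff x is ahead of y in the overall ranking."""
    rank = {obj_id: i for i, obj_id in enumerate(overall)}
    rows = []
    for x, y in permutations(objects, 2):
        row = {a: int(objects[x][a] < objects[y][a]) for a in attrs}
        row["overall"] = int(rank[x] < rank[y])
        rows.append(((x, y), row))
    return rows

# toy ranking data: smaller value = ordered ahead (assumption for illustration)
unis = {"U1": {"cost": 1, "size": 3}, "U2": {"cost": 2, "size": 1}, "U3": {"cost": 3, "size": 2}}
overall_rank = ["U2", "U1", "U3"]          # hypothetical overall ordering
for pair, row in to_binary_table(unis, ["cost", "size"], overall_rank):
    print(pair, row)
```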

3.
A New Algorithm for Mining Classification Rules Based on Decision Tables
The mining of classification rules is an important field in data mining, and the decision table of rough set theory is an efficient tool for it. This paper introduces the elementary concepts of decision tables in rough set theory and presents a new algorithm for mining classification rules based on decision tables, together with a discernibility function for the reduction of attribute values and a new principle for the accuracy of rules. An example of its application to a car classification problem is included, and the accuracy of the rules discovered is analyzed. Potential fields of application in data mining are also discussed.
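The following is a small, hedged illustration of mining classification rules from a decision table: conjunctions of attribute-value conditions are grown until they determine a decision with the required accuracy, and longer rules subsumed by shorter ones are skipped. It is not the paper's algorithm (the discernibility-function-based value reduction is omitted); the names and the toy car table are invented.

```python
from collections import Counter
from itertools import combinations

def mine_rules(table, cond_attrs, dec_attr, min_acc=1.0):
    """Enumerate attribute-value conjunctions of growing length and keep a rule
    once its accuracy (fraction of matching objects sharing one decision)
    reaches min_acc; longer rules subsumed by an already kept rule are skipped."""
    rules = []
    for k in range(1, len(cond_attrs) + 1):
        for attrs in combinations(cond_attrs, k):
            seen = set()
            for row in table:
                cond = tuple((a, row[a]) for a in attrs)
                if cond in seen:
                    continue
                seen.add(cond)
                matched = [r[dec_attr] for r in table if all(r[a] == v for a, v in cond)]
                dec, n = Counter(matched).most_common(1)[0]
                acc = n / len(matched)
                if acc >= min_acc and not any(set(c) <= set(cond) for c, _, _ in rules):
                    rules.append((cond, dec, acc))
    return rules

cars = [  # toy decision table for a car classification problem
    {"price": "high", "doors": 2, "safety": "low", "accept": "no"},
    {"price": "high", "doors": 4, "safety": "high", "accept": "yes"},
    {"price": "low", "doors": 4, "safety": "high", "accept": "yes"},
    {"price": "low", "doors": 2, "safety": "low", "accept": "no"},
]
for cond, dec, acc in mine_rules(cars, ["price", "doors", "safety"], "accept"):
    print(dict(cond), "->", dec, f"(accuracy {acc:.2f})")
```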

4.
Reduct and attribute order
Based on the principle of the discernibility matrix, a reduction algorithm with attribute order has been developed, and its solution has been proved to be complete for the reduct and unique for a given attribute order. Called the reduct problem, this algorithm can be regarded as a mapping R = Reduct(S) from the attribute order space O to the reduct space R for an information system (U, C ∪ D), where U is the universe and C and D are the sets of condition and decision attributes, respectively. This paper focuses on the reverse of the reduct problem, S = Order(R): for a given reduct R of an information system, determine the solution of S = Order(R) in the space O. First, we prove that there is at least one attribute order S such that S = Order(R). Then, decision rules are proposed that can be used directly to decide whether a pair of attribute orders has the same reduct. The main method rests on the fact that one attribute order can be transformed into another by moving an attribute a limited number of times, so the decision for a pair of attribute orders can be reduced to decisions for a sequence of neighboring pairs of attribute orders. Therefore, the basic theorem on neighboring pairs of attribute orders is proved first, and the decision theorem on attribute orders is then proved accordingly by means of the second attribute.

5.
In this paper, a new effective method is proposed to find class association rules (CARs), to obtain useful class association rules (UCARs) by removing spurious class association rules (SCARs), and to generate exception class association rules (ECARs) for each UCAR. CAR mining, which integrates classification and association techniques, has attracted great interest recently. However, it has two drawbacks: one is that a large proportion of CARs are spurious and may mislead users; the other is that some important ECARs are difficult to find with traditional data mining techniques. The method introduced in this paper aims to overcome these flaws. With our approach, a user can retrieve correct information from the UCARs and learn the influence of different conditions by checking the corresponding ECARs. Experimental results demonstrate the effectiveness of the proposed approach.
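One plausible, simplified reading of the pipeline is sketched below: a class association rule is treated as spurious when its confidence does not beat the class base rate, and an exception rule is sought by appending one extra condition under which the predicted class no longer dominates. The thresholds, names, and toy data are assumptions, not the paper's definitions.

```python
from collections import Counter

def confidence(rows, cond, cls_attr, cls):
    """Confidence of cond => cls and the number of rows matching cond."""
    matched = [r for r in rows if all(r[a] == v for a, v in cond.items())]
    return (sum(r[cls_attr] == cls for r in matched) / len(matched), len(matched)) if matched else (0.0, 0)

def useful_and_exception_rules(rows, cars, cls_attr, attrs):
    """Keep a CAR only if its confidence beats the class prior (otherwise spurious),
    then look for exception rules: one extra attribute-value condition under which
    the predicted class no longer dominates."""
    prior = Counter(r[cls_attr] for r in rows)
    out = []
    for cond, cls in cars:
        conf, _ = confidence(rows, cond, cls_attr, cls)
        if conf <= prior[cls] / len(rows):          # spurious: no lift over base rate
            continue
        exceptions = []
        for a in attrs:
            if a in cond:
                continue
            for v in {r[a] for r in rows}:
                ext = dict(cond, **{a: v})
                econf, n = confidence(rows, ext, cls_attr, cls)
                if n > 0 and econf < 0.5:           # extra condition flips the rule
                    exceptions.append((ext, n))
        out.append((cond, cls, conf, exceptions))
    return out

rows = [
    {"age": "young", "income": "low", "buy": "no"},
    {"age": "young", "income": "high", "buy": "yes"},
    {"age": "old", "income": "high", "buy": "yes"},
    {"age": "old", "income": "low", "buy": "yes"},
]
cars = [({"income": "high"}, "yes"), ({"age": "young"}, "no")]
for cond, cls, conf, exc in useful_and_exception_rules(rows, cars, "buy", ["age", "income"]):
    print(cond, "=>", cls, f"conf={conf:.2f}", "exceptions:", exc)
```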

6.
As an important type of multidimensional preference query, the skyline query can find a superset of optimal results when there is no given linear function to combine values for all attributes of interest. Its processing has been extensively investigated in the past. While most skyline query processing algorithms are designed based on the assumption that query processing is done for all attributes in a static dataset with deterministic attribute values, some advanced work has been done recently to remove part of such a strong assumption in order to process skyline queries for real-life applications, namely, to deal with data with multi-valued attributes (known as data uncertainty), to support skyline queries in a subspace which is a subset of attributes selected by the user, and to support continuous queries on streaming data. Naturally, there are many application scenarios where these three complex issues must be considered together. In this paper, we tackle the problem of probabilistic subspace skyline query processing over sliding windows on uncertain data streams. That is, to retrieve all objects from the most recent window of streaming data in a user-selected subspace with a skyline probability no smaller than a given threshold. Based on the subtle relationship between the full space and an arbitrary subspace, a novel approach using a regular grid indexing structure is developed for this problem. An extensive empirical study under various settings is conducted to show the effectiveness and efficiency of our probabilistic subspace skyline (PSS) algorithm.
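To make the query concrete, here is a hedged sketch of the skyline-probability computation in a user-selected subspace, assuming each uncertain object is a small set of weighted instances, objects are independent, and smaller values are preferred; the grid index and sliding-window maintenance that the paper contributes are omitted, and the data and threshold are illustrative.

```python
def dominates(a, b, dims):
    """a dominates b (smaller is better) in the chosen subspace: no worse on every
    dimension and strictly better on at least one."""
    return all(a[d] <= b[d] for d in dims) and any(a[d] < b[d] for d in dims)

def skyline_probability(obj, others, dims):
    """Each uncertain object is a list of (instance, probability) pairs whose
    probabilities sum to 1.  Assuming independence across objects, the skyline
    probability of obj is the chance that one of its instances occurs and is not
    dominated by any instance of any other object."""
    p_sky = 0.0
    for inst, p in obj:
        survive = 1.0
        for other in others:
            dominated_mass = sum(q for v, q in other if dominates(v, inst, dims))
            survive *= 1.0 - dominated_mass
        p_sky += p * survive
    return p_sky

# two uncertain objects over attributes price/distance; query subspace = ("price",)
A = [({"price": 3, "distance": 9}, 0.5), ({"price": 7, "distance": 1}, 0.5)]
B = [({"price": 5, "distance": 4}, 1.0)]
for name, obj, others in (("A", A, [B]), ("B", B, [A])):
    p = skyline_probability(obj, others, ("price",))
    print(name, "in skyline with probability", p)   # keep if p >= user threshold
```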

7.
Knowledge Representation in KDD Based on Linguistic Atoms

8.
3GPP Long Term Evolution (LTE) is a promising candidate for the next-generation wireless network, which is expected to achieve high spectrum efficiency by using advanced physical layer techniques and flat network structures. However, the LTE network still faces the problem of load imbalance, as in GSM/WCDMA networks, and this may cause significant deterioration of system performance. To deal with this problem, mobility load balancing (MLB) has been proposed as an important use case in the 3GPP self-organizing network (SON), in which the serving cell of a user can be selected to achieve load balancing rather than necessarily being the cell with the maximum received power. Furthermore, the LTE network aims to serve users with different quality-of-service (QoS) requirements, and the network-wide objective function for load balancing differs for different kinds of users. Thus, in this paper, a unified algorithm is proposed for MLB in the LTE network. The load balancing problem is first formulated as an optimization problem whose optimizing variables are the cell-user connections. The complexity and overhead of the optimal solution are then analyzed, and a practical, distributed algorithm is given. The proposed algorithm is evaluated for users with different kinds of QoS requirements, i.e., guaranteed bit rate (GBR) users with the load balance index as the objective function and non-GBR (nGBR) users with total utility as the objective function. Simulation results show that the proposed algorithm leads to a significantly more balanced load distribution, which decreases the new-call blocking rate for GBR users and improves cell-edge throughput for nGBR users, at the cost of only a slight deterioration in total throughput.
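A hedged sketch of the optimization flavour: cell loads are scored with Jain's fairness index as a load balance index, and users are greedily re-assigned from their serving cell to a candidate neighbour whenever the move improves the index. This is a centralized toy version, not the paper's distributed algorithm, and all names and numbers are illustrative.

```python
def balance_index(loads):
    """Jain's fairness index over per-cell loads: 1.0 means perfectly balanced."""
    vals = list(loads.values())
    s, s2 = sum(vals), sum(v * v for v in vals)
    return (s * s) / (len(vals) * s2) if s2 else 1.0

def greedy_mlb(assignment, user_load, candidates):
    """Repeatedly move one user from its serving cell to a candidate neighbour cell
    whenever the move raises the balance index; stops at a local optimum."""
    cells = {c for cs in candidates.values() for c in cs} | set(assignment.values())
    loads = {c: 0.0 for c in cells}
    for u, c in assignment.items():
        loads[c] += user_load[u]
    improved = True
    while improved:
        improved = False
        for u, c in list(assignment.items()):
            for c2 in candidates[u]:
                if c2 == c:
                    continue
                trial = dict(loads)
                trial[c] -= user_load[u]
                trial[c2] += user_load[u]
                if balance_index(trial) > balance_index(loads) + 1e-9:
                    assignment[u], loads = c2, trial
                    improved, c = True, c2
    return assignment, balance_index(loads)

# toy scenario: cellA is overloaded; u2 and u3 are edge users that cellB can serve
assignment = {"u1": "cellA", "u2": "cellA", "u3": "cellA", "u4": "cellB"}
user_load = {"u1": 0.4, "u2": 0.3, "u3": 0.2, "u4": 0.1}
candidates = {"u1": ["cellA"], "u2": ["cellA", "cellB"], "u3": ["cellA", "cellB"], "u4": ["cellB"]}
print(greedy_mlb(assignment, user_load, candidates))
```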

9.
Semistructured data lack any fixed and rigid schema, even though some implicit structure typically appears in the data. The huge number of on-line applications makes it important and imperative to mine the schema of semistructured data, both for users (e.g., to gather useful information and facilitate querying) and for systems (e.g., to optimize access). The critical problem is to discover the hidden structure in the semistructured data. Current methods for extracting Web data structure are either general, independent of the application background, or bound to a concrete environment such as HTML or XML; both, however, face expensive cost and the difficulty of keeping up with the frequent and complicated changes of Web data. In this paper, the problem of incrementally mining the schema of semistructured data after updates to the raw data is discussed. An algorithm for incrementally mining the schema of semistructured data is provided, and experimental results are given which show that incremental mining of semistructured data is more efficient than non-incremental mining.
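A minimal sketch of the incremental idea on JSON-like semistructured records: the schema is kept as a map from label paths to observed value types, and each newly arrived record is folded into it without rescanning the raw data. The path/type representation is an assumption for illustration, not the paper's schema model.

```python
def paths(record, prefix=()):
    """Yield (label path, value type name) pairs for a nested dict/list record."""
    if isinstance(record, dict):
        for k, v in record.items():
            yield from paths(v, prefix + (k,))
    elif isinstance(record, list):
        for v in record:
            yield from paths(v, prefix + ("*",))
    else:
        yield prefix, type(record).__name__

def update_schema(schema, record):
    """Incremental step: merge one new record's paths into the existing schema,
    touching only the paths that record actually contains."""
    for path, t in paths(record):
        schema.setdefault(path, set()).add(t)
    return schema

schema = {}
for rec in ({"title": "a", "authors": ["x", "y"]},
            {"title": "b", "year": 2000, "authors": [{"name": "z"}]}):
    update_schema(schema, rec)       # each record arrives later; no full re-scan needed
for path, types in sorted(schema.items()):
    print("/".join(path), sorted(types))
```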

10.
张伟 《计算机科学》2003,30(11):56-57
Today, the search engine is the most commonly used tool for Web information retrieval, while data mining can discover knowledge in large datasets. In the era of information and digital media, Web data mining is becoming one of the hottest topics. By combining information retrieval technology with data mining technology, a prototype search engine is designed and implemented in this paper. It groups Web search results semantically, online, and in a tree structure, in order to help users find relevant Web information more easily and quickly.

11.
In this paper, we present reduction algorithms based on the principle of Skowron's discernibility matrix: the ordered attributes method. The completeness of the algorithms for the Pawlak reduct and their uniqueness for a given order of the attributes are proved. Since a discernibility matrix requires memory of size |U|^2, where U is a universe of objects, it would be impossible to apply these algorithms directly to a massive object set. To solve this problem, a so-called quasi-discernibility matrix and two reduction algorithms are proposed. Although the proposed algorithms are incomplete for the Pawlak reduct, their optimal paradigms ensure completeness as long as certain conditions are satisfied. Finally, we consider the problem of reduction for distributed object sets.

12.
Kernel Projection Algorithm for Large-Scale SVM Problems
The Support Vector Machine (SVM) has become a very effective method in statistical machine learning, and it has been proved that training an SVM amounts to solving the Nearest Point Pair Problem (NPP) between two disjoint closed convex sets. Keerthi later pointed out that it is difficult to apply classical geometric algorithms directly to SVM and therefore designed a new geometric algorithm for SVM. In this article, a new algorithm for solving SVM geometrically, the Kernel Projection Algorithm, is presented based on a theorem on fixed points of the projection mapping. The new algorithm makes it easy to apply classical geometric algorithms to SVM and is more understandable than Keerthi's. Experiments show that the new algorithm can also handle large-scale SVM problems. Geometric algorithms for SVM, such as Keerthi's, require the two closed convex sets to be disjoint; otherwise the algorithms are meaningless. In this article, this requirement is guaranteed in theory by using the theoretical result on universal kernel functions.
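The geometric view above reduces SVM training to a nearest point pair problem between the convex hulls of the two classes in feature space. The sketch below shows the one computation such algorithms rely on: the squared distance between two convex combinations evaluated purely through the kernel. The RBF kernel, uniform starting coefficients, and toy data are assumptions; a Gilbert/SK-style or projection-based solver would iteratively refine the coefficients.

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||X[i] - Y[j]||^2); RBF kernels
    are among the universal kernels that keep the two hulls separable in feature space."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def hull_distance2(alpha, beta, Xp, Xn, kernel=rbf):
    """Squared feature-space distance between the convex combinations
    sum_i alpha_i phi(Xp_i) and sum_j beta_j phi(Xn_j), using kernels only:
    a'K(Xp,Xp)a - 2 a'K(Xp,Xn)b + b'K(Xn,Xn)b."""
    return (alpha @ kernel(Xp, Xp) @ alpha
            - 2.0 * alpha @ kernel(Xp, Xn) @ beta
            + beta @ kernel(Xn, Xn) @ beta)

# toy two-class data; uniform convex coefficients as a starting point that an
# NPP solver would then refine towards the true nearest point pair
Xp = np.array([[0.0, 0.0], [1.0, 0.0]])
Xn = np.array([[3.0, 3.0], [4.0, 3.0]])
alpha = np.full(len(Xp), 1.0 / len(Xp))
beta = np.full(len(Xn), 1.0 / len(Xn))
print(hull_distance2(alpha, beta, Xp, Xn))
```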

13.
In this paper, we present a novel framework for personalized retrieval of sports video, which includes two research tasks: semantic annotation and user preference acquisition. For semantic annotation, web-casting texts corresponding to sports videos are first captured from webpages using data region segmentation and labeling. Incorporating the text, we detect events in the sports video and generate video event clips. These clips are annotated with the semantics extracted from the web-casting texts and indexed in a sports video database. Based on this annotation, the clips can be retrieved by different semantic attributes according to the user preference. For user preference acquisition, we utilize click-through data as feedback from the user. Relevance feedback is applied to the text annotation and visual features to infer the intention and points of interest of the user. A user preference model is learned to re-rank the initial results. Experiments conducted on broadcast soccer and basketball videos show an encouraging performance of the proposed method.
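A hedged sketch of the user-preference step: clicked clips serve as implicit positive feedback, per-attribute preference weights are estimated from the clicks, and the initial result list is re-scored. The counting scheme and the clip attributes are illustrative assumptions, not the paper's learned model.

```python
from collections import Counter

def learn_preference(clicked_clips, attrs):
    """Count how often each semantic attribute value appears among clicked clips;
    the normalised counts act as a simple user preference model."""
    pref = {a: Counter(c[a] for c in clicked_clips) for a in attrs}
    return {a: {v: n / sum(cnt.values()) for v, n in cnt.items()} for a, cnt in pref.items()}

def rerank(results, pref, attrs):
    """Re-order the initial retrieval results by how well each clip's annotation
    matches the learned preference weights."""
    score = lambda c: sum(pref[a].get(c[a], 0.0) for a in attrs)
    return sorted(results, key=score, reverse=True)

clips = [
    {"id": 1, "event": "goal", "player": "A"},
    {"id": 2, "event": "foul", "player": "B"},
    {"id": 3, "event": "goal", "player": "B"},
]
pref = learn_preference([clips[0]], ["event", "player"])    # the user clicked clip 1
print([c["id"] for c in rerank(clips, pref, ["event", "player"])])  # goal clips first
```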

Yi-Fan Zhang   received the B.E. degree from Southeast University, Nanjing, China, in 2004. He is currently pursuing the Ph.D. degree at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China. In 2007, he was an intern student at the Institute for Infocomm Research, Singapore. Currently he is an intern student at the China-Singapore Institute of Digital Media. His research interests include multimedia, video analysis and pattern recognition. Changsheng Xu   (M’97–SM’99) received the Ph.D. degree from Tsinghua University, Beijing, China, in 1996. Currently he is a Professor at the Institute of Automation, Chinese Academy of Sciences, and Executive Director of the China-Singapore Institute of Digital Media. He was with the Institute for Infocomm Research, Singapore, from 1998 to 2008, and with the National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, from 1996 to 1998. His research interests include multimedia content analysis, indexing and retrieval, digital watermarking, computer vision and pattern recognition. He has published over 150 papers in those areas. Dr. Xu is an Associate Editor of the ACM/Springer Multimedia Systems Journal. He served as Short Paper Co-Chair of ACM Multimedia 2008, General Co-Chair of the 2008 Pacific-Rim Conference on Multimedia (PCM2008) and the 2007 Asia-Pacific Workshop on Visual Information Processing (VIP2007), Program Co-Chair of VIP2006, and Industry Track Chair and Area Chair of the 2007 International Conference on Multimedia Modeling (MMM2007). He has also served as a Technical Program Committee member of major international multimedia conferences, including the ACM Multimedia Conference, the International Conference on Multimedia & Expo, the Pacific-Rim Conference on Multimedia, and the International Conference on Multimedia Modeling. Xiaoyu Zhang   received the B.S. degree in computer science from Nanjing University of Science and Technology in 2005. He is a Ph.D. candidate at the National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, and is currently a student at the China-Singapore Institute of Digital Media. His research interests include image retrieval, video analysis, and machine learning. Hanqing Lu   (M’05–SM’06) received the Ph.D. degree from Huazhong University of Science and Technology, Wuhan, China, in 1992. Currently he is a Professor at the Institute of Automation, Chinese Academy of Sciences. His research interests include image similarity measures, video analysis, object recognition and tracking. He has published more than 100 papers in those areas.

14.
The information dissemination model is becoming increasingly important in wide-area information systems. In this model, a user subscribes to an information dissemination service by submitting profiles that describe his interests. There have been several simple kinds of information dissemination services on the Internet, such as mailing lists, but the problem is that they provide only a crude granularity of interest matching: a user whose information need does not exactly match certain lists will receive either too many irrelevant or too few relevant messages. This paper presents a personalized information dissemination model based on HowNet, which uses a Concept Network-Views (CN-V) model to support information filtering, user interest modeling and information recommendation. A Concept Network is constructed from the user's profiles and the content of documents; it describes concepts and their relations in the content and assigns different weights to these concepts. Usually the Concept Network is not well arranged and it is hard to find useful relations in it, so several views are extracted from it to represent the important relations explicitly.
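A small, hedged illustration of the Concept Network idea: concepts become weighted nodes, co-occurrence within a document becomes a weighted edge, and a "view" is extracted as the neighbourhood of a focus concept above a weight threshold. The co-occurrence weighting and the threshold are assumptions standing in for the CN-V construction, not the paper's definitions.

```python
from collections import Counter
from itertools import combinations

def build_concept_network(docs):
    """Nodes are concepts weighted by document frequency; edges connect concepts
    that co-occur in the same document, weighted by co-occurrence count."""
    nodes, edges = Counter(), Counter()
    for concepts in docs:
        nodes.update(set(concepts))
        edges.update(frozenset(p) for p in combinations(sorted(set(concepts)), 2))
    return nodes, edges

def view(nodes, edges, focus, min_weight=2):
    """A 'view' makes the important relations around one concept explicit:
    neighbours linked to it by an edge of at least min_weight."""
    return {c: w for e, w in edges.items() if focus in e and w >= min_weight
            for c in e if c != focus}

docs = [["data mining", "association rule", "retrieval"],
        ["data mining", "association rule"],
        ["retrieval", "user profile"]]
nodes, edges = build_concept_network(docs)
print(view(nodes, edges, "data mining"))   # -> {'association rule': 2}
```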

15.
This paper introduces a new algorithm for mining association rules. The algorithm, RP, counts itemsets of different sizes in the same pass over the database by dividing the database into m partitions. The total number of passes over the database is only (k + 2m - 2)/m, where k is the size of the longest itemset; this is much less than k.

16.
A technique for synthesizing multilevel nonparametric pattern recognition systems on the basis of learning-sample decomposition principles and parallel computing technology is proposed. This basis provides effective processing of high-dimensional information. Aleksandr Vasil’evich Lapko was born in 1949 and graduated from Frunze Polytechnic Institute in 1971. He has been a doctor of technical sciences since 1990 and is a leading researcher at the Institute of Computational Modeling of the Siberian Branch of the Russian Academy of Sciences. His scientific interests include nonparametric statistics, pattern recognition systems, and the design and optimization of indefinite systems. He is the author of 223 publications, including 13 monographs. He is chairman of the Krasnoyarsk regional department of the Pattern Recognition and Image Analysis Association and an Honored Science Worker of the Russian Federation. Vasilii Aleksandrovich Lapko was born in 1974 and graduated from Krasnoyarsk State Technical University in 1996. He has been a doctor of technical sciences since 2004 in systems analysis, management, and information processing. He is a senior researcher at the Institute of Computational Modeling of the Siberian Branch of the Russian Academy of Sciences. His scientific interests include nonparametric statistics, pattern recognition systems, the design of indefinite systems, and collective evaluation methods. He is the author of 105 publications, including 4 monographs. In 2005 he was awarded a medal by the Russian Academy of Sciences for the best scientific publication by young scientists in Informatics, Computer Engineering, and Automation.

17.
Eliciting requirements for a proposed system inevitably involves the problem of handling undesirable information about customers' needs, including inconsistency, vagueness, redundancy, or incompleteness. We term the requirements statements involved in such undesirable information non-canonical software requirements. In this paper, we propose an approach to handling non-canonical software requirements based on Annotated Predicate Calculus (APC). Informally, by defining a special belief lattice appropriate for representing the stakeholders' belief in requirements statements, we construct a new form of APC to formalize requirements specifications. We then show how the APC can be employed to characterize non-canonical requirements. Finally, we show how the approach can be used to handle non-canonical requirements through a case study. Kedian Mu received the B.Sc. degree in applied mathematics from Beijing Institute of Technology, Beijing, China, in 1997, the M.Sc. degree in probability and mathematical statistics from Beijing Institute of Technology in 2000, and the Ph.D. in applied mathematics from Peking University, Beijing, China, in 2003. From 2003 to 2005, he was a postdoctoral researcher at the Institute of Computing Technology, Chinese Academy of Sciences, China. He is currently an assistant professor at the School of Mathematical Sciences, Peking University, Beijing, China. His research interests include uncertain reasoning in artificial intelligence, knowledge engineering and science, and requirements engineering. Zhi Jin was awarded the B.Sc. in computer science from Zhejiang University, Hangzhou, China, in 1984, and studied for her M.Sc. in computer science (expert systems) and her Ph.D. in computer science (artificial intelligence) at the National Defence University of Technology, Changsha, China, where she was awarded the Ph.D. in 1992. She is a senior member of the China Computer Federation and is currently a professor at the Academy of Mathematics and System Sciences, Chinese Academy of Sciences. Her research interests include knowledge-based systems, artificial intelligence, requirements engineering, and ontology engineering. Her current research focuses on ontology-based requirements elicitation and analysis. She has published about 60 papers and co-authored one book. Ruqian Lu is a professor of computer science at the Institute of Mathematics, Chinese Academy of Sciences. His research interests include artificial intelligence, knowledge engineering and knowledge-based software engineering. He designed the “Tian Ma” software systems, which have been widely applied in more than 20 fields, including national defense and the economy. He has won two first-class awards from the Chinese Academy of Sciences and a national second-class prize from the Ministry of Science and Technology. He has also won the sixth Hua Lookeng Prize for Mathematics. Yan Peng received the B.Sc. degree in software from Jilin University, Changchun, China, in 1992. From June 2002 to December 2005, he studied for his M.E. in software engineering at the College of Software Engineering, Graduate School of the Chinese Academy of Sciences, Beijing, China, and was awarded the M.E. degree in 2006. He is currently responsible for CRM (customer relationship management) and BI (business intelligence) projects at BONG. His research interests include customer relationship management, business intelligence, data mining, software engineering and requirements engineering.

18.
Rough set theory is a mathematical tool for dealing with the uncertainty and vagueness of decision systems, and it has been applied successfully in many fields. It is used to identify a reduct of the set of all attributes of the decision system. The reduct is used as a preprocessing step for classification of the decision system, in order to bring out potential patterns, association rules or knowledge through data mining techniques. Several researchers have contributed a variety of algorithms for computing reducts, considering different cases such as inconsistency, missing attribute values and multiple decision attributes of the decision system. This paper reviews techniques for dimensionality reduction in the rough set theory environment. Further, hybridizations of rough sets with fuzzy sets, neural networks and metaheuristic algorithms are also reviewed. The performance of the algorithms is discussed in connection with classification.
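As a concrete baseline for the reduct computation the review discusses, here is a hedged sketch of the widely used dependency-degree (positive-region) measure with QuickReduct-style greedy forward selection; the hybrid, inconsistent-data and missing-value variants mentioned above are outside this sketch, and the toy table is invented.

```python
from collections import defaultdict

def dependency(table, attrs, dec):
    """gamma(attrs, dec) = |positive region| / |U|: the fraction of objects whose
    attrs-equivalence class is consistent on the decision attribute."""
    classes = defaultdict(list)
    for row in table:
        classes[tuple(row[a] for a in attrs)].append(row[dec])
    pos = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return pos / len(table)

def quick_reduct(table, cond_attrs, dec):
    """Greedy forward selection: keep adding the attribute that raises the
    dependency degree most, until it matches that of the full attribute set."""
    full = dependency(table, cond_attrs, dec)
    reduct = []
    while dependency(table, reduct, dec) < full:
        best = max((a for a in cond_attrs if a not in reduct),
                   key=lambda a: dependency(table, reduct + [a], dec))
        reduct.append(best)
    return reduct

table = [  # toy decision system with condition attributes a, b, c and decision d
    {"a": 0, "b": 0, "c": 1, "d": "yes"},
    {"a": 0, "b": 1, "c": 1, "d": "no"},
    {"a": 1, "b": 0, "c": 0, "d": "no"},
    {"a": 1, "b": 1, "c": 0, "d": "no"},
]
print(quick_reduct(table, ["a", "b", "c"], "d"))   # -> ['a', 'b']
```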
