Similar literature
 Found 20 similar documents; search took 46 ms
1.
Mining frequent patterns from datasets is one of the key successes of data mining research. Currently, most studies focus on datasets in which the elements are independent, such as the items in a market basket. However, objects in the real world often have close relationships with each other. How to extract frequent patterns from these relations is the objective of this paper. The authors use graphs to model the relations and select a simple type for analysis. Combining graph theory with algorithms for generating frequent patterns, a new algorithm called Topology, which can mine these graphs efficiently, is proposed. The performance of the algorithm is evaluated through experiments with synthetic datasets and real data. The experimental results show that Topology performs well. At the end of the paper, potential improvements are mentioned.

2.
This paper introduces the design and implementation of BCL-3, a high performance low-level communication software running on a cluster of SMPs (CLUMPS) called DAWNING-3000. BCL-3 provides flexible and sufficient functionality to fulfill the communication requirements of fundamental system software developed for DAWNING-3000 while guaranteeing security, scalability, and reliability. Important features of BCL-3 are presented in the paper, including special support for SMP and heterogeneous network environments, semi-user-level communication, reliable and ordered data transfer, and scalable flow control. The performance evaluation of BCL-3 over Myrinet is also given.

3.
This paper introduces a new algorithm for mining association rules. The algorithm, RP, counts itemsets of different sizes in the same pass over the database by dividing the database into m partitions. The total number of passes over the database is only (k+2m-2)/m, where k is the size of the longest itemset. This is much less than k.

4.
Parallelizing compilers have made great progress in recent years. However, there still remains a gap between the current ability of parallelizing compilers and their final goals. To achieve maximum parallelism, run-time techniques have been used in parallelizing compilers during the last few years. First, this paper presents a basic run-time privatization method. Under the definition of run-time dead code, backward data-flow information must be used. Proteus Test, which can use backward information at run time, is then presented to exploit more dynamic parallelism. Also, a variation of Proteus Test, the Advanced Proteus Test, is offered to achieve partial parallelism. Proteus Test was implemented on the parallelizing compiler AFT. At the end of this paper, the program fppp.f of the Spec95fp Benchmark is taken as an example to show the effectiveness of Proteus Test.

5.
Classification is an important technique in data mining. The decision trees built by most existing classification algorithms commonly feature over-branching, which leads to poor efficiency in the subsequent classification period. In this paper, we present a new value-oriented classification method, which aims at building accurate, properly sized decision trees while reducing over-branching as much as possible, based on the concepts of frequent-pattern-node and exceptive-child-node. The experiments show that, when relevance analysis is used as pre-processing, our classification method can, without loss of accuracy, eliminate over-branching in decision trees more effectively and efficiently than other algorithms.

6.
The study of database technologies, or more generally, the technologies of data and information management, is an important and active research field. Recently, many exciting results have been reported. In this fast-growing field, Chinese researchers play increasingly active roles. Research papers from Chinese scholars, both in China and abroad, appear in prestigious academic forums. In this paper, we, nine young Chinese researchers working in the United States, present concise surveys and report our recent progress in the selected fields that we are working on. Although the paper covers only a small number of topics and the selection of topics is far from balanced, we hope that such an effort will attract more researchers, especially those in China, to enter the frontiers of database research and promote collaborations. For obvious reasons, the authors are listed alphabetically, while the sections are arranged in the order of the author list.

7.
1 Introduction. Let G = (V, E) be a connected, undirected graph with a weight function W from the set E of edges to the set of reals. A spanning tree is a subgraph T = (V, ET), ET ⊆ E, of G such that T is a tree. The weight W(T) of a spanning tree T is the sum of the weights of its edges. A spanning tree with the smallest possible weight is called a minimum spanning tree (MST) of G. Computing an MST of a given weighted graph is an important problem that arises in many applications. For this …
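The MST definition above can be illustrated with a minimal sketch. It uses Kruskal's algorithm, a standard textbook method chosen here only for illustration; the abstract is truncated and does not say which algorithm the paper itself studies. The function name `minimum_spanning_tree` and the `(weight, u, v)` edge format are assumptions made for this example.

```python
def minimum_spanning_tree(num_vertices, edges):
    """Kruskal's algorithm (illustrative; not necessarily the paper's method).

    edges: list of (weight, u, v) tuples with vertices numbered 0..num_vertices-1.
    Returns (total_weight, tree_edges) for a connected, undirected graph.
    """
    parent = list(range(num_vertices))  # union-find forest, one set per vertex

    def find(x):
        # Follow parent links to the set representative, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total_weight = 0
    tree_edges = []
    for weight, u, v in sorted(edges):  # consider the lightest edges first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # edge joins two components: keep it
            parent[root_u] = root_v
            tree_edges.append((u, v, weight))
            total_weight += weight
    return total_weight, tree_edges


# Hypothetical 4-vertex graph used only to exercise the function.
example_edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 1, 3), (3, 2, 3)]
mst_weight, mst_edges = minimum_spanning_tree(4, example_edges)
```

For a connected graph with |V| vertices, the returned tree always has |V| - 1 edges, matching the definition of a spanning tree.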

8.
Tracking clusters in evolving data streams over sliding windows   Total citations: 6, self-citations: 4, citations by others: 2
Mining data streams poses great challenges due to limited memory availability and real-time query response requirements. Clustering an evolving data stream is especially interesting because it captures not only the changing distribution of clusters but also the evolving behaviors of individual clusters. In this paper, we present a novel method for tracking the evolution of clusters over sliding windows. In our SWClustering algorithm, we combine the exponential histogram with temporal cluster features to form a novel data structure, the Exponential Histogram of Cluster Features (EHCF). The exponential histogram is used to handle the in-cluster evolution, and the temporal cluster features represent the change of the cluster distribution. Our approach has several advantages over existing methods: (1) the quality of the clusters is improved because the EHCF captures the distribution of recent records precisely; (2) compared with previous methods, the mechanism employed to adaptively maintain the in-cluster synopsis can track the cluster evolution better, while consuming much less memory; (3) the EHCF provides a flexible framework for analyzing the cluster evolution and tracking a specific cluster efficiently without interfering with other clusters, thus reducing the consumption of computing resources for data stream clustering. Both the theoretical analysis and extensive experiments show the effectiveness and efficiency of the proposed method. Aoying Zhou is currently a Professor in Computer Science at Fudan University, Shanghai, P.R. China. He received his Bachelor's and Master's degrees in Computer Science from Sichuan University in Chengdu, Sichuan, P.R. China in 1985 and 1988, respectively, and his Ph.D. degree from Fudan University in 1993. He has served as a member or chair of the program committee for many international conferences, such as WWW, SIGMOD, VLDB, EDBT, ICDCS, ER, DASFAA, PAKDD, and WAIM. 
His papers have been published in ACM SIGMOD, VLDB, ICDE, and several other international journals. His research interests include Data mining and knowledge discovery, XML data management, Web mining and searching, data stream analysis and processing, peer-to-peer computing. Feng Cao is currently an R&D engineer in IBM China Research Laboratories. He received a B.E. degree from Xi'an Jiao Tong University, Xi'an, P.R. China, in 2000 and an M.E. degree from Huazhong University of Science and Technology, Wuhan, P.R. China, in 2003. From October 2004 to March 2005, he worked in Fudan-NUS Competency Center for Peer-to-Peer Computing, Singapore. In 2006, he received his Ph.D. degree from Fudan University, Shanghai, P.R. China. His current research interests include data mining and data stream. Weining Qian is currently an Assistant Professor in computer science at Fudan University, Shanghai, P.R. China. He received his M.S. and Ph.D. degree in computer science from Fudan University in 2001 and 2004, respectively. He is supported by Shanghai Rising-Star Program under Grant No. 04QMX1404 and National Natural Science Foundation of China (NSFC) under Grant No. 60673134. He served as the program committee member of several international conferences, including DASFAA 2006, 2007 and 2008, APWeb/WAIM 2007, INFOSCALE 2007, and ECDM 2007. His papers have been published in ICDE, SIAM DM, and CIKM. His research interests include data stream query processing and mining, and large-scale distributed computing for database applications. Cheqing Jin is currently an Assistant Professor in Computer Science at East China University of Science and Technology. He received his Bachelor and Master degrees in Computer Science from Zhejiang University in Hangzhou, P.R. China in 1999 and 2002, respectively, and the Ph.D. degree from Fudan University, Shanghai, P.R. China. He worked as a Research Assistant at E-business Technology Institute, the Hong Kong University from December 2003 to May 2004. 
His current research interests include data mining and data stream.

9.
Combinatorial optimization problems are found in many application fields such as computer science, engineering and economics. In this paper, a new efficient meta-heuristic, Intersection-Based Scaling (IBS), is proposed for combinatorial optimization problems. The main idea of IBS is to scale the size of the instance based on the intersection of some local optima, and to simplify the search space by extracting the intersection from the instance, which makes the search more efficient. The combination of IBS with local search heuristics for different combinatorial optimization problems, such as the Traveling Salesman Problem (TSP) and the Graph Partitioning Problem (GPP), is studied, and comparisons are made with some of the best heuristic and meta-heuristic algorithms. IBS is found to significantly improve the performance of existing local search heuristics and to significantly outperform the best known algorithms.

10.
In this paper, we study the problem of efficiently computing k-medians over high-dimensional and high speed data streams. The focus of this paper is on minimizing CPU time to handle high speed data streams, on top of the requirements of high accuracy and small memory. Our work is motivated by the following observation: the existing algorithms have similar approximation behaviors in practice, even though they make noticeably different worst case theoretical guarantees. The underlying reason is that, in order to achieve a high approximation level with the smallest possible memory, they need rather complex techniques to maintain a sketch along the time dimension, using some existing off-line clustering algorithms. Those clustering algorithms cannot guarantee the optimal clustering result over data segments in a data stream but accumulate errors over segments, which makes most algorithms behave the same in terms of approximation level in practice. We propose a new grid-based approach which divides the entire data set into cells (not along the time dimension). We can achieve a high approximation level based on a novel concept called (1 - ε)-dominant. We further extend the method to the data stream context by leveraging a density-based heuristic and frequent item mining techniques over data streams. We only need to apply an existing clustering algorithm once to compute k-medians, on demand, which reduces CPU time significantly. We conducted extensive experimental studies, which show that our approaches outperform other well-known approaches.

11.
OpenMP on Networks of Workstations for Software DSMs   Total citations: 3, self-citations: 0, citations by others: 3
This paper describes the implementation of a sizable subset of OpenMP on networks of workstations (NOWs). The source-to-source OpenMP compiler (AutoPar) is used for the JIAJIA home-based shared virtual memory system (SVM). The paper suggests some simple modifications and extensions to the OpenMP standard to account for the differences between SVM and SMP (symmetric multiprocessor), at which the OpenMP specification is aimed. The OpenMP translator is based on an automatic parallelization compiler, so it is possible to check the correctness of the semantics of OpenMP programs, which is not required in an OpenMP-compliant implementation. AutoPar is measured for five applications, including both programs from the NAS Parallel Benchmarks and real applications, on a cluster of eight Pentium Ⅱ PCs connected by a 100 Mbps switched Ethernet. The evaluation shows that parallelization by annotating OpenMP directives is simple and the performance of the generated JIAJIA code is still acceptable on NOWs.

12.
1 Introduction. Multicast communication, which refers to the delivery of a message from a single source node to a number of destination nodes, is frequently used in distributed-memory parallel computer systems and networks [1]. Efficient implementation of multicast communication is critical to the performance of message-based scalable parallel computers and switch-based high speed networks. Switch-based networks or indirect networks, based on some variations of multistage interconnection networks (MINs), have emerged as a promising network architecture for construct…

13.
Finding centric local outliers in categorical/numerical spaces   Total citations: 2, self-citations: 0, citations by others: 2
Outlier detection techniques are widely used in many applications such as credit-card fraud detection, monitoring criminal activities in electronic commerce, etc. These applications attempt to identify outliers as noises, exceptions, or objects around the border. The existing density-based local outlier detection assigns the degree to which an object is an outlier in a numerical space. In this paper, we propose a novel mutual-reinforcement-based local outlier detection approach. Instead of detecting local outliers as noise, we attempt to identify local outliers in the center, where they are similar to some clusters of objects on one hand, and are unique on the other. Our technique can be used for bank investment to identify a unique body, similar to many good competitors, in which to invest. We attempt to detect local outliers in categorical, ordinal as well as numerical data. In categorical data, the challenge is that there are many similar but different ways to specify relationships among the data items. Our mutual-reinforcement-based approach is stable, with similar but different user-defined relationships. Our technique can reduce the burden for users to determine the relationships among data items, and find the explanations why the outliers are found. We conducted extensive experimental studies using real datasets. Jeffrey Xu Yu received his B.E., M.E. and Ph.D. in computer science, from the University of Tsukuba, Japan, in 1985, 1987 and 1990, respectively. Jeffrey Xu Yu was a research fellow in the Institute of Information Sciences and Electronics, University of Tsukuba (Apr. 1990–Mar. 1991), and held teaching positions in the Institute of Information Sciences and Electronics, University of Tsukuba (Apr. 1991–July 1992) and in the Department of Computer Science, Australian National University (July 1992–June 2000). Currently he is an Associate Professor in the Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong. 
His major research interests include data mining, data stream mining/processing, XML query processing and optimization, data warehouse, on-line analytical processing, and design and implementation of database management systems. Weining Qian is currently an assistant professor of computer science at Fudan University, Shanghai, China. He received his M.S. and Ph.D. degrees in computer science from Fudan University in 2001 and 2004, respectively. He was supported by a Microsoft Research Fellowship when he was doing the research presented in this paper, and he is supported by the Shanghai Rising Star Program. His research interests include data mining for very large databases, data stream query processing and mining, and peer-to-peer computing. Hongjun Lu received his B.Sc. from Tsinghua University, China, and M.Sc. and Ph.D. from the Department of Computer Science, University of Wisconsin–Madison. He worked as an engineer in the Chinese Academy of Space Technology, as a principal research scientist in the Computer Science Center of Honeywell Inc., Minnesota, USA (1985–1987), and as a professor at the School of Computing of the National University of Singapore (1987–2000), and is a full professor of the Hong Kong University of Science and Technology. His research interests are in data/knowledge-base management systems with an emphasis on query processing and optimization, physical database design, and database performance. Hongjun Lu is currently a trustee of the VLDB Endowment, an associate editor of the IEEE Transactions on Knowledge and Data Engineering (TKDE), and a member of the review board of the Journal of Database Management. He served as a member of the ACM SIGMOD Advisory Board in 1998–2002. Aoying Zhou, born in 1965, is currently a professor of computer science at Fudan University, Shanghai, China. He received his Bachelor's and Master's degrees in Computer Science from Sichuan University in Chengdu, Sichuan, China in 1985 and 1988, respectively, and a Ph.D. degree from Fudan University in 1993. He has served as a member or chair of the program committees for many international conferences, such as VLDB, ER, DASFAA, and WAIM. His papers have been published in ACM SIGMOD, VLDB, ICDE and some international journals. His research interests include data mining and knowledge discovery, XML data management, web query and searching, data stream analysis and processing, and peer-to-peer computing.

14.
A Novel Computer Architecture to Prevent Destruction by Viruses   Total citations: 1, self-citations: 0, citations by others: 1
In today's Internet computing world, illegal activities by crackers pose a serious threat to computer security. It is well known that computer viruses, Trojan horses and other intrusive programs may cause severe and often catastrophic consequences. This paper proposes a novel secure computer architecture based on security-code. Every instruction/data word is tagged with a security-code denoting its security level. External programs and data are automatically tagged with a security-code by hardware when entering a computer system. An instruction with a lower security-code cannot run or process instructions/data with a higher security level. The security-code cannot be modified by normal instructions. With minor hardware overhead, the new architecture can effectively protect the main computer system from destruction or theft by intrusive programs such as computer viruses. For most PC systems, the overhead is a 1-bit increase in word length for the registers, the memory and the hard disk.
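The access rule in this abstract can be sketched in software, although the paper describes a hardware mechanism. The example below is a minimal model under stated assumptions: the names `Word`, `tag_external` and `may_access` are hypothetical, and the 1-bit encoding (1 for trusted internal words, 0 for external ones) is an assumption consistent with the 1-bit overhead mentioned above, not a detail confirmed by the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the tag cannot be reassigned after creation
class Word:
    """An instruction or data word carrying a security-code (hypothetical model)."""
    value: int
    security_code: int  # assumed encoding: 1 = trusted internal, 0 = external


def tag_external(value: int) -> Word:
    """Model the hardware step that tags incoming external words as low security."""
    return Word(value=value, security_code=0)


def may_access(instruction: Word, target: Word) -> bool:
    """An instruction may only touch words at or below its own security level."""
    return instruction.security_code >= target.security_code


# Hypothetical words used only to exercise the rule.
trusted_instruction = Word(value=0x90, security_code=1)
external_instruction = tag_external(0x41)
system_data = Word(value=0xFF, security_code=1)
```

Here `may_access(trusted_instruction, external_instruction)` is allowed, while `may_access(external_instruction, system_data)` is refused, which mirrors the rule that lower-level instructions cannot process higher-level words.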

15.
In this paper, a novel technique adopted in HarkMan is introduced. HarkMan is a keyword spotter designed to automatically spot the given words of a vocabulary-independent task in unconstrained Chinese telephone speech. The speaking manner and the number of keywords are not limited. This paper focuses on the novel technique, which addresses acoustic modeling, the keyword spotting network, search strategies, robustness, and rejection. The underlying technologies used in HarkMan and presented in this paper are useful not only for keyword spotting but also for continuous speech recognition. The system has achieved a figure-of-merit value of over 90%.

16.
In Part 2 of the advanced Audio Video coding Standard (AVS-P2), many efficient coding tools are adopted in motion compensation, such as new motion vector prediction, symmetric matching, and quarter-precision interpolation. However, these new features enormously increase the computational complexity and the memory bandwidth requirement, which makes motion compensation a difficult component in the implementation of the AVS HDTV decoder. This paper proposes an efficient motion compensation architecture for the AVS-P2 video standard up to Level 6.2 of the Jizhun Profile. It has a macroblock-level pipelined structure which consists of an MV predictor unit, a reference fetch unit and a pixel interpolation unit. The proposed architecture exploits the parallelism in the AVS motion compensation algorithm to speed up operations and uses a dedicated design to optimize memory access. It has been integrated in a prototype chip fabricated with TSMC 0.18-μm CMOS technology, and the experimental results show that this architecture can achieve real-time AVS-P2 decoding for HDTV 1080i (1920 × 1088, 4:2:0, 60 fields/s) video. The design can work at a frequency of 148.5 MHz and the total gate count is about 225K.

17.
Information service plays a key role in grid systems, handling the resource discovery and management process. Existing information service architectures suffer from poor scalability, long search response time, and large traffic overhead. In this paper, we propose a service club mechanism, called S-Club, for efficient service discovery. In S-Club, an overlay is built on the existing Grid Information Service (GIS) mesh network of CROWN, so that GISs are organized as service clubs. Each club serves a certain type of service, while each GIS may join one or more clubs. S-Club is adopted in our CROWN Grid and the performance of S-Club is evaluated by comprehensive simulations. The results show that the S-Club scheme significantly improves search performance and outperforms existing approaches. Chunming Hu is a research staff member in the Institute of Advanced Computing Technology at the School of Computer Science and Engineering, Beihang University, Beijing, China. He received his B.E. and M.E. from the Department of Computer Science and Engineering at Beihang University. He received the Ph.D. degree from the School of Computer Science and Engineering of Beihang University, Beijing, China, in 2005. His research interests include peer-to-peer and grid computing, distributed systems and software architectures. Yanmin Zhu is a Ph.D. candidate in the Department of Computer Science, Hong Kong University of Science and Technology. He received his B.S. degree in computer science from Xi’an Jiaotong University, Xi’an, China, in 2002. His research interests include grid computing, peer-to-peer networking, pervasive computing and sensor networks. He is a member of the IEEE and the IEEE Computer Society. Jinpeng Huai is a Professor and Vice President of Beihang University. He serves on the Steering Committee for Advanced Computing Technology Subject, the National High-Tech Program (863) as Chief Scientist. 
He is a member of the Consulting Committee of the Central Government’s Information Office, and Chairman of the Expert Committee in both the National e-Government Engineering Taskforce and the National e-Government Standard office. Dr. Huai and his colleagues are leading the key projects in e-Science of the National Science Foundation of China (NSFC) and Sino-UK. He has authored over 100 papers. His research interests include middleware, peer-to-peer (P2P), grid computing, trustworthiness and security. Yunhao Liu received his B.S. degree from the Automation Department of Tsinghua University, China, in 1995, an M.A. degree from Beijing Foreign Studies University, China, in 1997, and an M.S. and a Ph.D. degree in computer science and engineering from Michigan State University in 2003 and 2004, respectively. He is now an assistant professor in the Department of Computer Science and Engineering at Hong Kong University of Science and Technology. His research interests include peer-to-peer computing, pervasive computing, distributed systems, network security, grid computing, and high-speed networking. He is a senior member of the IEEE Computer Society. Lionel M. Ni is chair professor and head of the Computer Science and Engineering Department at Hong Kong University of Science and Technology. Lionel M. Ni received the Ph.D. degree in electrical and computer engineering from Purdue University, West Lafayette, Indiana, in 1980. He was a professor of computer science and engineering at Michigan State University from 1981 to 2003, where he received the Distinguished Faculty Award in 1994. His research interests include parallel architectures, distributed systems, high-speed networks, and pervasive computing. A fellow of the IEEE and the IEEE Computer Society, he has chaired many professional conferences and has received a number of awards for authoring outstanding papers.

18.
Ontology-Based Semantic Cache in AOKB   Total citations: 2, self-citations: 0, citations by others: 2
When querying a large-scale knowledge base, a major technique for improving performance is to preload knowledge to minimize the number of round trips to the knowledge base. In this paper, an ontology-based semantic cache is proposed for an agent and ontology-oriented knowledge base (AOKB). In AOKB, an ontology is the collection of relationships between a group of knowledge units (agents and/or other sub-ontologies). When some agent A is loaded, its relationships with other knowledge units are examined, and those that have a tight semantic tie with A are preloaded at the same time, including agents and sub-ontologies in the same ontology as A. The preloaded agents and ontologies are saved in a semantic cache located in memory. Test results show that up to a 50% reduction in running time is achieved.
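The preloading idea in this abstract can be sketched as follows. This is a minimal illustration, not the AOKB implementation: the class `SemanticCache`, the numeric tie strengths, and the `threshold` that decides what counts as a "tight semantic tie" are all assumptions introduced for the example, since the abstract does not say how tightness is measured.

```python
class SemanticCache:
    """In-memory cache that preloads semantically related knowledge units.

    Hypothetical model: `knowledge_base` maps unit names to their content, and
    `relations` maps a unit name to {related_unit: tie_strength in [0, 1]}.
    """

    def __init__(self, knowledge_base, relations, threshold=0.5):
        self.knowledge_base = knowledge_base
        self.relations = relations
        self.threshold = threshold  # assumed cut-off for a "tight" tie
        self.cache = {}
        self.round_trips = 0        # counts simulated trips to the knowledge base

    def load(self, name):
        if name in self.cache:      # cache hit: no round trip needed
            return self.cache[name]
        self.round_trips += 1       # one trip fetches the unit and its tight ties
        related = self.relations.get(name, {})
        batch = [name] + [unit for unit, tie in related.items()
                          if tie >= self.threshold]
        for unit in batch:
            self.cache[unit] = self.knowledge_base[unit]
        return self.cache[name]


# Hypothetical knowledge units used only to exercise the cache.
kb = {"A": "agent A", "B": "agent B", "C": "agent C"}
ties = {"A": {"B": 0.9, "C": 0.1}}
cache = SemanticCache(kb, ties)
cache.load("A")                     # also preloads B, which is tightly tied to A
trips_after_b = (cache.load("B"), cache.round_trips)[1]
trips_after_c = (cache.load("C"), cache.round_trips)[1]
```

Loading B after A costs no extra round trip because B was preloaded, while the loosely tied C still requires its own trip.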

19.
20.
Scheduling algorithms based on weakly hard real-time constraints   Total citations: 6, self-citations: 0, citations by others: 6
The problem of scheduling weakly hard real-time tasks is addressed in this paper. The paper first analyzes the characteristics of μ-patterns and weakly hard real-time constraints, then presents two scheduling algorithms, Meet Any Algorithm and Meet Row Algorithm, for weakly hard real-time systems. Unlike traditional algorithms, which are used to guarantee deadlines, Meet Any Algorithm and Meet Row Algorithm can guarantee both deadlines and constraints. Meet Any Algorithm and Meet Row Algorithm try to determine the probabilities of tasks breaking their constraints and raise a task's priority in advance, rather than at the last moment. Simulation results show that these two algorithms are better than other scheduling algorithms dealing with constraints and can largely decrease the worst-case computation time of real-time tasks.
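The kind of constraint these algorithms target can be made concrete with a small checker. The sketch below does not implement Meet Any Algorithm or Meet Row Algorithm; it only tests whether a μ-pattern satisfies a constraint, assuming the usual definitions from the weakly hard real-time literature: a μ-pattern is a sequence of 1s (deadline met) and 0s (deadline missed), "any n in m" requires at least n met deadlines in every window of m consecutive invocations, and "row n in m" requires at least n consecutive met deadlines in every such window. The function names are hypothetical.

```python
def longest_run(window):
    """Length of the longest run of consecutive met deadlines (1s) in a window."""
    best = current = 0
    for met in window:
        current = current + 1 if met else 0
        best = max(best, current)
    return best


def windows(pattern, m):
    """All windows of m consecutive invocations of a mu-pattern."""
    return [pattern[i:i + m] for i in range(len(pattern) - m + 1)]


def meets_any(pattern, n, m):
    """Assumed 'any n in m' constraint: each window has at least n met deadlines."""
    return all(sum(window) >= n for window in windows(pattern, m))


def meets_row(pattern, n, m):
    """Assumed 'row n in m' constraint: each window has n consecutive met deadlines."""
    return all(longest_run(window) >= n for window in windows(pattern, m))


# Hypothetical mu-pattern: two met deadlines followed by one miss, repeated.
mu_pattern = [1, 1, 0, 1, 1, 0, 1, 1]
```

With this pattern, `meets_any(mu_pattern, 2, 3)` holds, but `meets_row(mu_pattern, 2, 3)` fails because the window 1, 0, 1 contains no two consecutive met deadlines; widening the window to four invocations makes the row constraint hold.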


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号