48 results found; search took 125 ms
41.
Zahra Farzanyar Mohammadreza Kangavari Nick Cercone 《Information Processing Letters》2013,113(19-21):793-798
Data-intensive large-scale distributed systems such as peer-to-peer (P2P) networks are finding a large number of applications in social networking, file sharing, etc. Global data mining in such P2P environments may be very costly due to the large scale and the asynchronous nature of P2P networks. The cost further increases in the distributed data stream scenario, where peers rapidly receive a continuous sequence of transactions. In this paper, we develop an efficient local algorithm, P2P-FISM, for discovering the network-wide recent frequent itemsets. The algorithm works in a completely asynchronous manner, imposes low communication overhead (a necessity for scalability), transparently tolerates network topology changes, and quickly adapts to changes in the data stream. The paper presents experimental results to corroborate the theoretical claims.
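To make the notion of "recent frequent itemsets" concrete, the following is a minimal single-peer sketch, not the P2P-FISM algorithm itself. It assumes that "recent" means a fixed-size sliding window over the transaction stream and that support is measured as a fraction of the window; the function name `recent_frequent_itemsets` and the parameters `window_size`, `min_support`, and `max_len` are hypothetical, and no peer-to-peer communication is modelled.

```python
from collections import Counter, deque
from itertools import combinations


def recent_frequent_itemsets(stream, window_size, min_support, max_len=2):
    """Return itemsets (as sorted tuples) that are frequent in the most recent
    `window_size` transactions. Illustrative only: a brute-force count over a
    sliding window at one peer, with no network-wide aggregation."""
    # Assumption: "recent" = the last `window_size` transactions seen.
    window = deque(maxlen=window_size)
    for transaction in stream:
        window.append(frozenset(transaction))

    # Count every itemset of size 1..max_len that occurs in the window.
    counts = Counter()
    for transaction in window:
        items = sorted(transaction)
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1

    # Keep itemsets whose count reaches the support threshold.
    threshold = min_support * len(window)
    return {itemset: c for itemset, c in counts.items() if c >= threshold}


stream = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"], ["a", "b"]]
frequent = recent_frequent_itemsets(stream, window_size=4, min_support=0.5)
for itemset, count in sorted(frequent.items()):
    print(itemset, count)
```

In a distributed setting each peer would hold only its own window, so the interesting part of the problem, deciding network-wide frequency with low communication, lies outside this sketch.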
42.
Evolutionary theory states that stronger genetic characteristics reflect the organism’s ability to adapt to its environment and to survive the harsh competition faced by every species. Evolution normally takes millions of generations to assess and measure changes in heredity. Determining the connections that constrain genotypes and lead superior ones to survive is an interesting problem. In order to accelerate this process, we develop an artificial genetic dataset, based on an artificial life (AL) environment genetic expression (ALGAE). ALGAE can provide a useful and unique set of meaningful data, which can not only describe the characteristics of genetic data, but also simplify its complexity for later analysis. To explore the hidden dependencies among the variables, Bayesian Networks (BNs) are used to analyze genotype data derived from simulated evolutionary processes and provide a graphical model to describe various connections among genes. There are a number of models available for data analysis such as artificial neural networks, decision trees, factor analysis, BNs, and so on. Yet BNs have distinct advantages as analytical methods which can discern hidden relationships among variables. Two main approaches, constraint based and score based, have been used to learn the BN structure. However, each suits either sparse structures or dense structures, but not both. Firstly, we introduce a hybrid algorithm, called “the E-algorithm”, to combine the benefits and offset the limitations of both approaches for BN structure learning. Testing the E-algorithm against the standardized benchmark dataset ALARM suggests valid and accurate results. BAyesian Network ANAlysis (BANANA) is then developed, which incorporates the E-algorithm to analyze the genetic data from ALGAE. The resulting BN topological structure with conditional probabilistic distributions reveals the principles of how survivors adapt during evolution, producing an optimal genetic profile for evolutionary fitness.
43.
44.
Discovering Maximal Generalized Decision Rules Through Horizontal and Vertical Data Reduction (total citations: 3; self-citations: 0; citations by others: 3)
We present a method to learn maximal generalized decision rules from databases by integrating discretization, generalization and rough set feature selection. Our method reduces the data horizontally and vertically. In the first phase, discretization and generalization are integrated and the numeric attributes are discretized into a few intervals. The primitive values of symbolic attributes are replaced by high level concepts and some obviously superfluous or irrelevant symbolic attributes are also eliminated. Horizontal reduction is accomplished by merging identical tuples after the substitution of an attribute value by its higher level value in a pre-defined concept hierarchy for symbolic attributes, or the discretization of continuous (or numeric) attributes. This phase greatly decreases the number of tuples in the database. In the second phase, a novel context-sensitive feature merit measure is used to rank the features, and a subset of relevant attributes is chosen based on rough set theory and the merit values of the features. A reduced table is obtained by removing those attributes which are not in the relevant attributes subset, and the data set is further reduced vertically without destroying the interdependence relationships between classes and the attributes. Then rough set-based value reduction is further performed on the reduced table and all redundant condition values are dropped. Finally, tuples in the reduced table are transformed into a set of maximal generalized decision rules. The experimental results on UCI data sets and a real market database demonstrate that our method can dramatically reduce the feature space and improve learning accuracy.
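The horizontal reduction step described in this abstract can be illustrated with a small sketch: generalize each tuple, then merge the tuples that have become identical. The concept hierarchy `CITY_TO_REGION`, the interval boundaries in `discretize_age`, and the sample rows are all made up for illustration and are not taken from the paper; the rough set feature selection and value reduction of the second phase are not shown.

```python
from collections import Counter

# Hypothetical concept hierarchy: primitive symbolic value -> higher-level concept.
CITY_TO_REGION = {
    "Vancouver": "West",
    "Victoria": "West",
    "Toronto": "East",
    "Ottawa": "East",
}


def discretize_age(age):
    """Map a numeric attribute onto a few intervals (boundaries are illustrative)."""
    if age < 30:
        return "<30"
    if age < 60:
        return "30-59"
    return "60+"


def horizontal_reduce(rows):
    """Generalize each (city, age, label) tuple, then merge identical tuples.

    Returns a Counter mapping each generalized tuple to the number of
    original tuples it absorbed."""
    generalized = (
        (CITY_TO_REGION.get(city, city), discretize_age(age), label)
        for city, age, label in rows
    )
    return Counter(generalized)


rows = [
    ("Vancouver", 25, "buy"),
    ("Victoria", 28, "buy"),
    ("Toronto", 45, "skip"),
    ("Ottawa", 47, "skip"),
    ("Toronto", 65, "buy"),
]
reduced = horizontal_reduce(rows)
for generalized_tuple, count in reduced.items():
    print(generalized_tuple, count)
```

Here five original tuples collapse into three generalized ones; keeping the count per merged tuple preserves how much of the original data each generalized tuple represents.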
45.
Ekawat Chaowicharat Kanlaya Naruedomkul Nick Cercone 《Pattern Analysis & Applications》2016,19(4):1069-1080
The image feature used for classification is a crucial part of a character recognition system. To achieve high accuracy in offline handwriting recognition, the feature should capture the essence of differences, including the differences between different characters and the differences between different drawings of the same character. In this paper, we present a novel image feature called direction histogram (DH) and a feature extraction algorithm called bag of histogram (BoH). Unlike the traditional pre-defined feature, DH was designed based on the nature of language and the variation of writing styles. DH is, therefore, a global feature that represents pixel density in all directions around each center. BoH was introduced as it tolerates thickness and curve variation and ignores the curve connectivity (if any). Fifty-two datasets, each containing 30 drawings of 80 Thai characters, are used for training our neural network, and the original, thick, and distorted handwriting datasets are used for testing. The recognition system with our proposed DH and BoH feature extraction algorithm yielded higher recognition accuracy compared to the convolutional neural network.
46.
Xiangdong An Dawn Jutla Nick Cercone Charnyote Pluempitiwiriyawej Hai Wang 《International Journal of Information Security》2009,8(6):423-431
Context management is the key enabler for emerging context-aware applications, and it includes context acquisition, understanding and exchanging. Context exchanging should be made privacy-conscious. We can specify privacy preferences to limit the disclosure of sensitive contexts, but the sensitive contexts could still be derived from the insensitive ones. To date, there have been very few inference control mechanisms for context management, especially when the environments are uncertain. In this paper, we present an inference control method for private context protection in uncertain environments.
47.
Intelligent query answering by knowledge discovery techniques (total citations: 3; self-citations: 0; citations by others: 3)
Jiawei Han Yue Huang Cercone N. Yongjian Fu 《Knowledge and Data Engineering, IEEE Transactions on》1996,8(3):373-390
Knowledge discovery facilitates querying database knowledge and intelligent query answering in database systems. We investigate the application of discovered knowledge, concept hierarchies, and knowledge discovery tools for intelligent query answering in database systems. A knowledge-rich data model is constructed to incorporate discovered knowledge and knowledge discovery tools. Queries are classified into data queries and knowledge queries. Both types of queries can be answered directly by simple retrieval or intelligently by analyzing the intent of the query and providing generalized, neighborhood or associated information using stored or discovered knowledge. Techniques have been developed for intelligent query answering using discovered knowledge and/or knowledge discovery tools, which include generalization, data summarization, concept clustering, rule discovery, query rewriting, deduction, lazy evaluation, application of multiple-layered databases, etc. Our study shows that knowledge discovery substantially broadens the spectrum of intelligent query answering and may have deep implications for query answering in data- and knowledge-base systems.
48.
Atorn Nuntiyagul Kanlaya Naruedomkul Nick Cercone Damras Wongsawang 《Computational Intelligence》2007,23(1):28-44
We propose a feature selection approach, Patterned Keyword in Phrase (PKIP), to text categorization for item banks. The item bank is a collection of textual question items that are short sentences. Each sentence does not contain enough relevant words to be categorized directly by traditional approaches such as "bag-of-words." Therefore, PKIP was designed to categorize such question items using only available keywords and their patterns. PKIP identifies the appropriate keywords by computing the weight of all words. In this paper, two keyword selection strategies are suggested to ensure the categorization accuracy of PKIP. PKIP was implemented and tested with the item bank of Thai high primary mathematics questions. The test results show that PKIP is able to categorize the question items correctly and that the two keyword selection strategies can extract highly informative keywords.