Similar Documents
20 similar documents found (search time: 46 ms)
1.
Semantic block identification is an approach to retrieving information from Web pages and applications. As Website design evolves, however, traditional methodologies no longer perform well. This paper proposes a new model that merges Web page content into semantic blocks by simulating human perception. A "layer tree" is constructed to remove hierarchical inconsistencies between the DOM tree representation and the visual layout of the Web page. Subsequently, the Gestalt laws of grouping are interpreted as rules for semantic block detection. During interpretation, the normalized Hausdorff distance, the CIE-Lab color difference, the normalized compression distance, and a series of visual cues are proposed to operationalize these Gestalt laws. Finally, a classifier is trained to combine the operationalized laws into a unified rule for identifying semantic blocks in a Web page. Experiments are conducted to compare the performance of the model with a state-of-the-art algorithm, VIPS. The first experiment shows that the GLM model generates more true positives and fewer false negatives than VIPS. A second experiment on a large-scale test set produces an average precision of 90.53% and a recall of 90.85%, approximately 25% better than VIPS.
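To make the grouping step concrete, here is a minimal Python sketch of how two operationalized Gestalt cues (proximity and CIE-Lab color similarity) might be combined into a merge decision for page blocks. The block structure, the simplified distance measures, and the classifier weights are illustrative assumptions, not the paper's trained model.

```python
# Hypothetical sketch: combining two operationalized Gestalt cues
# (proximity and color similarity) into a merge decision for page blocks.
from dataclasses import dataclass
import math

@dataclass
class Block:
    x: float; y: float; w: float; h: float   # bounding box
    lab: tuple                               # mean CIE-Lab color (L*, a*, b*)

def proximity(a: Block, b: Block, page_diag: float) -> float:
    """Normalized center distance: a crude stand-in for the paper's
    normalized Hausdorff distance (law of proximity)."""
    dx = (a.x + a.w / 2) - (b.x + b.w / 2)
    dy = (a.y + a.h / 2) - (b.y + b.h / 2)
    return math.hypot(dx, dy) / page_diag

def color_difference(a: Block, b: Block) -> float:
    """CIE76 Delta-E between mean Lab colors (law of similarity)."""
    return math.dist(a.lab, b.lab)

def should_merge(a, b, page_diag, w_prox=-6.0, w_color=-0.05, bias=2.0):
    # Logistic combination of cues; the paper trains these weights,
    # here they are illustrative constants.
    z = bias + w_prox * proximity(a, b, page_diag) + w_color * color_difference(a, b)
    return 1 / (1 + math.exp(-z)) > 0.5

header = Block(0, 0, 800, 60, (95.0, 0.0, 0.0))
nav    = Block(0, 62, 800, 30, (93.0, 1.0, 1.0))
footer = Block(0, 900, 800, 40, (40.0, 5.0, -5.0))
diag = math.hypot(800, 960)
print(should_merge(header, nav, diag))     # close and similar -> True
print(should_merge(header, footer, diag))  # far apart -> False
```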

2.
3.
We propose an approach that uses Steiner trees to speed up semantic object search and detection for vegetable trading information. After analyzing and comparing relevant ontology construction methods, we present a set of construction methods based on a domain ontology for vegetable transaction information. With Jena2's rule-based reasoning engine, more related information can be retrieved with the help of the ontology database and ontology reasoning: query expansion extends the user's input vocabulary with subclasses, parent classes, and equivalence classes, and ontology reasoning uncovers hidden information. Using these techniques, we design and implement an ontology-based semantic retrieval system for vegetable transaction information. Compared with the keyword-matching retrieval systems of large-scale vegetable trading sites, the results show that the recall and precision of the ontology-based system are much better than those of the keyword-based system, demonstrating practical value.
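The query-expansion step can be illustrated with a small Python sketch. The paper uses Jena2 (a Java framework), so the toy ontology and dictionary-based lookup below only mirror the logic of extending a query term with its subclasses, parent classes, and equivalence classes; all entries are invented.

```python
# Toy mini-ontology for illustration; a real system would query an
# ontology store (the paper uses Jena2's reasoning engine in Java).
ontology = {
    "vegetable":       {"parents": [], "children": ["leafy_vegetable", "root_vegetable"],
                        "equivalents": []},
    "leafy_vegetable": {"parents": ["vegetable"], "children": ["spinach", "cabbage"],
                        "equivalents": ["greens"]},
    "spinach":         {"parents": ["leafy_vegetable"], "children": [], "equivalents": []},
}

def expand_query(term):
    """Extend a user term with subclasses, parent classes and equivalents."""
    entry = ontology.get(term, {})
    expanded = {term}
    expanded.update(entry.get("children", []))     # subclass expansion
    expanded.update(entry.get("parents", []))      # parent-class expansion
    expanded.update(entry.get("equivalents", []))  # equivalence-class expansion
    return expanded

print(expand_query("leafy_vegetable"))
# {'leafy_vegetable', 'spinach', 'cabbage', 'vegetable', 'greens'}
```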

4.
Data management and information processing play key roles in developing the Internet of Things (IoT). The requirements for a well-defined IoT data model involve six aspects: semantic support; active data extraction and explanation; flexibility and extensibility; the ability to manage massive, heterogeneous data; support for formal organization; and a solid mathematical foundation. This paper aims to explore an extensible and active semantic information organization model for the IoT that meets these requirements; the primary idea is "object-centered organization of data, event-based explanation of data, and knowledge-based use of data." The proposed model involves two layers, the object layer and the event layer, both of which are discussed in detail, including concepts, schema definitions, and rule-based knowledge representation. Semantic reasoning is supported by a knowledge base comprising reasoning rules over the semantic relations among objects and among events.
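A minimal sketch of the two-layer idea follows, with hypothetical object and event schemas and one knowledge-base rule linking them; none of the names or thresholds below come from the paper.

```python
# Hypothetical sketch of the object layer, event layer, and one
# knowledge-base rule connecting them. Schemas and thresholds are invented.
from dataclasses import dataclass, field

@dataclass
class IoTObject:                       # object layer: "object-centered" data
    oid: str
    otype: str
    state: dict = field(default_factory=dict)

@dataclass
class Event:                           # event layer: explains state changes
    etype: str
    source: IoTObject
    payload: dict

def explain(obj: IoTObject, reading: dict):
    """Turn a raw sensor reading into events via rule-based knowledge."""
    obj.state.update(reading)
    events = [Event("reading", obj, reading)]
    # Example reasoning rule: high temperature on a freezer implies a fault.
    if obj.otype == "freezer" and reading.get("temp_c", -20) > -5:
        events.append(Event("fault_suspected", obj, {"reason": "temp_high"}))
    return events

freezer = IoTObject("f-01", "freezer")
for e in explain(freezer, {"temp_c": -2}):
    print(e.etype, e.payload)
```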

5.
This paper presents an approach for event detection and annotation in broadcast soccer video. It benefits from the fact that the occurrence of certain audiovisual features exhibits remarkable patterns useful for detecting semantic events. The goal, however, is a flexible system that relies as little as possible on predefined feature sequences and structures derived from domain knowledge. To achieve this goal, we design a fuzzy rule-based reasoning system as a classifier, which takes statistical information from a set of audiovisual features as its crisp input values and produces semantic concepts corresponding to the events that occurred. A set of tuples is created by discretization and fuzzification of continuous feature vectors derived from the training data. We extract the hidden knowledge among the tuples, and the correlations between features and related events, by constructing a decision tree (DT). A set of fuzzy rules is generated by traversing each path from the root to the leaf nodes of the constructed DT. These rules are inserted into the fuzzy rule base of the designed fuzzy system and employed by the fuzzy inference engine to perform decision-making and predict the events occurring in the input video. Experimental results on a large set of broadcast soccer videos demonstrate the effectiveness of the proposed approach.
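A toy Python sketch of the rule-extraction step: traversing every root-to-leaf path of a decision tree yields one IF-THEN rule. The tree, feature names, and event labels below are invented for illustration; the paper builds its DT from fuzzified training tuples.

```python
# Minimal sketch: extracting IF-THEN rules by traversing each root-to-leaf
# path of a (hand-built) decision tree over fuzzified audiovisual features.
tree = {
    "feature": "crowd_noise",
    "branches": {
        "high": {"feature": "replay_logo",
                 "branches": {"present": "goal", "absent": "shot_on_target"}},
        "low": "no_event",
    },
}

def extract_rules(node, conditions=()):
    if isinstance(node, str):                      # leaf: an event label
        antecedent = " AND ".join(f"{f} IS {v}" for f, v in conditions)
        return [f"IF {antecedent} THEN event IS {node}"]
    rules = []
    for value, child in node["branches"].items():
        rules += extract_rules(child, conditions + ((node["feature"], value),))
    return rules

for rule in extract_rules(tree):
    print(rule)
```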

6.

A fundamental problem in data mining is whether all of the available information is necessary to represent an information system (IS). A reduct is a rough-set construct that determines the set of attributes important for representing the IS. The search for a minimal reduct rests on the assumption that, within the dataset of an IS, some attributes are more important than the rest. An algorithm for finding minimal reducts based on propositional satisfiability (SAT) is proposed, together with a branch-and-bound algorithm to solve the resulting SAT problem. Experimental results show that the proposed algorithm significantly reduces the number of rules generated from the obtained reducts while maintaining a high classification accuracy.
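As a sketch of what "minimal reduct" means operationally, the toy Python below searches subsets of condition attributes in increasing size and returns the first one that still discerns all decisions. The exhaustive search is a simple stand-in for the paper's SAT-based branch-and-bound, and the decision table is invented.

```python
from itertools import combinations

# Toy decision table: rows of (condition attribute values, decision).
table = [({"a": 1, "b": 0, "c": 1}, "yes"),
         ({"a": 1, "b": 1, "c": 0}, "yes"),
         ({"a": 0, "b": 0, "c": 1}, "no"),
         ({"a": 0, "b": 1, "c": 0}, "no")]
attrs = ["a", "b", "c"]

def preserves_decisions(subset):
    """A subset suffices if no two rows agree on all subset attributes
    yet disagree on the decision."""
    seen = {}
    for row, decision in table:
        key = tuple(row[a] for a in subset)
        if seen.setdefault(key, decision) != decision:
            return False
    return True

def minimal_reduct():
    # Exhaustive search in increasing size: a stand-in for the paper's
    # SAT-based branch-and-bound, feasible only on toy data.
    for size in range(1, len(attrs) + 1):
        for subset in combinations(attrs, size):
            if preserves_decisions(subset):
                return subset

print(minimal_reduct())   # ('a',) -- attribute a alone decides the class
```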

7.
8.
Rough set theory is a new mathematical approach to imprecision, vagueness, and uncertainty in data analysis. This paper presents a rough-set-based approach to determining the category of crack causes from the insufficient and imprecise crack characteristics observed during regular inspection of concrete structures. The categories of crack causes are classified into four classes: (1) concrete material, (2) construction work, (3) service and environmental factors, and (4) structure and applied loads. The crack characteristics include time of formation, shape, regularity, cause of concrete deformation, and range. A decision table was constructed with the crack characteristics as condition attributes and the categories of crack causes as decision attributes. A minimal decision algorithm for this table was generated on the basis of rough set theory; the algorithm is equivalent to the original decision table but requires minimal subsets of condition attributes. It turned out that "time of formation" had the greatest influence among the crack characteristics in determining the category of crack causes, while "shape" could be omitted with the least influence on the diagnosis.
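The dispensability finding can be illustrated with a small consistency check: drop one condition attribute at a time and see whether identical remaining condition values ever lead to different decisions. The toy table below is invented and far smaller than the paper's.

```python
# Toy illustration of checking which crack characteristic is dispensable.
# Attribute values and decisions are invented, not the paper's data.
rows = [
    ({"time": "early", "shape": "map",  "regularity": "regular"},   "material"),
    ({"time": "early", "shape": "line", "regularity": "regular"},   "material"),
    ({"time": "late",  "shape": "map",  "regularity": "irregular"}, "loads"),
    ({"time": "late",  "shape": "line", "regularity": "regular"},   "service"),
]

def consistent(attrs):
    """True if no two rows agree on `attrs` but disagree on the decision."""
    seen = {}
    for cond, decision in rows:
        key = tuple(cond[a] for a in attrs)
        if seen.setdefault(key, decision) != decision:
            return False
    return True

all_attrs = ["time", "shape", "regularity"]
for a in all_attrs:
    rest = [x for x in all_attrs if x != a]
    status = "dispensable" if consistent(rest) else "indispensable"
    print(f"{a}: {status}")   # here "time" is indispensable, "shape" is not
```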

9.
Multilevel knowledge in transactional databases plays a significant role in real-life market basket analysis. Many researchers have mined hierarchical association rules and proposed various approaches. However, some existing approaches produce many multilevel and cross-level association rules that fail to convey quality information; from such a large number of redundant rules, it is extremely difficult to extract meaningful information. Other approaches mine minimal association rules, but these have many shortcomings due to their naïve designs. In this paper, we focus on the need to generate hierarchical minimal rules that provide maximal information. An algorithm is proposed to derive minimal multilevel and cross-level association rules. Our work makes a significant contribution to mining minimal cross-level association rules, which express the mixed relationship between the generalized and specialized views of the transaction itemsets. We are the first to design an efficient algorithm using a closed-itemset-lattice-based approach that can mine the most relevant minimal cross-level association rules; the parent-child relationships of the lattice are exploited while mining cross-level closed itemsets. We extensively evaluated the proposed algorithm's efficiency on a variety of real-life datasets through a large number of experiments, in which it significantly outperformed the existing related work.
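A minimal Python sketch of the closed-itemset idea on multilevel data: items are written as "category/item" so each transaction also supports its generalized (cross-level) view, and an itemset is closed when no proper superset has the same support. The transactions and the brute-force enumeration are illustrative only; the paper uses an efficient lattice-based algorithm.

```python
from itertools import combinations

# Toy transactions with hierarchical items written as "category/item",
# so "dairy/milk" also supports its generalization "dairy".
transactions = [{"dairy/milk", "bakery/bread"},
                {"dairy/milk", "bakery/bread", "dairy/cheese"},
                {"dairy/cheese", "bakery/bread"}]

# Multilevel view: every transaction also contains its items' categories.
db = [t | {i.split("/")[0] for i in t} for t in transactions]
items = sorted(set().union(*db))

def support(itemset):
    return sum(itemset <= t for t in db)

# Frequent itemsets (min support 2), then keep only the closed ones:
# those with no proper superset of equal support.
frequent = {frozenset(c): support(frozenset(c))
            for n in range(1, len(items) + 1)
            for c in combinations(items, n)
            if support(frozenset(c)) >= 2}
closed = [c for c in frequent
          if not any(c < d and frequent[d] == frequent[c] for d in frequent)]
for c in sorted(closed, key=len):
    print(sorted(c), "support:", frequent[c])
```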

10.
This paper introduces a framework for reconciling product design with assembly process planning. Nowadays, both of these product lifecycle phases are performed quasi-concurrently in industry, a configuration that has yielded competitive gains in efficiency and flexibility by improving designers' awareness and product quality. Despite these efforts, limitations remain regarding dynamic representation, information consistency, and information flow continuity, owing to the inherent nature of the information created and managed in the two phases and to the lack of interoperability between the related information systems. Product design and assembly process planning actually generate heterogeneous information: the former describes everything related to "what is to be delivered", while the latter rationalizes everything about "how it is to be assembled". In other words, integrating assembly planning issues into product design requires reconciliation means that relate the architectural product definition in space to its assembly sequence in time. The main objective is therefore to provide a spatiotemporal information management framework with a strong semantic and logical foundation in product lifecycle management (PLM) systems, thereby increasing actors' awareness, flexibility, and efficiency through a better abstraction of physical reality and appropriate information management procedures. A case study illustrates the relevance of the proposed framework and its hub-based implementation within PLM systems.

11.
Recently, a new approach to the design of fuzzy control rules was suggested. The method, referred to as fuzzy Lyapunov synthesis, extends classical Lyapunov synthesis to the domain of "computing with words" and allows the systematic, rather than heuristic, design and analysis of fuzzy controllers given linguistic information about the plant. In this paper, we use fuzzy Lyapunov synthesis to design and analyze the rule base of a fuzzy scheduler. Here, too, rather than relying on heuristics, we derive the fuzzy rule base systematically, which suggests that the process of deriving the rules can be automated. Our approach may lead to a novel computing-with-words algorithm: the input is linguistic information concerning the "plant" and the "control" objective, and the output is a suitable fuzzy rule base.
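A hypothetical end-to-end sketch for a plant the paper does not treat, the double integrator x'' = u: choosing V = (x² + ẋ²)/2 gives V̇ = ẋ(x + u), and requiring V̇ < 0 dictates the sign of u linguistically ("if the state is positive, push negative", and so on). The memberships, consequents, and simulation below are illustrative, not the paper's scheduler.

```python
# Illustrative fuzzy Lyapunov synthesis for the double integrator x'' = u.
# With V = (x^2 + xdot^2) / 2 we get Vdot = xdot * (x + u), so each
# rule's consequent is chosen to keep Vdot negative.

def pos(v): return max(0.0, min(1.0, (v + 1) / 2))   # "positive" membership
def neg(v): return max(0.0, min(1.0, (1 - v) / 2))   # "negative" membership

# IF x is P AND xdot is P THEN u is negative-big ... and symmetrically:
rules = [(pos, pos, -2.0), (pos, neg, -0.5), (neg, pos, 0.5), (neg, neg, 2.0)]

def control(x, xdot):
    """Sugeno-style weighted average; AND is min."""
    num = den = 0.0
    for mu_x, mu_xd, u in rules:
        w = min(mu_x(x), mu_xd(xdot))
        num, den = num + w * u, den + w
    return num / den if den else 0.0

# Closed-loop check: the state should decay toward the origin.
x, xdot, dt = 1.0, 0.0, 0.01
for _ in range(3000):
    u = control(x, xdot)
    x, xdot = x + xdot * dt, xdot + u * dt
print(round(x, 3), round(xdot, 3))   # close to (0.0, 0.0)
```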

12.
Semantic ambiguity poses a challenge for word-level sentiment analysis. Some sentiment words are not only used frequently but are also highly ambiguous semantically, and resolving this ambiguity has become a pressing problem in word-level sentiment analysis. This paper proposes a framework combining rules and statistics to analyze the sentiment orientation of highly ambiguous words. The framework extracts effective features from a word's neighboring context and uses rough-set attribute reduction to generate decision rules; for cases the rules cannot recognize, a Bayesian classifier is used to resolve the ambiguity. Taking the highly ambiguous word "好" ("good") as an example, we evaluate the proposed framework on several corpora; the results show that it effectively resolves the ambiguity of "好" and improves sentiment analysis.
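A minimal sketch of the rule-first, Bayes-fallback pipeline; the context features, decision rules, and training examples below are invented, and a full system would induce the rules via rough-set attribute reduction as the paper describes.

```python
from collections import Counter, defaultdict

# Invented training data: (neighboring-word features, sentiment label).
train = [(("质量", "很"), "positive"), (("质量", "不"), "negative"),
         (("一个", "人"), "neutral"),  (("服务", "很"), "positive")]

# Decision rules, e.g. as produced by rough-set attribute reduction.
rules = {("不",): "negative", ("很",): "positive"}

# Train a tiny Naive Bayes fallback on the same features.
label_counts = Counter(lbl for _, lbl in train)
feat_counts = defaultdict(Counter)
for feats, lbl in train:
    feat_counts[lbl].update(feats)

def bayes(feats):
    def score(lbl):
        p = label_counts[lbl] / len(train)
        for f in feats:     # Laplace-smoothed likelihoods
            p *= (feat_counts[lbl][f] + 1) / (sum(feat_counts[lbl].values()) + 2)
        return p
    return max(label_counts, key=score)

def classify(feats):
    for cond, lbl in rules.items():       # rules fire first
        if set(cond) <= set(feats):
            return lbl
    return bayes(feats)                   # fallback when no rule matches

print(classify(("味道", "不")))   # rule fires -> negative
print(classify(("一个", "人")))   # no rule -> Bayes fallback -> neutral
```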

13.
Rules often contain terms that are ambiguous, poorly defined, or not defined at all. To interpret and apply rules containing such terms, appeal must be made to their previous constructions, as in the interpretation of legal statutes through relevant legal cases. We describe CABARET (CAse-BAsed REasoning Tool), a domain-independent shell that integrates reasoning with rules and reasoning with previous cases in order to apply rules containing ill-defined terms. The integration of these two reasoning paradigms is performed via a collection of control heuristics that suggest how to interleave case-based and rule-based methods to construct an argument supporting a particular interpretation. CABARET is currently instantiated with cases and rules from an area of income tax law, the so-called "home office deduction", and an example of CABARET's processing of an actual tax case is provided in some detail. The advantages of CABARET's hybrid approach to interpretation stem from the synergy of interleaving case-based and rule-based tasks.
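A deliberately simplified sketch of the interleaving idea: when a rule's antecedent hinges on an ill-defined term, control passes to a case-based step that argues from the most similar precedent. The case names, facts, similarity measure, and holdings below are all invented; only the "home office deduction" setting comes from the paper.

```python
# Invented precedents for the ill-defined term "principal place of
# business"; each maps to (facts, holding). Names and facts are fictional.
cases = {"Case_A": ({"hours_at_home": 10}, False),
         "Case_B": ({"hours_at_home": 30}, True)}

def cbr_interpret(facts):
    """Case-based step: adopt the holding of the most similar precedent."""
    best = min(cases, key=lambda c: abs(cases[c][0]["hours_at_home"]
                                        - facts["hours_at_home"]))
    return cases[best][1], best

def home_office_deduction(facts):
    # Rule-based step: the deduction turns on whether the home is the
    # "principal place of business" -- ill-defined, so defer to cases.
    holds, precedent = cbr_interpret(facts)
    return f"deduction {'allowed' if holds else 'denied'} (following {precedent})"

print(home_office_deduction({"hours_at_home": 25}))   # follows Case_B
```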

14.
With the widespread adoption of the Semantic Web, intelligent e-commerce applications such as intelligent information search, intelligent information agents, and intelligent trading agents are becoming reality. At the same time, the application of rules in e-commerce has made considerable progress. The constraint rule system for the Semantic Web introduced in this paper combines the strengths of both. By using XML to describe the system's syntax, extending the predicates and query representation for handling heterogeneous data sources, introducing inference techniques from concurrent constraint programming, and implementing the rule-selection mechanism through compilation, it applies the rule system to processing heterogeneous e-commerce data and to business decision-making.

15.
Axiom sets and their extensions are viewed as functions from the set of formulas in the language to a set of four truth values: t, f, u for undefined, and k for contradiction. Such functions form a lattice with "contains less information" as the partial order ⊑, and "combination of several sources of knowledge" as the least-upper-bound operation ⊔. Inference rules are expressed as binary relations between such functions. We show that the usual criterion on fixpoints, namely minimality, does not apply correctly in the case of non-monotonic inference rules. A stronger concept, approachable fixpoints, is introduced and proven sufficient for the existence of a derivation of the fixpoint. In addition, the usefulness of our approach is demonstrated by concise proofs of some previously known results about normal default rules.
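A small Python rendering of the information lattice, assuming the standard ordering in which u is the bottom element and k the top (u ⊑ t, u ⊑ f, t ⊑ k, f ⊑ k); combining two axiom-set valuations is then a pointwise least upper bound.

```python
# Information lattice over the four truth values: u (undefined) at the
# bottom, k (contradiction) at the top, t and f incomparable in between.
leq = {("u", "u"), ("u", "t"), ("u", "f"), ("u", "k"),
       ("t", "t"), ("t", "k"), ("f", "f"), ("f", "k"), ("k", "k")}

def join(a, b):
    """Least upper bound: the smallest value above both arguments."""
    uppers = [c for c in "utfk" if (a, c) in leq and (b, c) in leq]
    return min(uppers, key=lambda c: sum((d, c) in leq for d in "utfk"))

def combine(F, G):
    """Pointwise join of two formula-to-truth-value functions:
    'combination of several sources of knowledge'."""
    return {phi: join(F.get(phi, "u"), G.get(phi, "u"))
            for phi in set(F) | set(G)}

F = {"p": "t", "q": "u"}
G = {"p": "f", "r": "t"}
print(combine(F, G))   # {'p': 'k', 'q': 'u', 'r': 't'} (key order may vary)
```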

16.
The dominance-based rough set approach is proposed as a methodology for plunge grinding process diagnosis. The process is analyzed, and its diagnosis is then treated as a multi-criteria decision-making problem based on modelling the relationships between different process states and their symptoms, using a set of rules induced from measured process data. The development of the diagnostic system comprises three phases. First, the experimental process data are arranged in a decision table: using selected signal processing methods, each process run is described by 17 process state features (condition attributes) and 5 criteria evaluating the process state and results (decision attributes), and the semantic correlation between all the attributes is modelled. Next, condition attribute selection and knowledge extraction are tightly integrated with model evaluation in an iterative approach: after each loop of the iterative feature selection procedure, rules are induced with the VC-DomLEM algorithm, and the classification capability of the induced rules is assessed using the leave-one-out method and a set of measures. The classification accuracy of the individual models ranges from 80.77% to 98.72%. The induced rule set constitutes a classifier for assessing new process runs.
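For readers unfamiliar with the evaluation protocol, here is a minimal leave-one-out loop; a trivial nearest-neighbour classifier stands in for the VC-DomLEM-induced rules, and the data are invented.

```python
# Sketch of the leave-one-out protocol used to assess induced models.
# The real classifier is VC-DomLEM; a nearest-neighbour stand-in keeps
# the example self-contained. Feature vectors and labels are invented.
data = [([0.2, 0.1], "good"), ([0.3, 0.2], "good"),
        ([0.8, 0.9], "poor"), ([0.7, 0.8], "poor"), ([0.25, 0.15], "good")]

def nearest_label(x, train):
    """Label of the training case closest to x (squared Euclidean)."""
    return min(train, key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], x)))[1]

correct = 0
for i, (x, y) in enumerate(data):
    train = data[:i] + data[i + 1:]          # hold out exactly one case
    correct += nearest_label(x, train) == y
print(f"leave-one-out accuracy: {correct / len(data):.2%}")
```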

17.
This paper discusses two continuous learning approaches for improving the classification accuracy of an intuitive reasoner algorithm. The reasoner predicts the value of a given target variable through multiple iterations of forward-chained, rule-based inference. Each rule in the reasoner's rule set carries a weight, referred to here as "Strength of Belief" (SB), which indicates the certainty level of that rule. In each iteration of reasoning, any instances of similar values for a given variable are replaced by a single consolidated datum, and the SB associated with the consolidated datum is increased. At the end of the reasoning process, the class (value) of the target variable with the highest SB is reported as the conclusion. The reasoner's rule set was generated from a training set containing 80% of the data in a weather database comprising 50 years' worth of hourly measurements of 54 weather variables; each rule was induced from only a small subset of the weather data. The intuitive reasoner was tested by using the induced rules to predict a number of pre-selected target variables on 275 test cases created from the test data. The first continuous learning approach identifies input variables relevant to the reasoner; the second rebalances the rule set by adjusting the SB associated with each rule. Because of the way the rules were induced, they contained no information about the relevance of the 53 possible input variables to predicting a given target variable for previously unseen cases. A method was therefore developed to identify, from the induced rule set, which input variables were most relevant to the task; it yielded higher prediction accuracy than a set of randomly chosen input variables for four of six target variables. The second approach addresses the class imbalance problem in the rule set: the intuitive reasoner appeared to overfit classes (values) with frequent representation in the rule set. A heuristic was developed that generates adjustment factors for the SB values of the rules, and its use improved the classification accuracy of the intuitive reasoner for four of the six target variables.
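A guess at what the consolidation step might look like in code: similar conclusions for one variable are merged and their SB values accumulate, so a cluster of agreeing rules can outweigh a single confident outlier. The tolerance and the reinforcement rule are assumptions, not the paper's.

```python
# Sketch of the "Strength of Belief" (SB) consolidation step: similar
# conclusions for a variable are merged and their certainty reinforced.
def consolidate(conclusions, tol=0.5):
    """conclusions: list of (value, sb) pairs for one target variable."""
    merged = []
    for value, sb in conclusions:
        for group in merged:
            if abs(group["value"] - value) <= tol:     # similar values
                n = group["n"]
                group["value"] = (group["value"] * n + value) / (n + 1)
                group["sb"] += sb          # reinforcement: SBs accumulate
                group["n"] = n + 1
                break
        else:
            merged.append({"value": value, "sb": sb, "n": 1})
    return merged

temps = [(21.0, 0.6), (21.3, 0.5), (27.0, 0.4)]   # rule outputs for "temp"
groups = consolidate(temps)
best = max(groups, key=lambda g: g["sb"])
print(best)   # the consolidated 21-ish estimate wins with SB 1.1
```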

18.
This paper studies multi-concept learning from inconsistent examples, that is, contradictory examples that have identical condition attribute values but belong to different concepts. We propose a data mining algorithm, MIE-RS, based on an extended rough set model; it handles the inconsistency of example sets effectively and, by determining the coverage of each concept (i.e., the minimal set of relevant attributes), generates for each concept the simplest production-rule knowledge satisfying a given confidence level.
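A minimal sketch of how inconsistency is handled in rough-set terms: rows with identical condition values but different concepts fall outside every lower approximation, yet remain in the upper approximations of the concepts they might belong to. The data and attribute names are invented.

```python
from collections import defaultdict

# Rows 0 and 1 are the "inconsistent examples": identical condition
# values, different concepts. Attribute names and data are invented.
rows = [({"color": "red",  "size": "big"},   "A"),
        ({"color": "red",  "size": "big"},   "B"),   # conflicts with row 0
        ({"color": "blue", "size": "small"}, "A"),
        ({"color": "blue", "size": "big"},   "B")]

# Group row indices into indiscernibility classes.
blocks = defaultdict(list)
for i, (cond, _) in enumerate(rows):
    blocks[tuple(sorted(cond.items()))].append(i)

def approximations(concept):
    lower, upper = set(), set()
    for members in blocks.values():
        labels = {rows[i][1] for i in members}
        if labels == {concept}:
            lower |= set(members)     # certainly this concept
        if concept in labels:
            upper |= set(members)     # possibly this concept
    return lower, upper

for c in ("A", "B"):
    print(c, approximations(c))   # A: ({2}, {0,1,2})  B: ({3}, {0,1,3})
```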

19.
After recalling some well-known shortcomings of the Semantic Web approach to creating (application-oriented) systems of "rules", e.g., limited expressiveness, adoption of the Open World Assumption (OWA) paradigm, and the absence of variables in the original definition of OWL, this paper examines the technical solutions successfully used for implementing advanced reasoning systems according to the NKRL methodology. NKRL (Narrative Knowledge Representation Language) is a conceptual meta-model and a computer science environment expressly created to deal, in an 'intelligent' and complete way, with complex and content-rich non-fictional 'narrative' data sources; these include corporate memory documents, news stories, normative and legal texts, medical records, surveillance videos, actuality photos for newspapers and magazines, etc. In this context, we first expound the need to distinguish between "plain/static" and "structured/dynamic" knowledge and to introduce appropriate (and different) knowledge representation structures for these two types of knowledge. In a structured/dynamic context, we then show how the introduction of "functional roles", together with the possibility of using n-ary structures, allows us to build highly 'expressive' rules whose "atoms" can directly represent complex situations, actions, etc., without being restricted to binary clauses. In NKRL, functional roles are primitive symbols interpreted as "relations", such as "subject", "object", "source", and "beneficiary", that link a semantic predicate with its arguments within an n-ary conceptual formula. Functional roles thus contrast with "semantic roles", which are equated with ordinary concepts like "student" to be inserted into the "non-sortal" (no direct instances) branch of a traditional ontology.
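A toy rendering of an n-ary NKRL-style structure in Python, assuming nothing beyond the roles named in the abstract: a single predicate is linked to all of its arguments through functional roles rather than being flattened into binary triples. The predicate and fillers are invented.

```python
# Toy n-ary conceptual formula: one semantic predicate linked to its
# arguments through functional roles (instead of a set of binary triples).
# The predicate name and role fillers are only indicative.
event = {
    "predicate": "MOVE",
    "roles": {
        "subject":     "company_1",
        "object":      "shipment_27",
        "source":      "warehouse_paris",
        "beneficiary": "customer_llc",
        "date":        "2005-03-14",
    },
}

def render(formula):
    """Print the whole n-ary structure as one atom."""
    args = ", ".join(f"{role}: {filler}" for role, filler in formula["roles"].items())
    return f'{formula["predicate"]}({args})'

print(render(event))
```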

20.
A classification rule mining method based on rough-fuzzy set theory (cited by: 1; self-citations: 0; citations by others: 1)
This paper proposes a classification rule mining method based on rough-fuzzy set theory to support reasoning and decision-making under incomplete information, and gives a flowchart of the method. Using a rough-set-based algorithm for reducing feature attributes and a fuzzy-set-based method for inducing decision rules, the association rules hidden in the samples can be mined to form decisions. Finally, the method was applied to a concrete information system with satisfactory results, demonstrating that it is both feasible and effective.
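A sketch of the fuzzy half of such a pipeline, assuming rough-set reduction has already kept a single numeric feature: each fuzzy decision rule receives a confidence computed from membership-weighted support. The membership functions and data are invented.

```python
# Sketch: grading decision rules by fuzzy (membership-weighted) support
# after rough-set reduction has kept one numeric feature. Data invented.
samples = [(0.1, "yes"), (0.2, "yes"), (0.3, "yes"),
           (0.8, "no"),  (0.9, "no"),  (0.5, "yes")]

def low(v):  return max(0.0, min(1.0, (0.6 - v) / 0.4))   # ramp membership
def high(v): return max(0.0, min(1.0, (v - 0.4) / 0.4))

def rule_confidence(mu, decision):
    """Share of the fuzzy coverage of `mu` that agrees with `decision`."""
    covered = sum(mu(v) for v, _ in samples)
    agreeing = sum(mu(v) for v, d in samples if d == decision)
    return agreeing / covered if covered else 0.0

print(f"IF x IS low  THEN yes  (conf {rule_confidence(low, 'yes'):.2f})")
print(f"IF x IS high THEN no   (conf {rule_confidence(high, 'no'):.2f})")
```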

