Similar literature
 Found 20 similar documents; search took 31 ms
1.
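PK-tree index structure based on data-space grid partitioning   Total citations: 1, self-citations: 0, citations by others: 1
In research on large-scale, high-dimensional data mining, the effectiveness of data storage and indexing methods is an important factor in the time and space efficiency of algorithms. Combining a data-space grid partitioning strategy with an efficient tree index structure exploits the combined strengths of both in organizing data and turns a complex problem into a structured, simple, repetitive one. The paper defines the various kinds of data-space grid partitioning within a unified framework and discusses two index structures suitable for indexing gridded data, the R-tree and the PK-tree. Experimental results show that the PK-tree is more efficient for data storage and indexing and that, combined with grid-based data organization, it is significant for reducing the time and space complexity of large-scale, high-dimensional data analysis problems.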
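As a rough illustration of what a data-space grid partitioning can look like, the sketch below maps points to the cells of a uniform, equal-width grid and groups them by cell. This is an assumption made for illustration only: the abstract does not say which partitioning schemes the paper defines, and the names `grid_cell`, `group_by_cell`, and `cells_per_dim` are hypothetical. It does not implement an R-tree or a PK-tree.

```python
from collections import defaultdict


def grid_cell(point, lower, upper, cells_per_dim):
    """Return the integer cell coordinates of `point` in a uniform grid.

    `lower` and `upper` bound the data space in each dimension, and every
    dimension is cut into `cells_per_dim` equal-width intervals.
    (Hypothetical helper, not taken from the paper.)
    """
    cell = []
    for x, lo, hi in zip(point, lower, upper):
        if not lo <= x <= hi:
            raise ValueError("point lies outside the data space")
        index = int((x - lo) / (hi - lo) * cells_per_dim)
        # A point exactly on the upper bound belongs to the last cell.
        cell.append(min(index, cells_per_dim - 1))
    return tuple(cell)


def group_by_cell(points, lower, upper, cells_per_dim):
    """Bucket points by grid cell; only non-empty cells are stored."""
    buckets = defaultdict(list)
    for p in points:
        buckets[grid_cell(p, lower, upper, cells_per_dim)].append(p)
    return dict(buckets)
```

The non-empty cells produced this way are the kind of units a tree index could then organize; how the paper does so is not described in the abstract.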

2.
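Video content has very strong temporal correlation and logical structure, and shot semantics are the basic unit of video content understanding. From the standpoint of how humans perceive and understand video content, shot semantics carry many kinds of implicit contextual relations: temporal, semantic, and structural. Describing this contextual information properly is essential. To this end, a label tree with context labels is first adopted as the representation model for the hierarchical context structure of shot semantics: the serialized shot-semantic sequence forms the bottom-level leaf nodes, and the context labels of the internal nodes represent the contextual relations among shot semantics. The tree structure is consistent with the hierarchical representation of video content and provides a significant information gain for video content understanding. Then, to convert shot semantics from their sequence structure into the hierarchical structure of the label tree, a structural support vector machine approach is used: a semantic-context structured function and a loss function are constructed from the joint characteristics of the shot-semantic sequence and the video semantic context label tree, which yields a structured analysis of shot semantics. Experimental results show that the video semantic context label tree represents temporal order, hierarchy, domain, and logic well, and that the structural-SVM-based analysis method performs well in precision, recall, and F1 score for shot-semantic context analysis.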

3.
History-independent data structures, presented by Micciancio, are data structures that possess a strong security property: even if an intruder manages to get a copy of the data structure, the memory layout of the structure yields no additional information on the history of operations applied on the structure beyond the information obtainable from the content itself. Naor and Teague proposed a stronger notion of history independence in which the intruder may break into the system several times without being noticed and still obtain no additional information from reading the memory layout of the data structure. An open question posed by Naor and Teague is whether these two notions are equally hard to obtain. In this paper we provide a separation between the two requirements for comparison-based algorithms. We show very strong lower bounds for obtaining the stronger notion of history independence for a large class of data structures, including, for example, the heap and the queue abstract data structures. We also provide complementary upper bounds showing that the heap abstract data structure may be made weakly history independent in the comparison-based model without incurring any additional (asymptotic) cost on any of its operations. (A similar result is easy for the queue.) Thus, we obtain the first separation between the two notions of history independence. The gap we obtain is exponential: some operations may be executed in logarithmic time (or even in constant time) with the weaker definition, but require linear time with the stronger definition.

4.
In this paper, we propose a feature-based Korean grammar utilizing learned constraint rules in order to improve parsing efficiency. The proposed grammar consists of feature structures, feature operations, and constraint rules, and it has the following characteristics. First, a feature structure includes several features to express useful linguistic information for Korean parsing. Second, a feature operation generating a new feature structure is restricted to the binary-branching form, which can deal with Korean properties such as variable word order and constituent ellipsis. Third, constraint rules improve efficiency by preventing feature operations from generating spurious feature structures. Moreover, these rules are learned from a Korean treebank by a decision tree learning algorithm. The experimental results show that the feature-based Korean grammar can reduce the number of candidates by up to a third and that it runs 1.5 to 2 times faster than a CFG on a statistical parser.

5.
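Feature-type analysis based on information theory in statistical parsing modeling   Total citations: 2, self-citations: 0, citations by others: 2
Statistical parsing uses a probabilistic evaluation model to assess the likelihood of each candidate parse tree and selects the candidate with the highest probability as the final parse. The core of statistical parsing is therefore a probabilistic evaluation model, and the essential difference among such models lies in which contextual features they use to assign probabilities to parse trees. Although many probabilistic evaluation models have been proposed in statistical parsing research, different models use different types of features; how should the contribution of these feature types to parsing be evaluated? To address this question, this study proposes a feature-type analysis model for statistical parsing that quantifies, from an information-theoretic perspective, how well different types of contextual features predict syntactic structure. The basic idea is to use entropy and conditional entropy to show whether a feature type captures the main information for predicting syntactic structure. If the uncertainty (entropy) of the current syntactic structure drops markedly once a feature type is added, that feature type is considered to have captured some of the main contextual information that affects syntactic structure. The information-theoretic model uses four measures (predictive information quantity, predictive information gain, predictive information relevance, and total predictive information) to quantify, from different angles, the predictive effect of individual feature types and combinations of feature types on the current target. Using the Penn TreeBank as the training set, the experiments systematically quantify how different contextual feature types predict parsing rules and yield a series of conclusions about the predictive effect of different feature types and their combinations on syntactic structure.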
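The entropy-drop idea in this abstract can be sketched with the textbook definitions of entropy H(Y) and conditional entropy H(Y|X). This is a minimal sketch, not the paper's model: the four measures it names are not defined in the abstract, so the code computes only the standard difference H(Y) - H(Y|X), and the function names and toy data are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(Y), in bits, of a list of observed labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def conditional_entropy(features, labels):
    """H(Y | X): entropy of the labels within each feature value,
    weighted by how often that feature value occurs."""
    total = len(labels)
    groups = {}
    for x, y in zip(features, labels):
        groups.setdefault(x, []).append(y)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def information_gain(features, labels):
    """Drop in label uncertainty once the feature is known."""
    return entropy(labels) - conditional_entropy(features, labels)


# Toy data (made up): the syntactic rule to predict, and two candidate
# context features observed alongside it.
rules = ["NP", "NP", "VP", "VP"]
informative = ["det", "det", "verb", "verb"]   # determines the rule
uninformative = ["a", "b", "a", "b"]           # independent of the rule
```

On this toy data the informative feature removes all uncertainty about the rule, while the uninformative one removes none, which is the contrast the abstract uses to judge whether a feature type captures the main predictive information.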

6.
In this paper I look at Fred Dretske’s account of information and knowledge as developed in Knowledge and the Flow of Information. In particular, I translate Dretske’s probabilistic definition of information to a modal logical framework and subsequently use this to explicate the conception of information and its flow which is central to his account, including the notions of channel conditions and relevant alternatives. Some key products of this task are an analysis of the issue of information closure and an investigation into some of the logical properties of Dretske’s account of information flow.

7.
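In research on large-scale, high-dimensional data mining, the effectiveness of data storage and indexing methods is an important factor in the time and space efficiency of algorithms. Combining a data-space grid partitioning strategy with an efficient tree index structure exploits the combined strengths of both in organizing data and turns a complex problem into a structured, simple, repetitive one. The paper defines the various kinds of data-space grid partitioning within a unified framework and discusses two index structures suitable for indexing gridded data, the R-tree and the PK-tree. Experimental results show that the PK-tree is more efficient for data storage and indexing and that, combined with grid-based data organization, it is significant for reducing the time and space complexity of large-scale, high-dimensional data analysis problems.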

8.
Tree species information is crucial for forest ecology and management, and the development of efficient techniques for tree species classification has long been highlighted. In order to fulfil this task, a large variety of remote-sensing technologies have been attempted. Static terrestrial laser scanning (TLS) is a representative case, which has proved to be capable of deriving explicit tree structure feature parameters (ETSPs) and has been primarily validated for tree species classification. However, in practice, for each forest plot mapped by TLS, this kind of ETSP-based solution can only work for the first circle layer of individual trees surrounding the TLS system, because the trees at the outer circle layers tend to show incomplete crown representations due to the effect of laser obscuration. This adverse circumstance may even occur in TLS-based inventory in the multi-scan mode. To break through this restriction, this study focused on tree stems, which tend to be more readily mapped by TLS in the complicated forest environment, and then their comparatively complete forms were used to comprehensively derive primarily stem-related feature parameters (SRPs) for distinguishing different tree species. Specifically, in this study 14 SRPs were proposed, mainly based on stem structure and surface texture characteristics. Based on a Support Vector Machine (SVM) classifier, the classification was operated in the leave-one-out cross-validation (LOOCV) mode. In the case of four typical boreal tree species, that is, Picea abies, Pinus sylvestris, Populus tremula, and Quercus robur, tests showed that the optimal total classification accuracy reached 71.93%. Given that tree stems generally display fewer features than crowns, the result is acceptable.
Overall, the positive results have validated the strategy of fulfilling TLS-based tree species classification by deriving predominantly stem-related feature parameters, and this, in a broad sense, can expand the effective range of TLS on forest ecological studies.

9.
Use of high-dimensional feature spaces in a system has standard problems that must be addressed, such as high calculation costs, storage demands, and training requirements. To partially circumvent this problem, we propose the conjunction of the very high-dimensional feature space and image patches. This union allows for the image patches to be efficiently represented as sparse vectors while taking advantage of the high-dimensional properties. The key to making the system perform efficiently is the use of a sparse histogram representation for the color space, which makes the calculations largely independent of the feature space dimension. The system can operate under multiple L_p norms or mixed metrics, which allows for optimized metrics for the feature vector. An optimal tree structure is also introduced for the approximate nearest neighbor tree to aid in patch classification. It is shown that the system can be applied to various applications and used effectively.

10.
Emerging database applications require the use of new indexing structures beyond B-trees and R-trees. Examples are the k-D tree, the trie, the quadtree, and their variants. They are often proposed as supporting structures in data mining, GIS, and CAD/CAM applications. A common feature of all these indexes is that they recursively divide the space into partitions. A new extensible index structure, termed SP-GiST, is presented that supports this class of data structures, mainly the class of space partitioning unbalanced trees. Simple method implementations are provided that demonstrate how SP-GiST can behave as a k-D tree, a trie, a quadtree, or any of their variants. Issues related to clustering tree nodes into pages as well as concurrency control for SP-GiST are addressed. A dynamic minimum-height clustering technique is applied to minimize disk accesses and to make using such trees in database systems possible and efficient. A prototype implementation of SP-GiST is presented as well as performance studies of the various SP-GiST's tuning parameters.

11.
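Research on a method for converting VHDL descriptions to Petri nets in high-level synthesis   Total citations: 1, self-citations: 0, citations by others: 1
An execution-path-based Petri net generation algorithm is proposed. The algorithm extracts the functional and timing information from the VHDL source description and generates a Petri net structure fully equivalent to the source description. It stores conditions in a condition tree, from which both statement execution conditions and Petri net transition conditions are generated. The generated Petri net accurately preserves the I/O timing information of the source description, which forms the basis for handling I/O operations during scheduling. Starting from this structure, scheduling for various I/O modes can be implemented conveniently.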

12.
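Research on the application of the depth-first algorithm in creating tree structures   Total citations: 1, self-citations: 0, citations by others: 1
Tang Qingsong (唐青松), 《微机发展》 (Microcomputer Development), 2014, (9): 226-229
To let a software system manage tree structures flexibly, the paper improves on schemes proposed by other researchers for generating dynamic tree structures. Node information is stored in a self-referencing data table, and definitions are given for node types in this stored form, such as parent nodes, sibling nodes, and leaf nodes. A non-recursive depth-first algorithm extracts the node information and sorts the nodes in tree order; the tree structure is then generated from the sorted result and the node types, giving an unlimited-level dynamic tree with good portability, extensibility, and maintainability. Finally, the dynamic tree is embedded in a school management system; experiments show that with this tree structure the system has a well-structured interface, a clear information hierarchy, and simple user operation.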
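The approach described in this abstract (rows in a self-referencing table, ordered by a non-recursive depth-first traversal) can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the row layout `(node_id, parent_id, name)`, the use of `None` as the parent of a root, and the function name `build_tree_order` are all hypothetical.

```python
from collections import defaultdict


def build_tree_order(rows):
    """Order self-referencing table rows depth-first without recursion.

    `rows` is an iterable of (node_id, parent_id, name) tuples, where
    parent_id is None for a root (an assumed schema). Returns a list of
    (depth, node_id, name, is_leaf) tuples in pre-order, which is enough
    to render an indented tree of arbitrary depth.
    """
    children = defaultdict(list)
    names = {}
    for node_id, parent_id, name in rows:
        names[node_id] = name
        children[parent_id].append(node_id)

    ordered = []
    # An explicit stack replaces recursion. Nodes are pushed in reverse
    # so that siblings are popped in their original table order.
    stack = [(root, 0) for root in reversed(children[None])]
    while stack:
        node_id, depth = stack.pop()
        kids = children.get(node_id, [])
        ordered.append((depth, node_id, names[node_id], not kids))
        for kid in reversed(kids):
            stack.append((kid, depth + 1))
    return ordered


# Made-up rows standing in for the self-referencing table.
rows = [
    (1, None, "School"),
    (2, 1, "Dept A"),
    (3, 1, "Dept B"),
    (4, 2, "Class A1"),
]
for depth, node_id, name, is_leaf in build_tree_order(rows):
    print("  " * depth + name + ("" if is_leaf else "/"))
```

Because the traversal keeps its own stack, the depth of the tree is limited by memory rather than by the interpreter's recursion limit, which fits the abstract's goal of an unlimited-level tree.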

13.
14.
15.
16.
Deferred, Self-Organizing BSP Trees   Total citations: 1, self-citations: 0, citations by others: 1

17.
We consider the framework of regular tree model checking where sets of configurations of a system are represented by regular tree languages and its dynamics is modeled by a term rewriting system (or a regular tree transducer). We focus on the computation of the reachability set R*(L) where R is a regular tree transducer and L is a regular tree language. The construction of this set is not possible in general. Therefore, we present a general acceleration technique, called regular tree widening, which speeds up the convergence of iterative fixpoint computations in regular tree model checking. This technique can be applied uniformly to various kinds of transformations. We show the application of our framework to different analysis contexts: verification of parameterized tree networks and data-flow analysis of multithreaded programs. Parameterized networks are modeled by relabeling tree transducers, and multithreaded programs are modeled by term rewriting rules encoding transformations on control structures. We prove that our widening technique can emulate many existing algorithms for special classes of transformations and we show that it can deal with transformations beyond the scope of these algorithms.

18.
This paper addresses the problem of reliability growth characterization and analysis. It is intended to show how reliability trend analyses can help the project manager in controlling the progress of the development activities and in appreciating the efficiency of the test programs. Reliability trend changes may result from various causes, some of them desirable and expected (such as reliability growth due to fault removal) and some of them undesirable (such as a slowdown in testing effectiveness). Timely identification of the latter allows the project manager to take the appropriate decisions very quickly in order to avoid problems which may manifest later. The notions of reliability growth over a given interval and local reliability trend change are introduced through the subadditive property, allowing better definition and understanding of the reliability growth phenomena; the already existing trend tests are then revisited using these concepts. Emphasis is put on the way trend tests can be used to help the management of the testing and validation process and on practical results that can be derived from their use; it is shown that, in several circumstances, trend analyses give information of prime importance to the developer.

19.
This paper discusses learning techniques based upon the hierarchical censored production rules (HCPRs) system of knowledge representation. These HCPRs are written in the form: “A IF B UNLESS C GENERALITY G SPECIFICITY S,” where symbol A represents the conclusion, B is the set of preconditions, C is the set of exception conditions, G is the general information, while S represents the specific information. Learning can be classified into two major categories: the first includes the restructuring or modification of existing knowledge, and the second covers the creation of new knowledge depending upon externally supplied information and already acquired knowledge. In this system, schemes which modify various belief factors and information relegated to various operators (like IF, UNLESS, etc.) of an HCPR fall in the first category, while schemes which create a new HCPR in the system by using externally supplied information and already acquired knowledge fall in the second category. Using the growth algorithm, a new HCPR is added in the system by maintaining consistency as well as minimizing redundancy. The set of all related HCPRs connected to the SPECIFICITY or GENERALITY operators are shown to possess a tree structure, and hence it is given the name HCPRs tree. The fission algorithm restructures an HCPRs tree, thereby enabling the system to reorganize its knowledge base; a new HCPR may be created during this process. This is followed by the fusion algorithm that enables the merging of two related HCPRs trees in the HCPRs system. © 1998 John Wiley & Sons, Inc.

20.
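In bird sound recognition, the choice of acoustic features strongly affects classification accuracy. To improve recognition accuracy, and because the traditional Mel-frequency cepstral coefficients (MFCC) represent the high-frequency information of bird sounds poorly, the paper proposes fusing MFCC with inverted Mel-frequency cepstral coefficients (IMFCC) based on the Fisher criterion. The resulting feature parameter, MFCC-IMFCC, is applied to bird sound recognition to better represent high-frequency information. In addition, a genetic algorithm (GA) optimizes the penalty factor C and kernel parameter g of the support vector machine (SVM), yielding a GA-SVM classification model. Experiments show that, under the same conditions, MFCC-IMFCC improves the recognition rate to some extent compared with MFCC.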
