Similar Documents
19 similar documents found.
1.
This paper explores a tree kernel-based method for semantic role labeling (SRL) of Chinese nominal predicates via a convolution tree kernel. In particular, a new parse tree representation, called the dependency-driven constituent parse tree (D-CPT), is proposed to combine the advantages of both constituent and dependency parse trees. This is achieved by directly representing various kinds of dependency relations in a CPT-style structure, which uses dependency relation types instead of phrase labels in the constituent parse tree (CPT). In this way, D-CPT not only keeps the dependency relationship information of the dependency parse tree (DPT) but also retains the basic hierarchical structure of the CPT. Moreover, several schemes are designed to extract various kinds of necessary information from D-CPT, such as the shortest path between the nominal predicate and the argument candidate, the support verb of the nominal predicate, and the head argument modified by the argument candidate. This largely reduces the noisy information inherent in D-CPT. Finally, a convolution tree kernel is employed to compute the similarity between two parse trees. In addition, we implement a feature-based method on D-CPT. Evaluation on the Chinese NomBank corpus shows that our tree kernel-based method on D-CPT performs significantly better than other tree kernel-based methods and achieves performance comparable to state-of-the-art feature-based methods. This indicates the effectiveness of the novel D-CPT structure in representing various kinds of dependency relations in a CPT-style structure, and of our tree kernel-based method in exploiting it. It also shows that kernel-based methods are competitive with, and complementary to, feature-based methods for SRL.
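The convolution tree kernel at the heart of this line of work can be sketched as the classic subtree-counting kernel (Collins-Duffy style). The toy `(label, [children])` tuple encoding below is our own illustration, not the paper's D-CPT format:

```python
# Minimal sketch of a convolution tree kernel in the Collins-Duffy style:
# K(T1, T2) sums, over all node pairs, a weighted count of common subtrees,
# damped by a decay factor lam.  Trees are (label, [children]) tuples;
# this toy encoding is illustrative, not the paper's D-CPT format.

def production(node):
    label, children = node
    return (label, tuple(child[0] for child in children))

def subtree_match(n1, n2, lam):
    """Weighted count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    children1, children2 = n1[1], n2[1]
    if not children1:            # matching leaves
        return lam
    score = lam
    for c1, c2 in zip(children1, children2):
        score *= 1.0 + subtree_match(c1, c2, lam)
    return score

def all_nodes(tree):
    yield tree
    for child in tree[1]:
        yield from all_nodes(child)

def tree_kernel(t1, t2, lam=1.0):
    return sum(subtree_match(a, b, lam)
               for a in all_nodes(t1) for b in all_nodes(t2))

toy = ("NP", [("D", [("the", [])]), ("N", [("dog", [])])])
print(tree_kernel(toy, toy))    # prints 15.0
```

The decay factor `lam` keeps large subtrees from dominating the kernel value; in practice it is tuned on held-out data.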

2.
Motion deblurring is a basic problem in image processing and analysis. This paper proposes a new single-image blind deblurring method that improves both kernel estimation and non-blind deconvolution. Experiments show that fine image details corrupt the structure of the estimated kernel, especially when the blur kernel is large, so we extract an image structure with salient edges using an RTV-based method. In addition, traditional motion blur kernel estimation based on sparse priors does yield a sparse blur kernel, but such priors do not ensure the continuity of the kernel and sometimes produce noisy estimates. We therefore propose an L0-based kernel refinement method to overcome these shortcomings. For non-blind deconvolution we adopt an L1/L2 regularization term. Compared with traditional methods, the L1/L2-norm formulation adapts better to image structure, and the resulting energy functional better describes the sharp image. An effective algorithm based on alternating minimization is presented for this model.
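The intuition behind L1/L2 regularization is that blur spreads gradient energy, which raises the L1/L2 ratio of the image gradients, so minimizing the ratio favors sharp images. A 1D toy illustration of this measure (our own sketch, not the paper's energy functional):

```python
import math

def l1_l2_ratio(signal):
    """Normalized sparsity (L1/L2) of the finite-difference gradient.
    Sharp edges concentrate gradient energy, giving a lower ratio."""
    grad = [b - a for a, b in zip(signal, signal[1:])]
    l1 = sum(abs(g) for g in grad)
    l2 = math.sqrt(sum(g * g for g in grad))
    return l1 / l2 if l2 > 0 else 0.0

sharp   = [0, 0, 0, 5, 5, 5]          # one sharp step edge
blurred = [0, 0, 1.25, 3.75, 5, 5]    # the same edge, smeared by blur

print(l1_l2_ratio(sharp), l1_l2_ratio(blurred))
```

Because the sharp signal has the lower ratio, an L1/L2 term in the energy functional pushes the deconvolution toward sharp solutions.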

3.
This paper presents a novel method to study Linux kernel evolution using complex networks. Having investigated the node degree distribution and average path length of the call graphs corresponding to the kernel modules of 223 different versions (V1.1.0 to V2.4.35), we found that the call graphs are scale-free, small-world networks. Based on the relationship between average path length and the number of nodes, we propose a method to find unusual points during Linux kernel evolution using the slope of the average path length. Using these unusual points we identify major structural changes in kernel modules. A stability coefficient is also proposed to quantitatively describe the stability of kernel modules during evolution. Finally, we verify our results with Vasa's metrics method.
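The average path length statistic driving this method can be computed with a plain breadth-first search; a stdlib sketch on two toy "versions" of a call graph (the graphs here are illustrative, not the kernel data):

```python
from collections import deque

def avg_path_length(adj):
    """Mean shortest-path distance over all connected node pairs,
    computed by BFS from every node (adj: node -> neighbor list)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for node, d in dist.items():
            if node != src:
                total += d
                pairs += 1
    return total / pairs

# A large jump in the slope of this statistic between successive
# versions is what flags a structural change in a module.
v1 = {0: [1], 1: [0, 2], 2: [1]}          # chain 0-1-2
v2 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}    # triangle: an edge was added
slope = avg_path_length(v2) - avg_path_length(v1)
print(slope)   # negative: the new edge shortened average paths
```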

4.
Kernel selection is one of the key issues in both recent research on and applications of kernel methods. It is usually done by minimizing either an estimate of the generalization error or some other related performance measure. Using notions of stability to estimate the generalization error has attracted much attention in recent years. Unfortunately, the existing notions of stability, proposed to derive theoretical generalization error bounds, are difficult to use for kernel selection in practice. It is well known that the kernel matrix contains most of the information needed by kernel methods, and its eigenvalues play an important role. We therefore introduce a new notion of stability, called spectral perturbation stability, to study the kernel selection problem. This stability quantifies the spectral perturbation of the kernel matrix with respect to changes in the training set. We establish the connection between spectral perturbation stability and the generalization error. By minimizing the derived generalization error bound, we propose a new kernel selection criterion that guarantees good generalization properties. In our criterion, the perturbation of the eigenvalues of the kernel matrix is efficiently computed by solving the derivative of a newly defined generalized kernel matrix. Both theoretical analysis and experimental results demonstrate that our criterion is sound and effective.
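By the Hoffman-Wielandt inequality, the total eigenvalue movement of a symmetric kernel matrix is bounded by the Frobenius norm of the matrix perturbation, which gives a cheap proxy for the spectral perturbation the paper studies. The RBF kernel and the single-point perturbation below are our own illustrative choices, not the paper's generalized kernel matrix construction:

```python
import math

def rbf(x, y, gamma=1.0):
    return math.exp(-gamma * (x - y) ** 2)

def gram(xs, gamma=1.0):
    return [[rbf(a, b, gamma) for b in xs] for a in xs]

def frobenius_perturbation(xs, i, new_value, gamma=1.0):
    """||K - K'||_F when training point i is replaced by new_value.
    Hoffman-Wielandt: sum_j (lambda_j - lambda'_j)^2 <= ||K - K'||_F^2,
    so this bounds the spectral perturbation of the kernel matrix."""
    K = gram(xs, gamma)
    ys = list(xs)
    ys[i] = new_value
    K2 = gram(ys, gamma)
    n = len(xs)
    return math.sqrt(sum((K[r][c] - K2[r][c]) ** 2
                         for r in range(n) for c in range(n)))

data = [0.0, 0.5, 1.0, 1.5]
print(frobenius_perturbation(data, 0, 0.0))   # unchanged point: 0.0
print(frobenius_perturbation(data, 0, 5.0))   # large change: clearly > 0
```

A kernel whose Gram matrix moves little under such training-set changes is, in this sense, spectrally stable.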

5.
Predicting the response variables of a target dataset is one of the main problems in machine learning. Predictive models should perform satisfactorily across a broad range of target domains, but that may not be possible if there is a mismatch between the source and target domain distributions. Domain adaptation algorithms aim to solve this issue and deploy a model across different target domains. We propose a method based on kernel distribution embedding and the Hilbert-Schmidt independence criterion (HSIC) to address this problem. The proposed method embeds both source and target data into a new feature space with two properties: 1) the distributions of the source and target datasets are as close as possible in the new feature space, and 2) the important structural information of the data is preserved. The embedded data can lie in a lower-dimensional space while preserving these properties, so the method can also be considered a dimensionality reduction method. Our method has a closed-form solution, and experimental results show that it works well in practice.
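The empirical HSIC used in such objectives is computed from centered Gram matrices as tr(K̄L̄)/(n-1)². A stdlib sketch with linear kernels on scalar data (the paper's actual kernels and embedding are not specified here):

```python
def gram_linear(v):
    """Linear-kernel Gram matrix of a list of scalars."""
    return [[a * b for b in v] for a in v]

def center(K):
    """Double-center a Gram matrix: Kc = H K H with H = I - 11^T/n."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + total for j in range(n)]
            for i in range(n)]

def hsic(x, y):
    """Empirical HSIC estimator: trace(Kc Lc) / (n - 1)^2."""
    n = len(x)
    Kc, Lc = center(gram_linear(x)), center(gram_linear(y))
    return sum(Kc[i][j] * Lc[j][i]
               for i in range(n) for j in range(n)) / (n - 1) ** 2

x = [1.0, 2.0, 3.0, 4.0]
print(hsic(x, x))             # strong dependence: clearly positive
print(hsic(x, [7.0] * 4))     # constant y carries no information: 0.0
```

HSIC is zero (in expectation, for characteristic kernels) exactly when the two variables are independent, which is why it can serve both to align distributions and to preserve dependence on labels.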

6.
Dynamically simplifying and recombining model data is important for the rapid visualization of large-scale forest scenes. To preserve the geometric features and visual perception of tree models, this paper presents a real-time information recombination method for complex 3D tree models based on visual perception. The method adopts a visual attention model and the visual characteristics of tree structures, then uses geometry-based and image-based methods to simplify tree models and construct a hybrid representation model. The hybrid representation reflects the visual perception features of 3D tree models and can embody topological semantics in dynamic simulation. In addition, the method automatically extracts the representation information of a 3D tree model based on visual perception, and recombines model information in real time according to the dynamic viewpoint of the virtual scene. Finally, the method is applied to the simplification of different tree models and compared with existing tree model simplification methods. Experimental results show that it not only better preserves the visual perception of 3D tree models, but also effectively reduces the geometric data of forest scenes and improves rendering efficiency.

7.
Object-oriented modeling with declarative, equation-based languages often unintentionally leads to structural inconsistencies. Component-based debugging is a new structural analysis approach that addresses this problem by analyzing the structure of each component in a model to locate faulty components separately. The analysis is performed recursively, depth-first. It first generates fictitious equations for a component to establish a debugging environment, and then detects structural defects by applying graph-theoretical approaches to the structure of the system of equations resulting from the component. The proposed method can automatically locate the components that cause structural inconsistencies and show the user detailed error messages. This information is a great help in finding and localizing structural inconsistencies, and in some cases pinpoints them immediately.
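A standard graph-theoretical realization of this kind of structural check is bipartite matching between equations and the variables they mention: if no perfect matching of equations to variables exists, the system is structurally singular, and the unmatched equations localize the defect. A stdlib sketch under that interpretation (the paper's exact algorithm may differ):

```python
def max_matching(eq_vars):
    """Augmenting-path bipartite matching of equations to variables.
    eq_vars: {equation: set of variables it mentions}.
    Returns the set of equations that could be matched."""
    match = {}  # variable -> equation currently assigned to it

    def try_assign(eq, seen):
        for var in eq_vars[eq]:
            if var in seen:
                continue
            seen.add(var)
            # Take a free variable, or evict and re-route its holder.
            if var not in match or try_assign(match[var], seen):
                match[var] = eq
                return True
        return False

    return {eq for eq in eq_vars if try_assign(eq, set())}

# A structurally inconsistent component: e1 and e2 both constrain only x,
# so one of them can never be matched -- that equation flags the defect.
system = {"e1": {"x"}, "e2": {"x"}, "e3": {"y", "z"}}
unmatched = set(system) - max_matching(system)
print(unmatched)   # the over-constraining equation
```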

8.
Motion deblurring is one of the basic problems in image processing. This paper summarizes the mathematical basis of previous work and presents a deblurring method that improves the estimation of the motion blur kernel and obtains better results than traditional methods. Experiments show that the motion blur kernel loses some important and useful properties during estimation, which may lead to a poor estimate and increase ringing artifacts. Considering that the kernel is produced by the motion of the imaging sensor during exposure and thus traces that motion, this paper enforces the physical properties of the kernel, such as its continuity and centering, during the iterative process. By adding a post-processing step to the kernel estimation, we remove isolated points and re-center the kernel to make the estimate more accurate. Experiments show that this post-processing improves the kernel estimate and produces better results with clear edges.
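The post-processing step described above can be sketched as follows: suppress weak entries (which in practice are isolated noise specks) and renormalize so the kernel conserves energy. The threshold and the toy kernel are our own illustrative choices:

```python
def refine_kernel(kernel, threshold_ratio=0.05):
    """Zero out entries below a fraction of the peak (isolated noise),
    then renormalize so the blur kernel sums to 1 (energy conservation)."""
    peak = max(max(row) for row in kernel)
    cleaned = [[v if v >= threshold_ratio * peak else 0.0 for v in row]
               for row in kernel]
    total = sum(sum(row) for row in cleaned)
    return [[v / total for v in row] for row in cleaned]

# A short motion streak plus one spurious noise speck in the corner.
estimated = [
    [0.001, 0.0, 0.0, 0.0],
    [0.0,   0.3, 0.4, 0.3],
    [0.0,   0.0, 0.0, 0.0],
    [0.0,   0.0, 0.0, 0.0],
]
refined = refine_kernel(estimated)
print(refined[0][0])   # speck removed: 0.0
```

A fuller implementation would also keep only the largest connected component (continuity) and shift the kernel so its centroid sits at the center of the support.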

9.
Reconfigurable SRAM-based FPGAs are highly susceptible to radiation-induced single-event upsets (SEUs) in space applications. A bit flip in an FPGA's configuration memory may permanently alter the user circuit unless the bitstream is repaired, a phenomenon completely different from upsets in traditional memory devices. To understand the impact of this effect, it is important to find the relationship between each programmable resource and its corresponding control bit. In this paper, a method is proposed to decode the bitstream of Xilinx FPGAs, and an analysis program is developed to parse the netlist of a specific design and obtain the configuration state of the occupied programmable logic and routing. An SEU propagation rule is then established according to resource type to identify critical logic nodes and paths that could destroy the circuit's topological structure. The decoded relationship is stored in a database, which is queried to obtain the sensitive bits of a specific design. The result can be used to represent the vulnerability of the system and to predict the on-orbit system failure rate. The analysis tool was validated through fault injection and accelerator irradiation experiments.

10.
The topological characteristics of an IEEE 802.16 mesh network, including the tree's depth and the degree of its nodes, affect the delay and throughput of the network. To reach the desired trade-off between delay and throughput, all potential trees should be explored to obtain a tree with the proper topology. Since the number of tree topologies extractable from a given network graph is enormous, we use a genetic algorithm (GA) to explore the search space and find a good enough trade-off between per-node as well as network-wide delay and throughput. In the proposed GA approach, we use the Pruefer-code tree representation together with novel genetic operators. First, for each individual tree topology, we derive analytical expressions for per-node delay and throughput. Based on the required quality of service, these expressions are used in the fitness functions of the genetic approach. With a proper fitness function, the proposed algorithm can find the intended trees while different constraints on the delay and throughput of each node are imposed. The GA approach explores this extremely wide search space in a reasonably short time, which makes the proposed tree exploration algorithm scalable and accurate.
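The Pruefer-code representation used by the GA maps every labeled tree on n nodes to a sequence of n-2 labels, so genetic operators can work on flat sequences and always decode back to a valid tree. The decoding step is the standard algorithm (stdlib only):

```python
import heapq

def pruefer_to_edges(seq):
    """Decode a Pruefer sequence over labels 1..n (n = len(seq) + 2)
    into the edge list of the corresponding labeled tree."""
    n = len(seq) + 2
    degree = {i: 1 for i in range(1, n + 1)}
    for label in seq:
        degree[label] += 1
    leaves = [i for i in range(1, n + 1) if degree[i] == 1]
    heapq.heapify(leaves)
    edges = []
    for label in seq:
        leaf = heapq.heappop(leaves)       # smallest current leaf
        edges.append((leaf, label))
        degree[label] -= 1
        if degree[label] == 1:
            heapq.heappush(leaves, label)
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges

print(pruefer_to_edges([4, 4]))   # star centered at 4: [(1, 4), (2, 4), (3, 4)]
```

Because every length-(n-2) sequence decodes to a tree, crossover and mutation on sequences never produce invalid topologies.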

11.
This paper proposes a novel tree kernel-based method with rich syntactic and semantic information for extracting semantic relations between named entities. Given a parse tree and an entity pair, we first construct a rich semantic relation tree structure that integrates both syntactic and semantic information. We then propose a context-sensitive convolution tree kernel, which enumerates both context-free and context-sensitive sub-trees by considering the paths of their ancestor nodes as their contexts, capturing structural information in the tree structure. An evaluation on the Automatic Content Extraction/Relation Detection and Characterization (ACE RDC) corpora shows that the proposed tree kernel-based method outperforms other state-of-the-art methods.

12.
Program plagiarism detection is the task of detecting plagiarized code pairs among a set of source codes. In this paper, we propose a code plagiarism detection system that uses a parse tree kernel. Our parse tree kernel calculates a similarity value between two source codes in terms of their parse tree similarity. Since parse trees contain the essential syntactic structure of source code, the system effectively handles structural information. The contributions of this paper are two-fold. First, we propose a parse tree kernel optimized for program source code; evaluation shows that our system based on this kernel outperforms well-known baseline systems. Second, we collected a large number of real-world Java source codes from a university programming class. This test set was manually analyzed and tagged by two independent human annotators to mark plagiarized codes, and it can be used to evaluate the performance of various detection systems in real-world environments. Experiments with the test set show that our plagiarism detection system reaches 93% of the level of the human annotators.

13.
A table is a well-organized and summarized knowledge representation for a domain, so it is of great importance to extract information from tables. However, many tables in Web pages are used not to convey information but to decorate pages. One of the most critical tasks in Web table mining is thus to discriminate meaningful tables from decorative ones. The main obstacle is the difficulty of generating relevant features for discrimination. This paper proposes a novel discrimination method using a composite kernel that combines parse tree kernels and a linear kernel. Because a Web table is represented as a parse tree by an HTML parser, it is natural to represent the structural information of a table as a parse tree. In this paper, two types of parse trees are used to represent the structural information within and around a table. These two trees define the structure kernel that handles the structural information of tables, while the contents of a Web table are handled by a linear kernel with content features. Support vector machines with the composite kernel distinguish meaningful tables from decorative ones with high accuracy. A series of experiments shows that the proposed method achieves state-of-the-art performance.

14.
Chinese Semantic Relation Extraction Based on a Unified Syntactic and Entity Semantic Tree
This paper proposes a convolution tree kernel-based method for Chinese entity semantic relation extraction. By adding entity semantic information, such as entity type, reference type, and GPE role, to the structured information of a relation instance, the method constructs a unified syntactic and entity semantic relation tree that effectively captures both structured syntactic information and entity semantic information, thereby improving Chinese semantic relation extraction. Experiments on relation detection and relation extraction on the ACE RDC 2005 Chinese benchmark corpus show that the method significantly improves Chinese semantic relation extraction, with a best F-score of 67.0 for the major relation types, indicating that structured syntactic information and entity semantic information are complementary in Chinese semantic relation extraction.

15.
This paper explores tree kernel-based Chinese semantic role classification, focusing on how to obtain effective structured-information features. On the basis of a minimal syntactic tree structure, and according to the characteristics of semantic role classification, three further syntactic structures are defined, and a composite kernel is used to combine the tree kernel-based and feature-based methods. Results on the Chinese PropBank corpus show that the tree kernel-based method achieves good results on Chinese semantic role classification, with a precision of 91.79%. Moreover, when combined with the feature-based method, the tree kernel-based method further improves its performance, reaching a precision of 94.28% and outperforming comparable systems.

16.
Research on Kernel-Based Chinese Entity Relation Extraction
Named entity relation extraction is one of the important research topics in information extraction. This paper investigates the effectiveness of kernel methods for Chinese relation extraction in three parts: it studies the effect of using different syntactic trees in the convolution tree kernel on relation extraction performance; it examines the complementarity between tree kernels and flat kernels by constructing composite kernels; and it improves the shortest path dependency kernel by computing the kernel on the longest common subsequence of the original shortest dependency paths, removing the original kernel's overly strict requirement that the two dependency paths have the same length. When kernel methods were first applied to English relation extraction, the F1-score was only about 40%; our experiments on the ACE 2007 standard corpus show that, using only the convolution kernel on syntactic trees, Chinese relation extraction reaches an F1-score of 35%, indicating that convolution kernel methods are also effective for Chinese relation extraction, while the shortest path dependency kernel shows no clear benefit for Chinese.

17.
Because the business scenarios of Java Web applications are complex and the structural validity requirements on input data are high, existing testing methods and tools suffer from a low rate of valid test cases when testing Java Web applications. To solve this problem, this paper proposes a parse tree-based grey-box fuzzing method for Java Web applications. It first models the syntax of a Java Web application's input packets to build a parse tree, distinguishing delimiters from data blocks, and attaches a seed pool to each leaf node of the parse tree. By isolating the individual data blocks of test cases and generating inputs that conform to the application's business format through packet concatenation, the rate of valid test cases is improved. To retain high-quality data blocks, each data block seed is assigned its own weight during testing according to the execution feedback of the program under test. To reach deep paths, data block seed features are extracted from the corresponding seed pools based on conditional-probability learning. We implemented PTreeFuzz, a parse tree-based grey-box fuzzing system for Java Web applications. Test results show that the system achieves better testing accuracy than existing tools.
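The delimiter/data-block decomposition at the core of this method can be illustrated on a toy query string; the delimiters, seed pool, and mutation below are our own illustrative stand-ins for PTreeFuzz's parse tree model:

```python
import re

def split_blocks(packet, delimiters="&="):
    """Split an input into alternating data blocks and delimiter tokens,
    so each data block can get its own seed pool and be mutated in isolation."""
    pattern = "([" + re.escape(delimiters) + "])"
    return [tok for tok in re.split(pattern, packet) if tok != ""]

def mutate_block(tokens, index, seed_pool):
    """Replace one data block with a seed while keeping every delimiter
    intact, so the recombined input still matches the expected format."""
    mutated = list(tokens)
    mutated[index] = seed_pool[0]
    return "".join(mutated)

tokens = split_blocks("user=alice&role=admin")
print(tokens)                              # delimiters preserved as tokens
print(mutate_block(tokens, 2, ["bob"]))    # user=bob&role=admin
```

Keeping delimiters fixed while swapping only data blocks is what lets a fuzzer generate structurally valid inputs instead of random byte noise.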

18.
This paper proposes an unsupervised Chinese entity relation extraction method based on a convolution tree kernel. The method uses the shortest path-enclosed tree as the structured representation of a relation instance, the convolution tree kernel as the tree similarity measure, and hierarchical clustering for unsupervised Chinese entity relation extraction. Unsupervised relation extraction experiments on the ACE RDC 2005 Chinese benchmark corpus show that the method achieves an F-score of up to 60.1, demonstrating that unsupervised Chinese entity relation extraction based on a convolution tree kernel is effective.

19.
FU Jian, KONG Fang. Computer Science (《计算机科学》), 2020, 47(3): 231-236.
With the rise and development of deep learning, more and more researchers have applied deep learning techniques to coreference resolution. However, existing neural coreference resolution models generally attend only to linear features of the text and ignore structural information that has proven very effective in traditional methods. Taking the best-performing neural model of Lee et al. as a baseline, this paper improves on it with the help of constituent parse trees: 1) it proposes a mention extraction strategy that enumerates the tree's nodes as candidate phrases, avoiding both the span length limit of brute-force enumeration and the noise introduced by phrase sets that violate syntactic rules; 2) it obtains a node sequence by traversing the tree and, combining features such as node height and path, directly builds a contextual representation of the constituent parse tree and integrates it into the model, avoiding the loss of structural information caused by using only character and word sequences. A series of experiments on the CoNLL 2012 Shared Task datasets shows that the model reaches an F1-score of 62.35 for Chinese coreference resolution and 67.24 for English, verifying that the proposed strategy for integrating structural information greatly improves coreference resolution performance.
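The node-as-phrase enumeration strategy in point 1) can be sketched by walking a constituent tree and emitting one token span per internal node; the toy tree encoding below is our own illustration:

```python
def node_spans(tree, start=0):
    """Collect (start, end, label) for every internal node of a constituent
    tree given as (label, [children]); leaves are plain word strings.
    Returns (spans, width) where width is the number of tokens covered."""
    if isinstance(tree, str):
        return [], 1                      # a leaf covers one token
    label, children = tree
    spans, width = [], 0
    for child in children:
        child_spans, child_width = node_spans(child, start + width)
        spans.extend(child_spans)
        width += child_width
    spans.append((start, start + width, label))
    return spans, width

tree = ("S", [("NP", ["I"]),
              ("VP", [("V", ["saw"]),
                      ("NP", [("D", ["the"]), ("N", ["cat"])])])])
spans, _ = node_spans(tree)
print(spans)   # one candidate mention span per tree node
```

Only spans that correspond to tree nodes become candidate mentions, which is exactly how this strategy filters out syntactically implausible phrases.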


Copyright © Beijing Qinyun Technology Development Co., Ltd. (北京勤云科技发展有限公司). 京ICP备09084417号