Found 19 similar documents (search time: 156 ms)
1.
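This paper designs a technical-term extraction algorithm that combines statistical and rule-based methods, and further processes the extracted terms to make specialized literature easier to read. By adding fields to the two-character and multi-character candidate entries in the statistical database, it builds a program that takes raw corpus text as input, outputs technical terms, and can, on request, attach a definition and an English translation to each output term. The program is tested on computer-science literature; the results are analyzed, remaining problems are identified, and directions for further research are outlined.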
2.
3.
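This paper proposes a method combining rules and statistics that, for computer-domain terms, integrates domain-term features with statistical features. Working on a part-of-speech-tagged corpus, the algorithm extends an existing word-string expansion algorithm with domain term components and domain-term features to obtain candidate terms. Candidates are then filtered with the combined statistical feature G-MI. Experiments show that the algorithm effectively improves both the precision and the efficiency of term extraction.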
4.
5.
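Multi-word domain term extraction is an important and difficult problem in natural language processing. Taking the linguistic characteristics of Uyghur into account, this paper proposes an automatic method for extracting Uyghur multi-word domain terms that combines rules and statistics. The method has four stages: (1) corpus preprocessing, including stop-word filtering and part-of-speech tagging; (2) taking N-gram substrings of the word strings, computing the internal association strength of each substring with an improved mutual information algorithm and the log-likelihood ratio, and applying part-of-speech composition rules to build a candidate set of Uyghur multi-word domain terms; (3) using the relative word-frequency difference to obtain as many Uyghur multi-word domain terms as possible; (4) using the C_value score to obtain the final domain terms, followed by post-processing. The experiments give a precision of 85.08% and a recall of 73.19%, confirming the effectiveness of the proposed method for Uyghur multi-word domain term extraction.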
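Stage (2) scores how strongly the words inside a candidate substring cohere. The sketch below illustrates that idea with plain pointwise mutual information (PMI) over adjacent word pairs. It is not the paper's improved mutual information algorithm, whose details the abstract does not give; the function name `bigram_pmi` and the English toy sentence are assumptions made for illustration.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI} for every adjacent word pair in `tokens`.

    Plain PMI: log2( P(w1, w2) / (P(w1) * P(w2)) ), with probabilities
    estimated from raw counts. This is an illustrative baseline only.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Hypothetical toy input; a real system would run on a tagged corpus.
tokens = "neural network model uses neural network layers and the model".split()
scores = bigram_pmi(tokens)
for pair in [("neural", "network"), ("network", "model")]:
    print(pair, round(scores[pair], 3))
```

Plain PMI tends to overrate rare pairs, which is one reason extraction pipelines like the one described combine it with other statistics and part-of-speech rules rather than using it alone.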
6.
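For the military intelligence domain, a term extraction method based on conditional random fields (CRFs) is proposed. The method treats domain term extraction as a sequence labeling problem, quantifies the distributional features of domain terms as training features, uses a CRF toolkit to train a domain-term feature template, and then uses that template to extract domain terms. The training corpus consists of news data from "搜狐网络军事频道" (the Sohu online military channel), and the test corpus consists of all articles in issues 1 to 8 of the 2007 volume of the magazine 《现代军事》. The experiments gave good results, with a precision of 73.24%, a recall of 69.57%, and an F-measure of 71.36%, indicating that the method is simple to apply and generalizes across domains.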
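The reported F-measure can be checked against the reported precision and recall. The sketch below assumes the balanced F-measure (the harmonic mean of precision and recall, beta = 1), which the abstract does not state explicitly; the function name `f_measure` is illustrative.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values taken from the abstract above, in percent.
f1 = f_measure(73.24, 69.57)
print(round(f1, 2))  # 71.36
```

Under that assumption, the result agrees with the 71.36% stated in the abstract.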
7.
8.
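At present, most methods determine term boundaries by choosing suitable statistics and thresholds to measure how tightly strings cohere, but such methods perform poorly on long terms. To address the low recall for long terms, and based on a study of a large number of patent documents, a term extraction method based on a patent term boundary marker set is proposed. The method introduces the concept of a boundary marker set and builds a patent term boundary marker set from the characteristics of term boundaries in patent documents; proposes a seed-term weighting method for extracting seed terms; uses the 人民日报 (People's Daily) corpus as a contrast corpus to extract a lexicon of term components from patent documents, raising the termhood of candidate terms; and finally filters the recognized terms using left and right boundary entropy. Experiments show that the method performs well, with a precision of 81.67%, a recall of 71.92%, and an F-value of 0.765, a considerable improvement over the comparison experiments.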
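The final filtering step relies on left and right boundary entropy. A minimal sketch of that statistic follows, assuming the common definition: the Shannon entropy of the tokens seen immediately to the left (or right) of a candidate. The abstract does not give the exact formulation or thresholds used, so the names `boundary_entropy` and `_entropy`, the sentence markers, and the toy data are all assumptions.

```python
import math
from collections import Counter


def _entropy(counts):
    """Shannon entropy (base 2) of a Counter of neighbor tokens."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def boundary_entropy(sentences, term):
    """Return (left_entropy, right_entropy) for `term`, a tuple of tokens.

    High entropy on both sides suggests the candidate occurs in varied
    contexts and is more likely to be a complete unit.
    """
    left, right = Counter(), Counter()
    n = len(term)
    for sent in sentences:
        # Pad so that candidates at sentence edges still have neighbors.
        padded = ["<s>"] + list(sent) + ["</s>"]
        for i in range(1, len(padded) - n):
            if tuple(padded[i:i + n]) == term:
                left[padded[i - 1]] += 1
                right[padded[i + n]] += 1
    return _entropy(left), _entropy(right)


# Hypothetical toy corpus of tokenized sentences.
sentences = [
    ["the", "hash", "table", "stores"],
    ["a", "hash", "table", "maps"],
    ["each", "hash", "function", "returns"],
]
print(boundary_entropy(sentences, ("hash", "table")))
print(boundary_entropy(sentences, ("hash",)))
```

In this toy corpus the right entropy of "hash" alone is lower than that of "hash table", because "hash" is usually followed by "table"; a filter could use such a gap to prefer the longer candidate.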
9.
10.
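In current term extraction experiments on open corpora in the bioinformatics domain, precision for the first 2,000-plus two-character words has reached 90.36%, but precision for words of three or more characters is only 66.63%, so multi-character word extraction has become a difficult problem in automatic term extraction. To address this, a strategy is proposed that exploits the strength of the C-value parameter in extracting long terms and combines it with the mutual information parameter used in term extraction. Experimental results show that, for long terms, precision is 75.7%, recall is 68.4%, and the F-measure is 71.9%, higher than other methods on the same corpus.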
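The C-value parameter mentioned above rewards longer candidates and discounts candidates that mostly occur nested inside longer ones. The sketch below follows the commonly cited C-value definition; how the paper combines it with mutual information is not stated in the abstract and is not attempted here. The names `contains` and `c_values` and the toy frequencies are assumptions.

```python
import math


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous sub-sequence of `longer`."""
    n, m = len(longer), len(shorter)
    return n > m and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_values(freq):
    """Compute C-value for each candidate in `freq`.

    `freq` maps a candidate (tuple of tokens) to its corpus frequency.
    C-value(a) = log2(|a|) * f(a)                      if a is not nested
               = log2(|a|) * (f(a) - mean f(b), b > a)  otherwise,
    where b ranges over longer candidates that contain a.
    """
    scores = {}
    for a, f_a in freq.items():
        nests = [f_b for b, f_b in freq.items() if contains(b, a)]
        if nests:
            f_a = f_a - sum(nests) / len(nests)
        scores[a] = math.log2(len(a)) * f_a
    return scores


# Hypothetical candidate frequencies.
freq = {
    ("floating", "point", "number"): 4,
    ("point", "number"): 5,
    ("floating", "point"): 10,
}
for term, score in c_values(freq).items():
    print(" ".join(term), round(score, 2))
```

Here "point number" scores low because four of its five occurrences are inside "floating point number", while "floating point" keeps a high score because it often occurs on its own.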
11.
12.
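A new algorithm for mining frequent itemsets without generating candidates   Total citations: 4, self-citations: 2, citations by others: 4
The Apriori algorithm is the most widely used association rule mining algorithm. Its main purpose is to mine interesting frequent itemsets from large volumes of transaction data by way of candidate itemsets, thereby providing users with meaningful associations. As databases grow, however, Apriori can run into two thorny problems: generating large numbers of candidate itemsets wastes a great deal of computation, and it is unclear how to set the threshold for pruning useless candidates. Both are challenging for ordinary users. The code AND-operation (代码与运算) proposed in this paper is an algorithm that mines frequent itemsets without candidates, so users need not agonize over setting a threshold. In addition, the inclusion of a transaction compression algorithm greatly reduces the amount of computation.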
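For context, the sketch below shows the classic Apriori baseline that the abstract criticizes, not the candidate-free algorithm the paper proposes (the abstract gives no details of it). It makes the two pain points visible: the join step that generates candidate itemsets, and the `min_support` threshold the user must choose. The function name and the toy transactions are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # size-1 candidates
    frequent = {}
    k = 1
    while current:
        # Count support of every candidate with one scan of the data.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical toy transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = apriori(transactions, min_support=2)
for itemset, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

Even on four transactions, each level requires generating candidates and rescanning the data, which is the cost that candidate-free approaches aim to avoid.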
13.
In this paper, we show that a new edge detection scheme developed from the notion of transition in nonlinear physics, together with the precise computation of its quantitative parameters (most notably singularity exponents), provides enhanced performance in reconstructing the whole image from its edge representation; moreover, it is naturally robust to noise. Studies of biological vision in mammals indicate that the major information in an image is encoded in its edges, an idea further supported by neurophysics. The first conclusion that can be drawn from this is that an image should be accurately reconstructible from the compact representation of its edge pixels. The paper focuses on how the idea of edge completion can be assessed quantitatively within the framework of reconstructible systems when evaluated in a microcanonical formulation, and how it redefines the adequacy of edges as candidates for compact representation. In doing so, we also propose an algorithm for image reconstruction from edge features and show that it outperforms well-known state-of-the-art techniques, in terms of compact representation, in the majority of cases.
14.
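Research on process mining methods that discover business process models from execution logs is flourishing. However, complex and changing execution environments inevitably make process logs diverse. Traditional process mining algorithms each suit particular kinds of logs, so choosing a process mining algorithm suited to diverse process logs has become a challenge. This paper proposes SoFi (survival of fittest integrator), a business process mining method for diverse environments. The method classifies the log using domain knowledge, applies several existing mining algorithms to each class of sub-log to produce a set of process models that serve as the initial population of a genetic algorithm, and uses the optimization ability of the genetic algorithm to integrate them into a high-quality business process model. Experiments on simulated logs and on real logs from a telecommunications company show that, compared with any single mining algorithm, the process models produced by SoFi have higher overall quality in terms of fitness, precision, generalization, and simplicity.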
15.
Yoshiaki Okubo Makoto Haraguchi 《Annals of Mathematics and Artificial Intelligence》1998,23(1-2):169-197
In theorem proving with abstraction, it is required for system designers to provide a useful abstraction. However, such a
task is so difficult that it would be worth studying an automatic construction of abstraction. In this paper, we propose a
new framework of Goal-Dependent Abstraction in which an appropriate abstraction is selected according to each goal to be proved.
Towards Goal-Dependent Abstraction, we present an algorithm for constructing an appropriate abstraction for a given goal.
The appropriateness is defined in terms of Upward-Property and Downward-Property. Since our abstraction is based on predicate
mapping, the algorithm in fact computes predicate mappings based on which appropriate abstractions can be constructed. Given
a goal, candidate predicate mappings are generated and then tested for their appropriateness for the goal. In order to find
appropriate mappings efficiently, we present a property to prune useless candidate generations. The numbers of pruned candidates
are evaluated in the best and worst cases. Furthermore some experimental results show that many useless candidates can be
pruned with the property and the obtained appropriate predicate mappings (abstractions) fit our intuition. From the experimental
results, we could expect our study in this paper to contribute to the fields of analogical reasoning and case-based reasoning
as well as theorem-proving.
This revised version was published online in June 2006 with corrections to the Cover Date.
16.
CC Handley 《Image and vision computing》1985,3(1):29-35
A new algorithm for computing the convex hull of a planar set of points is presented. The method of determining the set of suitable candidates is compared with previous algorithms. The algorithm is also compared with other algorithms in terms of running time and storage requirements, and experiments indicate that it is at least as good in terms of space and generally better in terms of running time.
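As a point of reference for the planar convex hull problem, the sketch below uses Andrew's monotone chain, a standard O(n log n) method. It is not the algorithm presented in the article, whose candidate-selection step the abstract does not describe; the function names and sample points are illustrative.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain).

    Collinear points on the hull boundary are dropped.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Pop while the last two points and p do not make a left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


# Hypothetical sample: a square with one interior and one edge point.
points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
print(convex_hull(points))  # [(0, 0), (2, 0), (2, 2), (0, 2)]
```

The interior point (1, 1) and the collinear edge point (1, 0) are both discarded, leaving only the four corners.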
17.
Markov Logic (ML) combines Markov networks (MNs) and first-order logic by attaching weights to first-order formulas and using
these as templates for features of MNs. State-of-the-art structure learning algorithms in ML maximize the likelihood of a
database by performing a greedy search in the space of structures. This can lead to suboptimal results because of the incapability
of these approaches to escape local optima. Moreover, due to the combinatorially explosive space of potential candidates these
methods are computationally prohibitive. We propose a novel algorithm for structure learning in ML, based on the Iterated
Local Search (ILS) metaheuristic that explores the space of structures through a biased sampling of the set of local optima.
We show through real-world experiments that the algorithm improves accuracy and learning time over the state-of-the-art algorithms.
On the other side MAP and conditional inference for ML are hard computational tasks. This paper presents two algorithms for
these tasks based on the Iterated Robust Tabu Search (IRoTS) metaheuristic. The first algorithm performs MAP inference and
we show through extensive experiments that it improves over the state-of-the-art algorithm in terms of solution quality and
inference time. The second algorithm combines IRoTS steps with simulated annealing steps for conditional inference and we
show through experiments that it is faster than the current state-of-the-art algorithm maintaining the same inference quality.
18.
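Visual place recognition suffers from the heavy time cost of retrieving over the full image set, and from perceptual aliasing, which arises because images of different places can look similar while the same place can look different from different viewpoints. To address these problems, this paper proposes an algorithm that extracts candidate objects with a saliency algorithm and generates a signboard representing each place. For each place in the place recognition system, the method applies the saliency algorithm to the keyframes of the corresponding video sequence segment to generate a large number of visually salient candidate objects, efficiently computes an evaluation function between these candidates, uses hierarchical clustering to determine the representative objects of each sequence segment, and finally combines these objects into a signboard that represents the video sequence. Signboard-based place representation is inserted as a step before the place recognition system searches the large image set associated with each place, which narrows the search range and avoids the uncertainty that perceptual aliasing introduces into global search.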