Similar literature
 20 similar documents found; search took 156 ms
1.
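Research on parsing geospatial information from text data sources has focused on annotating and extracting spatial semantic roles such as place-name entities and spatial relations, while neglecting the rich temporal information, thematic event information, and their integrated spatio-temporal information. By analyzing the linguistic characteristics of event descriptions in Chinese text and the spatio-temporal semantic features of events, and building on earlier work on annotating place-name entities and spatial relations, this paper defines an annotation scheme and annotation model for the spatio-temporal information of events in Chinese text. Using GATE (General Architecture for Text Engineering) as the annotation platform and web page text as the data source, it builds an annotated corpus of event spatio-temporal information. The results provide standardized training and test data for the semantic parsing of geographic information in Chinese text.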

2.
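Traditional topic models represent a topic as a probability distribution over terms, which makes topics hard to interpret. Building on the output of Latent Dirichlet Allocation (LDA), this paper proposes a seed-word-based method for extracting topic labels. It first extracts seed words for each topic using a proposed weighting formula, then applies a bootstrapping approach to iteratively generate a set of key phrases containing the seed words, and finally selects topic labels according to the completeness and generality of the phrases. Experiments on topics from the Two Sessions reports and on news-event topics, assessed through result presentation and human evaluation, show that the extracted labels express the semantics of the topics fairly accurately.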

3.
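This paper analyzes the state of research on image retrieval systems and argues that the semantic gap arises because these systems lack a description of the relationships between entities. It proposes a four-layer image semantic model and, on that basis, an image description and retrieval model built on a commonsense knowledge base and an image entity base. Entity descriptions are constructed from image features such as color, texture, and shape, and commonsense knowledge is used to analyze the entities in an image scene and the relationships between them, so as to recognize and understand the semantic information of the image.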

4.
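Traditional semantics-based Web service matching algorithms cannot handle matching under fuzzy semantics. To address this, the paper proposes a Web service matching algorithm based on a dynamic trusted semantic base. The ratings that interacting entities give to service providers are combined by grey aggregation to select trusted entities. Semantic information is extracted from the trusted entities' descriptions of the providers' services to build a dynamic semantic base, and fuzzy semantic concepts in a Web service description are replaced by the corresponding concepts extracted from the trusted entities' service descriptions. The semantic similarity between the service request vector and the service description vector is then computed to measure how well the Web service matches. Experimental results show that the algorithm matches better when fuzzy semantics are present.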
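
The abstract does not say which similarity measure is applied to the service request vector and the service description vector. As a minimal sketch, assuming cosine similarity over equal-length numeric vectors (the function name `cosine_similarity` and the vector encoding are illustrative, not taken from the paper), the final matching step could look like this:

```python
import math


def cosine_similarity(request, description):
    """Cosine similarity between two equal-length numeric vectors.

    Assumption: the paper's "semantic similarity" is replaced here by
    plain cosine similarity for illustration only.
    """
    dot = sum(a * b for a, b in zip(request, description))
    norm = math.sqrt(sum(a * a for a in request)) * math.sqrt(
        sum(b * b for b in description)
    )
    # A zero vector has no direction; treat it as "no match".
    return dot / norm if norm else 0.0
```

A higher score would indicate a closer match between the request and the service description; any acceptance threshold would need to be tuned on real data.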

5.
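As CSS+DIV layout becomes the mainstream way of structuring web pages, efficiently extracting topic information from such pages has become an urgent task for specialized search engines. This paper proposes a method for extracting web page topic information based on DIV tag trees. It first parses the HTML document into a DIV forest according to its DIV tags, then filters noise nodes out of the DIV tag trees and builds an STU-DIV model tree, and finally uses topic relevance analysis and a pruning algorithm to cut away DIV tag trees unrelated to the topic information. Experiments on pages from several news sites show that the method effectively extracts the topic information of news pages.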

6.
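Relation extraction aims to extract relations between entities from unannotated free text. However, most existing methods predict each relation in isolation and ignore the rich semantic correlations among relation labels. This paper proposes a relation extraction model that combines a pre-trained language model with label dependency knowledge. The model encodes the sentence and the two target entities with the pre-trained model BERT to obtain their semantic information, uses a graph convolutional network to model the dependency graph among relation labels, and combines this information to guide the final...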

7.
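As CSS+DIV layout becomes the mainstream way of structuring web pages, efficiently extracting topic information from such pages has become an urgent task for specialized search engines. This paper proposes a method for extracting web page topic information based on DIV tag trees. It first parses the HTML document into a DIV forest according to its DIV tags, then filters noise nodes out of the DIV tag trees and builds an STU-DIV model tree, and finally uses topic relevance analysis and a pruning algorithm to cut away DIV tag trees unrelated to the topic information. Experiments on pages from several news sites show that the method effectively extracts the topic information of news pages.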

8.
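Named entity recognition is one of the important tasks in natural language processing. However, named entities in rock text suffer from unclear boundaries, difficult word segmentation, error propagation, and low computational efficiency. Knowledge extraction from rock text is of great significance for research in oil and gas exploration. To this end, the paper first builds a rock text dataset and proposes a Lexicon-BiLSTM-CRF network model for unstructured rock text. The model first uses a Lexicon mechanism to obtain all matching words for each character, which addresses the unclear boundaries and segmentation difficulty and, on that basis, improves computational efficiency. A bidirectional long short-term memory network (BiLSTM) then extracts contextual semantic features, and the semantic vectors are passed to a conditional random field (CRF) layer and decoded with the Viterbi algorithm, which lowers the output probability of incorrect labels and predicts the entity labels, completing the named entity extraction task for rock text. Several groups of comparative experiments on the constructed dataset confirm that the method improves precision and recall to some extent.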
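
The Viterbi decoding step mentioned in this abstract can be sketched in a few lines. This is a generic illustration, not the paper's implementation: the score tables `emissions`, `transitions`, and `start` are hypothetical inputs standing in for what a BiLSTM and a CRF layer would produce, and all scores are assumed to be in log space.

```python
def viterbi(emissions, transitions, start):
    """Return the highest-scoring label sequence as a list of label indices.

    emissions[t][y]   : score of label y at position t
    transitions[p][y] : score of moving from label p to label y
    start[y]          : score of starting in label y
    Scores are log-space, so they are added rather than multiplied.
    """
    n_labels = len(start)
    # best[y] = best score of any path that ends in label y at the current position
    best = [start[y] + emissions[0][y] for y in range(n_labels)]
    backptr = []
    for t in range(1, len(emissions)):
        prev_best, best, ptr = best, [], []
        for y in range(n_labels):
            # Choose the best previous label for current label y.
            p = max(range(n_labels), key=lambda q: prev_best[q] + transitions[q][y])
            best.append(prev_best[p] + transitions[p][y] + emissions[t][y])
            ptr.append(p)
        backptr.append(ptr)
    # Trace back from the best final label to recover the full path.
    y = max(range(n_labels), key=lambda q: best[q])
    path = [y]
    for ptr in reversed(backptr):
        y = ptr[y]
        path.append(y)
    path.reverse()
    return path
```

Strongly negative transition scores make a label change expensive, which is how a CRF layer can suppress label sequences that are locally plausible but globally inconsistent.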

9.
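Research on a Tree-Kernel-Based Method for Entity Semantic Relation Extraction   Total citations: 3, self-citations: 2, citations by others: 3  
This paper describes an improved tree-kernel-based method for entity semantic relation extraction, which improves performance by adding entity semantic information to the structured information of the original relation instances and removing redundant information. Starting from the shortest path-enclosed tree, the method first adds entity-related semantic information such as entity type and mention type, then prunes the tree to remove modifier redundancy and coordination redundancy and expands possessive structures, and finally generates the entity semantic relation instance. Experiments on relation detection and on classifying the 7 major relation types on the ACE RDC 2004 benchmark corpus show that the method substantially improves the recognition and classification of entity semantic relations, with F-scores of 79.1% and 71.9%, respectively.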

10.
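Current methods for extracting entity relations from electronic medical records have two problems: they ignore noise in the position vectors, and their semantic representation is poor. This paper proposes an entity relation extraction model based on position denoising and enriched semantics. The model first uses position information and word vectors trained on a domain-specific corpus to obtain an attention weight for each word, then combines these weights with word vectors trained on a general-domain corpus to denoise the position vectors and introduce richer semantics, and finally determines the entity relation type from the weighted word vectors...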

11.
In this paper, we propose a novel approach to tag recommendation based on topic ontology. The approach generates tag suggestions for blogs. We construct the topic ontology by enriching the set of categories in an existing small ontology, the Open Directory Project. To construct it, a set of topics and their associated semantic relationships is identified automatically from corpus-based external knowledge resources such as Wikipedia and WordNet. The construction has two stages: concept acquisition and semantic relation extraction. In the first stage, a topic-mapping algorithm acquires concepts from the semantics of Wikipedia, and a semantic similarity-clustering algorithm computes a semantic similarity measure to group similar concepts. In the second stage, a semantic relation extraction algorithm derives semantic relations between the extracted topics from the lexical patterns between synsets in WordNet. A software prototype implements the topic ontology construction process. The Jena API framework is used to organize the extracted semantic concepts and their relationships as a Web Ontology Language representation, and the Protégé tool provides the platform to visualize the automatically constructed topic ontology. Using the constructed topic ontology, we generate and suggest the most suitable tags for a new resource. Combined with a spreading activation algorithm, the topic ontology supports efficient recommendation of the most popular tags for a specific resource. The spreading activation algorithm assigns interest scores to the existing extracted blog content and tags. The weight of each tag is computed from the activation score, which is determined by the similarity between the topics in the constructed topic ontology and the content of the existing blogs. The high-quality tags with the highest activation scores are recommended to users. Finally, we evaluate our tag recommendation approach experimentally on a large collection of real-world data sets. The experiments compare the proposed topic ontology with spreading activation against the existing AutoTag mechanism, and we discuss the improvement in precision and recall of the recommended tags on the Delicious and BibSonomy data sets. The experiments show that tag recommendation using topic ontology results in folksonomy enrichment. We thus report the results of an experiment meant to improve the performance and quality of the tag recommendation approach.

12.
Topic-based ranking in Folksonomy via probabilistic model   Total citations: 1, self-citations: 0, citations by others: 1  
Social tagging is an increasingly popular way to describe and classify documents on the web. However, the quality of tags varies considerably because they are authored freely, so rating tags becomes an important issue. Most social tagging systems order tags simply by input sequence, which carries little information about importance or relevance. This limits applications of tags such as information search and tag recommendation. In this paper, we focus on finding the authority score of tags in the whole tag space conditional on topics, and put forward a topic-sensitive tag ranking (TSTR) approach that ranks tags automatically according to their topic relevance. We first extract topics from the folksonomy using a probabilistic model, and then construct a transition probability graph. Finally, we perform a random walk at the topic level on the graph to obtain topic rank scores for the tags. Experimental results show that the proposed tag ranking method is both effective and efficient. We also apply tag ranking to tag recommendation, which demonstrates that the proposed approach boosts the performance of applications related to social tagging.
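
The abstract does not give the exact random-walk formulation. As a rough sketch only, assuming a random walk with restart in which the restart distribution is the topic's tag probabilities (the function name `topic_rank`, the `damping` value, and the use of P(tag | topic) as the restart vector are all assumptions, not details from the paper), one topic's tag scores could be computed like this:

```python
def topic_rank(transition, topic_prior, damping=0.85, iters=50):
    """Random walk with restart over a tag graph, for a single topic.

    transition[i][j] : probability of stepping from tag i to tag j
                       (each row is assumed to sum to 1)
    topic_prior[j]   : assumed P(tag j | topic), used as the restart distribution
    Returns one score per tag; scores sum to 1 when the inputs are normalized.
    """
    n = len(topic_prior)
    scores = list(topic_prior)
    for _ in range(iters):
        # With probability (1 - damping) jump back to the topic prior,
        # otherwise follow an edge of the transition graph.
        scores = [
            (1 - damping) * topic_prior[j]
            + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores
```

Running the walk once per topic with a different `topic_prior` would yield a separate ranking of the same tags for each topic.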

13.
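To address the diversity and quality of recommended tags in people-tag recommendation, this paper proposes a method that combines personalization and diversity. The method uses a topic model to model the accounts a user follows and uses cluster analysis to group accounts with similar posts into the same cluster; it then removes redundant tags within each cluster and selects representative tags; finally, it merges and ranks the tags from different clusters to obtain the Top-K tags to recommend to the user. Experimental results show that, compared with existing recommendation methods, the method reflects users' interests while significantly improving both the quality of the recommended tags and the diversity of the recommendations.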

14.
15.
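Given that the meaning of Chinese sentences is expressed in flexible, complex, and variable ways, this paper proposes a sentence similarity method based on semantics and sentiment that computes similarity at the level of meaning. The method preprocesses sentences with the LTP platform from Harbin Institute of Technology, extracting words, parts of speech, syntactic dependency labels, and semantic role labels. The semantic role labeling results are treated as semantically independent components of the sentence and given similarity weight coefficients. Combining syntactic dependency relations and lexical relations, the similarity between components of the two sentences that carry the same label is computed to give partial similarities, and the partial similarities are weighted to give the overall sentence similarity. In addition, to account for sentiment and sentence-pattern factors, a sentiment penalty and a sentence-pattern penalty are applied to the overall similarity for sentence pairs that meet certain conditions. Experimental results show that the method effectively extracts the semantically independent components of a sentence, computes sentence similarity at the semantic level, resolves the problems of information loss and mismatched sentence constituents, and improves the accuracy and robustness of sentence similarity computation.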

16.
Folksonomy, considered a core component of the Web 2.0 user-participation architecture, is a classification system formed by users' tags on web resources. Recently, various approaches that exploit folksonomy have been proposed to improve image search results. However, characteristics of tags such as semantic ambiguity and the lack of a controlled vocabulary limit their effectiveness for image retrieval. In particular, tags associated with images in a random order provide no information about the relevance between a tag and an image. In this paper, we propose a novel image tag ranking system called i-TagRanker, which exploits the semantic relationships between tags to re-order them according to their relevance to an image. The proposed system consists of two phases: 1) a tag propagation phase and 2) a tag ranking phase. In the tag propagation phase, we first collect the most relevant tags from similar images and then propagate them to an untagged image. In the tag ranking phase, tags are ranked according to their semantic relevance to the image. Experimental results on a Flickr photo collection of over 30,000 images show the effectiveness of the proposed system.

17.
18.
In this paper, we evaluate the effectiveness of a semantic smoothing technique for organizing folksonomy tags. Folksonomy tags have no explicit relations and vary widely because they form an uncontrolled vocabulary. We discriminate so-called subjective tags such as “cool” and “fun” from other folksonomy tags without any extra knowledge beyond folksonomy triples, and use the level of tag generalization to form the objective tags into a hierarchy. We verify that the entropy of folksonomy tags is an effective measure for discriminating subjective folksonomy tags. Our hierarchical tag allocation method guarantees the number of child nodes and increases the number of available paths to a target node compared with an existing tree allocation method for folksonomy tags.
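
The abstract does not define how tag entropy is computed. One plausible reading, shown here purely as an assumption, is the Shannon entropy of each tag's distribution over the resources it is attached to, computed from (user, tag, resource) triples; the function name `tag_entropy` and the choice of resources as the distribution's support are illustrative.

```python
import math
from collections import Counter, defaultdict


def tag_entropy(triples):
    """Shannon entropy (in bits) of each tag's distribution over resources.

    triples: iterable of (user, tag, resource) folksonomy assignments.
    Returns a dict mapping each tag to its entropy.
    """
    counts = defaultdict(Counter)
    for _user, tag, resource in triples:
        counts[tag][resource] += 1
    entropy = {}
    for tag, per_resource in counts.items():
        total = sum(per_resource.values())
        # H = -sum(p * log2(p)) over the resources this tag appears on.
        entropy[tag] = -sum(
            (c / total) * math.log2(c / total) for c in per_resource.values()
        )
    return entropy
```

Under this reading, a tag spread evenly over many unrelated resources gets a high entropy, and a tag concentrated on few resources gets a low one. Whether high entropy marks a tag as subjective, and what threshold to use, are assumptions that would have to be checked against the paper.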

19.
20.
Tag ranking has recently emerged as an important research topic because of its potential application to web image search. Existing tag relevance ranking approaches mainly rank tags according to their relevance to a given image. Nonetheless, such algorithms rely heavily on a large-scale image dataset and a proper similarity measure to retrieve semantically relevant images with multiple labels. In contrast to the existing tag relevance ranking algorithms, in this paper we propose a novel tag saliency ranking scheme, which aims to automatically rank the tags associated with a given image according to their saliency to the image content. To this end, the paper presents an integrated framework for tag saliency ranking that combines a visual attention model with multi-instance learning to investigate the saliency ranking order of tags with respect to the given image. Specifically, tags annotated at the image level are first propagated to the region level via an efficient multi-instance learning algorithm; then a visual attention model is employed to measure the importance of regions in the given image. Finally, tags are ranked according to the saliency values of the corresponding regions. Experiments conducted on the COREL and MSRC image datasets demonstrate the effectiveness and efficiency of the proposed framework.
