Similar literature
 Found 20 similar documents (search time: 125 ms)
1.
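Research on a Method for Generalizing Specific Chemical Structures   Total citations: 1, self-citations: 1, citations by others: 0
This paper proposes a method for retrieving generic chemical structures using a specific chemical structure as the query. Exploiting the ability of SMILES linear notation to identify ring atoms, a specific chemical structure is first split into two parts, Ring and Fragment. Each part is then generalized according to the generic attribute descriptions of Ring and Fragment. Finally, the generalized structural fragments are reassembled into a generic chemical structure.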
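The Ring/Fragment split described in this abstract can be illustrated with a minimal sketch. The abstract relies on SMILES ring detection; the sketch below swaps in a plain graph test instead (a bond is a ring bond when its two atoms stay connected after the bond is removed), and the function name `split_ring_and_fragment` and the bond-list input format are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def split_ring_and_fragment(bonds):
    """Split atoms into ring atoms and non-ring (fragment) atoms.

    `bonds` is a list of (atom_a, atom_b) index pairs (hypothetical input
    format). A bond counts as a ring bond when it is not a bridge, i.e.
    its two atoms are still connected after the bond itself is ignored.
    """
    adjacency = defaultdict(set)
    for a, b in bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)

    def connected_without(a, b):
        # Depth-first search from a to b that skips the direct bond a-b.
        stack, seen = [a], {a}
        while stack:
            node = stack.pop()
            for nxt in adjacency[node]:
                if node == a and nxt == b:
                    continue  # ignore the bond under test
                if nxt == b:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    ring_atoms = set()
    for a, b in bonds:
        if connected_without(a, b):
            ring_atoms.update((a, b))
    fragment_atoms = set(adjacency) - ring_atoms
    return ring_atoms, fragment_atoms


# Toluene: atoms 0-5 form the benzene ring, atom 6 is the methyl carbon.
toluene_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 6)]
ring, fragment = split_ring_and_fragment(toluene_bonds)
print(sorted(ring), sorted(fragment))
```

The per-bond search is quadratic in the worst case, which is acceptable for small molecules; a production system would more likely use a cheminformatics toolkit's ring perception.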

2.
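Application of Natural Language Processing Techniques to Pharmaceutical Patent Retrieval   Total citations: 2, self-citations: 2, citations by others: 0
This paper studies the application of natural language processing techniques to pharmaceutical patent retrieval. A translation program was developed that semi-automatically converts the textual descriptions of generic variables in pharmaceutical patents into rule-conforming GSCCT format, laying the foundation for building pharmaceutical patent retrieval databases accurately and efficiently.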

3.
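Building on the approach of first using SMILES linear notation to split a chemical structure into Ring and Fragment parts and then generalizing each according to its attributes, Rings and Fragments are numerically encoded in a predefined priority order, without regard to how the parts are connected. The generic structure described by these numeric codes can be generalized further to yield second-level numeric codes. Encoding is performed automatically by a program, so that the numeric codes of a query structure and of the patent structures stored in the database remain consistent. The method can be applied in systems that retrieve generic structures from specific structures.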

4.
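A difficulty in representing, storing, and matching generic structures by computer is how to expand a generic structure to an appropriate degree, avoiding excessive enumeration while retaining the information the generic structure should carry. Based on the characteristics of generic structures, this paper designs a program that extracts the rings of a generic structure and the linking fragments between them, then generates the molecular skeleton of the generic structure and the corresponding reduced graph. The main information of the generic structure is stored in SMILES linear notation, which avoids most of the enumeration.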

5.
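Based on the Generic Structure Compact Connection Table (GSCCT), a computer representation for generic structure information, a screening strategy for retrieving generic structures was devised, namely a prescreening scheme that extracts the backbone ring nodes from the GSCCT. A GSCCT contains backbone structure nodes and leaf structure nodes; backbone nodes are further divided into ring nodes and non-ring nodes. When a leaf structure node contains a ring node, that node is promoted to a backbone ring node. Unlike traditional algorithms that operate at the level of atom nodes, this matching method extracts key information at the level of compact nodes: it prescreens on the principal information of the generic structure, the ring structure information (also called fingerprint information), and initially ignores non-ring nodes and leaf nodes in order to avoid extensive enumeration. The paper describes the screening approach and the implementation of the screening function in detail.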
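The ring-node prescreen described in this abstract can be sketched roughly as follows. The sketch assumes each database entry has already been reduced to a list of backbone ring labels and that the screen is a simple multiset containment test: a specific query can only match a generic structure whose backbone rings all occur in the query. The ring labels, the patent identifiers, and both function names are hypothetical; the real GSCCT format and matching rules are not given in the abstract.

```python
from collections import Counter


def passes_ring_prescreen(query_rings, backbone_rings):
    """Return True when every backbone ring occurs at least as often in the query."""
    have = Counter(query_rings)
    need = Counter(backbone_rings)
    return all(have[ring] >= count for ring, count in need.items())


def prescreen(query_rings, database):
    """Keep only the entries whose backbone rings are all present in the query."""
    return [
        entry_id
        for entry_id, backbone_rings in database.items()
        if passes_ring_prescreen(query_rings, backbone_rings)
    ]


# Hypothetical data: backbone ring labels per stored generic structure.
database = {
    "P1": ["benzene", "pyridine"],
    "P2": ["benzene", "benzene"],
    "P3": ["piperidine"],
}
query = ["benzene", "pyridine", "piperidine"]
candidates = prescreen(query, database)
print(candidates)
```

Entries that survive this cheap test would then go on to full matching, where non-ring nodes and leaf nodes are considered.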

6.
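1. Build an online library to better serve the public. At present, abstracts and bibliographic data can be searched on the Patent Office home page. 2. Network with the agency offices and enable online filing, moving toward paperless examination. 3. Build a deeply processed Chinese patent information database. Patent information is an important part of national scientific and technical information. Particularly in this era of reform and opening up, China's large and medium-sized enterprises and trading companies are rapidly joining an international environment full of risk and competition. Fierce competition has made the demand for information, especially information related to patents and intellectual property, greater and more urgent than ever before. Further developing patent information resources will therefore have a far-reaching effect on China's patent informatization and national economic development. The current automation upgrade plan of the Chinese Patent Office emphasizes a full-text retrieval system; the next step will consider in-depth processing and indexing of the Chinese patent database.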

7.
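Automatic Enrichment of a Domain Ontology: A Case Study Based on the ADL Gazetteer   Total citations: 1, self-citations: 0, citations by others: 1
Ge Ning (葛宁), Wang Jun (王军). Computer Science (《计算机科学》), 2007, 34(9): 156-162
Taking a geographic feature thesaurus (Feature Type Thesaurus, FTT) as a case study, this paper proposes a method for automatically enriching a domain ontology. The FTT describes more than 200 geographic feature types, organized hierarchically, and is used to index and organize the 6 million place names in the gazetteer of the Alexandria Digital Library in the United States (ADL Gazetteer). To enrich the FTT automatically, (1) generic words that denote geographic feature types and have retrieval value are first extracted and discovered from the place names; (2) based on their co-occurrence with the indexing subject terms, the corresponding subject term is determined while clustering words of the same word family, and the extracted generic words are then placed in the FTT hierarchy. The core of the work is positioning the generic words by making full use of the large body of existing indexed corpus data; experiments show an effectiveness of 82.7%. In essence, the study automatically extracts domain knowledge and indexing knowledge from an ontology-indexed corpus in order to enrich the ontology. The method can be applied to similar corpora and knowledge bases to discover new terms, enrich ontologies automatically, and support ontology interoperability.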

8.
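Text Similarity Strategies Based on the Document Index Graph Model   Total citations: 2, self-citations: 1, citations by others: 1
A document index graph is a phrase-based, graph-structured model for representing text features. It expresses text feature information more completely and accurately and supports incremental text clustering and information processing. Based on the document index graph feature model, this paper proposes an additive strategy and a multiplicative strategy for computing document similarity, and uses a transformation function to adjust document similarity values, increasing the separability of documents and improving the performance of text clustering, classification, and related processing. Examples demonstrate the effectiveness of the strategies.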

9.
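To promote medical science and technology innovation in China, a medical science and technology patent database was built to provide an effective analysis tool and strong support for technology R&D, patent strategy research, and scientific decision-making. Drawing fully on existing patent analysis platforms in China and abroad, the database designed in this study supports automatic import and full-text retrieval of patent data, as well as expert indexing, data cleaning, and co-occurrence matrices. It adopts the strengths and advanced retrieval functions of authoritative patent databases such as Thomson Innovation (TI) and Derwent Innovation Index (DII), and borrows the powerful analysis functions of the Thomson Data Analyzer (TDA) data processing software. The database supports data import, data cleaning and statistical analysis of existing data, and group indexing and co-occurrence matrices. It is reported to be the first to provide analysis of large data volumes, handling online analysis of several hundred thousand records within the database; the first patent database to provide expert indexing for in-depth technical analysis, offering a new perspective for patent analysis from a specialist viewpoint; and the first to provide online co-occurrence matrix analysis, which addresses limitations of the TDA tool such as its data volume limit.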

10.
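This paper surveys the development of generic structure information processing in computational chemistry, analyzes strategies for generic structure matching and retrieval from a theoretical standpoint, and proposes two principles for generic structure database systems: consistency and efficiency. It points out that improving matching efficiency requires finding the optimal expansion point of the original generic structure, which offers guidance for building efficient generic structure matching and retrieval systems.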

11.
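Object-oriented features have strong modeling power, and introducing them into XML can enhance XML's descriptive power. Existing indexes, however, do not support queries over object-oriented XML data. Two index schemes for object-oriented XML data are therefore proposed: a Ctree-based preprocessing scheme and the OOCtree scheme. Both provide inheritance information for object-oriented XML data, a concise structural summary, and child-to-parent links, and can answer queries over object-oriented XML data in a short time. The performance of the two schemes is analyzed and discussed in terms of index construction, query processing, and a comparison of query results.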

12.
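XML is rapidly becoming an important standard for information representation and data exchange on the Internet. Object-oriented features have strong modeling power, and introducing object-oriented concepts into XML can improve the modeling power of XML schema languages. Existing index schemes, however, do not support queries over object-oriented XML data. An index scheme for object-oriented XML data, OOCtree (Object-Oriented compact tree), is therefore proposed. It is a two-level bidirectional tree comprising a group level and an element level. The group level provides a concise structural summary and inheritance information, which allows a large part of the search space to be pruned early in query processing; the element level provides detailed child-parent links, which allow fast access to an element's parent and greatly improve query processing efficiency.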

13.
This paper analyzes the scalability of indexing structures in a production systems testbed for large-scale computational research on discrete build-to-order environments. The testbed consists of five integrated services that interoperate for order entry, order promising, production planning, execution, and control. The service-based architecture is Java-based, object-oriented, event-driven, memory-resident, and multi-threaded. Services in the testbed frequently need to locate specified elements in their large data models rapidly during algorithmic computations, and a number of indexing structures have been designed by computer scientists to make such data access more efficient. We explore the tradeoff between improved application scalability and increased implementation complexity of indexing structures by comparing the B+-tree, T-tree, and R-tree indexing structures to a simple and widely used linear structure in the context of an application to real-time order promising. Scalability is evaluated by measuring space requirements and computational time as a function of the size of the system.
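The tradeoff this abstract studies, a simple linear structure versus an ordered index, can be illustrated with a small sketch. The paper's testbed is Java-based and compares B+-tree, T-tree, and R-tree structures; the Python below is only a stand-in that uses a sorted list with binary search from the standard `bisect` module in place of a tree index. The order identifiers and function names are hypothetical.

```python
import bisect


def linear_find(order_ids, target):
    """Scan every element in turn; time grows linearly with the list size."""
    for position, order_id in enumerate(order_ids):
        if order_id == target:
            return position
    return -1


def indexed_find(sorted_order_ids, target):
    """Binary search over a sorted list; time grows logarithmically."""
    position = bisect.bisect_left(sorted_order_ids, target)
    if position < len(sorted_order_ids) and sorted_order_ids[position] == target:
        return position
    return -1


# Hypothetical order identifiers, already sorted so both lookups apply.
order_ids = list(range(0, 1_000_000, 2))
found_linear = linear_find(order_ids, 123_456)
found_indexed = indexed_find(order_ids, 123_456)
missing = indexed_find(order_ids, 123_457)
print(found_linear, found_indexed, missing)
```

The indexed version is faster on large inputs but requires the data to be kept sorted on every insert, which is the kind of added implementation complexity the paper weighs against scalability.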

14.
In the traditional programming paradigm, data structures and algorithms are developed for specific data types and requirements. This leads to code redundancy and inflexibility, and thus prevents effective code reuse for similar applications. One effective approach to increasing code reuse is generic programming, which focuses on the development of efficient, reusable software libraries through suitable abstractions for the common requirements. In this paper, we present how we applied generic programming to an ongoing effort for mesh-based adaptive simulations on massively parallel computers. Three generic components (iterator, set, and tag) were developed using design patterns, C++ template programming, and the standard template library. Scaling studies on petascale supercomputers demonstrate the efficiency of the reusable, generic components, which do not sacrifice the performance of the previous tools developed in the traditional object-oriented programming paradigm.

15.
Retrieval, validation, and explanation tools are described for cooperative assistance during requirements engineering and are illustrated by a library system case study. Generic models of applications are reused as templates for modeling and critiquing requirements for new applications. The validation tools depend on a matching process which takes facts describing a new application and retrieves the appropriate generic model from the system library. The algorithms of the matcher, which implement a computational theory of analogical structure matching, are described. A theory of domain knowledge is proposed to define the semantics and composition of generic domain models in the context of requirements engineering. A modeling language and a library of models arranged in families of classes are described. The models represent the basic transaction processing or 'use case' for a class of applications. Critical difference rules are given to distinguish between families and hierarchical levels. Related work and future directions of the domain theory are discussed.

16.
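Indexing Techniques for Ordered Sets in Object-Oriented Database Systems   Total citations: 2, self-citations: 0, citations by others: 2
This paper first discusses indexing techniques in object-oriented database systems and analyzes why traditional value-based indexing techniques are unsuitable for indexing ordered sets. It then proposes a new indexing mechanism suited to ordered sets, the P+-tree. The paper also designs a benchmark for testing ordered-set indexing mechanisms and uses it to analyze and evaluate the proposed mechanism systematically.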

17.
SaIL: A Spatial Index Library for Efficient Application Integration   Total citations: 1, self-citations: 0, citations by others: 1
With the proliferation of spatial and spatio-temporal data that are produced every day by a wide range of applications, Geographic Information Systems (GIS) have to cope with millions of objects with diverse spatial characteristics. Clearly, under these circumstances, a substantial performance speedup can be achieved with the use of spatial, spatio-temporal, and other multi-dimensional indexing techniques. Due to the increasing research effort on developing new indexing methods, the number of available alternatives is becoming overwhelming, making the task of selecting the most appropriate method for indexing the data according to application needs rather challenging. Therefore, a library that combines a variety of indexing techniques under a common application programming interface can prove to be a valuable tool. In this paper we present SaIL (SpAtial Index Library), an extensible framework that enables easy integration of spatial and spatio-temporal index structures into existing applications. We focus on design issues and elaborate on techniques for making the framework generic enough to support user-defined data types, customizable spatial queries, and a broad range of spatial (and spatio-temporal) index structures, in a way that does not compromise functionality, extensibility, and, primarily, ease of use. SaIL is publicly available and has already been successfully utilized for research and commercial applications. This work was conducted while the first author was visiting ESRI and was partially supported by ESRI and by NSF grants IIS-9907477, EIA-9983445, and IIS-0220148.

18.
A novel indexing structure, the join index hierarchy, is proposed to handle the "gotos on disk" problem in object-oriented query processing. The method constructs a hierarchy of join indices and transforms a sequence of pointer-chasing operations into a simple search in an appropriate join index file, and thus accelerates navigation in object-oriented databases. The method extends the join index structure studied in relational and spatial databases, supports both forward and backward navigation among objects and classes, and localizes update propagations in the hierarchy. Our performance study shows that a partial join index hierarchy outperforms several other indexing mechanisms in object-oriented query processing.
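The core idea in this abstract, replacing a chain of pointer dereferences with one lookup in a precomputed join index, can be sketched as follows. This is a simplified illustration under stated assumptions: join indices are held as in-memory sets of identifier pairs, a derived index is built by composing two base indices, and the class names (vehicle, company, city) and function names are hypothetical. The paper's hierarchy construction, partial hierarchies, and update propagation are not modeled.

```python
from collections import defaultdict


def build_join_index(pairs):
    """Build forward and backward lookup tables from (source, target) pairs."""
    forward, backward = defaultdict(set), defaultdict(set)
    for source, target in pairs:
        forward[source].add(target)
        backward[target].add(source)
    return forward, backward


def compose(first_pairs, second_pairs):
    """Derive the A-to-C pairs from A-to-B pairs and B-to-C pairs."""
    second_forward, _ = build_join_index(second_pairs)
    return {
        (a, c)
        for a, b in first_pairs
        for c in second_forward.get(b, ())
    }


# Base join indices: one per direct reference between two classes.
vehicle_company = [("v1", "c1"), ("v2", "c1"), ("v3", "c2")]
company_city = [("c1", "Detroit"), ("c2", "Tokyo")]

# Derived join index: vehicle to city, skipping the company hop at query time.
vehicle_city = compose(vehicle_company, company_city)
forward, backward = build_join_index(vehicle_city)

print(sorted(forward["v1"]))        # forward navigation: cities for vehicle v1
print(sorted(backward["Detroit"]))  # backward navigation: vehicles in Detroit
```

Keeping both tables is what allows navigation in either direction; the cost is that the derived index must be refreshed when either base index changes.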

19.
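This paper presents the design principles of a generic part model for CAD and discusses techniques for implementing the model in a parametric CAD environment. By introducing object-oriented ideas, the model applies not only to national-standard and ministry-standard parts across the fields of mechanical design but, because it uses a unified data structure, also to user-defined parts, so users can extend the part library easily. The model can be implemented in both parametric and non-parametric environments and is therefore broadly applicable.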

20.
With the rapid emergence of XML as a data exchange standard over the Web, storing and querying XML data have become critical issues. The two main approaches to storing XML data are (1) to employ traditional storage such as relational databases, object-oriented databases, and so on, and (2) to create an XML-specific native storage. The storage representation affects the efficiency of query processing. In this paper, we first review the two approaches for storing XML data. Second, we review various query optimization techniques such as indexing, labeling, and join algorithms that enhance query processing in both approaches. Next, we suggest an indexing classification scheme and discuss some of the current trends in indexing methods, which indicate a clear shift towards hybrid indexing.
