Similar literature
 Found 20 similar documents; search took 46 ms
1.
Collaborative editing enables a group of people to edit documents collaboratively over a computer network. Customisation of the collaborative environment to different subcommunities of users at different points in time is an important issue, and the document model is an important factor in achieving it. We have chosen a tree representation that encompasses a large class of documents, such as text, XML and graphical documents, and we propose a multi-level editing approach for maintaining consistency over hierarchical documents. The approach logs edit operations with the tree nodes to which they refer, which supports tracking user activity on individual units of the document. This facilitates the computation of awareness information and the handling of conflicting changes to units of the document. Moreover, efficiency is higher than in existing approaches that use a linear structure to represent documents. The multi-level editing approach recursively applies any linear merging algorithm over the document structure, and we show how it was applied to real-time and asynchronous modes of collaboration.

2.
Browsing the DOM tree of an XML document means following the links among its nodes to find desired nodes without any prior knowledge to guide the search. When the structure of the XML document is not known to the user, browsing is the basic operation for referring to the contents of the document. If the XML document is very large, however, using a general-purpose XML parser to browse its DOM tree and access an arbitrary node may fail for lack of memory to construct the large DOM tree. To alleviate this problem, we suggest a method for browsing the DOM tree of a very large XML document by splitting the document into n small XML documents and sequentially generating the DOM tree of each of them. For later reference, information about some nodes accessed in DOM trees that have already been generated is kept using the concept of virtual nodes. With the suggested approach, the memory needed to browse the DOM tree of a very large XML document is reduced to the point that it can be managed on a personal computer.
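
The memory problem described above can be illustrated with a small sketch. It does not reproduce the paper's splitting scheme or its virtual nodes; it swaps in a different, standard-library technique with the same goal, namely incremental parsing with `xml.etree.ElementTree.iterparse`, which discards finished subtrees as it goes. The function name `count_tags_streaming` and the sample document are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET


def count_tags_streaming(source):
    """Count element tags without keeping every subtree's content in memory."""
    counts = {}
    # An "end" event fires once an element and all of its children are parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        counts[elem.tag] = counts.get(elem.tag, 0) + 1
        # Drop the element's children, text and attributes; the emptied
        # element itself stays attached to its parent.
        elem.clear()
    return counts


# Hypothetical sample input; a real use would pass a file path or file object.
sample = io.BytesIO(
    b"<lib><book><title>A</title></book><book><title>B</title></book></lib>"
)
print(count_tags_streaming(sample))
```

Because each element is emptied as soon as it is complete, peak memory depends mostly on the depth and fan-out of the tree rather than on the total document size, which is the same trade-off the abstract aims for by building one small DOM tree at a time.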

3.
Graphical Transformation of Multimedia XML Documents   Cited by: 1 (self-citations: 0, citations by others: 1)
As a commonly accepted standard for Web markup documents, XML allows Internet users to create multimedia documents with their preferred structures and share them with other people. The creation of various multimedia document structures, typically trees, implies that some conversion mechanism is needed so that people using different structures can understand each other. This paper presents a visual approach to the representation and validation of multimedia document structures specified in XML and to the transformation of one structure into another. The underlying theory of our approach is a context-sensitive graph grammar formalism. The paper demonstrates the conciseness and expressiveness of the graph grammar formalism. An example XML structure is provided, and its graph grammar representation, validation and transformation to a multimedia representation are presented.

4.
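Building a tree model for XML document parsing using relational tables   Cited by: 2 (self-citations: 1, citations by others: 1)
祝青 (Zhu Qing), 阳王东 (Yang Wangdong). 《计算机应用》 (Journal of Computer Applications), 2009, 29(6): 1719-1721
In studying data parsing and query operations on XML documents, the authors found that a tree reflects the hierarchical structure of an XML document well but is inefficient to query, whereas a relational table is a data structure suited to storing large amounts of data, with good query efficiency and operational capability. They present a data model for storing XML documents that combines a tree with relational tables. During parsing, the model uses a callback-event-based, segmented parsing method to reduce parsing time and storage space. This preserves the structural characteristics of the XML document while improving query efficiency and ease of operation. Parsing and manipulation experiments on XML documents with large data volumes show that the data model has clear advantages for processing large XML documents.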
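
A minimal sketch of the general idea, assuming a single table of `(id, parent_id, tag, text)` rows filled in by SAX callbacks. The abstract does not give the paper's schema or its segmented parsing scheme, so the table layout, the class name `NodeTableHandler` and the helper `load_xml` are all hypothetical.

```python
import sqlite3
import xml.sax


class NodeTableHandler(xml.sax.ContentHandler):
    """Store each element as an (id, parent_id, tag, text) row while parsing."""

    def __init__(self, conn):
        super().__init__()
        self.conn = conn
        self.stack = []    # ids of the elements that are currently open
        self.text = {}     # id -> text fragments collected so far
        self.next_id = 1

    def startElement(self, name, attrs):
        node_id = self.next_id
        self.next_id += 1
        parent_id = self.stack[-1] if self.stack else None
        self.conn.execute(
            "INSERT INTO node (id, parent_id, tag, text) VALUES (?, ?, ?, '')",
            (node_id, parent_id, name),
        )
        self.stack.append(node_id)
        self.text[node_id] = []

    def characters(self, content):
        # Text is credited to the innermost element that is still open.
        if self.stack:
            self.text[self.stack[-1]].append(content)

    def endElement(self, name):
        node_id = self.stack.pop()
        text = "".join(self.text.pop(node_id)).strip()
        self.conn.execute("UPDATE node SET text = ? WHERE id = ?", (text, node_id))


def load_xml(xml_bytes):
    """Parse xml_bytes with SAX callbacks into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, tag TEXT, text TEXT)"
    )
    xml.sax.parseString(xml_bytes, NodeTableHandler(conn))
    return conn


conn = load_xml(b"<lib><book><title>A</title></book><book><title>B</title></book></lib>")
# Children of <lib> are found with a plain SQL join instead of a tree walk.
rows = conn.execute(
    "SELECT c.tag, c.id FROM node c JOIN node p ON c.parent_id = p.id "
    "WHERE p.tag = 'lib' ORDER BY c.id"
).fetchall()
print(rows)
```

The `parent_id` column keeps the hierarchy recoverable, while lookups by tag or text can use ordinary relational queries and indexes. The parser never holds more than the stack of open elements in memory.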

5.
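Converting relational databases to XML documents while preserving data constraints   Cited by: 2 (self-citations: 0, citations by others: 2)
XML has become the technology trend on the Internet, and developing XML documents while retaining existing relational databases is currently the best option. This requires converting between relational databases and XML documents while preserving data dependency constraints. In this process, schema translation must precede data conversion: because existing relational databases are usually normalized, the conversion requires rebuilding an XML document tree structure. To this end, the normalized relations are first joined into a set of tables according to the existing data dependency constraints, which denormalizes them. These joined tables are then mapped to a set of DOMs and merged into an XML document tree. A desired partial document tree is formed from the root node selected by the user and the nodes connected to it, and the selected XML document tree is in turn mapped to an XML schema in DTD form. The joined tables can then be mapped to a set of DOMs, which are merged into a single DOM and finally transformed into an XML document.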

6.
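Methods for storing XML documents in relational databases   Cited by: 11 (self-citations: 0, citations by others: 11)
XML is the standard format for cross-platform data publishing and exchange on the network, and it has broad application in the database field. However, the tree structure of XML documents differs greatly from the two-dimensional table structure of relational databases, so storing XML documents in a relational database requires special handling. This paper analyzes methods for storing and managing XML files in the database field, focuses on the application of the XML-related technologies in Oracle9i to modern distance education, and gives feasible storage schemes for two classes of documents: data-centric and document-centric.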

7.
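Based on an analysis of the strengths and weaknesses of commercial XML document mapping systems such as BizTalk, the authors designed and implemented TRANSer, a multi-XML-document mapping system. It provides a visual design tool that lets users design mappings through simple operations such as drag-and-drop, along with a variety of functions for implementing complex mappings. It allows multiple source XML documents to be mapped to a target XML document, and the format of the target XML document can be created and modified while the mapping is being designed. Practice shows that the system offers high development efficiency, strong descriptive power, and good extensibility.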

8.
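A concurrency locking protocol for XML   Cited by: 3 (self-citations: 0, citations by others: 3)
As research on XML database management systems (XML DBMS) deepens, studying concurrency control protocols for tree-structured XML data has become very important. The tree protocol proposed by Silberschatz and Kedem is defined for static tree-structured data, whereas XML data is a dynamically changing tree structure. Based on the characteristics of XML data, an operation set is defined that can transform one tree-structured XML document into another valid tree-structured XML document. The distinguishing feature of this operation set is that each operation acts on a subtree rather than a single node. On top of this operation set, the XML dynamic tree protocol (XDTP) is defined, and it is proved that the protocol retains the desirable properties of the static tree protocol: serializability and deadlock freedom. Experiments on real data sets show that XDTP performs well.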
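
For orientation, the static tree protocol that XDTP builds on can be sketched as a per-transaction rule checker. This covers only the classic static rules: exclusive locks only, the first lock on any node, later locks only while the parent is held, unlocking at any time, and no relocking. It does not model XDTP's subtree operations or conflicts between transactions, and the class name `TreeProtocolTxn` is an assumption.

```python
class TreeProtocolTxn:
    """Check one transaction's lock requests against the static tree protocol."""

    def __init__(self, parent):
        self.parent = parent        # node -> parent node (the root maps to None)
        self.held = set()           # nodes currently locked by this transaction
        self.ever_locked = set()    # nodes locked at any point so far

    def lock(self, node):
        if node in self.ever_locked:
            raise ValueError(f"{node}: locked before, relocking is not allowed")
        # The first lock may be on any node; later locks need the parent held.
        if self.ever_locked and self.parent.get(node) not in self.held:
            raise ValueError(f"{node}: parent is not currently locked")
        self.held.add(node)
        self.ever_locked.add(node)

    def unlock(self, node):
        # Unlocking is allowed at any time (raises KeyError if node is not held).
        self.held.remove(node)


# Hypothetical four-node tree: a is the root, b and c its children, d under b.
tree = {"a": None, "b": "a", "c": "a", "d": "b"}
txn = TreeProtocolTxn(tree)
txn.lock("b")      # first lock: any node is allowed
txn.lock("d")      # allowed because parent b is currently held
txn.unlock("b")    # early release is allowed
try:
    txn.lock("c")  # rejected: parent a was never locked by this transaction
except ValueError as err:
    print(err)
```

Early release of `b` is what distinguishes this protocol from two-phase locking. The rule that a node can be locked only while its parent is held is what supports the serializability and deadlock-freedom properties the abstract mentions.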

9.
The revolution of XML is recognized by researchers as well as practitioners as the technology trend on the Internet. Companies need to adopt XML technology. Having invested in their current relational database systems, they want to develop new XML documents while running existing relational databases in production. They need to reengineer the relational databases into XML documents with constraints preserved. In the process, schema translation must be done before data conversion. Since existing relational databases are usually normalized, they have to be reconstructed into XML document tree structures. This can be accomplished through denormalization, by joining the normalized relations into tables according to their data dependency constraints. The joined tables are mapped into DOMs, which are then integrated into XML document trees. The user specifies an XML document root with its relevant nodes to form a partitioned XML document tree that meets their requirements. The selected XML document tree is mapped into an XML schema in the form of a DTD. We then load the joined tables into DOMs, integrate them into a single DOM, and transform it into an XML document.
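
The step from joined tables to a document tree can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the `customer` and `order` relations, their columns and the function `rows_to_xml` are invented for the example, and DTD generation is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical result of joining two normalized relations, customer and orders.
joined_rows = [
    {"cust_id": "1", "name": "Ann", "order_id": "10", "total": "25.00"},
    {"cust_id": "1", "name": "Ann", "order_id": "11", "total": "7.50"},
    {"cust_id": "2", "name": "Bob", "order_id": "12", "total": "3.20"},
]


def rows_to_xml(rows):
    """Nest each order under its customer, so the foreign key becomes containment."""
    root = ET.Element("customers")
    by_id = {}    # cust_id -> the <customer> element already created for it
    for row in rows:
        cust = by_id.get(row["cust_id"])
        if cust is None:
            cust = ET.SubElement(root, "customer", id=row["cust_id"])
            ET.SubElement(cust, "name").text = row["name"]
            by_id[row["cust_id"]] = cust
        order = ET.SubElement(cust, "order", id=row["order_id"])
        order.text = row["total"]
    return root


print(ET.tostring(rows_to_xml(joined_rows), encoding="unicode"))
```

The repeated customer columns in the joined rows collapse into one parent element, which is how a one-to-many dependency in the relational schema turns into parent-child nesting in the XML tree.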

10.
Search operations and browsing facilities over an XML document database require special support at the physical level. Typical search operations involve path queries. This paper proposes a hierarchical access method to support such operations and to facilitate browsing. It advocates searching large XML collections by efficiently administering XML schemata. The proposed approach may be used for indexing XML documents according to their structural proximity. This is achieved by organizing the schemata of a large XML document collection hierarchically, merging structurally close schemata. The proposed structure, called the XML Schema Directory (XSD), is a balanced tree and may serve two purposes: (1) to accelerate XML query processing and (2) to facilitate browsing. Received 15 March 2001 / Revised 12 April 2001 / Accepted in revised form 11 May 2001

11.
XML is acknowledged as the most effective format for data encoding and exchange over domains ranging from the World Wide Web to desktop applications. However, large-scale adoption into actual system implementations is being slowed by the inefficiency of its document-parsing methods. The recent development of lazy parsing techniques is a major step towards improving this situation, but lazy parsers still have a key drawback: they must load the entire XML document in order to extract the overall document structure before document parsing can be performed. We have developed a framework for efficient parsing based on the idea of placing internal physical pointers within the XML document that allow the navigation process to skip large portions of the document during parsing. We show how to generate such internal pointers in a way that optimizes parsing using constructs supported by the current W3C XML standard. A double-lazy parser (2LP) exploits these internal pointers to parse the document efficiently. Using supported W3C constructs to create the internal pointers makes 2LP backward compatible, i.e., the pointer-augmented documents can be parsed by current XML parsers. We also implemented a mechanism to efficiently parse large documents with limited main memory, thereby overcoming a major limitation of current solutions. We study our pointer generation and parsing algorithms both theoretically and experimentally, and show that they perform considerably better than existing approaches.

12.
With the increasing number of available XML documents, numerous approaches for retrieval have been proposed in the literature. They usually use the tree representation of documents and queries to process them, whether implicitly or explicitly. Although retrieving XML documents can be considered a tree matching problem between the query tree and the document trees, only a few approaches take advantage of the algorithms and methods proposed in graph theory. In this paper, we study the theoretical approaches proposed in the literature for tree matching and examine how these approaches have been adapted to XML querying and retrieval, from both an exact and an approximate matching perspective. This study allows us to highlight theoretical aspects of graph theory that have not yet been explored in XML retrieval.

13.
The rapid growth of multimedia documents has raised huge demand for sophisticated multimedia knowledge discovery systems. Knowledge extraction from documents relies mainly on the data representation model and the document representation model. As a multimedia document is composed of multimodal multimedia objects, the data representation depends on the modality of the objects. Multimodal objects require distinct processing and feature extraction methods, resulting in different features with different dimensionalities. Managing multiple types of features is challenging for knowledge extraction tasks. A unified representation of multimedia documents benefits the knowledge extraction process, as the documents are represented by the same type of features. An appropriate document representation benefits the overall decision-making process by reducing search time and memory requirements. In this paper, we propose a domain-converting method known as the Multimedia to Signal converter (MSC) to represent a multimodal multimedia document in a unified representation by converting multimodal objects into signal objects. A tree-based approach known as the Multimedia Feature Pattern (MFP) tree is proposed for the compact representation of multimedia documents in terms of the features of multimedia objects. The effectiveness of the proposed framework is evaluated by experiments on four multimodal datasets. Experimental results show that the unified representation of multimedia documents helped improve classification accuracy for the documents. The MFP-tree-based representation of multimedia documents not only reduces search time and memory requirements but also outperforms competing approaches for search and retrieval of multimedia documents.

14.
Using structural similarity for clustering XML documents   Cited by: 2 (self-citations: 2, citations by others: 0)
In this paper, we describe a method for clustering XML documents. Its goal is to group documents that share similar structures. Our approach has two steps. We first automatically extract the structure from each XML document to be classified. This extracted structure is then used as a representation model to classify the corresponding XML document. The idea behind the clustering is that if XML documents share similar structures, they are more likely to correspond to the structural part of the same query. Finally, for experimental purposes, we tested our algorithms on both real data (the ACM SIGMOD Record corpus) and synthetic data. The results clearly demonstrate the value of our approach.

15.
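The approximate join of XML documents finds approximately matching XML documents across two XML document collections, and it is widely used in systems for XML-based information integration, XML data cleaning, and similar tasks. A notable problem with current approximate join operations, however, is that when the documents differ greatly, a large amount of repeated computation occurs, which reduces processing efficiency. To address this problem, a clustering-based approximate join method for XML documents is proposed. The basic idea is to build an index for each XML document; if the indexes of several documents in the two data sets are similar, those documents are grouped into a cluster, and the approximate join is then executed within each cluster. Documents that fall in no cluster require no computation at all. Experimental results show that the proposed method is efficient while preserving accuracy.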

16.
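Drawing on the characteristics of XML documents, and using the XML data model XOEM with a compressed structure tree as the storage structure, the authors propose AFPMX, an efficient frequent-pattern mining algorithm for XML data, and show both theoretically and experimentally that the algorithm is feasible and effective.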

17.
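Simulation study of XML document similarity   Cited by: 1 (self-citations: 0, citations by others: 1)
Computing the similarity of XML documents is a difficult problem in XML document classification. This paper describes a structure-based method: sequential pattern mining is used to find the maximal similar paths between two documents, and the similarity between the two documents is then obtained as the ratio of the number of nodes on the maximal similar paths to the number of nodes on all paths. The paper proposes a new method for minimizing XML documents and considers both the semantic similarity and the structural similarity of document nodes, which further improves the precision of the computed document similarity. Experiments show that the method has good application prospects.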

18.
Extensible Markup Language (XML) documents consist of text data plus structured data (markup). XPath allows both text and structure to be queried. Evaluating such hybrid queries is challenging. We present a system for in-memory evaluation of XPath search queries, that is, queries with text and structure predicates, yet without advanced features such as backward axes, arithmetic, and joins. We show that for this query fragment, which contains Forward Core XPath, our system, dubbed Succinct XML Self-Index ('SXSI'), outperforms existing systems by 1 to 3 orders of magnitude. SXSI is based on state-of-the-art indexes for text and structure data. It combines two novelties. On the one hand, it represents the XML data in a compact indexed form, which allows it to handle larger collections in main memory while supporting powerful search and navigation operations over the text and the structure. On the other hand, it features an execution engine that uses tree automata and cleverly chooses evaluation orders that leverage the speeds of the respective indexes. SXSI is modular and allows seamless replacement of its indexes. This is demonstrated through experiments with (1) a text index specialized for searching biological sequences, and (2) a word-based text index specialized for natural language search. Copyright © 2013 John Wiley & Sons, Ltd.

19.
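XML document clustering based on frequent structures   Cited by: 1 (self-citations: 1, citations by others: 0)
This paper studies XML document clustering based on frequent structures, where frequent structures include frequent paths and frequent subtrees. It first introduces SSTMiner, an algorithm for mining all embedded frequent subtrees in XML documents, and then modifies SSTMiner to obtain the FrePathMiner and FreTreeMiner algorithms, which mine the maximal frequent paths and the maximal frequent subtrees of XML documents, respectively. On this basis, an agglomerative hierarchical clustering algorithm, XMLCluster, is proposed, which clusters documents using maximal frequent paths and maximal frequent subtrees, respectively, as the features of XML documents. Experimental results show that FrePathMiner and FreTreeMiner both find more frequent structures than the traditional ASPMiner algorithm, which provides more structural features for document clustering and thus yields higher clustering accuracy.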

20.
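A new indexing method for searching XML documents, RIST, is proposed. By representing both XML documents and XML queries as structure-encoded sequences (SES), querying XML data becomes equivalent to finding subsequence matches. RIST uses the tree structure as the basic unit of a query, thereby avoiding expensive join operations. In addition, RIST provides a unified index over both the content and the structure of XML documents, so a clear advantage is that it overcomes the drawbacks of indexing on content alone or on structure alone. Experiments show that RIST is an efficient method for supporting structural queries.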
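
The reduction from tree queries to subsequence matching can be sketched as follows. The sketch assumes that a structure-encoded sequence is the preorder list of (tag, path-from-root) pairs. It also assumes that the query is rooted at the document root, has no wildcards, and carries no text content. The index itself is not modelled; the sketch scans one sequence directly, and the names `encode` and `is_subsequence` are hypothetical.

```python
import xml.etree.ElementTree as ET


def encode(elem, prefix=()):
    """Return the preorder list of (tag, path-from-root) pairs under elem."""
    seq = [(elem.tag, prefix)]
    for child in elem:
        seq.extend(encode(child, prefix + (elem.tag,)))
    return seq


def is_subsequence(query_seq, doc_seq):
    """True if query_seq occurs in doc_seq in order, not necessarily contiguously."""
    remaining = iter(doc_seq)
    # `item in remaining` consumes the iterator up to the first match, so each
    # query item must be found after the position of the previous one.
    return all(item in remaining for item in query_seq)


doc = ET.fromstring("<book><title/><author><name/></author><year/></book>")
query = ET.fromstring("<book><author><name/></author></book>")
print(is_subsequence(encode(query), encode(doc)))
```

Because the whole query tree is matched as one sequence, no per-edge join between parent and child node lists is needed. A plain subsequence test like this one can over-approximate true tree matching when sibling subtrees repeat, so a real implementation would need an additional check.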
