Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
G. A. Rose  J. Welsh 《Software》1981,11(7):651-669
This paper presents a systematic approach to formatted language design that incorporates formatting within the syntax of programming languages. The approach includes:
  1. a metasyntax to ensure that program text is foldable, not only to avoid right margin overflow but also to preclude constructs which are visually confusing or ambiguous;
  2. a set of guidelines for language designers to enhance readability within the constraints of the metasyntax; and
  3. a folding algorithm which selectively folds a program text.
The resulting automatic formatting is consistent with current practice in program and text layout. The effect of this approach is to put program format decisions in the domain of the language's designer, rather than its several implementors or numerous users, which implies uniformly formatted programs of improved readability and therefore usability.

2.
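This paper reports a formal-language-based parsing method for analyzing the textual description part of generic structures described in the DELAGES description language. The parse results are stored as a syntax tree to facilitate subsequent retrieval or matching. Because syntactic analysis is used, the program can report possible errors in the textual description and indicate where they occur.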

3.
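XML is one of the widely adopted technologies for information exchange and representation on the WWW and has been called the universal data format of the future. This paper studies the existing XML syntax, represents XML data in a new textual form, proposes several new conceptual schemas for certain special forms of extensible XML data and proves their validity and correctness, and also proves that the syntax can be simplified. For the corresponding special XML syntax forms, automaton-based parsing algorithms are developed and analyzed.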

4.
We present a new approach to motion rearrangement that preserves the syntactic structures of an input motion automatically by learning a context-free grammar from the motion data. For grammatical analysis, we reduce an input motion into a string of terminal symbols by segmenting the motion into a series of subsequences, and then associating a group of similar subsequences with the same symbol. To obtain the most repetitive and precise set of terminals, we search for an optimal segmentation such that a large number of subsequences can be clustered into groups with little error. Once the input motion has been encoded as a string, a grammar induction algorithm is employed to build up a context-free grammar so that the grammar can reconstruct the original string accurately as well as generate novel strings sharing their syntactic structures with the original string. Given any new strings from the learned grammar, it is straightforward to synthesize motion sequences by replacing each terminal symbol with its associated motion segment, and stitching every motion segment sequentially. We demonstrate the usefulness and flexibility of our approach by learning grammars from a large diversity of human motions, and reproducing their syntactic structures in new motion sequences.

5.
This correspondence describes an approach to reducing the computational cost of document image decoding by viewing it as a heuristic search problem. The kernel of the approach is a modified dynamic programming (DP) algorithm, called the iterated complete path (ICP) algorithm, that is intended for use with separable source models. A set of heuristic functions is presented for decoding formatted text with ICP. Speedups of 3 to 25 times over DP have been observed when decoding text columns and telephone yellow pages using ICP and the proposed heuristics.

6.
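To browse the large volume of Chinese text on the web quickly and with high quality, a visual browsing mechanism based on a concave-convex tree structure of text is proposed, and its formal description is given. A hierarchical tree structure for text classification is built by identifying nodes with minimal concept sets annotated by keywords and a concept dictionary, giving users an effective path for browsing text quickly. Text summaries are extracted by statistical methods, and text content is browsed by outline, logical topic-word paragraphs, and summary, which improves search and query speed and reading efficiency and meets users' need to browse text quickly and actively.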

8.
Most World-Wide Web information servers provide simple browsing access to collections of static text or hypertext files. This paper describes some interactive World-Wide Web servers that produce information displays and documents dynamically rather than just providing access to static files. The PARC Map Viewer uses a geographic database to create and display maps of any part of the world on demand. The Digital Tradition folk music server provides access to a large database of song lyrics and melodies. These applications take advantage of the multimedia capabilities of World-Wide Web to deliver graphical and audio content as well as formatted text. Hypertext links are used not only for navigation, but also for setting search and presentation parameters. In these applications the HTML format and the HTTP protocol are used like a user interface tool kit to provide not only document retrieval but a complete custom user interface specialized for the application.

9.
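The difficulty in text classification lies in obtaining large, manually labeled sample datasets. Maintained by experts over a long period, the Medline database has a well-developed classification system based on MeSH (Medical Subject Headings) and a large number of abstracts, which can be used to build labeled sample datasets. This paper introduces and studies the Medline database and proposes how to use it to build a good classification model. Experiments show that, using the Medline abstract database with Major tags, 5000 feature terms, 600 training samples, and an SVM classifier, a fairly good classification model can be obtained, providing a practical and efficient way to build datasets for text classification research.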

10.
A context-free grammar corresponds to a system of equations in languages. The language generated by the grammar is the smallest solution of the system. We give a necessary and sufficient condition for an arbitrary solution to be the smallest one. We revive an old criterion to decide that a grammar has a unique solution. All this fits in an approach to search for a grammar for an arbitrary language that is given by other means. The approach is illustrated by the derivation of a grammar for a certain set of bit strings. The approach is used to give an elegant derivation of the grammar for a language accepted by a pushdown automaton.
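As a small worked illustration of the correspondence between grammars and equations (the two grammars below are standard textbook examples chosen here for illustration, not taken from the paper), compare an equation with a unique solution against one with many solutions:

```latex
% Grammar S -> a S b | \varepsilon corresponds to one equation in the unknown language X:
\[
  X = aXb \,\cup\, \{\varepsilon\}
\]
% Iterating from the empty language gives the smallest solution:
\[
  X_0 = \emptyset, \qquad X_{k+1} = aX_k b \cup \{\varepsilon\}, \qquad
  \bigcup_{k \ge 0} X_k = \{\, a^n b^n \mid n \ge 0 \,\}
\]
% Here the solution is unique: if Y is any solution and w \in Y, then either
% w = \varepsilon or w = aub with u \in Y and |u| < |w|, so induction on |w|
% gives Y \subseteq \{ a^n b^n \}.
%
% By contrast, the grammar S -> S | a corresponds to
\[
  X = X \cup \{a\}
\]
% whose smallest solution is \{a\}, but every language containing a is also a solution.
```

The second equation shows why a separate criterion is needed to tell whether a given solution is the smallest one.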

11.
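Clustering search results helps users quickly locate the information they are looking for. The method focuses on generating high-quality labels while clustering Chinese text: it obtains the page titles and snippets returned by a search engine, segments the text with a word segmentation tool, and removes stop words; it builds a single suffix tree, inserting text into its nodes word by word, and scores the words at each node using constraints on term frequency, word length, part of speech, and position; it then merges base clusters and takes the high-scoring node words as labels. Experimental results show that the clusters have high purity and that the extracted labels are accurate and discriminative, making the method convenient for users.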

12.
One may indicate the potentials of an MT system by stating what text genres it can process, e.g., weather reports and technical manuals. This approach is practical, but misleading, unless domain knowledge is highly integrated in the system. Another way to indicate which fragments of language the system can process is to state its grammatical potentials, or more formally, which languages the grammars of the system can generate. This approach is more technical and less understandable to the layman (customer), but it is less misleading, since it stresses the point that the fragments which can be translated by the grammars of a system need not necessarily coincide exactly with any particular genre. Generally, the syntactic and lexical rules of an MT system allow it to translate many sentences other than those belonging to a certain genre. On the other hand it probably cannot translate all the sentences of a particular genre. Swetra is a multilanguage MT system defined by the potentials of a formal grammar (standard referent grammar) and not by reference to a genre. Successful translation of sentences can be guaranteed if they are within a specified syntactic format based on a specified lexicon. The paper discusses the consequences of this approach (Grammatically Restricted Machine Translation, GRMT) and describes the limits set by a standard choice of grammatical rules for sentences and clauses, noun phrases, verb phrases, sentence adverbials, etc. Such rules have been set up for English, Swedish and Russian, mainly on the basis of familiarity (frequency) and computer efficiency, but restricting the grammar and making it suitable for several languages poses many problems for optimization. Sample texts (newspaper reports) illustrate the type of text that can be translated with reasonable success among Russian, English and Swedish.

14.
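Threat intelligence in network security is described in many different ways, and a standard for formatted description is urgently needed to convert unformatted intelligence into formatted data and support visual knowledge graphs of intelligence. Based on the STIX 2.0 specification, ontology elements suited to network security threat intelligence are extracted, a shareable, reusable, and extensible threat intelligence ontology model is built, and the domain ontology, application ontology, and atomic ontology are classified in detail. The model is applied to the Poisonivy attack: 61 entities and 102 relations are extracted from the Poisonivy research report, and the extracted formatted data is imported into Gephi for visualization. Building the threat intelligence ontology model completes the conversion of intelligence from unstructured to structured form, described with a unified syntax and finally expressed as a knowledge graph of the important elements, so that the core elements of a security incident and their relationships can be located quickly, providing an important basis for network security analysts and decision makers.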

16.
Trees can be conveniently compressed with linear straight-line context-free tree grammars. Such grammars generalize straight-line context-free string grammars which are widely used in the development of algorithms that execute directly on compressed structures (without prior decompression). It is shown that every linear straight-line context-free tree grammar can be transformed in polynomial time into a monadic (and linear) one. A tree grammar is monadic if each nonterminal uses at most one context parameter. Based on this result, polynomial time algorithms are presented for testing whether a given (i) nondeterministic tree automaton or (ii) nondeterministic tree automaton with sibling-constraints or (iii) nondeterministic tree walking automaton, accepts a tree represented by a linear straight-line context-free tree grammar. It is also shown that if tree grammars are nondeterministic or non-linear, then reducing their numbers of parameters cannot be done without an exponential blow-up in grammar size.
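To make "executing directly on compressed structures" concrete, here is a minimal Python sketch for the simpler string case that the abstract says tree grammars generalize. It is not the paper's algorithm: the grammar `RULES` and the helper names `length`, `char_at`, and `expand` are hypothetical, chosen only for illustration. The sketch answers "what is the i-th character?" by walking the grammar instead of decompressing.

```python
from functools import lru_cache

# Hypothetical straight-line grammar: every nonterminal has exactly one rule,
# either a single terminal character or a pair of previously defined nonterminals.
RULES = {
    "X1": "b",
    "X2": "a",
    "X3": ("X2", "X1"),
    "X4": ("X3", "X2"),
    "X5": ("X4", "X3"),
    "X6": ("X5", "X4"),
}


@lru_cache(maxsize=None)
def length(nt):
    """Length of the string derived from nt, computed once per nonterminal."""
    rule = RULES[nt]
    if isinstance(rule, str):
        return 1
    left, right = rule
    return length(left) + length(right)


def char_at(nt, i):
    """Return the i-th character (0-based) of the derived string without expanding it."""
    if not 0 <= i < length(nt):
        raise IndexError(i)
    rule = RULES[nt]
    while not isinstance(rule, str):
        left, right = rule
        if i < length(left):
            nt = left            # the position falls in the left part
        else:
            i -= length(left)    # skip the left part and continue on the right
            nt = right
        rule = RULES[nt]
    return rule


def expand(nt):
    """Full decompression, used here only to cross-check char_at."""
    rule = RULES[nt]
    if isinstance(rule, str):
        return rule
    return expand(rule[0]) + expand(rule[1])


print(length("X6"), expand("X6"), char_at("X6", 3))
```

Each lookup visits one rule per grammar level, so its cost depends on the depth of the grammar rather than on the length of the decompressed string.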

17.
This paper considers the problem of finding relevant answers to multi-sentence questions, a pressing problem in many applied fields. In particular, it arises in industrial systems that provide goods and services. One major approach is to repeatedly re-order a set of candidate answers obtained by keyword search by comparing the syntactic parse tree of each answer with the parse tree of the question. This work modifies the parse-tree approach and improves it by moving to a more exact representation of the semantic and syntactic structure of text: the paragraph is treated as the unit of analyzed information. The approach has been implemented in software, and the implementation is openly available as an add-on for the Apache SOLR search engine, so the proposed technology can be easily integrated with industrial search systems.

18.
Lars Arge 《Algorithmica》2003,37(1):1-24
We present a technique for designing external memory data structures that support batched operations I/O efficiently. We show how the technique can be used to develop external versions of a search tree, a priority queue, and a segment tree, and give examples of how these structures can be used to develop I/O-efficient algorithms. The developed algorithms are either extremely simple or straightforward generalizations of known internal memory algorithms, given the developed external data structures.

20.
Hai‐Feng Guo  Zongyan Qiu 《Software》2015,45(11):1519-1547
Grammar-based test generation provides a systematic approach to producing test cases from a given context-free grammar. Unfortunately, naive grammar-based test generation is problematic because exhaustive random test case production is often explosive, and grammar-based test generation with explicit annotation controls often causes unbalanced testing coverage. In this paper, we present an automatic grammar-based test generation approach, which takes a symbolic grammar as input, requires zero control input from users, and produces well-distributed test cases. Our approach utilizes a novel dynamic stochastic model where each variable is associated with a tuple of probability distributions, which are dynamically adjusted along the derivation. We further present a coverage tree illustrating the distribution of generated test cases and their detailed derivations. More importantly, the coverage tree supports various implicit derivation control mechanisms. We implemented this approach in a Java-based system, named Gena. Each test case generated by Gena automatically comes with a set of structural features, which can play an important and effective role in automated failure cause localization. Experimental results demonstrate the effectiveness of our approach, the well-balanced distribution of generated test cases over grammatical structures, and a case study on grammar-based failure cause localization. Copyright © 2014 John Wiley & Sons, Ltd.
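The general idea of deriving test cases from a grammar with dynamically adjusted production weights can be sketched as follows. This is a minimal Python illustration and not Gena's model: the toy grammar, the `1 / (1 + uses)` weighting, the depth bound, and the names `GRAMMAR` and `make_generator` are all assumptions made for this example.

```python
import random

# Hypothetical toy grammar (not from the paper). Keys are nonterminals; any
# other symbol is a terminal. By convention in this sketch, the LAST
# production of each nonterminal leads toward termination.
GRAMMAR = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("n",)],
}


def make_generator(grammar, start="E", seed=0, max_depth=6):
    rng = random.Random(seed)
    # uses[nt][i] counts how often production i of nt has been chosen so far.
    uses = {nt: [0] * len(prods) for nt, prods in grammar.items()}

    def expand(symbol, depth):
        if symbol not in grammar:
            return symbol  # terminal symbol: emit as-is
        prods = grammar[symbol]
        if depth >= max_depth:
            idx = len(prods) - 1  # force the terminating production
        else:
            # Productions chosen less often so far get a higher weight.
            weights = [1.0 / (1 + u) for u in uses[symbol]]
            idx = rng.choices(range(len(prods)), weights=weights)[0]
        uses[symbol][idx] += 1
        return "".join(expand(s, depth + 1) for s in prods[idx])

    def generate():
        return expand(start, 0)

    return generate, uses


generate, uses = make_generator(GRAMMAR, seed=1)
samples = [generate() for _ in range(20)]
for s in samples[:5]:
    print(s)
print(uses)  # per-production usage counts, a crude coverage summary
```

The depth bound guarantees termination, while the usage-based weights push the generator toward productions it has exercised less, which is one simple way to avoid the unbalanced coverage the abstract describes.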
