Similar Documents
 Found 20 similar documents (search time: 62 ms)
1.
This report describes the current state of our central research thrust in the area of natural language generation. We have already reported on our text-level theory of lexical selection in natural language generation ([59, 60]) and on a unification-based syntactic processor for syntactic generation ([73]), and we have designed a relatively flexible blackboard-oriented architecture for integrating these and other types of processing activities in generation ([60]). We have implemented these ideas in our prototype generator, Diogenes — a DIstributed, Opportunistic GENEration System — and tested our lexical selection and syntactic generation modules in a comprehensive natural language processing project — the KBMT-89 machine translation system ([15]). At this stage we are developing a more comprehensive Diogenes system, concentrating on both the theoretical and the system-building aspects of a) formulating a more comprehensive theory of distributed natural language generation; b) extending current theories of text organization as they pertain to the task of planning natural language texts; c) improving and extending the knowledge representation and the actual body of background knowledge (both domain and discourse/pragmatic) required for comprehensive text planning; d) designing and implementing algorithms for dynamic realization of text structure and integrating them into the blackboard style of communication and control; and e) designing and implementing control algorithms for distributed text planning and realization. In this document we describe our ideas concerning opportunistic control for a natural language generation planner and present a research and development plan for the Diogenes project. Many people have contributed to the design and development of the Diogenes generation system over the last four years, especially Eric Nyberg, Rita McCardell, Donna Gates, Christine Defrise, John Leavitt, Scott Huffman, Ed Kenschaft and Philip Werner. Eric Nyberg and Masaru Tomita created genkit, which is used as the syntactic component of Diogenes. A short version of this article appeared in Proceedings of IJCAI-89, co-authored with Victor Lesser and Eric Nyberg. Many thanks to all of the above. The remaining errors are the responsibility of this author.
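The opportunistic, blackboard-style control mentioned above can be pictured with a small sketch. The module names, triggers, and scheduling policy below are illustrative assumptions, not the actual Diogenes design: knowledge sources post partial results to a shared blackboard and fire whenever their preconditions appear.

```python
# Toy blackboard loop: knowledge sources fire opportunistically whenever
# their preconditions appear on the blackboard. Module names and triggers
# are illustrative, not Diogenes' actual design.
blackboard = {"goal": "describe-event"}

def text_planner(bb):
    if "goal" in bb and "plan" not in bb:
        bb["plan"] = ["intro-sentence", "detail-sentence"]

def lexical_chooser(bb):
    if "plan" in bb and "words" not in bb:
        bb["words"] = {"intro-sentence": ["an", "event", "occurred"]}

def syntactic_realizer(bb):
    if "words" in bb and "text" not in bb:
        bb["text"] = " ".join(bb["words"]["intro-sentence"]).capitalize() + "."

knowledge_sources = [text_planner, lexical_chooser, syntactic_realizer]

changed = True
while changed:                    # run until no source can contribute further
    before = dict(blackboard)
    for ks in knowledge_sources:  # each source fires only when ready
        ks(blackboard)
    changed = blackboard != before

print(blackboard["text"])         # -> "An event occurred."
```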

2.
Domain conceptual knowledge modeling is one of the key techniques and tasks in information systems analysis, and it is also a bottleneck problem in knowledge engineering; the difficulty lies in how to correctly and completely capture and validate domain experts' knowledge. Attribute-based modeling methods such as ER and UML can describe domain knowledge well, but they make it difficult for domain experts to confirm the correctness and completeness of the captured knowledge. Fact-Oriented Modeling (FOM) is a domain conceptual knowledge modeling approach oriented entirely toward natural-language communication, which makes it an ideal aid for conceptual modeling and ontology engineering. This paper briefly analyzes the conceptual modeling process, compares different conceptual modeling methods, introduces the technical evolution of FOM, and reviews the current state of FOM research and open problems in terms of business rules, dynamic modeling, model abstraction mechanisms, model transformation, and engineering applications.
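To illustrate the fact-oriented style (the fact types, readings, and names below are hypothetical, not drawn from any specific FOM tool): each fact type carries a natural-language reading template, so a populated model can be verbalized sentence by sentence for a domain expert to confirm or reject.

```python
# Minimal sketch of fact-oriented verbalization (hypothetical example).
# Each fact type stores a natural-language reading template; populating it
# yields sentences a domain expert can confirm or reject.

class FactType:
    def __init__(self, reading: str):
        self.reading = reading  # e.g. "{person} works for {department}"

    def verbalize(self, **roles) -> str:
        return self.reading.format(**roles)

works_for = FactType("{person} works for {department}")
heads = FactType("{person} heads {department}")

facts = [
    works_for.verbalize(person="Alice", department="Sales"),
    heads.verbalize(person="Alice", department="Sales"),
]
for sentence in facts:
    print(sentence)  # "Alice works for Sales" / "Alice heads Sales"
```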

3.
Market research for requirements analysis using linguistic tools
Numerous studies in recent months have proposed the use of linguistic instruments to support requirements analysis. There are two main reasons for this: (i) the progress made in natural language processing and (ii) the need to provide the developers of software systems with support in the early phases of requirements definition and conceptual modelling. This paper presents the results of online market research intended (a) to assess the economic advantages of developing a CASE (computer-aided software engineering) tool that integrates linguistic analysis techniques for documents written in natural language, and (b) to verify the existence of the potential demand for such a tool. The research included a study of the language – ranging from completely natural to highly restricted – used in documents available for requirements analysis, an important factor given that on a technological level there is a trade-off between the language used and the performance of the linguistic instruments. To determine the potential demand for such a tool, some of the survey questions dealt with the adoption of development methodologies and consequently with models and support tools; other questions referred to activities deemed critical by the companies involved. Through statistical correspondence analysis of the responses, we were able to outline two profiles of companies that correspond to two potential market niches, which are characterised by their very different approaches to software development.

4.
5.
Written text is an important component in the process of knowledge acquisition and communication. Poorly written text fails to deliver clear ideas to the reader no matter how revolutionary and ground-breaking these ideas are. Good writing style is essential to transferring ideas smoothly. While we have sophisticated tools to check for stylistic problems in program code, we do not apply the same techniques to written text. In this paper we present TextLint, a rule-based tool to check for common style errors in natural language. TextLint provides a structural model of written text and an extensible rule-based checking mechanism.
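As a rough illustration of the rule-based approach described here (the rules and messages below are invented; the abstract does not detail TextLint's actual rule set or API), a checker can scan a structural model of the text, here simplified to lines, and report each rule violation:

```python
# Minimal sketch of a rule-based style checker in the spirit described
# above; the rules and their messages are illustrative, not TextLint's own.
import re

RULES = [
    (re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE), "repeated word"),
    (re.compile(r"\b(very|really|quite)\b", re.IGNORECASE), "weak intensifier"),
    (re.compile(r" {2,}"), "multiple consecutive spaces"),
]

def lint(text: str):
    """Yield (line_number, message, offending_text) for each rule match."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, message in RULES:
            for match in pattern.finditer(line):
                yield lineno, message, match.group(0)

sample = "This is is a very good sentence.  It has  issues."
for lineno, message, found in lint(sample):
    print(f"line {lineno}: {message}: {found!r}")
```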

6.
The paper exemplifies programming in a wide spectrum language by presenting styles which range from non-operative specifications—using abstract types and tools from predicate logic as well as set theory—through recursive functions, to procedural programs with variables. Besides a number of basic types, we develop an interpreter for parts of the language itself, an algorithm for applying transformation rules to program representations, a text editor, and a simulation of Backus' functional programming language.

7.
Object-oriented design support system for machine tools
This paper deals with an object-oriented intelligent design support system which is intended to assist in the basic design of machine tools, in particular machining centres. The machine tools design process is analysed through interviews with experienced designers, and an object-oriented model is established to represent the design process. Software modules named design objects are proposed, which are basic components for the implementation of an intelligent design support system for machine tools. A prototype of the design support system for machining centres is developed based on the design objects, and some case studies are carried out to verify the effectiveness of the methods proposed.

8.
The present paper is meant to summarise and enlighten the theoretical implications of the twin theories of text comprehension and of text compression. Compatibility and non-exclusiveness of particle-like analysis of language and wave-like analysis of intentionality are also demonstrated within the newly established quantum linguistics framework. The informative state of language is viewed as being relatively stable; once activated and subject to motion, therefore reaching a communicative state, different phenomena occur, which may be observed, analysed and visualised through CPP-TRS observational devices. Relativity theory may therefore be organised in terms of quanta with continuity and no contradiction. The present paper was presented at the Computation and Linguistic Colloquium Series, University of Maryland at College Park, in the Spring of 1997.

9.
We present an integrated knowledge representation system for natural language processing (NLP) whose main distinguishing feature is its emphasis on encoding not only the usual propositional structure of the utterances in the input text, but also capturing an entire complex of nonpropositional — discourse, attitudinal, and other pragmatic — meanings that NL texts always carry. The need for discourse pragmatics, together with generic semantic information, is demonstrated in the context of anaphoric and definite noun phrase resolution for accurate machine translation. The major types of requisite pragmatic knowledge are presented, and an extension of a frame-based formalism developed in the context of the TRANSLATOR system is proposed as a first-pass codification of the integrated knowledge base.
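A toy sketch of what a frame pairing propositional structure with pragmatic slots might look like is given below; the slot names and the resolution heuristic are invented for illustration and are not the TRANSLATOR formalism:

```python
# Illustrative frame for a single utterance, pairing propositional content
# with pragmatic slots; slot names are hypothetical, not TRANSLATOR's.
utterance_frame = {
    "proposition": {"predicate": "deliver", "agent": "company", "theme": "report"},
    "pragmatics": {
        "speech_act": "promise",
        "speaker_attitude": "committed",
        "discourse_focus": "report",  # a later "it" resolves to the focus
    },
}

def resolve_pronoun(frame: dict) -> str:
    """Toy anaphora resolution: prefer the current discourse focus."""
    return frame["pragmatics"]["discourse_focus"]

print(resolve_pronoun(utterance_frame))  # -> report
```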

10.
We introduce a dual-use methodology for automating the maintenance and growth of two types of knowledge sources, which are crucial for natural language text understanding—background knowledge of the underlying domain and linguistic knowledge about the lexicon and the grammar of the underlying natural language. A particularity of this approach is that learning occurs simultaneously with the on-going text understanding process. The knowledge assimilation process is centered around the linguistic and conceptual ‘quality’ of various forms of evidence underlying the generation, assessment and on-going refinement of lexical and concept hypotheses. On the basis of the strength of evidence, hypotheses are ranked according to qualitative plausibility criteria, and the most reasonable ones are selected for assimilation into the already given lexical class hierarchy and domain ontology.
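The evidence-based ranking described above might be sketched as follows; the evidence types and numeric weights are invented stand-ins for the paper's qualitative plausibility criteria:

```python
# Toy ranking of lexical hypotheses by weighted evidence, in the spirit of
# the quality-based assessment described above; weights are hypothetical.
hypotheses = [
    {"lexeme": "hepatitis", "evidence": {"morphology": 2, "context": 3}},
    {"lexeme": "hepatitas", "evidence": {"morphology": 1, "context": 0}},
]

WEIGHTS = {"morphology": 1.0, "context": 2.0}

def plausibility(h):
    return sum(WEIGHTS[kind] * strength for kind, strength in h["evidence"].items())

best = max(hypotheses, key=plausibility)
print(best["lexeme"], plausibility(best))  # strongest hypothesis is assimilated
```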

11.
Identifier names play a key role in program understanding and in particular in concept location. Programmers can easily “parse” identifiers and understand the intended meaning. This, however, is not trivial for tools that try to exploit the information in the identifiers to support program understanding. To address this problem, we resort to natural language analyzers, which parse tokenized identifier names and provide the syntactic relationships (dependencies) among the terms composing the identifiers. Such relationships are then mapped to semantic relationships.
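The first step in exploiting identifier names is splitting them into terms before any natural-language parsing; a minimal tokenizer is sketched below (the subsequent dependency analysis needs a full NL parser and is not shown):

```python
# Minimal identifier tokenizer: splits snake_case, camelCase, PascalCase,
# and acronym boundaries into terms for downstream linguistic analysis.
import re

def tokenize_identifier(name: str) -> list[str]:
    parts = name.replace("_", " ")
    # split lowercase->uppercase boundaries: "getFileName" -> "get File Name"
    parts = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", parts)
    # split acronym->word boundaries: "XMLParser" -> "XML Parser"
    parts = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", parts)
    return [t.lower() for t in parts.split()]

print(tokenize_identifier("getXMLFileName_v2"))
# -> ['get', 'xml', 'file', 'name', 'v2']
```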

12.
Cooperative hypermedia means producing and manipulating hyper-organized multimedia data by a group of (co-)users. We have built a prototype that enables coauthors to cooperatively produce hypermedia documents. It allows coauthors to communicate their ideas, drafts, guidelines, and constraints within a group in order to exchange information (remotely or face-to-face) and improve the final document. When analyzing the transition from individual work to group work across different human activities, two pitfalls are often detected once computer support is considered. On the one hand are the social and technological communication problems, particularly if members of the group are geographically distant from one another. On the other hand are the productivity losses, which are usually due to communication difficulties and frequent social inadequacies of the group's computer support. We propose the use of this prototype — CoMEdiA — as a way to enhance intra-group communication, or prevent its decline, and to improve the outcomes of group editing tasks (constrained to the kinds of tasks for which it has been designed). Most of the techniques used to achieve this can be applied in other tools to support other specific types of group activities.

13.
Since April 1989, the Center for Text and Technology at Georgetown University has gathered information on the structure of projects that produce electronic text in the humanities. This report — based on the April 1991 version of the Georgetown Catalogue and emphasizing its full-text projects in humanities disciplines other than linguistics — surveys the countries in which projects are found, the languages encoded, the disciplines served, and the auspices represented. Then the report explores three trends toward the improvement of electronic texts: increased scope of the new projects, improved quality of the editions used, and greater sophistication in the text-analysis tools added. Included among the notes is a list of titles and contacts for 42 projects cited in the report. Michael Neuman is Director of Georgetown University's Center for Text and Technology, whose mission is the creation and dissemination of electronic text for the enhancement of teaching and research in the humanities. He has taught English literature, but his recent articles and presentations focus on electronic editions of philosophical works. James A. Wilderotter II, Project Assistant at the Center for Text and Technology, has provided many of the compilations in this report and gathered much of the data in the current version of the Georgetown Catalogue of Projects in Electronic Text.

14.
Students still take class notes using pencil and paper—although digital documents are more legible, easier to search in and easier to edit—in part because of the lack of software to support note-taking. Class notes are characterized by free spatial organization, many small chunks of text, and a dense mix of text and graphic elements. These characteristics imply that a note-taking system should use pen, keyboard and mouse-or-equivalent; allow the swift entry of text at any desired position; and minimize the need to switch between input tools. A system with these properties was built and used by 10 subjects in a controlled study and by four users in their classes. Some users preferred our system to pencil and paper, suggesting that taking class notes with the computer is feasible.

15.
A text manipulating subset of the language ANALITIK is discussed in the framework of the hypertext technology. Extension tools are proposed that enable the language to process both texts and hypertexts. Translated from Kibernetika, No. 4, pp. 1–8, July–August, 1990.

16.
Distributed data stream processing applications are often characterized by data flow graphs consisting of a large number of built-in and user-defined operators connected via streams. These flow graphs are typically deployed on a large set of nodes. The data processing is carried out on-the-fly, as tuples arrive at possibly very high rates, with minimum latency. It is well known that developing and debugging distributed, multi-threaded, and asynchronous applications, such as stream processing applications, can be challenging. Thus, without domain-specific debugging support, developers struggle when debugging distributed applications. In this paper, we describe tools and language support for debugging distributed stream processing applications. Our key insight is to view debugging of stream processing applications from four different, but related, perspectives. First, debugging the semantics of the application involves verifying the operator-level composition and inspecting the flows at the logical level. Second, debugging the user-defined operators involves traditional source-code debugging, but strongly tied to the stream-level interactions. Third, debugging the deployment details of the application requires understanding the runtime physical layout and configuration of the application. Fourth, debugging the performance of the application requires inspecting various performance metrics (such as communication rates, CPU utilization, etc.) associated with streams, operators, and nodes in the system. In light of this characterization, we developed several tools, such as a debugger-aware compiler and an associated stream debugger, composition and deployment visualizers, and performance visualizers, as well as language support, such as configuration knobs for logging and tracing, deployment configurations such as operator-to-process and process-to-node mappings, monitoring directives to inspect streams, and special sink adapters to intercept and dump streaming data to files and sockets, to name a few. We describe these tools in the context of Spade, a language for creating distributed stream processing applications, and System S, a distributed stream processing middleware under development at the IBM Watson Research Center. Published in 2009 by John Wiley & Sons, Ltd.
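The fourth perspective, performance debugging, can be illustrated generically: attach counters to streams so that tuple rates per stream and operator become inspectable. The sketch below is plain Python, not Spade or System S syntax:

```python
# Generic illustration of per-stream performance inspection in a toy
# pipeline; the idea is attaching counters to streams so throughput
# becomes observable. This is not Spade/System S syntax.
import time
from collections import defaultdict

metrics = defaultdict(int)  # tuples seen per stream name

def traced(stream_name, operator):
    def wrapper(tuple_):
        metrics[stream_name] += 1
        return operator(tuple_)
    return wrapper

# two trivial operators composed into a pipeline
parse = traced("source->parse", lambda t: int(t))
double = traced("parse->double", lambda t: 2 * t)

start = time.time()
for raw in ["1", "2", "3", "4"]:
    double(parse(raw))

elapsed = time.time() - start
for stream, count in metrics.items():
    print(f"{stream}: {count} tuples ({count / max(elapsed, 1e-9):.0f}/s)")
```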

17.
A new correctness criterion for schedules of update transactions is proposed, which captures users' intended changes to the database. This is motivated by the observation that traditional serializability may lead to anomalies by not taking into account semantics related to such intended changes. The alternate criterion — goal-correctness — is orthogonal to serializability, and is based on realizing goals associated with each transaction. The problems involved in goal-oriented concurrency control are first identified in a general framework. The analysis suggests that this approach is practical only for restricted transaction languages where goals can be inferred and manipulated efficiently. One such language is then considered, capturing a class of updates of practical interest. For this language, it is shown that goal-oriented concurrency control is tractable and compares favorably to serializability with respect to complexity: testing goal-correctness takes polynomial time, while testing serializability is NP-complete. The sets of schedules which are correct with respect to the two criteria are incomparable. Thus, goal-correctness may allow increased concurrency. The results highlight the feasibility and advantages of goal-oriented concurrency control in restricted frameworks. The paper also discusses the dynamic aspects of goal-oriented concurrency control; in particular, an optimistic approach to the dynamic generation of goal-correct schedules is presented. An extended abstract of this paper appeared in the Proceedings of the 2nd Symposium on Mathematical Fundamentals of Database Systems (MFDBS), LNCS 364 (Springer, 1989) pp. 398–414. This author was supported in part by the National Science Foundation, under Grant Number IRI-8816078.
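A toy rendering of goal-correctness, under an invented transaction language of simple updates and goals over a single integer cell, shows the criterion at work: the schedule is accepted because the final state satisfies every transaction's goal, independently of serializability:

```python
# Toy illustration of goal-correctness: a schedule is accepted if the final
# state satisfies every transaction's declared goal, regardless of whether
# the interleaving is serializable. The transaction language here (simple
# updates and goals over one integer cell) is invented for illustration.
def run(schedule, state):
    for op in schedule:          # op: callable that mutates state
        op(state)
    return state

state = {"x": 0}

# T1 intends x to end up >= 1; T2 intends x to be even.
t1_ops = [lambda s: s.__setitem__("x", s["x"] + 1)]
t2_ops = [lambda s: s.__setitem__("x", s["x"] * 2)]
goals = [lambda s: s["x"] >= 1, lambda s: s["x"] % 2 == 0]

final = run(t1_ops + t2_ops, state)           # schedule: T1 before T2
goal_correct = all(goal(final) for goal in goals)
print(final, goal_correct)                    # {'x': 2} True
```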

18.
We describe GATE, the General Architecture for Text Engineering, an integrated visual development environment to support the visual assembly, execution and analysis of modular natural language processing systems. The visual model is an executable data flow program graph, automatically synthesised from data dependency declarations of language processing modules. The graph is then directly executable: modules are run interactively in the graph, and results are accessible via generic text visualisation tools linked to the modules. These tools lighten the ‘cognitive load’ of viewing and comparing module results by relating data produced by modules back to the underlying text, by reducing the amount of search in examining results, and by displaying results in context. Overall, the GATE integrated visual development environment leads to rapid understanding of system behaviour and hence to rapid system refinement, therefore demonstrating the utility of visual programming and visualisation techniques for the development of natural language processing systems.
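The synthesis of an executable graph from data-dependency declarations can be sketched generically; the module names and declaration format below are invented, not GATE's actual module interface:

```python
# Generic sketch: derive an executable pipeline from modules' declared
# inputs/outputs via topological ordering; names/formats are illustrative.
from graphlib import TopologicalSorter

modules = {
    "tokenizer": {"needs": {"text"},          "makes": {"tokens"}},
    "tagger":    {"needs": {"tokens"},        "makes": {"pos"}},
    "chunker":   {"needs": {"tokens", "pos"}, "makes": {"chunks"}},
}

# module A depends on module B if A needs something B makes
deps = {
    a: {b for b, mb in modules.items() if modules[a]["needs"] & mb["makes"]}
    for a in modules
}

for module in TopologicalSorter(deps).static_order():
    print("run", module)  # -> tokenizer, tagger, chunker
```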

19.
Text visualization has become a significant tool that facilitates knowledge discovery and insightful presentation of large amounts of data. This paper presents a visualization system for exploring Arabic text called ViStA. We report on the design, the implementation and some of the experiments we conducted on the system. The development of such tools assists Arabic language analysts to effectively explore, understand, and discover interesting knowledge hidden in text data. We used statistical techniques from the field of Information Retrieval to identify the relevant documents coupled with sophisticated natural language processing (NLP) tools to process the text. For text visualization, the system used a hybrid approach combining latent semantic indexing for feature selection and multidimensional scaling for dimensionality reduction. Initial results confirm the viability of using this approach to tackle the problem of Arabic text visualization and other Arabic NLP applications.
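The hybrid pipeline described here, latent semantic indexing for feature selection followed by multidimensional scaling for a 2-D layout, corresponds roughly to the following sketch; the use of scikit-learn and all parameter choices are our assumptions, not ViStA's implementation:

```python
# Rough analogue of the described pipeline: TF-IDF -> truncated SVD (LSI)
# -> MDS down to 2-D for plotting. scikit-learn is our choice, not ViStA's.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.manifold import MDS

docs = [
    "text visualization for large corpora",
    "arabic natural language processing",
    "information retrieval and indexing",
    "visualizing arabic text collections",
]

tfidf = TfidfVectorizer().fit_transform(docs)            # term-document matrix
lsi = TruncatedSVD(n_components=3).fit_transform(tfidf)  # latent semantic space
coords = MDS(n_components=2, random_state=0).fit_transform(lsi)

for doc, (x, y) in zip(docs, coords):
    print(f"({x:+.2f}, {y:+.2f})  {doc}")
```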

20.
On the multiprocessor vector-supercomputer CRAY X-MP, parallelism—beyond vectorization—can be exploited on the programming language level by two multitasking strategies: macrotasking and, more recently, microtasking. In this paper, multitasking results and experiences are presented which have been gained by applying these two implemented modes to linear-algebra and non-numerical algorithms as well as to a large fluid-flow simulation code. While comparing the concepts and realizations of macrotasking and microtasking, the features, tools, and problems of multitasking programming and the potential user benefit of these parallel processing techniques are discussed.
