Similar Documents
20 similar documents found (search time: 15 ms)
1.
2.
We describe a system under development whose goal is to provide a “natural” environment for students learning to produce sentences in French. The learning objective is personal pronouns; the method is inductive (learning through exploration). The inputs of the learning component are conceptual structures (meanings) and the corresponding linguistic forms (sentences); its outputs are rules characterizing these data. The learning is dialogue-based, that is to say, the student may ask certain kinds of questions, such as: How does one say 〈idea〉?, Can one say 〈linguistic form〉?, and Why does one say 〈linguistic form〉?, and the system answers them. By integrating the student into the process, that is, by encouraging him to build and explore a search space, we hope to enhance not only his learning efficiency (what and how to learn), but also our understanding of the underlying processes. By analyzing the trace of the dialogue (what questions were asked at what moment), we may infer the strategies a student put to use. Although the system covers far more than what is discussed here, we will restrict our discussion to a small subset of grammar, personal pronouns, which are known to be a notorious problem in both first and second language learning.

3.
Early phases of software development are known to be problematic and difficult to manage, and errors occurring during these phases are expensive to correct. Many systems have been developed to aid the transition from informal Natural Language requirements to semi-structured or formal specifications. Furthermore, consistency checking is seen by many software engineers as the solution for reducing the number of errors occurring during the software development life cycle and for allowing early verification and validation of software systems. However, this is confined to the models developed during analysis and design and fails to include the early Natural Language requirements. This excludes proper user involvement and creates a gap between the original requirements and the updated and modified models and implementations of the system. To improve this process, we propose a system that generates Natural Language specifications from UML class diagrams. We first investigate the variation of the input language used in naming the components of a class diagram, based on the study of a large number of examples from the literature, and then develop rules for removing ambiguities in the subset of Natural Language used within UML. We use WordNet, a linguistic ontology, to disambiguate the lexical structures of the UML string names and generate semantically sound sentences. Our system is developed in Java and is tested on an independent, though academic, case study.
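A minimal sketch of the idea described above, generating English specification sentences from a UML class description. This is not the paper's Java implementation; the class name, attributes, association verbs and template wording are all invented for illustration, and no WordNet disambiguation is attempted.

```python
# Hypothetical sketch: verbalizing one UML class as specification sentences.
# Names and templates are illustrative, not taken from the paper's system.

def describe_class(name, attributes, associations):
    """Generate natural-language sentences for one UML class."""
    sentences = [f"A {name} has {', '.join(attributes)}."]
    for verb, target in associations:
        # e.g. ("borrows", "Book") -> "A Member borrows Books."
        sentences.append(f"A {name} {verb} {target}s.")
    return " ".join(sentences)

spec = describe_class("Member", ["a name", "an id"], [("borrows", "Book")])
print(spec)  # A Member has a name, an id. A Member borrows Books.
```

A real system would additionally normalize naming variation (e.g. camelCase splitting) and pick word senses before filling templates.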

4.
Developing knowledge-transforming skills in writing may help students increase learning by actively building knowledge, regardless of the domain. However, many undergraduate students struggle to transform knowledge when drafting essays based on multiple sources. Writing analytics can be used to scaffold knowledge transforming as writers bring evidence to bear in supporting claims. We investigated how to automatically identify sentences representing knowledge transformation in argumentative essays. A synthesis of cognitive theories of writing and Bloom's typology identified 22 linguistic features to model processes of knowledge transforming in a corpus of 38 undergraduates' essays. Findings indicate that undergraduates mostly paraphrase or copy information from multiple sources rather than engage deeply with the sources' content. Eight linguistic features were important for discriminating evidential sentences as telling versus transforming source knowledge. We trained a machine learning algorithm that accurately classified nearly three of four evidential sentences as knowledge-telling or knowledge-transforming, offering potential for use in future research.
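To make the telling-versus-transforming distinction concrete, here is a deliberately crude sketch with a single invented feature: lexical overlap with the source text as a proxy for copying or close paraphrase. The real study used 22 linguistic features and a trained classifier; the threshold and feature below are illustrative stand-ins only.

```python
# Hypothetical one-feature classifier: high lexical overlap with the source
# suggests knowledge-telling (copy/paraphrase); low overlap suggests the
# writer transformed the material. Threshold is an invented example value.

def overlap_ratio(sentence, source):
    s = set(sentence.lower().split())
    src = set(source.lower().split())
    return len(s & src) / max(len(s), 1)

def classify(sentence, source, threshold=0.6):
    return "telling" if overlap_ratio(sentence, source) >= threshold else "transforming"

print(classify("the cat sat on the mat", "the cat sat on the mat today"))
# -> telling
```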

5.
In this paper, we describe Teapot, a domain-specific language for writing cache coherence protocols. Cache coherence is of concern when parallel and distributed systems make local replicas of shared data to improve scalability and performance. In both distributed shared memory systems and distributed file systems, a coherence protocol maintains agreement among the replicated copies as the underlying data are modified by programs running on the system. Cache coherence protocols are notoriously difficult to implement, debug, and maintain. Moreover, protocols are not off-the-shelf, reusable components, because their details depend on the requirements of the system under consideration. The complexity of engineering coherence protocols can discourage users from experimenting with new, potentially more efficient protocols. We have designed and implemented Teapot, a domain-specific language that attempts to address this complexity. Teapot's language constructs, such as a state-centric control structure and continuations, are better suited to expressing protocol code than those of a typical systems programming language. Teapot also facilitates automatic verification of protocols, so hard-to-find protocol bugs, such as deadlocks, can be detected and fixed before they are encountered in an actual execution. We describe the design rationale of Teapot, present an empirical evaluation of the language using two case studies, and relate the lessons that we learned in building a domain-specific language for systems programming.
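The state-centric style the abstract mentions can be sketched, in plain Python rather than Teapot syntax, as a table mapping (state, message) pairs to an action and a successor state. The states, messages, and action names below are invented for illustration and are not Teapot code.

```python
# Not Teapot: a Python sketch of state-centric protocol structure. The entries
# resemble a simplified invalidation protocol; names are hypothetical.

PROTOCOL = {
    ("Invalid", "Read"): ("fetch_from_home", "Shared"),
    ("Shared", "Write"): ("request_exclusive", "Exclusive"),
    ("Exclusive", "Invalidate"): ("write_back", "Invalid"),
}

def step(state, message):
    """Return the action to perform and the next state."""
    action, next_state = PROTOCOL[(state, message)]
    return action, next_state

print(step("Invalid", "Read"))  # ('fetch_from_home', 'Shared')
```

Making every transition explicit in one table is what enables the kind of exhaustive checking (e.g. for deadlocks) that the abstract attributes to Teapot's verification support.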

6.
In this paper, we address the problem of literary writing style determination using a comparison of the randomness of two given texts. We attempt to determine whether these texts are generated from distinct probability sources that can reveal a difference between the literary writing styles of the corresponding authors. We propose a new approach based on the incorporation of the known Friedman-Rafsky two-sample test into a multistage procedure, with the aim of stabilizing the process. A sampling procedure constructed by applying the N-grams methodology is applied to simulate samples drawn from the pooled text, with the aim of evaluating the null-hypothesis distribution that arises when the writing styles coincide. Next, samples from different files are selected, and the p-values of the test statistics are calculated. An empirical distribution of these values is compared numerous times with the uniform distribution on the interval [0, 1], and the writing styles are recognized as different if the rejection fraction in this sequence of comparisons is significantly greater than 0.5. The proposed approach is language-independent within the family of alphabetic languages and does not rely on linguistic analysis. In contrast with most existing methods, our approach does not deal with determining any authorship attributes. The text itself, more precisely the distribution of sequential text templates and their mutual occurrences, essentially identifies the style. Experiments demonstrate the strong capability of the proposed method.
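The decision stage of the multistage procedure can be sketched as follows. This replaces the Friedman-Rafsky test itself with a placeholder: we assume a list of p-values from repeated two-sample tests is already available, and only show the rejection-fraction rule; the example p-values and significance level are invented.

```python
# Sketch of the final decision rule only: styles are declared different when
# the fraction of rejected tests clearly exceeds 0.5. Inputs are hypothetical.

def rejection_fraction(p_values, alpha=0.05):
    return sum(p < alpha for p in p_values) / len(p_values)

def styles_differ(p_values, alpha=0.05, threshold=0.5):
    return rejection_fraction(p_values, alpha) > threshold

same_style = [0.12, 0.54, 0.33, 0.91, 0.47]   # plausible under the null
diff_style = [0.001, 0.02, 0.04, 0.003, 0.3]  # mostly rejections
print(styles_differ(same_style), styles_differ(diff_style))  # False True
```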

7.
Applications of linguistic techniques for use case analysis (total citations: 2; self-citations: 2; citations by others: 0)
Use cases are effective techniques to express the functional requirements of a system in a very simple and easy-to-learn way. Use cases are mainly composed of natural language (NL) sentences, and the use of NL to describe the behaviour of a system is always a critical point, due to the inherent ambiguities originating from the different possible interpretations of NL sentences. We discuss in this paper the application of analysis techniques based on a linguistic approach to detect, within requirements documents, defects related to such an inherent ambiguity. Starting from the proposed analysis techniques, we will define some metrics that will be used to perform a quality evaluation of requirements documents. Some available automatic tools supporting the linguistic analysis of NL requirements have been used to evaluate an industrial use cases document according to the defined metrics. A discussion on the application of linguistic analysis techniques to support the semantic analysis of use cases is also reported.

8.
Revisiting the word segmentation problem in written Chinese: ten benefits of word-segmented writing (total citations: 2; self-citations: 1; citations by others: 1)
Word segmentation is of considerable importance for the use and study of modern Chinese and for computer information processing. This paper sets out ten benefits of writing Chinese with explicit word separation and discusses some issues of implementation. The paper itself is written throughout in word-segmented form.

9.
Automatic computer writing is an important research direction in artificial intelligence. Most existing methods are template-based and generate rather monotonous articles, without topic-oriented prompting or recommendation for the writing content, and with even less rhetorical colouring. To make automatically written articles more attractive, parallel sentences of the kind used in real-world writing can be incorporated into the generated work, based on topic and similarity computations, making the result more vivid. This paper studies algorithms for automatically extracting parallel sentences from standard documents, so that the extracted sentences can be used effectively as language material in automatic computer writing. We extract parallel sentences using intra-paragraph and inter-paragraph parallelism features; experimental results show that the precision of the proposed method exceeds 93%.
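One very crude intra-paragraph cue for parallelism is consecutive sentences that begin with the same word sequence. The sketch below uses only that invented heuristic; the actual paper combines richer intra- and inter-paragraph features.

```python
# Illustrative heuristic only: flag a sentence pair as parallel when they
# share a sufficiently long initial word sequence. Threshold is invented.

def shared_prefix_len(a, b):
    a, b = a.split(), b.split()
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def is_parallel_pair(s1, s2, min_prefix=2):
    return shared_prefix_len(s1, s2) >= min_prefix

print(is_parallel_pair("We value honesty in work", "We value courage in life"))  # True
```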

10.
Computers & Education, 2008, 50(4): 1122–1146
This paper presents a field study carried out with learners who used a grammar checker in real writing tasks in an advanced course at a Swedish university. The objective of the study was to investigate how students made use of the grammar checker in their writing while learning Swedish as a second language. Sixteen students with different linguistic and cultural backgrounds participated in the study. The learners carried out a judgment procedure on the alarms raised by the grammar checker. The students’ texts were also collected in two versions: a version written before the session with the grammar checker, and a version written after the session. This procedure made it possible to study to what extent the students followed the advice from the grammar checker, and how this was related to their judgments of its behavior. The results obtained demonstrated that although most of the alarms from the grammar checker were accurate, some alarms were very hard for the students to judge correctly. The results also showed that providing students with feedback on different aspects of their target-language use, not only on their errors, and facilitating the processes of language exploration and reflection are important in second-language learning environments. Based on these results, design principles were identified and integrated into the development of Grim, an interactive language-learning program for Swedish. We present the design of Grim, which is grounded in visualization of grammatical categories and examples of language use, providing tools for both focus on linguistic code features and language comprehension.

11.
12.
Writing correct English sentences can be challenging. Furthermore, writing correct formulaic sequences can be especially difficult because accepted combinations do not follow clear rules governing which words appear together in a sequence. One solution is to provide examples of correct usage accompanied by statistical feedback from web-based applications for learners to emulate. The goal of the study was to investigate whether such a dedicated web-based program that provides instant examples of correct word combinations and detailed results of who used these examples, where and when, can improve students' writing. Data were collected from 74 ESL (English as a Second Language) undergraduate students divided into control and experimental groups, who performed formulaic sequence tests and wrote passages in English, followed by interviews with members of the experimental group. The results show that students who used the software outperformed those who did not, on the tests and the overall grades for their writing. Implementing formulaic sequence check programs in writing classes to help students decide which words can appear together, under which circumstances, and in which context is recommended.
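The core idea of a formulaic-sequence check can be sketched as a corpus-frequency lookup: a candidate word combination that occurs often in a reference corpus is probably an accepted collocation, while a rare one is flagged. The counts, word pairs, and threshold below are invented, not the study's web data.

```python
# Hypothetical sketch: frequency-based collocation check. A real system would
# query a large web corpus; here the counts are tiny invented examples.

NGRAM_COUNTS = {("strong", "tea"): 120, ("powerful", "tea"): 2}

def looks_formulaic(w1, w2, min_count=10):
    """True when the bigram is frequent enough to be an accepted combination."""
    return NGRAM_COUNTS.get((w1, w2), 0) >= min_count

print(looks_formulaic("strong", "tea"), looks_formulaic("powerful", "tea"))  # True False
```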

13.
Eye tracking has been used successfully for some time as a technique for measuring cognitive load in reading, psycholinguistics, writing, language acquisition, etc. Its application as a technique for measuring the reading ease of MT output has not yet, to our knowledge, been tested. We report here on a preliminary study testing the use and validity of an eye tracking methodology as a means of semi-automatically evaluating machine translation output. 50 French machine translated sentences, 25 rated as excellent and 25 rated as poor in an earlier human evaluation, were selected. Ten native speakers of French were instructed to read the MT sentences for comprehensibility. Their eye gaze data were recorded non-invasively using a Tobii 1750 eye tracker. The average gaze time and fixation count were found to be higher for the “bad” sentences, while average fixation duration and pupil dilations were not found to be substantially different for output rated as good and output rated as bad. HTER scores were also found to correlate well with gaze time and fixation count, but not with pupil dilation and fixation duration. We conclude that the eye tracking data, in particular gaze time and fixation count, correlate reasonably well with human evaluation of MT output, but fixation duration and pupil dilation may be less reliable indicators of reading difficulty for MT output. We also conclude that eye tracking has promise as a semi-automatic MT evaluation technique, which does not require bilingual knowledge, and which can potentially tap into the end users’ experience of machine translation output.
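The group comparison reported above reduces to computing per-group averages of the eye-tracking measures. The sketch below shows that computation with made-up numbers; the record fields and values are invented for illustration.

```python
# Hypothetical per-group averages of gaze time (ms) and fixation count.
# Values are invented; in the study these came from a Tobii 1750 tracker.

def averages(records):
    n = len(records)
    return (sum(r["gaze_ms"] for r in records) / n,
            sum(r["fixations"] for r in records) / n)

good = [{"gaze_ms": 2100, "fixations": 8}, {"gaze_ms": 1900, "fixations": 7}]
bad  = [{"gaze_ms": 3400, "fixations": 13}, {"gaze_ms": 3000, "fixations": 12}]
print(averages(good))  # (2000.0, 7.5)
print(averages(bad))   # (3200.0, 12.5)
```

Higher averages for the "bad" group would mirror the pattern the study reports for gaze time and fixation count.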

14.
Herein we propose a complete procedure to analyze and classify the texture of an image. We apply this scheme to solve a specific image processing problem: the detection of urban areas in satellite images. First we propose to analyze the texture through the modelling of the luminance field with eight different chain-based models. We then derive a texture parameter from these models. The effect of the lattice anisotropy is corrected by a renormalization group technique coming from statistical physics. This parameter, which takes into account local conditional variances of the image, is compared to classical methods of texture analysis. Afterwards we develop a modified fuzzy C-means algorithm that includes an entropy term. The advantage of such an algorithm is that the number of classes does not need to be known a priori. Besides, this algorithm provides us with further information, i.e. the probability that a given pixel belongs to a given cluster. Finally we introduce this information into a Markovian model of segmentation. Some results on SPOT5 simulated images, SPOT3 images and ERS1 radar images are presented. These images are provided by the French National Space Agency (CNES) and the European Space Agency (ESA). Grant from CNES.
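A toy, one-dimensional sketch of entropy-regularized fuzzy clustering, assuming the common formulation in which memberships take the form u ∝ exp(-d/λ). This is not the authors' exact algorithm; the data, λ, and initialization are invented, and the point is only that the method returns soft memberships (the probability that a point belongs to each cluster).

```python
# Hypothetical 1-D entropy-regularized fuzzy clustering sketch. Alternates
# membership updates (E-step, softmax of negative squared distances) with
# weighted-centroid updates (M-step). Not the paper's image algorithm.
import math

def entropy_fcm(xs, centers, lam=1.0, iters=50):
    for _ in range(iters):
        U = []
        for x in xs:
            w = [math.exp(-((x - c) ** 2) / lam) for c in centers]
            z = sum(w)
            U.append([wi / z for wi in w])  # soft memberships, sum to 1
        centers = [sum(U[i][k] * xs[i] for i in range(len(xs))) /
                   sum(U[i][k] for i in range(len(xs)))
                   for k in range(len(centers))]
    return centers, U

centers, U = entropy_fcm([0.0, 0.1, 0.2, 5.0, 5.1, 5.2], [0.0, 5.0])
print([round(c, 2) for c in centers])  # roughly [0.1, 5.1]
```

Each row of `U` sums to 1, which is the per-pixel membership probability the abstract says feeds the Markovian segmentation stage.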

15.
We present a phrase-based statistical machine translation approach which uses linguistic analysis in the preprocessing phase. The linguistic analysis includes morphological transformation and syntactic transformation. Since the word-order problem is solved using syntactic transformation, there is no reordering in the decoding phase. For morphological transformation, we use hand-crafted transformational rules. For syntactic transformation, we propose a transformational model based on a probabilistic context-free grammar. This model is trained using a bilingual corpus and a broad-coverage parser of the source language. This approach is applicable to language pairs in which the target language is poor in resources. We considered translation from English to Vietnamese and from English to French. Our experiments showed significant BLEU-score improvements in comparison with Pharaoh, a state-of-the-art phrase-based SMT system.

16.
The online charitable crowdfunding platform has emerged as a powerful tool for raising funds from large crowds to support non-profit activities. Following a patronage model, fundraisers on the platform must rely on compelling stories to capture the attention of individual donors. As such, conceptualizing a persuasive request narrative for projects posted on the platform becomes one of the most pressing issues. Project writing guidelines, an important element of platform design, are widely adopted to help fundraisers. Unfortunately, their role in the market has not been well examined. To fill this research gap, we leverage a unique dataset from a leading donation-based crowdfunding platform in the United States and exploit a policy change in its writing guidelines to thoroughly investigate their impact on donors’ contributions and the underlying mechanism. Our empirical results show that the more constraints the guidelines impose on writing, the more likely they are to weaken the persuasiveness of narratives. We focus on three aspects of narrative complexity and find that writing guidelines have a negative impact on linguistic complexity, which in turn diminishes donors’ contribution behavior. Nevertheless, they have positive impacts on both content and structural complexity, with the difference being that increased content complexity weakens donors’ contributions, while higher structural complexity is more likely to attract donations. Moreover, we examine the heterogeneous effects of writing guidelines on fundraisers with varied levels of involvement. These findings deepen our understanding of writing-guidance design in online platforms and have implications for charitable crowdfunding platforms and fundraisers.

17.
The assumption that tacit knowledge cannot be articulated remains dominant in knowledge elicitation. This paper, however, claims that linguistic theory does not support such a position and that language should not be factored out of accounts of tacit knowledge. We argue that Polanyi's (1966, p. 4) widely cited notion that ‘we know more than we can tell’ uses a folk model of language. This model does not acknowledge the linguistic patterns that competent language speakers deploy without direct awareness. This paper draws upon Systemic Functional Linguistics (SFL) to propose a Grammar-targeted Interview Method (GIM). The GIM uses SFL to unpack linguistic patterning, which we refer to as ‘under-representation’, to reveal tacit assumptions. It is a strategy that can be applied within a traditional interview method when the interviewer feels that there is confusion resulting from assumptions, such as those often embedded in terminology, that have not been directly expressed. This paper reports findings from an empirical study of tacit knowledge about requirements analysis in a Content Management System redevelopment. We compared the GIM with a Content-motivated Interview Method (CMIM) and show that, when the GIM is used, interviewees respond with less nominalised talk, that is, talk in which more of the meaning is unpacked as verbs and agents rather than hidden tacitly in nouns.

18.
This paper describes the design and function of the English generation phase in JETS, a minimal-transfer, Japanese-English machine translation system that is based on the linguistic framework of relational grammar. To facilitate the development of relational grammar generators, we have built a generator shell that provides a high-level relational grammar rule-writing language and is independent of both the natural language and the application. The implemented English generator (called GENIE) maps abstract canonical structures, representing the basic predicate-argument structures of sentences, into well-formed English sentences via a two-stage plan-and-execute design. The modularity inherent in the plan-and-execute design permits the development of a very general and stable deterministic execution grammar. Another major feature of the GENIE generator is that it is category-driven, i.e., planning rules and execution rules are distributed over a part-of-speech hierarchy (down to individual lexical items) and are invoked via an inheritance mechanism only if appropriate for the category being processed. Category-driven processing facilitates the handling of exceptions. The use of a syntactic planner and category-driven processing together provide a great deal of flexibility without sacrificing determinism in the generation process.

19.
Improving the quality of software demands quality controls from the very beginning of the development process, i.e., requirements capture and writing. Automating quality metrics may entail considerable savings, as opposed to tedious, manually performed evaluations. We present some indicators for measuring quality in textual requirements, as well as a tool that computes quality measures in a fully automated way. We want to emphasize that the final goal must be to measure in order to improve. Reducing quality management to the acquisition of a numerical evaluation would crash against the strong opposition of requirements engineers themselves, who would see in the measurement process not the aid of a counselor but a policing mechanism of penalties. To avoid this, quality indicators must first of all point out concrete defects and provide suggestions for improvement. The final result will not only be an improvement in the quality of requirements, but also an improvement in the writing skills of requirements engineers.
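One such defect-pointing indicator can be sketched as a lookup of vague terms in a requirement, paired with a concrete suggestion rather than a bare score. The word list and suggestion text are invented examples, not the paper's actual metric set.

```python
# Hypothetical vagueness indicator: report the offending words (concrete
# defects) and a suggestion, rather than only a numerical score.

VAGUE = {"fast", "user-friendly", "flexible", "adequate", "efficient"}

def vagueness_report(requirement):
    words = requirement.lower().replace(".", "").replace(",", "").split()
    hits = [w for w in words if w in VAGUE]
    return {"defects": hits,
            "suggestion": ("Replace vague terms with measurable criteria."
                           if hits else "No vague terms detected.")}

print(vagueness_report("The system shall be fast and flexible."))
```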

20.
The Swiss avalanche bulletin is produced twice a day in four languages. Due to the lack of time available for manual translation, a fully automated translation system is employed, based on a catalogue of predefined phrases and predetermined rules for how these phrases can be combined to produce sentences. Because this catalogue of phrases is limited to a small sublanguage, the system is able to automatically translate such sentences from German into the target languages French, Italian and English without subsequent proofreading or correction. Having been operational for two winter seasons, we assess here the quality of the produced texts based on two different surveys in which participants rated texts from real avalanche bulletins of both origins, the catalogue of phrases versus manually written and translated texts. With a mean recognition rate of 55%, users can hardly distinguish between the two types of texts, and they give very similar ratings with respect to language quality. Overall, the output from the catalogue system can be considered virtually equivalent to a text written by avalanche forecasters and then manually translated by professional translators. Furthermore, forecasters declared that all relevant situations were captured by the system with sufficient accuracy. Forecasters’ workload did not change with the introduction of the catalogue: the extra time needed to find matching sentences is compensated by the fact that they no longer need to double-check manually translated texts. The reduction in daily translation costs is expected to offset the initial development costs within a few years.
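The catalogue approach can be sketched as a table in which every phrase ID carries a fixed, professionally prepared translation per language, so a bulletin composed from catalogue phrases translates deterministically with no post-editing. The phrase IDs and translations below are invented examples, not entries from the actual Swiss catalogue.

```python
# Hypothetical phrase catalogue: each ID maps to fixed translations, so
# composing a bulletin from IDs makes translation a deterministic lookup.

CATALOGUE = {
    "P1": {"de": "Die Lawinengefahr ist erheblich.",
           "en": "The avalanche danger is considerable.",
           "fr": "Le danger d'avalanche est marqué."},
    "P2": {"de": "Neuschnee und Wind bilden Triebschneeansammlungen.",
           "en": "Fresh snow and wind are forming snowdrift accumulations.",
           "fr": "La neige fraîche et le vent forment des accumulations."},
}

def translate(phrase_ids, lang):
    return " ".join(CATALOGUE[p][lang] for p in phrase_ids)

print(translate(["P1", "P2"], "en"))
```

Restricting forecasters to the catalogue's sublanguage is what trades expressive freedom for translations that need no proofreading.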


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号