Similar literature
20 similar documents found.
1.
2.
3.
The aim of this paper is to illustrate the potential of a parallel corpus in the context of (computer-assisted) language learning. In order to do so, we propose to answer two main questions: (1) what corpus (data) to use and (2) how to use the corpus (data). We provide an answer to the what-question by describing the importance and particularities of compiling and processing a corpus for pedagogical purposes. In order to answer the how-question, we first investigate the central concepts of the interactionist theory of second language acquisition: comprehensible input, input enhancement, comprehensible output and output enhancement. By means of two case studies, we illustrate how the abovementioned concepts can be realized in concrete corpus-based language learning activities. We propose a design for a receptive and a productive language task and describe how a parallel corpus can be at the basis of powerful language learning activities. The Dutch Parallel Corpus, a ten-million-word, sentence-aligned and annotated parallel corpus, is used to develop these language tasks.

4.
In this paper, we describe tools and resources for the study of African languages developed at the Collaborative Research Centre 632 “Information Structure”. These include deeply annotated data collections of 25 sub-Saharan languages that are described together with their annotation scheme, as well as the corpus tool ANNIS, which provides unified access to a broad variety of annotations created with a range of different tools. With the application of ANNIS to several African data collections, we illustrate its suitability for the purpose of language documentation, distributed access, and the creation of data archives.

5.
User-interface tools: introduction and survey   Total citations: 1, self-citations: 0, citations by others: 1
Myers, B.A. IEEE Software, 1989, 6(1): 15-23
An overview is given of user-interface development systems (UIDS). Systems are classified by how they let the programmer specify the interfaces, and examples of each type are given. The three types are language-based, graphical, and automatic creation interfaces. Shortcomings of UIDS and user-interface toolkits are discussed.

6.
As the number of Arabic corpora is constantly increasing, there is an obvious and growing need for concordancing software for corpus search and analysis that supports as many features of the Arabic language as possible and provides users with a greater number of functions. This paper evaluates six existing corpus search and analysis tools based on eight criteria which seem to be the most essential for searching and analysing Arabic corpora, such as displaying Arabic text in its right-to-left direction, normalising diacritics and Hamza, and providing an Arabic user interface. The results of the evaluation revealed that three tools (Khawas, Sketch Engine, and aConCorde) met most of the evaluation criteria and achieved the highest benchmark scores. The paper concluded that developers’ conscious consideration of the linguistic features of Arabic when designing these three tools was the most significant factor behind their superiority.
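To make the "normalising diacritics and Hamza" criterion concrete, here is a minimal Python sketch of a concordance lookup that ignores short-vowel marks and Hamza-bearing Alef variants. The function names `normalise` and `concordance`, and the exact set of characters folded, are assumptions for illustration; the evaluated tools may normalise differently.

```python
import re

# Assumed normalisation choices (illustrative only):
# U+064B..U+0652 cover tanwin, short vowels, shadda and sukun;
# U+0670 is the superscript (dagger) alef.
DIACRITICS = re.compile("[\u064B-\u0652\u0670]")

# Fold Alef with Hamza above/below and Alef with Madda into bare Alef.
HAMZA_MAP = str.maketrans({
    "\u0623": "\u0627",
    "\u0625": "\u0627",
    "\u0622": "\u0627",
})


def normalise(text):
    """Strip diacritics, then fold Hamza-bearing Alef forms to bare Alef."""
    return DIACRITICS.sub("", text).translate(HAMZA_MAP)


def concordance(tokens, query, width=2):
    """Return (left context, hit, right context) for tokens matching query
    after normalisation; the original token is kept in the output."""
    target = normalise(query)
    hits = []
    for i, token in enumerate(tokens):
        if normalise(token) == target:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            hits.append((left, token, right))
    return hits
```

Matching on the normalised form while returning the original token lets a user type an unvocalised query and still see the corpus text as written.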

7.
The exploitation of today's high-performance computer systems requires the effective use of parallelism in many forms and at numerous levels. This survey article discusses program analysis and restructuring techniques that target parallel architectures. We first describe various categories of architectures that are oriented toward parallel computation models: vector architectures, shared-memory multiprocessors, massively parallel machines, message-passing architectures, VLIWs, and multithreaded architectures. We then describe a variety of optimization techniques that can be applied to sequential programs to effectively utilize the vector and parallel processing units. After an overview of basic dependence analysis, we present restructuring transformations on DO loops targeted both to vectorization and to concurrent execution, interprocedural and pointer analysis, task scheduling, instruction-level parallelization, and compiler-assisted data placement. We conclude that although tremendous advances have been made in dependence theory and in the development of a toolkit of transformations, parallel systems are used most effectively when the programmer interacts in the optimization process.

8.
9.
This paper presents a survey of some of the tools, techniques, and constructs for the development of portable, multitasked Fortran programs. The study mainly focuses on existing software tools that implement different approaches to achieving portability of multitasked Fortran programs for local and shared memory multiprocessor computers. However, some proposed approaches are also included. It appears that while each approach enjoys some advantages and suffers some disadvantages, at present, the development and use of portable multitasking tools is in its infancy, and thus no one system is clearly superior. Indeed, we expect that, for the foreseeable future, these and perhaps other techniques will all be actively pursued.

10.
11.
12.
Dynamic programming is an important paradigm that has been widely used to solve problems in various areas such as control theory, operations research, biology and computer science. We generalize the finite automaton formal model for dynamic programming, deriving pipeline parallel algorithms. The optimality of these algorithms is established for the new class of non-decreasing finite automata. As an intermediate step for the construction of a skeleton for the automatic parallelization of dynamic programming, we have developed a tool for the implementation of pipeline algorithms. The tool maps the processes in the pipeline onto the target architecture following a mix of block and cyclic policies adapted to the grain of the machine. Based on the former tool, the automatic parallelization of dynamic programming is straightforward. The use of the model and its associated tools is illustrated with the Single Resource Allocation Problem. The performance and portability of these tools are compared with specific ‘hand made’ code written by experienced programmers. The experimental results on distributed memory and shared distributed memory architectures prove the scalability of the proposed paradigm and its associated tools. Copyright © 2000 John Wiley & Sons, Ltd.
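For orientation, a minimal sequential Python sketch of a dynamic program for a single resource allocation problem is given below. It assumes one common formulation (split at most `total` integer units among activities to maximise the summed payoff); the function name `allocate` and the payoff-table layout are hypothetical, and this is not the pipeline tool described in the abstract.

```python
def allocate(profit, total):
    """Best total payoff when at most `total` units are split among activities.

    profit[i][x] is the (assumed, tabulated) payoff of giving x units
    to activity i, for x = 0..total.
    """
    # Stage 0: with no activities, every budget yields payoff 0.
    prev = [0] * (total + 1)
    for row in profit:
        # Stage i reads only stage i-1, one table row per activity.
        cur = [
            max(prev[m - x] + row[x] for x in range(m + 1))
            for m in range(total + 1)
        ]
        prev = cur
    return prev[total]
```

Because each stage depends only on the previous one, the stages form a chain; that chain structure is the kind of dependency a pipeline of processes can exploit, with one stage (or block of stages) per process.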

13.
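Coordinate four-character expressions are a special yet numerous class of four-character expressions. This paper describes work on extracting four-character expressions from a part-of-speech-tagged corpus using a conditional random field model. On this basis, it analyzes the structural characteristics of coordinate four-character expressions and proposes a method for recognizing them in a word-segmented corpus. Comparative experiments across different corpora show that the recognition method achieves fairly good precision and a certain degree of adaptability.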

14.
The availability of machine-readable bilingual linguistic resources is crucial not only for rule-based machine translation but also for other applications such as cross-lingual information retrieval. However, the building of such resources (bilingual single-word and multi-word correspondences, translation rules) demands extensive manual work, and, as a consequence, bilingual resources are usually more difficult to find than “shallow” monolingual resources such as morphological dictionaries or part-of-speech taggers, especially when they involve a less-resourced language. This paper describes a methodology to build automatically both bilingual dictionaries and shallow-transfer rules by extracting knowledge from word-aligned parallel corpora processed with shallow monolingual resources (morphological analysers and part-of-speech taggers). We present experiments for Brazilian Portuguese–Spanish and Brazilian Portuguese–English parallel texts. The results show that the proposed methodology can enable the rapid creation of valuable computational resources (bilingual dictionaries and shallow-transfer rules) for machine translation and other natural language processing tasks.
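As a rough illustration of extracting a bilingual dictionary from word-aligned text, the Python sketch below counts aligned word pairs and keeps the most frequent translation of each source word. It works on surface forms only and uses a hypothetical `min_count` threshold; the methodology in the abstract additionally relies on morphological analysis and part-of-speech tags, which are not modelled here.

```python
from collections import Counter, defaultdict


def build_dictionary(aligned_corpus, min_count=2):
    """Map each source word to its most frequently aligned target word.

    aligned_corpus is an iterable of (source_tokens, target_tokens, links),
    where links is a list of (source_index, target_index) alignment pairs.
    Entries seen fewer than min_count times are dropped (assumed filter).
    """
    counts = defaultdict(Counter)
    for source_tokens, target_tokens, links in aligned_corpus:
        for i, j in links:
            counts[source_tokens[i]][target_tokens[j]] += 1

    dictionary = {}
    for source_word, translations in counts.items():
        target_word, n = translations.most_common(1)[0]
        if n >= min_count:
            dictionary[source_word] = target_word
    return dictionary
```

A frequency threshold is one simple way to keep alignment noise out of the dictionary; a real system would likely keep several translations per word along with their lexical categories.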

15.
16.
The synchronization barrier is a point in the program where the processing elements (PEs) wait until all the PEs have arrived at this point. In a reduction computation, given a commutative and associative binary operation $\mathrm{op}$, one needs to reduce values $a_0, \ldots, a_{N-1}$, stored in PEs $0, \ldots, N-1$, to a single value $a^* = a_0 \,\mathrm{op}\, a_1 \,\mathrm{op} \cdots \mathrm{op}\, a_{N-1}$ and then to broadcast the result $a^*$ to all PEs. This computation is often followed by a synchronization barrier. Routines to perform these functions are frequently required in parallel programs. Simple and efficient, working C-language routines for the parallel barrier synchronization and reduction computations are presented. The codes are appropriate for a CREW (concurrent-read-exclusive-write) or EREW parallel random access shared memory MIMD computer. They require only shared memory read and write; no locks, semaphores etc. are needed. The running time of each of these routines is $O(\log N)$. The amount of shared memory required and the number of shared memory accesses generated are both $O(N)$. These are the asymptotically minimum values for the three parameters. The algorithms employ the obvious computational scheme involving a binary tree. Examples of applications for these routines and results of performance testing on the Sequent Balance 21000 computer are presented. An abstract of this article appeared in Proc. 1989 Int. Conf. Parallel Processing, p. II-175.
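The binary-tree scheme can be sketched as a sequential Python simulation, shown below. This is not the paper's C code: the function name `tree_reduce_broadcast` is hypothetical, and each inner `for` loop stands in for one parallel step in which the listed PEs would act simultaneously.

```python
import operator


def tree_reduce_broadcast(values, op):
    """Simulate a binary-tree reduction followed by a tree broadcast.

    Returns (final per-PE values, number of combining rounds).
    Slot i plays the role of PE i's shared-memory cell.
    """
    a = list(values)
    n = len(a)
    rounds = 0
    stride = 1
    # Up-sweep: in each round, PE i absorbs the partial result of PE i+stride.
    while stride < n:
        for i in range(0, n - stride, 2 * stride):
            a[i] = op(a[i], a[i + stride])
        stride *= 2
        rounds += 1
    # Down-sweep: retrace the same tree, copying the result outward from PE 0.
    while stride > 1:
        stride //= 2
        for i in range(0, n - stride, 2 * stride):
            a[i + stride] = a[i]
    return a, rounds
```

With N values the up-sweep takes $\lceil \log_2 N \rceil$ rounds, which matches the $O(\log N)$ running time quoted in the abstract; in each round every cell is written by at most one PE, consistent with an exclusive-write model.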

17.
Hide and seek: an introduction to steganography   Total citations: 3, self-citations: 0, citations by others: 3
Although people have hidden secrets in plain sight (now called steganography) throughout the ages, the recent growth in computational power and technology has propelled it to the forefront of today's security techniques. Essentially, the information-hiding process in a steganographic system starts by identifying a cover medium's redundant bits (those that can be modified without destroying that medium's integrity). The embedding process creates a stego medium by replacing these redundant bits with data from the hidden message. This article discusses existing steganographic systems and presents recent research in detecting them via statistical steganalysis. Here, we present recent research and discuss the practical application of detection algorithms and the mechanisms for getting around them.
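The embedding step described above can be sketched in Python under one simplifying assumption: the redundant bits are taken to be the least significant bit of every cover byte. The function names `embed` and `extract` are hypothetical, and real systems choose redundant bits in a format-specific way.

```python
def embed(cover, message):
    """Return a stego copy of `cover` whose leading LSBs carry `message`."""
    # Unpack the message into bits, most significant bit first.
    bits = [(byte >> k) & 1 for byte in message for k in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover medium too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def extract(stego, n_bytes):
    """Read back n_bytes of hidden data from the LSBs of `stego`."""
    out = bytearray()
    for j in range(n_bytes):
        value = 0
        for b in stego[8 * j:8 * j + 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)
```

Each cover byte changes by at most 1, which is why such embedding is hard to see directly, yet it still shifts the statistics of the low-order bits; that shift is what statistical steganalysis looks for.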

18.
This paper presents work on developing speech corpora and recognition tools for Turkish by porting SONIC, a speech recognition tool developed initially for English at the Center for Spoken Language Research of the University of Colorado at Boulder. The work presented in this paper had two objectives. The first was to collect a standard phonetically-balanced Turkish microphone speech corpus for general research use. A 193-speaker triphone-balanced audio corpus and a pronunciation lexicon for Turkish have been developed. The corpus was accepted for distribution by the Linguistic Data Consortium (LDC) of the University of Pennsylvania in October 2005, and it will serve as a standard corpus for Turkish speech researchers. The second objective was to develop speech recognition tools (a phonetic aligner and a phone recognizer) for Turkish, which provided a starting point for obtaining a multilingual speech recognizer by porting SONIC to Turkish. This part of the work was the first port of this particular recognizer to a language other than English; subsequently, SONIC has been ported to over 15 languages. Using the phonetic aligner developed, the audio corpus has been provided with word, phone and HMM-state level alignments. For the phonetic aligner, it is shown that 92.6% of the automatically labeled phone boundaries are placed within 20 ms of manually labeled locations for the Turkish audio corpus. Finally, a phone recognition error rate of 29.2% is demonstrated for the phone recognizer.
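The boundary-accuracy figure is a tolerance-based agreement rate, which can be computed as in the Python sketch below. The function name `boundary_agreement`, the one-to-one pairing of automatic and manual boundaries, and the inclusive 20 ms threshold are assumptions; the abstract does not specify these details.

```python
def boundary_agreement(auto_ms, manual_ms, tolerance_ms=20.0):
    """Fraction of automatic boundaries within tolerance of the manual ones.

    auto_ms and manual_ms are equal-length sequences of boundary times in
    milliseconds, paired by position (assumed one-to-one correspondence).
    """
    if len(auto_ms) != len(manual_ms):
        raise ValueError("boundary lists must have the same length")
    if not auto_ms:
        raise ValueError("at least one boundary is required")
    hits = sum(
        1 for a, m in zip(auto_ms, manual_ms) if abs(a - m) <= tolerance_ms
    )
    return hits / len(auto_ms)
```

For example, with automatic boundaries at 100, 250, 400 and 610 ms and manual ones at 105, 275, 395 and 600 ms, three of the four deviations are within 20 ms, giving an agreement of 0.75.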

19.
The paper proposes a new approach and a system to develop parallel algorithms based on the joint use of the algebraic-algorithmic methodology of specification and development of programs and non-algorithmic (heuristic) techniques for code generation. The algebraic part of the methodology provides the formalized process of parallel program design through high-level algebraic-algorithmic specifications and automating transformations up to program code in a standard programming language. The heuristic part of the system is the dynamic adjustment of program code to a target platform and its optimization using self-learning code generation and heuristic technologies.

20.