期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Improving statistical MT by coupling reordering and decoding

Josep Maria Crego José B. Mariño 《Machine Translation》2006,20(3):199-215

In this paper we describe an elegant and efficient approach to coupling reordering and decoding in statistical machine translation, where the n-gram translation model is also employed as distortion model. The reordering search problem is tackled through a set of linguistically motivated rewrite rules, which are used to extend a monotonic search graph with reordering hypotheses. The extended graph is traversed in the global search when a fully informed decision can be taken. Further experiments show that the n-gram translation model can be successfully used as reordering model when estimated with reordered source words. Experiments are reported on the Europarl task (Spanish–English and English–Spanish). Results are presented regarding translation accuracy and computational efficiency, showing significant improvements in translation quality with respect to monotonic search for both translation directions at a very low computational cost. 相似文献

2.

Metrics for evaluating database selection techniques

James C. French Allison L. Powell 《World Wide Web》2000,3(3):153-163

The increasing availability of online databases and other information resources in digital libraries and on the World Wide Web has created the need for efficient and effective algorithms for selecting databases to search. A number of techniques have been proposed for query routing or database selection. We have developed a methodology and metrics that can be used to directly compare competing techniques. They can also be used to isolate factors that influence the performance of these techniques so that we can better understand performance issues. In this paper we describe the methodology we have used to examine the performance of database selection algorithms such as gGlOSS and CORI. In addition we develop the theory behind a “random” database selection algorithm and show how it can be used to help analyze the behavior of realistic database selection algorithms. This revised version was published online in August 2006 with corrections to the Cover Date. 相似文献

3.

Metrics for evaluating human information interaction systems

《Interacting with computers》2006,18(4):507-527

Society today has a wealth of information available due to information technology. The challenge facing researchers working in information access is how to help users easily locate the information needed. Evaluation methodologies and metrics are important tools to assess progress in human information interaction (HII). To properly evaluate these systems, evaluations need to consider the performance of the various components, the usability of the system, and the impact of the system on the end user. Current usability metrics are adequate for evaluating the efficiency, effectiveness, and user satisfaction of such systems. Performance measures for new intelligent technologies will have to be developed. Regardless of how well the systems are and how usable the systems are, it is critical that impact measures are developed. For HII systems to be useful, we need to assess how well information analysts work with the systems. This evaluation needs to go beyond technical performance metrics and usability metrics. What are the metrics for evaluating utility? This paper describes research efforts focused on developing metrics for the intelligence community that measure the impact of new software to facilitate information interaction. 相似文献

4.

Metrics for evaluating performance in document analysis: application to tables

Ana Costa e Silva 《International Journal on Document Analysis and Recognition》2011,14(1):101-109

Is an algorithm with high precision and recall at identifying table-parts also good at locating tables? Several document analysis tasks require merging or splitting certain document elements to form others. The suitability of the commonly used precision and recall for such division/aggregation tasks is arguable, since their underlying assumption is that the granularity of the items at input is the same as at output. We propose a new pair of evaluation metrics that better suit document analysis’ needs and show their application to several table tasks. In the process, we present a number of robust table location algorithms with which we draw a road-map for creating Hidden Markov Models for the task. 相似文献

5.

Metrics, Metrics, Metrics: Negative Hedonicity

Hoffman R.R. Marx M. Hancock P. 《Intelligent Systems, IEEE》2008,23(2):69-73

Intelligent technologies such as performance support systems and decision aids represent a key aspect of modern sociotechnical systems. When new tools are introduced into the workplace, they represent hypotheses about how cognitive work is expected to change. The tacit hypothesis is that any such change will be for the better, performance will be more efficient, and decisions will be improved - that is, they'll be made faster and on the basis of greater evidence. Experience suggests that technological interventions sometimes have the intended positive effect. However, they often result in negative effects, including unintended cascading failures and worker frustration due to "user-hostile" aspects of interfaces. 相似文献

6.

Factor-based evaluation for English to Hindi MT outputs

Renu Balyan Niladri Chatterjee 《Language Resources and Evaluation》2018,52(4):969-996

Design and implementation of automatic evaluation methods is an integral part of any scientific research in accelerating the development cycle of the output. This is no less true for automatic machine translation (MT) systems. However, no such global and systematic scheme exists for evaluation of performance of an MT system. The existing evaluation metrics, such as BLEU, METEOR, TER, although used extensively in literature have faced a lot of criticism from users. Moreover, performance of these metrics often varies with the pair of languages under consideration. The above observation is no less pertinent with respect to translations involving languages of the Indian subcontinent. This study aims at developing an evaluation metric for English to Hindi MT outputs. As a part of this process, a set of probable errors have been identified manually as well as automatically. Linear regression has been used for computing weight/penalty for each error, while taking human evaluations into consideration. A sentence score is computed as the weighted sum of the errors. A set of 126 models has been built using different single classifiers and ensemble of classifiers in order to find the most suitable model for allocating appropriate weight/penalty for each error. The outputs of the models have been compared with the state-of-the-art evaluation metrics. The models developed for manually identified errors correlate well with manual evaluation scores, whereas the models for the automatically identified errors have low correlation with the manual scores. This indicates the need for further improvement and development of sophisticated linguistic tools for automatic identification and extraction of errors. Although many automatic machine translation tools are being developed for many different language pairs, there is no such generalized scheme that would lead to designing meaningful metrics for their evaluation. The proposed scheme should help in developing such metrics for different language pairs in the coming days. 相似文献

7.

Remco Dijkman Marlon Dumas Boudewijn van Dongen Reina Käärik Jan Mendling 《Information Systems》2011

It is common for large organizations to maintain repositories of business process models in order to document and to continuously improve their operations. Given such a repository, this paper deals with the problem of retrieving those models in the repository that most closely resemble a given process model or fragment thereof. Up to now, there is a notable research gap on comparing different approaches to this problem and on evaluating them in the same setting. Therefore, this paper presents three similarity metrics that can be used to answer queries on process repositories: (i) node matching similarity that compares the labels and attributes attached to process model elements; (ii) structural similarity that compares element labels as well as the topology of process models; and (iii) behavioral similarity that compares element labels as well as causal relations captured in the process model. These metrics are experimentally evaluated in terms of precision and recall. The results show that all three metrics yield comparable results, with structural similarity slightly outperforming the other two metrics. Also, all three metrics outperform text-based search engines when it comes to searching through a repository for similar business process models. 相似文献

8.

Syntax-based reordering for statistical machine translation

Maxim Khalilov José A.R. Fonollosa 《Computer Speech and Language》2011,25(4):761-788

In this paper, we develop an approach called syntax-based reordering (SBR) to handling the fundamental problem of word ordering for statistical machine translation (SMT). We propose to alleviate the word order challenge including morpho-syntactical and statistical information in the context of a pre-translation reordering framework aimed at capturing short- and long-distance word distortion dependencies. We examine the proposed approach from the theoretical and experimental points of view discussing and analyzing its advantages and limitations in comparison with some of the state-of-the-art reordering methods.In the final part of the paper, we describe the results of applying the syntax-based model to translation tasks with a great need for reordering (Chinese-to-English and Arabic-to-English). The experiments are carried out on standard phrase-based and alternative N-gram-based SMT systems. We first investigate sparse training data scenarios, in which the translation and reordering models are trained on a sparse bilingual data, then scaling the method to a large training set and demonstrating that the improvement in terms of translation quality is maintained. 相似文献

9.

Eye tracking as an MT evaluation technique

Stephen Doherty Sharon O’Brien Michael Carl 《Machine Translation》2010,24(1):1-13

Eye tracking has been used successfully as a technique for measuring cognitive load in reading, psycholinguistics, writing, language acquisition etc. for some time now. Its application as a technique for measuring the reading ease of MT output has not yet, to our knowledge, been tested. We report here on a preliminary study testing the use and validity of an eye tracking methodology as a means of semi-automatically evaluating machine translation output. 50 French machine translated sentences, 25 rated as excellent and 25 rated as poor in an earlier human evaluation, were selected. Ten native speakers of French were instructed to read the MT sentences for comprehensibility. Their eye gaze data were recorded non-invasively using a Tobii 1750 eye tracker. The average gaze time and fixation count were found to be higher for the “bad” sentences, while average fixation duration and pupil dilations were not found to be substantially different for output rated as good and output rated as bad. Comparisons between HTER scores and eye gaze data were also found to correlate well with gaze time and fixation count, but not with pupil dilation and fixation duration. We conclude that the eye tracking data, in particular gaze time and fixation count, correlate reasonably well with human evaluation of MT output but fixation duration and pupil dilation may be less reliable indicators of reading difficulty for MT output. We also conclude that eye tracking has promise as a semi-automatic MT evaluation technique, which does not require bi-lingual knowledge, and which can potentially tap into the end users’ experience of machine translation output. 相似文献

10.

Attribute value reordering for efficient hybrid OLAP

Owen Kaser Daniel Lemire 《Information Sciences》2006,176(16):2304-2336

The normalization of a data cube is the ordering of the attribute values. For large multidimensional arrays where dense and sparse chunks are stored differently, proper normalization can lead to improved storage efficiency. We show that it is NP-hard to compute an optimal normalization even for 1 × 3 chunks, although we find an exact algorithm for 1 × 2 chunks. When dimensions are nearly statistically independent, we show that dimension-wise attribute frequency sorting is an optimal normalization and takes time O(dn log(n)) for data cubes of size n^d. When dimensions are not independent, we propose and evaluate a several heuristics. The hybrid OLAP (HOLAP) storage mechanism is already 19-30% more efficient than ROLAP, but normalization can improve it further by 9-13% for a total gain of 29-44% over ROLAP. 相似文献

11.

How good is good enough? Metrics for worm/anti-worm evaluation

Attila Ondi Richard Ford 《Journal in Computer Virology》2007,3(2):93-101

Self-replicating code is a huge problem worldwide, with worms like SQL/Slammer becoming pandemic within minutes of their initial release. Because of this, there has been significant interest in worm spread and how this spread is affected by various countermeasures. However, to date, comparative analysis of spread has been carried out “by eye”—there exist no meaningful metrics by which one can quantitatively compare the effectiveness of different protection paradigms. In this paper, we discuss several possible metrics for measuring worm spread and countermeasure effectiveness. We note that the “correct” metric for comparative purposes will vary depending on the goal of the defender, and provide several different measures which can be used to compare countermeasures. Finally, we discuss the idea of significance—that is, what changes induced by worm design or countermeasures are actually meaningful in the real world? 相似文献

12.

Dynamic reordering of alternatives for definite logic programs

Hai-Feng Guo Gopal Gupta 《Computer Languages, Systems and Structures》2009,35(3):252-265

Due to their highly declarative nature and efficiency, tabled logic programming systems have been applied to solving many complex problems. Tabled logic programming is essential for extending traditional logic programming with tabled resolution. In this paper, we propose a new tabled resolution scheme, called dynamic reordering of alternatives (DRA) resolution, for definite logic programs. The scheme keeps track of the type of the subgoals during resolution; if the subgoal in the current resolvent is a variant of a former tabled subgoal, tabled answers are used to resolve the subgoal; otherwise, program clauses are used similar to SLD resolution. Program clauses leading to variant subgoals at runtime are dynamically reordered for further computation until the subgoals are completely evaluated. DRA resolution allows query evaluation to be performed in a depth-first, left-to-right traversal order similar to Prolog-typed SLD resolution, thus yielding a simple technique for incorporating tabled resolution in traditional logic programming systems. We show the correctness of DRA resolution. 相似文献

13.

Equation reordering for iterative processes — a comment

M. Gilli G. Pauletto M. Garbely 《Computational Economics》1992,5(2):147-153

The ordering of the equations for a nonlinear model plays an important role in the performance of solution algorithms using iterative processes. The paper comments on what is often referred to be an optimal ordering. 相似文献

14.

Performance Metrics for All

Morton Al 《Internet Computing, IEEE》2009,13(4):82-86

When protocol developers or network engineers seek to understand their situation, performance measurements are one tool in the tool box that can help. But what if the technology is new and no relevant performance metrics exist? This article describes a process for developing performance metrics with sufficient detail for them to become industry standards. 相似文献

15.

A new technique for WFTA input/output reordering

Chao H. Huang Fred J. Taylor 《International journal of parallel programming》1981,10(1):27-37

The Winograd Fourier transform algorithm (WFTA) is receiving intensive study. The advantage of this class of transform is its potential high throughput due to reduced multiplication count. However, both input and output reorderings have to be performed when the algorithm is implemented. In this work, a technique for the WFTA input/output reorderings is developed. This technique is flexible to the choice of the base numbers of WFTA and capable of operating at high speed in digital hardware. This technique requires no extra memory for reordering when implementing the WFTA in the residue number system if the moduli set of RNS is carefully chosen to contain the base numbers of WFTA.This work was supported in part under AFOSR Grant F49620-79-C-0066 and Lockheed Independent Research Funds. 相似文献

16.

A comparison of usability evaluation methods for evaluating e-commerce websites

《Behaviour & Information Technology》2012,31(7):707-737

The importance of evaluating the usability of e-commerce websites is well recognised. User testing and heuristic evaluation methods are commonly used to evaluate the usability of such sites, but just how effective are these for identifying specific problems? This article describes an evaluation of these methods by comparing the number, severity and type of usability problems identified by each one. The cost of employing these methods is also considered. The findings highlight the number and severity level of 44 specific usability problem areas which were uniquely identified by either user testing or heuristic evaluation methods, common problems that were identified by both methods, and problems that were missed by each method. The results show that user testing uniquely identified major problems related to four specific areas and minor problems related to one area. Conversely, the heuristic evaluation uniquely identified minor problems in eight specific areas and major problems in three areas. 相似文献

17.

对模板的测试覆盖度量

张彤杨旭东《计算机工程与应用》2001,37(17):120-121

分析了传统覆盖度量应用在模板函数和模板类测试时的不足，定义了新的覆盖度量。新的定义考虑了参数类型或对象所处的状态，较传统的定义更能保证测试的充分性。相似文献

18.

A syntactically informed reordering model for statistical machine translation

Saeed Farzi Shahram Khadivi 《人工智能实验与理论杂志》2013,25(4):449-469

Word reordering is one of the challengeable problems of machine translation. It is an important factor of quality and efficiency of machine translation systems. In this paper, we introduce a novel reordering model based on an innovative structure, named, phrasal dependency tree. The phrasal dependency tree is a modern syntactic structure which is based on dependency relationships between contiguous non-syntactic phrases. The proposed model integrates syntactical and statistical information in the context of log-linear model aimed at dealing with the reordering problems. It benefits from phrase dependencies, translation directions (orientations) and translation discontinuity between translated phrases. In comparison with well-known and popular reordering models such as distortion, lexicalised and hierarchical models, the experimental study demonstrates the superiority of our model in terms of translation quality. Performance is evaluated for Persian → English and English → German translation tasks using Tehran parallel corpus and WMT07 benchmarks, respectively. The results report 1.54/1.7 and 1.98/3.01 point improvements over the baseline in terms of BLEU/TER metrics on Persian → English and German → English translation tasks, respectively. On average our model retrieved a significant impact on precision with comparable recall value with respect to the lexicalised and distortion models. 相似文献

19.

Out-of-sequence-measurement processing for probabilistic multiple hypothesis tracker with measurement reordering

Il-Hwan Seo Taek-Lyul Song 《International Journal of Control, Automation and Systems》2010,8(2):301-307

In a multi-sensor central level tracking system, owing to random delay in transmission and varying preprocessing time for different sensor platforms, an earlier measurement from the same target can arrive at the fusion center after a later one. Practical data fusion schemes are challenged by the inevitable appearance of measurements that are out of sequence, called, “out-of-sequence measurements” (OOSMs). The question is how to incorporate these OOSMs in a track that has already been updated with a later observation in order to enhance the performance of the tracking system. Several approaches for a sequential algorithm to find a solution for the OOSM problem have been discussed in previous papers. An approach to address the OOSM problem in the probabilistic multi-hypothesis tracker (PMHT), being a batch algorithm, was proposed in previous paper. However, the situation of this approach was not an OOSM case but, rather, an out of sequence scan (OOSS) where a batch of data was lost and then only one scan of measurements from the lost batch arrived with the present batch. In this paper, we propose an approach that has a measurement reordering step to address the OOSM problem in the PMHT within the framework of the OOSM case and report on the performance with the simulation results. The simulation results indicate that the proposed approach may be a suitable solution for the OOSM problem in PMHT under the proper conditions of length of batch, amount of lag, density of clutter, and probability of detection for the target. 相似文献

20.

Metrics for weighted transition systems: Axiomatization and complexity

Kim G. Larsen Uli Fahrenberg Claus Thrane 《Theoretical computer science》2011,412(28):3358-3369

Simulation distances are essentially approximations of simulation which provide a measure of the extent by which behaviors in systems are inequivalent. In this paper, we consider the general quantitative model of weighted transition systems, where transitions are labeled with elements of a finite metric space. We study the so-called point-wise and accumulating simulation distances which provide extensions to the well-known Boolean notion of simulation on labeled transition systems.We introduce weighted process algebras for finite and regular behavior and offer sound and (approximate) complete inference systems for the proposed simulation distances. We also settle the algorithmic complexity of computing the simulation distances. 相似文献