Found 10 similar documents (search time: 0 ms)
1.
In this paper we present a tool that uses comparable corpora to find appropriate translation equivalents for expressions that
are considered by translators as difficult. For a phrase in the source language the tool identifies a range of possible expressions
used in similar contexts in target language corpora and presents them to the translator as a list of suggestions. In the paper
we discuss the method and present results of human evaluation of the performance of the tool, which highlight its usefulness
when dictionary solutions are lacking.
2.
Carina F. Dorneles, Marcos Freitas Nunes, Carlos A. Heuser, Viviane P. Moreira, Altigran S. da Silva, Edleno S. de Moura. Information Systems, 2009, 34(8): 673
Approximate data matching aims at assessing whether two distinct instances of data represent the same real-world object. The comparison between data values is usually done by applying a similarity function, which returns a similarity score. If this score surpasses a given threshold, both data instances are considered to represent the same real-world object. These score values depend on the algorithm that implements the function and have no meaning to the user. In addition, score values generated by different functions are not comparable. This can lead to problems when the scores returned by different similarity functions need to be combined to compute the similarity between records. In this article, we propose that thresholds should be defined in terms of the precision that is expected from the matching process rather than in terms of the raw scores returned by the similarity function. Precision is a widely known similarity metric and has a clear interpretation from the user's point of view. Our approach defines mappings from score values to precision values, which we call adjusted scores. In order to obtain such mappings, our approach requires training over a small dataset. Experiments show that training can be reused for different datasets in the same domain. Our results also demonstrate that existing methods for combining scores to compute the similarity between records may be enhanced if adjusted scores are used.
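The idea of mapping raw similarity scores to precision values can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify; it assumes a small labeled training set of `(raw_score, is_match)` pairs and defines the adjusted score of a raw score as the precision observed among training pairs scoring at or above the nearest lower observed threshold. The function names `fit_precision_map` and `adjusted_score` are hypothetical.

```python
from bisect import bisect_right


def fit_precision_map(labeled_scores):
    """Build an ascending list of (threshold, precision) pairs.

    labeled_scores: non-empty list of (raw_score, is_match) pairs from a
    small training set (an assumption of this sketch). The precision at a
    threshold is the fraction of true matches among all training pairs
    whose raw score is >= that threshold.
    """
    ordered = sorted(labeled_scores, key=lambda pair: pair[0], reverse=True)
    mapping = []
    seen = 0
    matches = 0
    i = 0
    while i < len(ordered):
        threshold = ordered[i][0]
        # Consume every pair tied at this score before recording precision.
        while i < len(ordered) and ordered[i][0] == threshold:
            seen += 1
            matches += 1 if ordered[i][1] else 0
            i += 1
        mapping.append((threshold, matches / seen))
    mapping.reverse()  # ascending by threshold, for binary search
    return mapping


def adjusted_score(mapping, raw_score):
    """Map a raw score to the precision at the largest threshold <= raw_score."""
    thresholds = [t for t, _ in mapping]
    idx = bisect_right(thresholds, raw_score) - 1
    if idx < 0:
        # Below every observed score: fall back to the lowest threshold.
        return mapping[0][1]
    return mapping[idx][1]


if __name__ == "__main__":
    # Made-up training data, for illustration only.
    training = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, False)]
    mapping = fit_precision_map(training)
    for raw in (0.85, 0.65, 0.3):
        print(raw, adjusted_score(mapping, raw))
```

Because the adjusted values are expressed as precision, scores produced by different similarity functions land on a common scale and a user can set a threshold such as "at least 0.9 precision" directly. Note that empirical precision computed this way is not guaranteed to be monotonic in the raw score; a real system would likely smooth or interpolate the mapping.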
3.
Speech-to-speech translation technology has difficulties processing elements of spontaneity in conversation. We propose a
discourse marker attribute in speech corpora to help overcome some of these problems. There have already been some attempts
to annotate discourse markers in speech corpora. However, as there is no consistency on what expressions count as discourse
markers, we have to reconsider how to set a framework for annotating, and, in order to better understand what we gain by introducing
a discourse marker category, we have to analyse their characteristics and functions in discourse. This is especially important
for languages such as Slovenian where little or no research on the topic of discourse markers has been carried out. The aims
of this paper are to present a scheme for annotating discourse markers based on the analysis of a corpus of telephone conversations
in the tourism domain in the Slovenian language, and to give some additional arguments based on the characteristics and functions
of discourse markers that confirm their special status in conversation.
4.
This paper develops word recognition methods for historical handwritten cursive and printed documents. It employs a powerful segmentation-free letter detection method based upon joint boosting with histograms of gradients as features. Efficient inference on an ensemble of hidden Markov models can select the most probable sequence of candidate character detections to recognize complete words in ambiguous handwritten text, drawing on character n-gram and physical separation models. Experiments with two corpora of handwritten historic documents show that this approach recognizes known words more accurately than previous efforts, and can also recognize out-of-vocabulary words.
5.
6.
Automatic Speaker Recognition (ASR) refers to the task of identifying a person based on his or her voice with the help of
machines. ASR has potential applications in telephone-based financial transactions, credit card purchases, forensic
science, and social anthropology for the study of different cultures and languages. Results of ASR are highly dependent on
the database, i.e., the results obtained in ASR are meaningless if recording conditions are not known. In this paper, a methodology
and a typical experimental setup used for the development of corpora for various tasks in text-independent speaker identification
in different Indian languages, viz., Marathi, Hindi, Urdu and Oriya, are described. Finally, an ASR system is presented
to evaluate the corpora.
7.
8.
9.
A critical task of vision-based manufacturing applications is to generate a virtual representation of a physical object from a dataset of point clouds. Its success relies on reliable algorithms and tools. Many effective technologies have been developed to solve various problems involved in data acquisition and processing. Some articles are available that evaluate and review these technologies and their underlying methodologies. However, for most practitioners who lack a strong background in mathematics and computer science, it is hard to understand the theoretical fundamentals of the methodologies. In this paper, we survey and evaluate recent advances in data acquisition and processing, and provide an overview from a manufacturing perspective. Some potential manufacturing applications are introduced, the technical gaps between the practical requirements and existing technologies are discussed, and research opportunities are identified.
10.
The Cranfield University torpedo-shaped underwater vehicle has been modified to accommodate a laser stripe illumination system. As well as providing enhanced viewing capabilities, this system derives real-time navigational data during the mission and gathers images to produce a post mission enhanced optical waterfall image of a surveyed area.
This paper describes a preliminary set of constrained motion trials at the IFREMER wave basin in Brest, where the system was towed through the 50 m test tank at different altitudes and orientations whilst the true trajectory was measured. A comparison is made between the ground-truth trajectory generated from these external measurements and that derived from the video camera and rotation sensors internal to the vehicle.