Appropriate evaluation of referring expressions is critical for the design of systems that can effectively collaborate with humans. A widely used method is to simply evaluate the degree to which an algorithm can reproduce the same expressions as those in previously collected corpora. Several researchers, however, have noted the need for a task-performance evaluation measuring the effectiveness of a referring expression in the achievement of a given task goal. This is particularly important in collaborative situated dialogues. Using referring expressions produced by six pairs of Japanese speakers collaboratively solving Tangram puzzles, we conducted a task-performance evaluation of referring expressions with 36 human evaluators. In particular, we focused on the evaluation of demonstrative pronouns generated by a machine learning-based algorithm. Comparing the results of this task-performance evaluation with the results of a previously conducted corpus-matching evaluation (Spanger et al. in Lang Resour Eval, 2010b), we confirmed the limitation of a corpus-matching evaluation and discuss the need for a task-performance evaluation.

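Multi-hop question generation aims to aggregate multiple scattered pieces of information, reason over them, and generate a natural-language question. For a given question-answer pair, most sentences in the text are redundant or contain irrelevant information, and most previous methods require pre-annotated sentence-level labels during both training and inference. However, large-scale sentence annotations are hard to obtain in real-world scenarios. To address this problem, this paper proposes a Graph-based Evidence Selection network (GES). The model uses a graph neural network to extract several key sentences from the scattered documents, then introduces an inductive bias based on the selection results to assist question generation. A straight-through estimator is used to train the model end to end. In comparative experiments on the public HotpotQA dataset, the method achieves significant improvements on multiple question generation metrics.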

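This paper proposes a method for recognizing temporal expressions that combines statistics and rules. First, by analyzing the word form, part of speech, and context of temporal expressions in Chinese text, conditional random fields are used to recognize time units rather than whole temporal expressions, which avoids inaccurate boundary detection for Chinese temporal expressions. Next, candidate trigger words are acquired automatically from the training corpus and scored with an evaluation function, and the correct trigger words are selected to extend the trigger word lexicon. Finally, rules based on the temporal trigger word lexicon and the temporal affix lexicon are used to locate the boundaries of temporal expressions. Experimental results show an F1 score of 98.31% in the open test.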
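
The idea of recognizing small time units first and only then deciding where a whole expression begins and ends can be illustrated with a minimal sketch. It is not the method of the abstract above: a regular expression stands in for the conditional random field, and the unit list in `UNIT_PATTERN` is a hypothetical mini-lexicon, since the abstract does not give the actual trigger word or affix lexicons. The single boundary rule used here (merge units that are directly adjacent) is likewise an assumption.

```python
import re

# Hypothetical mini-lexicon of time units. A regular expression replaces the
# CRF tagger here purely for illustration.
UNIT_PATTERN = re.compile(r"\d+(?:年|月|日|号|时|点|分|秒)|上午|下午|今天|明天|昨天")


def find_time_expressions(text):
    """Merge directly adjacent time units into whole temporal expressions."""
    expressions = []
    start = end = None
    for match in UNIT_PATTERN.finditer(text):
        if end is not None and match.start() == end:
            # The unit continues the current expression: extend its right boundary.
            end = match.end()
        else:
            # A gap was found: close the previous expression and open a new one.
            if start is not None:
                expressions.append(text[start:end])
            start, end = match.start(), match.end()
    if start is not None:
        expressions.append(text[start:end])
    return expressions


if __name__ == "__main__":
    print(find_time_expressions("会议于2010年3月5日上午9点开始,明天结束"))
```

Working at the unit level means that a boundary mistake can only come from the merge rule, which is easier to inspect and adjust than a tagger that labels whole expressions at once.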

Recent research has illuminated some of the ways in which multilingual writers project multiple identities in their writing, conveying disciplinary allegiances as well as more personal expressions of individuality. Such work has focused on the writers’ uses of various verbal expressions, but has to this point overlooked the ways in which they manipulate the visual mode as a means for identity expression. The present study examines expressions of identity in a corpus of multimodal texts written by four multilingual graduate student writers. I consider how the writers’ uses of various verbal and visual expressions in their Microsoft PowerPoint presentation slides project both disciplinarity and individuality and how each individual's habitus has been influenced by both the discourses they have encountered and their personal reactions towards those discourses.

Extraction and normalization of temporal expressions from documents are important steps towards deep text understanding and a prerequisite for many NLP tasks such as information extraction, question answering, and document summarization. There are different ways to express (the same) temporal information in documents. However, after identifying temporal expressions, they can be normalized according to some standard format. This allows the usage of temporal information in a term- and language-independent way. In this paper, we describe the challenges of temporal tagging in different domains, give an overview of existing annotated corpora, and survey existing approaches for temporal tagging. Finally, we present our publicly available temporal tagger HeidelTime, which is easily extensible to further languages due to its strict separation of source code and language resources like patterns and rules. We present a broad evaluation on multiple languages and domains on existing corpora as well as on a newly created corpus for a language/domain combination for which no annotated corpus has been available so far.
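
The point that different surface forms of the same date can be mapped to one standard value can be shown with a small sketch. It is not HeidelTime's code or rule format: the three patterns in `RULES`, the `normalize` function, and the choice of ISO 8601 as the target format are all assumptions made for illustration. What it does borrow from the abstract is the separation between the matching code and the pattern resources, which are kept in a plain list that could be swapped per language.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june",
         "july", "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Illustrative pattern resources, kept apart from the matching logic below.
RULES = [
    re.compile(r"^(?P<month>[A-Za-z]+) (?P<day>\d{1,2}), (?P<year>\d{4})$"),
    re.compile(r"^(?P<day>\d{1,2}) (?P<month>[A-Za-z]+) (?P<year>\d{4})$"),
    re.compile(r"^(?P<year>\d{4})[-/](?P<month>\d{1,2})[-/](?P<day>\d{1,2})$"),
]


def normalize(expression):
    """Return an ISO 8601 date string, or None if no rule yields a valid date."""
    text = expression.strip()
    for pattern in RULES:
        match = pattern.match(text)
        if match is None:
            continue
        month_raw = match.group("month")
        month = int(month_raw) if month_raw.isdigit() else MONTHS.get(month_raw.lower())
        if month is None:
            continue
        try:
            return date(int(match.group("year")), month, int(match.group("day"))).isoformat()
        except ValueError:
            # The pattern matched but the calendar date does not exist.
            continue
    return None


if __name__ == "__main__":
    for surface in ["March 5, 2010", "5 March 2010", "2010/3/5"]:
        print(surface, "->", normalize(surface))
```

Once all three surface forms resolve to the same value, downstream components can compare or sort dates without knowing how they were originally written.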

This paper presents a robot search task (social tag) that uses social interaction, in the form of asking for help, as an integral component of task completion. Socially distributed perception is defined as a robot's ability to augment its limited sensory capacities through social interaction. We describe the task of social tag and its implementation on the robot GRACE for the AAAI 2005 Mobile Robot Competition & Exhibition. We then discuss our observations and analyses of GRACE's performance as a situated interaction with conference participants. Our results suggest we were successful in promoting a form of social interaction that allowed people to help the robot achieve its goal. Furthermore, we found that different social uses of the physical space had an effect on the nature of the interaction. Finally, we discuss the implications of this design approach for effective and compelling human-robot interaction, considering its relationship to concepts such as dependency, mixed initiative, and socially distributed cognition. An erratum to this article can be found at

In this paper, we investigate the empirical correlates of the agreement process. Informally, the agreement process is the dialog process by which collaborators achieve joint commitment on a joint action. We propose a specific instantiation of the agreement process, derived from our theoretical model, that integrates the IRMA framework for rational problem solving (Bratman, Israel & Pollack, 1988) with Clark's (1992, 1996) work on language as a collaborative activity; and from the characteristics of our task, a simple design problem (furnishing a two-room apartment) in which knowledge is equally distributed among agents, and needs to be shared. The main contribution of our paper is an empirical study of some of the components of the agreement process. We first discuss why we believe the findings from our corpus of computer-mediated dialogs are applicable to human–human collaborative dialogs in general. We then present our theoretical model, and apply it to make predictions about the components of the agreement process. We focus on how information is exchanged in order to arrive at a proposal, and on what constitutes a proposal and its acceptance/rejection. Our corpus study makes use of features of both the dialog and the domain reasoning situation, and led us to discover that the notion of commitment is more useful to model the agreement process than that of acceptance/rejection, as it more closely relates to the unfolding of negotiation.

We report on a project to annotate biblical texts in order to create an aligned multilingual Bible corpus for linguistic research, particularly computational linguistics, including automatically creating and evaluating translation lexicons and semantically tagged texts. The output of this project will enable researchers to take advantage of parallel translations across a wider number of languages than previously available, providing, with relatively little effort, a corpus that contains careful translations and reliable alignment at the near-sentence level. We discuss the nature of the text, our annotation process, preliminary and planned uses for the corpus, and relevant aspects of the Corpus Encoding Standard (CES) with respect to this corpus. We also present a quantitative comparison with dictionary and corpus resources for modern-day English, confirming the relevance of this corpus for research on present day language.

Statistical methods to extract translational equivalents from non-parallel corpora hold the promise of ensuring the required coverage and domain customisation of lexicons as well as accelerating their compilation and maintenance. Rare, less common words and expressions, which often have low corpus frequencies, are a challenge for these methods. However, it is rare words such as newly introduced terminology and named entities that present the main interest for practical lexical acquisition. In this article, we study possibilities of improving the extraction of low-frequency equivalents from bilingual comparable corpora. Our work is carried out in the general framework which discovers equivalences between words of different languages using similarities between their occurrence patterns found in respective monolingual corpora. We develop a method that aims to compensate for insufficient amounts of corpus evidence on rare words: prior to measuring cross-language similarities, the method uses same-language corpus data to model co-occurrence vectors of rare words by predicting their unseen co-occurrences and smoothing rare, unreliable ones. Our experimental evaluation demonstrates that the proposed method delivers a consistent and significant improvement on the conventional approach to this task.
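
The same-language smoothing step can be made concrete with a toy sketch. It is not the authors' model: the abstract does not say how unseen co-occurrences are predicted, so this version simply interpolates a rare word's vector with the average vector of its nearest same-language neighbours. The function names, the window size, the neighbour count `k`, and the interpolation `weight` are all assumptions, and the cross-language comparison itself is left out.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Count, for every word, the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def smooth_rare(word, vectors, k=2, weight=0.5):
    """Interpolate a rare word's vector with the mean of its k nearest neighbours."""
    target = vectors[word]
    neighbours = sorted(
        (other for other in vectors if other != word),
        key=lambda other: cosine(target, vectors[other]),
        reverse=True,
    )[:k]
    smoothed = Counter()
    for context, count in target.items():
        smoothed[context] += (1 - weight) * count
    for neighbour in neighbours:
        for context, count in vectors[neighbour].items():
            smoothed[context] += weight * count / len(neighbours)
    return smoothed


if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "ferret", "sat"]]
    vectors = cooccurrence_vectors(corpus)
    smoothed = smooth_rare("ferret", vectors)
    print(cosine(vectors["ferret"], vectors["cat"]), cosine(smoothed, vectors["cat"]))
```

In this toy corpus "ferret" never occurs next to "the", yet the smoothed vector assigns that context a non-zero weight because its neighbours do occur with it. That is the kind of unseen co-occurrence the abstract describes predicting before cross-language similarities are measured.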

This paper reports progress in the synthesis of conversational speech, from the viewpoint of work carried out on the analysis of a very large corpus of expressive speech in normal everyday situations. With recent developments in concatenative techniques, speech synthesis has overcome the barrier of realistically portraying extra-linguistic information by using the actual voice of a recognizable person as a source for units, combined with minimal use of signal processing. However, the technology still faces the problem of expressing paralinguistic information, i.e., the variety in the types of speech and laughter that a person might use in everyday social interactions. Paralinguistic modification of an utterance portrays the speaker's affective states and shows his or her relationships with the listener through variations in the manner of speaking, by means of prosody and voice quality. These inflections are carried on the propositional content of an utterance, and can perhaps be modeled by rule, but they are also expressed through nonverbal utterances, the complexity of which may be beyond the capabilities of many current synthesis methods. We suggest that this problem may be solved by the use of phrase-sized utterance units taken intact from a large corpus.

Weblogs are increasingly popular modes of communication and they are frequently used as mediums for emotional expression in the ever changing online world. This work uses blogs as object and data source for Chinese emotional expression analysis. First, a textual emotional expression space model is described, and based on this model, a relatively fine-grained annotation scheme is proposed for manual annotation of an emotion corpus. At the document and paragraph levels, emotion category, emotion intensity, topic word and topic sentence are annotated. At the sentence level, emotion category, emotion intensity, emotional keyword and phrase, degree word, negative word, conjunction, rhetoric, punctuation, objective or subjective, and emotion polarity are annotated. Then, using this corpus, we explore the linguistic expressions that indicate emotion in Chinese, and present a detailed data analysis of them, involving mixed emotions, independent emotion, emotion transfer, and analysis of words and rhetoric for emotional expression.

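Objective: When taking photographs in everyday life, it is hard to capture the moment at which everyone's facial expression is at its best. Repeated posing is time-consuming and may miss certain scenes, and traditional post-editing software is not designed for this purpose and is complicated to operate. For portrait photos in which some faces have a poor expression, an interactive photo editing algorithm based on expression transfer is proposed. Method: First, feature points are detected in the photo containing the source face and in the photo containing the face with the target expression. The specified face regions are selected interactively and their pose is rectified so that the eyes lie on the same horizontal line. If the target face and the source face have the same identity, the target face region is deformed by scan-line warping according to the contour of the source face and the distribution of the left and right half-faces to obtain the replacement target. If the identities differ, new positions for the feature points of the source face are computed from the geometric features of the target face's feature point distribution, and the replacement target is obtained by mesh deformation driven by the feature point changes. Finally, relighting and Poisson blending are used to stitch it seamlessly into the source image. Results: Experiments show that the algorithm can edit expressions in portrait photos in which the facial features are clear and within tolerance. The results change only the person's facial expression and show no obvious stitching artifacts. Conclusion: A new interactive expression transfer model that handles target faces of the same or a different identity is proposed. The model can adapt to different editing conditions and requirements and performs well.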

The paralinguistic information in a speech signal includes clues to the geographical and social background of the speaker. This paper is concerned with automatic extraction of this information from a short segment of speech. A state-of-the-art language identification (LID) system is applied to the problems of regional accent recognition for British English, and ethnic group recognition within a particular accent. We compare the results with human performance and, for accent recognition, the ‘text dependent’ ACCDIST accent recognition measure. For the 14 regional accents of British English in the ABI-1 corpus (good quality read speech), our LID system achieves a recognition accuracy of 89.6%, compared with 95.18% for our best ACCDIST-based system and 58.24% for human listeners. The “Voices across Birmingham” corpus contains significant amounts of telephone conversational speech for the two largest ethnic groups in the city of Birmingham (UK), namely the ‘Asian’ and ‘White’ communities. Our LID system distinguishes between these two groups with an accuracy of 96.51% compared with 90.24% for human listeners. Although direct comparison is difficult, it seems that our LID system performs much better on the standard 12 class NIST 2003 Language Recognition Evaluation task or the two class ethnic group recognition task than on the 14 class regional accent recognition task. We conclude that automatic accent recognition is a challenging task for speech technology, and speculate that the use of natural conversational speech may be advantageous for these types of paralinguistic task.

Facial expression recognition has recently become an important research area, and many efforts have been made in facial feature extraction and its classification to improve face recognition systems. Most researchers adopt a posed facial expression database in their experiments, but in a real-life situation the facial expressions may not be very obvious. This article describes the extraction of the minimum number of Gabor wavelet parameters for the recognition of natural facial expressions. The objective of our research was to investigate the performance of a facial expression recognition system with a minimum number of features of the Gabor wavelet. In this research, principal component analysis (PCA) is employed to compress the Gabor features. We also discuss the selection of the minimum number of Gabor features that will perform the best in a recognition task employing a multiclass support vector machine (SVM) classifier. The performance of facial expression recognition using our approach is compared with those obtained previously by other researchers using other approaches. Experimental results showed that our proposed technique is successful in recognizing natural facial expressions by using a small number of Gabor features with an 81.7% recognition rate. In addition, we identify the relationship between the human vision and computer vision in recognizing natural facial expressions.

We describe research carried out as part of a text summarisation project for the legal domain for which we use a new XML corpus of judgments of the UK House of Lords. These judgments represent a particularly important part of public discourse due to the role that precedents play in English law. We present experimental results using a range of features and machine learning techniques for the task of predicting the rhetorical status of sentences and for the task of selecting the most summary-worthy sentences from a document. Results for these components are encouraging as they achieve state-of-the-art accuracy using robust, automatically generated cue phrase information. Sample output from the system illustrates the potential of summarisation technology for legal information management systems and highlights the utility of our rhetorical annotation scheme as a model of legal discourse, which provides a clear means for structuring summaries and tailoring them to different types of users.

In this paper we present an annotated audio–video corpus of multi-party meetings. For each subject involved in the experimental sessions, the multimodal corpus provides six annotation dimensions referring to group dynamics, speech activity, and body activity. The corpus is based on 11 audio and video recorded sessions which took place in a lab setting appropriately equipped with cameras and microphones. Our main concern in collecting this multimodal corpus was to explore the possibility of providing feedback services to facilitate group processes and to enhance self-awareness among small groups engaged in meetings. We therefore introduce a coding scheme for annotating relevant functional roles that appear in a small group interaction. We also discuss the reliability of the coding scheme and we present the first results for automatic classification.

In this paper we present a tool that uses comparable corpora to find appropriate translation equivalents for expressions that are considered by translators as difficult. For a phrase in the source language the tool identifies a range of possible expressions used in similar contexts in target language corpora and presents them to the translator as a list of suggestions. In the paper we discuss the method and present results of human evaluation of the performance of the tool, which highlight its usefulness when dictionary solutions are lacking.

Shimon is an interactive robotic marimba player, developed as part of our ongoing research in Robotic Musicianship. The robot listens to a human musician and continuously adapts its improvisation and choreography, while playing simultaneously with the human. We discuss the robot's mechanism and motion-control, which uses physics simulation and animation principles to achieve both expressivity and safety. We then present an interactive improvisation system based on the notion of physical gestures for both musical and visual expression. The system also uses anticipatory action to enable real-time improvised synchronization with the human player. We describe a study evaluating the effect of embodiment on one of our improvisation modules: antiphony, a call-and-response musical synchronization task. We conducted a 3×2 within-subject study manipulating the level of embodiment, and the accuracy of the robot's response. Our findings indicate that synchronization is aided by visual contact when uncertainty is high, but that pianists can resort to internal rhythmic coordination in more predictable settings. We find that visual coordination is more effective for synchronization in slow sequences; and that occluded physical presence may be less effective than audio-only note generation. Finally, we test the effects of visual contact and embodiment on audience appreciation. We find that visual contact in joint Jazz improvisation makes for a performance in which audiences rate the robot as playing better, more like a human, as more responsive, and as more inspired by the human. They also rate the duo as better synchronized, more coherent, communicating, and coordinated; and the human as more inspired and more responsive.
