7 results found (search time: 0 ms)
1.
Perakakis M. Potamianos A. 《IEEE transactions on audio, speech, and language processing》2008,16(6):1194-1206
The usage patterns of speech and visual input modes are investigated as a function of relative input mode efficiency for both desktop and personal digital assistant (PDA) working environments. For this purpose the form-filling part of a multimodal dialogue system is implemented and evaluated; three multimodal modes of interaction are implemented: "Click-to-Talk," "Open-Mike," and "Modality-Selection." "Modality-Selection" implements an adaptive interface where the system selects the most efficient input mode at each turn, effectively alternating between a "Click-to-Talk" and an "Open-Mike" interaction style as proposed in "Modality tracking in the multimodal Bell Labs Communicator," in Proceedings of the Automatic Speech Recognition and Understanding Workshop, by A. Potamianos, 2003. The multimodal systems are evaluated and compared with the unimodal systems. Objective and subjective measures used include task completion, task duration, turn duration, and overall user satisfaction. Turn duration is broken down into interaction time and inactivity time to better measure the efficiency of each input mode. Duration statistics and empirical probability density functions are computed as a function of interaction context and user. Results show that the multimodal systems outperform the unimodal systems in terms of objective and subjective criteria. Also, users tend to use the most efficient input mode at each turn; however, biases towards the default input modality and a general bias towards the speech modality also exist. Results demonstrate that although users exploit some of the available synergies in multimodal dialogue interaction, further efficiency gains can be achieved by designing adaptive interfaces that fully exploit these synergies.
2.
Academic papers, like genes, code for ideas or technological innovations that structure and transform the scientific organism
and consequently the society at large. Genes are subject to the process of natural selection which ensures that only the fittest
survive and contribute to the phenotype of the organism. The process of selection of academic papers, however, is far from
natural. Commercial for-profit publishing houses have taken control over the evaluation and access to scientific information
with serious consequences for the dissemination and advancement of knowledge. Academic authors and librarians are reacting
by developing an alternative publishing system based on free-access journals and self-archiving in institutional repositories
and global disciplinary libraries. Despite the emergence of such trends, the journal monopoly, rather than the scientific
community, is still in control of selecting papers and setting academic standards. Here we propose a dynamical and transparent
peer review process, which we believe will accelerate the transition to a fully open and free-for-all science that will allow
the natural selection of the fittest ideas.
3.
Potamianos A. Fosler-Lussier E. Ammicht E. Perakakis M. 《Multimedia, IEEE Transactions on》2007,9(3):550-566
For Part I, see ibid., vol. 9, p. 3 (2007). In this paper, the task and user interface modules of a multimodal dialogue system development platform are presented. The main goal of this work is to provide a simple, application-independent solution to the problem of multimodal dialogue design for information seeking applications. The proposed system architecture clearly separates the task and interface components of the system. A task manager is designed and implemented that consists of two main submodules: the electronic form module that handles the list of attributes that have to be instantiated by the user, and the agenda module that contains the sequence of user and system tasks. Both the electronic forms and the agenda can be dynamically updated by the user. Next a spoken dialogue module is designed that implements the speech interface for the task manager. The dialogue manager can handle complex error correction and clarification user input, building on the semantics and pragmatic modules presented in Part I of this paper. The spoken dialogue system is evaluated for a travel reservation task of the DARPA Communicator research program and shown to yield over 90% task completion and good performance for both objective and subjective evaluation metrics. Finally, a multimodal dialogue system that combines graphical and speech interfaces is designed, implemented and evaluated. Minor modifications to the unimodal semantic and pragmatic modules were required to build the multimodal system. It is shown that the multimodal system significantly outperforms the unimodal speech-only system both in terms of efficiency (task success and time to completion) and user satisfaction for a travel reservation task.
4.
Perakakis Pandelis Taylor Michael Mazza Marco G. Trachana Varvara 《Scientometrics》2011,88(2):669-673
We welcome the commentary by L. Egghe (Scientometrics, this issue) stimulating discussion on our recent article “Natural selection of academic papers” (NSAP) (Scientometrics, 85(2):553–559,
2010) that focuses on an important modern issue at the heart of the scientific enterprise—the open and continuous evaluation and
evolution of research. We are also grateful to the editor of Scientometrics for giving us the opportunity to respond to some
of the arguments by L. Egghe that we believe are inaccurate or require further comment.
5.
Buela-Casal Gualberto Perakakis Pandelis Taylor Michael Checa Purificación 《Scientometrics》2006,67(1):45-65
6.
Vousdoukas MI Perakakis P Idrissi S Vila J 《Computer methods and programs in biomedicine》2012,108(1):318-329
This article presents a Matlab-based stereo-vision motion tracking system (SVMT) for the detection of human motor reactivity elicited by sensory stimulation. It is a low-cost, non-intrusive system supported by Graphical User Interface (GUI) software, and has been successfully tested and integrated in a broad array of physiological recording devices at the Human Physiology Laboratory in the University of Granada. The SVMT GUI software handles data in Matlab and ASCII formats. Internal functions perform lens distortion correction, camera geometry definition, feature matching, as well as data clustering and filtering to extract 3D motion paths of specific body areas. System validation showed geo-rectification errors below 0.5 mm, while the feature matching and motion path extraction procedures were successfully validated with manual tracking, and RMS errors were typically below 2% of the movement range. The application of the system in a psychophysiological experiment designed to elicit a startle motor response by the presentation of intense and unexpected acoustic stimuli provided reliable data probing dynamical features of motor responses and habituation to repeated stimulus presentations. The stereo-geolocation and motion tracking performance of the SVMT system was successfully validated through comparisons with surface EMG measurements of eyeblink startle, which clearly demonstrate the ability of SVMT to track subtle body movements, such as those induced by the presentation of intense acoustic stimuli. Finally, SVMT provides an efficient solution for the assessment of motor reactivity not only in controlled laboratory settings, but also in more open, ecological environments.
7.
Digalakis V.V. Neumeyer L.G. Perakakis M. 《Selected Areas in Communications, IEEE Journal on》1999,17(1):82-90
We examine alternative architectures for a client-server model of speech-enabled applications over the World Wide Web (WWW). We compare a server-only processing model, where the client encodes and transmits the speech signal to the server, to a model where the recognition front end runs locally at the client and encodes and transmits the cepstral coefficients to the recognition server over the Internet. We follow a novel encoding paradigm, trying to maximize recognition performance instead of perceptual reproduction, and we find that by transmitting the cepstral coefficients we can achieve significantly higher recognition performance at a fraction of the bit rate required when encoding the speech signal directly. We find that the required bit rate to achieve the recognition performance of high-quality unquantized speech is just 2000 bits per second.