Similar documents
1.
In a series of experiments, isolated-word automated speech recognition (ASR) was compared with keyboard and mouse interfaces for three data entry tasks: textual phrase entry, selection from a list, and numerical data entry. To effect fair comparisons, the tasks were designed to minimize the transaction cycle for each input mode and data type, and the main comparisons used times from only correct data entries. With the hardware and software employed, the results indicate that for inputting short phrases, ASR competes only if the typist's speed is below 45 words per minute. For selecting an item from a list, ASR offers an advantage only if the list length exceeds 15 items. For entering numerical data, ASR offers no advantage over keypad or mouse. An extrapolation to latency-free ASR suggests that even as hardware and software become faster, human factors will dominate and the results would shift only slightly in favor of ASR.

2.
Multi-touch, which has been heralded as a revolution in human–computer interaction, provides features such as gestural interaction, tangible interfaces, pen-based computing, and interface customization, all of which have been embraced by an increasingly tech-savvy public. However, multi-touch platforms have not been adopted as “everyday” computer interaction devices that support important text-entry-intensive applications such as word processing and spreadsheets. In this paper, we present two studies that begin to explore user performance and experience with entering text using multi-touch input. The first study establishes a benchmark for text entry performance on a multi-touch platform across input modes, comparing uppercase-only to mixed-case, single-touch to multi-touch, and copy to memorization tasks. The second study includes mouse-style interaction for formatting rich text to simulate a word processing task using multi-touch input. As expected, our results show that users do not perform as well in terms of text entry efficiency and speed using a multi-touch interface as with a traditional keyboard. Unexpectedly, the degradation in performance was significantly smaller for memorization than for copy tasks, and consequently willingness to use multi-touch was substantially higher (50% versus 26%) in the former case. Our results, which include the preferred input styles of participants, also provide a baseline for further research to explore techniques for improving text entry performance on multi-touch systems.

3.
A program is described for sequence data entry which allows flexible program control by responding to both the keyboard and a sonic digitizer concurrently. Simplification of the initialization stage of each gel reading has been achieved, in comparison with other programs.

4.
Emotion recognition from speech has emerged as an important research area in the recent past. In this regard, a review of existing work on emotional speech processing is useful for carrying out further research. In this paper, the recent literature on speech emotion recognition is presented, considering the issues related to emotional speech corpora, the different types of speech features, and the models used for recognition of emotions from speech. Thirty-two representative speech databases are reviewed in this work from the point of view of their language, number of speakers, number of emotions, and purpose of collection. The issues related to emotional speech databases used in emotional speech recognition are also briefly discussed. Literature on the different features used in the task of emotion recognition from speech is presented. The importance of choosing different classification models is discussed along with the review. The important issues to be considered for further emotion recognition research, both in general and specifically in the Indian context, are highlighted wherever necessary.

5.
This paper describes a user study on interaction with a mobile device installed in a driving simulator. Two new auditory interfaces were proposed and their effectiveness and efficiency were compared to a standard visual interface. Both auditory interfaces consisted of spatialized auditory cues representing individual items in the hierarchical structure of the menu. In the first auditory interface all items of the current level of the menu were played simultaneously. In the second auditory interface only one item was played at a time. The visual interface was shown on a small in-vehicle LCD screen on the dashboard. In all three cases, a custom-made interaction device (a scrolling wheel and two buttons) attached to the steering wheel was used for controlling the interface. The driving performance, task completion times, perceived workload and overall user satisfaction were evaluated. The experiment showed that both auditory interfaces were effective to use in a mobile environment, but were not faster than the visual interface. In the case of shorter tasks, e.g. changing the active profile or deleting an image, the task completion times were comparable for all interfaces; however, driving performance was significantly better and perceived workload was lower when using the auditory interfaces. The test subjects also reported high overall satisfaction with the auditory interfaces, which were rated as easier to use, more satisfying and more adequate for performing the required tasks than the visual interface. The results of the survey are not surprising, as there is stronger competition for visual attention between the visual interface and the primary task (driving the car) than in the case of the auditory interfaces. So although both types of interface were shown to be effective, the visual interface was less efficient, as it strongly distracted the user from performing the primary task.

6.
Manual text entry, which is one of the main features of mobile communications devices, decreases the competitive advantages of full touch-screen interfaces over physical interfaces. Especially for small full QWERTY keyboards, text entry becomes more problematic because of the small size of the virtual keys, the absence of tactile feedback, and the occlusion of virtual keys by fingers. One solution to this problem is regional error correction, a predictive text entry method that activates the key corresponding to the actual activation point and also other keys within an activation area. This study investigates how the size of the keys and of the activation area affects the accuracy of regional error correction, and compares the regional error correction method with the conventional finger touch method for a touch-screen QWERTY keyboard. Regional error correction reduced both the time and the number of touches required to complete text entry when keys were small, but no difference was observed when keys were large. Users’ subjective assessments of ease of use and preference indicated greater satisfaction with the regional error correction method than without it, regardless of key size. Relevance to industry: The results of this study can be used to speed up and simplify text entry in mobile devices with full-QWERTY virtual keyboards.
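The mechanism described in this abstract can be illustrated with a minimal sketch. The key coordinates, the activation radius, the word list, and the dictionary-filtering step are all hypothetical assumptions made for illustration; the abstract does not specify the study's actual geometry or prediction algorithm.

```python
import math

# Hypothetical key centers on a unit grid (a fragment of a QWERTY layout);
# the study's real key geometry is not given in the abstract.
KEY_CENTERS = {
    "q": (0.0, 0.0), "w": (1.0, 0.0), "e": (2.0, 0.0), "r": (3.0, 0.0), "t": (4.0, 0.0),
    "a": (0.5, 1.0), "s": (1.5, 1.0), "d": (2.5, 1.0), "f": (3.5, 1.0),
}

def activated_keys(touch, radius):
    """Return every key whose center lies within `radius` of the touch point."""
    tx, ty = touch
    return {key for key, (kx, ky) in KEY_CENTERS.items()
            if math.hypot(kx - tx, ky - ty) <= radius}

def candidate_words(touches, radius, lexicon):
    """Keep lexicon words whose i-th letter is among the keys activated by the i-th touch.

    Filtering against a word list is an assumed prediction step, not the study's method.
    """
    regions = [activated_keys(t, radius) for t in touches]
    return [word for word in lexicon
            if len(word) == len(regions)
            and all(ch in region for ch, region in zip(word, regions))]

# Three slightly off-target touches aimed at "s", "e", "t".
touches = [(1.2, 0.8), (2.3, 0.2), (3.8, 0.1)]
words = candidate_words(touches, radius=1.0, lexicon=["set", "sat", "wet", "red"])
print(words)
```

A larger radius tolerates sloppier touches but admits more candidate words, which is the trade-off between key size and activation area that the study examines.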

7.
A DNA editor for an Apple II is described which contains many additional functions apart from just editing sequences. The data files are normal ASCII text or binary files and can thus be used easily by other programs. The program supports a special keyboard which greatly facilitates typing of DNA sequences. Furthermore a speech synthesizer is supported by the editor. The speech feedback, together with the special keyboard, reduces typing errors to a minimum.

8.
Automatic speech recognition (ASR) has made great strides with the development of digital signal processing hardware and software. Despite all these advances, however, machines cannot match the performance of their human counterparts in terms of accuracy and speed, especially in the case of speaker-independent speech recognition. A significant portion of speech recognition research today is therefore focused on the speaker-independent speech recognition problem, owing to its wide range of applications and the limitations of the available speech recognition techniques. Before recognition, speech processing has to be carried out to obtain the feature vectors of the signal, so front-end analysis plays an important role. In this report we briefly discuss the different aspects of front-end analysis for speech recognition, including sound characteristics, feature extraction techniques, spectral representations of the speech signal, etc. We also discuss the advantages and disadvantages of each feature extraction technique, along with the suitability of each method for particular applications.

9.
《Ergonomics》2012,55(3):508-517
Abstract

Video-motion analysis was used to analyse hand/wrist posture for subjects typing at a 101-key QWERTY keyboard on a 68 cm high worksurface. Three conditions were tested: subjects typed at the keyboard without arm support, subjects typed with adjustable full motion forearm supports, and subjects typed with an adjustable negative slope keyboard support system. The average declination of the negative slope keyboard support chosen by subjects was 12° below horizontal, which flattened the angle of the key tops. Ulnar deviation was comparable in all conditions and averaged 13° for the right hand and 15° for the left hand. Full motion forearm supports did not significantly affect any postural measures. Dorsal wrist extension averaged 13° when typing with or without the full motion forearm supports, but this was reduced to an average of −1° with the use of the negative slope keyboard support system. Subjects chose to sit at a distance of 79 cm from the computer screen when using the negative slope keyboard system, compared with 69 cm without it.

10.
An on-line character recognition system was developed which recognized small characters, with a typical height of 4 mm, at a recognition rate of 98%. The system features programming flexibility and allows the recognition logic to be modified. This was made possible by a scheme in which proposition-test sequences were separated into an assembly of tree node data and a set of test subroutines. A demonstration system was also developed which enters personal data into a computer by recognizing hand-printed characters. The whole system proved to be a feasible substitute for a billing machine keyboard in data entry applications.

11.
Document recognition system (DRS), a workstation-based prototype document analysis system that uses optical character recognition (OCR), is described. The system provides functions for image capture, block segmentation, page structure analysis, and character recognition with contextual postprocessing, as well as a user interface for error correction. All the functions except image capture and character recognition have been implemented by means of software for the Japanese edition of OS/2

12.
This paper presents work on developing speech corpora and recognition tools for Turkish by porting SONIC, a speech recognition tool developed initially for English at the Center for Spoken Language Research of the University of Colorado at Boulder. The work presented in this paper had two objectives. The first was to collect a standard phonetically-balanced Turkish microphone speech corpus for general research use. A 193-speaker triphone-balanced audio corpus and a pronunciation lexicon for Turkish have been developed. The corpus was accepted for distribution by the Linguistic Data Consortium (LDC) of the University of Pennsylvania in October 2005, and it will serve as a standard corpus for Turkish speech researchers. The second objective was to develop speech recognition tools (a phonetic aligner and a phone recognizer) for Turkish, which provided a starting point for obtaining a multilingual speech recognizer by porting SONIC to Turkish. This part of the work was the first port of this particular recognizer to a language other than English; subsequently, SONIC has been ported to over 15 languages. Using the phonetic aligner developed, the audio corpus has been provided with word, phone and HMM-state level alignments. For the phonetic aligner, it is shown that 92.6% of the automatically labeled phone boundaries are placed within 20 ms of manually labeled locations for the Turkish audio corpus. Finally, a phone recognition error rate of 29.2% is demonstrated for the phone recognizer.
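The boundary-agreement figure quoted in this abstract (the share of automatic phone boundaries within 20 ms of the manual labels) can be computed along the following lines. This is only a sketch: the function name, the one-to-one pairing of boundaries, and the sample timestamps are assumptions for illustration, not the paper's evaluation code or data.

```python
def boundary_agreement(auto_ms, manual_ms, tolerance_ms=20.0):
    """Fraction of automatic boundaries within `tolerance_ms` of the paired manual boundary.

    Assumes the two lists are already aligned one-to-one (same phone sequence).
    """
    if len(auto_ms) != len(manual_ms):
        raise ValueError("boundary lists must have the same length")
    if not auto_ms:
        raise ValueError("at least one boundary is required")
    hits = sum(abs(a - m) <= tolerance_ms for a, m in zip(auto_ms, manual_ms))
    return hits / len(auto_ms)

# Made-up boundary times in milliseconds, for illustration only.
auto = [100.0, 255.0, 410.0, 640.0]
manual = [110.0, 250.0, 445.0, 630.0]
agreement = boundary_agreement(auto, manual)
print(f"{agreement:.1%} of boundaries within 20 ms")
```

In this toy input, three of the four boundaries fall inside the tolerance and the third misses by 35 ms.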

13.
The increasing use of complex digital avionics systems has resulted in the aircraft pilot making ever greater use of keyboards. The designer has comparatively little coherent information or guidance to help him select the optimal physical and functional characteristics of the keyboard, but current technology gives him a very wide choice. Much of the current literature relates to studies based on static ground environments and, although there is some evidence derived from studies of aircraft, the majority of the ergonomics data relevant to keyboard design is not easily generalized to the airborne environment. A review of the current military standards underlines this as they are brief and they are not always rigidly adhered to in practice. The factors which are likely to affect keying performance are discussed and the state of ergonomics knowledge in the area reviewed.

14.
Traditionally, driver distraction has been categorized into four types: visual, biomechanical, auditory, and cognitive. However, the place of emotion in driving research is largely undefined. The present study investigates the specific influences on driving performance of anger, a representative emotion that arises while driving, compared to those of traditional distraction tasks. In total, seventy-eight participants were recruited and placed into one of four driving conditions: physical (visual-biomechanical) distraction, cognitive (cognitive-auditory) distraction, emotional (anger), and control. The results demonstrated that anger degrades driving performance as much as or more than other distraction types, specifically in a yellow traffic signal situation. The causes of these results, underlying mechanisms, and other considerations are discussed, with implications for future research.

15.
In this work, buried Markov models (BMM) are introduced. In a BMM, a Markov chain state at time t determines the conditional independence patterns that exist between random variables lying within a local time window surrounding t. This model is motivated by and can be fully described by “graphical models”, a general technique to describe families of probability distributions. In the paper, it is shown how information-theoretic criterion functions can be used to induce sparse, discriminative, and class-conditional network structures that yield an optimal approximation to the class posterior probability, and therefore are useful for classification tasks such as speech recognition. Using a new structure learning heuristic, the resulting structurally discriminative models are tested on a medium-vocabulary isolated-word speech recognition task. It is demonstrated that discriminatively structured BMMs, when trained in a maximum likelihood setting using EM, can outperform both hidden Markov models (HMMs) and other dynamic Bayesian networks with a similar number of parameters.

16.
In this paper, we introduce the concept of a personal driving diary. A personal driving diary is a multimedia archive of a person’s daily driving experience, describing the user's important driving events with annotated videos. This paper presents an automated system that constructs such a multimedia diary by analyzing videos obtained from a vehicle-mounted camera. The proposed system recognizes important interactions between the driving vehicle and the other actors in the videos (e.g., accident, overtaking, etc.), and labels them together with its contextual knowledge of the vehicle (e.g., mean velocity) to construct an event log. A decision-tree-based activity recognizer is designed, which detects driving events involving vehicles and pedestrians in the first-person view videos by analyzing their trajectories and spatio-temporal relationships. The constructed diary enables efficient searching and event-based browsing of video clips, which helps users when retrieving videos of dangerous situations. Our experiment confirms that the proposed system reliably generates driving diaries by annotating the vehicle events learned from training examples.

17.
《Ergonomics》2012,55(6):780-790
Abstract

Prospective memories can divert attentional resources from ongoing activities. However, it is unclear whether these effects and the theoretical accounts that seek to explain them will generalise to a complex real-world task such as driving. Twenty-four participants drove two simulated routes while maintaining a fixed headway with a lead vehicle. Drivers were given either event-based (e.g. arriving at a filling station) or time-based errands (e.g. on-board clock shows 3:30). In contrast to the predominant view in the literature which suggests time-based tasks are more demanding, drivers given event-based errands showed greater difficulty in mirroring lead vehicle speed changes compared to the time-based group. Results suggest that common everyday secondary tasks, such as scouting the roadside for a bank, may have a detrimental impact on driving performance. The additional finding that this cost was only evident with the event-based task highlights a potential area of both theoretical and practical interest.

Practitioner Summary: Drivers were given either time- or event-based errands whilst engaged in a simulated drive. We examined the effect of errands on an ongoing vehicle follow task. In contrast to previous non-driving studies, event-based errands are more disruptive. Common everyday errands may have a detrimental impact on driving performance.

18.
Language Resources and Evaluation - This paper introduces the first large vocabulary speech recognition system (LVSR) for the Central Kurdish language, named Jira. The Kurdish language is an...

19.

Stuttering speech recognition is a well-studied concept in speech signal processing. Classification of speech disorders is the main focus of this study. Classification of stuttered speech is becoming more important with the advance of machine learning and deep learning. In this study, some of the recent and most influential stuttering speech recognition methods are reviewed, with a discussion of the different categories of stuttering. The stuttering speech recognition process is divided into four main segments: input speech pre-emphasis, segmentation, feature extraction, and stutter classification. All these segments are briefly elaborated and related research is discussed. It is observed that various traditional machine learning and deep learning classification approaches have been employed to recognize stuttered speech in the last few decades. A comprehensive analysis is presented of the different feature extraction and classification methods, along with their efficiency.

