1.
Alexander Schmitt Dmitry Zaykovskiy Wolfgang Minker 《International Journal of Speech Technology》2008,11(2):63-72
This article presents an overview of different approaches for providing automatic speech recognition (ASR) technology to mobile users. Three principal system architectures with respect to the employment of a wireless communication link are analyzed: Embedded Speech Recognition Systems, Network Speech Recognition (NSR) and Distributed Speech Recognition (DSR). An overview of the solutions that have been standardized so far, as well as a critical analysis of the latest developments in the field of speech recognition in mobile environments, is given. Open issues, pros and cons of the different methodologies and techniques are highlighted. Special emphasis is placed on the constraints and limitations that ASR applications are confronted with under different architectures.
2.
《Pattern recognition》2002,35(9):1917-1931
The aim of this paper is to present a new generalized Hough transform-based hardware algorithm to detect non-analytic objects in a two-dimensional (2D) image space. Our main idea is to use, during the voting process into the 5D parameter space, only a meaningful set of edge points that belong to the boundary of the target object and that share a similar geometric property. In this paper, a same-line support property has been used. This has the merit of reducing the size of the 5D parameter space while increasing the detection accuracy. The whole algorithm was implemented in a highly parallel architecture supported by a single PC board. It is composed of a mixture of digital signal processing and field programmable gate array technologies and uses content addressable memory as the main processing unit. Complexity evaluation of the whole system indicated that a set of 46 different images of 256×256 pixels each can be classified in real time (i.e., within the frame period).
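For orientation, the classical generalized Hough transform core that this hardware algorithm builds on can be sketched as below. This illustrative version votes only for a 2D translation using an R-table, whereas the paper votes in a 5D parameter space and additionally restricts voting to edge points sharing the same line support; it is a textbook sketch, not the authors' implementation.

    import numpy as np

    def build_r_table(template_edges, reference_point):
        """R-table: for each quantized edge-gradient orientation theta, store the
        displacement from the edge point to the shape's reference point."""
        r_table = {}
        for (x, y, theta) in template_edges:
            r_table.setdefault(theta, []).append((reference_point[0] - x,
                                                  reference_point[1] - y))
        return r_table

    def ght_vote(image_edges, r_table, shape):
        """Accumulate votes for the reference-point location (translation only)."""
        accumulator = np.zeros(shape, dtype=np.int32)
        for (x, y, theta) in image_edges:
            for (dx, dy) in r_table.get(theta, []):
                cx, cy = x + dx, y + dy
                if 0 <= cx < shape[1] and 0 <= cy < shape[0]:
                    accumulator[cy, cx] += 1
        # most-voted location = detected reference point of the target shape
        return np.unravel_index(np.argmax(accumulator), shape)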
3.
One aim of detection proposal methods is to reduce the computational overhead of object detection. However, most of the existing methods have significant computational overhead for real-time detection on mobile devices. A fast and accurate proposal method of human detection called personness estimation is proposed, which facilitates real-time human detection on mobile devices and can be effectively integrated into part-based detection, achieving high detection performance at a low computational cost. Our work is based on two observations: (i) normed gradients, which are designed for generic objectness estimation, effectively generate high-quality detection proposals for the person category; (ii) fusing the normed gradients with color attributes improves the performance of proposal generation for human detection. Thus, the candidate windows generated by the personness estimation will very likely contain human subjects. The human detection is then guided by the candidate windows, offering high detection performance even when the detection task terminates prior to completion. This interruptible detection scheme, called anytime detection, enables real-time human detection on mobile devices. Furthermore, we introduce a new evaluation methodology called time-recall curves to practically evaluate our approach. The applicability of our proposed method is demonstrated in extensive experiments on a publicly available dataset and a real mobile device, facilitating acquisition and enhancement of portrait photographs (e.g., selfies) on widespread mobile platforms.
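As a rough illustration of the proposal-scoring idea (not the authors' trained model), the sketch below combines L1-normalized gradient magnitudes of a resized window with a color-attribute feature through an assumed linear scoring function; the window size, the nearest-neighbour resizing, and the weights w_ng and w_color are placeholders that would normally come from training.

    import numpy as np

    def normed_gradient_feature(window, size=8):
        """Resize a grayscale window to size x size and return its L1-normalized
        gradient magnitudes (the 'normed gradients' idea behind objectness scoring)."""
        h, w = window.shape
        ys = np.arange(size) * h // size          # naive nearest-neighbour resize
        xs = np.arange(size) * w // size
        small = window[np.ix_(ys, xs)].astype(np.float32)
        gy, gx = np.gradient(small)
        mag = np.abs(gx) + np.abs(gy)
        return mag / (mag.sum() + 1e-6)

    def personness_score(window_gray, window_color_hist, w_ng, w_color, bias=0.0):
        """Linear fusion of normed-gradient and color-attribute features;
        w_ng, w_color and bias are hypothetical learned parameters."""
        ng = normed_gradient_feature(window_gray).ravel()
        return float(ng @ w_ng + window_color_hist @ w_color + bias)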
4.
Kwontaeg Choi Hyeran Byun 《Pattern recognition》2011,44(2):386-400
Due to the increases in processing power and storage capacity of mobile devices over the years, incorporating real-time face recognition into mobile devices is no longer unattainable. However, the possibility of real-time learning of a large number of samples within mobile devices must be established. In this paper, we attempt to establish this possibility by presenting a real-time training algorithm on mobile devices for face recognition related applications. This differentiates our work from traditional algorithms, which focus on real-time classification. In order to solve the challenging real-time issue on mobile devices, we extract local face features using local random bases and then train a sequential neural network incrementally with these features. We demonstrate the effectiveness of the proposed algorithm and the feasibility of its application on mobile devices through empirical experiments. Our results show that the proposed algorithm significantly outperforms several popular face recognition methods while requiring dramatically less computation time. Moreover, only the proposed method is able to train additional samples incrementally in real time without memory failure or accuracy degradation on a recent mobile phone model.
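The abstract does not give implementation details, but a common way to realize "local random bases plus incremental training of a sequential network" is an OS-ELM-style recursive least-squares update of the output weights over fixed random projections. The sketch below is such a stand-in under that assumption, not the authors' exact algorithm.

    import numpy as np

    class IncrementalClassifier:
        """Fixed random-projection features + recursive-least-squares output layer
        (OS-ELM style); an illustrative stand-in for sequential training."""

        def __init__(self, in_dim, hidden_dim, n_classes, reg=1e-3, seed=0):
            rng = np.random.default_rng(seed)
            self.W = rng.standard_normal((in_dim, hidden_dim))   # fixed random basis
            self.b = rng.standard_normal(hidden_dim)
            self.P = np.eye(hidden_dim) / reg                    # inverse covariance
            self.beta = np.zeros((hidden_dim, n_classes))        # output weights

        def _hidden(self, X):
            return np.tanh(X @ self.W + self.b)

        def partial_fit(self, X, Y):
            """Incrementally update output weights with a new mini-batch (Y one-hot)."""
            H = self._hidden(X)
            S = np.linalg.inv(np.eye(len(H)) + H @ self.P @ H.T)
            self.P -= self.P @ H.T @ S @ H @ self.P
            self.beta += self.P @ H.T @ (Y - H @ self.beta)
            return self

        def predict(self, X):
            return np.argmax(self._hidden(X) @ self.beta, axis=1)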
5.
Mobile devices are an important interactive platform. Due to their limited computation, memory, display area and energy, how to realize efficient, real-time interaction with 3D models on mobile devices is an important research topic. Considering the features of mobile devices, this paper adopts a remote rendering mode and point-based models, and proposes a transmission and rendering approach that supports real-time interaction. First, an improved simplification algorithm based on MLS and the display resolution of mobile devices is proposed. Then, a hierarchical selection of point models and a QoS transmission control strategy are given, based on the operator's area of interest, the degree of interest of objects in the virtual environment, and the rendering error; these reduce energy consumption. Finally, the rendering and interaction of point models are completed on mobile devices. The experiments show that our method is efficient.
Supported by the National Natural Science Foundation of China (Grant No. 60873159), the Program for New Century Excellent Talents in University (Grant No. NCET-07-0039), and the National High-Tech Research & Development Program of China (Grant No. 2006AA01Z333).
6.
Severe limitations in computational, memory, and energy resources make implementing high-quality speech recognition in embedded devices a difficult challenge. In this article, the authors investigate the energy consumption of computation and communication in an embedded distributed speech recognition system and propose optimizations that reduce overall energy consumption while maintaining adequate quality of service for the end user. This article considers the application of DSR traffic to both Bluetooth and 802.11b networks.
7.
Command and control (C&C) speech recognition allows users to interact with a system by speaking commands or asking questions restricted to a fixed grammar containing pre-defined phrases. Whereas C&C interaction has been commonplace in telephony and accessibility systems for many years, only recently have mobile devices had the memory and processing capacity to support client-side speech recognition. Given the personal nature of mobile devices, statistical models that can predict commands based in part on past user behavior hold promise for improving C&C recognition accuracy. For example, if a user calls a spouse at the end of every workday, the language model could be adapted to weight the spouse more than other contacts during that time. In this paper, we describe and assess statistical models learned from a large population of users for predicting the next user command of a commercial C&C application. We explain how these models were used for language modeling, and evaluate their performance in terms of task completion. The best performing model achieved a 26% relative reduction in error rate compared to the base system. Finally, we investigate the effects of personalization on performance at different learning rates via online updating of model parameters based on individual user data. Personalization significantly increased the relative reduction in error rate by an additional 5%.
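A minimal sketch of this kind of time-conditioned command prediction might interpolate a fixed base grammar probability with per-user, per-hour command counts. The class name, smoothing and interpolation weight below are assumptions for illustration, not the paper's model.

    from collections import Counter, defaultdict

    class TimeAwareCommandModel:
        """Bias a fixed-grammar C&C language model with per-user,
        time-of-day command statistics (illustrative sketch)."""

        def __init__(self, base_probs, alpha=0.3):
            self.base = base_probs                 # dict: command -> base P(command)
            self.alpha = alpha                     # weight given to personal history
            self.history = defaultdict(Counter)    # hour of day -> command counts

        def observe(self, command, hour):
            self.history[hour][command] += 1

        def prob(self, command, hour):
            counts = self.history[hour]
            total = sum(counts.values())
            personal = counts[command] / total if total else 0.0
            return (1 - self.alpha) * self.base.get(command, 0.0) + self.alpha * personal

        def predict(self, hour):
            return max(self.base, key=lambda c: self.prob(c, hour))

    # e.g. after repeatedly observing "call spouse" around 18:00,
    # prob("call spouse", 18) rises above the base grammar probability.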
8.
Xin Yang Jiabin Guo Tangli Xue Kwang-Ting Cheng 《Multimedia Tools and Applications》2018,77(6):6607-6628
This paper addresses robust and ultrafast pose tracking on mobile devices, such as smartphones and small drones. Existing methods, relying on either vision analysis or inertial sensing, are either too computationally heavy to achieve real-time performance on a mobile platform, or not sufficiently robust to address unique challenges in mobile scenarios, including rapid camera motions, long exposure times of mobile cameras, etc. This paper presents a novel hybrid tracking system which utilizes on-device inertial sensors to greatly accelerate the visual feature tracking process and improve its robustness. In particular, our system adaptively resizes each video frame based on inertial sensor data and applies a highly efficient binary feature matching method to track the object pose in each resized frame with little accuracy degradation. This tracking result is revised periodically by a model-based feature tracking method (Hare et al. 2012) to reduce accumulated errors. Furthermore, an inertial tracking method and a solution for fusing its results with the feature tracking results are employed to further improve robustness and efficiency. We first evaluate our hybrid system using a dataset consisting of 16 video clips with synchronized inertial sensing data and then assess its performance in a mobile augmented reality application. Experimental results demonstrate our method's superior performance to a state-of-the-art feature tracking method (Hare et al. 2012), a direct tracking method (Engel et al. 2014) and the Vuforia SDK (Ibañez and Figueras 2013), and show that it can run at more than 40 Hz on a standard smartphone. We will release the source code with the publication of this paper.
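One plausible (assumed, not documented) form of the inertial-guided frame resizing is to map gyroscope angular speed to a down-scaling factor before feature matching, as in this sketch; the constants and the linear mapping are illustrative only.

    def adaptive_resize_factor(gyro_rad_per_s, min_scale=0.25, max_scale=1.0, k=0.5):
        """Map instantaneous angular speed from the gyroscope to a frame
        down-scaling factor: fast rotation -> smaller frame -> cheaper matching.
        The linear mapping and constants are illustrative assumptions."""
        wx, wy, wz = gyro_rad_per_s
        speed = (wx * wx + wy * wy + wz * wz) ** 0.5
        scale = max_scale - k * speed
        return max(min_scale, min(max_scale, scale))

    # e.g. resize each frame before feature matching (OpenCV assumed available):
    # import cv2
    # s = adaptive_resize_factor(imu.angular_velocity)
    # small = cv2.resize(frame, None, fx=s, fy=s, interpolation=cv2.INTER_AREA)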
9.
Predicting performance of object recognition
Boshra M. Bhanu B. 《IEEE transactions on pattern analysis and machine intelligence》2000,22(9):956-969
We present a method for predicting fundamental performance of object recognition. We assume that both scene data and model objects are represented by 2D point features and a data/model match is evaluated using a vote-based criterion. The proposed method considers data distortion factors such as uncertainty, occlusion, and clutter, in addition to model similarity. This is unlike previous approaches, which consider only a subset of these factors. Performance is predicted in two stages. In the first stage, the similarity between every pair of model objects is captured by comparing their structures as a function of the relative transformation between them. In the second stage, the similarity information is used along with statistical models of the data-distortion factors to determine an upper bound on the probability of recognition error. This bound is directly used to determine a lower bound on the probability of correct recognition. The validity of the method is experimentally demonstrated using real synthetic aperture radar (SAR) data.
10.
Ville Könönen Jani Mäntyjärvi Heidi Similä Juha Pärkkä Miikka Ermes 《Pervasive and Mobile Computing》2010,6(2):181-197
Mobile devices contain several built-in sensor units and sources that provide data for context reasoning. More context sources can be attached via wireless network connections. Usually, the mobile devices and the context sources are battery powered and their computational and storage resources are limited. This sets special requirements for context recognition algorithms. In this paper, several classification and automatic feature selection algorithms are compared in the context recognition domain. The main goal of this study is to investigate how much advantage can be achieved by using sophisticated and complex classification methods compared with a simple method that can easily be implemented in mobile devices. The main result is that even a simple linear classification algorithm can achieve reasonably good accuracy if the features calculated from raw data are selected in a suitable way. Usually context recognition algorithms are fitted to a particular problem instance in an off-line manner, and modifying such methods for on-line learning is difficult or impossible. An on-line version of the Minimum-distance classifier is presented in this paper and is shown to yield considerably higher classification accuracies than the static off-line version of the algorithm. Moreover, we report superior performance for the Minimum-distance classifier compared to other classifiers from the viewpoint of computational load and power consumption of a smart phone.
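An on-line minimum-distance classifier of the kind discussed above can be realized with per-class running means updated sample by sample and nearest-mean classification. The sketch below is one plausible reading of such a variant, not the authors' code, and its per-sample cost is a single mean update, which fits the low-power argument.

    import numpy as np

    class OnlineMinimumDistanceClassifier:
        """Nearest class-mean classifier with on-line (incremental) mean updates."""

        def __init__(self):
            self.means = {}     # class label -> running mean feature vector
            self.counts = {}    # class label -> number of samples seen

        def update(self, x, label):
            x = np.asarray(x, dtype=float)
            if label not in self.means:
                self.means[label] = x.copy()
                self.counts[label] = 1
            else:
                self.counts[label] += 1
                self.means[label] += (x - self.means[label]) / self.counts[label]

        def predict(self, x):
            x = np.asarray(x, dtype=float)
            return min(self.means, key=lambda c: np.linalg.norm(x - self.means[c]))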
11.
Multimedia Tools and Applications - Recognition of moving objects in video images is mainly based on acquiring the target information in a certain time series. After image processing, relevant...
12.
Bonny Talal Rabie Tamer Baziyad Mohammed Balid Walid 《Multimedia Tools and Applications》2019,78(18):25781-25806
Multimedia Tools and Applications - Object recognition is a broad area that covers several topics including face recognition, gesture recognition, human gait recognition, traffic road signs...
13.
Ying-Hao Yu Tsu-Tian Lee Pei-Yin Chen Ngaiming Kwok 《Journal of Real-Time Image Processing》2018,15(2):249-264
Describing image features in a concise and perceivable manner is essential for focusing on candidate solutions for classification purposes. In addition to image recognition with geometric modeling and frequency domain transformation, this paper presents a novel 2D on-chip feature extraction method named semantics-based vague image representation (SVIR) to reduce the semantic gap of content-based image retrieval. SVIR successively deconstructs an object silhouette into intelligible features by pixel scans, and then evolves and combines the piecewise features into another pattern in linguistic form. Besides providing semantic annotations, SVIR is free of complicated calculations, so on-chip designs of SVIR can attain real-time processing performance without making use of a high-speed clock. The effectiveness of the SVIR algorithm was demonstrated with timing sequences and real-life operations on a field-programmable gate array (FPGA) development platform. With low hardware resource consumption on a single FPGA chip, the SVIR design can be used in portable machine vision for ambient intelligence in the future.
14.
15.
Rui Godinho Marielba Zacarias Fernando G. Lobo 《Behaviour & Information Technology》2015,34(2):135-150
This paper describes the development process of EasyWrite, a text-entry method for mobile devices that allows people with hand coordination problems to use small computing devices such as smartphones, tablet PCs, or other touchscreen machines. This text-entry method aims at improving typing accuracy and reducing the frustration of people affected by this motor disability when using small devices. EasyWrite was developed following an iterative and user-centred process. Starting from requirements elicited from observing potential users with mild and moderate motor disabilities and information provided by a literature review, a low-fidelity prototype was built and evaluated. This early prototype was refined throughout several design and evaluation iterations. Its current state is a functional prototype that works on Android phones, whose usability was evaluated through user tests. The result of this process is a small virtual keyboard for mobile devices that has fewer and bigger keys compared to other onscreen keyboards. The concept of EasyWrite is largely based on the notion of scanning group systems, but it allows users to navigate directly through groups and subgroups of characters by tapping on directional keys in order to find the desired character, rather than waiting for a visual cursor to advance through the options one at a time at a specific time rate. Though at its current stage the method proposed by EasyWrite shows some limitations, it appears to be appropriate for users with moderate motor disabilities. For this group of people, user test results indicate that EasyWrite could be a more adequate text-entry method than those provided by standard keyboards, both physical and onscreen, commonly found in mobile devices.
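The group-navigation idea can be illustrated with a toy layout in which two directional taps select first a group and then a character, with no timed scanning cursor. The grouping and key labels below are invented for illustration and do not reflect EasyWrite's actual layout.

    # Toy sketch of direct (non-timed) navigation through character groups
    # with directional keys, in the spirit of EasyWrite; grouping is invented.
    GROUPS = {
        "up":    {"up": "a", "down": "b", "left": "c", "right": "d"},
        "down":  {"up": "e", "down": "f", "left": "g", "right": "h"},
        "left":  {"up": "i", "down": "j", "left": "k", "right": "l"},
        "right": {"up": "m", "down": "n", "left": "o", "right": "p"},
    }

    def type_character(first_tap, second_tap):
        """First tap selects a group, second tap selects a character inside it."""
        return GROUPS[first_tap][second_tap]

    assert type_character("down", "left") == "g"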
16.
17.
J. Guerra-Casanova C. Sánchez-Ávila G. Bailador A. de Santos Sierra 《International Journal of Information Security》2012,11(2):65-83
This article proposes an innovative biometric technique based on the idea of authenticating a person on a mobile device by gesture recognition. To accomplish this aim, a user is prompted to be recognized by a gesture he/she performs moving his/her hand while holding a mobile device with an embedded accelerometer. As users are not able to repeat a gesture exactly in the air, an algorithm based on sequence alignment is developed to correct slight differences between repetitions of the same gesture. The robustness of this biometric technique has been studied in two different tests, analyzing a database of 100 users with real falsifications. Equal Error Rates of 2.01% and 4.82% have been obtained in a zero-effort and an active impostor attack, respectively. A permanence evaluation is also presented from the analysis of the repetition of the gestures of 25 users in 10 sessions over a month. Furthermore, two different gesture databases have been developed: one made up of 100 genuine identifying 3-D hand gestures and 3 impostors trying to falsify each of them, and another with 25 volunteers repeating their identifying 3-D hand gesture in 10 sessions over a month. To the best of our knowledge, these databases are the most extensive in published studies.
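The sequence-alignment step is not specified in detail in the abstract; a generic way to align accelerometer gesture repetitions is dynamic time warping, sketched below with a simple threshold-based acceptance rule. Both the alignment and the decision rule are assumptions, not the authors' algorithm.

    import numpy as np

    def dtw_distance(seq_a, seq_b):
        """Dynamic-time-warping distance between two 3-axis accelerometer
        sequences (arrays of shape (T, 3)); a generic alignment sketch."""
        a, b = np.asarray(seq_a, float), np.asarray(seq_b, float)
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m]

    def authenticate(probe, enrolled_templates, threshold):
        """Accept if the probe gesture aligns closely enough with any enrolled repetition."""
        return min(dtw_distance(probe, t) for t in enrolled_templates) <= threshold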
18.
Previous work described a biologically motivated object recognition system with Gabor wavelets as the basic feature type. These features are robust against slight distortion, rotation and variation in illumination. Here we describe extensions of the system that address image variance due to arbitrary in-plane rotation, substantial scale changes and moderate depth rotation of objects, and to background variation, using simple linear transformations of the Gabor filter responses. The performance of the system is enhanced significantly.
19.
Suk Kyu Lee Hyunsoon Kim Albert Yongjoon Chung Hwangnam Kim 《Multimedia Tools and Applications》2018,77(2):1811-1842
Watching 3D video on a 3D display is popular nowadays. However, it is still difficult to enjoy 3D multimedia content on a mobile device, even though mobile devices with 3D displays have been introduced into the market. The main technological challenges for watching 3D content on mobile devices are generating and streaming the 3D content. Generating 3D content requires extra computational resources, and streaming 3D content demands additional network bandwidth for receiving and transmitting the 3D data. To overcome these technological challenges, in this paper we propose ReMA, a novel 3D video streaming system. We devised a novel architecture for the transmitter, the receiver, and a distribution system to efficiently disseminate and generate 3D videos for mobile devices. We implemented ReMA in a real test-bed and conducted a thorough empirical evaluation to assess the feasibility of streaming 3D content to mobile devices. Based on our empirical study, the resulting system shows great promise for streaming 3D video in real time to mobile devices.
20.
Mobile robotics has achieved notable progress; however, to increase the complexity of the tasks that mobile robots can perform in natural environments, we need to provide them with a greater semantic understanding of their surroundings. In particular, identifying indoor scenes, such as an Office or a Kitchen, is a highly valuable perceptual ability for an indoor mobile robot, and in this paper we propose a new technique to achieve this goal. As a distinguishing feature, we use common objects, such as Doors or furniture, as a key intermediate representation to recognize indoor scenes. We frame our method as a generative probabilistic hierarchical model, where we use object category classifiers to associate low-level visual features with objects, and contextual relations to associate objects with scenes. The inherent semantic interpretation of common objects allows us to use rich sources of online data to populate the probabilistic terms of our model. In contrast to alternative computer vision based methods, we boost performance by exploiting the embedded and dynamic nature of a mobile robot. In particular, we increase detection accuracy and efficiency by using a 3D range sensor that allows us to implement a focus-of-attention mechanism based on geometric and structural information. Furthermore, we use concepts from information theory to propose an adaptive scheme that limits computational load by selectively guiding the search for informative objects. The operation of this scheme is facilitated by the dynamic nature of a mobile robot that is constantly changing its field of view. We test our approach using real data captured by a mobile robot navigating in Office and home environments. Our results indicate that the proposed approach outperforms several state-of-the-art techniques for scene recognition.
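A stripped-down, naive-Bayes-style version of the object-to-scene inference can be sketched as follows; the paper's hierarchical generative model also incorporates low-level features, contextual relations and the attention mechanism, and the priors and likelihoods here are assumed to be given (e.g., populated from online data).

    import math

    def scene_posterior(detected_objects, prior, likelihood):
        """Score scenes from detected object categories:
        P(scene | objects) ∝ P(scene) * prod_i P(object_i | scene)."""
        scores = {}
        for scene, p_scene in prior.items():
            log_p = math.log(p_scene)
            for obj in detected_objects:
                log_p += math.log(likelihood[scene].get(obj, 1e-6))  # crude smoothing
            scores[scene] = log_p
        # normalize log scores into a posterior over scenes
        m = max(scores.values())
        z = sum(math.exp(s - m) for s in scores.values())
        return {s: math.exp(v - m) / z for s, v in scores.items()}

    # e.g. scene_posterior(["desk", "monitor"],
    #                      {"Office": 0.5, "Kitchen": 0.5},
    #                      {"Office": {"desk": 0.4, "monitor": 0.3},
    #                       "Kitchen": {"stove": 0.5, "sink": 0.3}})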