Similar Articles
20 similar articles found (search time: 46 ms)
1.
《Advanced Robotics》2013,27(6):629-653
We have developed a human tracking system for robots that integrates sound and face localization. Conventional systems usually require many microphones and/or prior information to localize several sound sources, and they cannot cope with various types of background noise. In our system, cross-power spectrum phase analysis of sound signals obtained with only two microphones is used to localize sound sources without prior information such as impulse response data. An expectation-maximization (EM) algorithm helps the system cope with several moving sound sources. The problem of distinguishing whether sounds come from the front or the back is also solved with only two microphones by rotating the robot's head. A method that uses facial skin colors classified by another EM algorithm enables the system to detect faces in various poses. By detecting a human face, it can compensate for errors in the sound localization of a speaker and identify noise signals arriving from undesired directions. A probability-based method integrates the auditory and visual information to produce a reliable tracking path in real time. Experiments using a robot showed that our system can localize two sounds at the same time and track a communication partner while dealing with various types of background noise.
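The cross-power spectrum phase technique described above is widely known as GCC-PHAT. Below is a minimal sketch of the core computation, assuming a single dominant source per frame; the function names, microphone spacing, and frame handling are illustrative assumptions, not details from the paper.

```python
# GCC-PHAT sketch: estimate the inter-microphone time delay of the dominant
# source, then convert it to an azimuth. Spacing d and the 1e-12 floor are
# illustrative choices.
import numpy as np

def gcc_phat_delay(sig_l, sig_r, fs, max_tau=None):
    """Delay (s) between two microphone signals via phase-transform weighting."""
    n = len(sig_l) + len(sig_r)              # zero-pad to avoid circular wrap
    cross = np.fft.rfft(sig_l, n) * np.conj(np.fft.rfft(sig_r, n))
    cross /= np.abs(cross) + 1e-12           # PHAT: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n)
    max_shift = n // 2 if max_tau is None else min(int(fs * max_tau), n // 2)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

def delay_to_azimuth(tau, d=0.20, c=343.0):
    """Far-field azimuth (deg) from delay, for microphones spaced d metres apart."""
    return np.degrees(np.arcsin(np.clip(tau * c / d, -1.0, 1.0)))
```

Discarding magnitude in the PHAT weighting is what removes the need for impulse-response priors: only phase differences, which encode arrival-time differences, survive.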

2.
Mobile robots capable of auditory perception usually adopt the stop-perceive-act principle to avoid the motor noise they generate while moving. Although this principle reduces the complexity of auditory processing for mobile robots, it restricts their auditory capabilities. In this paper, sound and visual tracking are investigated to compensate for each other's drawbacks and to attain robust object tracking: visual tracking may fail under occlusion, while sound tracking may be ambiguous in localization due to the nature of auditory processing. For this purpose, we present an active audition system for a humanoid robot. The audition system of a highly intelligent humanoid requires localization of sound sources and identification of the meanings of sounds in the auditory scene. The active audition reported in this paper focuses on improved sound source tracking by integrating audition, vision, and motor control. Given multiple sound sources in the auditory scene, the humanoid SIG actively moves its head to improve localization by aligning its microphones orthogonal to the sound source and by capturing possible sound sources by vision. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate the effectiveness of sound and visual tracking.

3.
This paper describes a user study on the benefits and drawbacks of simultaneous spatial sounds in auditory interfaces for visually impaired and blind computer users. Two different auditory interfaces, each in a spatial and a non-spatial condition, were proposed to represent the hierarchical menu structure of a simple word processing application. In the horizontal interface, the sound sources (the menu items) were located in the horizontal plane on a virtual ring surrounding the user's head, while the sound sources in the vertical interface were aligned one above the other in front of the user. In the vertical interface, the central pitch of the sound sources at different elevations was varied in order to improve the otherwise relatively poor localization performance in the vertical dimension. Interaction with the interfaces was based on a standard computer keyboard for input and a pair of studio headphones for output. Twelve blind or visually impaired test subjects were asked to perform ten different word processing tasks within four experimental conditions. Task completion times, navigation performance, overall satisfaction and cognitive workload were evaluated. The initial hypothesis, i.e. that spatial auditory interfaces with multiple simultaneous sounds would prove faster and more efficient than non-spatial ones, was not confirmed. On the contrary, the spatial auditory interfaces proved to be significantly slower due to high cognitive workload and temporal demand. The majority of users did in fact finish tasks with less navigation and key pressing; however, they required much more time. They reported the spatial auditory interfaces to be hard to use for longer periods due to the high temporal and mental demand, especially with regard to the comprehension of multiple simultaneous sounds. The comparison between the horizontal and vertical interfaces showed no significant differences between the two. It is important to point out that all participants were novice users of the system; it is therefore possible that overall performance would change with more extensive use of the interfaces and an increased number of trials or experiment sets. Our interviews with visually impaired and blind computer users showed that they are used to sharing their auditory channel in order to perform multiple simultaneous tasks such as listening to the radio, talking to somebody, or using the computer. As the perception of multiple simultaneous sounds requires the entire capacity of the auditory channel and the listener's total concentration, it does not enable such multitasking.

4.
This paper describes our research on bio-mimetic robot audition. Among the many binaural and monaural sound localization cues in the human auditory system, the interaural time difference cue is selected because it can easily be obtained with omnidirectional microphones. We use a three-microphone system to remove the anterior-posterior ambiguity that occurs in two-microphone (or two-ear) systems. The echo-avoidance model of the precedence effect is used to cope with the echoes and reverberation of real environments. We mimic the cocktail party effect by perceptual grouping of continuous components according to the spatial information obtained by the sound localization method. A wheel-based mobile robot equipped with an auditory system was developed. The auditory system has two sound processing parts: a DSP-based real-time system and an off-line system composed of remote computers. Experiments on localizing and separating multiple sound sources and on robot navigation were conducted to demonstrate the system's ability and potential applications.
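A two-microphone delay constrains a far-field source to a front/back pair of azimuth candidates; an off-axis third microphone picks between them. A hedged sketch of that selection step, under our own assumptions (a planar L-shaped layout, plane-wave propagation, and invented names; the paper's actual array geometry may differ):

```python
# Resolve the front/back ambiguity of a left-right microphone pair using a
# third microphone mounted in front. Delays are t_ref - t_other, e.g. from
# GCC-PHAT. Mic positions (metres) are illustrative.
import numpy as np

C = 343.0                                   # speed of sound (m/s)
MIC_L = np.array([0.0,  0.10])              # left mic, on the y-axis
MIC_R = np.array([0.0, -0.10])              # right mic
MIC_F = np.array([0.10, 0.0])               # third mic, in front (+x)

def predicted_delay(azimuth_deg, mic_a, mic_b):
    """t_a - t_b for a plane wave from azimuth_deg (0 = ahead, CCW positive)."""
    a = np.radians(azimuth_deg)
    u = np.array([np.cos(a), np.sin(a)])    # unit vector toward the source
    return np.dot(mic_b - mic_a, u) / C

def disambiguate_front_back(tau_lr, tau_lf):
    """Pick the front or back azimuth candidate consistent with the third mic."""
    d = np.linalg.norm(MIC_R - MIC_L)
    s = np.clip(-tau_lr * C / d, -1.0, 1.0)         # sin(azimuth) from L-R pair
    cands = (np.degrees(np.arcsin(s)), 180.0 - np.degrees(np.arcsin(s)))
    errs = [abs(predicted_delay(a, MIC_L, MIC_F) - tau_lf) for a in cands]
    return cands[int(np.argmin(errs))]
```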

5.
Robotic auditory attention mainly relies on sound source localization using a microphone array. Typically, the robot detects a sound source whenever it emits, estimates its direction, and then turns toward that direction to pay attention. However, in scenarios where multiple sound sources emit simultaneously, the robot may have difficulty selecting a single target source. This paper proposes a novel robot auditory attention system based on source distance perception (e.g., selection of the closest among the localized sources). The microphone array consists of a head-array and a base-array installed in the robot's head and base, respectively. The difficulty of attending to multiple sound sources is resolved by estimating a binary mask for each source based on azimuth localization with the head-array. For each source represented by a binary mask, the elevation angles observed from the head- and base-arrays are estimated and triangulated to obtain the distance to the robot. Finally, the closest source is determined and its direction is used to control the robot. Experimental results clearly show the benefit of the proposed system on real indoor recordings of two and three simultaneous sound sources, as well as in a real-time demonstration at a robot exhibition.
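The range-from-two-elevations step has a compact closed form: two vertically separated arrays that see the same source at different elevation angles fix its range through their vertical baseline. A minimal sketch, with the baseline value and all names as our assumptions:

```python
# Triangulate source distance from the elevation angles measured by a base
# array and a head array separated by a known vertical baseline (metres).
import math

def triangulate_distance(elev_base_deg, elev_head_deg, baseline=1.2):
    """Return (horizontal range, 3-D distance) from the base array, in metres."""
    tb = math.tan(math.radians(elev_base_deg))
    th = math.tan(math.radians(elev_head_deg))
    if abs(tb - th) < 1e-9:
        raise ValueError("rays nearly parallel: source too distant to triangulate")
    horizontal = baseline / (tb - th)       # geometry: tb - th = baseline / range
    height = horizontal * tb                # source height above the base array
    return horizontal, math.hypot(horizontal, height)
```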

6.
Auditory displays are developed and investigated for mobile service robots in a human–machine environment. The service robot domain was chosen as an example for future use of auditory displays within multimedia process supervision and control applications in industrial, transportation, and medical systems. The design of directional sounds and of additional sounds for robot states as well as the design of more complicated robot sound tracks are explained. Basic musical elements and robot-movement sounds are combined. Experimental studies on the auditory perception of directional sounds as well as of sound tracks for the predictive display of intended robot trajectories in a simulated supermarket scenario are described.

7.
With advances in robot application technology, service robots entering people's daily lives is becoming increasingly feasible. However, a robot's own computing power is limited, and the information received by its onboard sensors alone is also limited. Existing robots are not yet capable of handling complex scenarios, nor can they meet people's expectations for service robots. The cloud robot computing framework (CRCF) designed in this paper connects smart home and other smart hardware devices with robots through the cloud, providing robots with more and broader information. In addition, CRCF incorporates third-party cloud application APIs over the Internet to provide robots with more service functions. The CRCF framework aims to use the big-data processing capability of the cloud to enhance robots' computing and storage capacity, and to extend their information sources and service functions by combining third-party cloud application services with smart hardware devices. Finally, an experiment on remote voice control of a robot verifies the functionality and performance of the CRCF platform in integrating hardware devices and third-party cloud applications.
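The closing experiment, remote voice control through the cloud, suggests a simple request/response loop between the robot and a cloud service. The sketch below is entirely hypothetical: the endpoint URL, payload schema, and command table are invented for illustration, since the abstract does not specify CRCF's actual API.

```python
# Hypothetical CRCF-style loop: the robot offloads speech recognition to a
# cloud endpoint and maps the transcript to a local motion command.
import requests

CLOUD_ASR_URL = "https://example-cloud.invalid/asr"        # placeholder endpoint
COMMANDS = {                                               # (linear m/s, angular rad/s)
    "forward": (0.2, 0.0),
    "stop": (0.0, 0.0),
    "turn left": (0.0, 0.5),
}

def remote_voice_command(wav_bytes):
    """Send raw audio to the cloud ASR and return a velocity command."""
    resp = requests.post(CLOUD_ASR_URL, data=wav_bytes,
                         headers={"Content-Type": "audio/wav"}, timeout=5.0)
    resp.raise_for_status()
    transcript = resp.json().get("text", "").strip().lower()
    return COMMANDS.get(transcript, COMMANDS["stop"])      # default to a safe stop
```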

8.
This paper reviews the state of the art in loudspeaker-oriented spatial sound generation and localization in virtual environments and the vector base amplitude panning (VBAP) algorithm, and discusses several issues in sound simulation. For a large distributed virtual battlefield environment, a real-time 3D spatial sound system under the DIS/HLA architecture was constructed and implemented, and several issues in sound display with the system's spatial loudspeaker array are analyzed. The concept of a bounding sphere for auditory information display is proposed, followed by a fast retrieval algorithm for sound-source targets within a bounding sphere centered on the virtual observer/listener, which solves the problem of displaying large numbers of entity sound sources in the region of interest in real time in distributed interactive simulation. The method has been applied in a practical system with good results.
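For reference, vector base amplitude panning places a virtual source between a pair of loudspeakers by solving a small linear system for their gains. A minimal two-dimensional sketch, with speaker angles and names as illustrative assumptions (the system's actual loudspeaker-array configuration is not given in the abstract):

```python
# 2-D VBAP: express the source direction as a linear combination of the two
# flanking loudspeaker directions; the coefficients are the speaker gains.
import numpy as np

def vbap_2d(source_deg, spk_a_deg, spk_b_deg):
    """Gains (g_a, g_b) for a source between two speakers, power-normalized."""
    def unit(deg):
        r = np.radians(deg)
        return np.array([np.cos(r), np.sin(r)])
    L = np.column_stack((unit(spk_a_deg), unit(spk_b_deg)))   # speaker base matrix
    g = np.linalg.solve(L, unit(source_deg))                  # solve L @ g = p
    return g / np.linalg.norm(g)                              # constant-power scaling

# Example: a source at 20 degrees between speakers at 0 and 45 degrees.
gains = vbap_2d(20.0, 0.0, 45.0)
```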

9.
Methods of sonification based on the design and control of sound synthesis are presented in this paper. The semiotics of isolated sounds was investigated through fundamental studies using a combined acoustical and brain imaging (event-related potentials) approach. The perceptual cues (known as invariants) responsible for the evocations elicited by the sounds generated by impacts, moving sound sources, dynamic events and vehicles (car-door closing and car engine noise) were then identified based on physical and perceptual considerations. Lastly, some examples of high-level control of a synthesis process simulating immersive 3-D auditory scenes, interacting objects and evoked dynamics are presented.

10.
In this study, we propose a novel end-to-end system called Human–Machine Collaborative Inspection (HMCI) to enable collaboration between inspectors wearing Mixed Reality (MR) headsets and a robotic data collection platform (robot) for structural inspections. We utilize the MR headset's holographic display and precise head tracking to allow inspectors to visualize and localize information (e.g., structural defects) on the real scene, gathered by the robot and processed by an offsite computational server. The primary use case of HMCI is to enable the inspector to visualize, supervise, and improve the results produced by automated defect detection algorithms in near real time. The HMCI workflow starts with the robot collecting images and depth data to generate 3D maps of the site. A technique called single-shot localization is developed to create visual anchors for real-time spatial alignment between the robot and the MR headset. The 3D map and images are then sent to the computational server for analysis to detect defects and their locations. The resulting information is received by the MR headset and overlaid on the actual scene to visualize it with spatial context. An experimental study is conducted in a lab environment to demonstrate HMCI using a Microsoft HoloLens 2 (HL2) as the MR headset and a Turtlebot2 as the robot. We start with the reconstruction of a 3D environment using a 3D depth sensor (Azure Kinect) on the Turtlebot2 and visually detect fiducial markers as regions of interest (replicating structural damage) along a predefined inspection path. Then, the regions of interest are successfully anchored to the real scene and visualized through the HL2. To our knowledge, HMCI is one of the first human–machine collaborative systems integrating robots and inspectors with MR headsets that has been developed, tested, and presented for structural inspection.
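Registering the headset to the robot's map through a commonly observed anchor reduces to composing two rigid transforms. A minimal sketch under our own assumptions (4x4 homogeneous matrices and invented names; the paper's single-shot localization pipeline involves more than this step):

```python
# Frame registration through a shared visual anchor: both the robot map and
# the MR headset observe the marker, so their frames can be chained together.
import numpy as np

def align_headset_to_map(T_map_marker, T_headset_marker):
    """Return T_map_headset from the two observed marker poses (4x4 matrices)."""
    return T_map_marker @ np.linalg.inv(T_headset_marker)

def to_map(T_map_headset, point_headset):
    """Transform a 3-D point from headset coordinates into map coordinates."""
    p = np.append(point_headset, 1.0)       # homogeneous coordinates
    return (T_map_headset @ p)[:3]
```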

11.
To make robot human–machine interaction more natural, a contactless interaction system based on multiple sensors is proposed; the system teleoperates a robot by detecting changes in the operator's hand gestures and in hand position and orientation. An electromyography sensor was developed to acquire surface EMG signals from a pair of antagonist muscles on the arm, which are used to infer some of the operator's hand actions; a Kinect motion-sensing device and an inertial measurement unit acquire the arm's 3D position and attitude angles. The hand actions, positions, and orientations are sent over the network to the robot control system to control the robot. By combining the strengths of multiple sensors, the system greatly reduces the restrictions that traditional contact-based interaction places on the operator's range of motion and achieves natural interaction; experiments demonstrate its effectiveness.
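A common way to turn antagonist-muscle sEMG into coarse hand actions is to rectify and smooth each channel and compare the envelopes. The sketch below is illustrative only; the thresholds, window length, and labels are our assumptions, not the authors' classifier.

```python
# Coarse hand-action detection from a flexor/extensor sEMG pair:
# rectify, moving-average, then compare the latest envelope values.
import numpy as np

def emg_envelope(raw, win=128):
    """Moving-average envelope of a rectified sEMG channel."""
    return np.convolve(np.abs(raw), np.ones(win) / win, mode="same")

def classify_hand_action(flexor, extensor, on=0.3):
    """'grasp' if the flexor dominates, 'release' if the extensor does."""
    f = emg_envelope(flexor)[-1]
    e = emg_envelope(extensor)[-1]
    if f > on and f > e:
        return "grasp"
    if e > on and e > f:
        return "release"
    return "rest"
```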

12.
In this paper, we compare four different auditory displays in a mobile audio-augmented reality environment (a sound garden). The auditory displays varied in their use of non-speech audio (Earcons) as auditory landmarks and in 3D audio spatialization, and the goal was to test the user experience of discovery in a purely exploratory environment that included multiple simultaneous sound sources. We present quantitative and qualitative results from an initial user study conducted in the Municipal Gardens of Funchal, Madeira. Results show that spatial audio together with Earcons allowed users to explore multiple simultaneous sources and had the added benefit of increasing the level of immersion in the experience. In addition, spatial audio encouraged a more exploratory and playful response to the environment. An analysis of the participants' logged data suggested that the level of immersion can be related to increased instances of stopping and scanning the environment, which can be quantified in terms of walking speed and head movement.

13.
《Advanced Robotics》2013,27(3):289-304
Using our acoustical telepresence robot, TeleHead, we have so far confirmed that not only stationary binaural features but also dynamic cues from head movement play important roles in sound localization. In this study, aiming towards an ideal acoustical telepresence robot, we clarify the relation between head movement and sound localization accuracy through localization experiments. We examined two factors related to head movement that should affect localization accuracy: observation from multiple postures and dynamic information during head movement. The results suggest that both factors improve localization accuracy. Moreover, even when only one of these factors is available, localization accuracy is almost the same as the subject's original accuracy. The results confirm that even under poor communication, control and head-shape conditions, synchronization of head movement is important for building an acoustical telepresence robot. They also point to the possibility of building an acoustical telepresence robot with a dummy head of a generic shape, which is meaningful from an engineering standpoint. In addition, this suggests the strong robustness of the human sound localization function.

14.
The present work describes strategies for allowing an autonomous mobile robot to maneuver in an unknown environment. These strategies are based on fuzzy control rules designed to correspond to the linguistic control rules of human operators. They are transformed into algorithms that control the steering and speed of a mobile robot. Perception of the environment is simulated by means of ultrasonic sensors.
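Such linguistic rules ("if an obstacle is near on the left, steer right") reduce to membership functions plus weighted-average defuzzification. A toy sketch, with membership breakpoints and rule weights invented for illustration:

```python
# Two-rule fuzzy steering from a pair of ultrasonic ranges (metres).
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_steer(left_m, right_m):
    """Steering in [-1, 1]; positive steers right, away from the left obstacle."""
    near_l = tri(left_m, 0.0, 0.2, 1.0)      # "obstacle near on the left"
    near_r = tri(right_m, 0.0, 0.2, 1.0)     # "obstacle near on the right"
    num = near_l * 1.0 + near_r * (-1.0)     # rule consequents: steer right / left
    den = near_l + near_r
    return num / den if den > 1e-9 else 0.0  # centroid-style weighted average
```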

15.
《Advanced Robotics》2013,27(1-2):183-208
Classical mapping methods were developed for anthropomorphic robot hands; they are normally not applicable, or cannot provide satisfactory performance, when the robot hand is non-anthropomorphic. This paper presents a virtual circle mapping method for three-fingered non-anthropomorphic robot hands. The basic idea is to express the operator's motion by a virtual circle determined by the three fingertips. Four sets of parameters describe the circle: radius, central angles, center and orientation. By transforming these four sets of parameters from the human frame to the robot frame, the relative positions of the fingertips are conveyed. The robot fingertip positions are then computed from the transformed parameters. The concept is introduced and implemented on a specific three-fingered non-anthropomorphic robot hand. Simulation results and experiments show that the proposed method obtains better workspace matching and thus lets the operator tele-control the robot hand more intuitively.
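The circle through three fingertip points is the circumcircle of the triangle they span, which has a standard closed form; the central-angle parameters would then come from projecting the fingertips onto this circle. A minimal sketch of extracting the center, radius, and orientation (all names are ours):

```python
# Circumcircle of three 3-D fingertip positions -> (center, radius, normal).
import numpy as np

def fingertip_circle(p1, p2, p3):
    u, v = p2 - p1, p3 - p1
    w = np.cross(u, v)                      # normal direction of the finger plane
    if np.linalg.norm(w) < 1e-12:
        raise ValueError("fingertips are collinear; no unique circle")
    # Standard circumcenter formula for the triangle (p1, p2, p3):
    center = p1 + (np.dot(u, u) * np.cross(v, w)
                   + np.dot(v, v) * np.cross(w, u)) / (2.0 * np.dot(w, w))
    return center, np.linalg.norm(center - p1), w / np.linalg.norm(w)
```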

16.
In this work, we describe an autonomous mobile robotic system for finding, investigating, and modeling ambient noise sources in the environment. The system has been fully implemented in two different environments, using two different robotic platforms and a variety of sound source types. Making use of a two-step approach to autonomous exploration of the auditory scene, the robot first quickly moves through the environment to find and roughly localize unknown sound sources using the auditory evidence grid algorithm. Then, using the knowledge gained from the initial exploration, the robot investigates each source in more depth, improving upon the initial localization accuracy, identifying volume and directivity, and, finally, building a classification vector useful for detecting the sound source in the future.
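The auditory evidence grid accumulates bearing-only evidence in an occupancy-style map: cells repeatedly intersected by measured sound bearings rise above a detection threshold. The sketch below is a simplified illustration of the idea; the grid size, evidence increments, and ray marching are our choices, not the published algorithm's details.

```python
# Simplified auditory evidence grid: add log-odds evidence along each
# measured bearing ray; persistent peaks mark likely source locations.
import numpy as np

CELL = 0.1                                  # cell size (m)
GRID = np.zeros((100, 100))                 # log-odds evidence, 10 m x 10 m
HIT, MISS = 0.4, -0.05                      # evidence on / off the bearing ray

def update_grid(robot_xy, bearing_deg, max_range=5.0):
    GRID[:, :] += MISS                      # mild global decay per measurement
    th = np.radians(bearing_deg)
    for r in np.arange(0.2, max_range, CELL):           # march along the bearing
        i = int((robot_xy[0] + r * np.cos(th)) / CELL)
        j = int((robot_xy[1] + r * np.sin(th)) / CELL)
        if 0 <= i < GRID.shape[0] and 0 <= j < GRID.shape[1]:
            GRID[i, j] += HIT - MISS        # net HIT for cells on the ray

def likely_sources(threshold=2.0):
    return np.argwhere(GRID > threshold) * CELL         # approximate positions (m)
```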

17.
《Ergonomics》2012,55(11):1471-1484
The current study applied Structural Equation Modelling to analyse the relationships among pitch, loudness, tempo and timbre and their relationship with perceived sound quality. Twenty-eight auditory signals (horn, indicator, door-open warning and parking sensor) were collected from 11 car brands. Twenty-one experienced drivers were recruited to evaluate all sound signals on 11 semantic differential scales. The results indicate that for the continuous sounds, pitch, loudness and timbre each had a direct impact on perceived quality; pitch additionally influenced loudness perception. For the intermittent sounds, tempo and timbre each had a direct impact on perceived quality. These results can help to identify the psychoacoustic attributes affecting consumers' quality perception and to design preferable sounds for vehicles. Finally, a design guideline for the development of auditory signals is proposed that incorporates the current study's findings as well as those of other relevant research.

Practitioner Summary: This study applied Structural Equation Modelling to analyse the relationships among pitch, loudness, tempo and timbre and their relationship with perceived sound quality. The results can help to identify the psychoacoustic attributes affecting consumers' quality perception and to design preferable sounds for vehicles.

18.
Human-robot collaborative (HRC) assembly combines the advantages of a robot's operational consistency with a human's cognitive ability and adaptability, providing an efficient and flexible way to handle complex assembly tasks. In HRC assembly, the robot needs to understand the operator's intention accurately to assist with collaborative assembly tasks. At present, operator intention recognition that considers context information, such as assembly objects in a complex environment, remains challenging. In this paper, we propose a human-object integrated approach for context-aware assembly intention recognition in HRC, which integrates the recognition of assembly actions and assembly parts to improve the accuracy of operator intention recognition. Specifically, considering the real-time requirements of HRC assembly, a spatial-temporal graph convolutional network (ST-GCN) model based on skeleton features is used to recognize assembly actions while discarding unnecessary redundant information. Considering the disorder and occlusion of assembly parts, an improved YOLOX model is proposed to improve the network's focus on assembly parts that are difficult to recognize. Afterwards, taking decelerator assembly tasks as an example, a rule-based reasoning method that combines the recognition results for assembly actions and assembly parts is designed to infer the current assembly intention (see the sketch below). Finally, the feasibility and effectiveness of the proposed approach for recognizing human intentions are verified. Integrating assembly action recognition and assembly part recognition facilitates accurate operator intention recognition in complex and flexible HRC assembly environments.
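A minimal sketch of such rule-based fusion, gating on both recognizers' confidences; the rule table, labels, and threshold are invented examples, not the paper's.

```python
# Map (action, part) pairs to intention labels, only when both the action
# recognizer (e.g. ST-GCN) and the part detector (e.g. YOLOX) are confident.
RULES = {
    ("grasp", "gear"):  "install_gear",
    ("grasp", "bolt"):  "fasten_housing",
    ("align", "shaft"): "insert_shaft",
}

def infer_intention(action, part, action_conf, part_conf, min_conf=0.6):
    """Return an intention label, or flag low-confidence / unmatched inputs."""
    if min(action_conf, part_conf) < min_conf:
        return "uncertain"
    return RULES.get((action, part), "unknown")
```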

19.
In a simulated air traffic control task, improvement in the detection of auditory warnings when using virtual 3-D audio depended on the spatial configuration of the sounds. Performance improved substantially when two of four sources were placed to the left and the remaining two were placed to the right of the participant. Surprisingly, little or no benefit was observed for configurations involving the elevation or transverse (front/back) dimensions of virtual space, suggesting that position on the interaural (left/right) axis is the crucial factor to consider in auditory display design. The relative importance of interaural spacing effects was corroborated in a second, free-field (real space) experiment. Two additional experiments showed that (a) positioning signals to the side of the listener is superior to placing them in front even when two sounds are presented in the same location, and (b) the optimal distance on the interaural axis varies with the amplitude of the sounds. These results are well predicted by the behavior of an ideal observer under the different display conditions. This suggests that guidelines for auditory display design that allow for effective perception of speech information can be developed from an analysis of the physical sound patterns.

20.
Acoustic event detection refers to the process of detecting and labeling segments with clear semantics in a continuous audio stream. It is an important foundation for machines to recognize and semantically understand environmental sound scenes, and it will play an important role in areas such as semantic understanding of the sound environment for future humanoid robots and acoustic perception of the driving surroundings for autonomous vehicles. Starting from the development of related fields and from application needs, this paper reviews the history of acoustic event detection, introduces representative research, and analyzes future directions. In the analysis of related fields, the focus is on speech recognition, computational music processing, and auditory-model-based sound processing; on the application side, work on machine perception of environmental sounds and on multimedia information retrieval is introduced. Finally, the current state of research in this field is analyzed and its future trends are discussed.
