Similar Literature
20 similar documents retrieved (search time: 31 ms)
1.
《Advanced Robotics》2013,27(5):527-546
Prediction of dynamic features is an important task for determining the manipulation strategies of an object. This paper presents a technique for predicting the dynamics of objects relative to the robot's motion from visual images. During the training phase, the authors use a recurrent neural network with parametric bias (RNNPB) to self-organize the dynamics of objects manipulated by the robot into the PB space. The acquired PB values, static images of objects and robot motor values are input into a hierarchical neural network that links the images to the dynamic features (PB values). The neural network extracts the prominent features that induce each object's dynamics. To predict the motion sequence of an unknown object, the static image of the object and the robot motor value are input into the neural network to calculate the PB values. Feeding these PB values into the closed-loop RNNPB recursively generates the predicted movements of the object relative to the robot's motion. Experiments were conducted with the humanoid robot Robovie-IIs pushing objects at different heights, and the resulting predictions confirmed that the technique predicts object dynamics effectively.
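As a rough illustration of the closed-loop rollout described above, the following PyTorch sketch feeds each prediction back as the next input while a fixed parametric-bias vector conditions the recurrence. All dimensions and names are illustrative assumptions, not the paper's implementation:

```python
import torch
import torch.nn as nn

class RNNPB(nn.Module):
    """Elman-style recurrent cell whose input is augmented with a
    fixed parametric-bias (PB) vector encoding one object dynamics."""
    def __init__(self, obs_dim=4, pb_dim=2, hidden_dim=32):
        super().__init__()
        self.cell = nn.RNNCell(obs_dim + pb_dim, hidden_dim)
        self.readout = nn.Linear(hidden_dim, obs_dim)

    def rollout(self, x0, pb, steps):
        """Closed-loop prediction: each output is fed back as the
        next input while the PB vector stays constant."""
        h = torch.zeros(x0.size(0), self.cell.hidden_size)
        x, preds = x0, []
        for _ in range(steps):
            h = self.cell(torch.cat([x, pb], dim=-1), h)
            x = self.readout(h)
            preds.append(x)
        return torch.stack(preds, dim=1)

# One PB vector (e.g. regressed from a static image) drives a whole
# predicted motion sequence from the initial observation.
model = RNNPB()
x0 = torch.zeros(1, 4)            # initial object/robot state
pb = torch.tensor([[0.3, -0.1]])  # hypothetical PB values
print(model.rollout(x0, pb, steps=10).shape)  # (1, 10, 4)
```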

2.
A generic, modular, neural network-based feature extraction and pattern classification system is proposed for finding essentially two-dimensional objects or object parts in digital images in a distortion-tolerant manner. The distortion tolerance is built up gradually by successive blocks in a pipeline architecture. The system consists only of feedforward neural networks, allowing efficient parallel implementation. The most time- and data-consuming stage, learning the relevant features, is wholly unsupervised and can be performed off-line. The subsequent supervised stage, where the object classes are learned, is simple and fast. Feature extraction is based on distortion-tolerant Gabor transformations, followed by minimum-distortion clustering with multilayer self-organizing maps. Because of the unsupervised learning strategy, there is no need for preclassified training samples or other explicit selection of training patterns, which allows a large amount of training material to be used in the early stages. A supervised, one-layer subspace network classifier on top of the feature extractor performs object labeling. The system has been trained with natural images providing the relevant features, with human faces and their parts used as the object classes for testing. The current experiments indicate that the feature space has sufficient resolution power for a moderate number of classes under rather strong distortions.
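The first stage of such a pipeline, the distortion-tolerant Gabor transformation, can be approximated with OpenCV's built-in Gabor kernels. A minimal sketch with assumed filter parameters (the paper's actual filter bank and pooling differ):

```python
import cv2
import numpy as np

def gabor_features(gray, ksize=31, sigma=4.0, lambd=10.0, n_orient=8):
    """Filter the image with a small bank of Gabor kernels at several
    orientations and pool the response magnitudes into one vector."""
    feats = []
    for k in range(n_orient):
        theta = k * np.pi / n_orient
        kern = cv2.getGaborKernel((ksize, ksize), sigma, theta,
                                  lambd, 0.5, 0)
        resp = cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kern)
        feats.append(np.abs(resp).mean())   # crude global pooling
    return np.array(feats)

img = np.random.randint(0, 255, (64, 64), dtype=np.uint8)
print(gabor_features(img))  # 8-dimensional orientation signature
```

In the described system, responses like these would feed the multilayer self-organizing maps rather than being pooled globally.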

3.
Objective: Existing salient object detection models can locate salient objects well, but fall short of producing complete, uniform objects with clear edges. To obtain salient objects that are uniform overall and sharply delineated, this paper proposes a salient object detection model that combines semantic assistance and edge features. Method: The model uses a purpose-built semantic-assisted feature fusion module to optimize the lateral output features of the backbone; guided by semantics, each layer selectively fuses its adjacent lower-level features, gaining sufficient structural information and strengthening the feature response in salient regions, so that uniform, complete salient objects can be detected. A dedicated edge branch network, together with the salient object features, yields precise edge features, which are fused back into the salient object features to sharpen the distinguishability of object boundary regions and thus detect clear edges. A bidirectional multi-scale module is also designed to extract multi-scale information from the network. Result: Compared with 12 popular saliency models on four common datasets — ECSSD (extended complex scene saliency dataset), DUT-O (Dalian University of Technology and OMRON Corporation), HKU-IS and DUTS — the model's maximum F-measure (MaxF) scores are 0.940, 0.795, 0.929 and 0.870, and its mean absolute error (MAE) scores are 0.041, 0.057, 0.034 and 0.043, respectively. The resulting saliency maps are closer to the ground truth, and the model achieves the best MaxF and MAE more often than the other 12 methods. Conclusion: The proposed model combining semantic assistance and edge features is highly effective. Semantic-assisted feature fusion and the introduced edge features make the detected salient objects more complete and uniform with more distinguishable edges, and multi-scale feature extraction further improves detection quality.

4.
This paper presents a novel object–object affordance learning approach that enables intelligent robots to learn the interactive functionalities of objects from human demonstrations in everyday environments. Instead of considering a single object, we model the interactive motions between paired objects in a human–object–object way. The innate interaction-affordance knowledge of the paired objects is learned from a labeled training dataset containing a set of relative motions of the paired objects, human actions, and object labels. The learned knowledge is represented with a Bayesian network, which can be used both to improve the recognition reliability of objects and human actions and to generate proper manipulation motion for a robot once a pair of objects is recognized. The paper also presents an image-based visual servoing approach that uses the learned motion features of the affordance in interaction as control goals to drive a robot through manipulation tasks.
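The recognition side of such a Bayesian network reduces to posterior inference over hidden variables given observed motion features. A toy sketch with hypothetical tables (the variable names and probabilities are invented for illustration, not taken from the paper):

```python
import numpy as np

# Hypothetical discrete tables for a tiny human-object-object net:
# hidden action A (pour, stir), observed relative-motion bin M (3 bins).
p_action = np.array([0.5, 0.5])          # prior P(A)
p_motion_given_action = np.array([       # P(M | A)
    [0.7, 0.2, 0.1],                     # pour
    [0.1, 0.3, 0.6],                     # stir
])

def posterior_action(m_obs):
    """Bayes rule: P(A | M = m) is proportional to P(M = m | A) P(A)."""
    joint = p_motion_given_action[:, m_obs] * p_action
    return joint / joint.sum()

print(posterior_action(0))   # motion bin 0 -> favors 'pour'
```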

5.
The performance of an object detection algorithm depends both on the sample distribution of the dataset and on the design of the feature extraction network. Starting from these two points, this paper first analyzes the distribution of object attributes across scales in the COCO 2017 dataset, exploring inherent factors of the dataset behind the low detection accuracy on small objects. Based on this analysis, a CP module is proposed that adjusts the dataset's small-object distribution offline: on one hand it oversamples images containing small objects, and on the other it copy-pastes the small objects within an image…
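The copy-paste idea sketched in this abstract can be illustrated in a few lines of NumPy; the thresholds, box format and paste policy below are assumptions, and a real implementation would also avoid pasting onto existing objects:

```python
import numpy as np

def copy_paste_small(img, boxes, max_copies=3, small_area=32 * 32,
                     rng=np.random.default_rng(0)):
    """Duplicate small-object crops at random locations, offline.
    boxes are (x1, y1, x2, y2); returns augmented image and boxes."""
    h, w = img.shape[:2]
    out, new_boxes = img.copy(), list(boxes)
    for (x1, y1, x2, y2) in boxes:
        bw, bh = x2 - x1, y2 - y1
        if bw * bh > small_area:
            continue                      # only paste small objects
        for _ in range(rng.integers(1, max_copies + 1)):
            nx = int(rng.integers(0, max(1, w - bw)))
            ny = int(rng.integers(0, max(1, h - bh)))
            out[ny:ny + bh, nx:nx + bw] = img[y1:y2, x1:x2]
            new_boxes.append((nx, ny, nx + bw, ny + bh))
    return out, new_boxes
```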

6.
This paper addresses a new method for combining supervised learning and reinforcement learning (RL). Applying supervised learning to robot navigation faces serious challenges such as inconsistent and noisy data, difficulty in gathering training data, and high error in the training data. RL capabilities — training from only a single scalar evaluation signal and a high degree of exploration — have encouraged researchers to use RL for robot navigation. However, RL algorithms are time-consuming and suffer from a high failure rate during the training phase. Here, we propose Supervised Fuzzy Sarsa Learning (SFSL) as a novel way of exploiting the advantages of both supervised and reinforcement learning. A zero-order Takagi–Sugeno fuzzy controller with several candidate actions per rule serves as the main module of the robot's controller, and the aim of training is to find the best action for each fuzzy rule. In the first step, a human supervisor drives an E-puck robot within the environment and training data are gathered. In the second step, as hard tuning, the training data initialize the value (worth) of each candidate action in the fuzzy rules. Afterwards, the fuzzy Sarsa learning module, a critic-only fuzzy reinforcement learner, fine-tunes the parameters of the conclusion parts of the fuzzy controller online. The proposed algorithm is used to drive the E-puck robot in an environment with obstacles. The experimental results show that the proposed approach decreases the learning time and the number of failures, and improves the quality of the robot's motion in the testing environments.
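The core update of fuzzy Sarsa can be sketched as follows: each rule stores a q-value per candidate action, the global action and Q-value are firing-strength-weighted mixtures, and the TD error is shared back among the rules in proportion to their strengths. A minimal NumPy sketch under those assumptions (not the authors' code; the demonstration-driven "hard tuning" corresponds to initializing `self.q` from the supervised data):

```python
import numpy as np

class FuzzySarsa:
    """Zero-order Takagi-Sugeno controller: each rule keeps a q-value
    per candidate action; the TD update is distributed to rules in
    proportion to their normalized firing strengths."""
    def __init__(self, n_rules, n_actions, alpha=0.1, gamma=0.95):
        self.q = np.zeros((n_rules, n_actions))  # init from demo data
        self.alpha, self.gamma = alpha, gamma

    def act(self, phi, eps=0.1, rng=np.random.default_rng()):
        # phi: normalized firing strengths, one per rule
        per_rule = self.q.argmax(axis=1)          # greedy per rule
        if rng.random() < eps:                    # occasional exploration
            per_rule = rng.integers(0, self.q.shape[1], per_rule.shape)
        q_global = (phi * self.q[np.arange(len(phi)), per_rule]).sum()
        return per_rule, q_global

    def update(self, phi, actions, q_old, reward, q_next):
        td = reward + self.gamma * q_next - q_old  # Sarsa TD error
        self.q[np.arange(len(phi)), actions] += self.alpha * td * phi
```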

7.
Hierarchical discriminant analysis for image retrieval
A self-organizing framework for object recognition is described, built on a hierarchical database structure for image retrieval. The self-organizing hierarchical optimal subspace learning and inference framework (SHOSLIF) system uses the theory of optimal linear projection for optimal feature derivation and a hierarchical structure to achieve logarithmic retrieval complexity. A space-tessellation tree is generated using the most expressive features (MEF) and the most discriminating features (MDF) at each level of the tree. The major characteristics of the analysis are: (1) it avoids the limitation of global linear features by deriving a recursively better-fitted set of features for each of the recursively subdivided sets of training samples; (2) it generates a smaller tree whose cell boundaries separate the samples along class boundaries better than principal component analysis, giving better generalization capability (i.e., a better recognition rate on a disjoint test set); (3) it accelerates retrieval by using the tree structure for data pruning, with a different set of discriminant features at each level of the tree. Perturbations in the size and position of objects in the images are handled through learning. The technique is demonstrated on a large image database of widely varying real-world objects taken in natural settings, showing its applicability under variations in position, size, and 3D orientation. This paper concentrates on the hierarchical partitioning of the feature spaces.
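The MEF/MDF projection at a tree node corresponds closely to PCA followed by LDA inside the PCA subspace. A scikit-learn sketch of that single node-level step (SHOSLIF recursively re-fits it on each subdivided sample set; data and dimensions here are toy assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def mef_mdf(X, y, n_mef=20):
    """Most expressive features (PCA), then most discriminating
    features (LDA) computed inside the PCA subspace -- the
    projection used at one node of a SHOSLIF-style tree."""
    pca = PCA(n_components=n_mef).fit(X)
    mef = pca.transform(X)
    lda = LinearDiscriminantAnalysis().fit(mef, y)
    return lda.transform(mef)          # MDF coordinates

X = np.random.rand(100, 256)           # toy image vectors
y = np.random.randint(0, 4, 100)       # 4 toy classes
print(mef_mdf(X, y).shape)             # (100, n_classes - 1)
```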

8.
9.
Camouflaged object detection (COD) aims to detect, accurately and efficiently, camouflaged objects that are highly similar to their background; such methods can support species conservation, medical lesion detection, and military surveillance, and thus have high practical value. In recent years, camouflaged object detection with deep learning has become an emerging research direction. Most existing COD algorithms, however, use a convolutional neural network (CNN) as the feature extraction backbone and, when combining multi-level features, ignore the influence of feature representation and fusion strategies on detection performance. To address the weak global feature extraction of CNN-based camouflaged object detection models, this paper proposes a Transformer-based cross-scale interactive learning method for camouflaged object detection. The model first introduces a dual-branch feature fusion module that fuses features refined by iterative attention, better combining high- and low-level features; it then introduces a multi-scale global context module that fully exploits contextual information to enhance features; finally, it proposes a multi-channel pooling module that focuses on local information of the detected object, improving detection accuracy. Experimental results on the CHAMELEON, CAMO and COD10K datasets show that, compared with current mainstream camouflaged object detection algorithms, the method produces clearer prediction maps and achieves higher accuracy.

10.
An ART2 and Madaline combined neural network is applied to predicting object motions in dynamic environments. The ART2 network extracts a set of coherent patterns of the object motion through its self-organizing, unsupervised learning; the identified patterns are passed to the Madaline network to generate a quantitative prediction of future motion states. The method requires no presumed mathematical model and is applicable to a variety of situations.
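The Madaline stage is built from adaptive linear neurons trained with the LMS (Widrow-Hoff) rule. A single-Adaline sketch predicting the next motion state from a sliding window of past states — a toy stand-in, not the paper's network (ART2 would route windows to different such units):

```python
import numpy as np

class Adaline:
    """Single adaptive linear neuron trained with the LMS
    (Widrow-Hoff) rule -- the building block of a Madaline."""
    def __init__(self, n_inputs, lr=0.01):
        self.w = np.zeros(n_inputs + 1)   # +1 for bias
        self.lr = lr

    def predict(self, x):
        return self.w[0] + self.w[1:] @ x

    def train_step(self, x, target):
        err = target - self.predict(x)
        self.w[0] += self.lr * err
        self.w[1:] += self.lr * err * x
        return err

# Predict the next position from a sliding window of 4 past ones.
traj = np.sin(np.linspace(0, 6, 200))      # toy motion pattern
net = Adaline(n_inputs=4, lr=0.05)
for t in range(4, 199):
    net.train_step(traj[t - 4:t], traj[t])
print(net.predict(traj[-5:-1]), traj[-1])  # prediction vs. truth
```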

11.
Objective: Salient object detection with fully convolutional models is mostly realized by aggregating features from different levels, and how to extract and aggregate features well remains a research challenge. Common multi-level fusion strategies include addition and concatenation, but these ignore the different receptive field sizes of the convolutional layers and the differing contributions of their feature maps to the final saliency map. This paper therefore combines channel attention and spatial attention to selectively and progressively aggregate deep and shallow feature information, handling the transfer and aggregation of multi-level features better, and proposes a new saliency detection model, AGNet (attention-guided network), which uses several attention mechanisms to weight different features and address the above problems. Method: The network consists mainly of a feature extraction module (FEM), a channel-spatial attention aggregation module (C-SAAM) and an attention residual refinement module (ARRM), and is trained by minimizing the pixel position aware (PPA) loss. C-SAAM selectively aggregates shallow edge information with deep abstract semantic features, using channel and spatial attention to keep redundant background information from contaminating the saliency map; ARRM further refines the fused output and strengthens the input of the next stage. Result: Experiments on five public datasets show that AGNet achieves the best performance on multiple metrics. On the DUT-OMRON (Dalian University of Technology-OMRON) dataset in particular, the F-measure improves by 1.9% over the second-ranked saliency model and the MAE (mean absolute error) decreases by 1.9%. The network also runs fast enough for real-time use. Conclusion: The proposed saliency detection model segments salient object regions accurately and provides clear local detail.
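The channel-then-spatial gating idea can be illustrated with a generic CBAM-style module in PyTorch; the wiring below is an assumption for illustration and is not the paper's C-SAAM:

```python
import torch
import torch.nn as nn

class ChannelSpatialGate(nn.Module):
    """Reweight a feature map first per channel, then per pixel --
    the generic mechanism behind channel-spatial attention fusion."""
    def __init__(self, ch, r=8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(ch, ch // r), nn.ReLU(), nn.Linear(ch // r, ch))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        ca = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))))   # B x C
        x = x * ca.view(b, c, 1, 1)                        # channel gate
        sa = torch.sigmoid(self.spatial(torch.cat(
            [x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sa                                      # spatial gate

feat = torch.randn(2, 64, 32, 32)
print(ChannelSpatialGate(64)(feat).shape)  # torch.Size([2, 64, 32, 32])
```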

12.
3D visual understanding aims to perceive and interpret 3D scenes intelligently, enabling deep understanding and analysis of objects, environments and dynamic changes; 3D object detection is its core, indispensable technology. To address the low detection accuracy of current 3D detection algorithms on distant and small objects, this paper proposes MIFPR, a 3D object detection method based on multimodal interactive fusion and progressive refinement. In the feature extraction stage, an adaptive gated information fusion module is first introduced: by injecting the geometric features of the point cloud into the image features, it obtains an image representation that is more robust to illumination changes. A voxel-centroid-based deformable cross-modal attention module is then proposed to drive the fusion of rich semantic features and contextual information from the image into the point cloud features. In the bounding box refinement stage, a progressive attention module is proposed that learns and aggregates features from different stages, steadily strengthening the model's ability to extract and model fine-grained features and progressively refining the boxes, thereby improving detection accuracy for distant and small objects and, in turn, the understanding of the visual scene. On the KITTI dataset, the method clearly improves detection accuracy for small objects such as Pedestrian and Cyclist over the strongest baseline, confirming its effectiveness.

13.
《Advanced Robotics》2013,27(12):1351-1367
Robot imitation is a useful and promising alternative to robot programming, and it involves two crucial issues. The first is how a robot can imitate a human whose physical structure and properties differ greatly from its own. The second is how the robot can generate various motions from a finite set of programmable patterns (generalization). This paper describes a novel approach to robot imitation based on the robot's own physical experiences. We considered the target task of moving an object on a table. For imitation, we focused on an active sensing process in which the robot acquires the relation between the object's motion and its own arm motion. For generalization, we applied the RNNPB (recurrent neural network with parametric bias) model to enable recognition and generation of imitation motions: the robot associates the arm motion that reproduces the object motion demonstrated by a human operator. Experimental results confirmed the generalization capability of the method, which enables the robot to imitate not only motions it has experienced but also unknown motions through nonlinear combination of the experienced ones.

14.
The vibration of a deformable object is often problematic during automatic handling by robot manipulators, yet humans can often handle and damp such vibration with ease. This paper presents force/torque sensor-based skills for handling deformable linear objects so as to reduce acute vibration, using simple strategies inspired by human skill that consist of one or two adjustment motions. An adjustment motion is a simple open-loop motion that can be appended to the end of any end-effector trajectory. Like an ordinary industrial robot action, it has three periods — acceleration, constant speed, and deceleration — and it starts at a predicted time close to a force/moment maximum. The predicted time for the adjustment action is generated automatically online from the vibration rhythm and the data sensed by a force/torque sensor mounted on the robot's wrist. To find the matching point between the vibrational signal of the deformable object and a template, template matching techniques including cross-correlation and minimum squared error methods are used and compared. Experiments conducted with an industrial robot test the new skills under various conditions, and the results demonstrate that an industrial robot can perform effective vibration reduction with simple strategies. © 2005 Wiley Periodicals, Inc.
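Finding the trigger point for the adjustment motion reduces to matching a template against the sensed force signal. A NumPy sketch comparing the two techniques the paper mentions — normalized cross-correlation and minimum squared error — on a toy wrist-sensor signal (all signals here are synthetic):

```python
import numpy as np

def match(signal, template):
    """Return the best-match offsets found by normalized
    cross-correlation and by minimum squared error."""
    n, m = len(signal), len(template)
    xcorr, sse = np.empty(n - m + 1), np.empty(n - m + 1)
    t = (template - template.mean()) / template.std()
    for i in range(n - m + 1):
        w = signal[i:i + m]
        xcorr[i] = ((w - w.mean()) / (w.std() + 1e-9)) @ t / m
        sse[i] = ((w - template) ** 2).sum()
    return xcorr.argmax(), sse.argmin()

t = np.linspace(0, 4 * np.pi, 400)
force = np.sin(t) + 0.05 * np.random.randn(400)  # toy wrist signal
print(match(force, np.sin(t[:80])))              # offsets from each method
```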

15.
To further improve the speed and accuracy of multi-scale object detection and to reduce the missed, false and duplicate detections that small objects tend to cause, an object detection algorithm based on an improved YOLOv3 (You Only Look Once v3) is proposed for automatic multi-scale object detection. First, the feature extraction network is restructured: an attention mechanism is introduced into the spatial dimension of the residual modules so the network attends to small objects; a densely connected network (DenseNet) is then used to fully fuse shallow network information, and depthwise separable convolutions replace the ordinary convolutions of the backbone, reducing the parameter count and increasing detection speed. In the feature fusion network, a bidirectional pyramid structure fuses deep and shallow features in both directions, and the 3-scale prediction is extended to 4 scales, improving multi-scale feature learning. For the loss function, GIoU (Generalized Intersection over Union) is adopted, raising recognition accuracy and lowering the miss rate. Experimental results show that the improved YOLOv3 algorithm reaches a mean average precision (mAP) of 83.26% on the Pascal VOC test set, 5.89 percentage points higher than the original YOLOv3, at a detection speed of 22.0 frame/s; on the COCO dataset it improves mAP by 3.28 percentage points over the original YOLOv3. The mAP also improves in multi-scale object detection, verifying the effectiveness of the algorithm.
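The GIoU criterion extends IoU with a penalty based on the smallest enclosing box, which keeps the loss informative even for non-overlapping boxes. A plain-Python sketch of the metric (the regression loss used in training is 1 − GIoU):

```python
def giou(a, b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).
    GIoU = IoU - |C \\ (A U B)| / |C|, where C is the smallest
    axis-aligned box enclosing both A and B."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    cw = max(a[2], b[2]) - min(a[0], b[0])   # enclosing box width
    ch = max(a[3], b[3]) - min(a[1], b[1])   # enclosing box height
    c_area = cw * ch
    return inter / union - (c_area - union) / c_area

print(giou((0, 0, 2, 2), (1, 1, 3, 3)))   # overlapping boxes
print(giou((0, 0, 1, 1), (2, 2, 3, 3)))   # disjoint -> negative value
```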

16.
To address the degraded localization accuracy of traditional visual SLAM in dynamic scenes, where feature mismatches are common, a semantic SLAM algorithm based on dynamic object tracking is proposed. Built on the classical visual SLAM framework, it extracts dynamic objects, tracks them across frames, and uses their pose information to assist the camera's own localization. First, during data preprocessing, the YOLACT, RAFT and SC-Depth networks extract semantic masks, optical flow vectors and per-pixel depth from the images. Next, the visual front end computes a probability map from this information — via the semantic segmentation masks, a motion consistency check and an occlusion-point check — to smoothly separate dynamic from static features in the scene. The bundle adjustment module in the back end then fuses multi-feature constraints on object motion to improve pose estimation in dynamic scenes. Finally, comparative evaluations are carried out on dynamic scenes from the KITTI and OMD datasets. Experiments show that the algorithm tracks dynamic objects accurately and achieves robust, solid localization in indoor and outdoor dynamic scenes.
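A motion consistency check of the kind mentioned here is often built on the epipolar constraint: a static point displaced only by camera motion stays near its epipolar line, so large residuals mark dynamic points. A NumPy sketch assuming a known fundamental matrix (in practice it would be estimated from the matched static features, e.g. with `cv2.findFundamentalMat`):

```python
import numpy as np

def dynamic_mask(pts1, pts2, F, thresh=1.0):
    """Flag correspondences whose point-to-epipolar-line distance
    exceeds `thresh` pixels as belonging to dynamic objects.
    pts1, pts2: (N, 2) matched pixel coords; F: 3x3 fundamental matrix."""
    ones = np.ones((len(pts1), 1))
    p1 = np.hstack([pts1, ones])        # homogeneous coords, N x 3
    p2 = np.hstack([pts2, ones])
    lines = p1 @ F.T                    # epipolar lines in image 2
    num = np.abs((lines * p2).sum(axis=1))
    den = np.linalg.norm(lines[:, :2], axis=1)
    return num / den > thresh           # True -> likely dynamic
```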

17.
Motion, as a feature of video that changes across temporal sequences, is crucial to visual understanding. Powerful video representation and extraction models typically need to focus attention on motion features in challenging dynamic environments to complete complex video understanding tasks. However, previous approaches discriminate mainly by similar features in the spatial or temporal domain, ignoring the interdependence of consecutive video frames. In this paper, we propose the motion-sensitive self-supervised collaborative network, a video representation learning framework that exploits a pretext task to assist feature comparison and strengthen the spatiotemporal discrimination power of the model. Specifically, we first propose a motion-aware module that extracts consecutive motion features from spatial regions via frame differences. A global–local contrastive module is then introduced, with context and enhanced video snippets defined as appropriate positive samples for broader feature similarity comparison. Finally, we introduce a snippet operation prediction module, which further assists contrastive learning in obtaining more reliable global semantics by sensing changes in consecutive frame features. Experimental results demonstrate that our approach extracts robust motion features effectively and achieves competitive performance against other state-of-the-art self-supervised methods on downstream action recognition and video retrieval tasks.
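The frame-difference motion cue and the contrastive comparison at the heart of such frameworks can be sketched in PyTorch; both functions below are generic illustrations (an InfoNCE-style loss), not the paper's exact modules:

```python
import torch
import torch.nn.functional as F

def frame_difference(clip):
    """clip: (B, T, C, H, W) -> coarse motion map from consecutive
    frame differences, as used by motion-aware feature extraction."""
    return (clip[:, 1:] - clip[:, :-1]).abs().mean(dim=2, keepdim=True)

def info_nce(z1, z2, tau=0.1):
    """InfoNCE between two views; matching rows are positives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / tau
    labels = torch.arange(z1.size(0))
    return F.cross_entropy(logits, labels)

clip = torch.randn(2, 8, 3, 32, 32)
print(frame_difference(clip).shape)        # (2, 7, 1, 32, 32)
print(info_nce(torch.randn(4, 128), torch.randn(4, 128)))
```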

18.
Objective: Video multiple object tracking (MOT) is an important task in computer vision. Existing work improves the detection and association stages separately, overlooking the inconsistency problems inherent in multi-object tracking. These fall into three categories: inconsistency between the center of the detection box and the center of the identity feature, inconsistency of object responses across frames, and inconsistency between the similarity metrics used in training and in testing. To resolve them, this paper proposes a multi-object tracking method based on spatiotemporal consistency that improves tracking accuracy. Method: The inconsistencies are corrected in the spatial, temporal and feature dimensions. For the detection-center/identity-center mismatch, the ReID (re-identification) feature of each object is extracted at a position offset by the spatial difference between its box center and feature center. For the inter-frame response inconsistency, spatial correlation computes motion offsets between adjacent frames; the previous frame's object responses are warped by these offsets into frame-consistent response information, which then enhances the current responses. For the train/test similarity mismatch, a feature orthogonality loss is proposed that accounts for pairwise similarity relations between objects during training. Result: The method is compared with existing approaches on three datasets. On MOT17, MOT20 and Hieve, MOTA (multiple object t…
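A feature orthogonality loss of the kind described — aligning the training objective with the cosine similarity used at test time — might look like the following PyTorch sketch; the exact formulation in the paper may well differ:

```python
import torch
import torch.nn.functional as F

def orthogonal_loss(feats, ids):
    """Penalize cosine similarity between embeddings of different
    identities so the training-time metric matches the cosine
    similarity used at test time. feats: (N, D), ids: (N,)."""
    f = F.normalize(feats, dim=1)
    sim = f @ f.T                              # pairwise cosine
    diff = ids.view(-1, 1) != ids.view(1, -1)  # different-identity mask
    return (sim[diff] ** 2).mean()

feats = torch.randn(6, 128, requires_grad=True)
ids = torch.tensor([0, 0, 1, 1, 2, 2])
print(orthogonal_loss(feats, ids))
```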

19.
《Advanced Robotics》2013,27(5):469-485
This paper presents an adaptive hybrid control approach for a robot manipulator interacting with a flexible object. Because of its flexibility, the object's dynamics influence the robot's control system, and since the object is usually a distributed parameter system, its dynamics as seen from the robot change as the robot moves. The problem is complicated enough that it is difficult to decompose the robot's position and contact force control loops. In this paper, we approximate the object's distributed parameter model by a lumped 'position state-varying' model. Then, using well-known nonlinear feedback compensation, we decompose the robot's control space into a position control subspace and an object torque control subspace. We design optimal state feedback for the position control loop and control the robot's contact force through the resultant torque of the object, applying the model-reference simple adaptive control strategy to the torque control loop. We also study how to select a reasonable reference model for this loop. Experiments with a PUMA robot interacting with an aluminum beam show the effectiveness of the approach.
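Model-reference adaptation can be illustrated with the classic MIT-rule gain update on a first-order loop; all plant and model numbers below are invented for illustration and stand in for the paper's simple adaptive control scheme:

```python
import numpy as np

# MIT-rule adaptation of a feedforward gain so a first-order
# "torque loop" tracks a reference model (illustrative numbers).
dt, gamma = 0.001, 2.0
a, b = -5.0, 2.0              # "unknown" plant: dx = a*x + b*u
a_m, b_m = -5.0, 5.0          # reference model: dx_m = a_m*x_m + b_m*r
theta, x, x_m = 0.0, 0.0, 0.0
for step in range(20000):
    r = np.sign(np.sin(0.01 * step))    # square-wave torque command
    u = theta * r                       # adapted feedforward control
    x += dt * (a * x + b * u)
    x_m += dt * (a_m * x_m + b_m * r)
    theta += dt * (-gamma * (x - x_m) * x_m)  # MIT-rule gain update
print(theta)   # approaches b_m / b = 2.5 as tracking error shrinks
```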

20.
《Advanced Robotics》2013,27(1-2):207-232
In this paper, we provide the first demonstration that a humanoid robot can learn to walk directly by imitating a human gait obtained from motion capture (mocap) data without any prior information about its dynamics model. Programming a humanoid robot to perform an action (such as walking) that takes the robot's complex dynamics into account is a challenging problem. Traditional approaches typically require highly accurate prior knowledge of the robot's dynamics and environment in order to devise complex (and often brittle) control algorithms for generating stable dynamic motion. Training with human mocap is an intuitive and flexible approach to programming a robot, but direct use of mocap data usually results in dynamically unstable motion; furthermore, optimization over high-dimensional mocap data in the humanoid full-body joint space is typically intractable. We propose a new approach to tractable imitation-based learning in humanoids without a dynamic model of the robot. We represent kinematic information from human mocap in a low-dimensional subspace and map motor commands in this low-dimensional space to sensory feedback to learn a predictive dynamic model. This model is used within an optimization framework to estimate optimal motor commands that satisfy the initial kinematic constraints as well as possible while generating dynamically stable motion. We demonstrate the viability of our approach with examples of dynamically stable walking learned from mocap data using both a simulator and a real humanoid robot.
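The low-dimensional-subspace idea can be sketched as PCA on mocap poses followed by a one-step predictive model fitted in the latent space; here a linear model stands in for the paper's learned sensorimotor dynamics, and all data are synthetic:

```python
import numpy as np
from sklearn.decomposition import PCA

# Reduce mocap poses to a low-dimensional subspace, then fit a
# one-step linear predictive model in that subspace (illustrative).
poses = np.random.randn(500, 51)        # toy 17-joint x 3D mocap frames
pca = PCA(n_components=6).fit(poses)
z = pca.transform(poses)                # latent trajectory
A, *_ = np.linalg.lstsq(z[:-1], z[1:], rcond=None)  # z_{t+1} ~ z_t A
pred = pca.inverse_transform(z[:-1] @ A)
print(np.mean((pred - poses[1:]) ** 2)) # one-step prediction error
```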
