Similar literature
 Found 20 similar documents; search time 31 ms
1.
A reinforcement agent for object segmentation in ultrasound images   Cited by: 1 (self-citations: 0, citations by others: 1)
The principal contribution of this work is to design a general framework for an intelligent system to extract one object of interest from ultrasound images. This system is based on reinforcement learning. The input image is divided into several sub-images, and the proposed system finds the appropriate local values for each of them so that it can extract the object of interest. The agent learns from a set of images and their ground-truth (manually segmented) versions. A reward function is employed to measure the similarities between the output and the manually segmented images, and to provide feedback to the agent. The information obtained can be used as valuable knowledge stored in the Q-matrix. The agent can then use this knowledge for new input images. The experimental results for prostate segmentation in trans-rectal ultrasound images show the high potential of this approach in the field of ultrasound image segmentation.

2.
We propose a method for automatic extraction and labeling of semantically meaningful image objects using “learning by example” and threshold-free multi-level image segmentation. The proposed method scans through images, each of which is pre-segmented into a hierarchical uniformity tree, to seek and label objects that are similar to an example object presented by the user. By representing images with stacks of multi-level segmentation maps, objects can be extracted in the segmentation map level with adequate detail. Experiments have shown that the proposed multi-level image segmentation results in a significant reduction in computational complexity for object extraction and labeling (compared to a single fine-level segmentation) by avoiding unnecessary tests of combinations in finer levels. The multi-level segmentation-based approach also achieves better accuracy in detection and labeling of small objects.

3.
Elevator Group Control Using Multiple Reinforcement Learning Agents   Cited by: 22 (self-citations: 0, citations by others: 22)
Crites, Robert H.; Barto, Andrew G. Machine Learning, 1998, 33(2-3): 235-262
Recent algorithmic and theoretical advances in reinforcement learning (RL) have attracted widespread interest. RL algorithms have appeared that approximate dynamic programming on an incremental basis. They can be trained on the basis of real or simulated experiences, focusing their computation on areas of state space that are actually visited during control, making them computationally tractable on very large problems. If each member of a team of agents employs one of these algorithms, a new collective learning algorithm emerges for the team as a whole. In this paper we demonstrate that such collective RL algorithms can be powerful heuristic methods for addressing large-scale control problems. Elevator group control serves as our testbed. It is a difficult domain posing a combination of challenges not seen in most multi-agent learning research to date. We use a team of RL agents, each of which is responsible for controlling one elevator car. The team receives a global reward signal which appears noisy to each agent due to the effects of the actions of the other agents, the random nature of the arrivals and the incomplete observation of the state. In spite of these complications, we show results that in simulation surpass the best of the heuristic elevator control algorithms of which we are aware. These results demonstrate the power of multi-agent RL on a very large scale stochastic dynamic optimization problem of practical utility.

4.
Learning policies for single machine job dispatching   Cited by: 3 (self-citations: 0, citations by others: 3)
Reinforcement learning (RL) has received some attention in recent years from agent-based researchers because it deals with the problem of how an autonomous agent can learn to select proper actions for achieving its goals through interacting with its environment. Each time after an agent performs an action, the environment's response, as indicated by its new state, is used by the agent to reward or penalize its action. The agent's goal is to maximize the total amount of reward it receives over the long run. Although there have been several successful examples demonstrating the usefulness of RL, its application to manufacturing systems has not been fully explored. In this study, a single machine agent employs the Q-learning algorithm to develop a decision-making policy on selecting the appropriate dispatching rule from among three given dispatching rules. The system objective is to minimize mean tardiness. This paper presents a factorial experiment design for studying the settings used to apply Q-learning to the single machine dispatching rule selection problem. The factors considered in this study include two related to the agent's policy table design and three for developing its reward function. This study not only investigates the main effects of this Q-learning application but also provides recommendations for factor settings and useful guidelines for future applications of Q-learning to agent-based production scheduling.
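
The rule-selection idea in the abstract above can be illustrated with a minimal tabular Q-learning sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the three dispatching rules (so placeholder names are used), does not describe the state encoding, and does not give its reward function; the negative-tardiness reward and the values of `alpha` and `gamma` are hypothetical.

```python
from collections import defaultdict

# Placeholder names: the abstract does not say which three rules were used.
RULES = ("rule_0", "rule_1", "rule_2")


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One standard Q-learning update; returns the new Q(state, action)."""
    # Value of the best dispatching rule in the state reached afterwards.
    best_next = max(q[(next_state, rule)] for rule in RULES)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# The Q-table starts at zero for every (state, rule) pair.
q = defaultdict(float)

# Hypothetical transition: the state labels and the reward (negative tardiness
# of the dispatched job) are assumptions, not taken from the paper.
value = q_update(q, state="short_queue", action="rule_1",
                 reward=-2.0, next_state="long_queue")
print(value)
```

Penalising tardiness through a negative reward is one simple way to align "maximize total reward" with "minimize mean tardiness"; the paper itself studies three reward-function factors that this sketch does not model.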

5.
This paper presents a framework called Cresceptron for view-based learning, recognition and segmentation. Specifically, it recognizes and segments image patterns that are similar to those learned, using a stochastic distortion model and view-based interpolation, allowing other view points that are moderately different from those used in learning. The learning phase is interactive. The user trains the system using a collection of training images. For each training image, the user manually draws a polygon outlining the region of interest and types in the label of its class. Then, from the directional edges of each of the segmented regions, the Cresceptron uses a hierarchical self-organization scheme to grow a sparsely connected network automatically, adaptively and incrementally during the learning phase. At each level, the system detects new image structures that need to be learned and assigns a new neural plane for each new feature. The network grows by creating new nodes and connections which memorize the new image structures and their context as they are detected. Thus, the structure of the network is a function of the training exemplars. The Cresceptron incorporates both individual learning and class learning; with the former, each training example is treated as a different individual while with the latter, each example is a sample of a class. In the performance phase, segmentation and recognition are tightly coupled. No foreground extraction is necessary, which is achieved by backtracking the response of the network down the hierarchy to the image parts contributing to recognition. Several stochastic shape distortion models are analyzed to show why multilevel matching such as that in the Cresceptron can deal with more general stochastic distortions that a single-level matching scheme cannot. 
The system is demonstrated using images from broadcast television and other video segments to learn faces and other objects, and then later to locate and to recognize similar, but possibly distorted, views of the same objects.

6.
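He Yi, Lu Shujuan, Mei Xue. Computer Engineering (《计算机工程》), 2009, 35(23): 214-216
To address the segmentation of images that contain multiple objects, or whose object gray levels are close to the background gray level, this paper draws on the attention mechanism and multi-resolution nature of the human visual system and proposes a level set image segmentation method based on region-of-interest extraction within a multi-scale framework. Saliency features are applied to the low-frequency component of the wavelet transform of the original image to extract regions of interest, dividing the image domain into several region-of-interest subregions and one background subregion. Within each object subregion, curve evolution is performed with the C-V model, and the segmentation results of the subregions are then combined. Simulation results show that the algorithm can effectively segment multi-object images.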

7.
In this work we investigate the use of a reinforcement learning (RL) framework for the autonomous navigation of a group of mini-robots in a multi-agent collaborative environment. Each mini-robot is driven by inertial forces provided by two vibration motors that are controlled by a simple and efficient low-level speed controller. The action of the RL agent is the direction of each mini-robot, and it is based on the position of each mini-robot, the distance between them and the sign of the distance gradient between each mini-robot and the nearest one. Each mini-robot is considered a moving obstacle that must be avoided by the others. We propose a suitable state space and reward function that result in an efficient collaborative RL framework. The classical and the double Q-learning algorithms are employed; the latter is used to learn optimal policies for the mini-robots and offers a more stable and reliable learning process. A simulation environment that includes a group of four mini-robots is created using the ROS framework. The dynamic model of each mini-robot and of the vibration motors is also included. Several application scenarios are simulated and the results are presented to demonstrate the performance of the proposed approach.
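
As a rough illustration of the double Q-learning algorithm named in the abstract above, the sketch below shows the standard two-table update: one table chooses the greedy next action and the other table evaluates it. The action names, state labels, and the values of `alpha` and `gamma` are hypothetical; the paper's actual state space and reward function are not given in the abstract. The `update_a` argument is added only so the example can be run deterministically.

```python
import random
from collections import defaultdict

# Hypothetical heading choices for one mini-robot (not taken from the paper).
ACTIONS = ("north", "east", "south", "west")


def double_q_update(q_a, q_b, state, action, reward, next_state,
                    alpha=0.1, gamma=0.9, update_a=None):
    """One double Q-learning update; returns the updated table entry."""
    # Normally a fair coin decides which table learns on this step.
    if update_a is None:
        update_a = random.random() < 0.5
    learn, evaluate = (q_a, q_b) if update_a else (q_b, q_a)
    # The learning table picks the greedy next action ...
    best = max(ACTIONS, key=lambda a: learn[(next_state, a)])
    # ... and the other table supplies its value, which reduces the
    # overestimation that a single max over noisy estimates produces.
    target = reward + gamma * evaluate[(next_state, best)]
    learn[(state, action)] += alpha * (target - learn[(state, action)])
    return learn[(state, action)]


q_a, q_b = defaultdict(float), defaultdict(float)
q_a[("s1", "east")] = 1.0  # table A prefers "east" in state s1
q_b[("s1", "east")] = 0.5  # table B holds a lower estimate for the same pair

new_value = double_q_update(q_a, q_b, "s0", "north", reward=1.0,
                            next_state="s1", update_a=True)
print(new_value)
```

At decision time the two tables are commonly summed or averaged before taking the greedy action; that step is omitted here.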

8.
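Design and convergence analysis of heuristic reward functions in reinforcement learning algorithms   Cited by: 3 (self-citations: 0, citations by others: 3)
(Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016)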

9.
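Zhang Qiang, Qin Bo. Computer Systems & Applications (《计算机系统应用》), 2015, 24(10): 212-216
To address the unsatisfactory image segmentation results encountered in capsule defect inspection, a capsule image segmentation algorithm based on region features is proposed. The original image is first divided into five sub-images, and the capsule is then segmented and extracted within each sub-image. For each sub-image, highlight regions are first processed to remove specular highlights and noise. Each row of the image region is then treated as a subregion. Based on the positional characteristics of the capsule within the image region, different types of subregions are identified by checking whether a boundary point exists between the chain-plate domain and the background domain within the subregion, and whether the capsule is connected to the chain teeth on the chain plate. The boundary between capsule and non-capsule regions in each subregion is located, and the non-capsule region is removed. Once the image region has been scanned and processed row by row, the capsule is extracted from the image. Experiments show that, compared with traditional methods, the algorithm is faster and also improves accuracy and robustness.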

10.
NeTra: A toolbox for navigating large image databases   Cited by: 17 (self-citations: 0, citations by others: 17)
We present here an implementation of NeTra, a prototype image retrieval system that uses color, texture, shape and spatial location information in segmented image regions to search and retrieve similar regions from the database. A distinguishing aspect of this system is its incorporation of a robust automated image segmentation algorithm that allows object- or region-based search. Image segmentation significantly improves the quality of image retrieval when images contain multiple complex objects. Images are segmented into homogeneous regions at the time of ingest into the database, and image attributes that represent each of these regions are computed. In addition to image segmentation, other important components of the system include an efficient color representation, and indexing of color, texture, and shape features for fast search and retrieval. This representation allows the user to compose interesting queries such as “retrieve all images that contain regions that have the color of object A, texture of object B, shape of object C, and lie in the upper of the image”, where the individual objects could be regions belonging to different images. A Java-based web implementation of NeTra is available at http://vivaldi.ece.ucsb.edu/Netra.

11.
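Objective: Rapid and accurate segmentation of pulmonary anatomical structures from images makes the relationships among those structures clear and intuitive, provides effective and objective information to aid diagnosis, and greatly improves physicians' image-reading efficiency while reducing their workload. As image segmentation algorithms have advanced, more and more methods have been applied to segmenting anatomical regions of interest in lung images, yet an image dataset covering multiple fine pulmonary anatomical structures is still lacking. This paper creates a labeled lung CT/CTA (computed tomography/computed tomography angiography) image dataset to promote the development of pulmonary anatomical structure segmentation algorithms. Methods: The dataset contains 67 labeled sets of lung CT/CTA images, comprising 24 CT sets and 43 CTA sets, for a total of 26 157 slice images. Each CT/CTA set has 4 different target region categories, with labels for the bronchi, lung parenchyma, lung lobes, pulmonary arteries and pulmonary veins. Results: The dataset was used in the medical imaging challenge on pulmonary CT anatomical structure segmentation held at the 4th International Symposium on Image Computing and Digital Medicine in 2020. The challenge provided an evaluation platform for pulmonary vessels, bronchi and lung parenchyma. Segmentation and 3D reconstruction results were evaluated using the Dice coefficient, over-segmentation rate and under-segmentation rate, and by experts from medicine and the algorithm industry, with the aim of comparing how well various algorithms segment pulmonary anatomical structures. Conclusion: This paper describes in detail the lung image dataset, with its labels for anatomical structures including the bronchi, lung parenchyma, lung lobes, pulmonary arteries and pulmonary veins, together with the results of its application, providing a reference for researchers who wish to use the dataset for further study.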
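
The Dice coefficient used as an evaluation metric in the abstract above has a standard definition, Dice = 2|P ∩ T| / (|P| + |T|), where P is the predicted mask and T the reference mask. The sketch below is a minimal pure-Python illustration on toy 2D masks; the function name and the toy data are assumptions, and the challenge's own evaluation code is not described in the abstract. The over- and under-segmentation rates are not sketched because the abstract does not give the definitions it used.

```python
def dice_coefficient(pred, truth):
    """Dice overlap of two binary masks given as nested lists of 0/1."""
    flat_pred = [bool(v) for row in pred for v in row]
    flat_truth = [bool(v) for row in truth for v in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_truth))
    total = sum(flat_pred) + sum(flat_truth)
    # Convention chosen for this sketch: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy 2x3 masks (hypothetical data): 3 foreground pixels each, 2 shared.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = dice_coefficient(pred, truth)
print(round(score, 4))  # 2 * 2 / (3 + 3) = 0.6667
```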

12.
We present a novel multimodality image registration system for spinal surgery. The system comprises a surface-based algorithm that performs computed tomography/magnetic resonance (CT/MR) rigid registration and MR image segmentation in an iterative manner. The segmentation/registration process progressively refines the result of MR image segmentation and CT/MR registration. For MR image segmentation, we propose a method based on the double-front level set that avoids boundary leakages, prevents interference from other objects in the image, and reduces computational time by constraining the search space. In order to reduce the registration error from the misclassification of the soft tissue surrounding the bone in MR images, we propose a weighted surface-based CT/MR registration scheme. The resultant weighted surface is registered to the segmented surface of the CT image. Contours are generated from the reconstructed CT surfaces for subsequent MR image segmentation. This process iterates till convergence. The registration method achieves accuracy comparable to conventional techniques while being significantly faster. Experimental results demonstrate the advantages of the proposed approach and its application to different anatomies.

13.
Dyna-Q, a well-known model-based reinforcement learning (RL) method, interleaves offline simulations with action executions to update Q functions. It builds a world model that predicts the feature values of the next state and the reward function of the domain directly from data, and uses the model to train Q functions to accelerate policy learning. Tabular methods are generally used in Dyna-Q to build the model, but a tabular model needs many more samples of experience to approximate the environment concisely. In this article, an adaptive model learning method based on tree structures is presented to enhance sampling efficiency in modeling the world model. The proposed method produces simulated experiences for indirect learning, so the agent has additional experience for updating the policy. The agent works backwards from collections of state transitions and associated rewards, utilizing coarse coding to learn their definitions for the region of state space that tracks back to the precedent states. The proposed method estimates the reward and transition probabilities between states from past experience. Because the resultant tree is always concise and small, the agent can use value iteration to quickly estimate the Q-values of each action in the induced states and determine a policy. The effectiveness and generality of the method are demonstrated in two numerical simulations: a mountain car and a mobile robot in a maze. The simulation results show that the proposed method clearly improves the training rate.
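
For orientation, the sketch below shows the conventional tabular Dyna-Q loop that the abstract above takes as its starting point: a direct Q-learning update from real experience, a model update, and several planning updates replayed from the model. It is not the tree-based model proposed in the article, and it assumes a deterministic environment so that the model can store a single outcome per state-action pair. All names, state labels, and parameter values are hypothetical.

```python
import random
from collections import defaultdict


def dyna_q_step(q, model, state, action, reward, next_state, actions,
                planning_steps=5, alpha=0.1, gamma=0.95, rng=None):
    """Process one real transition, then replay simulated ones from the model."""
    rng = rng or random.Random(0)

    def update(s, a, r, s2):
        # Ordinary Q-learning backup, used for real and simulated experience.
        best_next = max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])

    update(state, action, reward, next_state)        # direct RL
    model[(state, action)] = (reward, next_state)    # model learning (tabular)
    for _ in range(planning_steps):                  # planning
        s, a = rng.choice(list(model))               # a previously seen pair
        r, s2 = model[(s, a)]
        update(s, a, r, s2)


ACTIONS = ("go", "stay")     # hypothetical action set
q = defaultdict(float)
model = {}
dyna_q_step(q, model, "s0", "go", 1.0, "s1", ACTIONS, planning_steps=3)
print(q[("s0", "go")])
```

With a single stored transition, the three planning steps simply repeat the same backup, so one real step yields four updates in total. This is the sample-efficiency gain from simulated experience that the article aims to increase with a more compact model.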

14.
Segmentation of an image into regions and the labeling of the regions is a challenging problem. In this paper, an approach that is applicable to any set of multifeature images of the same location is derived. Our approach applies to, for example, medical images of a region of the body; repeated camera images of the same area; and satellite images of a region. The segmentation and labeling approach described here uses a set of training images and domain knowledge to produce an image segmentation system that can be used without change on images of the same region collected over time. How to obtain training images, integrate domain knowledge, and utilize learning to segment and label images of the same region taken under any condition for which a training image exists is detailed. It is shown that clustering in conjunction with image processing techniques utilizing an iterative approach can effectively identify objects of interest in images. The segmentation and labeling approach described here is applied to color camera images and two other image domains are used to illustrate the applicability of the approach.

15.
Path-based relational reasoning over knowledge graphs has become increasingly popular due to a variety of downstream applications such as question answering in dialogue systems, fact prediction, and recommendation systems. In recent years, reinforcement learning (RL) based solutions for knowledge graphs have been demonstrated to be more interpretable and explainable than other deep learning models. However, the current solutions still struggle with performance issues due to incomplete state representations and large action spaces for the RL agent. We address these problems by developing HRRL (Heterogeneous Relational reasoning with Reinforcement Learning), a type-enhanced RL agent that utilizes the local heterogeneous neighborhood information for efficient path-based reasoning over knowledge graphs. HRRL improves the state representation using a graph neural network (GNN) for encoding the neighborhood information and utilizes entity type information for pruning the action space. Extensive experiments on real-world datasets show that HRRL outperforms state-of-the-art RL methods and discovers more novel paths during the training procedure, demonstrating the explorative power of our method.

16.
The Segmentation According to Natural Examples (SANE) algorithm learns to segment objects in static images from video training data. SANE uses background subtraction to find the segmentation of moving objects in videos. This provides object segmentation information for each video frame. The collection of frames and segmentations forms a training set that SANE uses to learn the image and shape properties of the observed motion boundaries. When presented with new static images, the trained model infers segmentations similar to the observed motion segmentations. SANE is a general method for learning environment-specific segmentation models. Because it can automatically generate training data from video, it can adapt to a new environment and new objects with relative ease, an advantage over untrained segmentation methods or those that require human-labeled training data. By using the local shape information in the training data, it outperforms a trained local boundary detector. Its performance is competitive with a trained top-down segmentation algorithm that uses global shape. The shape information it learns from one class of objects can assist the segmentation of other classes.

17.
This paper proposes an algorithm to deal with continuous state/action space in the reinforcement learning (RL) problem. Extensive studies have been done to solve the continuous state RL problems, but more research should be carried out for RL problems with continuous action spaces. Because RL problems are non-stationary, very large, and continuous in nature, the proposed algorithm uses two growing self-organizing maps (GSOM) to elegantly approximate the state/action space through addition and deletion of neurons. It has been demonstrated that GSOM has a better performance in topology preservation, quantization error reduction, and non-stationary distribution approximation than the standard SOM. The novel algorithm proposed in this paper attempts to simultaneously find the best representation for the state space, accurate estimation of Q-values, and appropriate representation for highly rewarded regions in the action space. Experimental results on delayed reward, non-stationary, and large-scale problems demonstrate very satisfactory performance of the proposed algorithm.

18.
We present a novel “dynamic learning” approach for an intelligent image database system to automatically improve object segmentation and labeling without user intervention, as new examples become available, for object-based indexing. The proposed approach is an extension of our earlier work on “learning by example,” which addressed labeling of similar objects in a set of database images based on a single example. The proposed dynamic learning procedure utilizes multiple example object templates to improve the accuracy of existing object segmentations and labels. Multiple example templates may be images of the same object from different viewing angles, or images of related objects. This paper also introduces a new shape similarity metric called normalized area of symmetric differences (NASD), which has desired properties for use in the proposed “dynamic learning” scheme, and is more robust against boundary noise that results from automatic image segmentation. Performance of the dynamic learning procedures has been demonstrated by experimental results.

19.
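Reinforcement learning is an important unsupervised machine learning technique. It can use rewards in an uncertain environment to discover optimal action sequences and achieve online learning in dynamic environments, and it is widely applied in agent systems. One of the difficulties in applying reinforcement learning algorithms is how to balance exploration and exploitation, that is, how to select actions. Building on Q-learning with the ε-greedy policy, a counter is introduced so that the parameter ε used in action selection can be adjusted in stages, thereby better balancing exploration and exploitation. Simulation experiments on a grid world demonstrate the effectiveness of the method.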
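
One possible reading of the counter-based ε-greedy scheme in the abstract above is sketched below: a counter tracks how many actions have been selected, and ε drops to a new value each time the counter crosses a stage boundary. The stage boundaries, the ε values, and the class name are assumptions for illustration; the abstract does not say what the counter counts or how the stages are defined.

```python
import random


class StagedEpsilonGreedy:
    """Epsilon-greedy action selection with a stage-wise epsilon schedule."""

    def __init__(self, schedule=((0, 0.5), (100, 0.2), (500, 0.05)), seed=None):
        # Each pair is (selection count at which the stage starts, epsilon).
        # These numbers are hypothetical, not taken from the paper.
        self.schedule = sorted(schedule)
        self.count = 0
        self.rng = random.Random(seed)

    def epsilon(self):
        """Return the epsilon of the latest stage the counter has reached."""
        eps = self.schedule[0][1]
        for threshold, value in self.schedule:
            if self.count >= threshold:
                eps = value
        return eps

    def select(self, q_values):
        """Pick an action index from a list of Q-values for the current state."""
        eps = self.epsilon()
        self.count += 1
        if self.rng.random() < eps:
            return self.rng.randrange(len(q_values))  # explore
        return q_values.index(max(q_values))          # exploit


policy = StagedEpsilonGreedy(seed=0)
stage_eps = [policy.epsilon()]
for _ in range(100):
    policy.select([0.0, 1.0])
stage_eps.append(policy.epsilon())
for _ in range(400):
    policy.select([0.0, 1.0])
stage_eps.append(policy.epsilon())
print(stage_eps)  # [0.5, 0.2, 0.05]
```

Early stages explore heavily while Q-values are still unreliable, and later stages mostly exploit. A per-state counter would be an equally plausible variant.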

20.
While Reinforcement Learning (RL) is not traditionally designed for interactive supervisory input from a human teacher, several works in both robot and software agents have adapted it for human input by letting a human trainer control the reward signal. In this work, we experimentally examine the assumption underlying these works, namely that the human-given reward is compatible with the traditional RL reward signal. We describe an experimental platform with a simulated RL robot and present an analysis of real-time human teaching behavior found in a study in which untrained subjects taught the robot to perform a new task. We report three main observations on how people administer feedback when teaching a Reinforcement Learning agent: (a) they use the reward channel not only for feedback, but also for future-directed guidance; (b) they have a positive bias to their feedback, possibly using the signal as a motivational channel; and (c) they change their behavior as they develop a mental model of the robotic learner. Given this, we made specific modifications to the simulated RL robot, and analyzed and evaluated its learning behavior in four follow-up experiments with human trainers. We report significant improvements on several learning measures. This work demonstrates the importance of understanding the human-teacher/robot-learner partnership in order to design algorithms that support how people want to teach and simultaneously improve the robot's learning behavior.
