Similar literature
 20 similar documents found (search time: 15 ms)
1.
Over the last decade, wireless capsule endoscopy (WCE) technology has become a very useful tool for diagnosing diseases within the human digestive tract. Physicians using WCE can examine the digestive tract in a minimally invasive way, searching for pathological abnormalities such as bleeding, polyps, ulcers, and Crohn's disease. To improve the effectiveness of WCE, researchers have developed software methods to automatically detect these diseases with a high rate of success. This paper proposes a novel synergistic methodology for automatically discovering polyps (protrusions) and perforated ulcers in WCE video frames. Finally, results of the methodology are given, and statistical comparisons relative to other works are presented.

2.
Tumors of the digestive tract are a common disease, and wireless capsule endoscopy (WCE) is a relatively new technology for examining diseases of the digestive tract, especially the small intestine. This paper addresses the problem of automatic tumor recognition in WCE images. A candidate color texture feature that integrates the uniform local binary pattern and wavelets is proposed to characterize WCE images. The proposed features are invariant to illumination change and describe multiresolution characteristics of WCE images. Two feature selection approaches based on the support vector machine, sequential forward floating selection and recursive feature elimination, are further employed to refine the proposed features and improve the detection accuracy. Extensive experiments validate that the proposed computer-aided diagnosis system achieves a promising tumor recognition accuracy of 92.4% in WCE images on our collected data.
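The uniform local binary pattern named in this abstract can be illustrated with a minimal sketch. The neighbour ordering, the `>=` threshold convention, and the function names `lbp_code` and `is_uniform` are assumptions made for illustration; the abstract does not specify the authors' implementation, and the wavelet and color stages are not shown.

```python
# Hypothetical sketch: 8-neighbour local binary pattern (LBP) with a uniformity test.
# Neighbour order and the >= comparison are assumptions, not taken from the paper.

# Offsets (row, col) of the 8 neighbours, listed in circular order around the centre.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the interior pixel at (row, col)."""
    centre = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
        # Set the bit when the neighbour is at least as bright as the centre.
        if image[row + dr][col + dc] >= centre:
            code |= 1 << bit
    return code


def is_uniform(code):
    """A pattern is uniform if its circular bit string has at most two 0/1 transitions."""
    transitions = 0
    for bit in range(8):
        current = (code >> bit) & 1
        following = (code >> ((bit + 1) % 8)) & 1
        transitions += current != following
    return transitions <= 2


flat_patch = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]
edge_patch = [[9, 9, 9], [1, 5, 1], [1, 1, 1]]
# Adding a constant brightness offset leaves every comparison, and so the code, unchanged.
brighter_edge_patch = [[value + 10 for value in row] for row in edge_patch]
```

Because the code depends only on comparisons with the centre pixel, a uniform brightness shift does not change it, which is one simple sense in which such features tolerate illumination change.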

3.
Since wireless capsule endoscopy (WCE) is a novel technology for recording videos of a patient's digestive tract, the problem of segmenting a WCE video into subvideos corresponding to the entrance, stomach, small intestine, and large intestine regions is not well addressed in the literature. The few papers addressing this problem follow supervised learning approaches that presume the availability of a large database of correctly labeled training samples. Considering the difficulties in procuring the sizable WCE training data sets needed for achieving high classification accuracy, we introduce in this paper an unsupervised learning approach that employs the Scale Invariant Feature Transform (SIFT) for extraction of local image features and the probabilistic latent semantic analysis (pLSA) model, used in linguistic content analysis, for data clustering. Experimental results indicate that this method compares well in classification accuracy with state-of-the-art supervised classification approaches to WCE video segmentation.

4.
Wireless capsule endoscopy (WCE) allows for comfortable video explorations of the gastrointestinal (GI) tract, with special indication for the small bowel. In the other segments of the GI tract, which are also accessible to probe gastroscopy and colonoscopy, WCE still exhibits poorer diagnostic efficacy. Its main drawback is the impossibility of controlling the capsule movement, which is randomly driven by peristalsis and gravity. To solve this problem, magnetic maneuvering has recently become an active research area. Here, we report the first demonstration of accurate robotic steering and noninvasive 3-D localization of a magnetically enabled sample of the most common video capsule (PillCam, Given Imaging Ltd, Israel) within each of the main regions of the GI tract (esophagus, stomach, small bowel, and colon) in vivo, in a domestic pig model. Moreover, we demonstrate how this is readily achievable with a robotic magnetic navigation system (Niobe, Stereotaxis, Inc, USA) already used for cardiovascular clinical procedures. The capsule was freely and safely moved with an omnidirectional steering accuracy of 1°, and was tracked in real time through fluoroscopic imaging, which also allowed for 3-D localization with an error of 1 mm. The accuracy of steering and localization enabled by the Stereotaxis system and its clinical accessibility worldwide may allow for immediate and broad usage in this new application. This anticipates magnetically steerable WCE as a near-term reality. The instrumentation should be used with the next generations of video capsules, intrinsically magnetic and capable of real-time optical-image visualization, which are expected to reach the market soon.

5.
Compression of captured video frames is crucial for saving power in wireless capsule endoscopy (WCE). A low-complexity encoder is desired to limit the power consumption required for compressing the WCE video. The distributed video coding (DVC) technique is well suited to designing a low-complexity encoder. In this technique, frames captured in the RGB colour space are converted into the YCbCr colour space. In existing DVC techniques proposed for WCE video compression, both the Y (luma) and CbCr (chroma) components of the Wyner–Ziv (WZ) frames are processed and encoded. In WCE video, consecutive frames exhibit strong similarity in texture and colour properties. The proposed work uses these properties to present a method for processing and encoding only the luma component of a WZ frame. The chroma components of the WZ frame are predicted at the decoder by an encoder–decoder based deep chroma prediction model that matches the luma and texture information of the keyframe and the WZ frame. The proposed method reduces the computations required for encoding and transmitting the WZ chroma components. The results show that the proposed DVC with a deep chroma prediction model performs better than motion JPEG and existing DVC systems for WCE, at reduced encoder complexity.
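The RGB to YCbCr conversion step mentioned in this abstract can be sketched as follows. The coefficients are the full-range BT.601 values used by JPEG/JFIF, which is an assumption: the abstract does not state which YCbCr variant the authors use, and `rgb_to_ycbcr` and `luma_only` are hypothetical names.

```python
# Hypothetical sketch: per-pixel RGB -> YCbCr conversion with full-range BT.601
# (JPEG/JFIF) coefficients. The choice of variant is an assumption.

def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr) as floats."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def luma_only(pixels):
    """Keep only the Y component of each pixel, i.e. the part a luma-only encoder would process."""
    return [rgb_to_ycbcr(r, g, b)[0] for r, g, b in pixels]


gray = rgb_to_ycbcr(128, 128, 128)   # neutral gray: chroma sits at the 128 midpoint
red = rgb_to_ycbcr(255, 0, 0)
luma_plane = luma_only([(0, 0, 0), (255, 255, 255)])
```

For a neutral gray pixel both chroma values stay at the midpoint, so all of the signal is carried by Y; this is the component the abstract says is still encoded for WZ frames.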

6.
RF localization science and technology started with global positioning systems for outdoor areas, and then transformed into wireless indoor geolocation. The next step in the evolution of this science is the transformation into RF localization inside the human body. The first major application for this technology is the localization of the wireless video capsule endoscope (VCE), which has been in the clinical arena for 12 years. While physicians can receive clear images of abnormalities in the gastrointestinal (GI) tract with VCE devices, they have little idea of their exact location inside the GI tract. To localize intestinal abnormalities, physicians routinely use radiological, endoscopic, or surgical operations. If we could use the RF signal radiated from the capsule to also locate these devices, physicians could not only discover medical problems but also learn where the problems are located. However, finding a realistic RF localization solution for the endoscopy capsule is a very challenging task, because the inside of the human body is a difficult environment for experimentation and visualization. In addition, we have no idea how the capsule moves and rotates in its 3D journey through this non-homogeneous medium for radio propagation. In this paper, we describe how we can design a cyber physical system (CPS) for experimental testing and visualization of the interior of the human body that can be used for solving the RF localization problem for the endoscopy capsule. We also address the scientific challenges we face and the appropriate technical approaches for solving this problem.

7.
In this paper, we present a hybrid camera system combining one time-of-flight depth camera and multiple video cameras to generate multi-view video sequences and their corresponding depth maps. To obtain the multi-view video-plus-depth data using the hybrid camera system, we capture multi-view videos using the multiple video cameras and a single-view depth video with the depth camera. After performing a three-dimensional (3-D) warping operation to obtain an initial depth map at each viewpoint, we refine the initial depth map using segment-based stereo matching. To reduce mismatched depth values along object boundaries, we detect the moving objects using the color difference between frames and extract occlusion and disocclusion areas with the initial depth information. Finally, we recompute the depth value of each pixel in each segment using pairwise stereo matching with a proposed cost function. Experimental results show that the proposed hybrid camera system produces multi-view video sequences with more accurate depth maps, especially along the boundaries of objects. In addition, it is suitable for generating more natural 3-D views for 3-D TV than previous works.

8.
Intensity prediction along motion trajectories removes temporal redundancy considerably in video compression algorithms. In three-dimensional (3-D) object-based video coding, both 3-D motion and depth values are required for temporal prediction. The required 3-D motion parameters for each object are found by the correspondence-based E-matrix method. The estimation of the correspondences (the two-dimensional (2-D) motion field) between the frames and the segmentation of the scene into objects are achieved simultaneously by minimizing a Gibbs energy. The depth field is estimated by jointly minimizing a defined distortion and bit-rate criterion using the 3-D motion parameters. The resulting depth field is efficient in the rate-distortion sense. Bit-rate values corresponding to the lossless encoding of the resultant depth fields are obtained using predictive coding; prediction errors are encoded by a Lempel-Ziv algorithm. The results are satisfactory for real-life video scenes.

9.
Gastro-intestinal (GI) endoscopy is a widely used clinical procedure for screening and surveillance of digestive tract diseases ranging from Barrett's Oesophagus to oesophageal cancer. The current surveillance protocol consists of periodic endoscopic examinations performed at 3-4 month intervals, including an expert's visual assessment and biopsies taken from suspicious tissue regions. The recent development of a new imaging technology, called probe-based confocal laser endomicroscopy (pCLE), has enabled the acquisition of in vivo optical biopsies without removing any tissue sample. Besides their several advantages, i.e., noninvasiveness and real-time, in vivo feedback, optical biopsies pose a new challenge for the endoscopic expert. Due to their noninvasive nature, optical biopsies do not leave any scar on the tissue, and therefore recognition of previous optical biopsy sites in surveillance endoscopy becomes very challenging. In this work, we introduce a clustering and classification framework to facilitate retargeting previous optical biopsy sites in surveillance upper GI endoscopies. A new representation of endoscopic videos based on manifold learning, "endoscopic video manifolds" (EVMs), is proposed. The low-dimensional EVM representation is adapted to facilitate two different clustering tasks, i.e., clustering of informative frames and of patient-specific endoscopic segments, only by changing the similarity measure. Each step of the proposed framework is validated on three in vivo patient datasets containing 1834, 3445, and 1546 frames, corresponding to endoscopic videos of 73.36, 137.80, and 61.84 s, respectively. Improvements achieved by the introduced EVM representation are demonstrated by quantitative analysis in comparison to the original image representation and principal component analysis. Final experiments evaluating the complete framework demonstrate the feasibility of the proposed method as a promising step for assisting the endoscopic expert in retargeting the optical biopsy sites.

10.
In this paper, we tackle the problem of matching objects in video in the framework of the rough indexing paradigm. In this context, the video data are of very low spatial and temporal resolution because they come from partially decoded MPEG compressed streams. This paradigm enables us to achieve our purpose in near real time, because computation on rough data is faster than on the original video frames at full spatial and temporal resolution. In this context, segmentation of rough video frames is inaccurate, and the region features (texture, color, shape) are not strongly relevant. The structure of the objects must be considered in order to improve the robustness of region matching. The problem of object matching can be expressed in terms of region adjacency graph (RAG) matching. Here, we propose a directed acyclic graph (DAG) matching method based on a heuristic in order to approximate object matching. The RAGs to compare are first transformed into DAGs by orienting edges. Then, we compute some combinatorial metrics on nodes in order to classify them by similarity. Finally, a top-down process on the DAGs aims to match similar patterns that exist between the two DAGs. The results are compared with those of a method based on relaxation matching.

11.
This paper proposes an architecture for a wireless endoscopy system for diagnosis of the whole human digestive tract and real-time endoscopic image monitoring. The low-power digital IC design inside the wireless endoscopic capsule is discussed in detail. A very large scale integration (VLSI) architecture with three-stage clock management is applied, which can save 46% of the power inside the capsule compared with a design without such low-power techniques. A stoppable ring crystal oscillator with minimal overhead is used in the sleep mode, which results in about 60-μW system power dissipation in sleep mode. A new image compression algorithm based on the Bayer image format and its corresponding VLSI architecture are both proposed for low power and high data volume. Thus, 8 frames per second with 320×288 pixels can be transmitted at 2 Mb/s. The digital IC design also ensures that the capsule has many flexible and useful functions for clinical application. The digital circuits were verified on field-programmable gate arrays and have been implemented in a 0.18-μm CMOS process with 6.2 mW
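The figures quoted in this abstract (8 frames per second, 320 by 288 pixels, a 2 Mb/s link) imply a minimum compression ratio, which the following back-of-the-envelope sketch works out. It assumes 8 bits per raw Bayer sample and reads 2 Mb/s as 2,000,000 bit/s; neither value is stated in the abstract, and protocol overhead is ignored.

```python
# Hypothetical arithmetic sketch: raw Bayer data rate versus link capacity.
WIDTH, HEIGHT = 320, 288        # pixels per frame (from the abstract)
FRAMES_PER_SECOND = 8           # from the abstract
LINK_BITS_PER_SECOND = 2_000_000  # "2 Mb/s", read as decimal megabits (assumption)
BITS_PER_SAMPLE = 8             # assumption: one 8-bit colour sample per Bayer pixel


def raw_bit_rate(width, height, fps, bits_per_sample):
    """Uncompressed data rate in bits per second."""
    return width * height * fps * bits_per_sample


def required_compression_ratio(raw_rate, link_rate):
    """Minimum ratio by which the raw stream must shrink to fit the link."""
    return raw_rate / link_rate


raw_rate = raw_bit_rate(WIDTH, HEIGHT, FRAMES_PER_SECOND, BITS_PER_SAMPLE)
ratio = required_compression_ratio(raw_rate, LINK_BITS_PER_SECOND)
print(f"raw rate: {raw_rate} bit/s, required compression ratio: {ratio:.2f}")
```

Under these assumptions the raw stream is about 5.9 Mb/s, so the compressor would need to reduce the data by roughly a factor of three.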

12.
In this paper, we propose a new bi-directional 2-D mesh representation of video objects, which utilizes forward and backward reference frames (keyframes). This framework extends the previous uni-directional mesh representation to enable efficient rendering, editing, and superresolution of video objects in the presence of occlusion by allowing bi-directional texture mapping as in MPEG B-frames. The video object of interest is tracked between two successive keyframes (which can be automatically or interactively selected) both in forward and backward directions. Keyframes provide the texture of the video object, whereas its motion is modeled by forward and backward 2-D meshes. In addition, we employ “validity maps”, associated with each 2-D mesh, which allow selective texture mapping from the keyframes. Experimental results for efficient video object editing and object-based video resolution enhancement in the presence of self-occlusion are presented to demonstrate the effectiveness of the proposed representation.

13.
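Based on an analysis of the current state of development of wireless endoscopes for the digestive tract, a completely new bidirectional, digital miniature wireless endoscope system design is proposed. The system supports real-time viewing of patient images, examination of the entire digestive tract, and provision of three-dimensional depth image data. Each hardware module of the design and its key technologies are discussed in detail, and an FPGA verification environment was built for the system to verify the correctness of the overall design. The mixed-signal chip inside the capsule has been taped out in a 0.18 μm CMOS process.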

14.
Based on our statistical investigation of a typical three-dimensional (3-D) wavelet codec, we present a unified mathematical model to describe its operational rate-distortion (RD) behavior. The quantization distortion of the reconstructed video frames is assessed by tracking the quantization noise along the 3-D wavelet decomposition trees. The coding bit-rate is estimated for a class of embedded video coders. Experimental results show that the model captures sequence characteristics accurately and reveals the relationship between wavelet decomposition levels and the overall RD performance. After being trained with offline RD data, the model enables accurate prediction of real RD performance of video codecs and therefore can enable optimal RD adaptation of the encoding parameters according to various network conditions.

15.
3-D vision technologies are extensively used in a wide variety of applications. Glasses-based 3-D technology in particular is becoming increasingly affordable in daily life. Regarding the health issues raised by 3-D video technologies, to the best of our knowledge, most pilot studies have been carried out only in highly controlled laboratory environments. In this paper, we present NeuroGlasses, a nonintrusive wearable physiological signal monitoring system to facilitate health analysis and diagnosis of 3-D video watchers. The NeuroGlasses system acquires health-related signals with physiological sensors and provides feedback on health-related features. Moreover, the NeuroGlasses system employs signal-specific reconstruction and feature extraction to compensate for the signal distortion caused by variation in the placement of the sensors. We also propose a server-based NeuroGlasses infrastructure in which physiological features can be extracted for real-time response or collected on the server side for long-term analysis and diagnosis. Through an on-campus pilot study, the experimental results show that the NeuroGlasses system can effectively provide physiological information for healthcare purposes. Furthermore, the results indicate that 3-D vision technology has a significant impact on physiological signals such as EEG, which could potentially lead to neural diseases.

16.
In this work, we implement a real-time visual tracker that targets the position and 3D pose of objects in video sequences, specifically faces. The use of stream processors for the computations and efficient sparse-template-based particle filtering allows us to achieve real-time processing even when tracking multiple objects simultaneously in high-resolution video frames. Stream processing is a relatively new computing paradigm that permits the expression and execution of data-parallel algorithms with great efficiency and minimum effort. Using a GPU (graphics processing unit, a consumer-grade stream processor) and the NVIDIA CUDA™ technology, we can achieve performance improvements as large as ten times compared to a similar CPU-only tracker. At the same time, the stream processing approach opens the door to other computing devices, like the Cell/BE™ or other multicore CPUs.

17.
This paper proposes a method for progressive lossy-to-lossless compression of four-dimensional (4-D) medical images (sequences of volumetric images over time) by using a combination of the three-dimensional (3-D) integer wavelet transform (IWT) and 3-D motion compensation. A 3-D extension of the set-partitioning in hierarchical trees (SPIHT) algorithm is employed for coding the wavelet coefficients. To effectively exploit the redundancy between consecutive 3-D images, the concepts of key and residual frames from video coding are used. A fast 3-D cube matching algorithm is employed to do motion estimation. The key and the residual volumes are then coded using the 3-D IWT and the modified 3-D SPIHT. The experimental results presented in this paper show that our proposed compression scheme achieves better lossy and lossless compression performance on 4-D medical images when compared with JPEG-2000 and volumetric compression based on 3-D SPIHT.
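The property that makes an integer wavelet transform usable for lossless coding is that it maps integers to integers and can be inverted exactly. A minimal one-dimensional, single-level sketch using the integer Haar (S-transform) lifting step is given below. The choice of filter is an assumption for illustration only: the abstract does not say which integer wavelet the authors use, and their transform is three-dimensional and multi-level.

```python
# Hypothetical sketch: one level of the 1-D integer Haar (S-transform).
# Illustrates exact invertibility; not the filter or dimensionality used in the paper.

def forward_s_transform(samples):
    """Split an even-length list of ints into (approximation, detail) lists."""
    approximation, detail = [], []
    for i in range(0, len(samples), 2):
        a, b = samples[i], samples[i + 1]
        detail.append(a - b)                # integer difference
        approximation.append((a + b) >> 1)  # floored integer average
    return approximation, detail


def inverse_s_transform(approximation, detail):
    """Reconstruct the original integer samples exactly."""
    samples = []
    for s, d in zip(approximation, detail):
        a = s + ((d + 1) >> 1)  # undo the floored average using the parity of d
        b = a - d
        samples.extend([a, b])
    return samples


signal = [5, 2, 2, 5, -3, 4, 0, 0]
approx, det = forward_s_transform(signal)
restored = inverse_s_transform(approx, det)
```

The round trip recovers every sample because the bit dropped by the floored average is recoverable from the parity of the difference, so no information is lost despite the integer arithmetic.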

18.
We present a two-dimensional (2-D) mesh-based mosaic representation, consisting of an object mesh and a mosaic mesh for each frame and a final mosaic image, for video objects with mildly deformable motion in the presence of self and/or object-to-object (external) occlusion. Unlike classical mosaic representations, where successive frames are registered using global motion models, we map the uncovered regions in the successive frames onto the mosaic reference frame using local affine models, i.e., those of the neighboring mesh patches. The proposed method to compute this mosaic representation is tightly coupled with an occlusion-adaptive 2-D mesh tracking procedure, which consists of propagating the object mesh from frame to frame and updating both the object and mosaic meshes to optimize texture mapping from the mosaic to each instance of the object. The proposed representation has been applied to video object rendering and editing, including self transfiguration, synthetic transfiguration, and 2-D augmented reality in the presence of self and/or external occlusion. We also provide an algorithm to determine the minimum number of still views needed to reconstruct a replacement mosaic, which is needed for synthetic transfiguration. Experimental results are provided to demonstrate both the 2-D mesh-based mosaic synthesis and two different video object editing applications on real video sequences.

19.
In this paper, we present a novel method for content adaptation and video summarization implemented fully in the compressed domain. First, summarization of generic videos is modeled as the process of extracting human objects under various activities/events. Accordingly, frames are classified into five categories via fuzzy decision, including shot changes (cut and gradual transitions), motion activities (camera motion and object motion), and others, by using two inter-frame measurements. Second, human objects are detected using Haar-like features. With the detected human objects and the attained frame categories, activity levels for each frame are determined to adapt to the video content. Continuous frames belonging to the same category are grouped to form one activity entry as content of interest (COI), which converts the original video into a series of activities. An overall adjustable quota is used to control the size of the generated summarization for efficient streaming. Based on this quota, the frames selected for summarization are determined by evenly sampling the accumulated activity levels for content adaptation. Quantitative evaluations have demonstrated the effectiveness and efficiency of our proposed approach, which provides a more flexible and general solution for this topic, as domain-specific tasks such as accurate recognition of objects can be avoided.

20.
Accurate and fast localization of a predefined target region inside the patient is an important component of many image-guided therapy procedures. This problem is commonly solved by registration of intraoperative 2-D projection images to 3-D preoperative images. If the patient is not fixed during the intervention, the 2-D image acquisition is repeated several times during the procedure, and the registration problem can be cast instead as a 3-D tracking problem. To solve the 3-D problem, we propose in this paper to apply 2-D region tracking to first recover the components of the transformation that are in-plane to the projections. The 2-D motion estimates of all projections are backprojected into 3-D space, where they are then combined into a consistent estimate of the 3-D motion. We compare this method to intensity-based 2-D to 3-D registration and a combination of 2-D motion backprojection followed by a 2-D to 3-D registration stage. Using clinical data with a fiducial marker-based gold-standard transformation, we show that our method is capable of accurately tracking vertebral targets in 3-D from 2-D motion measured in X-ray projection images. Using a standard tracking algorithm (hyperplane tracking), tracking is achieved at video frame rates but fails relatively often (32% of all frames tracked with target registration error (TRE) better than 1.2 mm, 82% of all frames tracked with TRE better than 2.4 mm). With intensity-based 2-D to 2-D image registration using normalized mutual information (NMI) and pattern intensity (PI), accuracy and robustness are substantially improved. NMI tracked 82% of all frames in our data with TRE better than 1.2 mm and 96% of all frames with TRE better than 2.4 mm. This comes at the cost of a reduced frame rate, 1.7 s average processing time per frame and projection device. Results using PI were slightly more accurate, but required on average 5.4 s time per frame. These results are still substantially faster than 2-D to 3-D registration. We conclude that motion backprojection from 2-D motion tracking is an accurate and efficient method for tracking 3-D target motion, but tracking 2-D motion accurately and robustly remains a challenge.
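The accuracy figures in this abstract are stated as the share of frames whose target registration error (TRE) is below a threshold. A minimal sketch of that bookkeeping follows, assuming that TRE is the Euclidean distance between the target position given by the estimated transformation and the position given by the gold standard, and that "better than" means strictly less than. The function names and the sample errors are made up for illustration and are not data from the paper.

```python
import math

# Hypothetical sketch: TRE per frame and the fraction of frames within a threshold.

def target_registration_error(estimated_position, reference_position):
    """Euclidean distance (mm) between estimated and gold-standard target positions."""
    return math.dist(estimated_position, reference_position)


def fraction_within(errors, threshold_mm):
    """Share of frames whose TRE is strictly below the threshold."""
    return sum(error < threshold_mm for error in errors) / len(errors)


# Made-up example values, not results from the paper.
example_tre = target_registration_error((3.0, 4.0, 0.0), (0.0, 0.0, 0.0))
example_errors_mm = [0.5, 1.0, 2.0, 3.0]
share_below_1_2 = fraction_within(example_errors_mm, 1.2)
share_below_2_4 = fraction_within(example_errors_mm, 2.4)
```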
