Automated classification of the tissue type of a Region of Interest (ROI) in medical images is an important application in Computer-Aided Diagnosis (CAD). Recently, bag-of-feature methods, which treat each ROI as a set of local features, have shown their power in this field. This paper investigates two important issues in the bag-of-feature strategy for tissue classification: visual vocabulary learning and visual word weighting. Traditional methods treat the two independently and so neglect the inner relationship between the visual words and their weights. To overcome this problem, we develop a novel algorithm, Joint-ViVo, which learns the vocabulary and the visual word weights jointly. A unified large-margin objective function is defined over both the visual vocabulary and the visual word weights, and is optimized alternately in an iterative algorithm. We test our algorithm on three tissue classification tasks: classifying breast tissue density in mammograms, classifying lung tissue in High-Resolution Computed Tomography (HRCT) images, and identifying brain tissue type in Magnetic Resonance Imaging (MRI). The results show that Joint-ViVo outperforms state-of-the-art methods on tissue classification problems.
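
The representation this abstract builds on can be sketched in a few lines. The snippet below is only an illustration of a weighted bag-of-features histogram for one ROI; it does not implement Joint-ViVo or its large-margin objective. The vocabulary, the weights, and the toy features are hypothetical values chosen for the example, and the function names are assumptions rather than anything taken from the paper.

```python
import math


def nearest_word(feature, vocabulary):
    """Index of the visual word closest to `feature` (Euclidean distance)."""
    return min(range(len(vocabulary)),
               key=lambda k: math.dist(feature, vocabulary[k]))


def weighted_histogram(features, vocabulary, weights):
    """Weighted, L1-normalized bag-of-features histogram for one ROI."""
    counts = [0.0] * len(vocabulary)
    for f in features:
        counts[nearest_word(f, vocabulary)] += 1.0
    # Scale each word count by its (here fixed, hypothetical) weight.
    weighted = [w * c for w, c in zip(weights, counts)]
    total = sum(weighted)
    return [v / total for v in weighted] if total > 0 else weighted


# Hypothetical toy data: three 2-D visual words and four local features.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
weights = [1.0, 2.0, 0.5]
roi_features = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]

hist = weighted_histogram(roi_features, vocabulary, weights)
```

In this sketch the vocabulary and the weights are fixed inputs, which corresponds to the independent treatment the abstract criticizes. A joint method would instead update both against a single classification objective.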

We use charting, a non-linear dimensionality reduction algorithm, for articulated human motion classification in multi-view sequences or 3D data. Charting automatically estimates the intrinsic dimensionality of the latent subspace and preserves both the local neighbourhood and the global structure of high-dimensional data. We classify human action sub-sequences of skeletal poses of varying lengths, adopting a multi-layered subspace classification scheme with layered pruning and search. These sub-sequences can be extracted using either markerless articulated tracking algorithms or markerless motion capture systems. We present a qualitative and quantitative comparison of single-subspace and multiple-subspace classification algorithms. We also identify the minimum length of a skeletal pose sequence required for accurate classification, using competing classification systems as the baseline. We test our motion classification framework on the HumanEva, CMU, HDM05 and ACCAD mocap datasets and achieve similar or better classification accuracy than various comparable systems.

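Objective: Video action quality assessment aims to evaluate how well a specific action in a video is executed and completed. Automated action quality assessment can effectively reduce the consumption of human resources and can evaluate video content more accurately and fairly. Traditional action quality assessment methods mainly face the following problems: 1) the multi-scale spatio-temporal features of the action subject in a video; 2) the inherent ambiguity of labels caused by cognitive differences; 3) the redundancy of attention heads in the multi-head self-attention mechanism. To address these problems, SALDL (self-attention and label distribution learning) is proposed, an action quality assessment model that perceives different spatio-temporal positions in a video sequence and generates fine-grained labels. Methods: SALDL introduces the Attention-Inc (attention-inception) structure, which progressively integrates the self-attention mechanism into the Inception structure through embedding, multi-head self-attention, and a multilayer perceptron, so that the model can obtain contextual information across convolutional features of different scales. A positive-negative temporal attention module, PNTA (pos-neg temporal attention), is also proposed; it mines temporal attention features through the PNTA loss, thereby reducing self-attention head redundancy and extracting the attention features of different clips. The SALDL model generates fine-grained action quality labels through label enhancement and label distribution learning. Results: Extensive comparison and ablation experiments were conducted on datasets including MTL-AQA (multitask learning-action quality assessment) and JIGSAWS (JHU-ISI gesture and skill assessment working set); the Spearman rank correlation coefficients were 0.9416 and 0.8183, respectively. Conclusion: By fully mining spatio-temporal features at different scales, the SALDL model addresses the multi-scale spatio-temporal feature problem, and by introducing prior knowledge consistent with the label distribution for label enhancement, it addresses the inherent ambiguity of labels and the redundancy of attention heads.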

In this article, we review unsupervised neural network learning procedures which can be applied to the task of preprocessing raw data to extract useful features for subsequent classification. The learning algorithms reviewed here are grouped into three sections: information-preserving methods, density estimation methods, and feature extraction methods. Each of these major sections concludes with a discussion of successful applications of the methods to real-world problems. The first author is supported by research grants from the James S. McDonnell Foundation (grant #93–95) and the Natural Sciences and Engineering Research Council of Canada. For part of this work, the second author was supported by a Temporary Lectureship from the Academic Initiative of the University of London, and by a grant (GR/J38987) from the Science and Engineering Research Council (SERC) of the UK.

Human action recognition is a promising yet non-trivial computer vision field with many potential applications. Recent advances in bag-of-feature approaches have brought significant insights into recognizing human actions in complex contexts. It is, however, common practice in the literature to treat an action as merely an orderless set of local salient features. This representation has been shown to be oversimplified, which inherently limits the robust deployment of traditional approaches in real-life scenarios. In this work, we propose and show that, by taking into account the global configuration of local features, we can greatly improve recognition performance. We first introduce a novel feature selection process, called the Sparse Hierarchical Bayes Filter, to select only the most contributive features of each action type based on neighboring structure constraints. We then present the application of structured learning to human action analysis. That is, by representing a human action as a complex set of local features, we can incorporate different spatial and temporal feature constraints into the learning tasks of human action classification and localization. In particular, we tackle the problem of action localization in video using structured learning with two alternatives: one is the Dynamic Conditional Random Field, from a probabilistic perspective; the other is the Structural Support Vector Machine, from a max-margin point of view. We evaluate our modular classification-localization framework on various testbeds, on which it proves highly effective and robust compared with bag-of-feature methods.

Learning a compact yet discriminative codebook is an important procedure for local feature-based action recognition. A common procedure involves two independent phases: reducing the dimensionality of local features and then performing clustering. Since the two phases are disconnected, dimensionality reduction does not necessarily capture the dimensions that are most helpful for codebook creation. Moreover, some dimensionality reduction techniques, such as principal component analysis, do not take class separability into account and thus may not help build an effective codebook. In this paper, we propose weighted adaptive metric learning (WAML), which integrates the two independent phases into a unified optimization framework. This framework makes it possible to select the indispensable and crucial dimensions for building a discriminative codebook. The dimensionality reduction phase in WAML is optimized for class separability and adaptively adjusts the distance metric to improve the separability of the data. In addition, video word weighting is smoothly incorporated into WAML to generate video words accurately. Experimental results demonstrate that our approach builds a highly discriminative codebook and achieves results comparable to other state-of-the-art approaches.

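Deep learning has achieved good results in human action recognition, but the appearance and motion information of the people in a video still needs to be exploited more fully. To use both the spatial and the temporal information in video for recognizing human actions, a spatio-temporal two-stream model for video human action recognition is proposed. The model first uses two convolutional neural networks to extract spatial and temporal features, respectively, from video action clips; it then fuses the two networks and extracts mid-level spatio-temporal features; finally, it feeds the extracted mid-level features into a 3D convolutional neural network to recognize the human actions in the video. Video human action recognition experiments were carried out on the UCF101 and HMDB51 datasets. The experimental results show that the proposed spatio-temporal two-stream 3D convolutional neural network model can effectively recognize human actions in video.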

Since learning English is very popular in non-English speaking countries, developing modern assisted-learning tools that support effective English learning is a critical issue in the English-language education field. Learning English involves memorizing and practicing a large number of vocabulary words and numerous grammatical structures. Vocabulary learning is a principal issue in English learning because vocabulary forms the basic building blocks of English sentences. Therefore, many studies have attempted to improve the efficiency and performance of English vocabulary learning. With the accelerated growth of wireless and mobile technologies, mobile learning using devices such as PDAs, tablet PCs, and cell phones has gradually come to be considered effective, because it inherits all the advantages of e-learning and overcomes the constraints of time and place that limit web-based learning systems. Therefore, this study presents a personalized mobile English vocabulary learning system based on Item Response Theory and the learning memory cycle, which recommends appropriate English vocabulary according to each learner's vocabulary ability and memory cycle. The proposed system has been successfully implemented on a personal digital assistant (PDA) for personalized English vocabulary learning. The experimental results indicated that the proposed system clearly improved learners' performance and interest, owing to its effective and flexible mode of English vocabulary learning.

To provide more sophisticated healthcare services, it is necessary to collect precise information about a patient. One impressive area of study for obtaining meaningful information is human activity recognition, which in recent decades has relied on supervised learning techniques. Previous studies, however, have struggled with generating a training dataset and with extending the number of activities to be recognized. In this paper, to find a new approach that avoids these problems, we propose unsupervised learning methods for human activity recognition, using data collected from smartphone sensors, even when the number of activities is unknown. Experimental results show that a mixture of Gaussians exactly distinguishes the activities when the number of activities k is known, while hierarchical clustering or DBSCAN achieves above 90% accuracy when k is unknown, by obtaining k from the Caliński–Harabasz index or by choosing appropriate values for ɛ and MinPts. We believe that the results of our approach provide a way of automatically selecting an appropriate value of k at which the accuracy of activity recognition is maximized, without generating training datasets by hand.
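
Selecting k with the Caliński–Harabasz index can be illustrated with a small standard-library sketch. It rests on several assumptions that are not from the paper: the data are synthetic one-dimensional values rather than smartphone sensor features, and the clustering step is single-linkage on 1-D data (cutting the sorted values at the k-1 largest gaps), used here as a simple stand-in for hierarchical clustering. The index is computed as the between-cluster dispersion divided by (k-1), over the within-cluster dispersion divided by (n-k).

```python
def single_linkage_1d(values, k):
    """Cluster 1-D values into k groups by cutting at the k-1 largest gaps."""
    xs = sorted(values)
    # Positions i where a cut would separate xs[i-1] from xs[i], widest gap first.
    by_gap = sorted(range(1, len(xs)),
                    key=lambda i: xs[i] - xs[i - 1], reverse=True)
    clusters, start = [], 0
    for cut in sorted(by_gap[:k - 1]):
        clusters.append(xs[start:cut])
        start = cut
    clusters.append(xs[start:])
    return clusters


def calinski_harabasz(clusters):
    """Calinski-Harabasz index: (B / (k - 1)) / (W / (n - k))."""
    n = sum(len(c) for c in clusters)
    k = len(clusters)
    grand_mean = sum(x for c in clusters for x in c) / n
    means = [sum(c) / len(c) for c in clusters]
    between = sum(len(c) * (m - grand_mean) ** 2
                  for c, m in zip(clusters, means))
    within = sum((x - m) ** 2 for c, m in zip(clusters, means) for x in c)
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical 1-D feature values drawn from three well-separated activities.
signal = [1.0, 5.1, 9.2, 1.1, 5.0, 9.0, 1.2, 5.2, 9.1]

scores = {k: calinski_harabasz(single_linkage_1d(signal, k))
          for k in range(2, 6)}
best_k = max(scores, key=scores.get)
```

On this toy signal the index peaks at the number of groups used to generate the data. With real sensor features the peak can be much less sharp, so the selected k is better read as a candidate than as a guarantee.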

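To address the inconsistency of class structure between feature spaces and the domain shift problem in zero-shot classification of remote sensing image scenes, a transductive zero-shot remote sensing scene classification algorithm that combines Sammon embedding with spectral clustering is proposed. First, the class prototype representations in the semantic feature space are corrected with the Sammon embedding algorithm so that they align with the class prototype structure of the visual feature space. Second, the prototype representations of the test classes in the visual feature space are obtained through a structure transfer method. Finally, to address domain shift, spectral clustering is used to correct the test class prototypes in the visual feature space so that they fit the distribution of the test class samples, improving zero-shot scene classification accuracy. On two remote sensing scene datasets (UCM and AID), the highest overall classification accuracies obtained were 52.89% and 55.93%, respectively, both significantly better than the compared methods. The experimental results show that by markedly reducing the inconsistency of scene class structure between the visual and semantic feature spaces while alleviating domain shift, class structure knowledge can be transferred effectively from the semantic feature space to the visual feature space, substantially improving the accuracy of zero-shot remote sensing scene classification.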

Humans draw on their stereotypic beliefs to make assumptions about others. Even though prior research has shown that individuals respond socially to media, there is little evidence with regard to learners stereotyping and categorizing pedagogical agents. This study investigated whether learners stereotype a pedagogical agent as being knowledgeable or not knowledgeable and how this acuity influenced learning. Participants were assigned to four experimental conditions differing by agent (scientist or artist) and tutorial type (nanotechnology or punk rock). Quantitative analyses indicated that agents were stereotyped depending on their image and the academic domain under which they functioned. Regardless of tutorial, participants assigned to the artist agent recalled more information than participants assigned to the scientist agent. Learning differences between the groups varied according to whether agent appearance fit the content area under investigation. Qualitative results indicated learners' stereotypic expectations as well as their unwillingness to draw conclusions based on visual appearance.

The distribution and variability of ozone are vital to the atmospheric thermal structure, as they can exert great influence on climate. In this study, the Microtops II Ozonometer (Microtops)-measured total column ozone (TCO) data archived at tropical urban, high altitude, and coastal observing sites during 2012–2015 are analysed to investigate the temporal structure of ozone. Results reveal that the TCO exhibits a non-negligible diurnal variability with distinct seasonal behaviour, which agrees well with Indian as well as worldwide measurements of TCO. The mean rate of ozone diurnal change (Vs) is found to be maximum in winter (approximately 2.1 DU h⁻¹) and minimum in pre-monsoon (about 0.53 DU h⁻¹). In spite of the prevalent variability of the order of about 2–9 DU amongst the Microtops channels and the Ozone Monitoring Instrument on board the NASA EOS/AURA spacecraft (OMI-AURA) measurements, there exists a strong monthly/seasonal variation in both the ground- and satellite-based TCO measurements. Monthly mean OMI-AURA TCO variation presents a nearly perfect sinusoidal wave with a coefficient of determination (R²) equal to 0.76. Monthly TCO is maximum in May/June and minimum in December/January. The noticeable diurnal and monthly TCO variability could be due to a complex combination of photochemical processes in the lower troposphere and transport in the middle and upper troposphere. A linear regression technique applied to the Microtops and OMI-AURA data sets shows that the two data sets are correlated, with correlation coefficients (r) of 0.71, 0.77, and 0.61 for channels I, II, and III, respectively. The three Microtops channels show a dispersion of about 8–11 DU around the 1:1 regression line, which is of the order of one standard deviation of the daily mean data set. The TCO data from all Microtops channels either underestimate or overestimate the OMI-AURA measurements, since the slopes of the linear regression lines for all three channels are ≤1. Pearson’s product moment correlation analysis indicates that the TCO anti-correlates with ultraviolet-B (UV-B) irradiance (vis-à-vis the UV index), as the Pearson’s product moment correlation coefficients are found to be in the range –0.52 to –0.97.
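
The statistics quoted in this abstract (Pearson's r, the regression slope, and R²) can be computed as in the sketch below. The ozone values are hypothetical numbers in Dobson units, invented purely for illustration, and the variable names are assumptions; nothing here reproduces the study's data or its results.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for regressing y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical daily mean TCO values (DU): satellite vs. one ground channel.
satellite = [250.0, 255.0, 262.0, 270.0, 275.0, 268.0]
ground = [248.0, 251.0, 259.0, 263.0, 270.0, 262.0]

r = pearson_r(satellite, ground)
slope, intercept = linear_fit(satellite, ground)
r_squared = r ** 2  # coefficient of determination for a simple linear fit
```

A slope below 1 in such a fit means that the ground values change by less than one unit for each unit of change in the satellite values. Whether that appears as under- or overestimation at a given ozone level also depends on the intercept.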

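Traditional spectral regression algorithms cannot extract features well in face recognition under varying illumination, which severely degrades recognition performance. To address this problem, a face recognition algorithm that uses local discriminant embedding to optimize spectral regression classification is proposed. The feature vectors of the training samples are computed first. Using the neighborhood and class relationships of the data, the local discriminant embedding algorithm then constructs the embedding needed for the classification problem while learning the embedding needed for the submanifold of each class. Finally, the spectral regression classification algorithm is used to compute the projection matrix, and a nearest-neighbor classifier completes the face recognition. Experiments on two major face databases, Extended YaleB and CMU PIE, verify the effectiveness of the algorithm. The results show that, compared with other spectral regression algorithms, the proposed algorithm achieves a higher recognition rate and better operating characteristics while reducing computational complexity.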

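In recent years, human action recognition algorithms of various kinds have been trained on large amounts of labeled data and have achieved good recognition accuracy. In practical applications, however, acquiring and annotating data is very time-consuming and labor-intensive, which limits the real-world deployment of these algorithms. This paper surveys deep learning methods for video action recognition in weakly supervised and few-shot settings. First, for the weakly supervised setting, semi-supervised action recognition methods and video action recognition methods under unsupervised domain adaptation are categorized and summarized. Then, video action recognition algorithms for few-shot settings are reviewed in detail. Next, the current relevant human action recognition datasets are summarized, and the performance of the relevant video action recognition algorithms on these datasets is analyzed and compared. Finally, the survey closes with a summary and an outlook on future directions for human action recognition.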

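Based on the "learner-supervisor" indirect learning mechanism, a soft transfer learning method with multi-stage supervision is proposed to achieve learning across network architectures, so that a neural network's ability to model human actions can be transferred and reused among networks of different architectures. Based on the differing characteristics of data features at different network levels, two effective feature difference measures are introduced to reduce the differences between the features extracted by different network architectures. Experiments were conducted on the UCF101 and HMDB51 datasets, and the results show that...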

