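Image data decompression is an important class of data-processing problem, and feature learning is of significant research value in data compression. This paper proposes a variational autoencoder (VAE) feature representation model based on the cloud model, which uses the cloud model as the prior distribution of the VAE to address the VAE's limitations in feature representation. The encoder of the VAE constructs the feature space of the data, and latent variables are obtained by sampling in that space, which accomplishes data compression; the decoder generates the original data from the data features, that is, it decompresses the data. Comparative experiments against the original method on a face dataset verify the correctness and effectiveness of the proposed method.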

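To address the difficulty of obtaining large amounts of labeled data for sentiment analysis of microblog text, as well as incomplete learning of text features, this paper introduces the long short-term memory network (Long Short-Term Memory, LSTM) and its derivative, the bidirectional LSTM (Bi-LSTM), into the variational autoencoding generative model and builds a semi-supervised text classification model based on variational autoencoding. LSTM serves as the encoder and decoder of the variational autoencoder, and Bi-LSTM serves as the classifier. The classifier both provides label information to the encoder to jointly generate the latent variable and, together with the latent variable, reconstructs the data through the decoder, so that useful information in unlabeled data improves the classifier's performance. Experimental comparisons with other methods on the same public dataset show that the model achieves better classification results.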

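To address data sparsity in collaborative filtering recommendation models, a variational autoencoder with a clustering latent variable is proposed for processing users' implicit feedback data. This deep generative model learns the feature distribution of the latent variables while also clustering the features. The original data are first reconstructed with a multinomial likelihood, the parameters are then estimated by Bayesian variational inference, and a regularization coefficient is introduced into the model; tuning its magnitude avoids over-regularization and improves the model's fit. This nonlinear probabilistic model has greater modeling capacity for predicting missing ratings. Experimental results on three MovieLens datasets show that the algorithm achieves better recommendation performance than other state-of-the-art baselines.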
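The objective described in this abstract (a multinomial reconstruction term plus a regularization term scaled by a tunable coefficient) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a plain standard-normal prior instead of the clustering latent variable the paper uses, and the function names and the `beta` parameter are hypothetical.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the item axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def multinomial_log_likelihood(x, logits):
    # sum_i x_i * log pi_i, omitting the constant multinomial coefficient.
    return (x * log_softmax(logits)).sum(axis=-1)

def gaussian_kl(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ) for each row.
    # Assumption: standard-normal prior, a simplification of the paper's model.
    return 0.5 * (np.exp(logvar) + mu ** 2 - 1.0 - logvar).sum(axis=-1)

def weighted_neg_elbo(x, logits, mu, logvar, beta):
    # beta is the (hypothetical) regularization coefficient; beta < 1
    # weakens the KL penalty to reduce over-regularization.
    return -multinomial_log_likelihood(x, logits) + beta * gaussian_kl(mu, logvar)
```

Setting `beta = 1` gives the usual negative evidence lower bound for this simplified model; smaller values trade regularization strength for reconstruction quality.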

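Existing interactive neural music generation methods suffer mainly from inflexible control modes, difficult data annotation, and models that are hard to optimize. To address these problems, an unsupervised interactive melody generation method based on the variational autoencoder (VAE) is proposed. Introducing explicit melody-contour conditional inference learning into the VAE enables flexible control over the local and global features of generated melodies. Experiments show that the method is easy to optimize and controls local and global melodic features well. Analysis of a large number of generated samples demonstrates that the model learns useful musical knowledge from music data.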

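Intrusion detection is a technique for actively defending against attacks in a network. In previous intrusion detection models, the intrinsic features of normal network traffic and unknown attacks are insufficiently distinguishable, so the recognition rate for unknown attacks is not high enough. This paper designs an intrusion detection model based on a discriminative conditional variational autoencoder and the density peak clustering algorithm (DCVAE-DPC). Exploiting the ability of the discriminative conditional variational autoencoder to generate samples of a specified class, the model learns a latent-space representation of normal network traffic features and computes its reconstruction error, which increases the feature separation between normal traffic and unknown attacks; the density peak clustering algorithm is then used to obtain the distribution of the reconstruction error of normal traffic, which improves the recognition rate for unknown attacks. Experimental results show that, compared with currently popular intrusion detection models on the NSL-KDD dataset, the model reaches a classification accuracy of 97.08%, has a stronger ability to detect unknown attacks, and delivers stronger intrusion detection performance in today's complex network environments.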

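Traditional variational autoencoder models usually use the standard normal distribution as the prior of the latent vector; when applied to complex tasks such as recommender systems, this tends to cause over-regularization and poor disentanglement of the latent vector. A variational autoencoder model is built that combines a complex latent-vector prior with an attention mechanism. A latent-vector prior distribution generated by a multilayer neural network replaces the standard normal distribution as the assumed prior, so that the model can learn the prior from data and obtain richer latent representations. An auxiliary latent vector is added to the single-layer latent vector, and the latent vector is then generated jointly from the auxiliary latent vector and the data feature vector, which strengthens the low-dimensional representational capacity and disentanglement of the latent vector. Using the feature-selection property of the attention mechanism, larger weights are assigned to important nodes in the latent vector so that they convey more important information. Experimental results on the Movielens-1M, Movielens-Latest-Small, Movielens-20M, and Netflix datasets show that, relative to the baseline models, the model improves Recall@20, Recall@50, and NDCG@100 by an average of 12.95%, 10.80%, and 10.48%, respectively, achieving higher recommendation accuracy.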
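For readers unfamiliar with the Recall@K and NDCG@K metrics reported above, the sketch below shows one common way to compute them for a single user with binary relevance. The function names are hypothetical, and the normalization of recall by min(K, number of relevant items) is one convention among several; the abstract does not say which variant the authors used.

```python
import math

def recall_at_k(ranked_items, relevant_items, k):
    # Fraction of relevant items recovered in the top-k, normalized by
    # min(k, |relevant|) (an assumed convention, see the note above).
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / min(k, len(relevant))

def ndcg_at_k(ranked_items, relevant_items, k):
    # Binary-relevance NDCG: each hit at 0-based rank r adds 1 / log2(r + 2),
    # and the total is divided by the score of an ideal ranking.
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg
```

In practice these per-user values are averaged over all held-out users to obtain the figures quoted in such comparisons.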

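Deep learning learns the intrinsic regularities of samples from a dataset, and the quality of the dataset determines model performance to some extent. In public datasets for the dehazing task, paired real data are lacking and synthetic paired data struggle to simulate real environments, so the trained models may perform poorly in practice. To this end, a mixed-sample learning problem is posed: the model is trained simultaneously on synthetic paired data and real data (mixed samples), and conversion between mixed samples is achieved through transformations of the latent space. The algorithm uses a variational autoencoder and generative adversarial network (VAE-GAN) to encode the mixed samples separately into latent spaces, uses an adversarial loss to align the latent space of real data with that of synthetic hazy images, and uses a mapping network containing a feature adaptive fusion (MFF) module to learn the transformation between the latent spaces of the paired data, thereby establishing a dehazing data path from the real hazy image domain to the clear image domain. Experimental results show that, compared with other dehazing algorithms, the algorithm produces clearer dehazing results on real hazy images, performs notably well on thick haze, and achieves a higher peak signal-to-noise ratio than the compared algorithms.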

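Person re-identification is easily disturbed by changes in pedestrian pose in practical applications. Pose changes not only lose part of the pedestrian information but also cause appearance changes larger than the differences between identities, which makes it difficult for existing work to learn robust pedestrian features. To solve these problems, this paper proposes a generative adversarial network based on variational adversarial learning and reinforcement learning (RL-VGAN) for multi-pose person re-identification. The core idea is to decompose pedestrian attributes into appearance features and pose features through an appearance encoder and a pose encoder, free from interference by pose changes, in order to learn robust identity-related visual features. First, the designed variational generative network uses a Kullback-Leibler divergence loss to encourage the appearance encoder to infer continuous latent variables related to identity information. Second, so that the generative adversarial network gradually converges to a stable state, a reinforcement learning strategy is adopted to balance the performance of the variational generative network and the discriminative network. In addition, for the pose-guided image generation task, a new Inception Score loss is proposed to regularize the quality of the images generated by the variational generative network. Experimental results show that the proposed RL-VGAN method outperforms other methods on multiple benchmark datasets.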

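Automatic indoor furniture placement is of considerable importance in applications such as home design and dynamic scene generation. Traditional algorithms tend to understand the internal structure of a scene through explicit spatial, semantic, and functional relationships between objects, and use this to assist indoor scene generation. With the emergence of large-scale indoor scene datasets, it is proposed to encode the scattered input furniture into a graph structure and to learn the distributional prior of scenes implicitly through iterative message passing in a graph neural network. To accommodate the diversity of furniture placement, the graph neural network is integrated into a conditional variational autoencoder. An encoder embeds the input scene into a latent variable that follows a Gaussian distribution, and a generator uses the scene prior sampled from the latent variable for conditional generation of new scenes. Experimental results on the Fu-floor dataset show that the algorithm outperforms the baseline algorithms on the minimum matching distance metric for generated results. The algorithm is also of significant value for future practical applications such as scene completion and scene-graph-based indoor furniture placement.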

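In recent years, text style transfer, as a controllable text generation task, has attracted growing attention from researchers. Based on the variational autoencoder model, this paper separates the content and style of source sentences in the latent variable space through adversarial training between a discriminator and the variational autoencoder, thereby achieving unsupervised text style transfer. Existing approaches to disentangling semantic content from style represent style by applying a linear transformation to a fixed binary vector; to address the shortcomings of this approach, the paper proposes a finer-grained joint representation: an independent encoder extracts a continuous latent style vector from the original sentence, which is then combined with the label vector to form the final style representation, improving style transfer accuracy. Evaluated on the widely used Yelp dataset, the proposed joint representation method significantly improves style transfer accuracy over two baseline methods.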

Designing realistic digital humans is extremely complex. Most data-driven generative models used to simplify the creation of their underlying geometric shape do not offer control over the generation of local shape attributes. In this paper, we overcome this limitation by introducing a novel loss function grounded in spectral geometry and applicable to different neural-network-based generative models of 3D head and body meshes. Encouraging the latent variables of mesh variational autoencoders (VAEs) or generative adversarial networks (GANs) to follow the local eigenprojections of identity attributes, we improve latent disentanglement and properly decouple the attribute creation. Experimental results show that our local eigenprojection disentangled (LED) models not only offer improved disentanglement with respect to the state-of-the-art, but also maintain good generation capabilities with training times comparable to the vanilla implementations of the models. Our code and pre-trained models are available at github.com/simofoti/LocalEigenprojDisentangled.

Mansbridge  Alex  Fierimonte  Roberto  Feige  Ilya  Barber  David 《Machine Learning》2019,108(8-9):1601-1611

Powerful generative models, particularly in natural language modelling, are commonly trained by maximizing a variational lower bound on the data log likelihood. These models often suffer from poor use of their latent variable, with ad-hoc annealing factors used to encourage retention of information in the latent variable. We discuss an alternative and general approach to latent variable modelling, based on an objective that encourages a perfect reconstruction by tying a stochastic autoencoder with a variational autoencoder (VAE). This ensures by design that the latent variable captures information about the observations, whilst retaining the ability to generate well. Interestingly, although our model is fundamentally different to a VAE, the lower bound attained is identical to the standard VAE bound but with the addition of a simple pre-factor; thus, providing a formal interpretation of the commonly used, ad-hoc pre-factors in training VAEs.
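For orientation, a bound with a pre-factor of the kind this abstract mentions can be written in the generic form below. The notation ($q_\phi$, $p_\theta$, $\beta$) is assumed here rather than taken from the paper, and the abstract states neither the value of the pre-factor nor which term it multiplies; placing it on the KL term is an assumption made for illustration.

```latex
\mathcal{L}_{\beta}(x)
  = \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  - \beta \, \mathrm{KL}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

With $\beta = 1$ this reduces to the standard variational lower bound on $\log p_\theta(x)$; values of $\beta$ below 1 weaken the pull of the prior and so encourage the latent variable to retain more information about the observation.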


Wu  Ga  Domke  Justin  Sanner  Scott 《Machine Learning》2022,111(7):2537-2559
Variational Autoencoders (VAEs) are a popular generative model, but one in which conditional inference can be challenging. If the decomposition into query and evidence variables...

This paper presents a novel generative model to synthesize fluid simulations from a set of reduced parameters. A convolutional neural network is trained on a collection of discrete, parameterizable fluid simulation velocity fields. Due to the capability of deep learning architectures to learn representative features of the data, our generative model is able to accurately approximate the training data set, while providing plausible interpolated in‐betweens. The proposed generative model is optimized for fluids by a novel loss function that guarantees divergence‐free velocity fields at all times. In addition, we demonstrate that we can handle complex parameterizations in reduced spaces, and advance simulations in time by integrating in the latent space with a second network. Our method models a wide variety of fluid behaviors, thus enabling applications such as fast construction of simulations, interpolation of fluids with different parameters, time re‐sampling, latent space simulations, and compression of fluid simulation data. Reconstructed velocity fields are generated up to 700× faster than re‐simulating the data with the underlying CPU solver, while achieving compression rates of up to 1300×.
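The abstract does not spell out how the divergence-free guarantee is obtained. One standard construction, shown here purely as an illustrative assumption, is to have the model output a scalar stream function and take its curl, which yields a 2D velocity field whose discrete divergence vanishes by construction. The function names below are hypothetical.

```python
import numpy as np

def velocity_from_stream_function(psi, spacing=1.0):
    # psi has shape (ny, nx); axis 0 is y and axis 1 is x.
    dpsi_dy, dpsi_dx = np.gradient(psi, spacing)
    u = dpsi_dy    # x-component of velocity
    v = -dpsi_dx   # y-component of velocity
    return u, v

def divergence(u, v, spacing=1.0):
    # Discrete divergence du/dx + dv/dy using the same finite-difference
    # stencils as above, so the mixed derivatives cancel.
    du_dx = np.gradient(u, spacing, axis=1)
    dv_dy = np.gradient(v, spacing, axis=0)
    return du_dx + dv_dy
```

Because the finite-difference operators along the two axes commute, the divergence of the resulting field is zero up to floating-point rounding for any input `psi`, regardless of what the network has learned.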

Tracking unknown human motions using generative tracking techniques requires the exploration of a high-dimensional pose space which is both difficult and computationally expensive. Alternatively, if the type of activity is known and training data is available, a low-dimensional latent pose space may be learned and the difficulty and cost of the estimation task reduced. In this paper we attempt to combine the competing benefits (flexibility and efficiency) of these two generative tracking scenarios within a single approach. We define a number of “activity models”, each composed of a pose space with unique dimensionality and an associated dynamical model, and each designed for use in the recovery of a particular class of activity. We then propose a method for the fair combination of these activity models for use in particle dispersion by an annealed particle filter. The resulting algorithm, which we term multiple activity model annealed particle filtering (MAM-APF), is able to dynamically vary the scope of its search effort, using a small number of particles to explore latent pose spaces and a large number of particles to explore the full pose space. We present quantitative results on the HumanEva-I and HumanEva-II datasets, demonstrating robust 3D tracking of known and unknown activities from fewer than four cameras.

We present a new generative model of natural language, the latent words language model. This model uses a latent variable for every word in a text that represents synonyms or related words in the given context. We develop novel methods to train this model and to find the expected value of these latent variables for a given unseen text. The learned word similarities help to reduce the sparseness problems of traditional n-gram language models. We show that the model significantly outperforms interpolated Kneser–Ney smoothing and class-based language models on three different corpora. Furthermore, the latent variables are useful features for information extraction. We show that for both semantic role labeling and word sense disambiguation, the performance of a supervised classifier increases when incorporating these variables as extra features. This improvement is especially large when using only a small annotated corpus for training.

Akuzawa  Kei  Iwasawa  Yusuke  Matsuo  Yutaka 《Machine Learning》2021,110(8):2239-2266
Sequential variational autoencoders (VAEs) with a global latent variable z have been studied for disentangling the global features of data, which is useful for several downstream...

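Sun Huixia  Li Yuexin 《Journal of Computer Applications (计算机应用)》2015,35(12):3477-3480
To address the exponential growth of the label space, an overlapping community detection algorithm based on latent features is proposed. First, a generative network model containing overlapping communities is presented. Based on this generative model, the latent features of the nodes are derived by maximizing the generation probability of the target network, and the optimization objective function is given. Then, by inducing the network into a bipartite graph, a lower bound on the number of latent features is derived, and the label space is optimized accordingly. Experiments show that, compared with the BigClam algorithm, the proposed overlapping community detection algorithm significantly improves the recall of the retrieved results while keeping running efficiency and precision essentially unchanged. The algorithm can effectively cope with the exponential growth of the label space in community detection.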

A generative CAD-based design exploration method is proposed. It is suitable for complex multi-criteria design problems where important performance criteria are uncomputable. The method is based on building a genotype of the design within a history-based parametric CAD system and then varying its parameters randomly within pre-defined limits to generate a set of distinctive designs. The generated designs are then filtered through various constraint envelopes representing geometric viability, manufacturability, cost and other performance-related constraints, thus reducing the vast design space to a smaller viable design space represented by a set of distinctive designs. These designs may then be further developed by the designer. The proposed generative design method makes minimal imposition on the designer’s work process and maintains both the flexibility and the fluidity required for creative design exploration. Its ability to work seamlessly with current CAD-based design practices from early conceptual to detailed design is demonstrated. The design philosophy behind this generative method and the key steps involved in its implementation are presented with examples.

Recognizing and tracking multiple activities are extremely challenging machine vision tasks because of the diverse motion types involved and the high-dimensional (HD) state space. To overcome these difficulties, a novel generative model called the composite motion model (CMM) is proposed. This model contains a set of independent, low-dimensional (LD), activity-specific manifold models that effectively constrain the state search space for 3D human motion recognition and tracking. Modeling activity-specific movements separately not only allows each manifold model to be optimized for its own movement alone, but also improves the scalability of the models. For accurate tracking with the CMM, a particle filter (PF) method is employed, so that the particles can be distributed across all manifold models at each time step. In addition, an efficient activity switching strategy is proposed to govern the particle distribution over the LD manifolds. To diffuse the particles among manifold models and respond quickly to sudden changes in activity, a set of visually reasonable and kinematically realistic transition bridges is synthesized by exploiting the properties of the LD latent space and the HD observation space, which makes inter-activity motions appear more natural and realistic. Finally, the pose hypothesis that best explains the visual observation is selected and used to recognize the currently observed activity. Extensive experiments, with qualitative and quantitative analyses, verify the effectiveness and robustness of the proposed CMM for multi-activity 3D human motion recognition and tracking.

