Multi-instance multi-label learning method based on topic model

Citation:	YAN Kaobi, LI Zhixin, ZHANG Canlong. Multi-instance multi-label learning method based on topic model[J]. Journal of Computer Applications, 2015, 35(8): 2233-2237. DOI: 10.11772/j.issn.1001-9081.2015.08.2237

Authors:	YAN Kaobi, LI Zhixin, ZHANG Canlong

Affiliation:	1. Guangxi Key Laboratory of Multi-Source Information Mining and Security, Guangxi Normal University, Guilin, Guangxi 541004, China; 2. Guangxi Experiment Center of Information Science, Guilin, Guangxi 541004, China

Funding:	National Natural Science Foundation of China (61165009, 61262005, 61363035, 61365009); National 973 Program of China (2012CB326403); Guangxi Natural Science Foundation (2012GXNSFAA053219, 2013GXNSFAA019345, 2014GXNSFAA118368).

Abstract:	Most existing Multi-Instance Multi-Label (MIML) algorithms do not consider how to better represent the features of objects. To address this problem, a MIML learning method based on a topic model was proposed, combining the Probabilistic Latent Semantic Analysis (PLSA) model with a Neural Network (NN). The algorithm uses the PLSA model to learn the latent topic distribution of every training example. This step is a feature learning process that yields a better feature representation. The latent topic distribution learned for each training example is then used as input to train the neural network. Given a test example, the algorithm learns its latent topic distribution and feeds that distribution into the trained neural network to obtain the label set of the test example. Experimental comparisons with two classical MIML algorithms based on the decomposition strategy show that the proposed method performs better on two real-world MIML learning tasks.

Keywords:	topic model; feature expression; multi-instance multi-label learning; scene classification; text categorization

Received:	2015-03-27

Revised:	2015-05-30
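
The pipeline described in the abstract (PLSA topic distributions as learned features, followed by a neural network that outputs a label set) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several details are assumptions that the abstract does not specify: each MIML example is taken to be already converted into a vector of word counts, the test-time topic distribution is obtained by running EM with the word distributions held fixed, and the network is a single-hidden-layer model with sigmoid outputs thresholded at 0.5. The function names, sizes, and toy data are hypothetical.

```python
import numpy as np

def plsa(counts, n_topics, p_w_z=None, n_iter=50, seed=0):
    """EM for PLSA on a (n_docs, n_words) count matrix.

    Returns P(z|d) with shape (n_docs, n_topics) and P(w|z) with shape
    (n_topics, n_words). If p_w_z is given it is kept fixed, so only
    P(z|d) is estimated (assumed handling of unseen test examples).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    fixed = p_w_z is not None
    if not fixed:
        p_w_z = rng.random((n_topics, n_words))
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]        # (D, K, W)
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate from expected counts n(d,w) * P(z|d,w)
        weighted = counts[:, None, :] * post                 # (D, K, W)
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
        if not fixed:
            p_w_z = weighted.sum(axis=0)
            p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_mlp(x, y, n_hidden=16, lr=0.5, n_epochs=500, seed=0):
    """One-hidden-layer network trained by gradient descent on mean
    binary cross-entropy, one sigmoid output per label (an assumption)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, (x.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.5, (n_hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    for _ in range(n_epochs):
        h = np.tanh(x @ w1 + b1)
        out = sigmoid(h @ w2 + b2)
        g_out = (out - y) / len(x)            # gradient w.r.t. output logits
        g_h = (g_out @ w2.T) * (1.0 - h ** 2) # backpropagate through tanh
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (x.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return w1, b1, w2, b2

def predict(params, x, threshold=0.5):
    """Boolean label-set matrix: True where a label is predicted."""
    w1, b1, w2, b2 = params
    return sigmoid(np.tanh(x @ w1 + b1) @ w2 + b2) >= threshold

# Toy data (hypothetical): 20 training and 4 test examples,
# a 30-word vocabulary, 3 possible labels, 5 latent topics.
rng = np.random.default_rng(0)
train_counts = rng.integers(0, 5, size=(20, 30)).astype(float)
train_labels = rng.integers(0, 2, size=(20, 3)).astype(float)
test_counts = rng.integers(0, 5, size=(4, 30)).astype(float)

train_topics, p_w_z = plsa(train_counts, n_topics=5)          # feature learning
params = train_mlp(train_topics, train_labels)                # train on P(z|d)
test_topics, _ = plsa(test_counts, n_topics=5, p_w_z=p_w_z)   # topics for test
pred = predict(params, test_topics)                           # label sets
```

Each row of train_topics and test_topics is a probability distribution over topics, and each row of pred marks the labels assigned to one test example. Because the toy data are random, the predictions only demonstrate the data flow and carry no meaning about accuracy.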