Composition pattern oriented tag extraction from short documents using a structural learning method
Authors: Yongwook Shin, Sung-Jun Lee, Jonghun Park
Affiliation:1. Department of Industrial Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 151-744, Korea
Abstract: With the rapid growth of the web, automatic tagging, which detects informative terms in a document, has become an important problem for information aggregation and sharing services. Automatic tagging of short documents is of particular interest because users increasingly publish information through social media services that encourage short documents. In this paper, we propose a novel automatic tagging model for short text documents from social media services, following the supervised learning framework. We redefine traditional frequency-based term features so that they address the properties of documents created under a length limit, and we model sequential dependencies between successive terms in a document with a structural support vector machine. In addition, the proposed approach incorporates composition patterns, that is, the ways in which users place informative terms in their documents. Extensive experiments show that the proposed term features are effective for extracting tags, and that the tag extractor trained with sequential dependencies and composition patterns outperforms existing alternative methods.
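The abstract does not give the model's features, weights, or decoding procedure. As a rough illustration only of what modeling "sequential dependencies between successive terms" can look like in a chain-structured model, the sketch below runs Viterbi decoding over per-term scores plus label-transition scores. The label set (`O`, `TAG`), the function name `viterbi`, and all numeric scores are assumptions made for this example, not details from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical label set: "TAG" marks an informative term, "O" marks any other term.
LABELS = ("O", "TAG")


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
) -> List[str]:
    """Return the highest-scoring label sequence for one document.

    emissions[i][y]     : score of giving label y to the i-th term (assumed given)
    transitions[(p, y)] : score of label y directly following label p (assumed given)
    """
    if not emissions:
        return []

    # best[i][y] holds the best score of any label sequence ending in y at term i.
    best = [{y: emissions[0][y] for y in LABELS}]
    back: List[Dict[str, str]] = []

    for i in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in LABELS:
            # Pick the previous label that maximizes the running score.
            prev = max(LABELS, key=lambda p: best[-1][p] + transitions[(p, y)])
            scores[y] = best[-1][prev] + transitions[(prev, y)] + emissions[i][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)

    # Trace the best path backwards from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

With made-up scores in which the third term slightly prefers `O` on its own, a positive `TAG` to `TAG` transition score can still pull it into the tag, which is the kind of effect a sequence model can capture and an independent per-term classifier cannot.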
This article is indexed in SpringerLink and other databases.