On developing indicators with text analytics: exploring concept vectors applied to English and Chinese texts
Authors: Steven O. Kimbrough, Christine Chou, Yi-Ting Chen, Hilary Lin
Affiliations: 1. Operations and Information Management, University of Pennsylvania, 3730 Walnut Street, Philadelphia, PA, 19104, USA
2. Department of International Business, National Dong Hwa University, No. 1, Sec. 2, Da Hsueh Rd., Shoufeng, Hualien, 97401, Taiwan
3. Economics Department, Iona College, Spellman Hall 2nd Floor, 715 North Avenue, New Rochelle, NY, 10801, USA
Abstract: This paper investigates how high-quality, vocabulary-based classifiers, useful for competitive intelligence, can be found for relatively small corpora of publicly available documents. Two corpora of recent annual reports are examined and compared, one in English and one in Chinese. The paper tests whether vocabularies can predict whether firms are relatively innovative, examining vocabularies of both content words and function words. We find that the tested vocabularies do produce effective indicators or classifiers and, surprisingly, that function words are especially effective. The paper also provides extensive conceptual and theoretical background to frame the investigation in the context of an EMCUT problematic, that of mapping entities to classification schemes using information derived from text.
This article is indexed in SpringerLink and other databases.