On developing indicators with text analytics: exploring concept vectors applied to English and Chinese texts

Authors:	Steven O. Kimbrough, Christine Chou, Yi-Ting Chen, Hilary Lin

Affiliation:	1. Operations and Information Management, University of Pennsylvania, 3730 Walnut Street, Philadelphia, PA, 19104, USA
	2. Department of International Business, National Dong Hwa University, No. 1, Sec. 2, Da Hsueh Rd., Shoufeng, Hualien, 97401, Taiwan
	3. Economics Department, Iona College, Spellman Hall 2nd Floor, 715 North Avenue, New Rochelle, NY, 10801, USA

Abstract:	This paper investigates how high-quality, vocabulary-based classifiers, useful for competitive intelligence, can be found for relatively small corpora of publicly available documents. Two corpora of recent annual reports are examined and compared, one in English and one in Chinese. The paper tests whether vocabularies can predict whether firms are relatively innovative or not, examining vocabularies of both content words and function words. We find that indeed the tested vocabularies do produce effective indicators or classifiers and, surprisingly, that function words are especially effective. The paper also provides extensive conceptual and theoretical background to frame the investigation in the context of an EMCUT problematic, that of mapping entities to classification schemes using information derived from text.

Keywords:
This article is indexed in SpringerLink and other databases.