
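Research on Text Automatic Classification Method of Enterprise Business Scope
Cite this article: HAN Xue, ZHANG Ye, ZHU Cong-hui. Research on Text Automatic Classification Method of Enterprise Business Scope [J]. Standard Science (标准科学), 2012(1): 93-96.
Authors: HAN Xue  ZHANG Ye  ZHU Cong-hui
Affiliations: National Administration for Code Allocation to Organizations; Harbin Institute of Technology
Abstract: As digitized information of all kinds keeps growing, classifying large volumes of documents systematically has become an urgent problem, and automatic text classification is now a key technology for solving it. China currently has more than 10 million enterprises, and an enterprise's business scope is the specific description of the business activities it engages in. Based on business scope data, and drawing on its structural features and its relationship to economic industries, this paper applies key techniques including word segmentation optimization for large-scale text data, statistical classification inference, and attribute association analysis. Through comparative experiments on relevant data extracted from the organization code database, it arrives at a practical and efficient method for automatically classifying business scope text.

Keywords: business scope  economic activity  industry category  text classification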

Research on Text Automatic Classification Method of Enterprise Business Scope
Authors: HAN Xue  ZHANG Ye  ZHU Cong-hui
Affiliation: 1. National Administration for Code Allocation to Organizations, Beijing 100029; 2. Harbin Institute of Technology, Harbin 150001
Abstract: With the growth of digitized information, classifying large volumes of documents by computer has become necessary, and automatic text classification has become a key technology for this problem. China now has more than 10 million enterprises, whose business scopes describe their business activities. This article investigates an automatic text classification method based on business scope data, its structural features, and its relationship to economic industries. The key techniques are word segmentation optimization for massive text data, statistical classification inference, and attribute association analysis. Experiments and comparative analysis on data extracted from the national organization code database show the method to be practical and efficient.
Keywords: business scope  economic activity  industry category  text classification
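
The abstract names word segmentation and statistical classification as key techniques but does not specify the algorithms used. As a rough illustration only, the sketch below assumes character bigrams as a crude stand-in for word segmentation and a multinomial naive Bayes classifier as the statistical model; neither choice is confirmed by the article. The function `char_bigrams`, the class `NaiveBayesClassifier`, the labels `manufacturing` and `retail`, and the training strings are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Separators commonly found between items of a business scope description.
PUNCTUATION = ",、;。,;."


def char_bigrams(text):
    """Split text into overlapping character bigrams.

    Assumption: a crude substitute for a real word segmenter.
    """
    chars = [c for c in text if not c.isspace() and c not in PUNCTUATION]
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()               # documents per label
        self.feature_counts = defaultdict(Counter)  # bigram counts per label
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            features = char_bigrams(text)
            self.class_counts[label] += 1
            self.feature_counts[label].update(features)
            self.vocabulary.update(features)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            total = sum(counts.values())
            # Log prior plus smoothed log likelihood of each bigram.
            score = math.log(doc_count / total_docs)
            for feature in char_bigrams(text):
                score += math.log((counts[feature] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy examples; not taken from the organization code database.
TRAIN = [
    ("服装加工、服装销售", "manufacturing"),
    ("机械设备制造、加工", "manufacturing"),
    ("食品零售、日用百货零售", "retail"),
    ("日用百货、烟酒零售", "retail"),
]

model = NaiveBayesClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)
print(model.predict("百货零售"))
print(model.predict("设备加工"))
```

A real system would replace the bigram step with a segmenter tuned to business scope vocabulary and would map predictions to official industry category codes rather than to the two illustrative labels used here.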
This article is indexed in CNKI and other databases.