1.
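A Scalable Classification Algorithm Implemented with Database Technology   Total citations: 8, self-citations: 0, citations by others: 8
Liu Hongyan (刘红岩), Lu Hongjun (陆宏钧), Chen Jian (陈剑). Journal of Software (《软件学报》), 2002, 13(6): 1075-1081
This paper focuses on efficient, scalable classification algorithms that tightly couple classification techniques from data mining with database technology. It proposes a method for building classifiers based on a grouping-and-counting technique, in which the main computation is carried out in the database system's Structured Query Language (SQL). To improve execution efficiency, it also proposes optimization strategies and a strategy for pruning redundant rules, and it integrates the discovery of classification rules with the selection of relevant attributes. With these methods and strategies, the classification algorithm can quickly discover a concise set of rules from large data sets. Besides offering accuracy comparable to existing classification algorithms and high execution efficiency, the algorithm scales well in both the number of training tuples and the number of attributes, and it is easy to implement.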
2.
Data Extraction from the Web Based on Pre-Defined Schema   Total citations: 8, self-citations: 1, citations by others: 7
With the development of the Internet, the World Wide Web has become an invaluable information source for most organizations. However, most documents available on the Web are in HTML, which was originally designed for document formatting with little consideration of content. Effectively extracting data from such documents remains a non-trivial task. In this paper, we present a schema-guided approach to extracting data from HTML pages. Under this approach, the user defines a schema specifying what is to be extracted and provides sample mappings between the schema and the HTML page. The system then induces the mapping rules and generates a wrapper that takes an HTML page as input and produces the required data as XML conforming to the user-defined schema. A prototype system implementing the approach has been developed. Preliminary experiments indicate that the proposed semi-automatic approach is not only easy to use but also able to produce a wrapper that extracts the required data from input pages with high accuracy.
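To make the notion of a wrapper concrete, here is a minimal sketch of the last step only: applying mapping rules to an HTML page and emitting XML that follows a user-chosen schema. The rule format (field name mapped to a tag and `class` attribute), the class `FieldCollector`, and the function `wrap` are hypothetical; the rules are written by hand here, whereas the paper describes inducing them from sample mappings.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class FieldCollector(HTMLParser):
    """Collect the text of elements whose (tag, class) matches a rule."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules      # field name -> (tag, class attribute value)
        self.active = None      # field currently being captured, if any
        self.values = {name: [] for name in rules}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        for name, (rule_tag, rule_class) in self.rules.items():
            if tag == rule_tag and css_class == rule_class:
                self.active = name
                self.values[name].append("")

    def handle_endtag(self, tag):
        # Assumes matched elements do not nest another element of the same tag.
        if self.active and tag == self.rules[self.active][0]:
            self.active = None

    def handle_data(self, data):
        if self.active:
            self.values[self.active][-1] += data


def wrap(html, record_name, rules):
    """Return an XML string with one <record_name> element per extracted row.

    Assumes every record on the page contains every field, in page order.
    """
    collector = FieldCollector(rules)
    collector.feed(html)
    collector.close()

    root = ET.Element("results")
    columns = [collector.values[name] for name in rules]
    for row in zip(*columns):
        record = ET.SubElement(root, record_name)
        for name, text in zip(rules, row):
            ET.SubElement(record, name).text = text.strip()
    return ET.tostring(root, encoding="unicode")
```

Calling `wrap(page, "book", {"title": ("span", "t"), "price": ("b", "p")})` would yield one `<book>` element per title and price pair found on the page. A real wrapper would need more robust location rules than a tag and class pair.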
3.
4.
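An Integrated Fuzzy Clustering Algorithm GFC for Switching Regressions   Total citations: 1, self-citations: 0, citations by others: 1
Wang Shitong (王士同), Jiang Haifeng (江海峰), Lu Hongjun (陆宏钧). Journal of Software (《软件学报》), 2002, 13(10): 1905-1914
Several methods are already available for solving the switching regression problem. Building on the proposed gravitational clustering algorithm GC, which is based on Newton's law of gravitation, and combining it with fuzzy clustering, the paper further proposes a new integrated fuzzy clustering algorithm, GFC. Theoretical analysis shows that GFC converges to a local minimum. Experimental results show that GFC is more effective than the standard fuzzy clustering algorithm on switching regression problems, particularly in convergence speed.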
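For readers unfamiliar with switching regression, the sketch below shows one common fuzzy-clustering formulation of it, fuzzy c-regression with straight-line models. It is not GFC and does not include the gravitational step GC, whose details the abstract does not give; whether it matches the baseline the abstract compares against is also an assumption. The function names, the fuzzifier `m`, and the `eps` guard are illustrative choices.

```python
def fit_line(xs, ys, ws):
    """Weighted least-squares fit of y = a*x + b; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    return a, my - a * mx


def fuzzy_c_regression(xs, ys, init_lines, m=2.0, iters=50, eps=1e-12):
    """Alternate membership updates and weighted line fits.

    init_lines is a list of (slope, intercept) starting guesses, one per
    regime. Returns the fitted lines and the membership matrix u, where
    u[i][k] is the degree to which point i belongs to line k.
    """
    lines = list(init_lines)
    u = []
    for _ in range(iters):
        # Squared residual of every point to every line; eps avoids 0 ** -p.
        d = [[(y - (a * x + b)) ** 2 + eps for a, b in lines]
             for x, y in zip(xs, ys)]
        # Membership is proportional to residual ** (-1 / (m - 1)).
        u = []
        for row in d:
            inv = [r ** (-1.0 / (m - 1.0)) for r in row]
            total = sum(inv)
            u.append([v / total for v in inv])
        # Refit each line with weights u ** m.
        lines = [fit_line(xs, ys, [row[k] ** m for row in u])
                 for k in range(len(lines))]
    return lines, u
```

Like other alternating schemes of this kind, the result depends on the starting lines and may settle in a local minimum, which is the setting in which the abstract's convergence claims are made.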
5.
In this paper, we report our success in building efficient, scalable classifiers by exploring the capabilities of modern relational database management systems (RDBMS). In addition to high classification accuracy, the distinguishing features of the approach include high training speed, linear scalability, and simplicity of implementation. More importantly, the major computation required by the approach can be implemented using standard functions provided by modern relational DBMSs. Moreover, with an effective rule pruning strategy, the proposed algorithm produces a compact set of classification rules. The results of experiments conducted for performance evaluation and analysis are presented.