20 similar documents found (search time: 78 ms)
1.
2.
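A Chinese text topic extraction method based on word clustering    Total citations: 2, self-citations: 0, citations by others: 2
This paper proposes a method for extracting topics from Chinese text based on word clustering. The method uses a relatedness measure to analyze word co-occurrence, establishes semantic associations between words, and generates word classes, each represented by seed words and standing for one topic concept. For a given document, feature words are extracted first, the word classes are then used to generate the document's topic factors, and finally the topic factors are output by weight as the topics of the text. Experimental results show that the method achieves relatively high precision.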
3.
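Copulas theory is introduced into the mining of association patterns among text feature terms, and a query expansion algorithm that combines Copulas theory with association rule mining is proposed. The top n documents of the initial retrieval set are taken to build a pseudo-relevance feedback document set or a user relevance feedback document set. Using support and confidence measures based on Copulas theory, the algorithm mines the relevance feedback document set for frequent itemsets of feature terms and association rule patterns that contain the original query terms, and extracts expansion terms from these rule patterns to carry out query expansion. Experiments on the NTCIR-5 CLIR Chinese and English text corpora show that the algorithm effectively curbs query topic drift and term mismatch, improves retrieval performance, raises the quality of expansion terms, and reduces ineffective expansion terms.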
4.
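The document representation model is the foundation of automatic text processing and an effective means of converting unstructured text data into structured data. However, the commonly used Vector Space Model (VSM) represents documents on the basis of individual terms; because it ignores the associations between terms, the accuracy of text mining is hard to improve substantially. Based on word co-occurrence analysis, this paper discusses the latent connection between document topics and second-order relations between words, defines quantitative measures of word co-occurrence degree and of relatedness to the document topic, uses an association rule algorithm to extract co-occurring term combinations over the document set, proposes a topic representation model for document vectors based on co-occurring term combinations (Co-occurrence Term based Vector Space Model, CTVSM), and defines document similarity based on CTVSM. Experiments show that CTVSM accurately reflects the relatedness between documents and discriminates topics better than the classic VSM.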
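The co-occurrence step in the abstract above can be illustrated with a minimal sketch. It is not the paper's CTVSM definition: the abstract does not give the formula for co-occurrence degree, so this example assumes plain document-level support for term pairs. The function names and the `min_support` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each unordered term pair, how many documents contain both terms."""
    pair_df = Counter()
    for tokens in docs:
        # A pair counts once per document, so work on the distinct terms.
        for pair in combinations(sorted(set(tokens)), 2):
            pair_df[pair] += 1
    return pair_df


def frequent_pairs(docs, min_support):
    """Keep pairs whose support (fraction of documents) reaches min_support."""
    n = len(docs)
    return {
        pair: count / n
        for pair, count in cooccurrence_counts(docs).items()
        if count / n >= min_support
    }


docs = [
    ["text", "mining", "cluster"],
    ["text", "cluster", "topic"],
    ["topic", "model"],
    ["text", "mining"],
]
pairs = frequent_pairs(docs, min_support=0.5)
```

Pairs that survive the threshold could then serve as extra dimensions next to single terms in a document vector, which is the general idea the abstract describes.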
5.
6.
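Position-weighted text clustering algorithm    Total citations: 2, self-citations: 2, citations by others: 0
Text clustering is an important research topic in natural language processing, and text clustering techniques are widely used in information retrieval, Web mining, digital libraries, and other fields. Because feature words at different positions in a document contribute differently to the document, this paper proposes TCABPW, an improved text clustering algorithm based on position weighting of feature words. A new text feature vector is built from the top L highest-weighted feature terms that reflect the document's topic, and clustering is carried out by an improved algorithm that combines hierarchical clustering with K-means. Experimental results show that the improved algorithm greatly reduces the dimensionality of text clustering without harming clustering quality, improves markedly in both stability and purity, and achieves good clustering results.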
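A minimal sketch of position weighting followed by top-L feature selection, under stated assumptions: the abstract does not list the positions or weights TCABPW uses, so the `POSITION_WEIGHTS` table and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical weights: terms in the title count more than terms in the body.
POSITION_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}


def position_weighted_scores(sections, weights=POSITION_WEIGHTS):
    """Sum, for each term, the weight of every position where it occurs."""
    scores = defaultdict(float)
    for position, tokens in sections.items():
        for term in tokens:
            scores[term] += weights[position]
    return dict(scores)


def top_l_features(scores, l):
    """Return the l highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:l]]


sections = {
    "title": ["text", "clustering"],
    "abstract": ["clustering", "position", "weight"],
    "body": ["text", "weight", "weight", "library"],
}
scores = position_weighted_scores(sections)
features = top_l_features(scores, 3)
```

Keeping only the top L terms is what shrinks the feature vector before the clustering stage.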
7.
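To address the mismatch between query keywords and the words used in documents in information retrieval, a query expansion algorithm based on association rules and clustering is proposed. In its first stage, the algorithm mines association rules from the top N documents of the initial query results, extracts the rules that contain initial query terms to build a rule base, and selects from it the terms most strongly associated with the query terms as expansion terms; these are combined with the initial query to form a new query, which is then run. In its second stage, the results of the new query are clustered, a final relevance score is computed for each document, and the results are re-ranked by that score. Experimental results show that the algorithm retrieves better than using association rules alone or clustering alone.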
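The first stage can be sketched as follows. This is an illustration, not the paper's algorithm: it assumes rules of the form "query term → candidate term" scored by plain confidence. The function name and the parameter `k` (number of expansion terms) are hypothetical.

```python
from collections import Counter


def expansion_terms(query_terms, feedback_docs, k):
    """Rank candidate terms by the confidence of the rule 'query term -> candidate'."""
    doc_sets = [set(tokens) for tokens in feedback_docs]
    query = set(query_terms)
    best = {}
    for q in query:
        with_q = [d for d in doc_sets if q in d]
        if not with_q:
            continue
        # Count the documents that contain both q and each candidate term.
        co = Counter(t for d in with_q for t in d if t not in query)
        for term, count in co.items():
            confidence = count / len(with_q)
            best[term] = max(best.get(term, 0.0), confidence)
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


feedback_docs = [
    ["jaguar", "car", "engine"],
    ["jaguar", "car", "speed"],
    ["jaguar", "cat"],
    ["car", "engine"],
]
expanded = ["jaguar"] + expansion_terms(["jaguar"], feedback_docs, k=2)
```

The expanded query would then be run again, and the second stage (clustering and re-ranking) would operate on its results.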
8.
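A Web document clustering method based on granular computing (WDCGrc) is proposed. The method computes term weights with the TF-IDF rule and reduces dimensionality by combining a document threshold with the average weight, extracting the core terms of each document. It establishes a conversion between a document's core terms and binary granules, proposes a granular-computing-based association rule algorithm to obtain frequent itemsets across documents, forms initial clusters from the frequent itemsets, and refines the initial clusters with an optimization algorithm to obtain the final clustering. Experimental results show that the method is effective and yields good clustering quality.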
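The weighting and dimensionality-reduction step might look like the sketch below. It assumes one common TF-IDF variant (relative term frequency times the natural log of N/df) and keeps only terms above the document's average weight. The abstract also mentions a document threshold, which is not specified and is left out here. Function names are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document."""
    n = len(docs)
    df = Counter(term for tokens in docs for term in set(tokens))
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        weights.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def core_terms(doc_weights):
    """Keep the terms whose weight exceeds the document's average weight."""
    if not doc_weights:
        return set()
    average = sum(doc_weights.values()) / len(doc_weights)
    return {t for t, w in doc_weights.items() if w > average}


docs = [
    ["web", "web", "cluster", "page"],
    ["web", "page", "rule"],
    ["web", "granule", "granule"],
]
core = [core_terms(w) for w in tfidf(docs)]
```

A term such as "web" that appears in every document gets weight zero and is dropped, which is the intended effect of the reduction.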
9.
10.
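Web document clustering algorithm based on association rules    Total citations: 32, self-citations: 1, citations by others: 32
Web document clustering can effectively shrink the search space, speed up retrieval, and improve query precision. A clustering algorithm for Web documents is proposed. The algorithm first represents topics with the vector space model (VSM) and represents documents in terms of topics. Then, treating documents as transactions and topics as transaction items, it casts the relationship between documents and topics in transactional form and applies an association rule mining algorithm to find frequent topic sets; the corresponding document sets are the preliminary document clusters. Finally, clusters are merged and split according to thresholds on inter-cluster distance and intra-cluster connection strength, yielding the final document clustering. Experimental results show that the algorithm is effective, can handle the inherent overlap between document clusters, and has practical value.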
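The "documents as transactions, topics as items" step can be sketched as below. For brevity, this example enumerates candidate topic sets by brute force instead of using a real association rule miner such as Apriori. The merge and split stage is omitted. The function name, `min_support` (an absolute document count), and `max_size` are hypothetical.

```python
from itertools import combinations


def frequent_topic_sets(doc_topics, min_support, max_size=2):
    """Map each frequent topic set to the set of documents that contain it.

    doc_topics: dict of document id -> set of topics (one transaction each).
    """
    all_topics = sorted(set().union(*doc_topics.values()))
    clusters = {}
    for size in range(1, max_size + 1):
        for combo in combinations(all_topics, size):
            items = frozenset(combo)
            supporting = {d for d, topics in doc_topics.items() if items <= topics}
            if len(supporting) >= min_support:
                clusters[items] = supporting
    return clusters


doc_topics = {
    "d1": {"sports", "health"},
    "d2": {"sports", "health"},
    "d3": {"health", "food"},
    "d4": {"food"},
}
clusters = frequent_topic_sets(doc_topics, min_support=2)
```

Document `d3` lands in both the "health" and the "food" preliminary clusters, which illustrates the overlap between clusters that the abstract mentions.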
11.
S. Shaw 《Journal of Computer Assisted Learning》1993,9(2):93-99
This paper describes an approach to the design of interactive multimedia materials being developed in a European Community project. The developmental process is seen as a dialogue between technologists and teachers. This dialogue is often problematic because of the differences in training, experience and culture between them. Conditions needed for fruitful dialogue are described and the generic model for learning design used in the project is explained.
12.
European Community policy and the market    Total citations: 1, self-citations: 0, citations by others: 1
C. Lloyd 《Journal of Computer Assisted Learning》1993,9(2):86-91
This paper starts with some reflections on the policy considerations and priorities which are shaping European Commission (EC) research programmes. Then it attempts to position the current projects which seek to capitalise on information and communications technologies for learning in relation to these priorities and the apparent realities of the marketplace. It concludes that while there are grounds to be optimistic about the contribution EC programmes can make to the efficiency and standard of education and training, they are still too technology driven.
13.
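Fusion and ensemble methods are widely used in pattern recognition. However, some base classifiers have unstable real-time performance, which degrades the performance of multi-classifier fusion. To address this problem, this paper proposes a new sub-fusion ensemble classifier system based on multiple classifiers. Working above the measurement-level fusion layer, the method dynamically selects among the base classifiers and takes the class with the most votes as the class the fusion system assigns to the feature vector, forming a new adaptive sub-fusion ensemble classifier method. Experiments show that the method achieves markedly higher recognition accuracy than traditional classifiers and classifier fusion methods, and is more robust.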
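A minimal sketch of selection followed by majority voting. The abstract does not say how base classifiers are selected dynamically, so this example assumes each classifier carries a recent accuracy estimate and is kept only if that estimate passes a threshold. `fuse`, `majority_vote`, and `min_accuracy` are hypothetical names.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def fuse(classifiers, x, min_accuracy=0.0):
    """classifiers: list of (predict_function, recent_accuracy) pairs."""
    selected = [predict for predict, accuracy in classifiers if accuracy >= min_accuracy]
    if not selected:
        raise ValueError("no base classifier passed the selection threshold")
    return majority_vote([predict(x) for predict in selected])


# Toy base classifiers that ignore their input and always answer the same label.
classifiers = [
    (lambda x: "a", 0.90),
    (lambda x: "b", 0.80),
    (lambda x: "b", 0.40),
    (lambda x: "b", 0.30),
    (lambda x: "a", 0.85),
]
all_votes = fuse(classifiers, x=None)
selected_votes = fuse(classifiers, x=None, min_accuracy=0.5)
```

With every classifier voting, the two unreliable ones tip the result to "b"; after selection the vote goes to "a", which shows why filtering unstable base classifiers can change the fused decision.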
14.
Wayne O’Brien 《Journal of Systems and Software》2008,81(11):1997-2013
Development of software intensive systems (systems) in practice involves a series of self-contained phases for the lifecycle of a system. Semantic and temporal gaps, which occur among phases and among developer disciplines within and across phases, hinder the ongoing development of a system because of the interdependencies among phases and among disciplines. Such gaps are magnified among systems that are developed at different times by different development teams, which may limit reuse of artifacts of systems development and interoperability among the systems. This article discusses such gaps and a systems development process for avoiding them.
15.
This paper presents control chart models and the necessary simulation software for locating the economic values of the control parameters. The simulation program is written in FORTRAN, requires only 10K of main storage, and can run on most mini and micro computers. Two models are presented: one describes the process when it is operating at full capacity and the other when the process is operating under capacity. The models allow the product quality to deteriorate to a further level before an existing out-of-control state is detected, and they can also be used in situations where no prior knowledge exists of the out-of-control causes and the resulting proportion of defectives.
16.
Going through a few examples of robot artists who are recognized worldwide, we try to analyze the deepest meaning of what is called “robot art” and the related art field definition. We also try to highlight its well-marked borders, such as kinetic sculptures, kinetic art, cyber art, and cyberpunk. A brief excursion into the importance of the context, the message, and its semiotics is also provided, case by case, together with a few hints on the history of this discipline in the light of an artistic perspective. Therefore, the aim of this article is to try to summarize the main characteristics that might classify robot art as a unique and innovative discipline, and to track down some of the principles by which a robotic artifact can or cannot be considered an art piece in terms of social, cultural, and strictly artistic interest.
This work was presented in part at the 13th International Symposium on Artificial Life and Robotics, Oita, Japan, January 31–February 2, 2008.
17.
David Poole 《Computational Intelligence》1989,5(2):97-110
Although there are many arguments that logic is an appropriate tool for artificial intelligence, there has been a perceived problem with the monotonicity of classical logic. This paper elaborates on the idea that reasoning should be viewed as theory formation where logic tells us the consequences of our assumptions. The two activities of predicting what is expected to be true and explaining observations are considered in a simple theory formation framework. Properties of each activity are discussed, along with a number of proposals as to what should be predicted or accepted as reasonable explanations. An architecture is proposed to combine explanation and prediction into one coherent framework. Algorithms used to implement the system as well as examples from a running implementation are given.
18.
Watts S. Humphrey 《Annals of Software Engineering》2002,14(1-4):39-72
This paper provides the author's personal views and perspectives on software process improvement. Starting with his first work on technology assessment in IBM over 20 years ago, Watts Humphrey describes the process improvement work he has been directly involved in. This includes the development of the early process assessment methods, the original design of the CMM, and the introduction of the Personal Software Process (PSP)℠ and Team Software Process (TSP)℠. In addition to describing the original motivation for this work, the author also reviews many of the problems he and his associates encountered and why they solved them the way they did. He also comments on the outstanding issues and likely directions for future work. Finally, this work has built on the experiences and contributions of many people. Mr. Humphrey only describes work that he was personally involved in and he names many of the key contributors. However, so many people have been involved in this work that a full list of the important participants would be impractical.
19.
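SAR image denoising based on significance-corrected noise variance in the complex wavelet domain    Total citations: 4, self-citations: 1, citations by others: 3
A speckle noise filtering method for synthetic aperture radar (SAR) images is proposed that combines statistical modeling in the complex wavelet domain with a significance correction of the noise variance estimate. The method first applies a logarithmic transform to convert the multiplicative noise model into an additive one, then applies the Dual-Tree Complex Wavelet Transform (DCWT) to the transformed image and models the statistical distribution of the complex wavelet coefficients. On the basis of this prior distribution, Bayesian estimation is used to recover the original coefficients from the noisy ones, thereby filtering out the noise. Experimental results show that the method removes noise while preserving image detail, and denoises well.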
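The logarithmic step can be written out as a short worked equation. The symbols are assumptions for illustration and do not come from the abstract: $Y$ is the observed intensity, $X$ the speckle-free signal, and $N$ the speckle term, all strictly positive.

```latex
Y = X \cdot N
\quad\Longrightarrow\quad
\log Y = \log X + \log N
```

After the transform the noise term $\log N$ is additive, so estimators designed for additive noise in the wavelet domain can be applied to $\log Y$; exponentiating the estimate of $\log X$ returns to the original intensity scale.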