Sample-weighted clustering methods
Authors:Jian Yu, Miin-Shen Yang, E. Stanley Lee
Affiliation:
  • a Department of Computer Science, Beijing Jiaotong University, Beijing 100044, China
  • b Department of Applied Mathematics, Chung Yuan Christian University, Chung-Li 32023, Taiwan
  • c Department of Industrial and Manufacturing Systems Engineering, Kansas State University, KS 66506, USA
    Abstract:Although much research on cluster analysis has considered feature (or variable) weights, little effort has been devoted to sample weights in clustering. In practice, not every sample in a data set is equally important for cluster analysis, so it is worthwhile to obtain proper sample weights when clustering a data set. In this paper, we represent the sample weights by a probability distribution over the data set and apply the maximum entropy principle to compute these weights automatically. This method yields sample-weighted versions of most clustering algorithms, such as k-means, fuzzy c-means (FCM), and expectation-maximization (EM). The proposed sample-weighted clustering algorithms are robust for data sets with noise and outliers. We also analyze the convergence properties of the proposed algorithms and use numerical and real data sets for demonstration and comparison. The experimental results demonstrate that the proposed sample-weighted clustering algorithms are effective and robust.
    Keywords:Cluster analysis   Maximum entropy principle   k-means   Fuzzy c-means   Sample weights   Robustness
    This article is indexed in ScienceDirect and other databases.
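
    The abstract gives no formulas, so the following is only an illustrative sketch of how maximum-entropy sample weights could be combined with k-means; it is not the authors' algorithm. It assumes the weights w_i minimize sum_i w_i d_i + gamma * sum_i w_i ln(w_i) subject to sum_i w_i = 1, where d_i is the squared distance from sample i to its nearest center. Under that assumption the weights are proportional to exp(-d_i / gamma). The function name, the `gamma` temperature, and the `init` parameter are hypothetical choices made for this example.

```python
import numpy as np

def sample_weighted_kmeans(X, k, gamma=1.0, n_iter=50, init=None, seed=0):
    """Illustrative k-means with maximum-entropy sample weights (assumed form)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    if init is None:
        # Pick k distinct samples as starting centers.
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.array(init, dtype=float)
    weights = np.full(n, 1.0 / n)
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        dmin = d2[np.arange(n), labels]
        # Assumed maximum-entropy solution: w_i proportional to exp(-d_i / gamma).
        # Subtracting dmin.min() only improves numerical stability.
        weights = np.exp(-(dmin - dmin.min()) / gamma)
        weights /= weights.sum()
        # Each center becomes the weighted mean of its assigned samples.
        for j in range(k):
            mask = labels == j
            if mask.any() and weights[mask].sum() > 0:
                centers[j] = np.average(X[mask], axis=0, weights=weights[mask])
    return centers, labels, weights
```

    In this sketch a sample far from every center receives a weight close to zero, so it barely moves the center it is assigned to; that is one way the robustness to outliers described in the abstract could arise. A larger `gamma` pushes the weights toward uniform, in which case the updates approach those of ordinary k-means.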