20 similar documents found (search time: 0 ms)
1.
Learning to Predict Links by Integrating Structure and Interaction Information in Microblogs

Link prediction in microblogs using unsupervised methods, which aims to find an appropriate similarity measure between users in the network, has been studied extensively in recent years. However, the...
2.
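Topic models are an important tool for mining latent topics in microblogs. However, most existing topic models are derived from Latent Dirichlet Allocation (LDA), which requires the user to specify the number of topics in advance. To mine microblog topics automatically, the authors propose MB-HDP, a nonparametric Bayesian model based on the Hierarchical Dirichlet Process (HDP). First, for the microblog setting, messages are assumed to be non-exchangeable. Next, the temporal information of microblogs, user interests, and hashtags are used to aggregate topically related messages, which addresses the data sparsity of short microblog texts. Then, the Chinese Restaurant Franchise (CRF) is extended to model the topics of microblog data. Finally, a corresponding Markov Chain Monte Carlo (MCMC) sampling method is designed to infer the distribution parameters of the MB-HDP model. Experiments show that MB-HDP clearly outperforms both LDA and HDP in topic quality, content perplexity, and model complexity.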
3.
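This paper reviews the development of Chinese character information processing technology, proposes a division of that development into three stages, and describes the current state and development trends of the technology.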
4.
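Design and Implementation of an Automatic Information Classification Subsystem in a Domestic Information Navigation System (total citations: 4; self-citations: 1; citations by others: 3)
Categorized information retrieval is an important service commonly provided by information navigation systems. This paper presents an automatic information classification subsystem used in a domestic information navigation system and its implementation, describes the composition and implementation of its classification subject dictionary, and finally gives a method for retrieving the data stored in the database after processing by the subsystem.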
5.
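Filtering and Classification of Airborne LIDAR Point Cloud Data Based on Multi-Source Auxiliary Information (total citations: 2; self-citations: 0; citations by others: 2)
Using multi-source auxiliary information, raw LIDAR point cloud data in the standard las format from the Liangshui National Nature Reserve in Heilongjiang Province were processed. First, the point cloud was clipped and partitioned according to the terrain relief and the different ground objects in the study area, and extremely high and extremely low points were removed. Then, a progressive triangulated irregular network algorithm and a weighted linear iterative prediction algorithm were applied, with different parameter sets, in filtering experiments on blocks with representative characteristics; the filtering results of the two algorithms in different terrain regions were compared to determine the range of applicability of each. The two algorithms were then combined to filter all blocks, and the intensity and echo information carried by the las-format point cloud, together with simultaneously acquired imagery, were used to assist classification. Finally, the results of filtering and classification with multi-source information were evaluated. The results show that combining the two algorithms with multi-source auxiliary information for the Liangshui area preserves local detail of terrain and ground objects well, which provides a technical basis for accurately constructing the digital elevation model and digital surface model of the study area and confirms the feasibility and advantages of applying airborne LIDAR technology in the forest regions of Northeast China.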
6.
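Application of a Clustering Method Based on Sparse Dissimilarity in Information Classification (total citations: 1; self-citations: 1; citations by others: 1)
To address the clustering of high-dimensional sparse data in text information clustering, the relatedness between text objects is measured by computing the sparse feature dissimilarity between objects, and cluster analysis is carried out with a minimum spanning tree method, yielding a clustering method based on sparse feature dissimilarity. Examples show that the algorithm is very effective for classifying text information with multi-keyword matching, and that keywords can be weighted by importance so that the clustering better reflects the actual situation. The algorithm is expected to have important applications in high-dimensional sparse data mining.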
7.
8.
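A Hierarchical Consensus Model for Heterogeneous Spatial Information (total citations: 2; self-citations: 0; citations by others: 2)
Heterogeneous, distributed spatial information on the Internet cannot be exchanged or used cooperatively because each source is independent and relatively closed, which creates isolated islands of spatial information. Exploiting the ability of knowledge representation theory and Ontology technology to express the shared understanding of a domain explicitly, this paper tentatively proposes a hierarchical consensus model for heterogeneous spatial information. The model offers a new and practical approach to solving this problem.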
9.
10.
Classifying RFID attacks and defenses (total citations: 2; self-citations: 0; citations by others: 2)
Aikaterini Mitrokotsa, Melanie R. Rieback, Andrew S. Tanenbaum. Information Systems Frontiers, 2010, 12(5): 491-505
RFID (Radio Frequency Identification) systems are one of the most pervasive computing technologies, with technical potential and profitable opportunities in a diverse range of applications. Their advantages include low cost and broad applicability. However, they also present a number of inherent vulnerabilities. This paper develops a structural methodology for the risks that RFID networks face by developing a classification of RFID attacks, presenting their important features, and discussing possible countermeasures. The goal of the paper is to categorize the existing weaknesses of RFID communication so that RFID attacks can be better understood and, subsequently, more efficient and effective algorithms, techniques, and procedures to combat these attacks can be developed.
11.
Bertossi A.A., Olariu S., Pinotti M.C., Si-Qing Zheng. IEEE Transactions on Parallel and Distributed Systems, 2004, 15(7): 654-665
The classification problem transforms a set of N numbers in such a way that none of the first N/2 numbers exceeds any of the last N/2 numbers. A comparator network that solves the classification problem on a set of r numbers is commonly called an r-classifier. We show how Leighton's well-known Columnsort algorithm can be modified to solve the classification problem of N = rs numbers, with 1 ≤ s ≤ r, using an r-classifier instead of an r-sorting network. Overall, the r-classifier is used O(s) times, namely, the same number of times that Columnsort applies an r-sorter. A hardware implementation is proposed that runs in optimal O(s + log r) time and uses O(r log r (s + log r)) work. The implementation shows that, when N = r log r, there is a classifier network solving the classification problem on N numbers in the same O(log r) time and using the same O(r log r) comparators as an r-classifier, thus saving a log r factor in the number of comparators over an (r log r)-classifier.
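The classification property stated in this abstract can be illustrated with a minimal Python sketch. It only checks the property and produces a classified sequence by plain sorting; it is not the Columnsort-based comparator network of the paper. The function names are hypothetical, and an even N is assumed.

```python
from typing import List, Sequence


def is_classified(xs: Sequence[float]) -> bool:
    """Check that no element of the first half exceeds any element of the second half."""
    if len(xs) % 2 != 0:
        raise ValueError("this sketch assumes an even number of elements")
    half = len(xs) // 2
    if half == 0:
        return True
    # The property holds exactly when the largest element of the first half
    # is no larger than the smallest element of the second half.
    return max(xs[:half]) <= min(xs[half:])


def classify_by_sorting(xs: Sequence[float]) -> List[float]:
    """Naive stand-in for a classifier: a sorted sequence is always classified."""
    return sorted(xs)


if __name__ == "__main__":
    print(is_classified([2, 1, 4, 3]))   # halves are separated, though not sorted
    print(is_classified([1, 4, 2, 3]))   # 4 in the first half exceeds 2 in the second
    print(is_classified(classify_by_sorting([1, 4, 2, 3])))
```

Sorting does more work than classification requires, since the order within each half does not matter; that weaker requirement is what lets a classifier stand in for a sorter.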
12.
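Research on Geometry-Based and Solid Modeling Methods in Virtual Reality (total citations: 1; self-citations: 0; citations by others: 1)
Tao Hong (陶红). 《电脑编程技巧与维护》 (Computer Programming Skills & Maintenance), 2013, (4): 62+86
Summarizes the main content and basic methods of geometric modeling and solid modeling in virtual reality design.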
13.
Multimedia Tools and Applications - During a crisis, people post a large number of informative and non-informative tweets on Twitter. Informative tweets provide helpful information such as...
14.
Constant folding is a well-known compiler optimization that evaluates constant expressions at compile time. Constant folding is valid only if the results computed by the compiler are exactly the same as the results that would be computed at run time by the target machine arithmetic. We classify different arithmetics by deriving a general condition under which a target-machine arithmetic can be replaced by a compiler arithmetic. Furthermore, we consider integer arithmetics as a special case. They can be described by residue class arithmetics. We show that these arithmetics form a lattice. Using the order relation in this lattice, we establish a necessary and sufficient criterion under which constant folding can be done in a residue class arithmetic that is different from the one of the target machine. Concerning formal verification, we have formalized our proofs in the Isabelle/HOL system. As examples, we discuss the Java and C integer arithmetics and show which compiler arithmetics are valid for constant folding. This discussion also reveals potential sources of incorrect behavior of C compilers.
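As a rough illustration of folding constants in a different residue class arithmetic, the Python sketch below compares a wrapped evaluation of a * b + c at the target width with an evaluation at a compiler width that is then narrowed. It relies on one standard fact: for addition, subtraction, and multiplication, reducing modulo 2^k after computing modulo 2^m gives the same result as computing modulo 2^k whenever k ≤ m. This is not claimed to be the paper's exact criterion, and all function names are hypothetical.

```python
def wrap_signed(value: int, bits: int) -> int:
    """Reduce value to the two's-complement range of the given bit width."""
    modulus = 1 << bits
    value %= modulus
    return value - modulus if value >= modulus // 2 else value


def eval_wrapped(a: int, b: int, c: int, bits: int) -> int:
    """Evaluate a * b + c, wrapping after every operation at the given width."""
    product = wrap_signed(a * b, bits)
    return wrap_signed(product + c, bits)


def fold_then_narrow(a: int, b: int, c: int, compiler_bits: int, target_bits: int) -> int:
    """Fold in the compiler's arithmetic, then narrow the result to the target width."""
    folded = eval_wrapped(a, b, c, compiler_bits)
    return wrap_signed(folded, target_bits)


if __name__ == "__main__":
    a, b, c = 2**31 - 1, 2, 3
    # A 64-bit compiler arithmetic reproduces a 32-bit wrapping target.
    print(eval_wrapped(a, b, c, 32), fold_then_narrow(a, b, c, 64, 32))
    # A 16-bit compiler arithmetic does not: information is lost before narrowing.
    print(eval_wrapped(70000, 1, 0, 32), fold_then_narrow(70000, 1, 0, 16, 32))
```

The sketch covers only wrapping addition and multiplication; division and languages where overflow is not defined to wrap need separate treatment.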
15.
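With the development of data mining technology, a wide variety of data mining tools continue to be developed, and it is very difficult to keep track of their functions, mining techniques, and future trends. Using data mining techniques, this paper proposes a multidimensional cube classification model for data mining software tools, gives a concrete classification example, summarizes the technical development path and future trends of data mining tools, and further validates its conclusions through an in-depth comparison of data mining tools from three different stages.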
17.
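Clustering and Classification of Deep Web Data Sources (total citations: 1; self-citations: 0; citations by others: 1)
With the rapid growth of information on the Internet, much Web information has been pushed deeper into all kinds of searchable online databases and hidden behind Web query interfaces. For technical reasons, traditional search engines cannot index this information, known as Deep Web information. This paper analyzes the various types of Deep Web query interfaces, studies a data source clustering method based on query interface features and a data source classification method based on the clustering results, and discusses a rule extraction algorithm that derives query probe sets from rule-based and linear document classifiers, as well as a query probing algorithm for classifying Web document databases.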
18.
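Mining data streams with concept drift is very important for many real-time decisions. This paper uses statistical theory to estimate a confidence interval for the true error rate of a given model on the latest concept, detects with a probabilistic guarantee whether concept drift has occurred in the data stream, and introduces this method together with the KMM (kernel mean matching) algorithm into an ensemble classifier framework, yielding WSEC, a new algorithm for data stream classification. Experimental results on simulated and real data streams show that the algorithm is effective.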
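The idea of detecting drift from a confidence interval on the error rate can be sketched in Python as follows. The abstract does not say which statistical bound WSEC uses, so this sketch assumes a Hoeffding bound; the function names, the reference error rate, and the default confidence level are all hypothetical.

```python
import math
from typing import Sequence, Tuple


def hoeffding_interval(errors: Sequence[int], delta: float) -> Tuple[float, float]:
    """Return an interval that contains the true error rate with probability >= 1 - delta.

    errors holds 0/1 flags, 1 meaning the model misclassified that example.
    """
    n = len(errors)
    if n == 0:
        raise ValueError("need at least one example")
    observed = sum(errors) / n
    # Hoeffding: P(|observed - true| >= eps) <= 2 * exp(-2 * n * eps^2)
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    return max(0.0, observed - eps), min(1.0, observed + eps)


def drift_detected(recent_errors: Sequence[int], reference_error: float,
                   delta: float = 0.05) -> bool:
    """Flag drift when even the lower bound on the recent error exceeds the reference."""
    lower, _ = hoeffding_interval(recent_errors, delta)
    return lower > reference_error


if __name__ == "__main__":
    stable_window = [1] * 10 + [0] * 90    # 10% observed error on recent data
    drifted_window = [1] * 60 + [0] * 40   # 60% observed error on recent data
    print(drift_detected(stable_window, reference_error=0.10))
    print(drift_detected(drifted_window, reference_error=0.10))
```

Comparing the lower bound rather than the raw error rate keeps ordinary sampling noise in a small window from being mistaken for drift.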
19.
A complete framework for enumerating and classifying the types of multidatabase system (MDBS) structural and representational discrepancies is developed. The framework is structured according to a relational database schema and is both practical and complete. It was used to build the UniSQL/M commercial multidatabase system. This MDBS was built over Structured-Query-Language-based relational database systems and a unified relational and object-oriented database system named UniSQL/X. However, the results are substantially applicable to heterogeneous database systems that use a nonrelational data model (for example, an object-oriented data model) as the common data model and allow the formulation of queries directly against the component database schemas.