首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
为了解决路网环境中传统的组最近邻查询无法支持用户不确定搜索的问题,在组最近邻查询的基础上引入了“模糊”因子来描述用户查询的不确定性,并提出了四种不同的算法,其中朴素的全局搜索算法利用了Dijkstra 算法的特性来处理不确定性,多维向量算法和V-Tree 算法在此基础上通过缩小搜索空间进一步优化,最后提出的近似算法在牺牲了一定正确率的前提下进一步提高了查询效率。通过在真实路网数据集上的大量实验,总结归纳了不同算法的优势,并充分验证了各个算法的合理性与实用性。  相似文献   

2.
Continuous aggregate nearest neighbor queries   总被引:1,自引:0,他引:1  
This paper addresses the problem of continuous aggregate nearest-neighbor (CANN) queries for moving objects in spatio-temporal data stream management systems. A CANN query specifies a set of landmarks, an integer k, and an aggregate distance function f (e.g., min, max, or sum), where f computes the aggregate distance between a moving object and each of the landmarks. The answer to this continuous query is the set of k moving objects that have the smallest aggregate distance f. A CANN query may also be viewed as a combined set of nearest neighbor queries. We introduce several algorithms to continuously and incrementally answer CANN queries. Extensive experimentation shows that the proposed operators outperform the state-of-the-art algorithms by up to a factor of 3 and incur low memory overhead.  相似文献   

3.
Nearest and reverse nearest neighbor queries for moving objects   总被引:4,自引:0,他引:4  
With the continued proliferation of wireless communications and advances in positioning technologies, algorithms for efficiently answering queries about large populations of moving objects are gaining interest. This paper proposes algorithms for k nearest and reverse k nearest neighbor queries on the current and anticipated future positions of points moving continuously in the plane. The former type of query returns k objects nearest to a query object for each time point during a time interval, while the latter returns the objects that have a specified query object as one of their k closest neighbors, again for each time point during a time interval. In addition, algorithms for so-called persistent and continuous variants of these queries are provided. The algorithms are based on the indexing of object positions represented as linear functions of time. The results of empirical performance experiments are reported.  相似文献   

4.
在大数据环境下,K近邻多标签算法(ML-KNN)高时间复杂度的问题显得尤为突出;此外,ML-KNN也没有考虑◢k◣个近邻对最终分类结果的影响。针对上述问题进行研究,首先将训练集进行聚类,再为测试集找到一个距离其最近的训练数据簇作为新的训练数据集;然后计算最近邻样本的距离权重,并用该权重描述最近邻和其他近邻对预测结果的影响;最后使用新的目标函数为待测样本分类。通过在图片、Web页面文本数据等数据集上的实验表明,所提算法得到了更好的分类结果,并且大大降低了时间复杂度。  相似文献   

5.
Most recently, uncertain graph data begin attracting significant interests of database research community, because uncertainty is the intrinsic property of the real-world and data are more suitable to be modeled as graphs in numbers of applications, e.g. social network analysis, PPI networks in biology, and road network monitoring. Meanwhile, as one of the basic query operators, aggregate nearest neighbor (ANN) query retrieves a data entity whose aggregate distance, e.g. sum, max, to the given query data entities is smaller than those of other data entities in a database. ANN query on both certain graph data and high dimensional data has been well studied by previous work. However, existing ANN query processing approaches cannot handle the situation of uncertain graphs, because topological structures of an uncertain graph may vary in different possible worlds. Motivated by this, we propose the aggregate nearest neighbor query in uncertain graphs (UG-ANN) in this paper. First of all, we give the formal definition of UG-ANN query and the basic UG-ANN query algorithm. After that, to improve the efficiency of UG-ANN query processing, we develop two kinds of pruning approaches, i.e. structural pruning and instance pruning. The structural pruning takes advantages the monotonicity of the aggregate distance to derive the upper and lower bounds of the aggregate distance for reducing the graph size. Whereas, the instance pruning decreases the number of possible worlds to be checked in the searching tree. Comprehensive experimental results on real-world data sets demonstrate that the proposed method significantly improves the efficiency of the UG-ANN query processing.  相似文献   

6.
Aggregate nearest neighbor queries in road networks   总被引:5,自引:0,他引:5  
Aggregate nearest neighbor queries return the object that minimizes an aggregate distance function with respect to a set of query points. Consider, for example, several users at specific locations (query points) that want to find the restaurant (data point), which leads to the minimum sum of distances that they have to travel in order to meet. We study the processing of such queries for the case where the position and accessibility of spatial objects are constrained by spatial (e.g., road) networks. We consider alternative aggregate functions and techniques that utilize Euclidean distance bounds, spatial access methods, and/or network distance materialization structures. Our algorithms are experimentally evaluated with synthetic and real data. The results show that their relative performance depends on the problem characteristics.  相似文献   

7.
金波  张志勇  赵婷 《计算机应用》2020,40(8):2340-2344
针对社交网络中近邻位置查询时个人位置隐私泄漏的问题,采用地理不可区分性机制对位置数据添加随机噪声,提出了一种隐私预算分配方法。首先,对空间区域进行网格化分割,根据用户在不同区域的位置访问量来个性化分配隐私预算;然后,为了解决在扰动位置数据集中近邻查询命中率偏低的问题,提出了一种组合增量近邻查询(CINQ)算法,以扩大需求空间的检索范围,并利用组合查询过滤冗余数据。在仿真实验中,与SpaceTwist算法相比,CINQ算法的查询命中率提高了13.7个百分点。实验结果表明,CINQ算法有效解决了因为查询目标的位置扰动所带来的查询命中率偏低问题,适用于社交网络应用中扰动位置的近邻查询。  相似文献   

8.
空间数据库中反向最近邻查询在低维查询时一般利用基于R-Tree的改进树作为索引结构,由于树型索引结构本身的限制,R-Tree等索引结构的查询在高维中都会出现维数灾难。针对这个问题,提出了一种基于VARdnn-Tree的索引结构,采用量化压缩的方法存储数据,能够有效地支持高维查询。  相似文献   

9.
空间数据库的多类型最近邻查询逐渐受到人们的关注,关于K最近邻查询的研究也较多,但多类型K最近邻查询的研究还存在空白。针对道路网络中的多类型K-最近邻(MT-KNN)问题,结合多类型最近邻查询及K最近邻查询的理论,提出了多类型K最近邻查询算法。通过对分层编码视图进行扩展,建立了多路径分层编码视图,并利用逐步扩展局部路径的方法,实现了多类型K最近邻查询,实验结果分析表明算法具有较好的性能。  相似文献   

10.
反向最近邻查询已成为空间查询的热点问题,而障碍物在实际应用中是不可避免的,因而在障碍物环境中的反向最近邻查询也成为重要的空间查询。已有的可视反向最近邻查询只考虑了可视性,并没有考虑最小障碍距离。提出一种障碍物环境中新的反向最近邻查询的变体,查找障碍距离最小的反向最近邻,即障碍反向最近邻查询。利用障碍距离的计算和相应的剪枝规则,给出障碍反向最近邻查询的算法及相关定理和证明。  相似文献   

11.
通过分析已有的索引结构在进行k近邻查询时效率上的不足,提出了适合进行k近邻查询的X*树索引结构,采用了新的结点分裂算法,同时不需要额外存储结点分裂的历史信息。实验结果表明它比X树的时间和空间性能更好,更适合k近邻查询的应用。  相似文献   

12.
A parametrically-directed nearest neighbor procedure is developed to reduce the error between asymptotic and finite sample risk. Two examples are given.  相似文献   

13.
针对LBS查询服务中构造的匿名区或选取的锚点仍位于敏感区域而导致的位置隐私泄露问题,提出一种基于位置语义的增量近邻隐私保护方法。该方法在客户端/服务器体系架构下,先根据用户的位置隐私需求计算语义安全匿名区,保护用户位置隐私;再筛选语义安全匿名区中道路交叉点作为语义安全锚点,保证了选取锚点是真实存在的,且其语义安全性达到最大;最终客户端以锚点位置请求服务并获取查询结果。实验结果表明,该方法能够较好地保护用户位置的隐私,且查询准确率较高约90%,查询时间较低约60 ms。  相似文献   

14.
针对有限样本下,KNN算法距离量的选择以及以前距离量学习研究中没有充分考虑样本分布的情况,提出了一种新的基于概率的两层最近邻自适应度量算法(PTLNN)。该算法分为两层,在低层使用欧氏距离来确定一个未标记的样本局部子空间;在高层,用AdaBoost在子空间进行信息提取。以最小化平均绝对误差为原则,定义一个基于概率的自适应距离度量进行最近邻分类。该算法结合KNN与AdaBoost算法的优势,在有限样本下充分考虑样本分布能降低分类错误率,并且在噪声数据下有很好的稳定性,能降低AdaBoost过度拟合现象发生。通过与其他算法对比实验表明,PTLNN算法取得更好的结果。  相似文献   

15.
最近邻查询在地理信息系统、智能交通系统、多媒体应用以及数据挖掘等领域有着广泛的应用,随着对最近邻查询问题研究的深入,其应用前景和发展空间将更为广阔。针对近几年时空数据库中提出的最近邻查询的多种变体查询问题进行了详细地介绍和分析,总结了解决这些变体最近邻查询问题的有效方法,最后对最近邻查询问题的发展方向进行了展望。  相似文献   

16.
In this article, we propose a new generalization of the rank nearest neighbor (RNN) rule for multivariate data for diagnosis of breast cancer. We study the performance of this rule using two well known databases and compare the results with the conventional k-NN rule. We observe that this rule performed remarkably well, and the computational complexity of the proposed k-RNN is much less than the conventional k-NN rule.  相似文献   

17.
为解决矩阵分解应用到协同过滤算法的局限性和准确率等问题,提出基于边界矩阵低阶近似(BMA)和近邻模型的协同过滤算法(BMAN-CF)来提高物品评分预测的准确率。首先,引入BMA的矩阵分解算法,挖掘子矩阵的隐含特征信息,提高近邻集合查找的准确率;然后,根据传统基于用户和基于物品的协同过滤算法分别预测出目标用户对目标物品的评分,利用平衡因子和控制因子动态平衡两个预测结果,得到目标用户对物品的评分;最后,利用MapReduce计算框架的特点,对数据进行分块,将该算法在Hadoop环境下并行化。实验结果表明,BMAN-CF比其他矩阵分解算法有更高的评分预测准确率,且加速比实验验证了该算法具有较好的可扩展性。  相似文献   

18.
We propose a two-layer decision fusion technique, called Fuzzy Stacked Generalization (FSG) which establishes a hierarchical distance learning architecture. At the base-layer of an FSG, fuzzy k-NN classifiers receive different feature sets each of which is extracted from the same dataset to gain multiple views of the dataset. At the meta-layer, first, a fusion space is constructed by aggregating decision spaces of all the base-layer classifiers. Then, a fuzzy k-NN classifier is trained in the fusion space by minimizing the difference between the large sample and N-sample classification error. In order to measure the degree of collaboration among the base-layer classifiers and the diversity of the feature spaces, a new measure called, shareability, is introduced. Shearability is defined as the number of samples that are correctly classified by at least one of the base-layer classifiers in FSG. In the experiments, we observe that FSG performs better than the popular distance learning and ensemble learning algorithms when the shareability measure is large enough such that most of the samples are correctly classified by at least one of the base-layer classifiers. The relationship between the proposed and state-of-the-art diversity measures is experimentally analyzed. The tests performed on a variety of artificial and real-world benchmark datasets show that the classification performance of FSG increases compared to that of state-of-the art ensemble learning and distance learning methods as the number of classes increases.  相似文献   

19.
Nearest neighbor (NN) classification assumes locally constant class conditional probabilities, and suffers from bias in high dimensions with a small sample set. In this paper, we propose a novel cam weighted distance to ameliorate the curse of dimensionality. Different from the existing neighborhood-based methods which only analyze a small space emanating from the query sample, the proposed nearest neighbor classification using the cam weighted distance (CamNN) optimizes the distance measure based on the analysis of inter-prototype relationship. Our motivation comes from the observation that the prototypes are not isolated. Prototypes with different surroundings should have different effects in the classification. The proposed cam weighted distance is orientation and scale adaptive to take advantage of the relevant information of inter-prototype relationship, so that a better classification performance can be achieved. Experiments show that CamNN significantly outperforms one nearest neighbor classification (1-NN) and k-nearest neighbor classification (k-NN) in most benchmarks, while its computational complexity is comparable with that of 1-NN classification.  相似文献   

20.
Knowledge and Information Systems - Approximate nearest neighbor (ANN) search on high-dimensional data is a fundamental operation in many applications. In this paper, we study massive queries of...  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号