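How to make full use of massive user browsing-behavior data to build more accurate recommendation algorithms and models, and thereby improve recommender system performance, is currently a hot topic in personalized recommendation research. To address these problems, this survey first briefly summarizes user browsing behaviors, presents an overall framework for browsing-behavior-based recommender systems, and reviews the development of such systems. It then summarizes, compares, and analyzes the key techniques, together with methods for quantifying single browsing behaviors and hybrid browsing behaviors. Finally, it discusses recent results on browsing-behavior recommendation that incorporates multi-source heterogeneous data, and summarizes the open research challenges and development trends in the field.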

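To address the low efficiency of algorithms that recommend taxi-hailing strategies to passengers in smart cities, classical probability is used to compute, from historical trajectories, the proportion of days in the dataset on which a vacant taxi was available on a given road segment at a given time; this proportion serves as the probability that a passenger will find a vacant taxi. The least squares method is used to fit a curve of the number of arriving vacant taxis against time, which predicts how long a passenger will wait for a vacant taxi and thereby improves recommendation efficiency. In addition, Hadoop is used as the data storage and computing platform to increase data-processing capacity; a road-network storage structure based on map rasterization is proposed to speed up map search; and a map-matching algorithm based on computational geometry is improved to raise matching accuracy. Experimental results show that the vacant-taxi probability recommendation algorithm is about 87% accurate and the waiting-time recommendation algorithm is 88.4% accurate, demonstrating the feasibility of mining trajectory data to provide recommendation services for passengers.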
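
The two estimates described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of a straight line for the fitted curve, and the rule "wait until the fitted cumulative arrival count reaches one taxi" are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def vacancy_probability(days_with_vacant_taxi: int, total_days: int) -> float:
    """Classical probability: share of days on which a vacant taxi was seen
    on this road segment in this time slot."""
    if total_days <= 0:
        raise ValueError("total_days must be positive")
    return days_with_vacant_taxi / total_days


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def expected_wait(slope: float, intercept: float, target: float = 1.0) -> float:
    """Assumed rule: time at which the fitted cumulative number of
    vacant-taxi arrivals first reaches `target`."""
    if slope <= 0:
        raise ValueError("fitted arrival rate must be positive")
    return max(0.0, (target - intercept) / slope)


# Hypothetical data: minutes since the passenger arrived vs. cumulative
# number of vacant taxis that passed, averaged over historical days.
minutes = [0, 5, 10, 15]
arrivals = [0, 1, 2, 3]
slope, intercept = fit_line(minutes, arrivals)
print(vacancy_probability(26, 30), expected_wait(slope, intercept))
```

A polynomial or other curve could be fitted in place of the straight line; the abstract does not say which form the authors use.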

As users may have different needs in different situations and contexts, it is increasingly important to consider user context data when filtering information. In the field of web personalization and recommender systems, most studies have focused on modelling user profiles and on the personalization process in order to provide personalized services to the user, but not on contextualized services. Rather limited attention has been paid to investigating how to discover, model, exploit and integrate context information in personalization systems in a generic way. In this paper, we aim at providing a novel model to build, exploit and integrate context information with a web personalization system. A context-aware personalization system (CAPS) is developed which is able to model and build contextual and personalized ontological user profiles based on the user’s interests and context information. These profiles are then exploited in order to infer and provide contextual recommendations to users. The methods and system developed are evaluated through a user study which shows that considering context information in web personalization systems can provide more effective personalization services and offer better recommendations to users.

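With the rapid development of mobile broadband, the Internet of Things, and cloud computing, and with more and more mobile terminals and sensing devices connecting to the network, modern society is generating massive amounts of data at an unimaginable pace, which is having a broad and profound impact on traditional models of education. Against a big-data backdrop of huge data volumes, many data types, and diverse information, the ways in which higher vocational colleges deliver teaching services and use data will change markedly, bringing new opportunities. Applying big-data technology lets institutions take a full-data screening approach to their data resources in order to analyze and mine the patterns hidden behind the data, gain insight and new value, and make sound data-driven decisions, so that the results of campus informatization serve both school operations and quality-oriented education.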

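The rapid development of the mobile Internet and LBS technology allows location service providers to easily collect large volumes of user location trajectory data, and recent research shows that deep learning methods can extract private information such as user identifiers from trajectory datasets. However, existing work mainly targets check-in trajectories collected from social networks, and research on de-anonymizing GPS trajectories is comparatively scarce. This work therefore studies deep-learning-based de-anonymization of GPS trajectories. First, a pre-training method for GPS trajectory data is proposed: through sub-trajectory segmentation, location point transformation, and location point embedding, the spatial distance and context information in the raw GPS trajectories are embedded into fixed-length vectors so that GPS trajectory data can serve as input to a neural network. Second, a GPS trajectory de-anonymization method based on deep neural network training is proposed: using the vector sequences obtained from pre-training, neural networks such as LSTM and GRU serve as encoders that are trained to fit user identifiers, linking anonymous trajectory data to users. Finally, the methods are validated on the Geolife trajectory dataset, where de-anonymization accuracy and Top-5 accuracy reach 56.73% and 73.48%, respectively. The results show that the deep-learning-based method can identify user identifiers from anonymous trajectory data fairly accurately.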

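To address the difficulty and low efficiency of making recommendations for new users who have no historical data in a big-data setting, a technical approach is proposed that combines a Mahout-based collaborative filtering algorithm with a MapReduce-based Top N algorithm to implement a new-user recommendation algorithm. On this basis, the architecture of a new-user recommender system is constructed, and the Hadoop Top N algorithm and the collaborative filtering algorithm in Mahout are designed and implemented. Theoretical analysis and experimental validation show that this new-user recommendation algorithm clearly outperforms new-user recommendation using collaborative filtering alone in recommendation efficiency, scalability to large-scale data processing, and recommendation quality.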
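
The Top N step named in this abstract follows a common MapReduce pattern, sketched below in plain Python. The sketch does not use the Hadoop or Mahout APIs; the function names are hypothetical, and it assumes that an earlier job has already aggregated scores so that each item appears in exactly one partition.

```python
import heapq
from typing import Iterable, List, Tuple

ItemScore = Tuple[str, int]


def map_top_n(partition: Iterable[ItemScore], n: int) -> List[ItemScore]:
    """Map phase: each mapper keeps only the N best items of its partition."""
    return heapq.nlargest(n, partition, key=lambda pair: (pair[1], pair[0]))


def reduce_top_n(local_tops: Iterable[List[ItemScore]], n: int) -> List[ItemScore]:
    """Reduce phase: a single reducer merges the local lists into the global Top N."""
    merged = (pair for top in local_tops for pair in top)
    return heapq.nlargest(n, merged, key=lambda pair: (pair[1], pair[0]))


# Hypothetical aggregated (item, popularity) pairs split over two partitions.
partitions = [
    [("a", 5), ("b", 3), ("c", 9)],
    [("d", 7), ("e", 1)],
]
global_top = reduce_top_n((map_top_n(p, 2) for p in partitions), 2)
print(global_top)
```

Because each mapper forwards at most N pairs, the reducer handles at most N times the number of partitions, regardless of how large the input is. A globally popular list of this kind is one plausible fallback for users with no history; how the authors combine it with collaborative filtering is not specified in the abstract.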

Hashtags, terms prefixed by a hash symbol (#), are widely used and can be inserted anywhere within short messages (tweets) on micro-blogging systems, as they convey rich sentiment information on topics that people are interested in. In this paper, we focus on the problem of hashtag recommendation considering their personalized and temporal aspects. As far as we know, this is the first work to address this issue, specifically recommending personalized hashtags by combining long-term and short-term user interest. We introduce three features to capture personal and temporal user interest: 1) hashtag textual information; 2) user behavior; and 3) time. We offer two recommendation models for comparison: a linear-combined model, and an enhanced session-based temporal graph (STG) model, Topic-STG, which use these features to learn user preferences and subsequently recommend personalized hashtags. Experiments on two real tweet datasets illustrate the effectiveness of the proposed models and algorithms.

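To address the privacy leakage of published trajectories brought about by the rapid growth of mobile social networks, a personalized trajectory protection scheme is proposed. Different protection criteria are applied to different individuals according to their personalized privacy requirements, which resolves problems of traditional privacy protection such as "over-protection" and low trajectory utility. Important privacy definitions, including k-sensitive trajectory anonymity and (k,p)-sensitive trajectory anonymity, are given, and personalized privacy protection for individuals is realized through Trie construction, pruning, and reconstruction. Finally, experiments on real datasets show that the personalized scheme outperforms existing privacy protection schemes in trajectory location loss rate, with lower computation latency and higher efficiency.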
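
The Trie construction and pruning mentioned in this abstract can be illustrated as follows. This is only a generic prefix-count sketch: it prunes every branch shared by fewer than k trajectories, and it does not reproduce the paper's k-sensitive or (k,p)-sensitive definitions or its reconstruction step. All names are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.count = 0  # number of trajectories passing through this node


def build_trie(trajectories: Iterable[Sequence[str]]) -> TrieNode:
    """Insert each trajectory (a sequence of location labels) as a path."""
    root = TrieNode()
    for trajectory in trajectories:
        node = root
        for location in trajectory:
            node = node.children.setdefault(location, TrieNode())
            node.count += 1
    return root


def prune(node: TrieNode, k: int) -> None:
    """Remove every subtree that is shared by fewer than k trajectories."""
    for location in list(node.children):
        child = node.children[location]
        if child.count < k:
            del node.children[location]
        else:
            prune(child, k)


def surviving_paths(node: TrieNode, prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """List the root-to-leaf paths that remain after pruning."""
    if not node.children:
        return [prefix] if prefix else []
    paths: List[Tuple[str, ...]] = []
    for location, child in node.children.items():
        paths.extend(surviving_paths(child, prefix + (location,)))
    return paths


trie = build_trie([("A", "B", "C"), ("A", "B", "D"), ("A", "B", "C"), ("E", "F")])
prune(trie, k=2)
print(surviving_paths(trie))
```

A personalized scheme of the kind the abstract describes would presumably apply different thresholds to different individuals rather than one global k.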

Personalized recommendation has become a research hotspot for addressing information overload. Despite this, generating effective recommendations from sparse data remains a challenge. Recently, auxiliary information has been widely used to address data sparsity, but most models using auxiliary information are linear and have limited expressiveness. Due to their advantages in feature extraction and the fact that they require no labels, autoencoder-based methods have become quite popular. However, most existing autoencoder-based methods discard the reconstruction of auxiliary information, which poses huge challenges for better representation learning and model scalability. To address these problems, we propose Serial-Autoencoder for Personalized Recommendation (SAPR), which aims to reduce the loss of critical information and enhance the learning of feature representations. Specifically, we first combine the original rating matrix and item attribute features and feed them into the first autoencoder to generate a higher-level representation of the input. Second, we use a second autoencoder to enhance the reconstruction of the data representation of the prediction rating matrix. The output rating information is used for recommendation prediction. Extensive experiments on the MovieTweetings and MovieLens datasets have verified the effectiveness of SAPR compared to state-of-the-art models.

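To address shortcomings of existing learning recommendation algorithms, such as ignoring analysis of how well students have mastered individual knowledge points and being unable to express the degree of mastery as a probability, a learning recommendation method based on multiple factors is proposed. The method builds a knowledge-point mastery probability model that jointly considers several factors: the composite weight of each knowledge point, the error rate, and the point-loss rate. The proposed strategy is applied to implement an online personalized learning recommender system. For evaluation, a survey of 200 high school students was conducted; the system's top-8 knowledge-point recommendations reach an accuracy of 91.2% and an F1 score of 78.4%. The survey results demonstrate the effectiveness and reliability of the proposed strategy.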

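To address the missing user ratings and large subjective differences of traditional rating-based music recommendation, a behavioral feature model is built from extracted user behavior data to analyze the association between user behavior and interest, and a factorization machine (FM) is used to predict the type of user behavior as the basis for music recommendation. Applying FM makes full use of music and user attribute features, and fills the sparse recommendation matrix by modeling the latent factors in the user behavior feature data, reducing the impact of data sparsity on prediction. Compared with traditional music recommendation methods, mining users' interest tendencies from their historical behavior is a more feasible way to solve the problems introduced by rating models, and experimental results show that the method also performs well for music recommendation.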
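
For readers unfamiliar with factorization machines, the standard second-order FM score is the sum of a bias, a linear term, and pairwise interactions weighted by inner products of latent factor vectors. The sketch below computes that score with the usual linear-time reformulation of the pairwise term. The parameter values are made up, and how the authors turn the score into a behavior-type prediction is not stated in the abstract.

```python
from typing import List, Sequence


def fm_predict(x: Sequence[float], w0: float, w: Sequence[float],
               v: Sequence[Sequence[float]]) -> float:
    """Second-order factorization machine score.

    x  : feature vector (user, item and attribute features)
    w0 : global bias
    w  : one linear weight per feature
    v  : one latent factor vector per feature (all of the same length)
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    num_factors = len(v[0]) if v else 0
    pairwise = 0.0
    for f in range(num_factors):
        # 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2) equals the sum over
        # all feature pairs i < j of v_if * v_jf * x_i * x_j.
        terms: List[float] = [v[i][f] * x[i] for i in range(len(x))]
        total = sum(terms)
        pairwise += 0.5 * (total * total - sum(t * t for t in terms))
    return w0 + linear + pairwise


# Hypothetical parameters: 3 features, 2 latent factors.
score = fm_predict(
    x=[1.0, 1.0, 0.0],
    w0=0.1,
    w=[0.2, 0.3, 0.4],
    v=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
)
print(score)
```

Because interactions are expressed through latent vectors, a pair of features that never co-occurs in the training data still receives an interaction weight, which is why FM copes well with sparse matrices.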

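To address electronic medical information overload and the severe shortage of medical resources, this paper builds on the results of assisted diagnosis and treatment and combines Skyline queries with collaborative-filtering-based scoring within a local range to propose a personalized recommendation algorithm for intelligent patient guidance. Experimental results show that the proposed algorithm provides users with reasonable personalized recommendations. The method does much to promote the rational allocation and use of medical resources, can ease the pressure on medical visits to some extent, and improves the quality of care, giving it significant practical value and social significance.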
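
A Skyline query returns the candidates that are not dominated by any other candidate, that is, no other candidate is at least as good on every attribute and strictly better on one. The naive sketch below assumes that smaller values are better on every attribute; the two attributes (distance and expected waiting time) are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every attribute and strictly better
    on at least one (smaller is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep the points no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical clinics described by (distance in km, expected wait in hours).
clinics = [(1, 5), (2, 2), (3, 3), (5, 1), (2, 6)]
print(skyline(clinics))
```

The skyline only narrows the candidate set; ranking within it would still need a scoring step such as the collaborative-filtering scores the abstract mentions.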

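To effectively mine anomalies in users' everyday movement trajectories from the digital information of IoT mobile devices, and to address the low efficiency of existing user anomalous-trajectory detection algorithms, a two-layer clustering method for detecting anomalous user trajectories is proposed. Considering that trajectory information on mobile terminals is large in volume and unevenly distributed, the method extracts a set of stay points under given spatial distance and time interval thresholds, applies hierarchical clustering to these points, partitions stay regions according to the result, and then identifies the anomalous stay regions among them. Finally, the trajectory segments that occur between stay regions are clustered in a second round of hierarchical clustering to find anomalous trajectory segments. Experimental results show that, compared with traditional algorithms, the method both detects anomalous trajectories comprehensively and speeds up anomaly detection.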
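
Stay-point extraction under a distance threshold and a time threshold is commonly done with a simple scan over the track, sketched below. This is the widely used heuristic, not necessarily the variant used by the authors, and the thresholds and coordinates are illustrative only.

```python
import math
from typing import List, Sequence, Tuple

Fix = Tuple[float, float, float]  # (timestamp in seconds, latitude, longitude)
Stay = Tuple[float, float, float, float]  # (mean lat, mean lon, arrival, departure)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 coordinates."""
    radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def extract_stay_points(track: Sequence[Fix], max_dist_m: float,
                        min_duration_s: float) -> List[Stay]:
    """Scan a time-ordered track and emit one stay point for each run of
    consecutive fixes that stays within max_dist_m of the run's first fix
    for at least min_duration_s."""
    stays: List[Stay] = []
    i, n = 0, len(track)
    while i < n:
        j = i + 1
        while j < n and haversine_m(track[i][1], track[i][2],
                                    track[j][1], track[j][2]) <= max_dist_m:
            j += 1
        group = track[i:j]  # fixes i .. j-1 lie close to the anchor fix i
        duration = group[-1][0] - group[0][0]
        if len(group) > 1 and duration >= min_duration_s:
            lat = sum(p[1] for p in group) / len(group)
            lon = sum(p[2] for p in group) / len(group)
            stays.append((lat, lon, group[0][0], group[-1][0]))
            i = j
        else:
            i += 1
    return stays


# Hypothetical track: three fixes close together, then movement away.
track = [(0, 39.9, 116.4), (300, 39.9001, 116.4), (900, 39.9, 116.4001),
         (1000, 39.95, 116.5), (1100, 40.0, 116.6)]
print(extract_stay_points(track, max_dist_m=200, min_duration_s=600))
```

The resulting stay points would then feed the first clustering layer described in the abstract.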

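Because big data is massive, complex and diverse, and fast-changing, traditional machine learning platforms are no longer adequate, so designing an efficient, general-purpose big-data machine learning platform has become a current research hotspot. By introducing and analyzing the characteristics of machine learning algorithms and the data and model parallelism of large-scale machine learning, this survey leads into the common parallel computing models. It briefly introduces the bulk synchronous parallel (BSP) model, the SSP parallel computing model, and how the BSP and SSP models differ from the AP model. It then focuses on typical machine learning platforms built on these parallel models and their strengths and weaknesses, and points out which kinds of big-data problems each platform is best suited to. Finally, it summarizes the typical machine learning platforms in terms of their abstract data structures, parallel computing models, and fault-tolerance mechanisms, and offers some suggestions and an outlook.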

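To meet the need for swarm pattern mining on big spatio-temporal trajectory data, an efficient MapReduce-based distributed swarm pattern mining algorithm is proposed. First, the concept of an object-set closed swarm pattern based on the maximal moving-object set is introduced, and the serial mining algorithm is optimized using the minimum time support set. Second, a parallel mining model for swarm patterns is proposed that exploits the time-domain independence of swarm patterns to parallelize clustering and swarm pattern mining over sub-time-domains. Third, a distributed parallel mining algorithm based on a chained MapReduce architecture is designed, which mines swarm patterns in parallel quickly in four stages. Finally, the effectiveness and efficiency of the distributed algorithm are verified and analyzed on the Hadoop platform using a real big traffic trajectory dataset.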


The Internet of Things (IoT) holds the promise to blend real-world and online behaviors in principled ways, yet we are only beginning to understand how to turn insights from the online realm into effective applications in smart environments. Such smart environments aim to provide an improved, personalized experience based on the trail of user interactions with smart devices, but how does recommendation in smart environments differ from the usual online recommender systems? And can we exploit similarities to truly blend behavior in both realms to address the fundamental cold-start problem? In this article, we experiment with behavioral user models based on interactions with smart devices in a museum, and investigate the personalized recommendation of what to see after visiting an initial set of Points of Interest (POIs), a key problem in personalizing museum visits or tour guides, and focus on a critical one-shot POI recommendation task: where to go next? We have logged users' onsite physical information interactions during visits in an IoT-augmented museum exhibition at scale. Furthermore, we have collected an even larger set of search logs of the online museum collection. Users in both sets are unconnected; for privacy reasons we do not have shared IDs. We study the similarities between users' online digital and onsite physical information interaction behaviors, and build new behavioral user models based on the information interaction behaviors in (i) the physical exhibition space, (ii) the online collection, or (iii) both. Specifically, we propose a deep neural multilayer perceptron (MLP) based on explicitly given users' contextual information, and set-based extracted features using users' physical information interaction behaviors and similar users' digital information interaction behaviors. Our experimental results indicate that the proposed behavioral user modeling approach, using both physical and online user information interaction behaviors, improves the onsite POI recommendation baselines' performance on all evaluation metrics. Our proposed MLP approach achieves 83% precision at rank 1 on the critical one-shot POI recommendation problem, realizing the high accuracy needed for fruitful deployment in practical situations. Furthermore, the MLP model is less sensitive to the amount of real-world interactions in terms of the set size of seen POIs, by backing off to the online data, and hence helps address the cold-start problem in recommendation. Our general conclusion is that it is possible to fruitfully combine information interactions in the online and physical world for effective recommendation in smart environments.

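Recommendation algorithms based on knowledge graphs have achieved good results in many domains, but problems remain; for example, when features in the entity-relation labels of a knowledge graph cannot be extracted effectively, recommendation accuracy drops. A network embedding method is therefore proposed for feature extraction from a tourism knowledge graph so that features are extracted more fully. The attribute subgraphs of different labels in the tourism knowledge graph are modeled independently, and deep learning models are used to mine the semantic features of graph nodes such as tourists and attractions, yielding tourist and attraction feature vectors that fuse the semantics of each label; a list of recommended attractions is finally generated by computing the relevance between tourists and attractions. Experiments on a real tourism knowledge graph verify that modeling the knowledge graph data with network embedding effectively extracts deep node features.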

Journal of Computer Science and Technology - Urban sensing is one of the fundamental building blocks of urban computing. It uses various types of sensors deployed in different geospatial locations...

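To address the low precision, sparsity, and drift of big ship-trajectory data when identifying port stay regions, a multi-constraint ship stay-trajectory extraction method (MPTSSE) is proposed. First, drawing on the characteristics of ship trajectory data, the concept of a stay segment used for identifying and extracting stay regions is defined. Second, a trajectory stay-segment extraction model based on multiple constraints, including speed, time difference, stay duration, and distance, is established together with a parallelized stay-segment extraction algorithm. Finally, an implementation of the stay-segment extraction algorithm on big ship-trajectory datasets is given on the Hadoop framework. Experiments on real ship trajectory data show that, compared with a trajectory stay extraction method based on the Stop/Move model, MPTSSE improves accuracy by 22% in extracting three port berths. MPTSSE effectively avoids mis-segmentation of trajectory stay segments and also executes efficiently on large-scale ship trajectory data.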

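In location recommendation based on social media, modeling the sequence of locations at which users check in is essential. Most existing algorithms ignore the fact that check-in sequences on different days exhibit different temporal characteristics. To solve this problem, a geo-social temporal sequence embedding rank (GSTSER) model is proposed for social-media-based location recommendation. The temporal location embedding model within this unified model captures the contextual check-in information in a sequence as well as the various temporal characteristics of different days. A new method is also proposed that distinguishes unvisited locations according to geo-social information, incorporating geo-social influence into a pairwise preference ranking method. Finally, the two models are combined in a unified framework to recommend locations. To validate the proposed approach, experiments on two real-world datasets show that GSTSER outperforms mainstream state-of-the-art location recommendation algorithms.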

