Similar literature
1.
Early screening of mental disorders plays a crucial role in diagnosis and treatment. This study explores how data‐driven methods can leverage the information available on social media platforms to predict postpartum depression (PPD). A generalized approach is proposed in which linguistic features are extracted from user‐generated textual posts on social media and the posts are categorized as general, depressive, or PPD representative using multiple machine learning techniques. We find that the techniques used in our study exhibit strong predictive capabilities for PPD content. Holdout validation showed that the multilayer perceptron outperformed the other techniques used in this study, such as support vector machine and logistic regression, with 91.7% accuracy for depressive content identification and up to 86.9% accuracy for PPD content prediction. This work adopts a hierarchical approach to predict PPD. Therefore, the reported PPD accuracy reflects how well the model separates PPD content from non‐PPD depressive content.

2.
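Social networks are an effective platform for information dissemination and make people's lives more convenient. At the same time, online social networking has steadily raised the value of social network accounts. To obtain illegal profits, however, criminal groups covertly use social network platforms to carry out fraud, gambling, and other criminal activities. To protect users' safety on social networks, various malicious account detection methods based on user behavior and relationship propagation have been proposed. Such methods must accumulate sufficient user data before they can detect malicious accounts, and criminal groups can exploit this time lag to carry out a large number of criminal activities. This paper first systematically analyzes existing work on malicious account detection. To overcome the shortcomings of existing methods and detect malicious accounts faster, it then designs a malicious account detection method based on account registration attributes. The method first analyzes how malicious and normal accounts are distributed over different attribute values, and from this designs and extracts similarity features and anomaly features of accounts. It then computes the pairwise similarity between accounts from these features, builds a graph, and clusters it to mine malicious registration groups, thereby effectively detecting malicious accounts at the registration stage.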
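The pairwise-similarity, graph-building, and clustering steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the attribute names (`ip_prefix`, `device`, `email_domain`), the exact-match similarity, the 0.6 threshold, and the use of connected components as the clustering step are all assumptions chosen for illustration.

```python
from itertools import combinations


def attribute_similarity(a, b):
    """Fraction of shared registration attributes on which two accounts agree."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] == b[k] for k in keys) / len(keys)


def cluster_accounts(accounts, threshold=0.6):
    """Link accounts whose similarity reaches the threshold, then return
    the connected components that contain more than one account."""
    parent = {name: name for name in accounts}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(accounts, 2):
        if attribute_similarity(accounts[u], accounts[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for name in accounts:
        groups.setdefault(find(name), set()).add(name)
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical registration records (all values invented for illustration).
accounts = {
    "a1": {"ip_prefix": "10.0.3", "device": "X1", "email_domain": "mail.example"},
    "a2": {"ip_prefix": "10.0.3", "device": "X1", "email_domain": "mail.example"},
    "a3": {"ip_prefix": "10.0.3", "device": "X1", "email_domain": "other.example"},
    "b1": {"ip_prefix": "172.16.9", "device": "Z7", "email_domain": "corp.example"},
}
suspicious_groups = cluster_accounts(accounts)
```

In this toy data the three `a*` accounts end up in one group, while `b1` shares no attribute values with them and stays unclustered. A real system would likely replace the exact-match similarity with the similarity and anomaly features the abstract mentions.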

3.
Social networks, once an innocuous platform for sharing pictures and thoughts among a small online community of friends, have now transformed into a powerful tool of information, activism, mobilization, and sometimes abuse. Detecting the true identity of social network users is an essential step toward making social media an efficient channel of communication. This paper targets the microblogging service Twitter as the social network of choice for investigation. It has been observed that the dissemination of pornographic content and the promotion of follower markets are actively operational on Twitter. This clearly indicates loopholes in Twitter’s spam detection techniques. Through this work, five types of spammers have been identified: sole spammers, pornographic users, follower market merchants, fake profiles, and compromised profiles. For detection, data on around 1 lakh (100,000) Twitter users and their 20 million tweets has been collected. Users have been classified based on trust-based, user-based, and content-based features using machine learning techniques such as Bayes Net, Logistic Regression, J48, Random Forest, and AdaBoostM1. The experimental results show that the Random Forest classifier is able to predict spammers with an accuracy of 92.1%. Based on these initial classification results, a novel system for real-time streaming of users for spam detection has been developed. We envision that such a system should give Twitter users an indication of the identity of other users in real time.

4.
Walking is the most fundamental requirement for independent living in daily life. An intelligent walking-support robot has been developed for use by people with walking disabilities. To appropriately assist the user, the robot must precisely track the user’s intentions. However, the robot’s tracking accuracy is severely compromised by time-varying friction, center-of-gravity (CoG) shifts, and load changes induced by the user. In a previous study, we proposed a digital acceleration controller with online inertial parameter identification. However, the tracking accuracy was still affected by CoG shifts introduced by the users. To address these issues, the current study investigated a novel dynamic model, wherein all the load and CoG information processed in the inertial matrix was derived and a new digital acceleration controller with parameter estimation was used to compensate for the time-varying friction, CoG shifts, and load changes. Experiments were conducted under different floor and load conditions to demonstrate the improved tracking accuracy of the proposed control method.

5.
邢千里, 刘列, 刘奕群, 张敏, 马少平. Journal of Software (《软件学报》), 2015, 26(7): 1626-1637
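In microblog environments, users can add tags to their own profiles, and these tags are often regarded as important descriptions of a user's characteristics and interests. The information contained in tags may help build accurate user profiles, so it has potential value in applications such as personalized recommendation, expert retrieval, and influence analysis. First, the tagging behavior of microblog users and the distribution of tag content are analyzed on large-scale data. Next, users' microblog content is analyzed with a topic model; the experimental results show that the more similar two users' tags are, the more similar their microblog content is, and vice versa. Then, the relationship between users' following relations and their microblog and tag content is analyzed; the experimental results show that users linked by a following relation have more similar microblog and tag content. Based on this finding, tag content and microblog content are each used to predict following relations among users in real microblog data. The results show that the tag-based prediction method performs significantly better than the method based on microblog content, which demonstrates the value of user tags in describing user interests.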

6.
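With the development of the Web 2.0 era, microblogs, as an emerging social network medium, play an increasingly important role in people's daily lives. They are not only a bridge for users to communicate and share information but also an important way to obtain it. Microblogs act as both a social network and an information medium, and within this ecosystem, we-media accounts (we media account), which have only media attributes and are used to publish information to the public, have grown rapidly. This paper is the first to pose the research problem of identifying we-media accounts on microblogs. It explains the significance of we-media account identification for analyzing the microblog ecosystem, modeling user interests, and mining high-quality content, and proposes a supervised identification method that combines three types of features: personal information, account behavior, and microblog content. The results show that: 1) we-media accounts differ markedly from ordinary microblog accounts, mainly in the regularity of their posting behavior and in their topic distribution; 2) the three proposed feature types identify we-media accounts effectively and complement one another, with a prediction accuracy as high as 96.71%.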

7.
Adaptive applications may benefit from having models of users' personality to adapt their behavior accordingly. There is a wide variety of domains in which this can be useful, e.g., assistive technologies, e-learning, e-commerce, health care, or recommender systems, among others. The most commonly used procedure to obtain the user personality consists of asking the user to fill in questionnaires. However, on the one hand, it would be desirable to obtain the user personality as unobtrusively as possible, yet without compromising the reliability of the model built. On the other hand, our hypothesis is that users with similar personality are expected to show common behavioral patterns when interacting through virtual social networks, and that these patterns can be mined in order to predict the tendency of a user personality. With the goal of inferring personality from the analysis of user interactions within social networks, we have developed TP2010, a Facebook application. It has been used to collect information about the personality traits of more than 20,000 users, along with their interactions within Facebook. Based on all the collected data, automatic classifiers were trained by using different machine-learning techniques, with the purpose of looking for interaction patterns that provide information about the users' personality traits. These classifiers are able to predict user personality starting from parameters related to user interactions, such as the number of friends or the number of wall posts. The results show that the classifiers have a high level of accuracy, making the proposed approach a reliable method for predicting the user personality.

8.
Users of social media sites can use more than one account. These identities have pseudo-anonymous properties, and as such some users abuse multiple accounts to perform undesirable actions, such as posting false or misleading comments that praise or defame the work of others. The detection of multiple user accounts that are controlled by an individual or organization is important. Herein, we define the problem as sockpuppet gang (SPG) detection. First, we analyze user sentiment orientation to topics based on emotional phrases extracted from their posted comments. Then we evaluate the similarity between sentiment orientations of user account pairs, and build a similar-orientation network (SON) where each vertex represents a user account on a social media site. In an SON, an edge exists only if the two user accounts have similar sentiment orientations to most topics. The boundary between detected SPGs may be indistinct; thus, by analyzing account posting behavior features, we propose a multiple random walk method to iteratively remeasure the weight of each edge. Finally, we adopt multiple community detection algorithms to detect SPGs in the network. User accounts in the same SPG are considered to be controlled by the same individual or organization. In our experiments on real-world datasets, our method shows better performance than other contemporary methods.

9.
In this paper, we focus on the problem of community detection on Sina Weibo, the most popular microblogging system in China. By characterizing the structure and content of microgroups (communities) on Sina Weibo in detail, we observe that, unlike in ordinary social networks, the degree assortativity coefficients are negative on most microgroups. In addition, we find that users from the same microgroup tend to share some common attributes (e.g., followers, tags) and interests extracted from their published posts. Inspired by these new findings, we propose a united method to remodel the network for microgroup detection while maintaining the information of link structure and user content. Firstly, link direction is taken into account by assigning greater weight values to more surprising links, while the content similarity is measured by the Jaccard coefficient of common features and by interest similarity based on the Latent Dirichlet Allocation model. Then, both link direction and content similarity between two users are uniformly converted to the edge weight of a new remodeled network, which is undirected and weighted. Finally, multiple frequently used community detection algorithms that support weighted networks can be applied. Extensive experiments on real-world social networks show that link structure and user content play almost equally important roles in microgroup detection on Sina Weibo. Our method outperforms the traditional methods with an average accuracy improvement of up to 39%, and the number of unrecognized users decreases by about 75%.
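The Jaccard coefficient named in this abstract, and the idea of folding link and content information into a single edge weight, can be sketched as follows. The Jaccard definition is standard; the linear combination with a mixing parameter `alpha` and the `link_score` input are assumptions for illustration only, since the abstract does not state how the two signals are combined.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two feature collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def edge_weight(link_score, features_u, features_v, alpha=0.5):
    """Hypothetical combination of a link-based score and content similarity
    into one weight for an undirected edge (alpha is an assumed parameter)."""
    return alpha * link_score + (1 - alpha) * jaccard(features_u, features_v)


# Two users sharing two of four distinct tags, with an assumed link score of 1.0.
w = edge_weight(1.0, {"music", "film", "travel"}, {"film", "travel", "food"})
```

The resulting weighted, undirected graph could then be passed to any community detection algorithm that accepts edge weights, which is the role the remodeled network plays in the abstract.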

10.
11.
12.
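Blog posts cover multiple topics, lack a clear category, and largely express the author's subjective opinions; structurally, they also carry tags that are distinct from the body text. As a result, ordinary text classification methods perform poorly when applied to them directly. To address this problem, a blog post classification method that fuses structural features with content analysis is proposed. On the content side, two different feature selection methods are applied iteratively to make the feature set more representative, and posts are then classified using both the body text and the title. On the structural side, posts are classified using the tags specific to blog posts, and the three aspects (body text, title, and tags) are then fused. Experimental results show that the improved method effectively raises blog post classification performance.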

13.
Due to the rapid development of Internet technology and electronic business, fraudulent activities have increased. One way to limit the damage they cause is fraud detection. This field needs methods that are both accurate and fast. Therefore, a novel and efficient feature extraction method based on social network analysis, called FEMBSNA, is proposed for fraud detection in banking accounts. To increase accuracy and control runtime, the method first extracts network-level features using social network analysis, and in the next phase combines the extracted features with other, user-level features. To evaluate our feature extraction method, we use the PCK-means method as the base learning method. The results show that using the proposed feature extraction as a pre-processing step in fraud detection improves accuracy remarkably while controlling runtime in comparison with other methods.

14.
Knowledge of the information goal of users is critical in website design, analyzing the efficacy of such designs, and in ensuring effective user-access to desired information. Determining the information goal is complex due to the subjective and latent nature of user information needs. This challenge is further exacerbated in media-rich websites since the semantics of media-based information is context-based and emergent. A critical step in determining information goals lies in the identification of content pages. These are the pages which contain the information the user seeks. We propose a method to automatically determine the content pages by taking into account the organization of the web site, the media-based information content, as well as the influence of a specific user browsing pattern. Given a specific browsing pattern, in our method, putative content pages are identified as the pages corresponding to the local minima of page-content entropy values. For an (unknown) user information goal this intuitively corresponds to modeling the progressive transition of the user from pages with generic information to those with specific information. Experimental investigations on media rich sites demonstrate the effectiveness of the technique and underline its potential in modeling user information needs and actions in a media-rich web.
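The idea of picking putative content pages at local minima of page-content entropy along a browsing path can be sketched as below. This is only an illustration under stated assumptions: the abstract does not say how page-content entropy is computed, so the Shannon entropy over a page's token distribution used here is an assumed stand-in, and only interior pages of the path are considered as candidates.

```python
import math
from collections import Counter


def entropy(tokens):
    """Shannon entropy (in bits) of the token distribution of one page.
    Assumed proxy for the page-content entropy mentioned in the abstract."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def local_minima(values):
    """Indices of interior positions strictly lower than both neighbours."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] < values[i - 1] and values[i] < values[i + 1]
    ]


# Hypothetical entropy values for the pages visited in one browsing session.
path_entropies = [3.0, 2.1, 2.6, 1.2, 1.9]
content_page_indices = local_minima(path_entropies)
```

Here the second and fourth pages of the path are flagged, matching the intuition in the abstract that the user moves from generic (higher-entropy) pages toward specific (lower-entropy) ones.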

15.
16.
Since user-generated content in Web forums is rich but varies in quality, ranging from excellent detailed opinions to simple repetition of previous content or even spam, it is difficult to find high-quality information during post browsing, retrieval, and other Web forum applications. In this paper, we propose a novel machine learning approach named LGPRank to evaluate Web forum posts, in which a genetic programming architecture is used to rank the posts according to the quality of their content. To address the shortcomings of current studies, we take both the semantic-free and the semantic-specific information of a post into account. We propose a set of new features, named Latent Dirichlet Allocation (LDA) semantic features, which are computed in LDA topic space. The proposed features, as well as content surface features and forum-specific features, are used in the learning process. Experiments are conducted on three Web forum datasets in comparison with methods used in prior ranking research. LGPRank outperforms all the other methods in terms of the P@N, NDCG@N, and MAP measures. Furthermore, the experimental results indicate that the proposed LDA semantic features have a positive effect on ranking performance.

17.
User goals are of major importance for an interface agent because they serve as a context to define what the user’s focus of attention is at a given moment. The user’s goals should be detected as soon as possible, after observing few user actions, in order to provide the user with timely assistance. In this article, we describe an approach for modeling and recognizing user goals from observed sequences of user actions by using Variable Order Markov models combined with an exponential moving average (EMA) on the prediction probabilities. The validity of our approach has been tested using data collected from real users in the Unix domain. The results obtained show that an interface agent can achieve nearly 90% average accuracy and over 58% online accuracy in predicting the most probable user goal after each observed action, in time linear in the number of goals being modeled. We also found that the use of an EMA allows faster convergence to the actual user goal.
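The role of the exponential moving average in this abstract can be illustrated with a small sketch. It assumes that some model (in the paper, Variable Order Markov models, which are not reproduced here) emits a probability per candidate goal after each observed action; the goal names, the probabilities, and the smoothing factor `alpha` are hypothetical.

```python
def ema_update(smoothed, current, alpha):
    """One EMA step: s_t = alpha * p_t + (1 - alpha) * s_{t-1}, per goal."""
    goals = smoothed.keys() | current.keys()
    return {
        g: alpha * current.get(g, 0.0) + (1 - alpha) * smoothed.get(g, 0.0)
        for g in goals
    }


def track_goal(prob_stream, alpha=0.5):
    """Return the most probable goal after each action, using EMA-smoothed
    probabilities instead of the raw per-action probabilities."""
    smoothed = {}
    predictions = []
    for probs in prob_stream:
        # The first observation initialises the average.
        smoothed = dict(probs) if not smoothed else ema_update(smoothed, probs, alpha)
        predictions.append(max(smoothed, key=smoothed.get))
    return predictions


# Hypothetical per-action goal probabilities from an upstream model.
stream = [
    {"edit": 0.6, "compile": 0.4},
    {"edit": 0.3, "compile": 0.7},
    {"edit": 0.2, "compile": 0.8},
]
predicted_goals = track_goal(stream, alpha=0.5)
```

A larger `alpha` makes the prediction react faster to the latest action, while a smaller one damps single noisy observations; with `alpha=0.3` the same stream switches to `compile` one action later.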

18.
Engineers create engineering documents with their own terminologies, and want to search existing engineering documents quickly and accurately during a product development process. Keyword-based search methods have been widely used due to their ease of use, but their search accuracy has often been problematic because of the semantic ambiguity of terminologies in engineering documents and queries. The semantic ambiguity can be alleviated by using a domain ontology. Also, if queries are expanded to incorporate the engineer’s personalized information needs, the accuracy of the search result would be improved. Therefore, we propose a framework to search engineering documents with less semantic ambiguity and more focus on each engineer’s personalized information needs. The framework includes four processes: (1) developing a domain ontology, (2) indexing engineering documents, (3) learning user profiles, and (4) performing personalized query expansion and retrieval. A domain ontology is developed based on product structure information and engineering documents. Using the domain ontology, terminologies in documents are disambiguated and indexed. Also, a user profile is generated from the domain ontology. Through user profile learning, the user’s interests are captured from the relevant documents. During the personalized query expansion process, the learned user profile is used to reflect the user’s interests. Simultaneously, the user’s searching intent, which is implicitly inferred from the user’s task context, is also considered. To retrieve relevant documents, an expanded query in which both the user’s interests and intents are reflected is then matched against the document collection. The experimental results show that the proposed approach can substantially outperform both the keyword-based approach and the existing query expansion method in retrieving engineering documents. Reflecting a user’s information needs precisely was identified as the most important factor underlying this notable improvement.

19.
Online social networks have become immensely popular in recent years and have become the major sources for tracking the reverberation of events and news throughout the world. However, the diversity and popularity of online social networks attract malicious users to inject new forms of spam. Spamming is a malicious activity where a fake user spreads unsolicited messages in the form of bulk message, fraudulent review, malware/virus, hate speech, profanity, or advertising for marketing scam. In addition, it is found that spammers usually form a connected community of spam accounts and use them to spread spam to a large set of legitimate users. Consequently, it is highly desirable to detect such spammer communities existing in social networks. Even though a significant amount of work has been done in the field of detecting spam messages and accounts, not much research has been done in detecting spammer communities and hidden spam accounts. In this work, an unsupervised approach called SpamCom is proposed for detecting spammer communities in Twitter. We model the Twitter network as a multilayer social network and exploit the existence of overlapping community-based features of users represented in the form of Hypergraphs to identify spammers based on their structural behavior and URL characteristics. The use of community-based features, graph and URL characteristics of user accounts, and content similarity among users make our technique very robust and efficient.

20.
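A personalized recommendation algorithm for web forums based on collaborative filtering   Total citations: 1, self-citations: 0, citations by others: 1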
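A personalized recommendation algorithm for web forums based on collaborative filtering is proposed. From users' posting, reply, and reading records, a weighted method is used to compute the user-post rating matrix and obtain the set of neighboring users. The target user's predicted ratings for posts are then computed from the neighbors' post ratings, and the posts with the highest predicted ratings are recommended. Experimental results show that the algorithm delivers high recommendation quality.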
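The pipeline in this abstract (weighted implicit ratings, neighbor selection, predicted ratings, top recommendations) can be sketched roughly as follows. The abstract gives no formulas, so the action weights, the cosine similarity used to find neighbors, and the similarity-weighted average used for prediction are all assumptions for illustration, not the published algorithm.

```python
import math

# Assumed action weights: posting counts more than replying, replying more than reading.
WEIGHTS = {"post": 3.0, "reply": 2.0, "read": 1.0}


def rating_matrix(log):
    """Build user -> {thread: rating} from (user, thread, action) records."""
    ratings = {}
    for user, thread, action in log:
        row = ratings.setdefault(user, {})
        row[thread] = row.get(thread, 0.0) + WEIGHTS[action]
    return ratings


def cosine(r1, r2):
    """Cosine similarity between two sparse rating rows."""
    dot = sum(r1[t] * r2[t] for t in r1.keys() & r2.keys())
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def recommend(ratings, target, k=2):
    """Rank threads unseen by `target` using its k most similar neighbours."""
    neighbours = sorted(
        ((cosine(ratings[target], row), user)
         for user, row in ratings.items() if user != target),
        reverse=True,
    )[:k]
    weighted, norm = {}, {}
    for sim, user in neighbours:
        if sim <= 0:
            continue
        for thread, value in ratings[user].items():
            if thread in ratings[target]:
                continue
            weighted[thread] = weighted.get(thread, 0.0) + sim * value
            norm[thread] = norm.get(thread, 0.0) + sim
    predicted = {t: weighted[t] / norm[t] for t in weighted}
    return sorted(predicted, key=predicted.get, reverse=True)


# Hypothetical forum activity log.
log = [
    ("alice", "t1", "post"), ("alice", "t2", "read"),
    ("bob", "t1", "reply"), ("bob", "t2", "read"), ("bob", "t3", "post"),
    ("carol", "t2", "read"), ("carol", "t4", "read"),
]
ratings = rating_matrix(log)
recommendations = recommend(ratings, "alice")
```

For `alice`, thread `t3` ranks above `t4` because it comes with a higher rating from her more similar neighbor, `bob`.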
