Similar Documents
20 similar documents retrieved.
1.
We investigate information cascades in the context of viral marketing applications. Recent research has identified that communities in social networks may hinder cascades. To overcome this problem, we propose a novel method for injecting social links in a social network, aiming at boosting the spread of information cascades. Unlike the proposed approach, existing link prediction methods do not consider the optimization of information cascades as an explicit objective. In our proposed method, the injected links are being predicted in a collaborative-filtering fashion, based on factorizing the adjacency matrix that represents the structure of the social network. Our method controls the number of injected links to avoid an “aggressive” injection scheme that may compromise the experience of users. We evaluate the performance of the proposed method by examining real data sets from social networks and several additional factors. Our results indicate that the proposed scheme can boost information cascades in social networks and can operate as a “people recommendations” strategy complementary to currently applied methods that are based on the number of common neighbors (e.g., “friend of friend”) or on the similarity of user profiles.  相似文献   
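Since the abstract describes predicting injectable links by factorizing the adjacency matrix in a collaborative-filtering fashion, under a cap on the number of injected links, here is a minimal sketch of that idea; the truncated-SVD factorization and the per-user budget are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def inject_links(adj, rank=16, budget_per_user=2):
    """Score absent edges by a low-rank reconstruction of the adjacency
    matrix and return up to `budget_per_user` new links per user."""
    # Truncated SVD of the (symmetric) adjacency matrix.
    u, s, vt = np.linalg.svd(adj, full_matrices=False)
    recon = (u[:, :rank] * s[:rank]) @ vt[:rank, :]

    injected = []
    n = adj.shape[0]
    for i in range(n):
        scores = recon[i].copy()
        scores[adj[i] > 0] = -np.inf        # ignore existing links
        scores[i] = -np.inf                 # ignore self-loops
        top = np.argsort(scores)[::-1][:budget_per_user]
        injected.extend((i, j) for j in top if np.isfinite(scores[j]))
    return injected

# Toy 5-user network: two loosely connected communities.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
print(inject_links(A, rank=2, budget_per_user=1))
```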

2.
Measuring influence in social networks has received much attention in the data mining community. Influence maximization refers to the process of finding influential users who maximize the spread of information or product adoption. In real settings, the influence of a user in a social network can be modeled by the set of actions (e.g., shares, retweets, comments) performed by other users of the network on his or her publications. To the best of our knowledge, all models proposed in the literature treat these actions equally. However, these actions clearly do not reflect the same degree of influence; a share of a publication, for instance, indicates more influence than other reactions to the same publication. This suggests that each action has its own level (or importance) of influence. In this paper, we propose a model, called the Social Action-Based Influence Maximization Model (SAIM), for influence maximization in social networks. In SAIM, actions are not considered equally when measuring the influence power of an individual, and the model consists of two main steps. In the first step, we compute the influence power of each individual in the social network; this influence power is computed from user actions using PageRank. At the end of this step, we obtain a weighted social network in which each node is labeled with its influence power. In the second step of SAIM, we compute an optimal set of influential nodes using a new concept called the influence-BFS tree. Experiments conducted on large-scale real-world and synthetic social networks reveal the good performance of our model SAIM, which computes, within acceptable time scales, a minimal set of influential nodes that allows maximal spread of information.
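A rough sketch of SAIM's first step (influence power computed from user actions via PageRank), using hand-rolled power iteration; the action types, their weights, and the damping factor are illustrative assumptions rather than the paper's values.

```python
import numpy as np

# Hypothetical action log: (actor, author, action) means `actor` reacted
# to a publication of `author`; the weights per action type are assumed.
ACTION_WEIGHT = {"share": 3.0, "comment": 2.0, "like": 1.0}
actions = [(1, 0, "share"), (2, 0, "comment"), (3, 0, "like"),
           (0, 1, "like"), (3, 2, "share"), (4, 2, "comment")]

def influence_power(actions, n_users, damping=0.85, iters=100):
    # Edge actor -> author, weighted by the action type: influence flows
    # toward the users whose publications trigger reactions.
    W = np.zeros((n_users, n_users))
    for actor, author, act in actions:
        W[actor, author] += ACTION_WEIGHT[act]
    # Row-normalise; users with no outgoing actions spread uniformly.
    rowsum = W.sum(axis=1, keepdims=True)
    P = np.where(rowsum > 0, W / np.maximum(rowsum, 1e-12), 1.0 / n_users)
    r = np.full(n_users, 1.0 / n_users)
    for _ in range(iters):
        r = (1 - damping) / n_users + damping * (P.T @ r)
    return r

print(influence_power(actions, n_users=5))
```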

3.

Recently, sequence anomaly detection has been widely used in many fields. Sequence data in these fields are usually multi-dimensional over the data stream. It is a challenge to design an anomaly detection method for a multi-dimensional sequence over the data stream to satisfy the requirements of accuracy and high speed. It is because: (1) Redundant dimensions in sequence data and large state space lead to a poor ability for sequence modeling; (2) Anomaly detection cannot adapt to the high-speed nature of the data stream, especially when concept drift occurs, and it will reduce the detection rate. On one hand, most existing methods of sequence anomaly detection focus on the single-dimension sequence. On the other hand, some studies concerning multi-dimensional sequence concentrate mainly on the static database rather than the data stream. To improve the performance of anomaly detection for a multi-dimensional sequence over the data stream, we propose a novel unsupervised fast and accurate anomaly detection (FAAD) method which includes three algorithms. First, a method called “information calculation and minimum spanning tree cluster” is adopted to reduce redundant dimensions. Second, to speed up model construction and ensure the detection rate for the sequence over the data stream, we propose a method called “random sampling and subsequence partitioning based on the index probabilistic suffix tree.” Last, the method called “anomaly buffer based on model dynamic adjustment” dramatically reduces the effects of concept drift in the data stream. FAAD is implemented on the streaming platform Storm to detect multi-dimensional log audit data. Compared with the existing anomaly detection methods, FAAD has a good performance in detection rate and speed without being affected by concept drift.


4.
Automatic database design is an important problem in database research. In this paper we propose some new ideas and an approach, called the “logic approach”, to implement automatic database design. Given a relational scheme and a set of functional dependencies for the relation, we can obtain all of the functional dependencies and the keys of the relation, and determine the normal form that the relation satisfies.
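The attribute-closure computation that underlies this kind of automatic design is standard; below is a small sketch, assuming functional dependencies are given as (left-hand side, right-hand side) pairs, of deriving closures and testing whether an attribute set is a key.

```python
def closure(attrs, fds):
    """Closure of `attrs` under functional dependencies (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_key(attrs, scheme, fds):
    """`attrs` is a (super)key iff its closure covers the whole scheme."""
    return closure(attrs, fds) >= set(scheme)

# Example: R(A, B, C, D) with A -> B and B -> CD.
scheme = "ABCD"
fds = [("A", "B"), ("B", "CD")]
print(closure("A", fds))         # {'A', 'B', 'C', 'D'}
print(is_key("A", scheme, fds))  # True
```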

5.
Computing the posterior probability distribution for a set of query variables by search is an important inference task with a Bayesian network. In real applications, it is also necessary to make inferences when the evidence is not contained in the training data. In this paper, we augment Bayesian network inference with a learning function and extend the classical “search”-based inference to “search + learning”-based inference. Based on the support vector machine, we use a class of hyperplanes to construct the hypothesis space. We then use the method of solving an optimal hyperplane to find a maximum-likelihood hypothesis for the values not contained in the training data. Further, we give a convergent Gibbs sampling algorithm for approximate probabilistic inference in the presence of maximum-likelihood parameters. Preliminary experiments show the feasibility of the proposed methods.

6.
For reconstructing sparse volumes of 3D objects from projection images taken from different viewing directions, several volumetric reconstruction techniques are available. Most popular volume reconstruction methods are algebraic algorithms (e.g. the multiplicative algebraic reconstruction technique, MART). These methods which belong to voxel-oriented class allow volume to be reconstructed by computing each voxel intensity. A new class of tomographic reconstruction methods, called “object-oriented” approach, has recently emerged and was used in the Tomographic Particle Image Velocimetry technique (Tomo-PIV). In this paper, we propose an object-oriented approach, called Iterative Object Detection—Object Volume Reconstruction based on Marked Point Process (IOD-OVRMPP), to reconstruct the volume of 3D objects from projection images of 2D objects. Our approach allows the problem to be solved in a parsimonious way by minimizing an energy function based on a least squares criterion. Each object belonging to 2D or 3D space is identified by its continuous position and a set of features (marks). In order to optimize the population of objects, we use a simulated annealing algorithm which provides a “Maximum A Posteriori” estimation. To test our approach, we apply it to the field of Tomo-PIV where the volume reconstruction process is one of the most important steps in the analysis of volumetric flow. Finally, using synthetic data, we show that the proposed approach is able to reconstruct densely seeded flows.  相似文献   
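MART, cited above as the most popular voxel-oriented algebraic method, updates each voxel multiplicatively from the ratio between a measured projection and the current reprojection. A bare-bones sketch follows, assuming a dense weight matrix W whose rows are line-of-sight integrals and a relaxation parameter mu; real Tomo-PIV implementations use sparse weights and careful initialization.

```python
import numpy as np

def mart(W, p, n_iters=50, mu=1.0, eps=1e-12):
    """Multiplicative ART: v_j <- v_j * (p_i / (W_i . v)) ** (mu * w_ij)."""
    n_rays, n_voxels = W.shape
    v = np.ones(n_voxels)                 # strictly positive start
    for _ in range(n_iters):
        for i in range(n_rays):
            proj = W[i] @ v               # current reprojection of ray i
            if proj > eps and p[i] > eps:
                v *= (p[i] / proj) ** (mu * W[i])
    return v

# Tiny example: 3 "rays" through 4 voxels, simulated volume [1, 0, 2, 0].
W = np.array([[1., 1., 0., 0.],
              [0., 0., 1., 1.],
              [1., 0., 1., 0.]])
true_v = np.array([1., 0., 2., 0.])
p = W @ true_v                            # synthetic projections
print(np.round(mart(W, p, n_iters=200), 3))
```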

7.
In this paper, we propose a method to predict the presence or absence of correct classification results in classification problems with many classes and the output of the classifier is provided in the form of a ranking list. This problem differs from the “traditional” classification tasks encountered in pattern recognition. While the original problem of forming a ranking of the most likely classes can be solved by running several classification methods, the analysis presented here is moved one step further. The main objective is to analyse (classify) the provided rankings (an ordered list of rankings of a fixed length) and decide whether the “true” class is present on this list. With this regard, a two-class classification problem is formulated where the underlying feature space is built through a characterization of the ranking lists. Experimental results obtained for synthetic data as well as real world face identification data are presented.  相似文献   
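One way to read the proposed setup is as a binary meta-classifier over features that characterize the ranking list. A hedged sketch with simulated scores and logistic regression; the feature set and the simulation are illustrative guesses, not the paper's construction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ranking_features(scores, k=5):
    """Characterise a ranking list by its top-k scores and margins."""
    top = np.sort(scores)[::-1][:k]
    margins = -np.diff(top)
    return np.concatenate([top, margins, [top[0] - scores.mean()]])

rng = np.random.default_rng(0)
X, y = [], []
for _ in range(500):
    scores = rng.random(50)               # base-classifier scores for 50 classes
    hit = rng.random() < 0.5
    if hit:                               # simulate a confident, correct ranking
        scores[rng.integers(50)] += 1.0
    X.append(ranking_features(scores))
    y.append(int(hit))

# Two-class problem: is the "true" class present in the ranking list?
meta = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", meta.score(X, y))
```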

8.
The ability to mathematically model the movement of a robot manipulator is a prerequisite to understanding the key factors that influence a manipulator's performance. This paper presents a manipulator model which has been used to simulate and control a real robot arm. A method of describing the arm by its rotational characteristics, a set of equations called the “Inverse Arm”, and an algorithm called the “Forward Arm” are presented. The Forward Arm simulates the movement of an arm and the Inverse Arm provides a means of computing the correct voltages to apply to an arm to achieve a desired movement.
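The Forward Arm / Inverse Arm pair is the classic forward- and inverse-kinematics split. Below is a toy two-link planar version for illustration only; the link lengths are assumptions, and the voltage computation and dynamics of the real arm are omitted.

```python
import math

L1, L2 = 1.0, 0.8   # assumed link lengths

def forward_arm(t1, t2):
    """Joint angles (rad) -> end-effector position (x, y)."""
    x = L1 * math.cos(t1) + L2 * math.cos(t1 + t2)
    y = L1 * math.sin(t1) + L2 * math.sin(t1 + t2)
    return x, y

def inverse_arm(x, y):
    """End-effector position -> one (elbow-down) joint-angle solution."""
    c2 = (x * x + y * y - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    t2 = math.acos(max(-1.0, min(1.0, c2)))
    t1 = math.atan2(y, x) - math.atan2(L2 * math.sin(t2), L1 + L2 * math.cos(t2))
    return t1, t2

t1, t2 = 0.4, 0.9
print(forward_arm(*inverse_arm(*forward_arm(t1, t2))))  # round-trips to the same point
```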

9.
In this paper, we propose a new method to obtain a fuzzy trapezoidal solution, namely a “suitable solution”, for a fully fuzzy linear system (FFLS), based on solving two fully interval linear systems (FILSs) that are the 1-cut and 0-cut of the related FFLS. After some manipulations, the two FILSs are transformed into 2n crisp linear equations, 4n crisp linear inequations and n crisp nonlinear equations. Then, we propose a nonlinear programming problem (NLP) for solving these simultaneous (synchronic) equations and inequations. Moreover, we define two other new solutions, namely the “fuzzy surrounding solution” and the “fuzzy peripheral solution”, for an FFLS. It is shown that the fuzzy surrounding solution lies in the tolerable fuzzy solution set and the fuzzy peripheral solution lies in the controllable fuzzy solution set. Finally, some numerical examples are given to illustrate the applicability of the proposed methods.

10.
Collaborative Filtering (CF) computes recommendations by leveraging a historical data set of users’ ratings for items. CF assumes that the users’ recorded ratings can help in predicting their future ratings. This has been validated extensively, but in some domains the user’s ratings can be influenced by contextual conditions, such as the time, or the goal of the item consumption. This type of contextual information is not exploited by standard CF models. This paper introduces and analyzes a novel technique for context-aware CF called Item Splitting. In this approach items experienced in two alternative contextual conditions are “split” into two items. This means that the ratings of a split item, e.g., a place to visit, are assigned (split) to two new fictitious items representing for instance the place in summer and the same place in winter. This split is performed only if there is statistical evidence that under these two contextual conditions the items ratings are different; for instance, a place may be rated higher in summer than in winter. These two new fictitious items are then used, together with the unaffected items, in the rating prediction algorithm. When the system must predict the rating for that “split” item in a particular contextual condition (e.g., in summer), it will consider the new fictitious item representing the original one in that particular contextual condition, and will predict its rating. We evaluated this approach on real world, and semi-synthetic data sets using matrix factorization, and nearest neighbor CF algorithms. We show that Item Splitting can be beneficial and its performance depends on the method used to determine which items to split. We also show that the benefit of the method is determined by the relevance of the contextual factors that are used to split.  相似文献   
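Item Splitting hinges on a statistical test that decides whether an item's ratings differ between two contextual conditions; only then is the item split into two fictitious items. A compact sketch using a two-sample t-test as the splitting criterion, which is one plausible impurity criterion rather than the paper's only choice.

```python
from collections import defaultdict
from scipy import stats

def item_split(ratings, alpha=0.05):
    """ratings: list of (user, item, context, rating) with context in {0, 1}.
    Returns a new rating list in which significantly context-dependent items
    are replaced by fictitious items (item, context)."""
    by_item = defaultdict(lambda: ([], []))
    for _, item, ctx, r in ratings:
        by_item[item][ctx].append(r)

    split_items = set()
    for item, (r0, r1) in by_item.items():
        if len(r0) >= 2 and len(r1) >= 2:
            _, p = stats.ttest_ind(r0, r1, equal_var=False)
            if p < alpha:                 # statistical evidence of a context effect
                split_items.add(item)

    return [(u, (i, c) if i in split_items else i, r)
            for u, i, c, r in ratings]

# Context 0 = summer, context 1 = winter for a place to visit.
ratings = [("u1", "beach", 0, 5), ("u2", "beach", 0, 5), ("u3", "beach", 0, 4),
           ("u4", "beach", 1, 2), ("u5", "beach", 1, 1), ("u6", "beach", 1, 2)]
print(item_split(ratings))   # "beach" becomes ("beach", 0) and ("beach", 1)
```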

11.
Contextual advertising is an important part of today’s Web. It provides benefits to all parties: Web site owners and an advertising platform share the revenue, advertisers receive new customers, and Web site visitors get useful reference links. The relevance of selected ads for a Web page is essential for the whole system to work. Problems such as homonymy and polysemy, low intersection of keywords and context mismatch can lead to the selection of irrelevant ads. Therefore, a simple keyword matching technique gives a poor accuracy. In this paper, we propose a method for improving the relevance of contextual ads. We propose a novel “Wikipedia matching” technique that uses Wikipedia articles as “reference points” for ads selection. We show how to combine our new method with existing solutions in order to increase the overall performance. An experimental evaluation based on a set of real ads and a set of pages from news Web sites is conducted. Test results show that our proposed method performs better than existing matching strategies and using the Wikipedia matching in combination with existing approaches provides up to 50% lift in the average precision. TREC standard measure bpref-10 also confirms the positive effect of using Wikipedia matching for the effective ads selection.  相似文献   
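The essence of Wikipedia matching is to compare a page and an ad not only directly but also through their similarity to a common set of Wikipedia articles used as reference points. A toy sketch with TF-IDF vectors and made-up article snippets; the tiny corpus and the linear combination of the two scores are illustrative assumptions, not the paper's tuned pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Tiny stand-ins for Wikipedia reference articles.
wiki = ["digital camera photography lens sensor image",
        "mortgage loan interest rate bank credit",
        "football league match goal player season"]
page = "review of a new camera lens and image sensor"
ads = ["low mortgage rates from your bank", "cheap camera lenses online"]

vec = TfidfVectorizer().fit(wiki + [page] + ads)
W = vec.transform(wiki)

def wiki_profile(text):
    """Represent a text by its similarities to the Wikipedia articles."""
    return cosine_similarity(vec.transform([text]), W)

def score(page, ad, alpha=0.5):
    direct = cosine_similarity(vec.transform([page]), vec.transform([ad]))[0, 0]
    via_wiki = cosine_similarity(wiki_profile(page), wiki_profile(ad))[0, 0]
    return alpha * direct + (1 - alpha) * via_wiki

print(sorted(ads, key=lambda ad: score(page, ad), reverse=True))
```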

12.
Copy–move image forgery detection has recently become a very active research topic in blind image forensics. In copy–move image forgery, a region from some image location is copied and pasted to a different location of the same image. Typically, post-processing is applied to better hide the forgery. Using keypoint-based features, such as SIFT features, for detecting copy–move image forgeries has produced promising results. The main idea is detecting duplicated regions in an image by exploiting the similarity between keypoint-based features in these regions. In this paper, we have adopted keypoint-based features for copy–move image forgery detection; however, our emphasis is on accurate and robust localization of duplicated regions. In this context, we are interested in estimating the transformation (e.g., affine) between the copied and pasted regions more accurately as well as extracting these regions as robustly by reducing the number of false positives and negatives. To address these issues, we propose using a more powerful set of keypoint-based features, called MIFT, which shares the properties of SIFT features but also are invariant to mirror reflection transformations. Moreover, we propose refining the affine transformation using an iterative scheme which improves the estimation of the affine transformation parameters by incrementally finding additional keypoint matches. To reduce false positives and negatives when extracting the copied and pasted regions, we propose using “dense” MIFT features, instead of standard pixel correlation, along with hysteresis thresholding and morphological operations. The proposed approach has been evaluated and compared with competitive approaches through a comprehensive set of experiments using a large dataset of real images (i.e., CASIA v2.0). Our results indicate that our method can detect duplicated regions in copy–move image forgery with higher accuracy, especially when the size of the duplicated region is small.  相似文献   

13.
Unimodal analysis of palmprint and palm vein has been investigated for person recognition. One of the problems with unimodality is that the unimodal biometric is less accurate and vulnerable to spoofing, as the data can be imitated or forged. In this paper, we present a multimodal personal identification system using palmprint and palm vein images with their fusion applied at the image level. The palmprint and palm vein images are fused by a new edge-preserving and contrast-enhancing wavelet fusion method in which the modified multiscale edges of the palmprint and palm vein images are combined. We developed a fusion rule that enhances the discriminatory information in the images. Here, a novel palm representation, called “Laplacianpalm” feature, is extracted from the fused images by the locality preserving projections (LPP). Unlike the Eigenpalm approach, the “Laplacianpalm” finds an embedding that preserves local information and yields a palm space that best detects the essential manifold structure. We compare the proposed “Laplacianpalm” approach with the Fisherpalm and Eigenpalm methods on a large data set. Experimental results show that the proposed “Laplacianpalm” approach provides a better representation and achieves lower error rates in palm recognition. Furthermore, the proposed multimodal method outperforms any of its individual modality.  相似文献   

14.
For face recognition with a single training sample per person, traditional algorithms that rely on multiple training samples per person perform poorly. In particular, some methods based on the Fisher linear discriminant criterion cannot perform recognition at all, because the within-class scatter matrix becomes a zero matrix. To address this problem, we analyze it and propose a new sample-expansion method, the generalized sliding-window approach. Window images are collected and samples are expanded with a “large window, small step” mechanism, which not only increases the number of training samples but also preserves and reinforces the within-class and between-class information inherent in the original sample patterns. Weighted two-dimensional linear discriminant analysis (Weighted 2DLDA) is then used to extract features from the obtained window images. Experiments on the standard ORL face database demonstrate the feasibility and effectiveness of the proposed algorithm.
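The “large window, small step” mechanism is essentially dense extraction of overlapping sub-images from the single training face. A minimal sketch of that expansion step; the window ratio and stride are illustrative, and the Weighted 2DLDA feature extraction that follows is omitted.

```python
import numpy as np

def expand_sample(img, win=0.9, step=2):
    """Slide a large window (win * height by win * width) over `img` with a
    small step and return the collected window images as extra samples."""
    h, w = img.shape
    wh, ww = int(h * win), int(w * win)
    windows = []
    for top in range(0, h - wh + 1, step):
        for left in range(0, w - ww + 1, step):
            windows.append(img[top:top + wh, left:left + ww])
    return np.stack(windows)

face = np.random.rand(112, 92)            # ORL images are 112 x 92 pixels
samples = expand_sample(face)
print(samples.shape)                      # (n_windows, 100, 82)
```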

15.
This paper introduces a new interpolation method to estimate the spatial distribution of contaminant concentrations in groundwater. The method is intended to identify areas of risks in early investigation stages when groundwater sampling data is typically scarce and available interpolation methods fail to provide reasonable results. As a consequence, the method does not only incorporate available sampling data, but also makes use of information about the groundwater flow field, in order to “guide” the interpolation with e.g. ordinary kriging or inverse distance method. The guidance includes the augmentation of available data by auxiliary point data and the segmentation of the estimated plume area into a series of sectors. The method is evaluated for several settings and different sampling data sets. Each data set reflects a specific level of field investigations at the model site, an abandoned military base in Potsdam near Berlin, Germany. The results reveal that flow guidance improves the representation of contaminant distribution for all cases examined in this study compared to “unguided” interpolation. These findings are underpinned by the results of the method’s application to real sampling data. The method especially shows its strength when data of only a few sampling points are available.  相似文献   

16.
We humans usually think in words; to represent our opinion about, e.g., the size of an object, it is sufficient to pick one of the few (say, five) words used to describe size (“tiny,” “small,” “medium,” etc.). Indicating which of 5 words we have chosen takes 3 bits. However, in the modern computer representations of uncertainty, real numbers are used to represent this “fuzziness.” A real number takes 10 times more memory to store, and therefore, processing a real number takes 10 times longer than it should. Therefore, for the computers to reach the ability of a human brain, Zadeh proposed to represent and process uncertainty in the computer by storing and processing the very words that humans use, without translating them into real numbers (he called this idea granularity). If we try to define operations with words, we run into the following problem: e.g., if we define “tiny” + “tiny” as “tiny,” then we will have to make a counter-intuitive conclusion that the sum of any number of tiny objects is also tiny. If we define “tiny” + “tiny” as “small,” we may be overestimating the size. To overcome this problem, we suggest to use nondeterministic (probabilistic) operations with words. For example, in the above case, “tiny” + “tiny” is, with some probability, equal to “tiny,” and with some other probability, equal to “small.” We also analyze the advantages and disadvantages of this approach: The main advantage is that we now have granularity and we can thus speed up processing uncertainty. The main disadvantage is that in some cases, when defining symmetric associative operations for the set of words, we must give up either symmetry, or associativity. Luckily, this necessity is not always happening: in some cases, we can define symmetric associative operations. © 1997 John Wiley & Sons, Inc.  相似文献   
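The nondeterministic word arithmetic suggested here can be written directly as a lookup of outcome distributions followed by sampling. A toy sketch; the probabilities are purely illustrative, not values proposed in the paper.

```python
import random

# Assumed outcome distributions for word addition; probabilities are made up.
ADD = {
    ("tiny", "tiny"):    [("tiny", 0.7), ("small", 0.3)],
    ("tiny", "small"):   [("small", 0.8), ("medium", 0.2)],
    ("tiny", "medium"):  [("medium", 1.0)],
    ("small", "small"):  [("small", 0.5), ("medium", 0.5)],
}

def add_words(a, b, rng=random):
    """Probabilistic 'sum' of two size words."""
    key = (a, b) if (a, b) in ADD else (b, a)
    words, probs = zip(*ADD[key])
    return rng.choices(words, weights=probs, k=1)[0]

random.seed(0)
total = "tiny"
for _ in range(10):                 # sum ten "tiny" objects
    total = add_words(total, "tiny")
print(total)                        # usually drifts to "small" or "medium"
```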

17.
Evaluation of segmentation methods is a crucial aspect in image processing, especially in the medical imaging field, where small differences between segmented regions in the anatomy can be of paramount importance. Usually, segmentation evaluation is based on a measure that depends on the number of segmented voxels inside and outside of some reference regions that are called gold standards. Although some other measures have been also used, in this work we propose a set of new similarity measures, based on different features, such as the location and intensity values of the misclassified voxels, and the connectivity and the boundaries of the segmented data. Using the multidimensional information provided by these measures, we propose a new evaluation method whose results are visualized applying a Principal Component Analysis of the data, obtaining a simplified graphical method to compare different segmentation results. We have carried out an intensive study using several classic segmentation methods applied to a set of MRI simulated data of the brain with several noise and RF inhomogeneity levels, and also to real data, showing that the new measures proposed here and the results that we have obtained from the multidimensional evaluation, improve the robustness of the evaluation and provides better understanding about the difference between segmentation methods.  相似文献   
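A minimal sketch of the evaluation pipeline described here: compute several complementary measures per segmentation against the gold standard, then project the per-method measure vectors with PCA for a simplified two-dimensional comparison. The measures below (Dice, Jaccard, false positive and false negative fractions) are common choices, not necessarily the paper's exact set.

```python
import numpy as np
from sklearn.decomposition import PCA

def measures(seg, gold):
    """A few overlap-based similarity measures between binary volumes."""
    inter = np.logical_and(seg, gold).sum()
    dice = 2 * inter / (seg.sum() + gold.sum())
    jaccard = inter / np.logical_or(seg, gold).sum()
    false_pos = np.logical_and(seg, ~gold).sum() / seg.sum()
    false_neg = np.logical_and(~seg, gold).sum() / gold.sum()
    return np.array([dice, jaccard, false_pos, false_neg])

rng = np.random.default_rng(0)
gold = rng.random((32, 32, 32)) > 0.7                 # synthetic gold standard
segmentations = {                                      # three noisy "methods"
    name: np.logical_xor(gold, rng.random(gold.shape) > thresh)
    for name, thresh in [("A", 0.99), ("B", 0.95), ("C", 0.90)]
}

M = np.stack([measures(s, gold) for s in segmentations.values()])
coords = PCA(n_components=2).fit_transform(M)          # simplified 2-D view
for name, (x, y) in zip(segmentations, coords):
    print(name, round(x, 3), round(y, 3))
```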

18.
Finding an informative, structure‐preserving map between two shapes has been a long‐standing problem in geometry processing, involving a variety of solution approaches and applications. However, in many cases, we are given not only two related shapes, but a collection of them, and considering each pairwise map independently does not take full advantage of all existing information. For example, a notorious problem with computing shape maps is the ambiguity introduced by the symmetry problem — for two similar shapes which have reflectional symmetry there exist two maps which are equally favorable, and no intrinsic mapping algorithm can distinguish between them based on these two shapes alone. Another prominent issue with shape mapping algorithms is their relative sensitivity to how “similar” two shapes are — good maps are much easier to obtain when shapes are very similar. Given the context of additional shape maps connecting our collection, we propose to add the constraint of global map consistency, requiring that any composition of maps between two shapes should be independent of the path chosen in the network. This requirement can help us choose among the equally good symmetric alternatives, or help us replace a “bad” pairwise map with the composition of a few “good” maps between shapes that in some sense interpolate the original ones. We show how, given a collection of pairwise shape maps, to define an optimization problem whose output is a set of alternative maps, compositions of those given, which are consistent, and individually at times much better than the original. Our method is general, and can work on any collection of shapes, as long as a seed set of good pairwise maps is provided. We demonstrate the effectiveness of our method for improving maps generated by state‐of‐the‐art mapping methods on various shape databases.  相似文献   
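The global-consistency requirement (any composition of maps between two shapes should be independent of the path chosen) is easy to state when each map is stored as a vertex-index array. A small sketch that composes maps around a cycle and scores how far the composition is from the identity; the optimization that selects among alternative maps is not shown.

```python
import numpy as np

def compose(f, g):
    """Compose vertex maps: (g o f)[i] = g[f[i]]."""
    return g[f]

def cycle_consistency(maps, cycle):
    """Fraction of vertices mapped back to themselves when composing the
    maps along `cycle`, e.g. A -> B -> C -> A."""
    n = len(maps[cycle[0]])
    comp = np.arange(n)
    for edge in cycle:
        comp = compose(comp, maps[edge])
    return float(np.mean(comp == np.arange(n)))

# Three toy "shapes" with 6 vertices each; map_ab[i] is the vertex of B
# matched to vertex i of A, and so on.
rng = np.random.default_rng(1)
map_ab = rng.permutation(6)
map_bc = rng.permutation(6)
map_ca_good = np.argsort(compose(map_ab, map_bc))   # exact inverse of A->B->C
map_ca_bad = rng.permutation(6)

maps = {"AB": map_ab, "BC": map_bc, "CA": map_ca_good}
print(cycle_consistency(maps, ["AB", "BC", "CA"]))   # 1.0 (fully consistent)
maps["CA"] = map_ca_bad
print(cycle_consistency(maps, ["AB", "BC", "CA"]))   # typically below 1.0
```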

19.
《Computers & chemistry》1996,20(1):61-66
Methods of phylogenetic analysis are presented that result in corrections of highly biased data sets, particularly those in which there are great differences between mutation and/or substitution rates from one nucleotide site to another along a DNA sequence. Two approaches are discussed. In the first, pairwise comparisons of a set of sequences are used to determine whether the most recent substitutions take place at the sites that are most polymorphic—that is, where the mutational “hot spots” are located. In the second, a “topiary pruning” method is used to remove selectively the bases in the data set that are most likely to occupy these hot spots and therefore to result in homoplastic substitutions. The two methods combined yield new and substantially older estimates of the time at which the mitochondrial Eve lived, and increase the likelihood that she lived in Africa. In these data, transversions provide a more satisfactory yardstick for phylogenetic analysis than transitions, because there is no detectable tendency for transversions to occur at mutational hot spots.  相似文献   

20.
To help users with automatically reformatting and validating spreadsheets and other datasets, prior work introduced a user-extensible data model called “topes” and a supporting visual programming language. However, no support has existed to date for users to exchange and reuse topes. This functional gap results in wasteful duplication of work as users implement topes that other people have already created. In this paper, a design for a new repository system is presented that supports sharing and finding of topes for reuse. This repository tightly integrates traditional keyword-based search with two additional search methods whose usefulness in repositories of end-user code has gone unexplored to date. The first method is “search-by-match”, where a user specifies examples of data, and the repository retrieves topes that can reformat and validate that data. The second method is collaborative filtering, which has played a vital role in repositories of non-code artifacts. The repository’s search functionality was empirically tested on a prototype repository implementation by simulating queries generated from real user spreadsheets. This experiment reveals that search-by-match and collaborative filtering greatly improve the accuracy of search over the traditional keyword-based approach, to a recall as high as 95%. These results show that search-by-match and collaborative filtering are useful approaches for helping users to publish, find, and reuse visual programs similar to topes.
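“Search-by-match” ranks stored topes by how well they validate the user's example strings. A toy sketch in which each repository entry carries a validation regex; the example patterns and the scoring rule (fraction of examples matched) are assumptions for illustration, not the actual tope repository API.

```python
import re

# Hypothetical repository: tope name -> validation pattern.
REPOSITORY = {
    "us-phone": r"\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}",
    "iso-date": r"\d{4}-\d{2}-\d{2}",
    "currency": r"\$\d+(\.\d{2})?",
}

def search_by_match(examples):
    """Rank topes by the fraction of example strings they fully match."""
    scores = []
    for name, pattern in REPOSITORY.items():
        hits = sum(bool(re.fullmatch(pattern, ex)) for ex in examples)
        scores.append((hits / len(examples), name))
    return sorted(scores, reverse=True)

# A user pastes a few spreadsheet cells as the query.
print(search_by_match(["(412) 268-3564", "412-268-3564", "2009-05-01"]))
```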
