共查询到20条相似文献,搜索用时 15 毫秒
1.
We consider a system where users wish to find similar users. To model similarity, we assume the existence of a set of queries,
and two users are deemed similar if their answers to these queries are (mostly) identical. Technically, each user has a vector
of preferences (answers to queries), and two users are similar if their preference vectors differ in only a few coordinates.
The preferences are unknown to the system initially, and the goal of the algorithm is to classify the users into classes of
roughly the same preferences by asking each user to answer the least possible number of queries. We prove nearly matching
lower and upper bounds on the maximal number of queries required to solve the problem. Specifically, we present an “anytime”
algorithm that asks each user at most one query in each round, while maintaining a partition of the users. The quality of
the partition improves over time: for n users and time T, groups of [(O)\tilde](n/T)\tilde{O}(n/T) users with the same preferences will be separated (with high probability) if they differ in sufficiently many queries. We
present a lower bound that matches the upper bound, up to a constant factor, for nearly all possible distances between user
groups. 相似文献
2.
Huiru Zheng Haiying Wang Glass D.H. 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》2008,38(1):5-16
Protein-protein interactions (PPIs) play crucial roles in virtually every aspect of cellular function within an organism. One important objective of modern biology is the extraction of functional modules, such as protein complexes from global protein interaction networks. This paper describes how seven genomic features and four experimental interaction data sets were combined using a Bayesian-networks-based data integration approach to infer PPI networks in yeast. Greater coverage and higher accuracy were achieved than in previous high-throughput studies of PPI networks in yeast. A Markov clustering algorithm was then used to extract protein complexes from the inferred protein interaction networks. The quality of the computed complexes was evaluated using the hand-curated complexes from the Munich Information Center for Protein Sequences database and gene-ontology-driven semantic similarity. The results indicated that, by integrating multiple genomic information sources, a better clustering result was obtained in terms of both statistical measures and biological relevance. 相似文献
3.
Real-time control of infectious disease outbreaks represents one of the greatest epidemiological challenges currently faced. In this paper we address the problem of identifying contagion patterns responsible for the spread of a disease in a network, which can be applied in real-time to evaluate an ongoing outbreak. We focus on the scenario where limited information, i.e. infection reports which may or may not include the actual source, is available during an ongoing outbreak and we seek the most likely infection tree that spans at least a set of known infected nodes. This problem can be represented using a maximum likelihood constrained Steiner tree model where the objective is to find a spanning tree with an assignment of integer nodes weights. We propose a novel formulation and solution method based on a two-step heuristic which (1) reduces the initial graph using a polynomial time algorithm designed to find feasible infection paths and (2) solves an exact mixed integer linear programming reformulation of the maximum likelihood model on the resulting subgraph. The proposed methodology can be applied to outbreaks which may evolve from multiple sources. Simulated contagion episodes are used to evaluate the performance of our solution method. Our results show that the approach is computationally efficient and is able to reconstruct a significant proportion of the outbreak, even in the context of low levels of information availability. 相似文献
4.
We present a new approach to motion rearrangement that preserves the syntactic structures of an input motion automatically by learning a context‐free grammar from the motion data. For grammatical analysis, we reduce an input motion into a string of terminal symbols by segmenting the motion into a series of subsequences, and then associating a group of similar subsequences with the same symbol. To obtain the most repetitive and precise set of terminals, we search for an optimial segmentation such that a large number of subsequences can be clustered into groups with little error. Once the input motion has been encoded as a string, a grammar induction algorithm is employed to build up a context‐free grammar so that the grammar can reconstruct the original string accurately as well as generate novel strings sharing their syntactic structures with the original string. Given any new strings from the learned grammar, it is straightforward to synthesize motion sequences by replacing each terminal symbol with its associated motion segment, and stitching every motion segment sequentially. We demonstrate the usefulness and flexibility of our approach by learning grammars from a large diversity of human motions, and reproducing their syntactic structures in new motion sequences. 相似文献
5.
Community structure has been recognized as an important statistical feature of networked systems over the past decade. A lot
of work has been done to discover isolated communities from a network, and the focus was on developing of algorithms with
high quality and good performance. However, there is less work done on the discovery of overlapping community structure, even
though it could better capture the nature of network in some real-world applications. For example, people are always provided
with varying characteristics and interests, and are able to join very different communities in their social network. In this
context, we present a novel overlapping community structures detecting algorithm which first finds the seed sets by the spectral
partition and then extends them with a special random walks technique. At every expansion step, the modularity function Q
is chosen to measure the expansion structures. The function has become one of the popular standards in community detecting
and is defined in Newman and Girvan (Phys. Rev. 69:026113, 2004). We also give a theoretic analysis to the whole expansion process and prove that our algorithm gets the best community structures
greedily. Extensive experiments are conducted in real-world networks with various sizes. The results show that overlapping
is important to find the complete community structures and our method outperforms the C-means in quality. 相似文献
6.
Efimov Yu. S. Leonov V. Yu. Odinokikh G. A. Solomatin I. A. 《Journal of Computer and Systems Sciences International》2021,60(1):108-117
Journal of Computer and Systems Sciences International - Iris-based identification is a well-known and widely used technology. However, its use in mobile devices is limited due to the algorithmic... 相似文献
7.
8.
Bi Yu Chen William H. K. Lam Agachai Sumalee Qingquan Li Hu Shao Zhixiang Fang 《Networks and Spatial Economics》2013,13(2):123-148
The aim of this study is to investigate the solution algorithm for solving the problem of determining reliable shortest paths in road networks with stochastic travel times. The availability of reliable shortest paths enables travelers, in the face of travel time uncertainty, to plan their trips with a pre-specified on-time arrival probability. In this study, the reliable shortest path between origin and destination nodes is determined using a multiple-criteria shortest path approach when link travel times follow normal distributions. The dominance conditions involved in such problems are established, thereby reducing the number of generated non-dominated paths during the search processes. Two solution algorithms, multi-criteria label-setting and A* algorithms, are proposed and their complexities analyzed. Computational results using large scale networks are presented. Numerical examples using data from a real-world advanced traveller information system is also given to illustrate the applicability of the solution algorithms in practice. 相似文献
9.
Improvements in biological data acquisition and genomes sequencing now allow to reconstruct entire metabolic networks of many living organisms. The size and complexity of these networks prohibit manual drawing and thereby urge the need of dedicated visualization techniques. An efficient representation of such a network should preserve the topological information of metabolic pathways while respecting biological drawing conventions. These constraints complicate the automatic generation of such visualization as it raises graph drawing issues. In this paper we propose a method to lay out the entire metabolic network while preserving the pathway information as much as possible. That method is flexible as it enables the user to define whether or not node duplication should be performed, to preserve or not the network topology. Our technique combines partitioning, node placement and edge bundling to provide a pseudo‐orthogonal visualization of the metabolic network. To ease pathway information retrieval, we also provide complementary interaction tools that emphasize relevant pathways in the entire metabolic context. 相似文献
10.
11.
社会网络上的社区发现技术是近年来的研究热点之一,通常的社区发现方法往往立足于最优化社区的某个特征,从而达到识别社区的目的,例如最优化节点群的链接密度。在实际需求中,人们往往期待能够识别满足多种特征约束的社区,例如既连接紧密又在满足某种形状约束的条件下,规模越大越好。在某些情况下,单一条件约束的社区发现方法通常无法满足用户对于多准则的要求,而由于网络社区形式的任意性,不同发现方法的结果又难以合并。针对这一需求,给出了一种社区发现的多准则约束模型,以及能够同时满足结构约束、社区规模要求和紧密度要求的典型社区多准则约束规则;提出了一种高效的算法和剪枝策略,并在一个大规模的真实数据集上验证了该算法的有效性和高效率。 相似文献
12.
We define and prove properties of the consensus shape for a protein family, a protein-like structure that provides a compact summary of the significant structural information for a protein family. If all members of the protein family exhibit a geometric relationship between corresponding α-carbons, then that relationship is preserved in the consensus shape. In particular, distances and angles that are consistent across family members are preserved. For the consensus shape, the spacing between successive α-carbons is variable, with small distances in regions where the members of the protein family exhibit significant variation and large distances (up to the standard spacing of about 3.8 Å) in regions where the family members agree. Despite this non-protein-like characteristic, the consensus shape preserves and highlights important structural information. We describe an iterative algorithm for computing the consensus shape and prove that the algorithm converges. As a by-product we produce a multiple structure alignment for the family members. We present the results of experiments in which we build consensus shapes for several known protein families. 相似文献
13.
14.
V. A. Vasyanin 《Cybernetics and Systems Analysis》2014,50(5):759-767
An algorithm for finding all shortest paths in an undirected network is considered. The following two criteria are used: the minimum number of arcs in a path and a minimum path length. The algorithm is analyzed for complexity, and it is empirically shown that, with increasing the network density, its computational efficiency becomes higher than that of the Floyd algorithm adequately modified to find the shortest path with the use of a two-step criterion. 相似文献
15.
We define and prove properties of the consensus shape for a
protein family, a protein-like structure that provides a compact
summary of the significant structural information for a protein
family. If all members of the protein family exhibit a geometric
relationship between corresponding -carbons, then that
relationship is preserved in the consensus shape. In particular,
distances and angles that are consistent across family members are
preserved. For the consensus shape, the spacing between successive
-carbons is variable, with small distances in regions where the
members of the protein family exhibit significant variation and large
distances (up to the standard spacing of about 3.8 Å) in regions where
the family members agree. Despite this non-protein-like
characteristic, the consensus shape preserves and highlights important
structural information. We describe an iterative algorithm for
computing the consensus shape and prove that the algorithm converges.
As a by-product we produce a multiple structure alignment
for the family members.
We present the results of experiments in which we build consensus
shapes for several known protein families. 相似文献
16.
《Evolutionary Computation, IEEE Transactions on》2009,13(4):895-912
17.
《Journal of Parallel and Distributed Computing》2002,62(2):267-283
A “book-embedding” of a graph G comprises embedding the graph's nodes along the spine of a book and embedding the edges on the pages so that the edges embedded on the same page do not intersect. This is also referred to as the page model. The “pagenumber” of a graph is the thickness of the smallest (in number of pages) book into which G can be embedded. The problem has been studied only for some specific kind of graphs. The pagenumber problem is known to be NP-complete, even if the order of nodes on the spine is fixed. Using genetic algorithms, we describe the first algorithm for solving the pagenumber problem that can be applied on arbitrary graphs. Experimental results for several kinds of graphs are obtained. We were particularly interested in graphs that correspond to some well-known interconnection networks (such as hypercubes and meshes). We also introduced and experimented with 2-D pagenumber model for embedding graphs. 相似文献
18.
本文提出了一种新的基于泛函网络的多项式求根模型及学习算法,而泛函网络的参数利用解线性不等式组,可得到所求任意高阶多项式近似根的一般参数表达式。文章还讨论了基于泛函网络的多项式求根学习算法实现的一些技术问题,相对传统方法,能够有效地获得任意多项式对应根的参数表达式。 相似文献
19.
为了指导新药探索,减少耗费过高的实验次数,需要有一个准确、快速的分子相似性判定方法来选择待比较的分子.为此提出一种蛋白质结构相似性比较方法,使用空间球极坐标3个分量划分蛋白质所在区域,得到蛋白质基本组成元素的空间密度特征,进而判定2个蛋白质结构的相似性.实验结果表明,该方法可作为各种结构蛋白质相似性比较的辅助手段,不仅能较好地反映蛋白质分子的空间结构特征,而且还有令人满意的时间复杂性,对大分子的相似性比较及进一步分类将有重要意义. 相似文献
20.
Hsiao Hung-Chang Lin Yung-Chih Liao Hao 《Parallel and Distributed Systems, IEEE Transactions on》2009,20(7):1023-1037
Small-world (SW) networks possess two properties, namely low diameter and high clustering coefficient, which are often desired by large-scale peer-to-peer networks. Prior studies have shown that the construction of an SW network can be based on a d-regular graph, and each node in the graph maintains d local neighbors and a small constant number of long-distance contacts. However, it is commonly understood that it is difficult to construct a short route in an SW network, given source (s) and target (t) nodes, though an SW network guarantees that a short route from s to t exists. Prior work in [1] proposed a navigable SW network for a d-dimensional lattice such that a simple localized routing algorithm can be devised to route a message from s to t using O(log^{2}{cal X}) hops, where {cal X} is the number of nodes in the network. In this paper, we present a novel navigable SW network based on a hierarchical model. Compared to previous efforts, the novelty of our study presents 1) that our network construction based on a hierarchical model is decentralized, 2) that routing a message between any two nodes in our SW network takes logarithmic hopcount in expectation, 3) that our SW network has high cluster coefficient, and 4) that the performance of our proposal is mathematically provable. We support the performance of our proposal in this study through rigorous, thorough performance analysis and extensive simulations. 相似文献