11.
A hash function is a mapping from a key universe U to a range of integers, i.e., h: U → {0, 1, …, m−1}, where m is the size of the range. A perfect hash function for some set S ⊆ U is a hash function that is one-to-one on S, where m ≥ |S|. A minimal perfect hash function for some set S ⊆ U is a perfect hash function with a range of minimum size, i.e., m = |S|. This paper presents a construction for (minimal) perfect hash functions that combines theoretical analysis, practical performance, expected linear construction time and nearly optimal space consumption for the data structure. For n keys and m = n the space consumption ranges from 2.62n + o(n) to 3.3n + o(n) bits, and for m = 1.23n it ranges from 1.95n + o(n) to 2.7n + o(n) bits. This is within a small constant factor of the theoretical lower bounds of 1.44n bits for m = n and 0.89n bits for m = 1.23n. We combine several theoretical results into a practical solution that has turned perfect hashing into a very compact data structure to solve the membership problem when the key set S is static and known in advance. By taking into account the memory hierarchy we can construct (minimal) perfect hash functions for over a billion keys in 46 min using a commodity PC. An open source implementation of the algorithms is available at http://cmph.sf.net under the GNU Lesser General Public License (LGPL).
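As a rough illustration of these definitions (not of the paper's construction), the following Python sketch checks whether a candidate function is perfect, and minimal perfect, on a static key set S. The toy hash `h` built from a lookup table is a placeholder, since a real MPHF avoids storing such a table explicitly.

```python
# Minimal sketch (not the paper's construction): checking the defining
# properties of a perfect and a minimal perfect hash function for a
# static key set S. The hash function `h` below is a placeholder.

def is_perfect(h, S, m):
    """h is perfect on S iff it maps into [0, m), is one-to-one on S, and m >= |S|."""
    values = [h(key) for key in S]
    in_range = all(0 <= v < m for v in values)
    injective = len(set(values)) == len(values)
    return in_range and injective and m >= len(S)

def is_minimal_perfect(h, S, m):
    """A minimal perfect hash function additionally has m == |S|."""
    return is_perfect(h, S, m) and m == len(S)

if __name__ == "__main__":
    S = ["jan", "feb", "mar", "apr"]
    # Toy placeholder: a lookup table built by enumeration, which is
    # exactly what a real MPHF avoids storing explicitly.
    table = {key: i for i, key in enumerate(S)}
    h = lambda key: table[key]
    print(is_minimal_perfect(h, S, m=len(S)))   # True
```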
13.
Waister Silva Martins, Marcos André Gonçalves, Alberto H. F. Laender, Nivio Ziviani. Scientometrics, 2010, 83(1): 133-155
Assessing the quality of scientific conferences is an important and useful service that can be provided by digital libraries and similar systems. This is especially true for fields such as Computer Science and Electrical Engineering, where conference publications are crucial. However, the majority of the existing quality metrics, particularly those relying on bibliographic citations, have been proposed for measuring the quality of journals. In this article we conduct a study of the relative performance of existing journal metrics in assessing the quality of scientific conferences. More importantly, starting from a detailed analysis of the deficiencies of these metrics, we propose a new set of quality metrics especially designed to capture intrinsic and important aspects of conferences, such as longevity, popularity, prestige, and periodicity. To demonstrate the effectiveness of the proposed metrics, we conducted two sets of experiments that contrast their results against a “gold standard” produced by a large group of specialists. Our metrics obtained gains of more than 12% when compared to the most consistent journal quality metric and up to 58% when compared to standard metrics such as Thomson’s Impact Factor.
14.
Distributed denial of service attacks currently represent a serious threat to the appropriate operation of Internet services. To deal with this threat, we propose an overlay network that provides an IP-traceback scheme at the level of autonomous systems. Our proposed autonomous system-level IP-traceback system contrasts with previous works because it does not require a priori knowledge of the network topology and allows single-packet traceback and incremental deployment. Our first contribution is a new extension to the Border Gateway Protocol update-message community attribute that enables information to be passed across autonomous systems that are not necessarily involved in the overlay network. The second contribution is a new sequence-marking process to remove ambiguities in the traceback path. Two different strategies for incremental system deployment are investigated and evaluated. We show that strategic placement of the system on highly connected autonomous systems produces relevant results for IP traceback even if the system operates on only a few autonomous systems. The main conclusion is that the proposed system is suitable for large-scale networks such as the Internet because it provides efficient traceback and allows incremental deployment.
15.
We study the problem of minimizing the expected cost of binary searching for data where the access cost is not fixed and depends on the last accessed element, such as data stored on magnetic or optical disk. We present an optimal algorithm for this problem that finds the optimal search strategy in O(n^3) time, which is the same time complexity as the simpler classical problem with fixed costs. Next, we present two practical linear expected time algorithms, under the assumption that the access cost of an element is independent of its physical position. Both practical algorithms are online, that is, they find the next element to access as the search proceeds. The first one is an approximate algorithm which minimizes the access cost while disregarding the goodness of the problem partitioning. The second one is a heuristic algorithm, whose quality depends on its ability to estimate the final search cost, and therefore it can be tuned by recording statistics of previous runs.

We present an application of our algorithms to text retrieval. When a text collection is large it demands specialized indexing techniques for efficient access. One important type of index is the suffix array, where data access is provided through an indirect binary search on the text stored on magnetic or optical disk. Under this cost model we prove that the optimal algorithm cannot perform better than Ω(1/log n) times the standard binary search. We also prove that the approximate strategy cannot, on average, perform worse than 39% over the optimal one. We confirm the analytical results with simulations, showing improvements between 34% (optimal) and 60% (online) over standard binary search for both magnetic and optical disks.

Received February 13, 1997; revised May 27, 1998.
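The following Python sketch illustrates the general flavour of an O(n^3)-style dynamic program for this setting, where the cost of probing an element depends on the previously probed one. It is an assumption-laden illustration, not the authors' algorithm: the access probabilities `p`, the pairwise cost `seek`, and the initial position `start` are all hypothetical inputs.

```python
# Hedged sketch of a dynamic program for binary searching when the cost of
# probing an element depends on the previously probed element (e.g. disk
# seek distance). Illustrative only; not the paper's exact algorithm.
from functools import lru_cache

def optimal_search_cost(p, seek, start):
    n = len(p)
    prefix = [0.0] * (n + 1)
    for i, pi in enumerate(p):
        prefix[i + 1] = prefix[i] + pi
    weight = lambda i, j: prefix[j + 1] - prefix[i]   # probability mass of keys i..j

    @lru_cache(maxsize=None)
    def cost(i, j, last):
        if i > j:
            return 0.0
        best = float("inf")
        for k in range(i, j + 1):
            # Every query whose target lies in [i, j] pays the move last -> k,
            # then continues in the left or right subinterval with k as the
            # last probed position.
            c = weight(i, j) * seek(last, k) + cost(i, k - 1, k) + cost(k + 1, j, k)
            best = min(best, c)
        return best

    return cost(0, n - 1, start)

if __name__ == "__main__":
    p = [0.1, 0.2, 0.4, 0.2, 0.1]                     # access probabilities
    seek = lambda a, b: 1 + abs(a - b)                # position-dependent access cost
    print(optimal_search_cost(p, seek, start=0))
```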
16.
Klessius Berlt, Edleno Silva de Moura, André Carvalho, Marco Cristo, Nivio Ziviani, Thierson Couto. Information Systems, 2010
In this work we propose a model to represent the web as a directed hypergraph (instead of a graph), where links connect pairs of disjoint sets of pages. The web hypergraph is derived from the web graph by dividing the set of pages into non-overlapping blocks and using the links between pages of distinct blocks to create hyperarcs. A hyperarc connects a block of pages to a single page, in order to provide more reliable information for link analysis. We use the hypergraph model to create the hypergraph versions of the Pagerank and Indegree algorithms, referred to as HyperPagerank and HyperIndegree, respectively. The hypergraph is derived from the web graph by grouping pages according to two different partition criteria: grouping together the pages that belong to the same web host or to the same web domain. We compared the original page-based algorithms with the host-based and domain-based versions of the algorithms, considering a combination of the page reputation, the textual content of the pages and the anchor text. Experimental results using three distinct web collections show that the HyperPagerank and HyperIndegree algorithms may yield better results than the original graph versions of the Pagerank and Indegree algorithms. We also show that the hypergraph versions of the algorithms were slightly less affected by noise links and spamming.
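A minimal sketch of the block-to-page idea follows, assuming pages are grouped by host: inter-block links are collapsed into hyperarcs and each linking block contributes once to a page's HyperIndegree. The function names and grouping criterion are illustrative, not the paper's implementation.

```python
# Hedged sketch: derive hyperarcs from a page-level link graph by grouping
# pages into blocks (here, by host) and connecting a whole block to a single
# page. HyperIndegree of a page = number of distinct blocks linking to it.
from collections import defaultdict
from urllib.parse import urlparse

def build_hyperarcs(links):
    """links: iterable of (source_url, target_url) pairs."""
    host = lambda url: urlparse(url).netloc
    hyperarcs = set()
    for src, dst in links:
        if host(src) != host(dst):                 # only inter-block links form hyperarcs
            hyperarcs.add((host(src), dst))        # block -> single page
    return hyperarcs

def hyper_indegree(hyperarcs):
    indeg = defaultdict(int)
    for _block, page in hyperarcs:
        indeg[page] += 1                           # each linking block counts once
    return dict(indeg)

if __name__ == "__main__":
    links = [
        ("http://a.com/1", "http://b.com/x"),
        ("http://a.com/2", "http://b.com/x"),      # same block: still one hyperarc
        ("http://c.com/1", "http://b.com/x"),
        ("http://b.com/y", "http://b.com/x"),      # intra-block link ignored
    ]
    print(hyper_indegree(build_hyperarcs(links)))  # {'http://b.com/x': 2}
```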
17.
We propose a method for the Distributed Assessment of the Closeness CEntrality Ranking (DACCER) in complex networks. DACCER computes centrality based only on localized information restricted to a limited neighborhood around each node, thus not requiring full knowledge of the network topology. We indicate that the node centrality ranking computed by DACCER is highly correlated with the node ranking based on the traditional closeness centrality, which requires high computational costs and full knowledge of the network topology by the entity responsible for calculating the centrality. This outcome is quite useful given the vast potential applicability of closeness centrality, which is seldom applied to large-scale networks due to its high computational costs. Results indicate that DACCER is simple, yet efficient, in assessing node centrality while allowing a distributed implementation that contributes to its performance. This also contributes to the practical applicability of DACCER to the analysis of large complex networks, as indicated in our experimental evaluation using both synthetically generated networks and real-world network traces of different kinds and scales.
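A minimal sketch of the localized idea follows, assuming the local metric is the sum of degrees within an h-hop neighborhood (a common choice for this kind of scheme); the exact metric, the parameter h, and the data layout are assumptions rather than the authors' reference implementation.

```python
# Hedged sketch of ranking nodes by information restricted to an h-hop
# neighborhood (here, the sum of degrees of all nodes within h hops),
# as a cheap, local proxy for closeness centrality.
from collections import deque

def neighborhood_volume(adj, node, h):
    """Sum of degrees of all nodes at distance <= h from `node` (plain BFS)."""
    seen, frontier = {node}, deque([(node, 0)])
    volume = 0
    while frontier:
        u, d = frontier.popleft()
        volume += len(adj[u])
        if d < h:
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    frontier.append((v, d + 1))
    return volume

if __name__ == "__main__":
    # Small undirected example graph as an adjacency dict.
    adj = {
        0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3],
        3: [1, 2, 4], 4: [3, 5], 5: [4],
    }
    ranking = sorted(adj, key=lambda u: neighborhood_volume(adj, u, h=2), reverse=True)
    print(ranking)   # nodes ordered by the local (h = 2) centrality estimate
```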
18.
A. Ziviani, S. Fdida, J. F. de Rezende, O. C. M. B. Duarte. IEEE Communications Letters, 2005, 9(12): 1043-1045
The main location management proposals in mobile ad hoc networks have, as a common characteristic, two distinct phases: the location query for the position of a destination node and the transmission of a flow toward the destination node. This letter proposes sending the initial packet of a flow to learn the position of its destination, instead of adopting a dedicated query packet. Such an approach especially benefits TCP flows. This TCP-tailored approach can be applied to previous location management proposals with minor changes to their particular features. We show that the proposed TCP-tailored approach reduces the cost of location management for TCP flows with respect to conventional schemes. We also evaluate the benefits that different location services derive from the TCP-tailored approach.
19.
Humberto T. Marques-Neto, Faber H. Z. Xavier, Wender Z. Xavier, Carlos Henrique S. Malab, Artur Ziviani, Lucas M. Silveira, Jussara M. Almeida. Journal of Network and Systems Management, 2018, 26(4): 1079-1100
The analysis of mobile phone data can help carriers improve the way they deal with unusual workloads imposed by large-scale events. This paper analyzes human mobility and the resulting dynamics in the network workload caused by three different types of large-scale events: a major soccer match, a rock concert, and a New Year’s Eve celebration, all of which took place in a large Brazilian city. Our analysis is based on the characterization of records of mobile phone calls made around the time and place of each event. That is, human mobility and network workload are analyzed in terms of the number of mobile phone calls, their inter-arrival and inter-departure times, and their durations. We use heat maps to visually analyze the spatio-temporal dynamics of the movement patterns of the participants of each large-scale event. The results obtained can help improve the understanding of the human mobility caused by large-scale events. Such results could also provide network managers with valuable insights into effective capacity management and planning strategies. We also present PrediTraf, an application built to help cellphone carriers plan their infrastructure for large-scale events.
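As a hedged illustration of this kind of characterization, the sketch below computes per-interval call counts, inter-arrival times, and call durations from a list of call records; the record format (start timestamp and duration in seconds) and the bin size are assumptions, not the paper's data schema.

```python
# Hedged sketch of the characterization described above: per-interval call
# counts, inter-arrival times, and durations from assumed call records.
def characterize_calls(records, bin_seconds=300):
    """records: list of (start_ts, duration_s) tuples, start_ts in epoch seconds."""
    starts = sorted(ts for ts, _ in records)
    inter_arrivals = [b - a for a, b in zip(starts, starts[1:])]
    durations = [d for _, d in records]
    counts = {}
    for ts in starts:
        bucket = ts - ts % bin_seconds            # e.g. 5-minute bins
        counts[bucket] = counts.get(bucket, 0) + 1
    return counts, inter_arrivals, durations

if __name__ == "__main__":
    records = [(1000, 60), (1030, 20), (1310, 45), (1900, 120)]
    counts, gaps, durs = characterize_calls(records)
    print(counts)   # calls per 5-minute bin
    print(gaps)     # inter-arrival times in seconds
```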
20.
Jayme L. Szwarcfiter, Gonzalo Navarro, Ricardo Baeza-Yates, Joísa de S. Oliveira, Walter Cunto, Nívio Ziviani. Theoretical Computer Science, 2003, 290(3): 1799-1814
We describe algorithms for constructing optimal binary search trees, in which the access cost of a key depends on the k preceding keys that were reached on the path to it. This problem has applications to searching on secondary memory and robotics. Two kinds of optimal trees are considered, namely optimal worst-case trees and weighted average-case trees. The time and space complexities of both algorithms are O(n^(k+2)) and O(n^(k+1)), respectively. The algorithms are based on a convenient decomposition and on characterizations of sequences of keys that form paths of special kinds in binary search trees. Finally, using generating functions, we present an exact analysis of the number of steps performed by the algorithms.
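As a hedged point of reference, the sketch below shows only the classic k = 0 special case (access cost independent of the preceding keys), i.e., the standard weighted-average-case optimal binary search tree DP in O(n^3) time; the dependence on the k preceding keys studied in the paper requires the larger O(n^(k+2)) state space and is not reproduced here.

```python
# Hedged sketch of the k = 0 baseline: classic weighted-average-case optimal
# BST dynamic program, O(n^3) time and O(n^2) space. Shown only as the
# special case the paper generalizes.
def optimal_bst_cost(weights):
    """weights[i]: access probability (or frequency) of the i-th key, keys sorted."""
    n = len(weights)
    prefix = [0.0] * (n + 1)
    for i, w in enumerate(weights):
        prefix[i + 1] = prefix[i] + w
    W = lambda i, j: prefix[j + 1] - prefix[i]

    # cost[i][j]: minimum expected cost of a BST over keys i..j (inclusive).
    cost = [[0.0] * n for _ in range(n)]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best = float("inf")
            for r in range(i, j + 1):             # try each key as the subtree root
                left = cost[i][r - 1] if r > i else 0.0
                right = cost[r + 1][j] if r < j else 0.0
                best = min(best, left + right)
            cost[i][j] = best + W(i, j)           # every key in [i, j] pays one level
    return cost[0][n - 1]

if __name__ == "__main__":
    print(optimal_bst_cost([0.15, 0.10, 0.05, 0.10, 0.20]))
```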