Users' interests can be mined from the web cache and put to wide use. In the simple interest model, an interest is specified by a two-tuple (term, weight); association relations are not mined, so interests cannot be associated with one another when expressing a user's interests. Based on an analysis of the WWW cache model, this letter proposes a two-dimensional interest model and gives the related methods for storing it effectively.
We compare two link analysis ranking methods of web pages in a site. The first, called Site Rank, is an adaptation of PageRank to the granularity of a web site and the second, called Popularity Rank, is based on the frequencies of user clicks on the outlinks in a page that are captured by navigation sessions of users through
the web site. We ran experiments on artificially created web sites of different sizes and on two real data sets, employing
the relative entropy to compare the distributions of the two ranking methods. For the real data sets we also employ a nonparametric
measure, called Spearman's footrule, which we use to compare the top-ten web pages ranked by the two methods. Our main result
is that the distributions of the Popularity Rank and Site Rank are surprisingly close to each other, implying that the topology
of a web site is very instrumental in guiding users through the site. Thus, in practice, the Site Rank provides a reasonable
first order approximation of the aggregate behaviour of users within a web site given by the Popularity Rank.
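
The two comparison measures named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the page names and scores are hypothetical, the relative entropy is taken in bits, and the footrule is the plain version over two rankings of the same items (the abstract does not say how the top-ten lists were aligned).

```python
import math

def relative_entropy(p, q):
    """Relative entropy D(p || q) in bits.

    Assumes p and q are distributions over the same pages, in the same
    order, and that q is positive wherever p is positive.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def spearman_footrule(rank_a, rank_b):
    """Sum of absolute position differences between two rankings.

    Assumes both lists contain exactly the same items.
    """
    pos_b = {page: i for i, page in enumerate(rank_b)}
    return sum(abs(i - pos_b[page]) for i, page in enumerate(rank_a))

def ranking(pages, scores):
    """Pages sorted by descending score."""
    return [p for p, _ in sorted(zip(pages, scores), key=lambda t: -t[1])]

# Hypothetical four-page site with made-up scores.
pages = ["home", "news", "about", "contact"]
site_rank = [0.4, 0.3, 0.2, 0.1]
popularity_rank = [0.3, 0.4, 0.2, 0.1]

divergence = relative_entropy(popularity_rank, site_rank)
footrule = spearman_footrule(ranking(pages, site_rank),
                             ranking(pages, popularity_rank))
```

A small relative entropy together with a small footrule distance would correspond to the "surprisingly close" distributions the abstract reports.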
Because of the language barrier, non-English-speaking users are unable to retrieve the most up-to-date medical information from authoritative U.S. medical websites such as PubMed and MedlinePlus. Currently, however, there is no cross-language medical information retrieval (CLMIR) system that helps Chinese-speaking consumers cross the language barrier to find useful English medical information. A few CLMIR systems use MeSH (Medical Subject Headings) to help overcome the language barrier. Unfortunately, a traditional Chinese version of MeSH is currently unavailable. In this paper, we employ a semi-automatic term translation method to construct a Chinese–English MeSH by exploiting abundant multilingual Web resources, including Web anchor texts and search-result pages. Using this method, we have developed a Chinese–English MeSH Compilation System to assist knowledge engineers in compiling a Chinese–English medical thesaurus with more than 19,000 entries. Furthermore, this thesaurus has been used to develop a prototype system for cross-language medical information retrieval, MMODE, which can help consumers retrieve top-quality English medical information using Chinese terms.
This paper presents an algorithm that permits the search for dependencies among sets of data (univariate or multivariate time-series, or cross-sectional observations). The procedure is modeled after genetic theories and Darwinian concepts, such as natural selection and survival of the fittest. It permits the discovery of equations of the data-generating process in symbolic form. The genetic algorithm that is described here uses parts of equations as building blocks to breed ever better formulas. Apart from furnishing a deeper understanding of the dynamics of a process, the method also permits global predictions and forecasts. The algorithm is successfully tested with artificial and with economic time-series and also with cross-sectional data on the performance and salaries of NBA players during the 94–95 season.
Microalloyed high-strength low-alloy (HSLA) steels contain additions of Nb, V, or Ti, singly or in combination, in amounts of 0.01 to 0.1 weight percent to improve mechanical properties, which depend strongly on the thermomechanical interactions that take place during rolling mill processes. The recrystallization of hot-twisted austenite has been investigated in cylindrical specimens (⌀ 6×50 mm) machined from hot-rolled plates of 0.052 wt% niobium microalloyed steel. Continuous and interrupted torsion tests were carried out in the temperature range 1123 K to 1173 K after a solution treatment of 1.5 minutes at 1423 K, and the torque-twist data were analysed. Various methods for obtaining results from torsion tests are discussed. The effect of precipitation kinetics was assessed through the ratio tp/tp(red), where tp is the experimentally measured time to peak stress and tp(red) is the newly defined reduced time. The softening ratio X and the time t0.05R for the start of static recrystallization were established.
The correlation between precipitation and recrystallization is presented in graphs for selected conditions (austenitization temperature, carbon and niobium content, and strain rate). If the temperature falls below 850°C, the restoration processes are strongly suppressed; both are limited by diffusion and Nb(CN) precipitation, which are extended dynamically in the strain-rate range 10⁻² to 1 s⁻¹.
In the present paper, an attempt is made to derive the PRTT diagram and to define all mathematical equations describing the recrystallization times t0.05R, t0.5R, and t0.95R, and the time t0.05P for the start of precipitation. In real metal forming processes, such as the hot rolling of plates or strips, knowledge of these parameters and results is extremely important for obtaining the correct microstructure and sheet quality.
We investigate the problem of determining exact bounds on the frequencies of conjunctions based on frequent sets. Our scenario is an important special case of some general probabilistic logic problems that are known to be intractable. We show that, despite these restrictions, our problems are also intractable: checking whether the maximal consistent frequency of a query is larger than a given threshold is NP-complete, and evaluating the Maximum Entropy estimate of a query is PP-hard. We also prove that checking consistency is NP-complete.
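
As a small illustration of what "exact bounds" means here, consider the elementary case where only the frequencies of two single items are known. The tight bounds on their conjunction are then the classical Fréchet bounds. The sketch below covers only this two-item case under that assumption; the function name is hypothetical, and the general problem with many frequent sets is the one the abstract shows to be intractable.

```python
from fractions import Fraction

def conjunction_bounds(freq_a, freq_b):
    """Tight bounds on the frequency of (A and B) given only freq(A) and freq(B).

    Frequencies are fractions of transactions in [0, 1].
    Lower bound: the two sets must overlap once their frequencies sum past 1.
    Upper bound: the conjunction cannot be more frequent than either item.
    """
    lower = max(Fraction(0), freq_a + freq_b - 1)
    upper = min(freq_a, freq_b)
    return lower, upper

# Hypothetical frequencies: A occurs in 70% of transactions, B in 60%.
low, high = conjunction_bounds(Fraction(7, 10), Fraction(6, 10))
```

With these inputs the conjunction must occur in at least 30% and at most 60% of transactions; exact rational arithmetic is used so the bounds are not blurred by floating-point rounding.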
Data clustering is a popular approach for automatically finding classes, concepts, or groups of patterns. In practice, this
discovery process should avoid redundancies with existing knowledge about class structures or groupings, and reveal novel,
previously unknown aspects of the data. In order to deal with this problem, we present an extension of the information bottleneck
framework, called coordinated conditional information bottleneck, which takes negative relevance information into account by maximizing a conditional mutual information score subject to
constraints. Algorithmically, one can apply an alternating optimization scheme that can be used in conjunction with different
types of numeric and non-numeric attributes. We discuss extensions of the technique to the tasks of semi-supervised classification
and enumeration of successive non-redundant clusterings. We present experimental results for applications in text mining and
computer vision.