exact relevance measures of the objects are not significant. We flatten the multidimensionality of the feature space into a 2D relevance map, capturing the inter-relations among the features. The prototype extracts information from the World Wide Web via query engines, automatically categorizes and clusters that information, and allows the user to visualize it.

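In query expansion methods, if the weights of candidate keywords are computed from the context of keywords in the query results, and the highest-weighted words are taken as expansion terms, then the candidate keywords all come from the context of keywords in the documents, and the approach suffers from topic drift. To address this problem, a method is proposed that filters the initial query results, keeping only the search results whose context is similar to that of the source document, and uses them to help select query expansion terms. Experimental results show that the method obtains more suitable query expansion terms.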
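
The filtering step described in the query expansion abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the function names, the bag-of-words cosine similarity, the `min_sim` threshold, and the plain term-frequency weighting are all assumptions made for the example (the abstract weights candidates by keyword context, which is simplified away here).

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption; a real system would do more).
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def expansion_terms(query, source_doc, results, min_sim=0.2, top_k=3):
    """Pick expansion terms only from results similar to the source document.

    Hypothetical helper: results whose similarity to the source document is
    below min_sim are dropped before candidate terms are counted, which is
    the step intended to limit topic drift.
    """
    source_vec = Counter(tokenize(source_doc))
    query_terms = set(tokenize(query))
    kept = [
        r for r in results
        if cosine(Counter(tokenize(r)), source_vec) >= min_sim
    ]
    counts = Counter()
    for r in kept:
        counts.update(t for t in tokenize(r) if t not in query_terms)
    return [term for term, _ in counts.most_common(top_k)]


if __name__ == "__main__":
    results = [
        "jaguar car engine review",
        "jaguar cat jungle habitat",
        "jaguar engine speed test",
    ]
    print(expansion_terms("jaguar", "jaguar car engine speed", results,
                          min_sim=0.5, top_k=1))
```

With the sample data, the off-topic result about the animal falls below the threshold, so its terms never become expansion candidates.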

We investigate the possibility of using Semantic Web data to improve hypertext Web search. In particular, we use relevance feedback to create a ‘virtuous cycle’ between data gathered from the Semantic Web of Linked Data and web-pages gathered from the hypertext Web. Previous approaches have generally considered the searching over the Semantic Web and hypertext Web to be entirely disparate, indexing, and searching over different domains. While relevance feedback has traditionally improved information retrieval performance, relevance feedback is normally used to improve rankings over a single data-set. Our novel approach is to use relevance feedback from hypertext Web results to improve Semantic Web search, and results from the Semantic Web to improve the retrieval of hypertext Web data. In both cases, an evaluation is performed based on certain kinds of informational queries (abstract concepts, people, and places) selected from a real-life query log and checked by human judges. We evaluate our work over a wide range of algorithms and options, and show it improves baseline performance on these queries for deployed systems as well, such as the Semantic Web Search engine FALCON-S and Yahoo! Web search. We further show that the use of Semantic Web inference seems to hurt performance, while the pseudo-relevance feedback increases performance in both cases, although not as much as actual relevance feedback. Lastly, our evaluation is the first rigorous ‘Cranfield’ evaluation of Semantic Web search.

Semplore: A scalable IR approach to search the Web of Data
The Web of Data keeps growing rapidly. However, the full exploitation of this large amount of structured data faces numerous challenges like usability, scalability, imprecise information needs and data change. We present Semplore, an IR-based system that aims at addressing these issues. Semplore supports intuitive faceted search and complex queries both on text and structured data. It combines imprecise keyword search and precise structured query in a unified ranking scheme. Scalable query processing is supported by leveraging inverted indexes traditionally used in IR systems. This is combined with a novel block-based index structure to support efficient index update when data changes. The experimental results show that Semplore is an efficient and effective system for searching the Web of Data and can be used as a basic infrastructure for Web-scale Semantic Web search engines.
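
As a rough illustration of how one inverted index can serve both imprecise keyword search and precise structured constraints, consider the toy sketch below. It is not Semplore's design: the `TinyIndex` class, the `text:` and `property:value` key scheme, and the match-count ranking are assumptions invented for this example, and the block-based update structure mentioned in the abstract is not modeled.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: text tokens and structured facts share one
    posting-list table (hypothetical design, for illustration only)."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text, facts):
        # Index each text token under a "text:" prefix.
        for token in text.lower().split():
            self.postings["text:" + token].add(doc_id)
        # Index each structured fact as a "property:value" term.
        # Note: a property literally named "text" would collide with
        # the token keys; a real system would need separate namespaces.
        for prop, value in facts.items():
            self.postings[f"{prop}:{value}"].add(doc_id)

    def search(self, keywords, constraints):
        # Structured constraints act as hard filters (set intersection).
        candidates = None
        for prop, value in constraints.items():
            ids = self.postings.get(f"{prop}:{value}", set())
            candidates = ids if candidates is None else candidates & ids
        # Keywords are soft: rank by how many of them each document matches.
        scores = defaultdict(int)
        for kw in keywords:
            for doc_id in self.postings.get("text:" + kw.lower(), set()):
                if candidates is None or doc_id in candidates:
                    scores[doc_id] += 1
        # Highest score first; ties broken by document id for determinism.
        return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    idx = TinyIndex()
    idx.add("d1", "Berlin capital of Germany", {"type": "City"})
    idx.add("d2", "Germany federal republic", {"type": "Country"})
    idx.add("d3", "Munich city in Germany Bavaria", {"type": "City"})
    print(idx.search(["germany", "capital"], {"type": "City"}))
```

The sample query restricts results to documents typed as `City` and then ranks the survivors by keyword overlap.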

The central argument of this paper is that the design, implementation and use of technologies that underpin general semantic search have implications for what we know and the way in which knowledge is understood. Semantic search is an assemblage of technologies that most Internet users would use regularly without necessarily realising. Users of search engines implementing semantic search can obtain answers to questions rather than just retrieve pages that include their search query. This paper critically examines the design of the Semantic Web, upon which semantic search is based. It demonstrates that implicit in the design of the Semantic Web are particular assumptions about the nature of classification and the nature of knowledge. The Semantic Web was intended for interoperability within specific domains. It is here argued that the extension to general semantic search, for use by the general public, has implications for what type of knowledge is visible and what counts as legitimate knowledge. The provision of a definitive answer to a query, via the reduction of discursive knowledge into machine-processable data, provides the illusion of objectivity and authority in a way that is increasingly impenetrable to critical scrutiny.

A central task in the development of context-aware applications is the modeling and management of complex context information. In this paper, we present the NexusEditor, which can ease this task by providing a graphical user interface to design schemas for spatial and technical context models, interactively create queries, send them to a server and visualize the results. One main contribution is to show how schema awareness can improve such a tool: The NexusEditor dynamically parses the underlying data model and provides additional syntactic checks, semantic checks, and short-cuts based on the schema information. Furthermore, the tool helps to design new schema definitions based on the existing ones, which is crucial for an iterative and user-centric development of context-aware applications. Finally, it provides interfaces to existing information spaces and visualization tools for spatial data like GoogleEarth.

A naturalistic online information search exposes individuals to multiple sites and conflicting perspectives. In this study, we evaluated how the holistic stance of a web search toward a product influences purchasing decisions. We recruited 109 participants who completed an initial product choice task regarding bottled water, a brief Internet search, and then a second post-search product choice task. Internet searches were analyzed to identify query terms, site visits, and stance. Results show that query terms influenced the types of sites obtained in a search, which in turn shaped the overall search stance. Participants were more likely to buy bottled water when they visited websites that emphasized environmental, economic, or health benefits for bottled water (i.e., positive stance). Participants who were asked to focus on environmental issues were less likely to buy bottled water unless packaged in recycled plastic.

Current implementations of gazetteers, geographic directories that associate place names to geographic coordinates, cannot use semantics to answer complex queries (most gazetteers are just thesauri of place names), use domain ontologies for place name disambiguation, make their data sets available in the Semantic Web or support the use of Volunteered Geographic Information (VGI). A new generation of gazetteers has to tackle these problems. In this paper, we present a new architecture for gazetteers that uses VGI and Semantic Web tools, such as ontologies and Linked Open Data to overcome these limitations. We also present a gazetteer, the Semantic Web Interactive Gazetteer (SWI), implemented using this architecture, and show that it can be used to add absent geographic coordinates to biodiversity records. In our tests, we use this gazetteer to correct geographic data from a big sample (around 142,000 occurrence records of Amazonian specimens) from SpeciesLink, a big repository of biodiversity collection records from Brazil. The tests showed that the SWI Gazetteer was able to add geographic coordinates to around 30,000 records, increasing the records with coordinates from 30.29% to 57.5% of the total number of records in the sample (representing an increase of 90%).

Search engines play a key role for Internet users when searching for information. The vast majority of users are heavily influenced by the given ranking on the search engine results page (SERP). In this study, N = 222 university students were tasked to inform themselves about the working conditions in South Asia on the basis of given SERPs. Besides the ranking on the SERP, two credibility cues – the type of the website (news site, corporate website, research institute, and private blog) and the primary source of information mentioned in the search result (scientific study vs. corporate spokesperson) – were varied. Two research objectives were examined: the influence of the ranking and the credibility cues on the evaluation of search results; and the effect of both ranking and credibility cues on the selection. Credibility cues had a strong influence on the perception of the search results’ credibility. Students rated the credibility higher if search results linked to reputable websites or mentioned a neutral primary source of information. We also find an interaction effect between the type of website and the primary source of information. However, participants’ selection was mainly influenced by the ranking. Reasons for this discrepancy are discussed.

Web search engine: Characteristics of user behaviors and their implication
This paper first studies the distribution characteristics of user behaviors based on log data from a massive web search engine. Analysis shows that the stochastic distribution of user queries follows a power-law function and exhibits strong similarity, and that users' queries and clicked URLs show dramatic locality, which implies that a query cache and a 'hot click' cache can be employed to improve system performance. Three typical cache replacement policies are then compared: LRU, FIFO, and LFU with attenuation. In addition, the distribution characteristics of web information are analyzed, demonstrating that the link popularity and replica popularity of a URL have a positive influence on its importance. Finally, the variance between link popularity and user popularity, and the variance between replica popularity and user popularity, are analyzed, which gives some important insight that helps improve the ranking algorithms in a search engine.
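
To make the cache-policy comparison in the abstract above concrete, here is a minimal sketch that measures the hit rate of an LRU and a FIFO query cache on the same query stream. The `hit_rate` function and the sample stream are assumptions made for illustration; the paper's actual experimental setup and its "LFU with attenuation" policy are not reproduced here.

```python
from collections import OrderedDict


def hit_rate(queries, capacity, policy="lru"):
    """Simulate a fixed-size query cache and return the fraction of hits.

    policy="lru" refreshes an entry's position on every hit, so the
    least recently used query is evicted; policy="fifo" never refreshes,
    so the oldest inserted query is evicted. (Hypothetical helper.)
    """
    cache = OrderedDict()
    hits = 0
    for q in queries:
        if q in cache:
            hits += 1
            if policy == "lru":
                cache.move_to_end(q)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the entry at the front
            cache[q] = True
    return hits / len(queries) if queries else 0.0


if __name__ == "__main__":
    # A small skewed stream: "a" is a hot query, the others are one-offs.
    stream = ["a", "b", "a", "c", "a", "d", "a"]
    print("LRU :", hit_rate(stream, capacity=2, policy="lru"))
    print("FIFO:", hit_rate(stream, capacity=2, policy="fifo"))
```

On this stream LRU keeps the hot query cached while FIFO eventually evicts it, which is the kind of locality effect the abstract attributes to real query logs.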

Feeling thermometer questions are widely used in political science research to estimate people’s attitudes and feelings toward a political object, like a political figure or an organization. Given the popularity of the feeling thermometer question in population surveys, more work is needed to explore the measurement of this question type. This study examines the data collection mode effect on feeling thermometers. Using the 2012 American National Election Studies, we find that the measurement of feeling thermometers is not exactly comparable between face-to-face and Web surveys. Face-to-face respondents tend to provide warmer feelings, while Web respondents give relatively more reliable responses in comparison. In both survey modes, respondents are most likely to select the response options that are verbally labeled although the effect is more striking in face-to-face than Web survey. The item nonresponse between these two modes does not differ in a meaningful way. This study ends by discussing future research directions on feeling thermometer questions.

The purpose of the study was to examine the role of domain knowledge when retrieving and using information from the Internet as a resource for essay tasks, as well as to investigate the quality of Internet searches and its relation to essay performance. In two experiments, 100 undergraduates searched the Internet for 30 min and completed two essays; one in which they had high domain knowledge and one in which domain knowledge was low. Two control groups of 70 undergraduates just wrote the essays. Searching the Internet for information enhanced essay performance relative to the control groups only for the topic for which participants had high domain knowledge. In the second experiment, analyses of Internet searches revealed large individual differences in search behaviors and these behaviors did not relate to essay performance, although individuals highlighted the importance of domain knowledge in making their searches easier. Domain knowledge is one factor that educators should pay attention to when using the Internet for learning tasks, particularly when study time is limited, in order to maximize the ability of students to successfully retrieve and use information from the Internet.

Control is necessary for aligning the actions of management (i.e., controllers) and subordinates (i.e., controlees) around common goals. The enactment of control often fails in practice, however, as controlee perceptions may not match those of controllers, leading to a myriad of possible outcomes. Through an interpretive case study of two inter-organisational IT projects, we reveal how controlees' appraisals and responses to controls are context-dependent and play out across multiple levels (e.g., personal, professional, project and organisational contexts). We build on a coping perspective of IS controls to theorise the ‘coping strategies’ that controlees pursued relevant to these contexts and the ‘coping routes’ followed when combining different consecutive coping strategies. We find the process need not end with the selection of a single strategy but can potentially continue as both the controller and controlees make ongoing readjustments. While Behavioural Control Theory traditionally assumes the presence of a single control hierarchy, inter-organisational IT projects are multi-level entities that amalgamate different structures and cultures. Our study moves beyond the existing assumptions of Behavioural Control Theory to discuss how a controller's choice of activities shapes the salience of different contexts in controlee appraisals.

It is well known that many Web pages are difficult to use for visually disabled people. Without access to a rich visual display, the intended structure and organisation of the page is obscured. To fully understand what is missing from the experience of visually disabled users, it is pertinent to ask how the presentation of Web pages on a standard display makes them easier for sighted people to use. This paper reports on an exploratory eye tracking study that addresses this issue by investigating how sighted readers use the presentation of the BBC News Web page to search for a link. The standard page presentation is compared with a “text-only” version, demonstrating both qualitatively and quantitatively that the removal of the intended presentation alters “reading” behaviours. The demonstration that the presentation of information assists task completion suggests that it should be re-introduced to non-visual presentations if the Web is to become more accessible. The conducted study also explored the extent to which algorithms that generate maps of what is perceptually salient on a page match the gaze data recorded in the eye tracking study. The correspondence between a page’s presentation, knowledge of what is visually salient, and how people use these features to complete a task might offer an opportunity to re-model a Web page to maximise access to its most important parts.

As Building Information Modeling (BIM) workflows are becoming very relevant for the different stages of the project’s lifecycle, more data is produced and managed across it. The information and data accumulated in BIM-based projects present an opportunity for analysis and extraction of project knowledge from the inception to the operation phase. In other industries, Machine Learning (ML) has been demonstrated to be an effective approach to automate processes and extract useful insights from different types and sources of data. The rapid development of ML applications, the growing generation of BIM-related data in projects, and the different needs for use of this data present serious challenges to adopt and effectively apply ML techniques to BIM-based projects in the Architecture, Engineering, Construction and Operations (AECO) industry. While research on the use of BIM data through ML has increased in the past decade, it is still in a nascent stage. In order to assess where the industry stands today, this paper carries out a systematic literature review (SLR) identifying and summarizing common emerging areas of application and utilization of ML within the context of BIM-generated data. Moreover, the paper identifies research gaps and trends. Based on the observed limitations, prominent future research directions are suggested, focusing on information architecture and data, applications scalability, and human information interactions.

This theory-testing study is an examination of the influences of perceived information quality and perceived e-service quality in determining perceived value, and the influence of these three constructs in determining web site loyalty intentions. The results indicate that (1) perceived e-service quality and perceived information quality similarly influence loyalty intentions, (2) perceived e-service quality more strongly influences perceived value than does perceived information quality, and (3) perceived information quality partially mediates the relationship between perceived e-service quality and perceived value. The model was significant in explaining loyalty intentions.

This paper argues for a return to fundamentals and for a balanced assessment of the contribution that Information Technology can make as we enter the new millennium. It argues that the field of Information Systems should no longer be distracted from its natural locus of concern and competence, or claim more than it can actually achieve. More specifically, and as a case in point, we eschew IT-enabled Knowledge Management, both in theory and in practice. We view Knowledge Management as the most recent in a long line of fads and fashions embraced by the Information Systems community that have little to offer. Rather, we argue for a refocusing of our attention back on the management of data, since IT processes data, not information and certainly not knowledge. In so doing, we develop a model that provides a tentative means of distinguishing between the terms. This model also forms the basis for on-going empirical research designed to test the efficacy of our argument in a number of case companies currently implementing ERP and Knowledge Management Systems.

Bayesian networks provide the means for representing probabilistic conditional independence. Conditional independence is also widely considered beyond the theory of probability, with linkages to, e.g., database multi-valued dependencies and, at a higher abstraction level, semi-graphoid models. The rough set framework for data analysis is related to conditional independence via the notion of a decision reduct, considered within the wider domain of feature selection. Given the probabilistic version of decision reducts, which is equivalent to data-based Markov boundaries, studies were also conducted for other criteria of rough-set-based feature selection, e.g. those corresponding to multi-valued dependencies. In this paper, we investigate the degrees of approximate conditional dependence, a topic that corresponds to well-known notions such as conditional mutual information and polymatroid functions, although many practically useful approximate conditional independence models are unmanageable within the information-theoretic framework. The paper's major contribution lies in extending the means for understanding the degrees of approximate conditional dependence, with appropriately generalized semi-graphoid properties formulated and with the mathematical soundness of the Bayesian network-like representation of approximate conditional independence statements thoroughly proved. As an additional contribution, we provide a case study of an approximate conditional independence model that would not be manageable without the above-mentioned extensions.

