Similar Literature
20 similar documents retrieved
1.
Meeting client Quality-of-Service (QoS) expectations is a difficult task for providers of e-Commerce services, especially when web servers experience overload conditions, which cause increased response times and request rejections, leading to user frustration, lower usage of the service and reduced revenue. In this paper, we propose a server-side request scheduling mechanism that addresses these problems. Our Reward-Driven Request Prioritization (RDRP) algorithm gives higher execution priority to client web sessions that are likely to bring more service profit (or any other application-specific reward). The method predicts the future structure of a session by comparing the requests seen so far with aggregated information about recent client behavior, and uses these predictions to preferentially allocate web server resources. Our experiments with the TPC-W benchmark application and an implementation of the RDRP techniques in the JBoss web application server show that RDRP can significantly boost the profit attained by the service, while providing better QoS to the clients that bring more profit.
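The scheduling idea, serving pending requests in order of the profit their sessions are predicted to bring, can be illustrated with a small priority queue. This is a minimal sketch rather than the paper's JBoss implementation; the profit table, the predict_session_reward function and the request strings are all hypothetical.

```python
# Minimal sketch of reward-driven request prioritization (RDRP-style scheduling).
# All names (PREFIX_REWARD, predict_session_reward, RewardDrivenQueue) are
# hypothetical; the paper's mechanism is implemented inside the JBoss server.
import heapq
import itertools

# Aggregated profile: expected profit of a session given its request prefix
# (assumed numbers for illustration).
PREFIX_REWARD = {
    ("home",): 0.5,
    ("home", "search"): 1.2,
    ("home", "search", "item"): 4.0,
    ("home", "search", "item", "cart"): 18.0,
}

def predict_session_reward(request_history):
    """Estimate expected profit of a session from the requests seen so far."""
    return PREFIX_REWARD.get(tuple(request_history), 0.1)

class RewardDrivenQueue:
    """Serve pending requests in order of predicted session reward (highest first)."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker for equal rewards

    def submit(self, session_history, request):
        reward = predict_session_reward(session_history)
        heapq.heappush(self._heap, (-reward, next(self._seq), request))

    def next_request(self):
        if not self._heap:
            return None
        _, _, request = heapq.heappop(self._heap)
        return request

queue = RewardDrivenQueue()
queue.submit(["home"], "GET /search?q=book")
queue.submit(["home", "search", "item", "cart"], "POST /checkout")
print(queue.next_request())  # the checkout request is served first: higher expected profit
```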

2.
Mobile surveillance is an Internet application that has recently attracted considerable attention. However, the time and cost of dealing with heterogeneous platforms and proprietary protocols are a burden when developing such systems and expanding their services. In this paper, we present a framework of mobile surveillance service for smartphone users. It includes the design and implementation of a video server and a mobile client called the smartphone watch. A component-based architecture is employed for both the server and the client to allow easy extension and adaptation. We also employ the well-known standard web protocol HTTP, which provides higher compatibility and portability than a proprietary protocol. Three different video transmission modes are provided for efficient usage of the limited bandwidth resource. We demonstrate our approach through real experiments on a commercial smartphone.

3.
Distributed Denial of Service (DDoS) attacks are among the most damaging threats to Internet security today. Recently, malicious web crawlers have been used to execute automated DDoS attacks on web sites across the WWW. In this study we examine the effect of applying seven well-established data mining classification algorithms to static web server access logs in order to: (1) classify user sessions as belonging to either automated web crawlers or human visitors, and (2) identify which of the automated web crawler sessions exhibit 'malicious' behavior and are potential participants in a DDoS attack. The classification performance is evaluated in terms of classification accuracy, recall, precision and F1 score. Seven of the nine vector (i.e. web-session) features employed in our work are borrowed from earlier studies on classifying user sessions as web crawlers. However, we also introduce two novel web-session features: the consecutive sequential request ratio and the standard deviation of page request depth. The effectiveness of the new features is evaluated in terms of the information gain and gain ratio metrics. The experimental results demonstrate the potential of the new features to improve the accuracy of data mining classifiers in identifying malicious and well-behaved web crawler sessions.
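The two new session features can be computed directly from the sequence of URLs requested in a session. The definitions below are illustrative assumptions about how such features might be derived from log data; the paper defines them precisely over parsed access-log sessions.

```python
# Sketch of the two proposed session features: the consecutive sequential request
# ratio and the standard deviation of page request depth. The exact formulas here
# are assumptions for illustration.
from statistics import pstdev
from urllib.parse import urlparse

def consecutive_sequential_request_ratio(paths):
    """Fraction of adjacent request pairs in which the second request stays
    within (or descends from) the directory of the first one."""
    if len(paths) < 2:
        return 0.0
    sequential = sum(
        1 for prev, cur in zip(paths, paths[1:])
        if cur.startswith(prev.rsplit("/", 1)[0])
    )
    return sequential / (len(paths) - 1)

def request_depth_stddev(paths):
    """Standard deviation of the URL path depth (number of path segments)."""
    depths = [len([seg for seg in urlparse(p).path.split("/") if seg]) for p in paths]
    return pstdev(depths)

session = ["/news/", "/news/2023/article1.html", "/news/2023/article2.html", "/about"]
print(consecutive_sequential_request_ratio(session), request_depth_stddev(session))
```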

4.
Early detection in water evaporative installations is one of the keys to fighting the bacterium Legionella, the main cause of Legionnaires' disease. This paper discusses the general structure, elements and operation of a probabilistic expert system capable of predicting the risk of Legionella in real time from remote information about the quality of the water in evaporative installations. The expert system has a master–slave architecture. The slave is a control panel in the installation at risk, containing multi-sensors that continuously provide measurements of chemical and physical variables. The master is a network server responsible for communicating with the control panel; it stores the information received, processes the data in the R environment and publishes the results on a web server. The inference engine of the expert system is built on Bayesian networks, powerful models that combine probabilistic reasoning with graphical modelling. Bayesian reasoning and Markov Chain Monte Carlo algorithms are applied to study the relevant unknown quantities involved in the parametric learning and evidence propagation phases.
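To make the inference step concrete, here is a toy Bayesian-network calculation of posterior risk given sensor evidence. The network structure (water temperature and free-chlorine level influencing risk) and every probability are invented for illustration; the actual system learns its network from installation data and propagates evidence with MCMC.

```python
# Minimal sketch of Bayesian-network style risk inference for a water installation.
# All variables and probabilities are assumptions, not the paper's learned model.
P_HIGH_TEMP = 0.3                       # prior P(temperature in the risk band)
P_LOW_CHLORINE = 0.2                    # prior P(chlorine below the safe threshold)
P_RISK = {                              # P(high risk | temp, chlorine)
    (True, True): 0.90,
    (True, False): 0.40,
    (False, True): 0.30,
    (False, False): 0.05,
}

def posterior_risk(temp_evidence=None, chlorine_evidence=None):
    """P(high Legionella risk | observed sensor evidence), by enumeration."""
    numerator = 0.0
    normaliser = 0.0
    for temp in (True, False):
        if temp_evidence is not None and temp != temp_evidence:
            continue
        for chlorine in (True, False):
            if chlorine_evidence is not None and chlorine != chlorine_evidence:
                continue
            p_config = (P_HIGH_TEMP if temp else 1 - P_HIGH_TEMP) * \
                       (P_LOW_CHLORINE if chlorine else 1 - P_LOW_CHLORINE)
            numerator += p_config * P_RISK[(temp, chlorine)]
            normaliser += p_config
    return numerator / normaliser

print(posterior_risk())                        # prior risk, no sensor readings
print(posterior_risk(temp_evidence=True))      # risk after observing high temperature
```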

5.
A Data Cube Model for Prediction-Based Web Prefetching (total citations: 7; self-citations: 0; citations by others: 7)
Reducing web latency is one of the primary concerns of Internet research. Web caching and web prefetching are two effective techniques for latency reduction. A primary method for intelligent prefetching is to rank potential web documents using prediction models trained on past web server and proxy server log data, and to prefetch the highly ranked objects. For this method to work well, the prediction model must be updated constantly, and different queries must be answered efficiently. In this paper we present a data cube model that represents web access sessions for data mining, in support of prediction model construction. The cube model organizes session data along three dimensions. With the data cube in place, we apply efficient data mining algorithms for clustering and correlation analysis; the resulting web page clusters can then be used to guide the prefetching system. We also propose an integrated web-caching and web-prefetching model, in which the issues of prefetching aggressiveness, replacement policy and increased network traffic are addressed together in a single framework. The core of our integrated solution is a prediction model based on statistical correlation between web objects, which can be frequently updated by querying the data cube of web server logs. To our knowledge, this integrated data cube and prediction-based prefetching framework is the first such effort.
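The correlation-driven part of the approach can be sketched as follows: aggregate co-occurrence statistics from sessions, then rank prefetch candidates for the page just requested. A real data cube would index sessions along three dimensions and be refreshed from server logs; here a flat co-occurrence count stands in for cube queries, and the session data are invented.

```python
# Sketch of correlation-based prefetch ranking over aggregated session data.
from collections import defaultdict
from itertools import combinations

sessions = [
    ["index", "news", "sports"],
    ["index", "news", "weather"],
    ["index", "sports", "scores"],
]

# Aggregate pairwise co-occurrence counts (one "slice" of the cube).
cooccur = defaultdict(int)
page_count = defaultdict(int)
for session in sessions:
    for page in set(session):
        page_count[page] += 1
    for a, b in combinations(set(session), 2):
        cooccur[frozenset((a, b))] += 1

def prefetch_candidates(current_page, top_k=2):
    """Rank pages by their correlation with the page just requested."""
    scores = {
        other: cooccur[frozenset((current_page, other))] / page_count[current_page]
        for other in page_count if other != current_page
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

print(prefetch_candidates("index"))   # e.g. [('news', 0.67), ('sports', 0.67)]
```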

6.
Among web application server resources, the most critical for performance are those held exclusively by a service request for the duration of its execution (or a significant part of it). Such exclusively held server resources become performance bottlenecks, and failures to obtain such a resource constitute a major portion of request rejections under server overload. In this paper, we propose a methodology that computes the optimal pool sizes for two such critical resources: web server threads and database connections. Our methodology uses information about the incoming request flow and about fine-grained server resource utilization by service requests of different types, obtained through offline and online request profiling. We advocate, and show the benefits of, a database connection pooling mechanism that caches a database connection for the duration of a service request's execution (so-called request-wide database connection caching). We evaluate our methodology by testing it on the TPC-W web application. The method accurately computes the optimal number of server threads and database connections, and the sustainable request throughput it computes always lies within a 5% margin of the actual value determined experimentally. Copyright © 2010 John Wiley & Sons, Ltd.
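A back-of-the-envelope version of the sizing calculation can be written from profiled service demands using Little's law (population = throughput x residence time). The numbers and the simple formula below are illustrative assumptions only; the paper's methodology uses per-request-type profiles and is considerably more detailed.

```python
# Back-of-the-envelope sketch of pool sizing from profiled service demands.
import math

arrival_rate = 120.0          # requests per second the server must sustain
mean_request_time = 0.080     # seconds a worker thread is held per request
mean_db_time = 0.030          # seconds of database work per request

# A worker thread is held for the whole request.
threads_needed = math.ceil(arrival_rate * mean_request_time)

# Without request-wide caching a connection is held only during database work;
# with request-wide caching it is held for the whole request, like a thread.
connections_per_db_use = math.ceil(arrival_rate * mean_db_time)
connections_request_wide = math.ceil(arrival_rate * mean_request_time)

print(threads_needed, connections_per_db_use, connections_request_wide)  # 10 4 10
```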

7.
In a typical distributed multimedia system with a single server and multiple clients, clients may send sporadic requests to the server for certain multimedia documents. The requests must be served with a fast response time and with the required quality-of-service guarantee. This requires the server to determine the transmission schedule of each multimedia stream while ensuring the necessary inter- and intra-stream synchronization. There are two major drawbacks in existing scheduling algorithms. First, it is assumed that all channels are available at the beginning of scheduling, whereas in reality requests arrive while others are in service; second, the cost of the scheduling itself is usually ignored. In general, a feasible scheduling algorithm should have the following features: (1) the schedule must be generated in real time, (2) the scheduling cost should be small, and (3) it must be capable of handling multiple requests from multiple clients. In this paper, we propose two dynamic scheduling algorithms whose worst-case time complexity is O(n log nm + nm), where n is the total number of data units in a retrieved multimedia document and m denotes the number of available channels. The salient feature of the proposed algorithms is their inherently dynamic nature, which adjusts the scheduling time for each individual request according to the slack time between consecutive requests. If the slack time between two requests is large, the scheduler can run longer in an attempt to find a better solution. This reduces the response time while maintaining a good quality of presentation. Through both simulation and analysis, we evaluate our algorithms and demonstrate their applicability in a realistic environment.
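The "run longer when there is more slack" behavior has an anytime-algorithm flavor, which the following toy sketch illustrates. The objective function, the local-search step and all names are assumptions made purely for illustration; the paper's algorithms are concrete channel-assignment schedulers with O(n log nm + nm) worst-case cost.

```python
# Tiny sketch of slack-adaptive ("anytime") schedule refinement: the longer the
# slack before the next request, the more refinement iterations may be spent
# searching for a better transmission schedule.
import random
import time

def schedule_quality(schedule):
    """Hypothetical objective: lower total start-time lateness is better."""
    return -sum(schedule)

def improve(schedule):
    """One hypothetical local-search step over the current schedule."""
    candidate = schedule[:]
    i = random.randrange(len(candidate))
    candidate[i] = max(0, candidate[i] - 1)
    return candidate

def anytime_schedule(initial_schedule, slack_seconds):
    deadline = time.monotonic() + slack_seconds
    best = initial_schedule
    while time.monotonic() < deadline:
        candidate = improve(best)
        if schedule_quality(candidate) > schedule_quality(best):
            best = candidate
    return best

print(anytime_schedule([5, 3, 7, 2], slack_seconds=0.01))
```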

8.

Web crawlers collect and index the vast amount of data available online to gather specific types of objective data, such as news, that researchers or practitioners need. As big data are increasingly used in a variety of fields and web data grow exponentially each year, the importance of web crawlers is growing as well. Web servers that currently handle high traffic, such as portal news servers, have safeguards against security threats such as distributed denial-of-service (DDoS) attacks. In particular, a crawler that generates a large amount of traffic to a web server behaves very much like a DDoS attack, so its activity tends to be blocked by the web server. A peer-to-peer (P2P) crawler can be used to solve these problems. However, the limitation of a pure P2P crawler is that the whole system is difficult to maintain when network traffic increases or errors occur. To overcome these limitations, we propose a hybrid P2P crawler that can collect web data using the cloud service platform provided by Amazon Web Services (AWS). The hybrid P2P networking distributed web crawler using AWS (HP2PNC-AWS) is applied to collecting news about Korea's current smart work lifestyle from three portal sites. On Portal A, where the target server does not block crawling, HP2PNC-AWS is faster than a general web crawler (GWC) and only slightly slower than a server/client distributed web crawler (SC-DWC), achieving similar performance to the SC-DWC. However, on Portals B and C, where the target server blocks crawling, HP2PNC-AWS outperforms the other methods in both collection rate and the amount of data collected in the same time. It was also confirmed that the hybrid P2P networking system can work efficiently in web crawler architectures.


9.
Due to the complex nature of the welding process, the data used to construct prediction models often contain a significant amount of inconsistency. In general, this type of inconsistent data is treated as noise in the literature. However, for weldability prediction this inconsistency, which we describe as proper-inconsistency, should not be eliminated, since the inconsistent data can help extract additional information about the process. This paper argues that, in the presence of proper-inconsistency, it is inappropriate to follow the approach generally employed with machine learning algorithms, in terms of both model construction and prediction measurement. Because of the numerical characteristics of proper-inconsistency, traditional prediction performance measures are likely to yield vague results for models built on such data. In this paper, we propose a new prediction performance measure called mean acceptable error (MACE), which measures the performance of prediction models constructed in the presence of proper-inconsistency. The paper presents experimental results with real weldability prediction data, examining the prediction performance of k-nearest neighbor (kNN) and generalized regression neural network (GRNN) models measured by MACE, and the different characteristics of the data in relation to MACE, kNN, and GRNN. The results indicate that using a smaller k on properly-inconsistent data increases the prediction performance measured by MACE. Moreover, under MACE the prediction performance on the correct data increases while the influence of the properly-inconsistent data decreases.
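The spirit of an acceptable-error measure can be sketched as the share of predictions whose error falls within a tolerance band. This is an assumption about the metric's intent for illustration only, not the paper's exact MACE definition, which is formulated specifically for properly-inconsistent weldability data.

```python
# Illustrative sketch of a "mean acceptable error"-style measure.
def mean_acceptable_error(y_true, y_pred, tolerance):
    """Fraction of predictions within +/- tolerance of the observed value."""
    acceptable = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tolerance)
    return acceptable / len(y_true)

observed = [4.1, 3.9, 5.2, 4.8]           # e.g. measured weld quality values (invented)
predicted = [4.0, 4.3, 5.1, 4.0]          # e.g. kNN predictions (invented)
print(mean_acceptable_error(observed, predicted, tolerance=0.2))  # 0.5
```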

10.
《Computers & Geosciences》, 2006, 32(9): 1403-1410
A web service model for geophysical data manipulation, analysis and modeling, based on a generalized data processing system, was implemented. The service is not limited to any specific data type or operation: it allows the user to combine ∼190 tools of the existing package, and new codes can easily be included. It allows remote execution of complex processing flows designed and controlled entirely by remote clients, who are presented with mirror images of the server processing environment. Clients are also able to upload their processing flows to the server, thereby building a knowledge base of processing expertise shared by the community. Flows in this knowledge base are currently represented by a hierarchy of automatically generated interactive web forms. These flows can be accessed, and the resulting data retrieved, either through a web browser or through API calls from within the clients' applications. The server administrator is thus relieved of the need to develop any content-specific data access mechanisms. The underlying processing system is fully developed and includes a graphical user interface, parallel processing capabilities, on-line documentation, an on-line software distribution service and automatic code updates. Currently, the service is installed on the University of Saskatchewan seismology web server (http://seisweb.usask.ca/SIA/ps.php) and maintains a library of processing examples (http://seisweb.usask.ca/temp/examples), including a number of useful web tools (such as UTM coordinate transformations, calculation of travel times of seismic waves in a global Earth model, and generation of color palettes). Important potential applications of this web service model to building intelligent data queries and to the processing and modeling of global seismological data are also discussed.

11.
Web mining involves the application of data mining techniques to large amounts of web-related data in order to improve web services. Web traversal pattern mining discovers users' access patterns from web server access logs. This information can provide navigation suggestions for web users, indicating appropriate actions that can be taken. However, web logs keep growing continuously, and some entries may become outdated over time. Users' behavior may change as the web logs are updated, or when the web site structure changes. Additionally, it can be difficult to determine a perfect minimum support threshold during the data mining process to find interesting rules, so the threshold must be adjusted repeatedly until satisfactory mining results are found. The essence of incremental and interactive data mining is the ability to use previous mining results to avoid unnecessary re-computation when web logs or web site structures are updated, or when the minimum support changes. In this paper, we propose efficient incremental and interactive data mining algorithms to discover web traversal patterns that match users' requirements. The experimental results show that our algorithms are more efficient than comparable approaches.
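The incremental/interactive idea can be sketched by keeping aggregate support counts: new log batches only add to the existing counts (incremental mining), and a changed minimum support only re-filters the counts (interactive mining). Only length-2 traversal patterns are counted below for brevity; the paper's algorithms handle general traversal patterns and site-structure updates.

```python
# Sketch of incremental and interactive support counting for traversal patterns.
from collections import Counter

pattern_counts = Counter()
total_sessions = 0

def add_log_batch(sessions):
    """Fold a new batch of sessions into the running pattern counts."""
    global total_sessions
    total_sessions += len(sessions)
    for session in sessions:
        seen = {tuple(session[i:i + 2]) for i in range(len(session) - 1)}
        pattern_counts.update(seen)

def frequent_patterns(min_support):
    """Re-query the same counts with a new threshold; no log rescan needed."""
    return {p: c / total_sessions for p, c in pattern_counts.items()
            if c / total_sessions >= min_support}

add_log_batch([["A", "B", "C"], ["A", "B", "D"]])
add_log_batch([["A", "C"], ["B", "C"]])
print(frequent_patterns(min_support=0.5))   # {('A', 'B'): 0.5, ('B', 'C'): 0.5}
```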

12.
13.
We show efficient, practical (server-aided) secure two-party computation protocols that ensure privacy, correctness and fairness in the presence of malicious (Byzantine) faults. Our requirements from the server are modest. To ensure privacy and correctness, we only assume a circuit evaluation service executing an initialisation program provided by both parties. To ensure fairness, we further assume a trusted decryption service that decrypts under a known public key. Our fairness-ensuring protocol is optimistic, i.e., the decryption service is invoked only in case of faults. Both trusted services are feasible in practice and may be useful for additional tasks; both can also be distributed, with linear overhead, for redundancy. We believe the protocols are sufficiently efficient to allow deployment, in particular for financial applications. We also propose applications that are natural candidates to benefit from our protocols.

14.
Society's increasing reliance on services provided by web applications places a high demand on their reliability. The flow of control through a web application depends heavily on user inputs and interactions, so user inputs should be thoroughly validated before being passed to the back-end software. Although several techniques are used to validate inputs on the client, users can easily bypass this validation and submit arbitrary data to the server. This can cause unexpected behavior and even allow unauthorized access. A test technique called bypass testing intentionally sends invalid data to the server by bypassing client-side validation. This paper reports results from a comprehensive case study on 16 deployed, widely used, commercial web applications. As part of this project, the theory behind bypass testing was extended and an automated tool, AutoBypass, was built. The case study found failures in 14 of the 16 web applications tested, some of them significant. This study gives evidence that bypass testing is effective, has a positive return on investment, and scales to real applications.
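A bypass test is simply an HTTP submission that skips the client-side checks. The sketch below sends deliberately invalid form data straight to a server endpoint; the URL and field names are hypothetical, and AutoBypass generates such requests automatically from the application's HTML forms rather than by hand like this.

```python
# Minimal sketch of a bypass test: submit data that client-side JavaScript
# validation would normally reject, directly to the server endpoint.
import requests

FORM_URL = "https://example.com/register"   # hypothetical form handler

# Values that deliberately violate typical client-side rules
# (required fields, length limits, allowed characters).
invalid_submissions = [
    {"username": "", "age": "-1"},                            # required field empty
    {"username": "x" * 10_000, "age": "abc"},                 # oversized / wrong type
    {"username": "<script>alert(1)</script>", "age": "30"},   # markup injection
]

for payload in invalid_submissions:
    response = requests.post(FORM_URL, data=payload, timeout=10)
    # A robust server should reject these cleanly (4xx) rather than fail with 5xx
    # or silently accept them.
    print(response.status_code, len(response.text))
```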

15.
Robot-generated web indices such as AltaVista are comprehensive but imprecise; manually generated directories such as Yahoo! are precise but cannot keep up with large, rapidly growing categories such as personal homepages or news stories on the American economy. Thus, if a user is searching for a particular page that is not cataloged in a directory, she is forced to query a web index and manually sift through a large number of responses. Furthermore, if the page is not yet indexed, the user is stymied. This paper presents Dynamic Reference Sifting, a novel architecture that attempts to provide both maximally comprehensive coverage and highly precise responses in real time, for specific page categories. To demonstrate our approach, we describe Ahoy! The Homepage Finder (http://www.cs.washington.edu/research/ahoy), a fielded web service that embodies Dynamic Reference Sifting for the domain of personal homepages. Given a person's name and institution, Ahoy! filters the output of multiple web indices to extract one or two references that are most likely to point to the person's homepage. If it finds no likely candidates, Ahoy! uses knowledge of homepage placement conventions, accumulated from previous experience, to "guess" the URL of the desired homepage. The search process takes 9 seconds on average. On 74% of queries from our primary test sample, Ahoy! finds the target homepage and ranks it as the top reference; 9% of the targets are found by guessing the URL. In comparison, AltaVista can find 58% of the targets but ranks only 23% of these as the top reference.
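The URL-guessing step can be sketched as generating candidate homepage URLs from common placement conventions for personal pages. The templates and name forms below are illustrative assumptions; Ahoy! learns such conventions from pages it has successfully located before.

```python
# Sketch of convention-based homepage URL guessing.
def guess_homepage_urls(first, last, institution_domain):
    user_forms = [last, first + last[0], first[0] + last, f"{first}.{last}"]
    templates = [
        "http://www.{dom}/~{user}/",
        "http://www.{dom}/people/{user}/",
        "http://{user}.{dom}/",
    ]
    return [t.format(dom=institution_domain, user=u.lower())
            for t in templates for u in user_forms]

# Hypothetical person and institution, purely for illustration.
for url in guess_homepage_urls("Ada", "Lovelace", "cs.example.edu")[:4]:
    print(url)
```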

16.
To handle the huge number of requests issued by web users, web servers are often organized as a cluster of switches and gateways, where a switch directs user requests to some gateway. Each gateway, which is connected to several servers, handles a specific type of request, such as ftp or http service. When the servers of a gateway are saturated and the gateway cannot process more requests, adaptation is performed by borrowing a server from another gateway. Such reactive adaptation, however, causes problems, which is why predictive techniques have attracted attention: while a reactive adaptation tries to redress the system after a bottleneck has occurred, a predictive adaptation tries to prevent the system from entering the bottleneck at all. In this article, we improve our previous predictive framework using a Recurrent Artificial Neural Network (RANN) called Nonlinear AutoRegressive with eXogenous (external) inputs (NARX). We employ the new framework for adaptation of a web-based cluster in which each cluster is dedicated to a specific service and self-adaptation is used for load balancing across clusters. To show the improvement, we use the case study presented in our previous work.
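A NARX-style predictor regresses the next load value on past load values (the autoregressive part) and past exogenous inputs such as the request arrival rate. In the sketch below a linear least-squares fit stands in for the recurrent network; the lag orders and the synthetic data are assumptions for illustration only.

```python
# Sketch of a NARX-style load predictor used to trigger adaptation early.
import numpy as np

rng = np.random.default_rng(0)
arrivals = rng.uniform(50, 150, size=200)            # exogenous input: request rate
load = np.zeros(200)
for t in range(2, 200):                               # synthetic load dynamics
    load[t] = 0.6 * load[t - 1] + 0.2 * load[t - 2] + 0.05 * arrivals[t - 1]

def build_design(load, arrivals, p=2, q=1):
    """Rows: [load[t-1..t-p], arrivals[t-1..t-q]]; target: load[t]."""
    start = max(p, q)
    X = [np.concatenate([load[t - p:t][::-1], arrivals[t - q:t][::-1]])
         for t in range(start, len(load))]
    return np.array(X), load[start:]

X, y = build_design(load, arrivals)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

latest = np.concatenate([load[-1:-3:-1], arrivals[-1:]])
predicted_next_load = latest @ coef
print(round(predicted_next_load, 2))   # forecast used to adapt the cluster before saturation
```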

17.
Conversational recommender systems are e-Commerce applications that interactively assist online users in achieving their interaction goals during their sessions. In our previous work, we proposed and validated a methodology for conversational systems that autonomously learns which web page to display to the user at each step of the session. We employed reinforcement learning to learn an optimal strategy, i.e., one that is personalized for a real user population. In this paper, we extend our methodology so that it autonomously learns and updates the optimal strategy dynamically (at run time), individually for each user. This learning occurs perpetually after every session, as long as the user continues interacting with the system. We evaluate our approach in an off-line simulation with four simulated users, as well as in an online evaluation with thirteen real users. The results show that an optimal strategy is learnt and updated for each real and simulated user. For each simulated user, the optimal behavior adapts reasonably to that user's characteristics, but converges only after several hundred sessions. For each real user, the optimal behavior converges within a few sessions. It provides assistance only in certain situations, allowing many users to buy several products together in less time, with more page views and fewer query executions. We show that our approach is novel and discuss how its current limitations can be addressed.
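The per-user, run-time learning loop can be illustrated with a tabular Q-learning update applied after each session to that user's own strategy. The states, actions, rewards and learning constants below are hypothetical; the paper defines them from the recommender's session model.

```python
# Sketch of per-user strategy updates after every session (tabular Q-learning).
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9                 # learning rate and discount (assumed)

def update_user_policy(q_table, session):
    """session: list of (state, action, reward, next_state) transitions."""
    for state, action, reward, next_state in session:
        best_next = max(q_table[next_state].values(), default=0.0)
        q_table[state][action] += ALPHA * (
            reward + GAMMA * best_next - q_table[state][action]
        )

# One Q-table per user, updated perpetually after each of that user's sessions.
user_q = defaultdict(lambda: defaultdict(float))
example_session = [
    ("browsing", "show_comparison_page", 0.0, "comparing"),
    ("comparing", "show_checkout_page", 1.0, "purchased"),
]
update_user_policy(user_q, example_session)
print(dict(user_q["comparing"]))   # {'show_checkout_page': 0.1}
```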

18.
As web users disseminate more of their personal information on the web, the possibility of these users becoming victims of lateral surveillance and identity theft increases. Web resources containing this personal information, which we refer to as identity web references, must therefore be found and disambiguated to produce a single set of web resources that refer to a given person. Such is the scale of the web that forcing web users to monitor their identity web references is not feasible; automated approaches are therefore required. However, automated approaches require background knowledge about the person whose identity web references are to be disambiguated. In this paper we present a detailed approach for monitoring the web presence of a given individual by obtaining background knowledge from Web 2.0 platforms to support automated disambiguation processes. We present a methodology for generating this background knowledge by exporting data from multiple Web 2.0 platforms as RDF data models and combining these models for use as seed data. We present two disambiguation techniques: the first uses a semi-supervised machine learning technique known as Self-training, and the second uses a graph-based technique known as Random Walks; we explain how the semantics of the data supports the intrinsic functionality of each technique. We compare the performance of the presented disambiguation techniques against several baseline measures, including human processing of the same data. We achieve an average precision of 0.935 for Self-training and an average F-measure of 0.705 for Random Walks, in both cases outperforming several baseline measures.
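The Self-training idea can be sketched as follows: start from a small set of resources known to be about the person (the Web 2.0 seed data), train a classifier, then iteratively add the most confidently labelled unlabelled pages to the training set. The toy feature vectors, the classifier choice and the confidence threshold are assumptions for illustration.

```python
# Sketch of a self-training loop for identity web reference disambiguation.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_seed, y_seed, X_unlabelled, threshold=0.9, rounds=5):
    X_train, y_train = X_seed.copy(), y_seed.copy()
    remaining = X_unlabelled.copy()
    model = LogisticRegression()
    for _ in range(rounds):
        if len(remaining) == 0:
            break
        model.fit(X_train, y_train)
        proba = model.predict_proba(remaining)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        new_labels = proba[confident].argmax(axis=1)
        X_train = np.vstack([X_train, remaining[confident]])
        y_train = np.concatenate([y_train, new_labels])
        remaining = remaining[~confident]
    return model

# Toy 2-feature vectors for labelled seed pages (1 = about the person, 0 = not).
X_seed = np.array([[0.9, 0.8], [0.85, 0.9], [0.1, 0.2], [0.2, 0.1]])
y_seed = np.array([1, 1, 0, 0])
X_unlabelled = np.array([[0.8, 0.7], [0.15, 0.25], [0.5, 0.5]])
print(self_train(X_seed, y_seed, X_unlabelled).predict(X_unlabelled))
```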

19.
Design and Implementation of a Real Estate SMS Platform Based on Web Services (total citations: 5; self-citations: 0; citations by others: 5)
宋春, 宋玲. 《计算机工程与设计》 (Computer Engineering and Design), 2007, 28(5): 1147-1149, 1153
Web services, one of the most valuable and relatively new distributed application technologies, are receiving increasingly wide attention. This paper analyzes and introduces the meaning, characteristics, architecture and security of web services, and examines their core technologies: SOAP, WSDL and UDDI. Exploiting the cross-platform interoperability of web services, a distributed real estate SMS platform is built, in which the platform server frequently exchanges data, via XML, with the heterogeneous platforms of telecommunications operators. Finally, the concrete design and implementation of the SMS platform are presented.

20.
One of the major problems on the Internet today is the scalable delivery of data. With more and more people joining the Internet community, web servers and services are being forced to deal with workloads beyond their original data dissemination design capacity. One solution that has arisen to address scalability is multicasting, or push-based data dissemination, which sends data to many clients at once. More recently, using multicasting as part of a hybrid system together with unicasting has shown positive results in increasing server scalability. In this paper we focus on solving problems associated with the hybrid dissemination model. In particular, we address the issues of document popularity and document division while arguing for the use of a third channel, called the multicast pull channel, in the hybrid system model. This channel improves response time while also making the hybrid system more robust. Through extensive simulation with our working hybrid server, we show the usefulness of this additional channel and its effect in creating a more scalable and more efficient web server.
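A simple way to picture the three-channel model is as a popularity-based channel selector: very popular documents are pushed on the multicast push channel, moderately popular ones are grouped onto the multicast pull channel (one transmission answers all currently waiting clients), and rare ones are unicast. The thresholds and document names below are assumptions; the paper derives its policy from simulation.

```python
# Sketch of popularity-based channel selection in a hybrid dissemination server.
def choose_channel(requests_per_minute, push_threshold=100, pull_threshold=10):
    if requests_per_minute >= push_threshold:
        return "multicast-push"       # broadcast periodically, no per-request cost
    if requests_per_minute >= pull_threshold:
        return "multicast-pull"       # one multicast answers all pending requests
    return "unicast"                  # classic per-client response

for doc, rate in [("home.html", 500), ("news.html", 40), ("old-report.pdf", 2)]:
    print(doc, "->", choose_channel(rate))
```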
