Similar literature
 20 similar documents found (search time: 62 ms)
1.
Cloud computing infrastructures provide vast processing power and host a diverse set of computing workloads, ranging from service-oriented deployments to high-performance computing (HPC) applications. As HPC applications scale to a large number of VMs, providing near-native network I/O performance to each peer VM is an important challenge. In this paper we present Xen2MX, a paravirtual interconnection framework over generic Ethernet, binary compatible with Myrinet/MX and wire compatible with MXoE. Xen2MX combines the zero-copy characteristics of Open-MX with Xen's memory sharing techniques. Experimental evaluation of our prototype implementation shows that Xen2MX achieves nearly the same raw performance as Open-MX running in a non-virtualized environment. On the latency front, Xen2MX performs as close as 96% to the case where virtualization layers are not present. Regarding throughput, Xen2MX saturates a 10 Gbps link, achieving 1159 MB/s, compared to 1192 MB/s in the non-virtualized case. Xen2MX also scales efficiently with the number of VMs, saturating the link for even smaller messages when 40 single-core VMs put pressure on the network adapters.

2.
With the integration of IP and optical technology, high-speed optical networks (of the order of 10 Gbps) have emerged to support international research cooperation such as massive scientific data transfer and next generation Internet related research. It is therefore critical to explore how to measure and evaluate the performance of transport protocols over 10 Gbps high-speed optical networks. To the best of our knowledge, this is the first work that presents a measurement study on a variety of networking environments. The objectives of this paper are as follows: (i) determine the suitability of TCP parameters such as Jumbo Frame size and the buffer sizes of a TCP sender and receiver; (ii) evaluate TCP performance measurement tools and emulation tools over 10 Gbps high-speed optical networks; and (iii) compare the performance of TCP variants with different metrics, such as throughput and fairness, by varying delays and randomized losses controlled with software emulators. The results show that the selection of emulation and performance measurement tools matters for the accurate measurement of TCP performance. The performance of TCP variants is strongly affected by the Linux TCP/IP stack tuning. Finally, we show that the overall and detailed performance of each TCP variant, such as throughput and fairness, depends on the network environment, in particular the packet loss rate and propagation delay.

3.
This paper presents the design and implementation of a protocol offload engine that processes TCP/IP and remote direct memory access (RDMA) protocols by means of hardware/software coprocessing. In the offload engine, time-consuming operations such as TCP/IP header generation are implemented in hardware to improve performance. The software performs control operations and RDMA header generation. Experiments and analyses show that the hardware can process all operations at speeds of over 1 Gbps. Our engine can offload most of the protocol processing overhead (95% to 100%) from the host CPU. Finally, although the embedded processors operate with a 300 MHz clock that is seven times slower than the clock of the host CPU, our engine shows maximum bandwidths of 673 Mbps for TCP/IP and 551 Mbps for RDMA on a gigabit Ethernet network.

4.
This paper proposes a fully integrated SIP + HCoP-B architecture to provide efficient mobility management of the nested mobile network. It achieves the following merits, which are rare in the literature. First, it reduces network deployment costs by deploying only an integrated SIP mobile server. Second, it supports both SIP-based and non-SIP-based applications. Third, by adopting the analytical model proposed in Mohanty and Akyildiz (2007) [19], mathematical analyses are provided to investigate six performance metrics of SIP + HCoP-B and four other well-known SIP-over-NEMO schemes over the error-prone wireless link. Finally, intensive simulations show that SIP + HCoP-B outperforms these four traditional schemes.

5.
《Parallel Computing》2014,40(5-6):144-158
One of the main difficulties in using multi-point statistical (MPS) simulation based on annealing techniques or genetic algorithms is the excessive amount of time and memory that must be spent in order to achieve convergence. In this work we propose code optimizations and parallelization schemes for a genetic-based MPS code with the aim of reducing the execution time. The code optimizations involve reducing cache misses in array accesses, avoiding branching instructions and increasing the locality of the accessed data. The hybrid parallelization scheme involves a fine-grain parallelization of loops using a shared-memory programming model (OpenMP) and a coarse-grain distribution of load among several computational nodes using a distributed-memory programming model (MPI). Convergence, execution time and speed-up results are presented using 2D training images of sizes 100 × 100 × 1 and 1000 × 1000 × 1 on a distributed-shared memory supercomputing facility.

6.
《Computer Networks》2007,51(11):3172-3196
A search-based heuristic is presented for the optimisation of communication networks where traffic forecasts are uncertain and the problem is NP-complete. While algorithms such as genetic algorithms (GA) and simulated annealing (SA) are often used for this class of problem, this work applies a combination of newer optimisation techniques, specifically fast local search (FLS) as an improved hill climbing method and guided local search (GLS) to allow escape from local minima. The GLS + FLS combination is compared with optimised GA and SA approaches. It is found that, in terms of implementation, the parameterisation of the GLS + FLS technique is significantly simpler than that for a GA or SA. Also, the self-regularisation feature of the GLS + FLS approach provides a distinctive advantage over the other techniques, which require manual parameterisation. To compare numerical performance, the three techniques were tested over a number of network sets varying in size, number of switch circuit demands (network bandwidth demands) and levels of uncertainty on the switch circuit demands. The results show that GLS + FLS outperforms the GA and SA techniques in terms of both solution quality and optimisation speed but, even more importantly, GLS + FLS has significantly reduced parameterisation time.

7.
This paper shows how temporal difference learning can be used to build a signalized junction controller that will learn its own strategies through experience. Simulation tests detailed here show that the learned strategies can have high performance. This work builds upon previous work where a neural network based junction controller that can learn strategies from a human expert was developed (Box and Waterson, 2012). In the simulations presented, vehicles are assumed to be broadcasting their position over WiFi, giving the junction controller rich information. The vehicles' position data are pre-processed to describe a simplified state. The state-space is classified into regions associated with junction control decisions using a neural network. This classification is the strategy and is parametrized by the weights of the neural network. The weights can be learned either through supervised learning with a human trainer or through reinforcement learning by temporal difference (TD). Tests on a model of an isolated T junction show an average delay of 14.12 s and 14.36 s respectively for the human trained and TD trained networks. Tests on a model of a pair of closely spaced junctions show 17.44 s and 20.82 s respectively. Both methods of training produced strategies that were approximately equivalent in their equitable treatment of vehicles, defined here as the variance over the journey time distributions.
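The abstract above does not say which TD variant or network architecture the authors use. As a rough illustration of the underlying idea only, the following is a minimal tabular TD(0) sketch; the state names, the reward (negative delay in seconds) and the function name `td0_update` are hypothetical and are not taken from the paper, which learns neural network weights rather than a table.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.95):
    """Apply one TD(0) update to a tabular value estimate and return the TD error."""
    current = values.get(state, 0.0)
    # TD error: how much better or worse the observed step was than predicted.
    td_error = reward + gamma * values.get(next_state, 0.0) - current
    # Move the estimate a fraction alpha towards the bootstrapped target.
    values[state] = current + alpha * td_error
    return td_error


# Hypothetical junction states; the reward is the negative delay of one step.
values = {"queue_long": 0.0, "queue_short": 1.0}
error = td0_update(values, "queue_long", reward=-2.0,
                   next_state="queue_short", alpha=0.5, gamma=1.0)
print(error, values["queue_long"])
```

In a controller like the one described, the table would be replaced by the neural network and the same error signal would drive the weight updates.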

8.
The identification of high fidelity models is a critical element in the implementation of high performance model predictive control (MPC) applications in industry. These controllers can vary in size, with input–output dimensions ranging from 5 × 10 to 50 × 100. Identifying models of this scale accurately is a time consuming and demanding exercise. We present a novel approach wherein an information rich test signal is generated in closed loop by maximizing the MPC objective, as opposed to the minimization that is done in the standard controller. We show that the proposed input design approach is similar to the T-optimal (trace optimal) experiment design method. Our approach automatically accounts for the input and output constraints and is implemented in a moving horizon manner. It is demonstrated through simulation examples on both well-conditioned and ill-conditioned processes.

9.
Using density functional theory methods, we effectively tune the second-order nonlinear optical (NLO) properties of some chalcone derivatives. Various unique push–pull configurations are used to efficiently enhance the intramolecular charge transfer process in the designed derivatives, which results in significantly larger amplitudes of the first hyperpolarizability than in the parent molecule. The ground state molecular geometries have been optimized at the B3LYP/6-311G** level of theory. A variety of methods including B3LYP, CAM-B3LYP, PBE0, M06, BHandHLYP and MP2 are tested with the 6-311G** basis set to calculate the first hyperpolarizability of parent system 1. The M06 results are found to be close to those of the highly correlated MP2 method, so M06 has been selected to calculate the static and frequency dependent first hyperpolarizability amplitudes of all selected systems. At the M06/6-311G** level of theory, the permanent electronic dipole moment (μ_tot), polarizability (α_0) and static first hyperpolarizability (β_tot) amplitudes for parent system 1 are found to be 5.139 Debye, 274 a.u. and 24.22 × 10⁻³⁰ esu, respectively. These amplitudes are significantly enhanced in the designed derivatives 2 and 3. More importantly, the β_tot amplitudes of systems 2 and 3 amount to 75.78 × 10⁻³⁰ and 128.51 × 10⁻³⁰ esu, respectively, which are about 3 times and 5 times larger than that of parent system 1. Additionally, we have extended the structure-NLO property relationship to several newly synthesized chalcone derivatives. Interestingly, the amplitudes of the dynamic frequency dependent hyperpolarizability μβω (SHG) are also significantly larger, with values of 366.72 × 10⁻⁴⁸, 856.32 × 10⁻⁴⁸ and 1913.46 × 10⁻⁴⁸ esu for systems 1–3, respectively, at an incident laser wavelength of 1400 nm. The dispersion behavior has also been studied over a wide range of wavelengths, from 1907 to 544 nm. Thus, the present work realizes the potential of the designed derivatives as efficient NLO-phores for modern NLO applications.

10.
We report the fabrication and performance of a micromachined Y-cut quartz resonator based thermal infrared detector array. Inverted mesa configuration quartz resonator arrays, 1 mm in diameter and 18 μm thick (90 MHz), with excellent resonance characteristics have been fabricated by reactive ion etching (RIE) of quartz. A temperature sensitivity of 7.2 kHz/K was experimentally measured. Infrared calibration tests on the resonator array, even without the use of infrared absorbers, gave a responsivity of 14.3 MHz/W and an NEP of 326 nW. In this first report on the performance of the Y-cut quartz resonator infrared thermal detector array, the response time measurements were found to be limited by the slow measurement time of the impedance scans and the undesired heating of the quartz substrate. Most importantly, this initial work demonstrates the possibility of realizing infrared detector arrays for room temperature thermal imaging applications that can rival the current state of the art in the field.

11.
A Spiking Neural Network (SNN) is a type of biologically inspired neural network that performs information processing based on discrete-time spikes, unlike a traditional Artificial Neural Network (ANN). Hardware implementation of SNNs is necessary for achieving high performance and low power. We present the Darwin Neural Processing Unit (NPU), a highly configurable neuromorphic hardware co-processor based on SNN implemented with digital logic, supporting a configurable number of neurons, synapses and synaptic delays. The Darwin NPU was fabricated in standard 180 nm CMOS technology with an area of 5 × 5 mm² and a worst-case clock frequency of 70 MHz. It consumes 0.84 mW/MHz with a 1.8 V power supply for typical applications. Two prototype applications are used to demonstrate the performance and efficiency of the Darwin NPU.

12.
Background: To integrate electronic health records (EHRs) from diverse document sources across healthcare providers, facilities, or medical institutions, the IHE XDS.b profile can be considered as one of the solutions. In this research, we have developed an EHR/OpenXDS system that adopts OpenXDS, open source software that complies with the IHE XDS.b profile, and that achieves EHR interoperability.
Objective: We conducted performance testing to investigate the performance and limitations of this EHR/OpenXDS system.
Methodology: The performance testing was conducted for three use cases (EHR submission, query, and retrieval) based on the IHE XDS.b profile for EHR sharing. In addition, we monitored the depletion of hardware resources (including CPU usage, memory usage, and network usage) during test case execution to learn more about the limitations of the EHR/OpenXDS system.
Results: In this EHR/OpenXDS system, the maximum affordable workload for EHR submissions was 400 EHR submissions per hour; the DSA CPU usage was 20%, memory usage was 1380 MB, and the network usage was 0.286 KB input and 7.58 KB output per minute; the DPA CPU usage was 1%, memory usage was 1770 MB, and the network usage was 7.75 KB input and 1.54 KB output per minute; the DGA CPU usage was 24%, memory usage was 2130 MB, and the network usage was 1.3 KB input and 0.174 KB output per minute. The maximum affordable workload for EHR queries was 600 EHR queries per hour; the DCA CPU usage was 66%, memory usage was 1660 MB, and the network usage was 0.230 KB input and 0.251 KB output per minute; the DGA CPU usage was 1%, memory usage was 1890 MB, and the network usage was 0.273 KB input and 0.22 KB output per minute. The maximum affordable workload for EHR retrievals was 2000 EHR retrievals; the DCA CPU usage was 79%, memory usage was 1730 MB, and the network usage was 19.55 KB input and 1.12 KB output per minute; the DPA CPU usage was 3.75%, memory usage was 2310 MB, and the network usage was 0.956 KB input and 19.57 KB output per minute.
Discussion and conclusion: Based on the research results, we suggest that future implementers who deploy the EHR/OpenXDS system consider the following aspects. First, determine the service volume to be provided in the environment and adjust the hardware resources accordingly. Second, the IHE XDS.b profile currently uses SOAP (Simple Object Access Protocol) web services; it might move to RESTful (representational state transfer) web services, which are more efficient than SOAP web services. Third, concurrent processing should be added to the OpenXDS source code to use the hardware more efficiently while processing the ITI-42, ITI-18, and ITI-43 transactions. Fourth, work should continue on adjusting the memory usage of the OpenXDS modules so that memory is used more efficiently, e.g., the memory configuration of the JVM (Java Virtual Machine), Apache Tomcat, and Apache Axis2. Fifth, consider whether hardware monitoring is required in the implementation environment. These research results provide test figures for reference, together with tuning suggestions and future work to continue improving the performance of OpenXDS.

13.
Yao Zhao  Yan Chen 《Computer Networks》2009,53(9):1303-1318
It is highly desirable and important for end users, with no special privileges, to identify and pinpoint faults inside the network that degrade the performance of their applications. However, existing tools infer link-level loss rates inaccurately and have coarse diagnosis granularity. To address these problems, we propose a suite of user-level diagnosis approaches in two categories: (1) the diagnosis tool needs to be deployed only at the source and (2) the tool has to be deployed at both source and destination. For the former, we propose two fragmentation aided diagnosis approaches (FAD), Algebraic FAD and Opportunistic FAD, which use IP fragmentation to enable accurate link-level loss rate inference. For the latter category, we propose Striped Probe Analysis (SPA), which significantly improves the diagnosis granularity over that of the source-only approaches. Internet experiments are used to evaluate each individual scheme (including an improved version of the state-of-the-art tool, Tulip [R. Mahajan, N. Spring, D. Wetherall, T. Anderson, User-level internet path diagnosis, in: ACM SOSP, 2003]) and various hybrid approaches. The results indicate that our approaches dramatically outperform existing work (especially for diagnosis granularity). More importantly, we show that combinations of different individual approaches (e.g. OFAD + Tulip or OFAD + SPA) provide not only the best performance but also a smooth tradeoff among deployment requirement, diagnosis accuracy and granularity.

14.
Modern hospitals are beginning to adopt E-HEALTH services as an efficient complement to traditional healthcare services. To support E-HEALTH services, a locatable, radiation-free and high-capacity communication system is urgently needed in hospitals. Power line communication (PLC) systems can use the ubiquitous power line network to power light-emitting diode (LED) lamps while naturally serving as the backbone network for indoor visible light communication (VLC) systems. In this article, a hybrid broadband power line and visible light communication system with orthogonal frequency division multiplexing modulation is proposed for indoor hospital applications, offering a new alternative to conventional wireless communication systems in hospitals. A general-purpose system model is provided and some basic techniques to enhance system performance are also investigated. Moreover, a feasible demonstration that supports a data rate of over 48 Mbps within a bandwidth of 8 MHz is implemented in the laboratory.

15.
In this work, Ni oxide thin films, with thermal sensitivity superior to Pt and Ni thin films, were formed through annealing of Ni films deposited by r.f. magnetron sputtering. The annealing was carried out in the temperature range of 300–500 °C under atmospheric conditions. The resistivity of the resulting Ni oxide films was in the range of 10.5 μΩ cm/°C to 2.84 × 10⁴ μΩ cm/°C, depending on the extent of Ni oxidation. The temperature coefficient of resistance (TCR) of the Ni oxide films also depended on the extent of Ni oxidation; the average TCR of the Ni oxide resistors, measured between 0 and 150 °C, was 5630 ppm/°C for the films annealed at 300 °C and 2188 ppm/°C for those annealed at 500 °C. Because of their high resistivity and very linear TCR, Ni oxide thin films are superior to pure Ni and Pt thin films for flow and temperature sensor applications.
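The abstract above quotes average TCR values without giving the formula. A minimal sketch follows, assuming the conventional two-point definition TCR = (R_hot − R_ref) / (R_ref · (T_hot − T_ref)), expressed in ppm/°C. The function name and the resistance values are hypothetical; the resistances are chosen only so that the result lands near the 5630 ppm/°C figure and are not data from the paper.

```python
def average_tcr_ppm(r_ref, r_hot, t_ref, t_hot):
    """Average temperature coefficient of resistance between two points, in ppm/°C."""
    if t_hot == t_ref:
        raise ValueError("the two temperatures must differ")
    # Fractional resistance change per degree, scaled to parts per million.
    return (r_hot - r_ref) / (r_ref * (t_hot - t_ref)) * 1e6


# Hypothetical resistor: 100.0 ohm at 0 °C and 184.45 ohm at 150 °C.
print(round(average_tcr_ppm(100.0, 184.45, 0.0, 150.0)))
```

A "very linear" TCR, as claimed in the abstract, would mean this two-point average stays close to the local slope anywhere in the 0 to 150 °C range.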

16.
Joint moment is one of the most important factors in human gait analysis. It can be calculated using multi-body dynamics, but this might not be straightforward. This study had two main purposes: first, to develop a generic multi-dimensional wavelet neural network (WNN) as a real-time surrogate model to calculate lower extremity joint moments and compare them with those determined by the multi-body dynamics approach; second, to compare the calculation accuracy of the WNN with that of a feed-forward artificial neural network (FFANN) as a traditional intelligent predictive structure in biomechanics. To achieve these purposes, data from four patients who walked under three different conditions were obtained from the literature. A total of 10 inputs, including eight electromyography (EMG) signals and two ground reaction force (GRF) components, were determined as the most informative inputs for the WNN based on the mutual information technique. The prediction ability of the network was tested at two different levels of inter-subject generalization. The WNN predictions were validated against outputs from the multi-body dynamics method in terms of normalized root mean square error (NRMSE (%)) and cross correlation coefficient (ρ). Results showed that the WNN can predict joint moments to a high level of accuracy (NRMSE < 10%, ρ > 0.94) compared to the FFANN (NRMSE < 16%, ρ > 0.89). A generic WNN could also calculate joint moments from GRFs and EMG signals much faster and more easily than the multi-body dynamics approach, which removes the need for motion capture. The results therefore indicate that the WNN can be a surrogate model for real-time gait biomechanics evaluation.
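The two validation metrics named above can be sketched as follows. The abstract does not state how NRMSE is normalized or how ρ is computed, so this sketch makes two assumptions: the RMSE is divided by the range of the reference signal, and ρ is taken as the zero-lag Pearson correlation. The function names and sample values are hypothetical.

```python
import math


def nrmse_percent(reference, predicted):
    """RMSE normalized by the range of the reference signal, in percent (assumed convention)."""
    n = len(reference)
    rmse = math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)
    return 100.0 * rmse / (max(reference) - min(reference))


def correlation(x, y):
    """Zero-lag Pearson correlation coefficient between two equal-length signals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical joint moment samples: reference curve and a prediction offset by 0.5.
reference = [0.0, 1.0, 2.0, 3.0, 4.0]
predicted = [0.5, 1.5, 2.5, 3.5, 4.5]
print(nrmse_percent(reference, predicted), correlation(reference, predicted))
```

The example also shows why the two metrics are reported together: a constant offset leaves the correlation at its maximum while the NRMSE still registers the error.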

17.
Content addressable memory (CAM) based solutions are very useful in network applications due to their high speed parallel search mechanism. This paper presents a novel Ternary CAM (TCAM) based NAND Pseudo CMOS–Longest Prefix Match (NPC–LPM) search engine. The proposed system provides a simple hardware based solution for network routers using novel 11T TCAM cell structures and an NPC word line technique. The experiments were performed on a 256 × 128 NPC–LPM system in 0.13 μm technology. The simulation results show that the proposed design provides a low power dissipation of 5.78 mW and a high search speed of 315 MSearches/s at a 1.3 V supply voltage. The presented NPC–LPM system meets the speed requirement of Optical Carrier (OC) 3072 with a line rate of 160 Gb/s in Ethernet networking and the IPv6 protocol. The experimental results also show that the proposed system improves power-performance by 65%.
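For readers unfamiliar with longest prefix match, the lookup that a TCAM performs in parallel can be sketched sequentially in software. This is only an illustration of the matching rule (value/mask comparison, longest prefix wins), not the authors' circuit; the route table, next-hop labels and function names are hypothetical, and Python's standard `ipaddress` module is used for parsing.

```python
import ipaddress


def build_table(routes):
    """Convert (cidr, next_hop) pairs into ternary entries, longest prefix first."""
    entries = []
    for cidr, next_hop in routes:
        net = ipaddress.ip_network(cidr)
        # A ternary entry is a value plus a mask; masked-out bits are "don't care".
        entries.append((int(net.network_address), int(net.netmask), net.prefixlen, next_hop))
    # Ordering by prefix length plays the role of the TCAM priority encoder.
    entries.sort(key=lambda entry: entry[2], reverse=True)
    return entries


def lookup(entries, address):
    """Return the next hop of the longest matching prefix, or None if nothing matches."""
    key = int(ipaddress.ip_address(address))
    for value, mask, _prefix_len, next_hop in entries:
        if key & mask == value:
            return next_hop
    return None


# Hypothetical IPv4 routing table.
table = build_table([("10.0.0.0/8", "A"), ("10.1.0.0/16", "B"), ("0.0.0.0/0", "default")])
print(lookup(table, "10.1.2.3"), lookup(table, "10.2.0.1"), lookup(table, "192.168.1.1"))
```

A hardware TCAM evaluates every entry's comparison in the same cycle, which is where the search rate quoted in the abstract comes from; the loop here trades that parallelism for clarity.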

18.
As part of a research project aimed at developing a thermodynamic database of the La–Sr–Co–Fe–O system for applications in Solid Oxide Fuel Cells (SOFCs), the Co–Fe–O subsystem was thermodynamically re-modeled in the present work using the CALPHAD methodology. The solid phases were described using the Compound Energy Formalism (CEF) and the ionized liquid was modeled with the ionic two-sublattice model based on CEF. A set of self-consistent thermodynamic parameters was obtained. Calculated phase diagrams and thermodynamic properties are presented and compared with experimental data. The modeling covers a temperature range from 298 K to 3000 K and oxygen partial pressures from 10⁻¹⁶ to 10² bar. Good agreement with the experimental data is shown, and improvements were made compared to previous modeling results.

19.
The present study attempts to develop a flow pattern indicator for gas–liquid flow in microchannels with the help of an artificial neural network (ANN). Out of the many neural networks present in the literature, the probabilistic neural network (PNN) has been chosen for the present study due to its speed in operation and accuracy in pattern recognition. The built-in code in MATLAB R2008a has been used to develop the PNN. During training, the superficial velocities of the gas and liquid phases, channel diameter, angle of inclination and fluid properties such as density, viscosity and surface tension have been considered as the governing parameters of the flow pattern. Data have been collected from the literature for air–water and nitrogen–water flow through circular microchannels of different diameters (0.53, 0.25, 0.100 and 0.050 mm for nitrogen–water and 0.53, 0.22 mm for air–water). For the convenience of the study, the flow patterns available in the literature have been classified into six categories, namely bubbly, slug, annular, churn, liquid ring and liquid lump flow. A single PNN model is unable to predict the flow pattern for the whole range (0.53 mm–0.050 mm) of microchannel diameters. Two separate PNN models have therefore been developed to predict the flow patterns of gas–liquid flow through different channel diameters, one for diameters ranging from 0.53 mm to 0.22 mm and another for 0.100 mm–0.05 mm. The predicted maps and their transition boundaries have been compared with the corresponding experimental data and have been found to be in good agreement. In contrast, the transition boundaries predicted by the available analytical models for conventional channels are less accurate than the present work for all channel diameters. The percentage accuracy of the PNN (~94% for the 0.53 mm ID and ~73% for the 0.100 mm ID channel) has also been found to be higher than that of the model based on Weber number (~86% for the 0.53 mm ID and ~36% for the 0.05 mm ID channel).
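The study above relies on MATLAB's built-in PNN. As a language-neutral illustration of what a probabilistic neural network computes, here is a minimal Parzen-window sketch with a Gaussian kernel. It is a generic textbook form, not the authors' model: the two features (gas and liquid superficial velocity), the sample points, the smoothing width `sigma` and the function name are all hypothetical.

```python
import math


def pnn_classify(training, sample, sigma=0.5):
    """Return the class whose averaged Gaussian kernel response to the sample is largest."""
    scores = {}
    for label, patterns in training.items():
        total = 0.0
        for pattern in patterns:
            # Pattern layer: one Gaussian kernel per stored training example.
            dist_sq = sum((a - b) ** 2 for a, b in zip(pattern, sample))
            total += math.exp(-dist_sq / (2.0 * sigma ** 2))
        # Summation layer: average kernel response for this class.
        scores[label] = total / len(patterns)
    # Decision layer: pick the class with the highest estimated density.
    return max(scores, key=scores.get)


# Hypothetical (gas velocity, liquid velocity) examples for two flow patterns.
training = {
    "bubbly": [(0.1, 1.0), (0.2, 1.2)],
    "annular": [(10.0, 0.1), (12.0, 0.2)],
}
print(pnn_classify(training, (0.15, 1.1)), pnn_classify(training, (11.0, 0.15)))
```

Because every training example is stored as a kernel, there is no iterative training, which is consistent with the speed advantage the abstract attributes to the PNN. In practice the features would need scaling so that no single parameter dominates the distance.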

20.
A cobaloxime ([chlorobis(dimethylglyoximeato)(triphenylphosphine)] cobalt (III), [Co(dmgH)2pph3Cl]) incorporated in a plasticized poly(vinyl chloride) membrane was used to develop a perchlorate-selective electrode. The influence of membrane composition on the electrode response was studied. The electrode exhibits a Nernstian response over the perchlorate concentration range 1.0 × 10⁻⁶ to 1 × 10⁻¹ mol l⁻¹ with a slope of −56.8 ± 0.7 mV per decade of concentration, a detection limit of 8.3 × 10⁻⁷, a wide working pH range (3–10) and a fast response time (<15 s). The electrode shows excellent selectivity towards perchlorate with respect to many common anions. The electrode was used to determine perchlorate in water and human urine.
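To put the reported slope in context, the ideal Nernstian slope 2.303·RT/(zF) can be computed directly. The sketch below assumes a temperature of 25 °C, which the abstract does not state, and a charge of −1 for perchlorate; the function name is hypothetical and the constants are the standard gas and Faraday constants.

```python
import math

GAS_CONSTANT = 8.314462618      # J/(mol K)
FARADAY_CONSTANT = 96485.33212  # C/mol


def nernst_slope_mv(charge, temp_c=25.0):
    """Ideal electrode slope in mV per decade of activity for an ion of the given charge."""
    kelvin = temp_c + 273.15
    return 1000.0 * math.log(10) * GAS_CONSTANT * kelvin / (charge * FARADAY_CONSTANT)


ideal = nernst_slope_mv(-1)   # perchlorate is a monovalent anion
measured = -56.8              # mV per decade, from the abstract
print(round(ideal, 2), round(measured / ideal, 3))
```

Under these assumptions the ideal slope is about −59.2 mV per decade, so the measured −56.8 mV per decade is roughly 96% of the theoretical value, which is why the response is described as Nernstian.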
