Similar articles
 Found 20 similar articles; search time 31 ms
1.
Energy-centric DVFS controlling method for multi-core platforms   Total citations: 1, self-citations: 0, citations by others: 1
Dynamic voltage and frequency scaling (DVFS) is a well-known and effective technique for reducing energy consumption in modern processors. However, accurately predicting the effect of frequency scaling on system performance is a challenging problem in real environments. In this paper, we propose a realistic DVFS performance prediction method, and a practical DVFS control policy (eDVFS) that aims to minimize total energy consumption in multi-core platforms. We also present power consumption estimation models for CPU and DRAM by exploiting a hardware energy monitoring unit. We implemented eDVFS in Linux, and our evaluation results show that eDVFS can save a substantial amount of energy compared with the Linux “on-demand” CPU governor in diverse environments.

2.
As data exploration has increased rapidly in recent years, data storage and data processing are getting more and more attention for extracting important information. Finding a scalable solution for processing large-scale data is a critical issue in both relational database systems and the emerging NoSQL databases. With the inherent scalability and fault tolerance of Hadoop, MapReduce is attractive for processing massive data in parallel. Most previous research focuses on developing SQL or SQL-like query translators on top of the Hadoop distributed file system. However, it can be difficult to update data frequently in such a file system. Therefore, we need a flexible datastore such as HBase, not only to place the data over a scale-out storage system, but also to manipulate changeable data in a transparent way. However, the HBase interface is not friendly enough for most users. A GUI composed of a SQL client application and a database connection to HBase will ease the learning curve. In this paper, we propose the JackHare framework, with a SQL query compiler, a JDBC driver, and a systematic method that uses the MapReduce framework for processing unstructured data in HBase. After importing the JDBC driver into a SQL client GUI, we can exploit HBase as the underlying datastore to execute ANSI-SQL queries. Experimental results show that our approaches perform well in terms of efficiency and scalability.

3.
McCabe’s Cyclomatic Complexity (MCC) is a widely used metric for the complexity of control flow. Common usage decrees that functions should not have an MCC above 50, and preferably much less. However, the Linux kernel includes more than 800 functions with MCC values above 50, and over the years 369 functions have had an MCC of 100 or more. Moreover, some of these functions undergo extensive evolution, indicating that developers are successful in coping with the supposed high complexity. Functions with similarly high MCC values also occur in other operating systems and domains, including Windows. For example, the highest MCC value in FreeBSD is 1316, double the highest MCC in Linux. We attempt to explain all this by analyzing the structure of high-MCC functions in Linux and showing that in many cases they are in fact well-structured (albeit we observe some cases where developers indeed refactor the code in order to reduce complexity). Moreover, human opinions do not correlate with the MCC values of these functions. A survey of perceived complexity shows that there are cases where high MCC functions were ranked as having a low complexity. We characterize these cases and identify specific code attributes such as the diversity of constructs (not only a switch but also ifs) and nesting that correlate with discrete increases in perceived complexity. These observations indicate that a high MCC is not necessarily an impediment to code comprehension, and support the notion that complexity cannot be fully captured using simple syntactic code metrics. In particular, we show that regularity in the code (meaning repetitions of the same pattern of control structures) correlates with low perceived complexity.
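
Editorial illustration (not from the paper): the point that MCC is a purely syntactic count can be made concrete with a small sketch. The code below approximates MCC as one plus the number of decision points, using Python's standard `ast` module. The set of node types counted, the function name `cyclomatic_complexity`, and the two sample functions `FLAT` and `NESTED` are assumptions chosen for this example; the study itself analyzes C functions in Linux with its own tooling.

```python
import ast

# Node types treated as decision points in this simplified sketch (an assumption;
# real MCC tools differ in what they count, e.g. comprehensions or match cases).
_DECISIONS = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate MCC as 1 + number of decision points found in `source`."""
    count = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISIONS):
            count += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` contributes one extra branch per additional operand.
            count += len(node.values) - 1
    return count


# A regular, switch-like chain: the same pattern repeated.
FLAT = """
def dispatch(op):
    if op == 1:
        return "a"
    elif op == 2:
        return "b"
    elif op == 3:
        return "c"
    elif op == 4:
        return "d"
    return "z"
"""

# Mixed constructs with nesting.
NESTED = """
def search(n, flag):
    if n > 0:
        for i in range(n):
            if flag and i % 2:
                return i
    return -1
"""

if __name__ == "__main__":
    print(cyclomatic_complexity(FLAT), cyclomatic_complexity(NESTED))
```

Under these counting rules both sample functions receive the same score, although one is a flat repetition of a single pattern and the other nests different constructs. That is the kind of distinction the abstract says a simple syntactic metric does not capture.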

4.
Interest in adopting RFID continues to grow in many industries, ranging from supply chain automation to healthcare management. However, the dynamics of the operating environment are one of the major challenges that impede RFID deployment. Even though numerous researchers focus on controlled laboratory experiments to enhance the success of deployment, system performance in the actual production environment may differ significantly from that measured in a controlled laboratory, resulting in poor deployment results. To cope with this situation, this paper proposes an RFID Deployment Optimizer (RFIDDO), which is a generic methodology for optimizing the RFID configuration to provide objective, quantifiable data about the data capture performance of RFID readers for comparing and optimizing RFID applications in a scientific manner. A case study has also been conducted in a logistics company to demonstrate the implementation of RFIDDO and provide contextual details to help other firms in coping with the environmental dynamics in the journey of RFID deployment.

5.
In this work a unified treatment of solid and fluid vibration problems is developed by means of the Finite-Difference Time-Domain (FDTD) method. The proposed scheme takes advantage of a scaling factor in the velocity fields that improves the performance of the method and the vibration analysis in heterogeneous media. Moreover, the scheme has been extended in order to simulate both propagation in porous media and lossy solid materials. In order to accurately reproduce the interaction of fluids and solids in FDTD, both time and spatial resolutions must be reduced compared with the setup used in acoustic FDTD problems. This implies the use of bigger grids and hence more time and memory resources. To reduce simulation time, the FDTD code has been adapted to exploit the resources available in modern parallel architectures. For CPUs, the implicit usage of the Advanced Vector Extensions (AVX) in multi-core CPUs has been considered. In addition, the computation has been distributed across the available cores by means of OpenMP directives. Graphics Processing Units have also been considered, and the degree of improvement achieved with this parallel architecture has been compared with the highly tuned CPU scheme by means of the relative speedup. The parallel versions implemented were up to 3 (AVX and OpenMP) and 40 (CUDA) times faster than the best sequential version for CPU, which also uses OpenMP with auto-vectorization techniques but does not include implicit vector instructions. Results obtained with both parallel approaches demonstrate that massively parallel programming techniques are mandatory in solid-vibration problems with FDTD.

6.
7.
Radio frequency identification (RFID) tags have been widely deployed in many applications, such as supply chain management, inventory control, and traffic card payment. However, these applications can suffer from security issues or privacy violations when the underlying data-protection techniques are not properly designed. Hence, many secure RFID authentication protocols have been proposed. According to the resource usage of the tags, secure RFID protocols are classified into four types: full-fledged, simple, lightweight, and ultra-lightweight. In general, non-full-fledged protocols are vulnerable to desynchronization, impersonation, and tracking attacks, and they also lack scalability. If the tag resources allow more flexibility, full-fledged protocols seem to be an attractive solution. In this study, we examine full-fledged RFID authentication protocols and discuss their security issues. We then design a novel RFID authentication protocol based on elliptic curve cryptography, to avoid these issues. In addition, we present a detailed security analysis and a comparison with related studies; the results show that our scheme is more resistant to a variety of attacks and that it has the best scalability, while maintaining competitive levels of efficiency.

8.
This paper presents an FPGA implementation of the quartic neuron model. This approach uses digital computation to emulate individual neuron behavior. We implemented the neuron model using fixed-point arithmetic. The neuron model’s computations are performed in arithmetic pipelines. It was designed in VHDL and simulated prior to mapping onto the FPGA. We show that the proposed FPGA implementation of the quartic neuron model can emulate the electrophysiological activities of various types of cortical neurons and is capable of producing a variety of different behaviors, with diversity similar to that of neuronal cells. The neuron family of this digital neuron can be modified by appropriately adjusting the neuron model’s parameters.

9.
This is the first report of surface-enhanced Raman scattering (SERS) substrate fabrication using a combination of imprinted hydrogen silsesquioxane (HSQ: HSiO3/2) patterns and self-assembly of gold nanoparticles (AuNPs). To assemble the AuNPs inside the imprinted HSQ pattern, it is important to understand the interactions among AuNPs, and those between AuNPs and HSQ. The authors investigated the effects of HSQ surface charges on the self-assembly of AuNPs. It was found that the negatively charged AuNPs were successfully assembled according to the geometry of the negatively charged HSQ pattern. In addition, it was shown that the SERS substrate fabricated from an HSQ consisting of an inorganic polymer was suitable for organic chemical analysis, by comparing it with a substrate fabricated using an organic polymer.

10.
This paper suggests a general method for compiling OR-parallelism into AND-parallelism. An interpreter for an AND/OR-parallel language written in the AND-parallel subset of the language induces a source-to-source transformation from the full language into the AND-parallel subset. This transformation can be identified and implemented as a special purpose compiler or applied using a general purpose partial evaluator. The method is demonstrated to compile a variant of Concurrent Prolog into an AND-parallel subset of the language called Flat Concurrent Prolog (FCP). It is also shown to be applicable to the compilation of OR-parallel Prolog to FCP. The transformation identified is simple and efficient. The performance of the method is discussed in the context of programming examples. These compare well with conventionally compiled Prolog programs.

11.
Compared to Beowulf clusters and shared-memory machines, GPU and FPGA are emerging alternative architectures that provide massive parallelism and great computational capabilities. These architectures can be utilized to run compute-intensive algorithms to analyze ever-enlarging datasets and provide scalability. In this paper, we present four implementations of the K-means data clustering algorithm for different high performance computing platforms. These four implementations include a CUDA implementation for GPUs, a Mitrion C implementation for FPGAs, an MPI implementation for Beowulf compute clusters, and an OpenMP implementation for shared-memory machines. The comparative analyses of the cost of each platform, difficulty level of programming for each platform, and the performance of each implementation are presented.
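
Editorial illustration (not from the paper): as a reference point for what the four platform-specific versions compute, here is a minimal sequential sketch of K-means in its usual Lloyd's-iteration form. It is written in plain Python rather than CUDA, Mitrion C, MPI, or OpenMP. The seeding rule (the first `k` points), the iteration cap, and the names `kmeans` and `_dist2` are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def _dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int, iters: int = 100):
    """Lloyd's iteration; returns (centroids, labels)."""
    # Assumption for determinism: seed the centroids with the first k points.
    centroids: List[Point] = [tuple(p) for p in points[:k]]
    labels: List[int] = []
    for _ in range(iters):
        # Assignment step: each point independently picks its nearest centroid.
        # This per-point loop is the naturally data-parallel part.
        new_labels = [
            min(range(k), key=lambda c: _dist2(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:  # converged: no assignment changed
            break
        labels = new_labels
    return centroids, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0), (1.0, 0.0), (11.0, 10.0)]
    print(kmeans(data, k=2))
```

In a parallel version, the assignment step is typically the part spread across threads, cores, or nodes, with the centroid update requiring a reduction. How each of the paper's four implementations partitions the work is not stated in the abstract.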

12.
We present a method for preprocessing Prolog programs so that their operational semantics will be given by the first-order predicate calculus. Most Prolog implementations do not use a full unification algorithm, for efficiency reasons. The result is that it is possible to create terms having loops in them, whose semantics is not adequately described by first-order logic. Our method finds places where such loops may be created, and adds tests to detect them. This should not appreciably slow down the execution of most Prolog programs.
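
Editorial illustration (not from the paper): the "loops" mentioned above arise when a variable is bound to a term that contains that same variable, which full unification prevents with an occurs check. The sketch below shows only that check inside a small unifier; it does not reproduce the paper's preprocessing, which decides statically where such tests are needed. The term encoding (capitalized strings as variables, tuples as compound terms `(functor, arg, ...)`) and all function names are assumptions made for this example.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter (assumed encoding)."""
    return isinstance(t, str) and t[:1].isupper()


def walk(t, subst):
    """Follow variable bindings until reaching an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(var, t, subst):
    """True if `var` appears anywhere inside term `t` under `subst`."""
    t = walk(t, subst)
    if t == var:
        return True
    if isinstance(t, tuple):
        return any(occurs(var, arg, subst) for arg in t[1:])
    return False


def unify(a, b, subst=None, occurs_check=True):
    """Return an extended substitution dict, or None if unification fails."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        if occurs_check and occurs(a, b, subst):
            return None  # binding would create a cyclic term
        return {**subst, a: b}
    if is_var(b):
        return unify(b, a, subst, occurs_check)
    if isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst, occurs_check)
            if subst is None:
                return None
        return subst
    return None


if __name__ == "__main__":
    print(unify("X", ("f", "X")))                      # fails with the check
    print(unify("X", ("f", "X"), occurs_check=False))  # binds X to a term containing X
```

With the check disabled, `X` is bound to `f(X)`, a cyclic binding with no finite first-order term behind it. Running the check on every binding is the cost that the abstract says most implementations avoid.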

13.
14.
Radio frequency identification (RFID) is a wireless technology for automatic identification and data capture. Security and privacy issues in RFID systems have attracted much attention. Many approaches have been proposed to achieve the security and privacy goals. One of these approaches is RFID authentication protocols, by which a server and tags can authorize each other through an intracity process. Recently, Chou proposed an RFID authentication protocol based on elliptic curve cryptography. However, this paper demonstrates that Chou’s protocol does not satisfy tag privacy, forward privacy and authentication, and server authentication. Based on these security and privacy problems, we also show that Chou’s protocol is defenseless against impersonation attacks, tag cloning attacks and location tracking attacks. Therefore, we propose a more secure and efficient scheme, which not only covers all the security flaws and weaknesses of related previous protocols, but also provides more functionality. We prove the security of the proposed improved protocol in the random oracle model.

15.
Multicore processors can provide sufficient computing power and flexibility for complex streaming applications, such as high-definition video processing. For less hardware complexity and power consumption, the distributed scratchpad memory architecture is considered, instead of the cache memory architecture. However, the distributed design poses new challenges to programming. It is difficult to exploit all available capabilities and achieve maximal throughput, due to the combined complexity of inter-processor communication, synchronization, and workload balancing. In this study, we developed an efficient design flow for parallelizing multimedia applications on a distributed scratchpad memory multicore architecture. An application is first partitioned into streaming components and then mapped onto multicore processors. Various hardware-dependent factors and application-specific characteristics are involved in generating efficient task partitions and allocating resources appropriately. To test and verify the proposed design flow, three popular multimedia applications were implemented: a full-HD motion JPEG decoder, an object detector, and a full-HD H.264/AVC decoder. For demonstration purposes, SONY PlayStation® 3 was selected as the target platform. Simulation results show that, on PS3, the full-HD motion JPEG decoder with the proposed design flow can decode about 108.9 frames per second (fps) in the 1080p format. The object detection application can perform real-time object detection at 2.84 fps at \(1280 \times 960\) resolution, 11.75 fps at \(640 \times 480\) resolution, and 62.52 fps at \(320 \times 240\) resolution. The full-HD H.264/AVC decoder application can achieve nearly 50 fps.

16.
This paper describes a reliable method for fabrication of stable gold patterns embedded in polydimethylsiloxane (PDMS) using a direct peel-off process. Two different surface modifications with self-assembled monolayers were carried out for easy and reliable transfer of Au micro-patterns to the PDMS: (1) perfluorodecyltrichlorosilane on a Si substrate for easy release of the Au patterns from the Si substrate, and (2) (3-mercaptopropyl)trimethoxysilane on the Au patterns to promote the adhesion between the Au patterns and PDMS. Au features as small as 2 μm, in the shapes of lines and dots, were successfully transferred from the Si substrate to the PDMS over a 3-inch wafer. Transfer of Au patterns to PDMS using the dry peel-off process did not cause any contamination of PDMS, typically seen in wet chemical methods. Finally, the stability of the Au patterns embedded in PDMS was confirmed by the Scotch-tape adhesion test.

17.
Puzzle - an efficient, compression independent video encryption algorithm   Total citations: 1, self-citations: 0, citations by others: 1
Real-time video streams require an efficient encryption method to ensure their confidentiality. One of the major challenges in designing a video encryption algorithm is to encrypt the vast amount of video data in real-time to satisfy the stringent time requirements. Video encryption algorithms can be classified according to their association with video compression into joint compression and encryption algorithms and compression-independent encryption algorithms. The latter have a clear advantage over the former regarding the incorporation into existing multimedia systems due to their independence of the video compression. In this paper we present the compression-independent video encryption algorithm Puzzle, which was inspired by the children's game, the jigsaw puzzle. It comprises two simple encryption operations with low computational complexity: puzzling and obscuring. The scheme thereby dramatically reduces the encryption overhead compared to conventional encryption algorithms, such as AES, especially for high resolution video. Further outstanding features of Puzzle are a good trade-off between security demands and encryption efficiency, no impairment of video compression efficiency, and an easy integration into existing multimedia systems. This makes Puzzle particularly well-suited for security-sensitive multimedia applications, such as videoconferencing, where maximal security and minimal encryption overhead are desired simultaneously.

18.
The Artificial Reaction Network (ARN) is a Cell Signalling Network inspired connectionist representation belonging to the branch of A-Life known as Artificial Chemistry. Its purpose is to represent chemical circuitry and to explore computational properties responsible for generating emergent high-level behaviour associated with cells. In this paper, the computational mechanisms involved in pattern recognition and spatio-temporal pattern generation are examined in robotic control tasks. The results show that the ARN has application in limbed robotic control and computational functionality in common with Artificial Neural Networks. Like spiking neural models, the ARN can combine pattern recognition and complex temporal control functionality in a single network; however, it offers increased flexibility. Furthermore, the results illustrate parallels between emergent neural and cell intelligence.

19.
Aurora is a prototype or-parallel implementation of the full Prolog language for shared-memory multiprocessors, developed as part of an informal research collaboration known as the “Gigalips Project”. It currently runs on Sequent and Encore machines. It has been constructed by adapting Sicstus Prolog, a fast, portable, sequential Prolog system. The techniques for constructing a portable multiprocessor version follow those pioneered in a predecessor system, ANL-WAM. The SRI model was adopted as the means to extend the Sicstus Prolog engine for or-parallel operation. We describe the design and main implementation features of the current Aurora system, and present some experimental results. For a range of benchmarks, Aurora on a 20-processor Sequent Symmetry is 4 to 7 times faster than Quintus Prolog on a Sun 3/75. Good performance is also reported on some large-scale Prolog applications.

20.
We consider NP-hard integer-valued multiindex problems of transportation type. We distinguish a subclass of polynomially solvable multiindex problems, namely multiindex problems with decomposition structure. We construct a general scheme for a heuristic method to solve a number of similar NP-hard decompositional multiindex problems. For one version of implementation for this scheme, we estimate its deviation from the optimum. We illustrate our results with the example of designing a class schedule.

