In this paper, we conduct a performance scaling analysis of multithreaded multicore processors (MMPs) for parallel computing. We propose a thread-level closed-queuing network model covering a fairly large design space, accounting for hardware scaling models; coarse-grain, fine-grain, and simultaneous multithreading (SMT) cores; and shared resources, including cache, memory, and critical sections. We then derive a closed-form solution for this model in terms of a speedup performance measure. This solution makes it possible to analyze the performance scaling properties of MMPs along multiple dimensions. In particular, we show that for the parallelizable part of the workload, the speedup, in the absence of resource contention, is no longer just a linear function of the parallel processing unit count, as predicted by Amdahl’s law, but also a strong function of workload characteristics, ranging from strongly memory-bound to strongly CPU-bound workloads. We also find that with core multithreading, superlinear speedup, higher than that predicted by Amdahl’s law, may be achieved for the parallelizable part of the workload if core threads exhibit strong cache affinity and the workload is strongly memory-bound. Then, we derive a tight speedup upper bound in the presence of both memory resource contention and critical sections for multicore processors with single-threaded cores. This speedup upper bound indicates that with resource contention among threads, whether due to shared memory or critical sections, a sequential term is guaranteed to emerge from the parallelizable part of the workload, fundamentally limiting the scalability of multicore processors for parallel computing, in addition to the sequential part of the workload, as dictated by Amdahl’s law. As a result, to improve speedup performance for MMPs, one should strive to enhance memory parallelism and confine critical sections as locally as possible, e.g., to the smallest possible number of threads in the same core.
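
For reference, the Amdahl's-law baseline that this abstract argues against can be written down in a few lines. The sketch below is only the textbook formula, not the paper's queuing-network model; the function and parameter names are illustrative assumptions.

```python
import math


def amdahl_speedup(parallel_fraction: float, units: int) -> float:
    """Textbook Amdahl's-law speedup for a fixed workload.

    parallel_fraction: share of the work that can run in parallel (0..1).
    units: number of parallel processing units (>= 1).
    Both names are illustrative; they do not come from the paper.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The parallel part is divided evenly over the units (linear scaling),
    # while the sequential part is not accelerated at all.
    return 1.0 / (serial_fraction + parallel_fraction / units)


def amdahl_limit(parallel_fraction: float) -> float:
    """Upper bound on speedup as the unit count grows without bound."""
    serial_fraction = 1.0 - parallel_fraction
    return math.inf if serial_fraction == 0.0 else 1.0 / serial_fraction
```

In this baseline the parallelizable part alone scales linearly with the unit count (a parallel fraction of 1.0 on 8 units gives a speedup of 8). The abstract's claim is that workload characteristics and contention change that picture, including by introducing an extra sequential term.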

Microprocessor architecture has entered the multicore era. Recently, Hill and Marty presented a pessimistic view of multicore scalability. Their analysis was based on Amdahl’s law (i.e., the fixed-workload condition), and they challenged readers to develop better models. In this study, we analyze multicore scalability under fixed-time and memory-bound conditions and from the data access (memory wall) perspective. We use the same hardware cost model of multicore chips as Hill and Marty, but arrive at very different and more optimistic performance models. These models show that there is no inherent, immovable upper bound on the scalability of multicore architectures. These results complement existing studies and demonstrate that multicore architectures are capable of extensive scalability.
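
The contrast between the fixed-workload and fixed-time views can be sketched numerically. The abstract gives no formulas, so the code below rests on two assumptions: the symmetric-multicore speedup is taken from Hill and Marty's published model, with a core built from r base-core equivalents performing as sqrt(r), and the fixed-time case uses the classic Gustafson scaled-speedup formula as a stand-in for this paper's own models. All names are illustrative.

```python
import math


def hill_marty_symmetric(f: float, n: int, r: int) -> float:
    """Fixed-workload speedup of a symmetric multicore chip.

    Assumed from Hill and Marty's model: the chip has n base-core
    equivalents (BCEs), each core uses r BCEs and performs as sqrt(r),
    so there are n / r cores. With r = 1 this reduces to Amdahl's law.
    f is the parallelizable fraction of the workload.
    """
    perf = math.sqrt(r)
    sequential_time = (1.0 - f) / perf       # one core runs the serial part
    parallel_time = f * r / (perf * n)       # n / r cores share the rest
    return 1.0 / (sequential_time + parallel_time)


def fixed_time_speedup(f: float, n: int) -> float:
    """Gustafson's scaled speedup: the workload grows with n so that the
    run time stays constant. Used here only as the classic fixed-time
    formula, not as the exact model derived in the paper."""
    return (1.0 - f) + f * n
```

With f = 0.9 and n = 16 single-BCE cores, the fixed-workload speedup is 6.4 while the fixed-time speedup is 14.5, which illustrates why the choice of scaling condition changes the outlook so sharply.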

To satisfy the need for increasing computer processing power, the design process of modern computing systems is changing significantly. Major chip vendors are adding multicore or manycore processors to their product lines. Multicore architectures offer a tremendous amount of processing speed. At the same time, they bring challenges for embedded systems, which suffer from limited resources. Various cache memory hierarchies have been proposed to satisfy the requirements of different embedded systems. Normally, a level-1 cache (CL1) memory is dedicated to each core. However, the level-2 cache (CL2) can be shared (as in Intel Xeon and IBM Cell) or distributed (as in AMD Athlon). In this paper, we investigate the impact of the CL2 organization type (shared vs. distributed) on the performance and power consumption of homogeneous multicore embedded systems. We use the VisualSim and Heptane tools to model and simulate the target architectures running FFT, MI, and DFT applications. Experimental results show that by replacing a single-core system with an 8-core system, reductions in mean delay per core of 64% for distributed CL2 and 53% for shared CL2 are possible with little additional power (15% for distributed CL2 and 18% for shared CL2) for FFT. Results also reveal that the distributed CL2 hierarchy outperforms the shared CL2 hierarchy for all three applications considered and for other applications with similar code characteristics.

The present study examined the effect of a nonhuman external regulator on children’s self-regulation of their own learning process, and the extent to which children succeeded in applying it when they talked and thought aloud while acting alone with the nonhuman external regulator. Aginian’s methodology, which uses an isolated, computer-based learning system that acts as a standalone learning environment with a special set of tasks, was used with one hundred healthy preschool children. The results showed that young children were able to regulate their own learning process and engage of their own free will without needing regulation from their real teacher. The conclusion provided evidence that thinking aloud should occur spontaneously and naturally, and that nonhuman external regulation has a positive effect on young children’s development when they act of their own free will, as well as a positive effect on their behavior.

The present study examined the effect of a nonhuman external regulator on children’s responses during learning tasks in order to detect children with developmental problems (DP) associated with the natural development process of self-regulation. The material was an isolated, computer-based learning system that acts as a standalone learning environment. It was used by 100 preschool children, who were randomly selected from ten preschools without reviewing their medical files. Participants were classified by the system itself during learning progression into three essential groups based on Aginian’s zone of children regulation (ZCR), which is “the equilibrium point in the self-regulation’s development process that controls the child to be either a self-Vygotskyian’s learner, self-Piagetian’s learner, or self-Aginian’s learner during learning tasks” (Agina, Kommers, & Steehouder, 2011d). The results showed that preschool children can spontaneously complete diagnostic tests during learning tasks and that the nonhuman external regulator was able to analyze children’s responses, which in turn were used to detect children with DP. This result was confirmed in practice by reviewing all the children’s medical files, which matched the final judgment of the nonhuman external regulator. However, the results also confirmed that the natural development of self-regulation fluctuated among three paradoxical views (Vygotskyian vs. Piagetian vs. Aginian).

The present study explored the effect of nonhuman external regulation on children’s natural development of self-regulation, and the effect of each naturally developed class on children’s spontaneous thinking aloud and satisfaction. Aginian’s methodology (Agina et al., 2011a), which relies on special computer agents for external regulation, for measuring self-regulation and children’s satisfaction, and for producing the final results in points, was used with 40 preschool children, who were divided into classes based on their natural development of self-regulation during learning tasks. The results showed that children who followed the Piagetian view outperformed children who followed the Vygotskyian view and the Aginian view, the latter being a new, computer-generated psychological view indicating that the child either followed an unknown class of natural self-regulation development or has an ambiguous psychological problem. The results also showed that the relationship between children’s spontaneous thinking aloud and children’s self-regulation is inverse. The supplemental analysis showed that the computer, as a nonhuman external regulator, can identify children who have psychological problems and can integrate the net signed self-regulation of each child at each task by embedding mathematical integration, so that the computer becomes fully aware of all occurrences of children’s behavioral regulation.

The special relativity considered in [A. Einstein, Zur Elektrodynamik bewegter Körper. Ann. Physik, 17 (1905) 891-921] is based on the concept of a finite speed of information transmittal by the available signals (rays of light). It is demonstrated that the same concept applies to Newton’s law of universal gravitation, since the magnitude of distances between attracting masses can be physically defined (carried, accounted for in the acting forces of gravity) only by signals (physical processes) propagating at finite velocities. It follows that the speed of propagation of gravity is finite. The linear transformations of special relativity are applied to Newton’s law of gravitation to take into account the relativistic effects of information transmittal in a field of central forces of attraction. Relativistic representations of Newton’s law are obtained with respect to the center of gravity, exposing illusory effects that appear at high velocities. It is verified that in atomic physics the effect of Newtonian gravitation on the motion of elementary particles at high velocities remains negligible in the relativistic treatment as well. Computational methods are developed to measure the intensity of gravitation at a distant space-time location using a body that travels in space, emitting uniform pulses of light that are received by the observer at a different space-time location. It is demonstrated that the tensor approach to general relativity and the unified theory of space, time and gravitation, in which the geometrical properties (metric) of the four-dimensional space-time continuum depend on the distribution of gravitating masses in space and their motion, represent a transformed Lorentz invariant with a new type of inertia in the field of forces changing in space and time. Real physical processes evolve according to the forces represented in tensor form by this invariant, which is equivalent to the coordinate-free local invariant of relativistic dynamics that defines the field and the motion of a body whose velocities and accelerations can be measured by relativistic identification methods at a point, time and direction of interest. The results open new avenues for research in general relativity and can be used for software development, field measurements and experimental studies in application to distant or fast-moving systems.

Modeling the color variation due to an illuminant change is an important task for many computer vision applications, like color constancy, object recognition, shadow removal and image restoration. The von Kries diagonal model and Wien’s law are widely assumed by many algorithms for solving these problems. In this work we combine these two hypotheses and we show that under Wien’s law, the parameters of the von Kries model are related to each other through the color temperatures of the illuminants and the spectral sensitivities of the acquisition device. Based on this result, we provide a method for estimating some camera cues that are used to compute the illuminant invariant intrinsic image proposed by Finlayson and others. This is obtained by projecting the log-chromaticities of an input color image onto a vector depending only on the spectral cues of the acquisition device.
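
The two ingredients named in this abstract, the von Kries diagonal model and a projection of log-chromaticities, can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the paper's estimation method: here the projection direction is computed from known per-channel gains, whereas the paper derives it from camera cues. All names and numeric values are hypothetical.

```python
import math


def von_kries(rgb, gains):
    """Von Kries diagonal model: an illuminant change rescales each
    color channel independently by its own gain."""
    return tuple(channel * gain for channel, gain in zip(rgb, gains))


def log_chromaticity(rgb):
    """Log-chromaticities relative to the green channel."""
    r, g, b = rgb
    return (math.log(r / g), math.log(b / g))


def project(rgb, direction):
    """Scalar projection of a pixel's log-chromaticities onto `direction`."""
    x, y = log_chromaticity(rgb)
    return x * direction[0] + y * direction[1]


# Hypothetical gains for one illuminant change (assumption, not from the paper).
gains = (1.4, 1.0, 0.6)

# Under the diagonal model, every pixel's log-chromaticity moves by the
# same shift vector, regardless of the pixel's own color.
shift = (math.log(gains[0] / gains[1]), math.log(gains[2] / gains[1]))

# Projecting onto a unit vector orthogonal to that shift cancels it out.
norm = math.hypot(shift[0], shift[1])
direction = (-shift[1] / norm, shift[0] / norm)

pixel = (0.5, 0.4, 0.2)
before = project(pixel, direction)
after = project(von_kries(pixel, gains), direction)
```

In this toy setting `before` and `after` agree up to floating-point error, which is the sense in which the projected value is illuminant invariant.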

The present study was conducted to shed new light on the effect of nonhuman external regulation on children’s behavioral regulation by investigating the effect of the computer’s task feedback, answer-until-correct (AUC) versus knowledge-of-result (KR), with 40 preschool children (boys and girls) divided into an AUC-Condition and a KR-Condition. Aginian’s methodology (Agina, Kommers, & Steehouder, 2010) with the latest updates (Agina, Kommers, & Steehouder, 2011) was used. It involves an isolated, computer-based learning system with three different computer agents: one for measuring self-regulation as a function of task level selection, one for measuring self-regulation as a function of task precision, and a special agent for evaluating children’s satisfaction. It was hypothesized that the AUC-Condition would outperform the KR-Condition in verbalization intensity, manifested self-regulation, and degree of satisfaction. Although the results did not confirm the hypothesis, the results generated by the game were consistent with the statistical results, and this consistency greatly increases the reliability of Aginian’s measurements. However, neither set of results confirmed Vygotsky’s view or Piaget’s view of self-regulation development, as both indicated that thinking aloud and self-regulation have an inverse relationship and, therefore, that thinking aloud, per se, can be used to explore problems that children may not agree to talk about.

To capitalize on multicore power, modern high-speed data transfer applications usually adopt a multi-threaded design and aggregate multiple network interfaces. However, NUMA introduces another dimension of complexity to these applications. In this paper, we undertake comprehensive experiments on real systems to illustrate the importance of NUMA-awareness to applications with intensive memory accesses and network I/O. Instead of simply attributing the NUMA effect to the physical layout, we provide an in-depth analysis of the underlying interactions inside hardware devices. We profile system performance by monitoring relevant hardware counters, and reveal how the NUMA penalty occurs during prefetch and cache synchronization processes. Consequently, we implement a thread mapping module in BBCP, a bulk data transfer application, as a practical example of enabling NUMA-awareness. The enhanced application is then evaluated on our high-performance testbed with storage area networks (SAN). Our experimental results show that the proposed NUMA optimizations can significantly improve BBCP’s performance in memory-based tests with various contention levels and in realistic data transfers involving SAN-based storage.

Online support groups have become a popular source of information, advice and support for individuals living with a range of health conditions. However, research has not commonly focused on patients living with Parkinson’s disease and their use of online support groups. Thus, the aim of this study was to gain insight into the positive and negative aspects of online communication through an analysis of messages exchanged within Parkinson’s disease discussion forums. Data was collected from four forums and analysed using data-driven thematic analysis. The results revealed that participation in the forums allowed patients to share experiences and knowledge and form friendships, and helped them cope with the challenges of living with Parkinson’s disease. Conversely, a lack of replies, the experience of Parkinson’s disease symptoms, a lack of personal information, the fragility of online relationships, and misunderstandings and disagreements all appeared to compromise the online experience. Practical implications and future research recommendations are proposed.

The intensive use of interactive media has led to assertions about the effect of these media on youth. This paper presents a quantitative study on the position of interactive media in young people’s lives. Rather than following the assumption of a homogeneous generation, we investigate the existence of a diversity of user patterns. The research question for this paper is: Can patterns be found in the use of interactive media among youth? We answer this question through a survey among Dutch youngsters aged 10–23. Four clusters of interactive media users, namely Traditionalists, Gamers, Networkers and Producers, were identified using cluster analysis. Behind these straightforward clusters, a complex whole of user activities can be found. Each cluster shows specific use of and opinions about interactive media. This provides a contextualized understanding of the position of interactive media in the lives of contemporary youth, and a nuanced conceptualization of the ‘Net generation’. This allows for studying the intricate relationship between youth culture, interactive media and learning.

Global Address Space (GAS) programming models enable a convenient, shared-memory style addressing model. Typically this is realized through one-sided operations that can enable asynchronous communication and data movement. With the size of petascale systems reaching 10,000s of nodes and 100,000s of cores, the underlying runtime systems face critical challenges in (1) scalably managing resources (such as memory for communication buffers), and (2) gracefully handling unpredictable communication patterns and any associated contention. For any solution that addresses these resource scalability challenges, it is equally important to maintain the performance of GAS programming models. In this paper, we describe a Hierarchical COOperation (HiCOO) architecture for scalable communication in GAS programming models. HiCOO formulates a cooperative communication architecture with inter-node cooperation amongst multiple nodes (a.k.a. multinode) and hierarchical cooperation among multinodes that are arranged in various virtual topologies. We have implemented HiCOO for a popular GAS runtime library, Aggregate Remote Memory Copy Interface (ARMCI). By extensively evaluating different virtual topologies in HiCOO in terms of their impact on memory scalability, network contention, and application performance, we identify MFCG as the most suitable virtual topology. The resulting HiCOO architecture is able to realize scalable resource management and achieve resilience to network contention, while at the same time maintaining or enhancing the performance of scientific applications. In one case, it reduces the total execution time of an NWChem application by 52%.

In this paper, the second in a series, the authors have extended and implemented their computational algorithms for improving the scalability of CSD (Computational Structural Dynamics) and FSI (Fluid–Structure Interaction) simulations on emerging architectures like multicore High Performance Computing (HPC) platforms. These algorithmic developments and extensions are classified into two categories: (i) enhanced scalability for CSD simulations on multicore platforms, (ii) newer ideas for running FSI simulations. In the first category, the authors employed the ideas developed in the first paper of this series including the multilevel partitioning strategy, next generation optimized communication procedure and better memory management to get enhanced scalability for CSD simulations. In the second category, the authors came up with a novel solver specific multicore-FSI optimal partitioning so as to improve the overall FSI scalability. After implementing the new “intelligent partitioning” algorithm, a speedup ratio of nearly 2.5x was obtained for the total time. The intelligent partitioning algorithm optimizes the number of solid domains relative to the number of fluid domains to optimize the overall FSI solution, irrespective of the type of the flow solver. In general, the authors have demonstrated (i) good, almost linear scalability for aeroelastic applications with several millions of cells on multicore platforms with thousands of cores, (ii) significant improvement in the scalability for smaller FSI problems using the intelligent partitioning.

Because of several analytical and methodological critiques of the findings and contexts of children’s private speech (PS), self-regulation learning (SRL), and thinking aloud (TA), the present study was conducted to shed new light on the effect of nonhuman (computer) versus human (teacher) intervention (C-Condition versus T-Condition) on young children’s speech use, SRL, and satisfaction during learning tasks. Four developmental measurements with novel criteria were used: (1) speech analysis, (2) SRL as a function of task level selection, (3) SRL as a function of task precision, and (4) a friendly-chat questionnaire to measure children’s satisfaction. Two types of intervention (enacted versus verbal encouragement) were applied through a computer-based learning environment and investigated with forty preschool children, who were divided equally between the two conditions by their teachers. It was hypothesized that children who acted alone (C-Condition) would be more productive in PS and would manifest higher SRL, task performance, and satisfaction. The results confirmed the hypothesis with no significant differential effect of gender on performance, showed that the injudicious use of encouragement hindered the children’s regulation behavior, and showed that PS and TA elicitation were entirely different. However, the results did not confirm Vygotsky’s view and were also not fully in line with Piaget’s view of self-regulation development.

The present study was conducted to explore the effect of the absence of external regulators on children’s use of speech (private/social), task performance, and self-regulation during learning tasks. A novel methodology was employed through a computer-based learning environment that offered three types (units) of encouragement in only two sequences of instructional conditions, Verbal-Gesture-Silent (VGS) versus Silent-Gesture-Verbal (SGV). Knowledge of response (KR) was applied as follows: verbal KR feedback with verbal encouragement during the verbal unit, a visual representation of KR without verbal encouragement during the gesture unit, and no KR feedback and no encouragement during the silent unit. Three measurements were used: speech analysis, novel criteria to measure self-regulation and task performance, and a computer-based friendly-chat questionnaire to measure children’s satisfaction. Forty preschool children were divided equally between the two conditions by their teachers. It was hypothesized that children in the VGS condition would be more productive in speech and would manifest higher self-regulation, task performance, and satisfaction. The results showed a significant differential effect on speech intensity and manifested self-regulation, with no significant differential effect on task performance and satisfaction during learning tasks. However, the results did not confirm Vygotsky’s view; rather, they supported (or, at best, were neutral toward) Piaget’s view of self-regulation development.

This paper shows that Zadeh’s extensions of sendograph-metric-continuous fuzzy-valued functions are sendograph-metric-continuous fuzzy functions.

