1,541 results found.
1.
We present a new post-processing method for simulating depth of field based on accurate calculation of circles of confusion. Compared to previous work, our method derives actual scene depth information directly from the existing depth buffer, requires no specialized rendering passes, and allows easy integration into existing rendering applications. Our implementation uses an adaptive, two-pass filter, producing a high-quality depth of field effect that can be executed entirely on the GPU, taking advantage of the parallelism of modern graphics cards and permitting real-time performance when applied to large numbers of pixels.
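The central per-pixel quantity in a depth-buffer-based approach of this kind is the circle of confusion. The CUDA sketch below is not taken from the paper; it only illustrates the standard thin-lens calculation such a filter would typically start from (the adaptive two-pass blur itself is omitted), and all parameter names are assumptions.

```cuda
// Minimal sketch: per-pixel circle-of-confusion diameter computed from an
// existing depth buffer with the thin-lens model.  Illustrative names only.
#include <cuda_runtime.h>
#include <cmath>

__global__ void cocFromDepth(const float* depth,  // linear eye-space depth per pixel
                             float* coc,          // output CoC diameter (world units)
                             int numPixels,
                             float aperture,      // lens aperture diameter (f / N)
                             float focalLen,      // lens focal length
                             float focusDist)     // distance of the focal plane
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numPixels) return;
    float z = depth[i];
    // Thin-lens circle of confusion: c = A * |z - S| / z * f / (S - f)
    coc[i] = aperture * fabsf(z - focusDist) / z * focalLen / (focusDist - focalLen);
}
```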
2.
The problem of packing circles into a domain of prescribed topology is considered. The circles need not have equal radii. The Collins-Stephenson algorithm computes such a circle packing. This algorithm is parallelized in two different ways, and its performance is reported for a triangular, planar domain test case. The implementation uses the highly parallel graphics processing unit (GPU) on commodity hardware. The resulting speedups are discussed on the basis of a number of experiments.
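At its core the algorithm repeatedly adjusts each interior radius until the angles of the incident tangent-circle triangles sum to 2π, which parallelizes naturally as one thread per vertex. The sketch below is a hedged illustration of one such Jacobi-style sweep; the CSR neighbor layout and the simple proportional correction are assumptions (the actual Collins-Stephenson scheme uses a more refined uniform-neighbor update).

```cuda
// Hedged sketch of one parallel radius-relaxation sweep for a circle packing.
// rIn/rOut hold radii for all vertices; only interior vertices (indices
// 0..numInterior-1) are updated here, boundary radii are carried over outside
// this kernel.  Each interior vertex's cyclically ordered neighbor "flower"
// is stored in CSR form.
#include <cuda_runtime.h>
#include <math.h>

// Angle at the circle of radius rv inside the triangle of mutually tangent
// circles with radii rv, ru, rw (law of cosines on the triangle of centers).
__device__ float cornerAngle(float rv, float ru, float rw)
{
    float a = rv + ru, b = rv + rw, c = ru + rw;
    return acosf((a * a + b * b - c * c) / (2.0f * a * b));
}

__global__ void relaxRadii(const int* nbrStart,   // CSR offsets into nbr[]
                           const int* nbr,        // cyclically ordered neighbors
                           const float* rIn,      // current radii
                           float* rOut,           // updated radii (Jacobi sweep)
                           int numInterior)
{
    int v = blockIdx.x * blockDim.x + threadIdx.x;
    if (v >= numInterior) return;

    float rv = rIn[v];
    float theta = 0.0f;                            // angle sum around v
    int begin = nbrStart[v], end = nbrStart[v + 1];
    for (int k = begin; k < end; ++k) {
        int u = nbr[k];
        int w = nbr[(k + 1 < end) ? k + 1 : begin]; // wrap around the flower
        theta += cornerAngle(rv, rIn[u], rIn[w]);
    }
    // Proportional correction: enlarge rv if the flower overlaps (theta > 2*pi),
    // shrink it if there is a gap (theta < 2*pi).
    rOut[v] = rv * theta / (2.0f * 3.14159265358979f);
}
```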
3.
We have designed Particle-in-Cell algorithms for emerging architectures. These algorithms share a common approach, using fine-grained tiles, but have different implementations depending on the architecture. On the GPU, there were two different implementations, one with atomic operations and one with no data collisions, using CUDA C and Fortran. Speedups of up to about 50 compared to a single core of an Intel i7 processor have been achieved. There was also an implementation for traditional multi-core processors using OpenMP, which achieved high parallel efficiency. We believe that this approach should work for other emerging designs such as the Intel Xeon Phi coprocessor from the Intel MIC architecture.
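The scatter step, depositing particle charge or current onto the grid, is where the two GPU variants differ: one serializes conflicting writes with atomics, the other reorders work by tile so that no collisions occur. The sketch below shows only the atomic variant for a 1D cloud-in-cell deposit; the data layout and names are assumptions, not the paper's code.

```cuda
// Hedged sketch of the atomic-operation variant of particle deposition onto a
// 1D grid.  rho must have at least one guard point beyond the last cell so
// that rho[i + 1] is always valid; particles are assumed to lie inside the grid.
#include <cuda_runtime.h>

__global__ void depositCharge(const float* x,     // particle positions
                              float* rho,         // charge density grid (+ guard point)
                              int numParticles,
                              float dx,           // grid spacing
                              float q)            // charge per particle
{
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= numParticles) return;

    float xg = x[p] / dx;                // position in grid units
    int   i  = (int)xg;                  // left grid point
    float w  = xg - (float)i;            // linear (CIC) weight

    // Different particles may target the same cell, so the updates must be atomic.
    atomicAdd(&rho[i],     q * (1.0f - w));
    atomicAdd(&rho[i + 1], q * w);
}
```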
4.
An online beam dynamics simulator is being developed for use in the operation of an ion linear particle accelerator. By employing graphics processing unit (GPU) technology, the performance of the simulator has been increased significantly over that of a single CPU, making it viable in the demanding accelerator operations environment. Once connected to the accelerator control system, it can rapidly respond to any control set-point changes and predict beam properties along an ion linear accelerator in pseudo-real time. This simulator will serve as a virtual beam diagnostic tool, which is especially useful when direct beam measurements are not available. Details of the code structure design, physics algorithms, GPU implementations, and performance are presented.
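Beam-dynamics codes of this kind advance every macro-particle through a sequence of transfer-map elements, which is naturally data-parallel. The fragment below is an illustration only, not the simulator's code: it applies a simple paraxial drift of length L to a bunch, with the 4D coordinate layout and names assumed for the example.

```cuda
// Hedged illustration: one per-particle transfer-map update (a drift element)
// of the kind a GPU beam-dynamics code parallelizes over the bunch.
#include <cuda_runtime.h>

struct Particle { float x, xp, y, yp; };   // transverse position and slope

__global__ void applyDrift(Particle* bunch, int n, float L)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    bunch[i].x += bunch[i].xp * L;          // x -> x + x' * L
    bunch[i].y += bunch[i].yp * L;          // y -> y + y' * L
}
```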
5.
We report fast computation of computer-generated holograms (CGHs) using Xeon Phi coprocessors, recently released by Intel, which integrate many x86-based cores on one chip. CGHs can generate arbitrary light wavefronts and are therefore a promising technology for many applications, for example three-dimensional displays, diffractive optical elements, and the generation of arbitrary beams. However, CGHs incur an enormous computational cost. In this paper, we describe implementations of several CGH generation algorithms on the Xeon Phi and compare the Xeon Phi with a CPU and a graphics processing unit (GPU) in terms of performance and ease of programming.
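The computational cost comes from the point-source superposition: every hologram pixel accumulates a phase term from every object point. The sketch below illustrates that inner computation; it is written as a CUDA kernel to match the other examples here (the paper's Xeon Phi code would run the same loop nest under OpenMP), and the amplitude-hologram formula and names are assumptions rather than the paper's exact algorithm.

```cuda
// Hedged sketch of the core CGH summation: each hologram pixel sums the cosine
// of the propagation phase over all object point sources.
#include <cuda_runtime.h>
#include <math.h>

__global__ void cghKernel(const float3* obj,   // object points (x, y, z)
                          int numPoints,
                          float* hologram,     // hologram plane, width * height
                          int width, int height,
                          float pitch,         // pixel pitch of the hologram
                          float waveNum)       // 2 * pi / wavelength
{
    int px = blockIdx.x * blockDim.x + threadIdx.x;
    int py = blockIdx.y * blockDim.y + threadIdx.y;
    if (px >= width || py >= height) return;

    float xh = px * pitch, yh = py * pitch;
    float acc = 0.0f;
    for (int j = 0; j < numPoints; ++j) {
        float dx = xh - obj[j].x, dy = yh - obj[j].y, dz = obj[j].z;
        float r = sqrtf(dx * dx + dy * dy + dz * dz);   // point-to-pixel distance
        acc += cosf(waveNum * r);                       // accumulate the phase term
    }
    hologram[py * width + px] = acc;
}
```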
6.
The manycore revolution can be characterized by increasing thread counts, decreasing memory per thread, and diversity of continually evolving manycore architectures. High performance computing (HPC) applications and libraries must exploit increasingly finer levels of parallelism within their codes to sustain scalability on these devices. A major obstacle to performance portability is the diverse and conflicting set of constraints on memory access patterns across devices. Contemporary portable programming models address manycore parallelism (e.g., OpenMP, OpenACC, OpenCL) but fail to address memory access patterns. The Kokkos C++ library enables applications and domain libraries to achieve performance portability on diverse manycore architectures by unifying abstractions for both fine-grain data parallelism and memory access patterns. In this paper we describe Kokkos’ abstractions, summarize its application programmer interface (API), present performance results for unit-test kernels and mini-applications, and outline an incremental strategy for migrating legacy C++ codes to Kokkos. The Kokkos library is under active research and development to incorporate capabilities from new generations of manycore architectures, and to address a growing list of applications and domain libraries.
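The two abstractions the paper unifies are polymorphic multidimensional arrays (Views, whose memory layout is chosen per backend to match its preferred access pattern) and data-parallel execution (parallel_for / parallel_reduce). The following minimal example shows generic Kokkos usage of both; it is not code from the paper.

```cpp
// Minimal sketch of generic Kokkos usage: a View allocated in the default
// memory space, a data-parallel fill, and a parallel reduction.
#include <Kokkos_Core.hpp>
#include <cstdio>

int main(int argc, char* argv[]) {
    Kokkos::initialize(argc, argv);
    {
        const int n = 1 << 20;
        Kokkos::View<double*> x("x", n);      // layout chosen by the backend

        // Fine-grained data parallelism: one logical iteration per element.
        Kokkos::parallel_for("fill", n, KOKKOS_LAMBDA(const int i) {
            x(i) = 1.0 / (i + 1.0);
        });

        double sum = 0.0;
        Kokkos::parallel_reduce("sum", n, KOKKOS_LAMBDA(const int i, double& local) {
            local += x(i);
        }, sum);

        std::printf("harmonic partial sum = %f\n", sum);
    }
    Kokkos::finalize();
    return 0;
}
```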
7.
Identifying biological features by spectral analysis produces large volumes of data, yet practical applications require real-time processing. Partial least squares (PLS) is the most widely used discrimination algorithm, but it cannot achieve real-time performance on large-scale data streams. To resolve this conflict, a parallel computing strategy based on the NVIDIA CUDA architecture is proposed: a graphics processing unit (GPU), with its massively parallel computing capability, is used as the computing device, and the PLS algorithm is implemented by exploiting the strengths of the GPU memory hierarchy. Experimental results show that the CUDA implementation of PLS on the GPU is 47 times faster than the CPU implementation, a significant performance improvement that makes it feasible to apply PLS to large-scale data stream processing.
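The dominant cost in NIPALS-style PLS is repeated matrix-vector products, which map directly onto GPU BLAS. The sketch below shows a single weight-vector update w = Xᵀu / ||Xᵀu|| with cuBLAS for an m x n column-major matrix already resident on the GPU; the paper's full implementation and memory optimizations are not reproduced, and all names are illustrative.

```cuda
// Hedged sketch: one NIPALS weight-vector update of PLS done with cuBLAS.
#include <cublas_v2.h>
#include <cuda_runtime.h>

void plsWeightStep(cublasHandle_t handle,
                   const float* dX,   // device pointer, m x n, column-major
                   const float* dU,   // device pointer, score vector, length m
                   float* dW,         // device pointer, output weights, length n
                   int m, int n)
{
    const float one = 1.0f, zero = 0.0f;

    // w = X^T u
    cublasSgemv(handle, CUBLAS_OP_T, m, n, &one, dX, m, dU, 1, &zero, dW, 1);

    // w = w / ||w||
    float nrm = 0.0f;
    cublasSnrm2(handle, n, dW, 1, &nrm);
    float inv = 1.0f / nrm;
    cublasSscal(handle, n, &inv, dW, 1);
}
```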
8.
The fast Fourier transform (FFT) is an efficient algorithm for accelerating computation of the discrete Fourier transform (DFT) and is widely used across science and engineering. Since the FFT first appeared, many optimizations have been studied, from early work on reducing its complexity to large-scale parallel FFT computation in recent years. In parallel computing, the spread of programmable, parallel GPUs, and in particular the emergence of the CUDA unified parallel computing architecture, has greatly enhanced GPU computing power and brought notable improvements in programming and optimization. On this basis, and starting from an analysis of FFT implementations, this paper studies a parallel FFT method suited to GPU computation and implements the FFT on the GPU using the CUDA architecture. In theory, excluding data transfer, the method reduces the time complexity of a one-dimensional FFT from O(N log₂N) to O((N/r) log₂N). Experiments show that the proposed CUDA-based parallel FFT achieves good speedups and meets practical accuracy requirements, demonstrating the correctness and effectiveness of the method.
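The paper implements its own CUDA FFT kernels, which are not reproduced here. For orientation only, the snippet below shows how the same one-dimensional complex transform is performed with NVIDIA's cuFFT library, the usual GPU baseline such custom implementations are compared against.

```cuda
// Illustration only (not the paper's kernels): a 1-D complex-to-complex FFT
// on the GPU using cuFFT.
#include <cufft.h>
#include <cuComplex.h>
#include <cuda_runtime.h>
#include <vector>

int main() {
    const int N = 1 << 20;
    std::vector<cufftComplex> h(N, make_cuFloatComplex(1.0f, 0.0f));

    cufftComplex* d = nullptr;
    cudaMalloc(&d, sizeof(cufftComplex) * N);
    cudaMemcpy(d, h.data(), sizeof(cufftComplex) * N, cudaMemcpyHostToDevice);

    cufftHandle plan;
    cufftPlan1d(&plan, N, CUFFT_C2C, 1);        // single 1-D transform of length N
    cufftExecC2C(plan, d, d, CUFFT_FORWARD);    // in-place forward FFT

    cudaMemcpy(h.data(), d, sizeof(cufftComplex) * N, cudaMemcpyDeviceToHost);

    cufftDestroy(plan);
    cudaFree(d);
    return 0;
}
```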
9.
In this paper we focus on two complementary approaches to significantly decrease the pre-training time of a deep belief network (DBN). First, we propose an adaptive step size technique to enhance the convergence of the contrastive divergence (CD) algorithm, thereby reducing the number of epochs needed to train the restricted Boltzmann machine (RBM) that supports the DBN infrastructure. Second, we present a highly scalable graphics processing unit (GPU) parallel implementation of the CD-k algorithm, which boosts training speed notably. Additionally, extensive experiments are conducted on the MNIST and HHreco databases. The results suggest that the maximum useful depth of a DBN is related to the number and quality of the training samples. Moreover, it was found that the lower-level layer plays a fundamental role in building successful DBN models. Furthermore, the results contradict the preconceived idea that all the layers should be pre-trained. Finally, it is shown that by incorporating multiple back-propagation (MBP) layers, the DBN's generalization capability is remarkably improved.
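As an illustration of what a per-weight adaptive step size for CD training can look like, the kernel below grows the step when successive gradient estimates agree in sign and shrinks it otherwise (a Rprop-style rule). This is a generic sketch under that assumption, not the specific technique proposed in the paper, and all names are illustrative.

```cuda
// Hedged sketch of a generic per-weight adaptive step-size rule for CD training
// of an RBM; illustration only, not the paper's method.
#include <cuda_runtime.h>

__global__ void adaptiveCdUpdate(float* W,          // RBM weights (flattened)
                                 float* step,       // per-weight step sizes
                                 float* prevGrad,   // gradient from previous epoch
                                 const float* grad, // current CD-k gradient estimate
                                 int n,
                                 float up,          // growth factor, e.g. 1.2
                                 float down)        // shrink factor, e.g. 0.5
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float g = grad[i];
    step[i] *= (g * prevGrad[i] > 0.0f) ? up : down;   // grow or shrink the step
    W[i]    += step[i] * ((g > 0.0f) - (g < 0.0f));    // sign-based weight update
    prevGrad[i] = g;
}
```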
10.
《Parallel Computing》2014,40(5-6):70-85
QR factorization is a computational kernel of scientific computing. How can the latest computers be used to accelerate this task? We investigate this topic by proposing a dense QR factorization algorithm with adaptive block sizes on a hybrid system that contains a central processing unit (CPU) and a graphics processing unit (GPU). To maximize the use of the CPU and GPU, we develop an adaptive scheme that chooses the block size at each iteration. The decision is based on statistical surrogate models of performance and an online monitor, which avoids unexpected occasional performance drops. We modify the highly optimized CPU–GPU based QR factorization in MAGMA to implement the proposed schemes. Numerical results suggest that our approaches are efficient and can lead to near-optimal block sizes. The proposed algorithm can be extended to other one-sided factorizations, such as LU and Cholesky factorizations.
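The host-side control logic the abstract describes can be summarized as: before each blocked step, query a surrogate performance model for the best block size, then let an online monitor discard a choice whose measured time deviates badly from the prediction. The sketch below captures only that logic; predictStepTime and qrPanelStep are hypothetical stand-ins (a real code would delegate the step to MAGMA, as the paper does), not MAGMA API calls.

```cpp
// Hedged sketch of the adaptive block-size loop.  Both helper functions are
// placeholders defined only so the sketch compiles; they are not MAGMA routines.
#include <algorithm>
#include <chrono>
#include <set>
#include <vector>

// Placeholder surrogate model: predicted seconds for one blocked step.
// A real code would fit this from benchmark data.
double predictStepTime(int colsLeft, int nb)
{
    return 1e-9 * static_cast<double>(colsLeft) * nb;   // stand-in cost model
}

// Placeholder for one blocked CPU-GPU step (panel factorization + trailing
// update), e.g. delegated to a MAGMA/cuSOLVER routine in a real code.
void qrPanelStep(double* /*A*/, int /*m*/, int /*n*/, int /*col*/, int /*nb*/) {}

void adaptiveQr(double* A, int m, int n, const std::vector<int>& candidates)
{
    std::set<int> rejected;                      // block sizes the monitor gave up on
    for (int col = 0; col < n; ) {
        const int colsLeft = n - col;

        // Ask the surrogate model for the fastest candidate block size
        // (the smallest candidate serves as an always-allowed default).
        int nb = std::min(candidates.front(), colsLeft);
        double predicted = predictStepTime(colsLeft, nb);
        for (int c : candidates) {
            if (rejected.count(c) || c > colsLeft) continue;
            double t = predictStepTime(colsLeft, c);
            if (t < predicted) { predicted = t; nb = c; }
        }

        // Run the step; the online monitor rejects a block size whose measured
        // time is far worse than predicted (an occasional performance drop).
        auto t0 = std::chrono::steady_clock::now();
        qrPanelStep(A, m, n, col, nb);
        double elapsed = std::chrono::duration<double>(
                             std::chrono::steady_clock::now() - t0).count();
        if (elapsed > 2.0 * predicted) rejected.insert(nb);

        col += nb;
    }
}
```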