Similar literature
20 similar documents found.
1.
A new block algorithm for computing a full-rank solution of the Sylvester-observer equation arising in state estimation is proposed. The major computational kernels of this algorithm are: 1) solutions of standard Sylvester equations, in each of which one of the matrices is of much smaller order than the system matrix (furthermore, this small matrix can be chosen arbitrarily), and 2) orthogonal reduction of small-order matrices. There are numerically stable algorithms for performing these tasks, including Krylov-subspace methods for solving large and sparse Sylvester equations. The proposed algorithm is also rich in Level 3 Basic Linear Algebra Subroutine (BLAS-3) computations and is thus suitable for high-performance computing. Furthermore, the results of numerical experiments on some benchmark examples show that the algorithm is more accurate than some of the existing block algorithms for this problem.
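
For orientation, the Sylvester-observer equation is usually written in the form below. The abstract does not give its notation, so the symbols and dimensions here are assumptions taken from the standard reduced-order observer setting: $A$ and $C$ are the given system and output matrices, while $X$, $F$ and $G$ are to be found, with $F$ stable and $X$ of full rank.

```latex
% Sylvester-observer equation (assumed standard notation)
% Given:   A \in \mathbb{R}^{n \times n},\; C \in \mathbb{R}^{r \times n}
% Unknown: X \in \mathbb{R}^{(n-r) \times n},\;
%          F \in \mathbb{R}^{(n-r) \times (n-r)} \text{ (stable)},\;
%          G \in \mathbb{R}^{(n-r) \times r}
\[
  X A - F X = G C
\]
```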

2.
To make the results reasonable, existing joint diagonalization algorithms have imposed a variety of constraints on diagonalizers. In fact, those constraints can be imposed uniformly by minimizing the condition number of the diagonalizers. Motivated by this, the approximate joint diagonalization problem is viewed as a multiobjective optimization problem for the first time. Based on this, a new algorithm for nonorthogonal joint diagonalization is developed. The new algorithm yields diagonalizers that not only minimize the diagonalization error but also have condition numbers that are as small as possible. Degenerate solutions are strictly avoided. In addition, the new algorithm imposes few restrictions on the target set of matrices to be diagonalized, which makes it widely applicable. Preliminary results on convergence are presented, and we also show that, for exactly jointly diagonalizable sets, no local minima exist and the solutions are unique under mild conditions. Extensive numerical simulations illustrate the performance of the algorithm and provide a comparison with other leading diagonalization methods. The practical use of our algorithm is shown for blind source separation (BSS) problems, especially when ill-conditioned mixing matrices are involved.

3.
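Convolutive Blind Separation Based on Robust Joint Block Diagonalization (cited by: 1 in total; self-citations: 0; citations by others: 1)
汤辉 (Tang Hui), 王殊 (Wang Shu). 《自动化学报》 (Acta Automatica Sinica), 2013, 39(9): 1502-1510
For the convolutive blind separation problem, a new matrix joint block diagonalization (JBD) algorithm is proposed. Existing iterative non-orthogonal JBD algorithms can all fail to converge; this paper exploits the special structure of the separation matrix to guarantee its invertibility, which keeps the iteration stable. Assuming that the block structure of the matrices is known, the convolutive blind separation model is first written in instantaneous form and shown to satisfy the joint block diagonalization structure. Next, a cost function for joint block diagonalization is proposed; because minimizing the cost function is equivalent to minimizing the norm of each block of the matrix, the iterative update of the whole separation matrix is converted into iterative updates of the individual blocks. Finally, the minimization condition yields the iterative algorithm. The algorithm is derived for both the real and the complex case. Basic experiments verify the performance of the new algorithm under different conditions. In the simulations, convolutive mixtures of signals that overlap in both the time and frequency domains are blindly separated, and the results show that the new algorithm has better separation performance and more stable separation capability.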

4.
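Fast Non-symmetric Non-orthogonal Joint Diagonalization Algorithm (cited by: 1 in total; self-citations: 0; citations by others: 1)
To address the slow convergence of non-symmetric joint diagonalization algorithms and their possible convergence to singular solutions, a least-squares-based non-symmetric cost function is first proposed. This cost function adds to the least-squares criterion a constraint term that keeps the diagonalizing matrices nonsingular, which ensures that the algorithm does not converge to a singular solution. A cyclic minimization technique is then used to optimize the proposed cost function, yielding a fast non-symmetric non-orthogonal joint diagonalization algorithm. Performance analysis proves that the algorithm is not only globally asymptotically convergent but also has an invariance property. The relationship between the left and right diagonalizing matrices also demonstrates the generality of non-symmetric joint diagonalization. Simulations show that, compared with the original non-symmetric joint diagonalization algorithm, the proposed algorithm converges faster and significantly reduces the interference-to-signal ratio.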

5.
6.
The purpose of this paper is to highlight the performance issues of matrix transposition algorithms for large matrices that relate to the Translation Lookaside Buffer (TLB) cache. Existing optimisation techniques such as coalesced access and the use of shared memory, however necessary and beneficial, are not sufficient to neutralise the problem. As the problem size increases, these optimisations do not exploit data locality effectively enough to counteract the detrimental effects of TLB cache misses. We propose a new optimisation technique that counteracts the performance degradation of these algorithms and seamlessly complements current optimisations. Our optimisation is based on a detailed analysis of enumeration schemes that can be applied to either individual matrix entries or blocks (sub-matrices). The key advantage of these enumeration schemes is that they do not incur matrix storage format conversion because they operate on canonical matrix layouts. In addition, several cache-efficient matrix transposition algorithms based on enumeration schemes are offered: an improved version of the in-place algorithm for square matrices, an out-of-place algorithm for rectangular matrices, and two 3D involutions. We demonstrate that the choice of enumeration scheme and its parametrisation can have a direct and significant impact on the algorithm's memory access pattern. Our in-place version of the algorithm delivers up to 100% performance improvement over the existing optimisation techniques. For the out-of-place version we observe up to 300% performance gain over NVIDIA's algorithm. We also offer improved versions of two involution transpositions for 3D matrices that can achieve a performance increase of up to 300%. To the best of our knowledge, this is the first effective attempt to control the logical-to-physical block association through the design of enumeration schemes in the context of matrix transposition.
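
As a rough illustration of what enumerating a matrix by blocks means, the following sketch performs an out-of-place transpose of a row-major matrix one tile at a time. It is a plain CPU sketch, not the GPU algorithm from the abstract: the function name `transpose_blocked`, the `block` parameter and the simple row-by-row block enumeration are all assumptions made for illustration, since the abstract does not specify its enumeration schemes.

```python
def transpose_blocked(a, rows, cols, block=2):
    """Out-of-place transpose of a row-major rows x cols matrix held in a flat list.

    Blocks (tiles) are visited in row-major order; this is only the simplest
    possible enumeration scheme, chosen here for illustration.
    """
    out = [None] * (rows * cols)
    for bi in range(0, rows, block):          # first row of the current tile
        for bj in range(0, cols, block):      # first column of the current tile
            for i in range(bi, min(bi + block, rows)):
                for j in range(bj, min(bj + block, cols)):
                    # Element (i, j) of the input becomes element (j, i)
                    # of the cols x rows output.
                    out[j * rows + i] = a[i * cols + j]
    return out


if __name__ == "__main__":
    print(transpose_blocked([1, 2, 3, 4, 5, 6], 2, 3))
```

Changing the order of the two outer loops, or replacing them with another traversal of the tiles, changes the memory access pattern without changing the result, which is the kind of freedom the abstract refers to.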

7.
The conjugate gradient squared (CGS) algorithm is a Krylov subspace algorithm that can be used to obtain fast solutions for linear systems (Ax=b) with complex, nonsymmetric, very large, and very sparse coefficient matrices (A). Using electromagnetic scattering problems as examples, a study of the performance and scalability of this algorithm on two MIMD machines is presented. A modified CGS (MCGS) algorithm, in which the synchronization overhead is effectively reduced by a factor of two, is proposed in this paper. This is achieved by changing the computation sequence in the CGS algorithm. Both experimental and theoretical analyses are performed to investigate the impact of this modification on the overall execution time. These analyses show that CGS is faster than MCGS for smaller numbers of processors and that MCGS outperforms CGS as the number of processors increases. Based on this observation, a set-of-algorithms approach is proposed, in which either CGS or MCGS is selected depending on the dimension of the matrix A (N) and the number of processors (P). The set approach provides an algorithm that is more scalable than either the CGS or MCGS algorithm. Experiments performed on a 128-processor mesh Intel Paragon and on a 16-processor IBM SP2 with a multistage network indicate that MCGS is approximately 20% faster than CGS.
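
For readers unfamiliar with CGS, a minimal serial sketch of the textbook (unpreconditioned) iteration is given below. It is restricted to small real dense matrices stored as lists of lists, which is an assumption made for brevity; it is not the MCGS variant from the abstract, and the helper names `matvec`, `dot` and `cgs` are illustrative. The two inner products per iteration (`rho` and the denominator of `alpha`) are the synchronization points that a parallel implementation has to pay for.

```python
def matvec(a, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def cgs(a, b, tol=1e-12, max_iter=50):
    """Textbook conjugate gradient squared iteration, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                 # residual b - A x for the zero initial guess
    r_shadow = list(r)          # fixed shadow residual
    rho_old = 1.0
    p = [0.0] * n
    q = [0.0] * n
    for k in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        rho = dot(r_shadow, r)                      # first inner product
        if k == 0:
            u = list(r)
            p = list(u)
        else:
            beta = rho / rho_old
            u = [ri + beta * qi for ri, qi in zip(r, q)]
            p = [ui + beta * (qi + beta * pi) for ui, qi, pi in zip(u, q, p)]
        v = matvec(a, p)
        alpha = rho / dot(r_shadow, v)              # second inner product
        q = [ui - alpha * vi for ui, vi in zip(u, v)]
        w = [ui + qi for ui, qi in zip(u, q)]
        x = [xi + alpha * wi for xi, wi in zip(x, w)]
        aw = matvec(a, w)
        r = [ri - alpha * awi for ri, awi in zip(r, aw)]
        rho_old = rho
    return x


if __name__ == "__main__":
    # Small nonsymmetric system whose exact solution is [0.1, 0.6].
    print(cgs([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0]))
```

The sketch omits breakdown handling (a zero `rho` or a zero denominator in `alpha`), which a production solver would need.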

8.
A Knowledge-Based Ant Colony Optimization (KBACO) algorithm is proposed in this paper for the Flexible Job Shop Scheduling Problem (FJSSP). The KBACO algorithm provides an effective integration between the Ant Colony Optimization (ACO) model and a knowledge model. In the KBACO algorithm, the knowledge model learns some available knowledge from the optimization of ACO, and then applies the existing knowledge to guide the current heuristic search. The performance of KBACO was evaluated on a large range of benchmark instances, some taken from the literature and some generated by the authors. The final experimental results indicate that the proposed KBACO algorithm outperforms some current approaches in the quality of schedules.

9.
Image and Vision Computing, 2014, 32(6-7): 437-451
Concurrently obtaining an accurate, robust and fast global registration of multiple 3D scans is still an open issue for modern 3D modeling pipelines, especially when high metric precision as well as easy usage of high-end devices (structured-light or laser scanners) are required. Various solutions have been proposed (heuristic, iterative and/or closed-form), each of which compromises on some of these conflicting requirements. Our purpose here is to go a step beyond existing reference solutions by presenting a new technique that provides improved alignment performance, even on large datasets of range images (in terms of number of views and/or point density). Relying on the ‘Optimization-on-a-Manifold’ (OOM) approach, originally proposed by Krishnan et al., we propose a set of methodological and computational upgrades that improve accuracy, robustness and computational performance compared to the original solution. In particular, while still based on an unconstrained error minimization over the manifold of rotations, our algorithm does not rely on a static set of point correspondences; instead, it updates the optimization iterations with a dynamically modified set of correspondences in a computationally effective way, leading to substantial improvements in registration accuracy and convergence trend. Other proposed improvements are directed at a substantial reduction of the computational load without sacrificing alignment performance. Stress tests with increasing view misalignment allowed us to assess the convergence robustness of the proposed solution. Finally, we demonstrate that for very large datasets a further computational speedup can be reached by adopting a hybrid registration approach (local heuristic followed by global optimization).

10.
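To address the inability of existing single-view data-competition clustering algorithms to handle multi-view data efficiently, a multi-view data-competition clustering algorithm based on a view correlation factor is proposed. First, a view correlation factor is defined to describe the correlation between different views. Then, the view correlation factor is combined with the spectral method's objective-function maximization problem on the Laplacian matrix to build a joint objective function, so that information from different views interacts and the multi-view information is fully exploited. Solving the optimization problem of the joint objective function gives an optimized embedding matrix for each view. Finally, the resulting optimized embedding matrices are used in the data-competition clustering algorithm. Simulation results on synthetic and real data sets show that the new algorithm achieves higher clustering performance than existing data-competition clustering algorithms.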

11.
Matrix-based methods such as two-dimensional principal component analysis (2DPCA) and generalized low rank approximations of matrices (GLRAM) have gained wide attention from researchers due to their computational efficiency. In this paper, we propose a non-iterative algorithm for GLRAM. First, the optimality property of GLRAM is revealed, which is closely related to PCA. It is also shown that the reconstruction error of GLRAM is not smaller than that of PCA for the same dimensionality. Second, a non-iterative algorithm for GLRAM is derived. The proposed method obtains a smaller reconstruction error than 2DPCA or GLRAM. Finally, experimental results on face images and handwritten numeral characters show that the proposed method achieves results competitive with existing methods such as 2DPCA and PCA in terms of classification performance or reconstruction error.
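
For reference, the GLRAM problem is commonly stated as the following constrained least-squares problem. The abstract does not give its notation, so the symbols are assumptions: $A_1, \dots, A_n \in \mathbb{R}^{r \times c}$ are the data matrices, and $L$ and $R$ are the shared left and right projections with orthonormal columns.

```latex
\[
  \min_{L,\,R,\,\{M_i\}} \; \sum_{i=1}^{n} \bigl\| A_i - L M_i R^{T} \bigr\|_F^{2}
  \quad \text{subject to} \quad L^{T} L = I, \; R^{T} R = I .
\]
% For fixed L and R, the optimal core matrices are M_i = L^{T} A_i R.
```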

12.
Analysis of high-dimensional data in modern applications such as neuroscience, text mining, spectral analysis and chemometrics naturally requires tensor decomposition methods. Tucker decompositions allow us to extract hidden factors (component matrices) with different dimensions in each mode, and to investigate interactions among various modalities. Alternating least squares (ALS) algorithms have proven effective and efficient in most tensor decompositions, especially Tucker with orthogonality constraints. However, for nonnegative Tucker decomposition (NTD), standard ALS algorithms suffer from unstable convergence properties, demand high computational cost for large-scale problems due to matrix inversion, and often return suboptimal solutions. Moreover, they are quite sensitive to noise, and can be relatively slow in the special case when the data are nearly collinear. In this paper, we propose a new algorithm for nonnegative Tucker decomposition based on constrained minimization of a set of local cost functions and hierarchical alternating least squares (HALS). The developed NTD-HALS algorithm updates components sequentially, hence avoids matrix inversion and is suitable for large-scale problems. The proposed algorithm is also regularized with additional constraint terms such as sparseness, orthogonality, smoothness, and especially discriminant constraints. Extensive experiments confirm the validity and higher performance of the developed algorithm in comparison with other existing algorithms.

13.
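Compared with half-duplex relay systems, full-duplex relay systems can greatly improve spectrum utilization, but signal leakage between the relay's transmit and receive ends severely degrades the performance of full-duplex systems. To suppress self-interference in decode-and-forward full-duplex multiple-input multiple-output (MIMO) relay systems and increase the information transmission rate, a beamforming algorithm is proposed. The algorithm applies minimum mean square error (MMSE) based receive and transmit beamforming at the relay station and combines the two beamforming matrices in an iterative structure to obtain the optimal solution. Simulations show that, compared with the conventional null-space projection and maximum signal-to-interference ratio algorithms, the proposed algorithm effectively improves system performance. At medium and high signal-to-noise ratios (SNRs), it achieves a rate gain of about 0.8 bit/(s·Hz) over the maximum signal-to-interference ratio algorithm; when the bit error rate reaches 10^-3 or lower, it achieves an SNR gain of about 1.5 dB over the maximum signal-to-interference ratio algorithm.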

14.
Despite extensive research, optimal performance for matrix multiplication (especially for large matrices) has not been easy to obtain on most architectures because of the lack of a structured approach and the limitations imposed by matrix storage formats. A simple but effective framework is presented here that lays the foundation for building high-performance matrix-multiplication codes in a structured, portable and efficient manner. The resulting codes are validated on three different representative RISC and CISC architectures, on which they significantly outperform highly optimized libraries such as ATLAS and other competing methodologies reported in the literature. The main component of the proposed approach is a hierarchical storage format that efficiently generalizes the applicability of the memory-hierarchy-friendly Morton ordering to arbitrary-sized matrices. The storage format supports polyalgorithms, which are shown here to be essential for obtaining the best possible performance for a range of problem sizes. Several algorithmic advances are made in this paper, including an oscillating iterative algorithm for matrix multiplication and a variable recursion cutoff criterion for Strassen's algorithm. The authors expose the need to standardize linear algebra kernel interfaces, distinct from the BLAS, for writing portable high-performance code. These kernel routines operate on small blocks that fit in the L1 cache. The performance advantages of the proposed framework can be effectively delivered to new and existing applications through the use of object-oriented or compiler-based approaches. Copyright © 2002 John Wiley & Sons, Ltd.
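
The Morton (Z-order) ordering mentioned above can be sketched as a bit-interleaving of the row and column indices. The snippet below is only an illustration of that basic ordering for power-of-two sizes; it is not the paper's hierarchical storage format for arbitrary-sized matrices, and the function name `morton_index` and the choice of placing column bits in the even positions are assumptions.

```python
def morton_index(row, col, bits=16):
    """Interleave the low `bits` bits of row and col into one Z-order index.

    Column bits go to even bit positions and row bits to odd positions,
    so a 2 x 2 tile is laid out as (0,0), (0,1), (1,0), (1,1).
    """
    z = 0
    for k in range(bits):
        z |= ((col >> k) & 1) << (2 * k)
        z |= ((row >> k) & 1) << (2 * k + 1)
    return z


if __name__ == "__main__":
    # Storage positions of the elements of a 4 x 4 matrix under Morton ordering.
    for r in range(4):
        print([morton_index(r, c) for c in range(4)])
```

Because each aligned 2^k x 2^k tile occupies a contiguous index range, recursive algorithms such as Strassen's can operate on sub-blocks without strided access.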

15.
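Multi-threshold Image Segmentation Method Based on Spectral Clustering (cited by: 4 in total; self-citations: 4; citations by others: 0)
Thresholding is an important image segmentation method that is widely used in image processing and object recognition, so determining the threshold is the key to segmentation. A new image thresholding method is proposed in which the threshold is determined by a spectral clustering algorithm (Dcut) that uses a new similarity function. A weight matrix based on gray levels, rather than the usual weight matrix based on image pixels, is used to describe the relationships among pixels, so the storage requirement and implementation complexity are greatly reduced compared with other graph-based image segmentation methods. Experiments show that the method segments images quickly, supports both single-threshold and multi-threshold segmentation, and outperforms existing thresholding methods.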

16.
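The TFQMR algorithm is a Krylov subspace algorithm commonly used to solve large sparse systems of linear equations. By changing the computation order of the TFQMR algorithm, an improved TFQMR (ITFQMR) algorithm is proposed. Compared with TFQMR, ITFQMR has the same numerical stability and adds almost no computation, but it takes into account the performance of the parallel algorithm on MIMD parallel machines: its synchronization overhead is reduced to half that of TFQMR, and all inner products and matrix-vector products are independent, with no data dependencies, so computation and communication can be overlapped. The performance of ITFQMR is discussed from both theoretical and experimental perspectives; when the number of processors is large, ITFQMR is faster than TFQMR. Experiments on a 64-processor cluster show that the fastest parallel ITFQMR algorithm is about 20% faster than TFQMR.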

17.
The multiple signal classification (MUSIC) algorithm based on spatial time-frequency distribution (STFD) has been investigated for direction of arrival (DOA) estimation of closely-spaced sources. However, the bilinear time-frequency based MUSIC (TF-MUSIC) algorithm has two limitations: it suffers from heavy implementation complexity, and its performance strongly depends on appropriate selection of the auto-term locations of the sources in the time-frequency (TF) domain when forming the group of STFD matrices, which is difficult in practice, especially when the sources are spectrally overlapped. To relax these limitations, this paper develops a novel DOA estimation algorithm. Specifically, we build a MUSIC algorithm based on the spatial short-time Fourier transform (STFT), which effectively reduces the implementation cost. More importantly, we propose an efficient method to precisely select single-source auto-term locations for constructing the STFD matrices of each source. In addition to low complexity, the main advantage of the proposed STFT-MUSIC algorithm over some existing ones is that it can better deal with closely-spaced sources whose spectral contents are highly overlapped in the TF domain.

18.
Outliers and gross errors in training data sets can seriously degrade the performance of traditional supervised learning algorithms for feedforward neural networks. This is why several learning methods that are to some extent robust to outliers have been proposed. In this paper we present a new robust learning algorithm based on the iterative Least Median of Squares that outperforms some existing solutions in accuracy or speed. We demonstrate how to minimise the new non-differentiable performance function by a deterministic approximate method. Results of simulations and a comparison with other learning methods are presented. The improved robustness of our novel algorithm is shown for data sets with varying degrees of outliers.
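
To see why a median-based criterion is less sensitive to outliers than the usual mean of squared errors, the two criteria can be compared on the same residuals. The sketch below only contrasts the criteria; it does not reproduce the paper's iterative training procedure, and the function names are illustrative assumptions.

```python
from statistics import mean, median


def mean_of_squares(residuals):
    """Classical criterion: every residual, including outliers, contributes."""
    return mean([r * r for r in residuals])


def median_of_squares(residuals):
    """Least Median of Squares criterion: the middle squared residual."""
    return median([r * r for r in residuals])


if __name__ == "__main__":
    clean = [1.0, -1.0, 2.0, -2.0]
    with_outlier = [1.0, -1.0, 2.0, 100.0]   # one gross error
    print(mean_of_squares(clean), mean_of_squares(with_outlier))
    print(median_of_squares(clean), median_of_squares(with_outlier))
```

A single gross error changes the mean of squares by orders of magnitude while leaving the median of squares unchanged in this example. The price is that the median is not differentiable, which is why the abstract mentions a dedicated minimisation method.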

19.
Probabilistic design of LPV control systems (cited by: 1 in total; self-citations: 0; citations by others: 1)
This paper presents an alternative approach to the design of linear parameter-varying (LPV) control systems. In contrast to previous methods, which focus on deterministic algorithms, this paper is based on a probabilistic setting. The proposed randomized algorithm provides a sequence of candidate solutions converging with probability one to a feasible solution in a finite number of steps. The main features of this approach are as follows: (i) The randomized algorithm gives a method for general LPV plants with state space matrices depending on scheduling parameters in a nonlinear manner. That is, the probabilistic setting does not need a gridding of the set of scheduling parameters or approximations such as a linear fractional transformation of the state space matrices. (ii) The proposed algorithm is sequential and, at each iteration, it does not require heavy computational effort such as simultaneously solving a large number of linear matrix inequalities.

20.
By applying the hierarchical identification principle, a gradient-based iterative algorithm is suggested to solve a class of complex matrix equations. With the real representation of a complex matrix as a tool, the necessary and sufficient conditions on the convergence factor are determined that guarantee that the iterative solutions given by the proposed algorithm converge to the exact solution for any initial matrices. We also solve the problem posed by Wu et al. (2010). Finally, some numerical examples are provided to illustrate the effectiveness of the proposed algorithm and to verify the conclusions of this paper.
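
The flavour of a gradient-based iteration with a convergence factor can be shown on the simplest real case, AX = B. This is a deliberate simplification: the class of complex equations treated in the paper and its exact convergence condition are not given in the abstract. The sketch assumes a square nonsingular A and uses the update X ← X + μ Aᵀ(B − AX) with μ = 1/‖A‖_F², a conservative choice that satisfies the well-known sufficient condition 0 < μ < 2/λ_max(AᵀA). All function names are illustrative.

```python
def matmul(a, b):
    """Dense product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def gradient_iteration(a, b, steps=500):
    """Iterate X <- X + mu * A^T (B - A X), starting from X = 0."""
    # Convergence factor: 1 / ||A||_F^2 never exceeds 1 / lambda_max(A^T A).
    mu = 1.0 / sum(v * v for row in a for v in row)
    at = transpose(a)
    x = [[0.0] * len(b[0]) for _ in range(len(a[0]))]
    for _ in range(steps):
        ax = matmul(a, x)
        resid = [[bij - axij for bij, axij in zip(rb, rax)]
                 for rb, rax in zip(b, ax)]
        corr = matmul(at, resid)          # gradient direction A^T (B - A X)
        x = [[xij + mu * cij for xij, cij in zip(rx, rc)]
             for rx, rc in zip(x, corr)]
    return x


if __name__ == "__main__":
    a = [[2.0, 1.0], [1.0, 3.0]]
    b = [[4.0, 1.0], [7.0, 3.0]]          # equals A times [[1, 0], [2, 1]]
    print(gradient_iteration(a, b))
```

A larger μ within the admissible range converges faster; the fixed step count stands in for a proper stopping test on the residual.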
