Similar Documents
Found 20 similar documents (search time: 15 ms)
1.
Data co-clustering refers to the problem of simultaneously clustering two data types. Typically, the data is stored in a contingency or co-occurrence matrix C whose rows and columns represent the data types to be co-clustered. An entry C_ij of the matrix signifies the relation between the data type represented by row i and column j. Co-clustering is the problem of deriving sub-matrices from the larger data matrix by simultaneously clustering its rows and columns. In this paper, we present a novel graph-theoretic approach to data co-clustering. The two data types are modeled as the two vertex sets of a weighted bipartite graph. We then propose the Isoperimetric Co-clustering Algorithm (ICA), a new method for partitioning the bipartite graph. ICA requires only the solution of a sparse system of linear equations instead of the eigenvalue or SVD problem of the popular spectral co-clustering approach. Our theoretical analysis and extensive experiments on publicly available datasets demonstrate the advantages of ICA over other approaches in terms of quality, efficiency and stability in partitioning the bipartite graph.
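A rough sketch of the partitioning step the abstract describes (an illustrative reconstruction, not the authors' code: a dense NumPy solve stands in for the paper's sparse solver, and a simple median split stands in for the isoperimetric-ratio sweep):

```python
import numpy as np

def ica_bipartition(C, ground=0):
    # Build the weighted bipartite graph: rows of C are one vertex set,
    # columns the other; C[i, j] is the edge weight between them.
    m, n = C.shape
    A = np.block([[np.zeros((m, m)), C],
                  [C.T, np.zeros((n, n))]])
    d = A.sum(axis=1)                  # vertex degrees
    L = np.diag(d) - A                 # graph Laplacian
    # "Ground" one vertex so the Laplacian system becomes nonsingular,
    # then solve the linear system in place of an eigenproblem.
    keep = np.arange(m + n) != ground
    x = np.zeros(m + n)
    x[keep] = np.linalg.solve(L[np.ix_(keep, keep)], d[keep])
    # Median split of the potentials (the paper sweeps thresholds for the
    # minimal isoperimetric ratio) yields the two co-clusters.
    return x > np.median(x)

# Two planted co-clusters plus weak cross-links to keep the graph connected.
C = 5 * np.kron(np.eye(2), np.ones((2, 2))) + 0.2
part = ica_bipartition(C)              # first 4 entries: rows; last 4: columns
```

On this toy matrix the split recovers the planted blocks: rows 0-1 land with columns 0-1, rows 2-3 with columns 2-3.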

2.
Conclusion Our experiments establish the sufficient efficiency of the g.r.p. for MCP (WMCP), and in many cases show it to be superior in running time to the branch-and-bound method for this problem. The analytical bounds given above point to stable behavior of the g.r.p. Note that in the theoretical arguments of the previous section we have assumed that in the generated resolvent column at least one 1 is distributed with probability of order 1/k among the rows in a covering with k or fewer rows. Of two rows α, β from the sought covering such that α contains a greater number of 1s than β, α obviously has a higher probability of getting a 1 in the resolvent column than β. However, each time the "greedy" algorithm prefers row α over row β (includes α in the covering and does not include β), a cutting column is constructed by Theorem 2 in which α is guaranteed to contain 0. Thus, the more times row α is selected without row β, the more additional 1s accumulate in row β. This suggests that the numbers of 1s in rows α and β tend to equalize. The argument extends to any pair of rows in the sought covering, and is thus valid on average. Translated from Kibernetika i Sistemnyi Analiz, No. 1, pp. 135–146, January–February, 1996.
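The greedy step the argument revolves around, preferring the row that covers the most still-uncovered columns, can be sketched for the plain (unweighted) minimum row-cover problem; this illustrates only the greedy selection, not the paper's g.r.p. with resolvent columns:

```python
def greedy_row_cover(M):
    # Greedy heuristic for covering all columns of a 0-1 matrix with rows:
    # repeatedly pick the row that covers the most uncovered columns.
    n_rows, n_cols = len(M), len(M[0])
    uncovered = set(range(n_cols))
    cover = []
    while uncovered:
        best = max(range(n_rows),
                   key=lambda r: sum(1 for c in uncovered if M[r][c]))
        gained = {c for c in uncovered if M[best][c]}
        if not gained:          # some column is uncoverable; stop
            break
        cover.append(best)
        uncovered -= gained
    return cover

M = [[1, 1, 0, 0],
     [0, 0, 1, 1],
     [1, 0, 1, 0]]
cover = greedy_row_cover(M)     # rows 0 and 1 cover all four columns
```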

3.
4.
赵鸿图  霍江波 《测控技术》2018,37(9):126-130
When applying compressed sensing to images, the peak signal-to-noise ratio (PSNR) of the reconstructed image differs depending on whether sensing is performed along rows or along columns. To improve the quality of compressed-sensing reconstruction, a row/column selection algorithm for image compressed sensing under single-level wavelet decomposition is proposed. The algorithm first computes the maximum deviation of the relative variance of the image's row data and column data, and selects the direction (rows or columns) with the smaller value as the object of compressed sensing. The image is then decomposed by a single-level wavelet transform to extract the high-frequency coefficients, which are compressively sensed along the chosen rows or columns under a Gaussian measurement matrix. Finally, the orthogonal matching pursuit (OMP) algorithm recovers the compressively sensed high-frequency coefficients, and the inverse wavelet transform yields the reconstructed image. Experimental results confirm the accuracy of the algorithm.
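The OMP recovery step named in the abstract can be sketched on a generic sparse vector (the function name, problem sizes and test signal are illustrative; the paper applies OMP to the high-frequency wavelet coefficients of an image):

```python
import numpy as np

def omp(Phi, y, k):
    # Orthogonal Matching Pursuit: greedily pick the dictionary atom most
    # correlated with the residual, then re-fit on the chosen support.
    residual, support = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(Phi.T @ residual)))   # best-correlated atom
        support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef          # orthogonalised residual
    x = np.zeros(Phi.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
Phi = rng.standard_normal((40, 64)) / np.sqrt(40)      # Gaussian measurement matrix
x_true = np.zeros(64)
x_true[[5, 20, 41]] = [1.5, -2.0, 0.7]                 # a 3-sparse coefficient vector
x_hat = omp(Phi, Phi @ x_true, k=3)
```

With 40 Gaussian measurements of a 3-sparse vector, OMP typically recovers the support and values exactly in the noiseless case.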

5.
A two-dimensional mesh of processing elements (PEs) with separable row and column buses (i.e., broadcast mechanisms for rows and columns that can be logically divided into a number of local buses through the use of PE-controlled switches) has been shown to be quite effective for semigroup computation, prefix computation, and a wide class of other computations that do not require excessive communication or data routing. For meshes with separable row/column buses, the authors show how semigroup and prefix computations can be performed with the same asymptotic time complexity without the provision of buses for every row and every column, and discuss the VLSI implications of this new architecture.

6.
Analysis of the causes and characteristics of data loss in UDP-based image transmission shows that, because images are transmitted row by row, the image damage caused by lost data also occurs row by row. Based on the principle that the gray values of neighboring pixels are strongly correlated, a column-wise processing method is proposed: for each column, a prediction method estimates the values lost in each row, thereby repairing the whole image. Borrowing the idea of grey prediction, a grey-prediction method that restores the image column by column is presented. The algorithm essentially satisfies the requirements of image restoration, and its conditions of applicability are established through experiments.
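The grey-prediction idea the abstract borrows is conventionally the GM(1,1) model; a minimal sketch, applied to one hypothetical pixel column (illustrative only, not the paper's exact restoration procedure):

```python
import numpy as np

def gm11_predict(x, steps=1):
    # GM(1,1) grey prediction: fit an exponential to the accumulated
    # series, then difference it back to forecast the next raw values.
    x = np.asarray(x, float)
    n = len(x)
    x1 = np.cumsum(x)                        # accumulated generating series
    z = 0.5 * (x1[1:] + x1[:-1])             # mean sequence of x1
    B = np.column_stack([-z, np.ones(n - 1)])
    a, b = np.linalg.lstsq(B, x[1:], rcond=None)[0]   # fit x(k) = -a*z(k) + b

    def x1_hat(k):                           # fitted accumulated value at index k
        return (x[0] - b / a) * np.exp(-a * k) + b / a

    return [x1_hat(n + i) - x1_hat(n + i - 1) for i in range(steps)]

# A gently increasing pixel column; predict the next (lost) value.
column = [100, 104, 108, 113, 117]
pred = gm11_predict(column)[0]               # continues the ~4%/step trend
```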

7.
Earth scientists generally handle large arrays of multivariate data (tabular form) which have to be entered into a computer for further processing. This paper is concerned with a user-friendly screen data editor named DAINTY, developed by the authors to serve the needs of data entry and printing. Whereas column operations are required for tabular data, text data entry requires row operations. DAINTY provides a number of column operations which generally are not present in commercial editors. These operations are discussed.

8.
9.
A Survey of Key Technologies of Column-Store Databases   (Cited by: 5; self-citations: 0, citations by others: 5)
With the development of Internet technology, continual hardware upgrades, and the deepening informatization of enterprises and government, applications have grown ever more complex, pushing data storage technology toward massive, analytical and intelligent data in order to provide efficient, real-time support for data warehousing and online analysis. Row-store database technology faces new problems and has reached a technical bottleneck. In recent years a new storage concept has emerged: the column-oriented relational database (hereafter, column database). Column databases have developed rapidly mainly because of their efficient complex queries, reduced disk reads, smaller storage footprint, and the technical, administrative and application advantages these bring. This paper introduces and analyzes the current state of column database technology, its key supporting technologies, and its application advantages.
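The core read advantage the survey attributes to column stores can be seen in a toy comparison (pure illustration with Python lists, no real storage engine involved):

```python
# A 1000-row table with three attributes, stored both ways.
rows = [(i, 2 * i, i % 7) for i in range(1000)]        # row-oriented layout
cols = {"a": [r[0] for r in rows],                     # column-oriented layout
        "b": [r[1] for r in rows],
        "c": [r[2] for r in rows]}

# Analytic query: SUM(b). The row layout must touch every full row,
# while the column layout reads only the one column it needs,
# roughly a third of the data here (and far less after compression).
row_scan = sum(r[1] for r in rows)
col_scan = sum(cols["b"])
```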

10.
In small STN-LCDs for portable applications, rows and columns are driven by one IC. The LC supply voltages are generated on-chip from the battery voltage by voltage multiplying. The total LC supply voltage should be as low as possible to minimize the accompanying power losses. By using multiple-row addressing, the row and maximum column voltages can be made equal, leading to a minimum LC supply voltage. This occurs when the number of simultaneously addressed rows is equal to the square root of the number of rows in the panel. The LC supply voltage may be minimized further by using a liquid crystal which allows multiplexing of more rows than are actually present in the display panel, while at the same time fewer simultaneously addressed rows are required.

11.
In practice there are many entities whose meaning can only be fully represented by several related records in a relational database, and these records typically stand in computational relationships along both rows and columns; traditional methods cannot implement such entities well. To address this situation, the concepts of a logical entity and its row-column template are proposed, relation schemas for logical entities and row-column templates are designed, an implementation of the row-column template is given, and a concrete example illustrates its application in detail.

12.
By studying the characteristics of column-store technology, a design for a hybrid row-column storage database system is proposed. The design places independent row-store and column-store engines at the storage layer, uses early materialization to convert column tables into row tables after the data is read, and completes subsequent processing in row form. The approach thus gains the read advantage of column storage while reusing the mature components of a row-oriented database system, reducing development risk and complexity. A prototype implementation and tests based on PostgreSQL demonstrate the feasibility and effectiveness of the design.

13.
A subspace identification method is discussed that deals with multivariable linear parameter-varying state-space systems with affine parameter dependence. It is shown that a major problem with subspace methods for this kind of system is the enormous dimension of the data matrices involved. To overcome the curse of dimensionality, we suggest using only the most dominant rows of the data matrices in estimating the model. An efficient selection algorithm is discussed that does not require the formation of the complete data matrices, but processes them row by row.

14.
Time series data mining (TSDM) techniques permit exploring large amounts of time series data in search of consistent patterns and/or interesting relationships between variables. TSDM is becoming increasingly important as a knowledge management tool, where it is expected to reveal knowledge structures that can guide decision making under limited certainty. Human decision making in problems related to the analysis of time series databases is usually based on perceptions like "end of the day", "high temperature", "quickly increasing", "possible", etc. Though many effective TSDM algorithms have been developed, the integration of TSDM algorithms with human decision-making procedures is still an open problem. In this paper, we consider the architecture of a perception-based decision-making system for time series database domains, integrating perception-based TSDM, computing with words and perceptions, and expert knowledge. The new tasks which should be solved by perception-based TSDM methods to enable their integration in such systems are discussed. These tasks include precisiation of perceptions, shape pattern identification, and pattern retranslation. We show how different methods developed so far in TSDM for manipulating perception-based information can be used to develop a fuzzy perception-based TSDM approach. This approach is grounded in computing with words and perceptions, which permits formalizing human perception-based inference mechanisms. The discussion is illustrated by examples from economics, finance, meteorology, medicine, etc.

15.
This paper presents an approach for extracting and segmenting tables from Chinese ink documents based on a matrix model. An ink document is first modeled as a matrix containing ink rows, both writing rows and drawing rows. Each row consists of collinear ink lines containing ink characters. Adjacent writing rows that have an identical distribution of writing lines and/or the same associated drawing rows (if available) are extracted, together with their associated drawing rows, to form a table. Row and column headers, nested sub-headers and cells are identified. Experiments demonstrate that the proposed approach is effective and robust.

16.
The storage model of HBase and the parallel processing mechanism of Spark are analyzed, and a method for distributed storage, indexing and parallel region queries over vector spatial data is proposed. A row-key scheme based on the center point of each spatial object is designed: the Hilbert code of the center point is combined with the decimal digits of its longitude and latitude to make row keys unique, ensuring that geographically close features are stored in adjacent rows of the table. Parallel spatial index construction and region queries are implemented on Spark: the index is built quickly from the Hilbert codes of the object center points, and query results are filtered through the minimum bounding rectangle of the query polygon. Experimental results show that parallel index construction is fast and reliable, and that the parallel region-query algorithm is feasible and efficient.
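A sketch of the row-key idea, the Hilbert code of a feature's center point with the coordinate decimals appended for uniqueness (the grid order, key format and function names here are assumptions for illustration, not the paper's specification):

```python
def hilbert_index(order, x, y):
    # Map grid cell (x, y) to its index along a Hilbert curve of the given
    # order (standard iterative bit-manipulation formulation).
    d = 0
    s = 1 << (order - 1)
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                       # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

def row_key(lon, lat, order=16):
    # Hypothetical HBase row key: zero-padded Hilbert code of the center
    # point, plus the raw coordinate decimals to guarantee uniqueness.
    n = 1 << order
    gx = int((lon + 180.0) / 360.0 * (n - 1))
    gy = int((lat + 90.0) / 180.0 * (n - 1))
    return f"{hilbert_index(order, gx, gy):010d}_{lon:.6f}_{lat:.6f}"
```

Because row keys sort lexicographically in HBase, features whose center points are close on the Hilbert curve end up in adjacent rows, which is the locality property the abstract relies on.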

17.
The “Petlyuk” or “dividing-wall” or “fully thermally coupled” distillation column is an interesting alternative to the conventional cascaded binary columns for separation of multi-component mixtures. However, the industrial use has been limited, and difficulties in operation have been reported as one reason. With three product compositions controlled, the system has two degrees of freedom left for on-line optimization. We show that the steady-state optimal solution surface is quite narrow, and depends strongly on disturbances and design parameters. Thus it seems difficult to achieve the potential energy savings compared to conventional approaches without a good control strategy. We discuss candidate variables which may be used as feedback variables in order to keep the column operation close to optimal in a “self-optimizing” control scheme.

18.
Co-clustering algorithms cluster the rows and columns of a data matrix simultaneously. This paper introduces two-level weighting into co-clustering and proposes a two-level subspace-weighted co-clustering algorithm (TLWCC). TLWCC places one level of weights on the co-clusters and a second level on rows and columns, and computes all three groups of weights (block, row and column) automatically during iteration. TLWCC considers the distance of each block, row and column from the corresponding block, row and column center: the larger the distance, the stronger the noise is assumed to be and the smaller the assigned weight; conversely, weaker noise receives a larger weight. By down-weighting noisy information, TLWCC effectively reduces the interference of noise and improves clustering quality. Four groups of experiments examine TLWCC's ability to identify noisy information, the influence of parameter choices on the clustering results, and the algorithm's clustering and runtime performance.

19.
Transposable data represents interactions among two sets of entities, and is typically represented as a matrix containing the known interaction values. Additional side information may consist of feature vectors specific to the entities corresponding to the rows and/or columns of such a matrix. Further information may also be available in the form of interactions or hierarchies among entities along the same mode (axis). We propose a novel approach for modeling transposable data with missing interactions given additional side information. The interactions are modeled as noisy observations from a latent noise-free matrix generated from a matrix-variate Gaussian process. The construction of row and column covariances using side information provides a flexible mechanism for specifying a priori knowledge of the row and column correlations in the data. Further, the use of such a prior combined with the side information enables predictions for new rows and columns not observed in the training data. In this work, we combine the matrix-variate Gaussian process model with low-rank constraints. The constrained Gaussian process approach is applied to the prediction of hidden associations between genes and diseases using a small set of observed associations as well as prior covariances induced by gene-gene interaction networks and disease ontologies. The proposed approach is also applied to recommender systems data, which involves predicting the item ratings of users using known associations as well as prior covariances induced by social networks. We present experimental results that highlight the performance of the constrained matrix-variate Gaussian process as compared to state-of-the-art approaches in each domain.
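The prediction mechanism can be sketched on a 2x2 matrix with one missing entry (the covariances here are toy values standing in for side information; the paper additionally imposes low-rank constraints, which are omitted):

```python
import numpy as np

# Row and column covariances built from (hypothetical) side information:
# the two rows are strongly related, as are the two columns.
K_row = np.array([[1.0, 0.9], [0.9, 1.0]])
K_col = np.array([[1.0, 0.8], [0.8, 1.0]])
K = np.kron(K_row, K_col)             # covariance of vec(M) under the model

obs, miss = [0, 1, 2], [3]            # indices into vec(M); entry 3 is missing
y = np.array([1.0, 0.8, 0.9])         # the observed interactions
K_oo = K[np.ix_(obs, obs)] + 1e-4 * np.eye(3)   # small observation-noise term
K_mo = K[np.ix_(miss, obs)]

# Standard GP posterior mean for the missing entry given the observed ones.
pred = (K_mo @ np.linalg.solve(K_oo, y))[0]
```

The Kronecker structure is what lets row side information and column side information jointly determine the correlation of every pair of matrix entries.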

20.
In this paper, we propose a method for automatic detection of texture periodicity using superposition of distance matching functions (DMFs) followed by computation of their forward differences. The method has been specifically devised for automatically identifying row and column periodicities, and thereby the size of the periodic units, from textile fabrics belonging to any of the 17 wallpaper groups, and is part of an automatic fabric defect detection scheme being developed by us that needs periodicities along the row and column directions. The overall row-DMF (or overall column-DMF) is obtained by superposing the DMFs of all rows (or columns) of the input image, and its second forward difference is computed to get the overall maximum, which is a direct measure of periodicity along the row (or column) direction. Results from experiments on various near-regular textures demonstrate the capability of the proposed method for automatic periodicity extraction without the need for human intervention.
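A simplified sketch of the overall column-DMF superposition (for brevity the period is read off as the first minimum of the DMF, rather than via the second forward difference the paper uses):

```python
import numpy as np

def overall_col_dmf(img):
    # Superpose the distance matching functions of all rows: for each
    # column shift d, sum |img - (img shifted by d columns)| over the image.
    h, w = img.shape
    return np.array([np.abs(img[:, :w - d] - img[:, d:]).sum()
                     for d in range(1, w // 2)])

# A texture whose columns repeat with period 4: the DMF dips to zero at
# shifts 4, 8, ... and the first dip gives the periodic-unit width.
img = np.tile(np.array([[0, 1, 2, 1]]), (8, 6))   # 8 rows, 24 columns
dmf = overall_col_dmf(img)                        # dmf[d - 1] = mismatch at shift d
period = int(np.argmin(dmf)) + 1
```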


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号