首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
In this paper, we consider the k-prize-collecting minimum vertex cover problem with submodular penalties, which generalizes the well-known minimum vertex cover problem, minimum partial vertex cover problem and minimum vertex cover problem with submodular penalties. We are given a cost graph G=(V,E;c) and an integer k. This problem determines a vertex set SV such that S covers at least k edges. The objective is to minimize the total cost of the vertices in S plus the penalty of the uncovered edge set, where the penalty is determined by a submodular function. We design a two-phase combinatorial algorithm based on the guessing technique and the primal-dual framework to address the problem. When the submodular penalty cost function is normalized and nondecreasing, the proposed algorithm has an approximation factor of 3. When the submodular penalty cost function is linear, the approximation factor of the proposed algorithm is reduced to 2, which is the best factor if the unique game conjecture holds.  相似文献   

2.
With the increasing amount of data, there is an urgent need for efficient sorting algorithms to process large data sets. Hardware sorting algorithms have attracted much attention because they can take advantage of different hardware’s parallelism. But the traditional hardware sort accelerators suffer “memory wall” problems since their multiple rounds of data transmission between the memory and the processor. In this paper, we utilize the in-situ processing ability of the ReRAM crossbar to design a new ReCAM array that can process the matrix-vector multiplication operation and the vector-scalar comparison in the same array simultaneously. Using this designed ReCAM array, we present ReCSA, which is the first dedicated ReCAM-based sort accelerator. Besides hardware designs, we also develop algorithms to maximize memory utilization and minimize memory exchanges to improve sorting performance. The sorting algorithm in ReCSA can process various data types, such as integer, float, double, and strings. We also present experiments to evaluate the performance and energy efficiency against the state-of-the-art sort accelerators. The experimental results show that ReCSA has 90.92×, 46.13×, 27.38×, 84.57×, and 3.36× speedups against CPU-, GPU-, FPGA-, NDP-, and PIM-based platforms when processing numeric data sets. ReCSA also has 24.82×, 32.94×, and 18.22× performance improvement when processing string data sets compared with CPU-, GPU-, and FPGA-based platforms.  相似文献   

3.
On-line transaction processing (OLTP) systems rely on transaction logging and quorum-based consensus protocol to guarantee durability, high availability and strong consistency. This makes the log manager a key component of distributed database management systems (DDBMSs). The leader of DDBMSs commonly adopts a centralized logging method to writing log entries into a stable storage device and uses a constant log replication strategy to periodically synchronize its state to followers. With the advent of new hardware and high parallelism of transaction processing, the traditional centralized design of logging limits scalability, and the constant trigger condition of replication can not always maintain optimal performance under dynamic workloads. In this paper, we propose a new log manager named Salmo with scalable logging and adaptive replication for distributed database systems. The scalable logging eliminates centralized contention by utilizing a highly concurrent data structure and speedy log hole tracking. The kernel of adaptive replication is an adaptive log shipping method, which dynamically adjusts the number of log entries transmitted between leader and followers based on the real-time workload. We implemented and evaluated Salmo in the open-sourced transaction processing systems Cedar and DBx1000. Experimental results show that Salmo scales well by increasing the number of working threads, improves peak throughput by 1.56× and reduces latency by more than 4× over log replication of Raft, and maintains efficient and stable performance under dynamic workloads all the time.  相似文献   

4.
Block synchronization is an essential component of blockchain systems. Traditionally, blockchain systems tend to send all the transactions from one node to another for synchronization. However, such a method may lead to an extremely high network bandwidth overhead and significant transmission latency. It is crucial to speed up such a block synchronization process and save bandwidth consumption. A feasible solution is to reduce the amount of data transmission in the block synchronization process between any pair of peers. However, existing methods based on the Bloom filter or its variants still suffer from multiple roundtrips of communications and significant synchronization delay. In this paper, we propose a novel protocol named Gauze for fast block synchronization. It utilizes the Cuckoo filter (CF) to discern the transactions in the receiver’s mempool and the block to verify, providing an efficient solution to the problem of set reconciliation in the P2P (Peer-to-Peer Network) network. By up to two rounds of exchanging and querying the CFs, the sending node can acknowledge whether the transactions in a block are contained by the receiver’s mempool or not. Based on this message, the sender only needs to transfer the missed transactions to the receiver, which speeds up the block synchronization and saves precious bandwidth resources. The evaluation results show that Gauze outperforms existing methods in terms of the average processing latency (about 10× lower than Graphene) and the total synchronization space cost (about 10× lower than Compact Blocks) in different scenarios.  相似文献   

5.
刘帅  蒋林  李远成  山蕊  朱育琳  王欣 《计算机应用》2022,42(5):1524-1530
针对大规模多输入多输出(MIMO)系统中,最小均方误差(MMSE)检测算法在可重构阵列结构上适应性差、计算复杂度高和运算效率低的问题,基于项目组开发的可重构阵列处理器,提出了一种基于MMSE算法的并行映射方法。首先,利用Gram矩阵计算时较为简单的数据依赖关系,设计时间上和空间上可以高度并行的流水线加速方案;其次,根据MMSE算法中Gram矩阵计算和匹配滤波计算模块相对独立的特点,设计模块化并行映射方案;最后,基于Xilinx Virtex-6开发板对映射方案进行实现并统计其性能。实验结果表明,该方法在MIMO规模为128×4128×8128×16的正交相移键控(QPSK)上行链路中,加速比分别2.80、4.04和5.57;在128×16的大规模MIMO系统中,可重构阵列处理器比专用硬件减少了42.6%的资源消耗。  相似文献   

6.
王梅  许传海  刘勇 《计算机应用》2021,41(12):3462-3467
多核学习方法是一类重要的核学习方法,但大多数多核学习方法存在如下问题:多核学习方法中的基核函数大多选择传统的具有浅层结构的核函数,在处理数据规模大且分布不平坦的问题时表示能力较弱;现有的多核学习方法的泛化误差收敛率大多为O1/n,收敛速度较慢。为此,提出了一种基于神经正切核(NTK)的多核学习方法。首先,将具有深层次结构的NTK作为多核学习方法的基核函数,从而增强多核学习方法的表示能力。然后,根据主特征值比例度量证明了一种收敛速率可达O1/n的泛化误差界;在此基础上,结合核对齐度量设计了一种全新的多核学习算法。最后,在多个数据集上进行了实验,实验结果表明,相比Adaboost和K近邻(KNN)等分类算法,新提出的多核学习算法具有更高的准确率和更好的表示能力,也验证了所提方法的可行性与有效性。  相似文献   

7.
At ToSC 2019, Ankele et al. proposed a novel idea for constructing zero-correlation linear distinguishers in a related-tweakey model. This paper further clarifies this principle and gives a search model for zero-correlation distinguishers. As a result, for the first time, the authors construct 15-round and 17-round zero-correlation linear distinguishers for SKINNY-n-2n and SKINNY-n-3n, respectively, which are both two rounds longer than Anekele et al.’s. Based on these distinguishers, the paper presents related-tweakey zero-correlation linear attacks on 22-round SKINNY-n-2n and 26-round SKINNY-n-3n, respectively.  相似文献   

8.
蒋楚钰  方李西  章宁  朱建明 《计算机应用》2022,42(11):3438-3443
针对公证人机制中存在的公证人节点职能集中以及跨链交易效率较低等问题,提出一种基于公证人组的跨链交互安全模型。首先,将公证人节点分为三类角色,即交易验证者、连接者和监督者,由交易验证组成员打包经过共识的多笔交易成一笔大的交易,并利用门限签名技术对它进行签名;其次,被确认的交易会被置于跨链待转账池中,连接者随机选取多笔交易,利用安全多方计算和同态加密等技术判断交易的真实性;最后,若打包所有符合条件的交易的哈希值真实可靠且被交易验证组验证过,则连接者可以继续执行多笔跨链交易的批处理任务,并与区块链进行信息交互。安全性分析表明,该跨链机制有助于保护信息的机密性和数据的完整性,实现数据在不出库的情况下的协同计算,保障区块链跨链系统的稳定性。与传统的跨链交互安全模型相比,所提模型的签名次数和需要分配公证人组数的复杂度从O(n)降为O(1)。  相似文献   

9.
针对Blow-CAST-Fish算法攻击轮数有限和复杂度高等问题,提出一种基于差分表的Blow-CAST-Fish算法的密钥恢复攻击。首先,对S盒的碰撞性进行分析,分别基于两个S盒和单个S盒的碰撞,构造6轮和12轮差分特征;然后,计算轮函数f3的差分表,并在特定差分特征的基础上扩充3轮,从而确定密文差分与f3的输入、输出差分的关系;最后,选取符合条件的明文进行加密,根据密文差分计算f3的输入、输出差分值,并查寻差分表找到对应的输入、输出对,从而获取子密钥。在两个S盒碰撞的情况下,所提攻击实现了9轮Blow-CAST-Fish算法的差分攻击,比对比攻击多1轮,时间复杂度由2107.9降低到274;而在单个S盒碰撞的情况下,所提攻击实现了15轮Blow-CAST-Fish算法的差分攻击,与对比攻击相比,虽然攻击轮数减少了1轮,但弱密钥比例由2-52.4提高到2-42,数据复杂度由254降低到247。测试结果表明,在相同差分特征基础上,基于差分表的攻击的攻击效率更高。  相似文献   

10.
A k-CNF (conjunctive normal form) formula is a regular (k, s)-CNF one if every variable occurs s times in the formula, where k≥2 and s>0 are integers. Regular (3, s)- CNF formulas have some good structural properties, so carrying out a probability analysis of the structure for random formulas of this type is easier than conducting such an analysis for random 3-CNF formulas. Some subclasses of the regular (3, s)-CNF formula have also characteristics of intractability that differ from random 3-CNF formulas. For this purpose, we propose strictly d-regular (k, 2s)-CNF formula, which is a regular (k, 2s)-CNF formula for which d≥0 is an even number and each literal occurs sd2 or s+d2 times (the literals from a variable x are x and ¬x, where x is positive and ¬x is negative). In this paper, we present a new model to generate strictly d-regular random (k, 2s)-CNF formulas, and focus on the strictly d-regular random (3, 2s)-CNF formulas. Let F be a strictly d-regular random (3, 2s)-CNF formula such that 2s>d. We show that there exists a real number s0 such that the formula F is unsatisfiable with high probability when s>s0, and present a numerical solution for the real number s0. The result is supported by simulated experiments, and is consistent with the existing conclusion for the case of d= 0. Furthermore, we have a conjecture: for a given d, the strictly d-regular random (3, 2s)-SAT problem has an SAT-UNSAT (satisfiable-unsatisfiable) phase transition. Our experiments support this conjecture. Finally, our experiments also show that the parameter d is correlated with the intractability of the 3-SAT problem. Therefore, our research maybe helpful for generating random hard instances of the 3-CNF formula.  相似文献   

11.
In this paper, we propose a lightweight network with an adaptive batch normalization module, called Meta-BN Net, for few-shot classification. Unlike existing few-shot learning methods, which consist of complex models or algorithms, our approach extends batch normalization, an essential part of current deep neural network training, whose potential has not been fully explored. In particular, a meta-module is introduced to learn to generate more powerful affine transformation parameters, known as γ and β, in the batch normalization layer adaptively so that the representation ability of batch normalization can be activated. The experimental results on miniImageNet demonstrate that Meta-BN Net not only outperforms the baseline methods at a large margin but also is competitive with recent state-of-the-art few-shot learning methods. We also conduct experiments on Fewshot-CIFAR100 and CUB datasets, and the results show that our approach is effective to boost the performance of weak baseline networks. We believe our findings can motivate to explore the undiscovered capacity of base components in a neural network as well as more efficient few-shot learning methods.  相似文献   

12.
Local holographic transformations were introduced by Cai et al., and local affine functions, an extra tractable class, were derived by it in #CSP2. In the present paper, we not only generalize local affine functions to #CSPd for general d, but also give new tractable classes by combining local holographic transformations with global holographic transformations. Moreover, we show how to use local holographic transformations to prove hardness. This is of independent interests in the complexity classification of counting problems.  相似文献   

13.
Secure k-Nearest Neighbor (k-NN) query aims to find k nearest data of a given query from an encrypted database in a cloud server without revealing privacy to the untrusted cloud and has wide applications in many areas, such as privacy-preserving machine learning and secure biometric identification. Several solutions have been put forward to solve this challenging problem. However, the existing schemes still suffer from various limitations in terms of efficiency and flexibility. In this paper, we propose a new encrypt-then-index strategy for the secure k-NN query, which can simultaneously achieve sub-linear search complexity (efficiency) and support dynamical update over the encrypted database (flexibility). Specifically, we propose a novel algorithm to transform the encrypted database and encrypted query points in the cloud. By indexing the transformed database using spatial data structures such as the R-tree index, our strategy enables sub-linear complexity for secure k-NN queries and allows users to dynamically update the encrypted database. To the best of our knowledge, the proposed strategy is the first to simultaneously provide these two properties. Through theoretical analysis and extensive experiments, we formally prove the security and demonstrate the efficiency of our scheme.  相似文献   

14.
为评价阴影消除植被指数(Shadow-Eliminated Vegetation Index,SEVI)对常用十米级不同空间分辨率遥感影像的地形阴影消除效果,采用2019年1月24~25日过境的Sentinel S2B(10 m)、GF-1(16 m)、Landsat 8 OLI(30 m)、GF-4(50 m)4种空...  相似文献   

15.
Combinatorial optimization in the face of uncertainty is a challenge in both operational research and machine learning. In this paper, we consider a special and important class called the adversarial online combinatorial optimization with semi-bandit feedback, in which a player makes combinatorial decisions and gets the corresponding feedback repeatedly. While existing algorithms focus on the regret guarantee or assume there exists an efficient offline oracle, it is still a challenge to solve this problem efficiently if the offline counterpart is NP-hard. In this paper, we propose a variant of the Followthe-Perturbed-Leader (FPL) algorithm to solve this problem. Unlike the existing FPL approach, our method employs an approximation algorithm as an offline oracle and perturbs the collected data by adding nonnegative random variables. Our approach is simple and computationally efficient. Moreover, it can guarantee a sublinear (1+ ε)-scaled regret of order O(T23) for any small ε>0 for an important class of combinatorial optimization problems that admit an FPTAS (fully polynomial time approximation scheme), in which Tis the number of rounds of the learning process. In addition to the theoretical analysis, we also conduct a series of experiments to demonstrate the performance of our algorithm.  相似文献   

16.
The flourish of deep learning frameworks and hardware platforms has been demanding an efficient compiler that can shield the diversity in both software and hardware in order to provide application portability. Among the existing deep learning compilers, TVM is well known for its efficiency in code generation and optimization across diverse hardware devices. In the meanwhile, the Sunway many-core processor renders itself as a competitive candidate for its attractive computational power in both scientific computing and deep learning workloads. This paper combines the trends in these two directions. Specifically, we propose swTVM that extends the original TVM to support ahead-of-time compilation for architecture requiring cross-compilation such as Sunway. In addition, we leverage the architecture features during the compilation such as core group for massive parallelism, DMA for high bandwidth memory transfer and local device memory for data locality, in order to generate efficient codes for deep learning workloads on Sunway. The experiment results show that the codes generated by swTVM achieve 1.79× improvement of inference latency on average compared to the state-of-the-art deep learning framework on Sunway, across eight representative benchmarks. This work is the first attempt from the compiler perspective to bridge the gap of deep learning and Sunway processor particularly with productivity and efficiency in mind. We believe this work will encourage more people to embrace the power of deep learning and Sunway many-core processor.  相似文献   

17.
张雪峰 《自动化学报》2020,46(5):1061-1062
《自动化学报》39卷12期的"分数阶奇异系统容许性的充分必要条件"得到了基于线性矩阵不等式的分数阶广义系统容许性充分必要条件.本文给出一个数值反例表明文献[1]中定理1的充分条件结论并不成立, 必要条件也不准确.最后, 给出了修改正确的分数阶广义系统容许性充分必要条件.相比于文献[1]的定理1, 改进后充要条件没有保守性并去除了原文中分数阶广义系统正则性的限制要求.  相似文献   

18.
传统的聚类方法是在数据空间进行,且聚类数据的维度较高.为了解决这两个问题,提出了一种新的二进制图像聚类方法——基于离散哈希的聚类(CDH).该框架通过L21范数实现自适应的特征选择,从而降低数据的维度;同时通过哈希方法将数据映射到二进制的汉明空间,随后,在汉明空间中对稀疏的二进制矩阵进行低秩矩阵分解,完成图像的快速聚类...  相似文献   

19.
周玉彬  肖红  王涛  姜文超  熊梦  贺忠堂 《计算机应用》2021,41(11):3192-3199
针对工业机器人机械轴健康管理中检测效率和精准度较低的问题,提出了一种机械轴运行监控大数据背景下的基于动作周期退化相似性度量的健康指标(HI)构建方法,并结合长短时记忆(LSTM)网络进行机器人剩余寿命(RUL)的自动预测。首先,利用MPdist关注机械轴不同动作周期之间子周期序列相似性的特点,并计算正常周期数据与退化周期数据之间的偏离程度,进而构建HI;然后,利用HI集训练LSTM网络模型并建立HI与RUL之间的映射关系;最后,通过MPdist-LSTM混合模型自动计算RUL并适时预警。使用某公司六轴工业机器人进行实验,采集了加速老化数据约1 500万条,对HI单调性、鲁棒性和趋势性以及RUL预测的平均绝对误差(MAE)、均方根误差(RMSE)、决定系数(R2)、误差区间(ER)、早预测(EP)和晚预测(LP)等指标进行了实验测试,将该方法分别与动态时间规整(DTW)、欧氏距离(ED)、时域特征值(TDE)结合LSTM的方法,MPdist结合循环神经网络(RNN)和LSTM等方法进行比较。实验结果表明,相较于其他对比方法,所提方法所构建HI的单调性和趋势性分别至少提高了0.07和0.13,RUL预测准确率更高,ER更小,验证了所提方法的有效性。  相似文献   

20.
朱槐雨  李博 《计算机应用》2021,41(11):3234-3241
无人机(UAV)航拍图像视野开阔,图像中的目标较小且边缘模糊,而现有单阶段多框检测器(SSD)目标检测模型难以准确地检测航拍图像中的小目标。为了有效地解决原有模型容易漏检的问题,借鉴特征金字塔网络(FPN)提出了一种基于连续上采样的SSD模型。改进SSD模型将输入图像尺寸调整为320×320,新增Conv3_3特征层,将高层特征进行上采样,并利用特征金字塔结构对VGG16网络前5层特征进行融合,从而增强各个特征层的语义表达能力,同时重新设计先验框的尺寸。在公开航拍数据集UCAS-AOD上训练并验证,实验结果表明,所提改进SSD模型的各类平均精度均值(mAP)达到了94.78%,与现有SSD模型相比,其准确率提升了17.62%,其中飞机类别提升了4.66%,汽车类别提升了34.78%。  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号