Similar Documents
20 similar documents found (search time: 593 ms)
1.
Machine learning offers a systematic framework for developing metrics that use multiple criteria to assess the quality of machine translation (MT). However, learning introduces additional complexities that may affect the resulting metric's effectiveness. First, a learned metric is more reliable for translations that are similar to its training examples; this calls into question whether it is as effective in evaluating translations from systems that are not its contemporaries. Second, metrics trained from different sets of training examples may exhibit variations in their evaluations. Third, expensive developmental resources (such as translations that have been evaluated by humans) may be needed as training examples. This paper investigates these concerns in the context of using regression to develop metrics for evaluating machine-translated sentences. We track a learned metric's reliability across a 5-year period to measure the extent to which the learned metric can evaluate sentences produced by other systems. We compare metrics trained under different conditions to measure their variations. Finally, we present an alternative formulation of metric training in which the features are based on comparisons against pseudo-references in order to reduce the demand on human-produced resources. Our results confirm that regression is a useful approach for developing new metrics for MT evaluation at the sentence level.
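As a rough illustration of the regression formulation described above (not the authors' actual features, learner, or data), the sketch below fits a linear model that maps a single pseudo-reference comparison feature to human quality scores; the feature function, sentences, and scores are placeholders.

```python
# Illustrative sketch only: a regression-trained sentence-level MT metric.
# The feature (unigram precision against pseudo-references) and the learner
# (plain linear regression) are placeholders, not the paper's actual setup.
from sklearn.linear_model import LinearRegression

def unigram_precision(hypothesis, pseudo_references):
    """Fraction of hypothesis tokens that appear in any pseudo-reference."""
    hyp_tokens = hypothesis.split()
    ref_vocab = {tok for ref in pseudo_references for tok in ref.split()}
    if not hyp_tokens:
        return 0.0
    return sum(tok in ref_vocab for tok in hyp_tokens) / len(hyp_tokens)

# Toy training data: MT outputs, pseudo-references (e.g. other systems' outputs),
# and human judgments of the MT outputs (e.g. adequacy on a 1-5 scale).
train_hyps = ["the cat sat on mat", "a cat is sitting on the mat"]
pseudo_refs = [["the cat sat on the mat"], ["the cat sat on the mat"]]
human_scores = [3.5, 4.5]

X = [[unigram_precision(h, refs)] for h, refs in zip(train_hyps, pseudo_refs)]
model = LinearRegression().fit(X, human_scores)

# Score a new sentence from some (possibly later) MT system.
feat = unigram_precision("cat on the mat", ["the cat sat on the mat"])
print(model.predict([[feat]]))
```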

2.
The current Internet provides only "best-effort" delivery, so the network layer cannot control transmission quality. Providing different applications with services of different QoS is therefore a basic requirement of network users and an important research topic facing the Internet. In recent years, the discussion of IP QoS has focused on service models and frameworks such as IntServ, DiffServ, and MPLS; in resource-constrained network environments, QoS routing is the foundation on which these solutions are realized. Starting from the importance of QoS routing, this paper establishes a network model and metric-composition rules, analyzes in detail the basic single-metric routing problems and the combined multi-metric routing problems that may arise in unicast and multicast, and presents methods for solving these routing problems along with their computational complexity. This provides a useful reference for designing and implementing practical QoS routing protocols and algorithms.

3.
For many optimization applications a complicated computational simulation is replaced with a simpler response surface model. These models are built by fitting a limited number of evaluations of the full simulation with a simple function that captures the trends in the evaluated data. In many cases the values of the data at the evaluation points have some uncertainty. This paper uses Bayesian model selection to derive two objective metrics that can be used to determine which response surface model provides the most appropriate representation of the evaluated data given the associated uncertainty. These metrics are shown to be consistent with modelling intuition based on Occam's principle. The uncertainty may be due to numerical error, approximations, uncertain input conditions, or to higher order effects in the simulation that do not need to be fit by the response surface. Two metrics, Q and G, are derived in this paper. The metric Q assumes that a good estimate of the simulation uncertainty is available. The metric G assumes the uncertainty, although present, is unknown. Applications of these metrics in one and two dimensions are demonstrated.

4.
A practical view of software measurement that formed the basis for a companywide software metrics initiative within Motorola is described. A multidimensional view of measurement is provided by identifying different dimensions (e.g., metric usefulness/utility, metric types or categories, metric audiences) that were considered in this companywide metrics implementation process. The definitions of the common set of Motorola software metrics, as well as the charts used for presenting these metrics, are included. The metrics were derived using the goal/question/metric approach to measurement. A distinction is made between the use of metrics for process improvement over time across projects and the use of metrics for in-process project control. Important experiences in implementing the software metrics initiative within Motorola are also included.

5.
Business process models abstract complex business processes by representing them as graphical models. Their layout, as determined by the modeler, may have an effect when these models are used; however, this effect is currently not fully understood. In order to study this effect systematically, a basic set of measurable key visual features is proposed, capturing the layout properties that are meaningful to the human user. The aim of this research is thus twofold: first, to empirically identify key visual features of business process models which are perceived as meaningful to the user, and second, to show how such features can be quantified into computational metrics applicable to business process models. We focus on one particular feature, consistency of flow direction, and show the challenges that arise when transforming it into a precise metric. We propose three different metrics addressing these challenges, each following a different view of flow consistency. We then report the results of an empirical evaluation, which indicates which metric is more effective in predicting the human perception of this feature. Two further automatic evaluations, describing the performance and the computational capabilities of our metrics, are also reported.

6.
Fault diagnosis methods based on graph neural networks typically determine the similarity between samples according to a single distance measure and then use it to construct the graph topology. However, a single measure may fail to gauge the similarity between data samples accurately and therefore fail to characterize the relationships between samples, so the choice of measure strongly affects the diagnostic performance of the graph neural network. To address the problem that a single measure cannot accurately characterize the correlation between data samples, this paper proposes Multi-GAT, a fault diagnosis model that constructs the graph from multiple measures. The results of three measures are combined to judge the strength of the correlation between data samples. The scoring function of the graph attention network is improved so that it can determine the similarity between data samples more accurately according to the strength of their correlation. Experiments on the benchmark datasets used in this paper show that Multi-GAT improves diagnostic accuracy and exhibits good stability.
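The abstract does not state which three measures Multi-GAT combines or how exactly they are fused; as a hypothetical sketch of the general idea of multi-measure graph construction, the following combines three common similarity measures (Euclidean, Manhattan, cosine, chosen here only for illustration) into one adjacency matrix.

```python
# Hypothetical sketch of multi-measure graph construction; the three measures
# and the fusion rule are stand-ins, not necessarily those used by Multi-GAT.
import numpy as np

def adjacency_from_multiple_measures(X, k=2):
    """Build a k-nearest-neighbour adjacency matrix by averaging similarities
    obtained from three different measures."""
    n = X.shape[0]
    d_euc = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    d_man = np.abs(X[:, None] - X[None, :]).sum(axis=-1)
    norms = np.linalg.norm(X, axis=1, keepdims=True) + 1e-12
    d_cos = 1.0 - (X @ X.T) / (norms @ norms.T)

    sims = [1.0 / (1.0 + d) for d in (d_euc, d_man, d_cos)]  # distances -> similarities
    fused = np.mean(sims, axis=0)                            # combine the three measures
    np.fill_diagonal(fused, -np.inf)                         # exclude self-loops

    A = np.zeros((n, n))
    for i in range(n):                     # connect each sample to its k strongest neighbours
        A[i, np.argsort(fused[i])[-k:]] = 1.0
    return np.maximum(A, A.T)              # symmetrise

X = np.random.rand(6, 8)                   # 6 samples with 8 features each
print(adjacency_from_multiple_measures(X))
```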

7.
For high-order PSK modulation with Gray mapping, this paper proposes a simplified method for computing bit soft information. Exploiting the symmetry of the Gray code, the method obtains the bit soft information recursively. Analysis and simulation results show that, for high-order modulation, the method greatly reduces the computational burden compared with the conventional ML and Max-Log methods while having little impact on system performance. Moreover, the method handles PSK signals of different modulation orders in a unified way, making it suitable for adaptive coded modulation systems.
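The recursion itself is not reproduced in the abstract; for context, the conventional Max-Log approximation of the bit soft information that the proposed method simplifies can be written as below (sign convention and noise normalisation vary between authors).

```latex
% Max-Log approximation of the bit soft information (LLR) for bit b_k of a
% Gray-mapped PSK symbol: r is the received sample, \sigma^2 the noise variance,
% and S_k^(0), S_k^(1) the constellation subsets in which bit k equals 0 or 1.
L(b_k) \;\approx\; \frac{1}{\sigma^{2}}
  \left( \min_{s \in \mathcal{S}_k^{(0)}} \lvert r - s \rvert^{2}
       \;-\; \min_{s \in \mathcal{S}_k^{(1)}} \lvert r - s \rvert^{2} \right)
```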

8.
In a wireless network, node failure due to either natural disasters or human intervention can cause network partitioning and other communication problems. For this reason, a wireless network should be fault tolerant. At present, most researchers use k-connectivity to measure fault tolerance, which requires the network to remain connected after the failure of any k-1 or fewer nodes. However, wireless network node failures are usually spatially correlated; particularly in military applications, nodes in the same limited area can fail together. As a metric of fault tolerance, k-connectivity fails to capture this spatial correlation of faults and hardly satisfies the fault-tolerance requirements of a wireless network design. In this paper, a new metric of fault tolerance, termed D-region fault tolerance, is introduced to measure wireless network fault tolerance. A D-region fault tolerant network remains connected even after all the nodes in a circular region of diameter D have failed. Based on D-region fault tolerance, we propose two fault-tolerant topology control algorithms: the global region fault tolerance algorithm (GRFT) and the localized region fault tolerance algorithm (LRFT). It is theoretically proven that both algorithms generate networks with D-region fault tolerance. Simulation results indicate that, with the same fault-tolerance capabilities, networks based on the GRFT and LRFT algorithms have a lower transmission radius and lower logical degree.

9.
Symbol-vector detection algorithms for large-scale MIMO systems have high computational complexity. To address this, a fast, low-complexity detection algorithm for massive MIMO systems is proposed by combining particle swarm optimization (PSO) with ant colony optimization (ACO). First, a new probabilistic search model is derived that combines distance-based ant colony search with velocity-based particle search. Then, the ACO distance indicator is combined with the PSO direction and velocity indicators to form a new probabilistic indicator, and the ACO pheromone-update step is replaced by the PSO velocity update. Finally, the MIMO detection problem is modeled as a path-finding problem, and a suboptimal solution to the MIMO symbol detection problem is sought. Comparative simulation results show that the detection performance of the proposed algorithm is better than that of several traditional algorithms and other recent MIMO detection algorithms; it achieves a bit error rate close to that of maximum-likelihood detection with an extremely fast computation speed, making it suitable for massive MIMO systems.

10.
Directly optimizing an information retrieval (IR) metric has become a hot topic in the field of learning to rank. Conventional wisdom holds that it is better to train with the loss function that will be used for evaluation, but we often observe different results in reality. For example, directly optimizing average precision achieves higher performance than directly optimizing precision@3 when the ranking results are evaluated in terms of precision@3. This motivates us to combine multiple metrics in the process of optimizing IR metrics. For simplicity we study learning with two metrics. Since we usually conduct the learning process in a restricted hypothesis space, e.g., a linear hypothesis space, it is usually difficult to maximize both metrics at the same time. To tackle this problem, we propose a relaxed approach in this paper: we incorporate one metric as a constraint while maximizing the other, as sketched below. By restricting the feasible hypothesis space, we obtain a more robust ranking model. Empirical results on the benchmark data set LETOR show that the relaxed approach is superior to the direct linear combination approach, and also outperforms other baselines.
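In symbols, the relaxed formulation can be sketched as maximizing one metric while keeping the other above a threshold; the notation below (ranking model f_w, metrics M1 and M2, threshold tau) is chosen for this sketch and is not taken from the paper.

```latex
% Illustrative relaxed formulation: maximise one IR metric while constraining
% the other, over a restricted (here linear) hypothesis space.
\max_{w}\; M_{1}(f_{w})
\quad \text{subject to} \quad
M_{2}(f_{w}) \ge \tau ,
\qquad f_{w}(x) = w^{\top} x
```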

11.
Removing the bias and variance of multicentre data has always been a challenge in large scale digital healthcare studies, which requires the ability to integrate clinical features extracted from data acquired by different scanners and protocols to improve stability and robustness. Previous studies have described various computational approaches to fuse single-modality multicentre datasets. However, these surveys rarely focused on evaluation metrics and lacked a checklist for computational data harmonisation studies. In this systematic review, we summarise the computational data harmonisation approaches for multi-modality data in the digital healthcare field, including harmonisation strategies and evaluation metrics based on different theories. In addition, a comprehensive checklist that summarises common practices for data harmonisation studies is proposed to guide researchers to report their research findings more effectively. Finally, flowcharts presenting possible paths for methodology and metric selection are proposed, and the limitations of the different methods are surveyed to guide future research.

12.
This paper reports on a pioneering effort to establish a software composite metric with the key capability of distinguishing among different structures. As part of this effort, most of the previously proposed program control-flow complexity metrics are evaluated. It is observed that most of these metrics are inherently limited in distinguishing capability. However, the concept of composite metrics is potentially useful for the development of a practical metric. This paper presents a methodology for developing a practical composite metric using statistical techniques. The proposed metric differs from all previous metrics in two ways: (1) it is based on an overall structural analysis of a given program in a deeper and broader context, capturing structural measurements taken from all existing structural levels; (2) it unifies a set of 19 important structural metrics, and the compositing model of these metrics is based on statistical techniques rather than on an arbitrary method. Experience with the proposed metric clearly indicates that it distinguishes different structures better than previous metrics.

13.
Correlation Analysis and Performance Evaluation of Image Fusion Quality Metrics (cited by 9: 0 self-citations, 9 by others)
Research on image fusion quality metrics aims to provide efficient and accurate methods that support problems such as fusion model selection and parameter optimization. Through mechanism analysis of existing metrics, testing of metric performance, and correlation analysis among metrics, this paper proposes a selection strategy for a set of objective evaluation metrics. Existing objective metrics are first grouped into three categories: statistics-based, information-based, and human-visual-system-based; classic and recent metrics within each category are then listed, and the performance of each objective image fusion metric is verified on standard datasets using a correct-ranking criterion. The results show that metrics based on the human visual system generally outperform the other two categories. Finally, the Spearman correlation coefficient is used to mine the degree of correlation between metrics. Experiments show that a suitable set of objective evaluation metrics can be selected based on metric performance and correlation coefficients.
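As a minimal sketch of the correlation step described above (the metric names and scores below are invented for illustration), the Spearman rank correlation between pairs of objective fusion metrics can be computed as follows; pairs with high correlation are candidates for pruning from the metric set.

```python
# Minimal sketch: Spearman rank correlation between objective fusion metrics.
# Metric names and score values are invented for illustration only.
import numpy as np
from scipy.stats import spearmanr

# Rows: fused images; columns: scores from different objective metrics.
scores = np.array([
    # mutual_info  edge_based  hvs_based
    [0.62,         0.71,       0.55],
    [0.48,         0.66,       0.49],
    [0.75,         0.80,       0.70],
    [0.55,         0.60,       0.58],
])

rho, pval = spearmanr(scores)   # pairwise correlation matrix over the columns
print(rho)                      # high |rho| suggests two metrics are largely redundant
```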

14.
Complex networks are often characterized by their underlying graph metrics, yet there is no unified computational method for comparing networks to each other. Given that complex networks are entities characterized by a set of known properties, the problem reduces to quantifying the similarity between multi-variable entities. To address this issue, we introduce the new statistical fidelity metric, which can compare any types of entities characterized by specific individual metrics, in order to gauge the similarity of the entities in the form of a single number between 0 and 1. To test the efficiency of statistical fidelity, we apply our composite metric in the field of complex networks, by assessing the topological similarity and realism of social networks and urban road networks. Compared against other statistical methods, such as cosine similarity, Pearson correlation, Mahalanobis distance and fractal dimension, we highlight the superior analytic power of statistical fidelity.
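The statistical fidelity formula itself is not given in the abstract; for context, the baseline measures it is compared against can be computed on two vectors of graph metrics as in the hedged sketch below (the metric values and the covariance sample are invented for illustration).

```python
# Sketch of the baseline measures mentioned above (cosine similarity, Pearson
# correlation, Mahalanobis distance) applied to two vectors of graph metrics.
import numpy as np
from scipy.spatial.distance import cosine, mahalanobis
from scipy.stats import pearsonr

# Each vector: e.g. [avg. degree, clustering coeff., avg. path length, modularity]
net_a = np.array([6.2, 0.41, 3.1, 0.62])
net_b = np.array([5.8, 0.38, 3.4, 0.59])

cos_sim = 1.0 - cosine(net_a, net_b)      # scipy returns cosine *distance*
pearson, _ = pearsonr(net_a, net_b)

# Mahalanobis distance needs a covariance estimate, here from a synthetic sample.
sample = np.random.rand(50, 4) * np.array([8.0, 1.0, 5.0, 1.0])
VI = np.linalg.inv(np.cov(sample, rowvar=False))
maha = mahalanobis(net_a, net_b, VI)

print(cos_sim, pearson, maha)
```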

15.
Model calibration is the process of estimating unknown inputs in a model to improve the agreement between model predictions and experimental observations. Optimization-based model calibration is a probabilistic approach for estimating unknown inputs by using optimization techniques. Gradient-based optimization algorithms are popular for optimization-based model calibration because of their computational efficiency. Gradient-based algorithms, however, also have drawbacks, including the local optimum issue, the numerical noise issue, and lack of gradient information. In optimization-based model calibration, a calibration metric that quantifies the similarity or difference between two probability distributions (the predicted and the observed system responses) is defined as the objective function. Current methods of optimization-based model calibration use existing calibration metrics, such as the likelihood function and the probability residual. Occasionally, these methods yield inaccurate calibration results. Therefore, this comprehensive study first investigates the root causes of the inaccurate calibration results that arise from using existing calibration metrics. Second, an enhanced method is proposed to achieve robust optimization-based model calibration by providing analytical gradient information. This study provides a general guideline for improved optimization-based model calibration.

16.
Sorting images by similarity enables more images to be viewed simultaneously and can be very useful for stock photo agencies or e-commerce applications. Visually sorted grid layouts attempt to arrange images so that their proximity on the grid corresponds as closely as possible to their similarity. Various metrics exist for evaluating such arrangements, but there is little experimental evidence of correlation between human-perceived quality and metric values. We propose distance preservation quality (DPQ) as a new metric to evaluate the quality of an arrangement. Extensive user testing revealed a stronger correlation of DPQ with user-perceived quality and with performance in image retrieval tasks compared to other metrics. In addition, we introduce Fast Linear Assignment Sorting (FLAS), a new algorithm for creating visually sorted grid layouts. FLAS achieves very good sorting quality while reducing run time and computational resource requirements.

17.
Many studies use logistic regression models to investigate the ability of complexity metrics to predict fault-prone classes. However, it is not uncommon to see the inappropriate use of performance indicators such as the odds ratio in previous studies. In particular, a recent study by Olague et al. uses the odds ratio associated with a one-unit increase in a metric to compare the relative magnitudes of the associations between individual metrics and fault-proneness. In addition, the percentages of concordant, discordant, and tied pairs are used to evaluate the predictive effectiveness of a univariate logistic regression model. Their results suggest that lesser-known complexity metrics such as standard deviation method complexity (SDMC) and average method complexity (AMC) are better predictors than the two commonly used metrics: lines of code (LOC) and weighted method McCabe complexity (WMC). In this paper, however, we show that (1) the odds ratio associated with a one-standard-deviation increase, rather than a one-unit increase, in a metric should be used to compare the relative magnitudes of the effects of individual metrics on fault-proneness; otherwise, misleading results may be obtained; and that (2) the connection of the percentages of concordant, discordant, and tied pairs with the predictive effectiveness of a univariate logistic regression model is false, as they in fact do not depend on the model. Furthermore, we use the data collected from three versions of Eclipse to re-examine the ability of complexity metrics to predict fault-proneness. Our experimental results reveal that: (1) many metrics exhibit moderate or almost moderate ability in discriminating between fault-prone and not fault-prone classes; (2) LOC and WMC are indeed better fault-proneness predictors than SDMC and AMC; and (3) the explanatory power of complexity metrics other than LOC is limited.
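The point about reporting odds ratios per standard deviation rather than per unit can be made concrete with a small sketch (the fitted coefficient and the LOC values below are hypothetical): the per-SD odds ratio is obtained by scaling the coefficient by the metric's standard deviation before exponentiating.

```python
# Sketch: odds ratio per one-standard-deviation vs. per one-unit increase.
# The logistic-regression coefficient and the LOC values are hypothetical.
import numpy as np

beta_loc = 0.004          # coefficient for LOC (log-odds change per unit)
loc_values = np.array([120, 340, 95, 800, 210, 60, 1500, 430])

or_per_unit = np.exp(beta_loc)                    # tiny, looks negligible
or_per_sd = np.exp(beta_loc * loc_values.std())   # comparable across metrics

print(f"OR per unit: {or_per_unit:.3f}")
print(f"OR per SD:   {or_per_sd:.3f}")
```

Because different metrics are measured on very different scales, only the per-SD figure allows their effect sizes to be compared directly.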

18.

Ever-growing video streaming services require accurate quality assessment, often with no reference to the original media. One primary challenge in developing no-reference (NR) video quality metrics is achieving real-time operation while retaining accuracy. A real-time no-reference video quality assessment (VQA) method is proposed for videos encoded by the H.264/AVC codec. Temporal and spatial features are extracted from the encoded bit-stream and pixel values to train and validate a fully connected neural network. The hand-crafted features and network dynamics are designed to ensure a high correlation with human judgment of quality while minimizing the computational complexity. Proof-of-concept experiments are conducted via comparison with: 1) video sequences rated by a full-reference quality metric, and 2) H.264-encoded sequences from the LIVE video dataset which are subjectively evaluated through differential mean opinion scores (DMOS). The performance of the proposed method is verified by correlation measurements with the aforementioned objective and subjective scores. The framework achieves real-time execution while outperforming state-of-the-art full-reference and no-reference video quality assessment methods.

19.
The goal of image annotation is to automatically assign a set of textual labels to an image to describe its visual contents. Recently, with the rapid increase in the number of web images, nearest neighbor (NN) based methods have become more attractive and have shown exciting results for image annotation. One of the key challenges of these methods is to define an appropriate similarity measure between images for neighbor selection. Several distance metric learning (DML) algorithms derived from traditional image classification problems have been applied to annotation tasks. However, a fundamental limitation of applying DML to image annotation is that it learns a single global distance metric over the entire image collection and measures the distance between image pairs at the image level. For multi-label annotation problems, it may be more reasonable to measure the similarity of image pairs at the label level. In this paper, we develop a novel label prediction scheme utilizing multiple label-specific local metrics for label-level similarity measurement, and propose two different local metric learning methods in a multi-task learning (MTL) framework. Extensive experimental results on two challenging annotation datasets demonstrate that 1) utilizing multiple local distance metrics to learn label-level distances is superior to using a single global metric in label prediction, and 2) the proposed methods, which use the MTL framework to learn multiple local metrics simultaneously, can model the commonalities of labels, thereby facilitating label prediction and achieving state-of-the-art annotation performance.

20.
Context: Software quality attributes are assessed by employing appropriate metrics. However, the choice of such metrics is not always obvious and is further complicated by the multitude of available metrics. To assist metric selection, several properties have been proposed. However, although metrics are often used to assess successive software versions, there is no property that assesses their ability to capture structural changes along evolution. Objective: We introduce a property, Software Metric Fluctuation (SMF), which quantifies the degree to which a metric score varies due to changes occurring between successive versions of a system. Regarding SMF, metrics can be characterized as sensitive (changes induce high variation in the metric score) or stable (changes induce low variation in the metric score). Method: The SMF property has been evaluated by: (a) a case study on 20 OSS projects, to assess the ability of SMF to characterize different metrics differently, and (b) a case study on 10 software engineers, to assess SMF's usefulness in the metric selection process. Results: The results of the first case study suggest that different metrics quantifying the same quality attribute differ in their fluctuation. We also provide evidence that an additional factor related to metric fluctuation is the function used to aggregate the metric from the micro to the macro level. In addition, the outcome of the second case study suggests that SMF can help practitioners in metric selection, since: (a) different practitioners have different perceptions of metric fluctuation, and (b) this perception is less accurate than the systematic approach that SMF offers. Conclusions: SMF is a useful metric property that can improve the accuracy of metric selection. Based on SMF, we can differentiate metrics according to their degree of fluctuation. These results can provide input to researchers and practitioners in their metric selection processes.
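The paper's exact definition of SMF is not reproduced in this abstract; a plausible stand-in that captures the idea (how much a metric score varies across successive versions of a system) is sketched below, with made-up version histories.

```python
# Hypothetical stand-in for a fluctuation property: how much a metric's score
# varies across successive versions. This is NOT the paper's exact SMF
# definition, only an illustration of the idea of sensitive vs. stable metrics.
import numpy as np

def fluctuation(scores):
    """Mean absolute relative change of a metric score between successive versions."""
    scores = np.asarray(scores, dtype=float)
    changes = np.abs(np.diff(scores)) / np.abs(scores[:-1])
    return changes.mean()

coupling_per_version = [12.0, 12.5, 12.4, 13.0, 12.8]        # a fairly stable metric
loc_per_version = [10_000, 14_500, 13_200, 21_000, 19_500]   # a more sensitive one

print(fluctuation(coupling_per_version))  # low  -> "stable" metric
print(fluctuation(loc_per_version))       # high -> "sensitive" metric
```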
