1.

Automatic detection of dimension errors in spreadsheets

Chris Chambers Martin Erwig 《Journal of Visual Languages and Computing》2009,20(4):269-283

We present a reasoning system for inferring dimension information in spreadsheets. This system can be used to check the consistency of spreadsheet formulas and thus is able to detect errors in spreadsheets. Our approach is based on three static analysis components. First, the spatial structure of the spreadsheet is analyzed to infer a labeling relationship among cells. Second, cells that are used as labels are lexically analyzed and mapped to potential dimensions. Finally, dimension information is propagated through spreadsheet formulas. An important aspect of the rule system defining dimension inference is that it works bi-directionally, that is, not only “downstream” from referenced arguments to the current cell, but also “upstream” in the reverse direction. This flexibility makes the system robust and turns out to be particularly useful in cases when the initial dimension information that can be inferred from headers is incomplete or ambiguous. We have implemented a prototype system as an add-in to Excel. In an evaluation of this implementation we were able to detect dimension errors in almost 50% of the investigated spreadsheets, which shows (i) that the system works reliably in practice and (ii) that dimension information can be well exploited to uncover errors in spreadsheets.
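The third component described above (propagating dimensions through formulas) can be illustrated with a minimal sketch. It assumes a dimension is represented as a map from base unit to exponent; this representation and the function names `dim`, `multiply`, `divide` and `add` are hypothetical and not taken from the paper. Only "downstream" propagation is shown.

```python
def dim(**exponents):
    """Build a dimension as a map from base unit to exponent, e.g. dim(m=1, s=-1)."""
    return {unit: exp for unit, exp in exponents.items() if exp != 0}

def multiply(a, b):
    """Multiplication adds exponents unit by unit and drops units that cancel."""
    result = dict(a)
    for unit, exp in b.items():
        result[unit] = result.get(unit, 0) + exp
    return {unit: exp for unit, exp in result.items() if exp != 0}

def divide(a, b):
    """Division is multiplication by the inverse dimension."""
    return multiply(a, {unit: -exp for unit, exp in b.items()})

def add(a, b):
    """Addition is only consistent when both operands have the same dimension."""
    if a != b:
        raise ValueError(f"dimension error: cannot add {a} and {b}")
    return a

distance = dim(m=1)
duration = dim(s=1)
speed = divide(distance, duration)   # dimension of a formula such as =A2/B2
```

Under these assumptions, a formula that adds a distance cell to a duration cell would raise the `ValueError`, which is the kind of inconsistency a dimension checker reports.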

2.

Model inference for spreadsheets

Jácome Cunha Martin Erwig Jorge Mendes João Saraiva 《Automated Software Engineering》2016,23(3):361-392

Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically from higher-level models that can encode and enforce consistency constraints in the generated spreadsheets. Employing this strategy for legacy spreadsheets is difficult, because the model has to be reverse engineered from an existing spreadsheet and existing data must be transferred into the new model-generated spreadsheet. We have developed and implemented a technique that automatically infers relational schemas from spreadsheets. This technique uses particularities from the spreadsheet realm to create better schemas. We have evaluated this technique in two ways: first, we have demonstrated its applicability by using it on a set of real-world spreadsheets. Second, we have run an empirical study with users. The study has shown that the results produced by our technique are comparable to the ones developed by experts starting from the same (legacy) spreadsheet data. Although relational schemas are very useful to model data, they do not fit spreadsheets well, as they do not allow expressing layout. Thus, we have also introduced a mapping between relational schemas and ClassSheets. A ClassSheet controls further changes to the spreadsheet and safeguards it against a large class of formula errors. The developed tool is a contribution to spreadsheet (reverse) engineering, because it fills an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.

3.

Mutation Operators for Spreadsheets

Abraham Robin Erwig Martin 《IEEE transactions on pattern analysis and machine intelligence》2009,35(1):94-108

Based on (1) research into mutation testing for general purpose programming languages, and (2) spreadsheet errors that have been reported in the literature, we have developed a suite of mutation operators for spreadsheets. We present an evaluation of the mutation adequacy of du-adequate test suites generated by a constraint-based automatic test-case generation system we have developed in previous work. The results of the evaluation suggest additional constraints that can be incorporated into the system to target mutation adequacy. In addition to being useful in mutation testing of spreadsheets, the operators can be used in the evaluation of error-detection tools and also for seeding spreadsheets with errors for empirical studies. We describe two case studies where the suite of mutation operators helped us carry out such empirical evaluations. The main contribution of this paper is a suite of mutation operators for spreadsheets that can be used for carrying out empirical evaluations of spreadsheet tools to indicate ways in which the tools can be improved.
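To make the idea of a mutation operator and of mutation adequacy concrete, here is a minimal sketch. It assumes a single, hypothetical operator (arithmetic operator replacement) applied to formulas written as plain strings over cell references; the abstract does not list the paper's actual operators, so this is an illustration rather than their suite. The helper names `operator_mutants`, `evaluate` and `mutation_score` are invented for this example.

```python
import re

ARITHMETIC_OPERATORS = "+-*/"

def operator_mutants(formula):
    """Yield every formula obtained by replacing one arithmetic operator with another."""
    for match in re.finditer(r"[+\-*/]", formula):
        for replacement in ARITHMETIC_OPERATORS:
            if replacement != match.group():
                yield formula[:match.start()] + replacement + formula[match.end():]

def evaluate(formula, cells):
    """Evaluate a formula such as '=A1+B1' against a dict of cell values."""
    expression = re.sub(r"[A-Z]+[0-9]+",
                        lambda m: repr(cells[m.group()]),
                        formula.lstrip("="))
    try:
        return eval(expression, {"__builtins__": {}})
    except ZeroDivisionError:
        return "#DIV/0!"  # treat the error value as an ordinary, distinguishable result

def mutation_score(formula, test_cases):
    """Fraction of mutants whose output differs from the original on some test case."""
    mutants = list(operator_mutants(formula))
    if not mutants:
        return 1.0
    killed = sum(
        any(evaluate(mutant, cells) != evaluate(formula, cells) for cells in test_cases)
        for mutant in mutants
    )
    return killed / len(mutants)
```

With the single test case `{"A1": 2, "B1": 2}`, the mutant `=A1*B1` of `=A1+B1` survives because both formulas yield 4; adding a second test case such as `{"A1": 3, "B1": 1}` kills it. This is the sense in which a mutation score can point to missing test cases.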

4.

UCheck: A spreadsheet type checker for end users

《Journal of Visual Languages and Computing》2007,18(1):71-95

Spreadsheets are widely used, and studies have shown that most end-user spreadsheets contain non-trivial errors. Most of the currently available tools that try to mitigate this problem require varying levels of user intervention. This paper presents a system, called UCheck, that detects errors in spreadsheets automatically. UCheck carries out automatic header and unit inference, and reports unit errors to the users. UCheck is based on two static analysis phases that infer header and unit information for all cells in a spreadsheet. We have tested UCheck on a wide variety of spreadsheets and found that it works accurately and reliably. The system was also used in a continuing education course for high school teachers, conducted through Oregon State University, aimed at making the participants aware of the need for quality control in the creation of spreadsheets.

5.

An auditing protocol for spreadsheet models

Stephen G. Powell Kenneth R. Baker Barry Lawson 《Information & Management》2008

Errors are prevalent in spreadsheets and can be extremely difficult to find. A number of audits of existing spreadsheets have been reported, but few details have been given about how the audits were performed. We developed and tested a new spreadsheet auditing protocol designed to find errors in operational spreadsheets. Our work showed which auditing procedures, used in what sequence and combination, were most effective across a wide range of spreadsheets. It also provided useful information on the size and complexity of operational spreadsheets, as well as the frequency with which certain types of errors occur.

6.

Learning constraints in spreadsheets and tabular data

Samuel Kolb Sergey Paramonov Tias Guns Luc De Raedt 《Machine Learning》2017,106(9-10):1441-1468

Spreadsheets, comma-separated value files and other tabular data representations are in wide use today. However, writing, maintaining and identifying good formulas for tabular data and spreadsheets can be time-consuming and error-prone. We investigate the automatic learning of constraints (formulas and relations) in raw tabular data in an unsupervised way. We represent common spreadsheet formulas and relations through predicates and expressions whose arguments must satisfy the inherent properties of the constraint. The challenge is to automatically infer the set of constraints present in the data, without labeled examples or user feedback. We propose a two-stage generate-and-test method where the first stage uses constraint solving techniques to efficiently reduce the number of candidates, based on the predicate signatures. Our approach takes inspiration from inductive logic programming, constraint learning and constraint satisfaction. We show that we are able to accurately discover constraints in spreadsheets from various sources.
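The generate-and-test idea can be sketched for one constraint type. The example below assumes a table given as a dict of equal-length columns and looks only for constraints of the form "target column = left column + right column". It filters candidates with a simple type check in place of the constraint solver mentioned in the abstract, so it is a simplified stand-in and not the authors' method; `is_numeric` and `learn_sum_constraints` are hypothetical names.

```python
from itertools import combinations

def is_numeric(column):
    """Signature check: a SUM constraint only applies to all-numeric columns."""
    return all(isinstance(value, (int, float)) for value in column)

def learn_sum_constraints(table, tolerance=1e-9):
    """Return (target, left, right) column names with target = left + right in every row."""
    # Stage 1 (generate): keep only columns that match the predicate signature.
    numeric = [name for name, column in table.items() if is_numeric(column)]
    found = []
    for target in numeric:
        others = [name for name in numeric if name != target]
        for left, right in combinations(others, 2):
            # Stage 2 (test): the candidate must hold on every row of the data.
            rows = zip(table[target], table[left], table[right])
            if all(abs(t - (l + r)) <= tolerance for t, l, r in rows):
                found.append((target, left, right))
    return found

table = {
    "item": ["a", "b", "c"],
    "net": [10, 20, 30],
    "tax": [1, 2, 3],
    "gross": [11, 22, 33],
}
constraints = learn_sum_constraints(table)
```

No labels or user feedback are involved: the only evidence for a constraint is that it holds on all rows, which matches the unsupervised setting described above.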

7.

Multi-view document clustering via ensemble method

Syed Fawad Hussain Muhammad Mushtaq Zahid Halim 《Journal of Intelligent Information Systems》2014,43(1):81-99

Multi-view clustering has become an important extension of ensemble clustering. In multi-view clustering, we apply clustering algorithms on different views of the data to obtain different cluster labels for the same set of objects. These results are then combined in such a manner that the final clustering gives a better result than the individual clustering of each view. Multi-view clustering can be applied at various stages of the clustering paradigm. This paper proposes a novel multi-view clustering algorithm that combines different ensemble techniques. Our approach is based on computing different similarity matrices on the individual datasets and aggregating these to form a combined similarity matrix, which is then used to obtain the final clustering. We tested our approach on several datasets and performed a comparison with other state-of-the-art algorithms. Our results show that the proposed algorithm outperforms several other methods in terms of accuracy while maintaining the overall complexity of the individual approaches.
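The pipeline of per-view similarity matrices, aggregation, and final clustering can be sketched as follows. The choices here are assumptions made for illustration: cosine similarity per view, element-wise averaging as the aggregation, and connected components over a similarity threshold as a stand-in for the final clustering step. The abstract does not specify which similarity measures, ensemble techniques or clustering algorithm the paper uses, and all rows are assumed to be nonzero vectors.

```python
import math

def cosine_similarity_matrix(view):
    """Pairwise cosine similarity between the rows (objects) of one view."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in view]
    return [[sum(a * b for a, b in zip(r1, r2)) / (n1 * n2)
             for r2, n2 in zip(view, norms)]
            for r1, n1 in zip(view, norms)]

def combine(matrices):
    """Aggregate per-view similarity matrices by element-wise averaging."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
            for i in range(n)]

def threshold_clusters(similarity, threshold):
    """Give the same label to objects linked by similarities >= threshold."""
    n = len(similarity)
    labels = [None] * n
    next_label = 0
    for start in range(n):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] is None and similarity[i][j] >= threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

# Two views describing the same four objects with different features.
view_a = [[1, 0], [1, 0.1], [0, 1], [0.1, 1]]
view_b = [[1, 0, 0], [0.9, 0.1, 0], [0, 0, 1], [0, 0.1, 1]]
combined = combine([cosine_similarity_matrix(view_a), cosine_similarity_matrix(view_b)])
labels = threshold_clusters(combined, threshold=0.5)
```

Because the views only need to agree on the set of objects, each view may have a different number of features; the combined matrix is always objects by objects.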

8.

Model Checking Markov Chains with Actions and State Labels (total citations: 2, self-citations: 0, citations by others: 2)

Baier C. Cloth L. Haverkort B.R. Kuntz M. Siegle M. 《IEEE transactions on pattern analysis and machine intelligence》2007,33(4):209-224

In the past, logics of several kinds have been proposed for reasoning about discrete-time or continuous-time Markov chains. Most of these logics rely on either state labels (atomic propositions) or on transition labels (actions). However, in several applications it is useful to reason about both state properties and action sequences. For this purpose, we introduce the logic asCSL, which provides a powerful means to characterize execution paths of Markov chains with actions and state labels. asCSL can be regarded as an extension of the purely state-based logic CSL (continuous stochastic logic). In asCSL, path properties are characterized by regular expressions over actions and state formulas. Thus, the truth value of path formulas depends not only on the available actions in a given time interval, but also on the validity of certain state formulas in intermediate states. We compare the expressive power of CSL and asCSL and show that even the state-based fragment of asCSL is strictly more expressive than CSL if time intervals starting at zero are employed. Using an automaton-based technique, an asCSL formula and a Markov chain with actions and state labels are combined into a product Markov chain. For time intervals starting at zero, we establish a reduction of the model checking problem for asCSL to CSL model checking on this product Markov chain. The usefulness of our approach is illustrated with an elaborate model of a scalable cellular communication system, for which several properties are formalized by means of asCSL formulas and checked using the new procedure.

9.

Satisficing: a new approach to constructive nonlinear control

Curtis J.W. Beard R.W. 《Automatic Control, IEEE Transactions on》2004,49(7):1090-1102

The main contribution of this paper is a constructive parameterization of the class of almost smooth universal formulas which render a system asymptotically stable with respect to a known control Lyapunov function (CLF), and a constructive parameterization of a class of inverse optimal universal formulas having Kalman-like stability margins. The novelty of the parameterization is that it is given in terms of two functions which are constrained to be locally Lipschitz and satisfy convex constraints. The implication of this result is that the CLF/universal formula approach can be combined with a priori performance objectives to design high performance control strategies. Two examples illustrate the approach.

10.

A Declarative Approach to Network Device Configuration Correctness

Éric Lunaud Ngoupé Clément Parisot Sylvan Stoesel Petko Valtchev Roger Villemaire Omar Cherkaoui Pierre Boucher Sylvain Hallé 《Journal of Network and Systems Management》2017,25(1):180-209

Configuration Logic (CL) is a formal language that allows a network engineer to express constraints in terms of the actual parameters found in the configuration of network devices. We present an efficient algorithm that can automatically check a pool of devices for conformance to a set of CL constraints; moreover, this algorithm can point to the part of the configuration responsible for the error when a constraint is violated. Contrary to other validation approaches that require dumping the configuration of the whole network to a central location in order to be verified, we also present an algorithm that analyzes the correct formulas and greatly helps reduce the amount of data that need to be transferred to that central location, pushing as much of the evaluation of the formula as possible onto each device. The procedure is also backwards-compatible, in such a way that a device that does not support (or only partially supports) local evaluation may simply return a subset or all of its configuration. These capabilities have been integrated into a network management tool called ValidMaker.

11.

Systematic evolution of model-based spreadsheet applications

Markus Luckey Martin Erwig Gregor Engels 《Journal of Visual Languages and Computing》2012,23(5):267-286

Using spreadsheets is the preferred method to calculate, display or store anything that fits into a table-like structure. They are often used by end users to create applications, although they have one critical drawback: spreadsheets are very error-prone. Recent research has developed methods to reduce this error-proneness by introducing a new way of object-oriented modeling of spreadsheets before using them. These spreadsheet models, termed ClassSheets, are used to generate concrete spreadsheets on the instance level. By this approach sources of errors are reduced and spreadsheet applications become easier to understand. As usual for almost every other application, requirements on spreadsheets change due to the changing environment. Thus, the problem of evolution of spreadsheets arises. The update and evolution of spreadsheets is the uttermost source of errors that may have severe impact. In this paper, we will introduce a model-based approach to spreadsheet evolution by propagating updates on spreadsheet models (i.e. ClassSheets) to spreadsheets. To this end, update commands for the ClassSheet layer are automatically transformed to those for the spreadsheet layer. We describe spreadsheet model update propagation using a formal framework and present an integrated tool suite that allows the easy creation and safe update of spreadsheet models. The presented approach greatly contributes to the problem of software evolution and maintenance for spreadsheets and thus avoids many errors that might have severe impacts.

12.

Software Engineering for Spreadsheets

Erwig Martin 《Software, IEEE》2009,26(5):25-30

Spreadsheets are popular end-user programming tools. Many people use spreadsheet-computed values to make critical decisions, so spreadsheets must be correct. Proven software engineering principles can assist the construction and maintenance of dependable spreadsheets. However, how can we make this practical for end users? One way is to exploit spreadsheets' idiosyncratic structure to translate software engineering concepts such as type checking and debugging to an end-user programming domain. The simplified computational model and the spatial embedding of formulas, which provides rich contextual information, can simplify these concepts, leading to effective tools for end users.

13.

Port-of-Entry Inspection: Sensor Deployment Policy Optimization

Elsayed E.A. Young C.M. Minge Xie Hao Zhang Zhu Y. 《Automation Science and Engineering, IEEE Transactions on》2009,6(2):265-276

This paper considers the problem of container inspection at a port-of-entry. Containers are inspected through a specific sequence to detect the presence of nuclear materials, biological and chemical agents, and other illegal shipments. The threshold levels of sensors at inspection stations of the port-of-entry affect the probabilities of incorrectly accepting or rejecting a container. In this paper, we present several optimization approaches for determining the optimum sensor threshold levels, while considering misclassification errors, total cost of inspection, and budget constraints. In contrast to previous work which determines threshold levels and sequence separately, we consider an integrated system and determine them simultaneously. Examples applying the approaches in different sensor arrangements are demonstrated.

14.

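Collaborative filtering recommendation algorithm based on multi-context information

郝志峰 廖祥财 温雯 蔡瑞初 《计算机科学》 (Computer Science) 2021,48(3):168-173

With the development of e-commerce and the Internet, the volume of data has grown explosively. Collaborative filtering, a simple and efficient recommendation algorithm, can to some extent effectively address this information explosion. However, traditional collaborative filtering mines similar users from a single rating signal only, so its recommendation quality is not competitive. To improve the quality of personalized recommendation, a pressing problem for current recommender systems is how to fully exploit contextual information about users (items), such as text, images and tags, so as to maximize the value of the data. To this end, a collaborative filtering algorithm that fuses multiple types of contextual information is proposed. Taking the user-item interaction information as a bipartite graph, different similarity networks are constructed according to the characteristics of each type of context, and an objective function is designed that performs joint matrix factorization under the constraints of the multiple context networks and learns representations of users and items. Extensive experiments on several datasets show that the proposed algorithm not only effectively improves recommendation accuracy but also alleviates the data sparsity problem to some extent.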

15.

Avoiding, finding and fixing spreadsheet errors – A survey of automated approaches for spreadsheet QA

《Journal of Systems and Software》2014

Spreadsheet programs can be found everywhere in organizations and they are used for a variety of purposes, including financial calculations, planning, data aggregation and decision making tasks. A number of research surveys have however shown that such programs are particularly prone to errors. Some reasons for the error-proneness of spreadsheets are that spreadsheets are developed by end users and that standard software quality assurance processes are mostly not applied. Correspondingly, during the last two decades, researchers have proposed a number of techniques and automated tools aimed at supporting the end user in the development of error-free spreadsheets. In this paper, we provide a review of the research literature and develop a classification of automated spreadsheet quality assurance (QA) approaches, which range from spreadsheet visualization, static analysis and quality reports, over testing and support to model-based spreadsheet development. Based on this review, we outline possible opportunities for future work in the area of automated spreadsheet QA.

16.

Modeling Spreadsheet Audit: A Rigorous Approach to Automatic Visualization

《Journal of Visual Languages and Computing》2000,11(1):49-82

Computations in spreadsheets are hard to grasp and consequently many errors remain unnoticed. The problem with the hidden errors lies in the invisibility of the structure of calculations. As a result, auditing and visualization tools are required to make spreadsheets easier to comprehend and to make errors easier to detect. This paper presents a theoretical model of spreadsheets and a technique to describe spreadsheet auditing tools. These are then employed to describe and compare various tools. Moreover, two new visualization mechanisms are introduced. The spreadsheet model reflects not only current spreadsheet systems but also the way people actually use spreadsheets. Theoretically, it is impossible to check the correctness of a spreadsheet without a formal definition of its computations, but our hope is to find visualizations that point out parts of spreadsheets that contain anomalies, i.e. potential locations of errors. The model helps us to understand how such anomalies can be defined.

17.

Program testing versus proofs of correctness

William E. Howden 《Software Testing, Verification and Reliability》1991,1(1):5-15

18.

Spreadsheet development and ‘what-if’ analysis: quantitative versus qualitative errors

《Accounting, Management and Information Technologies》1999,9(3):141-160

Past research has shown that errors are relatively common in all types of spreadsheets. As spreadsheets are used widely by executives in analyzing and supporting their decision making, especially in financial analysis, budgeting and forecasting applications, it is important for spreadsheets to be accurate. Errors undetected in spreadsheets may have undesirable consequences. For example, errors may adversely impact the firm's competitiveness or profitability when the costing of projects is prone to incorrect computation. For this purpose, we investigate the types of errors that may occur even for simple domain-free spreadsheet problems. In addition, we also show that spreadsheet errors are difficult to detect during ‘what-if’ analysis (i.e. when some design parameters are changed) when spreadsheets are not properly designed. The results show that most students do not take due care in designing spreadsheets. It appears that the techniques in teaching spreadsheets should really focus on how to design a comprehensive spreadsheet that is both easy to maintain and debug rather than just demonstrating the many features of spreadsheets.

19.

Exception handling in the spreadsheet paradigm

《IEEE transactions on pattern analysis and machine intelligence》2000,26(10):923-942

Exception handling is widely regarded as a necessity in programming languages today and almost every programming language currently used for professional software development supports some form of it. However, spreadsheet systems, which may be the most widely used type of “programming language” today in terms of number of users using it to create “programs” (spreadsheets), have traditionally had only extremely limited support for exception handling. Spreadsheet system users range from end users to professional programmers and this wide range suggests that an approach to exception handling for spreadsheet systems needs to be compatible with the equational reasoning model of spreadsheet formulas, yet feature expressive power comparable to that found in other programming languages. We present an approach to exception handling for spreadsheet system users that is aimed at this goal. Some of the features of the approach are new; others are not new, but their effects on the programming language properties of spreadsheet systems have not been discussed before in the literature. We explore these properties, offer our solutions to problems that arise with these properties, and compare the functionality of the approach with that of exception handling approaches in other languages.

20.

An error correction framework for sequences resulting from known state-transition models in Non-Intrusive Load Monitoring

《Advanced Engineering Informatics》2017

Non-Intrusive Load Monitoring (NILM), the set of techniques used for disaggregating total electricity consumption in a building into its constituent electrical loads, has recently received renewed interest in the research community, partly due to the roll-out of smart metering technology worldwide. Event-based NILM approaches (i.e., those that are based on first segmenting the power time-series and associating each segment with the operation of electrical appliances) are a commonly implemented solution but are prone to the propagation of errors through the data processing pipeline. Thus, during energy estimation (the final step in the process), many corrections need to be made to account for errors incurred during segmentation, feature extraction and classification (the other steps typically present in event-based approaches). A robust framework for energy estimation should use the labels from classification to (1) model the different state transitions that can occur in an appliance; (2) account for any misclassifications by correcting event labels that violate the extracted model; and (3) accurately estimate the energy consumed by that appliance over a period of time. In this paper, we address the second problem by proposing an error-correcting algorithm which looks at sequences generated by Finite State Machines (FSMs) and corrects for errors in the sequence; errors are defined as state transitions that violate the said FSM. We evaluate our framework on simulated data and find that it improves energy estimation errors. We further test it on data from 43 appliances collected from 19 houses and find that the framework significantly improves errors in energy estimates when compared to the case with no correction in 19 appliances, leaves 17 appliances unchanged, and has a slightly negative impact on 6 appliances.
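One way to correct labels that violate a known FSM is sketched below. The abstract does not describe the paper's algorithm, so this example swaps in a plain dynamic program: among all state sequences the FSM allows, it returns one with the fewest label changes relative to the observed sequence. The function name `correct_sequence`, the dict-of-sets encoding of transitions, and the three-state appliance are all hypothetical.

```python
import math

def correct_sequence(observed, transitions):
    """Return an FSM-valid sequence with minimum Hamming distance to `observed`.

    `transitions` maps each state to the set of states allowed to follow it.
    """
    if not observed:
        return []
    states = list(transitions)
    # cost[s]: fewest label changes for a valid prefix that ends in state s.
    cost = {s: (0 if s == observed[0] else 1) for s in states}
    back_pointers = []
    for label in observed[1:]:
        new_cost, pointers = {}, {}
        for s in states:
            predecessors = [p for p in states if s in transitions[p]]
            if not predecessors:
                new_cost[s], pointers[s] = math.inf, None
                continue
            best = min(predecessors, key=lambda p: cost[p])
            new_cost[s] = cost[best] + (0 if s == label else 1)
            pointers[s] = best
        cost = new_cost
        back_pointers.append(pointers)
    # Backtrack from the cheapest final state.
    state = min(states, key=lambda s: cost[s])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical appliance that cycles off -> low -> high -> off.
transitions = {"off": {"low"}, "low": {"high"}, "high": {"off"}}
observed = ["off", "low", "low", "off", "low"]   # "low" -> "low" violates the FSM
corrected = correct_sequence(observed, transitions)
```

In this example the third label is changed to "high", the only single-label change that makes every transition legal. The sketch assumes at least one valid sequence of the required length exists in the FSM.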