Many errors in spreadsheet formulas can be avoided if spreadsheets are built automatically from higher-level models that can encode and enforce consistency constraints in the generated spreadsheets. Employing this strategy for legacy spreadsheets is difficult, because the model has to be reverse engineered from an existing spreadsheet and existing data must be transferred into the new model-generated spreadsheet. We have developed and implemented a technique that automatically infers relational schemas from spreadsheets. This technique uses particularities from the spreadsheet realm to create better schemas. We have evaluated this technique in two ways: first, we have demonstrated its applicability by using it on a set of real-world spreadsheets. Second, we have run an empirical study with users. The study has shown that the results produced by our technique are comparable to the ones developed by experts starting from the same (legacy) spreadsheet data. Although relational schemas are very useful to model data, they do not fit spreadsheets well, as they do not allow expressing layout. Thus, we have also introduced a mapping between relational schemas and ClassSheets. A ClassSheet controls further changes to the spreadsheet and safeguards it against a large class of formula errors. The developed tool is a contribution to spreadsheet (reverse) engineering, because it fills an important gap and allows a promising design method (ClassSheets) to be applied to a huge collection of legacy spreadsheets with minimal effort.

To address the problem of errors in spreadsheets, we have investigated spreadsheet authors' mental models in the hope of finding cognition-based principles for spreadsheet visualization and debugging tools. To this end, we have conducted three empirical studies. The first study explored the nature of mental models of spreadsheet authors during explaining and debugging tasks. It was found that several mental models about spreadsheets are activated in spreadsheet authors' minds. Particularly, when explaining a spreadsheet, the real-world and domain mental models are prominent, and the spreadsheet model is suppressed; however, when locating and fixing an error, one must constantly switch back and forth between the domain model and the spreadsheet model, which requires frequent use of the mapping between problem domain concepts and their spreadsheet model counterparts. The second study examined the effects of replacing traditional spreadsheet formulas with problem domain narratives in the context of a debugging task. Domain narratives were found to be easy to learn and they helped participants to locate more errors in spreadsheets. Furthermore, domain narratives also increased the use of the domain mental model and appeared to improve the mapping between the domain and spreadsheet models. The third study investigated the effects of allowing spreadsheet authors to fix errors by editing domain narratives, thus relieving them from the use of traditional low-level cell references. This scenario was found to promote spreadsheet authors using even more of their domain mental model in a manner that completely overshadowed the use of their spreadsheet mental model.
Thus, from a mental model perspective, it is possible to devise a new spreadsheet paradigm that uses domain narratives in place of traditional spreadsheet formulas, thus automatically presenting spreadsheet content so that it prompts spreadsheet authors to think in a manner that closely corresponds to their mental models of the application domain.

Computations in spreadsheets are hard to grasp and consequently many errors remain unnoticed. The problem with the hidden errors lies in the invisibility of the structure of calculations. As a result, auditing and visualization tools are required to make spreadsheets easier to comprehend and to make errors easier to detect. This paper presents a theoretical model of spreadsheets and a technique to describe spreadsheet auditing tools. These are then employed to describe and compare various tools. Moreover, two new visualization mechanisms are introduced. The spreadsheet model reflects not only current spreadsheet systems but also the way people actually use spreadsheets. Theoretically, it is impossible to check the correctness of a spreadsheet without a formal definition of its computations, but our hope is to find visualizations that point out parts of spreadsheets that contain anomalies, i.e. potential locations of errors. The model helps us to understand how such anomalies can be defined.

Past research has shown that errors are relatively common in all types of spreadsheets. As spreadsheets are used widely by executives in analyzing and supporting their decision making, especially in financial analysis, budgeting and forecasting applications, it is important for spreadsheets to be accurate. Errors undetected in spreadsheets may have undesirable consequences. For example, errors may adversely impact the firm's competitiveness or profitability when the costing of projects is prone to incorrect computation. For this purpose, we investigate the types of errors that may occur even for simple domain-free spreadsheet problems. In addition, we also show that spreadsheet errors are difficult to detect during ‘what-if’ analysis (i.e. when some design parameters are changed) when spreadsheets are not properly designed. The results show that most students do not take due care in designing spreadsheets. It appears that the techniques in teaching spreadsheets should really focus on how to design a comprehensive spreadsheet that is both easy to maintain and debug rather than just demonstrating the many features of spreadsheets.

Spreadsheet programs can be found everywhere in organizations and they are used for a variety of purposes, including financial calculations, planning, data aggregation and decision making tasks. A number of research surveys have however shown that such programs are particularly prone to errors. Some reasons for the error-proneness of spreadsheets are that spreadsheets are developed by end users and that standard software quality assurance processes are mostly not applied. Correspondingly, during the last two decades, researchers have proposed a number of techniques and automated tools aimed at supporting the end user in the development of error-free spreadsheets. In this paper, we provide a review of the research literature and develop a classification of automated spreadsheet quality assurance (QA) approaches, which range from spreadsheet visualization, static analysis and quality reports, over testing and support to model-based spreadsheet development. Based on this review, we outline possible opportunities for future work in the area of automated spreadsheet QA.

Field audits and experiments have found substantial error rates when students and professionals have built spreadsheet models. In this study, 102 undergraduate MIS majors and 50 MBA students developed a model from a word problem that was relatively simple and free of domain knowledge. Even so, 35% of their 152 models were incorrect. There was no significant difference in errors per model between undergraduates and MBAs. Even among the 17 MBAs with 250 h or more of experience, 24% of the models contained errors. The cell error rate (CER), the percentage of cells with errors, was 2.0%. When 23 undergraduates attempted to audit their models through code inspection, only three with incorrect spreadsheets (15%) produced clean spreadsheets when they finished the audit.

Spreadsheets are widely used, and studies have shown that most end-user spreadsheets contain non-trivial errors. Most of the currently available tools that try to mitigate this problem require varying levels of user intervention. This paper presents a system, called UCheck, that detects errors in spreadsheets automatically. UCheck carries out automatic header and unit inference, and reports unit errors to the users. UCheck is based on two static analysis phases that infer header and unit information for all cells in a spreadsheet. We have tested UCheck on a wide variety of spreadsheets and found that it works accurately and reliably. The system was also used in a continuing education course for high school teachers, conducted through Oregon State University, aimed at making the participants aware of the need for quality control in the creation of spreadsheets.

We present a reasoning system for inferring dimension information in spreadsheets. This system can be used to check the consistency of spreadsheet formulas and thus is able to detect errors in spreadsheets. Our approach is based on three static analysis components. First, the spatial structure of the spreadsheet is analyzed to infer a labeling relationship among cells. Second, cells that are used as labels are lexically analyzed and mapped to potential dimensions. Finally, dimension information is propagated through spreadsheet formulas. An important aspect of the rule system defining dimension inference is that it works bi-directionally, that is, not only “downstream” from referenced arguments to the current cell, but also “upstream” in the reverse direction. This flexibility makes the system robust and turns out to be particularly useful in cases when the initial dimension information that can be inferred from headers is incomplete or ambiguous. We have implemented a prototype system as an add-in to Excel. In an evaluation of this implementation we were able to detect dimension errors in almost 50% of the investigated spreadsheets, which shows (i) that the system works reliably in practice and (ii) that dimension information can be well exploited to uncover errors in spreadsheets.
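
As a rough illustration of the third step (propagating dimensions through formulas), the following Python sketch checks a formula tree against known cell dimensions. It is not the authors' rule system: it covers only the "downstream" direction, and the names `infer`, `combine`, `DimensionError`, the tuple encoding of formulas, and the cell dimensions in `env` are all assumptions made for this example.

```python
class DimensionError(Exception):
    """Raised when a formula combines incompatible dimensions."""


def combine(a, b, sign):
    """Add (sign=+1) or subtract (sign=-1) the exponents of b to or from a."""
    out = dict(a)
    for name, exp in b.items():
        out[name] = out.get(name, 0) + sign * exp
        if out[name] == 0:
            del out[name]  # drop cancelled base dimensions
    return out


def infer(expr, env):
    """Infer the dimension of a formula.

    expr is either a cell name (str) or a tuple (op, left, right) with
    op in {"+", "-", "*", "/"}; env maps cell names to dimensions, where
    a dimension is a dict from base dimension name to integer exponent.
    """
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    dl, dr = infer(left, env), infer(right, env)
    if op in ("+", "-"):
        # Addition and subtraction require identical dimensions.
        if dl != dr:
            raise DimensionError(f"cannot apply {op} to {dl} and {dr}")
        return dl
    if op == "*":
        return combine(dl, dr, +1)
    if op == "/":
        return combine(dl, dr, -1)
    raise ValueError(f"unknown operator {op!r}")


# Hypothetical cells: B2 holds a length, C2 holds a time.
env = {"B2": {"length": 1}, "C2": {"time": 1}}
speed = infer(("/", "B2", "C2"), env)
```

Here `=B2/C2` is accepted and yields a length-per-time dimension, while `=B2+C2` would raise `DimensionError`. The bi-directional inference described in the abstract would additionally need to solve for unknown cell dimensions, which this sketch does not attempt.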

Exception handling is widely regarded as a necessity in programming languages today and almost every programming language currently used for professional software development supports some form of it. However, spreadsheet systems, which may be the most widely used type of “programming language” today in terms of number of users using it to create “programs” (spreadsheets), have traditionally had only extremely limited support for exception handling. Spreadsheet system users range from end users to professional programmers and this wide range suggests that an approach to exception handling for spreadsheet systems needs to be compatible with the equational reasoning model of spreadsheet formulas, yet feature expressive power comparable to that found in other programming languages. We present an approach to exception handling for spreadsheet system users that is aimed at this goal. Some of the features of the approach are new; others are not new, but their effects on the programming language properties of spreadsheet systems have not been discussed before in the literature. We explore these properties, offer our solutions to problems that arise with these properties, and compare the functionality of the approach with that of exception handling approaches in other languages.

Spreadsheets are very common for information processing to support decision making by both professional developers and non-technical end users. Moreover, business intelligence and artificial intelligence are increasingly popular in the industry nowadays, where spreadsheets have been used as, or integrated into, intelligent or expert systems in various application domains. However, it has been repeatedly reported that faults often exist in operational spreadsheets, which could severely compromise the quality of conclusions and decisions based on the spreadsheets. With a view to systematically examining this problem via survey of existing work, we have conducted a comprehensive literature review on the quality issues and related techniques of spreadsheets over a 35.5-year period (from January 1987 to June 2022) for target journals and a 10.5-year period (from January 2012 to June 2022) for target conferences. Among other findings, two major ones are: (a) Spreadsheet quality is best addressed throughout the whole spreadsheet life cycle, rather than just focusing on a few specific stages of the life cycle. (b) Relatively more studies focus on spreadsheet testing and debugging (related to fault detection and removal) when compared with spreadsheet specification, modeling, and design (related to development). As prevention is better than cure, more research should be performed on the early stages of the spreadsheet life cycle. Enlightened by our comprehensive review, we have identified the major research gaps as well as highlighted key research directions for future work in the area.

Based on (1) research into mutation testing for general purpose programming languages, and (2) spreadsheet errors that have been reported in the literature, we have developed a suite of mutation operators for spreadsheets. We present an evaluation of the mutation adequacy of du-adequate test suites generated by a constraint-based automatic test-case generation system we have developed in previous work. The results of the evaluation suggest additional constraints that can be incorporated into the system to target mutation adequacy. In addition to being useful in mutation testing of spreadsheets, the operators can be used in the evaluation of error-detection tools and also for seeding spreadsheets with errors for empirical studies. We describe two case studies where the suite of mutation operators helped us carry out such empirical evaluations. The main contribution of this paper is a suite of mutation operators for spreadsheets that can be used for carrying out empirical evaluations of spreadsheet tools to indicate ways in which the tools can be improved.
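
To make the idea of a spreadsheet mutation operator concrete, here is a minimal Python sketch of one classic operator, arithmetic operator replacement, applied to a formula string. The abstract does not list the operators in the paper's suite, so this particular operator and the function name `arithmetic_operator_mutants` are assumptions for illustration. The sketch works character by character and does not handle string literals, unary minus, or signs inside numeric constants.

```python
from typing import List

ARITH_OPS = "+-*/"


def arithmetic_operator_mutants(formula: str) -> List[str]:
    """Return every formula obtained by swapping one arithmetic operator.

    Each mutant differs from the input in exactly one character, which is
    the usual single-change convention in mutation testing.
    """
    mutants = []
    for i, ch in enumerate(formula):
        if ch in ARITH_OPS:
            for replacement in ARITH_OPS:
                if replacement != ch:
                    mutants.append(formula[:i] + replacement + formula[i + 1:])
    return mutants


# A formula with two operators yields 2 * 3 = 6 mutants.
mutants = arithmetic_operator_mutants("=A1+B1*2")
```

A test suite is then judged by how many of these mutants it "kills", that is, for how many mutants at least one test case produces a value different from the original formula's.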

Spreadsheet programs are probably the most successful example of end-user software development tools and are used for a variety of purposes. Like any type of software, they are prone to error, in particular as they are usually developed by non-programmers. While various techniques exist to support the developer in finding errors in procedural programs, the tool support for spreadsheet debugging is still limited. In this paper, we show how techniques from model-based diagnosis can be applied and extended for spreadsheet debugging by translating the relevant parts of a spreadsheet to a constraint satisfaction problem. We additionally propose both problem-specific and generalizable extensions to the classical diagnosis algorithms which help to detect potential problems in a spreadsheet based on user-provided test cases more efficiently. The proposed techniques were integrated into a modular framework for spreadsheet debugging and evaluated with respect to scalability based on a number of real-world and artificially created spreadsheets. An additional error detection exercise involving 24 subjects was performed to assess the general applicability of such advanced spreadsheet debugging techniques for end users.

Errors are prevalent in spreadsheets and can be extremely difficult to find. A number of audits of existing spreadsheets have been reported, but few details have been given about how the audits were performed. We developed and tested a new spreadsheet auditing protocol designed to find errors in operational spreadsheets. Our work showed which auditing procedures, used in what sequence and combination, were most effective across a wide range of spreadsheets. It also provided useful information on the size and complexity of operational spreadsheets, as well as the frequency with which certain types of errors occur.

In contrast to the common view of spreadsheets as “single-user” programs, we have found that spreadsheets offer surprisingly strong support for cooperative development of a wide variety of applications. Ethnographic interviews with spreadsheet users showed that nearly all of the spreadsheets used in the work environments studied were the result of collaborative work by people with different levels of programming and domain expertise. We describe how spreadsheet users cooperate in developing, debugging and using spreadsheets. We examine the properties of spreadsheet software that enable cooperation, arguing that: (1) the division of the spreadsheet into two distinct programming layers permits effective distribution of computational tasks across users with different levels of programming skill; and (2) the spreadsheet's strong visual format for structuring and presenting data supports sharing of domain knowledge among co-workers.

Labels in spreadsheets can be exploited for finding formula errors in two principally different ways. First, the spatial relationships between labels and other cells express simple constraints on the cells' usage in formulas. Second, labels can be interpreted as units of measurements to provide semantic information about the data being combined in formulas, which results in different kinds of constraints. In this paper we demonstrate how both approaches can be combined into an integrated analysis, which is able to find significantly more errors in spreadsheets than each of the individual approaches. In particular, the integrated system is able to detect errors that cannot be found by either of the individual approaches alone, which shows that the integrated system provides an added value beyond the mere combination of its parts. We also compare the effectiveness of this combined approach with several other conceivable combinations of the involved components and identify a system that seems most effective to find spreadsheet formula errors based on label and unit-of-measurement information.

The issues of 'usability' and 'learnability' are assuming an increasingly important role for both the designers of software and their prospective customers. Objective measures of the interaction between system and user are important for the development of software that is both easy to learn and pleasurable to use. In this study we apply a set of five measures to evaluate users' interactions with spreadsheet software, and to compare two spreadsheet packages. We tested 16 people with no previous experience of spreadsheets and 16 with experience of spreadsheets generally though not of the spreadsheet we gave them. Half were allocated to learn Excel and half to learn Wingz, running on Apple Macintosh computers. A standard task was constructed to assess understanding of the basic concepts involved in the use of spreadsheets. Users' previous experience of spreadsheet use was the most salient factor in the scores achieved on the task. The brand of spreadsheet had no significant effect on task performance. Implications for designers of software and users of spreadsheet packages are discussed.

Spreadsheets are very widely used at various levels of the organization. Studies have shown that errors do occur frequently during the development of spreadsheet models. Empirical studies of operational spreadsheet models show that a large percentage of them contain errors. However, the identification of errors is difficult and tedious, and errors have led to drastically wrong decisions. It is thus important to develop new strategies and auditing tools for detecting errors. A suite of new auditing visualization tools has been designed and implemented in Visual Basic for Applications (VBA), as an add-in module for easy inclusion in any Excel 97 or Excel 2000 installation. Furthermore, four strategies are proposed for detecting errors. These range from an overview strategy to identify logical components of the spreadsheet model, to specific strategies targeted at specific types of error. Illustrations show how these strategies can be supported with the new visualization tools.

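A spreadsheet is an analysis tool with a friendly user interface that provides a simple computational model. Compared with an RDBMS, however, spreadsheets lack a sharing mechanism, so multiple copies of a spreadsheet have to be created. In addition, spreadsheets cannot provide scalable computation. To address the problems of sharing and scalability, we propose automatically translating Excel computations into SQL. Data can be entered from within a system, computations are defined in the system with Excel-like formulas, and each computation is then translated into and stored as an SQL view, through which the system executes the Excel computation. Experiments show that the approach is feasible and effective.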
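
The translation of a spreadsheet computation into an SQL view can be sketched for the simplest case, a row-wise formula filled down a column. The following Python example is only an assumption about what such a translation might look like, not the system described in the abstract: the function `row_formula_to_view`, the table `sales`, and its columns are hypothetical, identifiers are not sanitized, and only relative references on the same row are handled. It uses SQLite through Python's standard `sqlite3` module to run the generated view.

```python
import re
import sqlite3


def row_formula_to_view(view, table, col_names, new_col, formula):
    """Translate a row-wise formula such as "=B2*C2" into CREATE VIEW SQL.

    col_names maps spreadsheet column letters to table column names.
    Row numbers are dropped, because the view applies the expression to
    every row of the table.
    """
    expr = formula.lstrip("=")
    # Replace each cell reference (letters followed by digits) with the
    # name of the table column that its column letter stands for.
    expr = re.sub(r"\b([A-Z]+)\d+\b", lambda m: col_names[m.group(1)], expr)
    cols = ", ".join(col_names.values())
    return f"CREATE VIEW {view} AS SELECT {cols}, {expr} AS {new_col} FROM {table}"


# Hypothetical data: column A = item, B = price, C = qty.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, price REAL, qty INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [("pen", 2.0, 3), ("book", 10.0, 2)])

sql = row_formula_to_view("sales_total", "sales",
                          {"A": "item", "B": "price", "C": "qty"},
                          "total", "=B2*C2")
conn.execute(sql)
rows = conn.execute("SELECT item, total FROM sales_total ORDER BY item").fetchall()
```

Because the result is a view rather than a copied sheet, every client that queries `sales_total` sees the same computation over the shared table, which is the sharing benefit the abstract points to.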

