期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Criteria For Evaluating Usability Evaluation Methods

《International journal of human-computer interaction》2013,29(1):145-181

The current variety of alternative approaches to usability evaluation methods (UEMs) designed to assess and improve usability in software systems is offset by a general lack of understanding of the capabilities and limitations of each. Practitioners need to know which methods are more effective and in what ways and for what purposes. However, UEMs cannot be evaluated and compared reliably because of the lack of standard criteria for comparison. In this article, we present a practical discussion of factors, comparison criteria, and UEM performance measures useful in studies comparing UEMs. In demonstrating the importance of developing appropriate UEM evaluation criteria, we offer operational definitions and possible measures of UEM performance. We highlight specific challenges that researchers and practitioners face in comparing UEMs and provide a point of departure for further discussion and refinement of the principles and techniques used to approach UEM evaluation and comparison. 相似文献

2.

CASSM and cognitive walkthrough: usability issues with ticket vending machines

Iain Connell Ann Blandford Thomas Green 《Behaviour & Information Technology》2004,23(5):307-320

We focus on the ability of two analytical usability evaluation methods (UEMs), namely CASSM (Concept-based Analysis for Surface and Structural Misfits) and Cognitive Walkthrough, to identify usability issues underlying the use made of two London Underground ticket vending machines. By setting both sets of issues against the observed interactions with the machines, we assess the similarities and differences between the issues depicted by the two methods. In so doing we de-emphasise the mainly quantitative approach which is typical of the comparative UEM literature. However, by accounting for the likely consequences of the issues in behavioural terms, we reduced the proportion of issues which were anticipated but not observed (the false positives), compared with that achieved by other UEM studies. We assess these results in terms of the limitations of problem count as a measure of UEM effectiveness. We also discuss the likely trade-offs between field studies and laboratory testing. 相似文献

3.

What you get is what you see: revisiting the evaluator effect in usability tests

Morten Hertzum Rolf Molich Niels Ebbe Jacobsen 《Behaviour & Information Technology》2014,33(2):144-162

Usability evaluation is essential to user-centred design; yet, evaluators who analyse the same usability test sessions have been found to identify substantially different sets of usability problems. We revisit this evaluator effect by having 19 experienced usability professionals analyse video-recorded test sessions with five users. Nine participants analysed moderated sessions; 10 participants analysed unmoderated sessions. For the moderated sessions, participants reported an average of 33% of the problems reported by all nine of these participants and 50% of the subset of problems reported as critical or serious by at least one participant. For the unmoderated sessions, the percentages were 32% and 40%. Thus, the evaluator effect was similar for moderated and unmoderated sessions, and it was substantial for the full set of problems and still present for the most severe problems. In addition, participants disagreed in their severity ratings. As much as 24% (moderated) and 30% (unmoderated) of the problems reported by multiple participants were rated as critical by one participant and minor by another. The majority of the participants perceived an evaluator effect when merging their individual findings into group evaluations. We discuss reasons for the evaluator effect and recommend ways of managing it. 相似文献

4.

Dogmas in the assessment of usability evaluation methods

《Behaviour & Information Technology》2012,31(1):97-111

Usability evaluation methods (UEMs) are widely recognised as an essential part of systems development. Assessments of the performance of UEMs, however, have been criticised for low validity and limited reliability. The present study extends this critique by describing seven dogmas in recent work on UEMs. The dogmas include using inadequate procedures and measures for assessment, focusing on win–lose outcomes, holding simplistic models of how usability evaluators work, concentrating on evaluation rather than on design and working from the assumption that usability problems are real. We discuss research approaches that may help move beyond the dogmas. In particular, we emphasise detailed studies of evaluation processes, assessments of the impact of UEMs on design carried out in real-world systems development and analyses of how UEMs may be combined. 相似文献

5.

A Study of the Evaluator Effect in Usability Testing

Kasper Hornbæk Erik Frøkjær 《Human-Computer Interaction》2013,28(3):251-277

ABSTRACT

The evaluator effect names the observation that usability evaluators in similar conditions identify substantially different sets of usability problems. Yet little is known about the factors involved in the evaluator effect. We present a study of 50 novice evaluators' usability tests and subsequent comparisons, in teams and individually, of the resulting usability problems. The same problems were analyzed independently by 10 human–computer interaction experts. The study shows an agreement between evaluators of about 40%, indicating a substantial evaluator effect. Team matching of problems following the individual matching appears to improve the agreement, and evaluators express greater satisfaction with the teams' matchings. The matchings of individuals, teams, and independent experts show evaluator effects of similar sizes; yet individuals, teams, and independent experts fundamentally disagree about which problems are similar. Previous claims in the literature about the evaluator effect are challenged by the large variability in the matching of usability problems; we identify matching as a key determinant of the evaluator effect. An alternative view of usability problems and evaluator agreement is proposed in which matching is seen as an activity that helps to make sense of usability problems and where the existence of a correct matching is not assumed. 相似文献

6.

An interdisciplinary heuristic evaluation method for universal building design

Afacan Y Erbug C 《Applied ergonomics》2009,40(4):731-69

This study highlights how heuristic evaluation as a usability evaluation method can feed into current building design practice to conform to universal design principles. It provides a definition of universal usability that is applicable to an architectural design context. It takes the seven universal design principles as a set of heuristics and applies an iterative sequence of heuristic evaluation in a shopping mall, aiming to achieve a cost-effective evaluation process. The evaluation was composed of three consecutive sessions. First, five evaluators from different professions were interviewed regarding the construction drawings in terms of universal design principles. Then, each evaluator was asked to perform the predefined task scenarios. In subsequent interviews, the evaluators were asked to re-analyze the construction drawings. The results showed that heuristic evaluation could successfully integrate universal usability into current building design practice in two ways: (i) it promoted an iterative evaluation process combined with multi-sessions rather than relying on one evaluator and on one evaluation session to find the maximum number of usability problems, and (ii) it highlighted the necessity of an interdisciplinary ad hoc committee regarding the heuristic abilities of each profession. A multi-session and interdisciplinary heuristic evaluation method can save both the project budget and the required time, while ensuring a reduced error rate for the universal usage of the built environments. 相似文献

7.

Problem Prioritization in Usability Evaluation: From Severity Assessments Toward Impact on Design

《International journal of human-computer interaction》2013,29(2):125-146

Severity assessments enable prioritization of problems encountered during usability evaluations, and thereby provide a device for guiding the utilization of design resources. However, designers' response to usability evaluations is also influenced by other factors, which may overshadow severity. With the purpose of enhancing the impact of severity assessments, this study combines a field study of factors that influence the impact of evaluations with an experimental study of severity assessments made during usability inspections. The results show that even in a project receptive to input from evaluations, their impact was highly dependent on conducting evaluations early. This accorded with an informal method that blended elements of usability evaluation and participatory design and could be extended with user-made severity assessments. The major cost associated with the evaluations was not finding but fixing problems, emphasizing that, to be effective, severity assessments must be reliable, valid, and sufficiently persuasive to justify the cost of fixing problems. For the usability inspections, evaluators' ratings of problem impact and persistence were weakly correlated with the number of evaluators reporting a problem, indicating that different evaluators represent different subgroups of users or alternatively that evaluator-made severity assessments are of questionable reliability. To call designers' attention to the severe problems, the halving of the severity sum is proposed as a means of visualizing the large payoff of fixing a high-severity problem and, conversely, the modest potential of spending resources on low-severity problems. 相似文献

8.

认知协作遍历检查在可用性评估中的应用

梁路藤少华《计算机工程》2010,36(13):239-241

在协作任务仿真模型的基础上,提出一种认知协作遍历检查技术,给出其遍历过程算法。通过第1阶段的评估对比实验,证明该可用性评估技术适用于设计早期的协同软件系统,且评估成本较低,效率较高。在第2阶段实验中增加了评估者的数量,用以分析评估者数量及其个体表现对评估结果的影响,并针对一般情况下获得最高收益代缴比的样本数量给出建议。相似文献

9.

Scoping Analytical Usability Evaluation Methods: A Case Study

Ann E. Blandford Joanne K. Hyde Thomas R. G. Green Iain Connell 《Human-Computer Interaction》2013,28(3):278-327

ABSTRACT

Analytical usability evaluation methods (UEMs) can complement empirical evaluation of systems: for example, they can often be used earlier in design and can provide accounts of why users might experience difficulties, as well as what those difficulties are. However, their properties and value are only partially understood. One way to improve our understanding is by detailed comparisons using a single interface or system as a target for evaluation, but we need to look deeper than simple problem counts: we need to consider what kinds of accounts each UEM offers, and why. Here, we report on a detailed comparison of eight analytical UEMs. These eight methods were applied to a robotic arm interface, and the findings were systematically compared against video data of the arm in use. The usability issues that were identified could be grouped into five categories: system design, user misconceptions, conceptual fit between user and system, physical issues, and contextual ones. Other possible categories such as user experience did not emerge in this particular study. With the exception of Heuristic Evaluation, which supported a range of insights, each analytical method was found to focus attention on just one or two categories of issues. Two of the three “home-grown” methods (Evaluating Multimodal Usability and Concept-based Analysis of Surface and Structural Misfits) were found to occupy particular niches in the space, whereas the third (Programmable User Modeling) did not. This approach has identified commonalities and contrasts between methods and provided accounts of why a particular method yielded the insights it did. Rather than considering measures such as problem count or thoroughness, this approach has yielded insights into the scope of each method. 相似文献

10.

Developing usability heuristics with PROMETHEUS: A case study in virtual learning environments

《Computer Standards & Interfaces》2019

Heuristic evaluation is one of the most widely-used methods for evaluating the usability of a software product. Proposed in 1990 by Nielsen and Molich, it consists in having a small group of evaluators performing a systematic revision of a system under a set of guiding principles known as usability heuristics. Although Nielsen’s 10 usability heuristics are used as the de facto standard in the process of heuristic evaluation, recent research has provided evidence not only for the need of custom domain specific heuristics, but also for the development of methodological processes to create such sets of heuristics. In this work we apply the PROMETHEUS methodology, recently proposed by the authors, to develop the VLEs heuristics: a novel set of usability heuristics for the domain of virtual learning environments. In addition to the development of these heuristics, our research serves as further empirical validation of PROMETHEUS. To validate our results we performed an heuristic evaluation using both VLEs and Nielsen’s heuristics. Our design explicitly controls the effect of evaluator variability by using a large number of evaluators. Indeed, for both sets of heuristics the evaluation was performed independently by 7 groups of 5 evaluators each. That is, there were 70 evaluators in total, 35 using VLEs and 35 using Nielsen’s heuristics. In addition, we perform rigorous statistical analyses to establish the validity of the novel VLEs heuristics. The results show that VLEs perform better than Nielsen’s heuristics, finding more problems, which are also more relevant to the domain, as well as satisfying other quantitative and qualitative criteria. Finally, in contrast to evaluators using Nielsen’s heuristics, evaluators using VLEs heuristics reported greater satisfaction regarding utility, clarity, ease of use, and need of additional elements. 相似文献

11.

Cultural cognition in usability evaluation

《Interacting with computers》2009,21(3):212-220

We discuss the impact of cultural differences on usability evaluations that are based on the thinking-aloud method (TA). The term ‘cultural differences’ helps distinguish differences in the perception and thinking of Westerners (people from Western Europe and US citizens with European origins) and Easterners (people from China and the countries heavily influenced by its culture). We illustrate the impact of cultural cognition on four central elements of TA: (1) instructions and tasks, (2) the user’s verbalizations, (3) the evaluator’s reading of the user, and (4) the overall relationship between user and evaluator. In conclusion, we point to the importance of matching the task presentation to users’ cultural background, the different effects of thinking aloud on task performance between Easterners and Westerners, the differences in nonverbal behaviour that affect usability problem detection, and, finally, the complexity of the overall relationship between a user and an evaluator with different cultural backgrounds. 相似文献

12.

Evaluating the Information Behaviour methods: Formative evaluations of two methods for assessing the functionality and usability of electronic information resources

Stephann Makri Ann Blandford Anna L. Cox Simon Attfield Claire Warwick 《International journal of human-computer studies》2011,69(7-8):455-482

The importance of user-centred evaluation is stressed by HCI academics and practitioners alike. However, there have been few recent evaluation studies of User Evaluation Methods (UEMs), especially those with the aim of improving methods rather than assessing their efficacy (i.e. formative rather than summative evaluations). In this article, we present formative evaluations of two new methods for assessing the functionality and usability of a particular type of interactive system—electronic information resources. These serve as an example of an evaluation approach for assessing the success of new HCI methods. We taught the methods to a group of electronic resource developers and collected a mixture of focus group, method usage and summary questionnaire data—all focusing on how useful, usable and learnable the developers perceived the methods to be and how likely they were to use them in the future. Findings related to both methods were generally positive, and useful suggestions for improvement were made. Our evaluation sessions also highlighted a number of trade-offs for the development of UEMs and general lessons learned, which we discuss in order to inform the future development and evaluation of HCI methods. 相似文献

13.

Must evaluation methods be about usability? Devising and assessing the utility inspection method

Guðrún Hulda Jónsdóttir Johannessen 《Behaviour & Information Technology》2014,33(2):195-206

Whereas research in usability evaluation abounds, few evaluation approaches focus on utility. We present the utility inspection method (UIM), which prompts evaluators about the utility of the system they evaluate. The UIM asks whether a system uses global platforms, provides support infrastructure, is robust, gives access to rich content, allows customisation, offers symbolic value and supports companionship among users and between users and developers. We compare 47 participants’ use of UIM and heuristic evaluation (HE). The UIM helps identify more than three times as many problems as HE about the context of activities; HE helps identify 2.5 times as many problems as UIM about the interface. Usability experts consider the problems found with UIM more severe and more complex to solve compared to those found with HE. We argue that UIM complements existing usability evaluation methods and discuss future research on utility inspection. 相似文献

14.

Usability evaluation methods for the web: A systematic mapping study

Adrian Fernandez Emilio Insfran Silvia Abrahão 《Information and Software Technology》2011,53(8):789-817

Context

In recent years, many usability evaluation methods (UEMs) have been employed to evaluate Web applications. However, many of these applications still do not meet most customers’ usability expectations and many companies have folded as a result of not considering Web usability issues. No studies currently exist with regard to either the use of usability evaluation methods for the Web or the benefits they bring.

Objective

The objective of this paper is to summarize the current knowledge that is available as regards the usability evaluation methods (UEMs) that have been employed to evaluate Web applications over the last 14 years.

Method

A systematic mapping study was performed to assess the UEMs that have been used by researchers to evaluate Web applications and their relation to the Web development process. Systematic mapping studies are useful for categorizing and summarizing the existing information concerning a research question in an unbiased manner.

Results

The results show that around 39% of the papers reviewed reported the use of evaluation methods that had been specifically crafted for the Web. The results also show that the type of method most widely used was that of User Testing. The results identify several research gaps, such as the fact that around 90% of the studies applied evaluations during the implementation phase of the Web application development, which is the most costly phase in which to perform changes. A list of the UEMs that were found is also provided in order to guide novice usability practitioners.

Conclusions

From an initial set of 2703 papers, a total of 206 research papers were selected for the mapping study. The results obtained allowed us to reach conclusions concerning the state-of-the-art of UEMs for evaluating Web applications. This allowed us to identify several research gaps, which subsequently provided us with a framework in which new research activities can be more appropriately positioned, and from which useful information for novice usability practitioners can be extracted. 相似文献

15.

Usability Inspection by Metaphors of Human Thinking Compared to Heuristic Evaluation

《International journal of human-computer interaction》2013,29(3):357-374

A new usability inspection technique based on metaphors of human thinking has been experimentally compared to heuristic evaluation (HE). The aim of metaphors of thinking (MOT) is to focus inspection on users' mental activity and to make inspection easily applicable to different devices and use contexts. Building on classical introspective psychology, MOT bases inspection on metaphors of habit formation, stream of thought, awareness and associations, the relation between utterances and thought, and knowing. An experiment was conducted in which 87 novices evaluated a large Web application, and its key developer assessed the problems found. Compared to HE, MOT uncovered usability problems that were assessed as more severe for users and also appeared more complex to repair. The evaluators using HE found more cosmetic problems. The time spent learning and performing an evaluation with MOT was shorter. A discussion of strengths and weaknesses of MOT and HE is provided, which shows how MOT can be an effective alternative or supplement to HE. 相似文献

16.

Argumentation Models for Usability Problem Analysis in Individual and Collaborative Settings 总被引：1，自引：0，他引：1

Ebba Thora Hvannberg Effie L.-C. Law Gyda Halldorsdottir 《International journal of human-computer interaction》2019,35(3):256-273

Consolidating usability problems (UPs) from problem lists from several users can be a cognitively demanding task for evaluators. It has been suggested that collaboration between evaluators can help this process. In an attempt to learn how evaluators make decisions in this process, the authors studied what justification evaluators give for extracting UPs and their consolidation when working both individually and collaboratively. An experiment with eight novice usability evaluators was carried out where they extracted UPs and consolidated them individually and then collaboratively. The data were analyzed by using conventional content analysis and by creating argumentation models according to the Toulmin model. The results showed that during UP, extraction novice usability evaluators could put forward warrants leading to clear claims when probed but seldom added qualifiers or rebuttals. Novice usability evaluators could identify predefined criteria for a UP when probed and this could be acknowledged as a backing to warrants. In the individual settings, novice evaluators had difficulty in presenting claims and warrants for their decisions on consolidation. Although further study is needed, the results of the study indicated that collaborating pairs had a tendency to argue slightly better than individuals. Through the experiment novice evaluators’ reasoning patterns during problem extraction and consolidation as well as during their assessment of severity and confidence could be identified. 相似文献

17.

Barefoot usability evaluations

Anders Bruun Jan Stage 《Behaviour & Information Technology》2014,33(11):1148-1167

相似文献

18.

Training software developers and designers to conduct usability evaluations

《Behaviour & Information Technology》2012,31(4):425-435

Many efforts to improve the interplay between usability evaluation and software development rely either on better methods for conducting usability evaluations or on better formats for presenting evaluation results in ways that are useful for software designers and developers. Both of these approaches depend on a complete division of work between developers and evaluators. This article takes a different approach by exploring whether software developers and designers can be trained to conduct their own usability evaluations. The article is based on an empirical study where 36 teams with a total of 234 first-year university students on software development and design educations were trained through an introductory course in user-based website usability testing that was taught in 40 h. They used the techniques from this course for planning, conducting, and interpreting the results of a usability evaluation of an interactive website. They gained good competence in conducting the evaluation, defining user tasks and producing a usability report, while they were less successful in acquiring skills for identifying and describing usability problems. 相似文献

19.

Expert vs novice collaborative heuristic evaluation (CHE) of a smartphone app for cultural heritage sites

Othman Mohd Kamal Ong Lay Wan Aman Shaziti 《Multimedia Tools and Applications》2022,81(5):6923-6942

This study was conducted to compare CHE between Human-Computer Interaction (HCI) experts and novices in evaluating the Smartphone app for the cultural heritage site. It uses the Smartphone Mobile Application heuRisTics (SMART), focusing on smartphone applications and traditional Nielsen heuristics, focusing on a wider range of interactive systems. Six experts and six novices used the severity rating scale to categorise the severity of the usability issues. These issues were mapped to both heuristics. The study found that experts and novice evaluators identified 19 and 14 usability issues, respectively, with ten as the same usability issues. However, these same usability issues have been rated differently. Although the t-test indicates no significant differences between experts and novices in their ratings for usability issues, these results nevertheless indicate the need for both evaluators in CHE to provide a more comprehensive perspective on the severity of the usability issues. Furthermore, the mapping of the usability issues for Nielsen and SMART heuristics concluded that more issues with the smartphone app could be addressed through smartphone-specific heuristics than general heuristics, indicating a better tool for heuristic evaluation of the smartphone app. This study also provides new insight into the required number of evaluators needed for CHE.

相似文献

20.

A Weighted QFD-Based Usability Evaluation Method for Elderly in Smart Cars

In-Kyung Choi Won-Sup Kim Dong-Soo Kwon 《International journal of human-computer interaction》2015,31(10):703-716

The aim of this research is to develop a quantitative usability evaluation method (UEM) for elderly drivers, which has different weight values on each factor concerning physical and cognitive context of elderly drivers. An analysis of the relationship between universal design guidelines for elderly drivers and usability principles was conducted by using the quality function deployment method. In addition, developmental priorities are derived from analysis results of difficulty in achieving performance improvement, max relationship values, and relative weight. Furthermore, n and positive relationships among the universal design guidelines are defined by means of relationship analysis. Combining these results, a quantitative evaluation guideline for elderly drivers is derived, and based on the context and developmental goals of the developer, selective design is facilitated. The proposed UEM is compared with existing UEM in terms of thoroughness, validity, and effectiveness. 相似文献