EVALUATION TECHNIQUES
- Evaluation
  - tests usability, functionality and acceptability of an interactive system
  - occurs in laboratory, field and/or in collaboration with users
  - evaluates both design and implementation
  - should be considered at all stages in the design life cycle
GOALS OF EVALUATION
- assess extent of system functionality
- assess effect of interface on user
- identify specific problems
COGNITIVE WALKTHROUGH
- Proposed by Polson et al.
  - requires a detailed review of a sequence of actions
  - evaluates design on how well it supports user in learning task
  - usually performed by expert in cognitive psychology
  - expert 'walks through' design to identify potential problems using psychological principles
  - main focus: 'learning through exploration'
  - forms used to guide analysis
COGNITIVE WALKTHROUGH
Requirements:
  i. a specification / prototype of the system
  ii. a description of the task the user is to perform on the system
  iii. a written list of actions needed to complete the task
  iv. an indication of the users' experience & knowledge
COGNITIVE WALKTHROUGH (CTD)
- For each task walkthrough considers
  - what impact will interaction have on user?
  - what cognitive processes are required?
  - what learning problems may occur?
IN EACH ACTION PERFORM:
Analysis focuses on goals and knowledge: does the design lead the user to generate the correct goals?
- Will users see that the action is available?
- Do users know that the action is the one they need?
- After the action is taken, will users understand the feedback they get?
HEURISTIC EVALUATION
- Heuristics
  - guidelines / general principles that can guide a design decision or be used to critique a decision that has already been made
- usability criteria (heuristics) are identified
- design examined by experts to see if these are violated
- Example heuristics
  - system behaviour is predictable
  - system behaviour is consistent
  - feedback is provided
- Heuristic evaluation 'debugs' design.
NIELSEN'S 10 HEURISTICS:
- Visibility of system status
- Match between system & real world
- User control & freedom
- Consistency & standards
- Error prevention
- Recognition rather than recall
- Flexibility & efficiency of use
- Aesthetic & minimalist design
- Help users recognize, diagnose & recover from errors
- Help & documentation
MODEL-BASED EVALUATION
- Results from the literature used to support or refute parts of design.
- Care needed to ensure results are transferable to new design.
- Cognitive models used to filter design options
  - e.g. GOMS (goals, operators, methods, selection rules) prediction of user performance.
- Design rationale can also provide useful evaluation information
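
The kind of performance prediction a GOMS-style model gives can be sketched with the Keystroke-Level Model, a simplified member of the GOMS family that is not named on the slide. This is an illustrative sketch only: the operator times are the commonly quoted averages from Card, Moran and Newell, and the two candidate designs and their operator sequences are hypothetical.

```python
# Keystroke-Level Model (KLM) sketch: predict expert task time by summing
# standard operator times. The designs compared below are hypothetical.

# Commonly quoted average operator times, in seconds.
OPERATOR_TIMES = {
    "K": 0.28,  # press a key or button (average typist)
    "P": 1.10,  # point with the mouse at a target
    "H": 0.40,  # move hand between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def predict_time(operators):
    """Return the predicted execution time in seconds for a string such as 'MHPK'."""
    return sum(OPERATOR_TIMES[op] for op in operators)


# Two hypothetical designs for the same task.
designs = {
    "menu": "MHPKPK",   # think, reach for mouse, point+click menu, point+click item
    "shortcut": "MKK",  # think, press a two-key shortcut
}

predictions = {name: predict_time(ops) for name, ops in designs.items()}
fastest = min(predictions, key=predictions.get)
print(predictions, "fastest:", fastest)
```

Used this way, the model filters design options before any user is involved; the predictions only cover error-free expert performance.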
LABORATORY STUDIES
- Advantages:
  - specialist equipment available
  - uninterrupted environment
- Disadvantages:
  - lack of context
  - difficult to observe several users cooperating
- Appropriate
  - if system location is dangerous or impractical
  - for constrained single-user systems
  - to allow controlled manipulation of use
FIELD STUDIES
- Advantages:
  - natural environment
  - context retained
  - interaction occurs in actual use
- Disadvantages:
  - distractions
  - noise
- Appropriate
  - where context is crucial
  - for longitudinal studies
Evaluating Implementations
Requires an artefact: simulation, prototype, full implementation
EMPIRICAL METHODS: EXPERIMENTAL EVALUATION
- controlled evaluation of specific aspects of interactive behaviour
- evaluator chooses hypothesis to be tested
- a number of experimental conditions are considered which differ only in the value of some controlled variable.
- changes in behavioural measure are attributed to different conditions
EXPERIMENTAL FACTORS
- Participants
  - who: representative, sample size
- Variables
  - things to modify and measure
- Hypothesis
  - what you'd like to show
- Experimental design
  - how you are going to do it
VARIABLES
- independent variable (IV)
  - characteristic changed to produce different conditions
  - e.g. interface style, number of menu items
- dependent variable (DV)
  - characteristics measured in the experiment
  - e.g. time taken, number of errors.
HYPOTHESIS
- prediction of outcome
  - framed in terms of IV and DV
  - e.g. "error rate will increase as font size decreases"
- null hypothesis:
  - states no difference between conditions
  - aim is to disprove this
  - e.g. null hyp. = "no change with font size"
EXPERIMENTAL DESIGN
- 'within subjects'
  - each subject performs experiment under each condition.
- 'between subjects'
  - each subject performs under only one condition
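
A minimal sketch of how the two designs allocate participants to conditions. The participant IDs and condition names are hypothetical, and the shuffled round-robin split used for the between-subjects case is only one possible allocation scheme.

```python
import random


def between_subjects(participants, conditions, seed=0):
    """Assign each participant to exactly one condition."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    # Deal the shuffled participants round-robin so group sizes stay balanced.
    return {p: [conditions[i % len(conditions)]] for i, p in enumerate(shuffled)}


def within_subjects(participants, conditions):
    """Every participant performs the experiment under every condition."""
    return {p: list(conditions) for p in participants}


participants = ["P1", "P2", "P3", "P4"]    # hypothetical participant IDs
conditions = ["menus", "command line"]     # hypothetical values of the IV

between = between_subjects(participants, conditions)
within = within_subjects(participants, conditions)
print("between:", between)
print("within:", within)
```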
ANALYSIS OF DATA
- Before you start to do any statistics:
  - look at data
  - save original data
- Choice of statistical technique depends on
  - type of data
  - information required
- Type of data
  - discrete - finite number of values
  - continuous - any value
ANALYSIS - STATISTICAL TECHNIQUES
- parametric
  - e.g. ANOVA (analysis of variance)
  - robust
  - powerful
- non-parametric
  - do not assume normal distribution
  - less powerful
  - more reliable
- contingency table
  - classify data by discrete attributes
  - count number of data items in each group
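
The contingency table described above can be built by classifying each observation and counting the groups. The observations in this sketch are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter

# Hypothetical observations: (interface style, task outcome) per participant.
observations = [
    ("menu", "success"), ("menu", "success"), ("menu", "failure"),
    ("command", "success"), ("command", "failure"), ("command", "failure"),
]

# Count the number of data items in each (row, column) group.
table = Counter(observations)

rows = sorted({style for style, _ in observations})
cols = sorted({outcome for _, outcome in observations})

print("style", cols)
for style in rows:
    print(style, [table[(style, outcome)] for outcome in cols])
```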
ANALYSIS OF DATA (CONT.)
- What information is required?
  - is there a difference?
  - how big is the difference?
  - how accurate is the estimate?
- Parametric and non-parametric tests mainly address the first of these
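
The three questions can be illustrated on a small data set. This is a sketch under stated assumptions: the timing data are hypothetical, and the permutation test and percentile bootstrap used here are stand-in non-parametric techniques chosen because they need only the standard library; the slides do not prescribe them.

```python
import random
from statistics import mean

# Hypothetical task-completion times (seconds) under two interface conditions.
times_a = [12.1, 10.4, 11.8, 13.0, 12.5, 11.1]
times_b = [14.2, 13.5, 15.1, 12.9, 14.8, 13.7]

# How big is the difference? Difference between the condition means.
observed = mean(times_b) - mean(times_a)


def permutation_p_value(a, b, trials=5000, seed=1):
    """Is there a difference? Two-sided permutation test on the mean difference."""
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    actual = abs(mean(b) - mean(a))
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[len(a):]) - mean(pooled[:len(a)]))
        if diff >= actual:
            hits += 1
    return hits / trials


def bootstrap_interval(a, b, trials=5000, seed=1, level=0.95):
    """How accurate is the estimate? Percentile bootstrap interval for the difference."""
    rng = random.Random(seed)
    diffs = sorted(
        mean(rng.choices(b, k=len(b))) - mean(rng.choices(a, k=len(a)))
        for _ in range(trials)
    )
    lo = diffs[int((1 - level) / 2 * trials)]
    hi = diffs[int((1 + level) / 2 * trials) - 1]
    return lo, hi


p_value = permutation_p_value(times_a, times_b)
low, high = bootstrap_interval(times_a, times_b)
print(f"difference = {observed:.2f} s, p = {p_value:.4f}, interval = ({low:.2f}, {high:.2f})")
```

A small p-value answers only the first question; the difference and its interval are what show whether the effect matters in practice.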
EXPERIMENTAL STUDIES ON GROUPS
More difficult than single-user experiments
Problems with:
  - subject groups
  - choice of task
  - data gathering
  - analysis
SUBJECT GROUPS
- larger number of subjects => more expensive
- longer time to 'settle down' ... even more variation!
- difficult to timetable
- so ... often only three or four groups
DATA GATHERING
- several video cameras + direct logging of application
- problems:
  - synchronisation
  - sheer volume!
- one solution:
  - record from each perspective
THINK ALOUD
- user observed performing task
- user asked to describe what they are doing and why, what they think is happening etc.
- Advantages
  - simplicity - requires little expertise
  - can provide useful insight
  - can show how system is actually used
- Disadvantages
  - subjective
  - selective
  - act of describing may alter task performance
COOPERATIVE EVALUATION
- variation on think aloud
- user collaborates in evaluation
- both user and evaluator can ask each other questions throughout
- Additional advantages
  - less constrained and easier to use
  - user is encouraged to criticize system
  - clarification possible
PROTOCOL ANALYSIS
Methods for recording user actions:
- paper and pencil - cheap, limited to writing speed
- audio - good for think aloud, difficult to match with other protocols
- video - accurate and realistic, needs special equipment, obtrusive
- computer logging - automatic and unobtrusive, large amounts of data difficult to analyze
- user notebooks - coarse and subjective, useful insights, good for longitudinal studies
Mixed use in practice.
Audio/video transcription is difficult and requires skill.
Some automatic tools available.
AUTOMATED ANALYSIS - EVA (EXPERIMENTAL VIDEO ANNOTATOR)
- Workplace project
- Post task walkthrough
  - user reacts to action after the event
  - used to fill in intention
- Advantages
  - analyst has time to focus on relevant incidents
  - avoids excessive interruption of task
- Disadvantages
  - lack of freshness
  - may be post-hoc interpretation of events
POST-TASK WALKTHROUGHS
- transcript played back to participant for comment
  - immediately: fresh in mind
  - delayed: evaluator has time to identify questions
- useful to identify reasons for actions and alternatives considered
- necessary in cases where think aloud is not possible
INTERVIEWS
- analyst questions user on one-to-one basis, usually based on prepared questions
- informal, subjective and relatively cheap
- Advantages
  - can be varied to suit context
  - issues can be explored more fully
  - can elicit user views and identify unanticipated problems
- Disadvantages
  - very subjective
  - time consuming
QUESTIONNAIRES
- Set of fixed questions given to users
- Advantages
  - quick and reaches large user group
  - can be analyzed more rigorously
- Disadvantages
  - less flexible
  - less probing
QUESTIONNAIRES (CTD)
- Need careful design
  - what information is required?
  - how are answers to be analyzed?
- Styles of question
  - general
  - open-ended
  - scalar
  - multi-choice
  - ranked
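
Deciding how answers will be analyzed is easiest for scalar questions. A minimal sketch, assuming a hypothetical 5-point agreement scale and made-up responses; treating the scale as ordinal and reporting the median is one common choice, not the only one.

```python
from collections import Counter
from statistics import median

# Hypothetical responses to one scalar question,
# 1 = strongly disagree ... 5 = strongly agree.
responses = [4, 5, 3, 4, 2, 4, 5, 3]

counts = Counter(responses)   # how many participants chose each scale point
central = median(responses)   # ordinal data: median rather than mean
agree_share = sum(1 for r in responses if r >= 4) / len(responses)

for point in range(1, 6):
    print(point, "#" * counts[point])
print("median:", central, "share agreeing:", agree_share)
```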
EYE TRACKING
- head or desk mounted equipment tracks the position of the eye
- eye movement reflects the amount of cognitive processing a display requires
- measurements include
  - fixations: eye maintains stable position. Number and duration indicate level of difficulty with display
  - saccades: rapid eye movement from one point of interest to another
  - scan paths: moving straight to a target with a short fixation at the target is optimal
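
Fixations have to be extracted from the raw gaze samples before their number and duration can be counted. The sketch below uses a dispersion-threshold approach (often called I-DT), which is one common technique and not necessarily what any particular eye tracker does. The thresholds, the sample format (time in ms, x, y in pixels) and the gaze data are all hypothetical.

```python
def dispersion(window):
    """Spread of a group of gaze samples: x-range plus y-range."""
    xs = [x for _, x, _ in window]
    ys = [y for _, _, y in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, max_dispersion=25.0, min_duration=100):
    """Return fixations found in samples, a time-sorted list of (time_ms, x, y)."""
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        # Grow the window until it spans at least min_duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration:
            j += 1
        if j >= n:
            break
        if dispersion(samples[i:j + 1]) <= max_dispersion:
            # Extend the window while the eye stays in roughly the same place.
            while j + 1 < n and dispersion(samples[i:j + 2]) <= max_dispersion:
                j += 1
            window = samples[i:j + 1]
            fixations.append({
                "start": window[0][0],
                "duration": window[-1][0] - window[0][0],
                "x": sum(x for _, x, _ in window) / len(window),
                "y": sum(y for _, _, y in window) / len(window),
            })
            i = j + 1
        else:
            i += 1  # window too spread out: treat the sample as part of a saccade
    return fixations


# Hypothetical gaze data sampled every 20 ms: a fixation near (100, 100),
# one in-flight saccade sample, then a fixation near (500, 300).
samples = [(t, 100 + (t // 20) % 2, 100) for t in range(0, 140, 20)]
samples.append((140, 300, 200))
samples += [(t, 500, 300 + (t // 20) % 2) for t in range(160, 320, 20)]

fixations = detect_fixations(samples)
for f in fixations:
    print(f)
```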
PHYSIOLOGICAL MEASUREMENTS
- emotional response linked to physical changes
- these may help determine a user's reaction to an interface
- measurements include:
  - heart activity, including blood pressure, volume and pulse
  - activity of sweat glands: Galvanic Skin Response (GSR)
  - electrical activity in muscle: electromyogram (EMG)
  - electrical activity in brain: electroencephalogram (EEG)
- some difficulty in interpreting these physiological responses - more research needed
CHOOSING AN EVALUATION METHOD
Factors that distinguish evaluation techniques:
- when in process: design vs. implementation
- style of evaluation: laboratory vs. field
- how objective: subjective vs. objective
- type of measures: qualitative vs. quantitative
- level of information: high level vs. low level
- level of interference: obtrusive vs. unobtrusive
- resources available: time, subjects, equipment, expertise