Synopsis

The aim of this analysis is to explore whether there is a relationship between a high level of usability of an information system and the number of clients who don’t need help using it. The data analysed come from a user survey. The analysis shows that there might be a positive link between the proportion of users who don’t need help using a service and a high level of usability (the higher the usability level, the fewer users need help). Further analysis is needed to confirm the results and to establish a causal relationship.

Data collection

The data analysed come from a survey conducted in January-February 2014 by the Estonian Ministry of Economic Affairs and Communications. Users of 8 different information systems were questioned:

Building Registry
Electronic Naval Information System
Supervision Information System
Aviation Safety Monitoring System
Registry of Economic Activities
Road Administration Self-Service portal
Port Registry
Consumer Protection Authority’s website (filing consumer complaints)

In total there were 606 answers, and the average response rate was 15%. Because some of the information systems had few answers, the analysis is not conducted at the system level. Instead, the data are analysed by two factors: whether the user needed help, and the mean usability rate (level) of the different usability components. Users were asked to rate different claims (components) of usability on a 4-point scale (0-don’t agree at all … 3-totally agree):

System was easy to use (simplicity)
Next time I know how to use the system in a better way (learnability)
System fulfilled my expectations (expectations)
I was satisfied with architecture (of the user interface) of the system (satisfaction)
I was satisfied with texts in the system (texts)
I could conduct my activities in the system fast enough (speed)
I liked to use the system (pleased)

Users were also asked whether they used help while using the information system (answer options: yes/no) and whether they repeatedly used the information system for the same activity (answer options: yes/no; due to confusing wording this question is left out of the analysis). As far as possible, the questionnaire covered Jakob Nielsen’s usability components.

Analysis

First, a plot is made showing the mean rate (with 95% confidence interval) for each usability component.

[Plot: Mean rate for each usability component]

The minimum rate is 0 and the maximum is 3. A rate of 2 marks the level above which usability is considered high (a rating of 2 means the user more or less agreed that usability is good). As the plot shows, no component except learnability is above this line. The mean learnability rate might fall on the high-usability side because, even if the system is difficult to use this time, the user has gained some experience of how to use it better next time (which mistakes to avoid). That is a positive sign: even if the systems are difficult to use, users can learn to adapt to them. The plot also shows that the mean rates for expectations, satisfaction and speed are the lowest.
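The confidence interval used in the plot can be illustrated with a minimal sketch in base R. The ratings below are made-up values on the 0-3 scale, not survey data, and the helper name mean_ci is hypothetical; the formula is the same t-based interval that the appendix code computes.

```r
# Hypothetical ratings on the 0-3 scale (made-up values, not survey data)
ratings <- c(2, 3, 1, 2, 2, 0, 3, 2, 1, 2)

# t-based confidence interval for the mean:
# mean +/- t quantile * sd / sqrt(n)
mean_ci <- function(x, level = 0.95) {
  n <- length(x)
  m <- mean(x)
  half_width <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(mean = m, lower = m - half_width, upper = m + half_width)
}

ci <- mean_ci(ratings)
print(round(ci, 2))

# Compare the mean with the "high usability" line at 2
mean_above_line <- unname(ci["mean"] > 2)
print(mean_above_line)
```

The same interval is returned by t.test(ratings)$conf.int, which is a convenient cross-check.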

Next, a plot is created which displays the mean usability component rate for those who used help and for those who didn’t.

[Plot: Mean usability rate for users who used/didn’t use help]

As seen from the plot, users who didn’t use help gave, on average, a higher rate for every usability component than users who used help. In some cases (learnability, simplicity) users who didn’t use help gave on average high usability rates (above 2), while users who used help gave on average lower usability rates. These results suggest that there might be a link between a high level of usability and the need for help (as the usability level increases, less help is needed).
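The comparison behind this plot can be sketched in base R as follows. The data frame toy holds made-up values, not survey data; its column names follow the long format produced in the appendix (help, variable, value), and only two components are included for brevity.

```r
# Hypothetical long-format data (made-up values, not survey data)
toy <- data.frame(
  help     = rep(c("yes", "no"), each = 6),
  variable = rep(c("simplicity", "speed"), times = 6),
  value    = c(1, 1, 2, 0, 1, 2,   # users who used help
               3, 2, 2, 1, 3, 2)   # users who didn't use help
)

# Mean rating per component (rows) and help status (columns)
means <- tapply(toy$value, list(toy$variable, toy$help), mean)
print(round(means, 2))

# Difference in mean rating: didn't use help minus used help
difference <- means[, "no"] - means[, "yes"]
print(round(difference, 2))
```

A positive difference for a component means that users who didn’t use help rated it higher on average, which is the pattern described above.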

Next, a plot is created to show the proportion of users who didn’t use help at every usability level (0 (very bad) … 3 (very good)). Usability level 0 (very bad) means that the user didn’t agree with the claim at all, and usability level 3 (very good) means that the user totally agreed with it (for example, for simplicity a very bad usability level means that the user didn’t agree at all that the system was easy to use).

[Plot: Proportion of users who didn’t use help]

As seen from the plot, only about 30% of users who rated usability as very bad didn’t use help, whereas close to 70% of users who rated usability as very good didn’t use help. That is more than twice as many. The only exception is the learnability component, where the proportion of users who didn’t use help declines when moving from usability level very bad to usability level bad. This is somewhat puzzling, and further analysis is needed to find the reason for it.
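The proportions in this plot are row shares of a level-by-help table. A minimal base R sketch for one component, using made-up answers rather than survey data, could look like this; the last lines only redo the arithmetic behind "more than twice" with the approximate figures quoted above.

```r
# Hypothetical answers for one component (made-up values, not survey data)
toy <- data.frame(
  value = c(0, 0, 0, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  help  = c("yes", "yes", "no", "yes", "no", "yes", "no",
            "yes", "no", "no", "no", "no", "yes", "no")
)

# Count answers per usability level and help status
counts <- table(level = toy$value, help = toy$help)

# Share of each help status within a usability level (rows sum to 1)
shares <- prop.table(counts, margin = 1)

# Proportion of users who didn't use help at each level
no_help_share <- shares[, "no"]
print(round(no_help_share, 2))

# Ratio of the approximate shares quoted in the text (70% vs 30%)
ratio <- 0.70 / 0.30
print(round(ratio, 2))
```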

Conclusions and restrictions

These findings suggest that at higher usability levels (rates) a higher proportion of users don’t need help using the information system. The difference might be more than twofold: about 30% of users didn’t use help at the lowest usability level, but about 70% didn’t at the highest. This suggests that usability is an important part of information system design. Bad usability wastes users’ time (finding/using help) and the service owner’s time (for example, providing help via a contact centre). Good usability also helps to achieve the goal of an information system: to automate processes and make using the service as simple as possible (in the best case, no help is needed).

Further analysis is needed to establish whether increasing the usability level really increases the proportion of users who don’t use help. The current study design doesn’t allow us to observe this directly. For example, if we took the users who rated usability very low and increased the usability level for them, would the proportion of users who don’t use help increase? Time series data would be needed for this analysis.

The analysis has some further restrictions. As most of the users who received the questionnaire didn’t answer it, we don’t know how large the systematic bias in the results is (whether those who didn’t answer differ in their preferences and rates). That means we cannot generalize the results, but they still give some hint of what kind of link there is between the usability level and the proportion of users who need help.

Appendix

Code for analysis is here. Code for first plot:

#read in data
data=read.csv("andmed.csv")
#rename columns to English
names(data)=c("register", "simplicity", "learnability", "expectations", 
"satisfaction", "texts", "speed", "pleased", "help" )
#melt data to long format (one row per answer), keeping register as id
library(reshape2)
data.melt=melt(data[,1:8], id.vars="register")

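library(plyr)
#calculate n, mean, sd and 95% CI for each usability component
data.sum.variable=ddply(data.melt, c("variable"), summarize,
                        N    = length(value),
                        mean = mean(value),
                        sd   = sd(value),
                        #half-width of the 95% CI (t quantile times standard error)
                        ci.half = qt(0.975,df=N-1)*sd/sqrt(N),
                        left.ci=mean-ci.half,
                        right.ci=mean+ci.half)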

#plot the mean values with CI
library(ggplot2)
ggplot(data.sum.variable, aes(x=variable, y=mean))+
    geom_point()+
    theme_minimal()+
    geom_errorbar(width=.1, aes(ymin=left.ci, ymax=right.ci))+
    geom_hline(yintercept=2, color="red", linetype=5)+
    ylab("")+
    xlab("")+
    ggtitle("Mean rate for each usability component")

Code for second plot:

#make help variable as factor and recode it
data$help=as.factor(data$help)
data$help <- revalue(data$help, c("0"="yes", "1"="no"))
#melt data to long format, keeping register and help as ids
data.melt2=melt(data[,1:9], id.vars=c("register", "help"))

#calculate n, mean, sd and 95% CI for each question and
#each help level (yes/no)
data.sum.variable3=ddply(data.melt2, c("help", "variable"), summarize,
                         N    = length(value),
                         mean = mean(value),
                         sd   = sd(value),
                         #half-width of the 95% CI (t quantile times standard error)
                         ci.half = qt(0.975,df=N-1)*sd/sqrt(N),
                         left.ci=mean-ci.half,
                         right.ci=mean+ci.half)

#plot mean for each question depending on whether the person used help or not
ggplot(data.sum.variable3, aes(x=help, y=mean))+
    geom_point(size=2)+
    geom_errorbar(width=.1, aes(ymin=left.ci, ymax=right.ci))+
    theme_minimal()+
    geom_hline(yintercept=2, color="red", linetype=5)+
    facet_wrap(~variable)+
    coord_cartesian(ylim = c(1, 2.5))+
    ylab("")+
    xlab("Used help?")+
    ggtitle("Mean usability rate for users who used/didn't use help")

Code for third plot:

#make values as factors
data.melt2$value=as.factor(data.melt2$value)

#count answers in every variable and value level
data.levels=ddply(data.melt2, c("variable", "value"), summarize,
                  N    = length(value))
#count answers in every variable, value and help level
data.levels2=ddply(data.melt2, c("variable", "value","help"), summarize,
                   N    = length(value))
#keep only those who didn't use help
data.levels.no=subset(data.levels2, help=="no")
#add total number of answers in every usability level
#(assumes rows of data.levels and data.levels.no are in the same order)
data.levels.no$all=data.levels$N
#calculate proportion of users who didn't use help
data.levels.no$percentage=data.levels.no$N/data.levels.no$all
#rename variable
names(data.levels.no)[1]="Component"

#plot % of users who didn't use help in each usability rate
library(scales)
ggplot(data.levels.no, aes(x=value, y=percentage, group=Component))+
    geom_line(aes(color=Component), size=1)+
    geom_point(aes(color=Component), size=2)+
    ylab("")+
    xlab("Usability level")+
    scale_x_discrete(breaks=c("0", "1", "2", "3"), 
                     labels=c("Very bad", "Bad", "Good", "Very good"))+
    scale_y_continuous(labels=percent)+
    ggtitle("Proportion of users who didn't use help")+
    coord_cartesian(ylim = c(0.2, 0.75))+
    theme_minimal()
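The code for the third plot attaches the totals to the "no help" counts by row position, which works only if every component and usability level has at least one user who didn’t use help. A possible, more defensive alternative is to join the two tables by key. The sketch below uses base R and made-up values (not survey data) in which nobody at level 0 skipped help; the object names toy, totals, no_help and merged are hypothetical.

```r
# Hypothetical answers (made-up values, not survey data);
# at level 0 nobody answered "no"
toy <- data.frame(
  variable = "speed",
  value    = c(0, 0, 1, 1, 1, 2, 2, 3, 3, 3),
  help     = c("yes", "yes", "yes", "no", "no", "no", "yes", "no", "no", "yes")
)

# Total number of answers per component and usability level
totals <- as.data.frame(table(variable = toy$variable, value = toy$value),
                        responseName = "all", stringsAsFactors = FALSE)

# Number of answers from users who didn't use help
no <- toy[toy$help == "no", ]
no_help <- as.data.frame(table(variable = no$variable, value = no$value),
                         responseName = "N", stringsAsFactors = FALSE)

# Join by key instead of by row position; keep levels with no "no" answers
merged <- merge(totals, no_help, by = c("variable", "value"), all.x = TRUE)
merged$N[is.na(merged$N)] <- 0
merged$percentage <- merged$N / merged$all
print(merged)
```

With the join, a level without any "no" answers shows up as 0% rather than shifting the denominators of the other levels.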

Does your customer need help?

Risto Hinno

January 1, 2015
