1 Confusion Matrix, Accuracy and Kappa

Whether one uses library functions or customized ones, it is easy to confuse their usage. This article shows how to use these functions correctly and builds some intuition for the metrics they report. We restrict ourselves to binary classification.

In particular, we focus on the confusion matrix and some of the metrics associated with it, especially kappa.

1.1 Caret Package

The first go-to package for a confusion matrix is caret, whose confusionMatrix() function produces a rich assortment of metrics for measuring the performance of a classifier on a given problem.

v_truth <- c(rep(1,3),rep(0,7))
v_pred <- c(rep(1,1),rep(0,9))

To understand the concepts, we make a simple truth (or observed) vector with 10 elements that is imbalanced – 30% 1’s and 70% 0’s – along with a prediction vector. The vectors created by the code above are as follows:

  • truth vector: 1, 1, 1, 0, 0, 0, 0, 0, 0, 0
  • pred vector: 1, 0, 0, 0, 0, 0, 0, 0, 0, 0

Note: when using the caret function confusionMatrix(), the prediction vector is the first argument and the observation (truth) vector is the second argument.

suppressMessages(library(caret))
confusionMatrix(v_pred,v_truth)
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction 0 1
##          0 7 2
##          1 0 1
##                                           
##                Accuracy : 0.8             
##                  95% CI : (0.4439, 0.9748)
##     No Information Rate : 0.7             
##     P-Value [Acc > NIR] : 0.3828          
##                                           
##                   Kappa : 0.4118          
##  Mcnemar's Test P-Value : 0.4795          
##                                           
##             Sensitivity : 1.0000          
##             Specificity : 0.3333          
##          Pos Pred Value : 0.7778          
##          Neg Pred Value : 1.0000          
##              Prevalence : 0.7000          
##          Detection Rate : 0.7000          
##    Detection Prevalence : 0.9000          
##       Balanced Accuracy : 0.6667          
##                                           
##        'Positive' Class : 0               
## 
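Note that caret has picked 0 as the ‘Positive’ class above, which is why Sensitivity is 1.0 and Specificity is 1/3 here. If you want 1 treated as the positive class, and to be safe with recent caret versions that expect factor inputs rather than numeric vectors (an assumption about the installed version), the call can be sketched as:

confusionMatrix(factor(v_pred, levels = c(0, 1)),   # prediction first
                factor(v_truth, levels = c(0, 1)),  # truth second
                positive = "1")

Accuracy, kappa and balanced accuracy are unchanged by this choice; only the class-specific metrics such as sensitivity, specificity and the predictive values swap roles.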

1.2 Confusion Matrix

A confusion matrix can also be generated simply with base R’s table() function as follows:

table(v_pred,v_truth)
##       v_truth
## v_pred 0 1
##      0 7 2
##      1 0 1
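As a quick aside, since each column of this table corresponds to one truth class, base R’s prop.table() turns the counts into column-wise proportions; the diagonal of that result is exactly the per-class recall used for balanced accuracy below:

# proportions within each truth column (margin = 2 means column-wise)
prop.table(table(v_pred, v_truth), margin = 2)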

1.3 Accuracy

Accuracy is simply the fraction of correctly identified elements, i.e.,

\[ \text{accuracy} ={7+1\over 10} = 0.8\] This is a good metric when the two classes are about equal in number, i.e., the 1’s and 0’s occur about equally often. However, when the response is imbalanced, like 30% ones and 70% zeros here, balanced accuracy and kappa are important metrics to track.
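For reference, the same number can be computed in one line from the vectors themselves (a simple base R check, not part of caret):

# fraction of positions where prediction equals truth
mean(v_pred == v_truth)
## [1] 0.8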

1.4 Balanced Accuracy

This metric is the average, over the two classes, of the fraction of each class that is correctly predicted, i.e.,

\[ \text{balanced accuracy} = ({7\over{7}} + {1\over{3}})/2 = 0.\bar{6}\]

For imbalanced classification this metric is preferred because each class, 0 or 1, is given fifty percent importance: the maximum score contribution from either class is 0.5.

To see this on the confusion matrix, read it column-wise – the columns relate to the truth classes – and for each column compute

\[ {\text{correctly predicted}\over{\text{correctly predicted + incorrectly predicted}}}\text{, which is } {7\over 7} \text{ for 0's and } {1\over 3} \text{ for 1's.} \]
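The same column-wise calculation can be done directly on the table from earlier; the snippet below is just a sketch using base R’s diag() and colSums():

cf <- table(v_pred, v_truth)                 # rows = prediction, columns = truth
recall_per_class <- diag(cf) / colSums(cf)   # 7/7 for 0's, 1/3 for 1's
mean(recall_per_class)                       # balanced accuracy
## [1] 0.6666667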

1.5 Kappa

This metric is often used for imbalanced classification. Cohen’s kappa is a measure of agreement between two raters, here the prediction and the truth; perfect agreement gives a kappa of 1, while a kappa of 0 means the agreement is no better than chance. The following categories are commonly used (more from this link); the defining formula is given after the list.

  • Poor agreement = Less than 0.20
  • Fair agreement = 0.20 to 0.40
  • Moderate agreement = 0.40 to 0.60
  • Good agreement = 0.60 to 0.80
  • Very good agreement = 0.80 to 1.00
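Formally, kappa compares the observed agreement with the agreement one would expect by chance from the marginal totals:

\[ \kappa = {p_o - p_e \over 1 - p_e}\text{, where } p_o \text{ is the observed agreement (the accuracy) and } p_e \text{ is the chance agreement.} \]

The functions below compute kappa and balanced accuracy by hand from the 2×2 table, so that caret’s numbers can be cross-checked.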
kp <- function(pred, obs){
  cf <- table(pred, obs)                    # rows = prediction, columns = truth
  if(sum(dim(cf)) < 4){return(0)}           # degenerate table (a class is missing)
  a <- cf[1,1]; c <- cf[2,1]; b <- cf[1,2]; d <- cf[2,2]
  po <- (a+d)/sum(cf)                                 # observed agreement
  pe <- ((a+b)*(a+c)+(c+d)*(b+d))/sum(cf)^2           # chance agreement from the marginals
  return(ifelse(pe==1, 0, (po-pe)/(1-pe)))
}

ba <- function(pred, obs){
  a <- table(pred, obs)
  # mean of per-class recall: a[1], a[2] form the truth-0 column; a[3], a[4] the truth-1 column
  return((a[1]/(a[1]+a[2]) + a[4]/(a[4]+a[3]))/2)
}

cat(" kappa calculated = ", kp(v_pred,v_truth), "\n")
##  kappa calculated =  0.4117647
cat(" Balanced  Accuracy calculated = ", ba(v_pred,v_truth), "\n")
##  Balanced  Accuracy calculated =  0.6666667
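Both hand-computed values match caret’s output above. They can also be pulled out of the confusionMatrix object programmatically; the sketch below assumes a reasonably recent caret version, where the overall and per-class statistics live in $overall and $byClass:

cm <- confusionMatrix(factor(v_pred, levels = c(0, 1)),
                      factor(v_truth, levels = c(0, 1)))
cm$overall["Kappa"]               # same value as kp(v_pred, v_truth)
cm$byClass["Balanced Accuracy"]   # same value as ba(v_pred, v_truth)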

2 Kappa Illustration: 30%-70%

For an imbalanced classification, it is interesting to see how kappa varies across different predictions. We use the same truth (or observation) vector of 10 elements, imbalanced with 30% 1’s and the remaining 70% 0’s.

We first prepare the data in a shape that ggplot accepts easily:

suppressMessages(library(ggplot2))

# start with the truth vector (30% 1's); kappa and balanced accuracy are not applicable here
df <- data.frame("typ" = rep("truth 30%", 10),
                 "response" = as.character(v_truth),
                 "kappa" = "n/a", "bal_acc" = "n/a")

# add predictions containing 1 to 5 ones (10% to 50%), each tagged with its kappa and balanced accuracy
for (ii in 1:5) {
  v_pred <- c(rep(1, ii), rep(0, 10 - ii))
  kpa <- paste("k = ", as.character(round(kp(v_pred, v_truth), 2)))
  bal_acc <- paste("ba= ", as.character(round(ba(v_pred, v_truth), 2)))
  df <- rbind(df, data.frame("typ" = rep(paste0("pred ", ii*10, "%"), 10),
                             "response" = as.character(v_pred),
                             "kappa" = rep(kpa, 10), "bal_acc" = rep(bal_acc, 10)))
}

suppressMessages(library(dplyr))

# counts of 0's and 1's per sample, used to place the percentage labels
dd <- df %>%
  group_by(typ, response) %>%
  summarise(nn = n())

ggplot(df, aes(x = typ)) + geom_bar(aes(fill = response)) +
  geom_text(data = dd, aes(label = paste0(nn*10, "%"),
                           y = ifelse(response == 1, nn/2, 10 - nn/2))) +
  geom_text(data = df, aes(label = kappa), y = 9.2, size = 4, vjust = .8,
            family = "Calibri", colour = "darkblue") +
  geom_text(data = df, aes(label = bal_acc), y = 9.2, size = 4, vjust = -0.8,
            family = "Calibri Light", colour = "darkblue") +
  coord_flip() + theme_bw() +
  ylab("number of 1's and 0's in sample") +
  xlab("sample of various response predictions") +
  ggtitle("kappa/balanced accuracy for various samples with Truth 30%-70%") +
  theme(plot.title = element_text(hjust = 0.5))
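If you prefer the raw numbers to the plot, the same sweep can be summarised in a small matrix; this is just a sketch reusing the kp() and ba() helpers defined earlier:

# kappa and balanced accuracy for predictions containing 1 to 5 ones,
# against the fixed 30%-70% truth vector
sapply(1:5, function(ii){
  v_p <- c(rep(1, ii), rep(0, 10 - ii))
  c(ones_predicted = ii,
    kappa          = round(kp(v_p, v_truth), 2),
    bal_acc        = round(ba(v_p, v_truth), 2))
})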