Whenever one uses library or custom-written functions, it is easy to get their usage wrong. This article shows how to use such functions correctly and builds some intuition for the metrics they report. We restrict ourselves to binary classification, focusing on the confusion matrix and some of the metrics associated with it (especially kappa).
The first go-to tool for a confusion matrix is the caret package's confusionMatrix function, which produces a rich assortment of metrics for measuring performance on a given problem.
v_truth <- c(rep(1,3),rep(0,7))  # observed labels: three 1's, seven 0's
v_pred  <- c(rep(1,1),rep(0,9))  # predicted labels: one 1, nine 0's
To understand the concepts, the code above builds a simple truth (or observed) vector with 10 elements that is imbalanced, with 30% 1's and 70% 0's, together with a prediction vector that flags only the first element as a 1.
Note: when using the caret package's confusionMatrix function, the prediction vector is the first argument and the observed (truth) vector is the second.
suppressMessages(library(caret))
confusionMatrix(v_pred,v_truth)
## Confusion Matrix and Statistics
##
## Reference
## Prediction 0 1
## 0 7 2
## 1 0 1
##
## Accuracy : 0.8
## 95% CI : (0.4439, 0.9748)
## No Information Rate : 0.7
## P-Value [Acc > NIR] : 0.3828
##
## Kappa : 0.4118
## Mcnemar's Test P-Value : 0.4795
##
## Sensitivity : 1.0000
## Specificity : 0.3333
## Pos Pred Value : 0.7778
## Neg Pred Value : 1.0000
## Prevalence : 0.7000
## Detection Rate : 0.7000
## Detection Prevalence : 0.9000
## Balanced Accuracy : 0.6667
##
## 'Positive' Class : 0
##
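A side note: depending on your version of caret, confusionMatrix may insist that both arguments are factors with the same levels rather than numeric vectors. If the call above errors, a minimal conversion along these lines should work:
confusionMatrix(factor(v_pred, levels = c(0, 1)), factor(v_truth, levels = c(0, 1)))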
A confusion matrix can also be generated with base R's native table function:
table(v_pred,v_truth)
## v_truth
## v_pred 0 1
## 0 7 2
## 1 0 1
Accuracy is simply the fraction of correctly classified elements, i.e.,
\[ \text{accuracy} = {7+1\over 10} = 0.8\] This is a good metric when the binary responses are roughly balanced, i.e., the 1's and 0's occur in about equal numbers. However, when the response is imbalanced, say 30% ones and 70% zeros, balanced accuracy and kappa are the more important metrics to track.
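As a quick cross-check, the same number falls out of the base R table above, since the diagonal holds the correctly classified counts:
tab <- table(v_pred, v_truth)   # rows = prediction, columns = truth
sum(diag(tab)) / sum(tab)       # 0.8, matching caret's Accuracy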
Balanced accuracy is the average, over the two classes, of the fraction of each class's elements that were correctly predicted, i.e.,
\[ \text{balanced accuracy} = ({7\over{7}} + {1\over{3}})/2 = 0.\bar{6}\]
For imbalanced classification this metric is preferred because each class, 0 or 1, is given fifty percent of the weight: the maximum score contribution from either class is 0.5.
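For example, a degenerate model that predicts all 0's gets every 0 right and every 1 wrong, so
\[ \text{balanced accuracy} = ({7\over 7} + {0\over 3})/2 = 0.5\]
even though its plain accuracy is a deceptively healthy 0.7.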
To see this in the confusion matrix, read column-wise (each column corresponds to one truth class) and compute for each class
\[ {\text{correctly predicted}\over{\text{correctly predicted} + \text{incorrectly predicted}}}\text{, which is } {7\over 7} \text{ for the 0's and } {1\over 3} \text{ for the 1's.} \]
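The same column-wise computation in R, using the table from earlier:
tab <- table(v_pred, v_truth)    # rows = prediction, columns = truth
diag(tab) / colSums(tab)         # per-class recall: 1.0 for the 0's, 0.333 for the 1's
mean(diag(tab) / colSums(tab))   # 0.6666667, caret's Balanced Accuracy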
Like balanced accuracy, kappa is often used for imbalanced classification. Cohen's kappa is a measure of agreement between two raters; perfect agreement gives a kappa of 1. Kappa values are commonly grouped into interpretation categories (more from this link).
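Concretely, kappa compares the observed agreement po (the fraction of matching predictions) with the agreement pe expected by chance from the row and column totals; the helper function kp below implements this definition:
\[ \kappa = {p_o - p_e \over 1 - p_e} = {0.8 - 0.66 \over 1 - 0.66} \approx 0.41 \]
where, for our table, \(p_e = (9\times 7 + 1\times 3)/10^2 = 0.66\).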
kp <- function(pred, obs){
  cf <- table(pred, obs)             # rows = prediction, columns = truth
  if(sum(dim(cf)) < 4){return(0)}    # degenerate table (a class entirely missing): return 0
  a <- cf[1,1]; c <- cf[2,1]; b <- cf[1,2]; d <- cf[2,2]
  po <- (a+d)/sum(cf)                                  # observed agreement
  pe <- ((a+b)*(a+c)+(c+d)*(b+d))/sum(cf)^2            # agreement expected by chance
  return(ifelse(pe==1, 0, (po-pe)/(1-pe)))}            # guard against pe == 1
ba <- function(pred, obs){
  a <- table(pred, obs)   # column-major indexing: a[1]=(pred 0, obs 0), a[2]=(pred 1, obs 0),
                          #                        a[3]=(pred 0, obs 1), a[4]=(pred 1, obs 1)
  # mean of the per-class recalls; assumes a full 2x2 table (no guard like kp has)
  return((a[1]/(a[1]+a[2])+a[4]/(a[4]+a[3]))/2)
}
cat(" kappa calculated = ", kp(v_pred,v_truth), "\n")
## kappa calculated = 0.4117647
cat(" Balanced Accuracy calculated = ", ba(v_pred,v_truth), "\n")
## Balanced Accuracy calculated = 0.6666667
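Both hand-rolled values match the caret output above. They can also be pulled out of the caret object programmatically; a minimal sketch, assuming a caret version that reports Kappa in $overall and Balanced Accuracy in $byClass:
cm <- confusionMatrix(factor(v_pred, levels = c(0, 1)), factor(v_truth, levels = c(0, 1)))
cm$overall["Kappa"]               # 0.4117647
cm$byClass["Balanced Accuracy"]   # 0.6666667
The guard in kp also handles degenerate predictions: an all-0 prediction collapses the table to a single row, and kp simply returns 0:
kp(rep(0, 10), v_truth)           # 0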
For an imbalanced classification, it is interesting to see how kappa varies across different predictions. We use the same dataset: a truth (or observation) vector of 10 elements, imbalanced with 30% 1's and 70% 0's.
First we prepare a data frame in a shape that ggplot accepts:
suppressMessages(library(ggplot2))
df <- data.frame("typ" = rep("truth 30%", 10),
                 "response" = as.character(v_truth),
                 "kappa" = "n/a", "bal_acc" = "n/a")
# try predictions with 1 to 5 ones (10% to 50% positive rate) and record both metrics
for (ii in 1:5) {
  v_pred <- c(rep(1, ii), rep(0, 10 - ii))
  kpa <- paste("k = ", round(kp(v_pred, v_truth), 2))
  bal_acc <- paste("ba= ", round(ba(v_pred, v_truth), 2))
  df <- rbind(df, data.frame("typ" = rep(paste0("pred ", ii * 10, "%"), 10),
                             "response" = as.character(v_pred),
                             "kappa" = rep(kpa, 10), "bal_acc" = rep(bal_acc, 10)))
}
suppressMessages(library(dplyr))
# count of each response per sample, used to position the percentage labels
dd <- df %>%
  group_by(typ, response) %>%
  summarise(nn = n(), .groups = "drop")
ggplot(df, aes(x = typ)) + geom_bar(aes(fill = response)) +
  # percentage labels centred in each bar segment (the 1's stack at the bottom)
  geom_text(data = dd, aes(label = paste0(nn * 10, "%"),
                           y = ifelse(response == "1", nn / 2, 10 - nn / 2))) +
  # kappa/balanced-accuracy annotations; Calibri fonts are assumed installed,
  # drop the family argument otherwise
  geom_text(data = df, aes(label = kappa), y = 9.2, size = 4, vjust = 0.8,
            family = "Calibri", colour = "darkblue") +
  geom_text(data = df, aes(label = bal_acc), y = 9.2, size = 4, vjust = -0.8,
            family = "Calibri Light", colour = "darkblue") +
  coord_flip() + theme_bw() +
  ylab("number of 1's and 0's in sample") +
  xlab("sample of various response predictions") +
  ggtitle("kappa/balanced accuracy for various samples with Truth 30%-70%") +
  theme(plot.title = element_text(hjust = 0.5))
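As a sanity check on the figure, the prediction with three 1's (pred 30%) reproduces the truth exactly, so both metrics reach their maximum of 1:
kp(c(rep(1, 3), rep(0, 7)), v_truth)   # 1: perfect agreement
ba(c(rep(1, 3), rep(0, 7)), v_truth)   # 1: both classes fully recovered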