Evaluating the model is a crucial stage because it tells us whether we are actually solving the problem. Evaluation begins back in the domain and data stages: we must be sure what we are trying to solve and that the result can be measured. In this chapter, we will cover the most popular evaluation tools and techniques for assessing the model’s performance.
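Accuracy is the most basic estimator and is computed as follows: correct predictions / total cases
Because it collapses everything into a single number, accuracy is not as informative as the tools that follow. On imbalanced data, for example, a classifier that always predicts the majority label still reaches a high accuracy.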
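As a minimal sketch of the formula above, the Python snippet below counts the correct predictions and divides by the total number of cases. The function name `accuracy` and the sample labels are illustrative assumptions, not taken from the chapter. The second example shows why accuracy alone can mislead on imbalanced data.

```python
def accuracy(y_true, y_pred):
    """Return the share of predictions that match the actual labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
    return correct / len(y_true)


# Hypothetical labels: 1 = malicious, -1 = benign.
y_true = [1, -1, -1, 1, -1]
y_pred = [1, -1, 1, 1, -1]
balanced_score = accuracy(y_true, y_pred)  # 4 of the 5 predictions are correct

# Imbalanced case: 99 benign instances and 1 malicious one.
# A classifier that always answers "benign" misses the only attack,
# yet 99 of its 100 predictions are correct.
imbalanced_true = [-1] * 99 + [1]
always_benign = [-1] * 100
imbalanced_score = accuracy(imbalanced_true, always_benign)

print(balanced_score, imbalanced_score)
```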
The confusion matrix is a popular tool for visualizing and measuring the model’s performance. It is almost the same matrix presented in the statistics chapter under errors.
The confusion matrix shows the results of a classifier at a specific threshold (usually 0.5). The columns represent the predicted label (1 or -1) and the rows represent the actual label. Each cell holds the number of instances with that combination of actual and predicted label, so the four cells are the true positives, false negatives, false positives, and true negatives.
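The layout described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name `confusion_matrix`, the sample scores, and the rule that a score at or above the threshold means label 1 are all hypothetical choices, and the labels are assumed to be exactly 1 and -1.

```python
def confusion_matrix(y_true, scores, threshold=0.5):
    """Count instances per (actual label, predicted label) pair."""
    counts = {(1, 1): 0, (1, -1): 0, (-1, 1): 0, (-1, -1): 0}
    for actual, score in zip(y_true, scores):
        # Assumption: a score at or above the threshold means label 1.
        predicted = 1 if score >= threshold else -1
        counts[(actual, predicted)] += 1
    return counts


# Hypothetical actual labels and classifier scores.
y_true = [1, 1, -1, -1, 1, -1]
scores = [0.9, 0.4, 0.6, 0.2, 0.7, 0.1]

cm = confusion_matrix(y_true, scores)

# Rows are the actual label, columns are the predicted label.
print("            pred 1   pred -1")
for actual in (1, -1):
    print(f"actual {actual:>2}   {cm[(actual, 1)]:>6}   {cm[(actual, -1)]:>7}")
```

Calling the function again with a different `threshold` gives a different matrix for the same scores, which is why the matrix always belongs to one specific threshold.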
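The ROC (receiver operating characteristic) curve is the most popular evaluation plot for binary classifiers (classifiers that predict two labels), because it shows the performance across many possible thresholds. The plot has the TPR (true positive rate) on the Y-axis and the FPR (false positive rate) on the X-axis.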
The ROC curve visualizes the classifier’s performance under different thresholds, allowing cyber data scientists to choose the correct threshold for a specific predefined false positive rate.
Extreme points – the points (0,0) and (1,1) belong to every ROC curve. At the strictest threshold the classifier labels all the instances as negative, so there are no false positives (the FPR is 0) and no true positives either (the TPR is 0). At the most lenient threshold the classifier labels all the instances as positive, so every negative instance becomes a false positive (the FPR is 1) and every positive instance is classified correctly (the TPR is 1).
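To make the threshold sweep concrete, here is a small Python sketch that computes one (FPR, TPR) point per threshold. The function name `roc_points` and the sample data are hypothetical, and the sketch assumes that a score at or above the threshold is classified as label 1. The first and last points it returns are the two extreme points.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to most lenient."""
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative instance")

    # Start above every score (everything is labeled negative), then lower the
    # threshold to each distinct score; the lowest one labels everything positive.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)

    points = []
    for threshold in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y != 1)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical actual labels (1 or -1) and classifier scores.
y_true = [1, 1, -1, -1, 1, -1]
scores = [0.9, 0.4, 0.6, 0.2, 0.7, 0.1]

points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Choosing a threshold for a predefined false positive rate then amounts to picking the point with the highest TPR whose FPR does not exceed that rate.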
AUC – When reading the ROC graph, we want a classifier with a high TPR and a low FPR. The area under the ROC curve (AUC) summarizes this in a single number and is a good estimator for comparing classifiers: the faster the TPR rises as the FPR grows, the larger the area. A perfect classifier has an AUC of 1.
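One simple way to estimate the area is the trapezoidal rule over the (FPR, TPR) points of the curve. The sketch below assumes the points are already available; the function name `auc` and the two example curves `roc_a` and `roc_b` are hypothetical.

```python
def auc(points):
    """Area under a ROC curve given as (FPR, TPR) points, via the trapezoidal rule."""
    ordered = sorted(points)  # order by FPR, then by TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        # Width of the step along the FPR axis times the average TPR over it.
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Two hypothetical ROC curves, each starting at (0,0) and ending at (1,1).
roc_a = [(0.0, 0.0), (0.0, 2 / 3), (1 / 3, 2 / 3), (1 / 3, 1.0), (1.0, 1.0)]
roc_b = [(0.0, 0.0), (1 / 3, 1 / 3), (1 / 3, 2 / 3), (2 / 3, 2 / 3), (1.0, 1.0)]

print(f"AUC of classifier A: {auc(roc_a):.3f}")
print(f"AUC of classifier B: {auc(roc_b):.3f}")
```

Classifier A reaches a high TPR while its FPR is still low, so its curve encloses more area than the curve of classifier B.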
Random classifier – the random classifier chooses a label at random, so its curve is a straight line between the extreme points. This is an essential baseline: our model’s curve must lie above this line. Otherwise, we could just as well use the random classifier.
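The baseline can be checked with a short simulation. The sketch below is an illustration, not a reference implementation: instead of integrating the curve, it uses the pairwise-ranking formulation of the area under the ROC curve (the share of positive/negative pairs in which the positive instance gets the higher score, with ties counted as half), which gives the same value. The data, the seed, and the "informative" model are all hypothetical.

```python
import random


def pairwise_auc(y_true, scores):
    """Area under the ROC curve as the share of correctly ranked (positive, negative) pairs."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y != 1]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data set: 1000 positive and 1000 negative instances.
y_true = [1 if i % 2 == 0 else -1 for i in range(2000)]

# Random classifier: the score ignores the actual label.
random_scores = [rng.random() for _ in y_true]

# Hypothetical informative model: positives tend to receive higher scores.
model_scores = [rng.random() + (0.5 if y == 1 else 0.0) for y in y_true]

random_auc = pairwise_auc(y_true, random_scores)
model_auc = pairwise_auc(y_true, model_scores)
print(f"random classifier: {random_auc:.3f}")
print(f"informative model: {model_auc:.3f}")
```

The random classifier should land close to 0.5, the area under the straight line between the extreme points, while a model that carries real signal should score clearly above it.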