
Confusion matrix – Example, Scenario and Code

A confusion matrix is a table that is often used to describe the performance of a classification model (or “classifier”) on a set of data for which the true values are known. It allows you to see how well your model is doing by comparing the predicted values with the true values.

For example, let’s say you have a classification model that predicts whether an email is spam or not spam. You can create a confusion matrix from the predicted values and the true values on a test set of data. The matrix shows how many emails the model predicted as spam and how many as not spam, broken down by how many of them actually were spam and how many were not.

Here is a simple example of a confusion matrix in Python:

from sklearn.metrics import confusion_matrix

# true values
y_true = [0, 0, 0, 1, 1, 1, 1, 1]

# predicted values
y_pred = [0, 1, 0, 1, 0, 1, 0, 1]

# create the confusion matrix; use a separate variable name so the
# imported confusion_matrix function is not shadowed
cm = confusion_matrix(y_true, y_pred)

# print the confusion matrix
print(cm)

This will print the following output:

[[2 1]
 [2 3]]

This confusion matrix shows that, out of 8 emails, the model correctly predicted 2 as not spam and 3 as spam, but it also predicted 1 as spam when it was actually not spam (a false positive) and 2 as not spam when they were actually spam (false negatives).
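If you prefer not to read the cells by eye, the matrix returned by scikit-learn is a NumPy array, so you can unpack the four counts directly. Here is a minimal sketch reusing the same y_true and y_pred lists as above; the names tn, fp, fn and tp are just illustrative:

from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 1]

# ravel() flattens the 2x2 matrix row by row: [[tn, fp], [fn, tp]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print("True negatives (not spam, predicted not spam):", tn)  # 2
print("False positives (not spam, predicted spam):", fp)     # 1
print("False negatives (spam, predicted not spam):", fn)     # 2
print("True positives (spam, predicted spam):", tp)          # 3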

In More Detail

A confusion matrix can be useful for evaluating the performance of a classifier because it goes beyond a single accuracy number: it shows exactly which types of errors your model is making, and it can provide insights into ways you might be able to improve the model.

For example, in the confusion matrix above you can see that there are more false negatives (spam emails predicted as not spam) than false positives (legitimate emails predicted as spam). This might indicate that your model is overly conservative about flagging spam and could benefit from being made more sensitive. On the other hand, if there were more false positives than false negatives, it might indicate that your model is too aggressive and could benefit from being made more selective.
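One common way to make such an adjustment, assuming your classifier can output a probability of spam, is to change the decision threshold. The sketch below is not part of the original example; the probabilities in y_prob are hypothetical and chosen so that the default 0.5 threshold reproduces the matrix above, while a lower threshold removes the false negatives at the cost of sensitivity to false positives:

import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 1, 1, 1, 1, 1]

# hypothetical predicted probabilities of spam for the 8 emails above
y_prob = np.array([0.1, 0.6, 0.2, 0.9, 0.4, 0.8, 0.45, 0.7])

for threshold in (0.5, 0.3):
    y_pred = (y_prob >= threshold).astype(int)
    print("threshold =", threshold)
    print(confusion_matrix(y_true, y_pred))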

Additionally, you can use the numbers in the confusion matrix to calculate various performance metrics, such as precision, recall, and F1 score. These metrics can give you a more detailed understanding of how well your model is performing, and they can be useful for comparing the performance of different models.

Here is an example of how you can calculate some of these metrics using the confusion matrix in Python:

# calculate precision = TP / (TP + FP) = 3 / (3 + 1)
precision = 3 / (3 + 1)
print("Precision:", precision)

# calculate recall = TP / (TP + FN) = 3 / (3 + 2)
recall = 3 / (3 + 2)
print("Recall:", recall)

# calculate F1 score as the harmonic mean of precision and recall
f1 = 2 * (precision * recall) / (precision + recall)
print("F1 score:", f1)

This will print the following output:

Precision: 0.75
Recall: 0.6
F1 score: 0.6666666666666666

These values give you a more detailed picture of your model’s performance. The precision of 0.75 means that out of all the emails your model predicted as spam, 75% of them were actually spam. The recall of 0.6 means that out of all the emails that were actually spam, your model correctly predicted 60% of them as spam. The F1 score is the harmonic mean of precision and recall, and a score of about 0.67 means that your model strikes a reasonable balance between the two.
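Rather than hard-coding the cell counts, you can also get these metrics directly from scikit-learn. This is a small sketch using the same y_true and y_pred lists as before; the results should match the hand-calculated values above:

from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 1, 0, 1, 0, 1, 0, 1]

print("Precision:", precision_score(y_true, y_pred))  # 0.75
print("Recall:", recall_score(y_true, y_pred))        # 0.6
print("F1 score:", f1_score(y_true, y_pred))          # 0.666...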

Scenario-based example

Here is a scenario that shows how a confusion matrix can be used to evaluate the performance of a classifier.

Imagine that you have a classifier that is trained to predict whether a customer will churn (stop using your service) based on their usage history and other factors. You want to evaluate the performance of your model on a test set of data, so you create a confusion matrix using the predicted values and the true values from the test set.
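The article does not include the actual churn data, but as a hedged sketch of that workflow you could use synthetic data as a stand-in for the usage history; the counts this prints will not match the illustrative matrix shown next:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

# synthetic stand-in for customer features and churn labels (about 10% churners)
X, y = make_classification(n_samples=2000, n_features=10, weights=[0.9, 0.1],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

# rows = true class (0 = stayed, 1 = churned), columns = predicted class
print(confusion_matrix(y_test, y_pred))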

The confusion matrix might look something like this:

[[973, 27],
 [43, 57]]

This matrix shows that, out of 1,100 customers, the model correctly predicted 973 as not churning and 57 as churning. However, it also predicted 27 as churning when they did not churn (false positives), and 43 as not churning when they actually churned (false negatives).
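To keep track of which cell is which, you can unpack this matrix the same way as before. The array below simply restates the illustrative counts above; it is a sketch, not output from a real churn model:

import numpy as np

cm = np.array([[973, 27],
               [43, 57]])

tn, fp, fn, tp = cm.ravel()
print("Total customers:", cm.sum())                        # 1100
print("True negatives (stayed, predicted stayed):", tn)    # 973
print("False positives (stayed, predicted churn):", fp)    # 27
print("False negatives (churned, predicted stayed):", fn)  # 43
print("True positives (churned, predicted churn):", tp)    # 57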

Using this information, you can calculate some performance metrics to get a better understanding of how well your model is doing. For example, you can calculate the precision and recall as follows:

# calculate precision = TP / (TP + FP) = 57 / (57 + 27)
precision = 57 / (57 + 27)
print("Precision:", precision)

# calculate recall = TP / (TP + FN) = 57 / (57 + 43)
recall = 57 / (57 + 43)
print("Recall:", recall)

This will print the following output:

Precision: 0.6785714285714286
Recall: 0.57

These values indicate that your model has relatively good precision, but its recall is not as high. This means that out of all the customers your model predicted as churning, about 68% of them actually churned. However, it also means that out of all the customers who actually churned, your model only correctly predicted 57% of them as churning.

Based on this information, you might decide to adjust your model to improve its recall. For example, you could try using a different algorithm or changing the features that the model uses to make its predictions. By comparing the performance of your modified model with the original model using a confusion matrix, you can see whether your changes have improved the model’s performance.
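As a hedged sketch of that comparison, you could put both models’ predictions on the same test set side by side; the labels and predictions below are made-up placeholders, not real churn data:

from sklearn.metrics import confusion_matrix, recall_score

# placeholder test labels and predictions from two candidate models
y_test         = [0, 0, 1, 1, 1, 0, 1, 0]
preds_original = [0, 0, 1, 0, 0, 0, 1, 1]
preds_modified = [0, 1, 1, 1, 0, 0, 1, 0]

for name, preds in [("original", preds_original), ("modified", preds_modified)]:
    print(name)
    print(confusion_matrix(y_test, preds))
    print("Recall:", recall_score(y_test, preds))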
