The Precision-Recall Trade-Off
By George Bennett
The precision-recall trade-off is an essential tool whenever precision matters more than recall, or vice versa. In this post I will cover what the precision-recall trade-off is and how to take advantage of it. I will also give a short introduction to precision and recall; if you already know these metrics, feel free to skip to the third section.
In the context of machine learning, precision and recall are performance metrics for classification algorithms. Consider a classification task with two classes. Precision is the fraction of the model's positive predictions that are actually positive. Recall is the fraction of the data points that actually belong to the positive class which the model correctly predicts as positive. The traditional way to think about this is to first define true positives, false positives, and false negatives (see the table below¹). A true positive is a correct prediction of the positive class. A false negative is an incorrect prediction where the actual value is positive but the predicted value is negative. A false positive is an incorrect prediction where the predicted value is positive but the actual value is negative. Precision can then be defined as the number of true positives divided by the sum of true positives and false positives. Recall can be defined as the number of true positives divided by the sum of true positives and false negatives.
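To make those definitions concrete, here is a minimal sketch that counts true positives, false positives, and false negatives on a small made-up set of labels (the y_true and y_pred values are purely illustrative, not from any real dataset).

import numpy as np

# Toy example: actual labels and a model's predicted labels (illustrative values only)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])

# Count true positives, false positives, and false negatives
tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))

precision = tp / (tp + fp)  # of everything predicted positive, how much was right
recall = tp / (tp + fn)     # of everything actually positive, how much was found

print(f"precision = {precision:.2f}, recall = {recall:.2f}")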
The idea behind the precision-recall trade-off is that changing the threshold for deciding whether a prediction counts as positive or negative tilts the scales: it causes precision to increase and recall to decrease, or vice versa. Most classification algorithms in scikit-learn come with a method for predicting class probabilities. Instead of using the ordinary predict method, it can be beneficial to use the predict_proba method.
MyModel.predict_proba(X_val)
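To make the rest of the post runnable end to end, here is a minimal sketch; the synthetic dataset, the three-way split, and the LogisticRegression model are all stand-ins for whatever data and classifier you are actually working with.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data and a simple model stand in for your own problem
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=0)

MyModel = LogisticRegression().fit(X_train, y_train)

# Column 1 of predict_proba holds the probability of the positive class
predictions = MyModel.predict_proba(X_val)[:, 1]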
This method returns a two-dimensional array of probabilities, one column for each class. Simply take the column for the class whose precision-recall trade-off you want to adjust and generate 100 or so different probability thresholds. Then compute the precision and recall at each threshold (a prediction counts as positive if the predicted probability is over the threshold) so you can plot the precision-recall trade-off (see the image below the code²). Once you have done that and found the best trade-off point, simply make it a rule to use that same threshold on the test set and when the model is put into production.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import precision_score, recall_score

# 100 candidate probability thresholds between 0 and 1
thresholds = np.linspace(0, 1, 100)

precision_scores = []
recall_scores = []

for threshold in thresholds:
    # A prediction is positive if its predicted probability is over the threshold
    adjusted_predictions = [1 if p > threshold else 0 for p in predictions]
    precision_scores.append(precision_score(y_val, adjusted_predictions, zero_division=0))
    recall_scores.append(recall_score(y_val, adjusted_predictions, zero_division=0))

plt.plot(thresholds, precision_scores, label="precision")
plt.plot(thresholds, recall_scores, label="recall")
plt.xlabel("threshold")
plt.legend()
plt.show()
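To close the loop, here is a sketch of that last step, continuing from the code above. The recall floor of 0.8 is just an arbitrary example of how one might pick the "best" trade-off point; what counts as best depends entirely on your problem.

# Pick a threshold from the validation curves. As an arbitrary illustration,
# take the highest threshold whose recall is still at least 0.8.
candidates = [t for t, r in zip(thresholds, recall_scores) if r >= 0.8]
chosen_threshold = max(candidates)

# Reuse that exact threshold on the test set (and later in production)
test_probabilities = MyModel.predict_proba(X_test)[:, 1]
test_predictions = (test_probabilities > chosen_threshold).astype(int)

print("chosen threshold:", chosen_threshold)
print("test precision:", precision_score(y_test, test_predictions))
print("test recall:", recall_score(y_test, test_predictions))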
Sources
Feel free to contact the author with any questions or comments!
Name: George Bennett
Email: datascience.george@gmail.com
LinkedIn: linkedin.com/in/george-w-bennett
Phone: (757) 292-8346