In this short article, I aim to create a set of 5 mnemonic rules to clear up my confusion about the confusion matrix.
First Rule: Undesirable positives
In binary classification, the “positive” label doesn’t mean that the outcome is “good” or “desirable”. It simply denotes the class that is of particular interest or concern:
- For customer retention, the positive class is a churned customer 🫣
- In a security check at the airport, the positive class could be that the suspect carries a gun 🔫
- In a medical test, the positive class is that the patient has the disease 🦠
Second Rule: Predictions on top, truths on the side
Use this rule to remember how to draw the confusion matrix: predicted classes label the columns across the top, and true classes label the rows down the side.
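As a minimal sketch of this layout, the snippet below counts the four outcomes and prints them with predictions across the top and truths down the side. The labels are toy data made up for illustration, and `render_confusion_matrix` is a hypothetical helper written for this example, not a library function.

```python
# Toy labels, made up for illustration: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

# counts[truth][prediction]: truths index the rows, predictions index the columns.
counts = {1: {1: 0, 0: 0}, 0: {1: 0, 0: 0}}
for truth, pred in zip(y_true, y_pred):
    counts[truth][pred] += 1


def render_confusion_matrix(counts):
    """Return the matrix as text, with predictions on top and truths on the side."""
    lines = [
        "           pred=1  pred=0",
        f"truth=1  {counts[1][1]:>8}{counts[1][0]:>8}",
        f"truth=0  {counts[0][1]:>8}{counts[0][0]:>8}",
    ]
    return "\n".join(lines)


print(render_confusion_matrix(counts))
```

scikit-learn's `confusion_matrix` follows the same rows-are-truths, columns-are-predictions convention, but by default it sorts the labels, so the negative class (0) comes first.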
Third Rule: “Security check”
I like to use the “Security Check” metaphor to make sense of the 4 quadrants:
- 🙂 True Positive 🙂: Job well done.
- 🥲 False Positive 🥲: Sorry for embarrassing you, you can put your shoes back on.
- 😱 False Negative 😱: We fucked up.
- ☺️ True Negative ☺️: Job well done.
In this case, our model must be tuned to minimize false negatives.
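One common way to trade false negatives for false positives is to lower the decision threshold applied to the model's scores. The sketch below illustrates that idea only. The scores and labels are toy values made up for this example, and `count_errors` is a hypothetical helper.

```python
# Toy scores and labels, made up for illustration (1 = passenger carries a weapon).
scores = [0.95, 0.70, 0.45, 0.30, 0.60, 0.20, 0.10, 0.35]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def count_errors(scores, labels, threshold):
    """Return (false_negatives, false_positives) at the given decision threshold."""
    false_negatives = sum(
        1 for s, y in zip(scores, labels) if s < threshold and y == 1
    )
    false_positives = sum(
        1 for s, y in zip(scores, labels) if s >= threshold and y == 0
    )
    return false_negatives, false_positives


for threshold in (0.5, 0.25):
    fn, fp = count_errors(scores, labels, threshold)
    print(f"threshold={threshold}: false negatives={fn}, false positives={fp}")
```

On this toy data, dropping the threshold from 0.5 to 0.25 removes both false negatives but adds one extra false positive: fewer missed weapons, more passengers asked to take their shoes off.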
Fourth Rule: To be precise we need all the positives
The confusion matrix is used to calculate two key metrics: precision and recall.
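# Assumes a fitted classifier `model` and test data `X_test`, `y_test` (NumPy arrays or pandas objects)
# Probability of the positive class: second column (index 1) of predict_proba, for all rows
SOFT_PRED = model.predict_proba(X_test)[:, 1]
actual_positive = (y_test == 1)
actual_negative = (y_test == 0)
# 0.5 is the decision threshold: scores at or above it count as positive predictions
predicted_positive = (SOFT_PRED >= 0.5)
predicted_negative = (SOFT_PRED < 0.5)
# Combine the boolean masks and count each quadrant of the confusion matrix
true_positives = (predicted_positive & actual_positive).sum()
false_positives = (predicted_positive & actual_negative).sum()
true_negatives = (predicted_negative & actual_negative).sum()
false_negatives = (predicted_negative & actual_positive).sum()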
Remember, “precision uses all the positives”: its formula contains only true positives and false positives. Precision measures the fraction of correct positive predictions: out of all the customers we identified as churning, how many actually churned?
precision = true_positives / (true_positives + false_positives)
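The formula divides by the number of predicted positives, which is zero if the model never predicts the positive class. A hedged sketch of one way to guard against that follows. The counts are hypothetical, `precision_from_counts` is an illustrative helper, and returning 0.0 in the undefined case is a convention chosen for this example.

```python
def precision_from_counts(true_positives, false_positives):
    """Precision = TP / (TP + FP); returns 0.0 when nothing was predicted positive."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        return 0.0  # convention chosen for this sketch; precision is undefined here
    return true_positives / predicted_positives


# Hypothetical counts: 40 customers flagged as churning, 30 of whom really churned.
print(precision_from_counts(true_positives=30, false_positives=10))  # 0.75
```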
Fifth Rule: Recall is about re-calling all your friends
How many of your friends can you recall?
- True Positives: The friends you successfully remember.
- False Negatives: The friends you accidentally forget.
recall = true_positives / (true_positives + false_negatives)
A recall of 52% tells you that you didn’t call 48% of your friends.
Thanks for reading ❤️
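To make the 52% figure concrete, here is a small worked example with hypothetical numbers: 25 friends in total, 13 of whom you remember to call.

```python
# Hypothetical numbers chosen to reproduce a recall of 52%.
friends_remembered = 13  # true positives
friends_forgotten = 12   # false negatives
total_friends = friends_remembered + friends_forgotten  # all actual positives

recall = friends_remembered / total_friends
missed_share = friends_forgotten / total_friends

print(f"recall = {recall:.0%}, missed = {missed_share:.0%}")
```

Recall and the missed share always add up to 100%, because every actual positive is either a true positive or a false negative.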