Decide threshold for each class for optimal precision/recall in a multi-class classification problem
Say I have three classes $C_1$, $C_2$, $C_3$ and a model $M$ which outputs a score $P$ for the confidence of each class for a sample $X$, i.e. $M(X)=[P(C_1),P(C_2),P(C_3)]$ (note: we only want to predict one class).
Say I have created 3 one-vs-rest precision/recall plots and I decide that the optimal thresholds for each class are
$T_1 = 0.6$
$T_2 = 0.7$
$T_3 = 0.5$
We can then create the logic of assigning $X$ to a class like so:
If the biggest score of $M(X)$ is at index $i$ and is greater than or equal to $T_i$, assign $X$ to $C_i$; else, don't assign $X$ to any class. See the two examples below for two inputs of $X$:
$M(X_1) = [0.8,0.1,0.1] \rightarrow C_1\quad$ since the biggest score is $0.8$, which is for class 1, and $T_1 = 0.6 \le 0.8$
$M(X_2) = [0.3,0.6,0.1] \rightarrow \text{None}\quad$ since the biggest score is $0.6$, which is for class 2, but $T_2 = 0.7 > 0.6$
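The assignment rule above can be sketched as follows (the threshold values are the ones from the question; the function name is just for illustration):

```python
import numpy as np

# Per-class thresholds T_1, T_2, T_3 chosen from the one-vs-rest PR curves.
THRESHOLDS = np.array([0.6, 0.7, 0.5])

def assign_class(scores, thresholds=THRESHOLDS):
    """Return the index of the top-scoring class if its score clears
    that class's threshold, otherwise None (no assignment)."""
    i = int(np.argmax(scores))
    return i if scores[i] >= thresholds[i] else None

# The two examples above (0-based indices):
assign_class([0.8, 0.1, 0.1])  # -> 0 (C_1), since 0.8 >= T_1 = 0.6
assign_class([0.3, 0.6, 0.1])  # -> None,    since 0.6 <  T_2 = 0.7
```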
But something tells me that this way we don't preserve the optimal precision/recall per class that we used to decide the thresholds in the first place, so my questions are:
- Can we have dynamic thresholds for multi-class classification, i.e. a threshold for each class which preserves the optimal precision/recall for all classes at the same time?
- Is there a better way to decide thresholds for multi-class classification in my setting above, when we want to control the precision/recall for each class?
EDIT:
Say I have the following results for my validation-set:
conf | pred | target
-----+------+-------
0.90 | C1   | C1
0.80 | C1   | C1
0.76 | C1   | C2
 ...
0.93 | C2   | C2
0.90 | C2   | C2
0.83 | C2   | C3
 ...
Wouldn't this overcome the issue with the one-vs-rest example, since we now have the confidence when all three classes compete, rather than 3x one-vs-rest?
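One way to make this concrete: sweep a threshold per class over the argmax confidences in this table and compute precision/recall directly. A minimal sketch, using a few hypothetical rows mirroring the table above (the helper name and data are assumptions, not from any library):

```python
# Hypothetical validation rows: (top-class confidence, predicted class, true class).
rows = [
    (0.90, "C1", "C1"),
    (0.80, "C1", "C1"),
    (0.76, "C1", "C2"),
    (0.93, "C2", "C2"),
    (0.90, "C2", "C2"),
    (0.83, "C2", "C3"),
]

def precision_recall_at(threshold, cls, rows):
    """Precision/recall for `cls` when a prediction only counts if its
    confidence clears `threshold`; below-threshold samples get no label."""
    tp = sum(1 for c, p, t in rows if p == cls and t == cls and c >= threshold)
    fp = sum(1 for c, p, t in rows if p == cls and t != cls and c >= threshold)
    fn = sum(1 for c, p, t in rows if t == cls and not (p == cls and c >= threshold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# e.g. raising C1's threshold from 0.7 to 0.8 drops the 0.76 false positive:
precision_recall_at(0.70, "C1", rows)  # -> (0.666..., 1.0)
precision_recall_at(0.80, "C1", rows)  # -> (1.0, 1.0)
```

Because these confidences come from the full three-way model (not three separate one-vs-rest problems), PR curves built this way already reflect the competition between classes.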
Topic multiclass-classification
Category Data Science