Motivation
- The extreme imbalance between foreground and background examples is a key problem when training detection models
- OHEM addresses this issue by automatically selecting hard examples for training
- This paper proposes the Focal Loss, a dynamically scaled cross-entropy loss that down-weights the loss assigned to well-classified examples
Method
Binary CE Loss
Starting from the cross-entropy (CE) loss for binary classification:

$$ \mathrm{CE}(p, y) = \begin{cases} -\log(p) & \text{if } y = 1 \\ -\log(1 - p) & \text{otherwise.} \end{cases} $$

In the above, $y \in \{\pm 1\}$ specifies the ground-truth class and $p \in [0, 1]$ is the model's estimated probability for the class with label $y = 1$. For notational convenience, we define $p_t$ as

$$ p_t = \begin{cases} p & \text{if } y = 1 \\ 1 - p & \text{otherwise,} \end{cases} $$

and rewrite $\mathrm{CE}(p, y) = \mathrm{CE}(p_t) = -\log(p_t)$.
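As a concrete illustration, here is a minimal NumPy sketch of the binary CE written in terms of $p_t$ (the function name `binary_ce` and the $\pm 1$ label convention follow the formulas above; this is not code from any particular library):

```python
import numpy as np

def binary_ce(p, y):
    """Binary cross-entropy written via p_t, matching the formulas above.

    p : estimated probability of the positive class (y = 1), in (0, 1)
    y : ground-truth label in {+1, -1}
    """
    p = np.asarray(p, dtype=float)
    y = np.asarray(y)
    # p_t = p for positives, 1 - p for negatives
    p_t = np.where(y == 1, p, 1.0 - p)
    return -np.log(p_t)

# A well-classified positive (p = 0.9) incurs a small loss,
# while a misclassified positive (p = 0.1) incurs a large one.
print(binary_ce([0.9, 0.1], [1, 1]))  # [0.105..., 2.302...]
```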
Balanced Cross Entropy
A common method for addressing class imbalance is to introduce a weighting factor $\alpha \in [0, 1]$ for class $1$ and $1 - \alpha$ for class $-1$. In practice $\alpha$ may be set by inverse class frequency or treated as a hyper-parameter to set by cross validation. For notational convenience, we define $\alpha_t$ analogously to how we defined $p_t$. We write the $\alpha$-balanced CE loss as:

$$ \mathrm{CE}(p_t) = -\alpha_t \log(p_t). $$
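Extending the sketch above with the $\alpha$ weighting (again illustrative, not library code; the default $\alpha = 0.25$ is the value the paper later pairs with the focal loss):

```python
import numpy as np

def balanced_ce(p, y, alpha=0.25):
    """Alpha-balanced binary CE: positives weighted by alpha,
    negatives by 1 - alpha."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y)
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * np.log(p_t)
```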
Focal Loss
While $\alpha$ balances the importance of positive/negative examples, it does not differentiate between easy/hard examples. Therefore, we propose to add a modulating factor $(1 - p_t)^\gamma$ to the cross-entropy loss, with tunable focusing parameter $\gamma \geq 0$. We define the focal loss as

$$ \mathrm{FL}(p_t) = -(1 - p_t)^\gamma \log(p_t). $$

In practice the paper uses an $\alpha$-balanced variant, $\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^\gamma \log(p_t)$.
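A minimal sketch of the $\alpha$-balanced focal loss following the definition above (function and parameter names are illustrative; $\gamma = 2$, $\alpha = 0.25$ are the settings the paper reports to work best):

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Alpha-balanced focal loss:
    FL = -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p = np.asarray(p, dtype=float)
    y = np.asarray(y)
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

# With gamma = 2, the easy positive (p_t = 0.9) is scaled by
# (1 - 0.9)^2 = 0.01, the hard positive (p_t = 0.1) by 0.81.
print(focal_loss([0.9, 0.1], [1, 1]))  # [0.000263..., 0.466...]
```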
Properties of Focal Loss
- When an example is misclassified and $p_t$ is small, the modulating factor $(1 - p_t)^\gamma$ is near $1$ and the loss is nearly unaffected. As $p_t \to 1$, the factor goes to $0$ and the loss for well-classified examples is down-weighted (see the numeric sketch below).
- The focusing parameter $\gamma$ smoothly adjusts the rate at which easy examples are down-weighted. When $\gamma = 0$, FL is equivalent to CE, and as $\gamma$ is increased the effect of the modulating factor is likewise increased.
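To make the down-weighting concrete, this small sketch tabulates the modulating factor $(1 - p_t)^\gamma$ for a few values of $p_t$ and $\gamma$ (values chosen only for illustration):

```python
for gamma in (0.0, 0.5, 1.0, 2.0, 5.0):
    factors = [(1.0 - p_t) ** gamma for p_t in (0.1, 0.5, 0.9, 0.99)]
    print(f"gamma={gamma}: " + ", ".join(f"{f:.4f}" for f in factors))

# gamma = 0 leaves CE unchanged (the factor is always 1); at gamma = 2
# a well-classified example with p_t = 0.9 is down-weighted 100x, while
# a hard example with p_t = 0.1 still keeps 81% of its CE loss.
```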
RetinaNet
- RetinaNet combines an FPN backbone with the focal loss: a one-stage detector built on a ResNet-FPN backbone, with two parallel subnetworks attached to each pyramid level, one classifying anchors (trained with the focal loss) and one regressing anchor boxes
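For flavor, a minimal PyTorch sketch of the classification subnetwork attached to each FPN level. The class name and defaults are illustrative, but the four 3x3 conv layers with 256 channels and the prior-probability bias initialization $b = -\log((1 - \pi)/\pi)$ with $\pi = 0.01$ follow the paper; the bias init keeps the huge number of easy background anchors from dominating the loss early in training:

```python
import math
import torch.nn as nn

class ClassificationSubnet(nn.Module):
    """Sketch of RetinaNet's classification head: four 3x3 convs
    followed by a prediction layer with A * K outputs per location
    (A anchors, K classes)."""
    def __init__(self, in_channels=256, num_anchors=9, num_classes=80,
                 prior=0.01):
        super().__init__()
        layers = []
        for _ in range(4):
            layers += [nn.Conv2d(in_channels, in_channels, 3, padding=1),
                       nn.ReLU(inplace=True)]
        self.tower = nn.Sequential(*layers)
        self.cls_logits = nn.Conv2d(in_channels, num_anchors * num_classes,
                                    3, padding=1)
        # Initialize the bias so every anchor starts with p ~ prior.
        nn.init.constant_(self.cls_logits.bias,
                          -math.log((1 - prior) / prior))

    def forward(self, feature_map):
        # Applied to each FPN level; weights are shared across levels.
        return self.cls_logits(self.tower(feature_map))
```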