Sep 14, 2015

Loss functions

I am listing out the basic loss functions commonly encountered in computer vision research. Most of the descriptions are quoted from the PyTorch documentation.
  • Classification:
    • Cross-Entropy Loss
    • Binary Cross Entropy with Logits (see the PyTorch sketch below)
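    A minimal PyTorch sketch of the two classification losses above; batch size and class count are illustrative assumptions:

      import torch
      import torch.nn as nn

      # Multi-class: CrossEntropyLoss expects raw logits (N, C) and class indices (N,)
      logits = torch.randn(4, 10)              # batch of 4, 10 classes
      targets = torch.randint(0, 10, (4,))     # ground-truth class indices
      ce = nn.CrossEntropyLoss()(logits, targets)

      # Binary / multi-label: BCEWithLogitsLoss fuses the sigmoid into the loss
      # for numerical stability, so the model outputs raw logits here too.
      bin_logits = torch.randn(4, 1)
      bin_targets = torch.randint(0, 2, (4, 1)).float()
      bce = nn.BCEWithLogitsLoss()(bin_logits, bin_targets)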
  • Regression:

    • Mean Squared Error (MSE)
    • Average MSE:
      • y^{gt}: binary 0/1 vector of the ground truth; y^{pred}: binary 0/1 vector of the prediction.
      • This normalized MSE computes the overall difference in content between the ground truth and the prediction:
      • MSE_avg = (1/N) Σ_i (y_i^{gt} - y_i^{pred})^2
    • Instance MSE:
      • y^{gt} and y^{pred} as above. This normalized MSE emphasizes the foreground prediction, i.e., the 1s in the ground truth vector:
      • MSE_inst = Σ_i y_i^{gt} (y_i^{gt} - y_i^{pred})^2 / Σ_i y_i^{gt}
    • A joint loss (Average MSE + Instance MSE) is the better choice, since it balances the overall content term with the foreground term.
    • L1 Loss
    • Smooth L1 Loss: quadratic for small errors, linear for large ones (see the PyTorch sketch below)
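    A minimal PyTorch sketch of the regression losses above; the instance-MSE line follows the formula reconstructed above, and the mask shapes are illustrative assumptions:

      import torch
      import torch.nn as nn

      pred = torch.rand(8, 1024)                   # predicted binary-mask scores
      gt = torch.randint(0, 2, (8, 1024)).float()  # 0/1 ground-truth vector

      avg_mse = nn.MSELoss()(pred, gt)         # average MSE over all elements
      l1 = nn.L1Loss()(pred, gt)
      smooth_l1 = nn.SmoothL1Loss()(pred, gt)  # quadratic near 0, linear beyond

      # Instance MSE: normalize the squared error over foreground (1s) only
      sq_err = (gt - pred) ** 2
      inst_mse = (gt * sq_err).sum() / gt.sum().clamp(min=1)

      joint = avg_mse + inst_mse               # the joint loss suggested above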

  • Distance Learning:
    • Cosine Embedding Loss
    • Contrastive Loss:
      • L = y · D^2 + (1 - y) · max(0, m - D)^2, where D is the distance between the two embeddings, y = 1 for a similar pair, and m is the margin.
    • Triplet Loss: max(0, D(a, p) - D(a, n) + margin), with anchor a, positive p, negative n (see the PyTorch sketch below)
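    A minimal PyTorch sketch of the three distance-learning losses; embedding sizes and margins are illustrative assumptions, and the contrastive loss is written out by hand since PyTorch has no built-in for it:

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      a = torch.randn(16, 128)                 # anchor embeddings
      p = torch.randn(16, 128)                 # positive / second-branch embeddings
      n = torch.randn(16, 128)                 # negative embeddings
      y = torch.randint(0, 2, (16,)).float()   # 1 = similar pair, 0 = dissimilar

      # Cosine embedding loss expects targets in {1, -1}
      cos_loss = nn.CosineEmbeddingLoss(margin=0.5)(a, p, 2 * y - 1)

      # Contrastive loss: pull similar pairs together, push dissimilar
      # pairs apart until they are at least a margin m away
      m = 1.0
      d = F.pairwise_distance(a, p)
      contrastive = (y * d.pow(2) + (1 - y) * F.relu(m - d).pow(2)).mean()

      # Triplet loss: anchor-positive closer than anchor-negative by a margin
      triplet = nn.TripletMarginLoss(margin=1.0)(a, p, n)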
 
  • 3D Reconstruction Losses:
    • Voxel-based representation.
      • E.g., discretize the space into a 32×32×32 voxel 3D grid.
      • The output layer predicts a per-voxel occupancy probability through a sigmoid, treating reconstruction as a classification task.
      • Binary cross-entropy is computed per voxel between the predicted occupancy and the ground-truth voxel label.
    • Point-cloud based representation.
      • Predict a fixed number of 3D points as a regression task, e.g., a 1024×3 output.
      • The loss is the Chamfer distance, which measures point-to-point similarity between the two point clouds (see the sketch below).
      • Reference: Fan et al., "A Point Set Generation Network for 3D Object Reconstruction from a Single Image", CVPR 2017.
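    A brute-force sketch of the Chamfer distance used as the point-cloud loss; real pipelines use batched / accelerated implementations, and the point counts here are illustrative:

      import torch

      def chamfer_distance(p1, p2):
          # p1: (N, 3), p2: (M, 3) point clouds.
          # Pairwise squared distances, shape (N, M)
          d = torch.cdist(p1, p2).pow(2)
          # Each point's distance to its nearest neighbor in the other cloud,
          # averaged symmetrically over both directions
          return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

      pred = torch.rand(1024, 3)   # predicted points
      gt = torch.rand(1024, 3)     # ground-truth points
      loss = chamfer_distance(pred, gt)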
 
  • Focal Loss
    • TODO: check how to use it for semantic segmentation; a sketch of the standard binary formulation follows below.
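    A sketch of the standard binary focal loss (Lin et al., "Focal Loss for Dense Object Detection", ICCV'17), which down-weights easy examples by the factor (1 - p_t)^gamma; the alpha and gamma values below are the paper's defaults:

      import torch
      import torch.nn.functional as F

      def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
          # Per-element BCE, kept unreduced so it can be re-weighted
          bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
          p = torch.sigmoid(logits)
          p_t = p * targets + (1 - p) * (1 - targets)      # prob of the true class
          alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
          return (alpha_t * (1 - p_t).pow(gamma) * bce).mean()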
  
