Apr 17, 2017

Deep Residual Network (ResNet)

Main idea:
The central idea of the paper itself is simple and elegant. They take a standard feed-forward ConvNet and add skip connections that bypass (or shortcut) a few convolution layers at a time. Each bypass gives rise to a residual block in which the convolution layers predict a residual that is added to the block's input tensor.
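A minimal sketch of such a block (assuming PyTorch; the class name, channel handling, and layer sizes here are illustrative, not taken from the authors' code):

    # Residual block sketch: H(x) = F(x) + x with an identity shortcut
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            # F(x): two 3x3 conv layers with batch norm ("simple design" noted below)
            self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
            self.bn1 = nn.BatchNorm2d(channels)
            self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
            self.bn2 = nn.BatchNorm2d(channels)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            f = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))  # residual F(x)
            return self.relu(f + x)  # H(x) = F(x) + x

The identity shortcut adds no parameters; when a block changes the spatial size or channel count, the paper uses zero-padding or a 1x1 projection on the shortcut instead.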

Reference: K. He, X. Zhang, S. Ren, J. Sun, "Deep Residual Learning for Image Recognition", CVPR 2016 (arXiv:1512.03385); the observations on shortcut variants below follow the follow-up paper "Identity Mappings in Deep Residual Networks", ECCV 2016 (arXiv:1603.05027).

Deep plain feed-forward conv nets tend to suffer from an optimization difficulty as depth grows (high training and high validation error). The residual network architecture addresses this by adding shortcut connections whose output is summed with the output of the convolution layers.

Observations:
  • add the block's input 'x' (via a shortcut) to the output of its conv layers; the conv layers learn the residual F(x)
    • H(x) = F(x) + x
    •      = F(x) + I·x  // I is the identity, hence the term identity mapping for the shortcut
  • If x alone is already sufficient, F(.) will learn to drive its filter weights toward zero (so H(x) ≈ x); otherwise it learns weights that produce the needed residual correction
  • Simply stacking a series of plain conv layers gives large training error: a 56-layer plain net has higher training and test error than a 20-layer plain net, i.e. "overly deep" plain nets have higher training error (an optimization problem, not overfitting, since training error itself goes up)
  • Very simple design (series of fixed 3x3 conv layers)
  • If the shortcut mapping is the identity, the forward pass propagates additively and the loss flows back additively as gradient (as opposed to the purely multiplicative gradient propagation of plain stacked layers); see the sketch after this list
  • What if the shortcut mapping h ≠ identity?
    • e.g., a 1x1 conv, a gating function, or constant scaling by 0.5 on the shortcut all increase the error
  • Keep the shortest path (the chain of shortcuts) as clean as possible by
    • using the identity mapping on it
    • letting forward/backward signals flow directly through this path
  • More on ResNet details
  • Update (09/25/21): another good ResNet tutorial 
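To make the last few points concrete, here is a toy autograd check (PyTorch assumed; the scalar "blocks" and the tanh residual branch are made up purely for illustration, not the paper's setup):

    # Toy check of gradient flow through a stack of shortcut blocks
    import torch

    def stack_of_blocks(x, scale, n_blocks=20):
        # Each toy block computes H(x) = scale*x + F(x), with a small residual branch F
        for _ in range(n_blocks):
            x = scale * x + 0.01 * torch.tanh(x)
        return x

    x = torch.tensor(1.0, requires_grad=True)
    stack_of_blocks(x, scale=1.0).backward()        # identity shortcut, h(x) = x
    print("identity shortcut   d(out)/d(in):", x.grad.item())  # stays close to 1

    x = torch.tensor(1.0, requires_grad=True)
    stack_of_blocks(x, scale=0.5).backward()        # scaled shortcut, h(x) = 0.5*x
    print("0.5-scaled shortcut d(out)/d(in):", x.grad.item())  # on the order of 0.5**20

With h(x) = x each block contributes a factor of (1 + dF/dx), so the gradient keeps a direct additive path back to the input; with h(x) = 0.5x the direct term shrinks by 0.5 per block and vanishes over many blocks, which is why non-identity shortcuts increase the error.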
