synapses = weights = parameters

neurons = features = activations

Popular neural network building blocks:

1. FC Layer
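
    A minimal sketch tying the FC layer to the synapse/neuron vocabulary above; PyTorch is an assumption here (the notes don't name a framework):

    ```python
    import torch.nn as nn

    # FC (fully-connected / linear) layer: y = W x + b
    fc = nn.Linear(in_features=512, out_features=10)
    print(fc.weight.shape)                           # (10, 512): c_o x c_i weights ("synapses")
    print(sum(p.numel() for p in fc.parameters()))   # 10*512 + 10 = 5130 parameters
    ```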

2. Convolution Layer: $h_o = h_i - k_h + 1$ (no padding, stride 1)

    1. Padding: $h_o = h_i + 2p - k_h + 1$

       Padding of size $p$ adds $p$ pixels to each side, so the height and width each increase by $2p$.

        1. **Zero Padding**: pads the input boundaries with zeros.
            1. $h_o = h_i + 2p - k_h + 1$; choosing $p = (k_h - 1)/2$ gives $h_o = h_i$ ("same" padding)
        2. Other Paddings: **Reflection Padding, Replication Padding** …
    2. Receptive Field: $L \cdot (k - 1) + 1$; each of $L$ stacked layers with kernel size $k$ and stride 1 extends the receptive field by $k - 1$ pixels, starting from 1

        1. Problem: with big images, we need many (deeper) layers for the receptive field to cover the whole input
        2. Solution: Strided Convolution $h_o = \frac{h_i + 2p - k_h}{s} + 1$
            1. think of the window as jumping $s$ pixels at a time instead of sliding by 1 (size checks for these formulas are in the sketch below)
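
    A minimal size-check sketch for the formulas above; PyTorch is an assumption here, and the channel counts and sizes are arbitrary:

    ```python
    import torch
    import torch.nn as nn

    x = torch.randn(1, 3, 32, 32)  # (batch, c_i, h_i, w_i) with h_i = 32

    # No padding: h_o = h_i - k_h + 1 = 32 - 3 + 1 = 30
    print(nn.Conv2d(3, 8, kernel_size=3)(x).shape)              # (1, 8, 30, 30)

    # Zero padding with p = (k_h - 1)/2 = 1: h_o = 32 + 2 - 3 + 1 = 32 ("same")
    print(nn.Conv2d(3, 8, 3, padding=1)(x).shape)               # (1, 8, 32, 32)

    # Reflection padding instead of zeros; the size arithmetic is identical
    print(nn.Conv2d(3, 8, 3, padding=1, padding_mode='reflect')(x).shape)

    # Stride s = 2: h_o = (h_i + 2p - k_h)/s + 1 = (32 + 2 - 3)//2 + 1 = 16
    print(nn.Conv2d(3, 8, 3, stride=2, padding=1)(x).shape)     # (1, 8, 16, 16)

    # Receptive field: L layers of k = 3, stride 1 -> L*(k - 1) + 1.
    # With L = 3 that is 7, so a 7x7 input shrinks to 1x1: the single
    # output pixel "sees" the entire 7x7 input.
    y = torch.randn(1, 1, 7, 7)
    for _ in range(3):
        y = nn.Conv2d(1, 1, kernel_size=3)(y)
    print(y.shape)  # (1, 1, 1, 1)
    ```
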
3. Grouped Convolution Layer: splits the channels into $g$ groups, reducing the weights by a factor of $g$ (counted in the sketch below)

4. Depthwise Convolution Layer ($g = c_i = c_o$): each output channel is connected to only one input channel; the extreme case of a grouped convolution layer
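
    Same PyTorch assumption as above; a quick count of how groups shrink the weights (64 channels and 3x3 kernels are arbitrary):

    ```python
    import torch.nn as nn

    def n_weights(conv):
        return conv.weight.numel()

    # Standard conv: c_o * c_i * k_h * k_w = 64 * 64 * 3 * 3
    print(n_weights(nn.Conv2d(64, 64, 3, bias=False)))             # 36864

    # Grouped conv, g = 4: c_o * (c_i / g) * k_h * k_w, i.e. 4x fewer
    print(n_weights(nn.Conv2d(64, 64, 3, groups=4, bias=False)))   # 9216

    # Depthwise conv, g = c_i = c_o: each filter sees a single channel
    print(n_weights(nn.Conv2d(64, 64, 3, groups=64, bias=False)))  # 576
    ```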

5. Pooling Layer: downsamples the feature map to a smaller size; the stride usually equals the kernel size (quick check below)
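
    Quick check under the same PyTorch assumption: pooling has no learnable weights, and the stride defaults to the kernel size, so a 2x2 max pool halves the spatial size:

    ```python
    import torch
    import torch.nn as nn

    x = torch.randn(1, 8, 32, 32)
    pool = nn.MaxPool2d(kernel_size=2)                 # stride defaults to kernel_size
    print(pool(x).shape)                               # (1, 8, 16, 16)
    print(sum(p.numel() for p in pool.parameters()))   # 0: nothing to learn
    ```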

6. Normalization Layer: normalizes the feature map so that training converges faster and more stably; the variants below differ in which axes they average over (see the sketch after this list)

    1. Batch Norm
    2. Layer Norm
    3. Instance Norm
    4. Group Norm
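
    A sketch of the four variants under the same PyTorch assumption; all of them map the feature map to zero mean and unit variance, differing only in the axes the statistics are computed over:

    ```python
    import torch
    import torch.nn as nn

    x = torch.randn(2, 8, 4, 4)     # (N, C, H, W)

    bn = nn.BatchNorm2d(8)          # per channel, statistics over (N, H, W)
    ln = nn.LayerNorm([8, 4, 4])    # per sample, statistics over (C, H, W)
    inorm = nn.InstanceNorm2d(8)    # per sample and channel, over (H, W)
    gn = nn.GroupNorm(2, 8)         # per sample, over groups of C/g channels

    # GroupNorm sits between the last two: GroupNorm(1, C) behaves like
    # LayerNorm over (C, H, W), and GroupNorm(C, C) like InstanceNorm.
    for norm in (bn, ln, inorm, gn):
        print(norm(x).shape)        # shape is unchanged: (2, 8, 4, 4)
    ```
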
7. Activation Function (compared in the sketch after this list)

    1. Sigmoid
    2. ReLU → brings sparsity (all negative values → 0)
    3. ReLU6 → like ReLU, but the output is clipped at 6
    4. Leaky ReLU
    5. Swish
    6. Hard Swish
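
    A quick comparison of the six activations on sample values, under the same PyTorch assumption:

    ```python
    import torch
    import torch.nn.functional as F

    x = torch.linspace(-8, 8, 9)     # [-8, -6, -4, -2, 0, 2, 4, 6, 8]

    print(torch.sigmoid(x))          # squashes everything into (0, 1)
    r = F.relu(x)
    print(r, (r == 0).sum().item())  # negatives -> 0: sparse output (5 zeros)
    print(F.relu6(x))                # like ReLU, but clipped at 6
    print(F.leaky_relu(x, 0.1))      # small negative slope instead of a hard 0
    print(F.silu(x))                 # Swish: x * sigmoid(x)
    print(F.hardswish(x))            # cheap piecewise approximation of Swish
    ```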