Sigmoid ($\sigma(z) = \frac {1}{1 + e^{-z}}$)
- Can saturate when $|z|$ is large, so the gradient becomes very small
- Has a nice derivative: $\sigma'(z) = \sigma(z)(1 - \sigma(z))$
- Transforms $(-\infty, \infty) \to (0, 1)$
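A minimal NumPy sketch of the sigmoid and its derivative, following the formulas above (the function names are illustrative, not from any particular library):

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: maps (-inf, inf) into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Derivative: sigma'(z) = sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

# Saturation: for large |z| the derivative is nearly 0, so gradients vanish.
print(sigmoid_prime(0.0))   # 0.25 (the maximum)
print(sigmoid_prime(10.0))  # ~4.5e-05
```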
Tanh ($\tanh(z) = \frac {e^z - e^{-z}}{e^z + e^{-z}}$)
- Is a rescaled version of the sigmoid: $\sigma(z)=\frac {1 + \tanh(\frac z 2)}{2}$
- Transforms $(-\infty, \infty) \to (-1, 1)$, so it is zero-centered
- Outputs (or even inputs) may need to be normalized to form a probability distribution, since tanh values lie in $(-1, 1)$ rather than $(0, 1)$
- $\tanh'(z) = 1 - \tanh^2(z)$
- Empirically provides little or no performance improvement over sigmoid neurons
- Can also saturate
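A short sketch, assuming NumPy, that computes the tanh derivative and numerically checks the rescaling identity $\sigma(z)=\frac{1 + \tanh(z/2)}{2}$ from above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh_prime(z):
    """Derivative: tanh'(z) = 1 - tanh(z)**2."""
    return 1.0 - np.tanh(z) ** 2

# Numerical check of the rescaling identity sigma(z) = (1 + tanh(z/2)) / 2
z = np.linspace(-5.0, 5.0, 11)
assert np.allclose(sigmoid(z), (1.0 + np.tanh(z / 2.0)) / 2.0)

# tanh also saturates: its derivative approaches 0 for large |z|.
print(tanh_prime(0.0))  # 1.0 (the maximum)
print(tanh_prime(5.0))  # ~1.8e-04
```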
Rectified Linear neuron ($\text{ReLU}(z) = \max(0, z)$)
- Doesn’t saturate for positive inputs
- Doesn’t learn if the weighted input is negative, as the gradient is then 0
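A minimal NumPy sketch of ReLU and its (sub)gradient, illustrating why no learning happens for negative weighted inputs (function names are illustrative):

```python
import numpy as np

def relu(z):
    """ReLU(z) = max(0, z), applied element-wise."""
    return np.maximum(0.0, z)

def relu_prime(z):
    """Subgradient: 1 where z > 0, 0 where z <= 0."""
    return (z > 0).astype(float)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))        # [0.  0.  0.  0.5 2. ]
print(relu_prime(z))  # [0. 0. 0. 1. 1.] -> no gradient flows for negative inputs
```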