====== Backpropagation ======

Backpropagation is the algorithm for computing the gradients of a loss function with respect to a network's parameters. It applies the chain rule layer by layer, from the output back to the input, so each layer receives its gradient signal from the layer after it.

[[concepts:vanishing_gradients|Vanishing gradients]] occur when these signals shrink toward zero as they pass backward through the many layers of a deep network. [[concepts:residual_connections|Residual connections]] create the [[concepts:gradient_highway|gradient highway]] that preserves gradient flow.

See also: [[concepts:vanishing_gradients]], [[concepts:gradient_highway]], [[concepts:residual_connections]]
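
As a rough illustration of the layer-by-layer chain rule described on this page, the following sketch runs backpropagation through a chain of scalar tanh layers. The toy setup (scalar layers, squared-error loss) and all names (''forward'', ''backward'', ''ws'', the ''residual'' flag) are assumptions made for this example and are not taken from any particular library. The ''residual'' flag adds a skip path, so the same code can hint at why residual connections help against vanishing gradients.

```python
import math


def forward(ws, x, residual=False):
    """Run a chain of scalar tanh layers and return every activation."""
    hs = [x]
    for w in ws:
        out = math.tanh(w * hs[-1])
        # Hypothetical skip path: add the layer input back to its output.
        hs.append(hs[-1] + out if residual else out)
    return hs


def loss_fn(ws, x, y, residual=False):
    """Squared-error loss on the final activation."""
    return 0.5 * (forward(ws, x, residual)[-1] - y) ** 2


def backward(ws, x, y, residual=False):
    """Return (dL/dw for each layer, dL/dx), walking from output to input."""
    hs = forward(ws, x, residual)
    grad_h = hs[-1] - y  # dL/dh at the output for L = 0.5 * (h - y)^2
    grad_ws = [0.0] * len(ws)
    for k in reversed(range(len(ws))):
        h_in = hs[k]
        local = 1.0 - math.tanh(ws[k] * h_in) ** 2  # derivative of tanh
        grad_ws[k] = grad_h * local * h_in          # chain rule w.r.t. w_k
        # Chain rule w.r.t. the layer input; the skip path contributes +1.
        grad_h = grad_h * (local * ws[k] + (1.0 if residual else 0.0))
    return grad_ws, grad_h


# Toy comparison: 20 layers with the same (assumed) weight.
ws = [0.5] * 20
x, y = 1.0, 1.0
for residual in (False, True):
    hs = forward(ws, x, residual)
    _, grad_x = backward(ws, x, y, residual)
    # Ratio of the gradient at the input to the gradient at the output.
    ratio = grad_x / (hs[-1] - y)
    print("residual" if residual else "plain", ratio)
```

With these weights, the plain chain multiplies the gradient by a factor of at most 0.5 at every backward step, so the ratio printed for it is many orders of magnitude below 1. In the residual variant each step multiplies by 1 plus a nonnegative term, so the gradient reaching the input is at least as large as the gradient at the output.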