
Residual Connections

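Residual connections are skip connections that add a layer's input to its output: h_l = h_{l-1} + f(h_{l-1}), where f is the layer's transformation. The identity path lets gradients flow directly through deep networks. Unrolling the recurrence gives h_L = h_0 + f(h_0) + f(h_1) + ... + f(h_{L-1}), so the hidden state accumulates every prior output with a fixed unit weight. As depth grows, each individual layer's output becomes a smaller share of that sum, which dilutes its contribution.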
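The unit-weight accumulation can be checked numerically. The following is a minimal NumPy sketch, not a reference implementation: the names residual_stack and make_layer, the tanh layer, and the sizes dim and depth are all assumptions chosen for illustration. It applies the update rule layer by layer and confirms that the final hidden state equals the input plus the plain sum of all layer outputs.

```python
import numpy as np

def residual_stack(h0, layers):
    """Apply h_l = h_{l-1} + f(h_{l-1}) for each layer in order.

    Returns the final hidden state and the list of per-layer outputs.
    """
    h = h0
    outputs = []
    for f in layers:
        out = f(h)           # the layer's transformation of its input
        outputs.append(out)
        h = h + out          # skip connection: add input to output
    return h, outputs

def make_layer(rng, dim, scale=0.1):
    """Hypothetical toy layer: a random linear map followed by tanh."""
    W = rng.normal(scale=scale / np.sqrt(dim), size=(dim, dim))
    return lambda h: np.tanh(h @ W)

rng = np.random.default_rng(0)
dim, depth = 64, 32          # illustrative sizes only
h0 = rng.normal(size=dim)
layers = [make_layer(rng, dim) for _ in range(depth)]

h_final, outputs = residual_stack(h0, layers)

# Unrolled form: h_L = h_0 + sum of all layer outputs, each with weight 1.
unrolled = h0 + np.sum(outputs, axis=0)
print("matches unrolled sum:", np.allclose(h_final, unrolled))

# Relative size of the last layer's output compared with the full state.
share = np.linalg.norm(outputs[-1]) / np.linalg.norm(h_final)
print("last layer's share of the hidden state norm:", share)
```

The share printed at the end is one rough way to inspect dilution: rerunning with a larger depth shows how a single layer's output compares with the accumulated state. The exact values depend on the toy layer chosen here and say nothing about any particular trained model.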

See also: attention_residuals, prenorm, gradient_highway, hidden_state_growth, layer_pruning

concepts/residual_connections.txt · Last modified: by aethersync
