
Layer Normalization

Layer normalization rescales each token's activations across the feature dimension to zero mean and unit variance, followed by a learned scale and shift, to stabilize training. In a transformer block it is applied either before the sublayer (PreNorm) or after it (PostNorm). PreNorm dominates modern LLMs, but it allows hidden-state norms to grow with depth.
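The two placements can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular library's implementation; the function names and the `eps` default are assumptions.

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature vector to zero mean and unit variance,
    # then apply the learned scale (gamma) and shift (beta).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def pre_norm_block(x, sublayer, gamma, beta):
    # PreNorm: normalize the input to the sublayer; the residual
    # stream itself is left unnormalized, so its norm can grow.
    return x + sublayer(layer_norm(x, gamma, beta))

def post_norm_block(x, sublayer, gamma, beta):
    # PostNorm: add the residual first, then normalize the sum.
    return layer_norm(x + sublayer(x), gamma, beta)
```

With an identity sublayer, `post_norm_block` keeps hidden-state norms bounded at every layer, while `pre_norm_block` lets them accumulate, which is the hidden-state growth noted above.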

See also: prenorm, postnorm, hidden_state_growth, attention_residuals

concepts/layer_normalization.txt · Last modified: by aethersync
