Positional Encoding
Softmax attention is permutation-invariant, so positional encodings inject sequence-order information into Transformer inputs. Common variants include sinusoidal encodings (the original scheme), learned embeddings, and rotary encodings (RoPE). Positional encodings are not directly modified by Attention Residuals, but RoPE interacts with linear attention in Kimi Linear.
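A minimal sketch of the original sinusoidal scheme, where each position gets sine/cosine values at geometrically spaced frequencies (function name and NumPy implementation are illustrative, not taken from any particular codebase):

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    # PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    # PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]       # (1, d_model/2)
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dimensions
    pe[:, 1::2] = np.cos(angles)               # odd dimensions
    return pe

# The resulting matrix is added to the token embeddings before the first layer.
pe = sinusoidal_positional_encoding(seq_len=128, d_model=64)
```

Because the frequencies are fixed rather than learned, the same function extrapolates to any sequence length at inference time.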
See also: transformer, softmax_attention, kimi_linear
concepts/positional_encoding.txt · Last modified: by aethersync
