====== Papers ======

Notes and summaries of research papers.

  * [[papers:attention_residuals|Attention Residuals]]: Kimi Team technique replacing fixed residual weights with learned softmax attention over layer outputs.