Papers
Notes and summaries of research papers.
- Attention Residuals — Kimi Team technique replacing fixed residual weights with learned softmax attention over layer outputs.
papers/start.txt · Last modified: by aethersync
