Papers

Notes and summaries of research papers.

  • Attention Residuals — Kimi Team technique replacing fixed residual weights with learned softmax attention over layer outputs.
papers/start.txt · Last modified: by aethersync
