Scaling State Space Models with Block-Sparsity and Fused Kernels
A scaling story for oscillator state-space layers: block-sparse projection heads control dense coupling, while the IO-aware FlashDOSS kernel fuses the projection, scan, and back-projection steps to avoid expensive state-domain traffic.