Wave-Attractor-Tree: A Hierarchical Binary Tree Reduction Architecture for Efficient Sequence Modeling

arXiv:2603.00812v1 (announce type: new)

Abstract: This work introduces a hierarchical binary-tree reduction that replaces standard self-attention. The core idea is a recursive Gated Linear Unit (GLU) merge operation, which achieves O(n) total merge operations, O(log n) parallel depth, O(n d^2) total work, and O(n) space complexity. In the reported experiments, the model significantly outperforms standard Transformers in both convergence speed and accuracy on long-range structural dependencies, specifically where a hierarchical inductive bias is critical.
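
To make the complexity claims concrete, here is a minimal sketch of a pairwise tree reduction with a GLU-style merge. The abstract does not specify the merge function, so everything here is an assumption for illustration: the names glu_merge, tree_reduce, W_value, and W_gate are hypothetical, and the merge is taken to be a linear projection of the concatenated children multiplied elementwise by a sigmoid gate. Under that assumption, each merge multiplies a d x 2d matrix by a vector, so it costs O(d^2); a sequence of n nodes needs n/2 + n/4 + ... + 1 = n - 1 merges, giving O(n d^2) total work, and the merges within one level are independent, giving ceil(log2 n) sequential levels.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    # Dense matrix-vector product; W is a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def make_params(d, seed=0):
    # Hypothetical parameters: two d x 2d matrices shared by every merge.
    rng = random.Random(seed)

    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(2 * d)] for _ in range(d)]

    return {"W_value": mat(), "W_gate": mat()}


def glu_merge(left, right, params):
    # Assumed GLU-style merge: project the concatenated children,
    # then gate the value path elementwise. Cost is O(d^2).
    x = left + right  # list concatenation, length 2d
    value = matvec(params["W_value"], x)
    gate = matvec(params["W_gate"], x)
    return [v * sigmoid(g) for v, g in zip(value, gate)]


def tree_reduce(nodes, params):
    # Reduce a non-empty list of d-dimensional vectors to a single root.
    # Returns (root, number_of_merges, number_of_levels).
    if not nodes:
        raise ValueError("tree_reduce needs at least one node")
    merges = 0
    depth = 0
    while len(nodes) > 1:
        next_level = []
        # Merges within a level are independent, so they could run in parallel.
        for i in range(0, len(nodes) - 1, 2):
            next_level.append(glu_merge(nodes[i], nodes[i + 1], params))
            merges += 1
        if len(nodes) % 2 == 1:
            next_level.append(nodes[-1])  # an unpaired node is carried up unchanged
        nodes = next_level
        depth += 1
    return nodes[0], merges, depth


if __name__ == "__main__":
    d = 4
    params = make_params(d)
    seq = [[float(i)] * d for i in range(8)]
    root, merges, depth = tree_reduce(seq, params)
    print(len(root), merges, depth)
```

For a sequence of 8 vectors this performs 7 merges over 3 levels. Sharing one set of weights across all levels and carrying an unpaired node up unchanged are choices made for this sketch only; the paper may handle both differently.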