A Comparative Analysis of Representation Flow in State-Space and Transformer Architectures

State-space models (SSMs) have emerged as promising alternatives to Transformers, particularly for long-context tasks, owing to their efficiency in modeling long-range dependencies through structured state transitions. While prior work has focused on interpreting final-layer outputs, this study investigates how features flow across layers in SSMs. By comparing this behavior with that of Transformers, we identify fundamental differences in how contextual information is encoded and propagated. Our analysis reveals trade-offs between efficiency and expressivity, offering a deeper understanding of the learning dynamics of both architectures. This work not only advances our understanding of SSMs but also lays the foundation for designing hybrid models that combine the strengths of both paradigms.
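To make "feature flow across layers" concrete, the sketch below shows one simple way such an analysis could be set up: extract every layer's hidden states from a pretrained model and compare adjacent layers with cosine similarity. This is a minimal illustration, not the method used in the study; the Hugging Face checkpoints named in the comments and the cosine-similarity probe are assumptions for the example.

```python
# A minimal sketch (not the authors' pipeline): extract per-layer hidden
# states from a pretrained model and measure how strongly the representation
# changes between consecutive layers. Model names and the cosine-similarity
# probe are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer


def layerwise_similarity(model_name: str, text: str) -> list[float]:
    """Cosine similarity between mean-pooled hidden states of adjacent layers."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # hidden_states is a tuple of (num_layers + 1) tensors with shape
    # (batch, seq_len, hidden_dim), starting from the embedding output.
    states = outputs.hidden_states
    sims = []
    for prev, curr in zip(states[:-1], states[1:]):
        a = prev.mean(dim=1)  # mean-pool over the token dimension
        b = curr.mean(dim=1)
        sims.append(torch.nn.functional.cosine_similarity(a, b).item())
    return sims


# Example usage with placeholder checkpoints (a Transformer vs. an SSM):
# transformer_flow = layerwise_similarity("gpt2", "The quick brown fox jumps.")
# ssm_flow = layerwise_similarity("state-spaces/mamba-130m-hf", "The quick brown fox jumps.")
```

Running the same probe on a Transformer checkpoint and an SSM checkpoint yields one similarity curve per model, which can then be compared layer by layer.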

Nhat M. Hoang
Research Intern (Jul ‘24)

My name is Nhat, a research assistant at NTU Nail Lab, Singapore. I’m interested in generative AI, multimodal learning, and large language models.

Xuan Long Do
A*STAR Doctoral Student (Aug ‘23)
Co-Supervised by Kenji Kawaguchi

Min-Yen Kan
Associate Professor

WING lead; interests include Digital Libraries, Information Retrieval and Natural Language Processing.