ITSC 2025 Paper Abstract

Paper TH-LM-T30.2

Nguyen, Quang Nhat (The University of Melbourne), Sarvi, Majid (The University of Melbourne), Bagloee, Saeed (The University of Melbourne)

Mamba-Byte-Traffic: Token-Free Byte-Level Traffic Flow Prediction with Selective State-Space Model

Scheduled for presentation during the Regular Session "S30a-Intelligent Modeling and Prediction of Traffic Dynamics" (TH-LM-T30), Thursday, November 20, 2025, 10:50−11:10, Gold Coast

2025 IEEE 28th International Conference on Intelligent Transportation Systems (ITSC), November 18-21, 2025, Gold Coast, Australia

Keywords: AI, Machine Learning for Real-time Traffic Flow Prediction and Management

Abstract

Traffic flow forecasting plays an important role in intelligent transportation systems, from enabling centralised traffic control to providing prediction-optimised navigation to every road user. To this end, the Selective State-Space Model (Mamba) has recently emerged as the state of the art, owing to its lower computational complexity and stronger handling of very long contexts. Traffic flow forecasting is a subtask of time series forecasting, which is currently limited by the need for tokenisation, whereby the entire time dimension is collapsed into an embedded token. This undermines the long-context comprehension and long-term information retention at which Mamba particularly excels. To model time series without tampering with the time dimension, we introduce Mamba-Byte-Traffic: a token-free approach in which the time series is read byte by byte as a very long sequence of digits, leveraging Mamba's long-context capabilities. By training the model to predict both the next byte and the future time series values, we let it learn the dynamics inherent in the data, as expressed directly by the numerical digits, thereby eliminating any dependency on a tokeniser or on feature engineering for learning inter-channel dynamics. Next-byte prediction training also propagates the model's knowledge towards the final latent states, enabling natural-language-like modelling of time series. Experimental results show that our model, by modelling traffic flow like natural language, outperforms state-of-the-art methods by 3.38% to 33.45% (11.42% on average) in prediction MSE on 5 datasets, while maintaining linear computational complexity and edge-compute-level inference cost.
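As an illustration of the token-free serialisation the abstract describes, the Python sketch below shows one plausible way to render a traffic-flow series as a raw byte sequence of digits and to form next-byte prediction targets. The function names, the fixed-precision formatting, and the comma separator are assumptions made for illustration; the paper does not specify its exact serialisation scheme.

    # Hedged sketch: serialise a traffic-flow series byte by byte, leaving the
    # time dimension intact (no patching or token embedding). Format details
    # (precision, separator) are assumptions, not taken from the paper.

    def serialise_series(values, decimals=1, sep=","):
        """Render each reading with fixed precision, then encode as raw bytes."""
        text = sep.join(f"{v:.{decimals}f}" for v in values)
        return list(text.encode("ascii"))  # one integer (0-255) per byte

    def next_byte_pairs(byte_ids):
        """Shifted (input, target) pairs for next-byte prediction training."""
        return list(zip(byte_ids[:-1], byte_ids[1:]))

    if __name__ == "__main__":
        flow = [312.0, 305.5, 298.7, 310.2]   # e.g. vehicles per 5-min interval
        ids = serialise_series(flow)
        print(bytes(ids).decode("ascii"))     # "312.0,305.5,298.7,310.2"
        print(next_byte_pairs(ids)[:3])       # first three training pairs

The resulting byte stream is far longer than a patched/tokenised encoding of the same series, which is precisely the regime where a linear-complexity selective state-space model is expected to have an advantage over attention-based approaches.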

