Meta-Reinforcement Learning for Fast and Data-Efficient Spectrum Allocation in Dynamic Wireless Networks

1African Institute for Mathematical Sciences, 2Stanford University, 3University of Oklahoma, 4University of Glasgow

Meta-reinforcement learning (meta-RL) speeds up an RL agent's adaptation to a new environment by learning an initial policy that can be fine-tuned quickly. In a wireless environment, meta-RL captures the dynamic behavior of the network components, which supports safe exploration by the agent.

Abstract

Efficient spectrum allocation is vital for 5G/6G networks, yet traditional deep reinforcement learning (DRL) methods suffer from high sample complexity and unsafe exploration that can disrupt network stability. To address these challenges, we propose a meta-learning framework that learns a robust initial policy capable of rapid and safe adaptation to changing wireless conditions. Using model-agnostic techniques, we implement three meta-learning architectures: model-agnostic meta-learning (MAML), a recurrent neural network (RNN), and an RNN with a self-attention mechanism. We compare them against a DRL baseline and classical heuristic approaches in a dynamic integrated access/backhaul (IAB) environment. The attention-based agent achieves a peak throughput of approximately 49 Mbps, reduces signal-to-interference-plus-noise ratio (SINR) and latency violations by over 60% relative to proximal policy optimization (PPO), and attains 97% of the fairness level of the exhaustive-search upper bound. These results demonstrate that meta-learning enables data-efficient, reliable, and scalable spectrum management for next-generation wireless systems.

Results

Figures (four panels): Fairness Index; Mean Latency Violations; Mean SINR Violations; Network Throughput.
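
The page does not say which fairness index the results use. A common choice in wireless resource allocation is Jain's fairness index, and the sketch below assumes that definition purely for illustration. The function name `jain_fairness` and the example throughput values are hypothetical and are not taken from the paper.

```python
import numpy as np


def jain_fairness(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Assumed definition (the source does not name its index).
    Returns 1.0 when all users get equal throughput and 1/n when
    a single user gets everything. Returns 0.0 for empty or
    all-zero input to avoid dividing by zero.
    """
    x = np.asarray(throughputs, dtype=float)
    if x.size == 0 or not np.any(x):
        return 0.0
    return float(x.sum() ** 2 / (x.size * np.sum(x ** 2)))


# Hypothetical per-user throughputs in Mbps.
equal_share = jain_fairness([10.0, 10.0, 10.0, 10.0])
single_user = jain_fairness([40.0, 0.0, 0.0, 0.0])
```

Under this definition, a statement such as "97% of the fairness level of the exhaustive-search upper bound" would correspond to the ratio of two such index values, one for the learned agent and one for the exhaustive search.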

Related Works

Several excellent prior works were very useful in completing this one.

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks was the basis of our architecture.
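
To make the idea behind MAML concrete, here is a minimal sketch of its inner and outer loops on a toy supervised problem. It is not the authors' RL setup: the scalar model `y_hat = w * x`, the task distribution (random slopes), the function names, and all hyperparameters are assumptions chosen so that the second-order term of the meta-gradient can be written analytically.

```python
import numpy as np


def loss_and_grad(w, x, y):
    """Mean squared error of the toy model y_hat = w * x, and d(loss)/dw."""
    err = w * x - y
    return float(np.mean(err ** 2)), float(np.mean(2.0 * err * x))


def adapt(w, x, y, alpha=0.1):
    """Inner loop: one gradient step on a task's support data."""
    _, grad = loss_and_grad(w, x, y)
    return w - alpha * grad


def maml_train(num_iters=500, tasks_per_batch=8, alpha=0.1, beta=0.05, seed=0):
    """Outer loop: learn an initialisation w that adapts well in one step."""
    rng = np.random.default_rng(seed)
    w = 0.0  # meta-learned initial parameter
    for _ in range(num_iters):
        meta_grad = 0.0
        for _ in range(tasks_per_batch):
            slope = rng.uniform(1.0, 3.0)  # hypothetical task parameter
            x_s, x_q = rng.uniform(-1, 1, 10), rng.uniform(-1, 1, 10)
            y_s, y_q = slope * x_s, slope * x_q
            # Adapt on the support set, then evaluate on the query set.
            w_fast = adapt(w, x_s, y_s, alpha)
            _, grad_q = loss_and_grad(w_fast, x_q, y_q)
            # Chain rule through the inner step:
            # d(w_fast)/dw = 1 - alpha * (second derivative of support loss).
            hess_s = float(np.mean(2.0 * x_s ** 2))
            meta_grad += grad_q * (1.0 - alpha * hess_s)
        w -= beta * meta_grad / tasks_per_batch
    return w


w_init = maml_train()
```

In this toy setting the learned `w_init` settles near the centre of the slope range, so a single call to `adapt` on a new task's data moves it toward that task's slope. In an RL setting the same two-level structure applies, with policy-gradient losses in place of the squared error and automatic differentiation in place of the hand-written chain rule.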

Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning is another excellent reference.

BibTeX

@article{oluwaseyi2025,
  author    = {Oluwaseyi, Giwa and Tobi, Ebenezer Awodumila and Muhammad, Ahmed Mohsin and Ahsan, Bilal and Muhammad, Ali Jamshed},
  title     = {Meta-Reinforcement Learning for Fast and Data-Efficient Spectrum Allocation in Dynamic Wireless Networks},
  journal   = {IEEE Wireless Communications Letters},
  year      = {2026}
}