Meta-Learned Safe Exploration for Data-Efficient Deep Reinforcement Learning in Dynamic Wireless Spectrum Allocation

¹African Institute for Mathematical Sciences, ²Stanford University

MetaRL extends vanilla reinforcement learning with safe exploration for agents in dynamic environments such as wireless networks. Unlike classical reinforcement learning, it uses a constrained Markov decision process (CMDP).

Abstract

Abstract coming soon.

Safe Deep Reinforcement Learning in Wireless Networks

Safe DRL has mostly been applied in robotics, where it has brought significant improvements. This motivated us to apply it to spectrum allocation in dynamic wireless networks.
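
For reference, the CMDP mentioned above is usually stated as a reward-maximization problem with a bound on expected cumulative cost. The block below is the standard textbook form, not necessarily the exact formulation used in this work. The symbols are assumptions chosen for illustration: r is the reward (for example, throughput), c is a safety cost (for example, interference caused to other users), d is the cost budget, and γ is the discount factor.

```latex
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t) \right] \le d
```

Classical reinforcement learning corresponds to dropping the constraint, so the agent is free to take costly actions whenever they raise the expected return.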

Algorithms

Different Algorithms

We developed two algorithms, one based on model-agnostic meta-learning (MAML) and one based on recurrent meta-learning, and compared the actions each takes in our safe DRL environment.
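
To make the MAML idea concrete, here is a minimal sketch of its two-level update on a toy problem. It is not the implementation used in this work: it uses the first-order approximation of MAML, a one-parameter linear model, and a hypothetical task family (regression with a random slope) in place of the wireless environment. All function names and hyperparameters are illustrative assumptions.

```python
import random


def mse_grad(w, xs, ys):
    """Gradient in w of the mean squared error of the model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def sample_task(rng, n=20):
    """Hypothetical task: fit y = slope * x, with slope drawn from [1, 3]."""
    slope = rng.uniform(1.0, 3.0)
    support_x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    query_x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return (support_x, [slope * x for x in support_x],
            query_x, [slope * x for x in query_x])


def fomaml_step(w, tasks, inner_lr=0.05, outer_lr=0.1):
    """One first-order MAML meta-update over a batch of tasks."""
    meta_grad = 0.0
    for support_x, support_y, query_x, query_y in tasks:
        # Inner loop: adapt the shared parameter to this task with one step.
        w_adapted = w - inner_lr * mse_grad(w, support_x, support_y)
        # Outer loop: evaluate the adapted parameter on held-out query data.
        # First-order approximation: the derivative of w_adapted with
        # respect to w is ignored.
        meta_grad += mse_grad(w_adapted, query_x, query_y)
    return w - outer_lr * meta_grad / len(tasks)


def train(steps=200, batch_size=8, seed=0):
    """Meta-train the initial parameter from w = 0 on freshly sampled tasks."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        tasks = [sample_task(rng) for _ in range(batch_size)]
        w = fomaml_step(w, tasks)
    return w
```

Calling `train()` returns a meta-learned initialization from which a single gradient step adapts to a new task. A recurrent meta-learner would instead keep the weights fixed at test time and carry the adaptation in a recurrent hidden state.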

BibTeX

@article{oluwaseyi2025,
  author    = {Oluwaseyi, Giwa and Tobi, Awodunmila and Muhammad, Ahmed Mohsin},
  title     = {Meta-Learned Safe Exploration for Data-Efficient Deep Reinforcement Learning in Dynamic Wireless Spectrum Allocation},
  journal   = {IEEE Wireless Communications Letters},
  year      = {2025},
}