The management of future AI-native Next-Generation (NextG) Radio Access Networks (RANs), including 6G and beyond, presents a challenge of immense complexity that exceeds the capabilities of traditional automation.
In response, we introduce the LLM-RAN Operator: a paradigm in which a Large Language Model (LLM) is embedded into the RAN control loop to translate high-level human intents into optimal network actions. Unlike prior empirical studies, we present a formal framework for the LLM-RAN Operator that builds on earlier work by making guarantees checkable through an adapter aligned with the Open RAN (O-RAN) standard. The framework separates strategic, LLM-driven guidance in the Non-Real-Time (Non-RT) RAN Intelligent Controller (RIC) from reactive execution in the Near-RT RIC, and includes a proposition on policy expressiveness and a theorem on convergence to stable fixed points.
By framing the problem with mathematical rigor, our work provides the analytical tools to reason about the feasibility and stability of AI-native RAN control. It identifies critical research challenges in safety, real-time performance, and physical-world grounding.
This paper aims to bridge the gap between AI theory and wireless systems engineering in the NextG era, aligning with the AI4NextG vision to develop knowledgeable, intent-driven wireless networks that integrate generative AI into the heart of the RAN.
The state \(s_{t} \in \mathcal{S}\) at time \(t\) must capture a snapshot of the entire RAN environment.
\(\mathcal{S} = \mathcal{H} \times \mathcal{Q} \times \mathcal{C} \times \mathcal{I}\)
where \(\mathcal{H}\) is the channel state space, \(\mathcal{Q}\) the queueing state space, \(\mathcal{C}\) the configuration space, and \(\mathcal{I}\) the interference state space.
The action \(a_{t} \in \mathcal{A}\) is a structured, combinatorial command that modifies the network's configuration.
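As a minimal sketch of these definitions (all class and field names below are illustrative assumptions, not taken from the paper), the composite state \(s_t \in \mathcal{S} = \mathcal{H} \times \mathcal{Q} \times \mathcal{C} \times \mathcal{I}\) and a structured action \(a_t \in \mathcal{A}\) might be modeled as:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RANState:
    """Snapshot s_t in S = H x Q x C x I (illustrative fields only)."""
    channel_gain_db: float    # element of H: channel state
    queue_backlog_bits: int   # element of Q: queueing state
    tx_power_dbm: float       # element of C: configuration
    interference_dbm: float   # element of I: interference state

@dataclass(frozen=True)
class RANAction:
    """Structured command a_t in A that modifies the configuration."""
    tx_power_delta_db: float  # e.g., adjust transmit power up or down

# The identity ("do-nothing") action used in the lemma below.
A_NULL = RANAction(tx_power_delta_db=0.0)
```

Freezing the dataclasses keeps states and actions hashable and immutable, which matches treating them as mathematical points in \(\mathcal{S}\) and \(\mathcal{A}\).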
Lemma: Let \(U(s)\) be a utility function representing a network performance metric, and assume the action space \(\mathcal{A}\) contains an identity ("do-nothing") action \(a_{\text{null}}\) satisfying \(f_{\text{env}}(s, a_{\text{null}}) = s\). If the LLM operator solves the single-step optimization problem \(a_t = \arg \max_{a \in \mathcal{A}} U(f_{\text{env}}(s_t, a))\), then the sequence of states generated by the system is monotonically non-decreasing in utility, i.e., \(U(s_{t+1}) \geq U(s_t)\) for all \(t\). This property is observed in LLM-guided cases (e.g., decreasing transmit power to raise energy efficiency).
Proof: We seek to prove that for any time step \(t\), the utility of the next state, \(s_{t+1}\), is greater than or equal to the utility of the current state, \(s_t\). By the definition of the system's dynamics, the state at time \(t + 1\) is given by the application of the environment function to the current state \(s_t\) and the chosen action \(a_t\):
\(s_{t+1} = f_{\text{env}}(s_t, a_t)\)
The action space \(\mathcal{A}\) must contain, either explicitly or implicitly, a "do-nothing" or identity action, which we will denote as \(a_{\text{null}}\). This action is defined such that it does not change the state of the network. Therefore, applying the environment dynamics with this action yields the same state:
\(f_{\text{env}}(s_t, a_{\text{null}}) = s_t\)
According to the central assumption of the lemma, the action \(a_t\) is chosen to be the optimal action that maximizes the utility \(U\) of the resulting state. This means that \(a_t\) must yield a utility that is greater than or equal to the utility produced by any other possible action \(a' \in \mathcal{A}\).
\(U\left(f_{\text{env}}(s_t, a_t)\right) \geq U\left(f_{\text{env}}(s_t, a')\right) \quad \forall a' \in \mathcal{A}\)
Since \(a_{\text{null}}\) is a member of the set of all possible actions \(\mathcal{A}\), the above inequality must also hold for \(a' = a_{\text{null}}\):
\(U(f_{\text{env}}(s_t, a_t)) \geq U(f_{\text{env}}(s_t, a_{\text{null}}))\)
Substituting \(s_{t+1} = f_{\text{env}}(s_t, a_t)\) and \(f_{\text{env}}(s_t, a_{\text{null}}) = s_t\) into this inequality, we arrive at:
\(U(s_{t+1}) \geq U(s_t)\)
Since this holds for any arbitrary time step \(t\), the sequence of utilities (\(U(s_t)\)) is monotonically non-decreasing. QED
A more detailed formalism can be found in the paper.
This work builds on a substantial body of excellent prior research. ORANSight-2.0: Foundational LLMs for O-RAN is a good starting point; further references can be found in our paper.
@article{llm-ran,
  author  = {Giwa, Oluwaseyi and Adewole, Michael and Awodumila, Tobi and Aderinto, Pelumi},
  title   = {The LLM as a Network Operator: A Vision for Generative AI in the 6G Radio Access Network},
  journal = {NeurIPS 2025 Workshop on AI and ML for Next-Generation Wireless Communications and Networking},
  year    = {2025},
}