Privileged Critic

The policy sees only what a player would see. The critic sees everything — including enemy positions behind fog of war.

The Trait

use rl4burn::algo::privileged_critic::{PrivilegedActorCritic, make_critic_input};

impl<B: Backend> PrivilegedActorCritic<B> for MyModel<B> {
    fn actor_forward(&self, obs: Tensor<B, 2>) -> Tensor<B, 2> {
        self.actor.forward(obs)  // partial observation only
    }

    fn critic_forward(&self, obs: Tensor<B, 2>, privileged: Tensor<B, 2>) -> Tensor<B, 1> {
        let input = make_critic_input(obs, privileged);
        self.critic.forward(input)
    }
}

Value estimation under partial observability is noisy — the critic can’t tell if you’re winning or losing without seeing the full game state. Giving the critic privileged information dramatically reduces variance. At deployment time, only the actor is needed.

Keyboard shortcuts

rl4burn

Privileged Critic

The Trait

Why it works