Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Privileged Critic

The policy sees only what a player would see. The critic sees everything — including enemy positions behind fog of war.

The Trait

use rl4burn::algo::privileged_critic::{PrivilegedActorCritic, make_critic_input};

impl<B: Backend> PrivilegedActorCritic<B> for MyModel<B> {
    fn actor_forward(&self, obs: Tensor<B, 2>) -> Tensor<B, 2> {
        self.actor.forward(obs)  // partial observation only
    }

    fn critic_forward(&self, obs: Tensor<B, 2>, privileged: Tensor<B, 2>) -> Tensor<B, 1> {
        let input = make_critic_input(obs, privileged);
        self.critic.forward(input)
    }
}

Why it works

Value estimation under partial observability is noisy — the critic can’t tell if you’re winning or losing without seeing the full game state. Giving the critic privileged information dramatically reduces variance. At deployment time, only the actor is needed.