Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Orthogonal Initialization

Orthogonal weight initialization is critical for PPO convergence. It’s the default in CleanRL, Stable Baselines3, and OpenAI baselines.

API

pub fn orthogonal_linear<B: Backend>(
    d_in: usize,
    d_out: usize,
    gain: f32,
    device: &B::Device,
    rng: &mut impl Rng,
) -> Linear<B>

Creates a Linear layer with orthogonal weights and zero bias. Matches PyTorch’s nn.init.orthogonal_.

Gain values

LayerGainWhy
Hidden (tanh)sqrt(2) ≈ 1.414Preserves gradient norms through tanh
Actor output0.01Near-uniform initial policy (good exploration)
Critic output1.0Reasonable initial value scale

Usage

use rl4burn::init::orthogonal_linear;
let sqrt2 = std::f32::consts::SQRT_2;

let actor_fc1 = orthogonal_linear(4, 64, sqrt2, &device, &mut rng);
let actor_out = orthogonal_linear(64, 2, 0.01, &device, &mut rng);
let critic_out = orthogonal_linear(64, 1, 1.0, &device, &mut rng);

Why not use Burn’s built-in initializers?

Two reasons:

  1. Burn doesn’t have orthogonal initialization. The closest is XavierUniform, which has similar scale but lacks the orthogonality property.
  2. Burn initializes bias with the same initializer as weights. CleanRL always initializes bias to zero. orthogonal_linear handles both correctly.

Implementation

Uses modified Gram-Schmidt orthogonalization on a random Gaussian matrix. Weights are loaded via Param::from_data + load_record to preserve Burn’s autodiff tracking (see Working with Burn’s Autodiff).