R2-Dreamer
R2-Dreamer (ICLR 2026) is a computationally efficient world model for reinforcement learning (RL) that achieves strong performance without a decoder or data augmentation. It replaces the standard reconstruction loss with self-supervised representation objectives.
Key Idea
Standard DreamerV3 trains its encoder through a decoder that reconstructs observations, which ties representation learning to a reconstruction loss. R2-Dreamer drops the decoder and instead learns representations directly with a redundancy-reduction objective (the Barlow Twins loss).
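To make the redundancy-reduction objective concrete, here is a minimal sketch of a Barlow Twins loss over two batches of embeddings, written with plain `f64` vectors rather than Burn tensors. It is not rl4burn's implementation: the names `standardize` and `barlow_twins_loss`, the `lambda` off-diagonal weight, and the `1e-8` variance epsilon are assumptions made for illustration.

```rust
/// Standardize every feature (column) of a batch to zero mean and unit variance.
fn standardize(batch: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = batch.len() as f64;
    let d = batch[0].len();
    let mut out = batch.to_vec();
    for j in 0..d {
        let mean = batch.iter().map(|row| row[j]).sum::<f64>() / n;
        let var = batch.iter().map(|row| (row[j] - mean).powi(2)).sum::<f64>() / n;
        // Small epsilon (an assumption) avoids division by zero for constant features.
        let sd = (var + 1e-8).sqrt();
        for row in out.iter_mut() {
            row[j] = (row[j] - mean) / sd;
        }
    }
    out
}

/// Barlow Twins loss on two batches of shape [batch, features]:
///   sum_i (1 - C_ii)^2 + lambda * sum_{i != j} C_ij^2
/// where C is the cross-correlation matrix of the standardized batches.
fn barlow_twins_loss(a: &[Vec<f64>], b: &[Vec<f64>], lambda: f64) -> f64 {
    assert!(!a.is_empty());
    assert_eq!(a.len(), b.len());
    assert_eq!(a[0].len(), b[0].len());
    let (za, zb) = (standardize(a), standardize(b));
    let n = a.len() as f64;
    let d = a[0].len();
    let mut loss = 0.0;
    for i in 0..d {
        for j in 0..d {
            // Cross-correlation between feature i of view a and feature j of view b.
            let c = za.iter().zip(&zb).map(|(ra, rb)| ra[i] * rb[j]).sum::<f64>() / n;
            loss += if i == j {
                (1.0 - c).powi(2) // invariance term: diagonal pushed toward 1
            } else {
                lambda * c * c // decorrelation term: off-diagonal pushed toward 0
            };
        }
    }
    loss
}
```

The diagonal term rewards agreement between the two views of the same feature, while the off-diagonal term penalizes features that carry redundant information. Two identical batches with uncorrelated features give a loss near zero; duplicating a feature raises the loss through the off-diagonal term.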
Representation Variants
rl4burn supports all four variants from the paper:
| Variant | Loss | Description |
|---|---|---|
| Dreamer | Decoder MSE | Standard DreamerV3 reconstruction baseline |
| R2Dreamer | Barlow Twins | Invariance + decorrelation on cross-correlation matrix |
| InfoNCE | Contrastive | Positive pair matching with temperature-scaled cosine similarity |
| DreamerPro | Prototype | Sinkhorn-Knopp assignment to learned prototypes |
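As an illustration of the contrastive row in the table, the following is a small sketch of an InfoNCE loss with temperature-scaled cosine similarity, again on plain `f64` vectors. The function names `cosine` and `info_nce_loss`, and the convention that row `i` of `queries` is the positive for row `i` of `keys`, are assumptions for this example and not taken from rl4burn.

```rust
/// Cosine similarity between two vectors (epsilon guards against zero norms).
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (na * nb + 1e-12)
}

/// InfoNCE: for each query i, cross-entropy of a softmax over
/// cosine(query_i, key_j) / temperature, with key_i as the positive.
fn info_nce_loss(queries: &[Vec<f64>], keys: &[Vec<f64>], temperature: f64) -> f64 {
    assert!(!queries.is_empty());
    assert_eq!(queries.len(), keys.len());
    let n = queries.len();
    let mut total = 0.0;
    for i in 0..n {
        let logits: Vec<f64> = keys
            .iter()
            .map(|k| cosine(&queries[i], k) / temperature)
            .collect();
        // Numerically stable log-sum-exp over all keys.
        let max = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
        let log_sum_exp = max + logits.iter().map(|l| (l - max).exp()).sum::<f64>().ln();
        // Negative log-probability of the positive pair.
        total += log_sum_exp - logits[i];
    }
    total / n as f64
}
```

When every query matches only its own key the loss approaches zero, and when all embeddings are identical it equals `ln(n)` for a batch of size `n`, which is the value a chance-level model would produce.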
Usage
use rl4burn::algo::dreamer::{DreamerConfig, dreamer_world_model_loss, dreamer_actor_critic_loss};
use rl4burn::algo::loss::representation::RepresentationVariant;

// Configure with R2-Dreamer (Barlow Twins)
let config = DreamerConfig {
    rep_variant: RepresentationVariant::R2Dreamer,
    action_dim: 4,
    discrete_actions: true,
    ..DreamerConfig::default()
};

// `B` (the backend type) and `device` are assumed to be defined by the caller
let agent = config.init::<B>(&device);

// Train world model on observed sequences
// (`observations`, `actions`, `rewards`, `continues` are assumed to be prepared by the caller)
let (wm_loss, wm_stats) = dreamer_world_model_loss(
    &agent, observations, actions, rewards, continues,
);

// Train actor-critic via imagination, starting from `initial_states`
let (actor_loss, critic_loss, ac_stats) = dreamer_actor_critic_loss(
    &agent, initial_states,
);
Architecture
The agent composes existing rl4burn building blocks:
- RSSM (rl4burn_nn::rssm): recurrent state-space model with deterministic GRU + stochastic categorical states
- Imagination rollouts (rl4burn_algo::planning::imagination): generate trajectories in latent space
- KL-balanced loss (rl4burn_algo::loss::kl_balance): train posterior and prior with free bits
- Symlog + Twohot (rl4burn_nn::symlog): distributional value prediction
- Representation losses (rl4burn_algo::loss::representation): Barlow Twins, InfoNCE, DreamerPro, decoder
- MLP with RMSNorm (rl4burn_nn::mlp): prediction heads and actor/critic networks
- CNN encoder/decoder (rl4burn_nn::conv): image observation processing
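The symlog and two-hot building block is easy to show in isolation. The sketch below follows the usual DreamerV3 formulation (squash a scalar with symlog, spread it over the two nearest bins, decode with the expectation followed by symexp). It uses plain `f64` values, and the function names and the integer-spaced bins in the usage note are assumptions for illustration, not the API of rl4burn_nn::symlog.

```rust
/// symlog(x) = sign(x) * ln(|x| + 1): compresses large magnitudes symmetrically.
fn symlog(x: f64) -> f64 {
    x.signum() * x.abs().ln_1p()
}

/// symexp(x) = sign(x) * (exp(|x|) - 1): inverse of symlog.
fn symexp(x: f64) -> f64 {
    x.signum() * x.abs().exp_m1()
}

/// Two-hot encode `x` onto strictly increasing `bins`: weight is split between
/// the two neighbouring bins in proportion to proximity. Values outside the bin
/// range are clamped to the nearest edge.
fn twohot(x: f64, bins: &[f64]) -> Vec<f64> {
    assert!(bins.len() >= 2);
    let mut out = vec![0.0; bins.len()];
    let x = x.clamp(bins[0], bins[bins.len() - 1]);
    // First bin strictly above x is the upper neighbour (last bin if none is).
    let hi = bins
        .iter()
        .position(|&b| b > x)
        .unwrap_or(bins.len() - 1)
        .max(1);
    let lo = hi - 1;
    let w_hi = (x - bins[lo]) / (bins[hi] - bins[lo]);
    out[lo] = 1.0 - w_hi;
    out[hi] = w_hi;
    out
}

/// Decode a distribution over bins: expected bin value, mapped back with symexp.
fn twohot_decode(probs: &[f64], bins: &[f64]) -> f64 {
    symexp(probs.iter().zip(bins).map(|(p, b)| p * b).sum::<f64>())
}
```

For example, with bins at the integers from -5 to 5, `twohot(symlog(20.0), &bins)` puts all of its weight on the bins at 3 and 4, and `twohot_decode` recovers 20 up to floating-point error. Predicting in symlog space lets a fixed set of bins cover rewards and values that span several orders of magnitude.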
New Modules
| Module | Crate | Description |
|---|---|---|
| mlp | rl4burn-nn | Configurable MLP with RMSNorm or LayerNorm |
| conv | rl4burn-nn | CNN encoder (images → features) and decoder (features → images) |
| multi_encoder | rl4burn-nn | Routes mixed observations (images + vectors) |
| representation | rl4burn-algo | Four self-supervised representation losses |
| dreamer | rl4burn-algo | DreamerAgent, world model loss, actor-critic loss |
Example
See examples/dreamer/ for a complete training loop on CartPole.
Reference
Nauman & Straffelini, “R2-Dreamer: Redundancy Reduction for Computationally Efficient World Models” (ICLR 2026).