Three.js From Zero · Article s5-07

S5-07 Clustered Forward Lighting

Deferred shading scales to many lights but breaks MSAA and transparency. Clustered forward = 3D grid over the view frustum, lights binned per cell, and each fragment loops only over its own cell's lights. It scales and still handles transparency.

1. The motivation

Forward + 100 lights = each fragment loops 100 lights. Dead.

Clustered forward (the 3D extension of tiled Forward+): slice the frustum into a grid, e.g. 16×9×24 (3,456 cells). Assign each light to every cell its bounding sphere intersects. The shader looks up its cell and loops over roughly 4-20 lights instead of 100.

2. The two passes

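// CPU (or compute): bin lights
for each light:
  for each cluster cell the light's bounding sphere overlaps:
    append light.id to cell.lightList

// Upload two buffers:
// - lightIndices: concatenated list of all light IDs
// - clusterHeads: per-cell (offset, count) into lightIndices

// GPU shading (GLSL-style sketch; buffer bindings omitted,
// GRID_X / GRID_Y are the assumed tile counts of the grid)
vec3 shade(float depth) {
  uvec3 cluster = computeCluster(gl_FragCoord.xy, depth);
  // Flatten the 3D cell coordinate into a linear cell index
  uint cell = cluster.x + GRID_X * (cluster.y + GRID_Y * cluster.z);
  uvec2 head = clusterHeads[cell]; // x = offset, y = count
  vec3 color = vec3(0.0);
  for (uint i = 0u; i < head.y; i++) {
    uint lightId = lightIndices[head.x + i];
    color += lighting(lights[lightId]);
  }
  return color;
}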
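The two uploaded buffers are essentially a compressed list-of-lists. A minimal CPU-side sketch of building them is shown here, assuming a caller-supplied `cellsForLight` function (hypothetical, not part of Three.js) that returns the linear cell indices a light overlaps. It uses a count pass, a prefix sum, and a scatter pass; other layouts are possible.

```typescript
// Sketch only: all names here are illustrative assumptions.
function buildClusterBuffers(
  cellCount: number,
  lightCount: number,
  cellsForLight: (lightId: number) => number[],
): { clusterHeads: Uint32Array; lightIndices: Uint32Array } {
  // Pass 1: count how many lights land in each cell.
  const counts = new Uint32Array(cellCount);
  for (let id = 0; id < lightCount; id++) {
    for (const cell of cellsForLight(id)) counts[cell]++;
  }

  // Prefix sum: store (offset, count) as two uints per cell.
  const clusterHeads = new Uint32Array(cellCount * 2);
  let total = 0;
  for (let cell = 0; cell < cellCount; cell++) {
    clusterHeads[cell * 2] = total;
    clusterHeads[cell * 2 + 1] = counts[cell];
    total += counts[cell];
  }

  // Pass 2: scatter light IDs into each cell's range.
  const lightIndices = new Uint32Array(total);
  const cursor = new Uint32Array(cellCount);
  for (let id = 0; id < lightCount; id++) {
    for (const cell of cellsForLight(id)) {
      lightIndices[clusterHeads[cell * 2] + cursor[cell]] = id;
      cursor[cell]++;
    }
  }
  return { clusterHeads, lightIndices };
}

// Mirrors the shader loop: read (offset, count), then walk the range.
function lightsInCell(
  cell: number,
  clusterHeads: Uint32Array,
  lightIndices: Uint32Array,
): number[] {
  const offset = clusterHeads[cell * 2];
  const count = clusterHeads[cell * 2 + 1];
  const out: number[] = [];
  for (let i = 0; i < count; i++) out.push(lightIndices[offset + i]);
  return out;
}
```

Empty cells cost only their (offset, 0) entry, and the total size of `lightIndices` is the sum of all per-cell counts rather than cells × max lights.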

3. The cluster math

The grid lives in view space. X/Y are 2D screen tiles. Z is cut into logarithmic slices (thin near the camera, thicker far away).

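// viewZ: positive distance along the view direction
// (negate view-space z in Three.js, where the camera looks down -Z)
float slice(float viewZ, float near, float far, int N) {
  // Continuous slice coordinate; floor and clamp to [0, N-1] for the cell index
  return float(N) * log(viewZ / near) / log(far / near);
}

Logarithmic Z keeps each slice's far/near depth ratio constant, so cells have roughly the same proportions at every distance. Linear slices would be too coarse up close and needlessly fine far away.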
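As a quick check of the slice formula, here is a small sketch of the slice index plus its inverse (the depth where each slice starts). The function names are illustrative, and `depth` is assumed to be a positive view-space distance.

```typescript
// Slice index for a positive view-space depth, clamped to the valid range.
function sliceIndex(depth: number, near: number, far: number, sliceCount: number): number {
  const s = Math.floor((sliceCount * Math.log(depth / near)) / Math.log(far / near));
  return Math.min(sliceCount - 1, Math.max(0, s));
}

// Inverse mapping: depth at which slice k begins.
function sliceStart(k: number, near: number, far: number, sliceCount: number): number {
  return near * Math.pow(far / near, k / sliceCount);
}
```

With near = 1, far = 16 and 4 slices, the boundaries fall at depths 1, 2, 4, 8 and 16: each slice is twice as deep as the one before it, which is the constant ratio (far/near)^(1/N).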

4. Live demo — clustered forward visualization

Many dynamic lights over a scene. Toggle cluster visualization overlay → see the 2D tile grid and light bin counts.

5. Tile size tradeoff

  • Small tiles (8×8 px): precise culling, big buffer.
  • Large tiles (32×32 px): fewer cells, looser cull.
  • Typical: 16×16 or 32×32 px tiles with 16-32 Z slices.
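To put numbers on the tradeoff, this sketch computes the cell count and the size of a (offset, count) head buffer for a given tile size. The function names and the 8-bytes-per-cell layout (two 32-bit uints) are assumptions for illustration.

```typescript
// Grid dimensions for square tiles of tilePx pixels; partial tiles round up.
function clusterGrid(width: number, height: number, tilePx: number, slices: number) {
  const tilesX = Math.ceil(width / tilePx);
  const tilesY = Math.ceil(height / tilePx);
  return { tilesX, tilesY, cells: tilesX * tilesY * slices };
}

// Assumed layout: one (offset, count) pair of 32-bit uints per cell.
function headBufferBytes(cells: number): number {
  return cells * 2 * 4;
}
```

At 1920×1080 with 24 slices, 8 px tiles give 777,600 cells, 16 px tiles give 195,840, and 32 px tiles give 48,960. For comparison, the 16×9×24 grid from section 1 corresponds to 120 px tiles at that resolution.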

6. Cluster vs. tile

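Tiled forward: 2D screen grid only (no depth slices). Simpler, but a tile's list holds every light along its whole depth range, so tiles that span a depth discontinuity shade lights that are nowhere near the surface.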

Clustered forward: adds Z slices. Much tighter cull along view direction.
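A toy comparison makes the difference concrete. This sketch looks at a single screen tile and models each light only by the depth range it covers inside that tile. The `DepthSpan` type and both functions are hypothetical.

```typescript
// Depth extent of one light inside a single screen tile (positive view depth).
interface DepthSpan { zMin: number; zMax: number }

function sliceIndex(depth: number, near: number, far: number, sliceCount: number): number {
  const s = Math.floor((sliceCount * Math.log(depth / near)) / Math.log(far / near));
  return Math.min(sliceCount - 1, Math.max(0, s));
}

// Tiled: every light that touches the tile is shaded, whatever its depth.
function tiledCount(lights: DepthSpan[]): number {
  return lights.length;
}

// Clustered: only lights whose depth span overlaps the fragment's slice.
function clusteredCount(
  lights: DepthSpan[], fragDepth: number, near: number, far: number, sliceCount: number,
): number {
  const k = sliceIndex(fragDepth, near, far, sliceCount);
  return lights.filter(
    (l) =>
      sliceIndex(l.zMin, near, far, sliceCount) <= k &&
      sliceIndex(l.zMax, near, far, sliceCount) >= k,
  ).length;
}
```

With five lights spread along the tile's depth and a fragment at depth 6 (near = 1, far = 16, 4 slices), the tiled path would loop over all five while the clustered path loops over the two that reach the fragment's slice.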

Unity URP/HDRP, Unreal, Detroit: Become Human, Doom Eternal: all use clustered light lists in some form.

7. Transparency

Transparent objects sample the same cluster buffer as opaque ones. No separate lighting pass. This is the killer feature over deferred.

8. Compute-shader build

Modern: the light-list build is itself a compute shader. Each workgroup = one cell, threads fan out over the lights, and each hit does an atomic append.

Build cost: roughly sub-millisecond for 1024 lights × 3500 cells on RTX-tier cards (a ballpark; measure on your target hardware).
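The per-cell build can be prototyped on the CPU before porting it to a compute shader. The sketch below is an assumption-laden stand-in, not a GPU implementation: one loop iteration plays the role of one workgroup, a per-cell counter replaces the atomic append, and each frustum-shaped cell is approximated by a conservative view-space box. All type and function names are hypothetical, and depth is positive (-z in Three.js view space).

```typescript
interface Light { x: number; y: number; depth: number; radius: number }
interface Grid {
  gx: number; gy: number; gz: number;
  near: number; far: number; tanHalfFovY: number; aspect: number;
}
interface Box { min: number[]; max: number[] } // [x, y, depth]

// Depth where slice k begins (inverse of the log slice formula).
function sliceDepth(k: number, g: Grid): number {
  return g.near * Math.pow(g.far / g.near, k / g.gz);
}

// Conservative box around one cell of the frustum grid.
function cellBox(tx: number, ty: number, tz: number, g: Grid): Box {
  const zn = sliceDepth(tz, g);
  const zf = sliceDepth(tz + 1, g);
  const sx = g.tanHalfFovY * g.aspect;
  const sy = g.tanHalfFovY;
  const x0 = (-1 + (2 * tx) / g.gx) * sx;
  const x1 = (-1 + (2 * (tx + 1)) / g.gx) * sx;
  const y0 = (-1 + (2 * ty) / g.gy) * sy;
  const y1 = (-1 + (2 * (ty + 1)) / g.gy) * sy;
  return {
    min: [Math.min(x0 * zn, x0 * zf), Math.min(y0 * zn, y0 * zf), zn],
    max: [Math.max(x1 * zn, x1 * zf), Math.max(y1 * zn, y1 * zf), zf],
  };
}

// Sphere vs. box: compare squared distance to the closest box point with r^2.
function sphereHitsBox(l: Light, b: Box): boolean {
  const c = [l.x, l.y, l.depth];
  let d2 = 0;
  for (let i = 0; i < 3; i++) {
    const nearest = Math.min(Math.max(c[i], b.min[i]), b.max[i]);
    d2 += (c[i] - nearest) * (c[i] - nearest);
  }
  return d2 <= l.radius * l.radius;
}

// Each cell tests every light; lists are capped at maxPerCell entries.
function buildLightLists(lights: Light[], g: Grid, maxPerCell: number) {
  const cellCount = g.gx * g.gy * g.gz;
  const counts = new Uint32Array(cellCount);
  const indices = new Uint32Array(cellCount * maxPerCell);
  for (let tz = 0; tz < g.gz; tz++) {
    for (let ty = 0; ty < g.gy; ty++) {
      for (let tx = 0; tx < g.gx; tx++) {
        const cell = tx + g.gx * (ty + g.gy * tz);
        const box = cellBox(tx, ty, tz, g);
        for (let id = 0; id < lights.length; id++) {
          if (counts[cell] < maxPerCell && sphereHitsBox(lights[id], box)) {
            indices[cell * maxPerCell + counts[cell]] = id; // atomic append on the GPU
            counts[cell]++;
          }
        }
      }
    }
  }
  return { counts, indices };
}
```

The fixed `maxPerCell` slot layout trades memory for simplicity. A real build would more likely append into one shared index buffer through a global atomic counter, which matches the (offset, count) layout from section 2.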

9. Three.js status

WebGPURenderer has the pieces — compute shaders, storage buffers. No stock clustered lighter yet.

Options: start from the WebGPU samples or roll your own via TSL compute (S4-10).

10. Takeaways

  • Clustered = 3D grid over view frustum, lights binned per cell.
  • Log-Z slices → constant depth ratio per slice, so cells keep similar proportions near and far.
  • Per-fragment: loop only your cell's lights (4-20 instead of 100+).
  • Supports transparency, unlike deferred.
  • Common in modern AAA engines.