Three.js From Zero · Season 11 · Article 11

Nerfstudio → Three.js — the Production Bridge

Most splat tutorials end at "render the demo asset." This article covers the pipeline before that: capture footage, train a Splatfacto model in Nerfstudio, export, optimize, and ship in Three.js via Spark. The capture-to-shipped recipe.

1. The full pipeline at a glance

1. capture: iPhone, mirrorless, drone
2. process: COLMAP + Nerfstudio prep
3. train: Splatfacto, ~30 min
4. export: PLY → SPZ via SuperSplat
5. ship: Spark + Three.js scene

Five steps. Three of them are offline; two are in the browser. A tutorial that covers only step 5 has skipped the work that actually determines whether your scene looks shippable. Capture quality is everything.

2. Step 1 — capture

Every splat starts as a photo set. Your goal: 80–300 images covering every angle of the subject, with consistent lighting, no motion blur, and overlap of ~70% between adjacent frames.
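
For video capture, the image budget is simple sampling math; the sketch below shows the arithmetic that ns-process-data's --num-frames-target flag applies for you in step 2:

// Hypothetical 60-second walkaround at 30 fps, targeting 200 training images
const seconds = 60, fps = 30, target = 200;
const everyNth = Math.floor((seconds * fps) / target);  // 1800 / 200 = 9
console.log(`keep every ${everyNth}th source frame`);   // ~200 images, dense overlap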

device                    | practical capture                            | notes
iPhone 14+ / Pixel 8+     | Polycam app or Luma capture, or raw 4K video | Easiest. Polycam handles step 2 in the cloud.
Mirrorless (e.g. Sony A7) | 200-400 stills, 24-35mm equivalent           | Best quality. Manual focus = no jitter.
Drone (DJI Mini 4)        | Orbit at 2-3 altitudes, video at 4K30        | Aerial scenes; watch for wind-blown foliage (it kills training).

What kills a capture:

  • Reflections, transparency, water. The splat training assumes view-consistent geometry. A glass building gets reconstructed as a fuzzy ball.
  • Moving subjects. A person walking through frame becomes a smear.
  • Aggressive auto-exposure. Lock exposure if your camera lets you. Per-frame brightness changes look like ghost geometry to the optimizer.
  • Featureless walls. COLMAP needs SIFT features for camera pose estimation. A blank white wall has no features. Stick post-it notes to flat walls if you must.

3. Step 2 — process

Nerfstudio expects a folder layout: a directory of input images plus a transforms.json describing each camera's pose. The standard tool to create that JSON is COLMAP (Structure-from-Motion).

# From a video, sample frames first
ns-process-data video --data ./my_capture.mp4 \
  --output-dir ./data/my_scene --num-frames-target 200

# Or from a folder of images
ns-process-data images --data ./photos/ \
  --output-dir ./data/my_scene

ns-process-data wraps COLMAP. Output: a folder with images/, transforms.json, and a sparse point cloud Nerfstudio uses as a Gaussian seed.
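
transforms.json is plain JSON, so it's cheap to sanity-check before burning GPU time on a bad reconstruction. A minimal Node sketch; the field names follow Nerfstudio's data format, and the path is this article's example scene:

import { readFile } from 'node:fs/promises';

const t = JSON.parse(await readFile('./data/my_scene/transforms.json', 'utf8'));
console.log(t.camera_model);                // e.g. "OPENCV"
console.log(t.frames.length);               // one entry per registered image
console.log(t.frames[0].file_path);         // e.g. "images/frame_00001.png"
console.log(t.frames[0].transform_matrix);  // 4x4 camera-to-world pose

If frames.length is far below your image count, COLMAP failed to register many views and the capture needs more overlap.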

4. Step 3 — train Splatfacto

Splatfacto is Nerfstudio's Gaussian-splat method. It's the splat-flavored sibling of nerfacto; same input format, very different model. Train command:

ns-train splatfacto \
  --data ./data/my_scene \
  --output-dir ./outputs/my_scene \
  --max-num-iterations 30000

# To resume:
ns-train splatfacto \
  --data ./data/my_scene \
  --load-dir ./outputs/my_scene/latest/

Hardware reality: an RTX 3090, 4090, or A100 is the sane training tier. M-series Macs can train via MPS (PyTorch's Metal backend), but expect 5-10× slower runs. Cloud options: running ns-train on Lambda, RunPod, or Paperspace costs about $0.40-$1.20 per scene at H100 hourly rates.

Live preview: ns-viewer serves a browser-based 3D inspector while training. Look for floaters, oversmoothing, and underexposed regions early — don't wait for 30k iterations to discover the capture was bad.

Practical numbers: 10-min iPhone capture → ~30 min training on RTX 4090 → ~800k-1.5M splats in the output PLY. A 30-second drone orbit produces ~2M splats. Tune --pipeline.model.cull-alpha-thresh upward to thin the model if it's bloated.

5. Step 4 — export and convert

Splatfacto saves checkpoints (.ckpt). To get a shippable file, export to PLY first:

ns-export gaussian-splat \
  --load-config ./outputs/my_scene/.../config.yml \
  --output-dir ./exports/my_scene/

# Result: exports/my_scene/splat.ply  (often 200-500 MB)

That PLY is lossless but not web-friendly. Convert to .spz (Niantic) for shipping. Two paths:

  • SuperSplat (browser) — drag the PLY in, clean floaters with the lasso, export SPZ. Free, MIT, works offline once cached.
  • spz CLI (Niantic's converter) — spz encode splat.ply -o splat.spz. Headless for CI.
format | 1M-splat size | SH degrees       | where to use
PLY    | ~250 MB       | 0-3              | archive, training output
SPLAT  | ~12 MB        | 0 only           | view-independent legacy
SPZ    | ~4-8 MB       | 0-2              | 2026 shipping default
KSPLAT | ~10 MB        | 0-2, progressive | mkkellogg legacy

Always keep the PLY. It's your source of truth; you'll re-export to whatever format dominates in 2027.

6. Step 5 — ship in Three.js via Spark

Code we covered in S11-10, repeated here for end-to-end completeness:

import * as THREE from 'three';
import { SplatMesh, SparkRenderer } from '@sparkjsdev/spark';

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(
  45, window.innerWidth / window.innerHeight, 0.1, 100
);
camera.position.set(0, 1.5, 4);

// Trained Splatfacto output, converted to SPZ
const splats = new SplatMesh({ url: '/exports/my_scene.spz' });
scene.add(splats);

const spark = new SparkRenderer({ renderer });
function tick() {
  spark.update({ scene, camera });  // update splats before the normal render
  renderer.render(scene, camera);
  requestAnimationFrame(tick);
}
tick();

7. Live demo — pipeline visualizer (procedural)

Real splats need the offline training pipeline above. This demo shows what each stage conceptually looks like — captured images, a sparse point cloud (the Splatfacto initialization), trained splat blobs, and the same scene with a Three.js mesh composited (final delivery). Step through the stages.


8. Optimization for delivery

Production splat scenes need three optimizations beyond the raw export:

a. Cull floaters and out-of-frame splats

Splatfacto outputs are noisy at the edges. SuperSplat's lasso + delete in browser cuts 10-30% of splats with no perceptual loss.

b. Compress to SPZ

Quantize positions to 16-bit fixed-point, opacity to 8-bit, SH coefficients to 8-bit. PLY → SPZ is roughly 30-50× smaller. Decode is GPU-friendly.
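
A quick sanity check on those numbers; the 62-float layout is the standard 3DGS training PLY, and the SPZ figure is order-of-magnitude only:

// Training PLY stores 62 float32 fields per splat:
// 3 position + 3 normal + (3 + 45) SH color + 1 opacity + 3 scale + 4 rotation
const plyBytesPerSplat = 62 * 4;                          // 248 bytes
console.log((1_000_000 * plyBytesPerSplat) / 1e6, 'MB');  // ~248 MB per 1M splats
// SPZ's quantized, gzipped stream lands around 4-8 bytes per splat,
// which is the 30-50x reduction quoted above.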

c. Author multiple LODs

For VR or mobile delivery, generate two outputs from the same training run: a 1.5M-splat hero and a 500k-splat mobile version. Switch by detecting a WebXR session or a low-end device, as in the sketch below. Spark supports per-instance LOD.

// Serve the lighter model to XR headsets and small screens, the hero otherwise
const isXR = await navigator.xr?.isSessionSupported('immersive-vr');
const isSmall = matchMedia('(max-width: 800px)').matches;
const url = (isXR || isSmall) ? '/scene.mobile.spz' : '/scene.hero.spz';
scene.add(new SplatMesh({ url }));  // construct only the chosen mesh: one download

9. Production gotchas

Alignment

COLMAP's coordinate system is arbitrary — your scene comes back rotated, mirrored, off-axis from where you expect. Bake the corrective transform into the SplatMesh:

splats.rotation.set(-Math.PI / 2, 0, 0);   // Z-up → Y-up
splats.position.set(0, -1.4, 0);            // floor to origin
splats.scale.setScalar(1.2);                // capture scale → world scale

Do this once at load. Don't re-bake the file every iteration; you'll be tweaking these values for an hour.
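
To find those three values without an edit-reload loop, wire them to a throwaway GUI, read off the numbers, then hard-code them. A minimal sketch with lil-gui (the addon bundled with the Three.js examples), reusing splats from section 6:

import { GUI } from 'three/addons/libs/lil-gui.module.min.js';

const gui = new GUI();
gui.add(splats.rotation, 'x', -Math.PI, Math.PI, 0.01).name('rotation x');
gui.add(splats.position, 'y', -5, 5, 0.01).name('floor offset');
gui.add({ scale: 1 }, 'scale', 0.1, 5, 0.01)
   .onChange((v) => splats.scale.setScalar(v));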

Scale

COLMAP recovers scale up to a similarity transform — meters are not meters. To set real-world scale: include a known-size object in capture (a checkerboard, a printed scale card), measure it in the Nerfstudio viewer, derive a multiplier. Or eyeball it. Most marketing scenes don't need true metric.
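
Concretely, with a known-size reference (the numbers here are hypothetical):

// A printed 0.50 m scale card measures 0.62 units in the Nerfstudio viewer
const realMeters    = 0.50;
const measuredUnits = 0.62;
splats.scale.setScalar(realMeters / measuredUnits);  // ~0.81, so 1 unit ≈ 1 m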

Lighting transfer

Splats bake lighting into vertex colors at capture time. If you composite a Three.js mesh into the splat scene, that mesh's lighting must match the splat's baked lighting or it'll look pasted-in. Two approaches:

  • Capture an HDRI alongside the splat scene — Insta360 camera, or stitch a phone panorama. Use it as scene.environment for the mesh (sketched just after this list).
  • Render the splat-only scene, sample its color from the regions where the mesh sits, fit a low-frequency env to those samples. Hacky but works for a dropped product on a captured surface.
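
A minimal sketch of the first approach, assuming an equirectangular .hdr captured on site (the path is hypothetical):

import * as THREE from 'three';
import { RGBELoader } from 'three/addons/loaders/RGBELoader.js';

// Image-based lighting for composited meshes only; SplatMesh colors
// already have the real lighting baked in, so they're unaffected.
new RGBELoader().load('/captures/my_scene.hdr', (tex) => {
  tex.mapping = THREE.EquirectangularReflectionMapping;
  scene.environment = tex;
});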

Shadows

Splats can't receive Three.js mesh shadows (no shadow map sampling). Workaround: a soft fake shadow billboard under the mesh — semi-transparent radial gradient — sells the contact even if it's not physically accurate.
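
One way to build that billboard: paint a radial gradient into a CanvasTexture on a floor-aligned plane. Size and opacity are taste, not physics:

import * as THREE from 'three';

function makeFakeShadow(size = 1.5) {
  const canvas = document.createElement('canvas');
  canvas.width = canvas.height = 256;
  const ctx = canvas.getContext('2d');
  const grad = ctx.createRadialGradient(128, 128, 0, 128, 128, 128);
  grad.addColorStop(0, 'rgba(0, 0, 0, 0.45)');  // dark center under the mesh
  grad.addColorStop(1, 'rgba(0, 0, 0, 0)');     // fades out at the edge
  ctx.fillStyle = grad;
  ctx.fillRect(0, 0, 256, 256);

  const mesh = new THREE.Mesh(
    new THREE.PlaneGeometry(size, size),
    new THREE.MeshBasicMaterial({
      map: new THREE.CanvasTexture(canvas),
      transparent: true,
      depthWrite: false,  // don't occlude the splats it sits on
    })
  );
  mesh.rotation.x = -Math.PI / 2;  // lie flat on the floor
  return mesh;
}

const shadow = makeFakeShadow(2);
shadow.position.set(0, 0.01, 0);  // just above the captured floor plane
scene.add(shadow);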

10. Cost & timing reality

step                        | time                             | cost (USD, cloud)
Capture                     | 10-30 min                        | 0 (own device)
COLMAP / process            | 10-30 min                        | ~$0.20 on RunPod CPU
Splatfacto train            | 20-60 min                        | $0.40-$1.50 on H100
Export + SuperSplat cleanup | 10-20 min                        | 0
SPZ conversion              | < 1 min                          | 0
Three.js integration        | 1-2 hrs first time, 10 min after | 0

End-to-end: a hobbyist scene is 90 minutes and $1-2 of compute. A polished agency capture is half a day and ~$50 of cloud GPU. That's the bar Polycam, Luma, and Niantic have collectively dragged from "research project" to "billable deliverable."

11. Takeaways

  • Capture is the input. 80-300 images, ~70% overlap, locked exposure, no moving subjects.
  • Nerfstudio's ns-process-data wraps COLMAP into a one-line preprocessor.
  • Splatfacto trains in 20-60 min on a 4090; cloud option is ~$1 per scene.
  • Export PLY for archive, convert to SPZ for shipping.
  • Spark loads SPZ and composites with Three.js meshes — splats and meshes share the scene graph.
  • Three production gotchas: alignment, scale, lighting transfer. Author the corrective transform once at load.
  • End-to-end cost: ~$1-50 per scene, half a day of work.