Nerfstudio → Three.js — the Production Bridge
Most splat tutorials end at "render the demo asset." This article covers the pipeline before that: capture footage, process it with COLMAP, train a Splatfacto model in Nerfstudio, export, optimize, and ship in Three.js via Spark. The capture-to-shipped recipe.
1. The full pipeline at a glance
Five steps. Three of them are offline (capture, process, train); two touch the browser (SuperSplat cleanup and conversion, then rendering via Spark). A tutorial that only covers step 5 has skipped the work that actually determines whether your scene looks shippable. Capture quality is everything.
2. Step 1 — capture
Every splat starts as a photo set. Your goal: 80–300 images covering every angle of the subject, with consistent lighting, no motion blur, and overlap of ~70% between adjacent frames.
| device | practical capture | notes |
|---|---|---|
| iPhone 14+ / Pixel 8+ | Polycam app or Luma capture, or raw 4K video | Easiest. Polycam handles step 2 in the cloud. |
| Mirrorless (e.g. Sony A7) | 200-400 stills, 24-35mm equivalent | Best quality. Manual focus = no jitter. |
| Drone (DJI Mini 4) | Orbit at 2-3 altitudes, video at 4K30 | Aerial scenes; watch wind moving foliage (kills training). |
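Quick frame-budget arithmetic for the video route (illustrative numbers, assuming one slow 60-second orbit at 30 fps):

```js
// Frame budget for a video capture. ns-process-data does this sampling
// for you via --num-frames-target; the math just shows what it implies.
const fps = 30;
const orbitSeconds = 60;               // one slow walk around the subject
const candidates = fps * orbitSeconds; // 1800 raw frames
const target = 200;                    // inside the 80-300 sweet spot
const stride = Math.round(candidates / target);
console.log(`keep every ${stride}th frame`); // every 9th frame
```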
What kills a capture:
- Reflections, transparency, water. The splat training assumes view-consistent geometry. A glass building gets reconstructed as a fuzzy ball.
- Moving subjects. A person walking through frame becomes a smear.
- Aggressive auto-exposure. Lock exposure if your camera lets you. Per-frame brightness changes look like ghost geometry to the optimizer.
- Featureless walls. COLMAP needs SIFT features for camera pose estimation. A blank white wall has no features. Stick post-it notes to flat walls if you must.
3. Step 2 — process
Nerfstudio expects a folder layout: a directory of input images plus a transforms.json describing each camera's pose. The standard tool to create that JSON is COLMAP (Structure-from-Motion).
```bash
# From a video, sample frames first
ns-process-data video --data ./my_capture.mp4 \
  --output-dir ./data/my_scene --num-frames-target 200

# Or from a folder of images
ns-process-data images --data ./photos/ \
  --output-dir ./data/my_scene
```
ns-process-data wraps COLMAP. Output: a folder with images/, transforms.json, and a sparse point cloud Nerfstudio uses as a Gaussian seed.
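For orientation, here is the shape of that transforms.json, written as a JS object (field names follow Nerfstudio's convention; all values are invented):

```js
// Abridged transforms.json. Structure per Nerfstudio; numbers made up.
const transforms = {
  camera_model: 'OPENCV',
  fl_x: 1431.0, fl_y: 1431.0, // focal length in pixels
  cx: 960.0, cy: 540.0,       // principal point
  w: 1920, h: 1080,
  frames: [
    {
      file_path: 'images/frame_00001.png',
      // 4x4 camera-to-world pose estimated by COLMAP
      transform_matrix: [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    },
    // ...one entry per registered image
  ],
};
```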
4. Step 3 — train Splatfacto
Splatfacto is Nerfstudio's Gaussian-splat method. It's the splat-flavored sibling of nerfacto; same input format, very different model. Train command:
```bash
ns-train splatfacto \
  --data ./data/my_scene \
  --output-dir ./outputs/my_scene \
  --max-num-iterations 30000

# To resume:
ns-train splatfacto \
  --data ./data/my_scene \
  --load-dir ./outputs/my_scene/latest/
```
Hardware reality: an RTX 3090, 4090, or A100 is the sane training tier. M-series Macs can train via MPS (PyTorch's Metal backend), but expect 5-10× longer runs. Cloud options: running ns-train on Lambda, RunPod, or Paperspace costs about $0.40-$1.20 per scene at H100 hourly rates.
Live preview: Nerfstudio serves a browser-based 3D inspector while ns-train runs (and ns-viewer reopens it on a saved checkpoint). Look for floaters, oversmoothing, and underexposed regions early — don't wait for 30k iterations to discover the capture was bad.
Nudge --pipeline.model.cull-alpha-thresh upward to thin the model if it's bloated.
5. Step 4 — export and convert
Splatfacto saves checkpoints (.ckpt). To get a shippable file, export to PLY first:
```bash
ns-export gaussian-splat \
  --load-config ./outputs/my_scene/.../config.yml \
  --output-dir ./exports/my_scene/
# Result: exports/my_scene/splat.ply (often 200-500 MB)
```
That PLY is lossless but not web-friendly. Convert to .spz (Niantic's format) for shipping. Two paths:
- SuperSplat (browser) — drag the PLY in, clean floaters with the lasso, export SPZ. Free, MIT, works offline once cached.
- spz CLI (Niantic's converter) — spz encode splat.ply -o splat.spz. Headless, so it slots into CI.
| format | 1M-splat size | SH | where to use |
|---|---|---|---|
| PLY | ~250 MB | 0-3 | archive, training output |
| SPLAT | ~12 MB | 0 only | view-independent legacy |
| SPZ | ~4-8 MB | 0-2 | 2026 shipping default |
| KSPLAT | ~10 MB | 0-2 progressive | mkkellogg legacy |
Always keep the PLY. It's your source-of-truth; you'll re-export to whatever format dominates 2027.
6. Step 5 — ship in Three.js via Spark
Code we covered in S11-10, repeated here for end-to-end completeness:
```js
import * as THREE from 'three';
import { SplatMesh, SparkRenderer } from '@sparkjsdev/spark';

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(
  45, window.innerWidth / window.innerHeight, 0.1, 100
);
camera.position.set(0, 1.5, 4);

// Trained Splatfacto output, converted to SPZ
const splats = new SplatMesh({ url: '/exports/my_scene.spz' });
scene.add(splats);

const spark = new SparkRenderer({ renderer });

function tick() {
  spark.update({ scene, camera });
  renderer.render(scene, camera);
  requestAnimationFrame(tick);
}
tick();
```
7. Live demo — pipeline visualizer (procedural)
Real splats need the offline training pipeline above. This demo shows what each stage conceptually looks like: captured images, sparse point cloud (the Splatfacto initialization), trained splat blobs, and the same scene with a Three.js mesh composited in (final delivery). Step through the stages.
8. Optimization for delivery
Production splat scenes need three optimizations beyond the raw export:
a. Cull floaters and out-of-frame splats
Splatfacto outputs are noisy at the edges. SuperSplat's lasso + delete in browser cuts 10-30% of splats with no perceptual loss.
b. Compress to SPZ
Quantize positions to fixed point, and opacity and SH coefficients to 8 bits each. PLY → SPZ is roughly 30-50× smaller. Decode is GPU-friendly.
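For intuition on why 8 bits suffices for opacity, the worst-case round-trip error is 1/510. A sketch of the idea, not the actual SPZ bit layout:

```js
// 8-bit quantization round trip (illustrative; not the real SPZ packing)
const quantize = (x) => Math.round(x * 255); // float in [0,1] -> uint8
const dequantize = (q) => q / 255;           // uint8 -> float
const opacity = 0.7321;
const error = Math.abs(opacity - dequantize(quantize(opacity)));
console.log(error); // ~0.0012, at most 1/510 ≈ 0.00196
```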
c. Author multiple LODs
For VR or mobile delivery, generate two outputs from the same training: a 1.5M-splat hero, and a 500k-splat mobile. Switch by detecting WebXR session or low-end device. Spark supports per-instance LOD.
```js
// Pick the lighter asset for XR sessions (two eyes, two renders per frame)
// and for small screens; desktop gets the hero file. Constructing only one
// SplatMesh also avoids downloading both variants.
const isXR = await navigator.xr?.isSessionSupported('immersive-vr');
const small = matchMedia('(max-width: 800px)').matches;
const splats = new SplatMesh({
  url: (isXR || small) ? '/scene.mobile.spz' : '/scene.hero.spz',
});
scene.add(splats);
```
9. Production gotchas
Alignment
COLMAP's coordinate system is arbitrary — your scene comes back rotated, mirrored, or off-axis from where you expect. Bake the corrective transform into the SplatMesh:
```js
splats.rotation.set(-Math.PI / 2, 0, 0); // Z-up → Y-up
splats.position.set(0, -1.4, 0);         // floor to origin
splats.scale.setScalar(1.2);             // capture scale → world scale
```
Do this once at load. Don't re-bake the file every iteration; you'll be tweaking these values for an hour.
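One way to shorten that hour: wire the three values to sliders and hard-code whatever you land on. A minimal sketch with lil-gui (the panel library the three.js examples use); the ranges are arbitrary:

```js
import GUI from 'lil-gui';

const gui = new GUI();
gui.add(splats.rotation, 'x', -Math.PI, Math.PI, 0.01).name('tilt (rad)');
gui.add(splats.position, 'y', -5, 5, 0.01).name('floor offset');
gui.add({ scale: 1.2 }, 'scale', 0.1, 5, 0.01)
  .name('world scale')
  .onChange((v) => splats.scale.setScalar(v));
```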
Scale
COLMAP recovers geometry only up to a similarity transform — meters are not meters. To set real-world scale: include a known-size object in the capture (a checkerboard, a printed scale card), measure it in the Nerfstudio viewer, and derive a multiplier, as in the sketch below. Or eyeball it; most marketing scenes don't need true metric scale.
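The multiplier is one division. With hypothetical numbers (a printed A4 sheet, long edge 0.297 m, measuring 0.41 units in the viewer):

```js
// real size / reconstructed size = capture units -> metres
const realMetres = 0.297;   // known object: long edge of an A4 sheet
const capturedUnits = 0.41; // same edge measured in the Nerfstudio viewer
splats.scale.setScalar(realMetres / capturedUnits); // ≈ 0.72
```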
Lighting transfer
Splats bake lighting into vertex colors at capture time. If you composite a Three.js mesh into the splat scene, that mesh's lighting must match the splat's baked lighting or it'll look pasted-in. Two approaches:
- Capture an HDRI alongside the splat scene — an Insta360 camera, or a stitched phone panorama. Use it as scene.environment for the mesh (sketched after this list).
- Render the splat-only scene, sample its colors from the regions where the mesh sits, and fit a low-frequency environment to those samples. Hacky, but it works for a dropped product on a captured surface.
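The HDRI route, sketched assuming you've stitched the capture into an equirectangular my_scene.hdr. The splats ignore the environment map; only PBR mesh materials pick it up:

```js
import { RGBELoader } from 'three/addons/loaders/RGBELoader.js';

new RGBELoader().load('/captures/my_scene.hdr', (hdri) => {
  hdri.mapping = THREE.EquirectangularReflectionMapping;
  scene.environment = hdri; // lights MeshStandardMaterial surfaces
  // Leave scene.background unset: the splats are the background.
});
```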
Shadows
Splats can't receive Three.js mesh shadows (no shadow map sampling). Workaround: a soft fake shadow billboard under the mesh — semi-transparent radial gradient — sells the contact even if it's not physically accurate.
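A sketch of that billboard: a radial gradient painted into a CanvasTexture on a floor-aligned plane. Size and darkness are taste parameters:

```js
function makeFakeShadow(size = 1.5) {
  const canvas = document.createElement('canvas');
  canvas.width = canvas.height = 256;
  const ctx = canvas.getContext('2d');
  const grad = ctx.createRadialGradient(128, 128, 0, 128, 128, 128);
  grad.addColorStop(0, 'rgba(0,0,0,0.4)'); // darkest at the contact point
  grad.addColorStop(1, 'rgba(0,0,0,0)');   // fades to fully transparent
  ctx.fillStyle = grad;
  ctx.fillRect(0, 0, 256, 256);

  const mesh = new THREE.Mesh(
    new THREE.PlaneGeometry(size, size),
    new THREE.MeshBasicMaterial({
      map: new THREE.CanvasTexture(canvas),
      transparent: true,
      depthWrite: false, // never occlude the splats behind it
    })
  );
  mesh.rotation.x = -Math.PI / 2; // lie flat on the floor
  return mesh;
}

scene.add(makeFakeShadow()); // slide under the composited mesh
```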
10. Cost & timing reality
| step | time | cost ($USD, cloud) |
|---|---|---|
| Capture | 10-30 min | 0 (own device) |
| COLMAP / process | 10-30 min | ~$0.20 on RunPod CPU |
| Splatfacto train | 20-60 min | $0.40-$1.50 on H100 |
| Export + SuperSplat cleanup | 10-20 min | 0 |
| SPZ conversion | < 1 min | 0 |
| Three.js integration | 1-2 hrs first time, 10 min after | 0 |
End-to-end: a hobbyist scene is 90 minutes and $1-2 of compute. A polished agency capture is half a day and ~$50 of cloud GPU. That's the bar Polycam, Luma, and Niantic have collectively dragged from "research project" to "billable deliverable."
11. Takeaways
- Capture is the input. 80-300 images, ~70% overlap, locked exposure, no moving subjects.
- Nerfstudio's ns-process-data wraps COLMAP into a one-line preprocessor.
- Splatfacto trains in 20-60 min on a 4090; the cloud option is ~$1 per scene.
- Export PLY for archive, convert to SPZ for shipping.
- Spark loads SPZ + composes with Three.js meshes — splats and meshes share the scene graph.
- Three production gotchas: alignment, scale, lighting transfer. Author the corrective transform once at load.
- End-to-end cost: ~$1-50 per scene, half a day of work.