Three.js From Zero · Article s3-02
Morph Targets & Facial Animation
Bones move parts of a mesh rigidly. Great for limbs, bad for faces — a smile isn't "rotate the mouth bone." Faces deform in a million tiny ways: lips press, cheeks puff, eyelids droop, tongue moves. Those micro-deformations are morph targets (blend shapes) — whole-mesh snapshots of the face in different expressions, blended together by per-shape weights.
Concept in one paragraph
For each morph target, you store a delta — the per-vertex offset from the base mesh to that expression. The GPU adds them up weighted each frame:
// For each vertex:
finalPos = basePos + Σ (morphDelta[i] × morphWeight[i])
Store 50 targets (one per face muscle group). Set their weights 0..1. The engine combines them. Smile (0.7) + Brow-Furrow (0.3) + Tongue-Out (0.2) = a single expressive pose, all in the vertex shader.
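The weighted sum above can be sketched CPU-side. The real work happens in the vertex shader; this pure-JS version is only a reference for the math:

```javascript
// Reference implementation of: finalPos = basePos + Σ (morphDelta[i] × weight[i])
// Positions and deltas are flat [x, y, z, x, y, z, ...] arrays.
function applyMorphs(basePos, morphDeltas, weights) {
  const out = Array.from(basePos);
  for (let m = 0; m < morphDeltas.length; m++) {
    const w = weights[m];
    if (w === 0) continue; // zero-weight targets cost nothing
    for (let i = 0; i < out.length; i++) out[i] += morphDeltas[m][i] * w;
  }
  return out;
}

// One vertex at x = 1 with a single "stretch" delta of +0.5, weighted 0.5:
applyMorphs([1, 0, 0], [[0.5, 0, 0]], [0.5]); // → [1.25, 0, 0]
```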
Building morph targets in code
const geom = new THREE.SphereGeometry(1, 32, 24);
geom.morphTargetsRelative = true; // these arrays store deltas, not absolute positions
const basePos = geom.attributes.position.clone();

// Build a "stretched" morph target as per-vertex deltas from the base
const stretchArr = new Float32Array(basePos.count * 3);
for (let i = 0; i < basePos.count; i++) {
  stretchArr[i * 3 + 0] = basePos.getX(i) * 0.7 - basePos.getX(i); // delta: squish X
  stretchArr[i * 3 + 1] = basePos.getY(i) * 0.5;                   // delta: stretch Y
  stretchArr[i * 3 + 2] = basePos.getZ(i) * 0.7 - basePos.getZ(i); // delta: squish Z
}

geom.morphAttributes.position = geom.morphAttributes.position || [];
geom.morphAttributes.position.push(new THREE.BufferAttribute(stretchArr, 3));

const mesh = new THREE.Mesh(geom, mat);
mesh.morphTargetInfluences = [0.0];          // one weight per morph target
mesh.morphTargetDictionary = { stretch: 0 }; // look up indices by name
Set mesh.morphTargetInfluences[0] = 1.0 and the sphere stretches; 0.5 gives the halfway pose. Smooth blend, all on the GPU.
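To see the blend move, drive the weight from the render loop. A minimal sketch — pulseWeight is a hypothetical helper, and the commented line assumes the mesh built above:

```javascript
// Oscillate a weight smoothly between 0 and 1.
function pulseWeight(timeSec, periodSec = 2) {
  return 0.5 + 0.5 * Math.sin((timeSec / periodSec) * Math.PI * 2);
}

// In the render loop:
// mesh.morphTargetInfluences[0] = pulseWeight(clock.getElapsedTime());
```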
Critical: deltas, not absolute positions
This is the #1 gotcha. When geometry.morphTargetsRelative is true, morph attributes store offsets from the base, not absolute positions — GLTFLoader sets this flag for you, and you should set it yourself on hand-built geometry that stores deltas. Feed absolute positions into that mode and the mesh explodes, because the engine adds your "delta" on top of the base. (With the flag at its default of false, Three.js instead treats the attribute as absolute target positions — pick one convention and stay consistent.)
// WRONG — absolute positions
stretchArr[i*3 + 1] = basePos.getY(i) * 1.5;
// RIGHT — delta from base
stretchArr[i*3 + 1] = basePos.getY(i) * 1.5 - basePos.getY(i);
// equivalent to: basePos.getY(i) * 0.5
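If a tool hands you absolute-position targets and you want delta mode, the conversion is one subtraction per component. A hypothetical helper:

```javascript
// Convert an absolute-position morph target to per-vertex deltas.
// Both inputs are flat [x, y, z, ...] arrays of equal length.
function toDeltas(base, absolute) {
  if (base.length !== absolute.length) throw new Error('vertex count mismatch');
  const deltas = new Float32Array(base.length);
  for (let i = 0; i < base.length; i++) deltas[i] = absolute[i] - base[i];
  return deltas;
}

// geom.morphAttributes.position.push(new THREE.BufferAttribute(toDeltas(baseArr, absArr), 3));
```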
Morph normals — or your lighting breaks
If your material is lit (Standard/Physical), you also need morph normals — per-target normal deltas. Otherwise lighting uses the base normals and the deformed surface shades wrong.
geom.morphAttributes.normal = [normalDeltaAttr1, normalDeltaAttr2, ...];
// Since r133 there is no material.morphNormals flag — the renderer enables
// morph normals automatically when geometry.morphAttributes.normal exists.
// (On older versions, also set material.morphNormals = true.)
Computing correct morph normals by hand is tedious. Blender exports them via glTF. For code-generated morphs on unlit materials (MeshBasicMaterial), you can skip this.
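The usual code-side trick is to recompute normals on a displaced copy of the mesh and subtract the base normals. The Three.js plumbing (clone the geometry, add the deltas to its positions, call computeVertexNormals(), subtract the base normal attribute) is mechanical; the math underneath is just cross products, sketched here for a single triangle:

```javascript
// Unit face normal of triangle (a, b, c), each a [x, y, z] array.
function faceNormal(a, b, c) {
  const u = [b[0] - a[0], b[1] - a[1], b[2] - a[2]];
  const v = [c[0] - a[0], c[1] - a[1], c[2] - a[2]];
  const n = [u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0]];
  const len = Math.hypot(n[0], n[1], n[2]);
  return n.map((x) => x / len);
}

// Morph normal = normal of the displaced face minus the base normal.
function normalDelta(baseTri, morphedTri) {
  const nb = faceNormal(...baseTri);
  const nm = faceNormal(...morphedTri);
  return [nm[0] - nb[0], nm[1] - nb[1], nm[2] - nb[2]];
}
```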
The 8-morph limit (and how to beat it)
Three.js's classic WebGL 1 path bound each morph target as its own vertex attribute, which capped a mesh at 8 targets — far too few for a 50-target face. The fix: modern Three.js packs morph targets into a data texture read in the vertex shader, so the attribute cap disappears and high target counts just work. You don't write morph-texture code yourself — the renderer handles it transparently. The side effect: the first render of a high-morph-count mesh compiles a more expensive shader and can hitch. Preload that compile (renderer.compile(scene, camera)) on app start.
Facial animation with FACS / ARKit blend shapes
For realistic faces, don't invent your own morph names. Use a standard set:
- FACS (Facial Action Coding System) — Paul Ekman's 46 action units. The academic standard.
- ARKit blendshapes — Apple's 52-shape face model. The de-facto industry standard, because iPhone face capture ships it. Names like mouthSmileLeft, eyeBlinkRight, jawOpen.
- Viseme sets (Oculus, Microsoft) — mouth shapes per phoneme. 14-22 shapes is typical.
Build your character with ARKit's 52 blendshapes → you can drive it from any Apple face capture app, MediaPipe, or lip-sync generators. Cross-compatibility for free.
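A cheap sanity check once you adopt that convention: verify a loaded head actually exposes the names you plan to drive. The required list here is just an illustrative subset of the 52:

```javascript
const REQUIRED = ['jawOpen', 'eyeBlinkLeft', 'eyeBlinkRight',
                  'mouthSmileLeft', 'mouthSmileRight'];

// Returns the blendshape names missing from a mesh's morphTargetDictionary.
function missingBlendshapes(dictionary, required = REQUIRED) {
  return required.filter((name) => !(name in dictionary));
}

// missingBlendshapes(head.morphTargetDictionary) → [] means you're compatible.
```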
Lip sync from audio
Two paths:
1. Phoneme → viseme mapping (offline)
// Use a phoneme analyzer (e.g., Rhubarb Lip Sync) on an audio file
// Get back: [ { time: 0.0, viseme: 'A' }, { time: 0.08, viseme: 'B' }, ... ]
function tickLipSync(audioTime) {
  // Pick the LAST event whose start time has passed — .find() would always
  // return the first entry in the schedule, which is a common bug here.
  let current = null;
  for (const e of schedule) {
    if (e.time <= audioTime) current = e;
    else break;
  }
  for (const viseme of ALL_VISEMES) {
    mesh.morphTargetInfluences[morphDict[viseme]] =
      current && viseme === current.viseme ? 1 : 0;
  }
}
2. Real-time audio analysis
// Use Web Audio + FFT to extract formants, map to mouth shape
const analyser = audioContext.createAnalyser();
analyser.fftSize = 1024;
// Compare frequency energy to known phoneme patterns
// OR use oculus-lipsync / azure-speech viseme streams
For production: Rhubarb for offline, TalkingHead for JS text-to-speech + visemes, or ElevenLabs / Azure Cognitive Speech streaming visemes.
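Short of real formant analysis, a crude but serviceable real-time fallback is mapping overall loudness straight onto jawOpen — an assumption for illustration, and it gives "puppet mouth", not true lip sync:

```javascript
// Map an analyser's byte-frequency data (0..255 per bin) to a 0..1 jaw weight.
function loudnessToJaw(byteData, gain = 2) {
  let sum = 0;
  for (let i = 0; i < byteData.length; i++) sum += byteData[i];
  const avg = sum / (byteData.length * 255); // normalized loudness, 0..1
  return Math.min(1, avg * gain);
}

// Per frame:
// analyser.getByteFrequencyData(data);
// mesh.morphTargetInfluences[morphDict['jawOpen']] = loudnessToJaw(data);
```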
Blending multiple expressions
Emotions aren't single shapes. Joy = Smile (0.8) + EyeSquint (0.6) + BrowRaise (0.3). Additive blending is free — just set multiple weights:
const JOY = {
  mouthSmileLeft: 0.8, mouthSmileRight: 0.8,
  cheekSquintLeft: 0.4, cheekSquintRight: 0.4,
  browInnerUp: 0.3,
};

function apply(preset) {
  for (const [name, w] of Object.entries(preset)) {
    const idx = mesh.morphTargetDictionary[name];
    if (idx !== undefined) mesh.morphTargetInfluences[idx] = w;
  }
}
apply(JOY);
Preset dicts for emotions + lerp them over time → full emotional pipeline. Your dialog system emits emotion tags, the renderer lerps between presets, the character emotes naturally.
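The lerp step can be as small as this sketch: every frame, ease each weight toward the active preset, with shapes absent from the preset decaying back to 0.

```javascript
// Ease all weights toward a preset. dt is the frame time in seconds.
function lerpToward(influences, dictionary, preset, dt, speed = 5) {
  const k = Math.min(1, dt * speed); // fraction of the gap closed this frame
  for (const [name, idx] of Object.entries(dictionary)) {
    const goal = preset[name] ?? 0;  // unlisted shapes relax back to 0
    influences[idx] += (goal - influences[idx]) * k;
  }
  return influences;
}

// Each frame:
// lerpToward(mesh.morphTargetInfluences, mesh.morphTargetDictionary, JOY, dt);
```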
Morph + skeleton — use both together
Skeleton for big motions (pose, walk, jump). Morphs for fine detail (face, muscle flex, cloth wrinkles). A SkinnedMesh can have both — morphs apply first in the vertex shader, then skinning. That's the pipeline every high-end character uses.
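In the render loop that means the AnimationMixer owns the bones while you write morph weights directly on top. A sketch, with a hypothetical blinkWeight helper supplying a periodic blink; mixer, head, and the tick arguments are whatever your scene setup provides:

```javascript
// Returns 1 during the last `closeFor` seconds of each period, else 0.
function blinkWeight(t, period = 4, closeFor = 0.2) {
  return (t % period) > (period - closeFor) ? 1 : 0;
}

// function tick(dt, t) {
//   mixer.update(dt);          // skeleton: walk cycle, pose
//   const w = blinkWeight(t);  // morphs: layered on top, every frame
//   head.morphTargetInfluences[head.morphTargetDictionary['eyeBlinkLeft']] = w;
//   head.morphTargetInfluences[head.morphTargetDictionary['eyeBlinkRight']] = w;
// }
```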
Reading glTF morphs
gltf.scene.traverse((o) => {
  if (o.morphTargetInfluences) {
    console.log(o.name, 'has', o.morphTargetInfluences.length, 'morphs');
    console.log('names', Object.keys(o.morphTargetDictionary));
  }
});

// Drive a blendshape by name:
const idx = head.morphTargetDictionary['jawOpen'];
head.morphTargetInfluences[idx] = 0.4;
Common first-time pitfalls
- Mesh warps oddly — you stored absolute positions instead of deltas (or forgot geometry.morphTargetsRelative = true for delta data).
- Lighting looks wrong on the deformed mesh — you need morph normals in geometry.morphAttributes.normal (plus material.morphNormals = true on pre-r133 versions).
- morphTargetInfluences is undefined — your geometry has no morphAttributes.position set. Add the attribute first; then the influences array appears.
- High-count morphs compile slowly — Three.js packs them into morph textures internally, so the first render hitches. Preload with renderer.compile.
- Blendshapes stack past 1 everywhere — ARKit shapes can go above 1 but look unnatural. Clamp weights to 0..1.
- Face looks "fake-smile-y" — you're using one mouthSmile shape. Mix in cheekSquint + browInnerUp for a Duchenne (real) smile.
Exercises
- Load a Ready Player Me avatar. It ships with ARKit blendshapes. Build a UI that exposes all 52.
- Real-time viseme driving: stream visemes from a TTS engine that emits them (e.g. the Azure Speech SDK's viseme events — the Web Speech API only fires boundary events, which you can use as a crude approximation) to drive mouthOpen/Close/Smile shapes from live speech.
- Emotion blending: define presets for Joy / Sadness / Anger / Fear / Surprise. Cross-fade between them over 0.4s when triggered.
What's next
Article S3-03 — Animation Blending & Blend Trees. Locomotion systems that mix walk/run/sprint based on a speed axis, directional blends for strafe, and the state-machine patterns layered on top.