Three.js From Zero · Article S3-02

Morph Targets & Facial Animation

Bones move parts of a mesh rigidly. Great for limbs, bad for faces — a smile isn't "rotate the mouth bone." Faces deform in a million tiny ways: lips press, cheeks puff, eyelids droop, tongue moves. Those micro-deformations are morph targets (blend shapes) — whole-mesh snapshots of the face in different expressions, blended together by per-shape weights.

Concept in one paragraph

For each morph target, you store a delta — the per-vertex offset from the base mesh to that expression. The GPU adds them up weighted each frame:

// For each vertex:
finalPos = basePos + Σ (morphDelta[i] × morphWeight[i])

Store 50 targets (one per face muscle group). Set their weights 0..1. The engine combines them. Smile (0.7) + Brow-Furrow (0.3) + Tongue-Out (0.2) = a single expressive pose, all in the vertex shader.
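
The same sum, written out in plain JavaScript for a single vertex so the formula is concrete (base, deltas and weights are hypothetical stand-ins for the buffers the GPU actually reads):

// base: [x, y, z] of the vertex in the neutral pose
// deltas[i]: [dx, dy, dz] offset for morph target i
// weights[i]: current influence of target i, usually 0..1
function blendVertex(base, deltas, weights) {
  const out = [...base];
  for (let i = 0; i < deltas.length; i++) {
    out[0] += deltas[i][0] * weights[i];
    out[1] += deltas[i][1] * weights[i];
    out[2] += deltas[i][2] * weights[i];
  }
  return out;
}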

Building morph targets in code

import * as THREE from 'three';

const geom = new THREE.SphereGeometry(1, 32, 24);
const basePos = geom.attributes.position.clone();

// Build a "stretched" morph target as per-vertex deltas from the base
const stretchArr = new Float32Array(basePos.count * 3);
for (let i = 0; i < basePos.count; i++) {
  const y = basePos.getY(i);
  stretchArr[i*3 + 0] = basePos.getX(i) * 0.7 - basePos.getX(i);   // delta: squish X
  stretchArr[i*3 + 1] = y * 0.5;                                    // delta: stretch Y
  stretchArr[i*3 + 2] = basePos.getZ(i) * 0.7 - basePos.getZ(i);    // delta: squish Z
}

geom.morphAttributes.position = geom.morphAttributes.position || [];
geom.morphAttributes.position.push(
  new THREE.BufferAttribute(stretchArr, 3)
);
geom.morphTargetsRelative = true;     // these are deltas, not absolute positions

const mat = new THREE.MeshStandardMaterial({ color: 0x8899ff });
const mesh = new THREE.Mesh(geom, mat);
mesh.morphTargetInfluences = [0.0];   // one weight per morph target
mesh.morphTargetDictionary = { 'stretch': 0 };

Set mesh.morphTargetInfluences[0] = 1.0 and the sphere stretches; set it to 0.5 and you get the halfway pose. Smooth blending, all on the GPU.
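
To watch the blend move, drive the weight from the render loop. A minimal sketch, assuming the mesh above plus an already-created renderer, scene and camera:

const clock = new THREE.Clock();

function animate() {
  requestAnimationFrame(animate);

  // Oscillate the 'stretch' weight between 0 and 1
  const t = clock.getElapsedTime();
  mesh.morphTargetInfluences[0] = (Math.sin(t * 2) + 1) / 2;

  renderer.render(scene, camera);
}
animate();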

Critical: deltas, not absolute positions

This is the #1 gotcha. When geometry.morphTargetsRelative is true (GLTFLoader sets it for you; for hand-built geometry you set it yourself, as above), morph attributes store offsets from the base, not absolute positions. Store absolute positions in that mode and the mesh explodes, because the engine adds them on top of the base.

// WRONG — absolute positions
stretchArr[i*3 + 1] = basePos.getY(i) * 1.5;

// RIGHT — delta from base
stretchArr[i*3 + 1] = basePos.getY(i) * 1.5 - basePos.getY(i);
// equivalent to: basePos.getY(i) * 0.5

Morph normals — or your lighting breaks

If your material is lit (Standard/Physical), you also need morph normals — delta normals per target. Otherwise lighting uses the base normals and the deformed surface gets shaded wrong.

geom.morphAttributes.normal = [normalDeltaAttr1, normalDeltaAttr2, ...];
// Older Three.js also required material.morphNormals = true; current releases
// detect morph attributes on the geometry and enable morphing automatically.

Computing correct morph normals by hand is tedious; Blender exports them for you via glTF. For code-generated morphs on unlit materials (MeshBasicMaterial), you can skip them entirely. If you do need them in code, one option is sketched below.
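
The idea: displace a copy of the geometry by the morph's position deltas, let Three.js recompute normals on that copy, and store the difference. A sketch under the delta convention used above (not what Blender or the glTF exporter do internally, just a workable approximation):

// Build delta normals for one morph target and attach them to the geometry.
function buildMorphNormals(geom, positionDeltaAttr) {
  const temp = geom.clone();
  const pos = temp.attributes.position;

  // Displace the copy into the morphed shape
  for (let i = 0; i < pos.count; i++) {
    pos.setXYZ(
      i,
      pos.getX(i) + positionDeltaAttr.getX(i),
      pos.getY(i) + positionDeltaAttr.getY(i),
      pos.getZ(i) + positionDeltaAttr.getZ(i)
    );
  }
  temp.computeVertexNormals();

  // Normal delta = morphed normal - base normal
  const baseNrm = geom.attributes.normal;
  const deltaNrm = new Float32Array(baseNrm.count * 3);
  for (let i = 0; i < baseNrm.count; i++) {
    deltaNrm[i*3 + 0] = temp.attributes.normal.getX(i) - baseNrm.getX(i);
    deltaNrm[i*3 + 1] = temp.attributes.normal.getY(i) - baseNrm.getY(i);
    deltaNrm[i*3 + 2] = temp.attributes.normal.getZ(i) - baseNrm.getZ(i);
  }

  geom.morphAttributes.normal = geom.morphAttributes.normal || [];
  geom.morphAttributes.normal.push(new THREE.BufferAttribute(deltaNrm, 3));
}

Call it once per position morph, in the same order as the entries in geom.morphAttributes.position.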

The 8-morph limit (and how to beat it)

Three.js's classic vertex-attribute path supports at most 8 morph target attributes in the vertex shader (4 if you also morph normals). A face rig with 50+ targets blows past that instantly. The fix: recent versions pack morph data into a texture read in the vertex shader, which the renderer enables automatically, so the cap effectively disappears.

You don't write morph-texture code yourself; the renderer switches transparently. The side effect: the first render of a high-morph-count mesh compiles a more expensive shader, so trigger that compile up front with renderer.compile(scene, camera) at app start.
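
A minimal warm-up sketch, assuming scene, renderer and camera already exist; the asset name is a placeholder:

import * as THREE from 'three';
import { GLTFLoader } from 'three/addons/loaders/GLTFLoader.js';

// 'face.glb' stands in for any model with a large blendshape set
const gltf = await new GLTFLoader().loadAsync('face.glb');
scene.add(gltf.scene);

// Pay the shader-compile cost now, before the first animated frame
renderer.compile(scene, camera);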

Facial animation with FACS / ARKit blend shapes

For realistic faces, don't invent your own morph names. Use a standard set:

  • FACS (Facial Action Coding System) — Ekman and Friesen's action-unit system (~46 AUs for facial movement). The academic standard.
  • ARKit blendshapes — Apple's 52-shape face model. De facto industry standard because iPhone face capture ships it. Names like mouthSmileLeft, eyeBlinkRight, jawOpen.
  • Viseme sets (Oculus, Microsoft) — mouth shapes per phoneme; typically 15 (Oculus) to 22 (Azure) shapes.

Build your character with ARKit's 52 blendshapes → you can drive it from any Apple face capture app, MediaPipe, or lip-sync generators. Cross-compatibility for free.

Lip sync from audio

Two paths:

1. Phoneme → viseme mapping (offline)

// Use a phoneme analyzer (e.g., Rhubarb Lip Sync) on an audio file
// Get back: [ { time: 0.0, viseme: 'A' }, { time: 0.08, viseme: 'B' }, ... ]

function tickLipSync(audioTime) {
  // Take the LAST event that has already started; Array.find() would always
  // return the first entry (time 0) and freeze the mouth on the opening viseme.
  let current = null;
  for (const e of schedule) {            // schedule sorted by time ascending
    if (e.time <= audioTime) current = e;
    else break;
  }
  for (const viseme of ALL_VISEMES) {
    mesh.morphTargetInfluences[morphDict[viseme]] =
      current && viseme === current.viseme ? 1 : 0;
  }
}

2. Real-time audio analysis

// Use Web Audio + FFT to extract formants, map to mouth shape
const analyser = audioContext.createAnalyser();
analyser.fftSize = 1024;
// Compare frequency energy to known phoneme patterns
// OR use oculus-lipsync / azure-speech viseme streams
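
A crude but serviceable real-time fallback: skip formant analysis entirely and map loudness to jawOpen. A sketch assuming the analyser above is fed by your speech audio and the mesh has an ARKit-style jawOpen target:

const data = new Uint8Array(analyser.frequencyBinCount);

function tickAudioLipSync() {
  analyser.getByteTimeDomainData(data);

  // RMS of the waveform ~ loudness of the current chunk
  let sum = 0;
  for (const v of data) {
    const centered = (v - 128) / 128;
    sum += centered * centered;
  }
  const rms = Math.sqrt(sum / data.length);

  const idx = mesh.morphTargetDictionary['jawOpen'];
  mesh.morphTargetInfluences[idx] = Math.min(rms * 4, 1);   // 4 = hand-tuned gain
}

Call tickAudioLipSync() every frame while audio plays. It reads as a puppet rather than capture, but it sells speech at a distance.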

For production: Rhubarb for offline, TalkingHead for JS text-to-speech + visemes, or ElevenLabs / Azure Cognitive Speech streaming visemes.

Blending multiple expressions

Emotions aren't single shapes. Joy = Smile (0.8) + EyeSquint (0.6) + BrowRaise (0.3). Additive blending is free — just set multiple weights:

const JOY = {
  mouthSmileLeft:  0.8, mouthSmileRight: 0.8,
  cheekSquintLeft: 0.4, cheekSquintRight: 0.4,
  browInnerUp:     0.3,
};

function apply(preset) {
  for (const [name, w] of Object.entries(preset)) {
    const idx = mesh.morphTargetDictionary[name];
    if (idx !== undefined) mesh.morphTargetInfluences[idx] = w;
  }
}

apply(JOY);

Preset dicts for emotions + lerp them over time → full emotional pipeline. Your dialog system emits emotion tags, the renderer lerps between presets, the character emotes naturally.
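
A minimal crossfade sketch, assuming presets shaped like JOY above and a per-frame delta time from your clock:

let from = {}, to = JOY, t = 1;        // t: 0..1 progress of the current fade
const FADE_TIME = 0.4;                 // seconds

function setEmotion(preset) { from = to; to = preset; t = 0; }

function updateEmotion(dt) {           // call every frame with clock.getDelta()
  t = Math.min(t + dt / FADE_TIME, 1);
  const names = new Set([...Object.keys(from), ...Object.keys(to)]);
  for (const name of names) {
    const idx = mesh.morphTargetDictionary[name];
    if (idx === undefined) continue;
    mesh.morphTargetInfluences[idx] =
      THREE.MathUtils.lerp(from[name] ?? 0, to[name] ?? 0, t);
  }
}

Only the names appearing in either preset are touched, so lip-sync shapes keep running underneath the emotional fade.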

Morph + skeleton — use both together

Skeleton for big motions (pose, walk, jump). Morphs for fine detail (face, muscle flex, cloth wrinkles). A SkinnedMesh can have both — morphs apply first in the vertex shader, then skinning. That's the pipeline every high-end character uses.
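
A sketch of that combined pipeline, assuming a glTF character already loaded (gltf) whose clips include a 'Walk' animation and whose head mesh is named 'Head' with ARKit blendshapes; talkAmount() is a hypothetical helper returning the current mouth openness:

const mixer = new THREE.AnimationMixer(gltf.scene);
mixer.clipAction(THREE.AnimationClip.findByName(gltf.animations, 'Walk')).play();

const head = gltf.scene.getObjectByName('Head');   // SkinnedMesh with blendshapes
const jaw = head.morphTargetDictionary['jawOpen'];
const clock = new THREE.Clock();

function animate() {
  requestAnimationFrame(animate);
  const dt = clock.getDelta();

  mixer.update(dt);                                // skeleton: walk cycle
  head.morphTargetInfluences[jaw] = talkAmount();  // morphs: face detail on top

  renderer.render(scene, camera);
}
animate();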

Reading glTF morphs

gltf.scene.traverse((o) => {
  if (o.morphTargetInfluences) {
    console.log(o.name, 'has', o.morphTargetInfluences.length, 'morphs');
    console.log('names', Object.keys(o.morphTargetDictionary));
  }
});

// Drive a blendshape by name:
const idx = head.morphTargetDictionary['jawOpen'];
head.morphTargetInfluences[idx] = 0.4;

Common first-time pitfalls

  • Mesh warps oddly — you stored absolute positions instead of deltas.
  • Lighting looks wrong on deformed mesh — you need morph normals (and, on older Three.js versions, material.morphNormals = true).
  • morphTargetInfluences is undefined — your geometry doesn't have morphAttributes.position set. Add the attribute before creating the mesh, or call mesh.updateMorphTargets() afterwards, and the influences array appears.
  • High-count morphs compile slowly — Three.js switches to morph textures internally. First render hitches. Preload with renderer.compile.
  • Blendshapes stack to more than 1 everywhere — ARKit shapes can go above 1 but look unnatural. Clamp weights to 0..1.
  • Face looks "fake-smile-y" — you're using one mouthSmile shape. Mix with cheekSquint + browInnerUp for a Duchenne (real) smile.

Exercises

  1. Load a Ready Player Me avatar. It ships with ARKit blendshapes. Build a UI that exposes all 52.
  2. Real-time viseme driving: the Web Speech API has no viseme events, so use a TTS that streams them (e.g., the Azure Speech SDK's viseme events) to drive mouthOpen/Close/Smile shapes from live speech.
  3. Emotion blending: define presets for Joy / Sadness / Anger / Fear / Surprise. Cross-fade between them over 0.4s when triggered.

What's next

Article S3-03 — Animation Blending & Blend Trees. Locomotion systems that mix walk/run/sprint based on a speed axis, directional blends for strafe, and the state-machine patterns layered on top.