Three.js From Zero · Article s3-08

Character Pipeline: Mixamo → Three.js


You need a rigged, animated character. Three options: hire a rigger (expensive), rig one yourself in Blender (weeks of learning), or use Mixamo. Adobe's free auto-rigger takes any humanoid mesh, places a skeleton, and gives you a large library of mocap animations you can download as FBX. The catch: the export pipeline to Three.js has gotchas. This article is the full cookbook.

Demo: a Mixamo character playing one of the clips in the library. Pick the clip from the dropdown. (We use a Three.js sample asset because we can't bundle Mixamo's FBX in a single HTML file, but the article covers the real workflow step by step.)


The 5-step pipeline

  1. Model a humanoid mesh (or grab one)
  2. Auto-rig via Mixamo
  3. Download clips + apply to character
  4. Merge + convert to glTF
  5. Load in Three.js

Step 1 — The mesh

Your character needs to be a humanoid — head, torso, two arms, two legs, roughly human-proportioned. A-pose or T-pose both work. No quadrupeds, no four-armed creatures — Mixamo's auto-rig only handles bipeds.

Source options: one of Mixamo's own stock characters (already rigged — skip straight to step 3), a marketplace model (Sketchfab, TurboSquid — check the license), or your own Blender sculpt.

For Mixamo upload, export as FBX or OBJ. Keep the mesh below ~40,000 polygons — Mixamo rejects higher counts. Upload a single mesh with no object hierarchy.
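Before uploading, it's worth verifying the polygon budget. A quick way to count triangles once the mesh is loaded in Three.js (a sketch — `countTriangles` is a helper name introduced here, and the 40k threshold is the limit mentioned above):

```javascript
// Count triangles across every mesh under a root object.
// Works on anything loaded by GLTFLoader / FBXLoader / OBJLoader.
function countTriangles(root) {
  let tris = 0;
  root.traverse((obj) => {
    if (!obj.isMesh) return;
    const geo = obj.geometry;
    // Indexed geometry: triangle count comes from the index buffer;
    // non-indexed: from the position attribute.
    const vertexCount = geo.index ? geo.index.count : geo.attributes.position.count;
    tris += vertexCount / 3;
  });
  return tris;
}

// if (countTriangles(mesh) > 40_000) console.warn('Mixamo may reject this mesh');
```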

Step 2 — Auto-rig at mixamo.com

  1. Upload your mesh
  2. Position the joint markers on chin, wrists, elbows, knees, and groin
  3. Click "Next" — Mixamo computes weights and a skeleton (~30 seconds)
  4. Preview the auto-rig. If limbs deform badly, reposition markers and retry

Mixamo's skeleton is a standard humanoid with ~60 bones, named mixamorig:Hips, mixamorig:Spine, mixamorig:LeftArm, etc. This matters because all Mixamo animations use the same bone names — they're interchangeable across any auto-rigged character.
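Because every auto-rigged character shares those names, you can sanity-check that two exports really carry the same skeleton before swapping clips between them. A minimal sketch (assuming you pass each SkinnedMesh's `skeleton.bones` array; `sameBoneNames` is a helper name introduced here):

```javascript
// True when two bone arrays contain exactly the same names (order-independent),
// i.e. clips recorded against one rig will bind cleanly to the other.
function sameBoneNames(bonesA, bonesB) {
  if (bonesA.length !== bonesB.length) return false;
  const namesA = bonesA.map((b) => b.name).sort();
  const namesB = bonesB.map((b) => b.name).sort();
  return namesA.every((name, i) => name === namesB[i]);
}

// sameBoneNames(meshA.skeleton.bones, meshB.skeleton.bones)
```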

Step 3 — Download animations

Browse Mixamo's animation library (thousands of clips). For each clip you want:

  1. Tune the clip parameters (Overdrive for playback speed, Character Arm-Space, and any clip-specific sliders)
  2. Check "In Place" — keeps the character at the origin; you drive world position yourself. Uncheck it if you want root motion baked into the clip.
  3. Download settings: FBX Binary, 30fps, Skin: "With Skin" for the first clip, "Without Skin" for all subsequent
Why "with skin" only once? The first FBX carries your mesh + the rig + clip 1. Every subsequent FBX is just rig + clip N — the same skeleton, no mesh. You merge them so one character has N clips.

Step 4 — Merge clips into one glTF

Three tools to pick from:

Option A: Blender (free, visual)

  1. Import the first FBX (with mesh). Rename the NLA action to something sensible.
  2. Import the second FBX (without mesh). Two armatures now exist — remove the duplicate armature from step 2 but keep the new action. Rename the action.
  3. Repeat for each clip.
  4. Export as glTF 2.0 (.glb). Check "Export Deformation Bones Only" and "Export Animations".

Option B: FBX2glTF + glTF-Transform (CLI, scriptable)

# Convert each FBX to glTF (FBX2glTF is Facebook's converter;
# glTF-Transform is Don McCurdy's processing toolkit)
npx fbx2gltf character_walk.fbx -o character_walk.glb
npx fbx2gltf character_run.fbx -o character_run.glb
npx fbx2gltf character_jump.fbx -o character_jump.glb

# Merge into one file, then dedup to collapse duplicated skeleton/skin data
npx gltf-transform merge character_walk.glb character_run.glb character_jump.glb character_merged.glb
npx gltf-transform dedup character_merged.glb character_merged.glb

Depending on how the converter structures the files, you may still need a short script against the glTF-Transform API to point every animation at a single skeleton — inspect the merged file before shipping it.

Option C: Three.js runtime merge

import * as THREE from 'three';
import { FBXLoader } from 'three/addons/loaders/FBXLoader.js';

const loader = new FBXLoader();
const base = await loader.loadAsync('char_idle.fbx');
const walk = await loader.loadAsync('char_walk.fbx');

// Pull clip 0 out of walk, rename it, push it into base
walk.animations[0].name = 'walk';
base.animations.push(walk.animations[0]);

const mixer = new THREE.AnimationMixer(base);
const walkAction = mixer.clipAction(walk.animations[0]);

Option C skips the offline export step entirely but bloats your download with several uncompressed FBX files. For production, merge offline with Option B.

Step 5 — Load in Three.js

import * as THREE from 'three';
import { GLTFLoader } from 'three/addons/loaders/GLTFLoader.js';

const gltf = await new GLTFLoader().loadAsync('/character.glb');
scene.add(gltf.scene);

const mixer = new THREE.AnimationMixer(gltf.scene);
const clips = Object.fromEntries(
  gltf.animations.map((c) => [c.name, mixer.clipAction(c)]),
);
clips.idle.play();

// Don't forget: advance the mixer every frame in your render loop
// mixer.update(clock.getDelta());
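Switching clips (like the demo's dropdown) by calling `stop()` on one action and `play()` on the next causes a visible pop. A common pattern — a sketch, `switchClip` is a helper name introduced here — is a short cross-fade using `AnimationAction.crossFadeTo`:

```javascript
// Cross-fade from the currently playing action to the next one.
// Returns `next` so the caller can track it as the new current action.
function switchClip(current, next, duration = 0.3) {
  next.reset().play();                        // restart next from t = 0
  current.crossFadeTo(next, duration, false); // fade current out while next fades in
  return next;
}

// let current = clips.idle;
// current = switchClip(current, clips.walk); // e.g. on dropdown change
```

Nothing fades unless the mixer advances — `mixer.update(delta)` must run every frame.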

Retargeting — using Mixamo clips on a non-Mixamo character

What if you already have a rig from Blender / Maya that's NOT Mixamo-named? You retarget: remap bones from source to destination.

import * as THREE from 'three';
import * as SkeletonUtils from 'three/addons/utils/SkeletonUtils.js';

// Retarget a Mixamo clip onto a different skeleton
const retargeted = SkeletonUtils.retargetClip(
  targetSkinnedMesh,      // destination SkinnedMesh
  sourceSkinnedMesh,      // Mixamo rig + animation data
  mixamoClip,             // AnimationClip to retarget
  { hip: 'mixamorig:Hips', scale: 1 },
);

// Assuming a mixer created for the destination mesh
const targetMixer = new THREE.AnimationMixer(targetSkinnedMesh);
targetMixer.clipAction(retargeted).play();

SkeletonUtils requires the two skeletons to have matching bone names, or a bone-name mapping passed via the `names` option. For generic retargeting across completely different rigs, consider offline tools like Auto-Rig Pro (Blender addon) or Rokoko Studio.
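A sketch of building that mapping. In the three.js versions I've checked, the `names` option maps destination bone names to source bone names, so the helper below (a name introduced here) emits custom-name keys with `mixamorig:` values — verify the direction against your three.js version, and note the destination names are made up:

```javascript
// Build the `names` option for SkeletonUtils.retargetClip:
// keys = your rig's bone names, values = the matching Mixamo bone names.
function mixamoNameMap(pairs) {
  const names = {};
  for (const [customName, mixamoName] of pairs) {
    names[customName] = 'mixamorig:' + mixamoName;
  }
  return names;
}

const names = mixamoNameMap([
  ['pelvis', 'Hips'],        // hypothetical destination bone names
  ['spine_01', 'Spine'],
  ['upperarm_l', 'LeftArm'],
]);
// SkeletonUtils.retargetClip(target, source, clip, { hip: 'mixamorig:Hips', names });
```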

Scaling — the #1 Mixamo gotcha

Mixamo exports at centimeter scale. Three.js uses meters. Your character appears 100x larger than expected:

gltf.scene.scale.setScalar(0.01);   // cm → m

Or bake the scale offline: import the FBX into Blender, scale the object to 0.01, apply transforms (Ctrl+A), and export the glb — it then loads at meter scale with no runtime fix.

If you apply the scale on an outer group, root-motion translation is scaled too — which is usually what you want.

Bone names reference

Every Mixamo rig has these key bones (60+ total):

mixamorig:Hips           (root)
 ├─ mixamorig:Spine
 │  └─ mixamorig:Spine1
 │     └─ mixamorig:Spine2
 │        ├─ mixamorig:Neck → mixamorig:Head
 │        ├─ mixamorig:LeftShoulder → LeftArm → LeftForeArm → LeftHand
 │        └─ mixamorig:RightShoulder → RightArm → RightForeArm → RightHand
 ├─ mixamorig:LeftUpLeg → LeftLeg → LeftFoot → LeftToeBase
 └─ mixamorig:RightUpLeg → RightLeg → RightFoot → RightToeBase

Each hand has 15 finger bones (5 digits × 3 joints) you can drive separately for gestures, plus a fourth end bone per finger (e.g. mixamorig:LeftHandIndex4) marking the fingertip.
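One wrinkle when addressing bones by name: some export paths keep the `mixamorig:` colon and others strip it. A tolerant lookup helper — `findMixamoBone` is a name introduced here — covers both:

```javascript
// Find a Mixamo bone whether or not the exporter kept the ':' in its name.
function findMixamoBone(root, shortName) {
  return (
    root.getObjectByName('mixamorig:' + shortName) ??
    root.getObjectByName('mixamorig' + shortName) ??
    null
  );
}

// const index1 = findMixamoBone(gltf.scene, 'LeftHandIndex1');
// if (index1) index1.rotation.z = 0.6; // curl the index finger
```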

In-place vs out-of-place

Mode        | Behavior                         | Use when
In Place    | Root stays at origin, legs cycle | You drive position yourself (S3-04 character controller)
Root motion | Hips translate per frame data    | Motion matching, foot planting, exact distances

If you're using the S3-04 kinematic character controller: In Place. If you're using motion matching (S3-07): root motion.
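If you downloaded a clip without "In Place" but want to drive position yourself anyway, you can strip the baked translation at runtime by dropping the hips' position track. A sketch (`stripRootMotion` is a name introduced here) — note this also removes the hips' vertical bob, which genuine "In Place" clips keep:

```javascript
// Remove the root (hips) translation track from an AnimationClip,
// turning a root-motion clip into an in-place one.
// Matches both 'mixamorig:Hips.position' and 'mixamorigHips.position'.
function stripRootMotion(clip) {
  clip.tracks = clip.tracks.filter(
    (track) => !/Hips\.position$/.test(track.name)
  );
  return clip;
}
```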

Optimizing FBX → glTF

The raw FBX from Mixamo is often 20-40MB. Convert + optimize:

# Convert + optimize + compress
npx fbx2gltf character.fbx -o character.glb
npx gltf-transform resize character.glb character.glb --width 512    # shrink textures
npx gltf-transform uastc character.glb character.glb                  # compress textures to KTX2
npx gltf-transform draco character.glb character.glb                  # compress mesh

# Typical result: a 40MB FBX becomes a ~5MB glb with little visible quality loss

If you merged many clips, also run npx gltf-transform dedup — it collapses the duplicated skin data.

Common first-time pitfalls

  • Character is huge. Mixamo exports in cm. Scale by 0.01.
  • Rig shows but mesh is invisible. You downloaded "Without Skin". Re-export the first FBX "With Skin".
  • Animation plays but hands look stuck. Finger bones are separately animated; some clips don't include finger keyframes. That's fine — fingers stay in the rest pose.
  • Character "stair-steps" forward. You're applying root motion on top of manual movement, or the clip is "out of place" but you're also moving the character. Pick one.
  • Weights look wrong near joints. Mixamo auto-weights aren't perfect. For production, bring into Blender and refine weight painting.
  • Bones are named differently in my engine. Retarget via SkeletonUtils or Auto-Rig Pro.

Exercises

  1. Minimal shippable character: run a humanoid mesh through Mixamo (one of Mixamo's stock characters is fine), download 3 clips, merge, load. Ship under 10MB.
  2. Finger gesture: find a clip without finger animation, programmatically set finger bones into a wave gesture during playback.
  3. Retarget to a custom rig: rig a monster character yourself, retarget Mixamo walk onto it.

What's next

S3-09 — Facial Capture with Webcam. MediaPipe extracts 468 face landmarks from a live camera feed. Map them to ARKit blendshapes. Your face drives the character's face in real time, right in the browser.