Three.js From Zero · Article S2-06

WebXR Interaction: Controllers, Hands, Haptics


S2-05 enabled VR. Now we make it interactive. Controllers with real input. 25-joint hand tracking. Pinch and grab gestures. Haptic pulses. And the payoff that ties this whole season together: physics grab-and-throw in VR — Rapier bodies you can pick up with your actual hands on a Quest 3.


The input-source model

WebXR abstracts every kind of input — handheld controllers, tracked hands, gaze targeting — into an XRInputSource. Each source has a handedness ('left' | 'right' | 'none'), a target ray space (where it aims), a grip space (where you'd hold an object), optional buttons/axes (gamepad-like), and optional hand joint data.

Three.js gives you each input source via index:

const controller0 = renderer.xr.getController(0);     // target ray space (aim)
const grip0       = renderer.xr.getControllerGrip(0);  // grip space (hold)
const hand0       = renderer.xr.getHand(0);            // hand (25 joints)

scene.add(controller0, grip0, hand0);   // all three update automatically

Index 0 and 1 don't map to left/right reliably. Check inputSource.handedness on the connected event.
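A minimal sketch of that bookkeeping. The byHand lookup and trackHandedness helper are our own names, not a three.js API:

```javascript
// Hypothetical helper: resolve left/right at connect time instead of
// trusting the controller index.
const byHand = { left: null, right: null };

function trackHandedness(controller) {
  controller.addEventListener('connected', (e) => {
    const side = e.data.handedness;            // 'left' | 'right' | 'none'
    if (side in byHand) byHand[side] = controller;
  });
  controller.addEventListener('disconnected', () => {
    for (const side of Object.keys(byHand)) {
      if (byHand[side] === controller) byHand[side] = null;
    }
  });
}
```

Call it once per controller index; afterwards byHand.left and byHand.right are stable regardless of connect order.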

Events, buttons, axes

Three.js bubbles a small set of events up from the WebXR input source:

Event          Fired when
connected      Controller attaches. event.data = the XRInputSource.
disconnected   Controller drops. Clean up any visuals parented to it.
selectstart    Trigger down (or "select" gesture on hands = pinch).
selectend      Trigger up.
squeezestart   Grip/squeeze button down (hand: whole-hand grab).
squeezeend     Grip up.

For raw button + axis state (thumbstick, A/B/X/Y), reach through to the gamepad of the input source:

controller0.addEventListener('connected', (e) => {
  const gp = e.data.gamepad;
  // gp.buttons: array of { pressed, touched, value }
  // gp.axes:    [touchpadX, touchpadY, thumbX, thumbY] (xr-standard mapping)
});

// per frame, read current state:
function readGamepad(controller) {
  const input = getInputSource(controller);   // you track this in connected/disconnected
  const gp = input?.gamepad;
  if (!gp) return null;
  return { thumbX: gp.axes[2] ?? 0, thumbY: gp.axes[3] ?? 0, a: gp.buttons[4]?.pressed };
}
Quest controllers map buttons slightly differently than Index or Vive hardware. Don't assume indices. The WebXR "xr-standard" gamepad mapping is close to universal but not perfect — test on every device you care about.

Controller models — the drei-less way

WebXR has no built-in "controller mesh" — you ship your own or load one per device. Three.js has XRControllerModelFactory for the stock approach:

import { XRControllerModelFactory } from
  'three/addons/webxr/XRControllerModelFactory.js';

const factory = new XRControllerModelFactory();
const grip0 = renderer.xr.getControllerGrip(0);
grip0.add(factory.createControllerModel(grip0));

It downloads the proper glTF for each connected controller from the WebXR Input Profiles CDN. Works for Quest, Index, Vive, WMR, and most other devices with registered input profiles.

Hand tracking — 25 joints per hand

If the session has 'hand-tracking' in its features, each renderer.xr.getHand(i) returns an Object3D with a joints map:

const hand = renderer.xr.getHand(0);
scene.add(hand);

// Each joint is an Object3D that gets its transform updated per frame.
// Common joints you'll use:
const wrist        = hand.joints['wrist'];
const indexTip     = hand.joints['index-finger-tip'];
const thumbTip     = hand.joints['thumb-tip'];
const middleTip    = hand.joints['middle-finger-tip'];

The 25 joint names are a standard list — see the spec for the full enumeration.

Visualizing hands

import { XRHandModelFactory } from
  'three/addons/webxr/XRHandModelFactory.js';

const handFactory = new XRHandModelFactory();
hand.add(handFactory.createHandModel(hand, 'mesh'));
// variants: 'mesh' (rigged model), 'spheres' (joint dots), 'boxes' (joint boxes)

'mesh' gets you a realistic hand rig. 'spheres' is useful for debugging or stylized apps.

Pinch detection

function isPinching(hand) {
  const i = hand.joints['index-finger-tip'];
  const t = hand.joints['thumb-tip'];
  if (!i || !t) return false;
  return i.position.distanceTo(t.position) < 0.02;   // 2cm threshold
}

The WebXR hand spec also emits 'selectstart'/'selectend' events on hands when pinching, so in practice you can use the same event-driven code for both controllers and hands. That's the magic — write once, works on both.

Grab-and-throw with physics (the payoff)

Tie S2-01's Rapier + S2-05's WebXR into one system. The pattern:

  1. On selectstart, raycast from the controller forward
  2. Find the nearest Rapier-backed mesh within reach
  3. Switch its body to kinematic and parent the visual to the controller
  4. Every frame, set body position from the controller position
  5. Track the velocity you imparted
  6. On selectend, switch back to dynamic and apply the tracked linear velocity via setLinvel

function onGrab(controller, event) {
  const hit = raycastGrabbable(controller);
  if (!hit) return;

  // Switch to kinematic while held
  hit.body.setBodyType(RAPIER.RigidBodyType.KinematicPositionBased, true);
  controller.userData.held = hit;
  controller.userData.lastPos = controller.getWorldPosition(new THREE.Vector3());
}

function updateHeld(controller, dt) {
  const held = controller.userData.held;
  if (!held) return;

  const p = controller.getWorldPosition(new THREE.Vector3());
  const q = controller.getWorldQuaternion(new THREE.Quaternion());

  held.body.setNextKinematicTranslation({ x: p.x, y: p.y, z: p.z });
  held.body.setNextKinematicRotation({ x: q.x, y: q.y, z: q.z, w: q.w });

  // Track velocity for throw — instant velocity from last frame
  controller.userData.linvel = p.clone().sub(controller.userData.lastPos).divideScalar(dt);
  controller.userData.lastPos = p;
}

function onRelease(controller) {
  const held = controller.userData.held;
  if (!held) return;
  held.body.setBodyType(RAPIER.RigidBodyType.Dynamic, true);
  const v = controller.userData.linvel ?? new THREE.Vector3();
  held.body.setLinvel({ x: v.x, y: v.y, z: v.z }, true);
  controller.userData.held = null;
}
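The raycastGrabbable() helper used in onGrab might look like this. A sketch under assumptions: a module-level grabbables array of meshes whose userData.body holds the Rapier body, and a 1 m arm's-reach cutoff:

```javascript
// Hypothetical helper: ray from the controller down -Z, nearest
// physics-backed mesh within reach.
function raycastGrabbable(controller) {
  const ray = new THREE.Raycaster();
  ray.far = 1;                                  // arm's reach, in meters
  ray.set(
    controller.getWorldPosition(new THREE.Vector3()),
    new THREE.Vector3(0, 0, -1)
      .applyQuaternion(controller.getWorldQuaternion(new THREE.Quaternion())),
  );
  const hit = ray.intersectObjects(grabbables, false)[0];
  return hit ? { mesh: hit.object, body: hit.object.userData.body } : null;
}
```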

Two details that matter:

  • setBodyType at grab and release swaps the body between kinematic and dynamic. Rapier handles this transition smoothly.
  • Velocity tracking over the last frame is what makes throws feel right. Smooth over 3–5 frames (moving average) if you want natural-feeling arcs.

Haptics — the buzz

function pulse(inputSource, intensity = 0.5, durationMs = 30) {
  const actuator = inputSource?.gamepad?.hapticActuators?.[0];
  actuator?.pulse(intensity, durationMs);
}

Intensity 0–1, duration in milliseconds. Use sparingly — 10–50ms is the sweet spot for "click" feedback; anything longer feels buzzy. Fire on grab, on collision events, on button press for UI confirmation.

Vision Pro currently has no controller haptics exposed (it's a hand-tracking-only device). Quest, Index, Vive do.

World-space UI — buttons in 3D

Build buttons as regular meshes. Add a userData.isButton = true flag. Raycast from the controllers to them each frame:

const uiRay = new THREE.Raycaster();   // reuse one raycaster — no per-frame allocation

function rayButtons(controller) {
  uiRay.set(
    controller.getWorldPosition(new THREE.Vector3()),
    new THREE.Vector3(0, 0, -1).applyQuaternion(controller.getWorldQuaternion(new THREE.Quaternion())),
  );
  const hits = uiRay.intersectObjects(uiButtons, false);
  return hits[0]?.object;
}

controller0.addEventListener('selectstart', () => {
  const hit = rayButtons(controller0);
  if (hit?.userData.isButton) hit.userData.onClick?.();
});

Add a laser visualization (a line parented to the controller). Add a reticle at the hit point. Add haptic on hover. That's the complete XR UI pattern — no library needed.
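The laser itself is a few lines: a two-point line parented to the controller so it follows the aim automatically. A sketch; the 3 m length, white color, and addLaser name are our own choices:

```javascript
// Hypothetical helper: a thin line from the controller origin along -Z.
function addLaser(controller, length = 3) {
  const geometry = new THREE.BufferGeometry().setFromPoints([
    new THREE.Vector3(0, 0, 0),
    new THREE.Vector3(0, 0, -length),
  ]);
  const line = new THREE.Line(geometry, new THREE.LineBasicMaterial({ color: 0xffffff }));
  line.name = 'laser';
  controller.add(line);      // parented: inherits the controller's transform
  return line;
}
```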

Poke interactions — finger-tip buttons

For hand-tracked apps, "poke" is often better than "ray aim" — the user pushes the button with their index fingertip. Simpler UX, feels more direct.

function pokeButtons(hand) {
  const tip = hand.joints['index-finger-tip'];
  if (!tip) return;
  for (const btn of uiButtons) {
    const tipPos = tip.getWorldPosition(new THREE.Vector3());   // world space, to match the button
    const d = tipPos.distanceTo(btn.getWorldPosition(new THREE.Vector3()));
    if (d < 0.03 && !btn.userData.wasPressed) {
      btn.userData.wasPressed = true;
      btn.userData.onClick?.();
    } else if (d > 0.06) {
      btn.userData.wasPressed = false;
    }
  }
}

Hysteresis (enter at 3cm, exit at 6cm) prevents machine-gun firing on small jitter.

Teleport locomotion

const cameraRig = new THREE.Group();
cameraRig.add(camera);
scene.add(cameraRig);

function teleport(targetPos) {
  // Move the RIG, not the camera — camera position comes from head pose
  cameraRig.position.copy(targetPos);
}

controller0.addEventListener('selectend', () => {
  const hit = raycastFloor(controller0);
  if (hit) teleport(hit.point);
});

The crucial bit: move the rig, not the camera. The camera's transform is owned by WebXR (it's the head pose). You adjust world coordinates by moving the parent group.
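The raycastFloor() helper used above can be the same aim ray limited to the floor. A sketch assuming a module-level floorMesh (our own name for your walkable ground plane):

```javascript
// Hypothetical helper: ray from the controller down -Z, hit-test against
// the floor only. Returns the intersection (with .point) or null.
function raycastFloor(controller) {
  const ray = new THREE.Raycaster();
  ray.set(
    controller.getWorldPosition(new THREE.Vector3()),
    new THREE.Vector3(0, 0, -1)
      .applyQuaternion(controller.getWorldQuaternion(new THREE.Quaternion())),
  );
  return ray.intersectObject(floorMesh, false)[0] ?? null;
}
```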

Common first-time pitfalls

  • Controllers appear at origin. You didn't scene.add(controller). Target ray / grip / hand all need to be in the scene graph to get their transforms updated.
  • Hand tracking doesn't work. You didn't request 'hand-tracking' in optionalFeatures.
  • Events fire twice. You attached the same listener on both the controller AND the grip for the same input source. Pick one.
  • Teleport snaps my head, not the world. You moved camera instead of the rig group. The XR layer overwrites camera per frame.
  • Pinch detected constantly. Your distance threshold is too loose. 2cm ≈ a firm pinch; 5cm picks up when you're not trying to.
  • Haptics silent. Vision Pro doesn't have them. Also: actuator might be hapticActuators[0] or not exist on older devices. Always guard.

Exercises

  1. Two-hand scaling: grab a mesh with both hands, scale based on the distance between controllers.
  2. Spatial menu: pinch on the non-dominant hand → open a radial menu at its position. Pinch-point on a menu item to select.
  3. Velocity-matched throw: improve the throw from a 1-frame velocity to a 5-frame moving average. Throws will feel considerably more natural.

What's next

Article S2-07 — Multiplayer Foundations. Presence and cursors in a shared 3D space via Y.js + WebSockets. Two browser tabs, same scene, instant sync.