Three.js From Zero · Article s10-01

AI Tools for 3D — the 2026 Landscape

Text to 3D. Image to 3D. Photo to splat. Shader from prompt. The AI-for-3D stack, what each tool is best at, and when to reach for which.

1. The categories

Category            | Input          | Output           | Use case
Text → 3D mesh      | Prompt         | glTF             | Characters, props, concept
Image → 3D mesh     | PNG/sketch     | glTF             | From existing art to model
Photo set → splat   | 20-100 photos  | .splat / .ply    | Real-world capture
Video → 4D          | Clip           | Animated splat   | Volumetric capture
Text → texture      | Prompt         | PBR maps         | Material variations
Text → animation    | Prompt         | BVH / FBX        | Motion for characters
LLM for NPCs        | Player input   | Dialog + actions | Agents in game

2. Text-to-3D — the current leaders

  • Meshy (meshy.ai): a good general-purpose option with a web UI and an API. Meshes are usable.
  • Rodin (hyperhuman.top): fast. Great for game-ready assets with textures.
  • Trellis (Microsoft): open source. State of the art among research models as of mid-2025.
  • Scenario: asset packs, style-consistent.
  • Luma Genie: conversational, animation support.

3. Image-to-3D

  • Rodin, Meshy, Trellis: all support image input.
  • Stable Fast 3D (Stability AI): generates a model from one image in about 2 seconds.
  • CSM: character-focused from a reference image.

4. Photos-to-splat

Gaussian Splatting has overtaken NeRF as the go-to capture technique for 2024-2026.

  • Luma AI (lumalabs.ai): phone app + web. 20-100 photos → view-dependent 3D.
  • Polycam: LiDAR + photo capture.
  • Postshot: desktop training, best quality.
  • Nerfstudio: open source training pipeline.
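
Whichever tool does the training, a .ply splat export starts with a plain-text PLY header followed by the payload. As a minimal sketch, the function below reads that header to find how many splats a file holds before any rendering code touches it. It assumes the common layout where each Gaussian is stored as one "vertex" element and the header uses "\n" line endings; `parsePlyHeader`, `PlyHeader`, and the 64 KB cap are illustrative names and values, not part of any tool listed above.

```typescript
// Minimal summary of a PLY header, enough to size buffers before decoding.
interface PlyHeader {
  format: string;        // e.g. "binary_little_endian"
  vertexCount: number;   // assumption: one "vertex" element per Gaussian
  properties: string[];  // vertex property names, in file order
  bodyOffset: number;    // byte index where the payload starts
}

function parsePlyHeader(bytes: Uint8Array): PlyHeader {
  const marker = "end_header\n";
  const limit = Math.min(bytes.length, 64 * 1024); // assumed sanity cap
  let text = "";
  // The header is ASCII, so decode byte by byte until the end marker.
  for (let i = 0; i < limit && !text.endsWith(marker); i++) {
    text += String.fromCharCode(bytes[i]);
  }
  if (!text.startsWith("ply") || !text.endsWith(marker)) {
    throw new Error("Not a PLY file, or header exceeds the size cap");
  }

  let format = "";
  let vertexCount = 0;
  let inVertexElement = false;
  const properties: string[] = [];

  for (const line of text.split("\n")) {
    const parts = line.trim().split(/\s+/);
    if (parts[0] === "format") {
      format = parts[1];
    } else if (parts[0] === "element") {
      // Properties that follow belong to this element until the next one.
      inVertexElement = parts[1] === "vertex";
      if (inVertexElement) vertexCount = Number(parts[2]);
    } else if (parts[0] === "property" && inVertexElement) {
      properties.push(parts[parts.length - 1]); // last token is the name
    }
  }
  return { format, vertexCount, properties, bodyOffset: text.length };
}
```

In a browser this could be fed the first chunk of a fetched file, for example `new Uint8Array(await response.arrayBuffer())`, to show a splat count or reject oversized captures before decoding the body.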

5. Textures

  • Scenario: material / style-consistent textures.
  • Material Maker with Stable Diffusion: tileable textures generated locally.
  • DreamBooth/LoRA on Flux: fine-tune on your brand's style.
  • Upscalers: Real-ESRGAN for low-res → 4K.

6. Animation

  • Cascadeur: AI-assisted keyframing, physics-aware.
  • Rokoko Vision: video → mocap BVH.
  • DeepMotion: webcam-to-animation.
  • Mixamo + MotionGPT: text-to-motion fine-tunes.

7. LLM integrations

  • Anthropic Claude / OpenAI GPT-4: use function calling to trigger 3D actions.
  • Inworld AI: pre-packaged game NPC platform.
  • Convai: another NPC platform.
  • Direct API: expect 500 ms to 2 s of latency, so cache common responses.
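
The two ideas in this list, function-call actions and response caching, can be sketched without committing to any vendor SDK. In the sketch below the model call is an injected function, `askModel`, assumed to return the tool call as a JSON string; the action names (`say`, `move_to`, `play_animation`) and the `CachedNpc` class are hypothetical. Model output is treated as untrusted and validated before it reaches the scene, and replies are cached per normalized player input so repeated lines skip the network round trip.

```typescript
// Actions the model is allowed to request. Names are hypothetical.
type NpcAction =
  | { name: "say"; args: { text: string } }
  | { name: "move_to"; args: { x: number; y: number; z: number } }
  | { name: "play_animation"; args: { clip: string } };

// Validate untrusted model output; anything unexpected becomes null.
function parseAction(raw: string): NpcAction | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { name, args } = data as { name?: unknown; args?: unknown };
  if (typeof args !== "object" || args === null) return null;
  const { text, clip, x, y, z } = args as Record<string, unknown>;

  if (name === "say" && typeof text === "string") {
    return { name: "say", args: { text } };
  }
  if (name === "play_animation" && typeof clip === "string") {
    return { name: "play_animation", args: { clip } };
  }
  if (
    name === "move_to" &&
    typeof x === "number" && typeof y === "number" && typeof z === "number"
  ) {
    return { name: "move_to", args: { x, y, z } };
  }
  return null; // unknown action or malformed arguments
}

class CachedNpc {
  private cache = new Map<string, Promise<NpcAction | null>>();
  private askModel: (playerInput: string) => Promise<string>;

  constructor(askModel: (playerInput: string) => Promise<string>) {
    this.askModel = askModel;
  }

  // Caching the promise means concurrent identical inputs share one request.
  act(playerInput: string): Promise<NpcAction | null> {
    const key = playerInput.trim().toLowerCase();
    let pending = this.cache.get(key);
    if (!pending) {
      pending = this.askModel(playerInput).then(parseAction);
      // Do not keep failed requests in the cache.
      pending.catch(() => { this.cache.delete(key); });
      this.cache.set(key, pending);
    }
    return pending;
  }
}
```

A caller would switch on the returned `name` and apply it to the scene, for example moving a mesh for `move_to`. An unbounded `Map` is fine for a demo; a real game would likely want a size limit or expiry.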

8. Pipeline integration

# Typical 2026 workflow
prompt → Meshy → .glb
  → gltf-transform cleanup
  → Blender hand-polish
  → gltf-transform compress
  → Three.js GLTFLoader
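
One way to make the workflow above repeatable is to write it down as a plan of CLI steps. The sketch below maps "cleanup" to the glTF Transform CLI's `dedup` and `prune` commands and "compress" to its `draco` command; that mapping is an assumption about what the steps mean, and the file naming scheme and function names are made up for illustration. The Blender pass stays a manual step.

```typescript
// A pipeline step is either a CLI invocation or a note for a human.
type Step =
  | { kind: "cli"; argv: string[] }
  | { kind: "manual"; note: string };

// Plan the steps for one generated asset. File names are hypothetical.
function planAssetPipeline(name: string): Step[] {
  const raw = `${name}.raw.glb`;           // as downloaded from the generator
  const deduped = `${name}.dedup.glb`;
  const clean = `${name}.clean.glb`;
  const polished = `${name}.polished.glb`; // exported from Blender by hand
  const final = `${name}.glb`;             // what the web app loads

  return [
    // Cleanup: merge duplicate resources, then drop unreferenced ones.
    { kind: "cli", argv: ["gltf-transform", "dedup", raw, deduped] },
    { kind: "cli", argv: ["gltf-transform", "prune", deduped, clean] },
    // Hand polish: topology, UVs, and branding fixes are done by a person.
    { kind: "manual", note: `Polish ${clean} in Blender and export ${polished}` },
    // Compress: Draco geometry compression for delivery.
    { kind: "cli", argv: ["gltf-transform", "draco", polished, final] },
  ];
}

// Render the plan as shell lines. Assumes file names contain no spaces.
function toShellLines(steps: Step[]): string[] {
  return steps.map((step) =>
    step.kind === "cli" ? step.argv.join(" ") : `# manual: ${step.note}`
  );
}
```

Printing `toShellLines(planAssetPipeline("crate")).join("\n")` gives a script skeleton to run by hand or from CI. Because the last step applies Draco compression, the Three.js side needs a DRACOLoader attached to the GLTFLoader through `setDRACOLoader` before loading the final file.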

9. Where humans still win

  • Hero assets: AI gets you 80% of the way; a human takes it to 100%.
  • Clean topology for animation: AI loses.
  • Specific IP look: AI needs fine-tuning.
  • Tiny details, branding: human.

10. Article plan for Season 10

  1. S10-02: Gaussian Splatting rendering in browser.
  2. S10-03: Lightweight NeRF for web.
  3. S10-04: LLM-driven NPCs.
  4. S10-05: TTS + lip sync.
  5. S10-06: MediaPipe body/face tracking → avatar.
  6. S10-07: Generative textures authoring.
  7. S10-08: Generative meshes pipeline.
  8. S10-09: AI-assisted shader/scene authoring.
  9. S10-10: Series finale — the future.

11. Takeaways

  • Text-to-3D: Meshy, Rodin, Trellis for general use.
  • Image-to-3D: same tools + Stable Fast 3D.
  • Photos-to-splat: Luma AI (phone), Postshot (desktop).
  • LLM for NPCs: direct API or Inworld/Convai.
  • Mocap: Rokoko Vision for video-in, Cascadeur for polish.
  • Pipeline: AI for rough, human for polish.

This article is the landscape overview; subsequent articles dive into each tool.