Three.js From Zero · Article s10-01
AI Tools for 3D — the 2026 Landscape
Text to 3D. Image to 3D. Photo to splat. Shader from prompt. The AI-for-3D stack, what each tool is best at, and when to reach for which.
1. The categories
| Category | Input | Output | Use case |
|---|---|---|---|
| Text → 3D mesh | Prompt | glTF | Characters, props, concept |
| Image → 3D mesh | PNG/sketch | glTF | From existing art to model |
| Photo set → splat | 20-100 photos | .splat / .ply | Real-world capture |
| Video → 4D | Clip | Animated splat | Volumetric capture |
| Text → texture | Prompt | PBR maps | Material variations |
| Text → animation | Prompt | BVH / FBX | Motion for characters |
| LLM for NPCs | Player input | Dialog + actions | Agents in game |
2. Text-to-3D — the current leaders
- Meshy (meshy.ai): a good general-purpose option with a web UI and an API. Output meshes are usable.
- Rodin (hyperhuman.top): fast. Great for game-ready assets with textures.
- Trellis (Microsoft): open source. State of the art among research models as of mid-2025.
- Scenario: asset packs, style-consistent.
- Luma Genie: conversational, animation support.
3. Image-to-3D
- Rodin, Meshy, Trellis: all support image input.
- Stable Fast 3D (Stability AI): a model from a single image in about 2 seconds.
- CSM: character-focused from a reference image.
4. Photos-to-splat
For real-world capture, Gaussian Splatting has been the 2024-2026 winner over NeRF.
- Luma AI (lumalabs.ai): phone app + web. 20-100 photos → view-dependent 3D.
- Polycam: LiDAR + photo capture.
- Postshot: desktop training, best quality.
- Nerfstudio: open source training pipeline.
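Splat captures are commonly exported as `.ply`, and the text header at the top of the file already says how many points (Gaussians) the capture holds. The following is a minimal sketch, not any tool's official parser: `parsePlyHeader` and `PlyHeader` are hypothetical names, and it assumes a standard PLY header with scalar properties only. It can be handy for estimating download and memory cost before loading a capture in the browser.

```typescript
// Byte sizes of the scalar property types defined by the PLY format.
const PLY_TYPE_BYTES: Record<string, number> = {
  char: 1, uchar: 1, int8: 1, uint8: 1,
  short: 2, ushort: 2, int16: 2, uint16: 2,
  int: 4, uint: 4, int32: 4, uint32: 4,
  float: 4, float32: 4, double: 8, float64: 8,
};

// Hypothetical summary type for this sketch.
interface PlyHeader {
  format: string;          // e.g. "binary_little_endian"
  vertexCount: number;     // one vertex per Gaussian in a splat capture
  properties: string[];    // property names of the vertex element
  bytesPerVertex: number;  // sum of the property sizes
}

function parsePlyHeader(text: string): PlyHeader {
  const end = text.indexOf("end_header");
  if (text.slice(0, 3) !== "ply" || end === -1) {
    throw new Error("Not a PLY header");
  }
  let format = "";
  let vertexCount = 0;
  let inVertexElement = false;
  const properties: string[] = [];
  let bytesPerVertex = 0;

  for (const line of text.slice(0, end).split(/\r?\n/)) {
    const parts = line.trim().split(/\s+/);
    if (parts[0] === "format") {
      format = parts[1];
    } else if (parts[0] === "element") {
      // Only properties that follow "element vertex N" belong to the points.
      inVertexElement = parts[1] === "vertex";
      if (inVertexElement) vertexCount = parseInt(parts[2], 10);
    } else if (parts[0] === "property" && inVertexElement) {
      const size = PLY_TYPE_BYTES[parts[1]];
      if (size === undefined) {
        // List properties and unknown types are out of scope for this sketch.
        throw new Error("Unsupported property type: " + parts[1]);
      }
      properties.push(parts[2]);
      bytesPerVertex += size;
    }
  }
  return { format, vertexCount, properties, bytesPerVertex };
}
```

For a binary file, `vertexCount * bytesPerVertex` gives a rough size of the point data that follows the header. In practice you would decode only the first few kilobytes of the file to text and pass that in.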
5. Textures
- Scenario: material / style-consistent textures.
- Material Maker with Stable Diffusion: tileable textures generated locally.
- DreamBooth/LoRA on Flux: fine-tune on your brand's visual style.
- Upscalers: Real-ESRGAN for low-res → 4K.
6. Animation
- Cascadeur: AI-assisted keyframing, physics-aware.
- Rokoko Vision: video → mocap BVH.
- DeepMotion: webcam-to-animation.
- Mixamo + Motion-GPT: text-to-motion fine-tunes.
7. LLM integrations
- Anthropic Claude / OpenAI GPT-4: use function calling to trigger 3D actions in the scene.
- Inworld AI: pre-packaged game NPC platform.
- Convai: another NPC platform.
- Direct API: expect 500 ms to 2 s of latency per call, so cache common responses.
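The caching advice above can be sketched as a small wrapper around whatever function calls the model. This is an illustration under assumptions, not a specific vendor's API: `NpcReply`, `AskModel`, `normalizeInput`, and `withCache` are hypothetical names, and the size and time-to-live defaults are arbitrary. It stores the in-flight promise, so repeated greetings reuse one request instead of paying the round trip again.

```typescript
// Hypothetical shape of one NPC turn: a line of dialog plus an optional action name.
interface NpcReply {
  dialog: string;
  action?: string;
}

// Any async function that turns player input into a reply,
// for example a thin wrapper around an LLM API call.
type AskModel = (playerInput: string) => Promise<NpcReply>;

// Normalize so that "Hello!" and "  hello " share one cache entry.
function normalizeInput(input: string): string {
  return input
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

function withCache(ask: AskModel, maxEntries = 200, ttlMs = 10 * 60 * 1000): AskModel {
  // Map keeps insertion order, so the first key is the least recently used.
  const cache = new Map<string, { reply: Promise<NpcReply>; expires: number }>();

  return (playerInput: string) => {
    const key = normalizeInput(playerInput);
    const now = Date.now();
    const hit = cache.get(key);
    if (hit && hit.expires > now) {
      // Re-insert to mark the entry as recently used.
      cache.delete(key);
      cache.set(key, hit);
      return hit.reply;
    }

    const reply = ask(playerInput);
    cache.set(key, { reply, expires: now + ttlMs });

    // Do not keep failed requests in the cache.
    reply.catch(() => {
      const current = cache.get(key);
      if (current && current.reply === reply) cache.delete(key);
    });

    // Evict the oldest entry once the cache is over budget.
    if (cache.size > maxEntries) {
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    return reply;
  };
}
```

Caching on normalized text only suits stateless small talk such as greetings or shop prompts. Replies that depend on conversation history or game state would need that state in the cache key.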
8. Pipeline integration
# Typical 2026 workflow
prompt → Meshy → .glb
→ gltf-transform cleanup
→ Blender hand-polish
→ gltf-transform compress
→ Three.js GLTFLoader
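Between the generator and GLTFLoader it can help to sanity-check each `.glb`, since a truncated download or an unexpectedly heavy file is cheaper to catch before the cleanup steps. The sketch below reads the binary glTF 2.0 container header and the JSON chunk using only built-in APIs. `inspectGlb` and `GlbSummary` are hypothetical names for this illustration, and the function is not part of Three.js or gltf-transform.

```typescript
const GLB_MAGIC = 0x46546c67;  // ASCII "glTF" read as a little-endian uint32
const CHUNK_JSON = 0x4e4f534a; // ASCII "JSON" read as a little-endian uint32

// Hypothetical summary type for this sketch.
interface GlbSummary {
  version: number;
  byteLength: number;
  meshCount: number;
  materialCount: number;
  extensionsUsed: string[];
}

function inspectGlb(buffer: ArrayBuffer): GlbSummary {
  // 12-byte file header plus the 8-byte header of the first chunk.
  if (buffer.byteLength < 20) throw new Error("Too small to be a GLB file");
  const view = new DataView(buffer);

  if (view.getUint32(0, true) !== GLB_MAGIC) throw new Error("Missing glTF magic bytes");
  const version = view.getUint32(4, true);
  if (version !== 2) throw new Error("Unsupported GLB version: " + version);

  const byteLength = view.getUint32(8, true);
  if (byteLength !== buffer.byteLength) {
    throw new Error("Header length does not match file size (truncated download?)");
  }

  // In a GLB file the first chunk must be the JSON scene description.
  const chunkLength = view.getUint32(12, true);
  const chunkType = view.getUint32(16, true);
  if (chunkType !== CHUNK_JSON) throw new Error("First chunk is not JSON");
  if (20 + chunkLength > buffer.byteLength) throw new Error("JSON chunk overruns the file");

  const jsonText = new TextDecoder().decode(new Uint8Array(buffer, 20, chunkLength));
  const gltf = JSON.parse(jsonText);
  const count = (value: unknown): number => (Array.isArray(value) ? value.length : 0);

  return {
    version,
    byteLength,
    meshCount: count(gltf.meshes),
    materialCount: count(gltf.materials),
    extensionsUsed: Array.isArray(gltf.extensionsUsed) ? gltf.extensionsUsed : [],
  };
}
```

The `extensionsUsed` list is useful at the loading end as well: it tells you whether the file needs a decoder (for example Draco or Meshopt) configured on the loader before it will parse.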
9. Where humans still win
- Hero assets: AI gets you about 80% of the way; a human artist takes it to 100%.
- Clean topology for animation: AI output still falls short.
- Specific IP look: AI needs fine-tuning.
- Tiny details, branding: still a human job.
10. Article plan for Season 10
- S10-02: Gaussian Splatting rendering in browser.
- S10-03: Lightweight NeRF for web.
- S10-04: LLM-driven NPCs.
- S10-05: TTS + lip sync.
- S10-06: MediaPipe body/face tracking → avatar.
- S10-07: Generative textures authoring.
- S10-08: Generative meshes pipeline.
- S10-09: AI-assisted shader/scene authoring.
- S10-10: Series finale — the future.
11. Takeaways
- Text-to-3D: Meshy, Rodin, Trellis for general use.
- Image-to-3D: same tools + Stable Fast 3D.
- Photos-to-splat: Luma AI (phone), Postshot (desktop).
- LLM for NPCs: direct API or Inworld/Convai.
- Mocap: Rokoko Vision for video-in, Cascadeur for polish.
- Pipeline: AI for the rough pass, human for polish.
This article is the landscape overview; subsequent articles dive into each tool.