Three.js From Zero · Article s10-01

AI Tools for 3D — the 2026 Landscape

Text to 3D. Image to 3D. Photo to splat. Shader from prompt. The AI-for-3D stack, what each tool is best at, and when to reach for which.

1. The categories

Category	Input	Output	Use case
Text → 3D mesh	Prompt	glTF	Characters, props, concept
Image → 3D mesh	PNG/sketch	glTF	From existing art to model
Photo set → splat	20-100 photos	.splat / .ply	Real-world capture
Video → 4D	Clip	Animated splat	Volumetric capture
Text → texture	Prompt	PBR maps	Material variations
Text → animation	Prompt	BVH / FBX	Motion for characters
LLM for NPCs	Player input	Dialog + actions	Agents in game

2. Text-to-3D — the current leaders

Meshy (meshy.ai): solid general-purpose choice with a web UI and an API. Meshes are usable as a starting point.
Rodin (hyperhuman.top): fast. Great for game-ready assets with textures.
Trellis (Microsoft): open source. Among the strongest research models as of mid-2025.
Scenario: style-consistent asset packs.
Luma Genie: conversational, animation support.

3. Image-to-3D

Rodin, Meshy, Trellis: all support image input.
Stable Fast 3D: Stability AI's single-image model. Stability AI quotes about half a second per asset on a high-end GPU.
CSM: character-focused from a reference image.

4. Photos-to-splat

For real-world capture, Gaussian Splatting has been the 2024-2026 winner over NeRF, mainly because splats rasterize in real time.

Luma AI (lumalabs.ai): phone app + web. 20-100 photos → view-dependent 3D.
Polycam: LiDAR + photo capture.
Postshot: desktop training, best quality.
Nerfstudio: open source training pipeline.
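
Before shipping a captured splat to the browser, it helps to know how many splats the file holds. The sketch below reads only the ASCII header of a .ply file, assuming the standard PLY layout (a `ply` line, `element vertex N`, `property` lines, `end_header`). The name `parsePlyHeader` and the 64 KB scan limit are illustrative assumptions, not part of any tool listed above.

```typescript
// Summary of a PLY header: format, splat count, and per-vertex property names.
interface PlyHeaderInfo {
  format: string;
  vertexCount: number;
  properties: string[];
  headerByteLength: number; // offset where the body starts
}

// Hypothetical helper: parses the ASCII header only, never the body.
function parsePlyHeader(bytes: Uint8Array): PlyHeaderInfo {
  // The header is plain ASCII; only scan the first 64 KB for it (assumed limit).
  const limit = Math.min(bytes.length, 64 * 1024);
  let text = "";
  for (let i = 0; i < limit; i++) text += String.fromCharCode(bytes[i] ?? 0);

  const endIndex = text.indexOf("end_header");
  const bodyStart = endIndex === -1 ? -1 : text.indexOf("\n", endIndex);
  if (!text.startsWith("ply") || bodyStart === -1) {
    throw new Error("Not a PLY file, or header exceeds the scan limit");
  }

  const info: PlyHeaderInfo = {
    format: "",
    vertexCount: 0,
    properties: [],
    headerByteLength: bodyStart + 1,
  };
  let inVertexElement = false;
  for (const line of text.slice(0, endIndex).split("\n")) {
    const tokens = line.trim().split(/\s+/);
    if (tokens[0] === "format") {
      info.format = tokens[1] ?? "";
    } else if (tokens[0] === "element") {
      // Only properties that follow "element vertex" describe a splat.
      inVertexElement = tokens[1] === "vertex";
      if (inVertexElement) info.vertexCount = Number(tokens[2]);
    } else if (tokens[0] === "property" && inVertexElement) {
      // The property name is always the last token on the line.
      info.properties.push(tokens[tokens.length - 1] ?? "");
    }
  }
  return info;
}
```

In practice you would fetch the first few kilobytes of the file, call `parsePlyHeader`, and decide from `vertexCount` whether the capture needs pruning before it reaches a phone.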

5. Textures

Scenario: material / style-consistent textures.
Material Maker with Stable Diffusion: tileable textures, generated locally.
DreamBooth/LoRA on Flux: fine-tune your brand.
Upscalers: Real-ESRGAN for low-res → 4K.

6. Animation

Cascadeur: AI-assisted keyframing, physics-aware.
Rokoko Vision: video → mocap BVH.
DeepMotion: webcam-to-animation.
Mixamo + MotionGPT: Mixamo for auto-rigging and stock clips, MotionGPT-style models for text-to-motion.

7. LLM integrations

Anthropic Claude / OpenAI GPT-4: use function calling (tool use) to turn model output into 3D scene actions.
Inworld AI: pre-packaged game NPC platform.
Convai: another NPC platform.
Direct API: expect roughly 500 ms to 2 s of latency per call, so cache common responses.
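
The two ideas in this list, function-call actions and response caching, can be sketched together. This is a minimal sketch under stated assumptions, not any vendor's API: the action names (`say`, `moveTo`, `playAnimation`), the `ToolCall` shape, and the `askModel` callback are all hypothetical, and the real request to Claude, GPT-4, or another model would live inside `askModel`.

```typescript
// Actions the scene is willing to execute; anything else from the model is rejected.
type NpcAction =
  | { kind: "say"; text: string }
  | { kind: "moveTo"; x: number; z: number }
  | { kind: "playAnimation"; clip: string };

// Hypothetical shape of one tool call returned by the model.
type ToolCall = { name: string; argsJson: string };

// Validate an untrusted tool call into a typed action, or return null.
function toNpcAction(name: string, argsJson: string): NpcAction | null {
  let args: unknown;
  try { args = JSON.parse(argsJson); } catch { return null; }
  if (typeof args !== "object" || args === null) return null;
  const a = args as Record<string, unknown>;
  const text = a["text"], x = a["x"], z = a["z"], clip = a["clip"];
  if (name === "say" && typeof text === "string") return { kind: "say", text };
  if (name === "moveTo" && typeof x === "number" && typeof z === "number"
      && Number.isFinite(x) && Number.isFinite(z)) return { kind: "moveTo", x, z };
  if (name === "playAnimation" && typeof clip === "string") return { kind: "playAnimation", clip };
  return null;
}

// Small TTL cache keyed by NPC id plus normalized player input.
class ResponseCache {
  private entries = new Map<string, { actions: NpcAction[]; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }
  static key(npcId: string, playerInput: string): string {
    return npcId + "|" + playerInput.trim().toLowerCase().replace(/\s+/g, " ");
  }
  get(key: string): NpcAction[] | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) { this.entries.delete(key); return undefined; }
    return entry.actions;
  }
  set(key: string, actions: NpcAction[]): void {
    this.entries.set(key, { actions, expiresAt: this.now() + this.ttlMs });
  }
}

// Answer from cache when possible; otherwise pay the API latency once.
async function respond(
  npcId: string,
  playerInput: string,
  cache: ResponseCache,
  askModel: (input: string) => Promise<ToolCall[]>, // caller-supplied, hypothetical
): Promise<NpcAction[]> {
  const key = ResponseCache.key(npcId, playerInput);
  const cached = cache.get(key);
  if (cached) return cached;
  const actions: NpcAction[] = [];
  for (const call of await askModel(playerInput)) {
    const action = toNpcAction(call.name, call.argsJson);
    if (action) actions.push(action); // silently drop anything not allow-listed
  }
  cache.set(key, actions);
  return actions;
}
```

Validating against a fixed action list keeps the model from driving the scene in ways the game never intended, and normalizing the cache key lets "Hello there" and "  hello   THERE" share one cached answer.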

8. Pipeline integration

# Typical 2026 workflow
prompt → Meshy → .glb
  → gltf-transform cleanup
  → Blender hand-polish
  → gltf-transform compress
  → Three.js GLTFLoader
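
Generated assets occasionally arrive truncated or in an unexpected container, so a quick sanity check before the cleanup steps can save a confusing loader error later. The sketch below reads the GLB header and JSON chunk as laid out in the glTF 2.0 binary container format (12-byte header, then a JSON chunk). The helper name `inspectGlb` and the fields it reports are illustrative assumptions; it is not part of gltf-transform or Three.js.

```typescript
// What we report about a .glb before handing it to the rest of the pipeline.
interface GlbSummary {
  version: number;
  byteLength: number;
  generator: string | null;
  meshCount: number;
  materialCount: number;
}

const GLB_MAGIC = 0x46546c67;  // "glTF" read as a little-endian uint32
const JSON_CHUNK = 0x4e4f534a; // "JSON" read as a little-endian uint32

// Hypothetical helper: validates the container and summarizes the JSON chunk.
function inspectGlb(buffer: ArrayBuffer): GlbSummary {
  if (buffer.byteLength < 20) throw new Error("Too small to be a GLB file");
  const view = new DataView(buffer);
  if (view.getUint32(0, true) !== GLB_MAGIC) throw new Error("Missing glTF magic");
  const version = view.getUint32(4, true);
  if (version !== 2) throw new Error("Unsupported GLB version: " + version);
  const byteLength = view.getUint32(8, true);
  if (byteLength !== buffer.byteLength) {
    throw new Error("Declared length does not match file size (truncated download?)");
  }
  // The first chunk must be the JSON scene description.
  const jsonLength = view.getUint32(12, true);
  if (view.getUint32(16, true) !== JSON_CHUNK) throw new Error("First chunk is not JSON");
  if (20 + jsonLength > buffer.byteLength) throw new Error("JSON chunk overruns the file");

  const jsonText = new TextDecoder().decode(new Uint8Array(buffer, 20, jsonLength));
  const json = JSON.parse(jsonText) as {
    asset?: { generator?: unknown };
    meshes?: unknown;
    materials?: unknown;
  };
  const count = (value: unknown): number => (Array.isArray(value) ? value.length : 0);
  const generator = json.asset?.generator;
  return {
    version,
    byteLength,
    generator: typeof generator === "string" ? generator : null,
    meshCount: count(json.meshes),
    materialCount: count(json.materials),
  };
}
```

A typical use would be `inspectGlb(await (await fetch(url)).arrayBuffer())` right after the generation step, logging the mesh and material counts so that unusually heavy outputs are caught before anyone opens Blender.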

9. Where humans still win

Hero assets: AI gets you about 80% of the way; a human takes it to 100%.
Clean topology for animation: AI output still falls short.
Specific IP look: AI needs fine-tuning.
Tiny details, branding: human.

10. Article plan for Season 10

S10-02: Gaussian Splatting rendering in browser.
S10-03: Lightweight NeRF for web.
S10-04: LLM-driven NPCs.
S10-05: TTS + lip sync.
S10-06: MediaPipe body/face tracking → avatar.
S10-07: Generative textures authoring.
S10-08: Generative meshes pipeline.
S10-09: AI-assisted shader/scene authoring.
S10-10: Series finale — the future.

11. Takeaways

Text-to-3D: Meshy, Rodin, Trellis for general use.
Image-to-3D: same tools + Stable Fast 3D.
Photos-to-splat: Luma AI (phone), Postshot (desktop).
LLM for NPCs: direct API or Inworld/Convai.
Mocap: Rokoko Vision for video-in, Cascadeur for polish.
Pipeline: AI for rough, human for polish.

This article is a landscape overview; subsequent articles dive into each area.