
Text-to-Animation Engine (TTA)

A production-grade multi-agent system that transforms raw academic topics into fully narrated, animated educational videos with a 99.4% success rate — deployed at Edza.ai.

Client

Edza.ai

Year

2024

Category

Generative AI

Built at

NatrajX


Impact

99.4% animation success rate in production

P95 end-to-end latency of 85 seconds

<100ms audio-video synchronisation

Covers Physics, Chemistry, Mathematics (PCM)

Key Metrics

Success Rate

99.4%

Latency (P95)

85s end-to-end

Audio Sync

<100ms

Subjects

PCM (Physics, Chemistry, Maths)

Tech Stack

Python 3.11 · FastAPI · Manim · Google Vertex AI · Redis · AST Parsing · TTS Pipeline

1. The Problem

Creating a single high-quality educational animation requires subject-matter experts, structured scriptwriting, animation engineering, and voice and post-production work — a largely sequential, human-intensive workflow that is impossible to scale.

2. The Insight: LLMs Lack Spatial Awareness

Early experiments asking LLMs to write Manim scripts directly failed ~60% of the time: models called non-existent functions, placed text labels over diagrams, and produced invalid syntax. Pure generation is unreliable for structured animation code.

3. Architecture: Neuro-Symbolic Pipeline

The solution is a hybrid approach: use LLMs for high-level reasoning and content synthesis, but confine them within strict, deterministic code scaffolds.

  • Subject Classifier — routes topic to domain-specific template engine (confidence > 85%)
  • Template Engine — subject-specific Manim scaffolds for Maths, Physics, Chemistry, Organic Chemistry, CS
  • Wikipedia Fallback Agent — grounded generation for low-confidence or unsupported topics
  • Orchestration Layer — manages routing, validation, regeneration, rendering, and storage
  • TTS + Sync Layer — audio narration with <100ms video synchronisation
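The validation step in the orchestration layer can be sketched as follows. This is a minimal, illustrative example, not the production code: it uses Python's standard `ast` module (which the tech stack lists) to reject syntactically invalid or structurally empty Manim scripts before rendering, and re-prompts the generator on failure. The names `validate_scene_code`, `generate_with_retries`, and `MAX_ATTEMPTS` are hypothetical.

```python
import ast

# Illustrative sketch only: names and thresholds are assumptions,
# not the production API.
MAX_ATTEMPTS = 3


def validate_scene_code(code: str) -> list[str]:
    """Return a list of validation errors; an empty list means the code passed."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg} (line {exc.lineno})"]
    # Require at least one class definition (a Manim Scene subclass).
    if not any(isinstance(node, ast.ClassDef) for node in ast.walk(tree)):
        return ["no Scene class found"]
    return []


def generate_with_retries(generate_fn, topic: str) -> str:
    """Call the (hypothetical) LLM generator, re-prompting until validation passes."""
    for _ in range(MAX_ATTEMPTS):
        code = generate_fn(topic)
        if not validate_scene_code(code):
            return code
    raise RuntimeError(f"validation failed after {MAX_ATTEMPTS} attempts")
```

Because `ast.parse` only parses (it never executes the code), this gate is cheap and safe to run on untrusted model output before any rendering starts.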

4. Confidence-Gated Routing


def route_topic(topic: str, subject: str, confidence: float):
    """Route a classified topic: high-confidence subjects go to the
    domain-specific template engine; everything else falls back to
    grounded generation from Wikipedia context."""
    if confidence >= 0.85:
        return use_domain_template(subject, topic)
    else:
        return wikipedia_fallback_pipeline(topic)

5. Template Architecture


def _get_template_for_subject(self, subject: str) -> str:
    """Map a classified subject to its Manim scaffold; unknown
    subjects default to the mathematics template."""
    template_map = {
        "mathematics":          MATH_TEMPLATE,
        "physics":              PHYSICS_TEMPLATE,
        "chemistry":            PHYSICAL_CHEMISTRY_TEMPLATE,
        "organic_chemistry":    ORGANIC_CHEMISTRY_TEMPLATE,
        "computer_science":     COMPUTER_SCIENCE_TEMPLATE,
    }
    return template_map.get(subject, MATH_TEMPLATE)
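The scaffolding mechanism itself can be illustrated with a simplified stand-in for one of these templates. The real scaffolds are far richer, but the principle is the same: the LLM supplies only narrow slot values, while the surrounding Manim structure is fixed and deterministic. The template body and all slot names below are illustrative, not the production templates.

```python
# Simplified stand-in for a subject scaffold: the structure is fixed;
# only the named slots are filled from model output.
MATH_TEMPLATE = '''\
from manim import Scene, MathTex, Write

class {class_name}(Scene):
    def construct(self):
        title = MathTex(r"{title_tex}")
        self.play(Write(title))
        self.wait({pause_seconds})
'''


def fill_template(template: str, **slots) -> str:
    """Fill LLM-supplied slot values into the fixed scaffold."""
    return template.format(**slots)


scene_code = fill_template(
    MATH_TEMPLATE,
    class_name="QuadraticFormula",
    title_tex=r"x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}",
    pause_seconds=2,
)
```

Constraining the model to slot values is what makes the output validate reliably: the scaffold can never call a non-existent function or break the scene structure, because that part was never generated.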

6. Results

  • 99.4% success rate (up from ~40% with pure generation)
  • P95 latency: 85 seconds for a fully rendered, narrated video
  • Deployed across PCM subjects at Edza.ai

7. Key Learnings

  • Neuro-symbolic is more robust than pure generation for structured outputs
  • Confidence-gated routing prevents template mismatch failures
  • Isolating failure boundaries enables independent component testing

This project was built at NatrajX — an AI/IT engineering agency.

Full engineering write-up, system architecture, and production metrics available on the agency site.

Full Case Study ↗