AI Notes Generator

A production-grade pipeline that converts raw curriculum data into structured, visually rich PDF textbooks using multi-layer caching and layout-aware rendering.

Client

EdTech Platform

Year

2024

Category

Generative AI

Built at

NatrajX

Impact

Automated textbook generation from raw curriculum data

Multi-layer caching reduces repeat generation cost

Layout-aware PDF rendering with visual hierarchy

Generated PDFs stored on and delivered from Google Cloud Storage

Key Metrics

output

Structured PDF textbooks

caching

Multi-layer (in-memory + GCS)

rendering

Layout-aware, visual hierarchy

Tech Stack

Python 3.10, WeasyPrint, Jinja2, Google Cloud Storage, asyncio, BeautifulSoup4

1. Problem

Creating structured, print-ready study notes from raw curriculum data requires significant editorial effort. The goal was to automate this end-to-end with consistent formatting, visual hierarchy, and layout quality.

2. Pipeline Architecture

  • Content Ingestion — parse raw curriculum data via BeautifulSoup4
  • Template Rendering — Jinja2 HTML templates with layout-aware structure
  • PDF Engine — WeasyPrint for pixel-perfect HTML-to-PDF conversion
  • Cache Layer — multi-layer caching (in-memory + GCS) to avoid redundant generation
  • Storage — Google Cloud Storage with signed URL delivery
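The cache layer above can be sketched as a small two-layer lookup: check process memory first, fall back to a remote store, and promote remote hits into memory. This is a minimal sketch, not the project's implementation. `TwoLayerCache`, `RemoteStore`, and `DictStore` are hypothetical names, and `DictStore` is an in-process stand-in for the GCS layer so the example runs without cloud credentials.

```python
import asyncio
from typing import Optional, Protocol


class RemoteStore(Protocol):
    """Anything with async get/set; a GCS-backed store could implement this."""

    async def get(self, key: str) -> Optional[str]: ...
    async def set(self, key: str, value: str) -> None: ...


class DictStore:
    """In-process stand-in for the remote layer (assumption, not real GCS)."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}
        self.reads = 0  # counts how often the slow layer is consulted

    async def get(self, key: str) -> Optional[str]:
        self.reads += 1
        return self.data.get(key)

    async def set(self, key: str, value: str) -> None:
        self.data[key] = value


class TwoLayerCache:
    """Check process memory first, then the remote store; write to both."""

    def __init__(self, remote: RemoteStore) -> None:
        self._memory: dict[str, str] = {}
        self._remote = remote

    async def get(self, key: str) -> Optional[str]:
        if key in self._memory:
            return self._memory[key]
        value = await self._remote.get(key)
        if value is not None:
            self._memory[key] = value  # promote so the next lookup is local
        return value

    async def set(self, key: str, value: str) -> None:
        self._memory[key] = value
        await self._remote.set(key, value)


async def demo() -> tuple[Optional[str], Optional[str], int]:
    remote = DictStore()
    await TwoLayerCache(remote).set("algebra-v1", "https://example.invalid/algebra.pdf")
    fresh = TwoLayerCache(remote)  # simulates a restarted worker with empty memory
    first = await fresh.get("algebra-v1")   # served by the remote layer
    second = await fresh.get("algebra-v1")  # served from memory
    return first, second, remote.reads
```

Running `asyncio.run(demo())` shows the point of the second layer: a restarted worker still finds the entry in the remote store, and only the first lookup pays for the remote read.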

3. Async Generation


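import asyncio

import weasyprint

# `cache`, `gcs`, `build_cache_key`, and `render_template` are project helpers
# defined elsewhere in the codebase.


def _render_pdf(html: str) -> bytes:
    """Convert rendered HTML to PDF bytes (synchronous WeasyPrint call)."""
    return weasyprint.HTML(string=html).write_pdf()


async def generate_notes(topic: str, curriculum: dict) -> str:
    """Return the URL of the PDF for this topic, generating it only on a cache miss."""
    cache_key = build_cache_key(topic, curriculum)
    if cached := await cache.get(cache_key):
        return cached  # cache hit: skip rendering and upload

    html = render_template("notes.html", curriculum=curriculum)
    # WeasyPrint is blocking, so run it in a worker thread to keep the
    # event loop free for other requests.
    pdf_bytes = await asyncio.to_thread(_render_pdf, html)
    url = await gcs.upload(pdf_bytes, cache_key)

    await cache.set(cache_key, url)
    return url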
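The snippet above calls `build_cache_key` without showing it. One plausible way to write it, offered as an assumption rather than the project's actual code, is to hash a canonical JSON form of the inputs so that equivalent curricula always map to the same key. The `notes/` prefix and `.pdf` suffix are illustrative choices.

```python
import hashlib
import json


def build_cache_key(topic: str, curriculum: dict) -> str:
    """Derive a stable key from the topic and the full curriculum payload.

    Hypothetical implementation: the original write-up only names this helper.
    """
    # sort_keys makes the JSON independent of dict insertion order, so
    # equivalent inputs always hash to the same key.
    payload = json.dumps(
        {"topic": topic, "curriculum": curriculum},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"notes/{digest}.pdf"
```

Because the key covers the whole curriculum payload, any content change produces a new key and triggers regeneration, while reordered but otherwise identical input reuses the cached PDF. This assumes the curriculum dict contains only JSON-serializable values.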

4. Results

  • Fully automated, consistent textbook output
  • Multi-layer caching skips rendering and upload for repeat requests, reducing generation cost and GCS traffic
  • Layout-aware rendering matches professional editorial quality

This project was built at NatrajX — an AI/IT engineering agency.

Full engineering write-up, system architecture, and production metrics available on the agency site.