Assessment Arena
An AI-native evaluation infrastructure that automates question generation, multimodal answer evaluation, semantic grading, and human-like annotated feedback at scale.
Client
EdTech Platform
Year
2024
Category
Computer Vision
Built at
NatrajX

Impact
Fully automated question generation pipeline
Multimodal grading: text + handwritten image answers
Human-like annotated feedback at scale
Eliminated manual evaluation bottleneck
Key Metrics
Evaluation Type
Text + Image (multimodal)
Automation
100% automated pipeline
Feedback
Annotated, human-like
Tech Stack
1. Problem
Manual assessment at scale is one of the biggest operational bottlenecks in EdTech. Question generation, answer evaluation, and feedback writing are repetitive, expensive, and inconsistent across human graders.
2. System Design
- Question Generator — AI-driven question synthesis from curriculum content
- Multimodal Evaluator — handles both typed text and handwritten image answers via Pillow preprocessing and vision models
- Semantic Grader — embedding-based semantic correctness scoring
- Feedback Engine — generates annotated, student-facing explanations
- PDF Pipeline — PDF2Image conversion for scanned answer sheets
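The Semantic Grader's embedding-based scoring can be sketched as follows. This is a minimal illustration, not the production implementation: the plain-list embeddings and the two score thresholds are assumptions chosen for readability.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def semantic_grade(answer_emb: list[float], expected_emb: list[float],
                   full_credit: float = 0.9, no_credit: float = 0.5) -> float:
    # Map raw similarity onto a 0..1 score using hypothetical cutoffs:
    # below no_credit -> 0, above full_credit -> 1, linear in between.
    sim = cosine_similarity(answer_emb, expected_emb)
    if sim >= full_credit:
        return 1.0
    if sim <= no_credit:
        return 0.0
    return (sim - no_credit) / (full_credit - no_credit)
```

In production, the embeddings would come from a sentence-embedding model over the student answer and the reference answer; the linear ramp between the cutoffs is one simple way to award partial credit.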
3. Multimodal Answer Processing
def evaluate_answer(answer_input: AnswerInput) -> EvaluationResult:
    # Route handwritten image answers through OCR; typed answers pass straight through.
    if answer_input.type == "image":
        text = ocr_extract(answer_input.image_bytes)
    else:
        text = answer_input.text
    # Score semantically against the expected answer, then generate annotated feedback.
    score = semantic_grade(text, answer_input.expected)
    feedback = generate_feedback(text, answer_input.expected, score)
    return EvaluationResult(score=score, feedback=feedback)
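The entry point above can be exercised end to end with stubbed helpers. Everything here beyond the routing logic is an assumption: the `AnswerInput`/`EvaluationResult` field shapes, the canned OCR output, and the token-overlap stand-in for embedding-based grading are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AnswerInput:
    type: str                      # "text" or "image" (assumed field names)
    expected: str                  # reference answer
    text: Optional[str] = None
    image_bytes: Optional[bytes] = None

@dataclass
class EvaluationResult:
    score: float
    feedback: str

def ocr_extract(image_bytes: bytes) -> str:
    # Stub: production uses a vision model / OCR engine on the image bytes.
    return "photosynthesis converts light into chemical energy"

def semantic_grade(text: str, expected: str) -> float:
    # Stub: production uses embedding similarity; token overlap (Jaccard) here.
    a, b = set(text.lower().split()), set(expected.lower().split())
    return len(a & b) / max(len(a | b), 1)

def generate_feedback(text: str, expected: str, score: float) -> str:
    # Stub: production generates an annotated, student-facing explanation.
    return "Correct" if score > 0.8 else "Partially correct; review the expected points."

def evaluate_answer(answer_input: AnswerInput) -> EvaluationResult:
    # Same routing as the snippet above, repeated here for a self-contained run.
    if answer_input.type == "image":
        text = ocr_extract(answer_input.image_bytes)
    else:
        text = answer_input.text
    score = semantic_grade(text, answer_input.expected)
    feedback = generate_feedback(text, answer_input.expected, score)
    return EvaluationResult(score=score, feedback=feedback)

if __name__ == "__main__":
    result = evaluate_answer(
        AnswerInput(type="text", expected="mitochondria produce ATP",
                    text="mitochondria produce ATP"))
    print(result.score, result.feedback)
```

Both input modes converge on the same text path after OCR, which is what keeps the grading and feedback stages single-purpose.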
4. Results
- End-to-end automated evaluation pipeline
- Supports handwritten + typed answers via multimodal processing
- Consistent grading eliminates inter-grader variability
This project was built at NatrajX — an AI/IT engineering agency.
Full engineering write-up, system architecture, and production metrics available on the agency site.