F1 2025 — Race Strategy Intelligence Platform

An end-to-end ML system that predicts lap times with an R² of 92.4%, simulates pit strategies, and provides SHAP-based explanations to support Formula 1 race strategy decisions.

Client: Personal / Kaggle
Year: 2026
Category: ML Systems
Built at: Personal

Impact

92.4% R² for lap-time prediction (XGBoost)

81% accuracy for top-3 finish classification

38% faster retraining pipeline post-optimisation

Sub-120ms inference latency via FastAPI

Key Metrics

Lap-time prediction: 92.4% R²
Top-3 accuracy: 81%
Inference latency: <120 ms
Retraining speedup: 38% faster

Tech Stack

Python, XGBoost, SHAP, Scikit-learn, Pandas, FastAPI, Next.js, MongoDB, Docker

1. Challenge

F1 lap times depend on tire degradation, fuel burn-off, track evolution, sector-to-sector variability, temperature, and competitor behaviour. A reliable prediction system therefore needs careful feature engineering, handling of outlier laps (safety cars, crashes), and a model interpretable enough to support strategic decisions.

2. Modular ML Pipeline

  • Data Layer — schema validation, anomaly filtering (safety car laps removed)
  • Feature Engineering — degradation slopes, rolling lap deltas, fuel-adjusted pace, sector consistency scores
  • Modeling — XGBoost regression (lap time) + ensemble classifiers (finishing position)
  • Explainability — SHAP feature attribution reveals tire age and fuel load as primary drivers
  • Serving — FastAPI microservice + Docker + Next.js strategy dashboard
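
As a rough illustration of the Data Layer step in the list above, the sketch below validates a minimal lap schema and drops laps run under neutralised conditions. The field names (`driver`, `lap`, `lap_time`, `track_status`) and the status labels are assumptions made for this example, not the project's actual schema. It uses only the standard library; the same checks map naturally onto a pandas boolean mask.

```python
from typing import Any

# Hypothetical schema: required field name -> accepted type(s)
REQUIRED_FIELDS: dict[str, tuple[type, ...]] = {
    "driver": (str,),
    "lap": (int,),
    "lap_time": (int, float),
    "track_status": (str,),
}
# Assumed labels for neutralised laps; real timing feeds use their own codes
NEUTRALISED = {"SC", "VSC", "RED"}


def is_valid(row: dict[str, Any]) -> bool:
    """True if every required field is present, correctly typed, and lap_time > 0."""
    for name, types in REQUIRED_FIELDS.items():
        if not isinstance(row.get(name), types):
            return False
    return row["lap_time"] > 0


def clean_laps(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep rows that pass validation and were run under green-flag conditions."""
    return [r for r in rows if is_valid(r) and r["track_status"] not in NEUTRALISED]


# Made-up rows: one clean lap, one safety-car lap, one lap with a missing time
laps = [
    {"driver": "D1", "lap": 1, "lap_time": 92.4, "track_status": "GREEN"},
    {"driver": "D1", "lap": 2, "lap_time": 131.0, "track_status": "SC"},
    {"driver": "D1", "lap": 3, "lap_time": None, "track_status": "GREEN"},
]
cleaned = clean_laps(laps)
print([r["lap"] for r in cleaned])
```

Filtering before feature engineering matters here because a single safety-car lap inside a stint would otherwise distort any slope or rolling statistic computed over that stint.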

3. Feature Engineering


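import numpy as np
import pandas as pd

FUEL_EFFECT_CONSTANT = 0.03  # assumed placeholder, seconds per kg of fuel; calibrate per circuit

def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    # Per-stint linear degradation slope (s/lap); assumes stint ids are unique per driver
    df['tire_deg_slope'] = df.groupby('stint')['lap_time'].transform(
        lambda x: np.polyfit(np.arange(len(x)), x, 1)[0] if len(x) > 1 else np.nan
    )
    df['fuel_adjusted_pace'] = df['lap_time'] - (df['fuel_load'] * FUEL_EFFECT_CONSTANT)
    # 3-lap rolling mean minus the current lap, kept within each stint so pit stops do not leak in
    df['rolling_delta'] = df.groupby('stint')['lap_time'].transform(lambda x: x.rolling(3).mean()) - df['lap_time']
    return df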
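
The pipeline also lists sector consistency scores, which the function above does not compute. One possible definition is sketched below, purely as an assumption: take the coefficient of variation of each sector's times across a stint, average the three, and map the result into (0, 1]. The function name and the formula are hypothetical and may differ from what the project uses.

```python
from statistics import mean, pstdev


def sector_consistency(sector_times: list[tuple[float, float, float]]) -> float:
    """Score in (0, 1]; 1.0 means identical sector times on every lap.

    Assumed definition: 1 / (1 + mean coefficient of variation across sectors).
    """
    if len(sector_times) < 2:
        raise ValueError("need at least two laps to measure consistency")
    cvs = []
    for sector in zip(*sector_times):  # regroup lap rows into per-sector columns
        cvs.append(pstdev(sector) / mean(sector))
    return 1.0 / (1.0 + mean(cvs))


# Made-up sector times (seconds) for two short stints
steady = [(28.0, 35.0, 27.0)] * 4
erratic = [(28.0, 35.0, 27.0), (29.5, 33.8, 28.1), (27.2, 36.4, 26.5)]
print(sector_consistency(steady), sector_consistency(erratic))
```

Dividing by the sector mean keeps long and short sectors on a comparable scale, so a 0.3 s swing in a 27 s sector counts for more than the same swing in a 36 s one.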

4. SHAP Explainability


import shap

# xgb_model: the fitted XGBoost regressor; X_test: held-out feature DataFrame
explainer = shap.TreeExplainer(xgb_model)
shap_values = explainer.shap_values(X_test)
# Result: tire_age and fuel_load are the top 2 drivers (as F1 engineers would expect)
shap.summary_plot(shap_values, X_test)
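
To turn the plot into a ranking that can be logged or asserted on, one common approach is to average the absolute SHAP values per feature. The helper below is a minimal sketch of that idea; `rank_by_mean_abs_shap` is a hypothetical name, the feature name `track_temp` is assumed, and the demo values are invented only to show the mechanics. It accepts any row-iterable, so a 2-D array of SHAP values for a single-output model should work as well as nested lists.

```python
from typing import Iterable, Sequence


def rank_by_mean_abs_shap(
    shap_rows: Iterable[Sequence[float]], feature_names: Sequence[str]
) -> list[tuple[str, float]]:
    """Return (feature, mean |SHAP|) pairs, largest first."""
    rows = list(shap_rows)
    if not rows:
        raise ValueError("no SHAP rows supplied")
    importances = [
        sum(abs(v) for v in column) / len(rows)
        for column in zip(*rows)  # one column per feature
    ]
    return sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)


# Made-up attribution values: two samples, three features
demo = [
    [-0.25, 0.125, 0.5],
    [0.25, -0.125, -0.75],
]
ranking = rank_by_mean_abs_shap(demo, ["fuel_load", "track_temp", "tire_age"])
print(ranking)
```

Taking the absolute value first matters: a feature that pushes predictions up on some laps and down on others would otherwise average out to near zero and look unimportant.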

5. Results

  • 92.4% R² for lap-time prediction, presented as competitive with F1 team internal models
  • SHAP confirms domain intuition: tire age > fuel load > track temp
  • Scenario engine: change tire compound / pit window → see projected race outcome
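
The scenario engine is only described at a high level here. As a toy illustration of the idea, the sketch below projects total race time for a one-stop strategy and searches for the best pit lap. It assumes lap time grows linearly with tire age and ignores fuel, traffic, and safety cars; the `Compound` class, the function names, and every number are hypothetical, and the real platform presumably uses the trained model rather than a closed-form stint time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    base_lap: float     # lap time on fresh tires, seconds (illustrative)
    deg_per_lap: float  # time lost per lap of tire age, seconds (illustrative)


def stint_time(compound: Compound, laps: int) -> float:
    """Total time for a stint where lap k (0-based) costs base + k * deg."""
    return laps * compound.base_lap + compound.deg_per_lap * laps * (laps - 1) / 2


def one_stop_race_time(first: Compound, second: Compound,
                       pit_lap: int, total_laps: int, pit_loss: float) -> float:
    """Projected race time when pitting at the end of lap `pit_lap`."""
    if not 0 < pit_lap < total_laps:
        raise ValueError("pit_lap must fall strictly inside the race")
    return (stint_time(first, pit_lap)
            + pit_loss
            + stint_time(second, total_laps - pit_lap))


soft = Compound(base_lap=90.0, deg_per_lap=0.12)
hard = Compound(base_lap=91.0, deg_per_lap=0.04)
TOTAL_LAPS, PIT_LOSS = 50, 22.0

# Try every legal pit lap and keep the one with the lowest projected race time
best_lap = min(range(1, TOTAL_LAPS),
               key=lambda lap: one_stop_race_time(soft, hard, lap, TOTAL_LAPS, PIT_LOSS))
best_time = one_stop_race_time(soft, hard, best_lap, TOTAL_LAPS, PIT_LOSS)
print(best_lap, round(best_time, 2))
```

Changing the compound order or the pit window then amounts to calling `one_stop_race_time` with different arguments and comparing the projected totals.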

This project was built at NatrajX — an AI/IT engineering agency.

Full engineering write-up, system architecture, and production metrics available on the agency site.