Data Science

Time-Series Analysis — 20 Years of Google Stock

A statistical financial-analytics pipeline that turns 5,000+ daily OHLCV records into investment metrics: regime detection, volatility clustering, Sharpe ratio, and max drawdown.

Client

Kaggle

Year

2025

Category

Data Science

Built at

Kaggle

Impact

5,000+ data points across 20 years analysed

Regime detection: Bull vs Bear market identification

Fat-tail distribution confirmed (a normal curve underestimates how often crashes occur)

Reusable ETL pipeline applicable to any equity ticker

Key Metrics

Data Points

5,000+

Time Horizon

20 years (2004–2024)

Metrics

Sharpe · Beta · Max Drawdown · Volatility Clustering

Methodology

Vectorised, reproducible pipeline

Tech Stack

Python 3.10 · Pandas · NumPy · Matplotlib · Seaborn · SciPy

1. Core Principle: Price is Noise, Returns are Signal

Raw price levels are non-stationary: their mean and variance drift over time. To build a robust model, prices must first be transformed into returns, which behave approximately like a stationary stochastic process centred on a stable mean.
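
A minimal sketch of the price-to-returns transformation, assuming daily closing prices held in a pandas Series. The helper name `to_log_returns` and the sample prices are illustrative and not taken from the project.

```python
import numpy as np
import pandas as pd


def to_log_returns(close: pd.Series) -> pd.Series:
    """Log returns r_t = ln(P_t / P_{t-1}); the first row has no predecessor and is dropped."""
    return np.log(close / close.shift(1)).dropna()


# Illustrative prices only, not real market data.
close = pd.Series([100.0, 102.0, 101.0, 105.0])
log_ret = to_log_returns(close)

# Log returns are additive: their sum equals the log of the total price ratio.
total = log_ret.sum()
```

Additivity is one reason log returns are often preferred over simple percentage returns for multi-period analysis.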

2. ETL Pipeline

  • Ingest — raw OHLCV CSV with strict temporal ordering
  • Transform — rolling features: volatility, moving averages, log returns
  • Visualise — structural market regimes (Bull vs Bear)
  • Quantify — Sharpe ratio, Beta, Max Drawdown, volatility clustering
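
The Ingest and Transform steps above might look roughly like the following. This is a sketch under assumptions: the column names (`Date`, `Close`, and so on), the function names `ingest` and `transform`, and the tiny inline CSV are all hypothetical stand-ins for the real input file.

```python
import io

import numpy as np
import pandas as pd

# Made-up rows, deliberately out of order, standing in for the raw OHLCV CSV.
RAW_CSV = """Date,Open,High,Low,Close,Volume
2024-01-03,102.0,104.0,101.0,103.0,1200
2024-01-01,100.0,101.0,99.0,100.0,1000
2024-01-02,100.0,103.0,100.0,102.0,1100
2024-01-04,103.0,105.0,102.0,104.0,900
"""


def ingest(source) -> pd.DataFrame:
    """Load OHLCV rows and enforce strict temporal ordering."""
    df = pd.read_csv(source, parse_dates=["Date"], index_col="Date")
    df = df.sort_index()
    if df.index.has_duplicates:
        raise ValueError("duplicate timestamps in input")
    return df


def transform(df: pd.DataFrame, window: int = 2) -> pd.DataFrame:
    """Add log returns, a moving average, and annualised rolling volatility."""
    out = df.copy()
    out["log_return"] = np.log(out["Close"] / out["Close"].shift(1))
    out["ma"] = out["Close"].rolling(window).mean()
    # 252 trading days per year is the usual annualisation assumption.
    out["rolling_vol"] = out["log_return"].rolling(window).std() * np.sqrt(252)
    return out


features = transform(ingest(io.StringIO(RAW_CSV)))
```

Every step is a vectorised pandas operation, so the same two functions should apply to any ticker whose CSV follows the assumed column layout.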

3. Max Drawdown Implementation


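import pandas as pd

def calculate_max_drawdown(price_series: pd.Series) -> tuple[pd.Series, float]:
    """Quantifies worst-case capital destruction from peak to trough."""
    rolling_peak = price_series.cummax()  # running all-time high up to each date
    drawdown = (price_series - rolling_peak) / rolling_peak  # fractional loss from that high (<= 0)
    return drawdown, drawdown.min()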
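
As a hand-checkable illustration of the same peak-to-trough logic, the sketch below applies it to a short made-up price path. The numbers are illustrative only.

```python
import pandas as pd

# Illustrative prices: peak at 120, trough at 90, then a new high at 130.
prices = pd.Series([100.0, 120.0, 90.0, 110.0, 130.0, 117.0])

rolling_peak = prices.cummax()  # 100, 120, 120, 120, 130, 130
drawdown = (prices - rolling_peak) / rolling_peak
max_drawdown = drawdown.min()  # (90 - 120) / 120 = -0.25

# Position of the deepest trough within the series.
trough_position = int(drawdown.to_numpy().argmin())
```

The later dip from 130 to 117 is only a 10% drawdown, so the earlier 25% fall remains the maximum.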

4. Regime Detection


import numpy as np
import pandas as pd

def detect_regimes(returns: pd.Series, window: int = 252) -> pd.Series:
    rolling_vol = returns.rolling(window).std() * np.sqrt(252)  # annualised rolling volatility
    regime = pd.cut(rolling_vol, bins=3, labels=['Low Vol', 'Mid Vol', 'High Vol'])  # equal-width bins, not quantiles
    return regime
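
The function above buckets volatility rather than labelling Bull and Bear phases directly. One common heuristic for the latter, shown here as an assumption and not as the project's method, marks a day as Bull when the close sits above its trailing moving average. The name `label_bull_bear` and the 200-day default are hypothetical.

```python
import numpy as np
import pandas as pd


def label_bull_bear(close: pd.Series, window: int = 200) -> pd.Series:
    """Label each day 'Bull' if close > trailing moving average, else 'Bear'.

    Days without a full window of history are left as NaN.
    """
    ma = close.rolling(window).mean()
    labels = pd.Series(
        np.where(close > ma, "Bull", "Bear"), index=close.index, dtype="object"
    )
    return labels.where(ma.notna())


# Illustrative prices with a short window so the result can be checked by hand.
close = pd.Series([10.0, 11.0, 12.0, 13.0, 9.0, 8.0])
regimes = label_bull_bear(close, window=3)
```

A moving-average rule is simple and lagging by construction; it flags a regime change only after prices have already moved.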

5. Key Findings

  • Fat-tail distribution: crashes occur more frequently than a normal distribution predicts
  • Volatility clustering: turbulent periods tend to be followed by more turbulence, and calm periods by more calm
  • Max drawdown captures the worst peak-to-trough loss an investor would have had to sit through (the 2008 and 2022 corrections are clearly visible)
  • Sharpe ratio confirms strong risk-adjusted performance over the 20-year horizon
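
The diagnostics behind findings like these can be sketched as follows. This is a minimal illustration on synthetic returns, not the project's code: the function names, the zero risk-free rate, and the 252-day annualisation factor are assumptions.

```python
import numpy as np
import pandas as pd
from scipy import stats

TRADING_DAYS = 252  # assumed annualisation factor for daily data


def sharpe_ratio(returns: pd.Series, risk_free_annual: float = 0.0) -> float:
    """Annualised Sharpe ratio from daily returns."""
    excess = returns - risk_free_annual / TRADING_DAYS
    return float(excess.mean() / excess.std() * np.sqrt(TRADING_DAYS))


def excess_kurtosis(returns: pd.Series) -> float:
    """Fisher excess kurtosis: 0 for a normal distribution, above 0 for fat tails."""
    return float(stats.kurtosis(returns, fisher=True))


def squared_return_autocorr(returns: pd.Series, lag: int = 1) -> float:
    """Autocorrelation of squared returns; positive values suggest volatility clustering."""
    return float((returns ** 2).autocorr(lag=lag))


# Synthetic data: a calm block followed by a turbulent block (not real prices).
rng = np.random.default_rng(0)
calm = rng.normal(0.0005, 0.005, 500)
stormy = rng.normal(0.0005, 0.03, 500)
returns = pd.Series(np.concatenate([calm, stormy]))

sharpe = sharpe_ratio(returns)
kurt = excess_kurtosis(returns)
clustering = squared_return_autocorr(returns)
```

Mixing a low-volatility and a high-volatility block produces both positive excess kurtosis and positively autocorrelated squared returns, which is the pattern the first two findings describe.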

This project was built at NatrajX — an AI/IT engineering agency.
