Time-Series Analysis — 20 Years of Google Stock
A statistical analytics pipeline that turns 5,000+ OHLCV data points into investment metrics: regime detection, volatility clustering, Sharpe ratio, and max drawdown.
Client: Kaggle
Year: 2025
Category: Data Science
Built at: Kaggle

Impact
5,000+ data points across 20 years analysed
Regime detection: Bull vs Bear market identification
Fat-tail distribution confirmed (crashes underestimated by normal curve)
Reusable ETL pipeline applicable to any equity ticker
Key Metrics
Data points: 5,000+
Time horizon: 20 years (2004–2024)
Metrics: Sharpe · Beta · Max Drawdown · Volatility Clustering
Methodology: Vectorised, reproducible pipeline
1. Core Principle: Price is Noise, Returns are Signal
Raw price levels are non-stationary: their mean and variance drift over time. To build a robust model, prices must first be transformed into returns, which behave far more like a stationary stochastic process centered around a stable mean.
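As a minimal sketch of that transformation, the snippet below converts a price series into log returns. The helper name `to_log_returns` and the sample prices are illustrative assumptions, not taken from the project's code.

```python
import numpy as np
import pandas as pd


def to_log_returns(prices: pd.Series) -> pd.Series:
    """Convert prices to log returns: r_t = ln(P_t / P_{t-1})."""
    # The first observation has no predecessor, so it is dropped.
    return np.log(prices / prices.shift(1)).dropna()


# Hypothetical closing prices, for illustration only.
prices = pd.Series([100.0, 102.0, 101.0, 105.0])
log_returns = to_log_returns(prices)

# Log returns are additive: their sum equals the log of the total price ratio.
total_growth = np.exp(log_returns.sum())  # equals 105 / 100 up to rounding
```

Log returns are used here rather than simple percentage changes because they sum across time, which makes rolling aggregates easier to reason about.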
2. ETL Pipeline
- Ingest — raw OHLCV CSV with strict temporal ordering
- Transform — rolling features: volatility, moving averages, log returns
- Visualise — structural market regimes (Bull vs Bear)
- Quantify — Sharpe ratio, Beta, Max Drawdown, volatility clustering
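The ingest and transform steps above might look roughly like the following. This is a sketch under stated assumptions: the column names (`Date`, `Close`), the inline sample CSV, the 3-day window, and the function names `ingest` and `transform` are all hypothetical stand-ins for the real dataset and code.

```python
import io

import numpy as np
import pandas as pd

# Tiny made-up OHLCV sample, deliberately out of order, standing in for the real CSV.
RAW_CSV = """Date,Open,High,Low,Close,Volume
2024-01-04,102,104,101,103,1200
2024-01-02,100,102,99,101,1000
2024-01-03,101,103,100,102,1100
2024-01-05,103,105,102,104,1300
2024-01-08,104,106,103,105,1250
"""


def ingest(source) -> pd.DataFrame:
    """Load OHLCV rows and enforce a sorted, duplicate-free date index."""
    df = pd.read_csv(source, parse_dates=["Date"]).set_index("Date").sort_index()
    if df.index.has_duplicates:
        raise ValueError("duplicate timestamps in input")
    return df


def transform(df: pd.DataFrame, window: int = 3) -> pd.DataFrame:
    """Add log returns, a moving average, and annualised rolling volatility."""
    out = df.copy()
    out["log_return"] = np.log(out["Close"] / out["Close"].shift(1))
    out["moving_avg"] = out["Close"].rolling(window).mean()
    # 252 trading days per year is an assumed annualisation factor.
    out["rolling_vol"] = out["log_return"].rolling(window).std() * np.sqrt(252)
    return out


features = transform(ingest(io.StringIO(RAW_CSV)))
```

Sorting and rejecting duplicate dates at ingest keeps every rolling window strictly backward-looking, so no feature is computed from rows that are out of sequence.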
3. Max Drawdown Implementation
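import pandas as pd

def calculate_max_drawdown(price_series: pd.Series) -> tuple[pd.Series, float]:
    """Quantifies worst-case capital destruction from peak to trough."""
    # Highest price seen so far at each point in time.
    rolling_peak = price_series.cummax()
    # Fractional loss versus that running peak: 0 at new highs, negative below.
    drawdown = (price_series - rolling_peak) / rolling_peak
    return drawdown, drawdown.min()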
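A small worked example may help make the drawdown arithmetic concrete. The price values below are invented for illustration; the function body mirrors the one shown in this section.

```python
import pandas as pd


def calculate_max_drawdown(price_series: pd.Series):
    rolling_peak = price_series.cummax()
    drawdown = (price_series - rolling_peak) / rolling_peak
    return drawdown, drawdown.min()


# Hypothetical prices: a peak at 120, a trough at 90, then a recovery.
prices = pd.Series([100.0, 120.0, 90.0, 110.0, 130.0, 125.0])
drawdown, max_dd = calculate_max_drawdown(prices)

trough = drawdown.idxmin()            # index label of the deepest loss
peak = prices.loc[:trough].idxmax()   # the peak that preceded that trough

print(f"max drawdown {max_dd:.0%} from index {peak} to index {trough}")
# prints: max drawdown -25% from index 1 to index 2
```

The fall from 120 to 90 is a loss of 30 on a peak of 120, which is where the -25% comes from; the later dip from 130 to 125 is much shallower and does not affect the result.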
4. Regime Detection
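import numpy as np
import pandas as pd

def detect_regimes(returns: pd.Series, window: int = 252) -> pd.Series:
    """Labels each date Low/Mid/High Vol using three equal-width bands of rolling annualised volatility."""
    rolling_vol = returns.rolling(window).std() * np.sqrt(252)
    regime = pd.cut(rolling_vol, bins=3, labels=['Low Vol', 'Mid Vol', 'High Vol'])
    return regime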
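The function above labels volatility regimes, and the example below runs it on synthetic returns. Because the pipeline also mentions Bull vs Bear identification, the sketch adds a hypothetical `label_bull_bear` helper based on a common rule of thumb (a decline of 20% or more from the running peak counts as a bear market). That helper, its threshold, and all sample data are assumptions for illustration, not the project's actual method.

```python
import numpy as np
import pandas as pd


def detect_regimes(returns: pd.Series, window: int = 252) -> pd.Series:
    rolling_vol = returns.rolling(window).std() * np.sqrt(252)
    return pd.cut(rolling_vol, bins=3, labels=["Low Vol", "Mid Vol", "High Vol"])


def label_bull_bear(prices: pd.Series, threshold: float = 0.20) -> pd.Series:
    """Hypothetical helper: 'Bear' while price is at least `threshold` below its running peak."""
    drawdown = prices / prices.cummax() - 1.0
    labels = np.where(drawdown <= -threshold, "Bear", "Bull")
    return pd.Series(labels, index=prices.index)


# Synthetic daily returns: a calm year followed by a turbulent year (made-up data).
rng = np.random.default_rng(0)
returns = pd.Series(
    np.concatenate([rng.normal(0, 0.005, 252), rng.normal(0, 0.03, 252)])
)
vol_regimes = detect_regimes(returns, window=63)

# Made-up prices: the fall from 110 to 85 exceeds 20%, so that point is 'Bear'.
prices = pd.Series([100.0, 110.0, 95.0, 85.0, 90.0, 115.0])
market_regimes = label_bull_bear(prices)
```

Note that `pd.cut` with `bins=3` splits the observed volatility range into equal-width bands, so the labels are relative to the sample being analysed; `pd.qcut` would be an alternative if equally populated bands are preferred.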
5. Key Findings
- Fat-tail distribution: extreme moves, crashes in particular, occur more frequently than a normal distribution predicts
- Volatility clustering: turbulent periods tend to be followed by more turbulence, and calm periods by more calm
- Max drawdown exposes the deepest peak-to-trough losses an investor would have had to sit through (the 2008 and 2022 corrections are clearly visible)
- Sharpe ratio confirms strong risk-adjusted performance over the 20-year horizon
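The findings above can each be tied to a simple statistic. The sketch below shows one plausible way to compute them; the function names, the 252-day annualisation factor, the zero default risk-free rate, and the synthetic inputs are all assumptions, and the project may have used different estimators.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumed annualisation factor


def sharpe_ratio(returns: pd.Series, risk_free_rate: float = 0.0) -> float:
    """Annualised Sharpe ratio from daily returns; risk_free_rate is annual."""
    excess = returns - risk_free_rate / TRADING_DAYS
    return float(excess.mean() / excess.std() * np.sqrt(TRADING_DAYS))


def excess_kurtosis(returns: pd.Series) -> float:
    """Near 0 for normally distributed data; clearly positive values indicate fat tails."""
    return float(returns.kurt())


def volatility_clustering(returns: pd.Series, lag: int = 1) -> float:
    """Autocorrelation of squared returns; positive values suggest clustering."""
    return float((returns ** 2).autocorr(lag=lag))


# Synthetic fat-tailed returns (Student-t, 3 degrees of freedom), made-up data.
rng = np.random.default_rng(42)
fat_tailed = pd.Series(rng.standard_t(df=3, size=5000) * 0.01)
tail_score = excess_kurtosis(fat_tailed)

# Made-up series with a calm block followed by a turbulent block.
calm_then_wild = pd.Series([0.01, -0.01] * 25 + [0.05, -0.05] * 25)
cluster_score = volatility_clustering(calm_then_wild)
```

Squared returns are used for the clustering check because raw returns are typically close to uncorrelated, while their magnitudes are not.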
This project was built at NatrajX — an AI/IT engineering agency.
Full engineering write-up, system architecture, and production metrics available on the agency site.