Time-Series Analysis for Financial Data: A Quantitative Engineering Approach

How to build a rigorous financial analytics pipeline in Python — covering returns vs price stationarity, rolling volatility, regime detection, Sharpe ratio, max drawdown, and vectorized computation.

Rishabh Bhartiya · 9 min read

Financial time-series analysis is one of the most demanding data science domains. The data is non-stationary, fat-tailed, and full of structural breaks. Most "data science" tutorials on stock data are dangerously naive — they analyze raw price levels and mistake correlation for signal.

This post covers the rigorous quantitative approach I used to analyze 20 years of Google/Alphabet stock data — the same methodology used by quant analysts.

Principle 1: Never Analyze Raw Prices

Raw stock prices are non-stationary — they have trends, drift, and regime changes that make standard statistical tools invalid. The first transformation is always to returns:


import pandas as pd
import numpy as np

def load_and_prepare(filepath: str) -> pd.DataFrame:
    df = pd.read_csv(filepath, parse_dates=["Date"], index_col="Date")
    df = df.sort_index()  # Ensure chronological order

    # Log returns: more stationary, better for statistics
    df["log_return"] = np.log(df["Close"] / df["Close"].shift(1))

    # Simple returns: intuitive for portfolio math
    df["simple_return"] = df["Close"].pct_change()

    # Remove the first row (NaN from shift)
    return df.dropna()
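
The stationarity claim is easy to verify directly. Here is a minimal sketch using the augmented Dickey-Fuller test from statsmodels (the CSV path is hypothetical):


from statsmodels.tsa.stattools import adfuller

def adf_pvalue(series: pd.Series) -> float:
    """A low p-value rejects the unit root, i.e. the series looks stationary."""
    return adfuller(series.dropna())[1]

df = load_and_prepare("googl.csv")  # hypothetical path
print(f"Raw prices:  p = {adf_pvalue(df['Close']):.3f}")       # fails to reject
print(f"Log returns: p = {adf_pvalue(df['log_return']):.3f}")  # rejects decisively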

Rolling Volatility: Detecting Risk Regimes

Volatility clusters: calm periods tend to follow calm periods, and volatile periods tend to follow volatile ones (the GARCH effect). Rolling volatility makes these regimes visible:


def compute_rolling_metrics(df: pd.DataFrame, window: int = 252) -> pd.DataFrame:
    """252 trading days = 1 year"""
    
    # Annualized rolling volatility
    df["rolling_vol"] = (
        df["log_return"]
        .rolling(window)
        .std() * np.sqrt(252)
    )
    
    # Rolling mean return (annualized)
    df["rolling_mean_return"] = (
        df["log_return"]
        .rolling(window)
        .mean() * 252
    )
    
    # Rolling Sharpe (simplified — assumes 0 risk-free rate)
    df["rolling_sharpe"] = (
        df["rolling_mean_return"] / df["rolling_vol"]
    )
    
    # Regime classification
    vol_33 = df["rolling_vol"].quantile(0.33)
    vol_66 = df["rolling_vol"].quantile(0.66)
    df["regime"] = pd.cut(
        df["rolling_vol"],
        bins=[-np.inf, vol_33, vol_66, np.inf],
        labels=["Low Vol", "Mid Vol", "High Vol"]
    )
    
    return df
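
A quick usage sketch (file path again hypothetical) to see how the regime labels distribute:


df = compute_rolling_metrics(load_and_prepare("googl.csv"))
print(df["regime"].value_counts())
# The tercile cut makes the three buckets roughly equal by construction;
# the insight comes from *when* each regime occurs, not from the counts.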

Maximum Drawdown: Quantifying Investor Pain


def calculate_max_drawdown(price_series: pd.Series) -> tuple[pd.Series, float, int]:
    """
    Maximum drawdown = worst peak-to-trough decline.
    Essential for risk management — tells you the worst case.
    """
    # High Water Mark: running maximum
    rolling_peak = price_series.cummax()
    
    # Drawdown at each point
    drawdown = (price_series - rolling_peak) / rolling_peak
    
    # Maximum drawdown
    max_dd = drawdown.min()
    
    # Drawdown duration: longest run of days spent below a prior peak
    underwater = drawdown < 0
    recovery_groups = (~underwater).cumsum()   # new group starts at each recovery
    max_duration = int(underwater.groupby(recovery_groups).sum().max())
    
    return drawdown, max_dd, max_duration

# Google's historical drawdowns
drawdown_series, max_dd, dd_days = calculate_max_drawdown(df["Close"])
print(f"Maximum Drawdown: {max_dd:.1%}")   # -65.2% during 2008 financial crisis
print(f"Longest underwater stretch: {dd_days} trading days")

Sharpe Ratio: Risk-Adjusted Performance


def sharpe_ratio(returns: pd.Series,
                  risk_free_rate: float = 0.04,   # 4% annual
                  periods_per_year: int = 252) -> float:
    """
    Sharpe > 1.0: Good
    Sharpe > 2.0: Excellent
    Sharpe < 0.5: Probably not worth the risk
    """
    excess_returns = returns - (risk_free_rate / periods_per_year)
    
    annualized_return = excess_returns.mean() * periods_per_year
    annualized_vol    = returns.std() * np.sqrt(periods_per_year)
    
    return annualized_return / annualized_vol

# Beta vs S&P 500
def calculate_beta(asset_returns: pd.Series, market_returns: pd.Series) -> float:
    # Align on shared dates first; dropping NaNs from each series
    # separately can silently misalign the two samples
    aligned = pd.concat([asset_returns, market_returns], axis=1, keys=["asset", "market"]).dropna()
    covariance = aligned["asset"].cov(aligned["market"])
    market_var = aligned["market"].var()
    return covariance / market_var
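
Putting the two together (a sketch, assuming df from load_and_prepare and a hypothetical spy_returns series of S&P 500 daily log returns):


print(f"Sharpe: {sharpe_ratio(df['log_return']):.2f}")
print(f"Beta:   {calculate_beta(df['log_return'], spy_returns):.2f}")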

Fat Tails: Why Normal Distribution Fails for Finance


from scipy import stats

def test_normality(returns: pd.Series) -> dict:
    """
    Financial returns are NOT normally distributed.
    They have fat tails — extreme events happen more often than predicted.
    """
    jarque_bera_stat, jb_pvalue = stats.jarque_bera(returns.dropna())
    kurtosis = stats.kurtosis(returns.dropna())
    skewness = stats.skew(returns.dropna())
    
    # VaR (Value at Risk) — 5th percentile loss
    var_95 = returns.quantile(0.05)
    
    return {
        "jarque_bera_p_value": jb_pvalue,    # < 0.05 = reject normality
        "excess_kurtosis": kurtosis,           # > 0 = fat tails
        "skewness": skewness,                  # < 0 = negative skew (more crashes)
        "var_95": var_95,                      # 95% VaR
        "is_normal": jb_pvalue > 0.05
    }
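
To see the fat tails directly, compare the empirical 5% quantile with what a fitted normal distribution would predict (a sketch, assuming df from load_and_prepare):


r = df["log_return"].dropna()
normal_var_95 = r.mean() + r.std() * stats.norm.ppf(0.05)
print(f"Empirical 95% VaR: {r.quantile(0.05):.2%}")   # actual tail
print(f"Normal-model VaR:  {normal_var_95:.2%}")      # what a Gaussian predicts
# With fat tails the empirical figure is typically the more negative one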

Key Findings from 20 Years of GOOGL

  • Excess kurtosis: 8.2 — crashes happen 4× more often than a normal distribution predicts
  • Sharpe Ratio (2004–2024): 0.87 — good risk-adjusted performance over 20 years
  • Max Drawdown: -65.2% (2008 financial crisis)
  • Three distinct volatility regimes clearly visible: 2008, 2020 (COVID), 2022 (rate hikes)
  • Log returns pass the stationarity test (ADF p-value < 0.001) while raw prices fail

Tags

Time-Series · Quantitative Finance · Pandas · NumPy · Data Analysis

Author

Rishabh Bhartiya

ML Engineer · NatrajX
