Value Investing & Time-Series Analysis
About This Project
Warren Buffett's investment acumen has produced substantial returns for many of his followers. Paradoxically, some investors who simply replicate his portfolio choices have seen their profits erode. Several factors contribute to this counterintuitive outcome: a mismatch between individual and collective market sentiment, unforeseen sectoral or macroeconomic fluctuations, and the inherent unpredictability of stock market movements.
Technical Details
The time series is first transformed with a natural logarithm to stabilize the variance, and then an ARIMA model is fitted with the auto.arima function from the R forecast package. A fractional differencing algorithm produces an approximately stationary time series, which is then fitted with an ARMA model. The ARMA orders are chosen by searching over the candidate combinations of AR and MA orders and ranking the fitted models by AIC and BIC.
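The log transform and fractional differencing steps can be sketched in plain Python. This is a minimal illustration, not the project's own code: the function names, the truncation length `n_weights`, and the sample prices are all assumptions made for the example. The weights come from the binomial expansion of \((1 - L)^d\), which also covers non-integer d.

```python
import math

def log_transform(prices):
    """Natural log of a strictly positive series (stabilizes the variance)."""
    return [math.log(p) for p in prices]

def frac_diff_weights(d, n_weights):
    """First n_weights coefficients of (1 - L)^d, built with the
    recursion w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(-weights[-1] * (d - k + 1) / k)
    return weights

def frac_diff(series, d, n_weights):
    """Apply the truncated filter. The first n_weights - 1 points are
    dropped because they lack a full window of history."""
    w = frac_diff_weights(d, n_weights)
    return [
        sum(w[k] * series[t - k] for k in range(n_weights))
        for t in range(n_weights - 1, len(series))
    ]

# Made-up prices, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0, 107.0, 110.0]
log_prices = log_transform(prices)

print(frac_diff_weights(0.5, 4))   # → [1.0, -0.5, -0.125, -0.0625]
print(frac_diff(log_prices, 0.5, 3))
```

With d = 1 the weights reduce to 1, -1, 0, ..., so ordinary first differencing is a special case. A fractional d between 0 and 1 removes less of the series' long-run memory than a full difference does.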
What is ARIMA?
ARIMA stands for AutoRegressive Integrated Moving Average. It is one of the most widely used statistical models for forecasting univariate time series data (a single variable observed over time, e.g., stock prices, sales, temperature).
ARIMA excels at capturing:
- Trends (gradual increase/decrease)
- Short-term dependencies between observations
- Random shocks, modeled as a white-noise error term
(Seasonality is handled by the SARIMA extension.)
The Three Components: ARIMA(p, d, q)
An ARIMA model is defined by three parameters: (p, d, q).
| Parameter | Name | Meaning | Mathematical Idea |
|---|---|---|---|
| p | AutoRegressive order | Number of past observations used to predict the current value. | \(y_t = \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \epsilon_t\) |
| d | Differencing order | Number of times the series is differenced to make it stationary (remove trend). | \(\Delta^d y_t = (1 - L)^d y_t\) |
| q | Moving Average order | Number of past forecast errors used in the prediction. | \(y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \dots + \theta_q \epsilon_{t-q}\) |
The complete ARIMA(p,d,q) equation (in lag operator notation) is:
\[
(1 - \phi_1 L - \dots - \phi_p L^p)(1 - L)^d y_t = (1 + \theta_1 L + \dots + \theta_q L^q) \epsilon_t
\]
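As a worked example, take ARIMA(1,1,1), an order chosen here purely for illustration. Multiplying out the lag polynomials on the left-hand side and solving for \(y_t\) gives the recursion the model actually uses:

\[
(1 - \phi_1 L)(1 - L) y_t = (1 + \theta_1 L) \epsilon_t
\]
\[
\bigl(1 - (1 + \phi_1) L + \phi_1 L^2\bigr) y_t = \epsilon_t + \theta_1 \epsilon_{t-1}
\]
\[
y_t = (1 + \phi_1) y_{t-1} - \phi_1 y_{t-2} + \epsilon_t + \theta_1 \epsilon_{t-1}
\]

Setting \(\phi_1 = \theta_1 = 0\) reduces this to \(y_t = y_{t-1} + \epsilon_t\), the random walk, which is ARIMA(0,1,0).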
Step-by-Step Breakdown
- Integration (I(d)) – Make the series stationary
Differencing removes trends:
d = 1 → \(\Delta y_t = y_t - y_{t-1}\)
Most common: d = 0 or 1 (rarely 2).
- AutoRegressive (AR(p)) – Use past values
Current value is a linear combination of previous values plus noise.
Example AR(1): \(y_t = \phi_1 y_{t-1} + \epsilon_t\)
- Moving Average (MA(q)) – Use past errors
Current value depends on past forecast errors (shocks).
Example MA(1): \(y_t = \epsilon_t + \theta_1 \epsilon_{t-1}\)
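The three building blocks above can be sketched in a few lines of plain Python. The function names, the starting values `y0` and `eps0`, and the single-shock input are assumptions chosen for illustration. Feeding in one unit shock followed by zeros shows how the AR(1) and MA(1) responses differ.

```python
def difference(series, d=1):
    """Apply first differencing d times: y_t - y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out

def simulate_ar1(phi, shocks, y0=0.0):
    """AR(1): y_t = phi * y_{t-1} + eps_t, starting from y0."""
    ys, prev = [], y0
    for eps in shocks:
        prev = phi * prev + eps
        ys.append(prev)
    return ys

def simulate_ma1(theta, shocks, eps0=0.0):
    """MA(1): y_t = eps_t + theta * eps_{t-1}, with initial shock eps0."""
    ys, prev_eps = [], eps0
    for eps in shocks:
        ys.append(eps + theta * prev_eps)
        prev_eps = eps
    return ys

shocks = [1.0, 0.0, 0.0, 0.0]  # a single unit shock, then nothing

print(difference([1, 3, 6, 10]))        # → [2, 3, 4]
print(difference([1, 3, 6, 10], d=2))   # → [1, 1]
print(simulate_ar1(0.5, shocks))        # → [1.0, 0.5, 0.25, 0.125]
print(simulate_ma1(0.5, shocks))        # → [1.0, 0.5, 0.0, 0.0]
```

The AR(1) response decays geometrically and never fully disappears, while the MA(1) response vanishes after one lag. That difference is what ACF and PACF plots pick up when suggesting p and q.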
Practical Workflow to Fit an ARIMA Model
- Plot the data → look for trend/seasonality.
- Test stationarity (e.g., ADF test).
- Difference until stationary → choose d.
- Examine ACF/PACF plots of the differenced series:
• ACF cuts off after lag q → suggests an MA order of q
• PACF cuts off after lag p → suggests an AR order of p
- Fit candidate models and compare with AIC/BIC.
- Check residuals → should look like white noise.
- Forecast!
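Two of the workflow steps, inspecting the ACF and comparing candidates by AIC/BIC, reduce to short formulas. The sketch below uses plain Python and makes several assumptions: the function names are hypothetical, the AIC/BIC values use a Gaussian likelihood with the variance estimated as the mean squared residual, and the residual lists are made-up numbers standing in for the output of two fitted models.

```python
import math

def sample_acf(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

def gaussian_aic_bic(residuals, n_params):
    """AIC and BIC under a Gaussian likelihood, with sigma^2 set to the
    mean squared residual. n_params is the number of fitted parameters
    (conventions differ on whether to count the variance)."""
    n = len(residuals)
    sigma2 = sum(e * e for e in residuals) / n
    neg2_loglik = n * (math.log(2 * math.pi * sigma2) + 1)
    aic = neg2_loglik + 2 * n_params
    bic = neg2_loglik + n_params * math.log(n)
    return aic, bic

# Made-up residuals from two hypothetical candidate fits: (residuals, n_params).
candidates = {
    "ARMA(1,0)": ([0.3, -0.2, 0.1, -0.4, 0.2, 0.1], 2),
    "ARMA(1,1)": ([0.3, -0.2, 0.1, -0.3, 0.2, 0.1], 3),
}

# Rank by AIC, lowest first; swap index 0 for 1 to rank by BIC instead.
ranked = sorted(candidates, key=lambda name: gaussian_aic_bic(*candidates[name])[0])
for name in ranked:
    aic, bic = gaussian_aic_bic(*candidates[name])
    print(name, round(aic, 3), round(bic, 3))

# Residual check: autocorrelations beyond lag 0 should be close to zero.
print(sample_acf(candidates[ranked[0]][0], 2))
```

Lower AIC or BIC is better. BIC penalizes each extra parameter by ln(n) instead of 2, so it tends to favor smaller models once the series has more than about 7 observations.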
Common Extensions
- SARIMA: Adds seasonal AR and MA terms for periodic patterns.
- ARIMAX: Includes exogenous regressors (like regression + ARIMA errors).
ARIMA remains a cornerstone of time-series forecasting due to its interpretability, statistical foundation, and solid performance on many real-world univariate datasets.