The Great Grimpen Mire

Boris Demeshev
Oxana Malakhovskaya

September Equinox

Bayesian Vector Autoregression Paradox:

  • High predictive power

  • Underused and confused

What is a BVAR?

BVAR = VAR + Bayesian approach

VAR: \[ \begin{cases} y_t =\Phi_{const}+ \Phi_1 y_{t-1} + \Phi_2 y_{t-2} +\ldots + \Phi_p y_{t-p} + \varepsilon_t \\ \varepsilon_t\sim \mathcal{N}(0,\Sigma) \end{cases} \]
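
The VAR equation above can be sketched as a simple simulation. This is an illustrative toy, not from the slides: a bivariate VAR(2) with made-up coefficient matrices.

```python
import numpy as np

# Hypothetical sketch: simulate a bivariate VAR(2),
#   y_t = Phi_const + Phi_1 y_{t-1} + Phi_2 y_{t-2} + eps_t,  eps_t ~ N(0, Sigma).
# All parameter values are illustrative.
rng = np.random.default_rng(0)

m, p, T = 2, 2, 200
Phi_const = np.array([0.1, -0.2])
Phi = [np.array([[0.5, 0.1], [0.0, 0.4]]),   # Phi_1
       np.array([[0.2, 0.0], [0.1, 0.1]])]   # Phi_2
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
chol = np.linalg.cholesky(Sigma)             # to draw eps_t ~ N(0, Sigma)

y = np.zeros((T, m))
for t in range(p, T):
    eps = chol @ rng.standard_normal(m)
    y[t] = Phi_const + Phi[0] @ y[t - 1] + Phi[1] @ y[t - 2] + eps
```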

Bayesian approach:

  1. Impose some prior distribution on \(\Sigma\), \(\Phi_{const}\), \(\Phi_1\), \(\Phi_2\), …

  2. Use Bayes' theorem \(p(\theta \mid data) \propto p(data \mid \theta) \cdot p(\theta)\) to obtain the posterior distribution.

  3. Use posterior distribution for forecasting.
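
The three steps can be sketched on a toy scalar problem (a normal mean with known unit variance, posterior evaluated on a grid; all numbers are illustrative):

```python
import numpy as np

# Step-by-step toy example for a scalar parameter theta
# (mean of a normal with known variance 1); values are illustrative.
rng = np.random.default_rng(1)
data = rng.normal(loc=0.7, scale=1.0, size=50)

theta = np.linspace(-3, 3, 601)
dtheta = theta[1] - theta[0]
prior = np.exp(-0.5 * theta**2)                       # step 1: N(0, 1) prior, unnormalized
loglik = -0.5 * ((data[:, None] - theta)**2).sum(axis=0)
posterior = prior * np.exp(loglik - loglik.max())     # step 2: p(theta|data) ∝ p(data|theta) p(theta)
posterior /= posterior.sum() * dtheta                 # normalize on the grid

# step 3: forecast the next observation by the posterior predictive mean,
# which here equals the posterior mean of theta
forecast = (theta * posterior).sum() * dtheta
```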

On the shoulders of giants:

  • Robert Litterman, 1979, Techniques of forecasting using vector autoregressions

  • Rao Kadiyala and Sune Karlsson, 1997, Numerical Methods for Estimation and Inference in Bayesian VAR-Models

  • Christopher A. Sims and Tao Zha, 1998, Bayesian methods for dynamic multivariate models

  • Sune Karlsson, 2012, Forecasting with Bayesian Vector Autoregressions

  • more than 7 000 hits in

Note: ARMA gives more than 700 000 hits!

Why the BVAR paradox?

  1. Great Grimpen Mire of prior distributions

  2. Great Grimpen Mire of software

  3. No MCMC in bachelor probability course

Great Grimpen Mire of prior distributions

General overview of BVARs:

  • structural vs reduced form BVARs

Structural: \(Ay_t = B_0+ B_1 y_{t-1} +\ldots + B_p y_{t-p} + u_t\)

Reduced form: \(y_t =\Phi_{const}+ \Phi_1 y_{t-1} + \Phi_2 y_{t-2} +\ldots + \Phi_p y_{t-p} + \varepsilon_t\)

Link: \(B_i = A \Phi_i\).
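The link \(B_i = A \Phi_i\) can be checked numerically. A minimal sketch with made-up matrices (any invertible \(A\) works):

```python
import numpy as np

# Numeric check of the link B_i = A * Phi_i between the structural and
# reduced forms; A and Phi_1 are illustrative, A must be invertible.
A = np.array([[1.0, 0.5], [0.0, 1.0]])      # contemporaneous matrix
Phi1 = np.array([[0.5, 0.1], [0.2, 0.4]])   # reduced-form lag matrix

B1 = A @ Phi1                        # structural lag matrix
Phi1_back = np.linalg.solve(A, B1)   # recover Phi_1 = A^{-1} B_1
```

The same \(A\) links the errors: multiplying the reduced form by \(A\) maps \(\varepsilon_t\) to \(u_t = A \varepsilon_t\).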

  • time varying parameters vs classic BVARs

The parameters \(\Phi_i\) may change over time.

Classical BVARs in reduced form

\(y_t =\Phi_{const}+ \Phi_1 y_{t-1} + \Phi_2 y_{t-2} +\ldots + \Phi_p y_{t-p} + \varepsilon_t\)

We should impose prior on \(\Sigma\), \(\Phi_{const}\), \(\Phi_1\), \(\Phi_2\), …

Here \(y_t\) is multivariate: \(m\times 1\).

For \(m=10\) variables and \(p=4\) lags the coefficient matrices alone contain \(m(mp+1)=410\) parameters.
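
The count follows from the VAR equation: each of the \(m\) equations has one constant plus \(m\) coefficients for each of the \(p\) lags. A one-line sketch:

```python
# Each of the m equations has 1 constant plus m coefficients per lag,
# so the coefficient matrices hold m * (m * p + 1) parameters in total.
def n_var_coefficients(m: int, p: int) -> int:
    return m * (m * p + 1)

print(n_var_coefficients(10, 4))  # 410, before counting Sigma
```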

Grimpen Mire of priors

  1. Confusing names for priors

  2. No clear classification of priors

  3. Contradictory notation

Results in:

  1. Coding mistakes

  2. Much needless struggle to understand the literature

  3. Underuse of BVARs

Everything is simple:

\[ \begin{cases} y_t =\Phi_{const}+ \Phi_1 y_{t-1} + \Phi_2 y_{t-2} +\ldots + \Phi_p y_{t-p} + \varepsilon_t \\ \varepsilon_t\sim \mathcal{N}(0,\Sigma) \end{cases} \]

We should place prior on:

  1. Covariance matrix \(\Sigma\)

  2. Coefficients \(\Phi_i\)

Prior for \(\Sigma\) and \(\Phi_i\):

  1. Independent \(\Sigma\) and \(\Phi_i\)

Prior \(p(\Sigma)\) and prior \(p(\Phi_{const}, \Phi_1, \ldots, \Phi_p)\)

  2. Conjugate \(\Sigma\) and \(\Phi_i\)

Prior \(p(\Sigma)\) and prior \(p(\Phi_{const}, \Phi_1, \ldots, \Phi_p | \Sigma)\)

Sharp classification:

  1. Independent \(\Sigma\) and \(\Phi_i\):
    • Independent Normal-Inverse Wishart prior
    • Independent Normal-Jeffreys prior
    • Minnesota prior
  2. Conjugate \(\Sigma\) and \(\Phi_i\):
    • Conjugate Normal-Inverse Wishart prior
    • Conjugate Normal-Jeffreys prior
    • Kronecker-subcase of Minnesota prior

Technical details:

  • Independent Normal-Inverse Wishart prior

\[ \begin{cases} \Sigma \sim \mathcal{IW}(\underline S,\underline \nu) \\ \phi \sim \mathcal{N}(\underline \phi, \underline \Xi) \\ p(\phi, \Sigma) = p(\phi)\cdot p(\Sigma) \end{cases} \]
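One draw from the Independent Normal-Inverse Wishart prior can be sketched as follows; the hyperparameter values (\(\underline S\), \(\underline \nu\), \(\underline \phi\), \(\underline \Xi\)) are illustrative placeholders:

```python
import numpy as np
from scipy.stats import invwishart, multivariate_normal

# One draw from the Independent Normal-Inverse Wishart prior;
# all hyperparameter values are illustrative.
m, k = 2, 5                      # m variables, k coefficients per equation
S_bar = np.eye(m)                # prior scale for Sigma
nu_bar = m + 2                   # prior degrees of freedom
phi_bar = np.zeros(m * k)        # prior mean of phi
Xi_bar = 10.0 * np.eye(m * k)    # prior covariance of phi

rng = np.random.default_rng(2)
Sigma = invwishart.rvs(df=nu_bar, scale=S_bar, random_state=rng)
phi = multivariate_normal.rvs(mean=phi_bar, cov=Xi_bar, random_state=rng)
# Independence: Sigma and phi are drawn separately, p(phi, Sigma) = p(phi) p(Sigma)
```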

  • Minnesota prior

\[ \begin{cases} \Sigma = const\\ \phi \sim \mathcal{N}(\underline \phi, \underline \Xi) \end{cases} \]

  • Conjugate Normal-Inverse Wishart prior

\[ \begin{cases} \Sigma \sim \mathcal{IW}(\underline S,\underline \nu) \\ \phi | \Sigma \sim \mathcal{N}(\underline \phi, \Sigma \otimes \underline \Omega) \end{cases} \]
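The conjugate case differs only in the second step: \(\phi\) is drawn conditional on \(\Sigma\), with a Kronecker-structured covariance. A sketch with illustrative hyperparameters:

```python
import numpy as np
from scipy.stats import invwishart

# One draw from the Conjugate Normal-Inverse Wishart prior:
# first Sigma ~ IW(S_bar, nu_bar), then phi | Sigma ~ N(phi_bar, kron(Sigma, Omega_bar)).
# All hyperparameter values are illustrative.
m, k = 2, 5
S_bar = np.eye(m)
nu_bar = m + 2
phi_bar = np.zeros(m * k)
Omega_bar = 10.0 * np.eye(k)

rng = np.random.default_rng(3)
Sigma = invwishart.rvs(df=nu_bar, scale=S_bar, random_state=rng)
cov = np.kron(Sigma, Omega_bar)   # Kronecker-structured conditional covariance
phi = phi_bar + np.linalg.cholesky(cov) @ rng.standard_normal(m * k)
```

The Kronecker structure is what makes this prior conjugate: the posterior keeps the same Normal-Inverse Wishart form, so no MCMC is needed.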

Here be dragons