Shift-Share Instruments: Exogenous Shocks
Introduction
Borusyak, Hull, and Jaravel (2022) provide a new framework for the use and interpretation of shift-share (or Bartik) instruments. Their framework clarifies the conditions under which these instruments are valid, shifting the focus from the exogeneity of regional industry shares to the quasi-random assignment of industry-level shocks.
Set-Up
The core of Borusyak, Hull, and Jaravel’s framework is an equivalence result that recasts the traditional region-level shift-share instrumental variable (SSIV) regression into an equivalent regression at the shock level.
A typical shift-share instrument is constructed as follows:
- Instrument (\(Z_i\)) for a region \(i\): \[ Z_i = \sum_j s_{ij} \cdot g_j \] where:
- \(s_{ij}\) represents the share of industry j in region i (the “share” component).
- \(g_j\) is a national or industry-level shock (the “shift” component).
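As a minimal sketch of the construction above, the instrument is just a matrix-vector product of the share matrix and the shock vector. The numbers below are made up purely for illustration (two regions, three industries); the variable names `shares`, `shocks`, and `instrument` are not from the paper.

```python
import numpy as np

# Hypothetical exposure shares s_ij: one row per region, one column per industry.
shares = np.array([
    [0.5, 0.3, 0.2],   # region 1
    [0.1, 0.6, 0.3],   # region 2
])

# Hypothetical industry-level shocks g_j (for example, national growth rates).
shocks = np.array([0.10, -0.05, 0.02])

# Z_i = sum_j s_ij * g_j, computed for every region at once.
instrument = shares @ shocks
print(instrument)
```

Each entry of `instrument` is the share-weighted average shock faced by one region.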
The traditional view required that the instrument as a whole be uncorrelated with the error term in the outcome equation. Borusyak, Hull, and Jaravel (2022) demonstrate that the orthogonality between this shift-share instrument and the unobserved residual can be equivalently represented as the orthogonality between the underlying shocks (\(g_j\)) and a shock-level unobservable.
This leads to a “shock-level” regression where the consistency of the SSIV estimator can be assessed. This reformulation is a key aspect of their structured model, as it moves the focus of the identification argument away from the potentially endogenous regional shares to the exogeneity of the shocks themselves. This approach allows for the exposure shares to be endogenous, a significant departure from previous understandings of shift-share instruments.
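The equivalence can be checked numerically. The sketch below assumes equal regional weights \(e_i = 1/N\) and uses simulated data; the names `s_n` (average exposure to each shock) and `resid_bar` (exposure-weighted average residual per shock) are illustrative labels for the shock-level quantities, not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_regions, n_industries = 50, 8

# Simulated inputs: shares sum to one within each region.
shares = rng.dirichlet(np.ones(n_industries), size=n_regions)  # s_ij
shocks = rng.normal(size=n_industries)                         # g_j
resid = rng.normal(size=n_regions)                             # regional residual
e = np.full(n_regions, 1.0 / n_regions)                        # regional weights (assumed equal)

# Region-level moment: sum_i e_i * Z_i * resid_i.
z = shares @ shocks
region_moment = np.sum(e * z * resid)

# Shock-level quantities: average exposure and exposure-weighted residual.
s_n = e @ shares
resid_bar = ((e * resid) @ shares) / s_n

# Shock-level moment: sum_j s_n[j] * g_j * resid_bar[j].
shock_moment = np.sum(s_n * shocks * resid_bar)

print(np.isclose(region_moment, shock_moment))
```

The two moments agree for any shares, shocks, and residuals, because the shock-level form is only a reordering of the same double sum over regions and industries.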
Identifying Assumptions
The central identifying assumption in the Borusyak, Hull, and Jaravel (2022) framework is the quasi-random assignment of shocks. This means that the industry-level shocks are as-good-as-randomly assigned with respect to the shock-level unobservables. More formally, the key assumptions are:
Shock Exogeneity: The shocks (\(g_j\)) are uncorrelated with the error term in the shock-level representation of the model. This is the main identifying assumption. It implies that the shocks are not systematically related to other factors that influence the outcome variable at the industry level.
No Single Dominant Shock: The influence of any single shock on the instrument should be limited. This is often referred to as a “no-dominant-shock” condition and is important for the asymptotic properties of the estimator. One way to make it concrete is to require that the Herfindahl index of the average exposure to each shock becomes small, so that the instrument effectively averages over many shocks.
Sufficiently Dispersed Shares: The exposure shares (\(s_{ij}\)) across regions must be sufficiently heterogeneous. This ensures that the instrument has power and is not driven by a small number of regions or industries.
This set of assumptions gives researchers a clear framework, parts of which can be probed empirically. For instance, the exogeneity of shocks can be partially assessed by checking for pre-trends, i.e., whether the shocks are correlated with pre-existing trends in the outcome variable at the industry level.
In essence, they argue that if the industry-level shocks can be plausibly argued to be exogenous, then the resulting shift-share instrument is valid, even if the regional industry shares are correlated with unobserved determinants of the outcome.
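One simple diagnostic for the no-dominant-shock condition is the inverse Herfindahl index of the average exposure to each shock, which can be read as an effective number of shocks. The helper below is a hypothetical sketch (the function name and the equal-weights default are assumptions), not a routine from the paper or from any package.

```python
import numpy as np

def effective_number_of_shocks(shares, region_weights=None):
    """Inverse Herfindahl index of the average exposure to each shock."""
    shares = np.asarray(shares, dtype=float)
    n_regions = shares.shape[0]
    if region_weights is None:
        # Assumption: regions are equally weighted unless told otherwise.
        region_weights = np.full(n_regions, 1.0 / n_regions)
    # Average exposure to each industry shock across regions.
    avg_exposure = region_weights @ shares
    # Normalise so the exposures sum to one even if the rows do not.
    avg_exposure = avg_exposure / avg_exposure.sum()
    herfindahl = np.sum(avg_exposure ** 2)
    return 1.0 / herfindahl

# Evenly spread exposure across four industries.
even = effective_number_of_shocks([[0.25, 0.25, 0.25, 0.25]] * 3)

# Exposure dominated by a single industry.
concentrated = effective_number_of_shocks([[0.97, 0.01, 0.01, 0.01]] * 3)

print(even, concentrated)
```

A value close to the number of industries suggests exposure is spread evenly; a value close to one suggests that a single shock dominates the instrument.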
Heuristic Derivation of Exogeneity
The core of their mathematical argument is to show how the exogeneity of shocks at the industry level translates into a valid instrument at the regional level. The goal is to show that the covariance between the shift-share instrument and the unobserved regional-level residual is zero, which is the standard condition for instrument validity.
Let’s walk through a simplified version of the proof presented by Borusyak, Hull, and Jaravel (2022). We are interested in the effect of an endogenous treatment, \(X_i\), on an outcome, \(Y_i\), for a set of regions \(i = 1, \dots, N\).
Outcome: \[ Y_i = \alpha + \beta X_i + \epsilon_i \]
Here, \(\beta\) is the causal parameter of interest, and \(\epsilon_i\) is the unobserved residual or error term. We are concerned that \(X_i\) is correlated with \(\epsilon_i\), making OLS estimates of \(\beta\) biased.
Shift-Share Instrument (\(Z_i\)): To address this endogeneity, we use a shift-share instrument:
\[ Z_i = \sum_{j=1}^{J} s_{ij} g_j \] where \(s_{ij}\) is the initial share of industry j in region i, and \(g_j\) is the aggregate (e.g., national) shock to industry j.
The key condition for the validity of \(Z_i\) as an instrument is the exclusion restriction, which states that the instrument is uncorrelated with the regional-level error term:
\[ Cov(Z_i, \epsilon_i) = 0 \]
A heuristic way to see the insight of Borusyak, Hull, and Jaravel is to decompose this regional residual into industry-level components. For this derivation, suppose the residual \(\epsilon_i\) is itself a share-weighted average of industry-specific regional factors:
\[ \epsilon_i = \sum_{j=1}^{J} s_{ij} \epsilon_{ij} + \nu_i \]
Here, \(\epsilon_{ij}\) represents the unobserved determinants of the outcome \(Y_i\) that are specific to industry j within region i. The term \(\nu_i\) captures any remaining regional-level unobservables that are not tied to specific industry compositions. For simplicity in this proof, we can assume \(\nu_i = 0\), as the core argument relates the shocks \(g_j\) to the industry-specific components \(\epsilon_{ij}\).
Now, let’s connect shock exogeneity to instrument validity. We can substitute the decomposed residual into the covariance condition we need to prove.
Step 1: Express the Covariance
Let’s write out the covariance between the instrument \(Z_i\) and the residual \(\epsilon_i\):
\[ \text{Cov}(Z_i, \epsilon_i) = \text{Cov} \left( \sum_{j=1}^{J} s_{ij} g_j, \sum_{k=1}^{J} s_{ik} \epsilon_{ik} \right) \]
Note that I use a different index (k) for the summation inside the residual to avoid confusion.
Step 2: Expand the Covariance Term
Using the properties of covariance, we can expand this expression. Assuming the observations are at the region level, the covariance is an average over regions i.
\[ \text{Cov}(Z_i, \epsilon_i) = E \left[ \left( \sum_{j=1}^{J} s_{ij} g_j \right) \left( \sum_{k=1}^{J} s_{ik} \epsilon_{ik} \right) \right] - E[Z_i]E[\epsilon_i] \]
For simplicity, let’s assume the shocks and residuals have a mean of zero, so the expectation of their product is the covariance.
\[ \text{Cov}(Z_i, \epsilon_i) = E \left[ \sum_{j=1}^{J} \sum_{k=1}^{J} s_{ij} s_{ik} g_j \epsilon_{ik} \right] \]
Step 3: The Crucial “Shock-Level” Reformulation
Borusyak, Hull, and Jaravel’s key insight is to rearrange this summation. Instead of viewing this as a sum over regions, they reframe it as a weighted sum over industries (or shocks). This is the shift from the “region-level” to the “shock-level” perspective.
Let’s rewrite the expectation. Treating the shocks as fixed across regions, we can pull each \(g_j\) out of the expectation over regions, as the shocks do not depend on the region index \(i\) or the second industry index \(k\).
\[ \text{Cov}(Z_i, \epsilon_i) = \sum_{j=1}^{J} g_j \cdot E \left[ s_{ij} \sum_{k=1}^{J} s_{ik} \epsilon_{ik} \right] \]
Let’s define a new term, \(\tilde{\epsilon}_j\), which is a weighted average of the industry-specific residuals, where the weights are a function of the shares:
\[ \tilde{\epsilon}_j = E \left[ s_{ij} \sum_{k=1}^{J} s_{ik} \epsilon_{ik} \right] \]
This \(\tilde{\epsilon}_j\) plays the role of the shock-level unobservable in Borusyak, Hull, and Jaravel’s framework. It aggregates the unobserved factors that load on shock \(g_j\) through the cross-regional pattern of industry shares.
Now, the regional covariance can be expressed in a much simpler form:
\[ \text{Cov}(Z_i, \epsilon_i) = \sum_{j=1}^{J} g_j \tilde{\epsilon}_j \]
This expression looks very much like a covariance itself, but this time at the shock level.
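The rearrangement in Step 3 can be verified on simulated data by replacing the expectation with an average over regions. This is only a sketch under the simplifying assumption \(\nu_i = 0\) used above; the array names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n_regions, n_industries = 40, 6

shares = rng.dirichlet(np.ones(n_industries), size=n_regions)  # s_ij
eps_ik = rng.normal(size=(n_regions, n_industries))            # epsilon_ik
shocks = rng.normal(size=n_industries)                         # g_j

# Region-level form: average over regions of the double sum over j and k.
double_sum = np.einsum("ij,ik,j,ik->i", shares, shares, shocks, eps_ik)
region_form = double_sum.mean()

# Shock-level form: sum_j g_j * eps_tilde_j, with eps_tilde_j computed as
# the regional average of s_ij * sum_k s_ik * epsilon_ik.
eps_i = (shares * eps_ik).sum(axis=1)
eps_tilde = (shares * eps_i[:, None]).mean(axis=0)
shock_form = shocks @ eps_tilde

print(np.isclose(region_form, shock_form))
```

The two forms coincide exactly in the sample, which is why the argument can move between the regional and the shock-level perspective without any approximation at this step.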
Step 4: Applying the Shock Exogeneity Assumption
The main identifying assumption in the paper is shock exogeneity. This assumption states that the shocks are quasi-randomly assigned with respect to the shock-level unobservables. Formally, this means (asymptotically, over the set of shocks j):
\[ Cov(g_j, \tilde{\epsilon}_j) = 0 \]
This implies that the weighted sum we derived above, \(\sum_{j=1}^{J} g_j \tilde{\epsilon}_j\), will converge to zero as the number of shocks grows large, provided the shocks and shares are not concentrated in a way that violates certain technical conditions (like the “no dominant shock” or “dispersed shares” conditions).
Therefore, by assuming shock exogeneity at the industry level, we have shown that the covariance at the regional level becomes zero: \[ Cov(Z_i, \epsilon_i) \rightarrow 0. \]
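A small simulation can illustrate this convergence. The sketch below is an illustrative design, not the paper's: shares are drawn from a Dirichlet distribution, the industry-specific residuals are deliberately made to depend on the shares (so the shares are endogenous), and the shocks are drawn independently with mean zero. The function name `mean_abs_moment` and all parameter values are assumptions.

```python
import numpy as np

def mean_abs_moment(n_regions, n_industries, n_draws, rng):
    """Average absolute value of (1/N) * sum_i Z_i * eps_i over shock redraws."""
    # Shares s_ij: each region's row sums to one.
    shares = rng.dirichlet(np.ones(n_industries), size=n_regions)
    # Industry-specific residuals that depend on the shares (endogenous shares).
    eps_ik = n_industries * shares - 1.0 + rng.normal(size=shares.shape)
    # Regional residual with nu_i = 0.
    eps = (shares * eps_ik).sum(axis=1)
    # Shocks are redrawn independently of shares and residuals.
    shocks = rng.normal(size=(n_draws, n_industries))
    # One instrument vector per shock draw: shape (n_draws, n_regions).
    z = shocks @ shares.T
    # Sample moment for each draw, then its average magnitude.
    moments = (z * eps).mean(axis=1)
    return np.abs(moments).mean()

rng = np.random.default_rng(0)
few_shocks = mean_abs_moment(500, 5, 500, rng)
many_shocks = mean_abs_moment(500, 200, 500, rng)
print(few_shocks, many_shocks)
```

In this design the typical size of the moment shrinks as the number of industries grows, even though the shares are correlated with the residual. With only a handful of shocks the moment stays noticeably away from zero, which is the practical content of the no-dominant-shock condition.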
Conclusion
The derivation indicates that the standard instrument validity condition at the regional level, \(Cov(Z_i, \epsilon_i) = 0\), can be restated as a shock-level exogeneity condition, \(Cov(g_j, \tilde{\epsilon}_j) = 0\).
This is a nice result because it is often more plausible and empirically verifiable to argue that industry-wide shocks (\(g_j\)) are exogenous than to argue that regional industry shares (\(s_{ij}\)) are. For example, one can test for pre-trends by checking if the shocks \(g_j\) are correlated with lagged outcomes at the industry level. This was not possible under the traditional interpretation that focused on the exogeneity of the regional shares. The work of Borusyak, Hull, and Jaravel (2022) thus provides a deeper foundation for the use and interpretation of shift-share instrumental variables.
Footnotes
Borusyak, K., Hull, P., & Jaravel, X. (2022). Quasi-experimental shift-share research designs. The Review of Economic Studies, 89(1), 181-213.