andrewlb notes

Causal Inference

Published:

Causal Inference

Metadata

Highlights

  • Causal inference is the leveraging of theory and deep knowledge of institutional details to estimate the impact of events and choices on a given outcome of interest. (Location 154)
  • sometimes there are causal relationships between two things and yet no observable correlation. (Location 199)
  • Just because there is no observable relationship does not mean there is no causal one. (Location 208)
  • Human beings engaging in optimal behavior are the main reason correlations almost never reveal causal relationships, because rarely are human beings acting randomly. And as we will see, it is the presence of randomness that is crucial for identifying causal effects. (Location 218)
  • There’s experimental data and non-experimental data. The latter is also sometimes called observational data. (Location 226)
  • Economic theory tells us we should be suspicious of correlations found in observational data. In observational data, correlations are almost certainly not reflecting a causal relationship because the variables were endogenously chosen by people who were making decisions they thought were best. (Location 236)
  • a correlation, in order to be a measure of a causal effect, must be based on a choice that was made independent of the potential outcomes under consideration. (Location 240)
  • economic theory says choices are endogenous, and since they are, the correlations between those choices and outcomes in the aggregate will rarely, if ever, represent a causal effect. (Location 243)
  • Comparative statics are theoretical descriptions of causal effects contained within the model. These kinds of comparative statics are always based on the idea of ceteris paribus—or “all else constant.” (Location 264)
  • post hoc ergo propter hoc, which is Latin for “after this, therefore because of this.” This fallacy recognizes that the temporal ordering of events is not sufficient to be able to say that the first thing caused the second. (Location 363)
  • These describe a population, and our goal in empirical work is to estimate their values. We never directly observe these parameters, because they are not data (I will emphasize this throughout the book). What we can do, though, is estimate these parameters using data and assumptions. To do this, we need credible assumptions to accurately estimate these parameters with data. (Location 668)
  • We call that mistake the residual, and here use the notation û for it. So the residual equals û = y − ŷ. While both the residual and the error term are represented with a u, it is important that you know the differences. The residual is the prediction error based on our fitted ŷ and the actual y. The residual is therefore easily calculated with any sample of data. (Location 743)
  • u without the hat is the error term, and it is by definition unobserved by the researcher. Whereas the residual will appear in the data set once generated from a few steps of regression and manipulation, the error term will never appear in the data set. It is all of the determinants of our outcome not captured by our model. This is a crucial distinction, and strangely enough it is so subtle that even some seasoned researchers struggle to express it. (Location 747)
  • It is called the zero conditional mean assumption and is probably the most critical assumption in causal inference. In the population, the error term has zero mean given any value of the explanatory variable: E(u | x) = 0. This is the key assumption for showing that OLS is unbiased, with the zero value being of no importance once we assume that E(u | x) does not change with x. Note that we can compute OLS estimates whether or not this assumption holds, even if there is an underlying population model. (Location 869)
  • An important complement to the CEF is the law of iterated expectations (LIE). This law says that an unconditional expectation can be written as the unconditional average of the CEF. (Location 942)
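The highlight at Location 199 (a causal relationship with no observable correlation) can be illustrated with a minimal simulation. This is my own sketch, not code from the book: x fully determines y, but because the relationship is symmetric rather than monotonic, the correlation is approximately zero.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100_000)          # values symmetric around 0
y = x ** 2 + rng.normal(0, 0.1, size=x.size)  # x causes y, but nonlinearly

# Sample correlation is ~0 despite a deterministic causal channel from x to y
r = np.corrcoef(x, y)[0, 1]
print(abs(r) < 0.02)  # True: no observable (linear) correlation
```

The point matches the book's warning: "no observable relationship" here means no linear correlation, which is exactly what a naive scan of the data would check.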
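The residual-versus-error-term distinction (Locations 743-747) is easy to see in a simulation, since only there can we generate the error term u ourselves. The following is my own illustration, not the book's code: a researcher with real data observes only y and x, recovers the residual û = y − ŷ after fitting OLS, and never sees u.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(0, 1, n)
u = rng.normal(0, 1, n)   # error term: exists in the population model, unobserved in real data
y = 2.0 + 0.5 * x + u     # population model with intercept 2.0 and slope 0.5

# OLS slope and intercept in closed form
b1 = np.cov(x, y, ddof=0)[0, 1] / np.var(x)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x
u_hat = y - y_hat         # residual: computable from any sample of data

print(abs(u_hat.mean()) < 1e-8)  # True: with an intercept, OLS residuals average to zero
```

Note that û averages to exactly zero by construction of OLS, while u merely has mean zero in expectation; the two vectors are close here only because the model is correctly specified.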
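The law of iterated expectations (Location 942) can be checked numerically. This is my own small example, not from the book: for a discrete X, averaging the conditional means E[Y | X = v] with X's empirical weights reproduces the unconditional mean E[Y].

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.integers(0, 3, size=30_000)          # discrete X taking values in {0, 1, 2}
y = 1.5 * x + rng.normal(0, 1, size=x.size)  # Y depends on X

# CEF evaluated at each value of X, then averaged over X's empirical distribution
cef = {v: y[x == v].mean() for v in np.unique(x)}
lie = sum(cef[v] * (x == v).mean() for v in np.unique(x))

print(abs(lie - y.mean()) < 1e-10)  # True: E[E[Y | X]] == E[Y]
```

The equality holds exactly (up to floating-point error) because the weighted group means are an algebraic decomposition of the grand mean, which is why the LIE is so useful for reasoning about the CEF.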

public: true

title: Causal Inference
longtitle: Causal Inference
author: Scott Cunningham
url:
source: kindle
last_highlight: 2022-05-18
type: books
tags:
