Course Meeting Times

Lectures: 2 sessions / week, 1.5 hours / session

Recitations: 1 session / week, 1 hour / session


This course covers the statistical tools needed to understand empirical economic research and to plan and execute independent research projects. Topics include statistical inference, regression, generalized least squares, instrumental variables, simultaneous equations models, and evaluation of government policies and programs.


The prerequisite is Introduction to Statistical Methods in Economics (14.30) or an equivalent course. Students should be familiar with basic concepts in probability theory and statistical inference. The course includes a brief statistics review.

Course Requirements

Each week there are two lectures and one recitation.

In addition to the readings, there are 6 graded problem sets, plus ungraded review problem sets at the beginning and end of the course. The problem sets have both analytical and computer-exercise components. The statistical analysis will be done using Stata or SAS on PCs or MIT workstations. Help for new Stata users will be given in recitation.


Wooldridge, Jeffrey M. Introductory Econometrics: A Modern Approach. 3rd ed. Mason, OH: Thomson/South-Western, 2006. ISBN: 9780324289787.

Goldberger, Arthur S. A Course in Econometrics. Cambridge, MA: Harvard University Press, 1991. ISBN: 9780674175440.

DeGroot, Morris H., and Mark J. Schervish. Probability and Statistics. 3rd ed. Boston, MA: Addison-Wesley, 2001. ISBN: 9780201524888.

Wooldridge is the main text. The material in Goldberger is more advanced and optional. DeGroot and Schervish is a recommended text for statistics review. Each unit also has readings from published journal articles.


Problem sets (5% each): 30%
Midterm exam: 30%
Final exam: 40%


Each review problem set is worth 1 bonus percentage point.

Graded problem sets are mandatory, and solutions must be submitted on time to receive credit. Stata or SAS logs should be submitted with solution sets. A grade of 50% or better on at least 5 problem sets is required to be eligible to take the final. You may consult with classmates on problem sets if you get stuck, but written solution sets must be your own work.
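The arithmetic behind the grading scheme above can be sketched in a few lines of Python. This is only an illustration: the function and argument names are hypothetical, and it assumes that each review problem set earns its bonus point when credited, that bonus points are simply added to the weighted total, and that the total is not capped at 100. The syllabus does not spell out those details.

```python
from typing import Sequence

# Weights taken from the grading breakdown; all names here are hypothetical.
PROBLEM_SET_WEIGHT = 5.0   # percentage points per graded problem set (6 sets)
MIDTERM_WEIGHT = 30.0      # percentage points for the midterm exam
FINAL_WEIGHT = 40.0        # percentage points for the final exam
REVIEW_BONUS = 1.0         # bonus percentage point per review problem set


def eligible_for_final(problem_set_scores: Sequence[float]) -> bool:
    """Return True if at least 5 graded problem sets scored 50% or better."""
    passing = sum(1 for score in problem_set_scores if score >= 50.0)
    return passing >= 5


def course_grade(
    problem_set_scores: Sequence[float],
    midterm: float,
    final: float,
    review_sets_credited: int = 0,
) -> float:
    """Weighted course percentage; every score is on a 0-100 scale.

    Assumption: bonus points are added on top of the weighted total.
    """
    if len(problem_set_scores) != 6:
        raise ValueError("expected scores for 6 graded problem sets")
    problem_set_part = sum(
        PROBLEM_SET_WEIGHT * score / 100 for score in problem_set_scores
    )
    midterm_part = MIDTERM_WEIGHT * midterm / 100
    final_part = FINAL_WEIGHT * final / 100
    bonus = REVIEW_BONUS * review_sets_credited
    return problem_set_part + midterm_part + final_part + bonus


if __name__ == "__main__":
    scores = [80, 90, 70, 100, 60, 40]   # made-up example scores
    print(eligible_for_final(scores))
    print(course_grade(scores, midterm=75, final=85, review_sets_credited=2))
```

With the made-up scores shown, five of the six problem sets reach 50%, so the eligibility check passes, and the weighted total is 22 (problem sets) + 22.5 (midterm) + 34 (final) + 2 (bonus) = 80.5.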

Course Outline

Part I

  • A. Review of probability and statistics
    • 1. Probability and distribution
    • 2. Expectation and moments
  • B. Review of statistical inference
    • 3. Sampling distributions and inference
    • 4. The central limit theorem (asymptotic distribution of the sample mean)
    • 5. Confidence intervals
  • C. Regression basics
    • 6. Conditional expectation functions, bivariate regression
    • 7. Sampling distribution of regression estimates; Gauss-Markov theorem
    • 8. How classical assumptions are used; asymptotic distribution of the sample slope
    • 9. Residuals, fitted values, and goodness of fit

Part II

  • D. Multivariate regression
    • 10. Regression, causality, and control; anatomy of multivariate regression coefficients
    • 11. Omitted variables formula, short vs. long regressions
    • 12a. Dummy variables and interactions; testing linear restrictions using F-tests
    • 12b. Regression analysis of natural experiments, differences-in-differences
  • E. Inference problems: heteroscedasticity and autocorrelation
    • 13a. Consequences of heteroscedasticity; weighted least squares; the linear probability model
    • 13b. Consequences of serial correlation in time series; quasi-differencing; common-factor restriction; Durbin-Watson test for serial correlation
  • F. Instrumental variables, simultaneous equations models, measurement error
    • 14a. Using IV to solve omitted-variables problems
    • 14b. Measurement error (time permitting)
    • 14c. Regression-discontinuity designs (time permitting)
  • G. Simultaneous equations models
    • 15. Simultaneous equations models I
      • a. The use of structural models
      • b. Simultaneous equations bias
      • c. The identification problem
      • d. The structure and the reduced form
      • e. Indirect least squares
    • 16. Simultaneous equations models II
      • a. IV for the SEM
      • b. Two-stage least squares
      • c. Sampling variance of 2SLS estimates