Problem Set 4

EC031-S26

Aleksandr Michuda

Note: The problems may differ based on the edition of the textbook you have.

Problem 1

Briefly explain the difference between \(b_1(\text{OLS})\) and \(\beta_1\); between the residual, \(e_i\), and the regression error, \(\epsilon_i\); and between the OLS predicted value, \(\hat{y}_i\), and \(E(Y_i \mid X)\).

Problem 2

ASW 10.38

Problem 3

ASW 10.45

Problem 4

ASW 11.23

Problem 5

ASW 11.29

Problem 6

ASW 14.55

Problem 7

ASW 14.1

Problem 8

ASW 14.47

Stata Exercise: Simulating OLS with Outliers

In this problem, we will simulate a simple linear regression model with one independent variable and one dependent variable. We will then add outliers to the data and see how the OLS estimates change.

  1. Simulate a simple linear regression model with the following data generating process: \[ Y_i = \beta_0 + \beta_1 X_i + \epsilon_i \] where \(\beta_0 = 1\), \(\beta_1 = 2\), and \(\epsilon_i \sim N(0, 1)\).

To do this, open a do-file and write:

clear all
set obs 100
gen X = rnormal()
gen epsilon = rnormal()
gen Y = 1 + 2*X + epsilon
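
Because `rnormal()` draws new random numbers on every run, the estimates will differ each time the do-file is executed. A minimal sketch of a reproducible version follows, assuming an arbitrary seed of 12345; the seed is not part of the original exercise, and any fixed integer would serve.

```stata
clear all
set seed 12345              // arbitrary seed (an assumption); fixes the random draws
set obs 100                 // 100 observations
gen X = rnormal()           // regressor drawn from N(0, 1)
gen epsilon = rnormal()     // regression error drawn from N(0, 1)
gen Y = 1 + 2*X + epsilon   // true beta_0 = 1, true beta_1 = 2
```

With the seed fixed, rerunning the do-file should give the same simulated data and the same estimates each time.
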
  2. Create a scatterplot of \(Y\) against \(X\) using the simulated data. Make sure to give it a title and axis labels that make it “prettier”. Add a regression line to the scatterplot by running:

scatter Y X || lfit Y X

  3. Estimate the OLS regression of \(Y\) on \(X\). What are the estimated coefficients? Why do they take these values?

  4. Add an outlier to the data by setting \(Y\) in the first observation to 100:

replace Y = 100 in 1

  5. Estimate the OLS regression of \(Y\) on \(X\) again and show a scatterplot with a fitted line. What are the estimated coefficients? Did \(b_1\) change? Did \(b_0\) change? Why?
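
As a reference point for the final question, the standard closed-form OLS estimates for a simple regression are written out here. The symbols \(\bar{X}\) and \(\bar{Y}\) (sample means) and \(n\) (sample size) are notation introduced for this reminder and do not appear in the problem statement.

\[ b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \qquad b_0 = \bar{Y} - b_1 \bar{X} \]

Every observation enters these sums, so it may help to consider how changing a single \(Y_i\) affects \(\bar{Y}\) and the numerator of \(b_1\).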