Exercises: Principles of Econometrics

SME Notes 1

Simple linear regression model

\[y_i=\beta_1+\beta_2 x_i+\epsilon_i \]

There are a number of assumptions required to formulate the simple linear regression model:

  1. The value of \(y_i\), at each value of \(x_i\), is \(y_i=\beta_1+\beta_2 x_i+\epsilon_i\).
  2. The independent variables \(x_i\) are not random, and must take at least two different values.
  3. The expected value of the random errors, \(\epsilon_i\), is \(\mathbb{E}\left(\epsilon_i\right)=0\) or equivalently \(\mathbb{E}\left(y_i\right)=\) \(\beta_1+\beta_2 x_i\)
  4. The variances of the random errors, \(\epsilon_i\), and the random variables, \(y_i\), are equal to each other:

\[\operatorname{Var}\left(\epsilon_i\right)=\operatorname{Var}\left(y_i\right)=\sigma^2 \]

In fact, \(\epsilon_i\) and \(y_i\), both of which are random, differ only by the constant \(\beta_1+\beta_2 x_i\).

  5. The covariance between any pair of the random errors, \(\epsilon_i\) and \(\epsilon_j\) \((i \neq j)\), is zero:

\[\operatorname{Cov}\left(\epsilon_i, \epsilon_j\right)=\operatorname{Cov}\left(y_i, y_j\right)=0 . \]

A covariance of 0 does not necessarily imply that two random variables are (statistically) independent.

Definition (Statistically independent)

Two events are independent if the occurrence of one event does not affect the chances of the occurrence of the other event.

  6. (Optional) The values of the random errors, \(\epsilon_i\), are normally distributed about their mean if the values of the random variables, \(y_i\), are normally distributed, and vice versa:

\[y_i \sim N\left(\beta_1+\beta_2 x_i, \sigma^2\right) \Leftrightarrow \epsilon_i \sim N\left(0, \sigma^2\right) . \]
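The assumptions above can be sanity-checked by simulation: generate data from the model and recover the parameters by least squares. A minimal numpy sketch; the values of \(\beta_1\), \(\beta_2\), \(\sigma\) and the design for \(x_i\) are illustrative assumptions, not from the notes.

```python
# Sketch: simulate y_i = beta1 + beta2*x_i + eps_i and recover the
# parameters by ordinary least squares (illustrative values).
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
beta1, beta2, sigma = 2.0, 0.5, 1.0

x = rng.uniform(0, 10, size=n)          # x_i fixed in repeated samples
eps = rng.normal(0, sigma, size=n)      # eps_i ~ N(0, sigma^2)
y = beta1 + beta2 * x + eps

# OLS via the normal equations for the simple model:
# b2 = sample cov(x, y) / sample var(x),  b1 = ybar - b2 * xbar
b2 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b1 = y.mean() - b2 * x.mean()

print(b1, b2)   # close to 2.0 and 0.5
```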

poe version:

ASSUMPTIONS OF THE SIMPLE LINEAR REGRESSION MODEL

SR1. The value of \(y\), for each value of \(x\), is

\[y=\beta_1+\beta_2 x+e \]

SR2. The expected value of the random error \(e\) is

\[E(e)=0 \]

which is equivalent to assuming that

\[E(y)=\beta_1+\beta_2 x \]

SR3. The variance of the random error \(e\) is

\[\operatorname{var}(e)=\sigma^2=\operatorname{var}(y) \]

The random variables \(y\) and \(e\) have the same variance because they differ only by a constant.
SR4. The covariance between any pair of random errors \(e_i\) and \(e_j\) is

\[\operatorname{cov}\left(e_i, e_j\right)=\operatorname{cov}\left(y_i, y_j\right)=0 \]

The stronger version of this assumption is that the random errors \(e\) are statistically independent, in which case the values of the dependent variable \(y\) are also statistically independent.
SR5. The variable \(x\) is not random and must take at least two different values.
SR6. (optional) The values of \(e\) are normally distributed about their mean

\[e \sim N\left(0, \sigma^2\right) \]

if the values of \(y\) are normally distributed, and vice versa.

Uncorrelated vs independent

Two random variables \(X\) and \(Y\) are uncorrelated when their correlation coefficient \(\rho\) is zero:

\[\rho(X, Y)=\frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X) \operatorname{Var}(Y)}}=0 . \]

Moreover, having zero correlation coefficient is the same as having zero covariance:

\[\operatorname{Cov}(X, Y)=\mathbb{E}(X Y)-\mathbb{E}(X) \mathbb{E}(Y)=0 \]

which leads to

\[\mathbb{E}(X Y)=\mathbb{E}(X) \mathbb{E}(Y) \]

Definition

If \(\rho(X, Y) \neq 0\), then \(X\) and \(Y\) are correlated.

Definition

Two random variables are (statistically) independent when their joint probability distribution is the product of their marginal probability distributions: for all \(x\) and \(y\),

\[p_{X, Y}(x, y)=p_X(x) p_Y(y) . \]

Equivalently, the conditional distribution is the same as the marginal distribution:

\[p_{Y \mid X}(y \mid x)=p_Y(y) \]
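A standard counterexample makes the distinction concrete: take \(X \sim N(0,1)\) and \(Y = X^2\). Then \(\operatorname{Cov}(X, Y)=\mathbb{E}(X^3)=0\), so \(X\) and \(Y\) are uncorrelated, yet \(Y\) is a deterministic function of \(X\). A quick numpy check (illustrative simulation):

```python
# Sketch: zero covariance does not imply independence.
# With X ~ N(0, 1) and Y = X^2, Cov(X, Y) = E[X^3] = 0,
# yet Y is completely determined by X.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=1_000_000)
y = x ** 2

cov_xy = np.cov(x, y, ddof=1)[0, 1]
print(cov_xy)                     # near 0: uncorrelated

# But X and Y are dependent: the conditional mean of Y given
# |X| > 1 is well above the unconditional mean E[Y] = 1.
print(y[np.abs(x) > 1].mean())
```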

Some Questions

(a) You are given a simple linear regression model

\[y_i=\beta_1+\beta_2 x_i+\varepsilon_i, \quad i=1, \ldots, n, \]

where \(\beta_1\) and \(\beta_2\) are unknown parameters, \(\varepsilon_i\) are the random error terms and \(n\) is the sample size. Let \(b_1\) and \(b_2\) be the best linear unbiased estimators of \(\beta_1\) and \(\beta_2\), respectively. Let \(r_{x y}\) be the correlation coefficient, let \(R^2\) be the coefficient of determination and let \(\hat{y}\) be the fitted values of \(y\).
(i) Prove that \(r_{x y}^2=R^2\). Give appropriate details.
(ii) Prove that \(r_{x y}^2=r_{y \hat{y}}^2\). Give appropriate details.

\[r_{xy} = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^n (x_i - \bar{x})^2}\sqrt{\sum_{i=1}^n (y_i - \bar{y})^2}} \]

The correlation coefficient between \(y\) and \(\hat{y}\) is:

\[r_{y\hat{y}} = \frac{\sum_{i=1}^n (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^n (y_i - \bar{y})^2}\sqrt{\sum_{i=1}^n (\hat{y}_i - \bar{\hat{y}})^2}} \]

Substituting \(\hat{y}_i = b_1 + b_2 x_i\) and simplifying, we get:

\[r_{y\hat{y}} = r_{xy} \]

Therefore, \(r_{xy}^2 = r_{y\hat{y}}^2\).
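These identities can also be checked numerically: on any simple-regression fit, \(r_{xy}^2\), \(R^2\) and \(r_{y\hat{y}}^2\) coincide up to floating-point error. A sketch on simulated data (all parameter values are illustrative):

```python
# Numeric check of (i) and (ii): r_xy^2 = R^2 = r_{y,yhat}^2
# for a simple linear regression fitted by least squares.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(size=200)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=200)

b2 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b1 = y.mean() - b2 * x.mean()
yhat = b1 + b2 * x

r_xy = np.corrcoef(x, y)[0, 1]
r_yyhat = np.corrcoef(y, yhat)[0, 1]
R2 = np.sum((yhat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)

print(r_xy ** 2, R2, r_yyhat ** 2)   # all three agree
```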

exam 19-20 3b

(b) True or False? Explain your answer for the following statements.
(i) When the errors in a regression model have \(\mathrm{AR}(1)\) serial correlation, the ordinary least squares (OLS) standard errors tend to correctly estimate the sampling variation in the estimators.

F

Under \(\mathrm{AR}(1)\) serial correlation the usual OLS standard error formulas are incorrect (typically understated when \(\rho>0\)), so they do not correctly estimate the sampling variation in the estimators. For an AR(1) error \(e_t=\rho e_{t-1}+v_t\),

\[E\left(e_t\right)=0, \quad \operatorname{var}\left(e_t\right)=\sigma_e^2=\frac{\sigma_v^2}{1-\rho^2} \]
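The stationary variance formula can be verified by simulation. A sketch assuming \(e_t=\rho e_{t-1}+v_t\) with illustrative values \(\rho=0.7\), \(\sigma_v=1\):

```python
# Sketch: for a stationary AR(1) error e_t = rho * e_{t-1} + v_t with
# v_t ~ N(0, sigma_v^2), the unconditional variance is
# sigma_v^2 / (1 - rho^2). rho and sigma_v are illustrative.
import numpy as np

rng = np.random.default_rng(3)
rho, sigma_v, T = 0.7, 1.0, 500_000

e = np.empty(T)
e[0] = rng.normal(0, sigma_v / np.sqrt(1 - rho ** 2))  # start in stationarity
v = rng.normal(0, sigma_v, size=T)
for t in range(1, T):
    e[t] = rho * e[t - 1] + v[t]

print(e.var(), sigma_v ** 2 / (1 - rho ** 2))   # both near 1.96
```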

(ii) The weighted least squares method is preferred to OLS when an important variable is omitted from the model.

F

The weighted least squares method addresses heteroskedasticity, not omitted variables: it takes advantage of the known form of the heteroskedasticity to improve parameter estimation.

The Ramsey Regression Equation Specification Error Test (RESET) is
designed to detect omitted relevant variables and an incorrect functional form.

(iii) The OLS estimators are no longer BLUE (best linear unbiased estimators) under the situation of the heteroskedasticity.

T

BLUE:

  • Assumptions 1-5
  • smallest variance
  • unbiased linear estimator

when heteroskedasticity exists,

  • The least squares estimator is still a linear and unbiased estimator, but it is no longer
    best. There is another estimator with a smaller variance.
  • The standard errors usually computed for the least squares estimator are incorrect.
    Confidence intervals and hypothesis tests that use these standard errors may be
    misleading.
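Both points can be illustrated by simulation: with error variance proportional to \(x_i\) (an assumed form, for illustration only), OLS remains unbiased but the WLS slope estimator has smaller sampling variance.

```python
# Sketch: under heteroskedasticity OLS is still unbiased but not best.
# Here var(e_i) = sigma^2 * x_i, so WLS with weights 1/x_i beats OLS
# in sampling variance. All values are illustrative.
import numpy as np

rng = np.random.default_rng(4)
n, reps = 100, 2000
x = rng.uniform(1, 10, size=n)          # fixed across replications

b_ols, b_wls = [], []
for _ in range(reps):
    e = rng.normal(0, np.sqrt(x))       # sd proportional to sqrt(x_i)
    y = 1.0 + 2.0 * x + e
    X = np.column_stack([np.ones(n), x])
    # OLS slope
    b_ols.append(np.linalg.lstsq(X, y, rcond=None)[0][1])
    # WLS: divide each observation by its error sd, then run OLS
    w = 1.0 / np.sqrt(x)
    b_wls.append(np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0][1])

print(np.mean(b_ols), np.mean(b_wls))   # both near 2.0: unbiased
print(np.var(b_ols), np.var(b_wls))     # WLS variance is smaller
```

Dividing each row by the error standard deviation transforms the model back to homoskedastic errors, which is why plain OLS on the transformed data is the efficient estimator here.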

(iv) The adjusted \(R^2\) will not decrease if an additional explanatory variable is introduced into the model.
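This statement is false: \(R^2\) never decreases when a regressor is added, but the adjusted \(R^2\) applies a degrees-of-freedom penalty and can fall when the added variable contributes too little. A numpy sketch on simulated data (illustrative values; the noise regressor \(z\) is unrelated to \(y\) by construction):

```python
# Sketch: adding an irrelevant regressor can never lower R^2,
# but adjusted R^2 penalizes the lost degree of freedom and can fall.
import numpy as np

rng = np.random.default_rng(5)
n = 50
x = rng.uniform(size=n)
z = rng.uniform(size=n)                  # pure noise, unrelated to y
y = 1.0 + 2.0 * x + rng.normal(size=n)

def r2_and_adj(cols, y):
    X = np.column_stack([np.ones(len(y))] + list(cols))
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    sse, sst = resid @ resid, ((y - y.mean()) ** 2).sum()
    k = X.shape[1]
    r2 = 1 - sse / sst
    adj = 1 - (sse / (n - k)) / (sst / (n - 1))
    return r2, adj

r2_small, adj_small = r2_and_adj([x], y)
r2_big, adj_big = r2_and_adj([x, z], y)
print(r2_big >= r2_small)    # True: R^2 never decreases
print(adj_big, adj_small)    # adjusted R^2 may be lower for the bigger model
```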

(v) We impose assumptions on the dependent variable and the random error term in linear regression models using the least squares principle. We do not need to impose assumptions on the explanatory variables since they are random variables.

F

The independent variables \(x_i\) are not random, and must take at least two different values.

(vi) For linear models, it is always appropriate to use \(R^2\) as a measure of how well the estimated regression equation fits the data because it shows the proportion of total variation that is explained by the regression.

F

  • Not always appropriate:
    • when comparing models with the same number of explanatory variables, choosing the one with the highest \(R^2\) is appropriate;
    • the problem is that by adding more and more explanatory variables, \(R^2\) can be made larger and larger, even when the added variables are irrelevant.
  • \(R^2\) shows the proportion of variation in the dependent variable explained by variation in the explanatory variables.

(vii) Interval estimates based on the least squares principle incorporate both the point estimate and the standard error of the estimate, and the sample size as well, so a true parameter is actually certain to be included in such an interval.

F

We can only say that an interval constructed this way covers the true parameter with confidence level \(1-\alpha\): in repeated sampling, a proportion \(1-\alpha\) of such intervals contain the true parameter.

Any particular interval still has probability \(\alpha\) of failing to cover the true parameter, so a true parameter is not certain to be included.
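The coverage interpretation can be checked by simulation: constructing a 95% interval for the slope in repeated samples covers the true \(\beta_2\) about 95% of the time. A sketch with illustrative parameter values:

```python
# Sketch: empirical coverage of a 95% confidence interval for the
# slope in simple linear regression is about 0.95, not 1.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n, reps, beta2 = 30, 2000, 2.0
x = rng.uniform(0, 1, size=n)
tcrit = stats.t.ppf(0.975, df=n - 2)

covered = 0
for _ in range(reps):
    y = 1.0 + beta2 * x + rng.normal(size=n)
    b2 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    b1 = y.mean() - b2 * x.mean()
    resid = y - b1 - b2 * x
    # se(b2) = sqrt(s^2 / sum (x_i - xbar)^2), s^2 = SSE / (n - 2)
    se_b2 = np.sqrt(resid @ resid / (n - 2) / ((n - 1) * np.var(x, ddof=1)))
    if b2 - tcrit * se_b2 <= beta2 <= b2 + tcrit * se_b2:
        covered += 1

print(covered / reps)   # close to 0.95
```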

exam 19-20 6c

(c) Explain with brief reasons whether the following statements are true, false, or uncertain.
(i) In the presence of lagged dependent variables, the Durbin-Watson \(d\) statistic for detecting autocorrelation is practically useless.

T

The Durbin-Watson \(d\) test no longer holds when the equation contains a lagged dependent variable.

(ii) The Durbin \(h\) test is valid in both large and small samples.

F

The Durbin \(h\) test is an asymptotic test: it is justified only in large samples, so it is not valid in small samples.

(iii) There are no differences between the tests of unit roots and tests of cointegration.

F

A test of unit roots, e.g. the Dickey–Fuller test, determines whether a single series is stationary or nonstationary.

A test of cointegration (e.g. the Engle–Granger approach) tests the residuals from the cointegrating regression, i.e. whether the series share a common stochastic trend.

(iv) For a random walk stochastic process, the variance is infinite.

T

e.g. a random walk model

\[y_t=y_{t-1}+v_t \]

and, with \(\operatorname{var}(v_t)=\sigma_v^2\), its variance is \(\operatorname{var}(y_t)=t\sigma_v^2\), which grows without bound as \(t\) increases.
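A quick simulation confirms the growth of the variance. Sketch with \(y_0=0\) and \(\sigma_v=1\) (illustrative values):

```python
# Sketch: for the random walk y_t = y_{t-1} + v_t with y_0 = 0 and
# var(v_t) = sigma_v^2, var(y_t) = t * sigma_v^2 grows without bound.
import numpy as np

rng = np.random.default_rng(7)
reps, T, sigma_v = 5000, 400, 1.0

# simulate many walks at once: cumulative sums of iid shocks
v = rng.normal(0, sigma_v, size=(reps, T))
y = v.cumsum(axis=1)

for t in (100, 200, 400):
    print(t, y[:, t - 1].var())   # roughly t * sigma_v^2
```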

6.6 poe carter 3e

(a) Least squares estimation of \(y_i=\beta_1+\beta_2 x_i+\beta_3 w_i+e_i\) gives \(b_3=0.4979, \operatorname{se}\left(b_3\right)=0.1174\) and \(t=0.4979 / 0.1174=4.24\). This result suggests that \(b_3\) is significantly different from zero and therefore \(w_i\) should be included in the model. Additionally, the RESET test based on the equation \(y_i=\beta_1+\beta_2 x_i+e_i\) gives \(F\)-values of \(17.98\) and \(8.72\) which are much higher than the \(5 \%\) critical values of \(F_{(0.95,1,32)}=4.15\) and \(F_{(0.95,2,31)}=3.30\), respectively. Thus, the model omitting \(w_i\) is inadequate.
(b) Let \(b_2^*\) be the least squares estimator for \(\beta_2\) in the model that omits \(w_i\). The omitted-variable bias is given by

\[E\left(b_2^*\right)-\beta_2=\beta_3 \frac{\widehat{\operatorname{cov}(x, w)}}{\widehat{\operatorname{var}(x)}} \]

Now, \(\widehat{\operatorname{cov}(x, w)}>0\) because \(r_{xw}>0\). Thus, the omitted variable bias will be positive. This result is consistent with what we observe. The estimated coefficient for \(\beta_2\) changes from \(-0.9985\) to \(4.1072\) when \(w_i\) is omitted from the equation.
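The bias formula can be illustrated numerically (the coefficients and the \(x\)–\(w\) dependence below are assumed for illustration, not the values from this exercise):

```python
# Sketch of omitted-variable bias: when w is dropped from
# y = beta1 + beta2*x + beta3*w + e and cov(x, w) > 0 with beta3 > 0,
# the short-regression slope b2* is biased upward by
# beta3 * cov(x, w) / var(x). All values are illustrative.
import numpy as np

rng = np.random.default_rng(8)
n = 200_000
beta2, beta3 = 1.0, 0.5

x = rng.normal(size=n)
w = 0.8 * x + rng.normal(size=n)        # positively correlated with x
y = 2.0 + beta2 * x + beta3 * w + rng.normal(size=n)

b2_star = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # omits w
bias_pred = beta3 * np.cov(x, w, ddof=1)[0, 1] / np.var(x, ddof=1)

print(b2_star - beta2, bias_pred)   # both near 0.4
```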
(c) The high correlation between \(x_i\) and \(w_i\) suggests the existence of collinearity. The observed outcomes that are likely to be a consequence of the collinearity are the sensitivity of the estimates to omitting \(w_i\) (the large omitted variable bias) and the insignificance of \(b_2\) when both variables are included in the equation.

6.10 poe carter 4e

beer.def

  Q   PB   PL   PR   I

  Obs:  30 annual observations from a single household

  1. Q = litres of beer consumed
  2. PB = Price of beer ($)
  3. PL = price of other liquor ($)
  4. PR = price of remaining goods and services (an index)
  5. I = income ($)


    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           Q |        30    56.11333    7.857381       44.3       81.7
          PB |        30        3.08    .6421945       1.78       4.07
          PL |        30    8.367333    .7696347       6.95       9.52
          PR |        30    1.251333     .298314        .67       1.73
           I |        30     32601.8    4541.966      25088      41593

Use the sample data for beer consumption in the file beer.dat to
(a) Estimate the coefficients of the demand relation (6.14) using only sample information. Compare and contrast these results to the restricted coefficient results given in (6.19).
(b) Does collinearity appear to be a problem?
(c) Test the validity of the restriction that implies that demand will not change if prices and income go up in the same proportion.
(d) Use model (6.19) to construct a 95% prediction interval for \(Q\) when \(P B=3.00, P L=10, P R=2.00\), and \(I=50000\). (Hint: Construct the interval for \(\ln (Q)\) and then take antilogs.)
(e) Repeat part (d) using the unconstrained model from part (a). Comment.

solution

6.20 poe carter 4e

rice.def

	firm  year  prod  area  labor  fert
  
  Obs:   a panel with 44 firms over 8 years (1990-1997)
	total observations = 352

  	firm	Firm number  ( 1 to 44)
	year	Year = 1990 to 1997
	prod	Rice production (tonnes)
	area	Area planted to rice (hectares)
	labor	Hired + family labor (person days)
	fert	Fertilizer applied (kilograms)

           
Data source: These data were used by O’Donnell, C.J. and W.E. Griffiths (2006), 
	"Estimating State-Contingent Production Frontiers", American Journal of 
	Agricultural Economics, 88(1), 249-266.             



    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        firm |       352        22.5     12.7165          1         44
        year |       352      1993.5    2.294549       1990       1997
        prod |       352    6.466392    5.076672        .09       31.1
        area |       352    2.117528    1.451403         .2          7
       labor |       352    107.2003     76.6456          8        436
-------------+--------------------------------------------------------
        fert |       352    187.0545    168.5852        3.4     1030.9

Reconsider the production function for rice estimated in Exercise \(5.24\) using data in the file rice.dat:

\[\ln (P R O D)=\beta_1+\beta_2 \ln (\text { AREA })+\beta_3 \ln (\text { LABOR })+\beta_4 \ln (\text { FERT })+e \]

(a) Using a 5% level of significance, test the hypothesis that the elasticity of production with respect to land is equal to the elasticity of production with respect to labor.
(b) Using a \(10 \%\) level of significance, test the hypothesis that the production function exhibits constant returns to scale-that is, \(H_0: \beta_2+\beta_3+\beta_4=1\).
(c) Using a 5% level of significance, jointly test the two hypotheses in parts (a) and (b)-that is, \(H_0: \beta_2=\beta_3\) and \(\beta_2+\beta_3+\beta_4=1\).
(d) Find restricted least squares estimates for each of the restricted models implied by the null hypotheses in parts (a), (b) and (c). Compare the different estimates and their standard errors.

Solution

(a) Testing \(H_0: \beta_2=\beta_3\) against \(H_1: \beta_2 \neq \beta_3\), the calculated \(F\)-value is \(0.342\). We do not reject \(H_0\) because \(0.342<3.868=F_{(0.95,1,348)}\). The \(p\)-value of the test is \(0.559\). The hypothesis that the land and labor elasticities are equal cannot be rejected at a \(5 \%\) significance level.

Using a \(t\)-test, we fail to reject \(H_0\) because \(t=-0.585\) and the critical values are \(t_{(0.025,348)}=-1.967\) and \(t_{(0.975,348)}=1.967\). The \(p\)-value of the test is \(0.559\).
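The agreement of the two tests is not a coincidence: for a single linear restriction, \(F=t^2\) (here \((-0.585)^2 \approx 0.342\)). A numpy sketch on simulated data with an assumed three-regressor design:

```python
# Sketch: with one linear restriction (H0: beta2 = beta3), the F test
# and the t test are equivalent: F = t^2. Illustrated on simulated
# data, testing equality of the first two slopes.
import numpy as np

rng = np.random.default_rng(9)
n = 300
x2, x3, x4 = rng.normal(size=(3, n))
y = 1.0 + 0.5 * x2 + 0.6 * x3 + 0.2 * x4 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x2, x3, x4])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
sse_u = (y - X @ b) @ (y - X @ b)
df = n - X.shape[1]

# restricted model imposing beta2 = beta3: common slope on (x2 + x3)
Xr = np.column_stack([np.ones(n), x2 + x3, x4])
br, *_ = np.linalg.lstsq(Xr, y, rcond=None)
sse_r = (y - Xr @ br) @ (y - Xr @ br)
F = (sse_r - sse_u) / (sse_u / df)    # one restriction: J = 1

# t statistic for the same restriction, beta2 - beta3 = 0
V = (sse_u / df) * np.linalg.inv(X.T @ X)
R = np.array([0.0, 1.0, -1.0, 0.0])
t = (R @ b) / np.sqrt(R @ V @ R)

print(F, t ** 2)   # equal up to floating-point error
```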

9.12* Consider the Okun's Law finite distributed lag model that was estimated in Section \(9.2\) and the data for which appears in okun.dat.
(a) Estimate the following model for \(q=0,1,2,3,4,5\), and 6.

\[D U_t=\alpha+\sum_{s=0}^q \beta_s G_{t-s}+e_t \]

In each case use data from \(t=1986Q4\) to \(t=2009Q3\) to ensure that 92 observations are used for each estimation. Report the values of the AIC and SC selection criteria for each value of \(q\). What lag length would you choose on the basis of the AIC? What lag length would you choose based on the SC?
(b) Using the model that minimizes the AIC:
(i) Find a \(95 \%\) confidence interval for the impact multiplier.
(ii) Test the null hypothesis that the total multiplier equals \(-0.5\) against the alternative that it is greater than \(-0.5\). Use a \(5 \%\) significance level.
(iii) Find a \(95 \%\) confidence interval for the normal growth rate \(G_N\). (Hint: Use your software to get the standard error for \(\hat{G}_N=\hat{\alpha} / \hat{\gamma}\) where \(\hat{\gamma}=-\sum_{s=0}^q b_s\). You can do so by pretending to test a hypothesis such as \(H_0: \alpha / \gamma=1\).)

okun.def

 g  u

  Obs: 98, quarterly (1985Q2 - 2009Q3)  

	g = percentage change in U.S. Gross Domestic Product, seasonally adjusted.

        u = U.S. Civilian Unemployment Rate  (Seasonally adjusted)

	The variable DU used in Chapter 9 is defined as U(t)-U(t-1).

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
           g |        98    1.276531    .6469279       -1.4        2.5
           u |        98    5.704082    1.132638        3.9        9.6


Data Source:  Federal Reserve Bank of St Louis

\(x_1 = 4.105456\), with \(P(X \le 4.105456) = 0.95\) and \(P(X > 4.105456) = 0.05\).

posted @ 2022-12-02 21:38  miyasaka