Spatial Regression
1. Motivation
Spatial heterogeneity: parts of the model vary systematically with geography, so the same relationship can take different values in different places.
Spatial dependence: the value observed at one location is related to the values observed at nearby locations.
2. Models
The models below come from the spreg module of PySAL:
from pysal.model import spreg
2.1 Ordinary Least Squares
# Pass the names by keyword: the third positional argument of spreg.OLS is w
model = spreg.OLS(y, x, name_y=name_y, name_x=name_x, **kwargs)
Parameters:
- y (array): dependent variable
- x (2d array): two-dimensional array with n rows and one column for each independent (exogenous) variable, excluding the constant
- name_y (str): name of the dependent variable, for use in output
- name_x (list of str): names of the independent variables, for use in output
Note: spreg.OLS adds the constant term internally, so, unlike statsmodels, x does not need an extra all-ones column to estimate the intercept.
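To make the note about the constant concrete, here is a minimal NumPy sketch of a least-squares fit that prepends the all-ones column itself. It does not call spreg; the function name `ols_with_constant` and the noise-free data are made up for illustration.

```python
import numpy as np

def ols_with_constant(y, x):
    """Least-squares fit of y on [1, x]; returns [intercept, slope_1, ...]."""
    y = np.asarray(y, dtype=float).reshape(-1)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    # Prepend the all-ones column, so the caller passes x without a constant
    design = np.column_stack([np.ones(len(y)), x])
    betas, *_ = np.linalg.lstsq(design, y, rcond=None)
    return betas

# Hypothetical, noise-free data: y = 2 + 3*x1 - 1*x2
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))
y = 2.0 + 3.0 * x[:, 0] - 1.0 * x[:, 1]
betas = ols_with_constant(y, x)
```

The first entry of `betas` is the intercept, followed by one slope per column of `x`.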
2.2 Spatial Fixed Effects
Spatial Fixed Effects model
\[ y_i = \alpha_r + \sum_k X_{ik} \beta_k + \epsilon_i \]
where
- \(i\) is the index of samples
- \(k\) is the index of explanatory variables
- \(r\) is the index of neighborhoods (regions)
- \(\alpha_r\) is the constant term, which is allowed to vary by neighborhood \(r\)
This is equivalent to representing the neighborhoods with dummy variables.
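The dummy-variable reading of the fixed effects model can be sketched with plain NumPy. This is an illustration under assumptions, not spreg's implementation: `fixed_effects_ols`, the region labels, and the noise-free data are hypothetical. One dummy per region replaces the single constant, so no extra all-ones column is added.

```python
import numpy as np

def fixed_effects_ols(y, x, regions):
    """OLS with one intercept per region, built from region dummies."""
    y = np.asarray(y, dtype=float).reshape(-1)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    labels, codes = np.unique(np.asarray(regions), return_inverse=True)
    # One-hot (dummy) columns: row i has a 1 in the column of its region
    dummies = np.eye(len(labels))[codes.reshape(-1)]
    design = np.column_stack([dummies, x])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    alphas = dict(zip(labels.tolist(), coefs[:len(labels)]))
    return alphas, coefs[len(labels):]

# Hypothetical data: a shared slope of 0.5 and a different constant per region
rng = np.random.default_rng(5)
regions = np.repeat(["a", "b", "c"], 20)
x = rng.normal(size=60)
true_alpha = {"a": 1.0, "b": 2.0, "c": 4.0}
y = np.array([true_alpha[r] for r in regions]) + 0.5 * x
alphas, betas = fixed_effects_ols(y, x, regions)
```

`alphas` maps each region label to its estimated \(\alpha_r\), and `betas` holds the slopes shared by all regions.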
2.3 Spatial Regimes
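Spatial Regimes model
\[ y_i = \alpha_r + \sum_k X_{ik} \beta_{k,r} + \epsilon_i \]
where not only the constant term (\(\alpha_r\)) but also every other parameter (\(\beta_{k,r}\)) is allowed to vary by region.
This is equivalent to grouping the samples by region (\(r\)) and fitting a separate linear regression to each group.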
model = spreg.OLS_Regimes(y, x, regimes,
                          constant_regi="many", cols2regi="all",
                          regime_err_sep=True, **kwargs)
Parameters:
- constant_regi (str in {'one', 'many'}, default='many'): switch controlling how the constant term is set up.
  - 'one': a vector of ones is appended to x and held constant across regimes
  - 'many': a vector of ones is appended to x and considered different per regime (default)
- cols2regi (list or 'all', default='all'): indicates whether each column of x should be considered as different per regime or held constant across regimes.
  - list: k booleans, one per variable (True if one per regime, False to be held constant)
  - 'all' (default): all the variables vary by regime
- regime_err_sep (bool, default=True): if True, a separate regression is run for each regime.
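The "one regression per group" reading of the regimes model, which is the setup described by constant_regi='many', cols2regi='all' and regime_err_sep=True, can be sketched in NumPy as follows. This is only an illustration: `regimes_ols`, the regime labels, and the noise-free data are hypothetical, and the code does not reproduce spreg's output or diagnostics.

```python
import numpy as np

def regimes_ols(y, x, regimes):
    """Fit a separate OLS (constant plus slopes) inside each regime."""
    y = np.asarray(y, dtype=float).reshape(-1)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    regimes = np.asarray(regimes)
    fits = {}
    for r in np.unique(regimes):
        mask = regimes == r
        # Each regime gets its own all-ones column, hence its own constant
        design = np.column_stack([np.ones(mask.sum()), x[mask]])
        coefs, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        fits[r.item()] = coefs  # [alpha_r, beta_1r, ...]
    return fits

# Hypothetical data: both the constant and the slope differ by regime
rng = np.random.default_rng(4)
regimes = np.repeat(["north", "south"], 30)
x = rng.normal(size=60)
y = np.where(regimes == "north", 1.0 + 2.0 * x, -1.0 + 0.5 * x)
fits = regimes_ols(y, x, regimes)
```

`fits` maps each regime label to its own coefficient vector, constant first.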
Chow test
Null hypothesis: the estimates from the different regimes are indistinguishable
# global one that jointly tests for differences between the two regimes
model.chow.joint
# check whether each of the coefficients in our model differs across regimes
model.chow.regi
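For intuition about what a joint test of this kind compares, here is a sketch of the textbook Chow F statistic, which contrasts the residual sum of squares of one pooled regression with that of separate per-regime regressions. This is an assumption-laden illustration: spreg computes its own statistic, and the sketch is not meant to reproduce `model.chow.joint`. The names `rss` and `chow_f` and the simulated data are hypothetical.

```python
import numpy as np

def rss(y, design):
    """Residual sum of squares from a least-squares fit of y on design."""
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coefs
    return float(resid @ resid)

def chow_f(y, x, regimes):
    """Textbook Chow F statistic for equal coefficients across regimes."""
    y = np.asarray(y, dtype=float).reshape(-1)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    regimes = np.asarray(regimes)
    design = np.column_stack([np.ones(len(y)), x])
    k = design.shape[1]  # parameters per regime, constant included
    labels = np.unique(regimes)
    rss_pooled = rss(y, design)
    rss_split = sum(rss(y[regimes == r], design[regimes == r]) for r in labels)
    df_num = k * (len(labels) - 1)
    df_den = len(y) - k * len(labels)
    return ((rss_pooled - rss_split) / df_num) / (rss_split / df_den)

# Hypothetical data: one data set with identical regimes, one with different ones
rng = np.random.default_rng(1)
x = np.tile(np.linspace(0.0, 1.0, 40), 2)
regimes = np.repeat([0, 1], 40)
noise = rng.normal(scale=0.1, size=80)
y_same = 1.0 + 2.0 * x + noise
y_diff = np.where(regimes == 0, 1.0 + 2.0 * x, 3.0 - 1.0 * x) + noise
f_same = chow_f(y_same, x, regimes)
f_diff = chow_f(y_diff, x, regimes)
```

A large statistic, as in `f_diff`, is evidence against the null hypothesis that the regimes share the same coefficients.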
2.4 Exogenous effects: The SLX model
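The SLX model adds the spatial lag of each explanatory variable to the regression:
\[ y_i = \alpha + \sum_{k} X_{ik} \beta_k + \sum_{k} \left( \sum_{j=1}^{N} w_{ij} x_{jk} \right) \gamma_k + \epsilon_i \]
- where \(\sum\limits_{j=1}^{N} w_{ij}x_{jk}\) represents the spatial lag of the \(k\)th explanatory variable.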
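This can be stated in matrix form using the spatial weights matrix \(\mathbf{W}\), as:
\[ \mathbf{y} = \alpha + \mathbf{X}\boldsymbol{\beta} + \mathbf{W}\mathbf{X}\boldsymbol{\gamma} + \boldsymbol{\epsilon} \]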
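This splits the model to focus on two main effects: \(\boldsymbol{\beta}\) and \(\boldsymbol{\gamma}\). The effect \(\boldsymbol{\beta}\) describes the change in \(y_i\) when \(X_{ik}\) changes by one, while \(\boldsymbol{\gamma}\) is the coefficient on the spatially lagged variables, that is, on the values of the explanatory variables among the neighbors of \(i\).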
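The construction of the lagged regressors can be sketched in NumPy: build a row-standardized weights matrix, form \(\mathbf{W}\mathbf{X}\), and regress on both blocks. Everything here is a hypothetical illustration: `line_weights` assumes units arranged on a line with their immediate neighbors as contiguous, and `slx_ols` is not part of spreg.

```python
import numpy as np

def line_weights(n):
    """Row-standardized contiguity weights for n units arranged on a line."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w / w.sum(axis=1, keepdims=True)

def slx_ols(y, x, w):
    """OLS of y on [1, X, WX]; returns (alpha, beta, gamma)."""
    y = np.asarray(y, dtype=float).reshape(-1)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    wx = w @ x  # spatial lag: weighted average of each variable over neighbors
    design = np.column_stack([np.ones(len(y)), x, wx])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    p = x.shape[1]
    return coefs[0], coefs[1:1 + p], coefs[1 + p:]

# Hypothetical, noise-free data: direct effect 2.0, neighbor effect 0.5
rng = np.random.default_rng(2)
n = 100
w = line_weights(n)
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + 0.5 * (w @ x)
alpha, beta, gamma = slx_ols(y, x, w)
```

Because the lagged variables are ordinary exogenous columns, plain least squares is enough here; `beta` holds the direct effects and `gamma` the effects of the neighbors' values.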
2.5 Spatial Error
The spatial error model includes a spatial lag in the error term of the equation:
\[ y_i = \alpha + \sum_k X_{ik} \beta_k + u_i \]
\[ u_i = \lambda u^{\text{lag}}_{i} + \epsilon_i \]
- where \(u^{\text{lag}}_{i} = \sum \limits_j w_{i,j} u_j\)
This specification violates the classical OLS assumption that the errors are independent across observations. Hence, alternative estimation methods are required.
model = spreg.GM_Error_Het(y, x, w, **kwargs)
The \(\lambda\) estimate is the coefficient labeled "lambda" in model.name_x:
import pandas
# Collect coefficients, standard errors and p-values, then keep only the lambda row
pandas.DataFrame({"Coeff.": model.betas.flatten(),
                  "Std. Error": model.std_err.flatten(),
                  "P-Value": [i[1] for i in model.z_stat]},
                 index=model.name_x
                 ).reindex(["lambda"])
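To see what a spatial lag in the error term implies, the error process of this section can be simulated by solving \(u = \lambda \mathbf{W} u + \epsilon\) for \(u\). This is a sketch under assumptions: the line-shaped weights, the value \(\lambda = 0.6\), and the helper `line_weights` are all hypothetical, and no estimation is performed.

```python
import numpy as np

def line_weights(n):
    """Row-standardized contiguity weights for n units arranged on a line."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.default_rng(3)
n, lam = 200, 0.6  # lam plays the role of lambda in the error equation
w = line_weights(n)
eps = rng.normal(size=n)  # independent innovations

# u = lam * W u + eps  is the same as  (I - lam * W) u = eps
u = np.linalg.solve(np.eye(n) - lam * w, eps)
u_lag = w @ u  # spatial lag of the errors

# Correlation between each value and the average of its neighbors
corr_u = np.corrcoef(u, u_lag)[0, 1]
corr_eps = np.corrcoef(eps, w @ eps)[0, 1]
```

The innovations `eps` are independent, but the resulting errors `u` are correlated with those of their neighbors, which is the dependence that plain OLS does not account for.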
2.6 Spatial Lag
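The spatial lag model introduces a spatial lag of the dependent variable as an additional regressor:
\[ Y_i = \alpha + \rho Y^{\text{lag}}_{i} + \sum_k X_{ik} \beta_k + \epsilon_i \]
- where \(Y^{\text{lag}}_{i} = \sum \limits_j w_{i,j} Y_j\) and \(\rho\) is the coefficient on the spatial lag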
This model violates the exogeneity assumption, which is crucial for OLS to work, because the dependent variable appears on both "sides" of the equals sign. Since \(Y_i\) feeds into the lags of its neighbors, whose values in turn make up \(Y^{\text{lag}}_i\), the lag term is correlated with the error and exogeneity is violated.
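For this reason spreg.GM_Lag is estimated by two-stage least squares, using spatial lags of the explanatory variables as instruments for the lagged dependent variable:
model = spreg.GM_Lag(y, x, w, **kwargs)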
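The two-stage idea can be sketched in NumPy. Stage one projects the endogenous lag \(\mathbf{W}\mathbf{y}\) on the instruments (constant, \(\mathbf{X}\), and \(\mathbf{W}\mathbf{X}\)); stage two regresses \(\mathbf{y}\) on the constant, \(\mathbf{X}\), and the fitted lag. This is a bare-bones illustration, not spreg's implementation: `spatial_2sls`, `line_weights`, and the simulated data are hypothetical, and no standard errors are computed.

```python
import numpy as np

def line_weights(n):
    """Row-standardized contiguity weights for n units arranged on a line."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w / w.sum(axis=1, keepdims=True)

def spatial_2sls(y, x, w):
    """Two-stage least squares for the lag model; returns [alpha, betas..., rho]."""
    y = np.asarray(y, dtype=float).reshape(-1)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    ones = np.ones((len(y), 1))
    wy = w @ y  # endogenous spatial lag of the dependent variable
    z = np.hstack([ones, x, w @ x])  # instruments: constant, X, WX
    # Stage 1: replace Wy by its projection on the instruments
    stage1, *_ = np.linalg.lstsq(z, wy, rcond=None)
    wy_hat = z @ stage1
    # Stage 2: regress y on the constant, X, and the fitted lag
    design = np.column_stack([ones, x, wy_hat])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs

# Hypothetical data generated from y = alpha + rho * W y + beta * x + eps
rng = np.random.default_rng(7)
n, alpha, beta, rho = 200, 1.0, 2.0, 0.4
w = line_weights(n)
x = rng.normal(size=n)
eps = rng.normal(scale=0.1, size=n)
y = np.linalg.solve(np.eye(n) - rho * w, alpha + beta * x + eps)
coefs = spatial_2sls(y, x, w)
```

In this sketch the last entry of `coefs` is the estimate of \(\rho\); the ordering is a choice made here, not a statement about spreg's output.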
2.7 Other Models
- Generalized Additive Models
- Spatial Gaussian Process Models or Kriging
References
Rey, Sergio J., et al. 2022, "Spatial Regression", in Geographic Data Science with Python.
