Trip Generation

1 Introduction

1.1 Basic Definitions

Sojourn: A short period of stay in a particular location away from home. It usually has a purpose associated with this stay: work, study, shopping, leisure, etc.
Activity: An endeavour or interest often associated with a purpose as above but not necessarily linked to a fixed location such as home. One could choose to go shopping or to the cinema in different locations.
Tour or Trip Chain: A set of linked sojourns and trips.
- Note: the trip chain concept corresponds better to the idea of travel as derived demand (i.e. it depends strongly on the demand for other activities), but initially it was used mainly by discrete choice modellers in practice. Contemporary models, particularly of the trip frequency type, are more typically interested in tours.
Trip (or Journey): One-way movement from a point of origin to a point of destination. We are usually interested in all vehicular trips.
- Note: walking trips shorter than a certain study-defined threshold (say 300 m or three blocks) have often been ignored, as have trips made by children under 5 years old. However, this emphasis is changing, with greater attention being paid to non-motorized trips and to activity-based approaches in surveys.
Home-based (HB) Trip: This is one where the home of the trip maker is either the origin or the destination of the trip. Note that for visitors from another city their hotel acts as a temporary home in most studies.
Non-home-based (NHB) Trip: This, conversely, is one where neither end of the trip is the home of the trip maker.
\(O_i\) The total number of trips originating from zone \(i\), (\(i = 1, 2, \cdots, n\))
\(D_j\) The total number of trips attracted to zone \(j\), (\(j = 1, 2, \cdots, n\))
\(\big[T_{ij} \big]_{n \times n}\) The directional production-attraction (P-A) or origin-destination (O-D) table (matrix), where \(T_{ij}\) is the number of trips from zone \(i\) to zone \(j\)
- production-attraction (P-A) table / matrix: trips are typically counted per day
- origin-destination (O-D) table / matrix: trips are typically counted per hour
Trip Production Point: Defined as the home end of an HB trip or as the origin of an NHB trip.
Trip Attraction Point: Defined as the non-home end of an HB trip or the destination of an NHB trip.
Trip Generation: Often defined as the total number of trips generated by households in a zone, be they HB or NHB.
- This is what most models would produce, and the task then remains to allocate NHB trips to other zones as trip productions.

1.2 Characterization of Trips

1.2.1 By Purpose

The minimum set of trip purposes in most transportation planning models:
- HB work trips
- HB non-work trips
- NHB trips
In the case of HB trips, a number of categories have been employed:
- travel to work (i.e., HB work trip)
- travel to school or college (i.e., HB education trip)
- shopping trips (i.e., HB shopping trip)
- social and recreational trips
- escort trips (to accompany or collect somebody else)
- other trips associated with home
The first two are usually called compulsory (or mandatory) trips and all the others are called discretionary (or optional) trips.

1.2.2 By Time of Day

Classification
- peak period trips
- off-peak period trips
The proportion of trips by purpose usually varies greatly with the time of day.
This type of classification, although important, gets more complicated when tours rather than trips are of interest, as a complete tour may comprise trips made at several times of the day.

1.2.3 By Household Type

The following categories are usually employed:
- Income level (e.g. three strata: low, middle and high income)
- Car ownership (typically three strata: 0, 1 and 2 or more cars)
- Household size and structure (e.g. six strata in the classical British studies)
Trips are heavily dependent on socioeconomic attributes.
It is important to note that the total number of strata can increase very rapidly, and this may have strong implications in terms of data requirements, model calibration and use.
Example:
- The UK studies employed 108 categories (\(108 = 6 \times 3 \times 6\)):
  - 6 income levels,
  - 3 car ownership levels (0, 1 and 2 or more cars per household),
  - and 6 household structure groupings.
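
The rapid growth in the number of strata can be sketched in a few lines of Python. The stratum labels below are placeholders, not the categories actually used in the British studies; only the 6 × 3 × 6 structure comes from the example above, and the extra four-level variable is a hypothetical addition.

```python
from itertools import product

# Placeholder labels: only the counts (6, 3, 6) come from the example above.
income_levels = [f"income_{k}" for k in range(1, 7)]    # 6 income levels
car_levels = ["0 cars", "1 car", "2+ cars"]             # 3 car ownership levels
structures = [f"structure_{k}" for k in range(1, 7)]    # 6 household structures

# Every household category is one combination of the three strata.
categories = list(product(income_levels, car_levels, structures))
n_categories = len(categories)

# A hypothetical fourth stratifying variable with 4 levels multiplies the count.
n_with_extra = n_categories * 4

print(n_categories)   # 108
print(n_with_extra)   # 432
```

Each added variable multiplies the number of cells, and every cell needs enough surveyed households to yield a reliable rate.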

1.2.4 By Personal Type

The following categories are usually employed:
- Income level (e.g. three strata: low, middle and high income)
- Car ownership (typically three strata: 0, 1 and 2 or more cars)
- Age, gender, marital status, occupation, etc.

2 Factors Affecting Trip Production and Attraction

2.1 Personal Trip Production at Household Level

The following factors have been proposed for consideration in many practical studies focusing on household level:

1. Income
2. Car ownership
3. Household size
4. Household structure
5. Value of land
6. Residential density
7. Accessibility

The first three factors (1 to 3) are typically considered in household travel surveys.

The next three factors (4 to 6) are typically considered in zoning studies for urban planning.

The last factor (7) is rarely considered or quantified, but it offers a way to make trip generation elastic (responsive) to changes in the transport system.

2.2 Personal Trip Production at Individual Level

The following factors have been proposed for consideration in many practical studies focusing on individual level:

Age
Gender
Income
Career (e.g., professionals and executives; managers and technical persons)
Occupation
Employment status (e.g., full-time, part-time and unemployment)
Car ownership

2.3 Personal Trip Attraction

The most widely used factor has been the roofed space available for industrial, commercial, and other services, together with related attributes such as:
- land use types,
- capacity of establishment,
- business sales,
- location.

Another factor used has been zonal employment, such as:
- the number of employment opportunities,
- employment density,
- the number of employees of government agencies,
- enrollment of educational institutions.
Certain studies have also attempted to incorporate an accessibility measure.

2.4 Freight Trip Production and Attraction

Freight trips normally account for few vehicular trips: at most 20% of all journeys in certain areas of industrialized nations, although they can still be significant in terms of their contribution to congestion.
Important variables/factors include:
- Number of employees
- Number of sales
- Roofed area of firm
- Total area of firm

3 Trip Generation Model

3.1 Growth Factor Models

The simplest model to predict the future number of trips of any category (i.e., purpose):

\[T_i = F_i t_i \]

where \(T_i\) and \(t_i\) are the future and current trips in zone \(i\), and \(F_i\) is a growth factor.

The key to the model lies in the estimation of the growth factor \(F_i\); the rest is trivial. Normally the factor is related to variables such as population (\(P\)), income (\(I\)) and car ownership (\(C\)), in a function such as:

\[F_{i} = \frac{ f\left(P_{i}^{d}, I_{i}^{d}, C_{i}^{d}\right) }{ f\left(P_{i}^{c}, I_{i}^{c}, C_{i}^{c}\right) } \]

where \(f(\cdot)\) can even be a direct multiplicative function with no parameters, and the superscripts \(d\) and \(c\) denote the design and current years respectively.
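
A minimal sketch of the calculation follows, assuming the simplest parameter-free multiplicative form \(f(P, I, C) = P \cdot I \cdot C\). The function names and all zone values are hypothetical and only illustrate the arithmetic.

```python
def growth_factor(design, current):
    """F_i = f(design-year values) / f(current-year values).

    Assumes the parameter-free multiplicative form f(P, I, C) = P * I * C.
    """
    def f(population, income, cars_per_household):
        return population * income * cars_per_household

    return f(*design) / f(*current)


def future_trips(current_trips, factor):
    """T_i = F_i * t_i."""
    return factor * current_trips


# Hypothetical zone: (population, income index, cars per household).
current = (10_000, 1.0, 0.5)
design = (12_000, 1.0, 0.75)

F = growth_factor(design, current)   # (12000 * 1.0 * 0.75) / (10000 * 1.0 * 0.5)
T = future_trips(2_000, F)           # current trips t_i = 2000 (hypothetical)
print(F, T)
```

With these numbers the factor is 1.8, so the 2000 current trips grow to about 3600. A purely multiplicative \(f\) compounds the growth in each variable, which is one reason such forecasts can overestimate.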

Limitations:

The average trip rates and growth rates used are assumed to remain constant. (Major flaw for most trip generation models!)

Application scope:

In general, growth factor models are mostly used in practice to predict the future number of external trips to an area. This is because there are not too many of them in the first place (so errors cannot be too large) and also because there are no simple ways to predict them.
In some cases, they are also used, at least as a sense check, for interurban toll road studies.

3.2 Regression Models

3.2.1 Zonal-based Regression Models

Objective: To find a linear relationship between the number of trips produced or attracted in a zone and the average socioeconomic characteristics of households in each zone.

Consideration #1: Meaning of zonal models

Zonal models can only explain the variation in trip making behavior between zones.
For this reason they can only be successful if the inter-zonal variations adequately reflect the real reasons behind trip variability.
For this to happen it would be necessary that zones not only have a homogeneous socioeconomic composition but represent as wide as possible a range of conditions.
Issue
- Main variations in person trip data occur at the intra-zonal level, which cannot be reflected by a zonal model.

Consideration #2: Role of intercept in regression model

One would expect the developed regression model to pass through the origin; however, large intercept values (i.e. in comparison to the product of the average value of any variable and its coefficient) have often been obtained.
If this happens the model may be rejected; if on the contrary, the intercept is not significantly different from zero, it might be informative to re-estimate the line, forcing it to pass through the origin.

Consideration #3: Null zones

It is possible that certain zones do not offer information about certain dependent variables (e.g. there can be no HB trips generated in non-residential zones).
Null zones must be excluded from analysis to prevent overestimation of regression model accuracy.

Consideration #4: Zonal totals vs. zonal means

When formulating the model the analyst appears to have a choice between using aggregate or total variables, such as trips per zone and cars per zone, or rates such as trips per household per zone and cars per household per zone.
In the first case the regression model would be:

\[y_{i}=\beta_{0}+\beta_{1} x_{i 1}+\beta_{2} x_{i 2}+\cdots+\beta_{k} x_{i k}+\varepsilon_{i} \]

whereas the model using rates would be:

\[y_{i}^{\prime}=\beta_{0}+\beta_{1} x_{i 1}^{\prime}+\beta_{2} x_{i 2}^{\prime}+\cdots+\beta_{k} x_{i k}^{\prime}+\varepsilon_{i}^{\prime} \]

with \(y'_i = \dfrac{y_i}{H_i}\), \(x'_{i m} = \dfrac{x_{i m}}{H_i}\) (\(m = 1, \cdots, k\)) and \(\varepsilon'_i =\dfrac{\varepsilon_i}{H_i}\); \(H_i\) is the number of households in zone \(i\).

Note:
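- The constant variance condition of the model cannot hold in both cases, unless \(H_i\) is constant for all zones \(i\).
- In the aggregate (totals) model the magnitude of the error depends on zone size (heteroskedasticity, i.e. non-constant error variance). The aggregate model also tends to yield a higher \(R^2\), largely because zone size itself helps to explain the total number of trips.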
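
The two formulations can be compared with a small sketch, assuming a single explanatory variable (cars) and ordinary least squares. The helper `ols_simple` and the five-zone data set are hypothetical.

```python
def ols_simple(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns (b0, b1, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - ss_res / syy
    return b0, b1, r_squared


# Hypothetical zonal data: households H_i, total cars, total trips.
households = [200, 450, 800, 1500, 3000]
cars = [150, 420, 560, 1650, 2400]
trips = [900, 2300, 3500, 8200, 14000]

# Model with zonal totals: trips_i = b0 + b1 * cars_i
totals_fit = ols_simple(cars, trips)

# Model with zonal rates: trips_i / H_i = b0 + b1 * (cars_i / H_i)
car_rate = [c / h for c, h in zip(cars, households)]
trip_rate = [t / h for t, h in zip(trips, households)]
rates_fit = ols_simple(car_rate, trip_rate)

print("totals: b0=%.2f b1=%.2f R2=%.3f" % totals_fit)
print("rates:  b0=%.2f b1=%.2f R2=%.3f" % rates_fit)
```

In the totals model both variables scale with the number of households, so part of its fit reflects zone size rather than travel behavior; dividing by \(H_i\) removes that common scale.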

3.2.2 Household-based Regression Models

Intra-zonal variation may be reduced by decreasing zone size, especially if zones are homogeneous.
However, smaller zones imply a greater number of them with two consequences:
- more expensive models in terms of data collection, calibration and operation
- larger sampling errors, which are assumed non-existent by the multiple linear regression model
For these reasons it seems logical to develop models which are independent of zone boundaries.
- In a household-based application, each home is taken as an input data vector in order to bring into the model all the range of observed variability about the characteristics of the household and its travel behavior.
- The calibration process may proceed stepwise, testing each potential explanatory variable in turn until the best model is obtained (i.e., stepwise regression model selection method)
- Aggregation is required to determine zonal totals.

3.2.3 Methods to improve regression precision

Dealing with Non-Linearity

It is not easy to detect non-linearity because apparently linear relations may turn out to be non-linear when the presence of other variables is allowed for in the model.
Variables of a qualitative nature usually show non-linear behavior (e.g. type of dwelling, occupation of the head of the household, age, gender).
Two approaches to deal with the nonlinear issue:
1. Transformation of variables to linearize effects
2. Use of dummy variables (or indicator variables) (value of 1 in one category and 0 in all other categories)
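
The dummy-variable approach can be sketched as follows. This version assumes the common convention of treating the first category as the reference level with no column of its own (so \(k\) categories give \(k - 1\) dummies); the function name and the sample data are hypothetical.

```python
def dummy_encode(values, categories):
    """Encode a qualitative variable as 0/1 dummy variables.

    The first category is the reference level and gets no column, so a
    variable with k categories yields k - 1 dummies (this avoids perfect
    collinearity with the intercept).
    """
    columns = categories[1:]
    return [[1 if value == category else 0 for category in columns]
            for value in values]


# Hypothetical car ownership classes observed for five households.
car_classes = ["0", "1", "2+"]
observed = ["0", "2+", "1", "1", "0"]

rows = dummy_encode(observed, car_classes)
for value, row in zip(observed, rows):
    print(value, row)
```

Each dummy receives its own regression coefficient, so the effect of moving from 0 to 1 car need not equal the effect of moving from 1 to 2 or more cars.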

Stepwise Regression Model Selection Approach

Criteria to be considered for model selection:
1. What is the magnitude of \(R^2\)?
2. Do the estimated partial regression coefficients (\(\hat{\beta}_1, \cdots, \hat{\beta}_k\)) have the correct sign, and are their magnitudes reasonable?
3. Are the partial regression coefficients statistically significant?
4. Is the magnitude of the estimated constant (i.e., intercept \(\hat{\beta}_0\)) reasonable?
Procedure:

Step 1: Examine the nature of relationships between the dependent variable and each of independent variables in order to detect nonlinearities. If nonlinearities are detected, the relationship must be linearized by transforming the dependent variable, the independent variable, or both.

Step 2: Develop an inter-correlation matrix involving all the variables (including both the dependent and the independent variables).

Step 3: Examine the inter-correlation matrix in order to detect

(a) Those independent variables which have a statistical association with the dependent variable, and

(b) Potential sources of collinearity between pairs of the independent variables.

Step 4: If any two independent variables are found to be highly correlated, eliminate one of them from the regression model.

Step 5: Do regression analysis with the chosen set of independent variables, and estimate the parameters of each of the potential regression equations.

Step 6: Conduct the relevant tests to assess the goodness of the model based on the logic and statistics.
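
Steps 2 to 4 can be sketched with a plain Pearson correlation matrix. The 0.8 cut-off for "highly correlated", the helper names and the household data are all assumptions made for illustration; the notes do not prescribe a threshold.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(data):
    """Step 2: inter-correlation matrix over all variables (dict of dicts)."""
    names = list(data)
    return {a: {b: pearson(data[a], data[b]) for b in names} for a in names}


def collinear_pairs(matrix, independents, threshold=0.8):
    """Step 3(b): pairs of independent variables with |r| above the threshold."""
    pairs = []
    for i, a in enumerate(independents):
        for b in independents[i + 1:]:
            if abs(matrix[a][b]) > threshold:
                pairs.append((a, b))
    return pairs


# Hypothetical household observations.
data = {
    "trips": [2, 4, 5, 7, 8, 10],
    "household_size": [1, 2, 3, 4, 5, 6],
    "workers": [1, 1, 2, 2, 3, 3],
    "cars": [0, 1, 0, 2, 1, 2],
}
independents = ["household_size", "workers", "cars"]

matrix = correlation_matrix(data)
flagged = collinear_pairs(matrix, independents)
print(flagged)   # [('household_size', 'workers')]
```

Here household size and number of workers are flagged, so under Step 4 one of them would be dropped before estimating the regression.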

3.3 Cross-Classification or Category Analysis

3.3.1 Cross-Classification Trip Generation Models

A cross-classification model for trip production is based on estimating the response (e.g. the number of trip productions per household for a given purpose) as a function of household attributes.

Assumption: Average trip production rates are relatively stable over time for certain household stratifications.

The method finds these rates empirically and for this it typically needs large amounts of data; in fact, a critical element is the number of households in each class.

Variable Definition and Model Specification

Let \(t^{(p)}(h)\) be the average number of trips with purpose \(p\) (at a certain time period) made by members of households of type \(h\).
Household types are defined by the stratification chosen. For example, a cross-classification based on \(m\) household sizes and \(n\) car ownership classes will yield \(m \times n\) types \(h\).
The standard method for estimating these cell rates is to allocate households in the calibration data to the individual cell groupings and total, cell by cell, the observed trips \(T^{(p)}(h)\) by purpose group.
The rate \(t^{(p)}(h)\) is then the total number of trips in cell \(h\), by purpose, divided by the number of households \(H(h)\) in it.

\[t^{(p)}(h) = \frac{T^{(p)}(h)}{H(h)} \]

The 'art' of the method lies in choosing the categories such that the standard deviations of the frequency distributions (\(t^{(p)}(h)\)) are minimized.
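
The cell-rate calculation \(t^{(p)}(h) = T^{(p)}(h) / H(h)\) can be sketched as below for a single purpose. The function name, the cell labels and the survey records are hypothetical.

```python
from collections import defaultdict


def cell_trip_rates(households):
    """Return t(h) = T(h) / H(h) for each household type h.

    `households` is an iterable of (household_type, trips) records for one
    trip purpose, where household_type is any hashable cell label.
    """
    total_trips = defaultdict(float)   # T(h): observed trips per cell
    counts = defaultdict(int)          # H(h): households per cell
    for household_type, trips in households:
        total_trips[household_type] += trips
        counts[household_type] += 1
    return {h: total_trips[h] / counts[h] for h in counts}


# Hypothetical survey records: ((size class, car class), HB work trips).
survey = [
    (("1-2 persons", "0 cars"), 1),
    (("1-2 persons", "0 cars"), 2),
    (("1-2 persons", "1+ cars"), 3),
    (("3+ persons", "0 cars"), 2),
    (("3+ persons", "1+ cars"), 4),
    (("3+ persons", "1+ cars"), 6),
]

rates = cell_trip_rates(survey)
for cell, rate in rates.items():
    print(cell, rate)
```

Cells with only one or two households, as in this toy sample, give unreliable rates, which is why the number of households per class is critical.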

Advantages:
1. Cross-classification groupings are independent of the zoning system of study area.
2. No prior assumptions about the shape of the relationship are required (i.e. they do not even have to be monotonic, let alone linear).
3. Relationships can differ in form from class to class (e.g. the effect of changes in household size for one or two car-owning households may be different).
Disadvantages:
1. The model does not permit extrapolation beyond its calibration strata, although the lowest or highest class of a variable may be open-ended (e.g. households with two or more cars and five or more residents).
2. There are no statistical goodness-of-fit measures for the model, so only aggregate closeness to the calibration data can be ascertained (i.e., calibration method).
3. Unduly large samples are required; otherwise, cell values will vary in reliability because of differences in the numbers of households being available for calibration at each one.
4. There is no effective way to choose among variables for classification, or to choose best groupings of a given variable; the minimization of standard deviations would require an extensive 'trial and error' procedure which may be considered infeasible in practical studies.
5. If it is required to increase the number of stratifying variables, it might be necessary to increase the sample enormously.
Average Trip Rates
- Generic trip rates are developed by government agencies.
- For example, trip rates published by the Urban Redevelopment Authority (URA) for planning purposes or by the US Institute of Transportation Engineers (ITE).

Regression Analysis for Household Strata

A mixture of cross-classification and regression modelling of trip generation may be the most appropriate approach on certain occasions.
Example Cases:
1. In an area where the distribution of income is unequal, it may be important to measure the differential impact of policies on different income groups. It may be necessary to model travel demand for each income group separately throughout the entire modelling process.
2. Assume now that in the same area car ownership is increasing fast and, as usual, it is not clear how correlated these two variables (i.e., income and car ownership) are.
  
  A useful way out may be to postulate regression models based on variables describing the size and make-up of different households, for a stratification according to the two previous variables.

3.3.2 Person-Category Trip Generation Models

Household-based trip generation models cannot capture well the characteristics of an individual, such as age, gender, income, car ownership and trip-making behavior.
The average trip rate for a category of homogeneous persons can be effectively estimated from surveys such as the regular household interview travel survey (HITS) in Singapore.
The classification of person categories in terms of trip rates can be conducted using standard statistical analysis approaches.
Example: the following items yield 4320 person categories (\(4320 = 2 \times 4 \times 3 \times 2 \times 3 \times 2 \times 3 \times 5\)):

(1) Gender: male and female;

(2) Age: 0 to 12, 12 to 18, 18 to 65 and older than 65;

(3) Car ownership: 0, 1, 2+;

(4) Employment status: employed and not employed

(5) Individual income: low, middle and high

(6) Race: white versus non-white

(7) Employment types: white collar, blue collar, and other

(8) Family types: single, childless couple, family with children younger than 5 years of age; family with children 5 to 12 years of age, and family with children older than 12 years of age.
Advantages:

1. A person-level trip generation model is compatible with other components of the classical travel demand modelling system, which is based on trip makers rather than on households.
2. It allows a cross-classification scheme that uses all important variables and yields a manageable number of classes; this in turn allows class representation to be forecast more easily.
3. The sample size required to develop a person-category model can be several times smaller than that required to estimate a household-category model.
4. Demographic changes can be more easily accounted for in a person-category model as, for example, certain key demographic variables (such as age) are virtually impossible to define at household level.
5. Person categories are easier to forecast than household categories, as the latter require forecasts about household formation and family size; these tasks are altogether avoided in the case of person categories, which require only migration and survival rates.

Limitations
- Difficulty of introducing household interaction effects and household money costs and money budgets into a person-based model.

Variable Definition and Model Specification

Let \(t_j\) be the trip rate, that is, the number of trips made during a certain time period by (the average) person in category \(j\); \(t_j^{(p)}\) is the trip rate by purpose \(p\).
\(N_i\) is the number of inhabitants of zone \(i\), and \(\alpha_{ij}\) is the percentage of inhabitants of zone \(i\) belonging to category \(j\).
\(T_i\) is the total number of trips made by the inhabitants of zone \(i\) (all categories together).
We have:

\[T_i = \sum_{j} \Big[ \left(N_i \cdot \alpha_{ij} \right) \cdot t_j \Big] \]
where trips are divided into home-based (HB) and non-home-based (NHB), and can be further divided by purpose (\(p\)) which may apply to both HB and NHB trips.
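
As a minimal sketch of this sum, the function below evaluates \(T_i\) for one zone. It assumes the shares \(\alpha_{ij}\) are expressed as fractions summing to 1 rather than percentages; the category names, shares and trip rates are hypothetical.

```python
def zonal_trips(n_inhabitants, shares, trip_rates):
    """T_i = sum over j of (N_i * alpha_ij) * t_j.

    shares: {category: alpha_ij as a fraction}, trip_rates: {category: t_j}.
    """
    return sum(n_inhabitants * shares[j] * trip_rates[j] for j in shares)


# Hypothetical zone with three person categories.
N_i = 10_000
alpha_i = {"worker": 0.5, "student": 0.2, "other": 0.3}
t = {"worker": 2.5, "student": 2.0, "other": 1.0}   # trips per person per day

T_i = zonal_trips(N_i, alpha_i, t)
print(T_i)   # about 19500 trips: 12500 + 4000 + 3000
```

Forecasting then reduces to forecasting \(N_i\) and the category shares, while the rates \(t_j\) are assumed stable.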
Model Development Steps
1. Consideration of several variables expected to be important to explain differences in personal mobility. Definition of plausible person categories using these variables.
2. Preliminary analysis of trip rates to find out which variables have the least explanatory power and can be excluded from model.
  This is done by comparing the trip rates of categories which are differentiated by the analyzed variable only and testing whether their differences are statistically significant.
3. Detailed analysis of trip characteristics to find variables that define similar categories.
  Variables which do not provide substantial explanation of the data variance, or variables that duplicate the explanation provided by other better variables are excluded. The exercise is conducted under the constraint that the number of final categories should not exceed a certain practical maximum (for example: 15 classes).
For this analysis the following measures may be used:
the coefficient of correlation (\(R_{jk}\)), slope (\(\alpha_{jk}\)) and intercept (\(\beta_{jk}\)) of the regression \(t_{j}^{(p)} = \beta_{jk} + \alpha_{jk} t_{k}^{(p)}\). The categories \(j\) and \(k\) may be treated as similar if these measures satisfy the following conditions (Supernak et al. 1983):

\[\begin{align*} R_{jk} & > 0.900 \\ 0.75 < \alpha_{jk} & < 1.25 \\ \beta_{jk} & < 0.10 \\ \end{align*} \]
These conditions are quite demanding and may be changed.
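
The similarity test can be sketched by regressing the trip rates of category \(j\) on those of category \(k\) across purposes and applying the three thresholds as stated. The function names and the trip rates by purpose are hypothetical.

```python
from math import sqrt


def compare_categories(rates_j, rates_k):
    """Regress t_j^(p) on t_k^(p) across purposes p.

    Returns (R_jk, alpha_jk, beta_jk): correlation, slope and intercept.
    """
    n = len(rates_j)
    mean_j = sum(rates_j) / n
    mean_k = sum(rates_k) / n
    s_jj = sum((tj - mean_j) ** 2 for tj in rates_j)
    s_kk = sum((tk - mean_k) ** 2 for tk in rates_k)
    s_jk = sum((tj - mean_j) * (tk - mean_k)
               for tj, tk in zip(rates_j, rates_k))
    slope = s_jk / s_kk
    intercept = mean_j - slope * mean_k
    r = s_jk / sqrt(s_jj * s_kk)
    return r, slope, intercept


def are_similar(rates_j, rates_k):
    """Apply the three conditions with the thresholds quoted above."""
    r, slope, intercept = compare_categories(rates_j, rates_k)
    return r > 0.900 and 0.75 < slope < 1.25 and intercept < 0.10


# Hypothetical trip rates by purpose (work, shopping, social, other).
cat_a = [0.90, 0.35, 0.20, 0.15]
cat_b = [0.85, 0.40, 0.22, 0.12]
cat_c = [0.10, 0.60, 0.45, 0.50]

print(are_similar(cat_a, cat_b))   # True: the two profiles are nearly identical
print(are_similar(cat_a, cat_c))   # False: the profiles are negatively correlated
```

Categories that pass the test are candidates for merging, which helps keep the final number of classes within the practical maximum.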

3.3.3 Trip Generation and Accessibility

A major disadvantage of the person-category model, shared with the other models above, is that changes to the network are assumed to have no effect on trip productions and attractions.
To solve this issue, modellers have attempted to incorporate a measure of accessibility (i.e. ease or difficulty of making trips to/from each zone) into trip generation models.
The aim is to replace \(O_i^n = f(H_i^n)\) by \(O_i^n = f(H_i^n, A_i^n)\)

where \(H_i^n\) are household characteristics; \(A_i^n\) is a measure of accessibility by person type.

Typical accessibility measures take the general form:

\[A_{i}^{n}=\sum_{j} f\left(E_{j}^{n}, C_{i j}\right) \]

where \(E_{j}^{n}\) is a measure of attraction of zone \(j\) and \(C_{ij}\) the generalised cost of travel between zones \(i\) and \(j\). A typical analytical expression used to this end has been:

\[A_{i}^{n}=\sum_{j} E_{j}^{n} \exp \left(-\beta C_{i j}\right) \]

where \(\beta\) is a calibration parameter from the gravity model.
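
The exponential accessibility measure can be sketched as follows. The attraction values, the cost matrix and \(\beta = 0.1\) are hypothetical; in practice \(\beta\) would come from a calibrated gravity model.

```python
from math import exp


def accessibility(attractions, costs, beta):
    """A_i = sum over j of E_j * exp(-beta * C_ij), for every origin zone i.

    attractions: list of E_j; costs: matrix with costs[i][j] = C_ij.
    """
    return [sum(e_j * exp(-beta * c_ij) for e_j, c_ij in zip(attractions, row))
            for row in costs]


# Hypothetical three-zone system: E_j in jobs, C_ij in generalised minutes.
E = [1000, 500, 2000]
C = [
    [5, 10, 15],
    [10, 5, 10],
    [15, 10, 5],
]

A = accessibility(E, C, beta=0.1)
print([round(a, 1) for a in A])
```

Zone 3 has the largest attraction at the lowest cost, so it obtains the highest accessibility; reducing any \(C_{ij}\) raises \(A_i\), which is how network changes can feed back into trip generation.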

3.4 Balancing Trip Production and Attraction

Some trip generation and attraction models may yield totals that do not match: \(\sum_{i}^{n} O_i \neq \sum_{j}^{n} D_j\)

Normal practice considers that the total number of trips arising from summing all origins \(O_i\) is in fact the correct figure for \(T\); therefore, the trip attractions \(D_j\) of all zones \(j\) are multiplied by a factor \(f\) given by:

\[f = \frac{T}{\sum_{j}^{n} D_j}, \quad \text{where} \quad T = \sum_{i}^{n} O_i \]

Thus, the produced trips \(O_i^{\prime}\) and adjusted attracted trips \(D_j^{\prime}\) are

\[O_i^{\prime} = O_i, \quad D_j^{\prime} = f \cdot D_j \]

This ensures that the adjusted attractions also sum to \(T\), namely,

\[\sum_{j}^{n} D_j' = \sum_{j}^{n} f \cdot D_j = T = \sum_{i}^{n} O_i \]
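
The balancing step can be sketched as below, taking the sum of origins as the control total. The function name and the three-zone figures are hypothetical.

```python
def balance_attractions(origins, destinations):
    """Scale attractions D_j by f = T / sum(D_j), with T = sum(O_i).

    Returns (adjusted attractions D'_j, factor f); origins are left unchanged.
    """
    total = sum(origins)                 # T
    factor = total / sum(destinations)   # f
    return [factor * d for d in destinations], factor


# Hypothetical three-zone example where the totals do not match.
O = [500, 300, 200]   # sum = 1000
D = [400, 250, 150]   # sum = 800

D_adj, f = balance_attractions(O, D)
print(f, D_adj, sum(D_adj))
```

Here \(f = 1.25\), and the adjusted attractions sum to 1000, the same total as the origins.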

4 Bayesian Updating of Trip Generation Model Parameters

Assume we want to estimate a trip generation model but lack funds to collect appropriate survey data; a possible (but inadequate) solution is to use a model estimated for another (hopefully similar) area directly.
However, it would be highly desirable to modify it in order to reflect local conditions more accurately.
This can be done by means of Bayesian techniques for updating the original model parameters using information from a small sample in the application context.
Bayesian updating considers a prior distribution (i.e. that of the original parameters to be updated), new information (i.e. to be obtained from the small sample) and a posterior distribution corresponding to the updated model parameters for the new context.
Updating techniques are very important in a continuous planning framework.

Bayesian updating notation for trip generation

\[\begin{array}{lcc} \hline \text { Variable } & \text { Prior information } & \text { New information } \\ \hline \text { Mean trip rate } & t_{1} & t_{s} \\ \text { No. of observations } & n_{1} & n_{s} \\ \text { Trip rate variance } & S_{1}^{2} & S_{s}^{2} \\ \hline \end{array} \]

The mean and variance of the posterior distribution can be estimated as follows using Bayes' theorem:

\[\begin{align*} t_{2} &= \frac{1 / \sigma_{1}^{2}}{1 / \sigma_{1}^{2}+1 / \sigma_{\mathrm{s}}^{2}} t_{1}+\frac{1 / \sigma_{\mathrm{s}}^{2}}{1 / \sigma_{1}^{2}+1 / \sigma_{\mathrm{s}}^{2}} t_{\mathrm{s}} \\ \sigma_{2}^{2} &= \frac{1}{1 / \sigma_{1}^{2}+1 / \sigma_{\mathrm{s}}^{2}} \end{align*} \]

Substituting \(\sigma_1^2\) and \(\sigma_s^2\) by the known \(\dfrac{S_1^2}{n_1}\) and \(\dfrac{S_s^2}{n_s}\) yields the Bayesian updating formulae:

\[\begin{align*} t_{2} &= \frac{n_{1} S_{\mathrm{s}}^{2} t_{1}+n_{\mathrm{s}} S_{1}^{2} t_{\mathrm{s}}}{n_{1} S_{\mathrm{s}}^{2}+n_{\mathrm{s}} S_{1}^{2}} \\ \sigma_{2}^{2} &= \frac{S_{1}^{2} S_{\mathrm{s}}^{2}}{n_{1} S_{\mathrm{s}}^{2}+n_{\mathrm{s}} S_{1}^{2}} \end{align*} \]
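
The updating formulae translate directly into code. The sketch below uses hypothetical prior and sample values for one household category; the function name is also an assumption.

```python
def bayesian_update(t1, n1, s1_sq, ts, ns, ss_sq):
    """Posterior mean t2 and variance sigma2^2 from prior and sample data.

    t1, n1, s1_sq: prior mean trip rate, observations, trip rate variance.
    ts, ns, ss_sq: the same quantities for the new small sample.
    """
    denom = n1 * ss_sq + ns * s1_sq
    t2 = (n1 * ss_sq * t1 + ns * s1_sq * ts) / denom
    var2 = (s1_sq * ss_sq) / denom
    return t2, var2


# Hypothetical values: a large prior study and a small local sample.
t2, var2 = bayesian_update(t1=8.0, n1=900, s1_sq=9.0,
                           ts=12.0, ns=100, ss_sq=9.0)
print(t2, var2)   # about 8.4 and 0.009
```

With equal variances the posterior mean is a sample-size-weighted average, so the 900 prior observations pull the estimate much closer to 8.0 than to the local rate of 12.0.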

Reference

[1] J. de D. Ortúzar S. and L. G. Willumsen, "4 Trip Generation Modelling", in Modelling Transport, Fourth edition. Chichester, West Sussex, United Kingdom: John Wiley & Sons, 2011, pp. 139-173.

posted @ 2022-02-16 18:42 veager