Probability Distributions for Travel Demand Modelling

0. Probability Distributions

Probability Density Function (PDF):

\[f(x) \]

Cumulative Distribution Function (CDF):

\[F(x) = \Pr(X \leq x) = \int_{-\infty}^{x} f(t) \, \mathrm{d} t \]

Mean:

\[\mu = \mathrm{E}[X] = \int_{-\infty}^{+\infty} x f(x) \, \mathrm{d} x \]

Variance:

\[\sigma^2 = \mathrm{Var}[X] = \int_{-\infty}^{+\infty} (x-\mu)^2 f(x) \, \mathrm{d} x \]

Propositions:

  • \(\mathrm{Var}[X] = \mathrm{E} \left[ (X - \mathrm{E}[X])^2 \right] = \mathrm{E} [X ^2] - (\mathrm{E}[X])^2\)

  • Covariance: \(\mathrm{Cov}[X, Y] = \mathrm{E}[(X - \mathrm{E}[X])(Y - \mathrm{E}[Y])]\)

  • Correlation coefficient: \(\displaystyle \rho_{X, Y} = \frac{\mathrm{Cov}[X, Y]}{\sigma_X \, \sigma_Y}\)
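The propositions above can be checked numerically on simulated data. The following is a minimal sketch using only the Python standard library; the helper names (`mean`, `covariance`, `correlation`) and the synthetic data are assumptions made for illustration, and the sample moments only approximate the population quantities.

```python
import math
import random


def mean(values):
    """Sample mean, standing in for the expectation E[.]."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample version of Cov[X, Y] = E[(X - E[X])(Y - E[Y])]."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])


def correlation(xs, ys):
    """Sample version of rho = Cov[X, Y] / (sigma_X * sigma_Y)."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


# Synthetic data (illustrative only): Y = 2 X + noise, so X and Y are correlated.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

var_definition = covariance(xs, xs)                        # E[(X - E[X])^2]
var_shortcut = mean([x * x for x in xs]) - mean(xs) ** 2   # E[X^2] - (E[X])^2

print(var_definition, var_shortcut)   # the two values agree up to rounding
print(correlation(xs, ys))            # close to 2 / sqrt(5) for this construction
```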

1. Continuous Uniform Distribution

PDF

\[f(x|a,b) = \frac{1}{b-a}, \quad a \leq x \leq b \]

CDF

\[F(x) = \begin{cases} \, 0, & x < a \\ \displaystyle \frac{x-a}{b-a}, & a \leq x \leq b \\ \, 1, & x > b \end{cases} \]

Mean : \(\displaystyle \mu = \mathrm{E}[X] = \frac{a+b}{2}\)

Variance : \(\displaystyle \sigma^2 = \mathrm{Var}[X] = \frac{(b-a)^2}{12}\)
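As a rough check of the mean and variance formulas, the integrals from Section 0 can be evaluated numerically for the uniform PDF. This is only a sketch: the function names, the midpoint rule, and the values \(a = 2\), \(b = 8\) are illustrative assumptions.

```python
def uniform_pdf(x, a, b):
    """PDF of the continuous uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a, b):
    """CDF of the continuous uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


def midpoint_integral(g, lo, hi, steps=100_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))


a, b = 2.0, 8.0  # illustrative bounds
mu = midpoint_integral(lambda x: x * uniform_pdf(x, a, b), a, b)
var = midpoint_integral(lambda x: (x - mu) ** 2 * uniform_pdf(x, a, b), a, b)

print(mu, (a + b) / 2)           # numerical mean vs. (a + b) / 2
print(var, (b - a) ** 2 / 12)    # numerical variance vs. (b - a)^2 / 12
```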

2. Normal Distribution

PDF

\[f(x|\mu,\sigma) = \frac{1}{\sqrt{2 \pi} \sigma} \exp \left[ - \frac{(x - \mu)^2}{2 \sigma^2} \right], \quad - \infty < x < + \infty \]

CDF

\[F(x|\mu,\sigma) = \int_{-\infty}^{x} \frac{1}{\sqrt{2 \pi} \sigma} \exp \left[ - \frac{(t - \mu)^2}{2 \sigma^2} \right] \, \mathrm{d} t, \quad - \infty < x < + \infty \]

Note : If the mean is \(\mu=0\) and the standard deviation is \(\sigma=1\), the random variable \(X\) (then usually denoted \(Z\)) follows the standard normal distribution \(\mathcal{N}(0,1)\), and its CDF is written \(F(z) = \Pr(Z \leq z) = \Phi(z)\).
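The normal CDF has no closed form, but it can be evaluated through the error function, using \(\Phi(z) = \frac{1}{2} \left[ 1 + \mathrm{erf}(z / \sqrt{2}) \right]\) and \(F(x|\mu,\sigma) = \Phi \left( (x-\mu)/\sigma \right)\). A minimal sketch with Python's standard `math.erf`; the function names and the travel-time figures are illustrative assumptions.

```python
import math


def std_normal_cdf(z):
    """Phi(z): CDF of the standard normal distribution N(0, 1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), obtained by standardising x."""
    return std_normal_cdf((x - mu) / sigma)


# Hypothetical example: travel time with mean 30 min and standard deviation 8 min.
# Probability that a trip takes at most 40 min:
print(normal_cdf(40.0, 30.0, 8.0))
```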

3. Gumbel Distribution

Let \(T_1, T_2, \cdots, T_n\) be independent and identically distributed (i.i.d.) random variables whose common distribution has an exponential tail, with CDF \(F_T(t)\). Define the random variable \(X = \max \{T_1, T_2, \cdots, T_n\}\). For very large \(n\), and after suitable shifting and scaling, \(X\) approximately follows the Gumbel (type I extreme value) distribution:

PDF

\[f(x|\eta, \mu) = \mu \cdot \exp [-\mu (x - \eta)] \cdot \exp \{-\exp [-\mu (x - \eta)]\}, \quad -\infty < x < +\infty \]

where \(\eta\) is a location parameter and \(\mu > 0\) is a scale parameter (the spread of the distribution is proportional to \(1/\mu\)).

CDF

\[\begin{align*} F(x|\eta, \mu) &= \Pr(X \leq x) = \Pr(T_1 \leq x, T_2 \leq x, \cdots, T_n \leq x) \\ &= [F_T(x)]^n \\ &\approx \exp \{-\exp [-\mu (x - \eta)]\} \quad \text{for large } n, \quad -\infty < x < +\infty \end{align*} \]

Propositions

  • Mode: \(\eta\)

  • Mean: \(\displaystyle \eta + \frac{\gamma}{\mu}\), where \(\gamma\) is the Euler-Mascheroni constant (\(\gamma \approx 0.577\))

  • Variance: \(\displaystyle \frac{\pi^2}{6\mu^2}\)

  • If \(X \sim \text{Gumbel}(\eta, \mu)\), then:

    \[\alpha X + V \sim \text{Gumbel} \left(\alpha \, \eta + V, \frac{\mu}{\alpha} \right) \]

    where \(V\) is any scalar constant and \(\alpha > 0\).

  • If \(X_1 \sim \text{Gumbel}(\eta_1, \mu)\) and \(X_2 \sim \text{Gumbel}(\eta_2, \mu)\) are independent and share the same scale parameter \(\mu\), then:

    \[\max \{X_1, X_2\} \sim \text{Gumbel} \left( \frac{\ln \left[ \exp(\eta_1 \, \mu) + \exp(\eta_2 \,\mu)\right]}{\mu}, \mu \right) \]

  • As a corollary to proposition 5: if \((X_1, X_2, \cdots, X_J)\) are \(J\) independent Gumbel distributed variables with parameters \((\eta_1, \mu), (\eta_2, \mu), \cdots, (\eta_J, \mu)\), respectively,
    then \(\max \{X_1, X_2, \cdots, X_J\}\) is Gumbel distributed:

    \[\max \{X_1, X_2, \cdots, X_J\} \sim \text{Gumbel} \left(\frac{\ln \left[ \sum_{j=1}^J \exp(\eta_j \, \mu)\right]}{\mu}, \mu \right) \]
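The maximum property can be checked by simulation. The sketch below draws Gumbel variates by inverting the CDF, \(x = \eta - \ln(-\ln u)/\mu\) with \(u\) uniform on \((0,1)\), and compares the sample mean of the maximum with the mean implied by the log-sum-exp location, \(\frac{1}{\mu} \ln \sum_j \exp(\eta_j \mu) + \frac{\gamma}{\mu}\). The function names, the seed, and the parameter values are assumptions chosen for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def sample_gumbel(rng, eta, mu):
    """Draw from Gumbel(eta, mu) by inverting F(x) = exp(-exp(-mu (x - eta)))."""
    u = rng.random()
    while u == 0.0:  # guard against log(0); random() returns values in [0, 1)
        u = rng.random()
    return eta - math.log(-math.log(u)) / mu


def max_location(etas, mu):
    """Location of max{X_1..X_J}: ln(sum_j exp(eta_j * mu)) / mu, computed stably."""
    m = max(mu * e for e in etas)
    return (m + math.log(sum(math.exp(mu * e - m) for e in etas))) / mu


rng = random.Random(42)
etas, mu = [1.0, 2.0, 0.5], 1.5  # illustrative locations and common scale
n = 50_000

sample_mean = sum(
    max(sample_gumbel(rng, e, mu) for e in etas) for _ in range(n)
) / n
predicted_mean = max_location(etas, mu) + EULER_GAMMA / mu

print(sample_mean, predicted_mean)  # the two should be close for large n
```

In discrete choice models the location of the maximum is the familiar "logsum" term, which is why this property matters for travel demand modelling.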

4. Logistic Distribution

Let \(T_1, T_2, \cdots, T_n\) be independent and identically distributed (i.i.d.) random variables whose common distribution has an exponential tail, with CDF \(F_T(t)\), and let \(n\) be very large. Define the random variable \(X\) as the mid-range:

\[\displaystyle X = \frac{\max \{T_1, T_2, \cdots, T_n\} + \min \{T_1, T_2, \cdots, T_n\}}{2} \]

and \(X\) follows the Logistic distribution:

\[X \sim \text{Logistic}(\eta, \mu) \]

PDF:

\[f(x|\eta, \mu) = \frac{\mu \cdot \exp[-\mu (x - \eta)]}{\left\{1+\exp[-\mu(x - \eta)] \right\}^2}, \quad -\infty < x < + \infty \]

where \(\eta\) is a location parameter and \(\mu > 0\) is a scale parameter (the spread of the distribution is proportional to \(1/\mu\)).

CDF:

\[F(x|\eta, \mu) = \frac{1}{1+\exp[-\mu (x - \eta)]}, \quad -\infty < x < + \infty \]

Propositions

  • Mode: \(\eta\)

  • Mean: \(\eta\)

  • Variance: \(\displaystyle \frac{\pi^2}{3\mu^2}\)

  • If \(X \sim \text{Logistic}(\eta, \mu)\), then:

    \[\alpha X + V \sim \text{Logistic} \left(\alpha \, \eta + V, \frac{\mu}{|\alpha|} \right) \]

    where \(V\) is any scalar constant and \(\alpha \neq 0\).

  • If \(X_1 \sim \text{Gumbel}(\eta_1, \mu)\) and \(X_2 \sim \text{Gumbel}(\eta_2, \mu)\) are independent and share the same scale parameter \(\mu\), then:

    \[Y = X_1 - X_2 \sim \text{Logistic} \left(\eta_1 - \eta_2, \mu \right) \]

    Remark : \(X_1 + X_2 \nsim \mathrm {Logistic} (\eta_1 + \eta_2 ,\mu)\)
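The difference property is what yields the binary logit form: since \(Y = X_1 - X_2\) is logistic with location \(\eta_1 - \eta_2\), \(\Pr(X_1 > X_2) = 1 - F_Y(0) = \dfrac{1}{1 + \exp[-\mu(\eta_1 - \eta_2)]}\). The sketch below compares this expression with a simulated frequency; the function names, the seed, and the parameter values are illustrative assumptions.

```python
import math
import random


def sample_gumbel(rng, eta, mu):
    """Draw from Gumbel(eta, mu) by inverting F(x) = exp(-exp(-mu (x - eta)))."""
    u = rng.random()
    while u == 0.0:  # guard against log(0); random() returns values in [0, 1)
        u = rng.random()
    return eta - math.log(-math.log(u)) / mu


def logistic_cdf(x, eta, mu):
    """CDF of Logistic(eta, mu) in the parameterisation used in this section."""
    return 1.0 / (1.0 + math.exp(-mu * (x - eta)))


rng = random.Random(7)
eta_1, eta_2, mu = 1.2, 0.4, 2.0  # illustrative values
n = 50_000

# Simulated frequency of the event X_1 > X_2.
wins = sum(
    sample_gumbel(rng, eta_1, mu) > sample_gumbel(rng, eta_2, mu)
    for _ in range(n)
)
simulated = wins / n

# Pr(Y > 0) with Y ~ Logistic(eta_1 - eta_2, mu).
predicted = 1.0 - logistic_cdf(0.0, eta_1 - eta_2, mu)

print(simulated, predicted)  # the two should be close for large n
```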

5. Bivariate Normal Distribution

Let \(X \sim \mathcal{N} \left(\mu_X, \sigma_X^2 \right)\) and \(Y \sim \mathcal{N} \left(\mu_Y, \sigma_Y^2 \right)\) be jointly normal. \(X\) and \(Y\) need not be independent; their correlation coefficient is \(\rho\), with \(|\rho| < 1\).

PDF :

\[f(x, y| \mu_X, \mu_Y, \sigma_X, \sigma_Y, \rho) = \frac{1}{2 \pi \sigma_{X} \sigma_{Y} \sqrt{1-\rho^{2}}} \exp \left( -\frac{1}{2\left(1-\rho^{2}\right)} \left[ \frac{(x-\mu_{X})^{2}}{\sigma_{X}^{2}} - \frac{2 \rho (x-\mu_{X}) (y-\mu_{Y})}{\sigma_{X} \sigma_{Y}} + \frac{(y-\mu_{Y})^{2}}{\sigma_{Y}^{2}} \right] \right) \]

where \(-\infty < x < +\infty\) and \(-\infty < y < +\infty\)

Propositions :

  • \(\mathrm{E}[X] = \mu_X\) and \(\mathrm{E}[Y] = \mu_Y\)

  • \(\mathrm{Var}[X] = \sigma_X^2\) and \(\mathrm{Var}[Y] = \sigma_Y^2\)

  • \(\mathrm{Cov}[X, Y] = \rho \, \sigma_X \, \sigma_Y\)

6. Multivariate Normal Distribution

Assume that an \(n\)-dimensional random vector \(\boldsymbol{X}\) follows a multivariate normal distribution with mean vector \(\boldsymbol{\mu} \in \mathbb{R}^{n}\) and symmetric positive-definite covariance matrix \(\mathbf{\Sigma} \in \mathbb{R}^{n\times n}\).

The joint PDF is:

\[f(\boldsymbol{x}| \boldsymbol{\mu}, \mathbf{\Sigma}) = \frac{1}{\sqrt {(2\pi )^{n}|{\boldsymbol {\Sigma }}|}} \exp \left[ - {\frac{1}{2}} (\boldsymbol{x} - \boldsymbol{\mu})^{\mathrm{T}} {\boldsymbol {\Sigma }}^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \right], \quad \boldsymbol{x} \in \mathbb{R}^{n} \]

For the bivariate normal distribution (\(n = 2\)):

\[\boldsymbol{\mu} = \begin{bmatrix} \mu_X \\ \mu_Y \end{bmatrix} \qquad \mathbf{\Sigma} = \begin{bmatrix} \sigma_X^2 & \rho \sigma_X \sigma_Y \\ \rho \sigma_X \sigma_Y & \sigma_Y^2 \end{bmatrix} \]
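To see that the bivariate PDF of Section 5 is the \(n = 2\) case of the joint PDF above, both can be evaluated at the same point. In the sketch below the \(2 \times 2\) determinant and inverse of \(\mathbf{\Sigma}\) are written out by hand so that only the standard library is needed; the function names and the numerical values are illustrative assumptions.

```python
import math


def bivariate_pdf(x, y, mu_x, mu_y, s_x, s_y, rho):
    """Bivariate normal PDF written in terms of sigma_X, sigma_Y and rho."""
    zx = (x - mu_x) / s_x
    zy = (y - mu_y) / s_y
    q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho * rho)
    norm = 2.0 * math.pi * s_x * s_y * math.sqrt(1.0 - rho * rho)
    return math.exp(-0.5 * q) / norm


def mvn_pdf_2d(x, y, mu_x, mu_y, s_x, s_y, rho):
    """Multivariate normal PDF for n = 2, built from the covariance matrix."""
    # Sigma = [[a, b], [b, d]]
    a, b, d = s_x ** 2, rho * s_x * s_y, s_y ** 2
    det = a * d - b * b
    dx, dy = x - mu_x, y - mu_y
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), with
    # Sigma^{-1} = (1 / det) * [[d, -b], [-b, a]].
    q = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / math.sqrt((2.0 * math.pi) ** 2 * det)


params = (1.0, -2.0, 1.5, 0.7, 0.4)  # mu_X, mu_Y, sigma_X, sigma_Y, rho (illustrative)
p_bivariate = bivariate_pdf(0.3, -1.1, *params)
p_matrix = mvn_pdf_2d(0.3, -1.1, *params)

print(p_bivariate, p_matrix)  # the two values agree up to rounding
```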

posted @ 2022-03-09 23:29  veager