Deriving the Density Functions of the Three Major Sampling Distributions

I. Deriving the \(\chi^{2}\) density

Let \(Y_{1}, \cdots, Y_{n}\) be independent and identically distributed, each following the standard normal distribution \(N(0,1)\). By definition,

\[X = Y_{1}^{2} + \cdots + Y_{n}^{2} \sim \chi_{n}^{2} \]

Let \(h(x)\) be any nonnegative function such that \(h(X)\) is a random variable. Then

\[E[h(X)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h\left(y_{1}^{2} + \cdots + y_{n}^{2}\right) \left(\frac{1}{\sqrt{2 \pi}}\right)^{n} e^{-\frac{1}{2}\left(y_{1}^{2} + \cdots + y_{n}^{2}\right)} dy_{1} \cdots dy_{n} \]

Change to \(n\)-dimensional spherical coordinates:

\[\begin{cases} y_{1} = r \cos \theta_{1} \\ y_{2} = r \sin \theta_{1} \cos \theta_{2} \\ y_{3} = r \sin \theta_{1} \sin \theta_{2} \cos \theta_{3} \\ \vdots \\ y_{n-1} = r \sin \theta_{1} \cdots \sin \theta_{n-2} \cos \theta_{n-1} \\ y_{n} = r \sin \theta_{1} \cdots \sin \theta_{n-2} \sin \theta_{n-1} \end{cases} \]

where \(0 \leq r < \infty\), \(0 \leq \theta_{i} \leq \pi\) (\(i=1, \cdots, n-2\)), and \(0 \leq \theta_{n-1} \leq 2\pi\).

The Jacobian determinant of this transformation is

\[J = r^{n-1} \sin^{n-2} \theta_{1} \sin^{n-3} \theta_{2} \cdots \sin^{2} \theta_{n-3} \sin \theta_{n-2} \]
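
As a rough numerical sanity check of this closed form, the sketch below treats only the case \(n = 3\), where the formula reduces to \(r^{2} \sin \theta_{1}\). It approximates the partial derivatives by central differences and compares the absolute value of the resulting determinant with the closed form. The function names, the step size `eps`, and the test point are illustrative choices, not part of the original derivation.

```python
import math

def spherical_to_cartesian(r, t1, t2):
    """n = 3 case of the spherical-coordinate map (y1, y2, y3)."""
    return (
        r * math.cos(t1),
        r * math.sin(t1) * math.cos(t2),
        r * math.sin(t1) * math.sin(t2),
    )

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def numeric_jacobian(r, t1, t2, eps=1e-6):
    """Absolute Jacobian determinant, using central differences."""
    point = [r, t1, t2]
    columns = []
    for k in range(3):
        hi = list(point)
        lo = list(point)
        hi[k] += eps
        lo[k] -= eps
        y_hi = spherical_to_cartesian(*hi)
        y_lo = spherical_to_cartesian(*lo)
        # Derivatives of (y1, y2, y3) with respect to the k-th coordinate.
        columns.append([(a - b) / (2 * eps) for a, b in zip(y_hi, y_lo)])
    # Row j holds the derivatives of y_j with respect to (r, theta_1, theta_2).
    rows = [[columns[k][j] for k in range(3)] for j in range(3)]
    return abs(det3(rows))

r, t1, t2 = 1.7, 0.9, 2.3             # arbitrary test point
closed_form = r ** 2 * math.sin(t1)   # r^(n-1) sin^(n-2)(theta_1) with n = 3
numeric = numeric_jacobian(r, t1, t2)
print(numeric, closed_form)
```

The two printed values should agree to several decimal places; this checks one point in one dimension and is not a proof of the general formula.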

By definition,

\[J = \left|\begin{array}{cccc} \frac{\partial y_{1}}{\partial r} & \frac{\partial y_{1}}{\partial \theta_{1}} & \cdots & \frac{\partial y_{1}}{\partial \theta_{n-1}} \\ \vdots & \vdots & & \vdots \\ \frac{\partial y_{n}}{\partial r} & \frac{\partial y_{n}}{\partial \theta_{1}} & \cdots & \frac{\partial y_{n}}{\partial \theta_{n-1}} \end{array}\right|_{+} = r^{n-1} c \]

where \(|\cdot|_{+}\) denotes the absolute value of the determinant.

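This holds because every column from the second through the last contains a common factor \(r\). After these \(n-1\) factors are pulled out, what remains depends only on \(\theta_{1}, \cdots, \theta_{n-1}\) and is denoted by \(c\). Hence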

\[\begin{aligned} E[h(X)] &= \int_{0}^{\infty} dr \int_{0}^{\pi} \cdots \int_{0}^{\pi} d\theta_{1} \cdots d\theta_{n-2} \int_{0}^{2\pi} h\left(r^{2}\right) \left(\frac{1}{\sqrt{2\pi}}\right)^{n} e^{-\frac{1}{2} r^{2}} r^{n-1} c \, d\theta_{n-1} \\ &= c' \int_{0}^{\infty} h\left(r^{2}\right) r^{n-1} e^{-\frac{1}{2} r^{2}} \, dr \end{aligned} \]

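where \(c'\) is a constant: it collects the factor \((2\pi)^{-n/2}\) and the integral of \(c\) over the angles, and it does not depend on \(h\).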

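Substituting \(u = r^{2}\), so that \(r^{n-1} \, dr = \frac{1}{2} u^{\frac{n}{2}-1} \, du\), simplifies this to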

\[E[h(X)] = c'' \int_{0}^{\infty} h(u) u^{\frac{n}{2}-1} e^{-\frac{1}{2} u} du \]

where \(c'' = c'/2\) is again a constant.

To determine \(c''\), take \(h \equiv 1\), so that \(E[h(X)] = 1\). Then, with the substitution \(u = 2y\),

\[1 = c'' \int_{0}^{\infty} u^{\frac{n}{2}-1} e^{-\frac{1}{2} u} du = c'' 2^{\frac{n}{2}} \int_{0}^{\infty} y^{\frac{n}{2}-1} e^{-y} dy = c'' 2^{\frac{n}{2}} \Gamma\left(\frac{n}{2}\right) \]

\[c'' = \left(2^{\frac{n}{2}} \Gamma\left(\frac{n}{2}\right)\right)^{-1} \]

This proves that the density of the \(\chi_{n}^{2}\) distribution is

\[f(x) = \frac{1}{2^{\frac{n}{2}} \Gamma\left(\frac{n}{2}\right)} x^{\frac{n}{2}-1} e^{-\frac{1}{2} x}, \quad x > 0 \]
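
The following sketch is one way to check this density numerically; it is an illustration under stated assumptions, not part of the derivation. It verifies that the density integrates to 1 and compares \(P(X \leq 4)\) computed from the density with a simulation that builds \(X\) directly from the definition \(X = Y_{1}^{2} + \cdots + Y_{n}^{2}\). The choices \(n = 5\), the cutoff 4, the sample size, the seed, and the simple midpoint rule are all arbitrary.

```python
import math
import random

def chi2_pdf(x, n):
    """Density derived above; taken to be zero for x <= 0."""
    if x <= 0:
        return 0.0
    return x ** (n / 2 - 1) * math.exp(-x / 2) / (2 ** (n / 2) * math.gamma(n / 2))

def midpoint_integral(g, a, b, steps=100_000):
    """Midpoint rule on [a, b]."""
    width = (b - a) / steps
    return width * sum(g(a + (k + 0.5) * width) for k in range(steps))

n = 5          # degrees of freedom (arbitrary demo value)
cutoff = 4.0   # compare P(X <= cutoff) two ways

# 1) Normalization: the mass beyond x = 100 is negligible for n = 5.
total = midpoint_integral(lambda x: chi2_pdf(x, n), 0.0, 100.0)

# 2) P(X <= cutoff) from the density ...
cdf_from_density = midpoint_integral(lambda x: chi2_pdf(x, n), 0.0, cutoff)

# ... and from the definition, by summing squared standard normals.
rng = random.Random(0)
samples = 50_000
hits = sum(
    sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) <= cutoff
    for _ in range(samples)
)
cdf_from_simulation = hits / samples

print(total, cdf_from_density, cdf_from_simulation)
```

The first value should be close to 1, and the last two should agree up to Monte Carlo error.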

This formula can also be derived directly by induction, relying mainly on the following facts:

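  1. By Example 1.18, the density of \(\chi_{1}^{2}\) is \(\frac{1}{\sqrt{2\pi}} x^{-1/2} e^{-x/2}\) for \(x > 0\).
  2. If the random variables \(X\) and \(Y\) are independent with densities \(f_1(x)\) and \(f_2(y)\), then \(Z = X + Y\) has density \(f(z) = \int_{-\infty}^{\infty} f_1(z-x) f_2(x) \, dx\).
  3. Induction on \(n\) using the two facts above yields the formula; the inductive step uses a simple property of the beta function.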
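
To make the inductive step concrete: convolving the \(\chi_{1}^{2}\) and \(\chi_{n}^{2}\) densities gives \(c_{1} c_{n} B\left(\frac{1}{2}, \frac{n}{2}\right) z^{\frac{n+1}{2}-1} e^{-z/2}\), where \(c_{k} = \left(2^{k/2} \Gamma(k/2)\right)^{-1}\), so the induction closes if \(c_{1} c_{n} B\left(\frac{1}{2}, \frac{n}{2}\right) = c_{n+1}\). The sketch below checks that identity numerically for \(n = 1, \cdots, 10\). It evaluates the beta function through the standard trigonometric form \(B\left(\frac{1}{2}, \frac{n}{2}\right) = 2 \int_{0}^{\pi/2} \cos^{n-1} \varphi \, d\varphi\) rather than through gamma functions, so the check is not circular. The helper names and step count are illustrative assumptions.

```python
import math

def chi2_const(n):
    """Normalizing constant 1 / (2^(n/2) Gamma(n/2)) of the chi-square(n) density."""
    return 1.0 / (2 ** (n / 2) * math.gamma(n / 2))

def beta_half(n, steps=20_000):
    """B(1/2, n/2) = 2 * integral over [0, pi/2] of cos(phi)^(n-1), midpoint rule."""
    width = (math.pi / 2) / steps
    return 2 * width * sum(
        math.cos((k + 0.5) * width) ** (n - 1) for k in range(steps)
    )

def induction_gap(n):
    """Relative difference between c_1 * c_n * B(1/2, n/2) and c_{n+1}."""
    lhs = chi2_const(1) * chi2_const(n) * beta_half(n)
    rhs = chi2_const(n + 1)
    return abs(lhs - rhs) / rhs

max_gap = max(induction_gap(n) for n in range(1, 11))
print(max_gap)
```

A small printed value indicates that the constants match; the remaining step of the induction (identifying the power of \(z\) and the exponential) is done by hand.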

Interested readers may want to try this. In addition, Chapter 9 of reference [22] gives other derivations of the \(\chi^{2}\) distribution.

II. Deriving the t density

Let \(X \sim N(0,1)\) and \(Y \sim \chi_{n}^{2}\) be independent. Then the random variable

\[T = \frac{\sqrt{n} X}{\sqrt{Y}} \]

follows the t distribution with \(n\) degrees of freedom. By assumption, the joint density of \(X\) and \(Y\) is

\[C_{n} e^{-x^{2}/2} e^{-y/2} y^{n/2-1} \]

where

\[C_{n} = \frac{1}{\sqrt{2\pi} 2^{n/2} \Gamma(n/2)} \]

Let \(h(t)\) be any nonnegative function such that \(h(T)\) is a random variable. Then

\[E[h(T)] = C_{n} \int_{-\infty}^{\infty} dx \int_{0}^{\infty} h\left(\frac{\sqrt{n} x}{\sqrt{y}}\right) e^{-x^{2}/2} e^{-y/2} y^{n/2-1} dy \]

Make the change of variables

\[\left\{ \begin{array}{l} t = \frac{\sqrt{n} x}{\sqrt{y}} \\ y = y \end{array} \right. \]

that is,

\[\left\{ \begin{array}{l} x = \frac{\sqrt{y} t}{\sqrt{n}} \\ y = y \end{array} \right. \]

The Jacobian determinant of this transformation is

\[J = \left|\begin{array}{cc} \frac{\partial x}{\partial t} & \frac{\partial x}{\partial y} \\ \frac{\partial y}{\partial t} & \frac{\partial y}{\partial y} \end{array}\right|_{+} = \left|\begin{array}{cc} \sqrt{y} / \sqrt{n} & \frac{1}{2} t y^{-1/2} / \sqrt{n} \\ 0 & 1 \end{array}\right|_{+} = \sqrt{y / n} \]

Substituting gives

\[E[h(T)] = C_{n} \int_{-\infty}^{\infty} h(t) \, dt \int_{0}^{\infty} e^{-y t^{2} / (2 n)} e^{-y / 2} y^{n / 2-1} \sqrt{y / n} \, dy \]

The inner integral on the right-hand side is

\[\frac{1}{\sqrt{n}} \int_{0}^{\infty} y^{(n-1) / 2} e^{-\frac{y}{2}\left(1+\frac{t^{2}}{n}\right)} dy \]

Setting

\[z = \left(1 + \frac{t^{2}}{n}\right) y \]

the integral, leaving aside the factor \(1/\sqrt{n}\), becomes

\[\begin{aligned} &\int_{0}^{\infty} \left(1 + \frac{t^{2}}{n}\right)^{-(n-1) / 2} z^{(n-1) / 2} e^{-z / 2} \left(1 + \frac{t^{2}}{n}\right)^{-1} dz \\ &\quad= \left(1 + \frac{t^{2}}{n}\right)^{-(n-1) / 2 - 1} \int_{0}^{\infty} z^{(n-1) / 2} e^{-z / 2} \, dz \\ &\quad= \left(1 + \frac{t^{2}}{n}\right)^{-(n+1) / 2} \int_{0}^{\infty} z^{(n+1) / 2 - 1} e^{-z / 2} \, dz \\ &\quad= \left(1 + \frac{t^{2}}{n}\right)^{-(n+1) / 2} \Gamma\left(\frac{n+1}{2}\right) 2^{(n+1) / 2} \end{aligned} \]

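The last step uses the fact that the integrand is, up to a constant, the density of \(\chi^{2}_{n+1}\). Substituting this result into the double-integral expression for \(E[h(T)]\) above (numbered (3.55) in the source) gives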

\[\begin{aligned} E[h(T)] &= C_{n} \left(\frac{1}{\sqrt{n}}\right) \Gamma\left(\frac{n+1}{2}\right) 2^{(n+1) / 2} \int_{-\infty}^{\infty} h(t) \left(1 + \frac{t^{2}}{n}\right)^{-(n+1) / 2} dt \\ &= \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n \pi} \, \Gamma\left(\frac{n}{2}\right)} \int_{-\infty}^{\infty} h(t) \left(1 + \frac{t^{2}}{n}\right)^{-(n+1) / 2} dt \end{aligned} \]

This proves that the density of the t distribution with \(n\) degrees of freedom is

\[f(t) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n \pi} \Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{t^{2}}{n}\right)^{-(n+1) / 2} \]
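
A few quick numerical checks of this result can be sketched as follows, under illustrative assumptions (the helper names, \(n = 5\), and the integration range are arbitrary). The code confirms that the constant \(C_{n} n^{-1/2} \Gamma\left(\frac{n+1}{2}\right) 2^{(n+1)/2}\) equals the simplified constant, that \(n = 1\) reproduces the standard Cauchy density \(\frac{1}{\pi (1 + t^{2})}\), and that the density integrates to approximately 1.

```python
import math

def t_pdf(t, n):
    """t density with n degrees of freedom, as derived above."""
    const = math.gamma((n + 1) / 2) / (math.sqrt(n * math.pi) * math.gamma(n / 2))
    return const * (1 + t * t / n) ** (-(n + 1) / 2)

def c_n(n):
    """C_n from the joint density of X and Y."""
    return 1.0 / (math.sqrt(2 * math.pi) * 2 ** (n / 2) * math.gamma(n / 2))

def midpoint_integral(g, a, b, steps=200_000):
    """Midpoint rule on [a, b]."""
    width = (b - a) / steps
    return width * sum(g(a + (k + 0.5) * width) for k in range(steps))

n = 5  # degrees of freedom (arbitrary demo value)

# Constant before simplification versus the simplified constant.
raw_const = c_n(n) / math.sqrt(n) * math.gamma((n + 1) / 2) * 2 ** ((n + 1) / 2)
simplified_const = t_pdf(0.0, n)  # at t = 0 the density equals its constant

# n = 1 should match the standard Cauchy density at every point.
cauchy_gap = max(
    abs(t_pdf(t, 1) - 1 / (math.pi * (1 + t * t)))
    for t in (-3.0, -0.5, 0.0, 1.0, 4.0)
)

# Normalization for n = 5; the tails beyond |t| = 200 are negligible here.
total = midpoint_integral(lambda t: t_pdf(t, n), -200.0, 200.0)

print(raw_const, simplified_const, cauchy_gap, total)
```

For small \(n\) the tails are heavy, so a truncated integral of this kind would need a much wider range (or a different method) to get close to 1.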

III. Deriving the F density

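Let \(X \sim \chi_{m}^{2}\) and \(Y \sim \chi_{n}^{2}\) be independent. Then \(F = \frac{n}{m} \cdot \frac{X}{Y} = \frac{X / m}{Y / n}\) follows the F distribution with \(m\) and \(n\) degrees of freedom. By assumption, the joint density of \(X\) and \(Y\) is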

\[C_{m, n} x^{\frac{m}{2}-1} y^{\frac{n}{2}-1} e^{-\frac{1}{2}(x+y)} \]

where

\[C_{m, n}^{-1} = 2^{(n+m) / 2} \Gamma(m / 2) \Gamma(n / 2) \]

Let \(h(f)\) be any nonnegative function such that \(h(F)\) is a random variable. Then

\[E[h(F)] = \int_{0}^{\infty} \int_{0}^{\infty} h\left(\frac{n}{m} \frac{x}{y}\right) C_{m, n} x^{\frac{m}{2}-1} y^{\frac{n}{2}-1} e^{-\frac{1}{2}(x+y)} dx dy \]

Make the change of variables

\[\left\{ \begin{array}{l} f = \frac{n}{m} \frac{x}{y} \\ y = y \end{array} \right. \]

that is,

\[\left\{ \begin{array}{l} x = \frac{m}{n} f y \\ y = y \end{array} \right. \]

The Jacobian determinant of this transformation is

\[J = \frac{m}{n} y \]

Hence

\[\begin{aligned} E[h(F)] &= C_{m, n} \int_{0}^{\infty} h(f) \, df \int_{0}^{\infty} \left(\frac{m}{n} f y\right)^{\frac{m}{2}-1} y^{\frac{n}{2}-1} e^{-\frac{y}{2}\left(1 + \frac{m}{n} f\right)} \left(\frac{m}{n} y\right) dy \\ &= C_{m, n} \left(\frac{m}{n}\right)^{\frac{m}{2}} \int_{0}^{\infty} h(f) f^{\frac{m}{2}-1} \, df \int_{0}^{\infty} y^{\frac{m+n}{2}-1} e^{-\frac{y}{2}\left(1 + \frac{m}{n} f\right)} dy \end{aligned} \]

Setting

\[z = \left(1 + \frac{m}{n} f\right) y \]

the inner integral on the right-hand side becomes

\[\int_{0}^{\infty} \left(1 + \frac{m}{n} f\right)^{-\frac{m+n}{2}} z^{\frac{m+n}{2}-1} e^{-z / 2} \, dz = \left(1 + \frac{m}{n} f\right)^{-\frac{m+n}{2}} \Gamma\left(\frac{m+n}{2}\right) 2^{\frac{m+n}{2}} \]

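Therefore, since \(C_{m, n} \Gamma\left(\frac{m+n}{2}\right) 2^{\frac{m+n}{2}} = 1 / B\left(\frac{m}{2}, \frac{n}{2}\right)\),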

\[E[h(F)] = \frac{1}{B\left(\frac{m}{2}, \frac{n}{2}\right)} \left(\frac{m}{n}\right)^{\frac{m}{2}} \int_{0}^{\infty} h(f) f^{\frac{m}{2}-1} \left(1 + \frac{m}{n} f\right)^{-\frac{m+n}{2}} df \]

This proves that the density of the F distribution, written \(p(f)\) here to avoid a clash with the variable \(f\), is

\[p(f) = \frac{\Gamma\left(\frac{m+n}{2}\right)}{\Gamma\left(\frac{m}{2}\right) \Gamma\left(\frac{n}{2}\right)} \left(\frac{m}{n}\right)^{\frac{m}{2}} \frac{f^{\frac{m}{2}-1}}{\left(1 + \frac{m}{n} f\right)^{\frac{m+n}{2}}}, \quad f > 0 \]
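
As a hedged cross-check of this density against the definition \(F = \frac{n}{m} \cdot \frac{X}{Y}\), the sketch below compares \(P(F \leq 1)\) obtained by integrating the density with a simulation that builds \(X\) and \(Y\) as sums of squared standard normals. The degrees of freedom \((4, 6)\), the cutoff, the sample size, the seed, and the helper names are illustrative assumptions.

```python
import math
import random

def f_pdf(f, m, n):
    """F density with (m, n) degrees of freedom, as derived above."""
    if f <= 0:
        return 0.0
    const = math.gamma((m + n) / 2) / (math.gamma(m / 2) * math.gamma(n / 2))
    return (
        const
        * (m / n) ** (m / 2)
        * f ** (m / 2 - 1)
        * (1 + m * f / n) ** (-(m + n) / 2)
    )

def midpoint_integral(g, a, b, steps=20_000):
    """Midpoint rule on [a, b]."""
    width = (b - a) / steps
    return width * sum(g(a + (k + 0.5) * width) for k in range(steps))

def chi2_sample(rng, k):
    """One chi-square(k) draw as a sum of k squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

m, n = 4, 6    # degrees of freedom (arbitrary demo values)
cutoff = 1.0   # compare P(F <= cutoff) two ways

prob_from_density = midpoint_integral(lambda f: f_pdf(f, m, n), 0.0, cutoff)

rng = random.Random(1)
samples = 40_000
hits = 0
for _ in range(samples):
    x = chi2_sample(rng, m)
    y = chi2_sample(rng, n)
    if (n / m) * (x / y) <= cutoff:
        hits += 1
prob_from_simulation = hits / samples

print(prob_from_density, prob_from_simulation)
```

The two probabilities should agree up to Monte Carlo error, which is on the order of a few thousandths at this sample size.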

posted @ 2024-12-10 15:41  redufa