Chapter 6

6.1 Inner Products and Norms

Definition (inner product).

Let V be a vector space over F. An inner product on V is a function that assigns, to every ordered pair of vectors x and y in V, a scalar in F, denoted $⟨x,y⟩$, such that for all x, y, and z in V and all c in F, the following hold:

(a) $⟨x + z,y⟩ = ⟨x,y⟩ + ⟨z,y⟩.$

(b) $⟨cx,y⟩=c⟨x,y⟩.$

(c) $\overline{⟨x, y⟩} = ⟨y, x⟩,$ where the bar denotes complex conjugation.

(d) $⟨x,x⟩>0$ if $x \neq 0$.

Definition (conjugate transpose).

Let $A ∈ M_{m×n}(F)$. We define the conjugate transpose or adjoint of A to be the $n×m$ matrix $A^∗$ such that $(A^∗)_{ij} = \overline{A_{ji}}$ for all $i,j$.

Definition (inner product space).

A vector space $V$ over $F$ endowed with a specific inner product is called an inner product space. If $F = C$, we call V a complex inner product space, whereas if $F = R$, we call $V$ a real inner product space.

Definition of some inner products.

Frobenius inner product: $\langle A, B\rangle=\operatorname{tr}\left(B^{*} A\right)$ for $A, B \in M_{n\times n}(F)$.

Standard inner product on $F^n$: for $x=\left(a_{1}, a_{2}, \ldots, a_{n}\right)$ and $y=\left(b_{1}, b_{2}, \ldots, b_{n}\right)$ in $\mathrm{F}^{n}$, $\langle x, y\rangle=\sum_{i=1}^{n} a_{i} \overline{b_{i}}$.

The space $H$ of continuous complex-valued functions defined on the interval $[0, 2π]$: $\langle f, g\rangle=\frac{1}{2 \pi} \int_{0}^{2 \pi} f(t) \overline{g(t)} d t$.
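As a quick numerical illustration of the standard and Frobenius inner products, here is a minimal Python sketch. The function names `inner` and `frobenius` and the sample vectors are assumptions made for this example only; vectors are plain lists of (possibly complex) numbers and matrices are lists of rows.

```python
def inner(x, y):
    """Standard inner product on F^n: sum of a_i * conj(b_i)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def frobenius(A, B):
    """Frobenius inner product tr(B* A), i.e. the sum of A_ij * conj(B_ij)."""
    return sum(a * complex(b).conjugate()
               for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))


# Sample data (hypothetical, chosen so the arithmetic is easy to follow)
x = [1 + 1j, 2]
y = [1j, 3]

print(inner(x, y))   # <x, y>
print(inner(y, x))   # the complex conjugate of <x, y>, as axiom (c) requires
print(inner(x, x))   # real and positive, as axiom (d) requires
print(frobenius([[1, 2], [3, 4]], [[1, 0], [0, 1]]))  # equals tr(A) when B = I
```

Swapping the two arguments conjugates the result, and `inner(x, x)` has zero imaginary part, which is what makes the norm defined later a real number.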

Theorem 6.1.

Let V be an inner product space. Then for x, y, z ∈ V and c ∈ F , the following statements are true.

(a) $⟨x,y + z⟩$ = $⟨x,y⟩$ + $⟨x,z⟩$.

(b) $⟨x,cy⟩=\overline c⟨x,y⟩$.

(c) $⟨x,0⟩ = ⟨0,x⟩ = 0$.

(d) $⟨x,x⟩=0$ if and only if $x=0$.

(e) If $⟨x,y⟩=⟨x,z⟩$ for all $x∈V$, then $y=z$.

Definition (norm).

Let $V$ be an inner product space. For $x ∈ V$, we define the norm or length of $x$ by $\|x\|= \sqrt{⟨x, x⟩}$.

Theorem 6.2.

Let $V$ be an inner product space over $F$. Then for all $x, y ∈ V$ and $c ∈ F$ , the following statements are true.

(a) $\|cx\|= |c|·\|x\|.$

(b) $\|x\|=0$ if and only if $x=0$. In any case, $\|x\|≥0$.

(c) (Cauchy–Schwarz Inequality) $|⟨x,y⟩|≤\|x\|·\|y\|$.

(d) (Triangle Inequality) $\|x + y\| ≤ \|x\| + \|y\|$.

Proof.

(c)

If $y=0$ the inequality holds trivially, so assume $y \neq 0$. For any $c \in F$, we have

\begin{aligned} 0 \leq\|x-c y\|^{2} &=\langle x-c y, x-c y\rangle=\langle x, x-c y\rangle- c\langle y, x-c y\rangle \\ &=\langle x, x\rangle-\bar{c}\langle x, y\rangle- c\langle y, x\rangle+ c \bar{c}\langle y, y\rangle \end{aligned}

Setting $c=\frac{\langle x, y\rangle}{\langle y, y\rangle}$ gives $0 \leq\langle x, x\rangle-\frac{|\langle x, y\rangle|^{2}}{\langle y, y\rangle}=\|x\|^{2}-\frac{|\langle x, y\rangle|^{2}}{\|y\|^{2}}$, which is the desired inequality.

(d)

\begin{aligned}\|x+y\|^{2} &=\langle x+y, x+y\rangle=\langle x, x\rangle+\langle y, x\rangle+\langle x, y\rangle+\langle y, y\rangle \\ &=\|x\|^{2}+2 \Re\langle x, y\rangle+\|y\|^{2} \\ & \leq\|x\|^{2}+2|\langle x, y\rangle|+\|y\|^{2} \\ & \leq\|x\|^{2}+2\|x\| \cdot\|y\|+\|y\|^{2} \\ &=(\|x\|+\|y\|)^{2} \end{aligned}
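The two inequalities can be sanity-checked numerically. The sketch below is only an illustration on hand-picked vectors in $C^3$ (the helper names `inner` and `norm` and the data are assumptions of this example); it also checks that Cauchy–Schwarz becomes an equality when one vector is a scalar multiple of the other.

```python
import math


def inner(x, y):
    """Standard inner product on C^n."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm(x):
    # <x, x> is real and nonnegative, so take the real part before the root
    return math.sqrt(inner(x, x).real)


x = [1 + 2j, -1j, 3]
y = [2, 1 - 1j, 1j]

# Cauchy-Schwarz: |<x, y>| <= ||x|| * ||y||
lhs_cs = abs(inner(x, y))
rhs_cs = norm(x) * norm(y)
print(lhs_cs <= rhs_cs)

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
s = [a + b for a, b in zip(x, y)]
print(norm(s) <= norm(x) + norm(y))

# Equality in Cauchy-Schwarz when w = c * y for a scalar c
c = 2 - 1j
w = [c * b for b in y]
print(math.isclose(abs(inner(w, y)), norm(w) * norm(y)))
```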

Definition (orthogonal, unit vector, orthonormal).

Let $V$ be an inner product space. Vectors $x$ and $y$ in $V$ are orthogonal (perpendicular) if $⟨x, y⟩ = 0$.

A subset $S$ of $V$ is orthogonal if any two distinct vectors in $S$ are orthogonal.

A vector $x$ in $V$ is a unit vector if $\|x\| = 1$.

Finally, a subset $S$ of $V$ is orthonormal if $S$ is orthogonal and consists entirely of unit vectors.

6.2 The Gram–Schmidt Process and Orthogonal Complements

Definition (orthonormal basis).

Let V be an inner product space. A subset of V is an orthonormal basis for V if it is an ordered basis that is orthonormal.

Theorem 6.3.

Let V be an inner product space and $S = \{v_1, v_2, \ldots, v_k\}$ be an orthogonal subset of V consisting of nonzero vectors. If $y ∈ span(S)$, then

$y=\sum_{i=1}^{k} \frac{\left\langle y, v_{i}\right\rangle}{\left\|v_{i}\right\|^{2}} v_{i}$

Corollary 1.

If, in addition to the hypotheses of Theorem 6.3, S is orthonormal and y ∈ span(S), then

$y=\sum_{i=1}^{k} \left\langle y, v_{i}\right\rangle v_{i}$

Corollary 2.

Let V be an inner product space, and let S be an orthogonal subset of V consisting of nonzero vectors. Then S is linearly independent.

Theorem 6.4. (the Gram–Schmidt process)

Let V be an inner product space and $S = \{w_1, w_2, \ldots, w_n\}$ be a linearly independent subset of V. Define $S′ = \{v_1, v_2, \ldots, v_n\}$, where $v_1 = w_1$ and

$v_{k}=w_{k}-\sum_{j=1}^{k-1} \frac{\left\langle w_{k}, v_{j}\right\rangle}{\left\|v_{j}\right\|^{2}} v_{j} \quad \text { for } 2 \leq k \leq n.$

Then S′ is an orthogonal set of nonzero vectors such that span(S′) = span(S).
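The recursion in Theorem 6.4 translates almost line by line into code. The following is a minimal sketch using the standard inner product on $F^n$; the name `gram_schmidt` and the sample set are assumptions of this example, and the input is assumed to be linearly independent (otherwise some $\|v_j\|^2$ is zero and the division fails).

```python
def inner(x, y):
    """Standard inner product on F^n."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def gram_schmidt(ws):
    """Return the orthogonal vectors v_1, ..., v_n of Theorem 6.4 (not normalized)."""
    vs = []
    for w in ws:
        v = list(w)
        for u in vs:
            coeff = inner(w, u) / inner(u, u)   # <w_k, v_j> / ||v_j||^2
            v = [vi - coeff * ui for vi, ui in zip(v, u)]
        vs.append(v)
    return vs


# A linearly independent subset of R^4 (sample data)
S = [[1, 0, 1, 0], [1, 1, 1, 1], [0, 1, 2, 1]]
vs = gram_schmidt(S)
for v in vs:
    print(v)
```

For this input the result is $v_1 = (1,0,1,0)$, $v_2 = (0,1,0,1)$, $v_3 = (-1,0,1,0)$, printed as complex numbers. Dividing each $v_k$ by $\|v_k\|$ afterwards would give an orthonormal set.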

Theorem 6.5.

Let V be a nonzero finite-dimensional inner product space. Then V has an orthonormal basis $\beta$. Furthermore, if $\beta = \{v_1,v_2,...,v_n\}$ and x ∈ V, then

$x=\sum_{i=1}^{n}\left\langle x, v_{i}\right\rangle v_{i}$

Corollary.

Let V be a finite-dimensional inner product space with an orthonormal basis $\beta = \{v_1, v_2, \ldots, v_n\}$. Let T be a linear operator on V, and let A = $[T]_\beta$. Then for any i and j, $A_{ij} = \langle T(v_j),v_i\rangle.$

Definition (Fourier coefficients).

Let $\beta$ be an orthonormal subset (possibly infinite) of an inner product space V, and let $x ∈ V$. We define the Fourier coefficients of $x$ relative to $\beta$ to be the scalars $⟨x, y⟩$, where $y ∈ β$.

Definition (orthogonal complement).

Let S be a nonempty subset of an inner product space V. We define $S^\perp$ to be the set of all vectors in V that are orthogonal to every vector in S; that is, $S^{\perp}=\{x \in V:\langle x, y\rangle= 0 \text { for all } y \in S\}$. The set $S^\perp$ is called the orthogonal complement of S.

Notes

• $S$ can be any nonempty subset of $V$; it need not be a subspace.

• If $0 \in S$, then $S\cap S^\perp = \{0\}$; otherwise $S\cap S^\perp = \varnothing$.

Theorem 6.6.

Let $W$ be a finite-dimensional subspace of an inner product space $V$, and let $y∈V$. Then there exist unique vectors $u∈W$ and $z\in W^\perp$ such that $y=u+z$. Furthermore, if $\{v_1,v_2,\ldots,v_k\}$ is an orthonormal basis for $W$, then

$u=\sum_{i=1}^{k}\left\langle y, v_{i}\right\rangle v_{i}$

To see that $z = y - u$ lies in $W^\perp$, note that for each $j$,

\begin{aligned}\left\langle z, v_{j}\right\rangle &=\left\langle\left(y-\sum_{i=1}^{k}\left\langle y, v_{i}\right\rangle v_{i}\right), v_{j}\right\rangle=\left\langle y, v_{j}\right\rangle-\sum_{i=1}^{k}\left\langle y, v_{i}\right\rangle\left\langle v_{i}, v_{j}\right\rangle \\ &=\left\langle y, v_{j}\right\rangle-\left\langle y, v_{j}\right\rangle= 0 \end{aligned}

Corollary (orthogonal projection).

The vector $u = \sum_{i=1}^{k}\left\langle y, v_{i}\right\rangle v_{i}$ is the unique vector in $W$ that is “closest” to $y$; that is, for any $x ∈ W$, $\|y − x\| ≥ \|y − u\|$, and this inequality is an equality if and only if $x = u$. The vector $u$ is called the orthogonal projection of $y$ on $W$.

Indeed, write $y = u + z$ with $z \in W^\perp$. Since $u - x \in W$ is orthogonal to $z$,

$\|y-x\|^2 = \|u + z - x\|^2 = \|u-x\|^2 + \|z\|^2 \ge \|z\|^2 = \|y-u\|^2.$
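The formula for the orthogonal projection is easy to try out. In the sketch below, `project` is a hypothetical helper written for this example, and the orthonormal basis of the plane $W \subset R^3$ is chosen by hand; the code computes $u$, forms $z = y - u$, and checks that $z$ is orthogonal to $W$ up to rounding.

```python
import math


def inner(x, y):
    """Standard inner product on F^n."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def project(y, onb):
    """Orthogonal projection of y on span(onb); onb is assumed orthonormal."""
    u = [0j] * len(y)
    for v in onb:
        c = inner(y, v)                       # Fourier coefficient <y, v_i>
        u = [ui + c * vi for ui, vi in zip(u, v)]
    return u


r = 1 / math.sqrt(2)
onb = [[r, r, 0], [0, 0, 1]]                  # orthonormal basis of a plane W in R^3
y = [3, 1, 4]

u = project(y, onb)                           # approximately (2, 2, 4)
z = [a - b for a, b in zip(y, u)]             # approximately (1, -1, 0)
print(u)
print(z)
print(all(abs(inner(z, v)) < 1e-12 for v in onb))   # z is orthogonal to W
```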

Theorem 6.7.

Suppose that $S=\left\{v_{1}, v_{2}, \ldots, v_{k}\right\}$ is an orthonormal set in an n-dimensional inner product space $V$. Then

(a) S can be extended to an orthonormal basis $\{v_1, v_2, \ldots, v_k, v_{k+1}, \ldots, v_n\}$ for $V$.

(b) If $W = span(S)$, then $S_1 = \{v_{k+1}, v_{k+2}, \ldots, v_n\}$ is an orthonormal basis for $W^\perp$.

(c) If $W$ is any subspace of $V$, then $dim(V) = dim(W) + dim(W^\perp)$.

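(a) First extend $S$ to a basis for $V$, then apply the Gram–Schmidt process and normalize.

(b) Clearly $S_1 \subseteq W^\perp$, so it suffices to show $span(S_1) = W^\perp$. For any $x = \sum_{i = 1}^{n}a_iv_i \in W^\perp$, we have $a_i = \langle x, v_i\rangle = 0$ for $1 \le i \le k$, so $x = \sum_{i = k + 1}^{n}a_iv_i \in span(S_1).$

(c) This follows from (a) and (b), applied to an orthonormal basis for $W$.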

6.3 The Adjoint of a Linear Operator

Theorem 6.8.

Let $V$ be a finite-dimensional inner product space over $F$, and let $g: V → F$ be a linear transformation. Then there exists a unique vector $y ∈ V$ such that $g(x) = ⟨x, y⟩$ for all $x ∈ V$.

If $\beta=\left\{v_{1}, v_{2}, \dots, v_{n}\right\}$ is an orthonormal basis for V, then

$y = \sum_{i=1}^n\overline{g(v_i)}v_i.$
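The formula for $y$ can be checked on a concrete functional. In this sketch the functional `g` on $C^2$ and the test vector are made up for illustration; the code builds $y$ from the standard orthonormal basis and compares $g(x)$ with $⟨x, y⟩$.

```python
def inner(x, y):
    """Standard inner product on C^n."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def g(x):
    """A sample linear functional on C^2 (chosen only for illustration)."""
    return (2 + 1j) * x[0] - 3j * x[1]


basis = [[1, 0], [0, 1]]                      # standard orthonormal basis of C^2

# y = sum over i of conj(g(v_i)) * v_i
y = [0j, 0j]
for v in basis:
    c = complex(g(v)).conjugate()
    y = [yi + c * vi for yi, vi in zip(y, v)]

x = [1 - 2j, 4 + 1j]
print(y)
print(g(x), inner(x, y))                      # the two values agree
```

Note the conjugate on $g(v_i)$: it compensates for the inner product being conjugate-linear in its second argument.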

Theorem 6.8 lays the groundwork for the definition of $T^*$: only once the uniqueness of $y$ has been established does the assignment $y \mapsto T^*(y)$ give a well-defined map.

Theorem 6.9.

Let $V$ be a finite-dimensional inner product space, and let $T$ be a linear operator on $V$. Then there exists a unique function $T^*: V → V$ such that $⟨T(x), y⟩ = ⟨x, T^*(y)⟩$ for all $x, y ∈ V$. Furthermore, $T^*$ is linear. $T^*$ is called the adjoint of $T$.

Theorem 6.10.

Let V be a finite-dimensional inner product space, and let $β$ be an orthonormal basis for $V$. If $T$ is a linear operator on $V$, then $\left[\mathrm{T}^{*}\right]_{\beta}=[\mathrm{T}]_{\beta}^{*}.$

Corollary.

Let $A$ be an $n × n$ matrix. Then $L_{A^*} = (L_A)^*$.

Theorem 6.11.

Let $V$ be an inner product space, and let $T$ and $U$ be linear operators on $V$. Then

$\begin{array}{l}{\text { (a) }(\mathrm{T}+\mathrm{U})^{*}=\mathrm{T}^{*}+\mathrm{U}^{*}} \\ {\text { (b) }(c \mathrm{T})^{*}=\bar{c} \mathrm{T}^{*} \text { for any } c \in F} \\ {\text { (c) }(\mathrm{TU})^{*}=\mathrm{U}^{*} \mathrm{T}^{*} ;} \\ {\text { (d) } \mathrm{T}^{* *}=\mathrm{T}} \\ {\text { (e) } \mathrm{I}^{*}=\mathrm{I}}\end{array}$

Corollary.

Let $A$ and $B$ be $n × n$ matrices. Then

$\begin{array}{l}{\text { (a) }(A+B)^{*}=A^{*}+B^{*}} \\ {\text { (b) }(c A)^{*}=\bar{c} A^{*} \text { for all } c \in F} \\ {\text { (c) }(A B)^{*}=B^{*} A^{*}} \\ {\text { (d) } A^{* *}=A} \\ {\text { (e) } I^{*}=I}\end{array}$

Lemma 1.

Let $A \in \mathbb{M}_{m \times n}(F), x \in F^{n},$ and $y \in F^{m}$. Then

$\langle A x, y\rangle_{m}=\left\langle x, A^{*} y\right\rangle_{n}.$
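Lemma 1 can be verified directly on a small rectangular matrix. The sketch below uses hand-written helpers (`adjoint`, `matvec`, `inner` are names assumed for this example) and a made-up $2 \times 3$ complex matrix; the first inner product is taken in $F^2$ and the second in $F^3$.

```python
def inner(x, y):
    """Standard inner product on F^n."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def adjoint(A):
    """Conjugate transpose: (A*)_ij = conj(A_ji)."""
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]


def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


A = [[1, 1j, 0],
     [2, 0, 1 - 1j]]            # 2 x 3, so L_A maps F^3 to F^2
x = [1, 2, 3j]                  # vector in F^3
y = [1j, 1]                     # vector in F^2

lhs = inner(matvec(A, x), y)            # <Ax, y> in F^2
rhs = inner(x, matvec(adjoint(A), y))   # <x, A*y> in F^3
print(lhs, rhs)
```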

Lemma 2.

Let $A \in \mathbb{M}_{m \times n}(F)$. Then $rank(A^*A) = rank(A)$.

Corollary.

If $A$ is an $m \times n$ matrix such that $rank(A) = n$, then $A^*A$ is invertible.

Theorem 6.12 (Least Squares Approximation).

Let $A ∈ M_{m×n} (F)$ and $y ∈ F^m$ . Then there exists $x_0 ∈ F^n$ such that $(A^*A)x_0 = A^*y$ and $∥Ax_0 −y∥ ≤ ∥Ax−y∥$ for all $x ∈ F^n$. Furthermore, if $rank(A) = n$, then $x_0 = (A^*A)^{−1}A^*y$.

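Every $Ax$ lies in $R(A)$, and $R(A)$ contains a unique vector closest to $y$; write it as $Ax_0$. This $x_0$ is the vector we want. By Theorem 6.6, $Ax_0 - y \in R(A)^\perp.$ We now determine $R(A)^\perp$. If $z \in R(A)^\perp$, then for all $x \in F^n$, $\langle A^*z, x \rangle = \langle z, Ax \rangle = 0$. Since $x$ is arbitrary, $A^*z = 0$, that is, $z \in N(A^*).$ Conversely, if $z \in N(A^*)$ then the same identity shows $z \in R(A)^\perp.$ Hence $R(A)^\perp = N(A^*).$ Because $Ax_0 - y \in R(A)^\perp = N(A^*)$, we get $A^*(Ax_0 - y) = 0$, that is, $(A^*A)x_0 = A^*y$. If $rank(A) = n$, then $A^*A$ is invertible and $x_0 = (A^*A)^{−1}A^*y.$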
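To make the normal equations $(A^*A)x_0 = A^*y$ concrete, here is a small sketch that fits a line $y = ct + d$ to four sample points. Everything here is an assumption of the example: the data are real (so $A^*$ is just the transpose), the helper `solve` is a bare-bones Gauss–Jordan routine over exact fractions, and $rank(A) = 2$ so $A^*A$ is invertible.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination; M is assumed invertible."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


# Fit y = c*t + d to sample points; each row of A is (t, 1).
points = [(1, 2), (2, 3), (3, 5), (4, 7)]
A = [[t, 1] for t, _ in points]
y = [v for _, v in points]

At = [list(col) for col in zip(*A)]                              # A* (real case)
AtA = [[sum(a * b for a, b in zip(r1, r2)) for r2 in At] for r1 in At]
Aty = [sum(a * b for a, b in zip(row, y)) for row in At]
x0 = solve(AtA, Aty)                                             # (A*A) x0 = A*y
print(x0)
```

For these points the least squares line is $y = 1.7t$, that is, $x_0 = (17/10, 0)$.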

Theorem 6.13 (Minimal Solution to Systems of Linear Equations).

A solution s to $Ax = b$ is called a minimal solution if $∥s∥ ≤ ∥u∥$ for all other solutions $u$.

Let $A \in \mathbb{M}_{m \times n}(F)$ and $b ∈ F^m$. Suppose that $Ax = b$ is consistent. Then the following statements are true.

(a) There exists exactly one minimal solution $s$ of $Ax = b$, and $s ∈ R(L_{A^*})$.

(b) The vector s is the only solution to $Ax = b$ that lies in $R(L_{A^*})$; that is, if u satisfies $\left(A A^{*}\right) u=b$, then $s = A^*u$.
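Part (b) gives a recipe: solve $(AA^*)u = b$ and set $s = A^*u$. The sketch below follows that recipe on a made-up real $2 \times 3$ system (so $A^*$ is the transpose), with a hypothetical `solve` helper over exact fractions, and compares $\|s\|^2$ with the squared norm of another solution. In this example $AA^*$ happens to be invertible; in general it need not be, and any solution $u$ works.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve M x = rhs by Gauss-Jordan elimination; M is assumed invertible."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def norm_sq(v):
    """Squared norm of a real vector."""
    return sum(c * c for c in v)


A = [[1, 1, 0],
     [0, 1, 1]]                 # real entries, so A* is the transpose
b = [1, 1]

At = [list(col) for col in zip(*A)]
AAt = [[sum(a * c for a, c in zip(r1, r2)) for r2 in A] for r1 in A]   # A A*
u = solve(AAt, b)
s = [sum(a * c for a, c in zip(row, u)) for row in At]                 # s = A* u

other = [1, 0, 1]               # another solution of A x = b, for comparison
print(s, norm_sq(s), norm_sq(other))
```

Here $s = (1/3, 2/3, 1/3)$ with $\|s\|^2 = 2/3$, while the solution $(1, 0, 1)$ has squared norm 2.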

6.4 Normal and Self-Adjoint Operators

Lemma.

Let $T$ be a linear operator on a finite-dimensional inner product space $V$. If $T$ has an eigenvector, then so does $T^*$.

Theorem 6.14 (Schur).

Let $T$ be a linear operator on a finite-dimensional inner product space $V$. Suppose that the characteristic polynomial of $T$ splits. Then there exists an orthonormal basis $β$ for $V$ such that the matrix $[T]_β$ is upper triangular.

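Proof sketch. Induct on $n = dim(V)$; the case $n = 1$ is trivial. Since the characteristic polynomial of $T$ splits, $T$ has an eigenvector, so by the lemma $T^*$ has a unit eigenvector $z$, say $T^*(z) = \lambda z$. Let $W = span(\{z\})$. For any $y \in W^\perp$ and $x = cz \in W$ with $c \in F$, $\langle T(y), x \rangle = \langle T(y), cz \rangle = \langle y, T^*(cz) \rangle = \langle y, c\lambda z \rangle = \overline{c\lambda}\langle y, z \rangle = 0$, so $T(y) \in W^\perp.$ Hence $W^\perp$ is $T$-invariant and the restriction $T_{W^\perp}$ is defined. Since $dim(W^\perp) = n - 1$ and the characteristic polynomial of $T_{W^\perp}$ divides that of $T$ (so it also splits), the induction hypothesis gives an orthonormal basis $\gamma$ for $W^\perp$ such that $[T_{W^\perp}]_\gamma$ is upper triangular. Clearly $z$ is orthogonal to every vector in $\gamma$. Let $\beta = \gamma \cup \{z\}$ with $z$ listed last; then $\beta$ is an orthonormal basis for $V$ and $[T]_β$ is upper triangular.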

Definitions (normal).

Let $V$ be an inner product space, and let $T$ be a linear operator on $V$. We say that $T$ is normal if $TT^*= T^*T$. An $n×n$ real or complex matrix $A$ is normal if $AA^* = A^*A.$
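The matrix condition $AA^* = A^*A$ is straightforward to test. In this sketch, `is_normal` is a hypothetical helper for the example and the three matrices are hand-picked: a rotation (normal but not symmetric), a shear (not normal), and a Hermitian matrix (normal).

```python
def adjoint(A):
    """Conjugate transpose: (A*)_ij = conj(A_ji)."""
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]


def matmul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def is_normal(A, tol=1e-12):
    """Check A A* == A* A entrywise, up to a floating-point tolerance."""
    A_star = adjoint(A)
    left, right = matmul(A, A_star), matmul(A_star, A)
    return all(abs(p - q) <= tol
               for row_p, row_q in zip(left, right)
               for p, q in zip(row_p, row_q))


rotation = [[0, -1], [1, 0]]      # real, not symmetric, but normal
shear = [[1, 1], [0, 1]]          # not normal
hermitian = [[2, 1j], [-1j, 3]]   # self-adjoint, hence normal
print(is_normal(rotation), is_normal(shear), is_normal(hermitian))
```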

Theorem 6.15.

Let $V$ be an inner product space, and let $T$ be a normal operator on $V$. Then the following statements are true.

(a) $\|\mathrm{T}(x)\|=\left\|\mathrm{T}^{*}(x)\right\|$ for all $x \in V$.

(b) $T−cI$ is normal for every $c∈F$.

(c) If $x$ is an eigenvector of $T$, then $x$ is also an eigenvector of $T^*$. In fact, if $T(x) = λx$, then $T^*(x) = \overline λx$.

(d) If $λ_1$ and $λ_2$ are distinct eigenvalues of $T$ with corresponding eigenvectors $x_1$ and $x_2$, then $x_1$ and $x_2$ are orthogonal.

(c) Let $U = T - \lambda I$. Then $U^* = T^* - \overline\lambda I$, and $U$ is normal by (b). Since $U(x) = 0$, (a) gives $0 = \|U(x)\| = \|U^*(x)\| = \|T^*(x) - \overline\lambda x\|$, so $T^*(x) = \overline\lambda x$.

(d) By (c), $T^*(x_2) = \overline{\lambda_2}x_2$, so $\lambda_{1}\left\langle x_{1}, x_{2}\right\rangle=\left\langle\lambda_{1} x_{1}, x_{2}\right\rangle=\left\langle T\left(x_{1}\right), x_{2}\right\rangle=\left\langle x_{1}, T^{*}\left(x_{2}\right)\right\rangle =\left\langle x_{1}, \overline{\lambda_{2}} x_{2}\right\rangle=\lambda_{2}\left\langle x_{1}, x_{2}\right\rangle$. Since $\lambda_1 \neq \lambda_2$, it follows that $\langle x_1, x_2\rangle = 0$.

Theorem 6.16.

Let $T$ be a linear operator on a finite-dimensional complex inner product space $V$. Then $T$ is normal if and only if there exists an orthonormal basis for $V$ consisting of eigenvectors of T.

Definition (self-adjoint).

Let T be a linear operator on an inner product space V. We say that T is self-adjoint (Hermitian) if T = $T^*$. An $n × n$ real or complex matrix $A$ is self-adjoint (Hermitian) if $A = A^*$.

Lemma.

Let T be a self-adjoint operator on a finite-dimensional inner product space V. Then

(a) Every eigenvalue of T is real.
(b) Suppose that V is a real inner product space. Then the characteristic polynomial of T splits.

(a) If $T(x) = \lambda x$ with $x \neq 0$, then by Theorem 6.15(c), $\lambda x = T(x) = T^*(x) = \overline\lambda x$, so $\lambda = \overline\lambda$.

Theorem 6.17.

Let $T$ be a linear operator on a finite-dimensional real inner product space $V$. Then $T$ is self-adjoint if and only if there exists an orthonormal basis $β$ for $V$ consisting of eigenvectors of $T$.

6.5 Unitary & Orthogonal Operators and their Matrices

Definitions (unitary/orthogonal operator, isometry).

Let $T$ be a linear operator on a finite-dimensional inner product space $V$ over $F$. If $\|T(x)\| = \|x\|$ for all $x \in V$, we call $T$ a unitary operator if $F = C$ and an orthogonal operator if $F = R$.

In the infinite-dimensional case, an operator satisfying $\|T(x)\| = \|x\|$ for all $x \in V$ is called an isometry.

Lemma.

Let U be a self-adjoint operator on a finite-dimensional inner product space V. If $\langle x, U(x)\rangle = 0$ for all $x \in V$, then $U = T_0$.

Theorem 6.18.

Four equivalent statements about unitary/orthogonal operators:

Let T be a linear operator on a finite-dimensional inner product space V. Then the following statements are equivalent.

(a) $TT^* = T^*T = I$.

(b) $\langle T(x), T(y)\rangle = \langle x, y\rangle$ for all $x, y \in V$.

(c) If $\beta$ is an orthonormal basis for V, then $T(\beta)$ is an orthonormal basis for $V$.

(d) $\|T(x)\| = \|x\|$ for all $x \in V$.

(a) -> (b): $\langle T(x), T(y)\rangle = \langle x, T^*T(y)\rangle = \langle x, y\rangle$

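(b) -> (c): immediate, since $\langle T(v_i), T(v_j)\rangle = \langle v_i, v_j\rangle = \delta_{ij}$ for $v_i, v_j \in \beta$.

(c) -> (d): Let $\beta = \{v_1, \ldots, v_n\}$ be an orthonormal basis for $V$ and write $x = \sum_{i=1}^na_iv_i$. Then $\|x\|^2 = \langle \sum_{i=1}^na_iv_i, \sum_{j=1}^na_jv_j\rangle = \sum_{i=1}^n\sum_{j=1}^na_i\overline{a_j}\langle v_i, v_j\rangle = \sum_{i=1}^n\sum_{j=1}^na_i\overline{a_j}\delta_{ij} = \sum_{i=1}^n|a_i|^2$. Expressing $T(x) = \sum_{i=1}^na_iT(v_i)$ in the orthonormal basis $T(\beta)$, the same computation gives $\|T(x)\|^2 = \sum_{i=1}^n|a_i|^2$, so the two norms agree.

(d) -> (a): For all $x \in V$, $\langle x, x\rangle = \|x\|^2 = \|T(x)\|^2 = \langle T(x), T(x)\rangle = \langle x, T^*T(x)\rangle$, so $\langle x, (I - T^*T)(x)\rangle = 0$. The operator $I - T^*T$ is self-adjoint, so the preceding lemma gives $I - T^*T = T_0$, that is, $T^*T = I$. Because $V$ is finite-dimensional, $T$ is invertible with $T^{-1} = T^*$, hence $TT^* = I$ as well.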
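Statements (b) and (d) of Theorem 6.18 can be observed numerically. The sketch below uses a hand-picked $2 \times 2$ unitary matrix and two sample vectors in $C^2$ (all assumptions of this example), and reports how far the inner products and squared norms move under the map; both gaps should be zero up to rounding.

```python
import math


def inner(x, y):
    """Standard inner product on C^n."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


r = 1 / math.sqrt(2)
U = [[r, r * 1j],
     [r * 1j, r]]                # a sample unitary matrix: U U* = U* U = I

x = [1 + 1j, 2]
y = [3, -1j]
Ux, Uy = matvec(U, x), matvec(U, y)

gap_b = abs(inner(Ux, Uy) - inner(x, y))             # statement (b)
gap_d = abs(inner(Ux, Ux).real - inner(x, x).real)   # statement (d), squared norms
print(gap_b < 1e-12, gap_d < 1e-12)
```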

Corollary 1 & 2.

$|\lambda| = 1 \Leftrightarrow$ orthogonal/unitary.

Let T be a linear operator on a finite-dimensional complex [real] inner product space V. Then V has an orthonormal basis of eigenvectors of T with corresponding eigenvalues of absolute value 1 if and only if T is unitary [both self-adjoint and orthogonal].

Definition (unitarily equivalent).

$A$ and $B$ are unitarily equivalent if $A = P^*BP$ for some unitary matrix $P$. (The definition of orthogonally equivalent is similar, with $P$ orthogonal.)

Theorem 6.19 & 6.20.

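A complex [real] $n \times n$ matrix $A$ is normal [symmetric] iff $A$ is unitarily [orthogonally] equivalent to a complex [real] diagonal matrix.

If $A = P^*DP$ with $P$ unitary and $D$ diagonal, then $AA^* = (P^*DP)(P^*D^*P) = P^*DD^*P = P^*D^*DP = (P^*D^*P)(P^*DP) = A^*A$, using $PP^* = I$ and the fact that diagonal matrices commute.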

Theorem 6.21 (Schur, matrix form).

Let $A\in M_{n\times n}(F)$ be a matrix whose characteristic polynomial splits over $F$. If F = C [R], then A is unitarily [orthogonally] equivalent to a complex [real] upper triangular matrix.

6.6 Orthogonal Projections & Spectral Theorem

Definition (orthogonal projection)

Let $V$ be an inner product space, and let $T: V \to V$ be a projection. We say $T$ is an orthogonal projection if $R(T)^\perp = N(T)$ and $N(T)^\perp = R(T)$.

Theorem 6.24.

Let V be an inner product space, and let T be a linear operator on V. Then T is an orthogonal projection iff T has an adjoint $T^*$ and $T^2=T=T^*$.

Theorem 6.25 (The Spectral Theorem).

Let T be a linear operator on a finite-dimensional inner product space V over F with the distinct eigenvalues $\lambda_1,\lambda_2,\ldots,\lambda_k$. Assume that T is normal if F = C and that T is self-adjoint if F = R. For each $i$ $(1 \le i \le k)$, let $W_i$ be the eigenspace of T corresponding to the eigenvalue $\lambda_i$, and let $T_i$ be the orthogonal projection of $V$ on $W_i$. Then:

(a) $V = W_1\oplus W_2 \oplus \ldots \oplus W_k$.

(b) if $W_i'$ denotes the direct sum of the subspaces $W_j$ for $j \neq i$, then $W_i^\perp = W_i'$.

(c) $T_iT_j = \delta_{ij}T_i$ for $1 \le i, j \le k$.

(d) $I = T_1 + T_2 + \ldots + T_k$.

(e) $T = \lambda_1T_1 + \lambda_2T_2 + \ldots + \lambda_kT_k$.

The set $\{\lambda_1, \lambda_2, \dots, \lambda_k\}$ is called the spectrum of $T$, the sum $I = T_1 + T_2 + \ldots + T_k$ is called the resolution of the identity operator induced by $T$, and the sum $T = \lambda_1T_1 + \lambda_2T_2 + \ldots + \lambda_kT_k$ is called the spectral decomposition of T.

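If $\beta$ is an orthonormal basis for $V$ obtained by concatenating orthonormal bases of $W_1, W_2, \ldots, W_k$, and $m_i = dim(W_i)$, then

$[T]_\beta = \begin{pmatrix}\lambda_1I_{m_1}&O&\cdots&O\\ O&\lambda_2I_{m_2}&\cdots&O \\ \vdots&\vdots&&\vdots\\ O&O&\cdots&\lambda_kI_{m_k}\end{pmatrix}$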
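A small worked instance may help make parts (c), (d), and (e) concrete. The sketch below takes the real symmetric matrix $A = \begin{pmatrix}2&1\\1&2\end{pmatrix}$ (an example chosen here, not from the text), whose eigenvalues are 1 and 3 with unit eigenvectors $(1,-1)/\sqrt2$ and $(1,1)/\sqrt2$. The projection matrices `T1` and `T2` were computed by hand as $vv^t$ for each unit eigenvector $v$, and the code checks the identities exactly with fractions.

```python
from fractions import Fraction


def matmul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def add(A, B):
    """Entrywise sum A + B."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    """Scalar multiple c A."""
    return [[c * a for a in row] for row in A]


half = Fraction(1, 2)
A = [[2, 1], [1, 2]]                       # real symmetric, eigenvalues 1 and 3
T1 = [[half, -half], [-half, half]]        # orthogonal projection on span{(1, -1)}
T2 = [[half, half], [half, half]]          # orthogonal projection on span{(1, 1)}

zero = [[0, 0], [0, 0]]
identity = [[1, 0], [0, 1]]

print(matmul(T1, T1) == T1, matmul(T2, T2) == T2)   # (c) with i = j
print(matmul(T1, T2) == zero)                        # (c) with i != j
print(add(T1, T2) == identity)                       # (d) resolution of the identity
print(add(scale(1, T1), scale(3, T2)) == A)          # (e) spectral decomposition
```

Every line prints `True`, so here $I = T_1 + T_2$ and $A = 1 \cdot T_1 + 3 \cdot T_2$.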

Corollary 1.

If F = C, then T is normal iff $T^* = g(T)$ for some polynomial $g$.

Corollary 2.

If F = C, then T is unitary iff T is normal and $|\lambda| = 1$ for every eigenvalue $\lambda$ of T.

Corollary 3.

If F = C and T is normal, then T is self-adjoint iff every eigenvalue of T is real.

Corollary 4.

Let T be as in the spectral theorem with spectral decomposition $T = \lambda_1T_1 + \lambda_2T_2 + \ldots + \lambda_kT_k$. Then each $T_j$ is a polynomial in T.

posted @ 2019-11-25 15:13  胡小兔