Unit III: Positive Definite Matrices and Applications
26 Symmetric Matrices and Positive Definiteness
Symmetric matrices are good – their eigenvalues are real and each has a complete set of orthonormal eigenvectors. Positive definite matrices are even better.
Symmetric matrices with real entries
The eigenvalues of a real symmetric matrix are all real, and a full set of mutually orthogonal eigenvectors can be found. A symmetric matrix $A$ can be factored as $A=Q\Lambda Q^{-1}=Q\Lambda Q^T$, where $Q$ is an orthogonal matrix (mathematicians call this the spectral theorem; in physics it is the principal axis theorem).
Proof that the eigenvalues of a real symmetric matrix are real: let $Ax=\lambda x$. Taking conjugates gives $\bar{A}\bar{x}=\bar{\lambda}\bar{x}$; since $A$ is real, $A\bar{x}=\bar{\lambda}\bar{x}$. Transposing gives $\bar{x}^TA^T=\bar{\lambda}\bar{x}^T$, and since $A$ is symmetric, $\bar{x}^TA=\bar{\lambda}\bar{x}^T$. Multiplying on the right by $x$ gives $\bar{x}^TAx=\bar{\lambda}\bar{x}^Tx$. On the other hand, multiplying $Ax=\lambda x$ on the left by $\bar{x}^T$ gives $\bar{x}^TAx=\lambda\bar{x}^Tx$. Hence $\bar{\lambda}\bar{x}^Tx=\lambda\bar{x}^Tx$; since $x\neq0$, $\bar{x}^Tx=\sum|x_i|^2\neq0$, so $\lambda$ is real.
For a complex matrix, the eigenvalues are all real and a full set of mutually orthogonal eigenvectors exists if and only if $A=\bar{A}^T$ (i.e., $A$ is Hermitian).
$A=Q\Lambda Q^T=\begin{bmatrix}q_1&q_2&\cdots&q_n\end{bmatrix}\begin{bmatrix}\lambda_1\\&\lambda_2\\&&\ddots\\&&&\lambda_n\end{bmatrix}\begin{bmatrix}q_1^T\\q_2^T\\\vdots\\q_n^T\end{bmatrix}=\lambda_1q_1q_1^T+\lambda_2q_2q_2^T+\cdots+\lambda_nq_nq_n^T$, where each $q_kq_k^T$ is the projection matrix onto $q_k$. So every symmetric matrix is a weighted sum of projection matrices, which gives one way to understand the spectral theorem.
Why real eigenvalues matter: once the eigenvalues are known to be real, their signs can be determined (important for questions such as whether a system of differential equations has a steady state). For a large matrix, computing eigenvalues from the determinant is expensive, but computing pivots by elimination is easy, and the number of positive pivots equals the number of positive eigenvalues. Shifting to $A-bI$ then counts the eigenvalues greater than or less than $b$. This extracts a great deal of information without ever solving for the eigenvalues.
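A minimal numerical sketch of this pivot-counting idea, assuming NumPy (the matrix $A$ and shift $b$ below are illustrative, not from the lecture):

```python
import numpy as np

def pivots(S):
    """Pivots of a symmetric matrix via elimination, no row exchanges.
    Assumes no zero pivot appears (true for a generic shift b)."""
    S = S.astype(float).copy()
    p = []
    for k in range(len(S)):
        p.append(S[k, k])
        # symmetric rank-1 update of the trailing block
        S[k+1:, k+1:] -= np.outer(S[k+1:, k], S[k+1:, k]) / S[k, k]
    return np.array(p)

A = np.array([[3.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 1.0]])
b = 1.5
p = pivots(A - b * np.eye(3))
print("pivots of A - bI:", p)
print("# eigenvalues > b via pivots:", np.sum(p > 0))   # 2
print("# eigenvalues > b directly  :", np.sum(np.linalg.eigvalsh(A) > b))  # 2
```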
Positive definite matrices
Positive definite matrices are the symmetric matrices whose eigenvalues are all positive. Equivalent tests: all pivots are positive; all leading principal minors (upper-left subdeterminants) are positive. Here pivots, determinants, and eigenvalues are all tied together.
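The equivalent tests line up in code. A small check, assuming NumPy (the tridiagonal matrix below is just a convenient positive definite example):

```python
import numpy as np

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])

# Test 1: all eigenvalues positive
print(np.all(np.linalg.eigvalsh(A) > 0))                          # True

# Test 2: all leading principal minors positive
print(all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, 4)))     # True

# Test 3: Cholesky succeeds exactly when A is positive definite
# (its diagonal entries squared are tied to the pivots)
try:
    np.linalg.cholesky(A)
    print(True)
except np.linalg.LinAlgError:
    print(False)
```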
27 Complex Matrices; Fast Fourier Transform
The most important complex matrix is the Fourier matrix $F_n$, which is used for Fourier transforms. Normally, multiplication by $F_n$ would require $n^2$ multiplications. The fast Fourier transform (FFT) reduces this to roughly $n\log_2 n$ multiplications, a revolutionary improvement.
Complex matrices
The length of a complex vector $z$ satisfies $|z|^2=\bar{z}^Tz=z^Hz$ (read "$z$ Hermitian $z$"); the inner product of complex vectors likewise becomes $y^Hx=\bar{y}^Tx$.
Hermitian matrices are the complex analogue of symmetric matrices: $\bar{A}^T=A$ (written $A^H=A$). Like real symmetric matrices, they have real eigenvalues and a full set of mutually orthogonal eigenvectors.
Unitary matrices are the complex analogue of orthogonal matrices: $\bar{Q}^TQ=I$ (written $Q^HQ=I$).
The Fourier matrix $F_n=\begin{bmatrix}1&1&1&\cdots&1\\1&w&w^2&\cdots&w^{n-1}\\1&w^2&w^4&\cdots&w^{2(n-1)}\\\vdots&&&\ddots&\vdots\\1&w^{n-1}&w^{2(n-1)}&\cdots&w^{(n-1)^2}\end{bmatrix}$, with $F_n=F_n^T$, $(F_n)_{jk}=w^{jk}$, and $w=e^{i\cdot2\pi/n}=\cos\frac{2\pi}{n}+i\sin\frac{2\pi}{n}$. Its columns are mutually orthogonal (orthogonality here is in the complex sense). Multiplying a vector by the Fourier matrix performs the discrete Fourier transform.
The relation between $F_n$ and $F_{2n}$ (based on $w_{2n}^2=w_n$):
$F_{2n}=\begin{bmatrix}I&D\\I&-D\end{bmatrix}\begin{bmatrix}F_n&0\\0&F_n\end{bmatrix}P$ (proof: expand by brute force, replacing $-1$ by $w_{2n}^n$ and $1$ by $w_{2n}^{2n}$ where needed)
The permutation matrix $P=\begin{bmatrix}1&0&0&0&\cdots&0&0\\0&0&1&0&\cdots&0&0\\&&&\vdots&&&\\0&0&0&0&\cdots&1&0\\0&1&0&0&\cdots&0&0\\0&0&0&1&\cdots&0&0\\&&&\vdots&&&\\0&0&0&0&\cdots&0&1\end{bmatrix}$ separates a vector into its even-indexed components $x_0,x_2,\cdots,x_{2n-2}$ followed by its odd-indexed components $x_1,x_3,\cdots,x_{2n-1}$
The diagonal matrix $D=\begin{bmatrix}1&&&&\\&w&&&\\&&w^2&&\\&&&\ddots&\\&&&&w^{n-1}\end{bmatrix}$, where $w$ here is $w_{2n}$, rescales the rows by successive powers of $w$
Carrying out this splitting recursively reduces the cost from $n^2$ to $n\log_2 n$: this is the fast Fourier transform (FFT).
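A recursive sketch of the algorithm above, assuming NumPy. The convention $w=e^{i\cdot2\pi/n}$ matches these notes and is the conjugate of NumPy's `np.fft.fft` convention, hence the `np.conj` in the check:

```python
import numpy as np

def fft_recursive(x):
    """FFT following F_2n = [[I, D], [I, -D]] (F_n plus F_n) P,
    i.e. y_j = sum_k w^{jk} x_k with w = exp(2*pi*i/n); len(x) a power of 2."""
    n = len(x)
    if n == 1:
        return x.astype(complex)
    even = fft_recursive(x[0::2])   # P picks out even-indexed entries...
    odd = fft_recursive(x[1::2])    # ...then odd-indexed entries
    d = np.exp(2j * np.pi * np.arange(n // 2) / n)  # diagonal of D: powers of w_2n
    return np.concatenate([even + d * odd, even - d * odd])

x = np.random.rand(8)
print(np.allclose(fft_recursive(x), np.conj(np.fft.fft(x))))  # True
```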
28 Positive Definite Matrices and Minima
Studying positive definite matrices brings the whole course together; we use pivots, determinants, eigenvalues and stability.
Positive definite matrices
To test whether a symmetric matrix $A$ is positive definite:
Form the quadratic form $x^TAx=ax_1^2+2bx_1x_2+cx_2^2$ (for $A=\begin{bmatrix}a&b\\b&c\end{bmatrix}$) and check whether it is strictly positive for all $x\neq0$ by completing the square. The coefficients that appear after completing the square are exactly the pivots of the matrix, so they can be found by elimination; a worked example follows this list.
The graph of the quadratic form of a positive definite matrix is generally a bowl-shaped paraboloid, and the level set $x^TAx=1$ of a positive definite form in three variables is an ellipsoid. Writing $A=Q\Lambda Q^T$, the eigenvectors in $Q$ give the directions of the axes and the eigenvalues in $\Lambda$ determine the axis lengths; this is one way to understand the principal axis theorem.
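A worked instance of the pivot connection (the $2\times2$ matrix is illustrative): for $A=\begin{bmatrix}2&6\\6&20\end{bmatrix}$, elimination subtracts $3\times$ row 1 from row 2, leaving pivots $2$ and $20-\frac{6^2}{2}=2$, and completing the square reproduces exactly those numbers:

$$x^TAx=2x_1^2+12x_1x_2+20x_2^2=2(x_1+3x_2)^2+2x_2^2>0\quad\text{for }x\neq0,$$

with the pivots $2,2$ as the outer coefficients and the multiplier $\ell_{21}=3$ inside the square, so $A$ is positive definite.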
29 Similar Matrices and Jordan Form
Camille Jordan found a way to choose a "most diagonal" representative from each family of similar matrices; this representative is said to be in Jordan normal form.
$A^TA$ is positive definite
Some properties of positive definite matrices (a numerical check follows this list):
If $A$ is positive definite, then $A^{-1}$ is positive definite (its eigenvalues are $\frac{1}{\lambda_i}>0$)
If $A$ and $B$ are positive definite, then $A+B$ is positive definite ($x^T(A+B)x=x^TAx+x^TBx>0$)
If $A$ has full column rank, then $A^TA$ is positive definite: $x^T(A^TA)x=(Ax)^T(Ax)=|Ax|^2\geq0$, and since $A$ has full column rank, $Ax=0$ only when $x=0$, so the form is strictly positive for $x\neq0$
Elimination on a positive definite matrix never requires row exchanges, because no zero pivot ever appears
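The numerical check promised above, assuming NumPy (the matrices are built randomly so that the hypotheses hold):

```python
import numpy as np

rng = np.random.default_rng(0)

def is_pd(S):
    return np.all(np.linalg.eigvalsh(S) > 0)

# A and B positive definite by construction (M^T M + I is always PD)
M = rng.standard_normal((3, 3))
A = M.T @ M + np.eye(3)
N = rng.standard_normal((3, 3))
B = N.T @ N + np.eye(3)

print(is_pd(A), is_pd(np.linalg.inv(A)))   # True True  (1/lambda_i > 0)
print(is_pd(A + B))                        # True       (x^T(A+B)x > 0)

# A tall matrix with full column rank: R^T R is positive definite
R = rng.standard_normal((5, 3))            # full column rank with probability 1
print(is_pd(R.T @ R))                      # True
```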
Similar matrices $A$ and $B=M^{-1}AM$
$A$ is similar to $B$ when there is an invertible matrix $M$ with $B=M^{-1}AM$
Canonical example: when $A$ has $n$ linearly independent eigenvectors, $S^{-1}AS=\Lambda$, so $A$ is similar to $\Lambda$
If $A$ and $B$ are similar, $B=M^{-1}AM$, then $A$ and $B$ have the same eigenvalues, and if $x$ is an eigenvector of $A$ then $M^{-1}x$ is an eigenvector of $B$: from $Ax=\lambda x$ we get $AMM^{-1}x=\lambda x$; multiplying on the left by $M^{-1}$ gives $M^{-1}AMM^{-1}x=\lambda M^{-1}x$, so with $y=M^{-1}x$, $By=\lambda y$
Jordan form: the matrix in each family of similar matrices that is closest to diagonal. Its diagonal entries are the eigenvalues; some entries just above the diagonal may be $1$; all other entries are $0$
Jordan block: $J_i=\begin{bmatrix}\lambda_i&1&0&\cdots&0\\0&\lambda_i&1&&0\\\vdots&&\ddots&\ddots&\vdots\\0&0&&\lambda_i&1\\0&0&\cdots&0&\lambda_i\end{bmatrix}$
Jordan's theorem: every square matrix $A$ is similar to a Jordan matrix $J=\begin{bmatrix}J_1&0&\cdots&0\\0&J_2&\cdots&0\\\vdots&&\ddots&\vdots\\0&0&\cdots&J_d\end{bmatrix}$
If $A$ has $n$ distinct eigenvalues, it is diagonalizable and its Jordan form is the diagonal matrix $\Lambda$; if $A$ has repeated eigenvalues and "missing" eigenvectors, its Jordan form has $n-d$ ones above the diagonal, where $d$ is the number of Jordan blocks (see the SymPy sketch below)
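A small SymPy sketch (the $4\times4$ matrix is a standard illustrative example with eigenvalues $1,2,4,4$ and only one eigenvector for $\lambda=4$, so it is not diagonalizable):

```python
from sympy import Matrix

A = Matrix([[ 5,  4,  2,  1],
            [ 0,  1, -1, -1],
            [-1, -1,  3,  0],
            [ 1,  1, -1,  2]])

M, J = A.jordan_form()   # A = M * J * M**(-1)
print(J)                 # diagonal entries 1 and 2, plus a 2x2 Jordan
                         # block for eigenvalue 4 (block order may differ)
```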
30 Singular Value Decomposition
The heart of the problem is to find an orthonormal basis $v_1, v_2,\cdots,v_r$ for the row space of $A$.
The singular value decomposition (SVD): any matrix $A$ can be factored as $A=U\Sigma V^T$, where $U$ and $V$ are orthogonal matrices and $\Sigma$ is a diagonal matrix
A special case of the SVD: a positive definite (symmetric) matrix has orthogonal eigenvectors and $A=Q\Lambda Q^T$, so $U=V=Q$
Understanding the SVD: let $v_1,v_2,\cdots,v_r$ be an orthonormal basis for the row space of $A$, $v_{r+1},\cdots,v_n$ an orthonormal basis for the nullspace, $u_1,u_2,\cdots,u_r$ an orthonormal basis for the column space, and $u_{r+1},\cdots,u_m$ an orthonormal basis for the left nullspace. We want the map to carry the row-space basis into the column-space basis: $Av_i=\sigma_iu_i$ (with $\sigma_i=0$ for $i>r$), i.e. $A\begin{bmatrix}v_1&v_2&\cdots&v_n\end{bmatrix}=\begin{bmatrix}\sigma_1u_1&\sigma_2u_2&\cdots&\sigma_nu_n\end{bmatrix}=\begin{bmatrix}u_1&u_2&\cdots&u_n\end{bmatrix}\begin{bmatrix}\sigma_1&&&\\&\sigma_2&&\\&&\ddots&\\&&&\sigma_n\end{bmatrix}$, i.e. $AV=U\Sigma$ and hence $A=U\Sigma V^{-1}=U\Sigma V^T$. The heart of the SVD problem is therefore finding this special orthonormal basis of the row space
Computing the SVD, i.e. finding suitable $V$ and $U$: $A^TA=V\Sigma U^TU\Sigma V^T=V\Sigma^2V^T=V\begin{bmatrix}\sigma_1^2&&&\\&\sigma_2^2&&\\&&\ddots&\\&&&\sigma_n^2\end{bmatrix}V^T$, so $V$ consists of the eigenvectors of $A^TA$; similarly $AA^T=U\Sigma V^TV\Sigma U^T=U\Sigma^2U^T$, so $U$ consists of the eigenvectors of $AA^T$; and each $\sigma_i$ is the positive square root of an eigenvalue of $A^TA$ or $AA^T$ (a NumPy sketch follows)
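A NumPy sketch of this recipe for a full-rank square matrix (the matrix is illustrative). It avoids a practical pitfall: computing $U$ independently from $AA^T$ can give eigenvector signs that do not match $V$, so here each $u_i$ is taken as $Av_i/\sigma_i$:

```python
import numpy as np

A = np.array([[ 4.0, 4.0],
              [-3.0, 3.0]])

# V and sigma from the eigendecomposition of A^T A
lam, V = np.linalg.eigh(A.T @ A)     # eigh returns ascending eigenvalues
idx = np.argsort(lam)[::-1]          # reorder to descending
lam, V = lam[idx], V[:, idx]
sigma = np.sqrt(lam)

# u_i = A v_i / sigma_i keeps the signs consistent with V
U = (A @ V) / sigma

print(np.allclose(A, U @ np.diag(sigma) @ V.T))  # True
```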
31 Linear Transformations and Their Matrices
In older linear algebra courses, linear transformations were introduced before matrices. This geometric approach to linear algebra initially avoids the need for coordinates. But eventually there must be coordinates and matrices when the need for computation arises.
Without coordinates (no matrix)
A map $T:\mathbb{R}^2\longrightarrow\mathbb{R}^2$ satisfying $T(v+w)=T(v)+T(w)$ and $T(cv)=cT(v)$ is a linear transformation. Examples include projection onto a line and rotation; shifting the whole plane by a fixed nonzero vector is not linear, since it moves the origin
With coordinates (matrix!)
For $T(v)=Av$, the transformation can be described using a basis $v_1,v_2,\cdots,v_n$ of the input space: writing $v=c_1v_1+c_2v_2+\cdots+c_nv_n$ gives $T(v)=c_1T(v_1)+c_2T(v_2)+\cdots+c_nT(v_n)$. The $c_i$ are the coordinates of $v$, and they come from the choice of basis
Determining the matrix $A$: in $T(v)=Av$, the vectors $v$ and $Av$ stand for coordinates in the input and output bases respectively. Column $i$ of $A$ holds the coefficients that express the transformed input basis vector $T(v_i)$ in the output basis $w_1,\cdots,w_m$ (the basis of the column space of $A$): $T(v_i)=a_{1i}w_1+a_{2i}w_2+\cdots+a_{mi}w_m$. Then $A[\text{input coords}]=[\text{output coords}]$, since $\begin{bmatrix}T(v_1)&T(v_2)&\cdots&T(v_n)\end{bmatrix}=\begin{bmatrix}w_1&w_2&\cdots&w_m\end{bmatrix}A$ and $T(v)=\begin{bmatrix}T(v_1)&\cdots&T(v_n)\end{bmatrix}[\text{input coords}]=\begin{bmatrix}w_1&\cdots&w_m\end{bmatrix}A[\text{input coords}]=\begin{bmatrix}w_1&\cdots&w_m\end{bmatrix}[\text{output coords}]$
Example: $T=\frac{d}{dx}$, $T(c_1+c_2x+c_3x^2)=c_2+2c_3x$. With input basis $v_1=1,v_2=x,v_3=x^2$ and output basis $w_1=1,w_2=x$, the matrix is $A=\begin{bmatrix}0&1&0\\0&0&2\end{bmatrix}$, and $T\left(\begin{bmatrix}c_1\\c_2\\c_3\end{bmatrix}\right)=A\begin{bmatrix}c_1\\c_2\\c_3\end{bmatrix}=\begin{bmatrix}c_2\\2c_3\end{bmatrix}$ (checked numerically after this list)
Matrix multiplication corresponds to composition of linear transformations
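The derivative example, checked numerically as promised (assuming NumPy); the second block also illustrates composition as matrix multiplication:

```python
import numpy as np

# Matrix of T = d/dx from input basis {1, x, x^2} to output basis {1, x}
A = np.array([[0, 1, 0],
              [0, 0, 2]])

c = np.array([7, 5, 3])    # coordinates of 7 + 5x + 3x^2
print(A @ c)               # [5 6]  -> the derivative is 5 + 6x

# Composition: d/dx from {1, x} to {1} has matrix B, so BA is d^2/dx^2
B = np.array([[0, 1]])
print(B @ A)               # [[0 0 2]] -> second derivative of c1+c2x+c3x^2 is 2c3
```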
32 Change of Basis; Image Compression
Lecture videos, music, and other data sources contain a lot of information; that information can be efficiently stored and transmitted only after we change the basis used to record it.
Compression of images
One important application of change of basis is compression: images, video, and audio can be compressed by changing basis, allowing far more efficient storage
Image compression is, at heart, a change of basis. The standard basis does not exploit features such as large regions of nearly constant gray value; basis vectors such as $\begin{bmatrix}1\\1\\\vdots\\1\end{bmatrix}$ capture such regions much better
Two commonly used bases:
Fourier basis vectors: the columns of the Fourier matrix
$\begin{bmatrix}1\\1\\1\\1\\1\\1\\1\\1\end{bmatrix},\begin{bmatrix}1\\\omega\\\omega^2\\\omega^3\\\omega^4\\\omega^5\\\omega^6\\\omega^7\end{bmatrix},\begin{bmatrix}1\\\omega^2\\\omega^4\\\omega^6\\\omega^8\\\omega^{10}\\\omega^{12}\\\omega^{14}\end{bmatrix},\cdots,\begin{bmatrix}1\\\omega^7\\\omega^{14}\\\omega^{21}\\\omega^{28}\\\omega^{35}\\\omega^{42}\\\omega^{49}\end{bmatrix}$
In JPEG, a $512\times512$ image is divided into $8\times8$ blocks (64 pixels $\rightarrow$ 64 coefficients per block), and the basis used is essentially the Fourier basis. The input signal $\mathbf{x}$ is changed from the standard basis to the Fourier basis (a lossless step), giving coefficients $c$; the coefficients are then thresholded (those below some threshold are discarded — the lossy step), giving $\hat{c}$; the compressed signal $\hat{\mathbf{x}}=\sum\hat{c}_i\mathbf{v}_i$ recombines the basis vectors with the new coefficients. The sum now has far fewer than 64 terms; that is the compression
$\text{signal }\mathbf{x}\overset{\text{lossless}}{\longrightarrow}\text{64 coefficients }c\overset{\text{lossy compression}}{\longrightarrow}\hat{c}\text{ (many zeros)}\longrightarrow\hat{\mathbf{x}}=\sum\hat{c}_i\mathbf{v}_i$
Video compression should store a base image plus the correction from each frame to the next, rather than storing frame after frame independently, which would waste the continuity between frames
Wavelet basis (the Haar wavelets): apart from the all-ones vector, each basis vector's nonzero entries are half $1$s and half $-1$s, and the basis vectors are mutually orthogonal
$\begin{bmatrix}1\\1\\1\\1\\1\\1\\1\\1\end{bmatrix},\begin{bmatrix}1\\1\\1\\1\\-1\\-1\\-1\\-1\end{bmatrix},\begin{bmatrix}1\\1\\-1\\-1\\0\\0\\0\\0\end{bmatrix},\begin{bmatrix}0\\0\\0\\0\\1\\1\\-1\\-1\end{bmatrix},\begin{bmatrix}1\\-1\\0\\0\\0\\0\\0\\0\end{bmatrix},\cdots,\begin{bmatrix}0\\0\\0\\0\\0\\0\\1\\-1\end{bmatrix}$
A vector given in the standard basis is $p=c_1w_1+c_2w_2+\cdots+c_nw_n=Wc$, where $W$ is the wavelet matrix whose columns are the wavelet basis vectors; then $c=W^{-1}p$ (a small compression sketch in this basis follows the next item)
What a good basis should provide: multiplying by $W$ and by $W^{-1}$ must be fast (for the Fourier and wavelet bases, $W^{-1}$ is easy because the columns are orthogonal), and a few basis vectors should already reproduce the signal well, so that most coefficients can be dropped
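A compression sketch in the Haar basis, assuming NumPy; the 8-sample signal is made up to mimic a smooth image row, and the threshold of 1.0 is arbitrary:

```python
import numpy as np

# Columns: the 8 Haar wavelet basis vectors listed above
W = np.array([
    [1,  1,  1,  0,  1,  0,  0,  0],
    [1,  1,  1,  0, -1,  0,  0,  0],
    [1,  1, -1,  0,  0,  1,  0,  0],
    [1,  1, -1,  0,  0, -1,  0,  0],
    [1, -1,  0,  1,  0,  0,  1,  0],
    [1, -1,  0,  1,  0,  0, -1,  0],
    [1, -1,  0, -1,  0,  0,  0,  1],
    [1, -1,  0, -1,  0,  0,  0, -1],
], dtype=float)

# A "smooth" signal: nearby samples are close, as in a typical image row
p = np.array([61.0, 60.0, 59.0, 60.0, 20.0, 21.0, 20.0, 19.0])

c = np.linalg.solve(W, p)                   # c = W^{-1} p: change of basis
c_hat = np.where(np.abs(c) > 1.0, c, 0.0)   # lossy step: drop small coefficients
p_hat = W @ c_hat                           # reconstruct from kept coefficients

print("kept", np.count_nonzero(c_hat), "of 8 coefficients")  # kept 2 of 8
print("reconstruction:", p_hat)  # [60. 60. 60. 60. 20. 20. 20. 20.]
```

Because the columns of $W$ are orthogonal, $W^{-1}$ could equally be applied as a (scaled) transpose; that is what makes this basis fast in practice.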
Change of basis
If a linear transformation $T$ has matrix $A$ with respect to the basis $v_1,v_2,\cdots,v_n$ and matrix $B$ with respect to the basis $w_1,w_2,\cdots,w_n$, then $A$ and $B$ are similar: $B=M^{-1}AM$, where $M$ is the change-of-basis matrix
As before, $\begin{bmatrix}T(v_1)&\cdots&T(v_n)\end{bmatrix}=WA$, where $W$ has the basis vectors as its columns. If the eigenvectors of the transformation are chosen as the basis, then $T(v_i)=\lambda_iv_i$ and $A=\Lambda$; an eigenvector basis would therefore be ideal for compression, but computing eigenvectors is expensive, so in practice the Fourier or wavelet basis is used instead
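A quick check that a change of basis preserves the eigenvalues (assuming NumPy; $A$ and $M$ are random, so $M$ is invertible with probability 1):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
M = rng.standard_normal((3, 3))

B = np.linalg.inv(M) @ A @ M      # same transformation, written in a new basis
print(np.allclose(np.sort(np.linalg.eigvals(A)),
                  np.sort(np.linalg.eigvals(B))))   # True: same eigenvalues
```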
33 Quiz 3 Review
34 Left and Right Inverses; Pseudoinverse
The pseudoinverse $A^+$ of $A$ is the matrix for which $x = A^+Ax$ for all $x$ in the row space of $A$. The nullspace of $A^+$ is the nullspace of $A^T$.
Left inverse
If $A$ has full column rank, then $A^TA$ is invertible and $A_{left}^{-1}=(A^TA)^{-1}A^T$ is a left inverse of $A$: $A_{left}^{-1}A=I$
In the other order, $AA_{left}^{-1}=A(A^TA)^{-1}A^T=P$, the projection matrix onto the column space
Right inverse
If $A$ has full row rank, then $A_{right}^{-1}=A^T(AA^T)^{-1}$ is a right inverse of $A$: $AA_{right}^{-1}=I$
In the other order, $A_{right}^{-1}A=A^T(AA^T)^{-1}A=P$, the projection matrix onto the row space
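A sketch of both one-sided inverses, assuming NumPy (the $3\times2$ matrix is illustrative):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])                  # full column rank (m > n)

A_left = np.linalg.inv(A.T @ A) @ A.T
print(np.allclose(A_left @ A, np.eye(2)))   # True: a left inverse

P_col = A @ A_left                          # other order: projection onto C(A)
print(np.allclose(P_col @ P_col, P_col))    # True: P^2 = P

B = A.T                                     # full row rank (n > m)
B_right = B.T @ np.linalg.inv(B @ B.T)
print(np.allclose(B @ B_right, np.eye(2)))  # True: a right inverse
```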
Pseudoinverse
If $A$ is not of full rank ($r<n$ and $r<m$), its nullspace and left nullspace are both nontrivial, so $A$ has no inverse (if $Ax=0$ for some $x\neq0$, no matrix can undo this, since we would need $A^{-1}0=x$). But restricted to the row space and column space, $A$ is a one-to-one map from row-space vectors $x$ to column-space vectors $Ax$. (Proof of one-to-one: if $x\neq y$ are in the row space and $Ax=Ay$, then $A(x-y)=0$, so $x-y$ is in the nullspace of $A$; but $x-y$ is also in the row space, and the nullspace is orthogonal to the row space, so $x-y=0$, contradicting $x\neq y$.) The inverse of this restricted map is the pseudoinverse of $A$
The pseudoinverse is genuinely needed: least-squares solutions of linear regression problems require $A^TA$ to be invertible, but repeated measurements often make $A$ rank deficient, and the pseudoinverse handles exactly this situation
Computing the pseudoinverse $A^+$: from the SVD $A=U\Sigma V^T$ with $\Sigma=\begin{bmatrix}\sigma_1&&&&&\\&\ddots&&&&\\&&\sigma_r&&&\\&&&0&&\\&&&&\ddots&\\&&&&&0\end{bmatrix}$ an $m\times n$ diagonal matrix, form $\Sigma^+=\begin{bmatrix}\frac{1}{\sigma_1}&&&&&\\&\ddots&&&&\\&&\frac{1}{\sigma_r}&&&\\&&&0&&\\&&&&\ddots&\\&&&&&0\end{bmatrix}$, an $n\times m$ diagonal matrix. Then $\Sigma\Sigma^+$ and $\Sigma^+\Sigma$ are the projection matrices onto the column space and the row space respectively, and the pseudoinverse of $A$ is $A^+=V\Sigma^+U^T$
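A sketch of this construction, assuming NumPy; `np.linalg.pinv` performs the same computation and serves as the reference:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])        # rank 1: no left or right inverse exists

U, s, Vt = np.linalg.svd(A)
tol = 1e-10
s_plus = np.array([1/x if x > tol else 0.0 for x in s])  # invert only sigma_i > 0
A_plus = Vt.T @ np.diag(s_plus) @ U.T                    # A+ = V Sigma+ U^T

print(np.allclose(A_plus, np.linalg.pinv(A)))   # True
# A+ A projects onto the row space; A A+ projects onto the column space
print(np.round(A_plus @ A, 3))                  # [[0.2 0.4], [0.4 0.8]]
```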
35 Final Course Review