$Ax=b$, $Ax=\lambda x$, $Av=\sigma u$, $\min \frac{||Ax||^2}{||x||^2}$

  • Multiplication Ax using columns of A

    • Column space, row space, independent vectors, basis, rank, CR decomposition
    • Matrix multiplication as a sum of outer products (sketch below)
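
A minimal numpy sketch (the matrix and names are my own) of the column picture: $Ax$ combines the columns of $A$, and a rank-$r$ matrix factors as $A = CR$ with $r$ independent columns in $C$.

```python
import numpy as np

# A has rank 2: column 3 = column 1 + column 2
A = np.array([[1., 0., 1.],
              [2., 1., 3.],
              [3., 1., 4.]])
x = np.array([2., -1., 1.])

# Ax is a combination of the columns of A: x1*a1 + x2*a2 + x3*a3
combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))
assert np.allclose(A @ x, combo)

# CR decomposition: C holds independent columns, R expresses every column of A in that basis
C = A[:, :2]                                # the first two columns are independent
R = np.linalg.lstsq(C, A, rcond=None)[0]    # solves C @ R = A exactly here
assert np.allclose(C @ R, A)
print(np.linalg.matrix_rank(A))             # 2
```
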
  • Factorization

    • $A=LU$

      • Elimination step by step: $Ax=b\to LUx=b \to Lc = b,\ c = Ux$
    • $A=QR$

    • Eigenvalue decomposition

      • Symmetric, positive definite, and positive semidefinite matrices; the energy function $x^TSx$
      • Some properties of rank
      • $A+sI \to \lambda_1+s,\ \lambda_2+s,\ \dots$ (shifting by $sI$ shifts every eigenvalue by $s$)
    • Singular value decomposition (SVD): $Av_1 = \sigma_1 u_1$

      • derive from $AA^T$ and $A^TA$

      • How to compute

      • The relation between the left and right singular vectors ($u_i = Av_i/\sigma_i$); the remaining vectors go with singular value $0$

      • If $A=xy^T$ has rank 1: $\lambda = y^Tx$ and $\sigma_1 = ||x||\,||y||$, so $\sigma_1 \ge |\lambda|$ (Cauchy–Schwarz)

      • Geometric picture: rotate/reflect, scale, rotate/reflect; the reduced (economy) form

      • Understanding the SVD as a function (a map acting on vectors)

      • Polar form analogy $r\cos\theta+ir\sin\theta = re^{i\theta}$: the polar decomposition $A = QS$ with $Q$ orthogonal and $S$ symmetric positive semidefinite separates rotation from stretch (numerical sketch after this list)

      • \[A = U\Sigma V^T=(UV^T)(V\Sigma V^T) = QS\]
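
A small numpy check, on a random matrix of my choosing, of the SVD facts above and the polar decomposition $A = QS = (UV^T)(V\Sigma V^T)$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

# SVD A = U Sigma V^T; the squared singular values are the eigenvalues of A^T A
U, s, Vt = np.linalg.svd(A)
assert np.allclose(A, U @ np.diag(s) @ Vt)
assert np.allclose(s**2, np.sort(np.linalg.eigvalsh(A.T @ A))[::-1])

# Polar decomposition A = Q S: Q orthogonal, S symmetric positive semidefinite
Q = U @ Vt
S = Vt.T @ np.diag(s) @ Vt
assert np.allclose(Q.T @ Q, np.eye(3))          # rotation/reflection part
assert np.allclose(A, Q @ S)                    # rotation times stretch
```
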
  • Orthogonal matrices and projections

    • Orthogonalization: given $a$ and $b$, $c = b - \frac{a^Tb}{a^Ta}a$ is orthogonal to $a$
    • Orthogonal projection: $P = QQ^T$; a projection matrix satisfies $P^2=P$
    • Coefficients in an orthogonal basis: $c_1=q_1^Tv$
    • Multiplying by an orthogonal matrix does not change the norm. This ties matrix norms to the SVD $A=U\Sigma V^T$: since $U$ and $V$ are orthogonal, the norm of $A$ depends only on $\Sigma$ (see the sketch below)
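
A quick numpy sketch (random vectors of my own) of the Gram–Schmidt step, the projection $P = QQ^T$ with $P^2 = P$, and norm invariance under an orthogonal $Q$.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = rng.standard_normal(4), rng.standard_normal(4)

# Gram-Schmidt step: c is the part of b orthogonal to a
c = b - (a @ b) / (a @ a) * a
assert abs(a @ c) < 1e-10

# Orthonormal columns Q, projection P = Q Q^T onto their span, and P^2 = P
Q, _ = np.linalg.qr(np.column_stack([a, c]))
P = Q @ Q.T
assert np.allclose(P @ P, P)
assert np.allclose(Q.T @ (P @ b), Q.T @ b)      # coefficients c_i = q_i^T b

# A square orthogonal matrix leaves both the Frobenius and the l2 norm unchanged,
# so the norm of A = U Sigma V^T depends only on Sigma
Qfull, _ = np.linalg.qr(rng.standard_normal((4, 4)))
M = rng.standard_normal((4, 3))
assert np.isclose(np.linalg.norm(Qfull @ M), np.linalg.norm(M))        # Frobenius
assert np.isclose(np.linalg.norm(Qfull @ M, 2), np.linalg.norm(M, 2))  # spectral
```
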
  • norm

    • Eckart–Young: if $B$ has rank $k$, then $||A-B|| \ge ||A - A_k||$, where $A_k$ is the rank-$k$ truncated SVD
    • Inner product
    • Matrix norms: the three norms below depend only on the singular values $\sigma$; what about the others? (numerical check after this list)
      • l2 norm: $\max \frac{||Ax||}{||x||} = \sigma_1$
      • Frobenius: $\sqrt{\sigma_1^2+\sigma_2^2+\cdots + \sigma_r^2}$
      • Nuclear norm: $\sigma_1+\sigma_2+\cdots + \sigma_r$; equals the minimum of $||U||_F||V||_F$ over factorizations $A=UV^T$; also $||A^TA||_N=||A||_F^2$
    • Vector Norm $l_0, l_1, l_2, l_\infty$
    • The intuition behind the matrix norm: the minimum of $||V||_p$
    • Properties of a norm: rescaling ($||cv|| = |c|\,||v||$) and the triangle inequality
    • Function norms → the vector space should be complete: $||v_n - v_\infty|| \to 0$
      • The space of sequences that end in all zeros is not complete (p105)
      • Require $\text{norm} < \infty$
    • Spectral radius: governs the stationary behavior of a Markov chain
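
A numpy check (random test matrix of mine) that the three matrix norms come straight from the singular values, plus an Eckart–Young spot check against random rank-$k$ competitors.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 5))
U, s, Vt = np.linalg.svd(A)

# The three matrix norms read off the singular values
assert np.isclose(np.linalg.norm(A, 2), s[0])                       # l2 / spectral
assert np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))  # Frobenius
assert np.isclose(np.linalg.norm(A, 'nuc'), np.sum(s))              # nuclear

# Eckart-Young: the truncated SVD A_k beats every rank-k matrix B
k = 2
Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
best = np.linalg.norm(A - Ak, 2)                # equals sigma_{k+1} = s[k]
for _ in range(100):
    B = rng.standard_normal((6, k)) @ rng.standard_normal((k, 5))   # rank <= k
    assert np.linalg.norm(A - B, 2) >= best - 1e-9
```
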
  • Application

    • PCA

      • The statistics behind it
        • variances are the diagonal entries of the matrix $AA^T$
        • covariances are the off-diagonal entries of matrix $AA^T$
      • The geometry behind it: the sum of squared distances from the data points to the $u_1$ line is a minimum
      • The linear algebra behind it: the total variance $T$ is the sum of the eigenvalues of the sample covariance matrix (the $\sigma_i^2/(n-1)$)
      • The quick drop of $\sigma$ in the Hilbert matrix (see the sketch below)
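
A small numpy sketch of the PCA bullets, using made-up data with 3 features and 200 samples: the (co)variances sit in $AA^T/(n-1)$ after centering, the total variance is $\sum \sigma_i^2/(n-1)$, and the Hilbert matrix shows the fast decay of $\sigma$.

```python
import numpy as np
from scipy.linalg import hilbert

rng = np.random.default_rng(3)
X = rng.standard_normal((3, 200)) * np.array([[3.0], [1.0], [0.1]])  # 3 features, 200 samples
A = X - X.mean(axis=1, keepdims=True)       # center each feature (row)
n = A.shape[1]

# Sample covariance: variances on the diagonal, covariances off the diagonal
S = A @ A.T / (n - 1)
assert np.allclose(np.diag(S), A.var(axis=1, ddof=1))

# Total variance = trace(S) = sum of sigma_i^2 / (n-1); u_1 is the top principal direction
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.isclose(np.trace(S), np.sum(s**2) / (n - 1))
print(U[:, 0])                  # principal direction u_1
print(s**2 / (n - 1))           # variance captured by each component

# The singular values of the 10x10 Hilbert matrix drop off extremely fast
print(np.linalg.svd(hilbert(10), compute_uv=False))
```
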
    • Rayleigh quotients and the generalized eigenvalue problem $Sx_1 = \lambda M x_1$ (p. 99): \(R(x) = \frac{x^TSx}{x^Tx}\)

      • Generalized Rayleigh quotient $R(x) = \frac{x^TSx}{x^TMx}$
        • $M$ is a covariance matrix, positive definite; the maximum of $R(x)$ is the largest eigenvalue of $M^{-1}S$, equivalently of $M^{-\frac{1}{2}}SM^{-\frac{1}{2}}$ (sketch after this list)
      • Generalized eigenvectors are M-orthogonal: $x_1^TMx_2 = 0$; substitute $x = M^{-\frac{1}{2}}y$
      • Semidefinite situation $\alpha Sx = \beta Mx$, where $\alpha$ may equal $0$ (when the number of samples is smaller than the number of features)
      • Generalized SVD: $A=U_A\Sigma_AZ$, $B=U_B\Sigma_BZ$
      • Any two positive definite matrices can be diagonalized by the same invertible matrix
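
A sketch of the generalized Rayleigh quotient statements, using scipy's generalized symmetric eigensolver and matrices I made up.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(4)
B = rng.standard_normal((4, 4)); S = B @ B.T               # symmetric
C = rng.standard_normal((4, 4)); M = C @ C.T + np.eye(4)   # positive definite

# Generalized problem S x = lambda M x (eigenvalues returned in ascending order)
lam, X = eigh(S, M)

# Same eigenvalues as the symmetric matrix M^{-1/2} S M^{-1/2}
w, V = np.linalg.eigh(M)
Mih = V @ np.diag(w**-0.5) @ V.T                            # M^{-1/2}
assert np.allclose(lam, np.linalg.eigvalsh(Mih @ S @ Mih))

# Generalized eigenvectors are M-orthogonal: x_i^T M x_j = 0 for i != j
G = X.T @ M @ X
assert np.allclose(G, np.diag(np.diag(G)), atol=1e-10)

# max R(x) = x^T S x / x^T M x is the largest generalized eigenvalue
x = X[:, -1]
assert np.isclose((x @ S @ x) / (x @ M @ x), lam[-1])
```
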
    • LDA

      Separation ratio \(R = \frac{(x^Tm_1-x^Tm_2)^2}{x^T\Sigma_1x+x^T\Sigma_2x}\), with between-class matrix $S = (m_1-m_2)(m_1-m_2)^T$

      I did not fully understand the physical meaning behind the solution (the maximizing direction is sketched below).
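
A hedged numpy sketch of what maximizing $R$ means geometrically: the best direction is $x \propto (\Sigma_1+\Sigma_2)^{-1}(m_1-m_2)$, checked against random directions on made-up two-class data.

```python
import numpy as np

rng = np.random.default_rng(5)
X1 = rng.standard_normal((200, 2)) + np.array([2.0, 0.0])   # class 1 samples
X2 = rng.standard_normal((200, 2)) + np.array([0.0, 1.5])   # class 2 samples

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
Sigma1, Sigma2 = np.cov(X1, rowvar=False), np.cov(X2, rowvar=False)

# Fisher criterion R(x) = (x^T m1 - x^T m2)^2 / (x^T Sigma1 x + x^T Sigma2 x)
def R(x):
    return (x @ (m1 - m2)) ** 2 / (x @ (Sigma1 + Sigma2) @ x)

# The maximizing direction: x proportional to (Sigma1 + Sigma2)^{-1} (m1 - m2)
x_star = np.linalg.solve(Sigma1 + Sigma2, m1 - m2)

# No random direction beats x_star
for _ in range(1000):
    assert R(rng.standard_normal(2)) <= R(x_star) + 1e-9
print(x_star / np.linalg.norm(x_star))
```
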

$|\lambda_1| \le \sigma_1$: the largest singular value bounds the eigenvalues.

AB and BA

$ABx=\lambda x \Rightarrow BABx=\lambda Bx$, so $Bx$ is an eigenvector of $BA$; the eigenvalues correspond as well (checked numerically below).
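
A quick numpy check of the $AB$ vs $BA$ relation with random rectangular $A$, $B$ of my choosing: the two products share their nonzero eigenvalues, and $Bx$ is an eigenvector of $BA$.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))
AB, BA = A @ B, B @ A                        # 3x3 and 5x5

# If AB x = lam x, then BA (Bx) = lam (Bx): Bx is an eigenvector of BA
lam, X = np.linalg.eig(AB)
for lam_i, x in zip(lam, X.T):
    assert np.allclose(BA @ (B @ x), lam_i * (B @ x))

# BA has the same three eigenvalues, padded with (numerically) zero ones
print(np.sort_complex(lam))
print(np.sort_complex(np.linalg.eig(BA)[0]))
```
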

Low rank and Compressed Sensing

  • Key insights
    • Matrices are composed of low-rank pieces: $uv^T$ is the extreme case with rank 1
    • Singular values reveal a low effective rank
    • Incomplete matrices can often be completed by low-rank matrices (matrix completion)
  • How a matrix changes when a low-rank matrix is added

    • Normal perspective (trade the large-matrix computation for a small one)

      • $A^{-1}$ \((A-UV^T)^{-1} = A^{-1}+A^{-1}U(I-V^TA^{-1}U)^{-1}V^TA^{-1}\)

        \[(I-UV^T)^{-1} = I +U(I-V^TU)^{-1}V^T\]
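
A numpy check of the update formula above, with random $A$, $U$, $V$ that I picked; the point is that only a small $k \times k$ matrix gets inverted.

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 6, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)      # keep A well-conditioned
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))
Ainv = np.linalg.inv(A)

# (A - U V^T)^{-1} = A^{-1} + A^{-1} U (I - V^T A^{-1} U)^{-1} V^T A^{-1}
small = np.linalg.inv(np.eye(k) - V.T @ Ainv @ U)     # only a k-by-k inverse
woodbury = Ainv + Ainv @ U @ small @ V.T @ Ainv
assert np.allclose(woodbury, np.linalg.inv(A - U @ V.T))
```
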
      • Eigenvalues and singular values (interlacing)
        A graphical explanation: the updated eigenvalues solve an equation built from inverse functions $\frac{1}{\lambda_i - z}$, so each new one is trapped by the old ones: \(z_1 \ge \lambda_1 \ge z_2 \ge \lambda_2 \ge \cdots\)

  • Derivative perspective

    • $A^{-1}$ \(\frac{dA^{-1}}{dt} = - A^{-1}\frac{dA}{dt}A^{-1}\)

      \[\frac{d\lambda}{dt}=y^T\frac{dA}{dt}x\]
      ($y$ and $x$ are the left and right eigenvectors, normalized so that $y^Tx=1$)

    • For symmetric $S$ and $T$ (finite-difference check after this list):
    \[\lambda_{max}(S+T)\le \lambda_{max}(S) + \lambda_{max}(T)\]
    \[\lambda_{min}(S+T) \ge \lambda_{min}(S) + \lambda_{min}(T)\]

    However, it is hard to find the intermediate ones.

    • Weyl inequality (symmetric $A$, $B$):
      \(\lambda_{i+j-1}(A+B)\le \lambda_i(A)+\lambda_j(B)\)
      The same form holds with singular values for general matrices.
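
A finite-difference check of the inverse-derivative formula and the $\lambda_{max}$ / $\lambda_{min}$ bounds, on matrices of my own (the parametrization $A(t) = A_0 + tB$ is an assumption for the test).

```python
import numpy as np

rng = np.random.default_rng(8)
n, dt = 4, 1e-6
A0 = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned
Bdot = rng.standard_normal((n, n))

def A(t):
    # A(t) = A0 + t * Bdot, so dA/dt = Bdot
    return A0 + t * Bdot

# d(A^{-1})/dt = -A^{-1} (dA/dt) A^{-1}, checked by a finite difference
fd = (np.linalg.inv(A(dt)) - np.linalg.inv(A0)) / dt
assert np.allclose(fd, -np.linalg.inv(A0) @ Bdot @ np.linalg.inv(A0), atol=1e-4)

# lambda_max is subadditive and lambda_min superadditive for symmetric S, T
S = rng.standard_normal((n, n)); S = S + S.T
T = rng.standard_normal((n, n)); T = T + T.T
eS, eT, eST = np.linalg.eigvalsh(S), np.linalg.eigvalsh(T), np.linalg.eigvalsh(S + T)
assert eST[-1] <= eS[-1] + eT[-1] + 1e-12
assert eST[0] >= eS[0] + eT[0] - 1e-12
```
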
  • Saddle points from Lagrange multipliers

    • Lagrangian: $L(x, \lambda) = \frac{1}{2}x^TSx + \lambda^T(Ax-b)$; setting its gradients to zero produces the KKT block system, and the solution is a saddle point of $L$ (sketch below)
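
A minimal sketch of that saddle point, with a tiny made-up $S$, $A$, $b$: the gradients of $L$ give a KKT block matrix whose eigenvalues have both signs.

```python
import numpy as np

# Minimize (1/2) x^T S x subject to A x = b  (tiny made-up example)
S = np.array([[2., 0.], [0., 4.]])          # positive definite
A = np.array([[1., 1.]])
b = np.array([1.])

# Setting the gradients of L(x, lam) to zero gives the KKT block system
KKT = np.block([[S, A.T],
                [A, np.zeros((1, 1))]])
sol = np.linalg.solve(KKT, np.concatenate([np.zeros(2), b]))
x, lam = sol[:2], sol[2:]
assert np.allclose(A @ x, b)                # the constraint holds at the solution

# The KKT matrix is indefinite, so (x, lam) is a saddle point of L, not a minimum
print(np.linalg.eigvalsh(KKT))              # two positive and one negative eigenvalue
```
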
  • application
    • Updating least squares when a new row arrives (sketch at the end)
    • Kalman filter: TODO, read P112
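
A hedged sketch of the "new row" update: when one more measurement row $a$ with value $\beta$ arrives, $A^TA$ gains the rank-one piece $aa^T$, so Sherman–Morrison refreshes $(A^TA)^{-1}$ instead of re-solving (the data here is made up).

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((10, 3))
b = rng.standard_normal(10)

# Current least-squares solution and P = (A^T A)^{-1}
P = np.linalg.inv(A.T @ A)
x = P @ A.T @ b

# A new measurement row a with right-hand side beta arrives
a = rng.standard_normal(3)
beta = 0.7

# Sherman-Morrison: (A^T A + a a^T)^{-1} = P - P a a^T P / (1 + a^T P a)
P_new = P - np.outer(P @ a, a @ P) / (1.0 + a @ P @ a)
x_new = P_new @ (A.T @ b + beta * a)

# Matches redoing the full least-squares problem with the extra row
A_full = np.vstack([A, a])
b_full = np.append(b, beta)
assert np.allclose(x_new, np.linalg.lstsq(A_full, b_full, rcond=None)[0])
```
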