Chapter 6 Linear Algebra
6.1 Linear Algebra Concepts
Linear transformations and change of basis are widely used in statistics; for this reason, I briefly describe these concepts and how they are related.
6.1.1 Linear Transformation
Letting V and W be vector spaces, a function f: V \to W is a linear transformation if the additivity and scalar multiplication properties hold for any two vectors \ve{u}, \ve{v} \in V and a constant c: f(\ve{u} + \ve{v}) = f(\ve{u}) + f(\ve{v}) and f(c\ve{v}) = cf(\ve{v}).
This concept is most commonly used when working with matrices. Considering the vector spaces V = \real^n and W = \real^m, a matrix \m{A}_{m \times n} and a vector \ve{x} \in V, the function f(\ve{x}) = \m{A}\ve{x} is a linear transformation from V = \real^n to W = \real^m because it satisfies the properties mentioned above. In this definition, although not mentioned, we are assuming that both V and W are defined using the standard bases for \real^n and \real^m respectively.
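For illustration, a minimal NumPy sketch can verify both properties; the matrix \m{A} and the vectors below are arbitrary illustrative choices:

```python
import numpy as np

# An arbitrary 2x3 matrix defines a linear transformation f: R^3 -> R^2.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])

def f(x):
    return A @ x

u = np.array([1.0, -1.0, 2.0])
v = np.array([0.5, 2.0, -3.0])
c = 4.0

# Additivity: f(u + v) == f(u) + f(v)
print(np.allclose(f(u + v), f(u) + f(v)))  # True
# Scalar multiplication: f(c v) == c f(v)
print(np.allclose(f(c * v), c * f(v)))     # True
```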
6.1.2 Change of Basis
Consider a vector \ve{u} \in \real^n; it is implicitly defined using the standard basis \{\ve{e}_1, \dots, \ve{e}_n\} for \real^n, such that \ve{u} = \sum_{i=1}^{n} u_i \ve{e}_i. In a similar manner, this vector \ve{u} can also be represented in vector spaces with a different basis; this is called change of basis. For example, consider the vector space V = \real^n with basis \{\ve{v}_1, \dots, \ve{v}_n\}. Then, in order to make the change of basis, it is required to find \ve{u}_v = (u_{v1}, \dots, u_{vn})\tr such that \ve{u} = \sum_{i=1}^{n} u_{vi} \ve{v}_i = \m{V}\ve{u}_v, where the n \times n matrix \m{V} = (\ve{v}_1, \dots, \ve{v}_n). Hence the change from the standard basis to the vector space V is \ve{u}_v = \m{V}^{-1}\ve{u}, while the change from the vector space V to the standard basis is \ve{u} = \m{V}\ve{u}_v.
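A minimal NumPy sketch of this change of basis, using an arbitrary basis matrix \m{V} and vector \ve{u}:

```python
import numpy as np

# Columns of V form an arbitrary example basis of R^2.
V = np.array([[2.0, 1.0],
              [1.0, 1.0]])

u = np.array([3.0, 2.0])         # coordinates in the standard basis

# Change from the standard basis to the basis V: u_v = V^{-1} u.
# np.linalg.solve is preferred over forming the inverse explicitly.
u_v = np.linalg.solve(V, u)

# Change back from the basis V to the standard basis: u = V u_v.
print(np.allclose(V @ u_v, u))   # True
```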
Now, consider another vector space W = \real^n with basis \{\ve{w}_1, \dots, \ve{w}_n\}. The vector \ve{u}_v defined on the space V can also be defined on the space W as \ve{u}_w = \m{W}^{-1}\m{V}\ve{u}_v, where the n \times n matrix \m{W} = (\ve{w}_1, \dots, \ve{w}_n); similarly, the vector \ve{u}_w \in W can be defined on the space V as \ve{u}_v = \m{V}^{-1}\m{W}\ve{u}_w. It can be seen that in both cases the original vector is first transformed to the space with the standard basis (left-multiplying by the basis matrix) and then transformed to the desired vector space (left-multiplying by the inverse of the target basis matrix).
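The same idea in NumPy, with two arbitrary example bases \m{V} and \m{W}:

```python
import numpy as np

# Two arbitrary bases of R^2, stored as the columns of V and W.
V = np.array([[2.0, 1.0],
              [1.0, 1.0]])
W = np.array([[1.0, 0.0],
              [1.0, 2.0]])

u_v = np.array([1.0, -2.0])      # coordinates with respect to V

# V -> standard basis -> W: u_w = W^{-1} V u_v.
u_w = np.linalg.solve(W, V @ u_v)

# The reverse change recovers the original coordinates: u_v = V^{-1} W u_w.
print(np.allclose(np.linalg.solve(V, W @ u_w), u_v))  # True
```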
6.1.3 Change of Basis for Linear Transformations
Previously, we have presented a linear transformation f(\ve{x}) = \m{A}\ve{x}: \real^n \to \real^m using the standard basis. This transformation can also be represented from a vector space V with basis \{\ve{v}_1, \dots, \ve{v}_n\} to a vector space W with basis \{\ve{w}_1, \dots, \ve{w}_m\}; then f': V \to W is defined as f'(\ve{x}_v) = \m{W}^{-1}\m{A}\m{V}\ve{x}_v, where the matrices \m{W} and \m{V} are the basis matrices of the vector spaces W and V respectively. The matrix multiplication \m{W}^{-1}\m{A}\m{V} implies a change of basis from V to the standard basis, the linear transformation using the standard basis, and the change from the standard basis to the space W. In the case that V = W, the linear transformation is defined as f'(\ve{x}_v) = \m{V}^{-1}\m{A}\m{V}\ve{x}_v.
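A NumPy sketch with arbitrary example matrices \m{A}, \m{V} and \m{W} checks that applying \m{W}^{-1}\m{A}\m{V} to the coordinates \ve{x}_v agrees with the standard-basis computation:

```python
import numpy as np

# Linear transformation A in the standard basis, and bases V and W
# (all arbitrary example matrices).
A = np.array([[1.0, 2.0],
              [0.0, 1.0]])
V = np.array([[2.0, 1.0],
              [1.0, 1.0]])
W = np.array([[1.0, 0.0],
              [1.0, 2.0]])

# Matrix of the transformation from basis V to basis W: A' = W^{-1} A V.
A_prime = np.linalg.solve(W, A @ V)

x_v = np.array([1.0, 3.0])   # coordinates of x with respect to V
fx_w = A_prime @ x_v         # f'(x_v): coordinates of f(x) with respect to W

# Consistency check: mapping f'(x_v) back to the standard basis must agree
# with computing f(x) = A x directly, where x = V x_v.
print(np.allclose(W @ fx_w, A @ (V @ x_v)))  # True
```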
6.1.4 Eigenvalues and Eigenvectors
Eigenvalues and eigenvectors are used in several concepts of statistical inference and modelling. They are useful for dimension reduction, decomposition of variance-covariance matrices, and so on. For this reason, we provide basic details about eigenvectors and eigenvalues and their close relationship with linear transformations.
6.1.4.1 Definition
An eigenvector of a linear transformation \m{A}_{n \times n} is a non-zero vector \ve{v} such that the linear transformation of this vector is proportional to itself: \m{A}\ve{v} = \lambda\ve{v} \iff (\m{A} - \lambda\m{I})\ve{v} = \ve{0}, where \lambda is the eigenvalue associated with the eigenvector \ve{v}. The equation above has a non-zero solution if and only if \det(\m{A} - \lambda\m{I}) = 0. Then, all the eigenvalues \lambda of \m{A} satisfy the condition above.
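For illustration, a minimal NumPy sketch with an arbitrary matrix verifies the defining property \m{A}\ve{v} = \lambda\ve{v}:

```python
import numpy as np

# Arbitrary example matrix.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns
# are the corresponding (unit-norm) eigenvectors.
eigvals, eigvecs = np.linalg.eig(A)

# Check the defining property A v = lambda v for each pair.
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))  # True, True
```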
There is an equivalence between the linear transformation f(\ve{x}) = \m{A}\ve{x} and its eigenvalues \lambda_1, \lambda_2, \dots, \lambda_n and eigenvectors \ve{v}_1, \ve{v}_2, \dots, \ve{v}_n. This relationship provides a more useful interpretation of the eigenvalues and eigenvectors; we will use the change of basis concept to describe it.
6.1.4.2 Eigendecomposition and geometric interpretation
Considering a vector space V with basis \{\ve{v}_1, \ve{v}_2, \dots, \ve{v}_n\} formed by the eigenvectors of \m{A} (assuming \m{A} has n linearly independent eigenvectors), any vector \ve{x} \in \real^n can be represented as \m{V}\ve{x}_v, where \ve{x}_v is the representation of \ve{x} using the matrix of basis \m{V} = (\ve{v}_1, \dots, \ve{v}_n) of the vector space V. Then, the linear transformation can be expressed as f(\ve{x}) = \m{A}\ve{x} = \m{A}\m{V}\ve{x}_v = \m{V}\m{D}\ve{x}_v, where the diagonal matrix \m{D} = \diag{\lambda_1, \dots, \lambda_n} and the last equality holds because \m{A}\ve{v}_i = \lambda_i\ve{v}_i. Finally, expressing \ve{x}_v in terms of the vector \ve{x} defined on the standard basis, we obtain f(\ve{x}) = \m{V}\m{D}\m{V}^{-1}\ve{x}; the equality \m{A} = \m{V}\m{D}\m{V}^{-1} is called the eigendecomposition. Hence, the linear transformation is equivalent to the following: change the basis of \ve{x} to the vector space V, apply the diagonal linear transformation \m{D}, and return to the space with the standard basis. Geometrically, you can think of \{\ve{v}_1, \ve{v}_2, \dots, \ve{v}_n\} as the basis of the vector space V where the transformation \m{A} becomes only a scaling transformation \m{D}, and the eigenvalues \lambda_1, \lambda_2, \dots, \lambda_n are the scaling factors in the directions of the corresponding eigenvectors \ve{v}_1, \ve{v}_2, \dots, \ve{v}_n.
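A short NumPy sketch, with an arbitrary diagonalizable example matrix, verifies the eigendecomposition and this change-of-basis interpretation:

```python
import numpy as np

# Arbitrary diagonalizable example matrix.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, V = np.linalg.eig(A)   # columns of V are the eigenvectors
D = np.diag(eigvals)

# Eigendecomposition: A = V D V^{-1}.
print(np.allclose(A, V @ D @ np.linalg.inv(V)))   # True

# Applying A is equivalent to: change to the eigenbasis, scale by the
# eigenvalues, and change back to the standard basis.
x = np.array([1.0, 2.0])
x_v = np.linalg.solve(V, x)                # coordinates in the eigenbasis
print(np.allclose(A @ x, V @ (D @ x_v)))   # True
```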
6.1.4.3 Basic properties
There are certain properties that are useful for statistical modelling, such as:
- The trace of \m{A} is equal to the sum of the eigenvalues.
- The determinant of \m{A} is equal to the product of the eigenvalues.
- If \m{A} is symmetric, then all eigenvalues are real.
- If \m{A} is positive definite, then all eigenvalues are positive.
Note that some of these properties can be explained using the eigendecomposition \m{A} = \m{V}\m{D}\m{V}^{-1}.
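For illustration, a minimal NumPy sketch checks these properties for an arbitrary symmetric positive definite example matrix (built as \m{B}\tr\m{B} with \m{B} of full rank):

```python
import numpy as np

# A symmetric positive definite example matrix: B^T B with B full rank.
B = np.array([[2.0, 1.0],
              [0.0, 3.0]])
A = B.T @ B

eigvals, _ = np.linalg.eig(A)

print(np.allclose(eigvals.imag, 0.0))                # symmetric => real eigenvalues
eigvals = eigvals.real
print(np.isclose(np.trace(A), eigvals.sum()))        # trace equals the sum
print(np.isclose(np.linalg.det(A), eigvals.prod()))  # determinant equals the product
print(bool(np.all(eigvals > 0)))                     # positive definite => positive
```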
6.1.5 Cauchy–Schwarz inequality
For any two vectors \ve{u}, \ve{v} in an inner product space, the Cauchy–Schwarz inequality states that |\langle \ve{u}, \ve{v} \rangle| \le \|\ve{u}\| \, \|\ve{v}\|, with equality if and only if \ve{u} and \ve{v} are linearly dependent.
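A minimal NumPy sketch checks the inequality for arbitrary random vectors, and the equality case for linearly dependent ones:

```python
import numpy as np

# Numerical check of the Cauchy-Schwarz inequality for random vectors.
rng = np.random.default_rng(0)
u = rng.standard_normal(5)
v = rng.standard_normal(5)

lhs = abs(np.dot(u, v))                       # |<u, v>|
rhs = np.linalg.norm(u) * np.linalg.norm(v)   # ||u|| ||v||
print(lhs <= rhs)  # True

# Equality holds when the vectors are linearly dependent, e.g. v = 2u.
w = 2 * u
print(np.isclose(abs(np.dot(u, w)),
                 np.linalg.norm(u) * np.linalg.norm(w)))  # True
```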