Chapter 6 Linear Algebra

6.1 Linear Algebra Concepts

Linear transformations and change of basis are widely used in statistics; for this reason, I briefly describe the definitions of these concepts and how they are related.

6.1.1 Linear Transformation

Letting \(V\) and \(W\) be vector spaces, a function \(f: V \rightarrow W\) is a linear transformation if the additivity and scalar multiplication properties hold for any two vectors \(\ve{u}, \ve{v} \in V\) and any scalar \(c\): \[f(\ve{u}+\ve{v}) = f(\ve{u}) + f(\ve{v})\] \[f(c\ve{v}) = cf(\ve{v}).\]

This concept is most commonly used when working with matrices. Considering the vector spaces \(V = \real^n\) and \(W = \real^m\), a matrix \(\m{A}_{m \times n}\) and a vector \(\ve{x} \in V\), the function \[f(\ve{x}) = \m{A}\ve{x}\] is a linear transformation from \(V = \real^n\) to \(W = \real^m\) because it satisfies the two properties mentioned above. In this definition, although not mentioned, we are assuming that both \(V\) and \(W\) are defined using the standard bases for \(\real^n\) and \(\real^m\) respectively.
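As a quick numerical illustration, the following sketch (assuming Python with NumPy; the matrix \(\m{A}\) and the vectors are arbitrary random values) verifies that a matrix product satisfies both properties:

```python
# A minimal numerical check of the two linearity properties for f(x) = A x.
# The matrix A and the vectors u, v are arbitrary illustrative values.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 2))   # linear map from R^2 to R^3
u, v = rng.normal(size=2), rng.normal(size=2)
c = 2.5

# Additivity: f(u + v) == f(u) + f(v)
print(np.allclose(A @ (u + v), A @ u + A @ v))  # True
# Scalar multiplication: f(c v) == c f(v)
print(np.allclose(A @ (c * v), c * (A @ v)))    # True
```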

6.1.2 Change of Basis

Consider a vector \(\ve{u} \in \real^n\); it is implicitly defined using the standard basis \(\{\ve{e}_1,\dots,\ve{e}_n\}\) for \(\real^n\), such that \(\ve{u}=\sum_{i=1}^n u_i \ve{e}_i\). In a similar manner, the vector \(\ve{u}\) can also be represented with respect to a different basis; this is called a change of basis. For example, consider the vector space \(V = \real^n\) with basis \(\{\ve{v}_1,\dots,\ve{v}_n\}\). Then, in order to make the change of basis, it is required to find \(\ve{u}_v=(u_{v_1},\dots,u_{v_n})^\tr\) such that \[\ve{u} = \sum_{i=1}^n u_{v_i} \ve{v}_i = \m{V}\ve{u}_v,\] where \(\m{V}=(\ve{v}_1,\dots,\ve{v}_n)\) is the \(n\times n\) matrix whose columns are the basis vectors. Hence the change from the standard basis to the basis of \(V\) is \[\ve{u}_v = \m{V}^{-1}\ve{u},\] while the change from the basis of \(V\) back to the standard basis is \[\ve{u} = \m{V}\ve{u}_v.\]
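A minimal sketch of this computation, assuming Python with NumPy and an illustrative basis, is shown below; solving the linear system \(\m{V}\ve{u}_v = \ve{u}\) avoids forming \(\m{V}^{-1}\) explicitly:

```python
# Change of basis: the coordinates of u in the basis {v_1, ..., v_n}
# are u_v = V^{-1} u. Basis and vector here are illustrative values.
import numpy as np

V = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # columns are the basis vectors v_1, v_2
u = np.array([3.0, 2.0])            # u in the standard basis

u_v = np.linalg.solve(V, u)         # coordinates of u in the basis of V
print(u_v)                          # [1. 2.], i.e. u = 1*v_1 + 2*v_2
print(np.allclose(V @ u_v, u))      # back to the standard basis: True
```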

Now, consider another vector space \(W = \real^n\) with basis \(\{\ve{w}_1,\dots,\ve{w}_n\}\). The vector \(\ve{u}_v\) defined on the space \(V\) can also be defined on the space \(W\) as \[\ve{u}_w = \m{W}^{-1}\m{V}\ve{u}_v,\] where \(\m{W}=(\ve{w}_1,\dots,\ve{w}_n)\) is the corresponding \(n\times n\) basis matrix; similarly, the vector \(\ve{u}_w \in W\) can be defined on the space \(V\) as \[\ve{u}_v = \m{V}^{-1}\m{W}\ve{u}_w.\] It can be seen that in both cases, the original vector is first transformed to the standard basis (left-multiplying by its basis matrix) and then transformed to the desired basis (left-multiplying by the inverse of the target basis matrix).
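The two-step structure can be made explicit in code; the sketch below (Python/NumPy, with illustrative bases) changes coordinates from the basis of \(V\) to the basis of \(W\):

```python
# Changing coordinates from the basis of V to the basis of W:
# u_w = W^{-1} V u_v. Bases and coordinates are illustrative values.
import numpy as np

V = np.array([[1.0, 1.0], [0.0, 1.0]])   # basis matrix of V
W = np.array([[2.0, 0.0], [0.0, 1.0]])   # basis matrix of W
u_v = np.array([1.0, 2.0])               # coordinates of u in the basis of V

u = V @ u_v                              # step 1: back to the standard basis
u_w = np.linalg.solve(W, u)              # step 2: into the basis of W
print(u_w)                               # [1.5 2. ]
print(np.allclose(V @ u_v, W @ u_w))     # both represent the same vector: True
```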

6.1.3 Change of Basis for Linear Transformations

Previously, we presented a linear transformation \(f(\ve{x})=\m{A}\ve{x}:\real^n\rightarrow\real^m\) using the standard bases. This transformation can also be represented from a vector space \(V\) with basis \(\{\ve{v}_1,\dots,\ve{v}_n\}\) to a vector space \(W\) with basis \(\{\ve{w}_1,\dots,\ve{w}_m\}\); then \(f': V \rightarrow W\) is defined as \[f'(\ve{x}_v) = \m{W}^{-1}\m{A}\m{V}\ve{x}_v,\] where the matrices \(\m{W}\) and \(\m{V}\) are the basis matrices of the vector spaces \(W\) and \(V\) respectively. The matrix product \(\m{W}^{-1}\m{A}\m{V}\) represents a change from the basis of \(V\) to the standard basis, the linear transformation using the standard basis, and the change from the standard basis to the space \(W\). In the case that \(V=W\), the linear transformation is defined as \[f'(\ve{x}_v) = \m{V}^{-1}\m{A}\m{V}\ve{x}_v.\]
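The following sketch (Python/NumPy, illustrative matrices) builds \(\m{W}^{-1}\m{A}\m{V}\) and checks that it maps \(V\)-coordinates to the \(W\)-coordinates of \(\m{A}\ve{x}\):

```python
# Representing f(x) = A x in the bases of V (domain) and W (codomain)
# via A' = W^{-1} A V. All matrices here are illustrative values.
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # transformation in the standard basis
V = np.array([[1.0, 1.0], [0.0, 1.0]])   # basis matrix of V (domain)
W = np.array([[2.0, 0.0], [0.0, 1.0]])   # basis matrix of W (codomain)

A_prime = np.linalg.solve(W, A @ V)      # W^{-1} A V without forming W^{-1}

x_v = np.array([1.0, 2.0])               # x expressed in the basis of V
x = V @ x_v                              # the same x in the standard basis
# f'(x_v) should equal the W-coordinates of A x:
print(np.allclose(A_prime @ x_v, np.linalg.solve(W, A @ x)))  # True
```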

6.1.4 Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors are used in several concepts of statistical inference and modelling. They are useful for dimension reduction, the decomposition of variance-covariance matrices, and so on. For this reason, we provide basic details about eigenvectors and eigenvalues and their close relationship with linear transformations.

6.1.4.1 Definition

An eigenvector of a linear transformation \(\m{A}_{n\times n}\) is a non-zero vector \(\ve{v}\) such that the linear transformation of this vector is proportional to itself: \[\m{A}\ve{v} = \lambda \ve{v} \iff (\m{A}-\lambda\m{I})\ve{v} = \ve{0},\] where \(\lambda\) is the eigenvalue associated with the eigenvector \(\ve{v}\). The equation above has a non-zero solution if and only if \[\det(\m{A}-\lambda\m{I}) = 0.\] Hence, all the eigenvalues \(\lambda\) of \(\m{A}\) satisfy this condition.
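A small sketch (Python/NumPy, illustrative matrix) verifies the defining equation and the determinant condition for each computed eigenpair:

```python
# Check A v = lambda v and det(A - lambda I) = 0 for each eigenpair.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, eigvecs = np.linalg.eig(A)       # columns of eigvecs are eigenvectors

for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))    # True for every eigenpair
# The characteristic-polynomial condition det(A - lambda I) = 0 also holds:
print(np.allclose([np.linalg.det(A - lam * np.eye(2)) for lam in eigvals], 0.0))
```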

There is an equivalence between the linear transformation \(f(\ve{x}) = \m{A}\ve{x}\) and its eigenvalues \(\lambda_1, \lambda_2, \dots, \lambda_n\) and eigenvectors \(\ve{v}_1, \ve{v}_2, \dots, \ve{v}_n\). This relationship provides a more useful interpretation of the eigenvalues and eigenvectors; we will use the change of basis concept to describe it.

6.1.4.2 Eigendecomposition and geometric interpretation

Considering a vector space \(V\) with basis \(\{\ve{v}_1, \ve{v}_2, \dots, \ve{v}_n\}\) formed by the eigenvectors of \(\m{A}\), any vector \(\ve{x} \in \real^n\) can be represented as \(\m{V}\ve{x}_v\), where \(\ve{x}_v\) is the representation of \(\ve{x}\) using the basis matrix \(\m{V}=(\ve{v}_1, \dots, \ve{v}_n)\) of the vector space \(V\). Then, the linear transformation can be expressed as \[f(\ve{x}) = \m{A}\ve{x} = \m{A}\m{V}\ve{x}_v = \m{V}\m{D}\ve{x}_v,\] where \(\m{D}=\diag{\lambda_1, \dots, \lambda_n}\) is a diagonal matrix and the last equality holds because \(\m{A}\ve{v}_i=\lambda_i\ve{v}_i\). Finally, expressing \(\ve{x}_v\) in terms of the vector \(\ve{x}\) defined on the standard basis, we obtain \[f(\ve{x}) = \m{V}\m{D}\m{V}^{-1}\ve{x};\] the equality \(\m{A}=\m{V}\m{D}\m{V}^{-1}\) is called the eigendecomposition. Hence, the linear transformation is equivalent to the following: change the basis of \(\ve{x}\) to the vector space \(V\), apply the diagonal linear transformation \(\m{D}\), and return to the space with the standard basis. Geometrically, you can think of \(\{\ve{v}_1, \ve{v}_2, \dots, \ve{v}_n\}\) as the basis of the vector space \(V\) where the transformation \(\m{A}\) becomes only a scaling transformation \(\m{D}\), and the eigenvalues \(\lambda_1, \lambda_2, \dots, \lambda_n\) are the scaling factors in the directions of the corresponding eigenvectors \(\ve{v}_1, \ve{v}_2, \dots, \ve{v}_n\).
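The sketch below (Python/NumPy, illustrative matrix) confirms the eigendecomposition and walks through the three-step interpretation:

```python
# Eigendecomposition A = V D V^{-1} and the three-step reading of f(x):
# change basis, scale by the eigenvalues, change back.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, V = np.linalg.eig(A)             # V holds the eigenvectors as columns
D = np.diag(eigvals)

# A equals V D V^{-1}:
print(np.allclose(A, V @ D @ np.linalg.inv(V)))   # True

x = np.array([1.0, -2.0])
x_v = np.linalg.solve(V, x)               # 1) coordinates of x in the eigenbasis
scaled = D @ x_v                          # 2) pure scaling by the eigenvalues
print(np.allclose(V @ scaled, A @ x))     # 3) back to the standard basis: True
```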

6.1.4.3 Basic properties

There are certain properties that are useful for statistical modelling, such as:

  • The trace of \(\m{A}\) equals the sum of the eigenvalues.
  • The determinant of \(\m{A}\) equals the product of the eigenvalues.
  • If \(\m{A}\) is symmetric, then all eigenvalues are real.
  • If \(\m{A}\) is positive definite, then all eigenvalues are positive.

Note that some of these properties can be explained using the eigendecomposition \(\m{A} = \m{V}\m{D}\m{V}^{-1}\), and all four can be checked numerically, as in the sketch below.
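A minimal check of the four properties, assuming Python with NumPy and an illustrative symmetric positive definite matrix:

```python
# Numerical check of the listed eigenvalue properties.
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])                            # symmetric, positive definite
eigvals = np.linalg.eigvals(A)

print(np.allclose(np.trace(A), eigvals.sum()))        # trace = sum of eigenvalues
print(np.allclose(np.linalg.det(A), eigvals.prod()))  # det = product of eigenvalues
print(np.allclose(eigvals.imag, 0.0))                 # symmetric => real eigenvalues
print(np.all(eigvals.real > 0))                       # positive definite => positive
```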

6.1.5 Cauchy–Schwarz inequality

For any two vectors \(\ve{u}, \ve{v} \in \real^n\), the Cauchy–Schwarz inequality states that \[|\langle \ve{u}, \ve{v} \rangle| \leq \|\ve{u}\| \, \|\ve{v}\|,\] with equality if and only if \(\ve{u}\) and \(\ve{v}\) are linearly dependent.
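A brief numerical check of the inequality, assuming Python with NumPy and arbitrary random vectors:

```python
# Cauchy-Schwarz under the standard inner product, on random vectors.
import numpy as np

rng = np.random.default_rng(1)
u, v = rng.normal(size=5), rng.normal(size=5)

lhs = abs(np.dot(u, v))                          # |<u, v>|
rhs = np.linalg.norm(u) * np.linalg.norm(v)      # ||u|| ||v||
print(lhs <= rhs)                                # True

# Equality holds exactly when u and v are linearly dependent:
w = -3.0 * u
print(np.isclose(abs(np.dot(u, w)), np.linalg.norm(u) * np.linalg.norm(w)))  # True
```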