Linear Algebra: Linear Transformations
Matrices
Overview
Linear transformations can be expressed using notation that is much more efficient than standard function notation. For example, recall the linear transformation $T(x,y,z) = [3x + 2y - z, 2x + y + 3z, 4x - 3y + 7z]$ from the previous section, and note how clunky it is to read. This section introduces the concept of a matrix, which is not only easier to read on the page, but also perfect for computers to work with.
First, we review the definitions of ordered pairs and ordered $n$-tuples to provide context for the formal definition of a matrix and matrix notation. Then we give the process for creating matrices from linear transformations. Finally, we cover the most common application: creating matrices for linear transformations in $\mathcal{L}(F^n, F^m)$ with respect to the standard bases.
Review - Ordered Pairs and Ordered $n$-Tuples
Matrices are an extension of the concepts of ordered pairs and ordered $n$-tuples, so we start with a brief review of them here.
Recall that an ordered pair $(a, b)$ is defined as the set $\{\{a\}, \{a, b\}\}.$ This definition uses the simple mechanisms of set theory to enable the definition of the cartesian product, and from there relations, functions, and much more, all without reference to the number 2. Note, however, that once the nice notation of the ordered pair is rigorously defined, we never reference the underlying set-theoretic definition and its curly-brace-laden unwieldiness.
Once the natural numbers are defined, the ordered pair is generalized to the ordered $n$-tuple. An $n$-tuple is defined as a function from the set of the first $n$ positive natural numbers into some set. Thus, a $3$-tuple of real numbers $(1, 3, -2.5)$ is shorthand for a function $f$ where $f(1) = 1,$ $f(2) = 3,$ and $f(3) = -2.5.$ But as with ordered pairs, once the definition for the nice notation of $n$-tuples is given, we never use the underlying structure in terms of functions of sets of natural numbers again.
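To make the "tuple as a function" idea concrete, here is a minimal Python sketch. It is only an illustration: modeling the function as a `dict` and the helper name `as_tuple` are assumptions made for this example, not part of the formal definition.

```python
# The 3-tuple (1, 3, -2.5) viewed as a function f from {1, 2, 3} into the reals,
# modeled here as a dict that maps each position to its value.
f = {1: 1, 2: 3, 3: -2.5}


def as_tuple(func):
    """Read the values off in order of position to recover the usual notation."""
    return tuple(func[i] for i in range(1, len(func) + 1))


print(f[2])         # 3
print(as_tuple(f))  # (1, 3, -2.5)
```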
Matrices
We now extend this pattern to functions on the cartesian product of two sets of natural numbers. An $n \times m$ matrix $A$ is a function $A : \{1, 2, \ldots, n\} \times \{1, 2, \ldots, m\} \rightarrow S,$ where $S$ is any nonempty set. Rather than tediously write out the function as $A(1, 1) = a_{1,1},$ $A(1, 2) = a_{1, 2},$ and so on, we arrange the values into a rectangular grid:
$$A = \begin{bmatrix} a_{1, 1} & a_{1, 2} & \ldots & a_{1, m}\\a_{2, 1} & a_{2, 2} & \ldots & a_{2, m}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n, 1} & a_{n, 2} & \ldots & a_{n, m}\end{bmatrix}$$
The first number $n$ denotes the number of rows in the matrix, and the second number $m$ denotes the number of columns in the matrix. Together, the numbers of rows and columns are the dimensions of a matrix. We may specify the dimensions of a matrix by saying it is an $n \times m$ matrix. A matrix is square if it has the same number of rows and columns. The values the matrix takes in its codomain are called the entries of the matrix. Entries are addressed by row (top to bottom) and then by column (left to right), so $a_{i,j}$ is the entry in row $i$ and column $j$. The plural form of matrix is matrices.
Two-Dimensional Cartesian Product
Just as the $n$-ary cartesian product of a set $S$ with itself $n$ times is the set of all $n$-tuples whose components are elements of $S,$ the two-dimensional cartesian product of $S$ with dimensions $n \times m$ is defined as the set of all matrices $A : \{1, \ldots, n\} \times \{1, \ldots, m\} \rightarrow S:$
$$S^{n \times m} = \{ A \mid A : \{1, \ldots, n\} \times \{1, \ldots, m\} \rightarrow S \}$$
We may then say that $A$ is an element of the set $S^{n \times m}$ and use the typical notation to express this:
$$A \in S^{n \times m}.$$
Putting It All Together
As an example, consider this $3 \times 4$ matrix, which we will name $A$:
$$A = \begin{bmatrix} 1 & 9 & 12 & -2 \\ -3 & 7 & 190 & 0 \\ 32 & 5 & -4 & 6 \end{bmatrix}$$
The entry in the first row and third column is $A_{1,3} = 12$, the entry in the third row and first column is $A_{3,1} = 32$, and the entry in the second row and third column is $A_{2,3} = 190$. Since all the entries of $A$ are integers, we may say that $A \in \mathbb{Z}^{3 \times 4}.$
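The view of a matrix as a function on pairs of indices can be sketched directly in Python. This is an illustrative sketch rather than a standard representation: storing $A$ as a `dict` keyed by 1-based `(row, column)` pairs, and the names `rows`, `n_rows`, and `n_cols`, are assumptions made for this example.

```python
# Rows of the example matrix A, written as nested lists for readability.
rows = [
    [1, 9, 12, -2],
    [-3, 7, 190, 0],
    [32, 5, -4, 6],
]

# A as a function on {1, 2, 3} x {1, 2, 3, 4}: a dict keyed by (row, column),
# using the same 1-based indices as the text.
A = {
    (i, j): entry
    for i, row in enumerate(rows, start=1)
    for j, entry in enumerate(row, start=1)
}

# Recover the dimensions from the largest row and column indices.
n_rows = max(i for i, _ in A)
n_cols = max(j for _, j in A)

print(A[1, 3])           # 12
print(A[3, 1])           # 32
print(A[2, 3])           # 190
print((n_rows, n_cols))  # (3, 4)

# Every entry is an integer, matching the claim that A is in Z^(3 x 4).
print(all(isinstance(entry, int) for entry in A.values()))  # True
```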
The Matrix of a Linear Transformation
The neat thing about matrices is that they can be used to express any linear transformation between finite-dimensional vector spaces. Consider a linear transformation $T \in \mathcal{L}(V, W)$, a basis $B_V = \{v_1, \ldots, v_n\}$ of $V$, and a basis $B_W = \{w_1, \ldots, w_m\}$ of $W$. The matrix of $T$ with respect to $B_V$ and $B_W$ is the matrix $A \in F^{m \times n}$ whose entries are defined, for each $i = 1, \ldots, n$, by
$$Tv_i = A_{1,i}w_1 + \ldots + A_{m,i}w_m.$$
In other words, the $i$th column of $A$ contains the coefficients of the linear combination of basis vectors in $B_W$ needed to form $Tv_i$. We use the notation $\mathcal{M}(T, B_V, B_W)$ to express the matrix of a linear transformation with respect to the two bases. However, if the bases are implied, which as we will soon see they usually are, we can simply write $\mathcal{M}(T)$. As a result, we often refer to a linear transformation and its matrix interchangeably.
Note: While a linear transformation takes vectors from the $n$-dimensional $V$ to the $m$-dimensional $W$, its matrix is not $n \times m$, but in fact $m \times n$.
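As a small worked example of this definition, consider the following setup. The transformation and bases here are chosen purely for illustration and are not used elsewhere in this section. Let $V = W = \mathbb{R}^2$, let $T(x, y) = [2x, x + y]$, and take the bases $B_V = \{v_1, v_2\}$ with $v_1 = [1, 1]$, $v_2 = [1, -1]$ and $B_W = \{w_1, w_2\}$ with $w_1 = [1, 0]$, $w_2 = [1, 1]$. Applying $T$ to each vector in $B_V$ and writing the result as a linear combination of the vectors in $B_W$ gives

$$\begin{aligned} Tv_1 &= T[1, 1] = [2, 2] = 0w_1 + 2w_2 \\ Tv_2 &= T[1, -1] = [2, 0] = 2w_1 + 0w_2 \end{aligned}$$

The coefficients for $Tv_1$ form the first column and the coefficients for $Tv_2$ form the second column:

$$\mathcal{M}(T, B_V, B_W) = \begin{bmatrix} 0 & 2 \\ 2 & 0 \end{bmatrix}$$

Choosing different bases for the same $T$ would generally produce a different matrix, which is why the bases are part of the notation.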
The Matrix of a Vector
Consider a vector space $V$, a basis $B_V = \{v_1, \ldots, v_n\}$ of $V$, and a vector $v \in V$ such that $v = c_1v_1 + \ldots + c_nv_n$. The matrix of $v$ with respect to $B_V$ is the $n$-tuple of coefficients in $F^{n}$, written as a column:
$$\mathcal{M}(v) = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}$$
In other words, the matrix of the vector is just the column of the coefficients of the linear combination of vectors in $B_V$ needed to represent $v$.
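For instance, take the following illustrative setup, which is an assumption made for this example only: $V = \mathbb{R}^2$ with the basis $B_V = \{v_1, v_2\}$, where $v_1 = [1, 1]$ and $v_2 = [1, -1]$, and the vector $v = [3, 1]$. We need $c_1$ and $c_2$ with $c_1v_1 + c_2v_2 = v$, that is, $c_1 + c_2 = 3$ and $c_1 - c_2 = 1$. Solving gives $c_1 = 2$ and $c_2 = 1$, and indeed

$$2[1, 1] + 1[1, -1] = [3, 1].$$

So, with respect to this basis,

$$\mathcal{M}(v) = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$

Note that the column differs from the coordinates $[3, 1]$ of $v$ itself, because the basis is not the standard one.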
Matrices of Linear Transforms in $\mathcal{L}(F^n, F^m)$ With The Standard Bases
Now that we can represent any linear transformation in a standardized way with a matrix, let's consider the most common example: a linear transformation from $F^n$ to $F^m$ using the standard basis for both the domain and the codomain. Recall that the standard basis for $F^n$ contains the vectors $e_1, \ldots, e_n$, where the $i$th coordinate of $e_i$ is $1$ and all other entries are $0$.
Let $T(x, y) = [x + 2y, 7x - 4y]$. To find the entries of $\mathcal{M}(T)$, we first calculate $Te_1$ and $Te_2$:
$$\begin{aligned} Te_1 &= T[1, 0] = [1 + 2(0), 7(1) - 4(0)] = [1, 7] \\ Te_2 &= T[0, 1] = [0 + 2(1), 7(0) - 4(1)] = [2, -4] \end{aligned}$$
We can then write $\mathcal{M}(T)$ as
$$\mathcal{M}(T) = \begin{bmatrix} 1 & 2 \\ 7 & -4 \end{bmatrix}$$
If the matrix looks suspiciously like a blocky, variable-free version of the linear transformation, that's because it is. This is a great feature of the standard basis - you can get the entries for the matrix of $T$ right from the coefficients in its original definition! Thus, for example, if we have some other transformation $H(x, y, z) = [4x, 3y - z]$, we can write out its matrix pretty easily:
$$\mathcal{M}(H) = \begin{bmatrix} 4 & 0 & 0 \\ 0 & 3 & -1 \end{bmatrix}$$
Likewise, let's represent $v=[2, 4, 9]$ as a matrix. Clearly $v = 2e_1 + 4e_2 + 9e_3$. Therefore, we can write out the matrix of $v$ as:
$$\mathcal{M}(v) = \begin{bmatrix} 2 \\ 4 \\ 9 \end{bmatrix}$$
Behold the glory of the standard basis. We can once again just write the matrix of a vector out as we see it.
Note: Because the matrix of a vector is really just the same symbols written out vertically, it is also just called a vector, as are one-column matrices in general. In fact, people will probably look at you funny if you insist on saying "matrix of a vector" rather than just "vector." As such, we will write $v$ in place of $\mathcal{M}(v)$ whenever dealing with vectors when the basis is either assumed or irrelevant.
Note: Unless stated otherwise, matrices of transformations in $\mathcal{L}(F^n, F^m)$ and of vectors in $F^n$ and $F^m$ are implicitly written with respect to the standard bases for both $F^n$ and $F^m$. We will also often define linear transformations by their matrices.
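The recipe used in this section (apply the transformation to each standard basis vector and use the results as the columns of the matrix) can be sketched in Python. This is a minimal illustration, not a definitive implementation: the helper names `standard_basis` and `matrix_of` are hypothetical, and vectors are modeled as plain lists.

```python
def standard_basis(n):
    """Return the standard basis e_1, ..., e_n of F^n as a list of lists."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]


def matrix_of(transform, n):
    """Matrix of transform : F^n -> F^m with respect to the standard bases.

    Column j holds the coordinates of transform(e_j), so the columns are
    computed first and then transposed into a list of rows.
    """
    columns = [transform(*e) for e in standard_basis(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


# The two transformations from the text.
def T(x, y):
    return [x + 2 * y, 7 * x - 4 * y]


def H(x, y, z):
    return [4 * x, 3 * y - z]


M_T = matrix_of(T, 2)
M_H = matrix_of(H, 3)

print(M_T)  # [[1, 2], [7, -4]]
print(M_H)  # [[4, 0, 0], [0, 3, -1]]
```

Each inner list is one row of the matrix, so the output matches $\mathcal{M}(T)$ and $\mathcal{M}(H)$ as written above. Note that $\mathcal{M}(H)$ has 2 rows and 3 columns even though $H$ maps $F^3$ to $F^2$.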
Problems
Write out the matrices of the following linear transformations, assuming the standard bases:
- $T_1(x,y) = [x + y, x - y]$
- $T_2(x, y, z) = [2x + z, -x - 2y - 3z, y + 7z, x]$
- $T_3(x, y, z, w) = [2x + 9w, y + 5z]$
Solutions
- $\mathcal{M}(T_1) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$
- $\mathcal{M}(T_2) = \begin{bmatrix} 2 & 0 & 1 \\ -1 & -2 & -3 \\ 0 & 1 & 7 \\ 1 & 0 & 0\end{bmatrix}$
- $\mathcal{M}(T_3) = \begin{bmatrix} 2 & 0 & 0 & 9 \\ 0 & 1 & 5 & 0 \end{bmatrix}$