Matrix Math
Data Types
Instead of using the words number and array, matrix math uses these four specific terms: scalar, vector, matrix, tensor.
Formally, these terms are defined as follows.
scalar
a single number, or a 0-dimensional list of numbers: $a = 12$ or $b = [12]$
vector
a 1-dimensional list of numbers $A = \begin{bmatrix}1 & 2 & 3 & 4 \end{bmatrix}$ or $B = \begin{bmatrix}5 \\ 9 \\ 6 \end{bmatrix}$
matrix
a 2-dimensional list of numbers
$B = \begin{bmatrix}12 & 6 \\ 9 & 2 \\ 6 & 7 \end{bmatrix}$
tensor
a 3-or-more-dimensional list of numbers
In practice, software implements scalars and vectors as matrices. A vector is a matrix with a size of 1 in one dimension. A scalar is a matrix with a size of 1 in both dimensions.
A tensor can have any number of dimensions. The word rank gives the number of dimensions.
- A scalar is a tensor of rank 0.
- A vector is a tensor of rank 1.
- A matrix is a tensor of rank 2.
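A minimal sketch of these data types in numpy (the array contents below are arbitrary examples); each object's ndim attribute reports its rank:

```python
import numpy as np

a = np.array(12)                         # scalar: rank 0
v = np.array([1, 2, 3, 4])               # vector: rank 1
m = np.array([[12, 6], [9, 2], [6, 7]])  # matrix: rank 2
t = np.zeros((2, 3, 4))                  # tensor: rank 3

for name, obj in [('scalar', a), ('vector', v), ('matrix', m), ('tensor', t)]:
    print(name, 'rank', obj.ndim, 'shape', obj.shape)
```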
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships among variables.
- Simple linear regression: $y = ax + b$
- Multiple linear regression
- Multivariate linear regression
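As a minimal sketch of simple linear regression $y = ax + b$ in Python, scipy's linregress fits the slope and intercept (the data points below are made up for illustration):

```python
import numpy as np
from scipy import stats

# made-up sample data roughly following y = 2x + 1
x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8, 11.2])

result = stats.linregress(x, y)
print('slope a =', result.slope, 'intercept b =', result.intercept)
```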
Examples of tensors:
- dot product
- cross product
- linear map
Tensor math is called tensor calculus, or the absolute differential calculus.
A tensor can be thought of as a mathematical version of a relational database.
How do you regress a text string to a tensor?
Timeline
1890s. The word tensor first appears in its modern sense, introduced by Woldemar Voigt in 1898. (The word matrix was coined by James Joseph Sylvester in 1850, though arrays of numbers had been in use long before that.)
Element-wise Math
Same-size Matrices
Add, subtract, multiply and divide operations can be applied to two same-size matrices.
The operations are executed element-by-element among corresponding cells.
Matrix times Matrix
$$\begin{bmatrix}1 & 2\\ 3 & 4\end{bmatrix} * \begin{bmatrix}5 & 7\\ 6 & 4\end{bmatrix} = \begin{bmatrix}5 & 14\\ 18 & 16\end{bmatrix}$$
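A minimal numpy sketch of the element-wise product above; on numpy ndarrays the * operator is element-wise:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 7], [6, 4]])

print(A * B)              # element-wise: [[ 5 14] [18 16]]
print(np.multiply(A, B))  # same operation, explicit function form
```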
Vector plus Matrix
If the vector is the right size, it is treated as if it were replicated into a same-size matrix.
$$\begin{bmatrix}1 & 2\end{bmatrix} + \begin{bmatrix}5 & 7\\ 6 & 4\end{bmatrix} \Rightarrow \begin{bmatrix}1 & 2\\ 1 & 2\end{bmatrix} + \begin{bmatrix}5 & 7\\ 6 & 4\end{bmatrix} = \begin{bmatrix}6 & 9\\ 7 & 6\end{bmatrix}$$
Scalar times Matrix
A scalar is treated as if it were replicated to fill a same-size matrix.
$$2 * \begin{bmatrix}5 & 7\\ 6 & 4\end{bmatrix} \Rightarrow \begin{bmatrix}2 & 2\\ 2 & 2\end{bmatrix} * \begin{bmatrix}5 & 7\\ 6 & 4\end{bmatrix} = \begin{bmatrix}10 & 14\\ 12 & 8\end{bmatrix}$$
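numpy implements the replication described above as broadcasting; a sketch of the vector-plus-matrix and scalar-times-matrix examples:

```python
import numpy as np

M = np.array([[5, 7], [6, 4]])
v = np.array([1, 2])

print(v + M)   # vector broadcast across rows: [[6 9] [7 6]]
print(2 * M)   # scalar broadcast to every element: [[10 14] [12 8]]
```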
Properties
Because each element-wise operation takes place between two numbers, the element-wise matrix math operations have the same properties as their single-number counterparts.
Element-wise matrix operations are:
- commutative: $a+b = b+a$
- associative: $(a+b)+c = a+(b+c)$
- distributive: $a(b+c) = (a*b) + (a*c)$
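A quick check of these properties on arbitrary same-size matrices (the values are chosen only for illustration):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 7], [6, 4]])
C = np.array([[2, 1], [4, 2]])

print(np.array_equal(A + B, B + A))                # commutative
print(np.array_equal((A + B) + C, A + (B + C)))    # associative
print(np.array_equal(A * (B + C), A * B + A * C))  # distributive
```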
Matrix Multiplication
Matrix multiplication is completely different from element-wise math.
Each row of the first matrix is combined with each column of the second matrix: corresponding entries are multiplied and the products are summed, as in this example:
$$\begin{bmatrix}5&7\\6&4\end{bmatrix}*\begin{bmatrix}2&1\\4&2\end{bmatrix}=\begin{bmatrix}(5*2)+(7*4)&(5*1)+(7*2)\\(6*2)+(4*4)&(6*1)+(4*2)\end{bmatrix}=\begin{bmatrix}38&19\\28&14\end{bmatrix}$$
Another example:
$$\begin{bmatrix}5&7\\6&4\\3&2\end{bmatrix}*\begin{bmatrix}2&1&3\\4&2&1\end{bmatrix}=\begin{bmatrix}(5*2)+(7*4)&(5*1)+(7*2)&(5*3)+(7*1)\\(6*2)+(4*4)&(6*1)+(4*2)&(6*3)+(4*1)\\(3*2)+(2*4)&(3*1)+(2*2)&(3*3)+(2*1)\end{bmatrix}=\begin{bmatrix}38&19&22\\28&14&22\\14&7&11\end{bmatrix}$$
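A sketch of the same two products in numpy, where matrix multiplication is np.matmul, np.dot, or the @ operator:

```python
import numpy as np

A = np.array([[5, 7], [6, 4]])
B = np.array([[2, 1], [4, 2]])
print(A @ B)   # [[38 19] [28 14]]

C = np.array([[5, 7], [6, 4], [3, 2]])
D = np.array([[2, 1, 3], [4, 2, 1]])
print(C @ D)   # [[38 19 22] [28 14 22] [14 7 11]]
```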
Appropriately Sized
The number of columns in the first matrix must equal the number of rows in the second matrix.
The resulting matrix is the number of rows in the first matrix by the number of columns in the second matrix.
Example 1: $A*B$
Matrix A is 2×3, matrix B is 3×4. 3 == 3. OK. The resulting matrix will be 2×4.
Example 2: $C*D$
Matrix C is 4×2, matrix D is 3×2. 2 != 3. Error. These two matrices cannot be multiplied together.
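A sketch of the size rule in numpy; the shapes match the two examples above, and the contents are just placeholders:

```python
import numpy as np

A = np.ones((2, 3))
B = np.ones((3, 4))
print((A @ B).shape)   # (2, 4): inner dimensions 3 == 3, so the product exists

C = np.ones((4, 2))
D = np.ones((3, 2))
try:
    result = C @ D     # inner dimensions 2 != 3
except ValueError as e:
    print('error:', e)
```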
Properties
Matrix multiplication is:
- NOT commutative: $A*B \neq B*A$ in general
- associative: $(A*B)*C = A*(B*C)$
- distributive over addition: $A*(B+C) = (A*B) + (A*C)$
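A quick numerical check of these properties on arbitrary example matrices (exact integer cases, so array_equal suffices):

```python
import numpy as np

A = np.array([[5, 7], [6, 4]])
B = np.array([[2, 1], [4, 2]])
C = np.array([[1, 0], [3, 2]])

print(np.array_equal(A @ B, B @ A))                # False: not commutative
print(np.array_equal((A @ B) @ C, A @ (B @ C)))    # True: associative
print(np.array_equal(A @ (B + C), A @ B + A @ C))  # True: distributive over addition
```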
Matrix Transpose
Transposing a matrix swaps its rows and columns. The transpose of a matrix is denoted with a superscript T.
Create a 3×2 matrix.
$$A = \begin{bmatrix}a & b\\ c & d\\e & f\end{bmatrix}$$
Transpose it.
$$A^T = \begin{bmatrix}a & c & e\\ b & d & f\end{bmatrix}$$
Row and column indexes are reversed.
$$A_{ij} = A^T_{ji}$$
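In numpy the transpose is the .T attribute (or np.transpose); a sketch with a numeric 3×2 matrix:

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])  # 3x2
print(A.T)                              # 2x3: [[1 3 5] [2 4 6]]
print(A[2, 0] == A.T[0, 2])             # indexes reversed: True
```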
Identity Matrix
The identity matrix contains all zeros except for ones on the main diagonal, and it is always square.
Multiply a matrix by an appropriately sized identity matrix and you get the original matrix unchanged: $A*I = A$ and $I*A = A$.
$$I = \begin{bmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix}$$
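A sketch with numpy's np.eye, showing that multiplying by the identity leaves a matrix unchanged:

```python
import numpy as np

A = np.array([[3, 4, 8], [2, 16, 4], [4, 2, 9]])
I = np.eye(3, dtype=int)

print(np.array_equal(A @ I, A))   # True
print(np.array_equal(I @ A, A))   # True
```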
Matrix Inversion
Only square matrices can be inverted, and only when the matrix is non-singular (its determinant is nonzero).
For a matrix $A$, its inverse is denoted $A^{-1}$.
$$A=\begin{bmatrix}3&4&8\\2&16&4\\4&2&9\end{bmatrix}$$ $$A^{-1}=\begin{bmatrix}-1.7&0.25&1.4\\0.025&0.0625&-0.05\\0.75&-0.125&-0.5\end{bmatrix}$$
Multiplying a matrix by its inverse results in the identity matrix: $A*A^{-1}=I$
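A sketch using np.linalg.inv on the matrix above; the product with the inverse comes back to the identity only up to floating-point rounding, hence np.allclose:

```python
import numpy as np

A = np.array([[3, 4, 8], [2, 16, 4], [4, 2, 9]])
A_inv = np.linalg.inv(A)

print(A_inv)                              # matches the inverse shown above
print(np.allclose(A @ A_inv, np.eye(3)))  # True
```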
Operations
- scalar, vector, matrix, tensor
- add, multiply, dot
- Cartesian product?
- linear combination?
Product
In general, “product” is the result of a multiplication.
When we multiply sets, vectors, matrices, formulas, etc., there are variations in how the multiplication can be carried out.
In advanced topics, we see several concepts and several types of products, and often the terms are used inconsistently.
For now, we resolve all the different concepts down to two: dot product and Cartesian product.
Cartesian product is also loosely referred to as:
- direct product
- cross product
- element-wise multiplication
The unqualified term “multiplication” may refer to either.
Either type of multiplication can be taken for each of the following pairs of terms:
- product of two scalars
- product of scalar times vector
- product of two vectors
- product of scalar times matrix
- product of vector times matrix
- product of two matrices
- product of scalar times tensor
- product of vector times tensor
- product of matrix times tensor
- product of two tensors
Case Studies
1. Computer graphics. The sun rising over a landscape.
2. Data analysis. How to predict the size of a person's bank account.
Determinant
Dot product
Cross product
Eigenvalues
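A sketch of these four operations in numpy (the example values are arbitrary):

```python
import numpy as np

A = np.array([[3, 4], [2, 16]])
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])

print(np.linalg.det(A))         # determinant
print(np.dot(u, v))             # dot product: 32
print(np.cross(u, v))           # cross product: [-3 6 -3]
values, vectors = np.linalg.eig(A)
print(values)                   # eigenvalues of A
```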
Software Implementations
- MATLAB - commercial expensive
- Octave - MATLAB compatible, free open source GNU
- Python packages
  - numpy - numerical arrays
  - matplotlib - plot 2D and 3D graphs
  - numpy.matlib - matrix math
  - scipy - scientific computing, including linear regression
- TensorFlow - latest and greatest, placed into open source by Google in November 2015
operation | Octave | Python (numpy) | TensorFlow |
---|---|---|---|
element-wise multiply | A .* B | np.multiply(A,B) or A * B (ndarrays) | A * B |
matrix multiplication (dot product) | A * B | np.dot(A,B) or A @ B (A * B for np.matlib matrices) | tf.matmul(A,B) |
on size mismatch | error | broadcast when shapes are compatible, else error | broadcast when shapes are compatible, else error |
array index | 1-based | 0-based | 0-based |
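A sketch of the two multiplications side by side in numpy and TensorFlow (assuming TensorFlow 2.x with eager execution):

```python
import numpy as np
import tensorflow as tf

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 7.], [6., 4.]])

print(A * B)                # numpy element-wise
print(A @ B)                # numpy matrix multiplication

tA, tB = tf.constant(A), tf.constant(B)
print(tf.multiply(tA, tB))  # TensorFlow element-wise
print(tf.matmul(tA, tB))    # TensorFlow matrix multiplication
```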
To support this wiki page, we have built a matrix_math script in each of Python, Octave, and TensorFlow that executes the equations shown above. http://samantha.voyc.com/doc/script/
Hardware Implementation
GPU
Graphics processing drives a display screen, which is a matrix of pixels with one color value per pixel, and therefore is a natural fit for matrix math. A graphics processing unit (GPU) is a chip designed with built-in matrix math capabilities. Recently, AI software has begun to take advantage of the GPU's matrix math capabilities.
The largest manufacturer of GPUs is Nvidia with their GeForce line of products. Others include Intel, AMD, Qualcomm, Huawei Kirin, Google TPU, Wave Computing, and GraphCore, plus a slew of startups.
TPU
In May 2016, Google announced the tensor processing unit (TPU), a chip specially designed for matrix math using low 8-bit precision, and claimed a 10x improvement in performance per watt.
In May 2017, Google announced availability of clusters of TPUs in the Google Compute Engine, making this power available as a service.
ASIC
Application-specific integrated circuits (ASICs) - chips designed specifically for AI processing. Some companies in this space are using reduced precision for faster performance, copying Google's TPU approach.
Nvidia has adapted its GPU architecture in the Tesla P100, a chip designed specifically for AI.
References
- Compare MATLAB and numpy https://docs.scipy.org/doc/numpy-dev/user/numpy-for-matlab-users.html
- Matrix math on Wikipedia. https://en.wikipedia.org/wiki/Matrix_(mathematics)
- Nvidia on Wikipedia. https://en.wikipedia.org/wiki/nvidia
- TensorFlow on Wikipedia. https://en.wikipedia.org/wiki/TensorFlow
- GPU companies. https://ark-invest.com/research/gpu-tpu-nvidia
Documentation
- Online Equation Editor https://www.codecogs.com/latex/eqneditor.php generates LaTeX code.
- LaTeX - text markup syntax for nicely displayed equations.