Inverses, Pseudoinverses
Inverses
Sufficient conditions (cheat sheet)
Here are some sufficient conditions for a square matrix to be invertible:
- eigenbasis with non-zero eigenvalues
- Positive definite + symmetric (symmetric guarantees eigenbasis, positive definite guarantees that all eigenvalues are positive)
- Trivial null space (only the zero vector maps to zero)
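These conditions are easy to check numerically. Here's a quick NumPy sketch (the matrix is an arbitrary example I picked; `B.T @ B + I` is just a convenient way to manufacture a symmetric positive definite matrix):

```python
import numpy as np

# Manufacture a symmetric positive definite matrix: B^T B is symmetric PSD,
# and adding I pushes every eigenvalue above zero.
B = np.array([[1.0, 2.0], [3.0, 4.0]])
A = B.T @ B + np.eye(2)

eigenvalues = np.linalg.eigvalsh(A)        # symmetric -> real eigenvalues
print(np.all(eigenvalues > 0))             # all eigenvalues positive
print(np.linalg.matrix_rank(A) == 2)       # trivial null space (full rank)
print(abs(np.linalg.det(A)) > 1e-12)       # nonzero determinant -> invertible
```

All three checks print `True` here, since the conditions coincide for this matrix.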
True inverse
Let’s consider a matrix $A$. If $\det(A) \neq 0$, then it is invertible. Define $A^{-1}$ as the inverse of the matrix. This maps anything in output land to one distinct element in input land.
If an inverse exists, then $AA^{-1} = A^{-1}A = I$.
- this only applies for square matrices; rectangular matrices can have left inverses or right inverses but not both
- "nonsingular" and "invertible" are the same
Properties
- $(AB)^{-1} = B^{-1}A^{-1}$
- $(A^T)^{-1} = (A^{-1})^T$
- the transpose property has deep implications in dual spaces
Singular Matrix
A square matrix won't have an inverse if you can find a non-trivial vector $v \neq 0$ such that $Av = 0$.
Why? Suppose that there exists such a $v$. Then for any $x$, we have $A(x + v) = Ax + Av = Ax$ (the last step is by the linearity of matrices/linear maps). Therefore, there exist at least two distinct inputs, $x$ and $x + v$, that map to the same output, so no map can take that output back to a unique input.
Geometrically, a singular matrix crushes a certain vector to zero, and it's impossible to recover.
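A quick NumPy sketch of this picture (the matrix is an arbitrary singular example whose second column is twice the first):

```python
import numpy as np

# A singular matrix: the second column is twice the first.
A = np.array([[1.0, 2.0], [2.0, 4.0]])
v = np.array([2.0, -1.0])     # a non-trivial vector in the null space

print(A @ v)                  # [0. 0.] -- v is crushed to zero
x = np.array([1.0, 1.0])
print(A @ x)                  # [3. 6.]
print(A @ (x + v))            # [3. 6.] -- two distinct inputs, same output
```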
Deriving inverse matrices
You can do Gauss-Jordan elimination: start by augmenting the matrix with the identity matrix, then perform row operations until you get the identity matrix on one side and the inverse matrix on the other.
- the proof that this is valid involves recognizing that every row operation is akin to left-multiplying by an elementary matrix. The composition of elementary matrices that turns $A$ into $I$ is exactly $A^{-1}$, so applying the same operations to $I$ produces $A^{-1}$
Or, you can solve a system of linear equations by filling $A^{-1}$ with unknown variables and setting up $AA^{-1} = I$.
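A minimal sketch of the Gauss-Jordan approach in NumPy (a teaching implementation with partial pivoting, not a production routine; `gauss_jordan_inverse` is a name I made up):

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by row-reducing the augmented matrix [A | I] into [I | A^-1].
    A teaching sketch with partial pivoting, not a robust routine."""
    n = len(A)
    M = np.hstack([np.array(A, dtype=float), np.eye(n)])
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))  # partial pivoting
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("matrix is singular")
        M[[col, pivot]] = M[[pivot, col]]              # swap pivot row up
        M[col] /= M[col, col]                          # scale pivot to 1
        for row in range(n):
            if row != col:
                M[row] -= M[row, col] * M[col]         # clear the column
    return M[:, n:]

A = np.array([[2.0, 1.0], [1.0, 1.0]])
print(gauss_jordan_inverse(A))        # matches np.linalg.inv(A)
```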
Rank and invertibility
A matrix is left-invertible if it has full column rank (every input maps to a unique output; think about what the columns mean)
A matrix is right-invertible if it has full row rank (corresponding to a column space that spans the entire output space, because full row rank means the rank equals the dimension of the codomain)
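A NumPy sketch of the full-column-rank case, using the standard left-inverse formula $(A^TA)^{-1}A^T$ (the tall matrix is an arbitrary example):

```python
import numpy as np

# A tall matrix with full column rank: left-invertible, but not right-invertible.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # 3x2, rank 2

A_left = np.linalg.inv(A.T @ A) @ A.T   # one valid left inverse: (A^T A)^-1 A^T
print(np.allclose(A_left @ A, np.eye(2)))   # True: acts as identity on inputs
print(np.allclose(A @ A_left, np.eye(3)))   # False: no right inverse exists
```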
Pseudoinverse Formulation
If you have $A$ as an $m \times n$ matrix, and there is some $b$ such that $Ax = b$, how do you find the solution $x$? Well, we can find an approximation of that solution.
Pseudoinverse
But now let’s consider a matrix $A$ such that $\text{null}(A) \neq \{0\}$ and perhaps $b \notin \text{range}(A)$.
Let’s figure out the first problem. We know that we can separate out the input space into $\text{null}(A)$ and $\text{null}(A)^\perp$. Anything in $\text{null}(A)$ is a lost cause. But can we potentially find an inverse that goes between $\text{null}(A)^\perp$ and $\text{range}(A)$? This is possible because the transformation between these two spaces is injective and surjective.
Let’s define $\tilde{A}$ as the transformation between $\text{null}(A)^\perp$ and $\text{range}(A)$. This is fully invertible. Note that these two spaces must have the same number of basis vectors, which means that if $A$ is square (with a non-trivial null space), there definitely is some stuff in $\text{range}(A)^\perp$. We’ll deal with that too.
Now, let’s define the pseudoinverse $A^+$ such that
- if $y \in \text{range}(A)$, then $A^+y = \tilde{A}^{-1}y$.
- If $y = y_r + y_\perp$ where $y_r \in \text{range}(A)$ and $y_\perp \in \text{range}(A)^\perp$, then $A^+y = \tilde{A}^{-1}y_r$. This is a very elaborate way of saying that we ignore the part of the output space that isn’t reachable by $A$.
So, let’s get intuition. If you have anything $y$ that you want to invert, then $A^+$ will project it down into a lower subspace that is within the range of $A$, and then compute the inverse. The projection is important. If you know that $A$ has rank $r$ in an output space of dimension $m$, then the projection goes into a subspace of dimension $r$.
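A small NumPy sketch of this intuition using `np.linalg.pinv` (the rank-1 matrix and the unreachable $b$ are arbitrary examples):

```python
import numpy as np

# Rank-1 matrix: its range is the line spanned by (1, 1).
A = np.array([[1.0, 1.0], [1.0, 1.0]])
b = np.array([3.0, 1.0])       # not in the range of A

x = np.linalg.pinv(A) @ b      # pseudoinverse "solution"
print(A @ x)                   # [2. 2.] -- the projection of b onto range(A)
```

Note that $Ax$ lands on $(2, 2)$, exactly the orthogonal projection of $b = (3, 1)$ onto the line spanned by $(1, 1)$: project first, then invert.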
Computing the pseudoinverse
Let’s first consult how we might compute the inverse of a matrix. If we can decompose $A = U\Sigma V^T$ (SVD) and it’s invertible, then the inverse is just $A^{-1} = V\Sigma^{-1}U^T$ because $V$ and $U$ are orthogonal and therefore easy to invert. The $\Sigma$ is diagonal with non-zero entries (otherwise $A$ isn’t invertible), so we just take the reciprocal of each diagonal element.
To compute the pseudoinverse, you do the exact same thing, except that there will be zeros in the $\Sigma$. Just leave those alone and invert the rest: $A^+ = V\Sigma^+U^T$.
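A sketch of this SVD recipe in NumPy (`pinv_via_svd` is a name I made up; `np.linalg.pinv` does the same thing with more careful tolerances):

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """Pseudoinverse via SVD: reciprocate the nonzero singular values,
    leave the zero ones alone."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    s_inv[s > tol] = 1.0 / s[s > tol]      # invert only the nonzero part of Sigma
    return Vt.T @ np.diag(s_inv) @ U.T     # A^+ = V Sigma^+ U^T

A = np.array([[1.0, 2.0], [2.0, 4.0], [0.0, 0.0]])   # rank 1
print(np.allclose(pinv_via_svd(A), np.linalg.pinv(A)))   # True
```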
Properties
Here are the main properties
- $AA^+A = A$ (this makes sense, if you think about it)
- $A^+AA^+ = A^+$ (this one is a little weird)
- $\text{range}(A^+) = \text{range}(A^T)$ (also a little weird, but interesting. It means that the row space is the same as the column space of the inverse, and it has the same rank as the projections)
- $(A^+)^+ = A$
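These properties are easy to check numerically; a quick sketch with an arbitrary rectangular example:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])   # arbitrary 2x3 example
Ap = np.linalg.pinv(A)

print(np.allclose(A @ Ap @ A, A))            # A A+ A = A
print(np.allclose(Ap @ A @ Ap, Ap))          # A+ A A+ = A+
print(np.allclose(np.linalg.pinv(Ap), A))    # (A+)+ = A
print(np.linalg.matrix_rank(Ap) == np.linalg.matrix_rank(A))   # same rank
```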
And here are some more properties that should make sense. If we let $P = AA^+$ and $Q = A^+A$, then we have
- $P^2 = P$ and $Q^2 = Q$ (think about why this makes sense)
- $PA = A$ (intuitively, the range of $A$ is the domain of $\tilde{A}^{-1}$, so $P$ is the identity operator for all things in the range of $A$, and a "compression" for all things outside of the range of $A$. In this manner, it does the same thing as $\tilde{A}\tilde{A}^{-1}$.)
- $P^T = P$ and $Q^T = Q$
- $A^+P = A^+$ and $QA^+ = A^+$
- if $A$ is invertible, then it is obvious that $A^+ = A^{-1}$, and so $P = Q = I$
Now, $AA^+$ is the same as $\tilde{A}\tilde{A}^{-1}$ composed with the projection onto $\text{range}(A)$, which means that it is a projection onto the range of $A$. This makes sense, considering what the pseudoinverse does with $y$.
$A^+A$ is just $\tilde{A}^{-1}\tilde{A}$ extended to the whole input space, which you can imagine as the projection onto the range of $A^T$ (i.e. the row space).
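A final NumPy sketch checking this projection picture (the matrix is an arbitrary full-column-rank example, so the row-space projection should be the identity on all of $\mathbb{R}^2$):

```python
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # 3x2, full column rank
Ap = np.linalg.pinv(A)

P = A @ Ap     # should project onto range(A), the column space
Q = Ap @ A     # should project onto range(A^T), the row space

print(np.allclose(P @ P, P), np.allclose(P, P.T))   # idempotent and symmetric
print(np.allclose(Q, np.eye(2)))    # full column rank: row space is all of R^2
b = np.array([1.0, 0.0, 0.0])
print(P @ b)                        # b projected onto the column space
```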