Inverses, Pseudoinverses


Inverses

Sufficient conditions (cheat sheet)

Here are some sufficient conditions.

True inverse

Let’s consider a square matrix $A$. If $N(A) = \{0\}$ (the null space is trivial), then it is invertible. Define $A^{-1}$ as the inverse of the matrix. This maps anything in output land back to one distinct element in input land.

If an inverse exists, then $A^{-1}A = AA^{-1} = I$.

Properties

Singular Matrix

A square matrix won't have an inverse if you can find a non-trivial vector $x$ such that $Ax = 0$.

Why? Suppose that there exists $A^{-1}$. Then $A^{-1}Ax = A^{-1}0 = 0$ (the last equality holds because linear maps send $0$ to $0$). But $x \neq 0$, so $A^{-1}Ax = 0 \neq x$, which contradicts $A^{-1}A = I$.

Geometrically, a singular matrix crushes some nonzero vector to zero, and that information is impossible to recover.
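A quick numpy sketch of this (the rank-1 matrix here is a made-up example): a null-space vector gets crushed to zero, and `np.linalg.inv` refuses to invert.

```python
import numpy as np

# A singular matrix: the second row is 2x the first, so it has rank 1
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# A non-trivial vector in the null space: A @ x = 0
x = np.array([2.0, -1.0])
print(A @ x)  # crushed to the zero vector

# Inverting a singular matrix raises an error
try:
    np.linalg.inv(A)
except np.linalg.LinAlgError as e:
    print("not invertible:", e)
```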

Deriving inverse matrices

You can do Gauss-Jordan elimination: start by augmenting the matrix with the identity matrix, then perform row operations until you get the identity matrix on one side and the inverse matrix on the other.

Or, you can solve a system of linear equations: fill $A^{-1}$ with unknown variables, set up $AA^{-1} = I$, and solve column by column.
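The Gauss-Jordan procedure above can be sketched in numpy. The function name and the partial-pivoting detail are my own additions, not from the notes:

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert A by augmenting with I and row-reducing until the left half is I."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])  # augment A with the identity
    for col in range(n):
        # partial pivoting: swap in the row with the largest entry in this column
        pivot = col + np.argmax(np.abs(M[col:, col]))
        M[[col, pivot]] = M[[pivot, col]]
        M[col] /= M[col, col]                    # scale so the pivot becomes 1
        for row in range(n):
            if row != col:
                M[row] -= M[row, col] * M[col]   # zero out the column elsewhere
    return M[:, n:]                              # right half is now the inverse

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
print(gauss_jordan_inverse(A))
print(np.linalg.inv(A))  # should match
```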

Rank and invertibility

A matrix is left-invertible if it has full column rank (every input maps to a unique output; think about what the columns mean).

A matrix is right-invertible if it has full row rank (corresponding to a column space that spans the entire output, because full row rank means the rank equals the dimension of the codomain).
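A numpy sketch of both cases, using the standard closed forms $(A^TA)^{-1}A^T$ and $B^T(BB^T)^{-1}$ for one particular left/right inverse (the random matrices are assumed to have full rank, which the asserts check):

```python
import numpy as np

rng = np.random.default_rng(0)

# Tall matrix with full column rank: left-invertible
A = rng.standard_normal((4, 2))
assert np.linalg.matrix_rank(A) == 2
L = np.linalg.inv(A.T @ A) @ A.T       # one valid left inverse
print(np.allclose(L @ A, np.eye(2)))   # L A = I (but A L is not I)

# Wide matrix with full row rank: right-invertible
B = rng.standard_normal((2, 4))
assert np.linalg.matrix_rank(B) == 2
R = B.T @ np.linalg.inv(B @ B.T)       # one valid right inverse
print(np.allclose(B @ R, np.eye(2)))   # B R = I
```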

Pseudoinverse Formulation

If you have $A$ as an $m \times n$ matrix and you want to solve $Ax = b$, how do you find the solution $x$? If $A$ isn't invertible, an exact solution may not exist, but we can find an approximation of that solution.

Pseudoinverse

But now let’s consider a matrix $A$ such that $N(A) \neq \{0\}$ and perhaps $R(A)^\perp \neq \{0\}$.

Let’s figure out the first problem. We know that we can separate the input space into $N(A)$ and $N(A)^\perp$. Anything in $N(A)$ is a lost cause. But can we potentially find an inverse that goes between $R(A)$ and $N(A)^\perp$? This is possible because the transformation between these two spaces is injective and surjective.

Let’s define $T$ as the transformation from $N(A)^\perp$ to $R(A)$ obtained by restricting $A$. This is fully invertible. Note that these two spaces must have the same number of basis vectors, which means that if $A$ is square with a non-trivial null space, there definitely is some stuff in $R(A)^\perp$. We’ll deal with that too.

Now, let’s define the pseudoinverse $A^+$ such that $A^+ w = T^{-1}\left(\operatorname{proj}_{R(A)} w\right)$: project $w$ onto the range of $A$, then apply the inverse of $T$.

So, let’s get intuition. If you have anything that you want to invert, $A^+$ will project it down into a lower subspace that is within the range of $A$, and then compute the inverse. The projection is important. If you know that $A$ has rank $n$ and the output space has dimension $d$, then the projection goes into a subspace of dimension $n$.
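A small numpy illustration with a made-up rank-1 matrix: $A^+$ first projects $b$ onto the range of $A$, then inverts on that subspace.

```python
import numpy as np

# A rank-1 map on R^2: its range is the line spanned by (1, 1)
A = np.array([[1.0, 0.0],
              [1.0, 0.0]])
A_plus = np.linalg.pinv(A)

b = np.array([3.0, 1.0])   # b is NOT in the range of A
x = A_plus @ b             # pseudoinverse "solution"
print(A @ x)               # [2. 2.] -- the projection of b onto the range of A
```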

Computing the pseudoinverse

Let’s first consult how we might compute the inverse of a matrix. If we can decompose $A = U\Sigma V^T$ (SVD) and it’s invertible, then the inverse is just $A^{-1} = V\Sigma^{-1}U^T$, because $V$ and $U$ are orthogonal and therefore easy to invert. The $\Sigma$ is diagonal with non-zero entries (otherwise $A$ isn’t invertible), so we just take the inverse of each diagonal element.

To compute the pseudoinverse, you do the exact same thing, except that there will be zeros in $\Sigma$. Just leave those alone and invert the rest, giving $\Sigma^+$; then $A^+ = V\Sigma^+ U^T$.
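A numpy sketch of this recipe on a deliberately rank-deficient matrix. The tolerance `1e-10` is an arbitrary choice of mine for deciding which singular values count as zero:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
A[:, 2] = A[:, 0] + A[:, 1]   # force rank 2, so Sigma has a zero singular value

U, s, Vt = np.linalg.svd(A)
# invert the nonzero singular values; leave the (numerically) zero ones alone
s_plus = np.array([1.0 / x if x > 1e-10 else 0.0 for x in s])
Sigma_plus = np.zeros((3, 4))          # transpose shape of Sigma
np.fill_diagonal(Sigma_plus, s_plus)

A_plus = Vt.T @ Sigma_plus @ U.T       # A^+ = V Sigma^+ U^T
print(np.allclose(A_plus, np.linalg.pinv(A)))  # matches numpy's pinv
```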

Properties

Here are the main properties

  1. $(A^+)^+ = A$ (this makes sense, if you think about it)
  1. $(A^TA)^+ = A^+(A^T)^+$ (this one is a little weird)
  1. $R(A^+) = R(A^T) = R(A^+A) = R(A^TA)$ (also a little weird, but interesting. It means that the row space of $A$ is the same as the column space of the pseudoinverse, and it has the same rank as the projections)
  1. $(A^T)^+ = (A^+)^T$

And here are some more properties that should make sense. If we let $G = A^+$, then we have

  1. $AGA = A$ (think about why this makes sense)
  1. $GAG = G$ (intuitively, the range of $G$ is the domain of $A$, so $AG$ is the identity operator for everything in the range of $A$, and a "compression" for everything outside of the range of $A$. In this manner, $GAG$ does the same thing as $G$.)
  1. $(AG)^T = AG$
  1. $(GA)^T = GA$
  1. if $A$ is invertible, then it is obvious that $A^+ = A^{-1}$
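The first four of these are exactly the Moore-Penrose conditions, and they are easy to check numerically (the test matrix here is an arbitrary random one):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 3))
G = np.linalg.pinv(A)

print(np.allclose(A @ G @ A, A))      # AGA = A
print(np.allclose(G @ A @ G, G))      # GAG = G
print(np.allclose((A @ G).T, A @ G))  # AG is symmetric
print(np.allclose((G @ A).T, G @ A))  # GA is symmetric
```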

Now, $AA^+$ is the same as $UU^T$ (using the compact SVD), which means that it is a projection onto the range of $A$. This makes sense, considering what the pseudoinverse does with $w = w_1 + w_2$ (the component in $R(A)$ plus the component in $R(A)^\perp$).

$A^+A$ is just $VV^T$, which you can imagine as the projection onto the range of $A^T$ (i.e. the row space).
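Both projection identities can be checked in numpy with the compact SVD (`full_matrices=False`, so $U$ and $V$ keep only the columns matching the nonzero singular values):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 2))       # full column rank, so A^+ A = I
A_plus = np.linalg.pinv(A)

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # U is 4x2, Vt is 2x2
print(np.allclose(A @ A_plus, U @ U.T))    # projection onto R(A)
print(np.allclose(A_plus @ A, Vt.T @ Vt))  # projection onto R(A^T)
```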