Inner Product & Norms


Riesz representation theorem

Every linear functional can be represented as an inner product: for every linear functional $\phi$ there is a unique vector $u$ such that $\phi(v) = \langle v, u \rangle$ for all $v$. This is actually how we got to the idea of a linear functional, but formally, the linear functional comes before the inner product.

Inner products are actually derived from dual spaces.

Inner product spaces

Inner products can technically be anything, as long as they satisfy the five properties below. The dot product is one example.

  1. positivity: $\langle v, v \rangle \geq 0$
  1. definiteness: $\langle v, v \rangle = 0 \implies v = 0$
  1. Additivity: $\langle u + v, w\rangle = \langle u, w\rangle + \langle v, w\rangle$
    1. works for the second position too
  1. Homogeneity: $\langle \lambda u, w \rangle = \lambda \langle u, w\rangle$
    1. $\langle u, \lambda w\rangle = \bar{\lambda}\langle u, w\rangle$ (note the bar)
  1. Conjugate symmetry: $\langle u, v\rangle = \overline{\langle v, u \rangle}$
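
To make these axioms concrete, here is a minimal numpy sketch that checks each one for the standard complex dot product (the `inner` helper and the random test vectors are just illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def inner(u, v):
    # standard complex dot product <u, v> = sum_i u_i * conj(v_i)
    # (np.vdot conjugates its first argument, hence the swapped order)
    return np.vdot(v, u)

u, v, w = (rng.normal(size=3) + 1j * rng.normal(size=3) for _ in range(3))
lam = 2 + 3j

assert inner(u, u).real >= 0 and abs(inner(u, u).imag) < 1e-12        # positivity
assert np.isclose(inner(u + v, w), inner(u, w) + inner(v, w))         # additivity
assert np.isclose(inner(lam * u, w), lam * inner(u, w))               # homogeneity (first slot)
assert np.isclose(inner(u, lam * w), np.conj(lam) * inner(u, w))      # conjugate in the second slot
assert np.isclose(inner(u, v), np.conj(inner(v, u)))                  # conjugate symmetry
```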

Derivable properties (and identities): see the Derivable Norm / Inner Product Properties section below.

Inner products of matrices

The corresponding norm of a matrix is the Frobenius Norm, which means that the inner product of matrices is naturally defined as

$\langle A, B\rangle = tr(A^TB)$
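
A quick numerical sanity check (assuming numpy and real matrices; the random matrices are arbitrary): the trace form equals the elementwise sum of products.

```python
import numpy as np

rng = np.random.default_rng(1)
A, B = rng.normal(size=(4, 3)), rng.normal(size=(4, 3))

# tr(A^T B) is the same as summing the elementwise products A_ij * B_ij
assert np.isclose(np.trace(A.T @ B), np.sum(A * B))
```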

Norms

The norm of something is defined as $||v|| = \sqrt{\langle v, v \rangle}$. Two items $u, v$ are orthogonal if $\langle u, v\rangle = 0$. Therefore, a norm falls out of any inner product space (though not every norm arises from an inner product). Norms satisfy these properties:

  1. non-negativity: $f(x) \geq 0$
  1. definiteness: $f(x) = 0 \leftrightarrow x = 0$
  1. homogeneity: $f(tx) = |t|f(x)$
  1. triangle inequality: $f(x + y) \leq f(x) + f(y)$
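
A small numpy illustration of the norm-from-inner-product definition plus the homogeneity and triangle-inequality properties for the Euclidean norm (the vectors and scalar are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
x, y = rng.normal(size=5), rng.normal(size=5)
t = -3.7

assert np.isclose(np.linalg.norm(x), np.sqrt(x @ x))                    # ||x|| = sqrt(<x, x>)
assert np.isclose(np.linalg.norm(t * x), abs(t) * np.linalg.norm(x))    # homogeneity
assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y)   # triangle inequality
```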

Hölder norm (p-norm)
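
For $p \geq 1$, the $p$-norm of a vector is $||x||_p = \left(\sum_i |x_i|^p\right)^{1/p}$: $p = 1$ gives the sum of absolute values, $p = 2$ the Euclidean norm, and $p \to \infty$ gives $||x||_\infty = \max_i |x_i|$.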

Frobenius Norm (matrix)

The Frobenius norm is defined in terms of the Frobenius inner product $\langle A, B\rangle = tr(A^TB)$ above. The Frobenius norm is just $\sqrt{tr(A^TA)}$. By element, it is also $\sqrt{\sum_{i,j}A_{i,j}^2}$. This also means that $||A||_F = \sqrt{\langle A, A\rangle}$.

The Frobenius norm satisfies all of the norm properties above.

Moral of the story here: if you want matrix-level properties, use the trace definition. If you want element-level properties, use the direct sum-of-squares definition.
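
An illustrative numpy check (the random matrix is arbitrary) that the trace definition, the elementwise definition, and numpy's built-in Frobenius norm all agree:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 3))

trace_def = np.sqrt(np.trace(A.T @ A))
elementwise_def = np.sqrt(np.sum(A ** 2))
assert np.isclose(trace_def, elementwise_def)
assert np.isclose(trace_def, np.linalg.norm(A, "fro"))
```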

Operator Norm (matrix)

We define $||A||_{op}$ as the operator norm. The operator norm states that

$||A||_{op} = \sup_{x \neq 0} \frac{||Ax||}{||x||} = \sup_{||x|| = 1} ||Ax||$

For the Euclidean norm, this is the largest singular value of $A$ (equivalently, the square root of the largest eigenvalue of $A^TA$); for symmetric $A$ it coincides with the largest absolute eigenvalue.
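
An illustrative numpy check (the random matrix is arbitrary) that the operator 2-norm equals the largest singular value rather than, in general, the largest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(4, 4))

op_norm = np.linalg.norm(A, 2)                         # induced 2-norm
sigma_max = np.linalg.svd(A, compute_uv=False)[0]      # largest singular value
assert np.isclose(op_norm, sigma_max)
```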

Matrix p-norms (matrix)

Essentially, find the vector that grows the most under this transformation, using the $p$-norm (see the previous section) as the metric: $||A||_p = \sup_{x \neq 0} \frac{||Ax||_p}{||x||_p}$.
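
A sketch of this for $p = 1$ (assuming numpy; the random trial vectors are illustrative): the induced 1-norm has a closed form, the maximum absolute column sum, and random feasible vectors never exceed it, consistent with the sup definition.

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(4, 4))

induced_1 = np.linalg.norm(A, 1)                      # induced 1-norm of the matrix
assert np.isclose(induced_1, np.abs(A).sum(axis=0).max())

X = rng.normal(size=(4, 1000))                        # random trial vectors as columns
ratios = np.linalg.norm(A @ X, 1, axis=0) / np.linalg.norm(X, 1, axis=0)
assert ratios.max() <= induced_1 + 1e-12
```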

Spectral Norm of Matrix
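
The spectral norm is the operator norm induced by the Euclidean ($p = 2$) norm: $||A||_2 = \sigma_{\max}(A)$, the largest singular value of $A$. For symmetric $A$ this equals the largest absolute eigenvalue.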

Derivable Norm / Inner Product Properties

Pythagorean theorem

$||u + v||^2 = ||u||^2 + ||v||^2$ for any $u, v$ that are orthogonal. The proof is just an expansion of the norm into inner products and using inner product properties
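
An illustrative numpy check (assuming real vectors; `u` and `w` are arbitrary): build `v` orthogonal to `u` by projecting out the parallel component, then verify the identity.

```python
import numpy as np

rng = np.random.default_rng(6)
u, w = rng.normal(size=5), rng.normal(size=5)
v = w - (w @ u) / (u @ u) * u      # component of w orthogonal to u

assert np.isclose(u @ v, 0)
assert np.isclose(np.linalg.norm(u + v) ** 2,
                  np.linalg.norm(u) ** 2 + np.linalg.norm(v) ** 2)
```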

Cauchy Schwarz

$|\langle u, v \rangle| \leq ||u|| ||v||$

This has a ton of implications, because it generalizes to all inner products, including inner products defined by integrals. The proof involves decomposing $u$ into $\alpha v + v^\perp$ (a component parallel to $v$ plus one orthogonal to it), then expanding out the right-hand side in these new terms and applying the Pythagorean theorem.
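
A sketch of the integral case (assuming numpy; the functions, interval, and Riemann-sum discretization are illustrative): Cauchy Schwarz holds for the inner product $\langle f, g\rangle = \int f(t)g(t)\,dt$.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1000, endpoint=False)
dt = 1e-3
f, g = np.sin(3 * t), np.exp(-t)

inner_fg = np.sum(f * g) * dt          # Riemann-sum approximation of <f, g>
norm_f = np.sqrt(np.sum(f * f) * dt)
norm_g = np.sqrt(np.sum(g * g) * dt)
assert abs(inner_fg) <= norm_f * norm_g
```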

Hölder Inequality

Cauchy Schwarz is actually a specific case of the Hölder inequality, which applies as follows (where the subscripts are Hölder norms):

$|\langle u, v \rangle| \leq ||u||_p||v||_q, \quad \frac{1}{p} + \frac{1}{q} = 1$
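
A quick numpy check (the vectors and exponents are arbitrary) with $p = 3$ and the conjugate exponent $q = 3/2$; setting $p = q = 2$ recovers Cauchy Schwarz.

```python
import numpy as np

rng = np.random.default_rng(7)
u, v = rng.normal(size=6), rng.normal(size=6)
p, q = 3.0, 1.5                      # 1/p + 1/q = 1

lhs = abs(u @ v)
rhs = np.linalg.norm(u, p) * np.linalg.norm(v, q)
assert lhs <= rhs
```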

Triangle inequality

$||u+v|| \leq ||u|| + ||v||$

You can prove this by expanding $||u+v||^2$ with the inner product definition and applying the Cauchy Schwarz inequality.
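
Spelled out, the expansion is $||u+v||^2 = ||u||^2 + 2\,\mathrm{Re}\langle u, v\rangle + ||v||^2 \leq ||u||^2 + 2||u|| ||v|| + ||v||^2 = (||u|| + ||v||)^2$, and taking square roots gives the inequality.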

Parallelogram Equality

$||u+v||^2 + ||u-v||^2 = 2||u||^2 + 2||v||^2$

The proof is just manipulating a lot of inner products around
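
Concretely, $||u \pm v||^2 = ||u||^2 \pm 2\,\mathrm{Re}\langle u, v\rangle + ||v||^2$, so the cross terms cancel when the two expansions are added.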

Dual Norm

The dual norm to some norm is defined as

$||y||_* = \sup_x\{x^Ty : ||x||\leq 1\}$

Intuitively, we range over the unit ball of the original norm and see how large the inner product with $y$ can get.

The fact that the dual of the $p$-norm is the $q$-norm (with $\frac{1}{p} + \frac{1}{q} = 1$) is a result of Hölder's inequality.
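
A small numpy sketch (the vector and sample count are illustrative) of the $p = 1$ case: the dual of the 1-norm is the $\infty$-norm, with the sup attained at a signed standard basis vector.

```python
import numpy as np

rng = np.random.default_rng(8)
y = rng.normal(size=5)

# the maximizer: a signed basis vector at the largest-magnitude entry of y
k = np.argmax(np.abs(y))
x_star = np.zeros(5)
x_star[k] = np.sign(y[k])            # feasible, since ||x_star||_1 = 1
assert np.isclose(x_star @ y, np.max(np.abs(y)))

# random feasible points never do better than the closed form ||y||_inf
X = rng.normal(size=(10000, 5))
X /= np.linalg.norm(X, 1, axis=1, keepdims=True)
assert (X @ y).max() <= np.max(np.abs(y)) + 1e-12
```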