Jensen's and Log-Sum Inequalities

Tags: Basics, EE 276

Convexity

We define a convex (concave up) function $f$ as one satisfying

$$f(\lambda x_1 + (1-\lambda) x_2) \;\leq\; \lambda f(x_1) + (1-\lambda) f(x_2) \quad \text{for all } x_1, x_2 \text{ and } \lambda \in [0, 1]$$

This should be intuitive, as the LHS is the function evaluated along the segment (the curve itself), and the RHS is the secant line between the two points.

When showing convexity, this is the definition we go to. For twice-differentiable functions, convexity is equivalent to having $f''(x) \geq 0$, but that's not the definition from first principles.
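
A quick way to internalize the definition is to spot-check it numerically. Below is a minimal sketch (my own illustration, not from the course) that samples random points and a random $\lambda$ and tests the secant-line inequality for $f(x) = e^x$, which is convex. The helper name is made up for this example.

```python
import math
import random

def is_convex_on_samples(f, trials=10_000, lo=-5.0, hi=5.0):
    """Illustrative helper (my own): spot-check
    f(l*x1 + (1-l)*x2) <= l*f(x1) + (1-l)*f(x2) on random samples."""
    for _ in range(trials):
        x1, x2 = random.uniform(lo, hi), random.uniform(lo, hi)
        lam = random.random()
        curve = f(lam * x1 + (1 - lam) * x2)      # point on the curve
        secant = lam * f(x1) + (1 - lam) * f(x2)  # point on the secant line
        if curve > secant + 1e-9:                 # tolerance for float error
            return False
    return True

print(is_convex_on_samples(math.exp))  # True: e^x is convex
print(is_convex_on_samples(math.sin))  # False (with high probability): sin is not convex
```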

Jensen’s Inequality

Moving to RVs

If $f$ is a convex function, Jensen's inequality asserts that

$$E[f(X)] \;\geq\; f(E[X])$$

The inequality flips if $f$ is concave (a common use is $f = \log$).

As a side note, if $f$ is strictly convex, equality is achieved if and only if $X$ doesn't change value.
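
A quick Monte Carlo sanity check, as a sketch (my own example, not from the notes): estimate both sides of Jensen's inequality for the convex $f(x) = x^2$ and the concave $f = \log$, and watch the inequality flip.

```python
import math
import random

# Illustrative sketch; the setup (Uniform(1, 4) samples) is my own choice.
random.seed(0)
xs = [random.uniform(1.0, 4.0) for _ in range(100_000)]  # X ~ Uniform(1, 4)
mean_x = sum(xs) / len(xs)

# Convex f(x) = x^2:  E[f(X)] >= f(E[X])
lhs = sum(x * x for x in xs) / len(xs)
print(lhs, ">=", mean_x ** 2)          # roughly 7.0 >= 6.25

# Concave f = log: the inequality flips, E[log X] <= log E[X]
lhs = sum(math.log(x) for x in xs) / len(xs)
print(lhs, "<=", math.log(mean_x))
```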

Jensen's with max, absolute value

So we know the following:

$|E[X]| \leq E[|X|]$
$E[\max(X, Y)] \geq \max(E[X], E[Y])$

This is pretty easy to remember: applying the convex function (here $|\cdot|$ or $\max$) early on, inside the expectation, boosts values and yields a higher result down the line.
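
A short numerical check of both facts, as a sketch (my own example; the two Gaussian distributions are arbitrary choices):

```python
import random

# Illustrative sketch; distributions and sample size are my own choices.
random.seed(1)
n = 100_000
xs = [random.gauss(-0.5, 2.0) for _ in range(n)]  # X ~ N(-0.5, 2^2)
ys = [random.gauss(0.3, 1.0) for _ in range(n)]   # Y ~ N(0.3, 1^2)

ex = sum(xs) / n
ey = sum(ys) / n

# |E[X]| <= E[|X|]
print(abs(ex), "<=", sum(abs(x) for x in xs) / n)

# E[max(X, Y)] >= max(E[X], E[Y])
print(sum(max(x, y) for x, y in zip(xs, ys)) / n, ">=", max(ex, ey))
```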

Strictly concave/convex

If a function is strictly concave or convex and $E[f(X)] = f(E[X])$, then we know that $X = E[X]$, meaning that $X$ is a constant value.
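
As a concrete illustration (my own example, not from the notes), take the strictly convex $f(x) = x^2$: the equality condition then recovers the familiar fact that zero variance means a constant random variable.

$$E[X^2] = (E[X])^2 \;\Longleftrightarrow\; \mathrm{Var}(X) = E[X^2] - (E[X])^2 = 0 \;\Longleftrightarrow\; X = E[X] \text{ almost surely}$$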

Log-Sum Inequality

The inequality

For non-negative numbers $a_1, \dots, a_n$ and $b_1, \dots, b_n$, we have

$$\sum_{i=1}^{n} a_i \log \frac{a_i}{b_i} \;\geq\; \left(\sum_{i=1}^{n} a_i\right) \log \frac{\sum_{i=1}^{n} a_i}{\sum_{i=1}^{n} b_i}$$

with equality when $a_i/b_i$ is constant. Essentially, this allows us to pull a sum of log terms together into a single log of sums. It has a similar energy to Jensen's inequality.
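
A small numerical sketch of both the inequality and its equality condition (my own example; the helper names and the specific vectors are made up for illustration):

```python
import math

def log_sum_lhs(a, b):
    """Illustrative helper: sum_i a_i * log(a_i / b_i)."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b))

def log_sum_rhs(a, b):
    """Illustrative helper: (sum a_i) * log(sum a_i / sum b_i)."""
    sa, sb = sum(a), sum(b)
    return sa * math.log(sa / sb)

a = [0.2, 1.5, 3.0]
b = [0.5, 0.5, 1.0]
print(log_sum_lhs(a, b), ">=", log_sum_rhs(a, b))

# Equality when a_i / b_i is constant (here a_i = 2 * b_i for every i).
a2 = [1.0, 1.0, 2.0]
b2 = [0.5, 0.5, 1.0]
print(log_sum_lhs(a2, b2), "==", log_sum_rhs(a2, b2))
```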

Gibbs Inequality

This one states that for any two probability distributions $p$ and $q$ on the same alphabet,

$$-\sum_i p_i \log p_i \;\leq\; -\sum_i p_i \log q_i$$

or in other words, your own entropy is never larger than the cross-entropy you get by using a different distribution. This has implications for optimal coding theory and some other topics. The proof is simple: subtracting the left-hand side from the right-hand side gives exactly the KL divergence $D(p \| q)$. We know that KL divergence is non-negative, which gives us the inequality.
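
A minimal numerical sketch (my own, using natural-log entropies and made-up distributions) of both the statement and the proof idea: the gap between the two sides is exactly the KL divergence.

```python
import math

def entropy(p):
    """Illustrative helper: H(p) = -sum p_i log p_i, in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Illustrative helper: -sum p_i log q_i, the cost of describing p using q."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]
q = [0.1, 0.6, 0.3]

h, ce = entropy(p), cross_entropy(p, q)
print(h, "<=", ce)   # Gibbs inequality
print(ce - h)        # the gap is exactly D(p || q) >= 0
```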