Notations

Notation stuff

When dealing with random variables, we typically need to write something like $p(X = x)$, but at times we just use the shorthand $p(X)$. It's the same thing.

We use the term $Val(X)$ to denote the set of values that the RV $X$ can take on.

More sophisticated notation

Sometimes we use the $\Pr\{\cdot\}$ notation if using another $P$ would be confusing. This is common for CDFs, or when the likelihood is itself a random variable, as in the question:

“what is the likelihood that the likelihood of a sample is below a certain value?”
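As a rough illustration of that question, the sketch below estimates $\Pr\{p(X) \le t\}$ by Monte Carlo. It assumes $X \sim \mathcal{N}(0, 1)$, which is my choice for the example and not something fixed by these notes; `normal_pdf` and `prob_density_below` are hypothetical helper names.

```python
import math
import random


def normal_pdf(x, mean=0.0, std=1.0):
    """Density of N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def prob_density_below(threshold, n_samples=100_000, seed=0):
    """Monte Carlo estimate of Pr{ p(X) <= threshold } for X ~ N(0, 1) (assumed)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)          # draw a sample of X
        if normal_pdf(x) <= threshold:   # p(X) is itself a random variable
            hits += 1
    return hits / n_samples


# The outer Pr{} is over the randomness of X; the inner p is the density.
estimate = prob_density_below(normal_pdf(1.0))
```

With the threshold set to the density at $x = 1$, the event $p(X) \le t$ is the same as $|X| \ge 1$, so the estimate should land near the two-sided tail mass of a standard normal.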

Distributions vs. probabilities 🐧

A distribution has no "p" attached to it. For example, $y | x \sim \mathcal{N}(x, 1)$. In this case, $y | x$ is a distribution. However, $p(y | x)$ is a probability (for a continuous variable, a density value). You can find it by plugging $y$ into the PDF defined by the distribution of $y | x$; plugging $y$ into the CDF instead gives $\Pr\{Y \le y \mid x\}$.
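To make the distinction concrete, the sketch below treats $y | x \sim \mathcal{N}(x, 1)$ as the distribution and evaluates it at one particular $y$. It computes both the density value $p(y | x)$ and the cumulative value $\Pr\{Y \le y \mid x\}$ so the two can be compared. The numbers `x = 2.0` and `y = 2.5` and the helper names are assumptions made for the example.

```python
import math


def normal_pdf(y, mean, std=1.0):
    """Density of N(mean, std^2) at y: this plays the role of p(y | x)."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(y, mean, std=1.0):
    """Cumulative probability Pr{Y <= y} under N(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((y - mean) / (std * math.sqrt(2.0))))


x = 2.0   # the value we condition on (hypothetical)
y = 2.5   # the point at which we evaluate (hypothetical)

density = normal_pdf(y, mean=x)      # a single number, p(y | x)
cumulative = normal_cdf(y, mean=x)   # a single number, Pr{Y <= y | x}
```

The distribution is the whole object $\mathcal{N}(x, 1)$; `density` and `cumulative` are just numbers obtained by evaluating it.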

Prior and Posterior

The prior is what you believe before some observation. For example, $p(c = i)$. The posterior is what you believe after some observation, like $p(c = i | b)$.
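A minimal sketch of going from prior to posterior with Bayes' rule, assuming $c$ takes one of three discrete values and that the likelihoods $p(b | c = i)$ of the observation $b$ are known. All numbers here are made up for illustration, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, likelihood):
    """Bayes' rule: p(c = i | b) = p(b | c = i) * p(c = i) / p(b)."""
    joint = [p * l for p, l in zip(prior, likelihood)]  # p(b, c = i)
    evidence = sum(joint)                               # p(b)
    return [j / evidence for j in joint]


prior = [0.5, 0.3, 0.2]        # p(c = i), belief before observing b (made up)
likelihood = [0.1, 0.4, 0.8]   # p(b | c = i) for the observed b (made up)

post = posterior(prior, likelihood)  # p(c = i | b), belief after observing b
```

In this example the observation favors the third value of $c$, so its posterior probability ends up larger than its prior probability.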

Densities

Here's also a point of confusion that sounds stupid but happens all the time. The symbol $p$ is not one specific function; which function it denotes is determined by its arguments. So $p(x | y)$ isn't the same function as $p(a | b)$, etc. This is true even when the $p$ is specialized for some application. Probability is not a single function.

The density $p(s)$ means the likelihood of $s$, which ALSO means that if you were to select a value at random from the distribution, the likelihood of it being this particular $s$ is $p(s)$.

Remember that $p(s)$ is a value, as $s$ is a specific scalar. Therefore, $p(s | v)$ is well-defined, but NOT $p(s | V)$. The conditioning must always be on a deterministic value. On the other hand, $p(S | v)$ is totally fine; it's just another random variable.