Basic Probability
Formalism & the basics
An outcome space $\Omega$ contains all possible outcomes. For a six-sided die, $\Omega$ is the set of six numbers $\{1, 2, 3, 4, 5, 6\}$. The elements $\omega \in \Omega$ are known as outcomes. We define the event space $\mathcal{F}$ to be the power set of $\Omega$, i.e. the set of all its subsets. An event $A \in \mathcal{F}$ is therefore a subset of the outcome space, which can be a single outcome or a set of outcomes. We can union events ($A \cup B$), and we can intersect events ($A \cap B$).
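The probability measure is a function $P : \mathcal{F} \to \mathbb{R}$ that satisfies these conditions
- $P(A) \geq 0$ for all $A \in \mathcal{F}$
- $P(\Omega) = 1$
- If $A_1, A_2, \dots$ are disjoint (mutually exclusive), then $P\left(\bigcup_i A_i\right) = \sum_i P(A_i)$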
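From these axioms of probability, we derive the following properties
- If $A \subseteq B$, then $P(A) \leq P(B)$ (duh)
- $P(A \cap B) \leq \min(P(A), P(B))$ (another duh)
- $P(A \cup B) \leq P(A) + P(B)$
- $P(\Omega \setminus A) = 1 - P(A)$ (law of complements)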
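To make the axioms concrete, here is a minimal sketch that builds the event space of a single die and checks the three axioms plus a few standard consequences (monotonicity, the intersection bound, the union bound, and the law of complements). The fair die and the helper names `prob` and `power_set` are assumptions made for illustration, not part of the original notes.

```python
from fractions import Fraction
from itertools import chain, combinations

# Outcome space for one six-sided die (fairness is an assumption).
OMEGA = frozenset(range(1, 7))

def prob(event):
    """Uniform probability measure: |event| / |OMEGA|."""
    return Fraction(len(event), len(OMEGA))

def power_set(s):
    """All subsets of s, i.e. the event space."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(c) for c in subsets]

events = power_set(OMEGA)  # 2**6 = 64 events

# Axioms: non-negativity, normalization, additivity on disjoint events.
assert all(prob(a) >= 0 for a in events)
assert prob(OMEGA) == 1
even, one = frozenset({2, 4, 6}), frozenset({1})
assert prob(even | one) == prob(even) + prob(one)

# Derived properties, checked over every pair of events.
for a in events:
    assert prob(OMEGA - a) == 1 - prob(a)         # law of complements
    for b in events:
        if a <= b:                                # a is a subset of b
            assert prob(a) <= prob(b)             # monotonicity
        assert prob(a & b) <= min(prob(a), prob(b))
        assert prob(a | b) <= prob(a) + prob(b)   # union bound
```

Using `Fraction` keeps every comparison exact, so the equalities are not affected by floating-point rounding.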
Two views of probability
The frequentist views probability as the long-run frequency of events. This works well when an experiment can be repeated many times, but there are cases, like weather predictions, where there will only be one outcome.
In this case, the better interpretation is this: probability is a subjective degree of belief. If $P(A)$ is close to $1$, then we are very sure that $A$ will happen. If $P(A)$ is only a little above $0.5$, then we are only somewhat sure.
Union and disjoints
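Unioning events touches on the law of summations

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

If $A \cap B = \emptyset$, we call these two events mutually exclusive, and the formula reduces to $P(A \cup B) = P(A) + P(B)$.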
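A quick numeric check of the union formula, again on a fair die (an assumption for illustration). The event names `even`, `at_least_four`, and `odd` are hypothetical.

```python
from fractions import Fraction

OMEGA = set(range(1, 7))  # fair die, assumed for illustration

def prob(event):
    return Fraction(len(event), len(OMEGA))

even = {2, 4, 6}
at_least_four = {4, 5, 6}
odd = {1, 3, 5}

# Overlapping events: subtract the intersection so it is not counted twice.
lhs = prob(even | at_least_four)
rhs = prob(even) + prob(at_least_four) - prob(even & at_least_four)
print(lhs, rhs)  # 2/3 2/3

# Mutually exclusive events: the intersection is empty, so probabilities add.
assert even & odd == set()
print(prob(even | odd) == prob(even) + prob(odd))  # True
```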
Intersection and independence
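Intersecting events touches on the chain rule

$$P(A \cap B) = P(A \mid B)\,P(B)$$

We define $A$ to be independent of $B$ if these equivalent statements are true
- $P(A \cap B) = P(A)\,P(B)$
- $P(A \mid B) = P(A)$ (whenever $P(B) > 0$)

We can generalize the idea of two intersections into the chain rule, which states that

$$P(A_1 \cap A_2 \cap \dots \cap A_n) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 \cap A_2) \cdots P(A_n \mid A_1 \cap \dots \cap A_{n-1})$$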
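The two definitions of independence and the three-event chain rule can be checked by brute-force enumeration. This sketch assumes two fair dice; the events and the helper names `cond` and `independent` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

OMEGA = set(product(range(1, 7), repeat=2))  # two fair dice, assumed

def prob(event):
    return Fraction(len(event), len(OMEGA))

def cond(a, b):
    """P(a | b), defined only when P(b) > 0."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """Product form of independence: P(a and b) == P(a) P(b)."""
    return prob(a & b) == prob(a) * prob(b)

first_even = {w for w in OMEGA if w[0] % 2 == 0}
sum_is_7 = {w for w in OMEGA if sum(w) == 7}
sum_is_8 = {w for w in OMEGA if sum(w) == 8}

print(independent(first_even, sum_is_7))  # True
print(independent(first_even, sum_is_8))  # False

# The conditional form agrees with the product form.
assert cond(first_even, sum_is_7) == prob(first_even)

# Chain rule for three events: P(A, B, C) = P(A) P(B | A) P(C | A, B)
a = first_even
b = {w for w in OMEGA if w[1] >= 3}
c = {w for w in OMEGA if sum(w) >= 8}
assert prob(a & b & c) == prob(a) * cond(b, a) * cond(c, a & b)
```

A sum of 7 is equally likely whatever the first die shows, which is why it is independent of `first_even`. A sum of 8 rules out a first roll of 1, so it is not.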
Conditional distributions
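We can use the standard formula for conditional probability to get, in the discrete case,

$$p_{Y \mid X}(y \mid x) = \frac{p_{XY}(x, y)}{p_X(x)}$$

and in the continuous case it’s just a ratio of density functions

$$f_{Y \mid X}(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)}$$

Both are defined wherever the denominator is nonzero. Bayes’ rule applies to both discrete PMFs and continuous PDFs, as do the laws of independence.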
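As a rough sketch of both cases: the discrete part below builds a conditional PMF from a small joint PMF and checks Bayes' rule on it, and the continuous part checks that a conditional density integrates to 1. The joint table, the density $f(x, y) = x + y$ on the unit square, and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction as F

# Hypothetical joint PMF p(x, y) over X in {0, 1} and Y in {"a", "b"}.
joint = {
    (0, "a"): F(1, 10), (0, "b"): F(3, 10),
    (1, "a"): F(4, 10), (1, "b"): F(2, 10),
}

def p_x(x):
    return sum(p for (xi, _), p in joint.items() if xi == x)

def p_y(y):
    return sum(p for (_, yi), p in joint.items() if yi == y)

def p_y_given_x(y, x):
    return joint[(x, y)] / p_x(x)

def p_x_given_y(x, y):
    return joint[(x, y)] / p_y(y)

print(p_y_given_x("a", 1))  # 2/3

# Bayes' rule: p(x | y) = p(y | x) p(x) / p(y)
assert p_x_given_y(1, "a") == p_y_given_x("a", 1) * p_x(1) / p_y("a")

# Continuous analogue with the assumed density f(x, y) = x + y on the unit
# square, whose marginal is f_X(x) = x + 1/2.
def f_y_given_x(y, x):
    return (x + y) / (x + 0.5)

# Midpoint-rule integral of the conditional density over y in [0, 1].
n = 1000
area = sum(f_y_given_x((k + 0.5) / n, 0.3) for k in range(n)) / n
print(abs(area - 1.0) < 1e-9)  # True
```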
Tabular Distributions
Writing distributions as charts
- Joint distribution: the whole table sums to 1
- Conditional distribution: each column or each row sums to 1, depending on which variable is conditioned on
- Computing a conditional from a joint: normalize across a column or row
It’s tricky keeping track of what is what sometimes! Good labeling is always key.
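The three bullets above can be sketched on a tiny two-variable table. The weather and traffic labels and the numbers are invented for illustration, and the labels are kept next to the data precisely because it is easy to lose track of which axis is which.

```python
from fractions import Fraction as F

# Hypothetical joint table P(Weather, Traffic).
# Rows are weather values, columns are traffic values.
rows = ["sun", "rain"]
cols = ["light", "heavy"]
joint = [
    [F(4, 10), F(2, 10)],  # sun
    [F(1, 10), F(3, 10)],  # rain
]

# Joint distribution: the whole table sums to 1.
assert sum(sum(r) for r in joint) == 1

# P(Traffic | Weather): normalize each row, so every row sums to 1.
traffic_given_weather = [[p / sum(r) for p in r] for r in joint]
assert all(sum(r) == 1 for r in traffic_given_weather)

# P(Weather | Traffic): normalize each column, so every column sums to 1.
col_sums = [sum(r[j] for r in joint) for j in range(len(cols))]
weather_given_traffic = [
    [r[j] / col_sums[j] for j in range(len(cols))] for r in joint
]

i, j = rows.index("rain"), cols.index("heavy")
print(traffic_given_weather[i][j])  # P(heavy | rain) = 3/4
print(weather_given_traffic[i][j])  # P(rain | heavy) = 3/5
```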
Tabular computation
From a table, you can compute things like $P(X = x \mid Y = y)$. To remove a variable, you have to marginalize it out by summing the joint probability over all of its values. To condition on a variable, you isolate all the rows that match the condition, sum their probabilities to get a normalizing factor, and then divide by it to compute something like $P(X = x \mid Y = y)$. As another note: you can’t compute $P(X \mid Y)$ as a function, because the table is just for look-up. Tabular computation requires you to plug in values for the variables.
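Here is a small sketch of marginalizing and conditioning on a three-variable table stored as one row per assignment. The variables `X`, `Y`, `Z`, the numbers, and the function names are hypothetical. Note that both functions take concrete values, which mirrors the point that the table only supports look-ups.

```python
from fractions import Fraction as F

# Hypothetical joint table over (X, Y, Z), one row per assignment.
table = {
    (0, 0, 0): F(1, 10), (0, 0, 1): F(1, 10),
    (0, 1, 0): F(2, 10), (0, 1, 1): F(1, 10),
    (1, 0, 0): F(1, 10), (1, 0, 1): F(2, 10),
    (1, 1, 0): F(1, 10), (1, 1, 1): F(1, 10),
}
assert sum(table.values()) == 1

def marginal_xy(x, y):
    """P(X=x, Y=y): sum Z out of the joint."""
    return sum(p for (xi, yi, _), p in table.items() if (xi, yi) == (x, y))

def x_given_z(x, z):
    """P(X=x | Z=z): keep the rows with Z=z, then normalize."""
    rows = {k: p for k, p in table.items() if k[2] == z}
    normalizer = sum(rows.values())  # this is P(Z=z)
    return sum(p for k, p in rows.items() if k[0] == x) / normalizer

print(marginal_xy(0, 1))  # 3/10
print(x_given_z(1, 1))    # 3/5
```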
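Remember that for independence, $P(X = x, Y = y) = P(X = x)\,P(Y = y)$ must hold for every $x$ and $y$. However, if $P(X = x, Y = y) = 0$ for some entry, this does NOT show dependence unless $P(X = x)\,P(Y = y) \neq 0$.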
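One way to read this caveat is that a zero entry in the joint table only signals dependence when both of the matching marginals are positive. The sketch below works under that reading, with two made-up tables and the hypothetical helpers `marginals` and `is_independent`.

```python
from fractions import Fraction as F

def marginals(joint):
    """Return the marginal tables P(x) and P(y) of a two-variable joint."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return px, py

def is_independent(joint):
    """True when P(x, y) == P(x) P(y) for every cell of the table."""
    px, py = marginals(joint)
    return all(joint.get((x, y), 0) == px[x] * py[y] for x in px for y in py)

# Zero cells, but the whole row X=1 has probability 0, so the product of
# marginals is 0 there as well and independence still holds.
zero_row = {(0, 0): F(1, 4), (0, 1): F(3, 4), (1, 0): F(0), (1, 1): F(0)}

# A zero cell while both of its marginals are positive: the product of
# marginals is nonzero, so the table cannot be independent.
zero_cell = {(0, 0): F(0), (0, 1): F(1, 2), (1, 0): F(1, 4), (1, 1): F(1, 4)}

print(is_independent(zero_row))   # True
print(is_independent(zero_cell))  # False
```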