<Math: Probability and Statistics>Probability basics
I. Elements of probabilistic models
1. Every probabilistic model involves an underlying process, called the experiment. ( Example. Flip two coins )
2. The experiment produces exactly one out of several possible outcomes. ( Example. four outcomes: {𝐻𝐻, 𝐻𝑇, 𝑇𝐻, 𝑇𝑇} )
3. The set of all possible outcomes is the sample space. ( Example. Ω = {𝐻𝐻, 𝐻𝑇, 𝑇𝐻, 𝑇𝑇} )
4. An event is a subset of the sample space. ( Example. 𝐴 = {𝐻𝐻, 𝑇𝑇}, the event that the two coins show the same side. )
5. The probability law assigns to each event 𝐴 a number 𝑃(𝐴) ≥ 0 that encodes our knowledge or belief about the likelihood of 𝐴.
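The two-coin example above can be sketched in Python as follows. This is only an illustration: it assumes both coins are fair (the notes do not say so), and the names omega, law, and prob are chosen here for convenience.

```python
from fractions import Fraction
from itertools import product

# Sample space for flipping two coins: every ordered pair of H/T.
omega = ["".join(p) for p in product("HT", repeat=2)]

# Assumption (not stated in the notes): both coins are fair,
# so each of the four outcomes has probability 1/4.
law = {outcome: Fraction(1, len(omega)) for outcome in omega}

# Event A: the two coins show the same side.
A = {o for o in omega if o[0] == o[1]}

def prob(event, law):
    """Probability of an event = sum of the probabilities of its outcomes."""
    return sum((law[o] for o in event), Fraction(0))

print(omega)         # ['HH', 'HT', 'TH', 'TT']
print(prob(A, law))  # 1/2
```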
II. Probability Axioms
1. (Non-negativity) 𝑃(𝐴) ≥ 0, for every event 𝐴.
2. (Additivity) For any two disjoint events 𝐴 and 𝐵, 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵). In general, if 𝐴1, 𝐴2, … are disjoint events, then 𝑃(𝐴1 ∪ 𝐴2 ∪ ⋯) = 𝑃(𝐴1) + 𝑃(𝐴2) + ⋯.
3. (Normalization) 𝑃(Ω) = 1.
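As a quick sanity check, the three axioms can be tested on a small finite model. The loaded-die probabilities below are hypothetical values picked for illustration, and the events A and B are arbitrary disjoint sets.

```python
from fractions import Fraction

# Hypothetical probability law for a loaded die (illustrative values only).
law = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
       4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}
omega = set(law)

def prob(event):
    """Probability of an event = sum of the probabilities of its outcomes."""
    return sum((law[o] for o in event), Fraction(0))

A, B = {1, 2}, {6}  # two disjoint events

nonneg = all(p >= 0 for p in law.values())    # Axiom 1, checked on outcomes
additive = prob(A | B) == prob(A) + prob(B)   # Axiom 2, for this pair A, B
normalized = prob(omega) == 1                 # Axiom 3

print(nonneg, additive, normalized)  # True True True
```

Because prob is defined as a sum over outcomes, additivity holds automatically for any disjoint pair; the check only makes that visible on one example.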
III. Discrete model & Continuous model
In a discrete model, the probability of any event 𝐴 = {𝑎1, … , 𝑎𝑛} is the sum of the probabilities of its outcomes: 𝑃(𝐴) = 𝑃(𝑎1) + ⋯ + 𝑃(𝑎𝑛). When Ω is finite and the probability law is uniform (all outcomes equally likely), 𝑃(𝐴) = |𝐴| / |Ω|.
However, a sample space can also be infinite and continuous. For continuous sample spaces, the probabilities of the single-element events may not be sufficient to characterize the probability law.
A natural candidate: for the continuous model Ω = [0,1], define the probability of any subinterval [𝑎, 𝑏] ⊆ [0,1] to be 𝑃([𝑎, 𝑏]) = 𝑏 − 𝑎 (i.e., probability = “the length of the interval”). Under this law every single point has probability 0, which is why single-element events cannot characterize it.
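The "length of the interval" law can be checked empirically with a small Monte Carlo sketch. It assumes that Python's random.random() is an adequate stand-in for a uniform draw from [0,1]; the function name, sample size, and seed are arbitrary choices made here.

```python
import random

def estimate_interval_prob(a, b, n=100_000, seed=0):
    """Estimate P([a, b]) under the uniform law on [0, 1] by sampling."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n) if a <= rng.random() <= b)
    return hits / n

est = estimate_interval_prob(0.2, 0.5)
print(est)  # should be close to b - a = 0.3
```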
IV. Properties of Probability Laws
Consider a probability law, and let 𝐴, 𝐵, and 𝐶 be events.
1. If 𝐴 ⊆ 𝐵, then 𝑃(𝐴) ≤ 𝑃(𝐵) .
2. 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵) .
3. 𝑃(𝐴 ∪ 𝐵) ≤ 𝑃(𝐴) + 𝑃(𝐵) .
4. 𝑃(𝐴 ∪ 𝐵 ∪ 𝐶) = 𝑃(𝐴) + 𝑃(𝐴' ∩ 𝐵) + 𝑃(𝐴' ∩ 𝐵' ∩ 𝐶).
ps. 𝐴' is the complement of 𝐴.
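The four properties can be verified on a concrete case. The sketch below assumes a fair six-sided die with a uniform law; the events A, B, C and the subset S are arbitrary examples.

```python
from fractions import Fraction

omega = set(range(1, 7))  # assumed model: fair six-sided die

def prob(event):
    """Uniform law: P(E) = |E| / |Omega|."""
    return Fraction(len(event), len(omega))

def complement(event):
    return omega - event

A = {1, 2, 3}
B = {2, 4, 6}
C = {3, 5}
S = {1, 2}  # S is a subset of A

checks = {
    "1. monotonicity": prob(S) <= prob(A),
    "2. inclusion-exclusion": prob(A | B) == prob(A) + prob(B) - prob(A & B),
    "3. union bound": prob(A | B) <= prob(A) + prob(B),
    "4. three-event decomposition": prob(A | B | C)
        == prob(A) + prob(complement(A) & B)
        + prob(complement(A) & complement(B) & C),
}

for name, ok in checks.items():
    print(name, ok)  # each line ends with True
```

Property 4 works because A, A' ∩ B, and A' ∩ B' ∩ C are disjoint and their union is A ∪ B ∪ C, so additivity applies.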
V. Conditional Probability
Definition: The conditional probability of A given B is P(A│B) = P(A ∩ B) / P(B), where we assume that P(B)>0.
ps. If P(B) = 0, then P(A│B) is undefined.
If the possible outcomes are finitely many and equally likely, then P(A│B) = |A ∩ B| / |B|.
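For equally likely outcomes, the counting formula can be sketched directly. The example reuses the two-coin sample space and assumes fair coins; cond_prob is a helper name introduced here, and it returns None when B is empty to mirror the "undefined" case.

```python
from fractions import Fraction
from itertools import product

# Two fair coin flips (assumed): four equally likely outcomes.
omega = ["".join(p) for p in product("HT", repeat=2)]

def cond_prob(A, B):
    """P(A|B) = |A ∩ B| / |B| for finitely many equally likely outcomes."""
    if not B:
        return None  # P(B) = 0, so P(A|B) is undefined
    return Fraction(len(A & B), len(B))

A = {"HH"}                          # both coins show heads
B = {o for o in omega if "H" in o}  # at least one coin shows heads

print(cond_prob(A, B))  # 1/3
```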
VI. Total Probability Theorem
Theorem: Let A_1, . . . , A_n be disjoint events that form a partition of the sample space, and assume that P(A_i)>0 for all i. Then, for any event B, we have
P(B) = P(A_1∩B)+⋯+P(A_n∩B) = P(A_1 )P(B│A_1 )+⋯+P(A_n )P(B│A_n ).
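A minimal numeric sketch of the theorem follows. The scenario is hypothetical: three machines A_1, A_2, A_3 produce all items of a factory, and B is the event that an item is defective. All numbers are made up for illustration.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(B) = sum_i P(A_i) * P(B|A_i), where the A_i partition the sample space."""
    if sum(priors) != 1:
        raise ValueError("priors must sum to 1 for a partition")
    return sum(p * l for p, l in zip(priors, likelihoods))

# Hypothetical values: share of production P(A_i) and defect rate P(B|A_i).
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(5, 100)]

p_b = total_probability(priors, likelihoods)
print(p_b)  # 21/1000
```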
VII. Bayes' rule
Let A_1,A_2, . . . ,A_n be disjoint events that form a partition of the sample space, and assume that P(A_i)>0, for all i.
Then, for any event B with P(B)>0,
P(A_i│B) = P(A_i∩B) / P(B) = P(A_i)P(B│A_i) / P(B) = P(A_i)P(B│A_i) / ( P(A_1)P(B│A_1)+⋯+P(A_n)P(B│A_n) ).
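Bayes' rule can be sketched as a short function that takes the P(A_i) and P(B│A_i) and returns every P(A_i│B). The numbers are hypothetical: three machines with made-up production shares and defect rates, where B is the event that an item is defective.

```python
from fractions import Fraction

def bayes_posteriors(priors, likelihoods):
    """Return [P(A_i|B)] given [P(A_i)] and [P(B|A_i)] for a partition A_1..A_n."""
    joint = [p * l for p, l in zip(priors, likelihoods)]  # P(A_i ∩ B)
    p_b = sum(joint)                                      # total probability of B
    if p_b == 0:
        raise ValueError("P(B) = 0, so P(A_i|B) is undefined")
    return [j / p_b for j in joint]

# Hypothetical values: P(A_i) and P(B|A_i) for three machines.
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(5, 100)]

posteriors = bayes_posteriors(priors, likelihoods)
print(posteriors)  # [Fraction(5, 21), Fraction(2, 7), Fraction(10, 21)]
```

In this made-up example the third machine makes only 1/5 of the items but accounts for 10/21 of the defective ones, which shows how the likelihoods reweight the priors.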
