PROBABILITY DISTRIBUTIONS EVERYONE SHOULD KNOW INTUITIVELY
Chance of surviving a trip to Tortula!
Why should you know this distribution?
It powers back-of-the-napkin probability calculations, like betting on a coin flip, a lottery, or a die throw, where your decisions depend on answers to questions like:
“Given that the last 3 flips of this coin were heads, what is the probability of the next flip landing tails?”
“I have a king and a queen. What is the probability of my opponent having 2 aces?”
“How likely am I to roll 3 sixes in a row with this die?”
Or when you run a production line or a hospital:
“My machine produces good products 95% of the time. What is the probability of 10 defectives in 1000 samples picked by the inspector?”
“Sepsis patients have about a 67% chance of going into shock, and I have 56 such patients. I need the drug epinephrine on hand should a patient go into shock, but it is a scarce medicine. What is the chance of more than 12 patients going into shock?”
Or for more important things in life (at least for me), like the choice of weapon in a Doom Eternal boss fight:
“The probability of a kill using the rocket launcher is 56%, and I have only 3 rockets left. What is the chance of a kill in the first 2 launches?”
Okay, I’m interested. Tell me, how does this work?
Let’s start with the slightly technical definition: a random variable X that counts the number of successes in a fixed number of independent, identical Bernoulli trials is called a binomial random variable, and X follows a binomial distribution.
Jargon explained:
Bernoulli trial: Think of this as an event with only 2 outcomes, success or failure, heads or tails, win or no win, a good product or defective, etc.
Independent: One event does not affect the outcome of any other event. For example, the outcome of a previous coin flip in no way affects the outcome of any subsequent flip, and scoring a kill with one bullet does not affect whether the next bullet scores a kill or misses the target.
Identical: All the events have the same probability of success. For example, every time you flip a coin the probability of landing a tail is 50%, and every roll of a die has the same probability of ending in a six.
Now read the definition again — sllloooowly, understanding each word.
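To make the definition concrete, here is a minimal simulation sketch using only Python's standard library. The helper name count_successes, the seed, and the choice of 10 coin flips are illustrative assumptions, not part of the original article. Each call runs a batch of independent, identical Bernoulli trials and returns the number of successes, which is one draw of a binomial random variable.

```python
import random


def count_successes(n_trials, p_success, rng):
    """Run n_trials independent, identical Bernoulli trials and count the successes."""
    successes = 0
    for _ in range(n_trials):
        # One Bernoulli trial: success with probability p_success, otherwise failure.
        if rng.random() < p_success:
            successes += 1
    return successes


rng = random.Random(42)  # fixed seed, chosen arbitrarily, so reruns give the same draws

# One draw of a binomial random variable X: tails counted in 10 fair coin flips.
x = count_successes(10, 0.5, rng)
print("tails in 10 flips:", x)

# Repeating the experiment shows how the values of X are spread out.
draws = [count_successes(10, 0.5, rng) for _ in range(10_000)]
for k in range(11):
    print(k, draws.count(k) / len(draws))
```

The printed fractions should cluster around 5 tails and thin out toward 0 and 10, which is the shape the binomial distribution describes.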
Let's take an example: You are about to make a journey and fly through the treacherous nebulae of Tortula to reach your destination. It is famously said that the chance of any one engine blowing out during a ride through these nebulae is 60%.
Your jet has 5 engines, and it needs at least 3 of them working to fly. The natural question to ask would be:
“What are the chances of having more than two engine malfunctions on your epic trip through the nebulae of Tortula?”
Let's work this out:
Consider this as 5 trials, each trial corresponding to one engine surviving the journey. In each trial the probability of a blowout is 60% or 0.6.
Let's analyze the probability of exactly 3 engine blowouts.
If there are 3 blown engines then there must be 2 good engines. Because each event or trial is independent of others, we can multiply these probabilities. In mathematical terms it is
P(B and A) = P(A) x P(B)
In our case this is P(B = 3 blown engines) x P(A = 2 good engines):
P(engines 1 & 2 & 3 blown) = P(engine 1 blown) x P(engine 2 blown) x P(engine 3 blown) = 0.6 x 0.6 x 0.6 = (0.6)³
P(engines 4 & 5 good) = P(engine 4 good) x P(engine 5 good) = 0.4 x 0.4 = (0.4)²
giving a probability of (0.6)³ x (0.4)² for this one particular arrangement of blown and good engines.
Now there are exactly 5c3 (5 choose 3) ways of selecting which 3 of the 5 engines blow, and each of those arrangements has the same probability. You might remember this from the permutations and combinations chapter of math class back in school: 5c3 = 5! / (3! x 2!) = 10.
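If the counting step feels abstract, a short sketch can list every way of picking the 3 blown engines. It assumes the engines are simply numbered 1 to 5 (my labeling, for illustration) and uses the standard library's itertools.combinations and math.comb.

```python
from itertools import combinations
from math import comb

engines = [1, 2, 3, 4, 5]  # engine labels, assumed for illustration

# Every distinct set of 3 engines that could be the blown ones.
blown_sets = list(combinations(engines, 3))
for blown in blown_sets:
    print(blown)

# The listed count agrees with the closed-form value of 5 choose 3.
print(len(blown_sets), comb(5, 3))
```

Both numbers on the last line are 10, matching 5! / (3! x 2!).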
So the total probability becomes:
P(3 blown engines & 2 good ones) = (5c3) x 0.6³ x 0.4² = 0.3456
In the same way, we calculate the probability of 4 engines blowing out and 1 good engine:
P(4) = (5c4) x 0.6⁴ x 0.4¹ = 0.2592
and of all 5 engines blowing out:
P(5) = (5c5) x 0.6⁵ x 0.4⁰ = 0.07776
Adding these, we get the probability of a definite death as 0.68256, or about 68.26%. Since you are a hero, you look at it as a 31.74% survival rate.
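The same arithmetic can be checked in a few lines of Python. This is a sketch using only math.comb from the standard library. The names N_ENGINES, P_BLOWOUT, and p_exactly_blown are mine, and it assumes, as the example does, that each engine blows out independently with probability 0.6.

```python
from math import comb

N_ENGINES = 5    # engines on the jet
P_BLOWOUT = 0.6  # chance that any one engine blows out (from the example)


def p_exactly_blown(k):
    """Probability that exactly k of the 5 engines blow out."""
    return comb(N_ENGINES, k) * P_BLOWOUT**k * (1 - P_BLOWOUT) ** (N_ENGINES - k)


# The jet goes down when 3, 4, or 5 engines blow out.
p_death = sum(p_exactly_blown(k) for k in (3, 4, 5))
p_survival = 1 - p_death

for k in (3, 4, 5):
    print(f"P({k} blown) = {p_exactly_blown(k):.5f}")
print(f"P(death)    = {p_death:.5f}")
print(f"P(survival) = {p_survival:.5f}")
```

Summing the three cases gives 0.68256 for death and 0.31744 for survival, in line with the hand calculation.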
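In general, the formula is:
P(X = x) = (nCx) × p^x × (1 − p)^(n − x)
n: number of trials
x: number of successes desired
p: probability of success in a single trial
nCx: number of ways to choose which x of the n trials are the successes, n! / (x! × (n − x)!)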
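As a rough sketch, the general formula translates into one small function. The name binomial_pmf and the decision to return zero for impossible counts are my assumptions. The die example is the "3 sixes in a row" question from the start of the article.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x): probability of exactly x successes in n trials, each succeeding with probability p."""
    if not 0 <= x <= n:
        return 0.0  # counts below 0 or above n cannot happen
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Rolling three sixes in three rolls of a fair die: n = 3, x = 3, p = 1/6.
p_three_sixes = binomial_pmf(3, 3, 1 / 6)
print(p_three_sixes)

# Sanity check: the probabilities of every possible count add up to 1
# (n = 10 and p = 0.3 are arbitrary values picked for the check).
total = sum(binomial_pmf(x, 10, 0.3) for x in range(11))
print(total)
```

The first value equals 1/216, a little under half a percent, and the second should print 1.0 up to floating-point rounding.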
How do I use this distribution in Python?
Here is a Jupyter notebook with Python code implementing the binomial distribution for a different example: finding the probability of a credit card non-payment.
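Separately from that notebook, which is not reproduced here, the questions from the start of this article can be sketched with the standard library alone. The function name binomial_pmf is my own. The sketch assumes every trial is independent with the stated probability, and that "a kill in the first 2 launches" means at least one of the two rockets scores the kill.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Rocket launcher: at least one kill in the first 2 launches, p = 0.56 per rocket.
# "At least one" is the complement of "zero kills in 2 launches".
p_rocket = 1 - binomial_pmf(0, 2, 0.56)

# Production line: exactly 10 defectives in 1000 samples, p = 0.05 per item.
p_defects = binomial_pmf(10, 1000, 0.05)

# Sepsis ward: more than 12 of 56 patients going into shock, p = 0.67 per patient.
# "More than 12" is the complement of "12 or fewer".
p_shock = 1 - sum(binomial_pmf(x, 56, 0.67) for x in range(13))

print(f"kill within 2 rockets:   {p_rocket:.4f}")
print(f"exactly 10 defectives:   {p_defects:.3e}")
print(f"more than 12 in shock:   {p_shock:.6f}")
```

The rocket answer works out to 1 − 0.44² = 0.8064. The other two results are extreme: 10 defectives is far below the 50 you would expect, and more than 12 shock cases out of 56 is close to certain, so stocking for only 12 would be a risky plan.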
People with a grasp of the binomial distribution have an additional set of quality tools at hand when evaluating their choices, assessing risk, and avoiding the unpleasant results that can accrue from insufficient preparation. When you understand the binomial distribution and its often surprising results, you’ll be well ahead of the masses.