The following preposterous case illustrates the Bayesian worldview:
If you ask a mathematically gifted newborn for the probability that the sun will rise tomorrow, they might reply:
“The probability that the sun will rise tomorrow follows a beta distribution with parameters a = b = 2.”
Since the mean of this distribution is 50% and its variance is large, the statement is just a fancy, probabilistic way of saying:
“I have no idea what the probability is, and my guess could be wildly wrong.”
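To see why a Beta(2, 2) prior encodes this shrug, we can compute its moments directly from the closed-form expressions for the beta distribution (a quick sketch, using only standard library arithmetic):

```python
# Moments of a Beta(a, b) distribution:
#   mean     = a / (a + b)
#   variance = a*b / ((a + b)**2 * (a + b + 1))
a, b = 2, 2  # the newborn's prior

mean = a / (a + b)
var = a * b / ((a + b) ** 2 * (a + b + 1))
sd = var ** 0.5

print(f"prior mean     = {mean:.2f}")   # 0.50
print(f"prior variance = {var:.3f}")    # 0.050
print(f"prior std dev  = {sd:.3f}")     # ~0.224
```

A standard deviation of roughly 0.22 on a quantity confined to [0, 1] is about as vague as a unimodal belief gets.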
The statement reflects the newborn’s complete inexperience with sunrise.
Their estimate is a prior distribution of the probability that the sun will rise tomorrow. Usually priors are based on expert opinion, which gives them more weight; in this case, however, we have an uninformative prior, since the person who gave it clearly lacks expertise in the subject of sunrises.
After observing a few sunrises, the newborn will begin to conclude that this is a common phenomenon, which shifts their belief about the probability of seeing the sun rise the next day. Using Bayesian estimation to update the above prior after observing 6, 8, 12, and 14 consecutive sunrises yields:
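The update itself is the standard conjugate Beta–Bernoulli rule: each observed sunrise adds 1 to the first shape parameter, and each failure (none here) would add 1 to the second. A minimal sketch of the sequence of posteriors, assuming the counts above are all successes:

```python
# Conjugate Beta-Bernoulli update: starting from Beta(2, 2),
# n consecutive sunrises (and zero non-sunrises) give the
# posterior Beta(2 + n, 2).
a0, b0 = 2, 2  # prior

for n in (6, 8, 12, 14):
    a, b = a0 + n, b0
    mean = a / (a + b)
    print(f"after {n:2d} sunrises: Beta({a:2d}, {b}), mean = {mean:.3f}")
```

The posterior means climb from 0.800 (after 6 sunrises) to about 0.889 (after 14), illustrating the drift toward certainty described next.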
The belief that the sun will rise the following day starts approaching certainty as the number of sunrise observations increases. This is a case of Bayesian learning.
Bayesian estimation provides a formal method for combining prior knowledge with new measurements, to “update” the knowledge in light of the new data. The updated information is represented by the posterior distribution.
Where did I get the “21504 to 1” estimate?
Starting with the prior above and updating it via Bayesian estimation with every sunrise observation in my lifetime produces:
The mean of this distribution is 0.9999535, or roughly 21504:1 odds in favor of the sun rising tomorrow. Adding more observations, say over the course of humanity's time on Earth, would push the calculated probability even closer to certainty.