SciTech-Mathematics-Probability+Statistics-CDF vs. PDF: What’s the Difference? PDF: Probability Density Function + CDF: Cumulative Distribution Function

https://www.statology.org/cdf-vs-pdf/

CDF vs. PDF: What’s the Difference?

This tutorial provides a simple explanation of the difference between:

  • a PDF (probability density function)
  • a CDF (cumulative distribution function)

in statistics.

Random Variables

Before we can define a PDF or a CDF, we first need to understand random variables.

A random variable, usually denoted as X,
is a variable whose values are numerical outcomes of some random process.
There are two types of random variables: discrete and continuous.

"Discrete" Random Variables

A "discrete" random variable is one which can take on only a countable number of distinct values, such as 0, 1, 2, 3, 4, 5, ..., 100, 1 million, etc.
Some examples of discrete random variables include:

  • The number of times a coin lands on heads after being flipped 20 times.
  • The number of times a die lands on the number 4 after being rolled 100 times.

"Continuous" Random Variables

A "continuous" random variable is one which can take on any value within a range, so its possible values are infinitely many and cannot be listed one by one.
Some examples of continuous random variables include:

  • Height of a person
    There are an infinite number of possible values for height.
    For example, the height of a person could be 60.2 inches, 65.234 inches, 70.4312 inches, etc.
  • Weight of an animal
  • Time required to run a mile

Rule of Thumb:

  • If you can count the number of outcomes,
    then you are working with a discrete random variable.
    e.g. counting the number of times a coin lands on heads.
  • But if you can measure the outcome,
    you are working with a continuous random variable.
    e.g. measuring height, weight, time, etc.

PDF(Probability Density Functions)

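  • A pdf(probability density function) describes how probability is distributed over the values a random variable can take. For a discrete random variable it gives the probability of each value directly; in that case it is more precisely called a pmf(probability mass function).
  • For example, suppose we roll a die one time.
    If we let x denote the number that the die lands on,
    then the pdf(probability density function) for the outcome can be described as follows: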
    • P(x < 1) : 0
    • P(x = 1) : 1/6
    • P(x = 2) : 1/6
    • P(x = 3) : 1/6
    • P(x = 4) : 1/6
    • P(x = 5) : 1/6
    • P(x = 6) : 1/6
    • P(x > 6) : 0
    • Note that:
      this is an example of a discrete random variable, since x can only take on integer values.
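
The die example above can be sketched in a few lines of Python. This is only an illustration: the name `die_pmf` is made up here, and exact fractions are used so the values match the list above.

```python
from fractions import Fraction

# Faces of a fair six-sided die (fairness is the tutorial's assumption).
FACES = range(1, 7)

def die_pmf(value):
    """Return P(x = value) for a single roll of a fair die."""
    return Fraction(1, 6) if value in FACES else Fraction(0)

# Probabilities for a few values, including impossible outcomes.
for value in (0, 1, 4, 6, 7):
    print(f"P(x = {value}) = {die_pmf(value)}")

# The probabilities over all faces must sum to 1.
total = sum(die_pmf(face) for face in FACES)
print("total:", total)
```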

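For a continuous random variable, a pdf cannot be read this way: it gives a density rather than a probability, and the probability that x takes on any exact value is zero. Probabilities are instead assigned to ranges of values.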

  • The total probability of x MUST be 1, but for a continuous variable it is spread smoothly over infinitely many possible outcomes,
    so the probability of any single exact value is 0.
  • For example, suppose we want to know the probability that a burger from a particular restaurant weighs a quarter-pound (0.25 lbs). Since weight is a continuous variable, it can take on an infinite number of values.
  • A given burger might actually weigh 0.250001 pounds, or 0.24 pounds, or 0.2488 pounds. The probability that a given burger weighs exactly 0.25 pounds is zero.
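
To make the burger example concrete, here is a rough sketch. It assumes, purely for illustration, that burger weights follow a normal distribution with a mean of 0.25 lb and a standard deviation of 0.01 lb; neither figure comes from the tutorial. It works with the cumulative probability P(weight ≤ x), which the CDF section defines, and shows that an interval shrinking toward the single point 0.25 carries less and less probability.

```python
import math

# Hypothetical parameters, not from the tutorial: weights in pounds.
MEAN = 0.25
STD_DEV = 0.01

def normal_cdf(x, mean=MEAN, std_dev=STD_DEV):
    """P(weight <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std_dev * math.sqrt(2.0))))

def prob_between(low, high):
    """P(low <= weight <= high), computed as a difference of cdf values."""
    return normal_cdf(high) - normal_cdf(low)

# An interval of zero width has zero probability.
print(prob_between(0.25, 0.25))

# Narrower intervals around 0.25 have smaller and smaller probabilities.
for half_width in (0.01, 0.001, 0.0001):
    p = prob_between(0.25 - half_width, 0.25 + half_width)
    print(f"within ±{half_width} lb: {p:.4f}")
```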

CDF(Cumulative Distribution Functions)

A cdf(cumulative distribution function) tells us the probability that a random variable takes on a value that is less than or equal to x.

For example, suppose we roll a die one time.
If we let x denote the number that the die lands on,
then the cdf(cumulative distribution function) for the outcome
can be described as follows:

  • P(x ≤ 0) : 0
  • P(x ≤ 1) : 1/6
  • P(x ≤ 2) : 2/6
  • P(x ≤ 3) : 3/6
  • P(x ≤ 4) : 4/6
  • P(x ≤ 5) : 5/6
  • P(x ≤ 6) : 6/6
  • P(x ≤ 7) : 6/6 (the cdf stays at 1 for every value above 6, since P(x > 6) is 0)
  • Notice that:
    the probability that x is less than or equal to 6 is 6/6, which is equal to 1.
    This is because the die will land on either 1, 2, 3, 4, 5, or 6 with 100% probability.
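
The cumulative values above are just running totals of the individual probabilities. A minimal sketch of that idea follows; `die_cdf` is an illustrative name, and Python's `Fraction` prints in lowest terms, so 2/6 appears as 1/3.

```python
from fractions import Fraction
from itertools import accumulate

FACES = list(range(1, 7))
PMF = [Fraction(1, 6)] * 6   # P(x = face) for each face of a fair die

# Running totals of the probabilities give P(x <= face).
CUMULATIVE = list(accumulate(PMF))

def die_cdf(x):
    """Return P(roll <= x) for any real number x."""
    if x < 1:
        return Fraction(0)
    if x >= 6:
        return Fraction(1)
    return CUMULATIVE[int(x) - 1]  # int() floors positive x

for x in (0, 1, 2, 3.5, 6, 10):
    print(f"P(x <= {x}) = {die_cdf(x)}")
```

The function accepts non-integer inputs such as 3.5 because a cdf is defined for every real number, not only for the values the die can show.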

This example uses a discrete random variable,
but a cdf(cumulative distribution function) can also be used for a continuous random variable.

cdfs(cumulative distribution functions) have the following properties:

  • The probability that a random variable takes on a value less than the smallest possible value is zero. For example, the probability that a die lands on a value less than 1 is zero.
  • The probability that a random variable takes on a value less than or equal to the largest possible value is one. For example, the probability that a die lands on a value of 1, 2, 3, 4, 5, or 6 is one. It must land on one of those numbers.
  • The cdf is always non-decreasing. That is, the probability that a die lands on a number less than or equal to 1 is 1/6, the probability that it lands on a number less than or equal to 2 is 2/6, the probability that it lands on a number less than or equal to 3 is 3/6, etc. The cumulative probabilities never go down as x increases.
  • Related: You can use an ogive graph to visualize a cdf(cumulative distribution function).

The Relationship Between a CDF and a PDF

  • In technical terms, for a continuous random variable, the pdf(probability density function) is the derivative of the cdf(cumulative distribution function).
  • Furthermore, the area under the curve of the pdf between negative infinity and x is equal to the value of the cdf at x.
  • For an in-depth explanation of the relationship between a pdf and a cdf, along with the proof for why the pdf is the derivative of the cdf, refer to a statistics textbook.
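
Both statements can be checked numerically. The sketch below uses the standard normal distribution as an example; that choice, the function names, the step size `h`, and the cutoff of -10 standing in for negative infinity are all assumptions made here, not part of the tutorial. It is a numerical check, not a proof.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """P(X <= x) for the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cdf_slope(x, h=1e-5):
    """Central-difference estimate of the derivative of the cdf at x."""
    return (normal_cdf(x + h) - normal_cdf(x - h)) / (2.0 * h)

def pdf_area(x, lower=-10.0, steps=100_000):
    """Trapezoid-rule area under the pdf from `lower` to x.

    `lower` stands in for negative infinity; the density below -10 is negligible.
    """
    width = (x - lower) / steps
    inner = sum(normal_pdf(lower + i * width) for i in range(1, steps))
    return width * (0.5 * normal_pdf(lower) + inner + 0.5 * normal_pdf(x))

for x in (-1.0, 0.0, 1.5):
    print(f"x = {x}")
    print(f"  slope of cdf: {cdf_slope(x):.6f}   pdf: {normal_pdf(x):.6f}")
    print(f"  area under pdf: {pdf_area(x):.6f}   cdf: {normal_cdf(x):.6f}")
```

In each pair of printed numbers, the two values should agree to several decimal places: the slope of the cdf matches the pdf, and the accumulated area under the pdf matches the cdf.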

Zach Bobbitt

Hey there. My name is Zach Bobbitt. I have a Masters of Science degree in Applied Statistics and I've worked on machine learning algorithms for professional businesses in both healthcare and retail.
I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.
My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.

posted @ 2024-08-06 15:58  abaelhe