Sampling Distributions for Sample Proportions

Carson West

Introduction

When we take a sample from a population and calculate a statistic, that statistic will vary from sample to sample. A sampling distribution is the distribution of the values a statistic (such as the sample proportion) takes over all possible samples of the same size from the same population. This page focuses specifically on the sampling distribution of the sample proportion.

What is a Sample Proportion ( $ \hat{p} $ )?

A sample proportion ( $ \hat{p} $ ) is the proportion of successes in a simple random sample of size $ n $ drawn from a population. It is an estimate of the true population proportion ( $ p $ ).

The formula for the sample proportion is: $$ \hat{p} = \frac{X}{n} $$ where $ X $ is the number of “successes” (observations with the characteristic of interest) in the sample, and $ n $ is the sample size.
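As a small illustration of the formula, the sketch below computes $ \hat{p} $ from a list of 0/1 outcomes. The data and the function name `sample_proportion` are made up for this example and are not taken from any particular dataset or library.

```python
# Hypothetical sample of n = 20 observations: 1 = success, 0 = failure.
sample = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1]


def sample_proportion(outcomes):
    """Return p-hat = X / n for a list of 0/1 outcomes."""
    n = len(outcomes)
    if n == 0:
        raise ValueError("sample must contain at least one observation")
    x = sum(outcomes)  # X: the number of successes in the sample
    return x / n


p_hat = sample_proportion(sample)
print(f"X = {sum(sample)}, n = {len(sample)}, p-hat = {p_hat:.2f}")
```

Here $ X = 11 $ and $ n = 20 $, so $ \hat{p} = 11/20 = 0.55 $.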

The Sampling Distribution of $ \hat{p} $

If we take many, many samples of the same size $ n $ from a population and calculate $ \hat{p} $ for each sample, the distribution of these $ \hat{p} $ values forms the sampling distribution. This distribution has a specific mean, standard deviation, and shape under certain conditions.
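The idea of repeated sampling can be sketched with a short simulation. This is only an illustration under stated assumptions: the population proportion `p = 0.3`, the sample size `n = 100`, and the number of repetitions are arbitrary choices, and each observation is drawn independently (as if the population were very large). The function name `simulate_p_hats` is hypothetical.

```python
import random
import statistics


def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return the p-hat of each one.

    Each observation is an independent success with probability p,
    which approximates sampling from a very large population.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    p_hats = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats


p_hats = simulate_p_hats(p=0.3, n=100, num_samples=5000)

# The simulated values approximate the sampling distribution of p-hat.
print("mean of simulated p-hats:", round(statistics.mean(p_hats), 4))
print("SD of simulated p-hats:  ", round(statistics.pstdev(p_hats), 4))
```

With these settings the simulated mean should land close to 0.3 and the simulated standard deviation close to $ \sqrt{0.3(0.7)/100} \approx 0.046 $, although the exact values depend on the random draws.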

Mean of the Sampling Distribution of $ \hat{p} $

The mean of the sampling distribution of $ \hat{p} $ (denoted $ \mu_{\hat{p}} $ or $ E(\hat{p}) $ ) is equal to the true population proportion $ p $ . This means that $ \hat{p} $ is an unbiased estimator of $ p $ .

$$ \mu_{\hat{p}} = p $$
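One way to see why this holds, assuming each of the $ n $ sampled individuals is a success with probability $ p $ so that the expected number of successes is $ E(X) = np $ :

$$ \mu_{\hat{p}} = E\left(\frac{X}{n}\right) = \frac{E(X)}{n} = \frac{np}{n} = p $$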

Standard Deviation of the Sampling Distribution of $ \hat{p} $

The standard deviation of the sampling distribution of $ \hat{p} $ (denoted as $ \sigma_{\hat{p}} $ ) describes how much the sample proportion typically varies from the population proportion $ p $ .

The formula for the standard deviation is: $$ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} $$ When sampling without replacement, this formula is appropriate only if the 10% Condition is met, which ensures that individual observations are approximately independent. The 10% Condition states that the sample size $ n $ must be no more than 10% of the population size $ N $ (i.e., $ n \le 0.10N $ ).
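A minimal sketch of this calculation, with the 10% Condition checked first, might look like the following. The values of `p`, `n`, and `N` are invented for illustration, and the function names are hypothetical.

```python
import math


def ten_percent_condition(n, population_size):
    """Return True if n is no more than 10% of the population size."""
    # Written as 10 * n <= N to avoid floating-point rounding in 0.10 * N.
    return 10 * n <= population_size


def sd_of_p_hat(p, n):
    """Return sqrt(p(1 - p) / n), the standard deviation of p-hat."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1 - p) / n)


# Hypothetical setting: p = 0.4, sample of 150 from a population of 2000.
p, n, N = 0.4, 150, 2000

if ten_percent_condition(n, N):
    print(f"sigma of p-hat = {sd_of_p_hat(p, n):.4f}")
else:
    print("10% Condition not met; the formula may not be appropriate.")
```

In this example $ n = 150 \le 0.10(2000) = 200 $ , so the condition holds and $ \sigma_{\hat{p}} = \sqrt{0.4(0.6)/150} = 0.04 $ .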

Shape of the Sampling Distribution of $ \hat{p} $ (Normal Approximation)

Under certain conditions, the sampling distribution of $ \hat{p} $ can be approximated by a Normal distribution. This is a powerful result, similar in spirit to the Central Limit Theorem for sample means.

Conditions for a Normal Model

To use a Normal distribution as an approximation for the sampling distribution of $ \hat{p} $ , three key conditions must be met:

  1. Random Condition: The data must come from a random sample of the population of interest, or be produced by a well-designed randomized experiment.
  2. 10% Condition: The sample size $ n $ must be no more than 10% of the population size $ N $ ( $ n \le 0.10N $ ). This ensures that the observations are approximately independent.
  3. Large Counts Condition (Success/Failure Condition): The number of expected successes ( $ np $ ) and expected failures ( $ n(1-p) $ ) in the sample must both be at least 10. This ensures that the shape of the sampling distribution is approximately Normal.
    • $ np \ge 10 $
    • $ n(1-p) \ge 10 $

If these conditions are met, we can say that $ \hat{p} $ follows an approximately Normal distribution with mean $ \mu_{\hat{p}} = p $ and standard deviation $ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} $ .
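The numeric conditions and the resulting Normal model can be sketched as follows. This is an illustration only: the helper `normal_model_for_p_hat` and the numbers used are assumptions made for this example, and the Random Condition cannot be verified from the numbers, so it is taken on trust here. The sketch uses `NormalDist` from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist


def normal_model_for_p_hat(p, n, population_size):
    """Return the Normal model for p-hat if the numeric conditions hold.

    The Random Condition depends on how the data were collected and is
    assumed to be satisfied; it cannot be checked from p, n, and N.
    """
    checks = {
        "10% Condition": 10 * n <= population_size,
        "Large Counts Condition": n * p >= 10 and n * (1 - p) >= 10,
    }
    failed = [name for name, ok in checks.items() if not ok]
    if failed:
        raise ValueError("conditions not met: " + ", ".join(failed))
    return NormalDist(mu=p, sigma=math.sqrt(p * (1 - p) / n))


# Hypothetical question: with p = 0.5 and n = 100 drawn from a population
# of 10,000, how likely is a sample proportion above 0.55?
model = normal_model_for_p_hat(p=0.5, n=100, population_size=10_000)
prob = 1 - model.cdf(0.55)
print(f"P(p-hat > 0.55) is approximately {prob:.4f}")
```

Here $ \sigma_{\hat{p}} = 0.05 $ , so 0.55 sits one standard deviation above the mean and the probability is roughly 0.16.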

Summary Table

| Characteristic | Description | Formula | Condition for Use (if applicable) |
| --- | --- | --- | --- |
| **Mean of $ \hat{p} $** | The center of the sampling distribution of sample proportions. | $ \mu_{\hat{p}} = p $ | N/A |
| **Standard Deviation of $ \hat{p} $** | Measures the variability of sample proportions around the population proportion. | $ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} $ | 10% Condition: $ n \le 0.10N $ |
| **Shape of $ \hat{p} $** | Approximately Normal. | $ N(\mu_{\hat{p}}, \sigma_{\hat{p}}) $ | Large Counts Condition: $ np \ge 10 $ and $ n(1-p) \ge 10 $ |
| **Random Condition** | Ensures that the sample is representative of the population. | N/A | Data from a random sample or a well-designed experiment |

Understanding sampling distributions is fundamental for inferential statistics, as it allows us to quantify the uncertainty in our estimates and make inferences about population parameters based on sample data.