Sampling Distributions for Sample Proportions
Introduction
When we take a sample from a population and calculate a statistic, that statistic will vary from sample to sample. A sampling distribution is the distribution of all possible values of a statistic (such as the sample proportion) across all possible samples of the same size from the same population. This page focuses specifically on the sampling distribution of the sample proportion.
What is a Sample Proportion ( $ \hat{p} $ )?
A sample proportion ( $ \hat{p} $ ) is the proportion of successes in a simple random sample of size $ n $ drawn from a population. It is an estimate of the true population proportion ( $ p $ ).
The formula for the sample proportion is: $$ \hat{p} = \frac{X}{n} $$ where $ X $ is the number of “successes” (observations with the characteristic of interest) in the sample, and $ n $ is the sample size.
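As a minimal illustration, the formula can be applied directly to a list of outcomes. The data below are made up for the example, and the variable names are only illustrative.

```python
# Hypothetical sample (made-up data): 1 marks a "success", 0 a "failure".
sample = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1]

n = len(sample)   # sample size
x = sum(sample)   # number of successes, X
p_hat = x / n     # sample proportion, X / n

print(f"X = {x}, n = {n}, p-hat = {p_hat:.2f}")
```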
The Sampling Distribution of $ \hat{p} $
If we take many, many samples of the same size $ n $ from a population and calculate $ \hat{p} $ for each sample, the distribution of these $ \hat{p} $ values forms the sampling distribution. This distribution has a specific mean, standard deviation, and shape under certain conditions.
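The idea of "many, many samples" can be sketched with a short simulation. This is only an illustration under stated assumptions: the population is treated as so large that each observation is an independent success with probability $ p $ , and the function name `simulate_p_hats` and the values of `p`, `n`, and `num_samples` are hypothetical choices, not part of the original page.

```python
import random
import statistics

def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n and return each sample's p-hat.

    Assumes a very large population, so every observation is an
    independent success with probability p.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

p_hats = simulate_p_hats(p=0.3, n=100, num_samples=5000)
print("mean of p-hats:", round(statistics.mean(p_hats), 3))
print("SD of p-hats:  ", round(statistics.pstdev(p_hats), 3))
```

With these settings the simulated mean should land close to $ p = 0.3 $ and the simulated standard deviation close to $ \sqrt{0.3(0.7)/100} \approx 0.046 $ , though the exact figures depend on the random draws.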
Mean of the Sampling Distribution of $ \hat{p} $
The mean of the sampling distribution of $ \hat{p} $ (denoted $ \mu_{\hat{p}} $ or $ E(\hat{p}) $ ) is equal to the true population proportion $ p $ . This means that $ \hat{p} $ is an unbiased point estimator of $ p $ .
$$ \mu_{\hat{p}} = p $$
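One way to see this, assuming the count of successes $ X $ follows a binomial model with $ n $ independent trials and success probability $ p $ (an assumption this page does not state explicitly), is to use $ E(X) = np $ : $$ E(\hat{p}) = E\left(\frac{X}{n}\right) = \frac{E(X)}{n} = \frac{np}{n} = p $$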
Standard Deviation of the Sampling Distribution of $ \hat{p} $
The standard deviation of the sampling distribution of $ \hat{p} $ (denoted as $ \sigma_{\hat{p}} $ ) describes how much the sample proportion typically varies from the population proportion $ p $ .
The formula for the standard deviation is: $$ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} $$ This formula is valid only if the 10% Condition is met, which ensures that individual observations are approximately independent. The 10% Condition states that the sample size $ n $ must be no more than 10% of the population size $ N $ (i.e., $ n \le 0.10N $ ).
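A small sketch of the calculation, with the 10% Condition checked first. The function name `sd_of_p_hat` and the numbers used ( $ p = 0.6 $ , $ n = 150 $ , $ N = 5000 $ ) are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math

def sd_of_p_hat(p, n, population_size):
    """Return sqrt(p(1-p)/n), refusing to compute if the 10% Condition fails."""
    if n > 0.10 * population_size:
        raise ValueError("10% Condition not met: n exceeds 10% of N")
    return math.sqrt(p * (1 - p) / n)

# Hypothetical values: 150 is no more than 10% of 5000, so the check passes.
sigma = sd_of_p_hat(p=0.6, n=150, population_size=5000)
print(round(sigma, 4))
```

Here $ \sqrt{0.6(0.4)/150} = \sqrt{0.0016} = 0.04 $ , so sample proportions would typically fall about 0.04 away from $ p $ .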
Shape of the Sampling Distribution of $ \hat{p} $ (Normal Approximation)
Under certain conditions, the sampling distribution of $ \hat{p} $ can be approximated by a Normal distribution. This is a powerful result, similar in spirit to the Central Limit Theorem for sample means.
Conditions for a Normal Model
To use a Normal distribution as an approximation for the sampling distribution of $ \hat{p} $ , three key conditions must be met:
- Random Condition: The data must come from a random sample of the population, or the data must be produced by a well-designed experiment.
- 10% Condition: The sample size $ n $ must be no more than 10% of the population size $ N $ (that is, $ n \le 0.10N $ ). This ensures that the observations are approximately independent.
- Large Counts Condition (Success/Failure Condition): The number of expected successes ( $ np $ ) and expected failures ( $ n(1-p) $ ) in the sample must both be at least 10. This ensures that the shape of the sampling distribution is approximately Normal.
  - $ np \ge 10 $
  - $ n(1-p) \ge 10 $
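The three conditions above can be collected into a small checker. This is a sketch, not a standard routine: the function name `check_normal_conditions` is hypothetical, and the Random Condition cannot be verified from numbers alone, so it is passed in as a judgment the analyst makes about how the data were collected.

```python
def check_normal_conditions(p, n, population_size, is_random):
    """Return a dict reporting whether each of the three conditions holds.

    is_random is supplied by the analyst (assumption: it reflects how the
    data were actually collected); the other two checks are arithmetic.
    """
    return {
        "random": is_random,
        "ten_percent": n <= 0.10 * population_size,
        "large_counts": n * p >= 10 and n * (1 - p) >= 10,
    }

# Hypothetical case: np = 8 is below 10, so the Large Counts Condition fails.
result = check_normal_conditions(p=0.08, n=100, population_size=20000, is_random=True)
print(result)
```

In this made-up case the sample is random and small relative to the population, but only 8 successes are expected, so a Normal model would not be appropriate.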
If these conditions are met, we can say that $ \hat{p} $ follows an approximately Normal distribution with mean $ \mu_{\hat{p}} = p $ and standard deviation $ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} $ .
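Once the Normal model applies, it can be used to approximate probabilities about $ \hat{p} $ . The sketch below uses Python's standard-library `statistics.NormalDist` and assumes all three conditions have already been verified; the values $ p = 0.5 $ , $ n = 100 $ , and the cutoff 0.56 are hypothetical.

```python
import math
from statistics import NormalDist

# Hypothetical values; np = n(1-p) = 50, so Large Counts holds.
p, n = 0.5, 100
sigma = math.sqrt(p * (1 - p) / n)      # standard deviation of p-hat
model = NormalDist(mu=p, sigma=sigma)   # approximate Normal model for p-hat

# Approximate P(p-hat > 0.56): the area above the cutoff under the Normal curve.
prob = 1 - model.cdf(0.56)
print(round(prob, 4))
```

The cutoff 0.56 sits 1.2 standard deviations above the mean, so the approximate probability is about 0.115.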
Summary Table
Characteristic | Description | Formula | Condition for Use (if applicable) |
---|---|---|---|
**Mean of $ \hat{p} $** | The center of the sampling distribution of sample proportions. | $ \mu_{\hat{p}} = p $ | N/A |
**Standard Deviation of $ \hat{p} $** | Measures the variability of sample proportions around the population proportion. | $ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} $ | 10% Condition: $ n \le 0.10N $ |
**Shape of $ \hat{p} $** | Approximately Normal. | $ N(\mu_{\hat{p}}, \sigma_{\hat{p}}) $ | Large Counts Condition: $ np \ge 10 $ and $ n(1-p) \ge 10 $ |
**Random Condition** | Ensures that the sample is representative of the population. | N/A | Data come from a random sample or a well-designed experiment |
Understanding sampling distributions is fundamental for inferential statistics, as it allows us to quantify the uncertainty in our estimates and make inferences about population parameters based on sample data.