# Confidence Intervals

Finding a population parameter such as the population mean or $$\mu$$ can be easier said than done. It is not always practical or cost-efficient to travel around the world and collect data. Instead, you just have to be satisfied with your sample and use it to get a range of values for your parameter. And this brings you to what are called confidence intervals.

This article discusses what a confidence interval is, how to interpret it, and the main types of confidence intervals, such as those for a population mean and for a proportion, and it works through some examples. In statistics, a confidence interval is often abbreviated as $$CI$$.

## Introduction to Confidence Intervals

Let's start by looking at the terminology behind this important concept in Statistics.

A confidence interval is a range of likely values to estimate a population parameter.

The main reason to prefer an interval estimate (a confidence interval) over a point estimate (a single statistic) is that sample results vary from sample to sample.

Suppose you would like to estimate the percentage of students who eat cupcakes during break in a school. You can imagine that if you collected data from three samples, each sample in a different week, the three samples would likely be different. The results, and the percentages of the samples, would very likely be different too.

So, you need some measure of how much you can expect those results to change if you were to repeat your study. This expectation of variations in your statistic from sample to sample is measured by the margin of error.
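The variation described above can be illustrated with a short simulation. This is only a sketch: the true share of cupcake eaters ($$35\%$$) and the sample size ($$150$$) are assumed values, and the function name `sample_percentage` is made up for this example.

```python
import random

def sample_percentage(true_share, sample_size, rng):
    """Survey `sample_size` students and return the percentage who eat cupcakes."""
    eaters = sum(rng.random() < true_share for _ in range(sample_size))
    return 100 * eaters / sample_size

# Assumed for illustration: 35% of all students eat cupcakes, 150 students per sample.
rng = random.Random(1)
weekly_results = [sample_percentage(0.35, 150, rng) for _ in range(3)]
print(weekly_results)
```

The three weekly percentages will typically differ from each other, even though the underlying share in the simulated school never changes.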

The margin of error represents a certain number of standard deviations of your statistic you add and subtract to have a certain confidence in your results.

Let's go back to the previous example.

Imagine the first sample was of $$150$$ students and the percentage of cupcake eaters was $$35\%$$, with a margin of error of $$1.5\%$$. This would mean the actual percentage of students who eat cupcakes during breaks in the entire school population is expected to be $$35\% \pm 1.5\%$$ (that is, between $$33.5\%$$ and $$36.5\%$$).

Here, you are using your sample to estimate a range of values, a confidence interval, that is likely to contain the true value of the unknown parameter you're interested in. How much trust you can place in that range is expressed by the confidence level.

The confidence level is the likelihood, given as a percentage, that an interval built this way contains the actual value of the population parameter you're interested in if you repeated the sample collection over and over.

Without further ado, let's see how to build a confidence interval.

## Confidence Interval Formula

The terminology presented in the previous section gives you a clue to the elements needed to build a confidence interval. For example, the formula for the confidence interval for the mean is:

$CI=\overline{x}\pm z \frac{\sigma_s}{\sqrt{n}}$

Here we can identify:

$$\overline{x}$$: The sample mean.

$$z$$: The critical value, which is set by the confidence level.

$$\sigma_s$$: The standard deviation of the sample.

$$n$$: The sample size.

If you want to know more about samples, the sample mean, and the sample standard deviation, check our article named Sample Mean.

With these elements, you can build a confidence interval.
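As a minimal sketch of how these elements combine, the function below applies the formula for the mean directly. The function name and the numbers (100 coins with a sample mean of 50 g and a sample standard deviation of 2 g, and a critical value of 1.96) are assumptions chosen for illustration.

```python
from math import sqrt

def mean_confidence_interval(sample_mean, sample_sd, n, z):
    """Return (lower, upper) for sample_mean ± z * sample_sd / sqrt(n)."""
    margin = z * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: 100 coins, mean 50 g, standard deviation 2 g.
low, high = mean_confidence_interval(50.0, 2.0, 100, 1.96)
print(f"{low:.3f} g to {high:.3f} g")  # margin is 1.96 * 2 / 10 = 0.392
```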

You choose the confidence level, and it determines the critical value $$z$$: the higher the confidence level, the larger $$z$$ and the wider the interval.

Let us propose an easy experiment. You measure the height of a sample of students in a college. The shortest student measures $$1.5m$$, and the tallest $$1.87m$$. Let us say you want a confidence level of $$95\%$$. If the elements used to calculate the confidence interval are chosen correctly, intervals built in this way from repeated samples will contain the mean height of all the students in the college about $$95\%$$ of the time.

Let us suppose we have measurements of the weight of coins of the same value. Some coins weigh slightly more than others. The coins nominally weigh $$50g$$, and their weights deviate from this value by between $$0g$$ and $$2g$$. If the deviations follow a normal distribution, you will get a curve like the one below:

Fig. 1. Normal distribution.

You choose an interval around the mean that you know contains a fixed share of the values. For a normal distribution, for example, about $$68.3\%$$ of the coin weight deviations lie within one standard deviation of the mean. The interval extends below and above the mean $$m$$ in this case.

However, if this is just a sample of a large population then the mean and the interval might be different for the whole set of coins circulating in the market.

If you repeat the experiment with another sample of coins, you will want to know how close the new mean is likely to be to the original one, and this is where a confidence interval comes in. The narrower the confidence interval, the closer the sample mean is likely to be to the mean of the whole population, and so the closer the means of the old sample and the new sample will be to each other.

The confidence interval gets narrower as the sample size increases.

## Types of Confidence Intervals

Confidence intervals can be built for several different population parameters.

The types of confidence intervals you will see below are:

• The confidence interval for population mean.

• The confidence interval for the difference of two means.

• The confidence interval for population proportion.

• The confidence interval for the difference of two proportions.

• The confidence interval for the slope of a regression model.

## Confidence Interval for Population Mean

Let us say you take a sample $$a$$ from a whole population $$A$$. This sample $$a$$ has a mean $$\overline{x_a}$$. If the sample is large enough and drawn at random, then the parameters of the sample will resemble those of the whole population. The better the sampling method is, the better the mean of the sample will resemble the mean of the whole population.

In this case, the confidence interval is the range $$[x_1, x_2]$$, built around the mean of the sample $$a$$, that captures the population mean with a chosen confidence level $$P$$.

So let us say you have a mean $$\overline{x_a}$$ and a $$90\%$$ confidence interval around that mean. The interval goes from the value $$x_1$$ to the value $$x_2$$. In this case, you can be $$90\%$$ confident that the mean of the population $$A$$ is inside this range.

This has another implication: if you take another sample, its mean is likely to fall in or near this range too. Let us work through a numerical example.

Let us say we have some data which follows a normal distribution. Its mean is $$0$$ and its standard deviation is $$1$$. This data is a sample from a larger population. The sample is large, with $$2000$$ observations.

Let us say you want the confidence interval for the mean to have a confidence level of $$95\%$$. A $$95\%$$ interval leaves $$2.5\%$$ in each tail of the distribution, so to retrieve the value of $$z$$ you need to go to the z-score table and find the $$z$$ whose cumulative probability is closest to $$0.975$$. The value for this confidence level is $$z=1.96$$.

If we plug this into the formula for the confidence interval for the mean:

$CI=0 \pm 1.96 \frac{1}{\sqrt{2000}}=0 \pm 0.0438$

Then we can say with $$95\%$$ confidence that the mean of the whole population is $$0$$ with a deviation of $$\pm 0.044$$.

Table 1. Cumulative probabilities of the standard normal distribution. The value $$z=1.96$$ is read from the row ($$1.9$$) and column ($$0.06$$) whose entry is closest to $$0.975$$.

| z | 0 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 |
|---|---|---|---|---|---|---|---|
| 0.0 | 0.5000 | 0.5040 | 0.5080 | 0.5120 | 0.5160 | 0.5199 | 0.5239 |
| 0.1 | 0.5398 | 0.5438 | 0.5478 | 0.5517 | 0.5557 | 0.5596 | 0.5636 |
| 0.2 | 0.5793 | 0.5832 | 0.5871 | 0.5910 | 0.5948 | 0.5987 | 0.6026 |
| 0.3 | 0.6179 | ... | ... | ... | ... | ... | ... |
| 0.4 | 0.6554 | ... | ... | ... | ... | ... | ... |
| 0.5 | 0.6915 | ... | ... | ... | ... | ... | ... |
| 0.6 | 0.7257 | ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1.5 | ... | ... | ... | ... | 0.9382 | 0.9394 | 0.9406 |
| 1.6 | ... | ... | ... | ... | 0.9495 | 0.9505 | 0.9515 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1.9 | ... | ... | ... | ... | 0.9738 | 0.9744 | 0.9750 |

The size of the sample affects the confidence interval in the previous example. If the sample size were only $$1000$$, the result would be $$0\pm0.062$$.
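Instead of reading a printed table, the critical value can also be computed. The sketch below assumes a two-sided interval, which leaves half of the remaining probability in each tail, and uses `NormalDist` from Python's standard library; the helper name `z_critical` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence_level):
    """Two-sided critical value: leaves (1 - level) / 2 in each tail."""
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))

# Margin of error for a sample with standard deviation 1 and two sample sizes.
for n in (2000, 1000):
    print(n, round(z_critical(0.95) * 1 / sqrt(n), 4))
```

The three critical values come out at roughly $$1.645$$, $$1.960$$ and $$2.576$$, and the margin of error grows when the sample size drops from $$2000$$ to $$1000$$.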

The confidence level is the proportion of intervals, built in this way from repeated samples, that contain the true parameter value.

### Confidence Interval for the Difference of Two Means

Let us say you have two samples from two populations, such as the weights of a grade 8 class in England and of a grade 9 class in Scotland. You want to find the difference between the means of both.

This could seem easy: calculate the mean weight of the class in England $$w_E$$ and subtract it from the mean of the class in Scotland $$w_S$$. However, the samples are random, so their means will not exactly match the means of all grade $$8$$ pupils in England and all grade $$9$$ pupils in Scotland. We have uncertainty about the result.

In this case we have a formula to calculate the confidence interval. The confidence interval for the difference of the means of two populations is defined as:

$CI_p=(\overline{x_1} - \overline{x_2})\pm t\sqrt{\frac{sp^2}{n_1}+\frac{sp^2}{n_2}}$

$$\overline{x_1}{,}\overline{x_2}$$: The sample means.

$$sp^2$$: The pooled variance ($$sp$$ is the pooled standard deviation).

$$n_1, n_2$$: The sizes of sample $$1$$ and sample $$2$$.

$$t$$: The $$t$$ critical value, with $$n_1+n_2-2$$ degrees of freedom.

The pooled standard deviation is calculated as follows:

$sp=\sqrt{\dfrac{(n_1 - 1)\cdot s^2_1+(n_2- 1)\cdot s_2^2 }{n_1 + n_2 -2}}$

$$s_1^2{,}s_2^2$$: The variances of the samples.
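The pooled calculation can be sketched as follows. The class sizes, means and standard deviations are invented for illustration, and the $$t$$ critical value ($$2.101$$, the usual table value for $$95\%$$ confidence with $$18$$ degrees of freedom) is passed in by hand because Python's standard library has no $$t$$ distribution.

```python
from math import sqrt

def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation of two samples with standard deviations s1, s2."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def diff_means_interval(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Confidence interval for mean1 - mean2 using the pooled variance."""
    sp = pooled_sd(s1, n1, s2, n2)
    margin = t_crit * sqrt(sp**2 / n1 + sp**2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin

# Hypothetical classes: 10 pupils each, weights in kg.
low, high = diff_means_interval(52.0, 4.0, 10, 49.0, 5.0, 10, t_crit=2.101)
print(f"{low:.2f} kg to {high:.2f} kg")
```

With these made-up numbers the interval runs from a negative to a positive value, so the data would not rule out the two population means being equal.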

## Confidence Interval for Population Proportion

You have seen what happens with the confidence interval in a normal distribution. In these types of distributions, the values are continuous. However, there are other types of distributions, like the binomial distribution. In this case, the values are the result of a Bernoulli-type experiment, in which each trial has only two possible outcomes. With these distributions, we can examine a yes-or-no question.

Let us say you want to poll people about a presidential candidate.

Pollsters run a random survey, calling people's houses. The houses surveyed cover different socio-economic backgrounds and places, making the study as random as possible. In the survey, $$67\%$$ of people confirm their vote for candidate $$A$$.

However, there is a problem: you have uncertainty. The people who gave the answers are only part of the total population, so the real percentage might differ.

Let us say the people who ran the survey report a confidence level of $$90\%$$ with a margin of error of $$\pm 6.7\%$$. The real value could then lie anywhere between $$60.3\%$$ and $$73.7\%$$.

In these cases, the confidence interval around the reported proportion of $$67\%$$ is important because it tells you something. The confidence interval tells a story in which this candidate, even in the worst case, can win with more than $$50\%$$ of the votes. But what if the interval were wider? If the margin of error were large enough for the lower limit to drop below $$50\%$$, the candidate might lose even though $$67\%$$ of the people surveyed said they would vote for him.

This is why the confidence interval for proportions is very important. Given a sample proportion, it tells us how well that proportion is likely to represent the whole population.

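The margin of error $$p$$ of a confidence interval for a proportion is given below; the interval itself is $$\hat{p}\pm p$$:

$p=Z\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

$$\hat{p}$$: The sample proportion or percentage.

$$Z$$: The critical value for the confidence level, as in the table you used before.

$$n$$: The sample size.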
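As a rough sketch, the function below computes a two-sided confidence interval for a proportion. The survey example does not state its sample size, so $$n=133$$ is an assumption, picked only because it gives a margin of error close to $$6.7\%$$ at a $$90\%$$ confidence level; the function name is also made up.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence_level):
    """Return (lower, upper) for p_hat ± Z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 67% support in the survey; n = 133 is an assumed sample size.
low, high = proportion_interval(0.67, 133, 0.90)
print(f"{low:.3f} to {high:.3f}")
```

Under these assumptions the limits are about $$0.603$$ and $$0.737$$, matching the range quoted for candidate $$A$$.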

### Confidence Interval for the Difference of Two Proportions

Just as you can build a confidence interval for the difference of two means from two samples of two populations, you can also build one for the difference of two proportions. In this case, you have two proportions obtained from samples.

The two samples ask the same question in populations $$A$$ and $$B$$, but their results, $$\hat{p_1}$$ and $$\hat{p_2}$$, are different. In this case, the confidence interval for the difference of two proportions is given by the following equation:

$(\hat{p_1}-\hat{p_2})\pm Z \sqrt{\dfrac{\hat{p_1}(1-\hat{p_1})}{n_1}+\dfrac{\hat{p_2}(1-\hat{p_2})}{n_2} }$
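The equation translates almost line by line into code. In this sketch the two survey results ($$60\%$$ of $$400$$ people in $$A$$ and $$50\%$$ of $$500$$ people in $$B$$) and the critical value $$1.96$$ are hypothetical values chosen for illustration.

```python
from math import sqrt

def diff_proportions_interval(p1, n1, p2, n2, z):
    """Confidence interval for p1 - p2 from two independent samples."""
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z * standard_error
    diff = p1 - p2
    return diff - margin, diff + margin

# Hypothetical surveys in populations A and B.
low, high = diff_proportions_interval(0.60, 400, 0.50, 500, 1.96)
print(f"{low:.3f} to {high:.3f}")
```

Both limits are positive for these numbers, which would suggest that the proportion in $$A$$ is larger than the one in $$B$$.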

## Confidence Interval for the Slope of a Regression Model

If you suspect that there might be a linear relationship between two variables, then you can construct a confidence interval for the slope of a regression model. Remember that you can use linear regression or the least-squares regression technique to create the line that best fits the data.

Suppose you have collected data over the last 20 years about average voter age. If you think that the average voter age has decreased over the last 20 years, you could make a confidence interval for the slope of your linear regression model to see if there has been a linear relationship between time and average voter age.
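A minimal sketch of such an interval is shown below. It uses the common form slope $$\pm\ t \cdot SE$$, where the standard error is the residual standard deviation (with $$n-2$$ degrees of freedom) divided by $$\sqrt{\sum (x_i-\bar{x})^2}$$. The voter-age figures are invented, and the $$t$$ critical value ($$3.182$$, the usual table value for $$95\%$$ confidence with $$3$$ degrees of freedom) is supplied by hand.

```python
from math import sqrt

def slope_interval(xs, ys, t_crit):
    """Confidence interval for the slope of a least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual standard deviation with n - 2 degrees of freedom.
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(residual_ss / (n - 2))
    margin = t_crit * s / sqrt(sxx)
    return slope - margin, slope + margin

# Made-up data: years since the start of the study and average voter age.
years = [0, 5, 10, 15, 20]
ages = [47.0, 46.1, 45.3, 44.0, 43.2]
low, high = slope_interval(years, ages, t_crit=3.182)
print(f"{low:.3f} to {high:.3f} years of age per year")
```

For this invented data set both limits are negative, which would be consistent with a decreasing average voter age.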

To learn how to draw conclusions from this type of confidence interval, read our article on Justifying Claims Based on the Confidence Interval for the Slope of a Regression Model.

## Confidence Interval Interpretation

Again, a confidence interval is an interval with the likely values of a population parameter based on one or several random samples, with a $$c\%$$ confidence level.

What the confidence level says is that the method used to create a particular confidence interval is successful in capturing the value of the actual population parameter approximately $$c\%$$ of the time.

Beware: a confidence level of $$X\%$$ does not mean the probability of the parameter being between the limits of the confidence interval is $$X\%$$.

Again, a confidence level of $$c\%$$ concerns the method used to produce the confidence interval.

So, the interpretation you should make of a confidence interval is that you can be $$c\%$$ confident that the actual value of the parameter is included in the calculated interval.
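The idea that the confidence level describes the method, not a single interval, can be checked with a simulation. The sketch below assumes a normally distributed population with a known standard deviation; the population values, sample size and number of trials are arbitrary choices for illustration.

```python
import random
from math import sqrt

def coverage_rate(true_mean, sd, n, z, trials, seed=0):
    """Share of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    margin = z * sd / sqrt(n)  # sd is treated as known in this sketch
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, sd) for _ in range(n)) / n
        if sample_mean - margin <= true_mean <= sample_mean + margin:
            hits += 1
    return hits / trials

rate = coverage_rate(true_mean=175.0, sd=25.0, n=30, z=1.96, trials=2000)
print(rate)
```

The printed share should be close to $$0.95$$: about $$95\%$$ of the intervals capture the true mean, while any single interval either contains it or does not.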

The most common confidence levels are of $$90\%$$, $$95\%$$ and $$99\%$$.

Suppose that a $$95\%$$ confidence interval states that the population mean is greater than $$150$$ and less than $$200$$. How would you interpret this statement?

Incorrect: “This means there is a $$95\%$$ chance that the population mean falls between $$150$$ and $$200$$.”

Correct: “This means you can be $$95\%$$ confident that the true value of the population mean is between $$150$$ and $$200$$.”

## Considerations on the margin of error, confidence level and sample size

You might think that by aiming for a narrow interval to estimate your parameter, you get closer to knowing its true value since it's more precise. It is more convenient and precise for you to know that you are meeting a friend in neighborhood $$X$$, instead of in city $$Y$$.

But in confidence intervals you should think the other way around: the smaller the width of the interval, the less sure you are that the true value of the parameter is in that interval. Although precision decreases, it is much safer to assume that the parameter is in city $$Y$$, rather than in neighborhood $$X$$, because the same city may contain other neighborhoods in which the parameter may be present.

This means the higher the confidence level is, the wider the confidence interval.

A confidence interval with $$99\%$$ confidence level is wider than one with a $$95\%$$ confidence level, which is wider than one with a $$90\%$$ confidence level, regarding the same situation.

Another thing you might notice in the formulas presented is that the sample size also affects the margin of error. In all the situations presented, the sample size $$n$$ appears in the denominator of the standard error. Thus, the larger the sample size, the narrower the confidence interval (because the standard error is smaller).
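Both effects can be seen side by side in a small sketch. It assumes the interval for a mean with a standard deviation of $$1$$ and compares three confidence levels and two sample sizes; the function name and the chosen values are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def interval_width(sd, n, confidence_level):
    """Full width of a two-sided interval for the mean: 2 * z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * sd / sqrt(n)

for level in (0.90, 0.95, 0.99):
    for n in (100, 400):
        width = interval_width(1.0, n, level)
        print(f"level={level:.2f} n={n}: width={width:.3f}")
```

For a fixed sample size the width grows with the confidence level, and quadrupling the sample size halves the width.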

## Example of Confidence Intervals

Let's end this article with two examples where we calculate the confidence interval of a mean and the confidence interval of a proportion.

Let us assume you have the height data for students in several colleges. The sample mean of the heights is $$\overline{x}=1.5m$$ and the standard deviation is equal to $$1$$. We want to know the confidence interval for the mean if the sample has a size of $$3000$$ individuals.

Using the formula for the $$CI$$ you have:

$CI=1.5 \pm Z \frac{1}{\sqrt{3000}}$

Let us say you want a confidence level of $$95\%$$, for which $$Z=1.96$$:

$CI=1.5m \pm 1.96 \frac{1}{\sqrt{3000}}$

$CI=1.5m \pm 0.036$

Let us say you want to calculate the confidence interval of a proportion. This proportion is $$62\%$$. We again want a confidence level of $$95\%$$, so $$Z=1.96$$. The sample in this case was a poll of $$6734$$ people. The margin of error is:

$p=Z\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

If you substitute the values:

$p=1.96\sqrt{\dfrac{0.62(1-0.62)}{6734}}=0.0116$

So the confidence interval is $$62\% \pm 1.16\%$$.

## Confidence Intervals – Key takeaways

• A confidence interval is a range of likely values to estimate a population parameter.
• The margin of error represents a certain number of standard deviations of your statistic you add and subtract to have a certain confidence in your results.
• The confidence level is the likelihood, given in percentage, your result is close to the actual value of the population parameter you’re interested in if you repeated the sample collection over and over.
• The most frequent confidence levels are of $$90\%$$, $$95\%$$ and $$99\%$$.
• The general form of a confidence interval is sample statistic ± margin of error, where margin of error = critical value × standard error.
• Specific sample statistics have specific confidence intervals, but they all follow the same form.
• The interpretation you should make of a confidence interval is that you can be $$c\%$$ confident that the actual value of the parameter is included in the calculated interval.

A confidence interval is of the form statistic ± margin of error. In possession of these two pieces of information, or deducing this information from a specific situation, you can find the confidence interval.

Step 1: Find the sample statistic;

Step 2: Select a confidence level;

Step 3: Find the margin of error;

Step 4: Find the confidence interval.
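The four steps can be followed on raw data as in the sketch below. The ten heights are made up, and the critical value comes from the normal distribution to match the formula for the mean used in this article; for a sample this small a $$t$$ critical value would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Made-up sample of student heights in metres.
heights = [1.62, 1.75, 1.68, 1.80, 1.71, 1.66, 1.78, 1.73, 1.69, 1.74]

# Step 1: find the sample statistic.
x_bar = mean(heights)

# Step 2: select a confidence level.
confidence_level = 0.95

# Step 3: find the margin of error (critical value times standard error).
z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
margin = z * stdev(heights) / sqrt(len(heights))

# Step 4: find the confidence interval.
interval = (x_bar - margin, x_bar + margin)
print(f"{interval[0]:.3f} m to {interval[1]:.3f} m")
```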

If the confidence interval is given and you need the standard deviation, identify the standard error from the margin of error and work back from it (for a mean, multiply the standard error by the square root of the sample size).

## Confidence Intervals Quiz - Test your knowledge

Question

The confidence interval of a population proportion can be said to be ___.

the level of certainty that the real or actual population proportion falls within an estimated range of values.

Question

True or False?

The confidence interval for a population proportion gives you an estimated boundary or range for which the exact value is expected to be found, with a specified level of assurance.

True.

Question

True or False?

There are only 3 confidence levels.

False.

Question

True or False?

Statisticians mostly use the $$90\%$$ confidence level.

False.

Question

When choosing a confidence level, you should aim to be ___.

more precise and more certain.

Question

True or False?

When determining the confidence interval, you must ensure that the sample data is truly representative of the overall population.

True.

Question

True or False?

The margin of error depends on your confidence level.

True.

Question

In a population of $$100,000$$ people, $$20$$ of them were used in a study, and it was observed that $$40\%$$ are successes in the study.

Is that sample size large enough to determine the margin of error?

No.

Question

In a population of $$100,000$$ people, $$200$$ of them were used in a study, and it was observed that $$40\%$$ are successes in the study.

Is that sample size large enough to determine the margin of error?

Yes.

Question

In finding the confidence interval of a population proportion, Nonny uses:

$\hat{p} \pm (\text{critical value}) \sqrt{ \frac{ \hat{p} (1 - \hat{p}) }{n} }$

Is he correct?

Yes.

Question

True or False?

The confidence interval for a population proportion is the sample proportion plus or minus the margin of error.

True.

Question

When communicating your result, you must consider these $$2$$ aspects:

The confidence interval and the confidence level.

Question

True or False?

The confidence level is a measure of the success rate of the method of constructing the interval, not a comment on the population.

True.

Question

When determining the confidence interval, you must ensure that the sample data is __ of the overall population.

Truly representative.

Question

True or False?

A $$95\%$$ confidence level would give the same result as a $$90\%$$ confidence level.

False.

Question

True or false: Confidence intervals for the difference between two proportions can be described as the level of certainty that the actual difference between two population proportions falls within an estimated range of values.

True.

Question

How are the margin of error and the width of a confidence interval for the difference of two proportions related?

The margin of error is half the width of the confidence interval.

Question

If you do a survey and find out that $$95\%$$ of class A passed an exam, but only $$90\%$$ of class B passed it, what kind of confidence interval would you construct?

a confidence interval for the difference of two proportions.

Question

If you do a survey and find that the average score on an exam in class A is $$95\%$$ and the average score on the same exam in class B is $$90\%$$, which kind of confidence interval would you construct?

A confidence interval for the difference in two means.

Question

Is random sampling required to construct a confidence interval for the difference of two proportions?

Yes.

Question

What is the main difference between a confidence interval for the difference of two proportions and the confidence interval for the difference of two treatment proportions?

In fact, there is no difference in the confidence intervals, just in how you state your conclusions.  The conclusion for a treatment proportion always refers to the treatment involved.

Question

True or false:  The method for constructing a confidence interval for two proportions is different from the method for constructing a confidence interval for two treatment proportions.

False.

Question

Suppose you construct a confidence interval for a difference of two proportions.  How would you find the margin of error?

The margin of error is always half of the width of the confidence interval.

Question

What is the formula for calculating a confidence interval for the difference of two proportions?

The confidence interval for the difference of two proportions has the formula:
$$\hat{p}_1 - \hat{p}_2 \pm (z \text{-critical value })\sqrt{\dfrac{ \hat{p}_1 (1-\hat{p}_1 )}{n_1} + \dfrac{ \hat{p}_2(1-\hat{p}_2 )}{n_2} }$$.

Question

If $$n_1$$ is the sample size from population $$1$$, $$n_2$$ is the sample size from population $$2$$, $$\hat{p}_1$$ is the sample proportion from population $$1$$, and $$\hat{p}_2$$ is the sample proportion from population $$2$$, then what are the conditions to construct a confidence interval for the difference of two proportions?

• individuals are randomly assigned to treatments;
• $$n_1 \hat{p}_1 \ge 10$$, $$n_1(1- \hat{p}_1 ) \ge 10$$; and
• $$n_2 \hat{p}_2 \ge 10$$, $$n_2(1- \hat{p}_2 ) \ge 10$$.

Question

When you construct a confidence interval for the difference of two proportions, you use a ___-critical score.

$$z$$.

Question

If you have made a confidence interval for the difference of two proportions $$p_1-p_2$$, and one endpoint is negative and the other is positive, which of the following would be a correct conclusion?

$$p_1$$ and $$p_2$$ could be equal.

Question

The difference between the two proportions is estimated to be $$0.21$$, with a margin of error of $$0.02$$. What is the confidence interval of the difference between the proportions?

$$(0.19, 0.23)$$.

Question

If you have made a confidence interval for the difference of two proportions $$p_1-p_2$$, and both endpoints are negative, which of the following would be a correct conclusion?

$$p_1<p_2$$.

Question

If you have made a confidence interval for the difference of two proportions $$p_1-p_2$$, and both endpoints are positive, which of the following would be a correct conclusion?

$$p_1 > p_2$$.

Question

What is the confidence level?

probability that the interval contains the true parameter value.

Question

Give the formula to calculate the confidence interval for the difference of the means of two samples.

$$CI_p=(\overline{x_1} - \overline{x_2})\pm t\sqrt{\frac{sp^2}{n_1}+\frac{sp^2}{n_2}}$$.

Question

Suppose you were looking at the percentage of people in your area who drink coffee.  Which type of confidence interval would you use?

confidence interval for population proportion .

Question

Suppose you want to compare the average weight of cats in one city with cats in another city.  What type of confidence interval would you make?

confidence interval for the difference of two means .

Question

Give the formula to calculate the confidence interval for a proportion:

$$\hat{p}\pm Z\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$$.

Question

The confidence interval for a proportion uses a ___.

Bernoulli-type experiment.

Question

If you increase the sample size, how does the confidence interval change?

The confidence interval gets smaller (narrower).

Question

Two samples are taken from a large population and their confidence level is high. What do you expect?

You expect the mean of each sample to be inside the confidence interval of the other sample.

Question

If the confidence level is $$95\%$$, what does this mean?

The method used to create a particular confidence interval is successful in capturing the value of the actual population parameter approximately $$95\%$$ of the time.

Question

What kind of sampling should be done when you are interested in making a confidence interval?

Random sampling.

Question

What is the margin of error?

The margin of error represents a certain number of standard deviations of your statistic you add and subtract to have a certain confidence in your results.

Question

What is a confidence interval?

A range of likely values to estimate a population parameter.

Question

What is the formula for confidence interval of a mean?

$$CI=\overline{x}\pm z \frac{\sigma_s}{\sqrt{n}}$$.

Question

The formula for a confidence interval of the slope of a linear regression model is _____.

$$\hat{\beta}_1\pm t\cdot SE_{\beta_1}$$.

Question

The expression $$t\cdot SE_{\beta_1}$$ is known as ____.

The margin of error.

Question

The $$SE_{\beta_1}$$ is known as ____.

The standard error.

Question

The $$t$$ in $$t\cdot SE_{\beta_1}$$ is known as ____.

The critical value.

Question

What is the formula to calculate the standard error of the slope $$SE_{\beta_1}$$?

$$SE_{\beta_1}=\dfrac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}$$.

Question

What is the general structure of a confidence interval?

sample statistic – margin of error $$\le \beta_1\le$$ sample statistic + margin of error.

Question

The margin of error is the product of two components, which are ____ and ___.

the standard error; the critical value.
