# Type I Error

How many ways can you be wrong? If you think there is only one way to be wrong, you're wrong. You can either be wrong about being right or wrong about being wrong. In hypothesis testing, when a statistician chooses between rejecting or not rejecting the null hypothesis, there is a possibility the statistician could have reached the wrong conclusion. When this happens, a Type I or a Type II error occurs. It is important to distinguish between the two in hypothesis testing, and the aim of statisticians is to minimise the probability of these errors.

In a legal trial, it is commonplace to assume someone is innocent unless there is enough evidence to suggest that they are guilty. Here, innocence plays the role of the null hypothesis. Suppose that after the trial the judge finds the defendant guilty, but it turns out that the defendant was in fact innocent. This is an example of a Type I error.

## Definition of a Type I Error

Suppose you have carried out a hypothesis test that leads to the rejection of the null hypothesis $$H_0$$. If it turns out that the null hypothesis is in fact true, then you have committed a Type I error. Now suppose you have carried out a hypothesis test and did not reject the null hypothesis, but in fact $$H_0$$ is false. Then you have committed a Type II error. A good way to remember this is with the following table:

| | $$H_0$$ true | $$H_0$$ false |
|---|---|---|
| Reject $$H_0$$ | Type I error | No error |
| Do not reject $$H_0$$ | No error | Type II error |

A Type I error is when you have rejected $$H_0$$ when $$H_0$$ is true.

However there is another way to think about Type I errors.

## A Type I Error is a False Positive

Type I errors are also known as false positives. This is because rejecting $$H_0$$ when $$H_0$$ is true implies that the statistician has falsely concluded that there is statistical significance in the test when there was not. A real world example of a false positive is when a fire alarm goes off when there is no fire or when you have been falsely diagnosed with a disease or illness. As you can imagine, false positives can lead to significant misinformation especially in the case of medical research. For example, when testing for COVID-19, the chance of testing positive when you don't have COVID-19 was estimated at being around $$2.3\%$$. These false positives can lead to overestimation of the impact of the virus leading to a waste of resources.

Knowing that Type I errors are false positives is a good way to remember the difference between Type I errors and Type II errors, which are referred to as false negatives.

## Type I Errors and Alpha

A Type I error occurs when the null hypothesis is rejected when it is in fact true. The probability of a Type I error is commonly denoted by $$\alpha$$ and this is known as the size of the test.

The size of a test, $$\alpha$$, is the probability of rejecting the null hypothesis, $$H_0$$, when the $$H_0$$ is true and this is equal to the probability of a Type I error.

The size of a test is the significance level of the test, and it is chosen before the test is carried out. The probability of a Type I error, $$\alpha$$, is tied to the confidence level the statistician sets when performing the hypothesis test: a confidence level of $$1-\alpha$$ corresponds to a significance level of $$\alpha$$.

For example, if a statistician sets a confidence level of $$99\%$$, then the significance level is $$\alpha=0.01$$, so when $$H_0$$ is true there is a $$1\%$$ chance of a Type I error. Other common choices for $$\alpha$$ are $$0.05$$ and $$0.1$$. Therefore, you can decrease the probability of a Type I error by decreasing the significance level of the test.
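To see what a Type I error probability of $$\alpha$$ means in practice, here is a minimal simulation sketch. It assumes a lower-tailed z-test on a normal population with known standard deviation. The function name `simulate_type_i_rate` and the values of `mu0`, `sigma`, `n` and `trials` are illustrative assumptions. Every sample is drawn with $$H_0$$ true, so each rejection is a Type I error and the rejection rate should come out close to $$\alpha$$.

```python
import random
from statistics import NormalDist, fmean


def simulate_type_i_rate(alpha, mu0=0.0, sigma=1.0, n=25, trials=20000, seed=0):
    """Estimate how often a lower-tailed z-test rejects H0 when H0 is true.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)  # lower-tail critical value for Z
    standard_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        # Draw a sample from the distribution specified by H0.
        sample_mean = fmean(rng.gauss(mu0, sigma) for _ in range(n))
        z = (sample_mean - mu0) / standard_error
        if z <= z_crit:
            rejections += 1  # H0 is true here, so this is a Type I error
    return rejections / trials


for alpha in (0.01, 0.05, 0.1):
    rate = simulate_type_i_rate(alpha)
    print(f"alpha = {alpha:.2f}  estimated Type I error rate = {rate:.4f}")
```

The estimated rates vary a little with the seed and the number of trials, but they should sit near the chosen $$\alpha$$ in each case.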

## The Probability of a Type I Error

You can calculate the probability of a Type I error occurring by looking at the critical region or the significance level. The critical region of a test is chosen so that the probability of a Type I error is less than or equal to the significance level $$\alpha$$.

There is an important distinction between continuous and discrete random variables to be made when looking at the probability of a Type I error occurring. For a discrete random variable, the critical region usually cannot be chosen so that its probability is exactly $$\alpha$$, so the probability of a Type I error is the actual significance level, which may be smaller than $$\alpha$$. For a continuous random variable, the probability of a Type I error is equal to the significance level of the test.

To find the probability of a Type 1 error:

\begin{align} \mathbb{P}(\text{Type I error})&=\mathbb{P}(\text{rejecting } H_0 \text{ when }H_0 \text{ is true}) \\ &=\mathbb{P}(\text{being in the critical region}) \end{align}

For discrete random variables:

$\mathbb{P}(\text{Type I error})\leq \alpha.$

For continuous random variables:

$\mathbb{P}(\text{Type I error})= \alpha.$

## Discrete Examples of Type I Errors

So how do you find the probability of a Type I error if you have a discrete random variable?

The random variable $$X$$ is binomially distributed. Suppose a sample of 10 is taken and a statistician wants to test the null hypothesis $$H_0: \; p=0.45$$ against the alternative hypothesis $$H_1:\; p\neq0.45$$ at the $$5\%$$ significance level.

a) Find the critical region for this test.

b) State the probability of a Type I error for this test.

Solution:

a) Since this is a two tailed test, at a $$5\%$$ significance level, the critical values, $$c_1$$ and $$c_2$$ are such that

\begin{align} \mathbb{P}(X\leq c_1) &\leq0.025 \\ \text{ and } \mathbb{P}(X\geq c_2) &\leq 0.025. \end{align}

$$\mathbb{P}(X\geq c_2) = 1-\mathbb{P}(X\leq c_2-1)\leq0.025$$ or $$\mathbb{P}(X\leq c_2-1) \geq0.975$$

Assume $$H_0$$ is true. Then under the null hypothesis $$X\sim B(10,0.45)$$, and from the statistical tables:

\begin{align} &\mathbb{P}(X \leq 1)=0.0233<0.025 \\ & \mathbb{P}(X \leq 2)=0.0996>0.025.\end{align}

Therefore the critical value is $$c_1=1$$. For the second critical value,

\begin{align} &\mathbb{P}(X \leq 7)=0.9726<0.975 \\ & \mathbb{P}(X \leq 8)=0.9955>0.975. \end{align}

Therefore $$c_2-1=8$$ so the critical value is $$c_2=9$$.

So the critical region for this test under a $$5\%$$ significance level is

$\left\{ X\leq 1\right\}\cup \left\{ X\geq 9\right\}.$

b) A Type I error occurs when you reject $$H_0$$ but $$H_0$$ is true, so its probability is the probability that you are in the critical region given that the null hypothesis is true.

Under the null hypothesis, $$p=0.45$$, therefore,

\begin{align} \mathbb{P}(\text{Type I error})&=\mathbb{P}(X\leq1 \mid p=0.45)+\mathbb{P}(X\geq9 \mid p=0.45) \\ &=0.0233+1-0.9955 \\ &=0.0278. \end{align}
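If you do not have statistical tables to hand, the same calculation can be sketched in a few lines of Python. This is only an illustration: the helper names `binom_cdf`, `two_tailed_critical_region` and `type_i_error` are made up for this example, and the code works from exact binomial probabilities rather than rounded table values, so the last digit may differ slightly from a table-based answer.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p); the empty sum gives 0 when k < 0."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def two_tailed_critical_region(n, p0, alpha):
    """Return (c1, c2) for the critical region {X <= c1} or {X >= c2}.

    Each tail is kept at or below alpha / 2. A value of None means that
    tail is empty.
    """
    tail = alpha / 2
    # Largest c1 with P(X <= c1) <= alpha / 2.
    c1 = None
    for k in range(n + 1):
        if binom_cdf(k, n, p0) <= tail:
            c1 = k
        else:
            break
    # Smallest c2 with P(X >= c2) = 1 - P(X <= c2 - 1) <= alpha / 2.
    c2 = None
    for k in range(n + 1):
        if 1 - binom_cdf(k - 1, n, p0) <= tail:
            c2 = k
            break
    return c1, c2


def type_i_error(n, p0, c1, c2):
    """Probability of landing in the critical region when p = p0."""
    lower = binom_cdf(c1, n, p0) if c1 is not None else 0.0
    upper = 1 - binom_cdf(c2 - 1, n, p0) if c2 is not None else 0.0
    return lower + upper


c1, c2 = two_tailed_critical_region(n=10, p0=0.45, alpha=0.05)
actual_size = type_i_error(10, 0.45, c1, c2)
print(f"critical region: X <= {c1} or X >= {c2}")
print(f"P(Type I error) = {actual_size:.4f}")
```

The actual significance level comes out below the $$5\%$$ target, which is what you expect for a discrete test statistic.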

Let's take a look at another example.

A coin is tossed until a tail is obtained.

a) Using a suitable distribution, find the critical region for a hypothesis test that tests whether the coin is biased towards heads at the $$5\%$$ significance level.

b) State the probability of a Type I error for this test.

Solution:

a) Let $$X$$ be the number of coin tosses up to and including the first tail.

This can be modelled using the geometric distribution: $$X=k$$ means there are $$k-1$$ failures (heads) before the first success (tail), where the probability of a tail on each toss is $$p$$.

Therefore, $$X\sim \rm{Geo}(p)$$ where $$p$$ is the probability of a tail being obtained. The null and alternative hypotheses are

\begin{align} &H_0: \; p=\frac{1}{2} \\ \text{and } &H_1: \; p<\frac{1}{2}. \end{align}

Here the alternative hypothesis is the one that you want to establish, i.e. that the coin is biased towards heads, and the null hypothesis is the negation of that, i.e. the coin is not biased.

Under the null hypothesis $$X\sim \rm{Geo} \left(\frac{1}{2}\right)$$.

Since you are dealing with a one-tailed test at the $$5\%$$ significance level, you want to find the critical value $$c$$ such that $$\mathbb{P}(X\geq c) \leq 0.05$$. This means you want

$\left(\frac{1}{2}\right)^{c-1} \leq 0.05.$

Therefore

$(c-1)\ln\left(\frac{1}{2}\right) \leq \ln(0.05),$

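which means $$c \geq 5.3219$$, since dividing by the negative number $$\ln\left(\frac{1}{2}\right)$$ reverses the inequality.

Since $$c$$ must be a whole number, the smallest possible critical value is $$c=6$$. Therefore, the critical region for this test is $$X \geq 6$$.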

Here you have used the fact that, for a geometric distribution $$X\sim \rm{Geo}(p)$$,

$\mathbb{P}(X \geq x)=(1-p)^{x-1}.$

b) Since $$X$$ is a discrete random variable, $$\mathbb{P}(\text{Type I error})\leq \alpha$$, and the probability of a Type I error is the actual significance level. So

\begin{align} \mathbb{P}(\text{Type I error})&= \mathbb{P}( \text{rejecting } H_0 \text{ when } H_0 \text{ is true}) \\ &=\mathbb{P}(X\geq 6 \mid p=0.5) \\ &= \left(\frac{1}{2}\right)^{6-1} \\ &=0.03125. \end{align}
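The search for the critical value can also be sketched in code. This is a small illustration, assuming $$X$$ counts the tosses up to and including the first tail; the function names `geometric_upper_tail` and `upper_critical_value` are hypothetical helpers, not standard library functions.

```python
def geometric_upper_tail(x, p):
    """P(X >= x) for X ~ Geo(p), where X counts trials up to and
    including the first success."""
    return (1 - p) ** (x - 1)


def upper_critical_value(p0, alpha):
    """Smallest whole number c with P(X >= c) <= alpha when p = p0."""
    c = 1
    while geometric_upper_tail(c, p0) > alpha:
        c += 1
    return c


c = upper_critical_value(p0=0.5, alpha=0.05)
actual_size = geometric_upper_tail(c, 0.5)
print(c, actual_size)  # prints: 6 0.03125
```

Stepping through whole numbers one at a time avoids the logarithm entirely and returns the integer critical value directly.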

## Continuous Examples of a Type I Error

In the continuous case, when finding the probability of a Type I error, you will simply need to give the significance level of the test given in the question.

The random variable $$X$$ is normally distributed such that $$X\sim N(\mu ,4)$$. Suppose a random sample of $$16$$ observations is taken and $$\bar{X}$$ is the test statistic. A statistician wants to test $$H_0:\mu=30$$ against $$H_1:\mu<30$$ using a $$5\%$$ significance level.

a) Find the critical region.

b) State the probability of a Type I error.

Solution:

a) Under the null hypothesis you have $$\bar{X}\sim N(30,\frac{4}{16})$$.

Define

$Z=\frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}}\sim N(0,1),$

where $$\sigma=\sqrt{4}=2$$ is the standard deviation of $$X$$ and $$n=16$$ is the sample size.

At the $$5\%$$ significance level for a one-sided test, from the statistical tables, the critical region for $$Z$$ is $$Z\leq-1.6449$$.

Therefore, you reject $$H_0$$ if

\begin{align} \frac{\bar{X}-\mu}{\frac{\sigma}{\sqrt{n}}}&=\frac{\bar{X}-30}{\frac{2}{\sqrt{16}}} \\ &\leq -1.6449.\end{align}

Therefore, with some rearranging, the critical region for $$\bar{X}$$ is given by $$\bar{X} \leq 29.1776$$.

b) Since $$X$$ is a continuous random variable, there is no difference between the target significance level and the actual significance level. Therefore, $$\mathbb{P}(\text{Type I error})= \alpha$$ i.e. the probability of a Type I error $$\alpha$$ is the same as the significance level of the test, so

$\mathbb{P}(\text{Type I error})=0.05.$
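As a quick numerical check, the critical region and the Type I error probability for this example can be reproduced with Python's standard library. The variable names below are illustrative; `NormalDist` comes from the `statistics` module.

```python
from statistics import NormalDist

mu0, variance, n, alpha = 30.0, 4.0, 16, 0.05
standard_error = (variance / n) ** 0.5  # sigma / sqrt(n) = 2 / 4 = 0.5

# Lower-tail critical value for Z, then converted to the scale of the sample mean.
z_crit = NormalDist().inv_cdf(alpha)
xbar_crit = mu0 + z_crit * standard_error

# Probability of landing in the critical region when H0 (mu = 30) is true.
type_i = NormalDist(mu0, standard_error).cdf(xbar_crit)

print(f"z critical value    = {z_crit:.4f}")
print(f"critical region     : sample mean <= {xbar_crit:.4f}")
print(f"P(Type I error)     = {type_i:.4f}")
```

Because the test statistic is continuous, the probability of being in the critical region matches the target significance level rather than falling below it.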

## Relationship between Type I and Type II Errors

The relationship between the probabilities of Type I and Type II errors is important in hypothesis testing as statisticians want to minimise both. Yet, for a fixed sample size, reducing the probability of one increases the probability of the other.

For example, if you reduce the probability of a Type II error (the probability of not rejecting the null hypothesis when it is false) by increasing the significance level of a test, this increases the probability of a Type I error. This trade-off is often dealt with by prioritising the minimisation of the probability of Type I errors.
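The trade-off can be illustrated numerically. The sketch below assumes a lower-tailed z-test of $$H_0:\mu=30$$ with $$\sigma=2$$ and $$n=16$$, and it assumes that the true mean is $$29$$. That alternative value is a hypothetical choice made only so that a Type II error probability can be computed, and `type_ii_error` is an illustrative helper name.

```python
from statistics import NormalDist


def type_ii_error(alpha, mu0=30.0, mu_alt=29.0, sigma=2.0, n=16):
    """P(do not reject H0 | true mean is mu_alt) for a lower-tailed z-test.

    mu_alt is an assumed alternative used purely for illustration.
    """
    standard_error = sigma / n ** 0.5
    # H0 is rejected when the sample mean is at or below this value.
    xbar_crit = mu0 + NormalDist().inv_cdf(alpha) * standard_error
    # Type II error: the sample mean lands above the critical value
    # even though the true mean is mu_alt.
    return 1 - NormalDist(mu_alt, standard_error).cdf(xbar_crit)


for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}  P(Type II error) = {type_ii_error(alpha):.4f}")
```

With the sample size held fixed, a larger significance level gives a smaller Type II error probability, and a smaller significance level gives a larger one.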

For more information on Type II errors check out our article on Type II Errors.

## Type I Errors - Key takeaways

• A Type I error occurs when you have rejected $$H_0$$ when $$H_0$$ is true.
• Type I errors are also known as false positives.
• The size of a test, $$\alpha$$, is the probability of rejecting the null hypothesis, $$H_0$$, when the $$H_0$$ is true and this is equal to the probability of a Type I error.
• You can decrease the probability of a Type I error by decreasing the significance level of the test.
• There is a trade-off between Type I and Type II errors: for a fixed sample size, you cannot decrease the probability of a Type I error without increasing the probability of a Type II error, and vice versa.

For continuous random variables, the probability of a Type I error is the significance level of the test.

For discrete random variables, the probability of a Type I error is the actual significance level, which is found by calculating the critical region and then finding the probability that you are in the critical region.

A Type I error is when you have rejected the null hypothesis when it is true.

An example of a Type I error is when someone has tested positive for Covid-19 but they don't actually have Covid-19.

In most cases, Type I errors are seen as worse than Type II errors. This is because incorrectly rejecting the null hypothesis usually leads to more significant consequences.

Type I and Type II errors are important because they mean that an incorrect conclusion has been reached in a hypothesis test. This can lead to issues such as false information or costly errors.

## Final Type I Error Quiz

Question

What is a Type I error?

A Type I error is when you have rejected $$H_0$$ when $$H_0$$ is true.

Question

What is the relationship between the probability of a Type I error and the significance level of a test?

When the test statistic in question has a continuous distribution then the probability of a Type I error is equal to the significance level of the test. If it has a discrete distribution then the probability of a Type I error is less than or equal to the significance level of a test, but equal to the actual significance level of the test.

Question

What is the significance level of a hypothesis test?

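The significance level of a test is the threshold, set before the test is carried out, that marks where one can reject the null hypothesis. It is the largest probability of a Type I error that the statistician is willing to accept.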

Question

What are the common values given for the significance level of a test?

$$0.1, 0.05, 0.01$$

Question

What is a Type I error also known as?

False positive.

Question

How do we denote the size of a test?

Alpha, $$\alpha$$

Question

How can you reduce the probability of a Type I error?

Reduce the significance level

Question

Suppose you take a Covid-19 PCR test and it comes back positive but you later find out you didn't actually have coronavirus. Is this an example of a Type I error?

Yes

Question

When would a statistician or researcher set the significance level?

Before conducting a hypothesis test

Question

True or false, when conducting a hypothesis test, it is possible to control and minimise both Type I and Type II errors simultaneously?

False. For a fixed sample size, you cannot decrease the probability of a Type I error without increasing the probability of a Type II error, and vice versa.

Question

Suppose a statistician tested a sample of a population and $$H_0$$ was rejected, but in the whole population $$H_0$$ was in fact true, what type of error is this?

Type I error

Question

Why can't statisticians always minimise the probability of a Type I error by setting a low significance level?

Due to the trade-off effect, decreasing the probability of a Type I error increases the probability of a Type II error.

Question

For a discrete random variable, what is the probability of a Type I error?

$\mathbb{P}(\text{Type I error})\leq \alpha$

Question

For a continuous random variable, what is the probability of a Type I error?

$\mathbb{P}(\text{Type I error})= \alpha$
