Hypothesis Test of Two Population Proportions


Suppose you did a survey of employees at corporations in your country and found that, out of \(1300\) full-time employees and \(290\) part-time employees, \(40\%\) of the full-time employees and \(38\%\) of the part-time employees were putting aside at least twelve percent of their earnings as savings. Could you draw any conclusions about the differences in savings habits between full-time and part-time employees? Hypothesis testing to the rescue! This is an example of two population proportions, and here you will see how to do a hypothesis test and draw conclusions from this kind of sampling.

Let's start by listing what you know from the example at the start of this article.

| Population | Population Proportion | Sample Size | Sample Proportion |
| --- | --- | --- | --- |
| Full-time employees of corporations in your country | \(p_1 = \) proportion of full-time employees who put aside at least \(12\%\) of their earnings | \(n_1 = 1300\) | \(\hat{p}_1 = 0.40\) |
| Part-time employees of corporations in your country | \(p_2 = \) proportion of part-time employees who put aside at least \(12\%\) of their earnings | \(n_2 = 290\) | \(\hat{p}_2 = 0.38\) |

It is clear looking at the table that the sample sizes are very different, and their sample proportions are different as well. However, it will be very rare for you to find an example where the sample proportions are the same. Why might the sample proportions be different, even if you might eventually be able to conclude that the proportion of people who put aside at least twelve percent of their earnings is the same between part-time and full-time employees?

Differences that occur between two samples just by chance are called **sampling variability**.

One of the main questions that a hypothesis test for two population proportions tries to answer is whether the difference in your sample proportions happens because of sampling variability or because of an actual difference in the populations.

One of the assumptions you will need is that your samples are **independent**.

Two samples are **independent** if picking members for one sample doesn't influence how members of the second sample are picked.

In the example involving employees, picking a person who is a full-time employee doesn't influence who you pick as a part-time employee, so they are independent. That is very different from dependent samples.

Two samples are **dependent** if picking members for one sample automatically determines the members of the second sample.

If you were doing a study on twins then picking a twin for one sample would automatically put the other twin in the second sample. Twins are a common example of dependent samples. This is called matched-pair data, and it requires a different form of hypothesis testing than you will see here.

There are many ways that \(p_1\) can be different from \(p_2\). It might be that \(p_1 < p_2\), or that \(p_1>p_2\). Rather than trying to list all of the ways they could differ and doing a hypothesis test for each, you can look at the **difference** between the two population proportions. In fact, a hypothesis test for two population proportions is often called a hypothesis test for the difference between two population proportions for this very reason!

In this kind of hypothesis test, your null hypothesis will almost always be that the two population proportions are the same. If you state that in terms of their difference you get:

\[ H_0:\; p_1 - p_2 = 0.\]

Then there are three varieties of alternative hypotheses outlined in the next table.

| Question | Alternative Hypothesis | Test Type |
| --- | --- | --- |
| Is \(p_1\) different from \(p_2\)? | \(H_a:\; p_1 - p_2 \ne 0\) | Two-tailed test |
| Is \(p_1\) smaller than \(p_2\)? | \(H_a:\; p_1 - p_2 < 0\) | Left-tailed test |
| Is \(p_1\) larger than \(p_2\)? | \(H_a:\; p_1 - p_2 > 0\) | Right-tailed test |

Let's go back to the example from the start of this article.

Your goal here is to figure out if full-time employees and part-time employees have different saving habits, so the hypotheses would be:

\[ \begin{align} &H_0:\; p_1 -p_2 = 0 \\ & H_a: \; p_1-p_2 \ne 0, \end{align} \]

and it would be a two-tailed test.

Next, let's look at the test statistic for this type of hypothesis test.

It is important that your samples are independent, or the test statistic will be different from the one shown here. Since you are using independent samples, remember that

\[ \mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2.\]

For a reminder on why this is true, see the articles Transforming Random Variables and Combining Random Variables.

For the standard deviation,

\[ \sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{ \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2} }.\]

For the savings example, you have \(n_1 = 1300\), \(n_2 = 290\), \(\hat{p}_1 = 0.40\), and \(\hat{p}_2 = 0.38\). Estimating the population proportions with the sample proportions, the mean of the sampling distribution of \(\hat{p}_1 - \hat{p}_2 \) is:

\[\begin{align} \mu_{\hat{p}_1 - \hat{p}_2} &= p_1 - p_2 \\ &= 0.40 - 0.38 \\ &= 0.02 \end{align}\]

The standard deviation for \(\hat{p}_1 - \hat{p}_2 \) is:

\[ \begin{align} \sigma_{\hat{p}_1 - \hat{p}_2} &= \sqrt{ \frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2} } \\ &= \sqrt{ \frac{0.40(1-0.40)}{1300} + \frac{0.38(1-0.38)}{290} } \\ &= \sqrt{\frac{0.24}{1300} + \frac{0.2356}{290} } \\ &\approx 0.03157 \end{align} \]
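These two calculations are easy to check numerically. Here is a minimal Python sketch, using the sample proportions as estimates of \(p_1\) and \(p_2\) just as above:

```python
import math

# Savings example: sample sizes and sample proportions
n1, n2 = 1300, 290
p1_hat, p2_hat = 0.40, 0.38

# Mean of the sampling distribution of p̂1 - p̂2
mean_diff = p1_hat - p2_hat

# Standard deviation of the sampling distribution of p̂1 - p̂2
sd_diff = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

print(round(mean_diff, 2), round(sd_diff, 5))  # 0.02 0.03158
```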

So far you have only assumed that the samples are independent. For the next part, you will need to assume that the sample sizes are large enough. If they are, you can use the Central Limit Theorem to get that your sampling distribution \(\hat{p}_1 - \hat{p}_2 \) is approximately normal.

How do you know if your samples are large enough? If all four of the following conditions are satisfied, then your samples are large enough for the sampling distribution of \(\hat{p}_1 - \hat{p}_2 \) to be approximately normal:

\[n_1\hat{p}_1 \ge 10, \quad n_2\hat{p}_2 \ge 10, \quad n_1(1-\hat{p}_1) \ge 10, \quad \text{and} \quad n_2(1-\hat{p}_2) \ge 10.\]

It isn't too hard to check that the sample sizes in the savings example are large enough for the sampling distribution to be approximately normal.

The last condition for using this type of hypothesis test is that each sample is less than \(10\%\) of its overall population. In this case, the sample sizes are certainly less than \(10\%\) of all of the people in your country, so this condition is satisfied as well.
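All of these checks can be scripted. In this sketch the population size is a made-up placeholder (any figure in the millions works for the savings example):

```python
# Check the large-sample and 10% conditions for the savings example.
n1, n2 = 1300, 290
p1_hat, p2_hat = 0.40, 0.38
population_size = 50_000_000  # hypothetical: number of employees in your country

large_enough = all([
    n1 * p1_hat >= 10,
    n2 * p2_hat >= 10,
    n1 * (1 - p1_hat) >= 10,
    n2 * (1 - p2_hat) >= 10,
])
under_ten_percent = (n1 + n2) <= 0.10 * population_size

print(large_enough, under_ten_percent)  # True True
```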

When doing a hypothesis test for the difference in population proportions, a \(z\)-test is used. To do this, you will need to calculate the test statistic, which uses the difference in the two proportions. To make calculations a little easier, it is helpful to find:

\[ \begin{align}\hat{p}_c &= \frac{\text{number of successes in the two samples} }{\text{total of the two sample sizes}} \\ &= \frac{n_1\hat{p_1} + n_2\hat{p_2} }{n_1 + n_2} \end{align}\]

Combining counts to get an overall proportion is called **pooling**, and \(\hat{p}_c\) is called the pooled (or combined) proportion.

Returning to the savings example, \(n_1 = 1300\), \(n_2 = 290\), \(\hat{p}_1 = 0.40\), and \(\hat{p}_2 = 0.38\), which means that:

\[ \begin{align}\hat{p}_c &= \frac{n_1\hat{p}_1 + n_2\hat{p}_2 }{n_1 + n_2} \\ &= \frac{1300(0.40)+ 290(0.38) }{1300+ 290} \\ &= \frac{630.2}{1590} \\ & \approx 0.3964 \end{align}\]

As long as your null hypothesis is \(H_0:\; p_1 -p_2 = 0 \), the test statistic can be calculated using the formula:

\[ z = \frac{\hat{p_1} - \hat{p_2} }{\sqrt{ \dfrac{\hat{p}_c (1-\hat{p}_c) }{n_1} +\dfrac{\hat{p}_c (1-\hat{p}_c) }{n_2} } }\]

Calculating the test statistic for the savings example:

\[ \begin{align} z &= \frac{\hat{p_1} - \hat{p_2} }{\sqrt{ \dfrac{\hat{p}_c (1-\hat{p}_c) }{n_1} +\dfrac{\hat{p}_c (1-\hat{p}_c) }{n_2} } } \\ &= \frac{0.40 - 0.38 }{\sqrt{ \dfrac{0.3964 (1-0.3964 ) }{1300} +\dfrac{0.3964 (1-0.3964 ) }{290} } } \\ & \approx 0.63,\end{align} \]

rounded to \(2\) decimal places.
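Putting the pooled proportion and the test statistic together, here is a short Python check of the numbers above:

```python
import math

n1, n2 = 1300, 290
p1_hat, p2_hat = 0.40, 0.38

# Pooled (combined) proportion: total successes over total sample size
p_c = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)

# z test statistic for H0: p1 - p2 = 0
se = math.sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
z = (p1_hat - p2_hat) / se

print(round(p_c, 4), round(z, 2))  # 0.3964 0.63
```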

Let's finish up the hypothesis test for the savings example. No significance level was given, so you will need to consider the Type I and Type II error consequences. See Errors in Hypothesis Testing for more information and examples. In this example, a Type I error would be deciding that the savings proportions are not the same for the two groups when in fact they are the same.

A Type II error would be concluding there is no difference in the population proportions between the two groups when in fact there is one. Neither error is especially serious here (unlike in a medical trial, where the type of error matters much more), so choosing a significance level of \(\alpha = 0.05\) would be fine.

Remember that this is a two-tailed test! So the \(P\)-value is twice the area under the \(z\)-curve and to the right of the \(z\)-value. In other words:

\[ \begin{align} P\text{-value} &= 2(\text{area under curve to the right of }0.63) \\ &= 2\cdot P(z>0.63) \\ &\approx 2(0.2643) \\ &\approx 0.529 \end{align} \]

The \(P\)-value is greater than the significance level of \(\alpha = 0.05\), so you will fail to reject the null hypothesis.
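The \(P\)-value calculation can be reproduced with Python's standard library, where `statistics.NormalDist` provides the standard normal CDF:

```python
from statistics import NormalDist

z = 0.63  # test statistic from the savings example

# Two-tailed P-value: twice the upper-tail area beyond |z|
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
print(round(p_value, 3))  # 0.529
print(p_value > alpha)    # True -> fail to reject H0
```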

Remember that you never say things like "the null hypothesis is true". For a reminder on why, see the article Hypothesis Testing.

Communicating your conclusion can be the most challenging part of doing a hypothesis test. What does it mean to fail to reject the null hypothesis?

**Solution:**

The original goal was to find out if there is a difference in savings habits between full-time and part-time employees at corporations in your country. The null hypothesis is that there is no difference in the savings habits between the two groups. In failing to reject the null hypothesis, what you are saying is that there is no convincing evidence that there is a difference in savings habits between full-time and part-time employees.

Why was there a difference in the sample proportions then? It might have been due to sampling variability. All you can say from the sample proportions is that you are not convinced there is a difference between the two population proportions.

Let's look at another example of hypothesis testing for the difference in two population proportions.

Many bulldog owners report that their pet snores, and in fact, their bulldog snores more frequently as it gets older.

You have decided to do a test to see if this is actually true or maybe just a matter of perception. So you break down bulldogs into two groups, those under three years of age and those over three years of age, and choose a random sample of \(700\) bulldog owners to ask them about their dog's snoring. From the survey responses (not everyone responds to surveys), you create the following table:

| Population | Population Proportion | Sample Size | Sample Proportion |
| --- | --- | --- | --- |
| Bulldogs under the age of \(3\) | \(p_1 = \) proportion of bulldogs under the age of \(3\) who snore more than five times a week | \(n_1 = 300\) | \(\hat{p}_1 = 0.26\) |
| Bulldogs over the age of \(3\) | \(p_2 = \) proportion of bulldogs over the age of \(3\) who snore more than five times a week | \(n_2 = 291\) | \(\hat{p}_2 = 0.392\) |

Before going any further, let's check that the conditions for doing a hypothesis test for two population proportions are satisfied. First, the samples are independent, since a bulldog can't be both under \(3\) years old and over \(3\) years old at the same time. In addition, there are certainly far more than \(5910\) people worldwide who own bulldogs, so the \(591\) survey responses make up less than \(10\%\) of the overall population of bulldog owners. Also,

\(n_1\hat{p}_1 = 300(0.26)=78 \ge 10\),

\(n_2\hat{p}_2 = 291(0.392) \approx 114.1 \ge 10\),

\(n_1(1-\hat{p}_1) = 300(1-0.26) = 222 \ge 10\), and

\(n_2(1-\hat{p}_2) = 291(1-0.392) \approx 176.9 \ge 10\),

so all of the conditions for applying the test are met.

The next step is deciding on the null and alternative hypotheses. The null hypothesis would be:

\[ H_0: \; p_2-p_1 = 0\]

or in other words that there is no difference in snoring between the two groups. The alternative hypothesis would be that there is a difference in the snoring rates of the two groups, so:

\[H_a:\; p_2-p_1 \ne 0\]

Calculating the pooled success rate (sometimes called the combined success rate):

\[ \begin{align}\hat{p}_c &= \frac{n_1\hat{p_1} + n_2\hat{p_2} }{n_1 + n_2} \\ &= \frac{300(0.26)+291(0.392)}{300+291} \\ &\approx 0.325 . \end{align}\]

Then the test statistic is:

\[\begin{align} z &= \frac{\hat{p_2} - \hat{p_1} }{\sqrt{ \dfrac{\hat{p}_c (1-\hat{p}_c) }{n_1} +\dfrac{\hat{p}_c (1-\hat{p}_c) }{n_2} } } \\ &= \frac{ 0.392 - 0.26 }{\sqrt{ \dfrac{0.325 (1-0.325) }{300} +\dfrac{0.325 (1-0.325) }{291} } } \\ &\approx 3.425 \end{align}\]

Notice that here the hypotheses are stated in terms of \(p_2-p_1\) simply for the convenience of having \(\hat{p}_2 - \hat{p}_1 \) be positive. It actually doesn't matter which order you choose, as long as you are consistent throughout your work and you make sure your \(z\) calculation matches.

Remember that this is a two-tailed test! So the \(P\)-value is twice the area under the \(z\)-curve and to the right of the \(z\)-value. In other words:

\[ \begin{align} P\text{-value} &= 2(\text{area under curve to the right of }3.425) \\ &= 2\cdot P(z>3.425) \\ &\approx 2(0.0003) \\ &= 0.0006, \end{align} \]

where the value of \(P(z>3.425)\) can be found using a standard normal table or calculator.

So at the \(\alpha = 0.05\) significance level, you can reject the null hypothesis and conclude that there is a difference in bulldog snoring rates based on age.
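The whole procedure can be wrapped in a small helper function. This is a sketch, not a library API; the function name and argument order are made up here:

```python
import math
from statistics import NormalDist

def two_prop_z_test(n1, p1_hat, n2, p2_hat, tails=2):
    """Pooled two-sample z-test for H0: p1 - p2 = 0."""
    # Pooled proportion: total successes over total sample size
    p_c = (n1 * p1_hat + n2 * p2_hat) / (n1 + n2)
    # Standard error under the null hypothesis, then the z statistic
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = tails * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Bulldog snoring example: pass the older group first so z is positive
z, p = two_prop_z_test(291, 0.392, 300, 0.26)
print(round(z, 3), p < 0.05)  # 3.425 True
```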

Would your conclusion have been any different if the alternative hypothesis had been:

\[H_a:\; p_2-p_1 > 0?\]

**Solution:**

The main change would have been in calculating the \(P\)-value. Since this would be a one-tailed test, the calculation would be:

\[ \begin{align} P\text{-value} &= \text{area under curve to the right of }3.425 \\ &= P(z>3.425) \\ &\approx 0.0003 \end{align} \]

At the \(\alpha = 0.05\) significance level, you would still reject the null hypothesis and conclude that bulldogs over the age of \(3\) do snore more than bulldogs under the age of \(3\).
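For the one-tailed version, only the \(P\)-value step changes: use the upper-tail area once instead of doubling it. A quick check, reusing the \(z\)-value from above:

```python
from statistics import NormalDist

z = 3.425  # test statistic from the bulldog example

# One-tailed P-value: upper-tail area only
p_value = 1 - NormalDist().cdf(z)

print(round(p_value, 4))  # 0.0003
print(p_value < 0.05)     # True -> reject H0
```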

- Two samples are independent if picking members for one sample doesn't influence how members of the second sample are picked.
- Two samples are dependent if picking members for one sample automatically determines the members of the second sample.
- For a hypothesis test for two population proportions, the null hypothesis will almost always be that the two population proportions are the same.
- The conditions for applying a hypothesis test for the difference of two population proportions are:
- The samples are independent.
- The sample is less than \(10\%\) of the overall population.
- \(n_1\hat{p}_1 \ge 10\), \(n_2\hat{p}_2 \ge 10\), \(n_1(1-\hat{p}_1) \ge 10\), and \(n_2(1-\hat{p}_2) \ge 10\), where \(n_1\) is the size of the first sample, \(n_2\) is the size of the second sample, \(\hat{p}_1\) is the proportion of successes in the first sample, and \(\hat{p}_2\) is the proportion of successes in the second sample.

- The pooled proportion formula is \[ \begin{align}\hat{p}_c &= \frac{\text{number of successes in the two samples} }{\text{total of the two sample sizes}} \\ &= \frac{n_1\hat{p_1} + n_2\hat{p_2} }{n_1 + n_2}. \end{align}\]
- The formula for the test statistic is \[ z = \frac{\hat{p_1} - \hat{p_2} }{\sqrt{ \dfrac{\hat{p}_c (1-\hat{p}_c) }{n_1} +\dfrac{\hat{p}_c (1-\hat{p}_c) }{n_2} } }\]
