# Bivariate Data

Bivariate data is data collected on two variables, where each data point in one variable has a corresponding data point in the other variable. We normally collect bivariate data to investigate the relationship between the two variables and then use this relationship to inform future decisions.

For example, we could collect data on outside temperature versus ice cream sales, or we could study height versus shoe size. Both would be examples of bivariate data. If the data showed that ice cream sales rise as the outside temperature rises, then shops could use this to stock more ice cream for hotter spells during the summer.

## How to represent bivariate data?

We use scatter graphs to represent bivariate data. A scatter graph of bivariate data is a two-dimensional graph with one variable on one axis, and the other variable on the other axis. We then plot the corresponding points on the graph. We can then draw a regression line (also known as a line of best fit), and look at the correlation of the data (which direction the data goes, and how close to the line of best fit the data points are).

### Drawing a scatter graph

Step 1: Draw a set of axes and choose an appropriate scale for the data.

Step 2: Label the x-axis with the explanatory / independent variable (the variable that will change), and the y-axis with the response / dependent variable (the variable which we suspect will change due to the independent variable changing). Also give the graph itself a title describing what it shows.

Step 3: Plot the data points on the graph.

Step 4: Draw the line of best fit, if required.

Here is a set of data relating the temperature on days in July, and the number of ice creams sold in a corner shop.

| Temperature (°C) | 14 | 16 | 15 | 16 | 23 | 12 | 21 | 22 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ice cream sales | 16 | 18 | 14 | 19 | 43 | 12 | 24 | 26 |

In this case, the temperature is the independent variable, and ice cream sales are the dependent variable. This means that we plot temperature on the x-axis, and ice cream sales on the y-axis. The resulting graph should look as follows.

Graph of Ice cream sales against temperature - StudySmarter Originals
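The drawing steps can also be followed in code. The block below is a minimal sketch, assuming Python with the matplotlib plotting library installed; the helper names `pair_points` and `draw_scatter` are hypothetical and chosen only for this illustration. It uses the temperature and ice cream sales data from the table above.

```python
# Ice cream data from the table above.
temperatures_c = [14, 16, 15, 16, 23, 12, 21, 22]   # independent variable (x-axis)
ice_cream_sales = [16, 18, 14, 19, 43, 12, 24, 26]  # dependent variable (y-axis)


def pair_points(xs, ys):
    """Pair each x value with its corresponding y value (hypothetical helper)."""
    if len(xs) != len(ys):
        raise ValueError("bivariate data needs one y value for every x value")
    return list(zip(xs, ys))


def draw_scatter(xs, ys, x_label, y_label, title):
    """Draw a labelled scatter graph; assumes matplotlib is installed."""
    import matplotlib.pyplot as plt  # imported here so pair_points works without it

    fig, ax = plt.subplots()   # Step 1: axes (matplotlib picks the scale)
    ax.set_xlabel(x_label)     # Step 2: label both axes and the graph
    ax.set_ylabel(y_label)
    ax.set_title(title)
    ax.scatter(xs, ys)         # Step 3: plot the data points
    plt.show()


points = pair_points(temperatures_c, ice_cream_sales)
```

Calling `draw_scatter(temperatures_c, ice_cream_sales, "Temperature (°C)", "Ice cream sales", "Ice cream sales against temperature")` should then open a window showing the scatter graph. The line of best fit from Step 4 is left out of this sketch.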

The following data represents the journey of a car with time and distance travelled measured starting from the beginning of the journey:

| Time (in hours) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Distance (km) | 12 | 17 | 18 | 29 | 35 | 51 | 53 | 60 |

In this case, time is the independent variable, and distance is the dependent variable. This means that we plot time on the x-axis, and distance on the y-axis. The resulting graph should look as follows.

Graph of distance against time - StudySmarter Originals
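Step 4 of the drawing steps mentions a line of best fit. One common way to find it, assumed here because this article does not name a specific method, is the least-squares method. The block below is a minimal sketch in plain Python using the car journey data; the function name `least_squares_line` is hypothetical.

```python
# Car journey data from the table above.
time_hours = [1, 2, 3, 4, 5, 6, 7, 8]            # independent variable
distance_km = [12, 17, 18, 29, 35, 51, 53, 60]   # dependent variable


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept


slope, intercept = least_squares_line(time_hours, distance_km)
print(f"distance = {slope:.2f} * time + {intercept:.2f}")
```

For this data the line works out to roughly distance = 7.39 × time + 1.11, so under this sketch the car covers a little over 7 km for each extra hour of the journey.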

## What is the meaning of correlation and regression for bivariate data?

Correlation describes the relationship between two variables. We measure correlation on a scale from -1 to 1. A negative value is called a negative correlation, and a positive value is called a positive correlation. The closer the correlation is to either end of the scale, the stronger the relationship, and the closer it is to zero, the weaker the relationship. A zero correlation means there is no linear relationship between the two variables.

Regression is when we draw a line of best fit for the data. This line of best fit is chosen to minimize the overall distance between the data points and the line. Correlation is a measure of how close the data is to our line of best fit. If we find a strong correlation between two variables, then they have a strong relationship, which suggests that one variable may influence the other. A strong correlation on its own does not prove that one variable causes the other.
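To see where a number on the -1 to 1 scale comes from, the block below is a minimal sketch that computes a correlation coefficient for the ice cream data. It assumes the measure meant here is Pearson's correlation coefficient, which this article does not state explicitly; the function name `pearson_r` is hypothetical.

```python
import math

# Ice cream data from the scatter graph example.
temperatures_c = [14, 16, 15, 16, 23, 12, 21, 22]
ice_cream_sales = [16, 18, 14, 19, 43, 12, 24, 26]


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired data (assumed measure)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together, and how each varies on its own.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(temperatures_c, ice_cream_sales)
print(f"r = {r:.2f}")
```

For this data the result is about 0.88, a strong positive correlation: hotter days tend to go with higher ice cream sales.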

## Bivariate data - Key takeaways

• Bivariate data is the collection of two data sets, where each piece of data is paired with another from the other data set.
• We use a scatter graph to show bivariate data.
• The correlation between bivariate data demonstrates how strong the relationship is between two variables.

Bivariate data is the collection of two data sets, where data in one set corresponds pairwise to the data in the other set.

Univariate data is an observation on only one variable, whilst bivariate data is observation on two variables.

## Final Bivariate Data Quiz

Question

What are scatter graphs?

Scatter graphs are graphs with points that show the relationship between two variables.

Question

What is the difference between the dependent and independent variables?

The dependent variable is influenced by the independent variable and is plotted on the y-axis, whilst the independent variable is not influenced by the other variable and is plotted on the x-axis.

Question

What does each point on a scatter graph relate to?

Each point relates to the values of the two variables that are being compared.

Question

What is correlation?

Correlation is the relationship between two data sets or variables.

Question

What does the correlation coefficient measure?

The correlation coefficient measures the strength and direction of the linear relationship between two variables being compared.

Question

What is a positive correlation?

A positive correlation is when, as one variable increases, the other tends to increase too.

Question

What is a perfect positive correlation?

A perfect positive correlation is a correlation of +1. It means that the variables being compared always move together in the same direction, with every point lying exactly on a straight line that slopes upwards.

Question

What is a negative correlation?

A negative correlation is when, as one variable increases, the other tends to decrease.

Question

What is a perfect negative correlation?

A perfect negative correlation is a correlation expressed as -1, where the two variables being compared always move in opposite directions.

Question

What correlation would this be: The more time a person spends practising Maths, the less confused they will be on the topics.

Negative correlation

Question

What is an independent variable?

An independent variable is the variable that changes by itself (some may say changes independently)

Question

What is a dependent variable?

A dependent variable is one that changes as a result of other variables changing.

Question

In summer, when the weather is hot in New York, there is an increase in both crime rate and ice cream sales. Explain why there may be a correlation but this does not mean that an increase of ice cream sales means crime increases.

This is a case of correlation, but not necessarily causation. When it is hot, people want to cool down, so they buy more ice cream. Also, when it gets hot, people get more agitated, so the crime rate increases. The hot weather drives both increases, so neither one has to cause the other.

Question

What is the dependent variable?

The variable which changes as a result of a change of the independent variable.

Question

What is the independent variable?

The independent variable is one which does not change as a result of others changing. We will change this variable and see how others change.

Question

What number represents a strong positive correlation?

A number close to 1 (exactly 1 is a perfect positive correlation)

Question

What number represents a strong negative correlation?

A number close to -1 (exactly -1 is a perfect negative correlation)

Question

What number represents a zero correlation?

0

Question

On which axis of a scatter graph would the dependent variable be?

y-axis

Question

On which axis of a scatter graph would the independent variable be?

x-axis

Question

True or false: a correlation of -0.8 is stronger than 0.6

True

Question

Which of the following correlation coefficients is strongest:

-1

Question

True or false: a correlation coefficient of 0.7 is weaker than -0.9

True

