
# Single Variable Data

Single variable data is usually called univariate data. This is a type of data that consists of observations on only a single characteristic or attribute. Single variable data can be used in a descriptive study to see how each characteristic or attribute varies before including that variable in a study with two or more variables.

## Examples of single variable data

What were the scores of the students who took the maths test? Which illness was responsible for most deaths in 2020? What is the weight of each person present in the gym? What is the typical income of a person in the UK? All these questions can be answered using single variable data. Single variable analysis is the simplest form of data analysis. Its main purpose is to describe, and it does not take causes and relationships into consideration.

For instance, when we ask about the scores of students who took a particular maths test, we are mostly interested in how much the results vary from person to person. We can then summarise the data using statistical measures to get an idea of the performance of the whole population that took the test.

## How significant is single variable data?

In research, single variable data does not concern itself with answering questions that involve relationships between variables. It describes an attribute of the subject in question, and how it varies from observation to observation. Univariate data analysis uses statistical measures such as measures of central tendency and measures of spread.

There are two main reasons why a researcher would conduct a single variable analysis. The first is to have a descriptive study of how one characteristic varies from subject to subject. The second is to analyse the variation of each characteristic before it is paired with other variables in a study.

This is where bivariate data and multivariate data come in. Bivariate data describes two characteristics of a subject, and multivariate data describes multiple characteristics. They are needed, for example, to examine how students' scores vary with respect to other factors such as subject and background.

## Single variable data analysis

As mentioned earlier, statistical measures are used to summarise single variable data's centres and spread. Whilst the commonest way to display single variable data is in a table, other common ways are:

• Histograms.
• Frequency distribution.
• Box plots.
• Pie charts.

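Scores of eight students were recorded after taking a maths test in grade 6, and they are as follows: 76, 88, 45, 50, 88, 67, 75, 83. Find the

1. Mean

2. Median

3. Mode

1. Add the scores and divide by the number of students.

Mean = (76 + 88 + 45 + 50 + 88 + 67 + 75 + 83) / 8 = 572 / 8 = 71.5

2. Rearrange values from lowest to highest.

45, 50, 67, 75, 76, 83, 88, 88

With eight values, the median is the mean of the two middle values, 75 and 76.

Median = (75 + 76) / 2 = 75.5

3. The most frequently occurring number is 88, so the mode is 88.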
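The same three measures can be checked with a few lines of Python. This is only a minimal sketch using the standard `statistics` module; the variable names are chosen here for illustration.

```python
from statistics import mean, median, mode

# Scores of the eight students from the example above
scores = [76, 88, 45, 50, 88, 67, 75, 83]

score_mean = mean(scores)      # sum of the values divided by how many there are
score_median = median(scores)  # mean of the two middle values, since the count is even
score_mode = mode(scores)      # most frequently occurring value

print(score_mean, score_median, score_mode)
```

Note that `median` sorts the values itself, so the list does not need to be rearranged first.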

## Histograms

Histograms are among the most commonly used graphs for showing a frequency distribution. A histogram is a graphical display of data using bars of different heights. It looks similar to a bar chart, but a histogram groups numbers into ranges, and the height of each bar shows how many values fall in that range. It is an appropriate way to display single variable data.

Histogram of travel time to work. Image: QWFP, CC BY-SA 3.0
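As a rough illustration of how a histogram groups numbers into ranges, the sketch below bins the eight maths test scores from the earlier example into classes of width 10 and prints a text bar for each class. The class width and the text-based bars are assumptions made for illustration; they are not taken from the travel-time figure.

```python
# Maths test scores from the earlier example
scores = [76, 88, 45, 50, 88, 67, 75, 83]
class_width = 10  # assumed class width, chosen for illustration

# Count how many scores fall in each class (40-49, 50-59, ...)
bins = {}
for score in scores:
    start = (score // class_width) * class_width
    bins[start] = bins.get(start, 0) + 1

# Print one bar per class; the bar length is the frequency
for start in sorted(bins):
    print(f"{start}-{start + class_width - 1}: {'#' * bins[start]}")
```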

## Frequency distribution

A frequency distribution is data presented in a tabular format that shows the number of observations for each value or within each interval. It displays the values and their frequency (how often something occurs). This format is a simple and appropriate way to represent single variable data.

The numbers of newspapers sold at a shop over the last 10 days are:

20, 20, 25, 23, 20, 18, 22, 20, 18, 22.

This can be represented by a frequency distribution. The values above are the observations of the variable (papers sold), and the table shows how often each number of sales occurred over the last 10 days.

| Papers sold | Frequency |
|---|---|
| 18 | 2 |
| 19 | 0 |
| 20 | 4 |
| 21 | 0 |
| 22 | 2 |
| 23 | 1 |
| 24 | 0 |
| 25 | 1 |
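A frequency table like this can be built by counting how often each value occurs. The following is a minimal sketch using `collections.Counter` from the Python standard library; the variable names are illustrative.

```python
from collections import Counter

# Newspapers sold on each of the last 10 days
sales = [20, 20, 25, 23, 20, 18, 22, 20, 18, 22]
counts = Counter(sales)

# List every value from the smallest to the largest,
# so that values which never occurred appear with frequency 0
frequency_table = {
    value: counts[value] for value in range(min(sales), max(sales) + 1)
}

print("Papers sold  Frequency")
for value, frequency in frequency_table.items():
    print(f"{value:>11}  {frequency:>9}")
```

The frequencies always add up to the number of observations, which is a quick way to check a table built by hand.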

## Pie charts

Pie charts are graphs that display data as a circle divided into slices. The size of each slice is proportional to the size of that category relative to the group as a whole. This means that the entire pie is 100%, and each slice shows its category's share of the total.

Assuming the data for pet ownership in Lincoln were collected as follows, how would they be represented on a pie chart?

Dogs - 1110 people

Cats - 987 people

Rodents - 312 people

Reptiles - 97 people

Fish - 398 people

Figure 2. Pie chart representing data of pets in Lincoln
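To draw the chart by hand, each count is converted into a share of the total and then into an angle out of 360 degrees. The sketch below shows that arithmetic for the Lincoln data; the dictionary layout and names are assumptions made for illustration.

```python
# Pet ownership counts from the example above
pets = {"Dogs": 1110, "Cats": 987, "Rodents": 312, "Reptiles": 97, "Fish": 398}
total = sum(pets.values())

# Each slice's share of the whole pie, as a percentage and as an angle
slices = {
    name: {"percent": 100 * count / total, "angle": 360 * count / total}
    for name, count in pets.items()
}

for name, s in slices.items():
    print(f"{name:<9} {s['percent']:5.1f}%  {s['angle']:6.1f} degrees")
```

The percentages sum to 100 and the angles to 360, apart from rounding in the printed values.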

## Box plots

Presenting data using a box plot gives a good graphical image of where the data is concentrated. It displays the five-number summary of a dataset: the minimum, first quartile, median, third quartile, and maximum. This is also a good way to represent single variable data.

The ages of 10 students in grade 12 were collected and they are as follows.

15, 21, 19, 19, 17, 16, 17, 18, 19, 18.

First, we will arrange this from lowest to highest so the median can be determined.

15, 16, 17, 17, 18, 18, 19, 19, 19, 21

Median = 18

To find the quartiles, note that the first quartile is the median of the values to the left of the overall median.

The median for 15, 16, 17, 17, 18 is 17

The third quartile will be the median to the right of the overall median.

The median for 18, 19, 19, 19, 21 is 19

We will now note the minimum number which is 15, and also the maximum which is 21.

Figure 3. Box plot representing students' ages
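The steps above can be sketched in Python as follows. This assumes the median-of-halves convention used in this example (the lower and upper halves exclude the middle value when the count is odd) and at least two data values; other conventions and software packages interpolate and may give slightly different quartiles. The function name is hypothetical.

```python
from statistics import median

# Ages of the 10 students from the example above
ages = [15, 21, 19, 19, 17, 16, 17, 18, 19, 18]

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using medians of the two halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values to the left of the overall median
    upper = ordered[-half:]   # values to the right of the overall median
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

summary = five_number_summary(ages)
print(summary)
```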

## Single variable data - Key takeaways

• Single variable data is a term used to describe a type of data that consists of observations on only a single characteristic or attribute.
• Single variable data's main purpose is to describe, and it does not take causes and relationships into consideration.
• Statistical measures are used to summarise single variable data's centres and spread.
• Common ways single variable data can be described are through histograms, frequency distributions, box plots, and pie charts.

Images

Histogram: https://commons.wikimedia.org/wiki/File:Travel_time_histogram_total_n_Stata.png

Variable means the measured values can be varied anywhere along a given scale, whilst attribute data is something that can be measured in terms of numbers or can be described as either yes or no for recording and analysis.

The ages of students in a class.

Single variable data gives measures of only one attribute whilst two-variable data gives measures of two attributes describing a subject.

Single variable data is used to describe a type of data that consists of observations on only a single characteristic or attribute.

## Final Single Variable Data Quiz

Question

What is cumulative frequency?

The cumulative frequency at a point x is the sum of the individual frequencies up to and at the point x.

Question

Which of the following can you obtain from a cumulative frequency distribution? a) median b) quartiles c) percentiles d) all of the above

d

Question

If a cumulative frequency for the (n-1)th value is 85 in discrete frequency distribution with 110 data points, what is the raw frequency for the nth value?

25

Question

For a grouped frequency distribution, what is the class mark for the class 0.5 - 1.0?

0.75

Question

For a grouped frequency distribution, what is the class mark for the class 2.5 - 3.5?

3.0

Question

For a grouped frequency distribution, what is the class mark for the class 8 - 12?

10

Question

State whether the following statement is true or false : the curve for a cumulative frequency graph is never decreasing.

True

Question

The cumulative frequency curve for an experiment with 200 trials is given by x = y/5, where the cumulative frequency is represented on the y-axis. Find the median.

x = (200/2)/5 = 20

Question

The cumulative frequency curve for an experiment with 200 trials is given by x = y/5, where the cumulative frequency is represented on the y-axis. Find the upper quartile.

x = (200 × 3/4)/5 = 30

Question

The cumulative frequency curve for an experiment with 200 trials is given by x = y/5, where the cumulative frequency is represented on the y-axis. Find the 43rd percentile.

x = (200 × 43/100)/5 = 17.2

Question

The cumulative frequency curve for an experiment with 200 trials is given by x = y/5, where the cumulative frequency is represented on the y-axis. Find the 70th percentile.

x = (200 × 70/100)/5 = 28

Question

The cumulative frequency curve for an experiment with 100 trials is given by x = 2y + 3, where the cumulative frequency is represented on the y-axis. Find the median.

x = 2 × (100/2) + 3 = 103

Question

The cumulative frequency curve for an experiment with 100 trials is given by x = 2y + 3, where the cumulative frequency is represented on the y-axis. Find the lower quartile.

x = 2 × (100/4) + 3 = 53

Question

A grouped frequency distribution has been made for the length of 500 snakes. The cumulative frequency of a class (8.0 - 8.5) inches is 320. How many snakes are more than 8.5 inches long?

180

Question

A grouped frequency distribution has been made for the length of 500 snakes. The cumulative frequency of a class (8.0 - 8.5) inches is 320. Which of the following is the correct conclusion?

There are 320 snakes shorter than or equal to 8.5 inches

Question

What is a box plot?

A box plot is a type of graph that visually shows features of the data.

Question

What are the features that a box plot shows?

A box plot shows you the lowest value, lower quartile, median, upper quartile, highest value and any outliers that the data may have.

Question

How do you find upper and lower quartiles?

To find the upper and lower quartiles, first arrange your data in numerical order and find the median. The lower quartile is then the median of the values below the overall median, and the upper quartile is the median of the values above it.

Question

How do you find the interquartile range?

To find the interquartile range you subtract the lower quartile from the upper quartile.

Question

What is an outlier?

An outlier is classed as a data value that falls more than 1.5 × the interquartile range above the upper quartile or below the lower quartile.

Question

What is a histogram?

A histogram is a type of graph that represents grouped data.

Question

How do you calculate the frequency density?

Frequency density is calculated by dividing the frequency by the class width.

Question

What is a frequency polygon?

A frequency polygon is a graphical representation of a data set with frequency information. It is one of the most commonly used statistical tools used to represent and analyze grouped statistical data.

Question

For a grouped frequency distribution, what is plotted along the X-axis when building a frequency polygon?

Class mark

Question

For a grouped frequency distribution, what is plotted along the Y-axis when building a frequency polygon?

Frequency

Question

How do you obtain a frequency polygon from a given histogram?

Join the middle of the top of each bar of the histogram sequentially.

Question

State whether the following statement is true or false : To draw a frequency polygon, you first have to create a histogram.

False

Question

What is the class mark for the class "8-10"?

9

Question

What is the class mark for the class "45.5-55"?

50.25

Question

What is the class mark for the class "0.1-0.2"?

0.15

Question

State whether the following statement is true or false : The sum of the frequencies of a frequency polygon must equal 1

False

Question

State whether the following statement is true or false : The frequencies of a frequency polygon must be positive

True

Question

State whether the following statement is true or false : To draw a frequency polygon from a given grouped frequency distribution, we must plot the frequency against the class marks and not the class boundaries.

True

Question

What are the two types of measures that are usually commented on when comparing data distributions?

1. Measure of location

2. Measure of spread

Question

What is a measure of spread?

A measure of spread provides information about the variability of the data in a given data set, i.e. how close to or far from each other the different points in the data set are.

Question

What is a measure of location?

A measure of location is used to summarise an entire data set with a single value.

Question

Compare the 2 data sets

Data set A - median 25, Q1 = 18, Q3 = 56

Data set B - median 24, Q1 = 14, Q3 = 130

Data set A has a slightly higher measure of location (median 25 against 24) and a lower variability among the data (interquartile range 38 against 116).

Question

Compare the 2 data sets

Data set A - median 100, Q1 = 50, Q3 = 150

Data set B - median 200, Q1 = 150, Q3 = 250

Data set A has a lower measure of location (median). There appears to be an equal variability among the data sets.

Question

Compare the 2 data sets

Data set A - median 300, Q1 = 275, Q3 = 325

Data set B - median 200, Q1 = 150, Q3 = 250

Data set A has a higher measure of location (median) and a lower variability among the data.

Question

Which of the following is appropriate to use along with median for comparison?

Interquartile range

Question

Which of the following is appropriate to use along with mean for comparison?

standard deviation

Question

Which of the following is appropriate to use along with standard deviation for comparison?

mean

Question

Which of the following is appropriate to use along with interquartile range for comparison?

median

Question

Which of the following should you use for comparing a data set with extreme values?

median and interquartile range

Question

Compare the 2 data sets

Data set A - mean 100, standard deviation = 50

Data set B - mean 200, standard deviation = 50

Data set A has a lower measure of location (mean). There is an equal variability among the data sets.

Question

Compare the 2 data sets

Data set A - mean = 13, standard deviation = 5

Data set B - mean = 18, standard deviation = 15

Data set A has a lower measure of location (mean) and a lower variability among the data sets.

Question

Compare the 2 data sets

Data set A - mean = 13, standard deviation = 5

Data set B - mean = 13, standard deviation = 5

Both data sets have the same measure of location (mean) and the same spread (standard deviation) within the data.

Question

What is single-variable data?

Single-variable data is a type of data that consists of observations on only a single characteristic or attribute.

Question

Another name for univariate data is?

Single-variable data

Question

Amongst the ways single-variable data can be represented, which one displays the five-number summary of a dataset?

Box plot
