Data Interpretation

Data interpretation refers to the process of subjecting data to predefined processes, such as organization into tables, charts, or graphs, so that logical and statistical conclusions can be derived. This part of statistics answers a common question among researchers: what exactly are we supposed to present?

It is not ideal for researchers to present the raw numerical values collected from instruments or surveys. Data need to be organized to tell the story you want to emphasize in your research. That story should focus on the problem you want to solve, also known as 'the statement of the problem', which is the primary purpose of the research.

Statistical tools are used in the process, helping you to transform data into useful information that can help you to arrive at important conclusions. This process is called data analysis. It is after this process that data can be fully interpreted.

Statistical methods

Statistical methods allow you to work on your data. Imagine you have the exam scores for 100 students, and you want to interpret this data. Scanning through the scores by eye alone might be quite tough! Here are two methods that would simplify this task.

Measures of central tendency

Central tendency values are used to describe some key characteristics of the whole data set, producing a single value that is typical of the whole set. For example, the mode will give you the value that occurs the most often.

  • The mean is the most commonly reported measure of central tendency; it is the arithmetic average. To calculate the mean, add up all of the values and divide by the number of values. The mean is represented by μ, and its formula is $\mu = \frac{\sum x}{n}$, where n is the number of data items in the sample and $\sum x$ is the sum of all data values.

  • The median is the mid-point value of the ordered data set. When the data set has an even number of values, there are two middle values and the median is their average.

  • The mode is the value that occurs the most often.
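
As a rough illustration of the three measures above, here is a minimal Python sketch using the standard-library `statistics` module. The list `scores` is a hypothetical set of quiz scores invented for this example; it does not come from the text.

```python
import statistics

# Hypothetical quiz scores, invented purely for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean_score = statistics.mean(scores)      # sum of the values divided by their count
median_score = statistics.median(scores)  # middle value; average of the two middle values for an even count
mode_score = statistics.mode(scores)      # value that occurs most often

print(mean_score, median_score, mode_score)
```

With eight values, the median is the average of the fourth and fifth values of the ordered list.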

Standard deviation

Another statistical measure that is commonly used is variability, also known as spread. The range is the simplest measure of variability. Let's take the exam score dataset again: the range is the span between the lowest and highest numerical values.

Another common measure is the variance, which is the average of the squared deviations from the mean. This number indicates how much the individual values deviate from the mean. What you will see reported more often is the standard deviation, which is the square root of the variance. Standard deviation expresses how much individual scores differ from the mean value for the group. Mathematically, it can be written as:

$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{n}}$

where μ is the mean and n is the number of data values.
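
The spread measures described above can be sketched in Python as follows. This is only an illustration: `scores` is a hypothetical data set, and the population forms `pvariance` and `pstdev` are used on the assumption that the values are the whole group rather than a sample.

```python
import statistics

# Hypothetical quiz scores, invented purely for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)  # span between lowest and highest value
variance = statistics.pvariance(scores)  # average squared deviation from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

print(score_range, variance, std_dev)
```

If the scores were only a sample from a larger group, `statistics.variance` and `statistics.stdev` (which divide by n - 1) would be the usual alternative.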

Single variable data

Single variable data involves examining one particular variable relevant to a dataset. Single variable analysis is common in descriptive forms of analysis and uses histograms, frequency distributions, and box plots, among other methods. It is mostly used as the first step in investigating data. Let's take a look at a box plot.

Box plot

A box plot displays the five-number summary of a dataset: the minimum, first quartile, median, third quartile, and maximum. Quartiles tell us about the spread of data by breaking the ordered data set into quarters. The lower quartile, Q1, is the value below which 25% of the data lie; the middle quartile (the median) marks 50%; and the upper quartile, Q3, marks 75%.

The ages of 10 students in grade 12 were collected and they are as follows:

15, 21, 19, 19, 17, 16, 17, 18, 19, 18.

Let's first arrange these in ascending order.

15, 16, 17, 17, 18, 18, 19, 19, 19, 21.

We can now find the median, which is the middle number. Since we have an even number of values, there are two middle numbers, and the standard practice is to take their average. Here both middle numbers are 18, so the average is also 18.

median = 18

We will find the quartiles now. The first quartile is the median of the lower half of the data, to the left of the overall median.

That means we are finding the median of 15, 16, 17, 17, 18.

This equals 17.

The third quartile is the median of the upper half of the data, to the right of the overall median.

18, 19, 19, 19, 21

This makes the third quartile 19.

Now we will document the minimum number which is 15.

And also document the maximum which is 21.

Figure 1. Box plot

The image above is the box plot representing the data of the ages of the students in grade 12.
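
The steps used for the ages example can be sketched in Python. The helper name `five_number_summary` is made up for this illustration, and it follows the median-of-halves approach used in the text; other tools may use different quartile conventions and give slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # When the count is odd, the middle value belongs to neither half.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return (
        ordered[0],
        statistics.median(lower_half),
        statistics.median(ordered),
        statistics.median(upper_half),
        ordered[-1],
    )

ages = [15, 21, 19, 19, 17, 16, 17, 18, 19, 18]
summary = five_number_summary(ages)
print(summary)
```

For the ten ages this reproduces the values found by hand: minimum 15, first quartile 17, median 18, third quartile 19, and maximum 21.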

We will take another example with an odd number of data points.

The table below shows basketball points scored per game over a seven-game span. Visualise this on a box and whisker plot.

Game    Points
1       10
2       17
3       5
4       32
5       16
6       18
7       20

Step 1.

Rearrange the values in the data set from lowest to highest.

5, 10, 16, 17, 18, 20, 32.

Step 2.

Now identify the highest and lowest values in the data set

Highest value: 32

Lowest value: 5

Step 3.

We can now identify the midpoint value (median) of the data set.

Median = 17

Step 4.

We will now find the upper and lower quartiles.

The lower quartile is the median for the first half of the data set.

That will mean that we are finding the median for 5, 10, 16

Lower quartile = 10

The upper quartile is the median for the second half of the data set.

That will also mean that we are finding the median for 18, 20, 32

Upper quartile = 20

Step 5.

Now that we have all our necessary values, we will construct our box and whisker plot.

Highest value = 32

Lowest value = 5

Median = 17

Upper quartile = 20

Lower quartile = 10

We will first draw a number line that fits the data, and plot all the necessary values we found.

Figure 2. Plotting necessary values on a box

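Construct a rectangle that encloses the median of the data set, with its vertical sides passing through the lower and upper quartiles. Then draw a vertical line through the median that meets the top and bottom edges of the rectangle. Finally, draw horizontal lines (the whiskers) from the sides of the rectangle out to the lowest and highest values.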

Figure 3. Box and whisker plot

There, we have our box and whisker plot for the basketball games.
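
Once the five values are known, the width of the box (the interquartile range) and a check for outliers follow directly. The sketch below assumes the 1.5 × interquartile range rule quoted in the quiz on this page; the variable names are illustrative only.

```python
# Values found in the worked basketball example.
lowest, lower_quartile, median, upper_quartile, highest = 5, 10, 17, 20, 32
points = [10, 17, 5, 32, 16, 18, 20]

iqr = upper_quartile - lower_quartile     # width of the box
lower_fence = lower_quartile - 1.5 * iqr  # values below this would count as outliers
upper_fence = upper_quartile + 1.5 * iqr  # values above this would count as outliers
outliers = [p for p in points if p < lower_fence or p > upper_fence]

print(iqr, lower_fence, upper_fence, outliers)
```

Under this rule the 32-point game lies inside the upper fence of 35, so the whiskers can run all the way to the lowest and highest values.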

Bivariate data

In contrast to single variable data, bivariate data consist of two variables for each individual. For example, in large studies in the health sector, it is common to collect variables such as height, age, blood pressure, etc. in each individual. Let's look at an example in a two-way frequency table.

The table shows the number of males and females who received each grade on a math project in school.

Grade   Male    Female  Total
A       9       12      21
B       18      14      32
C       8       11      19
D       2       3       5
E       1       2       3
Total   38      42      80

We can see there are 9 males and 12 females who got an A, 18 males and 14 females who got a B, and so on.

Now we can answer a couple of questions.

  1. How many students in total had an A?

Answer: 21 students.

  2. How many males were surveyed?

Answer: 38 males.

  3. How many males earned a grade of A?

Answer: 9.
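
The same questions can be answered programmatically. In this minimal sketch the dictionary layout and names are assumptions made for illustration, and the grade C counts (8 males, 11 females) are inferred from the column totals of 38 and 42 rather than stated in the text.

```python
# Two-way frequency table: grade -> counts by sex.
grades = {
    "A": {"male": 9, "female": 12},
    "B": {"male": 18, "female": 14},
    "C": {"male": 8, "female": 11},  # inferred from the column totals
    "D": {"male": 2, "female": 3},
    "E": {"male": 1, "female": 2},
}

total_with_a = sum(grades["A"].values())                       # row total for grade A
total_males = sum(row["male"] for row in grades.values())      # column total for males
total_females = sum(row["female"] for row in grades.values())  # column total for females
males_with_a = grades["A"]["male"]                             # single cell
grand_total = total_males + total_females

print(total_with_a, total_males, males_with_a, grand_total)
```

Row totals answer questions about a grade, column totals answer questions about a group, and a single cell answers questions about both at once.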

Below is a graph representation of two variables, the sales of ice cream in a given shop against the temperature of the day. This demonstrates how much ice cream is purchased at every given temperature.

Bivariate data: ice cream sales versus temperature of the day

Probability

Probability is the measure of how likely an event is to happen. Probabilities can be placed on a number line between 0 and 1.

If the probability of an event is 0, it is impossible for the event to occur, whilst if it is 1, the event is certain. Values in between express varying degrees of likelihood, and 0.5 means there is an even chance of the event happening.

Probabilities are written using the following notation: P(A) is the probability that event A happens, and P(A') is the probability that event A does not happen.

Since event A either happens or does not happen, the probability of event A not happening is P(A') = 1 - P(A).

For example, if P(A) = 0.8, then

P(A') = 1 - 0.8 = 0.2.

They should both add up to 1.
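
The complement rule is simple enough to check in a few lines of Python. The function name `complement` is hypothetical, and the range check is an added safeguard rather than something the text requires.

```python
def complement(p_a):
    """Return the probability that an event does not happen, given the probability that it does."""
    if not 0 <= p_a <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return 1 - p_a

p_a = 0.8
p_not_a = complement(p_a)

# Rounded for display, since floating-point subtraction is not exact.
print(round(p_not_a, 10))
```

Because of floating-point rounding, comparisons such as "the two probabilities add up to 1" are best made with a small tolerance.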

These are the basic concepts you will use throughout probability at this level. You will also meet Venn diagrams, tree diagrams, and similar tools again!

Data Interpretation - Key takeaways

  • Data interpretation refers to the process of subjecting collected data to predefined processes so logical and statistical conclusions can be derived.
  • Presentation refers to the representation of data in graphs, plots, frequency tables, etc.
  • Measures of central tendency produce a single value that is typical of the whole set. The basic measures are the mean, median and mode.
  • Single variable data involves examining one particular variable relevant in a dataset.
  • In contrast to single variable data, bivariate data consist of two variables for each individual.
  • Probability is the measure of how likely an event is to happen.

Frequently Asked Questions about Data Interpretation

You carry out analysis by selecting each component of the data and seeing if there are any patterns.


Data interpretation involves explaining what these findings mean with reference to the statement of the problem.


It's necessary to organise and group ideas in a logical way.

Data need to be organised to tell the story of what you want to emphasise in your research. This should focus on the problem you want to solve. It is the primary function of the research.

Final Data Interpretation Quiz

Question: What is cumulative frequency?

Answer: The cumulative frequency at a point x is the sum of the individual frequencies up to and including the point x.

Question: Which of the following can you obtain from a cumulative frequency distribution? a) median b) quartiles c) percentiles d) all of the above

Answer: d

Question: If the cumulative frequency for the (n-1)th value is 85 in a discrete frequency distribution with 110 data points, what is the raw frequency for the nth value?

Answer: 25

Question: For a grouped frequency distribution, what is the class mark for the class 0.5 - 1.0?

Answer: 0.75

Question: For a grouped frequency distribution, what is the class mark for the class 2.5 - 3.5?

Answer: 3.0

Question: For a grouped frequency distribution, what is the class mark for the class 8 - 12?

Answer: 10

Question: State whether the following statement is true or false: the curve for a cumulative frequency graph is never decreasing.

Answer: True

Question: The cumulative frequency curve for an experiment with 200 trials is given by x = y/5, where the cumulative frequency is represented on the y-axis. Find the median.

Answer: x = (200/2)/5 = 20

Question: The cumulative frequency curve for an experiment with 200 trials is given by x = y/5, where the cumulative frequency is represented on the y-axis. Find the upper quartile.

Answer: x = (200 × 3/4)/5 = 30

Question: The cumulative frequency curve for an experiment with 200 trials is given by x = y/5, where the cumulative frequency is represented on the y-axis. Find the 43rd percentile.

Answer: x = (200 × 43/100)/5 = 17.2

Question: The cumulative frequency curve for an experiment with 200 trials is given by x = y/5, where the cumulative frequency is represented on the y-axis. Find the 70th percentile.

Answer: x = (200 × 70/100)/5 = 28

Question: The cumulative frequency curve for an experiment with 100 trials is given by x = 2y + 3, where the cumulative frequency is represented on the y-axis. Find the median.

Answer: x = 2 × (100/2) + 3 = 103

Question: The cumulative frequency curve for an experiment with 100 trials is given by x = 2y + 3, where the cumulative frequency is represented on the y-axis. Find the lower quartile.

Answer: x = 2 × (100/4) + 3 = 53

Question: A grouped frequency distribution has been made for the length of 500 snakes. The cumulative frequency of a class (8.0 - 8.5) inches is 320. How many snakes are more than 8.5 inches long?

Answer: 500 - 320 = 180

Question: A grouped frequency distribution has been made for the length of 500 snakes. The cumulative frequency of a class (8.0 - 8.5) inches is 320. Which of the following is the correct conclusion?

Answer: There are 320 snakes that are 8.5 inches long or shorter.

Question: What are statistical measures?

Answer: Statistical measures are descriptive analysis techniques used to summarise the characteristics of a data set.

Question: Which of these measures of central tendency best describes the most frequently occurring number in a dataset?

Answer: Mode

Question: What are the three main measures of central tendency?

Answer: Mean, median, and mode.

Question: What is the median?

Answer: The median is the mid-point value of a given dataset.

Question: Given the data set {2, 3, 4, 6, 7, 7, 8, 9}, what is the median here?

Answer: 6.5

Question: What is the mode if you are given the dataset {2, 5, 3, 2, 5, 6, 7, 5}?

Answer: 5

Question: Which of these is not a measure of spread?

Answer: Mean

Question: Which of these statements is true about variance and standard deviation?

Answer: Standard deviation is the square root of variance.

Question: What is the difference between the highest and lowest values of a given data set known as?

Answer: The range

Question: Find the range for the given dataset, {43, 34, 78, 16}.

Answer: 62

Question: What is the difference between the upper quartile and the lower quartile known as?

Answer: The interquartile range.

Question: What is conditional probability?

Answer: Conditional probability is the probability of an event B occurring given that another event A has already occurred.

Question: Are events A and B in conditional probability dependent or independent?

Answer: Dependent

Question: What methods can be used to calculate conditional probability?

Answer: Using the formula, drawing a Venn diagram, or drawing a tree diagram.

Question: In an international school, there are 35 pupils in one particular class. 5 of them are French. 3 of the French students are boys. A student is picked at random from the class. What is the probability of that student being a boy given that the student is French?

Answer: 3/5 = 0.6

Question: If we had a bag of 12 sweets, with 6 lemon (L) and 6 strawberry (S) sweets at the beginning, and picked two sweets, one after the other without replacing them, what would be P(S|S)?

Answer: 5/11 ≈ 0.455

Question: If we had a bag of 12 sweets, with 6 lemon (L) and 6 strawberry (S) sweets at the beginning, and picked two sweets, one after the other without replacing them, what would be P(L|S)?

Answer: 6/11 ≈ 0.545

Question: If P(B|A)=0.4, P(A)=0.7 and P(B)=0.3, what is P(A|B)?

Answer: 0.933

Question: If P(B|A)=0.51, P(A)=0.33 and P(B)=0.66, what is P(A|B)?

Answer: 0.255

Question: Given that P(A)=0.4, P(B)=0.56 and P(A|B)=0.857, calculate P(A∩B).

Answer: 0.4799

Question: Mike decides to buy two gerbils. He goes to a shop and the keeper reliably informs him that there are seven male and eight female gerbils to choose from. Mike chooses two gerbils at random. Find the probability that both gerbils are female given they are the same sex.

Answer: 0.571

Question: What is an event in probability?

Answer: An event is the outcome or set of outcomes resulting from an experiment.

Question: What is the sample space in probability?

Answer: The sample space is the set of all possible outcomes.

Question: What does the sum of the probabilities of all the possible outcomes equal?

Answer: The sum of the probabilities of all the possible outcomes equals 1.

Question: What is a discrete random variable?

Answer: A random variable is discrete when it can only take certain numerical values within a given interval.

Question: What diagrams can you use to represent probability?

Answer: Venn diagrams and tree diagrams

Question: What is a probability distribution?

Answer: A probability distribution is a table or equation that associates each possible outcome of a random variable with its corresponding probability.

Question: What is probability?

Answer: Probability is the branch of mathematics that studies the numerical description of how likely it is that an event will happen.

Question: What is an experiment in probability?

Answer: An experiment is a process that can be repeated many times, producing a set of specific outcomes, e.g. tossing a coin or rolling a die.

Question: What is a box plot?

Answer: A box plot is a type of graph that visually shows features of the data.

Question: What are the features that a box plot shows?

Answer: A box plot shows you the lowest value, lower quartile, median, upper quartile, highest value and any outliers that the data may have.

Question: How do you find upper and lower quartiles?

Answer: First arrange your data in numerical order, then find the median. The lower quartile is the median of the values below the overall median, and the upper quartile is the median of the values above it.

Question: How do you find the interquartile range?

Answer: To find the interquartile range you subtract the lower quartile from the upper quartile.

Question: What is an outlier?

Answer: An outlier is a data value that lies more than 1.5 × the interquartile range above the upper quartile or below the lower quartile.

Question: What is an event in probability?

Answer: An event is the outcome or set of outcomes resulting from an experiment. An event is also known as a subset of the sample space.
