Skip to main content
Mathematics 3% exam weight

Statistics

Part of the JAMB UTME study roadmap. Mathematics topic math-19 of Mathematics.

Statistics

🟢 Lite — Quick Review (1h–1d)

Rapid summary for last-minute revision before your exam.

Statistics — Quick Facts

What is Statistics? Statistics is the branch of mathematics that deals with collecting, organizing, presenting, and interpreting data to make meaningful conclusions. In JAMB, Statistics is a key topic that appears almost every year.

Data Types:

  • Raw data: Individual observations as collected (e.g., 4, 7, 2, 9, 3)
  • Grouped data: Data organized into classes/frequency tables
  • Discrete data: Data that can be counted (number of children, shoe sizes)
  • Continuous data: Data that can be measured (height, weight, time, temperature)

Measures of Central Tendency — The Big Three:

MeasureWhat it isSymbol/Formula
MeanAverage value$\bar{x} = \frac{\sum x}{n}$
MedianMiddle valuePosition = $\frac{n+1}{2}$
ModeMost frequent valueNo formula

Quick Formulas — Memorise Now:

  • Mean (ungrouped): $\bar{x} = \frac{\sum x}{n}$
  • Mean (grouped discrete): $\bar{x} = \frac{\sum fx}{\sum f}$
  • Mean (grouped continuous): $\bar{x} = \frac{\sum fx}{\sum f}$ where $x$ = class midpoint
  • Median (grouped): $M = L + \left[\frac{\frac{n}{2} - c.f.}{f}\right] \times w$
  • Mode (grouped): $M_o = L + \frac{d_1}{d_1 + d_2} \times w$
  • Standard deviation (ungrouped): $s = \sqrt{\frac{\sum(x - \bar{x})^2}{n}}$

JAMB Exam Tip: For grouped data questions, always find the class midpoint $x$ first (midpoint of class boundaries). This is critical for calculating the mean and standard deviation of grouped data.


🟡 Standard — Regular Study (2d–2mo)

For students who want genuine understanding.

Statistics — Study Guide

1. What is Statistics?

Statistics is the science of collecting, analyzing, interpreting, and presenting data. In everyday life and in exams, we use statistics to summarize large amounts of information into single numbers that tell us something useful.

Populations and Samples:

  • Population: The entire set of items or observations under study
  • Sample: A smaller subset of the population selected for actual data collection
  • Census: Data collected from every member of the population

Data Types — Know the Difference:

Raw Data: Unorganized data in the order it was collected. Example: Scores in a class test — 45, 72, 58, 90, 33, 67, 81, 55

Grouped Data: Raw data organized into groups (classes) with frequencies. Example: Heights of 20 students grouped into class intervals:

Class IntervalFrequency (f)
150–1543
155–1597
160–1646
165–1694

Discrete Data: Can only take specific, separate values (usually whole numbers). Examples: number of students in a class, number of cars in a parking lot, dice rolls.

Continuous Data: Can take any value within a range. Examples: height (can be 155.5 cm, 155.55 cm…), weight, temperature, time, distance.

Class Boundaries and Class Marks: For continuous grouped data:

  • Class boundary: The actual lower and upper limits of a class interval
    • Lower class boundary = lower limit − 0.5 (for whole number data)
    • Upper class boundary = upper limit + 0.5
  • Class mark (midpoint): $x = \frac{\text{lower boundary} + \text{upper boundary}}{2}$

Example: Class 150–154 → Lower boundary = 149.5, Upper boundary = 154.5, Class mark = $\frac{149.5 + 154.5}{2} = 102$

Wait, let me recalculate:

  • Class 150–154 → Lower boundary = 149.5, Upper boundary = 154.5
  • Class mark $x = \frac{149.5 + 154.5}{2} = \frac{304}{2} = 152$

Class Width: $w = \text{upper boundary} - \text{lower boundary} = 154.5 - 149.5 = 5$

2. Measures of Central Tendency

Central tendency answers: “What is a typical value in my data?”

Mean (Arithmetic Average)

Mean for Ungrouped Data: $$\bar{x} = \frac{\sum x}{n}$$

Where $\sum x$ = sum of all values, $n$ = number of values

Example: Find the mean of: 4, 7, 2, 9, 3

$$\bar{x} = \frac{4 + 7 + 2 + 9 + 3}{5} = \frac{25}{5} = 5$$


Mean for Grouped Discrete Data: $$\bar{x} = \frac{\sum fx}{\sum f}$$

Where $f$ = frequency, $x$ = value/data point

Example:

Value (x)Frequency (f)fx
236
4520
6212
818
Total1146

$$\bar{x} = \frac{46}{11} = 4.18$$


Mean for Grouped Continuous Data: $$\bar{x} = \frac{\sum fx}{\sum f}$$

Where $x$ = class midpoint (class mark), $f$ = frequency

Example:

Class IntervalClass Midpoint (x)Frequency (f)fx
10–1412448
15–19178136
20–24225110
25–2927381
Total20375

$$\bar{x} = \frac{375}{20} = 18.75$$

⚠️ Critical Step: For continuous grouped data, you MUST use the class midpoint as your $x$ values — not the lower or upper class boundaries!


Median

The median is the middle value when data is arranged in order.

Median for Ungrouped Data:

  1. Arrange data in ascending order
  2. Find the position: $\text{Position} = \frac{n+1}{2}$
  3. Read the value at that position

Example: Find the median of: 8, 3, 5, 1, 9

Step 1: Arrange: 1, 3, 5, 8, 9 Step 2: Position = $\frac{5+1}{2} = 3$ Step 3: 3rd value = 5

If n is even: Example: 2, 4, 6, 8 Position = $\frac{4+1}{2} = 2.5$ Median = average of 2nd and 3rd values = $\frac{4+6}{2} = 5$


Median for Grouped Data (Using Median Formula):

$$M_e = L + \left[\frac{\frac{n}{2} - c.f.}{f}\right] \times w$$

Where:

  • L = Lower class boundary of the median class
  • n = Total frequency (sum of all frequencies)
  • c.f. = Cumulative frequency of the class before the median class
  • f = Frequency of the median class
  • w = Class width (same for all classes)

How to Find the Median Class: Find $\frac{n}{2}$, then locate the class where the cumulative frequency first exceeds $\frac{n}{2}$.

Full Worked Example:

Class IntervalFrequency (f)Cumulative Frequency (c.f.)
0–422
5–979
10–141019
15–19625
20–24328

Total $n = 28$, so $\frac{n}{2} = 14$

The class where c.f. first exceeds 14 is 10–14 (c.f. = 19). This is the median class.

Using the formula:

  • $L = 9.5$ (lower boundary of 10–14)
  • $c.f. = 9$ (cumulative frequency before 10–14)
  • $f = 10$ (frequency of median class)
  • $w = 5$ (class width: 14.5 − 9.5 = 5)

$$M_e = 9.5 + \left[\frac{14 - 9}{10}\right] \times 5 = 9.5 + \left[\frac{5}{10}\right] \times 5 = 9.5 + 2.5 = 12$$

⚠️ JAMB Trick: Students often use the wrong cumulative frequency. Remember: c.f. is the frequency of ALL classes BEFORE the median class, NOT including the median class itself.


Mode

Mode for Ungrouped Data: The mode is simply the value that appears most frequently.

Example: In the data set 3, 5, 2, 7, 3, 5, 3, 9

  • Frequency of 3 = 3 times
  • Frequency of 5 = 2 times
  • All others appear once
  • Mode = 3

A dataset can have no mode (all values appear once), one mode (unimodal), or more than one mode (bimodal/multimodal).


Mode for Grouped Data:

$$M_o = L + \frac{d_1}{d_1 + d_2} \times w$$

Where:

  • L = Lower class boundary of the modal class (the class with the highest frequency)
  • d₁ = $f_1 - f_0$ (frequency of modal class minus frequency of class before it)
  • d₂ = $f_1 - f_2$ (frequency of modal class minus frequency of class after it)
  • w = Class width

Full Worked Example:

Class IntervalFrequency (f)
0–42
5–97
10–1410 ← highest (modal class)
15–196
20–243
  • Modal class = 10–14 (highest frequency of 10)
  • $L = 9.5$
  • $f_1 = 10$ (modal class frequency)
  • $f_0 = 7$ (class before modal class)
  • $f_2 = 6$ (class after modal class)
  • $d_1 = 10 - 7 = 3$
  • $d_2 = 10 - 6 = 4$
  • $w = 5$

$$M_o = 9.5 + \frac{3}{3 + 4} \times 5 = 9.5 + \frac{3}{7} \times 5 = 9.5 + 2.14 = 11.64$$

3. Cumulative Frequency and Ogives

Cumulative Frequency (c.f.): The running total of frequencies up to a particular class.

Class IntervalFrequencyCumulative Frequency
0–422
5–972 + 7 = 9
10–14109 + 10 = 19
15–19619 + 6 = 25
20–24325 + 3 = 28

Ogives: Line graphs showing cumulative frequency plotted against class boundaries.

  • Less than ogive: Plots the upper class boundary against cumulative frequency (e.g., less than 5, less than 10…)
  • More than ogive: Plots the lower class boundary against cumulative frequency (e.g., more than 0, more than 5…)

Ogives can be used to estimate the median (the value at $\frac{n}{2}$ on the y-axis) and other percentiles.

4. Measures of Dispersion

Dispersion tells us how spread out the data is. Two data sets can have the same mean but very different spreads.

Range

$$\text{Range} = \text{Maximum value} - \text{Minimum value}$$

Example: Data: 3, 7, 2, 9, 5 Range = 9 − 2 = 7

The range is simple but sensitive to extreme values (outliers).

Variance

Variance measures the average squared deviation from the mean.

For Ungrouped Data: $$s^2 = \frac{\sum(x - \bar{x})^2}{n} \quad \text{(population variance)}$$ $$s^2 = \frac{\sum(x - \bar{x})^2}{n-1} \quad \text{(sample variance)}$$

JAMB Note: Most JAMB questions use the population formula ($n$ in denominator). When in doubt, check whether the question specifies “sample” or “population.”

For Grouped Data: $$s^2 = \frac{\sum f(x - \bar{x})^2}{\sum f}$$

Where $x$ = class midpoint, $f$ = frequency

Standard Deviation

Standard deviation is the square root of variance. It gives dispersion in the same units as the original data.

For Ungrouped Data: $$s = \sqrt{\frac{\sum(x - \bar{x})^2}{n}}$$

For Grouped Data: $$s = \sqrt{\frac{\sum f(x - \bar{x})^2}{\sum f}}$$

Worked Example — Ungrouped Data:

Data: 6, 8, 10, 12, 14

Step 1: Find the mean $$\bar{x} = \frac{6+8+10+12+14}{5} = \frac{50}{5} = 10$$

Step 2: Find deviations and squared deviations

$x$$x - \bar{x}$$(x - \bar{x})^2$
6−416
8−24
1000
1224
14416
Total40

Step 3: Calculate standard deviation $$s = \sqrt{\frac{40}{5}} = \sqrt{8} = 2.83$$


🔴 Extended — Deep Study (3mo+)

Comprehensive coverage for students on a longer study timeline.

Statistics — Comprehensive Notes

5. The Relationship Between Mean, Median, and Mode

For a symmetric (normal) distribution: $$\text{Mean} = \text{Median} = \text{Mode}$$

For a positively skewed distribution (tail to the right): $$\text{Mean} > \text{Median} > \text{Mode}$$

For a negatively skewed distribution (tail to the left): $$\text{Mean} < \text{Median} < \text{Mode}$$

This relationship is useful for checking your answers. If your calculated mean is less than your calculated median, you likely made an error.

6. Combined Mean

If two or more groups have been studied separately, their combined mean can be calculated:

$$\bar{x}_{\text{combined}} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + n_3\bar{x}_3 + \ldots}{n_1 + n_2 + n_3 + \ldots}$$

Example: Group A has 30 students with mean score 65. Group B has 20 students with mean score 72. Find the combined mean.

$$\bar{x} = \frac{(30 \times 65) + (20 \times 72)}{30 + 20} = \frac{1950 + 1440}{50} = \frac{3390}{50} = 67.8$$

7. Deciles, Percentiles, and Quartiles

Just as the median divides data into two equal halves:

  • Quartiles divide data into four equal parts: Q₁ (25th percentile), Q₂ (50th = median), Q₃ (75th percentile)
  • Deciles divide data into ten equal parts: D₁, D₂, …, D₉
  • Percentiles divide data into one hundred equal parts

Finding Quartiles from Ogives:

  • Q₁ = value at $\frac{n}{4}$ position
  • Q₂ = value at $\frac{n}{2}$ position (same as median)
  • Q₃ = value at $\frac{3n}{4}$ position

Interquartile Range (IQR): $$\text{IQR} = Q_3 - Q_1$$

The IQR measures the spread of the middle 50% of data, ignoring extreme values.

8. Standard Deviation — Detailed Calculation for Grouped Data

Worked Example:

Class IntervalMidpoint (x)Frequency (f)fx$(x - \bar{x})$$(x - \bar{x})^2$$f(x - \bar{x})^2$
10–141244812−18.75=−6.7545.56182.25
15–1917813617−18.75=−1.753.0624.50
20–2422511022−18.75=3.2510.5652.81
25–292738127−18.75=8.2568.06204.19
Total20375463.75

Mean $\bar{x} = \frac{375}{20} = 18.75$

$$s^2 = \frac{463.75}{20} = 23.1875$$ $$s = \sqrt{23.1875} = 4.82$$

9. JAMB Past Questions — Worked Examples

Question 1 (Mean — Ungrouped): Find the mean of: 12, 15, 18, 22, 25

$$\bar{x} = \frac{12+15+18+22+25}{5} = \frac{92}{5} = 18.4$$

Question 2 (Mean — Grouped Continuous):

Class0–45–910–1415–19
Freq3575

Class midpoints: 2, 7, 12, 17 $$\bar{x} = \frac{(3\times2)+(5\times7)+(7\times12)+(5\times17)}{3+5+7+5} = \frac{6+35+84+85}{20} = \frac{210}{20} = 10.5$$

Question 3 (Median — Grouped):

Class20–2930–3940–4950–59
Freq8152512

$n = 60$, $\frac{n}{2} = 30$ Cumulative frequencies: 8, 23, 48, 60 Median class = 40–49 (c.f. first exceeds 30) $L = 39.5$, $c.f. = 23$, $f = 25$, $w = 10$

$$M_e = 39.5 + \left[\frac{30-23}{25}\right] \times 10 = 39.5 + 2.8 = 42.3$$

Question 4 (Mode — Grouped):

Class0–910–1920–2930–39
Freq412208

Modal class = 20–29 (highest frequency 20) $L = 19.5$, $f_0 = 12$, $f_1 = 20$, $f_2 = 8$, $w = 10$ $d_1 = 20-12 = 8$, $d_2 = 20-8 = 12$

$$M_o = 19.5 + \frac{8}{8+12} \times 10 = 19.5 + \frac{8}{20} \times 10 = 19.5 + 4 = 23.5$$

10. Exam Tips and Common Mistakes

⚠️ Common Mistakes to Avoid:

  1. Using raw class values instead of class midpoints: For continuous grouped data mean and standard deviation, you MUST use $(L + U)/2$ as $x$. Using the lower limit will give the wrong answer every time.

  2. Wrong cumulative frequency for median: The c.f. in the median formula is the cumulative frequency of ALL classes BEFORE the median class. Not the c.f. of the median class itself.

  3. Forgetting to arrange ungrouped data first: The median requires data to be in ascending order. Skipping this step guarantees a wrong answer.

  4. Using population formula for a sample: If the question says “sample,” use $n-1$ in the denominator of variance/standard deviation. If it doesn’t specify, use $n$.

  5. Incorrect class boundaries: When finding class boundaries for continuous data, subtract 0.5 from the lower limit and add 0.5 to the upper limit (for whole number data).

  6. Mode formula errors: Remember $d_1 = f_1 - f_0$ and $d_2 = f_1 - f_2$. Some students reverse these. $d_1$ is always the difference with the class BEFORE the modal class.

  7. Wrong position for median (ungrouped): Use $\frac{n+1}{2}$, NOT $\frac{n}{2}$. Only in the grouped median formula do you use $\frac{n}{2}$.

💡 JAMB-Specific Tips:

  • Time management: Statistics questions are often计算-heavy. Show your working clearly so you can check for errors without starting over.
  • Answer format: JAMB expects exact answers where possible. Simplify square roots (e.g., $\sqrt{50} = 5\sqrt{2}$) or leave as decimals to 2–3 significant figures.
  • Sketching helps: Even for calculation questions, quickly sketch the frequency distribution. It helps you spot the modal class and median class correctly.
  • Check for skewed data: If a question mentions “skewed distribution,” expect questions about the relationship between mean, median, and mode.
  • Units matter: Always include units where applicable (e.g., kg, cm, Naira). This can save marks in theory questions.

🎯 Study Priority for JAMB:

  1. Must master: Mean (all three types) — appears every year in some form
  2. Must master: Median formula and grouped median — very high frequency
  3. Must master: Mode (grouped) — common in JAMB
  4. High priority: Standard deviation calculations — often combined with mean
  5. High priority: Class boundaries and midpoints — prerequisite for all grouped calculations
  6. Medium priority: Cumulative frequency and ogives — usually 1–2 questions per year
  7. Medium priority: Range and variance — sometimes standalone, sometimes combined
  8. Know: Relationship between mean, median, mode for skewed distributions
  9. Know: Combined mean formula — occasionally tested
  10. Bonus: Quartiles and interquartile range — for comprehensive preparation

📋 Quick Reference Summary

FormulaWhen to Use
$\bar{x} = \frac{\sum x}{n}$Mean of ungrouped/raw data
$\bar{x} = \frac{\sum fx}{\sum f}$Mean of grouped discrete OR continuous (use midpoints)
Median position = $\frac{n+1}{2}$Finding median position in ungrouped data
$M_e = L + [\frac{n/2 - c.f.}{f}] \times w$Median of grouped data
$M_o = L + \frac{d_1}{d_1+d_2} \times w$Mode of grouped data
$d_1 = f_1 - f_0, \quad d_2 = f_1 - f_2$Components of grouped mode
Range = Max − MinSimplest measure of dispersion
$s = \sqrt{\frac{\sum(x-\bar{x})^2}{n}}$Standard deviation (population/ungrouped)
$s = \sqrt{\frac{\sum f(x-\bar{x})^2}{\sum f}}$Standard deviation (grouped)
Class midpoint $x = \frac{L+U}{2}$For continuous grouped data
Class boundary = limit ± 0.5For discrete data grouped into classes

  • math-14: Probability — Often combined with Statistics in exam questions (finding expected values, probability distributions)
  • math-15: Permutations and Combinations — Underpins probability calculations
  • math-4: Algebra — Solving equations is essential for statistical calculations
  • math-6: Plane Geometry — Sometimes combined with statistical diagrams
  • math-17: Data Collection and Presentation — Bar charts, histograms, pie charts — precursor to Statistics analysis

Content adapted based on your selected roadmap duration. Switch tiers using the pill selector above.

📐 Diagram Reference

Statistical diagram showing frequency distribution histogram, bar chart, mean median mode illustration, data visualization style, clean black and white

Diagrams are generated per-topic using AI. Support for AI-generated educational diagrams coming soon.