Data Interpretation & Statistics
Concept
Data Interpretation and Statistics together form one of the highest-scoring and most time-pressured sections in SSC CGL Tier 2. The questions test two distinct but complementary skill sets: the ability to read and extract information from charts and tables (Data Interpretation), and the ability to compute and understand statistical measures of central tendency and dispersion (Statistics proper).
The three measures of central tendency answer three different questions about a dataset. Mean answers “what is the average?” — sum all values and divide by count. Mean is sensitive to extreme values (outliers) because every value contributes to it. Median answers “what is the middle?” — sort data, find the central value. Median is robust to outliers and better represents skewed distributions. Mode answers “what occurs most?” — the value with highest frequency. Mode is especially useful for categorical data where you can’t compute an average.
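To make the outlier effect concrete, here is a minimal sketch using Python's standard `statistics` module. The marks are hypothetical values chosen only for illustration; they do not come from any SSC paper.

```python
from statistics import mean, median, multimode

# Hypothetical marks for five students (illustrative values only).
marks = [40, 45, 50, 50, 55]
# The same data plus one extreme value (an outlier).
marks_with_outlier = marks + [500]

for label, data in (("original", marks), ("with outlier", marks_with_outlier)):
    # multimode returns every most-frequent value, so it also covers
    # datasets with more than one mode.
    print(label,
          "| mean:", round(mean(data), 2),
          "| median:", median(data),
          "| modes:", multimode(data))
```

The single outlier pulls the mean well above every ordinary value, while the median and mode both stay at 50.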
Standard deviation measures how spread out the data is from the mean. A low standard deviation means data clusters near the mean; a high standard deviation means data is widely scattered. For grouped data, standard deviation is calculated using the assumed mean method or step deviation method, which reduce arithmetic complexity. Variance is simply the square of standard deviation.
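The step deviation method can be sketched as follows, using the frequency table from the worked example in this section. The assumed mean `A = 50` is an arbitrary convenient midpoint, and the sketch assumes the population convention (dividing by N), which may not match every textbook.

```python
from math import sqrt

# Midpoints and frequencies from this section's worked example.
midpoints = [10, 30, 50, 70, 90]
frequencies = [5, 15, 30, 35, 15]

assumed_mean = 50   # A: any convenient midpoint (a free choice)
class_width = 20    # h: common width of every class

# Step deviations u = (x - A) / h keep the numbers small: -2, -1, 0, 1, 2.
steps = [(x - assumed_mean) / class_width for x in midpoints]

n = sum(frequencies)
sum_fu = sum(f * u for f, u in zip(frequencies, steps))
sum_fu2 = sum(f * u * u for f, u in zip(frequencies, steps))

mean = assumed_mean + class_width * sum_fu / n
# Population variance (divide by N); this convention is an assumption.
variance = class_width ** 2 * (sum_fu2 / n - (sum_fu / n) ** 2)
std_dev = sqrt(variance)

print("mean:", mean)
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```

By hand, Σfu = 40 and Σfu² = 130, so the mean is 50 + 20 × 0.4 = 58, the variance is 400 × (1.3 − 0.16) = 456, and the standard deviation is √456 ≈ 21.35.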
Data Interpretation questions in SSC come as pie charts (showing parts of a whole as percentages), bar graphs (comparing categories), line graphs (showing trends over time), and tables (combining multiple data series). The skill is not just calculation — it is reading the visual correctly, identifying what is being asked, and knowing which operation to perform (percentage change, ratio, average, comparison).
Key Points
- Mean is affected by extreme values — if one value is very high or very low, the mean shifts dramatically; median stays relatively stable.
- Mode can be non-unique — a dataset can have two modes (bimodal) or no mode at all, so don’t assume a unique “most frequent” value exists.
- For pie charts, all percentages must sum to 100% — use this to cross-check before calculating and to find the “other” or unlabelled category.
- Percentage change = (New - Old) / Old × 100 — always identify which is the base (old) value before applying this formula.
- In bar graph comparisons, don’t assume the scale starts at zero: some graphs start the axis at a higher value to show detail, which makes differences between bars look larger than they actually are. Read the actual values off the axis before comparing.
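Two of the checks above, the pie chart cross-check and the percentage change formula, can be sketched as small helpers. The function names and the sample figures are hypothetical, chosen only for illustration.

```python
def missing_pie_share(known_shares):
    """Return the percentage left for the unlabelled slice of a pie chart."""
    remaining = 100 - sum(known_shares)
    if remaining < 0:
        raise ValueError("known shares already exceed 100%")
    return remaining


def percentage_change(old, new):
    """Percentage change from old to new, with old as the base value."""
    if old == 0:
        raise ValueError("base (old) value must be non-zero")
    return (new - old) * 100 / old


# Four labelled slices; the fifth ("other") is whatever remains.
print(missing_pie_share([30, 25, 20, 15]))   # 10

# The base value matters: 80 -> 100 is not the mirror image of 100 -> 80.
print(percentage_change(80, 100))   # 25.0
print(percentage_change(100, 80))   # -20.0
```

The last two lines show why the base must be identified first: the same pair of numbers gives +25% in one direction and −20% in the other.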
Worked Example
Q: The following table shows marks obtained by students in a test out of 100:
| Marks Range | 0-20 | 20-40 | 40-60 | 60-80 | 80-100 |
|---|---|---|---|---|---|
| No. of Students | 5 | 15 | 30 | 35 | 15 |
Find the mean marks (using class midpoint method).
Approach:
- Find midpoint (x) for each class: 10, 30, 50, 70, 90
- Multiply each midpoint by frequency: 50, 450, 1500, 2450, 1350
- Sum of f·x = 5,800; Sum of f = 100
- Mean = 5,800 / 100 = 58 marks
Answer: 58 marks
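The same calculation can be checked with a short script. This is only a sketch of the class midpoint method applied to the table above; the variable names are illustrative.

```python
# Class boundaries and frequencies from the table above.
class_bounds = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
frequencies = [5, 15, 30, 35, 15]

# Midpoint of each class: (lower + upper) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in class_bounds]

# f * x for each class, then mean = sum(f * x) / sum(f).
fx = [f * x for f, x in zip(frequencies, midpoints)]
grouped_mean = sum(fx) / sum(frequencies)

print("midpoints:", midpoints)
print("sum of f*x:", sum(fx))
print("mean:", grouped_mean)   # 58.0
```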
SSC Pattern / Tips
- Pie chart questions are fastest if you convert the question’s percentage to a decimal and multiply by the total — avoid long division.
- Bar graph comparison questions often ask “which year showed the maximum/minimum” — scan visually first, then verify with numbers.
- Tabular DI questions give more raw data than needed — identify only the rows and columns relevant to the specific question before calculating.
- When asked for “average rate of increase/decrease” across periods, calculate the percentage change for each period separately, then take the arithmetic mean of those percentages.
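The pie chart shortcut and the period-by-period averaging from the tips above can be sketched as below. Both helper names and all figures are hypothetical, and the averaging follows the tip as stated (a plain arithmetic mean of the per-period percentages).

```python
def share_of_total(total, percent):
    """Value of a pie slice: the percentage as a fraction, times the total."""
    return total * percent / 100


def average_percentage_change(values):
    """Arithmetic mean of the percentage changes between consecutive periods."""
    changes = [
        (new - old) * 100 / old
        for old, new in zip(values, values[1:])
    ]
    return sum(changes) / len(changes)


# A 15% slice of a hypothetical total of 7200.
print(share_of_total(7200, 15))   # 1080.0

# Hypothetical yearly figures: 200 -> 250 is +25%, 250 -> 300 is +20%.
print(average_percentage_change([200, 250, 300]))   # 22.5
```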
📐 Diagram Reference
A histogram of student marks distribution: Class intervals 0-20 (freq 3), 20-40 (freq 7), 40-60 (freq 15), 60-80 (freq 10), 80-100 (freq 5). Show modal class highlighted.
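For the histogram described above, the modal class is simply the interval with the highest frequency. A minimal sketch, with illustrative variable names:

```python
# Class intervals and frequencies from the histogram description above.
histogram = {
    "0-20": 3,
    "20-40": 7,
    "40-60": 15,
    "60-80": 10,
    "80-100": 5,
}

# The modal class is the interval whose bar is tallest.
modal_class = max(histogram, key=histogram.get)

print("modal class:", modal_class)   # 40-60
```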