Commerce Stream 3% exam weight

Business Statistics and Data Analysis

Part of the A/L Examination (Sri Lanka) study roadmap: Commerce Stream, topic commer-004.

🟢 Lite — Quick Review (1h–1d)

Rapid summary for last-minute revision before your exam.

Business Statistics and Data Analysis — Key Facts for Sri Lanka A/L Examination

Descriptive Statistics:

  • Mean: Average (sum of values ÷ number of values)
  • Median: Middle value when data is arranged in order
  • Mode: Most frequently occurring value
  • Range: Maximum - Minimum

Measures of Dispersion:

  • Variance: Average of squared deviations from mean
  • Standard Deviation: √variance (most useful)
  • Range: Max - Min

Key Formulas:

Mean (x̄) = Σx / n
Variance (σ²) = Σ(x - x̄)² / n
Standard Deviation (σ) = √[Σ(x - x̄)² / n]

A/L Exam Tip: Standard deviation is in the same units as data — use this to compare values within the same dataset!


🟡 Standard — Regular Study (2d–2mo)

Standard content for students with a few days to months.

Business Statistics and Data Analysis — Detailed Study Guide

Data Types and Collection

Types of Data:

Type | Description | Examples | Analysis
Primary | Collected firsthand | Surveys, experiments | More control, specific to needs
Secondary | Already collected by others | Census data, company reports | Faster, cheaper
Quantitative | Numerical | Revenue, number of employees | Statistical analysis
Qualitative | Descriptive, non-numerical | Customer feedback, product types | Coding and categorisation

Data Collection Methods:

Survey Methods:

Method | Description | Pros | Cons
Questionnaire | Written questions | Low cost, large samples | Low response
Interview | Oral questioning | Deep data, high response | Expensive, slow
Telephone survey | Phone-based | Moderate cost | Declining response
Online survey | Digital distribution | Fast, cheap | Sample bias

Sampling Techniques:

Probability Sampling (random selection):

Method | Description | When to Use
Simple random | Every member has an equal chance of selection | No pre-existing groups
Systematic | Every kth member | Large populations
Stratified | Random sample from each stratum | Known subgroups
Cluster | Clusters chosen at random, then every member of each chosen cluster | Geographically dispersed populations

Non-Probability Sampling:

Method | Description | Limitation
Convenience | Readily available | Sample bias
Quota | Meet quotas for characteristics | Non-random
Purposive | Selected for specific criteria | Researcher judgment

Sri Lankan Data Sources:

  • Department of Census and Statistics
  • Central Bank of Sri Lanka
  • Sri Lanka Customs
  • Line Ministry publications
  • World Bank, ADB databases

Frequency Distributions

Constructing Frequency Distributions:

Step 1: Decide on number of classes:

  • Generally 5-15 classes
  • Too few = lose detail
  • Too many = messy

Step 2: Calculate class width:

Class Width = (Max value - Min value) / Number of classes

Step 3: Determine class boundaries:

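  • Lower limit of first class = minimum value
  • Upper limit = lower limit + class width
  • Avoid overlapping: each value must fall in exactly one class (for example, count a value of 60 in 60-70, not in 50-60)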

Step 4: Tally and count:

  • Tally each data point into appropriate class
  • Count tallies for frequency
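
The four steps above can be sketched in a few lines of Python. The sales figures here are hypothetical (they are not the data behind the example table), and the rule that a boundary value belongs to the higher class is an assumption made for illustration.

```python
import math

# Hypothetical monthly sales figures (Rs. '000); illustrative only.
sales = [50, 57, 61, 64, 68, 70, 73, 75, 79, 82, 88, 91, 95, 99]

# Step 1: choose the number of classes.
num_classes = 5

# Step 2: class width = (max - min) / number of classes, rounded up
# so that the classes cover the whole range.
lowest, highest = min(sales), max(sales)
class_width = math.ceil((highest - lowest) / num_classes)

# Steps 3 and 4: place each value in its class and count.
# Assumption: a value on a boundary goes into the higher class.
frequencies = [0] * num_classes
for value in sales:
    index = min((value - lowest) // class_width, num_classes - 1)
    frequencies[index] += 1

for i, freq in enumerate(frequencies):
    lower = lowest + i * class_width
    print(f"{lower}-{lower + class_width}: {freq}")
```

Rounding the class width up to a whole number keeps the class limits easy to read, which is usually preferable in an exam answer.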

Example - Monthly Sales (in thousands Rs.):

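Class | Tally | Frequency
50-60 | IIII | 4
60-70 | IIII/ IIII/ | 10
70-80 | IIII/ IIII/ IIII/ | 15
80-90 | IIII/ III | 8
90-100 | IIII | 4
Total | | 41

(In the Tally column, IIII/ stands for a struck-through group of five.)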

Relative Frequency:

Relative Frequency = Class Frequency / Total Frequency

Cumulative Frequency:

Class | Frequency | Cumulative Frequency
50-60 | 4 | 4
60-70 | 10 | 14
70-80 | 15 | 29
80-90 | 8 | 37
90-100 | 4 | 41
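
As a quick check on the monthly sales example, the sketch below recomputes relative and cumulative frequencies from the class frequencies in the table. The variable names are illustrative only.

```python
from itertools import accumulate

# Class frequencies from the monthly sales example.
classes = ["50-60", "60-70", "70-80", "80-90", "90-100"]
frequencies = [4, 10, 15, 8, 4]

total = sum(frequencies)

# Relative frequency = class frequency / total frequency.
relative = [f / total for f in frequencies]

# Cumulative frequency = running total of the class frequencies.
cumulative = list(accumulate(frequencies))

for name, f, rel, cum in zip(classes, frequencies, relative, cumulative):
    print(f"{name}: f = {f}, relative = {rel:.3f}, cumulative = {cum}")
```

The last cumulative frequency should always equal the total frequency, and the relative frequencies should add up to 1.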

Graphical Presentation:

  • Histogram: Bar graph of frequency distribution (no gaps)
  • Frequency polygon: Line graph connecting midpoints
  • Ogive: Cumulative frequency line graph
  • Pie chart: Proportional representation
  • Bar chart: Comparing categories

Measures of Central Tendency

The Mean (Average):

Simple Mean:

x̄ = (Σx) / n
where Σx = sum of all values
      n = number of values

Weighted Mean:

x̄w = (Σwx) / (Σw)
where w = weights
      x = values

Example - Weighted Average Cost:

Item | Cost (Rs.) | Quantity | Total (Rs.)
Item A | 100 | 20 | 2,000
Item B | 150 | 30 | 4,500
Item C | 200 | 50 | 10,000
Total | | 100 | 16,500

Weighted Mean = 16,500 / 100 = Rs. 165
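
The same weighted average can be reproduced in Python, with the quantities acting as the weights. This is a minimal sketch using the three items from the table.

```python
# Costs (Rs.) and quantities from the weighted average cost example.
costs = [100, 150, 200]
quantities = [20, 30, 50]  # the weights

# Weighted mean = sum(w * x) / sum(w)
weighted_total = sum(c * q for c, q in zip(costs, quantities))
weighted_mean = weighted_total / sum(quantities)

print(f"Weighted mean cost: Rs. {weighted_mean:.0f}")
```

The unweighted mean of the three costs is Rs. 150, so the weighted figure is higher because the most expensive item has the largest quantity.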

The Median:

  • Middle value when data arranged in order
  • For odd n: Middle value
  • For even n: Average of two middle values

Example:

  • Data: 10, 15, 20, 25, 30
  • Median = 20 (middle of 5 values)

Example with even n:

  • Data: 10, 15, 20, 25
  • Median = (15 + 20) / 2 = 17.5

The Mode:

  • Most frequently occurring value
  • Can have no mode, one mode, or multiple modes
  • Useful for categorical data
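
Python's standard `statistics` module can confirm the two median examples above. The shoe-size lists used for the mode are hypothetical and only there to show the one-mode and two-mode cases.

```python
import statistics

# Median examples from the notes.
odd_data = [10, 15, 20, 25, 30]
even_data = [10, 15, 20, 25]
median_odd = statistics.median(odd_data)    # middle value
median_even = statistics.median(even_data)  # average of the two middle values

# Hypothetical shoe sizes sold, used to illustrate the mode.
one_mode = statistics.multimode([7, 8, 8, 9, 8, 10, 9])
two_modes = statistics.multimode([7, 7, 8, 8, 9])

print(median_odd, median_even, one_mode, two_modes)
```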

Comparing Measures:

Measure | Best For | Limitation
Mean | Interval/ratio data, symmetry | Sensitive to extreme values
Median | Ordinal data, skewed distributions | Ignores magnitude of values
Mode | Modal category, most common item | May not exist or have multiple

Skewness:

  • Symmetric: Mean ≈ Median ≈ Mode
  • Positively skewed (right): Mean > Median > Mode
  • Negatively skewed (left): Mean < Median < Mode

Measures of Dispersion

Why Dispersion Matters:

  • Two datasets can have same mean but different spreads
  • Example: both datasets below have a mean profit of Rs. 100,000
    • Dataset 1: 90,000; 100,000; 110,000 (consistent)
    • Dataset 2: 0; 100,000; 200,000 (risky)

Range:

Range = Maximum value - Minimum value

Variance and Standard Deviation:

Population Variance:

σ² = Σ(x - x̄)² / n

Sample Variance:

s² = Σ(x - x̄)² / (n - 1)

Standard Deviation:

σ = √[Σ(x - x̄)² / n]  (population)
s = √[Σ(x - x̄)² / (n - 1)]  (sample)

Coefficient of Variation:

CV = (Standard Deviation / Mean) × 100%
  • Allows comparison of variability between datasets
  • Useful for comparing different scales
  • Example: CV = 5% vs CV = 15% → first more consistent
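
To see the coefficient of variation at work, the sketch below applies it to the two profit datasets listed under "Why Dispersion Matters". It uses the population standard deviation (dividing by n), which is an assumption; the sample formula would give slightly larger values.

```python
import statistics

# Profit datasets from "Why Dispersion Matters" (Rs.).
dataset_1 = [90_000, 100_000, 110_000]
dataset_2 = [0, 100_000, 200_000]

def coefficient_of_variation(values):
    """CV = (standard deviation / mean) x 100, using the population SD."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

cv_1 = coefficient_of_variation(dataset_1)
cv_2 = coefficient_of_variation(dataset_2)

print(f"Dataset 1 CV: {cv_1:.1f}%")
print(f"Dataset 2 CV: {cv_2:.1f}%")
```

Both datasets have the same mean, yet the second has a CV roughly ten times larger, which is why it is the riskier of the two.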

Standard Deviation Calculation Step by Step:

Data: 10, 20, 30, 40, 50

  1. Calculate mean: (10+20+30+40+50)/5 = 150/5 = 30
  2. Calculate deviations: -20, -10, 0, 10, 20
  3. Square deviations: 400, 100, 0, 100, 400
  4. Sum of squares: 1000
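  5. Variance (population formula, dividing by n): 1000/5 = 200
  6. Standard deviation: √200 = 14.14
     (Treating the data as a sample instead: variance = 1000/4 = 250, standard deviation = √250 = 15.81)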
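
The step-by-step calculation above can be mirrored line by line in Python. This sketch follows the population formula used in the steps and also shows the sample version for comparison.

```python
import math

data = [10, 20, 30, 40, 50]
n = len(data)

# Step 1: mean
mean = sum(data) / n

# Steps 2 and 3: deviations from the mean, then squared deviations
deviations = [x - mean for x in data]
squared = [d ** 2 for d in deviations]

# Step 4: sum of squares
sum_of_squares = sum(squared)

# Steps 5 and 6: population variance and standard deviation (divide by n)
population_variance = sum_of_squares / n
population_sd = math.sqrt(population_variance)

# For comparison: sample variance divides by n - 1
sample_variance = sum_of_squares / (n - 1)

print(mean, sum_of_squares, population_variance, round(population_sd, 2))
```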

🔴 Extended — Deep Study (3mo+)

Comprehensive coverage for students on a longer study timeline.

Business Statistics and Data Analysis — Complete Notes for A/L Sri Lanka

Correlation and Regression

Correlation Analysis:

  • Measures relationship between two variables
  • Does NOT imply causation

Scatter Diagram:

  • Plot one variable on x-axis, other on y-axis
  • Shows direction (positive/negative) and strength of relationship

Correlation Coefficient (r):

r = [nΣxy - (Σx)(Σy)] / √[(nΣx² - (Σx)²)(nΣy² - (Σy)²)]

Interpreting r:

Value of r | Interpretation
+1.0 | Perfect positive correlation
+0.7 to +0.9 | Strong positive
+0.4 to +0.6 | Moderate positive
+0.1 to +0.3 | Weak positive
0.0 | No correlation
-0.1 to -0.3 | Weak negative
-0.4 to -0.6 | Moderate negative
-0.7 to -0.9 | Strong negative
-1.0 | Perfect negative correlation

Coefficient of Determination (r²):

  • Proportion of variation explained by the relationship
  • r² = 0.64 → 64% of variation explained
  • r² = 0.25 → Only 25% explained

Regression Analysis:

Linear Regression Line:

ŷ = a + bx
where:
b = slope = [nΣxy - (Σx)(Σy)] / [nΣx² - (Σx)²]
a = intercept = ȳ - bx̄

Least Squares Principle:

  • Line that minimises sum of squared vertical distances from points to line
  • “Best fit” line

Using Regression for Prediction:

  1. Plot scatter diagram
  2. Verify linear relationship
  3. Calculate regression equation
  4. Substitute x value to predict y

Example - Sales and Advertising:

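Advertising (Rs. '000) | Sales (Rs. '000)
10 | 50
20 | 70
30 | 85
40 | 100
50 | 115

  • Positive correlation expected: sales rise steadily as advertising rises
  • Least squares gives b = 1.6 and a = 36, so ŷ = 36 + 1.6x
  • Predict sales at Rs. 35,000 advertising: ŷ = 36 + 1.6(35) = 92, i.e. approximately Rs. 92,000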
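
As a worked check, the sketch below applies the slope, intercept and correlation formulas from this section to the advertising and sales figures. It is a plain-Python illustration of the formulas, not a substitute for showing the working by hand in the exam.

```python
import math

# Data from the sales and advertising example (both in Rs. '000).
advertising = [10, 20, 30, 40, 50]
sales = [50, 70, 85, 100, 115]

n = len(advertising)
sum_x, sum_y = sum(advertising), sum(sales)
sum_xy = sum(x * y for x, y in zip(advertising, sales))
sum_x2 = sum(x * x for x in advertising)
sum_y2 = sum(y * y for y in sales)

# Slope b and intercept a of the least squares line y-hat = a + b*x.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * (sum_x / n)

# Correlation coefficient r.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# Predicted sales when advertising is Rs. 35,000 (x = 35).
predicted = a + b * 35

print(f"y-hat = {a:.1f} + {b:.1f}x, r = {r:.3f}, r^2 = {r * r:.3f}")
print(f"Predicted sales at x = 35: {predicted:.1f} (Rs. '000)")
```

The fitted line is ŷ = 36 + 1.6x, so the prediction at x = 35 is 92 (Rs. '000), and r is very close to +1, indicating a strong positive linear relationship.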

Assumptions of Regression:

  • Linear relationship
  • Homoscedasticity (equal variance)
  • Normal distribution of errors
  • Independence of observations

Index Numbers

What are Index Numbers?:

  • Measure of change over time relative to base period
  • Base period = 100 (or 1000 for some systems)
  • Subsequent periods show percentage change

Simple Price Index:

Price Index = (Price in current year / Price in base year) × 100

Simple Price Index Example:

  • Rice price in 2020: Rs. 90/kg
  • Rice price in 2023: Rs. 250/kg
  • Index = (250 / 90) × 100 = 277.8
  • Price increased 177.8% from base year

Laspeyres Index (using base year quantities):

PLI = (ΣP₁Q₀) / (ΣP₀Q₀) × 100
where:
P₁ = current price
P₀ = base price
Q₀ = base quantity

Paasche Index (using current year quantities):

PPI = (ΣP₁Q₁) / (ΣP₀Q₁) × 100
where:
Q₁ = current quantity

Fisher’s Ideal Index (geometric mean):

Fisher = √(Laspeyres × Paasche)
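
The index formulas above can be compared side by side in a short sketch. The rice prices come from the simple price index example; the second item ("dhal") and all quantities are hypothetical values chosen only to illustrate the calculation.

```python
import math

# Simple price index for rice (Rs. 90 in the base year, Rs. 250 now).
rice_index = 250 / 90 * 100

# Hypothetical basket: item -> (P0, P1, Q0, Q1).
# Only the rice prices come from the notes; everything else is assumed.
basket = {
    "rice": (90, 250, 10, 8),
    "dhal": (200, 400, 4, 5),
}

# Laspeyres: current vs base prices, weighted by base-year quantities.
laspeyres = (
    sum(p1 * q0 for p0, p1, q0, q1 in basket.values())
    / sum(p0 * q0 for p0, p1, q0, q1 in basket.values())
    * 100
)

# Paasche: current vs base prices, weighted by current-year quantities.
paasche = (
    sum(p1 * q1 for p0, p1, q0, q1 in basket.values())
    / sum(p0 * q1 for p0, p1, q0, q1 in basket.values())
    * 100
)

# Fisher: geometric mean of the two.
fisher = math.sqrt(laspeyres * paasche)

print(f"Rice index: {rice_index:.1f}")
print(f"Laspeyres: {laspeyres:.1f}, Paasche: {paasche:.1f}, Fisher: {fisher:.1f}")
```

Because Fisher's index is a geometric mean, it always lies between the Laspeyres and Paasche values.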

Consumer Price Index (CPI) in Sri Lanka:

  • Tracks cost of consumer basket over time
  • Released monthly by Department of Census and Statistics
  • Base period: 2013 = 100
  • Weighted by expenditure shares

Sri Lanka CPI Categories:

  • Food and non-alcoholic beverages (~35%)
  • Clothing and footwear (~4%)
  • Housing, water, electricity, gas (~22%)
  • Transport (~12%)
  • Communication (~4%)
  • Other categories (~23%)

Index Number Applications:

Index | What it Measures | Sri Lankan Source
CPI | Consumer price changes | Census & Statistics Dept
GDP Deflator | All domestic prices | Central Bank
Colombo CPI | Urban consumer prices | Dept of Census
All Item Index | Broad inflation | Central Bank
Producer Price Index | Wholesale prices | Census & Statistics

Real vs. Nominal Values:

Real Value = Nominal Value / Price Index × 100

Example:

  • Nominal wage 2020: Rs. 50,000
  • CPI 2020: 130 (base 2013=100)
  • CPI 2023: 250 (base 2013=100)
  • Real wage in base-year (2013) prices: 50,000 / 130 × 100 = Rs. 38,462
  • 2020 wage expressed in 2023 prices: 50,000 × (250/130) = Rs. 96,154
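
A small sketch of both conversions, using the wage and CPI figures from the example. The function names are hypothetical and chosen for readability.

```python
def to_base_year_prices(nominal, price_index):
    """Real value = nominal value / price index x 100 (base-year prices)."""
    return nominal / price_index * 100

def reprice(nominal, index_from, index_to):
    """Express a nominal amount from one period in another period's prices."""
    return nominal * index_to / index_from

wage_2020 = 50_000
cpi_2020, cpi_2023 = 130, 250  # base 2013 = 100

real_wage_2013_prices = to_base_year_prices(wage_2020, cpi_2020)
wage_in_2023_prices = reprice(wage_2020, cpi_2020, cpi_2023)

print(f"In 2013 prices: Rs. {real_wage_2013_prices:,.0f}")
print(f"In 2023 prices: Rs. {wage_in_2023_prices:,.0f}")
```

The second figure can be read as the wage a worker would need in 2023 to buy what Rs. 50,000 bought in 2020.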

Time Series Analysis

Components of Time Series:

Component | Description | Example
Trend | Long-term movement | GDP growth over decades
Seasonal | Regular pattern within year | Higher rice prices before harvest
Cyclical | Business cycle fluctuations | Expansion and recession
Irregular/Random | Unpredictable | Natural disaster impact

Trend Analysis:

Moving Averages:

  • Simple moving average: Average of fixed number of periods
  • 3-year moving average: (Year 1 + Year 2 + Year 3) / 3
  • Smooths out fluctuations to show trend
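
A 3-period simple moving average can be sketched as follows. The annual sales figures are hypothetical and serve only to show the mechanics.

```python
# Hypothetical annual sales figures (Rs. million); illustrative only.
sales = [120, 130, 125, 140, 150, 145]

def moving_average(values, window):
    """Return the simple moving average for each full window of values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

three_year_ma = moving_average(sales, 3)

for i, value in enumerate(three_year_ma):
    # Each average is centred on the middle year of its window.
    print(f"Years {i + 1}-{i + 3}: {value:.1f}")
```

Six observations give only four 3-year averages, because the first and last years have no complete window centred on them.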

Linear Trend (Least squares):

T = a + bt
where:
t = time period
b = slope = [nΣtY - (Σt)(ΣY)] / [nΣt² - (Σt)²]
a = intercept = Ȳ - bt̄

Forecasting:

  • Use trend equation to extrapolate
  • Cautions: Past trends may not continue
  • External factors can change patterns

Example - Tea Production Trend:

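Year | Production (million kg)
2019 | 300
2020 | 285
2021 | 295
2022 | 310
2023 | 320

  • Calculating trend: with t = 1 for 2019, least squares gives T = 282.5 + 6.5t, a slight upward trend of about 6.5 million kg per year
  • Projecting 2024 (t = 6): T = 282.5 + 6.5(6) = 321.5, i.e. approximately 322 million kg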
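
Applying the linear trend formulas to the tea production figures gives the fitted line directly. The coding t = 1 for 2019 is an assumption; any consistent coding of the years gives the same forecast.

```python
# Tea production example (million kg).
years = [2019, 2020, 2021, 2022, 2023]
production = [300, 285, 295, 310, 320]

# Code the years as t = 1, 2, ..., 5 (assumed coding).
t = list(range(1, len(years) + 1))
n = len(t)

sum_t, sum_y = sum(t), sum(production)
sum_ty = sum(ti * yi for ti, yi in zip(t, production))
sum_t2 = sum(ti * ti for ti in t)

# Least squares trend line T = a + b*t.
b = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)
a = sum_y / n - b * (sum_t / n)

# Extrapolate one period ahead: 2024 corresponds to t = 6.
forecast_2024 = a + b * 6

print(f"T = {a:.1f} + {b:.1f}t")
print(f"Projected 2024 production: {forecast_2024:.1f} million kg")
```

The least squares line for these figures is T = 282.5 + 6.5t, which projects 321.5 million kg for 2024. As with any extrapolation, this assumes the past trend continues.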

Probability and Business Decisions

Basic Probability:

Classical Probability:

P(Event) = Number of favourable outcomes / Total possible outcomes

Example: Rolling a 6 on a die = 1/6

Empirical Probability:

  • Based on relative frequency
  • P(Event) = Frequency of event / Total frequency

Subjective Probability:

  • Personal judgment or expert opinion
  • Used when classical/empirical not possible

Probability Rules:

Rule | Formula | Application
Addition | P(A or B) = P(A) + P(B) - P(A and B) | Mutually exclusive vs. non-exclusive
Multiplication | P(A and B) = P(A) × P(B|A) | Dependent vs. independent events
Complement | P(not A) = 1 - P(A) | At least one success
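
Each rule can be illustrated with a standard playing-card or dice example. These examples are not from the notes; they are familiar textbook cases, worked with exact fractions to avoid rounding.

```python
from fractions import Fraction

# Addition rule: P(king or heart) in a standard 52-card deck.
# The events overlap (the king of hearts), so subtract P(A and B).
p_king = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_king_and_heart = Fraction(1, 52)
p_king_or_heart = p_king + p_heart - p_king_and_heart

# Multiplication rule: two aces in a row without replacement.
# The second draw depends on the first, so use P(B|A).
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_first_ace * p_second_ace_given_first

# Complement rule: at least one six in two rolls of a fair die.
p_no_six = Fraction(5, 6) ** 2
p_at_least_one_six = 1 - p_no_six

print(p_king_or_heart, p_two_aces, p_at_least_one_six)
```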

Expected Value:

E(X) = Σ[x × P(x)]

Business Application - Decision Making:

Outcome | Probability | Profit (Rs.) | Expected Profit (Rs.)
High demand | 0.3 | 500,000 | 150,000
Medium demand | 0.5 | 200,000 | 100,000
Low demand | 0.2 | -100,000 | -20,000
Expected Value | | | 230,000

Risk Analysis:

  • Standard deviation of outcome values
  • Coefficient of variation
  • Higher CV = more risk
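
The expected value and the risk measures listed above can be computed together for the demand example. This sketch assumes the variance of a discrete outcome is Σ[P(x) × (x − E(X))²], which is the standard definition but is not written out in the notes.

```python
import math

# Demand scenarios from the decision-making example: (label, probability, profit in Rs.)
outcomes = [
    ("High demand", 0.3, 500_000),
    ("Medium demand", 0.5, 200_000),
    ("Low demand", 0.2, -100_000),
]

# Expected value: E(X) = sum of x * P(x).
expected_profit = sum(p * x for _, p, x in outcomes)

# Risk: probability-weighted variance, standard deviation, and CV.
variance = sum(p * (x - expected_profit) ** 2 for _, p, x in outcomes)
std_dev = math.sqrt(variance)
cv = std_dev / expected_profit * 100

print(f"Expected profit: Rs. {expected_profit:,.0f}")
print(f"Standard deviation: Rs. {std_dev:,.0f}")
print(f"Coefficient of variation: {cv:.1f}%")
```

Here the standard deviation (about Rs. 210,000) is nearly as large as the expected profit, so the CV is above 90% and the project is high-risk despite its positive expected value.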

Business Application: Descriptive Statistics for Business

Commonly Used Business Statistics in Sri Lanka:

Sales Analysis:

  • Average daily sales
  • Sales growth rate
  • Market share calculation
  • Seasonal variations

Financial Ratios:

  • Liquidity ratios (using averages)
  • Profitability (using mean profit margins)
  • Efficiency (using turnover ratios)

Quality Control:

  • Mean, standard deviation of product dimensions
  • Control charts using ±3 standard deviations
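
A simplified illustration of ±3 standard deviation control limits follows. The measurements are hypothetical, and using the sample standard deviation of individual readings is an assumption; practical control charts often estimate variation from subgroups instead.

```python
import statistics

# Hypothetical product dimensions (cm) from a production run; illustrative only.
measurements = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]

centre_line = statistics.mean(measurements)
sd = statistics.stdev(measurements)  # sample standard deviation (assumed choice)

# Control limits at mean +/- 3 standard deviations.
upper_limit = centre_line + 3 * sd
lower_limit = centre_line - 3 * sd

# Any reading outside the limits signals a process that may be out of control.
out_of_control = [m for m in measurements if not lower_limit <= m <= upper_limit]

print(f"Centre line: {centre_line:.2f}")
print(f"Limits: {lower_limit:.2f} to {upper_limit:.2f}")
print(f"Out-of-control points: {out_of_control}")
```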

Sri Lankan Statistical Resources:

  • Central Bank Statistical Bulletin
  • Department of Census and Statistics publications
  • Sri Lanka Customs trade data
  • Annual reports of listed companies
  • World Bank Development Indicators

A/L Exam Tip: Statistics questions require practice with numbers. Know your formulas, show your workings, and interpret what your calculations mean in context!

