Business Statistics and Data Analysis
🟢 Lite — Quick Review (1h–1d)
Rapid summary for last-minute revision before your exam.
Business Statistics and Data Analysis — Key Facts for Sri Lanka A/L Examination
Descriptive Statistics:
- Mean: Average (sum of values ÷ number of values)
- Median: Middle value when data is arranged in order
- Mode: Most frequently occurring value
- Range: Maximum - Minimum
Measures of Dispersion:
- Variance: Average of squared deviations from mean
- Standard Deviation: √variance (the most widely used measure, because it is in the same units as the data)
- Range: Max - Min
Key Formulas:
Mean (x̄) = Σx / n
Variance (σ²) = Σ(x - x̄)² / n
Standard Deviation (σ) = √[Σ(x - x̄)² / n]
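⚡ A/L Exam Tip: Standard deviation is in the same units as the data, so it can be read directly alongside the mean. To compare variability between datasets with different units or scales, use the coefficient of variation instead.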
🟡 Standard — Regular Study (2d–2mo)
Standard content for students with a few days to months.
Business Statistics and Data Analysis — Detailed Study Guide
Data Types and Collection
Types of Data:
| Type | Description | Examples | Key Points |
|---|---|---|---|
| Primary | Collected firsthand | Surveys, experiments | More control, specific to needs |
| Secondary | Already collected by others | Census data, company reports | Faster, cheaper |
| Quantitative | Numerical | Revenue, number of employees | Statistical analysis |
| Qualitative | Descriptive, non-numerical | Customer feedback, product types | Coding and categorisation |
Data Collection Methods:
Survey Methods:
| Method | Description | Pros | Cons |
|---|---|---|---|
| Questionnaire | Written questions | Low cost, large samples | Low response rate |
| Interview | Oral questioning | In-depth data, high response rate | Expensive, slow |
| Telephone survey | Phone-based | Moderate cost | Declining response rates |
| Online survey | Digital distribution | Fast, cheap | Sample bias |
Sampling Techniques:
Probability Sampling (random selection):
| Method | Description | When to Use |
|---|---|---|
| Simple random | Every member has equal chance | No pre-existing groups |
| Systematic | Every kth member | Large populations |
| Stratified | Random sample from each stratum | Known subgroups |
| Cluster | Randomly select clusters, then survey everyone in each chosen cluster | Geographically dispersed populations |
Non-Probability Sampling:
| Method | Description | Limitation |
|---|---|---|
| Convenience | Readily available | Sample bias |
| Quota | Meet quotas for characteristics | Non-random |
| Purposive | Selected for specific criteria | Researcher judgment |
Sri Lankan Data Sources:
- Department of Census and Statistics
- Central Bank of Sri Lanka
- Sri Lanka Customs
- Line Ministry publications
- World Bank, ADB databases
Frequency Distributions
Constructing Frequency Distributions:
Step 1: Decide on number of classes:
- Generally 5-15 classes
- Too few = lose detail
- Too many = messy
Step 2: Calculate class width:
Class Width = (Max value - Min value) / Number of classes
Step 3: Determine class boundaries:
- Lower limit = minimum value
- Upper limit = lower limit + class width
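- Avoid overlapping classes: a value on a boundary (such as 60) must be counted in only one class, so state the convention used (for example, 60-70 means 60 up to but not including 70)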
Step 4: Tally and count:
- Tally each data point into appropriate class
- Count tallies for frequency
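The four steps above can be sketched in a few lines of Python. This is only an illustration: the `sales` figures are invented, and starting the first class at a round number (50) rather than at the exact minimum is an assumption made for tidier class limits.

```python
import math

# Hypothetical monthly sales figures (Rs. thousands), invented for illustration.
sales = [52, 58, 60, 67, 69, 73, 75, 78, 84, 88, 91, 99]

# Step 1: choose the number of classes.
n_classes = 5

# Step 2: class width = (max - min) / number of classes, rounded up
# to the next multiple of 10 for convenience.
raw_width = (max(sales) - min(sales)) / n_classes   # (99 - 52) / 5 = 9.4
width = math.ceil(raw_width / 10) * 10              # 10

# Step 3: start the first class at a round number at or below the minimum.
start = (min(sales) // width) * width               # 50

# Step 4: count the values in each class; a class holds lower <= value < upper,
# so no value can fall into two classes.
frequencies = [0] * n_classes
for value in sales:
    index = min((value - start) // width, n_classes - 1)
    frequencies[index] += 1

for i, f in enumerate(frequencies):
    lower = start + i * width
    print(f"{lower}-{lower + width}: {f}")
```

Note that the value 60 is counted in the 60-70 class, not in 50-60.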
Example - Monthly Sales (in thousands Rs.):
| Class | Tally | Frequency |
|---|---|---|
| 50-60 | IIII | 4 |
| 60-70 | ~~IIII~~ ~~IIII~~ | 10 |
| 70-80 | ~~IIII~~ ~~IIII~~ ~~IIII~~ | 15 |
| 80-90 | ~~IIII~~ III | 8 |
| 90-100 | IIII | 4 |
| Total | | 41 |
Relative Frequency:
Relative Frequency = Class Frequency / Total Frequency
Cumulative Frequency:
| Class | Frequency | Cumulative Frequency |
|---|---|---|
| 50-60 | 4 | 4 |
| 60-70 | 10 | 14 |
| 70-80 | 15 | 29 |
| 80-90 | 8 | 37 |
| 90-100 | 4 | 41 |
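As a quick check on the table above, relative and cumulative frequencies can be computed from the class frequencies. A minimal sketch, using only the figures from the monthly sales example:

```python
from itertools import accumulate

classes = ["50-60", "60-70", "70-80", "80-90", "90-100"]
frequencies = [4, 10, 15, 8, 4]

total = sum(frequencies)                         # total frequency
relative = [f / total for f in frequencies]      # class frequency / total frequency
cumulative = list(accumulate(frequencies))       # running total of frequencies

for c, f, r, cf in zip(classes, frequencies, relative, cumulative):
    print(f"{c:>7}  f={f:>2}  relative={r:.3f}  cumulative={cf}")
```

The last cumulative frequency always equals the total frequency, and the relative frequencies always sum to 1.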
Graphical Presentation:
- Histogram: Bar graph of frequency distribution (no gaps)
- Frequency polygon: Line graph connecting midpoints
- Ogive: Cumulative frequency line graph
- Pie chart: Proportional representation
- Bar chart: Comparing categories
Measures of Central Tendency
The Mean (Average):
Simple Mean:
x̄ = (Σx) / n
where Σx = sum of all values
n = number of values
Weighted Mean:
x̄w = (Σwx) / (Σw)
where w = weights
x = values
Example - Weighted Average Cost:
| Item | Cost (Rs.) | Quantity | Total |
|---|---|---|---|
| Item A | 100 | 20 | 2,000 |
| Item B | 150 | 30 | 4,500 |
| Item C | 200 | 50 | 10,000 |
| Total | | 100 | 16,500 |
Weighted Mean = 16,500 / 100 = Rs. 165
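The same weighted mean can be reproduced in Python. This sketch uses the costs and quantities from the table and also computes the simple mean of the three costs to show why the weights matter:

```python
costs = [100, 150, 200]      # Rs. per unit for Items A, B, C
quantities = [20, 30, 50]    # weights (w)

weighted_total = sum(c * q for c, q in zip(costs, quantities))   # Σwx
total_weight = sum(quantities)                                   # Σw
weighted_mean = weighted_total / total_weight

simple_mean = sum(costs) / len(costs)   # ignores quantities

print(weighted_mean)   # 165.0
print(simple_mean)     # 150.0
```

The weighted mean is higher than the simple mean because the most expensive item is also bought in the largest quantity.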
The Median:
- Middle value when data arranged in order
- For odd n: Middle value
- For even n: Average of two middle values
Example:
- Data: 10, 15, 20, 25, 30
- Median = 20 (middle of 5 values)
Example with even n:
- Data: 10, 15, 20, 25
- Median = (15 + 20) / 2 = 17.5
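The odd and even cases can be written as one short function. A minimal sketch (Python's built-in `statistics.median` gives the same results):

```python
def median(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)        # data must be arranged in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                  # odd n: single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average of two

print(median([10, 15, 20, 25, 30]))   # 20
print(median([10, 15, 20, 25]))       # 17.5
```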
The Mode:
- Most frequently occurring value
- Can have no mode, one mode, or multiple modes
- Useful for categorical data
Comparing Measures:
| Measure | Best For | Limitation |
|---|---|---|
| Mean | Interval/ratio data, symmetric distributions | Sensitive to extreme values |
| Median | Ordinal data, skewed distributions | Ignores magnitude of values |
| Mode | Categorical data, finding the most common item | May not exist, or there may be several |
Skewness:
- Symmetric: Mean ≈ Median ≈ Mode
- Positively skewed (right): Mean > Median > Mode
- Negatively skewed (left): Mean < Median < Mode
Measures of Dispersion
Why Dispersion Matters:
- Two datasets can have same mean but different spreads
- Example: both datasets below have a mean profit of Rs. 100,000
- Dataset 1: 90,000; 100,000; 110,000 (consistent)
- Dataset 2: 0; 100,000; 200,000 (risky)
Range:
Range = Maximum value - Minimum value
Variance and Standard Deviation:
Population Variance:
σ² = Σ(x - x̄)² / n
Sample Variance:
s² = Σ(x - x̄)² / (n - 1)
Standard Deviation:
σ = √[Σ(x - x̄)² / n] (population)
s = √[Σ(x - x̄)² / (n - 1)] (sample)
Coefficient of Variation:
CV = (Standard Deviation / Mean) × 100%
- Allows comparison of variability between datasets
- Useful for comparing different scales
- Example: CV = 5% vs CV = 15% → first more consistent
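To illustrate, the coefficient of variation can be computed for the two profit datasets given at the start of this section. This sketch assumes the population standard deviation (dividing by n); using the sample formula would give larger values but the same ranking.

```python
import statistics

dataset_1 = [90_000, 100_000, 110_000]   # consistent profits
dataset_2 = [0, 100_000, 200_000]        # risky profits

def coefficient_of_variation(values):
    """Population standard deviation as a percentage of the mean."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

cv_1 = coefficient_of_variation(dataset_1)
cv_2 = coefficient_of_variation(dataset_2)

print(f"Dataset 1 CV: {cv_1:.1f}%")
print(f"Dataset 2 CV: {cv_2:.1f}%")
```

Both datasets have the same mean, but the second has a CV about ten times larger, which is what "risky" means in numerical terms.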
Standard Deviation Calculation Step by Step:
Data: 10, 20, 30, 40, 50
- Calculate mean: (10+20+30+40+50)/5 = 150/5 = 30
- Calculate deviations: -20, -10, 0, 10, 20
- Square deviations: 400, 100, 0, 100, 400
- Sum of squares: 1000
- Variance: 1000/5 = 200
- Standard deviation: √200 = 14.14
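The steps above translate directly into code. The sketch below follows the same order and also computes the sample version (dividing by n - 1) for comparison:

```python
import math

data = [10, 20, 30, 40, 50]
n = len(data)

mean = sum(data) / n                                   # 30.0
squared_deviations = [(x - mean) ** 2 for x in data]   # 400, 100, 0, 100, 400
sum_of_squares = sum(squared_deviations)               # 1000.0

population_variance = sum_of_squares / n               # divide by n
sample_variance = sum_of_squares / (n - 1)             # divide by n - 1

population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(f"Population: variance={population_variance}, sd={population_sd:.2f}")
print(f"Sample:     variance={sample_variance}, sd={sample_sd:.2f}")
```

The sample standard deviation is always slightly larger than the population figure for the same data, because the divisor is smaller.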
🔴 Extended — Deep Study (3mo+)
Comprehensive coverage for students on a longer study timeline.
Business Statistics and Data Analysis — Complete Notes for A/L Sri Lanka
Correlation and Regression
Correlation Analysis:
- Measures relationship between two variables
- Does NOT imply causation
Scatter Diagram:
- Plot one variable on x-axis, other on y-axis
- Shows direction (positive/negative) and strength of relationship
Correlation Coefficient (r):
r = [nΣxy - (Σx)(Σy)] / √[(nΣx² - (Σx)²)(nΣy² - (Σy)²)]
Interpreting r:
| Value of r | Interpretation |
|---|---|
| +1.0 | Perfect positive correlation |
| +0.7 to +0.9 | Strong positive |
| +0.4 to +0.6 | Moderate positive |
| +0.1 to +0.3 | Weak positive |
| 0.0 | No correlation |
| -0.1 to -0.3 | Weak negative |
| -0.4 to -0.6 | Moderate negative |
| -0.7 to -0.9 | Strong negative |
| -1.0 | Perfect negative correlation |
Coefficient of Determination (r²):
- Proportion of variation explained by the relationship
- r² = 0.64 → 64% of variation explained
- r² = 0.25 → Only 25% explained
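As a worked illustration of the formula for r, the sketch below uses the advertising and sales figures from the example table later in this section. The variable names are chosen here for readability and are not part of the syllabus notation.

```python
import math

advertising = [10, 20, 30, 40, 50]     # x, Rs. '000
sales = [50, 70, 85, 100, 115]         # y, Rs. '000
n = len(advertising)

sum_x = sum(advertising)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(advertising, sales))
sum_x2 = sum(x ** 2 for x in advertising)
sum_y2 = sum(y ** 2 for y in sales)

# r = [nΣxy - (Σx)(Σy)] / √[(nΣx² - (Σx)²)(nΣy² - (Σy)²)]
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator
r_squared = r ** 2

print(f"r = {r:.3f}, r² = {r_squared:.3f}")
```

Here r is close to +1, so almost all of the variation in sales is explained by the linear relationship with advertising.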
Regression Analysis:
Linear Regression Line:
ŷ = a + bx
where:
b = slope = [nΣxy - (Σx)(Σy)] / [nΣx² - (Σx)²]
a = intercept = ȳ - bx̄
Least Squares Principle:
- Line that minimises sum of squared vertical distances from points to line
- “Best fit” line
Using Regression for Prediction:
- Plot scatter diagram
- Verify linear relationship
- Calculate regression equation
- Substitute x value to predict y
Example - Sales and Advertising:
| Advertising (Rs. '000) | Sales (Rs. '000) |
|---|---|
| 10 | 50 |
| 20 | 70 |
| 30 | 85 |
| 40 | 100 |
| 50 | 115 |
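- The data show a strong positive correlation (r ≈ 0.998)
- Least squares line: b = 8,000 / 5,000 = 1.6 and a = 84 - 1.6 × 30 = 36, so ŷ = 36 + 1.6x
- Predicted sales at Rs. 35,000 advertising: ŷ = 36 + 1.6 × 35 = 92, i.e. approximately Rs. 92,000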
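The regression line for this example can be computed with the slope and intercept formulas given earlier. A minimal sketch; `predict_sales` is a helper name introduced here for illustration:

```python
advertising = [10, 20, 30, 40, 50]     # x, Rs. '000
sales = [50, 70, 85, 100, 115]         # y, Rs. '000
n = len(advertising)

sum_x = sum(advertising)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(advertising, sales))
sum_x2 = sum(x ** 2 for x in advertising)

# b = [nΣxy - (Σx)(Σy)] / [nΣx² - (Σx)²]
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# a = ȳ - b·x̄
a = sum_y / n - b * (sum_x / n)

def predict_sales(x):
    """Predicted sales (Rs. '000) for advertising spend x (Rs. '000)."""
    return a + b * x

print(f"ŷ = {a:.1f} + {b:.1f}x")
print(f"Predicted sales at x = 35: {predict_sales(35):.1f}")
```

The slope means each extra Rs. 1,000 of advertising is associated with about Rs. 1,600 of extra sales within the observed range. Predictions far outside 10 to 50 would be extrapolation and less reliable.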
Assumptions of Regression:
- Linear relationship
- Homoscedasticity (equal variance)
- Normal distribution of errors
- Independence of observations
Index Numbers
What are Index Numbers?:
- Measure of change over time relative to base period
- Base period = 100 (or 1000 for some systems)
- Subsequent periods show percentage change
Simple Price Index:
Price Index = (Price in current year / Price in base year) × 100
Simple Price Index Example:
- Rice price in 2020: Rs. 90/kg
- Rice price in 2023: Rs. 250/kg
- Index = (250 / 90) × 100 = 277.8
- Price increased 177.8% from base year
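The same calculation as a short function, using the rice prices from the example:

```python
def simple_price_index(current_price, base_price):
    """Price index with the base year set to 100."""
    return current_price / base_price * 100

index_2023 = simple_price_index(current_price=250, base_price=90)
percentage_increase = index_2023 - 100   # change relative to the base of 100

print(f"Index: {index_2023:.1f}")
print(f"Increase since base year: {percentage_increase:.1f}%")
```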
Laspeyres Index (using base year quantities):
PLI = (ΣP₁Q₀) / (ΣP₀Q₀) × 100
where:
P₁ = current price
P₀ = base price
Q₀ = base quantity
Paasche Index (using current year quantities):
PPI = (ΣP₁Q₁) / (ΣP₀Q₁) × 100
where:
Q₁ = current quantity
Fisher’s Ideal Index (geometric mean):
Fisher = √(Laspeyres × Paasche)
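To see how the three indices differ, the sketch below applies them to a two-item basket. All prices and quantities are hypothetical, invented purely for illustration.

```python
import math

# Hypothetical basket: [item 1, item 2]. Values are invented for illustration.
p0 = [100, 50]    # base year prices (P₀)
q0 = [10, 20]     # base year quantities (Q₀)
p1 = [150, 55]    # current year prices (P₁)
q1 = [8, 25]      # current year quantities (Q₁)

def basket_cost(prices, quantities):
    """Σ(price × quantity) over all items."""
    return sum(p * q for p, q in zip(prices, quantities))

laspeyres = basket_cost(p1, q0) / basket_cost(p0, q0) * 100   # base quantities
paasche = basket_cost(p1, q1) / basket_cost(p0, q1) * 100     # current quantities
fisher = math.sqrt(laspeyres * paasche)                       # geometric mean

print(f"Laspeyres: {laspeyres:.1f}")
print(f"Paasche:   {paasche:.1f}")
print(f"Fisher:    {fisher:.1f}")
```

In this basket, buyers shifted away from the item whose price rose most, so Paasche comes out below Laspeyres and Fisher lies between the two.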
Consumer Price Index (CPI) in Sri Lanka:
- Tracks cost of consumer basket over time
- Released monthly by Department of Census and Statistics
- Base period: 2013 = 100
- Weighted by expenditure shares
Sri Lanka CPI Categories:
- Food and non-alcoholic beverages (~35%)
- Clothing and footwear (~4%)
- Housing, water, electricity, gas (~22%)
- Transport (~12%)
- Communication (~4%)
- Other categories (~23%)
Index Number Applications:
| Index | What it Measures | Sri Lankan Source |
|---|---|---|
| CPI | Consumer price changes | Census & Statistics Dept |
| GDP Deflator | All domestic prices | Central Bank |
| Colombo CPI | Urban consumer prices | Dept of Census |
| All Item Index | Broad inflation | Central Bank |
| Producer Price Index | Wholesale prices | Census & Statistics |
Real vs. Nominal Values:
Real Value = Nominal Value / Price Index × 100
Example:
- Nominal wage 2020: Rs. 50,000
- CPI 2020: 130 (base 2013=100)
- CPI 2023: 250 (base 2013=100)
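- Real wage in base-year (2013) prices: 50,000 / 130 × 100 = Rs. 38,462
- The same 2020 wage expressed in 2023 prices: 50,000 × (250/130) = Rs. 96,154 (the 2023 wage needed to match its purchasing power)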
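Both adjustments can be sketched as small functions: one deflates a nominal value to base-year prices, the other re-expresses it in the prices of another year. The function names are chosen here for illustration.

```python
def real_value(nominal, price_index):
    """Deflate a nominal value to base-year prices (base index = 100)."""
    return nominal / price_index * 100

def reprice(nominal, index_from, index_to):
    """Express a value from one year in the prices of another year."""
    return nominal * index_to / index_from

wage_2020 = 50_000
cpi_2020 = 130
cpi_2023 = 250

wage_in_2013_prices = real_value(wage_2020, cpi_2020)
wage_in_2023_prices = reprice(wage_2020, cpi_2020, cpi_2023)

print(f"2020 wage in 2013 prices: Rs. {wage_in_2013_prices:,.0f}")
print(f"2020 wage in 2023 prices: Rs. {wage_in_2023_prices:,.0f}")
```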
Time Series Analysis
Components of Time Series:
| Component | Description | Example |
|---|---|---|
| Trend | Long-term movement | GDP growth over decades |
| Seasonal | Regular pattern within year | Higher rice prices before harvest |
| Cyclical | Business cycle fluctuations | Expansion and recession |
| Irregular/Random | Unpredictable | Natural disaster impact |
Trend Analysis:
Moving Averages:
- Simple moving average: Average of fixed number of periods
- 3-year moving average: (Year 1 + Year 2 + Year 3) / 3
- Smooths out fluctuations to show trend
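A moving average is simple to compute by sliding a window along the series. This sketch uses the tea production figures from the example later in this section:

```python
def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

production = [300, 285, 295, 310, 320]   # million kg, 2019 to 2023
ma3 = moving_average(production, window=3)

# Each average is centred on the middle year of its window: 2020, 2021, 2022.
for year, value in zip([2020, 2021, 2022], ma3):
    print(f"{year}: {value:.2f}")
```

A 3-year moving average of five observations gives only three smoothed values, because the first and last years have no complete window around them.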
Linear Trend (Least squares):
T = a + bt
where:
t = time period
b = slope = [nΣtY - (Σt)(ΣY)] / [nΣt² - (Σt)²]
a = intercept = Ȳ - bt̄
Forecasting:
- Use trend equation to extrapolate
- Cautions: Past trends may not continue
- External factors can change patterns
Example - Tea Production Trend:
| Year | Production (million kg) |
|---|---|
| 2019 | 300 |
| 2020 | 285 |
| 2021 | 295 |
| 2022 | 310 |
| 2023 | 320 |
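- Calculating trend (t = 1 for 2019): b = 325 / 50 = 6.5 and a = 302 - 6.5 × 3 = 282.5, so T = 282.5 + 6.5t, a slight upward trend of 6.5 million kg per year
- Projecting 2024 (t = 6): T = 282.5 + 6.5 × 6 = 321.5, i.e. approximately 322 million kg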
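The least squares trend for this series can be computed as follows. The sketch assumes time is coded t = 1 for 2019 through t = 5 for 2023; a different coding changes a but not the projected values.

```python
years = [2019, 2020, 2021, 2022, 2023]
production = [300, 285, 295, 310, 320]       # million kg
t = list(range(1, len(years) + 1))           # 1, 2, 3, 4, 5
n = len(t)

sum_t = sum(t)
sum_y = sum(production)
sum_ty = sum(ti * yi for ti, yi in zip(t, production))
sum_t2 = sum(ti ** 2 for ti in t)

# b = [nΣtY - (Σt)(ΣY)] / [nΣt² - (Σt)²]
b = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)
# a = Ȳ - b·t̄
a = sum_y / n - b * (sum_t / n)

forecast_2024 = a + b * 6                    # t = 6 corresponds to 2024

print(f"T = {a} + {b}t")
print(f"2024 projection: {forecast_2024} million kg")
```

The projection simply extends the fitted line one period ahead, so it carries the cautions listed under Forecasting.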
Probability and Business Decisions
Basic Probability:
Classical Probability:
P(Event) = Number of favorable outcomes / Total possible outcomes
Example: Rolling a 6 on a die = 1/6
Empirical Probability:
- Based on relative frequency
- P(Event) = Frequency of event / Total frequency
Subjective Probability:
- Personal judgment or expert opinion
- Used when classical/empirical not possible
Probability Rules:
| Rule | Formula | Application |
|---|---|---|
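| Addition | P(A or B) = P(A) + P(B) - P(A and B) | For mutually exclusive events P(A and B) = 0, so P(A or B) = P(A) + P(B) |
| Multiplication | P(A and B) = P(A) × P(B given A) | For independent events P(B given A) = P(B), so P(A and B) = P(A) × P(B) |
| Complement | P(not A) = 1 - P(A) | "At least one" problems: P(at least one success) = 1 - P(no successes) |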
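The three rules can be checked on a single roll of a fair die. In this sketch the events A (even number) and B (greater than 3) are chosen for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}      # even number
B = {4, 5, 6}      # greater than 3

def p(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event), len(outcomes))

# Addition rule: subtract the overlap so it is not counted twice.
p_a_or_b = p(A) + p(B) - p(A & B)

# Multiplication rule: P(B given A) looks only at outcomes where A happened.
p_b_given_a = Fraction(len(A & B), len(A))
p_a_and_b = p(A) * p_b_given_a

# Complement rule.
p_not_a = 1 - p(A)

print(p_a_or_b, p_a_and_b, p_not_a)   # 2/3 1/3 1/2
```

A and B are neither mutually exclusive (they share 4 and 6) nor independent (knowing the roll is even changes the chance it exceeds 3), so the full versions of both rules are needed.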
Expected Value:
E(X) = Σ[x × P(x)]
Business Application - Decision Making:
| Outcome | Probability | Profit (Rs.) | Expected Profit |
|---|---|---|---|
| High demand | 0.3 | 500,000 | 150,000 |
| Medium demand | 0.5 | 200,000 | 100,000 |
| Low demand | 0.2 | -100,000 | -20,000 |
| Expected Value | | | 230,000 |
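The expected profit in the table is the probability-weighted sum of the outcomes. A minimal sketch using the table's figures:

```python
# (outcome, probability, profit in Rs.)
scenarios = [
    ("High demand", 0.3, 500_000),
    ("Medium demand", 0.5, 200_000),
    ("Low demand", 0.2, -100_000),
]

# Probabilities of all possible outcomes must sum to 1.
total_probability = sum(prob for _, prob, _ in scenarios)

# E(X) = Σ[x × P(x)]
expected_profit = sum(prob * profit for _, prob, profit in scenarios)

print(f"Expected profit: Rs. {expected_profit:,.0f}")
```

The expected value is a long-run average, not a forecast: none of the three possible outcomes is actually Rs. 230,000.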
Risk Analysis:
- Standard deviation of outcome values
- Coefficient of variation
- Higher CV = more risk
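For the demand scenarios in the decision table above, risk can be quantified by weighting squared deviations from the expected value by their probabilities. This is a sketch of that standard calculation, not a method prescribed by the source:

```python
import math

# (probability, profit in Rs.) from the decision-making table
scenarios = [(0.3, 500_000), (0.5, 200_000), (0.2, -100_000)]

expected = sum(p * x for p, x in scenarios)

# Variance of a probability distribution: Σ[(x - E(X))² × P(x)]
variance = sum(p * (x - expected) ** 2 for p, x in scenarios)
std_dev = math.sqrt(variance)

# Coefficient of variation: risk per unit of expected return
cv = std_dev / expected * 100

print(f"Expected profit:    Rs. {expected:,.0f}")
print(f"Standard deviation: Rs. {std_dev:,.0f}")
print(f"CV: {cv:.1f}%")
```

A standard deviation nearly as large as the expected profit signals a risky project, which fits the fact that one of the three outcomes is a loss.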
Business Application: Descriptive Statistics for Business
Commonly Used Business Statistics in Sri Lanka:
Sales Analysis:
- Average daily sales
- Sales growth rate
- Market share calculation
- Seasonal variations
Financial Ratios:
- Liquidity ratios (using averages)
- Profitability (using mean profit margins)
- Efficiency (using turnover ratios)
Quality Control:
- Mean, standard deviation of product dimensions
- Control charts using ±3 standard deviations
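As a simplified illustration of ±3 standard deviation limits, the sketch below sets control limits from a handful of measurements. The measurements are hypothetical, and real control charts usually estimate variation from many subgroups of samples, so treat this only as the basic idea.

```python
import statistics

# Hypothetical product dimensions in mm, invented for illustration.
measurements = [50.1, 49.9, 50.0, 50.2, 49.8]

centre_line = statistics.mean(measurements)
sigma = statistics.pstdev(measurements)        # population standard deviation

upper_control_limit = centre_line + 3 * sigma
lower_control_limit = centre_line - 3 * sigma

def in_control(value):
    """True if a new measurement lies within the ±3σ limits."""
    return lower_control_limit <= value <= upper_control_limit

print(f"Centre line: {centre_line:.2f} mm")
print(f"Limits: {lower_control_limit:.3f} to {upper_control_limit:.3f} mm")
print(in_control(50.3), in_control(50.5))
```

A measurement outside the limits suggests the process may have changed and should be investigated.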
Sri Lankan Statistical Resources:
- Central Bank Statistical Bulletin
- Department of Census and Statistics publications
- Sri Lanka Customs trade data
- Annual reports of listed companies
- World Bank Development Indicators
⚡ A/L Exam Tip: Statistics questions require practice with numbers. Know your formulas, show your workings, and interpret what your calculations mean in context!