Testing and Measurement
🟢 Lite — Quick Review (1h–1d)
Rapid summary for last-minute revision before your exam.
Testing and Measurement — Key Facts for NCE (Nigeria)
- Measurement: Assigning numbers to objects/events according to rules
- Assessment: Broader process including tests and non-test data
- Evaluation: Making judgments based on assessment data
- Test: Formal instrument measuring a sample of behavior
- ⚡ Exam tip: Validity ensures the test measures what it claims to measure; reliability ensures consistency of measurement
🟡 Standard — Regular Study (2d–2mo)
Standard content for students with a few days to months.
Testing and Measurement — NCE (Nigeria) Study Guide
Basic Concepts
Measurement: The process of assigning numbers to objects or events according to rules.
Assessment: Broad term including tests, observations, portfolios, etc.
Evaluation: Making judgments or decisions based on assessment information.
Test: A standardized instrument designed to measure a sample of behavior.
Scales of Measurement
1. Nominal Scale:
- Categorization only
- Numbers used as labels
- Example: Gender (Male=1, Female=2), Ethnicity
- Operations: Count, mode
2. Ordinal Scale:
- Rank order
- Differences not equal
- Example: Class position (1st, 2nd, 3rd)
- Operations: Median, percentile
3. Interval Scale:
- Equal intervals
- No absolute zero
- Example: Temperature in Celsius
- Operations: Mean, standard deviation
4. Ratio Scale:
- Equal intervals + absolute zero
- True ratios possible
- Example: Height, weight, age
- Operations: All statistical operations
Qualities of Good Tests
Validity: The test measures what it claims to measure.
Types of Validity:
- Content Validity: Test covers all aspects of content
- Criterion-Related Validity: Comparison with external criterion
- Concurrent: Correlates with criterion at same time
- Predictive: Predicts future performance
- Construct Validity: Measures theoretical construct
Reliability: The consistency of test results.
Types of Reliability:
- Test-Retest: Same test given twice
- Parallel Forms: Two equivalent versions
- Split-Half: Two halves of same test
- Inter-rater: Agreement between raters
Reliability vs. Validity:
- A test can be reliable without being valid
- A test cannot be valid without being reliable
NCE Exam Pattern
Common question types:
- Differences between measurement scales
- Types and characteristics of validity/reliability
- Computing measures of central tendency and dispersion
- Interpretation of test scores
- Construction of tests and rubrics
🔴 Extended — Deep Study (3mo+)
Comprehensive coverage for students on a longer study timeline.
Testing and Measurement — Comprehensive NCE (Nigeria) Notes
Detailed Theory
1. Nature of Educational Measurement
Definition: Educational measurement involves assigning numbers to student performance according to systematic rules.
Why Measure?
- Diagnose learning difficulties
- Evaluate instruction effectiveness
- Assign grades and credits
- Selection and placement
- Accountability
Limitations of Measurement:
- Cannot measure everything important
- Always some measurement error
- What gets measured may not be what matters most
- Social context affects measurement
2. Scales of Measurement — Detailed
NOMINAL SCALE:
- Purpose: Classification into distinct categories
- Characteristics: Mutually exclusive categories, no order implied
- Permissible Statistics: Mode, frequency counts, chi-square
- Examples:
- Types of schools (public, private, mission)
- States of Nigeria (36 + FCT)
- Pass/Fail
ORDINAL SCALE:
- Purpose: Rank ordering
- Characteristics: Categories have order, but intervals unequal/unknown
- Permissible Statistics: Median, percentile, rank correlation
- Examples:
- Class position (1st, 2nd, 3rd)
- Socioeconomic status (low, middle, high)
- Grade levels
INTERVAL SCALE:
- Purpose: Measure magnitude with equal intervals
- Characteristics: Zero point is arbitrary, no true ratio
- Permissible Statistics: Mean, standard deviation, correlation
- Examples:
- Temperature (Celsius/Fahrenheit)
- Standard scores (z-scores, T-scores)
- Dates on calendar
RATIO SCALE:
- Purpose: Measure with true zero and equal intervals
- Characteristics: Absolute zero, true ratios meaningful
- Permissible Statistics: All statistical operations
- Examples:
- Height
- Weight
- Age
- Number of correct answers
3. Validity — Comprehensive Treatment
Definition: The degree to which evidence and theory support the interpretations of test scores for intended purposes.
Evidence-Based Validity:
- Content evidence (test content)
- Response process evidence (how test-takers respond)
- Internal structure evidence (relationships within test)
- Relations to other variables (criterion evidence)
CONTENT VALIDITY:
- Degree to which test samples the content domain
- Subject matter expert judgment required
- Test blueprint/table of specifications
- Example: Math test covering only algebra when geometry also required = low content validity
CRITERION-RELATED VALIDITY:
- Concurrent Validity: Test correlates highly with a criterion measured at the same time
- Example: New IQ test correlates 0.85 with an established IQ test
- Predictive Validity: Test predicts a future criterion
- Example: JAMB scores predict university performance
- Validity coefficient indicates predictive power
CONSTRUCT VALIDITY:
- Degree to which test measures a theoretical construct
- Construct: A theoretical concept (intelligence, anxiety, motivation)
- Multiple forms of evidence gathered
- Example: Intelligence test validates against theories of intelligence
FACTORS AFFECTING VALIDITY:
- Test content unrepresentative
- Item ambiguity
- Test anxiety
- Guessing
- Administration errors
- Interpretation errors
4. Reliability — Comprehensive Treatment
Definition: The consistency of scores obtained by the same persons on different occasions, with different items, or under different conditions.
TRUE SCORE THEORY:
- Observed Score = True Score + Error Score
- X = T + E
- Perfect reliability = error variance of zero
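A minimal simulation (hypothetical numbers, Python standard library only) can illustrate X = T + E: independent error adds variance on top of the true scores, so zero error variance would make observed and true scores identical.

```python
import random
import statistics

random.seed(1)

# Hypothetical cohort: each observed score is a true score plus random error.
true_scores = [random.gauss(50, 10) for _ in range(5000)]   # T
errors      = [random.gauss(0, 4)  for _ in range(5000)]    # E, mean zero
observed    = [t + e for t, e in zip(true_scores, errors)]  # X = T + E

# With T and E independent, observed variance ~ true variance + error variance.
var_t = statistics.pvariance(true_scores)
var_e = statistics.pvariance(errors)
var_x = statistics.pvariance(observed)
print(round(var_x), round(var_t + var_e))  # approximately equal

# In classical test theory, reliability is the share of observed
# variance that is true-score variance.
reliability = var_t / var_x
print(round(reliability, 2))  # close to 100 / (100 + 16), about 0.86
```

The simulated reliability falls below 1.0 precisely because the error variance is nonzero, matching the "perfect reliability = error variance of zero" point above.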
TEST-RETEST RELIABILITY:
- Same test administered twice
- Time interval between tests
- Correlation between scores = reliability coefficient
- High correlation = high reliability
- Problem: Memory effects, practice effects
PARALLEL-FORMS (EQUIVALENT-FORMS) RELIABILITY:
- Two equivalent versions of test
- Both administered to same group
- Correlation between forms
- Minimizes memory effects
SPLIT-HALF RELIABILITY:
- One test, divided into two halves
- Odd-numbered vs. even-numbered items
- Correlation between halves
- Spearman-Brown prophecy formula adjusts for full test
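The split-half procedure above can be sketched in a few lines of Python. The score lists are hypothetical odd-item and even-item totals for eight students; the Spearman-Brown step-up formula 2r/(1 + r) estimates full-test reliability from the half-test correlation.

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman_brown(half_r):
    """Step up a half-test correlation to estimate full-test reliability."""
    return 2 * half_r / (1 + half_r)

# Hypothetical data: each student's total on odd- vs. even-numbered items.
odd_half  = [10, 12, 9, 15, 11, 8, 14, 13]
even_half = [11, 13, 8, 14, 12, 9, 13, 14]

r_half = pearson_r(odd_half, even_half)
print(round(r_half, 2))                  # 0.91 (half-test correlation)
print(round(spearman_brown(r_half), 2))  # 0.95 (estimated full-test reliability)
```

Note that the stepped-up coefficient is always higher than the half-test correlation, reflecting the fact that longer tests tend to be more reliable.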
INTER-RATER RELIABILITY:
- Agreement between two or more raters
- Cohen’s Kappa for categorical judgments
- Pearson correlation for continuous scores
- ICC (Intraclass Correlation Coefficient)
RELIABILITY COEFFICIENTS:
- Range: 0 to 1.00
- 0.90+ = Excellent (high-stakes decisions)
- 0.80-0.89 = Good (classroom use)
- 0.70-0.79 = Adequate (group decisions)
- Below 0.70 = Questionable
RELIABILITY AND STANDARD ERROR OF MEASUREMENT:
SEM = SD × √(1 - r)
- SEM provides range within which true score likely falls
- Higher reliability → Smaller SEM
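A quick worked example of the SEM formula, using hypothetical values (SD = 10, r = 0.91):

```python
import math

def sem(sd, reliability):
    """Standard Error of Measurement: SEM = SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)

s = sem(10, 0.91)
print(round(s, 1))  # 3.0

# A roughly 68% band for the true score: observed score +/- 1 SEM.
observed = 65
print(round(observed - s, 1), round(observed + s, 1))  # 62.0 68.0
```

Raising the reliability shrinks the band: at r = 0.75 the same SD gives SEM = 5, so the true-score band around 65 widens to 60-70.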
5. Measures of Central Tendency
MEAN:
- Arithmetic average
- Most sensitive to extreme scores
- Best for interval/ratio data
- Formula: Σx/n
MEDIAN:
- Middle value when arranged in order
- Less affected by extreme scores
- Better for ordinal or skewed distributions
- Position = (n+1)/2
MODE:
- Most frequently occurring value
- Used with nominal data
- May have no mode or multiple modes
When to Use Each:
| Data Type | Best Measure | Reason |
|---|---|---|
| Nominal | Mode | Only appropriate |
| Ordinal | Median | Rank order |
| Interval/Ratio (symmetric) | Mean | Most sensitive |
| Interval/Ratio (skewed) | Median | Resistant to outliers |
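The three measures can be compared directly with Python's `statistics` module on a hypothetical, positively skewed set of class scores (one extreme score of 95):

```python
import statistics

# Hypothetical class scores; the single 95 skews the distribution upward.
scores = [45, 50, 52, 52, 55, 58, 60, 95]

print(statistics.mean(scores))    # 58.375 - pulled upward by the outlier
print(statistics.median(scores))  # 53.5   - middle of the ordered scores
print(statistics.mode(scores))    # 52     - most frequent score
```

The gap between mean (58.375) and median (53.5) is exactly the outlier sensitivity the table describes, which is why the median is preferred for skewed distributions.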
6. Measures of Dispersion
RANGE:
- Maximum - Minimum
- Simplest measure
- Affected by outliers
VARIANCE:
- Average of squared deviations from mean
- Population variance: Σ(x-μ)²/N
- Sample variance: Σ(x-x̄)²/(n-1)
STANDARD DEVIATION:
- Square root of variance
- In same units as original data
- Most commonly used measure
- Formula: σ = √[Σ(x-μ)²/N]
COEFFICIENT OF VARIATION:
- CV = (SD/Mean) × 100
- Allows comparison across different scales
- Useful for comparing variability of different distributions
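All four dispersion measures, computed for a small hypothetical population of five scores (population formulas, N in the denominator):

```python
import statistics

scores = [40, 45, 50, 55, 60]  # hypothetical population of five scores

rng  = max(scores) - min(scores)     # Range = Maximum - Minimum
var  = statistics.pvariance(scores)  # population variance: sum((x-mu)^2)/N
sd   = statistics.pstdev(scores)     # standard deviation = sqrt(variance)
mean = statistics.mean(scores)
cv   = sd / mean * 100               # coefficient of variation, in percent

print(rng, var, round(sd, 2), round(cv, 1))
```

For sample data, `statistics.variance` and `statistics.stdev` apply the n-1 denominator shown in the sample-variance formula above.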
7. Normal Distribution and Standard Scores
Normal Distribution:
- Bell-shaped, symmetric
- Mean = Median = Mode
- Defined by mean and standard deviation
- 68% within 1 SD, 95% within 2 SD, 99.7% within 3 SD
Z-SCORES:
- Standard score showing position in SD units
- z = (X - μ)/σ
- Mean of z-scores = 0
- SD of z-scores = 1
T-SCORES:
- z-score transformed to have mean of 50 and SD of 10
- T = 50 + 10(z)
PERCENTILE RANKS:
- Percentage of scores below given score
- 60th percentile = scored higher than 60% of test-takers
- Not equal intervals — difference between percentiles varies
8. Types of Tests
Standardized Tests:
- Norm-referenced or criterion-referenced
- Administered under uniform conditions
- Content and scoring standardized
- Examples: WAEC, NECO, JAMB
Teacher-Made Tests:
- Designed for specific classroom
- Based on specific instruction
- More flexible format
- Diagnostic purposes
CRITERION-REFERENCED vs. NORM-REFERENCED:
| Aspect | Criterion-Referenced | Norm-Referenced |
|---|---|---|
| Purpose | Mastery of objectives | Relative standing |
| Comparison | To standard | To other test-takers |
| Interpretation | % who mastered | Percentile rank |
| Example | Driving test (pass/fail) | IQ test (percentile) |
9. Test Construction
STEPS IN TEST CONSTRUCTION:
- Define objectives/content to be tested
- Prepare table of specifications
- Select item types
- Write items
- Review and edit items
- Produce final test
- Administer
- Analyze items
- Revise as needed
TABLE OF SPECIFICATIONS (Test Blueprint):
- Grid showing content areas vs. cognitive levels
- Ensures representative sampling
- Guides item writing
- Documents content validity
ITEM WRITING PRINCIPLES:
- Clear, unambiguous language
- One main idea per item
- Avoid clues (grammatical cues, word frequency)
- Appropriate difficulty
- Free from bias
- Exactly one clearly correct option per item
10. Item Analysis
DIFFICULTY INDEX:
- P = Number correct / Total number
- Range 0 to 1
- 0.30-0.70 ideal for most purposes
- Too easy (P>0.90) or too hard (P<0.20) = poor discrimination
DISCRIMINATION INDEX:
- Difference between upper and lower groups
- D = (% in upper group correct) - (% in lower group correct)
- Range -1 to +1
- 0.40+ = Good discrimination
- Negative = Item may be keyed incorrectly
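Both indices reduce to simple proportions. A sketch with hypothetical item data (30 of 40 students correct overall; 9 of the top 10 scorers and 4 of the bottom 10 scorers correct):

```python
def difficulty_index(num_correct, total):
    """P = number correct / total number of test-takers."""
    return num_correct / total

def discrimination_index(upper_correct, lower_correct, group_size):
    """D = proportion correct in upper group minus proportion in lower group."""
    return upper_correct / group_size - lower_correct / group_size

# Hypothetical item: 30 of 40 students answered correctly;
# 9 of the top 10 scorers got it right, versus 4 of the bottom 10.
p = difficulty_index(30, 40)
d = discrimination_index(9, 4, 10)
print(p)  # 0.75 - slightly easy but within the 0.30-0.70..0.90 usable band
print(d)  # 0.5  - good discrimination (at or above 0.40)
```

A negative D would mean the weaker students outperformed the stronger ones on the item, the classic sign of a mis-keyed answer.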
Practice Questions for NCE
- Differentiate between validity and reliability, explaining why a test can be reliable without being valid.
- A test has a mean of 50 and standard deviation of 10. Calculate the z-score for a student scoring 70.
- Explain the differences between norm-referenced and criterion-referenced tests.
- What is the Standard Error of Measurement and how does it affect interpretation of test scores?
- Describe the steps involved in constructing a classroom test.