BUS-1172 · Introduction to Statistics · Lecturer: Naimul Islam

Understanding Data Types

Qualitative & Quantitative · Measures of Central Tendency

📊 Statistics Group 7

Rukaiya Tasnim Nijhum - 25215041
Ayesha Khanam - 25015030
Farzana Akter Soraeya - 25315009
Masrun Ahmed - 25315004
Mithia Noor - 25315018
Fariya Jannat Moon - 25215010

What is Data?

Data is information collected for analysis. Two major types exist:

Numerical Data

Made of Numbers

Continuous

Infinite options — can take any value including decimals.
e.g. Age, weight, blood pressure, BMI

Discrete

Countable options: distinct, separate values, usually whole numbers.
e.g. Shoe size, number of children

Categorical Data

Made of Words / Labels

Ordinal

Has a clear ORDER/RANKING but unequal gaps between levels.
e.g. Pain severity, satisfaction rating, wealth quintile

Nominal

No order, no ranking — just named categories.
e.g. Eye colour, blood type, gender, region

Source: DHS Mobile Survey — Bangladesh  (dhs-mobile_national_bgd.csv)

Variable           | Data Type | Sub-type   | Example Values
Age (years)        | Numerical | Continuous | 18, 25, 34, 42…
BMI                | Numerical | Continuous | 17.5, 22.1, 28.4…
Number of Children | Numerical | Discrete   | 0, 1, 2, 3, 4…
Wealth Index Score | Numerical | Continuous | −2.3, 0.5, 1.8…
Household Members  | Numerical | Discrete   | 2, 4, 5, 7…

Continuous variables can take any value in a range; decimals are possible (measured, not counted)

Arithmetic operations are possible: +, −, ×, ÷

Best measures: MEAN (normal distribution) or MEDIAN (skewed/outliers)

Ordinal data has a clear ORDER/RANKING — but gaps between levels are NOT equal.

Level 5 — Excellent / Highest (e.g. Richest Wealth Quintile)
Level 4 — Good
Level 3 — Fair / Moderate
Level 2 — Poor
Level 1 — Very Poor / Lowest (e.g. Poorest Quintile)

Variable (Dataset 2 / Link 2) | Ordinal Scale                        | Best Measure
Wealth Quintile               | 1 = Poorest → 5 = Richest            | MEDIAN
Education Level               | No education → Higher                | MEDIAN
Satisfaction Rating           | 1 = Unsatisfied → 5 = Very satisfied | MEDIAN
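
The median of an ordinal variable can be found by working with the rank codes. The sketch below is only an illustration: the eight quintile codes are made up for the example and are not taken from the dataset. It uses `statistics.median_low`, which is one reasonable choice because it always returns a value that is an actual level, whereas averaging the two middle codes could give 3.5, which is not a quintile.

```python
from statistics import median_low

# Hypothetical wealth quintile codes (1 = Poorest ... 5 = Richest).
# These eight values are invented for illustration, not read from the dataset.
quintiles = [1, 2, 2, 3, 4, 4, 5, 5]

# With an even count, the two middle values are 3 and 4.
# median_low returns the lower of the two, so the result is a real level.
middle_level = median_low(quintiles)

print(middle_level)  # 3
```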

Nominal data uses NAMES / LABELS — NO ranking, NO order, just categories.

Gender
Male / Female / Other
Blood Type
A / B / AB / O
Division
Dhaka / Chittagong / Rajshahi / Sylhet

Variable (Dataset 3 / Link 3) | Categories                           | Type    | Best Measure
Gender (sex)                  | Male / Female                        | Nominal | MODE
Religion                      | Islam / Hindu / Christian / Buddhist | Nominal | MODE
Place of Residence            | Urban / Rural                        | Nominal | MODE
Geographic Division           | Dhaka, Chittagong, Rajshahi…         | Nominal | MODE

🔑 Key Rule: Nominal data can ONLY use MODE — averaging labels like "Male" or "Dhaka" makes no sense!

Qualitative vs Quantitative

The very first question when analysing any variable (from Image 2 flowchart)

Qualitative (Categorical)

Data described in WORDS — open to interpretation
  • Cannot do arithmetic on it
  • Describes categories or groups
  • Examples: gender, religion, region, ethnicity
  • Sub-types: Nominal & Ordinal
  • Datasets 2 & 3 (Links 2 & 3)

Quantitative (Numerical)

Data expressed as NUMBERS — from counting or measuring
  • Can do arithmetic (+, −, ×, ÷)
  • Can calculate mean, median, standard deviation
  • Examples: age, BMI, wealth score, no. of children
  • Sub-types: Continuous & Discrete
  • Dataset 1 (Link 1)

Measures of Central Tendency — Formulas & Real Calculations

MEAN · Dataset 1 · Link 1
x̄ = Σx / n
Formula: Sum of all values ÷ number of values

Dataset 1 Example — Age of women (years):
Values: 18, 22, 25, 28, 30, 32, 35, 38, 42, 45

Σx = 18+22+25+28+30+32+35+38+42+45 = 315
n = 10

x̄ = 315 ÷ 10 = 31.5 years

✔ Use for: Continuous numerical data when NOT skewed
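
The calculation above can be checked with a few lines of Python. This is a minimal sketch using the ten ages from the example; the variable names are our own choice.

```python
from statistics import mean

# Ages of women (years), copied from the Dataset 1 example.
ages = [18, 22, 25, 28, 30, 32, 35, 38, 42, 45]

total = sum(ages)   # Σx
n = len(ages)       # number of values
x_bar = total / n   # x̄ = Σx / n

print(total, n, x_bar)  # 315 10 31.5

# The standard library gives the same answer.
assert x_bar == mean(ages)
```
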
MEDIAN · Datasets 1 & 2
Middle Value
If n is odd: position (n+1)/2
If n is even: average of the values at positions n/2 and (n/2)+1

Dataset 1 Example — No. of children:
Sorted: 0, 0, 1, 1, 2, 2, 3, 3, 4, 5
n = 10 (even)

Median = (2+2) ÷ 2 = 2 children

✔ Use for: Skewed data or ordinal data (Dataset 2 / Link 2)
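
The position rule can be written out step by step. The function below is a sketch of that rule (the name `median_by_position` is our own), applied to the children example; positions in the formula count from 1, while Python lists count from 0.

```python
from statistics import median

def median_by_position(values):
    """Return the median using the odd/even position rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the value at position (n+1)/2, which is index mid.
        return ordered[mid]
    # Even n: average of positions n/2 and (n/2)+1, indexes mid-1 and mid.
    return (ordered[mid - 1] + ordered[mid]) / 2

# Number of children, copied from the Dataset 1 example.
children = [0, 0, 1, 1, 2, 2, 3, 3, 4, 5]

print(median_by_position(children))  # 2.0

# The standard library agrees with the hand-written rule.
assert median_by_position(children) == median(children)
```
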
MODE · Dataset 3 · Link 3
Most Frequent
Definition: The value that appears most often

Dataset 3 Example — Place of Residence:
Urban, Rural, Rural, Urban,
Rural, Rural, Urban, Rural

Count: Urban=3, Rural=5

MODE = Rural (appears 5 times)

✔ Use for: Nominal & ordinal categorical data
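
Counting the labels is all the mode needs, so it works on words as well as numbers. A minimal sketch with the eight responses from the example, using `collections.Counter` from the standard library:

```python
from collections import Counter

# Place of residence, copied from the Dataset 3 example.
residence = ["Urban", "Rural", "Rural", "Urban",
             "Rural", "Rural", "Urban", "Rural"]

counts = Counter(residence)

# most_common(1) returns a list holding the single most frequent (label, count) pair.
mode_value, mode_count = counts.most_common(1)[0]

print(dict(counts))            # {'Urban': 3, 'Rural': 5}
print(mode_value, mode_count)  # Rural 5
```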

Skewed vs Normal Distribution

Distribution shape determines which measure to use — from Image 2 flowchart · Data from Dataset 1 (Link 1)

Normal (Bell Curve) — Age · Dataset 1 / Link 1

→ Use MEAN (x̄ = 31.5 years)

Right-Skewed — No. of Children · Dataset 1 / Link 1

→ Use MEDIAN (= 2 children)

Distribution Type   | Example (Dataset 1 / Link 1)         | Best Measure
Normal / Not Skewed | Age, BMI (symmetric bell shape)      | MEAN
Right / Left Skewed | No. of children (outliers pull tail) | MEDIAN
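
To see why the median is preferred when outliers are present, the sketch below takes the ten children counts from the median example and adds one extreme value. The added value of 12 is hypothetical and chosen only to illustrate the effect; it is not from the dataset.

```python
from statistics import mean, median

# Number of children, copied from the Dataset 1 median example.
children = [0, 0, 1, 1, 2, 2, 3, 3, 4, 5]

# Same data plus one hypothetical outlier (12 is invented for illustration).
with_outlier = children + [12]

def summarize(label, values):
    """Print and return the mean and median of a list, to one decimal place."""
    line = f"{label}: mean = {mean(values):.1f}, median = {median(values):.1f}"
    print(line)
    return line

summarize("original", children)          # original: mean = 2.1, median = 2.0
summarize("with outlier", with_outlier)  # with outlier: mean = 3.0, median = 2.0
```

One extra large value moves the mean from 2.1 to 3.0, while the median stays at 2.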

Which Measure Should I Use?

Quick Decision Guide — based on Image 2 flowchart

Is your variable Qualitative or Quantitative?

If Qualitative (Words) · Datasets 2 & 3 · Links 2 & 3
  • Nominal? No order → 🏆 MODE (Dataset 3 / Link 3)
  • Ordinal? Has ranking → 🏆 MEDIAN (Dataset 2 / Link 2)

If Quantitative (Numbers) · Dataset 1 · Link 1
Is the data skewed, or does it have outliers?
  • Skewed ✓ → 🏆 MEDIAN
  • Not skewed ✓ → 🏆 MEAN

All three datasets from Links 1, 2 & 3 are represented in this decision framework
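
The decision guide can also be expressed as a small function. This is only a sketch of the guide as presented here: the function name `best_measure` and its parameters are our own, and deciding whether real data is skewed still requires looking at the distribution.

```python
def best_measure(kind, *, ordered=False, skewed=False):
    """Suggest a measure of central tendency following the decision guide.

    kind    -- "qualitative" or "quantitative"
    ordered -- for qualitative data: True if ordinal, False if nominal
    skewed  -- for quantitative data: True if skewed or has outliers
    """
    if kind == "qualitative":
        return "median" if ordered else "mode"
    if kind == "quantitative":
        return "median" if skewed else "mean"
    raise ValueError("kind must be 'qualitative' or 'quantitative'")

# Variables from the three datasets, classified as in the slides.
print(best_measure("qualitative"))                 # mode   (place of residence)
print(best_measure("qualitative", ordered=True))   # median (wealth quintile)
print(best_measure("quantitative"))                # mean   (age)
print(best_measure("quantitative", skewed=True))   # median (no. of children)
```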

Summary & Key Takeaways

Everything we learned — connected to our 3 datasets

📊 Dataset 1 · Link 1
Numerical Data (DHS Bangladesh) — age, BMI, children count, wealth score
Continuous & Discrete → Use MEAN (normal) or MEDIAN (skewed)
📋 Dataset 2 · Link 2
Ordinal Data — ranked categories (wealth quintile, education level)
Has order but unequal gaps → Use MEDIAN
🏷️ Dataset 3 · Link 3
Nominal Data — labels with no ranking (gender, division, religion)
Categories only → Use MODE
📐 Formulas
Mean: x̄ = Σx / n  |  Median: middle value when sorted  |  Mode: most frequent value
📈 Distribution
Normal / not skewed → MEAN   |   Skewed / outliers present → MEDIAN

Thank you! · Group 7 · BUS-1172 Introduction to Statistics · Lecturer: Naimul Islam
