Chapter 4: Statistics for Climatology

CCBD 524

Sir Calvin Gaye

2024-07-17

Chapter 1: Frequency Distribution and Graphs

OUTLINE

Chapter 1: Frequency Distribution and Graphs

OBJECTIVES

After completing this chapter, you should be able to:

  1. Organize data using a frequency distribution.
  2. Represent data in frequency distributions graphically, using histograms, frequency polygons, and ogives.
  3. Represent data using bar graphs, Pareto charts, time series graphs, pie graphs, and dotplots.
  4. Draw and interpret a stem and leaf plot.

Introduction

To describe situations, draw conclusions, or make inferences about events, the researcher must organize the data.

  1. The most convenient method of organizing data is to construct a frequency distribution.

The researcher presents data so they can be understood by those who will benefit from reading the study.

  1. The most useful method of presenting the data is by constructing statistical charts and graphs.

Useful R Libraries

  The examples in this chapter load the following R packages:

  1. tidyverse 2.0.0 (dplyr 1.1.4, forcats 1.0.0, ggplot2 3.5.1, lubridate 1.9.3, purrr 1.0.2, readr 2.1.5, stringr 1.5.1, tibble 3.2.1, tidyr 1.3.1)
  2. flextable
  3. formattable
  4. gt
  5. reactable
  6. reactablefmtr

  Several of these packages export functions with the same name, so later packages mask earlier ones (for example, dplyr::filter() masks stats::filter(), and gt masks formattable::currency). Use the package::function() form when a call is ambiguous, or load the conflicted package (<http://conflicted.r-lib.org/>) to turn every conflict into an error.

Organizing Data (OD) - raw data

  1. When the data are in original form, they are called raw data and are listed next.

  45   46   64   57   85
  92   51   71   54   48
  27   66   76   55   69
  54   44   54   75   46
  61   68   78   61   83
  88   45   89   67   56
  81   58   55   62   38
  55   56   64   81   38
  49   68   91   56   68
  46   47   83   71   62

  1. Since little information can be obtained from looking at raw data, the researcher organizes the data into what is called a frequency distribution.

OD - frequency distribution

OD - Frequency Table

  Class limits   Frequency
  -------------- -----------
  27-35          1
  36-44          3
  45-53          9
  54-62          15
  63-71          10
  72-80          3
  81-89          7
  90-98          2
  Total          50
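
The grouping step can be sketched in base R. This is a minimal illustration, not the code used to produce the slides: it assumes the 50 raw values listed earlier and the class limits 27-35 through 90-98, and the names `raw`, `lower`, `upper`, and `freq_table` are made up for this example.

```r
# Raw data as listed in the chapter (50 values).
raw <- c(
  45, 46, 64, 57, 85,
  92, 51, 71, 54, 48,
  27, 66, 76, 55, 69,
  54, 44, 54, 75, 46,
  61, 68, 78, 61, 83,
  88, 45, 89, 67, 56,
  81, 58, 55, 62, 38,
  55, 56, 64, 81, 38,
  49, 68, 91, 56, 68,
  46, 47, 83, 71, 62
)

# Class limits: 27-35, 36-44, ..., 90-98 (each class spans 9 whole numbers).
lower <- seq(27, 90, by = 9)
upper <- lower + 8

# cut() needs break points that fall between classes. Using the class
# boundaries (limit minus/plus 0.5) is safe because the data are whole numbers.
breaks <- c(lower - 0.5, max(upper) + 0.5)
labels <- paste(lower, upper, sep = "-")

# Assign every value to its class, then count the values per class.
classes <- cut(raw, breaks = breaks, labels = labels)
freq_table <- as.data.frame(table(classes))
names(freq_table) <- c("class_limits", "frequency")

print(freq_table)
sum(freq_table$frequency)  # should equal the number of raw values
```

Comparing the final sum with `length(raw)` is a quick check that every value fell into exactly one class.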

OD - Categorical Frequency Distributions

EXAMPLE 1–1 Breakfast Beverages

  1. Forty people were asked what beverage they drink for breakfast. Construct a categorical frequency distribution for the data and summarize the results. Use these classes: W = water, M = milk, J = juice, C = coffee, and T = tea.

  W   J   C   C   J   C   T   W   C   T
  W   M   M   W   W   M   W   W   M   C
  T   W   J   M   M   C   J   C   M   W
  W   C   W   W   J   W   W   M   C   J

OD - EXAMPLE 1 Solution: Step 1

  Class   Frequency   Percent   Cumulative percent
  ------- ----------- --------- --------------------
  W
  M
  J
  C
  T

OD - EXAMPLE 1 Solution: Step 2 - 4

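Find the percentage of values in each class with

\[ \% = \frac{f}{n} \times 100 \]

where \(f = \text{frequency}\) of the class and \(n = \text{total number of values}\). For example, in class W, the percentage is

\[ \% = \frac{14}{40}\times 100 = 35\% \]

The cumulative percentages are running totals of the class percentages:

\[ 35\%, \quad 35+20=55\%, \quad 35+20+15 = 70\%, \quad \ldots \]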

OD - EXAMPLE 1 Solution: Step 5

  Class   Frequency   Percent   Cumulative percent
  ------- ----------- --------- --------------------
  W       14          35.0      35.0
  M       8           20.0      55.0
  J       6           15.0      70.0
  C       9           22.5      92.5
  T       3           7.5       100.0
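
The whole solution to Example 1–1 can be reproduced with a few lines of base R. This is a hedged sketch rather than the code behind the slides: it assumes the 40 responses listed in the example, and the names `beverages`, `class_levels`, and `beverage_dist` are illustrative.

```r
# The 40 responses from Example 1-1, in the order listed.
beverages <- c(
  "W", "J", "C", "C", "J", "C", "T", "W", "C", "T",
  "W", "M", "M", "W", "W", "M", "W", "W", "M", "C",
  "T", "W", "J", "M", "M", "C", "J", "C", "M", "W",
  "W", "C", "W", "W", "J", "W", "W", "M", "C", "J"
)

# Fix the class order so the table rows come out as W, M, J, C, T.
class_levels <- c("W", "M", "J", "C", "T")

# Counting the values per class replaces the tallying steps.
freq <- table(factor(beverages, levels = class_levels))

# Percent = f / n * 100, with n the total number of values.
n <- length(beverages)
percent <- 100 * as.integer(freq) / n

beverage_dist <- data.frame(
  class       = class_levels,
  frequency   = as.integer(freq),
  percent     = percent,
  cum_percent = cumsum(percent)  # running total of the percentages
)

print(beverage_dist)
```

Wrapping the responses in `factor()` with explicit levels also keeps a class in the table (with frequency 0) if no one happens to choose it.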

OD - Grouped Frequency Distributions

The researcher must decide how many classes to use and the width of each class. To construct a frequency distribution, follow these rules:

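The class midpoint \(X_{m}\) is found by adding the lower and upper class boundaries and dividing by 2, or by adding the lower and upper class limits and dividing by 2:

\[X_{m} = \frac{ll + ul}{2}\]

where \(ll\) is the lower limit and \(ul\) is the upper limit of the class.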
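
As a quick check of the midpoint formula, the sketch below applies it to the classes 27-35 through 90-98 from the frequency table earlier in the chapter. It reads \(ll\) and \(ul\) as the lower and upper class limits, and it assumes class boundaries lie 0.5 below and above the limits (reasonable for whole-number data, but not stated in the slides). All variable names are illustrative.

```r
# Class limits of the grouped frequency distribution: 27-35, ..., 90-98.
lower_limit <- seq(27, 90, by = 9)
upper_limit <- lower_limit + 8

# Midpoint from the class limits: X_m = (ll + ul) / 2.
midpoint <- (lower_limit + upper_limit) / 2

# Assumed class boundaries for whole-number data: limits minus/plus 0.5.
lower_boundary <- lower_limit - 0.5
upper_boundary <- upper_limit + 0.5

# The midpoint computed from the boundaries is the same value.
midpoint_from_boundaries <- (lower_boundary + upper_boundary) / 2

# Class width is the distance between the boundaries of one class.
class_width <- upper_boundary - lower_boundary

print(data.frame(lower_limit, upper_limit, midpoint, class_width))
```

For the first class this gives (27 + 35) / 2 = 31, and each later midpoint is one class width (9) higher than the one before it.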

OD - Grouped Frequency Distributions4