Topic 1: Introduction to Statistics and presenting data

🎧 Online students

Throughout the computer lab question sheets, you will see emojis and/or collapsible sections like this one. Each emoji has a particular meaning and will sometimes be associated with additional instructions:

Prompts for you

💬 Write your answer in the chat.

Modes at different times during the lab

🏡 Main room. All together in the main room – your computer lab demonstrator will be presenting information or facilitating class discussion

💡 Breakout rooms. The person whose birthday is closest to a date picked at random by your computer lab demonstrator shares their screen or whiteboard. Here you will discuss a question together and bring your group’s answer back to the main room.

💻 Focus mode. You will still be in the main room, but working independently. All students will share their screens during this time so that your computer lab demonstrator (but not other students) can see your screen.


🏫 Face-to-face (blended) students

Throughout the computer lab question sheets, you will see emojis and/or collapsible sections like this one. You can ignore the emojis and collapsible sections, as they contain information relevant to students who are studying online.


In this computer lab, we will work through some exercises relating to the content in Topic 1, and then apply our knowledge to new data.

After working through the questions in this computer lab, you will be ready to complete Quiz 2. If you have time during today’s lab, you may like to work on the quiz.

Preparation

Open up jamovi on your computer, as well as Word or a similar word processing program.

As you work through the questions, you can copy and paste your results from jamovi into Word (or similar) and save your work in a safe place, e.g. OneDrive.

1 Variables in the survey data set

🏡 Recall the survey data set introduced in the Topic 1 readings and Computer Lab 1. This data set is contained within the R package MASS, and consists of the responses of 237 Statistics students to a set of questions (Venables and Ripley 1999).

Follow the instructions below to familiarise yourself with the survey data set.

1.1

🏡 Download the file called survey.csv from the LMS, and save it in a relevant location on your computer.

Once you have done so, import the survey.csv file in jamovi. For revision on how to do this, see Computer Lab 1.

Note: If you prefer, you can instead open the .omv file you saved at the end of Computer Lab 1.

1.2

🏡 Once you have opened the survey data set, you should be able to see the data set on the left-hand side of the jamovi interface. Take some time to look over the different types of data recorded in this data set.

Next, click on the Variables tab to view the list of variables in the data set. Write down this list of variables.

1.3

🏡 Go to this link to view the R documentation on the survey data set, which includes a short explanation of each variable.

1.4

🏡 Using the information at the link where needed, next to the name of each variable you have written down, write down what kind of variable it is. For example:

  • Sex: Categorical, Nominal
  • Wr.Hnd: Numerical, Continuous
  • Etc.

2 Frequency tables

💻 Frequency tables are used to tell us how many people (or units) fall into each category. In this question, we will learn how to create frequency tables in jamovi.
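To make the idea concrete outside jamovi, here is a minimal Python sketch of how a frequency table is built: count how often each category occurs, then divide by the total to get percentages. The responses and the function name `frequency_table` are invented for illustration and are not taken from the survey data set.

```python
from collections import Counter

# Hypothetical responses, invented for illustration (not the survey data).
responses = ["Right", "Left", "Right", "Right", "Left", "Right", "Right", "Right"]


def frequency_table(values):
    """Return one (category, count, percentage) row per category."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (category, count, round(100 * count / total, 2))
        for category, count in sorted(counts.items())
    ]


for category, count, percent in frequency_table(responses):
    print(f"{category:<6} {count:>3} {percent:>7.2f}")
```

jamovi does this counting for you; the sketch only shows what sits behind the Counts and percentage columns.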

2.1

💻 Consider the W.Hnd variable, which tells us the writing hand of each student. Use jamovi to create a frequency table for the W.Hnd variable.

2.2

💻 Do the proportions (or percentages) of left-handed and right-handed students align with your expectations? Why or why not?

2.3

💻 Now that you are familiar with frequency tables in jamovi, create a frequency table of the variable Sex, and then answer the following questions:

  • How many males and females are there in the class? 💬
  • What percentage of the class is female and what percentage is male? 💬

3 Types of variables in jamovi

💻 In 1.4, we determined that the variable Smoke is a categorical, ordinal variable. This is a useful fact to know when displaying frequency tables in jamovi. Variables can be set up in jamovi to be “ordered,” meaning that jamovi knows the order of the categories of the variable, and automatically displays any output accordingly.
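The effect of an “ordered” variable can be sketched in Python. This is only an illustration of the idea, not how jamovi works internally: without a stated order, categories are typically listed alphabetically, whereas an ordered variable is listed in its natural order. The responses are invented, and the order used (Never, Occas, Regul, Heavy) is assumed to run from least to most frequent smoking.

```python
from collections import Counter

# Assumed order of the Smoke categories, from least to most frequent smoking.
SMOKE_ORDER = ["Never", "Occas", "Regul", "Heavy"]

# Hypothetical responses, invented for illustration (not the survey data).
responses = ["Regul", "Never", "Heavy", "Never", "Occas", "Never"]


def ordered_counts(values, order):
    """Count each category and report the counts in the given order."""
    counts = Counter(values)
    return [(level, counts[level]) for level in order]


# Without an order, categories fall back to alphabetical listing.
alphabetical = sorted(Counter(responses).items())
# With an order, the table follows the natural ordering of the categories.
ordered = ordered_counts(responses, SMOKE_ORDER)

print(alphabetical)
print(ordered)
```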


To learn more about types of variables in jamovi, a useful reference is Variables (Navarro and Foxcroft 2022).

3.1

💻 After watching the above video, set up the Smoke variable so that the categories are correctly ordered.

3.2

💻 Create a frequency table for the Smoke variable, ensuring that the order of categories is displayed correctly.

4 Adding cumulative frequencies to the table

💻 When creating frequency tables, it can be useful to include a column for the cumulative frequencies. Consider, for example, the output from R below, which includes the frequencies (or counts), cumulative frequencies (or cumulative counts), percentages (or “relative frequencies”), and cumulative percentages (or “cumulative relative frequencies”):

##       Freq Cum Freq Rel Freq Cum Rel Freq
## Never  189      189    80.08        80.08
## Occas   19      208     8.05        88.14
## Regul   17      225     7.20        95.34
## Heavy   11      236     4.66       100.00
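The three derived columns in the output above can all be recovered from the Freq column. The Python sketch below is one way to do this, starting from the four counts shown: each cumulative frequency is a running total, and each percentage is a count divided by the overall total. The variable names are illustrative only.

```python
from itertools import accumulate

# Counts copied from the Freq column of the R output above.
levels = ["Never", "Occas", "Regul", "Heavy"]
freq = [189, 19, 17, 11]

total = sum(freq)
cum_freq = list(accumulate(freq))  # running totals of the counts
rel_freq = [round(100 * f / total, 2) for f in freq]
cum_rel_freq = [round(100 * c / total, 2) for c in cum_freq]

for row in zip(levels, freq, cum_freq, rel_freq, cum_rel_freq):
    print("{:<6}{:>5}{:>9}{:>9.2f}{:>13.2f}".format(*row))
```

Note that the last cumulative frequency always equals the total number of responses, and the last cumulative percentage is always 100.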

While the cumulative frequencies are not automatically provided in jamovi, it is possible to add a column for these outside of jamovi. The following video demonstrates how:

4.1

💻 After copying and pasting your frequency table for Smoke into a word processing or spreadsheet program of your choice, add a column that includes the cumulative counts (or frequencies) for the Smoke variable.

5 Bar charts

💻 Bar charts are a useful way to graphically present categorical data. As we will see, bar charts (or bar plots) are straightforward to produce in jamovi.

5.1

💻 Create a bar chart for the Smoke variable in jamovi. Once you have created the bar chart, use the settings to choose a theme and colour palette for your bar plot.

5.2

💻 The bar chart we created in the previous question displayed the frequencies for each category contained in the Smoke variable. Although jamovi does not provide this option, it is also possible to create a bar chart that displays relative frequencies (or percentages). Consider, for example, this relative frequency distribution chart created using R:

5.3

💻 You may have noticed that the shape of the above plot is exactly the same as that of the bar chart we created for the frequencies earlier. In your own words, explain the difference between the two plots, and how we can tell the difference between the plots by looking at each one.

6 Saving or exporting plots from jamovi

💻 Once you have produced an image in jamovi, such as your bar chart from Section 5, you might like to save it. The video below shows how to copy and paste the image, as well as how to save or export it as a file.

6.1

💻 Save your bar chart as a .png file somewhere safe (e.g. OneDrive).

7 Pie charts

💻 While not provided as a default option in jamovi, pie charts can also be a useful way to visualise data. Consider, for example, the below pie chart for the Smoke variable created using R:

7.1

💻 By considering the above pie chart, which category contains the largest number of observations? 💬 Which category contains the smallest number of observations? 💬

8 Assessing numerical variables

🏡 So far, we have been considering categorical data. In this section, we will be looking at frequency tables and histograms to make sense of numerical data. While the variables we will look at in this section will mostly be continuous, the material presented can also be applied to discrete variables such as the Pulse variable.

Since numerical variables do not contain categories, it is not immediately obvious how one would go about creating a frequency table for these types of variables. Consider the Height variable, for example. What we can do is break up the range of heights into equal intervals, for example 150-155cm, 155-160cm, and so on. After doing so, these intervals can be used in a frequency table for this modified Height variable.
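The interval idea can be sketched in a few lines of Python. This is an illustration only, with invented heights and a hypothetical helper `bin_label`; it assumes 5cm intervals that include their lower end but not their upper end, so that a height of exactly 155cm is counted in 155-160cm rather than 150-155cm.

```python
from collections import Counter

BIN_WIDTH = 5  # interval width in cm (an assumption for this sketch)


def bin_label(height_cm, width=BIN_WIDTH):
    """Return the interval [lower, upper) that contains height_cm."""
    lower = int(height_cm // width) * width
    return f"[{lower},{lower + width})"


# Hypothetical heights in cm, invented for illustration (not the survey data).
heights = [152.0, 155.0, 158.5, 163.2, 167.8, 169.9, 170.0, 174.99]

# Count how many heights fall into each interval.
counts = Counter(bin_label(h) for h in heights)

for label in sorted(counts):
    print(label, counts[label])
```

Because every height lands in exactly one interval, the counts add up to the number of observations, just as in a frequency table for a categorical variable.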

It is possible to create modified variables in jamovi; however, doing so is beyond the scope of this subject. (If you are interested, though, a useful reference is Transforming scores to categories — jamovi (datalab.cc 2014).)

We will, however, need to know how to interpret frequency tables created for numerical variables. Consider, for example, the below table produced using R:

##           Freq Cum Freq Rel Freq Cum Rel Freq
## [150,155)    6        6     2.87         2.87
## [155,160)   13       19     6.22         9.09
## [160,165)   20       39     9.57        18.66
## [165,170)   45       84    21.53        40.19
## [170,175)   42      126    20.10        60.29
## [175,180)   27      153    12.92        73.21
## [180,185)   28      181    13.40        86.60
## [185,190)   17      198     8.13        94.74
## [190,195)    8      206     3.83        98.56
## [195,200)    2      208     0.96        99.52
## [200,205)    1      209     0.48       100.00
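As a worked illustration of reading this kind of table, the Python sketch below uses a cut-off of 165cm, chosen only as an example. The number of students shorter than the cut-off is the sum of the frequencies of all intervals ending at or below it (equivalently, the Cum Freq entry for the last such interval), and the number at or above the cut-off is the total minus that value. The function name `count_below` is hypothetical.

```python
# Rows copied from the table above: (lower bound, upper bound, frequency).
table = [
    (150, 155, 6), (155, 160, 13), (160, 165, 20), (165, 170, 45),
    (170, 175, 42), (175, 180, 27), (180, 185, 28), (185, 190, 17),
    (190, 195, 8), (195, 200, 2), (200, 205, 1),
]


def count_below(cutoff, rows):
    """Count students shorter than cutoff; cutoff must be an interval boundary."""
    return sum(freq for _, upper, freq in rows if upper <= cutoff)


total = sum(freq for _, _, freq in table)
below = count_below(165, table)  # matches Cum Freq for [160,165)
at_or_above = total - below      # everyone not counted as "below"

percent_below = round(100 * below / total, 2)
percent_at_or_above = round(100 * at_or_above / total, 2)

print(below, percent_below)
print(at_or_above, percent_at_or_above)
```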

8.1

🏡 By referring to the above frequency table, answer the following questions:

  1. How many students are at least 170cm but less than 175cm tall? 💬
  2. What percentage of students are at least 170cm but less than 175cm tall? 💬
  3. How many students are less than 175cm tall? 💬
  4. What percentage of students are less than 175cm tall? 💬
  5. How many students are 175cm or taller? 💬
  6. What percentage of students are 175cm or taller? 💬

9 Histograms

🏡 Recall that a histogram is a chart that depicts the frequency of a numerical variable in non-overlapping intervals, called ‘bins,’ that span the entire range of the data. We can think of a histogram as a visual representation of a frequency table. While we have used bar charts for categorical variables, a histogram would be the equivalent kind of chart for numerical data.
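To see the link between a frequency table and a histogram, the Python sketch below turns the frequency table for Height from Section 8 into a rough text picture, drawing one `#` per student in each bin. It is only an illustration: a real histogram, such as the one jamovi produces, draws adjoining bars on a numerical axis rather than rows of symbols. The function name `text_histogram` is hypothetical.

```python
# Bins and counts copied from the Height frequency table in Section 8.
bins = [
    "[150,155)", "[155,160)", "[160,165)", "[165,170)", "[170,175)", "[175,180)",
    "[180,185)", "[185,190)", "[190,195)", "[195,200)", "[200,205)",
]
freq = [6, 13, 20, 45, 42, 27, 28, 17, 8, 2, 1]


def text_histogram(labels, counts):
    """Return one line per bin, with the bar length equal to the bin's frequency."""
    return [f"{label} {'#' * count} {count}" for label, count in zip(labels, counts)]


lines = text_histogram(bins, freq)
for line in lines:
    print(line)
```

The longest rows mark the bins containing the most students, which is the same information the tallest bars of a histogram convey.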

9.1

Create a histogram of the Height variable in jamovi. For a refresher on how to create a histogram, see this video.

9.2

Create a histogram of the ages of students in jamovi (use the Age variable). Does the data appear symmetrical or skewed? Can you see any outliers?


That’s everything for today! If you still have time, you may like to have a go at Quiz 2, which is based on the Topic 2 readings.

Before you finish up, remember to save your work (e.g. your jamovi and Word files) somewhere safe (e.g. OneDrive) so that you can access it at a later time.


References

datalab.cc. 2014. “Datalab.cc.” 2014. https://www.youtube.com/c/datalabcc/about.
Navarro, D. J., and D. R. Foxcroft. 2022. “Learning Statistics with Jamovi: A Tutorial for Psychology Students and Other Beginners.” 2022. https://www.learnstatswithjamovi.com/.
Venables, W. N., and B. D. Ripley. 1999. Modern Applied Statistics with S-PLUS. 3rd ed. New York: Springer.


These notes have been prepared by Amanda Shaker. The copyright for the material in these notes resides with the authors named above, with the Department of Mathematical and Physical Sciences and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License BY-NC-ND.