Here, we’re just setting a few options.

knitr::opts_chunk$set(
  warning = TRUE, # show warnings during codebook generation
  message = TRUE, # show messages during codebook generation
  error = TRUE, # do not interrupt codebook generation in case of errors,
                # usually better for debugging
  echo = TRUE  # show R code
)
ggplot2::theme_set(ggplot2::theme_bw())

Now, we’re preparing our data for the codebook.

library(codebook)
codebook_data <- readRDS("classical_final.rds")[, 2:26]
# to import an SPSS file from the same folder uncomment and edit the line below
# codebook_data <- rio::import("mydata.sav")
# for Stata
# codebook_data <- rio::import("mydata.dta")
# for CSV
# codebook_data <- rio::import("mydata.csv")

# omit the following lines, if your missing values are already properly labelled
codebook_data <- detect_missing(codebook_data,
    only_labelled = TRUE,                # only labelled values are autodetected as
                                         # missing
    negative_values_are_missing = FALSE, # negative values are not treated as
                                         # missing values
    ninety_nine_problems = TRUE          # 99/999 are missing values, if they
                                         # are more than 5 MAD from the median
    )
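The 5-MAD rule named in the last comment can be illustrated in base R. This is only a sketch of the idea, not the codebook package's implementation: `flag_ninety_nine()` and its arguments are hypothetical names, the toy vectors are invented, and the use of `stats::mad()` with its default scaling constant is an assumption.

```r
# Hypothetical helper: flag 99/999 only when the value is implausibly far
# from the bulk of the data (more than n_mad MADs away from the median).
flag_ninety_nine <- function(x, codes = c(99, 999), n_mad = 5) {
  centre <- median(x, na.rm = TRUE)
  spread <- mad(x, na.rm = TRUE)          # default constant 1.4826 (assumption)
  is_code <- x %in% codes                 # NA is never a code
  is_far  <- abs(x - centre) > n_mad * spread
  is_code & is_far
}

# Invented example 1: 99 is far outside a 1-5 rating, so it is flagged.
ratings <- c(1, 2, 3, 4, 5, 3, 2, 99)
flag_ninety_nine(ratings)

# Invented example 2: 99 is a plausible score here, so it is kept.
scores <- c(90, 95, 99, 100, 105, 110)
flag_ninety_nine(scores)
```

The second example shows why the distance check matters: without it, a legitimate value of 99 would be discarded.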

# If you are not using formr, the codebook package needs to guess which items
# form a scale. The following line finds item aggregates with names like this:
# scale = scale_1 + scale_2R + scale_3R
# identifying these aggregates allows the codebook function to
# automatically compute reliabilities.
# However, it will not reverse items automatically.
codebook_data <- detect_scales(codebook_data)
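
Because items are not reversed automatically, reverse-keyed items have to be recoded before the aggregate is computed. A minimal sketch, assuming a 1 to 5 response format and that a trailing `R` in the item name marks a reverse-keyed item; the data are invented, and computing the aggregate as a row mean is an assumption about how the scale is scored.

```r
# Invented items following the naming pattern scale_1, scale_2R, scale_3R
items <- data.frame(
  scale_1  = c(1, 4, 5),
  scale_2R = c(5, 2, 1),
  scale_3R = c(4, 2, 2)
)

# Reverse-keyed items are identified by the trailing "R" (assumption)
is_reversed <- grepl("R$", names(items))

# On a 1-5 format, reversing maps x to (min + max) - x = 6 - x
items[is_reversed] <- lapply(items[is_reversed], function(x) 6 - x)

# Aggregate named after the common stem, so the pattern
# scale = scale_1 + scale_2R + scale_3R can be recognised
item_names <- c("scale_1", "scale_2R", "scale_3R")
items$scale <- unname(rowMeans(items[, item_names]))
items
```

If the response format is not 1 to 5, replace the constant 6 with the sum of the lowest and highest possible response.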

Create codebook

codebook(codebook_data)
## Warning in inline_hist(., 5): Variable contains Inf or -Inf value(s) that were
## converted to NA.

## Warning in inline_hist(., 5): Variable contains Inf or -Inf value(s) that were
## converted to NA.
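
These warnings mean that at least one numeric variable contains `Inf` or `-Inf`. A sketch of how such columns could be located and cleaned before the codebook is generated, using base R only; `toy` is invented data standing in for `codebook_data`, and whether infinite values should become `NA` is a decision for the analyst.

```r
# Invented toy data standing in for codebook_data
toy <- data.frame(
  total_play_hrs = c(4.8, -Inf, 22),
  itemsbought    = c(19, 0, 149)
)

# A single -Inf is enough to drag the mean to -Inf
mean_before <- mean(toy$total_play_hrs)

# Find numeric columns that contain any infinite value
has_inf <- vapply(
  toy,
  function(col) is.numeric(col) && any(is.infinite(col)),
  logical(1)
)
inf_columns <- names(toy)[has_inf]

# Replace infinite values with NA in those columns only
toy[has_inf] <- lapply(toy[has_inf], function(col) {
  col[is.infinite(col)] <- NA
  col
})

mean_after <- mean(toy$total_play_hrs, na.rm = TRUE)
```

Note that converting to `NA` changes the missing-value counts reported for the affected variable.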

Metadata

Description

Dataset name: codebook_data

The dataset has N=1657 rows and 25 columns. 1653 rows have no missing values on any column.
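
The counts in this description can be checked directly against the data. A small sketch in base R, using an invented data frame rather than the real `codebook_data`:

```r
# Invented toy data; the real dataset has 1657 rows and 25 columns
toy <- data.frame(
  level_letters = c("3", NA, "12", "7"),
  numdays       = c(24, 10, NA, 56),
  itemsbought   = c(19, 0, 5, 149)
)

# Rows with no missing value on any column
n_complete <- sum(complete.cases(toy))

summary_line <- sprintf(
  "The dataset has N=%d rows and %d columns. %d rows have no missing values on any column.",
  nrow(toy), ncol(toy), n_complete
)
summary_line

# Missing values per column, comparable to the n_missing column of the summaries
n_missing <- colSums(is.na(toy))
n_missing
```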

Metadata for search engines
  • Date published: 2022-06-08
  • Keywords: level_letters, level_syll, level_word, total_play_hrs, itemsbought, var_coin, max_coin, shopvisits, stickervisits, learnedlettersvisits, creditssvisits, time_LearnedLettersScreen, time_MapScreen, time_ShopScreen, time_StickerBook, time_Credits, numtrials, numlettertrials, numsylltrials, numwordtrials, meantimeused, meancorrect, numlevels, numtargets, numdays

Variables

level_letters

GG computed attained level of letter mastery

Distribution

Distribution of values for level_letters

4 missing values.

Summary statistics

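| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| level_letters | GG computed attained level of letter mastery | character | 4 | 0.997586 | 24 | 0 | 1 | 3 | 0 |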
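
Two columns in this summary are easy to misread. `complete_rate` is the share of non-missing values, here (1657 - 4) / 1657. Because `level_letters` is stored as character, `min` and `max` describe string lengths in characters (the way skimr summarises character columns), not attained levels. The sketch below reproduces both points; `levels_raw` holds invented values.

```r
# complete_rate from the counts reported above
n_rows    <- 1657
n_missing <- 4
complete_rate <- (n_rows - n_missing) / n_rows
round(complete_rate, 6)   # 0.997586

# Invented character values to show what min/max mean for character columns
levels_raw <- c("1", "12", "105")
range(nchar(levels_raw))        # shortest and longest string: 1 3
range(as.numeric(levels_raw))   # numeric range after conversion: 1 105
```

If the levels are meant to be numbers, converting the column with `as.numeric()` before generating the codebook would yield numeric summaries instead.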

level_syll

GG computed attained level of syllable mastery

Distribution

Distribution of values for level_syll

4 missing values.

Summary statistics

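| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| level_syll | GG computed attained level of syllable mastery | character | 4 | 0.997586 | 88 | 0 | 1 | 2 | 0 |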

level_word

GG computed attained level of word mastery

Distribution

Distribution of values for level_word

4 missing values.

Summary statistics

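| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| level_word | GG computed attained level of word mastery | character | 4 | 0.997586 | 19 | 0 | 1 | 2 | 0 |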

total_play_hrs

GG computed time spent playing

Distribution

Distribution of values for total_play_hrs

0 missing values.

Summary statistics

## Warning in inline_hist(., 5): Variable contains Inf or -Inf value(s) that were
## converted to NA.
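| name | label | data_type | n_missing | complete_rate | min | median | max | mean | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| total_play_hrs | GG computed time spent playing | numeric | 0 | 1 | -Inf | 4.8 | 22 | -Inf | ▇▇▁▁▁ |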

itemsbought

Number of items bought

Distribution

Distribution of values for itemsbought

0 missing values.

Summary statistics

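| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| itemsbought | Number of items bought | numeric | 0 | 1 | 0 | 19 | 149 | 21.34882 | 14.26594 | ▇▂▁▁▁ |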

var_coin

Variation of coins in possession

Distribution

Distribution of values for var_coin

0 missing values.

Summary statistics

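| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| var_coin | Variation of coins in possession | numeric | 0 | 1 | 47 | 171 | 2296 | 250.5931 | 223.9861 | ▇▁▁▁▁ |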

max_coin

Max number of coins owned

Distribution

Distribution of values for max_coin

0 missing values.

Summary statistics

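| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| max_coin | Max number of coins owned | character | 0 | 1 | 49 | 0 | 2 | 3 | 0 |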

shopvisits

Number of visits to shop

Distribution

Distribution of values for shopvisits

0 missing values.

Summary statistics

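| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| shopvisits | Number of visits to shop | numeric | 0 | 1 | 0 | 24 | 229 | 32.1382 | 28.38471 | ▇▂▁▁▁ |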

stickervisits

Number of visits to stickerbook

Distribution

Distribution of values for stickervisits

0 missing values.

Summary statistics

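| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| stickervisits | Number of visits to stickerbook | numeric | 0 | 1 | 0 | 12 | 94 | 14.57695 | 12.22254 | ▇▂▁▁▁ |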

learnedlettersvisits

Number of visits to learned letters screen

Distribution

Distribution of values for learnedlettersvisits

0 missing values.

Summary statistics

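| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| learnedlettersvisits | Number of visits to learned letters screen | numeric | 0 | 1 | 0 | 11 | 142 | 12.95232 | 10.77776 | ▇▁▁▁▁ |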

creditssvisits

Number of visits to credit screen

Distribution

Distribution of values for creditssvisits

0 missing values.

Summary statistics

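| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| creditssvisits | Number of visits to credit screen | numeric | 0 | 1 | 0 | 0 | 71 | 2.654798 | 6.147667 | ▇▁▁▁▁ |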

time_LearnedLettersScreen

Distribution

Distribution of values for time_LearnedLettersScreen

0 missing values.

Summary statistics

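| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| time_LearnedLettersScreen | numeric | 0 | 1 | 0 | 8.3 | 158 | 14.64278 | 17.03142 | ▇▁▁▁▁ | NA |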

time_MapScreen

Distribution

Distribution of values for time_MapScreen

0 missing values.

Summary statistics

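| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| time_MapScreen | numeric | 0 | 1 | 3.1 | 8.1 | 18 | 8.3236 | 1.74467 | ▁▇▃▁▁ | NA |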

time_ShopScreen

Median time in second when in shop

Distribution

Distribution of values for time_ShopScreen

0 missing values.

Summary statistics

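| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| time_ShopScreen | Median time in second when in shop | numeric | 0 | 1 | 0 | 23 | 180 | 28.82629 | 21.89392 | ▇▂▁▁▁ |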

time_StickerBook

Median time in second when in stickerbook

Distribution

Distribution of values for time_StickerBook

0 missing values.

Summary statistics

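| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| time_StickerBook | Median time in second when in stickerbook | numeric | 0 | 1 | 0 | 33 | 181 | 40.03571 | 28.12017 | ▇▅▁▁▁ |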

time_Credits

Median time in second when in credit screen

Distribution

Distribution of values for time_Credits

0 missing values.

Summary statistics

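| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| time_Credits | Median time in second when in credit screen | numeric | 0 | 1 | 0 | 0 | 69 | 8.397797 | 14.97402 | ▇▁▁▁▁ |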

numtrials

Number of trials played

Distribution

Distribution of values for numtrials

0 missing values.

Summary statistics

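| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| numtrials | Number of trials played | numeric | 0 | 1 | 210 | 1654 | 8828 | 1817.575 | 909.3312 | ▇▅▁▁▁ |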

numlettertrials

Number of letter trials played

Distribution

Distribution of values for numlettertrials

0 missing values.

Summary statistics

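| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| numlettertrials | Number of letter trials played | numeric | 0 | 1 | 177 | 901 | 4308 | 980.1877 | 452.4458 | ▇▅▁▁▁ |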

numsylltrials

Number of syllable trials played

Distribution

Distribution of values for numsylltrials

0 missing values.

Summary statistics

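| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| numsylltrials | Number of syllable trials played | numeric | 0 | 1 | 0 | 457 | 4358 | 505.2414 | 330.8901 | ▇▁▁▁▁ |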

numwordtrials

Number of word trials played

Distribution

Distribution of values for numwordtrials

0 missing values.

Summary statistics

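| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| numwordtrials | Number of word trials played | numeric | 0 | 1 | 0 | 117 | 4108 | 332.1454 | 469.8262 | ▇▁▁▁▁ |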

meantimeused

Mean time used on trials

Distribution

Distribution of values for meantimeused

0 missing values.

Summary statistics

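| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| meantimeused | Mean time used on trials | numeric | 0 | 1 | 1.7 | 3.7 | 11 | 3.845007 | 0.9643147 | ▇▇▁▁▁ |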

meancorrect

Proportion correct answer

Distribution

Distribution of values for meancorrect

0 missing values.

Summary statistics

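| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| meancorrect | Proportion correct answer | numeric | 0 | 1 | 0.64 | 0.83 | 0.99 | 0.8338436 | 0.0655215 | ▁▅▇▇▃ |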

numlevels

Number of levels played

Distribution

Distribution of values for numlevels

0 missing values.

Summary statistics

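| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| numlevels | Number of levels played | numeric | 0 | 1 | 22 | 191 | 1025 | 210.6451 | 111.8307 | ▇▅▁▁▁ |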

numtargets

Number of letters/syll/words encountered

Distribution

Distribution of values for numtargets

0 missing values.

Summary statistics

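| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| numtargets | Number of letters/syll/words encountered | numeric | 0 | 1 | 22 | 142 | 680 | 167.5576 | 105.2877 | ▇▅▂▁▁ |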

numdays

Number of days playing trials

Distribution

Distribution of values for numdays

0 missing values.

Summary statistics

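| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| numdays | Number of days playing trials | numeric | 0 | 1 | 1 | 24 | 56 | 24.53229 | 9.816386 | ▂▇▇▃▁ |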

Missingness report

Codebook table

JSON-LD metadata

The following JSON-LD can be found by search engines, if you share this codebook publicly on the web.

{
  "name": "codebook_data",
  "datePublished": "2022-06-08",
  "description": "The dataset has N=1657 rows and 25 columns.\n1653 rows have no missing values on any column.\n\n\n## Table of variables\nThis table contains variable names, labels, and number of missing values.\nSee the complete codebook for more.\n\n|name                      |label                                          | n_missing|\n|:-------------------------|:----------------------------------------------|---------:|\n|level_letters             |GG computed attained level of letter mastery   |         4|\n|level_syll                |GG computed attained level of syllable mastery |         4|\n|level_word                |GG computed attained level of word mastery     |         4|\n|total_play_hrs            |GG computed time spent playing                 |         0|\n|itemsbought               |Number of items bought                         |         0|\n|var_coin                  |Variation of coins in possession               |         0|\n|max_coin                  |Max number of coins owned                      |         0|\n|shopvisits                |Number of visits to shop                       |         0|\n|stickervisits             |Number of visits to stickerbook                |         0|\n|learnedlettersvisits      |Number of visits to learned letters screen     |         0|\n|creditssvisits            |Number of visits to credit screen              |         0|\n|time_LearnedLettersScreen |NA                                             |         0|\n|time_MapScreen            |NA                                             |         0|\n|time_ShopScreen           |Median time in second when in shop             |         0|\n|time_StickerBook          |Median time in second when in stickerbook      |         0|\n|time_Credits              |Median time in second when in credit screen    |         0|\n|numtrials                 |Number of trials played                        |         0|\n|numlettertrials           |Number of letter trials played                 |         0|\n|numsylltrials             |Number of syllable trials played               |         0|\n|numwordtrials             |Number of word trials played                   |         0|\n|meantimeused              |Mean time used on trials                       |         0|\n|meancorrect               |Proportion correct answer                      |         0|\n|numlevels                 |Number of levels played                        |         0|\n|numtargets                |Number of letters/syll/words encountered       |         0|\n|numdays                   |Number of days playing trials                  |         0|\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.9.4.9000).",
  "keywords": ["level_letters", "level_syll", "level_word", "total_play_hrs", "itemsbought", "var_coin", "max_coin", "shopvisits", "stickervisits", "learnedlettersvisits", "creditssvisits", "time_LearnedLettersScreen", "time_MapScreen", "time_ShopScreen", "time_StickerBook", "time_Credits", "numtrials", "numlettertrials", "numsylltrials", "numwordtrials", "meantimeused", "meancorrect", "numlevels", "numtargets", "numdays"],
  "@context": "http://schema.org/",
  "@type": "Dataset",
  "variableMeasured": [
    {
      "name": "level_letters",
      "description": "GG computed attained level of letter mastery",
      "@type": "propertyValue"
    },
    {
      "name": "level_syll",
      "description": "GG computed attained level of syllable mastery",
      "@type": "propertyValue"
    },
    {
      "name": "level_word",
      "description": "GG computed attained level of word mastery",
      "@type": "propertyValue"
    },
    {
      "name": "total_play_hrs",
      "description": "GG computed time spent playing",
      "@type": "propertyValue"
    },
    {
      "name": "itemsbought",
      "description": "Number of items bought ",
      "@type": "propertyValue"
    },
    {
      "name": "var_coin",
      "description": "Variation of coins in possession",
      "@type": "propertyValue"
    },
    {
      "name": "max_coin",
      "description": "Max number of coins owned",
      "@type": "propertyValue"
    },
    {
      "name": "shopvisits",
      "description": "Number of visits to shop",
      "@type": "propertyValue"
    },
    {
      "name": "stickervisits",
      "description": "Number of visits to stickerbook",
      "@type": "propertyValue"
    },
    {
      "name": "learnedlettersvisits",
      "description": "Number of visits to learned letters screen",
      "@type": "propertyValue"
    },
    {
      "name": "creditssvisits",
      "description": "Number of visits to credit screen",
      "@type": "propertyValue"
    },
    {
      "name": "time_LearnedLettersScreen",
      "@type": "propertyValue"
    },
    {
      "name": "time_MapScreen",
      "@type": "propertyValue"
    },
    {
      "name": "time_ShopScreen",
      "description": "Median time in second when in shop",
      "@type": "propertyValue"
    },
    {
      "name": "time_StickerBook",
      "description": "Median time in second when in stickerbook",
      "@type": "propertyValue"
    },
    {
      "name": "time_Credits",
      "description": "Median time in second when in credit screen",
      "@type": "propertyValue"
    },
    {
      "name": "numtrials",
      "description": "Number of trials played",
      "@type": "propertyValue"
    },
    {
      "name": "numlettertrials",
      "description": "Number of letter trials played",
      "@type": "propertyValue"
    },
    {
      "name": "numsylltrials",
      "description": "Number of syllable trials played",
      "@type": "propertyValue"
    },
    {
      "name": "numwordtrials",
      "description": "Number of word trials played",
      "@type": "propertyValue"
    },
    {
      "name": "meantimeused",
      "description": "Mean time used on trials",
      "@type": "propertyValue"
    },
    {
      "name": "meancorrect",
      "description": "Proportion correct answer",
      "@type": "propertyValue"
    },
    {
      "name": "numlevels",
      "description": "Number of levels played",
      "@type": "propertyValue"
    },
    {
      "name": "numtargets",
      "description": "Number of letters/syll/words encountered",
      "@type": "propertyValue"
    },
    {
      "name": "numdays",
      "description": "Number of days playing trials",
      "@type": "propertyValue"
    }
  ]
}
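
Before sharing the codebook, the JSON-LD can be checked programmatically, for example to confirm that it parses and to list variables that lack a description (in the block above, `time_LearnedLettersScreen` and `time_MapScreen` have none). The sketch below assumes the jsonlite package is installed and uses a shortened, hand-copied excerpt of the metadata rather than the full block.

```r
library(jsonlite)

# Shortened excerpt of the JSON-LD above (two variables only)
json_ld <- '{
  "name": "codebook_data",
  "datePublished": "2022-06-08",
  "@context": "http://schema.org/",
  "@type": "Dataset",
  "variableMeasured": [
    {
      "name": "level_letters",
      "description": "GG computed attained level of letter mastery",
      "@type": "propertyValue"
    },
    {
      "name": "time_MapScreen",
      "@type": "propertyValue"
    }
  ]
}'

# Check that the string is syntactically valid JSON
is_valid <- validate(json_ld)

# Parse; the array of variable objects is simplified to a data frame,
# with NA where a variable has no "description" field
meta <- fromJSON(json_ld)
vars <- meta$variableMeasured

# Variables that still need a label
undescribed <- vars$name[is.na(vars$description)]
undescribed
```

Adding labels to the undescribed variables in the data before regenerating the codebook would fill in the corresponding `description` fields.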