Here, we’re just setting a few options.
knitr::opts_chunk$set(
    warning = TRUE, # show warnings during codebook generation
    message = TRUE, # show messages during codebook generation
    error = TRUE,   # do not interrupt codebook generation in case of errors,
                    # usually better for debugging
    echo = TRUE     # show R code
)
ggplot2::theme_set(ggplot2::theme_bw())
Now, we’re preparing our data for the codebook.
library(codebook)
codebook_data <- readRDS("classical_final.rds")[, 2:26]
# to import an SPSS file from the same folder uncomment and edit the line below
# codebook_data <- rio::import("mydata.sav")
# for Stata
# codebook_data <- rio::import("mydata.dta")
# for CSV
# codebook_data <- rio::import("mydata.csv")
# omit the following lines if your missing values are already properly labelled
codebook_data <- detect_missing(codebook_data,
    only_labelled = TRUE, # only labelled values are autodetected as
                          # missing
    negative_values_are_missing = FALSE, # if TRUE, negative values are treated as missing
    ninety_nine_problems = TRUE # 99/999 are treated as missing values if they
                                # are more than 5 MAD from the median
)
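The "more than 5 MAD from the median" rule mentioned in the comments can be illustrated with a small base R sketch. The helper `flag_sentinel()` below is hypothetical and is not part of the codebook package; in particular, estimating the median and MAD from the non-sentinel values only is an assumption made for this illustration, not a description of how `detect_missing()` works internally.

```r
# Hypothetical helper, not part of the codebook package: treat a sentinel
# value (e.g. 99 or 999) as missing only when it is an outlier.
flag_sentinel <- function(x, sentinels = c(99, 999), n_mad = 5) {
  # Estimate centre and spread from the non-sentinel values (an assumption)
  core <- x[!is.na(x) & !(x %in% sentinels)]
  centre <- median(core)
  spread <- mad(core)
  # A sentinel becomes NA if it lies more than n_mad MADs from the median
  is_outlier <- !is.na(x) & x %in% sentinels & abs(x - centre) > n_mad * spread
  x[is_outlier] <- NA
  x
}

ratings <- c(1, 2, 3, 4, 5, 3, 2, 4, 99)    # 99 is far outside the 1-5 range
ages    <- c(97, 98, 99, 100, 101, 99, 98)  # 99 is a plausible value here

flag_sentinel(ratings)  # the 99 is recoded as NA
flag_sentinel(ages)     # the 99s are kept
```

The point of such a rule is that 99 is only suspicious when it sits far away from the bulk of the data, so a variable in which 99 is an ordinary value is left untouched.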
# If you are not using formr, the codebook package needs to guess which items
# form a scale. The following line finds item aggregates with names like this:
# scale = scale_1 + scale_2R + scale_3R
# identifying these aggregates allows the codebook function to
# automatically compute reliabilities.
# However, it will not reverse items automatically.
codebook_data <- detect_scales(codebook_data)
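Because items are not reversed automatically, negatively keyed items have to be reversed by hand before the aggregate is formed. The sketch below shows one way to end up with the naming pattern `scale = scale_1 + scale_2R + scale_3R`. The data frame `survey`, the stem `extraversion`, the 1 to 5 response range, and the use of a row mean as the aggregate are all assumptions made for illustration.

```r
# Toy data: three items on a 1-5 response scale (all names are illustrative)
survey <- data.frame(
  extraversion_1 = c(5, 4, 2),
  extraversion_2 = c(1, 2, 4),  # negatively keyed item
  extraversion_3 = c(2, 1, 5)   # negatively keyed item
)

# Reverse the negatively keyed items by hand: on a 1-5 scale, reversed = 6 - raw.
# The "R" suffix marks the reversed version, matching the pattern scale_2R.
survey$extraversion_2R <- 6 - survey$extraversion_2
survey$extraversion_3R <- 6 - survey$extraversion_3

# Drop the raw versions so that only one column per item remains
survey$extraversion_2 <- NULL
survey$extraversion_3 <- NULL

# Aggregate under the common stem, so that the result reads
# extraversion = extraversion_1 + extraversion_2R + extraversion_3R
items <- c("extraversion_1", "extraversion_2R", "extraversion_3R")
survey$extraversion <- rowMeans(survey[, items])
survey
```

With the columns named this way, the aggregate and its items share a stem and differ only by the numeric suffix and the optional `R`, which is the pattern described in the comments above.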
Create codebook
codebook(codebook_data)
## Warning in inline_hist(., 5): Variable contains Inf or -Inf value(s) that were
## converted to NA.
## Warning in inline_hist(., 5): Variable contains Inf or -Inf value(s) that were
## converted to NA.
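These warnings indicate that at least one numeric column contains `Inf` or `-Inf`; in the output below, `total_play_hrs` reports a minimum and a mean of `-Inf`. A minimal sketch of how such values could be located and recoded before calling `codebook()` follows. The data frame `toy` is a made-up stand-in for the real data, and whether recoding to `NA` is appropriate depends on how the infinite values arose.

```r
# Toy stand-in for the real data; the column names mirror the codebook
toy <- data.frame(
  total_play_hrs = c(4.8, -Inf, 22, 1.5),
  numdays = c(24, 1, 56, 10)
)

# Count infinite values per numeric column to locate the affected variables
numeric_cols <- vapply(toy, is.numeric, logical(1))
inf_counts <- vapply(toy[numeric_cols], function(x) sum(is.infinite(x)), integer(1))
inf_counts

# Recode Inf and -Inf as NA so that summary statistics stay finite
toy[numeric_cols] <- lapply(toy[numeric_cols], function(x) {
  x[is.infinite(x)] <- NA
  x
})
mean(toy$total_play_hrs, na.rm = TRUE)
```

Recoding before the codebook is generated makes the resulting missing values show up in `n_missing` rather than only in a warning.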
Dataset name: codebook_data
The dataset has N=1657 rows and 25 columns. 1653 rows have no missing values on any column.
GG computed attained level of letter mastery
Distribution of values for level_letters
4 missing values.
| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
|---|---|---|---|---|---|---|---|---|---|
| level_letters | GG computed attained level of letter mastery | character | 4 | 0.997586 | 24 | 0 | 1 | 3 | 0 |
GG computed attained level of syllable mastery
Distribution of values for level_syll
4 missing values.
| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
|---|---|---|---|---|---|---|---|---|---|
| level_syll | GG computed attained level of syllable mastery | character | 4 | 0.997586 | 88 | 0 | 1 | 2 | 0 |
GG computed attained level of word mastery
Distribution of values for level_word
4 missing values.
| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
|---|---|---|---|---|---|---|---|---|---|
| level_word | GG computed attained level of word mastery | character | 4 | 0.997586 | 19 | 0 | 1 | 2 | 0 |
GG computed time spent playing
Distribution of values for total_play_hrs
0 missing values.
## Warning in inline_hist(., 5): Variable contains Inf or -Inf value(s) that were
## converted to NA.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | hist |
|---|---|---|---|---|---|---|---|---|---|
| total_play_hrs | GG computed time spent playing | numeric | 0 | 1 | -Inf | 4.8 | 22 | -Inf | ▇▇▁▁▁ |
Number of items bought
Distribution of values for itemsbought
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| itemsbought | Number of items bought | numeric | 0 | 1 | 0 | 19 | 149 | 21.34882 | 14.26594 | ▇▂▁▁▁ |
Variation of coins in possession
Distribution of values for var_coin
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| var_coin | Variation of coins in possession | numeric | 0 | 1 | 47 | 171 | 2296 | 250.5931 | 223.9861 | ▇▁▁▁▁ |
Max number of coins owned
Distribution of values for max_coin
0 missing values.
| name | label | data_type | n_missing | complete_rate | n_unique | empty | min | max | whitespace |
|---|---|---|---|---|---|---|---|---|---|
| max_coin | Max number of coins owned | character | 0 | 1 | 49 | 0 | 2 | 3 | 0 |
Number of visits to shop
Distribution of values for shopvisits
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| shopvisits | Number of visits to shop | numeric | 0 | 1 | 0 | 24 | 229 | 32.1382 | 28.38471 | ▇▂▁▁▁ |
Number of visits to stickerbook
Distribution of values for stickervisits
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| stickervisits | Number of visits to stickerbook | numeric | 0 | 1 | 0 | 12 | 94 | 14.57695 | 12.22254 | ▇▂▁▁▁ |
Number of visits to learned letters screen
Distribution of values for learnedlettersvisits
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| learnedlettersvisits | Number of visits to learned letters screen | numeric | 0 | 1 | 0 | 11 | 142 | 12.95232 | 10.77776 | ▇▁▁▁▁ |
Number of visits to credit screen
Distribution of values for creditssvisits
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| creditssvisits | Number of visits to credit screen | numeric | 0 | 1 | 0 | 0 | 71 | 2.654798 | 6.147667 | ▇▁▁▁▁ |
Distribution of values for time_LearnedLettersScreen
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| time_LearnedLettersScreen | numeric | 0 | 1 | 0 | 8.3 | 158 | 14.64278 | 17.03142 | ▇▁▁▁▁ | NA |
Distribution of values for time_MapScreen
0 missing values.
| name | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist | label |
|---|---|---|---|---|---|---|---|---|---|---|
| time_MapScreen | numeric | 0 | 1 | 3.1 | 8.1 | 18 | 8.3236 | 1.74467 | ▁▇▃▁▁ | NA |
Median time in second when in shop
Distribution of values for time_ShopScreen
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| time_ShopScreen | Median time in second when in shop | numeric | 0 | 1 | 0 | 23 | 180 | 28.82629 | 21.89392 | ▇▂▁▁▁ |
Median time in second when in stickerbook
Distribution of values for time_StickerBook
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| time_StickerBook | Median time in second when in stickerbook | numeric | 0 | 1 | 0 | 33 | 181 | 40.03571 | 28.12017 | ▇▅▁▁▁ |
Median time in second when in credit screen
Distribution of values for time_Credits
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| time_Credits | Median time in second when in credit screen | numeric | 0 | 1 | 0 | 0 | 69 | 8.397797 | 14.97402 | ▇▁▁▁▁ |
Number of trials played
Distribution of values for numtrials
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| numtrials | Number of trials played | numeric | 0 | 1 | 210 | 1654 | 8828 | 1817.575 | 909.3312 | ▇▅▁▁▁ |
Number of letter trials played
Distribution of values for numlettertrials
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| numlettertrials | Number of letter trials played | numeric | 0 | 1 | 177 | 901 | 4308 | 980.1877 | 452.4458 | ▇▅▁▁▁ |
Number of syllable trials played
Distribution of values for numsylltrials
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| numsylltrials | Number of syllable trials played | numeric | 0 | 1 | 0 | 457 | 4358 | 505.2414 | 330.8901 | ▇▁▁▁▁ |
Number of word trials played
Distribution of values for numwordtrials
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| numwordtrials | Number of word trials played | numeric | 0 | 1 | 0 | 117 | 4108 | 332.1454 | 469.8262 | ▇▁▁▁▁ |
Mean time used on trials
Distribution of values for meantimeused
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| meantimeused | Mean time used on trials | numeric | 0 | 1 | 1.7 | 3.7 | 11 | 3.845007 | 0.9643147 | ▇▇▁▁▁ |
Proportion correct answer
Distribution of values for meancorrect
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| meancorrect | Proportion correct answer | numeric | 0 | 1 | 0.64 | 0.83 | 0.99 | 0.8338436 | 0.0655215 | ▁▅▇▇▃ |
Number of levels played
Distribution of values for numlevels
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| numlevels | Number of levels played | numeric | 0 | 1 | 22 | 191 | 1025 | 210.6451 | 111.8307 | ▇▅▁▁▁ |
Number of letters/syll/words encountered
Distribution of values for numtargets
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| numtargets | Number of letters/syll/words encountered | numeric | 0 | 1 | 22 | 142 | 680 | 167.5576 | 105.2877 | ▇▅▂▁▁ |
Number of days playing trials
Distribution of values for numdays
0 missing values.
| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| numdays | Number of days playing trials | numeric | 0 | 1 | 1 | 24 | 56 | 24.53229 | 9.816386 | ▂▇▇▃▁ |
Search engines can find the following JSON-LD if you share this codebook publicly on the web.
{
"name": "codebook_data",
"datePublished": "2022-06-08",
"description": "The dataset has N=1657 rows and 25 columns.\n1653 rows have no missing values on any column.\n\n\n## Table of variables\nThis table contains variable names, labels, and number of missing values.\nSee the complete codebook for more.\n\n|name |label | n_missing|\n|:-------------------------|:----------------------------------------------|---------:|\n|level_letters |GG computed attained level of letter mastery | 4|\n|level_syll |GG computed attained level of syllable mastery | 4|\n|level_word |GG computed attained level of word mastery | 4|\n|total_play_hrs |GG computed time spent playing | 0|\n|itemsbought |Number of items bought | 0|\n|var_coin |Variation of coins in possession | 0|\n|max_coin |Max number of coins owned | 0|\n|shopvisits |Number of visits to shop | 0|\n|stickervisits |Number of visits to stickerbook | 0|\n|learnedlettersvisits |Number of visits to learned letters screen | 0|\n|creditssvisits |Number of visits to credit screen | 0|\n|time_LearnedLettersScreen |NA | 0|\n|time_MapScreen |NA | 0|\n|time_ShopScreen |Median time in second when in shop | 0|\n|time_StickerBook |Median time in second when in stickerbook | 0|\n|time_Credits |Median time in second when in credit screen | 0|\n|numtrials |Number of trials played | 0|\n|numlettertrials |Number of letter trials played | 0|\n|numsylltrials |Number of syllable trials played | 0|\n|numwordtrials |Number of word trials played | 0|\n|meantimeused |Mean time used on trials | 0|\n|meancorrect |Proportion correct answer | 0|\n|numlevels |Number of levels played | 0|\n|numtargets |Number of letters/syll/words encountered | 0|\n|numdays |Number of days playing trials | 0|\n\n### Note\nThis dataset was automatically described using the [codebook R package](https://rubenarslan.github.io/codebook/) (version 0.9.4.9000).",
"keywords": ["level_letters", "level_syll", "level_word", "total_play_hrs", "itemsbought", "var_coin", "max_coin", "shopvisits", "stickervisits", "learnedlettersvisits", "creditssvisits", "time_LearnedLettersScreen", "time_MapScreen", "time_ShopScreen", "time_StickerBook", "time_Credits", "numtrials", "numlettertrials", "numsylltrials", "numwordtrials", "meantimeused", "meancorrect", "numlevels", "numtargets", "numdays"],
"@context": "http://schema.org/",
"@type": "Dataset",
"variableMeasured": [
{
"name": "level_letters",
"description": "GG computed attained level of letter mastery",
"@type": "propertyValue"
},
{
"name": "level_syll",
"description": "GG computed attained level of syllable mastery",
"@type": "propertyValue"
},
{
"name": "level_word",
"description": "GG computed attained level of word mastery",
"@type": "propertyValue"
},
{
"name": "total_play_hrs",
"description": "GG computed time spent playing",
"@type": "propertyValue"
},
{
"name": "itemsbought",
"description": "Number of items bought ",
"@type": "propertyValue"
},
{
"name": "var_coin",
"description": "Variation of coins in possession",
"@type": "propertyValue"
},
{
"name": "max_coin",
"description": "Max number of coins owned",
"@type": "propertyValue"
},
{
"name": "shopvisits",
"description": "Number of visits to shop",
"@type": "propertyValue"
},
{
"name": "stickervisits",
"description": "Number of visits to stickerbook",
"@type": "propertyValue"
},
{
"name": "learnedlettersvisits",
"description": "Number of visits to learned letters screen",
"@type": "propertyValue"
},
{
"name": "creditssvisits",
"description": "Number of visits to credit screen",
"@type": "propertyValue"
},
{
"name": "time_LearnedLettersScreen",
"@type": "propertyValue"
},
{
"name": "time_MapScreen",
"@type": "propertyValue"
},
{
"name": "time_ShopScreen",
"description": "Median time in second when in shop",
"@type": "propertyValue"
},
{
"name": "time_StickerBook",
"description": "Median time in second when in stickerbook",
"@type": "propertyValue"
},
{
"name": "time_Credits",
"description": "Median time in second when in credit screen",
"@type": "propertyValue"
},
{
"name": "numtrials",
"description": "Number of trials played",
"@type": "propertyValue"
},
{
"name": "numlettertrials",
"description": "Number of letter trials played",
"@type": "propertyValue"
},
{
"name": "numsylltrials",
"description": "Number of syllable trials played",
"@type": "propertyValue"
},
{
"name": "numwordtrials",
"description": "Number of word trials played",
"@type": "propertyValue"
},
{
"name": "meantimeused",
"description": "Mean time used on trials",
"@type": "propertyValue"
},
{
"name": "meancorrect",
"description": "Proportion correct answer",
"@type": "propertyValue"
},
{
"name": "numlevels",
"description": "Number of levels played",
"@type": "propertyValue"
},
{
"name": "numtargets",
"description": "Number of letters/syll/words encountered",
"@type": "propertyValue"
},
{
"name": "numdays",
"description": "Number of days playing trials",
"@type": "propertyValue"
}
]
}
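For context, JSON-LD is conventionally made visible to crawlers by embedding it in an HTML page inside a `script` element of type `application/ld+json`. The base R sketch below shows that wrapping step on a shortened, hypothetical metadata string; it is an illustration of the convention, not a description of what the codebook package does internally.

```r
# Hypothetical, shortened metadata string standing in for the full JSON-LD
json_ld <- '{"@context": "http://schema.org/", "@type": "Dataset", "name": "codebook_data"}'

# Wrap the metadata in the script element conventionally used for JSON-LD
script_tag <- paste0(
  '<script type="application/ld+json">\n',
  json_ld,
  '\n</script>'
)
cat(script_tag)
```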