tell us

write a little paragraph about the package that the function is from, what it does, and how it might be useful to other psychology students

describe() is a function in the psych package that takes a set of data and returns a data.frame of descriptive statistics: the number of valid cases, mean, standard deviation, median, trimmed mean, median absolute deviation, minimum, maximum, range, skew, kurtosis and standard error. This is useful for psychology students because these are some of the most frequently reported statistics, and because the minimum, maximum and range for each variable make it easy to check for coding errors (for example, an impossible value). The statistics are only meaningful for numeric data: non-numeric variables such as factors are converted to numeric codes and marked with an asterisk in the output, so their values should not be interpreted. There are other versions of this function: describeData, which reports on the data types in the dataset, and describeFast, which produces fewer statistics for a briefer overview. There is also describeBy, which describes the data separately for each group, similar to how one would use group_by() to group variables together.

show us

write a little demo to show how to install/load the package and use the function on some real data

install and load packages

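# install.packages(c("palmerpenguins", "tidyverse", "psych"))  # run once if the packages are not yet installed
library(palmerpenguins)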
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.4     ✓ purrr   0.3.4
## ✓ tibble  3.1.2     ✓ dplyr   1.0.6
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(psych)
## 
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha

get some data

penguins <- penguins 

use the function

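#example 1, trial 1 - describe() has no column argument, so pull the column out first
penguins %>%
  pull(bill_length_mm) %>%
  describe()

#example 1, trial 2 - or pass the column directly as a vector
describe(penguins$bill_depth_mm)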
#example 1 - descriptives for every variable in the data set
penguins %>%
  describe()
##                   vars   n    mean     sd  median trimmed    mad    min    max
## species*             1 344    1.92   0.89    2.00    1.90   1.48    1.0    3.0
## island*              2 344    1.66   0.73    2.00    1.58   1.48    1.0    3.0
## bill_length_mm       3 342   43.92   5.46   44.45   43.91   7.04   32.1   59.6
## bill_depth_mm        4 342   17.15   1.97   17.30   17.17   2.22   13.1   21.5
## flipper_length_mm    5 342  200.92  14.06  197.00  200.34  16.31  172.0  231.0
## body_mass_g          6 342 4201.75 801.95 4050.00 4154.01 889.56 2700.0 6300.0
## sex*                 7 333    1.50   0.50    2.00    1.51   0.00    1.0    2.0
## year                 8 344 2008.03   0.82 2008.00 2008.04   1.48 2007.0 2009.0
##                    range  skew kurtosis    se
## species*             2.0  0.16    -1.73  0.05
## island*              2.0  0.61    -0.91  0.04
## bill_length_mm      27.5  0.05    -0.89  0.30
## bill_depth_mm        8.4 -0.14    -0.92  0.11
## flipper_length_mm   59.0  0.34    -1.00  0.76
## body_mass_g       3600.0  0.47    -0.74 43.36
## sex*                 1.0 -0.02    -2.01  0.03
## year                 2.0 -0.05    -1.51  0.04
#example 2
describe(penguins[, c("body_mass_g", "flipper_length_mm")])
##                   vars   n    mean     sd median trimmed    mad  min  max range
## body_mass_g          1 342 4201.75 801.95   4050 4154.01 889.56 2700 6300  3600
## flipper_length_mm    2 342  200.92  14.06    197  200.34  16.31  172  231    59
##                   skew kurtosis    se
## body_mass_g       0.47    -0.74 43.36
## flipper_length_mm 0.34    -1.00  0.76
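
The describeBy() variant mentioned in the first section can be used in much the same way. The sketch below is our own illustration rather than part of the original demo: it assumes psych and palmerpenguins are installed, and the names peng_df and by_species are just placeholders. Converting the tibble to a plain data.frame first is a precaution, not a documented requirement.

```r
library(psych)
library(palmerpenguins)

# convert the tibble to a plain data.frame (a precaution, see note above)
peng_df <- as.data.frame(penguins)

# descriptives for two numeric variables, computed separately for each species
by_species <- describeBy(
  peng_df[, c("body_mass_g", "flipper_length_mm")],
  group = peng_df$species
)

# the result holds one describe() table per group; pull out a single group
by_species[["Adelie"]]
```

Printing by_species on its own shows the tables for all three species one after another.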

more resources

write a little paragraph about how you learned about the function - what did you google? Include a list of the documentation that you found useful and resources that someone learning about the function might need. If you can find pictures or memes to include, great!!

To find out more about the function, we Googled it and found [this blog](https://methodenlehre.github.io/SGSCLM-R-course/descriptive-statistics.html). It gives the generic form of the function, which we quote here.

"The generic form is: describe(x, na.rm = TRUE, interp = FALSE, skew = TRUE, ranges = TRUE, trim = .1, type = 3, check = TRUE).

x stands for the data frame or the variable to be analyzed (df$variable). The defaults are:

* interp = FALSE refers to the definition of the median (interp = TRUE uses our method of averaging adjacent values for an even n)
* skew = TRUE displays skewness, kurtosis and the trimmed mean
* ranges = TRUE displays the range
* trim = .1 refers to the proportion of the distribution that is trimmed at the lower and upper ends for the trimmed mean (default trimming is 10% on both sides, thus the trimmed mean is computed from the middle 80% of the data)
* type = 3 refers to the method of computing skewness and kurtosis (more here: ?psych::describe)
* check = TRUE refers to checking for non-numeric variables in the dataset (for which describe has no use); if check = FALSE non-numeric variables exist, an error message is displayed."
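
To see what a few of these arguments do, here is a minimal sketch of our own (not taken from the blog) that changes some defaults for one penguin variable. It assumes psych and palmerpenguins are installed; mass and short are placeholder names, and the comments on which columns disappear follow the quoted description above.

```r
library(psych)
library(palmerpenguins)

mass <- penguins$body_mass_g  # numeric vector with two missing values

# switch off skew and ranges for a shorter table
# (per the quoted description, this drops skew, kurtosis, the trimmed mean and the range)
short <- describe(mass, skew = FALSE, ranges = FALSE)
short

# trim 20% from each tail instead of the default 10% for the trimmed mean
describe(mass, trim = 0.2)
```

Comparing the two printed tables with the full output from the demo shows which columns each argument controls.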