The hypegrammaR package implements the IMPACT quantitative data analysis guidelines. While this guide works on its own, all of this will make a lot more sense if you read those first.
So you want to analyse some data. Cool. Get all your stuff as csv files first. All your stuff means:
Let’s assume for now that we have nothing but the data from a simple random survey design (not weighted, no cluster sampling).
Install the hypegrammaR package (this line you only have to run once when using hypegrammaR for the first time, or to update to a new version):
devtools::install_github("mabafaba/hypegrammaR", build_opts = c())
(The build_opts = c() argument makes sure the package includes extra help pages and documentation.)
Load the hypegrammaR package:
library(hypegrammaR)
Load the data:
mydata <- read.csv("../tests/testthat/data.csv")
Identify your analysis parameters.
Now we can use the function analyse_indicator with the above as parameters:
result <- analyse_indicator(
  data = mydata,
  dependent.var = "nutrition_need",
  independent.var = "region",
  hypothesis.type = "group_difference",
  independent.var.type = "categorical",
  dependent.var.type = "numerical",
  sampling.strategy.cluster = FALSE,
  sampling.strategy.stratified = FALSE)
You can find out exactly what you can or should enter for these parameters by running ?analyse_indicator, which opens the help page for the function (this works for any function).
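Before running the call above, it can be worth checking that the column names passed as dependent.var and independent.var really exist in the data and have the types you declared. The following is a minimal base R sketch, not part of hypegrammaR; the toy data frame and the helper name check_analysis_columns are made up for illustration and stand in for mydata.

```r
# Toy stand-in for `mydata` (hypothetical values, for illustration only)
toy_data <- data.frame(
  nutrition_need = c(0.2, 0.4, 0.1, 0.3),
  region = c("north", "south", "north", "east"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stops with a clear error if a column is missing,
# otherwise reports for each column whether it is numeric.
check_analysis_columns <- function(data, dependent.var, independent.var) {
  needed <- c(dependent.var, independent.var)
  missing_cols <- setdiff(needed, names(data))
  if (length(missing_cols) > 0) {
    stop("columns not found in data: ", paste(missing_cols, collapse = ", "))
  }
  vapply(data[needed], is.numeric, logical(1))
}

column_is_numeric <- check_analysis_columns(toy_data, "nutrition_need", "region")
print(column_is_numeric)
```

A numeric dependent variable and a non-numeric independent variable would match the "numerical" and "categorical" types used in the call above.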
The analyse_indicator function gives you a number of things:
First, a message telling you how it went:
result$message
#> [1] "success (or unidentified issue)"
That’s what we want to see. If something went wrong, the message should tell you what happened.
result$parameters
#> $dependent.var
#> [1] "nutrition_need"
#>
#> $independent.var
#> [1] "region"
#>
#> $dependent.var.type
#> [1] "numerical"
#>
#> $independent.var.type
#> [1] "categorical"
#>
#> $hypothesis.type
#> [1] "group_difference"
#>
#> $sampling.strategy.cluster
#> [1] FALSE
#>
#> $sampling.strategy.stratified
#> [1] FALSE
#>
#> $case
#> [1] "CASE_group_difference_numerical_categorical"
#> attr(,"class")
#> [1] "analysis_case"
As you can see, it remembers what your input parameters were. It also adds a standardised name for the analysis case.
result$summary.statistic
#>                dependent.var.value independent.var.value   numbers
#> capitalcentral                  NA        capitalcentral 0.2724824
#> east                            NA                  east 0.2006114
#> north                           NA                 north 0.2956114
#> northeast                       NA             northeast 0.2002275
#> south                           NA                 south 0.1142410
#> southeast                       NA             southeast 0.2470808
#> west                            NA                  west 0.2521008
#>                         se       min       max
#> capitalcentral 0.006409252 0.2599205 0.2850443
#> east           0.007828254 0.1852683 0.2159545
#> north          0.007581281 0.2807523 0.3104704
#> northeast      0.007792918 0.1849537 0.2155014
#> south          0.005627861 0.1032106 0.1252714
#> southeast      0.009321706 0.2288106 0.2653510
#> west           0.008486603 0.2354674 0.2687343
In this case, “numbers” are averages, because the dependent variable was numerical. min and max give the corresponding confidence interval. dependent.var.value gives the corresponding variable values if the dependent variable is categorical (NA otherwise). The summary statistic is always organised with exactly these columns, no matter what analysis you ran. This way, if you add a new visualisation or output format, it will work for any output from this function.
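The min and max values printed above are consistent with a normal-approximation 95% confidence interval, that is, the mean plus or minus roughly 1.96 standard errors. The sketch below reproduces that arithmetic in base R on made-up data for an unweighted simple random sample. It illustrates the idea only and is not hypegrammaR's internal code; the helper name summarise_group and the toy values are assumptions.

```r
# Made-up data: a numerical variable and a grouping variable
toy_data <- data.frame(
  need = c(0.1, 0.3, 0.2, 0.4, 0.5, 0.3, 0.2, 0.6),
  region = rep(c("north", "south"), each = 4)
)

# Hypothetical helper: mean, standard error and normal-approximation 95% CI
summarise_group <- function(x) {
  se <- sd(x) / sqrt(length(x))
  half_width <- qnorm(0.975) * se
  c(numbers = mean(x), se = se,
    min = mean(x) - half_width, max = mean(x) + half_width)
}

# One row per group, using the same column names as the summary statistic
group_summary <- t(sapply(split(toy_data$need, toy_data$region), summarise_group))
print(group_summary)

# The same arithmetic applied to the capitalcentral row printed above
capitalcentral_ci <- 0.2724824 + c(-1, 1) * qnorm(0.975) * 0.006409252
print(capitalcentral_ci)
```

For weighted or clustered designs the standard error is computed differently, so this shortcut only mirrors the simple case assumed in this guide.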
Next, there’s information on which (if any) hypothesis test was used and the p value:
result$hypothesis.test
#> $result
#> $result$t
#> [1] -7.103762
#>
#> $result$p.value
#> [1] 1.25167e-12
#>
#>
#> $parameters
#> $parameters$df
#> [1] 21655
#>
#>
#> $name
#> [1] "two sample ttest on difference in means (two sided)"
You’ll probably be most interested in the p-value and the type of test that was used.
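To see how the reported numbers fit together: for a two-sided t-test, the p-value is twice the tail probability of the t distribution beyond the absolute value of the test statistic. The base R sketch below plugs in the t value and degrees of freedom printed above. Assuming the test is an ordinary two-sided t-test, as its name suggests, the result should land close to the printed p-value.

```r
# Values copied from the result$hypothesis.test output above
t_value <- -7.103762
df <- 21655

# Two-sided p-value: probability of a t statistic at least this extreme
# in either tail of the t distribution
p_value <- 2 * pt(-abs(t_value), df = df)
print(p_value)

# A common (but arbitrary) convention is to compare against 0.05
print(p_value < 0.05)
```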
Finally, it returned a plot function that is appropriate to visualise the type of analysis that was performed:
labels_summary_statistic<-function(x){x}
visualise<-map_to_visualisation(result$parameters$case)
myvisualisation<-visualise(result)
myvisualisation
For advanced users (who know ggplot): the visualisation function returns a ggplot object, so you can add or overwrite ggplot elements; for example:
myvisualisation + coord_polar()
To save or export any results, you can use the generic map_to_file function. For example:
map_to_file(result$summary.statistic, "summary_stat.csv")
map_to_file(myvisualisation, "barchart.jpg", width = "5", height = "5")
You will find the files in your current working directory (which you can look up with getwd()).
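If you are unsure where exported files end up, the base R sketch below writes a small table and checks that the file exists. It uses write.csv and a temporary directory purely for illustration (map_to_file itself is not called here), and the table values are made up.

```r
# Made-up table standing in for a summary statistic
toy_summary <- data.frame(
  independent.var.value = c("north", "south"),
  numbers = c(0.30, 0.11)
)

# Write to a temporary directory so the example leaves no files behind in
# your project; replace tempdir() with getwd() to use the working directory
output_path <- file.path(tempdir(), "summary_stat.csv")
write.csv(toy_summary, output_path, row.names = FALSE)

# Confirm the file was written, then read it back
print(file.exists(output_path))
read_back <- read.csv(output_path)
print(read_back)
```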
…
…