Selection of Appropriate Statistical Methods

In data science, statistical methods are available for the analysis and interpretation of data in each specific situation. To select the appropriate method, one needs to know the assumptions and conditions under which each method applies.

Level of measurement:

Each of your variables will be categorical (nominal), ordinal (rank-ordered), interval, or ratio-level. The level must be identified for both your independent and dependent variables. A nominal-level variable is one whose categories just have names. Rank-ordered data is data that is ordered, like the finishing places in a horse race. When the distance between units is the same, you have interval data. Ratio data is interval data that also has a true zero point.
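As a minimal sketch, the levels of measurement can be written down as an ordered enumeration in Python and attached to each variable before any test is chosen. The variable names below (`horse_breed`, `finish_position`, and so on) and the helper `is_at_least` are hypothetical and serve only as illustration.

```python
from enum import IntEnum


class Level(IntEnum):
    """Levels of measurement, ordered from least to most informative."""
    NOMINAL = 1   # categories that just have names
    ORDINAL = 2   # ordered categories, e.g. finishing places in a race
    INTERVAL = 3  # equal distances between units
    RATIO = 4     # equal distances plus a true zero point


# Hypothetical variables, each tagged with its level of measurement.
variables = {
    "horse_breed": Level.NOMINAL,
    "finish_position": Level.ORDINAL,
    "track_temperature_celsius": Level.INTERVAL,
    "race_time_seconds": Level.RATIO,
}


def is_at_least(level: Level, minimum: Level) -> bool:
    """Return True when `level` carries at least as much information as `minimum`."""
    return level >= minimum


# Keep only the variables measured at interval level or higher.
numeric_like = [
    name for name, level in variables.items()
    if is_at_least(level, Level.INTERVAL)
]
print(numeric_like)
```

Tagging every independent and dependent variable this way makes the later choice of analysis a lookup rather than a guess.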

Statistical Analysis:

First, you have to clarify what you want to find out. The research question or hypothesis is typically phrased in terms of finding differences, finding relationships, or predicting. "Difference-type" questions have interval- or ratio-level Y variables and a categorical-level X variable. The appropriate statistical analyses for these questions are ANOVA and MANOVA. For relationship questions with ordinal-, interval-, or ratio-level variables, the appropriate analysis is typically a Spearman or Pearson correlation. Relationship questions with two categorical variables can be examined with a chi-square test. For prediction questions, linear, ordinal, or multinomial regressions are the appropriate analyses when the outcome variable is interval-, ordinal-, or categorical-level, respectively. When a categorical-level predictor has more than two levels, it has to be dummy coded.
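The rules in this section can be sketched as a small lookup in Python. This is an illustrative sketch, not a definitive decision procedure: the function names `suggest_analysis` and `dummy_code` are hypothetical, and several choices are assumptions that go beyond the text, namely using Pearson when both variables are interval or ratio, using Spearman when an ordinal variable is involved, treating ratio-level outcomes like interval-level ones for linear regression, and taking the first level in sorted order as the dummy-coding reference category.

```python
NUMERIC = {"interval", "ratio"}


def suggest_analysis(question, outcome, predictor):
    """Suggest an analysis from the question type and variable levels.

    question:  "difference", "relationship", or "prediction"
    outcome:   level of the Y variable
    predictor: level of the X variable
    Levels are "categorical", "ordinal", "interval", or "ratio".
    """
    if question == "difference":
        if outcome in NUMERIC and predictor == "categorical":
            return "ANOVA (MANOVA if there are several outcome variables)"
    elif question == "relationship":
        if outcome == predictor == "categorical":
            return "chi-square test"
        # Assumption: Pearson only when both variables are interval or ratio.
        if outcome in NUMERIC and predictor in NUMERIC:
            return "Pearson correlation"
        # Assumption: Spearman as soon as an ordinal variable is involved.
        if "categorical" not in (outcome, predictor):
            return "Spearman correlation"
    elif question == "prediction":
        # Assumption: ratio-level outcomes are handled like interval-level ones.
        if outcome in NUMERIC:
            return "linear regression"
        if outcome == "ordinal":
            return "ordinal regression"
        if outcome == "categorical":
            return "multinomial regression"
    raise ValueError(
        f"no rule for question={question!r}, outcome={outcome!r}, "
        f"predictor={predictor!r}"
    )


def dummy_code(values):
    """Dummy code a categorical variable with k levels into k - 1 indicators.

    Assumption: the first level in sorted order is the reference category,
    so it is represented by all indicators being 0.
    """
    levels = sorted(set(values))
    return [{level: int(value == level) for level in levels[1:]} for value in values]


print(suggest_analysis("difference", "ratio", "categorical"))
print(suggest_analysis("relationship", "ordinal", "interval"))
print(dummy_code(["bay", "chestnut", "grey", "bay"]))
```

Combinations that the text does not cover, such as a relationship question with one categorical and one interval variable, raise a `ValueError` rather than return a guess.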

Selecting the appropriate statistical method is a very important step in any analysis. A wrong choice not only creates serious problems when interpreting the findings but also affects the conclusions of the study.