Saurabh Belsare, Ankush Desai, Jieun Kim and Allen Tang
2016-12-02
MFA Package
The package "mfa" implements a statistical multivariate technique called Multiple Factor Analysis (MFA).
In addition to the main "mfa" function, it provides a set of complementary functions and methods to compute summaries of eigenvalues, contributions, coefficients to study the between-table structure (both \( R_V \) and \( L_g \)), and bootstrapping to estimate the stability of the compromise factor scores.
For visualization, the package includes functions for five types of plots: eigenvalues, factor scores, partial factor scores, variable loadings, and bootstrap ratios.
A Shiny App is available for an interactive visualization.
The first step is to create an "mfa" object with the function mfa():
mymfa <- mfa(data, sets, supplData, ncomps = NULL, center = TRUE, scale = TRUE)
This is the main mfa function that implements all the calculations for performing the Multiple Factor Analysis. The function takes six arguments:
data: the input data, which contains only the assessor inputs
sets: a list of column indices marking the columns that belong to each assessor
supplData: the list of supplementary data for an additional factor analysis
ncomps: the number of components for the MFA
center: a boolean flag that decides whether to center the data
scale: a boolean flag that decides whether to scale the data
The function returns an object of type mfa: a list containing all the output information computed by the analysis.
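To make the role of sets, center, and scale concrete, here is a minimal base-R sketch of how a list of column indices can split one data matrix into per-assessor tables. The data matrix is randomly generated for illustration, base R's scale() stands in for the package's own preprocessing (which may normalize differently), and dividing each table by its first singular value is the standard MFA normalization step rather than a confirmed detail of mfa().

```r
set.seed(1)

# Hypothetical input: 12 observations rated on 10 variables in total.
data <- matrix(rnorm(12 * 10), nrow = 12)

# One entry per assessor, listing the columns that belong to that assessor.
sets <- list(1:3, 4:5, 6:10)

# Split into tables, center and scale each one, then divide by the table's
# first singular value so that no single table dominates the analysis.
tables <- lapply(sets, function(cols) {
  tbl <- scale(data[, cols, drop = FALSE], center = TRUE, scale = TRUE)
  tbl / svd(tbl)$d[1]
})

# After normalization, every table has a first singular value of 1.
first_sv <- sapply(tables, function(tbl) svd(tbl)$d[1])
```

The same sets list is what Rv_table() and Lg_table() take later in this document.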
The package provides a print method that displays basic information about an mfa object: the number of assessors (blocks) and the number of components.
source("../R/MFAHelperFunctions.R")
source("../R/mfa.R")
print(mfa_out)
Object of type mfa
The number of assessors is 10
The number of components is 11
The Eigenvalues() function summarizes the eigenvalues. For each component, it reports the singular value, the eigenvalue, the cumulative eigenvalue, the % inertia, and the cumulative % inertia.
Eigenvalues(mfa_out)
C1 C2 C3 C4 C5
Singular value 0.8776418 0.3506073 0.30118188 0.27570882 0.24413253
Eigenvalue 0.7702551 0.1229254 0.09071052 0.07601535 0.05960069
cumulative 0.7702551 0.8931806 0.98389110 1.05990645 1.11950714
Inertia 61.2296749 9.7716778 7.21082608 6.04266702 4.73782113
cumulative 61.2296749 71.0013527 78.21217882 84.25484584 88.99266698
C6 C7 C8 C9 C10
Singular value 0.19799790 0.17581135 0.15798257 0.13660609 0.11592003
Eigenvalue 0.03920317 0.03090963 0.02495849 0.01866122 0.01343745
cumulative 1.15871031 1.18961994 1.21457843 1.23323966 1.24667711
Inertia 3.11636657 2.45709049 1.98401852 1.48343143 1.06817972
cumulative 92.10903355 94.56612404 96.55014256 98.03357399 99.10175371
C11
Singular value 0.10630019
Eigenvalue 0.01129973
cumulative 1.25797684
Inertia 0.89824629
cumulative 100.00000000
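The rows of this summary are all derived from the singular values: each eigenvalue is the square of the corresponding singular value, and each % inertia is that eigenvalue's share of the eigenvalue total. The sketch below rebuilds the rows from the singular values printed above; summary_sketch is a hypothetical name, and this is not necessarily how Eigenvalues() is implemented.

```r
# Singular values copied from the output above.
singular_values <- c(0.8776418, 0.3506073, 0.30118188, 0.27570882,
                     0.24413253, 0.19799790, 0.17581135, 0.15798257,
                     0.13660609, 0.11592003, 0.10630019)

# Eigenvalues are squared singular values.
eigenvalues <- singular_values^2

# Percentage of total inertia explained by each component.
inertia <- 100 * eigenvalues / sum(eigenvalues)

# Assemble the five rows in the same order as the printed summary.
summary_sketch <- rbind(
  "Singular value" = singular_values,
  "Eigenvalue"     = eigenvalues,
  "cumulative"     = cumsum(eigenvalues),
  "Inertia"        = inertia,
  "cumulative"     = cumsum(inertia)
)
colnames(summary_sketch) <- paste0("C", seq_along(singular_values))
```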
This package provides three complementary functions to print (1) the contribution of an observation to a dimension, (2) the contribution of a variable to a dimension, and (3) the contribution of a table to a dimension.
# Note that only 2 rows are printed for the contributions below
head(CtrObserToDimension(mfa_out), 2)
[,1] [,2]
[1,] 0.10394860 0.018067938
[2,] 0.07078421 0.000721502
head(CtrVarToDimension(mfa_out), 2)
[,1] [,2]
[1,] 0.02088509 0.02441275
[2,] 0.01712210 0.01484237
head(CtrTableToDimension(mfa_out), 2)
[,1] [,2]
[1,] 0.1011327 0.09540216
[2,] 0.1000578 0.06849423
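As an illustration of what such a contribution measures, the sketch below uses the usual MFA definition for observations: the mass of the observation times its squared factor score, divided by the eigenvalue of the dimension. The factor scores are random placeholders, equal masses are an assumption, and ctr_sketch is a hypothetical name, so this shows the formula rather than the package's CtrObserToDimension() code.

```r
set.seed(7)

# Hypothetical factor scores: 12 observations on 2 dimensions.
scores <- matrix(rnorm(12 * 2), nrow = 12)

# Assumption: every observation carries the same mass, 1 / (number of rows).
masses <- rep(1 / nrow(scores), nrow(scores))

# Eigenvalue of each dimension: mass-weighted sum of squared scores.
eigenvalues <- colSums(masses * scores^2)

# Contribution of observation i to dimension l:
#   mass_i * score_il^2 / eigenvalue_l
ctr_sketch <- sweep(masses * scores^2, 2, eigenvalues, "/")
```

Under this definition the contributions within a dimension are non-negative and sum to 1, so an observation whose contribution exceeds the average (1 / number of observations) can be read as important for that dimension.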
This package provides functions for computing \( R_V \) and \( L_g \) coefficients to evaluate the similarity between two tables.
For the \( R_V \) coefficient, RV() takes two tables as arguments and computes the coefficient between them.
Additionally, Rv_table() takes a data set and a list with sets of variables and computes a matrix of \( R_V \) coefficients. For the \( L_g \) coefficient, this package provides Lg() and Lg_table(); their usage parallels that of the \( R_V \) functions.
# RV
rv_coef <- RV(table1, table2)
rv_table <- Rv_table(dataset, sets = list(1:3, 4:5, 6:10))
# Lg
lg_coef <- Lg(table1, table2)
lg_table <- Lg_table(dataset, sets = list(1:3, 4:5, 6:10))
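For readers who want to see what these coefficients measure, the sketch below follows their textbook definitions. Both compare the cross-product matrices of two tables that describe the same observations: \( R_V \) normalizes by the norms of those matrices, while \( L_g \) normalizes by the squared first singular values of the tables. The names rv_sketch and lg_sketch are hypothetical, the tables are random and only centered, and the package's RV() and Lg() may preprocess their inputs differently.

```r
# R_V: trace(Sx %*% Sy) / sqrt(trace(Sx %*% Sx) * trace(Sy %*% Sy)),
# where Sx = X X^T and Sy = Y Y^T. For symmetric matrices the trace of a
# product equals the sum of the element-wise product.
rv_sketch <- function(x, y) {
  sx <- tcrossprod(as.matrix(x))
  sy <- tcrossprod(as.matrix(y))
  sum(sx * sy) / sqrt(sum(sx * sx) * sum(sy * sy))
}

# L_g: trace(Sx %*% Sy) divided by the squared first singular values
# of the two tables.
lg_sketch <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  sum(tcrossprod(x) * tcrossprod(y)) / (svd(x)$d[1]^2 * svd(y)$d[1]^2)
}

# Two hypothetical centered tables on the same 12 observations.
set.seed(3)
table_a <- scale(matrix(rnorm(12 * 3), nrow = 12), scale = FALSE)
table_b <- scale(matrix(rnorm(12 * 2), nrow = 12), scale = FALSE)

rv_ab <- rv_sketch(table_a, table_b)
lg_ab <- lg_sketch(table_a, table_b)
```

\( R_V \) lies between 0 and 1 and equals 1 when a table is compared with itself, whereas \( L_g \) of a table with itself is at least 1 and grows with the number of dimensions the table carries.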
This package allows users to perform bootstrapping in order to estimate the stability of the compromise factor scores. There is no separate function for computing bootstrap ratios; instead, they are returned by the main function mfa(), which generates 1,000 bootstrap samples and therefore 1,000 sets of bootstrapped factor scores. Like t-statistics, bootstrap ratios can be used to find the observations that reliably contribute to a given component.
boot_ratio <- mfa_out$BootstrapRatio
boot_ratio
[,1] [,2]
[1,] -25.1344657 1.3391898
[2,] -10.1424141 0.3415491
[3,] -8.1583201 -5.2706537
[4,] -16.1878472 -1.1364634
[5,] 20.1399463 -1.0001678
[6,] 14.5840499 -0.9457683
[7,] 8.4158509 1.9821122
[8,] 15.7983619 3.6649462
[9,] -6.6223419 2.5836422
[10,] 0.9063082 -5.3315775
[11,] -6.3159837 6.5030745
[12,] 5.2859914 -0.6530110
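The idea behind a bootstrap ratio can be sketched in a few lines of base R. The sketch assumes the common MFA scheme in which tables (assessors) are resampled with replacement, each bootstrap sample's compromise scores are the average of the sampled partial factor scores, and the ratio is the bootstrap mean divided by the bootstrap standard deviation. The partial factor scores here are random placeholders and all names are hypothetical; the code inside mfa() may differ.

```r
set.seed(42)

n_obs <- 12      # observations
n_comp <- 2      # components kept
n_tables <- 10   # assessors
n_boot <- 1000   # bootstrap samples

# Hypothetical partial factor scores: observations x components x tables.
partial <- array(rnorm(n_obs * n_comp * n_tables),
                 dim = c(n_obs, n_comp, n_tables))

boot_scores <- array(NA_real_, dim = c(n_obs, n_comp, n_boot))
for (b in seq_len(n_boot)) {
  # Resample the tables with replacement and average their partial scores.
  picked <- sample(n_tables, n_tables, replace = TRUE)
  boot_scores[, , b] <- apply(partial[, , picked, drop = FALSE], c(1, 2), mean)
}

# Bootstrap ratio: mean over samples divided by standard deviation over samples.
boot_mean <- apply(boot_scores, c(1, 2), mean)
boot_sd <- apply(boot_scores, c(1, 2), sd)
ratio_sketch <- boot_mean / boot_sd
```

Because the ratio behaves like a t-statistic, a common rule of thumb treats an absolute value above about 2 as a sign that the observation's score on that component is stable.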
This package provides five types of plots.
You can use the following functions to create graphics:
# for plotting all the graphs
plot(mfa_out)
plot_eigenvalues(mfa_out)
plot_factor_scores(mfa_out)
plot_partial_factor_scores(mfa_out)
plot_variable_loadings(mfa_out)
plot_boot_ratio(mfa_out, 2)