Random R Codes Part-3

Few Markdown Tips
Multiple Correspondence Analysis (MCA)
- FactoMineR
- MASS
- ade4
- ca
- homals

R now has a large community. Today we have ~~4,567~~ 6,139 R packages. My aim here is to jot down code snippets based on the newer packages (by following the table of available packages, sorted by date of publication)¹.

The code chunks are taken either from a package vignette or from the main package documentation PDF. I also like to include Stack Overflow threads when they raise interesting questions. These easy-to-apply snippets will be applied to different data sets to develop models and assumptions. This installment also explores rmarkdown.

Few Markdown Tips

The R Markdown Reference Guide is a good source for learning rmarkdown. A must-read.

Output Style

highlight="tango", "pygments", "kate", "zenburn"
theme= "cerulean", "journal", "flatly", "readable", "spacelab", "united", "cosmo"

output:
    html_document:
        toc: true
        toc_depth: 4
        theme: cerulean
        highlight: zenburn

Tables

| Item | Nuts and Bolts          |
| ------------- | ----------- |
| Why      | ~~Don't~~ why me|
| Dude     | _Don't_ dude me, Bro.    |

| Left-Aligned  | Center Aligned  | Right Aligned |
| :------------ |:---------------:| -----:|
| Left is a      | Center is not | Right is a |
| Left is a      | always        |   Right. Right is not |
| Left is a Left | centered        |   a `right`|

Ordinal indicators are sometimes written as superscripts (1^st^, 2^nd^, 3^rd^, 4^th^, rather than 1st, 2nd, 3rd, 4th), although many English-language style guides recommend against this use. Other languages use a similar convention, such as 1^er^ or 2^e^ in French, or 4ª and 4º in Italian, Portuguese, and Spanish.

Also in mathematics and computing, a subscript can be used to represent the radix, or base, of a written number, especially where multiple bases are used alongside each other. For example, comparing values in hexadecimal, denary, and octal one might write C~hex~ = 12~dec~ = 14~oct~.
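
The radix example above can be checked with a few lines of base R. This is only a small sketch: `strtoi()` parses a string in a given base, and the variable names are made up for illustration.

```r
# Parse the same quantity written in two different bases.
hex_value <- strtoi("C", base = 16L)   # C in hexadecimal
oct_value <- strtoi("14", base = 8L)   # 14 in octal

# Both should equal 12 in decimal (denary).
print(c(hex = hex_value, oct = oct_value))
print(hex_value == 12L && oct_value == 12L)
```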

Equation

$$
\begin{aligned}
\dot{x} & = \sigma(y-x) \\
\dot{y} & = \rho x - y - xz \\
\dot{z} & = -\beta z + xy
\end{aligned} 
$$

Multiple Correspondence Analysis (MCA)

Five R packages are widely used for Multiple Correspondence Analysis (MCA). Part 3 compiles MCA code for each of them.

The packages are:

FactoMineR, MASS, ade4, ca, homals

I have used the code compiled by Gaston Sanchez.
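
Before turning to the packages, it may help to see roughly what they compute. The following is a minimal base-R sketch, assuming MCA is carried out as a correspondence analysis of the indicator matrix (the variant that `lambda = "indicator"` selects in the ca call further down). The toy data and every variable name here are invented for illustration, and this is not how any of the five packages is implemented internally.

```r
# Toy data, invented purely for illustration (not the tea data).
toy <- data.frame(
  drink = factor(c("tea", "coffee", "tea", "water", "coffee", "tea")),
  sugar = factor(c("yes", "no", "no", "no", "yes", "yes")),
  place = factor(c("home", "work", "home", "work", "work", "home"))
)

# Indicator (dummy) matrix: one 0/1 column per category of each variable.
indicator <- do.call(cbind, lapply(toy, function(f) {
  sapply(levels(f), function(l) as.numeric(f == l))
}))

n <- nrow(indicator)  # number of observations
Q <- ncol(toy)        # number of variables
J <- ncol(indicator)  # total number of categories

# Correspondence matrix with its row and column masses.
P <- indicator / (n * Q)
row_mass <- rowSums(P)
col_mass <- colSums(P)

# Standardized residuals, followed by a singular value decomposition.
S <- diag(1 / sqrt(row_mass)) %*% (P - outer(row_mass, col_mass)) %*%
  diag(1 / sqrt(col_mass))
sv <- svd(S)$d

# Principal inertias (eigenvalues): at most J - Q of them are non-zero.
eig <- sv[seq_len(J - Q)]^2
print(round(eig, 4))

# The total inertia of an indicator matrix equals J / Q - 1.
print(all.equal(sum(eig), J / Q - 1))
```

The squared singular values play the role of the eigenvalues that the packages report. The last line is a handy sanity check, since the total inertia depends only on the number of variables and categories.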

FactoMineR

Package pdf.
Website.
Paper
Last updated: 2014-XX-XX

library(FactoMineR)
library(ggplot2)

data(tea)
newtea = tea[, c("Tea", "How", "how", "sugar", "where", "always")]
head(newtea, 3)

##         Tea   How     how    sugar       where     always
## 1     black alone tea bag    sugar chain store Not.always
## 2     black  milk tea bag No.sugar chain store Not.always
## 3 Earl Grey alone tea bag No.sugar chain store Not.always

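# number of categories per variable
cats = apply(newtea, 2, function(x) nlevels(as.factor(x)))
# apply MCA without drawing the default plots
mca1 = MCA(newtea, graph = FALSE)
# data frame with category coordinates, labelled by source variable
mca1_vars_df = data.frame(mca1$var$coord, Variable = rep(names(cats), cats))
# data frame with observation coordinates
mca1_obs_df = data.frame(mca1$ind$coord)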
ggplot(data=mca1_vars_df, aes(x = Dim.1, y = Dim.2, label = rownames(mca1_vars_df))) +
 geom_hline(yintercept = 0, colour = "gray70") +
 geom_vline(xintercept = 0, colour = "gray70") +
 geom_text(aes(colour=Variable)) +
 ggtitle("MCA plot of variables using R package FactoMineR")

ggplot(data = mca1_obs_df, aes(x = Dim.1, y = Dim.2)) +
  geom_hline(yintercept = 0, colour = "gray70") +
  geom_vline(xintercept = 0, colour = "gray70") +
  geom_point(colour = "gray50", alpha = 0.7) +
  geom_density2d(colour = "gray80") +
  geom_text(data = mca1_vars_df, aes(x = Dim.1, y = Dim.2, 
                label = rownames(mca1_vars_df), colour = Variable)) +
  ggtitle("MCA plot of variables using R package FactoMineR") +
  scale_colour_discrete(name = "Variable")

MASS

Package pdf.
Last updated: 2014-XX-XX

library(MASS)

## 
## Attaching package: 'MASS'
## 
## The following object is masked _by_ '.GlobalEnv':
## 
##     cats

# apply MCA, keeping 5 dimensions
mca2 = mca(newtea, nf = 5)
# eigenvalues (squared singular values)
mca2$d^2

## [1] 0.2797618 0.2577477 0.2201379 0.1879296 0.1687650

mca2_vars_df = data.frame(mca2$cs, Variable = rep(names(cats), cats))
ggplot(data = mca2_vars_df, 
       aes(x = X1, y = X2, label = rownames(mca2_vars_df))) +
  geom_hline(yintercept = 0, colour = "gray70") +
  geom_vline(xintercept = 0, colour = "gray70") +
  geom_text(aes(colour = Variable)) +
  ggtitle("MCA plot of variables using R package MASS")

ade4

Package pdf.
Last updated: 2014-XX-XX

library(ade4)

## 
## Attaching package: 'ade4'
## 
## The following object is masked from 'package:FactoMineR':
## 
##     reconst

# apply MCA, keeping 5 axes and skipping the interactive scree plot
mca3 = dudi.acm(newtea, scannf = FALSE, nf = 5)
# eigenvalues
mca3$eig

##  [1] 0.27976178 0.25774772 0.22013794 0.18792961 0.16876495 0.16368666
##  [7] 0.15288834 0.13838682 0.11569167 0.08612637 0.06221147

mca3_vars_df = data.frame(mca3$co, Variable = rep(names(cats), cats))
ggplot(data = mca3_vars_df, 
       aes(x = Comp1, y = Comp2, label = rownames(mca3_vars_df))) +
  geom_hline(yintercept = 0, colour = "gray70") +
  geom_vline(xintercept = 0, colour = "gray70") +
  geom_text(aes(colour = Variable)) +
  ggtitle("MCA plot of variables using R package ade4")

ca

Package pdf.
Last updated: 2014-XX-XX

library(ca)
# apply MCA on the indicator matrix, keeping 5 dimensions
mca4 = mjca(newtea, lambda = "indicator", nd = 5)
# eigenvalues (squared singular values)
mca4$sv^2

##  [1] 0.27976178 0.25774772 0.22013794 0.18792961 0.16876495 0.16368666
##  [7] 0.15288834 0.13838682 0.11569167 0.08612637 0.06221147

mca4_vars_df = data.frame(mca4$colcoord, Variable = rep(names(cats), cats))
rownames(mca4_vars_df) = mca4$levelnames
ggplot(data = mca4_vars_df, 
       aes(x = X1, y = X2, label = rownames(mca4_vars_df))) +
  geom_hline(yintercept = 0, colour = "gray70") +
  geom_vline(xintercept = 0, colour = "gray70") +
  geom_text(aes(colour = Variable)) +
  ggtitle("MCA plot of variables using R package ca")

homals

Package pdf.
Last updated: 2014-XX-XX

library(homals)
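# apply homogeneity analysis (MCA) with 5 dimensions, all variables nominal
mca5 = homals(newtea, ndim = 5, level = "nominal")
# eigenvalues
mca5$eigenvalues
# category scores (a list with one matrix per variable)
mca5$catscores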

D1 = unlist(lapply(mca5$catscores, function(x) x[,1]))
D2 = unlist(lapply(mca5$catscores, function(x) x[,2]))

mca5_vars_df = data.frame(D1 = D1, D2 = D2, Variable = rep(names(cats), cats))
rownames(mca5_vars_df) = unlist(sapply(mca5$catscores, function(x) rownames(x)))
ggplot(data = mca5_vars_df, 
       aes(x = D1, y = D2, label = rownames(mca5_vars_df))) +
  geom_hline(yintercept = 0, colour = "gray70") +
  geom_vline(xintercept = 0, colour = "gray70") +
  geom_text(aes(colour = Variable)) +
  ggtitle("MCA plot of variables using R package homals")

¹ Compiled by Subasish Das