1. Language detection

Consider the following string, generated by a stochastic process whose alphabet (set of states) is {a, b, #}.

#aababababaabaababaababaababa#

Suppose that you know the string was generated by one of the following processes, but you’re not sure which:

  1. Start with #; thereafter P(a)=q1; P(b)=q2; P(#)=1−q1−q2, until a second # occurs.

  2. A Markov chain with the following transition matrix, again starting with # and drawing symbols until another # is produced.

Current \ Next        #          a          b
#                     p1         p2         1−p1−p2
a                     q1         q2         1−q1−q2
b                     r1         r2         1−r1−r2
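
To make the two generative processes concrete, here is a minimal sketch (not part of the assignment) of how one string could be drawn under each model. GenUnigram and GenMarkov are illustrative names introduced here, and the transition matrix P is assumed to carry '#', 'a', 'b' as its row and column names.

GenUnigram <- function(q1, q2){
  # model (i): after the initial #, draw symbols i.i.d. until another # appears
  out <- "#"
  repeat{
    sym <- sample(c("a", "b", "#"), 1, prob = c(q1, q2, 1 - q1 - q2))
    out <- c(out, sym)
    if (sym == "#") break
  }
  paste0(out, collapse = "")
}

GenMarkov <- function(P){
  # model (ii): draw each symbol conditional on the previous one until # appears
  out <- "#"
  repeat{
    cur <- out[length(out)]
    sym <- sample(colnames(P), 1, prob = P[cur, ])
    out <- c(out, sym)
    if (sym == "#") break
  }
  paste0(out, collapse = "")
}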

For each model, compute the maximum likelihood estimate of the parameters given the string above.

For model (i), a unigram model, the MLEs are just the relative frequencies of the symbols.
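
To see why, let \(n_i\) be the count of symbol \(i\) and \(n = \sum_i n_i\). Maximizing the log-likelihood \(\sum_i n_i \log\theta_i\) subject to \(\sum_i \theta_i = 1\) with a Lagrange multiplier \(\lambda\) gives

\[\frac{n_i}{\hat\theta_i} = \lambda \;\text{ for all } i, \qquad \sum_i \hat\theta_i = 1 \;\Rightarrow\; \lambda = n, \qquad \hat\theta_i = \frac{n_i}{n}.\]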

s <- "#aababababaabaababaababaababa#"
s.vec <- as.factor(strsplit(s, '')[[1]])
uni.MLEpars <- table(s.vec) / length(s.vec)
uni.MLEpars
## s.vec
##          #          a          b 
## 0.06666667 0.56666667 0.36666667

For model (ii), a bigram (Markov chain) model, we first extract all the bigram counts.

Bigrams <- function(symbol, s.vec){
  # Count the symbols that immediately follow `symbol` in s.vec.
  # If `symbol` is the last character, the index runs past the end and yields
  # an NA, which table() drops.
  bigram.tab <- table(s.vec[which(s.vec == symbol) + 1], useNA = "no")
  names(bigram.tab) <- paste(symbol, names(bigram.tab), sep="")
  return(bigram.tab)
}
hashtag.bicounts <- Bigrams('#', s.vec)
hashtag.bicounts
## ## #a #b 
##  0  1  0
a.bicounts <- Bigrams('a', s.vec)
a.bicounts
## a# aa ab 
##  1  5 11
b.bicounts <- Bigrams('b', s.vec)
b.bicounts
## b# ba bb 
##  0 11  0

Then we normalize the bigram counts to get the MLEs of the bigram parameters.

Normalize <- function(v){
  return(v / sum(v))
}
hashtag.MLEbipars <- Normalize(hashtag.bicounts)
hashtag.MLEbipars
## ## #a #b 
##  0  1  0
a.MLEbipars <- Normalize(a.bicounts)
a.MLEbipars
##         a#         aa         ab 
## 0.05882353 0.29411765 0.64705882
b.MLEbipars <- Normalize(b.bicounts)
b.MLEbipars
## b# ba bb 
##  0  1  0

Also approximate using rejection sampling (or another method if you prefer) the Bayes factor of model (i) vs. model (ii) given the data, assuming that the models are equally likely a priori, and that all parameter values are equally likely subject to the constraint that the conditional probabilities sum to 1 as necessary. (Simple method: draw 3 numbers from a U(0,1) distribution, then normalize by dividing them all by their sum.)

Rejection sampling does not seem very feasible here. Take the unigram model at its MLE as an example: given the MLE unigram parameters,

uni.MLEpars
## s.vec
##          #          a          b 
## 0.06666667 0.56666667 0.36666667

and the unigram counts in the given string,

table(s.vec)
## s.vec
##  #  a  b 
##  2 17 11

the likelihood of the data is the following

prod(uni.MLEpars ^ table(s.vec))
## [1] 4.585361e-12

which is tiny even at the maximum-likelihood parameters. A rejection sampler would therefore accept proposals extremely rarely and take far too long to be practical.

Also, drawing 3 numbers from a U(0,1) distribution and then normalizing them does not actually make all parameter values equally likely. Consider the simpler case of drawing 2 numbers from a U(0,1) distribution and normalizing them, in an attempt to obtain a uniformly distributed parameter p for a binomial distribution.

hist(apply(matrix(runif(200000), nrow=2), 2, function(v)Normalize(v)[1]), 
    xlab = "p", main="p by normalizing two U(0,1) random numbers")

We can see that the parameter is not uniformly distributed on [0,1]. The reason is as follows: if \(p=0.1\), the second number must be 9 times the first, and since the second number is drawn from [0,1], the first can only lie in [0,1/9]. If \(p=0.5\), the two numbers are equal, and there is no such restriction.
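
In fact the distribution of \(p = u_1/(u_1+u_2)\) for independent \(u_1, u_2 \sim U(0,1)\) can be worked out exactly: for \(t \le 1/2\), \(P(p \le t) = P\big(u_1 \le \tfrac{t}{1-t}\,u_2\big) = \tfrac{t}{2(1-t)}\), and the case \(t > 1/2\) follows by symmetry, so the density is

\[f(p) = \frac{1}{2(1-p)^2} \text{ for } p \le 1/2, \qquad f(p) = \frac{1}{2p^2} \text{ for } p > 1/2,\]

which rises from 1/2 at the endpoints to 2 at \(p = 1/2\), matching the peaked histogram.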

Clearly, in the binomial case, we could simply draw the parameter from U(0,1), which is exactly Beta(1,1). In the multinomial case, we can use the Dirichlet distribution, the multivariate generalization of the Beta distribution, to generate parameter vectors. Here we will use Dirichlet(1,1,1), which is uniform over the three-outcome probability simplex.

library(gtools)
rdirichlet(1, c(1,1,1))
##            [,1]      [,2]      [,3]
## [1,] 0.08676009 0.6394932 0.2737467
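
To make this concrete, here is a hypothetical sketch of one valid rejection scheme for the unigram model (RejectionSample.uni is a name introduced here, not used elsewhere): propose parameters from the uniform Dirichlet(1,1,1) prior and accept each draw with probability equal to its likelihood, which is always at most 1. Accepted draws are exact posterior samples, and the acceptance fraction estimates the marginal likelihood needed for the Bayes factor; but since no likelihood exceeds the MLE value of roughly \(4.6 \times 10^{-12}\), essentially no draw would ever be accepted.

RejectionSample.uni <- function(n.draws, s.vec){
  pars <- rdirichlet(n.draws, c(1, 1, 1))                     # proposals from the prior
  llhs <- apply(pars, 1, function(v) prod(v ^ table(s.vec)))  # likelihood of each draw
  keep <- runif(n.draws) < llhs                               # accept w.p. = likelihood
  list(accept.rate = mean(keep),
       samples = pars[keep, , drop = FALSE])
}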

As noted above, rejection sampling is not really feasible, so I will instead use plain Monte Carlo integration (not MCMC) to estimate the integrals needed for the Bayes factor:

\[K(M_1, M_2) = \frac{\int_{\theta_1} \mathrm{Pr}(\theta_1 \mid M_1) \mathrm{llh}(D \mid M_1, \theta_1) \;\mathrm{d}\theta_1}{\int_{\theta_2} \mathrm{Pr}(\theta_2 \mid M_2) \mathrm{llh}(D \mid M_2, \theta_2) \;\mathrm{d}\theta_2}\]

The likelihood terms can be calculated analytically, and the prior density can be computed with the ddirichlet function, so we only need to average their products over \(N\) samples to approximate each integral.

N <- 1000

# unigram likelihood; probs must be in the order (#, a, b) to match table(s.vec)
Uni.llh <- function(probs, s.vec){
  return(prod(probs ^ table(s.vec)))
}

# bigram likelihood; each probs vector must be in the order (#, a, b) to match Bigrams()
Bi.llh <- function(probsH, probsA, probsB, s.vec){
  return(prod(probsH ^ Bigrams('#', s.vec)) * 
         prod(probsA ^ Bigrams('a', s.vec)) * 
         prod(probsB ^ Bigrams('b', s.vec)))
}

UniLikelihoods <- function(s.vec){
  # Randomly sample N parameter combinations and return the likelihoods 
  # generate parameters 
  alpha <- c(1,1,1)
  pars <- rdirichlet(N, alpha)
  
  return(apply(pars, 1, function(v) Uni.llh(v, s.vec)))
}

BiLikelihoods <- function(s.vec){
  # Randomly sample N parameter combinations and return the likelihoods 
  # For bigrams, there are 3 independently generated probability distributions   
  alpha <- c(1,1,1)
  
  return(sapply(1:N, 
                function(i) {
                  Bi.llh(as.numeric(rdirichlet(1, alpha)), 
                         as.numeric(rdirichlet(1, alpha)), 
                         as.numeric(rdirichlet(1, alpha)), 
                         s.vec)
                }))
}

All parameter vectors are equally likely, so for the unigram model the prior density is the constant 2 everywhere on the simplex:

alpha <- c(1,1,1)
ddirichlet(c(0.1, 0.2, 0.7), alpha)
## [1] 2
ddirichlet(rdirichlet(1, alpha), alpha)
## [1] 2

For the bigram model, whose three rows are drawn independently, the joint prior density is \(2^3=8\). Hence we can approximate the Bayes factor of model (i) vs. (ii):

integral.uni <- 2 * mean(UniLikelihoods(s.vec))
integral.bi <- 8 * mean(BiLikelihoods(s.vec))
bayes.factor <- integral.uni / integral.bi
bayes.factor
## [1] 0.0003918043

We can see that the Bayes factor strongly favors the bigram model (ii) for this string.
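
As a cross-check, both marginal likelihoods are available in closed form here: with a Dirichlet(1,1,1) prior, the integral of \(\prod_i \theta_i^{n_i}\) against the prior is \(\Gamma(3)\prod_i\Gamma(1+n_i)/\Gamma(3+n)\), where \(n=\sum_i n_i\). The unigram evidence is one such term; the bigram evidence is a product of three of them, one per row, assuming (as in the sampling code above) independent Dirichlet(1,1,1) priors on the rows. LogEvidence is a helper name introduced here.

LogEvidence <- function(counts){
  # log of the exact Dirichlet(1,1,1) evidence for a vector of counts
  lgamma(3) + sum(lgamma(1 + counts)) - lgamma(3 + sum(counts))
}
log.evidence.uni <- LogEvidence(table(s.vec))
log.evidence.bi <- LogEvidence(Bigrams('#', s.vec)) +
                   LogEvidence(Bigrams('a', s.vec)) +
                   LogEvidence(Bigrams('b', s.vec))
exp(log.evidence.uni - log.evidence.bi)  # exact Bayes factor of (i) vs (ii)

Because the Monte Carlo draws above come from the prior itself, the average likelihood alone already estimates each integral, so the constant factors of 2 and 8 rescale the estimated ratio by a fixed 2/8 relative to this exact value; the conclusion for this string is the same either way.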

For each model, what is the entropy rate per character under the maximum likelihood estimates of the parameters?

Since the unigram model generates each character independently, its entropy rate per character is just the entropy of the unigram distribution.

Entropy <- function(v){
  # given a probability vector (assumed strictly positive), return its entropy in bits
  return(sum(-v * log2(v)))
}
Entropy(uni.MLEpars)
## [1] 1.255537

To calculate the entropy rate of the bigram model, first we calculate the stationary distribution from the transition matrix.

trans.MLE <- matrix(c(hashtag.MLEbipars, a.MLEbipars, b.MLEbipars),
                    nrow=3, byrow=TRUE)
colnames(trans.MLE) <- c('#', 'a', 'b')
rownames(trans.MLE) <- c('#', 'a', 'b')
trans.MLE
##            #         a         b
## # 0.00000000 1.0000000 0.0000000
## a 0.05882353 0.2941176 0.6470588
## b 0.00000000 1.0000000 0.0000000

A stationary distribution \(\pi\) is one such that \(\pi P = \pi\), or equivalently \(\pi (P-I) = 0\), where \(P\) is the transition matrix and \(I\) is the identity matrix. Since \(\pi\) is a probability distribution, it is also subject to the constraint that \(\sum_i \pi_i=1\) (with all \(\pi_i \geq 0\)). We collect all these requirements in one matrix, a.mat, so that \(\pi\) a.mat \(= b\):

b <- c(0,0,0,1)
a.mat <- trans.MLE - diag(1,3)
a.mat <- cbind(a.mat, c(1,1,1))
a.mat
##             #          a          b  
## # -1.00000000  1.0000000  0.0000000 1
## a  0.05882353 -0.7058824  0.6470588 1
## b  0.00000000  1.0000000 -1.0000000 1

We use qr.solve to solve the resulting linear system. Note that a.mat is transposed because qr.solve solves systems of the form \(ax=b\) with \(x\) a column vector; the system has four equations in three unknowns but is consistent.

stat.distr <- qr.solve(t(a.mat), b)
stat.distr
##          #          a          b 
## 0.03448276 0.58620690 0.37931034
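
A quick sanity check that this is indeed stationary:

stat.distr %*% trans.MLE  # should reproduce stat.distr up to numerical error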

Hence the entropy rate is

Entropy(stat.distr)
## [1] 1.149684
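
Note that this is the entropy of the stationary distribution. The entropy rate of a stationary Markov chain is usually defined instead as the stationary-weighted average of the per-row conditional entropies, \(-\sum_i \pi_i \sum_j P_{ij}\log_2 P_{ij}\), which can only be smaller than or equal to the value above. A sketch of that computation (RowEntropy and MarkovEntropyRate are names introduced here; zero-probability transitions contribute nothing):

RowEntropy <- function(v){
  # entropy of one conditional distribution, treating 0 * log2(0) as 0
  v <- v[v > 0]
  sum(-v * log2(v))
}
MarkovEntropyRate <- function(P, p.stat){
  # conditional entropy of the next symbol given the current one, under p.stat
  sum(p.stat * apply(P, 1, RowEntropy))
}
MarkovEntropyRate(trans.MLE, stat.distr)

The same computation applies to the posterior-mean transition matrix and its stationary distribution below.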

What is the entropy rate under the Bayesian posterior mean estimates of the parameters? Show your work.

Given the assumption that all parameters are equally likely, the posterior is proportional to the likelihood.

UniPosteriorMean <- function(s.vec){
  # Sample N parameter vectors from the prior and return the posterior mean,
  # estimated by weighting each draw by its (normalized) likelihood
  alpha <- c(1,1,1)
  pars <- rdirichlet(N, alpha)
  llhs <- apply(pars, 1, function(v) Uni.llh(v, s.vec))
  
  # use matrix multiplication for speed 
  # Normalize(llhs) (a 1*N matrix)  %*% pars (a N*3 matrix) 
  
  return(Normalize(llhs) %*% pars)
}
uni.BPMpars <- UniPosteriorMean(s.vec)
uni.BPMpars
##            [,1]      [,2]      [,3]
## [1,] 0.08946263 0.5468933 0.3636441
Entropy(uni.BPMpars)
## [1] 1.318425
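
The unigram posterior mean can also be checked in closed form: the Dirichlet(1,1,1) prior is conjugate to these counts, so the posterior is Dirichlet(1 + counts) and its mean is (counts + 1)/(n + 3) = (3, 18, 12)/33, close to the sampled estimate above.

(table(s.vec) + 1) / (length(s.vec) + 3)  # exact posterior mean under the uniform prior

The same conjugacy holds row by row for the bigram model ((row counts + 1)/(row total + 3)), which gives a check on the self-normalized estimates below; with only N = 1000 prior draws those estimates can be noisy, since a handful of draws carry most of the likelihood weight.
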
BiPosteriorMean <- function(s.vec){
  # Sample N parameter vectors from the prior for each of the 3 rows and return
  # the posterior mean of each row, weighting draws by the (normalized) likelihood
  alpha <- c(1,1,1)
  parsH <- rdirichlet(N, alpha)
  parsA <- rdirichlet(N, alpha)
  parsB <- rdirichlet(N, alpha)
  
  llhs <- sapply(1:N, function(i) Bi.llh(parsH[i,], parsA[i,], parsB[i,], s.vec))
  llhs <- Normalize(llhs)
  
  return(rbind(llhs %*% parsH, llhs %*% parsA, llhs %*% parsB))
}
bi.BPMpars <- BiPosteriorMean(s.vec)
colnames(bi.BPMpars) <- c("#", "a", "b")
rownames(bi.BPMpars) <- c("#", "a", "b")
bi.BPMpars
##            #         a          b
## # 0.52328310 0.2730692 0.20364768
## a 0.10449723 0.3132056 0.58229721
## b 0.05740023 0.8943727 0.04822704

Now that we have the posterior-mean transition matrix, we can calculate its stationary distribution as before.

b.BPM <- c(0,0,0,1)
a.BPMmat <- bi.BPMpars - diag(1,3)
a.BPMmat <- cbind(a.BPMmat, c(1,1,1))
a.BPMmat
##             #          a          b  
## # -0.47671690  0.2730692  0.2036477 1
## a  0.10449723 -0.6867944  0.5822972 1
## b  0.05740023  0.8943727 -0.9517730 1
stat.BPMdistr <- qr.solve(t(a.BPMmat), b.BPM)
stat.BPMdistr
##         #         a         b 
## 0.1520750 0.5058845 0.3420405

Hence the entropy rate is

Entropy(stat.BPMdistr)
## [1] 1.439952

Here you will find a file with 50 strings. Some of them were generated by process (i), and others by process (ii). Try to classify each as to the process it came from in each of two ways:

strings <- read.table("strings.txt", colClasses = "character")
string.vecs <- sapply(strings[[1]], function(s) factor(strsplit(s, '')[[1]], levels=c("#", "a","b")))

Brief <- function(s.vec, n=17){
  # Abbreviate long strings for display: first 4 symbols, "...", last 4 symbols
  s.vec <- as.character(s.vec)
  k <- length(s.vec)
  if (k >= n){
    return(paste0(c(s.vec[1:4], "...", s.vec[(k-3):k]), collapse=""))
  } else {
    return(paste0(s.vec, collapse=""))
  }
}
  1. compute a maximum likelihood estimate for the parameters, based on the string, and then report the ratio of the likelihoods under the two models;
UniMLE <- function(s.vec){
  uni.MLEpars <- table(s.vec) / length(s.vec)
  return(Uni.llh(uni.MLEpars, s.vec))
}

BiMLE <- function(s.vec){
  hashtag.MLEbipars <- Normalize(Bigrams('#', s.vec))
  a.MLEbipars <- Normalize(Bigrams('a', s.vec))
  b.MLEbipars <- Normalize(Bigrams('b', s.vec))
  return(Bi.llh(hashtag.MLEbipars, a.MLEbipars, b.MLEbipars, s.vec))
}

MLEratio <- function(s.vec){
  return(UniMLE(s.vec) / BiMLE(s.vec))
}
results <- data.frame(str=sapply(string.vecs, Brief), 
                     uniMLE=sapply(string.vecs, UniMLE),
                     biMLE=sapply(string.vecs, BiMLE),
                     MLEratio=sapply(string.vecs, MLEratio))
results
##                 str       uniMLE        biMLE     MLEratio
## 1       #abb...baa# 5.248056e-13 4.172076e-10 1.257900e-03
## 2       #aba...aaa# 6.841728e-10 2.571308e-08 2.660797e-02
## 3          #aaaaba# 7.450581e-04 8.640000e-03 8.623357e-02
## 4       #babababab# 1.121580e-05 8.192000e-02 1.369117e-04
## 5       #aba...aba# 1.314147e-08 1.043446e-04 1.259430e-04
## 6       #aba...aab# 3.970186e-24 4.894246e-21 8.111946e-04
## 7          #aaaaba# 7.450581e-04 8.640000e-03 8.623357e-02
## 8       #bab...bab# 5.677122e-27 1.063182e-15 5.339748e-12
## 9    #abababbababa# 7.835785e-07 4.486266e-03 1.746616e-04
## 10   #aababaaababa# 1.546064e-06 4.119873e-04 3.752697e-03
## 11           #aaaa# 2.194787e-02 1.054688e-01 2.080984e-01
## 12              #b# 1.481481e-01 1.000000e+00 1.481481e-01
## 13           #bbab# 2.314815e-03 3.703704e-02 6.250000e-02
## 14       #abbababa# 2.621440e-05 1.112366e-02 2.356635e-03
## 15  #aaabaaaabaaba# 2.466351e-06 1.259712e-04 1.957869e-02
## 16      #aab...aba# 2.576646e-12 5.211602e-09 4.944058e-04
## 17          #aabba# 5.245628e-04 9.259259e-03 5.665278e-02
## 18      #abb...abb# 1.752491e-09 8.026920e-07 2.183267e-03
## 19      #aaa...bab# 1.041863e-17 2.208989e-13 4.716469e-05
## 20  #aaaaaaabbaaba# 2.466351e-06 4.880255e-05 5.053734e-02
## 21      #bab...bba# 2.396624e-21 1.147043e-17 2.089393e-04
## 22   #babababababa# 7.835785e-07 6.697960e-02 1.169876e-05
## 23           #aaaa# 2.194787e-02 1.054688e-01 2.080984e-01
## 24      #aba...baa# 1.208511e-21 1.874360e-18 6.447592e-04
## 25      #aab...aba# 3.715046e-30 5.289511e-28 7.023419e-03
## 26      #abb...aaa# 1.190305e-20 1.101599e-18 1.080524e-02
## 27 #ababaaaabbbaba# 1.697335e-07 9.042245e-06 1.877117e-02
## 28          #aaaba# 1.243408e-03 1.562500e-02 7.957812e-02
## 29      #bbb...aba# 1.081547e-09 8.873497e-08 1.218851e-02
## 30     #baabbababa# 4.381119e-06 7.077888e-04 6.189868e-03
## 31      #aab...aab# 5.165387e-11 6.825157e-09 7.568158e-03
## 32 #abaaaaaaababba# 5.551115e-07 1.328602e-05 4.178161e-02
## 33      #aba...aba# 1.665052e-08 3.294172e-04 5.054539e-05
## 34        #baaaaab# 1.290587e-04 2.048000e-02 6.301696e-03
## 35             #ab# 1.562500e-02 1.000000e+00 1.562500e-02
## 36      #baa...abb# 1.157662e-15 1.241960e-10 9.321251e-06
## 37      #aaa...abb# 9.606403e-18 1.632718e-14 5.883686e-04
## 38              #b# 1.481481e-01 1.000000e+00 1.481481e-01
## 39             #aa# 6.250000e-02 2.500000e-01 2.500000e-01
## 40       #ababaaba# 3.375000e-05 8.640000e-03 3.906250e-03
## 41          #aaaaa# 1.517832e-02 8.192000e-02 1.852822e-01
## 42      #aba...aba# 5.165387e-11 5.031646e-06 1.026580e-05
## 43      #aaa...bba# 3.955940e-23 3.496411e-17 1.131429e-06
## 44      #aba...baa# 2.562079e-27 3.781948e-21 6.774496e-07
## 45      #aba...bab# 1.242419e-26 1.514213e-20 8.205046e-07
## 46      #aba...aba# 6.680893e-08 1.575465e-04 4.240584e-04
## 47              #a# 1.481481e-01 1.000000e+00 1.481481e-01
## 48      #bba...aaa# 2.348153e-20 8.800602e-18 2.668173e-03
## 49      #baa...aab# 3.534247e-13 1.284322e-10 2.751840e-03
## 50      #aba...aba# 8.695234e-23 1.316263e-13 6.606001e-10

We can see that all MLE ratios are less than 1, meaning the bigram model is always preferred. This is expected: the unigram model is a special case of the bigram model (obtained when all three conditional distributions are identical), so its maximized likelihood can never exceed the bigram model's.
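
A quick numerical way to see the nesting, using the original string: plugging the unigram MLE into all three bigram rows already gives at least the unigram likelihood (as computed here, the unigram likelihood additionally pays a factor for the initial #), and the bigram MLE can only be higher still.

uni.as.bi <- Bi.llh(uni.MLEpars, uni.MLEpars, uni.MLEpars, s.vec)
c(uniMLE = UniMLE(s.vec), uni.as.bi = uni.as.bi, biMLE = BiMLE(s.vec))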

  2. compute the Bayesian posterior odds of the models given the data. Do these methods diverge in their predictions in some cases? If so, can you explain why this occurs for the particular strings, and what’s going on in general? (A not-necessarily-exhaustive hint: See this paper or chapter 28 of MacKay’s book.)
BayesFactor <- function(s.vec){
  integral.uni <- 2 * mean(UniLikelihoods(s.vec))
  integral.bi <- 8 * mean(BiLikelihoods(s.vec))
  bayes.factor <- integral.uni / integral.bi
  return(bayes.factor)
}
string.BFs <- sapply(string.vecs, BayesFactor)
results$BayesFactor <- string.BFs
results
##                 str       uniMLE        biMLE     MLEratio  BayesFactor
## 1       #abb...baa# 5.248056e-13 4.172076e-10 1.257900e-03 1.388386e-02
## 2       #aba...aaa# 6.841728e-10 2.571308e-08 2.660797e-02 1.799285e-01
## 3          #aaaaba# 7.450581e-04 8.640000e-03 8.623357e-02 1.155887e-01
## 4       #babababab# 1.121580e-05 8.192000e-02 1.369117e-04 2.290855e-03
## 5       #aba...aba# 1.314147e-08 1.043446e-04 1.259430e-04 1.023143e-03
## 6       #aba...aab# 3.970186e-24 4.894246e-21 8.111946e-04 3.438993e-02
## 7          #aaaaba# 7.450581e-04 8.640000e-03 8.623357e-02 1.101222e-01
## 8       #bab...bab# 5.677122e-27 1.063182e-15 5.339748e-12 1.361998e-10
## 9    #abababbababa# 7.835785e-07 4.486266e-03 1.746616e-04 1.791662e-03
## 10   #aababaaababa# 1.546064e-06 4.119873e-04 3.752697e-03 2.451310e-02
## 11           #aaaa# 2.194787e-02 1.054688e-01 2.080984e-01 1.000284e-01
## 12              #b# 1.481481e-01 1.000000e+00 1.481481e-01 7.213254e-02
## 13           #bbab# 2.314815e-03 3.703704e-02 6.250000e-02 8.432756e-02
## 14       #abbababa# 2.621440e-05 1.112366e-02 2.356635e-03 1.254643e-02
## 15  #aaabaaaabaaba# 2.466351e-06 1.259712e-04 1.957869e-02 9.448998e-02
## 16      #aab...aba# 2.576646e-12 5.211602e-09 4.944058e-04 7.174002e-03
## 17          #aabba# 5.245628e-04 9.259259e-03 5.665278e-02 7.448720e-02
## 18      #abb...abb# 1.752491e-09 8.026920e-07 2.183267e-03 1.662718e-02
## 19      #aaa...bab# 1.041863e-17 2.208989e-13 4.716469e-05 2.720078e-03
## 20  #aaaaaaabbaaba# 2.466351e-06 4.880255e-05 5.053734e-02 1.568080e-01
## 21      #bab...bba# 2.396624e-21 1.147043e-17 2.089393e-04 5.506818e-03
## 22   #babababababa# 7.835785e-07 6.697960e-02 1.169876e-05 3.188229e-04
## 23           #aaaa# 2.194787e-02 1.054688e-01 2.080984e-01 1.292733e-01
## 24      #aba...baa# 1.208511e-21 1.874360e-18 6.447592e-04 1.525227e-02
## 25      #aab...aba# 3.715046e-30 5.289511e-28 7.023419e-03 4.637227e-01
## 26      #abb...aaa# 1.190305e-20 1.101599e-18 1.080524e-02 5.306990e-01
## 27 #ababaaaabbbaba# 1.697335e-07 9.042245e-06 1.877117e-02 5.630229e-02
## 28          #aaaba# 1.243408e-03 1.562500e-02 7.957812e-02 1.018592e-01
## 29      #bbb...aba# 1.081547e-09 8.873497e-08 1.218851e-02 5.933770e-02
## 30     #baabbababa# 4.381119e-06 7.077888e-04 6.189868e-03 2.082445e-02
## 31      #aab...aab# 5.165387e-11 6.825157e-09 7.568158e-03 5.969309e-02
## 32 #abaaaaaaababba# 5.551115e-07 1.328602e-05 4.178161e-02 1.373126e-01
## 33      #aba...aba# 1.665052e-08 3.294172e-04 5.054539e-05 7.712383e-04
## 34        #baaaaab# 1.290587e-04 2.048000e-02 6.301696e-03 2.095916e-02
## 35             #ab# 1.562500e-02 1.000000e+00 1.562500e-02 3.430196e-02
## 36      #baa...abb# 1.157662e-15 1.241960e-10 9.321251e-06 2.148632e-04
## 37      #aaa...abb# 9.606403e-18 1.632718e-14 5.883686e-04 1.083404e-02
## 38              #b# 1.481481e-01 1.000000e+00 1.481481e-01 7.248200e-02
## 39             #aa# 6.250000e-02 2.500000e-01 2.500000e-01 1.016751e-01
## 40       #ababaaba# 3.375000e-05 8.640000e-03 3.906250e-03 1.871427e-02
## 41          #aaaaa# 1.517832e-02 8.192000e-02 1.852822e-01 8.726346e-02
## 42      #aba...aba# 5.165387e-11 5.031646e-06 1.026580e-05 1.248877e-04
## 43      #aaa...bba# 3.955940e-23 3.496411e-17 1.131429e-06 1.873747e-04
## 44      #aba...baa# 2.562079e-27 3.781948e-21 6.774496e-07 1.844810e-05
## 45      #aba...bab# 1.242419e-26 1.514213e-20 8.205046e-07 1.081504e-04
## 46      #aba...aba# 6.680893e-08 1.575465e-04 4.240584e-04 3.521474e-03
## 47              #a# 1.481481e-01 1.000000e+00 1.481481e-01 6.803472e-02
## 48      #bba...aaa# 2.348153e-20 8.800602e-18 2.668173e-03 8.127954e-02
## 49      #baa...aab# 3.534247e-13 1.284322e-10 2.751840e-03 4.193214e-02
## 50      #aba...aba# 8.695234e-23 1.316263e-13 6.606001e-10 1.842198e-07

For most strings, the Bayes factor is much closer to 1 than the MLE likelihood ratio; that is, the Bayesian comparison is shifted toward the simpler unigram model. The reason is that the Bayes factor averages the likelihood over the whole prior range of the parameters instead of taking its maximum. A simple model fits roughly as well across a wide range of parameter values, whereas a complex model can fit the observed sample extremely well for a narrow range of parameter settings but poorly elsewhere, so averaging over parameters penalizes it. This built-in penalty on model complexity (the Bayesian "Occam factor" discussed by MacKay) reduces the chance of overfitting and is what makes the two methods diverge: the MLE ratio always prefers the bigram model, while the Bayes factor discounts much of that apparent advantage.

Aggregating the strings according to which process you think it came from, what is your best guess about the parameter values that were used in generating the data? Explain what method you used to arrive at the guess.

Suppose, for simplicity, that all of the strings were generated by the Markov chain (model (ii)). Then we can pool the bigram counts across strings and use the maximum likelihood estimates of the parameters.

parsH.MLEall <- Normalize(rowSums(sapply(string.vecs, function(v) Bigrams('#', v))))
parsA.MLEall <- Normalize(rowSums(sapply(string.vecs, function(v) Bigrams('a', v))))
parsB.MLEall <- Normalize(rowSums(sapply(string.vecs, function(v) Bigrams('b', v))))

trans.MLEall <- rbind(parsH.MLEall, parsA.MLEall, parsB.MLEall)
colnames(trans.MLEall) <- c('#', 'a', 'b')
rownames(trans.MLEall) <- c('#', 'a', 'b')
trans.MLEall
##            #         a         b
## # 0.00000000 0.7400000 0.2600000
## a 0.04521964 0.4379845 0.5167959
## b 0.02846300 0.7552182 0.2163188
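
If one instead wanted to aggregate strictly according to the classification, the bookkeeping could look like the sketch below. Here is.uni is a hypothetical logical vector marking the strings attributed to process (i) (for example, by thresholding results$MLEratio or results$BayesFactor), and each model is then fit only on its own group; EstimateByGroup is a name introduced here.

EstimateByGroup <- function(is.uni){
  uni.group <- string.vecs[is.uni]
  bi.group <- string.vecs[!is.uni]
  # unigram parameters from the pooled symbol counts of the unigram-classified strings
  uni.pars <- Normalize(rowSums(sapply(uni.group, table)))
  # bigram rows from the pooled bigram counts of the bigram-classified strings
  bi.pars <- rbind(Normalize(rowSums(sapply(bi.group, function(v) Bigrams('#', v)))),
                   Normalize(rowSums(sapply(bi.group, function(v) Bigrams('a', v)))),
                   Normalize(rowSums(sapply(bi.group, function(v) Bigrams('b', v)))))
  dimnames(bi.pars) <- list(c('#', 'a', 'b'), c('#', 'a', 'b'))
  list(unigram = uni.pars, bigram = bi.pars)
}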