BT-Settl preparation for IPAC (T) based on the class of features 1 - Fs/Fc, Doppler compensated. Features from Cesetti et al.

Introduction

After the pre-processing analysis carried out for the M stars in the IPAC dataset, we proceed with a methodology similar to the one run for the IRTF library.

## Package startup messages: MASS, Cubist, lattice, caret, ggplot2, xtable, Peaks, magic,
## abind, JADE, doMC, foreach, iterators, parallel, segmented, fftw, FITSio, stringr,
## ztable 0.1.5, signal, e1071, class, zoo, plyr, compare, mgcv 1.8-7, nlme, elasticnet,
## lars 1.2, pbapply, nnet, kernlab, pls, multicore, devtools, caretEnsemble, mlbench,
## mclust 4.3, analogue 0.16-3, vegan 2.3-0, permute, cluster.

Target T will be between 2000 K and 4200 K

Prediction for Temperature

We will use the I band according to the GA models, degraded with noise at SNR=50 and SNR=10, as well as the features proposed by Cesetti et al., also degraded at SNR=50 and SNR=10.
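For reference, this is the index definition that the helper feature_extr below implements (a transcription: Fs is the flux in the feature bandpass and Fc the mean flux over the pseudo-continuum bandpass; when two continuum bandpasses are supplied, the two Fc estimates are averaged):

$$
F_c = \frac{\int_{\lambda_{c1}}^{\lambda_{c2}} F_\lambda \, d\lambda}{\lambda_{c2}-\lambda_{c1}},
\qquad
\mathrm{Index} = \int_{\lambda_{s1}}^{\lambda_{s2}} \Bigl(1-\frac{F_\lambda}{F_c}\Bigr)\, d\lambda
= (\lambda_{s2}-\lambda_{s1}) - \frac{1}{F_c}\int_{\lambda_{s1}}^{\lambda_{s2}} F_\lambda \, d\lambda .
$$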

Recovering already built models

# Processing of the Teff genetic-algorithm (GA) results
range_elem<-function(z,l,bi){
  look=function(x,l,y){
    j<-which(y>= as.numeric(x[l]))[1];
    k<-(which(y>as.numeric(x[(l+1)]))[1])-1; 
    return(c(j,k))
  }
  y=z$data[[1]][,1]
  lm<-t(apply(bi,1,look,l,y))
  rownames(lm)<-bi[,1]
  colnames(lm)<-c("From","To")
  return(lm)
}
int_spec<-function(x,idx,norm=0){
    y<-x$data[[1]][eval(parse(text=idx)),]
    xz<-diff(as.numeric(y[,1]),1)
    yz<-as.numeric(y[,2])
    if ( norm > 0 ){
      z<-sum(xz)
    } else {
      z<-sum(xz*rollmean(yz,2))
    }
    return(z)
}
#
feature_extr<-function(sn,bp){
  sig<-sn[1]
  noi<-sn[2]
  nof<-sn[3] # Allows using two continuum bands, as in the paper
  Fcont<-unlist(lapply(bp,int_spec,noi,0))/
         unlist(lapply(bp,int_spec,noi,1))
  if (! is.na(nof)){ # In case of two bands => average
    Fcont2=unlist(lapply(bp,int_spec,nof,0))/
           unlist(lapply(bp,int_spec,nof,1))
    Fcont = (Fcont + Fcont2) /2.
  }
  fea<-unlist(lapply(bp,int_spec,sig,1))-
       unlist(lapply(bp,int_spec,sig,0))/Fcont
  return(fea)
}
#
sacar=function(x,pos=2) {
  if ( length(x) < pos ) {
    return(NA);
  }
  return(x[pos]);
}
#
###################### 
#
# Loading the object BT-Settl
if (file.exists("~/git/M_prep_IPAC/Models_BT_settl_IPAC33_v2.RData")) {
  load( file="~/git/M_prep_IPAC/Models_BT_settl_IPAC33_v2.RData")
} else {
  # Abort if the model file is missing
  stop("ERROR: ~/git/M_prep_IPAC/Models_BT_settl_IPAC33_v2.RData NOT FOUND !!!")
}
#

Now we will read the M-star spectra of the IPAC dataset, stored in FITS format.
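As an illustration only (not the pipeline code, which is not echoed in this chunk), a single spectrum could be read with FITSio roughly as follows, assuming a one-dimensional flux array with a linear wavelength solution in the primary HDU; the function name and file name are hypothetical.

library(FITSio)
# Minimal sketch: read one FITS spectrum into a (wavelength, flux) data frame
read_one_spectrum <- function(fname) {
  f  <- readFITS(fname)              # flux values in f$imDat, axis metadata in f$axDat
  ax <- f$axDat[1, ]                 # first axis: CRVALn / CRPIXn / CDELTn / length
  wl <- ax$crval + (seq_len(ax$len) - ax$crpix) * ax$cdelt
  data.frame(x = wl, y = as.numeric(f$imDat))
}
# sp <- read_one_spectrum("ipac_M_star_0001.fits")   # hypothetical file name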

Reading the IPAC dataset

#
scaling=function(x,xlim=NULL,xp=NULL) {
  y=x
  if (! is.null(y$factors)) {
    y$data[,2]=y$data[,2]*y$factors
  }
  if ( is.null(xlim)){
    ixp=rep(TRUE,nrow(y$data))
    xlim=range(y$data[,1])
  } 
  if ( ! is.null(xp)) {
      xp = xp[(xp > xlim[1]) & (xp < xlim[2])]
      xli0 = y$data[rev(which(y$data[,1] < xlim[1]))[1],1] - 0.001 # Rounding ...
      xli1 = y$data[which(y$data[,1]>xlim[2])[1],1] + 0.001 # Rounding ...
      ixp  = (y$data[,1] >= xli0 & y$data[,1] <= xli1)
  } else {
      ixp = (y$data[,1]>= xlim[1] & y$data[,1] <= xlim[2])
  }
  z=y$data[ixp,]
  jp=rep(0,nrow(z)+1)
#  jp[is.finite(z[,2])]=1  # LSB has requested to interpolate the holes 30/7/2014
  jp[nrow(z)+1]=1
  zp=diff(jp)
  ip=which(zp != 0)
  ini=1
  j=0
  y$data=list()  
  for (i in ip) {
    if (zp[i] != -1 ) {
      j=j+1 
      y$data[[j]]=z[ini:i,]
    } else {
      ini=i+1
    }
  }
  y$nch=j
  y$factors=rep(NA,y$nch)
  for (i in 1:y$nch) {
    if ( is.na(y$data[[i]][nrow(y$data[[i]]),2])) {
      kk = rev(which(! is.na(y$data[[i]][,2])))[1]
      y$data[[i]][nrow(y$data[[i]]),2]=y$data[[i]][kk,2]
    }
    if ( is.na(y$data[[i]][1,2])) {
      kk = which(! is.na(y$data[[i]][,2]))[1]
      y$data[[i]][1,2]=y$data[[i]][kk,2]
    }
    if ( ! is.null(xp)) {
      yp=(approx(y$data[[i]][,1],y$data[[i]][,2],xout=xp))$y
      y$data[[i]]=data.frame(x=xp,y=yp)
    } else {
      yp=(approx(y$data[[i]][,1],y$data[[i]][,2],xout=y$data[[i]][,1]))$y
      y$data[[i]]=data.frame(x=y$data[[i]][,1],y=yp)
    }
    y$factors[i]=sum(rollmean(y$data[[i]][,2],2)*diff(y$data[[i]][,1]))
  }
  ss=sum(y$factors)
  for (i in 1:y$nch) {  
    y$data[[i]][,2]=y$data[[i]][,2]/ss      
  }
  return(y)
}
#
pinta=function(x,cortes=NULL,borders=FALSE, boundary=FALSE, close=FALSE,
               xlim=NULL, ylim=NULL,plt=NULL,width=12,height=8,...){
  if (is.null(x$nch)){
    nch=1
  } else {
    nch=x$nch
  }
  if ( length(xlim) < 2 ) {
    xmin=min(x$data[[1]][,1])
    xmax=max(x$data[[nch]][,1])
    xl=c(xmin,xmax)
  } else {
    xl=xlim
  }
  if ( length(ylim) < 2 ) {
    ymin=min(x$data[[1]][,2])
    ymax=max(x$data[[1]][,2])
    if ( nch > 1 ) {
      for (i in 2:nch){
        ymin=min(ymin,x$data[[i]][,2])
        ymax=max(ymax,x$data[[i]][,2])    
      }
    }
    yl=c(ymin,ymax)
  } else {
    yl = ylim
  }
  if (! is.null(plt) ) {
    pdf(file=plt,width,height)
  }
  if ( boundary) {
    plot(0,xlim=xl,ylim=yl,...)
  }
  for (i in 1:nch) {
    lines(x$data[[i]],...)
  }
  for (i in cortes) {
    abline(v=i[1],col=2)
    abline(v=i[2],col=2)
  }
  if ( borders) {
    for (i in 1:nch) {   # draw the borders of every channel
      abline(v=min(x$data[[i]][,1]),col=3)
      abline(v=max(x$data[[i]][,1]),col=3)
    }
  }
  }
  if ( close ) {
    dev.off()
  }
}
setwd('~/git/M_prep_IPAC')
#

A total of 595 spectra have been read from the individual-spectra dataset.
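A quick sanity check of that count, assuming the cleaned observed spectra are held in the list bf_clean used below:

length(bf_clean)   # expected to return 595 if every individual spectrum was read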

Preparation of potential features for T

#
area=function(v,nv) {
  y=data.frame(x=as.numeric(nv),y=v)
  xz = diff(as.numeric(y[,1]),1)
  yz = as.numeric(y[,2])
  z  = sum(xz*rollmean(yz,2))
  return(z)
}
#
bpy=do.call(rbind,lapply(bp_clean,function(x){return(x$data[[1]][,2])}))
bpy10=do.call(rbind,lapply(bp_10,function(x){return(x$data[[1]][,2])}))
bpy50=do.call(rbind,lapply(bp_50,function(x){return(x$data[[1]][,2])}))  
YPT=do.call(rbind,lapply(bp_clean,function(x){return(x$stellarp[1])}))
YPG=do.call(rbind,lapply(bp_clean,function(x){return(x$stellarp[2])}))
YPM=do.call(rbind,lapply(bp_clean,function(x){return(x$stellarp[3])}))
vf=as.character(bp_clean[[1]]$data[[1]][,1])
colnames(bpy)=vf
colnames(bpy10)=vf
colnames(bpy50)=vf
bfy=do.call(rbind,lapply(bf_clean,function(x){return(x$data[[1]][,2])}))
colnames(bfy)=vf
#

Instead of using genetic algorithms to find the best features, we now apply the features of Cesetti et al.
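As a toy illustration (made-up flux values; the variable names are ours, and the bandpass corresponds to the second Cesetti band defined below), each predictor is the integrated flux in a feature bandpass divided by the integrated flux in its reference bandpass, i.e. the ratio x/x0 formed later in prep_datos:

wl_feat   = seq(8484, 8512.8, by = 1.2)       # feature bandpass [Å]
wl_ref    = seq(8473.2, 8484, by = 1.2)       # reference (pseudo-continuum) bandpass [Å]
flux_feat = rep(0.9, length(wl_feat))         # hypothetical fluxes
flux_ref  = rep(1.0, length(wl_ref))
C2 = area(flux_feat, wl_feat) / area(flux_ref, wl_ref)   # one column of the design matrix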

#
fea_x =t(matrix(c(8461,8474,8484,8513,8522,8562,8577,8619,8642,8682,8730,8772,
                  8802,8811,8850,8890,9000,9030,9080,9100),ncol=10))
fea_x0=t(matrix(c(8474,8484,8474,8484,8474,8484,8563,8577,8619,8642,8700,8725,
                  8776,8792,8815,8850,8983,8998,9040,9050),ncol=10))
fea_ix=fea_ix0=fea_x
for (i in (1:nrow(fea_x))) {
  for (j in (1:ncol(fea_x))) {
    fea_ix[i,j] =which.min(abs(as.numeric(vf)-fea_x[i,j]))
    fea_ix0[i,j]=which.min(abs(as.numeric(vf)-fea_x0[i,j]))    
  }
}
#
res00=res10=res50=list()
for (j in 1:nrow(fea_ix)) {
  x= apply(bpy[,((fea_ix[j,1]):(fea_ix[j,2]))],1,area,vf[eval(parse(text=paste(fea_ix[j,],collapse=":")))])
  x0=apply(bpy[,((fea_ix0[j,1]):(fea_ix0[j,2]))],1,area,vf[eval(parse(text=paste(fea_ix0[j,],collapse=":")))])
  name=paste(vf[fea_ix[j,]],collapse="-")
  name0=paste(vf[fea_ix0[j,]],collapse="-")
  res00[[j]]=list(x=x,x0=x0,name=name,name0=name0)
  x= apply(bpy10[,((fea_ix[j,1]):(fea_ix[j,2]))],1,area,vf[eval(parse(text=paste(fea_ix[j,],collapse=":")))])
  x0=apply(bpy10[,((fea_ix0[j,1]):(fea_ix0[j,2]))],1,area,vf[eval(parse(text=paste(fea_ix0[j,],collapse=":")))])
  res10[[j]]=list(x=x,x0=x0,name=name,name0=name0)
  x= apply(bpy50[,((fea_ix[j,1]):(fea_ix[j,2]))],1,area,vf[eval(parse(text=paste(fea_ix[j,],collapse=":")))])
  x0=apply(bpy50[,((fea_ix0[j,1]):(fea_ix0[j,2]))],1,area,vf[eval(parse(text=paste(fea_ix0[j,],collapse=":")))])
  res50[[j]]=list(x=x,x0=x0,name=name,name0=name0)
}
rm(x,x0,name,name0)
#

T predicted by the luminosity models

# Loading the IPAC predictions from Dr Sarro (object ipac_temp)
load("~/git/M_prep_Noise/Teffs.RData")
ipac_temp[,1]=as.character(ipac_temp[,1])
ipac_temp[,2]=as.character(ipac_temp[,2])
ipac_temp[,3]=as.character(ipac_temp[,3])
ipac_temp[,4]=as.character(ipac_temp[,4])
ipac_temp[,5]=as.character(ipac_temp[,5])
ipac_temp[,6]=as.character(ipac_temp[,6])
ipac_temp[ipac_temp=='\xa0']=""
#
lnm<-unlist(lapply(bf_clean,function(x){return(x$name)}))
buscar=function(x,y,l=4){
  for ( i in x[1:l]) {
    if (nchar(i) > 1) {
      stg = gsub('+','p',gsub(' ','_',i),fixed=TRUE)
      stg = gsub('\\(','\\\\(',gsub('\\)','\\\\)',stg))
      if (nchar(stg) > 0) {
        pos=grep(stg,y)
        if ( length(pos) == 1 ) {
          return (pos)
        }
      }
    }
  }
  return(-1)
}
#
idx=apply(ipac_temp[,-c(5,7)],1,buscar,lnm,6)
Teff_scs = rep(NA,max(idx))
Tipo_scs = rep(NA,max(idx))
idpos=data.frame(LSB=1:length(idx),BPG=idx)
for ( i in 1:nrow(idpos)) {
  if (idpos[i,2]>0) {
    Teff_scs[idpos[i,2]]= ipac_temp[idpos[i,1],7]
    Tipo_scs[idpos[i,2]]= ipac_temp[idpos[i,1],5]
  }
}
# 

T modelling with these features

#
clase_luminosidad = function(x) {
  # We inspect the spectral type / luminosity class string for Roman numerals:
  # V => dwarfs; III => giants; I => supergiants (II is mapped to I)
  cl=c("V","III","I") 
  i=1
  res=NA
  if ( ! is.na(x) ) {
    while(i <= length(cl) & is.na(res)) {
      j=regexpr(cl[i],x)
      if (j[1] > 0) {
        res=cl[i]
      } else {
        i=i+1
      }
    }
  }
  return(res)
}
clase_m = function(x) {
  # We inspect the spectral type string for the M subclass (M0 ... M9)
#   cl=c("M0.5","M0","M1.5","M1","M2.5","M2","M3.5","M3","M4.5","M4","M5.5","M5","M6.5","M6",
#        "M7.5","M7","M8.5","M8","M9.5","M9") 
  cl=c("M0","M1","M2","M3","M4","M5","M6","M7","M8","M9") 
  i=1
  res=NA
  if ( ! is.na(x) ) {
    while(i <= length(cl) & is.na(res)) {
      j=regexpr(cl[i],x)
      if (j[1] > 0) {
        res=cl[i]
      } else {
        i=i+1
      }
    }
  }
  return(res)
}
#
#
prep_datos=function(res){
  X=matrix(NA,nrow=length(res[[1]]$x),ncol=length(res))
  for (j in 1:length(res)) {
    X[,j]=res[[j]]$x/res[[j]]$x0
  }
  colnames(X)=paste("C",1:length(res),sep="")   
  return(X)
}
#
modelado=function(X,Y) {
  #
  trn=1:nrow(X)
  folds=5
  repeats=1
  #
  mseeds <- vector(mode = "list", length = (folds+1))
  for(i in 1:folds) mseeds[[i]] <- sample.int(nrow(X), 100)
  mseeds[[(folds+1)]] <- sample.int(1000, 1)
  myControl = trainControl(method='cv', number=folds, 
                  repeats=repeats, returnResamp='none', 
                  returnData=FALSE, savePredictions=TRUE, 
                  verboseIter=FALSE, allowParallel=TRUE,seeds=mseeds)
  #Train some models
  model_1.1 = "train(X[trn,], Y[trn], method='gbm', 
                     trControl=myControl,
                tuneGrid=expand.grid(n.trees=seq(200,500,100), n.minobsinnode=c(2,5),
                                     interaction.depth=seq(3,15,2), 
                                     shrinkage = seq(0.01,0.1,0.02)),verbose=FALSE)"
  model_1.2 = "train(X[trn,], Y[trn], method='blackboost', 
                  trControl=myControl, tuneGrid=expand.grid(.mstop=10^(2:4), 
                                     .maxdepth=seq(5,15,3)))"
  model_1.3 = "train(X[trn,], Y[trn], method='rf', 
                     trControl=myControl,
                     tuneGrid=expand.grid(.mtry=(2:ncol(X))),
                     proximity=TRUE)"
  model_1.4 = "train(X[trn,], Y[trn], method='knn', 
                     trControl=myControl, trace=FALSE,
                     tuneGrid=expand.grid(k=(2:min(9,ncol(X)))))"
  model_1.5 = "train(X[trn,], Y[trn], method='ppr', 
                     trControl=myControl,tuneGrid=expand.grid(.nterms=10))"
  model_1.6 = "train(X[trn,], Y[trn], method='earth', 
                     trControl=myControl,tuneGrid=expand.grid(.degree = 3,
                          .nprune = (1:10) * 2), metric = 'RMSE',
                          maximize = FALSE)"
  model_1.7 = "train(X[trn,], Y[trn], method='glm', 
                     trControl=myControl)"
  model_1.8 = "train(X[trn,], Y[trn], method='svmRadial', 
                     trControl=myControl,tuneGrid=expand.grid(.C=10^(-3:3), 
                      .sigma=c(10^(-4:3))))"
  model_1.9 = "train(X[trn,], Y[trn], method='cubist', 
                     trControl=myControl,
                     tuneGrid=expand.grid(.committees=10, .neighbors=0))"
  model_1.10 = "train(X[trn,], Y[trn], method='kernelpls', 
                     trControl=myControl,tuneGrid=expand.grid(.ncomp=seq(2,ncol(X))))"
  model_1.11 = "train(X[trn,], Y[trn], method='nnet', 
                  trControl=myControl,tuneGrid=expand.grid(.size=seq(2,round(2*ncol(X)),3), 
                      .decay=c(1.e-5,1.e-2)),
                      linout=TRUE, rang = 300, trace=FALSE,
                      maxit=10000, reltol=1.0e-11, abstol=1.0e-6)"
  model_1.12 = "train(X[trn,], Y[trn], method='bagEarth', 
                  trControl=myControl,tuneGrid=expand.grid(.nprune=(1:10)*3,.degree=1:5))"
  model_1.13 = "train(X[trn,], Y[trn], method='cubist', trControl=myControl,
                  tuneGrid=expand.grid(.committees=3:50, .neighbors=1:7))"
  # 
  #Make a list of all the models
  lmodels=c("model_1.1", "model_1.2", "model_1.3", "model_1.4", 
            "model_1.5", "model_1.6", "model_1.7", "model_1.8", 
            "model_1.9", "model_1.10","model_1.11","model_1.12", 
            "model_1.13") 
#  
  all.models_5 =list()
  for (i in lmodels) {
    all.models_5[[length(all.models_5)+1]] = eval(parse(text=gsub('\n','',get(i))))
  }
  names(all.models_5) = sapply(all.models_5, function(x) x$method)
  # 
  ensam_5 <- caretList(X[trn,],Y[trn],methodList=c('svmRadial',
                        'gbm','blackboost','rf','earth','bagEarth'))
  ens <- caretEnsemble(ensam_5)
  #Make a linear regression ensemble
  if (exists("ens")){
    all.models_5[[length(all.models_5)+1]] = get("ens")
    names(all.models_5)[length(all.models_5)]="ENS"
  } 
  return(all.models_5)
}
#
# Function that returns Root Mean Squared Error
rmse <- function(error) {
    sqrt(mean(error^2,na.rm=TRUE))
}
# Function that returns Mean Absolute Error
mae <- function(error) {
    mean(abs(error),na.rm=TRUE)
}
#
#
if ( file.exists("~/git/M_prep_IPAC/GA_IPACrv_MT_CES_Models.RData")) {
  load("~/git/M_prep_IPAC/GA_IPACrv_MT_CES_Models.RData")
} else {
  X00 = prep_datos(res00)
  md5 = modelado(X00,as.numeric(YPT[,1]))
  X10 = prep_datos(res10)
  md51= modelado(X10,as.numeric(YPT[,1]))
  X50 = prep_datos(res50)
  md55= modelado(X50,as.numeric(YPT[,1]))
  #
  # We now consider the IPAC spectra corrected for Doppler shift (A. Bello 16/7/2015)
  load("../M_redshift/M_prep_clean_IPAC_corrected.RData")
  bfy=do.call(rbind,lapply(bq_clean,function(x){return(x$data[[1]][,2])}))
  colnames(bfy)=vf
  #
  rf00=list()
  for (j in 1:nrow(fea_ix)) {
    x= apply(bfy[,((fea_ix[j,1]):(fea_ix[j,2]))],1,area,vf[eval(parse(text=paste(fea_ix[j,],collapse=":")))])
    x0=apply(bfy[,((fea_ix0[j,1]):(fea_ix0[j,2]))],1,area,vf[eval(parse(text=paste(fea_ix0[j,],collapse=":")))])
    name=paste(vf[fea_ix[j,]],collapse="-")
    name0=paste(vf[fea_ix0[j,]],collapse="-")
    rf00[[j]]=list(x=x,x0=x0,name=name,name0=name0)
  }
  rm(x,x0,name,name0)
  xf0=prep_datos(rf00)
  #
  YTn00=predict(md5[[which(names(md5)=="rf")]],xf0)
  YTn01=predict(md51[[which(names(md51)=="rf")]],xf0)
  YTn05=predict(md55[[which(names(md55)=="rf")]],xf0)
  #
  Clase_lum=apply(as.data.frame(as.character(Tipo_scs),
            stringsAsFactors=FALSE),1,clase_luminosidad)
  Clase_m  =apply(as.data.frame(as.character(Tipo_scs),
            stringsAsFactors=FALSE),1,clase_m)
  SpT=Clase_m
  LC =Clase_lum
  ref_tot0=data.frame(Name=as.character(ref$Name),
                chi2d_10=ref$Chi2_T_10, chi2d_50=ref$Chi2_T_50,
                YTn00,YTn01,YTn05,Teff_LSB=Teff_scs,SpT=SpT,LC=LC)
  #
  save(bpy,bfy,YPT,YPG,YPM,vf,md5,md51,
       md55,res00,res10,res50, X00,X10,X50,YTn00,YTn01,
       YTn05,xf0,ref_tot0,LC,SpT,
       file="~/git/M_prep_IPAC/GA_IPACrv_MT_CES_Models.RData")
}
## Package startup messages while fitting the models: gbm 2.1.1, survival, splines,
## party, mboost 2.4-2, randomForest 4.6-7, earth, plotmo, plotrix, TeachingDemos.

#
ref_tot0=data.frame(Name=as.character(ref$Name),
                chi2d_10=ref$Chi2_T_10, chi2d_50=ref$Chi2_T_50,
                YTn00,YTn01,YTn05,Teff_LSB=Teff_scs,SpT=SpT,LC=LC)
#
ref_tot2=ref_tot0
#
print(xtable(ldply(res00,function(x){return(c(x$name,x$name0))})),type="html")
    Feature band [Å]   Continuum band [Å]
1   8462.4-8473.2      8473.2-8484
2   8484-8512.8        8473.2-8484
3   8523.6-8563.2      8473.2-8484
4   8577.6-8617.2      8563.2-8577.6
5   8642.4-8682        8617.2-8642.4
6   8728.8-8772        8700-8725.2
7   8800.8-8811.6      8775.6-8793.6
8   8851.2-8890.8      8815.2-8851.2
9   8998.8-9031.2      8984.4-8998.8
10  9081.6-9099.6      9038.4-9049.2
print(xtable(ldply(res10,function(x){return(c(x$name,x$name0))})),type="html")
    Feature band [Å]   Continuum band [Å]
1   8462.4-8473.2      8473.2-8484
2   8484-8512.8        8473.2-8484
3   8523.6-8563.2      8473.2-8484
4   8577.6-8617.2      8563.2-8577.6
5   8642.4-8682        8617.2-8642.4
6   8728.8-8772        8700-8725.2
7   8800.8-8811.6      8775.6-8793.6
8   8851.2-8890.8      8815.2-8851.2
9   8998.8-9031.2      8984.4-8998.8
10  9081.6-9099.6      9038.4-9049.2
print(xtable(ldply(res50,function(x){return(c(x$name,x$name0))})),type="html")
    Feature band [Å]   Continuum band [Å]
1   8462.4-8473.2      8473.2-8484
2   8484-8512.8        8473.2-8484
3   8523.6-8563.2      8473.2-8484
4   8577.6-8617.2      8563.2-8577.6
5   8642.4-8682        8617.2-8642.4
6   8728.8-8772        8700-8725.2
7   8800.8-8811.6      8775.6-8793.6
8   8851.2-8890.8      8815.2-8851.2
9   8998.8-9031.2      8984.4-8998.8
10  9081.6-9099.6      9038.4-9049.2
#
ggplot(data=ref_tot2) + 
         geom_point(aes(x=Teff_LSB,y=chi2d_10,shape=LC),size=3) +
         xlab("Theoretical T [K]") + ylab("Chi2 10 [K]") +
         theme_bw() + #  scale_colour_brewer(palette="Set1") +
         geom_abline(position="identity", colour="gray") +
         xlim(2000,4200) + ylim(2000,4200)   +
         guides(col=guide_legend(ncol=2))

#
ggplot(data=ref_tot2) + 
         geom_point(aes(x=Teff_LSB,y=chi2d_50,shape=LC),size=3) +
         xlab("Theoretical T [K]") + ylab("Chi2 50 [K]") +
         theme_bw() + #  scale_colour_brewer(palette="Set1") +
         geom_abline(position="identity", colour="gray") +
         xlim(2000,4200) + ylim(2000,4200)   +
         guides(col=guide_legend(ncol=2))

#
ggplot(data=ref_tot2) + 
         geom_point(aes(x=Teff_LSB,y=YTn00,shape=LC),size=3) +
         xlab("Theoretical T [K]") + 
         ylab("ML predicted Msnr=oo Fsnr=oo [K]") +
         theme_bw() + #  scale_colour_brewer(palette="Set1") +
         geom_abline(position="identity", colour="gray") +
         xlim(2000,4200) + ylim(2000,4200)   +
         guides(col=guide_legend(ncol=2))

#
ggplot(data=ref_tot2) + 
         geom_point(aes(x=Teff_LSB,y=YTn01,shape=LC),size=3) +
         xlab("Theoretical T [K]") + 
         ylab("ML predicted Msnr=10 Fsnr=10 [K]") +
         theme_bw() + #  scale_colour_brewer(palette="Set1") +
         geom_abline(position="identity", colour="gray") +
         xlim(2000,4200) + ylim(2000,4200)   +
         guides(col=guide_legend(ncol=2))

#
ggplot(data=ref_tot2) + 
         geom_point(aes(x=Teff_LSB,y=YTn05,shape=LC),size=3) +
         xlab("Theoretical T [K]") + 
         ylab("ML predicted Msnr=50 Fsnr=50 [K]") +
         theme_bw() + #  scale_colour_brewer(palette="Set1") +
         geom_abline(position="identity", colour="gray") +
         xlim(2000,4200) + ylim(2000,4200)   +
         guides(col=guide_legend(ncol=2))

#
lmod=c("rf","gbm","svmRadial","nnet","knn","bagEarth","kernelpls","cubist")
nmod=c("RF","GB","SVR","NNR","KNN","MARS","PLS","Rule-Regression")
dataT = array(0,c(3,length(nmod),nrow(xf0))) # Dim 1: SNR, Dim 2: model, Dim 3: star
nomT = unlist(lapply(bf_clean,function(x){nm=gsub('.fits','',x$name);pp=strsplit(nm,'_'); return(pp[[1]][2])}))
for(ll in c(1:2)) {
  if (ll == 2) {
    pdf(file="~/git/M_prep_IPAC/GA_model_Trv.pdf",width=12,height=10)
  }
  for (i in 1:length(lmod)) {
    YTm00=predict(md5[[which(names(md5)==lmod[i])[1]]],xf0)
    YTm01=predict(md51[[which(names(md51)==lmod[i])[1]]],xf0)
    YTm05=predict(md55[[which(names(md55)==lmod[i])[1]]],xf0)
    dataT[1,i,] = YTm00
    dataT[2,i,] = YTm01
    dataT[3,i,] = YTm05       
    #
    ref_tot0m=data.frame(Name=as.character(ref$Name),
                chi2d_10=ref$Chi2_T_10, chi2d_50=ref$Chi2_T_50,
                YTm00,YTm01,YTm05,Teff_LSB=Teff_scs,SpT=SpT,LC=LC)
    ref_tot2m=ref_tot0m 
    diffs=apply(ref_tot2m[,! colnames(ref_tot2m) %in% c("Name","SpT","Teff_LSB","LC")],
              2,FUN="-",ref_tot2m[,"Teff_LSB"])
    rownames(diffs)=ref_tot2m[,"Name"]
    #
    cat(paste("T Modelling with ",lmod[i],". Error analysis follows.",sep=""))
    errors=data.frame(drmse = apply(diffs,2,rmse),dmae=apply(diffs,2,mae))
    print(xtable(errors),type="html")
  #
    cat(paste("T Modelling with",nmod[i],". SNR=oo",sep=""))
    print(ggplot(data=ref_tot2m) + 
           geom_point(aes(x=Teff_LSB,y=YTm00,shape=LC),size=3) +
           xlab("Theoretical T [K]") + 
           ylab(paste(nmod[i]," predicted SNR=oo [K]","")) +
           theme_bw() + #  scale_colour_brewer(palette="Set1") +
           geom_abline(position="identity", colour="red") +
           xlim(2000,4200) + ylim(2000,4200)   +
           guides(col=guide_legend(ncol=2)))
    #
    cat(paste("T Modelling with",nmod[i],". SNR=10",sep=""))
    print(ggplot(data=ref_tot2m) + 
           geom_point(aes(x=Teff_LSB,y=YTm01,shape=LC),size=3) +
           xlab("Theoretical T [K]") + 
           ylab(paste(nmod[i]," predicted SNR=10 [K]","")) +
           theme_bw() + #  scale_colour_brewer(palette="Set1") +
           geom_abline(position="identity", colour="red") +
           xlim(2000,4200) + ylim(2000,4200)   +
           guides(col=guide_legend(ncol=2)))
    #
    cat(paste("T Modelling with",nmod[i],". SNR=50",sep=""))
    print(ggplot(data=ref_tot2m) + 
           geom_point(aes(x=Teff_LSB,y=YTm05,shape=LC),size=3) +
           xlab("Theoretical T [K]") + 
           ylab(paste(nmod[i]," predicted SNR=50 [K]","")) +
           theme_bw() + #  scale_colour_brewer(palette="Set1") +
           geom_abline(position="identity", colour="red") +
           xlim(2000,4200) + ylim(2000,4200)   +
           guides(col=guide_legend(ncol=2)))
  }
}
T Modelling with rf. Error analysis follows.

           drmse [K]   dmae [K]
chi2d_10      168.61     122.99
chi2d_50      155.71      99.24
YTm00         345.61     260.31
YTm01         228.17     165.70
YTm05         282.06     199.90

T Modelling with RF. SNR=oo
T Modelling with RF. SNR=10
T Modelling with RF. SNR=50

T Modelling with gbm. Error analysis follows.

           drmse [K]   dmae [K]
chi2d_10      168.61     122.99
chi2d_50      155.71      99.24
YTm00         367.03     299.15
YTm01         214.55     157.17
YTm05         294.55     209.41

T Modelling with GB. SNR=oo
T Modelling with GB. SNR=10
T Modelling with GB. SNR=50

T Modelling with svmRadial. Error analysis follows.

           drmse [K]   dmae [K]
chi2d_10      168.61     122.99
chi2d_50      155.71      99.24
YTm00         855.74     722.97
YTm01         220.18     164.91
YTm05         409.38     291.16

T Modelling with SVR. SNR=oo
T Modelling with SVR. SNR=10
T Modelling with SVR. SNR=50

T Modelling with nnet. Error analysis follows.

           drmse [K]   dmae [K]
chi2d_10      168.61     122.99
chi2d_50      155.71      99.24
YTm00         508.62     393.79
YTm01         243.76     180.40
YTm05         461.01     377.54

T Modelling with NNR. SNR=oo
T Modelling with NNR. SNR=10
T Modelling with NNR. SNR=50

T Modelling with knn. Error analysis follows.

           drmse [K]   dmae [K]
chi2d_10      168.61     122.99
chi2d_50      155.71      99.24
YTm00         331.80     259.74
YTm01         262.15     202.62
YTm05         240.42     174.97

T Modelling with KNN. SNR=oo
T Modelling with KNN. SNR=10
T Modelling with KNN. SNR=50

T Modelling with bagEarth. Error analysis follows.

           drmse [K]   dmae [K]
chi2d_10      168.61     122.99
chi2d_50      155.71      99.24
YTm00        3501.23    1505.91
YTm01         253.33     161.59
YTm05         913.73     387.81

T Modelling with MARS. SNR=oo
T Modelling with MARS. SNR=10
T Modelling with MARS. SNR=50

T Modelling with kernelpls. Error analysis follows.

           drmse [K]   dmae [K]
chi2d_10      168.61     122.99
chi2d_50      155.71      99.24
YTm00        2260.65    1759.30
YTm01         268.18     213.00
YTm05         734.09     520.21

T Modelling with PLS. SNR=oo
T Modelling with PLS. SNR=10
T Modelling with PLS. SNR=50

T Modelling with cubist. Error analysis follows.

           drmse [K]   dmae [K]
chi2d_10      168.61     122.99
chi2d_50      155.71      99.24
YTm00         848.84     773.94
YTm01         238.56     175.04
YTm05         417.61     321.57
T Modelling with Rule-Regression. SNR=oo
T Modelling with Rule-Regression. SNR=10
T Modelling with Rule-Regression. SNR=50

dev.off()

png 2

#