For this semester, you have two options:
Complete only the part of the assignment that matches your choice: your own computer or the virtual machine!
##r chunk - do not change these
R.version
## _
## platform x86_64-apple-darwin17.0
## arch x86_64
## os darwin17.0
## system x86_64, darwin17.0
## status
## major 4
## minor 0.2
## year 2020
## month 06
## day 22
## svn rev 78730
## language R
## version.string R version 4.0.2 (2020-06-22)
## nickname Taking Off Again
#RStudio.Version() run this line but it won't knit with it "on"
ANSWER: What version of RStudio are you using? Please note it should be the latest version! R version 3.6.1 (2019-07-05)
### Install all the R Packages
Change eval = TRUE to eval = FALSE once you have them installed.
install.packages("https://osf.io/ak7gq/download", repos = NULL, method = "libcurl", type = "source")
devtools::install_github("trinker/termco")
## Skipping install of 'termco' from a github remote, the SHA1 (b246be55) has not changed since last install.
## Use `force = TRUE` to force installation
devtools::install_github("trinker/coreNLPsetup")
## Skipping install of 'coreNLPsetup' from a github remote, the SHA1 (0fc06d43) has not changed since last install.
## Use `force = TRUE` to force installation
devtools::install_github("trinker/tagger")
## Skipping install of 'tagger' from a github remote, the SHA1 (203c1ea5) has not changed since last install.
## Use `force = TRUE` to force installation
devtools::install_github("bnosac/RDRPOSTagger")
## Skipping install of 'RDRPOSTagger' from a github remote, the SHA1 (af51e38f) has not changed since last install.
## Use `force = TRUE` to force installation
devtools::install_github("bradleyboehmke/harrypotter")
## Skipping install of 'harrypotter' from a github remote, the SHA1 (51f71461) has not changed since last install.
## Use `force = TRUE` to force installation
reticulate library.
##r chunk
Try typing py_config() below. You should get a prompt to install Miniconda. If not, use install_miniconda().
##r chunk
Run py_config() in the R chunk below.
##r chunk
Windows machines need special programs to make all this work:
- Install each package with py_install("package_name", pip = T).
- Change eval = TRUE to eval = FALSE once you have them installed.
- Packages: nltk, matplotlib, PyQt5, scikit-learn, numpy, pandas, regex, requests, bs4, spacy, contractions, textblob, sip, gensim, afinn, pyLDAvis
py_install("pandas", pip = T)
py_install("nltk", pip = T)
py_install("matplotlib", pip = T)
py_install("PyQt5", pip = T)
py_install("scikit-learn", pip = T)
py_install("numpy", pip = T)
py_install("regex", pip = T)
py_install("requests", pip = T)
py_install("bs4", pip = T)
py_install("spacy", pip = T)
py_install("contractions", pip = T)
py_install("textblob", pip = T)
py_install("sip", pip = T)
py_install("gensim", pip = T)
py_install("afinn", pip = T)
py_install("pyLDAvis", pip = T)
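Once the installs finish, it can help to confirm that each package is importable from the Python that reticulate uses. The sketch below is one way to do that from a Python chunk. The helper name missing_modules is hypothetical, and the list assumes each import name matches its install name, except scikit-learn, which is imported as sklearn. sip is left out because its import name may differ from its install name.

```python
from importlib.util import find_spec

# Import names to check. Assumption: these match the pip install names,
# except scikit-learn, which is imported as "sklearn".
IMPORT_NAMES = [
    "nltk", "matplotlib", "PyQt5", "sklearn", "numpy", "pandas", "regex",
    "requests", "bs4", "spacy", "contractions", "textblob", "gensim",
    "afinn", "pyLDAvis",
]

def missing_modules(names):
    """Return the names that the current interpreter cannot find."""
    return [name for name in names if find_spec(name) is None]

print("Not importable:", missing_modules(IMPORT_NAMES))
```

An empty list means every package in IMPORT_NAMES was found; any name that is printed still needs a py_install() call.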
For nltk, you will need to add a few other pieces. Type the following into R console:
- library(reticulate)
- repl_python()
- Here you should notice you have switched from > to >>> which indicates you are in Python:
To get out of >>> python, type exit or hit the Esc key.
Click on terminal > type in:
- python -m spacy download en_core_web_sm
- This will download the English language spacy module.
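If more than one Python is installed, a bare python in the terminal may not be the interpreter that reticulate uses. As a hedged sketch, the same download command can be built from inside Python with sys.executable, so that it targets the environment that is currently running. The function name spacy_download_command is hypothetical.

```python
import sys

def spacy_download_command(model="en_core_web_sm"):
    """Build the spaCy download command for the interpreter running this code."""
    return [sys.executable, "-m", "spacy", "download", model]

# Print the command rather than running it; pass the list to
# subprocess.run(...) to actually download the model.
print(" ".join(spacy_download_command()))
```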
Go to: https://class.aggieerin.com/auth-sign-in
Your login is:
Click on terminal and run the following lines:
Run the following in the R console:
When you run py_config() the first time, it will ask you to install miniconda. Say no! We already have python3 installed on the server.
##r chunk
library(reticulate)
py_config()
## python: /Users/emilyhuang/Library/r-miniconda/envs/r-reticulate/bin/python
## libpython: /Users/emilyhuang/Library/r-miniconda/envs/r-reticulate/lib/libpython3.6m.dylib
## pythonhome: /Users/emilyhuang/Library/r-miniconda/envs/r-reticulate:/Users/emilyhuang/Library/r-miniconda/envs/r-reticulate
## version: 3.6.10 |Anaconda, Inc.| (default, Mar 25 2020, 18:53:43) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]
## numpy: /Users/emilyhuang/Library/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/numpy
## numpy_version: 1.19.1
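The same details can be checked from the Python side. The sketch below uses only the standard library to collect roughly the interpreter fields that py_config() reports. The dictionary keys are chosen here to mirror that output and are not a reticulate API.

```python
import sys
import platform

def python_summary():
    """Collect interpreter details similar to those shown by py_config()."""
    return {
        "python": sys.executable,              # path to the running interpreter
        "pythonhome": sys.prefix,              # root of the active environment
        "version": platform.python_version(),  # e.g. "3.6.10"
    }

for key, value in python_summary().items():
    print(f"{key}: {value}")
```

If the path printed for python differs from the one py_config() shows, the chunk is running in a different environment than expected.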
- Use data(rock) to load it.
- Use the head() function to print out the first six rows of the dataset.
##r chunk
data(rock)
head(rock)
## area peri shape perm
## 1 4990 2791.90 0.0903296 6.3
## 2 7002 3892.60 0.1486220 6.3
## 3 7558 3930.66 0.1833120 6.3
## 4 7352 3869.32 0.1170630 6.3
## 5 7943 3948.54 0.1224170 17.1
## 6 7979 4010.15 0.1670450 17.1
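For comparison, a rough Python counterpart to head() can be sketched with itertools.islice. The rows here are placeholders, not the rock data, and the helper name head is hypothetical.

```python
from itertools import islice

def head(rows, n=6):
    """Return the first n items of any iterable, similar to R's head()."""
    return list(islice(rows, n))

rows = [{"row": i} for i in range(1, 11)]  # ten placeholder rows
for row in head(rows):
    print(row)
```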
- The sklearn library has several sample datasets. You load Python packages by using import PACKAGE. Note that this package is installed and imported under different names (scikit-learn = sklearn).
- To import one part of a package, use from PACKAGE import FUNCTION. Therefore, you should use from sklearn import datasets.
- Load the boston dataset by doing: dataset_boston = datasets.load_boston().
- Print the first rows with the .head() function: df_boston.head(), after converting the file with pandas (code included below).
##python chunk
##TYPE HERE##
#import sklearn
#from sklearn import datasets
#dataset_boston = datasets.load_boston()
##convert to pandas
#import pandas as pd
#df_boston = pd.DataFrame(data=dataset_boston.data, columns=dataset_boston.feature_names)
#df_boston.head()
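The two import styles described above behave the same way for any package. A minimal sketch, using the standard-library math module as a stand-in for sklearn so that it runs without extra installs:

```python
# Style 1: import the whole package, then reach into it with a dot.
import math
print(math.sqrt(16))

# Style 2: import one name from the package and use it directly.
from math import sqrt
print(sqrt(16))
```

Both calls run the same function; the second style only shortens the name you type.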
QUESTION: Look in your environment window. What do you see? same
- In R, Python variables are available as py$VARNAME, and a column is selected with DATAFRAME$COLUMN. Try to print out the CRIM column from your df_boston variable.
##r chunk
#py$df_boston$CRIM
#View(rock)
- In Python, instead of $, we use . like this: r.VARNAME, and a column is selected with DATAFRAME["COLUMNNAME"]. Try printing out the shape column in the rock dataset.
##python chunk
#r.rock['shape']
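In a Python chunk, reticulate exposes the R data frame through the r object, and bracket syntax selects a column. The sketch below imitates that access with a plain dictionary holding the six rows printed by head(rock) earlier, so it runs without R. The dictionary is only a stand-in for the converted data frame.

```python
# Stand-in for r.rock: column name -> values, copied from the head(rock) output.
rock = {
    "area":  [4990, 7002, 7558, 7352, 7943, 7979],
    "peri":  [2791.90, 3892.60, 3930.66, 3869.32, 3948.54, 4010.15],
    "shape": [0.0903296, 0.1486220, 0.1833120, 0.1170630, 0.1224170, 0.1670450],
    "perm":  [6.3, 6.3, 6.3, 6.3, 17.1, 17.1],
}

# Bracket access by column name, in the same form as r.rock["shape"].
print(rock["shape"])
```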
install.packages("rJava")