For this semester, you have two options:
Complete only the part of the assignment that matches your choice: your own computer or the virtual machine.
##r chunk - do not change these
R.version
## _
## platform x86_64-w64-mingw32
## arch x86_64
## os mingw32
## system x86_64, mingw32
## status
## major 3
## minor 6.2
## year 2019
## month 12
## day 12
## svn rev 77560
## language R
## version.string R version 3.6.2 (2019-12-12)
## nickname Dark and Stormy Night
#RStudio.Version() run this line but it won't knit with it "on"
QUESTION: What version of RStudio are you using? Please note it should be the latest version!
ANSWER: R version 3.6.2 (2019-12-12)
Change eval = TRUE to eval = FALSE once you have them installed.
install.packages("https://osf.io/ak7gq/download", repos = NULL, method = "libcurl", type = "source")
Load the reticulate library.
##r chunk
library(reticulate)
## Warning: package 'reticulate' was built under R version 3.6.3
Try typing py_config() below. You should get a prompt to install Miniconda. If not, use install_miniconda().
##r chunk
py_config()
## python: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate/python.exe
## libpython: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate/python36.dll
## pythonhome: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate
## version: 3.6.10 |Anaconda, Inc.| (default, May 7 2020, 19:46:08) [MSC v.1916 64 bit (AMD64)]
## Architecture: 64bit
## numpy: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate/Lib/site-packages/numpy
## numpy_version: 1.18.4
Run py_config() in the R chunk below.
##r chunk
py_config()
## python: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate/python.exe
## libpython: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate/python36.dll
## pythonhome: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate
## version: 3.6.10 |Anaconda, Inc.| (default, May 7 2020, 19:46:08) [MSC v.1916 64 bit (AMD64)]
## Architecture: 64bit
## numpy: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate/Lib/site-packages/numpy
## numpy_version: 1.18.4
Windows machines need special programs to make all this work:
py_install("package_name", pip = TRUE)
Change eval = TRUE to eval = FALSE once you have them installed.
Packages: nltk, matplotlib, PyQt5, scikit-learn, numpy, pandas, prince, factor-analyzer, gensim, pyLDAvis, bs4
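Before switching eval to FALSE, it can help to confirm that each package is actually importable. The sketch below is one way to check this from Python using only the standard library. The helper name missing_packages and the mapping from install name to import name are assumptions for illustration (scikit-learn and factor-analyzer are imported under different names than they are installed under); they are not part of the assignment.

```python
from importlib.util import find_spec

# Install name (as passed to py_install) -> top-level import name.
# This mapping is an assumption for illustration; adjust it if a package differs.
PACKAGES = {
    "nltk": "nltk",
    "matplotlib": "matplotlib",
    "PyQt5": "PyQt5",
    "scikit-learn": "sklearn",
    "numpy": "numpy",
    "pandas": "pandas",
    "prince": "prince",
    "factor-analyzer": "factor_analyzer",
    "gensim": "gensim",
    "pyLDAvis": "pyLDAvis",
    "bs4": "bs4",
}


def missing_packages(packages):
    """Return the install names whose import name cannot be found."""
    return [
        install_name
        for install_name, import_name in packages.items()
        if find_spec(import_name) is None  # None means Python cannot locate it
    ]


# Lists whatever still needs py_install() on this machine.
print("Still missing:", missing_packages(PACKAGES))
```

Anything printed in the list still needs a py_install() call; an empty list suggests the install chunk can be turned off.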
For nltk, you will need to add a few other pieces. Type the following into the R console:
- library(reticulate)
- repl_python()
- Here you should notice you have switched from > to >>>, which indicates you are in Python:
To get out of the >>> Python prompt, type exit or hit the Esc key.
Click on Terminal and type in:
- python -m spacy download en_core_web_sm
- This will download the English language spacy module.
- pip install lxml
Go to: https://class.aggieerin.com/auth-sign-in
Your log in is:
Click on terminal and run the following lines:
When you run py_config() for the first time, it will ask you to install Miniconda. Say no! We already have python3 installed on the server.
##r chunk
library(reticulate)
py_config()
## python: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate/python.exe
## libpython: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate/python36.dll
## pythonhome: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate
## version: 3.6.10 |Anaconda, Inc.| (default, May 7 2020, 19:46:08) [MSC v.1916 64 bit (AMD64)]
## Architecture: 64bit
## numpy: C:/Users/PC/AppData/Local/r-miniconda/envs/r-reticulate/Lib/site-packages/numpy
## numpy_version: 1.18.4
Use data(rock) to load the rock dataset.
Use the head() function to print out the first six rows of the dataset.
##r chunk
data(rock)
head(rock)
##   area    peri     shape perm
## 1 4990 2791.90 0.0903296  6.3
## 2 7002 3892.60 0.1486220  6.3
## 3 7558 3930.66 0.1833120  6.3
## 4 7352 3869.32 0.1170630  6.3
## 5 7943 3948.54 0.1224170 17.1
## 6 7979 4010.15 0.1670450 17.1
The sklearn library has several sample datasets. You load Python packages by using import PACKAGE. Note that you install and call this package by different names (scikit-learn = sklearn).
To import one part of a package, use from PACKAGE import FUNCTION. Therefore, you should use from sklearn import datasets.
Load the boston dataset by doing: dataset_boston = datasets.load_boston().
Print the first rows with the .head() function: df_boston.head(), after converting the file with pandas (code included below).
##python chunk
#scikit-learn = sklearn
#import sklearn
#from sklearn import datasets
#dataset_boston = datasets.load_boston()
##convert to pandas
#import pandas as pd
#df_boston = pd.DataFrame(data=dataset_boston.data, columns=dataset_boston.feature_names)
#df_boston.head()
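The two import forms described above behave the same way for any package. As a minimal sketch that runs without scikit-learn installed, the example below uses the standard-library math module as a stand-in; for sklearn only the package and function names would change.

```python
# Form 1: import the whole package, then reach inside it with a dot.
import math

root_a = math.sqrt(16)

# Form 2: import one name from the package and use it directly.
from math import sqrt

root_b = sqrt(16)

# Both forms call the same underlying function, so the results match.
print(root_a, root_b)
```

The second form is what from sklearn import datasets does: afterwards datasets can be used without the sklearn. prefix.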
QUESTION: Look in your environment window. What do you see?
In R, Python variables are accessed with py$VARNAME, and columns with DATAFRAME$COLUMN. Try to print out the CRIM column from your df_boston variable.
##r chunk
#py$df_boston$CRIM
In Python, instead of $, we use . like this: r.VARNAME. Columns are accessed with DATAFRAME["COLUMNNAME"]. Try printing out the shape column in the rock dataset.
##python chunk
#r.rock["shape"]
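One reason to prefer the bracket syntax here: shape is also the name of a built-in pandas attribute, so rock.shape returns the table's dimensions rather than the column. The sketch below rebuilds the six rows printed by head(rock) by hand so that it runs outside the notebook. That hand-built data frame is an assumption for illustration; inside a python chunk the data frame would come from r.rock instead.

```python
import pandas as pd

# First six rows of the rock dataset, copied from the head(rock) output.
# In the notebook itself this would be r.rock rather than a hand-built frame.
rock = pd.DataFrame({
    "area": [4990, 7002, 7558, 7352, 7943, 7979],
    "peri": [2791.90, 3892.60, 3930.66, 3869.32, 3948.54, 4010.15],
    "shape": [0.0903296, 0.1486220, 0.1833120, 0.1170630, 0.1224170, 0.1670450],
    "perm": [6.3, 6.3, 6.3, 6.3, 17.1, 17.1],
})

# Bracket access returns the column named "shape" as a pandas Series.
shape_column = rock["shape"]

# Dot access hits the DataFrame attribute instead: (number of rows, number of columns).
dimensions = rock.shape

print(shape_column)
print(dimensions)
```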