There are two parts to this script. The first part uses the R package reticulate to run Python 3 code on a dataset of massage modalities, using each modality's benefits and, separately, its contraindications, to handle a single user's massage request based on his or her massage goals. Before that, the machine learning model was tested on predicting the modality from 24 duplicates of each of the 19 categories of massage. The test scores are 100%, as they should be, because every duplicate of a massage modality has the exact same description, word for word. The duplication was done so that splitting the data into training and testing sets would not leave any category out, since a left-out category, and therefore its missing predictions, would throw off the accuracy, recall, and precision.

An earlier version, prior to the last one, was used to test this accuracy. In it, all of the hot stone massages were misclassified as deep tissue massage, while all deep tissue massages were identified correctly. This made the recall 100% for deep tissue but zero for hot stone therapy, with zero precision for hot stone therapy and 33% precision for deep tissue, for an overall accuracy of 94%. This error was corrected by fixing the underlying data.
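To make the recall and precision figures above concrete, here is a minimal Python sketch of how those per-class scores come out when one class is entirely absorbed by another. The label counts are made up for illustration (they are not the actual test split), and `per_class_precision_recall` is a hypothetical helper, not part of the original script.

```python
from collections import Counter

def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} from paired true/predicted labels."""
    true_positives = Counter()
    predicted = Counter(y_pred)  # how often each label was predicted
    actual = Counter(y_true)     # how often each label truly occurs
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            true_positives[truth] += 1
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        precision = true_positives[label] / predicted[label] if predicted[label] else 0.0
        recall = true_positives[label] / actual[label] if actual[label] else 0.0
        scores[label] = (precision, recall)
    return scores

# Hypothetical test labels: every hot stone row is predicted as deep tissue.
y_true = (["Deep tissue Massage"] * 2
          + ["Hot Stone Therapy Massage"] * 4
          + ["Swedish Massage"] * 4)
y_pred = ["Deep tissue Massage"] * 6 + ["Swedish Massage"] * 4

scores = per_class_precision_recall(y_true, y_pred)
for label, (precision, recall) in scores.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

With these made-up counts, deep tissue keeps a recall of 1.0 but its precision drops to one third, because four of the six rows predicted as deep tissue are really hot stone, and hot stone scores zero on both measures.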

This script uses 19 classes of massage therapy, covering massage modalities and enhancements (elevations), to recommend a massage for a client. It first excludes every modality that is contraindicated for the user, based on the user's brief health declarations, and then ranks the remaining modalities by the benefits the user is looking for.

The program is built on the tokenized lists read in from the Python script (run in this script through reticulate), together with functions that grab a user input and pass it through one main function. That function lemmatizes and tokenizes the user input, using the same n-grams as the Python script, which uses the nltk Python module: 2 for benefits and 3 for contraindications. Here, though, the tokenization and lemmatization are done in R with the tidytext and textstem packages, respectively. All stop words and punctuation are removed, and the text is made lowercase. The function then builds a list of contraindicated massages from the user input tokens, computes recommendation probabilities for each modality from those tokens, and excludes the contraindicated modalities. It returns a table of probabilities for each modality available to the user (those not in the list of contraindicated modalities), along with the top three massage choices.

Version 2 then also returns the description, benefits, and possible side effects of each top choice.
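The exclude-then-rank flow described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the user input is taken as already reduced to n-gram sets, the `recommend` function and the tiny `catalog` are hypothetical, and the scoring rule (share of matching benefit bigrams) is an assumption, since the way the R function computes its probabilities is not shown in this part of the script.

```python
def recommend(goal_bigrams, health_trigrams, catalog, top_k=3):
    """Exclude contraindicated modalities, then rank the rest by benefit overlap."""
    # Step 1: drop any modality sharing a contraindication trigram with the user.
    allowed = {
        name: entry
        for name, entry in catalog.items()
        if not (health_trigrams & entry["contraindication_trigrams"])
    }
    # Step 2: count the benefit bigrams each remaining modality shares with the goals.
    overlaps = {
        name: len(goal_bigrams & entry["benefit_bigrams"])
        for name, entry in allowed.items()
    }
    total = sum(overlaps.values())
    if total == 0:
        return []  # nothing matched the user's goals
    # Step 3: normalize the counts (assumed scoring rule) and keep the top choices.
    ranked = sorted(overlaps.items(), key=lambda item: (-item[1], item[0]))
    return [(name, count / total) for name, count in ranked[:top_k]]

# Hypothetical, heavily shortened catalog of n-gram sets.
catalog = {
    "Swedish Massage": {
        "benefit_bigrams": {"improves circulation", "improves sleep"},
        "contraindication_trigrams": {"fever rash infection"},
    },
    "Sports Massage": {
        "benefit_bigrams": {"improves sleep", "improves flexibility"},
        "contraindication_trigrams": {"high blood pressure"},
    },
    "Hot Stone Therapy Massage": {
        "benefit_bigrams": {"improves circulation", "reduce stress"},
        "contraindication_trigrams": {"heat sensitive sensitivity"},
    },
}

recommendations = recommend(
    goal_bigrams={"improves sleep", "improves circulation"},
    health_trigrams={"high blood pressure"},
    catalog=catalog,
)
print(recommendations)
```

In this toy run, Sports Massage is removed before any ranking happens because of the blood pressure declaration, so it cannot appear among the recommendations no matter how well its benefits match.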

modes <- read.csv('MassageModalities3.csv', sep=',', header=TRUE, na.strings=c('',' ','NA'))
colnames(modes)[1] <- 'modality'
head(modes)
##                     modality
## 1            Swedish Massage
## 2            Shiatsu Massage
## 3           Prenatal Massage
## 4             Sports Massage
## 5 Lymphatic Drainage Massage
## 6        Reflexology Massage
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     Description
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           Traditional spa or clinic massage with hands,  palms,  elbows,  forearms of massage therapist used with glides and varying pressure applied along the muscle fibers of the body with varying amounts of pressure to get up to deeper layers of the body while avoiding causing pain to the client and avoiding discomfort to the client. Avoids massaging fast and avoids choppy short motions that disrupt relaxation,  but at the same rhythm to promote relaxation and calm the nervous system. 
## 2                                                                                                                                                                                                                                                                                                                      Accupressure or small and localized pressure is applied with knuckles,  thumbs,  or tools to deep muscle layers along the spine and limbs or meridians of the body to promote healing of muscle aches,  improve relaxation,  calm nervous system,  and reduce stress for repetitive cycles of 3-5 seconds with each pressure application,  sometimes longer. Helps with muscle tonicity,  can be adjusted to the right pressure for each client and can be combined with other massage modalities like hot stone therapy,  cold stone therapy,  aromatheray,  CBD or cannabidiol products,  cupping therapy,  stretching,  massage gun therapy,  biofreeze,  craniosacral therapy,  reflexology,  lymphatic drainage,  sports ,  and deep tissue massage
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        swedish massage up to medium pressure or middle muscle layers to help relax client going through body changes due to pregnancy and associated aches in the feet,  low back,  and upper back,  avoids the joints and major artery sites of client,  client cannot be a high risk pregnancy or within the first trimester unless not a high risk and has had massage regularly for at least one year allowing the clients body to welcome massage. Can be relaxing,  helps detox stress from body,  improves circulation,  and other benefits of massage
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    Massage that is focused on sports related discomforts around tendons and tight muscles to improve muscle recovery or improve muscle performance,  improve range of motion,  can include stretching and percussion and can be combined with other massage modalities like hot stone therapy,  cold stone therapy,  aromatheray,  CBD or cannabidiol products,  cupping therapy,  stretching,  massage gun therapy,  biofreeze,  craniosacral therapy,  reflexology,  lymphatic drainage,  sports ,  and deep tissue massage
## 5                                                   Massage focused on superficial muscles and lymphatic system of the body to drain edema or excess fluid retention from the limbs to the torso or heart to be eliminated,  it is rhythmic and slower to align with the lymphatic system of lymph nodes that runs along the circulatory system to clean the blood and improve immunity as well as sometimes elevate the limbs to help with drainage of excess water in the limbs from surgery,  cancer,  or other health related conditions as long as not a contraindication or client has a doctor's approval and then this massage can be combined with other massage modalities like hot stone therapy,  cold stone therapy,  aromatheray,  CBD or cannabidiol products,  cupping therapy,  stretching,  massage gun therapy,  biofreeze,  craniosacral therapy,  reflexology,  lymphatic drainage,  sports massage,  shiatsu massage,  instrument assisted soft tissue mobilization or IASTM tools,  myofascial massage,  trigger point therapy,  and deep tissue massage
## 6 Focus of this massage is to stimulate body functions and improve health of clients who find it more relaxing to have their scalp,  hands,  and/or feet massaged where these areas of the body have alignments in traditional chinese medicine to certain organs in the body reflected in locations on the feet,  hands,  and scalp to stimulate healing to those areas of the client's body. Historically,  this type of massage was the only type of massage allowed for health monitored clients recovering or living with certain health conditions such as cancer. If no contraindications or has a doctor's note,  this massage can be combined with other massage modalities like hot stone therapy,  cold stone therapy,  aromatheray,  CBD or cannabidiol products,  cupping therapy,  stretching,  massage gun therapy,  biofreeze,  craniosacral therapy,  reflexology,  lymphatic drainage,  sports massage,  shiatsu massage,  instrument assisted soft tissue mobilization or IASTM tools,  myofascial massage,  trigger point therapy,  and deep tissue massage
##                                                                                                                                                                                                                                                                                                                                                                                 benefits
## 1                                                                                                                                                                                                                      improves tight muscles,  loosens tight muscle fascia,  improves circulation,  improves relaxation,  improves immunity,  improves sleep,  improves range of motion
## 2                                                                                                                                                                                                                                                                                 improved circulation,  relaxing,  detoxes,  breaks apart adhesions,  improves sleep,  improves healing
## 3                                                                                                        improved circulation,  better sleep,  pain relief,  relaxing effect,  soothing effect,  calming effect,  improves range of motion,  helps with congestion,  helps detox,  helps clean old bruises,  modified massage if doctor approved for high risk or other health condition
## 4 improves healing from sports or work related muscle and tendon pains and discomforts,  relaxing,  improves range of motion,  improves flexibility,  improves immunity,  improves sleep,  improves workouts,  improves musle strength,  improves muscle recovery,   helps prevent muscle strain,   helps prevent tendon inflammation,  helps prevent injury,   helps muscle performance
## 5                                                                                                                                                                                                                                                                                                                                                                improves circulation,  
## 6                                                                        relaxing,  improves circulation,  improves sleep,  helps with pain and discomfort,  increases immunity,  recommended for people with pain sensitivity or people who cannot be touched because it causes discomfort by tickling,  itching,  or hurting them like some cases of fibromyalgia and neuropathic pain
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 contraindications
## 1                                                                                                                                                                                                                                                          dehydration,  fever,  rashes,  infection,  some mental disorders,  nausea,  epilepsy,  aneurism history, psychosis,   numbness in limbs,  limb numbness,  nerve tingling,  nerve pain,  difficulty breathing,  ragged breathing,  trouble breathing,  
## 2                                                                                                                                          dehydration,  fever,  rashes,  infection,  some mental disorders,  nausea,  epilepsy,  aneurism history,  cancer, psychosis,   numbness in limbs,  limb numbness,  nerve tingling,  nerve pain,  difficulty breathing,  ragged breathing,  trouble breathing,  heart disease, circulatory problems, blood pressure problems, blood clots, blood embolisms, thrombosis 
## 3 dehydration,  Be a high risk pregnancy (history of painful menstruation,  history of uterine diseases such as fibroids,  cysts,  endometriosis,  history of miscarriages,  over the age of 35 and first child,  diabetes,  blood disorders,  heart disorders) fever,  rashes,  infection,  some mental disorders,  nausea,  epilepsy,  aneurism history,  cancer, psychosis,   numbness in limbs,  limb numbness,  nerve tingling,  nerve pain,  difficulty breathing,  ragged breathing,  trouble breathing,  
## 4                                                                                                                               dehydration,  fever,  rashes,  infection,  some mental disorders,  nausea,  epilepsy,  pregnant,  aneurism history,  cancer, psychosis,   numbness in limbs,  limb numbness,  nerve tingling,  nerve pain,  difficulty breathing,  ragged breathing,  trouble breathing,  heart disease, circulatory problems, blood pressure problems, blood clots, blood embolisms, thrombosis 
## 5                                                                                                                                                                                                       dehydration,  fever,  rashes,  infection,  some mental disorders,  nausea,  epilepsy,  pregnant,  on wound,  pregnant,  heart condition,  aneurism history, psychosis,   numbness in limbs,  limb numbness,  nerve tingling,  nerve pain,  difficulty breathing,  ragged breathing,  trouble breathing,  
## 6                                                                                                                                                                            dehydration,  fever,  rashes,  infection,  some mental disorders,  nausea,  epilepsy,  , aneurism history, psychosis,   numbness in limbs,  limb numbness,  nerve tingling,  nerve pain,  difficulty breathing,  ragged breathing,  trouble breathing,  not recommended for people who have neuropathic pain or tickling of the feet
##                                                                                                                                                              sideEffects
## 1                                                                                             if no contraindications for massage exist in client, can make client tired
## 2                                                                               can cause muscle soreness for a few days from breaking apart adhesions and pressure used
## 3 can make client dizzy or nausious if first massage and not familiar with massage or early stages of pregnancy and a first time pregnancy but not a high risk pregnancy
## 4                                                                               can cause muscle soreness for a few days from breaking apart adhesions and pressure used
## 5  client could get dizzy or feel nausious if he or she has an underlying health condition such as a circulatory problem, blood disorder, blockage somewhere in the body
## 6                                                                                                                                          can make client light headed

Let's create a table from the one above containing only the unique (non-duplicated) entries, to use later in our program when returning the description and side effects of each recommended massage.

modesUnique <- modes[!duplicated(modes$modality),c(1,2,3,5)]

Let's use Python 3 to tokenize the contraindications into groups of three adjacent words (trigrams) and the benefits into pairs of adjacent words (bigrams), using the n-grams tokenization method.
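As a small illustration of what n-grams tokenization produces, the sketch below builds word n-grams by hand from already lemmatized text. The helper `word_ngrams` is hypothetical and only mimics the idea; the script itself relies on scikit-learn's `CountVectorizer` with `ngram_range=(3,3)` and `ngram_range=(2,2)`, which also drops single-character tokens by default.

```python
def word_ngrams(text, n):
    """Return the n-word sequences found in text, in order of appearance."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Short lemmatized snippets in the style of the dataset columns.
contraindications = "dehydration fever rash infection"
benefits = "improved circulation better sleep"

trigrams = word_ngrams(contraindications, 3)
bigrams = word_ngrams(benefits, 2)

print(trigrams)  # ['dehydration fever rash', 'fever rash infection']
print(bigrams)   # ['improved circulation', 'circulation better', 'better sleep']
```

Note that n-grams run across the original comma boundaries once punctuation is stripped, which is why combinations such as 'circulation better' appear as features.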

library(reticulate)
## Warning: package 'reticulate' was built under R version 3.6.3
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(stringr)
library(tm)
## Loading required package: NLP
library(textstem)
## Loading required package: koRpus.lang.en
## Loading required package: koRpus
## Loading required package: sylly
## For information on available language packages for 'koRpus', run
## 
##   available.koRpus.lang()
## 
## and see ?install.koRpus.lang()
library(tidytext)
## Warning: package 'tidytext' was built under R version 3.6.3
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.2.1     v readr   1.3.1
## v tibble  2.1.3     v purrr   0.3.3
## v tidyr   1.0.0     v forcats 0.4.0
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x ggplot2::annotate() masks NLP::annotate()
## x dplyr::filter()     masks stats::filter()
## x dplyr::lag()        masks stats::lag()
## x readr::tokenize()   masks koRpus::tokenize()
conda_list(conda = "auto") 
##           name                                                  python
## 1    Anaconda2                     C:\\Users\\m\\Anaconda2\\python.exe
## 2    djangoenv    C:\\Users\\m\\Anaconda2\\envs\\djangoenv\\python.exe
## 3     python36     C:\\Users\\m\\Anaconda2\\envs\\python36\\python.exe
## 4     python37     C:\\Users\\m\\Anaconda2\\envs\\python37\\python.exe
## 5 r-reticulate C:\\Users\\m\\Anaconda2\\envs\\r-reticulate\\python.exe
use_condaenv(condaenv = "python36")
import pandas as pd 
import matplotlib.pyplot as plt 
from textblob import TextBlob 
import sklearn 
import numpy as np 
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer 
from sklearn.naive_bayes import MultinomialNB 
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

import re
import string
import nltk 

np.random.seed(47) 
modes = pd.read_csv('MassageModalities3.csv', encoding = 'unicode_escape') 
print(modes.head())
##                      modality  ...                                        sideEffects
## 0             Swedish Massage  ...  if no contraindications for massage exist in c...
## 1             Shiatsu Massage  ...  can cause muscle soreness for a few days from ...
## 2            Prenatal Massage  ...  can make client dizzy or nausious if first mas...
## 3              Sports Massage  ...  can cause muscle soreness for a few days from ...
## 4  Lymphatic Drainage Massage  ...  client could get dizzy or feel nausious if he ...
## 
## [5 rows x 5 columns]
print(modes.columns)
## Index(['modality', 'Description', 'benefits', 'contraindications',
##        'sideEffects'],
##       dtype='object')
print(modes['modality'].unique())
## ['Swedish Massage' 'Shiatsu Massage' 'Prenatal Massage' 'Sports Massage'
##  'Lymphatic Drainage Massage' 'Reflexology Massage' 'Craniosacral Massage'
##  'Trigger Point Therapy' 'Myofascial Massage' 'Deep tissue Massage'
##  'Hot Stone Therapy Massage' 'Biofreeze Muscle Pain Relief Gel'
##  'Massage Gun Therapy' 'Aromatherapy' 'Cupping Therapy' 'Stretching'
##  'Cold Stone Therapy' 'Cannabidiol (CBD) Massage Balm'
##  'Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage']
import numpy as np

modes = modes.reindex(np.random.permutation(modes.index))

print(modes.head())
##                            modality  ...                                        sideEffects
## 39                  Shiatsu Massage  ...  can cause muscle soreness for a few days from ...
## 282              Cold Stone Therapy  ...  if placed directly on the skin it could slight...
## 351             Deep tissue Massage  ...  can cause muscle soreness for a few days from ...
## 128                 Cupping Therapy  ...   can leave marks on the body that last up to t...
## 416  Cannabidiol (CBD) Massage Balm  ...  depends on the ingredients and amount of CBD i...
## 
## [5 rows x 5 columns]
print(modes.tail())
##                            modality  ...                                        sideEffects
## 72                       Stretching  ...  Can cause muscle soreness for a few days after...
## 264  Cannabidiol (CBD) Massage Balm  ...  depends on the ingredients and amount of CBD i...
## 327      Lymphatic Drainage Massage  ...  client could get dizzy or feel nausious if he ...
## 390       Hot Stone Therapy Massage  ...  can burn if stones are too hot for the client,...
## 135                Prenatal Massage  ...  can make client dizzy or nausious if first mas...
## 
## [5 rows x 5 columns]
modes.groupby('modality').describe()
##                                                    Description  ... sideEffects
##                                                          count  ...        freq
## modality                                                        ...            
## Aromatherapy                                                24  ...          24
## Biofreeze Muscle Pain Relief Gel                            24  ...          24
## Cannabidiol (CBD) Massage Balm                              24  ...          24
## Cold Stone Therapy                                          24  ...          24
## Craniosacral Massage                                        24  ...          24
## Cupping Therapy                                             24  ...          24
## Deep tissue Massage                                         24  ...          24
## Hot Stone Therapy Massage                                   24  ...          24
## Instrument Assisted Soft Tissue Mobilization (I...          24  ...          24
## Lymphatic Drainage Massage                                  24  ...          24
## Massage Gun Therapy                                         24  ...          24
## Myofascial Massage                                          24  ...          24
## Prenatal Massage                                            24  ...          24
## Reflexology Massage                                         24  ...          24
## Shiatsu Massage                                             24  ...          24
## Sports Massage                                              24  ...          24
## Stretching                                                  24  ...          24
## Swedish Massage                                             24  ...          24
## Trigger Point Therapy                                       24  ...          24
## 
## [19 rows x 16 columns]
stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
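def lemmatize(text):
    # drop punctuation characters and lowercase the rest (text is iterated character by character)
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # raw string so that '\W' is not read as an invalid string escape
    tokens=re.split(r'\W+', text)
    # remove stop words, lemmatize, and join back into one string for N-grams vectorization
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    #text=[wn.lemmatize(word) for word in tokens if word not in stopwords]#when using count Vectorization its a list
    #or else single letters returned.
    return text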
modes['lemmatizedBenefits']=modes['benefits'].apply(lambda x: lemmatize(x))
modes['lemmatizedContraindications']=modes['contraindications'].apply(lambda x: lemmatize(x))
modes.columns
## Index(['modality', 'Description', 'benefits', 'contraindications',
##        'sideEffects', 'lemmatizedBenefits', 'lemmatizedContraindications'],
##       dtype='object')
modes.head()
##                            modality  ...                        lemmatizedContraindications
## 39                  Shiatsu Massage  ...  dehydration fever rash infection mental disord...
## 282              Cold Stone Therapy  ...  dehydration fever rash infection mental disord...
## 351             Deep tissue Massage  ...  dehydration fever rash infection mental disord...
## 128                 Cupping Therapy  ...  dehydration fever rash infection mental disord...
## 416  Cannabidiol (CBD) Massage Balm  ...  dehydration fever rash infection mental disord...
## 
## [5 rows x 7 columns]
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(modes[['lemmatizedContraindications','lemmatizedBenefits']],modes['modality'],test_size=0.15)
X_train.head()
##                            lemmatizedContraindications                                 lemmatizedBenefits
## 350  dehydration local site wound sore sensitive sk...      muscle trauma muscle spasm pain trigger point
## 344  dehydration high risk pregnancy history painfu...  improved circulation better sleep pain relief ...
## 438  dehydration fever rash infection mental disord...  improved circulation relaxing detox break apar...
## 390  dehydration fever rash infection mental disord...   relax client reduce stress calming increase h...
## 337  dehydration fever rash infection mental disord...  improved circulation better sleep pain relief ...
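The split above is random, which is why each modality was duplicated 24 times to make it unlikely that a class is missing from either set. A stratified split gives that guarantee directly; scikit-learn's `train_test_split` accepts a `stratify` argument for this. The pure-Python sketch below shows the idea only; `stratified_split` is a hypothetical helper and the three-class label list is made up for illustration.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.15, seed=47):
    """Split row indices so every class contributes rows to the test set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # at least one test row per class, otherwise the rounded share
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx

# Hypothetical labels: 24 copies of each of three modalities.
labels = [
    name
    for name in ("Swedish Massage", "Shiatsu Massage", "Hot Stone Therapy Massage")
    for _ in range(24)
]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With 24 rows per class and a 15% test share, each class sends 4 rows to the test set and keeps 20 for training. A class with only a single row would end up in the test set alone, so this sketch assumes at least two rows per class.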
from sklearn.feature_extraction.text import CountVectorizer
n_gram3_vect=CountVectorizer(ngram_range=(3,3))
n_gram2_vect=CountVectorizer(ngram_range=(2,2))
type(X_train['lemmatizedBenefits'])
## <class 'pandas.core.series.Series'>
X_train['lemmatizedBenefits'].head()
## 350        muscle trauma muscle spasm pain trigger point
## 344    improved circulation better sleep pain relief ...
## 438    improved circulation relaxing detox break apar...
## 390     relax client reduce stress calming increase h...
## 337    improved circulation better sleep pain relief ...
## Name: lemmatizedBenefits, dtype: object
type(X_train['lemmatizedContraindications'])
## <class 'pandas.core.series.Series'>
X_train['lemmatizedContraindications'].head()
## 350    dehydration local site wound sore sensitive sk...
## 344    dehydration high risk pregnancy history painfu...
## 438    dehydration fever rash infection mental disord...
## 390    dehydration fever rash infection mental disord...
## 337    dehydration fever rash infection mental disord...
## Name: lemmatizedContraindications, dtype: object
n_gram3_vect_fit=n_gram3_vect.fit(X_train['lemmatizedContraindications'])
n_gram2_vect_fit=n_gram2_vect.fit(X_train['lemmatizedBenefits'])


n_gram3_train=n_gram3_vect_fit.transform(X_train['lemmatizedContraindications'])
n_gram3_test=n_gram3_vect_fit.transform(X_test['lemmatizedContraindications'])
n_gram2_train=n_gram2_vect_fit.transform(X_train['lemmatizedBenefits'])
n_gram2_test=n_gram2_vect_fit.transform(X_test['lemmatizedBenefits'])
print(len(n_gram3_vect_fit.get_feature_names()))
## 297
Ngram3 = n_gram3_vect_fit.get_feature_names()
print(Ngram3)
## ['35 first child', 'acute cranium bleeding', 'age 35 first', 'allergy fragrance epilepsy', 'allergy sensitivity pain', 'anemia blood disorder', 'anemia diabetes blood', 'aneurism herniation medulla', 'aneurism history cancer', 'aneurism history psychosis', 'application site fever', 'area sore application', 'arnoldchiari acute cranium', 'artery joint bruise', 'arthritis neuropathic pain', 'asthma allergy fragrance', 'autoimmune disease psychosis', 'bleeding wound area', 'blood clot blood', 'blood clot diabetes', 'blood clot heart', 'blood disorder anemia', 'blood disorder heart', 'blood disorder immune', 'blood disorder leukemia', 'blood embolism thrombosis', 'blood pressure diabetes', 'blood pressure disorder', 'blood pressure low', 'blood pressure pregnant', 'blood pressure problem', 'blood pressure wound', 'bone bruise wound', 'bone fractured bone', 'bone insensitivity pain', 'bone trauma osteoporosis', 'brain trauma history', 'breast feeding open', 'breathing blood disorder', 'breathing cold sensitive', 'breathing heart disease', 'breathing medication could', 'breathing muscle pain', 'breathing people sensitive', 'breathing ragged breathing', 'breathing recommended people', 'breathing spine broken', 'breathing trouble breathing', 'broken bone fractured', 'broken bone insensitivity', 'bruise heart disease', 'bruise wound cut', 'burn rash fever', 'cancer autoimmune disease', 'cancer heat sensitive', 'cancer lymphoma psychosis', 'cancer osteoporosis psychosis', 'cancer psychosis numbness', 'cancer sensitivity cold', 'cannabidiol allergy cbd', 'child diabetes blood', 'circulatory disease blood', 'circulatory disorder blood', 'circulatory problem blood', 'clot blood embolism', 'clot diabetes heart', 'clot heart disease', 'cold circulatory disorder', 'cold cold sensitivity', 'cold sensitive sensitivity', 'cold sensitivity psychosis', 'condition aneurism history', 'condition circulatory disease', 'could interact cannabidiol', 'cranium bleeding wound', 'cut burn 
rash', 'cut rash nerve', 'cyst endometriosis history', 'dehydration asthma allergy', 'dehydration fever rash', 'dehydration high risk', 'dehydration local site', 'dehydration seizure psychological', 'dehydration wound directly', 'diabetes blood disorder', 'diabetes blood pressure', 'diabetes heart disease', 'diabetes wound anemia', 'difficulty breathing ragged', 'direction massage therapist', 'directly surgical incision', 'disease blood clot', 'disease circulatory problem', 'disease fibroid cyst', 'disease psychosis numbness', 'disorder anemia diabetes', 'disorder blood clot', 'disorder blood pressure', 'disorder fever rash', 'disorder heart condition', 'disorder heart disorder', 'disorder histerical emotionally', 'disorder immune disorder', 'disorder leukemia sicklecell', 'disorder nausea epilepsy', 'disorder personality disorder', 'disorder pregnant aneurism', 'disturbed unable sit', 'eczema psoriasis sunburn', 'emotionally disturbed unable', 'endangering massage therapist', 'endometriosis history miscarriage', 'epilepsy allergy sensitivity', 'epilepsy aneurism history', 'epilepsy arthritis neuropathic', 'epilepsy bone trauma', 'epilepsy fever infection', 'epilepsy high blood', 'epilepsy pregnant aneurism', 'epilepsy pregnant wound', 'feeding open wound', 'fever infection nausea', 'fever infection neuropathy', 'fever infection skull', 'fever rash infection', 'fibroid cyst endometriosis', 'first child diabetes', 'fracture broken bone', 'fracture fever infection', 'fracture insensitivity pain', 'fractured bone bruise', 'fragrance epilepsy fever', 'heart condition aneurism', 'heart condition circulatory', 'heart disease circulatory', 'heart disorder fever', 'heat psychosis numbness', 'heat sensitive sensitivity', 'herniation medulla oblongato', 'high blood pressure', 'high risk pregnancy', 'histerical emotionally disturbed', 'history aneurism herniation', 'history cancer autoimmune', 'history cancer heat', 'history cancer lymphoma', 'history cancer osteoporosis', 
'history cancer psychosis', 'history cancer sensitivity', 'history miscarriage age', 'history painful menstruation', 'history psychosis numbness', 'history stroke history', 'history uterine disease', 'immune disorder pregnant', 'impaired sensitivity pain', 'incision impaired sensitivity', 'infection mental disorder', 'infection nausea psychosis', 'infection neuropathy pregnant', 'infection skull trauma', 'injury heart disease', 'injury sprain strain', 'insensitivity pain pregnant', 'interact cannabidiol allergy', 'joint bruise heart', 'leukemia sicklecell blood', 'limb limb numbness', 'limb numbness nerve', 'limb spasm direction', 'local site wound', 'low blood pressure', 'lymphoma psychosis numbness', 'massage therapist brain', 'massage therapist endangering', 'medication could interact', 'medulla oblongato arnoldchiari', 'menstruation history uterine', 'mental disorder nausea', 'miscarriage age 35', 'muscle injury sprain', 'muscle pain muscle', 'nausea epilepsy allergy', 'nausea epilepsy aneurism', 'nausea epilepsy arthritis', 'nausea epilepsy bone', 'nausea epilepsy high', 'nausea epilepsy pregnant', 'nausea psychosis numbness', 'nerve artery joint', 'nerve pain difficulty', 'nerve tingling nerve', 'neuropathic pain pregnant', 'neuropathic pain tickling', 'neuropathy pregnant breast', 'numbness limb limb', 'numbness nerve tingling', 'oblongato arnoldchiari acute', 'open wound skin', 'osteoporosis fracture broken', 'osteoporosis fracture insensitivity', 'osteoporosis psychosis numbness', 'pain difficulty breathing', 'pain heart disease', 'pain high blood', 'pain muscle injury', 'pain pregnant aneurism', 'pain psychological disorder', 'pain tickling foot', 'painful menstruation history', 'people neuropathic pain', 'people sensitive pain', 'personality disorder fever', 'pregnancy history painful', 'pregnant aneurism history', 'pregnant breast feeding', 'pregnant heart condition', 'pregnant wound pregnant', 'pressure diabetes wound', 'pressure disorder blood', 
'pressure disorder heart', 'pressure low blood', 'pressure pregnant aneurism', 'pressure problem blood', 'pressure wound aneurism', 'problem blood clot', 'problem blood pressure', 'psoriasis sunburn sensitive', 'psychological disorder histerical', 'psychological disorder personality', 'psychosis numbness limb', 'ragged breathing trouble', 'rash eczema psoriasis', 'rash fever rash', 'rash infection mental', 'rash nerve artery', 'recommended people neuropathic', 'risk pregnancy history', 'seizure psychological disorder', 'sensitive pain heart', 'sensitive sensitivity cold', 'sensitive sensitivity heat', 'sensitive skin sensitive', 'sensitive skin thin', 'sensitive smell pregnant', 'sensitivity cold circulatory', 'sensitivity cold cold', 'sensitivity heat psychosis', 'sensitivity pain high', 'sensitivity pain psychological', 'sensitivity psychosis numbness', 'sicklecell blood disorder', 'sit still limb', 'site fever infection', 'site wound sore', 'skin cut burn', 'skin rash eczema', 'skin sensitive smell', 'skin thin skin', 'skull fracture fever', 'skull trauma skull', 'smell pregnant aneurism', 'sore application site', 'sore sensitive skin', 'spasm direction massage', 'spine broken bone', 'sprain strain injury', 'still limb spasm', 'strain injury heart', 'stroke history aneurism', 'sunburn sensitive skin', 'surgical incision impaired', 'therapist brain trauma', 'therapist endangering massage', 'thin skin cut', 'tingling nerve pain', 'trauma history stroke', 'trauma osteoporosis fracture', 'trauma skull fracture', 'trouble breathing blood', 'trouble breathing cold', 'trouble breathing heart', 'trouble breathing medication', 'trouble breathing muscle', 'trouble breathing people', 'trouble breathing recommended', 'trouble breathing spine', 'unable sit still', 'uterine disease fibroid', 'wound anemia blood', 'wound aneurism history', 'wound area sore', 'wound cut rash', 'wound directly surgical', 'wound pregnant heart', 'wound skin rash', 'wound sore sensitive']
print(len(n_gram2_vect_fit.get_feature_names()))
## 245
Ngram2 = n_gram2_vect_fit.get_feature_names()
print(type(Ngram2))
## <class 'list'>
print(Ngram2)
## ['ache improve', 'ache relieve', 'acute pain', 'adhesion heal', 'adhesion improve', 'adhesion improves', 'adhesion increase', 'adhesion stress', 'alleviate headache', 'anxiety stress', 'apart adhesion', 'apart muscle', 'approved high', 'arthritis chronic', 'arthritis stress', 'arthritis tendonitis', 'associated nerve', 'auditory disturbancss', 'autonomous disfunction', 'back pain', 'better posture', 'better sleep', 'break adhesion', 'break apart', 'bruise modified', 'calm nervous', 'calming effect', 'calming increase', 'cannot touched', 'case fibromyalgia', 'cause discomfort', 'chronic pain', 'circulation better', 'circulation break', 'circulation good', 'circulation improves', 'circulation pain', 'circulation reduces', 'circulation relaxing', 'clean old', 'client reduce', 'congestion help', 'detox break', 'detox help', 'detox muscle', 'detox recharge', 'discomfort fibromyalgia', 'discomfort increase', 'discomfort relaxing', 'discomfort tickling', 'disfunction autonomous', 'disfunction emotional', 'disorder rheumatoid', 'disturbance auditory', 'disturbancss tmj', 'doctor approved', 'effect calming', 'effect improves', 'effect soothing', 'emotional disorder', 'emotionally upset', 'energize unwind', 'fascia adhesion', 'fascia improves', 'fibromyalgia myofascial', 'fibromyalgia neuropathic', 'fibrosis increased', 'flexibility improved', 'flexibility improves', 'good stretching', 'headache emotionally', 'headache pain', 'headache scoliosis', 'heal sore', 'healing improve', 'healing improved', 'healing improves', 'healing increase', 'healing sport', 'health condition', 'help alleviate', 'help chronic', 'help clean', 'help congestion', 'help detox', 'help muscle', 'help pain', 'help post', 'help prevent', 'help relieve', 'high risk', 'hurting like', 'hypertonicity hypomobility', 'immunity help', 'immunity improves', 'immunity recommended', 'improve circulation', 'improve healing', 'improve range', 'improve sleep', 'improved circulation', 'improved mood', 'improved 
range', 'improves circulation', 'improves flexibility', 'improves healing', 'improves immunity', 'improves muscle', 'improves musle', 'improves range', 'improves relaxation', 'improves sleep', 'improves tight', 'improves workout', 'increase healing', 'increase immunity', 'increased flexibility', 'increased healing', 'increased range', 'inflammation help', 'injury help', 'injury tight', 'insomnia headache', 'itching hurting', 'joint arthritis', 'joint discomfort', 'like case', 'loosens tight', 'massage doctor', 'modified massage', 'mood muscle', 'motion break', 'motion help', 'motion improve', 'motion improved', 'motion improves', 'motion prevent', 'muscle ache', 'muscle fascia', 'muscle help', 'muscle improve', 'muscle loosens', 'muscle performance', 'muscle recovery', 'muscle relief', 'muscle relieve', 'muscle spasm', 'muscle strain', 'muscle tendon', 'muscle trauma', 'muscular adhesion', 'musle pain', 'musle strength', 'myofascial pain', 'nerve pain', 'nervous system', 'neuromodulated pain', 'neuropathic pain', 'old bruise', 'pain acute', 'pain arthritis', 'pain associated', 'pain calm', 'pain discomfort', 'pain improve', 'pain improves', 'pain insomnia', 'pain point', 'pain relief', 'pain sensitivity', 'pain sensory', 'pain trigger', 'pain visceral', 'people cannot', 'people pain', 'point fibrosis', 'post workout', 'posture increased', 'prevent injury', 'prevent muscle', 'prevent tendon', 'range motion', 'recharge relax', 'recommended people', 'recovery break', 'recovery help', 'reduce stress', 'reduces pain', 'related muscle', 'relax client', 'relax energize', 'relax tight', 'relaxation improves', 'relaxing detox', 'relaxing effect', 'relaxing improves', 'relief better', 'relief muscle', 'relief relaxing', 'relieve muscle', 'relieve musle', 'relieve pain', 'relieve stiff', 'rheumatoid arthritis', 'risk health', 'scoliosis visual', 'sensitivity people', 'sensory stimulation', 'sleep help', 'sleep improves', 'sleep pain', 'soothing effect', 'sore muscle', 'spasm 
pain', 'sport work', 'stiff muscle', 'stimulation joint', 'strain help', 'strength improves', 'stress back', 'stress calming', 'stretching detox', 'system improves', 'tempromandibular joint', 'tendon inflammation', 'tendon neuromodulated', 'tendon pain', 'tendon working', 'tendonitis hypertonicity', 'tickling itching', 'tight muscle', 'tight tendon', 'tmj tempromandibular', 'touched cause', 'trauma muscle', 'trigger pain', 'trigger point', 'unwind invigorate', 'upset detox', 'visceral disfunction', 'visual disturbance', 'work related', 'working muscular', 'workout ache', 'workout improves']
n_gram3_train_df=pd.concat([X_train[['lemmatizedContraindications','lemmatizedBenefits']].reset_index(drop=True),pd.DataFrame(n_gram3_train.toarray())],axis=1)

n_gram3_test_df=pd.concat([X_test[['lemmatizedContraindications','lemmatizedBenefits']].reset_index(drop=True),pd.DataFrame(n_gram3_test.toarray())],axis=1)
n_gram3_train_df.head()
##                          lemmatizedContraindications  ... 296
## 0  dehydration local site wound sore sensitive sk...  ...   1
## 1  dehydration high risk pregnancy history painfu...  ...   0
## 2  dehydration fever rash infection mental disord...  ...   0
## 3  dehydration fever rash infection mental disord...  ...   0
## 4  dehydration fever rash infection mental disord...  ...   0
## 
## [5 rows x 299 columns]
n_gram2_train_df=pd.concat([X_train[['lemmatizedContraindications','lemmatizedBenefits']].reset_index(drop=True),pd.DataFrame(n_gram2_train.toarray())],axis=1)

n_gram2_test_df=pd.concat([X_test[['lemmatizedContraindications','lemmatizedBenefits']].reset_index(drop=True),pd.DataFrame(n_gram2_test.toarray())],axis=1)
n_gram2_train_df.head()
##                          lemmatizedContraindications  ... 244
## 0  dehydration local site wound sore sensitive sk...  ...   0
## 1  dehydration high risk pregnancy history painfu...  ...   0
## 2  dehydration fever rash infection mental disord...  ...   0
## 3  dehydration fever rash infection mental disord...  ...   0
## 4  dehydration fever rash infection mental disord...  ...   0
## 
## [5 rows x 247 columns]

n_gram2_train2=pd.DataFrame(n_gram2_train.toarray())
n_gram3_train3=pd.DataFrame(n_gram3_train.toarray())
n_gram2_test2=pd.DataFrame(n_gram2_test.toarray())
n_gram3_test3=pd.DataFrame(n_gram3_test.toarray())

n_gram3_train3.columns=Ngram3
n_gram2_train2.columns=Ngram2
n_gram3_test3.columns=Ngram3
n_gram2_test2.columns=Ngram2


n_gram_2_3_train_df=pd.concat([X_train[['lemmatizedContraindications','lemmatizedBenefits']].reset_index(drop=True),n_gram2_train2,n_gram3_train3],axis=1)

n_gram_2_3_test_df=pd.concat([X_test[['lemmatizedContraindications','lemmatizedBenefits']].reset_index(drop=True),n_gram2_test2,n_gram3_test3],axis=1)
ngram23Train = pd.concat([n_gram2_train2,n_gram3_train3],axis=1)
ngram23Test = pd.concat([n_gram2_test2,n_gram3_test3],axis=1)
ngram23Train.head()
##    ache improve  ache relieve  ...  wound skin rash  wound sore sensitive
## 0             0             0  ...                0                     1
## 1             0             0  ...                0                     0
## 2             0             0  ...                0                     0
## 3             0             0  ...                0                     0
## 4             0             0  ...                0                     0
## 
## [5 rows x 542 columns]
n_gram_2_3_train_df.head()
##                          lemmatizedContraindications  ... wound sore sensitive
## 0  dehydration local site wound sore sensitive sk...  ...                    1
## 1  dehydration high risk pregnancy history painfu...  ...                    0
## 2  dehydration fever rash infection mental disord...  ...                    0
## 3  dehydration fever rash infection mental disord...  ...                    0
## 4  dehydration fever rash infection mental disord...  ...                    0
## 
## [5 rows x 544 columns]

Write this table of ngram tokens out to csv.

n_gram_2_3_test_df.to_csv('ngrams2_3_test.csv',index=False)
n_gram_2_3_train_df.to_csv('ngrams2_3_train.csv', index=False)
# header=False is passed explicitly so the label files stay header-free,
# which matches header=FALSE in the R read.csv calls that load them.
y_train.to_csv('y_train_ngrams23.csv', index=False, header=False)
y_test.to_csv('y_test_ngrams23.csv', index=False, header=False)

Let's read these large files into RStudio and combine the data into one table.

ngrams23train <- read.csv('ngrams2_3_train.csv', sep=',', header=TRUE, 
                          na.strings=c('',' ','NA'))
ngrams23test <- read.csv('ngrams2_3_test.csv', sep=',', header=TRUE,
                         na.strings=c('',' ','NA'))
ytrain <- read.csv('y_train_ngrams23.csv', sep=',', header=FALSE,
                   na.strings=c('',' ','NA'))
colnames(ytrain) <- 'modality'
ytest <- read.csv('y_test_ngrams23.csv', sep=',', header=FALSE,
                  na.strings=c('',' ','NA'))
colnames(ytest) <- 'modality'

train <- cbind(ytrain,ngrams23train)
test <- cbind(ytest,ngrams23test)

ngrams23All <- rbind(train,test)

write.csv(ngrams23All,'lemmNgramsBenefits2Contraindications3.csv', row.names=FALSE)

We now have the lemmatized n-gram tokens saved to CSV: bigrams (two adjacent words) for the benefits and trigrams (three adjacent words) for the contraindications. We can use this file later, as needed, to build the recommender system for a specific massage modality.
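To make the bigram and trigram idea concrete, here is a minimal sketch using only the Python standard library. The helper `word_ngrams` and the two example strings are hypothetical and are not part of the original script. It splits on whitespace only, so it does not reproduce every detail of `CountVectorizer`, whose default token pattern also drops single-character tokens.

```python
def word_ngrams(text, n):
    """Return the n-grams of adjacent words in a whitespace-separated string."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


# Hypothetical lemmatized strings, shaped like the training columns.
benefits = "improves circulation relieve muscle pain"
contraindications = "high blood pressure diabetes"

bigrams = word_ngrams(benefits, 2)             # pairs of adjacent words
trigrams = word_ngrams(contraindications, 3)   # triples of adjacent words

print(bigrams)
print(trigrams)
```

A text with fewer than `n` words yields an empty list, which matters later when short user inputs are vectorized.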

Let's get back to Python for machine learning, reusing our previous random forest and gradient boosted trees classifiers. We will use the combined tokens for the benefits and contraindications to see how well these trees classify our massage modalities.

from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(ngram23Train,y_train)
end=time.time()
fit_time=(end-start)
fit_time
## 0.906379222869873
start=time.time()
y_pred=rf_model.predict(ngram23Test)
end=time.time()
pred_time=(end-start)
pred_time
## 0.19825387001037598

prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                             Predicted                          modality
## 30   Biofreeze Muscle Pain Relief Gel  Biofreeze Muscle Pain Relief Gel
## 164               Massage Gun Therapy               Massage Gun Therapy
## 43                Reflexology Massage               Reflexology Massage
## 215              Craniosacral Massage              Craniosacral Massage
## 233               Reflexology Massage               Reflexology Massage
## ..                                ...                               ...
## 169    Cannabidiol (CBD) Massage Balm    Cannabidiol (CBD) Massage Balm
## 175        Lymphatic Drainage Massage        Lymphatic Drainage Massage
## 316               Massage Gun Therapy               Massage Gun Therapy
## 351               Deep tissue Massage               Deep tissue Massage
## 271               Reflexology Massage               Reflexology Massage
## 
## [69 rows x 2 columns]

Results Random Forest

from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 1.0
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                                                        precision    recall  f1-score   support
## 
##                                                          Aromatherapy       1.00      1.00      1.00         3
##                                      Biofreeze Muscle Pain Relief Gel       1.00      1.00      1.00         6
##                                        Cannabidiol (CBD) Massage Balm       1.00      1.00      1.00         5
##                                                    Cold Stone Therapy       1.00      1.00      1.00         5
##                                                  Craniosacral Massage       1.00      1.00      1.00         3
##                                                       Cupping Therapy       1.00      1.00      1.00         5
##                                                   Deep tissue Massage       1.00      1.00      1.00         5
##                                             Hot Stone Therapy Massage       1.00      1.00      1.00         5
## Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage       1.00      1.00      1.00         7
##                                            Lymphatic Drainage Massage       1.00      1.00      1.00         1
##                                                   Massage Gun Therapy       1.00      1.00      1.00         4
##                                                    Myofascial Massage       1.00      1.00      1.00         4
##                                                      Prenatal Massage       1.00      1.00      1.00         1
##                                                   Reflexology Massage       1.00      1.00      1.00         5
##                                                       Shiatsu Massage       1.00      1.00      1.00         1
##                                                        Sports Massage       1.00      1.00      1.00         3
##                                                       Swedish Massage       1.00      1.00      1.00         4
##                                                 Trigger Point Therapy       1.00      1.00      1.00         2
## 
##                                                              accuracy                           1.00        69
##                                                             macro avg       1.00      1.00      1.00        69
##                                                          weighted avg       1.00      1.00      1.00        69
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(ngram23Train,y_train)
end=time.time()
fit_time=(end-start)
fit_time
## 7.755925178527832
start=time.time()
y_pred=gb_model.predict(ngram23Test)
end=time.time()
pred_time=(end-start)
pred_time
## 0.03600645065307617
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                             Predicted                          modality
## 30   Biofreeze Muscle Pain Relief Gel  Biofreeze Muscle Pain Relief Gel
## 164               Massage Gun Therapy               Massage Gun Therapy
## 43                Reflexology Massage               Reflexology Massage
## 215              Craniosacral Massage              Craniosacral Massage
## 233               Reflexology Massage               Reflexology Massage
## ..                                ...                               ...
## 169    Cannabidiol (CBD) Massage Balm    Cannabidiol (CBD) Massage Balm
## 175        Lymphatic Drainage Massage        Lymphatic Drainage Massage
## 316               Massage Gun Therapy               Massage Gun Therapy
## 351               Deep tissue Massage               Deep tissue Massage
## 271               Reflexology Massage               Reflexology Massage
## 
## [69 rows x 2 columns]

Results Gradient Boosted Trees

from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 1.0
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0]
##  [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                                                        precision    recall  f1-score   support
## 
##                                                          Aromatherapy       1.00      1.00      1.00         3
##                                      Biofreeze Muscle Pain Relief Gel       1.00      1.00      1.00         6
##                                        Cannabidiol (CBD) Massage Balm       1.00      1.00      1.00         5
##                                                    Cold Stone Therapy       1.00      1.00      1.00         5
##                                                  Craniosacral Massage       1.00      1.00      1.00         3
##                                                       Cupping Therapy       1.00      1.00      1.00         5
##                                                   Deep tissue Massage       1.00      1.00      1.00         5
##                                             Hot Stone Therapy Massage       1.00      1.00      1.00         5
## Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage       1.00      1.00      1.00         7
##                                            Lymphatic Drainage Massage       1.00      1.00      1.00         1
##                                                   Massage Gun Therapy       1.00      1.00      1.00         4
##                                                    Myofascial Massage       1.00      1.00      1.00         4
##                                                      Prenatal Massage       1.00      1.00      1.00         1
##                                                   Reflexology Massage       1.00      1.00      1.00         5
##                                                       Shiatsu Massage       1.00      1.00      1.00         1
##                                                        Sports Massage       1.00      1.00      1.00         3
##                                                       Swedish Massage       1.00      1.00      1.00         4
##                                                 Trigger Point Therapy       1.00      1.00      1.00         2
## 
##                                                              accuracy                           1.00        69
##                                                             macro avg       1.00      1.00      1.00        69
##                                                          weighted avg       1.00      1.00      1.00        69

It is reassuring that both models reach 100%, as they should: every modality appears as 24 identical samples (one original plus 23 duplicates), so each test row is an exact copy of rows already seen in training. The score therefore confirms that the pipeline runs end to end rather than showing that the models generalize. I ran a previous script on the same data with 1- to 4-grams, and there every hot stone therapy observation was misclassified as deep tissue massage, for the benefits and for the contraindications alike.
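The effect of that earlier misclassification on recall and precision can be sketched with a small standard-library example. The counts below are a hypothetical toy test set, not the figures from the earlier run, and `precision_recall` is an illustrative helper rather than the scikit-learn function used above. A class that is never predicted gets precision 0.0 here by convention.

```python
from collections import Counter


def precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} computed from two label lists."""
    true_pos, predicted, actual = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        actual[t] += 1
        predicted[p] += 1
        if t == p:
            true_pos[t] += 1
    labels = set(actual) | set(predicted)
    return {
        label: (
            true_pos[label] / predicted[label] if predicted[label] else 0.0,
            true_pos[label] / actual[label] if actual[label] else 0.0,
        )
        for label in labels
    }


# Hypothetical toy test set: 5 deep tissue rows and 5 hot stone rows.
y_true = ["Deep tissue Massage"] * 5 + ["Hot Stone Therapy Massage"] * 5
# Every hot stone row is predicted as deep tissue.
y_pred = ["Deep tissue Massage"] * 10

scores = precision_recall(y_true, y_pred)
for label in sorted(scores):
    p, r = scores[label]
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Deep tissue keeps perfect recall because none of its own rows are missed, but its precision drops because it absorbs the hot stone rows, while hot stone scores zero on both.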

Let's try user inputs against this data after wrapping the steps above into a function for each model.
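The two prediction functions that follow differ only in the classifier they build, and each one refits the vectorizers and the model on every call. One possible refactoring, sketched here under assumptions, is to fit once and return a reusable prediction function. Every name in this sketch is hypothetical: `MajorityClassifier` is a stand-in for any estimator with `fit` and `predict` methods, and `count_words` stands in for the n-gram vectorizing step.

```python
from collections import Counter


class MajorityClassifier:
    """Stand-in estimator with a fit/predict interface.

    It always predicts the most common training label; it is only here
    so the sketch runs without third-party packages.
    """

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def make_recommender(model, featurize, train_texts, train_labels):
    """Fit `model` once and return a function that predicts for new text."""
    fitted = model.fit([featurize(t) for t in train_texts], train_labels)

    def recommend(text):
        return fitted.predict([featurize(text)])[0]

    return recommend


def count_words(text):
    """Hypothetical featurizer: one feature, the number of words."""
    return [len(text.split())]


recommend = make_recommender(
    MajorityClassifier(),
    count_words,
    ["relieve muscle pain", "improves sleep", "improves circulation"],
    ["Swedish Massage", "Aromatherapy", "Swedish Massage"],
)
print(recommend("I need a massage!"))
```

With this shape, the random forest and the gradient boosted trees could share one function and differ only in the `model` argument.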


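def lemmatize(text):
    # Drop punctuation character by character and lowercase what remains.
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Split on runs of non-word characters (raw string avoids an invalid-escape warning).
    tokens=re.split(r'\W+', text)
    # Lemmatize each token and drop stop words.
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text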
    
def predict_ngramRFC_lemma(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemma']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    n_gram2_vect=CountVectorizer(ngram_range=(2,2))
    n_gram3_vect=CountVectorizer(ngram_range=(3,3))

    n_gram2_vect_fit=n_gram2_vect.fit(X_train['lemmatizedBenefits'])
    n_gram3_vect_fit=n_gram3_vect.fit(X_train['lemmatizedContraindications'])

    n_gram2_train=n_gram2_vect_fit.transform(X_train['lemmatizedBenefits'])
    n_gram3_train=n_gram3_vect_fit.transform(X_train['lemmatizedContraindications'])

    Ngram2 = n_gram2_vect_fit.get_feature_names()
    Ngram3 = n_gram3_vect_fit.get_feature_names()

    n_gram2_train2=pd.DataFrame(n_gram2_train.toarray())
    n_gram3_train3=pd.DataFrame(n_gram3_train.toarray())

    n_gram2_train2.columns=Ngram2
    n_gram3_train3.columns=Ngram3

    ngram23Train = pd.concat([n_gram2_train2,n_gram3_train3],axis=1)

    nr_gram2_test=n_gram2_vect_fit.transform(nr['lemma'])
    nr_gram3_test=n_gram3_vect_fit.transform(nr['lemma'])
   
    nr_test2=pd.DataFrame(nr_gram2_test.toarray())
    nr_test3=pd.DataFrame(nr_gram3_test.toarray())
    
    nr_test2.columns=Ngram2
    nr_test3.columns=Ngram3

    nrTest = pd.concat([nr_test2,nr_test3],axis=1)
    
    model = rf.fit(ngram23Train,y_train)
    pred=pd.DataFrame(model.predict(nrTest))
    pred.columns=['Recommended Healthcare Service:']
    pred.index= ['lemmatized_2ngram3_RFC_85-15:']
    print('\n\n',pred)
np.random.seed(12345)
predict_ngramRFC_lemma('I need a massage!') 
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:      Lymphatic Drainage Massage
np.random.seed(12345)

predict_ngramRFC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:      Lymphatic Drainage Massage

def lemmatize(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text
    

def predict_ngramGBC_lemma(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemma']=nr['newReview'].apply(lambda x: lemmatize(x))

    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    n_gram2_vect=CountVectorizer(ngram_range=(2,2))
    n_gram3_vect=CountVectorizer(ngram_range=(3,3))

    n_gram2_vect_fit=n_gram2_vect.fit(X_train['lemmatizedBenefits'])
    n_gram3_vect_fit=n_gram3_vect.fit(X_train['lemmatizedContraindications'])

    n_gram2_train=n_gram2_vect_fit.transform(X_train['lemmatizedBenefits'])
    n_gram3_train=n_gram3_vect_fit.transform(X_train['lemmatizedContraindications'])

    Ngram2 = n_gram2_vect_fit.get_feature_names()
    Ngram3 = n_gram3_vect_fit.get_feature_names()

    n_gram2_train2=pd.DataFrame(n_gram2_train.toarray())
    n_gram3_train3=pd.DataFrame(n_gram3_train.toarray())

    n_gram2_train2.columns=Ngram2
    n_gram3_train3.columns=Ngram3

    ngram23Train = pd.concat([n_gram2_train2,n_gram3_train3],axis=1)

    nr_gram2_test=n_gram2_vect_fit.transform(nr['lemma'])
    nr_gram3_test=n_gram3_vect_fit.transform(nr['lemma'])
   
    nr_test2=pd.DataFrame(nr_gram2_test.toarray())
    nr_test3=pd.DataFrame(nr_gram3_test.toarray())
    
    nr_test2.columns=Ngram2
    nr_test3.columns=Ngram3

    nrTest = pd.concat([nr_test2,nr_test3],axis=1)

    model = gb.fit(ngram23Train,y_train)
    pred=pd.DataFrame(model.predict(nrTest))
    pred.columns=['Recommended Healthcare Service:']
    pred.index= ['lemmatized_2ngram3_GBC_85-15:']
    print('\n\n',pred)
np.random.seed(12345)

predict_ngramGBC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
np.random.seed(12345)

predict_ngramGBC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching

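It is interesting to compare the recommendations: the random forest returns Lymphatic Drainage Massage for both inputs, while the gradient boosted trees return Stretching for both. Many of the contraindications and benefits are shared between modalities, and a likely contributing factor is that neither input appears to contain any bigram or trigram from the training vocabulary. In that case each model scores an all-zero feature row and falls back to the same class no matter what the user typed. The results above were produced with the seed set. Next I remove the seed, so the starting point is randomized by the system, and we can see how this knits.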
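One way to check whether a user input shares anything with the training vocabulary is to intersect its n-grams with the feature names. The sketch below uses only the standard library. The helper names are hypothetical, the five bigrams are copied from the printed `Ngram2` list (which has 245 entries in total), and the lemmatized form of the first input is an assumption about what `lemmatize` returns after stop-word removal.

```python
def word_ngrams(text, n):
    """Return the set of n-grams of adjacent words in a string."""
    words = text.split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def vocabulary_hits(lemmatized_text, vocabulary, n):
    """Return, sorted, the input n-grams that also occur in the vocabulary."""
    return sorted(word_ngrams(lemmatized_text, n) & set(vocabulary))


# A few bigrams copied from the printed Ngram2 list.
sample_bigrams = ["sore muscle", "post workout", "tight muscle",
                  "range motion", "back pain"]

# Assumed output of lemmatize('I need a massage!').
no_hits = vocabulary_hits("need massage", sample_bigrams, 2)

# A hypothetical input phrased with vocabulary terms.
some_hits = vocabulary_hits("sore muscle post workout", sample_bigrams, 2)

print(no_hits)
print(some_hits)
```

An empty result means the vectorized row is all zeros for that vocabulary, so the prediction carries no information about the request. Running the same check against the full `Ngram2` and `Ngram3` lists would show whether that is what happens with the two inputs above.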

Gradient Boosted Trees user input generated results

predict_ngramGBC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching
predict_ngramGBC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching

Random Forest Trees user input generated results

predict_ngramRFC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:      Lymphatic Drainage Massage
predict_ngramRFC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:      Lymphatic Drainage Massage
predict_ngramRFC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:      Lymphatic Drainage Massage
predict_ngramRFC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:  Cannabidiol (CBD) Massage Balm
predict_ngramRFC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:                    Aromatherapy
predict_ngramRFC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:      Lymphatic Drainage Massage
predict_ngramRFC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:  Cannabidiol (CBD) Massage Balm
predict_ngramRFC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:  Cannabidiol (CBD) Massage Balm
predict_ngramRFC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:      Lymphatic Drainage Massage
predict_ngramRFC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:                    Aromatherapy
predict_ngramRFC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:  Cannabidiol (CBD) Massage Balm
predict_ngramRFC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_RFC_85-15:      Lymphatic Drainage Massage

Wonderful! But now let’s try to get some of the modalities other than CBD, Biofreeze, aromatherapy, stretching, and lymphatic drainage massage. Those are more like add-on therapeutics for massage therapy.

predict_ngramGBC_lemma('I want either a Swedish or Deep Tissue massage, I want a lot of pressure, and need to fall asleep, I workout, have stress at work, alright with some hot stones or cold stones, or added cups.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_2ngram3_GBC_85-15:                      Stretching

Our next model will use n-grams on the modality description to better capture the tokenized words for each modality, and keep the bigrams on benefits and trigrams on contraindications. Also, there should be a filter so that contraindicated massage modalities are excluded before the next run, which uses the benefits or expectations of a user. With the current design, the choices select on both the benefits and the contraindicated massages. Using both did make the prediction on the testing set 100% accurate, as it should, because using only the tokenized benefits or only the tokenized contraindications produced a precision error in a previous script done in Jupyter Notebook for Python, where hot stone therapy was classified as deep tissue massage.
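A minimal sketch of the filter-then-rank idea described above, written in R like the rest of this write-up. The function name recommend_modalities, the toy benefit lists, and the placeholder names 'Modality A' to 'Modality C' are assumptions for illustration. A simple overlap share stands in for the model probabilities here, so this is not the classifier used in this script.

```r
# Hypothetical two-stage recommender: exclude first, then rank by benefit overlap.
recommend_modalities <- function(user_bigrams, benefit_lists,
                                 excluded = character(0), top_n = 3) {
  # Stage 1: drop every modality flagged as contraindicated.
  candidates <- benefit_lists[!(names(benefit_lists) %in% excluded)]
  # Stage 2: score each remaining modality by the share of its benefit
  # bigrams that the user mentioned (a stand-in for model probabilities).
  score <- vapply(candidates, function(b) mean(b %in% user_bigrams), numeric(1))
  head(sort(score, decreasing = TRUE), top_n)
}

# Toy benefit lists with placeholder modality names (not the real data).
benefit_toy <- list(
  'Modality A' = c('chronic pain', 'nerve pain'),
  'Modality B' = c('muscle spasm', 'trigger point', 'chronic pain', 'pain trigger'),
  'Modality C' = c('muscle spasm', 'trigger point')
)

top <- recommend_modalities(
  user_bigrams  = c('muscle spasm', 'trigger point', 'chronic pain'),
  benefit_lists = benefit_toy,
  excluded      = 'Modality A'
)
top
```

Keeping the exclusion step separate from the scoring step means a contraindicated modality can never be recommended, however well its benefits match.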


Massage Recommender R

Let’s switch to R and read in the lemmatized tokens of the benefits and contraindications.

ngrams23All <- read.csv('lemmNgramsBenefits2Contraindications3.csv', sep=',', header=TRUE,
                        na.strings=c('',' ','NA'))
unique(ngrams23All$modality)
##  [1] Myofascial Massage                                                   
##  [2] Prenatal Massage                                                     
##  [3] Shiatsu Massage                                                      
##  [4] Hot Stone Therapy Massage                                            
##  [5] Cupping Therapy                                                      
##  [6] Sports Massage                                                       
##  [7] Biofreeze Muscle Pain Relief Gel                                     
##  [8] Cold Stone Therapy                                                   
##  [9] Stretching                                                           
## [10] Aromatherapy                                                         
## [11] Swedish Massage                                                      
## [12] Deep tissue Massage                                                  
## [13] Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage
## [14] Trigger Point Therapy                                                
## [15] Massage Gun Therapy                                                  
## [16] Lymphatic Drainage Massage                                           
## [17] Reflexology Massage                                                  
## [18] Craniosacral Massage                                                 
## [19] Cannabidiol (CBD) Massage Balm                                       
## 19 Levels: Aromatherapy ... Trigger Point Therapy
myofascial <- subset(ngrams23All, ngrams23All$modality=='Myofascial Massage')
dim(myofascial)
## [1]  24 545
myofascial1 <- myofascial[,-c(1:3)]
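# subset() filters rows, so this column mask is applied to the rows by mistake
# (hence the 54 rows reported below); the column filter after dim() overwrites it.
myofascial2 <- subset(myofascial1, colSums(myofascial1)!=0)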
dim(myofascial2)
## [1]  54 542
myofascial2 <- myofascial1[,colSums(myofascial1) >= 1]
colSums(myofascial2)
##                muscle.spasm               muscle.trauma 
##                          24                          24 
##                pain.trigger                  spasm.pain 
##                          24                          24 
##               trauma.muscle               trigger.point 
##                          24                          24 
##     aneurism.history.cancer            blood.clot.blood 
##                          24                          24 
##   blood.embolism.thrombosis      blood.pressure.problem 
##                          24                          24 
##     breathing.heart.disease  breathing.ragged.breathing 
##                          24                          24 
## breathing.trouble.breathing             burn.rash.fever 
##                          24                          24 
##   cancer.psychosis.numbness   circulatory.problem.blood 
##                          24                          24 
##         clot.blood.embolism               cut.burn.rash 
##                          24                          24 
##      dehydration.local.site difficulty.breathing.ragged 
##                          24                          24 
## disease.circulatory.problem    disorder.nausea.epilepsy 
##                          24                          24 
##  epilepsy.pregnant.aneurism        fever.rash.infection 
##                          24                          24 
##   heart.disease.circulatory    history.cancer.psychosis 
##                          24                          24 
##   infection.mental.disorder          limb.limb.numbness 
##                          24                          24 
##         limb.numbness.nerve            local.site.wound 
##                          24                          24 
##      mental.disorder.nausea    nausea.epilepsy.pregnant 
##                          24                          24 
##       nerve.pain.difficulty        nerve.tingling.nerve 
##                          24                          24 
##          numbness.limb.limb     numbness.nerve.tingling 
##                          24                          24 
##   pain.difficulty.breathing   pregnant.aneurism.history 
##                          24                          24 
##      pressure.problem.blood          problem.blood.clot 
##                          24                          24 
##      problem.blood.pressure     psychosis.numbness.limb 
##                          24                          24 
##    ragged.breathing.trouble             rash.fever.rash 
##                          24                          24 
##       rash.infection.mental         sensitive.skin.thin 
##                          24                          24 
##             site.wound.sore               skin.cut.burn 
##                          24                          24 
##              skin.thin.skin         sore.sensitive.skin 
##                          24                          24 
##               thin.skin.cut         tingling.nerve.pain 
##                          24                          24 
##     trouble.breathing.heart        wound.sore.sensitive 
##                          24                          24
contraMyo <- grep('[.].*[.].*',colnames(myofascial2))
myoContra <- myofascial2[,contraMyo]
myoBenefit <- myofascial2[,-contraMyo]

We now have the benefits as bigrams and the contraindications as trigrams. The trigrams give the list of myofascial therapy contraindications used to exclude this modality, and the bigrams give the list of myofascial benefits used to recommend it. We just have to turn these column names into lists for benefits and contraindications.

benefits_myofascial <- gsub('[.]',' ', colnames(myoBenefit), perl=TRUE)
contra_myofascial <- gsub('[.]',' ', colnames(myoContra), perl=TRUE)

We will make the same lists of benefits and contraindications for the other 18 massage modalities, to use in building our recommender system for each user.
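The myofascial steps above repeat unchanged for every modality, so they could be wrapped in one function. The sketch below assumes, as the code above does, that the first three columns are not n-gram counts and that read.csv has turned the spaces in n-gram names into dots. The function name split_modality_ngrams and the toy data frame are hypothetical.

```r
# Hypothetical helper: split one modality's n-gram columns into
# benefit bigrams and contraindication trigrams.
split_modality_ngrams <- function(ngrams, modality, meta_cols = 1:3) {
  rows <- ngrams[ngrams$modality == modality, -meta_cols, drop = FALSE]
  # Keep only the n-gram columns that occur at least once for this modality.
  used <- rows[, colSums(rows) >= 1, drop = FALSE]
  # Column names with two or more dots are trigrams (contraindications).
  is_contra <- grepl('[.].*[.]', colnames(used))
  list(
    benefits = gsub('[.]', ' ', colnames(used)[!is_contra]),
    contra   = gsub('[.]', ' ', colnames(used)[is_contra])
  )
}

# Toy stand-in for ngrams23All; the real file has many more columns.
toy <- data.frame(
  id = 1:4,
  modality = c('Swedish Massage', 'Swedish Massage',
               'Cupping Therapy', 'Cupping Therapy'),
  text = 'placeholder',
  muscle.spasm = c(1, 1, 0, 0),
  blood.flow = c(0, 0, 1, 1),
  fever.rash.infection = c(1, 1, 1, 1),
  thin.skin.cut = c(0, 0, 1, 1),
  stringsAsFactors = FALSE
)

swedish_lists <- split_modality_ngrams(toy, 'Swedish Massage')
swedish_lists$benefits
swedish_lists$contra
```

Selecting with a logical mask instead of a negative grep() index also avoids an R pitfall: if no column matched, x[, -integer(0)] would return zero columns rather than all of them.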

Prenatal Massage

prenatal <- subset(ngrams23All, ngrams23All$modality=='Prenatal Massage')

prenatal1 <- prenatal[,-c(1:3)]
prenatal2 <- prenatal1[,colSums(prenatal1) >= 1]

contraPre <- grep('[.].*[.].*',colnames(prenatal2))
PreContra <- prenatal2[,contraPre]
PreBenefit <- prenatal2[,-contraPre]

benefits_prenatal <- gsub('[.]',' ', colnames(PreBenefit), perl=TRUE)
contra_prenatal <- gsub('[.]',' ', colnames(PreContra), perl=TRUE)

Shiatsu Massage

shiatsu <- subset(ngrams23All, ngrams23All$modality=='Shiatsu Massage')

shiatsu1 <- shiatsu[,-c(1:3)]
shiatsu2 <- shiatsu1[,colSums(shiatsu1) >= 1]

contrashi <- grep('[.].*[.].*',colnames(shiatsu2))
shiContra <- shiatsu2[,contrashi]
shiBenefit <- shiatsu2[,-contrashi]

benefits_shiatsu <- gsub('[.]',' ', colnames(shiBenefit), perl=TRUE)
contra_shiatsu <- gsub('[.]',' ', colnames(shiContra), perl=TRUE)

Hot Stone Therapy Massage

hotStone <- subset(ngrams23All, ngrams23All$modality=='Hot Stone Therapy Massage')

hotStone1 <- hotStone[,-c(1:3)]
hotStone2 <- hotStone1[,colSums(hotStone1) >= 1]

contrahs <- grep('[.].*[.].*',colnames(hotStone2))
hsContra <- hotStone2[,contrahs]
hsBenefit <- hotStone2[,-contrahs]

benefits_hs <- gsub('[.]',' ', colnames(hsBenefit), perl=TRUE)
contra_hs <- gsub('[.]',' ', colnames(hsContra), perl=TRUE)

Cupping Therapy

Cupping <- subset(ngrams23All, ngrams23All$modality=='Cupping Therapy')

Cupping1 <- Cupping[,-c(1:3)]
Cupping2 <- Cupping1[,colSums(Cupping1) >= 1]

contracup <- grep('[.].*[.].*',colnames(Cupping2))
cupContra <- Cupping2[,contracup]
cupBenefit <- Cupping2[,-contracup]

benefits_cup <- gsub('[.]',' ', colnames(cupBenefit), perl=TRUE)
contra_cup <- gsub('[.]',' ', colnames(cupContra), perl=TRUE)

Sports Massage

Sports <- subset(ngrams23All, ngrams23All$modality=='Sports Massage')

Sports1 <- Sports[,-c(1:3)]
Sports2 <- Sports1[,colSums(Sports1) >= 1]

contrasports <- grep('[.].*[.].*',colnames(Sports2))
sportsContra <- Sports2[,contrasports]
sportsBenefit <- Sports2[,-contrasports]

benefits_sports <- gsub('[.]',' ', colnames(sportsBenefit), perl=TRUE)
contra_sports <- gsub('[.]',' ', colnames(sportsContra), perl=TRUE)

Biofreeze Muscle Pain Relief Gel

Freeze <- subset(ngrams23All, ngrams23All$modality=='Biofreeze Muscle Pain Relief Gel')

Freeze1 <- Freeze[,-c(1:3)]
Freeze2 <- Freeze1[,colSums(Freeze1) >= 1]

contrafreeze <- grep('[.].*[.].*',colnames(Freeze2))
freezeContra <- Freeze2[,contrafreeze]
freezeBenefit <- Freeze2[,-contrafreeze]

benefits_freeze <- gsub('[.]',' ', colnames(freezeBenefit), perl=TRUE)
contra_freeze <- gsub('[.]',' ', colnames(freezeContra), perl=TRUE)

Cold Stone Therapy

ColdStone <- subset(ngrams23All, ngrams23All$modality=='Cold Stone Therapy')

ColdStone1 <- ColdStone[,-c(1:3)]
ColdStone2 <- ColdStone1[,colSums(ColdStone1) >= 1]

contracold <- grep('[.].*[.].*',colnames(ColdStone2))
coldContra <- ColdStone2[,contracold]
coldBenefit <- ColdStone2[,-contracold]

benefits_cold <- gsub('[.]',' ', colnames(coldBenefit), perl=TRUE)
contra_cold <- gsub('[.]',' ', colnames(coldContra), perl=TRUE)

Stretching

Stretch <- subset(ngrams23All, ngrams23All$modality=='Stretching')

Stretch1 <- Stretch[,-c(1:3)]
Stretch2 <- Stretch1[,colSums(Stretch1) >= 1]

contrastretch <- grep('[.].*[.].*',colnames(Stretch2))
stretchContra <- Stretch2[,contrastretch]
stretchBenefit <- Stretch2[,-contrastretch]

benefits_stretch <- gsub('[.]',' ', colnames(stretchBenefit), perl=TRUE)
contra_stretch <- gsub('[.]',' ', colnames(stretchContra), perl=TRUE)

Aromatherapy

AromaTherapy <- subset(ngrams23All, ngrams23All$modality=='Aromatherapy')

AromaTherapy1 <- AromaTherapy[,-c(1:3)]
AromaTherapy2 <- AromaTherapy1[,colSums(AromaTherapy1) >= 1]

contraaroma <- grep('[.].*[.].*',colnames(AromaTherapy2))
aromaContra <- AromaTherapy2[,contraaroma]
aromaBenefit <- AromaTherapy2[,-contraaroma]

benefits_aroma <- gsub('[.]',' ', colnames(aromaBenefit), perl=TRUE)
contra_aroma <- gsub('[.]',' ', colnames(aromaContra), perl=TRUE)

Swedish Massage

Swedish <- subset(ngrams23All, ngrams23All$modality=='Swedish Massage')

Swedish1 <- Swedish[,-c(1:3)]
Swedish2 <- Swedish1[,colSums(Swedish1) >= 1]

contraswedish <- grep('[.].*[.].*',colnames(Swedish2))
swedishContra <- Swedish2[,contraswedish]
swedishBenefit <- Swedish2[,-contraswedish]

benefits_swedish <- gsub('[.]',' ', colnames(swedishBenefit), perl=TRUE)
contra_swedish <- gsub('[.]',' ', colnames(swedishContra), perl=TRUE)

Deep tissue Massage

DTmassage <- subset(ngrams23All, ngrams23All$modality=='Deep tissue Massage')

DTmassage1 <- DTmassage[,-c(1:3)]
DTmassage2 <- DTmassage1[,colSums(DTmassage1) >= 1]

contraDT <- grep('[.].*[.].*',colnames(DTmassage2))
DTContra <- DTmassage2[,contraDT]
DTBenefit <- DTmassage2[,-contraDT]

benefits_DT <- gsub('[.]',' ', colnames(DTBenefit), perl=TRUE)
contra_DT <- gsub('[.]',' ', colnames(DTContra), perl=TRUE)

Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage

IASTM <- subset(ngrams23All, ngrams23All$modality=='Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage')

IASTM1 <- IASTM[,-c(1:3)]
IASTM2 <- IASTM1[,colSums(IASTM1) >= 1]

contrainstrument <- grep('[.].*[.].*',colnames(IASTM2))
instrumentContra <- IASTM2[,contrainstrument]
instrumentBenefit <- IASTM2[,-contrainstrument]

benefits_instrument <- gsub('[.]',' ', colnames(instrumentBenefit), perl=TRUE)
contra_instrument <- gsub('[.]',' ', colnames(instrumentContra), perl=TRUE)

Trigger Point Therapy

TPT <- subset(ngrams23All, ngrams23All$modality=='Trigger Point Therapy')

TPT1 <- TPT[,-c(1:3)]
TPT2 <- TPT1[,colSums(TPT1) >= 1]

contratpt <- grep('[.].*[.].*',colnames(TPT2))
tptContra <- TPT2[,contratpt]
tptBenefit <- TPT2[,-contratpt]

benefits_tpt <- gsub('[.]',' ', colnames(tptBenefit), perl=TRUE)
contra_tpt <- gsub('[.]',' ', colnames(tptContra), perl=TRUE)

Massage Gun Therapy

massageGun <- subset(ngrams23All, ngrams23All$modality=='Massage Gun Therapy')

massageGun1 <- massageGun[,-c(1:3)]
massageGun2 <- massageGun1[,colSums(massageGun1) >= 1]

contramassagegun <- grep('[.].*[.].*',colnames(massageGun2))
massagegunContra <- massageGun2[,contramassagegun]
massagegunBenefit <- massageGun2[,-contramassagegun]

benefits_massagegun <- gsub('[.]',' ', colnames(massagegunBenefit), perl=TRUE)
contra_massagegun <- gsub('[.]',' ', colnames(massagegunContra), perl=TRUE)

Lymphatic Drainage Massage

Lymphatic <- subset(ngrams23All, ngrams23All$modality=='Lymphatic Drainage Massage')

Lymphatic1 <- Lymphatic[,-c(1:3)]
Lymphatic2 <- Lymphatic1[,colSums(Lymphatic1) >= 1]

contralymphatic <- grep('[.].*[.].*',colnames(Lymphatic2))
lymphaticContra <- Lymphatic2[,contralymphatic]
lymphaticBenefit <- Lymphatic2[,-contralymphatic]

benefits_lymphatic <- gsub('[.]',' ', colnames(lymphaticBenefit), perl=TRUE)
contra_lymphatic <- gsub('[.]',' ', colnames(lymphaticContra), perl=TRUE)

Reflexology Massage

Reflexology <- subset(ngrams23All, ngrams23All$modality=='Reflexology Massage')

Reflexology1 <- Reflexology[,-c(1:3)]
Reflexology2 <- Reflexology1[,colSums(Reflexology1) >= 1]

contrareflexology <- grep('[.].*[.].*',colnames(Reflexology2))
reflexologyContra <- Reflexology2[,contrareflexology]
reflexologyBenefit <- Reflexology2[,-contrareflexology]

benefits_reflexology <- gsub('[.]',' ', colnames(reflexologyBenefit), perl=TRUE)
contra_reflexology <- gsub('[.]',' ', colnames(reflexologyContra), perl=TRUE)

Craniosacral Massage

Craniosacral <- subset(ngrams23All, ngrams23All$modality=='Craniosacral Massage')

Craniosacral1 <- Craniosacral[,-c(1:3)]
Craniosacral2 <- Craniosacral1[,colSums(Craniosacral1) >= 1]

contracraniosacral <- grep('[.].*[.].*',colnames(Craniosacral2))
craniosacralContra <- Craniosacral2[,contracraniosacral]
craniosacralBenefit <- Craniosacral2[,-contracraniosacral]

benefits_craniosacral <- gsub('[.]',' ', colnames(craniosacralBenefit), perl=TRUE)
contra_craniosacral <- gsub('[.]',' ', colnames(craniosacralContra), perl=TRUE)

Cannabidiol (CBD) Massage Balm

CBD <- subset(ngrams23All, ngrams23All$modality=='Cannabidiol (CBD) Massage Balm')

CBD1 <- CBD[,-c(1:3)]
CBD2 <- CBD1[,colSums(CBD1) >= 1]

contracbd <- grep('[.].*[.].*',colnames(CBD2))
cbdContra <- CBD2[,contracbd]
cbdBenefit <- CBD2[,-contracbd]

benefits_cbd <- gsub('[.]',' ', colnames(cbdBenefit), perl=TRUE)
contra_cbd <- gsub('[.]',' ', colnames(cbdContra), perl=TRUE)

Benefits of each modality as lists: benefits_cbd,benefits_craniosacral,benefits_reflexology,benefits_lymphatic,benefits_massagegun,benefits_tpt,benefits_instrument,benefits_DT,benefits_swedish,benefits_aroma,benefits_stretch,benefits_cold,benefits_freeze,benefits_sports,benefits_cup,benefits_hs,benefits_shiatsu,benefits_prenatal,benefits_myofascial

Contraindications of each modality as lists: contra_cbd,contra_craniosacral,contra_reflexology,contra_lymphatic,contra_massagegun,contra_tpt,contra_instrument,contra_DT,contra_swedish,contra_aroma,contra_stretch,contra_cold,contra_freeze, contra_sports,contra_cup,contra_hs,contra_shiatsu,contra_prenatal,contra_myofascial

benefits_cbd
## [1] "arthritis stress" "associated nerve" "chronic pain"     "help chronic"    
## [5] "nerve pain"       "pain arthritis"   "pain associated"
contra_cbd
##  [1] "aneurism history cancer"      "autoimmune disease psychosis"
##  [3] "breathing medication could"   "breathing ragged breathing"  
##  [5] "breathing trouble breathing"  "cancer autoimmune disease"   
##  [7] "cannabidiol allergy cbd"      "could interact cannabidiol"  
##  [9] "dehydration fever rash"       "difficulty breathing ragged" 
## [11] "disease psychosis numbness"   "disorder nausea epilepsy"    
## [13] "epilepsy pregnant aneurism"   "fever rash infection"        
## [15] "history cancer autoimmune"    "infection mental disorder"   
## [17] "interact cannabidiol allergy" "limb limb numbness"          
## [19] "limb numbness nerve"          "medication could interact"   
## [21] "mental disorder nausea"       "nausea epilepsy pregnant"    
## [23] "nerve pain difficulty"        "nerve tingling nerve"        
## [25] "numbness limb limb"           "numbness nerve tingling"     
## [27] "pain difficulty breathing"    "pregnant aneurism history"   
## [29] "psychosis numbness limb"      "ragged breathing trouble"    
## [31] "rash infection mental"        "tingling nerve pain"         
## [33] "trouble breathing medication"

The above shows the benefits of CBD as a list of bigrams (two-word pairs) with the stop words stripped; the second, longer list holds the trigrams (three-word groups) for the CBD contraindications, also with the stop words stripped. What we now want is a way to take a user input and scan it against the list of contraindications for each massage modality; if the input matches an entry in a modality’s list, that modality is excluded from the massage modalities available to the user.
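One way to sketch that scan, assuming the contra_* vectors are gathered into a named list with one entry per modality. The function name available_modalities, the placeholder modality names, and the whole-word matching rule are assumptions for illustration, not this script's final design. The toy trigrams are borrowed from the lists printed above.

```r
# Hypothetical helper: flag every modality whose contraindication trigrams
# contain one of the user's reported conditions as a whole word or word pair.
available_modalities <- function(user_terms, contra_lists) {
  user_terms <- tolower(trimws(user_terms))
  is_excluded <- vapply(contra_lists, function(contra) {
    # Pad with spaces so 'rash' cannot match inside a longer word.
    padded <- paste0(' ', contra, ' ')
    any(vapply(user_terms, function(term) {
      any(grepl(paste0(' ', term, ' '), padded, fixed = TRUE))
    }, logical(1)))
  }, logical(1))
  list(
    excluded  = names(contra_lists)[is_excluded],
    available = names(contra_lists)[!is_excluded]
  )
}

# Toy contraindication lists under placeholder modality names.
contra_toy <- list(
  'Modality A' = c('cannabidiol allergy cbd', 'fever rash infection'),
  'Modality B' = c('blood clot blood', 'skin cut burn'),
  'Modality C' = c('thin skin cut')
)

result <- available_modalities(c('Blood clot', 'fever'), contra_toy)
result$excluded
result$available
```

Matching on fixed strings rather than regular expressions keeps user text such as 'a.b' or '(cbd)' from being read as a pattern.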

Looking at the list above, the contraindication trigrams group together different health conditions, like psychosis (and) numbness (in the) limb. Users don’t want to scan a list of 500 health conditions, or even more than 10, but any health conditions they have must be reported before scheduling a massage so that it isn’t cancelled or booked for the wrong modality. Let’s assume the user honestly includes every health condition and their history of serious medical conditions. We then want this program to scan those groups of words and find the modalities the user absolutely should not have, so that a list of available massage modalities is provided for the user to select the best one for benefits.

Some users might not spell words the same way or use the same words to describe the same sort of health condition. For example, I caught my own misspelling of psychosis as ‘pyschosis’ and fixed it. Also, a mental disorder or psychosis isn’t a health condition people would put down for themselves; it would be reported by someone scheduling the massage for the person, like the child of a dementia patient, or a parent, for similar mental disorders. There are quite a few of those, and no massage therapist wants to put their profession or livelihood on the line to massage someone with a mental disorder if the client will have a breakdown of some sort in session, like a relapse or yelling, or similar actions that people aren’t capable of dealing with on a professional level. Massage therapists are not mental healthcare workers; massage does produce benefits that improve mental function, but therapists cannot handle a client having a breakdown or threatening their safety. So it is best to have these types of health conditions resolved or flagged, so that a bystander for the client can be nearby to handle the situation should it occur, or to wait until the client’s mental health has improved before scheduling a massage.

Let’s not compete with God here, lest he or she put us in a comatose state and be like, ‘oh you know my thoughts and any of my children’s thoughts, well see how you produce machine learning on thoughts now, but, etc.’ So we will not manually scan the words in the list and pick them out. Instead, let’s take the trigrams from our model built on contraindications, assume that we can select any bigram word pairs from them, and then add any modality that the user input pulls up to a list of modalities to exclude. Some will not be good bigrams, like ‘disease psychosis’, but others like ‘autoimmune disease’ are good bigrams for reported health conditions, because someone with autoimmune disease will not want a painful and debilitating flare-up if a massage modality activated those symptoms and the client came to realize that he or she did not report it beforehand. Other words, like pregnant, fever, and rash, would work best as unigrams. So we should create unigrams and bigrams from each list of contraindications.

We can use string splitting and list apply functions (strsplit and lapply) to extract these unigrams and bigrams from our trigrams of contraindications.

cbd_split <- strsplit(contra_cbd,split=' ')
cbd_split1 <- lapply(cbd_split, '[',1)
cbd_split1b <- as.character(cbd_split1)
cbd_split1b
##  [1] "aneurism"    "autoimmune"  "breathing"   "breathing"   "breathing"  
##  [6] "cancer"      "cannabidiol" "could"       "dehydration" "difficulty" 
## [11] "disease"     "disorder"    "epilepsy"    "fever"       "history"    
## [16] "infection"   "interact"    "limb"        "limb"        "medication" 
## [21] "mental"      "nausea"      "nerve"       "nerve"       "numbness"   
## [26] "numbness"    "pain"        "pregnant"    "psychosis"   "ragged"     
## [31] "rash"        "tingling"    "trouble"
cat('\n','\n')
cbd_split2 <- lapply(cbd_split, '[',2)
cbd_split2b <- as.character(cbd_split2)
cbd_split2b
##  [1] "history"     "disease"     "medication"  "ragged"      "trouble"    
##  [6] "autoimmune"  "allergy"     "interact"    "fever"       "breathing"  
## [11] "psychosis"   "nausea"      "pregnant"    "rash"        "cancer"     
## [16] "mental"      "cannabidiol" "limb"        "numbness"    "could"      
## [21] "disorder"    "epilepsy"    "pain"        "tingling"    "limb"       
## [26] "nerve"       "difficulty"  "aneurism"    "numbness"    "breathing"  
## [31] "infection"   "nerve"       "breathing"
cat('\n','\n')
cbd_split3 <- lapply(cbd_split, '[',3)
cbd_split3b <- as.character(cbd_split3)
cbd_split3b
##  [1] "cancer"      "psychosis"   "could"       "breathing"   "breathing"  
##  [6] "disease"     "cbd"         "cannabidiol" "rash"        "ragged"     
## [11] "numbness"    "epilepsy"    "aneurism"    "infection"   "autoimmune" 
## [16] "disorder"    "allergy"     "numbness"    "nerve"       "interact"   
## [21] "nausea"      "pregnant"    "difficulty"  "nerve"       "limb"       
## [26] "tingling"    "breathing"   "history"     "limb"        "trouble"    
## [31] "mental"      "pain"        "medication"
cat('\n','\n')

Above are the first, second, and third words of each CBD contraindication trigram, which together give our unigrams. Let’s get the bigrams.

cbd_bi1 <- paste(cbd_split1b,cbd_split2b)
cbd_bi1
##  [1] "aneurism history"     "autoimmune disease"   "breathing medication"
##  [4] "breathing ragged"     "breathing trouble"    "cancer autoimmune"   
##  [7] "cannabidiol allergy"  "could interact"       "dehydration fever"   
## [10] "difficulty breathing" "disease psychosis"    "disorder nausea"     
## [13] "epilepsy pregnant"    "fever rash"           "history cancer"      
## [16] "infection mental"     "interact cannabidiol" "limb limb"           
## [19] "limb numbness"        "medication could"     "mental disorder"     
## [22] "nausea epilepsy"      "nerve pain"           "nerve tingling"      
## [25] "numbness limb"        "numbness nerve"       "pain difficulty"     
## [28] "pregnant aneurism"    "psychosis numbness"   "ragged breathing"    
## [31] "rash infection"       "tingling nerve"       "trouble breathing"
cat('\n\n')
cbd_bi2 <- paste(cbd_split2b,cbd_split3b)
cbd_bi2
##  [1] "history cancer"       "disease psychosis"    "medication could"    
##  [4] "ragged breathing"     "trouble breathing"    "autoimmune disease"  
##  [7] "allergy cbd"          "interact cannabidiol" "fever rash"          
## [10] "breathing ragged"     "psychosis numbness"   "nausea epilepsy"     
## [13] "pregnant aneurism"    "rash infection"       "cancer autoimmune"   
## [16] "mental disorder"      "cannabidiol allergy"  "limb numbness"       
## [19] "numbness nerve"       "could interact"       "disorder nausea"     
## [22] "epilepsy pregnant"    "pain difficulty"      "tingling nerve"      
## [25] "limb limb"            "nerve tingling"       "difficulty breathing"
## [28] "aneurism history"     "numbness limb"        "breathing trouble"   
## [31] "infection mental"     "nerve pain"           "breathing medication"
cat('\n\n')
cbd_bi3 <- paste(cbd_split3b,cbd_split1b)
cbd_bi3
##  [1] "cancer aneurism"      "psychosis autoimmune" "could breathing"     
##  [4] "breathing breathing"  "breathing breathing"  "disease cancer"      
##  [7] "cbd cannabidiol"      "cannabidiol could"    "rash dehydration"    
## [10] "ragged difficulty"    "numbness disease"     "epilepsy disorder"   
## [13] "aneurism epilepsy"    "infection fever"      "autoimmune history"  
## [16] "disorder infection"   "allergy interact"     "numbness limb"       
## [19] "nerve limb"           "interact medication"  "nausea mental"       
## [22] "pregnant nausea"      "difficulty nerve"     "nerve nerve"         
## [25] "limb numbness"        "tingling numbness"    "breathing pain"      
## [28] "history pregnant"     "limb psychosis"       "trouble ragged"      
## [31] "mental rash"          "pain tingling"        "medication trouble"
cat('\n\n')
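The positional splits and pairings above could be folded into one helper and reused for the other 18 contraindication lists. This is a sketch only: the function name trigram_parts is hypothetical, and unlike cbd_bi3 it keeps just the adjacent word pairs (first-second and second-third), on the assumption that a pair joining the third word back to the first is less likely to be typed by a user.

```r
# Hypothetical helper generalizing the cbd_split / cbd_bi steps
# to any character vector of space-separated trigrams.
trigram_parts <- function(trigrams) {
  words <- strsplit(trigrams, split = ' ', fixed = TRUE)
  # Pull the first, second, and third word out of every trigram.
  w1 <- vapply(words, `[`, character(1), 1)
  w2 <- vapply(words, `[`, character(1), 2)
  w3 <- vapply(words, `[`, character(1), 3)
  list(
    unigrams = unique(c(w1, w2, w3)),
    # Adjacent pairs only; add paste(w3, w1) to reproduce cbd_bi3 as well.
    bigrams  = unique(c(paste(w1, w2), paste(w2, w3)))
  )
}

parts <- trigram_parts(c('fever rash infection', 'rash infection mental'))
parts$unigrams
parts$bigrams
```

Using vapply with character(1) instead of lapply plus as.character returns a character vector directly and fails loudly if an element is not a single string.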

The following is a list of all the unique unigrams in the CBD contraindication trigrams.

uni_cbd <- unique(c(cbd_split1b, cbd_split2b, cbd_split3b))
uni_cbd
##  [1] "aneurism"    "autoimmune"  "breathing"   "cancer"      "cannabidiol"
##  [6] "could"       "dehydration" "difficulty"  "disease"     "disorder"   
## [11] "epilepsy"    "fever"       "history"     "infection"   "interact"   
## [16] "limb"        "medication"  "mental"      "nausea"      "nerve"      
## [21] "numbness"    "pain"        "pregnant"    "psychosis"   "ragged"     
## [26] "rash"        "tingling"    "trouble"     "allergy"     "cbd"

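The following is a list of all the unique bigrams built from the CBD contraindication trigrams. Note that cbd_bi3 pairs the third word of each trigram with its first word, so those pairs wrap around the trigram rather than being adjacent words.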

bi_cbd <- unique(c(cbd_bi1, cbd_bi2, cbd_bi3))
bi_cbd
##  [1] "aneurism history"     "autoimmune disease"   "breathing medication"
##  [4] "breathing ragged"     "breathing trouble"    "cancer autoimmune"   
##  [7] "cannabidiol allergy"  "could interact"       "dehydration fever"   
## [10] "difficulty breathing" "disease psychosis"    "disorder nausea"     
## [13] "epilepsy pregnant"    "fever rash"           "history cancer"      
## [16] "infection mental"     "interact cannabidiol" "limb limb"           
## [19] "limb numbness"        "medication could"     "mental disorder"     
## [22] "nausea epilepsy"      "nerve pain"           "nerve tingling"      
## [25] "numbness limb"        "numbness nerve"       "pain difficulty"     
## [28] "pregnant aneurism"    "psychosis numbness"   "ragged breathing"    
## [31] "rash infection"       "tingling nerve"       "trouble breathing"   
## [34] "allergy cbd"          "cancer aneurism"      "psychosis autoimmune"
## [37] "could breathing"      "breathing breathing"  "disease cancer"      
## [40] "cbd cannabidiol"      "cannabidiol could"    "rash dehydration"    
## [43] "ragged difficulty"    "numbness disease"     "epilepsy disorder"   
## [46] "aneurism epilepsy"    "infection fever"      "autoimmune history"  
## [49] "disorder infection"   "allergy interact"     "nerve limb"          
## [52] "interact medication"  "nausea mental"        "pregnant nausea"     
## [55] "difficulty nerve"     "nerve nerve"          "tingling numbness"   
## [58] "breathing pain"       "history pregnant"     "limb psychosis"      
## [61] "trouble ragged"       "mental rash"          "pain tingling"       
## [64] "medication trouble"
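
As a minimal sketch of the bigram step above, the snippet below separates the adjacent word pairs (words 1-2 and 2-3) from the wrap-around pairs (word 3 followed by word 1) for three trigrams copied from the CBD output. The names `trigrams`, `words`, `adjacent`, and `wrapped` are illustrative only and are not part of the original script.

```r
# Three trigrams copied from the CBD contraindication output above
trigrams <- c("nerve pain difficulty", "tingling nerve pain", "limb limb numbness")

# Split each trigram into its words: one row per trigram, one column per word
words <- do.call(rbind, strsplit(trigrams, split = " "))

# Adjacent pairs: word 1 with word 2, and word 2 with word 3
adjacent <- unique(c(paste(words[, 1], words[, 2]),
                     paste(words[, 2], words[, 3])))

# Wrap-around pairs: word 3 followed by word 1 (the pairing cbd_bi3 uses)
wrapped <- unique(paste(words[, 3], words[, 1]))

adjacent
wrapped

# Wrap-around pairs that never occur as adjacent words in these trigrams
setdiff(wrapped, adjacent)
```

Whether the wrap-around pairs should stay in the contraindication list is a design choice. They widen the net for matching user input, but they can also match word orders that never appeared in the source descriptions.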

The trigrams are already unique, because they were column names and column names must be unique. They are also the trigrams that had count values for this modality, out of all the contraindication trigrams across the massage modalities.

So we will combine these into one list of 1- to 3-grams for our CBD contraindications: 33 trigrams, 64 bigrams, and 30 unigrams, for 127 tokens in total.

CBD_contraindications <- c(contra_cbd, bi_cbd, uni_cbd)
CBD_contraindications
##   [1] "aneurism history cancer"      "autoimmune disease psychosis"
##   [3] "breathing medication could"   "breathing ragged breathing"  
##   [5] "breathing trouble breathing"  "cancer autoimmune disease"   
##   [7] "cannabidiol allergy cbd"      "could interact cannabidiol"  
##   [9] "dehydration fever rash"       "difficulty breathing ragged" 
##  [11] "disease psychosis numbness"   "disorder nausea epilepsy"    
##  [13] "epilepsy pregnant aneurism"   "fever rash infection"        
##  [15] "history cancer autoimmune"    "infection mental disorder"   
##  [17] "interact cannabidiol allergy" "limb limb numbness"          
##  [19] "limb numbness nerve"          "medication could interact"   
##  [21] "mental disorder nausea"       "nausea epilepsy pregnant"    
##  [23] "nerve pain difficulty"        "nerve tingling nerve"        
##  [25] "numbness limb limb"           "numbness nerve tingling"     
##  [27] "pain difficulty breathing"    "pregnant aneurism history"   
##  [29] "psychosis numbness limb"      "ragged breathing trouble"    
##  [31] "rash infection mental"        "tingling nerve pain"         
##  [33] "trouble breathing medication" "aneurism history"            
##  [35] "autoimmune disease"           "breathing medication"        
##  [37] "breathing ragged"             "breathing trouble"           
##  [39] "cancer autoimmune"            "cannabidiol allergy"         
##  [41] "could interact"               "dehydration fever"           
##  [43] "difficulty breathing"         "disease psychosis"           
##  [45] "disorder nausea"              "epilepsy pregnant"           
##  [47] "fever rash"                   "history cancer"              
##  [49] "infection mental"             "interact cannabidiol"        
##  [51] "limb limb"                    "limb numbness"               
##  [53] "medication could"             "mental disorder"             
##  [55] "nausea epilepsy"              "nerve pain"                  
##  [57] "nerve tingling"               "numbness limb"               
##  [59] "numbness nerve"               "pain difficulty"             
##  [61] "pregnant aneurism"            "psychosis numbness"          
##  [63] "ragged breathing"             "rash infection"              
##  [65] "tingling nerve"               "trouble breathing"           
##  [67] "allergy cbd"                  "cancer aneurism"             
##  [69] "psychosis autoimmune"         "could breathing"             
##  [71] "breathing breathing"          "disease cancer"              
##  [73] "cbd cannabidiol"              "cannabidiol could"           
##  [75] "rash dehydration"             "ragged difficulty"           
##  [77] "numbness disease"             "epilepsy disorder"           
##  [79] "aneurism epilepsy"            "infection fever"             
##  [81] "autoimmune history"           "disorder infection"          
##  [83] "allergy interact"             "nerve limb"                  
##  [85] "interact medication"          "nausea mental"               
##  [87] "pregnant nausea"              "difficulty nerve"            
##  [89] "nerve nerve"                  "tingling numbness"           
##  [91] "breathing pain"               "history pregnant"            
##  [93] "limb psychosis"               "trouble ragged"              
##  [95] "mental rash"                  "pain tingling"               
##  [97] "medication trouble"           "aneurism"                    
##  [99] "autoimmune"                   "breathing"                   
## [101] "cancer"                       "cannabidiol"                 
## [103] "could"                        "dehydration"                 
## [105] "difficulty"                   "disease"                     
## [107] "disorder"                     "epilepsy"                    
## [109] "fever"                        "history"                     
## [111] "infection"                    "interact"                    
## [113] "limb"                         "medication"                  
## [115] "mental"                       "nausea"                      
## [117] "nerve"                        "numbness"                    
## [119] "pain"                         "pregnant"                    
## [121] "psychosis"                    "ragged"                      
## [123] "rash"                         "tingling"                    
## [125] "trouble"                      "allergy"                     
## [127] "cbd"

Looking at the list above, it is clear that if any of these unigrams appear in a user input, then the modality will get excluded. We don’t want this to occur, because somebody whose input includes ‘pain’ or ‘could’ would not get any massage recommendations. You might also be wondering why we did not do this earlier by setting the n-gram range to (1,3) when pulling the n-grams. Recall that the table combined a list of trigrams for contraindications with a list of bigrams for benefits, and when using regex to pull them apart we were able to separate the bigrams, which contain one ‘[.]’, from the trigrams, which contain two ‘[.]’.
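
The over-exclusion problem can be sketched as follows. This is only an illustration under stated assumptions: `user_input` is a made-up sentence, `contra_unigrams` is a small subset of the unigrams printed above, and the naive space split stands in for the tidytext and textstem tokenizing that the real program uses.

```r
# A small subset of the CBD contraindication unigrams printed above
contra_unigrams <- c("aneurism", "could", "pain", "pregnant")

# Hypothetical user request (made up for this illustration)
user_input <- "i could use help with lower back pain"

# Naive tokenization: lowercase, then split on single spaces
user_tokens <- strsplit(tolower(user_input), split = " ")[[1]]

# Any shared token would flag the modality as contraindicated
hits <- intersect(user_tokens, contra_unigrams)
hits

excluded <- length(hits) > 0
excluded
```

Here the harmless words ‘could’ and ‘pain’ are enough to exclude the modality, which is why the unigram list needs filtering before it is used.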

We can do something similar with the CBD benefit bigrams: split them into unigrams, combine those unigrams with the bigrams, and then pull out the unigrams and bigrams that also appear in the contraindication list, treating those shared tokens as benefits. A possible problem is that a shared bigram or unigram could genuinely be a contraindication. We will examine that after building those lists.

Let’s get the unigrams of the benefits of CBD.

benefit_uni <- strsplit(benefits_cbd, split=" ")
benefit_uni1 <- lapply(benefit_uni,'[',1)
benefit_uni1b <- as.character(benefit_uni1)
benefit_uni2 <- lapply(benefit_uni,'[',2)
benefit_uni2b <- as.character(benefit_uni2)

uni_benefits <- unique(c(benefit_uni1b,benefit_uni2b))

benefits_cbd1 <- c(uni_benefits, benefits_cbd)
benefits_cbd1
##  [1] "arthritis"        "associated"       "chronic"          "help"            
##  [5] "nerve"            "pain"             "stress"           "arthritis stress"
##  [9] "associated nerve" "chronic pain"     "help chronic"     "nerve pain"      
## [13] "pain arthritis"   "pain associated"
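
For comparison, here is a more compact sketch of the same unigram extraction. It assumes `benefits_cbd` holds the seven bigrams shown at the end of the output above; `uni_compact` and `benefits_compact` are names introduced only for this example.

```r
# The seven CBD benefit bigrams, copied from the output above
benefits_cbd <- c("arthritis stress", "associated nerve", "chronic pain",
                  "help chronic", "nerve pain", "pain arthritis",
                  "pain associated")

# Split every bigram, flatten to one character vector, and keep unique words
uni_compact <- sort(unique(unlist(strsplit(benefits_cbd, split = " "))))
uni_compact

# Unigrams first, then the bigrams, as in benefits_cbd1
benefits_compact <- c(uni_compact, benefits_cbd)
length(benefits_compact)
```

The `sort()` call is there because flattening interleaves first and second words, so the unsorted order would differ from the column-by-column approach even though the set of words is the same.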

Let’s now get the list of tokens that appear in both the benefits and the contraindications.

both_cbd <- CBD_contraindications %in% benefits_cbd1
length(both_cbd)
## [1] 127
both_cbd
##   [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
##  [61] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [85] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [97] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [109] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE
## [121] FALSE FALSE FALSE FALSE FALSE FALSE FALSE
cbd1 <- CBD_contraindications[both_cbd]
length(cbd1)
## [1] 3
cbd1
## [1] "nerve pain" "nerve"      "pain"

The above tokens are included in both the benefits of CBD and the contraindications of CBD. Judging from the three tokens, ‘nerve pain’, ‘nerve’, and ‘pain’, we will exclude them as markers of contraindications for CBD oil during massage, as they are more likely benefits.

CBD_contraindications1 <- CBD_contraindications[both_cbd==FALSE]
length(CBD_contraindications1)
## [1] 124
CBD_contraindications1
##   [1] "aneurism history cancer"      "autoimmune disease psychosis"
##   [3] "breathing medication could"   "breathing ragged breathing"  
##   [5] "breathing trouble breathing"  "cancer autoimmune disease"   
##   [7] "cannabidiol allergy cbd"      "could interact cannabidiol"  
##   [9] "dehydration fever rash"       "difficulty breathing ragged" 
##  [11] "disease psychosis numbness"   "disorder nausea epilepsy"    
##  [13] "epilepsy pregnant aneurism"   "fever rash infection"        
##  [15] "history cancer autoimmune"    "infection mental disorder"   
##  [17] "interact cannabidiol allergy" "limb limb numbness"          
##  [19] "limb numbness nerve"          "medication could interact"   
##  [21] "mental disorder nausea"       "nausea epilepsy pregnant"    
##  [23] "nerve pain difficulty"        "nerve tingling nerve"        
##  [25] "numbness limb limb"           "numbness nerve tingling"     
##  [27] "pain difficulty breathing"    "pregnant aneurism history"   
##  [29] "psychosis numbness limb"      "ragged breathing trouble"    
##  [31] "rash infection mental"        "tingling nerve pain"         
##  [33] "trouble breathing medication" "aneurism history"            
##  [35] "autoimmune disease"           "breathing medication"        
##  [37] "breathing ragged"             "breathing trouble"           
##  [39] "cancer autoimmune"            "cannabidiol allergy"         
##  [41] "could interact"               "dehydration fever"           
##  [43] "difficulty breathing"         "disease psychosis"           
##  [45] "disorder nausea"              "epilepsy pregnant"           
##  [47] "fever rash"                   "history cancer"              
##  [49] "infection mental"             "interact cannabidiol"        
##  [51] "limb limb"                    "limb numbness"               
##  [53] "medication could"             "mental disorder"             
##  [55] "nausea epilepsy"              "nerve tingling"              
##  [57] "numbness limb"                "numbness nerve"              
##  [59] "pain difficulty"              "pregnant aneurism"           
##  [61] "psychosis numbness"           "ragged breathing"            
##  [63] "rash infection"               "tingling nerve"              
##  [65] "trouble breathing"            "allergy cbd"                 
##  [67] "cancer aneurism"              "psychosis autoimmune"        
##  [69] "could breathing"              "breathing breathing"         
##  [71] "disease cancer"               "cbd cannabidiol"             
##  [73] "cannabidiol could"            "rash dehydration"            
##  [75] "ragged difficulty"            "numbness disease"            
##  [77] "epilepsy disorder"            "aneurism epilepsy"           
##  [79] "infection fever"              "autoimmune history"          
##  [81] "disorder infection"           "allergy interact"            
##  [83] "nerve limb"                   "interact medication"         
##  [85] "nausea mental"                "pregnant nausea"             
##  [87] "difficulty nerve"             "nerve nerve"                 
##  [89] "tingling numbness"            "breathing pain"              
##  [91] "history pregnant"             "limb psychosis"              
##  [93] "trouble ragged"               "mental rash"                 
##  [95] "pain tingling"                "medication trouble"          
##  [97] "aneurism"                     "autoimmune"                  
##  [99] "breathing"                    "cancer"                      
## [101] "cannabidiol"                  "could"                       
## [103] "dehydration"                  "difficulty"                  
## [105] "disease"                      "disorder"                    
## [107] "epilepsy"                     "fever"                       
## [109] "history"                      "infection"                   
## [111] "interact"                     "limb"                        
## [113] "medication"                   "mental"                      
## [115] "nausea"                       "numbness"                    
## [117] "pregnant"                     "psychosis"                   
## [119] "ragged"                       "rash"                        
## [121] "tingling"                     "trouble"                     
## [123] "allergy"                      "cbd"
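
The same filtering can be sketched with base R set operations instead of a logical index. The vectors below are small samples taken from the CBD lists above, and `contra`, `benefits`, `shared`, and `kept` are illustrative names rather than objects from the script.

```r
# Small samples of the CBD token lists shown above
contra   <- c("nerve pain difficulty", "nerve pain", "limb numbness",
              "nerve", "pain", "rash")
benefits <- c("nerve", "pain", "stress", "nerve pain", "chronic pain")

# Logical index, as in the script: TRUE where a contraindication token
# also appears among the benefit tokens
both <- contra %in% benefits

shared <- contra[both]    # tokens treated as benefits
kept   <- contra[!both]   # tokens kept as contraindications

shared
kept

# setdiff() gives the same result here because contra has no duplicates
identical(kept, setdiff(contra, benefits))
```

One difference to keep in mind is that `setdiff()` and `intersect()` drop duplicate values, while the logical index preserves them, so the two approaches only agree when the input is already unique.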

We could create a data table of the benefits, the contraindications, and/or the shared tokens for each modality to pull from, or keep them as lists to pull from within our program. We already have those tables, because we made our 19 modality subsets from them, and building new ones would mean merging or joining each one to the others, creating nulls. So we will keep them as lists, filtered from the original lists to remove as much ambiguity as possible between benefits and contraindications. Now we will create these filtered lists for the other 18 modalities.
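
Since the same steps are repeated for every modality, they could be wrapped in one function. The sketch below is a hypothetical helper, `build_filtered_lists()`, which is not part of the original script. It assumes each modality has a character vector of contraindication trigrams and a character vector of benefit bigrams, with words separated by single spaces. The `toy` modality is made-up placeholder data.

```r
# Hypothetical helper (not in the original script): builds the 1- to 3-gram
# contraindication list, the 1- to 2-gram benefit list, and the filtered result
build_filtered_lists <- function(contra_trigrams, benefit_bigrams) {
  tri <- strsplit(contra_trigrams, split = " ")
  w1 <- vapply(tri, "[", character(1), 1)
  w2 <- vapply(tri, "[", character(1), 2)
  w3 <- vapply(tri, "[", character(1), 3)

  uni <- unique(c(w1, w2, w3))
  # Same three pairings as the script: 1-2, 2-3, and the wrap-around 3-1
  bi <- unique(c(paste(w1, w2), paste(w2, w3), paste(w3, w1)))
  contra_all <- c(contra_trigrams, bi, uni)

  ben <- strsplit(benefit_bigrams, split = " ")
  b1 <- vapply(ben, "[", character(1), 1)
  b2 <- vapply(ben, "[", character(1), 2)
  benefits_all <- c(unique(c(b1, b2)), benefit_bigrams)

  in_both <- contra_all %in% benefits_all
  list(benefits = benefits_all,
       shared = contra_all[in_both],
       contraindications = contra_all[!in_both])
}

# Usage: a named list of modalities, each with its own token vectors
modalities <- list(
  cbd = list(contra = c("nerve pain difficulty", "tingling nerve pain"),
             benefits = c("nerve pain", "chronic pain")),
  toy = list(contra = c("alpha beta gamma"),          # placeholder data
             benefits = c("delta epsilon"))
)

filtered <- lapply(modalities, function(m) {
  build_filtered_lists(m$contra, m$benefits)
})

filtered$cbd$shared
length(filtered$cbd$contraindications)
```

Keeping the results in one named list also means the final program could loop over `filtered` rather than referring to each modality’s variables by name.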

As an aside, who here has watched the 2008 film reintroduced on Amazon Prime, ‘Rainman Twins-..Savante autistics Kay and…’? I don’t remember the full name, but it is easily Googled. I paid $2.99 to watch it yesterday thinking it was new, but it was new only to me. Sometimes, just like neural networks, old stuff isn’t appreciated in its time but can be reintroduced 10-30 years later for its benefits. The point is, if you did watch this, and if you know the back story of neural networks, you will find that those twins were autistic savants who could recall on the spot, in no more than one second, the day of the week of any date, the food they ate, the clothes they wore, etc. They were viewed and harassed as ‘retards’ until the film came out, with Dustin Hoffman praising savants and encouraging people to find and appreciate them. They needed to have routine, noted all the bells, buzzers, and the host’s clothing during every episode of the 100,000 Pyramid show, and went calmly nuts when it ended. If they were to do the following, they could be of better use. We are going to do just that, and think to ourselves how those twins could have been put to better work. So, as we build our next filtered lists of benefits and contraindications for the remaining 18 modalities, we will also be thinking about how we are going to build this program.


benefits_craniosacral, contra_craniosacral

cranio_split <- strsplit(contra_craniosacral,split=' ')
cranio_split1 <- lapply(cranio_split, '[',1)
cranio_split1b <- as.character(cranio_split1)
cranio_split2 <- lapply(cranio_split, '[',2)
cranio_split2b <- as.character(cranio_split2)
cranio_split3 <- lapply(cranio_split, '[',3)
cranio_split3b <- as.character(cranio_split3)
cranio_bi1 <- paste(cranio_split1b,cranio_split2b)
cranio_bi2 <- paste(cranio_split2b,cranio_split3b)
cranio_bi3 <- paste(cranio_split3b,cranio_split1b)
uni_cranio <- unique(c(cranio_split1b, cranio_split2b, cranio_split3b))
bi_cranio <- unique(c(cranio_bi1, cranio_bi2, cranio_bi3))
Cranio_contraindications <- c(contra_craniosacral, bi_cranio, uni_cranio)
csbenefit_uni <- strsplit(benefits_craniosacral, split=" ")
csbenefit_uni1 <- lapply(csbenefit_uni,'[',1)
csbenefit_uni1b <- as.character(csbenefit_uni1)
csbenefit_uni2 <- lapply(csbenefit_uni,'[',2)
csbenefit_uni2b <- as.character(csbenefit_uni2)
csuni_benefits <- unique(c(csbenefit_uni1b,csbenefit_uni2b))
benefits_cranio1 <- c(csuni_benefits, benefits_craniosacral) #

both_cranio <- Cranio_contraindications %in% benefits_cranio1
cranio1 <- Cranio_contraindications[both_cranio] #
Cranio_contraindications1 <- Cranio_contraindications[both_cranio==FALSE] #

benefits_reflexology, contra_reflexology

reflex_split <- strsplit(contra_reflexology,split=' ')
reflex_split1 <- lapply(reflex_split, '[',1)
reflex_split1b <- as.character(reflex_split1)
reflex_split2 <- lapply(reflex_split, '[',2)
reflex_split2b <- as.character(reflex_split2)
reflex_split3 <- lapply(reflex_split, '[',3)
reflex_split3b <- as.character(reflex_split3)
reflex_bi1 <- paste(reflex_split1b,reflex_split2b)
reflex_bi2 <- paste(reflex_split2b,reflex_split3b)
reflex_bi3 <- paste(reflex_split3b,reflex_split1b)
uni_reflex <- unique(c(reflex_split1b, reflex_split2b, reflex_split3b))
bi_reflex <- unique(c(reflex_bi1, reflex_bi2, reflex_bi3))
Reflex_contraindications <- c(contra_reflexology, bi_reflex, uni_reflex)
rfxbenefit_uni <- strsplit(benefits_reflexology, split=" ")
rfxbenefit_uni1 <- lapply(rfxbenefit_uni,'[',1)
rfxbenefit_uni1b <- as.character(rfxbenefit_uni1)
rfxbenefit_uni2 <- lapply(rfxbenefit_uni,'[',2)
rfxbenefit_uni2b <- as.character(rfxbenefit_uni2)
rfxuni_benefits <- unique(c(rfxbenefit_uni1b,rfxbenefit_uni2b))
benefits_reflex1 <- c(rfxuni_benefits, benefits_reflexology) #

both_reflex <- Reflex_contraindications %in% benefits_reflex1
reflex1 <- Reflex_contraindications[both_reflex] #
Reflex_contraindications1 <- Reflex_contraindications[both_reflex==FALSE] #

benefits_lymphatic, contra_lymphatic

lymph_split <- strsplit(contra_lymphatic,split=' ')
lymph_split1 <- lapply(lymph_split, '[',1)
lymph_split1b <- as.character(lymph_split1)
lymph_split2 <- lapply(lymph_split, '[',2)
lymph_split2b <- as.character(lymph_split2)
lymph_split3 <- lapply(lymph_split, '[',3)
lymph_split3b <- as.character(lymph_split3)
lymph_bi1 <- paste(lymph_split1b,lymph_split2b)
lymph_bi2 <- paste(lymph_split2b,lymph_split3b)
lymph_bi3 <- paste(lymph_split3b,lymph_split1b)
uni_lymph <- unique(c(lymph_split1b, lymph_split2b, lymph_split3b))
bi_lymph <- unique(c(lymph_bi1, lymph_bi2, lymph_bi3))
Lymph_contraindications <- c(contra_lymphatic, bi_lymph, uni_lymph)
lymphbenefit_uni <- strsplit(benefits_lymphatic, split=" ")
lymphbenefit_uni1 <- lapply(lymphbenefit_uni,'[',1)
lymphbenefit_uni1b <- as.character(lymphbenefit_uni1)
lymphbenefit_uni2 <- lapply(lymphbenefit_uni,'[',2)
lymphbenefit_uni2b <- as.character(lymphbenefit_uni2)
lymphuni_benefits <- unique(c(lymphbenefit_uni1b,lymphbenefit_uni2b))
benefits_lymph1 <- c(lymphuni_benefits, benefits_lymphatic) #

both_lymph <- Lymph_contraindications %in% benefits_lymph1
lymph1 <- Lymph_contraindications[both_lymph] #
Lymph_contraindications1 <- Lymph_contraindications[both_lymph==FALSE] #

benefits_massagegun, contra_massagegun

mgn_split <- strsplit(contra_massagegun,split=' ')
mgn_split1 <- lapply(mgn_split, '[',1)
mgn_split1b <- as.character(mgn_split1)
mgn_split2 <- lapply(mgn_split, '[',2)
mgn_split2b <- as.character(mgn_split2)
mgn_split3 <- lapply(mgn_split, '[',3)
mgn_split3b <- as.character(mgn_split3)
mgn_bi1 <- paste(mgn_split1b,mgn_split2b)
mgn_bi2 <- paste(mgn_split2b,mgn_split3b)
mgn_bi3 <- paste(mgn_split3b,mgn_split1b)
uni_mgn <- unique(c(mgn_split1b, mgn_split2b, mgn_split3b))
bi_mgn <- unique(c(mgn_bi1, mgn_bi2, mgn_bi3))
Mgn_contraindications <- c(contra_massagegun, bi_mgn, uni_mgn)
mgnbenefit_uni <- strsplit(benefits_massagegun, split=" ")
mgnbenefit_uni1 <- lapply(mgnbenefit_uni,'[',1)
mgnbenefit_uni1b <- as.character(mgnbenefit_uni1)
mgnbenefit_uni2 <- lapply(mgnbenefit_uni,'[',2)
mgnbenefit_uni2b <- as.character(mgnbenefit_uni2)
mgnuni_benefits <- unique(c(mgnbenefit_uni1b,mgnbenefit_uni2b))
benefits_mgn1 <- c(mgnuni_benefits, benefits_massagegun) #

both_mgn <- Mgn_contraindications %in% benefits_mgn1
mgn1 <- Mgn_contraindications[both_mgn] #
Mgn_contraindications1 <- Mgn_contraindications[both_mgn==FALSE] #

benefits_tpt, contra_tpt

tpt_split <- strsplit(contra_tpt,split=' ')
tpt_split1 <- lapply(tpt_split, '[',1)
tpt_split1b <- as.character(tpt_split1)
tpt_split2 <- lapply(tpt_split, '[',2)
tpt_split2b <- as.character(tpt_split2)
tpt_split3 <- lapply(tpt_split, '[',3)
tpt_split3b <- as.character(tpt_split3)
tpt_bi1 <- paste(tpt_split1b,tpt_split2b)
tpt_bi2 <- paste(tpt_split2b,tpt_split3b)
tpt_bi3 <- paste(tpt_split3b,tpt_split1b)
uni_tpt <- unique(c(tpt_split1b, tpt_split2b, tpt_split3b))
bi_tpt <- unique(c(tpt_bi1, tpt_bi2, tpt_bi3))
TPT_contraindications <- c(contra_tpt, bi_tpt, uni_tpt)
tptbenefit_uni <- strsplit(benefits_tpt, split=" ")
tptbenefit_uni1 <- lapply(tptbenefit_uni,'[',1)
tptbenefit_uni1b <- as.character(tptbenefit_uni1)
tptbenefit_uni2 <- lapply(tptbenefit_uni,'[',2)
tptbenefit_uni2b <- as.character(tptbenefit_uni2)
tptuni_benefits <- unique(c(tptbenefit_uni1b,tptbenefit_uni2b))
benefits_tpt1 <- c(tptuni_benefits, benefits_tpt) #

both_tpt <- TPT_contraindications %in% benefits_tpt1
tpt1 <- TPT_contraindications[both_tpt] #
TPT_contraindications1 <- TPT_contraindications[both_tpt==FALSE] #

benefits_instrument, contra_instrument

instrument_split <- strsplit(contra_instrument,split=' ')
instrument_split1 <- lapply(instrument_split, '[',1)
instrument_split1b <- as.character(instrument_split1)
instrument_split2 <- lapply(instrument_split, '[',2)
instrument_split2b <- as.character(instrument_split2)
instrument_split3 <- lapply(instrument_split, '[',3)
instrument_split3b <- as.character(instrument_split3)
instrument_bi1 <- paste(instrument_split1b,instrument_split2b)
instrument_bi2 <- paste(instrument_split2b,instrument_split3b)
instrument_bi3 <- paste(instrument_split3b,instrument_split1b)
uni_instrument <- unique(c(instrument_split1b, instrument_split2b, instrument_split3b))
bi_instrument <- unique(c(instrument_bi1, instrument_bi2, instrument_bi3))
Instrument_contraindications <- c(contra_instrument, bi_instrument, uni_instrument)
instrumentbenefit_uni <- strsplit(benefits_instrument, split=" ")
instrumentbenefit_uni1 <- lapply(instrumentbenefit_uni,'[',1)
instrumentbenefit_uni1b <- as.character(instrumentbenefit_uni1)
instrumentbenefit_uni2 <- lapply(instrumentbenefit_uni,'[',2)
instrumentbenefit_uni2b <- as.character(instrumentbenefit_uni2)
instrumentuni_benefits <- unique(c(instrumentbenefit_uni1b,instrumentbenefit_uni2b))
benefits_instrument1 <- c(instrumentuni_benefits, benefits_instrument) #

both_instrument <- Instrument_contraindications %in% benefits_instrument1
instrument1 <- Instrument_contraindications[both_instrument] #
Instrument_contraindications1 <- Instrument_contraindications[both_instrument==FALSE] #

benefits_DT, contra_DT

DT_split <- strsplit(contra_DT,split=' ')
DT_split1 <- lapply(DT_split, '[',1)
DT_split1b <- as.character(DT_split1)
DT_split2 <- lapply(DT_split, '[',2)
DT_split2b <- as.character(DT_split2)
DT_split3 <- lapply(DT_split, '[',3)
DT_split3b <- as.character(DT_split3)
DT_bi1 <- paste(DT_split1b,DT_split2b)
DT_bi2 <- paste(DT_split2b,DT_split3b)
DT_bi3 <- paste(DT_split3b,DT_split1b)
uni_DT <- unique(c(DT_split1b, DT_split2b, DT_split3b))
bi_DT <- unique(c(DT_bi1, DT_bi2, DT_bi3))
DT_contraindications <- c(contra_DT, bi_DT, uni_DT)
DTbenefit_uni <- strsplit(benefits_DT, split=" ")
DTbenefit_uni1 <- lapply(DTbenefit_uni,'[',1)
DTbenefit_uni1b <- as.character(DTbenefit_uni1)
DTbenefit_uni2 <- lapply(DTbenefit_uni,'[',2)
DTbenefit_uni2b <- as.character(DTbenefit_uni2)
DTuni_benefits <- unique(c(DTbenefit_uni1b,DTbenefit_uni2b))
benefits_DT1 <- c(DTuni_benefits, benefits_DT) #

both_DT <- DT_contraindications %in% benefits_DT1
DT1 <- DT_contraindications[both_DT] #
DT_contraindications1 <- DT_contraindications[both_DT==FALSE] #

benefits_swedish, contra_swedish

swedish_split <- strsplit(contra_swedish,split=' ')
swedish_split1 <- lapply(swedish_split, '[',1)
swedish_split1b <- as.character(swedish_split1)
swedish_split2 <- lapply(swedish_split, '[',2)
swedish_split2b <- as.character(swedish_split2)
swedish_split3 <- lapply(swedish_split, '[',3)
swedish_split3b <- as.character(swedish_split3)
swedish_bi1 <- paste(swedish_split1b,swedish_split2b)
swedish_bi2 <- paste(swedish_split2b,swedish_split3b)
swedish_bi3 <- paste(swedish_split3b,swedish_split1b)
uni_swedish <- unique(c(swedish_split1b, swedish_split2b, swedish_split3b))
bi_swedish <- unique(c(swedish_bi1, swedish_bi2, swedish_bi3))
swedish_contraindications <- c(contra_swedish, bi_swedish, uni_swedish)
swedishbenefit_uni <- strsplit(benefits_swedish, split=" ")
swedishbenefit_uni1 <- lapply(swedishbenefit_uni,'[',1)
swedishbenefit_uni1b <- as.character(swedishbenefit_uni1)
swedishbenefit_uni2 <- lapply(swedishbenefit_uni,'[',2)
swedishbenefit_uni2b <- as.character(swedishbenefit_uni2)
swedishuni_benefits <- unique(c(swedishbenefit_uni1b,swedishbenefit_uni2b))
benefits_swedish1 <- c(swedishuni_benefits, benefits_swedish) #

both_swedish <- swedish_contraindications %in% benefits_swedish1
swedish1 <- swedish_contraindications[both_swedish] #
swedish_contraindications1 <- swedish_contraindications[both_swedish==FALSE] #

benefits_aroma, contra_aroma

aroma_split <- strsplit(contra_aroma,split=' ')
aroma_split1 <- lapply(aroma_split, '[',1)
aroma_split1b <- as.character(aroma_split1)
aroma_split2 <- lapply(aroma_split, '[',2)
aroma_split2b <- as.character(aroma_split2)
aroma_split3 <- lapply(aroma_split, '[',3)
aroma_split3b <- as.character(aroma_split3)
aroma_bi1 <- paste(aroma_split1b,aroma_split2b)
aroma_bi2 <- paste(aroma_split2b,aroma_split3b)
aroma_bi3 <- paste(aroma_split3b,aroma_split1b)
uni_aroma <- unique(c(aroma_split1b, aroma_split2b, aroma_split3b))
bi_aroma <- unique(c(aroma_bi1, aroma_bi2, aroma_bi3))
Aroma_contraindications <- c(contra_aroma, bi_aroma, uni_aroma)
aromabenefit_uni <- strsplit(benefits_aroma, split=" ")
aromabenefit_uni1 <- lapply(aromabenefit_uni,'[',1)
aromabenefit_uni1b <- as.character(aromabenefit_uni1)
aromabenefit_uni2 <- lapply(aromabenefit_uni,'[',2)
aromabenefit_uni2b <- as.character(aromabenefit_uni2)
aromauni_benefits <- unique(c(aromabenefit_uni1b,aromabenefit_uni2b))
benefits_aroma1 <- c(aromauni_benefits, benefits_aroma) #

both_aroma <- Aroma_contraindications %in% benefits_aroma1
aroma1 <- Aroma_contraindications[both_aroma] #
Aroma_contraindications1 <- Aroma_contraindications[both_aroma==FALSE] #

benefits_stretch, contra_stretch

stretch_split <- strsplit(contra_stretch,split=' ')
stretch_split1 <- lapply(stretch_split, '[',1)
stretch_split1b <- as.character(stretch_split1)
stretch_split2 <- lapply(stretch_split, '[',2)
stretch_split2b <- as.character(stretch_split2)
stretch_split3 <- lapply(stretch_split, '[',3)
stretch_split3b <- as.character(stretch_split3)
stretch_bi1 <- paste(stretch_split1b,stretch_split2b)
stretch_bi2 <- paste(stretch_split2b,stretch_split3b)
stretch_bi3 <- paste(stretch_split3b,stretch_split1b)
uni_stretch <- unique(c(stretch_split1b, stretch_split2b, stretch_split3b))
bi_stretch <- unique(c(stretch_bi1, stretch_bi2, stretch_bi3))
stretch_contraindications <- c(contra_stretch, bi_stretch, uni_stretch)
stretchbenefit_uni <- strsplit(benefits_stretch, split=" ")
stretchbenefit_uni1 <- lapply(stretchbenefit_uni,'[',1)
stretchbenefit_uni1b <- as.character(stretchbenefit_uni1)
stretchbenefit_uni2 <- lapply(stretchbenefit_uni,'[',2)
stretchbenefit_uni2b <- as.character(stretchbenefit_uni2)
stretchuni_benefits <- unique(c(stretchbenefit_uni1b,stretchbenefit_uni2b))
benefits_stretch1 <- c(stretchuni_benefits, benefits_stretch) #

both_stretch <- stretch_contraindications %in% benefits_stretch1
stretch1 <- stretch_contraindications[both_stretch] #
stretch_contraindications1 <- stretch_contraindications[both_stretch==FALSE] #

benefits_cold, contra_cold

cold_split <- strsplit(contra_cold,split=' ')
cold_split1 <- lapply(cold_split, '[',1)
cold_split1b <- as.character(cold_split1)
cold_split2 <- lapply(cold_split, '[',2)
cold_split2b <- as.character(cold_split2)
cold_split3 <- lapply(cold_split, '[',3)
cold_split3b <- as.character(cold_split3)
cold_bi1 <- paste(cold_split1b,cold_split2b)
cold_bi2 <- paste(cold_split2b,cold_split3b)
cold_bi3 <- paste(cold_split3b,cold_split1b)
uni_cold <- unique(c(cold_split1b, cold_split2b, cold_split3b))
bi_cold <- unique(c(cold_bi1, cold_bi2, cold_bi3))
cold_contraindications <- c(contra_cold, bi_cold, uni_cold)
coldbenefit_uni <- strsplit(benefits_cold, split=" ")
coldbenefit_uni1 <- lapply(coldbenefit_uni,'[',1)
coldbenefit_uni1b <- as.character(coldbenefit_uni1)
coldbenefit_uni2 <- lapply(coldbenefit_uni,'[',2)
coldbenefit_uni2b <- as.character(coldbenefit_uni2)
colduni_benefits <- unique(c(coldbenefit_uni1b,coldbenefit_uni2b))
benefits_cold1 <- c(colduni_benefits, benefits_cold) #

both_cold <- cold_contraindications %in% benefits_cold1
cold1 <- cold_contraindications[both_cold] #
cold_contraindications1 <- cold_contraindications[both_cold==FALSE] #

benefits_freeze, contra_freeze

freeze_split <- strsplit(contra_freeze,split=' ')
freeze_split1 <- lapply(freeze_split, '[',1)
freeze_split1b <- as.character(freeze_split1)
freeze_split2 <- lapply(freeze_split, '[',2)
freeze_split2b <- as.character(freeze_split2)
freeze_split3 <- lapply(freeze_split, '[',3)
freeze_split3b <- as.character(freeze_split3)
freeze_bi1 <- paste(freeze_split1b,freeze_split2b)
freeze_bi2 <- paste(freeze_split2b,freeze_split3b)
freeze_bi3 <- paste(freeze_split3b,freeze_split1b)
uni_freeze <- unique(c(freeze_split1b, freeze_split2b, freeze_split3b))
bi_freeze <- unique(c(freeze_bi1, freeze_bi2, freeze_bi3))
Freeze_contraindications <- c(contra_freeze, bi_freeze, uni_freeze)
freezebenefit_uni <- strsplit(benefits_freeze, split=" ")
freezebenefit_uni1 <- lapply(freezebenefit_uni,'[',1)
freezebenefit_uni1b <- as.character(freezebenefit_uni1)
freezebenefit_uni2 <- lapply(freezebenefit_uni,'[',2)
freezebenefit_uni2b <- as.character(freezebenefit_uni2)
freezeuni_benefits <- unique(c(freezebenefit_uni1b,freezebenefit_uni2b))
benefits_freeze1 <- c(freezeuni_benefits, benefits_freeze) #

both_freeze <- Freeze_contraindications %in% benefits_freeze1
freeze1 <- Freeze_contraindications[both_freeze] #
Freeze_contraindications1 <- Freeze_contraindications[both_freeze==FALSE] #

Next, the same steps for sports massage, using benefits_sports and contra_sports.

sports_split <- strsplit(contra_sports,split=' ')
sports_split1 <- lapply(sports_split, '[',1)
sports_split1b <- as.character(sports_split1)
sports_split2 <- lapply(sports_split, '[',2)
sports_split2b <- as.character(sports_split2)
sports_split3 <- lapply(sports_split, '[',3)
sports_split3b <- as.character(sports_split3)
sports_bi1 <- paste(sports_split1b,sports_split2b)
sports_bi2 <- paste(sports_split2b,sports_split3b)
sports_bi3 <- paste(sports_split3b,sports_split1b)
uni_sports <- unique(c(sports_split1b, sports_split2b, sports_split3b))
bi_sports <- unique(c(sports_bi1, sports_bi2, sports_bi3))
Sports_contraindications <- c(contra_sports, bi_sports, uni_sports)
sportsbenefit_uni <- strsplit(benefits_sports, split=" ")
sportsbenefit_uni1 <- lapply(sportsbenefit_uni,'[',1)
sportsbenefit_uni1b <- as.character(sportsbenefit_uni1)
sportsbenefit_uni2 <- lapply(sportsbenefit_uni,'[',2)
sportsbenefit_uni2b <- as.character(sportsbenefit_uni2)
sportsuni_benefits <- unique(c(sportsbenefit_uni1b,sportsbenefit_uni2b))
benefits_sports1 <- c(sportsuni_benefits, benefits_sports) #

both_sports <- Sports_contraindications %in% benefits_sports1
sports1 <- Sports_contraindications[both_sports] #
Sports_contraindications1 <- Sports_contraindications[both_sports==FALSE] #

Next, the same steps for cupping, using benefits_cup and contra_cup.

cup_split <- strsplit(contra_cup,split=' ')
cup_split1 <- lapply(cup_split, '[',1)
cup_split1b <- as.character(cup_split1)
cup_split2 <- lapply(cup_split, '[',2)
cup_split2b <- as.character(cup_split2)
cup_split3 <- lapply(cup_split, '[',3)
cup_split3b <- as.character(cup_split3)
cup_bi1 <- paste(cup_split1b,cup_split2b)
cup_bi2 <- paste(cup_split2b,cup_split3b)
cup_bi3 <- paste(cup_split3b,cup_split1b)
uni_cup <- unique(c(cup_split1b, cup_split2b, cup_split3b))
bi_cup <- unique(c(cup_bi1, cup_bi2, cup_bi3))
Cupping_contraindications <- c(contra_cup, bi_cup, uni_cup)
cuppingbenefit_uni <- strsplit(benefits_cup, split=" ")
cuppingbenefit_uni1 <- lapply(cuppingbenefit_uni,'[',1)
cuppingbenefit_uni1b <- as.character(cuppingbenefit_uni1)
cuppingbenefit_uni2 <- lapply(cuppingbenefit_uni,'[',2)
cuppingbenefit_uni2b <- as.character(cuppingbenefit_uni2)
cuppinguni_benefits <- unique(c(cuppingbenefit_uni1b,cuppingbenefit_uni2b))
benefits_cup1 <- c(cuppinguni_benefits, benefits_cup) #

both_cup <- Cupping_contraindications %in% benefits_cup1
cup1 <- Cupping_contraindications[both_cup] #
Cupping_contraindications1 <- Cupping_contraindications[both_cup==FALSE] #

Next, the same steps for hot stone, using benefits_hs and contra_hs.

HotStone_split <- strsplit(contra_hs,split=' ')
HotStone_split1 <- lapply(HotStone_split, '[',1)
HotStone_split1b <- as.character(HotStone_split1)
HotStone_split2 <- lapply(HotStone_split, '[',2)
HotStone_split2b <- as.character(HotStone_split2)
HotStone_split3 <- lapply(HotStone_split, '[',3)
HotStone_split3b <- as.character(HotStone_split3)
HotStone_bi1 <- paste(HotStone_split1b,HotStone_split2b)
HotStone_bi2 <- paste(HotStone_split2b,HotStone_split3b)
HotStone_bi3 <- paste(HotStone_split3b,HotStone_split1b)
uni_HotStone <- unique(c(HotStone_split1b, HotStone_split2b, HotStone_split3b))
bi_HotStone <- unique(c(HotStone_bi1, HotStone_bi2, HotStone_bi3))
HotStone_contraindications <- c(contra_hs, bi_HotStone, uni_HotStone)
hsbenefit_uni <- strsplit(benefits_hs, split=" ")
hsbenefit_uni1 <- lapply(hsbenefit_uni,'[',1)
hsbenefit_uni1b <- as.character(hsbenefit_uni1)
hsbenefit_uni2 <- lapply(hsbenefit_uni,'[',2)
hsbenefit_uni2b <- as.character(hsbenefit_uni2)
hsuni_benefits <- unique(c(hsbenefit_uni1b,hsbenefit_uni2b))
benefits_HotStone1 <- c(hsuni_benefits, benefits_hs) #

both_HotStone <- HotStone_contraindications %in% benefits_HotStone1
HotStone1 <- HotStone_contraindications[both_HotStone] #
HotStone_contraindications1 <- HotStone_contraindications[both_HotStone==FALSE] #

Next, the same steps for shiatsu, using benefits_shiatsu and contra_shiatsu.

shiatsu_split <- strsplit(contra_shiatsu,split=' ')
shiatsu_split1 <- lapply(shiatsu_split, '[',1)
shiatsu_split1b <- as.character(shiatsu_split1)
shiatsu_split2 <- lapply(shiatsu_split, '[',2)
shiatsu_split2b <- as.character(shiatsu_split2)
shiatsu_split3 <- lapply(shiatsu_split, '[',3)
shiatsu_split3b <- as.character(shiatsu_split3)
shiatsu_bi1 <- paste(shiatsu_split1b,shiatsu_split2b)
shiatsu_bi2 <- paste(shiatsu_split2b,shiatsu_split3b)
shiatsu_bi3 <- paste(shiatsu_split3b,shiatsu_split1b)
uni_shiatsu <- unique(c(shiatsu_split1b, shiatsu_split2b, shiatsu_split3b))
bi_shiatsu <- unique(c(shiatsu_bi1, shiatsu_bi2, shiatsu_bi3))
Shiatsu_contraindications <- c(contra_shiatsu, bi_shiatsu, uni_shiatsu)
shiatsubenefit_uni <- strsplit(benefits_shiatsu, split=" ")
shiatsubenefit_uni1 <- lapply(shiatsubenefit_uni,'[',1)
shiatsubenefit_uni1b <- as.character(shiatsubenefit_uni1)
shiatsubenefit_uni2 <- lapply(shiatsubenefit_uni,'[',2)
shiatsubenefit_uni2b <- as.character(shiatsubenefit_uni2)
shiatsuuni_benefits <- unique(c(shiatsubenefit_uni1b,shiatsubenefit_uni2b))
benefits_shiatsu1 <- c(shiatsuuni_benefits, benefits_shiatsu) #

both_shiatsu <- Shiatsu_contraindications %in% benefits_shiatsu1
shiatsu1 <- Shiatsu_contraindications[both_shiatsu] #
Shiatsu_contraindications1 <- Shiatsu_contraindications[both_shiatsu==FALSE] #

Next, the same steps for prenatal massage, using benefits_prenatal and contra_prenatal.

prenatal_split <- strsplit(contra_prenatal,split=' ')
prenatal_split1 <- lapply(prenatal_split, '[',1)
prenatal_split1b <- as.character(prenatal_split1)
prenatal_split2 <- lapply(prenatal_split, '[',2)
prenatal_split2b <- as.character(prenatal_split2)
prenatal_split3 <- lapply(prenatal_split, '[',3)
prenatal_split3b <- as.character(prenatal_split3)
prenatal_bi1 <- paste(prenatal_split1b,prenatal_split2b)
prenatal_bi2 <- paste(prenatal_split2b,prenatal_split3b)
prenatal_bi3 <- paste(prenatal_split3b,prenatal_split1b)
uni_prenatal <- unique(c(prenatal_split1b, prenatal_split2b, prenatal_split3b))
bi_prenatal <- unique(c(prenatal_bi1, prenatal_bi2, prenatal_bi3))
Prenatal_contraindications <- c(contra_prenatal, bi_prenatal, uni_prenatal)
prenatalbenefit_uni <- strsplit(benefits_prenatal, split=" ")
prenatalbenefit_uni1 <- lapply(prenatalbenefit_uni,'[',1)
prenatalbenefit_uni1b <- as.character(prenatalbenefit_uni1)
prenatalbenefit_uni2 <- lapply(prenatalbenefit_uni,'[',2)
prenatalbenefit_uni2b <- as.character(prenatalbenefit_uni2)
prenataluni_benefits <- unique(c(prenatalbenefit_uni1b,prenatalbenefit_uni2b))
benefits_prenatal1 <- c(prenataluni_benefits, benefits_prenatal) #

both_prenatal <- Prenatal_contraindications %in% benefits_prenatal1
prenatal1 <- Prenatal_contraindications[both_prenatal] #
Prenatal_contraindications1 <- Prenatal_contraindications[both_prenatal==FALSE] #

Next, the same steps for myofascial massage, using benefits_myofascial and contra_myofascial.

myofascial_split <- strsplit(contra_myofascial,split=' ')
myofascial_split1 <- lapply(myofascial_split, '[',1)
myofascial_split1b <- as.character(myofascial_split1)
myofascial_split2 <- lapply(myofascial_split, '[',2)
myofascial_split2b <- as.character(myofascial_split2)
myofascial_split3 <- lapply(myofascial_split, '[',3)
myofascial_split3b <- as.character(myofascial_split3)
myofascial_bi1 <- paste(myofascial_split1b,myofascial_split2b)
myofascial_bi2 <- paste(myofascial_split2b,myofascial_split3b)
myofascial_bi3 <- paste(myofascial_split3b,myofascial_split1b)
uni_myofascial <- unique(c(myofascial_split1b, myofascial_split2b, myofascial_split3b))
bi_myofascial <- unique(c(myofascial_bi1, myofascial_bi2, myofascial_bi3))
Myofascial_contraindications <- c(contra_myofascial, bi_myofascial, uni_myofascial)
myofascialbenefit_uni <- strsplit(benefits_myofascial, split=" ")
myofascialbenefit_uni1 <- lapply(myofascialbenefit_uni,'[',1)
myofascialbenefit_uni1b <- as.character(myofascialbenefit_uni1)
myofascialbenefit_uni2 <- lapply(myofascialbenefit_uni,'[',2)
myofascialbenefit_uni2b <- as.character(myofascialbenefit_uni2)
myofascialuni_benefits <- unique(c(myofascialbenefit_uni1b,myofascialbenefit_uni2b))
benefits_myofascial1 <- c(myofascialuni_benefits, benefits_myofascial) #

both_myofascial <- Myofascial_contraindications %in% benefits_myofascial1
myofascial1 <- Myofascial_contraindications[both_myofascial] #
Myofascial_contraindications1 <- Myofascial_contraindications[both_myofascial==FALSE] #

Now that we have, for each massage modality, cleaner lists of contraindication tokens and benefit tokens, plus the tokens common to both, we can start building the massage recommender. It first excludes any modality whose contraindication tokens appear in the user input, then uses the same input to recommend one or more massages based on the tokens that match the benefits of each remaining modality.
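
The per-modality blocks above all repeat the same steps, so they could be collapsed into one helper. The following is only a sketch in base R under stated assumptions: build_modality_tokens() and its argument names are hypothetical, contra is assumed to hold three-word strings and benefits two-word strings, and missing words come back as NA rather than the "NA" string that as.character() produces in the blocks above. It keeps the same pairings as the original code, including the third-word/first-word pair.

```r
# Hypothetical helper: one call per modality instead of one copied block.
build_modality_tokens <- function(contra, benefits) {
  # split each contraindication trigram into its three words
  cw <- strsplit(contra, " ", fixed = TRUE)
  w1 <- vapply(cw, `[`, character(1), 1)
  w2 <- vapply(cw, `[`, character(1), 2)
  w3 <- vapply(cw, `[`, character(1), 3)

  # same pairings as the blocks above: 1-2, 2-3, and 3-1
  bi  <- unique(c(paste(w1, w2), paste(w2, w3), paste(w3, w1)))
  uni <- unique(c(w1, w2, w3))
  contra_all <- c(contra, bi, uni)

  # benefit bigrams plus their single words
  bw <- strsplit(benefits, " ", fixed = TRUE)
  b_uni <- unique(c(vapply(bw, `[`, character(1), 1),
                    vapply(bw, `[`, character(1), 2)))
  benefits_all <- c(b_uni, benefits)

  # tokens found in both lists are set aside, not used as contraindications
  shared <- contra_all %in% benefits_all
  list(benefits          = benefits_all,
       shared            = contra_all[shared],
       contraindications = contra_all[!shared])
}

# toy input, not taken from MassageModalities2.csv
demo <- build_modality_tokens(
  contra   = c("fever rash infection", "nerve pain difficulty"),
  benefits = c("relieve pain", "improve sleep"))
demo$shared   # "pain"
```

With a helper like this, each modality would need one line, and the results could be stored in a named list rather than in separately named variables.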

We start by tokenizing the text the same way as the model that produced these tokens. That was done in Python; we could go back to Python, or we could tokenize in R with the packages used in an earlier text mining script, SentimentAnalysisReviewsMixedBusinessModels2.Rmd (reviewsYelp2 desktop folder): tm, tidytext, textstem, stringr, dplyr, and tidyverse.

We will lemmatize the benefits and the contraindications separately, and also the Description and sideEffects columns for possible later use.

modes <- read.csv('MassageModalities2.csv', sep=',', header=TRUE, na.strings=c('',' ','NA'))
colnames(modes)[1] <- 'modality'

modes$lemmaDescription <- lemmatize_strings(modes$Description, dictionary=lexicon::hash_lemmas)
modes$lemmaBenefits <- lemmatize_strings(modes$benefits, dictionary=lexicon::hash_lemmas)
modes$lemmaContraindications <- lemmatize_strings(modes$contraindications,
                                                  dictionary=lexicon::hash_lemmas)
modes$lemmaSideEffects <- lemmatize_strings(modes$sideEffects, dictionary=lexicon::hash_lemmas)

Then we will take the bigrams of the lemmatized benefits, and trigrams of the lemmatized contraindications.

lemmaTable <- tibble(line=seq_len(nrow(modes)), Modality=modes$modality,  # 456 rows: 19 modalities x 24 copies
                  Description=modes$lemmaDescription,
                  SideEffects=modes$lemmaSideEffects,
                  Benefits=modes$lemmaBenefits,
                  Contraindications=modes$lemmaContraindications
                  )

bigram_df <- lemmaTable %>% unnest_tokens(BenefitsBigram, Benefits, token='ngrams',n=2) 
bigram_df2 <- bigram_df %>% count(BenefitsBigram, sort=TRUE)

bigram_separate <- bigram_df2 %>%
  separate(BenefitsBigram, c('word1','word2'), sep=' ') 

bigram_noStops <- bigram_separate %>%
  filter(!word1 %in% stop_words$word) %>%
  filter(!word2 %in% stop_words$word) 

bigram_counts <- bigram_noStops %>% count(word1,word2,sort=TRUE)



trigram_df <- lemmaTable %>% unnest_tokens(ContraindicationsTrigram,Contraindications,
                                           token='ngrams',n=3)
trigram_df2 <- trigram_df %>% count(ContraindicationsTrigram, sort=TRUE)

trigram_separate <- trigram_df2 %>% separate(ContraindicationsTrigram,
                                             c('word1','word2','word3'), sep=' ')

trigram_noStops <- trigram_separate %>% filter(!word1 %in% stop_words$word)%>%
  filter(!word2 %in% stop_words$word) %>% filter(!word3 %in% stop_words$word)

trigram_noStops_counts <- trigram_noStops %>% count(word1,word2,word3, sort=TRUE)
colnames(bigram_counts)
## [1] "word1" "word2" "n"
bigram_counts
## # A tibble: 163 x 3
##    word1     word2          n
##    <chr>     <chr>      <int>
##  1 ache      improve        1
##  2 ache      relieve        1
##  3 acute     pain           1
##  4 adhesion  heal           1
##  5 adhesion  improve        1
##  6 adhesion  increase       1
##  7 alleviate headache       1
##  8 anxiety   stress         1
##  9 arthritis chronic        1
## 10 arthritis tendonitis     1
## # ... with 153 more rows
colnames(trigram_noStops_counts)
## [1] "word1" "word2" "word3" "n"
trigram_noStops_counts
## # A tibble: 191 x 4
##    word1       word2       word3         n
##    <chr>       <chr>       <chr>     <int>
##  1 acute       cranium     bleed         1
##  2 anemia      blood       disorder      1
##  3 anemia      diabetes    blood         1
##  4 aneurism    history     cancer        1
##  5 aneurism    history     psychosis     1
##  6 application site        fever         1
##  7 arnold      chiari      acute         1
##  8 arthritis   neuropathic pain          1
##  9 autoimmune  disease     psychosis     1
## 10 blood       clot        diabetes      1
## # ... with 181 more rows
quadgram_description <- lemmaTable %>% unnest_tokens(DescriptionQuadgram,Description,
                                           token='ngrams',n=4)
quadgram_description
## # A tibble: 46,944 x 6
##     line Modality  SideEffects    Benefits   Contraindications  DescriptionQuad~
##    <int> <fct>     <chr>          <chr>      <chr>              <chr>           
##  1     1 Swedish ~ if no contrai~ improve t~ dehydration, feve~ traditional spa~
##  2     1 Swedish ~ if no contrai~ improve t~ dehydration, feve~ spa or clinic m~
##  3     1 Swedish ~ if no contrai~ improve t~ dehydration, feve~ or clinic massa~
##  4     1 Swedish ~ if no contrai~ improve t~ dehydration, feve~ clinic massage ~
##  5     1 Swedish ~ if no contrai~ improve t~ dehydration, feve~ massage with ha~
##  6     1 Swedish ~ if no contrai~ improve t~ dehydration, feve~ with hand palm ~
##  7     1 Swedish ~ if no contrai~ improve t~ dehydration, feve~ hand palm elbow~
##  8     1 Swedish ~ if no contrai~ improve t~ dehydration, feve~ palm elbow fore~
##  9     1 Swedish ~ if no contrai~ improve t~ dehydration, feve~ elbow forearm o~
## 10     1 Swedish ~ if no contrai~ improve t~ dehydration, feve~ forearm of mass~
## # ... with 46,934 more rows

The Massage Recommender Program

This program is built in R from the lemmatized and tokenized lists produced in Python 3.

The tidytext package has a great tokenization feature, but unnest_tokens() takes a single n-gram size rather than a vector of sizes such as n=c(1,4), as the Python 3 nltk package does. This can be worked around if needed, either by extracting each n-gram size separately or, as we did earlier with string literals, when filtering the tokenized words for better benefit and contraindication tokens.
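
As an illustration of the workaround, here is a small base R sketch that collects every n-gram size in a range from one string. The function name ngrams_range() and its arguments are hypothetical, and it does no lemmatizing. Recent versions of the tokenizers package, which tidytext relies on for n-grams, also appear to offer an n_min argument; that is worth checking against the installed version rather than taking from this note.

```r
# Hypothetical helper: all n-grams from n_min to n_max words, base R only.
ngrams_range <- function(text, n_min = 1, n_max = 3, stops = character(0)) {
  # lowercase, turn punctuation (except apostrophes) into spaces, split on whitespace
  clean <- gsub("[^[:alnum:][:space:]']", " ", tolower(text))
  words <- strsplit(clean, "[[:space:]]+")[[1]]
  words <- words[nzchar(words)]

  out <- character(0)
  for (n in seq(n_min, n_max)) {
    if (length(words) < n) next
    starts <- seq_len(length(words) - n + 1)
    grams <- vapply(starts,
                    function(i) paste(words[i:(i + n - 1)], collapse = " "),
                    character(1))
    # keep an n-gram only if none of its words is a stop word
    keep <- vapply(strsplit(grams, " ", fixed = TRUE),
                   function(w) !any(w %in% stops), logical(1))
    out <- c(out, grams[keep])
  }
  out
}

all_grams <- ngrams_range("Blood pressure medicine.", 1, 3)
no_stops  <- ngrams_range("history of stroke", 1, 2, stops = "of")
```

Here all_grams holds three unigrams, two bigrams, and one trigram, while no_stops keeps only "history" and "stroke" because both bigrams contain the stop word.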

We aren’t much concerned with counts. The goal was to get the benefit and contraindication tokens from 24 identical samples for each of the 19 massage modalities, so the counts show little if any variance, and all that matters for this script is whether a count is greater than zero. Still, it is useful to have the counts listed above for each ngram.

Right now, the goal is to build the same extracted tokens as we did earlier in Python, then wrap a function around a user input that tokenizes the input and transforms it into the same matrices as in Python. I am not sure at the moment which R packages would do that, and I won’t spend time searching in order to finish this task, so it is tempting to switch back to Python through R’s reticulate library and run the Python script from the RStudio console. But the whole purpose of this section is to build a program that will:

1.) Use input from a user and tokenize the input by:
    a.) lemmatizing the input
    b.) extracting the:
        i.) unigrams
        ii.) bigrams
        iii.) trigrams

2.) Then take those lemmatized ngrams, and:
    a.) compare them to our tokenized list of contraindications for all modalities
    b.) create a list of every modality the input is a contraindication for
    c.) create a list of available massages by:
        i.) removing every modality in 2.b from those the user can choose from
        ii.) keeping the result as a separate list of available modalities

3.) Take the unigram and bigram lemmatized tokens of the user input in 1.b, then:
    a.) compare them to the benefit tokens of the available modalities in 2.c.ii and make a list
    b.) from this list, for any tokens that match the user input tokens:
        i.) add the modality to a list of recommended massage modalities
        ii.) count the matching tokens per modality, so that the modality with the highest count of benefit tokens is selected
        iii.) if there is a tie for the recommended massage modality, output all available massages and:
            - also provide the massage modality description
            - also provide the massage modality side effects
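
The three steps can be sketched end to end in a few lines of base R. This is a minimal illustration, not the function built later in this script: recommend_massage(), the two toy lists, and the rule that a single matching contraindication token blocks a modality are all assumptions made for the example.

```r
# Hypothetical sketch of steps 2 and 3, given tokens from step 1.
recommend_massage <- function(tokens, contra_list, benefit_list) {
  # step 2: block every modality with any contraindication token in the input
  hit <- vapply(contra_list, function(x) any(tokens %in% x), logical(1))
  blocked <- names(contra_list)[hit]
  available <- setdiff(names(benefit_list), blocked)

  # step 3: count benefit-token matches for each available modality
  scores <- vapply(benefit_list[available],
                   function(x) sum(tokens %in% x), integer(1))
  scores <- sort(scores, decreasing = TRUE)

  # ties: keep every modality that shares the top score
  best <- if (length(scores) > 0) names(scores)[scores == max(scores)] else character(0)
  list(blocked = blocked, scores = scores, recommended = best)
}

# toy token lists, not the real modality lists
contra_list  <- list(cupping  = c("leukemia", "blood clot"),
                     swedish  = c("fever"),
                     hotStone = c("sunburn", "stroke"))
benefit_list <- list(cupping  = c("pain", "relax"),
                     swedish  = c("relax", "stress", "headache"),
                     hotStone = c("relax", "pain"))

result <- recommend_massage(c("headache", "stroke", "relax"),
                            contra_list, benefit_list)
result$blocked       # "hotStone"
result$recommended   # "swedish"
```

Note that when no benefit token matches at all, every available modality ties at zero, so a real version would probably want to treat that case separately.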

We will use requests defined in the program itself rather than typed in by a live user.

userInput <- function(input){
    input <- as.character(paste(input))
    input
    
}

print('Please explain what brings you in for a massage today, such as if you want to relax, its a gift, you have a headache, stressed at work, pain in the body, sore workouts, and briefly what your health is like and history of health.')
## [1] "Please explain what brings you in for a massage today, such as if you want to relax, its a gift, you have a headache, stressed at work, pain in the body, sore workouts, and briefly what your health is like and history of health."
input1 <- userInput('I have had headaches off and on, have seasonal allergies, my sleep is good, I have a history of stroke, and am on blood pressure medicine.')
input2 <- userInput('I just want to relax, and get massages regularly at a local massage place. I don\'t have any pre-existing health conditions and like a deep pressure with focus on my upper back. I am over a cold that I had two weeks ago. I have a sunburn on my face, that you could avoid.')
input3 <- userInput('I have leukemia, but want cupping done. My neighbor says it is good for me and I never tried it. Also, I need medium to deep pressure and don\'t want a light massage.')
input1
## [1] "I have had headaches off and on, have seasonal allergies, my sleep is good, I have a history of stroke, and am on blood pressure medicine."
input2
## [1] "I just want to relax, and get massages regularly at a local massage place. I don't have any pre-existing health conditions and like a deep pressure with focus on my upper back. I am over a cold that I had two weeks ago. I have a sunburn on my face, that you could avoid."
input3
## [1] "I have leukemia, but want cupping done. My neighbor says it is good for me and I never tried it. Also, I need medium to deep pressure and don't want a light massage."

The problem with passing the text to the function as a string literal is that escape characters are needed, for example for an apostrophe inside a single-quoted string. With readline() there is no need to worry about that. The following is not run, because eval=FALSE is set in the R chunk header, but it shows what the prompt could look like in a program.

input1 <- readline('What makes you want a massage today? please explain your sleep, stress, health, health history, pressure used to, and last massage.')
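
Two small points may help here, shown as a sketch in base R. Wrapping the text in double quotes avoids the backslash for apostrophes, and because readline() only waits for input in an interactive session, a script can fall back to a preset request when it is knitted. The helper name get_request() and its fallback argument are hypothetical.

```r
# Double quotes let an apostrophe appear without a backslash.
# (R 4.0.0 and later also has raw strings for text with both quote types.)
a <- "I don't want a light massage."
b <- 'I don\'t want a light massage.'
same_text <- identical(a, b)   # TRUE

# Hypothetical helper: prompt only when a person is at the console.
get_request <- function(prompt, fallback) {
  if (interactive()) readline(prompt) else fallback
}

request <- get_request("What brings you in for a massage today? ", fallback = a)
```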

Now to tokenize these user inputs. We need three functions that return the unigrams, bigrams, and trigrams.

uni <- as.character(c(' '))
unitoken <- function(input){
    lemm1 <- lemmatize_strings(input, dictionary=lexicon::hash_lemmas)
    Lemm1 <- tibble(line=1, unigram=lemm1)
    unigram <- unnest_tokens(Lemm1,userUnigram, unigram,token='ngrams',n=1)
    uniNoStops <- filter(unigram, !userUnigram %in% stop_words$word)
    uniNoStops1 <- as.character(paste(uniNoStops$userUnigram))
    uniNoStops1
    
}

unoInput1 <- unitoken(input1)
unoInput1
## [1] "headache" "seasonal" "allergy"  "sleep"    "history"  "stroke"   "blood"   
## [8] "pressure" "medicine"
bitoken <- function(input){
    lemm2 <- lemmatize_strings(input, dictionary=lexicon::hash_lemmas)
    Lemm2 <- tibble(line=1, bigram=lemm2)
    bigram <- unnest_tokens(Lemm2,userBigram, bigram,token='ngrams',n=2)
    bigram_separate <- bigram %>% separate(userBigram,
                                             c('word1','word2'), sep=' ')
    bigram_noStops <- bigram_separate %>% 
        filter(!word1 %in% stop_words$word) %>% 
        filter(!word2 %in% stop_words$word) 
    bigram <- as.character(paste(bigram_noStops$word1,bigram_noStops$word2))
    bigram
}

dosInput1 <- bitoken(input1)
dosInput1
## [1] "seasonal allergy"  "blood pressure"    "pressure medicine"
tritoken <- function(input){
    lemm3 <- lemmatize_strings(input, dictionary=lexicon::hash_lemmas)
    Lemm3 <- tibble(line=1, trigram=lemm3)
    trigram <- unnest_tokens(Lemm3,userTrigram, trigram, token='ngrams',n=3)
    trigram_separate <- trigram %>% separate(userTrigram, c('word1','word2','word3'), sep=' ')
    trigram_noStops <- trigram_separate %>% 
        filter(!word1 %in% stop_words$word) %>% 
        filter(!word2 %in% stop_words$word) %>%
        filter(!word3 %in% stop_words$word)
        
    trigram <- as.character(paste(trigram_noStops$word1,trigram_noStops$word2,
                                  trigram_noStops$word3))
    trigram
}

tresInput1 <- tritoken(input1)
tresInput1
## [1] "blood pressure medicine"

Now combine the unigrams, bigrams, and trigrams into one function.

uniBiTriTokens <- function(input){

    lemm1 <- lemmatize_strings(input, dictionary=lexicon::hash_lemmas)
    Lemm1 <- tibble(line=1, unigram=lemm1)
    unigram <- unnest_tokens(Lemm1,userUnigram, unigram,token='ngrams',n=1)
    uniNoStops <- filter(unigram, !userUnigram %in% stop_words$word)
    ungram <- uniNoStops$userUnigram
    
    Lemm2 <- tibble(line=1, bigram=lemm1)
    bigram <- unnest_tokens(Lemm2,userBigram, bigram,token='ngrams',n=2)
    bigram_separate <- bigram %>% separate(userBigram,
                                             c('word1','word2'), sep=' ')
    bigram_noStops <- bigram_separate %>% 
        filter(!word1 %in% stop_words$word) %>% 
        filter(!word2 %in% stop_words$word) 
    bigram <- as.character(paste(bigram_noStops$word1,bigram_noStops$word2))

    Lemm3 <- tibble(line=1, trigram=lemm1)
    trigram <- unnest_tokens(Lemm3,userTrigram, trigram, token='ngrams',n=3)
    trigram_separate <- trigram %>% separate(userTrigram, c('word1','word2','word3'), sep=' ')
    trigram_noStops <- trigram_separate %>% 
        filter(!word1 %in% stop_words$word) %>% 
        filter(!word2 %in% stop_words$word) %>%
        filter(!word3 %in% stop_words$word)
        
    trigram <- as.character(paste(trigram_noStops$word1,
                                  trigram_noStops$word2,trigram_noStops$word3))
    bt <- append(bigram,trigram, after=length(bigram))
    uniBiTriToken <-append(ungram, bt, after=length(ungram))
    uniBiTriToken
}

unDosTresGram <- uniBiTriTokens(input1)
unDosTresGram
##  [1] "headache"                "seasonal"               
##  [3] "allergy"                 "sleep"                  
##  [5] "history"                 "stroke"                 
##  [7] "blood"                   "pressure"               
##  [9] "medicine"                "seasonal allergy"       
## [11] "blood pressure"          "pressure medicine"      
## [13] "blood pressure medicine"

We now have a function that produces the unigrams, bigrams, and trigrams of a single user input. Next we need to compare this list of preprocessed, lemmatized ngram tokens to the contraindication tokens of each massage modality and build a list of modalities that are not recommended for this user. From there, we will compare the remaining available modalities to the tokens in their benefits, and keep only the modalities recommended for the user based on his or her input.

As an aside, this is probably similar to the automated prompts used by most telecommunications companies and major credit cards like Capital One or Macy’s, where a user might get lost: they know what they want but not how to phrase it in terms of these tokens. That would not be a desired outcome of this model. This is a text-based recommender, though, not one based on voice-to-text. So the better the user describes his or her health and massage goals, the better the results. The same goes for how well our benefit and contraindication tokens are tuned to the way users actually write, since the description of, for example, a heart condition or a symptom such as a fever could vary from user to user. As a refresher, our named lists of contraindications within each modality are:

Our function that separates the user input into unigram, bigram, and trigram tokens is called uniBiTriTokens(), and it takes the user input, as a string, as its only argument. We stored the result for the first input as unDosTresGram. This is the vector we need to compare against every modality’s contraindication list to see whether any of its tokens appear there, and from that build a list of unavailable massages.

unDosTresGram
##  [1] "headache"                "seasonal"               
##  [3] "allergy"                 "sleep"                  
##  [5] "history"                 "stroke"                 
##  [7] "blood"                   "pressure"               
##  [9] "medicine"                "seasonal allergy"       
## [11] "blood pressure"          "pressure medicine"      
## [13] "blood pressure medicine"
CBD_contraindications1
##   [1] "aneurism history cancer"      "autoimmune disease psychosis"
##   [3] "breathing medication could"   "breathing ragged breathing"  
##   [5] "breathing trouble breathing"  "cancer autoimmune disease"   
##   [7] "cannabidiol allergy cbd"      "could interact cannabidiol"  
##   [9] "dehydration fever rash"       "difficulty breathing ragged" 
##  [11] "disease psychosis numbness"   "disorder nausea epilepsy"    
##  [13] "epilepsy pregnant aneurism"   "fever rash infection"        
##  [15] "history cancer autoimmune"    "infection mental disorder"   
##  [17] "interact cannabidiol allergy" "limb limb numbness"          
##  [19] "limb numbness nerve"          "medication could interact"   
##  [21] "mental disorder nausea"       "nausea epilepsy pregnant"    
##  [23] "nerve pain difficulty"        "nerve tingling nerve"        
##  [25] "numbness limb limb"           "numbness nerve tingling"     
##  [27] "pain difficulty breathing"    "pregnant aneurism history"   
##  [29] "psychosis numbness limb"      "ragged breathing trouble"    
##  [31] "rash infection mental"        "tingling nerve pain"         
##  [33] "trouble breathing medication" "aneurism history"            
##  [35] "autoimmune disease"           "breathing medication"        
##  [37] "breathing ragged"             "breathing trouble"           
##  [39] "cancer autoimmune"            "cannabidiol allergy"         
##  [41] "could interact"               "dehydration fever"           
##  [43] "difficulty breathing"         "disease psychosis"           
##  [45] "disorder nausea"              "epilepsy pregnant"           
##  [47] "fever rash"                   "history cancer"              
##  [49] "infection mental"             "interact cannabidiol"        
##  [51] "limb limb"                    "limb numbness"               
##  [53] "medication could"             "mental disorder"             
##  [55] "nausea epilepsy"              "nerve tingling"              
##  [57] "numbness limb"                "numbness nerve"              
##  [59] "pain difficulty"              "pregnant aneurism"           
##  [61] "psychosis numbness"           "ragged breathing"            
##  [63] "rash infection"               "tingling nerve"              
##  [65] "trouble breathing"            "allergy cbd"                 
##  [67] "cancer aneurism"              "psychosis autoimmune"        
##  [69] "could breathing"              "breathing breathing"         
##  [71] "disease cancer"               "cbd cannabidiol"             
##  [73] "cannabidiol could"            "rash dehydration"            
##  [75] "ragged difficulty"            "numbness disease"            
##  [77] "epilepsy disorder"            "aneurism epilepsy"           
##  [79] "infection fever"              "autoimmune history"          
##  [81] "disorder infection"           "allergy interact"            
##  [83] "nerve limb"                   "interact medication"         
##  [85] "nausea mental"                "pregnant nausea"             
##  [87] "difficulty nerve"             "nerve nerve"                 
##  [89] "tingling numbness"            "breathing pain"              
##  [91] "history pregnant"             "limb psychosis"              
##  [93] "trouble ragged"               "mental rash"                 
##  [95] "pain tingling"                "medication trouble"          
##  [97] "aneurism"                     "autoimmune"                  
##  [99] "breathing"                    "cancer"                      
## [101] "cannabidiol"                  "could"                       
## [103] "dehydration"                  "difficulty"                  
## [105] "disease"                      "disorder"                    
## [107] "epilepsy"                     "fever"                       
## [109] "history"                      "infection"                   
## [111] "interact"                     "limb"                        
## [113] "medication"                   "mental"                      
## [115] "nausea"                       "numbness"                    
## [117] "pregnant"                     "psychosis"                   
## [119] "ragged"                       "rash"                        
## [121] "tingling"                     "trouble"                     
## [123] "allergy"                      "cbd"

The above shows the list of tokens in the user input, then the list of contraindications for CBD as a massage modality.

testCBD_contra <- ifelse(unDosTresGram %in%  CBD_contraindications1,1,0)
l <- length(unDosTresGram)
s <- sum(testCBD_contra)
l
## [1] 13
s
## [1] 2
notlikelyCBD <- round(s/l,2)
notlikelyCBD
## [1] 0.15
unDosTresGram[testCBD_contra>0]
## [1] "allergy" "history"
input1
## [1] "I have had headaches off and on, have seasonal allergies, my sleep is good, I have a history of stroke, and am on blood pressure medicine."
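
The score above is 2 matching tokens out of 13, or about 0.15. It can also be useful to see whether the matches are single words or longer phrases, since a single word such as "history" carries little context on its own. The sketch below repeats the calculation in base R with the 13 tokens printed above and only a handful of entries copied from the CBD list, so cbd_subset is an illustrative subset, not the full list.

```r
tokens <- c("headache", "seasonal", "allergy", "sleep", "history", "stroke",
            "blood", "pressure", "medicine", "seasonal allergy",
            "blood pressure", "pressure medicine", "blood pressure medicine")

# a few entries copied from the CBD list above, for illustration only
cbd_subset <- c("aneurism history", "cannabidiol allergy", "allergy cbd",
                "history", "allergy", "cancer", "fever")

hits  <- tokens[tokens %in% cbd_subset]
share <- round(length(hits) / length(tokens), 2)   # 2 / 13, rounded

# number of words in each matching token
words_per_hit  <- lengths(strsplit(hits, " ", fixed = TRUE))
multiword_hits <- hits[words_per_hit > 1]
```

Both matches here are unigrams and multiword_hits is empty, which suggests the 0.15 comes from isolated words rather than from a phrase the user actually wrote.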

Let’s look at the contraindications for CBD to see how the input compares.

cbd <- subset(modes, modes$modality=='Cannabidiol (CBD) Massage Balm')[1,4]
cbd
## [1] dehydration,  fever,  rashes,  infection,  some mental disorders,  nausea,  epilepsy,  pregnant,  aneurism history,  cancer,  autoimmune disease, psychosis,   numbness in limbs,  limb numbness,  nerve tingling,  nerve pain,  difficulty breathing,  ragged breathing,  trouble breathing,  medications that could interact with cannabidiol,  allergies to CBD
## 19 Levels: dehydration,  asthma,  allergies to fragrances,  epilepsy,  fever,  infection,  heart condition,  neuropathy,  pregnant,  breast feeding,  open wounds,  skin rashes,  eczema,  psoriasis,  sunburn,  sensitive skin,  sensitive to smells,  pregnant,  aneurism history,  cancer, psychosis,   numbness in limbs,  limb numbness,  nerve tingling,  nerve pain,  difficulty breathing,  ragged breathing,  trouble breathing,   ...
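
One detail stands out when the raw text is set beside the token list. The raw text reads "medications that could interact with cannabidiol", while the CBD list contains "medication could interact" and "could interact cannabidiol", so those tokens appear to have been formed after the stop words were removed. The R functions above form the ngrams first and then drop any ngram containing a stop word. The sketch below, in base R, shows how the two orders differ; the short stops vector is a hypothetical stand-in for stop_words$word, and make_ngrams() is a helper written only for this example.

```r
# hypothetical stand-in for tidytext's stop_words$word
stops <- c("that", "with", "of", "in", "to")

# helper for this example: consecutive n-word strings
make_ngrams <- function(words, n) {
  if (length(words) < n) return(character(0))
  vapply(seq_len(length(words) - n + 1),
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

words <- c("medication", "that", "could", "interact", "with", "cannabidiol")

# order A: trigrams first, then drop any trigram containing a stop word
tri_all <- make_ngrams(words, 3)
keep <- vapply(strsplit(tri_all, " ", fixed = TRUE),
               function(w) !any(w %in% stops), logical(1))
tri_a <- tri_all[keep]

# order B: drop stop words first, then form trigrams
tri_b <- make_ngrams(words[!words %in% stops], 3)
```

Under these assumptions tri_a is empty, while tri_b gives the two trigrams found in the CBD list, so a user who typed the phrase exactly as written in the source data would only match under order B.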

Let’s create a function that does this for all 19 massage modalities. For each modality it returns the share of the user’s tokens that match that modality’s contraindication list, as a proportion between 0 and 1 rounded to two decimals. Its input is the stored output of uniBiTriTokens(), and we will name the function notRecommended().
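
Before the full function, here is a compact sketch of the same calculation in base R. It assumes the contraindication vectors have been gathered into one named list, which is not how they are stored in this script; the name not_recommended_share() and the toy lists are hypothetical. It also returns zeros for an empty input, a case in which dividing by the token count would otherwise give NaN.

```r
# Hypothetical sketch: share of user tokens found in each modality's list.
not_recommended_share <- function(tokens, contra_list) {
  if (length(tokens) == 0) {
    # no tokens to compare, so report zero for every modality
    return(vapply(contra_list, function(x) 0, numeric(1)))
  }
  vapply(contra_list,
         function(x) round(mean(tokens %in% x), 2),
         numeric(1))
}

# toy lists, not the real modality lists
toy_contra <- list(cbd      = c("fever", "rash"),
                   hotStone = c("stroke", "fever", "sunburn"))

shares <- not_recommended_share(c("fever", "relax", "stroke", "pain"), toy_contra)
shares   # cbd 0.25, hotStone 0.50
```

Because the result is a named numeric vector, it can be sorted or turned into a table directly.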

notRecommended <- function(uniBiTriTokensOutputVar){
    
        testCBD <- ifelse(uniBiTriTokensOutputVar %in%  CBD_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testCBD)
        cbd <- round(s/l,2)

        testCranio <- ifelse(uniBiTriTokensOutputVar %in%  Cranio_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testCranio)
        craniosacral <- round(s/l,2)

        testReflex <- ifelse(uniBiTriTokensOutputVar %in%  Reflex_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testReflex)
        reflexology <- round(s/l,2)

        testLymph <- ifelse(uniBiTriTokensOutputVar %in%  Lymph_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testLymph)
        lymphatic <- round(s/l,2)

        testMGN <- ifelse(uniBiTriTokensOutputVar %in%  Mgn_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testMGN)
        massagegun <- round(s/l,2)

        testIASTM <- ifelse(uniBiTriTokensOutputVar %in%  Instrument_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testIASTM)
        IASTM <- round(s/l,2)

        testDT <- ifelse(uniBiTriTokensOutputVar %in%  DT_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testDT)
        deepTissue <- round(s/l,2)

        testSwedish <- ifelse(uniBiTriTokensOutputVar %in% swedish_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testSwedish)
        swedish <- round(s/l,2)

        testAroma <- ifelse(uniBiTriTokensOutputVar %in%  Aroma_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testAroma)
        aromatherapy <- round(s/l,2)

        testStretch <- ifelse(uniBiTriTokensOutputVar %in%  stretch_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testStretch)
        stretch <- round(s/l,2)

        testTPT <- ifelse(uniBiTriTokensOutputVar %in%  TPT_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testTPT)
        triggerPoint <- round(s/l,2)

        testHS <- ifelse(uniBiTriTokensOutputVar %in%  HotStone_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testHS)
        hotStone <- round(s/l,2)

        testCups <- ifelse(uniBiTriTokensOutputVar %in%  Cupping_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testCups)
        cupping <- round(s/l,2)

        testSports <- ifelse(uniBiTriTokensOutputVar %in%  Sports_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testSports)
        sports <- round(s/l,2)

        testFreeze <- ifelse(uniBiTriTokensOutputVar %in%  Freeze_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testFreeze)
        biofreeze <- round(s/l,2)

        testCold <- ifelse(uniBiTriTokensOutputVar %in%  cold_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testCold)
        coldStone <- round(s/l,2)

        testShiatsu <- ifelse(uniBiTriTokensOutputVar %in%  Shiatsu_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testShiatsu)
        shiatsu <- round(s/l,2)

        testPreg <- ifelse(uniBiTriTokensOutputVar %in%  Prenatal_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testPreg)
        prenatal <- round(s/l,2)

        testMyo <- ifelse(uniBiTriTokensOutputVar %in%  Myofascial_contraindications1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testMyo)
        myofascial <- round(s/l,2)
        
        names <- c("Cannabidiol (CBD) Massage Balm","Craniosacral Massage","Reflexology Massage","Lymphatic Drainage Massage","Massage Gun Therapy","Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage", "Deep tissue Massage","Swedish Massage","Aromatherapy","Stretching","Trigger Point Therapy","Hot Stone Therapy Massage", "Cupping Therapy","Sports Massage","Biofreeze Muscle Pain Relief Gel","Cold Stone Therapy","Shiatsu Massage","Prenatal Massage","Myofascial Massage")
        
        probs <- c(cbd,craniosacral,reflexology,lymphatic,massagegun,IASTM,deepTissue,
                   swedish,aromatherapy,stretch,triggerPoint,hotStone,cupping,sports,
                   biofreeze,coldStone,shiatsu,prenatal,myofascial)
        
        probabilities <- as.data.frame(cbind(names,probs))
        colnames(probabilities) <- c('modality','probability')
        probabilities$probability <- as.numeric(paste(probabilities$probability))
        
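        # Keep only the modalities that score above the lowest score;
        # these are the ones treated as not recommended.
        probabilities1 <- subset(probabilities,
                                 probabilities$probability>min(probabilities$probability))
        probabilities1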
}
nr <- notRecommended(unDosTresGram)
nr
##                                                                 modality
## 1                                         Cannabidiol (CBD) Massage Balm
## 2                                                   Craniosacral Massage
## 5                                                    Massage Gun Therapy
## 6  Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage
## 7                                                    Deep tissue Massage
## 9                                                           Aromatherapy
## 11                                                 Trigger Point Therapy
## 12                                             Hot Stone Therapy Massage
## 13                                                       Cupping Therapy
## 14                                                        Sports Massage
## 15                                      Biofreeze Muscle Pain Relief Gel
## 16                                                    Cold Stone Therapy
## 17                                                       Shiatsu Massage
## 18                                                      Prenatal Massage
## 19                                                    Myofascial Massage
##    probability
## 1         0.15
## 2         0.38
## 5         0.31
## 6         0.31
## 7         0.31
## 9         0.15
## 11        0.31
## 12        0.31
## 13        0.31
## 14        0.31
## 15        0.38
## 16        0.31
## 17        0.31
## 18        0.15
## 19        0.31
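The filter at the end of notRecommended() keeps every modality whose score is above the lowest score. The sketch below uses invented scores to show how that rule behaves, including the edge case where all scores tie and nothing is flagged. The nonzero-score rule at the end is only a possible alternative, not the rule used in this script.

```r
# Toy scores (invented for illustration)
toy <- data.frame(modality = c("A", "B", "C"),
                  probability = c(0.00, 0.15, 0.31),
                  stringsAsFactors = FALSE)

# Same rule as in notRecommended(): keep rows above the minimum score
flagged <- subset(toy, probability > min(probability))
flagged$modality  # "B" "C"

# Edge case: every modality ties, so no row exceeds the minimum
tied <- data.frame(modality = c("A", "B"),
                   probability = c(0.15, 0.15),
                   stringsAsFactors = FALSE)
flaggedTied <- subset(tied, probability > min(probability))
nrow(flaggedTied)  # 0

# Possible alternative (an assumption): flag any modality with a nonzero score
flaggedAlt <- subset(tied, probability > 0)
nrow(flaggedAlt)  # 2
```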

Now that we have the list of massages not to recommend for this user’s input, we can compare the user’s input tokens to the benefits tokens of each massage modality that is not in the not-recommended list.

NR <- as.character(paste(nr$modality))
NR
##  [1] "Cannabidiol (CBD) Massage Balm"                                       
##  [2] "Craniosacral Massage"                                                 
##  [3] "Massage Gun Therapy"                                                  
##  [4] "Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage"
##  [5] "Deep tissue Massage"                                                  
##  [6] "Aromatherapy"                                                         
##  [7] "Trigger Point Therapy"                                                
##  [8] "Hot Stone Therapy Massage"                                            
##  [9] "Cupping Therapy"                                                      
## [10] "Sports Massage"                                                       
## [11] "Biofreeze Muscle Pain Relief Gel"                                     
## [12] "Cold Stone Therapy"                                                   
## [13] "Shiatsu Massage"                                                      
## [14] "Prenatal Massage"                                                     
## [15] "Myofascial Massage"
modesList <- as.character(paste(unique(modes$modality)))
availModes <- !modesList %in% NR
availModes
##  [1]  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
available <- modesList[availModes]
available
## [1] "Swedish Massage"            "Lymphatic Drainage Massage"
## [3] "Reflexology Massage"        "Stretching"
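The exclusion step above can also be written with base R's setdiff(), sketched here on invented vectors. The names modesToy and notRecToy are hypothetical stand-ins for modesList and NR. Note that setdiff() also drops duplicates, which makes no difference when the input is already unique.

```r
# Toy stand-ins for modesList and NR (illustrative only)
modesToy <- c("Swedish Massage", "Deep tissue Massage", "Stretching")
notRecToy <- c("Deep tissue Massage")

# Logical-index form, as used in the script
viaIndex <- modesToy[!modesToy %in% notRecToy]

# Equivalent set-difference form; keeps the order of the first argument
viaSetdiff <- setdiff(modesToy, notRecToy)
viaSetdiff  # "Swedish Massage" "Stretching"
```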

Now that we have our available massage modalities, let’s find the best matches in this list to recommend for our user’s input tokens. The stored uniBiTriTokens output is used again. Inside the function modesRecommended(), a score is calculated for each of the 19 modalities from its filtered benefits list. The function then returns the available massages (those not in our notRecommended list), ordered so that the modalities whose benefits match more of the user’s input tokens come first. We will modify the function above, replacing the contraindications for each modality with the benefits. The benefits lists built earlier are:

- benefits_cbd1
- benefits_cranio1
- benefits_reflex1
- benefits_lymph1
- benefits_mgn1
- benefits_tpt1
- benefits_instrument1
- benefits_DT1
- benefits_swedish1
- benefits_aroma1
- benefits_stretch1
- benefits_cold1
- benefits_freeze1
- benefits_sports1
- benefits_cup1
- benefits_HotStone1
- benefits_shiatsu1
- benefits_prenatal1
- benefits_myofascial1

The stored variable for our input, passed in as uniBiTriTokensOutputVar, is unDosTresGram.

modesRecommended <- function(uniBiTriTokensOutputVar){
    
        testCBD <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cbd1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testCBD)
        cbd <- round(s/l,2)

        testCranio <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cranio1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testCranio)
        craniosacral <- round(s/l,2)

        testReflex <- ifelse(uniBiTriTokensOutputVar %in%  benefits_reflex1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testReflex)
        reflexology <- round(s/l,2)

        testLymph <- ifelse(uniBiTriTokensOutputVar %in%  benefits_lymph1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testLymph)
        lymphatic <- round(s/l,2)

        testMGN <- ifelse(uniBiTriTokensOutputVar %in%  benefits_mgn1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testMGN)
        massagegun <- round(s/l,2)

        testIASTM <- ifelse(uniBiTriTokensOutputVar %in%  benefits_instrument1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testIASTM)
        IASTM <- round(s/l,2)

        testDT <- ifelse(uniBiTriTokensOutputVar %in%  benefits_DT1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testDT)
        deepTissue <- round(s/l,2)

        testSwedish <- ifelse(uniBiTriTokensOutputVar %in% benefits_swedish1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testSwedish)
        swedish <- round(s/l,2)

        testAroma <- ifelse(uniBiTriTokensOutputVar %in%  benefits_aroma1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testAroma)
        aromatherapy <- round(s/l,2)

        testStretch <- ifelse(uniBiTriTokensOutputVar %in%  benefits_stretch1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testStretch)
        stretch <- round(s/l,2)

        testTPT <- ifelse(uniBiTriTokensOutputVar %in%  benefits_tpt1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testTPT)
        triggerPoint <- round(s/l,2)

        testHS <- ifelse(uniBiTriTokensOutputVar %in%  benefits_HotStone1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testHS)
        hotStone <- round(s/l,2)

        testCups <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cup1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testCups)
        cupping <- round(s/l,2)

        testSports <- ifelse(uniBiTriTokensOutputVar %in%  benefits_sports1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testSports)
        sports <- round(s/l,2)

        testFreeze <- ifelse(uniBiTriTokensOutputVar %in%  benefits_freeze1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testFreeze)
        biofreeze <- round(s/l,2)

        testCold <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cold1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testCold)
        coldStone <- round(s/l,2)

        testShiatsu <- ifelse(uniBiTriTokensOutputVar %in%  benefits_shiatsu1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testShiatsu)
        shiatsu <- round(s/l,2)

        testPreg <- ifelse(uniBiTriTokensOutputVar %in%  benefits_prenatal1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testPreg)
        prenatal <- round(s/l,2)

        testMyo <- ifelse(uniBiTriTokensOutputVar %in%  benefits_myofascial1,1,0)
        l <- length(uniBiTriTokensOutputVar)
        s <- sum(testMyo)
        myofascial <- round(s/l,2)
        
        names <- c("Cannabidiol (CBD) Massage Balm","Craniosacral Massage","Reflexology Massage","Lymphatic Drainage Massage","Massage Gun Therapy","Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage", "Deep tissue Massage","Swedish Massage","Aromatherapy","Stretching","Trigger Point Therapy","Hot Stone Therapy Massage", "Cupping Therapy","Sports Massage","Biofreeze Muscle Pain Relief Gel","Cold Stone Therapy","Shiatsu Massage","Prenatal Massage","Myofascial Massage")
        
        probs <- c(cbd,craniosacral,reflexology,lymphatic,massagegun,IASTM,deepTissue,
                   swedish,aromatherapy,stretch,triggerPoint,hotStone,cupping,sports,
                   biofreeze,coldStone,shiatsu,prenatal,myofascial)
        
        probabilities <- as.data.frame(cbind(names,probs))
        colnames(probabilities) <- c('modality','probability')
        probabilities$probability <- as.numeric(paste(probabilities$probability))
        
        # 'available' is the global vector created above, not an argument of this function
        probabilities0 <- probabilities %>% filter(probabilities$modality %in% available)
        
        probabilities01 <- probabilities0[order(probabilities0$probability, decreasing=TRUE),]
        print(probabilities01)
        
        probs <- as.character(paste(probabilities01$modality))
        
        print('Your top choices for your recommended massage modalities based on your user input are: ')
        # print(probs[1])
        # print(probs[2])
        # print(probs[3])
        # print('Thank you.')
        probs[1:3]
        
}
reccMode <- modesRecommended(unDosTresGram)
##                     modality probability
## 1        Reflexology Massage        0.08
## 3            Swedish Massage        0.08
## 4                 Stretching        0.08
## 2 Lymphatic Drainage Massage        0.00
## [1] "Your top choices for your recommended massage modalities based on your user input are: "
reccMode
## [1] "Reflexology Massage" "Swedish Massage"     "Stretching"
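Because the same three lines are repeated for every modality, the scoring can be sketched more compactly by keeping the token lists in a named list and looping over it. This is a minimal sketch under stated assumptions, not the function used above: toyBenefits, toyTokens, toyAvailable, and scoreModalities() are all hypothetical, and the token values are invented.

```r
# Hypothetical token lists standing in for benefits_swedish1, benefits_stretch1, ...
toyBenefits <- list(
  "Swedish Massage"     = c("relax", "stress relief", "circulation"),
  "Stretching"          = c("flexibility", "range motion"),
  "Reflexology Massage" = c("relax", "foot pain")
)
toyTokens <- c("relax", "foot pain", "sleep", "flexibility")
toyAvailable <- c("Swedish Massage", "Reflexology Massage")

# Score every modality: share of user tokens found in its token list
scoreModalities <- function(tokens, tokenLists) {
  vapply(tokenLists,
         function(listTokens) round(mean(tokens %in% listTokens), 2),
         numeric(1))
}

scores <- scoreModalities(toyTokens, toyBenefits)

# Keep only the available modalities, rank them, and take up to three names
ranked <- sort(scores[names(scores) %in% toyAvailable], decreasing = TRUE)
topChoices <- head(names(ranked), 3)
topChoices  # "Reflexology Massage" "Swedish Massage"
```

With this layout, adding a modality means adding one entry to the list rather than another block of code.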

Let’s clean this up, without using any styling or design packages, by combining the steps into a single function that can work in an online app.

Version 1

massageModalityRecommender <- function(input){
  
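  # Lemmatize the input, then build unigram, bigram, and trigram tokens,
  # dropping any n-gram that contains a stop word.
  lemm1 <- lemmatize_strings(input, dictionary=lexicon::hash_lemmas)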
  Lemm1 <- tibble(line=1, unigram=lemm1)
  unigram <- unnest_tokens(Lemm1,userUnigram, unigram,token='ngrams',n=1)
  uniNoStops <- filter(unigram, !userUnigram %in% stop_words$word)
  ungram <- uniNoStops$userUnigram
  
  Lemm2 <- tibble(line=1, bigram=lemm1)
  bigram <- unnest_tokens(Lemm2,userBigram, bigram,token='ngrams',n=2)
  bigram_separate <- bigram %>% separate(userBigram,
                                         c('word1','word2'), sep=' ')
  bigram_noStops <- bigram_separate %>% 
    filter(!word1 %in% stop_words$word) %>% 
    filter(!word2 %in% stop_words$word) 
  bigram <- as.character(paste(bigram_noStops$word1,bigram_noStops$word2))
  
  Lemm3 <- tibble(line=1, trigram=lemm1)
  trigram <- unnest_tokens(Lemm3,userTrigram, trigram, token='ngrams',n=3)
  trigram_separate <- trigram %>% separate(userTrigram, c('word1','word2','word3'), sep=' ')
  trigram_noStops <- trigram_separate %>% 
    filter(!word1 %in% stop_words$word) %>% 
    filter(!word2 %in% stop_words$word) %>%
    filter(!word3 %in% stop_words$word)
  
  trigram <- as.character(paste(trigram_noStops$word1,
                                trigram_noStops$word2,trigram_noStops$word3))
  bt <- append(bigram,trigram, after=length(bigram))
  uniBiTriToken <-append(ungram, bt, after=length(ungram))
  uniBiTriTokensOutputVar <- uniBiTriToken

  testCBD <- ifelse(uniBiTriTokensOutputVar %in%  CBD_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCBD)
  cbd <- round(s/l,2)
  
  testCranio <- ifelse(uniBiTriTokensOutputVar %in%  Cranio_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCranio)
  craniosacral <- round(s/l,2)
  
  testReflex <- ifelse(uniBiTriTokensOutputVar %in%  Reflex_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testReflex)
  reflexology <- round(s/l,2)
  
  testLymph <- ifelse(uniBiTriTokensOutputVar %in%  Lymph_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testLymph)
  lymphatic <- round(s/l,2)
  
  testMGN <- ifelse(uniBiTriTokensOutputVar %in%  Mgn_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testMGN)
  massagegun <- round(s/l,2)
  
  testIASTM <- ifelse(uniBiTriTokensOutputVar %in%  Instrument_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testIASTM)
  IASTM <- round(s/l,2)
  
  testDT <- ifelse(uniBiTriTokensOutputVar %in%  DT_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testDT)
  deepTissue <- round(s/l,2)
  
  testSwedish <- ifelse(uniBiTriTokensOutputVar %in% swedish_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testSwedish)
  swedish <- round(s/l,2)
  
  testAroma <- ifelse(uniBiTriTokensOutputVar %in%  Aroma_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testAroma)
  aromatherapy <- round(s/l,2)
  
  testStretch <- ifelse(uniBiTriTokensOutputVar %in%  stretch_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testStretch)
  stretch <- round(s/l,2)
  
  testTPT <- ifelse(uniBiTriTokensOutputVar %in%  TPT_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testTPT)
  triggerPoint <- round(s/l,2)
  
  testHS <- ifelse(uniBiTriTokensOutputVar %in%  HotStone_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testHS)
  hotStone <- round(s/l,2)
  
  testCups <- ifelse(uniBiTriTokensOutputVar %in%  Cupping_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCups)
  cupping <- round(s/l,2)
  
  testSports <- ifelse(uniBiTriTokensOutputVar %in%  Sports_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testSports)
  sports <- round(s/l,2)
  
  testFreeze <- ifelse(uniBiTriTokensOutputVar %in%  Freeze_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testFreeze)
  biofreeze <- round(s/l,2)
  
  testCold <- ifelse(uniBiTriTokensOutputVar %in%  cold_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCold)
  coldStone <- round(s/l,2)
  
  testShiatsu <- ifelse(uniBiTriTokensOutputVar %in%  Shiatsu_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testShiatsu)
  shiatsu <- round(s/l,2)
  
  testPreg <- ifelse(uniBiTriTokensOutputVar %in%  Prenatal_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testPreg)
  prenatal <- round(s/l,2)
  
  testMyo <- ifelse(uniBiTriTokensOutputVar %in%  Myofascial_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testMyo)
  myofascial <- round(s/l,2)
  
  names <- c("Cannabidiol (CBD) Massage Balm","Craniosacral Massage","Reflexology Massage","Lymphatic Drainage Massage","Massage Gun Therapy","Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage", "Deep tissue Massage","Swedish Massage","Aromatherapy","Stretching","Trigger Point Therapy","Hot Stone Therapy Massage", "Cupping Therapy","Sports Massage","Biofreeze Muscle Pain Relief Gel","Cold Stone Therapy","Shiatsu Massage","Prenatal Massage","Myofascial Massage")
  
  probs <- c(cbd,craniosacral,reflexology,lymphatic,massagegun,IASTM,deepTissue,
             swedish,aromatherapy,stretch,triggerPoint,hotStone,cupping,sports,
             biofreeze,coldStone,shiatsu,prenatal,myofascial)
  
  probabilities <- as.data.frame(cbind(names,probs))
  colnames(probabilities) <- c('modality','probability')
  probabilities$probability <- as.numeric(paste(probabilities$probability))
  
  probabilities1 <- subset(probabilities,
                           probabilities$probability>min(probabilities$probability))
  
  NR <- as.character(paste(probabilities1$modality))
  availModes <- !names %in% NR
  available <- names[availModes]

  testCBD1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cbd1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCBD1)
  cbd1 <- round(s/l,2)
  
  testCranio1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cranio1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCranio1)
  craniosacral1 <- round(s/l,2)
  
  testReflex1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_reflex1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testReflex1)
  reflexology1 <- round(s/l,2)
  
  testLymph1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_lymph1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testLymph1)
  lymphatic1 <- round(s/l,2)
  
  testMGN1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_mgn1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testMGN1)
  massagegun1 <- round(s/l,2)
  
  testIASTM1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_instrument1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testIASTM1)
  IASTM1 <- round(s/l,2)
  
  testDT1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_DT1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testDT1)
  deepTissue1 <- round(s/l,2)
  
  testSwedish1 <- ifelse(uniBiTriTokensOutputVar %in% benefits_swedish1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testSwedish1)
  swedish1 <- round(s/l,2)
  
  testAroma1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_aroma1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testAroma1)
  aromatherapy1 <- round(s/l,2)
  
  testStretch1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_stretch1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testStretch1)
  stretch1 <- round(s/l,2)
  
  testTPT1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_tpt1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testTPT1)
  triggerPoint1 <- round(s/l,2)
  
  testHS1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_HotStone1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testHS1)
  hotStone1 <- round(s/l,2)
  
  testCups1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cup1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCups1)
  cupping1 <- round(s/l,2)
  
  testSports1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_sports1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testSports1)
  sports1 <- round(s/l,2)
  
  testFreeze1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_freeze1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testFreeze1)
  biofreeze1 <- round(s/l,2)
  
  testCold1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cold1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCold1)
  coldStone1 <- round(s/l,2)
  
  testShiatsu1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_shiatsu1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testShiatsu1)
  shiatsu1 <- round(s/l,2)
  
  testPreg1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_prenatal1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testPreg1)
  prenatal1 <- round(s/l,2)
  
  testMyo1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_myofascial1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testMyo1)
  myofascial1 <- round(s/l,2)
  
  names1 <- c("Cannabidiol (CBD) Massage Balm","Craniosacral Massage","Reflexology Massage","Lymphatic Drainage Massage","Massage Gun Therapy","Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage", "Deep tissue Massage","Swedish Massage","Aromatherapy","Stretching","Trigger Point Therapy","Hot Stone Therapy Massage", "Cupping Therapy","Sports Massage","Biofreeze Muscle Pain Relief Gel","Cold Stone Therapy","Shiatsu Massage","Prenatal Massage","Myofascial Massage")
  
  probs <- c(cbd1,craniosacral1,reflexology1,lymphatic1,massagegun1,IASTM1,deepTissue1,
             swedish1,aromatherapy1,stretch1,triggerPoint1,hotStone1,cupping1,sports1,
             biofreeze1,coldStone1,shiatsu1,prenatal1,myofascial1)
  
  probabilities <- as.data.frame(cbind(names1,probs))
  colnames(probabilities) <- c('modality','probability')
  probabilities$probability <- as.numeric(paste(probabilities$probability))
  
  probabilities0 <- probabilities %>% filter(probabilities$modality %in% available)
  
  probabilities01 <- probabilities0[order(probabilities0$probability, decreasing=TRUE),]
  print(probabilities01)
  
  probs <- as.character(paste(probabilities01$modality))
  
  print('Your top choices for your recommended massage modalities based on your user input are: ')
  probs[1:3]
  
}
massageModalityRecommender(input1)
##                     modality probability
## 1        Reflexology Massage        0.08
## 3            Swedish Massage        0.08
## 4                 Stretching        0.08
## 2 Lymphatic Drainage Massage        0.00
## [1] "Your top choices for your recommended massage modalities based on your user input are: "
## [1] "Reflexology Massage" "Swedish Massage"     "Stretching"

Great! The function works as planned.
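One caveat about the final line, probs[1:3]: when fewer than three modalities remain available, indexing past the end of a vector pads the result with NA. The sketch below shows this on an invented vector. Using head() is a possible adjustment, not something the function above does.

```r
# Only two modalities remain in this invented example
fewModes <- c("Swedish Massage", "Stretching")

padded <- fewModes[1:3]     # third element is NA
safe <- head(fewModes, 3)   # returns only the two that exist

padded
safe
```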


Version 2

This version is the same function as above, but the return value also shows each recommended massage modality’s description, benefits, and side effects. We will use the modesUnique table to pull the description, benefits, and side effects for the recommended modalities only.
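The lookup described here can be sketched as follows. The table toyModesUnique is a hypothetical stand-in for modesUnique: its column names and placeholder values are assumptions made for illustration, and the real table may be laid out differently.

```r
# Hypothetical stand-in for modesUnique; column names and values are assumptions
toyModesUnique <- data.frame(
  modality    = c("Swedish Massage", "Stretching", "Cupping Therapy"),
  description = c("desc A", "desc B", "desc C"),
  benefits    = c("ben A", "ben B", "ben C"),
  sideEffects = c("side A", "side B", "side C"),
  stringsAsFactors = FALSE
)

# Ranked recommendations, best first (invented for illustration)
recommended <- c("Stretching", "Swedish Massage")

# match() returns row positions in the order of 'recommended',
# so the details come back in ranking order
details <- toyModesUnique[match(recommended, toyModesUnique$modality), ]
details$modality  # "Stretching" "Swedish Massage"
```

Using match() rather than %in% keeps the rows in ranking order instead of table order.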

massageModalityRecommender2 <- function(input){
  
  lemm1 <- lemmatize_strings(input, dictionary=lexicon::hash_lemmas)
  Lemm1 <- tibble(line=1, unigram=lemm1)
  unigram <- unnest_tokens(Lemm1,userUnigram, unigram,token='ngrams',n=1)
  uniNoStops <- filter(unigram, !userUnigram %in% stop_words$word)
  ungram <- uniNoStops$userUnigram
  
  Lemm2 <- tibble(line=1, bigram=lemm1)
  bigram <- unnest_tokens(Lemm2,userBigram, bigram,token='ngrams',n=2)
  bigram_separate <- bigram %>% separate(userBigram,
                                         c('word1','word2'), sep=' ')
  bigram_noStops <- bigram_separate %>% 
    filter(!word1 %in% stop_words$word) %>% 
    filter(!word2 %in% stop_words$word) 
  bigram <- as.character(paste(bigram_noStops$word1,bigram_noStops$word2))
  
  Lemm3 <- tibble(line=1, trigram=lemm1)
  trigram <- unnest_tokens(Lemm3,userTrigram, trigram, token='ngrams',n=3)
  trigram_separate <- trigram %>% separate(userTrigram, c('word1','word2','word3'), sep=' ')
  trigram_noStops <- trigram_separate %>% 
    filter(!word1 %in% stop_words$word) %>% 
    filter(!word2 %in% stop_words$word) %>%
    filter(!word3 %in% stop_words$word)
  
  trigram <- as.character(paste(trigram_noStops$word1,
                                trigram_noStops$word2,trigram_noStops$word3))
  bt <- append(bigram,trigram, after=length(bigram))
  uniBiTriToken <-append(ungram, bt, after=length(ungram))
  uniBiTriTokensOutputVar <- uniBiTriToken

  testCBD <- ifelse(uniBiTriTokensOutputVar %in%  CBD_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCBD)
  cbd <- round(s/l,2)
  
  testCranio <- ifelse(uniBiTriTokensOutputVar %in%  Cranio_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCranio)
  craniosacral <- round(s/l,2)
  
  testReflex <- ifelse(uniBiTriTokensOutputVar %in%  Reflex_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testReflex)
  reflexology <- round(s/l,2)
  
  testLymph <- ifelse(uniBiTriTokensOutputVar %in%  Lymph_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testLymph)
  lymphatic <- round(s/l,2)
  
  testMGN <- ifelse(uniBiTriTokensOutputVar %in%  Mgn_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testMGN)
  massagegun <- round(s/l,2)
  
  testIASTM <- ifelse(uniBiTriTokensOutputVar %in%  Instrument_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testIASTM)
  IASTM <- round(s/l,2)
  
  testDT <- ifelse(uniBiTriTokensOutputVar %in%  DT_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testDT)
  deepTissue <- round(s/l,2)
  
  testSwedish <- ifelse(uniBiTriTokensOutputVar %in% swedish_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testSwedish)
  swedish <- round(s/l,2)
  
  testAroma <- ifelse(uniBiTriTokensOutputVar %in%  Aroma_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testAroma)
  aromatherapy <- round(s/l,2)
  
  testStretch <- ifelse(uniBiTriTokensOutputVar %in%  stretch_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testStretch)
  stretch <- round(s/l,2)
  
  testTPT <- ifelse(uniBiTriTokensOutputVar %in%  TPT_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testTPT)
  triggerPoint <- round(s/l,2)
  
  testHS <- ifelse(uniBiTriTokensOutputVar %in%  HotStone_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testHS)
  hotStone <- round(s/l,2)
  
  testCups <- ifelse(uniBiTriTokensOutputVar %in%  Cupping_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCups)
  cupping <- round(s/l,2)
  
  testSports <- ifelse(uniBiTriTokensOutputVar %in%  Sports_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testSports)
  sports <- round(s/l,2)
  
  testFreeze <- ifelse(uniBiTriTokensOutputVar %in%  Freeze_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testFreeze)
  biofreeze <- round(s/l,2)
  
  testCold <- ifelse(uniBiTriTokensOutputVar %in%  cold_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCold)
  coldStone <- round(s/l,2)
  
  testShiatsu <- ifelse(uniBiTriTokensOutputVar %in%  Shiatsu_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testShiatsu)
  shiatsu <- round(s/l,2)
  
  testPreg <- ifelse(uniBiTriTokensOutputVar %in%  Prenatal_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testPreg)
  prenatal <- round(s/l,2)
  
  testMyo <- ifelse(uniBiTriTokensOutputVar %in%  Myofascial_contraindications1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testMyo)
  myofascial <- round(s/l,2)
  
  names <- c("Cannabidiol (CBD) Massage Balm","Craniosacral Massage","Reflexology Massage","Lymphatic Drainage Massage","Massage Gun Therapy","Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage", "Deep tissue Massage","Swedish Massage","Aromatherapy","Stretching","Trigger Point Therapy","Hot Stone Therapy Massage", "Cupping Therapy","Sports Massage","Biofreeze Muscle Pain Relief Gel","Cold Stone Therapy","Shiatsu Massage","Prenatal Massage","Myofascial Massage")
  
  probs <- c(cbd,craniosacral,reflexology,lymphatic,massagegun,IASTM,deepTissue,
             swedish,aromatherapy,stretch,triggerPoint,hotStone,cupping,sports,
             biofreeze,coldStone,shiatsu,prenatal,myofascial)
  
  probabilities <- as.data.frame(cbind(names,probs))
  colnames(probabilities) <- c('modality','probability')
  probabilities$probability <- as.numeric(paste(probabilities$probability))
  
  probabilities1 <- subset(probabilities,
                           probabilities$probability>min(probabilities$probability))
  
  NR <- as.character(paste(probabilities1$modality))
  availModes <- !names %in% NR
  available <- names[availModes]

  testCBD1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cbd1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCBD1)
  cbd1 <- round(s/l,2)
  
  testCranio1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cranio1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCranio1)
  craniosacral1 <- round(s/l,2)
  
  testReflex1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_reflex1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testReflex1)
  reflexology1 <- round(s/l,2)
  
  testLymph1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_lymph1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testLymph1)
  lymphatic1 <- round(s/l,2)
  
  testMGN1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_mgn1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testMGN1)
  massagegun1 <- round(s/l,2)
  
  testIASTM1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_instrument1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testIASTM1)
  IASTM1 <- round(s/l,2)
  
  testDT1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_DT1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testDT1)
  deepTissue1 <- round(s/l,2)
  
  testSwedish1 <- ifelse(uniBiTriTokensOutputVar %in% benefits_swedish1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testSwedish1)
  swedish1 <- round(s/l,2)
  
  testAroma1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_aroma1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testAroma1)
  aromatherapy1 <- round(s/l,2)
  
  testStretch1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_stretch1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testStretch1)
  stretch1 <- round(s/l,2)
  
  testTPT1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_tpt1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testTPT1)
  triggerPoint1 <- round(s/l,2)
  
  testHS1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_HotStone1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testHS1)
  hotStone1 <- round(s/l,2)
  
  testCups1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cup1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCups1)
  cupping1 <- round(s/l,2)
  
  testSports1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_sports1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testSports1)
  sports1 <- round(s/l,2)
  
  testFreeze1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_freeze1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testFreeze1)
  biofreeze1 <- round(s/l,2)
  
  testCold1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_cold1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testCold1)
  coldStone1 <- round(s/l,2)
  
  testShiatsu1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_shiatsu1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testShiatsu1)
  shiatsu1 <- round(s/l,2)
  
  testPreg1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_prenatal1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testPreg1)
  prenatal1 <- round(s/l,2)
  
  testMyo1 <- ifelse(uniBiTriTokensOutputVar %in%  benefits_myofascial1,1,0)
  l <- length(uniBiTriTokensOutputVar)
  s <- sum(testMyo1)
  myofascial1 <- round(s/l,2)
  
  names1 <- c("Cannabidiol (CBD) Massage Balm","Craniosacral Massage","Reflexology Massage","Lymphatic Drainage Massage","Massage Gun Therapy","Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage", "Deep tissue Massage","Swedish Massage","Aromatherapy","Stretching","Trigger Point Therapy","Hot Stone Therapy Massage", "Cupping Therapy","Sports Massage","Biofreeze Muscle Pain Relief Gel","Cold Stone Therapy","Shiatsu Massage","Prenatal Massage","Myofascial Massage")
  
  probs <- c(cbd1,craniosacral1,reflexology1,lymphatic1,massagegun1,IASTM1,deepTissue1,
             swedish1,aromatherapy1,stretch1,triggerPoint1,hotStone1,cupping1,sports1,
             biofreeze1,coldStone1,shiatsu1,prenatal1,myofascial1)
  
  probabilities <- as.data.frame(cbind(names1,probs))
  colnames(probabilities) <- c('modality','probability')
  probabilities$probability <- as.numeric(paste(probabilities$probability))
  
  probabilities0 <- probabilities %>% filter(probabilities$modality %in% available)
  
  probabilities01 <- probabilities0[order(probabilities0$probability, decreasing=TRUE),]
  print(probabilities01)
  
  probs <- as.character(paste(probabilities01$modality))
  
  probs3 <- probs[1:3]
  
  modesUnique$modality <- as.character(paste(modesUnique$modality))
  modesUnique$Description <- as.character(paste(modesUnique$Description))
  modesUnique$sideEffects <- as.character(paste(modesUnique$sideEffects))
  modesUnique$benefits <- as.character(paste(modesUnique$benefits))
  
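  modesReccommended <- modesUnique %>% filter(modesUnique$modality %in% probs3)
  # filter() keeps the row order of modesUnique, so reorder the rows to follow probs3;
  # otherwise row 1 is not necessarily the description of probs3[1]
  modesReccommended <- modesReccommended[match(probs3, modesReccommended$modality), ]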
      
  print('Your top choices for your recommended massage modalities based on your user input are: ')
  
  print(probs3)
  cat('\n\n')
  print(probs3[1])
  print('Your first recommended massage modality description: ')
  print(modesReccommended[1,2])
  cat('\n')
  print('Your first recommended massage modality benefits: ')
  print(modesReccommended[1,3])
  cat('\n')
  print('Your first recommended massage modality possible side effects: ')
  print(modesReccommended[1,4])
  cat('\n\n')
  
  print(probs3[2])
  print('Your second recommended massage modality description: ')
  print(modesReccommended[2,2])
  cat('\n')
  print('Your second recommended massage modality benefits: ')
  print(modesReccommended[2,3])
  cat('\n')
  print('Your second recommended massage modality possible side effects: ')
  print(modesReccommended[2,4])
  cat('\n\n')

  print(probs3[3])
  print('Your third recommended massage modality description: ')
  print(modesReccommended[3,2])
  cat('\n')
  print('Your third recommended massage modality benefits: ')
  print(modesReccommended[3,3])
  cat('\n')
  print('Your third recommended massage modality possible side effects: ')
  print(modesReccommended[3,4])

}
input1
## [1] "I have had headaches off and on, have seasonal allergies, my sleep is good, I have a history of stroke, and am on blood pressure medicine."
massageModalityRecommender2(input1)
##                     modality probability
## 1        Reflexology Massage        0.08
## 3            Swedish Massage        0.08
## 4                 Stretching        0.08
## 2 Lymphatic Drainage Massage        0.00
## [1] "Your top choices for your recommended massage modalities based on your user input are: "
## [1] "Reflexology Massage" "Swedish Massage"     "Stretching"         
## 
## 
## [1] "Reflexology Massage"
## [1] "Your first recommended massage modality description: "
## [1] "Traditional spa or clinic massage with hands,  palms,  elbows,  forearms of massage therapist used with glides and varying pressure applied along the muscle fibers of the body with varying amounts of pressure to get up to deeper layers of the body while avoiding causing pain to the client and avoiding discomfort to the client. Avoids massaging fast and avoids choppy short motions that disrupt relaxation,  but at the same rhythm to promote relaxation and calm the nervous system. "
## 
## [1] "Your first recommended massage modality benefits: "
## [1] "improves tight muscles,  loosens tight muscle fascia,  improves circulation,  improves relaxation,  improves immunity,  improves sleep,  improves range of motion"
## 
## [1] "Your first recommended massage modality possible side effects: "
## [1] "if no contraindications for massage exist in client, can make client tired"
## 
## 
## [1] "Swedish Massage"
## [1] "Your second recommended massage modality description: "
## [1] "Focus of this massage is to stimulate body functions and improve health of clients who find it more relaxing to have their scalp,  hands,  and/or feet massaged where these areas of the body have alignments in traditional chinese medicine to certain organs in the body reflected in locations on the feet,  hands,  and scalp to stimulate healing to those areas of the client's body. Historically,  this type of massage was the only type of massage allowed for health monitored clients recovering or living with certain health conditions such as cancer. If no contraindications or has a doctor's note,  this massage can be combined with other massage modalities like hot stone therapy,  cold stone therapy,  aromatheray,  CBD or cannabidiol products,  cupping therapy,  stretching,  massage gun therapy,  biofreeze,  craniosacral therapy,  reflexology,  lymphatic drainage,  sports massage,  shiatsu massage,  instrument assisted soft tissue mobilization or IASTM tools,  myofascial massage,  trigger point therapy,  and deep tissue massage"
## 
## [1] "Your second recommended massage modality benefits: "
## [1] "relaxing,  improves circulation,  improves sleep,  helps with pain and discomfort,  increases immunity,  recommended for people with pain sensitivity or people who cannot be touched because it causes discomfort by tickling,  itching,  or hurting them like some cases of fibromyalgia and neuropathic pain"
## 
## [1] "Your second recommended massage modality possible side effects: "
## [1] "can make client light headed "
## 
## 
## [1] "Stretching"
## [1] "Your third recommended massage modality description: "
## [1] "this massage modality takes the clients limbs of body that are gently pulled up to a 7 pain scale on a 1-10 pain scale with 10 being the most pain to stretch focused muscle group to increase range of motion and detox overworked muscles and break apart muscle fascia adhesions where each stretch held for three to ten deep and controlled breaths of the client. This modality can be a stand alone treatment or combined with other massage modalities like hot stone therapy,  cold stone therapy,  aromatheray,  CBD or cannabidiol products,  cupping therapy,  stretching,  massage gun therapy,  biofreeze,  craniosacral therapy,  reflexology,  lymphatic drainage,  sports massage,  shiatsu massage,  instrument assisted soft tissue mobilization or IASTM tools,  myofascial massage,  trigger point therapy,  and deep tissue massage"
## 
## [1] "Your third recommended massage modality benefits: "
## [1] "increased flexibility,  improved range of motion,  improved circulation,  pain relief,  better posture,  increased healing,  improved mood,  muscle relief,  better sleep"
## 
## [1] "Your third recommended massage modality possible side effects: "
## [1] "Can cause muscle soreness for a few days afterwards"
input2
## [1] "I just want to relax, and get massages regularly at a local massage place. I don't have any pre-existing health conditions and like a deep pressure with focus on my upper back. I am over a cold that I had two weeks ago. I have a sunburn on my face, that you could avoid."
massageModalityRecommender2(input2)
##                         modality probability
## 5               Prenatal Massage        0.19
## 1 Cannabidiol (CBD) Massage Balm        0.00
## 2            Reflexology Massage        0.00
## 3                Swedish Massage        0.00
## 4                     Stretching        0.00
## [1] "Your top choices for your recommended massage modalities based on your user input are: "
## [1] "Prenatal Massage"               "Cannabidiol (CBD) Massage Balm"
## [3] "Reflexology Massage"           
## 
## 
## [1] "Prenatal Massage"
## [1] "Your first recommended massage modality description: "
## [1] "swedish massage up to medium pressure or middle muscle layers to help relax client going through body changes due to pregnancy and associated aches in the feet,  low back,  and upper back,  avoids the joints and major artery sites of client,  client cannot be a high risk pregnancy or within the first trimester unless not a high risk and has had massage regularly for at least one year allowing the clients body to welcome massage. Can be relaxing,  helps detox stress from body,  improves circulation,  and other benefits of massage"
## 
## [1] "Your first recommended massage modality benefits: "
## [1] "improved circulation,  better sleep,  pain relief,  relaxing effect,  soothing effect,  calming effect,  improves range of motion,  helps with congestion,  helps detox,  helps clean old bruises,  modified massage if doctor approved for high risk or other health condition"
## 
## [1] "Your first recommended massage modality possible side effects: "
## [1] "can make client dizzy or nausious if first massage and not familiar with massage or early stages of pregnancy and a first time pregnancy but not a high risk pregnancy"
## 
## 
## [1] "Cannabidiol (CBD) Massage Balm"
## [1] "Your second recommended massage modality description: "
## [1] "Focus of this massage is to stimulate body functions and improve health of clients who find it more relaxing to have their scalp,  hands,  and/or feet massaged where these areas of the body have alignments in traditional chinese medicine to certain organs in the body reflected in locations on the feet,  hands,  and scalp to stimulate healing to those areas of the client's body. Historically,  this type of massage was the only type of massage allowed for health monitored clients recovering or living with certain health conditions such as cancer. If no contraindications or has a doctor's note,  this massage can be combined with other massage modalities like hot stone therapy,  cold stone therapy,  aromatheray,  CBD or cannabidiol products,  cupping therapy,  stretching,  massage gun therapy,  biofreeze,  craniosacral therapy,  reflexology,  lymphatic drainage,  sports massage,  shiatsu massage,  instrument assisted soft tissue mobilization or IASTM tools,  myofascial massage,  trigger point therapy,  and deep tissue massage"
## 
## [1] "Your second recommended massage modality benefits: "
## [1] "relaxing,  improves circulation,  improves sleep,  helps with pain and discomfort,  increases immunity,  recommended for people with pain sensitivity or people who cannot be touched because it causes discomfort by tickling,  itching,  or hurting them like some cases of fibromyalgia and neuropathic pain"
## 
## [1] "Your second recommended massage modality possible side effects: "
## [1] "can make client light headed "
## 
## 
## [1] "Reflexology Massage"
## [1] "Your third recommended massage modality description: "
## [1] "can be combined with other massage modalities like hot stone therapy,  cold stone therapy,  aromatheray,  CBD or cannabidiol products,  cupping therapy,  stretching,  massage gun therapy,  biofreeze,  craniosacral therapy,  reflexology,  lymphatic drainage,  sports,  and deep tissue massage"
## 
## [1] "Your third recommended massage modality benefits: "
## [1] "helps with chronic pain associated with nerve pain,  arthritis,  or stress"
## 
## [1] "Your third recommended massage modality possible side effects: "
## [1] "depends on the ingredients and amount of CBD in product being used, but some side effects include lethargy and dry mouth, if you have an autoimmune disease or allergies, it could irritate your skin"
input3
## [1] "I have leukemia, but want cupping done. My neighbor says it is good for me and I never tried it. Also, I need medium to deep pressure and don't want a light massage."
massageModalityRecommender2(input3)
##                         modality probability
## 7               Prenatal Massage         0.1
## 1 Cannabidiol (CBD) Massage Balm         0.0
## 2            Reflexology Massage         0.0
## 3     Lymphatic Drainage Massage         0.0
## 4                Swedish Massage         0.0
## 5                   Aromatherapy         0.0
## 6                     Stretching         0.0
## [1] "Your top choices for your recommended massage modalities based on your user input are: "
## [1] "Prenatal Massage"               "Cannabidiol (CBD) Massage Balm"
## [3] "Reflexology Massage"           
## 
## 
## [1] "Prenatal Massage"
## [1] "Your first recommended massage modality description: "
## [1] "swedish massage up to medium pressure or middle muscle layers to help relax client going through body changes due to pregnancy and associated aches in the feet,  low back,  and upper back,  avoids the joints and major artery sites of client,  client cannot be a high risk pregnancy or within the first trimester unless not a high risk and has had massage regularly for at least one year allowing the clients body to welcome massage. Can be relaxing,  helps detox stress from body,  improves circulation,  and other benefits of massage"
## 
## [1] "Your first recommended massage modality benefits: "
## [1] "improved circulation,  better sleep,  pain relief,  relaxing effect,  soothing effect,  calming effect,  improves range of motion,  helps with congestion,  helps detox,  helps clean old bruises,  modified massage if doctor approved for high risk or other health condition"
## 
## [1] "Your first recommended massage modality possible side effects: "
## [1] "can make client dizzy or nausious if first massage and not familiar with massage or early stages of pregnancy and a first time pregnancy but not a high risk pregnancy"
## 
## 
## [1] "Cannabidiol (CBD) Massage Balm"
## [1] "Your second recommended massage modality description: "
## [1] "Focus of this massage is to stimulate body functions and improve health of clients who find it more relaxing to have their scalp,  hands,  and/or feet massaged where these areas of the body have alignments in traditional chinese medicine to certain organs in the body reflected in locations on the feet,  hands,  and scalp to stimulate healing to those areas of the client's body. Historically,  this type of massage was the only type of massage allowed for health monitored clients recovering or living with certain health conditions such as cancer. If no contraindications or has a doctor's note,  this massage can be combined with other massage modalities like hot stone therapy,  cold stone therapy,  aromatheray,  CBD or cannabidiol products,  cupping therapy,  stretching,  massage gun therapy,  biofreeze,  craniosacral therapy,  reflexology,  lymphatic drainage,  sports massage,  shiatsu massage,  instrument assisted soft tissue mobilization or IASTM tools,  myofascial massage,  trigger point therapy,  and deep tissue massage"
## 
## [1] "Your second recommended massage modality benefits: "
## [1] "relaxing,  improves circulation,  improves sleep,  helps with pain and discomfort,  increases immunity,  recommended for people with pain sensitivity or people who cannot be touched because it causes discomfort by tickling,  itching,  or hurting them like some cases of fibromyalgia and neuropathic pain"
## 
## [1] "Your second recommended massage modality possible side effects: "
## [1] "can make client light headed "
## 
## 
## [1] "Reflexology Massage"
## [1] "Your third recommended massage modality description: "
## [1] "can be combined with other massage modalities like hot stone therapy,  cold stone therapy,  aromatheray,  CBD or cannabidiol products,  cupping therapy,  stretching,  massage gun therapy,  biofreeze,  craniosacral therapy,  reflexology,  lymphatic drainage,  sports,  and deep tissue massage"
## 
## [1] "Your third recommended massage modality benefits: "
## [1] "helps with chronic pain associated with nerve pain,  arthritis,  or stress"
## 
## [1] "Your third recommended massage modality possible side effects: "
## [1] "depends on the ingredients and amount of CBD in product being used, but some side effects include lethargy and dry mouth, if you have an autoimmune disease or allergies, it could irritate your skin"
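
The function above repeats the same four lines for each of the 19 modalities: count how many user tokens appear in a modality's token list and divide by the number of user tokens. As a rough sketch of that scoring idea in Python (the token lists, the three modalities, and the names `match_fraction` and `recommend` are made up for illustration and are not part of the script), the whole thing can be a loop over a dictionary:

```python
# Hypothetical token lists; the real ones come from the benefits and
# contraindications data set read in earlier in the script.
benefit_tokens = {
    "Swedish Massage": {"relax", "improve sleep", "improve circulation"},
    "Cupping Therapy": {"detox", "improve circulation"},
    "Stretching": {"range motion", "improve sleep"},
}
contraindication_tokens = {
    "Swedish Massage": {"fever"},
    "Cupping Therapy": {"blood thinner", "leukemia"},
    "Stretching": {"fracture"},
}


def match_fraction(user_tokens, modality_tokens):
    """Share of the user's tokens that appear in a modality's token list."""
    if not user_tokens:
        return 0.0
    hits = sum(1 for tok in user_tokens if tok in modality_tokens)
    return round(hits / len(user_tokens), 2)


def recommend(user_tokens, top_n=3):
    # Score every modality against its contraindication tokens.
    contra = {name: match_fraction(user_tokens, toks)
              for name, toks in contraindication_tokens.items()}
    # Keep only the modalities with the lowest contraindication score,
    # mirroring the subset on min(probability) in the R function.
    lowest = min(contra.values())
    available = [name for name, score in contra.items() if score == lowest]
    # Rank what is left by the share of matching benefit tokens.
    scored = [(name, match_fraction(user_tokens, benefit_tokens[name]))
              for name in available]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


user = ["relax", "improve sleep", "leukemia", "deep pressure"]
top_choices = recommend(user)
print(top_choices)
```

Because each score is paired with its modality name before sorting, the name and the score cannot drift apart when the ranking is printed.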

TensorFlow Keras for Deep Neural Network predictions

This next script uses Keras, here imported from TensorFlow's experimental contributor package tensorflow.contrib.keras, to build a deep neural network on the lemmatized tokens of the benefits and contraindications to predict the modality of the massage, and it got up to 100% accuracy. Keras is a high-level API on top of TensorFlow. It is fast because TensorFlow parallelizes the computation, and it can also run on Graphics Processing Units (GPUs) if set up accordingly. No worries, because the following does not depend on GPUs. It uses your computer's CPUs. I took this template model from the demonstration in Chapter 13 of 'Python Machine Learning, Second Edition' by Sebastian Raschka and Vahid Mirjalili, Kindle version. We used this course book in my Machine Learning graduate course in Fall 2019, and this code works for Python 3 almost as is. There were some minor issues: the mapping of the one-hot encoding wasn't interpretable for my data, but everything worked perfectly on the book's provided data, the ubiquitous MNIST data set. Buy it, rent it, get the Jupyter notebooks from GitHub, and so on.

Aside: I was binge-watching a teen show the whole time, about idiot surfers who go hunting for lost treasure and dig themselves into holes for the whole story. They had many opportunities to save themselves, if they weren't idiot teenage surfers who have to go looking for trouble through petty actions, but anyhow. It was another great binge, because the writers have to be so smart to dumb their minds down to that of someone so clueless and still write non-stop drama. Anyway, that was how I felt looking through this simple code to use as a wrapper for my own data, and it was worth it.

Check this out!

Reload these packages to use Python in R, if you are just jumping to this section.

library(reticulate)
conda_list(conda = "auto") 
##           name                                                  python
## 1    Anaconda2                     C:\\Users\\m\\Anaconda2\\python.exe
## 2    djangoenv    C:\\Users\\m\\Anaconda2\\envs\\djangoenv\\python.exe
## 3     python36     C:\\Users\\m\\Anaconda2\\envs\\python36\\python.exe
## 4     python37     C:\\Users\\m\\Anaconda2\\envs\\python37\\python.exe
## 5 r-reticulate C:\\Users\\m\\Anaconda2\\envs\\r-reticulate\\python.exe
use_condaenv(condaenv = "python36")
import numpy as np
import pandas as pd
data = pd.read_csv('lemmNgramsBenefits2Contraindications3.csv', encoding='unicode_escape')
data.shape
## (456, 545)
data.head()
##                     modality  ... wound.sore.sensitive
## 0         Myofascial Massage  ...                    1
## 1           Prenatal Massage  ...                    0
## 2            Shiatsu Massage  ...                    0
## 3  Hot Stone Therapy Massage  ...                    0
## 4            Cupping Therapy  ...                    0
## 
## [5 rows x 545 columns]
np.random.seed(123)
data0 = data.reindex(np.random.permutation(data.index))
data1 = data0.iloc[:,3:]
data1.shape
## (456, 542)
target=data0.iloc[:,0:1]
target.shape
## (456, 1)
print(target['modality'].unique())
## ['Hot Stone Therapy Massage' 'Cold Stone Therapy' 'Reflexology Massage'
##  'Deep tissue Massage' 'Lymphatic Drainage Massage' 'Stretching'
##  'Aromatherapy' 'Trigger Point Therapy' 'Biofreeze Muscle Pain Relief Gel'
##  'Shiatsu Massage' 'Massage Gun Therapy' 'Cupping Therapy'
##  'Cannabidiol (CBD) Massage Balm' 'Sports Massage' 'Myofascial Massage'
##  'Craniosacral Massage' 'Prenatal Massage'
##  'Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage'
##  'Swedish Massage']
print(len(target['modality'].unique()))
## 19
mean_vals = np.mean(data1, axis=0)
std_val = np.std(data1)

data1_centered = (data1 - mean_vals)/std_val

print(data1_centered.shape, target.shape)
## (456, 542) (456, 1)
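
Note that data1_centered is computed here, but the split further down slices data1, so the network sees the raw 0/1 indicators. If standardized inputs were wanted, a column that never varies has a standard deviation of 0 and the division would produce NaN. Below is a minimal plain-Python sketch of one way to handle this, assuming the statistics are taken from the training rows only. The helper names `fit_scaler` and `transform` and the toy rows are hypothetical.

```python
import statistics


def fit_scaler(rows):
    """Column means and population standard deviations of the given rows."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Standardize rows; constant columns are divided by 1 instead of 0."""
    return [
        [(x - m) / (s if s > 0 else 1.0) for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy indicator data: the middle column is constant on purpose.
train = [[0, 1, 0], [1, 1, 0], [0, 1, 1], [1, 1, 1]]
test = [[1, 1, 0]]

means, stds = fit_scaler(train)           # statistics from the training rows only
train_scaled = transform(train, means, stds)
test_scaled = transform(test, means, stds)  # reuse the training statistics
print(test_scaled)
```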
print(data1.head())
##      ache.improve  ache.relieve  ...  wound.skin.rash  wound.sore.sensitive
## 49              0             0  ...                0                     0
## 85              0             0  ...                0                     0
## 34              0             0  ...                0                     0
## 381             0             0  ...                0                     0
## 232             0             0  ...                0                     0
## 
## [5 rows x 542 columns]
print(target.head())
##                        modality
## 49    Hot Stone Therapy Massage
## 85           Cold Stone Therapy
## 34          Reflexology Massage
## 381         Deep tissue Massage
## 232  Lymphatic Drainage Massage
# np.unique returns the sorted unique labels, so the integer codes follow alphabetical order

class_mapping = {label: idx for idx, label in enumerate(np.unique(target['modality']))}
class_mapping
## {'Aromatherapy': 0, 'Biofreeze Muscle Pain Relief Gel': 1, 'Cannabidiol (CBD) Massage Balm': 2, 'Cold Stone Therapy': 3, 'Craniosacral Massage': 4, 'Cupping Therapy': 5, 'Deep tissue Massage': 6, 'Hot Stone Therapy Massage': 7, 'Instrument Assisted Soft Tissue Mobilization (IASTM) Friction Massage': 8, 'Lymphatic Drainage Massage': 9, 'Massage Gun Therapy': 10, 'Myofascial Massage': 11, 'Prenatal Massage': 12, 'Reflexology Massage': 13, 'Shiatsu Massage': 14, 'Sports Massage': 15, 'Stretching': 16, 'Swedish Massage': 17, 'Trigger Point Therapy': 18}
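
Since the network will predict these integer codes, it helps to be able to go back from a code to the modality name. A small sketch of that round trip follows, assuming a short made-up label list. `inverse_mapping` is a name introduced here for illustration, and `sorted(set(...))` stands in for `np.unique` so the example needs no extra packages.

```python
# A short, made-up label column standing in for target['modality'].
labels = ["Swedish Massage", "Aromatherapy", "Stretching", "Aromatherapy"]

# Same idea as class_mapping: sorted unique labels get codes 0, 1, 2, ...
class_mapping = {label: idx for idx, label in enumerate(sorted(set(labels)))}

# Flip keys and values to decode predictions back into names.
inverse_mapping = {idx: label for label, idx in class_mapping.items()}

encoded = [class_mapping[label] for label in labels]
decoded = [inverse_mapping[code] for code in encoded]

print(encoded)
print(decoded)
```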
target['mode']=target['modality']
target['modality'] = target['modality'].map(class_mapping)
target.head()
##      modality                        mode
## 49          7   Hot Stone Therapy Massage
## 85          3          Cold Stone Therapy
## 34         13         Reflexology Massage
## 381         6         Deep tissue Massage
## 232         9  Lymphatic Drainage Massage
target1 = target['modality']
target1.head()
## 49      7
## 85      3
## 34     13
## 381     6
## 232     9
## Name: modality, dtype: int64
X_train = data1[:365]
X_test = data1[365:]
y_train = target1[:365]
y_test = target1[365:]

################################
# keep the class names of each split so they can be attached to the predictions later
y_trainNames = target['mode'][:365]
y_trainNames1 = pd.DataFrame(y_trainNames)  # the Series name 'mode' becomes the column name

y_testNames = target['mode'][365:]
y_testNames1 = pd.DataFrame(y_testNames)
################################

print(X_train.shape)
## (365, 542)
print(y_train.shape)
## (365,)
print(X_test.shape)
## (91, 542)
print(y_test.shape)
## (91,)
y_train
## 49      7
## 85      3
## 34     13
## 381     6
## 232     9
##        ..
## 103    16
## 149    14
## 139     4
## 67     10
## 3       7
## Name: modality, Length: 365, dtype: int64
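
The split above is positional: the first 365 shuffled rows train and the last 91 test. With 24 copies of each of the 19 classes, it is worth confirming that every class lands in both parts. Here is a hedged sketch of such a check on synthetic labels. The shuffle below is not the same permutation that NumPy produced above, and `missing_from_test` and `missing_from_train` are names made up for this example.

```python
import random
from collections import Counter

random.seed(123)

# 19 classes x 24 copies each = 456 labels, shuffled like the data above.
labels = [code for code in range(19) for _ in range(24)]
random.shuffle(labels)

# Positional split with the same sizes as X_train / X_test.
y_train_demo, y_test_demo = labels[:365], labels[365:]

train_counts = Counter(y_train_demo)
test_counts = Counter(y_test_demo)

# Classes that never show up in one of the parts would distort recall and precision.
missing_from_test = sorted(set(labels) - set(test_counts))
missing_from_train = sorted(set(labels) - set(train_counts))

print("rows:", len(y_train_demo), len(y_test_demo))
print("missing from test:", missing_from_test)
print("missing from train:", missing_from_train)
```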
import tensorflow as tf
import tensorflow.contrib.keras as keras
# tensorflow.contrib exists only in TensorFlow 1.x; on TensorFlow 2.x use `from tensorflow import keras`

np.random.seed(123)
tf.set_random_seed(123)  # TensorFlow 1.x API; TensorFlow 2.x renamed it tf.random.set_seed
model = keras.models.Sequential()

model.add(
    keras.layers.Dense(
        units=150,   # output size of this layer; must match the next layer's input_dim
        input_dim=542, # number of input features
        kernel_initializer='glorot_uniform', # Glorot (Xavier) uniform initialization for the weights
        bias_initializer='zeros', # biases start at zero
        activation='tanh'))
## WARNING: Logging before flag parsing goes to stderr.
## W0501 20:25:41.968364 52684 deprecation.py:506] From C:\Users\m\Anaconda2\envs\python36\lib\site-packages\tensorflow\python\ops\init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
## Instructions for updating:
## Call initializer instance with the dtype argument instead of passing it to the constructor
model.add(
    keras.layers.Dense(
        units=150,   #output matches next layer input 
        input_dim=150, #input matches last layer's output
        kernel_initializer='glorot_uniform',
        bias_initializer='zeros',
        activation='tanh'))

model.add(
    keras.layers.Dense(
        units=19,  #these are the number of class categories in our target  
        input_dim=150,
        kernel_initializer='glorot_uniform',
        bias_initializer='zeros',
        activation='softmax')) # softmax returns class-membership probabilities that sum to 1
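
To make the three layers less of a black box, here is a plain-Python sketch of a single forward pass through the same shapes (542 to 150 to 150 to 19). It uses Glorot uniform weights, zero biases, tanh in the hidden layers, and softmax at the output. This only illustrates the arithmetic and is not how Keras implements it; the helper names and the random input row are made up.

```python
import math
import random

random.seed(123)


def glorot_uniform(fan_in, fan_out):
    """Weights drawn uniformly from [-limit, limit], limit = sqrt(6 / (fan_in + fan_out))."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[random.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def tanh(z):
    return [math.tanh(v) for v in z]


def softmax(z):
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def dense(x, weights, bias, activation):
    """One fully connected layer: activation(x . W + b)."""
    z = [sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j]
         for j in range(len(bias))]
    return activation(z)


sizes = [542, 150, 150, 19]
layers = [(glorot_uniform(n_in, n_out), [0.0] * n_out)
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

# A made-up row of 0/1 token indicators.
x = [random.choice([0, 1]) for _ in range(sizes[0])]

h = x
for i, (weights, bias) in enumerate(layers):
    is_last = i == len(layers) - 1
    h = dense(h, weights, bias, softmax if is_last else tanh)

class_probs = h
print(len(class_probs), round(sum(class_probs), 6))
```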

# these are hyperparameters that can be tuned if overfitting during training, or to get better accuracy
sgd_optimizer = keras.optimizers.SGD( 
        lr=0.001, decay=1e-7, momentum=.9)

# a softmax output over 19 classes calls for a multiclass cross-entropy loss rather than binary_crossentropy
model.compile(optimizer=sgd_optimizer,
              loss='sparse_categorical_crossentropy')
# 'categorical_crossentropy' expects one-hot encoded targets (binary matrices of 1s and 0s);
# y_train holds integer class codes, so 'sparse_categorical_crossentropy' is the matching loss
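
The two losses compute the same number and differ only in how the target is written. The sketch below shows this for a single sample in plain Python; the function names echo the Keras loss names but are hand-written here for illustration. It also shows that a network guessing uniformly over 19 classes has a loss of ln(19), about 2.94, which is roughly where an untrained model should start.

```python
import math


def sparse_categorical_crossentropy(label, probs):
    """Target given as an integer class code."""
    return -math.log(probs[label])


def categorical_crossentropy(one_hot_target, probs):
    """Target given as a one-hot vector of 1s and 0s."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs))


def one_hot(label, n_classes):
    return [1 if i == label else 0 for i in range(n_classes)]


probs = [0.1, 0.7, 0.2]   # made-up softmax output for one sample
label = 1                 # integer class code

sparse_loss = sparse_categorical_crossentropy(label, probs)
dense_loss = categorical_crossentropy(one_hot(label, 3), probs)
print(sparse_loss, dense_loss)

# Uniform guess over 19 classes: the loss equals ln(19).
uniform_loss = sparse_categorical_crossentropy(0, [1 / 19] * 19)
print(round(uniform_loss, 4))
```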
history = model.fit(X_train, y_train,
                    batch_size=64, epochs=50,
                    verbose=1, # print the loss for every epoch so training can be stopped and retuned if needed
                    validation_split=0.1) # hold out the last 10% of the training rows; val_loss is computed on them after each epoch
## Train on 328 samples, validate on 37 samples
## Epoch 1/50
## 
##  64/328 [====>.........................] - ETA: 3s - loss: 2.9594
## 328/328 [==============================] - 1s 3ms/sample - loss: 2.9579 - val_loss: 2.9613
## Epoch 2/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 2.8792
## 328/328 [==============================] - 0s 146us/sample - loss: 2.8659 - val_loss: 2.8551
## Epoch 3/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 2.7533
## 328/328 [==============================] - 0s 107us/sample - loss: 2.7223 - val_loss: 2.7224
## Epoch 4/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 2.6201
## 320/328 [============================>.] - ETA: 0s - loss: 2.5568
## 328/328 [==============================] - 0s 194us/sample - loss: 2.5532 - val_loss: 2.5862
## Epoch 5/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 2.4538
## 328/328 [==============================] - 0s 69us/sample - loss: 2.3748 - val_loss: 2.4383
## Epoch 6/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 2.1806
## 328/328 [==============================] - 0s 152us/sample - loss: 2.2025 - val_loss: 2.2939
## Epoch 7/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 2.0842
## 328/328 [==============================] - 0s 180us/sample - loss: 2.0393 - val_loss: 2.1495
## Epoch 8/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 1.9280
## 256/328 [======================>.......] - ETA: 0s - loss: 1.8833
## 328/328 [==============================] - 0s 929us/sample - loss: 1.8833 - val_loss: 2.0177
## Epoch 9/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 1.6767
## 192/328 [================>.............] - ETA: 0s - loss: 1.7256
## 256/328 [======================>.......] - ETA: 0s - loss: 1.7533
## 328/328 [==============================] - 0s 929us/sample - loss: 1.7381 - val_loss: 1.8856
## Epoch 10/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 1.6847
## 128/328 [==========>...................] - ETA: 0s - loss: 1.6249
## 256/328 [======================>.......] - ETA: 0s - loss: 1.5984
## 328/328 [==============================] - 0s 1000us/sample - loss: 1.6050 - val_loss: 1.7616
## Epoch 11/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 1.5354
## 328/328 [==============================] - 0s 275us/sample - loss: 1.4799 - val_loss: 1.6345
## Epoch 12/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 1.4706
## 328/328 [==============================] - 0s 128us/sample - loss: 1.3664 - val_loss: 1.5178
## Epoch 13/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 1.3116
## 328/328 [==============================] - 0s 109us/sample - loss: 1.2606 - val_loss: 1.4124
## Epoch 14/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 1.1675
## 328/328 [==============================] - 0s 157us/sample - loss: 1.1634 - val_loss: 1.3109
## Epoch 15/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 1.0773
## 256/328 [======================>.......] - ETA: 0s - loss: 1.0547
## 328/328 [==============================] - 0s 386us/sample - loss: 1.0747 - val_loss: 1.2183
## Epoch 16/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 1.0969
## 328/328 [==============================] - 0s 126us/sample - loss: 0.9943 - val_loss: 1.1300
## Epoch 17/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.9077
## 128/328 [==========>...................] - ETA: 0s - loss: 0.9134
## 192/328 [================>.............] - ETA: 0s - loss: 0.9125
## 256/328 [======================>.......] - ETA: 0s - loss: 0.9206
## 320/328 [============================>.] - ETA: 0s - loss: 0.9171
## 328/328 [==============================] - 1s 2ms/sample - loss: 0.9186 - val_loss: 1.0532
## Epoch 18/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.8720
## 128/328 [==========>...................] - ETA: 0s - loss: 0.8681
## 192/328 [================>.............] - ETA: 0s - loss: 0.8801
## 328/328 [==============================] - 1s 2ms/sample - loss: 0.8500 - val_loss: 0.9808
## Epoch 19/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.8833
## 192/328 [================>.............] - ETA: 0s - loss: 0.7924
## 320/328 [============================>.] - ETA: 0s - loss: 0.7850
## 328/328 [==============================] - 0s 666us/sample - loss: 0.7865 - val_loss: 0.9132
## Epoch 20/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.7801
## 192/328 [================>.............] - ETA: 0s - loss: 0.7331
## 328/328 [==============================] - 0s 352us/sample - loss: 0.7292 - val_loss: 0.8503
## Epoch 21/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.6872
## 328/328 [==============================] - 0s 157us/sample - loss: 0.6775 - val_loss: 0.7945
## Epoch 22/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.6311
## 328/328 [==============================] - 0s 140us/sample - loss: 0.6310 - val_loss: 0.7423
## Epoch 23/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.5620
## 328/328 [==============================] - 0s 156us/sample - loss: 0.5884 - val_loss: 0.6936
## Epoch 24/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.5623
## 328/328 [==============================] - 0s 138us/sample - loss: 0.5502 - val_loss: 0.6497
## Epoch 25/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.5105
## 328/328 [==============================] - 0s 69us/sample - loss: 0.5146 - val_loss: 0.6085
## Epoch 26/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.5065
## 328/328 [==============================] - 0s 107us/sample - loss: 0.4826 - val_loss: 0.5728
## Epoch 27/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.4492
## 328/328 [==============================] - 0s 173us/sample - loss: 0.4539 - val_loss: 0.5398
## Epoch 28/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.3997
## 328/328 [==============================] - 0s 103us/sample - loss: 0.4274 - val_loss: 0.5098
## Epoch 29/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.4485
## 328/328 [==============================] - 0s 141us/sample - loss: 0.4035 - val_loss: 0.4797
## Epoch 30/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.4375
## 328/328 [==============================] - 0s 128us/sample - loss: 0.3813 - val_loss: 0.4522
## Epoch 31/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.3537
## 128/328 [==========>...................] - ETA: 0s - loss: 0.3724
## 328/328 [==============================] - 0s 368us/sample - loss: 0.3605 - val_loss: 0.4284
## Epoch 32/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.3547
## 328/328 [==============================] - 0s 172us/sample - loss: 0.3423 - val_loss: 0.4061
## Epoch 33/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.3203
## 328/328 [==============================] - 0s 80us/sample - loss: 0.3249 - val_loss: 0.3855
## Epoch 34/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2949
## 328/328 [==============================] - 0s 159us/sample - loss: 0.3093 - val_loss: 0.3661
## Epoch 35/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.3123
## 328/328 [==============================] - 0s 149us/sample - loss: 0.2949 - val_loss: 0.3476
## Epoch 36/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2868
## 328/328 [==============================] - 0s 123us/sample - loss: 0.2812 - val_loss: 0.3308
## Epoch 37/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2589
## 328/328 [==============================] - 0s 95us/sample - loss: 0.2686 - val_loss: 0.3152
## Epoch 38/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2573
## 328/328 [==============================] - 0s 137us/sample - loss: 0.2570 - val_loss: 0.3011
## Epoch 39/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2415
## 328/328 [==============================] - 0s 158us/sample - loss: 0.2462 - val_loss: 0.2878
## Epoch 40/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2255
## 328/328 [==============================] - 0s 116us/sample - loss: 0.2361 - val_loss: 0.2757
## Epoch 41/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2227
## 328/328 [==============================] - 0s 125us/sample - loss: 0.2267 - val_loss: 0.2644
## Epoch 42/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2084
## 328/328 [==============================] - 0s 104us/sample - loss: 0.2179 - val_loss: 0.2540
## Epoch 43/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2161
## 328/328 [==============================] - 0s 144us/sample - loss: 0.2096 - val_loss: 0.2441
## Epoch 44/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2064
## 328/328 [==============================] - 0s 143us/sample - loss: 0.2018 - val_loss: 0.2349
## Epoch 45/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.2090
## 328/328 [==============================] - 0s 118us/sample - loss: 0.1946 - val_loss: 0.2266
## Epoch 46/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.1734
## 328/328 [==============================] - 0s 95us/sample - loss: 0.1877 - val_loss: 0.2189
## Epoch 47/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.1982
## 328/328 [==============================] - 0s 163us/sample - loss: 0.1814 - val_loss: 0.2113
## Epoch 48/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.1653
## 328/328 [==============================] - 0s 122us/sample - loss: 0.1754 - val_loss: 0.2041
## Epoch 49/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.1611
## 328/328 [==============================] - 0s 278us/sample - loss: 0.1697 - val_loss: 0.1975
## Epoch 50/50
## 
##  64/328 [====>.........................] - ETA: 0s - loss: 0.1679
## 192/328 [================>.............] - ETA: 0s - loss: 0.1645
## 320/328 [============================>.] - ETA: 0s - loss: 0.1647
## 328/328 [==============================] - 0s 900us/sample - loss: 0.1643 - val_loss: 0.1914

A note on the above: training was very fast, faster than the time it took to load the Python modules the model uses. This is because of backpropagation, described in papers from 1982-1986 (two different sources and different versions) and later rediscovered, which reduces training to matrix-by-vector multiplications and saves a lot of computation. The way neural networks work is that each layer first computes a linear combination of its inputs and then applies a non-linear activation function, and further layers repeat this with other mixes or variations of the non-linear function. Many of these are related to the logistic (sigmoid) activation function: the hyperbolic tangent is a rescaled sigmoid, and the softmax function extends the logistic function to give meaningful class membership probabilities for classification. With the logistic function alone, each class gets its own probability and the maximum is chosen, but the class probabilities do not sum to 1. The rectified linear unit (ReLU) is the other main choice, although it is not a sigmoid variant. I know the above is gibberish if you are a grammar and spelling fanatic. But that was a generalization of how NNs work, and I myself found it initially confusing until I actually wrote the notes on NNs and tested it out myself. Keep in mind the Kindle version is not the best version to get, especially for coding, because of the placement and white spacing, but this book and many others usually provide the code at a separate free link so you can run the demos yourself.
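To make the difference between the logistic function and softmax concrete, here is a minimal sketch in plain Python. The three scores are made up for illustration and are not taken from the model above, and the helper names `sigmoid` and `softmax` are my own rather than any library's API.

```python
import math

def sigmoid(z):
    """Logistic function applied to a single raw score."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(scores):
    """Softmax over a list of raw scores.

    Subtracting the maximum first keeps exp() numerically stable
    without changing the result.
    """
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores (logits) for three classes.
logits = [2.0, 1.0, 0.1]

# One independent logistic output per class: each is in (0, 1),
# but together they are not a probability distribution.
per_class_sigmoid = [sigmoid(z) for z in logits]

# Softmax outputs: also in (0, 1), and they sum to 1.
class_probs = softmax(logits)

print('Sigmoid outputs:', per_class_sigmoid, 'sum =', sum(per_class_sigmoid))
print('Softmax outputs:', class_probs, 'sum =', sum(class_probs))
```

Both versions pick the same winning class here (the largest score), but only the softmax values can be read as class membership probabilities.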

As an aside, if you ever watched the trailers or partial trailers for the movie from the early 2000s called the Caterpillar, it is similar to NNs. In fact, it could be that some nerd was locked up in their room going over papers and thinking about their own dissertation, or working in a department with some link to interviewed, interned, or contract colleagues, who casually passed on how similar the Caterpillar movie is to the way NNs work. You take in input units equal to the number of features in the training set, then declare some number of output units or pieces (demented, not dimensioned) for that layer. The first hidden layer after it must take as many input units as the previous layer has outputs, and so on, until the last layer, which you size down to the number of classes so that its number of output units equals the number of classes. This is where you set ‘None’ as the activation in TensorFlow, or use Keras and select the softmax activation to give class membership probabilities that are meaningful (sum to 1). There is more that goes into building NNs, and not every format works for what you have to feed in, but manipulating the data types could fix the problem. I never watched the Caterpillar; I assumed from the partial trailer that this is what happens, and I never plan on watching it.
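The chaining of layer sizes described above can be sketched in plain Python without TensorFlow or Keras. This is only an illustration under assumptions: the 30 input features and the two hidden layers of 50 units are made-up numbers, the weights are random rather than trained, and the helpers `init_layer` and `dense` are hypothetical names, not the model fitted in this script. Only the 19 output classes come from this recommender.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def tanh_all(zs):
    """Hyperbolic tangent applied to every unit of a hidden layer."""
    return [math.tanh(z) for z in zs]

def softmax(zs):
    """Softmax for the output layer, so the outputs sum to 1."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out):
    """Random weights (n_out rows of n_in values) and zero biases."""
    weights = [[random.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(inputs, weights, biases, activation):
    """One fully connected layer: linear step, then the activation."""
    zs = [sum(w * x for w, x in zip(row, inputs)) + b
          for row, b in zip(weights, biases)]
    return activation(zs)

n_features = 30          # assumed number of input features
hidden_units = [50, 50]  # assumed hidden layer sizes
n_classes = 19           # number of massage modality classes

# Each layer's input size is the previous layer's output size.
layer_sizes = [n_features] + hidden_units + [n_classes]
layers = [init_layer(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

# Forward pass for one made-up sample.
a = [random.random() for _ in range(n_features)]
for i, (weights, biases) in enumerate(layers):
    is_last = (i == len(layers) - 1)
    a = dense(a, weights, biases, softmax if is_last else tanh_all)
probs = a

print('Output units:', len(probs), 'Sum of probabilities:', sum(probs))
```

If any two neighboring sizes in `layer_sizes` did not match up, the weight rows would not line up with the inputs, which is the same shape constraint Keras enforces when layers are stacked.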

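# predict_classes() was removed in TensorFlow 2.6; on newer versions use
# np.argmax(model.predict(X_train), axis=-1) instead.
y_train_pred = model.predict_classes(X_train, verbose=0)
print('First 3 predictions: ', y_train_pred[:3])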
## First 3 predictions:  [ 7  3 13]
y_train_pred = model.predict_classes(X_train, 
                                     verbose=0)
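# Put the predicted labels in a one-column DataFrame.
y_train_pred1 = pd.DataFrame(y_train_pred)
y_train_pred1.columns=['predicted']

# Copy the true labels so y_train itself is not modified, then name the column.
y_train1 = pd.DataFrame(y_train).copy()
y_train1.columns=['modality']
# Align the predictions with the original row index before concatenating.
y_train_pred1.index=y_train1.index

# Combine true label, modality name, and predicted label side by side.
Train=pd.concat([y_train1['modality'],y_trainNames1['mode'],y_train_pred1['predicted']],axis=1)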

print(Train)
##      modality                        mode  predicted
## 49          7   Hot Stone Therapy Massage          7
## 85          3          Cold Stone Therapy          3
## 34         13         Reflexology Massage         13
## 381         6         Deep tissue Massage          6
## 232         9  Lymphatic Drainage Massage          9
## ..        ...                         ...        ...
## 103        16                  Stretching         16
## 149        14             Shiatsu Massage         14
## 139         4        Craniosacral Massage          4
## 67         10         Massage Gun Therapy         10
## 3           7   Hot Stone Therapy Massage          7
## 
## [365 rows x 3 columns]
y_test_pred = model.predict_classes(X_test, 
                                    verbose=0)
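# Put the predicted labels in a one-column DataFrame.
y_test_pred1 = pd.DataFrame(y_test_pred)
y_test_pred1.columns=['predicted']

# Copy the true labels so y_test itself is not modified, then name the column.
y_test1 = pd.DataFrame(y_test).copy()
y_test1.columns=['modality']
# Align the predictions with the original row index before concatenating.
y_test_pred1.index=y_test1.index

# Combine true label, modality name, and predicted label side by side.
Test=pd.concat([y_test1['modality'],y_testNames1['mode'],y_test_pred1['predicted']],axis=1)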

print(Test)
##      modality                                               mode  predicted
## 342         8  Instrument Assisted Soft Tissue Mobilization (...          8
## 56         10                                Massage Gun Therapy         10
## 304         8  Instrument Assisted Soft Tissue Mobilization (...          8
## 233         9                         Lymphatic Drainage Massage          9
## 51          2                     Cannabidiol (CBD) Massage Balm          2
## ..        ...                                                ...        ...
## 230        16                                         Stretching         16
## 98          9                         Lymphatic Drainage Massage          9
## 322        12                                   Prenatal Massage         12
## 382         3                                 Cold Stone Therapy          3
## 365         2                     Cannabidiol (CBD) Massage Balm          2
## 
## [91 rows x 3 columns]
# Training accuracy: rows where the true and predicted labels match, over all rows.
s = sum(Train['modality']==Train['predicted'])
l = len(Train['modality'])
accTrain = s/l
print('Training Correctly Predicted:',s,'Training Accuracy:',accTrain,'\n')
## Training Correctly Predicted: 365 Training Accuracy: 1.0
# Testing accuracy, computed the same way on the held-out rows.
s = sum(Test['modality']==Test['predicted'])
l = len(Test['modality'])
accTest = s/l
print('Testing Correctly Predicted:',s,'Testing Accuracy:',accTest)
## Testing Correctly Predicted: 91 Testing Accuracy: 1.0
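Overall accuracy can hide a class that is always missed, so it can help to look at precision and recall per class as well. The sketch below is plain Python on made-up labels, not on the `Train` or `Test` tables above. The function name `per_class_precision_recall` is hypothetical, and the codes 6 (Deep tissue), 7 (Hot Stone) and 13 (Reflexology) only follow the numbering shown in the printed tables.

```python
def per_class_precision_recall(actual, predicted):
    """Return {class: (precision, recall)} for every class seen.

    precision = true positives / everything predicted as the class
    recall    = true positives / everything that really is the class
    A class with an empty denominator gets 0.0 for that measure.
    """
    pairs = list(zip(actual, predicted))
    report = {}
    for c in sorted(set(actual) | set(predicted)):
        tp = sum(1 for a, p in pairs if a == c and p == c)
        fp = sum(1 for a, p in pairs if a != c and p == c)
        fn = sum(1 for a, p in pairs if a == c and p != c)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        report[c] = (precision, recall)
    return report

# Made-up example: every Hot Stone row (7) is predicted as Deep tissue (6).
actual    = [6, 6, 7, 7, 7, 7, 13, 13]
predicted = [6, 6, 6, 6, 6, 6, 13, 13]

report = per_class_precision_recall(actual, predicted)
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

for label, (precision, recall) in report.items():
    print('class', label, 'precision:', round(precision, 3),
          'recall:', round(recall, 3))
print('accuracy:', accuracy)
```

With the real tables, the same function could be called as `per_class_precision_recall(list(Test['modality']), list(Test['predicted']))`, assuming those columns hold the integer class codes shown above.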

Keep in mind, this uses the lemmatized tokens from the benefits and contraindications to predict the modality. All the tokens are the same within each modality. We used Random Forest and Gradient Boosted Trees at the beginning of this script with the same results, as they should be, because every sample is a duplicate within its unique massage modality for this recommender system.

[Figures: Random Forest results; Gradient Boosted Trees results]

But model fitting and validation were much faster for the NN, while the code was many fewer lines for the Random Forest and Gradient Boosted Trees.

It would be useful to try this on the healthcare and wellness descriptions to see whether the right class can be predicted.