This script builds on the previous scripts on healthcare recommender systems, but drops the non-professional category, which was replaced with two new categories: lymphatic drainage massage and dry brushing. The risks, side effects, and contraindications were removed from the healthcare documents, as were the sponsored ads, images, and author information. A link to each document is included in the dataset so the original source material can be retrieved. The goal is to build a recommender system that takes these documents on various healthcare and wellness services and recommends a service based on short text supplied by a user.

The multinomial naive Bayes classifier, the random forest classifier (RFC), and the gradient boosting classifier (GBC) will be used, starting with multinomial naive Bayes. For the RFC and GBC classifiers, the words are vectorized as tokens in three ways, each producing its own document term matrix: term frequency-inverse document frequency (TF-IDF), Count, and N-gram. Each vectorization method is also run with more than one train/test split for the RFC and GBC classifiers. Recall, precision, and accuracy are recorded for each model. After a model is built and evaluated on its testing set, short text supplied by a user is passed to it to predict the health and wellness class to recommend.
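As a rough illustration of the three vectorization methods named above, the sketch below builds a count, a TF-IDF, and a bigram document term matrix with scikit-learn. The three short documents are a hypothetical toy corpus, not rows from the dataset, and the bigram setting is only one possible N-gram choice.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical toy corpus, not taken from the dataset.
docs = [
    "massage relieves muscle tension",
    "cupping relieves muscle pain",
    "dry brushing stimulates lymph flow",
]

# Raw term counts per document.
count_vec = CountVectorizer()
count_dtm = count_vec.fit_transform(docs)

# Counts reweighted so that terms found in many documents count for less.
tfidf_vec = TfidfVectorizer()
tfidf_dtm = tfidf_vec.fit_transform(docs)

# Counts of two-word sequences (bigrams) instead of single words.
bigram_vec = CountVectorizer(ngram_range=(2, 2))
bigram_dtm = bigram_vec.fit_transform(docs)

print(count_dtm.shape, tfidf_dtm.shape, bigram_dtm.shape)
print(sorted(bigram_vec.vocabulary_))
```

The count and TF-IDF matrices share the same vocabulary and differ only in the cell values, while the bigram matrix has its own vocabulary of word pairs.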

This is an R Markdown document in RStudio that mostly uses Python 3, through the R package ‘reticulate,’ to build the machine learning models. The RFC and GBC models were built with scikit-learn (sklearn), as was the multinomial naive Bayes model that starts this script. Other Python modules used inside RStudio include pandas, numpy, matplotlib, textblob, re (regex), string, and nltk.

library(reticulate)
## Warning: package 'reticulate' was built under R version 3.6.3
conda_list(conda = "auto") 
##           name                                                  python
## 1    Anaconda2                     C:\\Users\\m\\Anaconda2\\python.exe
## 2    djangoenv    C:\\Users\\m\\Anaconda2\\envs\\djangoenv\\python.exe
## 3     python36     C:\\Users\\m\\Anaconda2\\envs\\python36\\python.exe
## 4     python37     C:\\Users\\m\\Anaconda2\\envs\\python37\\python.exe
## 5 r-reticulate C:\\Users\\m\\Anaconda2\\envs\\r-reticulate\\python.exe

I want to use the python36 environment without having my Python IDE, Anaconda, open. All of the Anaconda environments for Python are listed above.

use_condaenv(condaenv = "python36")
import pandas as pd 
import matplotlib.pyplot as plt 
from textblob import TextBlob 
import sklearn 
import numpy as np 
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer 
from sklearn.naive_bayes import MultinomialNB 
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

import re
import string
import nltk 

np.random.seed(47) 
set.seed(47)

The following data table will not show in your Rstudio environment, but python inside your python IDE will store the table.

modalities = pd.read_csv('benefitsContraindications4.csv', encoding = 'unicode_escape') 
print(modalities.shape)
## (82, 6)
print(modalities.columns)
## Index(['Document', 'Source', 'Topic', 'InternetSearch', 'Contraindications',
##        'risksAdverseEffects'],
##       dtype='object')
print(modalities.head())
##                                             Document  ...                                risksAdverseEffects
## 0  Chiropractic adjustments and treatments serve ...  ...                                                NaN
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...                                                NaN
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...  Risks and side effects associated with chiropr...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...                                                NaN
## 4  Heading to the spa can be a pampering treat, b...  ...                                                NaN
## 
## [5 rows x 6 columns]
print(modalities.tail())
##                                              Document  ... risksAdverseEffects
## 77  General guidelines - When to visit an emergenc...  ...                 NaN
## 78  \nHow to know where to go for sudden health ca...  ...                 NaN
## 79  How to Know If You Need to Go to the E.R. With...  ...                 NaN
## 80  When to call 911 or go to an emergency room im...  ...                 NaN
## 81  Is It an Emergency?\r\n\r\nConditions we treat...  ...                 NaN
## 
## [5 rows x 6 columns]
print(modalities['Topic'].unique())
## ['chiropractic benefits' 'massage benefits' 'physical therapy benefits'
##  'mental health services benefits' 'cupping benefits'
##  'massage gun benefits' 'cold stone benefits' 'Lymphatic Drainage Massage'
##  'dry brushing massage' 'ER']

There are a total of 10 classes to classify in this table of healthcare and wellness documents.

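import regex
def preprocessor(text):
    # strip HTML tags
    text = regex.sub(r'<[^>]*>', '', text)
    # collect emoticons such as :) or ;-D before punctuation is removed
    emoticons = regex.findall(r'(?::|;|=)(?:-)?(?:\)|\(|D|P)', text)
    # lowercase, collapse non-word characters to spaces, then append the emoticons
    text = regex.sub(r'[\W]+', ' ', text.lower()) +\
        ' '.join(emoticons).replace('-', '')
    return text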
modalities.tail()
##                                              Document  ... risksAdverseEffects
## 77  General guidelines - When to visit an emergenc...  ...                 NaN
## 78  \nHow to know where to go for sudden health ca...  ...                 NaN
## 79  How to Know If You Need to Go to the E.R. With...  ...                 NaN
## 80  When to call 911 or go to an emergency room im...  ...                 NaN
## 81  Is It an Emergency?\r\n\r\nConditions we treat...  ...                 NaN
## 
## [5 rows x 6 columns]

Reorder the observations so that they are mixed and not grouped together as they are in the original file.

import numpy as np

modalities = modalities.reindex(np.random.permutation(modalities.index))

print(modalities.head())
##                                              Document  ...                                risksAdverseEffects
## 77  General guidelines - When to visit an emergenc...  ...                                                NaN
## 25  \nFive Warning Signs of Mental Illness\n\n\nIt...  ...                                                NaN
## 14  What Are the Health Benefits of Massage?\n\nMa...  ...                                                NaN
## 40  \n\nCupping therapy is a practice that involve...  ...  You shouldn't use cupping in place of standard...
## 75  The Benefits and Risks of Dry Brushing\n\nWhat...  ...  What are the risks of dry brushing?\nRisks of ...
## 
## [5 rows x 6 columns]
print(modalities.tail())
##                                              Document  ... risksAdverseEffects
## 72  Dry Brushing: The Technique That Stimulates Yo...  ...                 NaN
## 8   BENEFITS OF MASSAGE\r\n\r\nYou know that post-...  ...                 NaN
## 71  Brushing for Lymphedema: Does it really work?\...  ...                 NaN
## 6   25 Reasons to Get a Massage\n\n\n25 Reasons to...  ...                 NaN
## 7   7 Benefits of Massage Therapy\r\n\r\nMassage t...  ...                 NaN
## 
## [5 rows x 6 columns]
modalities.groupby('Topic').describe()
##                                 Document  ... risksAdverseEffects
##                                    count  ...                freq
## Topic                                     ...                    
## ER                                     6  ...                 NaN
## Lymphatic Drainage Massage            10  ...                   1
## chiropractic benefits                  8  ...                   1
## cold stone benefits                    9  ...                   1
## cupping benefits                      10  ...                   1
## dry brushing massage                   5  ...                   1
## massage benefits                      12  ...                   1
## massage gun benefits                   8  ...                   1
## mental health services benefits        6  ...                 NaN
## physical therapy benefits              8  ...                 NaN
## 
## [10 rows x 20 columns]
modalities['length'] = modalities['Document'].map(lambda text: len(text))
print(modalities.head())
##                                              Document  ... length
## 77  General guidelines - When to visit an emergenc...  ...   2039
## 25  \nFive Warning Signs of Mental Illness\n\n\nIt...  ...   8145
## 14  What Are the Health Benefits of Massage?\n\nMa...  ...   1780
## 40  \n\nCupping therapy is a practice that involve...  ...   3488
## 75  The Benefits and Risks of Dry Brushing\n\nWhat...  ...   4932
## 
## [5 rows x 7 columns]
modalities.length.plot(bins=20, kind='hist')
plt.show()

modalities.length.describe()
## count       82.000000
## mean      3974.804878
## std       2518.144552
## min        382.000000
## 25%       2014.250000
## 50%       3320.000000
## 75%       5531.750000
## max      11993.000000
## Name: length, dtype: float64
print(list(modalities.Document[modalities.length > 3900].index))
## [25, 75, 64, 29, 70, 5, 78, 43, 49, 24, 39, 1, 16, 26, 44, 3, 27, 33, 22, 37, 17, 2, 34, 28, 9, 48, 68, 41, 52, 50, 45, 79, 72, 7]
print(list(modalities.Topic[modalities.length > 3900]))
## ['mental health services benefits', 'dry brushing massage', 'Lymphatic Drainage Massage', 'mental health services benefits', 'Lymphatic Drainage Massage', 'massage benefits', 'ER', 'cupping benefits', 'massage gun benefits', 'mental health services benefits', 'cupping benefits', 'chiropractic benefits', 'physical therapy benefits', 'mental health services benefits', 'massage gun benefits', 'chiropractic benefits', 'mental health services benefits', 'chiropractic benefits', 'physical therapy benefits', 'cupping benefits', 'physical therapy benefits', 'chiropractic benefits', 'cupping benefits', 'mental health services benefits', 'massage benefits', 'massage gun benefits', 'Lymphatic Drainage Massage', 'cupping benefits', 'cold stone benefits', 'massage gun benefits', 'massage gun benefits', 'ER', 'dry brushing massage', 'massage benefits']
modalities.hist(column='length', by='Topic', bins=5)


plt.show()

def split_into_tokens(review):
    
    return TextBlob(review).words
modalities.Document.head().apply(split_into_tokens)
## 77    [General, guidelines, When, to, visit, an, eme...
## 25    [Five, Warning, Signs, of, Mental, Illness, It...
## 14    [What, Are, the, Health, Benefits, of, Massage...
## 40    [Cupping, therapy, is, a, practice, that, invo...
## 75    [The, Benefits, and, Risks, of, Dry, Brushing,...
## Name: Document, dtype: object
TextBlob("hello world, how is it going?").tags  # list of (word, POS) pairs
## [('hello', 'JJ'), ('world', 'NN'), ('how', 'WRB'), ('is', 'VBZ'), ('it', 'PRP'), ('going', 'VBG')]
import nltk
nltk.download('stopwords')
## True
## 
## [nltk_data] Downloading package stopwords to
## [nltk_data]     C:\Users\m\AppData\Roaming\nltk_data...
## [nltk_data]   Package stopwords is already up-to-date!
from nltk.corpus import stopwords

stop = stopwords.words('english')
stop = stop + [u'a',u'b',u'c',u'd',u'e',u'f',u'g',u'h',u'i',u'j',u'k',u'l',u'm',u'n',u'o',u'p',u'q',u'r',u's',u't',u'v',u'w',u'x',u'y',u'z']
def split_into_lemmas(review):
    #review = unicode(review, 'iso-8859-1')
    review = review.lower()
    #review = unicode(review, 'utf8').lower()
    #review = str(review).lower()
    words = TextBlob(review).words
    # for each word, take its "base form" = lemma 
    return [word.lemma for word in words if word not in stop]

modalities.Document.head().apply(split_into_lemmas)
## 77    [general, guideline, visit, emergency, room, c...
## 25    [five, warning, sign, mental, illness, easy, g...
## 14    [health, benefit, massage, many, type, massage...
## 40    [cupping, therapy, practice, involves, briefly...
## 75    [benefit, risk, dry, brushing, dry, brushing, ...
## Name: Document, dtype: object
bow_transformerNgrams = CountVectorizer(analyzer=split_into_lemmas,ngram_range=(2,2)).fit(modalities['Document'])
          
print(len(bow_transformerNgrams.vocabulary_))
## 4857
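One caveat about the vectorizer above: the scikit-learn documentation states that `ngram_range` only applies when `analyzer` is not a callable. Because `split_into_lemmas` is passed as the analyzer, the 4857 vocabulary entries are most likely single lemmas rather than bigrams. The sketch below shows the behavior on two made-up sentences; `simple_analyzer` and `bigram_analyzer` are hypothetical stand-ins, not functions from this script.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Made-up sentences for illustration only.
docs = ["dry brushing helps lymph flow", "cupping helps muscle pain"]

def simple_analyzer(text):
    # Hypothetical stand-in for split_into_lemmas: lowercase and split on spaces.
    return text.lower().split()

def bigram_analyzer(text):
    # Build the bigrams by hand so a callable analyzer really returns word pairs.
    tokens = simple_analyzer(text)
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]

# ngram_range has no effect here because the analyzer is a callable.
ignored = CountVectorizer(analyzer=simple_analyzer, ngram_range=(2, 2)).fit(docs)

# The callable itself has to produce the n-grams.
bigrams = CountVectorizer(analyzer=bigram_analyzer).fit(docs)

print(sorted(ignored.vocabulary_))
print(sorted(bigrams.vocabulary_))
```

If true bigrams are wanted with lemmatized tokens, one option is a callable like `bigram_analyzer` that joins neighboring lemmas before returning them.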
modality4 = modalities['Document'][40]
print(modality4)
## 
## 
## Cupping therapy is a practice that involves briefly applying rounded inverted cups to certain parts of the body using a vacuum effect. Some proponents suggest that the drawing of the skin inside the cups increases blood flow to the area.
## 
## Long used in Taditional Chinese Medicine and other ancient healing systems, cupping has gained considerable popularity in recent years among athletes. For instance, swimmer Michael Phelps had the therapy in preparation for the 2016 Summer Olympics.1?
## Uses
## 
## Cupping is often recommended as a complementary therapy for the following conditions:2
## 
##     Back pain
##     Headache or migraine
##     Knee pain
##     Muscle pain and soreness
##     Neck and shoulder pain
##     Sports injuries and performance
##     Bronchial congestion due to the cold or asthma
## 
## In traditional Chinese medicine, cupping is said to stimulate the flow of vital energy (also known as "Qi" or "chi") and blood, and to help correct any imbalances arising from illness or injury. It's sometimes combined with acupuncture and tuina, other therapies said to promote the flow of energy.
## How Does Cupping Therapy Work?
## 
## To create the suction inside the cups, the practitioner may them by placing a flammable substance (such as herbs, alcohol, and/or paper) inside each cup and then igniting that substance. Next, the practitioner places the cup upside down on the body. During a typical cupping treatment, between three and seven cups are placed on the body.
## 
## Today, many practitioners use a manual or electric pump to create the vacuum, or use self-suctioning cupping sets. After the cups are in place, they are usually removed after five to ten minutes. (Practitioners may practice "flash" cupping, by quickly attaching then removing the cup repeatedly.)
## 
## Some practitioners apply massage oil or cream and then attach silicone cups, sliding them around the body rhythmically for a massage-like effect.
## 
## In a procedure known as "wet cupping," the skin is punctured prior to treatment. This causes blood to flow out of the punctures during the cupping procedure, which is thought to clear toxins from the body.4?
## Benefits
## 
## To date, there is a lack of high-quality scientific research to support the use of cupping to treat any health condition. For instance, a 2011 research review sized up seven trials testing cupping in people with pain (such as low back pain); results showed that most of the studies were of poor quality.4?
## 
## In another research review published in 2017, scientists analyzed 11 studies that tested the use of cupping by athletes. The review's authors concluded that no explicit recommendation could be made for or against the use of cupping in athletes and that further studies are needed. Some studies did show that cupping improved perceptions of pain and disability and had favorable effects on range of motion compared to no cupping.3?
## 
## Although cupping is sometimes recommended to increase flexibility in athletes, a small study published in the Journal of Sports Rehabilitation in 2018 found no change in hamstring flexibility after a seven-minute cupping session using four cups. Study participants were NCAA Division III college soccer players without symptoms.5?
## 
## 
## 
## 
## 
## 
## 
## Bottom Line
## 
## After seeing high-profile athletes and celebrities sport the characteristic round purple marks, it may be tempting to try cupping, but there's currently a lack of research on cupping. If you're still thinking of trying it, be sure to consult your doctor before beginning treatment.
bow4 = bow_transformerNgrams.transform([modality4])
print(bow4)
##   (0, 5) 1
##   (0, 7) 3
##   (0, 22)    1
##   (0, 52)    1
##   (0, 56)    1
##   (0, 58)    1
##   (0, 59)    1
##   (0, 188)   1
##   (0, 267)   1
##   (0, 296)   1
##   (0, 304)   1
##   (0, 316)   1
##   (0, 323)   1
##   (0, 325)   1
##   (0, 326)   1
##   (0, 346)   1
##   (0, 375)   1
##   (0, 376)   1
##   (0, 384)   1
##   (0, 390)   1
##   (0, 395)   1
##   (0, 429)   1
##   (0, 432)   5
##   (0, 437)   1
##   (0, 439)   1
##   :  :
##   (0, 4339)  1
##   (0, 4346)  1
##   (0, 4351)  1
##   (0, 4391)  1
##   (0, 4426)  1
##   (0, 4431)  1
##   (0, 4454)  1
##   (0, 4458)  3
##   (0, 4463)  1
##   (0, 4481)  1
##   (0, 4482)  1
##   (0, 4486)  1
##   (0, 4497)  1
##   (0, 4570)  1
##   (0, 4580)  1
##   (0, 4584)  5
##   (0, 4585)  1
##   (0, 4588)  2
##   (0, 4589)  1
##   (0, 4597)  2
##   (0, 4654)  1
##   (0, 4718)  1
##   (0, 4749)  1
##   (0, 4760)  1
##   (0, 4792)  1
modalities_bow = bow_transformerNgrams.transform(modalities['Document'])
print('sparse matrix shape:', modalities_bow.shape)
## sparse matrix shape: (82, 4857)
print('number of non-zeros:', modalities_bow.nnz)
## number of non-zeros: 17756
print('sparsity: %.2f%%' % (100.0 * modalities_bow.nnz / (modalities_bow.shape[0] * modalities_bow.shape[1])))
## sparsity: 4.46%
modalities_bow
## <82x4857 sparse matrix of type '<class 'numpy.int64'>'
##  with 17756 stored elements in Compressed Sparse Row format>
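The 4.46% printed above is the share of cells that are non-zero, which is usually called the density of the matrix; the sparsity is the complement. A small check using only the numbers reported above:

```python
# Figures reported above for the document term matrix.
n_rows, n_cols, nnz = 82, 4857, 17756

density = nnz / (n_rows * n_cols)   # share of cells that hold a count
sparsity = 1.0 - density            # share of cells that are zero

print('density: %.2f%%' % (100.0 * density))
print('sparsity: %.2f%%' % (100.0 * sparsity))
```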

# Split/splice into training ~ 80% and testing ~ 20%
modalities_bow_train = modalities_bow[:65]
modalities_bow_test = modalities_bow[65:]
modalities_sentiment_train = modalities['Topic'][:65]
modalities_sentiment_test = modalities['Topic'][65:]

print(modalities_bow_train.shape)
## (65, 4857)
print(modalities_bow_test.shape)
## (17, 4857)
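Slicing the shuffled rows at 65 gives the 80/20 split, but it does not guarantee that every class lands in the testing rows. As an alternative sketch, scikit-learn's `train_test_split` with `stratify` keeps the class proportions on both sides. The class counts below are the ones reported by the `groupby` above; the document strings and the `random_state` value are placeholders chosen for illustration.

```python
from sklearn.model_selection import train_test_split

# Class counts as reported by the groupby above; the documents are placeholders.
class_counts = {
    'ER': 6, 'Lymphatic Drainage Massage': 10, 'chiropractic benefits': 8,
    'cold stone benefits': 9, 'cupping benefits': 10, 'dry brushing massage': 5,
    'massage benefits': 12, 'massage gun benefits': 8,
    'mental health services benefits': 6, 'physical therapy benefits': 8,
}
labels = [topic for topic, n in class_counts.items() for _ in range(n)]
docs = ['document %d' % i for i in range(len(labels))]

# stratify=labels asks for the same class mix in the training and testing sets.
docs_train, docs_test, y_train, y_test = train_test_split(
    docs, labels, test_size=0.2, stratify=labels, random_state=47)

print(len(y_train), len(y_test))
print(sorted(set(y_test)))
```

With this approach the vectorizer would normally be fit on `docs_train` only and then applied to `docs_test`, so that the testing documents do not contribute to the vocabulary.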
modalities_sentiment = MultinomialNB().fit(modalities_bow_train, modalities_sentiment_train)
print('predicted:', modalities_sentiment.predict(bow4)[0])
## predicted: cupping benefits
print('expected:', modalities.Topic[40])
## expected: cupping benefits
predictions = modalities_sentiment.predict(modalities_bow_test)
#print(predictions)

prd = pd.DataFrame(predictions)
prd.columns=['predictions']
prd.index=modalities_sentiment_test.index
pred=pd.concat([pd.DataFrame(prd),modalities_sentiment_test],axis=1)
print(pred)
##                    predictions                       Topic
## 68  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 41            cupping benefits            cupping benefits
## 52            massage benefits         cold stone benefits
## 65  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 50        massage gun benefits        massage gun benefits
## 55         cold stone benefits         cold stone benefits
## 45        massage gun benefits        massage gun benefits
## 79                          ER                          ER
## 81                          ER                          ER
## 59         cold stone benefits         cold stone benefits
## 23   physical therapy benefits   physical therapy benefits
## 51        massage gun benefits        massage gun benefits
## 72        dry brushing massage        dry brushing massage
## 8             massage benefits            massage benefits
## 71  Lymphatic Drainage Massage        dry brushing massage
## 6             massage benefits            massage benefits
## 7             massage benefits            massage benefits
print('accuracy', accuracy_score(modalities_sentiment_test, predictions))
## accuracy 0.8823529411764706
print('confusion matrix\n', confusion_matrix(modalities_sentiment_test, predictions))
## confusion matrix
##  [[2 0 0 0 0 0 0 0]
##  [0 2 0 0 0 0 0 0]
##  [0 0 2 0 0 1 0 0]
##  [0 0 0 1 0 0 0 0]
##  [0 1 0 0 1 0 0 0]
##  [0 0 0 0 0 3 0 0]
##  [0 0 0 0 0 0 3 0]
##  [0 0 0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(modalities_sentiment_test, predictions))
##                             precision    recall  f1-score   support
## 
##                         ER       1.00      1.00      1.00         2
## Lymphatic Drainage Massage       0.67      1.00      0.80         2
##        cold stone benefits       1.00      0.67      0.80         3
##           cupping benefits       1.00      1.00      1.00         1
##       dry brushing massage       1.00      0.50      0.67         2
##           massage benefits       0.75      1.00      0.86         3
##       massage gun benefits       1.00      1.00      1.00         3
##  physical therapy benefits       1.00      1.00      1.00         1
## 
##                   accuracy                           0.88        17
##                  macro avg       0.93      0.90      0.89        17
##               weighted avg       0.92      0.88      0.88        17

From the above, precision, TP/(TP+FP), is reduced by type 1 errors (false positives: real negatives classified as positives), and recall, TP/(TP+FN), is reduced by type 2 errors (false negatives: real positives classified as negatives). Looking at precision in our testing set, the ER, cold stone, cupping, dry brushing, massage gun, and physical therapy classes were always correct when they were predicted. The Lymphatic Drainage Massage and massage benefits classes were not: one dry brushing document (index 71) was predicted as Lymphatic Drainage Massage, and one cold stone document (index 52) was predicted as massage benefits. So, when the classifier predicted Lymphatic Drainage Massage the classification was correct 67% of the time, and 75% of the time for massage benefits. For recall, every ER, Lymphatic Drainage Massage, cupping, massage, massage gun, and physical therapy document was ‘found’ or recalled, while only two of the three cold stone documents (67%) and one of the two dry brushing documents (50%) were. The chiropractic and mental health services classes do not appear in this testing set at all, so the report says nothing about them.
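The per-class figures in the classification report can be recomputed by hand from the confusion matrix printed above. The sketch below copies that matrix and its label order (the sorted labels present in the testing set) and uses plain Python: precision divides each diagonal cell by its column total, and recall divides it by its row total.

```python
# Labels in the order used by the confusion matrix above (sorted test-set labels).
labels = ['ER', 'Lymphatic Drainage Massage', 'cold stone benefits',
          'cupping benefits', 'dry brushing massage', 'massage benefits',
          'massage gun benefits', 'physical therapy benefits']

# Confusion matrix copied from the output above (row = expected, col = predicted).
cm = [[2, 0, 0, 0, 0, 0, 0, 0],
      [0, 2, 0, 0, 0, 0, 0, 0],
      [0, 0, 2, 0, 0, 1, 0, 0],
      [0, 0, 0, 1, 0, 0, 0, 0],
      [0, 1, 0, 0, 1, 0, 0, 0],
      [0, 0, 0, 0, 0, 3, 0, 0],
      [0, 0, 0, 0, 0, 0, 3, 0],
      [0, 0, 0, 0, 0, 0, 0, 1]]

n = len(labels)
precision, recall = {}, {}
for i, label in enumerate(labels):
    tp = cm[i][i]
    predicted_as_label = sum(cm[r][i] for r in range(n))  # column total: TP + FP
    actually_label = sum(cm[i])                           # row total: TP + FN
    precision[label] = tp / predicted_as_label
    recall[label] = tp / actually_label

accuracy = sum(cm[i][i] for i in range(n)) / sum(sum(row) for row in cm)

for label in labels:
    print('%-28s precision %.2f  recall %.2f' % (label, precision[label], recall[label]))
print('accuracy %.2f' % accuracy)
```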

A note on precision and recall: different sources use different mnemonics for remembering them, and even set up their Type I and Type II diagrams differently, which can lead to confusion. It is important to keep it simple and not introduce more confusion than needed. Recall is analogous to saying, ‘out of all those that were really class A, we found this percent of them.’ Precision is analogous to saying, ‘out of all the classifications we made for class A, we were right this percent of the time.’ Or find what works for you. Precision is hurt by Type 1 errors and recall is hurt by Type 2 errors. Some mnemonics say to count the negatives: a false positive has one negative word (false), so it is type 1, and a false negative has two (false and negative), so it is type 2. If you predicted positive when the truth is negative, it is a false positive or type 1 error, and if you predicted negative when the truth is positive, it is a false negative or type 2 error.

Aside: If your concern is improving recall or improving precision, you would need to tune your predictive model for better precision or recall and put aside the need for accuracy to be close to 100%. If accuracy is 98% and your recall is 60%, that is not going to help you if, say, you needed to find all of the tumor cells and only found 60% of them. The same goes for precision. Say your accuracy is 98%, your precision is 80%, and your recall is 90%, and you are predicting whether a stock price will increase the next day. You may have found 90% of the days that increased, but 20% of the days you predicted as increasing actually decreased, and you could have bought that stock on those days. In that case your precision needs to be improved. When in doubt, do as the nerds do and set up 10 analogous examples of real-life outcomes and see if it holds, or you could stand in a corner until you fall asleep and dream about what a type 1 and a type 2 error is. Or, depending on your access to Google, just google it. But when classifying, you should definitely make sure you have a clear understanding of the difference between recall and precision and why each matters.

If this was a movie, there would be an unanswered question as to how to tune for these improvements in precision or recall. Since the above uses multinomial naive Bayes, the probabilities follow the principle of garbage in, garbage out. You would have to get better data, add more relevant features, exclude features, play with the testing and training sets, remove outliers, or normalize the data to include outliers by taking the log, and get to know the sometimes hundreds of features your algorithm sees when using a package like nltk to tokenize words. You will see this later in the list of nltk attributes in the other algorithm and model testing on this dataset.

modalitiesu = modalities.Topic.unique()
mus = np.sort(modalitiesu)
mus
## array(['ER', 'Lymphatic Drainage Massage', 'chiropractic benefits',
##        'cold stone benefits', 'cupping benefits', 'dry brushing massage',
##        'massage benefits', 'massage gun benefits',
##        'mental health services benefits', 'physical therapy benefits'],
##       dtype=object)

def predict_modality(new_review): 
    new_sample = bow_transformerNgrams.transform([new_review])
    pr = np.around(modalities_sentiment.predict_proba(new_sample),2)
    print(new_review,'\n\n', pr)
    print('\n\nThe respective order:\n 0-ER\n 1-Lymphatic Drainage Massage\n 2-chiropractic therapy\n 3-cold stone therapy\n 4-cupping therapy\n 5-dry brushing massage\n 6-massage benefits\n 7-massage gun therapy\n 8-mental health services\n 9-physical therapy services\n\n')
    
    if (pr[0][0] == max(pr[0])):
        print('The max probability is Emergency Room services for this recommendation with ', pr[0][0]*100,'%')
    elif (pr[0][1] == max(pr[0])):
        print('The max probability is Lymphatic Drainage Massage for this recommendation with ', pr[0][1]*100,'%')
        
    elif (pr[0][2] == max(pr[0])):
        print('The max probability is chiropractic therapy for this recommendation with ', pr[0][2]*100,'%')
        
    elif (pr[0][3] == max(pr[0])):
        print('The max probability is cold stone massage for this recommendation with ', pr[0][3]*100,'%')
        
    elif (pr[0][4] == max(pr[0])):
        print('The max probability is cupping therapy for this recommendation with ', pr[0][4]*100,'%')
   
    elif (pr[0][5] == max(pr[0])):
        print('The max probability is dry brushing massage for this recommendation with ', pr[0][5]*100,'%')
    
    elif (pr[0][6] == max(pr[0])):
        print('The max probability is massage therapy for this recommendation with ', pr[0][6]*100,'%')
    
    elif (pr[0][7] == max(pr[0])):
        print('The max probability is massage gun therapy for this recommendation with ', pr[0][7]*100,'%')
    
    elif (pr[0][8] == max(pr[0])):
        print('The max probability is mental health services for this recommendation with ', pr[0][8]*100,'%')
    else:
        print('The max probability is physical therapy services for this recommendation with ', pr[0][9]*100,'%')
    print('-----------------------------------------\n\n')
predict_modality('Headaches, body sweats, depressed.')
## Headaches, body sweats, depressed. 
## 
##  [[0.01 0.09 0.32 0.04 0.03 0.03 0.46 0.01 0.01 0.01]]
## 
## 
## The respective order:
##  0-ER
##  1-Lymphatic Drainage Massage
##  2-chiropractic therapy
##  3-cold stone therapy
##  4-cupping therapy
##  5-dry brushing massage
##  6-massage benefits
##  7-massage gun therapy
##  8-mental health services
##  9-physical therapy services
## 
## 
## The max probability is massage therapy for this recommendation with  46.0 %
## -----------------------------------------
predict_modality('sleepless, energy depraved, cold, tension')
## sleepless, energy depraved, cold, tension 
## 
##  [[0.   0.   0.02 0.84 0.11 0.01 0.01 0.01 0.   0.  ]]
## 
## 
## The respective order:
##  0-ER
##  1-Lymphatic Drainage Massage
##  2-chiropractic therapy
##  3-cold stone therapy
##  4-cupping therapy
##  5-dry brushing massage
##  6-massage benefits
##  7-massage gun therapy
##  8-mental health services
##  9-physical therapy services
## 
## 
## The max probability is cold stone massage for this recommendation with  84.0 %
## -----------------------------------------
predict_modality('body aches from working out')
## body aches from working out 
## 
##  [[0.   0.08 0.06 0.07 0.09 0.02 0.22 0.26 0.06 0.14]]
## 
## 
## The respective order:
##  0-ER
##  1-Lymphatic Drainage Massage
##  2-chiropractic therapy
##  3-cold stone therapy
##  4-cupping therapy
##  5-dry brushing massage
##  6-massage benefits
##  7-massage gun therapy
##  8-mental health services
##  9-physical therapy services
## 
## 
## The max probability is massage gun therapy for this recommendation with  26.0 %
## -----------------------------------------
predict_modality('can\'t move my arm. stuck at home. worried about my neck.')
## can't move my arm. stuck at home. worried about my neck. 
## 
##  [[0.06 0.4  0.12 0.01 0.02 0.04 0.19 0.02 0.01 0.13]]
## 
## 
## The respective order:
##  0-ER
##  1-Lymphatic Drainage Massage
##  2-chiropractic therapy
##  3-cold stone therapy
##  4-cupping therapy
##  5-dry brushing massage
##  6-massage benefits
##  7-massage gun therapy
##  8-mental health services
##  9-physical therapy services
## 
## 
## The max probability is Lymphatic Drainage Massage for this recommendation with  40.0 %
## -----------------------------------------
predict_modality('breathing ragged, tired, headaches, dizzy, nausious ')
## breathing ragged, tired, headaches, dizzy, nausious  
## 
##  [[0.19 0.04 0.37 0.02 0.04 0.01 0.32 0.   0.01 0.01]]
## 
## 
## The respective order:
##  0-ER
##  1-Lymphatic Drainage Massage
##  2-chiropractic therapy
##  3-cold stone therapy
##  4-cupping therapy
##  5-dry brushing massage
##  6-massage benefits
##  7-massage gun therapy
##  8-mental health services
##  9-physical therapy services
## 
## 
## The max probability is chiropractic therapy for this recommendation with  37.0 %
## -----------------------------------------
predict_modality("relief from this pain. can't sleep. feet hurt. chills.")
## relief from this pain. can't sleep. feet hurt. chills. 
## 
##  [[0.   0.03 0.5  0.   0.02 0.   0.32 0.02 0.   0.11]]
## 
## 
## The respective order:
##  0-ER
##  1-Lymphatic Drainage Massage
##  2-chiropractic therapy
##  3-cold stone therapy
##  4-cupping therapy
##  5-dry brushing massage
##  6-massage benefits
##  7-massage gun therapy
##  8-mental health services
##  9-physical therapy services
## 
## 
## The max probability is chiropractic therapy for this recommendation with  50.0 %
## -----------------------------------------
predict_modality('love this place better than others')
## love this place better than others 
## 
##  [[0.01 0.02 0.1  0.02 0.3  0.01 0.02 0.04 0.47 0.01]]
## 
## 
## The respective order:
##  0-ER
##  1-Lymphatic Drainage Massage
##  2-chiropractic therapy
##  3-cold stone therapy
##  4-cupping therapy
##  5-dry brushing massage
##  6-massage benefits
##  7-massage gun therapy
##  8-mental health services
##  9-physical therapy services
## 
## 
## The max probability is mental health services for this recommendation with  47.0 %
## -----------------------------------------
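The long `if`/`elif` chain in `predict_modality` can also be written as a lookup of the largest probability against the sorted class labels. The sketch below is one possible shape for that, in plain Python; `top_recommendation` is a hypothetical helper name, and the probability row is copied from the first prediction printed above. With a fitted scikit-learn classifier, the label order is available from its `classes_` attribute.

```python
# Sorted class labels, as shown in `mus` above; a fitted scikit-learn classifier
# exposes the same ordering through its `classes_` attribute.
class_labels = ['ER', 'Lymphatic Drainage Massage', 'chiropractic benefits',
                'cold stone benefits', 'cupping benefits', 'dry brushing massage',
                'massage benefits', 'massage gun benefits',
                'mental health services benefits', 'physical therapy benefits']

def top_recommendation(probabilities, labels):
    """Return the label with the highest probability and that probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best], probabilities[best]

# Probability row printed above for 'Headaches, body sweats, depressed.'
probs = [0.01, 0.09, 0.32, 0.04, 0.03, 0.03, 0.46, 0.01, 0.01, 0.01]

label, p = top_recommendation(probs, class_labels)
print('%s (%.0f%%)' % (label, 100 * p))
```

Like the original chain, this returns the first class when two probabilities tie for the maximum.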

library(reticulate)
use_condaenv(condaenv = "python36")
import pandas as pd 
import matplotlib.pyplot as plt 
from textblob import TextBlob 
import sklearn 
import numpy as np 
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer 
from sklearn.naive_bayes import MultinomialNB 
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

import re
import string
import nltk 

np.random.seed(45678) 

The first part of the following code uses the Random Forest Classifier (RFC) and the Gradient Boosting Classifier (GBC) to categorize the recommendation based on three separate types of document term matrix (dtm) tokenization: Count Vectorizer, Term Frequency-Inverse Document Frequency (TF-IDF) Vectorizer, and N-gram Vectorizer. This first part also uses lemmatization to reduce each word to its dictionary root form. Within each type of vectorized and lemmatized token set, the model is trained on 80% of the samples and tested on the remaining 20%.
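
The three tokenizations can be illustrated on a toy corpus. This is a minimal sketch, not the code used later in this script: the three short documents are made up, and the "N-gram" vectorizer is assumed here to be a `CountVectorizer` restricted to bigrams via `ngram_range=(2, 2)`.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Toy documents standing in for the healthcare articles (hypothetical data).
docs = [
    "massage relieves back pain",
    "chiropractic adjustment relieves neck pain",
    "physical therapy restores movement",
]

vectorizers = {
    "count": CountVectorizer(),                    # raw term counts
    "tfidf": TfidfVectorizer(),                    # counts reweighted by TF-IDF
    "ngram": CountVectorizer(ngram_range=(2, 2)),  # assumed: bigram counts only
}

# One document term matrix per tokenization: rows are documents, columns are tokens.
dtms = {name: vec.fit_transform(docs) for name, vec in vectorizers.items()}
for name, dtm in dtms.items():
    print(name, dtm.shape)
```

The Count and TF-IDF matrices share the same vocabulary and differ only in the cell values, while the bigram matrix has its own columns, one per adjacent word pair.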

The second part keeps everything the same but changes the training set to 85% and the testing set to 15%, still using lemmatization.

The third part keeps the first part's settings but replaces lemmatization with stemmed word roots.

The fourth part keeps the third part's settings but changes the training set to 85% and the testing set to 15%.

These four variations allow us to compare which setting works best for recall, precision, and accuracy within each vectorized and stemmed or lemmatized set of tokens in the dtms, for either RFC or GBC.
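
The four parts, three vectorizers, and two classifiers can be laid out as a grid of runs. This is only a sketch of the bookkeeping; the dictionary keys and the short names are illustrative assumptions, not identifiers from this script.

```python
from itertools import product

root_methods = ["lemmatized", "stemmed"]
test_sizes = [0.20, 0.15]            # 80/20 and 85/15 train/test splits
vectorizers = ["count", "tfidf", "ngram"]
classifiers = ["RFC", "GBC"]

# Parts 1-4 in the order described above: (root method, test share).
parts = list(product(root_methods, test_sizes))

# Every part is run with each vectorizer and each classifier.
runs = [
    {"part": i + 1, "roots": roots, "test_size": size,
     "vectorizer": vec, "classifier": clf}
    for i, (roots, size) in enumerate(parts)
    for vec, clf in product(vectorizers, classifiers)
]

print(len(runs), "model fits to compare")
```

Each entry identifies one model whose recall, precision, and accuracy would be recorded, which makes it easy to confirm that no combination is skipped.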


First part: Lemmatized Tokens & 80/20 Train/Test split & RFC | GBC

Count Vectorizer RFC and GBC


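# English stop words, plus a stemmer and a lemmatizer for the two root-word variants
stopwords = nltk.corpus.stopwords.words('english')
ps = nltk.PorterStemmer()
wn = nltk.WordNetLemmatizer()
# Load the documents and name the six columns
data = pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns = ['Document', 'Source', 'Topic', 'InternetSearch', 'Contraindications', 'RisksSideEffects']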
data <- read.csv('benefitsContraindications4.csv',sep=',',header=TRUE,  na.strings=c('',' ','NA'))
colnames(data)
## [1] "Document"            "Source"              "Topic"              
## [4] "InternetSearch"      "Contraindications"   "risksAdverseEffects"
head(data,5)
##   Document
## 1 Chiropractic adjustments and treatments serve the needs of millions of people around the world.\n\n \n\nAdjustments offer effective, non-invasive and cost-effective solutions to neck and back pain, as well as a myriad of other medical issues.\n\nHave you ever stopped to wonder how many of us suffer from neck and back stiffness or pain?\n\nApart from the obvious discomfort, simple daily tasks such as driving a car, crossing a busy street and picking things up from the floor can become all too challenging for individuals experiencing such pain.\n\nAs anyone who has experienced pain would know, having restricted movement can be debilitating and unfortunately, our busy world doesn't allow for us to stop.\n\n \nSome of the benefits of long-term chiropractic care include:\n\n    Chiropractors can identify mechanical issues that cause spine-related pain and offer a series of adjustments that provide near immediate relief. Following appointments, patients often report feeling their symptoms noticeably better.\n    When a chiropractor performs an adjustment, they can help restore movement in joints that have "locked up". This becomes possible as treatment allows muscles surrounding joints to relax, thereby reducing joint stiffness.\n    Many factors affect health, including exercise patterns, nutrition, sleep, heredity and the environment in which we live. Rather than just treat symptoms of the disease, chiropractic care focuses on a holistic approach to naturally maintain health and resist disease.\n    Chiropractic adjustments help restore normal function and movement to the entire body. Many patients report an improvement in their ability to move with efficiency and strength.\n    Many patients find delight in the results chiropractic adjustments have on old and chronic injuries. Whether an injury is, in fact, new or old, chiropractic care can help reduce pain, restore mobility and provide quick pain relief to all joints in the body. Such care can help maintain better overall health and thus faster recovery time.\n\nHave you ever noticed that when you are in pain and unable to perform regular or favorite activities, it can put a strain on emotional and mental well-being?\n\nFor example, the increased stress from not being able to properly perform a paid job. This, in turn, can have a negative impact on physical health with increases in heart rate and blood pressure. The domino effect often continues with sleep becoming disturbed, with resulting lethargy and tiredness during the day. Does anyone really feel up to exercising in this state?\n\nChiropractic care is a natural method of healing the body's communication system and never relies on the use of pharmaceutical drugs or invasive surgery.\n\n\n\n
## 2 \nUnitedHealthcare Combats Opioid Crisis with Non-Opioid Benefits\nPhysical therapy and chiropractic care can prevent or reduce expensive, invasive spinal procedures, such as imaging or surgery, to reduce opioid use and cut costs.\nUnitedHealthcare, opioid, physical therapy, healthcare spending\n\n\nOctober 29, 2019 - UnitedHealthcare (UHC) is combatting the opioid epidemic and high healthcare costs with new physical therapy and chiropractic care benefits to prevent, delay, or in some cases substitute for invasive spinal procedures.\n\n"With millions of Americans experiencing low back pain currently or at some point during their lifetimes, we believe this benefit design will help make a meaningful difference by improving health outcomes while reducing costs," said Anne Docimo, MD, UnitedHealthcare chief medical officer.\n\nLower back pain is in part responsible for sustaining the opioid epidemic and also increases healthcare costs.\nDig Deeper\n\nAlthough opioid overdoses fell by two percent from 2017 to 2018 and a legal battles aim to hold pharmaceutical companies accountable, there is no end in sight for the opioid epidemic. Industry professionals are still grappling with the balance between cutting opioid prescriptions will working to reduce patient pain.\n\nCommon conditions such as low back pain bolster the epidemic's presence, with clinicians still prescribing the opioids against best practice recommendations. According to a recent OptumLabs study, 9 percent of patients with newly diagnosed low back pain are prescribed opioids and lower back pain currently contributes 52 percent to the overall opioid prescription rate.\n\nIn addition to boosting opioids distribution, alternative, invasive lower back pain treatments can significantly impact healthcare spending.\n\nIt is not new information that physical therapy and chiropractic care are effective, lower cost alternatives to spinal imaging or surgery. However, payers are still in the process of adopting the method.\n\nTo counteract the high-cost, high-risk potential of using opioids to treat back pain, UHC created a benefit that does not rely on medication or technology but rather on physical therapy and chiropractic care.\n\nThe benefit allows eligible employers to offer physical therapist and chiropractor visits with no out-of-pocket costs. Members who already receive physical therapist and chiropractic care benefits under UHC's employer-sponsored health plans and who have maxed out their visits will not receive additional visits under this benefit.\n\nHowever, for those who still have visits to use and who choose physical therapy or chiropractic care over other forms of treatment, the copay or deductible for those visits will be waived and they will receive three visits at no cost.\n\nUHC has high expectations for the fiscal and physical impacts of this benefit.\n\nAccording to UHC's analysis, the health payer expects that by 2021, opioid use will decrease by 19 percent. Spinal imaging test frequency and spinal surgeries will be reduced by 22 percent and 21 percent, respectively. In addition to these specific goals, UHC hopes to see a decrease in the overall cost of spinal care.\n\nThe same OptumLabs study demonstrated that UHC's expectations are not without precedent.\n\nThe study looked at the correlation between out-of-pocket costs and patient utilization of noninvasive treatments. Researchers discovered that members whose copay was over $30 were a little under 30 percent less likely to choose physical therapy as opposed to more invasive treatments.\n\nAn American Journal of Managed Care study in June 2019 found that patients with high deductibles, typically over $1,000, were less likely to visit physical therapy.\n\nEligible employers may be brand new or renewing their membership. They must be fully insured and over 51 or more employees strong. The benefit is currently available in Connecticut, Florida, Georgia, New York, and North Carolina.\n\nHowever, UHC plans to expand the benefit from 2020 into 2021. By the end of this expansion period, the benefit will also be available to self-funded employers and organizations with an employee population between 2 and 50. The benefit will span ten states, primarily in the southeast.\n\n"This new benefit design may help encourage people with low back pain to get the right care at the right time and in the right setting, helping expand access to evidence-based and more affordable treatments," said Docimo.
## 3 The Safety of Chiropractic Adjustments\n\nChiropractic adjustment, also called spinal manipulation, is a procedure done by a chiropractor using the hands or small instruments to apply controlled force to a spinal joint. The goal is to improve spinal motion and physical function of the entire body. Chiropractic adjustment is safe when performed by someone who is properly trained and licensed to practice chiropractic care. Complications are rare, but they are possible. Learn more about both the benefits and risks.\nChiropractic Adjustment\n\nOne of the most important reasons people seek chiropractic care is because it is a completely drug-free therapy. Someone dealing with joint pain, back pain, or headaches might consider visiting a chiropractor.\n\nThe goal of chiropractic adjustment is to place the body into a proper position so the body can heal itself. Treatments are believed to reduce stress on the immune system, reducing the potential for disease. 
Chiropractic care aims to address the entire body, including a person's ability to move, perform, and even think.\nWhat Research Shows\n\nMany people wonder how helpful chiropractic care is in treating years of trauma and poor posture. There have been numerous studies showing the therapeutic benefits of chiropractic care.\nSciatica\n\nSciatica is a type of pain affecting the sciatic nerve, the large nerve extending from the low back down the back of the legs. Other natural therapies don't always offer relief and most people want to avoid steroid injections and surgery, so they turn to chiropractic care.\n\nA double-blind trial reported in the Spine Journal compared active and simulated chiropractic manipulations in people with sciatic nerve pain. Active manipulations involved the patient laying down and receiving treatment from a chiropractor. Stimulated manipulations involved electrical muscle stimulation with electrodes placed on the skin to send electrical pulses to different parts of the body.\n\nThe researchers determined active manipulation offered more benefits than stimulated. The people who received active manipulations experienced fewer days of moderate or severe pain and other sciatica symptoms. They also reported no adverse effects.\nNeck Pain\n\nOne study reported in the Annals of Internal Medicine looked at different therapies for treating neck pain. They divided 272 study participants into three groups: one that received spinal manipulation from a chiropractic doctor, a second group given over-the-counter (OTC) pain relievers, narcotics, and muscle relaxers, and a third group who did at-home exercises. \n\nAfter 12 weeks, patients reported a 75% pain reduction, with the chiropractic treatment group achieving the most improvement. 
About 57% of the chiropractic group achieved pain reduction, while 48% received pain reduction from exercising, and 33% from medication.\n\nAfter one year, 53% of the drug-free groups continued to report pain relief compared to only 38% of those taking pain medications. \nHeadaches\n\nCervicogenic headaches and migraines are commonly treated by chiropractors. Cervicogenic headaches are often called secondary headaches because pain is usually referred from another source, usually the neck. Migraine headaches cause severe, throbbing pain and are generally experienced on one side of the head. There are few non-medicinal options for managing both types of chronic headaches.\n\nResearch reported in the Journal of Manipulative and Physiological Therapeutics suggests chiropractic care, specifically spinal manipulation, can improve migraines and cervicogenic headaches.  \nFrozen Shoulder\n\nFrozen shoulder affects the shoulder joint and involves pain and stiffness that develops gradually and gets worse. Frozen shoulder can be quite painful, and treatment involves preserving as much range of motion in the shoulder as possible and managing pain.\n\nA clinical trial reported in the Journal of Chiropractic Medicine described how patients suffering from frozen shoulder responded to chiropractic treatment. Of the 50 patients, 16 completely recovered, 25 showed a 75 to 90% improvement, and eight showed a 50 to 75% improvement. Only one person showed zero to 50% improvement. The researchers concluded most people can get improvement by treating frozen shoulder with chiropractic treatment.\nPreventing Need for Surgery\n\nChiropractic care may reduce the need for back surgery. 
Guidelines reported in the Journal of the American Medical Association suggest that it's reasonable for people suffering from back pain to try spinal manipulation before deciding on surgical intervention.\nLow Back Pain\n\nStudies have shown chiropractic care, including spinal manipulation, can provide relief from mild to moderate low back pain. In fact, spinal manipulation may work as well as other standard treatments, including pain-relief medications.\n\nA 2011 review of 26 clinical trials looked at the effectiveness of different treatments for chronic low back pain. What they found was that spinal manipulation is just as effective as other treatments for reducing back pain and improving function.\nSafety\n\n\n
## 4 Advanced Chiropractic Relief: 8 Key Benefits of Chiropractor Care\n\nAre you one of the 50 million Americans who suffer from chronic pain? If so, you're probably intimately familiar with the feeling of pure desperation that can arise from an inability to find relief.\n\nIn addition to physical issues, chronic pain can cause anxiety, depression, and more. However, there could be a light at the end of the tunnel. Many people are finding advanced chiropractic relief that is completely changing their lives.\n\nYour body is a world in itself. At this very moment, more than a million chemical reactions are taking place in your body. It manufactures energy, it regulates your heartbeat, your breathing and it regenerates and heals itself. Everything takes place without your conscious knowledge, without you controlling it voluntarily. The master system that controls it all is your nervous system.\n\nThe nervous system is made out of your brain, spinal cord and all your nerves.\n\nThe energy that flows through your nervous system in your body is like electricity. In order to have that electric flow normally and freely, we need to have a well functioning spine. Whenever you have disruption of that flow, disease happens. That would be the case when your spine is misaligned or is not moving properly.\n\nDid you know that 90% of stimulation and nutrition to the brain is generated by the movement of the spine?\n\nThe more mechanically distorted a person is, the less energy is available for thinking, metabolism and healing.\n\nThis is why it is so important to have a healthy spine, a proper posture, to exercise, to eat properly - all of it truly matters for your quality of life.\nChiropractors localize the areas of your spine that do not move properly - referred to as vertebral subluxations - and adjust them with a specific high speed, but yet gentle, thrust to improve spinal motion.\n\nWant to learn about some of the ways chiropractic care can help you? 
Keep reading for insight into some of the key benefits of seeing a chiropractor.\n\nThe benefits of chiropractic care are numerous:\n\n1. Lower Blood Pressure\n\nStudies show that chiropractic treatment can lower your blood pressure. Sometimes, this works just as well as a prescription blood pressure medication! This benefit can also last for as long as six months after treatment.\n\nHigh blood pressure can cause an array of serious side effects like nausea, fatigue, dizziness, and anxiety. Sufferers who haven't found relief should consider consulting with a chiropractor. A chiropractic adjustment may be the solution.\n\nSome studies have shown that chiropractic adjustments can also help patients who are suffering from low blood pressure.\n\n2. Reduced Inflammation\n\nIn many cases, joint issues, pain, and tension are caused by inflammation in the body. Chiropractic adjustments can reduce inflammation.\n\nThis leads to relief of muscle tension, chronic back pain, and joint pain. These adjustments can sometimes also slow the progression of inflammation-related diseases, like arthritis.\n\n3. Better Sleep\n\nPatients who receive chiropractic adjustments report a significant improvement in their sleep patterns. If you regularly suffer from insomnia, visiting a chiropractor regularly may help. Also, when you experience pain relief, this will help you get a restful night's sleep.\n\n4. Digestive Relief\n\nChiropractors often give nutritional advice as part of their services. However, this isn't the only way that they provide patients with digestive relief.\n\nAdjusting the thoraco-lumbar spine restores the neurological function of your digestive system. Regular adjustments can help with chronic digestive issues.\n\n5. Stress Release\n\nEveryday life can cause muscle cramping, inflammation, and more. When you're sore from working at a computer, heavy lifting, or just dealing with emotional stress, a chiropractic adjustment can help. 
This leads to greater comfort and advanced pain relief.\n\n6. Improvement of Neurological Conditions\n\nA chiropractic adjustment can also increase blood flow to the brain and increase the flow of cerebral spinal fluid. This means that patients suffering from neurological conditions like epilepsy and multiple sclerosis can significantly benefit from regular adjustments.\n\nThis is a relatively new area of study, but the potential is huge. Those suffering from these conditions will want to do some research. It's important to find the best chiropractor in their area with experience dealing with these specific types of cases.\n\n7. Chiropractic care can improve communication from your brain to your muscles\n\nResearch seems to show that chiropractic care can improve your brain-body communication, helping your brain to be more aware of what is going on in the body so it can control your body better.\n\nBetter health, more energy and vitality are some of the positive effects of getting your spine adjusted. It sets your vertebrae back into motion freeing up the energy that travels through your nerves.\n\nChiropractic care is a partnership. The results patients want is a combination of what the chiropractor does and what the patient does.\n\nThere are many good things that can be changed and improved for a better lifestyle: exercise, good nutrition, good mental attitude and spinal adjustments.\n\nYour whole body will work better by having your nervous system free of interference. That is the essence of chiropractic care and is designed for you and your family.\n\n8. Pain Relief\n\nPerhaps the most well-known benefit of going to a chiropractor is pain relief. Adjustments can help with a huge array of painful conditions including the following.\n\nNeck and Lower Back Pain\n\nAdjustments are the most effective non-invasive pain relief method for this type of pain. 
They may help patients avoid having to take prescription pain management drugs.\n\nSciatica\n\nTreatments help relieve pressure on the nerve. This results in less severe pain that lasts for a fewer number of days.\n\nHeadaches\n\nChiropractic adjustments help headaches and migraines. They do this by treating back misalignment, muscle tension, and stress. Cervical spine manipulation was associated with significant improvement in headache outcomes in trials involving patients with neck pain and/or neck dysfunction and headache.\n\nChronic headaches can result from the abnormal positioning of the head and can be worsened from neck pressure and movement. Chiropractic removes the interference whether it may be from the distant muscle tightness in the back causing strain on your spine or an abnormal lordotic cervical curve and moving vertebrae.\nChiropractic care can reduce the duration of headaches, lower their intensity when they do occur and limit the frequency of their occurrence all together.\n\nMenstrual cramps\n\nChiropractic treatment removes tension from the pelvis and sacrum. It also regulates the neurological function communicating with the reproductive organs. Adjustments can also relieve the bloating, cramping, and pain associated with menstrual cramps.\n\nAnyone who has tried traditional medical treatments and has been unable to find pain relief should experiment with chiropractic care. More often than not, you'll be pleasantly surprised!\n\nBonus: Advanced Chiropractic Relief\n\nIn addition to the benefits listed above, adjustments can bring advanced chiropractic relief for a wide variety of other conditions as well as overall life improvement. 
A few examples include:\n\nScoliosis - adjustments have shown to help with the pain, reduced range of motion, abnormal posture, and even difficulty breathing caused by this abnormal curvature of the spine\n\nVertigo - 
an adjustment can help realign and balance the spine, thereby reducing the dizziness, nausea, and disorientation caused by vertigo\n\nSinus and allergy relief - adjusting the upper cervical spine can help drain the sinuses and provide immediate and lasting relief from both long-term and seasonal allergies\n\nExpectant mothers - women can experience relief from pain and morning sickness and are better able to maintain proper posture during and after pregnancy\n\nChildren's issues - treatments have been shown to help children with acid reflux, colic, and ear infections\nAthletic performance - the reduction in pain and inflammation is particularly beneficial for professional and amateur athletes\n\nStimulates the immune system - chiropractic care helps to boost the immune system, speeding up the healing process following illnesses or injuries. One of the most important studies showing the positive effect chiropractic care can have on the immune system and general health was performed by Ronald Pero, Ph.D., chief of cancer prevention research at New York's Preventive Medicine Institute and professor of medicine at New York University. Dr. Pero measured the immune systems of people under chiropractic care as compared to those in the general population and those with cancer and other serious diseases.\n\nIn his initial three-year study of 107 individuals who had been under chiropractic care for five years or more, the chiropractic patients were found to have a 200% greater immune competence than people who had not received chiropractic care, and 400% greater immune competence than people with cancer and other serious diseases. The immune system superiority of those under chiropractic care did not diminish with age.\n\nDr. Pero stated: "When applied in a clinical framework, I have never seen a group other than this chiropractic group to experience a 200% increase over the normal patients. This is why it is so dramatically important. 
We have never seen such a positive improvement in a group."\n\nAs you can see, there are almost limitless benefits to seeking chiropractic treatment. If you haven't tried it yet, what are you waiting for?\n\nThere's no need to accept pain and discomfort as a normal part of life. You have nothing to lose and everything to gain, so it only makes sense to find out more about this possibly life-changing approach to improving your health and wellness.
## 5 Heading to the spa can be a pampering treat, but it can also be a huge boost to your health and wellness! Massage therapy can relieve all sorts of ailments - from physical pain, to stress and anxiety. People who choose to supplement their healthcare regimen with regular massages will not only enjoy a relaxing hour or two at the spa, but they will see the benefits carry through the days and weeks after the appointment!\n\nThese are the 10 most common benefits reported from massage therapy:\n\n1. 
Reduce Stress\n\nA relaxing day at the spa is a great way to unwind and de-stress. However, clients are sure to notice themselves feeling relaxed and at ease for days and even weeks after their appointments!\n\n2. Improve Circulation\n\nLoosening muscles and tendons allows increased blood flow throughout the body. Improving your circulation can have a number of positive effects on the rest of your body, including reduced fatigue and pain management!\n\n3. Reduce Pain\n\nMassage therapy is great for working out problem areas like lower back pain and chronic stiffness. A professional therapist will be able to accurately target the source of your pain and help achieve the perfect massage regimen.\n\n4. Eliminate Toxins\n\nStimulating the soft tissues of your body will help to release toxins through your blood and lymphatic systems.\n\n5. Improve Flexibility\n\nMassage therapy will loosen and relax your muscles, helping your body to achieve its full range of movement potential.\n\n6. Improve Sleep\n\nA massage will encourage relaxation and boost your mood. Going to bed with relaxed and loosened muscles promotes more restful sleep, and you'll feel less tired in the morning!\n\n7. Enhance Immunity\n\nStimulation of the lymph nodes re-charges the body's natural defense system.\n\n8. Reduce Fatigue\n\nMassage therapy is known to boost mood and promote better quality sleep, thus making you feel more rested and less worn-out at the end of the day.\n\n9. Alleviate Depression and Anxiety\n\nMassage therapy can help to release endorphins in your body, helping you to feel happy, energized, and at ease.\n\n10. Reduce post-surgery and post-injury swelling\n\nA professional massage is a great way to safely deal with a sports injury or post-surgery rehabilitation.\n\nDo you think that massage therapy could help you find relief in any of these areas? 
What improvements would you like to see in your health? Contact us today with your questions about massage therapy and see how we can help you get on the path to improved health and wellness!
##                                                                                                                                       Source
## 1 https://coremedicalohio.com/benefits-of-long-term-chiropractic-care/?utm_source=ReviveOldPost&utm_medium=social&utm_campaign=ReviveOldPost
## 2                                   https://healthpayerintelligence.com/news/unitedhealthcare-combats-opioid-crisis-with-non-opioid-benefits
## 3                                                                     https://www.verywellhealth.com/is-chiropractic-adjustment-safe-4588279
## 4                                           https://hafkeychiropractic.com/advanced-chiropractic-relief-8-key-benefits-of-chiropractor-care/
## 5                                                                                  https://www.urbannirvana.com/10-benefits-massage-therapy/
##                   Topic InternetSearch
## 1 chiropractic benefits           <NA>
## 2 chiropractic benefits           <NA>
## 3 chiropractic benefits           <NA>
## 4 chiropractic benefits           <NA>
## 5      massage benefits         google
##                                                                                                                                                                                               Contraindications
## 1 Doctors of Chiropractic work collaboratively with other healthcare professionals. Should your condition require the attention of another healthcare profession, that recommendation or referral will be made.
## 2                                                                                                                                                                                                          <NA>
## 3                                                                                                                                                                                                          <NA>
## 4                                                                                                                                                                                                          <NA>
## 5                                                                                                                                                                                                          <NA>
##   risksAdverseEffects
## 1                <NA>
## 2                <NA>
## 3 Risks and side effects associated with chiropractic adjustments may include:\n\n    temporary headaches\n    fatigue after treatment\n    discomfort in parts of the body that were treated\n\nRare but serious risks associated with chiropractic adjustment include:\n\n    stroke\n    cauda equina syndrome, a condition involving pinched nerves in the lower part of the spinal canal\n    worsening of herniated disks (although research isn't conclusive)\n\nIn addition to effectiveness, research has focused on the safety of chiropractic treatments, mainly spinal manipulation. \n\nOne 2017 review of 250 articles looked at serious adverse events and benign events associated with chiropractic care. Based on the evidence the researchers reviewed, serious adverse events accounted for one out of every two million spinal manipulations to 13 per 10,00 patients. Serious adverse events included spinal or neurological problems and cervical arterial strokes (dissection of any of the arteries in the neck).\n\nBenign events were more common and included more pain and higher levels of neck problems, but most were short-term problems.\n\nThe researchers confirmed serious adverse events were rare and often related to other preexisting conditions, while benign events are more common. However, the reasons for any types of adverse events are unknown.\n\nA second 2017 review looked 118 articles and found frequently described adverse events include stroke, headache and vertebral artery dissection (cervical arterial stroke). Forty-six percent of the reviews determined that spinal manipulation was safe, while 13% expressed concern of harm. The remaining studies were unclear or neutral. While the researchers did not offer an overall conclusion, they determined spinal manipulation can significantly be helpful, and some risk does exist.\nA Word From Verywell   When chiropractors are correctly trained and licensed, chiropractic care is safe. 
Mild side effects are to be expected and include temporary soreness, stiffness, and tenderness in the treated area. However, you still want to do your research. Ask for a referral from your doctor. Look at the chiropractor?s website, including patient reviews. Meet with the chiropractor to discuss his or her treatment practices and ask about possible adverse effects related to treatment.\n\nIf you decide a chiropractor isn?t for you, consider seeing an osteopathic doctor. Osteopaths are fully licensed doctors who can practice all areas of medicine. They have received special training on the musculoskeletal system, which includes manual readjustments, myofascial release and other physical manipulation of bones and muscle tissues.
## 4                <NA>
## 5                <NA>
def count_punct(text):
    # Percentage of non-space characters that are punctuation marks
    count = sum(1 for char in text if char in string.punctuation)
    return round(count / (len(text) - text.count(" ")), 3) * 100

# Document length, not counting spaces
data['body_length'] = data['Document'].apply(lambda x: len(x) - x.count(" "))
data['punct%'] = data['Document'].apply(count_punct)

def clean_text(text):
    # Lowercase, strip punctuation, split on non-word characters,
    # then drop stopwords and stem each remaining token
    text = "".join([char.lower() for char in text if char not in string.punctuation])
    tokens = re.split(r'\W+', text)  # raw string avoids the invalid-escape warning for \W
    return [ps.stem(word) for word in tokens if word not in stopwords]  # token list for the count vectorizer

def lemmatize(text):
    # Same cleaning as clean_text, but lemmatizes instead of stemming
    text = "".join([char.lower() for char in text if char not in string.punctuation])
    tokens = re.split(r'\W+', text)
    return [wn.lemmatize(word) for word in tokens if word not in stopwords]  # token list for the count vectorizer

data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  [chiropractic, adjustment, treatment, serve, n...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...  [, unitedhealthcare, combat, opioid, crisis, n...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...  [, safety, chiropractic, adjustment, chiroprac...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  [advanced, chiropractic, relief, 8, key, benef...
## 4  Heading to the spa can be a pampering treat, b...  ...  [heading, spa, pampering, treat, also, huge, b...
## 
## [5 rows x 10 columns]
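The Lemmatized column above starts with an empty token in rows 1 and 2, because those documents begin with a newline or space and the regex split returns an empty string in front of the first word. The following is a minimal sketch of the same two steps (punctuation percentage and tokenization) on a single sample sentence, with the empty tokens filtered out. The names punct_percent, tokenize, and STOPWORDS are hypothetical and are not part of the script; STOPWORDS is a tiny stand-in for the nltk stopword list, and no stemming or lemmatizing is applied, so the sketch runs without nltk.

```python
import re
import string

# Hypothetical stand-in for the nltk English stopword list used in the script.
STOPWORDS = {"the", "a", "of", "and", "to", "can", "be"}

def punct_percent(text):
    """Percentage of non-space characters that are punctuation."""
    non_space = len(text) - text.count(" ")
    if non_space == 0:  # guard against empty or all-space documents
        return 0.0
    n_punct = sum(1 for char in text if char in string.punctuation)
    return round(n_punct / non_space * 100, 1)

def tokenize(text):
    """Lowercase, strip punctuation, split on non-word runs, drop stopwords."""
    no_punct = "".join(ch.lower() for ch in text if ch not in string.punctuation)
    tokens = re.split(r"\W+", no_punct)
    # Dropping empty strings removes the leading '' token that re.split
    # returns when the text starts with whitespace or a newline.
    return [tok for tok in tokens if tok and tok not in STOPWORDS]

sample = "\nHeading to the spa can be a pampering treat!"
print(punct_percent(sample))
print(tokenize(sample))
```

Whether the empty tokens matter depends on the vectorizer settings used later, so this filter is an optional adjustment rather than a required fix.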
data.to_csv('dataCleanLemm.csv')
#DATA = pd.read_csv('dataCleanLemm.csv', encoding='unicode_escape')
DATA <- read.csv('dataCleanLemm.csv', sep=',', header=TRUE, na.strings=c('',' ','NA'), row.names=1)
colnames(DATA)
##  [1] "Document"          "Source"            "Topic"            
##  [4] "InternetSearch"    "Contraindications" "RisksSideEffects" 
##  [7] "body_length"       "punct."            "Cleaned_text"     
## [10] "Lemmatized"
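One thing to keep in mind with this CSV round trip is that the Cleaned_text and Lemmatized columns hold Python lists, and a CSV cell can only hold text, so each list is written out as its string representation. When the file is read back, those cells are plain strings rather than lists. The sketch below illustrates this with the standard library csv module as a stand-in for the pandas to_csv call, and with a hypothetical two-row table; ast.literal_eval is one way to turn the stored string back into a list.

```python
import ast
import csv
import io

# Hypothetical two-row table mimicking the 'Lemmatized' column.
rows = [
    {"Topic": "chiropractic benefits", "Lemmatized": ["chiropractic", "adjustment"]},
    {"Topic": "massage benefits", "Lemmatized": ["heading", "spa"]},
]

# Write to an in-memory CSV; each list is stored as its str() representation.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Topic", "Lemmatized"])
writer.writeheader()
writer.writerows(rows)

# Read it back: every cell is now a plain string.
buffer.seek(0)
loaded = list(csv.DictReader(buffer))
raw_cell = loaded[0]["Lemmatized"]

# ast.literal_eval safely parses the string back into a Python list.
tokens = ast.literal_eval(raw_cell)
print(type(raw_cell).__name__, type(tokens).__name__)
```

If the token lists are needed again on the Python side after reloading the file, a conversion like this would be applied to each cell of the list columns.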
head(DATA,2)
##   Document
## 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  Chiropractic adjustments and treatments serve the needs of millions of people around the world.\n\n \n\nAdjustments offer effective, non-invasive and cost-effective solutions to neck and back pain, as well as a myriad of other medical 
issues.\n\nHave you ever stopped to wonder how many of us suffer from neck and back stiffness or pain?\n\nApart from the obvious discomfort, simple daily tasks such as driving a car, crossing a busy street and picking things up from the floor can become all too challenging for individuals experiencing such pain.\n\nAs anyone who has experienced pain would know, having restricted movement can be debilitating and unfortunately, our busy world doesn?t allow for us to stop.\n\n \nSome of the benefits of long-term chiropractic care include:\n\n    Chiropractors can identify mechanical issues that cause spine-related pain and offer a series of adjustments that provide near immediate relief. Following appointments, patients often report feeling their symptoms noticeably better.\n    When a chiropractor performs an adjustment, they can help restore movement in joints that have ?locked up?. This becomes possible as treatment allows muscles surrounding joints to relax, thereby reducing joint stiffness.\n    Many factors affect health, including exercise patterns, nutrition, sleep, heredity and the environment in which we live. Rather than just treat symptoms of the disease, chiropractic care focuses on a holistic approach to naturally maintain health and resist disease.\n    Chiropractic adjustments help restore normal function and movement to the entire body. Many patients report an improvement in their ability to move with efficiency and strength.\n    Many patients find delight in the results chiropractic adjustments have on old and chronic injuries. Whether an injury is, in fact, new or old, chiropractic care can help reduce pain, restore mobility and provide quick pain relief to all joints in the body. 
Such care can help maintain better overall health and thus faster recovery time.\n\nHave you ever noticed that when you are in pain and unable to perform regular or favorite activities, it can put a strain on emotional and mental well-being?\n\nFor example, the increased stress from not being able to properly perform a paid job. This, in turn, can have a negative impact on physical health with increases in heart rate and blood pressure. The domino effect often continues with sleep becoming disturbed, with resulting lethargy and tiredness during the day. Does anyone really feel up to exercising in this state?\n\nChiropractic care is a natural method of healing the body?s communication system and never relies on the use of pharmaceutical drugs or invasive surgery.\n\n\n\n 
## 1 \nUnitedHealthcare Combats Opioid Crisis with Non-Opioid Benefits\nPhysical therapy and chiropractic care can prevent or reduce expensive, invasive spinal procedures, such as imaging or surgery, to reduce opioid use and cut costs.\nUnitedHealthcare, opioid, physical therapy, healthcare spending\n\n\nOctober 29, 2019 - UnitedHealthcare (UHC) is combatting the opioid epidemic and high healthcare costs with new physical therapy and chiropractic care benefits to prevent, delay, or in some cases substitute for invasive spinal procedures.\n\n?With millions of Americans experiencing low back pain currently or at some point during their lifetimes, we believe this benefit design will help make a meaningful difference by improving health outcomes while reducing costs,? said Anne Docimo, MD, UnitedHealthcare chief medical officer.\n\nLower back pain is in part responsible for sustaining the opioid epidemic and also increases healthcare costs.\nDig Deeper\n\nAlthough opioid overdoses fell by two percent from 2017 to 2018 and a legal battles aim to hold pharmaceutical companies accountable, there is no end in sight for the opioid epidemic. Industry professionals are still grappling with the balance between cutting opioid prescriptions will working to reduce patient pain.\n\nCommon conditions such as low back pain bolster the epidemic?s presence, with clinicians still prescribing the opioids against best practice recommendations. According to a recent OptumLabs study, 9 percent of patients with newly diagnosed low back pain are prescribed opioids and lower back pain currently contributes 52 percent to the overall opioid prescription rate.\n\nIn addition to boosting opioids distribution, alternative, invasive lower back pain treatments can significantly impact healthcare spending.\n\nIt is not new information that physical therapy and chiropractic care are effective, lower cost alternatives to spinal imaging or surgery. 
However, payers are still in the process of adopting the method.\n\nTo counteract the high-cost, high-risk potential of using opioids to treat back pain, UHC created a benefit that does not rely on medication or technology but rather on physical therapy and chiropractic care.\n\nThe benefit allows eligible employers to offer physical therapist and chiropractor visits with no out-of-pocket costs. Members who already receive physical therapist and chiropractic care benefits under UHC?s employer-sponsored health plans and who have maxed out their visits will not receive additional visits under this benefit.\n\nHowever, for those who still have visits to use and who choose physical therapy or chiropractic care over other forms of treatment, the copay or deductible for those visits will be waived and they will receive three visits at no cost.\n\nUHC has high expectations for the fiscal and physical impacts of this benefit.\n\nAccording to UHC?s analysis, the health payer expects that by 2021, opioid use will decrease by 19 percent. Spinal imaging test frequency and spinal surgeries will be reduced by 22 percent and 21 percent, respectively. In addition to these specific goals, UHC hopes to see a decrease in the overall cost of spinal care.\n\nThe same OptumLabs study demonstrated that UHC?s expectations are not without precedent.\n\nThe study looked at the correlation between out-of-pocket costs and patient utilization of noninvasive treatments. Researchers discovered that members whose copay was over $30 were a little under 30 percent less likely to choose physical therapy as opposed to more invasive treatments.\n\nAn American Journal of Managed Care study in June 2019 found that patients with high deductibles, typically over $1,000, were less likely to visit physical therapy.\n\nEligible employers may be brand new or renewing their membership. They must be fully insured and over 51 or more employees strong. 
The benefit is currently available in Connecticut, Florida, Georgia, New York, and North Carolina.\n\nHowever, UHC plans to expand the benefit from 2020 into 2021. By the end of this expansion period, the benefit will also be available to self-funded employers and organizations with an employee population between 2 and 50. The benefit will span ten states, primarily in the southeast.\n\n?This new benefit design may help encourage people with low back pain to get the right care at the right time and in the right setting, helping expand access to evidence-based and more affordable treatments,? said Docimo.
##                                                                                                                                       Source
## 0 https://coremedicalohio.com/benefits-of-long-term-chiropractic-care/?utm_source=ReviveOldPost&utm_medium=social&utm_campaign=ReviveOldPost
## 1                                   https://healthpayerintelligence.com/news/unitedhealthcare-combats-opioid-crisis-with-non-opioid-benefits
##                   Topic InternetSearch
## 0 chiropractic benefits           <NA>
## 1 chiropractic benefits           <NA>
##                                                                                                                                                                                               Contraindications
## 0 Doctors of Chiropractic work collaboratively with other healthcare professionals. Should your condition require the attention of another healthcare profession, that recommendation or referral will be made.
## 1                                                                                                                                                                                                          <NA>
##   RisksSideEffects body_length punct.
## 0             <NA>        2288    2.4
## 1             <NA>        3796    2.4
##   Cleaned_text
## 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    ['chiropract', 'adjust', 'treatment', 'serv', 'need', 'million', 'peopl', 'around', 'world', 'adjust', 'offer', 'effect', 'noninvas', 'costeffect', 'solut', 'neck', 'back', 'pain', 'well', 'myriad', 'medic', 'issu', 'ever', 'stop', 'wonder', 
'mani', 'us', 'suffer', 'neck', 'back', 'stiff', 'pain', 'apart', 'obviou', 'discomfort', 'simpl', 'daili', 'task', 'drive', 'car', 'cross', 'busi', 'street', 'pick', 'thing', 'floor', 'becom', 'challeng', 'individu', 'experienc', 'pain', 'anyon', 'experienc', 'pain', 'would', 'know', 'restrict', 'movement', 'debilit', 'unfortun', 'busi', 'world', 'doesnt', 'allow', 'us', 'stop', 'benefit', 'longterm', 'chiropract', 'care', 'includ', 'chiropractor', 'identifi', 'mechan', 'issu', 'caus', 'spinerel', 'pain', 'offer', 'seri', 'adjust', 'provid', 'near', 'immedi', 'relief', 'follow', 'appoint', 'patient', 'often', 'report', 'feel', 'symptom', 'notic', 'better', 'chiropractor', 'perform', 'adjust', 'help', 'restor', 'movement', 'joint', 'lock', 'becom', 'possibl', 'treatment', 'allow', 'muscl', 'surround', 'joint', 'relax', 'therebi', 'reduc', 'joint', 'stiff', 'mani', 'factor', 'affect', 'health', 'includ', 'exercis', 'pattern', 'nutrit', 'sleep', 'hered', 'environ', 'live', 'rather', 'treat', 'symptom', 'diseas', 'chiropract', 'care', 'focus', 'holist', 'approach', 'natur', 'maintain', 'health', 'resist', 'diseas', 'chiropract', 'adjust', 'help', 'restor', 'normal', 'function', 'movement', 'entir', 'bodi', 'mani', 'patient', 'report', 'improv', 'abil', 'move', 'effici', 'strength', 'mani', 'patient', 'find', 'delight', 'result', 'chiropract', 'adjust', 'old', 'chronic', 'injuri', 'whether', 'injuri', 'fact', 'new', 'old', 'chiropract', 'care', 'help', 'reduc', 'pain', 'restor', 'mobil', 'provid', 'quick', 'pain', 'relief', 'joint', 'bodi', 'care', 'help', 'maintain', 'better', 'overal', 'health', 'thu', 'faster', 'recoveri', 'time', 'ever', 'notic', 'pain', 'unabl', 'perform', 'regular', 'favorit', 'activ', 'put', 'strain', 'emot', 'mental', 'wellb', 'exampl', 'increas', 'stress', 'abl', 'properli', 'perform', 'paid', 'job', 'turn', 'neg', 'impact', 'physic', 'health', 'increas', 'heart', 'rate', 'blood', 'pressur', 'domino', 'effect', 'often', 'continu', 'sleep', 
'becom', 'disturb', 'result', 'lethargi', 'tired', 'day', 'anyon', 'realli', 'feel', 'exercis', 'state', 'chiropract', 'care', 'natur', 'method', 'heal', 'bodi', 'commun', 'system', 'never', 'reli', 'use', 'pharmaceut', 'drug', 'invas', 'surgeri', '']
## 1 ['', 'unitedhealthcar', 'combat', 'opioid', 'crisi', 'nonopioid', 'benefit', 'physic', 'therapi', 'chiropract', 'care', 'prevent', 'reduc', 'expens', 'invas', 'spinal', 'procedur', 'imag', 'surgeri', 'reduc', 'opioid', 'use', 'cut', 'cost', 'unitedhealthcar', 'opioid', 'physic', 'therapi', 'healthcar', 'spend', 'octob', '29', '2019', 'unitedhealthcar', 'uhc', 'combat', 'opioid', 'epidem', 'high', 'healthcar', 'cost', 'new', 'physic', 'therapi', 'chiropract', 'care', 'benefit', 'prevent', 'delay', 'case', 'substitut', 'invas', 'spinal', 'procedur', 'million', 'american', 'experienc', 'low', 'back', 'pain', 'current', 'point', 'lifetim', 'believ', 'benefit', 'design', 'help', 'make', 'meaning', 'differ', 'improv', 'health', 'outcom', 'reduc', 'cost', 'said', 'ann', 'docimo', 'md', 'unitedhealthcar', 'chief', 'medic', 'offic', 'lower', 'back', 'pain', 'part', 'respons', 'sustain', 'opioid', 'epidem', 'also', 'increas', 'healthcar', 'cost', 'dig', 'deeper', 'although', 'opioid', 'overdos', 'fell', 'two', 'percent', '2017', '2018', 'legal', 'battl', 'aim', 'hold', 'pharmaceut', 'compani', 'account', 'end', 'sight', 'opioid', 'epidem', 'industri', 'profession', 'still', 'grappl', 'balanc', 'cut', 'opioid', 'prescript', 'work', 'reduc', 'patient', 'pain', 'common', 'condit', 'low', 'back', 'pain', 'bolster', 'epidem', 'presenc', 'clinician', 'still', 'prescrib', 'opioid', 'best', 'practic', 'recommend', 'accord', 'recent', 'optumlab', 'studi', '9', 'percent', 'patient', 'newli', 'diagnos', 'low', 'back', 'pain', 'prescrib', 'opioid', 'lower', 'back', 'pain', 'current', 'contribut', '52', 'percent', 'overal', 'opioid', 'prescript', 'rate', 'addit', 'boost', 'opioid', 'distribut', 'altern', 'invas', 'lower', 'back', 'pain', 'treatment', 'significantli', 'impact', 'healthcar', 'spend', 'new', 'inform', 'physic', 'therapi', 'chiropract', 'care', 'effect', 'lower', 'cost', 'altern', 'spinal', 'imag', 'surgeri', 'howev', 'payer', 'still', 'process', 'adopt', 'method', 
'counteract', 'highcost', 'highrisk', 'potenti', 'use', 'opioid', 'treat', 'back', 'pain', 'uhc', 'creat', 'benefit', 'reli', 'medic', 'technolog', 'rather', 'physic', 'therapi', 'chiropract', 'care', 'benefit', 'allow', 'elig', 'employ', 'offer', 'physic', 'therapist', 'chiropractor', 'visit', 'outofpocket', 'cost', 'member', 'alreadi', 'receiv', 'physic', 'therapist', 'chiropract', 'care', 'benefit', 'uhc', 'employersponsor', 'health', 'plan', 'max', 'visit', 'receiv', 'addit', 'visit', 'benefit', 'howev', 'still', 'visit', 'use', 'choos', 'physic', 'therapi', 'chiropract', 'care', 'form', 'treatment', 'copay', 'deduct', 'visit', 'waiv', 'receiv', 'three', 'visit', 'cost', 'uhc', 'high', 'expect', 'fiscal', 'physic', 'impact', 'benefit', 'accord', 'uhc', 'analysi', 'health', 'payer', 'expect', '2021', 'opioid', 'use', 'decreas', '19', 'percent', 'spinal', 'imag', 'test', 'frequenc', 'spinal', 'surgeri', 'reduc', '22', 'percent', '21', 'percent', 'respect', 'addit', 'specif', 'goal', 'uhc', 'hope', 'see', 'decreas', 'overal', 'cost', 'spinal', 'care', 'optumlab', 'studi', 'demonstr', 'uhc', 'expect', 'without', 'preced', 'studi', 'look', 'correl', 'outofpocket', 'cost', 'patient', 'util', 'noninvas', 'treatment', 'research', 'discov', 'member', 'whose', 'copay', '30', 'littl', '30', 'percent', 'less', 'like', 'choos', 'physic', 'therapi', 'oppos', 'invas', 'treatment', 'american', 'journal', 'manag', 'care', 'studi', 'june', '2019', 'found', 'patient', 'high', 'deduct', 'typic', '1000', 'less', 'like', 'visit', 'physic', 'therapi', 'elig', 'employ', 'may', 'brand', 'new', 'renew', 'membership', 'must', 'fulli', 'insur', '51', 'employe', 'strong', 'benefit', 'current', 'avail', 'connecticut', 'florida', 'georgia', 'new', 'york', 'north', 'carolina', 'howev', 'uhc', 'plan', 'expand', 'benefit', '2020', '2021', 'end', 'expans', 'period', 'benefit', 'also', 'avail', 'selffund', 'employ', 'organ', 'employe', 'popul', '2', '50', 'benefit', 'span', 'ten', 'state', 
'primarili', 'southeast', 'new', 'benefit', 'design', 'may', 'help', 'encourag', 'peopl', 'low', 'back', 'pain', 'get', 'right', 'care', 'right', 'time', 'right', 'set', 'help', 'expand', 'access', 'evidencebas', 'afford', 'treatment', 'said', 'docimo']
##   Lemmatized
## 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                ['chiropractic', 'adjustment', 'treatment', 'serve', 'need', 'million', 'people', 'around', 'world', 
'adjustment', 'offer', 'effective', 'noninvasive', 'costeffective', 'solution', 'neck', 'back', 'pain', 'well', 'myriad', 'medical', 'issue', 'ever', 'stopped', 'wonder', 'many', 'u', 'suffer', 'neck', 'back', 'stiffness', 'pain', 'apart', 'obvious', 'discomfort', 'simple', 'daily', 'task', 'driving', 'car', 'crossing', 'busy', 'street', 'picking', 'thing', 'floor', 'become', 'challenging', 'individual', 'experiencing', 'pain', 'anyone', 'experienced', 'pain', 'would', 'know', 'restricted', 'movement', 'debilitating', 'unfortunately', 'busy', 'world', 'doesnt', 'allow', 'u', 'stop', 'benefit', 'longterm', 'chiropractic', 'care', 'include', 'chiropractor', 'identify', 'mechanical', 'issue', 'cause', 'spinerelated', 'pain', 'offer', 'series', 'adjustment', 'provide', 'near', 'immediate', 'relief', 'following', 'appointment', 'patient', 'often', 'report', 'feeling', 'symptom', 'noticeably', 'better', 'chiropractor', 'performs', 'adjustment', 'help', 'restore', 'movement', 'joint', 'locked', 'becomes', 'possible', 'treatment', 'allows', 'muscle', 'surrounding', 'joint', 'relax', 'thereby', 'reducing', 'joint', 'stiffness', 'many', 'factor', 'affect', 'health', 'including', 'exercise', 'pattern', 'nutrition', 'sleep', 'heredity', 'environment', 'live', 'rather', 'treat', 'symptom', 'disease', 'chiropractic', 'care', 'focus', 'holistic', 'approach', 'naturally', 'maintain', 'health', 'resist', 'disease', 'chiropractic', 'adjustment', 'help', 'restore', 'normal', 'function', 'movement', 'entire', 'body', 'many', 'patient', 'report', 'improvement', 'ability', 'move', 'efficiency', 'strength', 'many', 'patient', 'find', 'delight', 'result', 'chiropractic', 'adjustment', 'old', 'chronic', 'injury', 'whether', 'injury', 'fact', 'new', 'old', 'chiropractic', 'care', 'help', 'reduce', 'pain', 'restore', 'mobility', 'provide', 'quick', 'pain', 'relief', 'joint', 'body', 'care', 'help', 'maintain', 'better', 'overall', 'health', 'thus', 'faster', 'recovery', 'time', 'ever', 
'noticed', 'pain', 'unable', 'perform', 'regular', 'favorite', 'activity', 'put', 'strain', 'emotional', 'mental', 'wellbeing', 'example', 'increased', 'stress', 'able', 'properly', 'perform', 'paid', 'job', 'turn', 'negative', 'impact', 'physical', 'health', 'increase', 'heart', 'rate', 'blood', 'pressure', 'domino', 'effect', 'often', 'continues', 'sleep', 'becoming', 'disturbed', 'resulting', 'lethargy', 'tiredness', 'day', 'anyone', 'really', 'feel', 'exercising', 'state', 'chiropractic', 'care', 'natural', 'method', 'healing', 'body', 'communication', 'system', 'never', 'relies', 'use', 'pharmaceutical', 'drug', 'invasive', 'surgery', '']
## 1 ['', 'unitedhealthcare', 'combat', 'opioid', 'crisis', 'nonopioid', 'benefit', 'physical', 'therapy', 'chiropractic', 'care', 'prevent', 'reduce', 'expensive', 'invasive', 'spinal', 'procedure', 'imaging', 'surgery', 'reduce', 'opioid', 'use', 'cut', 'cost', 'unitedhealthcare', 'opioid', 'physical', 'therapy', 'healthcare', 'spending', 'october', '29', '2019', 'unitedhealthcare', 'uhc', 'combatting', 'opioid', 'epidemic', 'high', 'healthcare', 'cost', 'new', 'physical', 'therapy', 'chiropractic', 'care', 'benefit', 'prevent', 'delay', 'case', 'substitute', 'invasive', 'spinal', 'procedure', 'million', 'american', 'experiencing', 'low', 'back', 'pain', 'currently', 'point', 'lifetime', 'believe', 'benefit', 'design', 'help', 'make', 'meaningful', 'difference', 'improving', 'health', 'outcome', 'reducing', 'cost', 'said', 'anne', 'docimo', 'md', 'unitedhealthcare', 'chief', 'medical', 'officer', 'lower', 'back', 'pain', 'part', 'responsible', 'sustaining', 'opioid', 'epidemic', 'also', 'increase', 'healthcare', 'cost', 'dig', 'deeper', 'although', 'opioid', 'overdoses', 'fell', 'two', 'percent', '2017', '2018', 'legal', 'battle', 'aim', 'hold', 'pharmaceutical', 'company', 'accountable', 'end', 'sight', 'opioid', 'epidemic', 'industry', 'professional', 'still', 'grappling', 'balance', 'cutting', 'opioid', 'prescription', 'working', 'reduce', 'patient', 'pain', 'common', 'condition', 'low', 'back', 'pain', 'bolster', 'epidemic', 'presence', 'clinician', 'still', 'prescribing', 'opioids', 'best', 'practice', 'recommendation', 'according', 'recent', 'optumlabs', 'study', '9', 'percent', 'patient', 'newly', 'diagnosed', 'low', 'back', 'pain', 'prescribed', 'opioids', 'lower', 'back', 'pain', 'currently', 'contributes', '52', 'percent', 'overall', 'opioid', 'prescription', 'rate', 'addition', 'boosting', 'opioids', 'distribution', 'alternative', 'invasive', 'lower', 'back', 'pain', 'treatment', 'significantly', 'impact', 'healthcare', 'spending', 'new', 
'information', 'physical', 'therapy', 'chiropractic', 'care', 'effective', 'lower', 'cost', 'alternative', 'spinal', 'imaging', 'surgery', 'however', 'payer', 'still', 'process', 'adopting', 'method', 'counteract', 'highcost', 'highrisk', 'potential', 'using', 'opioids', 'treat', 'back', 'pain', 'uhc', 'created', 'benefit', 'rely', 'medication', 'technology', 'rather', 'physical', 'therapy', 'chiropractic', 'care', 'benefit', 'allows', 'eligible', 'employer', 'offer', 'physical', 'therapist', 'chiropractor', 'visit', 'outofpocket', 'cost', 'member', 'already', 'receive', 'physical', 'therapist', 'chiropractic', 'care', 'benefit', 'uhcs', 'employersponsored', 'health', 'plan', 'maxed', 'visit', 'receive', 'additional', 'visit', 'benefit', 'however', 'still', 'visit', 'use', 'choose', 'physical', 'therapy', 'chiropractic', 'care', 'form', 'treatment', 'copay', 'deductible', 'visit', 'waived', 'receive', 'three', 'visit', 'cost', 'uhc', 'high', 'expectation', 'fiscal', 'physical', 'impact', 'benefit', 'according', 'uhcs', 'analysis', 'health', 'payer', 'expects', '2021', 'opioid', 'use', 'decrease', '19', 'percent', 'spinal', 'imaging', 'test', 'frequency', 'spinal', 'surgery', 'reduced', '22', 'percent', '21', 'percent', 'respectively', 'addition', 'specific', 'goal', 'uhc', 'hope', 'see', 'decrease', 'overall', 'cost', 'spinal', 'care', 'optumlabs', 'study', 'demonstrated', 'uhcs', 'expectation', 'without', 'precedent', 'study', 'looked', 'correlation', 'outofpocket', 'cost', 'patient', 'utilization', 'noninvasive', 'treatment', 'researcher', 'discovered', 'member', 'whose', 'copay', '30', 'little', '30', 'percent', 'le', 'likely', 'choose', 'physical', 'therapy', 'opposed', 'invasive', 'treatment', 'american', 'journal', 'managed', 'care', 'study', 'june', '2019', 'found', 'patient', 'high', 'deductible', 'typically', '1000', 'le', 'likely', 'visit', 'physical', 'therapy', 'eligible', 'employer', 'may', 'brand', 'new', 'renewing', 'membership', 'must', 'fully', 
'insured', '51', 'employee', 'strong', 'benefit', 'currently', 'available', 'connecticut', 'florida', 'georgia', 'new', 'york', 'north', 'carolina', 'however', 'uhc', 'plan', 'expand', 'benefit', '2020', '2021', 'end', 'expansion', 'period', 'benefit', 'also', 'available', 'selffunded', 'employer', 'organization', 'employee', 'population', '2', '50', 'benefit', 'span', 'ten', 'state', 'primarily', 'southeast', 'new', 'benefit', 'design', 'may', 'help', 'encourage', 'people', 'low', 'back', 'pain', 'get', 'right', 'care', 'right', 'time', 'right', 'setting', 'helping', 'expand', 'access', 'evidencebased', 'affordable', 'treatment', 'said', 'docimo']
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.20)
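The split above is unseeded, so the accuracy reported later will change from run to run, and with only a few documents per topic a class can be missing from the test set entirely. A minimal sketch of a seeded, stratified split follows. The toy `docs` frame, the `split` helper, and the seed value 42 are illustrative assumptions, not part of the original script.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the real `data` frame (hypothetical rows, two topics)
docs = pd.DataFrame({
    'Document': ['doc %d' % i for i in range(20)],
    'Topic': ['massage benefits'] * 10 + ['chiropractic benefits'] * 10,
})

def split(frame, seed=42):
    # stratify keeps the topic proportions equal in both splits;
    # random_state makes the shuffle repeatable
    return train_test_split(frame[['Document']], frame['Topic'],
                            test_size=0.20, stratify=frame['Topic'],
                            random_state=seed)

X_tr, X_te, y_tr, y_te = split(docs)
print(y_te.value_counts())
```

Stratification needs at least two documents per topic, so it may not be usable on the full dataset if any topic has a single document.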
from sklearn.feature_extraction.text import CountVectorizer
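# Build the vocabulary from the training documents only, using the custom lemmatize analyzer
count_vect=CountVectorizer(analyzer=lemmatize)
count_vect_fit=count_vect.fit(X_train['Document'])

# Apply the fitted vocabulary to both splits; words unseen in training are ignored in the test set
count_train=count_vect_fit.transform(X_train['Document'])
count_test=count_vect_fit.transform(X_test['Document'])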
len(count_vect_fit.get_feature_names())
## 4114
count_vect_fit.get_feature_names()[200:350]
## ['affiliate', 'affirm', 'affordable', 'afraid', 'afterhours', 'afterward', 'afterwards', 'age', 'agency', 'agephysical', 'agerelated', 'aggression', 'aggressive', 'aggressively', 'aging', 'ago', 'agree', 'ahead', 'aid', 'aiding', 'ailment', 'aim', 'air', 'airport', 'alan', 'alchemist', 'alcohol', 'alcoholsoaked', 'alert', 'align', 'aligning', 'alignment', 'alike', 'allergic', 'allergy', 'alleviate', 'alleviates', 'alleviating', 'alleviation', 'alliance', 'allison', 'allow', 'allowing', 'allows', 'allround', 'almost', 'alone', 'along', 'alongside', 'already', 'alright', 'also', 'alter', 'altered', 'alternate', 'alternating', 'alternative', 'although', 'altogether', 'always', 'alzheimers', 'amateur', 'amazing', 'ambulance', 'ameliorating', 'america', 'american', 'among', 'amount', 'amplitude', 'analgesic', 'analysis', 'analyzed', 'anatomy', 'ancient', 'andor', 'anecdotal', 'anemia', 'anesthesia', 'anesthetic', 'anger', 'angions', 'anhedonia', 'animal', 'aniston', 'ankle', 'ann', 'annals', 'anne', 'announced', 'announcing', 'annual', 'annually', 'anosognosia', 'another', 'answer', 'answered', 'antibiotic', 'anticipate', 'antidepressant', 'antiinflammatory', 'antiviral', 'anxiety', 'anxietydepression', 'anyone', 'anything', 'anywhere', 'apart', 'appealing', 'appearance', 'appeared', 'appears', 'appendix', 'appetite', 'appliance', 'applicable', 'application', 'applied', 'apply', 'applying', 'appointment', 'approach', 'approached', 'appropriate', 'approved', 'approximately', 'apta', 'area', 'areaswere', 'arent', 'argue', 'arise', 'arises', 'arising', 'arizona', 'arm', 'armpit', 'aromatherapy', 'around', 'arquette', 'array', 'arrive', 'art', 'artery', 'arthritic', 'arthritis', 'article', 'ascertain', 'ashi', 'aside']
count_train_vect=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(count_train.toarray())],axis=1)

count_test_vect=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(count_test.toarray())],axis=1)
count_train_vect.head()
##                                             Document  body_length  ...  4112 4113
## 0  Cold Stone Therapy for Migraine Headaches\n\nP...         5275  ...     0    0
## 1  About Lymphatic Massage (MLD Manual Lymphatic ...         1695  ...     0    0
## 2  \nFinding Help: When to Get It and Where to Go...         6862  ...     0    0
## 3  \nFive Warning Signs of Mental Illness\n\n\nIt...         6759  ...     0    0
## 4  What is Hot & Cold Stone Therapy?\n\nHot & Col...         2781  ...     0    0
## 
## [5 rows x 4119 columns]
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
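# Random forest on the count-vectorized tokens; n_jobs=-1 uses all available cores
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
# Time the fit on the training document-term matrix
start=time.time()
rf_model=rf.fit(count_train,y_train)
end=time.time()
fit_time=(end-start)
# Time the prediction on the held-out test matrix
start=time.time()
y_pred=rf_model.predict(count_test)
end=time.time()
pred_time=(end-start)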
# Put predictions next to the true labels, aligned on the test-set index
prd = pd.DataFrame({'Predicted': y_pred}, index=y_test.index)
pred = pd.concat([prd, y_test], axis=1)
print(pred)
##                           Predicted                            Topic
## 14                 massage benefits                 massage benefits
## 29  mental health services benefits  mental health services benefits
## 36                 massage benefits                 cupping benefits
## 19                 massage benefits        physical therapy benefits
## 26        physical therapy benefits  mental health services benefits
## 66       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 9              massage gun benefits                 massage benefits
## 56              cold stone benefits              cold stone benefits
## 70       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 69       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 68       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 24                 massage benefits  mental health services benefits
## 72             dry brushing massage             dry brushing massage
## 80                               ER                               ER
## 10                 massage benefits                 massage benefits
## 12                 massage benefits                 massage benefits
## 53              cold stone benefits              cold stone benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.7058823529411765
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 0 0 0 0 0]
##  [0 4 0 0 0 0 0 0 0]
##  [0 0 2 0 0 0 0 0 0]
##  [0 0 0 0 0 1 0 0 0]
##  [0 0 0 0 1 0 0 0 0]
##  [0 0 0 0 0 3 1 0 0]
##  [0 0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 1 0 1 1]
##  [0 0 0 0 0 1 0 0 0]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##                              ER       1.00      1.00      1.00         1
##      Lymphatic Drainage Massage       1.00      1.00      1.00         4
##             cold stone benefits       1.00      1.00      1.00         2
##                cupping benefits       0.00      0.00      0.00         1
##            dry brushing massage       1.00      1.00      1.00         1
##                massage benefits       0.50      0.75      0.60         4
##            massage gun benefits       0.00      0.00      0.00         0
## mental health services benefits       1.00      0.33      0.50         3
##       physical therapy benefits       0.00      0.00      0.00         1
## 
##                        accuracy                           0.71        17
##                       macro avg       0.61      0.56      0.57        17
##                    weighted avg       0.76      0.71      0.70        17
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1439: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
##   'recall', 'true', average, warn_for)
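
The per-class figures in the report above follow directly from the confusion matrix. As a minimal sketch in plain Python (no scikit-learn; the helper name `per_class_metrics` is made up for illustration), the matrix printed above can be reduced to precision and recall per class, with undefined ratios set to 0.0 in the way the warnings describe:

```python
# Confusion matrix copied from the count-vectorized RFC output above
# (rows = expected class, columns = predicted class).
labels = [
    "ER", "Lymphatic Drainage Massage", "cold stone benefits",
    "cupping benefits", "dry brushing massage", "massage benefits",
    "massage gun benefits", "mental health services benefits",
    "physical therapy benefits",
]
cm = [
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 4, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 2, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 3, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 0, 0, 0],
]

def per_class_metrics(cm):
    """Hypothetical helper: return (precision, recall) lists, one entry per class."""
    n = len(cm)
    precision, recall = [], []
    for k in range(n):
        true_pos = cm[k][k]
        predicted_k = sum(cm[row][k] for row in range(n))  # column total
        actual_k = sum(cm[k])                              # row total (support)
        # An empty column or row makes the ratio undefined; report 0.0 instead
        precision.append(true_pos / predicted_k if predicted_k else 0.0)
        recall.append(true_pos / actual_k if actual_k else 0.0)
    return precision, recall

precision, recall = per_class_metrics(cm)
accuracy = sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))

for name, p, r in zip(labels, precision, recall):
    print(f"{name:>32}  precision={p:.2f}  recall={r:.2f}")
print(f"accuracy={accuracy:.4f}")
```

For `massage benefits` this gives 3/6 = 0.50 precision and 3/4 = 0.75 recall, matching the report. `massage gun benefits` has no true samples in this split, which is why its recall is ill-defined.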
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(count_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(count_test)
end=time.time()
pred_time=(end-start)
# Put predictions next to the true labels, aligned on the test-set index
prd = pd.DataFrame({'Predicted': y_pred}, index=y_test.index)
pred = pd.concat([prd, y_test], axis=1)
print(pred)
##                           Predicted                            Topic
## 14                 cupping benefits                 massage benefits
## 29  mental health services benefits  mental health services benefits
## 36                 cupping benefits                 cupping benefits
## 19                 cupping benefits        physical therapy benefits
## 26  mental health services benefits  mental health services benefits
## 66       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 9                  cupping benefits                 massage benefits
## 56              cold stone benefits              cold stone benefits
## 70       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 69       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 68       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 24  mental health services benefits  mental health services benefits
## 72                 massage benefits             dry brushing massage
## 80                               ER                               ER
## 10                 massage benefits                 massage benefits
## 12                 cupping benefits                 massage benefits
## 53              cold stone benefits              cold stone benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.7058823529411765
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 0 0 0 0]
##  [0 4 0 0 0 0 0 0]
##  [0 0 2 0 0 0 0 0]
##  [0 0 0 1 0 0 0 0]
##  [0 0 0 0 0 1 0 0]
##  [0 0 0 3 0 1 0 0]
##  [0 0 0 0 0 0 3 0]
##  [0 0 0 1 0 0 0 0]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##                              ER       1.00      1.00      1.00         1
##      Lymphatic Drainage Massage       1.00      1.00      1.00         4
##             cold stone benefits       1.00      1.00      1.00         2
##                cupping benefits       0.20      1.00      0.33         1
##            dry brushing massage       0.00      0.00      0.00         1
##                massage benefits       0.50      0.25      0.33         4
## mental health services benefits       1.00      1.00      1.00         3
##       physical therapy benefits       0.00      0.00      0.00         1
## 
##                        accuracy                           0.71        17
##                       macro avg       0.59      0.66      0.58        17
##                    weighted avg       0.72      0.71      0.69        17
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)
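
The `macro avg` and `weighted avg` rows of the report summarize the per-class values in two different ways. A short worked example, using the precision and support columns copied from the GBC report above (the variable names are chosen here for illustration):

```python
# Per-class precision and support copied from the count-vectorized GBC report above.
precision = [1.00, 1.00, 1.00, 0.20, 0.00, 0.50, 1.00, 0.00]
support   = [1,    4,    2,    1,    1,    4,    3,    1]

# Macro average: every class counts equally, however few documents it has.
macro_avg = sum(precision) / len(precision)

# Weighted average: each class counts in proportion to its support.
weighted_avg = sum(p * s for p, s in zip(precision, support)) / sum(support)

print(f"macro avg precision    = {macro_avg:.2f}")
print(f"weighted avg precision = {weighted_avg:.2f}")
```

The two results, about 0.59 and 0.72, agree with the report. With only 17 test documents, a single class with one sample moves the macro average by a full eighth, so the weighted figure is the steadier of the two here.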

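def clean_text(text):
    # Drop punctuation characters and lowercase the rest
    text="".join([word.lower() for word in text if word not in string.punctuation])
    # Split on runs of non-word characters (raw string avoids an invalid-escape warning)
    tokens=re.split(r'\W+', text)
    # Stem each token that is not a stopword
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    # Lemmatize each token that is not a stopword
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text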

def predict_countRFC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    count_vect=CountVectorizer(analyzer=lemmatize)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
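    # nr['lemmatized'] already holds a token list, so the analyzer runs a second time here
    # and joins those tokens without spaces; transforming nr['newReview'] would tokenize
    # the raw text only once. The same applies to nr['clean'] in the function below.
    count_test=count_vect_fit.transform(nr['lemmatized'])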
    
    model = rf.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_count_RFC_80-20:']
    print('\n\n',pred)
    

def predict_countRFC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    count_vect=CountVectorizer(analyzer=clean_text)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
    count_test=count_vect_fit.transform(nr['clean'])
    
    model = rf.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_count_RFC_80-20:']
    print('\n\n',pred)
    
predict_countRFC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_count_RFC_80-20:               massage benefits
predict_countRFC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_count_RFC_80-20:               massage benefits
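
One detail in the prediction helpers above is worth checking: the new text is passed through `lemmatize` (or `clean_text`) once to build the `lemmatized` or `clean` column, and then again when the vectorizer's analyzer runs during `transform`. The sketch below is a simplified stand-in, not the script's own functions: `simple_analyzer` and its tiny `STOPWORDS` set are assumptions that replace the NLTK stopword list and omit lemmatization, so it runs without NLTK. It shows what the same string-cleaning steps return for raw text and for an already tokenized list.

```python
import re
import string

# Assumed, abbreviated stopword list standing in for nltk's English stopwords.
STOPWORDS = {"i", "a", "the", "and", "to"}

def simple_analyzer(text):
    """Same cleaning steps as lemmatize() in the script, minus the lemmatizer."""
    # Iterates over characters for a str, but over whole tokens for a list.
    text = "".join([word.lower() for word in text if word not in string.punctuation])
    tokens = re.split(r"\W+", text)
    return [word for word in tokens if word not in STOPWORDS]

raw = "I need a massage!"
once = simple_analyzer(raw)    # analyzer applied to the raw string
twice = simple_analyzer(once)  # analyzer applied again to the token list

print(once)   # ['need', 'massage']
print(twice)  # ['needmassage']
```

In this simplified setting the second pass fuses the tokens into one string, which would not match any single-word vocabulary entry. If the script's functions behave the same way, transforming `nr['newReview']` directly would tokenize the input only once.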

def predict_countGBC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    count_vect=CountVectorizer(analyzer=lemmatize)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
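    # nr['lemmatized'] already holds a token list, so the analyzer runs a second time here
    # and joins those tokens without spaces; transforming nr['newReview'] would tokenize
    # the raw text only once. The same applies to nr['clean'] in the function below.
    count_test=count_vect_fit.transform(nr['lemmatized'])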
    
    model = gb.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_count_GBC_80-20:']
    print('\n\n',pred)
    

def predict_countGBC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    count_vect=CountVectorizer(analyzer=clean_text)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
    count_test=count_vect_fit.transform(nr['clean'])
    
    model = gb.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_count_GBC_80-20:']
    print('\n\n',pred)
    
predict_countGBC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_count_GBC_80-20:               cupping benefits
predict_countGBC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_count_GBC_80-20:               cupping benefits

TF-IDF Vectorization for RFC and GBC


stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
def count_punct(text):
    count=sum([1 for char in text if char in string.punctuation])
    return round(count/(len(text)-text.count(" ")),3)*100
data['body_length']=data['Document'].apply(lambda x: len(x)-x.count(" "))
data['punct%']= data['Document'].apply(lambda x: count_punct(x))
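
To make the two engineered columns concrete, here is a small worked example of the same arithmetic on a single short string. The sample text is an assumption chosen for illustration; the function body is the one defined above.

```python
import string

def count_punct(text):
    # Share of non-space characters that are punctuation, as a percentage
    count = sum([1 for char in text if char in string.punctuation])
    return round(count / (len(text) - text.count(" ")), 3) * 100

sample = "I need a massage!"
body_length = len(sample) - sample.count(" ")  # characters excluding spaces
punct_pct = count_punct(sample)

print(body_length, punct_pct)
```

Here 1 of 14 non-space characters is punctuation, so `punct%` comes out at roughly 7.1. Because rounding happens before multiplying by 100, the result carries at most one decimal place, and a document that is empty or contains only spaces would raise a `ZeroDivisionError`.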

def clean_text(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    # Return a list of stemmed tokens, as expected from a vectorizer analyzer
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    # Return a list of lemmas, as expected from a vectorizer analyzer
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text

data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  [chiropractic, adjustment, treatment, serve, n...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...  [, unitedhealthcare, combat, opioid, crisis, n...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...  [, safety, chiropractic, adjustment, chiroprac...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  [advanced, chiropractic, relief, 8, key, benef...
## 4  Heading to the spa can be a pampering treat, b...  ...  [heading, spa, pampering, treat, also, huge, b...
## 
## [5 rows x 10 columns]

from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.20)

from sklearn.feature_extraction.text import TfidfVectorizer

tfidf_vect=TfidfVectorizer(analyzer=lemmatize)
tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])

tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
tfidf_test=tfidf_vect_fit.transform(X_test['Document'])
len(tfidf_vect_fit.get_feature_names())
## 4325
tfidf_vect_fit.get_feature_names()[200:350]
## ['advantageous', 'adverse', 'adversity', 'advertised', 'advice', 'advise', 'advocate', 'aerobic', 'affair', 'affect', 'affected', 'affecting', 'affiliate', 'affirm', 'afford', 'affordable', 'afternoon', 'afterward', 'afterwards', 'age', 'agency', 'agephysical', 'agerelated', 'aggression', 'aggressive', 'aging', 'ago', 'agree', 'ahead', 'aid', 'aiding', 'ailment', 'aim', 'aimed', 'air', 'airport', 'alarm', 'alchemist', 'alcohol', 'alert', 'aligning', 'alignment', 'alike', 'allergic', 'allergy', 'alleviate', 'alleviated', 'alleviates', 'alleviating', 'alleviation', 'alliance', 'allison', 'allow', 'allowing', 'allows', 'allround', 'almost', 'alone', 'along', 'alongside', 'already', 'alright', 'also', 'alter', 'altered', 'alternate', 'alternating', 'alternative', 'alters', 'although', 'altogether', 'always', 'alzheimers', 'amateur', 'amazing', 'ambulance', 'ameliorating', 'america', 'american', 'among', 'amount', 'amplitude', 'amyclarklymphaticdrainagemassage', 'amyclarklymphaticdrainagemassage1', 'analgesic', 'analysis', 'analyzed', 'anatomy', 'ancient', 'andor', 'anecdotal', 'anemia', 'anesthesia', 'anesthetic', 'anger', 'angions', 'angry', 'anhedonia', 'animal', 'ankle', 'ann', 'annals', 'anne', 'announced', 'announcing', 'annual', 'annually', 'anosognosia', 'another', 'answer', 'answered', 'antibiotic', 'antibody', 'anticipate', 'antidepressant', 'antiinflammatory', 'antiviral', 'anxiety', 'anxietydepression', 'anxietyfree', 'anxious', 'anyone', 'anything', 'anywhere', 'apart', 'appealing', 'appearance', 'appears', 'appendix', 'appetite', 'applicable', 'application', 'applied', 'apply', 'applying', 'appointment', 'approach', 'approached', 'appropriate', 'approved', 'approximately', 'apta', 'area', 'areaswere', 'arent', 'arise', 'arising', 'arizona', 'arm', 'armpit']
tfidf_train_vect=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(tfidf_train.toarray())],axis=1)

tfidf_test_vect=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(tfidf_test.toarray())],axis=1)
tfidf_train_vect.head()
##                                             Document  body_length  ...  4323 4324
## 0  \nUnitedHealthcare Combats Opioid Crisis with ...         3796  ...   0.0  0.0
## 1  \r\nThai Massage\r\n\r\nDuring a Thai massage,...         1567  ...   0.0  0.0
## 2  ll You Need To Know About Massage Gun\n\nThere...         4366  ...   0.0  0.0
## 3  \nChiropractic Care for Back Pain\n\n\nStudies...          364  ...   0.0  0.0
## 4  \nFinding Help: When to Get It and Where to Go...         6862  ...   0.0  0.0
## 
## [5 rows x 4330 columns]
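
The matrix above holds TF-IDF weights rather than raw counts. As a rough sketch of how such weights arise, the following plain-Python example applies the smoothed IDF formula, idf(t) = ln((1 + n) / (1 + df(t))) + 1, followed by L2 normalization of each row, which is my understanding of the default settings of `TfidfVectorizer`. The three toy documents and the helper name `tfidf_rows` are assumptions for illustration and are not taken from the dataset.

```python
import math
from collections import Counter

# Toy, already-tokenized documents (hypothetical, not from the dataset).
docs = [
    ["massage", "relief", "muscle"],
    ["massage", "gun", "muscle", "muscle"],
    ["cupping", "relief"],
]

def tfidf_rows(docs):
    """Hypothetical helper: return (vocabulary, list of L2-normalized TF-IDF rows)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = {t: sum(1 for doc in docs if t in doc) for t in vocab}
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        weights = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(w * w for w in weights))
        rows.append([w / norm if norm else 0.0 for w in weights])
    return vocab, rows

vocab, rows = tfidf_rows(docs)
for row in rows:
    print({t: round(w, 3) for t, w in zip(vocab, row) if w > 0})
```

Terms that occur in fewer documents, such as `cupping` here, receive a larger weight than terms shared across documents. That property is what separates the TF-IDF matrix from the count matrix used earlier.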
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
np.random.seed(45678)
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(tfidf_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(tfidf_test)
end=time.time()
pred_time=(end-start)
# Put predictions next to the true labels, aligned on the test-set index
prd = pd.DataFrame({'Predicted': y_pred}, index=y_test.index)
pred = pd.concat([prd, y_test], axis=1)
print(pred)
##                      Predicted                       Topic
## 60         cold stone benefits         cold stone benefits
## 30       chiropractic benefits       chiropractic benefits
## 47            massage benefits        massage gun benefits
## 34            cupping benefits            cupping benefits
## 77                          ER                          ER
## 74            massage benefits        dry brushing massage
## 48            massage benefits        massage gun benefits
## 72  Lymphatic Drainage Massage        dry brushing massage
## 81                          ER                          ER
## 53         cold stone benefits         cold stone benefits
## 66  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 5             massage benefits            massage benefits
## 56         cold stone benefits         cold stone benefits
## 61  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 19            massage benefits   physical therapy benefits
## 73            massage benefits        dry brushing massage
## 78                          ER                          ER
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.6470588235294118
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[3 0 0 0 0 0 0 0 0]
##  [0 2 0 0 0 0 0 0 0]
##  [0 0 1 0 0 0 0 0 0]
##  [0 0 0 3 0 0 0 0 0]
##  [0 0 0 0 1 0 0 0 0]
##  [0 1 0 0 0 0 2 0 0]
##  [0 0 0 0 0 0 1 0 0]
##  [0 0 0 0 0 0 2 0 0]
##  [0 0 0 0 0 0 1 0 0]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
##                         ER       1.00      1.00      1.00         3
## Lymphatic Drainage Massage       0.67      1.00      0.80         2
##      chiropractic benefits       1.00      1.00      1.00         1
##        cold stone benefits       1.00      1.00      1.00         3
##           cupping benefits       1.00      1.00      1.00         1
##       dry brushing massage       0.00      0.00      0.00         3
##           massage benefits       0.17      1.00      0.29         1
##       massage gun benefits       0.00      0.00      0.00         2
##  physical therapy benefits       0.00      0.00      0.00         1
## 
##                   accuracy                           0.65        17
##                  macro avg       0.54      0.67      0.57        17
##               weighted avg       0.56      0.65      0.58        17
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(tfidf_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(tfidf_test)
end=time.time()
pred_time=(end-start)
# Put predictions next to the true labels, aligned on the test-set index
prd = pd.DataFrame({'Predicted': y_pred}, index=y_test.index)
pred = pd.concat([prd, y_test], axis=1)
print(pred)
##                      Predicted                       Topic
## 60         cold stone benefits         cold stone benefits
## 30       chiropractic benefits       chiropractic benefits
## 47        massage gun benefits        massage gun benefits
## 34            cupping benefits            cupping benefits
## 77                          ER                          ER
## 74        dry brushing massage        dry brushing massage
## 48        massage gun benefits        massage gun benefits
## 72        dry brushing massage        dry brushing massage
## 81                          ER                          ER
## 53         cold stone benefits         cold stone benefits
## 66  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 5             massage benefits            massage benefits
## 56         cold stone benefits         cold stone benefits
## 61  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 19            massage benefits   physical therapy benefits
## 73        dry brushing massage        dry brushing massage
## 78                          ER                          ER
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.9411764705882353
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[3 0 0 0 0 0 0 0 0]
##  [0 2 0 0 0 0 0 0 0]
##  [0 0 1 0 0 0 0 0 0]
##  [0 0 0 3 0 0 0 0 0]
##  [0 0 0 0 1 0 0 0 0]
##  [0 0 0 0 0 3 0 0 0]
##  [0 0 0 0 0 0 1 0 0]
##  [0 0 0 0 0 0 0 2 0]
##  [0 0 0 0 0 0 1 0 0]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
##                         ER       1.00      1.00      1.00         3
## Lymphatic Drainage Massage       1.00      1.00      1.00         2
##      chiropractic benefits       1.00      1.00      1.00         1
##        cold stone benefits       1.00      1.00      1.00         3
##           cupping benefits       1.00      1.00      1.00         1
##       dry brushing massage       1.00      1.00      1.00         3
##           massage benefits       0.50      1.00      0.67         1
##       massage gun benefits       1.00      1.00      1.00         2
##  physical therapy benefits       0.00      0.00      0.00         1
## 
##                   accuracy                           0.94        17
##                  macro avg       0.83      0.89      0.85        17
##               weighted avg       0.91      0.94      0.92        17
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)

def predict_tfidfRFC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    tfidf_vect=TfidfVectorizer(analyzer=lemmatize)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
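    # nr['lemmatized'] already holds a token list, so the analyzer runs a second time here
    # and joins those tokens without spaces; transforming nr['newReview'] would tokenize
    # the raw text only once. The same applies to nr['clean'] in the function below.
    tfidf_test=tfidf_vect_fit.transform(nr['lemmatized'])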
    
    model = rf.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_tfidf_RFC_80-20:']
    print('\n\n',pred)
    

def predict_tfidfRFC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    tfidf_vect=TfidfVectorizer(analyzer=clean_text)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
    tfidf_test=tfidf_vect_fit.transform(nr['clean'])
    
    model = rf.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_tfidf_RFC_80-20:']
    print('\n\n',pred)
    
predict_tfidfRFC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_tfidf_RFC_80-20:               massage benefits
predict_tfidfRFC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_tfidf_RFC_80-20:               massage benefits

def predict_tfidfGBC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    tfidf_vect=TfidfVectorizer(analyzer=lemmatize)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
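    # nr['lemmatized'] already holds a token list, so the analyzer runs a second time here
    # and joins those tokens without spaces; transforming nr['newReview'] would tokenize
    # the raw text only once. The same applies to nr['clean'] in the function below.
    tfidf_test=tfidf_vect_fit.transform(nr['lemmatized'])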
    
    model = gb.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_tfidf_GBC_80-20:']
    print('\n\n',pred)
    

def predict_tfidfGBC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    tfidf_vect=TfidfVectorizer(analyzer=clean_text)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
    tfidf_test=tfidf_vect_fit.transform(nr['clean'])
    
    model = gb.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_tfidf_GBC_80-20:']
    print('\n\n',pred)
    
predict_tfidfGBC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_tfidf_GBC_80-20:               massage benefits
predict_tfidfGBC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_tfidf_GBC_80-20:               massage benefits

N-Gram Vectorization for RFC and GBC

stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
def count_punct(text):
    count=sum([1 for char in text if char in string.punctuation])
    return round(count/(len(text)-text.count(" ")),3)*100
data['body_length']=data['Document'].apply(lambda x: len(x)-x.count(" "))
data['punct%']= data['Document'].apply(lambda x: count_punct(x))
def clean_text(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+',text)
    # Join the stems back into one string: the n-gram vectorizer tokenizes strings itself
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text

def lemmatize(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    # Join the lemmas back into one string for n-gram vectorization.
    # The count and TF-IDF sections return a list instead, because a string returned
    # from an analyzer would be split into single letters.
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text
data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  chiropractic adjustment treatment serve need m...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...   unitedhealthcare combat opioid crisis nonopio...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...   safety chiropractic adjustment chiropractic a...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  advanced chiropractic relief 8 key benefit chi...
## 4  Heading to the spa can be a pampering treat, b...  ...  heading spa pampering treat also huge boost he...
## 
## [5 rows x 10 columns]
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.20)
from sklearn.feature_extraction.text import CountVectorizer
n_gram_vect=CountVectorizer(ngram_range=(1,4))
type(X_train['Cleaned_text'])
## <class 'pandas.core.series.Series'>
X_train['Cleaned_text'].head()
## 15    top 5 health benefit regular massag therapi ma...
## 3     advanc chiropract relief 8 key benefit chiropr...
## 16    physic therapi help physic therapi train profe...
## 0     chiropract adjust treatment serv need million ...
## 20    benefit physic therapi peopl think physic ther...
## Name: Cleaned_text, dtype: object
X_train['Lemmatized'].head()
## 15    top 5 health benefit regular massage therapy m...
## 3     advanced chiropractic relief 8 key benefit chi...
## 16    physical therapy help physical therapy trained...
## 0     chiropractic adjustment treatment serve need m...
## 20    benefit physical therapy people think physical...
## Name: Lemmatized, dtype: object
n_gram_vect_fit=n_gram_vect.fit(X_train['Lemmatized'])


n_gram_train=n_gram_vect_fit.transform(X_train['Lemmatized'])
n_gram_test=n_gram_vect_fit.transform(X_test['Lemmatized'])
len(n_gram_vect_fit.get_feature_names())
## 70661
print(n_gram_vect_fit.get_feature_names()[200:500])
## ['2011 research review sized', '2011 review', '2011 review 26', '2011 review 26 clinical', '2012', '2012 review', '2012 review studiestrusted', '2012 review studiestrusted source', '2013', '2013 illustrating', '2013 illustrating benefit', '2013 illustrating benefit journaling', '2015', '2015 report', '2015 report published', '2015 report published journal', '2015 review', '2015 review evidence', '2015 review evidence found', '2015 systematic', '2015 systematic review', '2015 systematic review concluded', '2016', '2016 michael', '2016 michael phelps', '2016 michael phelps permanently', '2016 study', '2016 study published', '2016 study published journal', '2016 summer', '2016 summer olympics', '2016 summer olympics share', '2016 summer olympics1', '2016 summer olympics1 us', '20162017', '20162017 wide', '20162017 wide range', '20162017 wide range reason', '2017', '2017 2018', '2017 2018 legal', '2017 2018 legal battle', '2017 chiropractor', '2017 chiropractor tout', '2017 chiropractor tout treatment', '2017 nba', '2017 nba final', '2017 nba final irving', '2017 scientist', '2017 scientist analyzed', '2017 scientist analyzed 11', '2017 study', '2017 study found', '2017 study found structure', '2018', '2018 found', '2018 found change', '2018 found change hamstring', '2018 galluppalmer', '2018 galluppalmer college', '2018 galluppalmer college chiropractic', '2018 legal', '2018 legal battle', '2018 legal battle aim', '2018 study', '2018 study led', '2018 study led dr', '2019', '2019 early', '2019 early 2020', '2019 early 2020 many', '2019 found', '2019 found patient', '2019 found patient high', '2019 massage', '2019 massage gun', '2019 massage gun one', '2019 unitedhealthcare', '2019 unitedhealthcare uhc', '2019 unitedhealthcare uhc combatting', '2020', '2020 2021', '2020 2021 end', '2020 2021 end expansion', '2020 beyond', '2020 beyond people', '2020 beyond people say', '2020 many', '2020 many people', '2020 many people started', '2021', '2021 end', '2021 end 
expansion', '2021 end expansion period', '2021 opioid', '2021 opioid use', '2021 opioid use decrease', '20minute', '20minute selfmassage', '20minute selfmassage using', '20minute selfmassage using massage', '21', '21 benefit', '21 benefit chiropractic', '21 benefit chiropractic adjustment', '21 benefit might', '21 benefit might known', '21 percent', '21 percent respectively', '21 percent respectively addition', '22', '22 million', '22 million american', '22 million american visit', '22 percent', '22 percent 21', '22 percent 21 percent', '23', '23 2019', '23 2019 massage', '23 2019 massage gun', '24', '24 separately', '24 separately column', '24 separately column several', '25', '25 percent', '25 percent american', '25 percent american adult', '25 reason', '25 reason get', '25 reason get massage', '25 showed', '25 showed 75', '25 showed 75 90', '26', '26 clinical', '26 clinical trial', '26 clinical trial looked', '272', '272 study', '272 study participant', '272 study participant three', '281', '281 341', '281 341 many', '281 341 many taoist', '29', '29 2019', '29 2019 unitedhealthcare', '29 2019 unitedhealthcare uhc', '30', '30 little', '30 little 30', '30 little 30 percent', '30 percent', '30 percent le', '30 percent le likely', '30 second', '30 second working', '30 second working along', '300', '300 ad', '300 ad even', '300 ad even earlier', '33', '33 medication', '33 medication one', '33 medication one year', '34', '34 lymphatic', '34 lymphatic system', '34 lymphatic system drain', '341', '341 many', '341 many taoist', '341 many taoist believe', '35', '35 cup', '35 cup first', '35 cup first session', '35 seeking', '35 seeking relief', '35 seeking relief back', '37', '37 study', '37 study found', '37 study found reduction', '38', '38 taking', '38 taking pain', '38 taking pain medication', '400', '400 600', '400 600 massage', '400 600 massage gun', '400 greater', '400 greater immune', '400 greater immune competence', '4357', '4357 local', '4357 local health', 
'4357 local health department', '44', '44 million', '44 million people', '44 million people 1320', '456000', '456000 chiropractor', '456000 chiropractor massage', '456000 chiropractor massage therapist', '48', '48 percent', '48 percent went', '48 percent went doctor', '48 received', '48 received pain', '48 received pain reduction', '4pm', '4pm afternoon', '4pm afternoon time', '4pm afternoon time dinner', '50', '50 75', '50 75 improvement', '50 75 improvement one', '50 benefit', '50 benefit span', '50 benefit span ten', '50 improvement', '50 improvement researcher', '50 improvement researcher concluded', '50 million', '50 million american', '50 million american suffer', '50 minute', '50 minute long', '50 minute long say', '50 patient', '50 patient 16', '50 patient 16 completely', '50 state', '50 state however', '50 state however many', '51', '51 employee', '51 employee strong', '51 employee strong benefit', '52', '52 percent', '52 percent overall', '52 percent overall opioid', '53', '53 drugfree', '53 drugfree group', '53 drugfree group continued', '53 sought', '53 sought treatment', '53 sought treatment chiropractor', '57', '57 chiropractic', '57 chiropractic group', '57 chiropractic group achieved', '57 cup', '57 cup british', '57 cup british cupping', '60', '60 also', '60 also vulnerable', '60 also vulnerable complication', '60 minute', '60 minute massage', '60 minute massage show', '60 second', '60 second brushing', '60 second brushing use', '600', '600 massage', '600 massage gun', '600 massage gun isnt', '600 massage gun work', '6070', '6070 lymphatic', '6070 lymphatic tissue', '6070 lymphatic tissue want', '62', '62 adult', '62 adult neck', '62 adult neck back', '62 million', '62 million people', '62 million people seen', '63', '63 saw', '63 saw medical', '63 saw medical doctor', '65', '65 mile', '65 mile hour', '65 mile hour allow']
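The feature names above mix single tokens with two-, three-, and four-word phrases, which is what a `CountVectorizer` with `ngram_range=(1,4)` (the setting used in the prediction functions later in this script) would be expected to produce. As a rough illustration only, the pure-Python sketch below enumerates word n-grams from one cleaned string. `word_ngrams` is a hypothetical helper, not a scikit-learn function, and it splits on whitespace rather than applying the vectorizer's own token pattern.

```python
def word_ngrams(text, min_n=1, max_n=4):
    """Return the sorted set of word n-grams of length min_n..max_n."""
    tokens = text.split()
    grams = set()
    for n in range(min_n, max_n + 1):
        # Slide a window of n tokens across the token list
        for i in range(len(tokens) - n + 1):
            grams.add(" ".join(tokens[i:i + n]))
    return sorted(grams)

# A phrase taken from the feature list printed above
example = "2015 systematic review concluded"
features = word_ngrams(example)
print(features)
```

Four tokens yield ten features here (four unigrams, three bigrams, two trigrams, one 4-gram), which helps explain why the document-term matrix grows to tens of thousands of columns on a small corpus.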
n_gram_train_df=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(n_gram_train.toarray())],axis=1)

n_gram_test_df=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(n_gram_test.toarray())],axis=1)
n_gram_train_df.head()
##                                             Document  body_length  ...  70659 70660
## 0  Top 5 Health Benefits of Regular Massage Thera...          893  ...      0     0
## 1  Advanced Chiropractic Relief: 8 Key Benefits o...         8624  ...      0     0
## 2  How can physical therapy help?\n\nIn physical ...         7619  ...      0     0
## 3  Chiropractic adjustments and treatments serve ...         2288  ...      0     0
## 4  The Benefits of Physical Therapy\n\n\nWhen peo...         3221  ...      0     0
## 
## [5 rows x 70666 columns]
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(n_gram_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(n_gram_test)
end=time.time()
pred_time=(end-start)

prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                            Topic
## 58            massage benefits              cold stone benefits
## 56         cold stone benefits              cold stone benefits
## 72        dry brushing massage             dry brushing massage
## 36            massage benefits                 cupping benefits
## 12            massage benefits                 massage benefits
## 67            massage benefits       Lymphatic Drainage Massage
## 62            massage benefits       Lymphatic Drainage Massage
## 37            cupping benefits                 cupping benefits
## 52            massage benefits              cold stone benefits
## 57            massage benefits              cold stone benefits
## 4             massage benefits                 massage benefits
## 10            massage benefits                 massage benefits
## 17   physical therapy benefits        physical therapy benefits
## 46            massage benefits             massage gun benefits
## 49        massage gun benefits             massage gun benefits
## 29            massage benefits  mental health services benefits
## 69  Lymphatic Drainage Massage       Lymphatic Drainage Massage
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.5294117647058824
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 2 0 0 0]
##  [0 1 0 0 3 0 0 0]
##  [0 0 1 0 1 0 0 0]
##  [0 0 0 1 0 0 0 0]
##  [0 0 0 0 3 0 0 0]
##  [0 0 0 0 1 1 0 0]
##  [0 0 0 0 1 0 0 0]
##  [0 0 0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##      Lymphatic Drainage Massage       1.00      0.33      0.50         3
##             cold stone benefits       1.00      0.25      0.40         4
##                cupping benefits       1.00      0.50      0.67         2
##            dry brushing massage       1.00      1.00      1.00         1
##                massage benefits       0.27      1.00      0.43         3
##            massage gun benefits       1.00      0.50      0.67         2
## mental health services benefits       0.00      0.00      0.00         1
##       physical therapy benefits       1.00      1.00      1.00         1
## 
##                        accuracy                           0.53        17
##                       macro avg       0.78      0.57      0.58        17
##                    weighted avg       0.81      0.53      0.53        17
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)
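The confusion matrix above can be read directly to see where the per-class scores and the `UndefinedMetricWarning` come from. The sketch below recomputes precision, recall, and accuracy from the printed matrix using only the standard library. The label order is assumed to match the alphabetical order shown in the classification report, and the zero guard mirrors the "set to 0.0" behavior the warning describes.

```python
# Labels in the order used by the classification report (assumed to match the matrix rows)
labels = [
    "Lymphatic Drainage Massage", "cold stone benefits", "cupping benefits",
    "dry brushing massage", "massage benefits", "massage gun benefits",
    "mental health services benefits", "physical therapy benefits",
]

# RFC confusion matrix copied from the output above (row=expected, col=predicted)
cm = [
    [1, 0, 0, 0, 2, 0, 0, 0],
    [0, 1, 0, 0, 3, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 3, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]

total = sum(sum(row) for row in cm)
correct = sum(cm[i][i] for i in range(len(cm)))
accuracy = correct / total

precision, recall = [], []
for i in range(len(cm)):
    row_sum = sum(cm[i])                     # documents that truly belong to class i
    col_sum = sum(row[i] for row in cm)      # documents predicted as class i
    recall.append(cm[i][i] / row_sum if row_sum else 0.0)
    # A class that was never predicted has an undefined precision; use 0.0
    precision.append(cm[i][i] / col_sum if col_sum else 0.0)

for name, p, r in zip(labels, precision, recall):
    print(f"{name:32s} precision={p:.2f} recall={r:.2f}")
print(f"accuracy={accuracy:.2f}")
```

The fifth column (massage benefits) collects 11 predictions for only 3 true documents, which gives its 0.27 precision, while mental health services benefits is never predicted, which triggers the warning.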
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(n_gram_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(n_gram_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                           Predicted                            Topic
## 58              cold stone benefits              cold stone benefits
## 56              cold stone benefits              cold stone benefits
## 72             dry brushing massage             dry brushing massage
## 36                 cupping benefits                 cupping benefits
## 12                 massage benefits                 massage benefits
## 67       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 62       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 37                 cupping benefits                 cupping benefits
## 52              cold stone benefits              cold stone benefits
## 57              cold stone benefits              cold stone benefits
## 4                  massage benefits                 massage benefits
## 10                 massage benefits                 massage benefits
## 17        physical therapy benefits        physical therapy benefits
## 46                 massage benefits             massage gun benefits
## 49             massage gun benefits             massage gun benefits
## 29  mental health services benefits  mental health services benefits
## 69       Lymphatic Drainage Massage       Lymphatic Drainage Massage
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.9411764705882353
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[3 0 0 0 0 0 0 0]
##  [0 4 0 0 0 0 0 0]
##  [0 0 2 0 0 0 0 0]
##  [0 0 0 1 0 0 0 0]
##  [0 0 0 0 3 0 0 0]
##  [0 0 0 0 1 1 0 0]
##  [0 0 0 0 0 0 1 0]
##  [0 0 0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##      Lymphatic Drainage Massage       1.00      1.00      1.00         3
##             cold stone benefits       1.00      1.00      1.00         4
##                cupping benefits       1.00      1.00      1.00         2
##            dry brushing massage       1.00      1.00      1.00         1
##                massage benefits       0.75      1.00      0.86         3
##            massage gun benefits       1.00      0.50      0.67         2
## mental health services benefits       1.00      1.00      1.00         1
##       physical therapy benefits       1.00      1.00      1.00         1
## 
##                        accuracy                           0.94        17
##                       macro avg       0.97      0.94      0.94        17
##                    weighted avg       0.96      0.94      0.94        17
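Both model chunks compute `fit_time` and `pred_time`, but the values are never printed, so the two classifiers cannot be compared on speed from the output alone. A minimal sketch of one way to capture and report the timings follows. `timed` is a hypothetical helper, and the workload is a stand-in, since the real call would be something like `timed(gb.fit, n_gram_train, y_train)`.

```python
import time

def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Stand-in workload used only so the sketch runs without the training data
total, fit_time = timed(sum, range(1_000_000))
print(f"Fit time: {fit_time:.3f} s")
```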
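def clean_text(text):
    # Drop punctuation character by character and lowercase what remains
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Raw string so that \W is passed to the regex engine unchanged
    tokens=re.split(r'\W+',text)
    # Remove English stopwords and stem the remaining tokens
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text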

def predict_ngramRFC_clean(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))

    rf=RandomForestClassifier(n_estimators=150,max_depth=None, n_jobs=-1)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Cleaned_text'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Cleaned_text'])
    n_gram_test=n_gram_vect_fit.transform(nr['clean'])
    
    model = rf.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['stemmed_1ngram4_RFC_80-20:']
    print('\n\n',pred)
    

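def lemmatize(text):
    # Drop punctuation character by character and lowercase what remains
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Raw string so that \W is passed to the regex engine unchanged
    tokens=re.split(r'\W+', text)
    # Remove English stopwords and lemmatize the remaining tokens
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text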
    
def predict_ngramRFC_lemma(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemma']=nr['newReview'].apply(lambda x: lemmatize(x))

    rf=RandomForestClassifier(n_estimators=150,max_depth=None, n_jobs=-1)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Lemmatized'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Lemmatized'])
    n_gram_test=n_gram_vect_fit.transform(nr['lemma'])
    
    model = rf.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service:']
    pred.index= ['lemmatized_1ngram4RFC_80-20:']
    print('\n\n',pred)
predict_ngramRFC_clean('I need a massage!') 
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_RFC_80-20:               massage benefits
predict_ngramRFC_lemma('I need a massage!')
## 
## 
##                               Recommended Healthcare Service:
## lemmatized_1ngram4RFC_80-20:                massage benefits
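def clean_text(text):
    # Drop punctuation character by character and lowercase what remains
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Raw string so that \W is passed to the regex engine unchanged
    tokens=re.split(r'\W+',text)
    # Remove English stopwords and stem the remaining tokens
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text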

def predict_ngramGBC_clean(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))

    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Cleaned_text'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Cleaned_text'])
    n_gram_test=n_gram_vect_fit.transform(nr['clean'])
    
    model = gb.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['stemmed_1ngram4_GBC_80-20:']
    print('\n\n',pred)
    

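def lemmatize(text):
    # Drop punctuation character by character and lowercase what remains
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Raw string so that \W is passed to the regex engine unchanged
    tokens=re.split(r'\W+', text)
    # Remove English stopwords and lemmatize the remaining tokens
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text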
    
def predict_ngramGBC_lemma(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemma']=nr['newReview'].apply(lambda x: lemmatize(x))

    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Lemmatized'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Lemmatized'])
    n_gram_test=n_gram_vect_fit.transform(nr['lemma'])
    
    model = gb.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service:']
    pred.index= ['lemmatized_1ngram4GBC_80-20:']
    print('\n\n',pred)
predict_ngramGBC_clean('I need a massage!') 
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_GBC_80-20:               massage benefits
predict_ngramGBC_lemma('I need a massage!')
## 
## 
##                               Recommended Healthcare Service:
## lemmatized_1ngram4GBC_80-20:                massage benefits
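Each prediction function above refits the vectorizer and the classifier on every call, which repeats the most expensive step for every new user input. One possible alternative, sketched here under assumptions, is to fit a scikit-learn `Pipeline` once and reuse it. The toy corpus and labels are made up purely so the example runs on its own and stand in for `X_train['Lemmatized']` and `y_train`. `recommender` and `recommend` are hypothetical names, and `random_state=0` is added only for repeatability.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline

# Hypothetical toy corpus standing in for X_train['Lemmatized'] and y_train
toy_docs = [
    "massage relaxes tight muscle tissue",
    "deep tissue massage eases muscle tension",
    "cupping draws blood flow skin suction",
    "cupping therapy suction cup skin",
]
toy_labels = [
    "massage benefits", "massage benefits",
    "cupping benefits", "cupping benefits",
]

# Vectorizer and classifier are fitted together, once
recommender = Pipeline([
    ("vect", CountVectorizer(ngram_range=(1, 4))),
    ("clf", RandomForestClassifier(n_estimators=150, random_state=0)),
])
recommender.fit(toy_docs, toy_labels)

def recommend(text):
    """Return the predicted class for one already-cleaned input string."""
    return recommender.predict([text])[0]

print(recommend("need massage"))
```

In the script itself the input would first pass through `clean_text` or `lemmatize`, and swapping `RandomForestClassifier` for `GradientBoostingClassifier` would give the GBC variant without changing the rest.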

Second part: Lemmatized Tokens & 85/15 Train/Test split & RFC | GBC

Count Vectorizer RFC and GBC


stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
data <- read.csv('benefitsContraindications4.csv',sep=',',header=TRUE,  na.strings=c('',' ','NA'))
colnames(data)
## [1] "Document"            "Source"              "Topic"              
## [4] "InternetSearch"      "Contraindications"   "risksAdverseEffects"
head(data,5)
##   Document
## 1 Chiropractic adjustments and treatments serve the needs of millions of people around the world.\n\n \n\nAdjustments offer effective, non-invasive and cost-effective solutions to neck and back pain, as well as a myriad of other medical issues.\n\nHave you ever stopped to wonder how many of us suffer from neck and back stiffness or pain?\n\nApart from the obvious discomfort, simple daily tasks such as driving a car, crossing a busy street and picking things up from the floor can become all too challenging for individuals experiencing such pain.\n\nAs anyone who has experienced pain would know, having
restricted movement can be debilitating and unfortunately, our busy world doesn't allow for us to stop.\n\n \nSome of the benefits of long-term chiropractic care include:\n\n    Chiropractors can identify mechanical issues that cause spine-related pain and offer a series of adjustments that provide near immediate relief. Following appointments, patients often report feeling their symptoms noticeably better.\n    When a chiropractor performs an adjustment, they can help restore movement in joints that have "locked up". This becomes possible as treatment allows muscles surrounding joints to relax, thereby reducing joint stiffness.\n    Many factors affect health, including exercise patterns, nutrition, sleep, heredity and the environment in which we live. Rather than just treat symptoms of the disease, chiropractic care focuses on a holistic approach to naturally maintain health and resist disease.\n    Chiropractic adjustments help restore normal function and movement to the entire body. Many patients report an improvement in their ability to move with efficiency and strength.\n    Many patients find delight in the results chiropractic adjustments have on old and chronic injuries. Whether an injury is, in fact, new or old, chiropractic care can help reduce pain, restore mobility and provide quick pain relief to all joints in the body. Such care can help maintain better overall health and thus faster recovery time.\n\nHave you ever noticed that when you are in pain and unable to perform regular or favorite activities, it can put a strain on emotional and mental well-being?\n\nFor example, the increased stress from not being able to properly perform a paid job. This, in turn, can have a negative impact on physical health with increases in heart rate and blood pressure. The domino effect often continues with sleep becoming disturbed, with resulting lethargy and tiredness during the day. 
Does anyone really feel up to exercising in this state?\n\nChiropractic care is a natural method of healing the body's communication system and never relies on the use of pharmaceutical drugs or invasive surgery.\n\n\n\n 
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              \nUnitedHealthcare Combats Opioid Crisis with Non-Opioid Benefits\nPhysical therapy and chiropractic care can prevent or reduce expensive, invasive spinal procedures, such as imaging or surgery, to reduce opioid use and cut costs.\nUnitedHealthcare, opioid, physical therapy, healthcare spending\n\n\nOctober 29, 2019 - UnitedHealthcare (UHC) is 
combatting the opioid epidemic and high healthcare costs with new physical therapy and chiropractic care benefits to prevent, delay, or in some cases substitute for invasive spinal procedures.\n\n"With millions of Americans experiencing low back pain currently or at some point during their lifetimes, we believe this benefit design will help make a meaningful difference by improving health outcomes while reducing costs," said Anne Docimo, MD, UnitedHealthcare chief medical officer.\n\nLower back pain is in part responsible for sustaining the opioid epidemic and also increases healthcare costs.\nDig Deeper\n\nAlthough opioid overdoses fell by two percent from 2017 to 2018 and a legal battles aim to hold pharmaceutical companies accountable, there is no end in sight for the opioid epidemic. Industry professionals are still grappling with the balance between cutting opioid prescriptions will working to reduce patient pain.\n\nCommon conditions such as low back pain bolster the epidemic's presence, with clinicians still prescribing the opioids against best practice recommendations. According to a recent OptumLabs study, 9 percent of patients with newly diagnosed low back pain are prescribed opioids and lower back pain currently contributes 52 percent to the overall opioid prescription rate.\n\nIn addition to boosting opioids distribution, alternative, invasive lower back pain treatments can significantly impact healthcare spending.\n\nIt is not new information that physical therapy and chiropractic care are effective, lower cost alternatives to spinal imaging or surgery. However, payers are still in the process of adopting the method.\n\nTo counteract the high-cost, high-risk potential of using opioids to treat back pain, UHC created a benefit that does not rely on medication or technology but rather on physical therapy and chiropractic care.\n\nThe benefit allows eligible employers to offer physical therapist and chiropractor visits with no out-of-pocket costs. 
Members who already receive physical therapist and chiropractic care benefits under UHC's employer-sponsored health plans and who have maxed out their visits will not receive additional visits under this benefit.\n\nHowever, for those who still have visits to use and who choose physical therapy or chiropractic care over other forms of treatment, the copay or deductible for those visits will be waived and they will receive three visits at no cost.\n\nUHC has high expectations for the fiscal and physical impacts of this benefit.\n\nAccording to UHC's analysis, the health payer expects that by 2021, opioid use will decrease by 19 percent. Spinal imaging test frequency and spinal surgeries will be reduced by 22 percent and 21 percent, respectively. In addition to these specific goals, UHC hopes to see a decrease in the overall cost of spinal care.\n\nThe same OptumLabs study demonstrated that UHC's expectations are not without precedent.\n\nThe study looked at the correlation between out-of-pocket costs and patient utilization of noninvasive treatments. Researchers discovered that members whose copay was over $30 were a little under 30 percent less likely to choose physical therapy as opposed to more invasive treatments.\n\nAn American Journal of Managed Care study in June 2019 found that patients with high deductibles, typically over $1,000, were less likely to visit physical therapy.\n\nEligible employers may be brand new or renewing their membership. They must be fully insured and over 51 or more employees strong. The benefit is currently available in Connecticut, Florida, Georgia, New York, and North Carolina.\n\nHowever, UHC plans to expand the benefit from 2020 into 2021. By the end of this expansion period, the benefit will also be available to self-funded employers and organizations with an employee population between 2 and 50. 
The benefit will span ten states, primarily in the southeast.\n\n"This new benefit design may help encourage people with low back pain to get the right care at the right time and in the right setting, helping expand access to evidence-based and more affordable treatments," said Docimo.
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          The Safety of Chiropractic Adjustments\n\n    Chiropractic Adjustment\n    What Research Shows\n    Safety\n\nChiropractic adjustment, also called spinal manipulation, is a procedure done by a chiropractor using the hands or small instruments to apply controlled force to a spinal joint. The goal is to improve spinal motion and physical function of the entire body. Chiropractic adjustment is safe when performed by someone who is properly trained and licensed to practice chiropractic care. Complications are rare, but they are possible. Learn more about both the benefits and risks.\nChiropractic adjustment\nVerywell / Brianna Gilmartin \nChiropractic Adjustment\n\nOne of the most important reasons people seek chiropractic care is because it is a completely drug-free therapy. Someone dealing with joint pain, back pain, or headaches might consider visiting a chiropractor.\n\nThe goal of chiropractic adjustment is to place the body into a proper position so the body can heal itself. Treatments are believed to reduce stress on the immune system, reducing the potential for disease. 
Chiropractic care aims to address the entire body, including a person's ability to move, perform, and even think.\nWhat Research Shows\n\nMany people wonder how helpful chiropractic care is in treating years of trauma and poor posture. There have been numerous studies showing the therapeutic benefits of chiropractic care.\nSciatica\n\nSciatica is a type of pain affecting the sciatic nerve, the large nerve extending from the low back down the back of the legs. Other natural therapies don't always offer relief and most people want to avoid steroid injections and surgery, so they turn to chiropractic care.\n\nA double-blind trial reported in the Spine Journal compared active and simulated chiropractic manipulations in people with sciatic nerve pain. Active manipulations involved the patient laying down and receiving treatment from a chiropractor. Stimulated manipulations involved electrical muscle stimulation with electrodes placed on the skin to send electrical pulses to different parts of the body.\n\nThe researchers determined active manipulation offered more benefits than stimulated. The people who received active manipulations experienced fewer days of moderate or severe pain and other sciatica symptoms. They also reported no adverse effects.\nNeck Pain\n\nOne study reported in the Annals of Internal Medicine looked at different therapies for treating neck pain. They divided 272 study participants into three groups: one that received spinal manipulation from a chiropractic doctor, a second group given over-the-counter (OTC) pain relievers, narcotics, and muscle relaxers, and a third group who did at-home exercises. \n\nAfter 12 weeks, patients reported a 75% pain reduction, with the chiropractic treatment group achieving the most improvement. 
About 57% of the chiropractic group achieved pain reduction, while 48% received pain reduction from exercising, and 33% from medication.\n\nAfter one year, 53% of the drug-free groups continued to report pain relief compared to only 38% of those taking pain medications. \nHeadaches\n\nCervicogenic headaches and migraines are commonly treated by chiropractors. Cervicogenic headaches are often called secondary headaches because pain is usually referred from another source, usually the neck. Migraine headaches cause severe, throbbing pain and are generally experienced on one side of the head. There are few non-medicinal options for managing both types of chronic headaches.\n\nResearch reported in the Journal of Manipulative and Physiological Therapeutics suggests chiropractic care, specifically spinal manipulation, can improve migraines and cervicogenic headaches.  \nFrozen Shoulder\n\nFrozen shoulder affects the shoulder joint and involves pain and stiffness that develops gradually and gets worse. Frozen shoulder can be quite painful, and treatment involves preserving as much range of motion in the shoulder as possible and managing pain.\n\nA clinical trial reported in the Journal of Chiropractic Medicine described how patients suffering from frozen shoulder responded to chiropractic treatment. Of the 50 patients, 16 completely recovered, 25 showed a 75 to 90% improvement, and eight showed a 50 to 75% improvement. Only one person showed zero to 50% improvement. The researchers concluded most people can get improvement by treating frozen shoulder with chiropractic treatment.\nPreventing Need for Surgery\n\nChiropractic care may reduce the need for back surgery. 
Guidelines reported in the Journal of the American Medical Association suggest that it's reasonable for people suffering from back pain to try spinal manipulation before deciding on surgical intervention.\nLow Back Pain\n\nStudies have shown chiropractic care, including spinal manipulation, can provide relief from mild to moderate low back pain. In fact, spinal manipulation may work as well as other standard treatments, including pain-relief medications.\n\nA 2011 review of 26 clinical trials looked at the effectiveness of different treatments for chronic low back pain. What they found was that spinal manipulation is just as effective as other treatments for reducing back pain and improving function.\nSafety\n\n\n
## 4 Advanced Chiropractic Relief: 8 Key Benefits of Chiropractor Care\n\nAre you one of the 50 million Americans who suffer from chronic pain? If so you're probably intimately familiar with the feeling of pure desperation that can arise from an inability to find relief.\n\nIn addition to physical issues, chronic pain can cause anxiety, depression, and more. However, there could be a light at the end of the tunnel. Many people are finding advanced chiropractic relief that is completely changing their lives.\n\nYour body is a world in itself. At this very moment, more than a million chemical reactions are taking place in your body. It manufactures energy, it regulates your heartbeat, your breathing and it regenerates and heals itself. Everything takes place without your conscious knowledge, without you controlling it voluntarily. The master system that controls it all is your nervous system.\n\nThe nervous system is made out of your brain, spinal cord and all your nerves.\n\nThe energy that flows through your nervous system in your body is like electricity. In order to have that electric flow normally and freely, we need to have a well functioning spine. Whenever you have disruption of that flow, disease happens. That would be the case when your spine is misaligned or is not moving properly.\n\nDid you know that 90% of stimulation and nutrition to the brain is generated by the movement of the spine?\n\nThe more mechanically distorted a person is, the less energy is available for thinking, metabolism and healing.\n\nThis is why it is so important to have a healthy spine, a proper posture, to exercise, to eat properly - all of it truly matters for your quality of life.\nChiropractors localize the areas of your spine that do not move properly - referred to as vertebral subluxations - and adjust them with a specific high speed, but yet gentle, thrust to improve spinal motion.\n\nWant to learn about some of the ways chiropractic care can help you? 
Keep reading for insight into some of the key benefits of seeing a chiropractor.\n\nThe benefits of chiropractic care are numerous:\n\n1. Lower Blood Pressure\n\nStudies show that chiropractic treatment can lower your blood pressure. Sometimes, this works just as well as a prescription blood pressure medication! This benefit can also last for as long as six months after treatment.\n\nHigh blood pressure can cause an array of serious side effects like nausea, fatigue, dizziness, and anxiety. Sufferers who haven't found relief should consider consulting with a chiropractor. A chiropractic adjustment may be the solution.\n\nSome studies have shown that chiropractic adjustments can also help patients who are suffering from low blood pressure.\n\n2. Reduced Inflammation\n\nIn many cases, joint issues, pain, and tension are caused by inflammation in the body. Chiropractic adjustments can reduce inflammation.\n\nThis leads to relief of muscle tension, chronic back pain, and joint pain. These adjustments can sometimes also slow the progression of inflammation-related diseases, like arthritis.\n\n3. Better Sleep\n\nPatients who receive chiropractic adjustments report a significant improvement in their sleep patterns. If you regularly suffer from insomnia, visiting a chiropractor regularly may help. Also, when you experience pain relief, this will help you get a restful night's sleep.\n\n4. Digestive Relief\n\nChiropractors often give nutritional advice as part of their services. However, this isn't the only way that they provide patients with digestive relief.\n\nAdjusting the thoraco-lumbar spine restores the neurological function of your digestive system. Regular adjustments can help with chronic digestive issues.\n\n5. Stress Release\n\nEveryday life can cause muscle cramping, inflammation, and more. When you're sore from working at a computer, heavy lifting, or just dealing with emotional stress, a chiropractic adjustment can help. 
This leads to greater comfort and advanced pain relief.\n\n6. Improvement of Neurological Conditions\n\nA chiropractic adjustment can also increase blood flow to the brain and increase the flow of cerebral spinal fluid. This means that patients suffering from neurological conditions like epilepsy and multiple sclerosis can significantly benefit from regular adjustments.\n\nThis is a relatively new area of study, but the potential is huge. Those suffering from these conditions will want to do some research. It's important to find the best chiropractor in their area with experience dealing with these specific types of cases.\n\n7. Chiropractic care can improve communication from your brain to your muscles\n\nResearch seems to show that chiropractic care can improve your brain-body communication, helping your brain to be more aware of what is going on in the body so it can control your body better.\n\nBetter health, more energy and vitality are some of the positive effects of getting your spine adjusted. It sets your vertebrae back into motion freeing up the energy that travels through your nerves.\n\nChiropractic care is a partnership. The results patients want is a combination of what the chiropractor does and what the patient does.\n\nThere are many good things that can be changed and improved for a better lifestyle: exercise, good nutrition, good mental attitude and spinal adjustments.\n\nYour whole body will work better by having your nervous system free of interference. That is the essence of chiropractic care and is designed for you and your family.\n\n8. Pain Relief\n\nPerhaps the most well-known benefit of going to a chiropractor is pain relief. Adjustments can help with a huge array of painful conditions including the following.\n\nNeck and Lower Back Pain\n\nAdjustments are the most effective non-invasive pain relief method for this type of pain. 
They may help patients avoid having to take prescription pain management drugs.\n\nSciatica\n\nTreatments help relieve pressure on the nerve. This results in less severe pain that lasts for a fewer number of days.\n\nHeadaches\n\nChiropractic adjustments help headaches and migraines. They do this by treating back misalignment, muscle tension, and stress. Cervical spine manipulation was associated with significant improvement in headache outcomes in trials involving patients with neck pain and/or neck dysfunction and headache.\n\nChronic headaches can result from the abnormal positioning of the head and can be worsened from neck pressure and movement. Chiropractic removes the interference whether it may be from the distant muscle tightness in the back causing strain on your spine or an abnormal lordotic cervical curve and moving vertebrae.\nChiropractic care can reduce the duration of headaches, lower their intensity when they do occur and limit the frequency of their occurrence all together.\n\nMenstrual cramps\n\nChiropractic treatment removes tension from the pelvis and sacrum. It also regulates the neurological function communicating with the reproductive organs. Adjustments can also relieve the bloating, cramping, and pain associated with menstrual cramps\n\nAnyone who has tried traditional medical treatments and has been unable to find pain relief should experiment with chiropractic care. More often than not, you'll be pleasantly surprised!\n\nBonus: Advanced Chiropractic Relief\n\nIn addition to the benefits listed above, adjustments can bring advanced chiropractic relief for a wide variety of other conditions as well as overall life improvement. A few examples include:\n\nScoliosis - adjustments have shown to help with the pain, reduced range of motion, abnormal posture, and even difficulty breathing caused by this abnormal curvature of the spine\n\nVertigo - 
an adjustment can help realign and balance the spine, thereby reducing the dizziness, nausea, and disorientation caused by vertigo\n\nSinus and allergy relief ? adjusting the upper cervical spine can help drain the sinuses and provide immediate and lasting relief from both long-term and seasonal allergies\n\nExpectant mothers ? women can experience relief from pain and morning sickness and are better able to maintain proper posture during and after pregnancy\n\nChildren?s issues ? treatments have been shown to help children with acid reflux, cholic, and ear infections\nAthletic performance ? the reduction in pain and inflammation is particularly beneficial for professional and amateur athletes\n\nStimulates the immune system ? chiropractic care helps to boost the immune system, speeding up the healing process following illnesses or injuries. One of the most important studies showing the positive effect chiropractic care can have on the immune system and general health was performed by Ronald Pero, Ph.D., chief of cancer prevention research at New York?s Preventive Medicine Institute and professor of medicine at New York University. Dr. Pero measured the immune systems of people under chiropractic care as compared to those in the general population and those with cancer and other serious diseases.\n\nIn his initial three-year study of 107 individuals who had been under chiropractic care for five years or more, the chiropractic patients were found to have a 200% greater immune competence than people who had not received chiropractic care, and 400% greater immune competence than people with cancer and other serious diseases. The immune system superiority of those under chiropractic care did not diminish with age.\n\nDr. Pero stated: ?When applied in a clinical framework, I have never seen a group other than this chiropractic group to experience a 200% increase over the normal patients. This is why it is so dramatically important. 
We have never seen such a positive improvement in a group.?\n\nAs you can see, there are almost limitless benefits to seeking chiropractic treatment. If you haven?t tried it yet, what are you waiting for?\n\nThere?s no need to accept pain and discomfort as a normal part of life. You have nothing to lose and everything to gain, so it only makes sense to find out more about this possibly life-changing approach to improving your health and wellness.
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 Heading to the spa can be a pampering treat, but it can also be a huge boost to your health and wellness! Massage therapy can relieve all sorts of ailments ? from physical pain, to stress and anxiety. People who choose to supplement their healthcare regimen with regular massages will not only enjoy a relaxing hour or two at the spa, but they will see the benefits carry through the days and weeks after the appointment!\n\n1\n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\nThese are the 10 most common benefits reported from massage therapy:\n\n1. 
Reduce Stress\n\nA relaxing day at the spa is a great way to unwind and de-stress. However, clients are sure to notice themselves feeling relaxed and at ease for days and even weeks after their appointments!\n\n \n\n2. Improve Circulation\n\nLoosening muscles and tendons allows increased blood flow throughout the body. Improving your circulation can have a number of positive effects on the rest of your body, including reduced fatigue and pain management!\n\n \n\n3. Reduce Pain\n\nMassage therapy is great for working out problem areas like lower back pain and chronic stiffness. A professional therapist will be able to accurately target the source of your pain and help achieve the perfect massage regimen.\n\n \n\n3\n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n4. Eliminate Toxins\n\nStimulating the soft tissues of your body will help to release toxins through your blood and lymphatic systems.\n\n \n\n5. Improve Flexibility\n\nMassage therapy will loosen and relax your muscles, helping your body to achieve its full range of movement potential.\n\n \n\n6. Improve Sleep\n\nA massage will encourage relaxation and boost your mood.  Going to bed with relaxed and loosened muscles promotes more restful sleep, and you?ll feel less tired in the morning!\n\n \n\n7. Enhance Immunity\n\nStimulation of the lymph nodes re-charges the body?s natural defense system.\n\n \n\n8. Reduce Fatigue\n\nMassage therapy is known to boost mood and promote better quality sleep, thus making you feel more rested and less worn-out at the end of the day.\n\n \n\n2\n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n9. Alleviate Depression and Anxiety\n\nMassage therapy can help to release endorphins in your body, helping you to feel happy, energized, and at ease.\n\n \n\n10. Reduce post-surgery and post-injury swelling\n\nA professional massage is a great way to safely deal with a sports injury or post-surgery rehabilitation.\n\nDo you think that massage therapy could help you find relief in any of these areas? 
What improvements would you like to see in your health? Contact us today with your questions about massage therapy and see how we can help you get on the path to improved health and wellness!
##                                                                                                                                       Source
## 1 https://coremedicalohio.com/benefits-of-long-term-chiropractic-care/?utm_source=ReviveOldPost&utm_medium=social&utm_campaign=ReviveOldPost
## 2                                   https://healthpayerintelligence.com/news/unitedhealthcare-combats-opioid-crisis-with-non-opioid-benefits
## 3                                                                     https://www.verywellhealth.com/is-chiropractic-adjustment-safe-4588279
## 4                                           https://hafkeychiropractic.com/advanced-chiropractic-relief-8-key-benefits-of-chiropractor-care/
## 5                                                                                  https://www.urbannirvana.com/10-benefits-massage-therapy/
##                   Topic InternetSearch
## 1 chiropractic benefits           <NA>
## 2 chiropractic benefits           <NA>
## 3 chiropractic benefits           <NA>
## 4 chiropractic benefits           <NA>
## 5      massage benefits         google
##                                                                                                                                                                                               Contraindications
## 1 Doctors of Chiropractic work collaboratively with other healthcare professionals. Should your condition require the attention of another healthcare profession, that recommendation or referral will be made.
## 2                                                                                                                                                                                                          <NA>
## 3                                                                                                                                                                                                          <NA>
## 4                                                                                                                                                                                                          <NA>
## 5                                                                                                                                                                                                          <NA>
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        risksAdverseEffects
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       <NA>
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       <NA>
## 3 Risks and side effects associated with chiropractic adjustments may include:\n\n    temporary headaches\n    fatigue after treatment\n    discomfort in parts of the body that were treated\n\nRare but serious risks associated with chiropractic adjustment include:\n\n    stroke\n    cauda equina syndrome, a condition involving pinched nerves in the lower part of the spinal canal\n    worsening of herniated disks (although research isn't conclusive)\n\nIn addition to effectiveness, research has focused on the safety of chiropractic treatments, mainly spinal manipulation. \n\nOne 2017 review of 250 articles looked at serious adverse events and benign events associated with chiropractic care. Based on the evidence the researchers reviewed, serious adverse events accounted for one out of every two million spinal manipulations to 13 per 10,00 patients. Serious adverse events included spinal or neurological problems and cervical arterial strokes (dissection of any of the arteries in the neck).\n\nBenign events were more common and included more pain and higher levels of neck problems, but most were short-term problems.\n\nThe researchers confirmed serious adverse events were rare and often related to other preexisting conditions, while benign events are more common. However, the reasons for any types of adverse events are unknown.\n\nA second 2017 review looked 118 articles and found frequently described adverse events include stroke, headache and vertebral artery dissection (cervical arterial stroke). Forty-six percent of the reviews determined that spinal manipulation was safe, while 13% expressed concern of harm. The remaining studies were unclear or neutral. While the researchers did not offer an overall conclusion, they determined spinal manipulation can significantly be helpful, and some risk does exist.\nA Word From Verywell   When chiropractors are correctly trained and licensed, chiropractic care is safe. 
Mild side effects are to be expected and include temporary soreness, stiffness, and tenderness in the treated area. However, you still want to do your research. Ask for a referral from your doctor. Look at the chiropractor?s website, including patient reviews. Meet with the chiropractor to discuss his or her treatment practices and ask about possible adverse effects related to treatment.\n\nIf you decide a chiropractor isn?t for you, consider seeing an osteopathic doctor. Osteopaths are fully licensed doctors who can practice all areas of medicine. They have received special training on the musculoskeletal system, which includes manual readjustments, myofascial release and other physical manipulation of bones and muscle tissues.
## 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       <NA>
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       <NA>
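# Percentage of a document's non-space characters that are punctuation marks.
def count_punct(text):
    count = sum(1 for char in text if char in string.punctuation)
    # Scale to a percentage before rounding to avoid floating-point noise in the result.
    return round(100 * count / (len(text) - text.count(" ")), 1)

# Extra features: document length without spaces, and punctuation share.
data['body_length'] = data['Document'].apply(lambda x: len(x) - x.count(" "))
data['punct%'] = data['Document'].apply(count_punct)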

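def clean_text(text):
    # Drop punctuation character by character and lowercase what remains.
    text = "".join([char.lower() for char in text if char not in string.punctuation])
    # Split on runs of non-word characters (raw string, so '\W' is not read as a string escape).
    tokens = re.split(r'\W+', text)
    # Remove stopwords and stem each token; the result is a list for the count vectorizer.
    text = [ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    # Same cleaning steps as clean_text, but lemmatize instead of stemming.
    text = "".join([char.lower() for char in text if char not in string.punctuation])
    tokens = re.split(r'\W+', text)
    # Remove stopwords and lemmatize each token; the result is a list for the count vectorizer.
    text = [wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text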

data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  [chiropractic, adjustment, treatment, serve, n...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...  [, unitedhealthcare, combat, opioid, crisis, n...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...  [, safety, chiropractic, adjustment, chiroprac...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  [advanced, chiropractic, relief, 8, key, benef...
## 4  Heading to the spa can be a pampering treat, b...  ...  [heading, spa, pampering, treat, also, huge, b...
## 
## [5 rows x 10 columns]
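
Rows 1 and 2 of the Lemmatized column begin with an empty string token. This happens because `re.split` returns an empty string whenever the text starts or ends with a non-word character such as a newline. The sketch below reproduces the effect and shows one way to drop those tokens; the helper name `tokenize` is hypothetical, and stemming, lemmatizing, and stopword removal are left out to keep it free of NLTK data downloads.

```python
import re
import string

def tokenize(text):
    """Lowercase, strip ASCII punctuation, split on non-word runs, drop empty tokens."""
    no_punct = "".join(ch.lower() for ch in text if ch not in string.punctuation)
    return [tok for tok in re.split(r"\W+", no_punct) if tok]

sample = "\nThe Safety of Chiropractic Adjustments\n\n"

# Plain split keeps an empty string at each end of the list
raw_tokens = re.split(r"\W+", sample.lower())
print(raw_tokens)

# Filtering on truthiness removes the empty strings
print(tokenize(sample))
```

Leaving the empty tokens in place is mostly harmless for a vectorizer that receives these lists through a custom analyzer, but it does add a meaningless feature to the vocabulary.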
data.to_csv('dataCleanLemm.csv')
#DATA = pd.read_csv('dataCleanLemm.csv', encoding='unicode_escape')
DATA <- read.csv('dataCleanLemm.csv', sep=',', header=TRUE, na.strings=c('',' ','NA'), row.names=1)
colnames(DATA)
##  [1] "Document"          "Source"            "Topic"            
##  [4] "InternetSearch"    "Contraindications" "RisksSideEffects" 
##  [7] "body_length"       "punct."            "Cleaned_text"     
## [10] "Lemmatized"
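
Writing the token lists to CSV stores each one as its printed text, so Cleaned_text and Lemmatized come back as strings such as "['chiropract', 'adjust', ...]" rather than lists. If the file is reloaded in Python, the lists can be recovered with `ast.literal_eval`. This is a sketch under assumptions: the two-row CSV text below is a hypothetical stand-in for dataCleanLemm.csv, and only the standard library is used.

```python
import ast
import csv
import io

# Hypothetical stand-in for two rows of dataCleanLemm.csv
csv_text = (
    "Topic,Cleaned_text\n"
    "chiropractic benefits,\"['chiropract', 'adjust', 'treatment']\"\n"
    "chiropractic benefits,\"['', 'unitedhealthcar', 'combat']\"\n"
)

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Each cell is a plain string until it is parsed back into a list
token_lists = [ast.literal_eval(row["Cleaned_text"]) for row in rows]

print(type(rows[0]["Cleaned_text"]).__name__)
print(token_lists[0])
```

With pandas the same parsing could be applied to the column after `read_csv`, for example through `apply(ast.literal_eval)`.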
head(DATA,2)
##   Document
## 0 Chiropractic adjustments and treatments serve the needs of millions of people around the world.\n\n \n\nAdjustments offer effective, non-invasive and cost-effective solutions to neck and back pain, as well as a myriad of other medical 
issues.\n\nHave you ever stopped to wonder how many of us suffer from neck and back stiffness or pain?\n\nApart from the obvious discomfort, simple daily tasks such as driving a car, crossing a busy street and picking things up from the floor can become all too challenging for individuals experiencing such pain.\n\nAs anyone who has experienced pain would know, having restricted movement can be debilitating and unfortunately, our busy world doesn?t allow for us to stop.\n\n \nSome of the benefits of long-term chiropractic care include:\n\n    Chiropractors can identify mechanical issues that cause spine-related pain and offer a series of adjustments that provide near immediate relief. Following appointments, patients often report feeling their symptoms noticeably better.\n    When a chiropractor performs an adjustment, they can help restore movement in joints that have ?locked up?. This becomes possible as treatment allows muscles surrounding joints to relax, thereby reducing joint stiffness.\n    Many factors affect health, including exercise patterns, nutrition, sleep, heredity and the environment in which we live. Rather than just treat symptoms of the disease, chiropractic care focuses on a holistic approach to naturally maintain health and resist disease.\n    Chiropractic adjustments help restore normal function and movement to the entire body. Many patients report an improvement in their ability to move with efficiency and strength.\n    Many patients find delight in the results chiropractic adjustments have on old and chronic injuries. Whether an injury is, in fact, new or old, chiropractic care can help reduce pain, restore mobility and provide quick pain relief to all joints in the body. 
Such care can help maintain better overall health and thus faster recovery time.\n\nHave you ever noticed that when you are in pain and unable to perform regular or favorite activities, it can put a strain on emotional and mental well-being?\n\nFor example, the increased stress from not being able to properly perform a paid job. This, in turn, can have a negative impact on physical health with increases in heart rate and blood pressure. The domino effect often continues with sleep becoming disturbed, with resulting lethargy and tiredness during the day. Does anyone really feel up to exercising in this state?\n\nChiropractic care is a natural method of healing the body?s communication system and never relies on the use of pharmaceutical drugs or invasive surgery.\n\n\n\n 
## 1 \nUnitedHealthcare Combats Opioid Crisis with Non-Opioid Benefits\nPhysical therapy and chiropractic care can prevent or reduce expensive, invasive spinal procedures, such as imaging or surgery, to reduce opioid use and cut costs.\nUnitedHealthcare, opioid, physical therapy, healthcare spending\n\n\nOctober 29, 2019 - UnitedHealthcare (UHC) is combatting the opioid epidemic and high healthcare costs with new physical therapy and chiropractic care benefits to prevent, delay, or in some cases substitute for invasive spinal procedures.\n\n?With millions of Americans experiencing low back pain currently or at some point during their lifetimes, we believe this benefit design will help make a meaningful difference by improving health outcomes while reducing costs,? said Anne Docimo, MD, UnitedHealthcare chief medical officer.\n\nLower back pain is in part responsible for sustaining the opioid epidemic and also increases healthcare costs.\nDig Deeper\n\nAlthough opioid overdoses fell by two percent from 2017 to 2018 and a legal battles aim to hold pharmaceutical companies accountable, there is no end in sight for the opioid epidemic. Industry professionals are still grappling with the balance between cutting opioid prescriptions will working to reduce patient pain.\n\nCommon conditions such as low back pain bolster the epidemic?s presence, with clinicians still prescribing the opioids against best practice recommendations. According to a recent OptumLabs study, 9 percent of patients with newly diagnosed low back pain are prescribed opioids and lower back pain currently contributes 52 percent to the overall opioid prescription rate.\n\nIn addition to boosting opioids distribution, alternative, invasive lower back pain treatments can significantly impact healthcare spending.\n\nIt is not new information that physical therapy and chiropractic care are effective, lower cost alternatives to spinal imaging or surgery. 
However, payers are still in the process of adopting the method.\n\nTo counteract the high-cost, high-risk potential of using opioids to treat back pain, UHC created a benefit that does not rely on medication or technology but rather on physical therapy and chiropractic care.\n\nThe benefit allows eligible employers to offer physical therapist and chiropractor visits with no out-of-pocket costs. Members who already receive physical therapist and chiropractic care benefits under UHC?s employer-sponsored health plans and who have maxed out their visits will not receive additional visits under this benefit.\n\nHowever, for those who still have visits to use and who choose physical therapy or chiropractic care over other forms of treatment, the copay or deductible for those visits will be waived and they will receive three visits at no cost.\n\nUHC has high expectations for the fiscal and physical impacts of this benefit.\n\nAccording to UHC?s analysis, the health payer expects that by 2021, opioid use will decrease by 19 percent. Spinal imaging test frequency and spinal surgeries will be reduced by 22 percent and 21 percent, respectively. In addition to these specific goals, UHC hopes to see a decrease in the overall cost of spinal care.\n\nThe same OptumLabs study demonstrated that UHC?s expectations are not without precedent.\n\nThe study looked at the correlation between out-of-pocket costs and patient utilization of noninvasive treatments. Researchers discovered that members whose copay was over $30 were a little under 30 percent less likely to choose physical therapy as opposed to more invasive treatments.\n\nAn American Journal of Managed Care study in June 2019 found that patients with high deductibles, typically over $1,000, were less likely to visit physical therapy.\n\nEligible employers may be brand new or renewing their membership. They must be fully insured and over 51 or more employees strong. 
The benefit is currently available in Connecticut, Florida, Georgia, New York, and North Carolina.\n\nHowever, UHC plans to expand the benefit from 2020 into 2021. By the end of this expansion period, the benefit will also be available to self-funded employers and organizations with an employee population between 2 and 50. The benefit will span ten states, primarily in the southeast.\n\n?This new benefit design may help encourage people with low back pain to get the right care at the right time and in the right setting, helping expand access to evidence-based and more affordable treatments,? said Docimo.
##                                                                                                                                       Source
## 0 https://coremedicalohio.com/benefits-of-long-term-chiropractic-care/?utm_source=ReviveOldPost&utm_medium=social&utm_campaign=ReviveOldPost
## 1                                   https://healthpayerintelligence.com/news/unitedhealthcare-combats-opioid-crisis-with-non-opioid-benefits
##                   Topic InternetSearch
## 0 chiropractic benefits           <NA>
## 1 chiropractic benefits           <NA>
##                                                                                                                                                                                               Contraindications
## 0 Doctors of Chiropractic work collaboratively with other healthcare professionals. Should your condition require the attention of another healthcare profession, that recommendation or referral will be made.
## 1                                                                                                                                                                                                          <NA>
##   RisksSideEffects body_length punct.
## 0             <NA>        2288    2.4
## 1             <NA>        3796    2.4
##   Cleaned_text
## 0 ['chiropract', 'adjust', 'treatment', 'serv', 'need', 'million', 'peopl', 'around', 'world', 'adjust', 'offer', 'effect', 'noninvas', 'costeffect', 'solut', 'neck', 'back', 'pain', 'well', 'myriad', 'medic', 'issu', 'ever', 'stop', 'wonder', 
'mani', 'us', 'suffer', 'neck', 'back', 'stiff', 'pain', 'apart', 'obviou', 'discomfort', 'simpl', 'daili', 'task', 'drive', 'car', 'cross', 'busi', 'street', 'pick', 'thing', 'floor', 'becom', 'challeng', 'individu', 'experienc', 'pain', 'anyon', 'experienc', 'pain', 'would', 'know', 'restrict', 'movement', 'debilit', 'unfortun', 'busi', 'world', 'doesnt', 'allow', 'us', 'stop', 'benefit', 'longterm', 'chiropract', 'care', 'includ', 'chiropractor', 'identifi', 'mechan', 'issu', 'caus', 'spinerel', 'pain', 'offer', 'seri', 'adjust', 'provid', 'near', 'immedi', 'relief', 'follow', 'appoint', 'patient', 'often', 'report', 'feel', 'symptom', 'notic', 'better', 'chiropractor', 'perform', 'adjust', 'help', 'restor', 'movement', 'joint', 'lock', 'becom', 'possibl', 'treatment', 'allow', 'muscl', 'surround', 'joint', 'relax', 'therebi', 'reduc', 'joint', 'stiff', 'mani', 'factor', 'affect', 'health', 'includ', 'exercis', 'pattern', 'nutrit', 'sleep', 'hered', 'environ', 'live', 'rather', 'treat', 'symptom', 'diseas', 'chiropract', 'care', 'focus', 'holist', 'approach', 'natur', 'maintain', 'health', 'resist', 'diseas', 'chiropract', 'adjust', 'help', 'restor', 'normal', 'function', 'movement', 'entir', 'bodi', 'mani', 'patient', 'report', 'improv', 'abil', 'move', 'effici', 'strength', 'mani', 'patient', 'find', 'delight', 'result', 'chiropract', 'adjust', 'old', 'chronic', 'injuri', 'whether', 'injuri', 'fact', 'new', 'old', 'chiropract', 'care', 'help', 'reduc', 'pain', 'restor', 'mobil', 'provid', 'quick', 'pain', 'relief', 'joint', 'bodi', 'care', 'help', 'maintain', 'better', 'overal', 'health', 'thu', 'faster', 'recoveri', 'time', 'ever', 'notic', 'pain', 'unabl', 'perform', 'regular', 'favorit', 'activ', 'put', 'strain', 'emot', 'mental', 'wellb', 'exampl', 'increas', 'stress', 'abl', 'properli', 'perform', 'paid', 'job', 'turn', 'neg', 'impact', 'physic', 'health', 'increas', 'heart', 'rate', 'blood', 'pressur', 'domino', 'effect', 'often', 'continu', 'sleep', 
'becom', 'disturb', 'result', 'lethargi', 'tired', 'day', 'anyon', 'realli', 'feel', 'exercis', 'state', 'chiropract', 'care', 'natur', 'method', 'heal', 'bodi', 'commun', 'system', 'never', 'reli', 'use', 'pharmaceut', 'drug', 'invas', 'surgeri', '']
## 1 ['', 'unitedhealthcar', 'combat', 'opioid', 'crisi', 'nonopioid', 'benefit', 'physic', 'therapi', 'chiropract', 'care', 'prevent', 'reduc', 'expens', 'invas', 'spinal', 'procedur', 'imag', 'surgeri', 'reduc', 'opioid', 'use', 'cut', 'cost', 'unitedhealthcar', 'opioid', 'physic', 'therapi', 'healthcar', 'spend', 'octob', '29', '2019', 'unitedhealthcar', 'uhc', 'combat', 'opioid', 'epidem', 'high', 'healthcar', 'cost', 'new', 'physic', 'therapi', 'chiropract', 'care', 'benefit', 'prevent', 'delay', 'case', 'substitut', 'invas', 'spinal', 'procedur', 'million', 'american', 'experienc', 'low', 'back', 'pain', 'current', 'point', 'lifetim', 'believ', 'benefit', 'design', 'help', 'make', 'meaning', 'differ', 'improv', 'health', 'outcom', 'reduc', 'cost', 'said', 'ann', 'docimo', 'md', 'unitedhealthcar', 'chief', 'medic', 'offic', 'lower', 'back', 'pain', 'part', 'respons', 'sustain', 'opioid', 'epidem', 'also', 'increas', 'healthcar', 'cost', 'dig', 'deeper', 'although', 'opioid', 'overdos', 'fell', 'two', 'percent', '2017', '2018', 'legal', 'battl', 'aim', 'hold', 'pharmaceut', 'compani', 'account', 'end', 'sight', 'opioid', 'epidem', 'industri', 'profession', 'still', 'grappl', 'balanc', 'cut', 'opioid', 'prescript', 'work', 'reduc', 'patient', 'pain', 'common', 'condit', 'low', 'back', 'pain', 'bolster', 'epidem', 'presenc', 'clinician', 'still', 'prescrib', 'opioid', 'best', 'practic', 'recommend', 'accord', 'recent', 'optumlab', 'studi', '9', 'percent', 'patient', 'newli', 'diagnos', 'low', 'back', 'pain', 'prescrib', 'opioid', 'lower', 'back', 'pain', 'current', 'contribut', '52', 'percent', 'overal', 'opioid', 'prescript', 'rate', 'addit', 'boost', 'opioid', 'distribut', 'altern', 'invas', 'lower', 'back', 'pain', 'treatment', 'significantli', 'impact', 'healthcar', 'spend', 'new', 'inform', 'physic', 'therapi', 'chiropract', 'care', 'effect', 'lower', 'cost', 'altern', 'spinal', 'imag', 'surgeri', 'howev', 'payer', 'still', 'process', 'adopt', 'method', 
'counteract', 'highcost', 'highrisk', 'potenti', 'use', 'opioid', 'treat', 'back', 'pain', 'uhc', 'creat', 'benefit', 'reli', 'medic', 'technolog', 'rather', 'physic', 'therapi', 'chiropract', 'care', 'benefit', 'allow', 'elig', 'employ', 'offer', 'physic', 'therapist', 'chiropractor', 'visit', 'outofpocket', 'cost', 'member', 'alreadi', 'receiv', 'physic', 'therapist', 'chiropract', 'care', 'benefit', 'uhc', 'employersponsor', 'health', 'plan', 'max', 'visit', 'receiv', 'addit', 'visit', 'benefit', 'howev', 'still', 'visit', 'use', 'choos', 'physic', 'therapi', 'chiropract', 'care', 'form', 'treatment', 'copay', 'deduct', 'visit', 'waiv', 'receiv', 'three', 'visit', 'cost', 'uhc', 'high', 'expect', 'fiscal', 'physic', 'impact', 'benefit', 'accord', 'uhc', 'analysi', 'health', 'payer', 'expect', '2021', 'opioid', 'use', 'decreas', '19', 'percent', 'spinal', 'imag', 'test', 'frequenc', 'spinal', 'surgeri', 'reduc', '22', 'percent', '21', 'percent', 'respect', 'addit', 'specif', 'goal', 'uhc', 'hope', 'see', 'decreas', 'overal', 'cost', 'spinal', 'care', 'optumlab', 'studi', 'demonstr', 'uhc', 'expect', 'without', 'preced', 'studi', 'look', 'correl', 'outofpocket', 'cost', 'patient', 'util', 'noninvas', 'treatment', 'research', 'discov', 'member', 'whose', 'copay', '30', 'littl', '30', 'percent', 'less', 'like', 'choos', 'physic', 'therapi', 'oppos', 'invas', 'treatment', 'american', 'journal', 'manag', 'care', 'studi', 'june', '2019', 'found', 'patient', 'high', 'deduct', 'typic', '1000', 'less', 'like', 'visit', 'physic', 'therapi', 'elig', 'employ', 'may', 'brand', 'new', 'renew', 'membership', 'must', 'fulli', 'insur', '51', 'employe', 'strong', 'benefit', 'current', 'avail', 'connecticut', 'florida', 'georgia', 'new', 'york', 'north', 'carolina', 'howev', 'uhc', 'plan', 'expand', 'benefit', '2020', '2021', 'end', 'expans', 'period', 'benefit', 'also', 'avail', 'selffund', 'employ', 'organ', 'employe', 'popul', '2', '50', 'benefit', 'span', 'ten', 'state', 
'primarili', 'southeast', 'new', 'benefit', 'design', 'may', 'help', 'encourag', 'peopl', 'low', 'back', 'pain', 'get', 'right', 'care', 'right', 'time', 'right', 'set', 'help', 'expand', 'access', 'evidencebas', 'afford', 'treatment', 'said', 'docimo']
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
##    Lemmatized
## 0  ['chiropractic', 'adjustment', 'treatment', 'serve', 'need', 'million', 'people', 'around', 'world', 
'adjustment', 'offer', 'effective', 'noninvasive', 'costeffective', 'solution', 'neck', 'back', 'pain', 'well', 'myriad', 'medical', 'issue', 'ever', 'stopped', 'wonder', 'many', 'u', 'suffer', 'neck', 'back', 'stiffness', 'pain', 'apart', 'obvious', 'discomfort', 'simple', 'daily', 'task', 'driving', 'car', 'crossing', 'busy', 'street', 'picking', 'thing', 'floor', 'become', 'challenging', 'individual', 'experiencing', 'pain', 'anyone', 'experienced', 'pain', 'would', 'know', 'restricted', 'movement', 'debilitating', 'unfortunately', 'busy', 'world', 'doesnt', 'allow', 'u', 'stop', 'benefit', 'longterm', 'chiropractic', 'care', 'include', 'chiropractor', 'identify', 'mechanical', 'issue', 'cause', 'spinerelated', 'pain', 'offer', 'series', 'adjustment', 'provide', 'near', 'immediate', 'relief', 'following', 'appointment', 'patient', 'often', 'report', 'feeling', 'symptom', 'noticeably', 'better', 'chiropractor', 'performs', 'adjustment', 'help', 'restore', 'movement', 'joint', 'locked', 'becomes', 'possible', 'treatment', 'allows', 'muscle', 'surrounding', 'joint', 'relax', 'thereby', 'reducing', 'joint', 'stiffness', 'many', 'factor', 'affect', 'health', 'including', 'exercise', 'pattern', 'nutrition', 'sleep', 'heredity', 'environment', 'live', 'rather', 'treat', 'symptom', 'disease', 'chiropractic', 'care', 'focus', 'holistic', 'approach', 'naturally', 'maintain', 'health', 'resist', 'disease', 'chiropractic', 'adjustment', 'help', 'restore', 'normal', 'function', 'movement', 'entire', 'body', 'many', 'patient', 'report', 'improvement', 'ability', 'move', 'efficiency', 'strength', 'many', 'patient', 'find', 'delight', 'result', 'chiropractic', 'adjustment', 'old', 'chronic', 'injury', 'whether', 'injury', 'fact', 'new', 'old', 'chiropractic', 'care', 'help', 'reduce', 'pain', 'restore', 'mobility', 'provide', 'quick', 'pain', 'relief', 'joint', 'body', 'care', 'help', 'maintain', 'better', 'overall', 'health', 'thus', 'faster', 'recovery', 'time', 'ever', 
'noticed', 'pain', 'unable', 'perform', 'regular', 'favorite', 'activity', 'put', 'strain', 'emotional', 'mental', 'wellbeing', 'example', 'increased', 'stress', 'able', 'properly', 'perform', 'paid', 'job', 'turn', 'negative', 'impact', 'physical', 'health', 'increase', 'heart', 'rate', 'blood', 'pressure', 'domino', 'effect', 'often', 'continues', 'sleep', 'becoming', 'disturbed', 'resulting', 'lethargy', 'tiredness', 'day', 'anyone', 'really', 'feel', 'exercising', 'state', 'chiropractic', 'care', 'natural', 'method', 'healing', 'body', 'communication', 'system', 'never', 'relies', 'use', 'pharmaceutical', 'drug', 'invasive', 'surgery', '']
## 1 ['', 'unitedhealthcare', 'combat', 'opioid', 'crisis', 'nonopioid', 'benefit', 'physical', 'therapy', 'chiropractic', 'care', 'prevent', 'reduce', 'expensive', 'invasive', 'spinal', 'procedure', 'imaging', 'surgery', 'reduce', 'opioid', 'use', 'cut', 'cost', 'unitedhealthcare', 'opioid', 'physical', 'therapy', 'healthcare', 'spending', 'october', '29', '2019', 'unitedhealthcare', 'uhc', 'combatting', 'opioid', 'epidemic', 'high', 'healthcare', 'cost', 'new', 'physical', 'therapy', 'chiropractic', 'care', 'benefit', 'prevent', 'delay', 'case', 'substitute', 'invasive', 'spinal', 'procedure', 'million', 'american', 'experiencing', 'low', 'back', 'pain', 'currently', 'point', 'lifetime', 'believe', 'benefit', 'design', 'help', 'make', 'meaningful', 'difference', 'improving', 'health', 'outcome', 'reducing', 'cost', 'said', 'anne', 'docimo', 'md', 'unitedhealthcare', 'chief', 'medical', 'officer', 'lower', 'back', 'pain', 'part', 'responsible', 'sustaining', 'opioid', 'epidemic', 'also', 'increase', 'healthcare', 'cost', 'dig', 'deeper', 'although', 'opioid', 'overdoses', 'fell', 'two', 'percent', '2017', '2018', 'legal', 'battle', 'aim', 'hold', 'pharmaceutical', 'company', 'accountable', 'end', 'sight', 'opioid', 'epidemic', 'industry', 'professional', 'still', 'grappling', 'balance', 'cutting', 'opioid', 'prescription', 'working', 'reduce', 'patient', 'pain', 'common', 'condition', 'low', 'back', 'pain', 'bolster', 'epidemic', 'presence', 'clinician', 'still', 'prescribing', 'opioids', 'best', 'practice', 'recommendation', 'according', 'recent', 'optumlabs', 'study', '9', 'percent', 'patient', 'newly', 'diagnosed', 'low', 'back', 'pain', 'prescribed', 'opioids', 'lower', 'back', 'pain', 'currently', 'contributes', '52', 'percent', 'overall', 'opioid', 'prescription', 'rate', 'addition', 'boosting', 'opioids', 'distribution', 'alternative', 'invasive', 'lower', 'back', 'pain', 'treatment', 'significantly', 'impact', 'healthcare', 'spending', 'new', 
'information', 'physical', 'therapy', 'chiropractic', 'care', 'effective', 'lower', 'cost', 'alternative', 'spinal', 'imaging', 'surgery', 'however', 'payer', 'still', 'process', 'adopting', 'method', 'counteract', 'highcost', 'highrisk', 'potential', 'using', 'opioids', 'treat', 'back', 'pain', 'uhc', 'created', 'benefit', 'rely', 'medication', 'technology', 'rather', 'physical', 'therapy', 'chiropractic', 'care', 'benefit', 'allows', 'eligible', 'employer', 'offer', 'physical', 'therapist', 'chiropractor', 'visit', 'outofpocket', 'cost', 'member', 'already', 'receive', 'physical', 'therapist', 'chiropractic', 'care', 'benefit', 'uhcs', 'employersponsored', 'health', 'plan', 'maxed', 'visit', 'receive', 'additional', 'visit', 'benefit', 'however', 'still', 'visit', 'use', 'choose', 'physical', 'therapy', 'chiropractic', 'care', 'form', 'treatment', 'copay', 'deductible', 'visit', 'waived', 'receive', 'three', 'visit', 'cost', 'uhc', 'high', 'expectation', 'fiscal', 'physical', 'impact', 'benefit', 'according', 'uhcs', 'analysis', 'health', 'payer', 'expects', '2021', 'opioid', 'use', 'decrease', '19', 'percent', 'spinal', 'imaging', 'test', 'frequency', 'spinal', 'surgery', 'reduced', '22', 'percent', '21', 'percent', 'respectively', 'addition', 'specific', 'goal', 'uhc', 'hope', 'see', 'decrease', 'overall', 'cost', 'spinal', 'care', 'optumlabs', 'study', 'demonstrated', 'uhcs', 'expectation', 'without', 'precedent', 'study', 'looked', 'correlation', 'outofpocket', 'cost', 'patient', 'utilization', 'noninvasive', 'treatment', 'researcher', 'discovered', 'member', 'whose', 'copay', '30', 'little', '30', 'percent', 'le', 'likely', 'choose', 'physical', 'therapy', 'opposed', 'invasive', 'treatment', 'american', 'journal', 'managed', 'care', 'study', 'june', '2019', 'found', 'patient', 'high', 'deductible', 'typically', '1000', 'le', 'likely', 'visit', 'physical', 'therapy', 'eligible', 'employer', 'may', 'brand', 'new', 'renewing', 'membership', 'must', 'fully', 
'insured', '51', 'employee', 'strong', 'benefit', 'currently', 'available', 'connecticut', 'florida', 'georgia', 'new', 'york', 'north', 'carolina', 'however', 'uhc', 'plan', 'expand', 'benefit', '2020', '2021', 'end', 'expansion', 'period', 'benefit', 'also', 'available', 'selffunded', 'employer', 'organization', 'employee', 'population', '2', '50', 'benefit', 'span', 'ten', 'state', 'primarily', 'southeast', 'new', 'benefit', 'design', 'may', 'help', 'encourage', 'people', 'low', 'back', 'pain', 'get', 'right', 'care', 'right', 'time', 'right', 'setting', 'helping', 'expand', 'access', 'evidencebased', 'affordable', 'treatment', 'said', 'docimo']
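from sklearn.model_selection import train_test_split

# Hold out 15% of the documents for testing. No random_state is passed, so the
# split depends on the state of NumPy's global random generator at this point.
X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.15)
from sklearn.feature_extraction.text import CountVectorizer
# lemmatize is the analyzer, so the vectorizer is fit on the raw 'Document' text.
count_vect=CountVectorizer(analyzer=lemmatize)
# Learn the vocabulary from the training documents only.
count_vect_fit=count_vect.fit(X_train['Document'])

count_train=count_vect_fit.transform(X_train['Document'])
count_test=count_vect_fit.transform(X_test['Document'])
len(count_vect_fit.get_feature_names())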
## 4188
count_vect_fit.get_feature_names()[200:350]
## ['agency', 'agephysical', 'agerelated', 'aggression', 'aggressive', 'aggressively', 'aging', 'ago', 'agree', 'ahead', 'aid', 'aiding', 'ailment', 'aim', 'aimed', 'air', 'airport', 'alan', 'alarm', 'alchemist', 'alcohol', 'alcoholsoaked', 'alert', 'align', 'alignment', 'allergic', 'allergy', 'alleviate', 'alleviated', 'alleviates', 'alleviating', 'alleviation', 'alliance', 'allison', 'allow', 'allowing', 'allows', 'allround', 'almost', 'alone', 'along', 'alongside', 'already', 'alright', 'also', 'alter', 'altered', 'alternate', 'alternating', 'alternatingly', 'alternative', 'although', 'altogether', 'always', 'alzheimers', 'amateur', 'amazing', 'ambulance', 'ameliorating', 'america', 'american', 'among', 'amount', 'amplitude', 'amyclarklymphaticdrainagemassage', 'amyclarklymphaticdrainagemassage1', 'analgesic', 'analyzed', 'anatomy', 'ancient', 'andor', 'anecdotal', 'anemia', 'anesthesia', 'anesthetic', 'anger', 'angions', 'angry', 'anhedonia', 'animal', 'aniston', 'ankle', 'ann', 'annals', 'annually', 'anosognosia', 'another', 'answer', 'answered', 'antibiotic', 'antibody', 'antidepressant', 'antiinflammatory', 'antiviral', 'anxiety', 'anxietydepression', 'anxietyfree', 'anxious', 'anyone', 'anything', 'anywhere', 'apart', 'appearance', 'appeared', 'appears', 'appendix', 'appetite', 'appliance', 'applicable', 'application', 'applied', 'apply', 'applying', 'appointment', 'approach', 'approached', 'appropriate', 'approved', 'approximately', 'apta', 'area', 'areaswere', 'arent', 'argue', 'arise', 'arising', 'arizona', 'arm', 'armpit', 'around', 'arquette', 'array', 'arrive', 'art', 'artery', 'arthritic', 'arthritis', 'article', 'ascertain', 'ashi', 'aside', 'ask', 'asked', 'asking', 'asleep', 'aspect', 'ass', 'asserting', 'assessing', 'assessment']
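The count vectorization step above can be pictured with a small pure-Python sketch. This is not scikit-learn's implementation. The toy documents and the `simple_analyzer`, `fit_vocabulary`, and `transform` helpers are made up for illustration, and the analyzer skips stop-word removal and lemmatizing. It shows how the fitted vocabulary becomes a sorted list of column names and how each document becomes one row of counts.

```python
import re
import string
from collections import Counter

def simple_analyzer(text):
    # Hypothetical stand-in for lemmatize(): lower-case, strip punctuation,
    # split on non-word characters, and drop empty tokens.
    text = "".join(ch.lower() for ch in text if ch not in string.punctuation)
    return [tok for tok in re.split(r"\W+", text) if tok]

def fit_vocabulary(docs):
    # Sorted unique tokens, each mapped to a column index.
    vocab = sorted({tok for doc in docs for tok in simple_analyzer(doc)})
    return {tok: i for i, tok in enumerate(vocab)}

def transform(docs, vocab):
    rows = []
    for doc in docs:
        counts = Counter(simple_analyzer(doc))
        # Tokens that were not seen while fitting have no column and are ignored.
        rows.append([counts.get(tok, 0) for tok in vocab])
    return rows

train_docs = ["Massage eases back pain.", "Back pain? See a chiropractor!"]
vocab = fit_vocabulary(train_docs)
matrix = transform(train_docs, vocab)
new_row = transform(["I need a massage!"], vocab)[0]

print(list(vocab))
print(matrix)
print(new_row)
```

In the short user input, only the words that already exist in the training vocabulary contribute to the row; everything else is dropped.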
count_train_vect=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(count_train.toarray())],axis=1)

count_test_vect=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(count_test.toarray())],axis=1)
count_train_vect.head()
##                                             Document  body_length  ...  4186 4187
## 0  \nFive Warning Signs of Mental Illness\n\n\nIt...         6759  ...     0    0
## 1  Lymphatic drainage\nLymphatic drainage is a th...         2631  ...     0    0
## 2  7 Benefits of Massage Therapy\r\n\r\nMassage t...         5948  ...     0    0
## 3  BENEFITS OF MASSAGE\r\n\r\nYou know that post-...          320  ...     0    0
## 4  \nGetting Started with Cold Stone Massage Ther...         1868  ...     0    0
## 
## [5 rows x 4193 columns]
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(count_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(count_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                           Predicted                            Topic
## 26  mental health services benefits  mental health services benefits
## 30                 massage benefits            chiropractic benefits
## 77                               ER                               ER
## 72       Lymphatic Drainage Massage             dry brushing massage
## 13       Lymphatic Drainage Massage                 massage benefits
## 52                 massage benefits              cold stone benefits
## 68       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 71       Lymphatic Drainage Massage             dry brushing massage
## 15                 massage benefits                 massage benefits
## 57              cold stone benefits              cold stone benefits
## 1                  massage benefits            chiropractic benefits
## 62       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 33            chiropractic benefits            chiropractic benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.5384615384615384
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 0 0 0]
##  [0 2 0 0 0 0 0]
##  [0 0 1 0 0 2 0]
##  [0 0 0 1 0 1 0]
##  [0 2 0 0 0 0 0]
##  [0 1 0 0 0 1 0]
##  [0 0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##                              ER       1.00      1.00      1.00         1
##      Lymphatic Drainage Massage       0.40      1.00      0.57         2
##           chiropractic benefits       1.00      0.33      0.50         3
##             cold stone benefits       1.00      0.50      0.67         2
##            dry brushing massage       0.00      0.00      0.00         2
##                massage benefits       0.25      0.50      0.33         2
## mental health services benefits       1.00      1.00      1.00         1
## 
##                        accuracy                           0.54        13
##                       macro avg       0.66      0.62      0.58        13
##                    weighted avg       0.64      0.54      0.51        13
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)
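The numbers in the report above can be checked by hand from the confusion matrix. The sketch below copies the matrix and the label order printed above (rows are the expected class, columns the predicted class) and recomputes accuracy, precision, and recall in plain Python. Setting precision to 0.0 for a class that was never predicted mirrors what the warning describes for dry brushing massage.

```python
labels = [
    "ER",
    "Lymphatic Drainage Massage",
    "chiropractic benefits",
    "cold stone benefits",
    "dry brushing massage",
    "massage benefits",
    "mental health services benefits",
]

# Confusion matrix from the count-vectorized random forest run.
cm = [
    [1, 0, 0, 0, 0, 0, 0],
    [0, 2, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 2, 0],
    [0, 0, 0, 1, 0, 1, 0],
    [0, 2, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1],
]

n = len(labels)
total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(n)) / total

precision, recall = {}, {}
for i, label in enumerate(labels):
    predicted = sum(cm[r][i] for r in range(n))  # column sum: times this class was predicted
    actual = sum(cm[i])                          # row sum: true documents of this class
    precision[label] = cm[i][i] / predicted if predicted else 0.0
    recall[label] = cm[i][i] / actual if actual else 0.0

print("accuracy", round(accuracy, 4))
for label in labels:
    print(label, round(precision[label], 2), round(recall[label], 2))
```

Lymphatic Drainage Massage was predicted five times but was correct only twice, which is where its precision of 0.40 comes from.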
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(count_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(count_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                           Predicted                            Topic
## 26  mental health services benefits  mental health services benefits
## 30            chiropractic benefits            chiropractic benefits
## 77                               ER                               ER
## 72             dry brushing massage             dry brushing massage
## 13                 massage benefits                 massage benefits
## 52              cold stone benefits              cold stone benefits
## 68       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 71             dry brushing massage             dry brushing massage
## 15                 massage benefits                 massage benefits
## 57              cold stone benefits              cold stone benefits
## 1             chiropractic benefits            chiropractic benefits
## 62       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 33            chiropractic benefits            chiropractic benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 1.0
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 0 0 0]
##  [0 2 0 0 0 0 0]
##  [0 0 3 0 0 0 0]
##  [0 0 0 2 0 0 0]
##  [0 0 0 0 2 0 0]
##  [0 0 0 0 0 2 0]
##  [0 0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##                              ER       1.00      1.00      1.00         1
##      Lymphatic Drainage Massage       1.00      1.00      1.00         2
##           chiropractic benefits       1.00      1.00      1.00         3
##             cold stone benefits       1.00      1.00      1.00         2
##            dry brushing massage       1.00      1.00      1.00         2
##                massage benefits       1.00      1.00      1.00         2
## mental health services benefits       1.00      1.00      1.00         1
## 
##                        accuracy                           1.00        13
##                       macro avg       1.00      1.00      1.00        13
##                    weighted avg       1.00      1.00      1.00        13
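Because the test set holds only 13 documents, a single accuracy figure carries a lot of uncertainty. One rough way to see this, added here as an illustration and not part of the original analysis, is a Wilson score interval for a proportion. The `wilson_interval` helper is a hypothetical name, and `z=1.96` assumes a 95% interval.

```python
import math

def wilson_interval(correct, n, z=1.96):
    # Wilson score interval for a proportion of `correct` successes out of `n`.
    p = correct / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)

# Correct predictions out of 13 test documents in the two count-vectorized runs.
for name, correct in [("count RFC", 7), ("count GBC", 13)]:
    low, high = wilson_interval(correct, 13)
    print(name, round(low, 2), round(high, 2))
```

Even with 13 of 13 correct, the lower end of the interval sits near 0.77, so a perfect score on this split is weaker evidence than it first looks.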

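def clean_text(text):
    # Drop punctuation character by character and lower-case the rest.
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Split on runs of non-word characters (raw string avoids an invalid-escape warning).
    tokens=re.split(r'\W+', text)
    # Stem each token and drop English stopwords.
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    # Same steps as clean_text, but tokens are lemmatized instead of stemmed.
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text   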

def predict_countRFC_lemmatized(new_review): 
    # One-row DataFrame holding the raw user text.
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    count_vect=CountVectorizer(analyzer=lemmatize)
    
    # The vectorizer and the forest are refit on the training split at every call.
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
    # Pass the raw text: the analyzer lemmatizes it. An already tokenized list
    # would be joined by the analyzer into a single word missing from the vocabulary.
    count_test=count_vect_fit.transform(nr['newReview'])
    
    model = rf.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_count_RFC_85-15:']
    print('\n\n',pred)
    

def predict_countRFC_cleaned(new_review): 
    # One-row DataFrame holding the raw user text.
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    count_vect=CountVectorizer(analyzer=clean_text)
    
    # The vectorizer and the forest are refit on the training split at every call.
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
    # Pass the raw text: the analyzer cleans and stems it.
    count_test=count_vect_fit.transform(nr['newReview'])
    
    model = rf.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_count_RFC_85-15:']
    print('\n\n',pred)
    
    
predict_countRFC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_count_RFC_85-15:               massage benefits
predict_countRFC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_count_RFC_85-15:               massage benefits
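A callable `analyzer` is applied to each element it is handed, so those elements should be raw strings. The sketch below uses a simplified stand-in for `clean_text` and `lemmatize` (the name `toy_analyzer` is hypothetical, and it does no stemming, lemmatizing, or stop-word removal). It shows what happens when an already tokenized list is run through the same kind of function a second time: the tokens are joined with no separator and come back as one fused word.

```python
import re
import string

def toy_analyzer(text):
    # Same shape as the analyzers in this script: iterate over the input,
    # drop punctuation, join with no separator, split on non-word characters.
    joined = "".join(item.lower() for item in text if item not in string.punctuation)
    return [tok for tok in re.split(r"\W+", joined) if tok]

raw = "I need a massage!"
tokens_once = toy_analyzer(raw)            # analyzer applied to the raw string
tokens_twice = toy_analyzer(tokens_once)   # analyzer applied to a token list

print(tokens_once)
print(tokens_twice)
```

A fused token would not match any column in the fitted vocabulary, so the resulting row would be all zeros and the prediction would not depend on the user's words.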

def predict_countGBC_lemmatized(new_review): 
    # One-row DataFrame holding the raw user text.
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    count_vect=CountVectorizer(analyzer=lemmatize)
    
    # The vectorizer and the booster are refit on the training split at every call.
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
    # Pass the raw text: the analyzer lemmatizes it. An already tokenized list
    # would be joined by the analyzer into a single word missing from the vocabulary.
    count_test=count_vect_fit.transform(nr['newReview'])
    
    model = gb.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_count_GBC_85-15:']
    print('\n\n',pred)
    

def predict_countGBC_cleaned(new_review): 
    # One-row DataFrame holding the raw user text.
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    count_vect=CountVectorizer(analyzer=clean_text)
    
    # The vectorizer and the booster are refit on the training split at every call.
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
    # Pass the raw text: the analyzer cleans and stems it.
    count_test=count_vect_fit.transform(nr['newReview'])
    
    model = gb.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_count_GBC_85-15:']
    print('\n\n',pred)
    
    
predict_countGBC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_count_GBC_85-15:               massage benefits
predict_countGBC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_count_GBC_85-15:               massage benefits

TF-IDF RFC and GBC


stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
def count_punct(text):
    count=sum([1 for char in text if char in string.punctuation])
    return round(count/(len(text)-text.count(" ")),3)*100
data['body_length']=data['Document'].apply(lambda x: len(x)-x.count(" "))
data['punct%']= data['Document'].apply(lambda x: count_punct(x))

def clean_text(text):
    # Drop punctuation character by character and lower-case the rest.
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[ps.stem(word) for word in tokens if word not in stopwords]  # list of stemmed tokens for the vectorizer
    return text

def lemmatize(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]  # list of lemmatized tokens for the vectorizer
    return text        

data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  [chiropractic, adjustment, treatment, serve, n...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...  [, unitedhealthcare, combat, opioid, crisis, n...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...  [, safety, chiropractic, adjustment, chiroprac...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  [advanced, chiropractic, relief, 8, key, benef...
## 4  Heading to the spa can be a pampering treat, b...  ...  [heading, spa, pampering, treat, also, huge, b...
## 
## [5 rows x 10 columns]

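from sklearn.model_selection import train_test_split

# A new random split is drawn here, so the 13 test documents differ from those in
# the count-vectorizer run above and the two sets of scores are not directly comparable.
X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.15)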

from sklearn.feature_extraction.text import TfidfVectorizer

tfidf_vect=TfidfVectorizer(analyzer=lemmatize)
tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])

tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
tfidf_test=tfidf_vect_fit.transform(X_test['Document'])
len(tfidf_vect_fit.get_feature_names())
## 4385
tfidf_vect_fit.get_feature_names()[200:350]
## ['affect', 'affected', 'affecting', 'affiliate', 'affirm', 'afford', 'affordable', 'afraid', 'afterhours', 'afternoon', 'afterward', 'afterwards', 'age', 'agency', 'agephysical', 'agerelated', 'aggression', 'aggressive', 'aggressively', 'aging', 'ago', 'agree', 'ahead', 'aid', 'aiding', 'ailment', 'aim', 'aimed', 'air', 'airport', 'aka', 'alan', 'alarm', 'alchemist', 'alcohol', 'alert', 'alignment', 'alike', 'alkabath', 'alkalizing', 'allergic', 'allergy', 'alleviate', 'alleviated', 'alleviates', 'alleviating', 'alleviation', 'alliance', 'allison', 'allow', 'allowing', 'allows', 'alloy', 'allround', 'almost', 'alone', 'along', 'alongside', 'already', 'also', 'alter', 'altered', 'alternate', 'alternating', 'alternatingly', 'alternative', 'alters', 'although', 'altogether', 'always', 'alzheimers', 'amateur', 'amazing', 'amazon', 'ambulance', 'ameliorating', 'america', 'american', 'among', 'amount', 'amplitude', 'amyclarklymphaticdrainagemassage', 'amyclarklymphaticdrainagemassage1', 'analgesic', 'analysis', 'analyzed', 'anatomy', 'ancient', 'andor', 'anecdotal', 'anemia', 'anesthesia', 'anesthetic', 'anger', 'angions', 'anhedonia', 'animal', 'ankle', 'ann', 'annals', 'anne', 'announcing', 'annually', 'anosognosia', 'another', 'answer', 'answered', 'antibiotic', 'antibody', 'anticipate', 'antidepressant', 'antiinflammatory', 'antiviral', 'anxiety', 'anxietydepression', 'anxietyfree', 'anyone', 'anything', 'anywhere', 'apart', 'appealing', 'appearance', 'appeared', 'appears', 'appendix', 'appetite', 'apple', 'appliance', 'applicable', 'application', 'applied', 'apply', 'applying', 'appointment', 'approach', 'approached', 'appropriate', 'approved', 'approximately', 'apta', 'area', 'arent', 'argue', 'arise', 'arises', 'arising', 'arizona', 'arm', 'armpit', 'aromatherapy']
tfidf_train_vect=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(tfidf_train.toarray())],axis=1)

tfidf_test_vect=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(tfidf_test.toarray())],axis=1)
tfidf_train_vect.head()
##                                             Document  body_length  ...  4383 4384
## 0  The Role of Physical Therapy\r\n\r\nThe role o...         4704  ...   0.0  0.0
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...         3796  ...   0.0  0.0
## 2  Massage Guns Worth It? Here's What An Expert S...         1596  ...   0.0  0.0
## 3  The Benefits of Physical Therapy\n\n\nWhen peo...         3221  ...   0.0  0.0
## 4  What is a lymphatic drainage massage or detox ...         3554  ...   0.0  0.0
## 
## [5 rows x 4390 columns]
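To make the TF-IDF weights in these columns concrete, here is a small hand computation. It assumes what I understand to be the default `TfidfVectorizer` settings: raw term counts, a smoothed inverse document frequency of `ln((1 + n) / (1 + df)) + 1`, and rows scaled to unit length. The toy token lists and the `tfidf_row` helper are invented for illustration.

```python
import math
from collections import Counter

# Toy corpus, already tokenized (hypothetical documents).
docs = [
    ["massage", "pain", "relief"],
    ["massage", "massage", "stress"],
    ["chiropractic", "pain"],
]

n_docs = len(docs)
vocab = sorted({tok for doc in docs for tok in doc})

# Document frequency: number of documents containing each token.
df = {tok: sum(tok in doc for doc in docs) for tok in vocab}
# Smoothed inverse document frequency (assumed default formula).
idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1 for tok in vocab}

def tfidf_row(doc):
    counts = Counter(doc)
    raw = [counts[tok] * idf[tok] for tok in vocab]
    norm = math.sqrt(sum(v * v for v in raw))
    # Scale the row to unit Euclidean length.
    return [v / norm if norm else 0.0 for v in raw]

matrix = [tfidf_row(doc) for doc in docs]

print(vocab)
for row in matrix:
    print([round(v, 3) for v in row])
```

A word that appears in fewer documents, such as `relief` here, receives a larger weight than a word of equal count that appears in more documents, such as `pain`. This is why the TF-IDF columns hold fractional values where the count matrix holds integers.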
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
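# Seeds NumPy's global random generator for the models fit below. The train/test
# split was already drawn above, so this seed does not make the split repeatable.
np.random.seed(45678)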
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(tfidf_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(tfidf_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                       Topic
## 65  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 33       chiropractic benefits       chiropractic benefits
## 69  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 41            cupping benefits            cupping benefits
## 38            cupping benefits            cupping benefits
## 39            cupping benefits            cupping benefits
## 50        massage gun benefits        massage gun benefits
## 13  Lymphatic Drainage Massage            massage benefits
## 45        massage gun benefits        massage gun benefits
## 34            cupping benefits            cupping benefits
## 60         cold stone benefits         cold stone benefits
## 7             massage benefits            massage benefits
## 10            massage benefits            massage benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.9230769230769231
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[2 0 0 0 0 0]
##  [0 1 0 0 0 0]
##  [0 0 1 0 0 0]
##  [0 0 0 4 0 0]
##  [1 0 0 0 2 0]
##  [0 0 0 0 0 2]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
## Lymphatic Drainage Massage       0.67      1.00      0.80         2
##      chiropractic benefits       1.00      1.00      1.00         1
##        cold stone benefits       1.00      1.00      1.00         1
##           cupping benefits       1.00      1.00      1.00         4
##           massage benefits       1.00      0.67      0.80         3
##       massage gun benefits       1.00      1.00      1.00         2
## 
##                   accuracy                           0.92        13
##                  macro avg       0.94      0.94      0.93        13
##               weighted avg       0.95      0.92      0.92        13
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(tfidf_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(tfidf_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                       Topic
## 65  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 33       chiropractic benefits       chiropractic benefits
## 69  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 41            cupping benefits            cupping benefits
## 38            cupping benefits            cupping benefits
## 39            cupping benefits            cupping benefits
## 50        massage gun benefits        massage gun benefits
## 13  Lymphatic Drainage Massage            massage benefits
## 45        massage gun benefits        massage gun benefits
## 34            cupping benefits            cupping benefits
## 60         cold stone benefits         cold stone benefits
## 7             massage benefits            massage benefits
## 10            massage benefits            massage benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.9230769230769231
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[2 0 0 0 0 0]
##  [0 1 0 0 0 0]
##  [0 0 1 0 0 0]
##  [0 0 0 4 0 0]
##  [1 0 0 0 2 0]
##  [0 0 0 0 0 2]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
## Lymphatic Drainage Massage       0.67      1.00      0.80         2
##      chiropractic benefits       1.00      1.00      1.00         1
##        cold stone benefits       1.00      1.00      1.00         1
##           cupping benefits       1.00      1.00      1.00         4
##           massage benefits       1.00      0.67      0.80         3
##       massage gun benefits       1.00      1.00      1.00         2
## 
##                   accuracy                           0.92        13
##                  macro avg       0.94      0.94      0.93        13
##               weighted avg       0.95      0.92      0.92        13
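
The fit_time and pred_time values are computed for each model above but never printed. A minimal sketch of a reusable timing helper follows; `timed_fit_predict` and the stand-in `MajorityClassifier` are hypothetical names introduced only so the example runs on its own, and any estimator with fit and predict methods could be passed instead.

```python
import time

def timed_fit_predict(model, X_train, y_train, X_test):
    """Fit the model, predict on X_test, and return (predictions, fit_seconds, predict_seconds)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = model.predict(X_test)
    predict_seconds = time.perf_counter() - start
    return predictions, fit_seconds, predict_seconds

class MajorityClassifier:
    """Stand-in estimator (assumption) that always predicts the most common training label."""
    def fit(self, X, y):
        y = list(y)
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]

preds, fit_s, pred_s = timed_fit_predict(
    MajorityClassifier(), [[0], [1], [2]], ["a", "a", "b"], [[3], [4]]
)
print(f"fit: {fit_s:.4f}s, predict: {pred_s:.4f}s, predictions: {preds}")
```

With the objects from this script the call would look like `timed_fit_predict(gb, tfidf_train, y_train, tfidf_test)`, so the RFC and GBC timings could be compared side by side.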

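def clean_text(text):
    # drop punctuation characters and lowercase the rest
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # raw string so that \W is passed to the regex engine unchanged
    tokens=re.split(r'\W+', text)
    # stem every non-stopword token; a list is returned for use as a vectorizer analyzer
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    # same steps as clean_text, but lemmatizes instead of stemming
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text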

def predict_tfidfRFC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    tfidf_vect=TfidfVectorizer(analyzer=lemmatize)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
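    # transform the raw text: the analyzer already lemmatizes, so the pre-lemmatized token list would be processed twice
    tfidf_test=tfidf_vect_fit.transform(nr['newReview'])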
    
    model = rf.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_tfidf_RFC_85-15:']
    print('\n\n',pred)
    

def predict_tfidfRFC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    tfidf_vect=TfidfVectorizer(analyzer=clean_text)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
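    # transform the raw text: the analyzer already cleans and stems, so the pre-cleaned token list would be processed twice
    tfidf_test=tfidf_vect_fit.transform(nr['newReview'])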
    
    model = rf.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_tfidf_RFC_85-15:']
    print('\n\n',pred)
    
predict_tfidfRFC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_tfidf_RFC_85-15:               massage benefits
predict_tfidfRFC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_tfidf_RFC_85-15:               massage benefits

def clean_text(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text

def predict_tfidfGBC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    tfidf_vect=TfidfVectorizer(analyzer=lemmatize)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
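    # transform the raw text: the analyzer already lemmatizes, so the pre-lemmatized token list would be processed twice
    tfidf_test=tfidf_vect_fit.transform(nr['newReview'])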
    
    model = gb.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_tfidf_GBC_85-15:']
    print('\n\n',pred)
    

def predict_tfidfGBC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    tfidf_vect=TfidfVectorizer(analyzer=clean_text)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
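    # transform the raw text: the analyzer already cleans and stems, so the pre-cleaned token list would be processed twice
    tfidf_test=tfidf_vect_fit.transform(nr['newReview'])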
    
    model = gb.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_tfidf_GBC_85-15:']
    print('\n\n',pred)
    
predict_tfidfGBC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_tfidf_GBC_85-15:               massage benefits
predict_tfidfGBC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_tfidf_GBC_85-15:               massage benefits
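
Each predict_* function above refits the vectorizer and the classifier every time it is called. One possible alternative, sketched here under stated assumptions, is to fit a scikit-learn Pipeline once and reuse it for every new input. The toy corpus, the `recommender` and `recommend` names, and the built-in English stop word list are assumptions made to keep the example self-contained; the script itself uses the lemmatize and clean_text analyzers and the real training split.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Toy corpus standing in for X_train['Document'] and y_train (illustrative only).
docs = [
    "massage relaxes tight muscles and eases stress",
    "deep tissue massage relieves muscle tension",
    "cupping therapy uses suction cups on the skin",
    "cupping leaves round marks where the cups sat",
]
topics = ["massage benefits", "massage benefits",
          "cupping benefits", "cupping benefits"]

# Vectorizer and classifier are chained, so raw text goes in and a label comes out.
recommender = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("rfc", RandomForestClassifier(n_estimators=150, random_state=0)),
])
recommender.fit(docs, topics)  # fitted once, reused for every request

def recommend(text):
    """Return the predicted topic label for one short user input."""
    return recommender.predict([text])[0]

print(recommend("I need a massage!"))
```

Because the pipeline receives raw text, the user input cannot be preprocessed twice, and swapping in GradientBoostingClassifier only changes the second step.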

N-Grams Vectorization for RFC and GBC

stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
def count_punct(text):
    count=sum([1 for char in text if char in string.punctuation])
    return round(count/(len(text)-text.count(" ")),3)*100
data['body_length']=data['Document'].apply(lambda x: len(x)-x.count(" "))
data['punct%']= data['Document'].apply(lambda x: count_punct(x))
def clean_text(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+',text)
    # joined back into one string: the n-gram CountVectorizer tokenizes the text itself
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text

def lemmatize(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    # joined back into one string: the n-gram CountVectorizer tokenizes the text itself
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    # When the function is passed as analyzer= (Count or TF-IDF vectorization), return the list instead;
    # a returned string is iterated character by character and yields single-letter features:
    #text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text
data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  chiropractic adjustment treatment serve need m...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...   unitedhealthcare combat opioid crisis nonopio...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...   safety chiropractic adjustment chiropractic a...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  advanced chiropractic relief 8 key benefit chi...
## 4  Heading to the spa can be a pampering treat, b...  ...  heading spa pampering treat also huge boost he...
## 
## [5 rows x 10 columns]
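
The comments in clean_text and lemmatize above note that the functions return a joined string for n-gram vectorization but a list when used as a vectorizer analyzer. A small sketch of why this matters follows; the helper names `as_tokens` and `as_string` and the two-word document are assumptions chosen for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

def as_tokens(text):
    """Analyzer that returns a list of tokens."""
    return text.lower().split()

def as_string(text):
    """Preprocessor that returns one whitespace-joined string."""
    return " ".join(text.lower().split())

doc = ["dry brushing"]

# A callable analyzer that returns a list: one feature per token.
token_vocab = sorted(CountVectorizer(analyzer=as_tokens).fit(doc).vocabulary_)

# A callable analyzer that returns a string: scikit-learn iterates over it
# character by character, so every feature is a single character.
char_vocab = sorted(CountVectorizer(analyzer=as_string).fit(doc).vocabulary_)

# Default analyzer with ngram_range: expects a string and builds the n-grams itself.
ngram_vocab = sorted(
    CountVectorizer(ngram_range=(1, 2)).fit([as_string(doc[0])]).vocabulary_
)

print(token_vocab)
print(char_vocab)
print(ngram_vocab)
```

This is the reason the n-gram section stores Cleaned_text and Lemmatized as strings and passes no analyzer, while the TF-IDF and Count sections pass the list-returning functions as the analyzer.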
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.15)
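
The split above is drawn without a seed, so the 13 test documents (and the classes they cover) differ from run to run and from section to section. A hedged sketch of a reproducible, class-balanced split is shown below on toy labels; the variable names and the 0.2 test size are assumptions for the illustration, and stratifying the real data would require every topic to have at least two documents.

```python
from collections import Counter
from sklearn.model_selection import train_test_split

# Toy labels standing in for data['Topic'] (illustrative only).
toy_topics = ["massage benefits"] * 10 + ["cupping benefits"] * 10
toy_docs = [f"document {i}" for i in range(len(toy_topics))]

docs_train, docs_test, topics_train, topics_test = train_test_split(
    toy_docs, toy_topics,
    test_size=0.2,
    random_state=42,      # fixes the shuffle so reruns give the same split
    stratify=toy_topics,  # keeps class proportions equal in train and test
)

print(Counter(topics_test))
```

With a fixed random_state the RFC and GBC results for each vectorization method would be measured on the same test documents, which makes the accuracy figures easier to compare.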
from sklearn.feature_extraction.text import CountVectorizer
n_gram_vect=CountVectorizer(ngram_range=(1,4))
type(X_train['Cleaned_text'])
## <class 'pandas.core.series.Series'>
X_train['Cleaned_text'].head()
## 46     benefit vibrat percuss therapi vibrat therapi...
## 49    futurist gun could effect foam roll muscl sore...
## 29     mental health counselor train skill salari me...
## 69    lymphat drainag massag us think lymphat system...
## 15    top 5 health benefit regular massag therapi ma...
## Name: Cleaned_text, dtype: object
X_train['Lemmatized'].head()
## 46     benefit vibration percussion therapy vibratio...
## 49    futuristic gun could effective foam rolling mu...
## 29     mental health counselor training skill salary...
## 69    lymphatic drainage massage u think lymphatic s...
## 15    top 5 health benefit regular massage therapy m...
## Name: Lemmatized, dtype: object
n_gram_vect_fit=n_gram_vect.fit(X_train['Lemmatized'])


n_gram_train=n_gram_vect_fit.transform(X_train['Lemmatized'])
n_gram_test=n_gram_vect_fit.transform(X_test['Lemmatized'])
len(n_gram_vect_fit.get_feature_names())
## 75119
print(n_gram_vect_fit.get_feature_names()[200:500])
## ['2009 showed lymphatic drainage', '2011', '2011 number', '2011 number adult', '2011 number adult seek', '2011 research', '2011 research review', '2011 research review sized', '2011 review', '2011 review 26', '2011 review 26 clinical', '2012', '2012 review', '2012 review studiestrusted', '2012 review studiestrusted source', '2013', '2013 illustrating', '2013 illustrating benefit', '2013 illustrating benefit journaling', '2015', '2015 report', '2015 report published', '2015 report published journal', '2015 review', '2015 review evidence', '2015 review evidence found', '2015 systematic', '2015 systematic review', '2015 systematic review concluded', '2016', '2016 michael', '2016 michael phelps', '2016 michael phelps permanently', '2016 study', '2016 study published', '2016 study published journal', '2016 summer', '2016 summer olympics', '2016 summer olympics share', '2016 summer olympics1', '2016 summer olympics1 us', '20162017', '20162017 wide', '20162017 wide range', '20162017 wide range reason', '2017', '2017 2018', '2017 2018 legal', '2017 2018 legal battle', '2017 chiropractor', '2017 chiropractor tout', '2017 chiropractor tout treatment', '2017 nba', '2017 nba final', '2017 nba final irving', '2017 scientist', '2017 scientist analyzed', '2017 scientist analyzed 11', '2017 study', '2017 study found', '2017 study found structure', '2018', '2018 found', '2018 found change', '2018 found change hamstring', '2018 galluppalmer', '2018 galluppalmer college', '2018 galluppalmer college chiropractic', '2018 legal', '2018 legal battle', '2018 legal battle aim', '2018 study', '2018 study led', '2018 study led dr', '2019', '2019 early', '2019 early 2020', '2019 early 2020 many', '2019 found', '2019 found patient', '2019 found patient high', '2019 massage', '2019 massage gun', '2019 massage gun one', '2019 unitedhealthcare', '2019 unitedhealthcare uhc', '2019 unitedhealthcare uhc combatting', '2020', '2020 2021', '2020 2021 end', '2020 2021 end expansion', '2020 beyond', 
'2020 beyond people', '2020 beyond people say', '2020 many', '2020 many people', '2020 many people started', '2021', '2021 end', '2021 end expansion', '2021 end expansion period', '2021 opioid', '2021 opioid use', '2021 opioid use decrease', '20minute', '20minute selfmassage', '20minute selfmassage using', '20minute selfmassage using massage', '21', '21 benefit', '21 benefit chiropractic', '21 benefit chiropractic adjustment', '21 benefit might', '21 benefit might known', '21 percent', '21 percent respectively', '21 percent respectively addition', '2105', '2105 billion', '2105 billion year2', '2105 billion year2 curious', '22', '22 million', '22 million american', '22 million american visit', '22 percent', '22 percent 21', '22 percent 21 percent', '23', '23 2019', '23 2019 massage', '23 2019 massage gun', '24', '24 separately', '24 separately column', '24 separately column several', '25', '25 percent', '25 percent american', '25 percent american adult', '25 reason', '25 reason get', '25 reason get massage', '25 showed', '25 showed 75', '25 showed 75 90', '26', '26 clinical', '26 clinical trial', '26 clinical trial looked', '272', '272 study', '272 study participant', '272 study participant three', '275', '275 picture', '275 picture theragun', '275 picture theragun purely', '275 really', '275 really worth', '275 really worth investing', '281', '281 341', '281 341 many', '281 341 many taoist', '29', '29 2019', '29 2019 unitedhealthcare', '29 2019 unitedhealthcare uhc', '30', '30 little', '30 little 30', '30 little 30 percent', '30 percent', '30 percent le', '30 percent le likely', '30 second', '30 second working', '30 second working along', '300', '300 ad', '300 ad even', '300 ad even earlier', '33', '33 medication', '33 medication one', '33 medication one year', '34', '34 lymphatic', '34 lymphatic system', '34 lymphatic system drain', '341', '341 many', '341 many taoist', '341 many taoist believe', '35', '35 cup', '35 cup first', '35 cup first session', '35 
seeking', '35 seeking relief', '35 seeking relief back', '37', '37 study', '37 study found', '37 study found reduction', '38', '38 taking', '38 taking pain', '38 taking pain medication', '40', '40 percussion', '40 percussion per', '40 percussion per second', '400', '400 600', '400 600 massage', '400 600 massage gun', '400 greater', '400 greater immune', '400 greater immune competence', '4357', '4357 local', '4357 local health', '4357 local health department', '44', '44 million', '44 million people', '44 million people 1320', '456000', '456000 chiropractor', '456000 chiropractor massage', '456000 chiropractor massage therapist', '48', '48 percent', '48 percent went', '48 percent went doctor', '48 received', '48 received pain', '48 received pain reduction', '4pm', '4pm afternoon', '4pm afternoon time', '4pm afternoon time dinner', '50', '50 75', '50 75 improvement', '50 75 improvement one', '50 adult', '50 adult least', '50 adult least one', '50 benefit', '50 benefit span', '50 benefit span ten', '50 improvement', '50 improvement researcher', '50 improvement researcher concluded', '50 million', '50 million american', '50 million american suffer', '50 minute', '50 minute long', '50 minute long say', '50 patient', '50 patient 16', '50 patient 16 completely', '50 state', '50 state however', '50 state however many', '51', '51 employee', '51 employee strong', '51 employee strong benefit', '52', '52 percent', '52 percent overall', '52 percent overall opioid', '53', '53 drugfree', '53 drugfree group', '53 drugfree group continued', '53 sought', '53 sought treatment', '53 sought treatment chiropractor', '549', '549 picture', '549 picture theragun', '549 picture theragun theragunliv', '57', '57 chiropractic', '57 chiropractic group', '57 chiropractic group achieved', '57 cup', '57 cup british', '57 cup british cupping', '60', '60 also', '60 also vulnerable', '60 also vulnerable complication', '60 minute']
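
The feature names above show how ngram_range=(1,4) expands every run of up to four consecutive tokens into its own feature, which is how a small corpus reaches 75,119 columns. The following sketch reproduces the idea in plain Python; `word_ngrams` is a hypothetical helper that splits on whitespace only, whereas CountVectorizer's default tokenizer also drops single-character tokens.

```python
def word_ngrams(text, min_n=1, max_n=4):
    """Return every run of min_n..max_n consecutive whitespace-separated tokens."""
    tokens = text.split()
    grams = []
    for n in range(min_n, max_n + 1):
        # slide a window of n tokens across the text
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams

grams = word_ngrams("2011 research review sized")
print(len(grams))
print(grams)
```

A four-token phrase yields 4 + 3 + 2 + 1 = 10 n-grams, including '2011 research', '2011 research review', and '2011 research review sized', which all appear in the feature list above.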
n_gram_train_df=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(n_gram_train.toarray())],axis=1)

n_gram_test_df=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(n_gram_test.toarray())],axis=1)
n_gram_train_df.head()
##                                             Document  body_length  ...  75117 75118
## 0  \r\nBenefits of Vibration and Percussion Thera...         2848  ...      0     0
## 1  This futuristic ?gun? could be more effective ...         4784  ...      0     0
## 2   Mental Health Counselor Training, Skills, and...         5085  ...      0     0
## 3  What is Lymphatic Drainage Massage?\nMost of u...         1697  ...      0     0
## 4  Top 5 Health Benefits of Regular Massage Thera...          893  ...      0     0
## 
## [5 rows x 75124 columns]
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(n_gram_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(n_gram_test)
end=time.time()
pred_time=(end-start)

prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                     Predicted                       Topic
## 58        cold stone benefits         cold stone benefits
## 56        cold stone benefits         cold stone benefits
## 72       dry brushing massage        dry brushing massage
## 36           massage benefits            cupping benefits
## 12           massage benefits            massage benefits
## 67           massage benefits  Lymphatic Drainage Massage
## 62           massage benefits  Lymphatic Drainage Massage
## 37           cupping benefits            cupping benefits
## 52           massage benefits         cold stone benefits
## 57           massage benefits         cold stone benefits
## 4            massage benefits            massage benefits
## 10           massage benefits            massage benefits
## 17  physical therapy benefits   physical therapy benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.6153846153846154
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[0 0 0 0 2 0]
##  [0 2 0 0 2 0]
##  [0 0 1 0 1 0]
##  [0 0 0 1 0 0]
##  [0 0 0 0 3 0]
##  [0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
## Lymphatic Drainage Massage       0.00      0.00      0.00         2
##        cold stone benefits       1.00      0.50      0.67         4
##           cupping benefits       1.00      0.50      0.67         2
##       dry brushing massage       1.00      1.00      1.00         1
##           massage benefits       0.38      1.00      0.55         3
##  physical therapy benefits       1.00      1.00      1.00         1
## 
##                   accuracy                           0.62        13
##                  macro avg       0.73      0.67      0.65        13
##               weighted avg       0.70      0.62      0.59        13
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)
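
The warning above appears because no test document was predicted as 'Lymphatic Drainage Massage', so that class's precision is 0/0 and is reported as 0.0. The sketch below recomputes the precision column and its macro and weighted averages from the confusion matrix printed above, using the same convention of substituting 0.0 for an undefined precision.

```python
# Confusion matrix copied from the n-gram RFC output above (row = expected, col = predicted).
cm = [
    [0, 0, 0, 0, 2, 0],
    [0, 2, 0, 0, 2, 0],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 3, 0],
    [0, 0, 0, 0, 0, 1],
]
n = len(cm)

support = [sum(row) for row in cm]                            # true documents per class
predicted = [sum(cm[r][k] for r in range(n)) for k in range(n)]  # predictions per class

# Precision is undefined when a class is never predicted; use 0.0 as the report does.
precision = [cm[k][k] / predicted[k] if predicted[k] else 0.0 for k in range(n)]

macro = sum(precision) / n                                    # every class counts equally
weighted = sum(p * s for p, s in zip(precision, support)) / sum(support)

print(predicted)
print([round(p, 2) for p in precision])
print(round(macro, 2), round(weighted, 2))
```

Eight of the 13 test documents were predicted as 'massage benefits' but only three belong there, which gives the 0.38 precision for that class and pulls the macro and weighted averages down to 0.73 and 0.70.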
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(n_gram_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(n_gram_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                       Topic
## 58         cold stone benefits         cold stone benefits
## 56         cold stone benefits         cold stone benefits
## 72        dry brushing massage        dry brushing massage
## 36            cupping benefits            cupping benefits
## 12            massage benefits            massage benefits
## 67  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 62  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 37            cupping benefits            cupping benefits
## 52         cold stone benefits         cold stone benefits
## 57         cold stone benefits         cold stone benefits
## 4             massage benefits            massage benefits
## 10            massage benefits            massage benefits
## 17   physical therapy benefits   physical therapy benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 1.0
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[2 0 0 0 0 0]
##  [0 4 0 0 0 0]
##  [0 0 2 0 0 0]
##  [0 0 0 1 0 0]
##  [0 0 0 0 3 0]
##  [0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
## Lymphatic Drainage Massage       1.00      1.00      1.00         2
##        cold stone benefits       1.00      1.00      1.00         4
##           cupping benefits       1.00      1.00      1.00         2
##       dry brushing massage       1.00      1.00      1.00         1
##           massage benefits       1.00      1.00      1.00         3
##  physical therapy benefits       1.00      1.00      1.00         1
## 
##                   accuracy                           1.00        13
##                  macro avg       1.00      1.00      1.00        13
##               weighted avg       1.00      1.00      1.00        13
def clean_text(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+',text)
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text

def predict_ngramRFC_clean(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))

    rf=RandomForestClassifier(n_estimators=150,max_depth=None, n_jobs=-1)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Cleaned_text'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Cleaned_text'])
    n_gram_test=n_gram_vect_fit.transform(nr['clean'])
    
    model = rf.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['stemmed_1ngram4_RFC_85-15:']
    print('\n\n',pred)
    

def lemmatize(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text    
    
def predict_ngramRFC_lemma(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemma']=nr['newReview'].apply(lambda x: lemmatize(x))

    rf=RandomForestClassifier(n_estimators=150,max_depth=None, n_jobs=-1)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Lemmatized'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Lemmatized'])
    n_gram_test=n_gram_vect_fit.transform(nr['lemma'])
    
    model = rf.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service:']
    pred.index= ['lemmatized_1ngram4RFC_85-15:']
    print('\n\n',pred)
predict_ngramRFC_clean('I need a massage!') 
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_RFC_85-15:               massage benefits
predict_ngramRFC_lemma('I need a massage!')
## 
## 
##                               Recommended Healthcare Service:
## lemmatized_1ngram4RFC_85-15:                massage benefits
def clean_text(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+',text)
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text

def predict_ngramGBC_clean(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))

    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Cleaned_text'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Cleaned_text'])
    n_gram_test=n_gram_vect_fit.transform(nr['clean'])
    
    model = gb.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['stemmed_1ngram4_GBC_85-15:']
    print('\n\n',pred)
    

def lemmatize(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text    
    
def predict_ngramGBC_lemma(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemma']=nr['newReview'].apply(lambda x: lemmatize(x))

    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Lemmatized'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Lemmatized'])
    n_gram_test=n_gram_vect_fit.transform(nr['lemma'])
    
    model = gb.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service:']
    pred.index= ['lemmatized_1ngram4GBC_85-15:']
    print('\n\n',pred)
predict_ngramGBC_clean('I need a massage!') 
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_GBC_85-15:               massage benefits
predict_ngramGBC_lemma('I need a massage!')
## 
## 
##                               Recommended Healthcare Service:
## lemmatized_1ngram4GBC_85-15:                massage benefits

Third part: Stemmed Tokens & 80/20 Train/Test split & RFC | GBC

Count Vectorizer RFC and GBC


stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
data <- read.csv('benefitsContraindications4.csv',sep=',',header=TRUE,  na.strings=c('',' ','NA'))
colnames(data)
## [1] "Document"            "Source"              "Topic"              
## [4] "InternetSearch"      "Contraindications"   "risksAdverseEffects"
head(data,5)
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                              Document
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               Chiropractic adjustments and treatments serve the needs of millions of people around the world.\n\n \n\nAdjustments offer effective, non-invasive and cost-effective solutions to neck and back pain, as well as a myriad of other medical issues.\n\nHave you ever stopped to wonder how many of us suffer from neck and back stiffness or pain?\n\nApart from the obvious discomfort, simple daily tasks such as driving a car, crossing a busy street and picking things up from the floor can become all too challenging for individuals experiencing such pain.\n\nAs anyone who has experienced pain would know, having 
restricted movement can be debilitating and unfortunately, our busy world doesn't allow for us to stop.\n\n \nSome of the benefits of long-term chiropractic care include:\n\n    Chiropractors can identify mechanical issues that cause spine-related pain and offer a series of adjustments that provide near immediate relief. Following appointments, patients often report feeling their symptoms noticeably better.\n    When a chiropractor performs an adjustment, they can help restore movement in joints that have "locked up". This becomes possible as treatment allows muscles surrounding joints to relax, thereby reducing joint stiffness.\n    Many factors affect health, including exercise patterns, nutrition, sleep, heredity and the environment in which we live. Rather than just treat symptoms of the disease, chiropractic care focuses on a holistic approach to naturally maintain health and resist disease.\n    Chiropractic adjustments help restore normal function and movement to the entire body. Many patients report an improvement in their ability to move with efficiency and strength.\n    Many patients find delight in the results chiropractic adjustments have on old and chronic injuries. Whether an injury is, in fact, new or old, chiropractic care can help reduce pain, restore mobility and provide quick pain relief to all joints in the body. Such care can help maintain better overall health and thus faster recovery time.\n\nHave you ever noticed that when you are in pain and unable to perform regular or favorite activities, it can put a strain on emotional and mental well-being?\n\nFor example, the increased stress from not being able to properly perform a paid job. This, in turn, can have a negative impact on physical health with increases in heart rate and blood pressure. The domino effect often continues with sleep becoming disturbed, with resulting lethargy and tiredness during the day. 
Does anyone really feel up to exercising in this state?\n\nChiropractic care is a natural method of healing the body's communication system and never relies on the use of pharmaceutical drugs or invasive surgery.\n\n\n\n 
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              \nUnitedHealthcare Combats Opioid Crisis with Non-Opioid Benefits\nPhysical therapy and chiropractic care can prevent or reduce expensive, invasive spinal procedures, such as imaging or surgery, to reduce opioid use and cut costs.\nUnitedHealthcare, opioid, physical therapy, healthcare spending\n\n\nOctober 29, 2019 - UnitedHealthcare (UHC) is 
combatting the opioid epidemic and high healthcare costs with new physical therapy and chiropractic care benefits to prevent, delay, or in some cases substitute for invasive spinal procedures.\n\n"With millions of Americans experiencing low back pain currently or at some point during their lifetimes, we believe this benefit design will help make a meaningful difference by improving health outcomes while reducing costs," said Anne Docimo, MD, UnitedHealthcare chief medical officer.\n\nLower back pain is in part responsible for sustaining the opioid epidemic and also increases healthcare costs.\nDig Deeper\n\nAlthough opioid overdoses fell by two percent from 2017 to 2018 and a legal battles aim to hold pharmaceutical companies accountable, there is no end in sight for the opioid epidemic. Industry professionals are still grappling with the balance between cutting opioid prescriptions will working to reduce patient pain.\n\nCommon conditions such as low back pain bolster the epidemic's presence, with clinicians still prescribing the opioids against best practice recommendations. According to a recent OptumLabs study, 9 percent of patients with newly diagnosed low back pain are prescribed opioids and lower back pain currently contributes 52 percent to the overall opioid prescription rate.\n\nIn addition to boosting opioids distribution, alternative, invasive lower back pain treatments can significantly impact healthcare spending.\n\nIt is not new information that physical therapy and chiropractic care are effective, lower cost alternatives to spinal imaging or surgery. However, payers are still in the process of adopting the method.\n\nTo counteract the high-cost, high-risk potential of using opioids to treat back pain, UHC created a benefit that does not rely on medication or technology but rather on physical therapy and chiropractic care.\n\nThe benefit allows eligible employers to offer physical therapist and chiropractor visits with no out-of-pocket costs. 
Members who already receive physical therapist and chiropractic care benefits under UHC's employer-sponsored health plans and who have maxed out their visits will not receive additional visits under this benefit.\n\nHowever, for those who still have visits to use and who choose physical therapy or chiropractic care over other forms of treatment, the copay or deductible for those visits will be waived and they will receive three visits at no cost.\n\nUHC has high expectations for the fiscal and physical impacts of this benefit.\n\nAccording to UHC's analysis, the health payer expects that by 2021, opioid use will decrease by 19 percent. Spinal imaging test frequency and spinal surgeries will be reduced by 22 percent and 21 percent, respectively. In addition to these specific goals, UHC hopes to see a decrease in the overall cost of spinal care.\n\nThe same OptumLabs study demonstrated that UHC's expectations are not without precedent.\n\nThe study looked at the correlation between out-of-pocket costs and patient utilization of noninvasive treatments. Researchers discovered that members whose copay was over $30 were a little under 30 percent less likely to choose physical therapy as opposed to more invasive treatments.\n\nAn American Journal of Managed Care study in June 2019 found that patients with high deductibles, typically over $1,000, were less likely to visit physical therapy.\n\nEligible employers may be brand new or renewing their membership. They must be fully insured and over 51 or more employees strong. The benefit is currently available in Connecticut, Florida, Georgia, New York, and North Carolina.\n\nHowever, UHC plans to expand the benefit from 2020 into 2021. By the end of this expansion period, the benefit will also be available to self-funded employers and organizations with an employee population between 2 and 50. 
The benefit will span ten states, primarily in the southeast.\n\n"This new benefit design may help encourage people with low back pain to get the right care at the right time and in the right setting, helping expand access to evidence-based and more affordable treatments," said Docimo.
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          The Safety of Chiropractic Adjustments\n\n    Chiropractic Adjustment\n    What Research Shows\n    Safety\n\nChiropractic adjustment, also called spinal manipulation, is a procedure done by a chiropractor using the hands or small instruments to apply controlled force to a spinal joint. The goal is to improve spinal motion and physical function of the entire body. Chiropractic adjustment is safe when performed by someone who is properly trained and licensed to practice chiropractic care. Complications are rare, but they are possible. Learn more about both the benefits and risks.\nChiropractic adjustment\nVerywell / Brianna Gilmartin \nChiropractic Adjustment\n\nOne of the most important reasons people seek chiropractic care is because it is a completely drug-free therapy. Someone dealing with joint pain, back pain, or headaches might consider visiting a chiropractor.\n\nThe goal of chiropractic adjustment is to place the body into a proper position so the body can heal itself. Treatments are believed to reduce stress on the immune system, reducing the potential for disease. 
Chiropractic care aims to address the entire body, including a person's ability to move, perform, and even think.\nWhat Research Shows\n\nMany people wonder how helpful chiropractic care is in treating years of trauma and poor posture. There have been numerous studies showing the therapeutic benefits of chiropractic care.\nSciatica\n\nSciatica is a type of pain affecting the sciatic nerve, the large nerve extending from the low back down the back of the legs. Other natural therapies don't always offer relief and most people want to avoid steroid injections and surgery, so they turn to chiropractic care.\n\nA double-blind trial reported in the Spine Journal compared active and simulated chiropractic manipulations in people with sciatic nerve pain. Active manipulations involved the patient laying down and receiving treatment from a chiropractor. Stimulated manipulations involved electrical muscle stimulation with electrodes placed on the skin to send electrical pulses to different parts of the body.\n\nThe researchers determined active manipulation offered more benefits than stimulated. The people who received active manipulations experienced fewer days of moderate or severe pain and other sciatica symptoms. They also reported no adverse effects.\nNeck Pain\n\nOne study reported in the Annals of Internal Medicine looked at different therapies for treating neck pain. They divided 272 study participants into three groups: one that received spinal manipulation from a chiropractic doctor, a second group given over-the-counter (OTC) pain relievers, narcotics, and muscle relaxers, and a third group who did at-home exercises. \n\nAfter 12 weeks, patients reported a 75% pain reduction, with the chiropractic treatment group achieving the most improvement. 
About 57% of the chiropractic group achieved pain reduction, while 48% received pain reduction from exercising, and 33% from medication.\n\nAfter one year, 53% of the drug-free groups continued to report pain relief compared to only 38% of those taking pain medications. \nHeadaches\n\nCervicogenic headaches and migraines are commonly treated by chiropractors. Cervicogenic headaches are often called secondary headaches because pain is usually referred from another source, usually the neck. Migraine headaches cause severe, throbbing pain and are generally experienced on one side of the head. There are few non-medicinal options for managing both types of chronic headaches.\n\nResearch reported in the Journal of Manipulative and Physiological Therapeutics suggests chiropractic care, specifically spinal manipulation, can improve migraines and cervicogenic headaches.  \nFrozen Shoulder\n\nFrozen shoulder affects the shoulder joint and involves pain and stiffness that develops gradually and gets worse. Frozen shoulder can be quite painful, and treatment involves preserving as much range of motion in the shoulder as possible and managing pain.\n\nA clinical trial reported in the Journal of Chiropractic Medicine described how patients suffering from frozen shoulder responded to chiropractic treatment. Of the 50 patients, 16 completely recovered, 25 showed a 75 to 90% improvement, and eight showed a 50 to 75% improvement. Only one person showed zero to 50% improvement. The researchers concluded most people can get improvement by treating frozen shoulder with chiropractic treatment.\nPreventing Need for Surgery\n\nChiropractic care may reduce the need for back surgery. 
Guidelines reported in the Journal of the American Medical Association suggest that it's reasonable for people suffering from back pain to try spinal manipulation before deciding on surgical intervention.\nLow Back Pain\n\nStudies have shown chiropractic care, including spinal manipulation, can provide relief from mild to moderate low back pain. In fact, spinal manipulation may work as well as other standard treatments, including pain-relief medications.\n\nA 2011 review of 26 clinical trials looked at the effectiveness of different treatments for chronic low back pain. What they found was that spinal manipulation is just as effective as other treatments for reducing back pain and improving function.\nSafety\n\n\n
## 4 Advanced Chiropractic Relief: 8 Key Benefits of Chiropractor Care\n\nAre you one of the 50 million Americans who suffer from chronic pain? If so you're probably intimately familiar with the feeling of pure desperation that can arise from an inability to find relief.\n\nIn addition to physical issues, chronic pain can cause anxiety, depression, and more. However, there could be a light at the end of the tunnel. Many people are finding advanced chiropractic relief that is completely changing their lives.\n\nYour body is a world in itself. At this very moment, more than a million chemical reactions are taking place in your body. It manufactures energy, it regulates your heartbeat, your breathing and it regenerates and heals itself. Everything takes place without your conscious knowledge, without you controlling it voluntarily. The master system that controls it all is your nervous system.\n\nThe nervous system is made out of your brain, spinal cord and all your nerves.\n\nThe energy that flows through your nervous system in your body is like electricity. In order to have that electric flow normally and freely, we need to have a well functioning spine. Whenever you have disruption of that flow, disease happens. That would be the case when your spine is misaligned or is not moving properly.\n\nDid you know that 90% of stimulation and nutrition to the brain is generated by the movement of the spine?\n\nThe more mechanically distorted a person is, the less energy is available for thinking, metabolism and healing.\n\nThis is why it is so important to have a healthy spine, a proper posture, to exercise, to eat properly - all of it truly matters for your quality of life.\nChiropractors localize the areas of your spine that do not move properly - referred to as vertebral subluxations - and adjust them with a specific high speed, but yet gentle, thrust to improve spinal motion.\n\nWant to learn about some of the ways chiropractic care can help you? 
Keep reading for insight into some of the key benefits of seeing a chiropractor.\n\nThe benefits of chiropractic care are numerous:\n\n1. Lower Blood Pressure\n\nStudies show that chiropractic treatment can lower your blood pressure. Sometimes, this works just as well as a prescription blood pressure medication! This benefit can also last for as long as six months after treatment.\n\nHigh blood pressure can cause an array of serious side effects like nausea, fatigue, dizziness, and anxiety. Sufferers who haven't found relief should consider consulting with a chiropractor. A chiropractic adjustment may be the solution.\n\nSome studies have shown that chiropractic adjustments can also help patients who are suffering from low blood pressure.\n\n2. Reduced Inflammation\n\nIn many cases, joint issues, pain, and tension are caused by inflammation in the body. Chiropractic adjustments can reduce inflammation.\n\nThis leads to relief of muscle tension, chronic back pain, and joint pain. These adjustments can sometimes also slow the progression of inflammation-related diseases, like arthritis.\n\n3. Better Sleep\n\nPatients who receive chiropractic adjustments report a significant improvement in their sleep patterns. If you regularly suffer from insomnia, visiting a chiropractor regularly may help. Also, when you experience pain relief, this will help you get a restful night's sleep.\n\n4. Digestive Relief\n\nChiropractors often give nutritional advice as part of their services. However, this isn't the only way that they provide patients with digestive relief.\n\nAdjusting the thoraco-lumbar spine restores the neurological function of your digestive system. Regular adjustments can help with chronic digestive issues.\n\n5. Stress Release\n\nEveryday life can cause muscle cramping, inflammation, and more. When you're sore from working at a computer, heavy lifting, or just dealing with emotional stress, a chiropractic adjustment can help. 
This leads to greater comfort and advanced pain relief.\n\n6. Improvement of Neurological Conditions\n\nA chiropractic adjustment can also increase blood flow to the brain and increase the flow of cerebral spinal fluid. This means that patients suffering from neurological conditions like epilepsy and multiple sclerosis can significantly benefit from regular adjustments.\n\nThis is a relatively new area of study, but the potential is huge. Those suffering from these conditions will want to do some research. It's important to find the best chiropractor in their area with experience dealing with these specific types of cases.\n\n7. Chiropractic care can improve communication from your brain to your muscles\n\nResearch seems to show that chiropractic care can improve your brain-body communication, helping your brain to be more aware of what is going on in the body so it can control your body better.\n\nBetter health, more energy and vitality are some of the positive effects of getting your spine adjusted. It sets your vertebrae back into motion freeing up the energy that travels through your nerves.\n\nChiropractic care is a partnership. The results patients want is a combination of what the chiropractor does and what the patient does.\n\nThere are many good things that can be changed and improved for a better lifestyle: exercise, good nutrition, good mental attitude and spinal adjustments.\n\nYour whole body will work better by having your nervous system free of interference. That is the essence of chiropractic care and is designed for you and your family.\n\n8. Pain Relief\n\nPerhaps the most well-known benefit of going to a chiropractor is pain relief. Adjustments can help with a huge array of painful conditions including the following.\n\nNeck and Lower Back Pain\n\nAdjustments are the most effective non-invasive pain relief method for this type of pain. 
They may help patients avoid having to take prescription pain management drugs.\n\nSciatica\n\nTreatments help relieve pressure on the nerve. This results in less severe pain that lasts for a fewer number of days.\n\nHeadaches\n\nChiropractic adjustments help headaches and migraines. They do this by treating back misalignment, muscle tension, and stress. Cervical spine manipulation was associated with significant improvement in headache outcomes in trials involving patients with neck pain and/or neck dysfunction and headache.\n\nChronic headaches can result from the abnormal positioning of the head and can be worsened from neck pressure and movement. Chiropractic removes the interference whether it may be from the distant muscle tightness in the back causing strain on your spine or an abnormal lordotic cervical curve and moving vertebrae.\nChiropractic care can reduce the duration of headaches, lower their intensity when they do occur and limit the frequency of their occurrence all together.\n\nMenstrual cramps\n\nChiropractic treatment removes tension from the pelvis and sacrum. It also regulates the neurological function communicating with the reproductive organs. Adjustments can also relieve the bloating, cramping, and pain associated with menstrual cramps\n\nAnyone who has tried traditional medical treatments and has been unable to find pain relief should experiment with chiropractic care. More often than not, you'll be pleasantly surprised!\n\nBonus: Advanced Chiropractic Relief\n\nIn addition to the benefits listed above, adjustments can bring advanced chiropractic relief for a wide variety of other conditions as well as overall life improvement. A few examples include:\n\nScoliosis - adjustments have shown to help with the pain, reduced range of motion, abnormal posture, and even difficulty breathing caused by this abnormal curvature of the spine\n\nVertigo - 
an adjustment can help realign and balance the spine, thereby reducing the dizziness, nausea, and disorientation caused by vertigo\n\nSinus and allergy relief - adjusting the upper cervical spine can help drain the sinuses and provide immediate and lasting relief from both long-term and seasonal allergies\n\nExpectant mothers - women can experience relief from pain and morning sickness and are better able to maintain proper posture during and after pregnancy\n\nChildren's issues - treatments have been shown to help children with acid reflux, colic, and ear infections\nAthletic performance - the reduction in pain and inflammation is particularly beneficial for professional and amateur athletes\n\nStimulates the immune system - chiropractic care helps to boost the immune system, speeding up the healing process following illnesses or injuries. One of the most important studies showing the positive effect chiropractic care can have on the immune system and general health was performed by Ronald Pero, Ph.D., chief of cancer prevention research at New York's Preventive Medicine Institute and professor of medicine at New York University. Dr. Pero measured the immune systems of people under chiropractic care as compared to those in the general population and those with cancer and other serious diseases.\n\nIn his initial three-year study of 107 individuals who had been under chiropractic care for five years or more, the chiropractic patients were found to have a 200% greater immune competence than people who had not received chiropractic care, and 400% greater immune competence than people with cancer and other serious diseases. The immune system superiority of those under chiropractic care did not diminish with age.\n\nDr. Pero stated: "When applied in a clinical framework, I have never seen a group other than this chiropractic group to experience a 200% increase over the normal patients. This is why it is so dramatically important. 
We have never seen such a positive improvement in a group."\n\nAs you can see, there are almost limitless benefits to seeking chiropractic treatment. If you haven't tried it yet, what are you waiting for?\n\nThere's no need to accept pain and discomfort as a normal part of life. You have nothing to lose and everything to gain, so it only makes sense to find out more about this possibly life-changing approach to improving your health and wellness.
## 5 Heading to the spa can be a pampering treat, but it can also be a huge boost to your health and wellness! Massage therapy can relieve all sorts of ailments - from physical pain, to stress and anxiety. People who choose to supplement their healthcare regimen with regular massages will not only enjoy a relaxing hour or two at the spa, but they will see the benefits carry through the days and weeks after the appointment!\n\n1\n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\nThese are the 10 most common benefits reported from massage therapy:\n\n1. 
Reduce Stress\n\nA relaxing day at the spa is a great way to unwind and de-stress. However, clients are sure to notice themselves feeling relaxed and at ease for days and even weeks after their appointments!\n\n \n\n2. Improve Circulation\n\nLoosening muscles and tendons allows increased blood flow throughout the body. Improving your circulation can have a number of positive effects on the rest of your body, including reduced fatigue and pain management!\n\n \n\n3. Reduce Pain\n\nMassage therapy is great for working out problem areas like lower back pain and chronic stiffness. A professional therapist will be able to accurately target the source of your pain and help achieve the perfect massage regimen.\n\n \n\n3\n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n4. Eliminate Toxins\n\nStimulating the soft tissues of your body will help to release toxins through your blood and lymphatic systems.\n\n \n\n5. Improve Flexibility\n\nMassage therapy will loosen and relax your muscles, helping your body to achieve its full range of movement potential.\n\n \n\n6. Improve Sleep\n\nA massage will encourage relaxation and boost your mood.  Going to bed with relaxed and loosened muscles promotes more restful sleep, and you'll feel less tired in the morning!\n\n \n\n7. Enhance Immunity\n\nStimulation of the lymph nodes re-charges the body's natural defense system.\n\n \n\n8. Reduce Fatigue\n\nMassage therapy is known to boost mood and promote better quality sleep, thus making you feel more rested and less worn-out at the end of the day.\n\n \n\n2\n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n9. Alleviate Depression and Anxiety\n\nMassage therapy can help to release endorphins in your body, helping you to feel happy, energized, and at ease.\n\n \n\n10. Reduce post-surgery and post-injury swelling\n\nA professional massage is a great way to safely deal with a sports injury or post-surgery rehabilitation.\n\nDo you think that massage therapy could help you find relief in any of these areas? 
What improvements would you like to see in your health? Contact us today with your questions about massage therapy and see how we can help you get on the path to improved health and wellness!
##                                                                                                                                       Source
## 1 https://coremedicalohio.com/benefits-of-long-term-chiropractic-care/?utm_source=ReviveOldPost&utm_medium=social&utm_campaign=ReviveOldPost
## 2                                   https://healthpayerintelligence.com/news/unitedhealthcare-combats-opioid-crisis-with-non-opioid-benefits
## 3                                                                     https://www.verywellhealth.com/is-chiropractic-adjustment-safe-4588279
## 4                                           https://hafkeychiropractic.com/advanced-chiropractic-relief-8-key-benefits-of-chiropractor-care/
## 5                                                                                  https://www.urbannirvana.com/10-benefits-massage-therapy/
##                   Topic InternetSearch
## 1 chiropractic benefits           <NA>
## 2 chiropractic benefits           <NA>
## 3 chiropractic benefits           <NA>
## 4 chiropractic benefits           <NA>
## 5      massage benefits         google
##                                                                                                                                                                                               Contraindications
## 1 Doctors of Chiropractic work collaboratively with other healthcare professionals. Should your condition require the attention of another healthcare profession, that recommendation or referral will be made.
## 2                                                                                                                                                                                                          <NA>
## 3                                                                                                                                                                                                          <NA>
## 4                                                                                                                                                                                                          <NA>
## 5                                                                                                                                                                                                          <NA>
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        risksAdverseEffects
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       <NA>
## 2 <NA>
## 3 Risks and side effects associated with chiropractic adjustments may include:\n\n    temporary headaches\n    fatigue after treatment\n    discomfort in parts of the body that were treated\n\nRare but serious risks associated with chiropractic adjustment include:\n\n    stroke\n    cauda equina syndrome, a condition involving pinched nerves in the lower part of the spinal canal\n    worsening of herniated disks (although research isn't conclusive)\n\nIn addition to effectiveness, research has focused on the safety of chiropractic treatments, mainly spinal manipulation. \n\nOne 2017 review of 250 articles looked at serious adverse events and benign events associated with chiropractic care. Based on the evidence the researchers reviewed, serious adverse events accounted for one out of every two million spinal manipulations to 13 per 10,00 patients. Serious adverse events included spinal or neurological problems and cervical arterial strokes (dissection of any of the arteries in the neck).\n\nBenign events were more common and included more pain and higher levels of neck problems, but most were short-term problems.\n\nThe researchers confirmed serious adverse events were rare and often related to other preexisting conditions, while benign events are more common. However, the reasons for any types of adverse events are unknown.\n\nA second 2017 review looked 118 articles and found frequently described adverse events include stroke, headache and vertebral artery dissection (cervical arterial stroke). Forty-six percent of the reviews determined that spinal manipulation was safe, while 13% expressed concern of harm. The remaining studies were unclear or neutral. While the researchers did not offer an overall conclusion, they determined spinal manipulation can significantly be helpful, and some risk does exist.\nA Word From Verywell   When chiropractors are correctly trained and licensed, chiropractic care is safe. 
Mild side effects are to be expected and include temporary soreness, stiffness, and tenderness in the treated area. However, you still want to do your research. Ask for a referral from your doctor. Look at the chiropractor's website, including patient reviews. Meet with the chiropractor to discuss his or her treatment practices and ask about possible adverse effects related to treatment.\n\nIf you decide a chiropractor isn't for you, consider seeing an osteopathic doctor. Osteopaths are fully licensed doctors who can practice all areas of medicine. They have received special training on the musculoskeletal system, which includes manual readjustments, myofascial release and other physical manipulation of bones and muscle tissues.
## 4 <NA>
## 5 <NA>
def count_punct(text):
    # Share of non-space characters that are punctuation, as a percentage.
    non_space = len(text) - text.count(" ")
    if non_space == 0:
        return 0.0  # guard against empty or all-space documents
    count = sum(1 for char in text if char in string.punctuation)
    return round(count / non_space, 3) * 100

data['body_length'] = data['Document'].apply(lambda x: len(x) - x.count(" "))
data['punct%'] = data['Document'].apply(count_punct)

def clean_text(text):
    # Strip punctuation, lowercase, split on non-word characters, drop stopwords, then stem.
    text = "".join([char.lower() for char in text if char not in string.punctuation])
    tokens = re.split(r'\W+', text)  # raw string avoids the invalid '\W' escape warning
    # Returns a list of tokens for the count vectorizer; text that starts or ends
    # with whitespace leaves an empty-string token at that end.
    return [ps.stem(word) for word in tokens if word not in stopwords]

def lemmatize(text):
    # Same steps as clean_text, but lemmatizes with WordNet instead of stemming.
    text = "".join([char.lower() for char in text if char not in string.punctuation])
    tokens = re.split(r'\W+', text)
    return [wn.lemmatize(word) for word in tokens if word not in stopwords]

data['Cleaned_text'] = data['Document'].apply(clean_text)
data['Lemmatized'] = data['Document'].apply(lemmatize)
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  [chiropractic, adjustment, treatment, serve, n...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...  [, unitedhealthcare, combat, opioid, crisis, n...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...  [, safety, chiropractic, adjustment, chiroprac...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  [advanced, chiropractic, relief, 8, key, benef...
## 4  Heading to the spa can be a pampering treat, b...  ...  [heading, spa, pampering, treat, also, huge, b...
## 
## [5 rows x 10 columns]
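The empty first entries in the Lemmatized lists for rows 1 and 2 above (`[, unitedhealthcare, ...` and `[, safety, ...`) come from `re.split`, which returns an empty string when the text begins or ends with a separator such as a newline. The following is a minimal sketch of that behavior, not the script's own function: it uses only the standard library, skips stemming and lemmatizing, and the `STOPWORDS` set, the `tokenize` helper, and its `drop_empty` flag are hypothetical stand-ins introduced for illustration.

```python
import re
import string

# Tiny stand-in stopword set (assumption); the script itself uses a fuller stopword list.
STOPWORDS = {"the", "of", "a", "and"}

def tokenize(text, drop_empty=False):
    """Punctuation stripping, lowercasing, and splitting as in clean_text/lemmatize,
    without the stemming or lemmatizing step."""
    text = "".join(ch.lower() for ch in text if ch not in string.punctuation)
    tokens = re.split(r"\W+", text)
    # Keep a token if it is not a stopword and (it is non-empty or empties are allowed).
    return [t for t in tokens if t not in STOPWORDS and (t or not drop_empty)]

sample = "\nThe Safety of Chiropractic Adjustments\n\n"

with_empties = tokenize(sample)
without_empties = tokenize(sample, drop_empty=True)

print(with_empties)     # ['', 'safety', 'chiropractic', 'adjustments', '']
print(without_empties)  # ['safety', 'chiropractic', 'adjustments']
```

Whether to drop the empty tokens is a design choice; leaving them in means the vectorizers may see `''` as a token in any document that starts or ends with whitespace.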
data.to_csv('dataCleanLemm.csv')
#DATA = pd.read_csv('dataCleanLemm.csv', encoding='unicode_escape')
DATA <- read.csv('dataCleanLemm.csv', sep=',', header=TRUE, na.strings=c('',' ','NA'), row.names=1)
colnames(DATA)
##  [1] "Document"          "Source"            "Topic"            
##  [4] "InternetSearch"    "Contraindications" "RisksSideEffects" 
##  [7] "body_length"       "punct."            "Cleaned_text"     
## [10] "Lemmatized"
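Writing the data frame to CSV stores each token list as its string form, so after reloading, the Cleaned_text and Lemmatized columns hold text such as `"['chiropract', 'adjust', ...]"` rather than lists. If the lists are needed again on the Python side, one option is `ast.literal_eval`. The sketch below is an illustration under assumptions: it uses the standard-library `csv` module in place of pandas so that it runs on its own, and the single `row` is made-up sample data modeled on the first document.

```python
import ast
import csv
import io

# Hypothetical one-row sample modeled on the columns written to dataCleanLemm.csv.
row = {
    "Topic": "chiropractic benefits",
    "Cleaned_text": ["chiropract", "adjust", "treatment"],
}

# Write to an in-memory CSV; non-string values are stringified with str().
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)

# Read it back: every field is now a plain string.
buf.seek(0)
loaded = next(csv.DictReader(buf))
raw = loaded["Cleaned_text"]

# Safely parse the string form back into a Python list.
tokens = ast.literal_eval(raw)

print(type(raw).__name__, raw)
print(type(tokens).__name__, tokens)
```

With pandas, the same parse could be applied column-wise, for example `DATA['Cleaned_text'].apply(ast.literal_eval)` after `pd.read_csv`.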
head(DATA,2)
##   Document
## 0 Chiropractic adjustments and treatments serve the needs of millions of people around the world.\n\n \n\nAdjustments offer effective, non-invasive and cost-effective solutions to neck and back pain, as well as a myriad of other medical issues.\n\nHave you ever stopped to wonder how many of us suffer from neck and back stiffness or pain?\n\nApart from the obvious discomfort, simple daily tasks such as driving a car, crossing a busy street and picking things up from the floor can become all too challenging for individuals experiencing such pain.\n\nAs anyone who has experienced pain would know, having restricted movement can be debilitating and unfortunately, our busy world doesn't allow for us to stop.\n\n \nSome of the benefits of long-term chiropractic care include:\n\n    Chiropractors can identify mechanical issues that cause spine-related pain and offer a series of adjustments that provide near immediate relief. Following appointments, patients often report feeling their symptoms noticeably better.\n    When a chiropractor performs an adjustment, they can help restore movement in joints that have "locked up". This becomes possible as treatment allows muscles surrounding joints to relax, thereby reducing joint stiffness.\n    Many factors affect health, including exercise patterns, nutrition, sleep, heredity and the environment in which we live. Rather than just treat symptoms of the disease, chiropractic care focuses on a holistic approach to naturally maintain health and resist disease.\n    Chiropractic adjustments help restore normal function and movement to the entire body. Many patients report an improvement in their ability to move with efficiency and strength.\n    Many patients find delight in the results chiropractic adjustments have on old and chronic injuries. Whether an injury is, in fact, new or old, chiropractic care can help reduce pain, restore mobility and provide quick pain relief to all joints in the body. 
Such care can help maintain better overall health and thus faster recovery time.\n\nHave you ever noticed that when you are in pain and unable to perform regular or favorite activities, it can put a strain on emotional and mental well-being?\n\nFor example, the increased stress from not being able to properly perform a paid job. This, in turn, can have a negative impact on physical health with increases in heart rate and blood pressure. The domino effect often continues with sleep becoming disturbed, with resulting lethargy and tiredness during the day. Does anyone really feel up to exercising in this state?\n\nChiropractic care is a natural method of healing the body's communication system and never relies on the use of pharmaceutical drugs or invasive surgery.\n\n\n\n 
## 1 \nUnitedHealthcare Combats Opioid Crisis with Non-Opioid Benefits\nPhysical therapy and chiropractic care can prevent or reduce expensive, invasive spinal procedures, such as imaging or surgery, to reduce opioid use and cut costs.\nUnitedHealthcare, opioid, physical therapy, healthcare spending\n\n\nOctober 29, 2019 - UnitedHealthcare (UHC) is combatting the opioid epidemic and high healthcare costs with new physical therapy and chiropractic care benefits to prevent, delay, or in some cases substitute for invasive spinal procedures.\n\n"With millions of Americans experiencing low back pain currently or at some point during their lifetimes, we believe this benefit design will help make a meaningful difference by improving health outcomes while reducing costs," said Anne Docimo, MD, UnitedHealthcare chief medical officer.\n\nLower back pain is in part responsible for sustaining the opioid epidemic and also increases healthcare costs.\nDig Deeper\n\nAlthough opioid overdoses fell by two percent from 2017 to 2018 and a legal battles aim to hold pharmaceutical companies accountable, there is no end in sight for the opioid epidemic. Industry professionals are still grappling with the balance between cutting opioid prescriptions will working to reduce patient pain.\n\nCommon conditions such as low back pain bolster the epidemic's presence, with clinicians still prescribing the opioids against best practice recommendations. According to a recent OptumLabs study, 9 percent of patients with newly diagnosed low back pain are prescribed opioids and lower back pain currently contributes 52 percent to the overall opioid prescription rate.\n\nIn addition to boosting opioids distribution, alternative, invasive lower back pain treatments can significantly impact healthcare spending.\n\nIt is not new information that physical therapy and chiropractic care are effective, lower cost alternatives to spinal imaging or surgery. 
However, payers are still in the process of adopting the method.\n\nTo counteract the high-cost, high-risk potential of using opioids to treat back pain, UHC created a benefit that does not rely on medication or technology but rather on physical therapy and chiropractic care.\n\nThe benefit allows eligible employers to offer physical therapist and chiropractor visits with no out-of-pocket costs. Members who already receive physical therapist and chiropractic care benefits under UHC's employer-sponsored health plans and who have maxed out their visits will not receive additional visits under this benefit.\n\nHowever, for those who still have visits to use and who choose physical therapy or chiropractic care over other forms of treatment, the copay or deductible for those visits will be waived and they will receive three visits at no cost.\n\nUHC has high expectations for the fiscal and physical impacts of this benefit.\n\nAccording to UHC's analysis, the health payer expects that by 2021, opioid use will decrease by 19 percent. Spinal imaging test frequency and spinal surgeries will be reduced by 22 percent and 21 percent, respectively. In addition to these specific goals, UHC hopes to see a decrease in the overall cost of spinal care.\n\nThe same OptumLabs study demonstrated that UHC's expectations are not without precedent.\n\nThe study looked at the correlation between out-of-pocket costs and patient utilization of noninvasive treatments. Researchers discovered that members whose copay was over $30 were a little under 30 percent less likely to choose physical therapy as opposed to more invasive treatments.\n\nAn American Journal of Managed Care study in June 2019 found that patients with high deductibles, typically over $1,000, were less likely to visit physical therapy.\n\nEligible employers may be brand new or renewing their membership. They must be fully insured and over 51 or more employees strong. 
The benefit is currently available in Connecticut, Florida, Georgia, New York, and North Carolina.\n\nHowever, UHC plans to expand the benefit from 2020 into 2021. By the end of this expansion period, the benefit will also be available to self-funded employers and organizations with an employee population between 2 and 50. The benefit will span ten states, primarily in the southeast.\n\n"This new benefit design may help encourage people with low back pain to get the right care at the right time and in the right setting, helping expand access to evidence-based and more affordable treatments," said Docimo.
##                                                                                                                                       Source
## 0 https://coremedicalohio.com/benefits-of-long-term-chiropractic-care/?utm_source=ReviveOldPost&utm_medium=social&utm_campaign=ReviveOldPost
## 1                                   https://healthpayerintelligence.com/news/unitedhealthcare-combats-opioid-crisis-with-non-opioid-benefits
##                   Topic InternetSearch
## 0 chiropractic benefits           <NA>
## 1 chiropractic benefits           <NA>
##                                                                                                                                                                                               Contraindications
## 0 Doctors of Chiropractic work collaboratively with other healthcare professionals. Should your condition require the attention of another healthcare profession, that recommendation or referral will be made.
## 1                                                                                                                                                                                                          <NA>
##   RisksSideEffects body_length punct.
## 0             <NA>        2288    2.4
## 1             <NA>        3796    2.4
##   Cleaned_text
## 0 ['chiropract', 'adjust', 'treatment', 'serv', 'need', 'million', 'peopl', 'around', 'world', 'adjust', 'offer', 'effect', 'noninvas', 'costeffect', 'solut', 'neck', 'back', 'pain', 'well', 'myriad', 'medic', 'issu', 'ever', 'stop', 'wonder', 
'mani', 'us', 'suffer', 'neck', 'back', 'stiff', 'pain', 'apart', 'obviou', 'discomfort', 'simpl', 'daili', 'task', 'drive', 'car', 'cross', 'busi', 'street', 'pick', 'thing', 'floor', 'becom', 'challeng', 'individu', 'experienc', 'pain', 'anyon', 'experienc', 'pain', 'would', 'know', 'restrict', 'movement', 'debilit', 'unfortun', 'busi', 'world', 'doesnt', 'allow', 'us', 'stop', 'benefit', 'longterm', 'chiropract', 'care', 'includ', 'chiropractor', 'identifi', 'mechan', 'issu', 'caus', 'spinerel', 'pain', 'offer', 'seri', 'adjust', 'provid', 'near', 'immedi', 'relief', 'follow', 'appoint', 'patient', 'often', 'report', 'feel', 'symptom', 'notic', 'better', 'chiropractor', 'perform', 'adjust', 'help', 'restor', 'movement', 'joint', 'lock', 'becom', 'possibl', 'treatment', 'allow', 'muscl', 'surround', 'joint', 'relax', 'therebi', 'reduc', 'joint', 'stiff', 'mani', 'factor', 'affect', 'health', 'includ', 'exercis', 'pattern', 'nutrit', 'sleep', 'hered', 'environ', 'live', 'rather', 'treat', 'symptom', 'diseas', 'chiropract', 'care', 'focus', 'holist', 'approach', 'natur', 'maintain', 'health', 'resist', 'diseas', 'chiropract', 'adjust', 'help', 'restor', 'normal', 'function', 'movement', 'entir', 'bodi', 'mani', 'patient', 'report', 'improv', 'abil', 'move', 'effici', 'strength', 'mani', 'patient', 'find', 'delight', 'result', 'chiropract', 'adjust', 'old', 'chronic', 'injuri', 'whether', 'injuri', 'fact', 'new', 'old', 'chiropract', 'care', 'help', 'reduc', 'pain', 'restor', 'mobil', 'provid', 'quick', 'pain', 'relief', 'joint', 'bodi', 'care', 'help', 'maintain', 'better', 'overal', 'health', 'thu', 'faster', 'recoveri', 'time', 'ever', 'notic', 'pain', 'unabl', 'perform', 'regular', 'favorit', 'activ', 'put', 'strain', 'emot', 'mental', 'wellb', 'exampl', 'increas', 'stress', 'abl', 'properli', 'perform', 'paid', 'job', 'turn', 'neg', 'impact', 'physic', 'health', 'increas', 'heart', 'rate', 'blood', 'pressur', 'domino', 'effect', 'often', 'continu', 'sleep', 
'becom', 'disturb', 'result', 'lethargi', 'tired', 'day', 'anyon', 'realli', 'feel', 'exercis', 'state', 'chiropract', 'care', 'natur', 'method', 'heal', 'bodi', 'commun', 'system', 'never', 'reli', 'use', 'pharmaceut', 'drug', 'invas', 'surgeri', '']
## 1 ['', 'unitedhealthcar', 'combat', 'opioid', 'crisi', 'nonopioid', 'benefit', 'physic', 'therapi', 'chiropract', 'care', 'prevent', 'reduc', 'expens', 'invas', 'spinal', 'procedur', 'imag', 'surgeri', 'reduc', 'opioid', 'use', 'cut', 'cost', 'unitedhealthcar', 'opioid', 'physic', 'therapi', 'healthcar', 'spend', 'octob', '29', '2019', 'unitedhealthcar', 'uhc', 'combat', 'opioid', 'epidem', 'high', 'healthcar', 'cost', 'new', 'physic', 'therapi', 'chiropract', 'care', 'benefit', 'prevent', 'delay', 'case', 'substitut', 'invas', 'spinal', 'procedur', 'million', 'american', 'experienc', 'low', 'back', 'pain', 'current', 'point', 'lifetim', 'believ', 'benefit', 'design', 'help', 'make', 'meaning', 'differ', 'improv', 'health', 'outcom', 'reduc', 'cost', 'said', 'ann', 'docimo', 'md', 'unitedhealthcar', 'chief', 'medic', 'offic', 'lower', 'back', 'pain', 'part', 'respons', 'sustain', 'opioid', 'epidem', 'also', 'increas', 'healthcar', 'cost', 'dig', 'deeper', 'although', 'opioid', 'overdos', 'fell', 'two', 'percent', '2017', '2018', 'legal', 'battl', 'aim', 'hold', 'pharmaceut', 'compani', 'account', 'end', 'sight', 'opioid', 'epidem', 'industri', 'profession', 'still', 'grappl', 'balanc', 'cut', 'opioid', 'prescript', 'work', 'reduc', 'patient', 'pain', 'common', 'condit', 'low', 'back', 'pain', 'bolster', 'epidem', 'presenc', 'clinician', 'still', 'prescrib', 'opioid', 'best', 'practic', 'recommend', 'accord', 'recent', 'optumlab', 'studi', '9', 'percent', 'patient', 'newli', 'diagnos', 'low', 'back', 'pain', 'prescrib', 'opioid', 'lower', 'back', 'pain', 'current', 'contribut', '52', 'percent', 'overal', 'opioid', 'prescript', 'rate', 'addit', 'boost', 'opioid', 'distribut', 'altern', 'invas', 'lower', 'back', 'pain', 'treatment', 'significantli', 'impact', 'healthcar', 'spend', 'new', 'inform', 'physic', 'therapi', 'chiropract', 'care', 'effect', 'lower', 'cost', 'altern', 'spinal', 'imag', 'surgeri', 'howev', 'payer', 'still', 'process', 'adopt', 'method', 
'counteract', 'highcost', 'highrisk', 'potenti', 'use', 'opioid', 'treat', 'back', 'pain', 'uhc', 'creat', 'benefit', 'reli', 'medic', 'technolog', 'rather', 'physic', 'therapi', 'chiropract', 'care', 'benefit', 'allow', 'elig', 'employ', 'offer', 'physic', 'therapist', 'chiropractor', 'visit', 'outofpocket', 'cost', 'member', 'alreadi', 'receiv', 'physic', 'therapist', 'chiropract', 'care', 'benefit', 'uhc', 'employersponsor', 'health', 'plan', 'max', 'visit', 'receiv', 'addit', 'visit', 'benefit', 'howev', 'still', 'visit', 'use', 'choos', 'physic', 'therapi', 'chiropract', 'care', 'form', 'treatment', 'copay', 'deduct', 'visit', 'waiv', 'receiv', 'three', 'visit', 'cost', 'uhc', 'high', 'expect', 'fiscal', 'physic', 'impact', 'benefit', 'accord', 'uhc', 'analysi', 'health', 'payer', 'expect', '2021', 'opioid', 'use', 'decreas', '19', 'percent', 'spinal', 'imag', 'test', 'frequenc', 'spinal', 'surgeri', 'reduc', '22', 'percent', '21', 'percent', 'respect', 'addit', 'specif', 'goal', 'uhc', 'hope', 'see', 'decreas', 'overal', 'cost', 'spinal', 'care', 'optumlab', 'studi', 'demonstr', 'uhc', 'expect', 'without', 'preced', 'studi', 'look', 'correl', 'outofpocket', 'cost', 'patient', 'util', 'noninvas', 'treatment', 'research', 'discov', 'member', 'whose', 'copay', '30', 'littl', '30', 'percent', 'less', 'like', 'choos', 'physic', 'therapi', 'oppos', 'invas', 'treatment', 'american', 'journal', 'manag', 'care', 'studi', 'june', '2019', 'found', 'patient', 'high', 'deduct', 'typic', '1000', 'less', 'like', 'visit', 'physic', 'therapi', 'elig', 'employ', 'may', 'brand', 'new', 'renew', 'membership', 'must', 'fulli', 'insur', '51', 'employe', 'strong', 'benefit', 'current', 'avail', 'connecticut', 'florida', 'georgia', 'new', 'york', 'north', 'carolina', 'howev', 'uhc', 'plan', 'expand', 'benefit', '2020', '2021', 'end', 'expans', 'period', 'benefit', 'also', 'avail', 'selffund', 'employ', 'organ', 'employe', 'popul', '2', '50', 'benefit', 'span', 'ten', 'state', 
'primarili', 'southeast', 'new', 'benefit', 'design', 'may', 'help', 'encourag', 'peopl', 'low', 'back', 'pain', 'get', 'right', 'care', 'right', 'time', 'right', 'set', 'help', 'expand', 'access', 'evidencebas', 'afford', 'treatment', 'said', 'docimo']
##   Lemmatized
## 0 ['chiropractic', 'adjustment', 'treatment', 'serve', 'need', 'million', 'people', 'around', 'world', 
'adjustment', 'offer', 'effective', 'noninvasive', 'costeffective', 'solution', 'neck', 'back', 'pain', 'well', 'myriad', 'medical', 'issue', 'ever', 'stopped', 'wonder', 'many', 'u', 'suffer', 'neck', 'back', 'stiffness', 'pain', 'apart', 'obvious', 'discomfort', 'simple', 'daily', 'task', 'driving', 'car', 'crossing', 'busy', 'street', 'picking', 'thing', 'floor', 'become', 'challenging', 'individual', 'experiencing', 'pain', 'anyone', 'experienced', 'pain', 'would', 'know', 'restricted', 'movement', 'debilitating', 'unfortunately', 'busy', 'world', 'doesnt', 'allow', 'u', 'stop', 'benefit', 'longterm', 'chiropractic', 'care', 'include', 'chiropractor', 'identify', 'mechanical', 'issue', 'cause', 'spinerelated', 'pain', 'offer', 'series', 'adjustment', 'provide', 'near', 'immediate', 'relief', 'following', 'appointment', 'patient', 'often', 'report', 'feeling', 'symptom', 'noticeably', 'better', 'chiropractor', 'performs', 'adjustment', 'help', 'restore', 'movement', 'joint', 'locked', 'becomes', 'possible', 'treatment', 'allows', 'muscle', 'surrounding', 'joint', 'relax', 'thereby', 'reducing', 'joint', 'stiffness', 'many', 'factor', 'affect', 'health', 'including', 'exercise', 'pattern', 'nutrition', 'sleep', 'heredity', 'environment', 'live', 'rather', 'treat', 'symptom', 'disease', 'chiropractic', 'care', 'focus', 'holistic', 'approach', 'naturally', 'maintain', 'health', 'resist', 'disease', 'chiropractic', 'adjustment', 'help', 'restore', 'normal', 'function', 'movement', 'entire', 'body', 'many', 'patient', 'report', 'improvement', 'ability', 'move', 'efficiency', 'strength', 'many', 'patient', 'find', 'delight', 'result', 'chiropractic', 'adjustment', 'old', 'chronic', 'injury', 'whether', 'injury', 'fact', 'new', 'old', 'chiropractic', 'care', 'help', 'reduce', 'pain', 'restore', 'mobility', 'provide', 'quick', 'pain', 'relief', 'joint', 'body', 'care', 'help', 'maintain', 'better', 'overall', 'health', 'thus', 'faster', 'recovery', 'time', 'ever', 
'noticed', 'pain', 'unable', 'perform', 'regular', 'favorite', 'activity', 'put', 'strain', 'emotional', 'mental', 'wellbeing', 'example', 'increased', 'stress', 'able', 'properly', 'perform', 'paid', 'job', 'turn', 'negative', 'impact', 'physical', 'health', 'increase', 'heart', 'rate', 'blood', 'pressure', 'domino', 'effect', 'often', 'continues', 'sleep', 'becoming', 'disturbed', 'resulting', 'lethargy', 'tiredness', 'day', 'anyone', 'really', 'feel', 'exercising', 'state', 'chiropractic', 'care', 'natural', 'method', 'healing', 'body', 'communication', 'system', 'never', 'relies', 'use', 'pharmaceutical', 'drug', 'invasive', 'surgery', '']
## 1 ['', 'unitedhealthcare', 'combat', 'opioid', 'crisis', 'nonopioid', 'benefit', 'physical', 'therapy', 'chiropractic', 'care', 'prevent', 'reduce', 'expensive', 'invasive', 'spinal', 'procedure', 'imaging', 'surgery', 'reduce', 'opioid', 'use', 'cut', 'cost', 'unitedhealthcare', 'opioid', 'physical', 'therapy', 'healthcare', 'spending', 'october', '29', '2019', 'unitedhealthcare', 'uhc', 'combatting', 'opioid', 'epidemic', 'high', 'healthcare', 'cost', 'new', 'physical', 'therapy', 'chiropractic', 'care', 'benefit', 'prevent', 'delay', 'case', 'substitute', 'invasive', 'spinal', 'procedure', 'million', 'american', 'experiencing', 'low', 'back', 'pain', 'currently', 'point', 'lifetime', 'believe', 'benefit', 'design', 'help', 'make', 'meaningful', 'difference', 'improving', 'health', 'outcome', 'reducing', 'cost', 'said', 'anne', 'docimo', 'md', 'unitedhealthcare', 'chief', 'medical', 'officer', 'lower', 'back', 'pain', 'part', 'responsible', 'sustaining', 'opioid', 'epidemic', 'also', 'increase', 'healthcare', 'cost', 'dig', 'deeper', 'although', 'opioid', 'overdoses', 'fell', 'two', 'percent', '2017', '2018', 'legal', 'battle', 'aim', 'hold', 'pharmaceutical', 'company', 'accountable', 'end', 'sight', 'opioid', 'epidemic', 'industry', 'professional', 'still', 'grappling', 'balance', 'cutting', 'opioid', 'prescription', 'working', 'reduce', 'patient', 'pain', 'common', 'condition', 'low', 'back', 'pain', 'bolster', 'epidemic', 'presence', 'clinician', 'still', 'prescribing', 'opioids', 'best', 'practice', 'recommendation', 'according', 'recent', 'optumlabs', 'study', '9', 'percent', 'patient', 'newly', 'diagnosed', 'low', 'back', 'pain', 'prescribed', 'opioids', 'lower', 'back', 'pain', 'currently', 'contributes', '52', 'percent', 'overall', 'opioid', 'prescription', 'rate', 'addition', 'boosting', 'opioids', 'distribution', 'alternative', 'invasive', 'lower', 'back', 'pain', 'treatment', 'significantly', 'impact', 'healthcare', 'spending', 'new', 
'information', 'physical', 'therapy', 'chiropractic', 'care', 'effective', 'lower', 'cost', 'alternative', 'spinal', 'imaging', 'surgery', 'however', 'payer', 'still', 'process', 'adopting', 'method', 'counteract', 'highcost', 'highrisk', 'potential', 'using', 'opioids', 'treat', 'back', 'pain', 'uhc', 'created', 'benefit', 'rely', 'medication', 'technology', 'rather', 'physical', 'therapy', 'chiropractic', 'care', 'benefit', 'allows', 'eligible', 'employer', 'offer', 'physical', 'therapist', 'chiropractor', 'visit', 'outofpocket', 'cost', 'member', 'already', 'receive', 'physical', 'therapist', 'chiropractic', 'care', 'benefit', 'uhcs', 'employersponsored', 'health', 'plan', 'maxed', 'visit', 'receive', 'additional', 'visit', 'benefit', 'however', 'still', 'visit', 'use', 'choose', 'physical', 'therapy', 'chiropractic', 'care', 'form', 'treatment', 'copay', 'deductible', 'visit', 'waived', 'receive', 'three', 'visit', 'cost', 'uhc', 'high', 'expectation', 'fiscal', 'physical', 'impact', 'benefit', 'according', 'uhcs', 'analysis', 'health', 'payer', 'expects', '2021', 'opioid', 'use', 'decrease', '19', 'percent', 'spinal', 'imaging', 'test', 'frequency', 'spinal', 'surgery', 'reduced', '22', 'percent', '21', 'percent', 'respectively', 'addition', 'specific', 'goal', 'uhc', 'hope', 'see', 'decrease', 'overall', 'cost', 'spinal', 'care', 'optumlabs', 'study', 'demonstrated', 'uhcs', 'expectation', 'without', 'precedent', 'study', 'looked', 'correlation', 'outofpocket', 'cost', 'patient', 'utilization', 'noninvasive', 'treatment', 'researcher', 'discovered', 'member', 'whose', 'copay', '30', 'little', '30', 'percent', 'le', 'likely', 'choose', 'physical', 'therapy', 'opposed', 'invasive', 'treatment', 'american', 'journal', 'managed', 'care', 'study', 'june', '2019', 'found', 'patient', 'high', 'deductible', 'typically', '1000', 'le', 'likely', 'visit', 'physical', 'therapy', 'eligible', 'employer', 'may', 'brand', 'new', 'renewing', 'membership', 'must', 'fully', 
'insured', '51', 'employee', 'strong', 'benefit', 'currently', 'available', 'connecticut', 'florida', 'georgia', 'new', 'york', 'north', 'carolina', 'however', 'uhc', 'plan', 'expand', 'benefit', '2020', '2021', 'end', 'expansion', 'period', 'benefit', 'also', 'available', 'selffunded', 'employer', 'organization', 'employee', 'population', '2', '50', 'benefit', 'span', 'ten', 'state', 'primarily', 'southeast', 'new', 'benefit', 'design', 'may', 'help', 'encourage', 'people', 'low', 'back', 'pain', 'get', 'right', 'care', 'right', 'time', 'right', 'setting', 'helping', 'expand', 'access', 'evidencebased', 'affordable', 'treatment', 'said', 'docimo']
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.20)
from sklearn.feature_extraction.text import CountVectorizer
count_vect=CountVectorizer(analyzer=clean_text)
count_vect_fit=count_vect.fit(X_train['Document'])

count_train=count_vect_fit.transform(X_train['Document'])
count_test=count_vect_fit.transform(X_test['Document'])
len(count_vect_fit.get_feature_names())
## 3170
count_vect_fit.get_feature_names()[200:350]
## ['alzheim', 'amateur', 'amaz', 'ambul', 'amelior', 'america', 'american', 'among', 'amount', 'amplitud', 'amyclarklymphaticdrainagemassag', 'amyclarklymphaticdrainagemassage1', 'analges', 'analyz', 'anatomi', 'ancient', 'andor', 'anecdot', 'anemia', 'anesthesia', 'anesthet', 'anger', 'angion', 'angri', 'anim', 'aniston', 'ankl', 'ann', 'annal', 'annual', 'anoth', 'answer', 'antibiot', 'antibodi', 'antiinflammatori', 'antivir', 'anxieti', 'anxietydepress', 'anxietyfre', 'anxiou', 'anyon', 'anyth', 'anywher', 'apart', 'appear', 'appendix', 'appetit', 'appli', 'applianc', 'applic', 'appoint', 'approach', 'appropri', 'approv', 'approxim', 'apta', 'area', 'areaswer', 'arent', 'argu', 'aris', 'arizona', 'arm', 'armpit', 'around', 'arquett', 'array', 'arriv', 'art', 'arteri', 'arthrit', 'arthriti', 'articl', 'ashi', 'asid', 'ask', 'asleep', 'aspect', 'assert', 'assess', 'assist', 'associ', 'asthma', 'athlet', 'athom', 'atrophi', 'attach', 'attack', 'attent', 'attitud', 'attract', 'australian', 'author', 'authorth', 'autoimmun', 'autonom', 'auxiliari', 'avail', 'averag', 'avoid', 'aw', 'awar', 'away', 'awkward', 'ayurved', 'ba', 'babi', 'back', 'backneck', 'backthough', 'bad', 'badli', 'bag', 'baker', 'balanc', 'ball', 'bamboo', 'band', 'bandag', 'bannist', 'bar', 'barrier', 'basalt', 'base', 'basi', 'basic', 'bath', 'bc', 'beauti', 'becam', 'beck', 'becom', 'bed', 'begin', 'behavior', 'behind', 'belief', 'believ', 'belli', 'ben', 'bench', 'beneath', 'benefici', 'benefit', 'bernardo', 'best', 'bet', 'better', 'betterevidenc', 'betterqu']
count_train_vect=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(count_train.toarray())],axis=1)

count_test_vect=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(count_test.toarray())],axis=1)
count_train_vect.head()
##                                             Document  body_length  ...  3168 3169
## 0  \nGetting Started with Cold Stone Massage Ther...         1868  ...     0    0
## 1  6 Surprising Benefits of Massage Therapy\n\nSu...         2815  ...     0    0
## 2  When to call 911 or go to an emergency room im...          988  ...     0    0
## 3  What Are the Health Benefits of Massage?\n\nMa...         1489  ...     0    0
## 4  Heading to the spa can be a pampering treat, b...         2312  ...     0    0
## 
## [5 rows x 3175 columns]
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(count_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(count_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                           Predicted                            Topic
## 26        physical therapy benefits  mental health services benefits
## 30            chiropractic benefits            chiropractic benefits
## 77                               ER                               ER
## 72       Lymphatic Drainage Massage             dry brushing massage
## 13       Lymphatic Drainage Massage                 massage benefits
## 52              cold stone benefits              cold stone benefits
## 68       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 71       Lymphatic Drainage Massage             dry brushing massage
## 15                 massage benefits                 massage benefits
## 57              cold stone benefits              cold stone benefits
## 1             chiropractic benefits            chiropractic benefits
## 62       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 33            chiropractic benefits            chiropractic benefits
## 25  mental health services benefits  mental health services benefits
## 65       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 7                  massage benefits                 massage benefits
## 8               cold stone benefits                 massage benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.7058823529411765
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 0 0 0 0]
##  [0 3 0 0 0 0 0 0]
##  [0 0 3 0 0 0 0 0]
##  [0 0 0 2 0 0 0 0]
##  [0 2 0 0 0 0 0 0]
##  [0 1 0 1 0 2 0 0]
##  [0 0 0 0 0 0 1 1]
##  [0 0 0 0 0 0 0 0]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##                              ER       1.00      1.00      1.00         1
##      Lymphatic Drainage Massage       0.50      1.00      0.67         3
##           chiropractic benefits       1.00      1.00      1.00         3
##             cold stone benefits       0.67      1.00      0.80         2
##            dry brushing massage       0.00      0.00      0.00         2
##                massage benefits       1.00      0.50      0.67         4
## mental health services benefits       1.00      0.50      0.67         2
##       physical therapy benefits       0.00      0.00      0.00         0
## 
##                        accuracy                           0.71        17
##                       macro avg       0.65      0.62      0.60        17
##                    weighted avg       0.75      0.71      0.68        17
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1439: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
##   'recall', 'true', average, warn_for)
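The 0.00 entries and the UndefinedMetricWarning messages in the report follow from the confusion matrix itself: precision divides by a column sum and recall by a row sum, and either can be zero. The following is a minimal sketch in plain Python that recomputes the per-class figures from the matrix printed above. The helper names `safe_ratio` and `per_class_metrics` are illustrative and not part of the original script, and returning 0.0 for a zero denominator is an assumption that mirrors what the warnings describe.

```python
# Confusion matrix copied from the random forest output above
# (rows = expected class, columns = predicted class).
labels = [
    "ER", "Lymphatic Drainage Massage", "chiropractic benefits",
    "cold stone benefits", "dry brushing massage", "massage benefits",
    "mental health services benefits", "physical therapy benefits",
]
cm = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 3, 0, 0, 0, 0, 0, 0],
    [0, 0, 3, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 0, 0, 0, 0],
    [0, 2, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 2, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

def safe_ratio(numerator, denominator):
    """Return numerator/denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def per_class_metrics(matrix):
    """Return a (precision, recall, support) tuple for each class."""
    n = len(matrix)
    results = []
    for k in range(n):
        true_pos = matrix[k][k]
        predicted_k = sum(matrix[row][k] for row in range(n))  # column sum
        actual_k = sum(matrix[k])                               # row sum
        results.append((safe_ratio(true_pos, predicted_k),
                        safe_ratio(true_pos, actual_k),
                        actual_k))
    return results

metrics = per_class_metrics(cm)
# Accuracy is the diagonal total divided by the number of test documents.
accuracy = sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))

for name, (precision, recall, support) in zip(labels, metrics):
    print(f"{name:33s} precision={precision:.2f} recall={recall:.2f} support={support}")
print(f"accuracy={accuracy:.4f}")
```

The 'dry brushing massage' column sums to zero (no predicted samples), and the 'physical therapy benefits' row sums to zero (no true samples), which are the two cases the warnings name.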
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(count_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(count_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                           Predicted                            Topic
## 26  mental health services benefits  mental health services benefits
## 30            chiropractic benefits            chiropractic benefits
## 77                               ER                               ER
## 72             dry brushing massage             dry brushing massage
## 13                 cupping benefits                 massage benefits
## 52              cold stone benefits              cold stone benefits
## 68       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 71             dry brushing massage             dry brushing massage
## 15                 cupping benefits                 massage benefits
## 57              cold stone benefits              cold stone benefits
## 1             chiropractic benefits            chiropractic benefits
## 62       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 33            chiropractic benefits            chiropractic benefits
## 25  mental health services benefits  mental health services benefits
## 65       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 7                  massage benefits                 massage benefits
## 8                  cupping benefits                 massage benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.8235294117647058
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 0 0 0 0]
##  [0 3 0 0 0 0 0 0]
##  [0 0 3 0 0 0 0 0]
##  [0 0 0 2 0 0 0 0]
##  [0 0 0 0 0 0 0 0]
##  [0 0 0 0 0 2 0 0]
##  [0 0 0 0 3 0 1 0]
##  [0 0 0 0 0 0 0 2]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##                              ER       1.00      1.00      1.00         1
##      Lymphatic Drainage Massage       1.00      1.00      1.00         3
##           chiropractic benefits       1.00      1.00      1.00         3
##             cold stone benefits       1.00      1.00      1.00         2
##                cupping benefits       0.00      0.00      0.00         0
##            dry brushing massage       1.00      1.00      1.00         2
##                massage benefits       1.00      0.25      0.40         4
## mental health services benefits       1.00      1.00      1.00         2
## 
##                        accuracy                           0.82        17
##                       macro avg       0.88      0.78      0.80        17
##                    weighted avg       1.00      0.82      0.86        17
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1439: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
##   'recall', 'true', average, warn_for)
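Both reports list a class with zero support, which means the 17-document test set does not contain every topic. One possible remedy, shown here only as a sketch on made-up data, is to pass `stratify` and `random_state` to `train_test_split` so that every class appears in the test split and the split is repeatable. The toy topics, the `toy_` variable names, and the seed value are assumptions for illustration; stratification also needs at least two documents per class, and enough of them for each class to land on both sides of the split.

```python
from collections import Counter
from sklearn.model_selection import train_test_split

# Toy stand-in for the corpus: 8 hypothetical topics, 10 documents each.
topics = [f"topic_{k}" for k in range(8)]
toy_docs = [f"document {i} about {t}" for t in topics for i in range(10)]
toy_labels = [t for t in topics for _ in range(10)]

toy_X_train, toy_X_test, toy_y_train, toy_y_test = train_test_split(
    toy_docs, toy_labels,
    test_size=0.20,
    stratify=toy_labels,  # keep class proportions equal in both splits
    random_state=42,      # arbitrary seed so the split is repeatable
)

# Each topic contributes the same share of documents to the test split.
print(Counter(toy_y_test))
```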

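def clean_text(text):
    # Strip punctuation character by character and lowercase the rest.
    text="".join([word.lower() for word in text if word not in string.punctuation])
    # Raw string avoids the invalid-escape warning for \W.
    tokens=re.split(r'\W+', text)
    # Drop stopwords and reduce each remaining token to its Porter stem.
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    # Same cleaning as clean_text, but tokens are lemmatized with WordNet instead of stemmed.
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text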

def predict_countRFC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    count_vect=CountVectorizer(analyzer=lemmatize)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
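    # Vectorize the raw text: the analyzer already cleans it, and a pre-tokenized
    # list would be re-joined without spaces into a single unknown token.
    count_test=count_vect_fit.transform(nr['newReview'])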
    
    model = rf.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_count_RFC_80-20:']
    print('\n\n',pred)
    

def predict_countRFC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    count_vect=CountVectorizer(analyzer=clean_text)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
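    # Vectorize the raw text: the analyzer already cleans it, and a pre-tokenized
    # list would be re-joined without spaces into a single unknown token.
    count_test=count_vect_fit.transform(nr['newReview'])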
    
    model = rf.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_count_RFC_80-20:']
    print('\n\n',pred)
    
predict_countRFC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_count_RFC_80-20:            cold stone benefits
predict_countRFC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_count_RFC_80-20:            cold stone benefits
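A callable passed as `analyzer` to `CountVectorizer` is invoked on every input document, so it expects raw strings. The following minimal sketch shows what a cleaning function of this shape returns when it is handed an already-tokenized list instead. `simple_analyzer` is a hypothetical stand-in that keeps the punctuation stripping and splitting of `clean_text` and `lemmatize` but omits stemming, lemmatizing, and stopword removal, so it runs without nltk.

```python
import re
import string

# Hypothetical stand-in for the document's analyzers: same punctuation
# stripping and splitting, but no stemming, lemmatizing, or stopword removal.
def simple_analyzer(text):
    text = "".join(ch.lower() for ch in text if ch not in string.punctuation)
    return [tok for tok in re.split(r"\W+", text) if tok]

raw = "I need a massage!"
tokens_from_raw = simple_analyzer(raw)

# Feeding the token list back through the same analyzer iterates over whole
# words instead of characters, so the join glues them into one string.
tokens_from_list = simple_analyzer(tokens_from_raw)

print(tokens_from_raw)
print(tokens_from_list)
```

The second result is a single glued token that cannot match any vocabulary entry learned from the training documents, so the resulting count vector would be all zeros.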

def clean_text(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text   

def predict_countGBC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    count_vect=CountVectorizer(analyzer=lemmatize)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
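    # Vectorize the raw text: the analyzer already cleans it, and a pre-tokenized
    # list would be re-joined without spaces into a single unknown token.
    count_test=count_vect_fit.transform(nr['newReview'])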
    
    model = gb.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_count_GBC_80-20:']
    print('\n\n',pred)
    

def predict_countGBC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    count_vect=CountVectorizer(analyzer=clean_text)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
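    # Caution: nr['clean'] is already a token list, and the vectorizer applies clean_text to it again.
    # Passing the raw text (nr['newReview']) would run the analyzer only once.
    count_test=count_vect_fit.transform(nr['clean'])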
    
    model = gb.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_count_GBC_80-20:']
    print('\n\n',pred)
    
predict_countGBC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_count_GBC_80-20:               cupping benefits
predict_countGBC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_count_GBC_80-20:               cupping benefits

TF-IDF RFC and GBC


stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
def count_punct(text):
    count=sum([1 for char in text if char in string.punctuation])
    return round(count/(len(text)-text.count(" ")),3)*100
data['body_length']=data['Document'].apply(lambda x: len(x)-x.count(" "))
data['punct%']= data['Document'].apply(lambda x: count_punct(x))

def clean_text(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    # Returns a list of stemmed tokens, as a callable vectorizer analyzer must.
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    # Returns a list of lemmatized tokens, as a callable vectorizer analyzer must.
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text

data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  [chiropractic, adjustment, treatment, serve, n...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...  [, unitedhealthcare, combat, opioid, crisis, n...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...  [, safety, chiropractic, adjustment, chiroprac...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  [advanced, chiropractic, relief, 8, key, benef...
## 4  Heading to the spa can be a pampering treat, b...  ...  [heading, spa, pampering, treat, also, huge, b...
## 
## [5 rows x 10 columns]

from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.20)

from sklearn.feature_extraction.text import TfidfVectorizer

tfidf_vect=TfidfVectorizer(analyzer=clean_text)
tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])

tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
tfidf_test=tfidf_vect_fit.transform(X_test['Document'])
len(tfidf_vect_fit.get_feature_names())
## 3334
tfidf_vect_fit.get_feature_names()[200:350]
## ['alloy', 'allround', 'almost', 'alon', 'along', 'alongsid', 'alreadi', 'also', 'alter', 'altern', 'alternatingli', 'although', 'altogeth', 'alway', 'alzheim', 'amateur', 'amaz', 'amazon', 'ambul', 'america', 'american', 'among', 'amount', 'amplitud', 'amyclarklymphaticdrainagemassag', 'amyclarklymphaticdrainagemassage1', 'analges', 'analysi', 'analyz', 'anatomi', 'ancient', 'andor', 'anecdot', 'anemia', 'anesthesia', 'anesthet', 'anger', 'angion', 'anhedonia', 'anim', 'ankl', 'ann', 'annal', 'announc', 'annual', 'anosognosia', 'anoth', 'answer', 'antibiot', 'antibodi', 'anticip', 'antidepress', 'antiinflammatori', 'antivir', 'anxieti', 'anxietydepress', 'anxietyfre', 'anyon', 'anyth', 'anywher', 'apart', 'appeal', 'appear', 'appendix', 'appetit', 'appl', 'appli', 'applic', 'appoint', 'approach', 'appropri', 'approv', 'approxim', 'apta', 'area', 'arent', 'argu', 'aris', 'arizona', 'arm', 'armpit', 'aromatherapi', 'around', 'array', 'arriv', 'art', 'arthrit', 'arthriti', 'articl', 'ascertain', 'ashi', 'asid', 'ask', 'asleep', 'aspect', 'aspirin', 'assert', 'assess', 'assign', 'assist', 'associ', 'asthma', 'astonishingli', 'athlet', 'athom', 'atlanta', 'atrophi', 'attach', 'attack', 'attempt', 'attent', 'attitud', 'australian', 'author', 'authorth', 'autoimmun', 'autonom', 'avail', 'averag', 'avoid', 'aw', 'awar', 'away', 'awesom', 'awkward', 'axillari', 'ayurved', 'b', 'ba', 'babi', 'back', 'bad', 'badli', 'bag', 'balanc', 'ball', 'bamboo', 'band', 'bandag', 'bank', 'bannist', 'bar', 'barrier', 'basalt', 'base', 'basi', 'basic', 'bath', 'bather', 'bathroom']
tfidf_train_vect=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(tfidf_train.toarray())],axis=1)

tfidf_test_vect=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(tfidf_test.toarray())],axis=1)
tfidf_train_vect.head()
##                                             Document  ...  3333
## 0  What is a lymphatic drainage massage or detox ...  ...   0.0
## 1  \nCupping Therapy\n\n\nWhat Does the Research ...  ...   0.0
## 2  \nEverything You Need To Know About Massage Gu...  ...   0.0
## 3  \nMassage Therapy Styles and Health Benefits\n...  ...   0.0
## 4  Lymphatic Brushing: How to Skin Brush for Deto...  ...   0.0
## 
## [5 rows x 3339 columns]
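
For reference, the weighting behind tfidf_train can be sketched by hand. The version below assumes scikit-learn's documented defaults for TfidfVectorizer: raw term counts, a smoothed idf of ln((1 + n) / (1 + df)) + 1, and L2 normalization of each row. The three toy documents and the helper name tfidf_rows are made up for illustration.

```python
import math
from collections import Counter

def tfidf_rows(token_lists):
    # Document frequency: number of documents containing each term.
    n_docs = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    # Smoothed idf, assuming the TfidfVectorizer default smooth_idf=True.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1 for term, count in df.items()}
    rows = []
    for tokens in token_lists:
        tf = Counter(tokens)
        weights = {term: tf[term] * idf[term] for term in tf}
        # L2-normalize so every document row has unit length.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        rows.append({term: w / norm for term, w in weights.items()})
    return idf, rows

# Toy stemmed documents (illustrative only, not taken from the dataset).
docs = [
    ["massag", "reliev", "muscl", "pain"],
    ["cup", "therapi", "reliev", "pain"],
    ["dri", "brush", "skin"],
]
idf, rows = tfidf_rows(docs)
print(round(idf["pain"], 3), round(idf["massag"], 3))
print({term: round(w, 3) for term, w in rows[0].items()})
```

Terms that occur in fewer documents, such as "massag" here, receive a larger weight than terms shared across documents, such as "pain".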
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
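# Seeds NumPy's global RNG, which the estimators below draw from when random_state is not set.
# The train_test_split call above ran before this line, so this call does not fix the split.
np.random.seed(45678)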
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(tfidf_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(tfidf_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                       Topic
## 65  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 33       chiropractic benefits       chiropractic benefits
## 69  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 41            cupping benefits            cupping benefits
## 38            cupping benefits            cupping benefits
## 39            cupping benefits            cupping benefits
## 50        massage gun benefits        massage gun benefits
## 13  Lymphatic Drainage Massage            massage benefits
## 45        massage gun benefits        massage gun benefits
## 34            cupping benefits            cupping benefits
## 60         cold stone benefits         cold stone benefits
## 7             massage benefits            massage benefits
## 10            massage benefits            massage benefits
## 22   physical therapy benefits   physical therapy benefits
## 1        chiropractic benefits       chiropractic benefits
## 47            massage benefits        massage gun benefits
## 20   physical therapy benefits   physical therapy benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.8823529411764706
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[2 0 0 0 0 0 0]
##  [0 2 0 0 0 0 0]
##  [0 0 1 0 0 0 0]
##  [0 0 0 4 0 0 0]
##  [1 0 0 0 2 0 0]
##  [0 0 0 0 1 2 0]
##  [0 0 0 0 0 0 2]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
## Lymphatic Drainage Massage       0.67      1.00      0.80         2
##      chiropractic benefits       1.00      1.00      1.00         2
##        cold stone benefits       1.00      1.00      1.00         1
##           cupping benefits       1.00      1.00      1.00         4
##           massage benefits       0.67      0.67      0.67         3
##       massage gun benefits       1.00      0.67      0.80         3
##  physical therapy benefits       1.00      1.00      1.00         2
## 
##                   accuracy                           0.88        17
##                  macro avg       0.90      0.90      0.90        17
##               weighted avg       0.90      0.88      0.88        17
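
The figures in the report can be checked directly against the confusion matrix. The sketch below copies the 7x7 RFC matrix printed above (rows are expected classes, columns are predicted classes) and recomputes accuracy, precision, and recall in plain Python. The helper name per_class is an assumption for this example.

```python
labels = [
    "Lymphatic Drainage Massage", "chiropractic benefits", "cold stone benefits",
    "cupping benefits", "massage benefits", "massage gun benefits",
    "physical therapy benefits",
]
# Copied from the TF-IDF RFC confusion matrix printed above.
cm = [
    [2, 0, 0, 0, 0, 0, 0],
    [0, 2, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 4, 0, 0, 0],
    [1, 0, 0, 0, 2, 0, 0],
    [0, 0, 0, 0, 1, 2, 0],
    [0, 0, 0, 0, 0, 0, 2],
]

def per_class(cm, k):
    # Precision divides by the column total (everything predicted as k),
    # recall by the row total (everything that truly is k).
    predicted = sum(row[k] for row in cm)
    actual = sum(cm[k])
    precision = cm[k][k] / predicted if predicted else 0.0
    recall = cm[k][k] / actual if actual else 0.0
    return precision, recall

total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(cm))) / total
print("accuracy", round(accuracy, 4))
for k, name in enumerate(labels):
    p, r = per_class(cm, k)
    print(f"{name:28s} precision={p:.2f} recall={r:.2f}")
```

With only 17 test documents, each misclassification moves accuracy by about 5.9 percentage points, so differences between models of one or two documents should be read with caution.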
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(tfidf_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(tfidf_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                       Topic
## 65  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 33       chiropractic benefits       chiropractic benefits
## 69  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 41            cupping benefits            cupping benefits
## 38            cupping benefits            cupping benefits
## 39            cupping benefits            cupping benefits
## 50            massage benefits        massage gun benefits
## 13  Lymphatic Drainage Massage            massage benefits
## 45            massage benefits        massage gun benefits
## 34            cupping benefits            cupping benefits
## 60         cold stone benefits         cold stone benefits
## 7             massage benefits            massage benefits
## 10            massage benefits            massage benefits
## 22   physical therapy benefits   physical therapy benefits
## 1    physical therapy benefits       chiropractic benefits
## 47       chiropractic benefits        massage gun benefits
## 20   physical therapy benefits   physical therapy benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.7058823529411765
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[2 0 0 0 0 0 0]
##  [0 1 0 0 0 0 1]
##  [0 0 1 0 0 0 0]
##  [0 0 0 4 0 0 0]
##  [1 0 0 0 2 0 0]
##  [0 1 0 0 2 0 0]
##  [0 0 0 0 0 0 2]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
## Lymphatic Drainage Massage       0.67      1.00      0.80         2
##      chiropractic benefits       0.50      0.50      0.50         2
##        cold stone benefits       1.00      1.00      1.00         1
##           cupping benefits       1.00      1.00      1.00         4
##           massage benefits       0.50      0.67      0.57         3
##       massage gun benefits       0.00      0.00      0.00         3
##  physical therapy benefits       0.67      1.00      0.80         2
## 
##                   accuracy                           0.71        17
##                  macro avg       0.62      0.74      0.67        17
##               weighted avg       0.60      0.71      0.64        17
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)
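
The UndefinedMetricWarning comes from the 'massage gun benefits' column: the GBC never predicted that class on this test set, so its precision has a zero denominator and scikit-learn reports 0.0. A minimal check, copying the GBC confusion matrix printed above into a plain list (variable names are illustrative):

```python
labels = [
    "Lymphatic Drainage Massage", "chiropractic benefits", "cold stone benefits",
    "cupping benefits", "massage benefits", "massage gun benefits",
    "physical therapy benefits",
]
# Copied from the TF-IDF GBC confusion matrix printed above.
gbc_cm = [
    [2, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 4, 0, 0, 0],
    [1, 0, 0, 0, 2, 0, 0],
    [0, 1, 0, 0, 2, 0, 0],
    [0, 0, 0, 0, 0, 0, 2],
]

# Column totals count how many times each class was predicted.
predicted_counts = [sum(row[k] for row in gbc_cm) for k in range(len(gbc_cm))]
never_predicted = [labels[k] for k, n in enumerate(predicted_counts) if n == 0]

print(predicted_counts)
print(never_predicted)
```

Newer scikit-learn releases accept a zero_division argument on classification_report to control this case; the environment shown here appears to predate it.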

def clean_text(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text

def predict_tfidfRFC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    tfidf_vect=TfidfVectorizer(analyzer=lemmatize)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
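    # Caution: nr['lemmatized'] is already a token list, and the vectorizer applies lemmatize to it again.
    # Passing the raw text (nr['newReview']) would run the analyzer only once.
    tfidf_test=tfidf_vect_fit.transform(nr['lemmatized'])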
    
    model = rf.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_tfidf_RFC_80-20:']
    print('\n\n',pred)
    

def predict_tfidfRFC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    tfidf_vect=TfidfVectorizer(analyzer=clean_text)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
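    # Caution: nr['clean'] is already a token list, and the vectorizer applies clean_text to it again.
    # Passing the raw text (nr['newReview']) would run the analyzer only once.
    tfidf_test=tfidf_vect_fit.transform(nr['clean'])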
    
    model = rf.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_tfidf_RFC_80-20:']
    print('\n\n',pred)
    
predict_tfidfRFC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_tfidf_RFC_80-20:               massage benefits
predict_tfidfRFC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_tfidf_RFC_80-20:               massage benefits

def predict_tfidfGBC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    tfidf_vect=TfidfVectorizer(analyzer=lemmatize)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
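    # Caution: nr['lemmatized'] is already a token list, and the vectorizer applies lemmatize to it again.
    # Passing the raw text (nr['newReview']) would run the analyzer only once.
    tfidf_test=tfidf_vect_fit.transform(nr['lemmatized'])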
    
    model = gb.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_tfidf_GBC_80-20:']
    print('\n\n',pred)
    

def predict_tfidfGBC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    tfidf_vect=TfidfVectorizer(analyzer=clean_text)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
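    # Caution: nr['clean'] is already a token list, and the vectorizer applies clean_text to it again.
    # Passing the raw text (nr['newReview']) would run the analyzer only once.
    tfidf_test=tfidf_vect_fit.transform(nr['clean'])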
    
    model = gb.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_tfidf_GBC_80-20:']
    print('\n\n',pred)
    
predict_tfidfGBC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_tfidf_GBC_80-20:               massage benefits
predict_tfidfGBC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_tfidf_GBC_80-20:               massage benefits

N-Grams Vectorization for RFC and GBC

stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
def count_punct(text):
    count=sum([1 for char in text if char in string.punctuation])
    return round(count/(len(text)-text.count(" ")),3)*100
data['body_length']=data['Document'].apply(lambda x: len(x)-x.count(" "))
data['punct%']= data['Document'].apply(lambda x: count_punct(x))
def clean_text(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+',text)
    # Joined back into one string: the n-gram CountVectorizer below tokenizes strings itself.
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text

def lemmatize(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    # Joined back into one string for n-gram vectorization. With a callable analyzer
    # (count and TF-IDF sections) this must stay a list; otherwise the vectorizer
    # iterates over the string and returns single characters.
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text
data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  chiropractic adjustment treatment serve need m...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...   unitedhealthcare combat opioid crisis nonopio...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...   safety chiropractic adjustment chiropractic a...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  advanced chiropractic relief 8 key benefit chi...
## 4  Heading to the spa can be a pampering treat, b...  ...  heading spa pampering treat also huge boost he...
## 
## [5 rows x 10 columns]
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.20)
from sklearn.feature_extraction.text import CountVectorizer
n_gram_vect=CountVectorizer(ngram_range=(1,4))
type(X_train['Cleaned_text'])
## <class 'pandas.core.series.Series'>
X_train['Cleaned_text'].head()
## 15    top 5 health benefit regular massag therapi ma...
## 3     advanc chiropract relief 8 key benefit chiropr...
## 16    physic therapi help physic therapi train profe...
## 0     chiropract adjust treatment serv need million ...
## 20    benefit physic therapi peopl think physic ther...
## Name: Cleaned_text, dtype: object
X_train['Lemmatized'].head()
## 15    top 5 health benefit regular massage therapy m...
## 3     advanced chiropractic relief 8 key benefit chi...
## 16    physical therapy help physical therapy trained...
## 0     chiropractic adjustment treatment serve need m...
## 20    benefit physical therapy people think physical...
## Name: Lemmatized, dtype: object
n_gram_vect_fit=n_gram_vect.fit(X_train['Cleaned_text'])


n_gram_train=n_gram_vect_fit.transform(X_train['Cleaned_text'])
n_gram_test=n_gram_vect_fit.transform(X_test['Cleaned_text'])
len(n_gram_vect_fit.get_feature_names())
## 69006
print(n_gram_vect_fit.get_feature_names()[200:500])
## ['2011 research review size', '2011 review', '2011 review 26', '2011 review 26 clinic', '2012', '2012 review', '2012 review studiestrust', '2012 review studiestrust sourc', '2013', '2013 illustr', '2013 illustr benefit', '2013 illustr benefit journal', '2015', '2015 report', '2015 report publish', '2015 report publish journal', '2015 review', '2015 review evid', '2015 review evid found', '2015 systemat', '2015 systemat review', '2015 systemat review conclud', '2016', '2016 michael', '2016 michael phelp', '2016 michael phelp perman', '2016 studi', '2016 studi publish', '2016 studi publish journal', '2016 summer', '2016 summer olymp', '2016 summer olymp share', '2016 summer olympics1', '2016 summer olympics1 use', '20162017', '20162017 wide', '20162017 wide rang', '20162017 wide rang reason', '2017', '2017 2018', '2017 2018 legal', '2017 2018 legal battl', '2017 chiropractor', '2017 chiropractor tout', '2017 chiropractor tout treatment', '2017 nba', '2017 nba final', '2017 nba final irv', '2017 scientist', '2017 scientist analyz', '2017 scientist analyz 11', '2017 studi', '2017 studi found', '2017 studi found structur', '2018', '2018 found', '2018 found chang', '2018 found chang hamstr', '2018 galluppalm', '2018 galluppalm colleg', '2018 galluppalm colleg chiropract', '2018 legal', '2018 legal battl', '2018 legal battl aim', '2018 studi', '2018 studi led', '2018 studi led dr', '2019', '2019 earli', '2019 earli 2020', '2019 earli 2020 mani', '2019 found', '2019 found patient', '2019 found patient high', '2019 massag', '2019 massag gun', '2019 massag gun one', '2019 unitedhealthcar', '2019 unitedhealthcar uhc', '2019 unitedhealthcar uhc combat', '2020', '2020 2021', '2020 2021 end', '2020 2021 end expans', '2020 beyond', '2020 beyond peopl', '2020 beyond peopl say', '2020 mani', '2020 mani peopl', '2020 mani peopl start', '2021', '2021 end', '2021 end expans', '2021 end expans period', '2021 opioid', '2021 opioid use', '2021 opioid use decreas', '20minut', '20minut 
selfmassag', '20minut selfmassag use', '20minut selfmassag use massag', '21', '21 benefit', '21 benefit chiropract', '21 benefit chiropract adjust', '21 benefit might', '21 benefit might known', '21 percent', '21 percent respect', '21 percent respect addit', '22', '22 million', '22 million american', '22 million american visit', '22 percent', '22 percent 21', '22 percent 21 percent', '23', '23 2019', '23 2019 massag', '23 2019 massag gun', '24', '24 separ', '24 separ column', '24 separ column sever', '25', '25 percent', '25 percent american', '25 percent american adult', '25 reason', '25 reason get', '25 reason get massag', '25 show', '25 show 75', '25 show 75 90', '26', '26 clinic', '26 clinic trial', '26 clinic trial look', '272', '272 studi', '272 studi particip', '272 studi particip three', '281', '281 341', '281 341 mani', '281 341 mani taoist', '29', '29 2019', '29 2019 unitedhealthcar', '29 2019 unitedhealthcar uhc', '30', '30 littl', '30 littl 30', '30 littl 30 percent', '30 percent', '30 percent less', '30 percent less like', '30 second', '30 second work', '30 second work along', '300', '300 ad', '300 ad even', '300 ad even earlier', '33', '33 medic', '33 medic one', '33 medic one year', '34', '34 lymphat', '34 lymphat system', '34 lymphat system drain', '341', '341 mani', '341 mani taoist', '341 mani taoist believ', '35', '35 cup', '35 cup first', '35 cup first session', '35 seek', '35 seek relief', '35 seek relief back', '37', '37 studi', '37 studi found', '37 studi found reduct', '38', '38 take', '38 take pain', '38 take pain medic', '400', '400 600', '400 600 massag', '400 600 massag gun', '400 greater', '400 greater immun', '400 greater immun compet', '4357', '4357 local', '4357 local health', '4357 local health depart', '44', '44 million', '44 million peopl', '44 million peopl 1320', '456000', '456000 chiropractor', '456000 chiropractor massag', '456000 chiropractor massag therapist', '48', '48 percent', '48 percent went', '48 percent went doctor', 
'48 receiv', '48 receiv pain', '48 receiv pain reduct', '4pm', '4pm afternoon', '4pm afternoon time', '4pm afternoon time dinner', '50', '50 75', '50 75 improv', '50 75 improv one', '50 benefit', '50 benefit span', '50 benefit span ten', '50 improv', '50 improv research', '50 improv research conclud', '50 million', '50 million american', '50 million american suffer', '50 minut', '50 minut long', '50 minut long say', '50 patient', '50 patient 16', '50 patient 16 complet', '50 state', '50 state howev', '50 state howev mani', '51', '51 employe', '51 employe strong', '51 employe strong benefit', '52', '52 percent', '52 percent overal', '52 percent overal opioid', '53', '53 drugfre', '53 drugfre group', '53 drugfre group continu', '53 sought', '53 sought treatment', '53 sought treatment chiropractor', '57', '57 chiropract', '57 chiropract group', '57 chiropract group achiev', '57 cup', '57 cup british', '57 cup british cup', '60', '60 also', '60 also vulner', '60 also vulner complic', '60 minut', '60 minut massag', '60 minut massag show', '60 second', '60 second brush', '60 second brush use', '600', '600 massag', '600 massag gun', '600 massag gun isnt', '600 massag gun work', '6070', '6070 lymphat', '6070 lymphat tissu', '6070 lymphat tissu want', '62', '62 adult', '62 adult us', '62 adult us neck', '62 million', '62 million peopl', '62 million peopl seen', '63', '63 saw', '63 saw medic', '63 saw medic doctor', '65', '65 mile', '65 mile hour', '65 mile hour allow']
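
The feature names above follow from how ngram_range=(1,4) expands each document: every run of one to four consecutive tokens becomes its own column, which is why this vocabulary reaches 69006 features compared with the 3334 single-token features in the TF-IDF section. A minimal sketch of that expansion, where word_ngrams is a hypothetical helper and not scikit-learn's internal function:

```python
def word_ngrams(tokens, min_n=1, max_n=4):
    # Collect every contiguous run of min_n..max_n tokens, joined by a space.
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[start:start + n]))
    return grams

# Four stemmed tokens that appear together in the feature list above.
tokens = "2016 summer olymp share".split()
grams = word_ngrams(tokens)

print(len(grams))
print(grams)
```

Four tokens already yield ten features. Most of the longer n-grams occur in a single document, so the matrix is very wide and very sparse relative to the number of training documents.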
n_gram_train_df=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(n_gram_train.toarray())],axis=1)

n_gram_test_df=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(n_gram_test.toarray())],axis=1)
n_gram_train_df.head()
##                                             Document  body_length  ...  69004 69005
## 0  Top 5 Health Benefits of Regular Massage Thera...          893  ...      0     0
## 1  Advanced Chiropractic Relief: 8 Key Benefits o...         8624  ...      0     0
## 2  How can physical therapy help?\n\nIn physical ...         7619  ...      0     0
## 3  Chiropractic adjustments and treatments serve ...         2288  ...      0     0
## 4  The Benefits of Physical Therapy\n\n\nWhen peo...         3221  ...      0     0
## 
## [5 rows x 69011 columns]
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(n_gram_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(n_gram_test)
end=time.time()
pred_time=(end-start)

prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                            Topic
## 58            massage benefits              cold stone benefits
## 56         cold stone benefits              cold stone benefits
## 72        dry brushing massage             dry brushing massage
## 36            massage benefits                 cupping benefits
## 12            massage benefits                 massage benefits
## 67            massage benefits       Lymphatic Drainage Massage
## 62            massage benefits       Lymphatic Drainage Massage
## 37            cupping benefits                 cupping benefits
## 52            massage benefits              cold stone benefits
## 57         cold stone benefits              cold stone benefits
## 4             massage benefits                 massage benefits
## 10            massage benefits                 massage benefits
## 17   physical therapy benefits        physical therapy benefits
## 46            massage benefits             massage gun benefits
## 49        massage gun benefits             massage gun benefits
## 29            massage benefits  mental health services benefits
## 69  Lymphatic Drainage Massage       Lymphatic Drainage Massage
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.5882352941176471
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 2 0 0 0]
##  [0 2 0 0 2 0 0 0]
##  [0 0 1 0 1 0 0 0]
##  [0 0 0 1 0 0 0 0]
##  [0 0 0 0 3 0 0 0]
##  [0 0 0 0 1 1 0 0]
##  [0 0 0 0 1 0 0 0]
##  [0 0 0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##      Lymphatic Drainage Massage       1.00      0.33      0.50         3
##             cold stone benefits       1.00      0.50      0.67         4
##                cupping benefits       1.00      0.50      0.67         2
##            dry brushing massage       1.00      1.00      1.00         1
##                massage benefits       0.30      1.00      0.46         3
##            massage gun benefits       1.00      0.50      0.67         2
## mental health services benefits       0.00      0.00      0.00         1
##       physical therapy benefits       1.00      1.00      1.00         1
## 
##                        accuracy                           0.59        17
##                       macro avg       0.79      0.60      0.62        17
##                    weighted avg       0.82      0.59      0.60        17
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(n_gram_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(n_gram_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                           Predicted                            Topic
## 58              cold stone benefits              cold stone benefits
## 56              cold stone benefits              cold stone benefits
## 72             dry brushing massage             dry brushing massage
## 36                 cupping benefits                 cupping benefits
## 12                 massage benefits                 massage benefits
## 67       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 62       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 37                 cupping benefits                 cupping benefits
## 52              cold stone benefits              cold stone benefits
## 57              cold stone benefits              cold stone benefits
## 4                  massage benefits                 massage benefits
## 10                 massage benefits                 massage benefits
## 17        physical therapy benefits        physical therapy benefits
## 46                 massage benefits             massage gun benefits
## 49             massage gun benefits             massage gun benefits
## 29  mental health services benefits  mental health services benefits
## 69       Lymphatic Drainage Massage       Lymphatic Drainage Massage
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.9411764705882353
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[3 0 0 0 0 0 0 0]
##  [0 4 0 0 0 0 0 0]
##  [0 0 2 0 0 0 0 0]
##  [0 0 0 1 0 0 0 0]
##  [0 0 0 0 3 0 0 0]
##  [0 0 0 0 1 1 0 0]
##  [0 0 0 0 0 0 1 0]
##  [0 0 0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##      Lymphatic Drainage Massage       1.00      1.00      1.00         3
##             cold stone benefits       1.00      1.00      1.00         4
##                cupping benefits       1.00      1.00      1.00         2
##            dry brushing massage       1.00      1.00      1.00         1
##                massage benefits       0.75      1.00      0.86         3
##            massage gun benefits       1.00      0.50      0.67         2
## mental health services benefits       1.00      1.00      1.00         1
##       physical therapy benefits       1.00      1.00      1.00         1
## 
##                        accuracy                           0.94        17
##                       macro avg       0.97      0.94      0.94        17
##                    weighted avg       0.96      0.94      0.94        17
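The per-class figures in the report can be traced back to the confusion matrix by hand. The following is a minimal sketch, not part of the original script, that recomputes precision, recall, and F1 from the matrix printed above. The `labels` list and the helper name `per_class_metrics` are introduced here for illustration only, and the label order is assumed to match the order shown in the classification report.

```python
# Confusion matrix copied from the printed output above
# (row = expected class, column = predicted class).
labels = [
    "Lymphatic Drainage Massage",
    "cold stone benefits",
    "cupping benefits",
    "dry brushing massage",
    "massage benefits",
    "massage gun benefits",
    "mental health services benefits",
    "physical therapy benefits",
]
cm = [
    [3, 0, 0, 0, 0, 0, 0, 0],
    [0, 4, 0, 0, 0, 0, 0, 0],
    [0, 0, 2, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 3, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]


def per_class_metrics(cm):
    """Return a (precision, recall, f1) tuple for every class in cm."""
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]                                   # correct predictions of class k
        predicted_k = sum(cm[i][k] for i in range(n))   # column sum: everything predicted as k
        actual_k = sum(cm[k])                           # row sum: everything that truly is k
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


metrics = per_class_metrics(cm)
# Accuracy is the diagonal total divided by the number of test documents.
accuracy = sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))

for name, (p, r, f) in zip(labels, metrics):
    print(f"{name:>32}  {p:.2f}  {r:.2f}  {f:.2f}")
print(f"accuracy {accuracy:.4f}")
```

The single off-diagonal entry is a massage gun document predicted as massage benefits. It lowers the precision of massage benefits to 3/4 and the recall of massage gun benefits to 1/2, and 16 correct predictions out of 17 give the reported accuracy.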
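def clean_text(text):
    # Drop punctuation characters and lowercase everything else
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Split on runs of non-word characters (raw string avoids an invalid-escape warning)
    tokens=re.split(r'\W+',text)
    # Stem every token that is not an English stopword
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text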

def predict_ngramRFC_clean(new_review): 
    # Wrap the raw user text in a one-row DataFrame and stem it
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(clean_text)

    rf=RandomForestClassifier(n_estimators=150,max_depth=None, n_jobs=-1)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    # Learn the 1- to 4-gram vocabulary from the training text only,
    # then encode both the training text and the new input with it
    n_gram_vect_fit=n_gram_vect.fit(X_train['Cleaned_text'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Cleaned_text'])
    n_gram_test=n_gram_vect_fit.transform(nr['clean'])
    
    # The forest is refit on every call before predicting the single input
    model = rf.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['stemmed_1ngram4_RFC_80-20:']
    print('\n\n',pred)
    

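def lemmatize(text):
    # Drop punctuation characters and lowercase everything else
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Split on runs of non-word characters (raw string avoids an invalid-escape warning)
    tokens=re.split(r'\W+', text)
    # Lemmatize every token that is not an English stopword
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text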
    
def predict_ngramRFC_lemma(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemma']=nr['newReview'].apply(lambda x: lemmatize(x))

    rf=RandomForestClassifier(n_estimators=150,max_depth=None, n_jobs=-1)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Lemmatized'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Lemmatized'])
    n_gram_test=n_gram_vect_fit.transform(nr['lemma'])
    
    model = rf.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service:']
    pred.index= ['lemmatized_1ngram4RFC_80-20:']
    print('\n\n',pred)
predict_ngramRFC_clean('I need a massage!') 
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_RFC_80-20:               massage benefits
predict_ngramRFC_lemma('I need a massage!')
## 
## 
##                               Recommended Healthcare Service:
## lemmatized_1ngram4RFC_80-20:                massage benefits
def clean_text(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+',text)
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text

def predict_ngramGBC_clean(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))

    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Cleaned_text'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Cleaned_text'])
    n_gram_test=n_gram_vect_fit.transform(nr['clean'])
    
    model = gb.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['stemmed_1ngram4_GBC_80-20:']
    print('\n\n',pred)
    

def lemmatize(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text    
    
def predict_ngramGBC_lemma(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemma']=nr['newReview'].apply(lambda x: lemmatize(x))

    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Lemmatized'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Lemmatized'])
    n_gram_test=n_gram_vect_fit.transform(nr['lemma'])
    
    model = gb.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service:']
    pred.index= ['lemmatized_1ngram4GBC_80-20:']
    print('\n\n',pred)
predict_ngramGBC_clean('I need a massage!') 
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_GBC_80-20:               massage benefits
predict_ngramGBC_lemma('I need a massage!')
## 
## 
##                               Recommended Healthcare Service:
## lemmatized_1ngram4GBC_80-20:                massage benefits
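
All four prediction functions above rely on `CountVectorizer(ngram_range=(1,4))`, which counts every run of one to four consecutive tokens as a separate feature. The sketch below is a simplified pure-Python illustration of that expansion, not scikit-learn's implementation: it splits on whitespace only, whereas the library's default tokenizer also lowercases the text and drops single-character tokens. The function name `word_ngrams` and the example string are hypothetical.

```python
from collections import Counter


def word_ngrams(text, n_min=1, n_max=4):
    """Return every run of n_min..n_max consecutive whitespace-separated tokens."""
    tokens = text.split()
    grams = []
    for n in range(n_min, n_max + 1):
        # Slide a window of n tokens across the text
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


# Hypothetical, already-cleaned user input (not taken from the dataset)
example = "lower back pain relief"
grams = word_ngrams(example)

# Each distinct n-gram would become one column of the document-term matrix,
# holding the number of times it occurs in the document.
counts = Counter(grams)
for gram, count in counts.items():
    print(f"{count}  {gram}")
```

A four-token input yields 4 + 3 + 2 + 1 = 10 features. A short request such as the one used above therefore matches the training vocabulary mostly through its unigrams, because longer n-grams rarely recur verbatim across documents.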

Fourth part: Stemmed Tokens & 85/15 Train/Test split & RFC | GBC

Count Vectorizer RFC and GBC


stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
data <- read.csv('benefitsContraindications4.csv',sep=',',header=TRUE,  na.strings=c('',' ','NA'))
colnames(data)
## [1] "Document"            "Source"              "Topic"              
## [4] "InternetSearch"      "Contraindications"   "risksAdverseEffects"
head(data,5)
##   Document
## 1 Chiropractic adjustments and treatments serve the needs of millions of people around the world.\n\n \n\nAdjustments offer effective, non-invasive and cost-effective solutions to neck and back pain, as well as a myriad of other medical issues.\n\nHave you ever stopped to wonder how many of us suffer from neck and back stiffness or pain?\n\nApart from the obvious discomfort, simple daily tasks such as driving a car, crossing a busy street and picking things up from the floor can become all too challenging for individuals experiencing such pain.\n\nAs anyone who has experienced pain would know, having
restricted movement can be debilitating and unfortunately, our busy world doesn?t allow for us to stop.\n\n \nSome of the benefits of long-term chiropractic care include:\n\n    Chiropractors can identify mechanical issues that cause spine-related pain and offer a series of adjustments that provide near immediate relief. Following appointments, patients often report feeling their symptoms noticeably better.\n    When a chiropractor performs an adjustment, they can help restore movement in joints that have ?locked up?. This becomes possible as treatment allows muscles surrounding joints to relax, thereby reducing joint stiffness.\n    Many factors affect health, including exercise patterns, nutrition, sleep, heredity and the environment in which we live. Rather than just treat symptoms of the disease, chiropractic care focuses on a holistic approach to naturally maintain health and resist disease.\n    Chiropractic adjustments help restore normal function and movement to the entire body. Many patients report an improvement in their ability to move with efficiency and strength.\n    Many patients find delight in the results chiropractic adjustments have on old and chronic injuries. Whether an injury is, in fact, new or old, chiropractic care can help reduce pain, restore mobility and provide quick pain relief to all joints in the body. Such care can help maintain better overall health and thus faster recovery time.\n\nHave you ever noticed that when you are in pain and unable to perform regular or favorite activities, it can put a strain on emotional and mental well-being?\n\nFor example, the increased stress from not being able to properly perform a paid job. This, in turn, can have a negative impact on physical health with increases in heart rate and blood pressure. The domino effect often continues with sleep becoming disturbed, with resulting lethargy and tiredness during the day. 
Does anyone really feel up to exercising in this state?\n\nChiropractic care is a natural method of healing the body?s communication system and never relies on the use of pharmaceutical drugs or invasive surgery.\n\n\n\n 
## 2 \nUnitedHealthcare Combats Opioid Crisis with Non-Opioid Benefits\nPhysical therapy and chiropractic care can prevent or reduce expensive, invasive spinal procedures, such as imaging or surgery, to reduce opioid use and cut costs.\nUnitedHealthcare, opioid, physical therapy, healthcare spending\n\n\nOctober 29, 2019 - UnitedHealthcare (UHC) is
combatting the opioid epidemic and high healthcare costs with new physical therapy and chiropractic care benefits to prevent, delay, or in some cases substitute for invasive spinal procedures.\n\n"With millions of Americans experiencing low back pain currently or at some point during their lifetimes, we believe this benefit design will help make a meaningful difference by improving health outcomes while reducing costs," said Anne Docimo, MD, UnitedHealthcare chief medical officer.\n\nLower back pain is in part responsible for sustaining the opioid epidemic and also increases healthcare costs.\nDig Deeper\n\nAlthough opioid overdoses fell by two percent from 2017 to 2018 and a legal battles aim to hold pharmaceutical companies accountable, there is no end in sight for the opioid epidemic. Industry professionals are still grappling with the balance between cutting opioid prescriptions will working to reduce patient pain.\n\nCommon conditions such as low back pain bolster the epidemic's presence, with clinicians still prescribing the opioids against best practice recommendations. According to a recent OptumLabs study, 9 percent of patients with newly diagnosed low back pain are prescribed opioids and lower back pain currently contributes 52 percent to the overall opioid prescription rate.\n\nIn addition to boosting opioids distribution, alternative, invasive lower back pain treatments can significantly impact healthcare spending.\n\nIt is not new information that physical therapy and chiropractic care are effective, lower cost alternatives to spinal imaging or surgery. However, payers are still in the process of adopting the method.\n\nTo counteract the high-cost, high-risk potential of using opioids to treat back pain, UHC created a benefit that does not rely on medication or technology but rather on physical therapy and chiropractic care.\n\nThe benefit allows eligible employers to offer physical therapist and chiropractor visits with no out-of-pocket costs. 
Members who already receive physical therapist and chiropractic care benefits under UHC's employer-sponsored health plans and who have maxed out their visits will not receive additional visits under this benefit.\n\nHowever, for those who still have visits to use and who choose physical therapy or chiropractic care over other forms of treatment, the copay or deductible for those visits will be waived and they will receive three visits at no cost.\n\nUHC has high expectations for the fiscal and physical impacts of this benefit.\n\nAccording to UHC's analysis, the health payer expects that by 2021, opioid use will decrease by 19 percent. Spinal imaging test frequency and spinal surgeries will be reduced by 22 percent and 21 percent, respectively. In addition to these specific goals, UHC hopes to see a decrease in the overall cost of spinal care.\n\nThe same OptumLabs study demonstrated that UHC's expectations are not without precedent.\n\nThe study looked at the correlation between out-of-pocket costs and patient utilization of noninvasive treatments. Researchers discovered that members whose copay was over $30 were a little under 30 percent less likely to choose physical therapy as opposed to more invasive treatments.\n\nAn American Journal of Managed Care study in June 2019 found that patients with high deductibles, typically over $1,000, were less likely to visit physical therapy.\n\nEligible employers may be brand new or renewing their membership. They must be fully insured and over 51 or more employees strong. The benefit is currently available in Connecticut, Florida, Georgia, New York, and North Carolina.\n\nHowever, UHC plans to expand the benefit from 2020 into 2021. By the end of this expansion period, the benefit will also be available to self-funded employers and organizations with an employee population between 2 and 50. 
The benefit will span ten states, primarily in the southeast.\n\n"This new benefit design may help encourage people with low back pain to get the right care at the right time and in the right setting, helping expand access to evidence-based and more affordable treatments," said Docimo.
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          The Safety of Chiropractic Adjustments\n\n    Chiropractic Adjustment\n    What Research Shows\n    Safety\n\nChiropractic adjustment, also called spinal manipulation, is a procedure done by a chiropractor using the hands or small instruments to apply controlled force to a spinal joint. The goal is to improve spinal motion and physical function of the entire body. Chiropractic adjustment is safe when performed by someone who is properly trained and licensed to practice chiropractic care. Complications are rare, but they are possible. Learn more about both the benefits and risks.\nChiropractic adjustment\nVerywell / Brianna Gilmartin \nChiropractic Adjustment\n\nOne of the most important reasons people seek chiropractic care is because it is a completely drug-free therapy. Someone dealing with joint pain, back pain, or headaches might consider visiting a chiropractor.\n\nThe goal of chiropractic adjustment is to place the body into a proper position so the body can heal itself. Treatments are believed to reduce stress on the immune system, reducing the potential for disease. 
Chiropractic care aims to address the entire body, including a person?s ability to move, perform, and even think.\nWhat Research Shows\n\nMany people wonder how helpful chiropractic care is in treating years of trauma and poor posture. There have been numerous studies showing the therapeutic benefits of chiropractic care.\nSciatica\n\nSciatica is a type of pain affecting the sciatic nerve, the large nerve extending from the low back down the back of the legs. Other natural therapies don?t always offer relief and most people want to avoid steroid injections and surgery, so they turn to chiropractic care.\n\nA double-blind trial reported in the Spine Journal compared active and simulated chiropractic manipulations in people with sciatic nerve pain. Active manipulations involved the patient laying down and receiving treatment from a chiropractor. Stimulated manipulations involved electrical muscle stimulation with electrodes placed on the skin to send electrical pulses to different parts of the body.\n\nThe researchers determined active manipulation offered more benefits than stimulated. The people who received active manipulations experienced fewer days of moderate or severe pain and other sciatica symptoms. They also reported no adverse effects.\nNeck Pain\n\nOne study reported in the Annals of Internal Medicine looked at different therapies for treating neck pain. They divided 272 study participants into three groups: one that received spinal manipulation from a chiropractic doctor, a second group given over-the-counter (OTC) pain relievers, narcotics, and muscle relaxers, and a third group who did at-home exercises. \n\nAfter 12 weeks, patients reported a 75% pain reduction, with the chiropractic treatment group achieving the most improvement. 
About 57% of the chiropractic group achieved pain reduction, while 48% received pain reduction from exercising, and 33% from medication.\n\nAfter one year, 53% of the drug-free groups continued to report pain relief compared to only 38% of those taking pain medications. \nHeadaches\n\nCervicogenic headaches and migraines are commonly treated by chiropractors. Cervicogenic headaches are often called secondary headaches because pain is usually referred from another source, usually the neck. Migraine headaches cause severe, throbbing pain and are generally experienced on one side of the head. There are few non-medicinal options for managing both types of chronic headaches.\n\nResearch reported in the Journal of Manipulative and Physiological Therapeutics suggests chiropractic care, specifically spinal manipulation, can improve migraines and cervicogenic headaches.  \nFrozen Shoulder\n\nFrozen shoulder affects the shoulder joint and involves pain and stiffness that develops gradually and gets worse. Frozen shoulder can be quite painful, and treatment involves preserving as much range of motion in the shoulder as possible and managing pain.\n\nA clinical trial reported in the Journal of Chiropractic Medicine described how patients suffering from frozen shoulder responded to chiropractic treatment. Of the 50 patients, 16 completely recovered, 25 showed a 75 to 90% improvement, and eight showed a 50 to 75% improvement. Only one person showed zero to 50% improvement. The researchers concluded most people can get improvement by treating frozen shoulder with chiropractic treatment.\nPreventing Need for Surgery\n\nChiropractic care may reduce the need for back surgery. 
Guidelines reported in the Journal of the American Medical Association suggest that it's reasonable for people suffering from back pain to try spinal manipulation before deciding on surgical intervention.\nLow Back Pain\n\nStudies have shown chiropractic care, including spinal manipulation, can provide relief from mild to moderate low back pain. In fact, spinal manipulation may work as well as other standard treatments, including pain-relief medications.\n\nA 2011 review of 26 clinical trials looked at the effectiveness of different treatments for chronic low back pain. What they found was that spinal manipulation is just as effective as other treatments for reducing back pain and improving function.\nSafety\n\n\n
## 4 Advanced Chiropractic Relief: 8 Key Benefits of Chiropractor Care\n\nAre you one of the 50 million Americans who suffer from chronic pain? If so you?re probably intimately familiar with the feeling of pure desperation that can arise from an inability to find relief.\n\nIn addition to physical issues, chronic pain can cause anxiety, depression, and more. However, there could be a light at the end of the tunnel. Many people are finding advanced chiropractic relief that is completely changing their lives.\n\nYour body is a world in itself. At this very moment, more than a million chemical reactions are taking place in your body. It manufactures energy, it regulates your heartbeat, your breathing and it regenerates and heals itself. Everything takes place without your conscious knowledge, without you controlling it voluntarily. The master system that controls it all is your nervous system.\n\nThe nervous system is made out of your brain, spinal cord and all your nerves.\n\nThe energy that flows through your nervous system in your body is like electricity. In order to have that electric flow normally and freely, we need to have a well functioning spine. Whenever you have disruption of that flow, disease happens. That would be the case when your spine is misaligned or is not moving properly.\n\nDid you know that 90% of stimulation and nutrition to the brain is generated by the movement of the spine?\n\nThe more mechanically distorted a person is, the less energy is available for thinking, metabolism and healing.\n\nThis is why it is so important to have a healthy spine, a proper posture, to exercise, to eat properly ? all of it truly matters for your quality of life.\nChiropractors localize the areas of your spine that do not move properly ? referred to as vertebral subluxations ? and adjust them with a specific high speed, but yet gentle, thrust to improve spinal motion.\n\nWant to learn about some of the ways chiropractic care can help you? 
Keep reading for insight into some of the key benefits of seeing a chiropractor.\n\nThe benefits of chiropractic care are numerous:\n\n1. Lower Blood Pressure\n\nStudies show that chiropractic treatment can lower your blood pressure. Sometimes, this works just as well as a prescription blood pressure medication! This benefit can also last for as long as six months after treatment.\n\nHigh blood pressure can cause an array of serious side effects like nausea, fatigue, dizziness, and anxiety. Sufferers who haven?t found relief should consider consulting with a chiropractor. A chiropractic adjustment may be the solution.\n\nSome studies have shown that chiropractic adjustments can also help patients who are suffering from low blood pressure.\n\n2. Reduced Inflammation\n\nIn many cases, joint issues, pain, and tension are caused by inflammation in the body. Chiropractic adjustments can reduce inflammation.\n\nThis leads to relief of muscle tension, chronic back pain, and joint pain. These adjustments can sometimes also slow the progression of inflammation-related diseases, like arthritis.\n\n3. Better Sleep\n\nPatients who receive chiropractic adjustments report a significant improvement in their sleep patterns. If you regularly suffer from insomnia, visiting a chiropractor regularly may help. Also, when you experience pain relief, this will help you get a restful night?s sleep.\n\n4. Digestive Relief\n\nChiropractors often give nutritional advice as part of their services. However, this isn?t the only way that they provide patients with digestive relief.\n\nAdjusting the thoraco-lumbar spine restores the neurological function of your digestive system. Regular adjustments can help with chronic digestive issues.\n\n5. Stress Release\n\nEveryday life can cause muscle cramping, inflammation, and more. When you?re sore from working at a computer, heavy lifting, or just dealing with emotional stress, a chiropractic adjustment can help. 
This leads to greater comfort and advanced pain relief.\n\n6. Improvement of Neurological Conditions\n\nA chiropractic adjustment can also increase blood flow to the brain and increase the flow of cerebral spinal fluid. This means that patients suffering from neurological conditions like epilepsy and multiple sclerosis can significantly benefit from regular adjustments.\n\nThis is a relatively new area of study, but the potential is huge. Those suffering from these conditions will want to do some research. It?s important to find the best chiropractor in their area with experience dealing with these specific types of cases.\n\n7. Chiropractic care can improve communication from your brain to your muscles\n\nResearch seems to show that chiropractic care can improve your brain-body communication, helping your brain to be more aware of what is going on in the body so it can control your body better.\n\nBetter health, more energy and vitality are some of the positive effects of getting your spine adjusted. It sets your vertebrae back into motion freeing up the energy that travels through your nerves.\n\nChiropractic care is a partnership. The results patients want is a combination of what the chiropractor does and what the patient does.\n\nThere are many good things that can be changed and improved for a better lifestyle: exercise, good nutrition, good mental attitude and spinal adjustments.\n\nYour whole body will work better by having your nervous system free of interference. That is the essence of chiropractic care and is designed for you and your family.\n\n8. Pain Relief\n\nPerhaps the most well-known benefit of going to a chiropractor is pain relief. Adjustments can help with a huge array of painful conditions including the following.\n\nNeck and Lower Back Pain\n\nAdjustments are the most effective non-invasive pain relief method for this type of pain. 
They may help patients avoid having to take prescription pain management drugs.\n\nSciatica\n\nTreatments help relieve pressure on the nerve. This results in less severe pain that lasts for a fewer number of days.\n\nHeadaches\n\nChiropractic adjustments help headaches and migraines. They do this by treating back misalignment, muscle tension, and stress. Cervical spine manipulation was associated with significant improvement in headache outcomes in trials involving patients with neck pain and/or neck dysfunction and headache.\n\nChronic headaches can result from the abnormal positioning of the head and can be worsened from neck pressure and movement. Chiropractic removes the interference whether it may be from the distant muscle tightness in the back causing strain on your spine or an abnormal lordotic cervical curve and moving vertebrae.\nChiropractic care can reduce the duration of headaches, lower their intensity when they do occur and limit the frequency of their occurrence all together.\n\nMenstrual cramps\n\nChiropractic treatment removes tension from the pelvis and sacrum. It also regulates the neurological function communicating with the reproductive organs. Adjustments can also relieve the bloating, cramping, and pain associated with menstrual cramps\n\nAnyone who has tried traditional medical treatments and has been unable to find pain relief should experiment with chiropractic care. More often than not, you?ll be pleasantly surprised!\n\nBonus: Advanced Chiropractic Relief\n\nIn addition to the benefits listed above, adjustments can bring advanced chiropractic relief for a wide variety of other conditions as well as overall life improvement. A few examples include:\n\nScoliosis ? adjustments have shown to help with the pain, reduced range of motion, abnormal posture, and even difficulty breathing caused by this abnormal curvature of the spine\n\nVertigo ? 
an adjustment can help realign and balance the spine, thereby reducing the dizziness, nausea, and disorientation caused by vertigo\n\nSinus and allergy relief ? adjusting the upper cervical spine can help drain the sinuses and provide immediate and lasting relief from both long-term and seasonal allergies\n\nExpectant mothers ? women can experience relief from pain and morning sickness and are better able to maintain proper posture during and after pregnancy\n\nChildren?s issues ? treatments have been shown to help children with acid reflux, cholic, and ear infections\nAthletic performance ? the reduction in pain and inflammation is particularly beneficial for professional and amateur athletes\n\nStimulates the immune system ? chiropractic care helps to boost the immune system, speeding up the healing process following illnesses or injuries. One of the most important studies showing the positive effect chiropractic care can have on the immune system and general health was performed by Ronald Pero, Ph.D., chief of cancer prevention research at New York?s Preventive Medicine Institute and professor of medicine at New York University. Dr. Pero measured the immune systems of people under chiropractic care as compared to those in the general population and those with cancer and other serious diseases.\n\nIn his initial three-year study of 107 individuals who had been under chiropractic care for five years or more, the chiropractic patients were found to have a 200% greater immune competence than people who had not received chiropractic care, and 400% greater immune competence than people with cancer and other serious diseases. The immune system superiority of those under chiropractic care did not diminish with age.\n\nDr. Pero stated: ?When applied in a clinical framework, I have never seen a group other than this chiropractic group to experience a 200% increase over the normal patients. This is why it is so dramatically important. 
We have never seen such a positive improvement in a group."\n\nAs you can see, there are almost limitless benefits to seeking chiropractic treatment. If you haven't tried it yet, what are you waiting for?\n\nThere's no need to accept pain and discomfort as a normal part of life. You have nothing to lose and everything to gain, so it only makes sense to find out more about this possibly life-changing approach to improving your health and wellness.
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
Heading to the spa can be a pampering treat, but it can also be a huge boost to your health and wellness! Massage therapy can relieve all sorts of ailments - from physical pain, to stress and anxiety. People who choose to supplement their healthcare regimen with regular massages will not only enjoy a relaxing hour or two at the spa, but they will see the benefits carry through the days and weeks after the appointment!\n\n1\n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\nThese are the 10 most common benefits reported from massage therapy:\n\n1. 
Reduce Stress\n\nA relaxing day at the spa is a great way to unwind and de-stress. However, clients are sure to notice themselves feeling relaxed and at ease for days and even weeks after their appointments!\n\n \n\n2. Improve Circulation\n\nLoosening muscles and tendons allows increased blood flow throughout the body. Improving your circulation can have a number of positive effects on the rest of your body, including reduced fatigue and pain management!\n\n \n\n3. Reduce Pain\n\nMassage therapy is great for working out problem areas like lower back pain and chronic stiffness. A professional therapist will be able to accurately target the source of your pain and help achieve the perfect massage regimen.\n\n \n\n3\n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n4. Eliminate Toxins\n\nStimulating the soft tissues of your body will help to release toxins through your blood and lymphatic systems.\n\n \n\n5. Improve Flexibility\n\nMassage therapy will loosen and relax your muscles, helping your body to achieve its full range of movement potential.\n\n \n\n6. Improve Sleep\n\nA massage will encourage relaxation and boost your mood.  Going to bed with relaxed and loosened muscles promotes more restful sleep, and you'll feel less tired in the morning!\n\n \n\n7. Enhance Immunity\n\nStimulation of the lymph nodes re-charges the body's natural defense system.\n\n \n\n8. Reduce Fatigue\n\nMassage therapy is known to boost mood and promote better quality sleep, thus making you feel more rested and less worn-out at the end of the day.\n\n \n\n2\n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n9. Alleviate Depression and Anxiety\n\nMassage therapy can help to release endorphins in your body, helping you to feel happy, energized, and at ease.\n\n \n\n10. Reduce post-surgery and post-injury swelling\n\nA professional massage is a great way to safely deal with a sports injury or post-surgery rehabilitation.\n\nDo you think that massage therapy could help you find relief in any of these areas? 
What improvements would you like to see in your health? Contact us today with your questions about massage therapy and see how we can help you get on the path to improved health and wellness!
##                                                                                                                                       Source
## 1 https://coremedicalohio.com/benefits-of-long-term-chiropractic-care/?utm_source=ReviveOldPost&utm_medium=social&utm_campaign=ReviveOldPost
## 2                                   https://healthpayerintelligence.com/news/unitedhealthcare-combats-opioid-crisis-with-non-opioid-benefits
## 3                                                                     https://www.verywellhealth.com/is-chiropractic-adjustment-safe-4588279
## 4                                           https://hafkeychiropractic.com/advanced-chiropractic-relief-8-key-benefits-of-chiropractor-care/
## 5                                                                                  https://www.urbannirvana.com/10-benefits-massage-therapy/
##                   Topic InternetSearch
## 1 chiropractic benefits           <NA>
## 2 chiropractic benefits           <NA>
## 3 chiropractic benefits           <NA>
## 4 chiropractic benefits           <NA>
## 5      massage benefits         google
##                                                                                                                                                                                               Contraindications
## 1 Doctors of Chiropractic work collaboratively with other healthcare professionals. Should your condition require the attention of another healthcare profession, that recommendation or referral will be made.
## 2                                                                                                                                                                                                          <NA>
## 3                                                                                                                                                                                                          <NA>
## 4                                                                                                                                                                                                          <NA>
## 5                                                                                                                                                                                                          <NA>
##   risksAdverseEffects
## 1 <NA>
## 2 <NA>
## 3 Risks and side effects associated with chiropractic adjustments may include:\n\n    temporary headaches\n    fatigue after treatment\n    discomfort in parts of the body that were treated\n\nRare but serious risks associated with chiropractic adjustment include:\n\n    stroke\n    cauda equina syndrome, a condition involving pinched nerves in the lower part of the spinal canal\n    worsening of herniated disks (although research isn't conclusive)\n\nIn addition to effectiveness, research has focused on the safety of chiropractic treatments, mainly spinal manipulation. \n\nOne 2017 review of 250 articles looked at serious adverse events and benign events associated with chiropractic care. Based on the evidence the researchers reviewed, serious adverse events accounted for one out of every two million spinal manipulations to 13 per 10,00 patients. Serious adverse events included spinal or neurological problems and cervical arterial strokes (dissection of any of the arteries in the neck).\n\nBenign events were more common and included more pain and higher levels of neck problems, but most were short-term problems.\n\nThe researchers confirmed serious adverse events were rare and often related to other preexisting conditions, while benign events are more common. However, the reasons for any types of adverse events are unknown.\n\nA second 2017 review looked 118 articles and found frequently described adverse events include stroke, headache and vertebral artery dissection (cervical arterial stroke). Forty-six percent of the reviews determined that spinal manipulation was safe, while 13% expressed concern of harm. The remaining studies were unclear or neutral. While the researchers did not offer an overall conclusion, they determined spinal manipulation can significantly be helpful, and some risk does exist.\nA Word From Verywell   When chiropractors are correctly trained and licensed, chiropractic care is safe. 
Mild side effects are to be expected and include temporary soreness, stiffness, and tenderness in the treated area. However, you still want to do your research. Ask for a referral from your doctor. Look at the chiropractor's website, including patient reviews. Meet with the chiropractor to discuss his or her treatment practices and ask about possible adverse effects related to treatment.\n\nIf you decide a chiropractor isn't for you, consider seeing an osteopathic doctor. Osteopaths are fully licensed doctors who can practice all areas of medicine. They have received special training on the musculoskeletal system, which includes manual readjustments, myofascial release and other physical manipulation of bones and muscle tissues.
## 4 <NA>
## 5 <NA>
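def count_punct(text):
    # Percentage of non-space characters that are punctuation marks
    non_space = len(text) - text.count(" ")
    if non_space == 0:  # guard against empty or all-space documents
        return 0.0
    count = sum(1 for char in text if char in string.punctuation)
    return round(count / non_space * 100, 1)

# Document length excluding spaces
data['body_length'] = data['Document'].apply(lambda x: len(x) - x.count(" "))
data['punct%'] = data['Document'].apply(count_punct)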

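def clean_text(text):
    # Lowercase and drop punctuation, character by character
    text = "".join([char.lower() for char in text if char not in string.punctuation])
    tokens = re.split(r'\W+', text)  # raw string avoids the invalid-escape warning
    # Stem each non-stopword token; returns a list for the count vectorizer
    text = [ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    # Lowercase and drop punctuation, character by character
    text = "".join([char.lower() for char in text if char not in string.punctuation])
    tokens = re.split(r'\W+', text)  # raw string avoids the invalid-escape warning
    # Lemmatize each non-stopword token; returns a list for the count vectorizer
    text = [wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text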

data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  [chiropractic, adjustment, treatment, serve, n...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...  [, unitedhealthcare, combat, opioid, crisis, n...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...  [, safety, chiropractic, adjustment, chiroprac...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  [advanced, chiropractic, relief, 8, key, benef...
## 4  Heading to the spa can be a pampering treat, b...  ...  [heading, spa, pampering, treat, also, huge, b...
## 
## [5 rows x 10 columns]
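
The empty first token in rows 1 and 2 of the Lemmatized column above (`[, unitedhealthcare, ...` and `[, safety, ...`) is consistent with how `re.split` treats text that begins with a separator: those documents start with a newline or a space, so the split yields an empty string before the first word. The following is a minimal sketch of that behavior and of one possible filter, using only the standard library. The helper name `split_tokens`, its `drop_empty` flag, and the sample string are illustrative assumptions and are not part of the original script.

```python
import re
import string

def split_tokens(text, drop_empty=True):
    """Lowercase, strip punctuation, and split on non-word characters."""
    # Same character-level punctuation removal as clean_text/lemmatize
    text = "".join(ch.lower() for ch in text if ch not in string.punctuation)
    tokens = re.split(r'\W+', text)
    if drop_empty:
        # re.split returns '' when the text starts or ends with a separator
        tokens = [tok for tok in tokens if tok]
    return tokens

# Hypothetical sample that, like document 1, begins with a newline
sample = "\nUnitedHealthcare Combats Opioid Crisis"
raw_tokens = split_tokens(sample, drop_empty=False)
kept_tokens = split_tokens(sample)
print(raw_tokens)   # the leading empty string survives the split
print(kept_tokens)  # empty strings filtered out
```

Whether to drop the empty tokens is a design choice: left in, they would become a feature shared by every document that starts or ends with whitespace or punctuation, which is unlikely to carry topical signal.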
data.to_csv('dataCleanLemm.csv')
#DATA = pd.read_csv('dataCleanLemm.csv', encoding='unicode_escape')
DATA <- read.csv('dataCleanLemm.csv', sep=',', header=TRUE, na.strings=c('',' ','NA'), row.names=1)
colnames(DATA)
##  [1] "Document"          "Source"            "Topic"            
##  [4] "InternetSearch"    "Contraindications" "RisksSideEffects" 
##  [7] "body_length"       "punct."            "Cleaned_text"     
## [10] "Lemmatized"
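
Because `Cleaned_text` and `Lemmatized` held Python lists before the export, the CSV stores each list as its string representation, so any later read (in R as above, or in Python) gets text rather than a list. The following is a minimal sketch of recovering the list in Python with `ast.literal_eval`. It uses the standard `csv` module in place of pandas, and the in-memory buffer and sample row are assumptions made for illustration only.

```python
import ast
import csv
import io

tokens = ["chiropractic", "adjustment", "treatment"]

# Write one row whose second field is a Python list (stringified via str())
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Topic", "Lemmatized"])
writer.writerow(["chiropractic benefits", tokens])

# Read it back: the list now arrives as a plain string
buffer.seek(0)
rows = list(csv.DictReader(buffer))
stored = rows[0]["Lemmatized"]
restored = ast.literal_eval(stored)  # parse the string back into a list

print(type(stored).__name__, type(restored).__name__)
```

`ast.literal_eval` only evaluates Python literals, which makes it a safer choice than `eval` for text read from a file.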
head(DATA,2)
##   Document
## 0 Chiropractic adjustments and treatments serve the needs of millions of people around the world.\n\n \n\nAdjustments offer effective, non-invasive and cost-effective solutions to neck and back pain, as well as a myriad of other medical issues.\n\nHave you ever stopped to wonder how many of us suffer from neck and back stiffness or pain?\n\nApart from the obvious discomfort, simple daily tasks such as driving a car, crossing a busy street and picking things up from the floor can become all too challenging for individuals experiencing such pain.\n\nAs anyone who has experienced pain would know, having restricted movement can be debilitating and unfortunately, our busy world doesn't allow for us to stop.\n\n \nSome of the benefits of long-term chiropractic care include:\n\n    Chiropractors can identify mechanical issues that cause spine-related pain and offer a series of adjustments that provide near immediate relief. Following appointments, patients often report feeling their symptoms noticeably better.\n    When a chiropractor performs an adjustment, they can help restore movement in joints that have "locked up". This becomes possible as treatment allows muscles surrounding joints to relax, thereby reducing joint stiffness.\n    Many factors affect health, including exercise patterns, nutrition, sleep, heredity and the environment in which we live. Rather than just treat symptoms of the disease, chiropractic care focuses on a holistic approach to naturally maintain health and resist disease.\n    Chiropractic adjustments help restore normal function and movement to the entire body. Many patients report an improvement in their ability to move with efficiency and strength.\n    Many patients find delight in the results chiropractic adjustments have on old and chronic injuries. Whether an injury is, in fact, new or old, chiropractic care can help reduce pain, restore mobility and provide quick pain relief to all joints in the body. 
Such care can help maintain better overall health and thus faster recovery time.\n\nHave you ever noticed that when you are in pain and unable to perform regular or favorite activities, it can put a strain on emotional and mental well-being?\n\nFor example, the increased stress from not being able to properly perform a paid job. This, in turn, can have a negative impact on physical health with increases in heart rate and blood pressure. The domino effect often continues with sleep becoming disturbed, with resulting lethargy and tiredness during the day. Does anyone really feel up to exercising in this state?\n\nChiropractic care is a natural method of healing the body's communication system and never relies on the use of pharmaceutical drugs or invasive surgery.\n\n\n\n 
## 1 \nUnitedHealthcare Combats Opioid Crisis with Non-Opioid Benefits\nPhysical therapy and chiropractic care can prevent or reduce expensive, invasive spinal procedures, such as imaging or surgery, to reduce opioid use and cut costs.\nUnitedHealthcare, opioid, physical therapy, healthcare spending\n\n\nOctober 29, 2019 - UnitedHealthcare (UHC) is combatting the opioid epidemic and high healthcare costs with new physical therapy and chiropractic care benefits to prevent, delay, or in some cases substitute for invasive spinal procedures.\n\n"With millions of Americans experiencing low back pain currently or at some point during their lifetimes, we believe this benefit design will help make a meaningful difference by improving health outcomes while reducing costs," said Anne Docimo, MD, UnitedHealthcare chief medical officer.\n\nLower back pain is in part responsible for sustaining the opioid epidemic and also increases healthcare costs.\nDig Deeper\n\nAlthough opioid overdoses fell by two percent from 2017 to 2018 and a legal battles aim to hold pharmaceutical companies accountable, there is no end in sight for the opioid epidemic. Industry professionals are still grappling with the balance between cutting opioid prescriptions will working to reduce patient pain.\n\nCommon conditions such as low back pain bolster the epidemic's presence, with clinicians still prescribing the opioids against best practice recommendations. According to a recent OptumLabs study, 9 percent of patients with newly diagnosed low back pain are prescribed opioids and lower back pain currently contributes 52 percent to the overall opioid prescription rate.\n\nIn addition to boosting opioids distribution, alternative, invasive lower back pain treatments can significantly impact healthcare spending.\n\nIt is not new information that physical therapy and chiropractic care are effective, lower cost alternatives to spinal imaging or surgery. 
However, payers are still in the process of adopting the method.\n\nTo counteract the high-cost, high-risk potential of using opioids to treat back pain, UHC created a benefit that does not rely on medication or technology but rather on physical therapy and chiropractic care.\n\nThe benefit allows eligible employers to offer physical therapist and chiropractor visits with no out-of-pocket costs. Members who already receive physical therapist and chiropractic care benefits under UHC's employer-sponsored health plans and who have maxed out their visits will not receive additional visits under this benefit.\n\nHowever, for those who still have visits to use and who choose physical therapy or chiropractic care over other forms of treatment, the copay or deductible for those visits will be waived and they will receive three visits at no cost.\n\nUHC has high expectations for the fiscal and physical impacts of this benefit.\n\nAccording to UHC's analysis, the health payer expects that by 2021, opioid use will decrease by 19 percent. Spinal imaging test frequency and spinal surgeries will be reduced by 22 percent and 21 percent, respectively. In addition to these specific goals, UHC hopes to see a decrease in the overall cost of spinal care.\n\nThe same OptumLabs study demonstrated that UHC's expectations are not without precedent.\n\nThe study looked at the correlation between out-of-pocket costs and patient utilization of noninvasive treatments. Researchers discovered that members whose copay was over $30 were a little under 30 percent less likely to choose physical therapy as opposed to more invasive treatments.\n\nAn American Journal of Managed Care study in June 2019 found that patients with high deductibles, typically over $1,000, were less likely to visit physical therapy.\n\nEligible employers may be brand new or renewing their membership. They must be fully insured and over 51 or more employees strong. 
The benefit is currently available in Connecticut, Florida, Georgia, New York, and North Carolina.\n\nHowever, UHC plans to expand the benefit from 2020 into 2021. By the end of this expansion period, the benefit will also be available to self-funded employers and organizations with an employee population between 2 and 50. The benefit will span ten states, primarily in the southeast.\n\n"This new benefit design may help encourage people with low back pain to get the right care at the right time and in the right setting, helping expand access to evidence-based and more affordable treatments," said Docimo.
##                                                                                                                                       Source
## 0 https://coremedicalohio.com/benefits-of-long-term-chiropractic-care/?utm_source=ReviveOldPost&utm_medium=social&utm_campaign=ReviveOldPost
## 1                                   https://healthpayerintelligence.com/news/unitedhealthcare-combats-opioid-crisis-with-non-opioid-benefits
##                   Topic InternetSearch
## 0 chiropractic benefits           <NA>
## 1 chiropractic benefits           <NA>
##                                                                                                                                                                                               Contraindications
## 0 Doctors of Chiropractic work collaboratively with other healthcare professionals. Should your condition require the attention of another healthcare profession, that recommendation or referral will be made.
## 1                                                                                                                                                                                                          <NA>
##   RisksSideEffects body_length punct.
## 0             <NA>        2288    2.4
## 1             <NA>        3796    2.4
##   Cleaned_text
## 0 ['chiropract', 'adjust', 'treatment', 'serv', 'need', 'million', 'peopl', 'around', 'world', 'adjust', 'offer', 'effect', 'noninvas', 'costeffect', 'solut', 'neck', 'back', 'pain', 'well', 'myriad', 'medic', 'issu', 'ever', 'stop', 'wonder', 
'mani', 'us', 'suffer', 'neck', 'back', 'stiff', 'pain', 'apart', 'obviou', 'discomfort', 'simpl', 'daili', 'task', 'drive', 'car', 'cross', 'busi', 'street', 'pick', 'thing', 'floor', 'becom', 'challeng', 'individu', 'experienc', 'pain', 'anyon', 'experienc', 'pain', 'would', 'know', 'restrict', 'movement', 'debilit', 'unfortun', 'busi', 'world', 'doesnt', 'allow', 'us', 'stop', 'benefit', 'longterm', 'chiropract', 'care', 'includ', 'chiropractor', 'identifi', 'mechan', 'issu', 'caus', 'spinerel', 'pain', 'offer', 'seri', 'adjust', 'provid', 'near', 'immedi', 'relief', 'follow', 'appoint', 'patient', 'often', 'report', 'feel', 'symptom', 'notic', 'better', 'chiropractor', 'perform', 'adjust', 'help', 'restor', 'movement', 'joint', 'lock', 'becom', 'possibl', 'treatment', 'allow', 'muscl', 'surround', 'joint', 'relax', 'therebi', 'reduc', 'joint', 'stiff', 'mani', 'factor', 'affect', 'health', 'includ', 'exercis', 'pattern', 'nutrit', 'sleep', 'hered', 'environ', 'live', 'rather', 'treat', 'symptom', 'diseas', 'chiropract', 'care', 'focus', 'holist', 'approach', 'natur', 'maintain', 'health', 'resist', 'diseas', 'chiropract', 'adjust', 'help', 'restor', 'normal', 'function', 'movement', 'entir', 'bodi', 'mani', 'patient', 'report', 'improv', 'abil', 'move', 'effici', 'strength', 'mani', 'patient', 'find', 'delight', 'result', 'chiropract', 'adjust', 'old', 'chronic', 'injuri', 'whether', 'injuri', 'fact', 'new', 'old', 'chiropract', 'care', 'help', 'reduc', 'pain', 'restor', 'mobil', 'provid', 'quick', 'pain', 'relief', 'joint', 'bodi', 'care', 'help', 'maintain', 'better', 'overal', 'health', 'thu', 'faster', 'recoveri', 'time', 'ever', 'notic', 'pain', 'unabl', 'perform', 'regular', 'favorit', 'activ', 'put', 'strain', 'emot', 'mental', 'wellb', 'exampl', 'increas', 'stress', 'abl', 'properli', 'perform', 'paid', 'job', 'turn', 'neg', 'impact', 'physic', 'health', 'increas', 'heart', 'rate', 'blood', 'pressur', 'domino', 'effect', 'often', 'continu', 'sleep', 
'becom', 'disturb', 'result', 'lethargi', 'tired', 'day', 'anyon', 'realli', 'feel', 'exercis', 'state', 'chiropract', 'care', 'natur', 'method', 'heal', 'bodi', 'commun', 'system', 'never', 'reli', 'use', 'pharmaceut', 'drug', 'invas', 'surgeri', '']
## 1 ['', 'unitedhealthcar', 'combat', 'opioid', 'crisi', 'nonopioid', 'benefit', 'physic', 'therapi', 'chiropract', 'care', 'prevent', 'reduc', 'expens', 'invas', 'spinal', 'procedur', 'imag', 'surgeri', 'reduc', 'opioid', 'use', 'cut', 'cost', 'unitedhealthcar', 'opioid', 'physic', 'therapi', 'healthcar', 'spend', 'octob', '29', '2019', 'unitedhealthcar', 'uhc', 'combat', 'opioid', 'epidem', 'high', 'healthcar', 'cost', 'new', 'physic', 'therapi', 'chiropract', 'care', 'benefit', 'prevent', 'delay', 'case', 'substitut', 'invas', 'spinal', 'procedur', 'million', 'american', 'experienc', 'low', 'back', 'pain', 'current', 'point', 'lifetim', 'believ', 'benefit', 'design', 'help', 'make', 'meaning', 'differ', 'improv', 'health', 'outcom', 'reduc', 'cost', 'said', 'ann', 'docimo', 'md', 'unitedhealthcar', 'chief', 'medic', 'offic', 'lower', 'back', 'pain', 'part', 'respons', 'sustain', 'opioid', 'epidem', 'also', 'increas', 'healthcar', 'cost', 'dig', 'deeper', 'although', 'opioid', 'overdos', 'fell', 'two', 'percent', '2017', '2018', 'legal', 'battl', 'aim', 'hold', 'pharmaceut', 'compani', 'account', 'end', 'sight', 'opioid', 'epidem', 'industri', 'profession', 'still', 'grappl', 'balanc', 'cut', 'opioid', 'prescript', 'work', 'reduc', 'patient', 'pain', 'common', 'condit', 'low', 'back', 'pain', 'bolster', 'epidem', 'presenc', 'clinician', 'still', 'prescrib', 'opioid', 'best', 'practic', 'recommend', 'accord', 'recent', 'optumlab', 'studi', '9', 'percent', 'patient', 'newli', 'diagnos', 'low', 'back', 'pain', 'prescrib', 'opioid', 'lower', 'back', 'pain', 'current', 'contribut', '52', 'percent', 'overal', 'opioid', 'prescript', 'rate', 'addit', 'boost', 'opioid', 'distribut', 'altern', 'invas', 'lower', 'back', 'pain', 'treatment', 'significantli', 'impact', 'healthcar', 'spend', 'new', 'inform', 'physic', 'therapi', 'chiropract', 'care', 'effect', 'lower', 'cost', 'altern', 'spinal', 'imag', 'surgeri', 'howev', 'payer', 'still', 'process', 'adopt', 'method', 
'counteract', 'highcost', 'highrisk', 'potenti', 'use', 'opioid', 'treat', 'back', 'pain', 'uhc', 'creat', 'benefit', 'reli', 'medic', 'technolog', 'rather', 'physic', 'therapi', 'chiropract', 'care', 'benefit', 'allow', 'elig', 'employ', 'offer', 'physic', 'therapist', 'chiropractor', 'visit', 'outofpocket', 'cost', 'member', 'alreadi', 'receiv', 'physic', 'therapist', 'chiropract', 'care', 'benefit', 'uhc', 'employersponsor', 'health', 'plan', 'max', 'visit', 'receiv', 'addit', 'visit', 'benefit', 'howev', 'still', 'visit', 'use', 'choos', 'physic', 'therapi', 'chiropract', 'care', 'form', 'treatment', 'copay', 'deduct', 'visit', 'waiv', 'receiv', 'three', 'visit', 'cost', 'uhc', 'high', 'expect', 'fiscal', 'physic', 'impact', 'benefit', 'accord', 'uhc', 'analysi', 'health', 'payer', 'expect', '2021', 'opioid', 'use', 'decreas', '19', 'percent', 'spinal', 'imag', 'test', 'frequenc', 'spinal', 'surgeri', 'reduc', '22', 'percent', '21', 'percent', 'respect', 'addit', 'specif', 'goal', 'uhc', 'hope', 'see', 'decreas', 'overal', 'cost', 'spinal', 'care', 'optumlab', 'studi', 'demonstr', 'uhc', 'expect', 'without', 'preced', 'studi', 'look', 'correl', 'outofpocket', 'cost', 'patient', 'util', 'noninvas', 'treatment', 'research', 'discov', 'member', 'whose', 'copay', '30', 'littl', '30', 'percent', 'less', 'like', 'choos', 'physic', 'therapi', 'oppos', 'invas', 'treatment', 'american', 'journal', 'manag', 'care', 'studi', 'june', '2019', 'found', 'patient', 'high', 'deduct', 'typic', '1000', 'less', 'like', 'visit', 'physic', 'therapi', 'elig', 'employ', 'may', 'brand', 'new', 'renew', 'membership', 'must', 'fulli', 'insur', '51', 'employe', 'strong', 'benefit', 'current', 'avail', 'connecticut', 'florida', 'georgia', 'new', 'york', 'north', 'carolina', 'howev', 'uhc', 'plan', 'expand', 'benefit', '2020', '2021', 'end', 'expans', 'period', 'benefit', 'also', 'avail', 'selffund', 'employ', 'organ', 'employe', 'popul', '2', '50', 'benefit', 'span', 'ten', 'state', 
'primarili', 'southeast', 'new', 'benefit', 'design', 'may', 'help', 'encourag', 'peopl', 'low', 'back', 'pain', 'get', 'right', 'care', 'right', 'time', 'right', 'set', 'help', 'expand', 'access', 'evidencebas', 'afford', 'treatment', 'said', 'docimo']
##   Lemmatized
## 0 ['chiropractic', 'adjustment', 'treatment', 'serve', 'need', 'million', 'people', 'around', 'world', 
'adjustment', 'offer', 'effective', 'noninvasive', 'costeffective', 'solution', 'neck', 'back', 'pain', 'well', 'myriad', 'medical', 'issue', 'ever', 'stopped', 'wonder', 'many', 'u', 'suffer', 'neck', 'back', 'stiffness', 'pain', 'apart', 'obvious', 'discomfort', 'simple', 'daily', 'task', 'driving', 'car', 'crossing', 'busy', 'street', 'picking', 'thing', 'floor', 'become', 'challenging', 'individual', 'experiencing', 'pain', 'anyone', 'experienced', 'pain', 'would', 'know', 'restricted', 'movement', 'debilitating', 'unfortunately', 'busy', 'world', 'doesnt', 'allow', 'u', 'stop', 'benefit', 'longterm', 'chiropractic', 'care', 'include', 'chiropractor', 'identify', 'mechanical', 'issue', 'cause', 'spinerelated', 'pain', 'offer', 'series', 'adjustment', 'provide', 'near', 'immediate', 'relief', 'following', 'appointment', 'patient', 'often', 'report', 'feeling', 'symptom', 'noticeably', 'better', 'chiropractor', 'performs', 'adjustment', 'help', 'restore', 'movement', 'joint', 'locked', 'becomes', 'possible', 'treatment', 'allows', 'muscle', 'surrounding', 'joint', 'relax', 'thereby', 'reducing', 'joint', 'stiffness', 'many', 'factor', 'affect', 'health', 'including', 'exercise', 'pattern', 'nutrition', 'sleep', 'heredity', 'environment', 'live', 'rather', 'treat', 'symptom', 'disease', 'chiropractic', 'care', 'focus', 'holistic', 'approach', 'naturally', 'maintain', 'health', 'resist', 'disease', 'chiropractic', 'adjustment', 'help', 'restore', 'normal', 'function', 'movement', 'entire', 'body', 'many', 'patient', 'report', 'improvement', 'ability', 'move', 'efficiency', 'strength', 'many', 'patient', 'find', 'delight', 'result', 'chiropractic', 'adjustment', 'old', 'chronic', 'injury', 'whether', 'injury', 'fact', 'new', 'old', 'chiropractic', 'care', 'help', 'reduce', 'pain', 'restore', 'mobility', 'provide', 'quick', 'pain', 'relief', 'joint', 'body', 'care', 'help', 'maintain', 'better', 'overall', 'health', 'thus', 'faster', 'recovery', 'time', 'ever', 
'noticed', 'pain', 'unable', 'perform', 'regular', 'favorite', 'activity', 'put', 'strain', 'emotional', 'mental', 'wellbeing', 'example', 'increased', 'stress', 'able', 'properly', 'perform', 'paid', 'job', 'turn', 'negative', 'impact', 'physical', 'health', 'increase', 'heart', 'rate', 'blood', 'pressure', 'domino', 'effect', 'often', 'continues', 'sleep', 'becoming', 'disturbed', 'resulting', 'lethargy', 'tiredness', 'day', 'anyone', 'really', 'feel', 'exercising', 'state', 'chiropractic', 'care', 'natural', 'method', 'healing', 'body', 'communication', 'system', 'never', 'relies', 'use', 'pharmaceutical', 'drug', 'invasive', 'surgery', '']
## 1 ['', 'unitedhealthcare', 'combat', 'opioid', 'crisis', 'nonopioid', 'benefit', 'physical', 'therapy', 'chiropractic', 'care', 'prevent', 'reduce', 'expensive', 'invasive', 'spinal', 'procedure', 'imaging', 'surgery', 'reduce', 'opioid', 'use', 'cut', 'cost', 'unitedhealthcare', 'opioid', 'physical', 'therapy', 'healthcare', 'spending', 'october', '29', '2019', 'unitedhealthcare', 'uhc', 'combatting', 'opioid', 'epidemic', 'high', 'healthcare', 'cost', 'new', 'physical', 'therapy', 'chiropractic', 'care', 'benefit', 'prevent', 'delay', 'case', 'substitute', 'invasive', 'spinal', 'procedure', 'million', 'american', 'experiencing', 'low', 'back', 'pain', 'currently', 'point', 'lifetime', 'believe', 'benefit', 'design', 'help', 'make', 'meaningful', 'difference', 'improving', 'health', 'outcome', 'reducing', 'cost', 'said', 'anne', 'docimo', 'md', 'unitedhealthcare', 'chief', 'medical', 'officer', 'lower', 'back', 'pain', 'part', 'responsible', 'sustaining', 'opioid', 'epidemic', 'also', 'increase', 'healthcare', 'cost', 'dig', 'deeper', 'although', 'opioid', 'overdoses', 'fell', 'two', 'percent', '2017', '2018', 'legal', 'battle', 'aim', 'hold', 'pharmaceutical', 'company', 'accountable', 'end', 'sight', 'opioid', 'epidemic', 'industry', 'professional', 'still', 'grappling', 'balance', 'cutting', 'opioid', 'prescription', 'working', 'reduce', 'patient', 'pain', 'common', 'condition', 'low', 'back', 'pain', 'bolster', 'epidemic', 'presence', 'clinician', 'still', 'prescribing', 'opioids', 'best', 'practice', 'recommendation', 'according', 'recent', 'optumlabs', 'study', '9', 'percent', 'patient', 'newly', 'diagnosed', 'low', 'back', 'pain', 'prescribed', 'opioids', 'lower', 'back', 'pain', 'currently', 'contributes', '52', 'percent', 'overall', 'opioid', 'prescription', 'rate', 'addition', 'boosting', 'opioids', 'distribution', 'alternative', 'invasive', 'lower', 'back', 'pain', 'treatment', 'significantly', 'impact', 'healthcare', 'spending', 'new', 
'information', 'physical', 'therapy', 'chiropractic', 'care', 'effective', 'lower', 'cost', 'alternative', 'spinal', 'imaging', 'surgery', 'however', 'payer', 'still', 'process', 'adopting', 'method', 'counteract', 'highcost', 'highrisk', 'potential', 'using', 'opioids', 'treat', 'back', 'pain', 'uhc', 'created', 'benefit', 'rely', 'medication', 'technology', 'rather', 'physical', 'therapy', 'chiropractic', 'care', 'benefit', 'allows', 'eligible', 'employer', 'offer', 'physical', 'therapist', 'chiropractor', 'visit', 'outofpocket', 'cost', 'member', 'already', 'receive', 'physical', 'therapist', 'chiropractic', 'care', 'benefit', 'uhcs', 'employersponsored', 'health', 'plan', 'maxed', 'visit', 'receive', 'additional', 'visit', 'benefit', 'however', 'still', 'visit', 'use', 'choose', 'physical', 'therapy', 'chiropractic', 'care', 'form', 'treatment', 'copay', 'deductible', 'visit', 'waived', 'receive', 'three', 'visit', 'cost', 'uhc', 'high', 'expectation', 'fiscal', 'physical', 'impact', 'benefit', 'according', 'uhcs', 'analysis', 'health', 'payer', 'expects', '2021', 'opioid', 'use', 'decrease', '19', 'percent', 'spinal', 'imaging', 'test', 'frequency', 'spinal', 'surgery', 'reduced', '22', 'percent', '21', 'percent', 'respectively', 'addition', 'specific', 'goal', 'uhc', 'hope', 'see', 'decrease', 'overall', 'cost', 'spinal', 'care', 'optumlabs', 'study', 'demonstrated', 'uhcs', 'expectation', 'without', 'precedent', 'study', 'looked', 'correlation', 'outofpocket', 'cost', 'patient', 'utilization', 'noninvasive', 'treatment', 'researcher', 'discovered', 'member', 'whose', 'copay', '30', 'little', '30', 'percent', 'le', 'likely', 'choose', 'physical', 'therapy', 'opposed', 'invasive', 'treatment', 'american', 'journal', 'managed', 'care', 'study', 'june', '2019', 'found', 'patient', 'high', 'deductible', 'typically', '1000', 'le', 'likely', 'visit', 'physical', 'therapy', 'eligible', 'employer', 'may', 'brand', 'new', 'renewing', 'membership', 'must', 'fully', 
'insured', '51', 'employee', 'strong', 'benefit', 'currently', 'available', 'connecticut', 'florida', 'georgia', 'new', 'york', 'north', 'carolina', 'however', 'uhc', 'plan', 'expand', 'benefit', '2020', '2021', 'end', 'expansion', 'period', 'benefit', 'also', 'available', 'selffunded', 'employer', 'organization', 'employee', 'population', '2', '50', 'benefit', 'span', 'ten', 'state', 'primarily', 'southeast', 'new', 'benefit', 'design', 'may', 'help', 'encourage', 'people', 'low', 'back', 'pain', 'get', 'right', 'care', 'right', 'time', 'right', 'setting', 'helping', 'expand', 'access', 'evidencebased', 'affordable', 'treatment', 'said', 'docimo']
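from sklearn.model_selection import train_test_split

# Hold out 15% of the documents for testing. No random_state is set, so the split differs on each run.
X_train, X_test, y_train, y_test = train_test_split(
    data[['Document', 'body_length', 'punct%', 'Cleaned_text', 'Lemmatized']],
    data['Topic'],
    test_size=0.15)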
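Because the split above is unseeded, the rows that land in the test set (and the scores that follow) change from run to run. A minimal sketch of a reproducible, class-balanced split is shown below. It is an illustration only, not the split used in this script: the `toy` table, the seed value 42, and the use of `stratify` are all assumptions. Note that `stratify` needs at least two documents per topic and a test set at least as large as the number of topics.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the document table; texts and topic counts are made up.
toy = pd.DataFrame({
    "Document": [f"document {i}" for i in range(20)],
    "Topic": ["massage benefits"] * 10 + ["chiropractic benefits"] * 10,
})

# random_state fixes the shuffle; stratify keeps the topic proportions in both splits.
X_tr, X_te, y_tr, y_te = train_test_split(
    toy[["Document"]], toy["Topic"],
    test_size=0.2, random_state=42, stratify=toy["Topic"],
)
print(y_te.value_counts())
```

With a fixed seed, rerunning the chunk returns the same test rows, which makes the later model comparisons easier to repeat.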
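from sklearn.feature_extraction.text import CountVectorizer

# Learn the vocabulary from the training documents only, using the clean_text analyzer.
count_vect = CountVectorizer(analyzer=clean_text)
count_vect_fit = count_vect.fit(X_train['Document'])

# Encode both splits with the training vocabulary.
count_train = count_vect_fit.transform(X_train['Document'])
count_test = count_vect_fit.transform(X_test['Document'])
# get_feature_names() was removed in scikit-learn 1.2; newer versions use get_feature_names_out().
len(count_vect_fit.get_feature_names())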
## 3267
count_vect_fit.get_feature_names()[200:350]
## ['alternatingli', 'although', 'altogeth', 'alway', 'alzheim', 'amateur', 'amaz', 'ambul', 'amelior', 'america', 'american', 'among', 'amount', 'amplitud', 'amyclarklymphaticdrainagemassag', 'amyclarklymphaticdrainagemassage1', 'analges', 'analyz', 'anatomi', 'ancient', 'andor', 'anecdot', 'anemia', 'anesthesia', 'anesthet', 'anger', 'angion', 'angri', 'anhedonia', 'anim', 'aniston', 'ankl', 'ann', 'annal', 'annual', 'anosognosia', 'anoth', 'answer', 'antibiot', 'antibodi', 'antidepress', 'antiinflammatori', 'antivir', 'anxieti', 'anxietydepress', 'anxietyfre', 'anxiou', 'anyon', 'anyth', 'anywher', 'apart', 'appear', 'appendix', 'appetit', 'appli', 'applianc', 'applic', 'appoint', 'approach', 'appropri', 'approv', 'approxim', 'apta', 'area', 'areaswer', 'arent', 'argu', 'aris', 'arizona', 'arm', 'armpit', 'around', 'arquett', 'array', 'arriv', 'art', 'arteri', 'arthrit', 'arthriti', 'articl', 'ascertain', 'ashi', 'asid', 'ask', 'asleep', 'aspect', 'assert', 'assess', 'assign', 'assist', 'associ', 'asthma', 'athlet', 'athom', 'atrophi', 'attach', 'attack', 'attent', 'attitud', 'attract', 'australian', 'author', 'authorth', 'autoimmun', 'autonom', 'auxiliari', 'avail', 'averag', 'avoid', 'aw', 'awar', 'away', 'awkward', 'ayurved', 'ba', 'babi', 'back', 'backneck', 'backthough', 'bacteria', 'bad', 'badli', 'bag', 'baker', 'balanc', 'ball', 'bamboo', 'band', 'bandag', 'bannist', 'bar', 'barrier', 'basalt', 'base', 'basi', 'basic', 'bath', 'bc', 'beauti', 'becam', 'beck', 'becom', 'bed', 'begin', 'behavior', 'behind', 'belief', 'believ', 'belli', 'ben']
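The vocabulary listed above comes from the training documents alone, so any term that appears only in the test documents is dropped when they are transformed. The sketch below shows that behavior on made-up sentences. `toy_analyzer` is a hypothetical stand-in for `clean_text` (it only lowercases and splits on non-word characters, with no stop-word removal or stemming), and the example reads `vocabulary_` rather than `get_feature_names()` so that it should run on old and new scikit-learn versions alike.

```python
import re
from sklearn.feature_extraction.text import CountVectorizer

def toy_analyzer(text):
    # Hypothetical stand-in for clean_text: lowercase, split on non-word characters, drop empty tokens.
    return [tok for tok in re.split(r"\W+", text.lower()) if tok]

train_docs = ["Massage eases back pain", "Chiropractic care for back pain"]
test_docs = ["Acupuncture eases pain"]

# Fit on the training texts only, then reuse the fitted vocabulary for the test text.
vect = CountVectorizer(analyzer=toy_analyzer).fit(train_docs)
vocab = sorted(vect.vocabulary_)
test_counts = vect.transform(test_docs).toarray()

print(vocab)
print(test_counts)  # 'acupuncture' is not in the training vocabulary, so it gets no column
```

Unlike the `Cleaned_text` lists printed earlier, which contain `''` entries, this toy analyzer filters out empty tokens.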
count_train_vect=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(count_train.toarray())],axis=1)

count_test_vect=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(count_test.toarray())],axis=1)
count_train_vect.head()
##                                             Document  body_length  ...  3265 3266
## 0  \nFive Warning Signs of Mental Illness\n\n\nIt...         6759  ...     0    0
## 1  Lymphatic drainage\nLymphatic drainage is a th...         2631  ...     0    0
## 2  7 Benefits of Massage Therapy\r\n\r\nMassage t...         5948  ...     0    0
## 3  BENEFITS OF MASSAGE\r\n\r\nYou know that post-...          320  ...     0    0
## 4  \nGetting Started with Cold Stone Massage Ther...         1868  ...     0    0
## 
## [5 rows x 3272 columns]
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(count_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(count_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                           Predicted                            Topic
## 26  mental health services benefits  mental health services benefits
## 30                 massage benefits            chiropractic benefits
## 77                               ER                               ER
## 72       Lymphatic Drainage Massage             dry brushing massage
## 13       Lymphatic Drainage Massage                 massage benefits
## 52                 massage benefits              cold stone benefits
## 68       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 71       Lymphatic Drainage Massage             dry brushing massage
## 15                 massage benefits                 massage benefits
## 57              cold stone benefits              cold stone benefits
## 1         physical therapy benefits            chiropractic benefits
## 62       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 33            chiropractic benefits            chiropractic benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.5384615384615384
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 0 0 0 0]
##  [0 2 0 0 0 0 0 0]
##  [0 0 1 0 0 1 0 1]
##  [0 0 0 1 0 1 0 0]
##  [0 2 0 0 0 0 0 0]
##  [0 1 0 0 0 1 0 0]
##  [0 0 0 0 0 0 1 0]
##  [0 0 0 0 0 0 0 0]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##                              ER       1.00      1.00      1.00         1
##      Lymphatic Drainage Massage       0.40      1.00      0.57         2
##           chiropractic benefits       1.00      0.33      0.50         3
##             cold stone benefits       1.00      0.50      0.67         2
##            dry brushing massage       0.00      0.00      0.00         2
##                massage benefits       0.33      0.50      0.40         2
## mental health services benefits       1.00      1.00      1.00         1
##       physical therapy benefits       0.00      0.00      0.00         0
## 
##                        accuracy                           0.54        13
##                       macro avg       0.59      0.54      0.52        13
##                    weighted avg       0.65      0.54      0.52        13
## 
## 
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1437: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
##   'precision', 'predicted', average, warn_for)
## C:\Users\m\Anaconda2\envs\python36\lib\site-packages\sklearn\metrics\classification.py:1439: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
##   'recall', 'true', average, warn_for)
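
The 8-by-8 confusion matrix above has a row and column for 'physical therapy benefits' even though no test document carries that label. The random forest predicted it once, and it never predicted 'dry brushing massage', which is what triggers the two UndefinedMetricWarning messages. As a minimal sketch on made-up labels (not the document's data), passing an explicit `labels` list to `confusion_matrix` and wrapping the result in a DataFrame makes every row and column identifiable. The variable names `y_true`, `y_hat`, and `cm_df` are assumptions for illustration.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

# Toy labels (hypothetical): one prediction falls in a class absent from y_true
y_true = ['ER', 'massage benefits', 'massage benefits', 'cold stone benefits']
y_hat = ['ER', 'massage benefits', 'physical therapy benefits', 'cold stone benefits']

# Union of true and predicted classes, in the sorted order sklearn also uses
labels = sorted(set(y_true) | set(y_hat))

cm = confusion_matrix(y_true, y_hat, labels=labels)

# Rows are expected classes, columns are predicted classes
cm_df = pd.DataFrame(cm, index=labels, columns=labels)
print(cm_df)
```

A row of zeros then shows at a glance which class had no true samples in the test split.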
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(count_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(count_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                           Predicted                            Topic
## 26  mental health services benefits  mental health services benefits
## 30            chiropractic benefits            chiropractic benefits
## 77                               ER                               ER
## 72             dry brushing massage             dry brushing massage
## 13                 massage benefits                 massage benefits
## 52              cold stone benefits              cold stone benefits
## 68       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 71             dry brushing massage             dry brushing massage
## 15                 massage benefits                 massage benefits
## 57              cold stone benefits              cold stone benefits
## 1             chiropractic benefits            chiropractic benefits
## 62       Lymphatic Drainage Massage       Lymphatic Drainage Massage
## 33            chiropractic benefits            chiropractic benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 1.0
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 0 0 0]
##  [0 2 0 0 0 0 0]
##  [0 0 3 0 0 0 0]
##  [0 0 0 2 0 0 0]
##  [0 0 0 0 2 0 0]
##  [0 0 0 0 0 2 0]
##  [0 0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                                  precision    recall  f1-score   support
## 
##                              ER       1.00      1.00      1.00         1
##      Lymphatic Drainage Massage       1.00      1.00      1.00         2
##           chiropractic benefits       1.00      1.00      1.00         3
##             cold stone benefits       1.00      1.00      1.00         2
##            dry brushing massage       1.00      1.00      1.00         2
##                massage benefits       1.00      1.00      1.00         2
## mental health services benefits       1.00      1.00      1.00         1
## 
##                        accuracy                           1.00        13
##                       macro avg       1.00      1.00      1.00        13
##                    weighted avg       1.00      1.00      1.00        13

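def clean_text(text):
    # Drop punctuation characters and lowercase what remains
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Raw string so that \W reaches the regex engine unescaped
    tokens=re.split(r'\W+', text)
    # Stem each non-stopword token; returns a list of tokens
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    # Same cleaning steps, but lemmatize instead of stemming
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text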

def predict_countRFC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    count_vect=CountVectorizer(analyzer=lemmatize)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
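    # Transform the raw review: the analyzer tokenizes it, whereas a pre-tokenized list would be joined into one out-of-vocabulary token
    count_test=count_vect_fit.transform(nr['newReview'])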
    
    model = rf.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_count_RFC_85-15:']
    print('\n\n',pred)
    

def predict_countRFC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    count_vect=CountVectorizer(analyzer=clean_text)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
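    # Transform the raw review: the analyzer tokenizes it, whereas a pre-tokenized list would be joined into one out-of-vocabulary token
    count_test=count_vect_fit.transform(nr['newReview'])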
    
    model = rf.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_count_RFC_85-15:']
    print('\n\n',pred)
    
predict_countRFC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_count_RFC_85-15:               massage benefits
predict_countRFC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_count_RFC_85-15:               massage benefits

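def clean_text(text):
    # Drop punctuation characters and lowercase what remains
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Raw string so that \W reaches the regex engine unescaped
    tokens=re.split(r'\W+', text)
    # Stem each non-stopword token; returns a list of tokens
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    # Same cleaning steps, but lemmatize instead of stemming
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text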

def predict_countGBC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    count_vect=CountVectorizer(analyzer=lemmatize)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
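    # Transform the raw review: the analyzer tokenizes it, whereas a pre-tokenized list would be joined into one out-of-vocabulary token
    count_test=count_vect_fit.transform(nr['newReview'])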
    
    model = gb.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_count_GBC_85-15:']
    print('\n\n',pred)
    

def predict_countGBC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    count_vect=CountVectorizer(analyzer=clean_text)
    
    count_vect_fit=count_vect.fit(X_train['Document'])
    count_train=count_vect_fit.transform(X_train['Document'])
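    # Transform the raw review: the analyzer tokenizes it, whereas a pre-tokenized list would be joined into one out-of-vocabulary token
    count_test=count_vect_fit.transform(nr['newReview'])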
    
    model = gb.fit(count_train,y_train)
    pred=pd.DataFrame(model.predict(count_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_count_GBC_85-15:']
    print('\n\n',pred)
    
predict_countGBC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_count_GBC_85-15:               massage benefits
predict_countGBC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_count_GBC_85-15:               massage benefits
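
Each `predict_*` function above refits the vectorizer and the classifier on every call. As a hedged alternative, the sketch below fits both once inside a scikit-learn `Pipeline` and reuses the fitted object for every new review. The toy corpus, the labels, the names `recommender` and `recommend`, and `random_state=0` are assumptions for illustration. The default analyzer is used here; the document's `clean_text` or `lemmatize` could be passed through `analyzer=` instead.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline

# Toy corpus and labels (hypothetical stand-ins for X_train['Document'] and y_train)
docs = [
    "massage therapy relaxes sore muscles",
    "deep tissue massage eases muscle tension",
    "chiropractic adjustment aligns the spine",
    "a chiropractor treats back and neck pain",
]
topics = ["massage benefits", "massage benefits",
          "chiropractic benefits", "chiropractic benefits"]

# The vectorizer and the classifier are fit together, once
recommender = Pipeline([
    ("vect", CountVectorizer()),
    ("rf", RandomForestClassifier(n_estimators=150, random_state=0)),
])
recommender.fit(docs, topics)

def recommend(new_review):
    # Raw text goes in; the pipeline applies the fitted vocabulary before predicting
    return recommender.predict([new_review])[0]

print(recommend("I need a massage!"))
```

Because the pipeline receives the raw string, the review is tokenized exactly once, by the same analyzer that built the training vocabulary.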

TF-IDF RFC and GBC
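
As a minimal sketch of how TF-IDF weighting differs from the raw counts used in the previous section, the toy corpus below (hypothetical, not the healthcare documents) contains two words that appear in every document and one word unique to each. With scikit-learn's default settings, the count vectorizer treats the words equally within a document, while the TF-IDF vectorizer down-weights the words that occur everywhere.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Toy corpus (hypothetical): 'relieves' and 'pain' occur in every document
docs = [
    "massage relieves pain",
    "chiropractic relieves pain",
    "cupping relieves pain",
]

count_vect = CountVectorizer().fit(docs)
tfidf_vect = TfidfVectorizer().fit(docs)

counts = count_vect.transform(docs).toarray()
weights = tfidf_vect.transform(docs).toarray()

# Column positions of a rare term and a common term in each matrix
rare_c, common_c = count_vect.vocabulary_["massage"], count_vect.vocabulary_["pain"]
rare_t, common_t = tfidf_vect.vocabulary_["massage"], tfidf_vect.vocabulary_["pain"]

# In the first document both terms have the same raw count ...
print(counts[0, rare_c], counts[0, common_c])
# ... but TF-IDF gives the term found in fewer documents the larger weight
print(weights[0, rare_t], weights[0, common_t])
```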


stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
def count_punct(text):
    count=sum([1 for char in text if char in string.punctuation])
    return round(count/(len(text)-text.count(" ")),3)*100
data['body_length']=data['Document'].apply(lambda x: len(x)-x.count(" "))
data['punct%']= data['Document'].apply(lambda x: count_punct(x))

def clean_text(text):
    # Drop punctuation characters and lowercase what remains
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Raw string so that \W reaches the regex engine unescaped
    tokens=re.split(r'\W+', text)
    # Returns a list of stemmed tokens, as a custom vectorizer analyzer must
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    # Returns a list of lemmatized tokens, as a custom vectorizer analyzer must
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text

data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  [chiropractic, adjustment, treatment, serve, n...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...  [, unitedhealthcare, combat, opioid, crisis, n...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...  [, safety, chiropractic, adjustment, chiroprac...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  [advanced, chiropractic, relief, 8, key, benef...
## 4  Heading to the spa can be a pampering treat, b...  ...  [heading, spa, pampering, treat, also, huge, b...
## 
## [5 rows x 10 columns]

from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.15)
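
This split is drawn independently of the one used in the count-vectorizer section, so the two sections are scored on different 13-document test sets (the class lists in their reports differ). A sketch, assuming a fixed `random_state` (the seed 42, the toy frame, and the helper `split` are hypothetical), shows how the same partition can be reproduced so that vectorizers are compared on identical held-out documents:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy frame (hypothetical) standing in for the document's `data`
toy = pd.DataFrame({
    "Document": ["doc %d" % i for i in range(20)],
    "Topic": ["massage benefits", "cupping benefits"] * 10,
})

def split(frame, seed=42):
    # The same seed always yields the same partition of rows
    return train_test_split(frame[["Document"]], frame["Topic"],
                            test_size=0.15, random_state=seed)

X_train_a, X_test_a, y_train_a, y_test_a = split(toy)
X_train_b, X_test_b, y_train_b, y_test_b = split(toy)

# True when both calls held out exactly the same rows
print(list(X_test_a.index) == list(X_test_b.index))
```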

from sklearn.feature_extraction.text import TfidfVectorizer

tfidf_vect=TfidfVectorizer(analyzer=clean_text)
tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])

tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
tfidf_test=tfidf_vect_fit.transform(X_test['Document'])
len(tfidf_vect_fit.get_feature_names())
## 3424
tfidf_vect_fit.get_feature_names()[200:350]
## ['alkal', 'allerg', 'allergi', 'allevi', 'allianc', 'allison', 'allow', 'alloy', 'allround', 'almost', 'alon', 'along', 'alongsid', 'alreadi', 'also', 'alter', 'altern', 'alternatingli', 'although', 'altogeth', 'alway', 'alzheim', 'amateur', 'amaz', 'amazon', 'ambul', 'amelior', 'america', 'american', 'among', 'amount', 'amplitud', 'amyclarklymphaticdrainagemassag', 'amyclarklymphaticdrainagemassage1', 'analges', 'analysi', 'analyz', 'anatomi', 'ancient', 'andor', 'anecdot', 'anemia', 'anesthesia', 'anesthet', 'anger', 'angion', 'anhedonia', 'anim', 'ankl', 'ann', 'annal', 'announc', 'annual', 'anosognosia', 'anoth', 'answer', 'antibiot', 'antibodi', 'anticip', 'antidepress', 'antiinflammatori', 'antivir', 'anxieti', 'anxietydepress', 'anxietyfre', 'anyon', 'anyth', 'anywher', 'apart', 'appeal', 'appear', 'appendix', 'appetit', 'appl', 'appli', 'applianc', 'applic', 'appoint', 'approach', 'appropri', 'approv', 'approxim', 'apta', 'area', 'arent', 'argu', 'aris', 'arizona', 'arm', 'armpit', 'aromatherapi', 'around', 'array', 'arriv', 'art', 'arthrit', 'arthriti', 'articl', 'ascertain', 'ashi', 'asid', 'ask', 'asleep', 'aspect', 'aspirin', 'assert', 'assess', 'assign', 'assist', 'associ', 'asthma', 'astonishingli', 'athlet', 'athom', 'atlanta', 'atrophi', 'attach', 'attack', 'attempt', 'attent', 'attitud', 'australian', 'author', 'authorth', 'autoimmun', 'autonom', 'avail', 'averag', 'avoid', 'aw', 'awar', 'away', 'awesom', 'awkward', 'axillari', 'ayurved', 'b', 'ba', 'babi', 'back', 'bad', 'badli', 'bag', 'balanc', 'ball', 'bamboo', 'band', 'bandag', 'bank', 'bannist']
tfidf_train_vect=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(tfidf_train.toarray())],axis=1)

tfidf_test_vect=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(tfidf_test.toarray())],axis=1)
tfidf_train_vect.head()
##                                             Document  body_length  ...  3422 3423
## 0  The Role of Physical Therapy\r\n\r\nThe role o...         4704  ...   0.0  0.0
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...         3796  ...   0.0  0.0
## 2  Massage Guns Worth It? Here's What An Expert S...         1596  ...   0.0  0.0
## 3  The Benefits of Physical Therapy\n\n\nWhen peo...         3221  ...   0.0  0.0
## 4  What is a lymphatic drainage massage or detox ...         3554  ...   0.0  0.0
## 
## [5 rows x 3429 columns]
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
np.random.seed(45678)
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(tfidf_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(tfidf_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                       Topic
## 65  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 33       chiropractic benefits       chiropractic benefits
## 69  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 41            cupping benefits            cupping benefits
## 38            cupping benefits            cupping benefits
## 39            cupping benefits            cupping benefits
## 50        massage gun benefits        massage gun benefits
## 13  Lymphatic Drainage Massage            massage benefits
## 45            massage benefits        massage gun benefits
## 34            cupping benefits            cupping benefits
## 60         cold stone benefits         cold stone benefits
## 7             massage benefits            massage benefits
## 10            massage benefits            massage benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.8461538461538461
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[2 0 0 0 0 0]
##  [0 1 0 0 0 0]
##  [0 0 1 0 0 0]
##  [0 0 0 4 0 0]
##  [1 0 0 0 2 0]
##  [0 0 0 0 1 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
## Lymphatic Drainage Massage       0.67      1.00      0.80         2
##      chiropractic benefits       1.00      1.00      1.00         1
##        cold stone benefits       1.00      1.00      1.00         1
##           cupping benefits       1.00      1.00      1.00         4
##           massage benefits       0.67      0.67      0.67         3
##       massage gun benefits       1.00      0.50      0.67         2
## 
##                   accuracy                           0.85        13
##                  macro avg       0.89      0.86      0.86        13
##               weighted avg       0.87      0.85      0.84        13
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(tfidf_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(tfidf_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                       Topic
## 65  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 33       chiropractic benefits       chiropractic benefits
## 69  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 41            cupping benefits            cupping benefits
## 38            cupping benefits            cupping benefits
## 39            cupping benefits            cupping benefits
## 50            massage benefits        massage gun benefits
## 13  Lymphatic Drainage Massage            massage benefits
## 45        massage gun benefits        massage gun benefits
## 34            cupping benefits            cupping benefits
## 60         cold stone benefits         cold stone benefits
## 7             massage benefits            massage benefits
## 10            massage benefits            massage benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.8461538461538461
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[2 0 0 0 0 0]
##  [0 1 0 0 0 0]
##  [0 0 1 0 0 0]
##  [0 0 0 4 0 0]
##  [1 0 0 0 2 0]
##  [0 0 0 0 1 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
## Lymphatic Drainage Massage       0.67      1.00      0.80         2
##      chiropractic benefits       1.00      1.00      1.00         1
##        cold stone benefits       1.00      1.00      1.00         1
##           cupping benefits       1.00      1.00      1.00         4
##           massage benefits       0.67      0.67      0.67         3
##       massage gun benefits       1.00      0.50      0.67         2
## 
##                   accuracy                           0.85        13
##                  macro avg       0.89      0.86      0.86        13
##               weighted avg       0.87      0.85      0.84        13

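def clean_text(text):
    # Drop punctuation characters and lowercase what remains
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Raw string so that \W reaches the regex engine unescaped
    tokens=re.split(r'\W+', text)
    # Stem each non-stopword token; returns a list of tokens
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    # Same cleaning steps, but lemmatize instead of stemming
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text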

def predict_tfidfRFC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    tfidf_vect=TfidfVectorizer(analyzer=lemmatize)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
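    # Transform the raw review: the analyzer tokenizes it, whereas a pre-tokenized list would be joined into one out-of-vocabulary token
    tfidf_test=tfidf_vect_fit.transform(nr['newReview'])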
    
    model = rf.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_tfidf_RFC_85-15:']
    print('\n\n',pred)
    

def predict_tfidfRFC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    tfidf_vect=TfidfVectorizer(analyzer=clean_text)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
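    # Transform the raw review: the analyzer tokenizes it, whereas a pre-tokenized list would be joined into one out-of-vocabulary token
    tfidf_test=tfidf_vect_fit.transform(nr['newReview'])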
    
    model = rf.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_tfidf_RFC_85-15:']
    print('\n\n',pred)
    
predict_tfidfRFC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_tfidf_RFC_85-15:               massage benefits
predict_tfidfRFC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_tfidf_RFC_85-15:               massage benefits

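def clean_text(text):
    # Drop punctuation characters and lowercase what remains
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Raw string so that \W reaches the regex engine unescaped
    tokens=re.split(r'\W+', text)
    # Stem each non-stopword token; returns a list of tokens
    text=[ps.stem(word) for word in tokens if word not in stopwords]
    return text

def lemmatize(text):
    # Same cleaning steps, but lemmatize instead of stemming
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    return text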

def predict_tfidfGBC_lemmatized(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemmatized']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    tfidf_vect=TfidfVectorizer(analyzer=lemmatize)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
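    # Transform the raw review: the analyzer tokenizes it, whereas a pre-tokenized list would be joined into one out-of-vocabulary token
    tfidf_test=tfidf_vect_fit.transform(nr['newReview'])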
    
    model = gb.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['lemmatized_tfidf_GBC_85-15:']
    print('\n\n',pred)
    

def predict_tfidfGBC_cleaned(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    tfidf_vect=TfidfVectorizer(analyzer=clean_text)
    
    tfidf_vect_fit=tfidf_vect.fit(X_train['Document'])
    tfidf_train=tfidf_vect_fit.transform(X_train['Document'])
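    # Transform the raw review: the analyzer tokenizes it, whereas a pre-tokenized list would be joined into one out-of-vocabulary token
    tfidf_test=tfidf_vect_fit.transform(nr['newReview'])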
    
    model = gb.fit(tfidf_train,y_train)
    pred=pd.DataFrame(model.predict(tfidf_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['cleaned_tfidf_GBC_85-15:']
    print('\n\n',pred)
    
predict_tfidfGBC_cleaned('I need a massage!') 
## 
## 
##                           Recommended Healthcare Service
## cleaned_tfidf_GBC_85-15:               massage benefits
predict_tfidfGBC_lemmatized('I need a massage!')
## 
## 
##                              Recommended Healthcare Service
## lemmatized_tfidf_GBC_85-15:               massage benefits

N-Grams Vectorization for RFC and GBC
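
As a small sketch of what `ngram_range=(1,4)` does, the example below vectorizes a single made-up, already-stemmed document of five tokens. Every run of one to four consecutive tokens becomes its own feature, so five tokens yield 5 + 4 + 3 + 2 = 14 features, which is why the vocabulary in this section grows far beyond the single-token vocabularies of the count and TF-IDF sections. The document string and the name `features` are assumptions for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

# One already-stemmed document (hypothetical), as a single space-joined string
doc = ["lymphat drainag massag reduc swell"]

n_gram_vect = CountVectorizer(ngram_range=(1, 4)).fit(doc)
features = sorted(n_gram_vect.vocabulary_)

# 5 unigrams + 4 bigrams + 3 trigrams + 2 four-grams
print(len(features))
for feature in features:
    print(feature)
```

The default analyzer used here expects each document as one string, which is why `clean_text` and `lemmatize` join their tokens with spaces in this section.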

stopwords = nltk.corpus.stopwords.words('english')
ps=nltk.PorterStemmer()
wn=nltk.WordNetLemmatizer()
data=pd.read_csv("benefitsContraindications4.csv", encoding='unicode_escape')
data.columns=['Document','Source','Topic','InternetSearch','Contraindications','RisksSideEffects']
def count_punct(text):
    count=sum([1 for char in text if char in string.punctuation])
    return round(count/(len(text)-text.count(" ")),3)*100
data['body_length']=data['Document'].apply(lambda x: len(x)-x.count(" "))
data['punct%']= data['Document'].apply(lambda x: count_punct(x))
def clean_text(text):
    # Drop punctuation characters and lowercase what remains
    text="".join([char.lower() for char in text if char not in string.punctuation])
    # Raw string so that \W reaches the regex engine unescaped
    tokens=re.split(r'\W+',text)
    # Joined back into one string: the n-gram vectorizer uses its default analyzer, which expects strings
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text

def lemmatize(text):
    text="".join([char.lower() for char in text if char not in string.punctuation])
    tokens=re.split(r'\W+', text)
    # Joined back into one string for the n-gram vectorizer
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    # With a custom analyzer (count and TF-IDF sections) a list must be returned instead:
    #text=[wn.lemmatize(word) for word in tokens if word not in stopwords]
    # a returned string would be iterated character by character, giving single letters as features.
    return text
data['Cleaned_text']=data['Document'].apply(lambda x: clean_text(x))
data['Lemmatized']=data['Document'].apply(lambda x: lemmatize(x))
data.head()
##                                             Document  ...                                         Lemmatized
## 0  Chiropractic adjustments and treatments serve ...  ...  chiropractic adjustment treatment serve need m...
## 1  \nUnitedHealthcare Combats Opioid Crisis with ...  ...   unitedhealthcare combat opioid crisis nonopio...
## 2   The Safety of Chiropractic Adjustments\n\n   ...  ...   safety chiropractic adjustment chiropractic a...
## 3  Advanced Chiropractic Relief: 8 Key Benefits o...  ...  advanced chiropractic relief 8 key benefit chi...
## 4  Heading to the spa can be a pampering treat, b...  ...  heading spa pampering treat also huge boost he...
## 
## [5 rows x 10 columns]
from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(data[['Document','body_length','punct%','Cleaned_text','Lemmatized']],data['Topic'],test_size=0.15)
from sklearn.feature_extraction.text import CountVectorizer
n_gram_vect=CountVectorizer(ngram_range=(1,4))
type(X_train['Cleaned_text'])
## <class 'pandas.core.series.Series'>
X_train['Cleaned_text'].head()
## 46     benefit vibrat percuss therapi vibrat therapi...
## 49    futurist gun could effect foam roll muscl sore...
## 29     mental health counselor train skill salari me...
## 69    lymphat drainag massag us think lymphat system...
## 15    top 5 health benefit regular massag therapi ma...
## Name: Cleaned_text, dtype: object
X_train['Lemmatized'].head()
## 46     benefit vibration percussion therapy vibratio...
## 49    futuristic gun could effective foam rolling mu...
## 29     mental health counselor training skill salary...
## 69    lymphatic drainage massage u think lymphatic s...
## 15    top 5 health benefit regular massage therapy m...
## Name: Lemmatized, dtype: object
n_gram_vect_fit=n_gram_vect.fit(X_train['Cleaned_text'])


n_gram_train=n_gram_vect_fit.transform(X_train['Cleaned_text'])
n_gram_test=n_gram_vect_fit.transform(X_test['Cleaned_text'])
len(n_gram_vect_fit.get_feature_names())
## 73345
print(n_gram_vect_fit.get_feature_names()[200:500])
## ['2009 show lymphat drainag', '2011', '2011 number', '2011 number adult', '2011 number adult us', '2011 research', '2011 research review', '2011 research review size', '2011 review', '2011 review 26', '2011 review 26 clinic', '2012', '2012 review', '2012 review studiestrust', '2012 review studiestrust sourc', '2013', '2013 illustr', '2013 illustr benefit', '2013 illustr benefit journal', '2015', '2015 report', '2015 report publish', '2015 report publish journal', '2015 review', '2015 review evid', '2015 review evid found', '2015 systemat', '2015 systemat review', '2015 systemat review conclud', '2016', '2016 michael', '2016 michael phelp', '2016 michael phelp perman', '2016 studi', '2016 studi publish', '2016 studi publish journal', '2016 summer', '2016 summer olymp', '2016 summer olymp share', '2016 summer olympics1', '2016 summer olympics1 use', '20162017', '20162017 wide', '20162017 wide rang', '20162017 wide rang reason', '2017', '2017 2018', '2017 2018 legal', '2017 2018 legal battl', '2017 chiropractor', '2017 chiropractor tout', '2017 chiropractor tout treatment', '2017 nba', '2017 nba final', '2017 nba final irv', '2017 scientist', '2017 scientist analyz', '2017 scientist analyz 11', '2017 studi', '2017 studi found', '2017 studi found structur', '2018', '2018 found', '2018 found chang', '2018 found chang hamstr', '2018 galluppalm', '2018 galluppalm colleg', '2018 galluppalm colleg chiropract', '2018 legal', '2018 legal battl', '2018 legal battl aim', '2018 studi', '2018 studi led', '2018 studi led dr', '2019', '2019 earli', '2019 earli 2020', '2019 earli 2020 mani', '2019 found', '2019 found patient', '2019 found patient high', '2019 massag', '2019 massag gun', '2019 massag gun one', '2019 unitedhealthcar', '2019 unitedhealthcar uhc', '2019 unitedhealthcar uhc combat', '2020', '2020 2021', '2020 2021 end', '2020 2021 end expans', '2020 beyond', '2020 beyond peopl', '2020 beyond peopl say', '2020 mani', '2020 mani peopl', '2020 mani peopl start', '2021', 
'2021 end', '2021 end expans', '2021 end expans period', '2021 opioid', '2021 opioid use', '2021 opioid use decreas', '20minut', '20minut selfmassag', '20minut selfmassag use', '20minut selfmassag use massag', '21', '21 benefit', '21 benefit chiropract', '21 benefit chiropract adjust', '21 benefit might', '21 benefit might known', '21 percent', '21 percent respect', '21 percent respect addit', '2105', '2105 billion', '2105 billion year2', '2105 billion year2 curiou', '22', '22 million', '22 million american', '22 million american visit', '22 percent', '22 percent 21', '22 percent 21 percent', '23', '23 2019', '23 2019 massag', '23 2019 massag gun', '24', '24 separ', '24 separ column', '24 separ column sever', '25', '25 percent', '25 percent american', '25 percent american adult', '25 reason', '25 reason get', '25 reason get massag', '25 show', '25 show 75', '25 show 75 90', '26', '26 clinic', '26 clinic trial', '26 clinic trial look', '272', '272 studi', '272 studi particip', '272 studi particip three', '275', '275 pictur', '275 pictur theragun', '275 pictur theragun pure', '275 realli', '275 realli worth', '275 realli worth invest', '281', '281 341', '281 341 mani', '281 341 mani taoist', '29', '29 2019', '29 2019 unitedhealthcar', '29 2019 unitedhealthcar uhc', '30', '30 littl', '30 littl 30', '30 littl 30 percent', '30 percent', '30 percent less', '30 percent less like', '30 second', '30 second work', '30 second work along', '300', '300 ad', '300 ad even', '300 ad even earlier', '33', '33 medic', '33 medic one', '33 medic one year', '34', '34 lymphat', '34 lymphat system', '34 lymphat system drain', '341', '341 mani', '341 mani taoist', '341 mani taoist believ', '35', '35 cup', '35 cup first', '35 cup first session', '35 seek', '35 seek relief', '35 seek relief back', '37', '37 studi', '37 studi found', '37 studi found reduct', '38', '38 take', '38 take pain', '38 take pain medic', '40', '40 percuss', '40 percuss per', '40 percuss per second', '400', '400 600', 
'400 600 massag', '400 600 massag gun', '400 greater', '400 greater immun', '400 greater immun compet', '4357', '4357 local', '4357 local health', '4357 local health depart', '44', '44 million', '44 million peopl', '44 million peopl 1320', '456000', '456000 chiropractor', '456000 chiropractor massag', '456000 chiropractor massag therapist', '48', '48 percent', '48 percent went', '48 percent went doctor', '48 receiv', '48 receiv pain', '48 receiv pain reduct', '4pm', '4pm afternoon', '4pm afternoon time', '4pm afternoon time dinner', '50', '50 75', '50 75 improv', '50 75 improv one', '50 adult', '50 adult least', '50 adult least one', '50 benefit', '50 benefit span', '50 benefit span ten', '50 improv', '50 improv research', '50 improv research conclud', '50 million', '50 million american', '50 million american suffer', '50 minut', '50 minut long', '50 minut long say', '50 patient', '50 patient 16', '50 patient 16 complet', '50 state', '50 state howev', '50 state howev mani', '51', '51 employe', '51 employe strong', '51 employe strong benefit', '52', '52 percent', '52 percent overal', '52 percent overal opioid', '53', '53 drugfre', '53 drugfre group', '53 drugfre group continu', '53 sought', '53 sought treatment', '53 sought treatment chiropractor', '549', '549 pictur', '549 pictur theragun', '549 pictur theragun theragunliv', '57', '57 chiropract', '57 chiropract group', '57 chiropract group achiev', '57 cup', '57 cup british', '57 cup british cup', '60', '60 also', '60 also vulner', '60 also vulner complic', '60 minut']
n_gram_train_df=pd.concat([X_train[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(n_gram_train.toarray())],axis=1)

n_gram_test_df=pd.concat([X_test[['Document','body_length','punct%','Cleaned_text','Lemmatized']].reset_index(drop=True),pd.DataFrame(n_gram_test.toarray())],axis=1)
n_gram_train_df.head()
##                                             Document  body_length  ...  73343 73344
## 0  \r\nBenefits of Vibration and Percussion Thera...         2848  ...      0     0
## 1  This futuristic ?gun? could be more effective ...         4784  ...      0     0
## 2   Mental Health Counselor Training, Skills, and...         5085  ...      0     0
## 3  What is Lymphatic Drainage Massage?\nMost of u...         1697  ...      0     0
## 4  Top 5 Health Benefits of Regular Massage Thera...          893  ...      0     0
## 
## [5 rows x 73350 columns]
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.metrics import precision_recall_fscore_support as score
import time
rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
start=time.time()
rf_model=rf.fit(n_gram_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=rf_model.predict(n_gram_test)
end=time.time()
pred_time=(end-start)

prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                       Topic
## 58            massage benefits         cold stone benefits
## 56         cold stone benefits         cold stone benefits
## 72        dry brushing massage        dry brushing massage
## 36            massage benefits            cupping benefits
## 12            massage benefits            massage benefits
## 67            massage benefits  Lymphatic Drainage Massage
## 62  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 37            cupping benefits            cupping benefits
## 52            massage benefits         cold stone benefits
## 57            massage benefits         cold stone benefits
## 4             massage benefits            massage benefits
## 10            massage benefits            massage benefits
## 17   physical therapy benefits   physical therapy benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 0.6153846153846154
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[1 0 0 0 1 0]
##  [0 1 0 0 3 0]
##  [0 0 1 0 1 0]
##  [0 0 0 1 0 0]
##  [0 0 0 0 3 0]
##  [0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
## Lymphatic Drainage Massage       1.00      0.50      0.67         2
##        cold stone benefits       1.00      0.25      0.40         4
##           cupping benefits       1.00      0.50      0.67         2
##       dry brushing massage       1.00      1.00      1.00         1
##           massage benefits       0.38      1.00      0.55         3
##  physical therapy benefits       1.00      1.00      1.00         1
## 
##                   accuracy                           0.62        13
##                  macro avg       0.90      0.71      0.71        13
##               weighted avg       0.86      0.62      0.61        13
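
The per-class numbers in the report above follow directly from the confusion matrix. As a minimal sketch in plain Python (the matrix and class order are copied from the RFC output above; the variable names are only illustrative), precision is the diagonal entry divided by its column sum, and recall is the diagonal entry divided by its row sum:

```python
# Confusion matrix copied from the RFC output above (row=expected, col=predicted).
labels = ['Lymphatic Drainage Massage', 'cold stone benefits', 'cupping benefits',
          'dry brushing massage', 'massage benefits', 'physical therapy benefits']
cm = [[1, 0, 0, 0, 1, 0],
      [0, 1, 0, 0, 3, 0],
      [0, 0, 1, 0, 1, 0],
      [0, 0, 0, 1, 0, 0],
      [0, 0, 0, 0, 3, 0],
      [0, 0, 0, 0, 0, 1]]

n = len(labels)
total = sum(sum(row) for row in cm)        # number of test documents
correct = sum(cm[i][i] for i in range(n))  # diagonal = correct predictions
accuracy = correct / total

precision = {}
recall = {}
for i, label in enumerate(labels):
    predicted_as_i = sum(cm[r][i] for r in range(n))  # column sum
    actually_i = sum(cm[i])                           # row sum
    precision[label] = cm[i][i] / predicted_as_i if predicted_as_i else 0.0
    recall[label] = cm[i][i] / actually_i if actually_i else 0.0

for label in labels:
    print(f"{label:>27}  precision={precision[label]:.2f}  recall={recall[label]:.2f}")
print(f"accuracy={accuracy:.2f}")
```

Five of the eight documents predicted as massage benefits belong to other classes, which is why that class has low precision even though its recall is 1.00.
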
gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
start=time.time()
gb_model=gb.fit(n_gram_train,y_train)
end=time.time()
fit_time=(end-start)
start=time.time()
y_pred=gb_model.predict(n_gram_test)
end=time.time()
pred_time=(end-start)
prd = pd.DataFrame(y_pred)
prd.columns=['Predicted']

prd.index=y_test.index
pred=pd.concat([pd.DataFrame(prd),y_test],axis=1)
print(pred)
##                      Predicted                       Topic
## 58         cold stone benefits         cold stone benefits
## 56         cold stone benefits         cold stone benefits
## 72        dry brushing massage        dry brushing massage
## 36            cupping benefits            cupping benefits
## 12            massage benefits            massage benefits
## 67  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 62  Lymphatic Drainage Massage  Lymphatic Drainage Massage
## 37            cupping benefits            cupping benefits
## 52         cold stone benefits         cold stone benefits
## 57         cold stone benefits         cold stone benefits
## 4             massage benefits            massage benefits
## 10            massage benefits            massage benefits
## 17   physical therapy benefits   physical therapy benefits
from sklearn.metrics import classification_report, f1_score, accuracy_score, confusion_matrix 

print('accuracy', accuracy_score(y_test, y_pred))
## accuracy 1.0
print('confusion matrix\n', confusion_matrix(y_test, y_pred))
## confusion matrix
##  [[2 0 0 0 0 0]
##  [0 4 0 0 0 0]
##  [0 0 2 0 0 0]
##  [0 0 0 1 0 0]
##  [0 0 0 0 3 0]
##  [0 0 0 0 0 1]]
print('(row=expected, col=predicted)')
## (row=expected, col=predicted)
print(classification_report(y_test, y_pred))
##                             precision    recall  f1-score   support
## 
## Lymphatic Drainage Massage       1.00      1.00      1.00         2
##        cold stone benefits       1.00      1.00      1.00         4
##           cupping benefits       1.00      1.00      1.00         2
##       dry brushing massage       1.00      1.00      1.00         1
##           massage benefits       1.00      1.00      1.00         3
##  physical therapy benefits       1.00      1.00      1.00         1
## 
##                   accuracy                           1.00        13
##                  macro avg       1.00      1.00      1.00        13
##               weighted avg       1.00      1.00      1.00        13

The next two functions work only with the n-gram vectorizer and the Lemmatized and Cleaned_text feature columns added to this data. For the n-grams, the tokenized document has to be joined back into one string of words rather than kept as a list, while the count and TF-IDF vectorizers need lists. Otherwise, they count single letters instead of words. The cleaning functions are derived from a regex-style split.
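
To illustrate the point about word joins versus lists, here is a minimal plain-Python sketch. It does not use the vectorizers from this script: `count_items` and `word_ngrams` are hypothetical helpers written only for this illustration, and the two stemmed tokens are an assumed example. Anything that iterates over its input sees words when given a list but single characters when given a string, while word n-grams are built by splitting a joined string on whitespace:

```python
from collections import Counter

def count_items(doc):
    """Count whatever iterating over `doc` yields (hypothetical helper)."""
    return Counter(doc)

def word_ngrams(text, n_min=1, n_max=4):
    """Build word n-grams of length n_min..n_max from a whitespace-joined string."""
    words = text.split()
    return [" ".join(words[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(words) - n + 1)]

tokens = ["need", "massag"]   # assumed example of stemmed tokens kept as a list
joined = " ".join(tokens)     # the same tokens joined back into one string

as_list = count_items(tokens)    # iterating a list yields whole words
as_string = count_items(joined)  # iterating a string yields single characters
ngrams = word_ngrams(joined)     # n-grams need the joined string

print(as_list)
print(as_string)
print(ngrams)
```
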

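def clean_text(text):
    # drop punctuation character by character and lowercase the rest
    text="".join([word.lower() for word in text if word not in string.punctuation])
    # split on runs of non-word characters (raw string avoids an invalid-escape warning)
    tokens=re.split(r'\W+',text)
    # stem each non-stop-word token and join back into one string for the n-gram vectorizer
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text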

def predict_ngramRFC_clean(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Cleaned_text'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Cleaned_text'])
    n_gram_test=n_gram_vect_fit.transform(nr['clean'])
    
    model = rf.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['stemmed_1ngram4_RFC_85-15:']
    print('\n\n',pred)
    

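def lemmatize(text):
    # drop punctuation character by character and lowercase the rest
    text="".join([word.lower() for word in text if word not in string.punctuation])
    # split on runs of non-word characters (raw string avoids an invalid-escape warning)
    tokens=re.split(r'\W+', text)
    # lemmatize each non-stop-word token and join back into one string for the n-gram vectorizer
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text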
    
def predict_ngramRFC_lemma(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemma']=nr['newReview'].apply(lambda x: lemmatize(x))
    
    rf=RandomForestClassifier(n_estimators=150, max_depth=None, n_jobs=-1)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Lemmatized'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Lemmatized'])
    n_gram_test=n_gram_vect_fit.transform(nr['lemma'])
    
    model = rf.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service:']
    pred.index= ['lemmatized_1ngram4_RFC_85-15:']
    print('\n\n',pred)
predict_ngramRFC_clean('I need a massage!') 
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_RFC_85-15:               massage benefits
predict_ngramRFC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_1ngram4_RFC_85-15:                massage benefits

Whether the text was stemmed or lemmatized, after both versions were stripped of punctuation and stop words and made lowercase, the results were the same for the short input ‘I need a massage!’

Let’s try some other inputs.

predict_ngramRFC_clean('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.') 
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_RFC_85-15:               massage benefits
predict_ngramRFC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_1ngram4_RFC_85-15:                massage benefits
def clean_text(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+',text)
    text=" ".join([ps.stem(word) for word in tokens if word not in stopwords])
    return text

def predict_ngramGBC_clean(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['clean']=nr['newReview'].apply(lambda x: clean_text(x))

    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Cleaned_text'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Cleaned_text'])
    n_gram_test=n_gram_vect_fit.transform(nr['clean'])
    
    model = gb.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service']
    pred.index= ['stemmed_1ngram4_GBC_85-15:']
    print('\n\n',pred)
    

def lemmatize(text):
    text="".join([word.lower() for word in text if word not in string.punctuation])
    tokens=re.split(r'\W+', text)
    text=" ".join([wn.lemmatize(word) for word in tokens if word not in stopwords])
    return text    
    
def predict_ngramGBC_lemma(new_review): 
    nr=pd.DataFrame([new_review])
    nr.columns=['newReview']
    nr['lemma']=nr['newReview'].apply(lambda x: lemmatize(x))

    gb=GradientBoostingClassifier(n_estimators=150,max_depth=11)
    n_gram_vect=CountVectorizer(ngram_range=(1,4))
    
    n_gram_vect_fit=n_gram_vect.fit(X_train['Lemmatized'])
    n_gram_train=n_gram_vect_fit.transform(X_train['Lemmatized'])
    n_gram_test=n_gram_vect_fit.transform(nr['lemma'])
    
    model = gb.fit(n_gram_train,y_train)
    pred=pd.DataFrame(model.predict(n_gram_test))
    pred.columns=['Recommended Healthcare Service:']
    pred.index= ['lemmatized_1ngram4_GBC_85-15:']
    print('\n\n',pred)
predict_ngramGBC_clean('I need a massage!') 
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_GBC_85-15:               massage benefits
predict_ngramGBC_lemma('I need a massage!')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_1ngram4_GBC_85-15:                massage benefits

The GBC functions behave the same way: whether the text was stemmed or lemmatized, after both versions were stripped of punctuation and stop words and made lowercase, the results were the same for the short input ‘I need a massage!’

Let’s try some other inputs.

predict_ngramGBC_clean('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.') 
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_GBC_85-15:               massage benefits
predict_ngramGBC_lemma('I have been working out a lot more than normal and am sore all over. Feels like a car hit me. I can\'t touch my toes to tie my shoes and my neck won\'t turn to the right. Help me.')
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_1ngram4_GBC_85-15:                massage benefits
#X_train['Document'][35]
y_train[35]
## 'cupping benefits'
testing35 = X_train['Document'][35]
type(testing35)
## <class 'str'>

Let’s test our models on index 35 of the training set and see whether they predict the true target, cupping benefits.

predict_ngramGBC_clean(testing35)
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_GBC_85-15:               cupping benefits
predict_ngramRFC_clean(testing35)
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_RFC_85-15:               cupping benefits
predict_ngramGBC_lemma(testing35)
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_1ngram4_GBC_85-15:                cupping benefits
predict_ngramRFC_lemma(testing35)
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_1ngram4_RFC_85-15:                cupping benefits

The above test shows that the functions are working: the input was taken directly from the training set, its class is cupping benefits, and all four models recommended that class, as they should for a document they were trained on.

Here is the large document for testing35 printed out:

testing35
## '\nCupping Therapy\n\n\nCupping therapy is an ancient form of alternative medicine in which a therapist puts special cups on your skin for a few minutes to create suction. People get it for many purposes, including to help with pain, inflammation, blood flow, relaxation and well-being, and as a type of deep-tissue massage.\n\nThe cups may be made of:\n\n    Glass\n    Bamboo\n    Earthenware\n    Silicone\n\nCupping therapy might be trendy now, but it?s not new. It dates back to ancient Egyptian, Chinese, and Middle Eastern cultures. One of the oldest medical textbooks in the world, the Ebers Papyrus, describes how the ancient Egyptians used cupping therapy in 1,550 B.C.\nTypes\n\nThere are different methods of cupping, including:\n\n    Dry\n    Wet\n\n\n\nDuring both types of cupping, your therapist will put a flammable substance such as alcohol, herbs, or paper in a cup and set it on fire. As the fire goes out, he puts the cup upside down on your skin.\n\nAs the air inside the cup cools, it creates a vacuum. This causes your skin to rise and redden as your blood vessels expand. The cup is generally left in place for up to 3 minutes.\n\nA more modern version of cupping uses a rubber pump instead of fire to create the vacuum inside the cup. Sometimes therapists use silicone cups, which they can move from place to place on your skin for a massage-like effect.\n\nWet cupping creates a mild suction by leaving a cup in place for about 3 minutes. The therapist then removes the cup and uses a small scalpel to make light, tiny cuts on your skin. Next, he or she does a second suction to draw out a small quantity of blood.\n\nYou might get 3-5 cups in your first session. Or you might just try one to see how it goes. It?s rare to get more than 5-7 cups, the British Cupping Society notes.\n\nAfterward, you may get an antibiotic ointment and bandage to prevent infection. 
Your skin should look normal again within 10 days.\n\nCupping therapy supporters believe that wet cupping removes harmful substances and toxins from the body to promote healing. But that?s not proven.\n\nSome people also get ?needle cupping,? in which the therapist first inserts acupuncture needles and then puts cups over them.'

A user most likely won’t be entering a text this large into the models. When predicting a class, the tree models are therefore most likely affected by the length of the document, because they were trained on long documents while user inputs are often a single line. We can try this out with some benefits we know are in the documents and see if something other than ‘massage benefits’ is recommended. The next string is a snippet from a cupping benefits document.
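
One way to look at the effect of input length, sketched here in plain Python rather than with the fitted models, is to count how many distinct 1- to 4-word n-grams a short input produces compared with a longer passage. The helper `word_ngrams` and the two example strings are assumptions for illustration (the longer string is the first sentence of the cupping text above, lowercased and without punctuation), and no stop-word removal or stemming is applied:

```python
def word_ngrams(text, n_min=1, n_max=4):
    """Return the set of distinct word n-grams of length n_min..n_max."""
    words = text.lower().split()
    return {" ".join(words[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(words) - n + 1)}

short_input = "i need a massage"
long_passage = ("cupping therapy is an ancient form of alternative medicine in which "
                "a therapist puts special cups on your skin for a few minutes to create suction")

short_grams = word_ngrams(short_input)
long_grams = word_ngrams(long_passage)
shared = short_grams & long_grams  # n-grams that appear in both texts

print(len(short_grams), "distinct n-grams in the short input")
print(len(long_grams), "distinct n-grams in the longer passage")
print(sorted(shared), "shared between the two")
```

A short input fills in only a handful of the tens of thousands of n-gram columns, so the trees have very little to split on, which may be part of why short inputs drift toward a single class.
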


testing = 'improves blood flow through the veins and arteries. Especially useful for athletes is cupping?s potential to relieve muscle spasms.\r\n\r\nCupping also affects the digestive system. A few benefits include an improved metabolism, relief from constipation, a healthy appetite, and stronger digestion.'
predict_ngramGBC_clean(testing)
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_GBC_85-15:               massage benefits
predict_ngramRFC_clean(testing)
## 
## 
##                             Recommended Healthcare Service
## stemmed_1ngram4_RFC_85-15:               massage benefits
predict_ngramGBC_lemma(testing)
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_1ngram4_GBC_85-15:                massage benefits
predict_ngramRFC_lemma(testing)
## 
## 
##                                Recommended Healthcare Service:
## lemmatized_1ngram4_RFC_85-15:                massage benefits

Even though the exact words came from a portion of a cupping benefits document, all four models still predicted ‘massage benefits’ for the class. This is also using n-gram tokenization, so the TF-IDF and count tokenizations still need to be tested for both of these tree-type models. One thing is for sure: the ‘Not Professional’ class used in the previous scripts would need about 100 times its document length to be accurate to a degree on short user inputs.
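
A minimal sketch of that follow-up test, assuming scikit-learn is available: fit one pipeline per vectorizer once and reuse it for every input, instead of refitting inside each prediction function. The toy documents and labels below are made up for illustration and stand in for `X_train['Cleaned_text']` and `y_train`; `fit_pipelines` and `recommend` are hypothetical helper names, not functions from this script.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the cleaned training text and labels (made up for illustration).
toy_docs = [
    "massag relax sore muscl reliev tension",
    "deep tissu massag eas stiff neck",
    "cup therapi suction skin blood flow",
    "wet cup dri cup suction cup",
]
toy_labels = ["massage benefits", "massage benefits",
              "cupping benefits", "cupping benefits"]

def fit_pipelines(docs, labels):
    """Fit one RFC pipeline per vectorizer and return them keyed by name."""
    vectorizers = {
        "count": CountVectorizer(),
        "tfidf": TfidfVectorizer(),
        "ngram_1_4": CountVectorizer(ngram_range=(1, 4)),
    }
    fitted = {}
    for name, vect in vectorizers.items():
        # random_state is fixed only so repeated runs of this sketch agree
        pipe = make_pipeline(vect, RandomForestClassifier(n_estimators=150, random_state=0))
        fitted[name] = pipe.fit(docs, labels)
    return fitted

def recommend(fitted, cleaned_text):
    """Return each pipeline's predicted class for one already-cleaned input string."""
    return {name: pipe.predict([cleaned_text])[0] for name, pipe in fitted.items()}

pipelines = fit_pipelines(toy_docs, toy_labels)
results = recommend(pipelines, "sore neck need massag")
for name, label in results.items():
    print(name, "->", label)
```

Swapping `RandomForestClassifier` for `GradientBoostingClassifier` in `fit_pipelines` would give the same comparison for the GBC models.
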