This data covers 21 days of experimenting on body effects, tracked with body measurements in inches using a tape measure, fat measurements in mm using a fat caliper, and body weight on a scale. The scale was changed to a digital scale during the research; the earlier readings came from a manual scale that couldn’t easily stay at 0, so they are estimates.
This research started on Monday 1/4/2021 and ended on Monday 1/25/2021, after the 21 days were completed. Sundays were rest days with no lipocavitation or working out, but the diet continued.
Diet: The gluten free part of the diet was added mid research to see if any effects are noticed on the body. Otherwise, the diet is daily and allows the research subject, myself, to eat as much of whatever I want except alcohol, processed sweets, meats including seafood (plant based meat is allowed), and added butter.
Lipocavitation: Lipocavitation was done three times a week, on Mondays, Wednesdays, and Saturdays, starting Monday the 4th of January and ending Saturday the 23rd of January. The parts of the body treated were the upper thighs, upper arms, and midsection, including the sides and abdominal areas of fat. Each body part was treated for 5 minutes. At first the tool used was the vacuum at 1-3 notches of suction power, with the RF at 20-65% on a radio frequency device able to reach 1 MHz. This had to be changed once bruising was noticed on the outer thighs and tricep areas of the arms, so that only the 6 pronged RF tool was used on the arms and legs at 60-65% RF power, and only the vacuum and RF combination tool was used on the midsection at 20-25% RF power.
Exercise: The exercise part of the research was cardio kickboxing on a standing bag designed for kickboxing, plus total body weight training with weights comfortable for 3 sets of 10-12 repetitions in each of 18 exercises, all on the same day, five times a week on Mondays, Tuesdays, Thursdays, Fridays, and Saturdays. There weren’t any workouts on Wednesdays and Sundays. The cardio kickboxing started out at 20-30 minutes of a mixed set of rounds and round lengths, always with a one minute rest interval between rounds, timed with onlineboxingtimer.com. Each round was recorded on video, and the videos might be uploaded to YouTube as daily kickboxing videos and linked from this document. They are too large for GitHub, but partial videos are available at my personal Instagram account @janisharris1982. Only the first set of each weight training exercise was recorded on video on workout days; these are also publicly available on my Instagram account. The weight training exercises were as follows, with some variation in the order of grouped exercises:
shoulder lifts medial/posterior deltoids/lats 3 sets 10-12 reps 10 lbs
quads with leg extensions sitting 3 sets 10-12 reps 40 lbs
obliques side extensions 3 sets 12 reps 25 lbs
bench press 3 sets 10-12 reps barbell 65 lbs
tricep extension above head dumbbells 25 lbs 3 sets 10-12 reps
hamstrings leg flexion laying prone 3 sets 10-12 reps 35 lbs
calves 3 sets 12 reps 50 lbs total with dumbbells
military press 3 sets 30 lb dumbbells
upper trapezius shoulder shrugs 50 lbs dumbbells 3 sets 10-12 reps
tricep chair dips 3 sets 12 reps no added weight
standing adductors 3 sets 10-12 reps 20 lbs
rhomboids scapula abduction 3 sets 12 reps 25 lbs
biceps curls 30 lbs 3 sets 10-12
standing abductors 3 sets 10-12 reps 20 lbs
squats 3 sets 10 reps barbell 45 lb + 40lbs added weight
leg lifts standing for abs, 3 sets 20 reps no added weight
dead lifts 3 sets 10-12 reps dumbbells 50 lbs
tricep extension rope standing 3 sets 25 lbs
A worksheet was made that wasn’t suitable for machine learning (ML) because of string features such as the notes. The data made usable for ML was stored as a csv. Also, the fractional measurement values in the worksheet were made numeric in the csv file so they are handled properly for ML in R. Otherwise they would read in as character strings or factors, which isn’t useful.
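As a rough sketch of that kind of conversion, assuming the worksheet stored fractional measurements as text such as "33 1/2" (the exact worksheet format is an assumption, and `fraction_to_numeric` is a hypothetical helper, not the code used to build the csv):

```r
# Hypothetical helper: convert text like "33 1/2", "1/4" or "12" to numeric.
# The input format is an assumption; adjust it to match the real worksheet.
fraction_to_numeric <- function(x) {
  vapply(strsplit(trimws(x), "\\s+"), function(parts) {
    total <- 0
    for (p in parts) {
      if (grepl("/", p, fixed = TRUE)) {
        # A piece such as "1/2": numerator divided by denominator
        nums <- as.numeric(strsplit(p, "/", fixed = TRUE)[[1]])
        total <- total + nums[1] / nums[2]
      } else {
        # A whole-number piece such as "33"
        total <- total + as.numeric(p)
      }
    }
    total
  }, numeric(1))
}

fraction_to_numeric(c("33 1/2", "12", "22 5/8"))
```

Doing the conversion once, before saving the csv, means read.csv() sees plain numbers and keeps those columns numeric.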
The exercises done each day, their order, and the weight changes were recorded; the increases and decreases, each as a sum over all exercises, were added as features later to see if they play a role in body effects.
The diet’s listed food items are also in the worksheet, but the calories, the fat, saturated fat (included in the fat count), carbohydrates, fiber (included in the carbohydrate count), and protein in grams, and the sodium in milligrams are each listed as their own feature. Gluten free didn’t start until the 16th, or day 13 of our 21 days of research.
A waist trimmer was also worn daily for most of the day, more so while at work, during this research, and it is added as a feature to see if there are body effects, such as the waistline measurement being affected by the waist trimmer.
The weather was recorded only for the start time of the workout, so it could have changed a few hours later, by the time the cardio kickboxing and the 18 total body exercises done in 3 sets of 10-12 reps each were completed. The values are available at https://www.timeanddate.com/weather/usa/corona/historic . I waited too long to add the weather for Wednesday 1/6/2021 at 5:30 am, the time of lipocavitation, and could only find a daily low of 48 and high of 66 at https://www.accuweather.com/en/us/corona/92879/january-weather/332088 rather than a value for that time, so I input 48 as the temperature. The weather fluctuated from cooler temps in week 1 to hotter temps in week 2 and back to cold temps by the end of week 3.
Also, values that were null but literally meant no change, like the increase or decrease in weight lifting weights, were made 0. The only actual null values were for the 1st Sunday, the 10th, because no measurements were taken; I used the prior day’s measurements, as it was an off day for working out and lipocavitation but dieting was maintained.
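A minimal sketch of those two fixes, using a toy data frame with made-up values (the `toy` data and its `waist` column are illustrative only; the real edits were made in the worksheet before saving the csv):

```r
# Toy data standing in for the worksheet; the values are illustrative.
toy <- data.frame(
  Day = 1:4,
  weightLiftingIncrease_lbs = c(5, NA, 10, NA),
  waist = c(33.0, 32.5, NA, 32.0)
)

# A blank that really means "no change" becomes 0.
toy$weightLiftingIncrease_lbs[is.na(toy$weightLiftingIncrease_lbs)] <- 0

# A day with no measurement taken is filled with the prior day's value.
for (i in seq_len(nrow(toy))[-1]) {
  if (is.na(toy$waist[i])) toy$waist[i] <- toy$waist[i - 1]
}
toy
```

The two cases are handled differently on purpose: a 0 is a true value for "no change", while a carried-forward measurement is only a stand-in for a value that was never observed.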
Compression socks were also worn for about the last 4-5 days of the research because of the research subject’s low iron and lymphedema, from a uterine fibroid causing an abdominal obstruction that impacts venous and lymphatic flow.
Also note that bowel movements, or BMs, were recorded as their own feature to see if the dieting, waist trimmer, or exercises had an effect on the number of BMs a day. Notes show if the BM was larger than usual or if there was more than one. Otherwise nothing was noted, because the research subject has them routinely every day after 1-2 cups of coffee.
The number of cups of coffee drunk in a day was also added as a feature, as was the approximate number of hours of sleep from bed time to waking up. Not every night was a night of sound sleep. Some interruptions occurred, like cats meowing, alarms, fireworks, phone calls, roommate noises, etc.
The data can be retrieved from my GitHub account at https://github.com/JanJanJan2018/lipocavitation-dieting-exercise-21-day-research by searching for the files needed. The worksheet has more detailed information than the csv file that we will read in below to analyze and see the body effects from this treatment over 3 weeks of dieting, exercise, and lipocavitation.
The following image is a screenshot of a table created in a Word document with important measures: the day the measurements were documented on lipocavitation days, the weight at that time, the calories consumed the day before, the time the measurements were taken, the minutes of cardio done the day before, and the waist trimmer worn. The top row of images is the side view without holding the abs in; the research subject has a uterine fibroid. These images could also show if the fibroid shrinks or grows in appearance with dieting, exercising, and lipocavitation. The 2nd row of images is the closest to straight into the camera, with waist measurements documented at that time. Day 22 is the results day, after all 21 days were completed on the routine schedule and following the rest day on the Sunday prior. Day 22 diet, nutrition, and other features are also filled in, but they aren’t part of this research, as they were observed on the same day the results were completed. This row was included to show the results on the 22nd day with the measurements and the time they were taken, since food was eaten that day before the measurements were taken at 2 pm.
results
library(lubridate)
##
## Attaching package: 'lubridate'
## The following object is masked from 'package:base':
##
## date
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:lubridate':
##
## intersect, setdiff, union
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(randomForest)
## randomForest 4.6-14
## Type rfNews() to see new features/changes/bug fixes.
##
## Attaching package: 'randomForest'
## The following object is masked from 'package:dplyr':
##
## combine
data <- read.csv('workout diet lipocavitation 1-4-2021 to 1-25-2021 -edited for ML.csv',
sep=',', header=T)
data
## weekDay.Date Day Date time weatherAtWorkoutTime
## 1 Mon 1 1/4/2021 9:00 AM 51
## 2 Tue 2 1/5/2021 9:00 AM 48
## 3 Wed 3 1/6/2021 5:30 AM 48
## 4 Thur 4 1/7/2021 9:00 AM 53
## 5 Fri 5 1/8/2021 5:00 PM 63
## 6 Sat 6 1/9/2021 4:30 PM 73
## 7 Sun 7 1/10/2021 2:00 PM 70
## 8 Mon 8 1/11/2021 9:00 AM 60
## 9 Tue 9 1/12/2021 9:00 AM 53
## 10 Wed 10 1/13/2021 5:30 AM 43
## 11 Thur 11 1/14/2021 9:00 AM 67
## 12 Fri 12 1/15/2021 5:30 PM 81
## 13 Sat 13 1/16/2021 3:00 PM 89
## 14 Sun 14 1/17/2021 2:00 PM 88
## 15 Mon 15 1/18/2021 9:30 AM 60
## 16 Tue 16 1/19/2021 9:00 AM 63
## 17 Wed 17 1/20/2021 5:30 AM 65
## 18 Thur 18 1/21/2021 9:00 AM 60
## 19 Fri 19 1/22/2021 5:00 PM 57
## 20 Sat 20 1/23/2021 4:30 PM 50
## 21 Sun 21 1/24/2021 2:00 PM 58
## 22 Mon 22 1/25/2021 2:00 PM 67
## minutesOfCardioKickBoxing lipocavitation sumLipocavitationTreatments
## 1 20 1 1
## 2 20 0 1
## 3 0 1 2
## 4 30 0 2
## 5 30 0 2
## 6 25 1 3
## 7 0 0 3
## 8 20 1 4
## 9 20 0 4
## 10 0 1 5
## 11 30 0 5
## 12 30 0 5
## 13 25 1 6
## 14 0 0 6
## 15 30 1 7
## 16 28 0 7
## 17 0 1 8
## 18 30 0 8
## 19 30 0 8
## 20 30 1 9
## 21 0 0 9
## 22 0 0 9
## timeMeasurementsTaken weightScaleMeasurement weightChangeFromDayPrior
## 1 5:00 AM 142.0 0.0
## 2 2:00 PM 141.0 -1.0
## 3 6:00 AM 142.0 1.0
## 4 1230 PM 143.0 1.0
## 5 8:45 PM 144.0 1.0
## 6 6:30 PM 147.6 3.6
## 7 7:15 AM 145.4 -2.2
## 8 6:00 AM 141.8 -3.6
## 9 11:30 AM 146.6 4.8
## 10 5:45 AM 143.8 -2.8
## 11 11:45 AM 145.4 1.6
## 12 9:00 PM 145.4 0.0
## 13 5:45 AM 143.8 -1.6
## 14 4:00 PM 143.8 0.0
## 15 7:10 AM 143.8 0.0
## 16 6:00 PM 146.4 2.6
## 17 4:00 AM 147.6 1.2
## 18 7:30 AM 146.4 -1.2
## 19 8:20 PM 147.2 0.8
## 20 5:15 AM 144.2 -3.0
## 21 8:50 PM 143.8 -0.4
## 22 2:00 PM 141.6 -0.4
## calories_consumed_dayPrior waistlineMeasurement.BellyButton
## 1 1826.50 33.50
## 2 1826.50 33.00
## 3 1884.00 33.25
## 4 2508.00 33.00
## 5 1811.50 33.00
## 6 1963.00 32.00
## 7 1407.00 32.00
## 8 2541.00 33.00
## 9 2244.00 33.50
## 10 2961.00 33.50
## 11 2526.00 33.50
## 12 2709.00 33.00
## 13 2163.00 33.50
## 14 1777.00 32.00
## 15 2295.00 32.50
## 16 2238.00 32.50
## 17 2688.00 32.50
## 18 2526.00 32.50
## 19 2418.00 33.50
## 20 1353.50 32.00
## 21 1973.75 32.00
## 22 1681.25 31.50
## armMeasurementInches.R armMeasurementInches.L thighMeasurementInches.R
## 1 12.00 12.25 22.63
## 2 11.75 11.50 23.50
## 3 12.00 12.00 23.25
## 4 12.00 12.00 23.75
## 5 12.00 12.00 24.00
## 6 12.00 12.00 24.00
## 7 12.00 12.00 24.00
## 8 12.00 12.00 23.00
## 9 12.00 12.00 23.00
## 10 12.00 12.00 23.00
## 11 12.00 12.00 23.50
## 12 12.00 12.00 23.50
## 13 12.00 12.00 22.50
## 14 12.00 12.00 23.00
## 15 11.75 11.75 22.50
## 16 11.50 11.50 22.50
## 17 11.88 11.88 22.75
## 18 11.50 11.50 22.00
## 19 11.50 11.50 23.00
## 20 11.50 11.50 22.50
## 21 11.50 11.50 22.50
## 22 11.25 11.25 22.50
## thighMeasurementInches.L absFat.MM.R absFat.MM.L tricepsFat.MM.R
## 1 22.63 30 30 26
## 2 23.50 22 22 22
## 3 23.25 20 20 20
## 4 23.75 22 22 20
## 5 24.00 24 24 22
## 6 24.00 22 22 22
## 7 24.00 22 22 22
## 8 23.00 22 22 22
## 9 23.00 20 20 22
## 10 23.00 20 22 20
## 11 23.50 20 22 24
## 12 23.50 20 22 22
## 13 23.00 22 24 22
## 14 23.00 20 20 20
## 15 22.50 20 20 20
## 16 22.50 20 20 20
## 17 22.75 20 20 20
## 18 22.00 20 22 22
## 19 23.00 18 20 22
## 20 22.50 20 22 18
## 21 22.50 22 22 22
## 22 22.50 20 20 20
## tricepsFat.MM.L innerThighFat.MM.R innerThighFat.MM.L dailyCalories fat_gram
## 1 28 28 28 1826.50 41.50
## 2 22 22 20 1884.00 61.25
## 3 20 16 16 2508.00 72.00
## 4 20 22 22 1811.50 59.00
## 5 24 16 16 1963.00 73.85
## 6 22 16 20 1407.00 39.00
## 7 22 16 20 2541.00 100.10
## 8 22 14 14 2244.00 82.00
## 9 22 12 12 2961.00 95.34
## 10 20 12 12 2526.00 78.68
## 11 22 12 14 2709.00 83.68
## 12 22 12 12 2163.00 60.68
## 13 22 14 12 1777.00 63.84
## 14 22 10 12 2295.00 60.84
## 15 20 10 12 2238.00 72.34
## 16 20 14 14 2688.00 113.34
## 17 20 12 14 2526.00 102.68
## 18 22 14 12 2418.00 88.34
## 19 20 16 12 1353.50 72.00
## 20 18 12 10 1973.75 42.00
## 21 22 12 14 1681.25 31.60
## 22 20 10 10 2655.60 143.02
## saturatedFat_gram protein_gram carbs_grams fiber_grams_from_carbs
## 1 11.50 42.00 172.50 29.50
## 2 31.25 63.00 236.50 32.00
## 3 38.50 82.00 371.00 45.00
## 4 16.00 62.00 280.00 49.00
## 5 18.75 67.25 291.75 46.00
## 6 25.00 57.00 231.00 16.00
## 7 40.00 91.25 320.25 47.00
## 8 31.00 61.00 308.00 43.00
## 9 42.04 96.00 392.00 45.00
## 10 28.58 85.00 392.00 55.00
## 11 26.58 87.00 402.00 56.00
## 12 16.58 99.00 308.00 55.00
## 13 16.04 75.00 238.00 71.00
## 14 28.04 80.00 349.00 63.00
## 15 39.04 128.00 238.00 50.00
## 16 35.54 101.00 323.00 61.00
## 17 29.08 123.00 300.00 61.00
## 18 26.04 102.00 351.00 56.00
## 19 23.50 43.00 174.00 27.00
## 20 21.00 76.00 338.50 33.00
## 21 22.00 51.80 285.50 28.20
## 22 75.82 70.57 269.61 57.57
## sodiumDailyIntake coffee_cups morning_BM MenstruationDay
## 1 1526.00 2 1 0
## 2 3729.00 2 1 0
## 3 5349.00 2 1 0
## 4 1673.00 2 1 0
## 5 2852.00 2 1 0
## 6 1731.00 2 1 0
## 7 3509.00 2 1 0
## 8 3858.00 2 1 0
## 9 4719.03 2 1 0
## 10 4831.06 2 1 0
## 11 4822.06 2 1 0
## 12 2524.06 2 1 0
## 13 1180.03 2 1 0
## 14 4302.03 2 2 0
## 15 3921.03 2 1 0
## 16 2287.03 2 1 0
## 17 2373.06 3 2 0
## 18 1736.03 3 1 0
## 19 1077.00 3 2 0
## 20 3653.00 3 1 1
## 21 2263.00 3 1 2
## 22 2832.50 3 3 3
## weightLiftingIncrease_lbs weightLiftingDecrease_lbs waistTrimmer
## 1 0 0 32
## 2 5 0 32
## 3 0 0 32
## 4 10 0 32
## 5 0 0 32
## 6 0 -10 32
## 7 0 0 32
## 8 50 0 32
## 9 45 0 32
## 10 0 0 32
## 11 15 -5 32
## 12 5 0 32
## 13 0 -10 31
## 14 0 0 32
## 15 10 -5 31
## 16 0 0 31
## 17 0 0 31
## 18 10 -20 31
## 19 10 0 31
## 20 0 -10 31
## 21 0 0 31
## 22 0 0 31
## compressionSocks HoursOfSleep glutenFree alcoholFree processedSweetsFree
## 1 0 7.0 0 1 1
## 2 0 8.5 0 1 1
## 3 0 7.5 0 1 1
## 4 0 10.0 0 1 1
## 5 0 7.5 0 1 1
## 6 0 9.5 0 1 1
## 7 0 9.0 0 1 1
## 8 0 9.5 0 1 1
## 9 0 8.5 0 1 1
## 10 0 8.0 0 1 1
## 11 0 8.5 0 1 1
## 12 0 6.0 0 1 1
## 13 0 7.5 1 1 1
## 14 0 3.5 1 1 1
## 15 0 8.0 1 1 1
## 16 0 6.0 1 1 1
## 17 0 6.5 1 1 1
## 18 1 7.5 1 1 1
## 19 1 7.5 1 1 1
## 20 1 7.0 1 1 1
## 21 1 7.0 1 1 1
## 22 1 7.0 0 1 1
## butterAddedFree meatFree
## 1 1 1
## 2 1 1
## 3 1 1
## 4 1 1
## 5 1 1
## 6 1 1
## 7 1 1
## 8 1 1
## 9 1 1
## 10 1 1
## 11 1 1
## 12 1 1
## 13 1 1
## 14 1 1
## 15 1 1
## 16 1 1
## 17 1 1
## 18 1 1
## 19 1 1
## 20 1 1
## 21 1 1
## 22 1 1
str(data)
## 'data.frame': 22 obs. of 43 variables:
## $ weekDay.Date : Factor w/ 7 levels "Fri","Mon","Sat",..: 2 6 7 5 1 3 4 2 6 7 ...
## $ Day : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Date : Factor w/ 22 levels "1/10/2021","1/11/2021",..: 17 18 19 20 21 22 1 2 3 4 ...
## $ time : Factor w/ 8 levels "2:00 PM","3:00 PM",..: 7 7 5 7 4 3 1 7 7 5 ...
## $ weatherAtWorkoutTime : int 51 48 48 53 63 73 70 60 53 43 ...
## $ minutesOfCardioKickBoxing : int 20 20 0 30 30 25 0 20 20 0 ...
## $ lipocavitation : int 1 0 1 0 0 1 0 1 0 1 ...
## $ sumLipocavitationTreatments : int 1 1 2 2 2 3 3 4 4 5 ...
## $ timeMeasurementsTaken : Factor w/ 19 levels "11:30 AM","11:45 AM",..: 7 4 10 3 17 12 14 10 1 9 ...
## $ weightScaleMeasurement : num 142 141 142 143 144 ...
## $ weightChangeFromDayPrior : num 0 -1 1 1 1 3.6 -2.2 -3.6 4.8 -2.8 ...
## $ calories_consumed_dayPrior : num 1826 1826 1884 2508 1812 ...
## $ waistlineMeasurement.BellyButton: num 33.5 33 33.2 33 33 ...
## $ armMeasurementInches.R : num 12 11.8 12 12 12 ...
## $ armMeasurementInches.L : num 12.2 11.5 12 12 12 ...
## $ thighMeasurementInches.R : num 22.6 23.5 23.2 23.8 24 ...
## $ thighMeasurementInches.L : num 22.6 23.5 23.2 23.8 24 ...
## $ absFat.MM.R : int 30 22 20 22 24 22 22 22 20 20 ...
## $ absFat.MM.L : int 30 22 20 22 24 22 22 22 20 22 ...
## $ tricepsFat.MM.R : int 26 22 20 20 22 22 22 22 22 20 ...
## $ tricepsFat.MM.L : int 28 22 20 20 24 22 22 22 22 20 ...
## $ innerThighFat.MM.R : int 28 22 16 22 16 16 16 14 12 12 ...
## $ innerThighFat.MM.L : int 28 20 16 22 16 20 20 14 12 12 ...
## $ dailyCalories : num 1826 1884 2508 1812 1963 ...
## $ fat_gram : num 41.5 61.2 72 59 73.8 ...
## $ saturatedFat_gram : num 11.5 31.2 38.5 16 18.8 ...
## $ protein_gram : num 42 63 82 62 67.2 ...
## $ carbs_grams : num 172 236 371 280 292 ...
## $ fiber_grams_from_carbs : num 29.5 32 45 49 46 16 47 43 45 55 ...
## $ sodiumDailyIntake : num 1526 3729 5349 1673 2852 ...
## $ coffee_cups : int 2 2 2 2 2 2 2 2 2 2 ...
## $ morning_BM : int 1 1 1 1 1 1 1 1 1 1 ...
## $ MenstruationDay : int 0 0 0 0 0 0 0 0 0 0 ...
## $ weightLiftingIncrease_lbs : int 0 5 0 10 0 0 0 50 45 0 ...
## $ weightLiftingDecrease_lbs : int 0 0 0 0 0 -10 0 0 0 0 ...
## $ waistTrimmer : int 32 32 32 32 32 32 32 32 32 32 ...
## $ compressionSocks : int 0 0 0 0 0 0 0 0 0 0 ...
## $ HoursOfSleep : num 7 8.5 7.5 10 7.5 9.5 9 9.5 8.5 8 ...
## $ glutenFree : int 0 0 0 0 0 0 0 0 0 0 ...
## $ alcoholFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ processedSweetsFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ butterAddedFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ meatFree : int 1 1 1 1 1 1 1 1 1 1 ...
In the features above, weightLiftingIncrease_lbs and weightLiftingDecrease_lbs are that day’s sums of the weight added to or removed from the 3 sets of each exercise in the weight training portion. For example, if I decide to increase the bench press (pecs) to 65 lbs instead of 45 lbs and also increase the leg extension/knee extension (quads) by 5 lbs, I put 20+5 = 25 in the weightLiftingIncrease_lbs column. And if I decide to decrease the squats from 85 to 75 and decrease the military press by 10 lbs as well, I put -10+(-10) = -20 in the weightLiftingDecrease_lbs column.
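The arithmetic in that example can be sketched in R. For illustration the four changes are put on one day, and the `changes` vector and its names are hypothetical; the real columns were filled in by hand:

```r
# Weight changes in lbs, taken from the example in the text.
# Positive = weight added, negative = weight removed.
changes <- c(bench_press = 20, leg_extension = 5,
             squats = -10, military_press = -10)

weightLiftingIncrease_lbs <- sum(changes[changes > 0])  # 20 + 5
weightLiftingDecrease_lbs <- sum(changes[changes < 0])  # -10 + (-10)

c(increase = weightLiftingIncrease_lbs, decrease = weightLiftingDecrease_lbs)
```

Keeping increases and decreases in separate columns, instead of one net value, stops a day with both from cancelling out to 0.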
Also, the trailing columns from glutenFree to meatFree are binary: the value is 1 if that part of the diet was followed and 0 if it was not. The compressionSocks column is also binary, with 1 if they were worn that day and 0 if not. The morning_BM and coffee_cups columns are the counts of bowel movements and cups of coffee throughout the day, from 1 to 3. The MenstruationDay column is 0 when not on menstruation and otherwise counts the day of menstruation (1 to 3 in this data). The hours of sleep are an approximate estimate of total hours of sleep from bed time to waking up. The Date, time, and timeMeasurementsTaken columns weren’t read in as date or time columns and need to be converted. The weekDay.Date column can be made into a factor.
data$weekDay.Date <- as.factor(data$weekDay.Date)
data$Date <- gsub('/','-',data$Date)
data$Date
## [1] "1-4-2021" "1-5-2021" "1-6-2021" "1-7-2021" "1-8-2021" "1-9-2021"
## [7] "1-10-2021" "1-11-2021" "1-12-2021" "1-13-2021" "1-14-2021" "1-15-2021"
## [13] "1-16-2021" "1-17-2021" "1-18-2021" "1-19-2021" "1-20-2021" "1-21-2021"
## [19] "1-22-2021" "1-23-2021" "1-24-2021" "1-25-2021"
dateString <- strsplit(data$Date, split='-',perl=T)
dateString
## [[1]]
## [1] "1" "4" "2021"
##
## [[2]]
## [1] "1" "5" "2021"
##
## [[3]]
## [1] "1" "6" "2021"
##
## [[4]]
## [1] "1" "7" "2021"
##
## [[5]]
## [1] "1" "8" "2021"
##
## [[6]]
## [1] "1" "9" "2021"
##
## [[7]]
## [1] "1" "10" "2021"
##
## [[8]]
## [1] "1" "11" "2021"
##
## [[9]]
## [1] "1" "12" "2021"
##
## [[10]]
## [1] "1" "13" "2021"
##
## [[11]]
## [1] "1" "14" "2021"
##
## [[12]]
## [1] "1" "15" "2021"
##
## [[13]]
## [1] "1" "16" "2021"
##
## [[14]]
## [1] "1" "17" "2021"
##
## [[15]]
## [1] "1" "18" "2021"
##
## [[16]]
## [1] "1" "19" "2021"
##
## [[17]]
## [1] "1" "20" "2021"
##
## [[18]]
## [1] "1" "21" "2021"
##
## [[19]]
## [1] "1" "22" "2021"
##
## [[20]]
## [1] "1" "23" "2021"
##
## [[21]]
## [1] "1" "24" "2021"
##
## [[22]]
## [1] "1" "25" "2021"
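# These steps rely on this data: every month is January and only rows 1-6 have single-digit days
month <- lapply(dateString,'[',1)
Month <- paste('0',month,sep='') # pad the month with a leading zero
day <- lapply(dateString,'[',2)
day1 <- paste('0',day[1:6],sep='') # pad days 4-9 (rows 1-6) with a leading zero
day[1:6] <- day1
year <- lapply(dateString,'[',3)
date1 <- paste(Month,day,year,sep='-')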
date1
## [1] "01-04-2021" "01-05-2021" "01-06-2021" "01-07-2021" "01-08-2021"
## [6] "01-09-2021" "01-10-2021" "01-11-2021" "01-12-2021" "01-13-2021"
## [11] "01-14-2021" "01-15-2021" "01-16-2021" "01-17-2021" "01-18-2021"
## [16] "01-19-2021" "01-20-2021" "01-21-2021" "01-22-2021" "01-23-2021"
## [21] "01-24-2021" "01-25-2021"
Date <- mdy(date1)
Date #a Date object
## [1] "2021-01-04" "2021-01-05" "2021-01-06" "2021-01-07" "2021-01-08"
## [6] "2021-01-09" "2021-01-10" "2021-01-11" "2021-01-12" "2021-01-13"
## [11] "2021-01-14" "2021-01-15" "2021-01-16" "2021-01-17" "2021-01-18"
## [16] "2021-01-19" "2021-01-20" "2021-01-21" "2021-01-22" "2021-01-23"
## [21] "2021-01-24" "2021-01-25"
Date2 <- as.Date(Date)
Date2 #a Date object
## [1] "2021-01-04" "2021-01-05" "2021-01-06" "2021-01-07" "2021-01-08"
## [6] "2021-01-09" "2021-01-10" "2021-01-11" "2021-01-12" "2021-01-13"
## [11] "2021-01-14" "2021-01-15" "2021-01-16" "2021-01-17" "2021-01-18"
## [16] "2021-01-19" "2021-01-20" "2021-01-21" "2021-01-22" "2021-01-23"
## [21] "2021-01-24" "2021-01-25"
Time <- data$time
pm <- grep('PM', Time)
am <- grep('AM', Time)
PM <- Time[pm]
PM <- gsub(' PM','',PM)
PM <- gsub(':','-',PM)
PM
## [1] "5-00" "4-30" "2-00" "5-30" "3-00" "2-00" "5-00" "4-30" "2-00" "2-00"
PM2 <- strsplit(PM,split='-')
PM2
## [[1]]
## [1] "5" "00"
##
## [[2]]
## [1] "4" "30"
##
## [[3]]
## [1] "2" "00"
##
## [[4]]
## [1] "5" "30"
##
## [[5]]
## [1] "3" "00"
##
## [[6]]
## [1] "2" "00"
##
## [[7]]
## [1] "5" "00"
##
## [[8]]
## [1] "4" "30"
##
## [[9]]
## [1] "2" "00"
##
## [[10]]
## [1] "2" "00"
hourPM <- lapply(PM2,'[',1)
minutePM <- lapply(PM2,'[',2)
hourPM <- as.integer(hourPM)
minutePM <- as.integer(minutePM)
AM <- Time[am]
AM <- gsub(' AM','', AM)
AM <- gsub(':','-',AM)
AM
## [1] "9-00" "9-00" "5-30" "9-00" "9-00" "9-00" "5-30" "9-00" "9-30" "9-00"
## [11] "5-30" "9-00"
AM2 <- strsplit(AM,split='-')
AM2
## [[1]]
## [1] "9" "00"
##
## [[2]]
## [1] "9" "00"
##
## [[3]]
## [1] "5" "30"
##
## [[4]]
## [1] "9" "00"
##
## [[5]]
## [1] "9" "00"
##
## [[6]]
## [1] "9" "00"
##
## [[7]]
## [1] "5" "30"
##
## [[8]]
## [1] "9" "00"
##
## [[9]]
## [1] "9" "30"
##
## [[10]]
## [1] "9" "00"
##
## [[11]]
## [1] "5" "30"
##
## [[12]]
## [1] "9" "00"
hourAM <- lapply(AM2,'[',1)
minuteAM <- lapply(AM2,'[',2)
hourAM <- as.integer(hourAM)
minuteAM <- as.integer(minuteAM)
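# Adding 12 converts PM hours to 24-hour time; this works here because no time is 12:xx PM
hourPM2 <- hourPM+12
PM3 <- paste(hourPM2,minutePM,sep=':')
AM3 <- paste(hourAM,minuteAM,sep=':')
# as.integer() turned "00" minutes into 0, so restore the two digits
PM3 <- gsub(':0',':00',PM3)
AM3 <- gsub(':0',':00',AM3)
# All AM hours in this data are single digit, so pad them with a leading zero
AM4 <- paste('0',AM3,sep='')
# Append the seconds to get HH:MM:SS
PM4 <- paste(PM3,':00',sep='')
AM5 <- paste(AM4,':00',sep='')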
Time <- as.character(Time)
Time[pm] <- PM4
Time[am] <- AM5
Time
## [1] "09:00:00" "09:00:00" "05:30:00" "09:00:00" "17:00:00" "16:30:00"
## [7] "14:00:00" "09:00:00" "09:00:00" "05:30:00" "09:00:00" "17:30:00"
## [13] "15:00:00" "14:00:00" "09:30:00" "09:00:00" "05:30:00" "09:00:00"
## [19] "17:00:00" "16:30:00" "14:00:00" "14:00:00"
Time2 <- hms(Time)
Time2
## [1] "9H 0M 0S" "9H 0M 0S" "5H 30M 0S" "9H 0M 0S" "17H 0M 0S"
## [6] "16H 30M 0S" "14H 0M 0S" "9H 0M 0S" "9H 0M 0S" "5H 30M 0S"
## [11] "9H 0M 0S" "17H 30M 0S" "15H 0M 0S" "14H 0M 0S" "9H 30M 0S"
## [16] "9H 0M 0S" "5H 30M 0S" "9H 0M 0S" "17H 0M 0S" "16H 30M 0S"
## [21] "14H 0M 0S" "14H 0M 0S"
DateHMS <- paste(Date2,Time2,sep=' ')
DateHMS
## [1] "2021-01-04 9H 0M 0S" "2021-01-05 9H 0M 0S" "2021-01-06 5H 30M 0S"
## [4] "2021-01-07 9H 0M 0S" "2021-01-08 17H 0M 0S" "2021-01-09 16H 30M 0S"
## [7] "2021-01-10 14H 0M 0S" "2021-01-11 9H 0M 0S" "2021-01-12 9H 0M 0S"
## [10] "2021-01-13 5H 30M 0S" "2021-01-14 9H 0M 0S" "2021-01-15 17H 30M 0S"
## [13] "2021-01-16 15H 0M 0S" "2021-01-17 14H 0M 0S" "2021-01-18 9H 30M 0S"
## [16] "2021-01-19 9H 0M 0S" "2021-01-20 5H 30M 0S" "2021-01-21 9H 0M 0S"
## [19] "2021-01-22 17H 0M 0S" "2021-01-23 16H 30M 0S" "2021-01-24 14H 0M 0S"
## [22] "2021-01-25 14H 0M 0S"
Date3 <- paste(Date,Time,sep=' ') # a Date object and character vector
Date3 # a character vector
## [1] "2021-01-04 09:00:00" "2021-01-05 09:00:00" "2021-01-06 05:30:00"
## [4] "2021-01-07 09:00:00" "2021-01-08 17:00:00" "2021-01-09 16:30:00"
## [7] "2021-01-10 14:00:00" "2021-01-11 09:00:00" "2021-01-12 09:00:00"
## [10] "2021-01-13 05:30:00" "2021-01-14 09:00:00" "2021-01-15 17:30:00"
## [13] "2021-01-16 15:00:00" "2021-01-17 14:00:00" "2021-01-18 09:30:00"
## [16] "2021-01-19 09:00:00" "2021-01-20 05:30:00" "2021-01-21 09:00:00"
## [19] "2021-01-22 17:00:00" "2021-01-23 16:30:00" "2021-01-24 14:00:00"
## [22] "2021-01-25 14:00:00"
# tz= "US/Pacific"
DateTZ <- ymd_hms(Date3,tz="US/Pacific")
tz(DateTZ)
## [1] "US/Pacific"
DateTZ
## [1] "2021-01-04 09:00:00 PST" "2021-01-05 09:00:00 PST"
## [3] "2021-01-06 05:30:00 PST" "2021-01-07 09:00:00 PST"
## [5] "2021-01-08 17:00:00 PST" "2021-01-09 16:30:00 PST"
## [7] "2021-01-10 14:00:00 PST" "2021-01-11 09:00:00 PST"
## [9] "2021-01-12 09:00:00 PST" "2021-01-13 05:30:00 PST"
## [11] "2021-01-14 09:00:00 PST" "2021-01-15 17:30:00 PST"
## [13] "2021-01-16 15:00:00 PST" "2021-01-17 14:00:00 PST"
## [15] "2021-01-18 09:30:00 PST" "2021-01-19 09:00:00 PST"
## [17] "2021-01-20 05:30:00 PST" "2021-01-21 09:00:00 PST"
## [19] "2021-01-22 17:00:00 PST" "2021-01-23 16:30:00 PST"
## [21] "2021-01-24 14:00:00 PST" "2021-01-25 14:00:00 PST"
data$Date_PacificTime <- DateTZ
str(data)
## 'data.frame': 22 obs. of 44 variables:
## $ weekDay.Date : Factor w/ 7 levels "Fri","Mon","Sat",..: 2 6 7 5 1 3 4 2 6 7 ...
## $ Day : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Date : chr "1-4-2021" "1-5-2021" "1-6-2021" "1-7-2021" ...
## $ time : Factor w/ 8 levels "2:00 PM","3:00 PM",..: 7 7 5 7 4 3 1 7 7 5 ...
## $ weatherAtWorkoutTime : int 51 48 48 53 63 73 70 60 53 43 ...
## $ minutesOfCardioKickBoxing : int 20 20 0 30 30 25 0 20 20 0 ...
## $ lipocavitation : int 1 0 1 0 0 1 0 1 0 1 ...
## $ sumLipocavitationTreatments : int 1 1 2 2 2 3 3 4 4 5 ...
## $ timeMeasurementsTaken : Factor w/ 19 levels "11:30 AM","11:45 AM",..: 7 4 10 3 17 12 14 10 1 9 ...
## $ weightScaleMeasurement : num 142 141 142 143 144 ...
## $ weightChangeFromDayPrior : num 0 -1 1 1 1 3.6 -2.2 -3.6 4.8 -2.8 ...
## $ calories_consumed_dayPrior : num 1826 1826 1884 2508 1812 ...
## $ waistlineMeasurement.BellyButton: num 33.5 33 33.2 33 33 ...
## $ armMeasurementInches.R : num 12 11.8 12 12 12 ...
## $ armMeasurementInches.L : num 12.2 11.5 12 12 12 ...
## $ thighMeasurementInches.R : num 22.6 23.5 23.2 23.8 24 ...
## $ thighMeasurementInches.L : num 22.6 23.5 23.2 23.8 24 ...
## $ absFat.MM.R : int 30 22 20 22 24 22 22 22 20 20 ...
## $ absFat.MM.L : int 30 22 20 22 24 22 22 22 20 22 ...
## $ tricepsFat.MM.R : int 26 22 20 20 22 22 22 22 22 20 ...
## $ tricepsFat.MM.L : int 28 22 20 20 24 22 22 22 22 20 ...
## $ innerThighFat.MM.R : int 28 22 16 22 16 16 16 14 12 12 ...
## $ innerThighFat.MM.L : int 28 20 16 22 16 20 20 14 12 12 ...
## $ dailyCalories : num 1826 1884 2508 1812 1963 ...
## $ fat_gram : num 41.5 61.2 72 59 73.8 ...
## $ saturatedFat_gram : num 11.5 31.2 38.5 16 18.8 ...
## $ protein_gram : num 42 63 82 62 67.2 ...
## $ carbs_grams : num 172 236 371 280 292 ...
## $ fiber_grams_from_carbs : num 29.5 32 45 49 46 16 47 43 45 55 ...
## $ sodiumDailyIntake : num 1526 3729 5349 1673 2852 ...
## $ coffee_cups : int 2 2 2 2 2 2 2 2 2 2 ...
## $ morning_BM : int 1 1 1 1 1 1 1 1 1 1 ...
## $ MenstruationDay : int 0 0 0 0 0 0 0 0 0 0 ...
## $ weightLiftingIncrease_lbs : int 0 5 0 10 0 0 0 50 45 0 ...
## $ weightLiftingDecrease_lbs : int 0 0 0 0 0 -10 0 0 0 0 ...
## $ waistTrimmer : int 32 32 32 32 32 32 32 32 32 32 ...
## $ compressionSocks : int 0 0 0 0 0 0 0 0 0 0 ...
## $ HoursOfSleep : num 7 8.5 7.5 10 7.5 9.5 9 9.5 8.5 8 ...
## $ glutenFree : int 0 0 0 0 0 0 0 0 0 0 ...
## $ alcoholFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ processedSweetsFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ butterAddedFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ meatFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ Date_PacificTime : POSIXct, format: "2021-01-04 09:00:00" "2021-01-05 09:00:00" ...
data$weekDay <- wday(data$Date_PacificTime, label=T)
data$weekDay
## [1] Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri Sat Sun Mon Tue Wed Thu Fri
## [20] Sat Sun Mon
## Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat
write.csv(data,'ML_21daysResearch.csv',row.names=F)
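As an alternative sketch, not the method used above, base R can parse the same kind of date and time strings in one step with a format string. The sample values below are copied from rows of the Date and time columns. `%I` and `%p` read the 12-hour clock and the AM/PM marker, so times such as 12:30 PM or 10:00 AM need no manual padding or adding of 12. Matching of `%p` depends on the locale, so an English or C locale is assumed.

```r
# Sample values in the same form as the Date and time columns.
date_chr <- c("1/4/2021", "1/8/2021", "1/10/2021")
time_chr <- c("9:00 AM", "5:00 PM", "2:00 PM")

# %m/%d/%Y = month/day/4-digit year; %I:%M %p = 12-hour time with AM/PM.
stamp <- as.POSIXct(paste(date_chr, time_chr),
                    format = "%m/%d/%Y %I:%M %p",
                    tz = "US/Pacific")

format(stamp, "%Y-%m-%d %H:%M:%S")
```

A value without a colon, such as the "1230 PM" entry in timeMeasurementsTaken, would still come back as NA with this format and would need fixing first.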
We have our data; now let’s find out if the lipocavitation, the cumulative number of lipocavitation treatments, the waist trimmer size, the calories consumed the day before, the hours of sleep the night before, and the number of minutes of cardio the day before have any effect on the waistline measurement and then on the weight measurement.
# dplyr::lag() shifts the values down one row, leaving NA in row 1
data$minutesCardioDayPrior <- dplyr::lag(data$minutesOfCardioKickBoxing)
data$minutesCardioDayPrior[1] <- 0
data$minutesCardioDayPrior
## [1] 0 20 20 0 30 30 25 0 20 20 0 30 30 25 0 30 28 0 30 30 30 0
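For readers without dplyr, the same one-day shift can be sketched in base R. The `minutes` vector below is the first week of minutesOfCardioKickBoxing from the table above:

```r
# First week of minutesOfCardioKickBoxing (days 1-7).
minutes <- c(20, 20, 0, 30, 30, 25, 0)

# Shift by one day: drop the last value and put 0 in front,
# because day 1 has no prior day in the data.
minutes_day_prior <- c(0, head(minutes, -1))
minutes_day_prior
```

This matters because base R's stats::lag() is meant for time series objects and does not shift the values of a plain vector; the lag() call in this document resolves to the dplyr version because dplyr was loaded and masks it.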
Let’s use a simple linear model for those features, with the waist measurement as the outcome variable.
linearModel_waistline <- lm(waistlineMeasurement.BellyButton ~ lipocavitation + sumLipocavitationTreatments + waistTrimmer + calories_consumed_dayPrior + HoursOfSleep + minutesCardioDayPrior, data=data)
Using linear regression to find the importance of the selected features for waistline measurements, we have the summary below. If the p-value in the last column is less than 0.05, the feature is significant in having some effect on the waistline measurement at the 5% significance level (95% confidence); if it is greater than 0.05, the feature is not significant as a body effect on the waistline measurement.
summary(linearModel_waistline)
##
## Call:
## lm(formula = waistlineMeasurement.BellyButton ~ lipocavitation +
## sumLipocavitationTreatments + waistTrimmer + calories_consumed_dayPrior +
## HoursOfSleep + minutesCardioDayPrior, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.8331 -0.2636 -0.1060 0.3380 0.7787
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 46.5204008 14.4879253 3.211 0.00583 **
## lipocavitation 0.0425253 0.2250250 0.189 0.85264
## sumLipocavitationTreatments -0.2049618 0.0888350 -2.307 0.03572 *
## waistTrimmer -0.4573360 0.4443079 -1.029 0.31964
## calories_consumed_dayPrior 0.0009208 0.0002705 3.404 0.00393 **
## HoursOfSleep -0.0396179 0.0937979 -0.422 0.67874
## minutesCardioDayPrior 0.0029561 0.0090072 0.328 0.74730
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5015 on 15 degrees of freedom
## Multiple R-squared: 0.5593, Adjusted R-squared: 0.383
## F-statistic: 3.173 on 6 and 15 DF, p-value: 0.03251
The above summary shows that the model explains only about 56% of the variance in the waistline measurement (multiple R-squared of 0.5593, adjusted R-squared of 0.383). The features with a p-value less than our significance level of 0.05 are the calories consumed the day before and the sum of the lipocavitation treatments; these seem to be the only significant features for predicting the waistline measurement.
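Reading significant terms off the printed table can also be done in code. Below is a minimal sketch with a hypothetical helper, `significant_terms`, shown on R's built-in mtcars data as a stand-in so the block runs on its own; it is not run on the research data here.

```r
# Hypothetical helper: list the terms of a fitted lm whose p-value is below alpha.
significant_terms <- function(model, alpha = 0.05) {
  # coef(summary(model)) is the coefficient table printed by summary()
  coefs <- as.data.frame(coef(summary(model)))
  names(coefs) <- c("estimate", "std_error", "t_value", "p_value")
  # Drop the intercept, which is not a feature
  coefs <- coefs[rownames(coefs) != "(Intercept)", , drop = FALSE]
  coefs[coefs$p_value < alpha, , drop = FALSE]
}

# Demonstrated on built-in data, not on the research data.
demo_model <- lm(mpg ~ wt + hp + qsec, data = mtcars)
significant_terms(demo_model)
```

Called as `significant_terms(linearModel_waistline)`, it should return the rows of the summary above with Pr(>|t|) below 0.05. With only 22 observations and 6 features, such p-values are best read as hints rather than firm conclusions.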
Now lets do the same thing but on weight measurements.
linearModel_weightScale <- lm(weightScaleMeasurement ~ lipocavitation + sumLipocavitationTreatments + waistTrimmer + calories_consumed_dayPrior + HoursOfSleep + minutesCardioDayPrior, data=data)
summary(linearModel_weightScale)
##
## Call:
## lm(formula = weightScaleMeasurement ~ lipocavitation + sumLipocavitationTreatments +
## waistTrimmer + calories_consumed_dayPrior + HoursOfSleep +
## minutesCardioDayPrior, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.49942 -1.19813 0.05194 1.26935 2.77877
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.296e+02 5.134e+01 2.523 0.0234 *
## lipocavitation -6.173e-01 7.975e-01 -0.774 0.4510
## sumLipocavitationTreatments 2.792e-01 3.148e-01 0.887 0.3891
## waistTrimmer 1.911e-01 1.575e+00 0.121 0.9050
## calories_consumed_dayPrior 1.621e-03 9.588e-04 1.691 0.1115
## HoursOfSleep 3.483e-01 3.324e-01 1.048 0.3113
## minutesCardioDayPrior 8.110e-02 3.192e-02 2.541 0.0226 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.777 on 15 degrees of freedom
## Multiple R-squared: 0.4478, Adjusted R-squared: 0.2269
## F-statistic: 2.027 on 6 and 15 DF, p-value: 0.1251
When using those features to explain the weight measurement, the only significant feature is the minutes of cardio the day before. The R-squared shows this model explains only about 45% of the variation in weight, and the overall F-test (p-value 0.1251) is not significant at the 5% level.
We can also split the 22 observations into training and testing sets with an 80-20 split and see how well the models above predict data they were not trained on.
set.seed(5678)
numberRandom <- sample(1:nrow(data), .8*nrow(data))
training <- data[numberRandom,]
testing <- data[-numberRandom,]
linearModel_waistline <- lm(waistlineMeasurement.BellyButton ~ lipocavitation + sumLipocavitationTreatments + waistTrimmer + calories_consumed_dayPrior + HoursOfSleep + minutesCardioDayPrior, data=training)
lm_waistline_prediction <- predict(linearModel_waistline, testing)
summary(linearModel_waistline)
##
## Call:
## lm(formula = waistlineMeasurement.BellyButton ~ lipocavitation +
## sumLipocavitationTreatments + waistTrimmer + calories_consumed_dayPrior +
## HoursOfSleep + minutesCardioDayPrior, data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.61875 -0.19334 -0.01089 0.11786 0.64397
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 40.3517670 15.0753868 2.677 0.02323 *
## lipocavitation 0.1688081 0.2247493 0.751 0.46990
## sumLipocavitationTreatments -0.1606300 0.0994616 -1.615 0.13739
## waistTrimmer -0.3063333 0.4528434 -0.676 0.51409
## calories_consumed_dayPrior 0.0008287 0.0002424 3.418 0.00656 **
## HoursOfSleep 0.0887223 0.1010420 0.878 0.40051
## minutesCardioDayPrior 0.0205507 0.0103761 1.981 0.07580 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.426 on 10 degrees of freedom
## Multiple R-squared: 0.7306, Adjusted R-squared: 0.5689
## F-statistic: 4.519 on 6 and 10 DF, p-value: 0.01798
The R-squared value is greater than 70% (adjusted R-squared 0.57), which indicates the model fits the training data reasonably well. R-squared is measured on the training set, so the real check of prediction is the held-out data below.
The following are the predictions for the held-out 20% of the data, or five of the 22 total observations. The model was trained on the other 80% of the data, or 17 observations.
lm_waistline_prediction
## 1 2 6 11 18
## 32.69188 33.06717 33.32206 32.59328 32.32900
The indices above belong to the hold-out set that the model did not see during training. In our original data these are rows 1, 2, 6, 11, and 18. Let's see what they were in reality.
data$waistlineMeasurement.BellyButton[c(1,2,6,11,18)]
## [1] 33.5 33.0 32.0 33.5 32.5
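As a quick check on the direction of each miss, the sketch below lines up the predicted and actual waistline values printed above and labels each row as under- or over-predicted. The numbers are copied from the output above. The names `holdout` and `direction` are only for illustration.

```r
# Values copied from the printed output above (rows 1, 2, 6, 11 and 18).
predicted <- c(32.69188, 33.06717, 33.32206, 32.59328, 32.32900)
actual    <- c(33.5, 33.0, 32.0, 33.5, 32.5)

holdout <- data.frame(row = c(1, 2, 6, 11, 18), predicted, actual)
holdout$error <- holdout$actual - holdout$predicted   # inches

# A positive error means the model predicted too low (under-predicted).
holdout$direction <- ifelse(holdout$error > 0, "under", "over")
print(holdout)
print(table(holdout$direction))
```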
The predicted values aren't too far off from the actual values. Three of the five waistline values were underestimated (rows 1, 11, and 18) and two were overestimated (rows 2 and 6), with row 6 off the most at about 1.3 inches. Now, let's see how the weight scale measurements are predicted.
linearModel_weightScale <- lm(weightScaleMeasurement ~ lipocavitation + sumLipocavitationTreatments + waistTrimmer + calories_consumed_dayPrior + HoursOfSleep + minutesCardioDayPrior, data=training)
lm_weight_prediction <- predict(linearModel_weightScale,testing)
summary(linearModel_weightScale)
##
## Call:
## lm(formula = weightScaleMeasurement ~ lipocavitation + sumLipocavitationTreatments +
## waistTrimmer + calories_consumed_dayPrior + HoursOfSleep +
## minutesCardioDayPrior, data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.1542 -0.7386 -0.3815 0.4790 1.9688
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.617e+02 5.190e+01 3.115 0.0110 *
## lipocavitation -1.067e+00 7.737e-01 -1.379 0.1980
## sumLipocavitationTreatments 5.018e-02 3.424e-01 0.147 0.8864
## waistTrimmer -7.738e-01 1.559e+00 -0.496 0.6304
## calories_consumed_dayPrior 1.564e-03 8.345e-04 1.875 0.0903 .
## HoursOfSleep 2.441e-01 3.478e-01 0.702 0.4988
## minutesCardioDayPrior 1.005e-01 3.572e-02 2.814 0.0183 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.467 on 10 degrees of freedom
## Multiple R-squared: 0.5959, Adjusted R-squared: 0.3534
## F-statistic: 2.458 on 6 and 10 DF, p-value: 0.1003
This model for predicting the weight scale measurement doesn't fit as well as the model for predicting the waistline measurement, using the same features and the same training samples to predict on the same testing samples. The minutes of cardio the day prior seems to be the only feature significant in predicting the weight the next day. Let's see the actual values for the weight measurements compared to the values predicted by our model.
lm_weight_prediction
## 1 2 6 11 18
## 140.4660 143.9095 144.4058 143.1940 143.8742
data$weightScaleMeasurement[c(1,2,6,11,18)]
## [1] 142.0 141.0 147.6 145.4 146.4
We can see that our weight prediction linear model underestimated all of the values except the 2nd, which was overestimated. This suggests there is information left out of the model that could explain more of the weight scale measurements. The same goes for the waistline measurements.
We will get to that next. Let's look at how glutenFree, weightLiftingIncrease_lbs, morning_BM, coffee_cups, sodiumDailyIntake, fat_gram, saturatedFat_gram, protein_gram, fiber_grams_from_carbs, and minutesCardioDayPrior affect weight and waistline measures.
weight_linearModel2 <- lm(weightScaleMeasurement ~ glutenFree + weightLiftingIncrease_lbs + morning_BM + coffee_cups + sodiumDailyIntake + fat_gram + saturatedFat_gram +protein_gram+fiber_grams_from_carbs+minutesCardioDayPrior, data=training)
weight_linearModel2_prediction <- predict(weight_linearModel2, testing)
weight_linearModel2_prediction
## 1 2 6 11 18
## 142.2778 143.7087 146.1878 141.7773 143.2913
data$weightScaleMeasurement[c(1,2,6,11,18)]
## [1] 142.0 141.0 147.6 145.4 146.4
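To compare the two weight models on the hold-out rows with one number each, here is a small sketch that computes the mean absolute error (MAE) from the values printed above. The helper name `mae` and the labels `first_model` and `diet_model` are hypothetical and used only for this illustration.

```r
# Actual and predicted weights copied from the printed output above.
actual  <- c(142.0, 141.0, 147.6, 145.4, 146.4)
pred_m1 <- c(140.4660, 143.9095, 144.4058, 143.1940, 143.8742)  # linearModel_weightScale
pred_m2 <- c(142.2778, 143.7087, 146.1878, 141.7773, 143.2913)  # weight_linearModel2

# Mean absolute error: the average size of the miss, in lbs.
mae <- function(actual, predicted) mean(abs(actual - predicted))

holdout_mae <- c(first_model = mae(actual, pred_m1),
                 diet_model  = mae(actual, pred_m2))
print(round(holdout_mae, 2))
```

With only five held-out rows these averages are rough, but they come out at about 2.5 lbs and 2.2 lbs, so the two models miss by a similar amount on data they have not seen.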
Let's see the summary of this newest model, which uses mostly diet and metabolism features.
summary(weight_linearModel2)
##
## Call:
## lm(formula = weightScaleMeasurement ~ glutenFree + weightLiftingIncrease_lbs +
## morning_BM + coffee_cups + sodiumDailyIntake + fat_gram +
## saturatedFat_gram + protein_gram + fiber_grams_from_carbs +
## minutesCardioDayPrior, data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.99860 -0.26384 -0.09609 0.30466 0.96370
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.436e+02 3.462e+00 41.476 1.31e-08 ***
## glutenFree 3.878e-01 7.201e-01 0.539 0.6096
## weightLiftingIncrease_lbs 3.170e-02 2.099e-02 1.510 0.1817
## morning_BM 1.388e+00 8.537e-01 1.626 0.1550
## coffee_cups -1.117e+00 1.131e+00 -0.988 0.3615
## sodiumDailyIntake -4.413e-04 3.335e-04 -1.323 0.2340
## fat_gram 3.661e-02 2.476e-02 1.479 0.1896
## saturatedFat_gram -6.402e-02 4.737e-02 -1.352 0.2253
## protein_gram 5.567e-02 1.688e-02 3.297 0.0165 *
## fiber_grams_from_carbs -9.833e-02 3.895e-02 -2.524 0.0450 *
## minutesCardioDayPrior 9.313e-02 2.983e-02 3.122 0.0205 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9601 on 6 degrees of freedom
## Multiple R-squared: 0.8961, Adjusted R-squared: 0.7229
## F-statistic: 5.174 on 10 and 6 DF, p-value: 0.02847
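The adjusted R-squared reported in each summary can be reproduced from the multiple R-squared, the number of rows, and the number of features. The sketch below uses the standard formula with the values printed in this section. The function name `adjusted_r2` is hypothetical.

```r
# Adjusted R-squared from R-squared (r2), rows (n) and number of features (p).
adjusted_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# R-squared values copied from the summaries in this section.
adj <- c(
  waist_all_data = adjusted_r2(0.5593, n = 22, p = 6),
  waist_training = adjusted_r2(0.7306, n = 17, p = 6),
  weight_diet    = adjusted_r2(0.8961, n = 17, p = 10)
)
print(round(adj, 3))
```

The more features a model has relative to its rows, the larger the gap between the two numbers, which is why the 10-feature model drops the most.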
In this model the R-squared is much higher at 0.896 (89.6%), although with 10 features and only 17 training observations the adjusted R-squared of 0.72 is the fairer number, and a fit this close can partly reflect overfitting. We now have three features that are significant in determining the weight: the protein consumed, the fiber consumed, and the minutes of cardio the day prior. Let's see if we can predict weightChangeFromDayPrior using the nutrition features.
weightChangeLinearModel <- lm(weightChangeFromDayPrior ~ fiber_grams_from_carbs + minutesCardioDayPrior + protein_gram + saturatedFat_gram + fat_gram + sodiumDailyIntake + dailyCalories, data=training)
weightPrediction2 <- predict(weightChangeLinearModel, testing)
summary(weightChangeLinearModel)
##
## Call:
## lm(formula = weightChangeFromDayPrior ~ fiber_grams_from_carbs +
## minutesCardioDayPrior + protein_gram + saturatedFat_gram +
## fat_gram + sodiumDailyIntake + dailyCalories, data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.0255 -0.6926 -0.3920 1.5309 3.1476
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4.190061 3.976294 -1.054 0.319
## fiber_grams_from_carbs -0.054749 0.067749 -0.808 0.440
## minutesCardioDayPrior 0.013561 0.059849 0.227 0.826
## protein_gram 0.008969 0.038380 0.234 0.820
## saturatedFat_gram -0.003954 0.090867 -0.044 0.966
## fat_gram -0.017487 0.062647 -0.279 0.786
## sodiumDailyIntake -0.001129 0.001039 -1.086 0.306
## dailyCalories 0.004841 0.004565 1.060 0.317
##
## Residual standard error: 2.437 on 9 degrees of freedom
## Multiple R-squared: 0.2603, Adjusted R-squared: -0.315
## F-statistic: 0.4524 on 7 and 9 DF, p-value: 0.8459
The model above is terrible: the adjusted R-squared is negative and no feature is significant. It would be predicting the change in weight today from what is eaten today, which doesn't make sense. We need to predict the weight the next day, so let's add that feature.
data$nextDayWeight <- lead(data$weightScaleMeasurement, 1)
data$nextDayWeight[22] <- 143.8 #taken at 6 pm Tuesday 1/26/2021
data$nextDayWeight
## [1] 141.0 142.0 143.0 144.0 147.6 145.4 141.8 146.6 143.8 145.4 145.4 143.8
## [13] 143.8 143.8 146.4 147.6 146.4 147.2 144.2 143.8 141.6 143.8
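For anyone without dplyr loaded, the shift that `lead()` and `lag()` perform can be sketched in base R. The helpers `lead1` and `lag1` are hypothetical names that only handle a shift of one position, and the six weights are the first six daily values from this data.

```r
# Base R stand-ins for a one-step lead and lag (hypothetical helper names).
lead1 <- function(x) c(x[-1], NA)          # value from the following row
lag1  <- function(x) c(NA, x[-length(x)])  # value from the previous row

weight <- c(142, 141, 142, 143, 144, 147.6)  # first six daily weights

shifted <- data.frame(weight,
                      nextDay  = lead1(weight),
                      dayPrior = lag1(weight))
print(shifted)
```

The last row of a lead and the first row of a lag are always NA, which is why the code in this section fills in row 22 and row 1 by hand.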
training <- data[numberRandom,]
testing <- data[-numberRandom,]
nextDayWeight_LinearModel <- lm(nextDayWeight ~ fiber_grams_from_carbs + minutesCardioDayPrior + protein_gram + saturatedFat_gram + fat_gram + sodiumDailyIntake + dailyCalories, data=training)
nextDayWeightPrediction <- predict(nextDayWeight_LinearModel,testing)
summary(nextDayWeight_LinearModel)
##
## Call:
## lm(formula = nextDayWeight ~ fiber_grams_from_carbs + minutesCardioDayPrior +
## protein_gram + saturatedFat_gram + fat_gram + sodiumDailyIntake +
## dailyCalories, data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.11917 -0.66491 -0.09671 0.73508 2.19654
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.445e+02 2.648e+00 54.577 1.17e-12 ***
## fiber_grams_from_carbs -6.596e-03 4.511e-02 -0.146 0.8870
## minutesCardioDayPrior -3.907e-02 3.985e-02 -0.980 0.3525
## protein_gram 3.070e-02 2.556e-02 1.201 0.2602
## saturatedFat_gram -1.489e-01 6.050e-02 -2.460 0.0361 *
## fat_gram 1.079e-01 4.171e-02 2.586 0.0294 *
## sodiumDailyIntake 8.955e-04 6.920e-04 1.294 0.2279
## dailyCalories -3.573e-03 3.040e-03 -1.176 0.2699
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.623 on 9 degrees of freedom
## Multiple R-squared: 0.5516, Adjusted R-squared: 0.2029
## F-statistic: 1.582 on 7 and 9 DF, p-value: 0.2556
The above model is a little better, with an R-squared of 55%, although the adjusted R-squared is only 0.20 and the overall F-test (p-value 0.2556) is not significant. The saturated fat and total fat appear to be significant for the next-day weight, with p-values below 0.05. Since this research had no limits on calories consumed or on the mix of fat, protein, and carbohydrate in those calories, and since the weight training builds muscle, which is denser than fat, the weight scale alone doesn't really quantify the body effects of exercise and this particular diet of no meat, alcohol, gluten, or processed sweets. The waist measurements could show some body effects from the diet and exercise, and possibly from the prior weight. But lipocavitation is claimed to reduce fat along the areas treated, and therefore to reduce the daily measurements around the midsection, arms, and thighs that were treated 3 times a week. We should control for the other features in these tests to see if lipocavitation does have the body effects on the measurements that consumer-grade lipocavitation products claim. The product used here was a 5-in-1 lipocavitation machine bought on Amazon.
There are other ways of finding the features associated with the best prediction of weight, waistline, or other body measurements. Tree-based approaches include the decision tree, which splits the data from root to branches and usually needs tuning and pruning; the random forest, an ensemble of many decision trees, each grown on a bootstrap sample of the data, whose predictions are averaged; and gradient boosted models, which add trees one after another to correct the errors of the earlier ones. Since the outcomes here (weight and waistline) are continuous, linear regression is a natural fit, whereas naive Bayes is a classifier and would need the outcome binned into categories. Trees are not limited to factor outcomes: with a numeric outcome they perform regression, and with a factor outcome they perform classification. That is why random forest can be used for regression here.
Let's look at the features that showed some significance earlier for weightScaleMeasurement, nextDayWeight, or waistlineMeasurement.BellyButton: sumLipocavitationTreatments, minutesOfCardioKickBoxing, fat_gram, calories_consumed_dayPrior, and saturatedFat_gram. We will also add compressionSocks, morning_BM, and waistTrimmer, and make a new variable called lipocavitationPrior that flags whether lipocavitation was done the day before, to test its significance on nextDayWeight. We will also make a new variable, nextDayWaistLineMeasurement.
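To show that trees can predict a numeric outcome, here is a rough sketch of bagging regression trees by hand. It assumes the rpart package (a recommended package that normally ships with R) and uses the built-in mtcars data rather than this research data. It is a simplified illustration of the ensemble idea and not the randomForest algorithm itself, which also tries only a random subset of features at each split.

```r
library(rpart)   # recommended package that normally ships with R

set.seed(1)
n_trees <- 25

# Grow each regression tree on a bootstrap sample (rows drawn with replacement).
trees <- lapply(seq_len(n_trees), function(i) {
  rows <- sample(nrow(mtcars), replace = TRUE)
  rpart(mpg ~ wt + hp + disp, data = mtcars[rows, ], method = "anova")
})

# Each column holds one tree's numeric predictions; the ensemble averages them.
per_tree <- sapply(trees, predict, newdata = mtcars)
bagged   <- rowMeans(per_tree)

print(head(round(cbind(actual = mtcars$mpg, bagged = bagged), 1)))
```

The outcome `mpg` is numeric throughout, so no factor conversion is needed for a tree ensemble to do regression.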
data$nextDayWaistLineMeasurement <- lead(data$waistlineMeasurement.BellyButton,1)
data$nextDayWaistLineMeasurement[22] <- (31.5+31)/2 #NA in other data avg wed + mon
data$lipocavitationPrior <- lag(data$lipocavitation,1)
data$lipocavitationPrior[1] <- 0
training <- data[numberRandom,]
testing <- data[-numberRandom,]
nextDayWeightModel2 <- lm(nextDayWeight ~ sumLipocavitationTreatments+ minutesOfCardioKickBoxing+ fat_gram+calories_consumed_dayPrior+saturatedFat_gram +compressionSocks+morning_BM+waistTrimmer+lipocavitationPrior, data=training)
predNextDayWeight2 <- predict(nextDayWeightModel2,testing)
predNextDayWeight2
## 1 2 6 11 18
## 143.6054 142.7209 143.8611 145.4673 144.3206
data$nextDayWeight[c(1,2,6,11,18)]
## [1] 141.0 142.0 145.4 145.4 147.2
summary(nextDayWeightModel2)
##
## Call:
## lm(formula = nextDayWeight ~ sumLipocavitationTreatments + minutesOfCardioKickBoxing +
## fat_gram + calories_consumed_dayPrior + saturatedFat_gram +
## compressionSocks + morning_BM + waistTrimmer + lipocavitationPrior,
## data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.1268 -0.4092 0.0876 0.5997 2.2572
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.338e+02 6.345e+01 2.109 0.0729 .
## sumLipocavitationTreatments 2.752e-01 4.586e-01 0.600 0.5673
## minutesOfCardioKickBoxing 4.828e-02 3.941e-02 1.225 0.2602
## fat_gram 3.815e-02 3.742e-02 1.020 0.3418
## calories_consumed_dayPrior 2.067e-04 1.209e-03 0.171 0.8691
## saturatedFat_gram -3.475e-02 6.683e-02 -0.520 0.6192
## compressionSocks -1.947e+00 1.793e+00 -1.086 0.3135
## morning_BM -1.058e-01 1.237e+00 -0.086 0.9342
## waistTrimmer 2.219e-01 1.956e+00 0.113 0.9129
## lipocavitationPrior -9.518e-01 9.728e-01 -0.978 0.3604
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.736 on 7 degrees of freedom
## Multiple R-squared: 0.601, Adjusted R-squared: 0.08801
## F-statistic: 1.172 on 9 and 7 DF, p-value: 0.4269
None of these features is significant for determining next-day weight. Let's see how lipocavitation compares with the waist trimmer on waistline measurements.
lipoTrimmerLM <- lm(waistlineMeasurement.BellyButton ~ lipocavitationPrior + lipocavitation + sumLipocavitationTreatments + waistTrimmer, data=data)
summary(lipoTrimmerLM)
##
## Call:
## lm(formula = waistlineMeasurement.BellyButton ~ lipocavitationPrior +
## lipocavitation + sumLipocavitationTreatments + waistTrimmer,
## data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.06377 -0.22702 0.00141 0.18264 0.94641
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.053055 17.222933 2.151 0.0461 *
## lipocavitationPrior -0.190916 0.378198 -0.505 0.6202
## lipocavitation -0.008976 0.382218 -0.023 0.9815
## sumLipocavitationTreatments -0.126340 0.098969 -1.277 0.2189
## waistTrimmer -0.112540 0.530419 -0.212 0.8345
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6236 on 17 degrees of freedom
## Multiple R-squared: 0.2277, Adjusted R-squared: 0.04593
## F-statistic: 1.253 on 4 and 17 DF, p-value: 0.3266
The results using all the data show no significance for either lipocavitation or the waist trimmer in determining the waistline measurement. This isn't good at all, since the sellers of the 5-in-1 lipocavitation machine and the waist trimmer say their products reduce fat, slim, and can make the waist smaller. The waist trimmer does make the waist smaller while wearing it and for a few hours afterwards. Let's try a different model and see how the random forest regression model does with the same variables above.
See ??randomForest or the R documentation page "Classification and Regression with Random Forest".
set.seed(234)
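x <- training[, c("waistlineMeasurement.BellyButton", "lipocavitation", "sumLipocavitationTreatments", "waistTrimmer", "lipocavitationPrior")] # same columns as indices 13, 7, 8, 36, 49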
lipoTrimmer_RF <- randomForest(waistlineMeasurement.BellyButton ~ ., data = x,importance=TRUE,type="regression")
summary(lipoTrimmer_RF)
## Length Class Mode
## call 5 -none- call
## type 1 -none- character
## predicted 17 -none- numeric
## mse 500 -none- numeric
## rsq 500 -none- numeric
## oob.times 17 -none- numeric
## importance 8 -none- numeric
## importanceSD 4 -none- numeric
## localImportance 0 -none- NULL
## proximity 0 -none- NULL
## ntree 1 -none- numeric
## mtry 1 -none- numeric
## forest 11 -none- list
## coefs 0 -none- NULL
## y 17 -none- numeric
## test 0 -none- NULL
## inbag 0 -none- NULL
## terms 3 terms call
importance(lipoTrimmer_RF)
## %IncMSE IncNodePurity
## lipocavitation -3.7947199 0.3203948
## sumLipocavitationTreatments 8.5510193 1.6391874
## waistTrimmer 0.8902947 0.5396289
## lipocavitationPrior 0.5991548 0.3952462
The more important feature has the higher score. Here that is the running sum of lipocavitation treatments as of the day the waistline measurement at the belly button was taken: by both %IncMSE and IncNodePurity it matters more for that day's waistline measurement than having lipocavitation that day, wearing a waist trimmer, or having had lipocavitation the day prior. The negative %IncMSE for lipocavitation means that shuffling that feature did not make the out-of-bag predictions any worse.
From our models above, the variables that were significant for predicting the weight the next day were the fat and saturated fat consumed that day. The minutes of cardio the day before and, oddly, the amount of fiber and protein eaten that day were significant for the weight taken that day. Let's see how minutesOfCardioKickBoxing, dailyCalories, fat_gram, saturatedFat_gram, protein_gram, fiber_grams_from_carbs, and weightLiftingIncrease_lbs can predict nextDayWeight. We will use the linear model first and then the random forest model to see the results side by side.
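The idea behind the %IncMSE column is permutation importance: shuffle one feature, predict again, and see how much the mean squared error grows. The sketch below shows that idea in base R with a linear model on the built-in mtcars data. It is a simplified stand-in, since randomForest works on the out-of-bag rows of each tree and by default scales the result. The names `mse`, `permutation_increase`, and `importance_sketch` are hypothetical.

```r
set.seed(42)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Mean squared error of a model on a data frame d.
mse <- function(model, d) mean((d$mpg - predict(model, newdata = d))^2)
baseline <- mse(fit, mtcars)

# Average increase in MSE after shuffling one feature several times.
permutation_increase <- function(var, n_repeats = 50) {
  increases <- replicate(n_repeats, {
    shuffled <- mtcars
    shuffled[[var]] <- sample(shuffled[[var]])   # break the link to mpg
    mse(fit, shuffled) - baseline
  })
  mean(increases)
}

importance_sketch <- sapply(c("wt", "hp", "qsec"), permutation_increase)
print(round(sort(importance_sketch, decreasing = TRUE), 2))
```

A feature whose shuffling barely changes the error, or even lowers it by chance, gets a score near or below zero, which is how a negative importance can appear.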
cardioDiet_LM <- lm(nextDayWeight ~ minutesOfCardioKickBoxing +
dailyCalories +
fat_gram +
saturatedFat_gram +
protein_gram +
fiber_grams_from_carbs +
weightLiftingIncrease_lbs,
data = training)
cardioDiet_Predict <- predict(cardioDiet_LM,testing)
df <- as.data.frame(cbind(cardioDiet_Predict,testing$nextDayWeight))
colnames(df) <- c('predictedNextDayWeight','actualNextDayWeight')
df$actual_PredictedError <- df$actualNextDayWeight-df$predictedNextDayWeight
df
## predictedNextDayWeight actualNextDayWeight actual_PredictedError
## 1 143.3563 141.0 -2.3563179
## 2 143.4791 142.0 -1.4790507
## 6 142.8486 145.4 2.5514497
## 11 145.8255 145.4 -0.4255159
## 18 146.3583 147.2 0.8417243
The predicted weights weren't off by more than 2.55 lbs, and the closest predicted weight was within 1/2 lb. Let's look at the summary of how the linear model weights these features.
summary(cardioDiet_LM)
##
## Call:
## lm(formula = nextDayWeight ~ minutesOfCardioKickBoxing + dailyCalories +
## fat_gram + saturatedFat_gram + protein_gram + fiber_grams_from_carbs +
## weightLiftingIncrease_lbs, data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.59927 -0.90120 0.07726 0.85424 2.18542
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.406e+02 3.275e+00 42.933 1.01e-11 ***
## minutesOfCardioKickBoxing 4.250e-02 4.278e-02 0.993 0.346
## dailyCalories -1.622e-04 2.297e-03 -0.071 0.945
## fat_gram 5.005e-02 3.340e-02 1.498 0.168
## saturatedFat_gram -6.579e-02 6.033e-02 -1.090 0.304
## protein_gram 1.809e-02 2.992e-02 0.604 0.560
## fiber_grams_from_carbs 3.431e-03 4.974e-02 0.069 0.947
## weightLiftingIncrease_lbs 1.090e-02 3.568e-02 0.306 0.767
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.747 on 9 degrees of freedom
## Multiple R-squared: 0.48, Adjusted R-squared: 0.07554
## F-statistic: 1.187 on 7 and 9 DF, p-value: 0.396
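Each estimate in the summary above is in lbs of next-day weight per one unit of the feature. As a small worked example, the sketch below converts the printed estimates into the amount of each feature that goes with a 1 lb change, and into the predicted change for 30 minutes of cardio. The 30-minute figure is only an illustrative input.

```r
# Estimates copied from summary(cardioDiet_LM) above (lbs per one unit).
estimate <- c(
  minutesOfCardioKickBoxing = 4.250e-02,
  dailyCalories             = -1.622e-04,
  fat_gram                  = 5.005e-02,
  saturatedFat_gram         = -6.579e-02,
  protein_gram              = 1.809e-02,
  fiber_grams_from_carbs    = 3.431e-03,
  weightLiftingIncrease_lbs = 1.090e-02
)

# Units of each feature associated with a 1 lb change in predicted weight.
units_per_lb <- 1 / estimate
print(round(units_per_lb, 1))

# Predicted change in next-day weight for 30 minutes of cardio (illustrative).
cardio_30 <- 30 * estimate[["minutesOfCardioKickBoxing"]]
print(cardio_30)
```

So the model pairs about 23.5 minutes of cardio, not 4.25 minutes, with one extra pound, and none of these estimates is statistically distinguishable from zero.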
In the linear model above, none of the features we selected is significant, as none of the p-values is less than 0.05 at the 5% significance level. Before that, the side-by-side comparison showed the hold-out predictions landing within about 1/2 to 2.6 lbs of the actual next-day weights. Each estimate is the change in next-day weight, in lbs, for a one-unit increase in that feature with the others held fixed. So each extra minute of cardio goes with about 0.04 lb more weight the next day (roughly 1 lb per 24 minutes), each extra calorie with about 0.0002 lb less, each extra gram of fat with about 0.05 lb more, each extra gram of saturated fat with about 0.07 lb less, each extra gram of protein with about 0.02 lb more, each extra gram of fiber with about 0.003 lb more, and each extra lb added in weightlifting with about 0.01 lb more. Several of these directions don't make sense, and with every p-value this large the estimates cannot be distinguished from zero, so the linear model is a rough way of approximating an outcome rather than the best model here. Since none of these features is significant when predicting next-day weight, let's see how they do with the next-day waist measurement before testing with the random forest model, which averages the predictions of many decision trees grown on bootstrap samples of the training data.
cardioDiet_LM2 <- lm(nextDayWaistLineMeasurement ~ minutesOfCardioKickBoxing +
dailyCalories +
fat_gram +
saturatedFat_gram +
protein_gram +
fiber_grams_from_carbs +
weightLiftingIncrease_lbs,
data = training)
cardioDiet_Predict2 <- predict(cardioDiet_LM2,testing)
df2 <- as.data.frame(cbind(cardioDiet_Predict2,testing$nextDayWaistLineMeasurement))
colnames(df2) <- c('predictedNextDayWaistMeasurement','actualNextDayWaistMeasurement')
df2$actual_PredictedError <- df2$actualNextDayWaistMeasurement-df2$predictedNextDayWaistMeasurement
df2
## predictedNextDayWaistMeasurement actualNextDayWaistMeasurement
## 1 32.55407 33.00
## 2 32.03338 33.25
## 6 31.54411 32.00
## 11 33.42217 33.00
## 18 32.99826 33.50
## actual_PredictedError
## 1 0.4459321
## 2 1.2166232
## 6 0.4558903
## 11 -0.4221655
## 18 0.5017395
The error in predicting the next-day waistline measurement ranged from over-predicting by about 2/5" to under-predicting by about 1 1/5". Let's look at the summary, noting that most of the predicted values were within about 1/2" of the actual values.
summary(cardioDiet_LM2)
##
## Call:
## lm(formula = nextDayWaistLineMeasurement ~ minutesOfCardioKickBoxing +
## dailyCalories + fat_gram + saturatedFat_gram + protein_gram +
## fiber_grams_from_carbs + weightLiftingIncrease_lbs, data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.57893 -0.36861 -0.09497 0.43451 0.59401
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 30.7226895 1.0177999 30.185 2.35e-10 ***
## minutesOfCardioKickBoxing -0.0044475 0.0132927 -0.335 0.7456
## dailyCalories 0.0012651 0.0007138 1.772 0.1101
## fat_gram 0.0006183 0.0103788 0.060 0.9538
## saturatedFat_gram -0.0373820 0.0187478 -1.994 0.0773 .
## protein_gram 0.0016228 0.0092990 0.175 0.8653
## fiber_grams_from_carbs -0.0018406 0.0154577 -0.119 0.9078
## weightLiftingIncrease_lbs 0.0206342 0.0110860 1.861 0.0956 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.543 on 9 degrees of freedom
## Multiple R-squared: 0.6772, Adjusted R-squared: 0.4262
## F-statistic: 2.698 on 7 and 9 DF, p-value: 0.0837
We can also see from the linear model above that these features as a set aren't significant in predicting the next-day waistline measurement at the 5% significance level (overall p-value 0.0837). But let's interpret what it says about these features. Each estimate is the change in the next-day waistline, in inches, for a one-unit increase in the feature. The minutes of cardio, daily calories, protein, fiber, and fat each move the waistline by less than 0.005" per unit, while each extra gram of saturated fat goes with about 0.037" less and each extra lb added in weightlifting with about 0.021" more. Neither of those two is significant at the 5% level, though both p-values are below 0.10.
Let's now see what random forest models using the same features as these two linear models say about next-day weight and waistline measurements.
cardioDiet_RF <- randomForest(nextDayWeight ~ minutesOfCardioKickBoxing +
dailyCalories +
fat_gram +
saturatedFat_gram +
protein_gram +
fiber_grams_from_carbs +
weightLiftingIncrease_lbs,
data = training, importance=TRUE,type="regression")
summary(cardioDiet_RF)
## Length Class Mode
## call 5 -none- call
## type 1 -none- character
## predicted 17 -none- numeric
## mse 500 -none- numeric
## rsq 500 -none- numeric
## oob.times 17 -none- numeric
## importance 14 -none- numeric
## importanceSD 7 -none- numeric
## localImportance 0 -none- NULL
## proximity 0 -none- NULL
## ntree 1 -none- numeric
## mtry 1 -none- numeric
## forest 11 -none- list
## coefs 0 -none- NULL
## y 17 -none- numeric
## test 0 -none- NULL
## inbag 0 -none- NULL
## terms 3 terms call
importance(cardioDiet_RF)
## %IncMSE IncNodePurity
## minutesOfCardioKickBoxing 2.7962162 3.817400
## dailyCalories -0.9975809 5.724532
## fat_gram 6.2365289 10.368967
## saturatedFat_gram 0.5513062 5.733871
## protein_gram 1.3094408 8.768735
## fiber_grams_from_carbs -1.3514179 4.921447
## weightLiftingIncrease_lbs -0.6969332 1.517586
From the importance measures above, the most important feature in predicting next-day weight with our random forest model is the fat grams consumed the day before the weight measurement, which ranks first on both %IncMSE and IncNodePurity. By IncNodePurity the next most important is the protein grams consumed, followed by saturated fat, daily calories, fiber, minutes of cardio, and finally the increase in weight lifted the day before the weight measurement. By %IncMSE the order after fat is minutes of cardio, protein, and saturated fat, while daily calories, fiber, and the weightlifting increase have negative scores, meaning shuffling them did not worsen the predictions.
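Because the two importance columns do not always agree, it can help to rank the features by each one side by side. The sketch below does that with the values printed above. The data frame name `imp` and the rank column names are hypothetical.

```r
# Importance values copied from importance(cardioDiet_RF) above.
imp <- data.frame(
  feature = c("minutesOfCardioKickBoxing", "dailyCalories", "fat_gram",
              "saturatedFat_gram", "protein_gram",
              "fiber_grams_from_carbs", "weightLiftingIncrease_lbs"),
  IncMSE = c(2.7962162, -0.9975809, 6.2365289, 0.5513062,
             1.3094408, -1.3514179, -0.6969332),
  IncNodePurity = c(3.817400, 5.724532, 10.368967, 5.733871,
                    8.768735, 4.921447, 1.517586)
)

# Rank 1 is the most important feature under each measure.
imp$rank_IncMSE        <- rank(-imp$IncMSE)
imp$rank_IncNodePurity <- rank(-imp$IncNodePurity)

print(imp[order(imp$rank_IncMSE), ])
```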
cardioDiet_Predict_RF <- predict(cardioDiet_RF,testing)
df3 <- as.data.frame(cbind(cardioDiet_Predict_RF,testing$nextDayWeight))
colnames(df3) <- c('predictedNextDayWeight','actualNextDayWeight')
df3$actual_PredictedError <- df3$actualNextDayWeight-df3$predictedNextDayWeight
df3
## predictedNextDayWeight actualNextDayWeight actual_PredictedError
## 1 143.7762 141.0 -2.7762333
## 2 144.2383 142.0 -2.2383000
## 6 143.5542 145.4 1.8457800
## 11 145.1989 145.4 0.2011133
## 18 145.7889 147.2 1.4110533
The results above show the error of the random forest model as the actual weight minus the predicted weight. The errors range from over-predicting by 2.78 lbs to under-predicting by 1.85 lbs.
Now, we will look at how our features predict the next day waist measurement with our random forest algorithm.
cardioDiet_RF2 <- randomForest(nextDayWaistLineMeasurement ~
minutesOfCardioKickBoxing +
dailyCalories +
fat_gram +
saturatedFat_gram +
protein_gram +
fiber_grams_from_carbs +
weightLiftingIncrease_lbs,
data = training, importance=TRUE,type="regression")
summary(cardioDiet_RF2)
## Length Class Mode
## call 5 -none- call
## type 1 -none- character
## predicted 17 -none- numeric
## mse 500 -none- numeric
## rsq 500 -none- numeric
## oob.times 17 -none- numeric
## importance 14 -none- numeric
## importanceSD 7 -none- numeric
## localImportance 0 -none- NULL
## proximity 0 -none- NULL
## ntree 1 -none- numeric
## mtry 1 -none- numeric
## forest 11 -none- list
## coefs 0 -none- NULL
## y 17 -none- numeric
## test 0 -none- NULL
## inbag 0 -none- NULL
## terms 3 terms call
importance(cardioDiet_RF2)
## %IncMSE IncNodePurity
## minutesOfCardioKickBoxing -0.3418183 0.3021152
## dailyCalories 3.4290305 1.1346708
## fat_gram 1.2520613 1.1028453
## saturatedFat_gram 0.5654202 0.9595589
## protein_gram 6.8376789 1.4658899
## fiber_grams_from_carbs 2.4185185 0.9447107
## weightLiftingIncrease_lbs 3.1181626 0.8624853
cardioDiet_Predict_RF2 <- predict(cardioDiet_RF2,testing)
df4 <- as.data.frame(cbind(cardioDiet_Predict_RF2,
testing$nextDayWaistLineMeasurement))
colnames(df4) <- c('predictedNextDayWaist','actualNextDayWaist')
df4$actual_PredictedError <- df4$actualNextDayWaist-df4$predictedNextDayWaist
df4
## predictedNextDayWaist actualNextDayWaist actual_PredictedError
## 1 32.17534 33.00 0.82465833
## 2 32.45074 33.25 0.79925833
## 6 32.01032 32.00 -0.01031667
## 11 32.99145 33.00 0.00855000
## 18 32.96116 33.50 0.53884167
Above is the data frame of predicted and actual waistline measurements and the error as actual minus predicted. The absolute error ranges from about 1/100" to about 4/5" in predicting the next-day waistline measurement, slightly better than our linear model for the same measurement.
We made the time variables earlier as actual date-time features recognized by R. Now I am going to use Tableau to plot and visualize what these features do to the waistline and weight measurements. Let's write this new data to our csv file for plotting in Tableau, but first let's check what these features are with str().
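To put the linear and random forest models side by side on the same five hold-out rows, the sketch below summarizes the error columns printed in the four data frames above. The list name `errors` and the table name `summary_table` are hypothetical.

```r
# actual - predicted errors copied from df, df3, df2 and df4 above.
errors <- list(
  weight_lm = c(-2.3563179, -1.4790507, 2.5514497, -0.4255159, 0.8417243),
  weight_rf = c(-2.7762333, -2.2383000, 1.8457800, 0.2011133, 1.4110533),
  waist_lm  = c(0.4459321, 1.2166232, 0.4558903, -0.4221655, 0.5017395),
  waist_rf  = c(0.82465833, 0.79925833, -0.01031667, 0.00855000, 0.53884167)
)

# Mean absolute error and largest miss for each model (lbs or inches).
summary_table <- data.frame(
  MAE     = sapply(errors, function(e) mean(abs(e))),
  largest = sapply(errors, function(e) max(abs(e)))
)
print(round(summary_table, 3))
```

On these five rows the random forest has the smaller average miss for the waistline, while the linear model has the slightly smaller average miss for weight, so neither model is clearly ahead with this little data.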
str(data)
## 'data.frame': 22 obs. of 49 variables:
## $ weekDay.Date : Factor w/ 7 levels "Fri","Mon","Sat",..: 2 6 7 5 1 3 4 2 6 7 ...
## $ Day : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Date : chr "1-4-2021" "1-5-2021" "1-6-2021" "1-7-2021" ...
## $ time : Factor w/ 8 levels "2:00 PM","3:00 PM",..: 7 7 5 7 4 3 1 7 7 5 ...
## $ weatherAtWorkoutTime : int 51 48 48 53 63 73 70 60 53 43 ...
## $ minutesOfCardioKickBoxing : int 20 20 0 30 30 25 0 20 20 0 ...
## $ lipocavitation : int 1 0 1 0 0 1 0 1 0 1 ...
## $ sumLipocavitationTreatments : int 1 1 2 2 2 3 3 4 4 5 ...
## $ timeMeasurementsTaken : Factor w/ 19 levels "11:30 AM","11:45 AM",..: 7 4 10 3 17 12 14 10 1 9 ...
## $ weightScaleMeasurement : num 142 141 142 143 144 ...
## $ weightChangeFromDayPrior : num 0 -1 1 1 1 3.6 -2.2 -3.6 4.8 -2.8 ...
## $ calories_consumed_dayPrior : num 1826 1826 1884 2508 1812 ...
## $ waistlineMeasurement.BellyButton: num 33.5 33 33.2 33 33 ...
## $ armMeasurementInches.R : num 12 11.8 12 12 12 ...
## $ armMeasurementInches.L : num 12.2 11.5 12 12 12 ...
## $ thighMeasurementInches.R : num 22.6 23.5 23.2 23.8 24 ...
## $ thighMeasurementInches.L : num 22.6 23.5 23.2 23.8 24 ...
## $ absFat.MM.R : int 30 22 20 22 24 22 22 22 20 20 ...
## $ absFat.MM.L : int 30 22 20 22 24 22 22 22 20 22 ...
## $ tricepsFat.MM.R : int 26 22 20 20 22 22 22 22 22 20 ...
## $ tricepsFat.MM.L : int 28 22 20 20 24 22 22 22 22 20 ...
## $ innerThighFat.MM.R : int 28 22 16 22 16 16 16 14 12 12 ...
## $ innerThighFat.MM.L : int 28 20 16 22 16 20 20 14 12 12 ...
## $ dailyCalories : num 1826 1884 2508 1812 1963 ...
## $ fat_gram : num 41.5 61.2 72 59 73.8 ...
## $ saturatedFat_gram : num 11.5 31.2 38.5 16 18.8 ...
## $ protein_gram : num 42 63 82 62 67.2 ...
## $ carbs_grams : num 172 236 371 280 292 ...
## $ fiber_grams_from_carbs : num 29.5 32 45 49 46 16 47 43 45 55 ...
## $ sodiumDailyIntake : num 1526 3729 5349 1673 2852 ...
## $ coffee_cups : int 2 2 2 2 2 2 2 2 2 2 ...
## $ morning_BM : int 1 1 1 1 1 1 1 1 1 1 ...
## $ MenstruationDay : int 0 0 0 0 0 0 0 0 0 0 ...
## $ weightLiftingIncrease_lbs : int 0 5 0 10 0 0 0 50 45 0 ...
## $ weightLiftingDecrease_lbs : int 0 0 0 0 0 -10 0 0 0 0 ...
## $ waistTrimmer : int 32 32 32 32 32 32 32 32 32 32 ...
## $ compressionSocks : int 0 0 0 0 0 0 0 0 0 0 ...
## $ HoursOfSleep : num 7 8.5 7.5 10 7.5 9.5 9 9.5 8.5 8 ...
## $ glutenFree : int 0 0 0 0 0 0 0 0 0 0 ...
## $ alcoholFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ processedSweetsFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ butterAddedFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ meatFree : int 1 1 1 1 1 1 1 1 1 1 ...
## $ Date_PacificTime : POSIXct, format: "2021-01-04 09:00:00" "2021-01-05 09:00:00" ...
## $ weekDay : Ord.factor w/ 7 levels "Sun"<"Mon"<"Tue"<..: 2 3 4 5 6 7 1 2 3 4 ...
## $ minutesCardioDayPrior : num 0 20 20 0 30 30 25 0 20 20 ...
## $ nextDayWeight : num 141 142 143 144 148 ...
## $ nextDayWaistLineMeasurement : num 33 33.2 33 33 32 ...
## $ lipocavitationPrior : num 0 1 0 1 0 0 1 0 1 0 ...
Great, this looks good. Now we will write it out to a csv file.
write.csv(data, 'ML_21dayResearch_addedFeatures.csv', row.names = FALSE)
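Before loading the file into Tableau, it may be worth checking how the date-time and ordered-factor columns come through the csv. The sketch below is only an illustration: `toy` is a small made-up data frame (not the research data) that borrows two column names from above, and it writes to a temporary file instead of the real output file.

```r
# Toy stand-in for the research data; the values are illustrative only.
toy <- data.frame(
  Date_PacificTime = as.POSIXct(c("2021-01-04 09:00:00", "2021-01-05 09:00:00"),
                                tz = "America/Los_Angeles"),
  weekDay = factor(c("Mon", "Tue"),
                   levels = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"),
                   ordered = TRUE),
  waist = c(33.5, 33.0)
)

# Write to a temporary file so nothing in the project folder is overwritten.
tmp <- tempfile(fileext = ".csv")
write.csv(toy, tmp, row.names = FALSE)

# Reading the file back shows what a downstream tool receives:
# the date-time and the ordered factor both arrive as plain text.
back <- read.csv(tmp, stringsAsFactors = FALSE)
str(back)
```

If the weekday ordering or the date type matters for a plot, it probably has to be set again inside Tableau, since a csv file only carries text and numbers.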
I have a blog post to write that informs clients about the effects of adding lipocavitation to massage for reducing fat and body circumference. In combination with lymphatic drainage this should have a positive effect, since less body weight or pressure surrounding the lymph vessels and veins will lead to better circulation and less edema in the limbs. Let's see how effective lipocavitation and the number of lipocavitation treatments are at reducing fat by looking at one of the treated areas (abs, arms, or legs), using only the lipocavitation variables as predictors.
lipocavitation_RF <- randomForest(innerThighFat.MM.R ~
sumLipocavitationTreatments +
lipocavitationPrior,
data = training, importance=TRUE,type="regression")
## Warning in randomForest.default(m, y, ...): The response has five or fewer
## unique values. Are you sure you want to do regression?
summary(lipocavitation_RF)
## Length Class Mode
## call 5 -none- call
## type 1 -none- character
## predicted 17 -none- numeric
## mse 500 -none- numeric
## rsq 500 -none- numeric
## oob.times 17 -none- numeric
## importance 4 -none- numeric
## importanceSD 2 -none- numeric
## localImportance 0 -none- NULL
## proximity 0 -none- NULL
## ntree 1 -none- numeric
## mtry 1 -none- numeric
## forest 11 -none- list
## coefs 0 -none- NULL
## y 17 -none- numeric
## test 0 -none- NULL
## inbag 0 -none- NULL
## terms 3 terms call
importance(lipocavitation_RF)
## %IncMSE IncNodePurity
## sumLipocavitationTreatments 13.678434 72.06128
## lipocavitationPrior -5.742658 11.67883
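The sum of lipocavitation treatments done before the right inner thigh caliper measurement (in mm) is more important than whether lipocavitation was done the day before the measurement. The negative %IncMSE for lipocavitationPrior means that shuffling that variable did not make the out-of-bag predictions any worse, so in this forest it adds essentially no predictive value.
Let's see these predicted values next to our actual measurements.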
lipocavitation_Predict_RF <- predict(lipocavitation_RF,testing)
df5 <- as.data.frame(cbind(lipocavitation_Predict_RF,
testing$innerThighFat.MM.R))
colnames(df5) <- c('predicted_R_ThighMeasure','actual_R_ThighMeasure')
df5$actual_PredictedError <- df5$actual_R_ThighMeasure-df5$predicted_R_ThighMeasure
df5
## predicted_R_ThighMeasure actual_R_ThighMeasure actual_PredictedError
## 1 15.82586 28 12.1741397
## 2 17.17285 22 4.8271504
## 6 15.44627 16 0.5537267
## 11 12.98615 12 -0.9861457
## 18 13.28555 14 0.7144543
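The row-by-row errors above can be condensed into one or two summary numbers. This is a minimal sketch, assuming the five predicted and actual values are copied by hand from the printed table (so they carry the table's rounding). Mean absolute error and root mean squared error are my choice of summary here and were not part of the original analysis.

```r
# Values copied from the printed df5 table above (right inner thigh, mm).
predicted <- c(15.82586, 17.17285, 15.44627, 12.98615, 13.28555)
actual    <- c(28, 22, 16, 12, 14)

# Same sign convention as the table: actual minus predicted.
err <- actual - predicted

mae  <- mean(abs(err))      # typical size of a miss, in mm
rmse <- sqrt(mean(err^2))   # penalizes the large day 1 miss more heavily

round(c(MAE = mae, RMSE = rmse), 2)

# The same summary without the first row, to see how much day 1 dominates.
round(mean(abs(err[-1])), 2)
```

With only five test rows, a single large miss such as the first day moves both numbers a lot, so they are best read as a rough guide.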
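Let's look at a linear model for predicting thigh fat in mm using the same lipocavitation features. Note that this model uses the left inner thigh (innerThighFat.MM.L), while the random forest above used the right.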
lipocavitation_ML <- lm(innerThighFat.MM.L ~
sumLipocavitationTreatments +
lipocavitationPrior,
data = training)
summary(lipocavitation_ML)
##
## Call:
## lm(formula = innerThighFat.MM.L ~ sumLipocavitationTreatments +
## lipocavitationPrior, data = training)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.6134 -0.6535 -0.1789 0.9695 3.7636
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 17.5224 1.4617 11.987 9.48e-09 ***
## sumLipocavitationTreatments -0.8115 0.2204 -3.682 0.00246 **
## lipocavitationPrior 2.3369 1.1291 2.070 0.05745 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.201 on 14 degrees of freedom
## Multiple R-squared: 0.594, Adjusted R-squared: 0.5359
## F-statistic: 10.24 on 2 and 14 DF, p-value: 0.00182
The above summary shows that the sum of lipocavitation treatments is significant at the 5% significance level (p = 0.00246) for predicting the left inner thigh fat measurement in mm. Whether or not lipocavitation was done the day prior is not significant at that level (p = 0.057). The coefficient of -0.8115 means that each additional lipocavitation treatment is associated with a decrease of about 0.81 mm in the predicted left thigh fat measurement, holding the day-prior indicator fixed, so almost every treatment lowers the measurement by a millimeter. With a sum of zero treatments and no treatment the day before, the model predicts the intercept, about 17.5 mm. The weakness of a linear model here is that its predictions are not bounded: a fat measurement can never go below 0 mm, but the fitted line keeps decreasing as treatments accumulate. For example, starting from 30 mm and decreasing by approximately 0.81 mm each treatment, a sum greater than 37 treatments would give a left thigh measurement of less than 0.
30/0.811
## [1] 36.99137
The above shows how many times approximately 0.81 mm goes into a starting left thigh measurement of 30 mm. A prediction like that is not realistic. But let's look at the more realistic range covered by our test data and compare the linear model's predictions to the actual left thigh measurements in mm.
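The extrapolation problem can also be checked directly against the fitted line. The sketch below hard-codes the rounded coefficients printed in the summary above instead of reading them from `lipocavitation_ML`, and the helper name `predict_left_thigh` is made up for this illustration.

```r
# Rounded coefficients copied from the lm summary above.
b0      <- 17.5224   # (Intercept)
b_sum   <- -0.8115   # sumLipocavitationTreatments
b_prior <-  2.3369   # lipocavitationPrior

# Hypothetical helper: predicted left inner thigh fat (mm) from the fitted line.
predict_left_thigh <- function(n_treatments, prior = 0) {
  b0 + b_sum * n_treatments + b_prior * prior
}

# Zero treatments and none the day before: the prediction is the intercept.
predict_left_thigh(0)

# Predictions keep falling as treatments accumulate, with no floor at 0 mm.
predict_left_thigh(c(10, 20, 30))

# Number of treatments at which the fitted line itself crosses 0 mm.
treatments_to_zero <- -b0 / b_sum
treatments_to_zero
```

Because the intercept is about 17.5 mm and not 30 mm, the fitted line reaches zero sooner than the 37-treatment figure, at roughly 22 treatments when no treatment was done the day before. Either way the model is only reasonable inside the range of treatments that were actually observed.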
lipocavitation_Predict_ML <- predict(lipocavitation_ML,testing)
df6 <- as.data.frame(cbind(lipocavitation_Predict_ML,
testing$innerThighFat.MM.L))
colnames(df6) <- c('predicted_L_ThighMeasure','actual_L_ThighMeasure')
df6$actual_PredictedError <- df6$actual_L_ThighMeasure-df6$predicted_L_ThighMeasure
df6
## predicted_L_ThighMeasure actual_L_ThighMeasure actual_PredictedError
## 1 16.71092 28 11.2890779
## 2 19.04785 20 0.9521495
## 6 15.08795 20 4.9120462
## 11 15.80191 14 -1.8019140
## 18 13.36746 12 -1.3674616
Our results show that the linear model under-predicted the left thigh fat measurement by up to 11.3 mm (row 1, where actual minus predicted is positive) and over-predicted it by up to 1.8 mm (row 11, where the error is negative), using just the sum of lipocavitation treatments and whether or not lipocavitation was done the day before the left thigh measurement.
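To see where these predictions come from, and which sign of the error means under- or over-prediction, three of the test rows can be rebuilt by hand. This is a sketch under stated assumptions: the coefficients are the rounded ones from the summary above, and the feature values for rows 1, 2, and 6 are read from the str() output earlier (rows 11 and 18 are not visible there, so they are left out).

```r
# Rounded coefficients copied from the lm summary.
b <- c(intercept = 17.5224, sum = -0.8115, prior = 2.3369)

# Feature values for rows 1, 2, and 6 as shown in the str() output,
# with the actual left inner thigh measurements from the df6 table.
rows <- data.frame(
  row = c(1, 2, 6),
  sumLipocavitationTreatments = c(1, 1, 3),
  lipocavitationPrior = c(0, 1, 0),
  actual = c(28, 20, 20)
)

# Linear prediction: intercept + slope * treatments + slope * prior-day flag.
rows$predicted <- b[["intercept"]] +
  b[["sum"]]   * rows$sumLipocavitationTreatments +
  b[["prior"]] * rows$lipocavitationPrior

# Same convention as df6: a positive error means the model predicted too low.
rows$error <- rows$actual - rows$predicted
rows$direction <- ifelse(rows$error > 0, "under-predicted", "over-predicted")
rows
```

The hand-computed predictions should agree with the first three rows of the table above up to the rounding of the coefficients.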