Welcome to our Gym Members Analysis Report!
In this report, we explore data from gym members to understand who they are, what they want to achieve, and how their health measurements relate to each other.
Let’s dive in! 👇
Dataset Name: Gym Recommendation Dataset
File Format: Excel (.xlsx)
Source: Gym Members Registration Data
First, we load the necessary tools (libraries) and import our gym dataset.
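The library-loading code itself does not appear in this report. As a minimal sketch, assuming the packages are `readxl`, `dplyr`, `knitr`, and `ggplot2` (inferred from the `read_excel()`, `%>%`, `kable()`, and `ggplot()` calls used later, not confirmed by the source), the setup could look like this:

```r
# Assumed package list, inferred from functions used later in the report
required_pkgs <- c("readxl", "dplyr", "knitr", "ggplot2")

# Check which packages are installed without attaching them yet
installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!installed]

if (length(missing_pkgs) > 0) {
  # Report what is missing instead of failing half-way through the script
  message("Please install: ", paste(missing_pkgs, collapse = ", "))
} else {
  # Attach every package so its functions are available by name
  for (pkg in required_pkgs) library(pkg, character.only = TRUE)
}
```

Checking first keeps the report from stopping with an error on a machine where one of the packages has not been installed yet.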
# Import the gym dataset
gym_data <- read_excel("C:/Users/Abhishek Kumar/Downloads/gym recommendation.xlsx")
Let’s understand what our data looks like:
## tibble [14,589 × 15] (S3: tbl_df/tbl/data.frame)
## $ ID : num [1:14589] 1 2 3 4 5 6 7 8 9 10 ...
## $ Sex : chr [1:14589] "Male" "Male" "Male" "Male" ...
## $ Age : num [1:14589] 18 18 18 18 18 18 18 18 18 18 ...
## $ Height : num [1:14589] 1.68 1.68 1.68 1.68 1.68 1.68 1.68 1.68 1.68 1.68 ...
## $ Weight : num [1:14589] 47.5 47.5 47.5 47.5 47.5 47.5 47.5 47.5 55 55 ...
## $ Hypertension : chr [1:14589] "No" "Yes" "No" "Yes" ...
## $ Diabetes : chr [1:14589] "No" "No" "Yes" "Yes" ...
## $ BMI : num [1:14589] 16.8 16.8 16.8 16.8 16.8 ...
## $ Level : chr [1:14589] "Underweight" "Underweight" "Underweight" "Underweight" ...
## $ Fitness Goal : chr [1:14589] "Weight Gain" "Weight Gain" "Weight Gain" "Weight Gain" ...
## $ Fitness Type : chr [1:14589] "Muscular Fitness" "Muscular Fitness" "Muscular Fitness" "Muscular Fitness" ...
## $ Exercises : chr [1:14589] "Squats, deadlifts, bench presses, and overhead presses" "Squats, deadlifts, bench presses, and overhead presses" "Squats, yoga, deadlifts, bench presses, and overhead presses" "Squats, yoga, deadlifts, bench presses, and overhead presses" ...
## $ Equipment : chr [1:14589] "Dumbbells and barbells" "Light athletic shoes, resistance bands, and light dumbbells." "Dumbbells, barbells and Blood glucose monitor" "Light athletic shoes, resistance bands, light dumbbells and a Blood glucose monitor." ...
## $ Diet : chr [1:14589] "Vegetables: (Carrots, Sweet Potato, and Lettuce); Protein Intake: (Red meats, poultry, fish, eggs, dairy produc"| __truncated__ "Vegetables: (Tomatoes, Garlic, leafy greens, broccoli, carrots, and bell peppers); Protein Intake: (poultry, fi"| __truncated__ "Vegetables: (Garlic, Roma Tomatoes, Capers and Iceberg Lettuce); Protein Intake: (Cheese Standwish, Baru Nuts, "| __truncated__ "Vegetables: (Garlic, Roma Tomatoes, Capers, Green Papper, and Iceberg Lettuce); Protein Intake: (Cheese Sandwic"| __truncated__ ...
## $ Recommendation: chr [1:14589] "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. It is important"| __truncated__ "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. I'm not a medic"| __truncated__ "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. I'm not a docto"| __truncated__ "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. I'm not a docto"| __truncated__ ...
##        ID            Sex                 Age            Height
##  Min.   :    1   Length:14589       Min.   :18.00   Min.   :1.30
##  1st Qu.: 3648   Class :character   1st Qu.:28.00   1st Qu.:1.64
##  Median : 7295   Mode  :character   Median :39.00   Median :1.68
##  Mean   : 7295                      Mean   :39.55   Mean   :1.70
##  3rd Qu.:10942                      3rd Qu.:51.00   3rd Qu.:1.77
##  Max.   :14589                      Max.   :63.00   Max.   :2.03
##      Weight       Hypertension         Diabetes              BMI
##  Min.   : 32.00   Length:14589       Length:14589       Min.   : 9.52
##  1st Qu.: 55.00   Class :character   Class :character   1st Qu.:18.94
##  Median : 70.00   Mode  :character   Mode  :character   Median :25.25
##  Mean   : 70.51                                         Mean   :24.42
##  3rd Qu.: 86.00                                         3rd Qu.:29.32
##  Max.   :130.00                                         Max.   :70.00
##     Level           Fitness Goal       Fitness Type        Exercises
##  Length:14589       Length:14589       Length:14589       Length:14589
##  Class :character   Class :character   Class :character   Class :character
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character
##
##
##
##   Equipment             Diet           Recommendation
##  Length:14589       Length:14589       Length:14589
##  Class :character   Class :character   Class :character
##  Mode  :character   Mode  :character   Mode  :character
##
##
##
Key Points:
In this section, we explore the data to answer simple questions about our gym members.
Question: How many males and females are in our gym?
##
## Female Male
## 5215 9374
# Create bar plot
barplot(gender_count,
        main = "Gender Distribution of Gym Members",
        col = c("skyblue", "pink"),
        ylab = "Number of Members",
        xlab = "Gender",
        border = "white",
        las = 1)
Insight: Male members (9,374) outnumber female members (5,215), roughly 64% to 36%. Knowing this split helps the gym plan facilities and classes accordingly.
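The `gender_count` object passed to `barplot()` is not defined in the code shown; it was presumably built with `table()`, the same way `hyper_count` is built later in the report. A minimal sketch on toy data (the vector `sex` is a made-up stand-in for `gym_data$Sex`):

```r
# Toy stand-in for the Sex column (not the real gym data)
sex <- c("Male", "Female", "Male", "Male", "Female")

# table() counts how often each value occurs, with names sorted alphabetically
gender_count <- table(sex)
print(gender_count)

# prop.table() turns the counts into shares of the total (here in percent)
gender_share <- round(100 * prop.table(gender_count), 1)
print(gender_share)
```

Because `table()` sorts the names alphabetically, "Female" comes first, so in the bar plot the first colour in `col` applies to female members and the second to male members. The same pattern presumably applies to `goal_count` and `type_count`, which are also used later without a visible definition.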
Question: What is the average BMI for different fitness levels?
BMI (Body Mass Index) tells us if someone is underweight, normal, or overweight.
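As a rough sketch of how BMI values map to categories, the snippet below uses the standard WHO adult cut-offs (18.5, 25, and 30). The dataset's `Level` column may have been built with different thresholds, so treat `classify_bmi()` (a hypothetical helper, not part of the original analysis) as illustrative only.

```r
# Hypothetical helper: label BMI values using WHO adult cut-offs.
# The dataset's own Level column may use different thresholds.
classify_bmi <- function(bmi) {
  cut(bmi,
      breaks = c(-Inf, 18.5, 25, 30, Inf),
      labels = c("Underweight", "Normal", "Overweight", "Obese"),
      right = FALSE)  # intervals are [lower, upper), so 25 counts as Overweight
}

example_bmi <- c(16.8, 22.9, 27.4, 33.1)
as.character(classify_bmi(example_bmi))
```

With these example values, each BMI falls into a different category, from Underweight through Obese.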
# Calculate average BMI by level
avg_bmi <- gym_data %>%
  group_by(Level) %>%
  summarise(Average_BMI = mean(BMI, na.rm = TRUE))
# Display table
kable(avg_bmi, caption = "Average BMI by Fitness Level")

| Level | Average_BMI |
|---|---|
| Normal | 21.00792 |
| Obuse | 33.27785 |
| Overweight | 26.89384 |
| Underweight | 15.99529 |
# Create bar plot
barplot(avg_bmi$Average_BMI,
        names.arg = avg_bmi$Level,
        main = "Average BMI Across Different Fitness Levels",
        col = "orange",
        ylab = "Average BMI",
        xlab = "Fitness Level",
        border = "white",
        las = 1)
Insight: The average BMI rises steadily from about 16.0 for Underweight to 33.3 for Obese (spelled "Obuse" in the dataset). This is consistent with the Level labels being based on BMI, and it helps trainers create personalized workout and diet plans according to each member’s body type.
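Because the rows of `avg_bmi` come out in alphabetical order, the bars are not drawn from lowest to highest BMI. Sorting the rows first makes the upward trend easier to see. A minimal sketch, with the averages from the table above rounded and hard-coded into a small data frame:

```r
# Averages copied from the table above (rounded), keeping the dataset's "Obuse" spelling
avg_bmi <- data.frame(
  Level = c("Normal", "Obuse", "Overweight", "Underweight"),
  Average_BMI = c(21.01, 33.28, 26.89, 16.00)
)

# Sort rows from lowest to highest average BMI
avg_bmi_sorted <- avg_bmi[order(avg_bmi$Average_BMI), ]

# Same bar plot as before, now with bars in ascending order
barplot(avg_bmi_sorted$Average_BMI,
        names.arg = avg_bmi_sorted$Level,
        main = "Average BMI Across Different Fitness Levels",
        ylab = "Average BMI")
```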
Question: What do most people want to achieve at the gym?
##
## Weight Gain Weight Loss
## 7008 7581
# Create bar plot
barplot(goal_count,
        main = "Most Popular Fitness Goals",
        col = "lightgreen",
        las = 2,
        ylab = "Number of Members",
        border = "white",
        cex.names = 0.8)
Insight: Weight Loss (7,581 members, about 52%) and Weight Gain (7,008 members, about 48%) are almost equally popular. The gym has a balanced mix of people: some want to lose fat, and others want to build muscle. So, the gym should focus on both cardio and strength programs.
Question: Is there a connection between weight and BMI?
# Calculate correlation
correlation <- cor(gym_data$Weight, gym_data$BMI, use = "complete.obs")
cat("Correlation between Weight and BMI:", round(correlation, 3), "\n")
## Correlation between Weight and BMI: 0.902
# Create scatter plot
plot(gym_data$Weight, gym_data$BMI,
     main = "Relationship Between Weight and BMI",
     xlab = "Weight (kg)",
     ylab = "BMI",
     col = "blue",
     pch = 19)
abline(lm(BMI ~ Weight, data = gym_data), col = "red", lwd = 2)
Insight: The correlation between Weight and BMI is 0.902, which is very strong: members who weigh more tend to have a higher BMI. This is expected, because BMI is calculated from weight and height. It helps the gym easily identify people who may be overweight and need a specific plan to manage their health.
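To make the correlation number less of a black box, the sketch below computes Pearson's correlation by hand (covariance divided by the product of the two standard deviations) and compares it with `cor()`. The five toy members are made up for illustration and are not taken from the gym dataset.

```r
# Toy data (not the gym dataset): weights in kg and heights in metres
weight <- c(50, 60, 70, 80, 90)
height <- c(1.60, 1.70, 1.65, 1.80, 1.75)
bmi <- weight / height^2

# Pearson correlation written out: covariance divided by both standard deviations
r_manual <- cov(weight, bmi) / (sd(weight) * sd(bmi))

# Built-in equivalent
r_builtin <- cor(weight, bmi)

round(c(manual = r_manual, builtin = r_builtin), 3)
```

Both values agree, and the result is positive because heavier toy members also have higher BMI values.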
Question: What type of training do members prefer?
##
## Cardio Fitness Muscular Fitness
## 7581 7008
# Create bar plot
barplot(type_count,
        main = "Preferred Training Types",
        col = "purple",
        ylab = "Number of Members",
        xlab = "Fitness Type",
        border = "white",
        las = 2,
        cex.names = 0.8)
Insight: Members are split almost evenly between Cardio Fitness (7,581) and Muscular Fitness (7,008). This means the gym should maintain a good balance of both areas, with enough treadmills for cardio and enough weights for muscle training.
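The two training-type counts (7,581 and 7,008) are identical to the Weight Loss and Weight Gain counts reported earlier, which hints that `Fitness Type` may simply follow from `Fitness Goal` in this dataset. A cross-tabulation is a quick way to check. The sketch below uses toy vectors; the real check would use the `Fitness Goal` and `Fitness Type` columns of `gym_data`.

```r
# Toy data standing in for the two columns (not the gym dataset)
goal <- c("Weight Loss", "Weight Gain", "Weight Loss", "Weight Gain", "Weight Loss")
type <- c("Cardio Fitness", "Muscular Fitness", "Cardio Fitness",
          "Muscular Fitness", "Cardio Fitness")

# Rows are goals, columns are training types
goal_by_type <- table(Goal = goal, Type = type)
print(goal_by_type)

# TRUE when every goal maps to exactly one training type
one_to_one <- all(rowSums(goal_by_type > 0) == 1)
one_to_one
```

If the real data also gives `TRUE`, the training-type chart carries the same information as the fitness-goal chart and does not describe a separate preference.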
Question: How many members have health conditions?
# Count hypertension and diabetes cases
hyper_count <- table(gym_data$Hypertension)
diabetes_count <- table(gym_data$Diabetes)
par(mfrow = c(1, 2)) # Two plots side by side
barplot(hyper_count,
        main = "Hypertension Cases",
        col = "red",
        ylab = "Members",
        border = "white")
barplot(diabetes_count,
        main = "Diabetes Cases",
        col = "brown",
        ylab = "Members",
        border = "white")
par(mfrow = c(1, 1)) # Restore the default single-plot layout
Insight: Most gym members are healthy, but a small number have Hypertension or Diabetes. This information is important because trainers can give these members safe, lighter workouts to reduce health risks.
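The report shows the two conditions only as separate charts, without the underlying counts. A sketch of how one might count members who have at least one of the two conditions, on toy data (the vectors below are made up and stand in for `gym_data$Hypertension` and `gym_data$Diabetes`):

```r
# Toy Yes/No columns (not the gym dataset)
hypertension <- c("No", "Yes", "No", "Yes", "No", "No")
diabetes     <- c("No", "No", "Yes", "Yes", "No", "No")

# Two-way table: rows = hypertension, columns = diabetes
condition_table <- table(Hypertension = hypertension, Diabetes = diabetes)
print(condition_table)

# Members with at least one of the two conditions, as a percentage
any_condition <- hypertension == "Yes" | diabetes == "Yes"
share_any <- round(100 * mean(any_condition), 1)
cat("Members with at least one condition:", share_any, "%\n")
```

Running the same steps on the real columns would put a number on how many members need adapted workouts.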
Question: What is the age range of gym members?
# Calculate age statistics
cat("Average Age:", round(mean(gym_data$Age, na.rm = TRUE), 1), "years\n")
## Average Age: 39.6 years
## Median Age: 39 years
## Youngest Member: 18 years
## Oldest Member: 63 years
# Create histogram
hist(gym_data$Age,
     main = "Age Distribution of Gym Members",
     xlab = "Age (years)",
     col = "lightblue",
     border = "black",
     breaks = 15)
Insight: Members range from 18 to 63 years old, and the middle half are between 28 and 51, with a median of 39. This is largely a working-age group, so the gym can plan evening batches and energy-based programs suitable for it.
Now we use advanced techniques to predict and group members!
Question: Can we predict someone’s BMI using their age, height, and weight?
Answer: Yes! We use Linear Regression, a simple prediction model.
# Build the prediction model
model <- lm(BMI ~ Age + Height + Weight, data = gym_data)
# Show model results
summary(model)
##
## Call:
## lm(formula = BMI ~ Age + Height + Weight, data = gym_data)
##
## Residuals:
##    Min     1Q Median     3Q    Max
## -3.384 -0.364 -0.042  0.257 45.261
##
## Coefficients:
##               Estimate Std. Error  t value Pr(>|t|)
## (Intercept)  5.004e+01  1.396e-01  358.382  < 2e-16 ***
## Age         -1.718e-03  6.026e-04   -2.852  0.00435 **
## Height      -2.943e+01  8.478e-02 -347.162  < 2e-16 ***
## Weight       3.471e-01  4.150e-04  836.405  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.9593 on 14585 degrees of freedom
## Multiple R-squared: 0.9799, Adjusted R-squared: 0.9799
## F-statistic: 2.37e+05 on 3 and 14585 DF, p-value: < 2.2e-16
Understanding the Model:
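One way to read the coefficient table is to write the fitted equation out and evaluate it by hand. The sketch below copies the rounded estimates from the summary above and applies them to an example member (30 years old, 1.75 m, 70 kg). `predict_bmi_by_hand()` is a hypothetical helper, not part of the original analysis.

```r
# Rounded estimates copied from the summary(model) output
intercept <- 50.04
b_age     <- -0.001718
b_height  <- -29.43
b_weight  <- 0.3471

# Hypothetical helper: the fitted linear equation written out term by term
predict_bmi_by_hand <- function(age, height, weight) {
  intercept + b_age * age + b_height * height + b_weight * weight
}

round(predict_bmi_by_hand(age = 30, height = 1.75, weight = 70), 2)
# prints 22.78
```

Read this way, each extra kilogram adds about 0.35 BMI points, each extra centimetre of height removes about 0.29 points, and age barely matters (about 0.002 points less per year).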
# Visualize actual vs predicted BMI
gym_data$Predicted_BMI <- predict(model, gym_data)
plot(gym_data$BMI, gym_data$Predicted_BMI,
     main = "Actual BMI vs Predicted BMI",
     xlab = "Actual BMI",
     ylab = "Predicted BMI",
     col = "darkgreen",
     pch = 19)
abline(0, 1, col = "red", lwd = 2) # Perfect prediction line
Insight: This scatter plot shows how well our model predicts BMI. Each green dot is one member, with the actual BMI on the x-axis and the predicted BMI on the y-axis, and the red line represents perfect prediction. Since most points are close to the red line, the model is working very well: R² is 0.98, so Age, Height, and Weight explain about 98% of the variation in BMI. A few members lie far from the line (the largest residual is about 45), which points to unusual records worth checking.
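BMI is defined as weight divided by height squared, so it is not exactly linear in height and weight. That is one reason a linear model can fit well overall yet miss badly on a few extreme records. As an alternative that the original analysis does not use, taking logarithms makes the relationship exactly linear: log(BMI) = log(Weight) - 2 * log(Height). A sketch on synthetic data:

```r
# Synthetic members (not the gym dataset)
set.seed(1)
n <- 200
height <- runif(n, min = 1.4, max = 2.0)   # metres
weight <- runif(n, min = 40, max = 130)    # kilograms
bmi <- weight / height^2                   # BMI definition

# Linear model on the log scale
log_model <- lm(log(bmi) ~ log(weight) + log(height))
round(coef(log_model), 3)
```

The slopes come out as 1 for `log(weight)` and -2 for `log(height)`, which is the BMI formula recovered from the data. On real records the fit would be less clean wherever the stored BMI does not match the stored height and weight.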
Question: Can we divide members into similar groups?
Answer: Yes! Using K-Means Clustering, we create 3 groups based on Age, Weight, and BMI.
# Prepare data for clustering
cluster_data <- gym_data[, c("Age", "Weight", "BMI")]
cluster_data <- na.omit(cluster_data)
# Create 3 clusters
set.seed(123)
kmeans_result <- kmeans(cluster_data, centers = 3)
# Add cluster info to dataset
gym_data$Cluster <- as.factor(kmeans_result$cluster)
# Show cluster sizes
kable(table(gym_data$Cluster),
      col.names = c("Cluster", "Number of Members"),
      caption = "Members in Each Cluster")

| Cluster | Number of Members |
|---|---|
| 1 | 3945 |
| 2 | 6261 |
| 3 | 4383 |
# Visualize clusters
ggplot(gym_data, aes(x = Age, y = BMI, color = Cluster)) +
  geom_point(size = 3, alpha = 0.6) +
  labs(title = "Member Clusters: Age vs BMI",
       x = "Age (years)",
       y = "BMI") +
  theme_minimal() +
  theme(legend.position = "bottom")
Insight: The clustering divided members into 3 main groups based on Age, Weight, and BMI. Each color on the graph represents a group of members who are similar to each other. The groups might correspond to, for example, young fit members, middle-aged average members, and older high-BMI members; the cluster averages in kmeans_result$centers show what actually characterizes each group. This helps the gym understand its audience and create better workout plans for each group.
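`kmeans()` works on raw distances, so a variable with a large numeric range (here Weight, from 32 to 130 kg) can dominate the grouping. A common variation, not used in the original analysis, is to standardise the columns with `scale()` first and then look at the per-cluster averages to give each group a name. A sketch on synthetic data:

```r
# Synthetic data (not the gym dataset)
set.seed(123)
toy <- data.frame(
  Age    = round(runif(300, 18, 63)),
  Weight = round(runif(300, 32, 130), 1),
  BMI    = round(runif(300, 15, 40), 1)
)

# Standardise each column to mean 0 and standard deviation 1
toy_scaled <- scale(toy)

# nstart = 25 tries several random starts and keeps the best solution
km <- kmeans(toy_scaled, centers = 3, nstart = 25)

# Average of the original (unscaled) variables within each cluster
cluster_means <- aggregate(toy, by = list(Cluster = km$cluster), FUN = mean)
print(cluster_means)
```

Reading the averages row by row is what justifies labels such as "younger, lighter members"; without that step the cluster numbers have no meaning on their own.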
Question: If a new person joins, can we predict their BMI?
Scenario: A 30-year-old person who weighs 70 kg and is 1.75 m tall joins our gym.
# Create new member data
new_member <- data.frame(Age = 30, Weight = 70, Height = 1.75)
# Predict their BMI
predicted_bmi <- predict(model, new_member)
# Display result
cat("✨ Predicted BMI for the new member:", round(predicted_bmi, 2), "\n")
## ✨ Predicted BMI for the new member: 22.78
Insight: For a new member aged 30 who weighs 70 kg and is 1.75 m tall, the predicted BMI is around 22.78, which is in the normal range. This shows how our model can help trainers quickly estimate health levels even before starting workouts.
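Because BMI has an exact formula (weight in kilograms divided by height in metres squared), the prediction can be checked directly. For this member the formula gives a value very close to the model's estimate:

```r
# Exact BMI from the definition: weight (kg) / height (m)^2
weight_kg <- 70
height_m  <- 1.75
exact_bmi <- weight_kg / height_m^2

model_bmi <- 22.78  # value reported by the model above

cat("Exact BMI:", round(exact_bmi, 2), "\n")
cat("Difference from model:", round(model_bmi - exact_bmi, 2), "\n")
```

The exact value is 22.86, so the model is off by less than 0.1 BMI points for a member of typical height and weight.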
This analysis helps make gym experiences better for everyone. Data-driven decisions lead to healthier, happier members! 💪
Created with ❤️ using R and R Markdown
For questions or suggestions, feel free to reach out!