📊 Introduction

Welcome to our Gym Members Analysis Report!

In this report, we explore data from gym members to understand:

  • Who joins the gym (age, gender, health)
  • What their fitness goals are
  • How we can predict health metrics like BMI
  • How to group similar members together

Let’s dive in! 👇


🔧 Setup & Data Loading

First, we load the necessary tools (libraries) and import our gym dataset.

# Load required libraries
library(readxl)
library(ggplot2)
library(dplyr)
library(knitr)
# Import the gym dataset
gym_data <- read_excel("C:/Users/Abhishek Kumar/Downloads/gym recommendation.xlsx")

📋 Data Overview

Let’s understand what our data looks like:

# Display structure
str(gym_data)
## tibble [14,589 × 15] (S3: tbl_df/tbl/data.frame)
##  $ ID            : num [1:14589] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Sex           : chr [1:14589] "Male" "Male" "Male" "Male" ...
##  $ Age           : num [1:14589] 18 18 18 18 18 18 18 18 18 18 ...
##  $ Height        : num [1:14589] 1.68 1.68 1.68 1.68 1.68 1.68 1.68 1.68 1.68 1.68 ...
##  $ Weight        : num [1:14589] 47.5 47.5 47.5 47.5 47.5 47.5 47.5 47.5 55 55 ...
##  $ Hypertension  : chr [1:14589] "No" "Yes" "No" "Yes" ...
##  $ Diabetes      : chr [1:14589] "No" "No" "Yes" "Yes" ...
##  $ BMI           : num [1:14589] 16.8 16.8 16.8 16.8 16.8 ...
##  $ Level         : chr [1:14589] "Underweight" "Underweight" "Underweight" "Underweight" ...
##  $ Fitness Goal  : chr [1:14589] "Weight Gain" "Weight Gain" "Weight Gain" "Weight Gain" ...
##  $ Fitness Type  : chr [1:14589] "Muscular Fitness" "Muscular Fitness" "Muscular Fitness" "Muscular Fitness" ...
##  $ Exercises     : chr [1:14589] "Squats, deadlifts, bench presses, and overhead presses" "Squats, deadlifts, bench presses, and overhead presses" "Squats, yoga, deadlifts, bench presses, and overhead presses" "Squats, yoga, deadlifts, bench presses, and overhead presses" ...
##  $ Equipment     : chr [1:14589] "Dumbbells and barbells" "Light athletic shoes, resistance bands, and light dumbbells." "Dumbbells, barbells and Blood glucose monitor" "Light athletic shoes, resistance bands, light dumbbells and a Blood glucose monitor." ...
##  $ Diet          : chr [1:14589] "Vegetables: (Carrots, Sweet Potato, and Lettuce); Protein Intake: (Red meats, poultry, fish, eggs, dairy produc"| __truncated__ "Vegetables: (Tomatoes, Garlic, leafy greens, broccoli, carrots, and bell peppers); Protein Intake: (poultry, fi"| __truncated__ "Vegetables: (Garlic, Roma Tomatoes, Capers and Iceberg Lettuce); Protein Intake: (Cheese Standwish, Baru Nuts, "| __truncated__ "Vegetables: (Garlic, Roma Tomatoes, Capers, Green Papper, and Iceberg Lettuce); Protein Intake: (Cheese Sandwic"| __truncated__ ...
##  $ Recommendation: chr [1:14589] "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. It is important"| __truncated__ "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. I'm not a medic"| __truncated__ "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. I'm not a docto"| __truncated__ "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. I'm not a docto"| __truncated__ ...
# Show summary statistics
summary(gym_data)
##        ID            Sex                 Age            Height    
##  Min.   :    1   Length:14589       Min.   :18.00   Min.   :1.30  
##  1st Qu.: 3648   Class :character   1st Qu.:28.00   1st Qu.:1.64  
##  Median : 7295   Mode  :character   Median :39.00   Median :1.68  
##  Mean   : 7295                      Mean   :39.55   Mean   :1.70  
##  3rd Qu.:10942                      3rd Qu.:51.00   3rd Qu.:1.77  
##  Max.   :14589                      Max.   :63.00   Max.   :2.03  
##      Weight       Hypertension         Diabetes              BMI       
##  Min.   : 32.00   Length:14589       Length:14589       Min.   : 9.52  
##  1st Qu.: 55.00   Class :character   Class :character   1st Qu.:18.94  
##  Median : 70.00   Mode  :character   Mode  :character   Median :25.25  
##  Mean   : 70.51                                         Mean   :24.42  
##  3rd Qu.: 86.00                                         3rd Qu.:29.32  
##  Max.   :130.00                                         Max.   :70.00  
##     Level           Fitness Goal       Fitness Type        Exercises        
##  Length:14589       Length:14589       Length:14589       Length:14589      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##   Equipment             Diet           Recommendation    
##  Length:14589       Length:14589       Length:14589      
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
## 

Key Points:

  • We have data on 14,589 gym members, with 15 columns per member
  • Important columns: Age, Weight, Height, BMI, Sex, Level, Fitness Goal, and Fitness Type

📈 PART 1: Descriptive Analysis

In this section, we explore the data to answer simple questions about our gym members.


1️⃣ Gender Distribution

Question: How many males and females are in our gym?

# Count males and females
gender_count <- table(gym_data$Sex)
print(gender_count)
## 
## Female   Male 
##   5215   9374
# Create bar plot
barplot(gender_count,
        main = "Gender Distribution of Gym Members",
        col = c("skyblue", "pink"),
        ylab = "Number of Members",
        xlab = "Gender",
        border = "white",
        las = 1)

Insight: Of the 14,589 members, 9,374 (about 64%) are male and 5,215 (about 36%) are female. This helps the gym plan facilities and classes accordingly.
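
The counts can be turned into percentages with `prop.table()`. This is a minimal sketch that re-enters the two counts printed by `table()` so it runs without the Excel file; in the report itself, `prop.table(table(gym_data$Sex))` should give the same shares.

```r
# Counts copied from the printed gender table (Female 5215, Male 9374)
gender_count <- c(Female = 5215, Male = 9374)

# prop.table() divides each count by the total; multiply by 100 for percentages
gender_pct <- round(100 * prop.table(gender_count), 1)
print(gender_pct)
```

This prints 35.7 for Female and 64.3 for Male.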


2️⃣ Average BMI by Fitness Level

Question: What is the average BMI for different fitness levels?

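BMI (Body Mass Index) tells us if someone is underweight, normal weight, overweight, or obese. In this dataset, the Level column stores that BMI category, so "fitness level" here means BMI category.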
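
For reference, BMI is weight in kilograms divided by the square of height in meters. The sketch below is illustrative only: the helper names `bmi()` and `bmi_category()` are hypothetical, and the cut-offs are the standard WHO adult thresholds (18.5, 25, 30), which may differ from the rules used to build the dataset's Level column.

```r
# BMI = weight (kg) / height (m)^2
bmi <- function(weight_kg, height_m) {
  weight_kg / height_m^2
}

# Hypothetical helper: label a BMI value using the standard WHO adult cut-offs.
# Assumption: the dataset's Level column may use different boundaries.
bmi_category <- function(bmi_value) {
  cut(bmi_value,
      breaks = c(-Inf, 18.5, 25, 30, Inf),
      labels = c("Underweight", "Normal", "Overweight", "Obese"),
      right = FALSE)  # intervals are [lower, upper)
}

# First row of the dataset: 47.5 kg at 1.68 m
first_bmi <- bmi(47.5, 1.68)
print(round(first_bmi, 1))
print(as.character(bmi_category(first_bmi)))
```

The result, about 16.8 and "Underweight", agrees with the BMI and Level shown for the first row in the `str()` output.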

# Calculate average BMI by level
avg_bmi <- gym_data %>%
  group_by(Level) %>%
  summarise(Average_BMI = mean(BMI, na.rm = TRUE))

# Display table
kable(avg_bmi, caption = "Average BMI by Fitness Level")

Average BMI by Fitness Level

Level          Average_BMI
Normal            21.00792
Obuse             33.27785
Overweight        26.89384
Underweight       15.99529

# Create bar plot
barplot(avg_bmi$Average_BMI,
        names.arg = avg_bmi$Level,
        main = "Average BMI Across Different Fitness Levels",
        col = "orange",
        ylab = "Average BMI",
        xlab = "Fitness Level",
        border = "white",
        las = 1)

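Insight: Because Level is itself a BMI category, the averages rise in the expected order: Underweight (16.0), Normal (21.0), Overweight (26.9), and Obese (33.3, spelled "Obuse" in the dataset). Trainers can use these categories as a starting point for personalized workout plans.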


3️⃣ Most Common Fitness Goals

Question: What do most people want to achieve at the gym?

# Count fitness goals
goal_count <- table(gym_data$`Fitness Goal`)
print(goal_count)
## 
## Weight Gain Weight Loss 
##        7008        7581
# Create bar plot
barplot(goal_count,
        main = "Most Popular Fitness Goals",
        col = "lightgreen",
        las = 2,
        ylab = "Number of Members",
        border = "white",
        cex.names = 0.8)

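Insight: The two goals are split almost evenly: 7,581 members (about 52%) want to lose weight and 7,008 (about 48%) want to gain weight. Understanding this split helps the gym offer the right classes and equipment.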


4️⃣ Weight vs BMI Relationship

Question: Is there a connection between weight and BMI?

# Calculate correlation
correlation <- cor(gym_data$Weight, gym_data$BMI, use = "complete.obs")
cat("Correlation between Weight and BMI:", round(correlation, 3), "\n")
## Correlation between Weight and BMI: 0.902
# Create scatter plot
plot(gym_data$Weight, gym_data$BMI,
     main = "Relationship Between Weight and BMI",
     xlab = "Weight (kg)",
     ylab = "BMI",
     col = "blue",
     pch = 19)
abline(lm(BMI ~ Weight, data = gym_data), col = "red", lwd = 2)

Insight: The correlation is 0.902, showing a strong positive relationship. Heavier people tend to have higher BMI values, which is expected because BMI is calculated from weight and height.


5️⃣ Preferred Fitness Types

Question: What type of training do members prefer?

# Count fitness types
type_count <- table(gym_data$`Fitness Type`)
print(type_count)
## 
##   Cardio Fitness Muscular Fitness 
##             7581             7008
# Create bar plot
barplot(type_count,
        main = "Preferred Training Types",
        col = "purple",
        ylab = "Number of Members",
        xlab = "Fitness Type",
        border = "white",
        las = 2,
        cex.names = 0.8)

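Insight: Cardio Fitness (7,581 members) is slightly ahead of Muscular Fitness (7,008). These counts match the Weight Loss and Weight Gain counts exactly, which suggests that the fitness type follows directly from the fitness goal. Both training zones need a similar amount of equipment and space.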
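
The fitness type counts (7,581 and 7,008) are identical to the fitness goal counts, so it is worth checking whether each goal always comes with the same training type. A cross-tabulation does this. The sketch below uses a small toy data frame (hypothetical rows, not the gym dataset) with the same column names; in the report, the same `table()` call on `gym_data` would show whether the pattern holds for the real data.

```r
# Toy data (hypothetical rows, not the gym dataset) with the same column names
toy <- data.frame(
  `Fitness Goal` = c("Weight Gain", "Weight Gain", "Weight Loss",
                     "Weight Loss", "Weight Loss"),
  `Fitness Type` = c("Muscular Fitness", "Muscular Fitness", "Cardio Fitness",
                     "Cardio Fitness", "Cardio Fitness"),
  check.names = FALSE  # keep the spaces in the column names
)

# Rows are goals, columns are training types
goal_by_type <- table(toy$`Fitness Goal`, toy$`Fitness Type`)
print(goal_by_type)

# TRUE when every goal appears with exactly one training type
one_to_one <- all(rowSums(goal_by_type > 0) == 1)
print(one_to_one)
```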


6️⃣ Health Conditions Analysis

Question: How many members have health conditions?

# Count hypertension and diabetes cases
hyper_count <- table(gym_data$Hypertension)
diabetes_count <- table(gym_data$Diabetes)

par(mfrow = c(1, 2))  # Two plots side by side

barplot(hyper_count, 
        main = "Hypertension Cases", 
        col = "red", 
        ylab = "Members",
        border = "white")

barplot(diabetes_count, 
        main = "Diabetes Cases", 
        col = "brown", 
        ylab = "Members",
        border = "white")

par(mfrow = c(1, 1))  # Reset layout

Insight: Knowing health conditions helps trainers design safe workout programs for members.


7️⃣ Age Distribution

Question: What is the age range of gym members?

# Calculate age statistics
cat("Average Age:", round(mean(gym_data$Age, na.rm = TRUE), 1), "years\n")
## Average Age: 39.6 years
cat("Median Age:", median(gym_data$Age, na.rm = TRUE), "years\n")
## Median Age: 39 years
cat("Youngest Member:", min(gym_data$Age, na.rm = TRUE), "years\n")
## Youngest Member: 18 years
cat("Oldest Member:", max(gym_data$Age, na.rm = TRUE), "years\n")
## Oldest Member: 63 years
# Create histogram
hist(gym_data$Age,
     main = "Age Distribution of Gym Members",
     xlab = "Age (years)",
     col = "lightblue",
     border = "black",
     breaks = 15)

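Insight: Ages range from 18 to 63, and the middle half of members are between 28 and 51 years old (median 39). Members are spread widely across this range, so the gym should plan programs for a broad mix of ages.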


🔮 PART 2: Predictive Analysis

Now we use advanced techniques to predict and group members!


8️⃣ Predicting BMI with Machine Learning

Question: Can we predict someone’s BMI using their age, height, and weight?

Answer: Yes! We use Linear Regression, a simple prediction model.

# Build the prediction model
model <- lm(BMI ~ Age + Height + Weight, data = gym_data)

# Show model results
summary(model)
## 
## Call:
## lm(formula = BMI ~ Age + Height + Weight, data = gym_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.384 -0.364 -0.042  0.257 45.261 
## 
## Coefficients:
##               Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)  5.004e+01  1.396e-01  358.382  < 2e-16 ***
## Age         -1.718e-03  6.026e-04   -2.852  0.00435 ** 
## Height      -2.943e+01  8.478e-02 -347.162  < 2e-16 ***
## Weight       3.471e-01  4.150e-04  836.405  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9593 on 14585 degrees of freedom
## Multiple R-squared:  0.9799, Adjusted R-squared:  0.9799 
## F-statistic: 2.37e+05 on 3 and 14585 DF,  p-value: < 2.2e-16

Understanding the Model:

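  • R-squared: The share of the variation in BMI that the model explains (closer to 1 is better). Here it is 0.9799, so the model explains about 98% of the variation.
  • Coefficients: The estimated change in BMI for a one-unit increase in each factor, with the others held fixed. Each extra kilogram adds about 0.35 BMI points, each extra meter of height removes about 29.4, and age has almost no effect (about -0.002 per year).
  • Residuals: The residual standard error is 0.96, so typical predictions are off by about 1 BMI point. The largest residual is 45.3, so at least one record is predicted very poorly and may be worth checking for a data-entry error.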
# Visualize actual vs predicted BMI
gym_data$Predicted_BMI <- predict(model, gym_data)

plot(gym_data$BMI, gym_data$Predicted_BMI,
     main = "Actual BMI vs Predicted BMI",
     xlab = "Actual BMI",
     ylab = "Predicted BMI",
     col = "darkgreen",
     pch = 19)
abline(0, 1, col = "red", lwd = 2)  # Perfect prediction line

Insight: Points close to the red line mean our predictions are accurate!
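
One reason the linear model is close but not perfect is that BMI is weight divided by height squared, which is not a straight-line function of height and weight. The sketch below uses simulated data (the `sim` data frame and its value ranges are made up for illustration, not taken from the gym dataset) to show that taking logs turns the formula into an exactly linear one: log(BMI) = log(Weight) - 2 × log(Height).

```r
set.seed(42)

# Simulated members (hypothetical values, not the gym dataset)
n <- 500
sim <- data.frame(
  Height = runif(n, 1.5, 2.0),  # meters
  Weight = runif(n, 45, 120)    # kilograms
)
sim$BMI <- sim$Weight / sim$Height^2

# Plain linear model, similar to the one in the report
# (Age is left out because the simulated BMI does not depend on it)
linear_fit <- lm(BMI ~ Height + Weight, data = sim)

# Log-log model: log(BMI) = log(Weight) - 2 * log(Height)
log_fit <- lm(log(BMI) ~ log(Weight) + log(Height), data = sim)

cat("Linear model R-squared:", round(summary(linear_fit)$r.squared, 4), "\n")
print(round(coef(log_fit), 3))
```

The log-log coefficients come out as 1 for log(Weight) and -2 for log(Height), which recovers the BMI formula. The plain linear model still fits well, but only as an approximation.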


9️⃣ Grouping Members (Clustering)

Question: Can we divide members into similar groups?

Answer: Yes! Using K-Means Clustering, we create 3 groups based on Age, Weight, and BMI.

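# Prepare data for clustering (keep only rows with no missing values)
cluster_cols <- c("Age", "Weight", "BMI")
complete_rows <- complete.cases(gym_data[, cluster_cols])
cluster_data <- gym_data[complete_rows, cluster_cols]

# Create 3 clusters
set.seed(123)
kmeans_result <- kmeans(cluster_data, centers = 3)

# Add cluster info to dataset (rows with missing values stay NA,
# so the assignment cannot fail on a length mismatch)
gym_data$Cluster <- NA
gym_data$Cluster[complete_rows] <- kmeans_result$cluster
gym_data$Cluster <- as.factor(gym_data$Cluster)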

# Show cluster sizes
kable(table(gym_data$Cluster), 
      col.names = c("Cluster", "Number of Members"),
      caption = "Members in Each Cluster")

Members in Each Cluster

Cluster   Number of Members
1                      3945
2                      6261
3                      4383

# Visualize clusters
ggplot(gym_data, aes(x = Age, y = BMI, color = Cluster)) +
  geom_point(size = 3, alpha = 0.6) +
  labs(title = "Member Clusters: Age vs BMI",
       x = "Age (years)",
       y = "BMI") +
  theme_minimal() +
  theme(legend.position = "bottom")

Insight: Each color represents a different group. Members in the same group have similar characteristics and might benefit from similar training programs!
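
K-means groups points by distance, so a feature with a larger numeric range (here Weight, in kilograms) can outweigh the others. A common variation is to standardize the features first and to try several random starts. The sketch below shows that variation on simulated data (the `members` data frame is hypothetical, not the gym dataset); it is not the method used for the cluster sizes reported above, and applying it to `gym_data` would likely give different groups.

```r
set.seed(123)

# Simulated members (hypothetical values, not the gym dataset)
n <- 300
members <- data.frame(
  Age    = sample(18:63, n, replace = TRUE),
  Height = runif(n, 1.5, 2.0),
  Weight = runif(n, 45, 120)
)
members$BMI <- members$Weight / members$Height^2

# Standardize each feature to mean 0 and standard deviation 1
features <- scale(members[, c("Age", "Weight", "BMI")])

# nstart = 25 runs k-means from 25 random starts and keeps the best solution
km <- kmeans(features, centers = 3, nstart = 25)
members$Cluster <- as.factor(km$cluster)

print(table(members$Cluster))

# Convert the cluster centers back to the original units for easier reading
centers <- sweep(km$centers, 2, attr(features, "scaled:scale"), "*")
centers <- sweep(centers, 2, attr(features, "scaled:center"), "+")
print(round(centers, 1))
```

Reading the centers in original units (years, kilograms, BMI points) makes it easier to describe each group to trainers.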


🔟 Predicting BMI for a New Member

Question: If a new person joins, can we predict their BMI?

Scenario: A 30-year-old who weighs 70 kg and is 1.75 m tall joins our gym.

# Create new member data
new_member <- data.frame(Age = 30, Weight = 70, Height = 1.75)

# Predict their BMI
predicted_bmi <- predict(model, new_member)

# Display result
cat("✨ Predicted BMI for the new member:", round(predicted_bmi, 2), "\n")
## ✨ Predicted BMI for the new member: 22.78

Insight: From age, height, and weight alone, the model estimates a BMI of about 22.78. The exact value from the BMI formula is 70 / 1.75² ≈ 22.86, so the prediction is close. This helps trainers prepare customized plans even before the first session!
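
The prediction can be reproduced by hand from the coefficients printed by `summary(model)` earlier in this report. The sketch below copies those rounded coefficients into plain variables (the variable names are only for illustration) and compares the result with the BMI formula.

```r
# Coefficients copied from the summary(model) output (rounded as printed)
intercept <- 5.004e+01
b_age     <- -1.718e-03
b_height  <- -2.943e+01
b_weight  <- 3.471e-01

# New member from the scenario
age    <- 30
weight <- 70    # kg
height <- 1.75  # m

# Linear model prediction: intercept plus each coefficient times its input
predicted_bmi <- intercept + b_age * age + b_height * height + b_weight * weight

# BMI from its definition: weight / height^2
formula_bmi <- weight / height^2

cat("Model prediction:", round(predicted_bmi, 2), "\n")
cat("BMI formula:     ", round(formula_bmi, 2), "\n")
```

The two values, 22.78 and 22.86, differ by less than 0.1 BMI points for this member.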


🎯 Key Findings & Conclusion

📌 What We Learned:

  1. Member Demographics:
    • Ages range from 18 to 63, with a median of 39
    • About 64% of members are male and 36% are female
  2. Health & Fitness:
    • Weight and BMI are strongly correlated (0.902)
    • Average BMI rises from about 16 for Underweight members to about 33 for Obese members
  3. Preferences:
    • Goals are split almost evenly between Weight Loss (52%) and Weight Gain (48%), with matching counts for Cardio and Muscular Fitness
    • This guides equipment and class scheduling
  4. Health Awareness:
    • Some members have hypertension or diabetes
    • Trainers can provide safer, customized workouts
  5. Predictions Work!
    • A linear model of age, height, and weight explains about 98% of the variation in BMI
    • Clustering helps create personalized group programs

💡 Practical Applications:

  • For Gym Owners: Plan equipment, classes, and facilities
  • For Trainers: Design personalized workout programs
  • For Members: Understand where they fit and what to expect

📚 Thank You!

This analysis helps make gym experiences better for everyone. Data-driven decisions lead to healthier, happier members! 💪


Created with ❤️ using R and R Markdown

For questions or suggestions, feel free to reach out!