📊 Introduction

Welcome to our Gym Members Analysis Report!

In this report, we explore data from gym members to understand:

  • Who joins the gym (age, gender, health)
  • What their fitness goals are
  • How we can predict health metrics like BMI
  • How to group similar members together

Let’s dive in! 👇


📂 About the Dataset

Dataset Name: Gym Recommendation Dataset
File Format: Excel (.xlsx)
Source: Gym Members Registration Data

Dataset Overview

  • Total Records: 14,589 gym members
  • Total Variables: 15 columns

Variables Description

1. Demographic Information

  • ID: Unique identification number for each member
  • Sex: Gender of the member (Male/Female)
  • Age: Age of the member (18-63 years)
  • Height: Height in meters
  • Weight: Weight in kilograms

2. Health Metrics

  • BMI: Body Mass Index (calculated from height and weight)
  • Level: BMI category (Underweight, Normal, Overweight, Obese; the raw data spells the last one as Obuse)
  • Hypertension: High blood pressure condition (Yes/No)
  • Diabetes: Diabetes condition (Yes/No)

3. Fitness Information

  • Fitness Goal: Member’s objective (Weight Loss, Weight Gain)
  • Fitness Type: Preferred training type (Cardio Fitness, Muscular Fitness)
  • Exercises: Recommended exercises for the member
  • Equipment: Suggested gym equipment
  • Diet: Recommended diet plan
  • Recommendation: Overall fitness recommendations

Data Quality

  • No missing values in key columns
  • All records are complete and validated
  • Data collected from actual gym registration forms

🔧 Setup & Data Loading

First, we load the necessary tools (libraries) and import our gym dataset.

# Load required libraries
library(readxl)
library(ggplot2)
library(dplyr)
library(knitr)
# Import the gym dataset
gym_data <- read_excel("C:/Users/Abhishek Kumar/Downloads/gym recommendation.xlsx")

📋 Data Overview

Let’s understand what our data looks like:

# Display structure
str(gym_data)
## tibble [14,589 × 15] (S3: tbl_df/tbl/data.frame)
##  $ ID            : num [1:14589] 1 2 3 4 5 6 7 8 9 10 ...
##  $ Sex           : chr [1:14589] "Male" "Male" "Male" "Male" ...
##  $ Age           : num [1:14589] 18 18 18 18 18 18 18 18 18 18 ...
##  $ Height        : num [1:14589] 1.68 1.68 1.68 1.68 1.68 1.68 1.68 1.68 1.68 1.68 ...
##  $ Weight        : num [1:14589] 47.5 47.5 47.5 47.5 47.5 47.5 47.5 47.5 55 55 ...
##  $ Hypertension  : chr [1:14589] "No" "Yes" "No" "Yes" ...
##  $ Diabetes      : chr [1:14589] "No" "No" "Yes" "Yes" ...
##  $ BMI           : num [1:14589] 16.8 16.8 16.8 16.8 16.8 ...
##  $ Level         : chr [1:14589] "Underweight" "Underweight" "Underweight" "Underweight" ...
##  $ Fitness Goal  : chr [1:14589] "Weight Gain" "Weight Gain" "Weight Gain" "Weight Gain" ...
##  $ Fitness Type  : chr [1:14589] "Muscular Fitness" "Muscular Fitness" "Muscular Fitness" "Muscular Fitness" ...
##  $ Exercises     : chr [1:14589] "Squats, deadlifts, bench presses, and overhead presses" "Squats, deadlifts, bench presses, and overhead presses" "Squats, yoga, deadlifts, bench presses, and overhead presses" "Squats, yoga, deadlifts, bench presses, and overhead presses" ...
##  $ Equipment     : chr [1:14589] "Dumbbells and barbells" "Light athletic shoes, resistance bands, and light dumbbells." "Dumbbells, barbells and Blood glucose monitor" "Light athletic shoes, resistance bands, light dumbbells and a Blood glucose monitor." ...
##  $ Diet          : chr [1:14589] "Vegetables: (Carrots, Sweet Potato, and Lettuce); Protein Intake: (Red meats, poultry, fish, eggs, dairy produc"| __truncated__ "Vegetables: (Tomatoes, Garlic, leafy greens, broccoli, carrots, and bell peppers); Protein Intake: (poultry, fi"| __truncated__ "Vegetables: (Garlic, Roma Tomatoes, Capers and Iceberg Lettuce); Protein Intake: (Cheese Standwish, Baru Nuts, "| __truncated__ "Vegetables: (Garlic, Roma Tomatoes, Capers, Green Papper, and Iceberg Lettuce); Protein Intake: (Cheese Sandwic"| __truncated__ ...
##  $ Recommendation: chr [1:14589] "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. It is important"| __truncated__ "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. I'm not a medic"| __truncated__ "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. I'm not a docto"| __truncated__ "Follow a regular exercise schedule. Adhere to the exercise and diet plan to get better results. I'm not a docto"| __truncated__ ...
# Show summary statistics
summary(gym_data)
##        ID            Sex                 Age            Height    
##  Min.   :    1   Length:14589       Min.   :18.00   Min.   :1.30  
##  1st Qu.: 3648   Class :character   1st Qu.:28.00   1st Qu.:1.64  
##  Median : 7295   Mode  :character   Median :39.00   Median :1.68  
##  Mean   : 7295                      Mean   :39.55   Mean   :1.70  
##  3rd Qu.:10942                      3rd Qu.:51.00   3rd Qu.:1.77  
##  Max.   :14589                      Max.   :63.00   Max.   :2.03  
##      Weight       Hypertension         Diabetes              BMI       
##  Min.   : 32.00   Length:14589       Length:14589       Min.   : 9.52  
##  1st Qu.: 55.00   Class :character   Class :character   1st Qu.:18.94  
##  Median : 70.00   Mode  :character   Mode  :character   Median :25.25  
##  Mean   : 70.51                                         Mean   :24.42  
##  3rd Qu.: 86.00                                         3rd Qu.:29.32  
##  Max.   :130.00                                         Max.   :70.00  
##     Level           Fitness Goal       Fitness Type        Exercises        
##  Length:14589       Length:14589       Length:14589       Length:14589      
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##   Equipment             Diet           Recommendation    
##  Length:14589       Length:14589       Length:14589      
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
## 

Key Points:

  • We have data on 14,589 gym members across 15 columns
  • Important columns: Age, Weight, Height, BMI, Sex, Fitness Goal, etc.
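
The data-quality claims in the dataset description can be checked directly. The summary above shows BMI values from 9.52 to 70.00, so a range check seems worthwhile alongside a missing-value count. The sketch below is one possible way to do this. It runs on a small made-up data frame (`toy_members`, not the real file), and the helper name `quality_report` and the BMI limits of 10 and 60 are assumptions chosen only for illustration.

```r
# Hypothetical toy data standing in for gym_data (values are made up)
toy_members <- data.frame(
  Age    = c(18, 25, 40, 40, NA),
  Height = c(1.68, 1.75, 1.60, 1.60, 1.80),
  Weight = c(47.5, 70, 180, 180, 82)
)
toy_members$BMI <- toy_members$Weight / toy_members$Height^2

# Hypothetical helper: basic quality checks for a members table.
# The default bmi_range is an arbitrary plausibility band, not a standard.
quality_report <- function(df, bmi_range = c(10, 60)) {
  list(
    missing_per_column = colSums(is.na(df)),
    duplicate_rows     = sum(duplicated(df)),
    bmi_out_of_range   = sum(df$BMI < bmi_range[1] | df$BMI > bmi_range[2],
                             na.rm = TRUE)
  )
}

report <- quality_report(toy_members)
print(report)
```

On the real data, the same call would be `quality_report(gym_data)`. Rows it flags deserve a manual look before any are removed.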

📈 PART 1: Descriptive Analysis

In this section, we explore the data to answer simple questions about our gym members.


1️⃣ Gender Distribution

Question: How many males and females are in our gym?

# Count males and females
gender_count <- table(gym_data$Sex)
print(gender_count)
## 
## Female   Male 
##   5215   9374
# Create bar plot
barplot(gender_count,
        main = "Gender Distribution of Gym Members",
        col = c("skyblue", "pink"),
        ylab = "Number of Members",
        xlab = "Gender",
        border = "white",
        las = 1)

Insight: Of the 14,589 members, 9,374 (about 64%) are male and 5,215 (about 36%) are female, so men outnumber women by roughly 1.8 to 1. This helps the gym plan facilities and classes accordingly.
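
Raw counts are easier to compare as percentages. The short sketch below re-enters the two counts printed above by hand, so it runs without the Excel file. With the real data, `prop.table(table(gym_data$Sex))` should give the same shares.

```r
# Counts taken from the table printed above
gender_count <- c(Female = 5215, Male = 9374)

# Convert counts to percentages of all members
gender_pct <- round(100 * prop.table(gender_count), 1)
print(gender_pct)
```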


2️⃣ Average BMI by Fitness Level

Question: What is the average BMI for different fitness levels?

BMI (Body Mass Index) tells us if someone is underweight, normal, or overweight.
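
For reference, BMI is weight in kilograms divided by the square of height in meters. The sketch below computes it for the first member shown in the str() output (47.5 kg, 1.68 m) and assigns a category. The cut-offs used (18.5, 25, 30) are the commonly cited WHO thresholds and are an assumption here, since the dataset does not document the rule behind its Level column. The function names `bmi` and `bmi_level` are hypothetical.

```r
# BMI = weight (kg) divided by height (m) squared
bmi <- function(weight_kg, height_m) weight_kg / height_m^2

# Assumed WHO-style cut-offs; the dataset's own rule is not documented
bmi_level <- function(x) {
  cut(x,
      breaks = c(-Inf, 18.5, 25, 30, Inf),
      labels = c("Underweight", "Normal", "Overweight", "Obese"),
      right  = FALSE)  # intervals are closed on the left: [18.5, 25) etc.
}

# First member in the str() output: 47.5 kg, 1.68 m
first_bmi <- bmi(47.5, 1.68)
print(round(first_bmi, 1))                  # 16.8, matching the BMI column
print(as.character(bmi_level(first_bmi)))   # "Underweight"
```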

# Calculate average BMI by level
avg_bmi <- gym_data %>%
  group_by(Level) %>%
  summarise(Average_BMI = mean(BMI, na.rm = TRUE))

# Display table
kable(avg_bmi, caption = "Average BMI by Fitness Level")
Average BMI by Fitness Level

Level         Average_BMI
Normal           21.00792
Obuse            33.27785
Overweight       26.89384
Underweight      15.99529

# Create bar plot
barplot(avg_bmi$Average_BMI,
        names.arg = avg_bmi$Level,
        main = "Average BMI Across Different Fitness Levels",
        col = "orange",
        ylab = "Average BMI",
        xlab = "Fitness Level",
        border = "white",
        las = 1)

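Insight: The average BMI rises steadily from Underweight (about 16.0) through Normal (21.0) and Overweight (26.9) to Obese (33.3, labelled Obuse in the data). The table and chart list the levels alphabetically, so this order is not obvious at first glance. The pattern is consistent with the levels being assigned from BMI, and it helps trainers create personalized workout and diet plans according to each member’s body type.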


3️⃣ Most Common Fitness Goals

Question: What do most people want to achieve at the gym?

# Count fitness goals
goal_count <- table(gym_data$`Fitness Goal`)
print(goal_count)
## 
## Weight Gain Weight Loss 
##        7008        7581
# Create bar plot
barplot(goal_count,
        main = "Most Popular Fitness Goals",
        col = "lightgreen",
        las = 2,
        ylab = "Number of Members",
        border = "white",
        cex.names = 0.8)

Insight: Weight Loss (7,581 members, about 52%) and Weight Gain (7,008 members, about 48%) are almost equally popular. The gym serves a balanced mix of people: some want to lose weight, and others want to gain it. So, the gym should focus on both cardio and strength programs.


4️⃣ Weight vs BMI Relationship

Question: Is there a connection between weight and BMI?

# Calculate correlation
correlation <- cor(gym_data$Weight, gym_data$BMI, use = "complete.obs")
cat("Correlation between Weight and BMI:", round(correlation, 3), "\n")
## Correlation between Weight and BMI: 0.902
# Create scatter plot
plot(gym_data$Weight, gym_data$BMI,
     main = "Relationship Between Weight and BMI",
     xlab = "Weight (kg)",
     ylab = "BMI",
     col = "blue",
     pch = 19)
abline(lm(BMI ~ Weight, data = gym_data), col = "red", lwd = 2)

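Insight: The correlation between Weight and BMI is 0.902, which is very strong. This is expected, because BMI is calculated from weight and height: at a given height, a heavier person always has a higher BMI. The scatter around the red line comes from differences in height. The relationship helps the gym easily identify people who may be overweight and need a specific plan to manage their health.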


5️⃣ Preferred Fitness Types

Question: What type of training do members prefer?

# Count fitness types
type_count <- table(gym_data$`Fitness Type`)
print(type_count)
## 
##   Cardio Fitness Muscular Fitness 
##             7581             7008
# Create bar plot
barplot(type_count,
        main = "Preferred Training Types",
        col = "purple",
        ylab = "Number of Members",
        xlab = "Fitness Type",
        border = "white",
        las = 2,
        cex.names = 0.8)

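Insight: Members are split almost evenly between Cardio Fitness (7,581) and Muscular Fitness (7,008). These counts match the Weight Loss and Weight Gain counts exactly, which suggests that each fitness type is tied to one goal in this dataset. The gym should maintain a good balance of both areas, for example enough treadmills for cardio and enough weights for muscle training.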


6️⃣ Health Conditions Analysis

Question: How many members have health conditions?

# Count hypertension and diabetes cases
hyper_count <- table(gym_data$Hypertension)
diabetes_count <- table(gym_data$Diabetes)

par(mfrow = c(1, 2))  # Two plots side by side

barplot(hyper_count, 
        main = "Hypertension Cases", 
        col = "red", 
        ylab = "Members",
        border = "white")

barplot(diabetes_count, 
        main = "Diabetes Cases", 
        col = "brown", 
        ylab = "Members",
        border = "white")

par(mfrow = c(1, 1))  # Reset layout

Insight: Most gym members are healthy, but a small number have Hypertension or Diabetes. This information is important because trainers can give these members safe and light workouts to avoid any health risks.

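
The charts show the two conditions separately, and the exact counts are not printed in this report. The sketch below shows one way to print counts and percentages, and to cross-tabulate the two conditions so members with both can be spotted. It uses a small made-up data frame (`toy_health`) in place of the real data, so the numbers it prints say nothing about the actual members.

```r
# Hypothetical toy data standing in for gym_data (values are made up)
toy_health <- data.frame(
  Hypertension = c("No", "Yes", "No", "Yes", "No", "No"),
  Diabetes     = c("No", "No", "Yes", "Yes", "No", "No")
)

# Counts and percentages for one condition
hyper_count <- table(toy_health$Hypertension)
hyper_pct   <- round(100 * prop.table(hyper_count), 1)
print(hyper_count)
print(hyper_pct)

# Cross-tabulation: members with both, one, or neither condition
both <- table(Hypertension = toy_health$Hypertension,
              Diabetes     = toy_health$Diabetes)
print(both)
```

With the real data, replacing `toy_health` by `gym_data` should give the figures needed to confirm how common each condition is.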

7️⃣ Age Distribution

Question: What is the age range of gym members?

# Calculate age statistics
cat("Average Age:", round(mean(gym_data$Age, na.rm = TRUE), 1), "years\n")
## Average Age: 39.6 years
cat("Median Age:", median(gym_data$Age, na.rm = TRUE), "years\n")
## Median Age: 39 years
cat("Youngest Member:", min(gym_data$Age, na.rm = TRUE), "years\n")
## Youngest Member: 18 years
cat("Oldest Member:", max(gym_data$Age, na.rm = TRUE), "years\n")
## Oldest Member: 63 years
# Create histogram
hist(gym_data$Age,
     main = "Age Distribution of Gym Members",
     xlab = "Age (years)",
     col = "lightblue",
     border = "black",
     breaks = 15)

Insight: Ages run from 18 to 63 with a median of 39, and the middle half of members are between 28 and 51 years old (the first and third quartiles in the summary above). This is largely a working-age group, so the gym can plan evening batches and programs suited to it.
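
A histogram gives the shape of the distribution, but a statement about where most members fall is easier to check with quartiles and a direct share. The sketch below illustrates both on a small made-up vector (`toy_age`), so the printed values are not the gym's real figures.

```r
# Hypothetical toy ages standing in for gym_data$Age (values are made up)
toy_age <- c(18, 22, 27, 31, 36, 39, 44, 50, 57, 63)

# Quartiles: the middle half of members lies between the 25% and 75% values
age_quartiles <- quantile(toy_age, probs = c(0.25, 0.5, 0.75))
print(age_quartiles)

# Share of members inside a chosen age band, here 25 to 45
share_25_45 <- mean(toy_age >= 25 & toy_age <= 45)
cat("Share aged 25-45:", round(100 * share_25_45), "%\n")
```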


🔮 PART 2: Predictive Analysis

Now we use advanced techniques to predict and group members!


8️⃣ Predicting BMI with Machine Learning

Question: Can we predict someone’s BMI using their age, height, and weight?

Answer: Yes! We use Linear Regression, a simple prediction model.

# Build the prediction model
model <- lm(BMI ~ Age + Height + Weight, data = gym_data)

# Show model results
summary(model)
## 
## Call:
## lm(formula = BMI ~ Age + Height + Weight, data = gym_data)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -3.384 -0.364 -0.042  0.257 45.261 
## 
## Coefficients:
##               Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)  5.004e+01  1.396e-01  358.382  < 2e-16 ***
## Age         -1.718e-03  6.026e-04   -2.852  0.00435 ** 
## Height      -2.943e+01  8.478e-02 -347.162  < 2e-16 ***
## Weight       3.471e-01  4.150e-04  836.405  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9593 on 14585 degrees of freedom
## Multiple R-squared:  0.9799, Adjusted R-squared:  0.9799 
## F-statistic: 2.37e+05 on 3 and 14585 DF,  p-value: < 2.2e-16

Understanding the Model:

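  • R-squared: The share of the variation in BMI that the model explains (closer to 1 is better). Here it is 0.9799.
  • Coefficients: The estimated change in BMI for a one-unit increase in each factor, holding the others fixed. Each extra kilogram adds about 0.35 BMI points, each extra centimeter of height removes about 0.29, and age has almost no effect (about -0.002 per year).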
# Visualize actual vs predicted BMI
gym_data$Predicted_BMI <- predict(model, gym_data)

plot(gym_data$BMI, gym_data$Predicted_BMI,
     main = "Actual BMI vs Predicted BMI",
     xlab = "Actual BMI",
     ylab = "Predicted BMI",
     col = "darkgreen",
     pch = 19)
abline(0, 1, col = "red", lwd = 2)  # Perfect prediction line

Insight: This scatter plot shows how well our model predicts BMI. Each green dot is one member, with actual BMI on the x-axis and predicted BMI on the y-axis, and the red line represents perfect prediction. Most points are close to the red line, so the model fits well: R² is 0.98, meaning Age, Height, and Weight explain about 98% of the variation in BMI. Two caveats apply. The fit is measured on the same data used to build the model, and a few members are predicted poorly (the largest residual is about 45 BMI points).
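
The R² reported above is measured on the same rows used to fit the model. A common extra check is to hold out part of the data and score the model on rows it has not seen. The sketch below is a minimal illustration of that idea, not the method used in this report. It runs on simulated members (made-up heights and weights, with BMI computed from the formula), and the 80/20 split is an arbitrary choice.

```r
set.seed(42)

# Simulated members (made-up values, not the real dataset)
n <- 500
sim <- data.frame(
  Age    = sample(18:63, n, replace = TRUE),
  Height = round(runif(n, 1.50, 1.95), 2),
  Weight = round(runif(n, 45, 110), 1)
)
sim$BMI <- sim$Weight / sim$Height^2

# Hold out 20% of the rows for testing
test_idx <- sample(seq_len(n), size = n / 5)
train <- sim[-test_idx, ]
test  <- sim[test_idx, ]

# Fit on the training rows only, then predict the unseen rows
fit  <- lm(BMI ~ Age + Height + Weight, data = train)
pred <- predict(fit, newdata = test)

# Out-of-sample error and R-squared
rmse    <- sqrt(mean((test$BMI - pred)^2))
r2_test <- 1 - sum((test$BMI - pred)^2) / sum((test$BMI - mean(test$BMI))^2)
cat("Test RMSE:", round(rmse, 3), " Test R-squared:", round(r2_test, 3), "\n")
```

If the test R² stays close to the training R², the model is unlikely to be overfitting. With only three predictors and thousands of rows, that is the expected outcome here.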


9️⃣ Grouping Members (Clustering)

Question: Can we divide members into similar groups?

Answer: Yes! Using K-Means Clustering, we create 3 groups based on Age, Weight, and BMI.

# Prepare data for clustering
cluster_data <- gym_data[, c("Age", "Weight", "BMI")]
cluster_data <- na.omit(cluster_data)

# Create 3 clusters
set.seed(123)
kmeans_result <- kmeans(cluster_data, centers = 3)

# Add cluster info to dataset
gym_data$Cluster <- as.factor(kmeans_result$cluster)

# Show cluster sizes
kable(table(gym_data$Cluster), 
      col.names = c("Cluster", "Number of Members"),
      caption = "Members in Each Cluster")
Members in Each Cluster

Cluster   Number of Members
1                      3945
2                      6261
3                      4383

# Visualize clusters
ggplot(gym_data, aes(x = Age, y = BMI, color = Cluster)) +
  geom_point(size = 3, alpha = 0.6) +
  labs(title = "Member Clusters: Age vs BMI",
       x = "Age (years)",
       y = "BMI") +
  theme_minimal() +
  theme(legend.position = "bottom")

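Insight: The clustering divided members into 3 groups (3,945, 6,261, and 4,383 members) based on Age, Weight, and BMI. Each color on the graph represents a group of people who are similar on these three measures. The groups might correspond to, for example, young fit members, middle-aged average members, and older high-BMI members, but the cluster centers (kmeans_result$centers) should be inspected to confirm this. This helps the gym understand its audience and create better workout plans for each group.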
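
K-Means groups points by Euclidean distance, so a variable with a wide numeric range (Weight spans 32 to 130 kg here) can outweigh one with a narrower range. A common remedy is to standardize the variables first and then describe each cluster by its averages in the original units. The sketch below shows that variation on simulated members. It is an alternative to consider, not the procedure used above, and the choice of `nstart = 25` is an assumption.

```r
set.seed(123)

# Simulated members (made-up values, not the real dataset)
n <- 300
sim <- data.frame(
  Age    = sample(18:63, n, replace = TRUE),
  Height = runif(n, 1.50, 1.95),
  Weight = runif(n, 45, 110)
)
sim$BMI <- sim$Weight / sim$Height^2

# Standardize so each variable has mean 0 and standard deviation 1
features <- scale(sim[, c("Age", "Weight", "BMI")])

# nstart = 25 tries several random starts and keeps the best solution
km <- kmeans(features, centers = 3, nstart = 25)
sim$Cluster <- factor(km$cluster)

# Average Age, Weight, and BMI per cluster, in the original units
cluster_profile <- aggregate(cbind(Age, Weight, BMI) ~ Cluster,
                             data = sim, FUN = mean)
print(cluster_profile)
```

Reading the profile table row by row is what supports labels such as "younger, lower BMI" for a cluster.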


🔟 Predicting BMI for a New Member

Question: If a new person joins, can we predict their BMI?

Scenario: A 30-year-old person, weighing 70 kg, height 1.75 m joins our gym.

# Create new member data
new_member <- data.frame(Age = 30, Weight = 70, Height = 1.75)

# Predict their BMI
predicted_bmi <- predict(model, new_member)

# Display result
cat("✨ Predicted BMI for the new member:", round(predicted_bmi, 2), "\n")
## ✨ Predicted BMI for the new member: 22.78

Insight: For a new person aged 30, weighing 70 kg, and height 1.75 m, the predicted BMI is around 22.78, which is in the normal range. The exact value from the BMI formula is 70 / 1.75² ≈ 22.86, so the model is off by less than 0.1 for this member. This shows how our model can help trainers quickly estimate health levels even before starting workouts.
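
To see where the predicted value comes from, the calculation can be repeated by hand. The sketch below copies the rounded coefficients from the model summary printed earlier, applies them to the new member, and compares the result with the exact BMI formula (weight divided by height squared). The variable names are chosen for illustration.

```r
# Coefficients copied (rounded) from the summary(model) output above
coefs <- c(intercept = 50.04, age = -0.001718, height = -29.43, weight = 0.3471)

# New member from the scenario
age <- 30; weight <- 70; height <- 1.75

# Linear model prediction, computed by hand
model_bmi <- coefs[["intercept"]] + coefs[["age"]] * age +
  coefs[["height"]] * height + coefs[["weight"]] * weight

# Exact BMI from the definition: weight / height^2
exact_bmi <- weight / height^2

cat("Model:", round(model_bmi, 2), " Formula:", round(exact_bmi, 2), "\n")
# prints: Model: 22.78  Formula: 22.86
```

The gap is small for this member because a height of 1.75 m and a weight of 70 kg are near the middle of the data. A straight-line model can drift further from the formula at extreme heights or weights.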


🎯 Key Findings & Conclusion

📌 What We Learned:

  1. Member Demographics:
    • Half of the members are aged 28-51 years (median 39)
    • About 64% of members are male and 36% are female, which matters for planning
  2. Health & Fitness:
    • Weight and BMI are strongly correlated
    • Different fitness levels have different BMI averages
  3. Preferences:
    • Fitness goals and training types are each split almost evenly (about 52% vs 48%)
    • This guides equipment and class scheduling
  4. Health Awareness:
    • Some members have hypertension or diabetes
    • Trainers can provide safer, customized workouts
  5. Predictions Work!
    • We can predict BMI using simple measurements
    • Clustering helps create personalized group programs

💡 Practical Applications:

  • For Gym Owners: Plan equipment, classes, and facilities
  • For Trainers: Design personalized workout programs
  • For Members: Understand where they fit and what to expect

📚 Thank You!

This analysis helps make gym experiences better for everyone. Data-driven decisions lead to healthier, happier members! 💪


Created with ❤️ using R and R Markdown

For questions or suggestions, feel free to reach out!