library(reticulate)
use_python("C:/ProgramData/Anaconda3/python.exe")

Introduction

This study is inspired by, and updates and adapts, the 17 September 2020 Medium post “How Unhealthy Is Your Starbucks Drink?” Because restaurant and coffee shop menus change frequently, updated Starbucks drink nutrition data were obtained (i.e., scraped) from the Starbucks online menu on March 6, 2022, using the same Python code that the author of the Medium post used.

There are also several data exploration projects and analyses of the Starbucks menu data posted to Kaggle under “Nutrition facts for Starbucks”. Many of these provide mostly text output or consist of a simple plot of one or two variables (e.g., calories). The focus here is on comparing aspects of the nutrition data between 2020 and today (March 2022).

About the Data

Starting on 7 May 2018, the FDA promulgated regulations that require businesses to provide, upon request, written nutrition information. These regulations apply to restaurants and similar retail food establishments if they are part of a chain of 20 or more locations, doing business under the same name, offering for sale substantially the same menu items and offering for sale restaurant-type foods.

Input Data and Cleanup

The current Starbucks nutrition data (file “starbucks_nutrition_20220306.csv”) contain the drink nutrition information from the March 6, 2022 online menu and include the following variables:

drink_name: Name of the drink
type: Type of drink, categories defined by Starbucks
size: Size of the drink
calories: Number of calories
fat: Total fat (g)
cholesterol: Cholesterol (mg)
sodium: Sodium (mg)
carb: Total carbohydrates (g)
sugar: Sugars (g)
protein: Protein (g)
caffeine: Caffeine (mg)

The Starbucks nutrition data for 2020 (file “starbucks_nutrition_20200902.csv”) contain the same nutrition variables but for the 2 September 2020 Starbucks online menu.

For the purpose of this comparison, only Starbucks drinks in Grande size are included. Each data row in the analysis is thus a unique drink with a unique drink name. Grande drinks for which nutrition data are not listed on the web site have been omitted. Note that Grande is not the largest size Starbucks offers (Venti is larger), so larger sizes of the same drink will generally have higher values for the nutrition variables listed above.

# imports

import os
#os.environ['QT_QPA_PLATFORM_PLUGIN_PATH'] = '/Users/awcox/opt/anaconda3/Library/plugins/platforms'

# imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 
import seaborn as sns
from math import sqrt
import matplotlib.patches as mpatches
from matplotlib.patches import Polygon, Rectangle


# read in the nutrition data for the Starbucks drinks
sb_drinks = pd.read_csv("starbucks_nutrition_20220306.csv")
old_data = pd.read_csv("starbucks_nutrition_20200902.csv")

# Clean up the new data
sb_drinks_df = sb_drinks.copy(deep=True)            # work on a copy so the raw data stay untouched
sb_drinks_df.dropna(inplace=True)                   # drop drinks which do not have nutrition information
sb_drinks_df = sb_drinks_df[sb_drinks_df['size'] == 'Grande'].copy()    # select only Grande size drinks

# group all of the Frappuccino variations
sb_drinks_df['type'] = sb_drinks_df['type'].replace({'Frappuccino® Blended Beverages':'Frappuccinos'})
# regex=False treats the search text literally, regardless of the pandas version
sb_drinks_df['drink_name'] = sb_drinks_df['drink_name'].str.replace('Frappuccino® Blended Beverage', 'Frappuccino', regex=False)
sb_drinks_df['drink_name'] = sb_drinks_df['drink_name'].str.replace('Frappuccino®', 'Frappuccino', regex=False)


# Clean up the old data
old_data_df = old_data.copy(deep=True)            # work on a copy so the raw data stay untouched
old_data_df.dropna(inplace=True)                  # drop drinks which do not have nutrition information
old_data_df = old_data_df[old_data_df['size'] == 'Grande'].copy()    # select only Grande size drinks

# group all of the Frappuccino variations
# (drink names use the singular 'Frappuccino', as in the new data, so that names match across years)
old_data_df['type'] = old_data_df['type'].replace({'Frappuccino® Blended Beverages':'Frappuccinos'})
old_data_df['drink_name'] = old_data_df['drink_name'].str.replace('Frappuccino® Blended Beverage', 'Frappuccino', regex=False)
old_data_df['drink_name'] = old_data_df['drink_name'].str.replace('Frappuccino®', 'Frappuccino', regex=False)

Comparing the number of Grande drinks

# Rectangle comparison plot of the number of Grande drinks

fig1, ax1 = plt.subplots(figsize=(5,5))

# define and plot rectangles for the comparison
rectangle1 = Rectangle((0,0), sqrt(len(old_data_df)), sqrt(len(old_data_df)), fc='orange',ec="black")
rectangle2 = Rectangle((0,0), sqrt(len(sb_drinks_df)), sqrt(len(sb_drinks_df)), fc='steelblue',ec="black")
plt.gca().add_patch(rectangle1)
plt.gca().add_patch(rectangle2)

# Add plot title and number of drinks for each rectangle
ax1.set_title("Number of Grande Drinks", fontdict={'color':'black', 'weight':'bold', 'size':20})
ax1.text(4, 5, len(sb_drinks_df), fontdict={'color':'white', 'weight':'bold', 'size':20})
ax1.text(5.5, 10.75, len(old_data_df), fontdict={'color':'black', 'weight':'bold', 'size':20})

# Add legend linking color to the year of the data
blue_patch = mpatches.Patch(color='steelblue', label='2022')
orange_patch = mpatches.Patch(color='orange', label='2020')
ax1.legend(loc='lower left', facecolor="white", handles=[orange_patch, blue_patch])

plt.axis('scaled')  # to ensure plotted as squares

plt.axis('off')     # do not plot axes

plt.show()

There are 31 fewer Grande drinks offered in 2022 than were offered in 2020. In addition, inspection of the two data sets reveals that several of the Grande drinks offered in 2022 were not on the menu in 2020 (more about this later).
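The inspection mentioned above can be sketched as a set comparison of drink names. This is a minimal illustration only: the two small data frames and the helper name menu_changes are hypothetical stand-ins, not the scraped data; with the real data, old_data_df and sb_drinks_df would be passed instead.

```python
import pandas as pd

# Hypothetical toy menus standing in for old_data_df (2020) and sb_drinks_df (2022)
menu_2020 = pd.DataFrame({"drink_name": ["Drink A", "Drink B", "Drink C"]})
menu_2022 = pd.DataFrame({"drink_name": ["Drink A", "Drink C", "Drink D"]})

def menu_changes(old_df, new_df, key="drink_name"):
    """Return drinks removed from and added to the menu, plus the net change in count."""
    old_names = set(old_df[key])
    new_names = set(new_df[key])
    return {
        "removed": sorted(old_names - new_names),   # on the old menu only
        "added": sorted(new_names - old_names),     # on the new menu only
        "net_change": len(new_names) - len(old_names),
    }

changes = menu_changes(menu_2020, menu_2022)
print(changes)
```

A negative net_change means the menu shrank, while the added list shows that a smaller menu can still contain new drinks.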

Nutrition

Distribution of Nutrition Variables

The boxplots shown below provide a comparison between the two years of the distributions of several nutrition variables. Each boxplot provides information about the variability or dispersion of values as well as the central tendency.

# create boxplots comparing the distributions of nutrition variables between the two years

# create a copy of the dataframes and drop the non-nutrition columns
df_new = sb_drinks_df.copy(deep=True)
df_new.drop(["drink_name", "type", "size"], axis = 1, inplace=True)
df_new['Year'] = '2022'

df_old = old_data_df.copy(deep=True)
df_old.drop(["drink_name", "type", "size"], axis = 1, inplace=True)
df_old['Year'] = '2020'

# concatenate the new and the old nutrition data
list_frames = [df_new, df_old]
concat_df = pd.concat(list_frames, axis=0)
concat_df.reset_index(drop=True, inplace=True) # reset the index in the concatenated data

# rearrange (i.e., melt) the data to have the rows containing the Year, Nutritional_variable_name, and value
dd=pd.melt(concat_df,id_vars=['Year'],
           value_vars=['calories','fat', 'cholesterol', 'sodium', 'carb', 'sugar', 'protein', 'caffeine'],
           var_name='Variable')

# use the seaborn boxplot to display the paired-by-year nutritional variable box and whisker charts
fig = plt.figure(figsize = (8, 6))
sns.boxplot(x='Variable',y='value',data=dd, hue='Year').set_title(
                              'Box Plot of Starbucks Grande Drink Nutrition', fontdict = { 'fontsize': 20})
plt.rcParams["axes.titlesize"] = 20
plt.legend(loc='upper center')
plt.show()

As can be seen in the above boxplots:

Nutrition Variable:
- The median values for calories, sodium and caffeine show a slight improvement (i.e., smaller values) for the current Starbucks drinks over those of 2020.
- The 1st and 3rd quartile values for these variables are also lower.
- The distribution for fat is more symmetrical in 2022 than in 2020, when it was skewed toward higher values.
- The distributions for carbohydrates, sugar and protein are basically unchanged.
- Calories for a Grande drink on the menu can still exceed 450, without add-ons. The recommended daily calorie intake for adult women is 1,800 to 2,000 (for men it is 2,400 to 2,800). Thus, one calorie-laden Grande drink can be 25% of the recommended calorie intake for an adult woman (450 / 1,800 = 25%).
- Caffeine for a Grande drink today can exceed 300 mg. It is generally recommended not to exceed 400 mg a day.
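The medians and quartiles read off the boxplots can also be tabulated directly. The following is a minimal sketch on made-up numbers (the toy values and column labels are assumptions for illustration); with the real data, the same groupby could be applied to concat_df.

```python
import pandas as pd

# Hypothetical toy values; with the real data this would be concat_df
toy = pd.DataFrame({
    "Year": ["2020"] * 5 + ["2022"] * 5,
    "calories": [100, 200, 300, 400, 500, 50, 150, 250, 350, 450],
})

# 1st quartile, median and 3rd quartile of calories for each year
summary = toy.groupby("Year")["calories"].quantile([0.25, 0.5, 0.75]).unstack()
summary.columns = ["q1", "median", "q3"]
print(summary)
```

Each row corresponds to one box in the plot: q1 and q3 are the box edges and the median is the line inside the box.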

What percentage of Starbucks drinks contain caffeine?

As illustrated in the diagram below, the percentage of caffeinated drinks is nearly the same today as in 2020.

# calculate the percentage of drinks that contain caffeine
pct_caffeine = round(len(sb_drinks_df[sb_drinks_df.caffeine > 0]) / len(sb_drinks_df) * 100, 1)
pct_caffeine_old = round(len(old_data_df[old_data_df.caffeine > 0]) / len(old_data_df) * 100, 1)

# Rectangle comparison plot of the percentage of Grande drinks containing caffeine

fig2, ax2 = plt.subplots(figsize=(5,3))
# define and plot rectangles for the comparison
rectangle1 = Rectangle((0,20), pct_caffeine_old, 20, fc='orange',ec="black")
rectangle2 = Rectangle((pct_caffeine_old,20), pct_caffeine, 20, fc='steelblue',ec="black")
plt.gca().add_patch(rectangle1)
plt.gca().add_patch(rectangle2)

# add fulcrum
x = [pct_caffeine_old - 10, pct_caffeine_old, pct_caffeine_old +10]
y = [0, 20, 0]
ax2.fill(x, y, facecolor='black', edgecolor='red', linewidth=2)

# Add plot title and percentage of caffeinated drinks for each rectangle
ax2.set_title("Percentage of Grande Drinks Containing Caffeine", fontdict={'color':'black', 'weight':'bold', 'size':12})
ax2.text(pct_caffeine_old + pct_caffeine/3, 27, str(pct_caffeine) +'%',
         fontdict={'color':'white', 'weight':'bold', 'size':20})
ax2.text(pct_caffeine_old/3, 27, str(pct_caffeine_old) + '%',
         fontdict={'color':'black', 'weight':'bold', 'size':20})

# Add legend linking color to the year of the data
blue_patch = mpatches.Patch(color='steelblue', label='2022')
orange_patch = mpatches.Patch(color='orange', label='2020')
ax2.legend(loc='lower right', facecolor="white", handles=[orange_patch, blue_patch])

plt.axis('scaled')  # to ensure plotted as squares

plt.axis('off')     # do not plot axes

plt.show()

How many varieties of each category does Starbucks offer?

The following nested pie chart compares the percentage of drinks by drink category for 2020 (inner ring) and 2022 (outer ring).

# count the number of drinks by type

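type_df = sb_drinks_df['type'].value_counts()
type_old_df = old_data_df['type'].value_counts()

# align both years on the same category order so that each category
# gets the same color in the inner and outer rings
all_types = type_df.index.union(type_old_df.index)
type_df = type_df.reindex(all_types, fill_value=0)
type_old_df = type_old_df.reindex(all_types, fill_value=0)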

fig3, ax3 = plt.subplots(figsize=(7,7))
size=0.6

# plot nested pie charts of the counts by drink type
ax3.pie(type_df, radius=1.2, labels=type_df.index,colors=sns.color_palette('muted'), autopct='%1.2f%%',
       pctdistance=0.8, wedgeprops=dict(width=size, edgecolor='w'), startangle=90, counterclock=False)

ax3.pie(type_old_df, radius=1.4-size,colors=sns.color_palette('muted'), autopct='%1.2f%%',
       pctdistance=0.8, wedgeprops=dict(width=size, edgecolor='w'), startangle=90, counterclock=False)

ax3.set(aspect="equal")

ax3.set_title("Fraction of Starbucks Drinks by Category", y=1.05, fontsize = 24)
ax3.text(0,-0.35, "2020", fontsize=16, fontweight="bold", c="blue", horizontalalignment="center")
ax3.text(0,-1.15, "2022", fontsize=16, fontweight="bold", c="blue", horizontalalignment="center")
plt.show()

Cold Coffees comprise a smaller percentage of Starbucks Grande drinks today than they did in September 2020 (25% versus ~28.8%), while Frappuccinos, Hot Teas, Cold Drinks and Hot Drinks each make up a slightly higher fraction today than before.
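The category percentages quoted above can be put side by side in a small table. This is a sketch under stated assumptions: the category lists below are made up for illustration, and with the real data the type columns of old_data_df and sb_drinks_df would be used.

```python
import pandas as pd

# Hypothetical toy category columns standing in for old_data_df['type'] and sb_drinks_df['type']
types_2020 = pd.Series(["Cold Coffees", "Cold Coffees", "Hot Teas", "Frappuccinos"])
types_2022 = pd.Series(["Cold Coffees", "Hot Teas", "Hot Teas", "Frappuccinos"])

# percentage of the menu in each category, one column per year
shares = pd.concat(
    {
        "2020": types_2020.value_counts(normalize=True) * 100,
        "2022": types_2022.value_counts(normalize=True) * 100,
    },
    axis=1,
).fillna(0)                      # a category missing in one year counts as 0%

shares["change"] = shares["2022"] - shares["2020"]   # percentage-point change
print(shares)
```

Because the shares are normalized within each year, the comparison is not affected by the overall drop in the number of drinks.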

Focus: Drinks high in Calories & Caffeine

Why are we looking mostly at calories and caffeine?

Calories are an important factor in controlling one’s weight. Caffeine is in over 85% of items on the Starbucks drink menu, and it is a major concern in today’s nutrition and health discussions.

Which drinks have the most calories?

The line chart below compares the Grande drinks containing the highest number of calories. The “Salted Caramel Mocha” and the “Salted Caramel Hot Chocolate” offered in 2020 are not on the current drink menu. The “Mocha Cookie Crumble Frappuccino”, the “Caramel Ribbon Crunch Frappuccino” and the “Chocolate Cookie Crumble Crème Frappuccino” drinks are currently offered but did not appear on the 2020 menu.

# Line plots of the calories for the highest calorie drinks

# get highest calorie current drinks
top_drink_cals = sb_drinks_df.nlargest(7, 'calories')[['drink_name', 'calories']]  # top 7
list_drinks = list(top_drink_cals.drink_name)                                      #convert to list

# set the order of drinks to appear along the x axis (note: not all of these drinks are present
# in each of the years)
top_drink_list = ['Salted Caramel Mocha',
                    'Mocha Cookie Crumble Frappuccino',
                    'Caramel Ribbon Crunch Frappuccino',
                    'Salted Caramel Hot Chocolate',
                    'Chocolate Cookie Crumble Crème Frappuccino',
                    'Iced Salted Caramel Mocha',
                    'White Hot Chocolate']

# create plot window forcing x and y axes
dummy = [0,0,0,0,0,0,0]

fig4, ax4 = plt.subplots(figsize=(5,5))

ax4.set_ylim(420, 500)

ax4.plot(top_drink_list, dummy)
ax4.set_xticklabels(top_drink_list,rotation=45, ha='right')     # rotate the drink labels and align to the right end of the strings


# get and plot values for 2022
x = []                # list of highest-calorie drinks for 2022
y = []                # corresponding calories for each drink
for drink in top_drink_list:        # populate lists of 2022 drinks and corresponding calories
    if(drink in list_drinks):
        x.append(drink)
        y.append(top_drink_cals[top_drink_cals['drink_name']==drink]['calories'].values[0])
        
ax4.plot(x, y, marker='o', color='steelblue')
# Pad margins so that markers don't get clipped by the axes
plt.margins(0.1)
# Tweak spacing to prevent clipping of tick-labels
plt.subplots_adjust(bottom=0.6)


# get and plot values for 2020
old_top_drink_cals = old_data_df.nlargest(7, 'calories')[['drink_name', 'calories']]
list_drinks = list(old_top_drink_cals.drink_name)

x = []                # list of highest-calorie drinks for 2020
y = []                # corresponding calories for each drink
for drink in top_drink_list:        # populate list of 2020 drinks and corresponding calories
    if(drink in list_drinks):
        x.append(drink)
        y.append(old_top_drink_cals[old_top_drink_cals['drink_name']==drink]['calories'].values[0])


ax4.plot(x, y, marker='^', color='orange')


# add titles and axes labels
ax4.set_title("Grande Drinks Containing Highest Calories", fontdict={'color':'black', 'weight':'bold', 'size':12})
ax4.set_xlabel("Grande Drink")
ax4.set_ylabel("Calories")


# Add legend linking color to the year of the data
blue_patch = mpatches.Patch(color='steelblue', label='2022')
orange_patch = mpatches.Patch(color='orange', label='2020')
ax4.legend(loc='upper right', facecolor="white", handles=[orange_patch, blue_patch])

plt.show()

As can be seen in the above plot, the highest-calorie Grande drinks today contain more calories than the comparable high-calorie drinks in 2020.
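A drink-by-drink comparison can complement the plot. The sketch below merges two hypothetical toy tables on the drink name; the drink names, calorie values and column suffixes are assumptions for illustration, and with the real data old_data_df and sb_drinks_df would be merged instead.

```python
import pandas as pd

# Hypothetical toy data standing in for the 2020 and 2022 Grande drinks
old = pd.DataFrame({"drink_name": ["Drink A", "Drink B", "Drink C"],
                    "calories": [400, 450, 300]})
new = pd.DataFrame({"drink_name": ["Drink A", "Drink C", "Drink D"],
                    "calories": [420, 300, 480]})

# outer merge keeps drinks found in only one year; _merge records where each row came from
both = old.merge(new, on="drink_name", how="outer",
                 suffixes=("_2020", "_2022"), indicator=True)

# calorie change for drinks on both menus (NaN when a drink is missing in one year)
both["change"] = both["calories_2022"] - both["calories_2020"]
print(both)
```

Rows flagged left_only were dropped from the menu and rows flagged right_only are new, which separates recipe changes from menu changes.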

Distribution of Calories in Starbucks Grande Drinks

To compare the distribution of the number of Grande drinks at different calorie amounts, the chart below shows left-right bars comparing the two years.

# bin the calories and plot

# bin edges chosen to match the bin labels; include_lowest=True keeps
# zero-calorie drinks in the first bin instead of dropping them
calorie_edges = [0, 62, 125, 187, 249, 312, 374, 436, 500]
calorie_labels = ['0-62', '63-125', '126-187', '188-249', '250-312', '313-374',
                  '375-436', '437-500']

sb_drinks_df['calorie_bin'] = pd.cut(sb_drinks_df.calories, calorie_edges,
                                     labels=calorie_labels, include_lowest=True)
vals = sb_drinks_df['calorie_bin'].value_counts(sort=False)   # count the number in each bin
vals_list = list(vals)                      # convert pandas series to list of counts
neg_vals_list = [ -x for x in vals_list]    # set the value signs so the bars plot to the left
ylabels = list(vals.index)                  # convert pandas series index to a list

old_data_df['calorie_bin'] = pd.cut(old_data_df.calories, calorie_edges,
                                    labels=calorie_labels, include_lowest=True)
old_vals = old_data_df['calorie_bin'].value_counts(sort=False)   # count the number in each bin
old_vals_list = list(old_vals)                      # convert pandas series to list of counts

fig = plt.figure(figsize=(10, 5))
  
# creating the bar plot
plt.barh(ylabels, neg_vals_list, color='steelblue')

plt.barh(ylabels, old_vals_list, color='darkorange')

# redo the x-axis tick labels so all are positive
#specify x-axis locations

x_ticks = [-20, -10, 0, 10, 20, 30]
#specify x-axis labels
x_labels = ['20', '10', '0', '10', '20', '30'] 
plt.xticks(ticks=x_ticks, labels=x_labels)

# Add legend linking color to the year of the data

blue_patch = mpatches.Patch(color='steelblue', label='2022')
orange_patch = mpatches.Patch(color='darkorange', label='2020')
plt.legend(loc='upper right', facecolor="white", handles=[orange_patch, blue_patch])
  
plt.xlabel("Number of Grande Drinks")
plt.ylabel("Calories")
plt.title("Distribution of Calories in Grande Drinks")
plt.show()

The above distributions show that fewer Grande drinks are offered in 2022 across all calorie ranges. However, the relative distribution of the number of drinks across the various calorie ranges is essentially the same in 2022 as in 2020.
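One way to check the statement about the relative distribution is to convert the bin counts into percentages of each year's menu, which removes the effect of the smaller 2022 menu. This is a minimal sketch on made-up values; the helper name bin_shares, the toy bin edges and the toy calorie lists are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy bins and calorie values for illustration only
edges = [0, 100, 200, 300]
labels = ["0-100", "101-200", "201-300"]

def bin_shares(values, edges, labels):
    """Percentage of drinks falling in each bin (zero values are kept in the first bin)."""
    binned = pd.cut(pd.Series(values), bins=edges, labels=labels, include_lowest=True)
    return binned.value_counts(normalize=True, sort=False) * 100

shares_2020 = bin_shares([0, 50, 150, 250], edges, labels)
shares_2022 = bin_shares([0, 150], edges, labels)

comparison = pd.DataFrame({"2020": shares_2020, "2022": shares_2022})
print(comparison)
```

If the two columns are close for every bin, the shape of the distribution is unchanged even though the counts differ.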

What is the distribution of caffeine among Starbucks Grande drinks?

To compare the distribution of the number of Grande drinks at different caffeine amounts, the chart below shows left-right bars comparing the two years.

# bin the caffeine and plot

# include_lowest=True keeps caffeine-free drinks (0 mg) in the first bin instead of dropping them
caffeine_edges = [0, 50, 100, 150, 200, 250, 300, 350, 400]
caffeine_labels = ['0-50', '51-100', '101-150', '151-200', '201-250', '251-300',
                   '301-350', '351-400']

sb_drinks_df['caffeine_bin'] = pd.cut(sb_drinks_df.caffeine, caffeine_edges,
                                      labels=caffeine_labels, include_lowest=True)
vals = sb_drinks_df['caffeine_bin'].value_counts(sort=False)   # count the number in each bin
vals_list = list(vals)                      # convert pandas series to list of counts
neg_vals_list = [ -x for x in vals_list]    # set the value signs so the bars plot to the left
ylabels = list(vals.index)                  # convert pandas series index to a list

old_data_df['caffeine_bin'] = pd.cut(old_data_df.caffeine, caffeine_edges,
                                     labels=caffeine_labels, include_lowest=True)
old_vals = old_data_df['caffeine_bin'].value_counts(sort=False)   # count the number in each bin
old_vals_list = list(old_vals)                      # convert pandas series to list of counts

fig = plt.figure(figsize=(10, 5))
  
# creating the bar plot
plt.barh(ylabels, neg_vals_list, color='steelblue')

plt.barh(ylabels, old_vals_list, color='darkorange')

# redo the x-axis tick labels so all are positive
#specify x-axis locations

x_ticks = [-50, -40, -30, -20, -10, 0, 10, 20, 30, 40, 50]
#specify x-axis labels
x_labels = ['50', '40', '30','20', '10', '0', '10', '20', '30', '40', '50'] 
plt.xticks(ticks=x_ticks, labels=x_labels)

# Add legend linking color to the year of the data

blue_patch = mpatches.Patch(color='steelblue', label='2022')
orange_patch = mpatches.Patch(color='darkorange', label='2020')
plt.legend(loc='upper right', facecolor="white", handles=[orange_patch, blue_patch])
  
plt.xlabel("Number of Grande Drinks")
plt.ylabel("Caffeine (mg)")
plt.title("Distribution of Caffeine in Grande Drinks")
plt.show()

The above distributions show that fewer Grande drinks are offered in 2022 across all caffeine ranges. However, the relative distribution of the number of drinks across the various caffeine ranges is essentially the same in 2022 as in 2020. The two horizontal bar plots above show that the distributions of Starbucks drinks are skewed toward lower values for both calories and caffeine. Starbucks offers a number of zero-calorie and zero-sugar drinks, mostly teas, non-coffee drinks and decaf drinks.

Analyzing Drink Categories

Daily Nutritional Values

Looking at the raw g/mg values for all of the nutrition types doesn’t tell us much, because the nutrition types are on different scales and cannot be compared directly. To standardize them, the nutrition values for each type are compared to the daily nutritional values recommended by the Food and Drug Administration, based on a 2,000-calorie daily intake for adults and children 4 or more years of age.

Nutrition Variable:
- Calories: 2000
- Fat: 78g
- Cholesterol: 300mg
- Sodium: 2300mg
- Carbohydrates: 275g
- Sugar: none (“There is no Daily Value for total sugars because no recommendation has been made for the total amount to eat in a day.” Source: FDA)
- Protein: 50g
- Caffeine: 400mg

Source: https://www.fda.gov/media/135301/download (last updated March 2020)

For healthy adults, it is generally recommended that caffeine intake not exceed 400 mg a day, an amount that is not generally associated with negative effects (FDA).
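The conversion to percent of daily value is the same arithmetic for every variable: value / daily value * 100. A minimal sketch of that step as a reusable helper follows; the names DAILY_VALUES and to_percent_dv and the one-row toy drink are assumptions for illustration, while the reference amounts are the ones listed above.

```python
import pandas as pd

# Daily reference amounts from the list above (sugar has no Daily Value)
DAILY_VALUES = {"calories": 2000, "fat": 78, "cholesterol": 300, "sodium": 2300,
                "carb": 275, "protein": 50, "caffeine": 400}

def to_percent_dv(df, daily_values=DAILY_VALUES):
    """Return a copy of df with each nutrition column expressed as % of its daily value."""
    out = df.copy()
    for col, daily_value in daily_values.items():
        if col in out.columns:
            out[col] = out[col] / daily_value * 100
    # total sugars have no daily value, so the column is dropped if present
    return out.drop(columns=["sugar"], errors="ignore")

# Hypothetical toy drink for illustration only
toy = pd.DataFrame({"drink_name": ["Drink A"], "calories": [450],
                    "sugar": [50], "caffeine": [310]})
toy_dv = to_percent_dv(toy)
print(toy_dv)
```

Keeping the reference amounts in one dictionary means both years are converted with exactly the same values.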

dv = sb_drinks_df.copy(deep=True)
dv.calories = dv.calories / 2000 * 100
dv.fat = dv.fat / 78 * 100
dv.cholesterol = dv.cholesterol / 300 * 100
dv.sodium = dv.sodium / 2300 * 100
dv.carb = dv.carb / 275 * 100
dv = dv.drop(columns=['sugar']) # no daily level for total sugars
dv.protein = dv.protein / 50 * 100
dv.caffeine = dv.caffeine / 400 * 100

old_dv = old_data_df.copy(deep=True)
old_dv.calories = old_dv.calories / 2000 * 100
old_dv.fat = old_dv.fat / 78 * 100
old_dv.cholesterol = old_dv.cholesterol / 300 * 100
old_dv.sodium = old_dv.sodium / 2300 * 100
old_dv.carb = old_dv.carb / 275 * 100
old_dv = old_dv.drop(columns=['sugar']) # no daily level for total sugars
old_dv.protein = old_dv.protein / 50 * 100
old_dv.caffeine = old_dv.caffeine / 400 * 100


# Plot the max values for the nutrition variables

fig5, ax5 = plt.subplots(figsize=(5,5))

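# select the nutritional variables by name rather than by column position
nutrition_cols = ['calories', 'fat', 'cholesterol', 'sodium', 'carb', 'protein', 'caffeine']
var_names = nutrition_cols    #  names of the nutritional variables

# line plot of the nutritional variables for each year
ax5.plot(var_names, list(dv[nutrition_cols].max()), marker='o', color='steelblue')
ax5.plot(var_names, list(old_dv[nutrition_cols].max()), marker='^', color='darkorange')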
plt.xticks(rotation=45, ha='right')     # rotate the drink labels and align to the right end of the strings
# Pad margins so that markers don't get clipped by the axes

plt.margins(0.1)
# Tweak spacing to prevent clipping of tick-labels
plt.subplots_adjust(bottom=0.2)

# Add legend linking color to the year of the data
blue_patch = mpatches.Patch(color='steelblue', label='2022')
orange_patch = mpatches.Patch(color='darkorange', label='2020')
plt.legend(loc='upper left', facecolor="white", handles=[orange_patch, blue_patch])
  
# add plot and axes titles
ax5.set_xlabel("Nutrition Variable")
ax5.set_ylabel("Percent")
ax5.set_title("Maximum % of Recommended Daily Allowance", fontsize=12)

plt.show()

No Grande drink seems to go above 1/3 of the recommended daily value for any nutrition type, with the exception of caffeine. Thus, ignoring caffeine, the maximum Grande drink nutrition values are on the order of one daily meal!

Nutrition Levels by Category

The side-by-side heatmaps below show the average percentage of the Recommended Daily Allowance by drink category for each of the nutritional variables in the Grande drinks. The left heatmap is for the current menu, whereas the right heatmap is for the 2020 Starbucks menu.

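# numeric_only=True averages only the nutrition columns (recent pandas versions
# raise an error when asked to average text or categorical columns)
heatmap = dv.groupby('type').mean(numeric_only=True)
old_heatmap = old_dv.groupby('type').mean(numeric_only=True)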

fig6 = plt.figure(figsize = (10,6)) # width x height
# Tweak the margin spacing to prevent clipping of tick-labels
plt.subplots_adjust(left=0.1,
                    bottom=0.2, 
                    right=0.9, 
                    top=0.9, 
                    wspace=0.4, 
                    hspace=0.4)

# set up for side-by-side heatmaps
ax6 = fig6.add_subplot(1, 2, 1) # 1 row, 2 columns, position=1
ax7 = fig6.add_subplot(1, 2, 2) # 1 row, 2 columns, position=2

fig6.suptitle('% Recommended Daily Allowance - Averaged by Category', y=0.9, fontsize=18)
ax6.set_title('2022')
ax7.set_title('2020')

# generate heatmap for 2022
res1 = sns.heatmap(data=heatmap, ax=ax6, cmap = "Blues", square=True,
            cbar_kws={'shrink': .8,'label': 'Percent Daily Recommended Value'}, annot=True,
            annot_kws={'fontsize': 10})
res1.set_yticklabels(res1.get_ymajorticklabels(), fontsize = 10)
res1.set_xticklabels(res1.get_xmajorticklabels(), fontsize = 10, rotation=90)

# generate heatmap for 2020
res2 = sns.heatmap(data=old_heatmap, ax=ax7, cmap = "Oranges", square=True,
            cbar_kws={'shrink': .8,'label': 'Percent Daily Recommended Value'}, annot=True, 
            annot_kws={'fontsize': 10})
res2.set_yticklabels(res2.get_ymajorticklabels(), fontsize = 10)
res2.set_xticklabels(res2.get_xmajorticklabels(), fontsize = 10, rotation=90)

plt.show()

The highest levels of caffeine are apparent in these heatmaps for the Cold Coffee and Hot Coffee categories. Across the various categories, the average percentages are little changed between 2020 and 2022.
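How little the category averages changed can be quantified by subtracting the two tables of averages. The sketch below uses made-up percent-of-daily-value numbers (the toy frames and values are assumptions for illustration); with the real data, dv and old_dv would be used.

```python
import pandas as pd

# Hypothetical toy % daily value tables standing in for dv (2022) and old_dv (2020)
dv_2022 = pd.DataFrame({"type": ["Hot Coffees", "Hot Coffees", "Hot Teas"],
                        "calories": [10.0, 20.0, 5.0],
                        "caffeine": [80.0, 60.0, 10.0]})
dv_2020 = pd.DataFrame({"type": ["Hot Coffees", "Hot Coffees", "Hot Teas"],
                        "calories": [12.0, 18.0, 5.0],
                        "caffeine": [70.0, 60.0, 12.0]})

# average by category for each year, then the 2022 minus 2020 difference
mean_2022 = dv_2022.groupby("type").mean(numeric_only=True)
mean_2020 = dv_2020.groupby("type").mean(numeric_only=True)
diff = mean_2022 - mean_2020     # positive values mean a higher average in 2022
print(diff)
```

Values near zero in every cell would support the reading that the averages are little changed.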

Conclusion

While there are fewer Grande drinks on the March 2022 Starbucks menu compared with September 2020, and while some of the individual drinks have changed, the overall nutrition remains largely unchanged. However, the highest-calorie Grande drinks today contain more calories than those in 2020. The maximum amount of caffeine and its distribution across Grande drinks have not changed.

So, no, the nutrition of Starbucks drinks has not improved since September 2020.

And the two most unhealthy 2022 Starbucks drinks are…

With caffeine greater than 70% DV, your two worst Grande drinks today are:

dv_out = dv.copy(deep=True)
dv_out["cal_caf"] = dv_out.calories + dv_out.caffeine
dv_out.drop(['fat', 'cholesterol', 'sodium', 'carb', 'protein', 'calorie_bin', 'caffeine_bin'], axis = 1, inplace=True)
# keep only drinks with caffeine above 70% of the daily value, as stated in the text
out = dv_out[dv_out['caffeine'] > 70].sort_values(by=['cal_caf'], ascending=False)
out.drop(['cal_caf'], axis=1, inplace=True)
print(out.to_string(index=False))

                             drink_name         type   size    calories  caffeine
                             ----------         ----   ----    --------  --------
                          Veranda Blend®  Hot Coffees Grande      0.25%     90.00%
                       Pike Place® Roast  Hot Coffees Grande      0.25%     77.50%

Has the Nutrition of Starbucks Drinks Improved since 2020?

Andrew Cox

4/1/2022