Official Cookie Overview

Author

Julian (VP) & Julia (President) from Montgomery College Data Science Club

Welcome to our Data Science Club Project Page!!!!!!!

Join our club! We meet on Thursdays at 3pm in SW 304 (Rockville Campus science building). Here is a link to our club Discord and GitHub repository (most datasets are on this fork right now).

Our Spring semester project involves working with a dataset of about 10,000 cookie recipes scraped from the internet. We’re done with the web scraping but happy to share how we did it if you’re curious.

Our goal is to analyze patterns in cookie recipes and correlations between ingredients, ingredient quantities, and cookie recipe ratings. Ultimately we want to build a statistical model that predicts a cookie recipe’s quality, as measured by its ratings, so we can make the best, most average, and worst cookie recipes we can come up with. We can then bake the cookies and survey people’s opinions of them.

Right now we’re in the data preparation (data cleaning) phase. We need help! More information will be coming, but for right now, please come to the club or join our Discord if you’re interested! You can use any coding language you want. We also need to peer review the code that gets used to clean the data to make sure it’s being done right (which is another thing we need help with).

Informal AI Guideline

Because we want to make sure that we’re cleaning the data properly, and since this project is just for fun, we prefer that people avoid using AI excessively. We have no problem with using it as a tool or sharing AI-generated code to adjust, but please avoid using it to write your code for you. If you send AI-generated scripts, just say they’re AI-generated. This is just informal, so there’s obviously no “punishment” for violating it. Just please don’t do that; it’s a bit annoying.

Data

Overall, we have 39 datasets and a total of 11294 rows. Most datasets have about 200 rows. We’re going to need to clean nearly all the columns and filter out rows that aren’t actually cookie recipes. This document includes descriptions of every individual data source/dataset, but almost all of the cleaning will be done on one combined dataset.

Overall issues

All (or nearly all) datasets have these columns (full definition on GitHub):

title	author	rating	ratingnum	prep	cook	total	yield	totalingredients	ingredient1
Choc Chip	Sally	4.7	1813	15 mins	20 mins	2 hrs 5 mins	26 cookies	10	2 stick butter

Some of them have additional columns like genre, course, cuisine, number of steps, and list of steps. The “ingredientX” columns go up to the total number of ingredients. Some also include date posted/updated/etc. Those date columns don’t need to be cleaned; they’re only for reference while working with the data.

Here are some issues:

First and foremost, my computer crashed when trying to join all the datasets together, so we’ll have to figure out how to do that 🙂.
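One possible way around the crash, sketched in Python with only the standard library: instead of loading every dataset and joining them in memory, stack the CSV files by streaming them one row at a time into a single output file. This is only a sketch under some assumptions: the function name combine_csvs and the extra "source" column are made up for illustration, every file is assumed to be UTF-8 with a header row, and "join" is taken to mean stacking rows (not merging on a key).

```python
import csv
from pathlib import Path

def combine_csvs(paths, out_path):
    """Stack CSV files with different columns without holding them in memory."""
    # Pass 1: read only the header row of each file to build the union of columns.
    columns = ["source"]  # hypothetical extra column recording the origin file
    for path in paths:
        with open(path, newline="", encoding="utf-8") as f:
            for name in next(csv.reader(f), []):
                if name not in columns:
                    columns.append(name)
    # Pass 2: stream rows one at a time; columns a file lacks are written as "".
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        # extrasaction="ignore" skips stray cells beyond the header instead of raising.
        writer = csv.DictWriter(out, fieldnames=columns, restval="", extrasaction="ignore")
        writer.writeheader()
        for path in paths:
            with open(path, newline="", encoding="utf-8") as f:
                for row in csv.DictReader(f):
                    row["source"] = Path(path).stem
                    writer.writerow(row)
                    rows_written += 1
    return columns, rows_written
```

Because only one row is held at a time, memory use should stay small no matter how many ingredientX columns the widest dataset has. The returned row count can be compared against the expected total as a sanity check.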

Ingredients that are not ingredients: Some (probably most) datasets have things like “filling:”, “optional”, “shredded” etc. included in their ingredientX columns. These are also counted in totalingredients for these recipes.
How to fix this:

Remove everything that lacks numbers and comb through what’s removed (because some of them say “salt” or “sprinkles”, which many recipe authors don’t give measurements to)
Remove everything that ends with “:” (ingredients[!grepl(":$", ingredients)] is one way to do that. It literally reads as “keep every ingredient that does not end with :”; :$ is regex meaning “ends with :”)
Make this an if statement or whatever and subtract 1 from totalingredients for every ingredient removed (and then shift everything back in the ingredientX columns)
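The steps above could be sketched like this in Python (the R version would use the grepl call already shown). The function name clean_ingredients is made up, and the input is assumed to be one recipe's ingredientX cells as a list of strings (or None). Rather than subtracting 1 per removal, the sketch rebuilds the list, so totalingredients can simply be recounted from what is kept. Skipping empty or "NA" cells also closes gaps in the middle of a row.

```python
import re

# A digit or a common fraction glyph counts as "has a number".
HAS_NUMBER = re.compile(r"\d|[¼½¾⅓⅔⅛⅜⅝⅞]")

def clean_ingredients(cells):
    """Split one recipe's ingredientX cells into kept, dropped, and needs-review lists."""
    kept, dropped, review = [], [], []
    for cell in cells:
        text = (cell or "").strip()
        if text == "" or text.upper() == "NA":
            continue                  # empty gap: skipping it shifts later cells left
        if text.endswith(":"):
            dropped.append(text)      # section header such as "filling:"
        elif not HAS_NUMBER.search(text):
            review.append(text)       # no quantity: maybe "optional", maybe a real "salt"
        else:
            kept.append(text)
    return kept, dropped, review
```

For example, clean_ingredients(["2 stick butter", "filling:", "", "1 cup sugar", "salt"]) keeps the butter and sugar, drops "filling:", and puts "salt" in the review pile for a human to look at. After review, the new totalingredients is just the length of the final kept list.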

Empty ingredient columns in the middle of a row: Some datasets (like Cookie Rookie) have empty or NA ingredientX columns in the middle of a row, so you’ll have ingredient1-5 with actual ingredients, and then NA or empty space in ingredients6-9, and then ingredients10-14, for example.
Different sources use different terms for the same ingredients, e.g., “caster sugar” vs “superfine sugar”, “all purpose flour” vs “white flour”. The AllRecipes dataset has a standard way of referring to ingredients (thankfully), so that’s not a concern for that dataset (and we might want to follow the AllRecipes standard when cleaning everything else).
Fractions are sometimes written as symbols in the ingredients, and they all need to be converted to decimals (Julia can fix this).
Measurements need to be converted to grams. Most recipes use cups, tablespoons, etc. Here is an extensive ingredient conversion chart from King Arthur Baking, and here’s a Python ingredient conversion program on GitHub that I haven’t looked at.
A lot of recipes are NOT ACTUALLY COOKIES!!!!!!!!!
How to fix this:

Separate datasets into data with a category column (many of them have a column that says if it’s a cookie or something else) and datasets without
In the datasets without, filter recipes that have cookie words in the title (what I wrote before was grepl("biscuit|ookie|shortbread|bars|snickerdoodle|dodgers|biscotti", title, ignore.case = TRUE), although “biscuit” is risky. “ookie” includes both cookie and brookie and other puns - including pizookie)
Search through what was removed for sugar, molasses, honey, sweetener, etc., and pick out actual cookie recipes that were removed (most cookies will show up under sugar alone, but some cookies have honey or molasses and no sugar, e.g. the Greek cookie moustokouloura, whose name isn’t an immediately recognizable “cookie” word)
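Here is a rough Python sketch of the title filter plus the sweetener check for datasets without a category column. The title pattern is the same one as the grepl call above; the function name triage and the three labels are made up for illustration, and the sweetener list only covers the four words mentioned above, so it would probably need extending.

```python
import re

COOKIE_WORDS = re.compile(
    r"biscuit|ookie|shortbread|bars|snickerdoodle|dodgers|biscotti", re.IGNORECASE
)
SWEETENERS = re.compile(r"sugar|molasses|honey|sweetener", re.IGNORECASE)

def triage(title, ingredients):
    """Return 'keep', 'review', or 'drop' for a recipe with no category column."""
    if COOKIE_WORDS.search(title):
        return "keep"
    if any(SWEETENERS.search(item) for item in ingredients):
        return "review"   # sweet but not obviously a cookie: check by hand
    return "drop"
```

With this, triage("Pizookie", []) returns "keep", while triage("Moustokouloura", ["1 cup grape molasses"]) returns "review" instead of being silently dropped. The "review" pile is the part someone still has to read through.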

Measurement and ingredient will need to be separated in some way for modeling
Not every dataset has prep/cook/total time set up properly. Some websites used the same HTML element to refer to different things, so some datasets have the prep/cook/etc. time label inside the cell values. These need to be split out into proper columns (we managed to do something like this before, so we can do it again).
The “yield” or “servings” columns differ between datasets, and their values are formatted inconsistently (e.g., “servings: 28 cookies” vs “servings: 28” vs “servings: 4 5-cookie servings”).
Some recipes MIGHT be AI. I don’t think so, but I have encountered AI cookie recipes while working on this. It would be great if someone could click through the links to data sources below and make sure they don’t see any clearly AI generated recipes in there. If you find anything like that, send it in the club Discord!
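For the fraction symbols and the measurement/ingredient split in the list above, something like the following Python sketch might be a starting point. All names here (UNITS, to_decimal, split_ingredient) are made up for illustration, and UNITS is a tiny placeholder set, not a complete list. It assumes the quantity comes first in the cell, which will not hold for every recipe.

```python
import re
import unicodedata
from fractions import Fraction

# Placeholder unit list; a real one would need many more entries.
UNITS = {"cup", "cups", "tablespoon", "tablespoons", "tbsp", "teaspoon",
         "teaspoons", "tsp", "stick", "sticks", "g", "oz", "ounce", "ounces"}

def to_decimal(token):
    """Convert '2', '1/2', '0.5', or a fraction glyph like '½' to float; None otherwise."""
    if len(token) == 1 and unicodedata.category(token) == "No":
        return unicodedata.numeric(token, None)
    try:
        return float(Fraction(token))
    except (ValueError, ZeroDivisionError):
        return None

def split_ingredient(text):
    """Split '1 ½ cups flour' into (quantity, unit, name); missing parts are None."""
    # Put spaces around fraction glyphs so '1½' tokenizes like '1 ½'.
    tokens = re.sub(r"([\u00bc-\u00be\u2150-\u215e])", r" \1 ", text).split()
    quantity = to_decimal(tokens[0]) if tokens else None
    if quantity is not None:
        tokens.pop(0)
        # A whole number followed by a proper fraction is read as a mixed number.
        nxt = to_decimal(tokens[0]) if tokens else None
        if nxt is not None and 0 < nxt < 1 and quantity.is_integer():
            quantity += nxt
            tokens.pop(0)
    unit = None
    if quantity is not None and tokens and tokens[0].lower().rstrip(".") in UNITS:
        unit = tokens.pop(0)
    return quantity, unit, " ".join(tokens)
```

split_ingredient("1 ½ cups flour") gives (1.5, "cups", "flour"), and split_ingredient("salt") gives (None, None, "salt"). Things like ranges ("2-3 tbsp") or package sizes ("2 8 oz packages") are not handled and would need their own rules.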

One of the goals is to shrink every recipe down into a 1-cookie recipe by dividing the ingredients by the listed yield, but bringing every recipe to approximately the same weight might also be useful
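A minimal Python sketch of the 1-cookie idea, using the yield formats listed earlier. The function names are made up, and two things are assumptions: a bare number like “servings: 28” is treated as 28 cookies, and quantities are assumed to already be numeric (for example grams) by the time this runs.

```python
import re

def cookies_per_recipe(yield_text):
    """Best-effort cookie count from a yield/servings string; None if no number is found."""
    text = yield_text.lower()
    # "4 5-cookie servings" means 4 servings of 5 cookies each.
    grouped = re.search(r"(\d+)\s+(\d+)-cookie", text)
    if grouped:
        return int(grouped.group(1)) * int(grouped.group(2))
    first = re.search(r"\d+", text)
    return int(first.group()) if first else None

def per_cookie(quantities, yield_text):
    """Divide each ingredient quantity by the parsed yield; None if the yield is unusable."""
    n = cookies_per_recipe(yield_text)
    if not n:
        return None
    return {name: qty / n for name, qty in quantities.items()}
```

For instance, cookies_per_recipe("servings: 4 5-cookie servings") returns 20, so per_cookie({"sugar_g": 200.0}, "servings: 4 5-cookie servings") returns {"sugar_g": 10.0}. A range such as "24-30 cookies" would be read as 24 here, which may or may not be what we want.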

Original Datasets

Most cleaning will be done on all the data joined into one dataset, but the problems with each source are detailed here. Some parts of the cleaning may also be easier to do individually (like filtering out non-cookie recipes or fixing ingredient columns, maybe).

Dataset (.csv)

48 columns x 1865 rows

(more info TBA)

     rating        ratingnum         servings         totalingredients
 Min.   :1.000   Min.   :    1.0   Length:1865        Min.   : 2.00   
 1st Qu.:4.200   1st Qu.:    6.0   Class :character   1st Qu.: 8.00   
 Median :4.500   Median :   25.0   Mode  :character   Median :10.00   
 Mean   :4.364   Mean   :  190.2                      Mean   :10.01   
 3rd Qu.:4.700   3rd Qu.:   99.0                      3rd Qu.:12.00   
 Max.   :5.000   Max.   :19339.0                      Max.   :28.00   
 NA's   :105     NA's   :105


                        Almond         Almond Dessert Recipes 
                             3                             28 
       American Cookie Recipes                          Apple 
                            16                              4 
         Apple Dessert Recipes                        Apricot 
                            15                              3 
                   Argentinian             Australian Cookies 
                             2                              1 
                      Austrian                          Bacon 
                             4                              1 
        Banana Dessert Recipes             Bar Cookie Recipes 
                            29                             19 
                       Belgian          Birthday Cake Recipes 
                             2                              1 
              Biscotti Recipes                      Blueberry 
                             4                              1 
     Blueberry Dessert Recipes                      Brazilian 
                             5                              1 
          Breakfast and Brunch                Brownie Recipes 
                             1                              1 
         Butter Cookie Recipes               Butternut Squash 
                            11                              2 
         Cake Mix Cake Recipes        Cake Mix Cookie Recipes 
                             2                             15 
       Canadian Cookie Recipes            Carnivals and Fairs 
                             4                              1 
            Cheesecake Recipes                         Cherry 
                             3                              1 
        Cherry Dessert Recipes                        Chinese 
                             9                              3 
                     Chocolate      Chocolate Brownie Recipes 
                             1                              3 
       Chocolate Candy Recipes   Chocolate Cheesecake Recipes 
                             1                              3 
 Chocolate Chip Cookie Recipes       Chocolate Cookie Recipes 
                            82                             42 
         Chocolate Pie Recipes       Christmas Cookie Recipes 
                             1                             21 
                       Coconut                 Coffee Liqueur 
                            15                              2 
               Cookie Frosting                        Cookies 
                            12                             11 
               Cracker Recipes                      Cranberry 
                             1                             12 
     Cranberry Dessert Recipes                   Cream Cheese 
                             2                              1 
                  Crumb Crusts         Cut-Out Cookie Recipes 
                             1                             39 
                         Czech                 Dark Chocolate 
                             1                              1 
                  Date Cookies          Dessert Salad Recipes 
                            15                              1 
                      Desserts               Double Chocolate 
                             7                             13 
           Drop Cookie Recipes                          Dutch 
                           262                              5 
                       English            Fig Dessert Recipes 
                             1                              2 
                      Filipino          Filled Cookie Recipes 
                             1                             15 
                        France                         French 
                             5                              1 
          Fruit Cookie Recipes                  Fudge Recipes 
                             3                              1 
                        German                        Germany 
                            11                              4 
    Gingerbread Cookie Recipes             Gingersnap Recipes 
                            25                              2 
                         Greek                       Hazelnut 
                             3                              3 
              Hazelnut Recipes                        Healthy 
                             2                              1 
                     Hungarian         Ice Cream Cake Recipes 
                             1                              1 
         Ice Cream Pie Recipes                          Irish 
                             1                              1 
                       Israeli                        Italian 
                             2                              1 
                         Italy                       Japanese 
                            43                              1 
                      Lebanese                          Lemon 
                             3                              6 
         Lemon Dessert Recipes        Liqueur Dessert Recipes 
                            15                              3 
                 Macadamia Nut          Macadamia Nut Recipes 
                             4                              3 
              Macaroon Recipes               Meringue Cookies 
                             1                             11 
                       Mexican         Mexican Cookie Recipes 
                            10                              2 
                Milk Chocolate        Molasses Cookie Recipes 
                             1                             10 
            New Mexico Recipes     No-Bake Cheesecake Recipes 
                             1                              1 
        No-Bake Cookie Recipes            No-Bake Pie Recipes 
                            47                              4 
             Nut Candy Recipes             Nut Cookie Recipes 
                             1                              1 
        Oatmeal Cookie Recipes  Oatmeal Raisin Cookie Recipes 
                            89                             34 
                        Orange         Orange Dessert Recipes 
                             2                              6 
         Peach Dessert Recipes                         Peanut 
                             1                              1 
  Peanut Butter Cookie Recipes         Peanut Dessert Recipes 
                            89                              2 
                         Pecan          Pecan Dessert Recipes 
                             2                             22 
                       Persian                       Peruvian 
                             1                              1 
             Pet Treat Recipes      Pineapple Dessert Recipes 
                             2                              6 
             Pistachio Recipes                       Pizzelle 
                            11                              4 
                        Polish                     Portuguese 
                             3                              1 
               Praline Recipes                        Pudding 
                             2                              1 
        Pumpkin Cookie Recipes                 Quick and Easy 
                            49                              1 
                        Raisin      Raspberry Dessert Recipes 
                            11                              5 
   Refrigerator Cookie Recipes                            Rum 
                            28                              4 
                       Russian        Sandwich Cookie Recipes 
                             2                             22 
                  Scandinavian                       Scottish 
                            15                              2 
     Shortbread Cookie Recipes          Snickerdoodle Recipes 
                            17                              7 
                       Spanish           Spice Cookie Recipes 
                             2                             41 
     Springerle Cookie Recipes          Spritz Cookie Recipes 
                             1                             10 
                        Squash                     Strawberry 
                             9                              1 
    Strawberry Dessert Recipes           Sugar Cookie Recipes 
                             6                            104 
                        Sweden                          Swiss 
                             1                              2 
Tea Cakes and Biscuits Recipes      Thumbprint Cookie Recipes 
                             2                             21 
                        Treats                Truffle Recipes 
                             1                              1 
                        Walnut         Walnut Dessert Recipes 
                             2                              9 
                Walnut Recipes                          Welsh 
                             1                              1 
                    White Cake                White Chocolate 
                             1                             12 
           Whoopie Pie Recipes        Zucchini Cookie Recipes 
                             3                              2

Amanda’s Cookin • Dataset (.csv)

40 columns x 179 rows

    title               rating        ratingnum      numberofsteps   
 Length:179         Min.   :4.000   Min.   :  1.00   Min.   : 2.000  
 Class :character   1st Qu.:5.000   1st Qu.:  3.00   1st Qu.: 6.000  
 Mode  :character   Median :5.000   Median :  7.00   Median : 8.000  
                    Mean   :4.974   Mean   : 26.44   Mean   : 8.592  
                    3rd Qu.:5.000   3rd Qu.: 14.00   3rd Qu.:10.000  
                    Max.   :5.000   Max.   :813.00   Max.   :25.000  
                    NA's   :7       NA's   :7                        
 totalingredients
 Min.   : 4.00   
 1st Qu.: 7.00   
 Median :10.00   
 Mean   :10.16   
 3rd Qu.:12.00   
 Max.   :22.00


                American         American, Danish American, Jewish, Polish 
                     163                        1                        1 
                Austrian                 Canadian                  Italian 
                       1                        1                        1 
                 Mexican 
                       1


 Air Fryer Recipes          Breakfast           Desserts Holidays & Seasons 
                 2                  1                156                 19 
          Low Carb 
                 1


              Air Fryer Brownies Air Fryer Chocolate Chip Cookies 
                               1                                1 
              Bagels & Doughnuts                  Cake & Cupcakes 
                               1                                3 
                           Candy                        Christmas 
                               2                               10 
       Cookies, Brownies, & Bars                           Easter 
                             141                                2 
                    Father's Day                        Halloween 
                               1                                4 
      Keto Peanut Butter Cookies                 No Bake Desserts 
                               1                                6 
                One Pan Desserts                     Quick & Easy 
                               2                                1 
                    Thanksgiving               Trifles & Parfaits 
                               1                                1 
                 Valentine's Day 
                               1

Cookies, brownies, and bars. Scraped from the cookie/brownie/bar section: https://amandascookin.com/category/recipes/desserts/cookies-brownies-and-bars

America’s Test Kitchen • Dataset (.csv)

74 columns x 359 rows

Should all be cookie recipes. Scraped from the official cookie section.

    title               rating        ratingnum       numberofsteps  
 Length:359         Min.   :3.000   Min.   :   1.00   Min.   : 6.00  
 Class :character   1st Qu.:4.000   1st Qu.:  13.00   1st Qu.:11.00  
 Mode  :character   Median :4.500   Median :  23.00   Median :12.00  
                    Mean   :4.333   Mean   :  60.79   Mean   :13.34  
                    3rd Qu.:4.500   3rd Qu.:  54.50   3rd Qu.:14.00  
                    Max.   :5.000   Max.   :2611.00   Max.   :34.00  
 totalingredients
 Min.   : 6.00   
 1st Qu.:27.00   
 Median :30.00   
 Mean   :31.53   
 3rd Qu.:36.00   
 Max.   :63.00

Notes
1. All ingredients & steps are repeated three (3) times!
2. “Author” column needs brief cleaning; formatted like this: Watch Video_By_Charles Kelsey_Staff Pick_Comments
3. Columns time and yield also formatted oddly. This website doesn’t have prep/cook/total like most. Column time is basically total time but some mention cooling, eg, “1 hour, plus 20 minutes cooling” or “1 hour, plus 9 hours chilling and cooling”
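Notes 2 and 3 could be handled with something like this Python sketch. It is based only on the example values quoted above, so it assumes every author value contains a “By_Name_” piece and that times are written with whole or decimal numbers followed by hour/minute words. The function names are made up for illustration.

```python
import re

DURATION = re.compile(r"(\d+(?:\.\d+)?)\s*(hours?|hrs?|minutes?|mins?)", re.IGNORECASE)

def clean_author(raw):
    """Pull the name out of 'Watch Video_By_Charles Kelsey_Staff Pick_Comments'."""
    match = re.search(r"(?:^|_)By_([^_]+)", raw)
    return match.group(1).strip() if match else None

def to_minutes(text):
    """Sum every 'N hours' / 'N minutes' piece found in the text."""
    total = 0.0
    for amount, unit in DURATION.findall(text):
        total += float(amount) * (60 if unit.lower().startswith("h") else 1)
    return total

def split_time(text):
    """Return (main_minutes, extra_minutes); extra is whatever follows 'plus'."""
    main, _, extra = text.partition("plus")
    return to_minutes(main), to_minutes(extra)
```

split_time("1 hour, plus 20 minutes cooling") returns (60.0, 20.0), which keeps the cooling or chilling time separate in case we decide it should not count toward total time. to_minutes also works on values like "2 hrs 5 mins" from the other datasets.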

A Mummy Too • Dataset (.csv)

50 columns x 60 rows

Should all be cookies. Scraped from the cookie category: https://www.amummytoo.co.uk/category/cookies

    title               rating        ratingnum      numberofsteps  
 Length:60          Min.   :4.500   Min.   : 1.000   Min.   : 7.00  
 Class :character   1st Qu.:5.000   1st Qu.: 1.000   1st Qu.: 9.00  
 Mode  :character   Median :5.000   Median : 1.000   Median :10.00  
                    Mean   :4.909   Mean   : 2.389   Mean   :11.47  
                    3rd Qu.:5.000   3rd Qu.: 2.000   3rd Qu.:13.00  
                    Max.   :5.000   Max.   :17.000   Max.   :22.00  
                    NA's   :6       NA's   :6                       
 totalingredients
 Min.   : 6.00   
 1st Qu.:14.00   
 Median :16.00   
 Mean   :17.13   
 3rd Qu.:20.00   
 Max.   :34.00


                    Bread, Cookies                          Breakfast 
                                 1                                  2 
                   cakes and bakes                            Cookies 
                                 2                                 32 
           Cookies, Dessert, Snack           Cookies, Dessert, Snacks 
                                 1                                  1 
Cookies, Desserts and sweet treats                    Cookies, Easter 
                                 2                                  5 
            Cookies, Festive makes                 Cookies, halloween 
                                 2                                  1 
                    Dessert, Snack          Desserts and sweet treats 
                                 3                                  3 
                     Festive makes                              Snack 
                                 1                                  3 
                            Snacks 
                                 1


         American American, British          Austrian           British 
               40                 1                 1                11 
          Italian          Scottish 
                2                 5

Averie Cooks • Dataset (.csv)

42 columns x 195 rows

     rating        ratingnum       numberofsteps    totalingredients
 Min.   :4.000   Min.   :   1.00   Min.   : 4.000   Min.   : 3.00   
 1st Qu.:4.500   1st Qu.:   5.00   1st Qu.: 7.000   1st Qu.:10.00   
 Median :4.640   Median :  11.00   Median : 9.000   Median :11.00   
 Mean   :4.691   Mean   :  53.17   Mean   : 9.646   Mean   :11.41   
 3rd Qu.:4.895   3rd Qu.:  46.00   3rd Qu.:11.000   3rd Qu.:13.00   
 Max.   :5.000   Max.   :1123.00   Max.   :28.000   Max.   :26.00

Category


Bars & Blondies       Chocolate         Cookies       Halloween 
              5               1              83               1

Should all be cookies. Scraped from the cookie category: https://www.averiecooks.com/category/dessert/cookies

Baked Bree • Dataset (.csv)

37 columns x 170 rows

    title               rating       ratingnum      numberofsteps   
 Length:170         Min.   :4.00   Min.   : 1.000   Min.   : 3.000  
 Class :character   1st Qu.:4.54   1st Qu.: 1.000   1st Qu.: 6.000  
 Mode  :character   Median :5.00   Median : 2.000   Median : 7.500  
                    Mean   :4.78   Mean   : 5.404   Mean   : 8.053  
                    3rd Qu.:5.00   3rd Qu.: 6.000   3rd Qu.: 9.000  
                    Max.   :5.00   Max.   :40.000   Max.   :20.000  
                    NA's   :113    NA's   :113                      
 totalingredients
 Min.   : 2.00   
 1st Qu.: 8.00   
 Median :10.00   
 Mean   :10.48   
 3rd Qu.:13.00   
 Max.   :24.00


                                     American 
                                          124 
American, ashkenazi, eastern european, jewish 
                                            1 
                       American, Bakery-style 
                                            1 
               American, baking, comfort food 
                                            1 
                            American, English 
                                            1 
     American, Fall-Inspired, Seasonal Baking 
                                            1 
                             American, German 
                                            1 
                     American, Holiday Baking 
                                            3 
                      American, international 
                                            1 
                              American, Irish 
                                            1 
                     American, Spring dessert 
                                            1 
                                       baking 
                                           19 
                                      British 
                                            1 
                                  fall baking 
                                            1 
                                       French 
                                            2 
                                       German 
                                            1 
                                international 
                                            1 
                                      Italian 
                                            4 
                             italian-american 
                                            1 
                                      Mexican 
                                            3 
                                        vegan 
                                            1

Should all be cookies. Scraped from cookie section: https://bakedbree.com/category/food/cookies

Bake It With Love • Dataset (.csv)

39 columns x 239 rows

     rating        ratingnum      numberofsteps    totalingredients
 Min.   :4.170   Min.   :  1.00   Min.   : 2.000   Min.   : 2.00   
 1st Qu.:5.000   1st Qu.:  2.00   1st Qu.: 6.000   1st Qu.: 8.00   
 Median :5.000   Median :  6.00   Median : 8.000   Median :10.00   
 Mean   :4.988   Mean   : 14.72   Mean   : 9.619   Mean   :10.48   
 3rd Qu.:5.000   3rd Qu.: 13.00   3rd Qu.:10.000   3rd Qu.:12.00   
 Max.   :5.000   Max.   :220.00   Max.   :29.000   Max.   :22.00

Cuisine


                                  American 
                                       196 
                           American, Asian 
                                         2 
                          American, French 
                                         1 
                 American, Mexican, TexMex 
                                         1 
                                Australian 
                                         1 
                                  Austrian 
                                         1 
Canadian, English, German, Irish, Scottish 
                                         1 
                                    Danish 
                                         1 
                                   English 
                                         1 
                         English, Scottish 
                                         2 
                                    French 
                                        21 
                                    German 
                                         1 
                                     Greek 
                                         1 
                                   Italian 
                                         2 
                            Jewish, Polish 
                                         1 
                                   Mexican 
                                         1 
                           Mexican, TexMex 
                                         1 
                        Norwegian, Swedish 
                                         2 
                                    Polish 
                                         1 
                                   Russian 
                                         1

Category


                  Appetizers                    Chocolate 
                           3                            1 
Christmas Cookies & Desserts               Cookies & Bars 
                           9                          161 
                    Desserts                       Easter 
                           9                            1 
                   Halloween               Healthy Baking 
                          10                            1 
                      How To                     Macarons 
                          10                           16 
                   Main Dish                      No Bake 
                           3                            2 
                Pies & Tarts                      Recipes 
                           1                            4 
                 Side Dishes  Tips Tricks and Information 
                           1                            6

Should all be cookie (and bar) recipes. Scraped from the cookie/bars section: https://bakeitwithlove.com/category/recipes/desserts/cookies-bars

Baker By Nature • Dataset (.csv)

42 columns x 111 rows

     rating        ratingnum      numberofsteps    totalingredients
 Min.   :3.000   Min.   :  1.00   Min.   : 2.000   Min.   : 5.0    
 1st Qu.:4.740   1st Qu.:  3.00   1st Qu.: 6.000   1st Qu.:10.0    
 Median :4.920   Median :  6.00   Median : 8.000   Median :12.0    
 Mean   :4.811   Mean   : 19.36   Mean   : 8.144   Mean   :12.5    
 3rd Qu.:5.000   3rd Qu.: 15.00   3rd Qu.:10.000   3rd Qu.:14.0    
 Max.   :5.000   Max.   :441.00   Max.   :27.000   Max.   :23.0


                            American American, Baking, Chocolate, Cookies 
                                  47                                    1 
           American, Baking, Cookies           Baking, Chocolate, Cookies 
                                   3                                    4 
 Baking, Chocolate, Cookies, Italian                      Baking, Cookies 
                                   1                                    2 
              Baking, Cookies, Lemon                     Candy, Chocolate 
                                   1                                    1 
                          Cheesecake                            Chocolate 
                                   1                                    1 
                             Cookies                               French 
                                  40                                    2 
                             Italian 
                                   1


Breakfast   Dessert 
        3       102

Should all be cookies. Scraped from the cookie section: https://bakerbynature.com/recipe-index/?_desserts=cookies

BBC Good Food homepage • Dataset (.csv)
also known as Goodfood.com, British/international food website

40 columns x 298 rows

NOT all recipes are cookies. Website was scraped from a search for “cookies”.

    title               rating        ratingnum  totalingredients
 Length:298         Min.   :2.500   Min.   :5    Min.   : 1.00   
 Class :character   1st Qu.:4.000   1st Qu.:5    1st Qu.: 7.00   
 Mode  :character   Median :4.500   Median :5    Median : 9.00   
                    Mean   :4.354   Mean   :5    Mean   : 9.57   
                    3rd Qu.:4.800   3rd Qu.:5    3rd Qu.:11.00   
                    Max.   :5.000   Max.   :5    Max.   :31.00   
                    NA's   :21      NA's   :21

length(unique(BBCcookies$categories))

[1] 75

Categories with the most rows:


 Freezable Vegetarian 
        30         64

Makes 30 Easy Prep: 15 mins Cook: 10 mins

Big Man’s World • Dataset (.csv)

31 columns x 96 rows

    title               rating        ratingnum         servings    
 Length:96          Min.   :4.860   Min.   :  12.0   Min.   : 1.00  
 Class :character   1st Qu.:5.000   1st Qu.:  64.0   1st Qu.:12.00  
 Mode  :character   Median :5.000   Median : 194.0   Median :12.00  
                    Mean   :4.995   Mean   : 485.2   Mean   :13.89  
                    3rd Qu.:5.000   3rd Qu.: 553.5   3rd Qu.:16.00  
                    Max.   :5.000   Max.   :3684.0   Max.   :40.00  
                    NA's   :1       NA's   :1                       
 numberofsteps   totalingredients
 Min.   :2.000   Min.   : 2.000  
 1st Qu.:4.000   1st Qu.: 4.000  
 Median :4.000   Median : 5.000  
 Mean   :4.562   Mean   : 6.438  
 3rd Qu.:5.000   3rd Qu.: 9.000  
 Max.   :7.000   Max.   :14.000

Unique items in Big Man's Category Column


Air Fryer Recipes Breakfast Recipes   Dessert Recipes 
                1                 3                92

Unique in course


Breakfast   Dessert     Snack 
        4        91         1

Unique in cuisine


                     American American, australian, English 
                           91                             1 
                   australian                       Italian 
                            1                             3

Should all be cookies. Scraped from the cookie section: https://thebigmansworld.com/category/healthy-desserts/healthy-cookies

Bites by Bianca • Dataset (.csv)

34 columns x 80 rows

     rating        ratingnum      numberofsteps   totalingredients
 Min.   :4.260   Min.   : 1.000   Min.   : 4.00   Min.   : 1.0    
 1st Qu.:5.000   1st Qu.: 1.000   1st Qu.: 9.00   1st Qu.:11.0    
 Median :5.000   Median : 3.000   Median :12.00   Median :12.0    
 Mean   :4.941   Mean   : 7.862   Mean   :12.09   Mean   :12.4    
 3rd Qu.:5.000   3rd Qu.: 7.000   3rd Qu.:14.00   3rd Qu.:14.0    
 Max.   :5.000   Max.   :60.000   Max.   :27.00   Max.   :16.0    
 NA's   :15      NA's   :15


        Bear     Beginner        Bunny        Chick      Cookies      Dessert 
           1            6            1            1           42           23 
Intermediate        Lemon     Macarons       Matcha      No-bake   Strawberry 
           1            1            1            1            1            1


                  American            American, Asian 
                        55                          7 
 American, Asian, Filipino    American, Asian, French 
                         1                          1 
 American, Asian, Japanese         American, Filipino 
                         2                          2 
American, Filipino, French           American, French 
                         1                          1 
                     Asian            Asian, Filipino 
                         4                          1 
             Asian, French                   Filipino 
                         1                          4


Dessert 
     80

Should all be cookies. Scraped from the cookie category: https://bitesbybianca.com/category/dessert/cookies

Boston Girl Bakes • Dataset (.csv)

37 columns x 137 rows

    title              rating            ratingnum      totalingredients
 Length:137         Length:137         Min.   :  1.00   Min.   : 3.0    
 Class :character   Class :character   1st Qu.:  1.25   1st Qu.:10.0    
 Mode  :character   Mode  :character   Median :  4.00   Median :11.0    
                                       Mean   : 10.09   Mean   :11.9    
                                       3rd Qu.: 10.00   3rd Qu.:14.0    
                                       Max.   :281.00   Max.   :26.0    
                                       NA's   :31

Should all be cookies. Scraped from cookie section: https://www.bostongirlbakes.com/category/cookies

Ingredients were scraped incorrectly, so the ingredientX columns contain some non-ingredients, and these are counted in the totalingredients column.
There is no NA in the ratingnum column. Instead, where rating is NA, ratingnum is 0. This is actually ideal; most of the datasets don’t do this and just have NA for both.
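
The ratingnum convention described above (0 instead of NA when a recipe has no rating) could be applied to the other datasets with something like this minimal Python sketch. It assumes each row is a dict of strings (for example from `csv.DictReader`), that missing values show up as an empty string or the text "NA", and that the columns are named `rating` and `ratingnum` as in the summaries. The function names are made up for illustration, and forcing the count to 0 whenever the rating is missing is only one possible reading of the convention.

```python
MISSING = {"", "NA", "NaN", "None"}  # assumed spellings of a missing value


def parse_number(text):
    """Return text as a float, or None if it is missing or not numeric."""
    if text is None or str(text).strip() in MISSING:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def normalize_ratings(rows):
    """Convert rating/ratingnum to numbers; unrated recipes get ratingnum 0."""
    cleaned = []
    for row in rows:
        new_row = dict(row)
        rating = parse_number(row.get("rating"))
        count = parse_number(row.get("ratingnum"))
        if rating is None:
            count = 0  # no rating is treated as "zero people rated it"
        new_row["rating"] = rating
        new_row["ratingnum"] = int(count) if count is not None else None
        cleaned.append(new_row)
    return cleaned


# Made-up example rows, not taken from any of the datasets.
rows = [
    {"title": "A", "rating": "4.8", "ratingnum": "12"},
    {"title": "B", "rating": "NA", "ratingnum": "NA"},
    {"title": "C", "rating": "", "ratingnum": "0"},
]
result = normalize_ratings(rows)
```

After this, an unrated recipe has `rating` of `None` and `ratingnum` of `0` regardless of how the source dataset encoded it.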

Broma Bakery • Dataset (.csv)

39 columns x 187 rows

Should all be cookies, scraped from cookie category: https://bromabakery.com/category/desserts/cookies

     rating        ratingnum     numberofsteps    totalingredients
 Min.   :3.000   Min.   :  1.0   Min.   : 0.000   Min.   : 1.00   
 1st Qu.:4.700   1st Qu.:  3.0   1st Qu.: 5.000   1st Qu.:10.00   
 Median :4.900   Median :  7.0   Median : 6.000   Median :11.00   
 Mean   :4.826   Mean   : 16.3   Mean   : 6.241   Mean   :11.35   
 3rd Qu.:5.000   3rd Qu.: 15.0   3rd Qu.: 8.000   3rd Qu.:13.00   
 Max.   :5.000   Max.   :527.0   Max.   :14.000   Max.   :23.00   
 NA's   :5       NA's   :5


     Bars Breakfast  Brownies   Cookies   Muffins   No Bake 
        2         1         1       181         1         1

No servings/yield

Brown Eyed Baker • Dataset (.csv)

39 columns x 131 rows

Should all be cookies, scraped from cookie category: https://www.browneyedbaker.com/recipes/desserts/cookie-recipes

    title               rating        ratingnum      numberofsteps  
 Length:131         Min.   :3.860   Min.   :  1.00   Min.   : 0.00  
 Class :character   1st Qu.:4.430   1st Qu.:  8.00   1st Qu.: 4.00  
 Mode  :character   Median :4.560   Median : 20.50   Median : 5.00  
                    Mean   :4.568   Mean   : 51.54   Mean   : 5.45  
                    3rd Qu.:4.720   3rd Qu.: 48.25   3rd Qu.: 7.00  
                    Max.   :5.000   Max.   :399.00   Max.   :14.00  
                    NA's   :7       NA's   :7                       
 totalingredients
 Min.   : 1.00   
 1st Qu.: 8.00   
 Median :11.00   
 Mean   :10.39   
 3rd Qu.:12.00   
 Max.   :23.00


       Dessert Dessert, Snack          Snack 
            15              5            103


         American American, Italian            German             Greek 
              106                 6                 1                 1 
          Italian    South American 
                8                 1

Empty “custom time” column

Cafe Sucre Farine • Dataset (.csv)

36 columns x 93 rows

     rating        ratingnum        servings  numberofsteps    totalingredients
 Min.   :3.670   Min.   : 1.00   Min.   :24   Min.   : 4.000   Min.   : 5.00   
 1st Qu.:4.910   1st Qu.: 7.00   1st Qu.:24   1st Qu.: 7.000   1st Qu.: 9.00   
 Median :5.000   Median :11.00   Median :24   Median : 9.000   Median :11.00   
 Mean   :4.897   Mean   :14.25   Mean   :24   Mean   : 9.731   Mean   :11.23   
 3rd Qu.:5.000   3rd Qu.:17.00   3rd Qu.:24   3rd Qu.:11.000   3rd Qu.:13.00   
 Max.   :5.000   Max.   :72.00   Max.   :24   Max.   :27.000   Max.   :20.00   
                                 NA's   :92


           Baked Goods, Dessert, Dessert/Cookies 
                                               1 
                    Baked Goods, Dessert/Cookies 
                                               1 
                                   Bars, Cookeis 
                                               1 
                          Bars, Cookeis, Dessert 
                                               1 
            Bars, Cookeis, Dessert, Sweet Treats 
                                               1 
                                Breakfast, Snack 
                                               1 
                                         Cookies 
                                               7 
                                Cookies, Dessert 
                                              20 
               Cookies, Dessert, Dessert/Cookies 
                                               2 
       Cookies, Dessert, Dessert/Cookies, Snacks 
                                               1 
  Cookies, Dessert, Dessert/Cookies, Sweet Treat 
                                               1 
Cookies, Dessert, Gifts from the Kitchen, Snacks 
                                               1 
                         Cookies, Dessert, Snack 
                                               1 
                        Cookies, Dessert, Snacks 
                                               2 
                        Cookies, Dessert/Cookies 
                                               4 
                Cookies, Dessert/Cookies, Snacks 
                                               1 
                                         Dessert 
                                              18 
                        Dessert, Dessert/Cookies 
                                               1 
                                  Dessert, Snack 
                                               5 
                             Dessert, Snack bars 
                                               1 
                                 Dessert, Snacks 
                                               2 
                           Dessert, Sweet Treats 
                                               2 
                                 Dessert/Cookies 
                                              16


                   American           American, British 
                         56                           1 
American, British, Scottish          American, European 
                          2                           1 
            American, Irish   American, Irish, Scottish 
                          2                           2 
          American, Mexican          American, Scottish 
                          1                           2 
American, Scottish-Inspired                    Austrian 
                          1                           1 
                    Belgian                     British 
                          1                           1 
                     French     French, French-American 
                          2                           1 
              Holiday foods                       Irish 
                          1                           2 
                    Italian   Italian, Italian-Inspired 
                          2                           1

Should all be cookies (and bars). Scraped from cookie category: https://thecafesucrefarine.com/category/cookies-bars

Only one recipe has servings info (but if that means this dataset has to be removed, it’s only 93 rows)

Chef’s Pencil • Dataset (.csv)

30 columns x 60 rows

< table of extent 0 x 0 >


               American American, International                Austrian 
                      3                       1                       2 
              Brazilian       British, Scottish        Eastern European 
                      1                       1                       1 
           Est European                Filipino                  French 
                      2                       1                       4 
  French, International                   Greek    Greek, International 
                      1                       6                       2 
          International                 Italian              Portuguese 
                     21                       3                       2 
               Romanian                 Russian South American, Spanish 
                      1                       1                       1 
                Spanish                   Swiss                 Turkish 
                      1                       1                       3 
             Venezuelan 
                      1

Should all be cookies. Scraped from cookie section: https://www.chefspencil.com/recipe-courses/dessert/cookies

“number of steps” might be inaccurate (although this applies to every dataset that has a number of steps)

Cookie Rookie • Dataset (.csv)

70 columns x 155 rows

    title              author              rating        ratingnum     
 Length:155         Length:155         Min.   :4.150   Min.   :  2.00  
 Class :character   Class :character   1st Qu.:4.500   1st Qu.:  9.00  
 Mode  :character   Mode  :character   Median :4.670   Median : 15.00  
                                       Mean   :4.655   Mean   : 37.55  
                                       3rd Qu.:4.760   3rd Qu.: 29.00  
                                       Max.   :5.000   Max.   :595.00  
                                       NA's   :2       NA's   :6       
   servings         totalingredients
 Length:155         Min.   : 1.00   
 Class :character   1st Qu.: 8.00   
 Mode  :character   Median :12.00   
                    Mean   :15.31   
                    3rd Qu.:18.00   
                    Max.   :58.00

Ingredients were not scraped properly. Non-ingredients appear in the ingredient columns and are counted in totalingredients.

Cookies & Cups • Dataset (.csv)

39 columns x 406 rows

    title              author              rating        ratingnum     
 Length:406         Length:406         Min.   :1.500   Min.   :  1.00  
 Class :character   Class :character   1st Qu.:4.500   1st Qu.:  2.00  
 Mode  :character   Mode  :character   Median :4.800   Median :  4.00  
                                       Mean   :4.649   Mean   : 12.33  
                                       3rd Qu.:5.000   3rd Qu.: 13.00  
                                       Max.   :5.000   Max.   :216.00  
                                       NA's   :143     NA's   :143     
    yield           totalingredients
 Length:406         Min.   : 2.00   
 Class :character   1st Qu.: 9.00   
 Mode  :character   Median :10.00   
                    Mean   :10.68   
                    3rd Qu.:13.00   
                    Max.   :26.00


                     Bars                      Cake                     Candy 
                        1                         1                         4 
               Cheesecake                    Cookie                Cookie Bar 
                        3                         1                         2 
              Cookie Bars                   Cookies                    Desert 
                        5                       186                         1 
                  Dessert Dessert, Breakfast, Snack                Dog Treats 
                      110                         1                         1 
                    Fruit                    Pastry                       Pie 
                        1                         1                         1 
                    Snack 
                        1


                      American                          Amish 
                           161                              1 
                     Breakfast                        Desesrt 
                             1                              1 
                       Dessert                         French 
                           147                              1 
                       Italian                         Polish 
                             3                              1 
                      Scottish South American, Latin American 
                             1                              1 
                         Swiss 
                             1

Should all be cookies. Scraped from cookie category: https://cookiesandcups.com/recipes/cookies/

Cooking with my Kids • Dataset (.csv)
British website

30 columns x 84 rows

     rating        ratingnum      numberofsteps    totalingredients
 Min.   :2.000   Min.   : 1.000   Min.   : 0.000   Min.   : 1.000  
 1st Qu.:4.768   1st Qu.: 1.000   1st Qu.: 8.000   1st Qu.: 6.000  
 Median :5.000   Median : 3.000   Median :10.000   Median : 7.000  
 Mean   :4.721   Mean   : 4.412   Mean   : 9.798   Mean   : 7.036  
 3rd Qu.:5.000   3rd Qu.: 7.000   3rd Qu.:12.000   3rd Qu.: 8.250  
 Max.   :5.000   Max.   :16.000   Max.   :18.000   Max.   :14.000  
 NA's   :50      NA's   :50

Unique items in course


                Afternoon tea        Afternoon tea, Dessert 
                           10                             5 
Afternoon tea, Dessert, Snack          Afternoon tea, Snack 
                            4                            20 
                      Dessert                Dessert, Snack 
                            6                             1 
                        Snack 
                           34

Unique items in cuisine


         American American, British           British   British, danish 
                8                32                28                 1 
British, scottish            German          scottish 
                3                 1                 7

Daring Gourmet • Dataset (.csv)

50 columns x 116 rows

     rating        ratingnum      numberofsteps    totalingredients
 Min.   :4.000   Min.   :  1.00   Min.   : 1.000   Min.   : 2.00   
 1st Qu.:4.980   1st Qu.:  8.00   1st Qu.: 2.000   1st Qu.: 7.00   
 Median :5.000   Median : 26.50   Median : 3.000   Median :12.00   
 Mean   :4.963   Mean   : 61.53   Mean   : 3.336   Mean   :11.89   
 3rd Qu.:5.000   3rd Qu.: 70.50   3rd Qu.: 4.000   3rd Qu.:16.00   
 Max.   :5.000   Max.   :610.00   Max.   :10.000   Max.   :34.00   
 NA's   :2       NA's   :2

Unique items in course


              Appetizer, hors d’oeuvres Appetizer, Main Course, Side Dish, Soup 
                                      1                                       1 
            Appetizer, Side Dish, Snack                        Appetizer, Snack 
                                      1                                       3 
                      Beverages, Drinks                        bread, Side Dish 
                                      1                                       1 
                              Breakfast                       Breakfast, Brunch 
                                      4                                       1 
             Breakfast, Brunch, Dessert          Breakfast, Brunch, Main Course 
                                      1                                       1 
                     Breakfast, Dessert               Breakfast, Dessert, Snack 
                                      2                                       1 
                 Brunch, Dessert, Lunch               Candy, condiment, Dessert 
                                      1                                       1 
           Candy, condiment, Ingredient                 Candy, condiment, Snack 
                                      1                                       1 
                         Candy, Dessert                   Candy, Dessert, Snack 
                                      1                                       1 
                              condiment         condiment, dip, Dressing, Sauce 
                                     10                                       1 
                  condiment, dip, glaze                    condiment, Seasoning 
                                      1                                       1 
             condiment, Seasoning Blend                        condiment, Syrup 
                                      1                                       1 
                                Dessert                     Dessert, Ingredient 
                                     41                                       1 
                   Dessert, Main Course                          Dessert, Snack 
                                      2                                       2 
                           Dessert, Tea                              Dog Treats 
                                      2                                       1 
                      Entree, Main Dish                 Entree, Main Dish, Soup 
                                      1                                       1 
                             Ingredient                             Main Course 
                                      2                                       4 
                              Main Dish                         Main Dish, Stew 
                                      2                                       1 
                                  Salad                        Salad, Side Dish 
                                      2                                       1 
              Salad, Side Dish, Starter              Seasoning Blend, Spice Mix 
                                      1                                       1 
                             Seasonings                               Side Dish 
                                      1                                       2 
                  Side Dish, vegetables                                    Soup 
                                      1                                       4

Unique items in cuisine


                      All             All, American                  American 
                        4                         1                        24 
        American, Italian        American, Southern            Asian, Chinese 
                        1                         1                         1 
  Australian, New Zealand  Austrian, French, German          Austrian, German 
                        1                         1                         4 
Austrian, German, Italian                   British          British, Cornish 
                        1                         2                         1 
         British, english         British, Scottish                   Chinese 
                        4                         1                         1 
                   danish                   english                    French 
                        1                         2                         5 
  French, German, Italian           French, Italian         French, Provencal 
                        1                         1                         1 
                   German           German, Italian             German, Swiss 
                       20                         1                         1 
                    Greek   Greek, Sephardic Jewish                    Indian 
                        2                         1                         1 
                  Italian                    Jewish                   Mexican 
                        2                         1                         2 
           Middle Eastern                Portuguese                     Welsh 
                        1                         1                         1

Not all cookies. Scraped from the search results for “cookie” on the website: https://www.daringgourmet.com/?s=cookie

The ingredient columns include things like “Day#:” (for multi-day recipes) as their own ingredient. This inflates the totalingredients column for those rows. Filter for entries ending in “:” to remove these.
“number of steps” is incorrect on a fair number of recipes because of unusual website formatting.
Filtering titles by “cookie” might not work with this dataset, but searching for “cookie” anywhere in the row might (some titles, such as “Authentic Pfeffernüsse”, don’t say “cookie” even though pfeffernüsse is a cookie).
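
The two filters suggested in these notes could look roughly like the Python sketch below. It is only a sketch under assumptions: the ingredient columns are assumed to be named `ingredient1`, `ingredient2`, and so on, the colon note is read as "drop entries that end in a colon", missing values are assumed to be empty strings or "NA", and the `keywords` column and ingredient text in the example row are made up. Searching every field for "cookie" is a loose filter (it would also match a URL or an ingredient like cookie butter), so the result still needs a manual check.

```python
import re

# Assumed column naming: ingredient1, ingredient2, ... (the "ingredientX" columns).
INGREDIENT_COL = re.compile(r"^ingredient(\d+)$")


def is_blank(value):
    """Treat None, empty strings, and the text 'NA' as missing (an assumption)."""
    return value is None or str(value).strip() in {"", "NA"}


def drop_colon_ingredients(row):
    """Drop ingredient entries ending in ':' and recount totalingredients."""
    cols = sorted(
        (c for c in row if INGREDIENT_COL.match(c)),
        key=lambda c: int(INGREDIENT_COL.match(c).group(1)),
    )
    kept = [
        str(row[c]).strip()
        for c in cols
        if not is_blank(row[c]) and not str(row[c]).strip().endswith(":")
    ]
    new_row = dict(row)
    # Shift the surviving ingredients left so no gaps are left behind.
    for i, col in enumerate(cols):
        new_row[col] = kept[i] if i < len(kept) else ""
    new_row["totalingredients"] = len(kept)
    return new_row


def mentions_cookie(row):
    """True if 'cookie' appears (case-insensitively) in any field of the row."""
    return any("cookie" in str(v).lower() for v in row.values())


# Made-up example row; the "keywords" column and ingredient text are hypothetical.
row = {
    "title": "Authentic Pfeffernüsse",
    "keywords": "German Christmas cookies",
    "ingredient1": "Day 1:",
    "ingredient2": "2 cups flour",
    "ingredient3": "1 cup honey",
    "totalingredients": "3",
}
cleaned = drop_colon_ingredients(row)
print(cleaned["totalingredients"], mentions_cookie(row))  # prints: 2 True
```

The row is kept even though its title never says "cookie", and the "Day 1:" entry no longer counts toward totalingredients.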

Desserts on a Dime • Dataset (.csv)

36 columns x 272 rows

    rating            ratingnum         servings   numberofsteps   
 Length:272         Min.   :  1.00   Min.   :12    Min.   : 3.000  
 Class :character   1st Qu.:  3.00   1st Qu.:12    1st Qu.: 7.000  
 Mode  :character   Median :  6.00   Median :12    Median : 8.000  
                    Mean   : 15.82   Mean   :12    Mean   : 9.151  
                    3rd Qu.: 15.00   3rd Qu.:12    3rd Qu.:11.000  
                    Max.   :242.00   Max.   :12    Max.   :23.000  
                    NA's   :7        NA's   :269                   
 totalingredients
 Min.   : 3.000  
 1st Qu.: 5.000  
 Median : 9.000  
 Mean   : 8.754  
 3rd Qu.:11.000  
 Max.   :23.000


American  Italian  Russian 
     270        1        1

Should all be cookies. Scraped from cookie category: https://dessertsonadime.com/category/recipe-by-type/cookies

The “servings” column looks really off. I might have done something wrong while scraping it

Eggless Cooking • Dataset (.csv)

36 columns x 87 rows

     rating        ratingnum      numberofsteps    totalingredients
 Min.   :4.000   Min.   :  1.00   Min.   : 1.000   Min.   : 4.00   
 1st Qu.:4.955   1st Qu.:  2.00   1st Qu.: 6.000   1st Qu.: 8.00   
 Median :5.000   Median :  5.00   Median : 8.000   Median :11.00   
 Mean   :4.925   Mean   : 35.52   Mean   : 8.322   Mean   :11.22   
 3rd Qu.:5.000   3rd Qu.: 18.50   3rd Qu.:10.000   3rd Qu.:14.50   
 Max.   :5.000   Max.   :903.00   Max.   :21.000   Max.   :22.00   
 NA's   :16      NA's   :16

Unique items in cuisine


American  British 
      86        1

Should all be cookies. Scraped from cookie section: https://www.egglesscooking.com/eggless-baking-recipes/eggless-cookies/

Family Cookie Recipes • Dataset (.csv)

44 columns x 704 rows

     rating        ratingnum      actualsteps        ingredient1       
 Min.   :2.000   Min.   : 1.000   Length:704         Length:704        
 1st Qu.:5.000   1st Qu.: 2.000   Class :character   Class :character  
 Median :5.000   Median : 2.000   Mode  :character   Mode  :character  
 Mean   :4.837   Mean   : 3.822                                        
 3rd Qu.:5.000   3rd Qu.: 5.000                                        
 Max.   :5.000   Max.   :31.000                                        
 NA's   :411     NA's   :411


          American American, Austrian   American, Danish    American, Dutch 
               673                  1                  1                  1 
  American, French  American, Mexican            Cookies             Danish 
                 1                  1                  2                  2 
        dog treats             French             German            Italian 
                 1                  4                  1                  2 
          Scottish 
                 1

SHOULD be all cookie recipes, because the website only has cookie recipes, but this was scraped from the website’s search results/recipe index. Also, some of these are dog cookies.
- ingredients scraped properly
- steps included
- most recipes have no rating (411 of 704 are NA)
- empty “customtime” column (remove)
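
Several datasets have a column that is completely empty, so rather than removing each one by name, something like this Python sketch could find and drop them all at once. It assumes rows are dicts of strings (for example from `csv.DictReader`) and that an empty cell is either an empty string or the text "NA"; `drop_empty_columns` is a made-up helper name and the example rows are invented.

```python
def is_blank(value):
    """Treat None, empty strings, and the text 'NA' as missing (an assumption)."""
    return value is None or str(value).strip() in {"", "NA"}


def drop_empty_columns(rows):
    """Return (rows without all-blank columns, names of the dropped columns)."""
    if not rows:
        return [], []
    columns = list(rows[0].keys())
    # A column is dropped only if every single row is blank in it.
    empty = [c for c in columns if all(is_blank(r.get(c)) for r in rows)]
    kept = [{c: v for c, v in r.items() if c not in empty} for r in rows]
    return kept, empty


# Made-up example rows, not taken from the dataset.
rows = [
    {"title": "Snickerdoodles", "customtime": "", "rating": "5"},
    {"title": "Dog biscuits", "customtime": "NA", "rating": "NA"},
]
kept, dropped = drop_empty_columns(rows)
print(dropped)  # prints: ['customtime']
```

Returning the dropped names makes it easy for a peer reviewer to confirm that only the expected columns were removed.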

Floral Apron • Dataset (.csv)

29 columns x 25 rows

     rating        ratingnum      totalingredients
 Min.   :4.000   Min.   :  1.00   Min.   : 8.00   
 1st Qu.:4.670   1st Qu.:  2.00   1st Qu.:10.00   
 Median :4.880   Median :  4.00   Median :11.00   
 Mean   :4.791   Mean   : 10.36   Mean   :11.28   
 3rd Qu.:5.000   3rd Qu.:  8.00   3rd Qu.:12.00   
 Max.   :5.000   Max.   :105.00   Max.   :16.00


  Macarons Madeleines 
        10          3


American Austrian   French   German  Italian 
       8        1       14        1        1

Only 25 recipes

Fun Cookie Recipes • Dataset (.csv)

42 columns x 207 rows

     rating        ratingnum       numberofsteps    totalingredients
 Min.   :4.340   Min.   :  1.000   Min.   : 3.000   Min.   : 3.00   
 1st Qu.:5.000   1st Qu.:  3.000   1st Qu.: 7.000   1st Qu.: 8.00   
 Median :5.000   Median :  5.000   Median : 8.000   Median :11.00   
 Mean   :4.986   Mean   :  8.553   Mean   : 8.932   Mean   :10.88   
 3rd Qu.:5.000   3rd Qu.:  9.000   3rd Qu.:11.000   3rd Qu.:14.00   
 Max.   :5.000   Max.   :142.000   Max.   :23.000   Max.   :24.00   
 NA's   :1       NA's   :1


              Birthday Brownies & Cookie Bars       Cake Mix Cookies 
                     1                     16                      6 
             Chocolate              Christmas        Cut Out Cookies 
                     1                     14                      7 
          Drop Cookies                 Easter                   Fall 
                    17                      4                      4 
             Halloween               Macarons                 Summer 
                     9                      3                      4 
           Valentine's 
                     7

Empty “update” column (just remove that)

Hilda’s Kitchen Blog • Dataset (.csv)

42 columns x 207 rows

     rating        ratingnum      numberofsteps    totalingredients
 Min.   :4.430   Min.   :  1.00   Min.   : 2.000   Min.   : 1.00   
 1st Qu.:4.955   1st Qu.:  2.00   1st Qu.: 4.000   1st Qu.: 6.00   
 Median :5.000   Median :  4.00   Median : 5.000   Median : 9.00   
 Mean   :4.954   Mean   : 25.65   Mean   : 5.704   Mean   :10.31   
 3rd Qu.:5.000   3rd Qu.: 16.75   3rd Qu.: 6.000   3rd Qu.:14.00   
 Max.   :5.000   Max.   :222.00   Max.   :15.000   Max.   :26.00

Unique in category


       Appetizer Recipes        Breakfast Recipes        Christmas Recipes 
                       2                        1                        2 
   Condiments and Sauces            Drink Recipes           Easter Recipes 
                       1                        3                        3 
      Easy Asian Recipes           Entree Recipes             Fall Recipes 
                       1                        5                        1 
        Foraging Recipes          Italian Recipes   Middle Eastern Recipes 
                       2                        2                        5 
         Seafood Recipes           Smoker Recipes            Snack Recipes 
                       1                        2                        2 
   Spices and Seasonings Sweets & Dessert Recipes            Syrup Recipes 
                       2                       15                        1 
 Valentine's Day Recipes 
                       1

Unique in course


        Appetizer, entree, Snack             Appetizer, Side Dish 
                               1                                1 
                Appetizer, Snack            Appetizers, Side Dish 
                               1                                1 
      Breakfast, brunch, Dessert               Breakfast, Dessert 
                               1                                1 
        Breakfast, lunch, sweets                Breakfast, sweets 
                               1                                1 
         brunch, Dessert, sweets                       Condiments 
                               1                                2 
              Condiments, sweets                          Dessert 
                               2                                9 
                  Dessert, Snack           Dessert, Snack, sweets 
                               3                                1 
                 Dessert, Snacks                  Dessert, sweets 
                               4                                1 
                  dinner, entree            dinner, entree, lunch 
                               1                                1 
dinner, entree, lunch, Main Dish        dinner, entree, Main Dish 
                               1                                1 
        dinner, lunch, Main Dish                           Drinks 
                               1                                3 
                          entree                      Main Course 
                               2                                1 
          Main Course, Main Dish                            Other 
                               1                                1 
                       seasoning                seasoning, Spices 
                               1                                1 
                       Side Dish                            Snack 
                               1                                2 
                  Snacks, sweets                           sweets 
                               1                                3 
                           syrup 
                               1

[1] "Unique in cuisine"


                                     American 
                                           31 
                     American, Asian, Chinese 
                                            1 
      American, Asian, Chinese, Mediterranean 
                                            1 
                    American, Asian, Japanese 
                                            1 
                            American, British 
                                            1 
                            American, Mexican 
                                            1 
Asian, Indian, Iraqi, Middle Eastern, Persian 
                                            1 
                     Assyrian, Middle Eastern 
                                            2 
                                       French 
                                            1 
                                       German 
                                            1 
                       Indian, Middle Eastern 
                                            1 
                                      Italian 
                                            2 
                                       Jewish 
                                            1 
                                      Mexican 
                                            1 
                               Middle Eastern 
                                            3 
                      Middle Eastern, Turkish 
                                            1 
                                    Norwegian 
                                            2 
                                      Swedish 
                                            1 
                                     ukranian 
                                            1

Not all cookie recipes. Scraped from website search results.

I Am Baker • Dataset (.csv)

54 columns x 323 rows

     rating        ratingnum      yield         totalingredients
 Min.   :2.500   Min.   :  2.0   Mode:logical   Min.   : 3.00   
 1st Qu.:4.835   1st Qu.:  3.0   NA's:323       1st Qu.: 9.00   
 Median :5.000   Median :  5.0                  Median :11.00   
 Mean   :4.829   Mean   : 12.3                  Mean   :12.47   
 3rd Qu.:5.000   3rd Qu.: 11.0                  3rd Qu.:15.00   
 Max.   :5.000   Max.   :195.0                  Max.   :42.00   
 NA's   :108     NA's   :172

I Heart Eating • Dataset (.csv)

39 columns x 139 rows

     rating       ratingnum       numberofsteps   totalingredients
 Min.   :4.82   Min.   :   1.00   Min.   : 7.00   Min.   : 4.00   
 1st Qu.:5.00   1st Qu.:   3.00   1st Qu.: 9.00   1st Qu.: 9.50   
 Median :5.00   Median :   5.00   Median :11.00   Median :11.00   
 Mean   :4.99   Mean   :  33.92   Mean   :11.57   Mean   :11.16   
 3rd Qu.:5.00   3rd Qu.:  14.00   3rd Qu.:13.00   3rd Qu.:13.00   
 Max.   :5.00   Max.   :2022.00   Max.   :28.00   Max.   :20.00   
 NA's   :15     NA's   :15

 
 Unique category


  Breakfasts      Cookies     Desserts Fall Recipes 
           1           98           38            1

 
 Unique course


Breakfast   Dessert 
        1       138

 
 Unique cuisine


         American American, Mexican          Austrian            German 
              134                 1                 1                 1 
   Latin American           Mexican 
                1                 1

I Heart Naptime • Dataset (.csv)

38 columns x 179 rows

     rating        ratingnum      numberofsteps    totalingredients
 Min.   :4.860   Min.   :  1.00   Min.   : 0.000   Min.   : 1.0    
 1st Qu.:5.000   1st Qu.:  5.00   1st Qu.: 5.000   1st Qu.: 7.0    
 Median :5.000   Median : 11.00   Median : 6.000   Median :11.0    
 Mean   :4.991   Mean   : 43.58   Mean   : 5.771   Mean   :10.3    
 3rd Qu.:5.000   3rd Qu.: 28.00   3rd Qu.: 7.000   3rd Qu.:13.0    
 Max.   :5.000   Max.   :936.00   Max.   :13.000   Max.   :23.0

Insanely Good Recipes • Dataset (.csv)

32 columns x 60 rows

   ratingnum        cuisine          totalingredients ingredient1       
 Min.   : 1.000   Length:60          Min.   : 5.00    Length:60         
 1st Qu.: 2.750   Class :character   1st Qu.: 9.75    Class :character  
 Median : 4.000   Mode  :character   Median :12.00    Mode  :character  
 Mean   : 7.367                      Mean   :12.23                      
 3rd Qu.: 9.000                      3rd Qu.:14.00                      
 Max.   :49.000                      Max.   :21.00


Cuisine: American Cuisine: Austrian  Cuisine: Dessert  Cuisine: Italian 
               54                 1                 1                 2

Kirbie’s Cravings • Dataset (.csv)

44 columns x 491 rows

summary(kirbie[c(4:5, 12, 14)])

     rating        ratingnum       numberofsteps    totalingredients
 Min.   :1.000   Min.   :  1.000   Min.   : 0.000   Min.   : 1.000  
 1st Qu.:4.800   1st Qu.:  1.000   1st Qu.: 4.000   1st Qu.: 3.000  
 Median :5.000   Median :  3.000   Median : 5.000   Median : 5.000  
 Mean   :4.794   Mean   :  8.211   Mean   : 4.955   Mean   : 6.069  
 3rd Qu.:5.000   3rd Qu.:  7.000   3rd Qu.: 6.000   3rd Qu.: 9.000  
 Max.   :5.000   Max.   :178.000   Max.   :11.000   Max.   :28.000  
 NA's   :244     NA's   :244

table(kirbie$category)


   4 Ingredients or Less Appetizers & Side Dishes           Asian Desserts 
                      39                        2                        4 
            Asian dishes          Breakfast foods                 Brownies 
                       2                        2                        7 
                   Cakes                  Cookies                     Diet 
                       1                      229                       23 
                    Keto                 Low Carb 
                       1                        3

Should all be cookie recipes. Scraped from cookie category: https://kirbiecravings.com/category/recipes/recipes-cookies

Lord Byron’s Kitchen • Dataset (.csv)

65 columns x 682 rows

     rating        ratingnum       numberofsteps    totalingredients
 Min.   :2.000   Min.   :   1.00   Min.   : 2.000   Min.   : 1.000  
 1st Qu.:4.120   1st Qu.:   5.00   1st Qu.: 7.000   1st Qu.: 7.000  
 Median :5.000   Median :  13.00   Median : 9.000   Median : 9.000  
 Mean   :4.572   Mean   :  24.87   Mean   : 9.214   Mean   : 9.622  
 3rd Qu.:5.000   3rd Qu.:  20.00   3rd Qu.:11.000   3rd Qu.:12.000  
 Max.   :5.000   Max.   :1418.00   Max.   :25.000   Max.   :49.000  
 NA's   :152     NA's   :152


Christmas   Recipes 
       11       671

Maureen Abood • Dataset (.csv)

31 columns x 24 rows

    title               rating        ratingnum    numberofsteps  
 Length:24          Min.   :4.670   Min.   :1.00   Min.   : 3.00  
 Class :character   1st Qu.:5.000   1st Qu.:1.75   1st Qu.: 6.00  
 Mode  :character   Median :5.000   Median :2.00   Median : 8.50  
                    Mean   :4.967   Mean   :2.70   Mean   : 8.25  
                    3rd Qu.:5.000   3rd Qu.:3.00   3rd Qu.:10.00  
                    Max.   :5.000   Max.   :8.00   Max.   :14.00  
                    NA's   :4       NA's   :4                     
 totalingredients
 Min.   : 4.00   
 1st Qu.: 7.75   
 Median :11.00   
 Mean   :10.46   
 3rd Qu.:12.00   
 Max.   :16.00

Contains cookie recipes that don’t have “cookie” in the title, e.g. “Sitto’s Date Ma’amoul”.

Mia Kouppa • Dataset (.csv)

40 columns x 38 rows

     rating        ratingnum      numberofsteps    totalingredients
 Min.   :4.670   Min.   : 1.000   Min.   : 4.000   Min.   : 3.00   
 1st Qu.:4.950   1st Qu.: 2.000   1st Qu.: 7.000   1st Qu.: 9.00   
 Median :5.000   Median : 5.000   Median : 9.000   Median :11.00   
 Mean   :4.968   Mean   : 9.429   Mean   : 9.368   Mean   :11.13   
 3rd Qu.:5.000   3rd Qu.:14.000   3rd Qu.:11.000   3rd Qu.:13.00   
 Max.   :5.000   Max.   :49.000   Max.   :14.000   Max.   :24.00   
 NA's   :10      NA's   :10


       American American, Greek           Greek  Greek, Italian         Italian 
             13               3              16               1               3


       Bread, meze, Snack                 Breakfast Breakfast, Dessert, Snack 
                        1                         1                         1 
                  Dessert            Dessert, Snack 
                       33                         2

Will contain cookie recipes that don’t have “cookie” in the title, e.g. “Olive oil koulourakia with orange and cinnamon”.
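Since a plain title search for “cookie” would drop recipes like this one, here is a minimal R sketch of one way to flag rows for manual review instead of deleting them outright. The toy data frame, the `other_cookie_names` list, the `needs_review` column, and the “Greek Lemon Potatoes” row are all made up for illustration; the real dataset will need a longer, hand-checked list of names.

```r
# Toy titles standing in for the real data (the last title is invented).
recipes <- data.frame(
  title = c("Chocolate Chip Cookies",
            "Olive oil koulourakia with orange and cinnamon",
            "Sitto's Date Ma'amoul",
            "Greek Lemon Potatoes"),
  stringsAsFactors = FALSE
)

# Hypothetical, hand-maintained list of cookie names that lack the word "cookie".
other_cookie_names <- c("koulourakia", "ma'amoul")

title_lower <- tolower(recipes$title)

# TRUE when the title literally contains "cookie" (also matches "cookies").
has_cookie_word <- grepl("cookie", title_lower, fixed = TRUE)

# TRUE when the title contains any of the known cookie names.
has_known_name <- grepl(paste(other_cookie_names, collapse = "|"), title_lower)

# Flag everything else for a human to look at rather than dropping it.
recipes$needs_review <- !(has_cookie_word | has_known_name)
```

Flagging instead of filtering keeps the decision reviewable, which fits the plan to peer review the cleaning code. Titles in the real data may use curly apostrophes, so the name list may need both spellings.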

Modern Honey • Dataset (.csv)

38 columns x 122 rows

     rating       ratingnum       totalingredients
 Min.   :4.47   Min.   :   1.00   Min.   : 2.00   
 1st Qu.:4.99   1st Qu.:   3.00   1st Qu.:11.00   
 Median :5.00   Median :   8.00   Median :12.00   
 Mean   :4.97   Mean   :  27.08   Mean   :12.61   
 3rd Qu.:5.00   3rd Qu.:  18.00   3rd Qu.:14.00   
 Max.   :5.00   Max.   :1122.00   Max.   :27.00

Mommy’s Home Cooking • Dataset (.csv)

35 columns x 107 rows

Eggless recipes

    title               rating        ratingnum      numberofsteps   
 Length:107         Min.   :3.450   Min.   :  1.00   Min.   : 4.000  
 Class :character   1st Qu.:4.580   1st Qu.:  2.00   1st Qu.: 7.000  
 Mode  :character   Median :5.000   Median :  5.00   Median : 8.000  
                    Mean   :4.762   Mean   : 26.57   Mean   : 8.364  
                    3rd Qu.:5.000   3rd Qu.: 20.00   3rd Qu.: 9.000  
                    Max.   :5.000   Max.   :539.00   Max.   :19.000  
                    NA's   :14      NA's   :14                       
 totalingredients
 Min.   : 3.00   
 1st Qu.: 9.50   
 Median :11.00   
 Mean   :11.14   
 3rd Qu.:13.00   
 Max.   :18.00

table(mommyscooking$cuisine)


  American   European     French    Italian    Mexican Venezuelan 
        99          3          2          1          1          1

table(mommyscooking$category)


        All Recipes           Breakfast     Brownies & Bars               Cakes 
                 12                   3                   3                   1 
            Cookies Eggless Baking Tips 
                 87                   1

Example of a time value as it appears in this dataset: “Chilling Time: 30 minutes mins”

Munaty Cooking • Dataset (.csv)

37 columns x 34 rows

  datelisted            rating        servings         actualsteps       
 Length:34          Min.   :3.940   Length:34          Length:34         
 Class :character   1st Qu.:4.670   Class :character   Class :character  
 Mode  :character   Median :5.000   Mode  :character   Mode  :character  
                    Mean   :4.804                                        
                    3rd Qu.:5.000                                        
                    Max.   :5.000                                        
                    NA's   :16

[1] "Unique in category"


       Cookies Recipes        Dessert Recipes Easy Breakfast Recipes 
                    29                      3                      1 
Middle Eastern Recipes 
                     1

[1] "Unique in cuisine"


               American                 Arabian Arabian, Middle Eastern 
                     14                       5                       1 
                British         British, Indian                  French 
                      2                       1                       2 
                Italian          Middle Eastern                 Russian 
                      3                       1                       1

[1] "Unique in course"


            Cake          Cookies Cookies, Dessert          Dessert 
               1               26                2                2

My Baking Addiction • Dataset (.csv)

45 columns x 157 rows

    title               rating        ratingnum       numberofsteps   
 Length:157         Min.   :4.000   Min.   :   1.00   Min.   : 1.000  
 Class :character   1st Qu.:4.460   1st Qu.:   4.00   1st Qu.: 5.000  
 Mode  :character   Median :4.590   Median :  13.00   Median : 7.000  
                    Mean   :4.646   Mean   :  51.88   Mean   : 7.185  
                    3rd Qu.:4.855   3rd Qu.:  34.50   3rd Qu.: 9.000  
                    Max.   :5.000   Max.   :1580.00   Max.   :17.000  
                    NA's   :23      NA's   :23                        
 totalingredients
 Min.   : 2.00   
 1st Qu.: 9.00   
 Median :11.00   
 Mean   :11.17   
 3rd Qu.:13.00   
 Max.   :30.00

Olive Magazine • Dataset (.csv)
British magazine

34 columns x 146 rows

   ratingnum        category         totalingredients ingredient1       
 Min.   : 0.000   Length:146         Min.   : 3.000   Length:146        
 1st Qu.: 1.000   Class :character   1st Qu.: 7.000   Class :character  
 Median : 2.000   Mode  :character   Median : 9.000   Mode  :character  
 Mean   : 3.344                      Mean   : 9.562                     
 3rd Qu.: 4.000                      3rd Qu.:11.000                     
 Max.   :24.000                      Max.   :23.000                     
 NA's   :53


Baking and desserts        Chef recipes                Easy           Entertain 
                102                   4                   3                   6 
             Family              Health             Recipes               Vegan 
                 21                   2                   2                   6

Preppy Kitchen • Dataset (.csv)

34 columns x 138 rows

     rating        ratingnum      totalingredients
 Min.   :4.670   Min.   :   1.0   Min.   : 2.00   
 1st Qu.:4.960   1st Qu.:  19.0   1st Qu.: 8.25   
 Median :4.990   Median :  70.0   Median :10.00   
 Mean   :4.968   Mean   : 372.5   Mean   :10.88   
 3rd Qu.:5.000   3rd Qu.: 238.0   3rd Qu.:13.00   
 Max.   :5.000   Max.   :6053.0   Max.   :22.00   
 NA's   :5       NA's   :5


                 American         American, British American, english, French 
                      106                         2                         1 
         American, French           American, Greek         American, Italian 
                        2                         1                         1 
        American, russian                  Austrian                   British 
                        1                         1                         1 
        British, Scottish                    French                    German 
                        3                         3                         4 
                    Greek                   Italian                    Jewish 
                        1                         7                         1 
                    latin                   Mexican          Mexican, Swedish 
                        1                         1                         1

The Recipe Critic • Dataset (.csv)

38 columns x 161 rows

     rating        ratingnum      numberofsteps    totalingredients
 Min.   :3.000   Min.   : 1.000   Min.   : 2.000   Min.   : 3.00   
 1st Qu.:4.500   1st Qu.: 1.000   1st Qu.: 5.000   1st Qu.: 8.00   
 Median :5.000   Median : 3.000   Median : 7.000   Median :11.00   
 Mean   :4.682   Mean   : 4.598   Mean   : 7.261   Mean   :10.99   
 3rd Qu.:5.000   3rd Qu.: 5.000   3rd Qu.: 9.000   3rd Qu.:13.00   
 Max.   :5.000   Max.   :31.000   Max.   :20.000   Max.   :22.00   
 NA's   :69      NA's   :69


          Beverages           Breakfast            Desserts               Fruit 
                  1                   1                 139                   2 
            Holiday Recipe Skill Levels 
                  9                   9

Should all be cookies. Scraped from cookie section: https://therecipecritic.com/dessert-recipes/cookies

Recipe Magik • Dataset (.csv)

36 columns x 92 rows

     rating       ratingnum     numberofsteps    totalingredients
 Min.   :2.50   Min.   :1.000   Min.   : 0.000   Min.   : 2.000  
 1st Qu.:5.00   1st Qu.:1.000   1st Qu.: 4.000   1st Qu.: 8.000  
 Median :5.00   Median :2.000   Median : 5.000   Median :10.000  
 Mean   :4.74   Mean   :2.296   Mean   : 6.946   Mean   : 9.924  
 3rd Qu.:5.00   3rd Qu.:2.000   3rd Qu.: 8.000   3rd Qu.:12.000  
 Max.   :5.00   Max.   :8.000   Max.   :31.000   Max.   :21.000  
 NA's   :65     NA's   :65

Not all cookies. Scraped from website’s search results for “cookie”

Sally’s Baking Addiction • Dataset (.csv)

61 columns x 260 rows

Should all be cookies already - scraped from the cookie category: https://sallysbakingaddiction.com/category/desserts/cookies

    title              author             genre               rating    
 Length:260         Length:260         Length:260         Min.   :3.80  
 Class :character   Class :character   Class :character   1st Qu.:4.60  
 Mode  :character   Mode  :character   Mode  :character   Median :4.80  
                                                          Mean   :4.72  
                                                          3rd Qu.:4.90  
                                                          Max.   :5.00  
                                                          NA's   :1     
   ratingnum       totalingredients
 Min.   :   1.00   Min.   : 6.00   
 1st Qu.:  11.00   1st Qu.:11.00   
 Median :  28.00   Median :12.50   
 Mean   :  81.46   Mean   :13.08   
 3rd Qu.:  73.00   3rd Qu.:15.00   
 Max.   :1813.00   Max.   :49.00   
 NA's   :1


      Breakfast Brownies & Bars      Cheesecake       Christmas         Cookies 
              1               2               1               3             249 
       Desserts       Halloween         Healthy No-Bake Recipes 
              1               1               1               1

Notes
- The ingredients weren’t scraped properly, so some of them are section headings or bits of HTML, e.g.:
Cookies / For Decorating / Optional Topping Before Baking / Topping / Easy Icing / Rolling / Maple Icing / /wp:list
- An easy way to filter for these is to isolate ingredients that don’t have any numbers in them, but some real ingredients, like “Assorted sprinkles”, have no digits either.
- ⚠️ Every non-ingredient will have added 1 to totalingredients for that recipe. That will need to be fixed, either by subtracting 1 every time you remove an ingredient or by recomputing totalingredients after removing all non-ingredients.
- ⚠️ The ingredientX columns will need to be shifted when ingredients are removed from a recipe.

Code used to fix this for another recipe should be usable for this too
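As a rough illustration of the cleanup described in these notes, here is a minimal base R sketch. It is not the club’s existing fix. It assumes the ingredient columns are named `ingredient1`, `ingredient2`, and so on, and it matches against an explicit list of known section headings rather than using the “no digits” rule, so that entries like “Assorted sprinkles” are kept. The toy data frame and the `non_ingredients` and `clean_row` names are made up for the example.

```r
# Toy data standing in for the real ingredient columns (values are made up).
recipes <- data.frame(
  title = c("Sugar Cookies", "Maple Cookies"),
  ingredient1 = c("Cookies", "2 cups flour"),
  ingredient2 = c("2 cups flour", "Maple Icing"),
  ingredient3 = c("Assorted sprinkles", "1 cup maple syrup"),
  ingredient4 = c("/wp:list", NA),
  totalingredients = c(4, 3),
  stringsAsFactors = FALSE
)

# Section headings and HTML residue listed in the notes (lowercased); extend by hand.
non_ingredients <- c("cookies", "for decorating", "optional topping before baking",
                     "topping", "easy icing", "rolling", "maple icing", "/wp:list")

# Assumed naming scheme for the ingredient columns.
ing_cols <- grep("^ingredient[0-9]+$", names(recipes), value = TRUE)

# Drop non-ingredients from one row, shift the rest left, pad the end with NA.
clean_row <- function(x) {
  x <- unname(x)
  keep <- x[!is.na(x) & !(tolower(trimws(x)) %in% non_ingredients)]
  c(keep, rep(NA_character_, length(x) - length(keep)))
}

# apply() returns one column per row, so transpose back to rows x ingredients.
cleaned <- t(apply(recipes[ing_cols], 1, clean_row))
recipes[ing_cols] <- as.data.frame(cleaned, stringsAsFactors = FALSE)

# Recompute the count from scratch instead of subtracting 1 per removal.
recipes$totalingredients <- rowSums(!is.na(recipes[ing_cols]))
```

Recomputing `totalingredients` at the end avoids having to keep the count in sync during removal. The explicit list costs more effort than a digit check but does not throw away real ingredients that have no quantity.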

Taste of Home • Dataset (.csv)

40 columns x 1326 rows

    title              author              rating       ratingnum     
 Length:1326        Length:1326        Min.   :0.00   Min.   :  0.00  
 Class :character   Class :character   1st Qu.:3.80   1st Qu.:  2.00  
 Mode  :character   Mode  :character   Median :4.50   Median :  5.00  
                                       Mean   :3.82   Mean   : 11.38  
                                       3rd Qu.:4.90   3rd Qu.: 12.00  
                                       Max.   :5.00   Max.   :304.00  
 numberofsteps    totalingredients
 Min.   : 1.000   Min.   : 2.00   
 1st Qu.: 2.000   1st Qu.: 8.00   
 Median : 3.000   Median :10.00   
 Mean   : 3.157   Mean   :10.56   
 3rd Qu.: 4.000   3rd Qu.:13.00   
 Max.   :17.000   Max.   :25.00

length(unique(tasteofhome$category))

[1] 15

table(tasteofhome$category)


                     Bars                 Beverages             Bread Recipes 
                        1                         1                         1 
 Breads, Rolls & Pastries                  Brownies                     Cakes 
                        1                         2                         7 
                    Candy                Condiments                   Cookies 
                        7                         1                       884 
       Dishes & Beverages Ice Cream & Frozen Treats               Ingredients 
                      413                         3                         2 
                   Pizzas                  Puddings 
                        1                         1

print("Taste of Home recipes without ratings")

[1] "Taste of Home recipes without ratings"

sum(tasteofhome$ratingnum==0)

[1] 172
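The other datasets store unrated recipes as NA, while this one appears to store them as 0, which would pull the mean rating down. A minimal sketch of recoding them follows. It assumes that a `ratingnum` of 0 always means “no ratings yet”, which should be checked against the site before applying it. The small data frame here is made up so the example runs on its own.

```r
# Toy rows standing in for the real tasteofhome data (values are made up).
tasteofhome <- data.frame(
  rating    = c(4.5, 0, 5, 0),
  ratingnum = c(12, 0, 3, 0)
)

# Rows with zero ratings (assumption: these are unrated, not truly rated 0).
no_ratings <- !is.na(tasteofhome$ratingnum) & tasteofhome$ratingnum == 0

# Recode both columns to NA so they match the convention in the other datasets.
tasteofhome$rating[no_ratings] <- NA
tasteofhome$ratingnum[no_ratings] <- NA

# Mean rating over rated recipes only.
mean_rated <- mean(tasteofhome$rating, na.rm = TRUE)
```

In this toy example `mean_rated` is 4.75, compared with 2.375 if the zeros were left in.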

Should all be cookies. Scraped from the cookie section: https://www.tasteofhome.com/recipes/dishes-beverages/cookies

Notes:
1. Has both “author” and “recipe submitter” (and tester, but that column can be removed)
2. The “time” and “yield” columns are off. The first time column looks like this: “Total Time:Prep: 15 min. Bake: 10 min./batch”. The second time column looks like this: “Yield:about 5 dozen Prep:15 min Cook:10 min”. The recipes didn’t give a total time, so it will have to be calculated by adding together the other time values.
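The time calculation in note 2 could be sketched like this in base R. The two strings are the examples quoted above. The function name `extract_minutes` is made up, and the pattern only handles values written in minutes, so any values in hours would need extra handling. Note that “10 min./batch” is a per-batch time, so the sum is a lower bound for recipes with several batches.

```r
time1 <- "Total Time:Prep: 15 min. Bake: 10 min./batch"
time2 <- "Yield:about 5 dozen Prep:15 min Cook:10 min"

# Pull every "<Label>: <number> min" pair out of a string.
# Returns a named numeric vector of minutes, e.g. c(prep = 15, bake = 10).
extract_minutes <- function(x) {
  pattern <- "([A-Za-z]+):[[:space:]]*([0-9]+)[[:space:]]*min"
  hits <- regmatches(x, gregexpr(pattern, x))[[1]]
  labels <- sub(pattern, "\\1", hits)
  minutes <- as.numeric(sub(pattern, "\\2", hits))
  setNames(minutes, tolower(labels))
}

# Total time is the sum of the labelled parts (0 if nothing matched).
total1 <- sum(extract_minutes(time1))
total2 <- sum(extract_minutes(time2))

# The yield sits between "Yield:" and "Prep:" in the second column.
yield2 <- if (grepl("Yield:.*Prep:", time2)) {
  trimws(sub(".*Yield:(.*)Prep:.*", "\\1", time2))
} else {
  NA_character_
}
```

Both example strings give a total of 25 minutes, and `yield2` is “about 5 dozen”. A string with no matching parts sums to 0, so those rows should probably be set to NA afterwards.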

Taste of Lizzy T • Dataset (.csv)

35 columns x 140 rows

     rating        ratingnum         servings     totalingredients
 Min.   :4.140   Min.   :  1.00   Min.   : 8.00   Min.   : 3.00   
 1st Qu.:4.640   1st Qu.:  4.00   1st Qu.:23.50   1st Qu.: 8.00   
 Median :4.720   Median : 11.00   Median :25.50   Median :11.00   
 Mean   :4.745   Mean   : 42.46   Mean   :29.21   Mean   :10.81   
 3rd Qu.:4.875   3rd Qu.: 35.25   3rd Qu.:36.00   3rd Qu.:13.00   
 Max.   :5.000   Max.   :531.00   Max.   :72.00   Max.   :24.00   
 NA's   :4       NA's   :4

Tastes Better from Scratch • Dataset (.csv)

34 columns x 69 rows

     rating       ratingnum    numberofsteps    totalingredients
 Min.   :3.80   Min.   :   1   Min.   : 3.000   Min.   : 3.00   
 1st Qu.:4.83   1st Qu.:  18   1st Qu.: 6.000   1st Qu.:10.00   
 Median :4.93   Median :  52   Median : 7.000   Median :12.00   
 Mean   :4.87   Mean   : 126   Mean   : 7.638   Mean   :11.67   
 3rd Qu.:4.99   3rd Qu.: 128   3rd Qu.: 9.000   3rd Qu.:13.00   
 Max.   :5.00   Max.   :1389   Max.   :22.000   Max.   :19.00

[1] "Unique in course"


             Dessert              Holiday Kid Friendly Recipes 
                  64                    3                    2

Should all be cookies. Scraped from cookie category: https://tastesbetterfromscratch.com/category/dessert/cookies

Tasting Table • Dataset (.csv)

45 columns x 110 rows

     rating        ratingnum       numberofsteps   actualsteps       
 Min.   :4.900   Min.   :   1.00   Min.   : 4.00   Length:110        
 1st Qu.:5.000   1st Qu.:  28.00   1st Qu.: 8.00   Class :character  
 Median :5.000   Median :  36.00   Median :11.00   Mode  :character  
 Mean   :4.984   Mean   :  69.06   Mean   :11.37                     
 3rd Qu.:5.000   3rd Qu.:  53.00   3rd Qu.:13.75                     
 Max.   :5.000   Max.   :1729.00   Max.   :36.00

Veena Azmanov • Dataset (.csv)

54 columns x 126 rows

     rating        ratingnum         servings     numberofsteps   
 Min.   :4.000   Min.   :  1.00   Min.   : 4.00   Min.   : 5.000  
 1st Qu.:5.000   1st Qu.:  5.00   1st Qu.:16.00   1st Qu.: 7.000  
 Median :5.000   Median :  9.00   Median :22.00   Median : 9.000  
 Mean   :4.986   Mean   : 17.12   Mean   :20.72   Mean   : 9.198  
 3rd Qu.:5.000   3rd Qu.: 18.00   3rd Qu.:24.00   3rd Qu.:10.000  
 Max.   :5.000   Max.   :178.00   Max.   :40.00   Max.   :26.000  
 NA's   :12      NA's   :12                                       
 totalingredients
 Min.   : 4.00   
 1st Qu.: 9.00   
 Median :11.00   
 Mean   :10.94   
 3rd Qu.:12.00   
 Max.   :39.00


                  American         American, Austrian 
                        65                          1 
American, Austrian, French         American, European 
                         2                          1 
American, European, French           American, French 
                         1                          2 
          American, German          American, Italian 
                         2                          1 
          American, Jewish          American, Mexican 
                         1                          1 
  American, Middle Eastern                   Austrian 
                         2                          1 
     British, European, UK                   European 
                         1                         14 
        European, Scottish                     French 
                         1                          8 
           French, Italian                     Indian 
                         1                          1 
             International                    Israeli 
                         8                          3 
                   Italian     Mediterranean, Spanish 
                         2                          2 
            Middle Eastern                   Scottish 
                         3                          2


                       bars, Dessert          Bread, Breakfast, Side Dish 
                                   2                                    3 
                           Breakfast    Breakfast, brunch, Cookies, Snack 
                                   2                                    1 
         Breakfast, brunch, High Tea   Breakfast, brunch, High Tea, Snack 
                                  15                                    1 
           Breakfast, brunch, Snacks Breakfast, Cookies, Desserts, Snacks 
                                   2                                    1 
       Breakfast, Desserts, High Tea                  Breakfast, High Tea 
                                   2                                   15 
          Breakfast, High Tea, Snack             Cake Decorating, Dessert 
                                   5                                    1 
                             Cookies                     Cookies, Dessert 
                                   2                                    7 
          Cookies, Dessert, Desserts Cookies, Dessert, Desserts, High Tea 
                                  28                                    1 
   Cookies, Dessert, Desserts, Snack                    Cookies, Desserts 
                                   1                                    1 
                             Dessert                    Dessert, Desserts 
                                  18                                    6 
                   Dessert, High Tea                       Dessert, Snack 
                                   1                                    4 
                            Desserts                                Snack 
                                   5                                    1 
                              Snacks 
                                   1

Should all be cookies. Scraped from cookie category: https://veenaazmanov.com/homemade-cookies
