Coffee Curiosity

INFO 526 - Summer 2024 - Final Project

Evaluation of data recovered from coffee taste test
Author: Cristina Lafuente
Affiliation: School of Information, University of Arizona

Abstract

This study examines how participants in a coffee taste test experienced the flavors of the coffees they sampled. Participants voluntarily recorded their demographics along with information about up to four coffees, labeled A, B, C, and D. For each coffee, a participant could rate its acidity and bitterness, give it a personal preference score, and finally choose an overall favorite if they had one. Among other findings, the responses show that people often rate the acidity and bitterness of the same coffee differently.

To understand how people actually taste coffee, it is necessary to understand what contributes to coffee’s flavor. In their paper, Acids in coffee: A review of sensory measurements and meta-analysis of chemical composition, Yeager et al. describe how the variety of acids in coffee beans contributes to the flavor profiles not only of different roasts but also of different species of beans and different brewing methods. They report that acids make up 5 to 11% of the mass of each bean and are the most significant contributor to the perceived quality and flavor of coffee, with even small changes having a large impact on the flavor profile. The acids fall into two main categories: organic acids (OAs) and chlorogenic acids (CGAs). Thirty-eight organic acids have been identified in coffee beans, with sucrose as the precursor in the green (unroasted) bean. During roasting, chemical reactions increase the organic acids, so darker roasts have higher organic acid levels and lower sucrose levels. More than 30 CGAs, which are divided into several subcategories, have been identified as well. In the raw bean, CGAs contribute to the acidity of coffee, but during roasting they break down and contribute to its bitter flavor.

As noted by Yeager et al., there are no universally recognized thresholds for assigning a roast to a category. This complicates efforts to determine how people experience different roasts and flavors, on top of individual differences in how people quantify flavor and in their palates.

Introduction

The data

There are three datasets from two sources. The Coffee Taste Test Data was compiled by data blogger Robert McKeon Aloe after the “Great American Coffee Taste Test” on YouTube, in which viewers filled out a survey about four coffees they had tasted. Participation was voluntary, and respondents were not required to answer every question or try every coffee, so there are many NA entries. Even so, the survey collected a large amount of data on both the viewers’ tastes and their demographics, which makes this a rich dataset. The columns contained within this dataset are:

 [1] "submission_id"                "age"                         
 [3] "cups"                         "where_drink"                 
 [5] "brew"                         "brew_other"                  
 [7] "purchase"                     "purchase_other"              
 [9] "favorite"                     "favorite_specify"            
[11] "additions"                    "additions_other"             
[13] "dairy"                        "sweetener"                   
[15] "style"                        "strength"                    
[17] "roast_level"                  "caffeine"                    
[19] "expertise"                    "coffee_a_bitterness"         
[21] "coffee_a_acidity"             "coffee_a_personal_preference"
[23] "coffee_a_notes"               "coffee_b_bitterness"         
[25] "coffee_b_acidity"             "coffee_b_personal_preference"
[27] "coffee_b_notes"               "coffee_c_bitterness"         
[29] "coffee_c_acidity"             "coffee_c_personal_preference"
[31] "coffee_c_notes"               "coffee_d_bitterness"         
[33] "coffee_d_acidity"             "coffee_d_personal_preference"
[35] "coffee_d_notes"               "prefer_abc"                  
[37] "prefer_ad"                    "prefer_overall"              
[39] "wfh"                          "total_spend"                 
[41] "why_drink"                    "why_drink_other"             
[43] "taste"                        "know_source"                 
[45] "most_paid"                    "most_willing"                
[47] "value_cafe"                   "spent_equipment"             
[49] "value_equipment"              "gender"                      
[51] "gender_specify"               "education_level"             
[53] "ethnicity_race"               "ethnicity_race_specify"      
[55] "employment_status"            "number_children"             
[57] "political_affiliation"       

The next two datasets are the chlorogenic acids and organic acids by coffee species, roast, and extraction type from Yeager et al. These are very large datasets containing far more information than is needed to interpret the taste test data, so they need to be cleaned up. The columns contained within the Organic Acids Data:

 [1] "Source"         "Type"           "Roast"          "Extraction"    
 [5] "Stat"           "Other"          "Units"          "Citric"        
 [9] "Formic"         "Malic"          "Pyruvic"        "Quinic"        
[13] "Succinic"       "Acetic"         "Oxalic"         "Fumaric"       
[17] "Tartaric"       "Lactic"         "Glycolic"       "Nitric"        
[21] "Mesaconic"      "Maleic"         "Isocitric"      "Citraconic"    
[25] "Propionic"      "2-Furoic"       "Pyroglutamic"   "Phosphoric"    
[29] "Levulinic Acid" "Methylsuccinic" "Nicotinic"      "Ascorbic"      
[33] "Hydroxybenzoic" "Total"          "Notes"         

The columns contained within the Chlorogenic Acids Data:

 [1] "Source"                    "Type"                     
 [3] "Roast"                     "Extraction"               
 [5] "Stats"                     "Other"                    
 [7] "Units"                     "total CQA"                
 [9] "Total FQA"                 "Total diCQA"              
[11] "3-CQA"                     "4-CQA"                    
[13] "5-CQA"                     "3-FQA"                    
[15] "5-pCoQA"                   "5-FQA"                    
[17] "Ferulic Acid"              "4-FQA"                    
[19] "3,5-diCQA"                 "3,4-diCQA"                
[21] "4,5-DiCQA"                 "3-pCo,5-CQA"              
[23] "3-C,4-FQA"                 "3-C,5-FQA"                
[25] "4,5-FQA"                   "3-C,4-FQA and 3-pCo,4-CQA"
[27] "3-C,5-DQA"                 "3-C,4-DQA"                
[29] "3-D,5-FQA"                 "Nicotinic Acid"           
[31] "3-CGA"                     "Total CGA"                
[33] "Notes"                    

Basic information

After a fair amount of cleaning, the data relevant to assessing the coffees can be summarized by roast. The important points to note are: the level of CGAs (the lower it gets, the more likely the coffee is to be bitter); the total acids remaining, which contribute to the overall acidic flavor of the coffee; and the standard deviations of the organic acids and the chlorogenic acids. Because the study by Yeager et al. looked at so many variables, and some of that information was not available from the taste test, it is important to consider how much variability there could be within each roast’s flavor profile.

Roast    OA         CGA        Std Dev OA   Std Dev CGA   Total Acids
Light    29298.43   69270.78   15612.57     56028.45      98569.21
Medium   48066.11   23530.19   21390.58     25359.47      71596.29
Dark     57830.36   12032.85   29563.87     13215.74      69863.21
Green    13837.22   79606.8    13911.78     51092.03      93444.02
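
A per-roast summary of this kind can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used for the project: the function name, the record layout, and the toy values are all hypothetical, and the sample standard deviation is assumed. The total is taken as the mean OA plus the mean CGA, which is consistent with how the totals in the table above add up.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy measurements (NOT the Yeager et al. values):
# (roast, total organic acids, total chlorogenic acids) per sample.
samples = [
    ("Light", 100.0, 900.0),
    ("Light", 300.0, 700.0),
    ("Dark", 500.0, 150.0),
    ("Dark", 700.0, 50.0),
]

def summarize_by_roast(rows):
    """Return {roast: summary dict} with means, sample SDs and total acids."""
    grouped = defaultdict(lambda: {"oa": [], "cga": []})
    for roast, oa, cga in rows:
        grouped[roast]["oa"].append(oa)
        grouped[roast]["cga"].append(cga)

    summary = {}
    for roast, values in grouped.items():
        oa_mean = mean(values["oa"])
        cga_mean = mean(values["cga"])
        summary[roast] = {
            "oa_mean": oa_mean,
            "cga_mean": cga_mean,
            "oa_sd": stdev(values["oa"]),    # sample standard deviation (assumed)
            "cga_sd": stdev(values["cga"]),
            # Total acids taken as the sum of the two means.
            "total_acids": oa_mean + cga_mean,
        }
    return summary

summary = summarize_by_roast(samples)
for roast, stats in summary.items():
    print(roast, {name: round(value, 2) for name, value in stats.items()})
```

Each roast needs at least two samples for the standard deviation to be defined.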

Questions

Question 1

Do people experience and quantify coffee’s acidity and bitterness in a way that is consistent with objective data? Do their coffee preferences affect their perception?

The idea

People all have different likes and dislikes so it seemed interesting to examine the idea of whether people tasted similar things when they sipped their coffee. Does each person’s experience of acidity and bitterness have something in common with each other person’s? Taking that a step further, do the flavor profiles of the coffee a person prefers reflect on a person’s ability to “correctly” understand or interpret acidity or bitterness in coffee?

The method

Coffee’s acidity and bitterness (OA and CGA levels) can vary within a species and roast depending on a number of factors that were not reported in the taste test (extraction method, exact roast level, species), which leads to some uncertainty about the expected results. However, the tasters were asked to rate each coffee on a scale from 1 to 5, and the flavor profiles for specific roasts are generally known (within the standard deviation). I therefore defined “correctness” by selecting the two ratings (bars) the tasters would be expected to choose based on the average acid profile of that roast and its standard deviation. Then, to determine which preference group was correct most often, I summed the correct answers for acidity and for bitterness and calculated a cumulative percent correct for each. I also thought it was interesting to note the people who chose either not to try a particular coffee or not to rate it, so I included the “no answer” responses. Often, those same people chose no favorite but did weigh in on some of the other coffees (rows consisting entirely of no answers were removed).
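
The scoring idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project’s actual code: the expected rating pairs, the record layout, and the function name are hypothetical placeholders. Skipped answers are left out of the denominator here; how the study treats them in its percentages is not stated, so that choice is an assumption too.

```python
from collections import defaultdict

# Hypothetical expected ratings: for each coffee and attribute, the two
# adjacent 1-5 scale values treated as "correct". Placeholders only,
# not the values used in the study.
EXPECTED = {
    ("A", "acidity"): {3, 4},
    ("A", "bitterness"): {2, 3},
    ("D", "acidity"): {4, 5},
    ("D", "bitterness"): {1, 2},
}

# Hypothetical responses: (preference group, coffee, attribute, rating).
# A rating of None stands for a skipped answer.
responses = [
    ("Light", "A", "acidity", 4),
    ("Light", "A", "bitterness", 5),
    ("Light", "D", "acidity", 5),
    ("Dark", "A", "acidity", 1),
    ("Dark", "D", "bitterness", 2),
    ("Dark", "D", "acidity", None),
]

def percent_correct(rows, expected):
    """Percent of answered ratings inside the expected pair, per (group, attribute)."""
    hits = defaultdict(int)
    answered = defaultdict(int)
    for group, coffee, attribute, rating in rows:
        if rating is None:
            continue  # assumption: skipped answers count as neither right nor wrong
        key = (group, attribute)
        answered[key] += 1
        if rating in expected[(coffee, attribute)]:
            hits[key] += 1
    return {key: 100.0 * hits[key] / answered[key] for key in answered}

scores = percent_correct(responses, EXPECTED)
for (group, attribute), pct in sorted(scores.items()):
    print(f"{group:6s} {attribute:11s} {pct:.1f}% correct")
```

Using a set of two accepted ratings per coffee keeps the tolerance for variation within a roast explicit and easy to adjust.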

Discussion

When it comes to accurately judging a coffee’s acidity, people of all coffee preferences performed about the same: just under two-thirds were correct in their assessments. When it comes to judging a coffee’s bitterness, those who prefer the lighter roasts (green and light) were slightly more accurate. However, the dark roast coffee sampled by the tasters also appears to have been unexpectedly light, which could have some impact on the data reported here.

An interesting future study would measure the actual OAs and CGAs of several blends and roasts and then repeat the experiment for increased accuracy. Overall, the tasters’ assessments seem not too far off the mark given how much variation there can be between different people’s tastes.

Question 2

Does political affiliation play any part in coffee preference?

The idea

This is a straightforward enough question, but there are quite a few studies which show that voting preference has much to do with how people’s brains work. Beyond being an entertaining question, it seemed possible that whatever drives us to vote one way or another might also drive us to choose a coffee that tastes one way or another. Do people who want to be angry all the time want to drink a more bitter brew?

The method

In this case, all that was required was to take the respondents’ word for which political party they belong to and which coffee they prefer, and to create a proportion matrix to get exact proportions. A mosaic plot renders a lovely visual.

               political_affiliation Democrat Independent No affiliation Republican
prefer_overall                                                                     
Coffee A                                 0.22        0.23           0.23       0.15
Coffee B                                 0.19        0.20           0.21       0.27
Coffee C                                 0.19        0.21           0.21       0.24
Coffee D                                 0.41        0.35           0.35       0.34
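
Each column in the table above sums to roughly 1, which suggests the proportions were computed within each affiliation. A proportion matrix of that kind can be sketched as follows, using only the Python standard library. The toy responses and the function name are hypothetical, and dropping rows with a missing value is an assumption about how the NA entries were handled.

```python
from collections import Counter

# Hypothetical toy responses: (political_affiliation, prefer_overall).
responses = [
    ("Democrat", "Coffee A"),
    ("Democrat", "Coffee D"),
    ("Democrat", "Coffee D"),
    ("Democrat", "Coffee D"),
    ("Republican", "Coffee B"),
    ("Republican", "Coffee D"),
    ("Republican", None),  # missing preference, dropped below (assumption)
]

def column_proportions(rows):
    """Share of each preferred coffee within each affiliation (columns sum to 1)."""
    complete = [(aff, coffee) for aff, coffee in rows if aff and coffee]
    pair_counts = Counter(complete)
    group_totals = Counter(aff for aff, _ in complete)
    return {
        (coffee, aff): count / group_totals[aff]
        for (aff, coffee), count in pair_counts.items()
    }

props = column_proportions(responses)
for (coffee, aff), share in sorted(props.items()):
    print(f"{coffee:10s} {aff:12s} {share:.2f}")
```

Normalizing within each affiliation makes groups of very different sizes directly comparable.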

Discussion

Coffee D seems to have been the standout of the test, with coffee A earning the honorable mention. Only in the Republican group was there a different second-place coffee: coffee B.

There doesn’t seem to be any real correlation between political party and coffee preference. It might be interesting to repeat the experiment without coffee D to see whether the pattern becomes stronger, with Republicans specifically preferring a different coffee from everyone else. Alternatively, expanding the selection to offer more coffees within those broad categories of roast might give interesting results as well. It is possible that this brand of coffee has one very good coffee and three which seem roughly the same.