Whiskey Sample Jar Experiment - Getting to the Bottom of the Sandalwood "Scandal"



I have been meaning to run this experiment for some time now. I had the unfortunate experience of having a very generous flight of Four Roses bourbon shared with me via 2oz sample jars that didn't go as planned. I went into a March Madness bracket of all 10 recipes eager and excited to learn about the nuances between each recipe, only to find they all tasted really off... Every time I tried a new one there were varying levels of one defining characteristic: sandalwood. A fine note on its own in certain doses, and one that I've found in delicious American and Indian single malt whiskeys, but here it was just... blegh. It became borderline undrinkable as I found more and more of it. It was a haunting note that I only really know about because of how much of it I tasted that fateful day.


Needless to say, I never published that tasting bracket until now, so that you can see the history of what led me to dig in on this. The affected samples seem to be limited to the ones I received; if anyone else tried this flight, I'm not sure yours were affected as much as mine. Perhaps it was a little extra time in my hot car, or something else I flubbed with proper whiskey storage (sunlight?), but I am really trying to make sure we understand the inputs that cause the outputs - hence this study - so bear with me as we nerd out a little, because I think this is really important to understand.


Alan Starr has graciously given me some more of the sample jars that he used previously so I could perform this experiment. The fact that the whiskey spoiled in his jars seems to have had absolutely no relation to anything he could have done differently. I had a long conversation with him about many different factors, and it just seems like there may be bad batches of sample jars out there... or at least that is the hypothesis that this experiment is meant to test. We'll see! Any negative connotations towards any samples that are labeled with "Alan" certainly don't reflect any animosity to him as a human - he's a great guy!


With that out of the way... Let's get into the test. Fellow whiskey nerds - I want to see your thoughts in the comments on this one when we get to the bottom of this.


 

Whiskey Used: Wild Turkey 101 - 1.75L

Company on Label: Wild Turkey

Whiskey Type: Bourbon

Mash Bill Percentages: 75% Corn, 13% Rye, and 12% Malted Barley

Proof: 101°

Age: NAS (likely ~6 years)

Further identification: This is the new label Wild Turkey 101 that I purchased in 2022 and opened fresh for this experiment

 

Experiment set up


Equipment used:

  • 1.75L of Wild Turkey 101

  • Alan's 2oz polycone cap sample jars

  • AmongstTheWhiskey's 2oz polycone cap sample jars

  • AmongstTheWhiskey's 2oz paper cap sample jars

  • AmongstTheWhiskey's 1oz paper cap sample jars

  • Glad Cling'n Seal clear food wrap

  • Cardboard box for simulated storage conditions

Test cases:

  • Control - A pour out of the 1.75L Wild Turkey 101 bottle stored in a cool, dark place (my whiskey study)

  • Jar source (Alan vs. Amongst)

  • Cap style (Polycone vs. Paper)

  • Jar volume (2oz vs. 1oz)

  • Storage orientation (Right side up vs. Upside down vs. Side)

  • Cap masking (Masked with cling wrap vs. Native)


Environmental exposure:

  • To simulate a shipping scenario, the samples spent their first 2 days in a sealed cardboard box in a warm shed with no direct sunlight (August weather in Massachusetts, daily highs in the 90s °F)

  • The samples were then stored in my house for an additional 28 days

  • Sample jars will be inspected for leakage

  • Caps will be removed at the same time and inspected for damage

  • Samples will be poured into clean glencairn glasses to a consistent volume

Randomization and Anti-Bias:

  • All samples will be poured by a non-participant and labeled with a corresponding letter that will be used to correlate back to the testing condition

  • Taster(s) will only see a letter and will not know which test condition the letter relates to at time of tasting
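For anyone repeating the study, the blinding step above could be sketched like this. The post does not say how the letters were actually assigned, so the function name `assign_blind_labels` and the seed are purely hypothetical; the idea is just that the pourer keeps the key and the tasters only ever see letters.

```python
import random
import string


def assign_blind_labels(conditions, seed=None):
    """Shuffle the test conditions and pair each with a letter (A, B, C, ...).

    Hypothetical helper: the pourer keeps the returned key private until
    every taster has written down their scores.
    """
    rng = random.Random(seed)   # a fixed seed only makes the key reproducible
    shuffled = list(conditions)
    rng.shuffle(shuffled)
    return dict(zip(string.ascii_uppercase, shuffled))


# The 12 jar conditions used in this experiment (source / cap / orientation / masking).
conditions = [
    "Alan / Polycone / Side / Native",
    "Amongst / Polycone / Side / Masked",
    "Amongst / 2oz Paper / Side / Native",
    "Alan / Polycone / Upside-Down / Native",
    "Alan / Polycone / Upside-Down / Masked",
    "Amongst / 1oz Paper / Upside-Down / Native",
    "Amongst / 1oz Paper / Upright / Native",
    "Alan / Polycone / Upright / Native",
    "Alan / Polycone / Side / Masked",
    "Amongst / 2oz Paper / Side / Masked",
    "Amongst / Polycone / Side / Native",
    "Alan / Polycone / Upright / Masked",
]

key = assign_blind_labels(conditions, seed=2022)

# Tasters get only the letters; the mapping stays with the non-participant.
print("Labels for the glasses:", ", ".join(key))
```

Because the shuffle is random, the letter-to-condition mapping from this sketch will not match the letters used in the results below.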


 

Results


Sample Ranking Conditions:

  • Presence of Sandalwood Detected? (Yes/No)

  • Intensity of Sandalwood (1-5, 5 being the highest)

  • Enjoyable? (Yes/No)

  • Rating (1-5, reference table here)

Leakage Inspection:

  • No evidence of any leakage

Cap Damage Inspection:

  • No evidence of any cap damage

Taster 1 Results (presented with test conditions revealed for clarity):

Sample                                        | Sandalwood Detected? (Yes / No) | Intensity of Sandalwood (1 - 5) | Enjoyable? (Yes / No) | Rating (1 - 5)
----------------------------------------------+---------------------------------+---------------------------------+-----------------------+---------------
Control: From the original bottle             | No                              | N/A                             | Yes                   | 4
A: Alan / Polycone / Side / Native            | No                              | N/A                             | Yes                   | 3.5
B: Amongst / Polycone / Side / Masked         | No                              | N/A                             | Yes                   | 4
C: Amongst / 2oz Paper / Side / Native        | No                              | N/A                             | Yes                   | 3.4
D: Alan / Polycone / Upside-Down / Native     | Yes                             | 1                               | No                    | 2.7
E: Alan / Polycone / Upside-Down / Masked     | No                              | N/A                             | Yes                   | 3.8
F: Amongst / 1oz Paper / Upside-Down / Native | No                              | N/A                             | Yes                   | 3.5
G: Amongst / 1oz Paper / Upright / Native     | No                              | N/A                             | Yes                   | 3.8
H: Alan / Polycone / Upright / Native         | No                              | N/A                             | Yes                   | 3.4
I: Alan / Polycone / Side / Masked            | No                              | N/A                             | Yes                   | 4
J: Amongst / 2oz Paper / Side / Masked        | No                              | N/A                             | Yes                   | 3.4
K: Amongst / Polycone / Side / Native         | No                              | N/A                             | Yes                   | 4
L: Alan / Polycone / Upright / Masked         | No                              | N/A                             | Yes                   | 3.8

Taster 2 Results (presented with test conditions revealed for clarity):

Sample                                        | Sandalwood Detected? (Yes / No) | Intensity of Sandalwood (1 - 5) | Enjoyable? (Yes / No) | Rating (1 - 5)
----------------------------------------------+---------------------------------+---------------------------------+-----------------------+---------------
Control: From the original bottle             | No                              | N/A                             | Yes                   | 4.1
A: Alan / Polycone / Side / Native            | No                              | N/A                             | Yes                   | 3.4
B: Amongst / Polycone / Side / Masked         | No                              | N/A                             | Yes                   | 4.1
C: Amongst / 2oz Paper / Side / Native        | No                              | N/A                             | Yes                   | 3.1
D: Alan / Polycone / Upside-Down / Native     | Yes                             | 2                               | No                    | 2.3
E: Alan / Polycone / Upside-Down / Masked     | No                              | N/A                             | Yes                   | 3.8
F: Amongst / 1oz Paper / Upside-Down / Native | No                              | N/A                             | No                    | 2.9
G: Amongst / 1oz Paper / Upright / Native     | No                              | N/A                             | Yes                   | 3.9
H: Alan / Polycone / Upright / Native         | No                              | N/A                             | Yes                   | 3.2
I: Alan / Polycone / Side / Masked            | No                              | N/A                             | Yes                   | 4.1
J: Amongst / 2oz Paper / Side / Masked        | No                              | N/A                             | Yes                   | 3.3
K: Amongst / Polycone / Side / Native         | No                              | N/A                             | Yes                   | 4.1
L: Alan / Polycone / Upright / Masked         | No                              | N/A                             | Yes                   | 3.7


 

Conclusions


I will run these numbers through a few Minitab functions that will help us visualize the data and properly tell if there is any significant effect at work here. The first I will use is the main effects plot.


Interpreting the main effects plot is pretty easy - when blue lines cross the horizontal dotted line (the overall average) there is usually something significant about the input. Each box is one category of input, labeled at the top, with the levels of that input listed along the bottom. The height of each point is the average rating for that level - the bigger the difference in height between levels, the bigger that input's impact on the output (quality of taste).


You'll see there was definitely no significant difference between tasters (thank you for the help Brendan!).


The control source was the best tasting whiskey by far, with Amongst jars averaging just higher than Alan jars (there could be some interactions with other test cases though, we will get into that later).


Cap style seems inconclusive other than the fact that all jarred samples ranked lower on average than the control.


Masking shows a very significant improvement in quality and actually comes quite close to the control group.


Storage orientation definitely matters as this shows the largest swing in total score spread with upside down storage significantly lowering tasting scores. Side and upright were closer to the control, but there were still effects.


Bottle size does not seem to be significant.
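The averages behind the main effects plot can also be approximated straight from the two results tables. The sketch below is not the Minitab analysis itself, just plain per-level means in Python; the factor names and the `main_effects` helper are assumptions made for illustration, and the control pour is kept separate because it has no jar factors.

```python
from collections import defaultdict
from statistics import mean

# letter: (source, cap, volume, orientation, masking, taster 1, taster 2),
# transcribed from the results tables above. Polycone jars are the 2oz jars
# from the equipment list.
SAMPLES = {
    "A": ("Alan", "Polycone", "2oz", "Side", "Native", 3.5, 3.4),
    "B": ("Amongst", "Polycone", "2oz", "Side", "Masked", 4.0, 4.1),
    "C": ("Amongst", "Paper", "2oz", "Side", "Native", 3.4, 3.1),
    "D": ("Alan", "Polycone", "2oz", "Upside-Down", "Native", 2.7, 2.3),
    "E": ("Alan", "Polycone", "2oz", "Upside-Down", "Masked", 3.8, 3.8),
    "F": ("Amongst", "Paper", "1oz", "Upside-Down", "Native", 3.5, 2.9),
    "G": ("Amongst", "Paper", "1oz", "Upright", "Native", 3.8, 3.9),
    "H": ("Alan", "Polycone", "2oz", "Upright", "Native", 3.4, 3.2),
    "I": ("Alan", "Polycone", "2oz", "Side", "Masked", 4.0, 4.1),
    "J": ("Amongst", "Paper", "2oz", "Side", "Masked", 3.4, 3.3),
    "K": ("Amongst", "Polycone", "2oz", "Side", "Native", 4.0, 4.1),
    "L": ("Alan", "Polycone", "2oz", "Upright", "Masked", 3.8, 3.7),
}
FACTORS = ("source", "cap", "volume", "orientation", "masking")
CONTROL_MEAN = mean([4.0, 4.1])  # control pour, both tasters


def main_effects(samples):
    """Average rating for every level of every factor, pooling both tasters."""
    groups = defaultdict(list)
    for *levels, t1, t2 in samples.values():
        for factor, level in zip(FACTORS, levels):
            groups[(factor, level)] += [t1, t2]
        groups[("taster", "Taster 1")].append(t1)
        groups[("taster", "Taster 2")].append(t2)
    return {key: mean(values) for key, values in groups.items()}


effects = main_effects(SAMPLES)
print(f"control mean: {CONTROL_MEAN:.2f}")
for (factor, level), avg in sorted(effects.items()):
    print(f"{factor:12s}{level:12s}{avg:.2f}")
```

On these numbers, masked jars average about 3.80 against about 3.37 for native jars (control: 4.05), and upside-down storage averages about 3.17 against roughly 3.6 to 3.7 for upright and side storage. These are descriptive averages only; they say nothing about statistical significance on their own.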


 

The next logical step is to make sure none of the inputs were dependent on each other - this will help us determine which inputs are truly the big drivers of the change in output. Sometimes an input will look significant on a main effects plot but that might just be because of how the test samples are set up (i.e. all of Alan's jars are polycones - he didn't provide me with any paper caps so that group couldn't be tested independent of Sample Jar Source in this experiment). To investigate this we look at the interaction plot for the data set.


Here we are looking for lines that cross each other and lines that are parallel. When lines run parallel that means there is no interaction between the inputs - when they cross each other it means that one of the inputs is having an effect on the other input. Where data does not show lines (and just points) it means there weren't enough of each of the other inputs that went into this group. This is where things get tricky to interpret but I'll do my best to explain. I'll read the inputs from the left to right.


Tasters did not have a bias or effect on any of the inputs - this makes perfect sense because we were both blinded to the inputs & have generally good palates for tasting nuances in whiskey. The lines are so close / so parallel you can't even see the differences.


Sample Jar Source didn't have enough relationships to other groups (remember I mentioned Alan didn't provide paper cap jars) to show any interaction effects with cap style. Sample jar source does seem to have a strong interaction (crossed lines) when plotted against Masked vs Native samples - suggesting that the source of the jar can be overshadowed by the effect of masking (which we saw was a larger mean change on the main effects plot). We see that masking only really helped Alan samples and not Amongst samples for whatever reason. There don't seem to be any other interactions beyond that.


Cap style seems to also have interactions with masking and storage orientation. We see that polycone samples were significantly improved by masking where paper samples stayed flat. Really interesting here. Storage orientation interacting with cap style doesn't show any consistent trends suggesting another factor is also interacting beyond these two.


Masking kept all storage orientations very close to the control (no interactions). Unmasked samples suffered when stored upside down.


Bottle size played no significant role in the study. 1oz jars behave like 2oz jars which again suggests that something with the caps, the jar source, or the storage conditions plays a role.
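The interaction readings above boil down to average ratings for each pair of factor levels, and those can be tabulated directly. This is a rough sketch rather than the Minitab interaction plot: the `cell_means` helper and field names are assumptions, and combinations that were never tested (such as Alan jars with paper caps) show up as empty cells.

```python
from collections import defaultdict
from itertools import product
from statistics import mean

FIELDS = ("source", "cap", "orientation", "masking")
# letter: (source, cap, orientation, masking, taster 1, taster 2),
# transcribed from the results tables; the control pour is excluded.
RAW = {
    "A": ("Alan", "Polycone", "Side", "Native", 3.5, 3.4),
    "B": ("Amongst", "Polycone", "Side", "Masked", 4.0, 4.1),
    "C": ("Amongst", "Paper", "Side", "Native", 3.4, 3.1),
    "D": ("Alan", "Polycone", "Upside-Down", "Native", 2.7, 2.3),
    "E": ("Alan", "Polycone", "Upside-Down", "Masked", 3.8, 3.8),
    "F": ("Amongst", "Paper", "Upside-Down", "Native", 3.5, 2.9),
    "G": ("Amongst", "Paper", "Upright", "Native", 3.8, 3.9),
    "H": ("Alan", "Polycone", "Upright", "Native", 3.4, 3.2),
    "I": ("Alan", "Polycone", "Side", "Masked", 4.0, 4.1),
    "J": ("Amongst", "Paper", "Side", "Masked", 3.4, 3.3),
    "K": ("Amongst", "Polycone", "Side", "Native", 4.0, 4.1),
    "L": ("Alan", "Polycone", "Upright", "Masked", 3.8, 3.7),
}
ROWS = [dict(zip(FIELDS, v[:4]), ratings=v[4:]) for v in RAW.values()]


def cell_means(rows, factor_a, factor_b):
    """Return {(level_a, level_b): (mean rating or None, number of ratings)}."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[factor_a], row[factor_b])].extend(row["ratings"])
    levels_a = sorted({row[factor_a] for row in rows})
    levels_b = sorted({row[factor_b] for row in rows})
    return {
        (a, b): (mean(cells[(a, b)]) if cells[(a, b)] else None, len(cells[(a, b)]))
        for a, b in product(levels_a, levels_b)
    }


src_mask = cell_means(ROWS, "source", "masking")
src_cap = cell_means(ROWS, "source", "cap")
mask_orient = cell_means(ROWS, "masking", "orientation")

for name, table in [("source x masking", src_mask),
                    ("source x cap", src_cap),
                    ("masking x orientation", mask_orient)]:
    print(name)
    for (a, b), (avg, n) in table.items():
        shown = "no data" if avg is None else f"{avg:.2f}"
        print(f"  {a:8s} {b:12s} {shown} (n={n})")
```

On these numbers, masking lifts Alan jars from about 3.08 to about 3.87 while Amongst jars only move from about 3.59 to 3.70, and native upside-down jars average 2.85. Several cells rest on just two or four ratings, so treat the differences as hints rather than proof.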


 

Final Thoughts


I'm glad to see we were able to reproduce at least one sample that did have that sandalwood flavor. Using the same sample jars, we repeated the conditions behind what I tasted in the Four Roses flight (where the note was stronger). Here is what I'm taking away from this study:

  1. Whiskey changes by being stored in a sample jar

  2. Certain brands of sample jars produce worse effects

  3. Masking the cap with cling wrap seems to greatly reduce the magnitude of the change if you suspect your polycone jar source to be affected by point 2

  4. Store jars upright if you can, especially paper cap jars

  5. If you had a bad time with a whiskey sample in the past: revisit that brand - it may not have been the whiskey

  6. Anecdotal addition: higher proof whiskeys seem to have a larger effect - I have the most vile level of sandalwood on an otherwise really delicious Hirsch Selected Whiskey (thank you Eric Gilbert for that sample!)


 

A few links to the known 'bad' sample jar sources that have produced this sandalwood note are listed below if you want to reference what you have bought or just want to know what to avoid in the future.


Alan's jars:

EBAY - 2 OZ 60 ML CLEAR BOSTON ROUND GLASS BOTTLES WITH CONED CAPS


Eric's jars (we couldn't confirm which of the two produced the really bad sandalwood):

Amazon - (Pack of 80) 2 oz. Clear Boston Round with Black Poly Cone Cap

Amazon - Vivaplex, 12, Clear, 2 oz Glass Bottles, with Lids


 

Thanks for checking out this study! Make sure to share this with your friends - spread it far and wide - so that our whiskey community isn't inadvertently tasting tainted whiskey and drawing bad conclusions on otherwise very good whiskey! I'd love to see if someone could repeat this study with a larger sample size or possibly a higher proof whiskey - I think the signals would be much stronger with something barrel proof which I unfortunately did not go for. I am definitely going to be revisiting how (or even if) I review whiskey samples from here on out.
