#31: Post by samsonpvr »

Amazing how one video can take down a technique, brand, or product. I guess that's why they are called "influencers".

#32: Post by JordanK »

ShotClock wrote:Statistical significance via the p test is a knotty issue in a case like this, at least by my understanding. First, we have to have a model of the distribution of the output variable (tds or whatever) for each case, then we have to infer the parameters from our measurements.

There are two tricky areas - if the guess about the type of model to use is wrong, the results are junk. Also, inferring the model parameters from only 8 samples will make them highly prone to noise.
I doubt very much there is any particular modeling going on here and would assume this is a simple t-test comparing means. Yes, some assumptions are implicit in such a test (like normal distributions), and other factors not being constant (like a variable response by the machine) may invalidate the conclusion. I can't speculate on that.
However, the fact that there are 8 rather than, say, 1000 data points is not at all an issue. Any basic t-test takes this into account. So if the test says it is statistically significant, the limited number of data points is built in already to that conclusion. It is the presence of the noise itself which would lead to it NOT being a statistically meaningful difference, even if the numbers being compared were not numerically equal.
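To illustrate the point about sample size being built into the test, here is a minimal simulation sketch. It is not a reconstruction of the analysis in the video. It assumes a pooled two-sample t-test with 8 shots per group, normally distributed noise, and no real difference between the groups; the mean of 20.0 and spread of 0.5 are made-up numbers. The cutoff 2.145 is the usual two-sided 5% critical value for 14 degrees of freedom.

```python
import math
import random
import statistics

# Two-sided 5% critical value of Student's t for df = 8 + 8 - 2 = 14.
# Only valid for n = 8 per group; other sample sizes need a different cutoff.
T_CRIT_DF14 = 2.145


def pooled_t(a, b):
    """Pooled (equal-variance) two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))


def false_positive_rate(n_experiments=4000, seed=0):
    """Fraction of no-difference experiments flagged as significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # Both groups are drawn from the same distribution (hypothetical values).
        a = [rng.gauss(20.0, 0.5) for _ in range(8)]
        b = [rng.gauss(20.0, 0.5) for _ in range(8)]
        if abs(pooled_t(a, b)) > T_CRIT_DF14:
            hits += 1
    return hits / n_experiments


if __name__ == "__main__":
    print(f"false-positive rate, 8 shots per group: {false_positive_rate():.3f}")
```

Under these assumptions the rate should land near 5%, because the cutoff for 14 degrees of freedom is already wider than the large-sample value of 1.96. What a small sample costs is the power to detect a real difference, not control over false alarms.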

#33: Post by cafeIKE »

samsonpvr wrote:Amazing how one video can take down a technique, brand, or product. I guess that's why they are called "influencers".
Alpha waves are lower watching that than sleeping.

Moving pictures have been influencing the masses for more than a century, not always for the better.

#34: Post by Raja »

BaristaBob wrote:I wonder about the "Unicorn" in the room. I question his use of the Unica Pro, a big unknown for most of us, I believe. Is the machine's water debit constant shot to shot, among other things?

Good question - do you think it somehow knows when the blind shaker is being used? :mrgreen:

#35: Post by Yum »

Well said! True, and yet funny.

#36: Post by ShotClock »

My main concern with this study is that it may be an example of the "multiple comparison problem". We are making comparisons between several trials, each of which is relatively small, and any one of them could provide the "surprising result" (e.g. X is the best method for puck prep). The more bites of the apple we take, the higher the likelihood of seeing an outlier by chance. For example, if we have 10 trials looking only at noise (no underlying trend), there is about a 10% chance of at least one outcome with a p-value of 0.01 or better.

In some unscrupulous scientific investigations, researchers can take a large data set, then test as many hypotheses as needed, before finding a statistical "hit".

As another example, if we were to use the "standard" threshold for "statistically significant" of p = 0.05, there is a 33.7% chance of at least one "statistically significant" result in 8 trials even if there were no underlying pattern: 1-(1-0.05)^8 = 0.3366. Lance has published his raw data (link in the YT video description), which is extremely laudable IMO. Certainly evidence of good intentions and good-faith investigation, which I appreciate a lot.
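The arithmetic in the two examples above can be checked with a short sketch. It assumes the comparisons are independent, which real trials sharing a grinder, coffee, and machine may not be. The Bonferroni threshold is a standard correction added here for illustration, not something used in the video or proposed in this thread, and the function names are my own.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one false positive across k independent tests."""
    return 1 - (1 - alpha) ** k


def bonferroni_threshold(alpha, k):
    """Per-test threshold that keeps the overall rate at or below alpha."""
    return alpha / k


if __name__ == "__main__":
    # The p = 0.05 with 8 trials case.
    print(round(familywise_error_rate(0.05, 8), 4))
    # The p = 0.01 with 10 trials case.
    print(round(familywise_error_rate(0.01, 10), 4))
    # Stricter per-test cutoff for 8 comparisons at an overall 5% level.
    print(bonferroni_threshold(0.05, 8))
```

With 8 comparisons, each one would need p below roughly 0.006 rather than 0.05 to keep the overall false-alarm chance at about 5%.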

The gold-standard test for this is whether the result is replicable - I hope there is someone with a few hours to spare and a refractometer who is inclined to check it.

#37: Post by BaristaBob »

Well, I've spent the past week evaluating some of the ideas Lance floated in his video with my refractometer, using one coffee (medium roast), one grinder (MFlat with SSW burrs), one machine (modded BDB), one prep method (grind into KafaTek cup and gently shake side to side, or blind shaker and shake like Weber says to...WDT via Moonraker using two full turns as Weber says to if using blind shaker, then tamp using Decent v3, Force Tamper, or Bravo tamper), and one barista (me). This medium roast is the bomb, very fruity up front, finishing with sweet chocolate.

My results so far really don't differ much from Lance's, except to say the blind shaker is the only tool in my experiments to cause a change in EY, whether positive or negative. The blind shaker moved the EY almost half a percentage point to the positive (22.1% to 22.5%). The number of Moonraker turns (two turns up to 10 turns) or type of tamper does not affect my EY values. I guess the greatest surprise for me is the tamper...Decent base is, I believe, 58.3mm while the Force and Bravo are 58.5mm. change, my taste buds can't discern a 0.5% change in EY.

Just passing along some data points.
Bob "hello darkness my old friend..I've come to drink you once again"

#38: Post by malling »

Yes, something on a sheet can look like a larger difference than it really is. If we are talking about small differences on this scale, there's a good chance it's simply not relevant IRL. We can all chase numbers, but if we can't actually detect the difference, is it really worth the effort and investment?

And while it's fun to dissect the science part of it, I honestly don't bother when the whole thing showed only modest differences. Lots of them were within 0.5-0.75, and I think many would have a hard time detecting that, especially blind.

#39: Post by Capuchin Monk »

BaristaBob wrote:change, my taste buds can't discern a 0.5% change in EY.

Just passing along some data points.
Thank you for the effort. So, it comes down to personal preference in workflow. These kinds of choices exist in other industries too. For example, in home audio electronics the sound difference is measurable but too small for listeners to discern; however, some sellers will exaggerate it for marketing reasons. Lance may not be a seller of those products (or maybe affiliated?) and his video may just be a curiosity filler.

#40: Post by ei8htohms »

Thank you for doing the comparison!
BaristaBob wrote:... the greatest surprise for me is the tamper...Decent base is, I believe 58.3mm while the Force and Bravo are 58.5mm.
What was surprising about the tampers?
