WBC standards don't reflect real world usage - Page 2

King Seven

#11: Post by King Seven »

I think they should have found the worst, cheapest and most unstable machine and then used that. Give us baristas a proper challenge....

mteahan (original poster)

#12: Post by mteahan (original poster) »

I will be giving a technical presentation on European and American standards for espresso machines and processes with Mark Crawford from ESI. We will also be on the floor as exhibitors.

For those that don't know my background:

I started working with espresso machines as a distributor of equipment in Portland in 1987. We designed our own line of espresso carts and ran technical seminars for the US importer of Brasilia espresso machines. Several coffee bars and restaurants later, I left Portland for Los Angeles to eventually become the technical director of Rosito Bisani (Brasilia). I wrote for Fresh Cup Magazine, delivering technical articles on espresso machines for about a year and a half when the debate centered on the size of boilers and how much steam a machine could deliver. I was against the mainstream then, too.

I had a hand in designing many prototypical espresso machines for the U.S., but the heavy lifting was done by the engineers in Italy. I designed and built prototype machines as 'proof of concept' projects and shipped them to Italy for evaluation. Simple ideas like automatic backflush programs came from my discussions with engineers at Brasilia back in the '90s.

Most of my work through 2002 centered on automated espresso machine systems, trying to develop hardware that was reliable and yet still delivered espresso in keeping with traditional machines. It's possible, but difficult. The operator, even in automated machines, is still more important than manufacturers would like to admit.

After nearly 20 years, I work for no particular company and am able to say whatever I wish. My only concern is what is in the cup and I don't have any agenda on how to get there. The reality is that there are several. It is the notion that there exists one machine, or one technology, that is superior to all others (the one ring to rule them all??) that undermines the debate. Old San Marco Levers in Naples make some of the best coffee on the planet. Maybe it's because they hold the cups in boiling water before use. Interesting, considering that most machines in the US don't even have cup warmers.

There are so many factors that play in the equation to make good espresso, and though I have been chest high in the technology for most of my adult life, I still believe it is the art that prevails. I am dismayed by baristas who obsess about the technology instead of perfecting the process with the technology they have. A bad mechanic always blames his tools, and thinks that better tools are the solution.

Long enough? Seems I have to do this every so often.

Michael
Michael Teahan
analogue | coffee

gscace

#13: Post by gscace »

Hi:

I won't say anything specific about the data because the data belongs to the WBC and the respective machine manufacturers. I was not in Oslo, but I did the number crunching. LM won because the machine they submitted performed the best according to the criteria outlined by the standards committee. Temperature data obtained from systematic measurement was part of that.

I don't know where you got your information regarding the temperature stuff, but some of your information is wrong and that is as specific as I'm gonna be on that until WBC and the manufacturers agree to make the results public.

Michael, I'd be very interested in discussing this stuff further with you in Charlotte.

-Greg

mteahan wrote:Which is why Pragil was so pissed. There he was, sportin' the biggest wood and he got DQ'd for the functional violation of taking Viagra.

Tough thing for an Italian.

Speaking of the limit; that machine we put together for Starbucks would out steam any machine on the planet and still sit on a 30 amp circuit. Did anyone care? No. We developed an automatic/traditional hybrid that would brew up to 5 blends of espresso at 720 shots per hour with disposable brewing groups that could be replaced in about 5 minutes without turning off the machine. Did anyone bite? No.

An espresso machine that delivered shots, hot foamed or flat milk from a bar gun. The world's fastest espresso machine. Any interest? No. The Conti Twin Star has had individually temp controlled brew and steam boilers with demand based watt allocation for over a year and a half, on the market no less. Any reviews or interest among the establishment? No. Brasilia had one 15 years ago.

The only hit was an automated machine using titanium conical mills that brewed 5 liters of espresso in 25 seconds. They went to iced coffee producers in Japan. Six PID'ed boilers and three 300 liter/hour pumps. Five years ago.

None of what we think is new actually is; it is merely adapted and tweaked.

Engineers push the envelope all the time and are pretty good at kicking whatever ass they want. The frustration lies in the fact that when they do, nothing comes of it. The problem occurs when trying to build the machine the customer wants. It may not be the best machine, or the most durable or the most practical. It doesn't matter. It's what the customer thinks is important. Even if they are stupid; which is pretty common, but matters little.

The reason why flow rates are important is because the temperature profile is keyed to them. None complained that the test was too hard or too demanding. The word I heard was unrealistic from an extraction viewpoint. They already build machines that deliver 6k shots per day.

This isn't just in the coffee biz. I went skiing this weekend (first time in forever) and there were at least a dozen different shapes and sizes of skis. I'll wager everyone thought theirs was best.

And they are probably all correct.

Michael

mteahan (original poster)

#14: Post by mteahan (original poster) »

Until the information is released, I can only convey what I have been told. I will contact Nuova to see if they will consent to release the data and will try to get a hold of Pragil as well. I have no relationship with anyone else.

Considering that the WBC shopped sponsorships, transparency is in everyone's best interest.

Please understand that I in no way think that the selection was poor--all the machines that were in the testing are good machines. I also know that many factors over and above temperature stability are important. As someone who thinks that temperature profiling is superior to flat line stable brew temperatures, there is an argument to be made against the premise of the WBC standards in the first place.

I am still old school in many ways. Put thirty shots from the same barista and grinder on a table in ten minutes and have judges taste the result without knowing the origin. Make it 40; it doesn't matter. Make them all 20 oz. mochas. Doesn't matter. When the Starbucks technical department disqualified our machine using a PF based temperature device in 1992, they hadn't yet even pulled a shot from the machine. The device indicated a head temp of 237--way too hot to even BE an espresso machine, yet they couldn't explain why the espresso was more consistent and had better developed crema than the machine they had in use, nor could they explain how they were getting spot on shot temperatures from such an 'over-heated' group. I have never been a fan of reverse logic for quality evaluations.

Find a machine that works and figure out why. The presumption of knowing what standard is best only encourages manufacturers to meet the standard, not the result in the cup. When Cimbali, Spaziale, Marzocco and Faema use dramatically different approaches to making espresso--all very good--using any technically based standard is bound to bias one over another.

The Faema, for example, is designed to brew hot initially during pre-infusion, and then drop to a stable extraction temperature. This philosophical approach would on its face disqualify the machine from the WBC standard. Is this the right approach to evaluating machines? I would tend to think not. Cimbali and Faema together have more technical expertise in the design of espresso machines than the rest of the world combined; it would be naive to imply that they don't know what they are doing.

On another note: we have adjustable flow control portafilters that could be modified to accept a temp sensor, allowing you to adjust the flow to mimic whatever flow rate you wish. They are useless to us for the application we had in mind, and we would be happy to send you a couple to see if they could be of use to you.

Let me know,

Michael
Michael Teahan
analogue | coffee

barry

#15: Post by barry »

mteahan wrote:As someone who thinks that temperature profiling is superior to flat line stable brew temperatures, there is an argument to be made against the premise of the WBC standards in the first place.

i agree. the test can remain essentially the same, though, but the analysis needs to be different. i think jim schulman has a statistical method which would be useful in analysing the test data to show which machines have reproducible profiles (no matter what that profile might be).


we need to get you together with al critzer, in charlotte. i think he's off on his own now, pursuing the perfect shot wherever he can.

gscace

#16: Post by gscace »

mteahan wrote:Until the information is released, I can only convey what I have been told. I will contact Nuova to see if they will consent to release the data and will try to get a hold of Pragil as well. I have no relationship with anyone else.

Considering that the WBC shopped sponsorships, transparency is in everyone's best interest.



Please understand that I in no way think that the selection was poor--all the machines that were in the testing are good machines. I also know that many factors over and above temperature stability are important. As someone who thinks that temperature profiling is superior to flat line stable brew temperatures, there is an argument to be made against the premise of the WBC standards in the first place.

I am still old school in many ways. Put thirty shots from the same barista and grinder on a table in ten minutes and have judges taste the result without knowing the origin. Make it 40; it doesn't matter. Make them all 20 oz. mochas. Doesn't matter. When the Starbucks technical department disqualified our machine using a PF based temperature device in 1992, they hadn't yet even pulled a shot from the machine. The device indicated a head temp of 237--way too hot to even BE an espresso machine, yet they couldn't explain why the espresso was more consistent and had better developed crema than the machine they had in use, nor could they explain how they were getting spot on shot temperatures from such an 'over-heated' group. I have never been a fan of reverse logic for quality evaluations.

Find a machine that works and figure out why. The presumption of knowing what standard is best only encourages manufacturers to meet the standard, not the result in the cup. When Cimbali, Spaziale, Marzocco and Faema use dramatically different approaches to making espresso--all very good--using any technically based standard is bound to bias one over another.

The Faema, for example, is designed to brew hot initially during pre-infusion, and then drop to a stable extraction temperature. This philosophical approach would on its face disqualify the machine from the WBC standard. Is this the right approach to evaluating machines? I would tend to think not. Cimbali and Faema together have more technical expertise in the design of espresso machines than the rest of the world combined; it would be naive to imply that they don't know what they are doing.




On another note: we have adjustable flow control portafilters that could be modified to accept a temp sensor, allowing you to adjust the flow to mimic whatever flow rate you wish. They are useless to us for the application we had in mind, and we would be happy to send you a couple to see if they could be of use to you.

Let me know,

Michael
Howdee:
I'm for transparency here as long as the data isn't misinterpreted to the detriment of a manufacturer. That sort of thing will lead to no one ever consenting to submit machines for a runoff in the future.

WRT machines suiting the standard or not, the WBC standard only specifies the method in which the measurements are made and in how the results get interpreted. I know that there is a lot of argument among people on whether or not a flat line is better than a declining profile. I dunno the answer to that one, but I know for certain that the temperature experienced by the coffee varies with position in the cake and elapsed time of the extraction. It is by no means flat and can also be changed by changing the geometry of the basket too. For that matter, the pressure within the cake is only 9 bars on the top surface. I think the WBC's goal was to select a machine that would make good coffee, and whose use required the least arcane knowledge of the specific machine and to that extent I think they didn't do too badly. On the other hand, Jim Schulman has suggested a statistical method by which the reproducibility of machines with inclining or declining temperature profiles could be examined. He knows more about statistics than I do. I'm an engineer with barely enuff statistical knowledge to get into trouble. I'm certainly interested in improving the standard for the next round of selection trials, if the WBC is interested in doing so, and it would be nice to have input from engineers that design espresso machines for these companies. I encourage them to contribute.

WRT your other note - I would be interested in the variable flow pfs. email me at gscace at nist dot gov and I'll send you pertinent details.

-Greg

another_jim
Team HB

#17: Post by another_jim »

barry wrote:i agree. the test can remain essentially the same, though, but the analysis needs to be different. i think jim schulman has a statistical method which would be useful in analysing the test data to show which machines have reproducible profiles (no matter what that profile might be).
(Sorry for the delay on this -- I'm in the middle of a family emergency, and don't have the attention span left to do any serious work)

The basic idea is simple:

1. You have to imagine the data from each of the 10-odd shots as successive rows in a spreadsheet.

2. For simplicity's sake, let's assume each shot has the same number and timing of data points, say second by second. If this isn't the case, you have to interpolate (this is one of the technical hitches).

3. Now take the average temperature of each shot and subtract it from the entire row. In other words, in a perfect straight line profile, you'd get a row of zeros. In a humped profile you'd start and end with negative numbers, and have positive numbers in the middle.

4. Now scan down the columns. If the pattern of each shot is the same, each column would be filled with identical numbers. For instance, a perfect straight line machine will have nothing but zeros. A perfect hump-profile machine would have the same negative numbers at the start and end columns, and the same positive numbers in the middle columns. A deviation measurement on the columns would give you the "profile independent" intra-shot stability.

5. Now compare each shot's average temperature for inter-shot stability.

6. Since you are using the shot averages for inter-shot stability, and the deviations from shot averages for intra-shot stability, you are not committing the stats no-no of reusing the same data. (the two series are orthogonal, with one degree of freedom removed from each shot's readings for the inter-shot comparison)
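
The six steps above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the post: every shot is already on the same time grid (step 2), the "deviation measurement" on each column is taken to be a population standard deviation, and the column spreads are summarized by their mean. The function name `profile_stability` is made up for the example.

```python
from statistics import mean, pstdev


def profile_stability(shots):
    """Return (intra_shot, inter_shot) spread for a list of shots.

    shots: list of rows, one row per shot, each row a list of
    temperatures sampled at the same instants (assumed, see step 2).
    """
    if len({len(row) for row in shots}) != 1:
        raise ValueError("all shots must have the same number of readings")

    # Step 3: subtract each shot's own average from its row.
    averages = [mean(row) for row in shots]
    residuals = [[t - avg for t in row] for row, avg in zip(shots, averages)]

    # Step 4: scan down the columns; identical profiles give zero spread.
    column_spread = [pstdev(col) for col in zip(*residuals)]
    intra_shot = mean(column_spread)  # assumed way to summarize the columns

    # Step 5: compare the shot averages with each other.
    inter_shot = pstdev(averages)
    return intra_shot, inter_shot


# Two identical humped shots, the second one running 1 degree hotter:
intra, inter = profile_stability([[90.0, 94.0, 90.0], [91.0, 95.0, 91.0]])
```

In that usage example the hump repeats exactly, so the intra-shot figure is zero to within rounding, while the one degree offset between the shots shows up only in the inter-shot figure. That separation is the point of step 6.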


The problems are as follows:

1. What is the best measure of deviation from perfection? A standard deviation is probably not right, since successive readings are highly correlated. On the other hand, the usual cure, a Mahalanobis transform to "decorrelate" the readings (multiplying the data by the inverse of the covariance matrix) requires a few years of linear algebra to be comprehensible. I'm not sure whether this extra layer of incomprehensibility is worth it.

2. What happens if different shots have observations at different frequencies, or at a changed phase (i.e. some start at the beginning of the shot, others 0.5 seconds in, with all readings offset by half a second)? It may be that one has to run an interpolating filter, then use the filtered values at fixed points instead of the raw data. Again, this is standard stuff, but hard to explain.

3. None of these complications apply to comparing successive **shot averages** for inter-shot variability, since these are only weakly correlated, and then only if the machine is poorly controlled or designed (i.e. if the machine is supposed to "recover" in 1 minute, it means it can also be set to an arbitrary new temperature, reasonably close to the old setting, within a minute).
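
For problem 2, the simplest stand-in for an interpolating filter is plain linear interpolation onto a shared time grid. The sketch below is that simpler technique, not a filter, and it assumes each shot's timestamps are strictly increasing and that the grid stays inside the measured range. The name `resample` is hypothetical.

```python
from bisect import bisect_right


def resample(times, temps, grid):
    """Linearly interpolate one shot's readings onto fixed grid points.

    times: strictly increasing timestamps in seconds (at least two)
    temps: temperature reading at each timestamp
    grid:  the common time points every shot should be reported at
    """
    resampled = []
    for g in grid:
        if not times[0] <= g <= times[-1]:
            raise ValueError("grid point outside the measured range")
        # Index of the reading just to the right of g (clamped at the end).
        i = min(bisect_right(times, g), len(times) - 1)
        t0, t1 = times[i - 1], times[i]
        frac = (g - t0) / (t1 - t0)
        resampled.append(temps[i - 1] + frac * (temps[i] - temps[i - 1]))
    return resampled


# A shot logged half a second late, brought back onto whole seconds:
aligned = resample([0.5, 1.5, 2.5], [90.0, 92.0, 94.0], [1.0, 2.0])
```

Once every shot has been passed through something like this with the same grid, the rows line up column by column and the steps listed earlier apply unchanged.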


I hope this is somewhat comprehensible to you. If not, then coming up with a profile independent measure of intra-shot stability will be a hard sell.

Caffewerks

#18: Post by Caffewerks »

mteahan wrote:I will be giving a technical presentation on European and American standards for espresso machines and processes with Mark Crawford from ESI. We will also be on the floor as exhibitors.

For those that don't know my background:

I started working with espresso machines as a distributor of equipment in Portland in 1987. We designed our own line of espresso carts and ran technical seminars for the US importer of Brasilia espresso machines. Several coffee bars and restaurants later, I left Portland for Los Angeles to eventually become the technical director of Rosito Bisani (Brasilia). I wrote for Fresh Cup Magazine, delivering technical articles on espresso machines for about a year and a half when the debate centered on the size of boilers and how much steam a machine could deliver. I was against the mainstream then, too.

Michael
While Michael and I often joke about our two companies bearing similar names (we were around first, by the way :wink:) and generally pass the time with shelves of espresso machine parts surrounding us, that is where the similarities end.

Teahan is by far one of the most talented espresso machine engineers this trade will ever know. He is modest as hell in some camps, yet quite vocal in others. To the benefit of those that read these forums, this is really important commentary, the kind of stuff that you can reflect on, while engineering the next big thing.

I clearly remember him discussing all of the "New" technology we ramble about today, over ten years ago. He touted many of the same benefits of temperature stability that are commonplace in many of today's well engineered espresso machines. There are many people in the manufacturing side of the industry that I appreciate and admire, but Michael was one of the people that sparked my interest in the industry.

We actually shared that column in Fresh Cup Magazine under the column name Techno-jolt. Amazing really, considering I knew very little at the time. I learned a lot from those columns and still refer to them from time to time.

So anyhow, just thought I would comment on Michael's part in espresso machine technology. At the time he was doing this important work, the internet and forums were not very active, at least not like today. I'm sure that if they had been, the information he provides would have been as popular as the Scace Thermofilter, or Andy Schecter's PID controlled Silvia.

Folks like Michael, John Blackwell, Bill Crossland, Greg Scace, Andy Schecter and Barry Jarret are endless fountains of information, and while they may not all agree on the technology or how to get there, they are all passionate about what ends up in the cup, and that is why I appreciate the work they do.

espressoperson

#19: Post by espressoperson »

another_jim wrote: 1. What is the best measure of deviation from perfection? A standard deviation is probably not right, since successive readings are highly correlated. On the other hand, the usual cure, a Mahalanobis transform to "decorrelate" the readings (multiplying the data by the inverse of the covariance matrix) requires a few years of linear algebra to be comprehensible. I'm not sure whether this extra layer of incomprehensibility is worth it.

2. What happens if different shots have observations at different frequencies, or at a changed phase (i.e. some start at the beginning of the shot, others 0.5 seconds in, with all readings offset by half a second)? It may be that one has to run an interpolating filter, then use the filtered values at fixed points instead of the raw data. Again, this is standard stuff, but hard to explain.

3. None of these complications apply to comparing successive **shot averages** for inter-shot variability, since these are only weakly correlated, and then only if the machine is poorly controlled or designed (i.e. if the machine is supposed to "recover" in 1 minute, it means it can also be set to an arbitrary new temperature, reasonably close to the old setting, within a minute).

I hope this is somewhat comprehensible to you. If not, then coming up with a profile independent measure of intra-shot stability will be a hard sell.
Fool rushing in here due to travel delay and wifi.

Why one best measure to capture all the complexities? The measure you suggest (aside from being ultra esoteric) emphasizes only the amount of deviation from zero. Other measures might be:

(a) the percentage of time a shot is more than a set amount (say 0.5 degrees) from the target profile. Is it useful to be able to say, machine x was on target for 70% of the shot, whereas machine y was on target 30% of the time? (This even leaves room for varying the target profile from a straight line to any variation, e.g., rising, falling, etc.)

(b) total positive temp deviation. E.g., how much above the profile. What is too much heat doing to the shot?

(c) total negative temp deviation. E.g., how much below the profile. What is too little heat doing to the shot?

(d) overall deviation. I think this may be close to your proposed measure and may be less interesting because it tries to summarize too many interesting factors with one number.

These measures plus the taste of the shots could be informative as a diagnostic of espresso taste in addition to a test of machine predictability and performance.
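
Measures (a) through (d) are easy to put into code, which may make the comparison more concrete. The sketch below assumes the shot and the target profile are sampled at the same instants, reports (a) as the fraction of readings on target, and takes "total" deviation to mean a plain sum of per-reading differences. Those choices, and the name `deviation_measures`, are assumptions made for the example.

```python
def deviation_measures(temps, target, tolerance=0.5):
    """Summarize how one shot strays from a target temperature profile.

    temps:     readings for one shot
    target:    desired temperature at the same instants (any shape:
               flat, rising, falling, humped)
    tolerance: how far from target still counts as "on target"
    """
    if len(temps) != len(target):
        raise ValueError("temps and target must have the same length")
    diffs = [t - p for t, p in zip(temps, target)]

    # (a) share of readings within the tolerance band
    on_target = sum(abs(d) <= tolerance for d in diffs) / len(diffs)
    # (b) total overshoot and (c) total undershoot, both as positive numbers
    positive = sum(d for d in diffs if d > 0)
    negative = sum(-d for d in diffs if d < 0)
    # (d) everything folded into one number
    overall = positive + negative

    return {
        "on_target_fraction": on_target,
        "total_positive": positive,
        "total_negative": negative,
        "overall": overall,
    }


flat_target = [93.0, 93.0, 93.0, 93.0]
result = deviation_measures([93.0, 94.0, 92.0, 93.2], flat_target)
```

Because the target is passed in as a list, the same function works for a declining or humped target just as well as a flat line.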

Agreed that point 2 is standard and just a procedural detail.

If I understand point 3, it is a multiple of point 1. A statistical analysis of the series of shots. Definitely interesting from a machine performance perspective, but from a taste perspective might be the equivalent of mixing all 10 shots together to see what the average shot tastes like. This measure is less interesting to me, a home barista who almost always drinks his first shot and doesn't pull many beyond that, but may be critical to a professional barista. The danger is that these results could be used to produce a machine that excels at producing mediocre shots. So whatever is done to analyze this point, it is only useful to amplify the results of point 1.
michaelb, lmwdp 24

another_jim
Team HB

#20: Post by another_jim »

I guess I'm not getting a lot of feedback on the profile independent error measures. I wonder if it's worth working on.

To Espressoperson: with serial correlation (i.e., if one observation is X degrees, the next one is going to be X or very close to X), measures of time on or off target can become deceptive as well.

As a correction to myself, on second thought, a standard deviation will work unmodified as a measure of accuracy. However, because of the correlated observations, calculations on how repeatable that standard deviation will be on subsequent tests will be off in an optimistic direction (that is, the number of independent readings one has gathered is less than the number of correlated observations, so any confidence interval derived using calculations based on independent observations will be too optimistic).
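
To give a rough feel for how much smaller the number of independent readings can be, here is a sketch that estimates the lag-1 autocorrelation of a series and applies the common first-order autoregressive approximation n(1 - r)/(1 + r). That formula is the usual one for the effective sample size when estimating a mean, so treat it as an indication of scale rather than an exact correction for the standard deviation. The function names are made up for the example, and real temperature logs may not behave like a first-order process.

```python
from statistics import mean


def lag1_autocorrelation(readings):
    """Correlation between each reading and the one that follows it."""
    m = mean(readings)
    den = sum((x - m) ** 2 for x in readings)
    if den == 0:
        return 0.0  # constant series: no variation to correlate
    num = sum((a - m) * (b - m) for a, b in zip(readings, readings[1:]))
    return num / den


def effective_sample_size(readings):
    """Approximate count of independent readings, assuming AR(1) behaviour."""
    r = max(lag1_autocorrelation(readings), 0.0)  # ignore negative correlation
    return len(readings) * (1 - r) / (1 + r)


# Twenty readings that drift steadily carry far fewer than twenty
# independent pieces of information.
drifting = [float(i) for i in range(20)]
n_eff = effective_sample_size(drifting)
```

Any confidence interval computed with the raw count where the effective count belongs will come out too narrow, which is the optimistic bias described above.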