Well, I've finished my transcription of the results, which you can view in this published Google Doc.
As I mentioned, when I arrived on Monday with my machine, Jon proposed that we use the K30 Vario, given that it's a little more convenient for dosing quickly and consistently. I agreed that this was a fine choice, so we made that switch. We dialed in Jon's GS3 (since mine was still heating up) and found a grind, dose, and temperature that we liked. This didn't take too long. When my GS3 was hot enough, we started pulling shots on it using the settings we had dialed in on Jon's, with a 'flat' profile, i.e. a gentle ramp-up (since I don't have the 0.6 mm gicleur) and then a steady ~9 bars to the end.
Now, Jon's offset is a factory offset (~-5.3F IIRC) and mine was experimentally determined with a Scace to be ~-3F. We were unable to get a Scace to benchmark the machines side by side for the test, so we matched them by taste. Initially, setting mine to the same offset-adjusted temperature as Jon's (so a ~2F lower boiler temp) yielded shots that were milder and less pleasant. We upped it to the same boiler temp setting, and after it stabilized there we pulled, IIRC, 2 pairs of indistinguishable shots. We were really happy with this, particularly me--it's a good indication that I'll be safe hosting any further tests we do at my own place.
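For anyone trying to follow the offset arithmetic, here is a rough sketch in Python. It assumes the simple model that group temperature is roughly the boiler setting plus the (negative) offset. The function name and the 200F target are made up purely for illustration; the offsets are the approximate figures quoted above.

```python
def boiler_setting_for(target_group_temp_f, offset_f):
    """Boiler setting needed to reach a target group temperature.

    Assumption: group temp ~= boiler setting + offset (offset is negative).
    """
    return target_group_temp_f - offset_f


JON_OFFSET_F = -5.3  # factory offset, approximate (from memory)
MY_OFFSET_F = -3.0   # Scace-derived offset, approximate

target = 200.0  # hypothetical target, not a number from the test
jon_boiler = boiler_setting_for(target, JON_OFFSET_F)
my_boiler = boiler_setting_for(target, MY_OFFSET_F)

# How much lower my boiler setting sits for the same offset-adjusted temp
difference = jon_boiler - my_boiler
print(f"Boiler setting difference: {difference:.1f}F")
```

Under that model the gap is just the difference between the two offsets, which is where the "~2F lower boiler temp" figure comes from. That the shots only matched at equal boiler settings suggests at least one of the two offsets doesn't reflect what's happening at the group.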
As I mentioned before, the tasters were me, Paul, and Jon. We also had a longtime friend of mine, Jeff, though I believe he declined to participate in the IDing that we attempted to do, so his sheet just has notes.
Test Protocol
Jon and I prepped baskets simultaneously (obviously we took turns grinding the shots), keeping our doses within about a tenth of a gram of each other and using the WDT to level the baskets, particularly after spooning out excess on any shots that dosed wrong. We pulled shots into identical cups (mine was marked on the bottom to ID after sampling), staggered by a few seconds so that whichever of us finished first could call the final volume to the other. As our test proceeded, we ended up just shooting for the same final shot volume. After we pulled our shots, Jon would shuffle them randomly (out of view of Paul and Jeff), and then he and I would go to the tasting table while Paul stirred each shot, shuffled them himself, and brought them to the table. He would label one of them "A" with a sticker so that we could keep track of it as we passed the pairs around the table for each of us to taste in turn. In effect, we were all blind at the tasting phase. We would take notes and make our guesses, discuss the shots, then reveal which was which.
You can guess my shorthand. P = Profiled, NP = flat profile/'not' profiled. The profile I was using on the 'P' shots was a normal (pretty fast, ~5s) ramp to 9 bars on the onboard gauge, then, starting about 15s into the shot, a gradual decline to 7 bars. Note that this is pretty challenging to do with utmost consistency when you're profiling manually and ALSO trying to watch the shot volume--looking at 3 things (time, pressure, weight) at the same time is not something that I can do that well. That said, it seemed to work at least roughly.
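To make the 'P' profile concrete, here is a rough sketch of the target pressure curve in Python. The ~5s ramp, the 9 bar peak, the decline starting around 15s, and the 7 bar finish come from the description above. The 30s shot length and the straight-line ramp and decline are assumptions for illustration only, since I was doing this by hand against the onboard gauge.

```python
def target_pressure(t, shot_end=30.0, ramp=5.0, hold_until=15.0,
                    peak=9.0, final=7.0):
    """Rough target pressure (bars) at t seconds into a profiled shot.

    shot_end is a hypothetical shot length; the linear ramp and linear
    decline are simplifying assumptions, not measured curves.
    """
    if t <= 0:
        return 0.0
    if t < ramp:
        # fast ramp up to peak pressure
        return peak * t / ramp
    if t <= hold_until:
        # hold at peak until the decline starts
        return peak
    if t >= shot_end:
        return final
    # gradual decline from peak to final pressure
    frac = (t - hold_until) / (shot_end - hold_until)
    return peak + (final - peak) * frac


for seconds in (0, 5, 15, 22.5, 30):
    print(seconds, round(target_pressure(seconds), 2))
```

Passing `final=6.0` gives a steeper decline that finishes at 6 bars instead of 7. The NP shots aren't modeled here because their gentler ramp-up wasn't timed.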
Results
After a quick dry run of the test protocol, Jon and I realized that it can be a challenge pulling a shot onto a scale and shutting it off at the exact weight you want. In retrospect, moving the cup away might have worked better, but probably would have shortened the lives of our scales.

In any event, we ran through the first three rounds and were discouraged by how variable the shots were pulling AND tasting. We suspected that Jon and I were prepping differently enough that it was skewing the results, so we had Jon prep both baskets for round 4 (and onwards). On the 5th round, when the shots were still pulling pretty close, we decided to exaggerate the profiling by dropping to 6 bars rather than to 7 by the end of the shot. After the 6th round, we were feeling like we were observing a difference pretty well, but we were getting taste fatigued and indeed a bit caffeine buzzed, so we went to lunch. We returned for round 7, but IIRC it was only me and Jeff tasting for that one--that or Paul/Jon forgot to take notes for the round.
Here's the shot results summary, excerpted:
Note: R4 is where we switched to Jon-only basket prep, and R5 is where we used the exaggerated profile.
And here's the summary of the shot IDing:
Note: Column 2 (A) denotes the result for the first blind shot, and the columns with taster names are their guesses for that same shot. Green means the taster got it right.
Please look at the Google Doc's other sheets for individual notes from each taster.
In the end, I think the first 3 rounds probably ought to be 'thrown out', inasmuch as the taste differences we noted were mostly the result of variation in the pours that was attributable more to different basket prep styles than anything else. The shots were pretty similar, and their small differences were hard to attribute to either method based on our hypothesis. This is why Jon and Paul refused to guess, though you can see their comments in the taste notes. I guessed, and guessed wrong 2 out of 3 times. As I said, in the 4th round we decided to eliminate the possibility of variable barista technique, and in round 5 we also exaggerated the pressure drop for the profiled shots. Between these 2 changes, we all felt like we were getting down to a predictable difference that tasters were willing to guess on, and they were almost always guessing right. I would have preferred to go another round or 2, but we were all feeling a bit taste fatigued and like we'd had enough.
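If you want to judge "almost always guessing right" against plain luck, here is a rough sketch in Python. It computes the chance of getting at least a given number of P/NP calls right by coin-flipping. The function name is made up, and the counts in the example are hypothetical placeholders, not our results; plug in the tallies from the Google Doc.

```python
from math import comb


def chance_of_at_least(correct, total, p=0.5):
    """Probability of `correct` or more right guesses out of `total`
    when each guess is an independent coin flip with success chance p."""
    return sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(correct, total + 1)
    )


# Hypothetical placeholder counts, NOT the actual tallies from the test
example_correct, example_total = 8, 9
luck = chance_of_at_least(example_correct, example_total)
print(f"Chance of {example_correct}+ of {example_total} by luck: {luck:.3f}")
```

Keep in mind that this treats every guess as independent, which isn't quite true here since several of us were tasting the same pair of shots each round, so take any number it spits out as a loose guide.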
Conclusions
Paul described the profiled shots as having a 'concentrated vac pot taste.' You'd probably have to try his vac pot coffee to get a sense of that, but essentially I think the common thing we all got was the predicted 'balance shift' in the taste towards either brighter or sweeter. Occasionally the difference was the absence of a slight bitterness in the aftertaste that was present in the NP shot. There was much less consensus on the 'body' of the shots, and I think this ended up being a poor predictor of shot variation, at least for this test. I am not ready to concede that point, though--remember we were stirring and sharing these shots, so tasters other than the first were usually getting hardly any crema at all, and I think that really calls into question the validity of comparative observations of the shot body.
It's no cut-and-dried test, and it's sure not going to support any kind of rock-solid conclusions, but I feel like my casual observations have held up pretty well to blind testing, and I'm hoping to repeat this process when I can get more consistent equipment.
I'd also add that the blind test could just as easily be an advertisement against spending the money I have on profiling, depending on your level of curiosity. The effect seems more subtle (isn't that how blind testing always goes?), and certainly both machines were producing lots of really tasty shots. Moreover, we're just testing for an absolute difference here, that is, a difference when all other variables are held as constant as possible. It's possible that you could obscure the differences we observed entirely by dose/grind adjustment, at least for the coffee we used.
So there it is. Let me know if I've left anything out...