I'm creating a roasting AI - data needed!

Discuss roast levels and profiles for espresso, equipment for roasting coffee.
Posts: 3
Joined: 2 months ago

#1: Post by scornflake »

Well. I'm going to try, anyway :)

I'm a happy Gene Cafe user and developer.
To learn more about AI, I'm going to create an AI-driven "thing" to predict when to stop my roaster (eventually this could be automated) when it gets to "how I like it".

As such, I've begun collecting data, in the form of recordings of my roasts. The idea is to then pull this apart and use parts of it to see if I can get the AI to not be silly. So far I've got an extraction pipeline that can:

- split the video into individual frames, e.g. one every 15 s or so
- read the chamber temp of each frame
- extract 1st and 2nd crack audio to their own files
- extract random samples from the roast that are not 1st/2nd crack, to use as "not 1st/2nd crack" training data
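For the first step, a minimal sketch of how frames could be pulled out of a roast video, assuming ffmpeg is installed (the function name and paths here are my own illustration, not the actual extraction tool):

```python
# Build an ffmpeg command that saves one frame every `every_s` seconds.
# Run the result with subprocess.run(cmd, check=True).
def frame_extract_cmd(video_path: str, out_pattern: str, every_s: int = 15) -> list:
    return [
        "ffmpeg", "-i", video_path,
        "-vf", f"fps=1/{every_s}",  # one output frame per every_s seconds
        out_pattern,                # e.g. "frames/roast_%04d.png"
    ]

cmd = frame_extract_cmd("roast.mp4", "frames/roast_%04d.png")
```

The chamber-temp reading and crack-audio extraction would then run over the resulting frames and the audio track respectively.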

I'm currently playing around with EdgeImpulse to see how far I can get with that. I'll build my own model if it has to come to that, but I'd prefer to see how far I can get using off the shelf stuff to begin with.

Also, this may well be an already-solved problem - so I'm keen to look at existing apps/ideas as well.

I've included a sample frame, just to show the camera angle that my extraction software is happy with (it's configurable, to an extent).

So - If there are other Gene Cafe users out there that'd be able to contribute movies of a roast, that would be GREAT!

scornflake (original poster)
Posts: 3
Joined: 2 months ago

#2: Post by scornflake (original poster) »


So I've got the basics in place to detect first crack (even got a small app that'll monitor a microphone and tell me if it's roasting, first crack or second crack).
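The monitoring side boils down to chopping the incoming audio into short, overlapping windows and classifying each one. A hedged sketch of just the windowing part (my own names; the actual classifier is stubbed out entirely):

```python
# Split an audio buffer into fixed-length windows, advancing by `hop`
# samples each time; each window would get its own prediction
# (roasting / first crack / second crack).
def windows(samples, win, hop):
    return [samples[i:i + win]
            for i in range(0, len(samples) - win + 1, hop)]

# e.g. at 16 kHz, 1 s windows with a 0.5 s hop: win=16000, hop=8000
```

Overlapping windows mean a crack that straddles a window boundary still lands cleanly inside at least one window.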

Now all I need (ha!) is a bunch of audio samples... well, actually I probably need LOADS of them :)

Right now, based on exactly two roasts (what a massive sample size), my AI is rather confused.
Just like me, before coffee in the morning.

This lovely little pic is known as a "confusion matrix".
In this case it shows I'm quite likely to detect background noise as first crack.
I'm also very unlikely to detect second crack at all. In fact, it's waaay more likely it'll be seen as first crack.
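To show what the matrix actually counts, here's a tiny illustration with made-up labels (not my real roast data): rows are the true class, columns are what the model predicted.

```python
LABELS = ["background", "first_crack", "second_crack"]

# Tally (true class, predicted class) pairs into a square matrix.
def confusion_matrix(true, pred):
    idx = {name: i for i, name in enumerate(LABELS)}
    m = [[0] * len(LABELS) for _ in LABELS]
    for t, p in zip(true, pred):
        m[idx[t]][idx[p]] += 1
    return m

# Toy data mimicking my failure modes: background mistaken for first
# crack, and second crack seen as first crack.
true = ["background", "background", "first_crack", "second_crack"]
pred = ["first_crack", "background", "first_crack", "first_crack"]
m = confusion_matrix(true, pred)
```

Off-diagonal counts are the mistakes; a perfect classifier would put everything on the diagonal.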

Anyway, I have this working with a custom-trained model based on YAMNet (an existing audio classifier).
What I need now are audio samples. Lots of audio samples.

So - any Gene Cafe people out there that would be up for recording their roasts (video with audio would be my preferred format; I can extract what I need from it)?

scornflake (original poster)
Posts: 3
Joined: 2 months ago

#3: Post by scornflake (original poster) »

More of an update - I've got the reading of the front panel to be more accurate: from 52% to 92%.
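One cheap trick that helps with this kind of OCR (a hypothetical post-processing step of my own, not necessarily the actual fix): temperatures change slowly, so readings that jump implausibly between consecutive frames can be rejected in favour of the last plausible value.

```python
# Drop OCR temperature readings that differ from the last accepted
# one by more than `max_jump` degrees, carrying the previous
# plausible value forward instead.
def filter_temps(readings, max_jump=15.0):
    cleaned, last = [], None
    for t in readings:
        if last is None or abs(t - last) <= max_jump:
            cleaned.append(t)
            last = t
        else:
            cleaned.append(last)  # implausible jump: keep previous value
    return cleaned
```

A misread like 888 (a classic seven-segment OCR failure) gets smoothed over rather than polluting the temperature curve.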

I'm not yet sure that those temps are even going to be useful to create the final outcome, but hey - they look pretty!

This also might be more "coding/dev" focused for a while, so I'll probably run off to an AI forum to ask more over there, and come back here when I have more real-world progress to show.