I wonder if this could be used as something like an early reCAPTCHA.
Have a machine do the transcriptions, and for the parts where it's not entirely sure, just let users play the game and accept what most users chose as the correct solution. Later on, train your automatic transcriber on those agreed answers.
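
To make the idea concrete, here's a rough sketch of that loop in Python. Everything in it is made up for illustration: the segment format, the 0.9 confidence cutoff, the minimum vote count and the agreement ratio are all assumptions, not something any existing tool actually does.

```python
from collections import Counter

# All three constants are assumed values, picked only for illustration.
CONFIDENCE_THRESHOLD = 0.9  # below this, the machine "isn't entirely sure"
MIN_VOTES = 3               # don't trust a segment with too few players
MIN_AGREEMENT = 0.6         # share of votes the top answer must reach


def needs_human_review(segments, threshold=CONFIDENCE_THRESHOLD):
    """Return the ids of segments whose machine confidence is below the cutoff."""
    return [s["id"] for s in segments if s["confidence"] < threshold]


def resolve_by_majority(votes, min_votes=MIN_VOTES, min_agreement=MIN_AGREEMENT):
    """Return the most common answer, or None if players don't agree enough."""
    if len(votes) < min_votes:
        return None
    # Ignore case and surrounding whitespace when comparing answers.
    normalized = [v.strip().lower() for v in votes]
    answer, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) < min_agreement:
        return None
    return answer


# Hypothetical machine output: one confident segment, one doubtful one.
segments = [
    {"id": "s1", "text": "hello", "confidence": 0.98},
    {"id": "s2", "text": "wor1d", "confidence": 0.41},
]
# Hypothetical answers collected from people playing the game.
player_votes = {"s2": ["world", "World", "word", "world"]}

# Agreed answers become labels for retraining the transcriber later.
training_pairs = {}
for seg_id in needs_human_review(segments):
    label = resolve_by_majority(player_votes.get(seg_id, []))
    if label is not None:
        training_pairs[seg_id] = label

print(training_pairs)  # {'s2': 'world'}
```

The agreement check is there so a segment with a split vote stays unresolved instead of feeding a bad label into training.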