My case: I have an audio recording of speech and an already prepared transcript of the same speech.
I need to add timestamps to the text so that I can jump between parts of the audio using the text as a guide.
As far as I can see, Google’s Speech-to-Text can do this. My question: are there any alternatives? Maybe self-hosted open-source solutions, or paid cloud services?