Playing Field

Tech Preview

Whisper + audio synchronization:
[Screenshot: transcription text alongside the waveform view]
In this tech preview we use Whisper from OpenAI to transcribe audio into text, and the wavesurfer.js library to visualize the audio as waveforms. The waveforms can be dragged to try out synchronization.
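Whisper's Python API returns both the full transcript and a list of timed segments, which is what makes the waveform synchronization possible. The sketch below shows this shape: the `whisper.load_model` / `model.transcribe` calls match the openai-whisper package, while `segments_to_regions` is a hypothetical helper (not from the repository) that maps segments to wavesurfer.js-style region dicts.

```python
# Sketch: turn Whisper's timed segments into regions for a waveform UI.
# segments_to_regions is a hypothetical helper; the whisper calls in
# transcribe() follow the openai-whisper package's documented API.

def segments_to_regions(segments):
    """Map Whisper segments (start/end in seconds, text) to region dicts."""
    return [
        {"start": s["start"], "end": s["end"], "content": s["text"].strip()}
        for s in segments
    ]

def transcribe(path):
    import whisper  # heavy import kept local; requires the openai-whisper package
    model = whisper.load_model("base")       # small model; fine for a single server
    result = model.transcribe(path)          # returns {"text": ..., "segments": [...]}
    return result["text"], segments_to_regions(result["segments"])
```

The region dicts can then be handed to the front-end, where wavesurfer.js draws them over the waveform.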
This preview is designed to run on a single Linux server with limited memory and storage, so uploads are limited to 20 MB.
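A size cap like this is easy to enforce before any transcription work starts. The helper below is a minimal sketch, not the repository's actual code; the function name and error handling are assumptions.

```python
# Hypothetical upload-size guard reflecting the preview's 20 MB limit.
MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # 20 MB

def check_upload_size(num_bytes):
    """Reject uploads over the limit before the file is processed."""
    if num_bytes > MAX_UPLOAD_BYTES:
        raise ValueError(
            f"file too large: {num_bytes} bytes (limit {MAX_UPLOAD_BYTES})"
        )
```

Rejecting oversized files up front keeps a memory-constrained server from ever loading them into Whisper.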
The code is available at: https://gitlab.origo.io/origosys/whisper-playing-field.
If you are into Kubernetes, you can run this web-app as a pre-built Docker image with this yaml file.
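The linked yaml file's contents aren't reproduced here, so the manifest below is only a hedged sketch of what a single-replica Deployment for a pre-built image might look like; the image name, port, and labels are all assumptions, not the actual file.

```yaml
# Hypothetical Deployment sketch — image name, port, and labels are
# placeholders, not the manifest linked above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: whisper-playing-field
spec:
  replicas: 1
  selector:
    matchLabels:
      app: whisper-playing-field
  template:
    metadata:
      labels:
        app: whisper-playing-field
    spec:
      containers:
        - name: web
          image: registry.example.com/whisper-playing-field:latest  # placeholder
          ports:
            - containerPort: 8080  # assumed port
```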
The back-end of this application consists of a single Python file; the front-end, of an HTML file and a JavaScript file. No frameworks, no shadow DOMs, no hydrating, no object stores, no GitHub Actions, no queueing systems (though one would obviously be needed to scale it); just a bit of plain JavaScript and a Python file. Sometimes the right tool for driving in a nail is just a hammer.