Upload a video to transcribe speech from lip movements — no audio required.

Tips for best results: front-facing camera, clear face visibility, good lighting.

⚠️ Running on CPU — inference may take several minutes for longer videos.

VisNet — Visual Speech Recognition