ASR demo using onnx-asr
onnx-asr is a Python package for Automatic Speech Recognition using ONNX models.
The package is written in pure Python with minimal dependencies (no pytorch or transformers).
It supports the Parakeet TDT 0.6B V2 (En) and GigaAM v2 (Ru) models, along with many other modern models. You can also use it with your own model if it has a supported architecture.
This demo uses the default VAD (voice activity detection) parameters. For best results, adjust the VAD parameters in your own app.
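To give a feel for why VAD parameters matter, here is a minimal sketch of a toy energy-based detector. It is not the VAD model used by this demo or by onnx-asr; the function `detect_speech` and its parameters (`energy_threshold`, `min_speech_ms`, `frame_ms`) are hypothetical names chosen for illustration only. It shows how a threshold and a minimum speech duration change which parts of the audio are passed on to the recognizer.

```python
from typing import List, Tuple


def detect_speech(
    samples: List[float],
    sample_rate: int,
    frame_ms: int = 20,
    energy_threshold: float = 0.01,
    min_speech_ms: int = 100,
) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) spans whose mean frame energy exceeds the threshold.

    Toy illustration only: real VADs are usually neural models
    with their own, differently named parameters.
    """
    frame_len = sample_rate * frame_ms // 1000
    n_frames = len(samples) // frame_len
    segments = []
    start = None  # index of the first frame of the current speech run

    for i in range(n_frames + 1):
        if i < n_frames:
            frame = samples[i * frame_len:(i + 1) * frame_len]
            energy = sum(x * x for x in frame) / frame_len
            is_speech = energy > energy_threshold
        else:
            is_speech = False  # sentinel frame closes a run that reaches the end

        if is_speech and start is None:
            start = i
        elif not is_speech and start is not None:
            # Keep the run only if it lasted long enough to count as speech.
            if (i - start) * frame_ms >= min_speech_ms:
                segments.append((start * frame_ms / 1000, i * frame_ms / 1000))
            start = None

    return segments
```

With a synthetic signal containing one long burst and one very short burst, the default `min_speech_ms` keeps only the long one, while lowering it also lets the short burst through. The same trade-off (missed short utterances versus spurious segments) is what tuning real VAD parameters controls.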
Russian ASR models
gigaam-v2-ctc - Sber GigaAM v2 CTC (origin, onnx)
gigaam-v2-rnnt - Sber GigaAM v2 RNN-T (origin, onnx)
nemo-fastconformer-ru-ctc - Nvidia FastConformer-Hybrid Large (ru) with CTC decoder (origin, onnx)
nemo-fastconformer-ru-rnnt - Nvidia FastConformer-Hybrid Large (ru) with RNN-T decoder (origin, onnx)
whisper-base - OpenAI Whisper Base exported with onnxruntime (origin, onnx)
alphacep/vosk-model-ru - Alpha Cephei Vosk 0.54-ru (origin)
alphacep/vosk-model-small-ru - Alpha Cephei Vosk 0.52-small-ru (origin)
English ASR models
nemo-parakeet-ctc-0.6b - Nvidia Parakeet CTC 0.6B (en) (origin, onnx)
nemo-parakeet-tdt-0.6b-v2 - Nvidia Parakeet TDT 0.6B V2 (en) (origin, onnx)
whisper-base - OpenAI Whisper Base exported with onnxruntime (origin, onnx)
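The model names listed above are the identifiers passed to the package when loading a model. The following is a minimal sketch of how that might look in your own code. It assumes the onnx-asr package and an ONNX Runtime backend are installed, and that `onnx_asr.load_model(name)` and `model.recognize(path)` behave as described in the package README. The `MODELS` table and the helpers `models_for` and `transcribe` are hypothetical names introduced here for illustration.

```python
from typing import Dict, List

# Model names copied from the lists on this page, grouped by language.
MODELS: Dict[str, List[str]] = {
    "ru": [
        "gigaam-v2-ctc",
        "gigaam-v2-rnnt",
        "nemo-fastconformer-ru-ctc",
        "nemo-fastconformer-ru-rnnt",
        "whisper-base",
        "alphacep/vosk-model-ru",
        "alphacep/vosk-model-small-ru",
    ],
    "en": [
        "nemo-parakeet-ctc-0.6b",
        "nemo-parakeet-tdt-0.6b-v2",
        "whisper-base",
    ],
}


def models_for(language: str) -> List[str]:
    """Return the model names listed on this page for 'ru' or 'en'."""
    if language not in MODELS:
        raise ValueError(f"no models listed for language {language!r}")
    return MODELS[language]


def transcribe(wav_path: str, model_name: str) -> str:
    """Recognize speech in a WAV file with the named model (assumed API)."""
    # Imported lazily so the table above can be used without the package.
    import onnx_asr

    # The first call for a given model is expected to download its files.
    model = onnx_asr.load_model(model_name)
    return model.recognize(wav_path)
```

For example, `transcribe("test.wav", "nemo-parakeet-tdt-0.6b-v2")` would return the recognized English text for a local file named `test.wav` (a placeholder path), provided the assumptions above hold.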