ASR demo using onnx-asr
onnx-asr is a Python package for Automatic Speech Recognition using ONNX models.
The package is written in pure Python with minimal dependencies (no pytorch or transformers).
It supports the Parakeet TDT 0.6B V2 (En), Parakeet TDT 0.6B V3 (Multilingual), and GigaAM v2 (Ru) models, along with many other modern models.
You can also use it with your own model if it has a supported architecture.
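As a minimal sketch of what transcription with the package might look like: the example below assumes the `onnx_asr.load_model` and `recognize` calls described in the package README, and that the input is a WAV file in a format the model accepts. The `transcribe` helper and its default model name are illustrative choices, not part of the package.

```python
def transcribe(wav_path: str, model_name: str = "nemo-parakeet-tdt-0.6b-v2") -> str:
    """Transcribe a single WAV file with a named onnx-asr model.

    Hypothetical helper for illustration; only load_model() and
    recognize() are assumed to come from onnx-asr itself.
    """
    # Imported lazily so this module can be imported without onnx-asr installed.
    import onnx_asr

    # Assumption: load_model() resolves the model name and fetches the
    # ONNX files on first use, so the first call may take a while.
    model = onnx_asr.load_model(model_name)

    # Assumption: recognize() accepts a path to a WAV file and returns text.
    return model.recognize(wav_path)
```

Calling `transcribe("sample.wav")` with a path to your own recording would then return the recognized text; `sample.wav` is a placeholder file name.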
Russian ASR models
- gigaam-v2-ctc: Sber GigaAM v2 CTC (origin, onnx)
- gigaam-v2-rnnt: Sber GigaAM v2 RNN-T (origin, onnx)
- nemo-fastconformer-ru-ctc: Nvidia FastConformer-Hybrid Large (ru) with CTC decoder (origin, onnx)
- nemo-fastconformer-ru-rnnt: Nvidia FastConformer-Hybrid Large (ru) with RNN-T decoder (origin, onnx)
- nemo-parakeet-tdt-0.6b-v3: Nvidia Parakeet TDT 0.6B V3 (multilingual) (origin, onnx)
- whisper-base: OpenAI Whisper Base exported with onnxruntime (origin, onnx)
- alphacep/vosk-model-ru: Alpha Cephei Vosk 0.54-ru (origin)
- alphacep/vosk-model-small-ru: Alpha Cephei Vosk 0.52-small-ru (origin)
English ASR models
- nemo-parakeet-tdt-0.6b-v2: Nvidia Parakeet TDT 0.6B V2 (en) (origin, onnx)
- nemo-parakeet-tdt-0.6b-v3: Nvidia Parakeet TDT 0.6B V3 (multilingual) (origin, onnx)
- whisper-base: OpenAI Whisper Base exported with onnxruntime (origin, onnx)
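The two model lists can be captured in a small lookup table, which may be handy when choosing a model name by language. This is only a sketch: `MODELS_BY_LANGUAGE`, `models_for`, and `shared_models` are hypothetical names introduced here, and the table simply mirrors the names listed above.

```python
# Model names copied from the Russian and English lists above.
MODELS_BY_LANGUAGE = {
    "ru": [
        "gigaam-v2-ctc",
        "gigaam-v2-rnnt",
        "nemo-fastconformer-ru-ctc",
        "nemo-fastconformer-ru-rnnt",
        "nemo-parakeet-tdt-0.6b-v3",
        "whisper-base",
        "alphacep/vosk-model-ru",
        "alphacep/vosk-model-small-ru",
    ],
    "en": [
        "nemo-parakeet-tdt-0.6b-v2",
        "nemo-parakeet-tdt-0.6b-v3",
        "whisper-base",
    ],
}


def models_for(language: str) -> list[str]:
    """Return the model names listed for a language code ("ru" or "en")."""
    key = language.strip().lower()
    if key not in MODELS_BY_LANGUAGE:
        raise ValueError(f"no models listed for language {language!r}")
    # Return a copy so callers cannot mutate the table by accident.
    return list(MODELS_BY_LANGUAGE[key])


def shared_models() -> list[str]:
    """Return the models that appear in both lists, in Russian-list order."""
    english = set(MODELS_BY_LANGUAGE["en"])
    return [name for name in MODELS_BY_LANGUAGE["ru"] if name in english]
```

For example, `shared_models()` picks out the multilingual Parakeet V3 and Whisper Base entries, the two models that appear under both headings.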