Next-gen Kaldi
for advanced & efficient automatic speech recognition

A collection of automatic speech recognition toolkits covering data preparation, sequence modeling, training, decoding, and deployment.

  1. Fast training with pruned RNN-T loss
  2. Advanced Zipformer for modeling
  3. Easy to use, supporting many platforms
  4. Apache-2.0 license - free for personal & commercial use

Projects

k2

FSA/FST algorithms, differentiable, with PyTorch compatibility.

Code Docs
icefall

Various recipes built on k2, lhotse, and PyTorch.

Code Docs
lhotse

Tools for handling speech data in machine learning projects.

Code Docs
sherpa

Speech-to-text server framework (based on libtorch) with next-gen Kaldi.

Code Docs
sherpa-onnx

Speech-to-text server framework (based on onnxruntime) with next-gen Kaldi.

Code Docs
sherpa-ncnn

Speech-to-text server framework (based on ncnn) with next-gen Kaldi.

Code Docs
fast rnnt

A PyTorch implementation of a recursion that turns out to be useful for RNN-T.

Code
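
To give a feel for the kind of recursion the fast rnnt entry refers to, here is a minimal pure-Python sketch of a log-domain forward recursion over a symbol-by-frame lattice, which is the general shape of the computation behind RNN-T losses. The function names, the argument names `px` and `py`, and the array shapes are illustrative assumptions; this is not the library's API or its batched PyTorch implementation.

```python
import math


def log_add(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def lattice_log_total(px, py):
    """Total log-probability of all monotonic paths through the lattice.

    Assumed (hypothetical) layout:
      px[s][t]: log-prob of emitting symbol s at frame t, shape S x (T + 1)
      py[s][t]: log-prob of advancing from frame t after s symbols,
                shape (S + 1) x T
    """
    S = len(px)
    T = len(py[0])
    # p[s][t] is the log-sum over all paths from (0, 0) to (s, t).
    p = [[-math.inf] * (T + 1) for _ in range(S + 1)]
    p[0][0] = 0.0
    for s in range(S + 1):
        for t in range(T + 1):
            if s == 0 and t == 0:
                continue
            from_symbol = p[s - 1][t] + px[s - 1][t] if s > 0 else -math.inf
            from_frame = p[s][t - 1] + py[s][t - 1] if t > 0 else -math.inf
            p[s][t] = log_add(from_symbol, from_frame)
    return p[S][T]


# One symbol, one frame: two paths, each with probability 0.5 * 0.5.
half = math.log(0.5)
total = lattice_log_total(px=[[half, half]], py=[[half], [half]])
```

With one symbol and one frame there are exactly two paths (symbol then frame, or frame then symbol), so `total` is the log of their summed probabilities.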
text search

Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup.

Code
Libriheavy

Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context.

Code
multi-quantization

Implements a utility for use with PyTorch, for training an efficient quantizer based on multiple single-byte codebooks.

Code
kaldifst

Python wrapper for OpenFst and its extensions from Kaldi. Also supports reading and writing ark/scp files.

Code Docs
kaldi-decoder

Decoders from Kaldi using OpenFst.

Code
divide lm

Divides a higher-order language model by a lower-order language model.

Code
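
As a rough illustration of what dividing one language model by another can mean, the sketch below subtracts log-probabilities, assuming a bigram model as the higher-order model and a unigram model as the lower-order one. The dictionary layout and the `divide_lm` helper are hypothetical, back-off weights are ignored, and this is not the project's actual implementation.

```python
import math


def divide_lm(high, low):
    """Divide bigram probabilities by unigram probabilities in the log domain.

    Assumed (hypothetical) layout:
      high: {(previous_word, word): log10 probability}  # bigram model
      low:  {word: log10 probability}                   # unigram model
    Division of probabilities becomes subtraction of log-probabilities.
    """
    return {
        (prev, word): logp - low[word]
        for (prev, word), logp in high.items()
    }


high = {("the", "cat"): math.log10(0.2)}
low = {"cat": math.log10(0.05)}
ratio = divide_lm(high, low)
```

Here `ratio[("the", "cat")]` is the log10 of 0.2 / 0.05, that is, how much more likely "cat" is after "the" than on its own.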