Publications

We list only the papers written while the Next-gen Kaldi team has been developing the Next-gen Kaldi toolkits. For more of Daniel Povey's papers, please see his personal page or Google Scholar.

  • "Zipformer: A faster and better encoder for automatic speech recognition", Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey, ICLR 2024 [pdf] [code]

  • "Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context", Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Yifan Yang, Liyong Guo, Long Lin, Daniel Povey, ICASSP 2024 [pdf] [code]

  • "PromptASR for contextualized ASR with controllable style", Xiaoyu Yang, Wei Kang, Zengwei Yao, Yifan Yang, Liyong Guo, Fangjun Kuang, Long Lin, Daniel Povey, ICASSP 2024 [pdf] [code]

  • "Delay-penalized transducer for low-latency streaming ASR", Wei Kang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Long Lin, Piotr Żelasko, Daniel Povey, ICASSP 2023 [pdf] [code icefall] [code k2]

  • "Fast and parallel decoding for transducer", Wei Kang, Liyong Guo, Fangjun Kuang, Long Lin, Mingshuang Luo, Zengwei Yao, Xiaoyu Yang, Piotr Żelasko, Daniel Povey, ICASSP 2023 [pdf] [code icefall] [code k2]

  • "Predicting multi-codebook vector quantization indexes for knowledge distillation", Liyong Guo, Xiaoyu Yang, Quandong Wang, Yuxiang Kong, Zengwei Yao, Fan Cui, Fangjun Kuang, Wei Kang, Long Lin, Mingshuang Luo, Piotr Żelasko, Daniel Povey, ICASSP 2023 [pdf] [code icefall] [code]

  • "Blank-regularized CTC for frame skipping in neural transducer", Yifan Yang, Xiaoyu Yang, Liyong Guo, Zengwei Yao, Wei Kang, Fangjun Kuang, Long Lin, Xie Chen, Daniel Povey, Interspeech 2023 [pdf] [code]

  • "Delay-penalized CTC implemented based on Finite State Transducer", Zengwei Yao, Wei Kang, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Yifan Yang, Long Lin, Daniel Povey, Interspeech 2023 [pdf] [code]

  • "Pruned RNN-T for fast, memory-efficient ASR training", Fangjun Kuang, Liyong Guo, Wei Kang, Long Lin, Mingshuang Luo, Zengwei Yao, Daniel Povey, Interspeech 2022 [pdf] [code]
