Papers

FunASR implements code for the following papers.

Speech Recognition

FunASR: A Fundamental End-to-End Speech Recognition Toolkit, INTERSPEECH 2023
BAT: Boundary aware transducer for memory-efficient and low-latency ASR, INTERSPEECH 2023
Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition, INTERSPEECH 2022
E-Branchformer: Branchformer with Enhanced Merging for Speech Recognition, SLT 2022
Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding, ICML 2022
Universal ASR: Unifying Streaming and Non-Streaming ASR Using a Single Encoder-Decoder Model, arXiv preprint arXiv:2010.14099, 2020
SAN-M: Memory Equipped Self-Attention for End-to-End Speech Recognition, INTERSPEECH 2020
Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition, INTERSPEECH 2020
Conformer: Convolution-augmented Transformer for Speech Recognition, INTERSPEECH 2020
Sequence-to-sequence learning with Transducers, NIPS 2016

Multi-talker Speech Recognition

MFCCA: Multi-Frame Cross-Channel Attention for Multi-speaker ASR in Multi-party Meeting Scenario, ICASSP 2022

Voice Activity Detection

Deep-FSMN for Large Vocabulary Continuous Speech Recognition, ICASSP 2018

Punctuation Restoration

CT-Transformer: Controllable Time-delay Transformer for Real-time Punctuation Prediction and Disfluency Detection, ICASSP 2020

Language Models

Attention Is All You Need, NEURIPS 2017

Speaker Verification

X-Vectors: Robust DNN Embeddings for Speaker Recognition, ICASSP 2018

Speaker Diarization

Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis, EMNLP 2022
TOLD: A Novel Two-Stage Overlap-Aware Framework for Speaker Diarization, ICASSP 2023

Timestamp Prediction

Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model, arXiv:2301.12343