LLM-Based Post-ASR Error Correction for Disordered Speech
Hangyi Wen, Mikiyas Assefa, Anas Semsayan, Eduardo Feo-Flushing
Automatic speech recognition (ASR) systems achieve near-human accuracy on typical speech, but performance on disordered speech remains poor, with conversational word error rates (WER) often exceeding 50%. This gap creates serious accessibility barriers for individuals with communication disorders. We present the first systematic study of large language model (LLM)-based post-ASR error correction for disordered speech. Using the APROCSA corpus of conversational aphasic speech, we evaluate three complementary strategies: (i) multi-ASR fusion, where hypotheses from ten state-of-the-art ASR systems are consolidated by an LLM; (ii) few-shot prompting for single-hypothesis correction; and (iii) supervised fine-tuning with parameter-efficient adapters. Results show that LLMs substantially reduce WER and improve semantic similarity, with fusion achieving up to a 46% relative WER reduction and few-shot prompting exceeding 53%. By pairing mainstream ASR systems with lightweight LLM correction, our approach makes powerful recognition technology more accessible to speakers with disordered speech, lowering barriers to everyday communication.