Speech disfluencies occur at higher perplexities

Priyanka Sen

2020

Speech disfluencies have been hypothesized to occur before words that are less predictable and therefore more cognitively demanding. In this paper, we revisit this hypothesis by using OpenAI’s GPT-2 to measure the predictability of words as language model perplexity. Using the Switchboard corpus, we find that 51% of disfluencies occur at the position of highest perplexity, at the position of second highest perplexity, or within one token of the highest, and that this distribution is not random. We also show that disfluencies precede words with significantly higher perplexity than words in fluent contexts. Based on our results, we offer new evidence that disfluencies are more likely to occur before less predictable words.
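
The position-based analysis described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that per-token perplexity is the exponential of the negative log-probability a language model assigns to a token, and that the three categories are checked in the order shown. The function names `token_perplexities` and `classify_position` and the log-probabilities in the example are hypothetical; in practice the log-probabilities would come from a model such as GPT-2.

```python
import math


def token_perplexities(log_probs):
    """Convert per-token natural-log probabilities into per-token perplexities.

    Assumption: the perplexity of a single token is exp(-log p), i.e. 1 / p.
    """
    return [math.exp(-lp) for lp in log_probs]


def classify_position(perplexities, idx):
    """Label token position `idx` relative to the utterance's perplexity peak.

    Assumption: categories are checked in this order, so a position that is
    both second highest and adjacent to the peak counts as "second highest".
    """
    # Token indices sorted from highest to lowest perplexity.
    order = sorted(range(len(perplexities)),
                   key=lambda i: perplexities[i], reverse=True)
    top = order[0]
    if idx == top:
        return "highest"
    if len(order) > 1 and idx == order[1]:
        return "second highest"
    if abs(idx - top) == 1:
        return "within one token of highest"
    return "other"


# Hypothetical log-probabilities for a five-token utterance.
example_log_probs = [math.log(p) for p in (0.5, 0.25, 0.01, 0.2, 0.4)]
example_ppl = token_perplexities(example_log_probs)
labels = [classify_position(example_ppl, i) for i in range(len(example_ppl))]
print(labels)
```

Here the third token has the lowest probability and therefore the highest perplexity, so a disfluency located immediately before it would fall into the "highest" category under these assumptions.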
