iNNterpret - Representation of speech, articulatory dynamics, prosody and language in layers

What do models know?


Probing Phonological Encoding with Small RNNs

Fred Mailhot, Dialpad, Inc.

Recent research makes clear that deep learning provides an extremely powerful computational framework for modeling natural language phenomena across various linguistic domains. Such models frequently comprise many high-dimensional hidden layers, often augmented with multiple attention heads. Although techniques are being developed to interpret the internal representations of these models, their massive scale (O(10^9) parameters) makes it a non-trivial challenge to “peek under the hood”.

In this work, I draw inspiration from the connectionist linguistics of the 1990s and reverse course, asking what information can be gleaned from the smallest models capable of reliably learning a given task. Using morphophonology as my test domain, specifically inflection in languages with vowel harmony, I examine the embeddings and hidden representations of extremely small seq2seq networks. These small networks are shown to perform near ceiling on the tasks of interest (even in more challenging languages, e.g. those with neutral vowels), and their low dimensionality allows for direct visualization, in some cases revealing patterns of representation that correspond to more traditional linguistic analyses.
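To make the setup concrete, the following is a minimal sketch, not the author's implementation, of the kind of model described: a character-level seq2seq network (here a GRU encoder-decoder in PyTorch) with deliberately tiny embedding and hidden dimensions, so that the learned vectors can be plotted directly. The toy character inventory, the 2-dimensional sizes, and all names are illustrative assumptions.

```python
# Sketch only: a tiny character-level seq2seq whose 2-D embeddings and
# hidden states can be visualized directly. Vocabulary, dimensions, and
# the example word are assumptions for illustration, not the author's data.
import torch
import torch.nn as nn

# Toy inventory: a few consonants plus back vowels (a, o, u) and
# front vowels (ä, ö, y), as in a vowel-harmony language.
CHARS = ["<pad>", "<s>", "</s>", "t", "k", "n", "a", "o", "u", "ä", "ö", "y"]
C2I = {c: i for i, c in enumerate(CHARS)}

class TinySeq2Seq(nn.Module):
    """Encoder-decoder GRU with 2-D embeddings and 2-D hidden states."""
    def __init__(self, vocab_size, emb_dim=2, hid_dim=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, src, tgt):
        _, h = self.encoder(self.emb(src))        # final encoder state
        dec_out, _ = self.decoder(self.emb(tgt), h)
        return self.out(dec_out)                  # logits over characters

def encode(word):
    """Map a string to a (1, len) tensor of character indices."""
    return torch.tensor([[C2I[c] for c in word]])

model = TinySeq2Seq(len(CHARS))
# After training on stem -> inflected-form pairs, one could plot the 2-D
# character embeddings (model.emb.weight) and look for a front/back vowel
# split, or trace the encoder's 2-D hidden state across words:
with torch.no_grad():
    _, h = model.encoder(model.emb(encode("talo")))
    print(h.squeeze())  # a single 2-D vector per word, directly plottable
```

With hidden states this small, no dimensionality reduction is needed: the embeddings and per-word encoder states can be scattered on a plane as-is, which is where representational patterns resembling traditional featural analyses (e.g. a front/back vowel distinction) might become visible.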