Estimating the average need of semantic knowledge from distributional semantic models. Mem Cognit 2017; 45(8):1350-1370
Continuous bag of words (CBOW) and skip-gram are two recently developed models of lexical semantics (Mikolov, Chen, Corrado, & Dean, Advances in Neural Information Processing Systems, 26, 3111-3119, 2013). Each has been demonstrated to perform markedly better at capturing human judgments about semantic relatedness than competing models (e.g., latent semantic analysis, Landauer & Dumais, Psychological Review, 104(2), 211, 1997; hyperspace analogue to language, Lund & Burgess, Behavior Research Methods, Instruments, & Computers, 28(2), 203-208, 1996). The new models were largely developed to address practical problems of meaning representation in natural language processing. Consequently, very little attention has been paid to the psychological implications of their performance. We describe the relationship between the learning algorithms employed by these models and Anderson's rational theory of memory (J. R. Anderson & Milson, Psychological Review, 96(4), 703, 1989) and argue that CBOW learns word meanings according to Anderson's concept of needs probability. We also demonstrate that CBOW can account for nearly all of the variation in lexical access measures typically attributable to word frequency and contextual diversity, two measures that are conceptually related to needs probability. These results suggest two conclusions. First, CBOW is a psychologically plausible model of lexical semantics. Second, word frequency and contextual diversity do not capture learning effects but rather memory retrieval effects.