Distributional learning of appearance.PLoS One 2013; 8(2):e58074Plos
Opportunities for associationist learning of word meaning, where a word is heard or read contemperaneously with information being available on its meaning, are considered too infrequent to account for the rate of language acquisition in children. It has been suggested that additional learning could occur in a distributional mode, where information is gleaned from the distributional statistics (word co-occurrence etc.) of natural language. Such statistics are relevant to meaning because of the Distributional Principle that 'words of similar meaning tend to occur in similar contexts'. Computational systems, such as Latent Semantic Analysis, have substantiated the viability of distributional learning of word meaning, by showing that semantic similarities between words can be accurately estimated from analysis of the distributional statistics of a natural language corpus. We consider whether appearance similarities can also be learnt in a distributional mode. As grounds for such a mode we advance the Appearance Hypothesis that 'words with referents of similar appearance tend to occur in similar contexts'. We assess the viability of such learning by looking at the performance of a computer system that interpolates, on the basis of distributional and appearance similarity, from words that it has been explicitly taught the appearance of, in order to identify and name objects that it has not been taught about. Our experiment tests with a set of 660 simple concrete noun words. Appearance information on words is modelled using sets of images of examples of the word. Distributional similarity is computed from a standard natural language corpus. Our computation results support the viability of distributional learning of appearance.