Redundancy in perceptual and linguistic experience: comparing feature-based and distributional models of semantic representation.Top Cogn Sci 2011; 3(2):303-45TC
Since their inception, distributional models of semantics have been criticized as inadequate cognitive theories of human semantic learning and representation. A principal challenge is that the representations derived by distributional models are purely symbolic and are not grounded in perception and action; this challenge has led many to favor feature-based models of semantic representation. We argue that the amount of perceptual and other semantic information that can be learned from purely distributional statistics has been underappreciated. We compare the representations of three feature-based and nine distributional models using a semantic clustering task. Several distributional models demonstrated semantic clustering comparable with clustering-based on feature-based representations. Furthermore, when trained on child-directed speech, the same distributional models perform as well as sensorimotor-based feature representations of children's lexical semantic knowledge. These results suggest that, to a large extent, information relevant for extracting semantic categories is redundantly coded in perceptual and linguistic experience. Detailed analyses of the semantic clusters of the feature-based and distributional models also reveal that the models make use of complementary cues to semantic organization from the two data streams. Rather than conceptualizing feature-based and distributional models as competing theories, we argue that future focus should be on understanding the cognitive mechanisms humans use to integrate the two sources.