Words and pictures were studied, and recognition tests were given in which each studied object was to be recognized in both word and picture format. The main dependent variable was the latency of the recognition decision. The purpose was to investigate the effects of study modality (word or picture), of congruence between study and test modalities, and of priming resulting from repeated testing. Experiments 1 and 2 used the same basic design, but the latter also varied the retention interval. Experiment 3 added a manipulation of instructions to name studied objects, and Experiment 4 deviated from the others by presenting the picture and the word referring to the same object together for study. The results showed that congruence between study and test modalities consistently facilitated recognition. Furthermore, items studied as pictures were recognized more rapidly than items studied as words. With repeated testing, the second instance was affected by its predecessor, but the facilitating effect of picture-to-word priming exceeded that of word-to-picture priming. The findings suggest a two-stage recognition process, in which the first stage is based on perceptual familiarity and the second uses semantic links for a retrieval search. The findings support common-code theories that grant pictures privileged access to the semantic code or, alternatively, dual-code theories that assume mnemonic superiority for the image code. Explanations of the picture superiority effect as resulting from dual encoding of pictures are not supported by the data.