An examination of the genealogy of captioning practice crosses many genre and institutional boundaries. The most important of these is the divide between aesthetic and practical praxis. Captions involve both accreditation and interpretive strategies in the collision of the popular and institutional spheres. The fundamental assumption I have made is that these captioning practices arise from motives. These motives reflect both an ontology and an ethics of communicative practice. Understanding the strategic and tactical functioning of captioning necessitates close study of the intersection between the production of both aesthetic and evidentiary imagery and the economic systems in which they are consumed.

However, I choose not to assume that there is an intrinsic quality to images which is either diluted or enhanced by the presence or absence of the caption. The combination of words and images is superadditive, and their content is not divisible. The emergence of graphic design and typography as a specialized field in the early twentieth century demonstrates that the play between type/text and image was of central importance in transmitting a message in an image-infused culture. This synergy did not arrive fully formed in the 1930s without history or precedent. Breaking the barrier between text and image was a complex negotiation of the relative function of each, and consideration of each medium separately does not address the tension encountered when text and image are combined.

At the nexus of this tension is the caption. Since the graphic innovations of the 1930s, the caption has invaded the area of the image itself. This makes it difficult to demarcate what constitutes caption and what constitutes pure textual content. Prior to the nineteenth century, the separation of text and image was technological. Images and text were printed separately, often by different printing houses. Imagery was mediated by the engraver who reproduced an artist’s conception on the page. The roles of the artist and the writer were separate, and their relative status as communicators was subject to healthy rivalry. Their languages were separate, and their products subject to different economic systems and communicative standards.

In the twentieth century, the concept of the caption as a brief explanatory or interpretive text beneath an image becomes problematic. As the modes of communication become more complex, it becomes easier to speak of paratexts: supplements to a primary text, whether imagistic or linguistic in nature, which act as rhetorical intensifiers for the primary content.

Or, to place this in a different light: the combination of text and image represents a rhetorical trope or scheme. As such, the mode of meaning constructed in combination can be explored not at the “word” level but rather at the “phrase” level of semantic structures. Like any phrase, it is not reducible to the meanings of its constituent words, but rather produces new meanings. The separation between image-makers, writers, and page designers thwarts the coherence of these meanings; the question of motive becomes exponentially more complex as one extrapolates into ever-widening contexts.

However, examination of these tropes or schemes is of essential strategic importance to anyone who might attempt to use or decipher words and images together. The synergy of text and image is not just of quaint historical importance; it is the central crisis of modern literacy. This inquiry has been masked by diversions into separating content from commentary; this is the blind alley where modern Marxist criticism lives and feels at home.

