One difficulty of applying a general “science of signs” to mixed media lies in the impracticality of its labels. The labels of signifier and signified fit a restricted linguistic project, but are clearly too restrictive when applied to the polysemous text composed of differing media. C.S. Peirce’s multitudinous labels (usually reduced to icon, index, and symbol) fail when presented as a static condition, separate from the dynamism he included in his semiotics. It is neither necessary nor desirable to fix the position of individual elements in a mixed media composition into fallible positions on a binary or even triadic grid. The interface between text and image, approached in this manner, always reduces to a subject/predicate relationship.

For example, in the case of an aesthetic photograph or painting, the caption becomes a comment that stands in a parasitic relationship to the image as signified. Or, in the case of illustration, the caption is the signified: an image or sketch is the signifier of the word. The relationship is purely contextual and is unmarked in most mixed media compositions. Peirce’s categories are often used to discuss photographs that can be either iconic (in the loose sense, in that they are a “picture”) or indexical (in that they point at an object that exists) or even symbolic, when they are so widely circulated that they develop a sort of cultural currency, such as Dorothea Lange’s “Migrant Mother” or pictures of the flaming towers of the World Trade Center.

The disposition of these labels is confusing at best. Even more problematic is the question of syntax. Under what circumstances does the image become a subject rather than a predicate? At what point does an image cross over from being an icon or index into being a fully developed symbol? On the surface, it appears that the syntax involved is related to the privilege afforded aesthetic experience. An aesthetic object is granted the status of signified, icon, or symbol, whereas a pragmatic juxtaposition is granted status as a signifier, icon, or index.

While it would be a superfluous digression to attempt any definition of the “aesthetic object,” it seems instructive to examine Walter Pater’s dictum: “All art constantly aspires towards the condition of music.” Pater explains his reasoning by rebelling against the concept of predication: “For while in all other works of art it is possible to distinguish the matter from the form, and the understanding can always make this distinction, yet it is the constant effort of art to obliterate it.” Using visual and then poetic examples, Pater finds lyric poetry closest to his ideal because it depends “on a certain suppression or vagueness of mere subject, so that the meaning reaches us through ways not distinctly traceable by the understanding” [The Renaissance (1901) 135, 137]. Slavery to the subject/predicate or matter/form binaries makes understanding on the aesthetic level difficult; on this level, the caption of a work of art can only be a parasite that reminds us of the matter separate from the form.

Thus, when subject/predicate relations between image and caption are considered in the nineteenth-century aesthetic sense at least, constituting the work of art as subject by linguistic means closes it off by invoking specificity, unless the caption is vague or metaphoric. However, it is also possible to use Pater’s explicit comparison with music (which recurs, significantly, in discourse about some photo-textual combinations of the 1930s) to suggest an alternate possibility.

Music often contains linguistic content (lyrics). However, few critics remark that lyrics interfere with the production of meaning in music. W.J.T. Mitchell excludes the discussion of music in parallel with image/texts because music cannot be said to “represent” in the same way that images or texts do. However, as the presence of sound effects libraries attests, sounds do invoke a mental representation of the circumstances that created them. The term “soundscape” is often used to describe music by analogy with the landscape presented by an image. Perhaps it is the vagueness and indeterminacy of sounds, set against the specificity of images or texts, that makes them seem less representational. The integration of specific linguistic content with a representative “scene” for its unfolding in music points at a different way of looking at the syntax of word and image.

Both language and music have rules for combining elements. In language, there are rules for forming words, phrases, and sentences; in music there are chords, chord progressions, and keys. These elements coalesce in songs, and we process them separately and coequally in order to produce a response to music. Part of music is temporal: meter, rhythm, and particular structures. Part of it is inflectional and harmonic. It might be argued that images have no syntax as such, and this is what causes the tension when they are combined with words. However, there are conventions for images, just as there are standard tunings and chord progressions in music. Images have a temporal element, constrained by the circumstances of their creation; they have the equivalent of tone and pitch in the limits of the instruments used to create them. There is a cultural consistency in images, just as there is in music. As in music, these conventions are seemingly arbitrary and deeply implanted in each culture.

The suggestion that “All art constantly aspires towards the condition of music” can also be interpreted to mean that art attempts to integrate cognitive processing without subjects or predicates. Or, art composes a possible world by integrating subjects with predicates. Though images and texts have different syntaxes, this does not necessarily preclude their synthesis into a composition rather than a mere composite. The proclivity to see them as a composite seems an arbitrary cultural choice.