
Speech and music are best considered as events. They happen. In fact, the descriptive term for words that convey action or a condition of being is verb, which is taken from the Middle French word for speech. Recording events to make them repeatable is a function of technology, either as writing (symbolic memory) or as one of a variety of technologies for recreating the sound waves associated with an event already past.
Non-symbolic analog or digital technologies for producing repeatable sounds began from forms of cutting or extruding. Bumps raised on the cylinders of a music box or cutouts in a piano roll can replay sounds with great precision, but these digital technologies lack flexibility when compared to the analog recreation of the Edison wax cylinder. A stylus impresses the mechanical motion of a diaphragm into a spiral groove created in wax. There are problems with fidelity and permanence, though.
In the early twentieth century, cutting won the day. A variety of disc formats emerged with varying standards, shellac at first and eventually vinyl. They were cut directly on record-making lathes and replicated via stampers (molds) used to press multiple copies for reproduction. Fidelity to the original speech or musical event has been a concern from the start; recordings, like sense data, are processed in different, media-specific ways in each new technology. Progress has been largely contingent on socially negotiated agreements rather than clear-cut engineering reasoning.
Cutting presents specific challenges to the material aspects of a recording. First, there is the speed at which the stylus cuts its trough, as well as the amplitude of each groove in two planes (depth and lateral excursion). This forces compromises regarding playing time and distortion (both its amount and its type), not to mention the early transition from a three-dimensional cylinder to a two-dimensional planar surface (the disc). Standards were hard fought and not adopted universally. By 1940, more than 100 competing standards had emerged to meet the challenges of encoding analog recordings. RIAA (Recording Industry Association of America) standards for signal processing on long-playing records were adopted in 1952, but not embraced by the rest of the world until 1970.
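To make the stakes of such a standard concrete, the sketch below computes the magnitude of an RIAA-style playback (de-emphasis) curve relative to 1 kHz, assuming the commonly cited time constants of 3180 µs, 318 µs, and 75 µs. It is an illustrative calculation of how much bass is restored and treble rolled off on playback, not a description of any particular cutting chain discussed in the text.

```python
import math

# Commonly cited RIAA playback (de-emphasis) time constants, in seconds.
T1, T2, T3 = 3180e-6, 318e-6, 75e-6

def riaa_playback_gain_db(freq_hz: float) -> float:
    """Magnitude of the playback curve at freq_hz, in dB, relative to 1 kHz."""
    def mag(f: float) -> float:
        w = 2 * math.pi * f
        # |H(jw)| for H(s) = (1 + s*T2) / ((1 + s*T1) * (1 + s*T3))
        num = math.hypot(1.0, w * T2)
        den = math.hypot(1.0, w * T1) * math.hypot(1.0, w * T3)
        return num / den
    return 20 * math.log10(mag(freq_hz) / mag(1000.0))

# Illustrative frequencies only: bass is boosted, treble attenuated, on playback.
for f in (20, 100, 1000, 10000, 20000):
    print(f"{f:>6} Hz: {riaa_playback_gain_db(f):+6.2f} dB")
```

Run as written, this prints roughly +19 dB at 20 Hz and about -20 dB at 20 kHz, which is the scale of the compromise a cutting standard has to negotiate between groove width, playing time, and noise.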
Digital coding (as in music boxes and piano rolls for player pianos) was contingent on a single on/off state for each individual note (frequency bundle) and very specific to each device. Analog recording offered a unique potential for encoding voice instead of simply musical notes, and saw wide adoption. Remember that Edison thought the primary use of his recording instruments was dictation.
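As a toy illustration of that one-bit-per-note coding (not a model of any historical roll format), a piano roll can be pictured as a grid of time steps by notes, where each cell is simply punched or not. The note names and the short figure below are invented for the example.

```python
# Toy model of piano-roll style coding: one on/off state per note per time step.
NOTES = ["C4", "E4", "G4", "C5"]  # hypothetical note lanes on the roll

# Each row is a time step; True means "hole punched" (note sounds), False means silence.
roll = [
    [True,  False, False, False],
    [False, True,  False, False],
    [False, False, True,  False],
    [True,  False, True,  True ],  # several holes in one row: a chord
]

for step, row in enumerate(roll):
    sounding = [name for name, on in zip(NOTES, row) if on]
    print(f"step {step}: {', '.join(sounding) or 'rest'}")
```

Everything the device can do is already fixed by which lanes exist; there is no way to punch a hole for a voice.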
The reproduction of classical music and hymns relied on the symbolic notation of scores, which emerged in parallel with the symbolic recording of texts. The conventions for musical scores evolved within the social structures of the Church. Songs combine elements of both score and text, and after 1949 it became commonplace to refer to a song as a “cut” because of its primary means of storage and transmission, the inscribed disc. The song as an event was nominalized from the verb describing the action of recording it.
Deep cuts are songs buried within an artist’s catalog of recordings that have escaped notice; what I’m really trying to highlight here is that even when a recording has a metonymic connection with the original event (as with analog recording), that connection is mediated through conventions about what counts as significant in the truest sense (transferred as a signifier, not as a symbol) in a sonic event shaped by material constraints, and those constraints are negotiated politically. This frequently escapes notice. When we talk about “high fidelity,” the question of fidelity to what, or to whom, is never really raised.
The opposite of a “deep cut” is, of course, a hit. A verb taken from Old Norse, hit was nominalized as “popular success” in the early nineteenth century. High fidelity, from Alexander Graham Bell forward, was mostly thought of as a function of intelligibility. Mathematically engineered models of sound and the consensus about intelligibility derived from testing systems on groups of people diverged significantly. It became easy to test for frequency and amplitude, but the relationship between them, what might genuinely be called the syntax of sound, was more elusive.
Psychological models and psychological testing are the basis for much of acoustic science. In 1933, Fletcher and Munson tested a group of listeners wearing headphones to determine when tones of two different frequencies were judged to be equally loud, generating what we now call the equal-loudness contour. This is especially important in telephony, because it measures perceived loudness (intelligibility) rather than signal amplitude. How loud does a sound have to be for us to identify it as such? It turns out that the answer changes with frequency.
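That frequency dependence is what weighting curves try to capture. As a rough illustration only (A-weighting is a coarse stand-in for the equal-loudness contours, reasonable mainly at moderate levels, and is not the Fletcher–Munson data itself), the sketch below evaluates the standard A-weighting formula at a few frequencies.

```python
import math

def a_weighting_db(f: float) -> float:
    """Standard A-weighting gain in dB at frequency f (Hz); ~0 dB at 1 kHz."""
    f2 = f * f
    ra = (12194.0**2 * f2 * f2) / (
        (f2 + 20.6**2)
        * math.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    return 20 * math.log10(ra) + 2.00  # offset so 1 kHz sits at 0 dB

# A 100 Hz tone is discounted by roughly 19 dB relative to 1 kHz:
# the same physical amplitude simply counts for less to the ear.
for f in (50, 100, 500, 1000, 4000, 10000):
    print(f"{f:>6} Hz: {a_weighting_db(f):+6.1f} dB")
```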
The experiment was flawed and was repeated in 1937. In 1956 a new version of the experiment, again with significant deviations from previous findings, was published and became the basis for a new standard. This too was widely seen as flawed, and it wasn’t until experiments were repeated that a new ISO standard for the equal-loudness contour was agreed upon in 2003. It’s hard to have a hit, and difficult to stay on the charts, you might say.
Our understanding of music and speech as events leaves much room for exploration. Key to this, of course, is the repeatability of experimental results. That repeatability is only possible through increasingly precise methods of reproduction, the very technologies that are themselves subject to improvement.