Logocentrism: The tendency to privilege speech over non-speech in closed captioning

Tom Cruise and Cameron Diaz in Knight and Day

Just because words are spoken doesn’t mean they need to be captioned.

The opening to Knight and Day (2010) provides a dramatic example of how closed captions can mistakenly privilege speech over non-speech, even when the speech sounds are barely audible and/or insignificant. I call this audacious appetite for speech “logocentrism” and discuss its implications below.

First, consider the uncaptioned version, which I pulled from the official DVD. If you are a hearing viewer, think about how you would caption it. Which sounds are significant? Not every sound can be captioned, so which ones are most important here? And how would you convert those sounds into words? If you are a deaf or hard-of-hearing viewer, think about how this scene visually establishes a context or mood without relying on speech from either of the main characters (Tom Cruise and Cameron Diaz).

In this opening scene, we see the world through Cruise’s eyes. His flight is delayed and he’s killing time. He seems harmless enough while eating ice cream, playing video games, and shopping for knick-knacks. But we glimpse a darker side, too. He seems to be looking for someone — a young blonde woman pulling a wheeled carry-on, perhaps? He scopes out two targets before settling on Cameron Diaz.

What sounds do we hear in this clip? While the first dozen seconds of the scene rely solely on indistinct airport noise (footsteps, crowd sounds, an indistinct PA announcement), the rest of the scene is dominated by an instrumental music track, which is timed to start on the beat of Cruise’s first dramatic step. The music conveys a light, expectant, even playful mood. Indistinct airport noise continues faintly in the background as a dull echoey hum or keynote, becoming more distinct only when Cruise grabs his ice cream cone and plays video games. As the shot of Diaz comes into visual focus at the end of this clip, the music tells us that she’s the one he’s been looking for: six musical notes — three visually associated with Cruise and three with Diaz — build to a pleasing plateau. Cruise’s three notes are mirrored in Diaz’s, connecting them together.

If there’s a PA announcement playing throughout this scene, it’s little more than background noise. If I strain unnaturally, I think I can make out the PA announcer saying “Welcome to Wichita…” and maybe “attention, please.” While background noise may be important as a way to establish the scene’s “key,” it is not so important that every (inaudible) word the announcer says needs to be captioned. In this scene, wordless music reigns, not the irrelevant specifics of the PA announcement.

With that in mind, consider the captioned version of this scene as it appears on the official DVD:

The official DVD captions miss the point entirely. They are distracting and contribute little to our understanding of how sound functions in this scene. While I suppose someone could argue that it’s useful to know the name of the airport (Wichita Mid-Continent!) and appreciate the ironic way in which Cruise is performing the announcement (maintaining “visual contact” and looking for “unattended” baggage), the scene has little to do with the specific information contained in the PA announcement. The announcement is only intended to add ambience, to convey the underlying “key” for the scene. The scene is not intended to convey information about the announcer’s gender, the “Kansas Clean Air Act,” or where to find the Baggage Claim area or the public telephones. It’s a safe bet that no hearing viewer, watching for the first time with captions turned off, has ever made out or cared to make out such details. They don’t matter. As details, the announcer’s words are insignificant; they are significant only as keynote sounds.


The captioner had access to the full range of options for captioning both speech and non-speech. Other scenes on the DVD make use of non-speech captions (e.g., music notes, lyrics, speaker IDs, non-speech sounds, and sound effects). So why didn’t the captioner make use of non-speech captions in this opening scene (with the exception of the “chattering” first caption and the speaker ID in the second caption)? I’m not sure. But I think captioning advocates need to be concerned when speech sounds seem to be mistakenly privileged over non-speech sounds, especially when a scene has nothing to do with speech, when speech can’t be easily heard but is captioned anyway, when captions fail to distinguish background (ambience) from foreground (music) sounds, or when captioners seem to have failed to properly analyze how sound functions in a scene.

Just because words are spoken doesn’t mean they need to be captioned. I realize this may be a hard claim to take seriously, but it’s only hard because of our cultural tendency to privilege spoken words over non-speech sounds, to make words the center of our meaning universe even when they are irrelevant, ambient, or barely audible. When irrelevant speech sounds are captioned seemingly on the basis of their presumed privileged status alone, we need to call out the practice as yet another expression of “logocentrism.”

How might we re-caption this scene so that it shows greater sensitivity to:

  • How sounds really function in a scene and not which sounds we think are important because of our assumptions about the intrinsic value of speech;
  • The important role that non-speech sounds, especially music, play in this scene;
  • How hearing viewers experience the scene and how the scene’s soundscape was intended to be experienced;
  • The differences between background and foreground, ambient and primary, sounds;
  • The differences between significant sounds, which must be captioned, and insignificant sounds, which do not need to be captioned;
  • How assumptions about the importance of speech can shape the captioned landscape, sometimes in problematic ways?

In the comments, I’d love to hear your thoughts. How would YOU caption the opening scene from Knight and Day? Is the official version sufficient?

[Fair use notice: The videos on this site are transformative works used in good faith, in keeping with Section 107 of U.S. copyright law, and as such constitute fair use of copyrighted material. Read this site’s full fair use notice.]

S. Zdenek

Dr. Sean Zdenek is an associate professor of technical and professional writing at the University of Delaware. He is the author of Reading Sounds: Closed-Captioned Media and Popular Culture (University of Chicago Press, 2015).


2 Responses

  1. What I try to imagine is how to create “Equal Communication Access” – trying to make the experience of the movie, without sound, similar to the experience with sound.

    I really like your description of the mood that the music creates – that’s not always easy to convey – especially since musical mood involves interpretation. I get way too lazy sometimes with saying “adventure music”, I know, but it’s kind of the association with scenes of action and certain types of music.

    The beauty of logocentrism is that it’s measurable, definable. You can measure accuracy. But in the example above, I think the chatter of the PA that’s indistinct to an average listener, gets undue focus when captioned directly, and pulls away from the scene.

    my 2 pennies.

  2. I agree with GRWEBGUY – the captions are definitely an interference. With the sound turned off, the captions only serve to distract because they’re _not_ what the scene is about, but center-stage as they are, they appear to be the important point of the scene.

    The music, on the other hand, is telling a story. Sean’s comment about the six notes at the end of the clip is apt. There’s something decidedly “off” about those notes: the line doesn’t end where it started or where you’d expect it to head. If the music is light, expectant or playful, it definitely is not when those six notes connecting Cruise and Diaz are finished. It’s eerie.