How captions tell the future

A screenshot from Taken (2008) featuring Maggie Grace

Captioning technology and conventions allow us to glimpse the future, at least sometimes.

As a habitual closed caption user of at least a decade (we rarely watch anything that isn’t captioned), I regularly read ahead of the audio dialogue. I can read the captioned words faster than the actors can say them. This is something I think most caption users do, whether or not they can also hear the dialogue. Caption users can, under the right conditions, stay a beat ahead of everyone else, laughing at a joke, for example, before the punchline is spoken, or nodding in agreement before the speaker has finished making a point.

Captioning technology and conventions allow us to stay a step ahead. A two-row caption expressing a call-and-response format (one person says something and then another person responds) can be processed very quickly when both lines of dialogue are displayed on the screen at once, in “pop-on” style. Reading both lines before the actors say them is usually a piece of cake. Longer stretches of discourse pose no great challenge for seasoned caption users either, who can (again, under the right conditions) read faster than the speaker can talk. Being able to read one-, two-, or three-row captions quickly allows users to focus more of their attention on the movie or show itself.

Under less than ideal conditions, poorly placed, ill-timed, and otherwise error-laden captions can create uptake problems for readers. In addition, live captions, which are displayed a couple of seconds after the accompanying words are spoken, simply do not allow users to engage in the kind of sophisticated, “one beat ahead” viewing style that occurs under the right conditions. In fact, live captions can make for a more demanding user experience, especially when they lag behind fast-paced, fast-talking shows.

But when captions are prerecorded to appear just before, at, or close to the moment when the accompanying audio begins, users can sometimes exploit the conventions of multi-row pop-on style captions to read ahead, even if reading ahead provides only the tiniest glimpse into the future. The flip side to knowing the future before everyone else is that the movie may not want you to know. The caption viewer may be out of sync, just slightly, with the action, or worse, stripped of the full experience of surprise and suspense.

Here’s a simple but powerful example from the 2008 movie Taken, starring Liam Neeson as a father trying to rescue his teenage daughter (Maggie Grace) from foreign kidnappers. (Spoiler alert) In this clip, the caption user recognizes a heartbeat before the non-caption user (or so I would argue) that because the bad guy’s captioned sentence is unfinished (“We can nego-“), he will be shot before he can finish saying “negotiate.” Of course, the caption user, like all viewers, also relies on context to make predictions about where the film is going: Neeson has already systematically killed everyone else on the boat, so it’s no big gamble to predict that this final bad guy — the one, at last, who is holding his daughter — will meet a similar fate, rather than the daughter being stabbed. That’s how these kinds of movies go, and we knew that walking in. But the caption tells us precisely when he will die (before he finishes that word in the caption) and how (by gunshot, since Neeson happens to be pointing a gun at the bad guy when Neeson walks into the room). (Graphic violence alert)

Source: Taken, 2008. DVD. Featured caption: “We can nego–“

Perhaps it’s not an advantage to us, after all, when captions reveal secrets before the movie is ready to share them. But my larger point — encompassing any discussion of specific advantages or disadvantages — is that no one is really talking about the rhetoric of captioning, the ways in which captions (and text/image interplay more generally) create experiences for users that are different from uncaptioned experiences. Captions are not simply the text equivalent of spoken dialogue but create different opportunities for users, mediate meaning making differently, and, as I have begun to explore, add subtle and complex layers of meaning to video texts. A closer look at the rhetoric and style of closed captioning will prepare us to offer more pointed critiques of the limits of current thinking about captions — e.g. see Joe Clark’s excellent critique of “invariant-bottom-centred” captions — and, hopefully, improve caption technology and stylistic conventions in anticipation of that time, very soon (should captioning of TV-like content become legally mandated on the Internet), when closed-captioned video will be flooding the Web.

We don’t tend to talk about closed captions as providing, in some cases, a different (even advantageous) viewing experience over traditional, non-captioned ways of watching movies and TV shows. And yet I think that’s precisely what we need to talk about in order to bring closed captions closer to the mainstream. That’s what Web accessibility advocates do every time they discuss Web accessibility as benefiting everyone, not just users with disabilities. (Two quick examples: consider how the Mobile Web Best Practices overlap with the Web Content Accessibility Guidelines [e.g. see WAI], and how the practice of optimizing websites for search engines overlaps with the practice of making websites accessible [e.g. see McGee].)

[Fair use notice: The videos on this site are transformative works used in good faith, in keeping with Section 107 of U.S. copyright law, and as such constitute fair use of copyrighted material. Read this site’s full fair use notice.]

S. Zdenek

Dr. Sean Zdenek is an associate professor of technical and professional writing at the University of Delaware. He is the author of Reading Sounds: Closed-Captioned Media and Popular Culture (University of Chicago Press, 2015).


5 Responses

  1. Great post about a lesser-known trait of captioning connoisseurs and their evolved precognitive abilities! This fits right in line with what we stress about captions and literacy (which, itself, forms the basis for our Read Captions Across America campaign, held every March). If only more “mainstreamers” knew of these benefits, I imagine many of them would cease with their objections about the “words messing up the pretty pictures.”

  2. Thom: Thanks for bringing your Read Captions Across America program to my attention. What a great idea! Certainly more practical and persuasive than my talk about the future. 🙂

    I’m surprised not every household in America has discovered the literacy benefits of having their children watch TV with captions on.


  3. Yup, except if you’re watching captions on E! – sometimes they are so far behind, it’s hard to track. The Tonight Show and other late night talk shows have had captions that were way behind too.

    They’re tape delayed, doggonit! Fix it!

  4. I agree, Bill. When captions lag, users who depend on them are at a disadvantage. Forget about leveraging the power of captions to stay a beat ahead. That only happens under the best of conditions, and too often, as you suggest, we have to contend with much less than that.

    As a hearing viewer, I find that I can’t use captions when they lag too much. The difference between the words on the screen and the audio is too jarring for me. I can’t reconcile them.


  5. I live in a country of captions. I live in Denmark, and all non-Danish programs on television (or in the cinema) are captioned. Danish programs are also captioned, often when the show is repeated at a later time.
    Captioning helped me learn Danish. It helped to build my vocabulary. You don’t need to sell me on the idea of literacy!
    More than 20 years with captions means I cannot take my eyes off the captioning. This is annoying at times, but I also find it a big relief. Sound engineers meddle with the volume of speech and dramatic noises, and frankly, I need the captions to make sure I’ve made sense of the scene. And I am talking about English-language programs captioned in Danish. English is my first language, but I find it necessary to follow the Danish to make sure I fathomed what just happened.

    Being able to follow both conversations means I am of the crowd who quibbles over the translation! A friend has a freelance captioning job. She often turns to her networks for assistance with some localization issue. I’ve grown to respect the difficulty of providing an intelligent and comprehensible written phrase that is true to the spoken word and also leaves time for you to enjoy the visual and read the text and comprehend everything. Her questions about single words or phrases have resulted in many enlightening conversations about words, language, and culture.

    The word around here is – “Germans are poor at other languages because they dub all their films and television”. I have no idea whether that is true, but that is how Danes perceive the value of captioning – as an extra educational tool.

    I am a member of the “mainstream” who really appreciates captioning. Words messing up pictures are the result of poor planning and design. The usual problem. Not thinking. 🙂

    Live captioning of the news as I’ve experienced it in the US drove me crazy. I have since learned that there are many factors involved. What I saw was an older television that displayed blocks of letters that did not match what was on the screen. Blocks of letters meant the captioning was gibberish. That would obviously turn off mainstream users. Perhaps better testing – and educating the public about proper configuration of the television?