Positioning and styling captions when speakers overlap and interrupt each other

It can be challenging to caption scenes with multiple speakers. Bottom-center caption placement is far from ideal for readers when it fails to clarify which captions belong to which speaker. Adding to the difficulty: speakers often talk quickly, interrupt each other, and overlap their speech to show collaborative support. When captions are placed underneath or next to each speaker, readers can tell at a glance who is speaking.

Screen placement is a core standard of caption quality. The FCC’s (2014) rules on “closed captioning quality standards for TV programs” require that captions be Accurate, Synchronous, Complete, and Properly Placed. Regarding placement: “Captions should not block other important visual content on the screen, overlap one another, or run off the edge of the video screen” (FCC).

Contagion (2011), which we rented from Amazon Prime Video recently, provides quite a few examples of captions covering on-screen text. Because the on-screen text sits low in the frame, and the captions are set exclusively in the default bottom-center location, the captions partially cover this text at times. Whether the captions cover words on the screen depends on the device being used to view the movie and the caption size set by the user. I prefer large captions when watching programs on a large-screen television, and large captions are more likely to cover any low-set text.

A caption partially covers “Day 2” in this frame from Contagion (2011). Source: Amazon Prime Video. Warner Bros.

Another caption partially covers on-screen text in Contagion (2011). In this frame, which shows a TV monitor mounted to a wall, the on-screen text “Day 8” is partially covered by the closed caption “Chicago, Los Angeles, Boston, and Salt Lake.” Source: Amazon Prime Video. Warner Bros.

But the topic of placement goes beyond making sure that titles, names, chyrons, and other on-screen text are not obscured by the captions. Caption placement can also help readers identify who is speaking when multiple speakers are talking, interrupting, or overlapping their speech turns:

When people onscreen speak simultaneously, place the captions underneath the speakers. If this is not possible due to the length of the caption or interference with onscreen graphics, caption each speaker at different timecodes. Do not use other speaker identification techniques, such as hyphens. (The Captioning Key)

Bottom-center captions can interfere with readers’ attempts to associate lines of captioned dialogue with their respective speakers. In this scene from Contagion (original captions), Jude Law argues with a newspaper editor about the need to cover a developing story:

Source: Contagion (2011). Amazon Prime Video. Warner Bros. Original captions.

This interaction is not too difficult to follow, but it could be improved through placement. Single captions that blend the speech of multiple speakers can be confusing, especially in the absence of any distinguishing information such as preceding hyphens. For example, the following bottom-center captions combine speech from two different speakers, yet there are no visual cues in the captions to indicate which line(s) belong to which speaker:

ALL OVER THE PLANET. <-- Speaker 1
WE DON’T WANT TO BE THE PAPER <-- Speaker 2
THAT CRIES WOLF.

I TAPED THIS MEETING. <-- Speaker 1
WE NEED MORE INFORMATION <-- Speaker 2
THAN THAT.

Let’s re-caption this scene using a caption format such as WebVTT that supports more precise screen positioning options (view the following clips with the Chrome browser).

Source: Contagion (2011). Amazon Prime Video. Warner Bros. Re-captioned by the author using the WebVTT format.
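To give a sense of what positioned WebVTT cues can look like, here is a minimal sketch in Python that writes two overlapping cues, one anchored toward the left of the frame and one toward the right. It is not the file used for the clip above. The timings, the percentages, and the assumption that Speaker 1 stands on the left are all made up for illustration. The cue settings themselves (line, position, size, align) are part of the WebVTT format, though browsers differ in how fully they honor them.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float        # seconds from the start of the clip (made-up timings)
    end: float
    text: str
    position: int       # horizontal center of the cue box, % of video width
    line: int = 80      # vertical placement, % of video height
    size: int = 40      # width of the cue box, % of video width


def timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def to_webvtt(cues) -> str:
    """Serialize cues as WebVTT text, one positioned cue per block."""
    blocks = ["WEBVTT"]
    for cue in cues:
        settings = (
            f"line:{cue.line}% position:{cue.position}% "
            f"size:{cue.size}% align:center"
        )
        timing = f"{timestamp(cue.start)} --> {timestamp(cue.end)}"
        blocks.append(f"{timing} {settings}\n{cue.text}")
    return "\n\n".join(blocks) + "\n"


# Assumption: Speaker 1 is on the left of the frame, Speaker 2 on the right.
cues = [
    Cue(0.0, 1.5, "ALL OVER THE PLANET.", position=25),
    Cue(1.0, 3.5, "WE DON’T WANT TO BE THE PAPER\nTHAT CRIES WOLF.", position=75),
]
vtt = to_webvtt(cues)
print(vtt)
```

Because the two cues overlap in time but sit in different parts of the frame, each line of dialogue stays near its speaker instead of being merged into a single bottom-center caption.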

We could go further and style each speaker’s captions in a different color or visual style. Distinguishing speakers by color is common in the UK. The BBC’s Subtitle Guidelines, for example, list a “limited range of colours [that] can be used to distinguish speakers from each other.” The limited color palette includes (in order of priority): white, yellow, cyan, and green. These colors must appear on a black background.

Source: Contagion (2011). Amazon Prime Video. Warner Bros. Re-captioned by the author using the WebVTT format.
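One way to express per-speaker color in WebVTT is to wrap each speaker’s text in a voice tag and attach a color to that voice with a ::cue rule. The Python sketch below assigns colors in the BBC priority order quoted above and builds the matching STYLE block. It is an illustration rather than the file used for the clip. The CSS keyword names stand in for the exact color values the BBC guidelines specify, the helper names are hypothetical, and support for STYLE blocks inside WebVTT files varies by browser.

```python
# Priority order as quoted from the BBC guidelines; the CSS keywords are
# stand-ins for the exact color values the guidelines specify.
BBC_COLOURS = ["white", "yellow", "cyan", "green"]


def assign_colours(speakers):
    """Map each speaker, in order of first appearance, to the next color."""
    ordered = list(dict.fromkeys(speakers))  # de-duplicate, keep order
    if len(ordered) > len(BBC_COLOURS):
        raise ValueError("more speakers than available colors")
    return dict(zip(ordered, BBC_COLOURS))


def style_block(colours):
    """Build a WebVTT STYLE block with one ::cue rule per speaker voice."""
    rules = [
        f'::cue(v[voice="{name}"]) {{ color: {colour}; background-color: black; }}'
        for name, colour in colours.items()
    ]
    return "STYLE\n" + "\n".join(rules)


def voiced(speaker, text):
    """Wrap cue text in a WebVTT voice tag so the ::cue rule can match it."""
    return f"<v {speaker}>{text}</v>"


colours = assign_colours(["Speaker 1", "Speaker 2", "Speaker 1"])
print(style_block(colours))
print(voiced("Speaker 2", "WE NEED MORE INFORMATION\nTHAN THAT."))
```

In a complete file, the STYLE block would sit after the WEBVTT header and before the first cue, and each cue’s text would carry the voice tag for its speaker.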

Placement is meaningful. When captions are placed on the screen strategically, they convey information through their form. Well-placed captions can help readers identify and distinguish speakers at a glance. Together, placement and color give readers a more efficient method of speaker identification. Placement can’t and shouldn’t replace traditional speaker identifiers, of course. But placement can supplement other techniques without adding words (proper names) or punctuation (hyphens) to an already jam-packed caption file.