YouTube recently added support for video annotations and in-video links. Three types of annotations are supported: speech bubbles, notes, and spotlights. As Bill Creswell rightly pointed out a couple days ago, YouTube’s implementation is similar to what users can do with “bubbles” on

One key difference is that YouTube’s annotations do not fully capitalize on the social affordances of Web 2.0 technologies. Whereas BubblePly, Viddler, and DotSub allow anyone to add annotations to any video, only authors can add annotations to YouTube videos. Viewers can turn YouTube annotations on and off, but they cannot edit or add them (i.e., viewers are not video modifiers [or vmodders] in the YouTube environment; cf. Overstream). Allowing anyone to annotate (or copy and annotate) any video would give viewers another means of commenting on videos (in addition to writing text comments and authoring response videos) and could increase the number of captioned videos available on YouTube.

The three examples selected by YouTube as representative of annotated video suggest that annotations are not being marketed as captions. For example, Interactive Card Trick contains no spoken words to be captioned; a music track provides the only audio (but the music is not signaled with an annotation): 

A screenshot of a YouTube video entitled Interactive Card Trick with annotations enabled.

Interactive Shell Game also contains no spoken words, only background noise. While My 22nd Skydive does contain spoken words, those words are not captioned. In fact, the annotations at times assume that the viewer can hear (e.g. when the man holding the camera tells the subject of the video to “have fun,” the speech bubble says “YES, this will be fun!!!”).

I’m not suggesting that these three videos are not promising developments for captioning technology on the Web. I’m also not suggesting that annotations cannot be used as captions. Rather, what struck me was that the three videos selected by YouTube as representative examples did not show viewers how annotations could also be used to inscribe spoken words and other sounds in the environment. Annotations were being used in these examples to provide an additional track of commentary or to replace the spoken word track altogether on videos without spoken words, whereas captions are intended to duplicate an existing soundtrack. Clearly, all three uses can co-exist…