Identify speakers whenever it’s unclear. Given how much of the audio occurs over unrelated video clips, it was important to identify who was speaking when it wasn’t obvious. Making it clear when the audio shifted to the narrator was also important context, as was describing the everyday people who were speaking.
Describe the music. There’s nothing less helpful than simply seeing the phrase “[music playing].” We decided the captions should describe the music whenever it changed tonally — from its beginning as stark, vaguely electronic notes to moments where a choir hums somberly. So much time and artistry goes into picking the right tracks, and it makes sense to capture those choices in your captions.
Aim for brevity. The on-screen text of top search queries from 2020 is essential to the video’s creative. When the on-screen text matched the language of the audio, we didn’t need to repeat those search terms in the captions. We also found that three lines of text at once was a reasonable limit for captions, so as not to overwhelm viewers. Unpacking the experience involves thinking just as much about what to leave out of your captions as what to include.