Why are the guidelines important?

Perception of quality in captioning

To quote the Deafness Forum of Australia: “Poor quality captions are as bad as no captions at all”. Our guidelines were devised so that as many viewers as possible feel comfortable reading the captions transcribed by our language team, and so that our visual media is as accessible as possible, which allows us to reach audiences across the world.

A 2003 survey conducted by the Annenberg Public Policy Center of the University of Pennsylvania set out to discover how different groups of respondents perceived the quality of captioning in American TV programming.[1]

The data was collected from the 203 respondents of the survey (ESL stands for ‘English as a Second Language’):

Response                        | Total sample | Deaf | Hard of hearing | ESL | General population
Generally happy                 | 45%          | 56%  | 70%             | 34% | 28%
Captions move too fast          | 18%          | 27%  | 14%             | 21% | 11%
Words are too complicated       | 4%           | 6%   | 2%              | 9%  | 0%
Captions move too slowly        | 11%          | 13%  | 12%             | 4%  | 17%
Captions have too many mistakes | 26%          | 33%  | 35%             | 15% | 22%
Other issues                    | 16%          | 25%  | 28%             | 7%  | 7%

Although close to half of the respondents were happy with caption quality, over a quarter of them found too many mistakes, and nearly one in five found that captions moved too fast to read comfortably. We can't stress enough that quality transcriptions make for a more comfortable reading experience, and that any mistakes in the original transcript are passed along to translators, who are likely to make small mistakes of their own or, worse, produce an inappropriate translation as a result of those mistakes.

Captioning professionals usually measure the amount of text in words per minute. Experts generally agree that children read at about 120 words per minute and that an average adult reads about 180 words per minute. There is also a consensus around a limit of 200 words per minute, above which reading becomes uncomfortable. Of course, the pace of speech is set by the speaker, not by us, but the rate is a good indication of how difficult a transcription is going to be and how much work it will take to make it 'readable'. Here are a few examples of text-flow rates for some of the videos we've been transcribing:

  • Venus Project: Designing the Future: 113 words per minute,
  • Zeitgeist Moving Forward: 138 words per minute,
  • Zeitgeist Addendum: 138 words per minute,
  • Peter Joseph's 'Why I advocate' video: 170 words per minute,
  • Larry King's interview of Jacque Fresco (1974): 172 words per minute,
  • Russia Today News' interview of Peter Joseph: 182 words per minute.
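
As a rough illustration of how such rates can be computed, here is a minimal Python sketch. It assumes that a word is any whitespace-separated token and that the running time of the video is known in seconds. The function names are ours and purely illustrative; the thresholds are the 120, 180 and 200 words-per-minute figures quoted above.

```python
def words_per_minute(transcript: str, duration_seconds: float) -> float:
    """Return the average text-flow rate of a transcript in words per minute."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Assumption: every whitespace-separated token counts as one word.
    word_count = len(transcript.split())
    return word_count * 60.0 / duration_seconds


def reading_comfort(wpm: float) -> str:
    """Classify a rate against the thresholds quoted in the text above."""
    if wpm <= 120:
        return "comfortable for children"
    if wpm <= 180:
        return "comfortable for an average adult"
    if wpm <= 200:
        return "fast, but below the 200 wpm limit"
    return "above the 200 wpm limit"
```

With these definitions, a 1,380-word transcript for a 10-minute video comes out at 138 words per minute, which falls in the "comfortable for an average adult" band, while a rate of 182 words per minute is already classified as fast.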


For languages other than English, data was gathered to compare the volume of text produced by translation with the original English transcript. This analysis allowed us to determine how long the English strings should be to leave room for a more flexible translation. The data was collected from 6 videos on dotsub whose translations were available in the languages shown in the following table, for a total running time of 444 minutes.

Two datasets were created:

  • the first measures the average number of characters per minute of video,
  • the second measures the average number of words per minute of video.

The mean was then computed across the different videos, which gives the following results:

Text volume relative to the English transcription | Bulgarian | French | German | Polish | Portuguese | Romanian | Spanish
Number of characters                              | 177%      | 120%   | 114%   | 102%   | 105%       | 107%     | 107%
Number of words                                   | 90%       | 108%   | 97%    | 80%    | 97%        | 100%     | 100%
The dataset is available here

A number below 100% indicates that the translation uses a lower number of words or characters than the original transcript in English.
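
To make the percentages concrete, the following Python sketch shows one way such a comparison could be computed for a single string or a whole transcript. It is only an illustration under our own assumptions, not the procedure actually used to build the dataset: characters are counted with spaces included, words are whitespace-separated tokens, and the function name is hypothetical.

```python
from typing import Tuple


def relative_volume(original: str, translation: str) -> Tuple[float, float]:
    """Return (character %, word %) of a translation relative to the original.

    100.0 means the same volume of text as the original; a value below
    100.0 means the translation is shorter.
    """
    original_chars = len(original)          # assumption: spaces are counted
    original_words = len(original.split())  # assumption: whitespace-separated words
    if original_chars == 0 or original_words == 0:
        raise ValueError("the original text must not be empty")
    char_pct = 100.0 * len(translation) / original_chars
    word_pct = 100.0 * len(translation.split()) / original_words
    return char_pct, word_pct
```

Because the original and its translation cover the same running time, comparing totals in this way gives the same ratio as comparing per-minute averages for a given video.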

From this dataset, we can conclude that most languages require extra space to convey the same meaning, usually between 5 and 20% more characters than English, which is why one of the main guidelines is to keep strings shorter than 70 characters.
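
A transcriber who wants to check a transcript against this guideline automatically could use something like the sketch below. It assumes the transcript is available as a list of strings, one per caption; the constant and function names are hypothetical and chosen only for this example.

```python
MAX_STRING_LENGTH = 70  # character limit stated in the guideline above


def overlong_strings(strings, limit=MAX_STRING_LENGTH):
    """Return (index, length) pairs for every string longer than `limit` characters."""
    return [(i, len(s)) for i, s in enumerate(strings) if len(s) > limit]
```

A string of exactly 70 characters is accepted by this check; adjust the comparison if the limit should be read as strictly fewer than 70.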