More and more people enjoy video content online, yet making that content accessible to people with disabilities remains challenging for both online video makers and service providers. This is especially true given that the majority of online videos are made without careful production and are spread across public platforms such as YouTube or social media such as Instagram.
In this paper, we propose a novel, fully automatic audio description generation system. With this pipeline as the back end, we also propose a novel audio description scripting tool with content recommendation. Our aim is to reduce the cost of production significantly, so that amateur video makers and volunteers can produce audio description at low cost. Ideally, users of this tool could make an online video accessible with only a few clicks and edits. The proposed pipeline consists of three stages: segmentation, description generation, and synthesis.
It takes a video as input and generates synchronized subtitles and speech. The scripting tool presents synchronized content recommendations for authors to select from. While authoring a script, the author can either use the computer-proposed content directly or adapt it for further use. Thus, the cost of scripting audio description is greatly reduced.
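To make the data flow of the three stages concrete, the following is a minimal sketch, not the system described in this paper. All names (`Cue`, `segment`, `describe`, `run_pipeline`) are hypothetical, the fixed-window segmentation is a stand-in for a real segmentation method, and the captioning and speech models are passed in as placeholder callables. Subtitles are emitted in the common SubRip (SRT) layout, which is an assumption about the output format.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Cue:
    """One timed description: start/end in seconds plus its text."""
    start: float
    end: float
    text: str


def segment(duration: float, window: float = 5.0) -> List[Tuple[float, float]]:
    """Stage 1 (placeholder): cut the timeline into fixed-length windows.

    A real system would detect shots or gaps in dialogue instead.
    """
    spans = []
    t = 0.0
    while t < duration:
        spans.append((t, min(t + window, duration)))
        t += window
    return spans


def describe(spans: List[Tuple[float, float]],
             captioner: Callable[[float, float], str]) -> List[Cue]:
    """Stage 2: attach one description to each segment.

    `captioner` is a hypothetical stand-in for a video captioning model.
    """
    return [Cue(s, e, captioner(s, e)) for s, e in spans]


def to_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Serialize cues as SRT subtitle blocks separated by blank lines."""
    blocks = [
        f"{i}\n{to_timestamp(c.start)} --> {to_timestamp(c.end)}\n{c.text}"
        for i, c in enumerate(cues, start=1)
    ]
    return "\n\n".join(blocks) + "\n"


def run_pipeline(duration: float,
                 captioner: Callable[[float, float], str],
                 synthesizer: Callable[[str], bytes]) -> Tuple[str, List[bytes]]:
    """Run segmentation, description generation, and synthesis in order.

    Stage 3 calls `synthesizer` (a hypothetical text-to-speech callable)
    once per cue, so each audio clip stays aligned with its subtitle.
    """
    cues = describe(segment(duration), captioner)
    speech = [synthesizer(c.text) for c in cues]
    return to_srt(cues), speech
```

Calling `run_pipeline(12.0, my_captioner, my_tts)` with any two callables of the shown signatures returns the subtitle text and one audio clip per segment. Keeping the cues as an explicit list between stages is also what would let a scripting tool show each generated description to an author for acceptance or editing before synthesis.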