NarrAD

AI-based system for generating barrier-free audio descriptions

NarrAD is an AI-based system that automatically generates barrier-free audio descriptions, elevating the cinematic experience for visually impaired individuals.

Summary

Audio Description (AD) is a narration designed to enhance accessibility for visually impaired individuals by conveying the key visual elements of a video. Thus, automating AD generation for long-form videos, such as movies and dramas, provides high social value but is a challenging task. First, AD must reflect the narrative context of the entire movie, including the storyline, names of characters and places, and the cultural setting. Second, to avoid disrupting the immersive experience of the movie, AD must not overlap with the characters’ dialogues, requiring the delivery of numerous visual elements in concise sentences. This paper presents NarrAD, a training-free AD generation framework that satisfies both of the requirements by leveraging rich narrative context in movie scripts and curating information across narration slots. Experiments on the MAD dataset demonstrate that our approach outperforms prior works in both captioning and LLM-based metrics. In the user study with 600 subjects, NarrAD achieves the highest user experience and movie comprehension.
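To make the second requirement concrete, the sketch below shows one simple way to find gaps between dialogue lines where a narration could fit, and to estimate how many words each gap can hold. It is an illustration only, not NarrAD's actual implementation: the function names, the `min_gap` threshold, and the speaking rate of 2.5 words per second are all assumptions chosen for the example.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec)


def find_narration_slots(
    dialogues: List[Interval],
    video_length: float,
    min_gap: float = 1.0,  # assumed minimum usable gap, in seconds
) -> List[Interval]:
    """Return the dialogue-free gaps that last at least `min_gap` seconds."""
    slots: List[Interval] = []
    cursor = 0.0  # end of the latest dialogue seen so far
    for start, end in sorted(dialogues):
        if start - cursor >= min_gap:
            slots.append((cursor, start))
        # max() handles overlapping or nested dialogue intervals
        cursor = max(cursor, end)
    if video_length - cursor >= min_gap:
        slots.append((cursor, video_length))
    return slots


def max_words(slot: Interval, words_per_second: float = 2.5) -> int:
    """Rough word budget for a slot, given an assumed speaking rate."""
    start, end = slot
    return int((end - start) * words_per_second)


if __name__ == "__main__":
    # Hypothetical dialogue timestamps (seconds), e.g. taken from subtitles.
    dialogues = [(2.0, 5.0), (4.0, 9.0), (12.5, 15.0)]
    for slot in find_narration_slots(dialogues, video_length=20.0):
        print(slot, "word budget:", max_words(slot))
```

The word budget per slot is what forces AD sentences to stay concise: a 3.5-second gap leaves room for only a handful of words under this assumed speaking rate.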

Publications

  1. WACV
    NarrAD: Automatic Generation of Audio Descriptions for Movies with Rich Narrative Context
    In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Mar 2025 (Oral, top 8.3% of submissions)

Members