"Hierarchical attention-based multimodal fusion for video captioning."

Chunlei Wu et al. (2018)