Memento10k Dataset

We present Memento10k, a new multi-temporal, multimodal memory dataset of 10,000 video clips. With 900,000 human memory annotations collected at different delay intervals and 50,000 captions describing the events, it is the largest repository of dynamic visual memory data.