MVIG-RHOS, SJTU
The synthetic videos are learned in two stages:
Stage 1: the static memory is learned by image distillation, using one frame per video.
Stage 2: the static memory is frozen and combined with the dynamic memory.
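
The data flow of the two stages can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `sample_one_frame_per_video` and `combine` are hypothetical, frames are represented as flat lists of floats, and the additive combination in Stage 2 is an assumption, since the text above does not say how the two memories are combined.

```python
import random


def sample_one_frame_per_video(videos, rng):
    """Stage 1 input (hypothetical helper): pick a single frame from each video.

    Each video is a list of frames; each frame is a flat list of floats.
    """
    return [video[rng.randrange(len(video))] for video in videos]


def combine(static_frame, dynamic_frames):
    """Stage 2 (assumed form): add the frozen static frame to every dynamic frame.

    The static frame is only read here, never modified, which mirrors
    the "frozen" static memory. The addition itself is an assumption.
    """
    return [
        [s + d for s, d in zip(static_frame, dynamic_frame)]
        for dynamic_frame in dynamic_frames
    ]


# Two toy "videos" with two-pixel frames.
videos = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]
stage1_frames = sample_one_frame_per_video(videos, random.Random(0))

# One static frame plus a two-frame dynamic memory gives a two-frame synthetic video.
synthetic_video = combine([1.0, 2.0], [[0.0, 0.0], [0.5, -0.5]])
```

In this sketch the static part carries the per-video appearance shared by all frames, and the dynamic part carries only the frame-to-frame differences.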
@article{wang2023dancing,
title={Dancing with Images: Video Distillation via Static-Dynamic Disentanglement},
author={Wang, Ziyu and Xu, Yue and Lu, Cewu and Li, Yong-Lu},
journal={arXiv preprint arXiv:2312.00362},
year={2023}
}