Thin-Plate Spline Motion Model Usage for Image Animation

Image animation makes a static object in a source image move according to the motion of a driving video. Doing this for arbitrary objects requires an unsupervised motion transfer architecture that can close the pose gap between the source and driving frames.

· 2 min read
Unsplash.com - Aldi Sigun, Shubham Dhage, Javier Miranda, Abhipsa Pal, Milad Fakurian


Citation:
@misc{zhao2022thinplate,
  title={Thin-Plate Spline Motion Model for Image Animation},
  author={Jian Zhao and Hui Zhang},
  year={2022},
  eprint={2203.14367},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Image animation brings life to the static object in the source image according to the driving video. Recent works attempt to perform motion transfer on arbitrary objects through unsupervised methods without using a priori knowledge. However, it remains a significant challenge for current unsupervised methods when there is a large pose gap between the objects in the source and driving images. In this paper, a new end-to-end unsupervised motion transfer framework is proposed to overcome this issue. First, the authors propose thin-plate spline motion estimation to produce a more flexible optical flow, which warps the feature maps of the source image into the feature domain of the driving image. Second, to restore the missing regions more realistically, they leverage multi-resolution occlusion masks for more effective feature fusion. Finally, additional auxiliary loss functions are designed to enforce a clear division of labor among the network modules, encouraging the network to generate high-quality images. The method can animate a variety of objects, including talking faces, human bodies, and pixel animations. Experiments demonstrate that it outperforms the state of the art on most benchmarks, with visible improvements in pose-related metrics.
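The "thin-plate spline motion estimation" in the first step builds on the classic TPS interpolant, which maps a set of control points exactly onto their target positions while keeping the deformation smooth elsewhere. Below is a minimal NumPy sketch of that interpolant; note this is not the paper's network (which *predicts* the control points), and all names here are illustrative:

```python
import numpy as np

def tps_kernel(r2):
    # Radial basis U(r) = r^2 * log(r^2), with U(0) defined as 0.
    out = np.zeros_like(r2)
    nz = r2 > 0
    out[nz] = r2[nz] * np.log(r2[nz])
    return out

def fit_tps(src, dst):
    """Solve for TPS coefficients mapping 2-D control points
    `src` onto `dst` (both shaped (n, 2))."""
    n = src.shape[0]
    d2 = np.sum((src[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    K = tps_kernel(d2)
    P = np.hstack([np.ones((n, 1)), src])   # affine part: [1, x, y]
    L = np.zeros((n + 3, n + 3))
    L[:n, :n] = K
    L[:n, n:] = P
    L[n:, :n] = P.T
    Y = np.zeros((n + 3, 2))
    Y[:n] = dst
    return np.linalg.solve(L, Y)            # (n+3, 2) coefficients

def tps_transform(coeffs, src, pts):
    """Apply a fitted spline to query points `pts` (m, 2)."""
    n = src.shape[0]
    d2 = np.sum((pts[:, None, :] - src[None, :, :]) ** 2, axis=-1)
    U = tps_kernel(d2)                      # (m, n)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return U @ coeffs[:n] + P @ coeffs[n:]
```

Because the spline interpolates exactly, transforming the source control points recovers the driving control points, and querying a dense pixel grid with `tps_transform` yields the kind of flexible optical flow the paper uses for warping.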

Comments: CVPR 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2203.14367 [cs.CV]
 (or arXiv:2203.14367v2 [cs.CV] for this version)
 https://doi.org/10.48550/arXiv.2203.14367

Submission history

From: Jian Zhao
[v1] Sun, 27 Mar 2022 18:40:55 UTC (1,952 KB)
[v2] Tue, 29 Mar 2022 03:06:26 UTC (2,010 KB)

Publishing this video footage does not imply that the publisher shares the views of the person shown in it. This supplementary section simply shares experience in creating such streaming videos: the inputs are one source photo and one driving video, and the output is a video processed according to the paper above.
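The per-frame processing the paper describes ends with blending warped source features and generated (inpainted) content under multi-resolution occlusion masks. As a rough illustration only, here is a hedged NumPy sketch of that fusion step; the function name, shapes, and per-level blending are assumptions, not the repository's API:

```python
import numpy as np

def fuse_multires(warped_pyramid, inpainted_pyramid, masks):
    """Blend warped source features with generated features at
    several resolutions, one occlusion mask per level.
    Hypothetical simplification of the paper's decoder-side fusion."""
    fused = []
    for warped, inpainted, mask in zip(warped_pyramid, inpainted_pyramid, masks):
        # mask ~ 1 where the warped source is reliable, ~ 0 where occluded
        fused.append(mask * warped + (1.0 - mask) * inpainted)
    return fused
```

The design intuition: wherever the driving pose reveals regions not visible in the source image, the mask falls toward zero and the generator's inpainted content takes over; elsewhere the warped source pixels dominate, preserving identity and texture.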