TexVocab: Texture Vocabulary-conditioned Human Avatars

¹Tsinghua University

Abstract

To make full use of the image evidence available in multi-view video-based avatar modeling, we propose TexVocab, a novel avatar representation that constructs a texture vocabulary and associates body poses with texture maps for animation.

Given multi-view RGB videos, our method first back-projects all available images in the training videos onto the posed SMPL surface, producing texture maps in the SMPL UV domain. We then pair human poses with these texture maps to establish a texture vocabulary that encodes dynamic human appearances under various poses. In contrast to the commonly used joint-wise encoding, we further design a body-part-wise encoding strategy to learn the structural effects of the kinematic chain.
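The pairing of poses and texture maps on a per-body-part basis can be illustrated with a minimal sketch. It assumes a flat 72-D SMPL pose vector (24 joints, 3 axis-angle values each) and a hypothetical grouping of joints into six parts; the names `BODY_PARTS`, `split_pose`, and `build_vocabulary` are illustrative and not taken from the paper, whose actual partition and feature encoding are not described in this text. Texture maps are represented here only by an opaque identifier.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

# ASSUMPTION: a hypothetical grouping of the 24 SMPL joints into body parts.
# The partition actually used by TexVocab is not specified in this text.
BODY_PARTS: Dict[str, List[int]] = {
    "torso": [0, 3, 6, 9],
    "head": [12, 15],
    "left_arm": [13, 16, 18, 20, 22],
    "right_arm": [14, 17, 19, 21, 23],
    "left_leg": [1, 4, 7, 10],
    "right_leg": [2, 5, 8, 11],
}


def split_pose(pose: Sequence[float]) -> Dict[str, List[float]]:
    """Split a flat 72-D axis-angle pose vector into per-part sub-vectors."""
    if len(pose) != 72:
        raise ValueError("expected 24 joints x 3 axis-angle values")
    return {
        part: [pose[3 * j + k] for j in joints for k in range(3)]
        for part, joints in BODY_PARTS.items()
    }


def build_vocabulary(
    frames: Iterable[Tuple[Sequence[float], str]],
) -> Dict[str, List[Tuple[List[float], str]]]:
    """Collect (part pose, texture id) pairs for every body part.

    `frames` yields (pose, texture_id) pairs, where texture_id stands in for
    the back-projected UV texture map of that training frame.
    """
    vocab: Dict[str, List[Tuple[List[float], str]]] = {p: [] for p in BODY_PARTS}
    for pose, texture_id in frames:
        for part, part_pose in split_pose(pose).items():
            vocab[part].append((part_pose, texture_id))
    return vocab
```

Storing one entry list per body part, rather than one per joint, is what lets a later query treat the joints of a limb as a single unit.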

Given a driving pose, we query the pose feature hierarchically by decomposing the pose vector into several body parts and interpolating the texture features for synthesizing fine-grained human dynamics.
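As a rough illustration of this query step, the sketch below blends the features of the nearest key poses for each body part. It is written under several assumptions that do not come from the paper: pose distance is plain Euclidean distance between axis-angle sub-vectors, the blend uses inverse-distance weights over the `k` nearest entries, and texture features are short lists of floats. The names `query_part` and `query_avatar` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Entry = Tuple[Sequence[float], Sequence[float]]  # (key part pose, texture feature)


def _distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; ASSUMPTION: a simple proxy for pose similarity."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_part(
    part_pose: Sequence[float],
    entries: Sequence[Entry],
    k: int = 2,
    eps: float = 1e-8,
) -> List[float]:
    """Blend the features of the k nearest key poses by inverse distance."""
    if not entries:
        raise ValueError("empty vocabulary for this body part")
    ranked = sorted(entries, key=lambda e: _distance(part_pose, e[0]))[:k]
    weights = [1.0 / (_distance(part_pose, key) + eps) for key, _ in ranked]
    total = sum(weights)
    dim = len(ranked[0][1])
    return [
        sum(w * feat[i] for w, (_, feat) in zip(weights, ranked)) / total
        for i in range(dim)
    ]


def query_avatar(
    part_poses: Dict[str, Sequence[float]],
    vocab: Dict[str, Sequence[Entry]],
    k: int = 2,
) -> Dict[str, List[float]]:
    """Query every body part independently and return one feature per part."""
    return {part: query_part(p, vocab[part], k) for part, p in part_poses.items()}
```

For example, a driving pose that lies exactly halfway between two key poses of a part receives the average of their two features, while a pose that coincides with a key pose reproduces that key pose's feature almost exactly.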

Overall, our method creates animatable human avatars with detailed and dynamic appearances from RGB videos, and experiments show that it outperforms state-of-the-art approaches.

@article{liu2024texvocab,
  title={TexVocab: Texture Vocabulary-conditioned Human Avatars},
  author={Liu, Yuxiao and Li, Zhe and Liu, Yebin and Wang, Haoqian},
  journal={arXiv preprint arXiv:2404.00524},
  year={2024}
}

For detailed questions about this work, please contact Yuxiao Liu (liuyuxia22@mails.tsinghua.edu.cn).

We are looking for talented, motivated, and creative research and engineering interns to work on human-centric visual understanding and generation. If you are interested, please send your CV to Haoqian Wang (wangyizhai@sz.tsinghua.edu.cn).

CVPR 2024

Given multi-view RGB videos of one character, we construct a texture vocabulary and create realistic animatable human avatars.

Method

Results and Comparisons

Animation Results on Different Sequences

Animation Results on AIST++ Dataset

Comparison Results against ARAH, TAVA, AniNeRF and PoseVocab
