Single-stage methods lose object texture and accumulate color errors over time. |
Customization without multi-view images overfits to the frontal view. |
Extend Attention enhances the consistency between keyframes. |
Both depth control and rendered feature control are vital for 3D guidance in video generation. |
Reconstructed background meshes enable meaningful interactions with the background element. |
Reconstructed background meshes provide 3D guidance after camera movement. |
Renoise (Add noise to a certain noisy level and denoise again) fails to refine the black region. Tokenflow (a V2V translation method) has flickering issues. |